CN111199050A - System for automatically desensitizing medical records and application - Google Patents


Info

Publication number
CN111199050A
Authority
CN
China
Prior art keywords
medical record
desensitized
sample
type
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811378972.9A
Other languages
Chinese (zh)
Other versions
CN111199050B (en)
Inventor
罗立刚
康悦
李津辰
罗翔凤
刘晓华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zero Krypton Medical Intelligent Technology Guangzhou Co ltd
Original Assignee
Zero Krypton Medical Intelligent Technology Guangzhou Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zero Krypton Medical Intelligent Technology Guangzhou Co ltd
Priority to CN201811378972.9A
Publication of CN111199050A
Application granted
Publication of CN111199050B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Bioethics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computer Security & Cryptography (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a system for automatically desensitizing medical records and an application thereof. The system comprises: a template generation module, which classifies sample medical record forms of different layout types and acquires the sensitive information region corresponding to each type, so as to generate a medical record template for each type of sample medical record form; a training module, which inputs the sample medical record forms corresponding to the medical record templates into a convolutional neural network for training, so as to obtain a neural network model for classifying medical record forms; and a desensitization module, which matches a medical record form to be desensitized with a medical record template according to the type assigned to it by the neural network model obtained by the training module, and labels and desensitizes the region to be desensitized according to the matched template, so as to obtain a desensitized medical record. The system and method can thereby desensitize medical records efficiently and accurately.

Description

System for automatically desensitizing medical records and application
Technical Field
The invention relates to the technical fields of pattern recognition, machine learning, convolutional neural networks and the like, in particular to a system for automatically desensitizing medical records and application thereof.
Background
When medical records are processed and their information is collected, sensitive private information such as the patient's name, address and contact details must be obscured, so that patient privacy is not disclosed when the records are made available to non-medical personnel (such as data analysts) for study and retrieval. As the number of people seeking medical services grows and disease types diversify, manual desensitization of medical records is severely limited in efficiency and reliability. Medical records from different hospitals and departments generally differ in layout, and there is no unified structural standard for the information they contain. If existing optical recognition technology is applied directly to the entire content of a medical record in order to desensitize it, the same information is recognized repeatedly, which wastes time, and the recognition is relatively coarse, so accuracy suffers. The prior art therefore cannot desensitize medical records automatically in an efficient and accurate way.
Therefore, there is a need for a system for automatic desensitization of medical records to achieve efficient and accurate automatic desensitization of medical records.
Disclosure of Invention
In view of this, the present application provides a system for automatically desensitizing medical records, so as to desensitize medical records efficiently and accurately.
The application provides a system for automatically desensitizing medical records, comprising:
a template generation module, configured to classify sample medical record forms of different layout types and to acquire the sensitive information region corresponding to each type, so as to generate a medical record template for each type of sample medical record form;
a training module, configured to input the sample medical record forms corresponding to the medical record templates into a convolutional neural network for training, so as to obtain a neural network model for classifying medical record forms; and
a desensitization module, configured to match a medical record form to be desensitized with a medical record template according to the type assigned to it by the neural network model obtained by the training module, and to label and desensitize the region to be desensitized according to the matched template, so as to obtain a desensitized medical record.
In this way, the system can automatically desensitize medical records of different types efficiently and accurately, avoiding disclosure of patient privacy. It overcomes the limited efficiency and reliability of manual medical record desensitization in the prior art. Through the cooperation of the modules, it also overcomes the defect of prior-art desensitization based on optical recognition technology, in which the entire content of a medical record must be recognized, so that the same information is recognized repeatedly and time is wasted.
Preferably, the template generation module specifically comprises:
an acquisition submodule, configured to acquire sample medical record forms of different layouts and types from different hospitals;
a labeling submodule, configured to label the sensitive information regions in the sample medical record forms;
a classification submodule, configured to classify the labeled sample medical record forms into different types according to their layout structure and the positions of their sensitive information regions;
a recording submodule, configured to record the coordinate values of the labeled sensitive information regions for each type of sample medical record form; and
a template generation submodule, configured to generate a template for each type of sample medical record form: according to the coordinate values of the labeled sensitive information region of each sample medical record form of a given type, the sensitive information region covering the largest area is taken as the final sensitive information region for that type, and a sample medical record form labeled with the final sensitive information region is taken as the medical record template for that type.
In this way, a medical record template is generated for each type of sample medical record form. Because the template generation submodule takes the sensitive information region with the largest area as the final region of each template, the sensitive information is fully contained in the desensitized region when a medical record is desensitized.
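The template rule above can be sketched in a few lines. This is a minimal illustration, not the applicant's implementation: it assumes each labeled region is an axis-aligned box in pixel coordinates, and it reads "the region with the largest area" as the smallest box that covers every labeled box of that type (the reading given in claim 2). The names `covering_box`, `build_templates` and the layout key are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


def covering_box(boxes: List[Box]) -> Box:
    """Return the smallest box that contains every labeled box."""
    if not boxes:
        raise ValueError("at least one labeled box is required")
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def build_templates(labels: Dict[str, List[Box]]) -> Dict[str, Box]:
    """Map each layout type to its final sensitive-information region."""
    return {layout: covering_box(boxes) for layout, boxes in labels.items()}


# Hypothetical labels for one field on three sample forms of one layout type.
labels = {"hospital_a_outpatient": [(40, 30, 200, 60), (38, 32, 210, 58), (42, 28, 190, 62)]}
templates = build_templates(labels)
print(templates)
```

Taking the covering box trades a slightly larger masked area for the guarantee that no labeled sample's sensitive region falls outside the template.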
Preferably, the template generation module further comprises an image preprocessing submodule, configured to denoise and binarize the sample medical record forms labeled by the labeling submodule.
Denoising removes noise points unrelated to the sensitive information. After binarization, the properties of the image as a point set depend only on the positions of the pixels whose value is 0 or 255 and no longer involve multi-level pixel values, so subsequent processing is simpler and the amount of data to be processed and compressed is small.
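As an illustration of this preprocessing step, the sketch below denoises a grayscale image with a 3x3 median filter and then binarizes it with a fixed threshold. Both choices are assumptions made for the example; the source does not say which denoising filter or threshold the applicant uses, and the function names are hypothetical.

```python
from statistics import median
from typing import List

Image = List[List[int]]  # grayscale image, one int in 0..255 per pixel


def median_denoise(img: Image) -> Image:
    """Apply a 3x3 median filter; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out


def binarize(img: Image, threshold: int = 128) -> Image:
    """Map every pixel to 0 or 255 using an assumed fixed threshold."""
    return [[255 if p >= threshold else 0 for p in row] for row in img]


# A white page with a single dark speck of noise in the middle.
noisy = [[255] * 5 for _ in range(5)]
noisy[2][2] = 0
clean = binarize(median_denoise(noisy))
```

In the example the isolated speck is removed by the median filter, so the binarized page is uniformly white.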
Preferably, the template generation module further comprises:
a sample expansion submodule, configured to apply affine transformations to each type of sample medical record form recorded by the recording submodule, so as to obtain a specified number of sample medical record forms.
This expands the number of sample medical record forms available for training.
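The sample expansion step might look like the following sketch. It is an assumption-laden illustration: the affine map is restricted to a small rotation, uniform scale and shift, the parameter ranges are invented for the example, and the transform is applied to the corner coordinates of a labeled region rather than to pixels. In practice the same matrix would also be used to warp the image itself with an image library.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]
# Affine map x' = a*x + b*y + tx, y' = c*x + d*y + ty, stored as (a, b, tx, c, d, ty).
Affine = Tuple[float, float, float, float, float, float]


def random_affine(rng: random.Random, max_deg: float = 3.0,
                  max_shift: float = 5.0, max_scale: float = 0.05) -> Affine:
    """Draw a small rotation, scale and shift (ranges are assumed values)."""
    theta = math.radians(rng.uniform(-max_deg, max_deg))
    s = 1.0 + rng.uniform(-max_scale, max_scale)
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    return (s * math.cos(theta), -s * math.sin(theta), tx,
            s * math.sin(theta), s * math.cos(theta), ty)


def apply_affine(m: Affine, points: List[Point]) -> List[Point]:
    """Transform each point with the affine map m."""
    a, b, tx, c, d, ty = m
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


def expand_samples(corners: List[Point], count: int, seed: int = 0) -> List[List[Point]]:
    """Produce `count` transformed copies of one labeled region."""
    rng = random.Random(seed)
    return [apply_affine(random_affine(rng), corners) for _ in range(count)]


corners = [(38.0, 28.0), (210.0, 28.0), (210.0, 62.0), (38.0, 62.0)]
augmented = expand_samples(corners, count=5)
```

Fixing the seed keeps the expanded sample set reproducible between training runs.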
Preferably, the training module is specifically configured such that:
each type of sample medical record form, together with its type, is input to the input layer of a convolutional neural network;
the convolutional layer of the convolutional neural network extracts a feature map of the sample medical record form;
the pooling layer of the convolutional neural network compresses the feature map to extract the main features; and
the fully connected layer of the convolutional neural network applies full connection or global averaging to the features extracted by the pooling layer and performs classification, so as to obtain a neural network model for classifying medical record forms.
This yields a neural network model for classifying medical record forms of different layout types and labeling the initial region to be desensitized. The sensitive information region with the largest area is taken as the final sensitive information region of each type of medical record template, which helps ensure that the sensitive information is fully contained in the desensitized region when a medical record is desensitized.
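The pooling and global-averaging operations named above can be illustrated with plain Python lists. This sketch assumes non-overlapping max pooling with a 2x2 window; the source does not state the pooling type or window size, so both are assumptions, and the function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def max_pool(feature_map: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; trailing rows/columns that do not fill a window are dropped."""
    h, w = len(feature_map) // size, len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(w)
        ]
        for r in range(h)
    ]


def global_average(feature_map: Matrix) -> float:
    """Collapse a whole feature map to its mean value."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


fm = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
pooled = max_pool(fm)              # [[6, 8], [14, 16]]
summary = global_average(pooled)   # 11.0
```

Pooling halves each spatial dimension here, and global averaging then reduces the pooled map to one number per feature map, which is the value a classification layer would consume.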
Preferably, the mapping between the sample medical record form and the feature map of the medical record template extracted by the convolutional layer of the convolutional neural network is:
$$x^{m} = f\left(\sum_{i} x^{m}_{i} * k^{m}_{ij} + b^{m}_{j}\right)$$
where $x^{m}$ denotes the output vector of the $m$-th layer; $x^{m}_{i}$ denotes the input vector of the $i$-th node of the $m$-th layer; $k^{m}_{ij}$ denotes the filter parameters of the $i$-th node of the $m$-th layer, which are to be trained; $b^{m}_{j}$ denotes the bias of the $i$-th node of the $m$-th layer, which is to be trained; $m$ denotes the current layer number; $i$ denotes the current node; and $j$ denotes the current layer.
This helps extract the feature map of the medical record template more effectively.
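A small numeric sketch of the convolutional mapping described above: each input map is filtered with its own kernel, the results are summed, a bias is added, and an activation f is applied. The source does not name f, so the sigmoid used here is an assumption, and the sliding-window product is implemented as cross-correlation (no kernel flip), as most CNN libraries do. Function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def conv2d_valid(x: Matrix, k: Matrix) -> Matrix:
    """'Valid' 2-D cross-correlation of input x with kernel k."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [sum(x[r + i][c + j] * k[i][j] for i in range(kh) for j in range(kw))
         for c in range(ow)]
        for r in range(oh)
    ]


def sigmoid(v: float) -> float:
    """Assumed activation function f."""
    return 1.0 / (1.0 + math.exp(-v))


def conv_layer(inputs: List[Matrix], kernels: List[Matrix], bias: float) -> Matrix:
    """Compute f(sum_i x_i * k_ij + b_j) for a single output map j."""
    maps = [conv2d_valid(x, k) for x, k in zip(inputs, kernels)]
    oh, ow = len(maps[0]), len(maps[0][0])
    return [
        [sigmoid(sum(m[r][c] for m in maps) + bias) for c in range(ow)]
        for r in range(oh)
    ]


x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k = [[1, 0], [0, 1]]
raw = conv2d_valid(x, k)  # [[6, 8], [12, 14]]
```

During training, the kernel entries and the bias are the parameters that back-propagation adjusts.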
Preferably, the squared-error cost function used by the fully connected layer of the convolutional neural network for classification is:
$$E^{N} = \sum_{n=1}^{N} \sum_{k=1}^{c} \left(t^{n}_{k} - y^{n}_{k}\right)^{2}$$
where $N$ denotes the number of sample medical record forms; $E^{N}$ denotes the squared-error cost over the $N$ samples; $c$ denotes the number of medical record template types; $k$ indexes the layout type of the sample medical record form, that is, the dimension of the medical record template type output by the fully connected layer of the convolutional neural network; $t^{n}_{k}$ denotes the $k$-th dimension of the label corresponding to the $n$-th sample; and $y^{n}_{k}$ denotes the $k$-th dimension of the network output corresponding to the $n$-th sample.
This helps obtain the best classification.
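The cost function can be checked on a toy case. The sketch below assumes one-hot labels and two samples over three template types; the numbers are invented for illustration and `squared_cost` is a hypothetical name.

```python
from typing import List


def squared_cost(targets: List[List[float]], outputs: List[List[float]]) -> float:
    """Sum of squared differences over all samples n and all type dimensions k."""
    if len(targets) != len(outputs):
        raise ValueError("targets and outputs must contain the same number of samples")
    return sum(
        (t - y) ** 2
        for t_row, y_row in zip(targets, outputs)
        for t, y in zip(t_row, y_row)
    )


# Two samples, three template types, one-hot labels (assumed encoding).
targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
outputs = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]]
cost = squared_cost(targets, outputs)  # 0.06 + 0.14, about 0.20
```

The cost is zero only when every output vector equals its label, so minimizing it pushes the network toward the correct template type for each sample.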
Preferably, the desensitization module specifically comprises:
a matching submodule, configured to match the medical record template corresponding to the medical record form to be desensitized, according to the type assigned to that form by the neural network model obtained by the training module;
a labeling submodule, configured to label the initial region to be desensitized on the medical record form to be desensitized, according to the medical record template;
a positioning submodule, configured to precisely locate the initial region to be desensitized using image processing techniques; and
a desensitization submodule, configured to desensitize each precisely located region separately.
In this way, the initial region to be desensitized on the medical record form is obtained through the neural network model that extracts features from medical record forms of different layout types, and is then precisely located and desensitized. This overcomes the defect of prior-art desensitization based on optical recognition technology, in which the entire content of a medical record must be recognized, so that the same information is recognized repeatedly and time is wasted.
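One simple way the positioning submodule could refine the template region is to shrink it to the ink pixels it actually contains. This is only an assumed technique chosen for illustration; the source says "image processing techniques" without naming one. The sketch works on a binarized image in which ink pixels have value 0, and `tighten_box` is a hypothetical name.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]           # binarized image, pixel values 0 or 255
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive bounds


def tighten_box(binary: Image, box: Box, ink: int = 0) -> Optional[Box]:
    """Shrink the initial box to the bounding box of the ink pixels inside it."""
    x0, y0, x1, y1 = box
    xs, ys = [], []
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            if binary[y][x] == ink:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing written inside the initial region
    return (min(xs), min(ys), max(xs), max(ys))


page = [[255] * 8 for _ in range(6)]
page[2][3] = page[2][4] = page[3][4] = 0   # a few ink pixels
refined = tighten_box(page, (1, 1, 6, 4))  # (3, 2, 4, 3)
```

Returning `None` for an empty region lets the caller skip desensitization where the field was left blank.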
Preferably, the sensitive information includes, but is not limited to, at least one of: name, address and contact details.
That is, the sensitive information of the present application is not limited to the items above and also includes other information related to personal privacy.
Based on the system, the application also provides a method for automatically desensitizing a medical record, which comprises the following steps:
A. acquiring an original medical record picture to be desensitized;
B. assessing the picture quality of the original medical record picture to be desensitized, and retaining only original medical record pictures whose resolution is higher than a specified threshold;
C. denoising and binarizing the original medical record picture to obtain a binary image of the processed original medical record picture;
D. according to the binary image of the original medical record picture, classifying the original medical record picture through the neural network model for classifying the medical record list to obtain the type of the medical record to which the original medical record picture belongs;
E. matching a corresponding medical record template according to the type of the medical record, and acquiring an initial region to be desensitized of the original medical record according to the medical record template;
F. accurately positioning the initial region to be desensitized by using an image processing technology to obtain the accurately positioned region to be desensitized;
G. and desensitizing the region to be desensitized after accurate positioning.
In this way, the method can automatically desensitize medical records of different types efficiently and accurately, so that patient privacy is not disclosed when non-medical personnel retrieve and study the medical record information. It overcomes the limited efficiency and reliability of manual medical record desensitization in the prior art. In addition, because the method first classifies the original medical record and locates the initial region to be desensitized, and only then performs precise positioning and desensitization, it overcomes the defect of prior-art desensitization based on optical recognition technology, in which the same information is recognized repeatedly and time is wasted.
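Steps A to G can be strung together as in the following sketch. Several simplifications are assumptions made for the example: pixel dimensions stand in for the resolution check, denoising and precise positioning are omitted, the trained network is replaced by a `classify` callback, and the region is covered with black pixels. The function and parameter names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Image = List[List[int]]           # grayscale image, values 0..255
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive bounds


def desensitize(image: Image,
                classify: Callable[[Image], str],
                templates: Dict[str, Box],
                min_height: int = 4,
                min_width: int = 4) -> Optional[Image]:
    """Run a simplified version of steps B to G on one picture."""
    # Step B: reject pictures below the assumed minimum size.
    if len(image) < min_height or len(image[0]) < min_width:
        return None
    # Step C: binarize (denoising omitted in this sketch).
    binary = [[255 if p >= 128 else 0 for p in row] for row in image]
    # Step D: obtain the layout type from the classifier stand-in.
    layout = classify(binary)
    # Step E: look up the initial region in the matching template.
    x0, y0, x1, y1 = templates[layout]
    # Step F (precise positioning) is omitted here.
    # Step G: cover the region on a copy of the original picture.
    out = [row[:] for row in image]
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            out[y][x] = 0
    return out


picture = [[200] * 6 for _ in range(4)]
result = desensitize(picture, lambda img: "type_a", {"type_a": (1, 1, 3, 2)})
```

Because classification selects a template up front, only the template's region is processed further, which is the source of the claimed time saving over recognizing the whole page.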
In summary, the system for automatically desensitizing medical records and its application, as provided by the present application, can automatically desensitize medical records of different types efficiently and accurately, so that patient privacy is not disclosed when non-medical personnel retrieve and study the medical record information. It overcomes the limited efficiency and reliability of manual medical record desensitization in the prior art, and it overcomes the defect of prior-art desensitization based on optical recognition technology, in which the entire content of a medical record must be recognized, so that the same information is recognized repeatedly and time is wasted.
Drawings
FIG. 1 is a schematic diagram of a system for automated desensitization of medical records according to the present application;
FIG. 2 is a schematic diagram of a template generation module and a training module of a system for automatic desensitization of medical records according to the present application;
FIG. 3 is a schematic flow chart of a method for automatic desensitization of medical records according to the present application;
FIG. 4 is a schematic flow chart of a method for automatic desensitization of medical records according to the present application;
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the scope of the invention.
Example one
As shown in fig. 1-2, the present invention provides a system for automated desensitization of medical records, comprising:
a template generation module 101, configured to label the sensitive information regions of sample medical record forms of different layout types, so as to generate a medical record template for each type of sample medical record form. Part A of fig. 2 shows a schematic diagram of the template generation module 101, which specifically comprises:
an acquisition submodule, configured to acquire sample medical record forms of different layouts and types from different hospitals;
a labeling submodule, configured to label the sensitive information regions in the sample medical record forms, where the sensitive information may be a name, an address, contact details, or other information related to personal privacy;
a classification submodule, configured to classify the labeled sample medical record forms into different types according to their layout structure and the positions of their sensitive information regions;
a recording submodule, configured to record the coordinate values of the labeled sensitive information regions for each type of sample medical record form;
an image preprocessing submodule, configured to denoise and binarize the sample medical record forms labeled by the labeling submodule, so as to obtain a binary image of each processed sample medical record form;
a template generation submodule, configured to generate a template for each type of sample medical record form: according to the coordinate values of the labeled sensitive information region of each sample medical record form of a given type, the sensitive information region covering the largest area is taken as the final sensitive information region for that type, and a sample medical record form labeled with the final sensitive information region is taken as the medical record template for that type. Taking the region with the largest area as the final sensitive information region of each template helps ensure that the sensitive information is fully contained in the desensitized region when a medical record is desensitized; and
a sample expansion submodule, configured to apply affine transformations to each type of sample medical record form recorded by the recording submodule, so as to obtain a specified number of sample medical record forms.
a training module 102, configured to input each type of medical record template processed by the template generation module into a convolutional neural network for training, so as to obtain a neural network model for extracting sensitive information from medical record forms of different layout types. Part B of fig. 2 shows a schematic diagram of the training module 102, which is specifically configured as follows:
N1, each type of sample medical record form, together with its type, is input to the input layer of a convolutional neural network;
N2, the convolutional layer of the convolutional neural network extracts a feature map of the sample medical record form, where the mapping between the sample medical record form and the feature map of the medical record template extracted by the convolutional layer is:
$$x^{m} = f\left(\sum_{i} x^{m}_{i} * k^{m}_{ij} + b^{m}_{j}\right)$$
where $x^{m}$ denotes the output vector of the $m$-th layer; $x^{m}_{i}$ denotes the input vector of the $i$-th node of the $m$-th layer; $k^{m}_{ij}$ denotes the filter parameters of the $i$-th node of the $m$-th layer, which are to be trained; $b^{m}_{j}$ denotes the bias of the $i$-th node of the $m$-th layer, which is to be trained; $m$ denotes the current layer number; $i$ denotes the current node; and $j$ denotes the current layer;
N3, the pooling layer of the convolutional neural network compresses the feature map to extract the main features;
N4, the fully connected layer of the convolutional neural network applies full connection or global averaging to the features extracted by the pooling layer and performs classification, so as to obtain a neural network model for classifying medical record forms.
The squared-error cost function used by the fully connected layer of the convolutional neural network for classification is:
$$E^{N} = \sum_{n=1}^{N} \sum_{k=1}^{c} \left(t^{n}_{k} - y^{n}_{k}\right)^{2}$$
where $N$ denotes the number of sample medical record forms; $E^{N}$ denotes the squared-error cost over the $N$ samples; $c$ denotes the number of medical record template types; $k$ indexes the layout type of the sample medical record form, that is, the dimension of the medical record template type output by the fully connected layer of the convolutional neural network; $t^{n}_{k}$ denotes the $k$-th dimension of the label corresponding to the $n$-th sample; and $y^{n}_{k}$ denotes the $k$-th dimension of the network output corresponding to the $n$-th sample.
The training module 102 of the present application uses supervised learning: the initial parameters of each layer are trained with the back-propagation algorithm, so as to extract features from the training samples.
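To make the training idea concrete, the sketch below performs one gradient-descent update on a single linear output layer under the squared-error cost. It is a deliberately reduced stand-in for back-propagation through a full convolutional network: there are no convolutional or pooling layers, and the learning rate and data are invented for the example. All names are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Weights = List[List[float]]


def forward(w: Weights, b: Vector, x: Vector) -> Vector:
    """Linear layer: y_k = sum_j w[k][j] * x[j] + b[k]."""
    return [sum(wkj * xj for wkj, xj in zip(wk, x)) + bk for wk, bk in zip(w, b)]


def cost(t: Vector, y: Vector) -> float:
    """Squared-error cost for one sample."""
    return sum((tk - yk) ** 2 for tk, yk in zip(t, y))


def sgd_step(w: Weights, b: Vector, x: Vector, t: Vector,
             lr: float = 0.1) -> Tuple[Weights, Vector]:
    """One gradient-descent update using dE/dy_k = -2 * (t_k - y_k)."""
    y = forward(w, b, x)
    grad_y = [-2.0 * (tk - yk) for tk, yk in zip(t, y)]
    new_w = [[wkj - lr * g * xj for wkj, xj in zip(wk, x)] for wk, g in zip(w, grad_y)]
    new_b = [bk - lr * g for bk, g in zip(b, grad_y)]
    return new_w, new_b


w0, b0 = [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]
x, t = [1.0, 2.0], [1.0, 0.0]
before = cost(t, forward(w0, b0, x))
w1, b1 = sgd_step(w0, b0, x, t)
after = cost(t, forward(w1, b1, x))
```

In a full network the same output-layer gradient is propagated backwards through the pooling and convolutional layers to update the filter parameters and biases.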
a desensitization module 103, configured to match the medical record template corresponding to a medical record form to be desensitized, according to the type assigned to that form by the neural network model obtained by the training module, and to label and desensitize the region to be desensitized on that form according to the matched template, so as to obtain a desensitized medical record. The desensitization module specifically comprises:
a matching submodule, configured to match the medical record template corresponding to the medical record form to be desensitized, according to the type assigned to that form by the neural network model obtained by the training module;
a labeling submodule, configured to label the initial region to be desensitized on the medical record form to be desensitized, according to the medical record template;
a positioning submodule, configured to precisely locate the initial region to be desensitized using image processing techniques; and
a desensitization submodule, configured to desensitize each precisely located region separately.
Example two
Based on the system for automatically desensitizing medical records in Example one, the present application further provides a method for automatically desensitizing medical records, as shown in fig. 3-4, comprising:
S301, acquiring a medical record form to be desensitized;
S302, assessing the picture quality of the medical record form to be desensitized, and retaining only forms whose resolution is higher than a specified threshold;
S303, denoising and binarizing the medical record form to be desensitized;
S304, classifying the medical record form processed in S303 with the neural network model for classifying medical record forms obtained by the training module 102 in Example one, so as to obtain the type of the medical record form;
S305, matching the medical record template of the corresponding type according to the type of the medical record form, and obtaining the initial region to be desensitized of the original medical record according to the medical record template;
S306, precisely locating the initial region to be desensitized using image processing techniques, so as to obtain the precisely located region to be desensitized; OCR can be used here for further precise positioning;
S307, desensitizing the precisely located region to be desensitized. The region can be hidden or blurred by mosaic covering or other methods, so that the sensitive private information is desensitized and patient privacy is not disclosed.
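Mosaic covering, as mentioned for the final desensitization step, can be sketched by replacing each small block inside the region with its average gray value. The block size and the averaging rule are assumptions for the example, and `mosaic` is a hypothetical name.

```python
from typing import List, Tuple

Image = List[List[int]]           # grayscale image, values 0..255
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive bounds


def mosaic(img: Image, box: Box, block: int = 2) -> Image:
    """Return a copy of img in which the box is pixelated block by block."""
    x0, y0, x1, y1 = box
    out = [row[:] for row in img]
    for by in range(y0, y1 + 1, block):
        for bx in range(x0, x1 + 1, block):
            ys = range(by, min(by + block, y1 + 1))
            xs = range(bx, min(bx + block, x1 + 1))
            cells = [img[y][x] for y in ys for x in xs]
            avg = sum(cells) // len(cells)  # integer mean of the block
            for y in ys:
                for x in xs:
                    out[y][x] = avg
    return out


img = [[0, 255, 10, 20], [255, 0, 30, 40], [1, 2, 3, 4], [5, 6, 7, 8]]
masked = mosaic(img, (0, 0, 1, 1), block=2)
```

The block size should be large relative to the stroke width of the text; with blocks that are too small, the covered content can remain legible.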
In this way, the method can automatically desensitize medical records of different types efficiently and accurately, so that patient privacy is not disclosed when non-medical personnel retrieve and study the medical record information. It overcomes the limited efficiency and reliability of manual medical record desensitization in the prior art. In addition, because the method first classifies the original medical record and locates the initial region to be desensitized, and only then performs precise positioning and desensitization, it overcomes the defect of prior-art desensitization based on optical recognition technology, in which the same information is recognized repeatedly and time is wasted.
In summary, the system for automatically desensitizing medical records and its application, as provided by the present application, can automatically desensitize medical records of different types efficiently and accurately, avoiding disclosure of patient privacy. It overcomes the limited efficiency and reliability of manual medical record desensitization in the prior art, and it overcomes the defect of prior-art desensitization based on optical recognition technology, in which the entire content of a medical record must be recognized, so that the same information is recognized repeatedly and time is wasted.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that are within the spirit and principle of the present invention are intended to be included in the scope of the present invention.

Claims (10)

1. A system for automatic desensitization of medical records, comprising:
a template generation module, configured to classify sample medical record forms of different layout types and to acquire the sensitive information region corresponding to each type, so as to generate a medical record template for each type of sample medical record form;
a training module, configured to input the sample medical record forms corresponding to the medical record templates into a convolutional neural network for training, so as to obtain a neural network model for classifying medical record forms; and
a desensitization module, configured to match a medical record form to be desensitized with a medical record template according to the type assigned to it by the neural network model obtained by the training module, and to label and desensitize the region to be desensitized according to the matched template, so as to obtain a desensitized medical record.
2. The system of claim 1, wherein the template generation module comprises:
an acquisition submodule, configured to acquire sample medical record sheets of different layouts and types from different hospitals;
a labeling submodule, configured to label the sensitive information region in each sample medical record sheet;
a classification submodule, configured to classify the labeled sample medical record sheets into different types according to their layout structures and the positions of their sensitive information regions;
a recording submodule, configured to record the coordinate values of the labeled sensitive information region of each type of sample medical record sheet;
and a template generation submodule, configured to generate a template for each type of sample medical record sheet: based on the coordinate values of the labeled sensitive information regions of the sample sheets of a given type, a region that covers the labeled sensitive information region of every sample sheet of that type is taken as the final sensitive information region for the type, and a sample sheet labeled with the final sensitive information region is taken as the medical record template for the type.
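The covering-region rule in claim 2 can be illustrated with a minimal sketch. It assumes that each labeled sensitive information region is an axis-aligned rectangle given as `(x_min, y_min, x_max, y_max)` in pixels; the patent only speaks of "coordinate values", so this representation and the names `covering_box` and `build_template` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed representation: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[int, int, int, int]


def covering_box(boxes: Iterable[Box]) -> Box:
    """Return the smallest rectangle that covers every labeled rectangle."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one labeled region is required")
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def build_template(labeled: Dict[str, List[Box]]) -> Dict[str, Box]:
    """Build one template for a sheet type.

    `labeled` maps a field name (for example "name") to the rectangles
    labeled for that field across all sample sheets of the same type.
    """
    return {field: covering_box(boxes) for field, boxes in labeled.items()}


# Two sample sheets of the same type, with slightly shifted "name" fields.
template = build_template({"name": [(10, 20, 110, 50), (12, 18, 105, 55)]})
print(template)
```

Taking the minimum and maximum of the corner coordinates is the simplest region that satisfies "covers the labeled region of every sample"; a real system might add a safety margin.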
3. The system of claim 2, wherein the template generation module further comprises:
an image preprocessing submodule, configured to denoise and binarize the sample medical record sheets labeled by the labeling submodule.
4. The system of claim 3, wherein the template generation module further comprises:
a sample expansion submodule, configured to apply affine transformations to each type of sample medical record sheet recorded by the recording submodule, so as to obtain a specified number of sample medical record sheets.
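As a rough illustration of the sample expansion in claim 4, the sketch below generates extra samples by applying small random affine transformations (rotation, scale, shift) to a grayscale image stored as a list of rows. The parameter ranges, the nearest-neighbour resampling, and the names `random_affine`, `warp_affine`, and `expand_samples` are assumptions made for illustration; the patent does not specify them, and a production system would more likely use an image library.

```python
import math
import random


def random_affine(rng, max_rotation_deg=3.0, max_scale=0.05, max_shift=2.0):
    """Return a 2x3 matrix [[a, b, tx], [c, d, ty]] for a small random transform."""
    theta = math.radians(rng.uniform(-max_rotation_deg, max_rotation_deg))
    s = 1.0 + rng.uniform(-max_scale, max_scale)
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    return [[s * math.cos(theta), -s * math.sin(theta), tx],
            [s * math.sin(theta), s * math.cos(theta), ty]]


def warp_affine(image, matrix, fill=255):
    """Apply the affine map to an image using inverse nearest-neighbour lookup."""
    h, w = len(image), len(image[0])
    (a, b, tx), (c, d, ty) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("affine matrix is not invertible")
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Invert (x, y) = A (sx, sy) + t to find the source pixel.
            dx, dy = x - tx, y - ty
            sx = round((d * dx - b * dy) / det)
            sy = round((-c * dx + a * dy) / det)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def expand_samples(image, count, seed=0):
    """Produce `count` transformed copies of one sample sheet."""
    rng = random.Random(seed)
    return [warp_affine(image, random_affine(rng)) for _ in range(count)]
```

Pixels that map outside the source image are filled with white (255), which matches a scanned page background under the assumption that sheets are dark text on a light page.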
5. The system of claim 4, wherein the training module comprises an input submodule and a convolutional neural network:
the input submodule is configured to input each type of sample medical record sheet, together with its type, into the input layer of the convolutional neural network;
the convolutional layer of the convolutional neural network is configured to extract feature maps of the sample medical record sheets;
the pooling layer of the convolutional neural network is configured to compress the feature maps and extract the main features;
and the fully connected layer of the convolutional neural network is configured to apply full connection or global averaging to the features extracted by the pooling layer and to perform classification, so as to obtain the neural network model for classifying medical record sheets.
6. The system of claim 5, wherein the convolutional layer of the convolutional neural network extracts the feature map from the medical record template and the sample medical record sheet according to the following mapping:
$$x^m = f\left(\sum_i x_i^m * k_{ij}^m + b_j^m\right)$$
wherein $x^m$ denotes the output vector of the $m$-th layer; $x_i^m$ denotes the input vector of the $i$-th node of the $m$-th layer; $k_{ij}^m$ denotes the filter parameters to be trained for the $i$-th node of the $m$-th layer; $b_j^m$ denotes the bias to be trained for the $i$-th node of the $m$-th layer; $m$ denotes the current layer number; $i$ denotes the current node; and $j$ denotes the current layer.
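The mapping in claim 6 can be sketched for a single output map as follows. The sketch makes several assumptions that the claim does not state: the sum runs over the input maps $x_i^m$, each paired with one kernel $k_{ij}^m$ of the same size; the operator `*` is implemented as a "valid" sliding-window cross-correlation (no kernel flip), as is common in CNN libraries; and the activation $f$ is ReLU. The function names are hypothetical.

```python
def correlate2d_valid(x, k):
    """Slide kernel k over map x without padding and return the response map."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[r + u][c + v] * k[u][v]
                 for u in range(kh) for v in range(kw))
             for c in range(ow)]
            for r in range(oh)]


def relu(value):
    """Assumed activation f; the claim leaves f unspecified."""
    return max(0.0, value)


def conv_layer_output(inputs, kernels, bias, f=relu):
    """Compute f(sum_i x_i * k_ij + b_j) for one output map j.

    inputs:  list of 2-D input maps x_i (all the same size)
    kernels: list of 2-D kernels k_ij, one per input map (all the same size)
    bias:    scalar b_j added to every position
    """
    if len(inputs) != len(kernels):
        raise ValueError("one kernel is required per input map")
    partial = [correlate2d_valid(x, k) for x, k in zip(inputs, kernels)]
    oh, ow = len(partial[0]), len(partial[0][0])
    return [[f(sum(p[r][c] for p in partial) + bias) for c in range(ow)]
            for r in range(oh)]


# One 2x2 input map and one 2x2 kernel give a single output value.
print(conv_layer_output([[[1, 2], [3, 4]]], [[[1, 0], [0, 1]]], bias=1))
```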
7. The system of claim 6, wherein the squared cost function used for classification by the fully connected layer of the convolutional neural network is:
$$E^N = \sum_{n=1}^{N} \sum_{k=1}^{c} \left(t_k^n - y_k^n\right)^2$$
wherein $N$ denotes the number of sample medical record sheets; $E^N$ denotes the cost over the types output for the $N$ sample sheets; $c$ denotes the number of medical record template types; $k$ indexes the dimensions of the layout type of the sample medical record sheet and of the medical record template type output by the fully connected layer; $t_k^n$ denotes the $k$-th dimension of the label of the $n$-th sample; and $y_k^n$ denotes the $k$-th dimension of the network output for the $n$-th sample.
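A small worked example may help with the squared cost in claim 7. The sketch assumes that labels are one-hot vectors of length $c$ and that network outputs are vectors of the same length; the claim only says that $t$ and $y$ have $k$ dimensions, so the one-hot encoding and the names `one_hot` and `squared_cost` are illustrative assumptions.

```python
def one_hot(index, num_types):
    """Encode a template type index as a label vector t of length c."""
    if not 0 <= index < num_types:
        raise ValueError("type index out of range")
    return [1.0 if k == index else 0.0 for k in range(num_types)]


def squared_cost(targets, outputs):
    """Sum (t_k^n - y_k^n)^2 over all N samples and all c dimensions."""
    if len(targets) != len(outputs):
        raise ValueError("targets and outputs must cover the same samples")
    total = 0.0
    for t, y in zip(targets, outputs):
        if len(t) != len(y):
            raise ValueError("label and output must have the same dimension")
        total += sum((tk - yk) ** 2 for tk, yk in zip(t, y))
    return total


# N = 2 samples, c = 3 template types.
targets = [one_hot(0, 3), one_hot(1, 3)]
outputs = [[0.5, 0.5, 0.0], [0.0, 1.0, 0.0]]
print(squared_cost(targets, outputs))  # 0.25 + 0.25 + 0 for sample 1, 0 for sample 2
```

The second sample is classified perfectly and contributes nothing, so the whole cost comes from the first sample's two half-wrong dimensions.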
8. The system of claim 1, wherein the desensitization module comprises:
a matching submodule, configured to match the medical record template corresponding to the medical record sheet to be desensitized according to the type of that sheet as determined by the neural network model obtained by the training module;
a labeling submodule, configured to label an initial region to be desensitized on the medical record sheet to be desensitized according to the medical record template;
a positioning submodule, configured to precisely locate the region to be desensitized within the initial region by means of image processing techniques;
and a desensitization submodule, configured to desensitize each precisely located region individually.
9. The system of claim 2, wherein the sensitive information includes, but is not limited to, at least one of: name, address, and contact information.
10. A method for automatically desensitizing medical records, based on the system of any one of claims 1-9, comprising:
A. acquiring a medical record sheet to be desensitized;
B. assessing the image quality of the medical record sheet to be desensitized, and retaining only sheets whose resolution is higher than a specified threshold;
C. denoising and binarizing the medical record sheet to be desensitized;
D. classifying the processed medical record sheet with the neural network model for classifying medical record sheets, so as to obtain the type to which the sheet belongs;
E. matching the medical record template of the corresponding type according to the type of the sheet, thereby obtaining an initial region to be desensitized of the medical record sheet;
F. precisely locating the region to be desensitized within the initial region by means of image processing techniques;
G. desensitizing the precisely located region.
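Steps B through G of claim 10 can be strung together in a short sketch. Everything concrete here is an assumption for illustration: image quality is approximated by minimum pixel dimensions, denoising is omitted, binarization uses a fixed threshold, the classifier is passed in as a callable standing in for the trained neural network, precise positioning shrinks the template rectangle to the tight bounds of the dark pixels inside it, and desensitization paints a black bar. The patent does not prescribe any of these choices, and all function names are hypothetical.

```python
def passes_quality(image, min_height, min_width):
    """Step B (assumed criterion): keep images with enough pixels."""
    return len(image) >= min_height and len(image[0]) >= min_width


def binarize(image, threshold=128):
    """Step C: map each grayscale pixel to 0 (ink) or 255 (background)."""
    return [[0 if p < threshold else 255 for p in row] for row in image]


def refine_box(binary, box):
    """Step F: shrink the template box to the ink pixels it contains."""
    h, w = len(binary), len(binary[0])
    x0, y0 = max(box[0], 0), max(box[1], 0)
    x1, y1 = min(box[2], w - 1), min(box[3], h - 1)
    ink = [(x, y) for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)
           if binary[y][x] == 0]
    if not ink:
        return None  # nothing to hide in this region
    xs = [p[0] for p in ink]
    ys = [p[1] for p in ink]
    return (min(xs), min(ys), max(xs), max(ys))


def mask_box(image, box, fill=0):
    """Step G: overwrite the region in place with a solid bar."""
    x0, y0, x1, y1 = box
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            image[y][x] = fill


def desensitize(image, classify, templates, min_size=(4, 4)):
    """Run steps B-G; `classify` stands in for the trained CNN (step D)."""
    if not passes_quality(image, *min_size):
        return None
    binary = binarize(image)
    sheet_type = classify(binary)
    result = [row[:] for row in binary]
    for box in templates[sheet_type]:  # step E: initial regions
        precise = refine_box(binary, box)
        if precise is not None:
            mask_box(result, precise)
    return result
```

Refining the box before masking keeps the redaction tight around the actual text, which is the point of step F: the template only says roughly where the sensitive field lies.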
CN201811378972.9A 2018-11-19 2018-11-19 System for automatically desensitizing medical records and application Active CN111199050B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811378972.9A CN111199050B (en) 2018-11-19 2018-11-19 System for automatically desensitizing medical records and application

Publications (2)

Publication Number Publication Date
CN111199050A true CN111199050A (en) 2020-05-26
CN111199050B CN111199050B (en) 2023-10-17

Family

ID=70743973

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811378972.9A Active CN111199050B (en) 2018-11-19 2018-11-19 System for automatically desensitizing medical records and application

Country Status (1)

Country Link
CN (1) CN111199050B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361254A (en) * 2021-06-03 2021-09-07 重庆南鹏人工智能科技研究院有限公司 Automatic electronic medical record analysis method and device
CN116610659A (en) * 2023-05-22 2023-08-18 南方医科大学南方医院 Method for constructing database of liver cancer specific disease, database, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170039222A1 (en) * 2014-04-29 2017-02-09 Farrow Norris Pty Ltd Method and system for comparative data analysis
CN107239666A (en) * 2017-06-09 2017-10-10 孟群 Method and system for desensitizing medical imaging data
US20180232528A1 (en) * 2017-02-13 2018-08-16 Protegrity Corporation Sensitive Data Classification
CN108831559A (en) * 2018-06-20 2018-11-16 清华大学 Chinese electronic health record text analysis method and system
CN109920501A (en) * 2019-01-24 2019-06-21 西安交通大学 Electronic health record classification method and system based on convolutional neural networks and active learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
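程健一; 关毅; 何彬: "De-identification of English electronic medical records based on a two-layer SVM and CRF classifier", vol. 6, no. 06, pages 17-24 *
臧昊; 赵强; 卞水荣: "Research and design of XML-based desensitization technology for private data in electronic medical records", no. 03, pages 111-114 *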

Also Published As

Publication number Publication date
CN111199050B (en) 2023-10-17

Similar Documents

Publication Publication Date Title
CN109086756B (en) Text detection analysis method, device and equipment based on deep neural network
CN110210413B (en) Multidisciplinary test paper content detection and identification system and method based on deep learning
CN105574550A (en) Vehicle identification method and device
WO2020164278A1 (en) Image processing method and device, electronic equipment and readable storage medium
Chandran et al. Missing child identification system using deep learning and multiclass SVM
CN111695392A (en) Face recognition method and system based on cascaded deep convolutional neural network
CN112861575A (en) Pedestrian structuring method, device, equipment and storage medium
EP3867808A1 (en) Method and device for automatic identification of labels of image
CN111199050B (en) System for automatically desensitizing medical records and application
CN110895661A (en) Behavior identification method, device and equipment
CN110276405B (en) Method and apparatus for outputting information
CN110516638B (en) Sign language recognition method based on track and random forest
CN110969173B (en) Target classification method and device
CN113076860B (en) Bird detection system under field scene
CN110458120B (en) Method and system for identifying different vehicle types in complex environment
CN112417974A (en) Public health monitoring method
CN111950556A (en) License plate printing quality detection method based on deep learning
CN110728316A (en) Classroom behavior detection method, system, device and storage medium
CN115984968A (en) Student time-space action recognition method and device, terminal equipment and medium
CN116912872A (en) Drawing identification method, device, equipment and readable storage medium
CN111209924B (en) System for automatically extracting medical advice and application
CN115424293A (en) Living body detection method, and training method and device of living body detection model
Bhatt et al. Text Extraction & Recognition from Visiting Cards
CN112613341A (en) Training method and device, fingerprint identification method and device, and electronic device
CN112330652A (en) Chromosome recognition method and device based on deep learning and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant