CN116451690A - Medical field named entity identification method - Google Patents
- Publication number: CN116451690A (application number CN202310282404.3A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/247—Thesauruses; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/60—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Abstract
The invention provides a method for identifying named entities in the medical field, characterized by comprising the following steps: S1, acquiring data related to the medical field and labeling the medical data; S2, augmenting the labeled data with the EDA method, comprising synonym replacement, random insertion, random swap, and random deletion; S3, constructing a Bert (pre-trained on a large medical data set, with its position encodings extended by a hierarchical decomposition method) + Bi-GRU (fused with an attention mechanism) + CRF model; S4, tuning and optimizing the model with 10-fold cross-validation. The invention effectively alleviates the problem of data scarcity, effectively extracts complex medical entities from very long text, and uses K-fold cross-validation to tune and optimize the model.
Description
Technical Field
The invention relates to the application of artificial intelligence technology in the medical field, and in particular to a method for identifying named entities in the medical field.
Background
Medical named entity recognition refers to identifying the boundaries of medical entities in medical text and judging their categories; common medical entity categories include disease names, body parts, drug information, examination or test items, symptoms, and the like. The accuracy of medical named entity recognition affects downstream tasks such as event extraction and relation extraction. It is a key task of medical text mining and provides a foundation for building medical ICD coding systems, health care systems, intelligent medical question-answering systems, and medical knowledge graphs, so a good medical named entity recognition method is of far-reaching significance.
Existing medical named entity recognition technologies ignore the problem of data scarcity, the deep learning methods used are too simple to produce good results on complex medical entities and the very long texts of electronic medical records, and the models lack an optimization procedure.
Disclosure of Invention
The invention aims to provide a method for identifying named entities in the medical field, in order to solve the problems of the prior art noted in the background: the problem of data scarcity is ignored, the deep learning methods used are too simple and often fail to produce good results on complex medical entities and the very long texts of electronic medical records, and the models lack an optimization procedure.
In order to achieve the above purpose, the present invention provides the following technical solutions:
a method of medical domain named entity identification, comprising the steps of:
s1, acquiring related data in the medical field, and marking the medical data;
s2, augmenting the labeled data (mainly the unlabeled words) with the EDA method (a data augmentation technique from text classification), which comprises: synonym replacement, random insertion, random swap, and random deletion;
s3, constructing a Bert (pre-trained on a large medical data set, with its position encodings extended by a hierarchical decomposition method) + Bi-GRU (fused with an attention mechanism) + CRF model;
s4, tuning the parameters and optimizing the model with 10-fold cross-validation.
The step S1 of acquiring data related to the medical field specifically comprises the following steps:
s1.1, acquiring electronic medical record data by interfacing with medical institutions;
s1.2, crawling medical-field data from the web, which specifically comprises the following steps:
s1.2.1, acquiring the URL of the target medical data;
s1.2.2, submitting an HTTP request to the corresponding URL;
s1.2.3, parsing the HTTP response;
s1.2.4, storing the parsed result.
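The four crawling sub-steps can be sketched as follows. This is a minimal illustration, not the patent's implementation; the record fields and file format are assumptions.

```python
import json
import urllib.request

def fetch(url):
    """S1.2.1/S1.2.2: submit an HTTP request to the target URL, return the raw body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")

def parse_and_store(url, body, out_path):
    """S1.2.3/S1.2.4: parse the response into a record and store it as JSON."""
    record = {"url": url, "text": body.strip()}  # minimal "parsing": whitespace cleanup
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(record, f, ensure_ascii=False)
    return record
```

In practice `fetch` would be followed by HTML extraction before storage; the two-function split simply mirrors the patent's step boundaries.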
In step S1, the acquired medical data is labeled; the main label types include: diagnosis, surgery, treatment, examination, drug, and body site.
The processing of the labeled data in step S2 specifically includes the following:
(1) synonym replacement: randomly select 1-10 non-stop words from the sentence; replace each selected word with a randomly chosen synonym;
(2) random insertion: find a non-stop word in the sentence, randomly select one of its synonyms, and insert it at an arbitrary position in the sentence; repeat 1-10 times;
(3) random swap: arbitrarily select two words in the sentence and exchange their positions; repeat 1-10 times;
(4) random deletion: each word in the sentence is randomly deleted with probability 0.1, and is otherwise kept.
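The four EDA operations can be sketched as below. This is an illustrative word-level sketch; the toy synonym table and stop-word list stand in for a real medical thesaurus, which the patent does not specify.

```python
import random

# Toy synonym table and stop-word list (assumptions, standing in for a medical thesaurus).
SYNONYMS = {"pain": ["ache"], "severe": ["acute"], "stomach": ["abdomen"]}
STOP_WORDS = {"the", "a", "in", "of"}

def synonym_replace(words, n=1):
    """(1) Replace up to n randomly chosen non-stop words that have a synonym."""
    out = words[:]
    candidates = [i for i, w in enumerate(out) if w not in STOP_WORDS and w in SYNONYMS]
    for i in random.sample(candidates, min(n, len(candidates))):
        out[i] = random.choice(SYNONYMS[out[i]])
    return out

def random_insert(words, n=1):
    """(2) Insert a synonym of a random non-stop word at a random position, n times."""
    out = words[:]
    for _ in range(n):
        candidates = [w for w in out if w in SYNONYMS]
        if not candidates:
            break
        syn = random.choice(SYNONYMS[random.choice(candidates)])
        out.insert(random.randrange(len(out) + 1), syn)
    return out

def random_swap(words, n=1):
    """(3) Swap the positions of two randomly chosen words, n times."""
    out = words[:]
    for _ in range(n):
        i, j = random.randrange(len(out)), random.randrange(len(out))
        out[i], out[j] = out[j], out[i]
    return out

def random_delete(words, p=0.1):
    """(4) Drop each word independently with probability p."""
    out = [w for w in words if random.random() > p]
    return out or words[:1]  # never return an empty sentence
```

Chinese medical text would operate on characters or segmented words rather than space-separated tokens; the control flow is the same.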
The step S3 specifically includes the following:
s3.1, obtaining from the Internet a Bert model pre-trained on a large-scale medical data set;
s3.2, constructing a Bert layer with hierarchically decomposed position encodings;
s3.3, constructing a Bi-GRU layer with an attention mechanism.
the step S3.2 specifically includes the following:
specifically, let the maximum position code length that Bert defaults to be trainable be n and the corresponding position code vector be p 1 ,p 2 ,···,p n The new coding vector which can be constructed in turn by the method is q 1 ,q 2 ,···,q m Wherein m=n 2 ,
q (i-1)×n+j =au i +(1+a)u j
Where i is the position index of the first layer, j is the position index of the second layer, n is the Bert layer length, a ε (0, 1) and a+.0.5.
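The hierarchical decomposition above can be sketched directly. A minimal illustration with NumPy standing in for trainable embedding tables; the two nested loops index the first and second layers:

```python
import numpy as np

def hierarchical_position_codes(p, a=0.4):
    """Expand n position vectors p (n x d) into n*n vectors via
    q_{(i-1)*n + j} = a * p_i + (1 - a) * p_j  (1-based i, j in the text;
    0-based here). Note q_{(i-1)*n + i} = p_i, so the original codes survive."""
    n, d = p.shape
    q = np.empty((n * n, d))
    for i in range(n):          # first-layer position index
        for j in range(n):      # second-layer position index
            q[i * n + j] = a * p[i] + (1 - a) * p[j]
    return q
```

The constraint a ≠ 0.5 mirrors the patent's condition; it keeps the two layers distinguishable.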
The step S3.3 specifically includes the following:
For any time step i, given a minibatch of input data X_i ∈ R^{n×d}, where n is the batch length and d is the vector length, let the hidden-layer activation function be φ. The forward and reverse hidden states of this time step are l_i ∈ R^{n×h} and r_i ∈ R^{n×h} respectively, where h is the number of hidden units. The forward and reverse hidden states are updated as

l_i = φ(X_i·W_xl + l_{i-1}·W_ll + b_l)
r_i = φ(X_i·W_xr + r_{i+1}·W_rr + b_r)

where W_xl, W_xr ∈ R^{d×h} and W_ll, W_rr ∈ R^{h×h} are the weight terms and b_l, b_r ∈ R^{1×h} are the bias terms; X_i is formed according to the self-attention mechanism.
Here r_i corresponds to the i-th data item after embedding. Taking r_i as the query and all the input data r_1, r_2, ..., r_n as the keys and values, with attention scoring function f,

X_i = concat(r_i, Σ_{j=1}^{n} β(r_i, r_j)·r_j)

where concat denotes the concatenation of two vectors, and β(r_i, r_j) is obtained by mapping the query and key vectors to a scalar through the attention scoring function (here the scaled dot-product scoring function f(q, k) = q^T·k / √d) and then applying the softmax function:

β(r_i, r_j) = exp(f(r_i, r_j)) / Σ_{k=1}^{n} exp(f(r_i, r_k))
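The self-attention construction of X_i can be sketched as below: each embedding acts as the query against all embeddings as keys and values under the scaled dot-product score, and the attended vector is concatenated to the original. A minimal NumPy sketch, not the patent's trained layer:

```python
import numpy as np

def self_attention_inputs(r):
    """Given embeddings r (n x d), compute X with rows
    X_i = concat(r_i, sum_j beta(r_i, r_j) * r_j), where
    beta = softmax over j of f(r_i, r_j) = r_i . r_j / sqrt(d)."""
    n, d = r.shape
    scores = r @ r.T / np.sqrt(d)                    # f(r_i, r_j) for all pairs
    beta = np.exp(scores - scores.max(axis=1, keepdims=True))
    beta /= beta.sum(axis=1, keepdims=True)          # row-wise softmax over j
    attended = beta @ r                              # sum_j beta(r_i, r_j) r_j
    return np.concatenate([r, attended], axis=1)     # X_i in R^{2d}
```

Subtracting the row maximum before exponentiating is the standard numerically stable softmax; it does not change β.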
the step S4 specifically includes the following:
s4.1, dividing data into K parts (K is 10 here), taking 1 part of the data as a test set and the rest as a training set, obtaining 10 training sets and verification sets here, training a model sequentially by the data, and obtaining 10 error average values;
and S4.2, reasonably adjusting the model hyper-parameters and the neural network structure, repeating the step S4.1, finding the model with the optimal error result, and training the optimal model by using all data.
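Step S4.1 can be sketched as a generic K-fold loop. The `train_and_eval` callable is an illustrative placeholder for the patent's model training and error evaluation:

```python
import numpy as np

def kfold_cv_errors(data, labels, train_and_eval, k=10):
    """S4.1: split the data into k parts; hold each part out in turn as the
    validation set, train on the rest, and collect the k validation errors."""
    idx = np.arange(len(data))
    folds = np.array_split(idx, k)
    errors = []
    for held in range(k):
        val = folds[held]
        train = np.concatenate([folds[f] for f in range(k) if f != held])
        errors.append(train_and_eval(data[train], labels[train],
                                     data[val], labels[val]))
    return float(np.mean(errors)), errors
```

Step S4.2 then wraps this in a hyper-parameter search and retrains the best configuration on all of the data.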
Compared with the prior art, the invention has the following beneficial effects:
The invention can be used to extract named entities from medical text. EDA data augmentation is applied to the original data; the model uses a Bert pre-trained on a large-scale medical data set as the embedding layer, hierarchically decomposes the position encodings of the Bert layer, and builds a Bi-GRU + CRF layer with a self-attention mechanism; the model is trained and validated by 10-fold cross-validation, with the model hyper-parameters and the neural-network structure adjusted accordingly.
Drawings
FIG. 1 is a general flow chart of the present invention;
FIG. 2 is a flow chart of the invention for acquiring data related to the medical field and labeling the medical data;
FIG. 3 is a schematic diagram of a hierarchical decomposition of a position code when constructing a Bert layer containing the hierarchical decomposition of the position code according to the present invention;
FIG. 4 is a schematic diagram of a method for constructing a Bi-GRU layer containing an attention mechanism according to the present invention;
FIG. 5 is a flow chart showing how X_i is formed in step S3.3 of the present invention;
FIG. 6 is a specific flow of model construction in step S3 of the present invention;
fig. 7 is a flowchart showing the step S4 of the present invention.
Detailed Description
In order to clarify the technical problems, technical solutions, implementation process, and performance, the present invention will be further described in detail below with reference to examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the invention. Various exemplary embodiments, features, and aspects of the disclosure are described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
In addition, numerous specific details are set forth in the following detailed description in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art have not been described in detail in order not to obscure the present disclosure.
Example 1
As shown in fig. 1, a method for identifying named entities in a medical field includes the following steps:
s1, acquiring related data in the medical field, and marking the medical data;
s2, augmenting the labeled data (mainly the unlabeled words) with the EDA method (a data augmentation technique from text classification), which comprises: synonym replacement, random insertion, random swap, and random deletion;
s3, constructing a Bert (pre-trained on a large medical data set, with its position encodings extended by a hierarchical decomposition method) + Bi-GRU (fused with an attention mechanism) + CRF model;
s4, tuning the parameters and optimizing the model with 10-fold cross-validation.
As shown in fig. 2, the step S1 of acquiring data related to the medical field specifically includes the following steps:
s1.1, acquiring electronic medical record data by interfacing with medical institutions;
s1.2, crawling medical-field data from the web, which specifically comprises the following steps:
s1.2.1, acquiring the URL of the target medical data;
s1.2.2, submitting an HTTP request to the corresponding URL;
s1.2.3, parsing the HTTP response;
s1.2.4, storing the parsed result.
In step S1, the acquired medical data is labeled; the main label types include: diagnosis, surgery, treatment, examination, drug, and body site.
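Labeled data of this kind is commonly encoded per character in a BIO scheme over the six label types. A minimal sketch; the tag abbreviation `DIAG` and the example span are illustrative assumptions, not from the patent:

```python
def bio_tags(sentence, spans):
    """Turn (start, end, type) entity spans into per-character BIO tags,
    e.g. a 'DIAG' span yields 'B-DIAG' at its first character and
    'I-DIAG' on the rest; all other characters get 'O'."""
    tags = ["O"] * len(sentence)
    for start, end, etype in spans:        # end is exclusive
        tags[start] = "B-" + etype
        for k in range(start + 1, end):
            tags[k] = "I-" + etype
    return tags
```

The resulting tag sequence is what the CRF layer of step S3 predicts.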
The processing of the labeled data in step S2 specifically includes the following:
(1) synonym replacement: randomly select 1-10 non-stop words from the sentence; replace each selected word with a randomly chosen synonym;
(2) random insertion: find a non-stop word in the sentence, randomly select one of its synonyms, and insert it at an arbitrary position in the sentence; repeat 1-10 times;
(3) random swap: arbitrarily select two words in the sentence and exchange their positions; repeat 1-10 times;
(4) random deletion: each word in the sentence is randomly deleted with probability 0.1, and is otherwise kept.
The step S3 specifically includes the following:
s3.1, obtaining from the Internet a Bert model pre-trained on a large-scale medical data set;
s3.2, constructing a Bert layer with hierarchically decomposed position encodings;
s3.3, constructing a Bi-GRU layer with an attention mechanism.
as shown in fig. 3, the step S3.2 specifically includes the following:
specifically, let the maximum position code length that Bert defaults to be trainable be n and the corresponding position code vector be p 1 ,p 2 ,···,p n The new coding vector which can be constructed in turn by the method is q 1 ,q 2 ,···,q m Wherein m=n 2 ,
q (i-1)×n+j =au i +(1+a)u j
Where i is the position index of the first layer, j is the position index of the second layer, n is the Bert layer length, a ε (0, 1) and a+.0.5.
As shown in fig. 4, the step S3.3 specifically includes the following:
For any time step i, given a minibatch of input data X_i ∈ R^{n×d}, where n is the batch length and d is the vector length, let the hidden-layer activation function be φ. The forward and reverse hidden states of this time step are l_i ∈ R^{n×h} and r_i ∈ R^{n×h} respectively, where h is the number of hidden units. The forward and reverse hidden states are updated as

l_i = φ(X_i·W_xl + l_{i-1}·W_ll + b_l)
r_i = φ(X_i·W_xr + r_{i+1}·W_rr + b_r)

where W_xl, W_xr ∈ R^{d×h} and W_ll, W_rr ∈ R^{h×h} are the weight terms and b_l, b_r ∈ R^{1×h} are the bias terms; X_i is formed according to the self-attention mechanism, and the specific flow of its formation is shown in fig. 5.
Here r_i corresponds to the i-th data item after embedding. Taking r_i as the query and all the input data r_1, r_2, ..., r_n as the keys and values, with attention scoring function f,

X_i = concat(r_i, Σ_{j=1}^{n} β(r_i, r_j)·r_j)

where concat denotes the concatenation of two vectors, and β(r_i, r_j) is obtained by mapping the query and key vectors to a scalar through the attention scoring function (here the scaled dot-product scoring function f(q, k) = q^T·k / √d) and then applying the softmax function:

β(r_i, r_j) = exp(f(r_i, r_j)) / Σ_{k=1}^{n} exp(f(r_i, r_k))
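The CRF layer that sits on top of the Bi-GRU outputs is decoded at inference time with the Viterbi algorithm. A minimal sketch; the emission and transition matrices are illustrative stand-ins for the trained model's scores:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Find the highest-scoring tag sequence under a linear-chain CRF.
    emissions: (T, K) per-token tag scores (e.g. from the Bi-GRU layer);
    transitions: (K, K) tag-transition scores, transitions[a, b] scoring
    a move from tag a to tag b."""
    T, K = emissions.shape
    score = emissions[0].copy()          # best score ending in each tag at t=0
    back = np.zeros((T, K), dtype=int)   # backpointers
    for t in range(1, T):
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)   # best previous tag for each current tag
        score = total.max(axis=0)
    best = [int(score.argmax())]
    for t in range(T - 1, 0, -1):        # follow backpointers to recover the path
        best.append(int(back[t][best[-1]]))
    return best[::-1], float(score.max())
```

The transition matrix is what lets the CRF forbid invalid tag sequences (e.g. an I- tag without a preceding B- tag), which per-token classification alone cannot do.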
as shown in fig. 6, the step S4 specifically includes the following:
s4.1, dividing data into K parts (K is 10 here), taking 1 part of the data as a test set and the rest as a training set, obtaining 10 training sets and verification sets here, training a model sequentially by the data, and obtaining 10 error average values;
and S4.2, reasonably adjusting the model hyper-parameters and the neural network structure, repeating the step S4.1, finding the model with the optimal error result, and training the optimal model by using all data.
The foregoing has shown and described the basic principles, principal features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the above-described embodiments; the above embodiments and descriptions merely illustrate the principles of the invention, and various changes and modifications may be made without departing from the spirit and scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.
Claims (7)
1. A method for identifying named entities in a medical field, comprising the steps of:
s1, acquiring related data in the medical field, and marking the medical data;
s2, augmenting the labeled data with the EDA method, comprising: synonym replacement, random insertion, random swap, and random deletion;
s3, constructing a Bert (pre-trained on a large medical data set, with its position encodings extended by a hierarchical decomposition method) + Bi-GRU (fused with an attention mechanism) + CRF model;
s4, tuning the parameters and optimizing the model with 10-fold cross-validation.
2. The method for identifying named entities in the medical field according to claim 1, wherein the step S1 of acquiring data related to the medical field specifically includes the following steps:
s1.1, acquiring electronic medical record data by interfacing with medical institutions;
s1.2, crawling medical-field data from the web, which specifically comprises the following steps:
s1.2.1, acquiring the URL of the target medical data;
s1.2.2, submitting an HTTP request to the corresponding URL;
s1.2.3, parsing the HTTP response;
s1.2.4, storing the parsed result;
wherein in step S1 the acquired medical data is labeled, and the main label types include: diagnosis, surgery, treatment, examination, drug, and body site.
3. The method for identifying named entities in the medical field according to claim 1, wherein the processing of the labeled data in step S2 specifically includes the following:
(1) synonym replacement: randomly select 1-10 non-stop words from the sentence; replace each selected word with a randomly chosen synonym;
(2) random insertion: find a non-stop word in the sentence, randomly select one of its synonyms, and insert it at an arbitrary position in the sentence; repeat 1-10 times;
(3) random swap: arbitrarily select two words in the sentence and exchange their positions; repeat 1-10 times;
(4) random deletion: each word in the sentence is randomly deleted with probability 0.1, and is otherwise kept.
4. The method for identifying named entities in the medical field according to claim 1, wherein the step S3 comprises the following steps:
s3.1, obtaining from the Internet a Bert model pre-trained on a large-scale medical data set;
s3.2, constructing a Bert layer with hierarchically decomposed position encodings;
s3.3, constructing a Bi-GRU layer with an attention mechanism.
5. The method for identifying named entities in the medical field according to claim 4, wherein the step S3.2 specifically comprises the following:
Let the maximum trainable position encoding length of Bert by default be n, with corresponding position encoding vectors p_1, p_2, ..., p_n; the method constructs new encoding vectors q_1, q_2, ..., q_m, where m = n^2, as

q_{(i-1)×n+j} = a·p_i + (1-a)·p_j

where i is the position index of the first layer, j is the position index of the second layer, n is the Bert position encoding length, and a ∈ (0, 1) with a ≠ 0.5.
6. The method for identifying named entities in the medical field according to claim 4, wherein the step S3.3 specifically comprises the following:
For any time step i, given a minibatch of input data X_i ∈ R^{n×d}, let the hidden-layer activation function be φ; the forward and reverse hidden states of this time step are l_i ∈ R^{n×h} and r_i ∈ R^{n×h} respectively, where h is the number of hidden units, and they are updated as

l_i = φ(X_i·W_xl + l_{i-1}·W_ll + b_l)
r_i = φ(X_i·W_xr + r_{i+1}·W_rr + b_r)

where W_xl, W_xr ∈ R^{d×h} and W_ll, W_rr ∈ R^{h×h} are the weight terms and b_l, b_r ∈ R^{1×h} are the bias terms; X_i is formed according to the self-attention mechanism;
wherein r_i corresponds to the i-th data item after embedding; taking r_i as the query and all the input data r_1, r_2, ..., r_n as the keys and values, with attention scoring function f,

X_i = concat(r_i, Σ_{j=1}^{n} β(r_i, r_j)·r_j)

where concat denotes the concatenation of two vectors, and β(r_i, r_j) is obtained by mapping the query and key vectors to a scalar through the attention scoring function (here the scaled dot-product scoring function f(q, k) = q^T·k / √d) and then applying the softmax function:

β(r_i, r_j) = exp(f(r_i, r_j)) / Σ_{k=1}^{n} exp(f(r_i, r_k))
7. The method for identifying named entities in the medical field according to claim 1, wherein the step S4 specifically comprises the following steps:
s4.1, dividing the data into K parts (here K = 10), taking one part in turn as the validation set and the rest as the training set, which yields 10 training/validation splits; training the model on each split in turn and averaging the 10 validation errors;
s4.2, adjusting the model hyper-parameters and the neural-network structure, repeating step S4.1 to find the model with the best error, and training that optimal model on all of the data.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310282404.3A | 2023-03-21 | 2023-03-21 | Medical field named entity identification method |

Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN116451690A | 2023-07-18 |
Family

- ID: 87129371

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310282404.3A | Medical field named entity identification method | 2023-03-21 | 2023-03-21 |

Country Status (1)

| Country | Status |
|---|---|
| CN | Pending |
Citations (9)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110083831A * | 2019-04-16 | 2019-08-02 | 武汉大学 | A kind of Chinese name entity recognition method based on BERT-BiGRU-CRF |
| CN112395879A * | 2020-11-10 | 2021-02-23 | 华中科技大学 | Scientific and technological text named entity recognition method |
| CN112541356A * | 2020-12-21 | 2021-03-23 | 山东师范大学 | Method and system for recognizing biomedical named entities |
| CN113836930A * | 2021-09-28 | 2021-12-24 | 浙大城市学院 | Chinese dangerous chemical named entity recognition method |
| CN114372465A * | 2021-09-29 | 2022-04-19 | 武汉工程大学 | Legal named entity identification method based on Mixup and BQRNN |
| CN114548106A * | 2022-02-22 | 2022-05-27 | 辽宁工程技术大学 | Method for recognizing science collaborative activity named entity based on ALBERT |
| CN114742059A * | 2022-04-13 | 2022-07-12 | 浙江科技学院 | Chinese electronic medical record named entity identification method based on multitask learning |
| CN114943230A * | 2022-04-17 | 2022-08-26 | 西北工业大学 | Chinese specific field entity linking method fusing common knowledge |
| WO2022222224A1 * | 2021-04-19 | 2022-10-27 | 平安科技(深圳)有限公司 | Deep learning model-based data augmentation method and apparatus, device, and medium |
Non-Patent Citations (1)

| Title |
|---|
| 苏剑林: "层次分解位置编码，让BERT可以处理超长文本", pages 1-4, retrieved from the Internet: https://www.spaces.ac.cn/archives/7947 * |
Legal Events

| Date | Code | Title |
|---|---|---|
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |