CN116451690A - Medical field named entity identification method - Google Patents

Medical field named entity identification method

Info

Publication number
CN116451690A
CN116451690A
Authority
CN
China
Prior art keywords
data
medical
steps
medical field
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310282404.3A
Other languages
Chinese (zh)
Inventor
张怡
章永
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mabo Shanghai Health Technology Co ltd
Original Assignee
Mabo Shanghai Health Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mabo Shanghai Health Technology Co ltd
Priority to CN202310282404.3A
Publication of CN116451690A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 - Named entity recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/237 - Lexical tools
    • G06F 40/247 - Thesauruses; Synonyms
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 10/00 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H 10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides a method for identifying named entities in the medical field, characterised by comprising the following steps: S1, acquiring medical-field data and labeling the medical data; S2, augmenting the labeled data with the EDA method, which comprises synonym replacement, random insertion, random swap, and random deletion; S3, constructing a BERT (pretrained on a large medical data set, with position encodings extended by hierarchical decomposition) + Bi-GRU (with an attention mechanism) + CRF model; S4, tuning and optimising the model with 10-fold cross-validation. The invention effectively alleviates the shortage of data, effectively extracts complex medical entities from ultra-long text, and uses K-fold cross-validation to tune and optimise the model.

Description

Medical field named entity identification method
Technical Field
The invention relates to the application of artificial intelligence in the medical field, and in particular to a method for identifying named entities in the medical field.
Background
Medical named entity recognition means recognizing the boundaries of medical entities in medical text and judging their categories; common categories include disease names, body parts, drug information, examination or test items, symptoms, and the like. The accuracy of medical named entity recognition affects downstream tasks such as event extraction and relation extraction. It is a key task in medical text mining and provides an essential foundation for building medical ICD coding systems, health-care systems, intelligent medical question-answering systems, and medical knowledge graphs, so a good medical named entity recognition method is of far-reaching significance.
Existing medical named entity recognition technologies ignore the problem of data scarcity; the deep-learning methods they use are too simple and do not perform well on complex medical entities and the ultra-long texts of electronic medical records; and the models lack a tuning procedure.
Disclosure of Invention
The invention aims to provide a method for identifying named entities in the medical field, so as to solve the problems of the prior art described in the background: the problem of data scarcity is ignored, the deep-learning methods used are too simple and do not always perform well on complex medical entities and the ultra-long texts of electronic medical records, and the models lack a tuning procedure.
In order to achieve the above purpose, the present invention provides the following technical solutions:
A method for identifying named entities in the medical field, comprising the following steps:
s1, acquiring medical-field data and labeling the medical data;
s2, augmenting the labeled data (mainly the unlabeled words) with EDA (Easy Data Augmentation, a data-augmentation technique originally applied to text classification), which comprises: synonym replacement, random insertion, random swap, and random deletion;
s3, constructing a BERT (pretrained on a large medical data set, with position encodings extended by hierarchical decomposition) + Bi-GRU (with an attention mechanism) + CRF model;
s4, tuning the model's parameters and optimising the model with 10-fold cross-validation.
The step S1 of acquiring medical-field data specifically comprises the following steps:
s1.1, obtaining electronic medical record data from partner medical institutions;
s1.2, crawling the web for medical-field data, specifically:
s1.2.1, obtaining the URL of the target medical data;
s1.2.2, submitting an HTTP request to that URL;
s1.2.3, parsing the HTTP response;
s1.2.4, storing the parsed result.
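The crawl steps S1.2.1 to S1.2.4 can be sketched as follows. This is a minimal illustration using only Python's standard library; the example URL and page content are hypothetical, and the patent does not specify a particular crawler implementation.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """S1.2.3: parse the HTTP response body, collecting the <title> text."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(url, fetch=None):
    """S1.2.1-S1.2.4: given a target URL, fetch the page, parse it, and
    return the stored result. `fetch` may be injected for offline testing;
    by default it issues a real HTTP GET."""
    fetch = fetch or (lambda u: urlopen(u).read().decode("utf-8"))
    html = fetch(url)                      # S1.2.2: submit the HTTP request
    parser = TitleParser()
    parser.feed(html)                      # S1.2.3: parse the response
    return {"url": url, "title": parser.title}  # S1.2.4: store the result
```

In practice the stored result would be written to a database or file rather than returned, and the parser would extract medical text rather than just the page title.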
In step S1, the acquired medical data is labeled; the main label types are: diagnosis, surgery, treatment, examination, medicine, and site.
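The patent names the label types but not the labeling scheme; a common convention for such entity labels is BIO tagging, sketched below with an invented toy sentence (each token carries B-/I-/O per entity type).

```python
# A hypothetical BIO-labeled sample; the tokens and tags are illustrative only.
# B-<type> begins an entity, I-<type> continues it, O marks non-entity tokens.
sample = [
    ("患", "O"), ("者", "O"),
    ("左", "B-site"), ("肺", "I-site"),
    ("肺", "B-diagnosis"), ("炎", "I-diagnosis"),
]


def extract_entities(tagged):
    """Recover (text, type) entity spans from a BIO-tagged token sequence."""
    entities, current, etype = [], [], None
    for token, tag in tagged:
        if tag.startswith("B-"):
            if current:
                entities.append(("".join(current), etype))
            current, etype = [token], tag[2:]
        elif tag.startswith("I-") and current:
            current.append(token)
        else:
            if current:
                entities.append(("".join(current), etype))
            current, etype = [], None
    if current:
        entities.append(("".join(current), etype))
    return entities
```

Running `extract_entities(sample)` recovers the two labeled spans with their types, which is exactly the output a downstream CRF decoder is trained to produce.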
The processing of the labeled data in step S2 specifically comprises:
(1) synonym replacement: randomly select 1-10 non-stop words from the sentence; replace each selected word with a randomly chosen synonym;
(2) random insertion: find a non-stop word in the sentence, randomly choose one of its synonyms, and insert it at an arbitrary position in the sentence; repeat 1-10 times;
(3) random swap: choose two words in the sentence at random and exchange their positions; repeat 1-10 times;
(4) random deletion: delete each word in the sentence independently with probability 0.1.
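The four EDA operations above can be sketched in pure Python. This is a minimal illustration: the synonym table is a toy assumption (a real system would use a medical thesaurus), and the stop-word list is likewise invented.

```python
import random

# Toy synonym table and stop-word list -- assumptions for illustration only.
SYNONYMS = {"fever": ["pyrexia"], "pain": ["ache"], "severe": ["acute"]}
STOPWORDS = {"the", "a", "of", "with", "has"}


def synonym_replace(words, n, rng):
    """(1) Replace up to n randomly chosen non-stop words with a random synonym."""
    out = list(words)
    candidates = [i for i, w in enumerate(out)
                  if w not in STOPWORDS and w in SYNONYMS]
    for i in rng.sample(candidates, min(n, len(candidates))):
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insert(words, n, rng):
    """(2) n times: pick a non-stop word with a synonym and insert
    one of its synonyms at a random position in the sentence."""
    out = list(words)
    for _ in range(n):
        candidates = [w for w in out if w not in STOPWORDS and w in SYNONYMS]
        if not candidates:
            break
        w = rng.choice(candidates)
        out.insert(rng.randrange(len(out) + 1), rng.choice(SYNONYMS[w]))
    return out


def random_swap(words, n, rng):
    """(3) n times: swap the positions of two randomly chosen words."""
    out = list(words)
    for _ in range(n):
        i, j = rng.randrange(len(out)), rng.randrange(len(out))
        out[i], out[j] = out[j], out[i]
    return out


def random_delete(words, p, rng):
    """(4) Delete each word independently with probability p."""
    out = [w for w in words if rng.random() > p]
    return out or [rng.choice(words)]  # never return an empty sentence
```

Each operation preserves the overall meaning closely enough that the original entity labels can usually be carried over, which is why EDA suits the low-resource setting described here.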
The step S3 specifically comprises:
s3.1, obtaining from the Internet a BERT model pretrained on a large-scale medical data set;
s3.2, constructing a BERT layer with hierarchically decomposed position encodings;
s3.3, constructing a Bi-GRU layer with an attention mechanism.
the step S3.2 specifically includes the following:
specifically, let the maximum position code length that Bert defaults to be trainable be n and the corresponding position code vector be p 1 ,p 2 ,···,p n The new coding vector which can be constructed in turn by the method is q 1 ,q 2 ,···,q m Wherein m=n 2
q (i-1)×n+j =au i +(1+a)u j
Where i is the position index of the first layer, j is the position index of the second layer, n is the Bert layer length, a ε (0, 1) and a+.0.5.
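The n-to-n² expansion can be sketched as follows, under the simplifying assumption that the base vectors u_i are taken directly as the pretrained position encodings p_i (the patent does not fix how the u_i are obtained, so this choice is illustrative).

```python
def decompose_positions(u, a=0.4):
    """Expand n base position vectors u[0..n-1] into n*n vectors q via
    q_{(i-1)*n + j} = a*u_i + (1-a)*u_j, with a in (0, 1) and a != 0.5.
    Vectors are plain lists of floats; i and j run from 1 to n."""
    assert 0.0 < a < 1.0 and a != 0.5
    n = len(u)
    q = []
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            q.append([a * ui + (1 - a) * uj
                      for ui, uj in zip(u[i - 1], u[j - 1])])
    return q
```

Note that for i = j the construction reproduces a convex combination of a vector with itself, i.e. the original vector, so the first n diagonal positions stay close to the pretrained encodings while the cross terms extend the usable sequence length to n².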
The step S3.3 specifically comprises the following:
For any time step i, a minibatch of input data X_i ∈ R^{n×d} is given, where n is the batch size and d is the vector length; let the hidden-layer activation function be φ. The forward and backward hidden states of this time step are l_i ∈ R^{n×h} and r_i ∈ R^{n×h} respectively, where h is the number of hidden units; both are updated by the standard GRU recurrences, in which the weight matrices and bias terms are learned parameters, and X_i is formed according to the self-attention mechanism.
Here r_i corresponds to the i-th input after embedding. Taking r_i as the query and all inputs r_1, r_2, ..., r_n as the keys and values, with attention scoring function f (here scaled dot-product attention, f(q, k) = q·k/√d), the attention weight β(r_i, r_j) is obtained by mapping each query-key pair to a scalar and normalising with softmax:

β(r_i, r_j) = exp(f(r_i, r_j)) / Σ_k exp(f(r_i, r_k)),

and concat denotes the concatenation of two vectors (the forward and backward hidden states are concatenated to form the layer's output).
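The self-attention step that forms the Bi-GRU input can be sketched as follows: a minimal pure-Python illustration of scaled dot-product attention with softmax normalisation, not the patent's exact implementation.

```python
import math


def scaled_dot_product(q, k):
    """Scoring function f: dot product of query and key, scaled by sqrt(d)."""
    d = len(q)
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(d)


def self_attention(r):
    """For each r_i (used as the query), attend over all r_1..r_n
    (used as both keys and values) and return the attention outputs."""
    out = []
    for ri in r:
        scores = [scaled_dot_product(ri, rj) for rj in r]
        m = max(scores)                       # subtract max for stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        betas = [e / z for e in exps]         # softmax -> beta(r_i, r_j)
        out.append([sum(b * rj[t] for b, rj in zip(betas, r))
                    for t in range(len(ri))])  # weighted sum of the values
    return out
```

Because the softmax weights sum to one, each output vector is a convex combination of the inputs, so every coordinate stays within the range spanned by the corresponding input coordinates.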
the step S4 specifically includes the following:
s4.1, dividing data into K parts (K is 10 here), taking 1 part of the data as a test set and the rest as a training set, obtaining 10 training sets and verification sets here, training a model sequentially by the data, and obtaining 10 error average values;
and S4.2, reasonably adjusting the model hyper-parameters and the neural network structure, repeating the step S4.1, finding the model with the optimal error result, and training the optimal model by using all data.
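The K-fold procedure of S4.1 can be sketched in pure Python. The `train_and_score` callback stands in for the actual model training, which the patent leaves to the BERT+Bi-GRU+CRF pipeline.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Split sample indices into k folds; yield (train, val) index lists,
    each fold serving exactly once as the validation set."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    for f in range(k):
        val = folds[f]
        train = [i for g, fold in enumerate(folds) if g != f for i in fold]
        yield train, val


def cross_validate(n_samples, train_and_score, k=10):
    """S4.1: train on each of the k splits and average the k validation
    errors; `train_and_score(train_idx, val_idx)` must return an error."""
    errors = [train_and_score(tr, va) for tr, va in kfold_indices(n_samples, k)]
    return sum(errors) / len(errors)
```

Step S4.2 then amounts to calling `cross_validate` once per hyper-parameter setting, keeping the setting with the lowest average error, and retraining that model on all the data.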
Compared with the prior art, the invention has the following beneficial effects:
The invention can extract named entities from medical text. EDA data augmentation is applied to the raw data; the model uses a BERT pretrained on a large-scale medical data set as the embedding layer, hierarchically decomposes the BERT layer's position encodings, and builds a Bi-GRU+CRF layer with a self-attention mechanism; the model is trained and validated with 10-fold cross-validation, with the model's hyper-parameters and neural-network structure adjusted accordingly.
Drawings
FIG. 1 is a general flow chart of the present invention;
FIG. 2 is a flow chart of the invention for acquiring data related to the medical field and labeling the medical data;
FIG. 3 is a schematic diagram of a hierarchical decomposition of a position code when constructing a Bert layer containing the hierarchical decomposition of the position code according to the present invention;
FIG. 4 is a schematic diagram of a method for constructing a Bi-GRU layer containing an attention mechanism according to the present invention;
FIG. 5 is a flow chart of the formation of X_i in step S3.3 of the present invention;
FIG. 6 is a specific flow of model construction in step S3 of the present invention;
fig. 7 is a flowchart showing the step S4 of the present invention.
Detailed Description
In order to clarify the technical problems, technical solutions, implementation process and performance, the present invention will be further described in detail below with reference to examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the invention. Various exemplary embodiments, features and aspects of the disclosure will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
In addition, numerous specific details are set forth in the following detailed description in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art have not been described in detail in order not to obscure the present disclosure.
Example 1
As shown in fig. 1, a method for identifying named entities in the medical field comprises the following steps:
s1, acquiring medical-field data and labeling the medical data;
s2, augmenting the labeled data (mainly the unlabeled words) with EDA (Easy Data Augmentation, a data-augmentation technique originally applied to text classification), which comprises: synonym replacement, random insertion, random swap, and random deletion;
s3, constructing a BERT (pretrained on a large medical data set, with position encodings extended by hierarchical decomposition) + Bi-GRU (with an attention mechanism) + CRF model;
s4, tuning the model's parameters and optimising the model with 10-fold cross-validation.
As shown in fig. 2, the step S1 of acquiring medical-field data specifically comprises the following steps:
s1.1, obtaining electronic medical record data from partner medical institutions;
s1.2, crawling the web for medical-field data, specifically:
s1.2.1, obtaining the URL of the target medical data;
s1.2.2, submitting an HTTP request to that URL;
s1.2.3, parsing the HTTP response;
s1.2.4, storing the parsed result.
In step S1, the acquired medical data is labeled; the main label types are: diagnosis, surgery, treatment, examination, medicine, and site.
The processing of the labeled data in step S2 specifically comprises:
(1) synonym replacement: randomly select 1-10 non-stop words from the sentence; replace each selected word with a randomly chosen synonym;
(2) random insertion: find a non-stop word in the sentence, randomly choose one of its synonyms, and insert it at an arbitrary position in the sentence; repeat 1-10 times;
(3) random swap: choose two words in the sentence at random and exchange their positions; repeat 1-10 times;
(4) random deletion: delete each word in the sentence independently with probability 0.1.
The step S3 specifically comprises:
s3.1, obtaining from the Internet a BERT model pretrained on a large-scale medical data set;
s3.2, constructing a BERT layer with hierarchically decomposed position encodings;
s3.3, constructing a Bi-GRU layer with an attention mechanism.
as shown in fig. 3, the step S3.2 specifically includes the following:
specifically, let the maximum position code length that Bert defaults to be trainable be n and the corresponding position code vector be p 1 ,p 2 ,···,p n The new coding vector which can be constructed in turn by the method is q 1 ,q 2 ,···,q m Wherein m=n 2
q (i-1)×n+j =au i +(1+a)u j
Where i is the position index of the first layer, j is the position index of the second layer, n is the Bert layer length, a ε (0, 1) and a+.0.5.
As shown in fig. 4, the step S3.3 specifically comprises the following:
For any time step i, a minibatch of input data X_i ∈ R^{n×d} is given, where n is the batch size and d is the vector length; let the hidden-layer activation function be φ. The forward and backward hidden states of this time step are l_i ∈ R^{n×h} and r_i ∈ R^{n×h} respectively, where h is the number of hidden units; both are updated by the standard GRU recurrences, in which the weight matrices and bias terms are learned parameters, and X_i is formed according to the self-attention mechanism (the specific flow of this formation is shown in fig. 5).
Here r_i corresponds to the i-th input after embedding. Taking r_i as the query and all inputs r_1, r_2, ..., r_n as the keys and values, with attention scoring function f (here scaled dot-product attention, f(q, k) = q·k/√d), the attention weight β(r_i, r_j) is obtained by mapping each query-key pair to a scalar and normalising with softmax:

β(r_i, r_j) = exp(f(r_i, r_j)) / Σ_k exp(f(r_i, r_k)),

and concat denotes the concatenation of two vectors (the forward and backward hidden states are concatenated to form the layer's output).
as shown in fig. 6, the step S4 specifically includes the following:
s4.1, dividing data into K parts (K is 10 here), taking 1 part of the data as a test set and the rest as a training set, obtaining 10 training sets and verification sets here, training a model sequentially by the data, and obtaining 10 error average values;
and S4.2, reasonably adjusting the model hyper-parameters and the neural network structure, repeating the step S4.1, finding the model with the optimal error result, and training the optimal model by using all data.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the above-described embodiments, and that the above-described embodiments and descriptions are only preferred embodiments of the present invention, and are not intended to limit the invention, and that various changes and modifications may be made therein without departing from the spirit and scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (7)

1. A method for identifying named entities in the medical field, comprising the following steps:
s1, acquiring medical-field data and labeling the medical data;
s2, augmenting the labeled data with the EDA method, which comprises: synonym replacement, random insertion, random swap, and random deletion;
s3, constructing a BERT (pretrained on a large medical data set, with position encodings extended by hierarchical decomposition) + Bi-GRU (with an attention mechanism) + CRF model;
s4, tuning the model's parameters and optimising the model with 10-fold cross-validation.
2. The method for identifying named entities in the medical field according to claim 1, wherein the step S1 of acquiring medical-field data specifically comprises the following steps:
s1.1, obtaining electronic medical record data from partner medical institutions;
s1.2, crawling the web for medical-field data, specifically:
s1.2.1, obtaining the URL of the target medical data;
s1.2.2, submitting an HTTP request to that URL;
s1.2.3, parsing the HTTP response;
s1.2.4, storing the parsed result;
and in step S1, the acquired medical data is labeled, the main label types being: diagnosis, surgery, treatment, examination, medicine, and site.
3. The method for identifying named entities in the medical field according to claim 1, wherein the processing of the labeled data in step S2 specifically comprises:
(1) synonym replacement: randomly select 1-10 non-stop words from the sentence; replace each selected word with a randomly chosen synonym;
(2) random insertion: find a non-stop word in the sentence, randomly choose one of its synonyms, and insert it at an arbitrary position in the sentence; repeat 1-10 times;
(3) random swap: choose two words in the sentence at random and exchange their positions; repeat 1-10 times;
(4) random deletion: delete each word in the sentence independently with probability 0.1.
4. The method for identifying named entities in the medical field according to claim 1, wherein the step S3 specifically comprises the following steps:
s3.1, obtaining from the Internet a BERT model pretrained on a large-scale medical data set;
s3.2, constructing a BERT layer with hierarchically decomposed position encodings;
s3.3, constructing a Bi-GRU layer with an attention mechanism.
5. The method for identifying named entities in the medical field according to claim 4, wherein the step S3.2 specifically comprises the following:
Let the maximum trainable position-encoding length of BERT be n, with corresponding position-encoding vectors p_1, p_2, ..., p_n. The method then constructs new encoding vectors q_1, q_2, ..., q_m, where m = n^2, as

q_{(i-1)×n+j} = a·u_i + (1-a)·u_j,

where i is the position index of the first layer, j is the position index of the second layer, n is the BERT position-encoding length, the u_i are base vectors derived from the p_i, a ∈ (0, 1), and a ≠ 0.5.
6. The method for identifying named entities in the medical field according to claim 4, wherein the step S3.3 specifically comprises the following:
For any time step i, a minibatch of input data X_i ∈ R^{n×d} is given; let the hidden-layer activation function be φ. The forward and backward hidden states of this time step are l_i ∈ R^{n×h} and r_i ∈ R^{n×h} respectively, where h is the number of hidden units; both are updated by the standard GRU recurrences, in which the weight matrices and bias terms are learned parameters, and X_i is formed according to the self-attention mechanism.
Here r_i corresponds to the i-th input after embedding. Taking r_i as the query and all inputs r_1, r_2, ..., r_n as the keys and values, with attention scoring function f (here scaled dot-product attention, f(q, k) = q·k/√d), the attention weight β(r_i, r_j) is obtained by mapping each query-key pair to a scalar and normalising with softmax:

β(r_i, r_j) = exp(f(r_i, r_j)) / Σ_k exp(f(r_i, r_k)),

and concat denotes the concatenation of two vectors (the forward and backward hidden states are concatenated to form the layer's output).
7. The method for identifying named entities in the medical field according to claim 1, wherein the step S4 specifically comprises the following steps:
s4.1, dividing the data into K parts (here K = 10); each part serves once as the validation set while the remaining parts form the training set, yielding 10 training/validation splits; the model is trained on each split in turn and the 10 validation errors are averaged;
s4.2, reasonably adjusting the model's hyper-parameters and neural-network structure, repeating step S4.1 to find the model with the best error, then training that best model on all the data.
CN202310282404.3A 2023-03-21 2023-03-21 Medical field named entity identification method Pending CN116451690A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310282404.3A CN116451690A (en) 2023-03-21 2023-03-21 Medical field named entity identification method


Publications (1)

Publication Number Publication Date
CN116451690A true CN116451690A (en) 2023-07-18

Family

ID=87129371

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310282404.3A Pending CN116451690A (en) 2023-03-21 2023-03-21 Medical field named entity identification method

Country Status (1)

Country Link
CN (1) CN116451690A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083831A (en) * 2019-04-16 2019-08-02 武汉大学 A kind of Chinese name entity recognition method based on BERT-BiGRU-CRF
CN112395879A (en) * 2020-11-10 2021-02-23 华中科技大学 Scientific and technological text named entity recognition method
CN112541356A (en) * 2020-12-21 2021-03-23 山东师范大学 Method and system for recognizing biomedical named entities
CN113836930A (en) * 2021-09-28 2021-12-24 浙大城市学院 Chinese dangerous chemical named entity recognition method
CN114372465A (en) * 2021-09-29 2022-04-19 武汉工程大学 Legal named entity identification method based on Mixup and BQRNN
CN114548106A (en) * 2022-02-22 2022-05-27 辽宁工程技术大学 Method for recognizing science collaborative activity named entity based on ALBERT
CN114742059A (en) * 2022-04-13 2022-07-12 浙江科技学院 Chinese electronic medical record named entity identification method based on multitask learning
CN114943230A (en) * 2022-04-17 2022-08-26 西北工业大学 Chinese specific field entity linking method fusing common knowledge
WO2022222224A1 (en) * 2021-04-19 2022-10-27 平安科技(深圳)有限公司 Deep learning model-based data augmentation method and apparatus, device, and medium


Non-Patent Citations (1)

Title
苏剑林 (Su Jianlin): "层次分解位置编码，让BERT可以处理超长文本" (Hierarchically decomposed position encodings let BERT handle ultra-long text), pages 1-4, Retrieved from the Internet <URL:https://www.spaces.ac.cn/archives/7947> *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination