CN115017910A - Entity relation joint extraction method, network, equipment and computer readable storage medium based on Chinese electronic medical record - Google Patents


Info

Publication number
CN115017910A
Authority
CN
China
Prior art keywords
entity
attention
features
characteristic
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210749641.1A
Other languages
Chinese (zh)
Inventor
李丽双
王泽昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology filed Critical Dalian University of Technology
Priority to CN202210749641.1A priority Critical patent/CN115017910A/en
Publication of CN115017910A publication Critical patent/CN115017910A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Public Health (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Pathology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

An entity relation joint extraction method, network, device, and computer-readable storage medium based on Chinese electronic medical records belong to the field of natural language processing and address two problems: long-distance context semantic information cannot be fully captured, and the interaction between entity recognition and relation extraction information is insufficient. Entity task features are acquired and passed through a conditional random field to obtain an entity tag sequence; the entities obtained from the tag sequence are paired pairwise into multiple entity pairs; according to the relation extraction features, the features at each entity's positions are weighted and averaged to obtain the entity's feature; the features of the two entities of each pair are spliced with the sentence representation of the pre-trained model, and a classifier outputs the entity relation type obtained from the spliced representation, thereby improving the accuracy of entity relation extraction.

Description

Entity relation joint extraction method, network, equipment and computer readable storage medium based on Chinese electronic medical record
Technical Field
The invention belongs to the field of natural language processing and relates to a method for entity relation joint extraction on Chinese electronic medical record (EMR) texts, in particular to a Memory-LSTM encoding algorithm, a deep fusion algorithm for entity and relation information based on a Co-Attention mechanism, entity recognition based on a self-attention mechanism and the CRF algorithm, and relation extraction based on a multilayer CNN.
Background
Entity Relation Extraction is one of the core tasks in Information Extraction (IE) and comprises two subtasks, Named Entity Recognition (NER) and Relation Extraction (RE), which aim to automatically extract named entities of predefined types from unstructured text and classify the type of relation existing between each pair of entities. In the medical field, electronic medical records hold abundant clinical information, such as important medical entities (diseases, examinations, symptoms, treatments, body parts, and the like) and the semantic relations among the various types of medical entities. Entity relation extraction can mine valuable medical information from massive electronic medical records and provide important technical support for many downstream tasks such as medical question-answering systems, high-quality medical knowledge graph construction, and clinical decision support.
The entity relation extraction task was first proposed by the Message Understanding Conference (MUC) in 1996; subsequently, authoritative evaluation campaigns such as ACE, TAC, and SemEval provided high-quality evaluation corpora and accepted evaluation standards for the task, strongly promoting research on entity relation extraction in the general domain. In 2010, i2b2 (Informatics for Integrating Biology and the Bedside) published a medical entity relation extraction task based on English electronic medical records, bringing medical entity relation extraction into the research spotlight. For Chinese medical entity relation extraction, in 2020 the sixth China Health Information Processing conference (CHIP 2020) released a medical text information extraction task, and the Chinese Biomedical Language Understanding Evaluation benchmark (CBLUE) subsequently went online. It is currently the first publicly released benchmark in the Chinese medical information processing field, aiming to build a uniformly recognized performance evaluation platform for medical information systems and to promote the rapid development of medical informatics.
Entity relation extraction methods fall mainly into pipeline methods and joint extraction methods. The traditional approach is pipeline-based and treats entity recognition and relation extraction as two independent subtasks: for a piece of text, all entities are first identified, and then the relation category of each entity pair is determined. However, the pipeline method suffers from error propagation, i.e., errors accumulated in an earlier stage propagate to the next stage. For example, erroneous entities obtained in the entity recognition stage can seriously affect the training and performance of the relation model in the relation extraction stage. Meanwhile, the pipeline method ignores the latent association between entity recognition and relation extraction, so traditional pipeline methods cannot achieve satisfactory results.
To address the problems of the pipeline method, researchers proposed joint models that perform entity recognition and relation extraction simultaneously. Joint extraction effectively alleviates error propagation and can exploit the relatedness of the two tasks so that their information interacts, mining deeper semantic information. Miwa and Bansal (End-to-end relation extraction using LSTMs on sequences and tree structures [C]. Association for Computational Linguistics, 2016) first proposed an end-to-end model for joint entity relation extraction in which relation extraction shares the features of the entity recognition task, demonstrating that entity recognition features help improve relation extraction performance. Zheng et al. (Joint extraction of entities and relations based on a novel tagging scheme [C]. Association for Computational Linguistics, 2017) unified entity relation extraction as a sequence labeling problem, letting the two tasks share a unified network model and thereby realizing interaction of hidden information. However, these early joint models could not efficiently extract more complex nested entities and overlapping relations. To solve this problem, Bekoulis et al. (Joint entity recognition and relation extraction as a multi-head selection problem [J]. Expert Systems with Applications, 2018) proposed a table-filling-style decoding method that can represent complex entities and overlapping relations within a table-filling framework. Wei et al. (A novel cascade binary tagging framework for relational triple extraction [C]. Association for Computational Linguistics, 2020) converted triple extraction into first extracting subject words and then extracting object words and relation types, enabling the model to effectively extract overlapping relations.
The above models generally adopt parameter sharing to model the information interaction of the two tasks, i.e., entity recognition and relation extraction share the same word embeddings. However, this mode of interaction is insufficient and cannot deeply join the information of the two tasks. First, the two tasks only share the same input features, after which two models learn task features independently; second, the interaction is unidirectional: relation extraction uses the recognized entity features for relation classification, but conversely entity recognition does not effectively use the features of relation extraction. Wang et al. (Two are better than one: Joint entity and relation extraction with table-sequence encoders [C]. Empirical Methods in Natural Language Processing, 2020) adopted a multi-layer Transformer structure that lets the entity recognition and relation extraction features interact at every layer, enhancing information interaction. Yan et al. (A partition filter network for joint entity and relation extraction [C]. Empirical Methods in Natural Language Processing, 2021) split and recombine entity features and relation features so that each task can fuse the information of the other, realizing bidirectional interaction and proving that relation extraction also promotes entity recognition.
The above analysis shows that entity relation extraction research has achieved rich results. However, in the Chinese medical domain the performance of joint extraction is still low (the highest F1 scores for entity recognition and relation extraction on the CBLUE 2.0 benchmark are only 70.1% and 62.8%, respectively). The main reasons are as follows. First, the entity distribution of medical text (e.g., electronic medical records) is sparse with large spans, and traditional recurrent neural networks cannot learn long-distance dependency information. Second, medical entities are complex and medical texts contain a large amount of redundant information, making them harder for models to discriminate. Finally, current joint models still suffer from insufficient information interaction; how to construct deeper joint schemes and strengthen inter-task information interaction remains a research topic of great concern to many researchers.
Disclosure of Invention
The invention provides an entity relation joint extraction method, network, device, and computer-readable storage medium based on Chinese electronic medical records, which extract medical entities from large numbers of unstructured electronic medical records and classify entity relations, solving the problems that prior research cannot fully capture long-distance context semantic information and that the interaction between entity recognition and relation extraction information is insufficient.
Based on the above purpose, the invention provides the following technical scheme:
an entity relationship joint extraction method based on Chinese electronic medical records comprises the following steps:
encoding each position of a sentence of the Chinese electronic medical record to obtain the output feature h̃_t of each position of the sentence, and splicing the output features of all positions to obtain the sentence feature representation;
splicing the forward and reverse timing features of the sentence feature representation to obtain the bidirectional timing feature H_m of the sentence;
passing the bidirectional timing feature H_m through a self-attention network to obtain the entity task feature H_ner, and passing H_ner through a conditional random field to obtain the entity tag sequence;
passing the entity task feature H_ner and the bidirectional timing feature H_m of the sentence through a Co-Attention network to obtain the deep fusion feature H_merge, and passing H_merge through multilayer convolution to obtain the relation extraction feature H_re;
pairing the entities obtained through the entity tag sequence pairwise to obtain multiple entity pairs;
according to the relation extraction feature H_re, weighting and averaging the features at the positions of each entity to obtain the entity feature h_e;
splicing the entity features h_e of the two entities of each entity pair with the sentence representation h_cls of the pre-trained model to obtain a spliced representation, and outputting through a classifier the entity relation type obtained from the spliced representation.
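To make the last three steps concrete, here is a minimal numpy sketch of building an entity-pair representation and classifying it. All shapes, the uniform averaging weights, and the random classifier parameters are illustrative assumptions, not the patent's trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
L, H, C = 10, 8, 7          # sentence length, feature dim, number of relation types

H_re = rng.standard_normal((L, H))   # relation extraction features, one row per character
h_cls = rng.standard_normal(H)       # pre-trained model's sentence representation

def entity_feature(H_re, span, weights=None):
    """Weighted average of the features at the entity's character positions."""
    feats = H_re[span]                        # (len(span), H)
    if weights is None:
        weights = np.ones(len(span)) / len(span)  # uniform weights as a default
    return weights @ feats                    # (H,)

h_e1 = entity_feature(H_re, [1, 2])     # entity 1 occupies positions 1-2
h_e2 = entity_feature(H_re, [5, 6, 7])  # entity 2 occupies positions 5-7

# Splice the two entity features with the sentence representation (size 1 x 3H)
pair = np.concatenate([h_e1, h_e2, h_cls])

# Linear classifier over relation types
W, b = rng.standard_normal((C, 3 * H)), np.zeros(C)
logits = W @ pair + b
pred_relation = int(np.argmax(logits))
```

The spliced representation has size 1 × 3H, matching the network description later in the document.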
In one aspect, the invention further relates to an entity relation joint extraction network based on Chinese electronic medical records, comprising:
an improved LSTM network based on a Memory mechanism,
an entity extraction network based on self-attention,
an entity-relation deep information fusion network based on the Co-Attention mechanism, and
a relation extraction network based on multilayer convolution;
the LSTM improved network based on the Memory mechanism consists of two LSTM units, two storage units and two gate control units which are respectively expressed as
Figure BDA0003720748250000033
Wherein forward and reverse arrows indicate the direction of data input; input data X ═ X 0 ,x 1 ,...,x L-1 Dimension of L × H, where L is the length of the input data and H is the dimension of the feature; input size of LSTM cell is 1 XH for each character of input dataCharacteristic x t E.g. X, which is input to the forward LSTM unit in turn to obtain
Figure BDA0003720748250000041
The output size is 1 XH/2; the storage unit is a full connection layer with input size of 1 XH/2, the feature to be forward LSTM unit encoded
Figure BDA0003720748250000042
Obtaining memory characteristics from forward memory cells
Figure BDA0003720748250000043
The output size is 1 XH/2; the gate control units are linear layers and are connected with sigmoid functions, the input size of the gate control units is 1 multiplied by H/2, the output size of the gate control units is 1 multiplied by 1, and the characteristics are memorized
Figure BDA0003720748250000044
Obtaining gate control weights by a gate control unit
Figure BDA0003720748250000045
Features for then encoding LSTM units
Figure BDA0003720748250000046
And memory features
Figure BDA0003720748250000047
By weight
Figure BDA0003720748250000048
Linear combination to obtain
Figure BDA0003720748250000049
The output size is 1 XH/2; all characters are sequentially passed through forward LSTM units and spliced to obtain
Figure BDA00037207482500000410
The output size is L multiplied by H/2, all characters are reversely and sequentially passed through an inverse LSTM unit to obtain
Figure BDA00037207482500000411
Splicing the two-way Memory with the forward characteristic to finally obtain the bidirectional Memory-LSTM characteristic
Figure BDA00037207482500000412
The output size is L multiplied by H;
the self-attention-based entity extraction network consists of a self-attention network and a CRF decoding module; the input size of the self-attention network is L multiplied by H, and the output size is L multiplied by H; obtaining entity task characteristics H through self-attention network ner (ii) a Entity task characteristics H ner Decoding is carried out through a CRF module, the input size of the CRF module is L multiplied by H, and the output is a label sequence L of the path with the highest score pred
the Co-Attention-based entity-relation deep information fusion network consists of two attention feature learners; each learner is a linear layer with input size L × H and output size L × 1; the entity feature H_ner encoded by the self-attention network and the Memory-LSTM feature H_m are passed through their respective learners to output the attention scores s_ner and s_m; the two attention scores are spliced and passed through a softmax activation function to obtain the attention weights attn_m, attn_ner ∈ R^{L×1}; finally, each attention weight is multiplied element-wise with its corresponding original feature and the results are combined to obtain the fused feature H_merge, which is taken as the input of the multilayer-convolution-based relation extraction module;
the relation extraction model based on the multilayer convolution is formed by stacking two one-dimensional convolution units, namely Conv1 and Conv 2; the input size of Conv1 is L × H, the output size is L × H/2, and the convolution kernel size is 3; the input size of Conv2 is L × H/2, the output size is L × H, and the convolution kernel size is 3; h to be characterized by a Co-Attention network merge Obtaining output H through convolution operation re The output size is L multiplied by H; then weighting the characteristics of the corresponding position of the entity to obtain the entity characteristics h e The characteristic size is 1 XH; splicing the entity features pairwise, and integrating the entity features into the global features to obtain entity pair features
Figure BDA00037207482500000415
Figure BDA00037207482500000416
The characteristic size is 1 multiplied by 3H; classifying the characteristics through a linear layer and outputting a prediction result y ored ∈R 1×C
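The Conv1/Conv2 shape pipeline described above can be illustrated with a small numpy sketch. The naive convolution loop, the ReLU between the two layers, and the "same" padding that keeps the length at L are assumptions for demonstration, not details stated in the patent:

```python
import numpy as np

rng = np.random.default_rng(1)
L, H = 12, 8

def conv1d(x, W, b):
    """1-D convolution over the length axis, kernel size 3, 'same' padding.
    x: (L, C_in), W: (C_out, C_in, 3), b: (C_out,) -> (L, C_out)."""
    Lx, C_in = x.shape
    C_out = W.shape[0]
    xp = np.pad(x, ((1, 1), (0, 0)))          # pad so the output length stays L
    out = np.empty((Lx, C_out))
    for t in range(Lx):
        window = xp[t:t + 3]                  # (3, C_in) slice centred on position t
        out[t] = np.tensordot(W, window.T, axes=([1, 2], [0, 1])) + b
    return out

H_merge = rng.standard_normal((L, H))                 # Co-Attention output, L x H
W1, b1 = rng.standard_normal((H // 2, H, 3)) * 0.1, np.zeros(H // 2)
W2, b2 = rng.standard_normal((H, H // 2, 3)) * 0.1, np.zeros(H)

mid = np.maximum(conv1d(H_merge, W1, b1), 0)          # Conv1: L x H -> L x H/2
H_re = conv1d(mid, W2, b2)                            # Conv2: L x H/2 -> L x H
```

The intermediate feature has shape L × H/2 and the final relation feature H_re recovers shape L × H, matching the sizes stated for Conv1 and Conv2.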
In one aspect, the invention also relates to an electronic device comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method when executing the computer program.
In one aspect, the invention also relates to a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method.
Beneficial effects: the invention improves the accuracy of entity relation extraction.
Drawings
FIG. 1: framework diagram of the entity relation joint extraction model based on Memory-LSTM.
Detailed Description
In this method, a training set is first established to train the model, then the prediction performance of the model is tested, and the model is compared with advanced pipeline and joint models to verify the effectiveness of the proposed model.
1. Data set preprocessing
First, electronic medical record data are manually annotated according to the entities and relation types defined by the invention. The constructed dataset contains 12219 samples and is randomly divided into a training set of 9773 samples and a test set of 2446 samples, a ratio of 80%:20%.
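The stated 80%:20% random partition can be sketched as follows; the sample indices are placeholders standing in for the annotated records:

```python
import random

samples = list(range(12219))          # placeholder indices for the 12219 samples
random.seed(0)
random.shuffle(samples)               # random split, as described
n_train = 9773                        # 80% of the corpus
train_set, test_set = samples[:n_train], samples[n_train:]
```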
2. Network model structure
As shown in FIG. 1, the network model constructed by the invention includes an LSTM improved network based on a Memory mechanism, an entity extraction network based on self-Attention, an entity relationship depth information fusion network based on a Co-Attention mechanism, and a relationship extraction network based on multilayer convolution.
In the invention, the Memory-mechanism-based improved LSTM network consists of two LSTM units, two memory units, and two gate control units, where the forward and reverse variants indicate the direction of data input. The input data X = {x_0, x_1, ..., x_{L-1}} has dimension L × H, where L is the length of the input data and H is the feature dimension. The input size of an LSTM unit is 1 × H; the feature x_t ∈ X of each character of the input data is fed in turn to the forward LSTM unit to obtain h_t, with output size 1 × H/2. The memory unit is a fully connected layer with input size 1 × H/2; the feature h_t encoded by the forward LSTM unit is passed through the forward memory unit to obtain the memory feature m_t, with output size 1 × H/2. The gate control unit is a linear layer followed by a sigmoid function, with input size 1 × H/2 and output size 1 × 1; the memory feature m_t is passed through the gate control unit to obtain the gate weight g_t. The feature h_t encoded by the LSTM unit and the memory feature m_t are then linearly combined with weight g_t to obtain h̃_t, with output size 1 × H/2. Thus the output h̃_t of each character not only includes the output of the LSTM unit but also integrates the memory feature, alleviating the forgetting of long-distance information, and the proportion of the two features can be dynamically regulated through the gating mechanism. All characters pass in turn through the forward LSTM unit and are spliced to obtain the forward feature with output size L × H/2; likewise, all characters pass in reverse order through the backward LSTM unit to obtain the reverse feature, which is spliced with the forward feature to finally obtain the bidirectional Memory-LSTM feature H_m with output size L × H.
In the present invention, the self-attention-based entity extraction network consists of a self-attention network and a CRF decoding module. The input and output sizes of the self-attention network are both L × H. The entity features are learned through the self-attention network, which strengthens the weight of important features with the attention mechanism and reduces the interference of redundant information, yielding the entity task feature H_ner. H_ner is decoded by the CRF module, whose input size is L × H and whose outputs are the tag sequence L_pred of the highest-scoring path and the entity recognition loss loss_ner. The module uses the Viterbi algorithm to compute the real-path score s_real and the total score of all paths s_total; the entity recognition loss is loss_ner = s_total − s_real.
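The Viterbi decoding step can be illustrated with a toy tag set and hand-picked scores. The emission and transition tables below are invented for demonstration and are not the patent's learned CRF parameters:

```python
import numpy as np

def viterbi(emissions, transitions):
    """Return the highest-scoring tag path and its score.
    emissions: (L, T) per-position tag scores; transitions: (T, T) tag-to-tag scores."""
    L, T = emissions.shape
    score = emissions[0].copy()               # best score ending in each tag so far
    back = np.zeros((L, T), dtype=int)        # backpointers for path recovery
    for t in range(1, L):
        cand = score[:, None] + transitions + emissions[t][None, :]  # (T, T)
        back[t] = np.argmax(cand, axis=0)
        score = np.max(cand, axis=0)
    best_last = int(np.argmax(score))
    path = [best_last]
    for t in range(L - 1, 0, -1):             # walk the backpointers in reverse
        path.append(int(back[t][path[-1]]))
    return path[::-1], float(np.max(score))

# Toy example: 3 tags (O, B, I) over 4 positions
em = np.array([[1., 2., 0.], [0., 0., 3.], [2., 0., 1.], [3., 0., 0.]])
tr = np.array([[0., 0., -9.], [-1., -9., 1.], [0., -9., 1.]])  # e.g. O->I heavily penalized
L_pred, s_best = viterbi(em, tr)
# L_pred is the highest-scoring tag path; s_best is its score
```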
In the invention, the Co-Attention-based entity-relation deep information fusion network consists of two attention feature learners; each learner is a linear layer with input size L × H and output size L × 1. The entity feature H_ner encoded by the self-attention network and the Memory-LSTM feature H_m are passed through their respective learners to output the attention scores s_ner and s_m; the two attention scores are then spliced and passed through a softmax activation function to obtain the attention weights attn_m, attn_ner ∈ R^{L×1}; finally, each attention weight is multiplied element-wise with its corresponding original feature and the results are combined to obtain the fused feature H_merge, which is used as the input of the relation extraction module.
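A minimal numpy sketch of this Co-Attention fusion follows. The per-position softmax over the two score streams and the summation used to combine the weighted features are illustrative assumptions about details the text leaves open:

```python
import numpy as np

rng = np.random.default_rng(2)
L, H = 6, 4

H_ner = rng.standard_normal((L, H))   # self-attention-encoded entity features
H_m = rng.standard_normal((L, H))     # Memory-LSTM features

# Two linear attention learners, each mapping L x H -> L x 1
w_ner, w_m = rng.standard_normal(H), rng.standard_normal(H)
s_ner = H_ner @ w_ner                 # (L,) attention scores for the entity stream
s_m = H_m @ w_m                       # (L,) attention scores for the timing stream

# Splice the two scores and normalize per position with softmax,
# giving each position a pair of weights that sum to one
scores = np.stack([s_m, s_ner], axis=1)                   # (L, 2)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
attn_m, attn_ner = weights[:, 0:1], weights[:, 1:2]       # each (L, 1)

# Weight each feature stream by its attention and combine (here: sum) into H_merge
H_merge = attn_m * H_m + attn_ner * H_ner                 # (L, H)
```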
In the present invention, the multilayer-convolution-based relation extraction model is formed by stacking two one-dimensional convolution units, Conv1 and Conv2. Conv1 has input size L × H, output size L × H/2, and convolution kernel size 3; Conv2 has input size L × H/2, output size L × H, and convolution kernel size 3. The feature H_merge produced by the Co-Attention network passes through the convolution operations to obtain the output H_re with size L × H. The features at each entity's positions are then weighted to obtain the entity feature h_e with size 1 × H. The entity features are spliced pairwise and combined with the global sentence feature to obtain the entity pair feature h_<e1,e2> with size 1 × 3H. The feature is classified through a linear layer, outputting the prediction y_pred ∈ R^{1×C}; finally the relation extraction loss loss_re is computed with cross entropy. The total loss of the joint model is loss_ner + loss_re.
3. Model training
For a training sample, as shown in FIG. 1, the electronic medical record text first passes through a BERT pre-trained model to obtain the contextual feature representation of each character, and is then encoded through the Memory-LSTM network to obtain the feature H_m. The encoded feature H_m passes through the self-attention network to obtain the entity task feature H_ner, which is then decoded by the CRF module to obtain the entity recognition loss loss_ner and the predicted entity tag sequence L_pred. The entity task feature H_ner encoded by the self-attention network and the Memory-LSTM feature H_m pass through the Co-Attention network to obtain the deep-fusion feature H_merge, which then passes through multilayer convolution to obtain the relation extraction feature H_re. The feature representation h_<e1,e2> of each entity pair is obtained according to the entity tag sequence, and the loss loss_re is computed with cross entropy. Finally the total loss loss_ner + loss_re is backpropagated to train the model.
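The joint objective can be sketched as follows. The cross-entropy helper is a standard single-example formulation; the scalar NER loss value is a placeholder standing in for the CRF loss, not a number produced by the patent's model:

```python
import numpy as np

def cross_entropy(logits, label):
    """Cross-entropy of a single example: -log softmax(logits)[label]."""
    z = logits - logits.max()                 # shift for numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

# Toy logits for one relation example and a placeholder scalar CRF (NER) loss
loss_re = cross_entropy(np.array([2.0, 0.5, -1.0]), label=0)
loss_ner = 0.37                               # placeholder value for the CRF loss
total_loss = loss_ner + loss_re               # joint objective that is backpropagated
```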
4. Method for designing network model
The method comprises the following steps:
step 1, constructing an electronic medical record data set
The corpus of the invention comes from real electronic medical record data of the internal medicine, surgery, pediatrics, and other departments of a hospital. Following the concept standard of the Unified Medical Language System (UMLS) and the characteristics of the Chinese electronic medical record corpus, the invention defines 5 entity types: disease, body part, symptom, examination, and treatment; and 7 relation categories: "disease-disease", "disease-symptom", "disease-body part", "treatment-symptom", "treatment-disease", "examination-symptom", and "examination-disease".
Step 2, constructing an entity relation joint extraction model
(I) Constructing an improved LSTM network based on the Memory mechanism
Relations in electronic medical records exhibit long distances between head and tail entities and sparse entity relations, and the traditional Long Short-Term Memory network (LSTM) cannot effectively extract relation features of head and tail entities with large spans. Therefore, the invention proposes a Memory-LSTM structure: a Memory mechanism stores the features obtained at each LSTM unit iteration, and the LSTM unit can use the information stored in the Memory at the next iteration, effectively alleviating the forgetting of feature information in the traditional LSTM.
The LSTM network is mainly composed of an input gate, a forgetting gate and an output gate, and the transmission and loss of information are controlled through a gate mechanism. Wherein at the time t, the LSTM unit has three input features, namely an input feature x at the current time t Hidden layer feature h of previous moment t-1 And LSTM cell state C at the previous time t-1 Finally, the output characteristic h at the time t is obtained t The LSTM model formula is as follows:
f_t = σ(W_f·[h_(t−1), x_t] + b_f)
i_t = σ(W_i·[h_(t−1), x_t] + b_i)
o_t = σ(W_o·[h_(t−1), x_t] + b_o)
C̃_t = tanh(W_C·[h_(t−1), x_t] + b_C)
C_t = f_t ⊙ C_(t−1) + i_t ⊙ C̃_t
h_t = o_t ⊙ tanh(C_t)
wherein W_(·), b_(·) denote trainable weights and biases, tanh and σ are activation functions, and ⊙ denotes element-wise multiplication. In the LSTM structure, the hidden-layer feature h_t is sensitive to short-term input features and changes quickly, while the cell state C_t preserves the long-term state and changes slowly. That is, the cell state C_t contains information from long-distance text, so the invention introduces a memory unit M to store the C_t features; at time t the input of the memory unit is the cell state C_t, formulated as follows:
m_t = W_M·C_t + b_M
wherein W_M, b_M denote the trainable parameters of the memory unit. Through t−1 iterations, the previous cell states C_0 ~ C_(t−1) are stored in W_M, b_M, and a memory feature m_t containing long-distance text information is obtained. A gating mechanism then linearly combines the output feature h_t of the LSTM with the memory feature m_t to obtain the final output ĥ_t at time t, formulated as follows:
g_t = σ(W_g·m_t + b_g)
ĥ_t = g_t ⊙ h_t + (1 − g_t) ⊙ m_t
wherein W_g, b_g denote the trainable parameters of the gate unit, σ denotes the sigmoid activation function, and ⊙ denotes element-wise multiplication. For a text of length k, the output features ĥ_0 ~ ĥ_(k−1) of all positions are spliced to obtain the complete feature representation of the sentence encoded by Memory-LSTM; finally the forward and reverse features are spliced to obtain the bidirectional Memory-LSTM feature H_m:
H_m = [H_m^→ ; H_m^←]
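As a concrete illustration of the Memory-LSTM unit described above, the following NumPy sketch implements a standard LSTM step, the memory unit m_t = W_M·C_t + b_M, and a gated combination of h_t and m_t. The class and function names, the weight initialization, and the exact form of the gated combination (g_t ⊙ h_t + (1 − g_t) ⊙ m_t) are illustrative assumptions, not taken verbatim from the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MemoryLSTMCell:
    """Sketch of a Memory-LSTM unit: a standard LSTM cell plus a memory
    unit M that re-encodes the cell state C_t, and a gate g_t that
    combines h_t with the memory feature m_t (assumed form)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        z, s = input_size + hidden_size, 0.1
        # standard LSTM parameters (forget / input / output / candidate)
        self.Wf, self.bf = rng.normal(0, s, (hidden_size, z)), np.zeros(hidden_size)
        self.Wi, self.bi = rng.normal(0, s, (hidden_size, z)), np.zeros(hidden_size)
        self.Wo, self.bo = rng.normal(0, s, (hidden_size, z)), np.zeros(hidden_size)
        self.Wc, self.bc = rng.normal(0, s, (hidden_size, z)), np.zeros(hidden_size)
        # memory unit: m_t = W_M . C_t + b_M
        self.WM, self.bM = rng.normal(0, s, (hidden_size, hidden_size)), np.zeros(hidden_size)
        # gate unit: g_t = sigmoid(W_g . m_t + b_g)
        self.Wg, self.bg = rng.normal(0, s, (hidden_size, hidden_size)), np.zeros(hidden_size)

    def step(self, x_t, h_prev, c_prev):
        hx = np.concatenate([h_prev, x_t])
        f = sigmoid(self.Wf @ hx + self.bf)
        i = sigmoid(self.Wi @ hx + self.bi)
        o = sigmoid(self.Wo @ hx + self.bo)
        c_tilde = np.tanh(self.Wc @ hx + self.bc)
        c = f * c_prev + i * c_tilde           # C_t
        h = o * np.tanh(c)                     # h_t
        m = self.WM @ c + self.bM              # memory feature m_t
        g = sigmoid(self.Wg @ m + self.bg)     # gate g_t
        h_out = g * h + (1.0 - g) * m          # gated combination (assumed)
        return h_out, h, c

def encode(cell, xs):
    """Run the cell over a sequence and stack the gated outputs."""
    h = np.zeros(cell.bf.shape[0])
    c = np.zeros_like(h)
    outs = []
    for x in xs:
        h_out, h, c = cell.step(x, h, c)
        outs.append(h_out)
    return np.stack(outs)
```

Running `encode` over a forward and a reversed copy of the sequence and concatenating the results would give the bidirectional feature H_m.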
(II) Constructing a self-attention-based entity extraction network
Electronic medical record text contains a large amount of redundant information, and medical entities are sparsely distributed. A traditional sequence-based entity recognition model learns entity information and redundant information equally; when there is too much redundant information, the learning of entity features is hindered and entity extraction suffers. Therefore, the invention uses a self-attention mechanism to learn entity features, strengthening the weight of important information such as medical entities through the attention model and reducing the interference of irrelevant redundant information. Self-attention is calculated as follows:
Q = W_Q·H_m + b_Q
K = W_K·H_m + b_K
V = W_V·H_m + b_V
H_ner = softmax(Q·K^T / √d_k)·V
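The self-attention step above can be sketched as plain single-head scaled dot-product attention over the Memory-LSTM features; the function name and the random parameters below are illustrative.

```python
import numpy as np

def self_attention(H_m, W_Q, b_Q, W_K, b_K, W_V, b_V):
    """Single-head scaled dot-product self-attention over the
    Memory-LSTM features H_m (shape L x H). Weight names mirror the
    formulas in the text; their values here are illustrative."""
    Q = H_m @ W_Q + b_Q
    K = H_m @ W_K + b_K
    V = H_m @ W_V + b_V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                # L x L logits
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=-1, keepdims=True)             # row-wise softmax
    return A @ V                                   # H_ner, shape L x d_k

# Illustrative usage with random parameters
rng = np.random.default_rng(0)
L, H = 6, 8
H_m = rng.normal(size=(L, H))
W_Q, W_K, W_V = (rng.normal(size=(H, H)) for _ in range(3))
b = np.zeros(H)
H_ner = self_attention(H_m, W_Q, b, W_K, b, W_V, b)
```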
After the entity feature H_ner is obtained by the attention mechanism, the entity tag sequence is obtained through a Conditional Random Field (CRF) and the loss function is calculated. For a sequence of length N with M entity tag types, the calculation is as follows:
s_i = Σ_(j=0)^(N−1) P_(j, y_j) + Σ_(j=0)^(N−2) A_(y_j, y_(j+1))
loss_ner = −log( e^(s_real) / Σ_i e^(s_i) )
wherein s_i is the score of a candidate tag sequence and s_real is the score of the true tag sequence; P_(i, y_j) denotes the score of tag y_j at position w_i, and A_(y_i, y_j) denotes the score of transitioning from tag y_i at the current position to tag y_j at the next position. The entity recognition loss loss_ner is calculated through the CRF.
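The CRF sequence score and loss above can be illustrated by brute-force enumeration over all tag sequences of a short input. This is a didactic sketch only (real CRFs compute the partition function with the forward algorithm), and all names are illustrative.

```python
import itertools
import numpy as np

def crf_loss(P, A, y_real):
    """Brute-force CRF negative log-likelihood for a short sequence.
    P[i, t]: emission score of tag t at position i; A[t1, t2]:
    transition score from tag t1 to tag t2; y_real: gold tag sequence.
    Enumerating all M**N paths is for illustration only."""
    N, M = P.shape

    def path_score(y):
        s = sum(P[i, y[i]] for i in range(N))           # emission terms
        s += sum(A[y[i], y[i + 1]] for i in range(N - 1))  # transitions
        return s

    all_scores = np.array([path_score(y)
                           for y in itertools.product(range(M), repeat=N)])
    # loss_ner = -log( exp(s_real) / sum_i exp(s_i) )
    return np.logaddexp.reduce(all_scores) - path_score(tuple(y_real))
```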
(III) constructing an entity relationship depth information fusion network based on a Co-Attention mechanism
In joint extraction models, how to model the information interaction between entity recognition and relation extraction has always been a research focus. Most current work uses only the extracted entity features as interaction information, ignoring the entity's context and other information potentially useful for relation extraction. Therefore, the invention proposes a depth information fusion algorithm based on a Co-Attention mechanism, which realizes character-level information interaction over the complete features of the entity recognition task, achieving depth information fusion. Concretely, an attention feature learner first maps the entity feature H_ner obtained by self-attention and the Memory-LSTM feature H_m to attention features α_ner, α_m ∈ R^(L×1) (L is the sentence length); the two attention features are then spliced and passed through a softmax activation function to obtain attention scores attn_m, attn_ner ∈ R^(L×1); finally the original features are multiplied by the attention scores to obtain the fused feature, calculated as follows:
α_m = W_m·H_m + b_m
α_ner = W_ner·H_ner + b_ner
[attn_m ; attn_ner] = softmax([α_m ; α_ner])
H_merge = attn_m ⊙ H_m + attn_ner ⊙ H_ner
wherein the attention scores are attn_m = (attn_m^0, …, attn_m^(L−1)) and attn_ner = (attn_ner^0, …, attn_ner^(L−1)); each element represents the proportion of the feature at the current position in the fused feature. For example, the co-attention fused feature at position i of the text is H_merge^i, and the attention scores attn_m^i and attn_ner^i are dynamically adjusted to the optimal proportion during training, realizing character-level depth information fusion. Since the attention scores satisfy attn_m^i + attn_ner^i = 1, the scale is unchanged before and after feature fusion. The co-attention mechanism not only considers all features from the entity extraction task, but also gives higher weight, through the attention scores, to important information in the entity task (such as an entity and its context), so the relation extraction model can attend to all the important information and achieve better performance.
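Under the assumption that the co-attention fusion takes the form sketched above (a per-position softmax over two learned scalar attention features, with attn_m[i] + attn_ner[i] = 1), a minimal NumPy version might look like this; all names are illustrative.

```python
import numpy as np

def co_attention_fuse(H_m, H_ner, w_m, b_m, w_ner, b_ner):
    """Character-level co-attention fusion (assumed form): each learner
    maps its L x H input to an L x 1 attention feature; a per-position
    softmax over the pair yields scores with attn_m[i] + attn_ner[i] = 1;
    the fused feature is their score-weighted sum."""
    a_m = H_m @ w_m + b_m            # L x 1, bidirectional-feature attention
    a_ner = H_ner @ w_ner + b_ner    # L x 1, entity-feature attention
    logits = np.concatenate([a_m, a_ner], axis=1)   # L x 2
    logits -= logits.max(axis=1, keepdims=True)
    e = np.exp(logits)
    attn = e / e.sum(axis=1, keepdims=True)         # softmax per position
    attn_m, attn_ner = attn[:, :1], attn[:, 1:]
    H_merge = attn_m * H_m + attn_ner * H_ner       # broadcast over H
    return H_merge, attn_m, attn_ner
```

Note that the softmax guarantees the per-position proportions sum to one, matching the scale-preservation property stated above.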
(IV) constructing a relation extraction network based on multilayer convolution
Relation extraction adopts a Convolutional Neural Network (CNN) as the feature extractor. The fused feature H_merge obtained through co-attention is passed through two layers of one-dimensional convolution to obtain the relation extraction feature H_re. All entities obtained by entity extraction are paired pairwise, and the relation type of each entity pair is predicted separately. For an entity pair <E_1, E_2>, suppose entity E_1 occupies positions i~j in the sentence; the feature of entity E_1 is the weighted average of the features at positions i~j, expressed as h_e1 = H_re[i~j] / (j − i). Similarly, the feature h_e2 of entity E_2 is obtained. Finally, the two entity features are spliced with the "[CLS]" position feature h_cls of the BERT model, the relation type is judged by a classifier, and the loss is calculated, formulated as follows:
y_pred = softmax(W_pred·[h_e1 ; h_e2 ; h_cls] + b_pred)
loss_re = CrossEntropyLoss(y_pred, y_label)
wherein W_pred and b_pred denote trainable parameters of the classifier, y_pred ∈ R^(1×C) (C denotes the total number of relation categories) is the predicted relation category, y_label is the ground-truth relation category, and the relation extraction loss loss_re is obtained through the CrossEntropyLoss cross-entropy loss function.
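The entity-pair classification step can be sketched as follows: span-averaged pooling of the relation features, concatenation with a sentence-level [CLS] feature, and a softmax layer. Names and shapes are illustrative assumptions.

```python
import numpy as np

def classify_pair(H_re, span1, span2, h_cls, W_pred, b_pred):
    """Relation classification for one entity pair (sketch): average the
    relation features H_re over each entity's span (h_e = H_re[i~j]/(j-i)),
    concatenate with the sentence-level [CLS] feature h_cls, and apply a
    softmax classifier."""
    i, j = span1
    h_e1 = H_re[i:j].mean(axis=0)
    i, j = span2
    h_e2 = H_re[i:j].mean(axis=0)
    feat = np.concatenate([h_e1, h_e2, h_cls])
    logits = feat @ W_pred + b_pred
    logits -= logits.max()                 # numerical stability
    p = np.exp(logits)
    return p / p.sum()                     # y_pred over C relation classes
```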
5. Entity relation joint extraction method based on Chinese electronic medical records, executed on the above network model
The extraction method comprises the following steps:
By encoding each position of a sentence of the Chinese electronic medical record, the output feature ĥ_t of each position of the sentence is obtained, and the output features of all positions are spliced to obtain the sentence feature representation;
the forward features and reverse features of the sentence feature representation are spliced to obtain the bidirectional temporal feature H_m of the sentence;
the bidirectional temporal feature H_m is passed through a self-attention network to obtain the entity task feature H_ner, and the entity task feature H_ner is passed through a conditional random field to obtain the entity tag sequence;
the entity task feature H_ner and the bidirectional temporal feature H_m of the sentence are passed through a Co-Attention network to obtain the deep fusion feature H_merge, and the fusion feature H_merge is passed through multi-layer convolution to obtain the relation extraction feature H_re;
the entities obtained through the entity tag sequence are paired pairwise to obtain a plurality of entity pairs;
according to the relation extraction feature H_re, the features at the positions corresponding to each entity are averaged to obtain the entity feature h_e;
the entity features h_e corresponding to the two entities of each entity pair are spliced with the sentence representation h_cls of the pre-trained model to obtain a spliced representation, and the entity relation type obtained from the spliced representation is output through a classifier.
In one arrangement, obtaining the output feature ĥ_t comprises:
inputting the feature representation of each character of the sentence into the LSTM, and obtaining the cell state C_t of the LSTM containing long-distance text information;
storing the cell state C_t of the LSTM in a memory unit M, and obtaining the memory feature m_t corresponding to the cell state C_t;
linearly combining the output feature h_t of the LSTM at time t with the memory feature m_t to obtain the output feature ĥ_t at time t.
In one approach:
obtaining the memory feature m_t is formulated as: m_t = W_M·C_t + b_M, where W_M, b_M denote trainable parameters of the memory unit M; through t−1 iterations, the cell states C_0 ~ C_(t−1) are stored in W_M, b_M, so the memory feature m_t contains long-distance text information;
obtaining the output feature ĥ_t at time t is formulated as: g_t = σ(W_g·m_t + b_g), ĥ_t = g_t ⊙ h_t + (1 − g_t) ⊙ m_t, where W_g, b_g denote trainable parameters of the gate unit, σ denotes the sigmoid activation function, and ⊙ denotes element-wise multiplication;
the obtained sentence feature representation is formulated as: Ĥ = [ĥ_0 ; ĥ_1 ; … ; ĥ_(k−1)], where k is the text length and t ∈ (0 ~ k−1);
obtaining the bidirectional temporal feature H_m of the sentence is formulated as: H_m = [H_m^→ ; H_m^←], where H_m^→ denotes the forward temporal feature and H_m^← denotes the reverse temporal feature.
In one approach, obtaining the fusion feature H_merge comprises:
linearly mapping the entity task feature H_ner and the bidirectional temporal feature H_m through attention feature learners to obtain the entity task attention feature α_ner and the bidirectional temporal attention feature α_m, α_m, α_ner ∈ R^(L×1), where L is the sentence length;
splicing the bidirectional temporal attention feature α_m and the entity task attention feature α_ner, and obtaining, through a softmax activation function, the attention score attn_m of the bidirectional temporal attention feature α_m and the attention score attn_ner of the entity task attention feature α_ner, attn_m, attn_ner ∈ R^(L×1);
the attention score attn_m characterizes the proportion that the feature at each position of the bidirectional temporal attention feature occupies in the feature at the corresponding position of the fusion feature H_merge, arranged in position order; the attention score attn_ner characterizes the proportion that the feature at each position of the entity task attention feature occupies in the feature at the corresponding position of the fusion feature H_merge, arranged in position order;
the fusion feature H_merge is calculated from the proportion determined for each position by the bidirectional temporal attention feature and the proportion determined for each position by the entity task attention feature.
In one approach:
obtaining the bidirectional temporal attention feature α_m is formulated as: α_m = W_m·H_m + b_m;
obtaining the entity task attention feature α_ner is formulated as: α_ner = W_ner·H_ner + b_ner;
obtaining the attention scores is formulated as: [attn_m ; attn_ner] = softmax([α_m ; α_ner]);
obtaining the fusion feature H_merge is formulated as: H_merge = attn_m ⊙ H_m + attn_ner ⊙ H_ner.
In one arrangement, according to the relation extraction feature H_re, the features at the positions of each entity of each entity pair in the sentence are averaged to obtain the entity feature h_e, formulated as: h_e = H_re[i~j] / (j − i), where the entity occupies positions i to j in the sentence.
In one approach:
for the entity pair <E_1, E_2>, entity E_1 occupies positions i~j in the sentence, and the feature of entity E_1 is the weighted average of the features at positions i~j, expressed as h_e1 = H_re[i~j] / (j − i);
likewise, entity E_2 occupies its own positions i~j in the sentence, and its feature h_e2 is the weighted average of the features at those positions;
the two entity features are spliced with the sentence representation h_cls of the pre-trained model, and the entity relation type obtained from the spliced representation is output through a classifier, formulated as: y_pred = softmax(W_pred·[h_e1 ; h_e2 ; h_cls] + b_pred), where W_pred and b_pred denote trainable parameters of the classifier, y_pred ∈ R^(1×C), C denotes the total number of relation categories, and y_label denotes the ground-truth relation category.
6. Model quality assessment
In order to evaluate the entity relation extraction effect of the model, micro-averaging is adopted, and precision (P), recall (R) and F-value are used to evaluate the model. Predicted triples are evaluated with strict matching, i.e. a predicted triple is considered correct if and only if its entity boundaries, entity types and relation type are all completely correct.
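The strict-matching, micro-averaged evaluation described above can be sketched as a small helper; the triple format (head, head type, relation, tail, tail type) is an assumption about how predictions are represented.

```python
def micro_prf(pred_triples, gold_triples):
    """Micro-averaged precision / recall / F-value under strict matching:
    a predicted triple counts as correct only if it exactly matches a
    gold triple (entity boundaries, entity types and relation type all
    correct)."""
    pred, gold = set(pred_triples), set(gold_triples)
    tp = len(pred & gold)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```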
Table 1: entity identification evaluation result
(per-category table data not recoverable from this extraction)
The entity recognition results are shown in Table 1. Entity recognition achieves good results in every category, reaching an overall F-value of 88.05%. The "symptom" and "site" types perform less well. One reason is that "symptom" entities are semantically close to "disease" entities; for example, "migraine" is a disease but is easily confused with the symptom "headache". "Site" entities account for a small proportion of all entities, so the model cannot fully learn their features, and complex "site" entities such as "where the ureter crosses the iliac vessel" and "the ureteropelvic junction" cannot be extracted correctly. The "disease", "treatment" and "examination" types are extracted well because their features are more distinctive and their samples are sufficient.
Table 2: relationship extraction evaluation result
(per-category table data not recoverable from this extraction)
The relation extraction results are shown in Table 2. Relation extraction also achieves good overall results, reaching an F-value of 77.17%. The "disease-disease" and "disease-symptom" categories perform less well because "symptom" entities are semantically similar to "disease" entities and the model does not easily distinguish them; if a "disease" (or "symptom") entity is incorrectly predicted as the "symptom" (or "disease") type, the relation extraction model uses the incorrect information and predicts a wrong relation type. In addition, during manual annotation of the electronic medical record dataset, annotators with insufficient medical knowledge are prone to mislabeling entity relation types, which introduces noise into model training.
7. Model comparison
To verify the validity of the proposed model, the relation extraction results of the invention are compared with advanced pipeline and joint models. The pipeline model is that of Li et al. (Entity relation extraction from electronic medical records based on position denoising and rich semantics [J]. Journal of Chinese Information Processing, 2021); the joint model is that of Yan et al. (A partition filter network for joint entity and relation extraction [C]. Empirical Methods in Natural Language Processing, 2021). The two models are compared using the same training set, test set and evaluation protocol as the invention. The F-values of entity recognition and relation extraction of the pipeline model are 86.10% and 65.07% respectively, and the per-category results of entity recognition and relation extraction are shown in Tables 3 and 4 respectively:
table 3: pipeline method entity identification evaluation result
(per-category table data not recoverable from this extraction)
Table 4: pipeline method relation extraction evaluation result
(per-category table data not recoverable from this extraction)
From the pipeline results it can be seen that the joint model of the invention clearly outperforms the traditional pipeline model: the F-value of entity recognition improves by 1.95% and the F-value of relation extraction improves by 12.10%, with clear improvements in every category. The reasons are as follows. First, the pipeline model extracts entities by knowledge-base matching in the entity recognition stage; although this guarantees the accuracy of the extracted entities, long and difficult entities cannot be extracted effectively. For example, the entity "pancreatic head enlargement accompanied by peripheral exudation" is split in the word segmentation stage, so the knowledge base cannot match it. Meanwhile, the quality of the knowledge base directly affects entity matching; as seen in Table 3, the extraction effect for the "site" type is very low, largely because "site" entities in the knowledge base are incomplete, resulting in poor matching. Finally, the pipeline model suffers from error propagation and cannot exploit the potential correlation between the two tasks, so its results are worse.
We also compared the model of the invention with a current advanced joint model (A partition filter network for joint entity and relation extraction [C]. Empirical Methods in Natural Language Processing, 2021). This joint model (PFN) achieved the best results on general-domain datasets in 2021. The entity recognition and relation extraction F-values of the PFN model on the dataset of the invention reached 87.18% and 74.49% respectively, and the per-category results of entity recognition and relation extraction are shown in Tables 5 and 6 respectively:
table 5: comparing the results of entity identification and evaluation in the combined method
(per-category table data not recoverable from this extraction)
Table 6: relation extraction evaluation result of comparison joint method
(per-category table data not recoverable from this extraction)
The results show that the joint model of the invention outperforms the PFN joint model overall: the F-value of entity recognition improves by 0.87% and the F-value of relation extraction improves by 2.68%. Per category, the model of the invention is clearly superior to the PFN model in predicting "disease"-related relations (such as "disease-disease" and "disease-symptom"), because medical entities are more complex than general-domain entities and the span between head and tail entities is larger; the joint model of the invention effectively preserves long-distance features through the Memory mechanism, so it better extracts relations between medical entities with larger spans.
In summary, the invention provides an entity relation joint extraction model for Chinese electronic medical records: an improved LSTM encoding algorithm based on a Memory mechanism is proposed for the document-level characteristics of electronic medical records, and a Co-Attention-based depth information fusion mechanism is proposed to strengthen the information interaction between entity recognition and relation extraction. Finally, the model of the invention is compared with advanced pipeline and joint models; the entity extraction and relation extraction results of the model are clearly improved, verifying the effectiveness of the method.
An embodiment of the present invention further provides an electronic device, where the electronic device includes: the memory, the processor and the computer program stored on the memory and capable of running on the processor, when the processor executes the computer program, the steps of the method provided by the above embodiments are realized. The electronic device provided by the embodiment of the invention can realize each implementation mode in the method embodiments and corresponding beneficial effects.
The embodiment of the invention also provides a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the method provided by the embodiment of the invention is realized, and the same technical effect can be achieved.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure describes only preferred embodiments of the present invention and is not intended to limit the scope of the appended claims.

Claims (10)

1. A method for extracting entity relationship jointly based on Chinese electronic medical records is characterized by comprising the following steps:
obtaining the output feature ĥ_t of each position of a sentence of the Chinese electronic medical record by encoding each position of the sentence, and splicing the output features of all positions to obtain a sentence feature representation;
splicing the forward features and backward features of the sentence feature representation to obtain the bidirectional temporal feature H_m of the sentence;
passing the bidirectional temporal feature H_m through a self-attention network to obtain the entity task feature H_ner, and passing the entity task feature H_ner through a conditional random field to obtain an entity tag sequence;
passing the entity task feature H_ner and the bidirectional temporal feature H_m of the sentence through a Co-Attention network to obtain the deep fusion feature H_merge, and passing the deep fusion feature H_merge through multi-layer convolution to obtain the relation extraction feature H_re;
pairing the entities obtained through the entity tag sequence pairwise to obtain a plurality of entity pairs;
averaging, according to the relation extraction feature H_re, the features at the positions corresponding to each entity to obtain the entity feature h_e;
splicing the entity features h_e corresponding to the two entities of each entity pair with the sentence representation h_cls of the pre-trained model to obtain a spliced representation, and outputting, through a classifier, the entity relation type obtained from the spliced representation.
2. The method of claim 1, wherein obtaining the output feature ĥ_t comprises:
inputting the feature representation of each character of the sentence into the LSTM, and obtaining the cell state C_t of the LSTM containing long-distance text information;
storing the cell state C_t of the LSTM in a memory unit M, and obtaining the memory feature m_t corresponding to the cell state C_t;
linearly combining the output feature h_t of the LSTM at time t with the memory feature m_t to obtain the output feature ĥ_t at time t.
3. The method of claim 2, wherein:
obtaining the memory feature m_t is formulated as: m_t = W_M·C_t + b_M, where W_M, b_M denote trainable parameters of the memory unit M; through t−1 iterations, the cell states C_0 ~ C_(t−1) are stored in W_M, b_M, so the memory feature m_t contains long-distance text information;
obtaining the output feature ĥ_t at time t is formulated as: g_t = σ(W_g·m_t + b_g), ĥ_t = g_t ⊙ h_t + (1 − g_t) ⊙ m_t, where W_g, b_g denote trainable parameters of the gate unit, σ denotes the sigmoid activation function, and ⊙ denotes element-wise multiplication;
the obtained sentence feature representation is formulated as: Ĥ = [ĥ_0 ; ĥ_1 ; … ; ĥ_(k−1)], where k is the text length and t ∈ (0 ~ k−1);
obtaining the bidirectional temporal feature H_m of the sentence is formulated as: H_m = [H_m^→ ; H_m^←], where H_m^→ denotes the forward temporal feature and H_m^← denotes the reverse temporal feature.
4. The method of claim 1, wherein obtaining the depth fusion feature H_merge comprises:
linearly mapping the entity task feature H_ner and the bidirectional temporal feature H_m through attention feature learners to obtain the bidirectional temporal attention feature α_m and the entity task attention feature α_ner, α_m, α_ner ∈ R^(L×1), where L is the sentence length;
splicing the bidirectional temporal attention feature α_m and the entity task attention feature α_ner, and obtaining, through a softmax activation function, the attention score attn_m of the bidirectional temporal attention feature α_m and the attention score attn_ner of the entity task attention feature α_ner, attn_m, attn_ner ∈ R^(L×1);
the attention score attn_m characterizes the proportion that the feature at each position of the bidirectional temporal attention feature occupies in the feature at the corresponding position of the fusion feature H_merge, arranged in position order; the attention score attn_ner characterizes the proportion that the feature at each position of the entity task attention feature occupies in the feature at the corresponding position of the depth fusion feature H_merge, arranged in position order;
the depth fusion feature H_merge is calculated from the proportion determined for each position by the bidirectional temporal attention feature and the proportion determined for each position by the entity task attention feature.
5. The method of claim 4, wherein:
obtaining the bidirectional temporal attention feature α_m is formulated as: α_m = W_m·H_m + b_m;
obtaining the entity task attention feature α_ner is formulated as: α_ner = W_ner·H_ner + b_ner;
obtaining the attention scores is formulated as: [attn_m ; attn_ner] = softmax([α_m ; α_ner]);
obtaining the fusion feature H_merge is formulated as: H_merge = attn_m ⊙ H_m + attn_ner ⊙ H_ner.
6. The method of claim 1, wherein, according to the relation extraction feature H_re, the features at the positions of each entity of each entity pair in the sentence are averaged to obtain the entity feature h_e, formulated as: h_e = H_re[i~j] / (j − i), where the entity occupies positions i to j in the sentence.
7. The method of claim 6, wherein:
for the entity pair <E_1, E_2>, entity E_1 occupies positions i~j in the sentence, and the feature of entity E_1 is the weighted average of the features at positions i~j, expressed as h_e1 = H_re[i~j] / (j − i);
likewise, entity E_2 occupies its own positions i~j in the sentence, and its feature h_e2 is the weighted average of the features at those positions;
the two entity features are spliced with the sentence representation h_cls of the pre-trained model, and the entity relation type obtained from the spliced representation is output through a classifier, formulated as: y_pred = softmax(W_pred·[h_e1 ; h_e2 ; h_cls] + b_pred), where W_pred and b_pred denote trainable parameters of the classifier, y_pred ∈ R^(1×C), C denotes the total number of relation categories, and y_label denotes the ground-truth relation category.
8. An entity relation joint extraction network based on Chinese electronic medical records, characterized by comprising:
LSTM advanced networks based on Memory mechanisms,
the self-attention-based entity abstracts the network,
an entity relationship depth information fusion network based on the Co-Attention mechanism, and
extracting a network based on the relation of the multilayer convolution;
the Memory-mechanism-based improved LSTM network consists of two LSTM units, two memory units and two gate units, denoted →LSTM/←LSTM, →Mem/←Mem and →Gate/←Gate respectively, wherein the forward and backward arrows indicate the direction of data input; the input data X = {x_0, x_1, ..., x_{L-1}} has dimension L×H, where L is the length of the input data and H is the feature dimension; the input size of each LSTM unit is 1×H, and the feature x_t ∈ X of each character of the input data is fed into the forward LSTM unit in turn to obtain →h_t, with output size 1×H/2; each memory unit is a fully connected layer with input size 1×H/2, and the feature →h_t encoded by the forward LSTM unit passes through the forward memory unit to obtain the memory feature →m_t, with output size 1×H/2; each gate unit is a linear layer followed by a sigmoid function, with input size 1×H/2 and output size 1×1, and the memory feature →m_t passes through the gate unit to obtain the gate weight →g_t; the feature →h_t encoded by the LSTM unit and the memory feature →m_t are then linearly combined with weight →g_t to obtain

    →h^m_t = →g_t · →h_t + (1 − →g_t) · →m_t

with output size 1×H/2; all characters pass through the forward LSTM unit in turn and the results are spliced to obtain the forward feature →H^m, with output size L×H/2; all characters pass through the backward LSTM unit in reverse order to obtain ←H^m, which is spliced with the forward feature to give the final bidirectional Memory-LSTM feature

    H_m = [→H^m; ←H^m]

with output size L×H;
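As a rough illustration of the gating described above, one Memory-LSTM step might look like the sketch below (the explicit combination g·h + (1 − g)·m and all parameter names are assumptions; the claim only states that the two features are linearly combined by the gate weight):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def memory_step(h_t, W_m, b_m, w_g, b_g):
    """One Memory-LSTM step (hypothetical parameter names).
    h_t: (H/2,) hidden state already produced by the LSTM unit.
    Memory unit: fully connected layer   -> m_t, shape (H/2,)
    Gate unit:   linear layer + sigmoid  -> scalar g_t in (0, 1)
    Output: one plausible linear combination g_t*h_t + (1 - g_t)*m_t."""
    m_t = h_t @ W_m + b_m             # memory feature, (H/2,)
    g_t = sigmoid(m_t @ w_g + b_g)    # scalar gate weight
    return g_t * h_t + (1.0 - g_t) * m_t

rng = np.random.default_rng(1)
Hh = 4                                # plays the role of H/2
h_t = rng.normal(size=Hh)
out = memory_step(h_t, rng.normal(size=(Hh, Hh)), np.zeros(Hh),
                  rng.normal(size=Hh), 0.0)
```

Stacking this step over all timesteps in both directions and concatenating would give the L×H bidirectional feature H_m described in the claim.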
the self-attention-based entity extraction network consists of a self-attention network and a CRF decoding module; the input and output sizes of the self-attention network are both L×H, and the entity task feature H_ner is obtained through the self-attention network; the entity task feature H_ner is decoded by the CRF module, whose input size is L×H and whose output is the label sequence L_pred of the highest-scoring path;
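A generic single-head self-attention layer matching the stated L×H-in, L×H-out sizes could be sketched as follows (the claim does not fix the internal form, so the scaled dot-product variant and the projection matrices below are assumptions):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.
    X: (L, H) input features; Wq, Wk, Wv: (H, H) projections.
    Returns an (L, H) output, preserving the claimed sizes."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # each (L, H)
    scores = Q @ K.T / np.sqrt(X.shape[1])     # (L, L) similarity scores
    scores -= scores.max(axis=1, keepdims=True)
    A = np.exp(scores)
    A /= A.sum(axis=1, keepdims=True)          # row-wise softmax
    return A @ V                               # (L, H)

rng = np.random.default_rng(2)
L, H = 6, 8
X = rng.normal(size=(L, H))
H_ner = self_attention(X, *(rng.normal(size=(H, H)) for _ in range(3)))
```

The resulting H_ner would then be handed to a CRF decoder, which is not reproduced here.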
the Co-Attention-based entity-relation deep information fusion network consists of two attention feature learners; each learner is a linear layer with input size L×H and output size L×1; the entity feature H_ner encoded by the self-attention network and the Memory-LSTM feature H_m pass through their respective learners to produce attention scores, each of size L×1; the two attention scores are concatenated and passed through a softmax activation function to obtain the attention weights attn_m, attn_ner ∈ R^{L×1}; finally, the attention weights are multiplied element-wise with their corresponding original features to obtain the fused feature H_merge, which serves as the input of the multilayer-convolution-based relation extraction module;
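The two-learner fusion could be sketched as below (the learner weights and the additive way of merging the two reweighted feature maps are assumptions; the claim only states that the attention weights are multiplied with the original features):

```python
import numpy as np

def co_attention_fuse(H_ner, H_m, w_ner, w_m):
    """Co-Attention fusion sketch (learner weights w_* are assumptions).
    Each learner: linear layer (L, H) -> (L, 1); the two score columns are
    concatenated, softmax-normalised per position, then used to reweight
    and merge the two feature maps."""
    s_ner = H_ner @ w_ner                        # (L, 1) score from learner 1
    s_m = H_m @ w_m                              # (L, 1) score from learner 2
    s = np.concatenate([s_ner, s_m], axis=1)     # (L, 2)
    s = np.exp(s - s.max(axis=1, keepdims=True))
    attn = s / s.sum(axis=1, keepdims=True)      # softmax over the two sources
    # weight each source by its attention column and merge additively
    return attn[:, :1] * H_ner + attn[:, 1:] * H_m   # (L, H)

rng = np.random.default_rng(3)
L, H = 6, 8
H_merge = co_attention_fuse(rng.normal(size=(L, H)), rng.normal(size=(L, H)),
                            rng.normal(size=(H, 1)), rng.normal(size=(H, 1)))
```

The fused L×H map is what the convolutional relation-extraction stage would consume next.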
the multilayer-convolution-based relation extraction network is formed by stacking two one-dimensional convolution units, Conv1 and Conv2; Conv1 has input size L×H, output size L×H/2 and convolution kernel size 3; Conv2 has input size L×H/2, output size L×H and convolution kernel size 3; the fused feature H_merge obtained through the Co-Attention network passes through the convolution operations to produce the output H_re, with output size L×H; the features at the positions of each entity are then weighted to obtain the entity feature h_e, with feature size 1×H; the entity features are spliced pairwise and combined with the global feature to obtain the entity-pair feature

    h_pair = [h_cls; h_E1; h_E2]

with feature size 1×3H; the feature is classified through a linear layer, outputting the prediction result y_pred ∈ R^{1×C}.
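The shape pipeline of the two stacked convolutions (L×H → L×H/2 → L×H, kernel size 3) can be checked with a minimal same-padded 1-D convolution in numpy (the helper and parameter names are illustrative):

```python
import numpy as np

def conv1d(X, W, b):
    """'Same'-padded 1-D convolution over the length axis.
    X: (L, C_in); W: (k, C_in, C_out) with k = 3; returns (L, C_out)."""
    k = W.shape[0]
    pad = k // 2
    Xp = np.pad(X, ((pad, pad), (0, 0)))   # zero-pad so the length is kept
    L = X.shape[0]
    out = np.stack([np.tensordot(Xp[t:t + k], W, axes=([0, 1], [0, 1]))
                    for t in range(L)])
    return out + b

rng = np.random.default_rng(4)
L, H = 6, 8
X = rng.normal(size=(L, H))                                 # stands in for H_merge
W1, b1 = rng.normal(size=(3, H, H // 2)), np.zeros(H // 2)  # Conv1: L x H   -> L x H/2
W2, b2 = rng.normal(size=(3, H // 2, H)), np.zeros(H)       # Conv2: L x H/2 -> L x H
mid = conv1d(X, W1, b1)
H_re = conv1d(mid, W2, b2)
```

With kernel size 3 and padding 1 the sequence length L is preserved at both layers, which is what makes the claimed L×H/2 and L×H output sizes consistent.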
9. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method as claimed in any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method as claimed in any one of claims 1 to 7.
CN202210749641.1A 2022-06-29 2022-06-29 Entity relation joint extraction method, network, equipment and computer readable storage medium based on Chinese electronic medical record Pending CN115017910A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210749641.1A CN115017910A (en) 2022-06-29 2022-06-29 Entity relation joint extraction method, network, equipment and computer readable storage medium based on Chinese electronic medical record


Publications (1)

Publication Number Publication Date
CN115017910A true CN115017910A (en) 2022-09-06

Family

ID=83078611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210749641.1A Pending CN115017910A (en) 2022-06-29 2022-06-29 Entity relation joint extraction method, network, equipment and computer readable storage medium based on Chinese electronic medical record

Country Status (1)

Country Link
CN (1) CN115017910A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117316372A (en) * 2023-11-30 2023-12-29 Tianjin University Ear disease electronic medical record analysis method based on deep learning
CN117316372B (en) * 2023-11-30 2024-04-09 Tianjin University Ear disease electronic medical record analysis method based on deep learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination