CN112541356A - Method and system for recognizing biomedical named entities - Google Patents

Method and system for recognizing biomedical named entities

Info

Publication number
CN112541356A
CN112541356A (application CN202011519249.5A; granted as CN112541356B)
Authority
CN
China
Prior art keywords
attention
named entity
embedding
word embedding
biomedical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011519249.5A
Other languages
Chinese (zh)
Other versions
CN112541356B (en)
Inventor
徐卫志 (Xu Weizhi)
范胜玉 (Fan Shengyu)
曹洋 (Cao Yang)
于惠 (Yu Hui)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Normal University
Priority to CN202011519249.5A
Publication of CN112541356A
Application granted
Publication of CN112541356B
Active legal status
Anticipated expiration

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
        • G06F — ELECTRIC DIGITAL DATA PROCESSING
            • G06F40/00 — Handling natural language data
            • G06F40/20 — Natural language analysis
            • G06F40/279 — Recognition of textual entities
            • G06F40/289 — Phrasal analysis, e.g. finite state techniques or chunking
            • G06F40/295 — Named entity recognition
        • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
            • G06N3/00 — Computing arrangements based on biological models
            • G06N3/02 — Neural networks
            • G06N3/04 — Architecture, e.g. interconnection topology
            • G06N3/044 — Recurrent networks, e.g. Hopfield networks
            • G06N3/045 — Combinations of networks
            • G06N3/08 — Learning methods


Abstract

The present disclosure provides a method and system for biomedical named entity recognition, comprising: sampling character- and word-level features with an attention mechanism to obtain extensions of the word embeddings, then extracting word embeddings with a max-pooling layer; fusing the word embeddings of the different levels with an attention mechanism to obtain multi-level word embeddings; training a named entity recognition neural network model on the multi-level word embeddings to obtain a trained named entity recognition neural network model; and feeding the biomedical text to be recognized into the trained named entity recognition neural network model to obtain entity recognition results.

Description

Method and system for recognizing biomedical named entities
Technical Field
The present disclosure belongs to the technical fields of natural language processing and deep learning, and particularly relates to a method and a system for recognizing biomedical named entities.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Natural Language Processing (NLP) is a branch of artificial intelligence and linguistics, and remains one of the most difficult problems in artificial intelligence. NLP refers to the computer manipulation and processing of the form, sound, and meaning of natural language: the input, output, recognition, analysis, understanding, and generation of characters, words, sentences, and documents. It strongly shapes the way computers and humans interact. Its basic tasks include speech recognition, information retrieval, question answering, and machine translation; models frequently used in NLP include recurrent neural networks and naive Bayes. "Language processing" here refers to computer technology capable of handling spoken and written language; with such techniques, massive data can be retrieved and stored efficiently and quickly. As deep learning has advanced in many fields, natural language processing has likewise made major breakthroughs.
The Attention Mechanism has in recent years become an important tool for improving task performance in natural language processing. An attention score is first computed for each position in a sentence; each dimension of the sentence's word embedding vectors is then weighted by these scores, yielding attention-weighted word embedding vectors. Using an attention mechanism to explore the word embedding information in sentences has become a mature technique in the field of named entity recognition.
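As a concrete illustration, the weighting step described above can be sketched in a few lines of NumPy. This is a minimal, hypothetical self-attention over one sentence's embedding matrix (scaled dot-product scores, softmax, weighted sum), not the patent's exact formulation:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(E):
    """E: (seq_len, dim) word-embedding matrix of one sentence.
    Scores every pair of positions, normalizes the scores, and
    returns attention-weighted embeddings of the same shape."""
    d = E.shape[-1]
    scores = E @ E.T / np.sqrt(d)        # pairwise attention scores
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ E                   # score-weighted embeddings

E = np.random.randn(5, 8)                # 5 tokens, 8-dim embeddings
A = self_attention(E)
```

Each output row mixes the whole sentence's embeddings in proportion to the attention scores, which is the weighting described above.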
Named Entity Recognition (NER) is a basic task in NLP and an important building block for most NLP tasks, such as question answering, machine translation, and syntactic analysis. Earlier approaches were mainly dictionary-based and rule-based. Dictionary-based methods perform fuzzy search or exact string matching, but as new entity names keep emerging, the quality and size of the dictionary become the limiting factor. Rule-based methods manually specify rules and expand the rule set using the characteristics of entity names and common phrase collocations, but they consume enormous human effort and time, are generally effective only in one specific field, and migrate poorly to other domains. Named entity recognition therefore mostly adopts machine learning: model training is continuously optimized, and the trained model shows better performance in test evaluation. The most widely applied models include Hidden Markov Models (HMMs), Support Vector Machines (SVMs), Maximum Entropy Markov Models (MEMMs), and Conditional Random Fields (CRFs). The conditional random field model effectively handles the influence of adjacent labels on the predicted sequence, so it is widely applied to entity recognition with good results. Sequence labeling now generally uses deep learning algorithms, which remove the manual feature-engineering step of traditional algorithms and can effectively extract discriminative features.
In recent years, with the rapid growth of the internet, information arrives in many storage forms. In the biomedical field, literature resources multiply every year, and most of this information is stored as unstructured text. Biomedical named entity recognition aims to convert unstructured text into structured text, recognizing and classifying specific entity names such as genes, proteins, and diseases in biomedical text. Retrieving relevant information quickly and efficiently from such huge data volumes is currently a major challenge.
Disclosure of Invention
To solve the above problems, the present disclosure provides a method and a system for recognizing biomedical named entities, divided into two main parts: multi-level attention embedding vector calculation and cross attention fusion. The multi-level attention embedding vector calculation comprises character-based local attention calculation, character-based global attention calculation, and word-based local attention calculation.
According to some embodiments, the following technical scheme is adopted in the disclosure:
in a first aspect, the present disclosure provides a method of biomedical named entity identification;
a method of biomedical named entity identification, comprising:
sampling character- and word-level features with an attention mechanism to obtain extensions of the word embeddings, then extracting word embeddings with a max-pooling layer;
fusing the word embeddings of the different levels with an attention mechanism to obtain multi-level word embeddings;
training a named entity recognition neural network model on the multi-level word embeddings to obtain a trained named entity recognition neural network model;
and feeding the biomedical text to be recognized into the trained named entity recognition neural network model to obtain an entity recognition result.
In a second aspect, the present disclosure provides a system for biomedical named entity identification;
a system for biomedical named entity recognition, comprising:
a word embedding module configured to: sample character- and word-level features with an attention mechanism to obtain extensions of the word embeddings, then extract word embeddings with a max-pooling layer;
a feature fusion module configured to: fuse the word embeddings of the different levels with an attention mechanism to obtain multi-level word embeddings;
a model training module configured to: train a named entity recognition neural network model on the multi-level word embeddings to obtain a trained named entity recognition neural network model;
an output module configured to: feed the biomedical text to be recognized into the trained named entity recognition neural network model to obtain an entity recognition result.
In a third aspect, the present disclosure provides a computer-readable storage medium;
the present disclosure provides a computer readable storage medium for storing computer instructions which, when executed by a processor, perform a method of biomedical named entity identification as described in the first aspect.
Compared with the prior art, the beneficial effects of this disclosure are:
1. When processing biomedical named entity recognition, the method adopts a named entity recognition neural network model combined with algorithms such as multi-level attention embedding vector calculation and cross attention fusion, improving the accuracy of named entity recognition.
2. During the named entity recognition task, sequence-structured data are labeled and partitioned by a Conditional Random Field (CRF), achieving an accurate final sequence labeling result.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure; they do not limit the disclosure.
FIG. 1 is a flow chart of a method of biomedical named entity identification of the present disclosure;
FIG. 2 is a schematic diagram of a character-based local attention mechanism in an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a character-based global attention mechanism in an embodiment of the present disclosure;
FIG. 4 shows experimental results for character-based local attention in an embodiment of the present disclosure;
FIG. 5 shows experimental results of the cross attention fusion method applied to character-based local attention in an embodiment of the present disclosure;
FIG. 6 shows experimental results for character-based global attention in an embodiment of the present disclosure;
FIG. 7 shows experimental results of the cross attention fusion method applied to character-based global attention in an embodiment of the present disclosure;
FIG. 8 shows experimental results for word-based local attention in an embodiment of the present disclosure.
Detailed Description:
the present disclosure is further described with reference to the following drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein describes particular embodiments only and is not intended to limit example embodiments according to the present disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and the terms "comprises" and/or "comprising" specify the presence of stated features, steps, operations, devices, and components, and/or combinations thereof, unless the context clearly indicates otherwise.
Interpretation of terms:
natural Language Processing (NLP) is a branch of the fields of artificial intelligence and linguistics, and is one of the most difficult problems in artificial intelligence. NLP refers to the operation and processing of information such as the form, sound, meaning of natural language, i.e., the input, output, recognition, analysis, understanding, generation, etc., of characters, words, sentences, chapters, etc., by a computer. It has many important effects on the way computers and humans interact. The basic tasks of the method comprise voice recognition, information retrieval, question-answering systems, machine translation and the like, and a model frequently used by NLP is a recurrent neural network and naive Bayes. The term language processing of natural language processing refers to computer technology capable of processing spoken and written languages. By using the related technology, massive data can be efficiently and quickly retrieved and stored. With the development of deep learning technology in many fields, natural language processing has also made a great breakthrough.
Attention Mechanism (Attention Mechanism) is an important tool to improve task performance in the field of natural language processing in recent years. And finally, weighting each dimension value of the word embedding vector of the sentence according to the attention score to finally obtain the word embedding vector subjected to attention calculation. The use of an attention mechanism for attention exploration of word-embedded information in sentences has become a mature technique in the field of named entity recognition.
Named Entity Recognition (NER) is a basic task in the field of NLP, and is also an important basic tool for most NLP tasks such as question and answer systems, machine translation, syntactic analysis, and the like.
As described in the Background, with the development of science and technology, unstructured biomedical data keeps emerging, and biomedical named entity recognition currently faces many difficulties: entity names carry many modifiers, making entity boundaries harder to determine; multiple entity names share words; strict naming standards are lacking; and abbreviations are ambiguous. To address these problems, adopting a convolutional neural network with multiple filters can greatly improve system performance and recognition accuracy.
Example one
Fig. 1 is a flowchart of the method for recognizing a biomedical named entity according to the present embodiment. As shown in Fig. 1, the present embodiment provides a method of biomedical named entity recognition, comprising:
sampling character- and word-level features with an attention mechanism to obtain extensions of the word embeddings, then extracting word embeddings with a max-pooling layer;
specifically, an attention mechanism is used to extract features from the word embeddings in the sentence;
fusing the word embeddings of the different levels with an attention mechanism to obtain multi-level word embeddings;
training a named entity recognition neural network model on the multi-level word embeddings to obtain a trained named entity recognition neural network model;
and feeding the biomedical text to be recognized into the trained named entity recognition neural network model to obtain an entity recognition result.
As another embodiment, sampling character and word features with an attention mechanism to obtain extensions of the word embeddings comprises: using multi-level attention embedding vector calculation to perform attention exploration within local characters, global characters, and local words respectively, extracting word embedding information at different levels.
The multi-level attention embedding vector calculation comprises character-based local attention calculation, character-based global attention calculation, and word-based local attention calculation.
The character-based local attention calculation models the characters inside each word using one-hot coding, then performs attention calculation on the resulting character embedding matrices, and finally samples the attention-weighted character embeddings with a pooling layer to select appropriate dimension information.
The character-based global attention calculation uses a Bi-GRU to explore context information over sentence characters on the modeled character embedding matrix, then performs attention calculation, and finally samples with a pooling layer to form the corresponding word embeddings.
The word-based local attention calculation performs attention distribution calculation on the word embeddings and extracts the attention distribution among them.
Notably, before the attention distribution is calculated, context exploration is performed on the word embeddings in the sentence to extract context information; the word embedding vectors entering the attention calculation therefore contain contextual information from within the sentence.
As another embodiment, fusing the word embeddings of different levels with an attention mechanism to obtain multi-level word embeddings comprises: using cross attention fusion to weight the attention between the two levels into the corresponding embedding information and fusing them to obtain multi-level word embeddings.
The cross attention fusion algorithm replaces the traditional approach, in which the embeddings obtained from different sampling methods are directly concatenated before the next processing step: in this embodiment, attention is computed between the two sides, the mutual attention is weighted into the corresponding embedding information, and the results are finally concatenated for the next processing step.
As another implementation, before sampling character and word features with an attention mechanism to obtain extensions of the word embeddings, the method further comprises labeling and partitioning the biomedical named entities with a conditional random field.
Specifically, the present embodiment further provides a more detailed implementation: the method of biomedical named entity recognition can be divided into the following processes:
(1) Word embedding. This embodiment uses a multi-level attention form to perform attention exploration within local characters, global characters, and local words respectively, extracting word embedding information at different levels with an attention mechanism; the word embeddings of the different levels are finally fused by attention to generate the embedding vectors required by downstream tasks, and this scheme stably improves the training performance of the model. In many NLP tasks, extracting features from word embedding information has proven effective, e.g., in recent sentence similarity calculation and part-of-speech tagging; word embedding of text improves system performance, and word-level representations greatly enlarge the vocabulary a model can handle.
(2) Multi-level attention feature extraction. For medical texts, pre-trained word embedding vectors are usually used for subsequent model training; however, commonly used pre-trained embeddings offer limited support for specialized vocabulary, i.e., many words are out of vocabulary (OOV). This embodiment therefore uses multi-dimensional attention calculation to explore word embedding information, compensating for the missing embeddings of specialized vocabulary.
(3) Context information extraction. In biomedical texts, extracting effective and useful entity names requires considering a word's position in the sentence and the semantics of its neighbors; context information is thus very beneficial to the NER task. This embodiment mainly adopts a bidirectional long short-term memory network (BiLSTM), composed of a forward LSTM and a backward LSTM, which effectively mitigates the vanishing and exploding gradient problems.
(4) Label annotation and partitioning. During the named entity recognition task, sequence-structured data are labeled and partitioned by a Conditional Random Field (CRF), achieving a more accurate final sequence labeling result. The CRF, a variant of the Markov random field, is built on top of the BiLSTM; it models the conditional probability of the output label sequence given the observation sequence and performs global normalization over all features, which gives it an advantage over other machine learning methods.
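The CRF decoding in step (4) can be illustrated with a small Viterbi decoder. This is a hedged sketch, not the patent's implementation: `emissions` stands in for the per-token tag scores a BiLSTM would output, and `transitions` for the learned CRF transition scores.

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """emissions: (seq_len, n_tags) per-token tag scores;
    transitions: (n_tags, n_tags) score of moving from tag i to tag j.
    Returns the globally best tag sequence, as a CRF layer would."""
    seq_len, n_tags = emissions.shape
    score = emissions[0].copy()
    backptr = np.zeros((seq_len, n_tags), dtype=int)
    for t in range(1, seq_len):
        # total[i, j] = best score of ending at tag j via previous tag i
        total = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    best = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):   # follow back-pointers
        best.append(int(backptr[t][best[-1]]))
    return best[::-1]

em = np.array([[2.0, 0.0],
               [0.0, 1.0]])               # per-token scores for 2 tags
T = np.array([[0.0, -10.0],
              [-10.0, 0.0]])             # heavy penalty for switching tags
path = viterbi_decode(em, T)
```

With the transition penalty, the decoder keeps tag 0 at both positions even though per-token argmax would switch to tag 1 at the second position, illustrating why global CRF decoding beats independent label selection.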
In recent years, neural network methods combining bidirectional long short-term memory (BiLSTM) and Conditional Random Fields (CRF) have achieved strong results on various NER datasets. Although BiLSTM explores a great deal of context information, medical terms occur rarely in existing pre-trained word embeddings, so accurate word senses cannot be obtained and the word labels cannot always be predicted correctly. Pre-trained models, represented by BioBERT and SciBERT, use the BERT architecture trained on specialized medical corpora to obtain higher-level embedding information, thereby improving downstream task performance.
Although pre-trained models converge faster and perform stably, they consume large computing resources, and the cost of training an excellent model is huge. A multi-level attention mechanism is therefore used: a simple, low-cost approach that requires no pre-training and makes the character-level and word-level encoders more informative for specific words.
In the NER task, vanishing or exploding gradients are common problems, but by using a bidirectional long short-term memory network (BiLSTM), the named entity recognition neural network model of this embodiment can obtain context information on both sides of any biomedical text sentence, removing the limited-context problem of feed-forward neural networks. The CRF, as a variant of the Markov random field, effectively handles the probabilistic problem of labeling and partitioning sequence-structured data.
Example two
The purpose of the present disclosure is to improve the accuracy of biomedical named entity recognition. So that the invention may be more clearly understood, it is now described in detail with reference to the accompanying drawings and specific examples.
Previous research shows that sampling character features with a convolutional neural network as an extension of word embedding can improve the performance of the named entity recognition task. This embodiment introduces two character-based techniques, a local attention mechanism and a global attention mechanism, plus a word-based word-embedding attention mechanism; finally, a multi-level cross attention information fusion mechanism, called multi-dimensional fusion, is introduced.
The character-based Local Attention Mechanism (LAM) is shown in Fig. 2. An attention mechanism mines the key components of local characters to embed the characters into words, and max pooling then extracts the word embedding. As an extension of the native word embedding, it increases the information content of the embedded word. The details of LAM are as follows:
(The LAM equations appear only as an image in the original patent and are not reproduced here.)
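Since the LAM details survive only as an image, here is a minimal NumPy sketch of the procedure as the text describes it: one-hot characters, attention within the word, then max pooling into a word vector. The alphabet, projection matrix `W_proj`, and dimensions are illustrative assumptions, not the patent's parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def one_hot_chars(word):
    """(n_chars, |alphabet|) one-hot matrix for the characters of a word."""
    M = np.zeros((len(word), len(ALPHABET)))
    for i, ch in enumerate(word):
        M[i, ALPHABET.index(ch)] = 1.0
    return M

def char_word_embedding(word, W_proj):
    """Attention over a word's characters, then max pooling across
    the character axis to form one word-level vector."""
    C = one_hot_chars(word) @ W_proj             # project chars to dense space
    scores = C @ C.T / np.sqrt(C.shape[-1])      # intra-word attention scores
    attended = softmax(scores, axis=-1) @ C      # attention-weighted chars
    return attended.max(axis=0)                  # max pool -> (dim,)

rng = np.random.default_rng(0)
W = rng.standard_normal((len(ALPHABET), 16))
vec = char_word_embedding("protein", W)
```

The max pool keeps, per dimension, the strongest character-level signal, which is the "extraction" role the pooling layer plays above.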
EXAMPLE III
The character-based Global Attention Mechanism (GAM) is shown in Fig. 3. During training, the characters of all sentences in each batch are combined, and word embeddings are then extracted at the global character level using an attention mechanism. Applying attention directly to the global character set may lose context information; in previous work, character context information was first extracted using a BiLSTM and then processed by an attention mechanism. In our experiments, we found that a BiGRU not only captures context information just as well but is also more computationally efficient. The specific GAM algorithm is as follows:
(The GAM equations appear only as images in the original patent and are not reproduced here.)
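The GAM equations are likewise only images, but the Bi-GRU encoder the text describes can be sketched as follows. This uses one common GRU gating convention, not necessarily the patent's exact formulation; the parameter dictionary `P` and the dimensions are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, P):
    """One GRU step: update gate z, reset gate r, candidate state."""
    z = sigmoid(P["Wz"] @ x + P["Uz"] @ h_prev + P["bz"])
    r = sigmoid(P["Wr"] @ x + P["Ur"] @ h_prev + P["br"])
    h_tilde = np.tanh(P["Wh"] @ x + P["Uh"] @ (r * h_prev) + P["bh"])
    return (1 - z) * h_prev + z * h_tilde

def bigru(X, P_fwd, P_bwd):
    """Run a forward and a backward GRU over X (seq_len, dim) and
    concatenate their hidden states per position, as a Bi-GRU does."""
    d = P_fwd["bz"].shape[0]
    hf, hb = np.zeros(d), np.zeros(d)
    fwd, bwd = [], []
    for x in X:                      # left-to-right pass
        hf = gru_step(x, hf, P_fwd)
        fwd.append(hf)
    for x in X[::-1]:                # right-to-left pass
        hb = gru_step(x, hb, P_bwd)
        bwd.append(hb)
    return np.concatenate([np.array(fwd), np.array(bwd)[::-1]], axis=-1)

rng = np.random.default_rng(0)
d = 4
P = {k: rng.standard_normal((d, d)) for k in ["Wz", "Uz", "Wr", "Ur", "Wh", "Uh"]}
P.update({k: np.zeros(d) for k in ["bz", "br", "bh"]})
H = bigru(rng.standard_normal((3, d)), P, P)
```

Each position of `H` carries both left and right context, which is the context exploration the attention calculation then operates on.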
Word-level local attention mechanisms have been used many times in past research. A word attention mechanism can accurately extract the attention distribution between word embeddings. In addition, studies have shown that extracting features with a BiLSTM after the attention calculation is not ideal; this embodiment therefore uses a BiGRU to extract context information.
(The word-level attention equations appear only as images in the original patent and are not reproduced here.)
Multi-level feature fusion for the NER task is a powerful and efficient strategy that exploits the most important features for better results. This embodiment does not simply concatenate the multi-dimensional feature information directly: when connecting features of two different levels, a cross attention mechanism is introduced for the first time. For the features of the two levels, an attention mechanism computes attention scores for both sides, which are then fused to obtain the multi-level word embedding. Notably, so that attention can be computed between these two levels of features, a BiLSTM or BiGRU is used to normalize their dimensions. The specific calculation process is as follows:
f1 = BiRNN(f1)
f2 = BiRNN(f2)
m1, m2: cross attention scores between f1 and f2 (the exact equations appear only as images in the original patent)
n1 = softmax(m1)
n2 = softmax(m2)
o1, o2: attention outputs derived from n1 and n2 (equations likewise shown only as images in the original)
a1 = o1 ⊙ f1
a2 = o2 ⊙ f2
Att = [a1, a2]
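Because the m and o equations survive only as images, the NumPy sketch below fills them in with one plausible reading: dot-product cross scores m_i, softmax weights n_i, and attended summaries o_i gathered from the opposite side. These assumed forms are consistent with the surviving lines a1 = o1 ⊙ f1, a2 = o2 ⊙ f2, and Att = [a1, a2], but they are a reconstruction, not the patent's exact formulas.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(f1, f2):
    """f1, f2: (seq_len, d) features from two levels of the same sentence,
    already brought to a common dimension (the text uses BiLSTM/BiGRU
    for this). Each side attends to the other; the attended summary is
    weighted into its own features and the two halves are concatenated."""
    m1 = f1 @ f2.T                   # f1's attention scores over f2 (assumed)
    m2 = f2 @ f1.T                   # f2's attention scores over f1 (assumed)
    n1, n2 = softmax(m1), softmax(m2)
    o1 = n1 @ f2                     # f2 content gathered for f1 (assumed)
    o2 = n2 @ f1                     # f1 content gathered for f2 (assumed)
    a1 = o1 * f1                     # element-wise weighting: a1 = o1 ⊙ f1
    a2 = o2 * f2                     # a2 = o2 ⊙ f2
    return np.concatenate([a1, a2], axis=-1)   # Att = [a1, a2]

f1 = np.random.randn(5, 8)
f2 = np.random.randn(5, 8)
Att = cross_attention_fuse(f1, f2)
```

The design choice matches the text: rather than plain concatenation, each level's features are first modulated by what the other level attends to, and only then concatenated.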
A bidirectional long short-term memory (BiLSTM) layer has three control gates, input, forget, and output, to protect and control the cell state, capture bidirectional semantic dependencies, and govern how strongly context information influences the predicted object by adjusting the weights of the related information. The hidden layer uses a sigmoid function. The control structure of a single LSTM unit is as follows:
it = σ(Wiht-1 + UiXt + bi)
ft = σ(Wfht-1 + UfXt + bf)
c̃t = tanh(Wcht-1 + UcXt + bc)
ct = ft ⊙ ct-1 + it ⊙ c̃t
ot = σ(Woht-1 + UoXt + bo)
ht = ot ⊙ tanh(ct)
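The LSTM gate equations above translate directly into code. A minimal NumPy sketch follows; the parameter dictionary `P` and the dimensions are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, P):
    """One LSTM step matching the gate equations: P maps names to
    recurrent weights W*, input weights U*, and biases b*."""
    i = sigmoid(P["Wi"] @ h_prev + P["Ui"] @ x_t + P["bi"])        # input gate
    f = sigmoid(P["Wf"] @ h_prev + P["Uf"] @ x_t + P["bf"])        # forget gate
    c_tilde = np.tanh(P["Wc"] @ h_prev + P["Uc"] @ x_t + P["bc"])  # candidate
    c = f * c_prev + i * c_tilde                                   # new cell state
    o = sigmoid(P["Wo"] @ h_prev + P["Uo"] @ x_t + P["bo"])        # output gate
    h = o * np.tanh(c)                                             # new hidden state
    return h, c

rng = np.random.default_rng(0)
d = 4
P = {k: rng.standard_normal((d, d)) for k in ["Wi", "Ui", "Wf", "Uf", "Wc", "Uc", "Wo", "Uo"]}
P.update({k: np.zeros(d) for k in ["bi", "bf", "bc", "bo"]})
h, c = lstm_step(rng.standard_normal(d), np.zeros(d), np.zeros(d), P)
```

A BiLSTM simply runs this step left-to-right and right-to-left with separate parameters and concatenates the two hidden states per position.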
In the biomedical field, when labeling genes, diseases, and proteins, entities are generally annotated with tagging schemes such as {B, I, O} or {B, I, O, E, S}, where B marks the beginning of an entity, I the inside of an entity, E the end of an entity, S a single-token entity, and O a non-entity token. For example, "B-GENE" is the start-position tag of a gene mention. The BiLSTM outputs label scores; simply selecting the highest-scoring label independently at each position is inaccurate, so a CRF layer is required to ensure the legality of the label sequence.
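The {B, I, O} scheme above can be made concrete with a small decoder that collects entity spans from a tag sequence. The tokens and tags here are hypothetical examples:

```python
def extract_entities(tokens, tags):
    """Collect (entity_type, text) spans from BIO tags such as
    B-GENE / I-GENE / O. An I- tag only continues an entity of
    the same type; anything else closes the current span."""
    entities, current, ctype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((ctype, " ".join(current)))
            current, ctype = [tok], tag[2:]
        elif tag.startswith("I-") and ctype == tag[2:]:
            current.append(tok)
        else:                # "O" or an illegal I- transition
            if current:
                entities.append((ctype, " ".join(current)))
            current, ctype = [], None
    if current:
        entities.append((ctype, " ".join(current)))
    return entities

tokens = ["the", "BRCA1", "gene", "causes", "breast", "cancer"]
tags = ["O", "B-GENE", "O", "O", "B-DISEASE", "I-DISEASE"]
spans = extract_entities(tokens, tags)
```

This also shows why per-token argmax is risky: a stray I- tag with no matching B- would simply be dropped here, whereas a CRF layer prevents such illegal sequences from being predicted in the first place.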
Example four
The embodiment provides a system for biomedical named entity recognition;
a system for biomedical named entity recognition, comprising:
a word embedding module configured to: sample character- and word-level features with an attention mechanism to obtain extensions of the word embeddings, then extract word embeddings with a max-pooling layer;
a feature fusion module configured to: fuse the word embeddings of the different levels with an attention mechanism to obtain multi-level word embeddings;
a model training module configured to: train a named entity recognition neural network model on the multi-level word embeddings to obtain a trained named entity recognition neural network model;
an output module configured to: feed the biomedical text to be recognized into the trained named entity recognition neural network model to obtain an entity recognition result.
It should be noted here that the word embedding module, feature fusion module, model training module, and output module correspond to the specific steps of the first embodiment; the modules share the same examples and application scenarios as the corresponding steps, but are not limited to the contents disclosed in the first embodiment. The modules described above, as part of a system, may be implemented in a computer system such as a set of computer-executable instructions.
EXAMPLE five
A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the method of biomedical named entity recognition described in the above embodiments.
EXAMPLE six
Fig. 4 shows the character-based local attention experiment in an embodiment of the present disclosure: as shown in Fig. 4, the effect of different numbers of attention heads on the character-level data is examined, i.e., the best-performing head count is explored by increasing the number of heads.
EXAMPLE seven
FIG. 5 shows the results of the cross-attention fusion method in an embodiment of the present disclosure for the character-level local attention experiment. As shown in FIG. 5, when the character attention is obtained, feeding it into step three by direct concatenation is compared with attention cross fusion, in which the cross-fused word embedding is concatenated with the original embedding.
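A minimal reading of the cross-attention fusion compared here: each level's embedding attends over the other level, and the fused result is concatenated with the original embedding. The NumPy sketch below is an illustrative assumption about the mechanism, not the patent's exact formulation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(char_emb, word_emb):
    """char_emb: (n_c, d); word_emb: (n_w, d).

    Each side attends over the other; the cross-fused output is
    concatenated with the original embedding, doubling the width.
    """
    d = char_emb.shape[1]
    char_fused = softmax(char_emb @ word_emb.T / np.sqrt(d)) @ word_emb
    word_fused = softmax(word_emb @ char_emb.T / np.sqrt(d)) @ char_emb
    return (np.concatenate([char_emb, char_fused], axis=-1),   # (n_c, 2d)
            np.concatenate([word_emb, word_fused], axis=-1))   # (n_w, 2d)

rng = np.random.default_rng(2)
c, w = cross_attention_fuse(rng.normal(size=(6, 16)), rng.normal(size=(4, 16)))
print(c.shape, w.shape)  # (6, 32) (4, 32)
```

Direct concatenation, the baseline in FIG. 5, would instead stack the two embeddings without the attention weighting, so neither level can reweight the other's features.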
Example eight
FIG. 6 shows the results of the character-based global attention experiment in an embodiment of the present disclosure. As shown in FIG. 6, the effect of the number of attention heads on performance is tested at different word levels.
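In the global attention path, a Bi-GRU first explores the context of the character sequence before attention is applied (claim 4). A compact NumPy sketch of a bidirectional GRU with randomly initialized weights (illustrative only, not the trained parameters) is:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_pass(xs, Wz, Uz, Wr, Ur, Wh, Uh):
    """One-direction GRU over xs: (seq_len, d_in); returns (seq_len, d_h)."""
    h = np.zeros(Uz.shape[0])
    states = []
    for x in xs:
        z = sigmoid(x @ Wz + h @ Uz)              # update gate
        r = sigmoid(x @ Wr + h @ Ur)              # reset gate
        h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)  # candidate state
        h = (1 - z) * h + z * h_tilde
        states.append(h)
    return np.stack(states)

def bi_gru(xs, d_h, rng):
    def weights():
        ws = []
        for _ in range(3):  # z, r, h gates: one input and one recurrent matrix each
            ws.append(rng.normal(size=(xs.shape[1], d_h)) / np.sqrt(xs.shape[1]))
            ws.append(rng.normal(size=(d_h, d_h)) / np.sqrt(d_h))
        return ws
    fwd = gru_pass(xs, *weights())
    bwd = gru_pass(xs[::-1], *weights())[::-1]    # run backwards, re-reverse
    return np.concatenate([fwd, bwd], axis=-1)    # (seq_len, 2 * d_h)

rng = np.random.default_rng(3)
char_seq = rng.normal(size=(12, 16))  # 12 characters of a sentence
ctx = bi_gru(char_seq, d_h=8, rng=rng)
print(ctx.shape)  # (12, 16)
```

Each output position thus carries both left and right context, which is what the subsequent attention and max-pooling layers consume.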
Example nine
FIG. 7 shows the results of the cross-attention fusion method on the character-level global attention experiment in an embodiment of the present disclosure. As shown in FIG. 7, the influence of the word-level data is compared.
Example ten
FIG. 8 shows the results of the word-based local attention experiment in an embodiment of the present disclosure. As shown in FIG. 8, after the influence of character information is added, three ways of handling word-level information are contrasted: directly using attention embedding, applying a BiLSTM followed by attention extraction, and using cross attention.
The above description is only a preferred embodiment of the present disclosure and is not intended to limit the present disclosure, and various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Although the present disclosure has been described with reference to specific embodiments, it should be understood that the scope of the present disclosure is not limited thereto, and those skilled in the art will appreciate that various modifications and changes can be made without departing from the spirit and scope of the present disclosure.

Claims (10)

1. A method of biomedical named entity recognition, comprising:
performing feature sampling on characters and words with an attention mechanism to obtain the respective word-embedding expansions, and then extracting word embeddings through a max-pooling layer;
fusing word embeddings of different levels with an attention mechanism to obtain multi-level word embeddings;
feeding the multi-level word embeddings into a named entity recognition neural network model for training to obtain a trained named entity recognition neural network model;
and inputting the biomedical named entity to be recognized into the trained named entity recognition neural network model to obtain an entity recognition result.
2. The method of biomedical named entity recognition according to claim 1, wherein performing feature sampling on characters and words with an attention mechanism to obtain the respective word-embedding expansions comprises: performing attention exploration within local characters, global characters, and local words respectively by means of multi-level attention embedding vector computation, and extracting word embedding information at the different levels.
3. The method of biomedical named entity recognition of claim 2, wherein the multi-level attention embedding vector computation comprises character-based local attention computation, which models the characters inside a word with one-hot encoding, performs attention computation on each modeled character embedding matrix, computes the finally output attended character embeddings, and samples suitable dimension information with a max-pooling layer.
4. The method of biomedical named entity recognition of claim 3, wherein the multi-level attention embedding vector computation comprises character-based global attention computation, which explores the context information of the characters of a sentence by applying a Bi-GRU to the modeled character embedding matrix, then performs attention computation, and finally samples with a max-pooling layer to form the corresponding word embeddings.
5. The method of biomedical named entity recognition of claim 2, wherein the multi-level attention embedding vector computation further comprises word-based local attention computation, which performs attention distribution computation on word embeddings to extract the attention distribution between word embeddings.
6. The method of biomedical named entity recognition according to claim 4 or 5, wherein before the attention distribution is computed, context information is extracted by context exploration of the word embeddings inside the sentence.
7. The method of biomedical named entity recognition of claim 1, wherein fusing word embeddings of different levels with an attention mechanism to obtain multi-level word embeddings comprises: weighting the attention of the two different levels into the corresponding embedded information by cross-attention fusion, thereby obtaining multi-level word embeddings.
8. The method of biomedical named entity recognition of claim 1, further comprising labeling and partitioning the biomedical named entities with conditional random fields before performing the feature sampling of characters as an extension of word embedding.
9. A system for biomedical named entity recognition, comprising:
a word embedding module configured to: perform feature sampling on characters and words with an attention mechanism to obtain the respective word-embedding expansions, and then extract word embeddings through a max-pooling layer;
a feature fusion module configured to: fuse word embeddings of different levels with an attention mechanism to obtain multi-level word embeddings;
a model training module configured to: feed the multi-level word embeddings into a named entity recognition neural network model for training, to obtain a trained named entity recognition neural network model;
an output module configured to: input the biomedical named entity to be recognized into the trained named entity recognition neural network model to obtain an entity recognition result.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the method of biomedical named entity recognition according to any one of claims 1 to 8.
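At decoding time, the conditional-random-field labeling of claim 8 amounts to a Viterbi search for the best tag sequence under emission and transition scores. The sketch below uses toy scores and plain NumPy; it is an illustrative assumption about the decoding step, not the patented training procedure:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """emissions: (seq_len, n_tags) per-token tag scores;
    transitions[i, j]: score of moving from tag i to tag j.
    Returns the highest-scoring tag index sequence."""
    n, t = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((n, t), dtype=int)
    for i in range(1, n):
        # Score of every (previous tag, current tag) pair at position i.
        total = score[:, None] + transitions + emissions[i][None, :]
        back[i] = total.argmax(axis=0)
        score = total.max(axis=0)
    # Backtrack from the best final tag.
    best = [int(score.argmax())]
    for i in range(n - 1, 0, -1):
        best.append(int(back[i, best[-1]]))
    return best[::-1]

# Toy example: tags 0 = O (outside), 1 = B-ENT (entity begin).
emissions = np.array([[5.0, 0.0], [0.0, 5.0], [5.0, 0.0]])
path = viterbi_decode(emissions, np.zeros((2, 2)))
print(path)  # [0, 1, 0]
```

With a learned transition matrix, the same search also penalizes illegal tag sequences (e.g. an inside tag with no preceding begin tag), which is the usual reason for placing a CRF on top of a neural NER model.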
CN202011519249.5A 2020-12-21 2020-12-21 Method and system for recognizing biomedical named entities Active CN112541356B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011519249.5A CN112541356B (en) 2020-12-21 2020-12-21 Method and system for recognizing biomedical named entities


Publications (2)

Publication Number Publication Date
CN112541356A true CN112541356A (en) 2021-03-23
CN112541356B CN112541356B (en) 2022-12-06

Family

ID=75019343

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011519249.5A Active CN112541356B (en) 2020-12-21 2020-12-21 Method and system for recognizing biomedical named entities

Country Status (1)

Country Link
CN (1) CN112541356B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059217A (en) * 2019-04-29 2019-07-26 广西师范大学 A kind of image text cross-media retrieval method of two-level network
CN110334213A (en) * 2019-07-09 2019-10-15 昆明理工大学 The Chinese based on bidirectional crossed attention mechanism gets over media event sequential relationship recognition methods
CN110675860A (en) * 2019-09-24 2020-01-10 山东大学 Voice information identification method and system based on improved attention mechanism and combined with semantics
CN110750992A (en) * 2019-10-09 2020-02-04 吉林大学 Named entity recognition method, device, electronic equipment and medium
CN111813907A (en) * 2020-06-18 2020-10-23 浙江工业大学 Question and sentence intention identification method in natural language question-answering technology
CN111914097A (en) * 2020-07-13 2020-11-10 吉林大学 Entity extraction method and device based on attention mechanism and multi-level feature fusion


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
徐凯等: "基于结合多头注意力机制BiGRU网络的生物医学命名实体识别", 《计算机应用与软件》 *
程名等: "融合注意力机制和BiLSTM+CRF的渔业标准命名实体识别", 《大连海洋大学学报》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113779993A (en) * 2021-06-09 2021-12-10 北京理工大学 Medical entity identification method based on multi-granularity text embedding
CN113779993B (en) * 2021-06-09 2023-02-28 北京理工大学 Medical entity identification method based on multi-granularity text embedding
CN113486666A (en) * 2021-07-07 2021-10-08 济南超级计算技术研究院 Medical named entity recognition method and system
CN113723051A (en) * 2021-08-26 2021-11-30 泰康保险集团股份有限公司 Text labeling method and device, electronic equipment and storage medium
CN113723051B (en) * 2021-08-26 2023-09-15 泰康保险集团股份有限公司 Text labeling method and device, electronic equipment and storage medium
CN113838524A (en) * 2021-09-27 2021-12-24 电子科技大学长三角研究院(衢州) S-nitrosylation site prediction method, model training method and storage medium
CN113838524B (en) * 2021-09-27 2024-04-26 电子科技大学长三角研究院(衢州) S-nitrosylation site prediction method, model training method and storage medium
CN114282539A (en) * 2021-12-14 2022-04-05 重庆邮电大学 Named entity recognition method based on pre-training model in biomedical field
CN116451690A (en) * 2023-03-21 2023-07-18 麦博(上海)健康科技有限公司 Medical field named entity identification method
CN116611436A (en) * 2023-04-18 2023-08-18 广州大学 Threat information-based network security named entity identification method

Also Published As

Publication number Publication date
CN112541356B (en) 2022-12-06

Similar Documents

Publication Publication Date Title
CN112541356B (en) Method and system for recognizing biomedical named entities
CN108460013B (en) Sequence labeling model and method based on fine-grained word representation model
Yao et al. An improved LSTM structure for natural language processing
CN112733541A (en) Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism
CN111666758B (en) Chinese word segmentation method, training device and computer readable storage medium
CN112818118B (en) Reverse translation-based Chinese humor classification model construction method
CN111414481A (en) Chinese semantic matching method based on pinyin and BERT embedding
CN111738007A (en) Chinese named entity identification data enhancement algorithm based on sequence generation countermeasure network
Gao et al. Named entity recognition method of Chinese EMR based on BERT-BiLSTM-CRF
Rendel et al. Using continuous lexical embeddings to improve symbolic-prosody prediction in a text-to-speech front-end
CN113360667B (en) Biomedical trigger word detection and named entity identification method based on multi-task learning
CN112784604A (en) Entity linking method based on entity boundary network
CN112765956A (en) Dependency syntax analysis method based on multi-task learning and application
CN111191464A (en) Semantic similarity calculation method based on combined distance
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
CN110134950A (en) A kind of text auto-collation that words combines
CN115600597A (en) Named entity identification method, device and system based on attention mechanism and intra-word semantic fusion and storage medium
CN115169349A (en) Chinese electronic resume named entity recognition method based on ALBERT
CN109815497B (en) Character attribute extraction method based on syntactic dependency
CN114970536A (en) Combined lexical analysis method for word segmentation, part of speech tagging and named entity recognition
Göker et al. Neural text normalization for turkish social media
CN116680407A (en) Knowledge graph construction method and device
CN115759102A (en) Chinese poetry wine culture named entity recognition method
CN115510230A (en) Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism
CN109960782A (en) A kind of Tibetan language segmenting method and device based on deep neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant