CN112733541A - Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism - Google Patents

Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism

Info

Publication number
CN112733541A
CN112733541A
Authority
CN
China
Prior art keywords
bert
bigru
model
idcnn
crf
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110016942.9A
Other languages
Chinese (zh)
Inventor
张毅
王爽胜
何彬
叶培明
***
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202110016942.9A priority Critical patent/CN112733541A/en
Publication of CN112733541A publication Critical patent/CN112733541A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a named entity identification method of BERT-BiGRU-IDCNN-CRF based on an attention mechanism, which comprises the following steps: training a BERT pre-training language model on large-scale unlabeled corpora; constructing a complete BERT-BiGRU-IDCNN-Attention-CRF named entity recognition model on the basis of the trained BERT model; and constructing an entity recognition training set and training the complete entity recognition model on it. The invention combines the feature vectors extracted by the BiGRU and IDCNN neural networks, overcoming the tendency of the BiGRU network to neglect local features while extracting global context features, and introduces an attention mechanism that distributes weights over the extracted features, strengthening the features that play a key role in entity recognition and weakening irrelevant ones, thereby further improving the effect of named entity recognition.

Description

Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism
Technical Field
The invention belongs to the field of named entity identification, and particularly relates to a method for identifying a named entity of BERT-BiGRU-IDCNN-CRF based on an attention mechanism.
Background
Named Entity Recognition (NER) is one of the basic tasks in the field of natural language processing: identifying the instances of concepts in text, i.e., entities such as person names, place names and organization names. Named entity recognition is widely applied in tasks such as information extraction, question-answering systems, intelligent translation and knowledge graph construction.
Methods for named entity recognition fall mainly into the following three categories:
The first category comprises dictionary- and rule-based methods, which perform named entity recognition by matching against manually constructed dictionaries or rule templates.
The second category is a statistical machine learning-based method, which considers the named entity recognition task as a sequence tagging problem and learns tagging models using large-scale corpora. Such methods mainly include Hidden Markov Models (HMMs), Maximum Entropy Models (MEM), Support Vector Machines (SVMs), Conditional Random Fields (CRFs), and the like.
The third category comprises deep-learning methods, in which characters or words are mapped into a vector space and then fed to a neural network for feature extraction and label prediction. A popular neural network model is BiLSTM-CRF, which extracts global context features through the BiLSTM while the CRF captures the dependencies among labels and outputs the corrected context features. To accelerate convergence, researchers have also proposed the BiGRU-CRF model.
The above prior art has the following disadvantages:
1. Dictionary- and rule-based methods rely on rule templates manually constructed by linguists; they are labor-intensive, subjective, error-prone, and cannot be transferred across domains.
2. Statistical machine learning-based methods still require a large amount of human involvement in feature extraction and rely heavily on the corpus.
3. In the mainstream deep-learning model BiLSTM-CRF, the static word vectors of the embedding layer cannot represent polysemy, which degrades the recognition performance of the subsequent layers; in addition, BiLSTM and BiGRU may miss some local features while extracting global context features.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art. A named entity recognition method of BERT-BiGRU-IDCNN-CRF based on attention mechanism is provided. The technical scheme of the invention is as follows:
a named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism comprises the following steps:
S1, training a BERT (Bidirectional Encoder Representations from Transformers) model on large-scale unlabeled corpora; the BERT model comprises an embedding layer, a bidirectional Transformer coding layer and an output layer; the embedding layer is the input of the model, and a multi-head attention mechanism is used in the Transformer coding;
s2, acquiring and labeling training corpus data of the named entity recognition model, and constructing a data set;
s3, constructing a BERT-BiGRU-IDCNN-Attention-CRF neural network model on the basis of the trained BERT model obtained in the step S1, and training the model on the data set obtained in the step S2;
s4, conducting named entity recognition on the text to be recognized by utilizing the trained neural network model based on BERT-BiGRU-IDCNN-Attention-CRF obtained in the step S3, and obtaining a recognition result.
Further, in step S1, the embedding layer is the sum of word embedding, position embedding, and type embedding, and respectively represents word information, position information, and sentence pair information;
the Transformer consists of a self-attention mechanism and a feedforward neural network; the self-attention mechanism computes the degree of association between the words in a text sequence and adjusts the weight coefficients according to that association, and the association degree is calculated as:

Attention(Q, K, V) = softmax(QK^T / √d_k)·V

wherein Q represents the query vector, K the key vector and V the value vector; the scaling (penalty) factor √d_k is introduced to prevent the inner product of Q and K from becoming too large, where d_k denotes the input vector dimension;

the Transformer coding structure uses a multi-head attention mechanism, that is, Q, K and V are passed through several different linear mappings, attention is recomputed on each new Q, K, V, and the results are concatenated, with W a weight matrix; specifically:

MultiHead(Q, K, V) = Concat(head_1, head_2, …, head_n)·W

head_i = Attention(Q·W_i^Q, K·W_i^K, V·W_i^V)

head_i denotes the attention result of attention head i, and Concat denotes concatenation of the results of the different heads. Let Z denote the output of the multi-head attention mechanism in the Transformer structure and b a bias vector; the fully connected feedforward network FFN is then:

FFN(Z) = max(0, Z·W_1 + b_1)·W_2 + b_2

W_1, b_1 denote the weight and bias from the multi-head attention mechanism to the add-and-normalize layer, and W_2, b_2 denote the weight and bias from the add-and-normalize layer to the fully connected feedforward neural network. The BERT model obtains its parameters by training on large-scale unlabeled data and infers and outputs the word vector representation of the input sequence, namely T_1, T_2, …, T_n.
Further, the step S2 is to obtain and label the corpus data of the named entity recognition model, and construct a data set, which specifically includes:
S2-1, performing routine data processing on the original corpus for named entity recognition, including correcting wrongly written characters and normalizing character forms;
s2-2, determining the entity type to be identified according to the actual application requirement;
S2-3, in order to cope with entities of different lengths and entity boundaries that are difficult to distinguish, adopting the BIO three-tag entity labeling scheme: B marks the first character of an entity, I marks the characters following the first character inside an entity, and O marks non-entity characters.
Further, the step S3 is to construct a BERT-BiGRU-IDCNN-Attention-CRF neural network model, which specifically includes the steps of:
s3-1, inputting the training set of the named entity recognition obtained by preprocessing in the step S2 into the BERT pre-training model trained in the step S1, and outputting word embedding vectors by the BERT model;
S3-2, inputting the word vector output by the BERT pre-training language model into a BiGRU (bidirectional gated recurrent unit) neural network model;
S3-3, inputting the word embedding vector output by the BERT pre-training language model into an IDCNN (iterated dilated convolutional neural network) model;
s3-4, merging the feature vectors output by the BiGRU and IDCNN neural networks;
s3-5, inputting the combined feature vector obtained in the step S3-4 into an attention mechanism module, performing weight distribution on the extracted features, strengthening the features which play a key role in named entity identification, and weakening irrelevant features;
s3-6, inputting the weight-distributed feature vectors obtained in the step S3-5 into a CRF layer, extracting the dependency features among the labels, and calculating a loss function;
And S3-7, updating the parameters of the whole named entity recognition model with the CRF-layer loss function using stochastic gradient descent (SGD); the updated parameters cover the BiGRU neural network, the IDCNN neural network, the Attention layer and the CRF layer, while the parameters of the BERT model are kept unchanged; training terminates when the loss value produced by the model meets the set requirement or the set maximum number of iterations is reached.
Further, the step S3-2 inputs the word vector output by the BERT pre-training language model to the BiGRU neural network model; the GRU is a special recurrent neural network, and the state of its neurons is calculated as follows:
z_t = σ(W_i·[h_{t-1}, x_t])

r_t = σ(W_r·[h_{t-1}, x_t])

h̃_t = tanh(W_c·[r_t ⊙ h_{t-1}, x_t])

h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t

where σ is the sigmoid function, x_t is the input vector at time t, and h_t is the hidden state and also the output vector, containing all valid information up to time t; z_t is the update gate, which controls how much information flows into the next time step; h̃_t denotes the candidate hidden state; W_i, W_c and W_r are the weight matrices of the GRU; r_t is the reset gate, which controls how much information is discarded; the update gate and the reset gate jointly determine the output of the hidden state.
Further, the step S3-5 is to input the merged feature vector obtained in the step S3-4 to the attention mechanism module, perform weight distribution on the extracted features, strengthen the features that play a key role in named entity recognition, and weaken irrelevant features;
defining a feature vector set H = {h_0, h_1, …, h_n}, with the additional information being the part-of-speech matrix P = {P_0, P_1, …, P_n}; so that the part-of-speech information can assign weights to the target vector set, weight matrices W_1 and W_2 are used to apply affine transformations to H and P so that their vector space dimensions agree, and the transformed results are fed into the tanh(·) activation function to obtain the joint feature vector u; the softmax function is then used to score u and obtain the weight α of each input, and finally the weight vector α is used to attention-weight the feature vector set H and output the weighted feature vector logits; the calculation is:

u = tanh(W_1·H + W_2·P)

α = softmax(u)

logits = α ⊙ H
further, the step S3-6 inputs the weight-assigned feature vector obtained in the step S3-5 to the CRF layer, extracts the dependency features between the labels, and calculates the loss function, which specifically includes:
for an input feature vector X, the corresponding prediction sequence is Y, the probability generated by the prediction sequence Y is obtained by calculating the scoring function of Y, and finally, the prediction labeling sequence when the likelihood function of the probability generated by the prediction sequence is maximum is calculated as output, wherein the calculating method of the scoring function of the prediction sequence Y is as follows:
score(X, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i}

A denotes the transition score matrix, n the number of words, y_i the i-th label of the predicted sequence y, A_{i,j} the score of the transition from label i to label j, P the score matrix output by the previous layer, and P_{i,j} the score of the j-th label of the i-th word; the probability generated by the predicted sequence y is:

P(y | X) = exp(score(X, y)) / Σ_{ỹ ∈ Y_X} exp(score(X, ỹ))

where ỹ denotes a candidate labeling sequence with scoring-function value score(X, ỹ), y denotes the actual annotation sequence, and Y_X denotes all possible labeling sequences; taking the logarithm of both sides gives the likelihood function of the predicted sequence:

log P(y | X) = score(X, y) − log Σ_{ỹ ∈ Y_X} exp(score(X, ỹ))

The training loss function is:

L = −log P(y | X) + λθ

where λ represents the penalty term coefficient and θ represents the penalty term.
Further, the step S4 performs named entity recognition on the text to be recognized by using the trained neural network model based on BERT-BiGRU-IDCNN-Attention-CRF obtained in the step S3 to obtain a recognition result, and includes the steps of:
s4-1, inputting text data needing named entity recognition into a trained BERT-BiGRU-IDCNN-Attention-CRF neural network model;
And S4-2, the text data are converted into word vectors by the BERT model, the features of the word vectors are extracted by the BiGRU and IDCNN neural networks, the extracted features are weighted by the Attention layer, and finally the CRF layer uses the Viterbi algorithm to find the most probable labeling sequence of each sentence, which is the result of named entity recognition.
The invention has the following advantages and beneficial effects:
the invention provides a named entity identification method of BERT-BiGRU-IDCNN-CRF based on an attention mechanism. The method can solve the problem that the word vector cannot represent the polysemy of a word, can overcome the defect that the BiGRU module ignores local features in the process of extracting the context features, and can strengthen the features playing a key role in classification and weaken irrelevant features.
1. The BERT model used by the invention can perform unsupervised training on large-scale label-free data, can perform pre-training by combining context semantic information, and learns the characteristics of word level, syntactic structure and context semantic information so as to solve the defect that static word embedding cannot represent word ambiguity.
2. An IDCNN neural network is introduced to extract local features of a text sequence so as to make up for the defect that the BiGRU neural network ignores the local features when extracting global context features.
3. On the basis of global context features and local features extracted by BiGRU and IDCNN, an attention mechanism is introduced, the features playing a key role in the named entity are strengthened by carrying out weight distribution on the features, and irrelevant features are weakened or ignored, so that the accuracy of named entity identification is further improved.
The main idea of the invention is that, while the BiGRU neural network extracts global context features, the local features extracted by the IDCNN neural network are fused in to compensate for the BiGRU's neglect of local features; moreover, since the GRU has only two gate structures, the BiGRU has fewer training parameters and trains faster than the BiLSTM. On the other hand, on the basis of the fused features extracted by the BiGRU and IDCNN neural networks, an attention mechanism is introduced to distribute weights over the features, strengthening those that play a key role in classification and weakening irrelevant ones, thereby achieving a better recognition effect.
Drawings
FIG. 1 is a schematic flow chart of a named entity recognition method based on a BERT-BiGRU-IDCNN-Attention-CRF neural network model according to a preferred embodiment of the present invention;
FIG. 2 is a block diagram of a BERT model of an embodiment of the present invention;
FIG. 3 is a structural diagram of a BERT-BiGRU-IDCNN-Attention-CRF neural network model according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
fig. 1 is a schematic flow chart of the named entity recognition method of BERT-BiGRU-IDCNN-CRF based on the attention mechanism of the present invention. The method comprises the following steps:
S1, training a BERT model on large-scale unlabeled corpora;
specifically, the structure of the BERT model is shown in fig. 2, and mainly includes a model embedding layer, a bidirectional Transformer encoding layer, and an output layer.
The embedding layer is the input of the model, which is the sum of word embedding, position embedding and type embedding, and respectively represents word information, position information and sentence pair information.
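As a rough illustration of this three-part sum, the following PyTorch sketch builds an embedding layer of that form; the vocabulary size, maximum length, type count and hidden size are illustrative assumptions, not values taken from the patent:

```python
import torch
import torch.nn as nn

class BertStyleEmbedding(nn.Module):
    """Sum of word (token), position and type (segment) embeddings, as described above.
    All sizes below are illustrative assumptions."""
    def __init__(self, vocab_size=21128, max_len=512, type_vocab=2, hidden=768):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, hidden)   # word information
        self.pos = nn.Embedding(max_len, hidden)      # position information
        self.seg = nn.Embedding(type_vocab, hidden)   # sentence-pair (type) information

    def forward(self, token_ids, segment_ids):
        # token_ids, segment_ids: (batch, seq_len) integer tensors
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        return self.tok(token_ids) + self.pos(positions)[None, :, :] + self.seg(segment_ids)
```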
The Transformer consists of a self-attention mechanism and a feedforward neural network, the working principle of the self-attention mechanism is mainly to calculate the association degree between words in a text sequence, and adjust the weight coefficient according to the association degree, and the calculation method of the association degree is as follows:
Attention(Q, K, V) = softmax(QK^T / √d_k)·V

wherein Q represents the query vector, K the key vector and V the value vector; the scaling (penalty) factor √d_k is introduced to prevent the inner product of Q and K from becoming too large, where d_k denotes the input vector dimension.

The Transformer coding structure uses a multi-head attention mechanism, that is, Q, K and V are passed through several different linear mappings, attention is recomputed on each new Q, K, V, and the results are concatenated, with W a weight matrix; specifically:

MultiHead(Q, K, V) = Concat(head_1, head_2, …, head_n)·W

head_i = Attention(Q·W_i^Q, K·W_i^K, V·W_i^V)

If the output of the multi-head attention mechanism in the Transformer structure is denoted Z and b is the bias vector, the fully connected feed-forward network (FFN) can be expressed as:

FFN(Z) = max(0, Z·W_1 + b_1)·W_2 + b_2

The BERT model obtains its parameters by training on large-scale unlabeled data and infers and outputs the word vector representation of the input sequence, namely T_1, T_2, …, T_n.
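To make the computations above concrete, here is a minimal numpy sketch of scaled dot-product attention, multi-head attention and the position-wise feed-forward network; the head count and weight shapes are assumptions for illustration, not the patented implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, with sqrt(d_k) as the scaling (penalty) factor."""
    d_k = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def multi_head(Q, K, V, Wq, Wk, Wv, Wo):
    """Apply a different linear mapping per head, attend, concatenate, then project with Wo."""
    heads = [attention(Q @ wq, K @ wk, V @ wv) for wq, wk, wv in zip(Wq, Wk, Wv)]
    return np.concatenate(heads, axis=-1) @ Wo

def ffn(Z, W1, b1, W2, b2):
    """FFN(Z) = max(0, Z W1 + b1) W2 + b2."""
    return np.maximum(0.0, Z @ W1 + b1) @ W2 + b2
```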
S2, obtaining and labeling the training corpus data of the named entity recognition model, and constructing a data set, wherein the method comprises the following steps:
S2-1, performing routine data processing on the original corpus for named entity recognition, including correcting wrongly written characters, normalizing character forms, and the like;
s2-2, determining the entity type to be identified according to the actual application requirement;
S2-3, in order to cope with entities of different lengths and entity boundaries that are difficult to distinguish, adopting the BIO three-tag entity labeling scheme: B marks the first character of an entity, I marks the characters following the first character inside an entity, and O marks non-entity characters.
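For instance, under this scheme each character of a sentence receives one tag; the hypothetical helper below converts entity spans into BIO tags (the sentence, spans and entity types are made up for illustration):

```python
def spans_to_bio(tokens, spans):
    """Convert (start, end, type) entity spans into BIO tags.
    Spans use half-open indices, e.g. (0, 6, "ORG") covers tokens[0:6]."""
    tags = ["O"] * len(tokens)               # O: non-entity character
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"            # B: first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"            # I: characters after the first one
    return tags

# Example (hypothetical sentence and spans):
# tokens = list("重庆邮电大学位于南岸区")
# spans_to_bio(tokens, [(0, 6, "ORG"), (8, 11, "LOC")])
# -> ['B-ORG', 'I-ORG', 'I-ORG', 'I-ORG', 'I-ORG', 'I-ORG', 'O', 'O', 'B-LOC', 'I-LOC', 'I-LOC']
```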
S3, on the basis of the trained BERT model obtained in the step S1, constructing a BERT-BiGRU-IDCNN-Attention-CRF neural network model, wherein the model structure is shown in FIG. 3, and the model is trained on the data set obtained in the step S2, and the method comprises the following steps:
s3-1, inputting the training set of the named entity recognition obtained by preprocessing in the step S2 into the BERT pre-training model trained in the step S1, and outputting word embedding vectors by the BERT model;
s3-2, inputting the word vector output by the BERT pre-training language model to the BiGRU neural network model;
specifically, a GRU is a special recurrent neural network whose state of neurons is calculated as follows:
z_t = σ(W_i·[h_{t-1}, x_t])

r_t = σ(W_r·[h_{t-1}, x_t])

h̃_t = tanh(W_c·[r_t ⊙ h_{t-1}, x_t])

h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t

where σ is the sigmoid function, x_t is the input vector at time t, and h_t is the hidden state and also the output vector, containing all valid information up to time t. z_t is the update gate, which controls how much information flows into the next time step; h̃_t is the candidate hidden state; r_t is the reset gate, which controls how much information is discarded. The update gate and the reset gate jointly determine the output of the hidden state.
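A step-by-step numpy sketch of one GRU update following the gate equations above; the weight shapes and the exact form of the candidate state are standard GRU assumptions, not code from the patent:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W_i, W_r, W_c):
    """One GRU step. W_i, W_r, W_c are the update-gate, reset-gate and candidate weights."""
    concat = np.concatenate([h_prev, x_t])                          # [h_{t-1}, x_t]
    z_t = sigmoid(W_i @ concat)                                     # update gate
    r_t = sigmoid(W_r @ concat)                                     # reset gate
    h_tilde = np.tanh(W_c @ np.concatenate([r_t * h_prev, x_t]))    # candidate hidden state
    h_t = (1.0 - z_t) * h_prev + z_t * h_tilde                      # new hidden state / output
    return h_t
```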
S3-3, inputting the word embedding vector output by the BERT pre-training language model to the IDCNN neural network model;
Specifically, the iterated dilated convolutional neural network (IDCNN) is composed of multiple layers of dilated convolutional neural networks (DCNN) with different dilation widths. Compared with a conventional convolutional neural network, the DCNN dilates its convolution kernel and thereby enlarges the receptive field: for a convolution kernel of size 3x3 with a dilation width of 2, the receptive field is expanded to 7x7. The benefit of the DCNN is that the convolution output contains information from a larger field of view without changing the size of the convolution kernel.
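A PyTorch sketch of one dilated-convolution block of this kind; the channel width, kernel size and dilation schedule are assumptions chosen only to illustrate how stacking dilated 1-D convolutions widens the receptive field while preserving the sequence length:

```python
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    """One IDCNN-style block: stacked 1-D convolutions with increasing dilation.
    Padding of d * (kernel_size - 1) // 2 keeps the sequence length unchanged."""
    def __init__(self, channels=128, kernel_size=3, dilations=(1, 1, 2)):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Conv1d(channels, channels, kernel_size,
                      padding=d * (kernel_size - 1) // 2, dilation=d)
            for d in dilations
        ])

    def forward(self, x):                  # x: (batch, channels, seq_len)
        for conv in self.convs:
            x = torch.relu(conv(x))        # each layer sees a wider context than the last
        return x
```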
S3-4, merging the feature vectors output by the BiGRU and IDCNN neural networks;
s3-5, inputting the combined feature vector obtained in the step S3-4 into an attention mechanism module, performing weight distribution on the extracted features, strengthening the features which play a key role in named entity identification, and weakening irrelevant features;
Specifically, a feature vector set H = {h_0, h_1, …, h_n} is defined, and the additional information is the part-of-speech matrix P = {P_0, P_1, …, P_n}. So that the part-of-speech information can assign weights to the target vector set, weight matrices W_1 and W_2 are used to apply affine transformations to H and P so that their vector space dimensions agree. The transformed results are fed into the tanh(·) activation function to obtain the joint feature vector u; the softmax function is used to score u and obtain the weight α of each input, and finally the weight vector α is used to attention-weight the feature vector set H and output the weighted feature vector logits; the calculation is:

u = tanh(W_1·H + W_2·P)

α = softmax(u)

logits = α ⊙ H
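A numpy sketch of this weighting step, under the assumption that the joint feature, the softmax weights and the weighted output follow the standard additive-attention pattern implied by the description; the scoring vector v is an extra assumption introduced only to turn each joint feature into a scalar score:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_weighting(H, P, W1, W2, v):
    """H: (n, d_h) fused BiGRU/IDCNN features; P: (n, d_p) part-of-speech matrix.
    W1, W2 map both into a shared space; v (assumed) scores each position."""
    u = np.tanh(H @ W1 + P @ W2)      # joint feature vectors, shape (n, d_a)
    alpha = softmax(u @ v)            # one attention weight per position, shape (n,)
    return alpha[:, None] * H         # re-weighted features passed on to the CRF layer
```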
s3-6, inputting the weight-distributed feature vectors obtained in the step S3-5 into a CRF layer, extracting the dependency features among the labels, and calculating a loss function;
specifically, for an input feature vector X, the corresponding prediction sequence is Y, the probability of generation of the prediction sequence Y is obtained by calculating the scoring function of Y, and finally, the prediction labeling sequence when the likelihood function of the probability of generation of the prediction sequence is maximum is calculated and is used as output. The calculation method of the scoring function of the predicted sequence Y is as follows:
score(X, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i}

A represents the transition score matrix, A_{i,j} the score of the transition from label i to label j, P the score matrix output by the upper layer, and P_{i,j} the score of the j-th label of the i-th word; the probability generated by the predicted sequence y is:

P(y | X) = exp(score(X, y)) / Σ_{ỹ ∈ Y_X} exp(score(X, ỹ))

where y represents the actual annotation sequence and Y_X represents all possible labeling sequences; taking the logarithm of both sides gives the likelihood function of the predicted sequence:

log P(y | X) = score(X, y) − log Σ_{ỹ ∈ Y_X} exp(score(X, ỹ))

The training loss function is:

L = −log P(y | X) + λθ

where λ is the penalty term coefficient and θ is the penalty term.
and S3-7, updating parameters of the whole named entity recognition model by using a loss function of a CRF layer by adopting an SGD (random gradient descent) method, wherein the parameters comprise model parameters including a BiGRU neural network model, an IDCNN neural network model, an Attention layer and the CRF layer, the parameters of the BERT model are kept unchanged, and when the loss value generated by the model meets the set requirement or reaches the set maximum iteration number, the training of the model is terminated.
S4, carrying out named entity recognition on the text to be recognized by utilizing the trained neural network model based on BERT-BiGRU-IDCNN-Attention-CRF obtained in the step S3 to obtain a recognition result, and comprising the following steps:
s4-1, inputting text data needing named entity recognition into a trained BERT-BiGRU-IDCNN-Attention-CRF neural network model;
And S4-2, the text data are converted into word vectors by the BERT model, the features of the word vectors are extracted by the BiGRU and IDCNN neural networks, the extracted features are weighted by the Attention layer, and finally the CRF layer uses the Viterbi algorithm to find the most probable labeling sequence of each sentence, which is the result of named entity recognition.
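A minimal Viterbi decoder over per-position label scores P and a transition matrix A, sketching the decoding step described above (variable names are assumptions; a real CRF layer would also handle start/stop transitions):

```python
import numpy as np

def viterbi_decode(P, A):
    """P: (n, k) label scores for each of n positions; A: (k, k) transition scores.
    Returns the highest-scoring label sequence as a list of label indices."""
    n, k = P.shape
    dp = np.zeros((n, k))                  # best score ending in each label
    back = np.zeros((n, k), dtype=int)     # backpointers
    dp[0] = P[0]
    for i in range(1, n):
        scores = dp[i - 1][:, None] + A + P[i][None, :]   # (prev_label, cur_label)
        back[i] = scores.argmax(axis=0)
        dp[i] = scores.max(axis=0)
    path = [int(dp[-1].argmax())]
    for i in range(n - 1, 0, -1):
        path.append(int(back[i, path[-1]]))
    return path[::-1]
```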
The modules or methods illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.

Claims (8)

1. A named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism is characterized by comprising the following steps:
S1, training a BERT model on large-scale unlabeled corpora; the BERT model comprises an embedding layer, a bidirectional Transformer coding layer and an output layer; the embedding layer is the input of the model, and a multi-head attention mechanism is used in the Transformer coding;
s2, acquiring and labeling training corpus data of the named entity recognition model, and constructing a data set;
s3, constructing a BERT-BiGRU-IDCNN-Attention-CRF neural network model on the basis of the trained BERT model obtained in the step S1, and training the model on the data set obtained in the step S2;
s4, conducting named entity recognition on the text to be recognized by utilizing the trained neural network model based on BERT-BiGRU-IDCNN-Attention-CRF obtained in the step S3, and obtaining a recognition result.
2. The method for identifying named entities on BERT-BiGRU-IDCNN-CRF according to claim 1, wherein in step S1, the embedding layer is the sum of word embedding, position embedding and type embedding, and represents word information, position information and sentence pair information respectively;
the Transformer consists of a self-attention mechanism and a feedforward neural network; the self-attention mechanism computes the degree of association between the words in a text sequence and adjusts the weight coefficients according to that association, and the association degree is calculated as:

Attention(Q, K, V) = softmax(QK^T / √d_k)·V

wherein Q represents the query vector, K the key vector and V the value vector; the scaling (penalty) factor √d_k is introduced to prevent the inner product of Q and K from becoming too large, where d_k denotes the input vector dimension;

the Transformer coding structure uses a multi-head attention mechanism, that is, Q, K and V are passed through several different linear mappings, attention is recomputed on each new Q, K, V, and the results are concatenated, with W a weight matrix; specifically:

MultiHead(Q, K, V) = Concat(head_1, head_2, …, head_n)·W

head_i = Attention(Q·W_i^Q, K·W_i^K, V·W_i^V)

head_i denotes the attention result of attention head i, and Concat denotes concatenation of the results of the different heads. Let Z denote the output of the multi-head attention mechanism in the Transformer structure and b a bias vector; the fully connected feedforward network FFN is then:

FFN(Z) = max(0, Z·W_1 + b_1)·W_2 + b_2

W_1, b_1 denote the weight and bias from the multi-head attention mechanism to the add-and-normalize layer, and W_2, b_2 denote the weight and bias from the add-and-normalize layer to the fully connected feedforward neural network. The BERT model obtains its parameters by training on large-scale unlabeled data and infers and outputs the word vector representation of the input sequence, namely T_1, T_2, …, T_n.
3. The method for recognizing the named entity of BERT-BiGRU-IDCNN-CRF based on attention mechanism as claimed in claim 1, wherein the step S2 is performed by obtaining and labeling corpus data of the named entity recognition model to construct a data set, specifically comprising:
S2-1, performing routine data processing on the original corpus for named entity recognition, including correcting wrongly written characters and normalizing character forms;
s2-2, determining the entity type to be identified according to the actual application requirement;
S2-3, in order to cope with entities of different lengths and entity boundaries that are difficult to distinguish, adopting the BIO three-tag entity labeling scheme: B marks the first character of an entity, I marks the characters following the first character inside an entity, and O marks non-entity characters.
4. The method for identifying named entities on BERT-BiGRU-IDCNN-CRF based on Attention mechanism as claimed in claim 3, wherein the step S3 is to construct a BERT-BiGRU-IDCNN-Attention-CRF neural network model, specifically comprising the steps of:
s3-1, inputting the training set of the named entity recognition obtained by preprocessing in the step S2 into the BERT pre-training model trained in the step S1, and outputting word embedding vectors by the BERT model;
S3-2, inputting the word vector output by the BERT pre-training language model into a BiGRU (bidirectional gated recurrent unit) neural network model;
S3-3, inputting the word embedding vector output by the BERT pre-training language model into an IDCNN (iterated dilated convolutional neural network) model;
s3-4, merging the feature vectors output by the BiGRU and IDCNN neural networks;
s3-5, inputting the combined feature vector obtained in the step S3-4 into an attention mechanism module, performing weight distribution on the extracted features, strengthening the features which play a key role in named entity identification, and weakening irrelevant features;
s3-6, inputting the weight-distributed feature vectors obtained in the step S3-5 into a CRF layer, extracting the dependency features among the labels, and calculating a loss function;
And S3-7, updating the parameters of the whole named entity recognition model with the CRF-layer loss function using stochastic gradient descent (SGD); the updated parameters cover the BiGRU neural network, the IDCNN neural network, the Attention layer and the CRF layer, while the parameters of the BERT model are kept unchanged; training terminates when the loss value produced by the model meets the set requirement or the set maximum number of iterations is reached.
5. The method for named entity recognition of BERT-BiGRU-IDCNN-CRF based on attention mechanism as claimed in claim 4, wherein the step S3-2 inputs the word vector outputted from the BERT pre-training language model to the BiGRU neural network model; the GRU is a special recurrent neural network, and the state of its neurons is calculated as follows:
z_t = σ(W_i·[h_{t-1}, x_t])

r_t = σ(W_r·[h_{t-1}, x_t])

h̃_t = tanh(W_c·[r_t ⊙ h_{t-1}, x_t])

h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t

where σ is the sigmoid function, x_t is the input vector at time t, and h_t is the hidden state and also the output vector, containing all valid information up to time t; z_t is the update gate, which controls how much information flows into the next time step; h̃_t denotes the candidate hidden state; W_i, W_c and W_r are the weight matrices of the GRU; r_t is the reset gate, which controls how much information is discarded; the update gate and the reset gate jointly determine the output of the hidden state.
6. The method for identifying named entities on a BERT-BiGRU-IDCNN-CRF as claimed in claim 5, wherein the step S3-5 comprises inputting the merged feature vector obtained in the step S3-4 to an attention mechanism module, performing weight distribution on the extracted features, enhancing features critical to named entity identification, and weakening irrelevant features;
defining a feature vector set H = {h_0, h_1, …, h_n}, with the additional information being the part-of-speech matrix P = {P_0, P_1, …, P_n}; so that the part-of-speech information can assign weights to the target vector set, weight matrices W_1 and W_2 are used to apply affine transformations to H and P so that their vector space dimensions agree, and the transformed results are fed into the tanh(·) activation function to obtain the joint feature vector u; the softmax function is then used to score u and obtain the weight α of each input, and finally the weight vector α is used to attention-weight the feature vector set H and output the weighted feature vector logits; the calculation is:

u = tanh(W_1·H + W_2·P)

α = softmax(u)

logits = α ⊙ H
7. the method for identifying named entities on BERT-BiGRU-IDCNN-CRF according to claim 6, wherein the step S3-6 inputs the weight-assigned feature vectors obtained in step S3-5 into a CRF layer, extracts the dependency features between labels, and calculates the loss function, and comprises:
for an input feature vector X, the corresponding prediction sequence is Y, the probability generated by the prediction sequence Y is obtained by calculating the scoring function of Y, and finally, the prediction labeling sequence when the likelihood function of the probability generated by the prediction sequence is maximum is calculated as output, wherein the calculating method of the scoring function of the prediction sequence Y is as follows:
score(X, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i}

A denotes the transition score matrix, n the number of words, y_i the i-th label of the predicted sequence y, A_{i,j} the score of the transition from label i to label j, P the score matrix output by the previous layer, and P_{i,j} the score of the j-th label of the i-th word; the probability generated by the predicted sequence y is:

P(y | X) = exp(score(X, y)) / Σ_{ỹ ∈ Y_X} exp(score(X, ỹ))

where ỹ denotes a candidate labeling sequence with scoring-function value score(X, ỹ), y denotes the actual annotation sequence, and Y_X denotes all possible labeling sequences; taking the logarithm of both sides gives the likelihood function of the predicted sequence:

log P(y | X) = score(X, y) − log Σ_{ỹ ∈ Y_X} exp(score(X, ỹ))

The training loss function is:

L = −log P(y | X) + λθ

where λ represents the penalty term coefficient and θ represents the penalty term.
8. The method for named entity recognition on BERT-BiGRU-IDCNN-CRF based on Attention mechanism as claimed in claim 7, wherein the step S4 using the trained BERT-BiGRU-IDCNN-Attention-CRF based neural network model obtained in step S3 to perform named entity recognition on the text to be recognized to obtain the recognition result comprises the steps of:
s4-1, inputting text data needing named entity recognition into a trained BERT-BiGRU-IDCNN-Attention-CRF neural network model;
And S4-2, the text data are converted into word vectors by the BERT model, the features of the word vectors are extracted by the BiGRU and IDCNN neural networks, the extracted features are weighted by the Attention layer, and finally the CRF layer uses the Viterbi algorithm to find the most probable labeling sequence of each sentence, which is the result of named entity recognition.
CN202110016942.9A 2021-01-06 2021-01-06 Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism Pending CN112733541A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110016942.9A CN112733541A (en) 2021-01-06 2021-01-06 Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110016942.9A CN112733541A (en) 2021-01-06 2021-01-06 Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism

Publications (1)

Publication Number Publication Date
CN112733541A true CN112733541A (en) 2021-04-30

Family

ID=75590850

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110016942.9A Pending CN112733541A (en) 2021-01-06 2021-01-06 Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism

Country Status (1)

Country Link
CN (1) CN112733541A (en)



Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083831A (en) * 2019-04-16 2019-08-02 武汉大学 A kind of Chinese name entity recognition method based on BERT-BiGRU-CRF
CN110516231A (en) * 2019-07-12 2019-11-29 北京邮电大学 Expansion convolution entity name recognition method based on attention mechanism
CN112115238A (en) * 2020-10-29 2020-12-22 电子科技大学 Question-answering method and system based on BERT and knowledge base

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
李妮 等: "基于BERT-IDCNN-CRF的中文命名实体识别方法" [A Chinese named entity recognition method based on BERT-IDCNN-CRF], 《山东大学学报(理学版)》 [Journal of Shandong University (Natural Science)] *
杨文明 等: "在线医疗问答文本的命名实体识别" [Named entity recognition in online medical question-answering texts], 《计算机***应用》 *
王雪梅 等: "基于深度学习的中文命名实体识别研究" [Research on Chinese named entity recognition based on deep learning], 《成都信息工程大学学报》 [Journal of Chengdu University of Information Technology] *

Cited By (57)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113139069A (en) * 2021-05-14 2021-07-20 上海交通大学 Knowledge graph construction-oriented Chinese text entity identification method and system for power failure
CN112949637A (en) * 2021-05-14 2021-06-11 中南大学 Bidding text entity identification method based on IDCNN and attention mechanism
CN113268740A (en) * 2021-05-27 2021-08-17 四川大学 Input constraint completeness detection method of website system
CN113221571A (en) * 2021-05-31 2021-08-06 重庆交通大学 Entity relation joint extraction method based on entity correlation attention mechanism
CN113221571B (en) * 2021-05-31 2022-07-01 重庆交通大学 Entity relation joint extraction method based on entity correlation attention mechanism
CN113361277A (en) * 2021-06-16 2021-09-07 西南交通大学 Medical named entity recognition modeling method based on attention mechanism
CN113378574A (en) * 2021-06-30 2021-09-10 武汉大学 Named entity identification method based on KGANN
CN113378574B (en) * 2021-06-30 2023-10-24 武汉大学 KGANN-based named entity identification method
CN113392649B (en) * 2021-07-08 2023-04-07 上海浦东发展银行股份有限公司 Identification method, device, equipment and storage medium
CN113392649A (en) * 2021-07-08 2021-09-14 上海浦东发展银行股份有限公司 Identification method, device, equipment and storage medium
CN113408291A (en) * 2021-07-09 2021-09-17 平安国际智慧城市科技股份有限公司 Training method, device and equipment for Chinese entity recognition model and storage medium
CN113408291B (en) * 2021-07-09 2023-06-30 平安国际智慧城市科技股份有限公司 Training method, training device, training equipment and training storage medium for Chinese entity recognition model
CN113505613A (en) * 2021-07-29 2021-10-15 沈阳雅译网络技术有限公司 Model structure simplification compression method for small CPU equipment
CN113642862A (en) * 2021-07-29 2021-11-12 国网江苏省电力有限公司 Method and system for identifying named entities of power grid dispatching instructions based on BERT-MBIGRU-CRF model
CN113673248A (en) * 2021-08-23 2021-11-19 中国人民解放军32801部队 Named entity identification method for testing and identifying small sample text
CN113673248B (en) * 2021-08-23 2022-02-01 中国人民解放军32801部队 Named entity identification method for testing and identifying small sample text
CN113569554A (en) * 2021-09-24 2021-10-29 北京明略软件***有限公司 Entity pair matching method and device in database, electronic equipment and storage medium
CN113836926A (en) * 2021-09-27 2021-12-24 北京林业大学 Electronic medical record named entity identification method, electronic equipment and storage medium
CN113687242A (en) * 2021-09-29 2021-11-23 温州大学 Lithium ion battery SOH estimation method for optimizing and improving GRU neural network based on GA algorithm
CN113744805A (en) * 2021-09-30 2021-12-03 山东大学 Method and system for predicting DNA methylation based on BERT framework
CN113627157A (en) * 2021-10-13 2021-11-09 京华信息科技股份有限公司 Probability threshold value adjusting method and system based on multi-head attention mechanism
CN114036948A (en) * 2021-10-26 2022-02-11 天津大学 Named entity identification method based on uncertainty quantification
CN114036948B (en) * 2021-10-26 2024-05-31 天津大学 Named entity identification method based on uncertainty quantification
CN114048750A (en) * 2021-12-10 2022-02-15 广东工业大学 Named entity identification method integrating information advanced features
CN114048750B (en) * 2021-12-10 2024-06-28 广东工业大学 Named entity identification method integrating advanced features of information
CN114580412B (en) * 2021-12-29 2024-06-04 西安工程大学 Clothing entity identification method based on field adaptation
CN114580412A (en) * 2021-12-29 2022-06-03 西安工程大学 Clothing entity identification method based on field adaptation
CN114547301A (en) * 2022-02-21 2022-05-27 北京百度网讯科技有限公司 Document processing method, document processing device, recognition model training equipment and storage medium
CN114580422B (en) * 2022-03-14 2022-12-13 昆明理工大学 Named entity identification method combining two-stage classification of neighbor analysis
CN114580422A (en) * 2022-03-14 2022-06-03 昆明理工大学 Named entity identification method combining two-stage classification of neighbor analysis
CN114648029A (en) * 2022-03-31 2022-06-21 河海大学 Electric power field named entity identification method based on BiLSTM-CRF model
WO2023201791A1 (en) * 2022-04-22 2023-10-26 深圳计算科学研究院 Data entity recognition method and apparatus, and computer device and storage medium
CN115146630A (en) * 2022-06-08 2022-10-04 平安科技(深圳)有限公司 Word segmentation method, device, equipment and storage medium based on professional domain knowledge
CN115146630B (en) * 2022-06-08 2023-05-30 平安科技(深圳)有限公司 Word segmentation method, device, equipment and storage medium based on professional domain knowledge
CN115186667A (en) * 2022-07-19 2022-10-14 平安科技(深圳)有限公司 Named entity identification method and device based on artificial intelligence
CN115186667B (en) * 2022-07-19 2023-05-26 平安科技(深圳)有限公司 Named entity identification method and device based on artificial intelligence
CN115587594A (en) * 2022-09-20 2023-01-10 广东财经大学 Network security unstructured text data extraction model training method and system
CN115906845B (en) * 2022-11-08 2024-05-10 芽米科技(广州)有限公司 Method for identifying title named entity of electronic commerce commodity
CN115906845A (en) * 2022-11-08 2023-04-04 重庆邮电大学 E-commerce commodity title naming entity identification method
CN115859983A (en) * 2022-12-14 2023-03-28 成都信息工程大学 Fine-grained Chinese named entity recognition method
CN115859983B (en) * 2022-12-14 2023-08-25 成都信息工程大学 Fine-granularity Chinese named entity recognition method
CN115757325A (en) * 2023-01-06 2023-03-07 珠海金智维信息科技有限公司 Intelligent conversion method and system for XES logs
CN116050418A (en) * 2023-03-02 2023-05-02 浙江工业大学 Named entity identification method, device and medium based on fusion of multi-layer semantic features
CN116050418B (en) * 2023-03-02 2023-10-31 浙江工业大学 Named entity identification method, device and medium based on fusion of multi-layer semantic features
CN116484848A (en) * 2023-03-17 2023-07-25 北京深维智讯科技有限公司 Text entity identification method based on NLP
CN116484848B (en) * 2023-03-17 2024-03-29 北京深维智讯科技有限公司 Text entity identification method based on NLP
CN116501884A (en) * 2023-03-31 2023-07-28 重庆大学 Medical entity identification method based on BERT-BiLSTM-CRF
CN116611436A (en) * 2023-04-18 2023-08-18 广州大学 Threat information-based network security named entity identification method
CN116545779B (en) * 2023-07-06 2023-10-03 鹏城实验室 Network security named entity recognition method, device, equipment and storage medium
CN116545779A (en) * 2023-07-06 2023-08-04 鹏城实验室 Network security named entity recognition method, device, equipment and storage medium
CN116561588A (en) * 2023-07-07 2023-08-08 北京国电通网络技术有限公司 Power text recognition model construction method, power equipment maintenance method and device
CN116561588B (en) * 2023-07-07 2023-10-20 北京国电通网络技术有限公司 Power text recognition model construction method, power equipment maintenance method and device
CN116682436A (en) * 2023-07-27 2023-09-01 成都大成均图科技有限公司 Emergency alert acceptance information identification method and device
CN117236338A (en) * 2023-08-29 2023-12-15 北京工商大学 Named entity recognition model of dense entity text and training method thereof
CN117236338B (en) * 2023-08-29 2024-05-28 北京工商大学 Named entity recognition model of dense entity text and training method thereof
CN117933259B (en) * 2024-03-25 2024-06-14 成都中医药大学 Named entity recognition method based on local text information
CN117933259A (en) * 2024-03-25 2024-04-26 成都中医药大学 Named entity recognition method based on local text information

Similar Documents

Publication Publication Date Title
CN112733541A (en) Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism
CN109753566B (en) Model training method for cross-domain emotion analysis based on convolutional neural network
CN112989834B (en) Named entity identification method and system based on flat grid enhanced linear converter
CN110609891A (en) Visual dialog generation method based on context awareness graph neural network
CN110008469B (en) Multilevel named entity recognition method
Xie et al. Fully convolutional recurrent network for handwritten chinese text recognition
CN110263325B (en) Chinese word segmentation system
CN112541356B (en) Method and system for recognizing biomedical named entities
CN112115238A (en) Question-answering method and system based on BERT and knowledge base
CN111767718B (en) Chinese grammar error correction method based on weakened grammar error feature representation
CN114943230B (en) Method for linking entities in Chinese specific field by fusing common sense knowledge
CN110555084A (en) remote supervision relation classification method based on PCNN and multi-layer attention
CN113255366B (en) Aspect-level text emotion analysis method based on heterogeneous graph neural network
Ren et al. Detecting the scope of negation and speculation in biomedical texts by using recursive neural network
CN111476024A (en) Text word segmentation method and device and model training method
CN114781375A (en) Military equipment relation extraction method based on BERT and attention mechanism
CN114417872A (en) Contract text named entity recognition method and system
CN112905736A (en) Unsupervised text emotion analysis method based on quantum theory
Aggarwal et al. Recurrent neural networks
CN114254645A (en) Artificial intelligence auxiliary writing system
CN113535897A (en) Fine-grained emotion analysis method based on syntactic relation and opinion word distribution
CN115600597A (en) Named entity identification method, device and system based on attention mechanism and intra-word semantic fusion and storage medium
CN115331075A (en) Countermeasures type multi-modal pre-training method for enhancing knowledge of multi-modal scene graph
CN115169349A (en) Chinese electronic resume named entity recognition method based on ALBERT
CN114416991A (en) Method and system for analyzing text emotion reason based on prompt

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210430)