CN112733541A - Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism - Google Patents
- Publication number
- CN112733541A (application CN202110016942.9A)
- Authority
- CN
- China
- Prior art keywords
- bert
- bigru
- model
- idcnn
- crf
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Machine Translation (AREA)
Abstract
The invention relates to a named entity identification method of BERT-BiGRU-IDCNN-CRF based on an attention mechanism, which comprises the following steps: training a BERT pre-trained language model on a large-scale unlabeled corpus; constructing a complete BERT-BiGRU-IDCNN-Attention-CRF named entity recognition model on the basis of the trained BERT model; and constructing an entity recognition training set and training the complete entity recognition model on it. The invention combines the feature vectors extracted by the BiGRU and IDCNN neural networks, overcoming the defect that the BiGRU neural network neglects local features when extracting global context features. It also introduces an attention mechanism that assigns weights to the extracted features, strengthening the features that play a key role in entity recognition and weakening irrelevant ones, thereby further improving the performance of named entity recognition.
Description
Technical Field
The invention belongs to the field of named entity recognition, and particularly relates to a named entity recognition method of BERT-BiGRU-IDCNN-CRF based on an attention mechanism.
Background
Named Entity Recognition (NER) is one of the basic tasks in the field of natural language processing: recognizing entities in text, that is, instances of concepts such as person names, place names, and organization names. Named entity recognition is widely applied in tasks such as information extraction, question answering systems, intelligent translation, and knowledge graph construction.
Named entity recognition methods fall mainly into the following three categories:
The first category is dictionary- and rule-based methods, which perform named entity recognition by matching text against manually constructed dictionaries or rule templates.
The second category is statistical machine learning-based methods, which treat the named entity recognition task as a sequence tagging problem and learn tagging models from large-scale corpora. Such methods mainly include Hidden Markov Models (HMMs), Maximum Entropy Models (MEMs), Support Vector Machines (SVMs), Conditional Random Fields (CRFs), and the like.
The third category is deep learning-based methods: characters or words are mapped to a vector space and then input to a neural network for feature extraction and label prediction. A popular neural network model is BiLSTM-CRF, which extracts global context features through the BiLSTM, captures the dependencies among labels through the CRF, and outputs the corrected context features. In addition, to accelerate convergence, scholars have proposed the BiGRU-CRF model.
The above prior art has the following disadvantages:
1. Dictionary- and rule-based methods rely on rule templates manually constructed by linguists; they are labor-intensive, subjective, error-prone, and cannot be transferred between different domains.
2. Statistical machine learning-based methods still require a large amount of human involvement in feature extraction and rely heavily on the corpus.
3. In the mainstream deep learning model BiLSTM-CRF, the word vectors of the word embedding layer cannot represent polysemy, which degrades the recognition performance of the subsequent layers; in addition, BiLSTM and BiGRU may omit some local features when extracting global context features.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art by providing a named entity recognition method of BERT-BiGRU-IDCNN-CRF based on an attention mechanism. The technical scheme of the invention is as follows:
A named entity identification method of BERT-BiGRU-IDCNN-CRF based on an attention mechanism comprises the following steps:
S1, training a BERT (Bidirectional Encoder Representations from Transformers) model on a large-scale unlabeled corpus; the BERT model comprises an embedding layer, a bidirectional Transformer encoding layer and an output layer; the embedding layer is the input of the model, and a multi-head attention mechanism is used in the Transformer encoding;
S2, acquiring and labeling the training corpus data of the named entity recognition model, and constructing a data set;
S3, constructing a BERT-BiGRU-IDCNN-Attention-CRF neural network model on the basis of the trained BERT model obtained in step S1, and training the model on the data set obtained in step S2;
S4, performing named entity recognition on the text to be recognized using the trained BERT-BiGRU-IDCNN-Attention-CRF neural network model obtained in step S3, and obtaining a recognition result.
Further, in step S1, the embedding layer is the sum of word embedding, position embedding, and type embedding, which represent word information, position information, and sentence pair information respectively;
The Transformer consists of a self-attention mechanism and a feedforward neural network. The self-attention mechanism calculates the degree of association between the words in a text sequence and adjusts the weight coefficients according to that degree of association, which is calculated as follows:
$$\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
where Q represents the query vector, K represents the key vector, and V represents the value vector. To prevent the inner product of Q and K from becoming too large, the penalty factor $\sqrt{d_k}$ is introduced, where $d_k$ represents the input vector dimension;
The Transformer encoding structure uses a multi-head attention mechanism: Q, K, and V are passed through several different linear mappings, Attention(Q, K, V) is recomputed for each new set of Q, K, V, and the results are concatenated. With W denoting a weight matrix, the specific method is as follows:
$$\mathrm{MultiHead}(Q,K,V)=\mathrm{Concat}(\mathrm{head}_1,\mathrm{head}_2,\ldots,\mathrm{head}_n)W$$
$$\mathrm{head}_i=\mathrm{Attention}(QW_i^{Q},KW_i^{K},VW_i^{V})$$
where $\mathrm{head}_i$ is the attention result of attention head i, and Concat denotes the concatenation of the attention results of the different heads. With the output of the multi-head attention mechanism in the Transformer structure denoted Z and b denoting a bias vector, the fully connected feedforward network FFN is expressed as:
$$\mathrm{FFN}(Z)=\max(0,ZW_1+b_1)W_2+b_2$$
where $W_1$ and $b_1$ are the weights and biases from the multi-head attention mechanism to the add-and-normalize layer, and $W_2$ and $b_2$ are the weights and biases from the add-and-normalize layer to the fully connected feedforward neural network. The BERT model obtains the corresponding parameters by training on large-scale unlabeled data, and at inference outputs the word vector representation of an input sequence, namely $T_1, T_2, \ldots, T_n$.
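As a minimal sketch of the scaled dot-product attention described for step S1, the following pure-Python example computes softmax(QK^T / sqrt(d_k))V on small hand-written matrices. The function names and the toy values of Q, K, and V are illustrative assumptions and do not come from the patent; an actual BERT implementation would use learned projection matrices and batched tensor operations.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Q and K are lists of d_k-dimensional rows; V holds one value row per key."""
    d_k = len(K[0])
    output = []
    for q in Q:
        # Degree of association between this query and every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value rows.
        output.append([sum(w * v[j] for w, v in zip(weights, V))
                       for j in range(len(V[0]))])
    return output

# Toy inputs (assumed values, for illustration only).
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
for row in scaled_dot_product_attention(Q, K, V):
    print(row)
```

Each output row is a convex combination of the rows of V, so the first query, which matches the first key more closely, yields a row nearer to the first value vector.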
Further, step S2 of acquiring and labeling the training corpus data of the named entity recognition model and constructing a data set specifically includes:
S2-1, performing conventional data processing on the original named entity recognition corpus, including correcting wrongly written characters and handling character case;
S2-2, determining the entity types to be recognized according to the actual application requirements;
S2-3, to cope with entities of differing lengths and entity boundaries that are difficult to distinguish, adopting the BIO three-tag entity labeling method: B represents the first character of an entity, I represents a character following the first character within an entity, and O represents a non-entity.
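The BIO labeling of step S2-3 can be sketched as follows. This is an illustrative example only: the helper name `bio_tags`, the span format `(start, end, type)`, the type suffixes such as `-PER`, and the sample sentence are all assumptions that are not specified in the patent, which defines only the B, I, and O tags.

```python
def bio_tags(text, entities):
    """Return one BIO tag per character.

    entities: list of (start, end, entity_type) spans, end exclusive,
    assumed not to overlap.
    """
    tags = ["O"] * len(text)              # default: non-entity
    for start, end, entity_type in entities:
        tags[start] = "B-" + entity_type  # first character of the entity
        for i in range(start + 1, end):
            tags[i] = "I-" + entity_type  # remaining characters of the entity
    return tags

# Hypothetical sentence with a person name (chars 0-2) and a place name (chars 4-5).
sentence = "王小明在重庆工作"
for char, tag in zip(sentence, bio_tags(sentence, [(0, 3, "PER"), (4, 6, "LOC")])):
    print(char, tag)
```

Labeling at the character level lets entities of any length share the same three tags, and the B tag marks each entity boundary explicitly.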
Further, step S3 of constructing the BERT-BiGRU-IDCNN-Attention-CRF neural network model specifically includes the steps of:
S3-1, inputting the named entity recognition training set obtained by preprocessing in step S2 into the BERT pre-trained model trained in step S1, the BERT model outputting word embedding vectors;
S3-2, inputting the word vectors output by the BERT pre-trained language model into a BiGRU (bidirectional gated recurrent unit) neural network model;
S3-3, inputting the word embedding vectors output by the BERT pre-trained language model into an IDCNN (iterated dilated convolutional neural network) model;
S3-4, merging the feature vectors output by the BiGRU and IDCNN neural networks;
S3-5, inputting the merged feature vectors obtained in step S3-4 into an attention mechanism module, assigning weights to the extracted features, strengthening the features that play a key role in named entity recognition, and weakening irrelevant features;
S3-6, inputting the weighted feature vectors obtained in step S3-5 into a CRF layer, extracting the dependency features among the labels, and calculating a loss function;
S3-7, updating the parameters of the whole named entity recognition model with the loss function of the CRF layer using stochastic gradient descent (SGD); the updated parameters comprise those of the BiGRU neural network model, the IDCNN neural network model, the Attention layer and the CRF layer, while the parameters of the BERT model are kept unchanged; training terminates when the loss value produced by the model meets the set requirement or the set maximum number of iterations is reached.
Further, in step S3-2 the word vectors output by the BERT pre-trained language model are input to the BiGRU neural network model. The GRU is a special recurrent neural network whose neuron state is calculated as follows:
$$z_t=\sigma(W_i\cdot[h_{t-1},x_t])$$
$$r_t=\sigma(W_r\cdot[h_{t-1},x_t])$$
$$\tilde{h}_t=\tanh(W_c\cdot[r_t\odot h_{t-1},x_t])$$
$$h_t=(1-z_t)\odot h_{t-1}+z_t\odot\tilde{h}_t$$
where σ is the sigmoid function; $x_t$ is the input vector at time t; $h_t$ is the hidden state and also the output vector, containing all the valid information up to time t; $z_t$ is the update gate, which controls how much information flows into the next time step; $\tilde{h}_t$ is the candidate hidden state; $W_i$, $W_c$, and $W_r$ are the weight matrices of the GRU; and $r_t$ is the reset gate, which controls how much information is discarded. The update gate and the reset gate jointly determine the output of the hidden state.
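The GRU update can be sketched in pure Python as below. This is a minimal sketch under stated assumptions: it uses the common bias-free formulation z_t = σ(W_i·[h_{t-1}, x_t]), r_t = σ(W_r·[h_{t-1}, x_t]), candidate state tanh(W_c·[r_t ⊙ h_{t-1}, x_t]) and h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ candidate; the function names, the zero initial state, and the toy weights are hypothetical. The bidirectional wrapper simply runs one GRU forward, one backward, and concatenates the two states at each position.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gru_step(x_t, h_prev, W_i, W_r, W_c):
    """One GRU step; each weight matrix has shape hidden x (hidden + input)."""
    concat = h_prev + x_t                                  # [h_{t-1}, x_t]
    z = [sigmoid(a) for a in matvec(W_i, concat)]          # update gate
    r = [sigmoid(a) for a in matvec(W_r, concat)]          # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x_t   # [r_t * h_{t-1}, x_t]
    h_cand = [math.tanh(a) for a in matvec(W_c, gated)]    # candidate hidden state
    return [(1 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]

def bigru(xs, hidden, fwd_params, bwd_params):
    """Run a forward and a backward GRU and concatenate their states per position."""
    def run(seq, params):
        h, states = [0.0] * hidden, []
        for x in seq:
            h = gru_step(x, h, *params)
            states.append(h)
        return states
    forward = run(xs, fwd_params)
    backward = run(xs[::-1], bwd_params)[::-1]
    return [f + b for f, b in zip(forward, backward)]

# Toy weights (assumed values): hidden size 2, input size 1.
W = [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1]]
states = bigru([[0.5], [-1.0], [0.25]], 2, (W, W, W), (W, W, W))
print(len(states), len(states[0]))
```

Because the GRU has only two gates, each direction needs three weight matrices, compared with four for an LSTM cell of the same size.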
Further, in step S3-5 the merged feature vectors obtained in step S3-4 are input to the attention mechanism module, which assigns weights to the extracted features, strengthening the features that play a key role in named entity recognition and weakening irrelevant features;
Define the feature vector set $H=\{h_0,h_1,\ldots,h_n\}$ and, as extra information, the part-of-speech matrix $P=\{p_0,p_1,\ldots,p_n\}$. So that the part-of-speech information can assign weights to the target vector set, the weight matrices $W_1$ and $W_2$ are used to apply affine transformations to H and P respectively, bringing them to the same vector space dimension. The transformation results are input to the tanh(·) activation function to obtain the joint feature vectors, which are scored with the softmax function to obtain the weight of each input. Finally, the weight vector is used to apply attention weighting to the feature vector set H, and the weighted feature vectors are output as logits.
Further, in step S3-6 the weighted feature vectors obtained in step S3-5 are input to the CRF layer, the dependency features between the labels are extracted, and the loss function is calculated, specifically:
For an input feature vector X with corresponding predicted sequence Y, the probability of Y is obtained from the scoring function of Y, and the predicted label sequence that maximizes the likelihood function is taken as the output. The scoring function of the predicted sequence Y is calculated as follows:
$$s(X,Y)=\sum_{i=1}^{n-1}A_{y_i,y_{i+1}}+\sum_{i=1}^{n}P_{i,y_i}$$
where A denotes the transition score matrix, n denotes the number of words, $y_i$ denotes the i-th label of the predicted sequence Y, $A_{i,j}$ denotes the score of the transition from label i to label j, P is the scoring matrix output by the upper layer, and $P_{i,j}$ denotes the score of the j-th label for the i-th word. The probability of the predicted sequence Y is:
$$p(Y\mid X)=\frac{e^{s(X,Y)}}{\sum_{Y'\in Y_X}e^{s(X,Y')}}$$
where $s(X,Y')$ is the score of a candidate sequence $Y'$ under the scoring function, $\tilde{Y}$ denotes the actual annotation sequence, and $Y_X$ denotes all possible label sequences. Taking logarithms of both sides gives the likelihood function of the actual annotation sequence:
$$\log p(\tilde{Y}\mid X)=s(X,\tilde{Y})-\log\sum_{Y'\in Y_X}e^{s(X,Y')}$$
The training loss function is:
$$\mathrm{Loss}=-\log p(\tilde{Y}\mid X)$$
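The CRF loss can be sketched as the negative log-likelihood of the annotated sequence: the log of the summed exponentiated scores of all label sequences minus the score of the annotated one. The example below is a minimal sketch under stated assumptions: no start or end transitions are modeled, the function names and toy score matrices are hypothetical, and the normalizer is computed with the standard forward recursion rather than by enumeration. A brute-force enumeration is included only to check the recursion on this tiny case.

```python
import math
from itertools import product

def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def sequence_score(P, A, y):
    """Score of label sequence y: emission scores plus transition scores."""
    emit = sum(P[i][tag] for i, tag in enumerate(y))
    trans = sum(A[y[i]][y[i + 1]] for i in range(len(y) - 1))
    return emit + trans

def log_partition(P, A):
    """Log of the summed exp-scores over all label sequences (forward recursion)."""
    k = len(A)
    alpha = list(P[0])
    for i in range(1, len(P)):
        alpha = [logsumexp([alpha[p] + A[p][c] for p in range(k)]) + P[i][c]
                 for c in range(k)]
    return logsumexp(alpha)

def crf_loss(P, A, y_true):
    """Negative log-likelihood of the annotated sequence."""
    return log_partition(P, A) - sequence_score(P, A, y_true)

# Toy scores (assumed values): 3 words, 2 labels.
P = [[1.0, 0.2], [0.3, 0.9], [0.5, 0.4]]   # P[i][j]: score of label j for word i
A = [[0.1, -0.4], [0.6, 0.0]]              # A[i][j]: score of moving from label i to j
brute = logsumexp([sequence_score(P, A, y) for y in product(range(2), repeat=3)])
print(abs(log_partition(P, A) - brute) < 1e-9)
print(crf_loss(P, A, [0, 1, 0]))
```

The forward recursion costs on the order of n·k² operations for n words and k labels, whereas enumerating all sequences grows as kⁿ.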
Further, step S4 of performing named entity recognition on the text to be recognized using the trained BERT-BiGRU-IDCNN-Attention-CRF neural network model obtained in step S3 to obtain a recognition result includes the steps of:
S4-1, inputting the text data requiring named entity recognition into the trained BERT-BiGRU-IDCNN-Attention-CRF neural network model;
S4-2, converting the text data into word vectors through the BERT model, extracting features from the word vectors through the BiGRU and IDCNN neural networks, assigning weights to the extracted features through the Attention layer, and finally solving for the most probable label sequence of each sentence with the Viterbi algorithm at the CRF layer, this sequence being the result of named entity recognition.
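The Viterbi decoding of step S4-2 can be sketched as a dynamic program over the per-word label scores and the label transition scores. This is a minimal sketch, assuming no start or end transitions; the function name, the label order (0 = O, 1 = B, 2 = I), and the toy score matrices are illustrative assumptions that do not come from the patent.

```python
def viterbi(P, A):
    """Return the highest-scoring label sequence and its score.

    P[i][j]: score of label j for word i; A[i][j]: score of moving from label i to j.
    """
    k = len(A)
    score = list(P[0])          # best score of any path ending in each label
    backpointers = []
    for i in range(1, len(P)):
        new_score, pointers = [], []
        for c in range(k):
            best_prev = max(range(k), key=lambda p: score[p] + A[p][c])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + A[best_prev][c] + P[i][c])
        score = new_score
        backpointers.append(pointers)
    last = max(range(k), key=lambda c: score[c])
    path = [last]
    for pointers in reversed(backpointers):   # follow the back-pointers to the start
        path.append(pointers[path[-1]])
    return path[::-1], max(score)

# Labels: 0 = O, 1 = B, 2 = I. The transition O -> I is heavily penalized.
A = [[0.0, 0.0, -10.0],
     [0.0, 0.0, 0.5],
     [0.0, 0.0, 0.0]]
P = [[0.1, 2.0, 0.0],
     [0.0, 0.5, 1.0],
     [1.5, 0.0, 0.2]]
path, best = viterbi(P, A)
print(path, best)
```

In this toy case the transition scores favor B followed by I, so the decoder returns the sequence B, I, O, which also has the highest per-word scores.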
The invention has the following advantages and beneficial effects:
The invention provides a named entity identification method of BERT-BiGRU-IDCNN-CRF based on an attention mechanism. The method solves the problem that word vectors cannot represent the polysemy of a word, overcomes the defect that the BiGRU module ignores local features when extracting context features, and strengthens the features that play a key role in classification while weakening irrelevant features.
1. The BERT model used by the invention can be trained without supervision on large-scale unlabeled data. Pre-training incorporates context semantic information and learns word-level features, syntactic structure, and context semantics, which overcomes the inability of static word embeddings to represent polysemy.
2. An IDCNN neural network is introduced to extract local features of a text sequence so as to make up for the defect that the BiGRU neural network ignores the local features when extracting global context features.
3. On the basis of global context features and local features extracted by BiGRU and IDCNN, an attention mechanism is introduced, the features playing a key role in the named entity are strengthened by carrying out weight distribution on the features, and irrelevant features are weakened or ignored, so that the accuracy of named entity identification is further improved.
The invention is mainly characterized in that when the BiGRU neural network extracts the global context characteristics, the local characteristics extracted by the IDCNN neural network are fused to make up for the defect that the BiGRU neural network ignores the local characteristics, and meanwhile, the BiGRU neural network has only 2 gate structures, so that the BiGRU neural network has fewer training parameters and higher training speed compared with the BiLSTM neural network. On the other hand, on the basis of the fusion features extracted by the BiGRU neural network and the IDCNN neural network, an attention mechanism is introduced to carry out weight distribution on the features, the features playing a key role in classification are strengthened, and irrelevant features are weakened, so that a better recognition effect is achieved.
Drawings
FIG. 1 is a schematic flow chart of a named entity recognition method based on a BERT-BiGRU-IDCNN-Attention-CRF neural network model according to a preferred embodiment of the present invention;
FIG. 2 is a block diagram of a BERT model of an embodiment of the present invention;
FIG. 3 is a structural diagram of a BERT-BiGRU-IDCNN-Attention-CRF neural network model according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
FIG. 1 is a schematic flow chart of the named entity recognition method of BERT-BiGRU-IDCNN-CRF based on the attention mechanism of the present invention. The method comprises the following steps:
S1, training a BERT model on a large-scale unlabeled corpus;
Specifically, the structure of the BERT model is shown in FIG. 2 and mainly comprises an embedding layer, a bidirectional Transformer encoding layer, and an output layer.
The embedding layer is the input of the model, which is the sum of word embedding, position embedding and type embedding, and respectively represents word information, position information and sentence pair information.
The Transformer consists of a self-attention mechanism and a feedforward neural network. The self-attention mechanism works mainly by calculating the degree of association between the words in a text sequence and adjusting the weight coefficients according to that degree of association, which is calculated as follows:
$$\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
where Q represents the query vector, K represents the key vector, and V represents the value vector. To prevent the inner product of Q and K from becoming too large, the penalty factor $\sqrt{d_k}$ is introduced, where $d_k$ represents the input vector dimension.
The Transformer encoding structure uses a multi-head attention mechanism: Q, K, and V are passed through several different linear mappings, Attention(Q, K, V) is recomputed for each new set of Q, K, V, and the results are concatenated. With W denoting a weight matrix, the specific method is as follows:
$$\mathrm{MultiHead}(Q,K,V)=\mathrm{Concat}(\mathrm{head}_1,\mathrm{head}_2,\ldots,\mathrm{head}_n)W$$
$$\mathrm{head}_i=\mathrm{Attention}(QW_i^{Q},KW_i^{K},VW_i^{V})$$
If the output of the multi-head attention mechanism in the Transformer structure is denoted Z and b denotes a bias vector, the fully connected feedforward network (FFN) can be expressed as:
$$\mathrm{FFN}(Z)=\max(0,ZW_1+b_1)W_2+b_2$$
The BERT model obtains the corresponding parameters by training on large-scale unlabeled data, and at inference outputs the word vector representation of an input sequence, namely $T_1, T_2, \ldots, T_n$.
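As a small worked example of the feedforward formula FFN(Z) = max(0, ZW1 + b1)W2 + b2, the sketch below applies it to a single row of Z in pure Python. The function name and the toy weights are illustrative assumptions; in an actual Transformer the same two linear maps are applied to every position and the matrices are learned.

```python
def ffn(z, W1, b1, W2, b2):
    """Apply FFN(z) = max(0, z W1 + b1) W2 + b2 to one row vector z.

    W1 has shape len(z) x hidden and W2 has shape hidden x output.
    """
    hidden = [max(0.0, sum(zi * W1[i][j] for i, zi in enumerate(z)) + b1[j])
              for j in range(len(b1))]                      # ReLU(z W1 + b1)
    return [sum(hj * W2[j][k] for j, hj in enumerate(hidden)) + b2[k]
            for k in range(len(b2))]                        # hidden W2 + b2

# Toy values (assumed): the negative pre-activation is zeroed by max(0, .).
z = [1.0, -1.0]
W1 = [[1.0, 0.0], [0.0, 1.0]]
b1 = [0.0, 0.0]
W2 = [[2.0], [3.0]]
b2 = [0.5]
print(ffn(z, W1, b1, W2, b2))
```

Here zW1 + b1 is [1, -1], the max(0, ·) step turns it into [1, 0], and the second linear map gives 1·2 + 0·3 + 0.5.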
S2, obtaining and labeling the training corpus data of the named entity recognition model, and constructing a data set, wherein the method comprises the following steps:
S2-1, performing conventional data processing on the original named entity recognition corpus, including correcting wrongly written characters, handling character case, and the like;
S2-2, determining the entity types to be recognized according to the actual application requirements;
S2-3, to cope with entities of differing lengths and entity boundaries that are difficult to distinguish, adopting the BIO three-tag entity labeling method: B represents the first character of an entity, I represents a character following the first character within an entity, and O represents a non-entity.
S3, on the basis of the trained BERT model obtained in the step S1, constructing a BERT-BiGRU-IDCNN-Attention-CRF neural network model, wherein the model structure is shown in FIG. 3, and the model is trained on the data set obtained in the step S2, and the method comprises the following steps:
S3-1, inputting the named entity recognition training set obtained by preprocessing in step S2 into the BERT pre-trained model trained in step S1, the BERT model outputting word embedding vectors;
S3-2, inputting the word vectors output by the BERT pre-trained language model to the BiGRU neural network model;
Specifically, a GRU is a special recurrent neural network whose neuron state is calculated as follows:
$$z_t=\sigma(W_i\cdot[h_{t-1},x_t])$$
$$r_t=\sigma(W_r\cdot[h_{t-1},x_t])$$
$$\tilde{h}_t=\tanh(W_c\cdot[r_t\odot h_{t-1},x_t])$$
$$h_t=(1-z_t)\odot h_{t-1}+z_t\odot\tilde{h}_t$$
where σ is the sigmoid function, $x_t$ is the input vector at time t, and $h_t$ is the hidden state and also the output vector, containing all the valid information up to time t. $z_t$ is the update gate, which controls how much information flows into the next time step. $\tilde{h}_t$ is the candidate hidden state, and $W_i$, $W_c$, and $W_r$ are the weight matrices of the GRU. $r_t$ is the reset gate, which controls how much information is discarded. The update gate and the reset gate jointly determine the output of the hidden state.
S3-3, inputting the word embedding vector output by the BERT pre-training language model to the IDCNN neural network model;
Specifically, the iterated dilated convolutional neural network (IDCNN) is composed of several layers of dilated convolutional neural networks (DCNN) with different dilation widths. Compared with a conventional convolutional neural network, the DCNN dilates its convolution kernel, thereby enlarging the receptive field. For example, with a 3x3 convolution kernel and a dilation width of 2, the kernel size is unchanged, but the receptive field expands to 7x7 when the dilated layer is stacked on an undilated 3x3 layer (the dilated layer on its own covers 5x5). The benefit of the DCNN is that the convolution output contains information from a larger field of view without changing the size of the convolution kernel.
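The growth of the receptive field under dilation can be checked with a short calculation: along one axis, a stack of stride-1 convolutions with kernel size k and dilation widths d_1, ..., d_L covers 1 + (k − 1)(d_1 + ... + d_L) positions. The sketch below implements this formula together with a one-dimensional dilated convolution of the kind applied to a text sequence. The function names, the zero padding, and the toy inputs are illustrative assumptions, not details taken from the patent.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

def dilated_conv1d(xs, kernel, dilation):
    """Same-length 1D dilated convolution with zero padding (odd kernel size)."""
    half = len(kernel) // 2
    out = []
    for t in range(len(xs)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - half) * dilation   # taps are spaced `dilation` apart
            if 0 <= idx < len(xs):            # positions outside the input count as zero
                acc += w * xs[idx]
        out.append(acc)
    return out

print(receptive_field(3, [2]))       # a single layer with dilation 2
print(receptive_field(3, [1, 2]))    # dilation 1 followed by dilation 2
print(dilated_conv1d([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 1.0, 1.0], 2))
```

With a kernel of size 3, one layer of dilation 2 spans 5 positions per axis and the stack of dilations 1 and 2 spans 7, while every layer still has only three weights per axis.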
S3-4, merging the feature vectors output by the BiGRU and IDCNN neural networks;
S3-5, inputting the merged feature vectors obtained in step S3-4 into the attention mechanism module, assigning weights to the extracted features, strengthening the features that play a key role in named entity recognition, and weakening irrelevant features;
Specifically, define the feature vector set $H=\{h_0,h_1,\ldots,h_n\}$ and, as extra information, the part-of-speech matrix $P=\{p_0,p_1,\ldots,p_n\}$. So that the part-of-speech information can assign weights to the target vector set, the weight matrices $W_1$ and $W_2$ are used to apply affine transformations to H and P respectively, bringing them to the same vector space dimension. The transformation results are input to the tanh(·) activation function to obtain the joint feature vectors, which are scored with the softmax function to obtain the weight of each input. Finally, the weight vector is used to apply attention weighting to the feature vector set H, and the weighted feature vectors are output as logits.
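One possible reading of this attention step is sketched below in pure Python. Several details are assumptions rather than the patent's stated method: the two transformed vectors are summed before tanh, a hypothetical scoring vector `v` reduces each joint feature vector to a scalar before the softmax, part-of-speech tags are represented as one-hot vectors, and each feature vector is scaled by its own weight. All names and numeric values are illustrative only.

```python
import math

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def pos_attention(H, P, W1, W2, v):
    """Weight each feature vector h_i using its part-of-speech vector p_i.

    W1 and W2 map h_i and p_i into the same space; v is an assumed scoring vector.
    """
    scores = []
    for h, p in zip(H, P):
        joint = [math.tanh(a + b) for a, b in zip(matvec(W1, h), matvec(W2, p))]
        scores.append(sum(vi * mi for vi, mi in zip(v, joint)))
    alpha = softmax(scores)                                   # one weight per input
    weighted = [[a * x for x in h] for a, h in zip(alpha, H)]
    return alpha, weighted

# Toy inputs (assumed): 3 tokens, 2-dim features, 3 one-hot part-of-speech tags.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
P = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
alpha, weighted = pos_attention(H, P, W1, W2, [1.0, 1.0])
print(alpha)
```

The weights sum to 1 across the sequence, so tokens whose joint feature scores are higher contribute more strongly to the output passed on to the CRF layer.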
S3-6, inputting the weighted feature vectors obtained in step S3-5 into the CRF layer, extracting the dependency features among the labels, and calculating the loss function;
Specifically, for an input feature vector X with corresponding predicted sequence Y, the probability of Y is obtained from the scoring function of Y, and the predicted label sequence that maximizes the likelihood function is taken as the output. The scoring function of the predicted sequence Y is calculated as follows:
$$s(X,Y)=\sum_{i=1}^{n-1}A_{y_i,y_{i+1}}+\sum_{i=1}^{n}P_{i,y_i}$$
where A denotes the transition score matrix, n denotes the number of words, $y_i$ denotes the i-th label of the predicted sequence Y, $A_{i,j}$ denotes the score of the transition from label i to label j, P is the scoring matrix output by the upper layer, and $P_{i,j}$ denotes the score of the j-th label for the i-th word. The probability of the predicted sequence Y is:
$$p(Y\mid X)=\frac{e^{s(X,Y)}}{\sum_{Y'\in Y_X}e^{s(X,Y')}}$$
where $\tilde{Y}$ denotes the actual annotation sequence and $Y_X$ denotes all possible label sequences. Taking logarithms of both sides gives the likelihood function of the actual annotation sequence:
$$\log p(\tilde{Y}\mid X)=s(X,\tilde{Y})-\log\sum_{Y'\in Y_X}e^{s(X,Y')}$$
The training loss function is:
$$\mathrm{Loss}=-\log p(\tilde{Y}\mid X)$$
S3-7, updating the parameters of the whole named entity recognition model with the loss function of the CRF layer using stochastic gradient descent (SGD); the updated parameters comprise those of the BiGRU neural network model, the IDCNN neural network model, the Attention layer and the CRF layer, while the parameters of the BERT model are kept unchanged; training terminates when the loss value produced by the model meets the set requirement or the set maximum number of iterations is reached.
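The update rule of step S3-7 can be sketched as a loop that skips the frozen BERT parameters and stops on either of the two criteria. This is a toy illustration only: the parameter groups are short lists of numbers, the quadratic loss and its full gradient stand in for the CRF loss and a mini-batch gradient, and the names `sgd_train`, `toy_loss`, `toy_grad`, the learning rate, and the thresholds are all hypothetical.

```python
TRAINABLE = ("bigru", "idcnn", "attention", "crf")

def toy_loss(params):
    """Stand-in for the CRF loss: squared distance of trainable weights from 1."""
    return sum((w - 1.0) ** 2 for name in TRAINABLE for w in params[name])

def toy_grad(params):
    """Gradient of the stand-in loss form for every group, including the frozen one."""
    return {name: [2.0 * (w - 1.0) for w in values] for name, values in params.items()}

def sgd_train(params, loss_fn, grad_fn, frozen=("bert",), lr=0.1,
              loss_target=1e-3, max_iters=1000):
    """Gradient descent over all non-frozen groups with two stopping criteria."""
    for step in range(max_iters):
        loss = loss_fn(params)
        if loss <= loss_target:            # the loss meets the set requirement
            return step, loss
        grads = grad_fn(params)
        for name in list(params):
            if name in frozen:             # BERT parameters are kept unchanged
                continue
            params[name] = [w - lr * g for w, g in zip(params[name], grads[name])]
    return max_iters, loss_fn(params)      # the maximum iteration count was reached

params = {"bert": [0.7], "bigru": [0.0, 2.0], "idcnn": [1.5],
          "attention": [0.5], "crf": [3.0]}
steps, final_loss = sgd_train(params, toy_loss, toy_grad)
print(steps, final_loss, params["bert"])
```

Keeping BERT fixed means only the comparatively small BiGRU, IDCNN, Attention, and CRF parameter groups are updated during training.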
S4, carrying out named entity recognition on the text to be recognized by utilizing the trained neural network model based on BERT-BiGRU-IDCNN-Attention-CRF obtained in the step S3 to obtain a recognition result, and comprising the following steps:
S4-1, inputting the text data requiring named entity recognition into the trained BERT-BiGRU-IDCNN-Attention-CRF neural network model;
S4-2, converting the text data into word vectors through the BERT model, extracting features from the word vectors through the BiGRU and IDCNN neural networks, assigning weights to the extracted features through the Attention layer, and finally solving for the most probable label sequence of each sentence with the Viterbi algorithm at the CRF layer, this sequence being the result of named entity recognition.
The modules or methods illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.
Claims (8)
1. A named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism is characterized by comprising the following steps:
S1, training a BERT model on a large-scale unlabeled corpus; the BERT model comprises an embedding layer, a bidirectional Transformer encoding layer and an output layer; the embedding layer is the input of the model, and a multi-head attention mechanism is used in the Transformer encoding;
S2, acquiring and labeling training corpus data for the named entity recognition model, and constructing a data set;
S3, constructing a BERT-BiGRU-IDCNN-Attention-CRF neural network model on the basis of the trained BERT model obtained in step S1, and training the model on the data set obtained in step S2;
S4, performing named entity recognition on the text to be recognized using the trained BERT-BiGRU-IDCNN-Attention-CRF neural network model obtained in step S3, and obtaining a recognition result.
2. The method for identifying named entities on BERT-BiGRU-IDCNN-CRF according to claim 1, wherein in step S1, the embedding layer is the sum of word embedding, position embedding and type embedding, and represents word information, position information and sentence pair information respectively;
the Transformer consists of a self-attention mechanism and a feedforward neural network, wherein the self-attention mechanism calculates the degree of association between words in a text sequence and adjusts the weight coefficients according to that degree of association, which is calculated as follows:
$\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$
wherein Q represents the query vector, K represents the key vector, and V represents the value vector; a penalty factor $\sqrt{d_k}$ is introduced to prevent the inner product of Q and K from becoming too large, where $d_k$ represents the input vector dimension;
the Transformer encoding structure uses a multi-head attention mechanism, that is, Q, K and V undergo several different linear mappings, Attention(Q,K,V) is recomputed on each new Q, K, V, and the resulting attentions are concatenated; W is a weight matrix, and the specific method is as follows:
$\mathrm{MultiHead}(Q,K,V)=\mathrm{Concat}(\mathrm{head}_1,\mathrm{head}_2,\ldots,\mathrm{head}_n)W$
$\mathrm{head}_i$ denotes the attention result of attention head i, and Concat denotes concatenating the attention results of the different attention heads; with the output of the multi-head attention mechanism in the Transformer structure denoted Z and b denoting a bias vector, the fully connected feedforward network FFN is expressed as:
$\mathrm{FFN}(Z)=\max(0,ZW_1+b_1)W_2+b_2$
$W_1$ and $b_1$ respectively represent the weight and the bias from the multi-head attention mechanism to the add-and-normalize layer, and $W_2$ and $b_2$ respectively represent the weight and the bias from the add-and-normalize layer to the fully connected feedforward neural network; the BERT model obtains the corresponding parameters by training on large-scale unlabeled data and, at inference, outputs the word vector representation of the input sequence, namely $T_1,T_2,\ldots,T_n$.
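As an illustration of the quantities named in this claim, the following sketch computes single-head scaled dot-product attention and the position-wise feedforward network on plain Python lists. It is an assumption-laden toy, not BERT itself: the helper names (`matmul`, `attention`, `ffn`) are hypothetical, matrices are lists of rows, and the multi-head projections, residual connections and layer normalization are left out.

```python
import math
from typing import List

Matrix = List[List[float]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two row-major matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]

def softmax(row: List[float]) -> List[float]:
    top = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """softmax(Q K^T / sqrt(d_k)) V, with d_k taken as the key dimension."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)

def ffn(z: Matrix, w1: Matrix, b1: List[float], w2: Matrix, b2: List[float]) -> Matrix:
    """FFN(Z) = max(0, Z W1 + b1) W2 + b2, applied to every row of Z."""
    hidden = [[max(0.0, h + b) for h, b in zip(row, b1)] for row in matmul(z, w1)]
    return [[o + b for o, b in zip(row, b2)] for row in matmul(hidden, w2)]
```

Dividing by the square root of `d_k` keeps the dot products in a range where the softmax does not saturate, which is the role of the penalty factor in the claim.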
3. The method for recognizing the named entity of BERT-BiGRU-IDCNN-CRF based on attention mechanism as claimed in claim 1, wherein the step S2 is performed by obtaining and labeling corpus data of the named entity recognition model to construct a data set, specifically comprising:
S2-1, performing conventional data processing on the raw corpus for named entity recognition, including correcting wrongly written characters and scaling the characters;
S2-2, determining the entity types to be recognized according to the actual application requirements;
S2-3, to cope with entities of differing lengths and entity boundaries that are difficult to distinguish, adopting the BIO three-tag entity labeling method: B marks the first character of an entity, I marks the characters following the first character within an entity, and O marks a non-entity character.
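The BIO labeling of step S2-3 can be illustrated with a short sketch. The function name `bio_tags`, the character-offset span format and the entity-type suffixes (`PER`, `LOC`) are assumptions made for the example; the claim itself only fixes the meaning of B, I and O.

```python
from typing import List, Tuple

def bio_tags(text: str, entities: List[Tuple[int, int, str]]) -> List[str]:
    """Tag every character of `text` with B-/I-/O labels.

    entities: (start, end, type) spans in character offsets, end exclusive.
    """
    tags = ["O"] * len(text)
    for start, end, etype in entities:
        tags[start] = f"B-{etype}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"          # remaining characters of the entity
    return tags

# Hypothetical example sentence: a person name followed by a place name.
example = bio_tags("王芳去北京", [(0, 2, "PER"), (3, 5, "LOC")])
```

Because every entity starts with its own B tag, two adjacent entities of the same type remain separable, which is how the scheme handles unclear boundaries.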
4. The method for identifying named entities on BERT-BiGRU-IDCNN-CRF based on Attention mechanism as claimed in claim 3, wherein the step S3 is to construct a BERT-BiGRU-IDCNN-Attention-CRF neural network model, specifically comprising the steps of:
S3-1, inputting the named entity recognition training set obtained by preprocessing in step S2 into the BERT pre-trained model trained in step S1, the BERT model outputting word embedding vectors;
S3-2, inputting the word embedding vectors output by the BERT pre-trained language model into a BiGRU (bidirectional gated recurrent unit) neural network model;
S3-3, inputting the word embedding vectors output by the BERT pre-trained language model into an IDCNN (iterated dilated convolutional neural network) model;
S3-4, merging the feature vectors output by the BiGRU and IDCNN neural networks;
S3-5, inputting the merged feature vectors obtained in step S3-4 into an attention mechanism module, which assigns weights to the extracted features, strengthening the features that play a key role in named entity recognition and weakening irrelevant features;
S3-6, inputting the weighted feature vectors obtained in step S3-5 into a CRF layer, extracting the dependency features among the labels, and calculating a loss function;
and S3-7, updating the parameters of the whole named entity recognition model by SGD (stochastic gradient descent) using the loss function of the CRF layer, wherein the updated parameters comprise those of the BiGRU neural network model, the IDCNN neural network model, the Attention layer and the CRF layer, while the parameters of the BERT model are kept unchanged; training of the model terminates when the loss value produced by the model meets the set requirement or the set maximum number of iterations is reached.
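To make the dilated convolution behind the IDCNN of step S3-3 concrete, here is a toy sketch on a one-dimensional sequence of scalars. It is illustrative only: the names `dilated_conv1d` and `idcnn_block`, the zero padding, the absence of nonlinearities and the dilation schedule (1, 1, 2) are assumptions and are not taken from the claims.

```python
from typing import List, Sequence

def dilated_conv1d(x: List[float], kernel: List[float], dilation: int) -> List[float]:
    """Same-length 1D convolution with zero padding.

    Kernel taps are spaced `dilation` positions apart, so a width-3 kernel
    with dilation d looks at positions t-d, t and t+d.
    """
    center = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t + (k - center) * dilation
            if 0 <= idx < len(x):       # positions outside the sequence count as zero
                acc += w * x[idx]
        out.append(acc)
    return out

def idcnn_block(x: List[float], kernel: List[float], dilations: Sequence[int] = (1, 1, 2)) -> List[float]:
    """Stack dilated convolutions; the receptive field grows with each layer."""
    for d in dilations:
        x = dilated_conv1d(x, kernel, d)
    return x
```

With a width-3 kernel, the three stacked layers above let each output position see nine input positions, which is how dilation widens context without pooling or many extra layers.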
5. The method for named entity recognition of BERT-BiGRU-IDCNN-CRF based on attention mechanism as claimed in claim 4, wherein the step S3-2 inputs the word vector outputted from the BERT pre-training language model to the BiGRU neural network model; the GRU is a special recurrent neural network, and the state of its neurons is calculated as follows:
$z_t=\sigma(W_i\cdot[h_{t-1},x_t])$
$r_t=\sigma(W_r\cdot[h_{t-1},x_t])$
where σ is the sigmoid function, $x_t$ is the input vector at time t, $h_t$ is the hidden state and also the output vector, containing all valid information up to time t, $z_t$ is the update gate, which controls how much information flows into the next time step, $\tilde{h}_t$ represents the candidate hidden state, $W_i$, $W_c$ and $W_r$ are weight matrices of the GRU, and $r_t$ is the reset gate, which controls how much information is discarded; the update gate and the reset gate jointly determine the output of the hidden state.
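A single GRU step and its bidirectional use can be sketched as follows. The two gate computations follow the equations of this claim; the candidate state and the final interpolation use one commonly used GRU formulation, which is an assumption here because the claim text does not spell them out. All function names are hypothetical and bias terms are omitted.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]

def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))

def matvec(w: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in w]

def gru_step(h_prev: Vector, x: Vector, w_i: Matrix, w_r: Matrix, w_c: Matrix) -> Vector:
    """One GRU step; each weight matrix acts on the concatenation [h, x]."""
    concat = h_prev + x
    z = [sigmoid(v) for v in matvec(w_i, concat)]   # update gate z_t
    r = [sigmoid(v) for v in matvec(w_r, concat)]   # reset gate r_t
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x
    h_cand = [math.tanh(v) for v in matvec(w_c, gated)]  # candidate state (assumed form)
    # Assumed interpolation: h_t = (1 - z_t) * h_{t-1} + z_t * candidate
    return [(1.0 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]

Weights = Tuple[Matrix, Matrix, Matrix]

def bigru(xs: List[Vector], h0: Vector, fwd: Weights, bwd: Weights) -> List[Vector]:
    """Run one GRU left-to-right and one right-to-left, then concatenate per step."""
    forward, h = [], h0
    for x in xs:
        h = gru_step(h, x, *fwd)
        forward.append(h)
    backward, h = [], h0
    for x in reversed(xs):
        h = gru_step(h, x, *bwd)
        backward.append(h)
    backward.reverse()
    return [f + b for f, b in zip(forward, backward)]
```

Concatenating the two directions gives every position access to both its left and right context, which is the reason for using a bidirectional network in sequence labeling.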
6. The method for identifying named entities on a BERT-BiGRU-IDCNN-CRF as claimed in claim 5, wherein the step S3-5 comprises inputting the merged feature vector obtained in the step S3-4 to an attention mechanism module, performing weight distribution on the extracted features, enhancing features critical to named entity identification, and weakening irrelevant features;
defining the set of feature vectors $H=\{h_0,h_1,\ldots,h_n\}$ and the additional information as a part-of-speech matrix $P=\{P_0,P_1,\ldots,P_n\}$; in order to weight the target vector set according to the part-of-speech information, weight matrices $W_1$ and $W_2$ are used to apply affine transformations to H and P so that their vector space dimensions are the same, and the transformation results are input into the tanh(·) activation function to obtain joint feature vectors; the softmax function is then used to weight and score the joint feature vectors to obtain the weight of each input; finally, the weight vector is used to apply attention weighting to the feature vector set H, and the weighted feature vectors, logits, are output, wherein the calculation method is as follows:
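The weighting procedure described in this claim resembles additive attention with auxiliary features, and one possible reading is sketched below. This is an assumption-heavy illustration rather than the claimed calculation: the scoring vector `v`, the per-token scaling of H by the weights, and all function names are hypothetical choices made so that the example is complete.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]

def matvec(w: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in w]

def softmax(scores: Vector) -> Vector:
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pos_attention(H: Matrix, P: Matrix, w1: Matrix, w2: Matrix, v: Vector) -> Tuple[Matrix, Vector]:
    """Weight each feature vector in H using part-of-speech vectors in P.

    w1 and w2 map H and P into a common space; tanh of their sum is the
    joint feature; `v` (hypothetical) turns each joint feature into a score.
    """
    joint = [
        [math.tanh(a + b) for a, b in zip(matvec(w1, h), matvec(w2, p))]
        for h, p in zip(H, P)
    ]
    scores = [sum(vi * mi for vi, mi in zip(v, m)) for m in joint]
    alphas = softmax(scores)                      # one weight per input position
    weighted = [[alpha * x for x in h] for alpha, h in zip(alphas, H)]
    return weighted, alphas
```

Positions whose joint feature scores higher receive a larger share of the softmax mass, so their feature vectors are strengthened relative to the others.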
7. The method for identifying named entities on BERT-BiGRU-IDCNN-CRF according to claim 6, wherein the step S3-6 of inputting the weight-assigned feature vectors obtained in step S3-5 into a CRF layer, extracting the dependency features between labels, and calculating the loss function comprises:
for an input feature vector X with corresponding prediction sequence Y, the probability of the prediction sequence Y is obtained from the scoring function of Y, and the label sequence that maximizes the likelihood function of that probability is finally output, wherein the scoring function of the prediction sequence Y is calculated as follows:
$s(X,Y)=\sum_{i=0}^{n}A_{y_i,y_{i+1}}+\sum_{i=1}^{n}P_{i,y_i}$
A denotes the transition score matrix, n denotes the number of words, $y_i$ denotes the i-th label of the prediction sequence Y, $A_{i,j}$ represents the score of the transition from label i to label j, P is the scoring matrix output by the preceding layer, and $P_{i,j}$ represents the score of the j-th label for the i-th word; the probability of the prediction sequence Y is:
$p(Y\mid X)=\frac{e^{s(X,Y)}}{\sum_{\tilde{Y}\in Y_X}e^{s(X,\tilde{Y})}}$
wherein $\tilde{Y}$ represents a candidate prediction sequence, $s(X,\tilde{Y})$ is its score obtained from the scoring function above, Y is taken during training to be the actual annotation sequence, and $Y_X$ represents all possible label sequences; taking logarithms of both sides gives the likelihood function of the prediction sequence:
$\log p(Y\mid X)=s(X,Y)-\log\sum_{\tilde{Y}\in Y_X}e^{s(X,\tilde{Y})}$
the training loss function is:
$Loss=-\log p(Y\mid X)$
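The CRF scoring and loss computation of this claim can be sketched with the transition matrix A and the per-word scoring matrix P represented as nested lists. This is a simplified illustration: start and end transitions are left out, the function names are hypothetical, and the normalizer is computed with the standard forward algorithm rather than by enumerating all label sequences.

```python
import math
from typing import List

def _logsumexp(values: List[float]) -> float:
    top = max(values)  # factor out the maximum for numerical stability
    return top + math.log(sum(math.exp(v - top) for v in values))

def sequence_score(emissions: List[List[float]], transitions: List[List[float]], labels: List[int]) -> float:
    """Sum of per-word label scores (P) and label-to-label transition scores (A)."""
    score = sum(emissions[t][y] for t, y in enumerate(labels))
    score += sum(transitions[a][b] for a, b in zip(labels, labels[1:]))
    return score

def log_partition(emissions: List[List[float]], transitions: List[List[float]]) -> float:
    """Forward algorithm: log of the summed exp-scores over all label sequences."""
    num_labels = len(emissions[0])
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            _logsumexp([alpha[i] + transitions[i][j] for i in range(num_labels)]) + emissions[t][j]
            for j in range(num_labels)
        ]
    return _logsumexp(alpha)

def crf_loss(emissions: List[List[float]], transitions: List[List[float]], gold: List[int]) -> float:
    """Negative log-likelihood of the gold label sequence."""
    return log_partition(emissions, transitions) - sequence_score(emissions, transitions, gold)
```

Minimizing this loss raises the score of the annotated sequence relative to all competing sequences, and the same matrices are reused at inference time by Viterbi decoding.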
8. The method for named entity recognition on BERT-BiGRU-IDCNN-CRF based on Attention mechanism as claimed in claim 7, wherein the step S4 of performing named entity recognition on the text to be recognized using the trained BERT-BiGRU-IDCNN-Attention-CRF neural network model obtained in step S3 to obtain the recognition result comprises the steps of:
S4-1, inputting the text data requiring named entity recognition into the trained BERT-BiGRU-IDCNN-Attention-CRF neural network model;
and S4-2, converting the text data into word vectors with the BERT model, extracting features from the word vectors with the BiGRU and IDCNN neural networks, assigning weights to the extracted features with the Attention layer, and finally using the Viterbi algorithm in the CRF layer to find the most probable label sequence for each sentence, which is the named entity recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110016942.9A CN112733541A (en) | 2021-01-06 | 2021-01-06 | Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112733541A (en) | 2021-04-30 |
Family
ID=75590850
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110016942.9A (Pending) CN112733541A (en) | 2021-01-06 | 2021-01-06 | Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112733541A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110083831A (en) * | 2019-04-16 | 2019-08-02 | 武汉大学 | A kind of Chinese name entity recognition method based on BERT-BiGRU-CRF |
CN110516231A (en) * | 2019-07-12 | 2019-11-29 | 北京邮电大学 | Expansion convolution entity name recognition method based on attention mechanism |
CN112115238A (en) * | 2020-10-29 | 2020-12-22 | 电子科技大学 | Question-answering method and system based on BERT and knowledge base |
Non-Patent Citations (3)
Title |
---|
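李妮 et al.: "Chinese named entity recognition method based on BERT-IDCNN-CRF", 《山东大学学报(理学版)》 * |
杨文明 et al.: "Named entity recognition in online medical question-answering text", 《计算机***应用》 * |
王雪梅 et al.: "Research on Chinese named entity recognition based on deep learning", 《成都信息工程大学学报》 * |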
Cited By (57)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113139069A (en) * | 2021-05-14 | 2021-07-20 | 上海交通大学 | Knowledge graph construction-oriented Chinese text entity identification method and system for power failure |
CN112949637A (en) * | 2021-05-14 | 2021-06-11 | 中南大学 | Bidding text entity identification method based on IDCNN and attention mechanism |
CN113268740A (en) * | 2021-05-27 | 2021-08-17 | 四川大学 | Input constraint completeness detection method of website system |
CN113221571A (en) * | 2021-05-31 | 2021-08-06 | 重庆交通大学 | Entity relation joint extraction method based on entity correlation attention mechanism |
CN113221571B (en) * | 2021-05-31 | 2022-07-01 | 重庆交通大学 | Entity relation joint extraction method based on entity correlation attention mechanism |
CN113361277A (en) * | 2021-06-16 | 2021-09-07 | 西南交通大学 | Medical named entity recognition modeling method based on attention mechanism |
CN113378574A (en) * | 2021-06-30 | 2021-09-10 | 武汉大学 | Named entity identification method based on KGANN |
CN113378574B (en) * | 2021-06-30 | 2023-10-24 | 武汉大学 | KGANN-based named entity identification method |
CN113392649B (en) * | 2021-07-08 | 2023-04-07 | 上海浦东发展银行股份有限公司 | Identification method, device, equipment and storage medium |
CN113392649A (en) * | 2021-07-08 | 2021-09-14 | 上海浦东发展银行股份有限公司 | Identification method, device, equipment and storage medium |
CN113408291A (en) * | 2021-07-09 | 2021-09-17 | 平安国际智慧城市科技股份有限公司 | Training method, device and equipment for Chinese entity recognition model and storage medium |
CN113408291B (en) * | 2021-07-09 | 2023-06-30 | 平安国际智慧城市科技股份有限公司 | Training method, training device, training equipment and training storage medium for Chinese entity recognition model |
CN113505613A (en) * | 2021-07-29 | 2021-10-15 | 沈阳雅译网络技术有限公司 | Model structure simplification compression method for small CPU equipment |
CN113642862A (en) * | 2021-07-29 | 2021-11-12 | 国网江苏省电力有限公司 | Method and system for identifying named entities of power grid dispatching instructions based on BERT-MBIGRU-CRF model |
CN113673248A (en) * | 2021-08-23 | 2021-11-19 | 中国人民解放军32801部队 | Named entity identification method for testing and identifying small sample text |
CN113673248B (en) * | 2021-08-23 | 2022-02-01 | 中国人民解放军32801部队 | Named entity identification method for testing and identifying small sample text |
CN113569554A (en) * | 2021-09-24 | 2021-10-29 | 北京明略软件***有限公司 | Entity pair matching method and device in database, electronic equipment and storage medium |
CN113836926A (en) * | 2021-09-27 | 2021-12-24 | 北京林业大学 | Electronic medical record named entity identification method, electronic equipment and storage medium |
CN113687242A (en) * | 2021-09-29 | 2021-11-23 | 温州大学 | Lithium ion battery SOH estimation method for optimizing and improving GRU neural network based on GA algorithm |
CN113744805A (en) * | 2021-09-30 | 2021-12-03 | 山东大学 | Method and system for predicting DNA methylation based on BERT framework |
CN113627157A (en) * | 2021-10-13 | 2021-11-09 | 京华信息科技股份有限公司 | Probability threshold value adjusting method and system based on multi-head attention mechanism |
CN114036948A (en) * | 2021-10-26 | 2022-02-11 | 天津大学 | Named entity identification method based on uncertainty quantification |
CN114036948B (en) * | 2021-10-26 | 2024-05-31 | 天津大学 | Named entity identification method based on uncertainty quantification |
CN114048750A (en) * | 2021-12-10 | 2022-02-15 | 广东工业大学 | Named entity identification method integrating information advanced features |
CN114048750B (en) * | 2021-12-10 | 2024-06-28 | 广东工业大学 | Named entity identification method integrating advanced features of information |
CN114580412B (en) * | 2021-12-29 | 2024-06-04 | 西安工程大学 | Clothing entity identification method based on field adaptation |
CN114580412A (en) * | 2021-12-29 | 2022-06-03 | 西安工程大学 | Clothing entity identification method based on field adaptation |
CN114547301A (en) * | 2022-02-21 | 2022-05-27 | 北京百度网讯科技有限公司 | Document processing method, document processing device, recognition model training equipment and storage medium |
CN114580422B (en) * | 2022-03-14 | 2022-12-13 | 昆明理工大学 | Named entity identification method combining two-stage classification of neighbor analysis |
CN114580422A (en) * | 2022-03-14 | 2022-06-03 | 昆明理工大学 | Named entity identification method combining two-stage classification of neighbor analysis |
CN114648029A (en) * | 2022-03-31 | 2022-06-21 | 河海大学 | Electric power field named entity identification method based on BiLSTM-CRF model |
WO2023201791A1 (en) * | 2022-04-22 | 2023-10-26 | 深圳计算科学研究院 | Data entity recognition method and apparatus, and computer device and storage medium |
CN115146630A (en) * | 2022-06-08 | 2022-10-04 | 平安科技(深圳)有限公司 | Word segmentation method, device, equipment and storage medium based on professional domain knowledge |
CN115146630B (en) * | 2022-06-08 | 2023-05-30 | 平安科技(深圳)有限公司 | Word segmentation method, device, equipment and storage medium based on professional domain knowledge |
CN115186667A (en) * | 2022-07-19 | 2022-10-14 | 平安科技(深圳)有限公司 | Named entity identification method and device based on artificial intelligence |
CN115186667B (en) * | 2022-07-19 | 2023-05-26 | 平安科技(深圳)有限公司 | Named entity identification method and device based on artificial intelligence |
CN115587594A (en) * | 2022-09-20 | 2023-01-10 | 广东财经大学 | Network security unstructured text data extraction model training method and system |
CN115906845B (en) * | 2022-11-08 | 2024-05-10 | 芽米科技(广州)有限公司 | Method for identifying title named entity of electronic commerce commodity |
CN115906845A (en) * | 2022-11-08 | 2023-04-04 | 重庆邮电大学 | E-commerce commodity title naming entity identification method |
CN115859983A (en) * | 2022-12-14 | 2023-03-28 | 成都信息工程大学 | Fine-grained Chinese named entity recognition method |
CN115859983B (en) * | 2022-12-14 | 2023-08-25 | 成都信息工程大学 | Fine-granularity Chinese named entity recognition method |
CN115757325A (en) * | 2023-01-06 | 2023-03-07 | 珠海金智维信息科技有限公司 | Intelligent conversion method and system for XES logs |
CN116050418A (en) * | 2023-03-02 | 2023-05-02 | 浙江工业大学 | Named entity identification method, device and medium based on fusion of multi-layer semantic features |
CN116050418B (en) * | 2023-03-02 | 2023-10-31 | 浙江工业大学 | Named entity identification method, device and medium based on fusion of multi-layer semantic features |
CN116484848A (en) * | 2023-03-17 | 2023-07-25 | 北京深维智讯科技有限公司 | Text entity identification method based on NLP |
CN116484848B (en) * | 2023-03-17 | 2024-03-29 | 北京深维智讯科技有限公司 | Text entity identification method based on NLP |
CN116501884A (en) * | 2023-03-31 | 2023-07-28 | 重庆大学 | Medical entity identification method based on BERT-BiLSTM-CRF |
CN116611436A (en) * | 2023-04-18 | 2023-08-18 | 广州大学 | Threat information-based network security named entity identification method |
CN116545779B (en) * | 2023-07-06 | 2023-10-03 | 鹏城实验室 | Network security named entity recognition method, device, equipment and storage medium |
CN116545779A (en) * | 2023-07-06 | 2023-08-04 | 鹏城实验室 | Network security named entity recognition method, device, equipment and storage medium |
CN116561588A (en) * | 2023-07-07 | 2023-08-08 | 北京国电通网络技术有限公司 | Power text recognition model construction method, power equipment maintenance method and device |
CN116561588B (en) * | 2023-07-07 | 2023-10-20 | 北京国电通网络技术有限公司 | Power text recognition model construction method, power equipment maintenance method and device |
CN116682436A (en) * | 2023-07-27 | 2023-09-01 | 成都大成均图科技有限公司 | Emergency alert acceptance information identification method and device |
CN117236338A (en) * | 2023-08-29 | 2023-12-15 | 北京工商大学 | Named entity recognition model of dense entity text and training method thereof |
CN117236338B (en) * | 2023-08-29 | 2024-05-28 | 北京工商大学 | Named entity recognition model of dense entity text and training method thereof |
CN117933259B (en) * | 2024-03-25 | 2024-06-14 | 成都中医药大学 | Named entity recognition method based on local text information |
CN117933259A (en) * | 2024-03-25 | 2024-04-26 | 成都中医药大学 | Named entity recognition method based on local text information |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210430 |