CN109885825A - Named entity recognition method, device and computer equipment based on attention mechanism - Google Patents


Publication number
CN109885825A
Authority
CN
China
Prior art keywords
text, layer, identified, name, term vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201910012152.6A
Other languages
Chinese (zh)
Inventor
丁程丹
许开河
王少军
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201910012152.6A
Priority to PCT/CN2019/091305 (published as WO2020143163A1)
Publication of CN109885825A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology


Abstract

The present application proposes an attention-mechanism-based named entity recognition method, device and computer equipment. The method includes: segmenting text to be recognized, and mapping each segmented word of the text to a vector to obtain the word vectors of the text; assigning attention weights to the word vectors of the text, and feeding the weighted word vectors into a named entity recognition model for layer-by-layer computation to obtain the named entity recognition result of the text. The named entity recognition model includes at least two hidden layers; during layer-by-layer computation, the hidden nodes output by one hidden layer are input to the next hidden layer. The present application recognizes named entities through the attention mechanism and thereby improves recognition accuracy.

Description

Named entity recognition method, device and computer equipment based on attention mechanism
[Technical field]
This application relates to the field of artificial intelligence, and in particular to an attention-mechanism-based named entity recognition method, device and computer equipment.
[Background art]
Named Entity Recognition (hereinafter: NER) refers to identifying entities with specific meaning in text, mainly including person names, place names, organization names and/or proper nouns. Natural language processing is an important direction of artificial intelligence and machine learning; within language text processing, named entity recognition is an upstream task whose quality directly affects subsequent work. Named entity recognition is therefore a prerequisite and an important task of information processing.
In the related art, there are mainly two ways to recognize named entities: the first is rule-based, and the second is based on deep learning. The rule-based approach is simple to implement, but its recognition effect is not very good. In the deep-learning approach, the computing capacity of a hidden layer of the model is limited; that is, a hidden layer can only operate on hidden nodes whose length does not exceed a length threshold. When the length of the hidden nodes input to a hidden layer does not exceed the threshold, the hidden layer can operate on all input hidden nodes, and the final named entity recognition result is unaffected. But when the length of the input hidden nodes exceeds the threshold, the hidden layer has to discard some of them. The discarded hidden nodes may well contain named entity information of the text, which makes the recognition of named entities inaccurate.
Therefore, how to improve the accuracy of named entity recognition in text has become an urgent technical problem to be solved.
[Summary of the invention]
Embodiments of the present application provide an attention-mechanism-based named entity recognition method, device and computer equipment, so as to recognize named entities through the attention mechanism and improve the recognition accuracy of named entities.
In a first aspect, an embodiment of the present application provides an attention-mechanism-based named entity recognition method, including: segmenting text to be recognized, and mapping each segmented word of the text to a vector to obtain the word vectors of the text; assigning attention weights to the word vectors of the text, and feeding the weighted word vectors into a named entity recognition model for layer-by-layer computation to obtain the named entity recognition result of the text. The named entity recognition model includes at least two hidden layers; during layer-by-layer computation, the hidden nodes output by one hidden layer are input to the next hidden layer.
In one possible implementation, before assigning attention weights to the word vectors of the text to be recognized, the method further includes: obtaining the attention weights of the word vectors of the text according to the contextual semantics of the text.
In one possible implementation, before assigning attention weights to the word vectors of the text to be recognized and feeding the weighted word vectors into the named entity recognition model for layer-by-layer computation, the method further includes: obtaining training text and segmenting it; annotating the named entities in the segmented training text; mapping each segmented word of the training text to a vector to obtain the word vectors of the training text; and feeding the word vectors of the training text into a named entity recognition model to be trained for layer-by-layer computation, so as to train the model.
In one possible implementation, feeding the word vectors of the training text into the named entity recognition model to be trained for layer-by-layer computation, so as to train the model, includes: after a training pass ends, obtaining the named entity recognition result of the training text output by the model to be trained; comparing the recognition result with the named entities annotated in the training text; adjusting, according to the comparison result, the attention weights assigned to the word vectors in the next training pass; and if the error between the recognition result of the training text and the annotated named entities is less than a predetermined error threshold, obtaining a trained named entity recognition model.
In one possible implementation, annotating the named entities in the segmented training text includes annotating: whether a segmented word of the training text belongs to a named entity, the position of the word within the named entity it belongs to, and/or the type of the named entity it belongs to.
In a second aspect, an embodiment of the present application provides an attention-mechanism-based named entity recognition device, including: a word segmentation module for segmenting text to be recognized; a mapping module for mapping each segmented word obtained by the word segmentation module to a vector to obtain the word vectors of the text; and a recognition module for assigning attention weights to the word vectors obtained by the mapping module and feeding the weighted word vectors into a named entity recognition model for layer-by-layer computation to obtain the named entity recognition result of the text. The named entity recognition model includes at least two hidden layers; during layer-by-layer computation, the hidden nodes output by one hidden layer are input to the next hidden layer.
In one possible implementation, the device further includes: an obtaining module for obtaining, according to the contextual semantics of the text to be recognized, the attention weights of the word vectors of the text before the recognition module assigns attention weights to them.
In one possible implementation, the device further includes: an annotation module and a training module. The word segmentation module is further configured to obtain training text and segment it before the recognition module assigns attention weights to the word vectors of the text to be recognized and feeds the weighted word vectors into the named entity recognition model for layer-by-layer computation. The annotation module is configured to annotate the named entities in the training text segmented by the word segmentation module. The mapping module is further configured to map each segmented word of the training text to a vector to obtain the word vectors of the training text. The training module is configured to feed the word vectors of the training text obtained by the mapping module into a named entity recognition model to be trained for layer-by-layer computation, so as to train the model.
In a third aspect, an embodiment of the present application provides computer equipment, including a memory, a processor, and a computer program stored on the memory and runnable on the processor; when the processor executes the computer program, the method described above is implemented.
In a fourth aspect, an embodiment of the present application provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the method described above.
In the above technical solution, after the text to be recognized is segmented, each segmented word of the text is mapped to a vector to obtain the word vectors of the text; the word vectors are then assigned attention weights, and the weighted word vectors are fed into the named entity recognition model for layer-by-layer computation to obtain the named entity recognition result of the text. The named entity recognition model includes at least two hidden layers; during layer-by-layer computation, the hidden nodes output by one hidden layer are input to the next hidden layer. Because the hidden nodes input to each hidden layer carry attention weights, each hidden layer operates on them according to those weights. Named entities can thus be recognized through the attention mechanism, improving recognition accuracy, and the loss of hidden nodes caused by their length exceeding the hidden layer's length threshold can be avoided.
[Description of drawings]
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from them without creative effort.
Fig. 1 is the flow chart of the name entity recognition method one embodiment of the application based on attention mechanism;
Fig. 2 is the flow chart of name entity recognition method another embodiment of the application based on attention mechanism;
Fig. 3 is the flow chart of name entity recognition method further embodiment of the application based on attention mechanism;
Fig. 4 is the flow chart of name entity recognition method further embodiment of the application based on attention mechanism;
Fig. 5 is the structural schematic diagram of the name entity recognition device one embodiment of the application based on attention mechanism;
Fig. 6 is the structural schematic diagram of name entity recognition device another embodiment of the application based on attention mechanism;
Fig. 7 is the structural schematic diagram of the application computer equipment one embodiment.
[Detailed description of embodiments]
To better understand the technical solution of the present application, the embodiments of the present application are described in detail below with reference to the accompanying drawings.
It should be understood that the described embodiments are only some, not all, of the embodiments of the present application. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present application.
The terms used in the embodiments of the present application are only for the purpose of describing particular embodiments and are not intended to limit the application. The singular forms "a", "an", "the" and "said" used in the embodiments of the present application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise.
Fig. 1 is the flow chart of the name entity recognition method one embodiment of the application based on attention mechanism, such as Fig. 1 institute Show, the above-mentioned name entity recognition method based on attention mechanism may include:
Step 101, text to be identified is segmented, and the participle of text to be identified is mapped as vector, obtained wait know The term vector of other text.
Wherein, text to be identified can be in short, may include word and punctuation mark in the words.To text to be identified This carries out segmenting can be all separating each of text the words to be identified word, punctuation mark.For example, " China Women's volleyball has won group round robin first, and enters finals." to the words participle result may is that "/in/state/female/row/ Win///small/group/match/the/mono-/,/simultaneously/and/into/enter// certainly/match/./ " by the participle of text to be identified be mapped as to Amount can be each word that will be separated in text to be identified, punctuation mark and obtain by searching for participle DUAL PROBLEMS OF VECTOR MAPPING table Corresponding term vector.Here participle DUAL PROBLEMS OF VECTOR MAPPING table can be the participle DUAL PROBLEMS OF VECTOR MAPPING table for being stored in advance or loading.
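As a rough sketch (not part of the patent), the segmentation-and-lookup step of Step 101 can be illustrated in Python. The mapping table's contents and the vector size are invented for illustration; a real system would load a pre-trained table.

```python
# Minimal sketch: character-level segmentation and word-vector lookup
# against a pre-stored mapping table (here randomly initialized).
import numpy as np

EMBED_DIM = 4
rng = np.random.default_rng(0)

# Illustrative pre-stored segmentation-to-vector mapping table.
vector_table = {ch: rng.normal(size=EMBED_DIM)
                for ch in "中国女排赢得了小组赛第一，并进入决赛。"}
UNK = np.zeros(EMBED_DIM)  # fallback for characters missing from the table

def segment(text):
    """Separate the sentence into individual characters and punctuation marks."""
    return list(text)

def to_word_vectors(text):
    """Map each segmented unit to its vector via the mapping table."""
    return np.stack([vector_table.get(tok, UNK) for tok in segment(text)])

vectors = to_word_vectors("中国女排赢得了小组赛第一，并进入了决赛。")
print(vectors.shape)  # one row per segmented character/punctuation mark
```

Note that repeated characters (such as the two occurrences of 了) map to the same vector, since the lookup is purely table-driven.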
Step 102: assign attention weights to the word vectors of the text to be recognized, and feed the weighted word vectors into the named entity recognition model for layer-by-layer computation to obtain the named entity recognition result of the text. The named entity recognition model includes at least two hidden layers; during layer-by-layer computation, the hidden nodes output by one hidden layer are input to the next hidden layer.
Further, before step 102, the method may also include: obtaining the attention weights of the word vectors of the text to be recognized according to the contextual semantics of the text.
When the word vectors of the text to be recognized are input into the named entity recognition model, their attention weights may be identical or different. During the model's layer-by-layer computation on the word vectors, the hidden nodes input to each hidden layer may be assigned different or identical attention weights according to the contextual semantics of the text. This embodiment does not limit this.
In this embodiment, the layer-by-layer computation performed by the named entity recognition model on the input word vectors may use one or a combination of the following algorithms: bidirectional long short-term memory networks (Bi-directional Long Short-Term Memory; hereinafter: Bi-LSTM), conditional random fields (Conditional Random Fields; hereinafter: CRF), and convolutional neural networks (Convolutional Neural Network; hereinafter: CNN).
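The patent lists Bi-LSTM among the algorithms the model may use. Below is a minimal NumPy sketch of a Bi-LSTM forward pass producing one hidden node per input word vector; the dimensions, random initialization and single-layer structure are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(xs, W, U, b, H):
    """Run a single-direction LSTM over a sequence; return all hidden states."""
    h, c, out = np.zeros(H), np.zeros(H), []
    for x in xs:
        z = W @ x + U @ h + b                    # all four gates in one affine map
        i, f = sigmoid(z[:H]), sigmoid(z[H:2*H])
        o, g = sigmoid(z[2*H:3*H]), np.tanh(z[3*H:])
        c = f * c + i * g
        h = o * np.tanh(c)
        out.append(h)
    return np.stack(out)

def bi_lstm(xs, params_fwd, params_bwd, H):
    """Bi-LSTM layer: concatenate forward and backward hidden states per step."""
    fwd = lstm_forward(xs, *params_fwd, H)
    bwd = lstm_forward(xs[::-1], *params_bwd, H)[::-1]
    return np.concatenate([fwd, bwd], axis=1)

rng = np.random.default_rng(1)
T, D, H = 5, 4, 3                                # sequence length, input dim, hidden dim
make = lambda: (rng.normal(size=(4*H, D)), rng.normal(size=(4*H, H)), np.zeros(4*H))
xs = rng.normal(size=(T, D))                     # word vectors of a short sentence
hidden_nodes = bi_lstm(xs, make(), make(), H)
print(hidden_nodes.shape)                        # (T, 2*H): one hidden node per word vector
```

Each output row combines left-to-right and right-to-left context, which is what lets a downstream layer exploit the contextual semantics the patent relies on.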
In the above attention-mechanism-based named entity recognition method, after the text to be recognized is segmented, each segmented word of the text is mapped to a vector to obtain the word vectors of the text; the word vectors are then assigned attention weights, and the weighted word vectors are fed into the named entity recognition model for layer-by-layer computation to obtain the named entity recognition result of the text. The named entity recognition model includes at least two hidden layers; during layer-by-layer computation, the hidden nodes output by one hidden layer are input to the next hidden layer. Because the hidden nodes input to each hidden layer carry attention weights, each hidden layer operates on them according to those weights. Named entities can thus be recognized through the attention mechanism, improving recognition accuracy, and the loss of hidden nodes caused by their length exceeding the hidden layer's length threshold can be avoided.
Fig. 2 is the flow chart of name entity recognition method another embodiment of the application based on attention mechanism, such as Fig. 2 It is shown, in the application embodiment illustrated in fig. 1, there is an initiation layer with Named Entity Extraction Model and hide for initial two layers below For three layers of operation layer of layer, step 102 may include:
Step 201: input the word vectors of the text to be recognized into the initial layer of the named entity recognition model; after computation, the initial layer outputs hidden nodes.
The word vectors of the text to be recognized are spliced into one vector string and input into the named entity recognition model for layer-by-layer computation. The hidden nodes mentioned above are equivalent to feature vectors representing the features of the text. The vector length that a hidden layer of the model can process is the length of the vector string formed by splicing together the hidden nodes input to that hidden layer.
Step 202: assign attention weights to each hidden node output by the initial layer according to the contextual semantics of the text to be recognized.
In this embodiment, before the hidden nodes are input into a hidden layer, they are assigned attention weights according to the contextual semantics of the text to be recognized. The attention weights enable the following: if the length of the hidden nodes input to a hidden layer exceeds the length threshold the layer can process, the layer preferentially operates on the hidden nodes with high attention weights, and the hidden nodes with low attention weights are discarded.
Specifically, attention weights are assigned to the hidden nodes input to each hidden layer according to the contextual semantics of the text to be recognized. For example, take the sentence "高小红在故宫博物院看明代的瓷器" ("Gao Xiaohong sees Ming-dynasty porcelain in the Palace Museum"). The word vectors obtained from this sentence are input into the named entity recognition model, and the hidden nodes output by the initial layer can be: h11, h21, h31, …, hn1. These hidden nodes are input into the first hidden layer; since they are computed from the word vectors of the text, they carry the contextual semantic features of the text. Suppose h11 is obtained from the word vectors of the two characters 高 ("Gao") and 小 ("Xiao"), and h21 from the word vector of 红 ("Hong"). Although none of 高, 小 and 红 is a named entity when split out on its own, the context of these three characters indicates that 高小红 ("Gao Xiaohong") is a named entity; therefore, hidden nodes h11 and h21 can be assigned somewhat higher attention weights.
As another example, the two characters 故 and 宫 are not named entities when split out individually, but according to the contextual semantics their combination 故宫 (the Forbidden City) is a named entity. Hidden node h31 is obtained from the word vector of 故 and h41 from the word vector of 宫; therefore, h31 and h41 can also be assigned somewhat higher attention weights.
Step 203: input the hidden nodes output by the initial layer, now carrying attention weights, into the first hidden layer; after computation, the first hidden layer outputs hidden nodes.
Step 204: assign attention weights to each hidden node output by the first hidden layer according to the contextual semantics of the text to be recognized.
Although the hidden nodes operated on by the first hidden layer are no longer the word vectors of the text to be recognized, the hidden nodes h11, h21, h31, …, hn1 input to the first hidden layer are still feature vectors carrying the contextual semantic information of the text. Similarly, therefore, the attention weight of each hidden node input to each hidden layer can be determined according to the contextual semantics of the text.
During the named entity recognition computation on "高小红在故宫博物院看明代的瓷器", if the length of the hidden nodes output by the initial layer exceeds the length threshold of the first hidden layer, the hidden nodes related to function words such as 在 ("in"), 看 ("sees") and 的 ("of") can be assigned lower attention weights, so that more of the hidden layer's computing resources go to the characters that are more likely to be named entities.
Step 205: input the hidden nodes output by the first hidden layer, now carrying attention weights, into the second hidden layer; after computation, the second hidden layer outputs the recognition result of the text to be recognized.
The above embodiment only illustrates the case where the named entity recognition model has three computation layers; of course, the number of computation layers may also be 2, 4, 5, 6, and so on, and the specific number can be set according to actual needs. In any case, the model recognizes named entities in the text to be recognized similarly to the foregoing embodiment: attention weights are assigned to the hidden nodes about to be input into each hidden layer, and the weighted hidden nodes are then input into the corresponding hidden layer for computation.
Further, assigning attention weights to the hidden nodes input to a hidden layer can be done by judging, according to the contextual semantics, which nodes are more likely to be named entities and assigning higher weights to those input vectors. In other words, in the named entity recognition process, contextual semantics serves as an auxiliary judgment criterion.
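The behavior described in the steps above (weighting hidden nodes by attention and discarding the low-weight nodes when a layer's length threshold is exceeded) can be sketched as follows. The weights, dimensions and threshold here are invented for illustration; a real model would derive the weights from contextual semantics.

```python
import numpy as np

def apply_attention(nodes, weights, length_threshold):
    """Weight hidden nodes by attention and, if there are more nodes than the
    layer can process, keep only the highest-attention ones (discard the rest)."""
    weighted = nodes * weights[:, None]
    if len(weighted) > length_threshold:
        keep = np.argsort(weights)[-length_threshold:]
        keep.sort()                          # preserve the original word order
        weighted = weighted[keep]
    return weighted

rng = np.random.default_rng(2)
nodes = rng.normal(size=(8, 4))              # hidden nodes from the previous layer
# Context-derived attention: entity-like positions (e.g. a person name) get
# higher weight than function words; these values are purely illustrative.
weights = np.array([0.9, 0.8, 0.85, 0.1, 0.2, 0.7, 0.15, 0.1])
kept = apply_attention(nodes, weights, length_threshold=5)
print(kept.shape)                            # only the highest-weight nodes survive
```

This mirrors the patent's point: when truncation is unavoidable, the attention weights decide which hidden nodes are sacrificed, so entity-bearing nodes are preferentially retained.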
Fig. 3 is the flow chart of name entity recognition method further embodiment of the application based on attention mechanism, such as Fig. 3 It is shown, in the application embodiment illustrated in fig. 1, before step 102, can also include:
Step 301, training text is obtained, and training text is segmented.
Step 302, the name entity in the training text after segmenting is labeled.
Specifically, annotating the named entities in the segmented training text can be: annotating whether a segmented word of the training text belongs to a named entity, the position of the word within the named entity it belongs to, and/or the type of the named entity it belongs to.
In specific implementations, the named entities in the training text can be annotated using BIO annotation and/or IOBES annotation.
For example, when the named entity recognition model is a Bi-LSTM model, the training text can be annotated in the IOBES (Inside, Other, Begin, End, Single) manner. If a segmented word is a single-word entity on its own, it is labeled (tag S-…); if it is the beginning of an entity, (tag B-…); if it is a word inside an entity, (tag I-…); if it is the end of an entity, (tag E-…); and if it is not part of an entity, (tag O). Taking person names (PER), place names (LOC) and organization names (ORG) as an example, the sentence "王明出生于北京，现在河北省唐山市创利工作。" ("Wang Ming was born in Beijing and now works at Chuangli in Tangshan City, Hebei Province.") is annotated as: 王 (B-PER), 明 (E-PER), 出 (O), 生 (O), 于 (O), 北 (B-LOC), 京 (E-LOC), ， (O), 现 (O), 在 (O), 河 (B-LOC), 北 (I-LOC), 省 (E-LOC), 唐 (B-LOC), 山 (I-LOC), 市 (E-LOC), 创 (B-ORG), 利 (E-ORG), 工 (O), 作 (O), 。 (O).
As another example, when the named entity recognition model is a Bi-LSTM+CRF model, the training text can be annotated in the BIO manner: B-PER and I-PER mark the first and non-first characters of a person name, B-LOC and I-LOC the first and non-first characters of a place name, B-ORG and I-ORG the first and non-first characters of an organization name, and O marks a character that is not part of any named entity. The annotation of "高小明帮助中国队获胜" ("Gao Xiaoming helped the Chinese team win") is: 高 (B-PER), 小 (I-PER), 明 (I-PER), 帮 (O), 助 (O), 中 (B-ORG), 国 (I-ORG), 队 (I-ORG), 获 (O), 胜 (O).
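The BIO scheme described above can be sketched as a small tagging helper. The entity span positions follow the patent's own example sentence; the function itself is an illustrative sketch, not the patent's annotation tool.

```python
def bio_tags(tokens, entities):
    """Produce BIO tags from (start, end, type) entity spans.
    B-<type> marks the first token of an entity, I-<type> the rest, O non-entities."""
    tags = ["O"] * len(tokens)
    for start, end, etype in entities:       # end index is exclusive
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"
    return tags

# The patent's example sentence, with a person name and an organization name.
tokens = list("高小明帮助中国队获胜")
entities = [(0, 3, "PER"), (5, 8, "ORG")]    # 高小明 = PER, 中国队 = ORG
print(list(zip(tokens, bio_tags(tokens, entities))))
```

Running this reproduces exactly the annotation listed in the paragraph above, character by character.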
Step 303: map each segmented word of the training text to a vector to obtain the word vectors of the training text.
Each separated word or character of the training text is mapped to its corresponding word vector by looking it up in the word-vector mapping table; the mapping table here is stored or loaded in advance.
Step 304: feed the word vectors of the training text into the named entity recognition model to be trained for layer-by-layer computation, so as to train the model.
Specifically, the implementation of step 304 can be the same as the above recognition process of the named entity recognition model on text to be recognized, except that the model to be trained here has not yet been trained; there may therefore be an error between the named entity recognition result it outputs for the training text and the named entities annotated in step 302.
In this embodiment, the layer-by-layer computation of the named entity recognition model to be trained may use one or a combination of the following algorithms: Bi-LSTM, CRF and CNN. Training the model means training the parameters of its layer-by-layer computation and the attention weights assigned to the hidden nodes of each hidden layer.
Fig. 4 is the flow chart of name entity recognition method further embodiment of the application based on attention mechanism, such as Fig. 4 It is shown, in the application embodiment illustrated in fig. 3, after step 304, can also include:
Step 401, after this training process terminates, the training text of name physical model output to be trained is obtained Name Entity recognition result.
Step 402, the name entity marked in the name Entity recognition result of training text and training text is carried out pair Than.
Specifically, alignments can be, according to the word of the name Entity recognition result of training text and training text to Amount, the loss function of the name Entity recognition result precision of construction reflection training text.The loss function of construction can be life The difference of two squares of the term vector of name Entity recognition result and training text.
Step 403: according to the comparison result, adjust the attention weights to be assigned to the term vectors in the next round of training.
Specifically, a gradient descent algorithm may be used to solve for the minimum of the loss function. Gradient descent uses the negative gradient direction to determine how the parameters of the loss function are adjusted at each iteration; it therefore yields the adjustment direction both for the parameters of the layer-by-layer operation applied to the term vectors of the training text and for the attention weights assigned to the hidden nodes of each hidden layer. A steadily decreasing loss means that those parameters and attention weights are becoming increasingly accurate.
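The negative-gradient update described above can be sketched minimally as follows; the learning rate, the flat list representation of the weights and the toy gradient values are illustrative assumptions rather than part of the disclosure:

```python
def gradient_descent_step(weights, grads, lr=0.1):
    """Move each attention weight (or model parameter) a small step
    along the negative gradient direction of the loss function."""
    return [w - lr * g for w, g in zip(weights, grads)]

weights = [0.5, 0.5]
grads = [0.2, -0.2]  # gradient of the loss w.r.t. each attention weight
weights = gradient_descent_step(weights, grads)
```

Repeating this step drives the loss toward its minimum, which is what makes the layer-by-layer parameters and the hidden-node attention weights increasingly accurate.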
Step 404: if the error between the named entity recognition result of the training text and the named entities labeled in the training text is less than a predetermined error threshold, a trained named entity recognition model is obtained.
The predetermined error threshold may be set during implementation according to system performance and/or implementation requirements; the present embodiment places no limit on its size.
Fig. 5 is a structural schematic diagram of one embodiment of the named entity recognition device based on the attention mechanism of the present application. The device provided by this embodiment can implement the named entity recognition method based on the attention mechanism provided by the present application. As shown in Fig. 5, the device may include: a word segmentation module 51, a mapping module 52 and an identification module 53.
The word segmentation module 51 is configured to segment the text to be identified. The text to be identified may be a sentence containing words and punctuation marks, and segmentation separates every character and punctuation mark in that sentence. For example, the sentence "中国女排赢得了小组赛第一，并且进入了决赛。" ("The Chinese women's volleyball team won first place in the group stage and entered the finals.") is segmented as: "中/国/女/排/赢/得/了/小/组/赛/第/一/，/并/且/进/入/了/决/赛/。".
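A minimal sketch of this character-level segmentation, assuming (as in the example above) that every character and punctuation mark becomes its own segment:

```python
def segment(text):
    """Separate every character and punctuation mark of the text
    to be identified, as the word segmentation module 51 does."""
    return list(text)

tokens = segment("中国女排")  # -> ['中', '国', '女', '排']
print("/".join(tokens))       # prints 中/国/女/排
```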
The mapping module 52 maps the segments of the text to be identified obtained by the word segmentation module 51 into vectors, yielding the term vectors of the text to be identified. Specifically, the mapping module 52 may look up each separated character and punctuation mark of the text in a segment-to-vector mapping table to obtain the corresponding term vector. The mapping table here may be stored in advance or loaded.
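The table lookup performed by the mapping module 52 can be sketched as a plain dictionary; the table contents, the vector dimension and the unknown-token fallback here are illustrative assumptions:

```python
# Toy segment-to-vector mapping table; in practice the table is
# stored in advance or loaded, as described above.
mapping_table = {
    "中": [0.1, 0.2],
    "国": [0.3, 0.4],
}
UNKNOWN = [0.0, 0.0]  # assumed fallback for segments absent from the table

def to_term_vectors(tokens):
    """Look up each separated character or punctuation mark and
    return the corresponding term vectors."""
    return [mapping_table.get(tok, UNKNOWN) for tok in tokens]

vectors = to_term_vectors(["中", "国", "！"])  # last token falls back to UNKNOWN
```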
The identification module 53 assigns attention weights to the term vectors of the text to be identified obtained by the mapping module 52, inputs the weighted term vectors into the named entity recognition model for layer-by-layer operation, and obtains the named entity recognition result of the text to be identified. The named entity recognition model includes at least two hidden layers; during the layer-by-layer operation performed by the model, the hidden nodes output by one hidden layer are input into the next hidden layer.
In the present embodiment, the layer-by-layer operation that the named entity recognition model performs on the input term vectors may use one of the following algorithms, or a combination thereof: Bi-LSTM, CRF and CNN.
In the named entity recognition device described above, after the word segmentation module 51 segments the text to be identified, the mapping module 52 maps the segments into vectors to obtain the term vectors of the text, and the identification module 53 then assigns attention weights to those term vectors and inputs them into the named entity recognition model for layer-by-layer operation, obtaining the named entity recognition result of the text to be identified. The named entity recognition model includes at least two hidden layers, and during the layer-by-layer operation the hidden nodes output by one hidden layer are input into the next hidden layer. Because the hidden nodes input to each hidden layer carry attention weights, and each hidden layer operates on its hidden nodes according to those weights, named entities can be identified through the attention mechanism. This improves the recognition accuracy of named entities and avoids the loss of hidden nodes that occurs when the length of a hidden layer's nodes exceeds the layer's length threshold.
Fig. 6 is a structural schematic diagram of another embodiment of the named entity recognition device based on the attention mechanism of the present application. Compared with the device shown in Fig. 5, the device shown in Fig. 6 may further include an obtaining module 54.
The obtaining module 54 is configured to obtain, before the identification module 53 assigns attention weights to the term vectors of the text to be identified, the attention weight of each term vector according to the context semantics of the text to be identified.
Specifically, when the term vectors of the text to be identified are input into the named entity recognition model, their attention weights may be identical or different. Likewise, during the model's layer-by-layer operation on the term vectors, the hidden nodes input to each hidden layer may be assigned identical or different attention weights according to the context semantics of the text to be identified. The present embodiment places no limit on this.
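One possible way to derive identical-or-different attention weights from the context is a softmax over each term vector's similarity to the sentence's mean vector; this particular formula is an assumption of the sketch, since the disclosure does not fix one:

```python
import math

def attention_weights(term_vectors):
    """Assign each term vector a weight from a softmax over its
    dot product with the mean (context) vector of the sentence."""
    dim = len(term_vectors[0])
    context = [sum(v[i] for v in term_vectors) / len(term_vectors)
               for i in range(dim)]
    scores = [sum(a * b for a, b in zip(v, context)) for v in term_vectors]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

w = attention_weights([[1.0, 0.0], [0.0, 1.0]])
```

The weights sum to 1; a term vector more similar to the sentence context receives a larger weight, and symmetric inputs receive equal weights.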
Further, the named entity recognition device based on the attention mechanism may also include a labeling module 55 and a training module 56.
The word segmentation module 51 is further configured to obtain a training text and segment it before the identification module 53 assigns attention weights to the term vectors of the text to be identified and inputs the weighted term vectors into the named entity recognition model for layer-by-layer operation.
The labeling module 55 is configured to label the named entities in the training text segmented by the word segmentation module 51. In the present embodiment, the labeling module 55 specifically labels whether each segment of the training text belongs to a named entity, the position of the segment within the named entity it belongs to, and/or the type of the named entity to which the segment belongs.
In a specific implementation, the labeling module 55 may label the named entities in the training text using BIO labeling and/or IOBES labeling.
For example, when the named entity recognition model is a Bi-LSTM model, the training text can be labeled in the IOBES (Inside, Other, Begin, End, Single) manner: a segment that constitutes an entity by itself is labeled S-…; a segment that begins an entity is labeled B-…; a segment in the middle of an entity is labeled I-…; a segment that ends an entity is labeled E-…; and a segment that is not an entity is labeled O. Taking person names (PER), place names (LOC) and organization names (ORG) as examples, the sentence "王明出生在北京，现在在河北省唐山市创利工作。" ("Wang Ming was born in Beijing and now works at Chuangli in Tangshan City, Hebei Province.") is labeled: 王 (B-PER), 明 (E-PER), 出 (O), 生 (O), 在 (O), 北 (B-LOC), 京 (E-LOC), ，(O), 现 (O), 在 (O), 在 (O), 河 (B-LOC), 北 (I-LOC), 省 (E-LOC), 唐 (B-LOC), 山 (I-LOC), 市 (E-LOC), 创 (B-ORG), 利 (E-ORG), 工 (O), 作 (O), 。(O).
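The IOBES labeling rules above can be sketched as follows, assuming (as an illustration, not part of the disclosure) that entities are given as (start, end-exclusive, type) spans over the character segments:

```python
def iobes_tags(tokens, spans):
    """spans: list of (start, end_exclusive, type) entity spans.
    Tokens outside any entity get 'O'; a single-token entity gets
    'S-'; otherwise the entity is tagged 'B-' ... 'I-' ... 'E-'."""
    tags = ["O"] * len(tokens)
    for start, end, etype in spans:
        if end - start == 1:
            tags[start] = "S-" + etype
        else:
            tags[start] = "B-" + etype
            for i in range(start + 1, end - 1):
                tags[i] = "I-" + etype
            tags[end - 1] = "E-" + etype
    return tags

tags = iobes_tags(list("王明出生在北京"), [(0, 2, "PER"), (5, 7, "LOC")])
```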
As another example, when the named entity recognition model is a Bi-LSTM+CRF model, the training text can be labeled in the BIO manner: B-PER marks the first character of a person name and I-PER its non-first characters; B-LOC and I-LOC mark the first and non-first characters of a place name; B-ORG and I-ORG mark the first and non-first characters of an organization name; and O marks a character that is not part of any named entity. The sentence "高小明帮助中国队获胜" ("Gao Xiaoming helped the China team win") is labeled: 高 (B-PER), 小 (I-PER), 明 (I-PER), 帮 (O), 助 (O), 中 (B-ORG), 国 (I-ORG), 队 (I-ORG), 获 (O), 胜 (O).
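Conversely, recovering entities from BIO tags can be sketched as follows; the decoding function and its (type, text) output representation are illustrative assumptions:

```python
def decode_bio(tokens, tags):
    """Recover entity spans from BIO tags: a 'B-' tag opens an
    entity, and following 'I-' tags of the same type extend it."""
    entities, current = [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append(current)
            current = [tag[2:], tok]
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1] += tok
        else:
            if current:
                entities.append(current)
            current = None
    if current:
        entities.append(current)
    return [(etype, text) for etype, text in entities]

ents = decode_bio(list("高小明帮助中国队获胜"),
                  ["B-PER", "I-PER", "I-PER", "O", "O",
                   "B-ORG", "I-ORG", "I-ORG", "O", "O"])
```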
The mapping module 52 is further configured to map the segments of the training text into vectors, obtaining the term vectors of the training text. The mapping module 52 may look up each separated character of the training text in the segment-to-vector mapping table to obtain the corresponding term vector; the mapping table here is stored in advance or loaded.
The training module 56 is configured to input the term vectors of the training text obtained by the mapping module 52 into the named entity recognition model to be trained for layer-by-layer operation, so as to train the named entity recognition model to be trained.
Specifically, after a round of training of the named entity recognition model to be trained ends, the training module 56 may also: obtain the named entity recognition result that the model outputs for the training text; compare that result with the named entities labeled in the training text; and, according to the comparison result, adjust the attention weights to be assigned to the term vectors in the next round of training. If the error between the named entity recognition result of the training text and the labeled named entities is less than a predetermined error threshold, a trained named entity recognition model is obtained. The predetermined error threshold may be set during implementation according to system performance and/or implementation requirements; the present embodiment places no limit on its size.
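The train-compare-adjust loop with a predetermined error threshold can be sketched as follows; the stand-in step and error functions, the epoch cap and the threshold value are illustrative assumptions, since the disclosure leaves the threshold implementation-defined:

```python
def train(model_step, error_of, max_epochs=100, error_threshold=0.01):
    """Repeat: run one training pass, compare predictions with the
    labels, adjust weights; stop once the error between recognition
    result and labels falls below the predetermined threshold."""
    for epoch in range(max_epochs):
        model_step()                 # one pass of layer-by-layer training
        if error_of() < error_threshold:
            return epoch             # trained model obtained
    return max_epochs

# Toy stand-in for the model: the "error" halves on every pass.
state = {"error": 1.0}
def step(): state["error"] *= 0.5
def err(): return state["error"]
epochs_used = train(step, err)
```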
Fig. 7 is a structural schematic diagram of one embodiment of the computer equipment of the present application. The computer equipment may include a memory, a processor, and a computer program stored on the memory and runnable on the processor; when the processor executes the computer program, the named entity recognition method based on the attention mechanism provided by the embodiments of the present application can be implemented.
The computer equipment may be a server, such as a cloud server; alternatively, it may be an electronic device, such as a smart phone, smart watch, tablet computer or other smart device. The present embodiment places no limit on the specific form of the computer equipment.
Fig. 7 shows a block diagram of an exemplary computer device 12 suitable for implementing embodiments of the present application. The computer device 12 shown in Fig. 7 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Fig. 7, the computer device 12 takes the form of a general-purpose computing device. The components of the computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several classes of bus structures, including a memory bus or memory controller, a peripheral bus, a graphics acceleration port, a processor, or a local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 12 typically comprises a variety of computer-system-readable media. These media may be any usable media that can be accessed by the computer device 12, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 may include computer-system-readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 34 can be used to read and write non-removable, non-volatile magnetic media (not shown in Fig. 7; commonly referred to as a "hard disk drive"). Although not shown in Fig. 7, a magnetic disk drive for reading and writing removable non-volatile magnetic disks (such as "floppy disks"), and an optical disc drive for reading and writing removable non-volatile optical discs (such as Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disc Read-Only Memory (DVD-ROM) or other optical media) can be provided. In these cases, each drive can be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present application.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present application.
The computer device 12 can also communicate with one or more external devices 14 (such as a keyboard, pointing device or display 24), with one or more devices that enable a user to interact with the computer device 12, and/or with any device (such as a network card or modem) that enables the computer device 12 to communicate with one or more other computing devices. This communication can be carried out through an input/output (I/O) interface 22. Moreover, the computer device 12 can also communicate with one or more networks (such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, for example the Internet) through a network adapter 20. As shown in Fig. 7, the network adapter 20 communicates with the other modules of the computer device 12 through the bus 18. It should be understood that, although not shown in Fig. 7, other hardware and/or software modules can be used in conjunction with the computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example implementing the named entity recognition method based on the attention mechanism provided by the embodiments of the present application.
The embodiments of the present application also provide a non-transitory computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the named entity recognition method based on the attention mechanism provided by the embodiments of the present application can be implemented.
The non-transitory computer-readable storage medium may adopt any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM) or flash memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in connection with, an instruction execution system, apparatus or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal can take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and can send, propagate or transmit a program for use by, or in connection with, an instruction execution system, apparatus or device.
The program code contained on a computer-readable medium can be transmitted by any suitable medium, including but not limited to wireless, electric wire, optical cable, RF, or any suitable combination of the above.
Computer program code for performing the operations of the present application can be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code can execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer can be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in conjunction with that embodiment or example is contained in at least one embodiment or example of the present application. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described can be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with each other, those skilled in the art can combine the features of the different embodiments or examples described in this specification.
In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, for example two or three, unless otherwise specifically defined.
Any process or method described in a flow chart or otherwise described herein can be understood as representing a module, segment or portion of executable instruction code comprising one or more steps for realizing a custom logic function or process, and the scope of the preferred embodiments of the present application includes other realizations in which functions may be executed out of the order shown or discussed, including substantially simultaneously or in the reverse order according to the functions involved; this should be understood by those of ordinary skill in the art to which the embodiments of the present application belong.
Depending on the context, the word "if" as used herein can be construed as "when", "once", "in response to determining" or "in response to detecting". Similarly, depending on the context, the phrase "if it is determined" or "if (a stated condition or event) is detected" can be construed as "when it is determined", "in response to determining", "when (the stated condition or event) is detected" or "in response to detecting (the stated condition or event)".
It should be noted that the terminals involved in the embodiments of the present application can include, but are not limited to, a Personal Computer (PC), a Personal Digital Assistant (PDA), a wireless handheld device, a tablet computer, a mobile phone, an MP3 player, an MP4 player, and so on.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices and methods can be realized in other ways. For example, the device embodiments described above are merely exemplary: the division of the units is only a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed can be an indirect coupling or communication connection through some interfaces, devices or units, and can be electrical, mechanical or in other forms.
In addition, the functional units in the embodiments of the present application can be integrated in one processing unit, or each unit can exist physically alone, or two or more units can be integrated in one unit. The integrated unit can be realized in the form of hardware, or in the form of hardware plus a software functional unit.
The integrated unit realized in the form of a software functional unit can be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute some of the steps of the methods of the embodiments of the present application. The aforementioned storage media include various media that can store program code, such as a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disc.
The above are only the preferred embodiments of the present application and are not intended to limit the present application. Any modification, equivalent substitution or improvement made within the spirit and principles of the present application shall be included within the protection scope of the present application.

Claims (10)

1. A named entity recognition method based on an attention mechanism, characterized by comprising:
segmenting a text to be identified, and mapping the segments of the text to be identified into vectors to obtain term vectors of the text to be identified;
assigning attention weights to the term vectors of the text to be identified, and inputting the weighted term vectors into a named entity recognition model for layer-by-layer operation to obtain a named entity recognition result of the text to be identified; wherein the named entity recognition model comprises at least two hidden layers, and during the layer-by-layer operation performed by the named entity recognition model, the hidden nodes output by one hidden layer are input into the next hidden layer.
2. The method according to claim 1, characterized in that, before assigning attention weights to the term vectors of the text to be identified, the method further comprises:
obtaining the attention weights of the term vectors of the text to be identified according to the context semantics of the text to be identified.
3. The method according to claim 1, characterized in that, before assigning attention weights to the term vectors of the text to be identified and inputting the weighted term vectors into the named entity recognition model for layer-by-layer operation, the method further comprises:
obtaining a training text and segmenting the training text;
labeling the named entities in the segmented training text;
mapping the segments of the training text into vectors to obtain term vectors of the training text;
inputting the term vectors of the training text into a named entity recognition model to be trained for layer-by-layer operation, so as to train the named entity recognition model to be trained.
4. The method according to claim 3, characterized in that, after inputting the term vectors of the training text into the named entity recognition model to be trained for layer-by-layer operation so as to train the named entity recognition model to be trained, the method further comprises:
after a round of training ends, obtaining the named entity recognition result that the named entity recognition model to be trained outputs for the training text;
comparing the named entity recognition result of the training text with the named entities labeled in the training text;
according to the comparison result, adjusting the attention weights to be assigned to the term vectors in the next round of training;
if the error between the named entity recognition result of the training text and the named entities labeled in the training text is less than a predetermined error threshold, obtaining a trained named entity recognition model.
5. The method according to claim 3, characterized in that labeling the named entities in the segmented training text comprises:
labeling whether a segment of the training text belongs to a named entity, the position of the segment within the named entity it belongs to, and/or the type of the named entity to which the segment belongs.
6. A named entity recognition device based on an attention mechanism, characterized by comprising:
a word segmentation module, configured to segment a text to be identified;
a mapping module, configured to map the segments of the text to be identified obtained by the word segmentation module into vectors to obtain term vectors of the text to be identified;
an identification module, configured to assign attention weights to the term vectors of the text to be identified obtained by the mapping module, and to input the weighted term vectors into a named entity recognition model for layer-by-layer operation to obtain a named entity recognition result of the text to be identified; wherein the named entity recognition model comprises at least two hidden layers, and during the layer-by-layer operation performed by the named entity recognition model, the hidden nodes output by one hidden layer are input into the next hidden layer.
7. The device according to claim 6, characterized by further comprising:
an obtaining module, configured to obtain, before the identification module assigns attention weights to the term vectors of the text to be identified, the attention weights of the term vectors of the text to be identified according to the context semantics of the text to be identified.
8. The device according to claim 6, characterized by further comprising a labeling module and a training module; wherein:
the word segmentation module is further configured to obtain a training text and segment it before the identification module assigns attention weights to the term vectors of the text to be identified and inputs the weighted term vectors into the named entity recognition model for layer-by-layer operation;
the labeling module is configured to label the named entities in the training text segmented by the word segmentation module;
the mapping module is further configured to map the segments of the training text into vectors to obtain term vectors of the training text;
the training module is configured to input the term vectors of the training text obtained by the mapping module into a named entity recognition model to be trained for layer-by-layer operation, so as to train the named entity recognition model to be trained.
9. A computer device, characterized by comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein when the processor executes the computer program, the method according to any one of claims 1-5 is realized.
10. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that when the computer program is executed by a processor, the method according to any one of claims 1-5 is realized.
CN201910012152.6A 2019-01-07 2019-01-07 Name entity recognition method, device and computer equipment based on attention mechanism Withdrawn CN109885825A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910012152.6A CN109885825A (en) 2019-01-07 2019-01-07 Name entity recognition method, device and computer equipment based on attention mechanism
PCT/CN2019/091305 WO2020143163A1 (en) 2019-01-07 2019-06-14 Named entity recognition method and apparatus based on attention mechanism, and computer device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910012152.6A CN109885825A (en) 2019-01-07 2019-01-07 Name entity recognition method, device and computer equipment based on attention mechanism

Publications (1)

Publication Number Publication Date
CN109885825A true CN109885825A (en) 2019-06-14

Family

ID=66925613

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910012152.6A Withdrawn CN109885825A (en) 2019-01-07 2019-01-07 Name entity recognition method, device and computer equipment based on attention mechanism

Country Status (2)

Country Link
CN (1) CN109885825A (en)
WO (1) WO2020143163A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110298043A (en) * 2019-07-03 2019-10-01 吉林大学 A kind of vehicle name entity recognition method and system
CN110750992A (en) * 2019-10-09 2020-02-04 吉林大学 Named entity recognition method, device, electronic equipment and medium
CN110825875A (en) * 2019-11-01 2020-02-21 科大讯飞股份有限公司 Text entity type identification method and device, electronic equipment and storage medium
CN111145914A (en) * 2019-12-30 2020-05-12 四川大学华西医院 Method and device for determining lung cancer clinical disease library text entity
CN111325033A (en) * 2020-03-20 2020-06-23 中国建设银行股份有限公司 Entity identification method, entity identification device, electronic equipment and computer readable storage medium
CN111597816A (en) * 2020-05-22 2020-08-28 北京慧闻科技(集团)有限公司 Self-attention named entity recognition method, device, equipment and storage medium
WO2021043085A1 (en) * 2019-09-04 2021-03-11 平安科技(深圳)有限公司 Method and apparatus for recognizing named entity, computer device, and storage medium
CN112699684A (en) * 2020-12-30 2021-04-23 北京明朝万达科技股份有限公司 Named entity recognition method and device, computer readable storage medium and processor
CN112733540A (en) * 2020-12-31 2021-04-30 三维通信股份有限公司 Biomedical named entity detection method, biomedical named entity detection device, biomedical named entity detection computer equipment and biomedical named entity detection medium
CN112749561A (en) * 2020-04-17 2021-05-04 腾讯科技(深圳)有限公司 Entity identification method and device
CN113743121A (en) * 2021-09-08 2021-12-03 平安科技(深圳)有限公司 Long text entity relation extraction method and device, computer equipment and storage medium
CN113987173A (en) * 2021-10-22 2022-01-28 北京明略软件***有限公司 Short text classification method, system, electronic device and medium

Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
CN112528662A (en) * 2020-12-15 2021-03-19 深圳壹账通智能科技有限公司 Entity category identification method, device, equipment and storage medium based on meta-learning

Citations (2)

Publication number Priority date Publication date Assignee Title
CN106682220A (en) * 2017-01-04 2017-05-17 华南理工大学 Online traditional Chinese medicine text named entity identifying method based on deep learning
CN108536679A (en) * 2018-04-13 2018-09-14 腾讯科技(成都)有限公司 Name entity recognition method, device, equipment and computer readable storage medium

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN108388559B (en) * 2018-02-26 2021-11-19 中译语通科技股份有限公司 Named entity identification method and system under geographic space application and computer program

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN106682220A (en) * 2017-01-04 2017-05-17 华南理工大学 Online traditional Chinese medicine text named entity identifying method based on deep learning
CN108536679A (en) * 2018-04-13 2018-09-14 腾讯科技(成都)有限公司 Name entity recognition method, device, equipment and computer readable storage medium

Cited By (16)

Publication number Priority date Publication date Assignee Title
CN110298043A (en) * 2019-07-03 2019-10-01 吉林大学 A kind of vehicle name entity recognition method and system
WO2021043085A1 (en) * 2019-09-04 2021-03-11 平安科技(深圳)有限公司 Method and apparatus for recognizing named entity, computer device, and storage medium
CN110750992A (en) * 2019-10-09 2020-02-04 吉林大学 Named entity recognition method, device, electronic equipment and medium
CN110825875B (en) * 2019-11-01 2022-12-06 科大讯飞股份有限公司 Text entity type identification method and device, electronic equipment and storage medium
CN110825875A (en) * 2019-11-01 2020-02-21 科大讯飞股份有限公司 Text entity type identification method and device, electronic equipment and storage medium
CN111145914A (en) * 2019-12-30 2020-05-12 四川大学华西医院 Method and device for determining lung cancer clinical disease library text entity
CN111145914B (en) * 2019-12-30 2023-08-04 四川大学华西医院 Method and device for determining text entity of lung cancer clinical disease seed bank
CN111325033A (en) * 2020-03-20 2020-06-23 中国建设银行股份有限公司 Entity identification method, entity identification device, electronic equipment and computer readable storage medium
CN112749561A (en) * 2020-04-17 2021-05-04 腾讯科技(深圳)有限公司 Entity identification method and device
CN112749561B (en) * 2020-04-17 2023-11-03 腾讯科技(深圳)有限公司 Entity identification method and equipment
CN111597816A (en) * 2020-05-22 2020-08-28 北京慧闻科技(集团)有限公司 Self-attention named entity recognition method, device, equipment and storage medium
CN112699684A (en) * 2020-12-30 2021-04-23 北京明朝万达科技股份有限公司 Named entity recognition method and device, computer readable storage medium and processor
CN112733540A (en) * 2020-12-31 2021-04-30 三维通信股份有限公司 Biomedical named entity detection method, biomedical named entity detection device, biomedical named entity detection computer equipment and biomedical named entity detection medium
CN113743121A (en) * 2021-09-08 2021-12-03 平安科技(深圳)有限公司 Long text entity relation extraction method and device, computer equipment and storage medium
CN113743121B (en) * 2021-09-08 2023-11-21 平安科技(深圳)有限公司 Long text entity relation extraction method, device, computer equipment and storage medium
CN113987173A (en) * 2021-10-22 2022-01-28 北京明略软件***有限公司 Short text classification method, system, electronic device and medium

Also Published As

Publication number Publication date
WO2020143163A1 (en) 2020-07-16

Similar Documents

Publication Publication Date Title
CN109885825A (en) Name entity recognition method, device and computer equipment based on attention mechanism
CN109196582B (en) System and method for predicting pronunciation using word accent
CN109165384A (en) Named entity recognition method and device
CN110245348A (en) Intent recognition method and system
CN108170749A (en) Dialogue method, device and computer-readable medium based on artificial intelligence
CN109036391A (en) Audio recognition method, apparatus and system
CN107220235A (en) Speech recognition error correction method, device and storage medium based on artificial intelligence
CN106373564A (en) Individualized hotword detection models
CN108877782A (en) Audio recognition method and device
CN107545029A (en) Voice feedback method and device for smart devices, and computer-readable storage medium
CN110276023A (en) POI change event discovery method, apparatus, computing device and medium
CN107112005A (en) Deep neural support vector machines
CN115362497A (en) Sequence-to-sequence speech recognition with delay threshold
CN110377905A (en) Sentence semantic representation processing method and device, computer equipment and readable medium
CN112749547A (en) Generation of text classifier training data
CN112241715A (en) Model training method, expression recognition method, device, equipment and storage medium
CN105845133A (en) Voice signal processing method and apparatus
CN106943747A (en) Virtual character name recommendation method, device, electronic equipment and storage medium
CN110399488A (en) File classification method and device
CN113407698B (en) Method and device for training and recognizing intention of intention recognition model
KR102403330B1 (en) Technique for generating and utilizing virtual fingerprint representing text data
CN113053367A (en) Speech recognition method, model training method and device for speech recognition
CN109359198A (en) Text classification method and device
CN108268602A (en) Method, apparatus, device and computer storage medium for analyzing text topic points
CN107704549A (en) Voice search method, device and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20190614