CN111666427B - Entity relationship joint extraction method, device, equipment and medium - Google Patents

Entity relationship joint extraction method, device, equipment and medium

Info

Publication number
CN111666427B
CN111666427B (application CN202010538132.5A)
Authority
CN
China
Prior art keywords
entity
training
layer
relationship
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010538132.5A
Other languages
Chinese (zh)
Other versions
CN111666427A (en)
Inventor
曾道建
谢依玲
赵超
田剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha University of Science and Technology
Original Assignee
Changsha University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha University of Science and Technology
Priority to CN202010538132.5A (patent CN111666427B)
Publication of application CN111666427A
Application granted
Publication of granted patent CN111666427B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses an entity relationship joint extraction method, device, equipment and medium, comprising the following steps: acquiring training sample data; training a pre-built entity relation extraction model by using the training sample data to obtain a trained model, wherein the entity relation extraction model comprises a self-attention layer used, during training, to perform an attention calculation over the influence of other triples in a sentence on the currently predicted relationship; and, when a target text to be subjected to entity relation extraction is obtained, outputting a corresponding entity relation extraction result by using the trained model. In this way, training an entity relation extraction model comprising a self-attention layer allows the influence of other triples on the currently predicted relationship to be considered during entity relation extraction, thereby improving the accuracy of entity relation extraction.

Description

Entity relationship joint extraction method, device, equipment and medium
Technical Field
The present invention relates to the field of natural language processing, and in particular, to a method, an apparatus, a device, and a medium for entity relationship joint extraction.
Background
Entity relation extraction, as a key technology of information extraction, has important theoretical significance and broad application prospects. At the theoretical level, entity relation extraction involves the theories and methods of multiple disciplines such as machine learning, data mining, and natural language processing. At the application level, entity relation extraction can be used to automatically build large-scale knowledge bases, and it also provides data support for information retrieval and the construction of automated question-answering systems. Entity relation extraction is likewise of significant research value in discourse understanding, machine translation, and other areas. A variety of methods have been proposed for relation extraction.
The current mainstream approach to entity relation extraction is the pipeline (serial) method, which splits the extraction of entities and relations into two subtasks: an entity recognition model first extracts the entities, and a classifier then determines the relations between entity pairs. However, because the serial method is split into two stages, errors in entity recognition propagate into relation extraction, causing error accumulation; at the same time, the subtasks are handled independently, neglecting the correlation between the two tasks. In fact, entity recognition affects relation classification, and relation classification also affects entity recognition: if two words bear some relation, the types of the two entities can be predicted from the type of relation between them. The two tasks are interdependent. On this basis, joint extraction methods have been proposed, which merge the two tasks into one through a joint model and treat relation extraction as the process of extracting entity-relation triples from unstructured text. Existing joint extraction methods eliminate the mutual independence of the two subtasks in the serial method, but their relation extraction is still not accurate enough.
Disclosure of Invention
In view of this, an object of the present application is to provide a method, apparatus, device and medium for entity relationship joint extraction, which can consider the influence of other triples on the current predicted relationship in the extraction process of the entity relationship, so as to improve the accuracy of entity relationship extraction. The specific scheme is as follows:
in a first aspect, the application discloses a method for entity relationship joint extraction, including:
acquiring training sample data;
training a pre-built entity relation extraction model by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process;
and when the target text to be subjected to entity relation extraction is obtained, outputting a corresponding entity relation extraction result by using the trained model.
Optionally, the entity relation extraction model further includes a BERT layer, a NER layer, and a table filling layer;
correspondingly, the training the pre-built entity relation extraction model by using the training sample data comprises the following steps:
inputting the training sample data to the BERT layer, dividing sentences through the BERT layer, and mapping each divided word into a corresponding word vector to obtain the context representation of the sentences;
performing a linear CRF calculation on the word vector through the NER layer to obtain a corresponding predicted entity tag sequence, and converting the predicted entity tag sequence into a corresponding tag embedding sequence;
splicing the word vector and the tag embedding sequence to obtain a target vector;
predicting entity relationship of the target vector through the table filling layer to obtain a corresponding predicted entity relationship;
inputting the predicted entity relationship into the self-attention layer for attention calculation to obtain a corresponding entity relationship after attention calculation;
and performing inner product operation on the entity relationship after the attention calculation and the predefined relationship vector, and then classifying through a multi-label classifier to obtain the entity relationship corresponding to each word.
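As an illustrative, non-limiting sketch of how the above steps could be wired together (assuming PyTorch and the publicly available HuggingFace transformers library, neither of which is mandated by the present application; the checkpoint name and layer sizes are placeholders, and greedy tagging stands in for the linear CRF decoding):

```python
# Sketch only: BERT -> tagging -> tag embedding -> splice -> table filling
# -> self-attention over relation vectors -> sigmoid multi-label scores.
import torch
import torch.nn as nn
from transformers import BertModel

class JointExtractor(nn.Module):
    def __init__(self, n_tags, n_rels, d_tag=32, d_bert=768):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.tag_scorer = nn.Linear(d_bert, n_tags)   # per-word tag scores
        self.tag_embed = nn.Embedding(n_tags, d_tag)  # label embedding lookup
        d_g = d_bert + d_tag
        self.U = nn.Linear(d_g, d_g)                  # U g_j
        self.W = nn.Linear(d_g, d_g)                  # W g_i + b_r
        self.attn = nn.MultiheadAttention(d_g, num_heads=1, batch_first=True)
        self.rel_out = nn.Linear(d_g, n_rels)         # one score per relation

    def forward(self, input_ids, attention_mask):
        # BERT layer: contextual word vectors z
        z = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        # NER layer (simplified): greedy tagging in place of CRF decoding
        tags = self.tag_scorer(z).argmax(-1)
        g = torch.cat([z, self.tag_embed(tags)], dim=-1)   # splice g = [z; h]
        # Table filling layer: pairwise relation vectors f(U g_j + W g_i + b_r)
        pair = torch.tanh(self.U(g).unsqueeze(1) + self.W(g).unsqueeze(2))
        b, n = g.size(0), g.size(1)
        rel_vecs = pair.reshape(b, n * n, -1)
        # Self-attention over all predicted relation vectors in the sentence
        attended, _ = self.attn(rel_vecs, rel_vecs, rel_vecs)
        # Sigmoid multi-label classification per table cell and relation
        return torch.sigmoid(self.rel_out(attended)).reshape(b, n, n, -1)
```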
Optionally, the entity relationship joint extraction method further includes:
calculating a tag sequence loss by using a tag sequence loss function; wherein the tag sequence loss function is

$$L_N = -\sum_{(x, y^*) \in \tau} \log p(y^* \mid x)$$

wherein $\tau$ is a training set comprising all of the training sample data, $y^*$ is the manually annotated correct tag sequence for the word sequence $x$, the word sequence being obtained by dividing a sentence through the BERT layer, and $p(y^* \mid x)$ is the probability value corresponding to $y^*$.
Optionally, the entity relationship joint extraction method further includes:
calculating a table filling loss by using a table filling loss function; wherein the table filling loss function is

$$L_{RE} = -\sum_{x \in \tau} \sum_{i=1}^{n} \sum_{j,k} \log p_r\!\left(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i\right)$$

wherein $L_{RE}$ is the table filling loss, $\tau$ is a training set comprising all of the training sample data, $x$ is the word sequence corresponding to a sentence in the training set $\tau$, $\hat{e}_i^{\,j}$ denotes the $j$-th related entity of the word $x_i$ among the manually annotated correct relations of $x_i$ in the training set, $\hat{r}_k$ denotes the $k$-th manually annotated correct relation between $x_i$ and $\hat{e}_i^{\,j}$ in the training set, and $p_r(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i)$ is the probability that the relation $\hat{r}_k$ holds between the words $x_i$ and $\hat{e}_i^{\,j}$.
Optionally, the dividing the sentence by the BERT layer and mapping each divided word into a corresponding word vector includes:
dividing sentences through the BERT layer, converting each divided word into a corresponding vector, and inputting the converted vector into an encoder for encoding so as to obtain the word vector.
Optionally, the entity relationship joint extraction method further includes:
calculating training loss; the training loss includes tag sequence loss and table filling loss.
In a second aspect, the present application discloses a device for entity relationship joint extraction, including:
the data acquisition module is used for acquiring training sample data;
the model training module is used for training a pre-built entity relation extraction model by utilizing the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process;
and the relation extraction module is used for outputting a corresponding entity relation extraction result by utilizing the trained model when the target text to be subjected to entity relation extraction is acquired.
Optionally, the entity relationship joint extraction device further includes a training loss calculation module, configured to calculate a training loss; the training loss includes tag sequence loss and table filling loss.
In a third aspect, the application discloses an entity relationship joint extraction device, including a processor and a memory; wherein,
the memory is used for storing a computer program;
the processor is configured to execute the computer program to implement the aforementioned entity relationship joint extraction method.
In a fourth aspect, the application discloses a computer readable storage medium for storing a computer program, where the computer program when executed by a processor implements the aforementioned entity relationship joint extraction method.
Therefore, training sample data is first acquired, and a pre-built entity relation extraction model is then trained by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer used, during training, to perform an attention calculation over the influence of other triples in a sentence on the currently predicted relationship; finally, when a target text to be subjected to entity relation extraction is obtained, a corresponding entity relation extraction result is output by using the trained model. In this way, training an entity relation extraction model comprising a self-attention layer allows the influence of other triples on the currently predicted relationship to be considered during entity relation extraction, thereby improving the accuracy of entity relation extraction.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings may be obtained according to the provided drawings without inventive effort to a person skilled in the art.
FIG. 1 is a flow chart of a method for entity relationship joint extraction disclosed in the present application;
FIG. 2 is a flowchart of a specific entity relationship joint extraction method disclosed in the present application;
FIG. 3 is a flowchart of a specific entity relationship joint extraction method disclosed in the present application;
FIG. 4 is a block diagram of an embodiment of a method for entity-relationship joint extraction disclosed in the present application;
FIG. 5 is a schematic structural diagram of a entity relationship joint extraction device disclosed in the present application;
FIG. 6 is a block diagram of a physical relationship joint extraction device disclosed in the present application;
fig. 7 is a block diagram of an electronic terminal disclosed in the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The current mainstream approach to entity relation extraction is the pipeline (serial) method, which splits the extraction of entities and relations into two subtasks: an entity recognition model first extracts the entities, and a classifier then determines the relations between entity pairs. However, because the serial method is split into two stages, errors in entity recognition propagate into relation extraction, causing error accumulation; at the same time, the subtasks are handled independently, neglecting the correlation between the two tasks. In fact, entity recognition affects relation classification, and relation classification also affects entity recognition: if two words bear some relation, the types of the two entities can be predicted from the type of relation between them. The two tasks are interdependent. On this basis, joint extraction methods have been proposed, which merge the two tasks into one through a joint model and treat relation extraction as the process of extracting entity-relation triples from unstructured text. Existing joint extraction methods eliminate the mutual independence of the two subtasks in the serial method, but their relation extraction is still not accurate enough. Therefore, the present application provides an entity relationship joint extraction scheme that considers the influence of other triples on the currently predicted relationship during entity relation extraction, thereby improving the accuracy of entity relation extraction.
Referring to fig. 1, an embodiment of the present application discloses a method for entity relationship joint extraction, including:
step S11: training sample data is obtained.
Step S12: training a pre-built entity relation extraction model by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process.
In a specific implementation manner, the entity relationship extraction model in this embodiment further includes a BERT (Bidirectional Encoder Representations from Transformers) layer, a NER (Named Entity Recognition) layer, and a table filling layer; correspondingly, training the pre-built entity relation extraction model by using the training sample data comprises the following steps:
step S121: inputting the training sample data to the BERT layer, dividing sentences through the BERT layer, and mapping each divided word into a corresponding word vector so as to obtain the context representation of the sentences.
In a specific embodiment, the present embodiment may divide a sentence through the BERT layer, convert each divided word into a corresponding vector, and then input the converted vector to an encoder for encoding, so as to obtain the word vector. Specifically, a sentence is first divided into words by the WordPiece tokenizer; the input representation of each word is the sum of its token, segment, and position embeddings, and the BERT layer prepends the special tag [CLS] as the first token. Let $x = \{x_1, x_2, \ldots, x_n\}$ denote the word sequence of the sentence, where $x_n$ is a word in the sentence and $n$ is the sentence length. Each word is then mapped by BERT into a word vector: each word is converted into a vector through an embedding layer, and the vector is input into an encoder for encoding to obtain a continuous embedded representation $z = \{z_1, z_2, \ldots, z_n\}$ of each word.
That is, the present embodiment obtains a contextual representation of sentences in the training sample data through BERT.
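As an illustrative, non-limiting sketch of this step (assuming the publicly available HuggingFace transformers library and the bert-base-uncased checkpoint, neither of which is specified by the present application), contextual word vectors can be obtained as follows:

```python
# Sketch only: contextual word vectors z_1..z_n from a pretrained BERT
# encoder; the tokenizer applies WordPiece and prepends [CLS].
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

sentence = "Li Hua was born in Shanghai in 1980"
inputs = tokenizer(sentence, return_tensors="pt")  # token/segment/position ids

with torch.no_grad():
    outputs = encoder(**inputs)

# z: one contextual vector per WordPiece token (including [CLS] and [SEP])
z = outputs.last_hidden_state   # shape: (1, n_tokens, 768)
```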
Step S122: performing a linear CRF (Conditional Random Field) calculation on the word vector through the NER layer to obtain a corresponding predicted entity tag sequence, and converting the predicted entity tag sequence into a corresponding tag embedding sequence.
That is, the present example performs a linear CRF calculation through the NER layer to obtain the most likely entity tag sequence, which is then converted into a corresponding tag embedding sequence $h = \{h_1, h_2, \ldots, h_n\}$.

For each word $x_i$, the score of each entity tag is calculated as:

$$s_i = V_1 f(W_1 z_i + b_z) + b_s$$

where $f(\cdot)$ is the activation function, $W_1$ and $V_1$ are transformation matrices, $b_z$ and $b_s$ are bias vectors, and $l$ is the number of hidden layers. If the predicted entity tag sequence is $y = \{y_1, y_2, \ldots, y_n\}$, the linear CRF score is calculated using the formula

$$s(x, y) = \sum_{i=1}^{n} \left( s_{i, y_i} + T_{y_{i-1}, y_i} \right)$$

where $s_{i, y_i}$ is the score of the word $x_i$ being tagged as $y_i$, $x_i$ is the $i$-th word, $y_i$ is the entity tag corresponding to $x_i$, and $T_{y_{i-1}, y_i}$ is the transition score from entity tag $y_{i-1}$ to entity tag $y_i$, $T$ being the transition matrix. The corresponding probability is

$$p(y \mid x) = \frac{e^{s(x, y)}}{\sum_{\tilde{y} \in Y_x} e^{s(x, \tilde{y})}}$$

where $y = \{y_1, y_2, \ldots, y_n\}$ is the predicted entity tag sequence, $n$ is the number of entity tags, $p(y \mid x)$ is the probability value corresponding to the predicted entity tag sequence calculated using a softmax function, $s(x, y)$ is the linear CRF score corresponding to the predicted entity tag sequence, and $Y_x$ is the set of entity tag sequences corresponding to the word sequence $x$.
That is, the present application performs the linear CRF calculation on all sequences in the entity tag sequence set to obtain the corresponding linear CRF scores, and further obtains the corresponding probability values, so as to determine the predicted entity tag sequence.
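As an illustrative, non-limiting sketch of the linear CRF scoring above (toy weights and a brute-force partition sum; a practical implementation would use the Viterbi and forward algorithms instead of enumeration):

```python
# Sketch only: score a tag sequence with emission and transition scores,
# then normalize over every possible sequence Y_x for a tiny tag set.
import itertools
import numpy as np

n_words, n_tags = 4, 3                      # sentence length, tag-set size
rng = np.random.default_rng(0)
emit = rng.normal(size=(n_words, n_tags))   # s_{i, y_i}: per-word tag scores
trans = rng.normal(size=(n_tags, n_tags))   # T_{y_{i-1}, y_i}: transitions

def crf_score(y):
    """s(x, y) = sum_i ( s_{i, y_i} + T_{y_{i-1}, y_i} )."""
    s = emit[0, y[0]]
    for i in range(1, n_words):
        s += emit[i, y[i]] + trans[y[i - 1], y[i]]
    return s

# p(y | x): softmax of s(x, y) over all tag sequences in Y_x
all_seqs = list(itertools.product(range(n_tags), repeat=n_words))
scores = np.array([crf_score(y) for y in all_seqs])
probs = np.exp(scores - scores.max())
probs /= probs.sum()

best = all_seqs[int(np.argmax(probs))]      # most likely entity tag sequence
print("predicted tag sequence:", best, "p =", probs.max())
```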
Step S123: and splicing the word vector and the tag embedding sequence to obtain a target vector.
Step S124: and predicting the entity relationship of the target vector through the table filling layer to obtain a corresponding predicted entity relationship.
In a specific implementation, the present embodiment may splice the word vector $z_k$ output by the BERT layer and the tag embedding $h_k$ output by the NER layer to obtain the target vector $g_k$, and predict the entity relationship of the target vector through the table filling layer. Specifically, the relationship between any two words $x_i$ and $x_j$ is predicted by the formula $f(U g_j + W g_i + b_r)$, where $U$ and $W$ are transformation matrices and $b_r$ is a bias vector.
Step S125: and inputting the predicted entity relationship into the self-attention layer to perform attention calculation, and obtaining a corresponding entity relationship after attention calculation.
In a specific embodiment, the output matrix of the self-attention layer is calculated as:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{T}}{\sqrt{D}}\right) V$$

where $Q$, $K$, and $V$ are the query, key, and value representations of each input relationship vector, $Q = K = V$, $D$ is the dimension of $Q$ and $K$, and each unit in the sequence performs the attention calculation with all units in the sequence. Concretely, the input relationship vectors are multiplied by initialized weight matrices to obtain $Q$, $K$, and $V$; the attention score of each input vector is obtained by taking the dot product between $K$ and $Q$; softmax is applied over all the attention scores; and finally each softmaxed attention score is multiplied by the corresponding $V$ and the results are summed to obtain the output vector.
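As an illustrative, non-limiting sketch of this scaled dot-product self-attention calculation (toy weights, single head):

```python
# Sketch only: self-attention over the predicted relationship vectors,
# following Attention(Q, K, V) above with Q, K, V projected from one input.
import numpy as np

def self_attention(R, Wq, Wk, Wv):
    """R: (m, d) matrix of m relationship vectors; W*: (d, d) weights."""
    Q, K, V = R @ Wq, R @ Wk, R @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # dot products / sqrt(D)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of values

rng = np.random.default_rng(1)
m, d = 5, 8                                          # 5 relations, dimension 8
R = rng.normal(size=(m, d))
out = self_attention(R, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)                                     # (5, 8): attended relations
```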
Step S126: and performing inner product operation on the entity relationship after the attention calculation and the predefined relationship vector, and then classifying through a multi-label classifier to obtain the entity relationship corresponding to each word.
In a specific embodiment, the inner product is taken between the entity relationship after the attention calculation and each predefined relationship vector, and the relationship between each word and the selected entity is obtained through a sigmoid multi-label classifier. The score of the word $x_i$ and the word $x_j$ having the relation $r_k$ is defined as:

$$s^{(r)}(g_j, g_i, r_k) = V^{(k)} f(U g_j + W g_i + b_r)$$

where $V$, $U$, and $W$ are transformation matrices, $b_r$ is a bias vector, and $g_j = [z_j; h_j]$ is the splice of the BERT output $z_j$ and the tag embedding $h_j$ of the word $x_j$. In the table filling, the probability that the word $x_j$ is the head entity of the word $x_i$ with the relation $r_k$ is estimated as:

$$p_r(x_j, r_k \mid x_i) = \delta\!\left(s^{(r)}(g_j, g_i, r_k)\right)$$

where $\delta$ denotes the sigmoid function.
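As an illustrative, non-limiting sketch of this relation scoring and its sigmoid probability (toy dimensions and weights; the splice $g = [z; h]$ follows the definition above):

```python
# Sketch only: s^{(r)}(g_j, g_i, r_k) = V^{(k)} f(U g_j + W g_i + b_r),
# turned into one independent probability per relation via the sigmoid.
import numpy as np

rng = np.random.default_rng(2)
d_z, d_h, d_r, n_rel = 8, 4, 6, 3        # dims of z, h, hidden layer; #relations
d_g = d_z + d_h

U, W = rng.normal(size=(d_r, d_g)), rng.normal(size=(d_r, d_g))
b_r = rng.normal(size=d_r)
V = rng.normal(size=(n_rel, d_r))        # one row V^{(k)} per relation r_k

def relation_probs(z_i, h_i, z_j, h_j):
    g_i = np.concatenate([z_i, h_i])     # g = [z; h]
    g_j = np.concatenate([z_j, h_j])
    hidden = np.tanh(U @ g_j + W @ g_i + b_r)   # f(U g_j + W g_i + b_r)
    scores = V @ hidden                         # s^{(r)} for every relation
    return 1.0 / (1.0 + np.exp(-scores))        # sigmoid multi-label probs

p = relation_probs(rng.normal(size=d_z), rng.normal(size=d_h),
                   rng.normal(size=d_z), rng.normal(size=d_h))
print(p)   # independent probability per relation; > 0.5 => relation predicted
```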
Step S13: and when the target text to be subjected to entity relation extraction is obtained, outputting a corresponding entity relation extraction result by using the trained model.
It should be noted that the problem of extracting overlapping relations can be effectively solved by the table filling layer.
As can be seen, in the embodiment of the present application, training sample data is first obtained, and then a pre-built entity relationship extraction model is trained by using the training sample data, so as to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current predicted relationship in the training process, and finally outputting a corresponding entity relationship extraction result by utilizing the trained model when a target text to be subjected to entity relationship extraction is obtained. In this way, training the entity relation extraction model comprising the self-attention layer can consider the influence of other triples on the current prediction relation in the extraction process of the entity relation, thereby improving the accuracy of the entity relation extraction.
Referring to fig. 2, an embodiment of the present application discloses a specific entity relationship joint extraction method, which includes:
step S21: training sample data is obtained.
Step S22: training a pre-built entity relation extraction model by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process.
In a specific embodiment, the entity relation extraction model further comprises a BERT layer, a NER layer and a table filling layer; correspondingly, the training the pre-built entity relation extraction model by using the training sample data comprises the following steps: inputting the training sample data to the BERT layer, dividing sentences through the BERT layer, and mapping each divided word into a corresponding word vector to obtain the context representation of the sentences; performing linear CRF calculation on the word vector through the NER layer to obtain a corresponding predicted entity marking sequence, and converting the predicted entity marking sequence into a corresponding tag embedding sequence; splicing the word vector and the tag embedding sequence to obtain a target vector; predicting entity relationship of the target vector through the table filling layer to obtain a corresponding predicted entity relationship; inputting the predicted entity relationship into the self-attention layer for attention calculation to obtain a corresponding entity relationship after attention calculation; and performing inner product operation on the entity relationship after the attention calculation and the predefined relationship vector, and then classifying through a multi-label classifier to obtain the entity relationship corresponding to each word.
Step S23: calculating training loss; the training loss includes tag sequence loss and table filling loss.
In this embodiment, the training loss is calculated by using a training loss function which, for the joint extraction of entity relationships, is defined as the sum of the tag sequence loss, i.e. the NER loss, and the loss of the self-attention-based table filling: $L = L_N + L_{RE}$.
In a specific embodiment, the present application may calculate the tag sequence loss by using a tag sequence loss function; wherein the tag sequence loss function is

$$L_N = -\sum_{(x, y^*) \in \tau} \log p(y^* \mid x)$$

wherein $\tau$ is a training set comprising all of the training sample data, $y^*$ is the manually annotated correct tag sequence for the word sequence $x$, the word sequence being obtained by dividing a sentence through the BERT layer, and $p(y^* \mid x)$ is the probability value corresponding to $y^*$. $p(y^* \mid x)$ is calculated in the same way as $p(y \mid x)$ disclosed in the previous embodiment, and the negative log-likelihood $L_N$ of the manually annotated correct sequence is minimized during training. The tags are converted into tag embeddings by looking up an embedding layer: for the sequence $y = \{y_1, y_2, \ldots, y_n\}$, a tag embedding sequence $h = \{h_1, h_2, \ldots, h_n\}$ is obtained.
Further, the present embodiment may calculate the table filling loss by using a table filling loss function; wherein the table filling loss function is

$$L_{RE} = -\sum_{x \in \tau} \sum_{i=1}^{n} \sum_{j,k} \log p_r\!\left(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i\right)$$

wherein $L_{RE}$ is the table filling loss, $\tau$ is a training set comprising all of the training sample data, $x$ is the word sequence corresponding to a sentence in the training set $\tau$, $\hat{e}_i^{\,j}$ denotes the $j$-th related entity of the word $x_i$ among the manually annotated correct relations of $x_i$ in the training set, $\hat{r}_k$ denotes the $k$-th manually annotated correct relation between $x_i$ and $\hat{e}_i^{\,j}$ in the training set, and $p_r(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i)$ is the probability that the relation $\hat{r}_k$ holds between the words $x_i$ and $\hat{e}_i^{\,j}$.
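As an illustrative, non-limiting sketch of the combined training loss $L = L_N + L_{RE}$ (toy probabilities; in practice the gold cells come from annotated data and the probabilities from the model):

```python
# Sketch only: L_N is the negative log-likelihood of the gold tag sequence
# under the CRF; L_RE sums -log p over the gold cells of the relation table.
import numpy as np

def tag_sequence_loss(p_gold):
    """L_N for one sentence: -log p(y* | x), with p_gold from the CRF softmax."""
    return -np.log(p_gold)

def table_filling_loss(p_table, gold_cells):
    """L_RE for one sentence.

    p_table:    (n, n, n_rel) array of p_r(x_j, r_k | x_i) for every cell.
    gold_cells: list of (i, j, k) indices manually annotated as correct.
    """
    return -sum(np.log(p_table[i, j, k]) for i, j, k in gold_cells)

# Toy usage: a 3-word sentence, 2 relation types, one gold triple
rng = np.random.default_rng(3)
p_table = rng.uniform(0.05, 0.95, size=(3, 3, 2))
loss = tag_sequence_loss(0.8) + table_filling_loss(p_table, [(0, 2, 1)])
print(loss)   # total training loss L = L_N + L_RE for this sentence
```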
Step S24: and when the target text to be subjected to entity relation extraction is obtained, outputting a corresponding entity relation extraction result by using the trained model.
That is, the entity relationship extraction in the embodiment of the present application first uses BERT to preprocess the training data, vectorizes the preprocessed data, and encodes the vectorized data to capture semantic information including context information; calculates the most likely entity tag sequence of a sentence through the NER layer and converts it into tag embeddings; predicts relationships through the table filling; then sends all predicted relationships in the sentence into the self-attention mechanism, comprehensively considering the influence of all other triples in the training sentence on the currently predicted relationship; and finally obtains the relationship between each word and the selected entity through the sigmoid multi-label classifier. Specifically, the entity relationship joint extraction model mainly comprises a BERT layer, a NER layer, a self-attention layer, and a table filling layer. The BERT layer first divides the sentence; the input representation of each word consists of its token, segment, and position embeddings; BERT then maps each word into a word vector: each word is converted into a vector through an embedding layer, and the vector is input into an encoder for encoding to obtain a continuous embedded representation of each word as the output of the BERT layer. The word vector after BERT preprocessing serves as the input of the NER layer; the most likely entity tag sequence is calculated using the linear CRF and then converted into tag embeddings as the output of the NER layer. The output vector of the BERT layer is spliced with the output vector of the NER layer as the input of the table filling layer, and the relationship is predicted according to the relation prediction formula above; all predicted relationship vectors are taken as the input of the self-attention layer, which comprehensively considers the influence of other triples in the sentence on the current relationship, so as to predict the relationship between the current entities more accurately. The table filling layer obtains and outputs the relationship between each word and the selected entity through the sigmoid multi-label classifier. For example, given the training sentence as input: "Li Hua was born in Shanghai in 1980", the output is: (Li Hua, 1980, date of birth), (Li Hua, Shanghai, place of birth).
Therefore, the embodiment of the present application comprehensively considers the influence of other triples in the sentence on the currently predicted relationship by using a self-attention mechanism, so as to better predict the relationship between the current entities. Moreover, relation extraction is implemented in the form of table filling, which can enumerate the relations between any two entities in a sentence, so that cases where one entity bears relations to several other entities are also covered. This extraction strategy remedies the shortcomings of current entity relationship joint extraction and simultaneously improves the precision and recall of entity relationship joint extraction.
For example, referring to fig. 3, fig. 3 is a flowchart of a specific entity relationship joint extraction method according to an embodiment of the present application. For example, referring to fig. 4, fig. 4 is a block diagram illustrating an implementation of a specific entity relationship joint extraction method disclosed in the present application.
Referring to fig. 5, an embodiment of the present application discloses an entity relationship joint extraction device, including:
a data acquisition module 11, configured to acquire training sample data;
the model training module 12 is configured to train a pre-built entity relationship extraction model by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process;
and the relation extraction module 13 is used for outputting a corresponding entity relation extraction result by using the trained model when the target text to be subjected to entity relation extraction is acquired.
Therefore, training sample data is first acquired, and a pre-built entity relation extraction model is then trained by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer used, during training, to perform an attention calculation over the influence of other triples in a sentence on the currently predicted relationship; finally, when a target text to be subjected to entity relation extraction is obtained, a corresponding entity relation extraction result is output by using the trained model. In this way, training an entity relation extraction model comprising a self-attention layer allows the influence of other triples on the currently predicted relationship to be considered during entity relation extraction, thereby improving the accuracy of entity relation extraction.
The entity relation extraction model further comprises a BERT layer, a NER layer and a table filling layer;
correspondingly, the model training module 12 is specifically configured to input the training sample data to the BERT layer, divide sentences through the BERT layer, and map each divided word into a corresponding word vector to obtain a context representation of the sentence; performing linear CRF calculation on the word vector through the NER layer to obtain a corresponding predicted entity marking sequence, and converting the predicted entity marking sequence into a corresponding tag embedding sequence; splicing the word vector and the tag embedding sequence to obtain a target vector; predicting entity relationship of the target vector through the table filling layer to obtain a corresponding predicted entity relationship; inputting the predicted entity relationship into the self-attention layer for attention calculation to obtain a corresponding entity relationship after attention calculation; and performing inner product operation on the entity relationship after the attention calculation and the predefined relationship vector, and then classifying through a multi-label classifier to obtain the entity relationship corresponding to each word.
The entity relationship joint extraction device further comprises a tag sequence loss calculation module, wherein the tag sequence loss calculation module is configured to calculate the tag sequence loss by using a tag sequence loss function; wherein the tag sequence loss function is

$$L_N = -\sum_{(x, y^*) \in \tau} \log p(y^* \mid x)$$

wherein $\tau$ is a training set comprising all of the training sample data, $y^*$ is the manually annotated correct tag sequence for the word sequence $x$, the word sequence being obtained by dividing a sentence through the BERT layer, and $p(y^* \mid x)$ is the probability value corresponding to $y^*$.
The entity relationship joint extraction device further comprises a table filling loss calculation module, wherein the table filling loss calculation module is configured to calculate the table filling loss by using a table filling loss function; wherein the table filling loss function is

$$L_{RE} = -\sum_{x \in \tau} \sum_{i=1}^{n} \sum_{j,k} \log p_r\!\left(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i\right)$$

wherein $L_{RE}$ is the table filling loss, $\tau$ is a training set comprising all of the training sample data, $x$ is the word sequence corresponding to a sentence in the training set $\tau$, $\hat{e}_i^{\,j}$ denotes the $j$-th related entity of the word $x_i$ among the manually annotated correct relations of $x_i$ in the training set, $\hat{r}_k$ denotes the $k$-th manually annotated correct relation between $x_i$ and $\hat{e}_i^{\,j}$ in the training set, and $p_r(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i)$ is the probability that the relation $\hat{r}_k$ holds between the words $x_i$ and $\hat{e}_i^{\,j}$.
Further, the model training module 12 is specifically configured to divide a sentence through the BERT layer, convert each divided word into a corresponding vector, and then input the converted vector to an encoder for encoding, so as to obtain the word vector.
The entity relationship joint extraction device further comprises a training loss calculation module for calculating training loss; the training loss includes tag sequence loss and table filling loss.
Referring to fig. 6, an embodiment of the present application discloses an entity relationship joint extraction device, including a processor 21 and a memory 22; wherein the memory 22 is used for storing a computer program; the processor 21 is configured to execute the computer program to implement the entity relationship joint extraction method disclosed in the foregoing embodiment.
For the specific process of the entity relationship joint extraction method, reference may be made to the corresponding content disclosed in the foregoing embodiment, and no further description is given here.
Referring to fig. 7, an embodiment of the present application discloses an electronic terminal 20 including a processor 21 and a memory 22 as disclosed in the foregoing embodiments. The steps that the processor 21 may specifically perform may refer to the corresponding contents disclosed in the foregoing embodiments, and will not be described herein.
Further, the electronic terminal 20 in the present embodiment may further specifically include a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26; wherein, the power supply 23 is used for providing working voltage for each hardware device on the terminal 20; the communication interface 24 can create a data transmission channel between the terminal 20 and an external device, and the communication protocol to be followed is any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 25 is used for acquiring external input data or outputting external output data, and the specific interface type thereof may be selected according to the specific application requirement, which is not limited herein.
Further, the embodiment of the application also discloses a computer readable storage medium for storing a computer program, wherein the computer program is executed by a processor to implement the entity relationship joint extraction method disclosed in the previous embodiment.
For the specific process of the entity relationship joint extraction method, reference may be made to the corresponding content disclosed in the foregoing embodiment, and no further description is given here.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above detailed description of a method, device, apparatus and medium for entity relationship joint extraction provided in the present application applies specific examples to illustrate the principles and embodiments of the present application, where the above description of the embodiments is only for helping to understand the method and core ideas of the present application; meanwhile, as those skilled in the art will have modifications in the specific embodiments and application scope in accordance with the ideas of the present application, the present description should not be construed as limiting the present application in view of the above.

Claims (9)

1. An entity relationship joint extraction method, characterized by comprising the following steps:
acquiring training sample data;
training a pre-built entity relation extraction model by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process;
when a target text to be subjected to entity relation extraction is obtained, outputting a corresponding entity relation extraction result by using the trained model;
the entity relation extraction model further comprises a BERT layer, a NER layer and a table filling layer; correspondingly, the training the pre-built entity relation extraction model by using the training sample data comprises the following steps: inputting the training sample data to the BERT layer, dividing sentences through the BERT layer, and mapping each divided word into a corresponding word vector to obtain the context representation of the sentences; performing linear CRF calculation on the word vector through the NER layer to obtain a corresponding predicted entity marking sequence, and converting the predicted entity marking sequence into a corresponding tag embedding sequence; splicing the word vector and the tag embedding sequence to obtain a target vector; predicting entity relationship of the target vector through the table filling layer to obtain a corresponding predicted entity relationship; inputting the predicted entity relationship into the self-attention layer for attention calculation to obtain a corresponding entity relationship after attention calculation; and performing inner product operation on the entity relationship after the attention calculation and the predefined relationship vector, and then classifying through a multi-label classifier to obtain the entity relationship corresponding to each word.
2. The method for entity-relationship joint extraction according to claim 1, further comprising:
calculating a tag sequence loss by using a tag sequence loss function; wherein the tag sequence loss function is

$$L_N = -\sum_{(x, y^*) \in \tau} \log p(y^* \mid x)$$

wherein $\tau$ is a training set comprising all of the training sample data, $y^*$ is the manually annotated correct tag sequence for the word sequence $x$, the word sequence being a sequence obtained by dividing a sentence through the BERT layer, and $p(y^* \mid x)$ is the probability value corresponding to $y^*$.
3. The method for entity-relationship joint extraction according to claim 1, further comprising:
calculating a table filling loss by using a table filling loss function; wherein the table filling loss function is

$$L_{RE} = -\sum_{x \in \tau} \sum_{i=1}^{n} \sum_{j,k} \log p_r\!\left(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i\right)$$

wherein $L_{RE}$ is the table filling loss, $\tau$ is a training set comprising all of the training sample data, $x$ is the word sequence corresponding to a sentence in the training set $\tau$, $\hat{e}_i^{\,j}$ denotes the $j$-th related entity of the word $x_i$ among the manually annotated correct relations of $x_i$ in the training set, $\hat{r}_k$ denotes the $k$-th manually annotated correct relation between $x_i$ and $\hat{e}_i^{\,j}$ in the training set, and $p_r(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i)$ is the probability that the relation $\hat{r}_k$ holds between the words $x_i$ and $\hat{e}_i^{\,j}$.
4. The method of claim 1, wherein the dividing sentences by the BERT layer and mapping each divided word into a corresponding word vector comprises:
dividing sentences through the BERT layer, converting each divided word into a corresponding vector, and inputting the converted vector into an encoder for encoding so as to obtain the word vector.
5. The method for entity-relationship joint extraction according to claim 1, further comprising:
calculating training loss; the training loss includes tag sequence loss and table filling loss.
6. An entity relationship joint extraction device, comprising:
the data acquisition module is used for acquiring training sample data;
the model training module is used for training a pre-built entity relation extraction model by utilizing the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process;
the relation extraction module is used for outputting a corresponding entity relation extraction result by utilizing the trained model when a target text to be subjected to entity relation extraction is acquired;
the entity relation extraction model further comprises a BERT layer, a NER layer and a table filling layer; correspondingly, the training the pre-built entity relation extraction model by using the training sample data comprises the following steps: inputting the training sample data to the BERT layer, dividing sentences through the BERT layer, and mapping each divided word into a corresponding word vector to obtain the context representation of the sentences; performing linear CRF calculation on the word vector through the NER layer to obtain a corresponding predicted entity marking sequence, and converting the predicted entity marking sequence into a corresponding tag embedding sequence; splicing the word vector and the tag embedding sequence to obtain a target vector; predicting entity relationship of the target vector through the table filling layer to obtain a corresponding predicted entity relationship; inputting the predicted entity relationship into the self-attention layer for attention calculation to obtain a corresponding entity relationship after attention calculation; and performing inner product operation on the entity relationship after the attention calculation and the predefined relationship vector, and then classifying through a multi-label classifier to obtain the entity relationship corresponding to each word.
7. The entity relationship joint extraction device of claim 6, further comprising a training loss calculation module configured to calculate a training loss; the training loss includes the tag sequence loss and the table filling loss.
8. An entity relationship joint extraction device is characterized by comprising a processor and a memory; wherein,
the memory is used for storing a computer program;
the processor is configured to execute the computer program to implement the entity-relationship joint extraction method according to any one of claims 1 to 5.
9. A computer readable storage medium for storing a computer program, wherein the computer program when executed by a processor implements the entity-relationship joint extraction method of any one of claims 1 to 5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010538132.5A CN111666427B (en) 2020-06-12 2020-06-12 Entity relationship joint extraction method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN111666427A CN111666427A (en) 2020-09-15
CN111666427B (en) 2023-05-12

Family

ID=72387352

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010538132.5A Active CN111666427B (en) 2020-06-12 2020-06-12 Entity relationship joint extraction method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN111666427B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114266245A (en) * 2020-09-16 2022-04-01 北京金山数字娱乐科技有限公司 Entity linking method and device
CN112163092B (en) * 2020-10-10 2022-07-12 成都数之联科技股份有限公司 Entity and relation extraction method, system, device and medium
CN112395407B (en) * 2020-11-03 2023-09-19 杭州未名信科科技有限公司 Business entity relation extraction method, device and storage medium
CN112819622B (en) * 2021-01-26 2023-10-17 深圳价值在线信息科技股份有限公司 Information entity relationship joint extraction method and device and terminal equipment
CN112818676B (en) * 2021-02-02 2023-09-26 东北大学 Medical entity relationship joint extraction method
CN112883736A (en) * 2021-02-22 2021-06-01 零氪科技(北京)有限公司 Medical entity relationship extraction method and device
CN112989788A (en) * 2021-03-12 2021-06-18 平安科技(深圳)有限公司 Method, device, equipment and medium for extracting relation triples
CN113806493B (en) * 2021-10-09 2023-08-29 中国人民解放军国防科技大学 Entity relationship joint extraction method and device for Internet text data
CN114548325B (en) * 2022-04-26 2022-08-02 北京大学 Zero sample relation extraction method and system based on dual contrast learning
CN115169350B (en) * 2022-07-14 2024-03-12 中国电信股份有限公司 Method, device, equipment, medium and program for processing information

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165385A (en) * 2018-08-29 2019-01-08 中国人民解放军国防科技大学 Multi-triple extraction method based on entity relationship joint extraction model
CN109670050A (en) * 2018-12-12 2019-04-23 科大讯飞股份有限公司 A kind of entity relationship prediction technique and device
CN109902145A (en) * 2019-01-18 2019-06-18 中国科学院信息工程研究所 A kind of entity relationship joint abstracting method and system based on attention mechanism
GB201904161D0 (en) * 2019-03-26 2019-05-08 Benevolentai Tech Limited Entity type identification for named entity recognition systems
CN110059320A (en) * 2019-04-23 2019-07-26 腾讯科技(深圳)有限公司 Entity relation extraction method, apparatus, computer equipment and storage medium
CN111178074A (en) * 2019-12-12 2020-05-19 天津大学 Deep learning-based Chinese named entity recognition method
CN111160008A (en) * 2019-12-18 2020-05-15 华南理工大学 Entity relationship joint extraction method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zhang Xiaohai; Cao Xinwen; Zhang Min. Military named entity recognition based on the self-attention mechanism. Command Control & Simulation, 2019, (06): 35-39. *
Li Weijiang; Li Tao; Qi Fang. Chinese entity relation extraction based on multi-feature self-attention BLSTM. Journal of Chinese Information Processing, 2019, 33: 47-56, 72. *

Also Published As

Publication number Publication date
CN111666427A (en) 2020-09-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant