CN111666427B - Entity relationship joint extraction method, device, equipment and medium - Google Patents

Entity relationship joint extraction method, device, equipment and medium

Info

Publication number
CN111666427B
CN111666427B (application CN202010538132.5A)
Authority
CN
China
Prior art keywords
entity
training
layer
relationship
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010538132.5A
Other languages
Chinese (zh)
Other versions
CN111666427A (en)
Inventor
曾道建
谢依玲
赵超
田剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha University of Science and Technology
Original Assignee
Changsha University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha University of Science and Technology
Priority to CN202010538132.5A (patent CN111666427B)
Publication of application CN111666427A
Application granted
Publication of granted patent CN111666427B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses an entity relationship joint extraction method, device, equipment and medium, comprising the following steps: acquiring training sample data; training a pre-built entity relation extraction model by using the training sample data to obtain a trained model, wherein the entity relation extraction model comprises a self-attention layer used, during training, to perform an attention calculation over the influence of other triples in a sentence on the currently predicted relationship; and, when a target text to be subjected to entity relation extraction is obtained, outputting a corresponding entity relation extraction result by using the trained model. In this way, training an entity relation extraction model comprising a self-attention layer allows the influence of other triples on the currently predicted relationship to be considered during entity relation extraction, thereby improving the accuracy of entity relation extraction.

Description

Entity relationship joint extraction method, device, equipment and medium
Technical Field
The present invention relates to the field of natural language processing, and in particular, to a method, an apparatus, a device, and a medium for entity relationship joint extraction.
Background
Entity relation extraction, as a key technology of information extraction, has important theoretical significance and broad application prospects. At the theoretical level, entity relation extraction involves the theories and methods of multiple disciplines such as machine learning, data mining, and natural language processing. At the application level, entity relation extraction can be used to automatically build large-scale knowledge bases, and it also provides data support for information retrieval and the construction of automated question-answering systems. Entity relation extraction is likewise of significant research value in discourse understanding, machine translation, and other areas. A variety of methods have been proposed for relation extraction.
The current mainstream approach to entity relation extraction is the pipeline (serial) method, which splits the extraction of entities and relations into two subtasks: an entity recognition model first extracts the entities, and a classifier then determines the relations between entity pairs. However, because the serial method is split into two stages, errors in entity recognition propagate into relation extraction, causing error accumulation; at the same time, the subtasks are handled independently, neglecting the correlation between the two tasks. In fact, entity recognition affects relation classification, and relation classification also affects entity recognition: if two words bear some relation, the types of the two entities can be predicted from the type of relation between them. The two tasks are interdependent. On this basis, joint extraction methods have been proposed, which merge the two tasks into one through a joint model and treat relation extraction as the process of extracting entity-relation triples from unstructured text. Existing joint extraction methods eliminate the mutual independence of the two subtasks in the serial method, but their relation extraction is still not accurate enough.
Disclosure of Invention
In view of this, an object of the present application is to provide a method, apparatus, device and medium for entity relationship joint extraction, which can consider the influence of other triples on the current predicted relationship in the extraction process of the entity relationship, so as to improve the accuracy of entity relationship extraction. The specific scheme is as follows:
in a first aspect, the application discloses a method for entity relationship joint extraction, including:
acquiring training sample data;
training a pre-built entity relation extraction model by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process;
and when the target text to be subjected to entity relation extraction is obtained, outputting a corresponding entity relation extraction result by using the trained model.
Optionally, the entity relation extraction model further includes a BERT layer, a NER layer, and a table filling layer;
correspondingly, the training the pre-built entity relation extraction model by using the training sample data comprises the following steps:
inputting the training sample data to the BERT layer, dividing sentences through the BERT layer, and mapping each divided word into a corresponding word vector to obtain the context representation of the sentences;
performing a linear CRF calculation on the word vector through the NER layer to obtain a corresponding predicted entity tag sequence, and converting the predicted entity tag sequence into a corresponding tag embedding sequence;
splicing the word vector and the tag embedding sequence to obtain a target vector;
predicting entity relationship of the target vector through the table filling layer to obtain a corresponding predicted entity relationship;
inputting the predicted entity relationship into the self-attention layer for attention calculation to obtain a corresponding entity relationship after attention calculation;
and performing inner product operation on the entity relationship after the attention calculation and the predefined relationship vector, and then classifying through a multi-label classifier to obtain the entity relationship corresponding to each word.
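As an illustrative, non-limiting sketch of how the above steps could be wired together (assuming PyTorch and the publicly available HuggingFace transformers library, neither of which is mandated by the present application; the checkpoint name and layer sizes are placeholders, and greedy tagging stands in for the linear CRF decoding):

```python
# Sketch only: BERT -> tagging -> tag embedding -> splice -> table filling
# -> self-attention over relation vectors -> sigmoid multi-label scores.
import torch
import torch.nn as nn
from transformers import BertModel

class JointExtractor(nn.Module):
    def __init__(self, n_tags, n_rels, d_tag=32, d_bert=768):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.tag_scorer = nn.Linear(d_bert, n_tags)   # per-word tag scores
        self.tag_embed = nn.Embedding(n_tags, d_tag)  # label embedding lookup
        d_g = d_bert + d_tag
        self.U = nn.Linear(d_g, d_g)                  # U g_j
        self.W = nn.Linear(d_g, d_g)                  # W g_i + b_r
        self.attn = nn.MultiheadAttention(d_g, num_heads=1, batch_first=True)
        self.rel_out = nn.Linear(d_g, n_rels)         # one score per relation

    def forward(self, input_ids, attention_mask):
        # BERT layer: contextual word vectors z
        z = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        # NER layer (simplified): greedy tagging in place of CRF decoding
        tags = self.tag_scorer(z).argmax(-1)
        g = torch.cat([z, self.tag_embed(tags)], dim=-1)   # splice g = [z; h]
        # Table filling layer: pairwise relation vectors f(U g_j + W g_i + b_r)
        pair = torch.tanh(self.U(g).unsqueeze(1) + self.W(g).unsqueeze(2))
        b, n = g.size(0), g.size(1)
        rel_vecs = pair.reshape(b, n * n, -1)
        # Self-attention over all predicted relation vectors in the sentence
        attended, _ = self.attn(rel_vecs, rel_vecs, rel_vecs)
        # Sigmoid multi-label classification per table cell and relation
        return torch.sigmoid(self.rel_out(attended)).reshape(b, n, n, -1)
```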
Optionally, the entity relationship joint extraction method further includes:
calculating a tag sequence loss by using a tag sequence loss function; wherein the tag sequence loss function is

$$L_N = -\sum_{(x, y^*) \in \tau} \log p(y^* \mid x)$$

wherein $\tau$ is a training set comprising all of the training sample data, $y^*$ is the manually annotated correct tag sequence for the word sequence $x$, the word sequence being obtained by dividing a sentence through the BERT layer, and $p(y^* \mid x)$ is the probability value corresponding to $y^*$.
Optionally, the entity relationship joint extraction method further includes:
calculating a table filling loss by using a table filling loss function; wherein the table filling loss function is

$$L_{RE} = -\sum_{x \in \tau} \sum_{i=1}^{n} \sum_{j,k} \log p_r\!\left(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i\right)$$

wherein $L_{RE}$ is the table filling loss, $\tau$ is a training set comprising all of the training sample data, $x$ is the word sequence corresponding to a sentence in the training set $\tau$, $\hat{e}_i^{\,j}$ denotes the $j$-th related entity of the word $x_i$ among the manually annotated correct relations of $x_i$ in the training set, $\hat{r}_k$ denotes the $k$-th manually annotated correct relation between $x_i$ and $\hat{e}_i^{\,j}$ in the training set, and $p_r(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i)$ is the probability that the relation $\hat{r}_k$ holds between the words $x_i$ and $\hat{e}_i^{\,j}$.
Optionally, the dividing the sentence by the BERT layer and mapping each divided word into a corresponding word vector includes:
dividing sentences through the BERT layer, converting each divided word into a corresponding vector, and inputting the converted vector into an encoder for encoding so as to obtain the word vector.
Optionally, the entity relationship joint extraction method further includes:
calculating training loss; the training loss includes tag sequence loss and table filling loss.
In a second aspect, the present application discloses a device for entity relationship joint extraction, including:
the data acquisition module is used for acquiring training sample data;
the model training module is used for training a pre-built entity relation extraction model by utilizing the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process;
and the relation extraction module is used for outputting a corresponding entity relation extraction result by utilizing the trained model when the target text to be subjected to entity relation extraction is acquired.
Optionally, the entity relationship joint extraction device further includes a training loss calculation module, configured to calculate a training loss; the training loss includes tag sequence loss and table filling loss.
In a third aspect, the application discloses an entity relationship joint extraction device, including a processor and a memory; wherein,
the memory is used for storing a computer program;
the processor is configured to execute the computer program to implement the aforementioned entity relationship joint extraction method.
In a fourth aspect, the application discloses a computer readable storage medium for storing a computer program, where the computer program when executed by a processor implements the aforementioned entity relationship joint extraction method.
Therefore, training sample data is first acquired, and a pre-built entity relation extraction model is then trained by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer used, during training, to perform an attention calculation over the influence of other triples in a sentence on the currently predicted relationship; finally, when a target text to be subjected to entity relation extraction is obtained, a corresponding entity relation extraction result is output by using the trained model. In this way, training an entity relation extraction model comprising a self-attention layer allows the influence of other triples on the currently predicted relationship to be considered during entity relation extraction, thereby improving the accuracy of entity relation extraction.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings may be obtained according to the provided drawings without inventive effort to a person skilled in the art.
FIG. 1 is a flow chart of a method for entity relationship joint extraction disclosed in the present application;
FIG. 2 is a flowchart of a specific entity relationship joint extraction method disclosed in the present application;
FIG. 3 is a flowchart of a specific entity relationship joint extraction method disclosed in the present application;
FIG. 4 is a block diagram of an embodiment of a method for entity-relationship joint extraction disclosed in the present application;
FIG. 5 is a schematic structural diagram of a entity relationship joint extraction device disclosed in the present application;
FIG. 6 is a block diagram of a physical relationship joint extraction device disclosed in the present application;
fig. 7 is a block diagram of an electronic terminal disclosed in the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The current mainstream approach to entity relation extraction is the pipeline (serial) method, which splits the extraction of entities and relations into two subtasks: an entity recognition model first extracts the entities, and a classifier then determines the relations between entity pairs. However, because the serial method is split into two stages, errors in entity recognition propagate into relation extraction, causing error accumulation; at the same time, the subtasks are handled independently, neglecting the correlation between the two tasks. In fact, entity recognition affects relation classification, and relation classification also affects entity recognition: if two words bear some relation, the types of the two entities can be predicted from the type of relation between them. The two tasks are interdependent. On this basis, joint extraction methods have been proposed, which merge the two tasks into one through a joint model and treat relation extraction as the process of extracting entity-relation triples from unstructured text. Existing joint extraction methods eliminate the mutual independence of the two subtasks in the serial method, but their relation extraction is still not accurate enough. Therefore, the present application provides an entity relationship joint extraction scheme that considers the influence of other triples on the currently predicted relationship during entity relation extraction, thereby improving the accuracy of entity relation extraction.
Referring to fig. 1, an embodiment of the present application discloses a method for entity relationship joint extraction, including:
step S11: training sample data is obtained.
Step S12: training a pre-built entity relation extraction model by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process.
In a specific implementation manner, the entity relationship extraction model in this embodiment further includes a BERT (Bidirectional Encoder Representations from Transformers) layer, a NER (Named Entity Recognition) layer, and a table filling layer; correspondingly, training the pre-built entity relation extraction model by using the training sample data comprises the following steps:
step S121: inputting the training sample data to the BERT layer, dividing sentences through the BERT layer, and mapping each divided word into a corresponding word vector so as to obtain the context representation of the sentences.
In a specific embodiment, the present embodiment may divide a sentence through the BERT layer, convert each divided word into a corresponding vector, and then input the converted vector to an encoder for encoding, so as to obtain the word vector. Specifically, a sentence is first divided into words by the WordPiece tokenizer; the input representation of each word is the sum of its token, segment, and position embeddings, and the BERT layer prepends the special tag [CLS] as the first token. Let $x = \{x_1, x_2, \ldots, x_n\}$ denote the word sequence of the sentence, where $x_n$ is a word in the sentence and $n$ is the sentence length. Each word is then mapped by BERT into a word vector: each word is converted into a vector through an embedding layer, and the vector is input into an encoder for encoding to obtain a continuous embedded representation $z = \{z_1, z_2, \ldots, z_n\}$ of each word.
That is, the present embodiment obtains a contextual representation of sentences in the training sample data through BERT.
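As an illustrative, non-limiting sketch of this step (assuming the publicly available HuggingFace transformers library and the bert-base-uncased checkpoint, neither of which is specified by the present application), contextual word vectors can be obtained as follows:

```python
# Sketch only: contextual word vectors z_1..z_n from a pretrained BERT
# encoder; the tokenizer applies WordPiece and prepends [CLS].
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

sentence = "Li Hua was born in Shanghai in 1980"
inputs = tokenizer(sentence, return_tensors="pt")  # token/segment/position ids

with torch.no_grad():
    outputs = encoder(**inputs)

# z: one contextual vector per WordPiece token (including [CLS] and [SEP])
z = outputs.last_hidden_state   # shape: (1, n_tokens, 768)
```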
Step S122: performing a linear CRF (Conditional Random Field) calculation on the word vector through the NER layer to obtain a corresponding predicted entity tag sequence, and converting the predicted entity tag sequence into a corresponding tag embedding sequence.
That is, the present example performs a linear CRF calculation through the NER layer to obtain the most likely entity tag sequence, which is then converted into a corresponding tag embedding sequence $h = \{h_1, h_2, \ldots, h_n\}$.

For each word $x_i$, the score of each entity tag is calculated as:

$$s_i = V_1 f(W_1 z_i + b_z) + b_s$$

where $f(\cdot)$ is the activation function, $W_1$ and $V_1$ are transformation matrices, $b_z$ and $b_s$ are bias vectors, and $l$ is the number of hidden layers. If the predicted entity tag sequence is $y = \{y_1, y_2, \ldots, y_n\}$, the linear CRF score is calculated using the formula

$$s(x, y) = \sum_{i=1}^{n} \left( s_{i, y_i} + T_{y_{i-1}, y_i} \right)$$

where $s_{i, y_i}$ is the score of the word $x_i$ being tagged as $y_i$, $x_i$ is the $i$-th word, $y_i$ is the entity tag corresponding to $x_i$, and $T_{y_{i-1}, y_i}$ is the transition score from entity tag $y_{i-1}$ to entity tag $y_i$, $T$ being the transition matrix. The corresponding probability is

$$p(y \mid x) = \frac{e^{s(x, y)}}{\sum_{\tilde{y} \in Y_x} e^{s(x, \tilde{y})}}$$

where $y = \{y_1, y_2, \ldots, y_n\}$ is the predicted entity tag sequence, $n$ is the number of entity tags, $p(y \mid x)$ is the probability value corresponding to the predicted entity tag sequence calculated using a softmax function, $s(x, y)$ is the linear CRF score corresponding to the predicted entity tag sequence, and $Y_x$ is the set of entity tag sequences corresponding to the word sequence $x$.
That is, the present application performs the linear CRF calculation on all sequences in the entity tag sequence set to obtain the corresponding linear CRF scores, and further obtains the corresponding probability values, so as to determine the predicted entity tag sequence.
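As an illustrative, non-limiting sketch of the linear CRF scoring above (toy weights and a brute-force partition sum; a practical implementation would use the Viterbi and forward algorithms instead of enumeration):

```python
# Sketch only: score a tag sequence with emission and transition scores,
# then normalize over every possible sequence Y_x for a tiny tag set.
import itertools
import numpy as np

n_words, n_tags = 4, 3                      # sentence length, tag-set size
rng = np.random.default_rng(0)
emit = rng.normal(size=(n_words, n_tags))   # s_{i, y_i}: per-word tag scores
trans = rng.normal(size=(n_tags, n_tags))   # T_{y_{i-1}, y_i}: transitions

def crf_score(y):
    """s(x, y) = sum_i ( s_{i, y_i} + T_{y_{i-1}, y_i} )."""
    s = emit[0, y[0]]
    for i in range(1, n_words):
        s += emit[i, y[i]] + trans[y[i - 1], y[i]]
    return s

# p(y | x): softmax of s(x, y) over all tag sequences in Y_x
all_seqs = list(itertools.product(range(n_tags), repeat=n_words))
scores = np.array([crf_score(y) for y in all_seqs])
probs = np.exp(scores - scores.max())
probs /= probs.sum()

best = all_seqs[int(np.argmax(probs))]      # most likely entity tag sequence
print("predicted tag sequence:", best, "p =", probs.max())
```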
Step S123: and splicing the word vector and the tag embedding sequence to obtain a target vector.
Step S124: and predicting the entity relationship of the target vector through the table filling layer to obtain a corresponding predicted entity relationship.
In a specific implementation, the present embodiment may splice the word vector $z_k$ output by the BERT layer and the tag embedding $h_k$ output by the NER layer to obtain the target vector $g_k$, and predict the entity relationship of the target vector through the table filling layer. Specifically, the relationship between any two words $x_i$ and $x_j$ is predicted by the formula $f(U g_j + W g_i + b_r)$, where $U$ and $W$ are transformation matrices and $b_r$ is a bias vector.
Step S125: and inputting the predicted entity relationship into the self-attention layer to perform attention calculation, and obtaining a corresponding entity relationship after attention calculation.
In a specific embodiment, the output matrix of the self-attention layer is calculated as:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{T}}{\sqrt{D}}\right) V$$

where $Q$, $K$, and $V$ are the query, key, and value representations of each input relationship vector, $Q = K = V$, $D$ is the dimension of $Q$ and $K$, and each unit in the sequence performs the attention calculation with all units in the sequence. Concretely, the input relationship vectors are multiplied by initialized weight matrices to obtain $Q$, $K$, and $V$; the attention score of each input vector is obtained by taking the dot product between $K$ and $Q$; softmax is applied over all the attention scores; and finally each softmaxed attention score is multiplied by the corresponding $V$ and the results are summed to obtain the output vector.
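As an illustrative, non-limiting sketch of this scaled dot-product self-attention calculation (toy weights, single head):

```python
# Sketch only: self-attention over the predicted relationship vectors,
# following Attention(Q, K, V) above with Q, K, V projected from one input.
import numpy as np

def self_attention(R, Wq, Wk, Wv):
    """R: (m, d) matrix of m relationship vectors; W*: (d, d) weights."""
    Q, K, V = R @ Wq, R @ Wk, R @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # dot products / sqrt(D)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of values

rng = np.random.default_rng(1)
m, d = 5, 8                                          # 5 relations, dimension 8
R = rng.normal(size=(m, d))
out = self_attention(R, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)                                     # (5, 8): attended relations
```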
Step S126: and performing inner product operation on the entity relationship after the attention calculation and the predefined relationship vector, and then classifying through a multi-label classifier to obtain the entity relationship corresponding to each word.
In a specific embodiment, the inner product is taken between the entity relationship after the attention calculation and each predefined relationship vector, and the relationship between each word and the selected entity is obtained through a sigmoid multi-label classifier. The score of the word $x_i$ and the word $x_j$ having the relation $r_k$ is defined as:

$$s^{(r)}(g_j, g_i, r_k) = V^{(k)} f(U g_j + W g_i + b_r)$$

where $V$, $U$, and $W$ are transformation matrices, $b_r$ is a bias vector, and $g_j = [z_j; h_j]$ is the splice of the BERT output $z_j$ and the tag embedding $h_j$ of the word $x_j$. In the table filling, the probability that the word $x_j$ is the head entity of the word $x_i$ with the relation $r_k$ is estimated as:

$$p_r(x_j, r_k \mid x_i) = \delta\!\left(s^{(r)}(g_j, g_i, r_k)\right)$$

where $\delta$ denotes the sigmoid function.
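As an illustrative, non-limiting sketch of this relation scoring and its sigmoid probability (toy dimensions and weights; the splice $g = [z; h]$ follows the definition above):

```python
# Sketch only: s^{(r)}(g_j, g_i, r_k) = V^{(k)} f(U g_j + W g_i + b_r),
# turned into one independent probability per relation via the sigmoid.
import numpy as np

rng = np.random.default_rng(2)
d_z, d_h, d_r, n_rel = 8, 4, 6, 3        # dims of z, h, hidden layer; #relations
d_g = d_z + d_h

U, W = rng.normal(size=(d_r, d_g)), rng.normal(size=(d_r, d_g))
b_r = rng.normal(size=d_r)
V = rng.normal(size=(n_rel, d_r))        # one row V^{(k)} per relation r_k

def relation_probs(z_i, h_i, z_j, h_j):
    g_i = np.concatenate([z_i, h_i])     # g = [z; h]
    g_j = np.concatenate([z_j, h_j])
    hidden = np.tanh(U @ g_j + W @ g_i + b_r)   # f(U g_j + W g_i + b_r)
    scores = V @ hidden                         # s^{(r)} for every relation
    return 1.0 / (1.0 + np.exp(-scores))        # sigmoid multi-label probs

p = relation_probs(rng.normal(size=d_z), rng.normal(size=d_h),
                   rng.normal(size=d_z), rng.normal(size=d_h))
print(p)   # independent probability per relation; > 0.5 => relation predicted
```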
Step S13: and when the target text to be subjected to entity relation extraction is obtained, outputting a corresponding entity relation extraction result by using the trained model.
It should be noted that the problem of extracting overlapping relations can be effectively solved by the table filling layer.
As can be seen, in the embodiment of the present application, training sample data is first obtained, and then a pre-built entity relationship extraction model is trained by using the training sample data, so as to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current predicted relationship in the training process, and finally outputting a corresponding entity relationship extraction result by utilizing the trained model when a target text to be subjected to entity relationship extraction is obtained. In this way, training the entity relation extraction model comprising the self-attention layer can consider the influence of other triples on the current prediction relation in the extraction process of the entity relation, thereby improving the accuracy of the entity relation extraction.
Referring to fig. 2, an embodiment of the present application discloses a specific entity relationship joint extraction method, which includes:
step S21: training sample data is obtained.
Step S22: training a pre-built entity relation extraction model by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process.
In a specific embodiment, the entity relation extraction model further comprises a BERT layer, a NER layer and a table filling layer; correspondingly, the training the pre-built entity relation extraction model by using the training sample data comprises the following steps: inputting the training sample data to the BERT layer, dividing sentences through the BERT layer, and mapping each divided word into a corresponding word vector to obtain the context representation of the sentences; performing linear CRF calculation on the word vector through the NER layer to obtain a corresponding predicted entity marking sequence, and converting the predicted entity marking sequence into a corresponding tag embedding sequence; splicing the word vector and the tag embedding sequence to obtain a target vector; predicting entity relationship of the target vector through the table filling layer to obtain a corresponding predicted entity relationship; inputting the predicted entity relationship into the self-attention layer for attention calculation to obtain a corresponding entity relationship after attention calculation; and performing inner product operation on the entity relationship after the attention calculation and the predefined relationship vector, and then classifying through a multi-label classifier to obtain the entity relationship corresponding to each word.
Step S23: calculating training loss; the training loss includes tag sequence loss and table filling loss.
In this embodiment, the training loss is calculated by using a training loss function which, for the joint extraction of entity relationships, is defined as the sum of the tag sequence loss, i.e. the NER loss, and the loss of the self-attention-based table filling: $L = L_N + L_{RE}$.
In a specific embodiment, the present application may calculate the tag sequence loss by using a tag sequence loss function; wherein the tag sequence loss function is

$$L_N = -\sum_{(x, y^*) \in \tau} \log p(y^* \mid x)$$

wherein $\tau$ is a training set comprising all of the training sample data, $y^*$ is the manually annotated correct tag sequence for the word sequence $x$, the word sequence being obtained by dividing a sentence through the BERT layer, and $p(y^* \mid x)$ is the probability value corresponding to $y^*$. $p(y^* \mid x)$ is calculated in the same way as $p(y \mid x)$ disclosed in the previous embodiment, and the negative log-likelihood $L_N$ of the manually annotated correct sequence is minimized during training. The tags are converted into tag embeddings by looking up an embedding layer: for the sequence $y = \{y_1, y_2, \ldots, y_n\}$, a tag embedding sequence $h = \{h_1, h_2, \ldots, h_n\}$ is obtained.
Further, the present embodiment may calculate the table filling loss by using a table filling loss function; wherein the table filling loss function is

$$L_{RE} = -\sum_{x \in \tau} \sum_{i=1}^{n} \sum_{j,k} \log p_r\!\left(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i\right)$$

wherein $L_{RE}$ is the table filling loss, $\tau$ is a training set comprising all of the training sample data, $x$ is the word sequence corresponding to a sentence in the training set $\tau$, $\hat{e}_i^{\,j}$ denotes the $j$-th related entity of the word $x_i$ among the manually annotated correct relations of $x_i$ in the training set, $\hat{r}_k$ denotes the $k$-th manually annotated correct relation between $x_i$ and $\hat{e}_i^{\,j}$ in the training set, and $p_r(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i)$ is the probability that the relation $\hat{r}_k$ holds between the words $x_i$ and $\hat{e}_i^{\,j}$.
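As an illustrative, non-limiting sketch of the combined training loss $L = L_N + L_{RE}$ (toy probabilities; in practice the gold cells come from annotated data and the probabilities from the model):

```python
# Sketch only: L_N is the negative log-likelihood of the gold tag sequence
# under the CRF; L_RE sums -log p over the gold cells of the relation table.
import numpy as np

def tag_sequence_loss(p_gold):
    """L_N for one sentence: -log p(y* | x), with p_gold from the CRF softmax."""
    return -np.log(p_gold)

def table_filling_loss(p_table, gold_cells):
    """L_RE for one sentence.

    p_table:    (n, n, n_rel) array of p_r(x_j, r_k | x_i) for every cell.
    gold_cells: list of (i, j, k) indices manually annotated as correct.
    """
    return -sum(np.log(p_table[i, j, k]) for i, j, k in gold_cells)

# Toy usage: a 3-word sentence, 2 relation types, one gold triple
rng = np.random.default_rng(3)
p_table = rng.uniform(0.05, 0.95, size=(3, 3, 2))
loss = tag_sequence_loss(0.8) + table_filling_loss(p_table, [(0, 2, 1)])
print(loss)   # total training loss L = L_N + L_RE for this sentence
```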
Step S24: and when the target text to be subjected to entity relation extraction is obtained, outputting a corresponding entity relation extraction result by using the trained model.
That is, the entity relationship extraction in the embodiment of the present application first uses BERT to preprocess the training data, vectorizes the preprocessed data, and encodes the vectorized data to capture semantic information including context information; calculates the most likely entity tag sequence of a sentence through the NER layer and converts it into tag embeddings; predicts relationships through the table filling; then sends all predicted relationships in the sentence into the self-attention mechanism, comprehensively considering the influence of all other triples in the training sentence on the currently predicted relationship; and finally obtains the relationship between each word and the selected entity through the sigmoid multi-label classifier. Specifically, the entity relationship joint extraction model mainly comprises a BERT layer, a NER layer, a self-attention layer, and a table filling layer. The BERT layer first divides the sentence; the input representation of each word consists of its token, segment, and position embeddings; BERT then maps each word into a word vector: each word is converted into a vector through an embedding layer, and the vector is input into an encoder for encoding to obtain a continuous embedded representation of each word as the output of the BERT layer. The word vector after BERT preprocessing serves as the input of the NER layer; the most likely entity tag sequence is calculated using the linear CRF and then converted into tag embeddings as the output of the NER layer. The output vector of the BERT layer is spliced with the output vector of the NER layer as the input of the table filling layer, and the relationship is predicted according to the relation prediction formula above; all predicted relationship vectors are taken as the input of the self-attention layer, which comprehensively considers the influence of other triples in the sentence on the current relationship, so as to predict the relationship between the current entities more accurately. The table filling layer obtains and outputs the relationship between each word and the selected entity through the sigmoid multi-label classifier. For example, given the training sentence as input: "Li Hua was born in Shanghai in 1980", the output is: (Li Hua, 1980, date of birth), (Li Hua, Shanghai, place of birth).
Therefore, the embodiment of the present application comprehensively considers the influence of other triples in the sentence on the currently predicted relationship by using a self-attention mechanism, so as to better predict the relationship between the current entities. Moreover, relation extraction is implemented in the form of table filling, which can enumerate the relations between any two entities in a sentence, so that cases where one entity bears relations to several other entities are also covered. This extraction strategy remedies the shortcomings of current entity relationship joint extraction and simultaneously improves the precision and recall of entity relationship joint extraction.
For example, referring to fig. 3, fig. 3 is a flowchart of a specific entity relationship joint extraction method according to an embodiment of the present application. For example, referring to fig. 4, fig. 4 is a block diagram illustrating an implementation of a specific entity relationship joint extraction method disclosed in the present application.
Referring to fig. 5, an embodiment of the present application discloses an entity relationship joint extraction device, including:
a data acquisition module 11, configured to acquire training sample data;
the model training module 12 is configured to train a pre-built entity relationship extraction model by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process;
and the relation extraction module 13 is used for outputting a corresponding entity relation extraction result by using the trained model when the target text to be subjected to entity relation extraction is acquired.
Therefore, training sample data is first acquired, and a pre-built entity relation extraction model is then trained by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer used, during training, to perform an attention calculation over the influence of other triples in a sentence on the currently predicted relationship; finally, when a target text to be subjected to entity relation extraction is obtained, a corresponding entity relation extraction result is output by using the trained model. In this way, training an entity relation extraction model comprising a self-attention layer allows the influence of other triples on the currently predicted relationship to be considered during entity relation extraction, thereby improving the accuracy of entity relation extraction.
The entity relation extraction model further comprises a BERT layer, a NER layer and a table filling layer;
correspondingly, the model training module 12 is specifically configured to input the training sample data to the BERT layer, divide sentences through the BERT layer, and map each divided word into a corresponding word vector to obtain a context representation of the sentence; performing linear CRF calculation on the word vector through the NER layer to obtain a corresponding predicted entity marking sequence, and converting the predicted entity marking sequence into a corresponding tag embedding sequence; splicing the word vector and the tag embedding sequence to obtain a target vector; predicting entity relationship of the target vector through the table filling layer to obtain a corresponding predicted entity relationship; inputting the predicted entity relationship into the self-attention layer for attention calculation to obtain a corresponding entity relationship after attention calculation; and performing inner product operation on the entity relationship after the attention calculation and the predefined relationship vector, and then classifying through a multi-label classifier to obtain the entity relationship corresponding to each word.
The entity relationship joint extraction device further comprises a tag sequence loss calculation module, wherein the tag sequence loss calculation module is configured to calculate the tag sequence loss by using a tag sequence loss function; wherein the tag sequence loss function is

$$L_N = -\sum_{(x, y^*) \in \tau} \log p(y^* \mid x)$$

wherein $\tau$ is a training set comprising all of the training sample data, $y^*$ is the manually annotated correct tag sequence for the word sequence $x$, the word sequence being obtained by dividing a sentence through the BERT layer, and $p(y^* \mid x)$ is the probability value corresponding to $y^*$.
The entity relationship joint extraction device further comprises a table filling loss calculation module, wherein the table filling loss calculation module is configured to calculate the table filling loss by using a table filling loss function; wherein the table filling loss function is

$$L_{RE} = -\sum_{x \in \tau} \sum_{i=1}^{n} \sum_{j,k} \log p_r\!\left(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i\right)$$

wherein $L_{RE}$ is the table filling loss, $\tau$ is a training set comprising all of the training sample data, $x$ is the word sequence corresponding to a sentence in the training set $\tau$, $\hat{e}_i^{\,j}$ denotes the $j$-th related entity of the word $x_i$ among the manually annotated correct relations of $x_i$ in the training set, $\hat{r}_k$ denotes the $k$-th manually annotated correct relation between $x_i$ and $\hat{e}_i^{\,j}$ in the training set, and $p_r(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i)$ is the probability that the relation $\hat{r}_k$ holds between the words $x_i$ and $\hat{e}_i^{\,j}$.
Further, the model training module 12 is specifically configured to divide a sentence through the BERT layer, convert each divided word into a corresponding vector, and then input the converted vector to an encoder for encoding, so as to obtain the word vector.
The entity relationship joint extraction device further comprises a training loss calculation module for calculating training loss; the training loss includes tag sequence loss and table filling loss.
Referring to fig. 6, an embodiment of the present application discloses an entity relationship joint extraction device, including a processor 21 and a memory 22; wherein the memory 22 is used for storing a computer program; the processor 21 is configured to execute the computer program to implement the entity relationship joint extraction method disclosed in the foregoing embodiment.
For the specific process of the entity relationship joint extraction method, reference may be made to the corresponding content disclosed in the foregoing embodiment, and no further description is given here.
Referring to fig. 7, an embodiment of the present application discloses an electronic terminal 20 including a processor 21 and a memory 22 as disclosed in the foregoing embodiments. The steps that the processor 21 may specifically perform may refer to the corresponding contents disclosed in the foregoing embodiments, and will not be described herein.
Further, the electronic terminal 20 in the present embodiment may further specifically include a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26; wherein, the power supply 23 is used for providing working voltage for each hardware device on the terminal 20; the communication interface 24 can create a data transmission channel between the terminal 20 and an external device, and the communication protocol to be followed is any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 25 is used for acquiring external input data or outputting external output data, and the specific interface type thereof may be selected according to the specific application requirement, which is not limited herein.
Further, the embodiment of the application also discloses a computer readable storage medium for storing a computer program, wherein the computer program is executed by a processor to implement the entity relationship joint extraction method disclosed in the previous embodiment.
For the specific process of the entity relationship joint extraction method, reference may be made to the corresponding content disclosed in the foregoing embodiment, and no further description is given here.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above detailed description of a method, device, apparatus and medium for entity relationship joint extraction provided in the present application applies specific examples to illustrate the principles and embodiments of the present application, where the above description of the embodiments is only for helping to understand the method and core ideas of the present application; meanwhile, as those skilled in the art will have modifications in the specific embodiments and application scope in accordance with the ideas of the present application, the present description should not be construed as limiting the present application in view of the above.

Claims (9)

1. An entity relationship joint extraction method, characterized by comprising the following steps:
acquiring training sample data;
training a pre-built entity relation extraction model by using the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process;
when a target text to be subjected to entity relation extraction is obtained, outputting a corresponding entity relation extraction result by using the trained model;
the entity relation extraction model further comprises a BERT layer, a NER layer and a table filling layer; correspondingly, the training the pre-built entity relation extraction model by using the training sample data comprises the following steps: inputting the training sample data to the BERT layer, dividing sentences through the BERT layer, and mapping each divided word into a corresponding word vector to obtain the context representation of the sentences; performing linear CRF calculation on the word vector through the NER layer to obtain a corresponding predicted entity marking sequence, and converting the predicted entity marking sequence into a corresponding tag embedding sequence; splicing the word vector and the tag embedding sequence to obtain a target vector; predicting entity relationship of the target vector through the table filling layer to obtain a corresponding predicted entity relationship; inputting the predicted entity relationship into the self-attention layer for attention calculation to obtain a corresponding entity relationship after attention calculation; and performing inner product operation on the entity relationship after the attention calculation and the predefined relationship vector, and then classifying through a multi-label classifier to obtain the entity relationship corresponding to each word.
2. The method for entity-relationship joint extraction according to claim 1, further comprising:
calculating a tag sequence loss by using a tag sequence loss function; wherein the tag sequence loss function is

$$L_N = -\sum_{(x, y^*) \in \tau} \log p(y^* \mid x)$$

wherein $\tau$ is a training set comprising all of the training sample data, $y^*$ is the manually annotated correct tag sequence for the word sequence $x$, the word sequence being a sequence obtained by dividing a sentence through the BERT layer, and $p(y^* \mid x)$ is the probability value corresponding to $y^*$.
3. The method for entity-relationship joint extraction according to claim 1, further comprising:
calculating a table filling loss by using a table filling loss function; wherein the table filling loss function is

$$L_{RE} = -\sum_{x \in \tau} \sum_{i=1}^{n} \sum_{j,k} \log p_r\!\left(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i\right)$$

wherein $L_{RE}$ is the table filling loss, $\tau$ is a training set comprising all of the training sample data, $x$ is the word sequence corresponding to a sentence in the training set $\tau$, $\hat{e}_i^{\,j}$ denotes the $j$-th related entity of the word $x_i$ among the manually annotated correct relations of $x_i$ in the training set, $\hat{r}_k$ denotes the $k$-th manually annotated correct relation between $x_i$ and $\hat{e}_i^{\,j}$ in the training set, and $p_r(\hat{e}_i^{\,j}, \hat{r}_k \mid x_i)$ is the probability that the relation $\hat{r}_k$ holds between the words $x_i$ and $\hat{e}_i^{\,j}$.
4. The method of claim 1, wherein the dividing sentences by the BERT layer and mapping each divided word into a corresponding word vector comprises:
dividing sentences through the BERT layer, converting each divided word into a corresponding vector, and inputting the converted vector into an encoder for encoding so as to obtain the word vector.
5. The method for entity-relationship joint extraction according to claim 1, further comprising:
calculating training loss; the training loss includes tag sequence loss and table filling loss.
6. An entity relationship joint extraction device, comprising:
the data acquisition module is used for acquiring training sample data;
the model training module is used for training a pre-built entity relation extraction model by utilizing the training sample data to obtain a trained model; wherein the entity relation extraction model comprises a self-attention layer; the self-attention layer is used for carrying out attention calculation on the influence of other triples in sentences on the current prediction relation in the training process;
the relation extraction module is used for outputting a corresponding entity relation extraction result by utilizing the trained model when a target text to be subjected to entity relation extraction is acquired;
the entity relation extraction model further comprises a BERT layer, a NER layer and a table filling layer; correspondingly, the training the pre-built entity relation extraction model by using the training sample data comprises the following steps: inputting the training sample data to the BERT layer, dividing sentences through the BERT layer, and mapping each divided word into a corresponding word vector to obtain the context representation of the sentences; performing linear CRF calculation on the word vector through the NER layer to obtain a corresponding predicted entity marking sequence, and converting the predicted entity marking sequence into a corresponding tag embedding sequence; splicing the word vector and the tag embedding sequence to obtain a target vector; predicting entity relationship of the target vector through the table filling layer to obtain a corresponding predicted entity relationship; inputting the predicted entity relationship into the self-attention layer for attention calculation to obtain a corresponding entity relationship after attention calculation; and performing inner product operation on the entity relationship after the attention calculation and the predefined relationship vector, and then classifying through a multi-label classifier to obtain the entity relationship corresponding to each word.
7. The entity relationship joint extraction device of claim 6, further comprising a training loss calculation module configured to calculate a training loss; the training loss includes the tag sequence loss and the table filling loss.
8. An entity relationship joint extraction device is characterized by comprising a processor and a memory; wherein,
the memory is used for storing a computer program;
the processor is configured to execute the computer program to implement the entity-relationship joint extraction method according to any one of claims 1 to 5.
9. A computer readable storage medium for storing a computer program, wherein the computer program when executed by a processor implements the entity-relationship joint extraction method of any one of claims 1 to 5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010538132.5A CN111666427B (en) 2020-06-12 2020-06-12 Entity relationship joint extraction method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN111666427A CN111666427A (en) 2020-09-15
CN111666427B (en) 2023-05-12

Family

ID=72387352

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010538132.5A Active CN111666427B (en) 2020-06-12 2020-06-12 Entity relationship joint extraction method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN111666427B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114266245A (en) * 2020-09-16 2022-04-01 北京金山数字娱乐科技有限公司 Entity linking method and device
CN112163092B (en) * 2020-10-10 2022-07-12 成都数之联科技股份有限公司 Entity and relation extraction method, system, device and medium
CN112395407B (en) * 2020-11-03 2023-09-19 杭州未名信科科技有限公司 Business entity relation extraction method, device and storage medium
CN112819622B (en) * 2021-01-26 2023-10-17 深圳价值在线信息科技股份有限公司 Information entity relationship joint extraction method and device and terminal equipment
CN112818676B (en) * 2021-02-02 2023-09-26 东北大学 Medical entity relationship joint extraction method
CN112883736A (en) * 2021-02-22 2021-06-01 零氪科技(北京)有限公司 Medical entity relationship extraction method and device
CN112989788A (en) * 2021-03-12 2021-06-18 平安科技(深圳)有限公司 Method, device, equipment and medium for extracting relation triples
CN113806493B (en) * 2021-10-09 2023-08-29 中国人民解放军国防科技大学 Entity relationship joint extraction method and device for Internet text data
CN114548325B (en) * 2022-04-26 2022-08-02 北京大学 Zero sample relation extraction method and system based on dual contrast learning
CN115169350B (en) * 2022-07-14 2024-03-12 中国电信股份有限公司 Method, device, equipment, medium and program for processing information

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165385A (en) * 2018-08-29 2019-01-08 中国人民解放军国防科技大学 Multi-triple extraction method based on entity relationship joint extraction model
CN109670050A (en) * 2018-12-12 2019-04-23 科大讯飞股份有限公司 A kind of entity relationship prediction technique and device
CN109902145A (en) * 2019-01-18 2019-06-18 中国科学院信息工程研究所 A kind of entity relationship joint abstracting method and system based on attention mechanism
GB201904161D0 (en) * 2019-03-26 2019-05-08 Benevolentai Tech Limited Entity type identification for named entity recognition systems
CN110059320A (en) * 2019-04-23 2019-07-26 腾讯科技(深圳)有限公司 Entity relation extraction method, apparatus, computer equipment and storage medium
CN111178074A (en) * 2019-12-12 2020-05-19 天津大学 Deep learning-based Chinese named entity recognition method
CN111160008A (en) * 2019-12-18 2020-05-15 华南理工大学 Entity relationship joint extraction method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zhang Xiaohai; Cao Xinwen; Zhang Min. Military named entity recognition based on the self-attention mechanism. Command Control & Simulation, 2019, (06): 35-39. *
Li Weijiang; Li Tao; Qi Fang. Chinese entity relation extraction based on multi-feature self-attention BLSTM. Journal of Chinese Information Processing, 2019, 33: 47-56, 72. *

Also Published As

Publication number Publication date
CN111666427A (en) 2020-09-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant