CN111831783B - Method for extracting chapter-level relation

Info

Publication number: CN111831783B
Application number: CN202010644404.XA
Authority: CN
Inventors: 张世琨; 叶蔚; 李博; 胡文蕙; 张君福
Original assignee: Beijing Peking University Software Engineering Co ltd
Current assignee: Beijing Peking University Software Engineering Co ltd
Priority date: 2020-07-07
Filing date: 2020-07-07
Publication date: 2023-12-08
Anticipated expiration: 2040-07-07
Also published as: CN111831783A

Abstract

The invention provides a chapter-level relation extraction method in the technical field of natural language processing. It mainly addresses two technical problems that arise for chapter-level documents: the consumption of computing resources, and logical reasoning between target entities and non-target entities. The method comprises the following steps: inputting a document to be processed, wherein the document is a chapter-level document; processing the document based on a bidirectional attention constraint to obtain abstract semantic representations of entities and sentences, wherein the abstract semantic representations carry global information and logical reasoning information; and determining the relation type of the target entity pair in the document based on the abstract semantic representations. With this method, a developer can extract chapter-level relations efficiently and accurately, while addressing the two main problems of chapter-level relation extraction: the computational cost of traversing all entities to generate candidate samples, and the logical reasoning between target entities and non-target entities.

Description

Method for extracting chapter-level relation

Technical Field

The invention relates to the technical field of natural language processing, and in particular to a chapter-level relation extraction method at the intersection of bidirectional attention constraints and graph convolutional neural networks. The invention obtains abstract representations of the entities and of the document under the constraint of a bidirectional attention mechanism, and uses a graph convolutional neural network for logical reasoning.

Background

Relation extraction determines, from a given context and two target entities, what relationship holds between the two target entities. It is one of the most important techniques for constructing large-scale knowledge graphs, and it can also assist downstream tasks such as question-answering systems. Research on relation extraction falls into two main directions: traditional machine learning approaches that rely on extensive manual feature engineering, and deep learning approaches that build neural networks.

In traditional machine learning solutions, researchers construct different features for the entire sentence and for the given target entity pair. These solutions depend heavily on feature engineering and expert knowledge, so a large amount of manual effort is needed to construct features, and the generalization ability of the model can fluctuate greatly from one data set to another. In recent years, therefore, many researchers have turned to deep learning to solve the relation extraction problem. Deep learning research has focused on applying convolutional neural networks (CNNs), recurrent neural networks (RNNs), and attention mechanisms, and on improving these models, which has produced a number of deep-learning-based solutions.

Existing task definitions and methods for relation extraction are all sentence-level: taking one sentence as the context, they determine the relationship between two target entities that appear in that sentence. In practical application scenarios, however, the text to be processed is usually not as simple as a single sentence; chapter-length text is more common. A large number of entities appear in each chapter, and each target entity may appear multiple times in one document, so some classical techniques of sentence-level relation extraction, such as entity position vectors and PCNN, are not applicable. In addition, chapter-level text often has a length of hundreds or even thousands, far greater than sentence-level text. Such overlong text makes it difficult to extract information from the whole document and to capture long-distance dependencies, and it is hard to obtain good representations of the entities and good interaction between them, which greatly reduces the effectiveness of chapter-level relation extraction models. Furthermore, because a large number of entities exist, every combination of entities is a potential target entity pair; constructing a sentence-and-target-entity sample for every entity combination in each article and then applying a sentence-level relation extraction model greatly increases the computational cost.

More importantly, as the text grows longer and the input document contains a large number of entities, the relationship between two target entities may be obtained not only through their direct interaction but also through logical reasoning over non-target entities, requiring one or more reasoning steps among the entities. Existing methods therefore cannot produce accurate classification results for relationships that require reasoning.

Disclosure of Invention

One of the purposes of the invention is to provide a chapter-level relation extraction method that solves two technical problems of the prior art for chapter-level documents: the consumption of computing resources, and logical reasoning between target entities and non-target entities. The preferred embodiments of the invention can achieve numerous advantageous effects, as described in detail below.

In order to achieve the above purpose, the present invention provides the following technical solutions:

the invention relates to a chapter-level relation extraction method, which comprises the following steps:

inputting a document to be processed, wherein the document is a chapter-level document;

processing the document based on bidirectional attention constraint to obtain abstract semantic representations of entities and sentences, wherein the abstract semantic representations have global information and logical reasoning information;

and judging the relation type of the target entity pair in the document based on the abstract semantic representation.

Further, the processing the document based on the bidirectional attention constraint to obtain an abstract semantic representation of an entity and a sentence includes:

obtaining a representation of each entity and a representation of each sentence in the document;

acquiring a global entity interaction matrix based on the representation of each entity;

calculating the global entity interaction matrix and the representation of each sentence based on a first attention mechanism to obtain the representation of each sentence based on entity attention;

calculating the representation of each sentence based on the entity attention based on the graph convolution network to obtain the representation of each sentence with logic reasoning information;

based on a second attention mechanism, calculating the global entity interaction matrix and the representation of each sentence with logical reasoning information to obtain a new global entity interaction matrix with global information and logical reasoning information, and taking the new global entity interaction matrix as the abstract semantic representation of the entity and the sentence.

Further, the obtaining a representation of each entity and a representation of each sentence in the document includes:

acquiring an abstract semantic representation of the document;

a representation of each entity and a representation of each sentence in the document is obtained based on the abstract semantic representation of the document.

Further, the obtaining a representation of each sentence in the document based on the abstract semantic representation of the document includes:

and extracting the composition of each sentence from the abstract semantic representation of the document according to the starting position and the ending position of each sentence, carrying out maximum pooling processing on the composition of each sentence, and taking a sentence vector obtained after the maximum pooling processing as the representation of each sentence.

Further, the obtaining a representation of each entity in the document based on the abstract semantic representation of the document includes:

and determining words forming the entities according to the positions of each entity, calculating an average word vector of the words, taking the average word vector as an entity vector, and taking the average vector of the entity vectors of the same entity as the representation of each entity.

Further, the obtaining the abstract semantic representation of the document includes:

performing word vector conversion on the document to obtain a word vector matrix of the document;

and carrying out bidirectional LSTM operation on the word vector matrix to obtain the abstract semantic representation of the document.

Further, the obtaining a global entity interaction matrix based on the representation of each entity includes:

information interaction is carried out on every two entities, and two entity vectors are converted into one entity interaction vector;

and forming a global entity interaction matrix by all entity interaction vectors.

Further, the matrix weights adopted by the first attention mechanism and the second attention mechanism are the same.

Further, the determining the relationship type of the target entity pair in the document based on the abstract semantic representation includes:

performing column-wise weighting operation on the new global entity interaction matrix to obtain a global entity interaction vector;

and inputting the global entity interaction vector into a preset classification function for distinguishing the relationship types of the entity pairs to obtain a classification result, and taking the classification result as the relationship type of the target entity pair in the document.

Further, the classification function employs a softmax function.

The chapter-level relation extraction method provided by the invention has at least the following beneficial technical effects:

By representing each entity and encoding the relative positions among the entities, the invention can obtain all entity representations efficiently and simultaneously, no matter how many different entities the current document contains, and can then output the relation results of all potential entity pairs in the document in a single pass. In addition, the invention applies a bidirectional attention mechanism to the problem of logical reasoning between non-target entities: it first obtains an encoded representation of each sentence and of each entity, and then, by traversing the entities, determines for each entity pair combination whether a relationship exists and, if so, which relationship in the given relationship set it is.

The developer can efficiently and accurately extract the chapter-level relation by using the method disclosed by the invention, and simultaneously solves two main problems of chapter-level relation extraction, namely the problem of calculation cost caused by traversing all entities to generate alternative samples and the problem of logic reasoning between target entities and non-target entities.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a chapter level relationship extraction method of the present invention;

FIG. 2 is a schematic diagram of a chapter level relationship extraction method of the present invention;

FIG. 3 is a schematic flow chart of the present invention for processing the document based on bi-directional attention constraints;

FIG. 4 is a schematic representation of an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention are described in detail below. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments based on the examples herein fall within the scope of the invention as defined by the claims.

Referring to FIG. 1 and FIG. 2, the present invention is a chapter-level relation extraction method, which includes the following steps:

S1: inputting a document to be processed, wherein the document is a chapter-level document;

S2: processing the document based on bidirectional attention constraint to obtain abstract semantic representations of entities and sentences, wherein the abstract semantic representations have global information and logical reasoning information;

S3: judging the relation type of the target entity pair in the document based on the abstract semantic representation.

It should be noted that the bidirectional attention mechanism is based on the following intuitive observation: if a given sentence contains a large amount of information that is decisive for the target entity pair, then the target entity pair, in turn, will be found to be more closely related to that sentence when its relationship with each sentence in the document is calculated.

The invention is mainly aimed at chapter-level documents, that is, texts whose length is often hundreds or thousands, far greater than sentence-level text, and which contain a large number of target entities. In the relation extraction method for target entities in a chapter-level document based on bidirectional attention constraint, the input chapter-level document undergoes a series of transformations and information extraction steps, which finally determine whether a target entity pair combination has a relationship and, if so, which relationship in the given relationship set it is. With this method, a developer can extract chapter-level relations efficiently and accurately, while addressing the two main problems of chapter-level relation extraction: the computational cost of traversing all entities to generate candidate samples, and the logical reasoning between target entities and non-target entities.

Referring to FIG. 3, the processing the document based on the bidirectional attention constraint to obtain an abstract semantic representation of entities and sentences includes:

S21: obtaining a representation of each entity and a representation of each sentence in the document;

S22: acquiring a global entity interaction matrix based on the representation of each entity;

S23: calculating the global entity interaction matrix and the representation of each sentence based on a first attention mechanism to obtain the representation of each sentence based on entity attention;

S24: calculating the representation of each sentence based on the entity attention based on the graph convolution network to obtain the representation of each sentence with logic reasoning information;

S25: based on a second attention mechanism, calculating the global entity interaction matrix and the representation of each sentence with logical reasoning information to obtain a new global entity interaction matrix with global information and logical reasoning information, and taking the new global entity interaction matrix as the abstract semantic representation of the entity and the sentence.

In step S21, the obtaining a representation of each entity and a representation of each sentence in the document includes:

acquiring an abstract semantic representation of the document;

Wherein the obtaining the abstract semantic representation of the document comprises:

It should be noted that the input document to be processed is encoded with a pre-trained word vector matrix, which converts each word into a word vector and yields a word vector representation of the document. An abstract semantic representation of the document is then obtained through a bidirectional LSTM.

Suppose the document contains n words, m sentences, and k different entities. Let the i-th word be denoted $w_i$. There is a pre-trained word vector matrix of size $N \times d$, where N is the number of words in the word vector matrix and d is the dimension of each word vector. Each word is converted into a word vector using the pre-trained word vector matrix: for each $w_i$, the corresponding word vector is looked up, and the word vectors of all words in the document are concatenated to obtain the word vector representation of the whole document, whose size is $n \times d$.
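The lookup described above can be sketched in a few lines of NumPy. This is only an illustration: the vocabulary size, the dimension, the random matrix standing in for pre-trained vectors, and the word ids are all hypothetical values that do not come from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

N, d = 1000, 50                      # hypothetical vocabulary size and vector dimension
embedding = rng.normal(size=(N, d))  # stand-in for the pre-trained N x d word vector matrix

# Hypothetical vocabulary ids of the n = 6 words w_1 .. w_n of a tiny document.
word_ids = np.array([12, 7, 7, 431, 5, 0])

# Look up one row per word and stack the rows: the result is n x d.
doc_vectors = embedding[word_ids]

print(doc_vectors.shape)  # (6, 50)
```

Repeated words (ids 7 and 7 here) receive identical rows, since the lookup is context-free; context only enters in the subsequent bidirectional LSTM.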

Assuming that the number of hidden units in the bidirectional LSTM is h, the abstract semantic representation of the document after the bidirectional LSTM is denoted H, and its size is $n \times 2h$.
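To make the $n \times 2h$ shape concrete, the following is a minimal NumPy sketch of a bidirectional LSTM forward pass. It assumes a standard LSTM cell with the gate order (input, forget, output, candidate), zero initial state, and random untrained parameters; the patent does not specify these details, and the function names `lstm_pass` and `bilstm` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_pass(X, W, U, b, h_dim):
    """Run a single-direction LSTM over the rows of X (n x d); returns n x h."""
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    out = np.zeros((X.shape[0], h_dim))
    for t in range(X.shape[0]):
        z = X[t] @ W + h @ U + b            # all four gates at once, shape (4h,)
        i, f, o, g = np.split(z, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g                   # update the cell state
        h = o * np.tanh(c)                  # new hidden state
        out[t] = h
    return out

def bilstm(X, params_fwd, params_bwd, h_dim):
    """Concatenate a left-to-right and a right-to-left pass: n x 2h."""
    fwd = lstm_pass(X, *params_fwd, h_dim)
    bwd = lstm_pass(X[::-1], *params_bwd, h_dim)[::-1]  # realign to word order
    return np.concatenate([fwd, bwd], axis=1)

def init_params(d, h_dim, rng):
    # Random, untrained parameters used only to exercise the shapes.
    return (rng.normal(scale=0.1, size=(d, 4 * h_dim)),
            rng.normal(scale=0.1, size=(h_dim, 4 * h_dim)),
            np.zeros(4 * h_dim))

rng = np.random.default_rng(0)
n, d, h = 6, 50, 8                      # hypothetical sizes
X = rng.normal(size=(n, d))             # stand-in for the n x d word vector matrix
fwd, bwd = init_params(d, h, rng), init_params(d, h, rng)
H = bilstm(X, fwd, bwd, h)
print(H.shape)  # (6, 16)
```

In practice a library layer would normally be used instead of a hand-written loop; the sketch is only meant to show why each word ends up with a vector of length 2h.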

The obtaining a representation of each sentence in the document based on the abstract semantic representation of the document includes:

It should be noted that, because the starting position and the ending position of each sentence are given in advance for the document to be processed, each sentence can be extracted directly from the document, and the vector representation of each sentence is obtained with a max pooling operation.

Assuming that a certain sentence in the document starts at the a-th word and ends at the b-th word, the sentence is represented as:

$$s = \mathrm{maxpool}(H_a, H_{a+1}, \ldots, H_b)$$

where $H_t$ is the row of H for the t-th word, the maximum is taken element-wise, and the size of $s$ is $1 \times 2h$.
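As a small worked example of the max pooling step, the sketch below pools the rows of a toy document representation over two sentence spans. The helper name `sentence_vector`, the toy values, and the use of 0-based inclusive indices (the text counts words from 1) are assumptions made for illustration.

```python
import numpy as np

def sentence_vector(H, a, b):
    """Element-wise max over rows a..b (inclusive, 0-based) of H."""
    return H[a:b + 1].max(axis=0)

# Toy document representation: n = 4 words, 2h = 4 features per word.
H = np.array([[ 0.1, -0.3,  0.5,  0.0],
              [ 0.4,  0.2, -0.1,  0.3],
              [-0.2,  0.6,  0.2, -0.5],
              [ 0.0,  0.1,  0.9,  0.7]])

spans = [(0, 1), (2, 3)]   # hypothetical (start, end) positions of m = 2 sentences
S = np.stack([sentence_vector(H, a, b) for a, b in spans])

print(S)
# [[0.4 0.2 0.5 0.3]
#  [0.  0.6 0.9 0.7]]
```

Each row of `S` keeps, per feature, the strongest activation found anywhere in the sentence, so the result has the same width as a single word representation regardless of sentence length.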

The obtaining a representation of each entity in the document based on the abstract semantic representation of the document includes:

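It should be noted that the words forming each occurrence of an entity are determined from the position of that occurrence, and the average of their word vectors is taken as the entity vector of that occurrence; the average of the entity vectors of the same entity is then taken as the representation of the entity. For example, assuming that a given target entity appears 3 times, with entity vectors $e_1^1$, $e_2^1$ and $e_3^1$, the representation of the current entity is $e^1 = \frac{1}{3}(e_1^1 + e_2^1 + e_3^1)$, whose size is $1 \times 2h$.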
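The two-level averaging (words within an occurrence, then occurrences of the same entity) can be illustrated as follows. The function name `entity_vector`, the toy matrix, and the 0-based inclusive spans are hypothetical choices for this sketch.

```python
import numpy as np

def entity_vector(H, mentions):
    """mentions: (start, end) inclusive, 0-based word spans of ONE entity."""
    # Average the word vectors inside each occurrence ...
    mention_vecs = [H[s:e + 1].mean(axis=0) for s, e in mentions]
    # ... then average over all occurrences of the entity.
    return np.mean(mention_vecs, axis=0)

# Toy document representation: 5 words, 2 features per word.
H = np.array([[1.0,  2.0],
              [3.0,  4.0],
              [5.0,  6.0],
              [7.0,  8.0],
              [9.0, 10.0]])

# One entity that appears 3 times: words 0-1, word 2, and word 4.
e = entity_vector(H, [(0, 1), (2, 2), (4, 4)])
print(e)  # occurrence vectors [2, 3], [5, 6], [9, 10] average to [16/3, 19/3]
```

Averaging per occurrence first means a long multi-word occurrence does not outweigh a short one in the final entity representation.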

In step S22, the obtaining a global entity interaction matrix based on the representation of each entity includes:

It should be noted that all entity pairs $(e_i, e_j)$ in the current document are traversed, taking the order of the two entities (that is, the direction of the pair) into account. Information interaction is then performed between the two entities: a bilinear layer transforms the two entity vectors into a higher-level entity interaction vector $v_{ij}$.

Assuming that the two entity vectors are $e_i$ and $e_j$, the operation of the bilinear layer is

$$v_{ij} = \sigma(e_i W_v e_j + b_v)$$

where $W_v$ and $b_v$ are trainable parameters.

All entity interaction vectors together form the global entity interaction matrix. Assuming that there are k entities in the current document, the constructed global entity interaction matrix contains $k(k-1)$ interaction vectors.
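A possible reading of the bilinear layer is sketched below. For the result to be a vector rather than a scalar, the sketch assumes that $W_v$ is a three-way tensor of shape (p, D, D) with one bilinear form per output feature, and that σ is the sigmoid function; neither assumption is stated in the text, and the sizes and the name `global_interaction_matrix` are hypothetical.

```python
import numpy as np
from itertools import permutations

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def global_interaction_matrix(E, W_v, b_v):
    """E: k x D entity vectors. Returns the ordered pairs and a k(k-1) x p matrix."""
    k = E.shape[0]
    pairs = list(permutations(range(k), 2))   # ordered pairs, so (i, j) and (j, i) both occur
    rows = [sigmoid(np.einsum("a,oab,b->o", E[i], W_v, E[j]) + b_v)
            for i, j in pairs]
    return pairs, np.stack(rows)

rng = np.random.default_rng(0)
k, D, p = 4, 6, 5                             # hypothetical sizes
E = rng.normal(size=(k, D))                   # stand-in entity representations
W_v = rng.normal(scale=0.1, size=(p, D, D))   # assumed tensor form of W_v
b_v = np.zeros(p)

pairs, V = global_interaction_matrix(E, W_v, b_v)
print(len(pairs), V.shape)  # 12 (12, 5)
```

Because the pairs are ordered, the vector for (i, j) generally differs from the vector for (j, i), which is how the direction of a relation can be represented.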

Step S23: calculating the global entity interaction matrix and the representation of each sentence based on a first attention mechanism to obtain the representation of each sentence based on entity attention; the method comprises the following specific steps:

An attention operation is performed between each sentence representation and the global entity interaction matrix. This gives the method a certain degree of logical reasoning capability and yields the entity-attention-based representation of each sentence.

Specifically, each sentence has the representation obtained in step S21. Let v denote the representation of one entity interaction vector. An attention weight is calculated between the current sentence and v; calculating this weight for the current sentence against every entity interaction vector gives $k(k-1)$ attention weights. All entity interaction vectors are then weighted by these attention weights to obtain the representation of each sentence based on attention over the entity interaction vectors. The representations of all sentences are combined by columns into a feature matrix X.
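The first attention step can be sketched as follows. The scoring function is not recoverable from this text, so the sketch assumes a bilinear score $s W v^\top$ normalised with softmax over the entity interaction vectors; `W_att`, the sizes, and the function names are hypothetical. The sketch stores one sentence per row, whereas the text combines sentences by columns.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sentence_over_entities(S, V, W_att):
    """S: m x D sentences, V: P x p interaction vectors, W_att: D x p (assumed)."""
    scores = S @ W_att @ V.T          # m x P: one score per (sentence, interaction vector)
    alpha = softmax(scores, axis=1)   # each sentence's P weights sum to 1
    X = alpha @ V                     # m x p: weighted sum of interaction vectors
    return alpha, X

rng = np.random.default_rng(0)
m, D, P, p = 3, 6, 12, 5              # hypothetical sizes; P = k(k-1) with k = 4
S = rng.normal(size=(m, D))
V = rng.normal(size=(P, p))
W_att = rng.normal(scale=0.1, size=(D, p))

alpha, X = sentence_over_entities(S, V, W_att)
print(alpha.shape, X.shape)  # (3, 12) (3, 5)
```

After this step each sentence is described in terms of the entity pairs it attends to, which is what the later graph convolution operates on.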

Step S24: calculating the representation of each sentence based on the entity attention based on the graph convolution network to obtain the representation of each sentence with logic reasoning information; the method comprises the following specific steps:

A graph convolution operation is performed on the feature matrix X of the sentences through the following transformation: $L = \rho(A X W_2)$, where ρ is a nonlinear activation function, such as the sigmoid function, and $W_2$ is a trainable parameter. The resulting L has the same size as X, and each column of it is the representation $l$ of one sentence.

It should be noted that, by passing through one graph convolutional neural network layer, the sequential connection information and the logical reasoning information between sentences are obtained. After this layer, each sentence representation carries sequential connection information and logical reasoning information.
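A minimal sketch of one graph convolution layer of the form $L = \rho(A X W_2)$ follows. The text does not define A, so the sketch assumes a row-normalised adjacency matrix that links each sentence to itself and to its immediate neighbours, which is one simple way to encode sequential connections. It also stores one sentence per row (the text uses columns) and takes ρ to be the sigmoid; all names and sizes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def chain_adjacency(m):
    """Assumed A: self-loops plus links between adjacent sentences, row-normalised."""
    A = np.eye(m)
    idx = np.arange(m - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return A / A.sum(axis=1, keepdims=True)

def graph_conv(A, X, W2):
    """One layer L = rho(A X W2); with a square W2, L has the same shape as X."""
    return sigmoid(A @ X @ W2)

rng = np.random.default_rng(0)
m, p = 4, 5                              # hypothetical: 4 sentences, 5 features each
X = rng.normal(size=(m, p))              # stand-in for the sentence feature matrix
W2 = rng.normal(scale=0.1, size=(p, p))  # square so that L keeps the size of X

A = chain_adjacency(m)
L = graph_conv(A, X, W2)
print(L.shape)  # (4, 5)
```

With this choice of A, each output row mixes a sentence with its neighbours only; stacking several layers, or using a denser A, would let information travel between more distant sentences.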

Step S25: based on a second attention mechanism, calculating the global entity interaction matrix and the representation of each sentence with logical reasoning information to obtain a new global entity interaction matrix with global information and logical reasoning information, and taking the new global entity interaction matrix as the abstract semantic representation of the entity and the sentence. The method specifically comprises the following steps:

Specifically, step S24 gives the representation of each sentence with logical reasoning information. Let v denote the representation of one entity interaction vector. An attention weight is calculated between the current entity interaction vector and each sentence, giving m attention weights. All sentence vectors are then weighted by these attention weights to obtain the sentence-attention-based representation of each entity interaction vector.

It should be noted that the matrix weights adopted by the first attention mechanism and the second attention mechanism are the same. This is because, given two target entities and a context, the degree of association between the target entities and the context should be consistent with the degree of association between the context and the target entities, so the matrices produced by the two attention mechanisms are constrained to be as close as possible. Through the bidirectional attention constraint, a representation of each entity with respect to each sentence is derived, and a new global entity interaction matrix with global information and logical reasoning information is then obtained, which is used for relation classification.
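The second attention direction and the closeness constraint can be sketched as below. As with the first direction, the bilinear score is an assumption, and so is the mean squared difference used here to measure how far apart the two attention matrices are; the text only states that the weights are shared and that the two results are constrained to be close. For the same `W_att` to be usable in both directions, the sketch assumes all vectors share one dimension D.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def entities_over_sentences(V, L, W_att):
    """V: P x D interaction vectors, L: m x D sentences, W_att: D x D (shared)."""
    scores = (L @ W_att @ V.T).T     # P x m, same bilinear form as the other direction
    beta = softmax(scores, axis=1)   # each interaction vector's m weights sum to 1
    return beta, beta @ L            # P x D: new global entity interaction matrix

def attention_gap(alpha, beta):
    """Assumed penalty: mean squared difference between alpha (m x P) and beta^T."""
    return float(np.mean((alpha - beta.T) ** 2))

rng = np.random.default_rng(0)
m, P, D = 3, 12, 5                   # hypothetical sizes
V = rng.normal(size=(P, D))
L = rng.normal(size=(m, D))          # stand-in for sentences after the graph convolution
W_att = rng.normal(scale=0.1, size=(D, D))

alpha = softmax(L @ W_att @ V.T, axis=1)   # sentence -> entity direction, m x P
beta, V_new = entities_over_sentences(V, L, W_att)
print(V_new.shape, attention_gap(alpha, beta) >= 0.0)  # (12, 5) True
```

Adding such a gap term to the training loss is one way to push the two attention matrices toward each other; other distance measures would serve the same purpose.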

In step S3, determining a relationship type of the target entity pair in the document based on the abstract semantic representation includes:

Preferably, the classification function employs a softmax function.

For each entity interaction vector generated in step S25, the softmax function is used to determine which relationship it belongs to, producing an output y with one value per relationship type.

The relationship type corresponding to the largest value of y is taken as the relationship of that entity interaction vector.

It should be noted that, assuming there are 3 predefined relationships, passing a global entity interaction vector through softmax yields 3 probability values, one for each of the 3 relationships, and the relationship with the largest probability is the predicted one. For example, suppose the target entities of the document are Beijing Capital Airport, Beijing, and the Beijing Subway Airport Line. After the classification function is applied to a target entity pair, the candidate relationships "belongs to", "capital", and "road" receive probabilities of 0.7, 0.2, and 0.1 respectively, so the relationship of the target entity pair is "belongs to".
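The example can be reproduced numerically. In the sketch below the input scores are hypothetical and are chosen as the logarithms of the probabilities in the example, so that softmax returns exactly 0.7, 0.2, and 0.1; in a trained model they would come from the classifier applied to the entity interaction vector.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)        # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

relations = ["belongs to", "capital", "road"]

# Hypothetical pre-softmax scores for one target entity pair.
logits = np.log(np.array([0.7, 0.2, 0.1]))

y = softmax(logits)                          # probabilities 0.7, 0.2, 0.1
predicted = relations[int(np.argmax(y))]     # relation with the largest probability

print(predicted)  # belongs to
```

Taking the argmax means exactly one relationship is predicted per entity pair; a "no relationship" outcome would have to be one of the predefined classes.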

The implementation is illustrated with reference to FIG. 4 as follows:

1. first, representations of all entities (such as Beijing Capital Airport, Beijing, the Beijing Subway Airport Line, and the like) and of all sentences in the input document are acquired;

2. an attention weight matrix is acquired based on the given target entities and sentences, together with new entity and sentence representations;

3. after the new sentence representations pass through the graph convolutional neural network, sentence representations with logical reasoning information are obtained, and the second attention mechanism calculation is then performed with the entity representations;

4. the calculation results of the two attention mechanisms are constrained against each other, and the interaction information of each entity pair is calculated at the same time;

5. the interaction information of each entity pair is used to determine whether a given target entity pair has a relationship and, if so, which relationship it is.

The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A chapter-level relationship extraction method, comprising:

judging the relation type of a target entity pair in the document based on the abstract semantic representation;

the processing the document based on the bidirectional attention constraint to obtain abstract semantic representation of entities and sentences comprises the following steps:

2. The extraction method according to claim 1, wherein the obtaining the representation of each entity and the representation of each sentence in the document comprises:

acquiring an abstract semantic representation of the document;

3. The extraction method according to claim 2, wherein the obtaining a representation of each sentence in the document based on the abstract semantic representation of the document comprises:

4. The extraction method according to claim 2, wherein the obtaining a representation of each entity in the document based on the abstract semantic representation of the document comprises:

5. The extraction method according to claim 2, wherein said obtaining an abstract semantic representation of said document comprises:

6. The extraction method according to claim 4, wherein the obtaining a global entity interaction matrix based on the representation of each entity comprises:

7. The extraction method according to claim 1, wherein the matrix weights used by the first and second attention mechanisms are the same.

8. The extraction method according to claim 1, wherein said determining a relationship type of a target entity pair in the document based on the abstract semantic representation comprises:

9. The extraction method according to claim 8, wherein the classification function employs a softmax function.