CN113254636A - Remote supervision entity relationship classification method based on example weight dispersion - Google Patents


Info

Publication number: CN113254636A
Application number: CN202110456426.8A
Authority: CN (China)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 陈雪, 刘振贤, 骆祥峰
Current and original assignee: University of Shanghai for Science and Technology (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by University of Shanghai for Science and Technology; priority to CN202110456426.8A; publication of CN113254636A

Classifications

    • G06F16/35 — Information retrieval of unstructured textual data; clustering; classification
    • G06F16/367 — Creation of semantic tools (e.g. ontology or thesauri); ontology
    • G06N3/045 — Neural network architectures; combinations of networks
    • G06N3/047 — Neural network architectures; probabilistic or stochastic networks
    • G06N3/08 — Neural networks; learning methods


Abstract

The invention relates to a remote supervision entity relation classification method based on example weight dispersion. Sentence examples generated by the remote supervision method are grouped into packages, and a feature vector for each example is obtained through a segmented convolution network. An attention mechanism calculates the relevance weight between each example and the package it belongs to; the mean and standard deviation of all example relevance weights in the package define a threshold, against which the weights are updated. The example feature vectors in the package are then combined into a package feature vector according to the updated relevance weights. The package feature vector is input into a classifier to obtain the classification result of the package, which is compared with the package label to calculate a loss function. If the F1 value does not improve for three consecutive training rounds, or the current round reaches the preset number of training rounds, training ends; otherwise, the parameters are updated according to the loss function and the next round of training is performed. The method reduces the influence of mislabeled examples on model training under remote supervision and improves the accuracy of the remote supervision entity relationship classification model.

Description

Remote supervision entity relationship classification method based on example weight dispersion
Technical Field
The invention relates to the field of multi-instance learning and information extraction, in particular to a remote supervision entity relationship classification method based on instance weight dispersion.
Background
Entity relationship classification is one of the most important tasks in information extraction. Its aim is to assign a predefined relation type to an entity pair according to context semantics, given text with labeled entities. Existing methods can be divided into supervised, semi-supervised, unsupervised, and remote supervision approaches.
Supervised entity relationship classification methods require large, accurately labeled data sets and are labor-intensive. Semi-supervised methods are sensitive to the given seeds, suffer from semantic drift, and have low accuracy. Unsupervised methods cluster corpus information and define relations on the clustering results; the resulting relations are difficult to describe, and recall on low-frequency instances is low.
The current mainstream approach is remote supervision relation classification: a structured knowledge base is aligned with unstructured text to automatically generate a labeled data set for model training, avoiding a large amount of labor cost. The method assumes that if two entities have some relationship in the knowledge base, then any sentence containing both entities expresses that relationship. However, this assumption leads to mislabeled examples: a sentence containing the target entity pair may not actually describe the relation type that the pair has in the knowledge base. Remote supervision entity relationship classification is therefore combined with multi-instance learning to reduce the impact of mislabeled data.
Multiple Instance Learning (MIL) introduces the concept of a package, defined as a collection of multiple examples. The input to the model is not a single example with a category label but a set of labeled packages; a package is a positive package as long as it contains at least one positive example, and is negative otherwise. In remote supervision entity relation extraction, when a certain relation exists between an entity pair, at least one example in the package formed for that pair is assumed to express the relation. Because of this assumption, examples in the package that do not correctly describe the relation can seriously interfere with model training. How to reduce the influence of mislabeled examples in the package on model training is therefore an urgent technical problem, whose solution would also provide powerful support for applying multi-instance learning in other fields.
Disclosure of Invention
The invention mainly aims to overcome the defects of the prior art and provide a remote supervision entity relation classification method based on example weight dispersion. A relevance weight threshold is designed from the mean and standard deviation of the example weights in a package; the relevance weights of the examples in the package are then updated according to this threshold, filtering out examples whose relevance weights are small and deviate strongly from the mean. This reduces the influence of mislabeled examples on the model and provides support for applying multi-instance learning in other fields.
In order to achieve the purpose, the invention adopts the following technical scheme:
a remote supervision entity relation classification method based on example weight dispersion comprises the following steps:
step 1, packing sentence examples generated by the remote supervision method, and obtaining the feature vector of each example through a segmented convolution network;
step 2, calculating the relevance weight between each example and the package it belongs to using an attention mechanism;
step 3, calculating the mean and standard deviation of all example relevance weights in the package, and updating the example relevance weights according to the designed threshold;
step 4, combining the example feature vectors in the package into a package feature vector according to the updated relevance weights;
step 5, inputting the package feature vector into a classifier to obtain the classification result of the package, comparing it with the package label, and calculating the loss function;
step 6, if the F1 value does not improve for three consecutive training rounds, or the current round reaches the preset number of training rounds, finishing training;
otherwise, updating the parameters according to the loss function and performing the next round of training.
Preferably, in step 1, the sentence examples generated based on the remote supervision method are packaged, and the feature vector of each example is obtained through a segmented convolution network, and the specific steps are as follows:
(1-1) putting examples containing the same entity pair into the same set to form a multi-example package, and constructing the remote supervision data set D = {B_1, B_2, ..., B_{n_s}}, where n_s is the number of packages, B_L = {S_1, S_2, ..., S_m} is a multi-example package in the data set, L ∈ [1, n_s], m is the number of examples in the package, and S_i is each example in the package, i ∈ [1, m].
(1-2) obtaining the feature vectors of the examples in a package through a segmented convolution network: b_l = {s_1, s_2, ..., s_m}, l ∈ [1, n_s], where s_j is the feature vector of each example, j ∈ [1, m].
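As a hedged illustration of the packing in step (1-1) (not code from the patent), grouping distantly supervised sentences into multi-example packages amounts to a grouping by entity pair; the tuple layout and the sample sentences below are assumptions made for the demonstration:

```python
from collections import defaultdict

def build_bags(examples):
    """Group distantly supervised sentence examples into multi-example packages.

    Each example is assumed to be a (head, tail, sentence, relation) tuple;
    all sentences sharing the same entity pair form one package, which
    inherits the pair's knowledge-base relation as its package label.
    """
    bags = defaultdict(list)
    labels = {}
    for head, tail, sentence, relation in examples:
        bags[(head, tail)].append(sentence)   # same entity pair -> same package
        labels[(head, tail)] = relation       # package-level (distant) label
    return dict(bags), labels

# Hypothetical toy corpus: both sentences mention the same entity pair,
# so they land in a single package with the distant label "founder".
examples = [
    ("Bill Gates", "Microsoft", "Bill Gates is the founder of Microsoft.", "founder"),
    ("Bill Gates", "Microsoft", "Bill Gates spoke at a Microsoft event.", "founder"),
]
bags, labels = build_bags(examples)
```

Note that the second sentence illustrates the mislabeling problem the patent targets: it shares the entity pair but does not actually express the relation.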
Preferably, in step 2, the relevance weight between each example and the package it belongs to is calculated using an attention mechanism. The specific steps are as follows:
(2-1) taking the inner product of the feature vector of each example in the package with the package label vector; the result is the relevance weight between the example and the package, i.e. the example weight. The specific calculation formula is:
e_j = s_j A q (1)
where s_j is the example feature vector output in step 1, A is a weight parameter matrix, and q is a query vector used to query the feature vector corresponding to the relation label from A;
(2-2) normalizing the example weights:
α_j = exp(e_j) / Σ_{k=1}^{m} exp(e_k) (2)
where k ∈ [1, m].
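Steps (2-1) and (2-2) can be sketched with NumPy as below; this is a minimal illustration under assumed dimensions and random parameters, not the patent's implementation:

```python
import numpy as np

def instance_weights(S, A, q):
    """Relevance weight of each example to its package, per equations (1)-(2).

    S: (m, d) matrix of example feature vectors from the encoder,
    A: (d, d) weight parameter matrix, q: (d,) query vector for the
    package's relation label.  Returns softmax-normalised weights alpha.
    """
    e = S @ A @ q                        # e_j = s_j A q        (equation 1)
    e = e - e.max()                      # subtract max to stabilise softmax
    alpha = np.exp(e) / np.exp(e).sum()  # alpha_j              (equation 2)
    return alpha
```

The weights sum to 1 over the package, so they can be compared directly against the mean-based threshold of step 3.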
Preferably, in step 3, the mean and standard deviation of all example relevance weights in the package are calculated, and the example relevance weights are updated according to the designed threshold. The specific steps are as follows:
(3-1) calculating the threshold of the relevance weights from the output of step 2:
M = (1/m) Σ_{j=1}^{m} α_j (3)
δ = √( (1/m) Σ_{j=1}^{m} (α_j − M)² ) (4)
where M is the mean of the example relevance weights, δ is their standard deviation, and M − δ is the relevance weight threshold;
(3-2) updating the example relevance weights according to the threshold: if an example relevance weight is smaller than the threshold, it is set to 0; otherwise it is kept:
α'_j = α_j if α_j ≥ M − δ; otherwise α'_j = 0 (5)
(3-3) normalizing the updated weights:
β_j = α'_j / Σ_{k=1}^{m} α'_k (6)
preferably, in the step 4, the example feature vectors in the packet are combined into a packet feature vector according to the example relevance weight, and the specific steps are as follows:
and (3) obtaining the feature vector of the packet according to the example relevance weight output in the step (3), wherein a specific calculation formula is as follows:
Figure BDA0003040668950000036
wherein x istRepresenting a 230-dimensional package feature vector, t ∈ [1, ns],βkExemplary correlation weights, s, output for step (3-3)kAnd (5) illustrating the feature vectors in the packet output in the step (1-2).
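Steps 3 and 4 together amount to a mean-minus-standard-deviation filter followed by a weighted sum; a minimal sketch, assuming NumPy arrays for the weights and example features:

```python
import numpy as np

def filter_and_aggregate(alpha, S):
    """Steps 3-4: zero out weights below the M - delta threshold
    (equations (3)-(5)), renormalise them (equation (6)), and combine
    the surviving example vectors into one package vector (equation (7)).

    alpha: (m,) attention weights, S: (m, d) example feature matrix.
    """
    M, delta = alpha.mean(), alpha.std()              # equations (3)-(4)
    kept = np.where(alpha >= M - delta, alpha, 0.0)   # equation (5)
    beta = kept / kept.sum()                          # equation (6)
    x = beta @ S                                      # equation (7): sum_k beta_k s_k
    return beta, x
```

The largest weight always satisfies α ≥ M ≥ M − δ, so at least one example survives and the renormalisation in equation (6) is well defined.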
Preferably, in step 5, the package feature vector is input into the classifier to obtain the classification result of the package, which is compared with the package label to calculate the loss function. The classifier comprises a full connection layer and a normalization layer. The specific steps are as follows:
(5-1) mapping the package feature vector output in step 4 into a 53-dimensional vector through the full connection layer of the classifier:
O_t = W x_t + d (8)
where O_t ∈ R^{n_r} is the output vector of the full connection layer, representing the score of each relation type, n_r is the number of relation types, W is a trainable parameter matrix, x_t is the package feature vector output in step (4-1), and d is a bias vector.
(5-2) normalizing the result of step (5-1) and outputting the probability distribution over the relation categories:
p(r_L | B_L; θ) = exp(o_{r_L}) / Σ_{c=1}^{n_r} exp(o_c) (9)
where B_L is the example package described in step (1-1), r_L ∈ [1, n_r] is the number of the relation type assigned to package B_L, θ is the trainable parameter set of the model, comprising the segmented convolution network in step (1-2), the weight matrix in step (2-1), and the full connection layer parameters in step (5-1), and o_c is the score of the package on the c-th relation type, c ∈ [1, n_r].
(5-3) calculating the loss function used to update the model parameters:
J(θ) = − Σ_{L=1}^{n_s} log p(r_L | B_L; θ) (10)
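Step 5 can be sketched as a linear layer followed by a softmax and a negative log-likelihood; a minimal NumPy illustration under assumed small dimensions (the patent itself uses n_r = 53 relation types and 230-dimensional package vectors):

```python
import numpy as np

def bag_loss(x, r, W, d):
    """Step 5: score the package vector with a full connection layer
    (equation (8)), turn the scores into a softmax distribution over
    the n_r relation types (equation (9)), and return the negative
    log-likelihood of the package's gold relation r (one term of (10))."""
    o = W @ x + d                       # O_t = W x_t + d      (equation 8)
    o = o - o.max()                     # stabilise the softmax
    p = np.exp(o) / np.exp(o).sum()     # p(r | B; theta)      (equation 9)
    return -np.log(p[r])                # one summand of J     (equation 10)
```

Summing this quantity over all packages gives the loss J(θ) of equation (10).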
preferably, in the step 6, it is determined whether the iterative training of the model is required to be continued, and the specific steps are as follows:
if the F1 value of the continuous three-wheel training is not lifted or the training of the current wheel reaches the preset training times, finishing the training; otherwise, updating the parameter set theta according to the loss function, and performing the next round of training, wherein the parameter updating calculation formula is as follows:
Figure BDA0003040668950000043
wherein epsilon is the learning rate,
Figure BDA0003040668950000044
is the gradient of the loss function.
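The stopping rule of step 6 (no F1 improvement for three consecutive rounds, or the round budget reached) can be sketched as a small helper; the function name and the parameter defaults are assumptions for illustration:

```python
def should_stop(f1_history, patience=3, max_epochs=50):
    """Step 6 stopping rule: end training when the F1 value has not
    improved for `patience` consecutive rounds, or when the preset
    number of rounds is reached.  `f1_history` holds the F1 score
    measured after each completed round."""
    if len(f1_history) >= max_epochs:
        return True                      # round budget exhausted
    if len(f1_history) <= patience:
        return False                     # not enough rounds to judge yet
    best_before = max(f1_history[:-patience])
    # stop if none of the last `patience` rounds beat the earlier best
    return max(f1_history[-patience:]) <= best_before
```

When the helper returns False, the parameter update of equation (11) is applied and training continues.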
Compared with the prior art, the invention has the following substantive characteristics and technical progress:
1. the method designs an example weight threshold from the mean and standard deviation of the example relevance weights in the package, updates the example weights based on the threshold, and filters out examples with small weights that deviate strongly from the mean, thereby reducing the influence of mislabeled data on model training in multi-instance learning;
2. the method improves the attention-based calculation of example relevance weights in the package and improves the accuracy of the remote supervision entity relationship classification model.
Drawings
FIG. 1 is a flow of remote supervised entity relationship classification based on example weight dispersion.
FIG. 2 is a comparison of the PR curves of the experimental results of the method of the present invention with other methods.
FIG. 3 is an example weight calculation process based on mean and standard deviation.
Detailed Description
In order to make the technical problems to be solved, the technical solutions, and the advantageous effects of the embodiments of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be emphasized that the specific embodiments described herein are merely illustrative of the invention and are not limiting.
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings:
the first embodiment is as follows:
in this embodiment, referring to fig. 1, a remote supervision entity relationship classification method based on example weight dispersion comprises the following steps:
step 1, packing sentence examples generated by the remote supervision method, and obtaining the feature vector of each example through a segmented convolution network;
step 2, calculating the relevance weight between each example and the package it belongs to using an attention mechanism;
step 3, calculating the mean and standard deviation of all example relevance weights in the package, and updating the example relevance weights according to the designed threshold;
step 4, combining the example feature vectors in the package into a package feature vector according to the updated relevance weights;
step 5, inputting the package feature vector into a classifier to obtain the classification result of the package, comparing it with the package label, and calculating the loss function;
step 6, if the F1 value does not improve for three consecutive training rounds, or the current round reaches the preset number of training rounds, finishing training;
otherwise, updating the parameters according to the loss function and performing the next round of training.
In this remote supervision entity relationship classification method based on example weight dispersion, a relevance weight threshold is designed from the mean and standard deviation of the example weights in a package; the relevance weights of the examples in the package are then updated according to the threshold, filtering out examples with small relevance weights that deviate strongly from the mean, thereby reducing the influence of mislabeled examples on the model.
Example two:
based on the foregoing embodiment, the remote supervision entity relationship classification method based on example weight dispersion comprises the following steps:
Step 1, pack the examples in the training set generated by the remote supervision method, and obtain the feature vector of each example through a segmented convolution network. The specific process is as follows:
(1-1) putting examples containing the same entity pair into the same set to form a multi-example package, and constructing the remote supervision data set D = {B_1, B_2, ..., B_{n_s}}, where n_s is the number of packages, B_L = {S_1, S_2, ..., S_m} is a multi-example package in the data set, L ∈ [1, n_s], m is the number of examples in the package, and S_i is each example in the package, i ∈ [1, m].
(1-2) obtaining the feature vectors of the examples in a package through the segmented convolution network: b_l = {s_1, s_2, ..., s_m}, l ∈ [1, n_s], where s_j is the feature vector of each example, j ∈ [1, m].
Step 2, calculate the relevance weight between each example and the package it belongs to using the attention mechanism. The specific process is as follows:
(2-1) taking the inner product of the feature vector of each example in the package with the package label vector; the result is the relevance weight between the example and the package, i.e. the example weight:
e_j = s_j A q (1)
where s_j is the example feature vector output in step (1-2), A is a weight parameter matrix, and q is a query vector used to query the feature vector corresponding to the relation label from A.
(2-2) normalizing the example weights:
α_j = exp(e_j) / Σ_{k=1}^{m} exp(e_k) (2)
where k ∈ [1, m].
Step 3, calculate the mean and standard deviation of all example relevance weights in the package, and update the example relevance weights according to the designed threshold. The process is as follows:
(3-1) calculating the threshold of the relevance weights from the output of step (2-2):
M = (1/m) Σ_{j=1}^{m} α_j (3)
δ = √( (1/m) Σ_{j=1}^{m} (α_j − M)² ) (4)
where M is the mean of the example relevance weights, δ is their standard deviation, and M − δ is the relevance weight threshold.
(3-2) updating the example relevance weights according to the threshold: if an example relevance weight is smaller than the threshold, it is set to 0; otherwise it is kept:
α'_j = α_j if α_j ≥ M − δ; otherwise α'_j = 0 (5)
(3-3) normalizing the updated weights:
β_j = α'_j / Σ_{k=1}^{m} α'_k (6)
Step 4, combine the example feature vectors in the package into a package feature vector according to the example relevance weights. The process is as follows:
(4-1) obtaining the feature vector of the package from the example relevance weights output in step (3-3):
x_t = Σ_{k=1}^{m} β_k s_k (7)
where x_t is the 230-dimensional package feature vector, t ∈ [1, n_s], β_k is the example relevance weight output in step (3-3), and s_k is the example feature vector in the package output in step (1-2).
Step 5, input the package feature vector into the classifier to obtain the classification result of the package, compare it with the package label, and calculate the loss function. The classifier comprises a full connection layer and a normalization layer. The process is as follows:
(5-1) mapping the package feature vector output in step (4-1) through the full connection layer:
O_t = W x_t + d (8)
where O_t ∈ R^{n_r} is the output vector of the full connection layer, representing the score of each relation type, n_r is the number of relation types, W is a trainable parameter matrix, x_t is the package feature vector output in step (4-1), and d is a bias vector.
(5-2) normalizing the result of step (5-1) and outputting the probability distribution over the relation categories:
p(r_L | B_L; θ) = exp(o_{r_L}) / Σ_{c=1}^{n_r} exp(o_c) (9)
where B_L is the example package described in step (1-1), r_L ∈ [1, n_r] is the number of the relation type assigned to package B_L, θ is the trainable parameter set of the model, comprising the segmented convolution network in step (1-2), the weight matrix in step (2-1), and the full connection layer parameters in step (5-1), and o_c is the score of the package on the c-th relation type, c ∈ [1, n_r].
(5-3) calculating the loss function used to update the model parameters:
J(θ) = − Σ_{L=1}^{n_s} log p(r_L | B_L; θ) (10)
Step 6, judge whether iterative training of the model should continue. The process is as follows:
(6-1) if the F1 value does not improve for three consecutive training rounds, or the current round reaches the preset number of training rounds, finish training; otherwise, update the parameter set θ according to the loss function and perform the next round of training. The parameter update formula is:
θ ← θ − ε ∇J(θ) (11)
where ε is the learning rate and ∇J(θ) is the gradient of the loss function.
In this embodiment, an example weight threshold is designed from the mean and standard deviation of the example relevance weights in the package, the example weights are updated based on the threshold, and examples with small weights that deviate strongly from the mean are filtered out, thereby reducing the influence of mislabeled data on model training in multi-instance learning.
Example three:
in order to verify the effectiveness of the method, experiments were carried out using the New York Times news corpus as data; the method is further explained below with reference to the accompanying drawings.
In this embodiment, the remote supervision entity relationship classification method based on example weight dispersion (DSRC-SWD) first packs the examples in the training set generated by the remote supervision method and obtains the feature vector of each example in the package through a segmented convolution network (PCNN). An attention mechanism then calculates the relevance weight between each example and the package it belongs to. The mean and standard deviation of all example relevance weights in the package are calculated and used to design an example relevance weight threshold: if an example relevance weight is smaller than the threshold, it is set to 0; otherwise it is kept, and the weights are normalized. The example feature vectors in the package are then combined into a package feature vector according to these weights. Finally, the package feature vector is input into the classifier to output the classification result of the package, which is compared with the package label to calculate the loss function. If the F1 value does not improve for three consecutive training rounds, or the current round reaches the preset number of training rounds, training ends; otherwise, the parameters are updated according to the loss function and the next round of training is performed.
Referring to the remote supervision relationship extraction flow chart of fig. 1, this embodiment designs the example weight threshold from the mean and standard deviation and updates the example weights accordingly, reducing the influence of mislabeled examples on the model in multi-instance learning.
Step 1: pack the sentence examples (examples for short) in the training set generated by the remote supervision method, and obtain the feature vector of each example through a segmented convolution network. The specific process is as follows:
(1-1) packing examples in the training set generated by the remote supervision method. For example, given the sentence "Bill Gates is the founder of Microsoft", the entities Bill Gates and Microsoft appear in the knowledge-graph triple (Microsoft, founder, Bill Gates), so the relation label of the sentence is /business/person/company. Examples containing the same entity pair are put into the same set, forming a multi-example package, and the remote supervision data set D = {B_1, B_2, ..., B_{n_s}} is constructed, where n_s is the number of packages, B_L = {S_1, S_2, ..., S_m} is a multi-example package in the data set, L ∈ [1, n_s], m is the number of examples in the package, and S_i is each example in the package, i ∈ [1, m].
(1-2) obtaining the feature vectors of the examples in a package through the segmented convolution network: b_l = {s_1, s_2, ..., s_m}, l ∈ [1, n_s], where s_j is the feature vector of each example, j ∈ [1, m].
Step 2: calculate the relevance weight between each example and the package it belongs to using the attention mechanism. The specific process is as follows:
(2-1) taking the inner product of the feature vector of each example in the package with the package label vector; the result is the relevance weight between the example and the package, i.e. the example weight:
e_j = s_j A q (1)
where s_j is the example feature vector output in step (1-2), A is a weight parameter matrix, and q is a query vector used to query the feature vector corresponding to the relation label from A;
(2-2) normalizing the example weights:
α_j = exp(e_j) / Σ_{k=1}^{m} exp(e_k) (2)
where k ∈ [1, m].
3. Calculate the mean and standard deviation of all example relevance weights within the package and update the example relevance weights according to the designed threshold, as shown in fig. 3. The process is as follows:
(3-1) calculating the threshold of the relevance weights from the output of step (2-2):
M = (1/m) Σ_{j=1}^{m} α_j (3)
δ = √( (1/m) Σ_{j=1}^{m} (α_j − M)² ) (4)
where M is the mean of the example relevance weights, δ is their standard deviation, and M − δ is the relevance weight threshold.
(3-2) updating the example relevance weights according to the threshold: if an example relevance weight is smaller than the threshold, it is set to 0; otherwise it is kept:
α'_j = α_j if α_j ≥ M − δ; otherwise α'_j = 0 (5)
(3-3) normalizing the updated weights:
β_j = α'_j / Σ_{k=1}^{m} α'_k (6)
4. Combine the example feature vectors in the package into a package feature vector according to the example relevance weights. The process is as follows:
(4-1) obtaining the feature vector of the package from the example relevance weights output in step (3-3):
x_t = Σ_{k=1}^{m} β_k s_k (7)
where x_t is the 230-dimensional package feature vector, t ∈ [1, n_s], β_k is the example relevance weight output in step (3-3), and s_k is the example feature vector in the package output in step (1-2).
5. And inputting the packet feature vector into a classifier to obtain a classification result of the packet, comparing the classification result with the label of the packet, and calculating a loss function. The classifier is divided into a full connection layer and a normalization layer, and the process is as follows:
(5-1) transforming the packet feature vector output in step (4-1) through the fully connected layer, wherein the specific calculation formula is as follows:

O_t = W x_t + d  (8)

where O_t ∈ ℝ^{n_r} is the output vector of the fully connected layer and represents the score of each relation type, n_r = 53 is the number of relation types, W is the trainable parameter matrix, x_t is the packet feature vector output in step (4-1), and d is a bias vector.
(5-2) normalizing the result of step (5-1) and outputting the probability distribution over the relation categories, wherein the specific calculation formula is as follows:

p(r_L | B_L; θ) = exp(o_{r_L}) / Σ_{c=1}^{n_r} exp(o_c)  (9)

where B_L is the example package described in step (1-1), r_L ∈ [1, n_r] is the number of the relation type assigned to package B_L, θ denotes the trainable parameter set of the model (the segmented convolution network of step (1-2), the weight matrix of step (2-1), and the fully connected layer parameters of step (5-1)), and o_c is the score of the package on the c-th relation type, c ∈ [1, n_r].
(5-3) calculating the loss function used to update the model parameters, wherein the calculation formula is as follows:

J(θ) = − Σ_{L=1}^{n_s} log p(r_L | B_L; θ)  (10)

where n_s is the number of packages and θ denotes the trainable parameter set of the model, comprising the segmented convolution network of step (1-2), the weight matrix of step (2-1), and the fully connected layer parameters of step (5-1).
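Steps (5-1) to (5-3) are a linear layer followed by a softmax and a negative log-likelihood loss. A minimal sketch with shrunk shapes (n_r = 3 instead of 53; all names are illustrative):

```python
import numpy as np

def classify(x, W, d):
    """Eq. (8): relation scores O = Wx + d; eq. (9): softmax probabilities."""
    o = W @ x + d                     # (n_r,) score per relation type
    o = o - o.max()                   # numerically stable softmax
    p = np.exp(o) / np.exp(o).sum()
    return p

def nll_loss(p, r):
    """One bag's contribution to the loss of eq. (10): -log p(r | B)."""
    return -np.log(p[r])

W = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])   # n_r = 3, feature dim = 2
d = np.zeros(3)
p = classify(np.array([2.0, 0.0]), W, d)
loss = nll_loss(p, 0)   # gold relation is type 0
print(p.argmax(), loss)
```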
6. Judging whether iterative training of the model needs to continue, the process being as follows:
(6-1) if the F1 value does not improve for three consecutive training rounds, or the current round reaches the preset number of training rounds, training is finished; otherwise, the parameter set θ is updated according to the loss function and the next round of training is performed, wherein the parameter update formula is as follows:

θ ← θ − ε·∇_θ J(θ)  (11)

where ε = 0.5 is the learning rate and ∇_θ J(θ) is the gradient of the loss function.
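The stopping rule of step 6 can be sketched as a plain training-loop skeleton. Here `update_step` and `eval_f1` are placeholders for the gradient update of equation (11) and validation F1 scoring; this is an assumption-laden sketch, not the patent's implementation:

```python
def train(update_step, eval_f1, max_rounds=30, patience=3):
    """Stop when the validation F1 has not improved for `patience`
    consecutive rounds, or when the preset number of rounds is reached."""
    best_f1, stale = -1.0, 0
    for _ in range(max_rounds):
        update_step()            # theta <- theta - eps * grad J(theta), eq. (11)
        f1 = eval_f1()
        if f1 > best_f1:
            best_f1, stale = f1, 0
        else:
            stale += 1
            if stale >= patience:
                break            # three stale rounds: finish training
    return best_f1

# toy run: F1 plateaus, so training stops before reaching the 0.90 score
scores = iter([0.50, 0.60, 0.60, 0.60, 0.60, 0.90])
best = train(lambda: None, lambda: next(scores))
print(best)
```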
Compared with other methods, the method of this embodiment designs an example weight threshold from the mean and standard deviation of the example weights within a package, updates the example weights in the package accordingly, and filters out examples whose weights are small and far below the mean, thereby reducing the influence of wrongly labeled data in multi-example learning.
Description of experimental tests and results:
the data set used in this example is the data set of "New York Times" (http:// t. cn/RPsjAY), and is divided into a training set, a verification set and a test set. Wherein the training set includes 466876 sentence examples, the verification set includes 55167 sentence examples, and the test set includes 172448 sentence examples. The experimental indexes adopt a Precision-Recall ratio Curve (PR Curve) and P @ N, wherein the PR Curve represents the Precision ratio of the model under different Recall ratios, the Curve is closer to the upper right corner of a coordinate system to indicate that the comprehensive performance of the model is better, and the P @ N represents the accuracy of the first N pieces of test data.
TABLE 1. P@N results

Method     P@100(%)  P@200(%)  P@300(%)  Average(%)
ONE        64.3      62.6      58.2      61.7
AVG        67.8      64.4      60.4      64.2
ATT        71.1      67.6      64.5      67.7
DSRC-SWD   72.3      69.7      66.1      69.3
FIG. 2 compares the PR curves of the method of the present invention with those of other methods, and Table 1 compares their P@N scores. Each method uses a segmented convolutional network for feature extraction; the difference lies in the example weight calculation method. ONE keeps only the weight of the example with the largest relevance weight and resets the others to 0; AVG gives all examples the same relevance weight; ATT calculates relevance weights via an attention mechanism; DSRC-SWD is the example weight calculation method of the present invention. The PR curves show that the weight calculation method of the invention achieves the highest precision at low recall; between recall 0.05 and 0.17, the precision of the inventive method differs little from that of ATT but remains higher than the other methods; beyond recall 0.17, the precision of the inventive method is markedly higher than that of all other methods. The P@N results show that the inventive method attains the highest precision at P@100, P@200, and P@300, with an average precision of 69.3%, an improvement of 7.6%, 5.1%, and 1.6% over the other methods, respectively.
The ONE method selects only the highest-scoring example in each packet; the other examples, although scored lower, may still contain the relational semantic information of the packet, so this method loses a large amount of valid information. The AVG method gives all examples the same weight and cannot distinguish wrongly labeled examples from correct ones, which has a strong negative influence on the model. The ATT method assigns different weights to the examples through an attention mechanism, but wrongly labeled examples still participate in the training of the model, limiting the relation extraction performance. On the basis of the attention mechanism, the present method calculates the standard deviation of the example weight distribution, derives an example weight threshold from it, and filters out examples with low scores that deviate far from the mean, so that only highly scored examples participate in model training. The experimental conclusion is therefore that, with the same example feature extraction method, the proposed method achieves a better effect in multi-example-learning remote supervised entity relation classification.
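The four weighting schemes compared above differ only in how a packet's attention weights are post-processed, which fits in a few lines. A hedged sketch (ATT is represented by the softmax weights themselves; all names and the sample weights are illustrative):

```python
import numpy as np

alpha = np.array([0.50, 0.30, 0.15, 0.05])   # attention (ATT) weights in one packet

def one_weights(a):
    """ONE: keep only the highest-scoring instance, reset the rest to 0."""
    w = np.zeros_like(a)
    w[np.argmax(a)] = 1.0
    return w

def avg_weights(a):
    """AVG: every instance gets the same weight."""
    return np.full_like(a, 1.0 / len(a))

def swd_weights(a):
    """DSRC-SWD: zero out weights below mean - std, renormalize the rest."""
    kept = np.where(a >= a.mean() - a.std(), a, 0.0)
    return kept / kept.sum()

print(one_weights(alpha))   # only instance 0 survives
print(avg_weights(alpha))   # uniform 0.25 each
print(swd_weights(alpha))   # low-scoring instance 3 filtered, others rescaled
```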
In summary, the above embodiment of the remote supervised entity relationship classification method based on example weight dispersion packs the sentence examples (examples for short) generated by the remote supervision method and obtains the feature vector of each example through a segmented convolution network. The relevance weights of the examples and their packages are calculated using an attention mechanism. The mean and standard deviation of all example relevance weights within a package are calculated, and the example relevance weights are updated according to the threshold designed from them. The example feature vectors in the package are combined into a package feature vector according to the updated relevance weights. The package feature vector is input into a classifier to obtain the classification result of the package, the result is compared with the label of the package, and the loss function is calculated. If the F1 value does not improve for three consecutive training rounds or the current round reaches the preset number of training rounds, training finishes; otherwise, the parameters are updated according to the loss function and the next round of training is performed. The method improves the attention-based calculation of in-package example relevance weights: it designs a weight threshold from the mean and standard deviation of the relevance weights and, according to this threshold, filters out examples whose relevance weight is small and far below the mean, thereby reducing the influence of wrongly labeled examples in remote supervision on model training and improving the accuracy of the remote supervised entity relationship classification model.
The foregoing is a more detailed description of the invention in connection with specific/preferred embodiments and is not intended to limit the practice of the invention to those descriptions. It will be apparent to those skilled in the art that various substitutions and modifications can be made to the described embodiments without departing from the spirit of the invention, and these substitutions and modifications should be considered to fall within the scope of the invention.

Claims (7)

1. An example weight calculation method for multi-example learning is characterized by comprising the following steps:
step 1, packing sentence examples generated based on a remote supervision method, and obtaining a feature vector of each example through a segmented convolution network;
step 2, calculating the relevance weight of each example and the package in which it is located by using an attention mechanism;
step 3, calculating the mean and standard deviation of all the example relevance weights in the package, and updating the example relevance weights according to the designed threshold;
step 4, combining the example feature vectors in the package into a package feature vector according to the updated example relevance weights;
step 5, inputting the package feature vector into a classifier to obtain the classification result of the package, comparing the result with the label of the package, and calculating the loss function;
step 6, if the F1 value does not improve for three consecutive training rounds or the current round reaches the preset number of training rounds, finishing training; otherwise, updating the parameters according to the loss function and performing the next round of training.
2. The method of claim 1, wherein step 1 packs sentence examples generated based on a remote supervision method and obtains a feature vector of each example through a segmented convolution network, the process being as follows:

(2-1) putting instances containing the same entity pair into the same set to form a multi-instance package, and constructing a remote supervision data set {B_1, B_2, ..., B_{n_s}}, where n_s is the number of packets, B_L = {S_1, S_2, ..., S_m} is a multi-instance packet in the data set, L ∈ [1, n_s], m is the number of instances in the packet, and S_i is an example in the package, i ∈ [1, m];

(2-2) obtaining the feature vectors of the examples in the packet through a segmented convolutional network: b_l = {s_1, s_2, ..., s_m}, l ∈ [1, n_s], where s_j is the feature vector of each example, j ∈ [1, m].
3. The method of claim 1, wherein step 2 calculates the relevance weight of each example and the packet in which it is located by using an attention mechanism, the process being as follows:

(3-1) taking the inner product of the feature vector of each example in the packet and the packet label vector, the result serving as the correlation weight of the example and the packet, i.e. the example weight, wherein the specific calculation formula is as follows:

e_j = s_j A q  (1)

where s_j is the example feature vector output in step (2-2), A denotes a weight parameter matrix, and q denotes a query vector used to query the feature vector corresponding to the relation label from A;

(3-2) normalizing the example weights, wherein the specific calculation formula is as follows:

α_k = exp(e_k) / Σ_{j=1}^{m} exp(e_j)  (2)

where k ∈ [1, m].
4. The method of claim 1, wherein step 3 calculates the mean and standard deviation of all example relevance weights in the package and updates the example relevance weights according to the designed threshold, the process being as follows:

(4-1) calculating the threshold of the correlation weights according to the output of step (3-2), wherein the specific calculation formulas are as follows:

M = (1/m) Σ_{k=1}^{m} α_k  (3)

δ = √( (1/m) Σ_{k=1}^{m} (α_k − M)² )  (4)

where M is the mean of the example correlation weights, δ is their standard deviation, and M − δ is the correlation weight threshold;

(4-2) updating the example correlation weights according to the threshold: if an example correlation weight is smaller than the threshold, it is updated to 0; otherwise it is kept; the specific calculation formula is as follows:

α′_k = α_k, if α_k ≥ M − δ;  α′_k = 0, otherwise  (5)

(4-3) normalizing the updated weights, wherein the specific calculation formula is as follows:

β_k = α′_k / Σ_{j=1}^{m} α′_j  (6)
5. The method of claim 1, wherein step 4 combines the example feature vectors in the packet into a packet feature vector according to the updated example relevance weights, the process being as follows:

(5-1) obtaining the feature vector of the packet according to the example relevance weights output in step (4-3), wherein the specific calculation formula is as follows:

x_t = Σ_{k=1}^{m} β_k s_k  (7)

where x_t represents the 230-dimensional packet feature vector, t ∈ [1, n_s], β_k is the example correlation weight output in step (4-3), and s_k is the example feature vector output in step (2-2).
6. The method of claim 1, wherein step 5 inputs the packet feature vector into a classifier to obtain the classification result of the packet, compares the result with the label of the packet, and calculates the loss function; the classifier comprises a fully connected layer and a normalization layer, the process being as follows:

(6-1) transforming the packet feature vector output in step (5-1) through the fully connected layer, wherein the specific calculation formula is as follows:

O_t = W x_t + d  (8)

where O_t ∈ ℝ^{n_r} is the output vector of the fully connected layer and represents the score of each relation type, n_r is the number of relation types, W is the trainable parameter matrix, x_t is the packet feature vector output in step (5-1), and d is a bias vector;

(6-2) normalizing the result of step (6-1) and outputting the probability distribution over the relation categories, wherein the specific calculation formula is as follows:

p(r_L | B_L; θ) = exp(o_{r_L}) / Σ_{c=1}^{n_r} exp(o_c)  (9)

where B_L is the example package described in step (2-1), r_L ∈ [1, n_r] is the number of the relation type assigned to package B_L, θ denotes the trainable parameter set of the model (the segmented convolution network of step (2-2), the weight parameter matrix of step (3-1), and the fully connected layer parameters of step (6-1)), and o_c is the score of the package on the c-th relation type, c ∈ [1, n_r];

(6-3) calculating the loss function used to update the model parameters, wherein the calculation formula is as follows:

J(θ) = − Σ_{L=1}^{n_s} log p(r_L | B_L; θ)  (10)
7. The method of claim 1, wherein step 6 judges whether iterative training of the model needs to continue, the process being as follows:

(7-1) if the F1 value does not improve for three consecutive training rounds, or the current round reaches the preset number of training rounds, training is finished; otherwise, the parameter set θ is updated according to the loss function and the next round of training is performed, wherein the parameter update formula is as follows:

θ ← θ − ε·∇_θ J(θ)  (11)

where ε is the learning rate and ∇_θ J(θ) is the gradient of the loss function.
CN202110456426.8A 2021-04-27 2021-04-27 Remote supervision entity relationship classification method based on example weight dispersion Pending CN113254636A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110456426.8A CN113254636A (en) 2021-04-27 2021-04-27 Remote supervision entity relationship classification method based on example weight dispersion


Publications (1)

Publication Number Publication Date
CN113254636A true CN113254636A (en) 2021-08-13

Family

ID=77222101


Country Status (1)

Country Link
CN (1) CN113254636A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361059A (en) * 2014-11-03 2015-02-18 中国科学院自动化研究所 Harmful information identification and web page classification method based on multi-instance learning
CN106682696A (en) * 2016-12-29 2017-05-17 华中科技大学 Multi-example detection network based on refining of online example classifier and training method thereof
CN111191031A (en) * 2019-12-24 2020-05-22 上海大学 Entity relation classification method of unstructured text based on WordNet and IDF
CN111414749A (en) * 2020-03-18 2020-07-14 哈尔滨理工大学 Social text dependency syntactic analysis system based on deep neural network
CN111966917A (en) * 2020-07-10 2020-11-20 电子科技大学 Event detection and summarization method based on pre-training language model


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Le Jinxiong (乐金雄): "Research on Distantly Supervised Entity Relation Extraction Methods Based on Internal/External Semantic Features and a Priority Attention Mechanism and Their Applications", China Master's Theses Full-text Database, Information Science and Technology Series *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210813