CN115048924B - Negative sentence identification method based on negative prefix and suffix information

Negative sentence identification method based on negative prefix and suffix information

Info

Publication number
CN115048924B
CN115048924B
Authority
CN
China
Prior art keywords
negative
sentence
word
feature representation
words
Prior art date
Legal status (assumed; not a legal conclusion)
Active
Application number
CN202210976289.5A
Other languages
Chinese (zh)
Other versions
CN115048924A (en)
Inventor
Li Shoushan (李寿山)
Li Yameng (李雅梦)
Zhou Guodong (周国栋)
Current Assignee
Suzhou University
Original Assignee
Suzhou University
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by Suzhou University
Priority to CN202210976289.5A
Publication of CN115048924A
Application granted
Publication of CN115048924B
Legal status: Active

Classifications

    • G06F 40/211: Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars (G: Physics; G06: Computing, calculating or counting; G06F: Electric digital data processing; G06F 40/00: Handling natural language data; G06F 40/20: Natural language analysis; G06F 40/205: Parsing)
    • G06F 40/279: Recognition of textual entities (G06F 40/00: Handling natural language data; G06F 40/20: Natural language analysis)
    • G06N 3/084: Backpropagation, e.g. using gradient descent (G06N: Computing arrangements based on specific computational models; G06N 3/00: Computing arrangements based on biological models; G06N 3/02: Neural networks; G06N 3/08: Learning methods)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses a negative sentence recognition method based on negative prefix and suffix (negative affix) information. First, a word training set is used to train an auxiliary task model that captures information about words carrying negative affixes. Then, a sentence training set is used to train a main task model that recognizes negative sentences; during this training, the trained auxiliary task model produces a first hidden-layer feature representation for each word with a negative affix in a sentence, and this representation is inserted into the second hidden-layer feature representation of the sentence to update the representation of the whole sentence. Finally, target sentences are recognized with the trained main task model and auxiliary task model. The method models the recognition of negative words as a matching task; by recognizing words with negative affixes in sentences and updating the sentences' hidden-layer feature representations, it can greatly improve the accuracy of negative sentence recognition.

Description

Negative sentence recognition method based on negative prefix and suffix information
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a negative sentence recognition method.
Background
The negative sentence recognition task aims to automatically recognize whether an input sentence is a negative sentence containing negative cue words, and it is a basic task in negation understanding. Negation is a common phenomenon in natural language description and a core part of it. Many natural language processing tasks, such as sentiment analysis, question answering, knowledge graph completion, and natural language inference, require an understanding of negation to better capture the semantic information of text. In these tasks, detecting negative cue words and negation scopes makes it possible to probe a model's potential for learning and representing high-level morphological and syntactic knowledge about the usage of a given token in a sentence.
With the development of deep learning, negative sentence recognition has turned from sequence labeling toward deep neural network models and their many derivatives; researchers have studied the task with complex models and large amounts of annotated data, and the emergence of pre-trained models has further advanced the field of natural language processing. The most widespread approach at present is to use a pre-trained model and cast the problem as one of inputting a piece of text and outputting a label.
The prior art mainly comprises the following steps: (1) professionals annotate a large number of texts with polarity labels, each text serving as one sample, which yields a labeled corpus of annotated samples; (2) a model acquires classification ability by training on the labeled corpus, either in a sequence labeling manner, obtaining a classification model that recognizes negative sentences by identifying and matching negative cue words, or by training a deep learning network (typically a recurrent neural network, a pre-trained language model, or the like) to obtain a classification model; (3) the classification model is applied to a text with an unknown label to predict the polarity label of that piece of text. During testing, each input to the classification model is a single text.
In the sequence labeling variant of step (2), the network for recognizing negative sentences by identifying and matching negative cue words comprises an encoder layer and a fully connected (FC) layer. The encoder layer extracts the features of the text; common encoder layers include LSTM, BERT, and the like. The FC layer maps the text features to the label categories of the text. A piece of text is input and encoded into feature vectors; the FC layer then maps the features onto each word in the text, negative cue words in the text are identified, and the text is finally classified. Alternatively, negative cue words are recognized directly by constructing a word list or by using a CRF combined with feature engineering, and the classification of the text follows.
In the deep learning variant of step (2), the network likewise comprises an encoder layer (commonly LSTM, BERT, and the like) and an FC layer. A piece of text is input and encoded to obtain its features, and the FC layer then classifies the text directly.
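For concreteness, the following is a minimal sketch of this encoder-plus-FC baseline, assuming a Hugging Face BERT encoder; the model name, the use of the [CLS] vector as the text feature, and the label convention are illustrative assumptions rather than details fixed by the prior art described above:

```python
# Minimal encoder + fully connected (FC) text classifier (prior-art baseline).
# Assumptions: pip-installed torch and transformers; "bert-base-uncased" and the
# 0 = non-negative / 1 = negative label convention are illustrative choices.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class EncoderFCClassifier(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)            # encoder layer
        self.fc = nn.Linear(self.encoder.config.hidden_size, num_labels)  # FC layer

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        text_feature = out.last_hidden_state[:, 0]  # [CLS] vector as the text feature
        return self.fc(text_feature)                # logits over the label categories

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = EncoderFCClassifier()
batch = tokenizer(["He is not unhappy."], return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])
predicted_label = logits.softmax(-1).argmax(-1)     # single text in, single label out
```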
Much current research focuses on improving the representational capacity of models or on probing the knowledge acquired through language modeling. With deep learning based on pre-trained models, what a model learns may be only statistical regularities in the data, without a real understanding of negation. Our experimental results show that, in the negative sentence recognition task, the BERT model often errs on English samples whose negation is expressed by a negative affix (e.g., the prefixes 'in-' and 'im-' or the suffix '-less'). For example, for the sentence "On my remarking that I was constantly in the habit of doing the same thing you expressed incredulity", BERT misclassifies it as a non-negative sentence because it fails on the word "incredulity", which carries the negative prefix 'in-'.
In summary, because some negation information is hard to recognize, the prior-art approaches based on neural networks (LSTM, pre-trained language models, etc.) and feature engineering struggle to judge such sentences correctly, and their classification accuracy is not high enough.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a negative sentence recognition method, based on negative prefix and suffix (hereafter, negative affix) information, with high recognition accuracy.
To solve the above problem, the present invention provides a negative sentence recognition method based on negative affix information, comprising the following steps:
S1, training an auxiliary task model with a word training set, the word training set consisting of labeled words carrying negative affixes, and the auxiliary task model comprising a first sequence encoder, a first linear layer, and a first softmax activation layer; this step comprises:
S11, concatenating the embedding vector of each word with a negative affix in the word training set with the embedding vector of that word after its negative affix is removed, and inputting the result to the first sequence encoder, the first sequence encoder outputting a hidden-layer feature representation indicating whether the two words match, recorded as the first hidden-layer feature representation;
S12, the first linear layer mapping the hidden-layer feature representation of whether the two words match into a first label set, and the first softmax activation layer normalizing it to obtain the predicted label of the word with the negative affix, the first label set containing two predicted labels: non-negative word and negative word;
S2, training a main task model with a sentence training set, the sentence training set consisting of labeled sentences containing words with negative affixes, and the main task model comprising a second sequence encoder, a second linear layer, and a second softmax activation layer; this step comprises:
S21, splitting the sentences in the sentence training set into word sequences and inputting them to the second sequence encoder, the second sequence encoder outputting the hidden-layer feature representation of each sentence, recorded as the second hidden-layer feature representation;
S22, inputting the words with negative affixes in a sentence to the trained auxiliary task model to obtain their first hidden-layer feature representations and predicted labels; if the predicted label is non-negative word, inputting the second hidden-layer feature representation to the second linear layer; if the predicted label is negative word, inserting the word's first hidden-layer feature representation into the second hidden-layer feature representation of the sentence where the word is located to update the second hidden-layer feature representation, and inputting the updated second hidden-layer feature representation to the second linear layer;
S23, the second linear layer mapping the input second hidden-layer representation into a second label set, and the second softmax activation layer normalizing it to obtain the predicted label of the sentence containing words with negative affixes, the second label set containing two predicted labels: non-negative sentence and negative sentence;
and S3, recognizing the target sentence with the trained main task model and the trained auxiliary task model, predicting whether the target sentence is a negative sentence or a non-negative sentence.
As a further improvement of the present invention, in step S11, the concatenation of the embedding vector of a word with a negative affix in the word training set and the embedding vector of that word after its negative affix is removed is expressed as:

x_Aux = [Em(w), Em(w̃)]

where w denotes a word with a negative affix, w̃ denotes w with its negative affix removed, Em(w) denotes the embedding vector representation of the word w, and Em(w̃) denotes the embedding vector representation of w̃.
As a further improvement of the present invention, in step S11, the first sequence encoder outputs the hidden-layer feature representation of whether the two words match as follows:

x′_Aux = Encoder1(x_Aux)

where x′_Aux is the hidden-layer feature representation of whether the two words match, and Encoder1 denotes the encoder function of the first sequence encoder.
As a further improvement of the present invention, in step S12, the predicted label of a word with a negative affix is obtained as follows:

y_Aux = softmax1(W_1 · x′_Aux), ŷ_Aux = argmax(y_Aux)

where softmax1 denotes the softmax function of the first softmax activation layer, W_1 denotes the weight matrix to be learned in the first linear layer, y_Aux denotes the prediction probability of the auxiliary task model, and ŷ_Aux denotes the prediction result of the auxiliary task model, mapped into the first label set {non-negative word, negative word}.
As a further improvement of the present invention, in step S21, a sentence in the sentence training set is split into a word sequence as follows:

x_Main = [w_1, w_2, …, w_i, …, w_n]

where n is the number of words in the sentence, i = 1, 2, …, n, w_i corresponds to the i-th word in the sentence, and x_Main is the word sequence split from the sentence.
As a further refinement of the present invention, in step S21, the second sequence encoder outputs the hidden-layer feature representation of the sentence as follows:

[x′_1, x′_2, …, x′_i, …, x′_n] = Encoder2(x_Main)

where x′_i is the hidden-layer feature representation of the i-th word in the sentence, [x′_1, x′_2, …, x′_i, …, x′_n] is the hidden-layer feature representation of the sentence, and Encoder2 denotes the encoder function of the second sequence encoder.
As a further improvement of the present invention, in step S22, if the predicted label is negative word, inserting the first hidden-layer feature representation into the second hidden-layer feature representation of the sentence to update it comprises: inserting the word's first hidden-layer feature representation at the position corresponding to the word in the word sequence, obtaining the updated second hidden-layer feature representation:

X′_Main = [x′_1, …, x′_i, x′_Aux, …, x′_n]

where x′_Aux is the first hidden-layer feature representation of the word whose predicted label is negative word, and X′_Main is the updated second hidden-layer feature representation.
As a further improvement of the present invention, in step S23, the predicted label of the input sentence is obtained as follows:

y_Main = softmax2(W_2 · X′_Main), ŷ_Main = argmax(y_Main)

where softmax2 denotes the softmax function of the second softmax activation layer, W_2 denotes the weight matrix to be learned in the second linear layer, y_Main denotes the prediction probability of the main task model, and ŷ_Main denotes the prediction result of the main task model, mapped into the second label set {non-negative sentence, negative sentence}.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of any one of the above methods when executing the program.
The invention also provides a computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, is adapted to carry out the steps of any of the methods described above.
The beneficial effects of the invention are as follows:
the invention discloses a negative sentence recognition method based on negative prefix and suffix information, which comprises the steps of firstly training an auxiliary task model by using a word training set and obtaining information of words with negative prefixes and suffixes, then training a main task model by using a sentence training set and recognizing negative sentences. The method and the device model the recognition of the negative words into a matching model, and can greatly improve the recognition accuracy of the negative sentences by recognizing the words with the negative suffixes in the sentences and updating the hidden layer characteristic representation of the sentences.
The foregoing is merely an overview of the technical solutions of the present invention. To make the technical means of the invention clearer, so that it may be implemented in accordance with this description, and to make the above and other objects, features, and advantages of the invention more readily understood, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
Fig. 1 is a schematic diagram of the negative sentence recognition method based on negative affix information in a preferred embodiment of the present invention.
Detailed Description
The present invention is further described below with reference to the drawings and embodiments, so that those skilled in the art can better understand and practice it; the embodiments, however, are not to be construed as limiting the invention.
Example one
As shown in Fig. 1, this embodiment discloses a negative sentence recognition method based on negative prefix and suffix (negative affix) information, comprising the following steps:
S1, training an auxiliary task model with a word training set, the word training set consisting of labeled words carrying negative affixes, and the auxiliary task model comprising a first sequence encoder, a first linear layer, and a first softmax activation layer; this step comprises:
S11, concatenating the embedding vector of each word with a negative affix in the word training set with the embedding vector of that word after its negative affix is removed, and inputting the result to the first sequence encoder, the first sequence encoder outputting a hidden-layer feature representation indicating whether the two words match, recorded as the first hidden-layer feature representation.
Optionally, the concatenation of the embedding vector of a word with a negative affix in the word training set and the embedding vector of that word after its negative affix is removed is expressed as:

x_Aux = [Em(w), Em(w̃)] (1)

where w denotes a word with a negative affix, w̃ denotes w with its negative affix removed, Em(w) denotes the embedding vector representation of the word w, and Em(w̃) denotes the embedding vector representation of w̃. For example, for the word w = "impossibly", the word after removal of the negative affix is w̃ = "possibly".
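The pair (w, w̃) can be sketched in code as follows; the affix lists follow the embodiment described later, while strip_negative_affix is an assumed helper (the patent derives w̃ by removing the affix but prescribes no particular algorithm):

```python
# Build the auxiliary-task input pair (w, w~) for a word with a negative affix.
NEG_PREFIXES = ("un", "im", "in", "il", "ir", "dis")
NEG_SUFFIXES = ("less", "free")

def strip_negative_affix(word: str) -> str:
    """Return w~: the word with its negative prefix or suffix removed."""
    for prefix in NEG_PREFIXES:
        if word.startswith(prefix):
            return word[len(prefix):]
    for suffix in NEG_SUFFIXES:
        if word.endswith(suffix):
            return word[:-len(suffix)]
    return word

w = "impossibly"
w_tilde = strip_negative_affix(w)  # "possibly"
# Concatenated input x_Aux = [Em(w), Em(w~)]; with a BERT-style encoder this can
# be realized as the sentence pair "[CLS] impossibly [SEP] possibly [SEP]".
```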
Optionally, the first sequence encoder outputs the hidden-layer feature representation of whether the two words match as follows:

x′_Aux = Encoder1(x_Aux) (2)

where x′_Aux is the hidden-layer feature representation of whether the two words match, and Encoder1 denotes the encoder function of the first sequence encoder.
The first sequence encoder is the sequence-learning encoder of a pre-trained language model; the pre-trained language model can be BERT, XLNet, or a variant thereof.
S12, the first linear layer mapping the hidden-layer feature representation of whether the two words match into a first label set, and the first softmax activation layer normalizing it to obtain the predicted label of the word with the negative affix, the first label set containing two predicted labels: non-negative word and negative word.
Optionally, the predicted label of a word with a negative affix is obtained as follows:

y_Aux = softmax1(W_1 · x′_Aux), ŷ_Aux = argmax(y_Aux) (3)

where softmax1 denotes the softmax function of the first softmax activation layer, W_1 denotes the weight matrix to be learned in the first linear layer, y_Aux denotes the prediction probability of the auxiliary task model, and ŷ_Aux denotes the prediction result of the auxiliary task model, mapped into the first label set {non-negative word, negative word}.
The auxiliary task model is trained with all the words carrying negative affixes in the word training set, yielding the trained auxiliary task model.
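Putting steps S11 and S12 together, the following is a minimal sketch of the auxiliary task model and one training step, assuming a Hugging Face BERT encoder as Encoder1, sentence-pair encoding of (w, w̃), and cross-entropy training; all of these are illustrative choices, as the patent only fixes the encoder/linear/softmax structure:

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class AuxiliaryTaskModel(nn.Module):
    """Encoder1 + first linear layer; softmax1 is applied inside the loss (eqs. (1)-(3))."""
    def __init__(self, encoder_name="bert-base-uncased"):
        super().__init__()
        self.encoder1 = AutoModel.from_pretrained(encoder_name)
        self.linear1 = nn.Linear(self.encoder1.config.hidden_size, 2)  # weight matrix W_1

    def forward(self, input_ids, attention_mask):
        out = self.encoder1(input_ids=input_ids, attention_mask=attention_mask)
        x_aux = out.last_hidden_state[:, 0]  # x'_Aux: does the pair (w, w~) match?
        return x_aux, self.linear1(x_aux)    # first hidden-layer representation + logits

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AuxiliaryTaskModel()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss_fn = nn.CrossEntropyLoss()  # cross-entropy = softmax1 + negative log-likelihood

# One illustrative training step on the pair (w, w~) with label 1 = "negative word".
batch = tokenizer(["impossibly"], ["possibly"], return_tensors="pt")
_, logits = model(batch["input_ids"], batch["attention_mask"])
loss = loss_fn(logits, torch.tensor([1]))
loss.backward()
optimizer.step()
optimizer.zero_grad()
```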
S2, training a main task model with a sentence training set, the sentence training set consisting of labeled sentences containing words with negative affixes, and the main task model comprising a second sequence encoder, a second linear layer, and a second softmax activation layer; this step comprises:
S21, splitting the sentences in the sentence training set into word sequences and inputting them to the second sequence encoder, the second sequence encoder outputting the hidden-layer feature representation of each sentence, recorded as the second hidden-layer feature representation.
Optionally, in step S21, a sentence in the sentence training set is split into a word sequence as follows:

x_Main = [w_1, w_2, …, w_i, …, w_n] (4)

where n is the number of words in the sentence, i = 1, 2, …, n, w_i corresponds to the i-th word in the sentence, and x_Main is the word sequence split from the sentence.
In step S21, the second sequence encoder outputs the hidden-layer feature representation of the sentence as follows:

[x′_1, x′_2, …, x′_i, …, x′_n] = Encoder2(x_Main) (5)

where x′_i is the hidden-layer feature representation of the i-th word in the sentence, [x′_1, x′_2, …, x′_i, …, x′_n] is the hidden-layer feature representation of the sentence, and Encoder2 denotes the encoder function of the second sequence encoder.
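A sketch of eqs. (4) and (5): splitting a sentence into words and obtaining one hidden-layer vector per word from a BERT-style Encoder2. Averaging the sub-token vectors of each word is an assumed alignment strategy; the patent does not specify how words map to encoder states:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder2 = AutoModel.from_pretrained("bert-base-uncased")

sentence = "He was careless about the results"
x_main = sentence.split()  # eq. (4): the word sequence [w_1, ..., w_n]

enc = tokenizer(x_main, is_split_into_words=True, return_tensors="pt")
with torch.no_grad():
    hidden = encoder2(**enc).last_hidden_state[0]  # one vector per sub-token

# eq. (5): one hidden representation x'_i per word w_i (sub-tokens averaged).
word_ids = enc.word_ids()
x_prime = [
    hidden[[j for j, wid in enumerate(word_ids) if wid == i]].mean(dim=0)
    for i in range(len(x_main))
]
print(len(x_prime), x_prime[0].shape)  # n word vectors, each of size 768
```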
S22, inputting the words with negative affixes in a sentence to the trained auxiliary task model to obtain their first hidden-layer feature representations and predicted labels; if the predicted label is non-negative word, inputting the second hidden-layer feature representation to the second linear layer; if the predicted label is negative word, inserting the word's first hidden-layer feature representation into the second hidden-layer feature representation of the sentence where the word is located to update the second hidden-layer feature representation, and inputting the updated second hidden-layer feature representation to the second linear layer.
Suppose w_i is a word with a negative affix. In step S22, if the predicted label is negative word, inserting the first hidden-layer feature representation into the second hidden-layer feature representation of the sentence to update it comprises: inserting the word's first hidden-layer feature representation at the position corresponding to the word in the word sequence, obtaining the updated second hidden-layer feature representation:

X′_Main = [x′_1, …, x′_i, x′_Aux, …, x′_n] (6)

where x′_Aux is the first hidden-layer feature representation of the word w_i whose predicted label is negative word, and X′_Main is the updated second hidden-layer feature representation.
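As a tensor operation, the insertion in eq. (6) can be sketched as follows; reading "the position corresponding to the word" as immediately after x′_i is an assumption, since the patent does not state the exact offset:

```python
import torch

def insert_aux_representation(x_main: torch.Tensor, x_aux: torch.Tensor, i: int) -> torch.Tensor:
    """Insert x'_Aux (shape [d]) into the sentence representation (shape [n, d])
    at the position of word w_i, yielding X'_Main of shape [n + 1, d]."""
    return torch.cat([x_main[: i + 1], x_aux.unsqueeze(0), x_main[i + 1:]], dim=0)

n, d = 8, 768
x_main = torch.randn(n, d)  # [x'_1, ..., x'_n] from Encoder2
x_aux = torch.randn(d)      # x'_Aux from the trained auxiliary task model
x_main_updated = insert_aux_representation(x_main, x_aux, i=3)
print(x_main_updated.shape)  # torch.Size([9, 768])
```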
S23, the second linear layer mapping the input second hidden-layer representation into a second label set, and the second softmax activation layer normalizing it to obtain the predicted label of the sentence containing words with negative affixes, the second label set containing two predicted labels: non-negative sentence and negative sentence.
Optionally, the predicted label of a sentence containing a word with a negative affix is obtained as follows:

y_Main = softmax2(W_2 · X′_Main), ŷ_Main = argmax(y_Main) (7)

where softmax2 denotes the softmax function of the second softmax activation layer, W_2 denotes the weight matrix to be learned in the second linear layer, y_Main denotes the prediction probability of the main task model, and ŷ_Main denotes the prediction result of the main task model, mapped into the second label set {non-negative sentence, negative sentence}.
The main task model is trained with all the sentences in the sentence training set that contain words with negative affixes, yielding the trained main task model.
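A minimal sketch of the main task model combining Encoder2, the insertion of eq. (6), and the second linear layer; mean-pooling X′_Main into a single vector before classification is an assumption, as the patent does not fix how the updated representation is reduced for the second linear layer:

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class MainTaskModel(nn.Module):
    """Encoder2 + insertion of x'_Aux + second linear layer (eqs. (4)-(7))."""
    def __init__(self, encoder_name="bert-base-uncased"):
        super().__init__()
        self.encoder2 = AutoModel.from_pretrained(encoder_name)
        self.linear2 = nn.Linear(self.encoder2.config.hidden_size, 2)  # weight matrix W_2

    def forward(self, input_ids, attention_mask, x_aux=None, i=None):
        out = self.encoder2(input_ids=input_ids, attention_mask=attention_mask)
        x_main = out.last_hidden_state[0]  # [n, d] states for one sentence
        if x_aux is not None:              # auxiliary model predicted "negative word" for w_i
            x_main = torch.cat([x_main[: i + 1], x_aux.unsqueeze(0), x_main[i + 1:]], dim=0)
        pooled = x_main.mean(dim=0)        # assumed pooling of X'_Main
        return self.linear2(pooled)        # logits; softmax2 is applied in the loss
```

At inference time (step S3), the auxiliary model first predicts the label of each word with a negative affix; x_aux and i are passed only when that label is negative word, mirroring the branch in step S22.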
And S3, recognizing the target sentence with the trained main task model and the trained auxiliary task model, predicting whether the target sentence is a negative sentence or a non-negative sentence.
The invention discloses a negative sentence recognition method based on negative affix information. An auxiliary task model is first trained with a word training set to capture information about words with negative affixes; a main task model for recognizing negative sentences is then trained with a sentence training set. During the training of the main task model, the trained auxiliary task model produces the first hidden-layer feature representation of each word with a negative affix in a sentence, and this representation is inserted into the second hidden-layer feature representation of the sentence; finally, target sentences are recognized with the trained main and auxiliary task models. The method models the recognition of negative words as a matching task; by recognizing words with negative affixes in sentences and updating the sentences' hidden-layer feature representations, it can greatly improve the accuracy of negative sentence recognition.
As shown in Table 1, which gives the training results of the auxiliary task model in one embodiment, Macro-F1 denotes the average of the F1 scores over all classes, 1-F1 denotes the F1 value of positive samples, 0-F1 denotes the F1 value of negative samples, and Accuracy denotes the classification accuracy. The experimental results show that negative-affix recognition based on BERT-base performs well: both the Macro-F1 value and the accuracy reach 0.903, and the F1 values of class 0 (non-negative affix) and class 1 (negative affix) also exceed 0.90. This recognition quality provides a good basis for assisting the main task model.
TABLE 1
(Table 1 is reproduced as an image in the original publication.)
Table 2 shows the experimental results of the negative sentence recognition task with different recognition methods. The first group of rows are methods that first identify the negative cue words in a sentence and then recognize whether it is a negative sentence: from top to bottom, a negative word-list construction method, two sequence-labeling-based CRF + feature engineering methods, and a BiLSTM method. In feature engineering 1, negative cue words are divided into four classes and a CRF sequence labeling model is trained mainly on lexical features to label the negative cue words. In feature engineering 2, after excluding the negative-affix cases, a non-negative cue word list and a high-frequency negative cue word list are constructed automatically from the training corpus, and a CRF sequence labeling model is trained by combining the negative-affix features of negative cue words with lexical features. The remaining rows recognize negative sentences directly: from top to bottom, the models used are BiLSTM, SVM, and BERT-base, and the method of the invention also uses BERT-base. All reported results are averages over 5 different random seeds.
TABLE 2
(Table 2 is reproduced as an image in the original publication.)
From the results in Table 2 it can be seen that: (1) the method of the invention improves the F1 value of class-1 samples most markedly, by 1.8%-14.5% over the several baseline methods, which verifies that the method improves performance on class 1 (negative sentences); (2) the improvement on class-0 samples is relatively small, 0.6%-8.5% over the baselines; (3) the method effectively improves the overall F1 value, its Macro-F1 being 1.2%-14.5% higher than those of the baselines. Meanwhile, because the feature engineering of the CRF + feature engineering 2 method treats negative affixes specially, that method achieves the best result apart from the method of the invention, which further verifies the importance of negative affixes in negative sentence recognition. These results fully verify that the proposed negative sentence recognition method based on negative affix information can effectively improve negative sentence recognition performance.
For the auxiliary task model, this embodiment selects six prefixes that are common in English and may carry negative meaning, 'un-', 'im-', 'in-', 'il-', 'ir-', and 'dis-', and two common suffixes that may carry negative meaning, '-less' and '-free'. The words used in the experiments were collected from the ninth edition of the Oxford English-Chinese Dictionary and from the 1.6 million English tweets collected by Go et al. Specifically, the method extracted 2,717 words containing negative affixes from the dictionary and 6,671 from the English tweet corpus. Each word containing a negative affix has one of two possible labels, 'negative word' or 'non-negative word': a negative word carries negative meaning because of its affix, whereas for a non-negative word the affix does not affect the positive or negative meaning of the word itself. From the collected words, 3,000 were randomly selected and manually labeled by two annotators, with uncertain words adjudicated by a third annotator. The Kappa value of the consistency test was 0.87. Statistics of the labeled data are shown in Table 3.
TABLE 3
(Table 3 is reproduced as an image in the original publication.)
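The candidate-extraction step described above can be sketched as a simple scan for the six prefixes and two suffixes; the corpus variable and the punctuation handling are illustrative:

```python
# Collect candidate words carrying one of the selected negative affixes.
NEG_PREFIXES = ("un", "im", "in", "il", "ir", "dis")
NEG_SUFFIXES = ("less", "free")

def has_negative_affix(word: str) -> bool:
    w = word.lower()
    return w.startswith(NEG_PREFIXES) or w.endswith(NEG_SUFFIXES)

corpus = ["This is an impossible, careless claim about carefree insiders."]
candidates = sorted({
    token.strip(".,!?").lower()
    for sentence in corpus
    for token in sentence.split()
    if has_negative_affix(token.strip(".,!?"))
})
print(candidates)  # ['carefree', 'careless', 'impossible', 'insiders']
```

Note how such a scan over-generates: 'insiders' merely begins with the letters 'in-'. This is precisely why the extracted candidates were manually labeled as negative or non-negative words.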
To better validate the auxiliary task model, the invention ensures that the data set used for the auxiliary task contains none of the words with negative affixes that appear in the main task data set. Finally, 2,000 samples balanced between positive and negative classes were selected from the labeled corpus for the auxiliary task experiment; the distribution of the labeled samples is shown in Table 3. The 2,000 samples were randomly partitioned into training, validation, and test sets according to the scale of 7.
For the main task model, the method used the *SEM 2012 shared task data for the experiments. The data in the *SEM 2012 shared task data set are in CoNLL format, where each word entry mainly comprises: the current word, its lemma, its part-of-speech (POS) tag, the syntax tree, negation information, and so on. The negation information includes whether the current word is a negative cue word and whether it lies within a negation scope. 5,519 sentences were extracted from the *SEM 2012 shared data set and classified according to the annotated negation information: the training set contains 3,643 sentences, of which 848 are negative and 2,795 non-negative; the validation set contains 787 sentences, of which 144 are negative and 643 non-negative; and the test set contains 1,089 sentences, of which 235 are negative and 854 non-negative. In the experiments, the method keeps the original data set partition of the *SEM 2012 shared task.
Example two
This embodiment discloses an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, it implements the steps of the negative sentence recognition method based on negative affix information of Example one.
EXAMPLE III
This embodiment discloses a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of the negative sentence recognition method based on negative affix information of Example one.
The above embodiments are merely preferred embodiments given to fully illustrate the present invention, and the protection scope of the invention is not limited to them. Equivalent substitutions or modifications made by those skilled in the art on the basis of the present invention are all within the protection scope of the invention, which is defined by the claims.

Claims (10)

1. A negative sentence recognition method based on negative prefix and suffix (negative affix) information, characterized by comprising the following steps:
s1, training an auxiliary task model by utilizing a word training set, wherein the word training set is formed by marked words with negative suffixes, and the auxiliary task model comprises a first sequence encoder, a first linear layer and a first softmax activation layer; the method comprises the following steps:
s11, after splicing the embedded vectors of the words with the negative suffixes in the word training set and the embedded vectors of the words without the negative suffixes, inputting the spliced embedded vectors into the first sequence encoder, wherein the first sequence encoder outputs a hidden layer feature representation indicating whether the two words are matched or not, and the hidden layer feature representation is recorded as a first hidden layer feature representation;
s12, mapping the hidden layer feature representation whether the two words are matched into a first label set by the first linear layer, and normalizing through a first softmax activation layer to obtain a prediction label of the word with a negative prefix-suffix; the first label set comprises two prediction labels of a non-negative word and a negative word;
s2, training a main task model by utilizing a sentence training set, wherein the sentence training set is formed by labeled sentences containing words with negative suffixes, and the main task model comprises a second sequence encoder, a second linear layer and a second softmax activation layer; the method comprises the following steps:
s21, splitting sentences in the sentence training set into word sequences, inputting the word sequences into the second sequence encoder, and outputting hidden layer feature representation of the sentences by the second sequence encoder to be marked as second hidden layer feature representation;
s22, inputting the words with the negative suffixes in the sentences into the trained auxiliary task model to obtain a first hidden layer feature representation and a prediction label of the words; inputting a second hidden layer feature representation into the second linear layer if the predicted tag is a non-negative word; if the predicted tag is a negative word, inserting the first hidden layer feature representation of the predicted tag into a second hidden layer feature representation of the sentence where the predicted tag is located to update the second hidden layer feature representation, and inputting the updated second hidden layer feature representation into the second linear layer;
s23, the second linear layer maps the input second hidden layer representation into a second label set, and normalization is carried out through a second softmax activation layer to obtain a predicted label of a sentence with a word with a negative suffix; the second label set comprises two prediction labels of a non-negative sentence and a negative sentence;
and S3, identifying the target sentence by using the trained main task model and the trained auxiliary task model, and predicting whether the target sentence is a negative sentence or a non-negative sentence.
2. The negative sentence recognition method based on negative prefix and suffix information according to claim 1, wherein in step S11 the concatenation of the embedding vector of a word with a negative affix in the word training set and the embedding vector of that word after its negative affix is removed is expressed as:

x_Aux = [Em(w), Em(w̃)]

where w denotes a word with a negative affix, w̃ denotes w with its negative affix removed, Em(w) denotes the embedding vector representation of the word w, and Em(w̃) denotes the embedding vector representation of w̃.
3. The negative sentence recognition method based on negative prefix and suffix information according to claim 2, wherein in step S11 the first sequence encoder outputs the hidden-layer feature representation of whether the two words match as follows:

x′_Aux = Encoder1(x_Aux)

where x′_Aux is the hidden-layer feature representation of whether the two words match, and Encoder1 denotes the encoder function of the first sequence encoder.
4. The negative sentence recognition method based on negative prefix and suffix information according to claim 3, wherein in step S12 the predicted label of a word with a negative affix is obtained as follows:

y_Aux = softmax1(W_1 · x′_Aux), ŷ_Aux = argmax(y_Aux)

where softmax1 denotes the softmax function of the first softmax activation layer, W_1 denotes the weight matrix to be learned in the first linear layer, y_Aux denotes the prediction probability of the auxiliary task model, and ŷ_Aux denotes the prediction result of the auxiliary task model, mapped into the first label set {non-negative word, negative word}.
5. The method according to claim 1, wherein in step S21 a sentence in the sentence training set is split into a word sequence as follows:

x_Main = [w_1, w_2, …, w_i, …, w_n]

where n is the number of words in the sentence, i = 1, 2, …, n, w_i corresponds to the i-th word in the sentence, and x_Main is the word sequence split from the sentence.
6. The negative sentence recognition method based on negative prefix and suffix information according to claim 5, wherein in step S21 the second sequence encoder outputs the hidden-layer feature representation of the sentence as follows:

[x′_1, x′_2, …, x′_i, …, x′_n] = Encoder2(x_Main)

where x′_i is the hidden-layer feature representation of the i-th word in the sentence, [x′_1, x′_2, …, x′_i, …, x′_n] is the hidden-layer feature representation of the sentence, and Encoder2 denotes the encoder function of the second sequence encoder.
7. The method for recognizing a negative sentence according to claim 6, wherein in step S22, if the predicted label is negative word, inserting the first hidden-layer feature representation into the second hidden-layer feature representation of the sentence to update it comprises: inserting the word's first hidden-layer feature representation at the position corresponding to the word in the word sequence, obtaining the updated second hidden-layer feature representation:

X′_Main = [x′_1, …, x′_i, x′_Aux, …, x′_n]

where x′_Aux is the first hidden-layer feature representation of the word whose predicted label is negative word, and X′_Main is the updated second hidden-layer feature representation.
8. The negative sentence recognition method based on negative prefix and suffix information according to claim 7, wherein in step S23 the predicted label of a sentence containing a word with a negative affix is obtained as follows:

y_Main = softmax2(W_2 · X′_Main), ŷ_Main = argmax(y_Main)

where softmax2 denotes the softmax function of the second softmax activation layer, W_2 denotes the weight matrix to be learned in the second linear layer, y_Main denotes the prediction probability of the main task model, and ŷ_Main denotes the prediction result of the main task model, mapped into the second label set {non-negative sentence, negative sentence}.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1-8 are implemented when the program is executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
CN202210976289.5A 2022-08-15 2022-08-15 Negative sentence identification method based on negative prefix and suffix information Active CN115048924B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210976289.5A CN115048924B (en) 2022-08-15 2022-08-15 Negative sentence identification method based on negative prefix and suffix information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210976289.5A CN115048924B (en) 2022-08-15 2022-08-15 Negative sentence identification method based on negative prefix and suffix information

Publications (2)

Publication Number Publication Date
CN115048924A CN115048924A (en) 2022-09-13
CN115048924B true CN115048924B (en) 2022-12-23

Family

ID=83166782

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210976289.5A Active CN115048924B (en) 2022-08-15 2022-08-15 Negative sentence identification method based on negative prefix and suffix information

Country Status (1)

Country Link
CN (1) CN115048924B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110555205B (en) * 2018-05-31 2024-04-19 北京京东尚科信息技术有限公司 Negative semantic recognition method and device, electronic equipment and storage medium
CN114818729A (en) * 2022-04-28 2022-07-29 阳光保险集团股份有限公司 Method, device and medium for training semantic recognition model and searching sentence
CN114896971B (en) * 2022-07-15 2022-09-30 苏州大学 Method, device and storage medium for recognizing specific prefix and suffix negative words

Also Published As

Publication number Publication date
CN115048924A (en) 2022-09-13

Similar Documents

Publication Publication Date Title
CN111062217B (en) Language information processing method and device, storage medium and electronic equipment
CN111738007B (en) Chinese named entity identification data enhancement algorithm based on sequence generation countermeasure network
CN112149421A (en) Software programming field entity identification method based on BERT embedding
US20230069935A1 (en) Dialog system answering method based on sentence paraphrase recognition
CN110298044B (en) Entity relationship identification method
CN115599901B (en) Machine question-answering method, device, equipment and storage medium based on semantic prompt
CN114676255A (en) Text processing method, device, equipment, storage medium and computer program product
CN115080694A (en) Power industry information analysis method and equipment based on knowledge graph
CN112800184B (en) Short text comment emotion analysis method based on Target-Aspect-Opinion joint extraction
CN116523031B (en) Training method of language generation model, language generation method and electronic equipment
CN114896971B (en) Method, device and storage medium for recognizing specific prefix and suffix negative words
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
CN114492460B (en) Event causal relationship extraction method based on derivative prompt learning
CN112036186A (en) Corpus labeling method and device, computer storage medium and electronic equipment
CN109242020A (en) A kind of music field order understanding method based on fastText and CRF
CN112883713A (en) Evaluation object extraction method and device based on convolutional neural network
CN115860015B (en) Translation memory-based transcription text translation method and computer equipment
CN115048924B (en) Negative sentence identification method based on negative prefix and suffix information
CN116595189A (en) Zero sample relation triplet extraction method and system based on two stages
CN111401069A (en) Intention recognition method and intention recognition device for conversation text and terminal
CN115906818A (en) Grammar knowledge prediction method, grammar knowledge prediction device, electronic equipment and storage medium
CN115510230A (en) Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism
CN113012685B (en) Audio recognition method and device, electronic equipment and storage medium
CN114661900A (en) Text annotation recommendation method, device, equipment and storage medium
CN114444492A (en) Non-standard word class distinguishing method and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant