CN111339260A - BERT and QA thought-based fine-grained emotion analysis method - Google Patents

BERT and QA thought-based fine-grained emotion analysis method Download PDF

Info

Publication number
CN111339260A
Authority
CN
China
Prior art keywords
sentence
auxiliary
sequence
bert
emotion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010136542.7A
Other languages
Chinese (zh)
Inventor
谭祥
车海莺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN202010136542.7A priority Critical patent/CN111339260A/en
Publication of CN111339260A publication Critical patent/CN111339260A/en
Withdrawn legal-status Critical Current

Classifications

    • G06F16/3344 Query execution using natural language analysis
    • G06F16/35 Clustering; Classification (information retrieval of unstructured textual data)
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the field of text emotion analysis in natural language processing, and in particular to a fine-grained emotion analysis method based on BERT and QA ideas. By adding auxiliary sentences to the data set, the original single-sentence task is converted into a sentence-pair task, exploiting the fact that the BERT model performs better on sentence-pair tasks. Adding an auxiliary sentence also turns the downstream task of the BERT model into a question-answering task: the emotion polarity corresponding to each extracted aspect is found in the auxiliary sentence, so the original two-step subtasks can be completed in a single step. By using the BERT model, converting the downstream task into a QA task through constructed auxiliary sentences, and fine-tuning the BERT model, the invention improves the results on the two subtasks of aspect extraction and emotion classification, can extract multiple aspect words at once, improves model efficiency, and reduces redundant workload.

Description

BERT and QA thought-based fine-grained emotion analysis method
Technical Field
The invention relates to the field of text emotion analysis in natural language processing, in particular to a fine-grained emotion analysis method based on BERT and QA ideas.
Background
At present, text comments on the Internet (including social comments, news comments, commodity evaluations, and the like) have both academic and commercial value. Sentiment analysis of these comments can identify a user's emotional attitude toward a certain event, commodity, and so on. For example, a social platform or news portal can use the analyzed information for targeted marketing and content pushing, while an e-commerce platform can use emotion analysis to evaluate the various attributes of a commodity from consumer reviews, saving other consumers' browsing time and enabling refined label displays of commodity evaluations rather than simple good/bad ratings.
Currently, emotion analysis can be divided into coarse-grained and fine-grained types according to the granularity of the object a task attends to: coarse-grained tasks work at the document and sentence levels, while fine-grained tasks work at the aspect level, where an aspect can be one word or several words. Coarse-grained emotion analysis assigns a single emotion polarity to a whole document or sentence, which is a serious limitation: a single sentence may contain different emotion polarities toward multiple attributes, so coarse-grained analysis can neither find out which attributes or aspects of a commodity or news item a user is commenting on, nor determine which emotion is expressed toward a specific attribute, and it is difficult to cover the user's emotional expression comprehensively. Fine-grained emotion analysis therefore has great research value.
The fine-grained emotion analysis task has two subtasks, namely aspect extraction and aspect-level emotion classification. In the prior art, the two subtasks are mainly performed separately, with improvements made to each independently. Existing aspect extraction techniques mainly comprise unsupervised, semi-supervised, and supervised learning, among which supervised deep learning models such as LSTM, CRF, and BERT achieve excellent results; existing emotion classification techniques mainly comprise emotion-dictionary-based, machine-learning-based, and deep-learning-based methods. Deep learning performs better than the other methods in both aspect extraction and emotion classification.
However, the current fine-grained emotion analysis task has two main problems. First, since it operates on text data, some deep learning models such as CNN, RNN, and LSTM need to be pre-trained on large text data sets, which takes a long time. Second, current fine-grained emotion analysis is mainly performed step by step: after the aspect extraction subtask is completed, the emotion polarity of the corresponding aspect is judged for each sentence in a second pass, so efficiency is low.
Disclosure of Invention
The invention aims to provide a fine-grained emotion analysis method based on BERT and QA ideas, so as to solve the problems described in the background art.
In order to achieve the purpose, the invention provides the following technical scheme: a fine-grained emotion analysis method based on BERT and QA ideas comprises the following steps:
Step one: select the SemEval-2014 data set as the corpus and add auxiliary sentences to all text data;
Step two: segment the processed text and tokenize it with a 30,552-word vocabulary, attach a [CLS] token at the beginning of the sentence, add [SEP] tokens between the original sentence and the auxiliary sentence and at the end, and generate the input sequence X, specifically: [CLS] original sentence sequence [SEP] auxiliary sentence sequence [SEP];
Step three: vectorize the input sequence X, representing each word of the input text sequence with a pre-trained word feature vector, to obtain the word vector h_0 of the input text, expressed as:

h_0 ∈ R^((n+2)×h)

where h is the size of the hidden layer;
Step four: the word vector h_0 obtained in step three is fed as input to the L stacked Transformer blocks, and training yields the word vectors h_i that fuse sentence semantics, expressed as:

h_i = Transformer(h_(i-1)), i ∈ [1, L]
step five: constructing a heuristic aspect extractor capable of extracting a plurality of aspects, which specifically comprises the following sub-steps:
Step 5.1: based on the word vectors h_i that fuse sentence semantics, train the BERT model on the aspects of the training data by gradient descent and update the parameters;
define the probabilities p^s and p^e of the start and end positions of the aspect corresponding to each sentence, namely:

g^s = w_s · h_L, p^s = softmax(g^s)

g^e = w_e · h_L, p^e = softmax(g^e)

where w_s ∈ R^h and w_e ∈ R^h are weight vectors to be trained, and softmax is the normalization function;
Step 5.2: pre-mark the boundaries of the target entities in the training data set to obtain the label list T and the start vector y^s ∈ R^(n+2) and end vector y^e ∈ R^(n+2), where each element y_i^s indicates whether the ith token is the start of a target aspect and each element y_i^e indicates whether the ith token is the end of an aspect;
minimize the sum of the negative log-probabilities of the true start and end positions, training the model by gradient descent:

L = −Σ_i y_i^s · log(p_i^s) − Σ_j y_j^e · log(p_j^e)
Step 5.3: redundant and invalid predictions occur when multiple aspects are extracted, for example:
Original sentence: I like the food but the service was so awful!
Real aspects: food, service
Predicted aspects: food, food but the service, service was so awful, service, …
A heuristic multi-boundary algorithm is therefore provided, with pseudocode as follows:
[Pseudocode figure: heuristic multi-boundary algorithm, lines 1-15]
where g^s denotes the score of a start position, g^e denotes the score of an end position, γ is a hyperparameter serving as the minimum score threshold, and K is the maximum number of aspects in a single sentence.
the algorithm mainly comprises the following steps:
a. initialize three sets R, U, and O (line 1);
b. from the scores g^s and g^e produced by the trained weight vectors, select the top-M highest-scoring position index sets S and E, where s_i denotes the ith start position in set S and e_j denotes the jth end position in set E (line 2);
c. when the end position is not smaller than the start position and g^s(s_i) + g^e(e_j) exceeds γ (lines 3-8), add the candidate boundary (s_i, e_j) to set R (lines 7-8) and add the heuristic regularization value u = g^s(s_i) + g^e(e_j) − (e_j − s_i + 1) to set U (lines 6 and 8);
d. remove redundant boundaries in set R using a non-maximum suppression algorithm: while set R is not empty and the size of set O is smaller than K (lines 9-14), remove from set R the boundary r_l corresponding to the current maximum value u_i in set U and add r_l to set O (lines 10-11); when a boundary r_k in set R overlaps r_l (overlap being checked with a fine-grained F1 measure), remove the corresponding boundary and value from sets R and U (lines 12-14), i.e., remove the redundant aspect from the candidates;
e. obtain the boundary set O of start and end positions corresponding to the multiple aspects (line 15); aspect-word extraction is then complete;
Step six: splice the feature vectors of the extracted aspect words with the feature vectors of the auxiliary sentence using a self-attention operation to obtain the semantically fused auxiliary feature vector, denoted h'.
Step seven: predicting the emotion polarity of the corresponding aspect of the auxiliary feature word vector through a feedforward neural network, and the specific process is as follows:
Step 7.1: the emotion polarity value is obtained by applying the Tanh activation function between two linear transformations:

g^p = W_p · tanh(W_h' · h')
where W_p and W_h' are two parameter matrices to be trained;
the polarity value is normalized into the emotion polarity probability by the softmax function:

p^p = softmax(g^p)
Step 7.2: train the parameter matrices W_p and W_h' by gradient descent and fine-tune the BERT model by minimizing the negative log-probability of the true emotion polarity, expressed as:

L = −y_p · log(p^p)
where y_p is the one-hot representation vector of the true emotion polarity;
Step 7.3: after the BERT model has been trained by gradient descent with the above loss, compute the emotion polarity probability p^p for each auxiliary feature vector h'; the emotion word corresponding to the highest probability is the emotion polarity of the corresponding aspect word.
Preferably, the data set selected in the first step includes three data sets, namely LAPTOP, REST and TWITTER.
Preferably, the content of the auxiliary sentence in the first step is: "positive or negative or neutral".
Preferably, the heuristic regularization value in the algorithm of step 5.3, u = g^s(s_i) + g^e(e_j) − (e_j − s_i + 1), represents the sum of the start-position and end-position scores minus the aspect length.
Preferably, the input sequence refers to the sequence formed by tokenizing the corpus and the auxiliary sentences and splicing them with the predefined symbols [CLS] and [SEP]; the spliced sequence is "[CLS] original sentence sequence [SEP] auxiliary sentence sequence [SEP]", where [CLS] is the semantic symbol of the input text sequence and [SEP] is the separator between the question sequence and the text-passage sequence.
Compared with the prior art, the beneficial effects of the invention mainly comprise the following aspects:
The BERT model used by the invention was already pre-trained on large text data sets when Google proposed it, so compared with models such as CNN, RNN, and LSTM, the pre-training step and its associated workload are avoided;
In the fine-grained emotion analysis task, compared with the step-by-step approach in which one trained model extracts the target entities and another model then judges the corresponding emotion polarities, the invention improves model training efficiency;
The BERT model performs better on sentence-pair tasks than on single-sentence tasks, and adding auxiliary sentences exploits this characteristic;
Experiments show that marking boundaries performs better than sequence-labeling methods, and embedding a heuristic extractor based on boundary marking allows multiple target aspects to be output at once, i.e., handles a sentence that contains multiple aspects.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, the present invention provides a technical solution: a fine-grained emotion analysis method based on BERT and QA ideas comprises the following steps:
Step one: select the SemEval-2014 data set as the corpus and add auxiliary sentences to all text data;
Step two: segment the processed text and tokenize it with a 30,552-word vocabulary, attach a [CLS] token at the beginning of the sentence, add [SEP] tokens between the original sentence and the auxiliary sentence and at the end, and generate the input sequence X, specifically: [CLS] original sentence sequence [SEP] auxiliary sentence sequence [SEP];
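For illustration, a minimal sketch of step two, assuming the HuggingFace transformers tokenizer (the patent does not prescribe a toolkit; the example sentence and the auxiliary sentence are taken from the description):

    # Illustrative sketch of step two; the HuggingFace "transformers" library
    # is an assumption, not part of the patent.
    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

    original = "I like the food but the service was so awful!"
    auxiliary = "positive or negative or neutral"  # auxiliary sentence from step one

    # Passing a sentence pair produces "[CLS] original [SEP] auxiliary [SEP]".
    encoded = tokenizer(original, auxiliary, return_tensors="pt")
    print(tokenizer.convert_ids_to_tokens(encoded["input_ids"][0].tolist()))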
Step three: vectorize the input sequence X, representing each word of the input text sequence with a pre-trained word feature vector, to obtain the word vector h_0 of the input text, expressed as:

h_0 ∈ R^((n+2)×h)

where h is the size of the hidden layer;
Step four: the word vector h_0 obtained in step three is fed as input to the L stacked Transformer blocks, and training yields the word vectors h_i that fuse sentence semantics, expressed as:

h_i = Transformer(h_(i-1)), i ∈ [1, L]
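A corresponding sketch of steps three and four, again assuming the HuggingFace API: the spliced sequence is passed through the pre-trained Transformer stack, and the per-layer word vectors h_0, ..., h_L are read off:

    # Illustrative sketch of steps three and four; the library choice is an assumption.
    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

    encoded = tokenizer("I like the food but the service was so awful!",
                        "positive or negative or neutral", return_tensors="pt")
    with torch.no_grad():
        out = model(**encoded)

    h_all = out.hidden_states       # (h_0, h_1, ..., h_L): embedding layer + L blocks
    h_L = out.last_hidden_state     # word vectors fusing sentence semantics
    print(h_L.shape)                # (1, sequence length n+2, hidden size h)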
step five: constructing a heuristic aspect extractor capable of extracting a plurality of aspects, which specifically comprises the following sub-steps:
Step 5.1: based on the word vectors h_i that fuse sentence semantics, train the BERT model on the aspects of the training data by gradient descent and update the parameters;
define the probabilities p^s and p^e of the start and end positions of the aspect corresponding to each sentence, namely:

g^s = w_s · h_L, p^s = softmax(g^s)

g^e = w_e · h_L, p^e = softmax(g^e)

where w_s ∈ R^h and w_e ∈ R^h are weight vectors to be trained, and softmax is the normalization function;
Step 5.2: pre-mark the boundaries of the target entities in the training data set to obtain the label list T and the start vector y^s ∈ R^(n+2) and end vector y^e ∈ R^(n+2), where each element y_i^s indicates whether the ith token is the start of a target aspect and each element y_i^e indicates whether the ith token is the end of an aspect;
minimize the sum of the negative log-probabilities of the true start and end positions, training the model by gradient descent:

L = −Σ_i y_i^s · log(p_i^s) − Σ_j y_j^e · log(p_j^e)
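A compact sketch of the scoring and span loss of steps 5.1-5.2; shapes, gold positions, and values here are illustrative assumptions, not taken from the patent:

    # Illustrative sketch of steps 5.1-5.2: token-level start/end scores and
    # the negative log-likelihood span loss. All shapes and values are assumed.
    import torch
    import torch.nn.functional as F

    h = 768                             # hidden layer size
    h_L = torch.randn(1, 12, h)         # final-layer word vectors, (batch, n+2, h)

    w_s = torch.nn.Linear(h, 1, bias=False)   # weight vector w_s in R^h
    w_e = torch.nn.Linear(h, 1, bias=False)   # weight vector w_e in R^h

    g_s = w_s(h_L).squeeze(-1)          # start scores g^s, shape (batch, n+2)
    g_e = w_e(h_L).squeeze(-1)          # end scores g^e
    p_s = F.softmax(g_s, dim=-1)        # p^s
    p_e = F.softmax(g_e, dim=-1)        # p^e

    start_idx = torch.tensor([3])       # hypothetical gold start position
    end_idx = torch.tensor([4])         # hypothetical gold end position

    # minimize the negative log-probability of the true start and end positions
    loss = F.nll_loss(p_s.clamp_min(1e-12).log(), start_idx) \
         + F.nll_loss(p_e.clamp_min(1e-12).log(), end_idx)
    loss.backward()                     # one gradient-descent training step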
Step 5.3: redundant and invalid predictions occur when multiple aspects are extracted, for example:
Original sentence: I like the food but the service was so awful!
Real aspects: food, service
Predicted aspects: food, food but the service, service was so awful, service, …
A heuristic multi-boundary algorithm is therefore provided, with pseudocode as follows (an illustrative sketch is also given after the step list below):
[Pseudocode figure: heuristic multi-boundary algorithm, lines 1-15]
where g^s denotes the score of a start position, g^e denotes the score of an end position, γ is a hyperparameter serving as the minimum score threshold, and K is the maximum number of aspects in a single sentence.
the algorithm mainly comprises the following steps:
a. initialize three sets R, U, and O (line 1);
b. from the scores g^s and g^e produced by the trained weight vectors, select the top-M highest-scoring position index sets S and E, where s_i denotes the ith start position in set S and e_j denotes the jth end position in set E (line 2);
c. when the end position is not smaller than the start position and g^s(s_i) + g^e(e_j) exceeds γ (lines 3-8), add the candidate boundary (s_i, e_j) to set R (lines 7-8) and add the heuristic regularization value u = g^s(s_i) + g^e(e_j) − (e_j − s_i + 1) to set U (lines 6 and 8);
d. remove redundant boundaries in set R using a non-maximum suppression algorithm: while set R is not empty and the size of set O is smaller than K (lines 9-14), remove from set R the boundary r_l corresponding to the current maximum value u_i in set U and add r_l to set O (lines 10-11); when a boundary r_k in set R overlaps r_l (overlap being checked with a fine-grained F1 measure), remove the corresponding boundary and value from sets R and U (lines 12-14), i.e., remove the redundant aspect from the candidates;
e. obtain the boundary set O of start and end positions corresponding to the multiple aspects (line 15); aspect-word extraction is then complete;
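Since the pseudocode figure is not reproduced in the text, the following sketch reconstructs the algorithm from steps a-e above; the overlap test is simplified to plain interval overlap rather than the fine-grained F1 measure, and M, γ, and K are example values:

    # Hedged reconstruction of the heuristic multi-boundary algorithm (steps a-e).
    # g_s and g_e are per-token start/end scores (lists of floats).
    def extract_aspects(g_s, g_e, M=20, gamma=8.0, K=5):
        """Return up to K non-overlapping (start, end) aspect boundaries."""
        # b. top-M highest-scoring start/end position indices
        S = sorted(range(len(g_s)), key=lambda i: g_s[i], reverse=True)[:M]
        E = sorted(range(len(g_e)), key=lambda j: g_e[j], reverse=True)[:M]

        R, U, O = [], [], []                        # a. initialize the three sets
        for s_i in S:                               # c. filter candidate boundaries
            for e_j in E:
                if e_j >= s_i and g_s[s_i] + g_e[e_j] > gamma:
                    R.append((s_i, e_j))
                    # heuristic regularization: score sum minus aspect length
                    U.append(g_s[s_i] + g_e[e_j] - (e_j - s_i + 1))

        while R and len(O) < K:                     # d. non-maximum suppression
            best = max(range(len(U)), key=U.__getitem__)
            r_l = R.pop(best)
            U.pop(best)
            O.append(r_l)
            kept = [(r, u) for r, u in zip(R, U)
                    if r[1] < r_l[0] or r[0] > r_l[1]]  # drop overlapping spans
            R = [r for r, _ in kept]
            U = [u for _, u in kept]
        return O                                    # e. final boundary set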
Step six: splice the feature vectors of the extracted aspect words with the feature vectors of the auxiliary sentence using a self-attention operation to obtain the semantically fused auxiliary feature vector, denoted h' (one illustrative reading is sketched below).
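Step six gives no explicit fusion formula; one possible reading, offered purely as an assumption, splices the two vector groups and applies a self-attention layer:

    # One possible reading of step six (an assumption; the patent gives no formula):
    # splice aspect-word and auxiliary-sentence vectors, then apply self-attention.
    import torch

    attn = torch.nn.MultiheadAttention(embed_dim=768, num_heads=8, batch_first=True)

    aspect_vecs = torch.randn(1, 2, 768)  # feature vectors of extracted aspect words
    aux_vecs = torch.randn(1, 5, 768)     # feature vectors of the auxiliary sentence

    spliced = torch.cat([aspect_vecs, aux_vecs], dim=1)   # splice the two groups
    fused, _ = attn(spliced, spliced, spliced)            # self-attention operation
    h_prime = fused.mean(dim=1)           # pooled auxiliary feature vector h'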
Step seven: predicting the emotion polarity of the corresponding aspect of the auxiliary feature word vector through a feedforward neural network, and the specific process is as follows:
Step 7.1: the emotion polarity value is obtained by applying the Tanh activation function between two linear transformations:

g^p = W_p · tanh(W_h' · h')
where W_p and W_h' are two parameter matrices to be trained;
the polarity value is normalized into the emotion polarity probability by the softmax function:

p^p = softmax(g^p)
Step 7.2: train the parameter matrices W_p and W_h' by gradient descent and fine-tune the BERT model by minimizing the negative log-probability of the true emotion polarity, expressed as:

L = −y_p · log(p^p)
where y_p is the one-hot representation vector of the true emotion polarity;
Step 7.3: after the BERT model has been trained by gradient descent with the above loss, compute the emotion polarity probability p^p for each auxiliary feature vector h'; the emotion word corresponding to the highest probability is the emotion polarity of the corresponding aspect word.
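A sketch of the step-seven polarity head; the hidden size, the three-way polarity set matching the auxiliary sentence "positive or negative or neutral", and the parameter shapes are assumptions:

    # Illustrative sketch of step seven: Tanh between two linear transformations,
    # softmax normalization, and negative log-likelihood training. Shapes assumed.
    import torch
    import torch.nn.functional as F

    h, num_polarities = 768, 3               # positive / negative / neutral

    W_hp = torch.nn.Linear(h, h)             # parameter matrix W_h'
    W_p = torch.nn.Linear(h, num_polarities) # parameter matrix W_p

    h_prime = torch.randn(1, h)              # fused auxiliary feature vector h'

    g_p = W_p(torch.tanh(W_hp(h_prime)))     # polarity value g^p (step 7.1)
    p_p = F.softmax(g_p, dim=-1)             # polarity probability p^p

    y_p = torch.tensor([0])                  # index of the true polarity (one-hot y_p)
    loss = F.nll_loss(p_p.clamp_min(1e-12).log(), y_p)   # step 7.2 loss
    loss.backward()

    labels = ["positive", "negative", "neutral"]
    print(labels[p_p.argmax(dim=-1).item()]) # step 7.3: highest-probability polarity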
Further, the data set selected in the first step includes three data sets, namely, LAPTOP, REST, and TWITTER.
Further, the content of the auxiliary sentence in the first step is as follows: "positive or negative or neutral".
Further, the heuristic regularization value in the algorithm of step 5.3, u = g^s(s_i) + g^e(e_j) − (e_j − s_i + 1), represents the sum of the start-position and end-position scores minus the aspect length.
Furthermore, the input sequence refers to the sequence formed by tokenizing the corpus and the auxiliary sentences and splicing them with the predefined symbols [CLS] and [SEP]; the spliced sequence is "[CLS] original sentence sequence [SEP] auxiliary sentence sequence [SEP]", where [CLS] is the semantic symbol of the input text sequence and [SEP] is the separator between the question sequence and the text-passage sequence.
In the experiments of the invention, the same training and test sets are used under identical conditions to compare the results on the two tasks of entity extraction and emotion classification separately.
The F1 score is used as the evaluation metric for the entity extraction task, and Accuracy is used as the metric for the emotion analysis task:
F1 = 2 × P × R / (P + R), where precision P = TP / (TP + FP) and recall R = TP / (TP + FN)

Accuracy = (TP + TN) / (TP + TN + FP + FN)
where TP means the model predicts positive and the true value is positive, i.e., a correct acceptance;
FP means the model predicts positive but the true value is negative, i.e., a false acceptance;
TN means the model predicts negative and the true value is negative, i.e., a correct rejection;
FN means the model predicts negative but the true value is positive, i.e., a false rejection.
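Written out directly from the definitions above, the two metrics are (a dependency-free sketch):

    # F1 and Accuracy computed from the TP/FP/TN/FN counts defined above.
    def f1_score(tp: int, fp: int, fn: int) -> float:
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        return 2 * precision * recall / (precision + recall)

    def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
        return (tp + tn) / (tp + tn + fp + fn)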
For the entity extraction task, the proposed model is compared with the BERT+CRF model and with DE-CNN (the current best-performing entity extraction model); the F1 scores are as follows:
[Table figure: F1 scores of BERT+CRF, DE-CNN, and the proposed model on the LAPTOP, REST, and TWITTER datasets]
on the LAPTOP dataset and TWITTER dataset, the model proposed by the present invention works best among the three models.
For the emotion classification task, the proposed model is compared with MGAN and TNet (the current best-performing emotion classification models); the Accuracy values are as follows:
Model                    LAPTOP    REST     TWITTER
MGAN                     75.39     -        -
TNet                     76.54     -        -
Model of the invention   83.37     88.43    76.27
The model proposed by the invention works best among the three models on the LAPTOP dataset, the REST dataset and the TWITTER dataset.
By using the BERT model, converting the downstream task into a QA task through constructed auxiliary sentences, and fine-tuning the BERT model, the invention improves the results on the two subtasks of aspect extraction and emotion classification, can extract multiple aspect words at once, improves model efficiency, and reduces redundant workload.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (5)

1. A fine-grained emotion analysis method based on BERT and QA ideas, characterized by comprising the following steps:
Step one: select the SemEval-2014 data set as the corpus and add auxiliary sentences to all text data;
Step two: segment the processed text and tokenize it with a 30,552-word vocabulary, attach a [CLS] token at the beginning of the sentence, add [SEP] tokens between the original sentence and the auxiliary sentence and at the end, and generate the input sequence X, specifically: [CLS] original sentence sequence [SEP] auxiliary sentence sequence [SEP];
Step three: vectorize the input sequence X, representing each word of the input text sequence with a pre-trained word feature vector, to obtain the word vector h_0 of the input text, expressed as:

h_0 ∈ R^((n+2)×h)

where h is the size of the hidden layer;
Step four: the word vector h_0 obtained in step three is fed as input to the L stacked Transformer blocks, and training yields the word vectors h_i that fuse sentence semantics, expressed as:

h_i = Transformer(h_(i-1)), i ∈ [1, L]
step five: constructing a heuristic aspect extractor capable of extracting a plurality of aspects, which specifically comprises the following sub-steps:
Step 5.1: based on the word vectors h_i that fuse sentence semantics, train the BERT model on the aspects of the training data by gradient descent and update the parameters;
define the probabilities p^s and p^e of the start and end positions of the aspect corresponding to each sentence, namely:

g^s = w_s · h_L, p^s = softmax(g^s)

g^e = w_e · h_L, p^e = softmax(g^e)

where w_s ∈ R^h and w_e ∈ R^h are weight vectors to be trained, and softmax is the normalization function;
Step 5.2: pre-mark the boundaries of the target entities in the training data set to obtain the label list T and the start vector y^s ∈ R^(n+2) and end vector y^e ∈ R^(n+2), where each element y_i^s indicates whether the ith token is the start of a target aspect and each element y_i^e indicates whether the ith token is the end of an aspect;
minimize the sum of the negative log-probabilities of the true start and end positions, training the model by gradient descent:

L = −Σ_i y_i^s · log(p_i^s) − Σ_j y_j^e · log(p_j^e)
Step 5.3: redundant and invalid predictions occur when multiple aspects are extracted, for example:
Original sentence: I like the food but the service was so awful!
Real aspects: food, service
Predicted aspects: food, food but the service, service was so awful, service, …
A heuristic multi-boundary algorithm is provided, with pseudocode as follows:
[Pseudocode figure: heuristic multi-boundary algorithm, lines 1-15]
where g^s denotes the score of a start position, g^e denotes the score of an end position, γ is a hyperparameter serving as the minimum score threshold, and K is the maximum number of aspects in a single sentence.
the algorithm mainly comprises the following steps:
a. initialize three sets R, U, and O (line 1);
b. from the scores g^s and g^e produced by the trained weight vectors, select the top-M highest-scoring position index sets S and E, where s_i denotes the ith start position in set S and e_j denotes the jth end position in set E (line 2);
c. when the end position is not smaller than the start position and g^s(s_i) + g^e(e_j) exceeds γ (lines 3-8), add the candidate boundary (s_i, e_j) to set R (lines 7-8) and add the heuristic regularization value u = g^s(s_i) + g^e(e_j) − (e_j − s_i + 1) to set U (lines 6 and 8);
d. remove redundant boundaries in set R using a non-maximum suppression algorithm: while set R is not empty and the size of set O is smaller than K (lines 9-14), remove from set R the boundary r_l corresponding to the current maximum value u_i in set U and add r_l to set O (lines 10-11); when a boundary r_k in set R overlaps r_l (overlap being checked with a fine-grained F1 measure), remove the corresponding boundary and value from sets R and U (lines 12-14), i.e., remove the redundant aspect from the candidates;
e. obtain the boundary set O of start and end positions corresponding to the multiple aspects (line 15); aspect-word extraction is then complete;
Step six: splice the feature vectors of the extracted aspect words with the feature vectors of the auxiliary sentence using a self-attention operation to obtain the semantically fused auxiliary feature vector, denoted h'.
Step seven: predicting the emotion polarity of the corresponding aspect of the auxiliary feature word vector through a feedforward neural network, and the specific process is as follows:
Step 7.1: the emotion polarity value is obtained by applying the Tanh activation function between two linear transformations:

g^p = W_p · tanh(W_h' · h')
where W_p and W_h' are two parameter matrices to be trained;
the polarity value is normalized into the emotion polarity probability by the softmax function:

p^p = softmax(g^p)
Step 7.2: train the parameter matrices W_p and W_h' by gradient descent and fine-tune the BERT model by minimizing the negative log-probability of the true emotion polarity, expressed as:

L = −y_p · log(p^p)
where y_p is the one-hot representation vector of the true emotion polarity;
Step 7.3: after the BERT model has been trained by gradient descent with the above loss, compute the emotion polarity probability p^p for each auxiliary feature vector h'; the emotion word corresponding to the highest probability is the emotion polarity of the corresponding aspect word.
2. The fine-grained emotion analysis method based on BERT and QA ideas of claim 1, wherein: the data set selected in the first step comprises three data sets of LAPTOP, REST and TWITTER.
3. The fine-grained emotion analysis method based on BERT and QA ideas of claim 1, wherein: the content of the auxiliary sentence in the first step is as follows: "positive or negative or neutral".
4. The fine-grained emotion analysis method based on BERT and QA ideas of claim 1, wherein: the heuristic regularization value in the algorithm of step 5.3, u = g^s(s_i) + g^e(e_j) − (e_j − s_i + 1), represents the sum of the start-position and end-position scores minus the aspect length.
5. The fine-grained emotion analysis method based on BERT and QA ideas of claim 1, wherein: the input sequence is the sequence formed by tokenizing the corpus and the auxiliary sentences and splicing them with the predefined symbols [CLS] and [SEP]; the spliced sequence is "[CLS] original sentence sequence [SEP] auxiliary sentence sequence [SEP]", where [CLS] is the semantic symbol of the input text sequence and [SEP] is the separator between the question sequence and the text-passage sequence.
CN202010136542.7A 2020-03-02 2020-03-02 BERT and QA thought-based fine-grained emotion analysis method Withdrawn CN111339260A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010136542.7A CN111339260A (en) 2020-03-02 2020-03-02 BERT and QA thought-based fine-grained emotion analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010136542.7A CN111339260A (en) 2020-03-02 2020-03-02 BERT and QA thought-based fine-grained emotion analysis method

Publications (1)

Publication Number Publication Date
CN111339260A true CN111339260A (en) 2020-06-26

Family

ID=71184644

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010136542.7A Withdrawn CN111339260A (en) 2020-03-02 2020-03-02 BERT and QA thought-based fine-grained emotion analysis method

Country Status (1)

Country Link
CN (1) CN111339260A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111966827A (en) * 2020-07-24 2020-11-20 大连理工大学 Conversation emotion analysis method based on heterogeneous bipartite graph
CN111966827B (en) * 2020-07-24 2024-06-11 大连理工大学 Dialogue emotion analysis method based on heterogeneous bipartite graph
CN112860889A (en) * 2021-01-29 2021-05-28 太原理工大学 BERT-based multi-label classification method
CN113204616A (en) * 2021-04-30 2021-08-03 北京百度网讯科技有限公司 Method and device for training text extraction model and extracting text
CN113204616B (en) * 2021-04-30 2023-11-24 北京百度网讯科技有限公司 Training of text extraction model and text extraction method and device
CN113377910A (en) * 2021-06-09 2021-09-10 平安科技(深圳)有限公司 Emotion evaluation method and device, electronic equipment and storage medium
CN113901171A (en) * 2021-09-06 2022-01-07 特赞(上海)信息科技有限公司 Semantic emotion analysis method and device
CN114332544A (en) * 2022-03-14 2022-04-12 之江实验室 Image block scoring-based fine-grained image classification method and device
CN114332544B (en) * 2022-03-14 2022-06-07 之江实验室 Image block scoring-based fine-grained image classification method and device
CN114896365A (en) * 2022-04-27 2022-08-12 马上消费金融股份有限公司 Model training method, emotional tendency prediction method and device
CN114896987A (en) * 2022-06-24 2022-08-12 浙江君同智能科技有限责任公司 Fine-grained emotion analysis method and device based on semi-supervised pre-training model

Similar Documents

Publication Publication Date Title
CN110245229B (en) Deep learning theme emotion classification method based on data enhancement
US11403680B2 (en) Method, apparatus for evaluating review, device and storage medium
CN111339260A (en) BERT and QA thought-based fine-grained emotion analysis method
CN108446271B (en) Text emotion analysis method of convolutional neural network based on Chinese character component characteristics
CN113392209A (en) Text clustering method based on artificial intelligence, related equipment and storage medium
CN111753082A (en) Text classification method and device based on comment data, equipment and medium
CN112287100A (en) Text recognition method, spelling error correction method and voice recognition method
CN116561592B (en) Training method of text emotion recognition model, text emotion recognition method and device
CN115759119B (en) Financial text emotion analysis method, system, medium and equipment
CN111738807B (en) Method, computing device, and computer storage medium for recommending target objects
CN111709225B (en) Event causal relationship discriminating method, device and computer readable storage medium
CN112926308A (en) Method, apparatus, device, storage medium and program product for matching text
CN112528658A (en) Hierarchical classification method and device, electronic equipment and storage medium
CN115062718A (en) Language model training method and device, electronic equipment and storage medium
CN110826315B (en) Method for identifying timeliness of short text by using neural network system
El-Alfy et al. Empirical study on imbalanced learning of Arabic sentiment polarity with neural word embedding
CN113761910A (en) Comment text fine-grained emotion analysis method integrating emotional characteristics
CN113486143A (en) User portrait generation method based on multi-level text representation and model fusion
Susmitha et al. Sentimental Analysis on Twitter Data using Supervised Algorithms
CN110888983A (en) Positive and negative emotion analysis method, terminal device and storage medium
CN115906824A (en) Text fine-grained emotion analysis method, system, medium and computing equipment
CN107729509B (en) Discourse similarity determination method based on recessive high-dimensional distributed feature representation
CN113051396B (en) Classification recognition method and device for documents and electronic equipment
CN115659990A (en) Tobacco emotion analysis method, device and medium
CN114676699A (en) Entity emotion analysis method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200626