CN114168732A - Text emotion analysis method and device, computing device and readable medium - Google Patents

Text emotion analysis method and device, computing device and readable medium

Info

Publication number
CN114168732A
Authority
CN
China
Prior art keywords
emotion
expression
word
text
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111436442.7A
Other languages
Chinese (zh)
Inventor
赵汉光
陈伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4Paradigm Beijing Technology Co Ltd filed Critical 4Paradigm Beijing Technology Co Ltd
Priority to CN202111436442.7A priority Critical patent/CN114168732A/en
Publication of CN114168732A publication Critical patent/CN114168732A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a text emotion analysis method and device, computing equipment and a readable medium. The method comprises the following steps: acquiring word segmentation expression of a text to be analyzed; acquiring negative words and emotional word expressions of the text to be analyzed; and inputting the word segmentation expression and the negation and emotion word expression of the text to be analyzed into a trained emotion analysis model, and acquiring the tendency emotion classification of the text to be analyzed predicted and output by the emotion analysis model. By adopting the technical scheme, the tendency emotion classification of the text to be analyzed can be predicted by adopting the trained emotion analysis model based on the word segmentation expression and the negative word and emotion word expression of the text to be analyzed. Compared with the prior art, the negative words and the emotion words in the text to be analyzed are referred, so that the accuracy of emotion classification can be effectively improved.

Description

Text emotion analysis method and device, computing device and readable medium
This application is a divisional application of the patent application filed on May 28, 2019, with application number 201910451510.3, entitled "emotion analysis method of text and device thereof, computing device and readable medium".
Technical Field
The invention relates to the technical field of computer application, in particular to a text emotion analysis method and device, computing equipment and a readable medium.
Background
In the field of natural language processing, emotional analysis of text can aid in the understanding of text. Therefore, emotion analysis of text is particularly important in natural language processing.
Existing text emotion analysis schemes mainly use an attention mechanism to perform emotion analysis. Specifically, a large amount of text corpora is collected as training data. An emotion analysis model based on the attention mechanism is then trained using the training data, and the emotion of a text is analyzed with the trained emotion analysis model.
However, the existing emotion analysis model uses the attention mechanism to attend only to the emotion words in the text. When the text also includes negative words, the model may predict an emotion opposite to the actual emotion, so the analysis accuracy of the existing emotion analysis scheme is poor.
Disclosure of Invention
The invention provides a text emotion analysis method and device, computing equipment and a readable medium, which are used for improving the accuracy of emotion analysis.
The invention provides a method for analyzing emotion of a text, wherein the method comprises the following steps:
acquiring word segmentation expression of a text to be analyzed;
acquiring negative words and emotional word expressions of the text to be analyzed;
and inputting the word segmentation expression and the negation and emotion word expression of the text to be analyzed into a trained emotion analysis model, and acquiring the tendency emotion classification of the text to be analyzed predicted and output by the emotion analysis model.
The invention also provides a method for training the emotion analysis model, wherein the method comprises the following steps:
acquiring a training text set;
extracting a training sample set based on a training text set, wherein each training sample in the training sample set comprises word segmentation expression, negative word and emotion word expression and known emotion classification;
and training an emotion analysis model based on the training sample set.
The invention also provides a device for analyzing the emotion of the text, wherein the device comprises:
the word segmentation information acquisition module is used for acquiring word segmentation expression of the text to be analyzed;
the negative word and emotion word information acquisition module is used for acquiring negative words and emotion word expressions of the text to be analyzed;
and the prediction module is used for inputting the word segmentation expression and the negation and emotion word expression of the text to be analyzed into the trained emotion analysis model, and acquiring the tendency emotion classification of the text to be analyzed predicted and output by the emotion analysis model.
The invention also provides a device for training the emotion analysis model, wherein the device comprises:
the acquisition module is used for acquiring a training text set;
the extraction module is used for extracting a training sample set based on a training text set, wherein each training sample in the training sample set comprises word segmentation expression, negative word and emotion word expression and known emotion classification;
and the training module is used for training an emotion analysis model based on the training sample set.
The present invention also provides a computing device comprising:
a processor; and
a memory having executable code stored thereon which, when executed by the processor, causes the processor to perform a method as described in any one of the above.
The invention also provides a non-transitory machine-readable storage medium having stored thereon executable code which, when executed by a processor of an electronic device, causes the processor to perform a method as any one of the above.
By adopting the technical scheme, the emotion analysis method and the emotion analysis device for the text, the computing equipment and the readable medium can predict the tendency emotion classification of the text to be analyzed by adopting the trained emotion analysis model based on the word segmentation expression and the negative word and emotion word expression of the text to be analyzed. Compared with the prior art, the negative words and the emotion words in the text to be analyzed are referred, so that the accuracy of emotion classification can be effectively improved.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in greater detail exemplary embodiments thereof with reference to the attached drawings, in which like reference numerals generally represent like parts throughout.
FIG. 1 is a flowchart of a first embodiment of a method for emotion analysis of a text.
FIG. 2 is a structural diagram of an emotion analysis model provided by the present invention.
FIG. 3 is a flowchart of a second embodiment of a method for emotion analysis of a text according to the present invention.
FIG. 4 is a block diagram of another emotion analysis model provided by the present invention.
FIG. 5 is a flowchart of a third embodiment of a text emotion analysis method according to the present invention.
FIG. 6 is a flowchart of a first embodiment of a method for training an emotion analysis model according to the present invention.
FIG. 7 is a flowchart of a second embodiment of the emotion analysis model training method of the present invention.
Fig. 8 is a block diagram of an emotion analyzing apparatus for text according to an embodiment of the present invention.
FIG. 9 is a block diagram of an embodiment of an emotion analysis model training apparatus according to the present invention.
FIG. 10 shows a schematic structural diagram of a computing device that can be used to implement the above method according to an embodiment of the invention.
Detailed Description
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1 is a flowchart of a first embodiment of a method for emotion analysis of a text. As shown in fig. 1, the emotion analysis method of this embodiment may specifically include the following steps:
100. acquiring word segmentation expression of a text to be analyzed;
The execution subject of the text emotion analysis method in this embodiment is a text emotion analysis device. The device may be an independent electronic entity, or it may be an integrated software application that runs on a computer device when in use.
For example, the step 100 of obtaining the word segmentation expression of the text to be analyzed may specifically include: performing word segmentation on a text to be analyzed; and mapping each participle in the text to be analyzed according to a preset dictionary base and a mapping dictionary corresponding to the dictionary base to obtain the participle expression of the text to be analyzed.
The dictionary base of the present embodiment may be a base collected in advance that includes a large number of participles. The mapping dictionary of the dictionary base records the mapping relation between each participle in the dictionary base and its mapping identifier, e.g., a mapping a → a'. For example, one of the simplest mapping dictionaries may include a one-to-one mapping from each participle in the dictionary base to a number; different participles cannot correspond to the same mapping identifier. For example, the first participle in the dictionary base is mapped to 1, the second participle is mapped to 2, and so on, so that the nth participle is mapped to n. Specifically, in the mapping dictionary, a number may be used as a subscript of the participle to indicate this mapping relationship. In practical applications, letters or a combination of numbers and letters can also be used as the mapping identifiers; the principle is the same.
In this embodiment, after the text to be analyzed is segmented, each segmented word is mapped to a corresponding mapping identifier according to the sequence in the text to be analyzed through a mapping dictionary, so as to obtain a segmented word expression of the text to be analyzed. Each participle in the participle expression is represented by a corresponding mapping identifier, and the participle expression generated by the embodiment can be a one-dimensional vector. This allows the text to be analyzed to be converted into word-segmented expressions that can be processed.
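As an illustration of the mapping described above, the following minimal Python sketch builds a mapping dictionary and converts a segmented text into a one-dimensional list of mapping identifiers. The function names, the sample words, and the use of 0 for participles missing from the dictionary base are assumptions made for this example; the text does not specify a segmenter or how unknown participles are handled, so the input is assumed to be segmented already.

```python
def build_mapping_dictionary(dictionary_base):
    """Assign each distinct participle a unique integer identifier, starting at 1."""
    mapping = {}
    for word in dictionary_base:
        if word not in mapping:
            mapping[word] = len(mapping) + 1
    return mapping


def to_segmentation_expression(participles, mapping, unknown_id=0):
    """Map participles, in text order, to their identifiers.

    unknown_id is a hypothetical fallback for participles that are not
    in the dictionary base; the source text does not define one.
    """
    return [mapping.get(word, unknown_id) for word in participles]


# Hypothetical dictionary base and an already segmented text.
dictionary_base = ["I", "do", "not", "like", "this", "movie"]
mapping = build_mapping_dictionary(dictionary_base)
participles = ["I", "do", "not", "like", "this", "film"]

expression = to_segmentation_expression(participles, mapping)
print(expression)  # [1, 2, 3, 4, 5, 0]
```

Because identifiers are assigned from the current dictionary size, no two participles can share an identifier, which matches the one-to-one requirement stated above.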
101. Acquiring negative words and emotional word expressions of a text to be analyzed;
the expression of the negative words and the emotional words in the embodiment is used for representing the negative words and the emotional words in the text to be analyzed.
For example, the step 101 of acquiring the negative word and the emotional expression of the text to be analyzed may specifically include the following steps:
(a1) performing word segmentation on a text to be analyzed;
(b1) acquiring negative words and emotional words from all the participles of the text to be analyzed according to a preset negative word lexicon and a preset emotional word lexicon;
for example, in this embodiment, a word bank including a plurality of negative words may be collected in advance, and then each participle in the text to be analyzed is compared with each word in the word bank of the negative words, so as to obtain all the negative words in the text to be analyzed. Similarly, a word bank comprising a plurality of emotional words may be collected in advance, and each participle in the text to be analyzed may be compared with each word in the word bank of emotional words, so as to obtain all emotional words in the text to be analyzed.
(c1) And respectively carrying out feature mapping on the negative words and the emotion words in the text to be analyzed according to a preset feature mapping strategy of the negative words and a preset feature mapping strategy of the emotion words to obtain negation and emotion expression of the text to be analyzed.
In this embodiment, the preset feature mapping policy for the negative words and the preset feature mapping policy for the emotion words may be set according to actual requirements. For example, a participle that is neither a negative word nor an emotion word may be mapped to 0, and a negative word may be mapped to 1. For emotion polarity classification (such as binary classification), positive emotion words appearing in the emotion word dictionary are mapped to 2 and negative emotion words are mapped to 3, so that the negation and emotion expression is equivalent to a word embedding over a vocabulary of only 4 words. Likewise, the negation and emotion expression can be represented as a one-dimensional vector: each participle in the text to be analyzed is mapped in sequence according to the mapping rules of the negative words and the emotion words, and the resulting one-dimensional vector is used as the negation and emotion expression of the text to be analyzed.
In addition, for the multi-emotion classification problem, a plurality of binary classifiers can be trained separately, each predicting whether the corresponding emotion is present. In that case, a participle that appears as a corresponding emotion word in the emotion word dictionary is mapped to 2, which is equivalent to a word embedding over a vocabulary of only 3 words.
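The polarity mapping described above (0 for other participles, 1 for negative words, 2 for positive emotion words, 3 for negative emotion words) can be sketched as follows. The three small lexicons and the function name are hypothetical and serve only to make the example runnable; real lexicons would be collected in advance as described in step (b1).

```python
# Hypothetical lexicons; the source only says they are collected in advance.
NEGATION_LEXICON = {"not", "never", "no"}
POSITIVE_LEXICON = {"like", "good", "great"}
NEGATIVE_LEXICON = {"hate", "bad", "boring"}

# Mapping values taken from the example in the text.
OTHER, NEGATION, POSITIVE, NEGATIVE = 0, 1, 2, 3


def to_negation_emotion_expression(participles):
    """Map each participle, in text order, to its negation/emotion tag."""
    expression = []
    for word in participles:
        if word in NEGATION_LEXICON:
            expression.append(NEGATION)
        elif word in POSITIVE_LEXICON:
            expression.append(POSITIVE)
        elif word in NEGATIVE_LEXICON:
            expression.append(NEGATIVE)
        else:
            expression.append(OTHER)
    return expression


participles = ["I", "do", "not", "like", "this", "movie"]
tags = to_negation_emotion_expression(participles)
print(tags)  # [0, 0, 1, 2, 0, 0]
```

The resulting vector has the same length as the word segmentation expression, so the two inputs stay aligned position by position.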
102. And inputting the word segmentation expression, negation and emotion word expression of the text to be analyzed into the trained emotion analysis model, and acquiring the tendency emotion classification of the text to be analyzed predicted and output by the emotion analysis model.
In this embodiment, the emotion analysis model is trained in advance. In use, the word segmentation expression and the negation and emotion word expression of the text to be analyzed are input directly into the emotion analysis model, which predicts and outputs the tendency emotion classification of the text to be analyzed. Specifically, the tendency emotion classification can be a 1 × n vector, where n is the number of emotion classifications. Each position in the vector corresponds to one emotion classification, and the value at each position is the probability that the text to be analyzed belongs to the emotion classification corresponding to that position. In practical applications, the emotion classification whose probability is greater than a preset probability threshold can further be taken as the final emotion classification of the text to be analyzed. The preset probability threshold may be set according to actual requirements, for example 0.5, 0.6, or another value greater than 0.5 and less than 1, which is not limited herein. Alternatively, the preset probability threshold can be configured in advance, and the emotion analysis model then directly outputs each probability greater than the preset probability threshold together with its corresponding emotion classification.
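The threshold-based selection described above can be sketched in a few lines of Python. The label names, the probability values, and the function name are illustrative assumptions; the probabilities stand in for the 1 × n vector that a trained model would output.

```python
def select_classifications(probabilities, labels, threshold=0.5):
    """Return (label, probability) pairs whose probability exceeds the threshold."""
    return [
        (label, probability)
        for label, probability in zip(labels, probabilities)
        if probability > threshold
    ]


# Hypothetical model output for a binary polarity task.
labels = ["positive", "negative"]
probabilities = [0.83, 0.17]

selected = select_classifications(probabilities, labels, threshold=0.5)
print(selected)  # [('positive', 0.83)]
```

With a threshold of at least 0.5 and probabilities that sum to 1, at most one classification can pass the strict comparison, which is consistent with the suggested threshold range.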
For example, the trained emotion analysis model of the present embodiment may include: a Recurrent Neural Network (RNN)-based participle processing layer for processing the participle expression; a Convolutional Neural Network (CNN)-based negative word and emotion word processing layer for processing the negative word and emotion word expression; and a splicing treatment layer.
For example, fig. 2 is a structural diagram of an emotion analysis model provided by the present invention, and as shown in fig. 2, in the emotion analysis model of this embodiment, a word segmentation processing layer may sequentially include: a word embedding layer, an RNN layer and a first attention mechanism layer; the negative word emotion word processing layer can sequentially comprise: the emotion embedding layer, the CNN layer and the second attention mechanism layer; the splicing treatment layer may sequentially include: a splicing layer, a full connection layer and a normalization layer;
at this time, correspondingly, step 102 inputs the word segmentation expression and negation of the text to be analyzed and the emotion word expression into the trained emotion analysis model, and obtains the emotion tendency classification of the text to be analyzed predicted and output by the emotion analysis model, which may specifically include:
(a2) inputting the word segmentation expression of the text to be analyzed into a word embedding layer to obtain the embedding expression of the word segmentation;
In this embodiment, the methods by which the word embedding layer computes the embedded expression of the participles include, but are not limited to, Continuous Bag of Words (CBoW), Skip-Gram, Global Vectors (GloVe), fastText (an open-source library), bidirectional language models, ELMo (Embeddings from Language Models), GPT (Generative Pre-Training), BERT (Bidirectional Encoder Representations from Transformers), and the like.
(b2) The RNN layer extracts the characteristic expression of the participles, which contains context information, based on the embedded expression of the participles;
the RNN layer in this embodiment may be one layer, two layers, or multiple layers, and the number of layers is specifically set according to requirements.
(c2) The first attention mechanism layer gives different weights to each participle based on the feature expression of the participle obtained by the RNN layer, and text feature expression is obtained by weighted summation;
(d2) inputting the expression of the negative words and the emotion words into an emotion embedding layer to obtain the embedded expression of the negative words and the emotion words;
(e2) extracting position relation characteristic expression of the negative words and the emotional words by the CNN layer based on the embedded expression of the negative words and the emotional words;
(f2) the second attention mechanism layer gives different weights to each negative word or emotional word based on the position relation feature expression obtained by the CNN layer, and the negation and emotion feature expression is obtained by weighted summation;
(g2) splicing the text feature expression and negation with the emotional feature expression by the splicing layer to obtain spliced global feature expression;
(h2) the full connection layer applies a mapping transformation to the spliced global feature expression to enhance the fitting capability of the features, obtaining the transformed feature expression;
(i2) and mapping the transformed feature expression to the final classification by the normalization layer, and outputting a final classification result, wherein the final classification result comprises tendency emotion classification and corresponding probability.
For example, after the final classification, a 1 × n one-dimensional vector is obtained, where n is the total number of emotion classifications. For binary classification, n is 2; for multi-polarity classification, n is the number of polarities. The value at each position in the one-dimensional vector is the probability of the emotion classification corresponding to that position, and the emotion classification at the position with the highest probability is taken as the tendency emotion classification of the text to be analyzed. The final classification result can be configured to output the tendency emotion classification with the maximum probability together with the corresponding probability.
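To illustrate how a CNN layer, as in step (e2), can pick up the positional relation between a negative word and an emotion word, the following sketch applies a single width-2 convolution filter to a sequence of embedded negation/emotion tags. The one-hot embedding table, the hand-picked filter weights, the bias, and the ReLU activation are all assumptions made for illustration; in the described model the embedding and the filters are trained rather than fixed.

```python
def conv1d(sequence, kernel, bias=0.0):
    """Valid-mode 1D convolution of one filter over a sequence of vectors, followed by ReLU.

    sequence: list of T vectors of length D
    kernel:   list of K vectors of length D
    Returns a list of T - K + 1 non-negative scalars.
    """
    width = len(kernel)
    outputs = []
    for start in range(len(sequence) - width + 1):
        total = bias
        for offset in range(width):
            total += sum(
                x * w for x, w in zip(sequence[start + offset], kernel[offset])
            )
        outputs.append(max(0.0, total))
    return outputs


# Hypothetical emotion embedding table for tags 0..3 (other, negation, positive, negative).
EMBEDDING = {
    0: [0.0, 0.0, 0.0],
    1: [1.0, 0.0, 0.0],
    2: [0.0, 1.0, 0.0],
    3: [0.0, 0.0, 1.0],
}

tags = [0, 0, 1, 2, 0, 0]  # e.g. "I do not like this movie"
embedded = [EMBEDDING[tag] for tag in tags]

# Hand-picked filter that fires only when a negation is directly followed by a positive word.
kernel = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
features = conv1d(embedded, kernel, bias=-1.0)
print(features)  # [0.0, 0.0, 1.0, 0.0, 0.0]
```

The single non-zero feature marks the window where "not" precedes "like", which is the kind of local pattern that lets the model distinguish a negated positive word from a plain positive word.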
By adopting the technical scheme, the emotion analysis method of the embodiment can predict the tendency emotion classification of the text to be analyzed by adopting the trained emotion analysis model based on the word segmentation expression and the negative word and emotion word expression of the text to be analyzed. Compared with the prior art, the negative words and the emotion words in the text to be analyzed are referred, so that the accuracy of emotion classification can be effectively improved.
FIG. 3 is a flowchart of a second embodiment of a method for emotion analysis of a text according to the present invention. As shown in fig. 3, the emotion analysis method of this embodiment may specifically include the following steps:
200. acquiring word segmentation expression and position expression of the word segmentation of a text to be analyzed;
Different from step 100 of the embodiment shown in fig. 1, this embodiment additionally obtains the position expression of the participles of the text to be analyzed. Specifically, the position expression of the participles may be obtained by mapping the position information of each participle in the text to be analyzed. Similarly, the position expression of the present embodiment may also take the form of a vector. The position expression of the present embodiment is used to represent the position information of each participle.
201. Acquiring negative words and emotional word expressions of a text to be analyzed and position expressions of the negative words and the emotional words;
Different from step 101 in the embodiment shown in fig. 1, this embodiment additionally obtains the position expression of the negative words and the emotion words of the text to be analyzed. Specifically, the position expression of the negative words and the emotion words may be obtained by mapping the position information of each negative word and each emotion word in the text to be analyzed.
202. Inputting the word segmentation expression, the position expression of the word segmentation, the expression of the negative words and the emotional words and the position expression of the negative words and the emotional words of the text to be analyzed into an emotion analysis model, and outputting the predicted tendency emotion classification of the text to be analyzed by the emotion analysis model.
Different from step 102 in the embodiment shown in fig. 1, this embodiment adds the position expression of the participles and the position expression of the negative words and the emotion words to the input. The remaining implementation principles are the same and are not described herein again.
Similar to the embodiment shown in fig. 1, the trained emotion analysis model of this embodiment may also include: an RNN-based participle processing layer for processing participle expressions and position expressions of participles; a CNN-based negative word and emotional word processing layer for processing the negative word and emotional word expression and the negative and emotional word position expression; and a splicing treatment layer.
For example, fig. 4 is a structural diagram of another emotion analysis model provided by the present invention. As shown in fig. 4, unlike the embodiment shown in fig. 2, the word segmentation processing layer of this embodiment sequentially includes: a word embedding layer, a first position embedding layer, an RNN layer and a first attention mechanism layer; the negative word emotion word processing layer sequentially includes: an emotion embedding layer, a second position embedding layer, a CNN layer and a second attention mechanism layer. That is, compared with the embodiment shown in fig. 2, the participle processing layer and the negative word emotion word processing layer are additionally provided with a first position embedding layer and a second position embedding layer, respectively. The splicing treatment layer is the same as in the embodiment shown in fig. 2 and sequentially includes: a splicing layer, a full connection layer and a normalization layer;
at this time, correspondingly, step 202 inputs the word segmentation expression, the position expression of the word segmentation, the negation and emotion word expression, and the negation and emotion word position expression of the text to be analyzed into the emotion analysis model, and obtains the tendency emotion classification of the text to be analyzed predicted and output by the emotion analysis model, which may specifically include the following steps:
(a3) inputting the word segmentation expression of the text to be analyzed into a word embedding layer of the emotion analysis model to obtain the word segmentation embedding expression;
(b3) inputting the embedded expression and the position expression of the participles output by the word embedding layer into a first position embedding layer, so that the position embedded expression of each participle is added by the first position embedding layer on the basis of the embedded expression of the participles;
The first position embedding layer adds position information on top of the word embedding of step (a3). For example, the subscript of the first word of the text to be analyzed is 0, the subscript of the second word is 1, and the subscripts increase sequentially toward the end of the text. The position embedding layer maps each subscript to a trainable vector with the same length as the word embedding, and the two embeddings are added element-wise at corresponding positions, so that position information is added on top of the word embedding.
(c3) Extracting the feature expression of the participle, which contains the context information, by the RNN layer based on the embedded expression of the participle and the position embedded expression of the participle;
in this embodiment, there may be one or more bidirectional RNN layers. In particular, the RNN layer is able to extract features of the corresponding location containing context information.
(d3) The first attention mechanism layer gives different weights to each participle based on the feature expression of the participle obtained by the RNN layer, and text feature expression is obtained by weighted summation;
For example, the first attention mechanism layer assigns a different weight to each position and obtains the final text feature by weighted summation, using the following formulas:
$$e_i = \exp(W x_i + b)$$
$$a_i = \frac{e_i}{\sum_{k} e_k}$$
$$j = \sum_{i} a_i x_i$$
where x is the hidden state feature of the last bidirectional RNN layer and the vector x_i is the feature corresponding to position i. The matrix W and the scalar b are trainable parameters that together apply a linear transformation to the input feature; the exponential function then turns the result into a positive scalar e_i, which represents the importance of position i. The scalar a_i is the importance after normalization, and the final output vector j is the sum of the vectors x_i weighted by a_i.
(e3) Inputting the expression of the negative words and the emotion words into an emotion embedding layer to obtain the embedded expression of the negative words and the emotion words;
(f3) inputting the embedded expression of the negative words and the emotional words and the position expression of the negative words and the emotional words output by the emotion embedding layer into a second position embedding layer, so that the position embedded expression of the negative words and the emotional words is increased on the basis of the embedded expression of the negative words and the emotional words by the second position embedding layer;
(g3) extracting position relation characteristic expression of the negative words and the emotional words by the CNN layer based on the embedded expression of the negative words and the emotional words and the position embedded expression of the negative words and the emotional words;
the CNN layer of the present embodiment may also include one layer, two layers, or multiple layers.
(h3) The second attention mechanism layer gives different weights to each negative word or emotional word based on the position relation feature expression obtained by the CNN layer, and the negative word and emotional word feature expression is obtained by weighted summation;
That is, after the negative words and emotional words are processed by one or more CNN layers, the final negative word and emotional word feature expression is obtained through a second attention mechanism that is the same as the one used for text feature extraction.
(i3) Splicing the text feature expression and the negative word with the emotion word feature expression by the splicing layer to obtain spliced global feature expression;
(j3) the full connection layer carries out the fitting capability processing of the change enhancement feature on the global splicing feature expression through mapping to obtain the transformed feature expression;
(k3) and mapping the transformed feature expression to the final classification by the normalization layer, and outputting a final classification result, wherein the final classification result comprises tendency emotion classification and corresponding probability.
The splicing layer splices the text feature expression with the negative word and emotional word feature expression: if the final text feature expression is a vector of length a and the final negative word and emotional word feature expression is a vector of length b, splicing yields a vector of length a + b. The full connection layer activated by tanh performs one transformation to enhance the fitting ability of the features, and the vector length remains a + b. The full connection layer normalized and activated by softmax maps the result to two classes, that is, into a vector of length 2. In this case the emotion analysis model is a binary classifier: the final output is a 2-dimensional vector in which the value at each position is the probability of the corresponding emotion classification, and the emotion classification with the higher probability can be taken as the tendency emotion classification of the text to be analyzed.
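The splicing, tanh and softmax stages described above can be illustrated with a minimal pure-Python sketch. The function names (`dense`, `softmax`, `classify`) and the toy weights are assumptions made for illustration only; a trained model would learn the weight matrices, and the patent does not prescribe this implementation.

```python
import math

def dense(vec, weights, bias, activation=None):
    """Fully connected layer: one output per (weight row, bias) pair."""
    out = []
    for row, b in zip(weights, bias):
        z = sum(w * v for w, v in zip(row, vec)) + b
        out.append(activation(z) if activation else z)
    return out

def softmax(logits):
    """Normalize logits into probabilities that sum to 1."""
    m = max(logits)  # subtracting the max only improves numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(text_vec, senti_vec, w_hidden, b_hidden, w_out, b_out):
    spliced = list(text_vec) + list(senti_vec)               # length a + b
    hidden = dense(spliced, w_hidden, b_hidden, math.tanh)   # still a + b
    probs = softmax(dense(hidden, w_out, b_out))             # length 2
    label = max(range(len(probs)), key=probs.__getitem__)    # higher probability wins
    return label, probs

# Toy values (hypothetical): a = 2, b = 1.
text_vec = [0.5, -0.2]
senti_vec = [0.8]
w_hidden = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b_hidden = [0.0, 0.0, 0.0]
w_out = [[1.0, 0.0, 1.0], [-1.0, 0.0, -1.0]]
b_out = [0.0, 0.0]

label, probs = classify(text_vec, senti_vec, w_hidden, b_hidden, w_out, b_out)
print(label, probs)
```

The index of the larger probability plays the role of the tendency emotion classification, and the probability itself is reported alongside it.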
Compared with the embodiments shown in fig. 1 and 2, the emotion analysis method of the present embodiment adds the position expression of the participles and the position expression of the negative words and the emotional words, and can therefore further improve the accuracy of the predicted tendency emotion classification.
By adopting the above technical scheme, the emotion analysis method of this embodiment can use the trained emotion analysis model to predict the tendency emotion classification of the text to be analyzed based on its word segmentation expression, the position expression of its participles, its negative word and emotional word expression, and the position expression of its negative words and emotional words. Compared with the prior art, the negative words and the emotional words in the text to be analyzed are taken into account, so the accuracy of emotion classification can be effectively improved.
FIG. 5 is a flowchart of a third embodiment of a text emotion analysis method according to the present invention. As shown in fig. 5, the emotion analysis method of this embodiment may specifically include the following steps:
300. acquiring the normalized weight of each participle, which the first attention mechanism layer outputs after giving different weights to each participle based on the feature expression of the participles obtained by the RNN layer and normalizing the weights of the participles at all positions;
301. acquiring a target participle with the maximum normalization weight from a plurality of participles of a text to be analyzed according to the normalization weight of each participle;
302. judging whether the emotion word bank corresponding to the tendency emotion classification includes the target participle; if not, go to step 303; if so, end. Inclusion indicates that the target participle is already a word in the emotion word bank, so no processing is needed.
303. Marking the target participle as a suspected emotion word; step 304 is executed;
304. judging whether the normalized weight of the target participle is greater than a preset weight threshold and whether the total number of times the target participle has been marked as a suspected emotion word is greater than a preset count threshold; if both conditions hold, go to step 305; otherwise, the target participle is not processed for the time being, and the process ends;
That is to say, when the normalized weight of the target participle is less than or equal to the preset weight threshold, or the total number of times the target participle has been marked as a suspected emotion word is less than or equal to the preset count threshold, the target participle cannot be listed as an emotion word and cannot be merged into the emotion word bank corresponding to the tendency emotion classification.
305. And merging the target participles into an emotion word library corresponding to the tendency emotion classification.
In this embodiment, the first attention mechanism layer yields the weight a_i with which each participle contributes to the result. New emotional words can be found by counting, for each emotion category, the participles that receive larger weights, and can then be added to the corresponding emotional word dictionary. For example, if a participle has the greatest weight in a sentence of a given emotion category
Figure BDA0003381685630000091
To guard against cases where the attention is spread fairly evenly, the participle is considered to represent the corresponding emotion only when a_i exceeds a preset weight threshold and the total number of occurrences is greater than a preset count threshold γ; it is then added to the corresponding emotional word dictionary for expansion. For example, when polarity classification is performed for a certain social application, positive emotion words such as 'praise', 'zhao', 'hao', and 'teardrop' can be obtained, as well as negative emotion words such as 'peppery chicken', 'linger', 'lump together', and 'yaho'.
This embodiment can be implemented on the basis of the embodiment shown in fig. 1 or fig. 3. The scheme expands the emotion word lexicon, which solves the prior-art problem that the emotion word lexicon cannot be updated online in time. The lexicon of emotion words is thereby effectively enriched, and the tendency emotion classification of a text can be predicted more accurately.
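Steps 300 to 305 can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `LexiconExpander`, its parameters, and the default threshold values are hypothetical, and the sketch tracks a single emotion word bank, whereas the method keeps one per emotion classification.

```python
from collections import Counter

class LexiconExpander:
    """Hypothetical helper that grows one emotion word bank from attention weights."""

    def __init__(self, lexicon, weight_threshold=0.5, count_threshold=3):
        self.lexicon = set(lexicon)
        self.weight_threshold = weight_threshold  # preset weight threshold
        self.count_threshold = count_threshold    # preset count threshold
        self.suspect_counts = Counter()           # times marked as suspected emotion word

    def observe(self, tokens, weights):
        """Process one text; return the merged participle, or None if nothing was merged."""
        # Step 301: pick the participle with the largest normalized weight.
        i = max(range(len(tokens)), key=weights.__getitem__)
        target, weight = tokens[i], weights[i]
        # Step 302: already in the word bank, nothing to do.
        if target in self.lexicon:
            return None
        # Step 303: mark as a suspected emotion word.
        self.suspect_counts[target] += 1
        # Step 304: both thresholds must be exceeded.
        if weight > self.weight_threshold and self.suspect_counts[target] > self.count_threshold:
            self.lexicon.add(target)  # Step 305: merge into the word bank.
            return target
        return None

expander = LexiconExpander({"good"}, weight_threshold=0.5, count_threshold=2)
results = [expander.observe(["so", "zan"], [0.2, 0.8]) for _ in range(4)]
print(results)
```

With a count threshold of 2, the candidate is merged on its third sighting; the fourth call finds it already in the word bank and does nothing.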
FIG. 6 is a flowchart of a first embodiment of a method for training an emotion analysis model according to the present invention. As shown in fig. 6, the method for training an emotion analysis model in this embodiment may specifically include the following steps:
400. acquiring a training text set;
The execution subject of the emotion analysis model training method of the present embodiment is an emotion analysis model training device. The emotion analysis model training device can be a separate physical entity or a software-integrated application.
Specifically, the training text set of the present embodiment may be a set collected from a network and including several pieces of text data.
401. Extracting a training sample set based on a training text set, wherein each training sample in the training sample set comprises word segmentation expression, negative word and emotion word expression and known emotion classification;
the known emotion classification in this embodiment means that the probability corresponding to a certain known emotion classification is 1, and the probabilities of other emotion classifications are all 0.
For example, extracting a training sample set based on a training text set may specifically include:
(a4) acquiring word segmentation expression of each training text in a training text set;
for example, obtaining the word segmentation expression of each training text in the training text set may specifically include: performing word segmentation on each training text; and mapping each participle in each training text according to a preset dictionary base and a mapping dictionary corresponding to the dictionary base to obtain the corresponding participle expression of the training text.
(b4) Acquiring negative words and emotional word expressions of each training text in the training text set;
(c4) a known emotion classification for each training text in the set of training texts is obtained.
In practical applications, each training text may correspond to only one known emotion classification, or may correspond to a plurality of known emotion classifications. For each known sentiment classification, the probability of the corresponding known sentiment classification may be labeled 1.
For example, the step (b4) of obtaining the negative word and the emotional word expression of each training text in the training text set may specifically include the following steps:
(a5) performing word segmentation on each training text;
(b5) acquiring negative words and emotional words from all participles of each training text according to a preset negative word lexicon and a preset emotional word lexicon corresponding to known emotion classification;
(c5) and respectively carrying out feature mapping on the negative words and the emotion words in each training text according to a preset feature mapping strategy of the negative words and a preset feature mapping strategy of the emotion words to obtain the negation and emotion expressions of the corresponding training texts.
Specifically, the specific implementation process of steps (a4) and (b4) may refer to the manner of obtaining the word segmentation expression of the text to be analyzed and obtaining the negative word and the emotion word expression of the text to be analyzed in the embodiment shown in fig. 1, and details are not repeated here.
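The sample extraction of steps (a4) to (c4) and (a5) to (c5) can be sketched as below. Several details are assumptions, not taken from the source: whitespace splitting stands in for a real word segmenter, `NEG_ID` and `EMO_ID` are a hypothetical feature mapping strategy, and `build_sample` and its parameters are illustrative names.

```python
NEG_ID = 1  # hypothetical feature value for a negative (negation) word
EMO_ID = 2  # hypothetical feature value for an emotional word

def build_sample(text, vocab, negation_words, emotion_words, label, num_classes, unk_id=0):
    """Build one training sample from a raw training text."""
    tokens = text.split()  # placeholder for a real word segmentation step
    # Word segmentation expression: map each participle through the mapping dictionary.
    token_ids = [vocab.get(t, unk_id) for t in tokens]
    # Negative word and emotional word expression: keep only lexicon hits, in order.
    senti_ids = []
    for t in tokens:
        if t in negation_words:
            senti_ids.append(NEG_ID)
        elif t in emotion_words:
            senti_ids.append(EMO_ID)
    # Known emotion classification: probability 1 for the known class, 0 elsewhere.
    one_hot = [1.0 if k == label else 0.0 for k in range(num_classes)]
    return {"token_ids": token_ids, "senti_ids": senti_ids, "label": one_hot}

sample = build_sample(
    "not good movie",
    vocab={"not": 1, "good": 2, "movie": 3},
    negation_words={"not"},
    emotion_words={"good"},
    label=1,
    num_classes=2,
)
print(sample)
```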
402. And training the emotion analysis model based on the training sample set.
For example, step 402, training the emotion analysis model based on the training sample set, may be implemented in either of the following two manners:
the first implementation manner, which does not refer to the position expression, may specifically include the following steps:
(a5) inputting the word segmentation expression and the negative word and emotional word expression of each training sample into the emotion analysis model, and acquiring the tendency emotion classification of the corresponding training sample predicted and output by the emotion analysis model;
(b5) calculating a loss function of the emotion analysis model according to the tendency emotion classification output by the emotion analysis model and the known emotion classification labels of the corresponding training samples, and adjusting parameters of the emotion analysis model according to the calculation result of the loss function.
For example, the emotion analysis model of the present embodiment may include: an RNN-based participle processing layer for processing participle expressions; a CNN-based negative word emotion word processing layer for processing negative words and emotion word expressions; and a splicing treatment layer.
Wherein the word segmentation processing layer sequentially comprises: a word embedding layer, an RNN layer and a first attention mechanism layer; the negative word emotion word processing layer sequentially comprises: the emotion embedding layer, the CNN layer and the second attention mechanism layer; the splicing treatment layer sequentially comprises: a splicing layer, a full connection layer and a normalization layer. Correspondingly, the step (a5) of inputting the word segmentation expression and negation and emotion word expression of each training sample into the emotion analysis model, obtaining emotion analysis model prediction, and outputting a tendency emotion classification of the corresponding training sample may specifically include the following steps:
(a6) when training is performed on each training sample, inputting the word segmentation expression of the training sample into the word embedding layer to obtain the word segmentation embedding expression;
(b6) extracting feature expression containing context information of the participle based on the embedded expression of the participle by the RNN layer;
(c6) the first attention mechanism layer gives different weights to each participle based on the feature expression of the participle obtained by the RNN layer, and text feature expression is obtained by weighted summation;
(d6) inputting the expression of the negative words and the emotion words into an emotion embedding layer to obtain the embedded expression of the negative words and the emotion words;
(e6) extracting position relation characteristic expression of the negative words and the emotional words by the CNN layer based on the embedded expression of the negative words and the emotional words;
(f6) the second attention mechanism layer gives different weights to each negative word or emotional word based on the position relation feature expression obtained by the CNN layer, and obtains the negative word and emotional word feature expression by weighted summation;
(g6) the splicing layer splices the text feature expression with the negative word and emotional word feature expression to obtain the spliced global feature expression;
(h6) the full connection layer applies a mapping to the spliced global feature expression, a transformation that enhances the fitting capability of the features, to obtain the transformed feature expression;
(i6) the normalization layer maps the transformed feature expression to the final classification and outputs the final classification result, which comprises the probability of the tendency emotion classification.
Specifically, reference may also be made to the descriptions of (a2) - (i2) in the embodiment shown in fig. 1, which are not repeated herein.
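The weighted summation performed by the first and second attention mechanism layers, in steps (c6) and (f6), follows the description given earlier: a linear transformation with trainable W and b, an exponential to obtain a positive importance e_i, normalization to a_i, and a weighted sum. A minimal pure-Python sketch, assuming W reduces each feature vector to a scalar (so it is written here as a plain list `w`) and using the hypothetical name `attention_pool`:

```python
import math

def attention_pool(x, w, b):
    """Return (normalized weights a, pooled vector) for a list of feature vectors x."""
    # Linear transformation of each position's feature vector to a scalar score.
    scores = [sum(wk * xk for wk, xk in zip(w, xi)) + b for xi in x]
    # Exponential gives positive importances e_i; subtracting the max score
    # cancels out in the normalization and only avoids overflow.
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    total = sum(e)
    a = [ei / total for ei in e]  # normalized importance a_i
    dim = len(x[0])
    pooled = [sum(a[i] * x[i][k] for i in range(len(x))) for k in range(dim)]
    return a, pooled

# Toy input (hypothetical): three positions with 2-dimensional features.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a, pooled = attention_pool(x, w=[1.0, 1.0], b=0.0)
print(a, pooled)
```

The third position has the highest score and therefore the largest weight. The same normalized weights a_i are what the lexicon expansion embodiment inspects when looking for new emotional words.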
The second implementation manner of step 402 refers to the position expression. In this case, extracting a training sample set based on the training text set in step 401 may further include: acquiring the position expression of the participles of each training text in the training text set; and acquiring the position expression of the negative words and the emotional words of each training text in the training text set. For example, the position expression of the participles of a training text can be obtained by mapping the position information of each participle in that training text. Likewise, the position expression of the negative words and the emotional words of a training text can be obtained by mapping the position information of each negative word and emotional word in that training text.
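As a rough illustration of the two position expressions, the sketch below uses each participle's 0-based index in the segmented text as its position information. That choice, and the function name `position_expressions`, are assumptions; the source does not fix a particular mapping.

```python
def position_expressions(tokens, negation_words, emotion_words):
    """Return (participle positions, negative/emotional word positions)."""
    # Position expression of the participles: one index per participle.
    token_positions = list(range(len(tokens)))
    # Position expression of the negative words and emotional words:
    # the indices at which lexicon hits occur in the text.
    senti_positions = [
        i for i, t in enumerate(tokens)
        if t in negation_words or t in emotion_words
    ]
    return token_positions, senti_positions

token_pos, senti_pos = position_expressions(
    ["this", "is", "not", "good"], {"not"}, {"good"}
)
print(token_pos, senti_pos)
```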
Similarly, the emotion analysis model trained at this time may also include: an RNN-based participle processing layer for processing participle expressions and position expressions of participles; a CNN-based negative word and emotional word processing layer for processing the negative word and emotional word expression and the negative and emotional word position expression; and a splicing treatment layer.
Wherein the word segmentation processing layer can sequentially comprise: a word embedding layer, a first position embedding layer, an RNN layer and a first attention mechanism layer; the negative word emotion word processing layer sequentially comprises: an emotion embedding layer, a second position embedding layer, a CNN layer and a second attention mechanism layer; the splicing treatment layer sequentially comprises: a splicing layer, a full connection layer and a normalization layer.
Compared with the first implementation manner, the word segmentation processing layer and the negative word emotion word processing layer are additionally provided with a first position embedding layer and a second position embedding layer, respectively. The splicing treatment layer is the same as in the first implementation manner described above.
Accordingly, in the second implementation manner, the step 402 of training the emotion analysis model based on the training sample set may specifically include: inputting the word segmentation expression, the position expression of the participles, the negative word and emotional word expression, and the position expression of the negative words and the emotional words of each training sample into the emotion analysis model, and acquiring the tendency emotion classification of the corresponding training sample predicted and output by the emotion analysis model.
For example, inputting the word segmentation expression, the position expression of the participles, the negative word and emotional word expression, and the position expression of the negative words and the emotional words of each training sample into the emotion analysis model, and acquiring the tendency emotion classification of the corresponding training sample predicted and output by the emotion analysis model, may specifically include the following steps:
(a7) when training is carried out on each training sample, the word segmentation expression of the training sample is input into a word embedding layer of the emotion analysis model, and the word segmentation embedding expression is obtained;
(b7) inputting the embedded expression of the participles output by the word embedding layer and the position expression of the participles into a first position embedding layer, so that the position embedding expression of each participle is added by the first position embedding layer on the basis of the embedded expression of the participles;
(c7) extracting the feature expression of the participles, which contains context information, by the RNN layer based on the embedded expression of the participles and the position embedded expression of each participle;
(d7) the first attention mechanism layer gives different weights to each participle based on the feature expression of the participle obtained by the RNN layer, and text feature expression is obtained by weighted summation;
(e7) inputting the expression of the negative words and the emotion words into an emotion embedding layer to obtain the embedded expression of the negative words and the emotion words;
(f7) inputting the embedded expression of the negative words and the emotional words output by the emotion embedding layer, together with the position expression of the negative words and the emotional words, into a second position embedding layer, so that the second position embedding layer adds the position embedded expression of the negative words and the emotional words on the basis of their embedded expression;
(g7) extracting position relation characteristic expression of the negative words and the emotional words by the CNN layer based on the embedded expression of the negative words and the emotional words and the position embedded expression of the negative words and the emotional words;
(h7) the second attention mechanism layer gives different weights to each negative word or emotional word based on the position relation feature expression obtained by the CNN layer, and obtains the negative word and emotional word feature expression by weighted summation;
(i7) the splicing layer splices the text feature expression with the negative word and emotional word feature expression to obtain the spliced global feature expression;
(j7) the full connection layer applies a mapping to the spliced global feature expression, a transformation that enhances the fitting capability of the features, to obtain the transformed feature expression;
(k7) the normalization layer maps the transformed feature expression to the final classification and outputs the final classification result, which comprises the probability of the tendency emotion classification.
Specifically, reference may also be made to the descriptions of (a3) - (k3) in the embodiment shown in fig. 2, which are not repeated herein.
By adopting the above technical scheme, the emotion analysis model can be trained based on the word segmentation expression and the negative word and emotional word expression, so that the trained emotion analysis model predicts the tendency emotion classification more accurately.
FIG. 7 is a flowchart of a second embodiment of the emotion analysis model training method of the present invention. As shown in fig. 7, the method for training an emotion analysis model in this embodiment may specifically include the following steps:
500. acquiring sentences that carry both text and expressions (emoticons);
501. obtaining emotion classification corresponding to the expression;
502. predicting the emotion classification corresponding to the text by adopting an emotion analysis model;
503. judging whether the emotion classification corresponding to the expression is consistent with the emotion classification corresponding to the text; if so, go to step 504; if not, go to step 505;
504. adding sentences carrying texts and expressions to a training text set as training texts; step 506 is executed;
505. outputting the emotion classification corresponding to the expression and the emotion classification corresponding to the text, so that a worker can refer to both when manually labeling the emotion classification of the sentence carrying text and expressions; end.
Further, the method may also include adding the manually labeled sentences carrying text and expressions to the training text set as training texts.
For example, in this embodiment, step 500 is executed before the training text set is acquired, and serves to expand the training text set and enrich its corpus.
506. configuring a loss function weight for the added training text, so that when the emotion analysis model is trained with the added training text, the corresponding loss function is adjusted by the loss function weight and parameters are adjusted based on the adjusted loss function.
Relative to the training texts already contained in the training text set, the added training text may be configured with a loss function weight, which may be any number between 0 and 1. If the added training text is considered as important as the other training texts, the weight can be set to 1; if it is considered less important than the other training texts, a weight greater than 0 and smaller than 1 may be set. After the loss function is calculated in the manner of the above embodiment, it is multiplied by the weight, and parameter adjustment is then performed based on the weighted loss function, so as to reduce the influence of the added training text on model training.
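The flow of steps 500 to 506 can be sketched as follows. The names `route_sentence` and `weighted_cross_entropy`, the dictionary layout, and the stub predictor are hypothetical; cross entropy is used because the description refers to the original cross-entropy loss function, and a real system would call the trained emotion analysis model instead of the stub.

```python
import math

def weighted_cross_entropy(probs, one_hot, loss_weight=1.0):
    """Cross-entropy loss of one sample, scaled by its loss function weight."""
    ce = -sum(y * math.log(max(p, 1e-12)) for p, y in zip(probs, one_hot))
    return loss_weight * ce

def route_sentence(text, expr_label, predict, training_set, review_queue, loss_weight=0.5):
    """Steps 502-506: compare the model's class with the expression's class."""
    predicted_label, probs = predict(text)       # step 502
    if predicted_label == expr_label:            # step 503
        # Steps 504 and 506: add as training text with a configured loss weight.
        training_set.append({"text": text, "label": expr_label, "loss_weight": loss_weight})
        return "added"
    # Step 505: hand both classifications to a human annotator.
    review_queue.append({"text": text, "expr_label": expr_label, "predicted": predicted_label})
    return "review"

def stub_predict(text):
    """Stand-in for the trained model (assumption): class 1 is 'positive'."""
    return (1, [0.3, 0.7]) if "great" in text else (0, [0.6, 0.4])

training_set, review_queue = [], []
first = route_sentence("great day", 1, stub_predict, training_set, review_queue)
second = route_sentence("fine I guess", 1, stub_predict, training_set, review_queue)
loss = weighted_cross_entropy([0.3, 0.7], [0.0, 1.0], training_set[0]["loss_weight"])
print(first, second, loss)
```

Multiplying the per-sample loss by a weight below 1 shrinks the gradient contributed by the added text, which is how its influence on training is reduced.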
Specifically, to address the problem of a small training corpus, training data may be increased by acquiring text that contains expressions (emoticons). For example, text crawled from a social application may include expressions, which are themselves relatively clear expressions of emotion. For the emotion polarity classification problem, the expressions can be divided into three types: positive, negative, and no obvious emotion; for the multi-emotion classification problem, different expressions can likewise be mapped to different emotions. After the expressions are removed, a plain-text corpus with emotion category information is obtained, which can be added to the training corpus to improve the model. Sometimes, however, the emotion of a microblog post is conveyed entirely by the expression, or the text is ironic. For such cases, the texts can be predicted with a trained emotion analysis model, and a text is removed if the prediction does not lean toward the emotion corresponding to its expression, for example when the expression is positive but the model predicts the text to be negative with 51% probability. Meanwhile, for the newly added texts, the corresponding loss function weight is reduced according to the prediction of the existing emotion analysis model:
Figure BDA0003381685630000141
where the formula involves the probability of the j-th class predicted by the emotion analysis model, and summing over these terms yields the original cross-entropy loss function. p_i is the probability with which the emotion analysis model predicts that a newly added text belongs to its corresponding emotion classification; that is, the more certain it is that the text belongs to a given class, the larger the weight of its loss function (between 0.5 and 1.0). For texts whose class was determined originally, the probability of the corresponding class is p_i = 1.
In this embodiment, data whose predicted category differs from the expression category may also be selected and sorted in ascending order of the probability of being classified into the corresponding category, so that the samples with the largest discrepancy come first. A small number of samples at the front are then labeled through manual review to obtain the correct categories. The corresponding texts are added to the training text set, and their probability for the corresponding category is set to p_i = 1.
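The confidence-based weighting and the manual review ordering can be sketched as below. Because the loss formula itself is only available as an image, the sketch assumes the loss function weight is simply p_i, which is what the surrounding text suggests; the function names and the sample dictionary keys are hypothetical.

```python
def confidence_weight(prob_of_label, is_original):
    """Loss weight p_i: 1 for original or manually reviewed texts, else the model's probability."""
    return 1.0 if is_original else prob_of_label

def review_candidates(samples, k):
    """Pick the k most discrepant samples for manual labeling."""
    # Keep only samples whose predicted category differs from the expression category.
    disagreeing = [s for s in samples if s["predicted"] != s["expr_label"]]
    # Ascending probability of the expression's category: largest discrepancy first.
    disagreeing.sort(key=lambda s: s["prob_expr_label"])
    return disagreeing[:k]

samples = [
    {"text": "t1", "expr_label": 1, "predicted": 1, "prob_expr_label": 0.9},
    {"text": "t2", "expr_label": 1, "predicted": 0, "prob_expr_label": 0.45},
    {"text": "t3", "expr_label": 0, "predicted": 1, "prob_expr_label": 0.10},
    {"text": "t4", "expr_label": 1, "predicted": 0, "prob_expr_label": 0.30},
]
queue = review_candidates(samples, k=2)
print([s["text"] for s in queue])
print(confidence_weight(0.9, is_original=False), confidence_weight(0.9, is_original=True))
```

Once a reviewer has confirmed the category of a queued text, it would be treated like an original text and given a weight of 1.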
By adopting the above technical scheme, the emotion analysis model training method can enrich the corpus in the training text set, overcoming the prior-art problem of a scarce training corpus. Training with the enriched training text set of this embodiment can further improve the prediction accuracy of the trained emotion analysis model.
Fig. 8 is a block diagram of an emotion analyzing apparatus for text according to an embodiment of the present invention. As shown in fig. 8, the emotion analyzing apparatus for text of the present embodiment includes:
the word segmentation information acquisition module 10 is used for acquiring word segmentation expressions of a text to be analyzed;
the negative word and emotion word information acquisition module 11 is used for acquiring negative words and emotion word expressions of the text to be analyzed;
the prediction module 12 is configured to input the segmented word expression of the text to be analyzed acquired by the segmented word information acquisition module 10 and the negation and emotion word expression acquired by the negation word and emotion word information acquisition module 11 into the trained emotion analysis model, and acquire a tendency emotion classification of the text to be analyzed predicted and output by the emotion analysis model.
Further optionally, the word segmentation information obtaining module 10 is configured to:
performing word segmentation on a text to be analyzed;
and mapping each participle in the text to be analyzed according to a preset dictionary base and a mapping dictionary corresponding to the dictionary base to obtain the participle expression of the text to be analyzed.
Further optionally, the negative word and emotion word information obtaining module 11 is configured to:
performing word segmentation on a text to be analyzed;
acquiring negative words and emotional words from all the participles of the text to be analyzed according to a preset negative word lexicon and a preset emotional word lexicon;
and respectively carrying out feature mapping on the negative words and the emotion words in the text to be analyzed according to a preset feature mapping strategy of the negative words and a preset feature mapping strategy of the emotion words to obtain negation and emotion expression of the text to be analyzed.
Further optionally, in the apparatus for emotion analysis of text in this embodiment, the trained emotion analysis model includes:
a recurrent neural network-based participle processing layer for processing participle expressions;
a negative word emotion word processing layer based on a convolution neural network and used for processing the negative words and emotion word expressions; and
a splicing treatment layer.
Further optionally, in the apparatus for emotion analyzing of text in this embodiment, the word segmentation processing layer sequentially includes: a word embedding layer, a recurrent neural network layer and a first attention mechanism layer; the negative word emotion word processing layer sequentially comprises: an emotion embedding layer, a convolutional neural network layer and a second attention mechanism layer; the splicing treatment layer sequentially comprises: a splicing layer, a full connection layer and a normalization layer;
the prediction module 12 is configured to:
inputting the word segmentation expression of the text to be analyzed into a word embedding layer to obtain the embedding expression of the word segmentation;
extracting feature expression containing context information of the participle by a recurrent neural network layer based on the embedded expression of the participle;
the first attention mechanism layer gives different weights to each participle based on the characteristic expression of the participle obtained by the recurrent neural network layer, and the text characteristic expression is obtained by weighted summation;
inputting the expression of the negative words and the emotion words into an emotion embedding layer to obtain the embedded expression of the negative words and the emotion words;
extracting position relation characteristic expression of the negative words and the emotional words by the convolutional neural network layer based on the embedded expression of the negative words and the emotional words;
the second attention mechanism layer gives different weights to each negative word or emotional word based on the position relation feature expression obtained by the convolutional neural network layer, and obtains the negative word and emotional word feature expression by weighted summation;
the splicing layer splices the text feature expression with the negative word and emotional word feature expression to obtain the spliced global feature expression;
the full connection layer applies a mapping to the spliced global feature expression, a transformation that enhances the fitting capability of the features, to obtain the transformed feature expression;
the normalization layer maps the transformed feature expression to the final classification and outputs the final classification result, which comprises the tendency emotion classification and the corresponding probability.
Further optionally, in the emotion analyzing apparatus for a text in this embodiment, the word segmentation information obtaining module 10 is further configured to obtain a position expression of a word segmentation of the text to be analyzed;
the negative word and emotion word information acquisition module 11 is further configured to acquire position expressions of negative words and emotion words of the text to be analyzed;
the prediction module 12 is further configured to input the word segmentation expression and the position expression of the participles of the text to be analyzed, acquired by the word segmentation information acquisition module 10, together with the negative word and emotion word expression and the position expression of the negative words and emotion words, acquired by the negative word and emotion word information acquisition module 11, into the emotion analysis model, and to acquire the tendency emotion classification of the text to be analyzed predicted and output by the emotion analysis model.
Further optionally, in the emotion analyzing apparatus for a text in this embodiment, the word segmentation information obtaining module 10 is configured to map the position expression of the word segmentation according to the position information of each participle in the text to be analyzed.
Further optionally, in the emotion analyzing apparatus for a text in this embodiment, the negative word and emotion word information obtaining module 11 is configured to map the position expression of the negative words and emotion words according to the position information of each negative word and emotion word in the text to be analyzed.
Further optionally, in the apparatus for emotion analysis of text in this embodiment, the trained emotion analysis model includes:
a word segmentation processing layer based on a recurrent neural network, for processing the word segmentation expression and the position expression of the word segmentation;
a negative word emotion word processing layer based on a convolutional neural network, for processing the negative word and emotion word expression and the position expression of the negative words and emotion words; and
a splicing processing layer.
Further optionally, in the apparatus for emotion analyzing of text in this embodiment, the word segmentation processing layer sequentially includes: a word embedding layer, a first position embedding layer, a recurrent neural network layer and a first attention mechanism layer; the negative word emotion word processing layer sequentially includes: an emotion embedding layer, a second position embedding layer, a convolutional neural network layer and a second attention mechanism layer; the splicing processing layer sequentially includes: a splicing layer, a full connection layer and a normalization layer;
the prediction module 12 is configured to perform the following operations:
inputting the word segmentation expression of the text to be analyzed into the word embedding layer of the emotion analysis model to obtain the embedded expression of the participles;
inputting the embedded expression of the participles output by the word embedding layer and the position expression of the participles into the first position embedding layer, so that the first position embedding layer adds the position embedded expression of each participle to the embedded expression of the participles;
extracting, by the recurrent neural network layer, the feature expression of the participles containing context information, based on the embedded expression and the position embedded expression of the participles;
assigning, by the first attention mechanism layer, a different weight to each participle based on the feature expression of the participles obtained by the recurrent neural network layer, and obtaining the text feature expression by weighted summation;
inputting the negative word and emotion word expression into the emotion embedding layer to obtain the embedded expression of the negative words and emotion words;
inputting the embedded expression of the negative words and emotion words output by the emotion embedding layer and the position expression of the negative words and emotion words into the second position embedding layer, so that the second position embedding layer adds the position embedded expression of the negative words and emotion words to their embedded expression;
extracting, by the convolutional neural network layer, the position relation feature expression of the negative words and emotion words based on their embedded expression and position embedded expression;
assigning, by the second attention mechanism layer, a different weight to each negative word or emotion word based on the position relation feature expression obtained by the convolutional neural network layer, and obtaining the negative word and emotion word feature expression by weighted summation;
splicing, by the splicing layer, the text feature expression with the negative word and emotion word feature expression to obtain a spliced global feature expression;
transforming, by the full connection layer, the spliced global feature expression through mapping so as to enhance the feature fitting capability, to obtain the transformed feature expression;
and mapping, by the normalization layer, the transformed feature expression to the final classification and outputting a final classification result, wherein the final classification result comprises the tendency emotion classification and the corresponding probability.
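As a rough illustration of the second position embedding layer and the convolutional neural network layer, the sketch below adds a position embedding to each negative word or emotion word embedding and then slides a small kernel over the result. It assumes that "adding" means an element-wise sum; the lookup tables, the kernel values and the helper names `add_position` and `conv1d` are hypothetical.

```python
def add_position(embedded, position_embedded):
    """Element-wise sum of each embedding with its position embedding
    (one assumed way of adding position information)."""
    return [[e + p for e, p in zip(e_vec, p_vec)]
            for e_vec, p_vec in zip(embedded, position_embedded)]

def conv1d(sequence, kernel):
    """Slide a kernel of shape [width][dim] along the sequence axis and
    return one value per window (no padding)."""
    width = len(kernel)
    outputs = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        outputs.append(sum(k * x
                           for k_row, x_row in zip(kernel, window)
                           for k, x in zip(k_row, x_row)))
    return outputs

# Hypothetical lookup tables; a trained model would learn these values.
emotion_table = {1: [1.0, 0.0], 2: [0.0, 1.0]}    # 1 = negative word, 2 = emotion word
position_table = {3: [0.1, 0.1], 4: [0.2, 0.2]}   # keyed by position in the text

feature_ids = [1, 2]   # e.g. "not" followed by "good"
positions = [3, 4]     # where those two words occur in the text

embedded = [emotion_table[i] for i in feature_ids]
position_embedded = [position_table[p] for p in positions]
combined = add_position(embedded, position_embedded)

# A width-2 kernel sees two adjacent negative/emotion words at once,
# which is how a convolution can pick up their positional relation.
relation_features = conv1d(combined, kernel=[[1.0, 0.0], [0.0, 1.0]])
```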
Further optionally, as shown in fig. 8, the apparatus for emotion analyzing of text in this embodiment further includes:
the weight obtaining module 13 is configured to obtain the normalized weight of each participle, which the first attention mechanism layer outputs after assigning a different weight to each participle based on the feature expression of the participles obtained by the recurrent neural network layer and normalizing the weights of the participles at all positions;
the target participle obtaining module 14 is configured to obtain, from the multiple participles of the text to be analyzed, the target participle with the maximum normalized weight according to the normalized weight of each participle obtained by the weight obtaining module 13;
the judging module 15 is configured to judge whether the target participle acquired by the target participle obtaining module 14 is included in the emotion word lexicon corresponding to the tendency emotion classification, and if not, to mark the target participle as a suspected emotion word;
the judging module 15 is further configured to judge whether the normalized weight of the target participle acquired by the target participle obtaining module 14 is greater than a preset weight threshold, and whether the total number of times that the target participle has been marked as a suspected emotion word is greater than a preset number threshold;
the merging module 16 is configured to merge the target participle into the emotion word lexicon corresponding to the tendency emotion classification if the judging module 15 determines that both thresholds are exceeded.
In this way, the negative word and emotion word information obtaining module 11 may be configured to obtain the negative words and the emotion words from all the participles of the text to be analyzed according to the preset negative word lexicon and the emotion word lexicon updated by the merging module 16.
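The lexicon update performed by modules 13 to 16 can be sketched as follows. The threshold values, the function names `pick_target` and `maybe_merge`, and the example words are assumptions made for illustration only.

```python
from collections import Counter

def pick_target(participles, normalized_weights):
    """Return the participle with the largest normalized attention weight."""
    best = max(range(len(participles)), key=lambda i: normalized_weights[i])
    return participles[best], normalized_weights[best]

def maybe_merge(word, weight, lexicon, suspect_counts,
                weight_threshold=0.5, count_threshold=3):
    """Mark `word` as a suspected emotion word if the lexicon lacks it, and
    merge it once both thresholds are exceeded. Returns True when merged."""
    if word in lexicon:
        return False
    suspect_counts[word] += 1
    if weight > weight_threshold and suspect_counts[word] > count_threshold:
        lexicon.add(word)
        return True
    return False

positive_lexicon = {"good", "great"}   # lexicon for the predicted classification
suspect_counts = Counter()

participles = ["the", "service", "was", "superb"]
normalized_weights = [0.05, 0.15, 0.10, 0.70]

# The same high-weight word has to show up in several texts before it is merged.
results = []
for _ in range(4):
    word, weight = pick_target(participles, normalized_weights)
    results.append(maybe_merge(word, weight, positive_lexicon, suspect_counts))
```

Requiring both a high weight and a repeated count keeps a single unusual text from polluting the lexicon.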
The implementation principle and technical effect of the emotion analysis apparatus for a text in this embodiment, which performs emotion analysis of a text by using the above modules, are the same as those of the related method embodiments described above; reference may be made to the description of those embodiments for details, which are not repeated here.
FIG. 9 is a block diagram of an embodiment of an emotion analysis model training apparatus according to the present invention. As shown in fig. 9, the emotion analysis model training apparatus according to the present embodiment includes:
the obtaining module 20 is configured to obtain a training text set;
the extracting module 21 is configured to extract a training sample set based on the training text set acquired by the acquiring module 20, where each training sample in the training sample set includes a word segmentation expression, a negative word and emotion word expression, and a known emotion classification;
the training module 22 is configured to train the emotion analysis model based on the training sample set processed by the extraction module 21.
Further optionally, in the training apparatus for emotion analysis model of this embodiment, the extraction module 21 is configured to:
obtain the word segmentation expression of each training text in the training text set;
obtain the negative word and emotion word expression of each training text in the training text set;
and obtain the known emotion classification of each training text in the training text set.
Further optionally, in the training apparatus for emotion analysis model of this embodiment, the extraction module 21 is configured to:
perform word segmentation on each training text;
and map each participle in each training text according to a preset dictionary base and a mapping dictionary corresponding to the dictionary base, to obtain the word segmentation expression of the corresponding training text.
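A minimal sketch of this mapping step is given below. It assumes the text has already been segmented (the toy English sentence is simply split on spaces) and that unknown participles share a reserved id; `build_mapping`, `to_expression` and the `<unk>` token are hypothetical names, not part of this embodiment.

```python
def build_mapping(dictionary, unk_token="<unk>"):
    """Assign an integer id to each dictionary entry; id 0 is kept for unknown words."""
    mapping = {unk_token: 0}
    for word in dictionary:
        mapping.setdefault(word, len(mapping))
    return mapping

def to_expression(participles, mapping, unk_token="<unk>"):
    """Map each participle to its id, falling back to the unknown id."""
    return [mapping.get(word, mapping[unk_token]) for word in participles]

dictionary = ["this", "phone", "is", "not", "good"]   # toy dictionary base
mapping = build_mapping(dictionary)

participles = "this phone is not bad".split()         # stand-in for a real segmenter
expression = to_expression(participles, mapping)      # [1, 2, 3, 4, 0]
```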
Further optionally, in the training apparatus for emotion analysis model of this embodiment, the extraction module 21 is configured to:
perform word segmentation on each training text;
acquire negative words and emotion words from all participles of each training text according to a preset negative word lexicon and a preset emotion word lexicon corresponding to the known emotion classification;
and perform feature mapping on the negative words and the emotion words in each training text according to a preset feature mapping strategy for negative words and a preset feature mapping strategy for emotion words, respectively, to obtain the negative word and emotion word expression of the corresponding training text.
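The lookup and feature mapping can be illustrated as follows. The lexicon contents and the integer feature ids are invented for the example; the actual feature mapping strategies are whatever the embodiment presets.

```python
# Hypothetical lexicons and feature ids, for illustration only.
NEGATIVE_WORD_LEXICON = {"not", "never", "no"}
EMOTION_WORD_LEXICON = {"good": "positive", "great": "positive", "bad": "negative"}
FEATURE_IDS = {"negation": 1, "positive": 2, "negative": 3}

def negative_and_emotion_expression(participles):
    """Keep only negative words and emotion words, in text order,
    and map each one to its feature id."""
    expression = []
    for word in participles:
        if word in NEGATIVE_WORD_LEXICON:
            expression.append(FEATURE_IDS["negation"])
        elif word in EMOTION_WORD_LEXICON:
            expression.append(FEATURE_IDS[EMOTION_WORD_LEXICON[word]])
    return expression

participles = "this phone is not good".split()
expression = negative_and_emotion_expression(participles)   # [1, 2]
```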
Further optionally, in the training apparatus for emotion analysis model of this embodiment, the training module 22 is configured to:
input the word segmentation expression and the negative word and emotion word expression of each training sample into the emotion analysis model, and obtain the tendency emotion classification of the corresponding training sample predicted and output by the emotion analysis model;
and calculate a loss function of the emotion analysis model according to the tendency emotion classification output by the emotion analysis model and the known emotion classification label of the corresponding training sample, and adjust the parameters of the emotion analysis model according to the calculation result of the loss function.
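The loss calculation and parameter adjustment can be sketched on a single linear layer. The text does not name a particular loss function or optimizer, so the cross-entropy loss, the plain gradient-descent update, the learning rate and the toy feature vector below are all assumptions.

```python
import math

def softmax(logits):
    """Turn raw class scores into probabilities."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_step(weights, bias, features, label, lr=0.1):
    """One gradient-descent step on softmax cross-entropy.
    Updates weights and bias in place and returns the loss before the update."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    probs = softmax(logits)
    loss = -math.log(probs[label])
    for k, row in enumerate(weights):
        grad = probs[k] - (1.0 if k == label else 0.0)   # d(loss)/d(logit_k)
        for j, x in enumerate(features):
            row[j] -= lr * grad * x
        bias[k] -= lr * grad
    return loss

weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # two classes, three features
bias = [0.0, 0.0]
features = [0.5, -0.2, 0.8]                    # stand-in for the transformed feature expression
known_label = 0                                # index of the known emotion classification

loss_before = train_step(weights, bias, features, known_label)
loss_after = train_step(weights, bias, features, known_label)
```

In this toy run the second loss is lower than the first, which is the effect the parameter adjustment aims for.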
Further optionally, in the training apparatus for emotion analysis model of this embodiment, the emotion analysis model includes:
a word segmentation processing layer based on a recurrent neural network, for processing the word segmentation expression;
a negative word emotion word processing layer based on a convolutional neural network, for processing the negative word and emotion word expression; and
a splicing processing layer.
Further optionally, in the training apparatus for emotion analysis model of this embodiment, the word segmentation processing layer sequentially includes: a word embedding layer, a recurrent neural network layer and a first attention mechanism layer; the negative word emotion word processing layer sequentially includes: an emotion embedding layer, a convolutional neural network layer and a second attention mechanism layer; the splicing processing layer sequentially includes: a splicing layer, a full connection layer and a normalization layer; the training module 22 is configured to perform the following operations:
when training on each training sample, inputting the word segmentation expression of the training sample into the word embedding layer to obtain the embedded expression of the participles;
extracting, by the recurrent neural network layer, the feature expression of the participles containing context information, based on the embedded expression of the participles;
assigning, by the first attention mechanism layer, a different weight to each participle based on the feature expression of the participles obtained by the recurrent neural network layer, and obtaining the text feature expression by weighted summation;
inputting the negative word and emotion word expression into the emotion embedding layer to obtain the embedded expression of the negative words and emotion words;
extracting, by the convolutional neural network layer, the position relation feature expression of the negative words and emotion words based on their embedded expression;
assigning, by the second attention mechanism layer, a different weight to each negative word or emotion word based on the position relation feature expression obtained by the convolutional neural network layer, and obtaining the negative word and emotion word feature expression by weighted summation;
splicing, by the splicing layer, the text feature expression with the negative word and emotion word feature expression to obtain a spliced global feature expression;
transforming, by the full connection layer, the spliced global feature expression through mapping so as to enhance the feature fitting capability, to obtain the transformed feature expression;
and mapping, by the normalization layer, the transformed feature expression to the final classification and outputting a final classification result, wherein the final classification result comprises the tendency emotion classification and the corresponding probability.
Further optionally, in the training apparatus for emotion analysis model of this embodiment, the extraction module 21 is further configured to:
acquire the position expression of the participles of each training text in the training text set;
and acquire the position expression of the negative words and emotion words of each training text in the training text set.
Further optionally, in the training apparatus for emotion analysis model of this embodiment, the extraction module 21 is further configured to:
map the position expression of the participles of the corresponding training text according to the position information of each participle in each training text;
and map the position expression of the negative words and emotion words of the corresponding training text according to the position information of each negative word and emotion word in each training text.
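One simple way to produce both position expressions is to record the index of each word in the segmented text, as sketched below. The 1-based numbering (which leaves 0 free for padding) and the function name `position_expression` are assumptions for the example.

```python
def position_expression(participles, selected=None):
    """Return 1-based positions of the participles. If `selected` is given,
    return only the positions of participles found in that set."""
    return [index
            for index, word in enumerate(participles, start=1)
            if selected is None or word in selected]

participles = "this phone is not good".split()

# Position expression of every participle in the text.
word_positions = position_expression(participles)                  # [1, 2, 3, 4, 5]

# Position expression of the negative words and emotion words only.
neg_emo_positions = position_expression(participles,
                                        selected={"not", "good"})  # [4, 5]
```

Using positions from the full text for the negative words and emotion words preserves how far apart they are, for instance whether a negative word directly precedes an emotion word.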
Further optionally, in the training apparatus for emotion analysis model of this embodiment, the trained emotion analysis model includes:
a word segmentation processing layer based on a recurrent neural network, for processing the word segmentation expression and the position expression of the word segmentation;
a negative word emotion word processing layer based on a convolutional neural network, for processing the negative word and emotion word expression and the position expression of the negative words and emotion words; and
a splicing processing layer.
Further optionally, in the training apparatus for emotion analysis model of this embodiment, the training module 22 is configured to:
input the word segmentation expression, the position expression of the word segmentation, the negative word and emotion word expression, and the position expression of the negative words and emotion words of each training sample into the emotion analysis model, and obtain the tendency emotion classification of the corresponding training sample predicted and output by the emotion analysis model.
Further optionally, in the training apparatus for emotion analysis model of this embodiment, the word segmentation processing layer sequentially includes: a word embedding layer, a first position embedding layer, a recurrent neural network layer and a first attention mechanism layer; the negative word emotion word processing layer sequentially includes: an emotion embedding layer, a second position embedding layer, a convolutional neural network layer and a second attention mechanism layer; the splicing processing layer sequentially includes: a splicing layer, a full connection layer and a normalization layer;
the training module 22 is configured to perform the following operations:
when training on each training sample, inputting the word segmentation expression of the training sample into the word embedding layer of the emotion analysis model to obtain the embedded expression of the participles;
inputting the embedded expression of the participles output by the word embedding layer and the position expression of the participles into the first position embedding layer, so that the first position embedding layer adds the position embedded expression of each participle to the embedded expression of the participles;
extracting, by the recurrent neural network layer, the feature expression of the participles containing context information, based on the embedded expression and the position embedded expression of the participles;
assigning, by the first attention mechanism layer, a different weight to each participle based on the feature expression of the participles obtained by the recurrent neural network layer, and obtaining the text feature expression by weighted summation;
inputting the negative word and emotion word expression into the emotion embedding layer to obtain the embedded expression of the negative words and emotion words;
inputting the embedded expression of the negative words and emotion words output by the emotion embedding layer and the position expression of the negative words and emotion words into the second position embedding layer, so that the second position embedding layer adds the position embedded expression of the negative words and emotion words to their embedded expression;
extracting, by the convolutional neural network layer, the position relation feature expression of the negative words and emotion words based on their embedded expression and position embedded expression;
assigning, by the second attention mechanism layer, a different weight to each negative word or emotion word based on the position relation feature expression obtained by the convolutional neural network layer, and obtaining the negative word and emotion word feature expression by weighted summation;
splicing, by the splicing layer, the text feature expression with the negative word and emotion word feature expression to obtain a spliced global feature expression;
transforming, by the full connection layer, the spliced global feature expression through mapping so as to enhance the feature fitting capability, to obtain the transformed feature expression;
and mapping, by the normalization layer, the transformed feature expression to the final classification and outputting a final classification result, wherein the final classification result comprises the tendency emotion classification and the corresponding probability.
Further optionally, as shown in fig. 9, the training apparatus for emotion analysis models in this embodiment further includes a prediction module 23, a determination module 24, and an addition module 25;
the obtaining module 20 is further configured to obtain a sentence carrying a text and an expression;
the obtaining module 20 is further configured to obtain emotion classifications corresponding to the expressions;
the prediction module 23 is configured to predict an emotion classification corresponding to the text by using an emotion analysis model;
the judging module 24 is configured to judge whether the emotion classification corresponding to the expression obtained by the obtaining module 20 is consistent with the emotion classification corresponding to the text obtained by the predicting module 23;
the adding module 25 is configured to, if the judging module 24 determines that the two emotion classifications are consistent, take the sentence carrying the text and the expression as a training text and add it to the training text set.
Further optionally, as shown in fig. 9, the apparatus for training an emotion analysis model according to this embodiment further includes:
the output module 26 is configured to, based on the judgment of the judgment module 24, output the emotion classification corresponding to the expression and the emotion classification corresponding to the text if the emotion classification corresponding to the expression is inconsistent with the emotion classification corresponding to the text, so that a worker manually marks the emotion classification of the sentence carrying the text and the expression with reference to the emotion classification corresponding to the expression and the emotion classification corresponding to the text.
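The consistency check performed by modules 23 to 26 can be sketched as a simple triage routine. The sketch assumes the expression is an emoticon-like symbol with a known emotion classification; `toy_predict` is a stand-in for the emotion analysis model, and all names and data are hypothetical.

```python
def triage(sentences, expression_emotion, predict):
    """Split sentences into those accepted as new training texts and
    those sent for manual labeling, depending on whether the expression's
    emotion classification matches the model's prediction for the text."""
    accepted, for_review = [], []
    for text, expression in sentences:
        expression_label = expression_emotion[expression]
        text_label = predict(text)
        if expression_label == text_label:
            accepted.append((text, text_label))
        else:
            for_review.append((text, expression_label, text_label))
    return accepted, for_review

def toy_predict(text):
    """Stand-in for the trained emotion analysis model."""
    return "negative" if "not" in text.split() else "positive"

expression_emotion = {":)": "positive", ":(": "negative"}
sentences = [("I love it", ":)"), ("this is not ok", ":)")]

accepted, for_review = triage(sentences, expression_emotion, toy_predict)
```

Entries in `for_review` carry both classifications so that the person labeling the sentence can refer to them.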
Further optionally, as shown in fig. 9, the apparatus for training an emotion analysis model according to this embodiment further includes:
the configuration module 27 is configured to configure a loss function weight for the training text added by the adding module 25, so that when the emotion analysis model is trained by using the added training text, the loss function weight is used to adjust a corresponding loss function, and parameter adjustment is performed based on the adjusted loss function.
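A loss function weight of this kind could be applied as in the sketch below, which scales each sample's loss by its weight before averaging. The cross-entropy loss, the normalization by the sum of weights and the weight value 0.5 are assumptions for illustration; the embodiment does not fix these details here.

```python
import math

def weighted_batch_loss(samples):
    """samples: list of (probability assigned to the known label, loss weight).
    Returns the weight-normalized average cross-entropy loss."""
    total = sum(weight * -math.log(prob) for prob, weight in samples)
    return total / sum(weight for _, weight in samples)

# First sample: an original training text (weight 1.0).
# Second sample: an automatically added text, down-weighted to 0.5 because
# its label was inferred rather than manually verified.
with_weights = weighted_batch_loss([(0.9, 1.0), (0.1, 0.5)])
without_weights = weighted_batch_loss([(0.9, 1.0), (0.1, 1.0)])
```

Here the badly predicted added sample contributes less to `with_weights` than to `without_weights`, so a possibly noisy label pulls the parameters less strongly.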
The implementation principle and technical effect of the apparatus for training an emotion analysis model in this embodiment, which trains the emotion analysis model by using the above modules, are the same as those of the related method embodiments; reference may be made to the description of those embodiments for details, which are not repeated here.
FIG. 10 shows a schematic structural diagram of a computing device that can be used to implement the above method according to an embodiment of the invention. The computing device of the embodiment can be used for not only realizing the emotion analysis method of the text, but also realizing the training method of the emotion analysis model.
Referring to fig. 10, the computing device 1000 includes a memory 1010 and a processor 1020.
The processor 1020 may be a multi-core processor or may include multiple processors. In some embodiments, processor 1020 may include a general-purpose host processor and one or more special purpose coprocessors such as a Graphics Processor (GPU), Digital Signal Processor (DSP), or the like. In some embodiments, processor 1020 may be implemented using custom circuits, such as an Application Specific Integrated Circuit (ASIC) or a Field Programmable Gate Array (FPGA).
The memory 1010 may include various types of storage units, such as system memory, read-only memory (ROM), and persistent storage. The ROM may store static data or instructions needed by the processor 1020 or other modules of the computer. The persistent storage may be a readable and writable, non-volatile storage device that does not lose stored instructions and data even after the computer is powered off. In some embodiments, a mass storage device (e.g., a magnetic or optical disk, or flash memory) is used as the persistent storage. In other embodiments, the persistent storage may be a removable storage device (e.g., a floppy disk or an optical drive). The system memory may be a readable and writable storage device, or a volatile readable and writable storage device such as dynamic random access memory. The system memory may store instructions and data that some or all of the processors require at runtime. Further, the memory 1010 may include any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory) as well as magnetic and/or optical disks. In some embodiments, the memory 1010 may include a removable storage device that is readable and/or writable, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density optical disc, a flash memory card (e.g., SD card, mini SD card, Micro-SD card), a magnetic floppy disk, or the like. Computer-readable storage media do not include carrier waves or transitory electronic signals transmitted wirelessly or over wires.
The memory 1010 has executable code stored thereon, which, when executed by the processor 1020, causes the processor 1020 to perform the emotion analysis method for a text or the training method for an emotion analysis model described above.
The emotion analysis method of text or the training method of emotion analysis model according to the present invention has been described in detail above with reference to the drawings.
Furthermore, the method according to the invention may also be implemented as a computer program or computer program product comprising computer program code instructions for carrying out the steps defined in the above method of the invention.
Alternatively, the invention may also be embodied as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having stored thereon executable code (or a computer program, or computer instruction code) which, when executed by a processor of an electronic device (or computing device, server, etc.), causes the processor to perform the steps of the above-described method according to the invention.
Those of skill in the art would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Embodiments of the present invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A method for emotion analysis of a text, wherein the method comprises:
acquiring word segmentation expression of a text to be analyzed;
acquiring the negative word and emotion word expression of the text to be analyzed;
and inputting the word segmentation expression and the negative word and emotion word expression of the text to be analyzed into a trained emotion analysis model, and acquiring the tendency emotion classification of the text to be analyzed predicted and output by the emotion analysis model.
2. The method of claim 1, wherein acquiring the word segmentation expression of the text to be analyzed comprises:
performing word segmentation on the text to be analyzed;
and mapping each participle in the text to be analyzed according to a preset dictionary base and a mapping dictionary corresponding to the dictionary base to obtain the word segmentation expression of the text to be analyzed.
3. The method of claim 1, wherein acquiring the negative word and emotion word expression of the text to be analyzed comprises:
performing word segmentation on the text to be analyzed;
acquiring negative words and emotion words from all the participles of the text to be analyzed according to a preset negative word lexicon and a preset emotion word lexicon;
and performing feature mapping on the negative words and the emotion words in the text to be analyzed according to a preset feature mapping strategy for negative words and a preset feature mapping strategy for emotion words, respectively, to obtain the negative word and emotion word expression of the text to be analyzed.
4. The method of claim 1, wherein the trained sentiment analysis model comprises:
a word segmentation processing layer based on a recurrent neural network, for processing the word segmentation expression;
a negative word emotion word processing layer based on a convolutional neural network, for processing the negative word and emotion word expression; and
a splicing processing layer.
5. The method of claim 4, wherein,
the word segmentation processing layer sequentially comprises: the system comprises a word embedding layer, a recurrent neural network layer and a first attention mechanism layer; the negative word emotion word processing layer sequentially comprises: the emotion embedding layer, the convolutional neural network layer and the second attention mechanism layer; the splicing treatment layer sequentially comprises: a splicing layer, a full connection layer and a normalization layer;
inputting the word segmentation expression and the negation and emotion word expression of the text to be analyzed into a trained emotion analysis model, and acquiring emotion tendency classification of the text to be analyzed predicted and output by the emotion analysis model, wherein the emotion tendency classification comprises the following steps:
inputting the word segmentation expression of the text to be analyzed into the word embedding layer to obtain the word segmentation embedding expression;
extracting, by the recurrent neural network layer, feature expressions containing context information of the participles based on the embedded expressions of the participles;
the first attention mechanism layer gives different weights to each participle based on the feature expression of the participle obtained by the recurrent neural network layer, and text feature expression is obtained through weighted summation;
inputting the expression of the negative words and the emotion words into the emotion embedding layer to obtain the embedded expression of the negative words and the emotion words;
extracting position relation characteristic expression of the negative words and the emotional words by the convolutional neural network layer based on the embedded expression of the negative words and the emotional words;
giving different weights to each negative word or emotional word by the second attention mechanism layer based on the position relation feature expression obtained by the convolutional neural network layer, and weighting and summing the negative and emotional feature expressions obtained;
splicing the text feature expression and the negation and emotion feature expression by the splicing layer to obtain a spliced global feature expression;
the full connection layer carries out the fitting capacity processing of the change enhancement feature on the global splicing feature expression through mapping to obtain the transformed feature expression;
and mapping, by the normalization layer, the transformed feature expression to a final classification and outputting the result of the final classification, wherein the result of the final classification comprises the emotional tendency classification and the corresponding probability.
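The attention, splicing, fully connected and normalization steps recited above can be illustrated with a minimal pure-Python sketch. It is not the claimed implementation: the outputs of the recurrent and convolutional layers are replaced by hand-written stand-in vectors, and the dot-product attention scoring, the tanh activation, the linear projection before the softmax, and all names (`attention_pool`, `dense`, `classify`, `params`) are assumptions made for illustration only.

```python
import math

def softmax(scores):
    """Normalize raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, query):
    """Weight each feature vector by softmax(dot(feature, query)) and sum.

    Dot-product scoring against a learned query vector is an assumption;
    the claim only requires per-item weights and a weighted summation.
    """
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, features)) for d in range(dim)]
    return pooled, weights

def dense(vector, weight_rows, bias):
    """Affine map: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weight_rows, bias)]

def classify(word_features, emotion_features, params):
    # First attention layer: pool per-word-segment features into a text feature.
    text_vec, _ = attention_pool(word_features, params["word_query"])
    # Second attention layer: pool negative/emotion word features.
    emo_vec, _ = attention_pool(emotion_features, params["emotion_query"])
    # Splicing layer: concatenate the two feature expressions.
    spliced = text_vec + emo_vec
    # Fully connected layer (tanh activation is an assumption).
    hidden = [math.tanh(x) for x in dense(spliced, params["fc_w"], params["fc_b"])]
    # Normalization layer: project to class scores, then softmax (assumed form).
    probs = softmax(dense(hidden, params["out_w"], params["out_b"]))
    best = max(range(len(probs)), key=probs.__getitem__)
    return params["labels"][best], probs

# Hypothetical toy parameters; a trained model would learn these values.
params = {
    "word_query": [1.0, 0.0],
    "emotion_query": [0.0, 1.0],
    "fc_w": [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]],
    "fc_b": [0.0, 0.0],
    "out_w": [[1.0, 1.0], [-1.0, -1.0]],
    "out_b": [0.0, 0.0],
    "labels": ["positive", "negative"],
}
word_features = [[0.2, 0.1], [0.9, 0.3]]     # stand-in for recurrent layer outputs
emotion_features = [[0.1, 0.8], [0.0, 0.4]]  # stand-in for convolutional layer outputs
label, probs = classify(word_features, emotion_features, params)
print(label, probs)
```

The returned pair corresponds to the emotional tendency classification and its probability; in a real model the stand-in feature lists would come from the embedding, recurrent and convolutional layers.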
6. A method for training an emotion analysis model, wherein the method comprises the following steps:
acquiring a training text set;
extracting a training sample set based on the training text set, wherein each training sample in the training sample set comprises a word segmentation expression, a negative word and emotion word expression, and a known emotion classification;
and training an emotion analysis model based on the training sample set.
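As a rough illustration of the sample extraction step in claim 6, the sketch below turns labeled texts into samples holding a word segmentation expression, a negative word and emotion word expression, and a known emotion classification. Everything specific here is an assumption: whitespace splitting stands in for a real word segmenter, the two lexicons are tiny hypothetical examples, and the integer-id encoding and the names `TrainingSample`, `build_vocab` and `extract_samples` are not taken from the patent.

```python
from dataclasses import dataclass

# Hypothetical lexicons; a real system would load curated dictionaries.
NEGATION_WORDS = {"not", "never", "no"}
EMOTION_WORDS = {"good", "bad", "happy", "terrible"}

@dataclass
class TrainingSample:
    segment_ids: list  # word segmentation expression (token ids)
    cue_ids: list      # negative word and emotion word expression (token ids, in text order)
    label: str         # known emotion classification

def build_vocab(tokenized_texts):
    """Assign an integer id to every distinct token, reserving 0 for unknowns."""
    vocab = {"<unk>": 0}
    for tokens in tokenized_texts:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab

def extract_samples(labeled_texts):
    """Build one TrainingSample per (text, label) pair."""
    # Whitespace splitting is a stand-in for a proper word segmenter.
    tokenized = [text.lower().split() for text, _ in labeled_texts]
    vocab = build_vocab(tokenized)
    samples = []
    for tokens, (_, label) in zip(tokenized, labeled_texts):
        segment_ids = [vocab[t] for t in tokens]
        # Keep only negation and emotion words, preserving their order in the text.
        cue_ids = [vocab[t] for t in tokens
                   if t in NEGATION_WORDS or t in EMOTION_WORDS]
        samples.append(TrainingSample(segment_ids, cue_ids, label))
    return samples, vocab

samples, vocab = extract_samples([
    ("The food is not good", "negative"),
    ("I am happy", "positive"),
])
for sample in samples:
    print(sample)
```

Keeping the cue words in their original order preserves the relative position of negation and emotion words, which is the information the convolutional layer is described as extracting.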
7. An emotion analysis apparatus for a text, wherein the apparatus comprises:
a word segmentation information acquisition module, configured to acquire a word segmentation expression of a text to be analyzed;
a negative word and emotion word information acquisition module, configured to acquire a negative word and emotion word expression of the text to be analyzed;
and a prediction module, configured to input the word segmentation expression and the negative word and emotion word expression of the text to be analyzed into a trained emotion analysis model, and to acquire the emotional tendency classification of the text to be analyzed predicted and output by the emotion analysis model.
8. An apparatus for training an emotion analysis model, wherein the apparatus comprises:
an acquisition module, configured to acquire a training text set;
an extraction module, configured to extract a training sample set based on the training text set, wherein each training sample in the training sample set comprises a word segmentation expression, a negative word and emotion word expression, and a known emotion classification;
and a training module, configured to train an emotion analysis model based on the training sample set.
9. A computing device, comprising:
a processor; and
a memory having executable code stored thereon which, when executed by the processor, causes the processor to perform the method of any one of claims 1 to 5 or the method of claim 6.
10. A non-transitory machine-readable storage medium having stored thereon executable code, which when executed by a processor of an electronic device, causes the processor to perform the method of any one of claims 1 to 5, or to perform the method of claim 6.
CN202111436442.7A 2019-05-28 2019-05-28 Text emotion analysis method and device, computing device and readable medium Pending CN114168732A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111436442.7A CN114168732A (en) 2019-05-28 2019-05-28 Text emotion analysis method and device, computing device and readable medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910451510.3A CN110232123B (en) 2019-05-28 2019-05-28 Text emotion analysis method and device, computing device and readable medium
CN202111436442.7A CN114168732A (en) 2019-05-28 2019-05-28 Text emotion analysis method and device, computing device and readable medium

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201910451510.3A Division CN110232123B (en) 2019-05-28 2019-05-28 Text emotion analysis method and device, computing device and readable medium

Publications (1)

Publication Number Publication Date
CN114168732A true CN114168732A (en) 2022-03-11

Family

ID=67858625

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201910451510.3A Active CN110232123B (en) 2019-05-28 2019-05-28 Text emotion analysis method and device, computing device and readable medium
CN202111436442.7A Pending CN114168732A (en) 2019-05-28 2019-05-28 Text emotion analysis method and device, computing device and readable medium

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201910451510.3A Active CN110232123B (en) 2019-05-28 2019-05-28 Text emotion analysis method and device, computing device and readable medium

Country Status (1)

Country Link
CN (2) CN110232123B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110929516A (en) * 2019-11-22 2020-03-27 新华网股份有限公司 Text emotion analysis method and device, electronic equipment and readable storage medium
CN111078879A (en) * 2019-12-09 2020-04-28 北京邮电大学 Method and device for detecting text sensitive information of satellite internet based on deep learning
CN111191438B (en) * 2019-12-30 2023-03-21 北京百分点科技集团股份有限公司 Emotion analysis method and device and electronic equipment
CN111444709B (en) * 2020-03-09 2022-08-12 腾讯科技(深圳)有限公司 Text classification method, device, storage medium and equipment
CN112115331B (en) * 2020-09-21 2021-05-04 朱彤 Capital market public opinion monitoring method based on distributed web crawler and NLP
CN113609390A (en) * 2021-08-06 2021-11-05 北京金堤征信服务有限公司 Information analysis method and device, electronic equipment and computer readable storage medium
CN114579740B (en) * 2022-01-20 2023-12-05 马上消费金融股份有限公司 Text classification method, device, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11354565B2 (en) * 2017-03-15 2022-06-07 Salesforce.Com, Inc. Probability-based guider
CN108984523A (en) * 2018-06-29 2018-12-11 重庆邮电大学 A kind of comment on commodity sentiment analysis method based on deep learning model
CN109299268A (en) * 2018-10-24 2019-02-01 河南理工大学 A kind of text emotion analysis method based on dual channel model

Also Published As

Publication number Publication date
CN110232123B (en) 2021-12-03
CN110232123A (en) 2019-09-13

Similar Documents

Publication Publication Date Title
CN110232123B (en) Text emotion analysis method and device, computing device and readable medium
CN111061843B (en) Knowledge-graph-guided false news detection method
CN106650813B (en) A kind of image understanding method based on depth residual error network and LSTM
CN112966074B (en) Emotion analysis method and device, electronic equipment and storage medium
CN110110323B (en) Text emotion classification method and device and computer readable storage medium
CN110619044B (en) Emotion analysis method, system, storage medium and equipment
CN111046670B (en) Entity and relationship combined extraction method based on drug case legal documents
CN112883714B (en) ABSC task syntactic constraint method based on dependency graph convolution and transfer learning
CN112232058A (en) False news identification method and system based on deep learning three-layer semantic extraction framework
CN109977199A (en) A kind of reading understanding method based on attention pond mechanism
CN112861522B (en) Aspect-level emotion analysis method, system and model based on dual-attention mechanism
CN113627151B (en) Cross-modal data matching method, device, equipment and medium
US11373043B2 (en) Technique for generating and utilizing virtual fingerprint representing text data
WO2023108985A1 (en) Method for recognizing proportion of green asset and related product
CN116150367A (en) Emotion analysis method and system based on aspects
Islam et al. A simple and mighty arrowhead detection technique of Bangla sign language characters with CNN
Yeasmin et al. Image classification for identifying social gathering types
CN113779227A (en) Case fact extraction method, system, device and medium
CN112597299A (en) Text entity classification method and device, terminal equipment and storage medium
CN117197569A (en) Image auditing method, image auditing model training method, device and equipment
US20230130662A1 (en) Method and apparatus for analyzing multimodal data
CN115080748B (en) Weak supervision text classification method and device based on learning with noise label
CN115510230A (en) Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism
CN115374943A (en) Data cognition calculation method and system based on domain confrontation migration network
CN115129863A (en) Intention recognition method, device, equipment, storage medium and computer program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination