CN110879938A - Text emotion classification method, device, equipment and storage medium - Google Patents

Text emotion classification method, device, equipment and storage medium Download PDF

Info

Publication number
CN110879938A
CN110879938A
Authority
CN
China
Prior art keywords
feature
feature representation
context
vector
text data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911110950.9A
Other languages
Chinese (zh)
Inventor
张少华
孟琳琳
周雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd filed Critical China United Network Communications Group Co Ltd
Priority to CN201911110950.9A priority Critical patent/CN110879938A/en
Publication of CN110879938A publication Critical patent/CN110879938A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a text emotion classification method, device, equipment and storage medium. The method comprises the following steps: acquiring word vectors in the text data to be processed, and extracting the feature vectors corresponding to the word vectors; extracting a context feature representation from the feature vectors by adopting a bidirectional long short-term memory (Bi-LSTM) network model; applying an Attention mechanism to the extracted context feature representation and then introducing top-k max pooling to fully extract the text feature representation; and sending the extracted features to a classifier to obtain higher accuracy. The method of the embodiment of the invention improves the accuracy of text emotion classification and has a good classification effect.

Description

Text emotion classification method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of computers, in particular to a text emotion classification method, device, equipment and storage medium.
Background
With the development of the internet and the growth in the number of internet users, network users generate a great amount of text on the internet, such as comments on a commodity, a movie or a shop, and extracting useful information from this text benefits merchants, consumers and others. Text emotion tendency analysis (i.e. text emotion classification), a branch of the natural language processing (NLP) field, has therefore become increasingly important. Traditional text emotion classification mainly relies on two kinds of methods; neither of them considers the context information of words or the word order of the text, both require a large amount of manual work to extract text features, and they may fail to extract the deeper important features in a text.
In recent years, with the development of deep learning, recurrent neural network (RNN) and convolutional neural network (CNN) models have been applied to this task. The CNN model mainly uses convolutional layers and downsampling layers to perform feature extraction, while the RNN model lets the state of the current node (or the preceding nodes) affect the state of the next node and uses the state of the last node as the feature. However, the text emotion classification effect of these models remains limited.
Disclosure of Invention
The invention provides a text emotion classification method, device, equipment and storage medium, which are used for improving the text emotion classification effect.
In a first aspect, the present invention provides a text emotion classification method, including:
acquiring word vectors in the text data to be processed, and extracting the feature vectors corresponding to the word vectors by using a convolution operation;
extracting a context feature representation from the feature vectors by adopting a bidirectional long short-term memory (Bi-LSTM) network model;
determining semantic codes corresponding to the context feature representations according to the extracted context feature representations;
performing maximum pooling on semantic codes corresponding to the context feature representations, and splicing the semantic codes subjected to maximum pooling to obtain spliced feature representations;
and classifying the spliced feature representation to acquire the emotion type corresponding to the text data.
In a second aspect, the present invention provides a text emotion classification apparatus, including:
the extraction module is used for acquiring word vectors in the text data to be processed and extracting the feature vectors corresponding to the word vectors;
the extraction module is further used for extracting a context feature representation from the feature vectors by adopting a bidirectional long short-term memory (Bi-LSTM) network model;
the determining module is used for determining semantic codes corresponding to the context feature representations according to the extracted context feature representations;
the processing module is used for performing maximum pooling processing on the semantic codes corresponding to the context feature representation and splicing the semantic codes subjected to maximum pooling processing to obtain spliced feature representation;
the processing module is further used for classifying the spliced feature representation to acquire the emotion categories corresponding to the text data.
In a third aspect, the present invention provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the computer program implements the method described in any one of the first aspect.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any of the first aspects via execution of the executable instructions.
According to the text emotion classification method, device, equipment and storage medium provided by the embodiments of the invention, word vectors in the text data to be processed are acquired and the feature vectors corresponding to the word vectors are extracted by a convolution operation; a context feature representation is extracted from the feature vectors by a bidirectional long short-term memory (Bi-LSTM) network model; according to the extracted context feature representation, the importance of different features is obtained with an Attention mechanism, and the context feature representation is then sent into a top-k max pooling layer to extract the k most important features, thereby determining the semantic codes corresponding to the context feature representation; and the semantic codes corresponding to the context feature representation are classified to obtain the emotion categories corresponding to the text data. The Bi-LSTM model can fully obtain the context features of the words in the text data, and the semantic coding can distinguish important features and filter out non-important features, so that important features receive higher weight, the accuracy of text emotion classification is improved, and the classification effect is better.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flowchart illustrating a text emotion classification method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating the principle of the text emotion classification method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the principle of pooling according to one embodiment of the method of the present invention;
FIG. 4 is a schematic diagram of Bi-LSTM model principle of an embodiment of the method provided by the present invention;
FIG. 5 is a schematic illustration of an attention mechanism according to an embodiment of the method of the present invention;
FIG. 6 is a schematic structural diagram of an embodiment of a text emotion classification apparatus provided in the present invention;
fig. 7 is a schematic structural diagram of an embodiment of an electronic device provided in the present invention.
With the foregoing drawings in mind, certain embodiments of the disclosure have been shown and described in more detail below. These drawings and written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terms "comprising" and "having," and any variations thereof, in the description and claims of this invention and the drawings described herein are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
First, the application scenario of the invention is introduced:
the text emotion classification method provided by the embodiment of the invention is applied to a scene for carrying out emotion classification on text data so as to improve classification accuracy.
Text emotion classification judges whether a piece of text data expresses a positive or a negative emotion, for example for comments on the network such as purchase reviews, movie reviews and microblog comments.
The method provided by the invention can be implemented by electronic equipment, such as a processor, executing corresponding software code, or by the electronic equipment exchanging data with a server while executing the corresponding software code, for example with the server performing part of the operations and controlling the electronic equipment to execute the method.
The following embodiments are all described with electronic devices as the executing bodies.
The technical solution of the present invention will be described in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
FIG. 1 is a flowchart illustrating a text emotion classification method according to an embodiment of the present invention. As shown in fig. 1, the method provided by this embodiment includes:
Step 101, acquiring word vectors in the text data to be processed, and extracting the feature vectors corresponding to the word vectors.
Specifically, the word vectors may be trained before being obtained. For example, the word vectors may be trained on the 30 GB Sogou news corpus: the corpus is segmented with the Jieba segmenter under Python, and the word vectors are then trained with the CBOW model of word2vec. The parameters may be set as follows: the context window length is set to 5; the learning rate alpha uses the default of 0.025; the lowest frequency min-count uses the default of 5, i.e., a word occurring fewer than 5 times in a document is discarded; and the word vector dimension is set to 100.
The text data to be processed is segmented to obtain a plurality of words, and the words in the text data are converted into word vectors according to the trained word vector model. The text data includes, for example, a plurality of sentences, each sentence corresponding to a plurality of word vectors. This is done, for example, by the word embedding layer shown in fig. 2.
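For illustration, the word-vector preparation described above can be sketched in Python as follows. This is a minimal example under stated assumptions, not the patented implementation: it assumes the jieba and gensim packages, and the corpus file name is a placeholder.

    import jieba
    from gensim.models import Word2Vec

    # Segment the raw corpus into word lists, one sentence per line.
    with open("sogou_news_corpus.txt", encoding="utf-8") as f:
        sentences = [list(jieba.cut(line.strip())) for line in f if line.strip()]

    # CBOW model (sg=0) with the parameters given above: context window 5,
    # learning rate 0.025, min-count 5, 100-dimensional vectors.
    model = Word2Vec(sentences, sg=0, window=5, alpha=0.025, min_count=5,
                     vector_size=100)  # use size=100 on gensim versions before 4.0

    # Convert a piece of text into its word-vector sequence.
    def text_to_vectors(text):
        return [model.wv[w] for w in jieba.cut(text) if w in model.wv]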
Further, as shown in fig. 2, after the text data is converted into the corresponding word vectors, preliminary feature vectors are extracted through a single one-dimensional convolutional layer.
Step 102, extracting a context feature representation from the feature vectors by adopting a bidirectional long short-term memory (Bi-LSTM) network model.
Specifically, in the method of the embodiment of the invention, the feature vectors obtained by the convolution operation, i.e. the output of the convolutional layer, are sent to the Bi-LSTM model (as shown in fig. 2), which can fully extract the text features.
The bidirectional long short-term memory network has great advantages in processing sequence data (and text data is sequential), so in the embodiment of the invention the feature vectors are fed to a Bi-LSTM model after the first convolution operation. Compared with the traditional recurrent neural network (RNN) model, the long short-term memory (LSTM) network does not suffer from vanishing or exploding gradients and works well in natural language processing. To let the LSTM fuse the vocabulary information of the current moment with all of its context information, a Bi-LSTM model that reads the text in both directions is used in the embodiment of the present invention.
A context feature representation is then extracted from the convolved feature vectors through the Bi-LSTM model.
Step 103, determining the semantic codes corresponding to the context feature representation according to the extracted context feature representation.
Specifically, as shown in fig. 2, the semantic codes corresponding to the context feature representation are determined by an Attention mechanism: by calculating the attention distribution probability, the influence of important features on the text emotion classification can be highlighted, i.e. different keywords in a sentence affect the classification result to different degrees.
For example, when reading a review of a certain product it is impossible to remember the whole description; only some keywords such as "good", "not good" and "like" are remembered. Such words matter for expressing the emotional tendency of the text, so different features in the text data affect the classification result differently.
The semantic codes are calculated from the output result of the Bi-LSTM layer (namely, the context feature representation) and the probability weights.
Step 104, performing maximum pooling on the semantic codes corresponding to the context feature representation, and splicing the semantic codes after maximum pooling to obtain the spliced feature representation.
To further reduce the data dimensionality, a maximum pooling operation, k-max pooling, can be performed after the semantic codes are generated: a fixed sliding window selects the first k maximum values in the generated semantic coding result, extracting the k most important features and filtering out the non-important ones. This reduces the data dimensionality and thereby improves the convergence speed and prediction accuracy of the model.
Step 105, classifying the spliced feature representation to acquire the emotion category corresponding to the text data.
Specifically, the text data is classified according to the result of the top-k max pooling of the semantic codes, and the emotion category corresponding to the text data is obtained. The classification may be performed by a preset classification function, and different classifiers can be used.
For example, if comments on a certain product are classified into two categories, a comment can be classified as a good review or a bad review; good reviews may include words such as "good", while bad reviews may include statements describing a bad user experience or advising against buying.
The method of this embodiment acquires the word vectors in the text data to be processed and extracts the feature vectors corresponding to the word vectors; extracts a context feature representation from the feature vectors with a bidirectional long short-term memory (Bi-LSTM) network model; determines the semantic codes corresponding to the context feature representation according to the extracted context feature representation; and classifies the semantic codes corresponding to the context feature representation to obtain the emotion categories corresponding to the text data. The Bi-LSTM model can fully obtain the context features of the words in the text data, and the semantic coding can distinguish important features and filter out non-important ones, so that important features receive higher weight, the accuracy of text emotion classification is improved, and the classification effect is better.
On the basis of the foregoing embodiment, optionally, step 104 may be specifically implemented by:
selecting the first k largest semantic codes in the sliding window by using a preset sliding window to obtain the first k largest semantic codes corresponding to a plurality of sliding windows;
and splicing the first k maximum semantic codes corresponding to the plurality of sliding windows to obtain the spliced feature representation.
Specifically, as shown in fig. 2, after semantic coding, k-max pooling is performed, and as shown in fig. 3, a top-k calculation formula is:
top-k = max_k{c_1, c_2, c_3, …, c_p}
where k is the number of largest values to take, c_1, c_2, c_3, …, c_p are the semantic code values, and p is the size of the sliding window.
The values selected from successive sliding windows are then concatenated, where ⊕ denotes vector concatenation. The step size of the sliding window may be k - 1.
That is, the first k maximum semantic code values are found among p semantic code values at a time; the sliding window then moves to the right by k - 1 semantic code values, and the next group of p semantic code values is processed.
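A minimal NumPy sketch of this sliding-window top-k max pooling, for illustration only (the input values are stand-ins; window size p and step k - 1 follow the description above):

    import numpy as np

    def top_k_max_pooling(codes, p, k):
        # codes: 1-D array of semantic code values.
        # Take the k largest values from each window of size p; the window
        # moves right by k-1 values each time, and the results are spliced.
        step = max(k - 1, 1)
        pooled = []
        for start in range(0, len(codes) - p + 1, step):
            window = codes[start:start + p]
            pooled.append(np.sort(window)[-k:][::-1])  # k largest, descending
        return np.concatenate(pooled)  # spliced feature representation

    codes = np.array([0.2, 0.9, 0.1, 0.7, 0.4, 0.8, 0.3, 0.6])
    print(top_k_max_pooling(codes, p=4, k=3))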
And finally, classifying the spliced feature representation by using a preset classification function to obtain the emotion type corresponding to the text data.
On the basis of the foregoing embodiment, optionally, the extracting of the feature vector corresponding to the word vector in step 101 may specifically be implemented in the following manner:
inputting the word vectors into the convolutional layer to obtain the following feature matrix:

F = [ c_11 c_12 … c_1n
      c_21 c_22 … c_2n
      …
      c_m1 c_m2 … c_mn ]    (1)

each row of the F matrix represents the feature vector generated by convolving the word vectors of the text data with a convolution window of one size;
where c_ij = ReLU(s_j · f + θ), ReLU is the activation function, f ∈ R^(k×D) represents a convolution filter of length k (i.e. convolution window size k) applied to D-dimensional word vectors, θ represents the offset, and s_j = [w_j, w_{j+1}, …, w_{j+k-1}] is the word vector matrix composed of the k successive words starting from the j-th word of the text data, where w_j ∈ R^D is the word vector of the j-th word, of dimension D. The value range of i is 1 to m, where m is the number of convolution window types; the value range of j is 1 to n, where n is the number of words after segmenting the text data.
For example, m is 3: the convolution window size k of type 1 is 2, that of type 2 is 3, and that of type 3 is 4.
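As an illustration of formula (1), one row of F can be computed as follows (a NumPy sketch with random stand-in values; the dimensions are assumptions for the example only):

    import numpy as np

    D, k, n = 100, 3, 45             # word-vector dim, window size, word count
    words = np.random.randn(n, D)    # word vectors w_1 ... w_n (stand-ins)
    f = np.random.randn(k, D)        # filter of length k over D-dim vectors
    theta = 0.0                      # offset

    row = []
    for j in range(n - k + 1):
        s_j = words[j:j + k]                             # k consecutive words
        c_ij = max(0.0, float(np.sum(s_j * f)) + theta)  # ReLU(s_j * f + theta)
        row.append(c_ij)
    # Repeating this for each of the m window sizes (e.g. k = 2, 3, 4)
    # produces the m rows of the feature matrix F.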
Further, as shown in fig. 4, step 102 may be specifically implemented by:
determining the preceding-context feature representation corresponding to the feature vectors according to the following formula (2);
determining the following-context feature representation corresponding to the feature vectors according to the following formula (3);
obtaining the context feature representation corresponding to the feature vectors with the following formula (4), according to the preceding-context feature representation and the following-context feature representation;
h→_t = LSTM(w_1t, h→_{t-1})    (2)
h←_t = LSTM(w_1t, h←_{t+1})    (3)
out_t = h→_t ⊕ h←_t    (4)
where ⊕ denotes vector concatenation and h_t is the hidden state corresponding to the t-th word in the text data:
h_t = o_t ⊙ tanh(c_t), o_t = δ(W_o·X + b_o), c_t = f_t ⊙ c_{t-1} + i_t ⊙ tanh(W_c·X + b_c), f_t = δ(W_f·X + b_f), i_t = δ(W_i·X + b_i),
where X denotes the gate input at step t, i.e. the previous hidden state h_{t-1} together with the current input w_1t;
where W_f, W_i, W_o, W_c are the weight matrices of the LSTM, b_f, b_i, b_o, b_c are the offsets of the LSTM, w_1t is the column vector of the t-th column of the F matrix, δ(·) is an activation function, ⊙ is the element-wise product, and n is the number of word vectors.
In particular, δ (·) may be an activation function sigmoid.
The Bi-LSTM layer can fuse the current vocabulary information and all the context information thereof together to obtain the characteristic representation of the context.
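For illustration, one LSTM step following the gate equations above can be sketched as follows (NumPy, with random stand-in weights; taking the gate input X as the previous hidden state concatenated with the current input is a standard assumption, not a detail fixed by the text):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    H, Din = 150, 100                       # hidden size, input size (assumed)
    rng = np.random.default_rng(0)
    Wf, Wi, Wo, Wc = (rng.standard_normal((H, H + Din)) * 0.1 for _ in range(4))
    bf = bi = bo = bc = np.zeros(H)

    def lstm_step(w_1t, h_prev, c_prev):
        X = np.concatenate([h_prev, w_1t])
        f_t = sigmoid(Wf @ X + bf)          # forget gate
        i_t = sigmoid(Wi @ X + bi)          # input gate
        o_t = sigmoid(Wo @ X + bo)          # output gate
        c_t = f_t * c_prev + i_t * np.tanh(Wc @ X + bc)
        h_t = o_t * np.tanh(c_t)            # hidden state h_t = o_t ⊙ tanh(c_t)
        return h_t, c_t

    # Running such a step over the sequence forwards and backwards and
    # concatenating the two hidden states gives out_t of formula (4).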
Further, as shown in fig. 5, step 103 may be specifically implemented as follows:
determining the semantic codes corresponding to the context feature representation with the following formula (5), according to the context feature representation extracted by the Bi-LSTM model and the probability weights;
code = Σ_{t=1}^{n} a_lt · out_t    (5)
where out_t is the context feature representation and a_lt represents the degree of importance of the t-th feature representation, i.e. the probability weight of the t-th feature representation; a_lt is calculated by the following formula (6):
a_lt = exp(r_t) / Σ_{l=1}^{n} exp(r_l)    (6)
where r_t = v^T tanh(W_A·out_t + b), W_A is a parameter matrix, b is a bias term, and v^T is the transpose of the randomly initialised matrix v. The value range of l is 1 to n, where n is the number of word vectors.
The semantic codes corresponding to the context feature representation are thus calculated from the output result of the Bi-LSTM layer and the probability weights, and the Attention mechanism highlights the influence of the important features.
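A minimal NumPy sketch of this attention step, for illustration (random stand-in matrices; the shapes are assumptions for the example):

    import numpy as np

    n, H = 45, 300                           # sequence length, Bi-LSTM output dim
    rng = np.random.default_rng(0)
    outs = rng.standard_normal((n, H))       # out_1 ... out_n from the Bi-LSTM
    W_A = rng.standard_normal((H, H)) * 0.1  # parameter matrix
    b = np.zeros(H)                          # bias term
    v = rng.standard_normal(H)               # randomly initialised vector v

    r = np.array([v @ np.tanh(W_A @ o + b) for o in outs])  # scores r_t
    a = np.exp(r) / np.exp(r).sum()          # probability weights, formula (6)
    code = (a[:, None] * outs).sum(axis=0)   # semantic code, formula (5)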
The method of the embodiment of the invention fuses the Bi-LSTM model as a single layer between the convolutional layer and the pooling layer. First, the convolutional layer performs a preliminary feature extraction of the text; to fully obtain the context features of the words in the text data, the feature vectors are then fed into the Bi-LSTM model. To distinguish important features and filter non-important ones, an attention mechanism and top-k max pooling are applied to the output of the Bi-LSTM model: the attention mechanism gives important features higher weights, and the step size of the top-k max pooling sliding window may be k - 1, which simulates the N-gram operation of natural language processing.
To sum up, in the method of the embodiment of the invention, after the convolutional layer computes the feature vectors, the results are placed one by one into the feature matrix F in computation order; to prevent a pooling layer from disturbing the order of the original sentence, the Bi-LSTM is introduced directly after the convolutional layer; the Bi-LSTM output is passed to an Attention mechanism so that important features receive higher weights; top-k max pooling is introduced to reduce the feature dimensionality and improve the classification accuracy; and finally the extracted feature representation is sent to a strong classifier for classification to improve the accuracy of text emotion classification and obtain a better classification effect.
Based on the method of the above embodiments, the CBLTK model of the embodiment of the invention is built by fusing the CNN and Bi-LSTM models, using the ability of deep learning to extract deeper features of the text. To improve the accuracy of the final classification, several commonly used strong classifiers, such as the support vector machine (SVM) and the random forest (RF), are compared below, and the extracted features are sent to these strong classifiers for classification:
the text data used below is, for example, broad shadow evaluation data, which has been marked with the number of stars of the user at the time of review (five stars represent well and one star represents poorly), from which five stars and one star data are extracted for the study of text emotion classification (i.e., the text data is subjected to two classifications), thirty thousand text data for each category of positive and negative types is used as training data, twenty thousand text data for each category of positive and negative types is used as test data, and on statistical average each review includes 45 words, the model of the embodiment of the present invention is specified as 45 words using the sentence length of the text data, is directly truncated if there are sentences exceeding 45, if the sentence length is less than 45, null is used for filling, the word segmentation tool uses the jieba word segmentation of Python and Tensorflow1.6 to construct the CBLTK model of the embodiment of the invention. The word vector training corpus may be a dog news corpus 30G.
Given the sequential character of the words in text data, the embodiment of the invention uses only one convolution layer as the first layer of the CBLTK model, followed (after the attention mechanism) by a max pooling layer and a classification layer, such as softmax. During model training, dropout can be set to 50% and L2 regularisation is used. The model of the embodiment of the invention can set the mini-batch size to 100 and uses three types of convolution windows of different sizes, with 150 convolution kernels of each type; the best of the three window sizes is then selected.
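For illustration, the overall pipeline can be sketched in Keras as follows. This is a simplified stand-in under stated assumptions, not the patented CBLTK implementation: it is written against TensorFlow 2.x Keras rather than 1.6, uses a single convolution window (the text uses three sizes in parallel), a simplified attention scorer, and a global top-k pooling instead of the sliding-window variant.

    import tensorflow as tf
    from tensorflow.keras import layers

    SEQ_LEN, EMB_DIM, VOCAB, K = 45, 100, 50000, 3

    inp = layers.Input(shape=(SEQ_LEN,))
    x = layers.Embedding(VOCAB, EMB_DIM)(inp)  # load pretrained word2vec weights here
    x = layers.Conv1D(150, 3, activation="relu", padding="same")(x)
    x = layers.Bidirectional(layers.LSTM(150, return_sequences=True))(x)

    # Attention: score each timestep, softmax over time, reweight the outputs.
    scores = layers.Dense(1, activation="tanh")(x)
    weights = layers.Softmax(axis=1)(scores)
    x = layers.Multiply()([x, weights])

    # Top-k max pooling over time (k largest values per feature channel).
    x = layers.Lambda(lambda t: tf.math.top_k(
        tf.transpose(t, [0, 2, 1]), k=K).values)(x)
    x = layers.Flatten()(x)
    x = layers.Dropout(0.5)(x)                      # dropout set to 50%
    out = layers.Dense(2, activation="softmax")(x)  # two emotion categories

    model = tf.keras.Model(inp, out)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_x, train_y, batch_size=100, ...)  # mini-batch of 100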
TABLE 1 selection of convolution window size
(table body rendered as an image in the original publication)
Based on the results in Table 1, convolution windows of lengths 2, 3 and 4 are selected for the convolution operation, and 150 hidden-layer neurons can be chosen for the LSTM layers within the Bi-LSTM layer. For the k-max pooling layer, k values from 1 to 6 were tried to find the most suitable value; from the results in the table below, k can be chosen as 3.
TABLE 2 selection of k values
Top-k value    Accuracy
1 71.2%
2 74.5%
3 77.6%
4 75.3%
5 76.8%
6 75.6%
The CBLTK model of the embodiment of the invention mainly takes the final classification accuracy as its evaluation index. Several groups of comparison tests were run, classified with softmax and with strong classifiers respectively, and the classification result of a traditional text emotion classification method is added as a reference. The specific results are as follows:
TABLE 3 use of softmax classifier
Model    Accuracy
CNN 80.3%
LSTM 81.1%
CNN+LSTM 82.8%
LSTM+CNN 83.3%
CBLTK 85.2%
TABLE 4 Use of a strong classifier (SVM)
(table body rendered as an image in the original publication)
As can be seen from Tables 3 and 4, the traditional text emotion classification method that does not use deep learning (for example, the term frequency-inverse document frequency (Tf-idf) algorithm) achieves lower accuracy because it can only extract shallow text features, and using CNN first and then LSTM works better than using LSTM first and then CNN. Table 4 also shows that although the CBLTK model proposed in the embodiment of the invention gains from using an SVM, the improvement is not large, so the model is next combined with four commonly used strong classifiers: SVM, RF, naive Bayes and GBDT, to find a combination with a better classification effect.
TABLE 5 combination of different classifiers
Combination    Accuracy
CBLTK+SVM 86.1%
CBLTK+RF 87.6%
CBLTK+GBDT 89.3%
CBLTK + naive Bayes 85.3%
Besides the currently selected data, the results in Table 5 are related to other factors, such as the amount of data and whether the features extracted by the current model suit the current classifier. The results in Table 5 show that using a strong classifier is not necessarily always effective; for example, the naive Bayes combination does not perform as well as expected, which may be related to the characteristics of the data used and of naive Bayes itself: naive Bayes assumes that the features are mutually independent, whereas the words and phrases in text classification are strongly correlated. The results also suggest that GBDT outperforms the random forest (RF) probably because RF uses the bagging idea of ensemble learning, while GBDT follows the boosting idea and samples according to the error rate, so that examples a weak classifier misclassifies receive relatively higher weight in later training rounds. The training process of GBDT is therefore similar to a deep learning model with an integrated Attention mechanism, in that weight values are used to highlight important features.
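A scikit-learn sketch of this classifier comparison, for illustration only (here `feature_model` stands for the network truncated at the spliced feature representation, and the data arrays are assumed to exist; none of these names come from the patent):

    from sklearn.svm import SVC
    from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
    from sklearn.naive_bayes import GaussianNB

    train_feats = feature_model.predict(train_x)  # spliced feature representations
    test_feats = feature_model.predict(test_x)

    for name, clf in [("SVM", SVC()),
                      ("RF", RandomForestClassifier()),
                      ("GBDT", GradientBoostingClassifier()),
                      ("Naive Bayes", GaussianNB())]:
        clf.fit(train_feats, train_labels)
        print(name, clf.score(test_feats, test_labels))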
Fig. 6 is a structural diagram of an embodiment of a text emotion classification device provided in the present invention, and as shown in fig. 6, the text emotion classification device of the present embodiment includes:
the extraction module 601 is configured to obtain a word vector in text data to be processed, and extract a feature vector corresponding to the word vector;
the extraction module 601 is further configured to extract a context feature representation from the feature vectors by using a bidirectional long short-term memory (Bi-LSTM) network model;
a determining module 602, configured to determine, according to the extracted context feature representation, a semantic code corresponding to the context feature representation;
a processing module 603, configured to perform maximum pooling on the semantic codes corresponding to the context feature representation and to splice the semantic codes after maximum pooling to obtain the spliced feature representation;
the processing module 603 is further configured to classify the spliced feature representation to acquire the emotion category corresponding to the text data.
In a possible implementation manner, the extracting module 601 is specifically configured to:
inputting the word vectors into the convolutional layer to obtain the following feature matrix:

F = [ c_11 c_12 … c_1n
      c_21 c_22 … c_2n
      …
      c_m1 c_m2 … c_mn ]    (1)

each row of the F matrix represents the feature vector generated by convolving the word vectors of the text data with a convolution window of one size;
where c_ij = ReLU(s_j · f + θ), ReLU is the activation function, f ∈ R^(k×D) represents a convolution filter of length k applied to D-dimensional word vectors, θ represents the offset, and s_j = [w_j, w_{j+1}, …, w_{j+k-1}] is the word vector matrix composed of the k successive words starting from the j-th word of the text data, where w_j ∈ R^D is the word vector of the j-th word, of dimension D. The value range of i is 1 to m, where m is the number of convolution window types; the value range of j is 1 to n, where n is the number of words after segmenting the text data.
In a possible implementation manner, the extracting module 601 is specifically configured to:
determining the preceding-context feature representation corresponding to the feature vectors according to the following formula (2);
determining the following-context feature representation corresponding to the feature vectors according to the following formula (3);
obtaining the context feature representation corresponding to the feature vectors with the following formula (4), according to the preceding-context feature representation and the following-context feature representation;
h→_t = LSTM(w_1t, h→_{t-1})    (2)
h←_t = LSTM(w_1t, h←_{t+1})    (3)
out_t = h→_t ⊕ h←_t    (4)
where ⊕ denotes vector concatenation and h_t is the hidden state corresponding to the t-th word in the text data:
h_t = o_t ⊙ tanh(c_t), o_t = δ(W_o·X + b_o), c_t = f_t ⊙ c_{t-1} + i_t ⊙ tanh(W_c·X + b_c), f_t = δ(W_f·X + b_f), i_t = δ(W_i·X + b_i),
where X denotes the gate input at step t, i.e. the previous hidden state h_{t-1} together with the current input w_1t;
where W_f, W_i, W_o, W_c are the weight matrices of the LSTM, b_f, b_i, b_o, b_c are the offsets of the LSTM, w_1t is the column vector of the t-th column of the F matrix, δ(·) is an activation function, ⊙ is the element-wise product, n is the number of word vectors, and the value range of t is 1 to n.
In a possible implementation manner, the determining module 602 is specifically configured to:
determining the semantic codes corresponding to the context feature representation with the following formula (5), according to the context feature representation extracted by the Bi-LSTM model and the probability weights;
code = Σ_{t=1}^{n} a_lt · out_t    (5)
where out_t is the context feature representation and a_lt represents the degree of importance of the t-th feature representation, i.e. the probability weight of the t-th feature representation; a_lt is calculated by the following formula (6):
a_lt = exp(r_t) / Σ_{l=1}^{n} exp(r_l)    (6)
where r_t = v^T tanh(W_A·out_t + b), W_A is a parameter matrix, b is a bias term, and v^T is the transpose of the randomly initialised matrix v.
In a possible implementation manner, the processing module 603 is specifically configured to:
selecting the first k largest semantic codes in the sliding window by using a preset sliding window to obtain the first k largest semantic codes corresponding to a plurality of sliding windows;
and splicing the first k maximum semantic codes corresponding to the plurality of sliding windows to obtain the spliced feature representation.
In a possible implementation manner, the processing module 603 is specifically configured to:
and classifying the spliced feature representation by using a preset classification function to obtain the emotion classification corresponding to the text data.
The apparatus of this embodiment may be configured to implement the technical solutions of the above method embodiments, and the implementation principles and technical effects are similar, which are not described herein again.
Fig. 7 is a structural diagram of an embodiment of an electronic device provided in the present invention, and as shown in fig. 7, the electronic device includes:
a processor 701, and a memory 702 for storing executable instructions for the processor 701.
Optionally, the electronic device may further include: a communication interface 703 for communicating with other devices.
The above components may communicate over one or more buses.
The processor 701 is configured to execute the corresponding method in the foregoing method embodiments by executing the executable instructions; for the specific implementation process, reference may be made to the foregoing method embodiments, which are not repeated here.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method in the foregoing method embodiment is implemented.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A text emotion classification method is characterized by comprising the following steps:
acquiring word vectors in text data to be processed, and extracting the feature vectors corresponding to the word vectors by using a convolution operation;
extracting a context feature representation from the feature vectors by adopting a bidirectional long short-term memory (Bi-LSTM) network model;
determining semantic codes corresponding to the context feature representations according to the extracted context feature representations;
performing maximum pooling on semantic codes corresponding to the context feature representations, and splicing the semantic codes subjected to maximum pooling to obtain spliced feature representations;
and classifying the spliced feature representations to acquire emotion categories corresponding to the text data.
2. The method according to claim 1, wherein the extracting the feature vector corresponding to the word vector comprises:
inputting the word vectors into the convolutional layer to obtain the following feature matrix:

F = [ c_11 c_12 … c_1n
      c_21 c_22 … c_2n
      …
      c_m1 c_m2 … c_mn ]    (1)

wherein each row of the F matrix represents the feature vector generated by convolving the word vectors of the text data with a convolution window of one size;
wherein c_ij = ReLU(s_j · f + θ), ReLU is the activation function, f ∈ R^(k×D) represents a convolution filter of length k applied to D-dimensional word vectors, θ represents the offset, and s_j = [w_j, w_{j+1}, …, w_{j+k-1}] is the word vector matrix composed of the k successive words starting from the j-th word of the text data, wherein w_j ∈ R^D is the word vector of the j-th word, of dimension D; the value range of i is 1 to m, m being the number of convolution window types, and the value range of j is 1 to n, n being the number of words after segmenting the text data.
3. The method of claim 2, wherein the extracting a context feature representation from the feature vectors by using a bidirectional long short-term memory (Bi-LSTM) network model comprises:
determining the preceding-context feature representation corresponding to the feature vectors according to the following formula (2);
determining the following-context feature representation corresponding to the feature vectors according to the following formula (3);
obtaining the context feature representation corresponding to the feature vectors with the following formula (4), according to the preceding-context feature representation and the following-context feature representation;
h→_t = LSTM(w_1t, h→_{t-1})    (2)
h←_t = LSTM(w_1t, h←_{t+1})    (3)
out_t = h→_t ⊕ h←_t    (4)
wherein ⊕ denotes vector concatenation and h_t is the hidden state corresponding to the t-th word in the text data:
h_t = o_t ⊙ tanh(c_t), o_t = δ(W_o·X + b_o), c_t = f_t ⊙ c_{t-1} + i_t ⊙ tanh(W_c·X + b_c), f_t = δ(W_f·X + b_f), i_t = δ(W_i·X + b_i),
wherein X denotes the gate input at step t, i.e. the previous hidden state h_{t-1} together with the current input w_1t;
wherein W_f, W_i, W_o, W_c are the weight matrices of the LSTM, b_f, b_i, b_o, b_c are the offsets of the LSTM, w_1t is the column vector of the t-th column of the F matrix, δ(·) is an activation function, ⊙ is the element-wise product, n is the number of word vectors, and the value range of t is 1 to n.
4. The method according to claim 3, wherein the determining semantic coding corresponding to the context feature representation according to the extracted context feature representation comprises:
determining the semantic codes corresponding to the context feature representation with the following formula (5), according to the context feature representation extracted by the Bi-LSTM model and the probability weights;
code = Σ_{t=1}^{n} a_lt · out_t    (5)
wherein out_t is the context feature representation, a_lt represents the degree of importance of the t-th feature representation, i.e. the probability weight of the t-th feature representation, and a_lt is calculated by the following formula (6):
a_lt = exp(r_t) / Σ_{l=1}^{n} exp(r_l)    (6)
wherein r_t = v^T tanh(W_A·out_t + b), W_A is a parameter matrix, b is a bias term, and v^T is the transpose of the randomly initialised matrix v.
5. The method according to claim 1, wherein performing maximal pooling on semantic codes corresponding to the context feature representations and splicing the multiple semantic codes after the maximal pooling to obtain a spliced feature representation comprises:
selecting the first k largest semantic codes in the sliding window by using a preset sliding window to obtain the first k largest semantic codes corresponding to a plurality of sliding windows;
and splicing the first k maximum semantic codes corresponding to the plurality of sliding windows to obtain the spliced feature representation.
6. The method according to any one of claims 1 to 5, wherein the classifying the spliced feature representation comprises:
and classifying the spliced feature representations by using different preset classification models to obtain emotion categories corresponding to the text data.
7. A text emotion classification device, comprising:
the extraction module is used for acquiring word vectors in the text data to be processed and extracting the feature vectors corresponding to the word vectors;
the extraction module is further used for extracting a context feature representation from the feature vectors by adopting a bidirectional long short-term memory (Bi-LSTM) network model;
the determining module is used for determining, by utilizing an Attention mechanism, the semantic codes corresponding to the context feature representation according to the extracted context feature representation;
the processing module is used for performing maximum pooling processing on the semantic codes corresponding to the context feature representation and splicing the semantic codes subjected to maximum pooling processing to obtain spliced feature representation;
and classifying the spliced feature representation to acquire the emotion type corresponding to the text data.
8. The apparatus of claim 7, wherein the processing module is specifically configured to:
selecting the first k largest semantic codes in the sliding window by using a preset sliding window to obtain the first k largest semantic codes corresponding to a plurality of sliding windows;
and splicing the first k maximum semantic codes corresponding to the plurality of sliding windows to obtain the spliced feature representation.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of any one of claims 1-6.
10. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any of claims 1-6 via execution of the executable instructions.
CN201911110950.9A 2019-11-14 2019-11-14 Text emotion classification method, device, equipment and storage medium Pending CN110879938A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911110950.9A CN110879938A (en) 2019-11-14 2019-11-14 Text emotion classification method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911110950.9A CN110879938A (en) 2019-11-14 2019-11-14 Text emotion classification method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110879938A true CN110879938A (en) 2020-03-13

Family

ID=69730444

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911110950.9A Pending CN110879938A (en) 2019-11-14 2019-11-14 Text emotion classification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110879938A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111538835A (en) * 2020-03-30 2020-08-14 东南大学 Social media emotion classification method and device based on knowledge graph
CN111737467A (en) * 2020-06-22 2020-10-02 华南师范大学 Object-level emotion classification method based on segmented convolutional neural network
CN111930938A (en) * 2020-07-06 2020-11-13 武汉卓尔数字传媒科技有限公司 Text classification method and device, electronic equipment and storage medium
CN112069307A (en) * 2020-08-25 2020-12-11 中国人民大学 Legal law citation information extraction system
CN113361252A (en) * 2021-05-27 2021-09-07 山东师范大学 Text depression tendency detection system based on multi-modal features and emotion dictionary
CN113393276A (en) * 2021-06-25 2021-09-14 食亨(上海)科技服务有限公司 Comment data classification method and device and computer readable medium
CN113469365A (en) * 2021-06-30 2021-10-01 上海寒武纪信息科技有限公司 Inference and compilation method based on neural network model and related products thereof
CN114168730A (en) * 2021-11-26 2022-03-11 一拓通信集团股份有限公司 Consumption tendency analysis method based on BilSTM and SVM
CN114298019A (en) * 2021-12-29 2022-04-08 中国建设银行股份有限公司 Emotion recognition method, emotion recognition apparatus, emotion recognition device, storage medium, and program product

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107092596A (en) * 2017-04-24 2017-08-25 重庆邮电大学 Text emotion analysis method based on attention CNNs and CCR
CN108363753A (en) * 2018-01-30 2018-08-03 南京邮电大学 Comment text sentiment classification model is trained and sensibility classification method, device and equipment
WO2019079922A1 (en) * 2017-10-23 2019-05-02 腾讯科技(深圳)有限公司 Session information processing method and device, and storage medium
CN109710761A (en) * 2018-12-21 2019-05-03 中国标准化研究院 The sentiment analysis method of two-way LSTM model based on attention enhancing
WO2019085328A1 (en) * 2017-11-02 2019-05-09 平安科技(深圳)有限公司 Enterprise relationship extraction method and device, and storage medium
CN109740148A (en) * 2018-12-16 2019-05-10 北京工业大学 A kind of text emotion analysis method of BiLSTM combination Attention mechanism
US20190188260A1 (en) * 2017-12-14 2019-06-20 Qualtrics, Llc Capturing rich response relationships with small-data neural networks
CN110162636A (en) * 2019-05-30 2019-08-23 中森云链(成都)科技有限责任公司 Text mood reason recognition methods based on D-LSTM

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107092596A (en) * 2017-04-24 2017-08-25 重庆邮电大学 Text emotion analysis method based on attention CNNs and CCR
WO2019079922A1 (en) * 2017-10-23 2019-05-02 腾讯科技(深圳)有限公司 Session information processing method and device, and storage medium
WO2019085328A1 (en) * 2017-11-02 2019-05-09 平安科技(深圳)有限公司 Enterprise relationship extraction method and device, and storage medium
US20190188260A1 (en) * 2017-12-14 2019-06-20 Qualtrics, Llc Capturing rich response relationships with small-data neural networks
CN108363753A (en) * 2018-01-30 2018-08-03 南京邮电大学 Comment text sentiment classification model is trained and sensibility classification method, device and equipment
CN109740148A (en) * 2018-12-16 2019-05-10 北京工业大学 A kind of text emotion analysis method of BiLSTM combination Attention mechanism
CN109710761A (en) * 2018-12-21 2019-05-03 中国标准化研究院 The sentiment analysis method of two-way LSTM model based on attention enhancing
CN110162636A (en) * 2019-05-30 2019-08-23 中森云链(成都)科技有限责任公司 Text mood reason recognition methods based on D-LSTM

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Cheng Lu: "Research on an attention-based bidirectional LSTM model for sentiment classification of Chinese product reviews" *
Bai Jing; Li Fei; Ji Donghong: "An attention-based BiLSTM-CNN stance detection model for Chinese microblogs" *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111538835A (en) * 2020-03-30 2020-08-14 东南大学 Social media emotion classification method and device based on knowledge graph
CN111538835B (en) * 2020-03-30 2023-05-23 东南大学 Social media emotion classification method and device based on knowledge graph
CN111737467B (en) * 2020-06-22 2023-05-23 华南师范大学 Object-level emotion classification method based on segmented convolutional neural network
CN111737467A (en) * 2020-06-22 2020-10-02 华南师范大学 Object-level emotion classification method based on segmented convolutional neural network
CN111930938A (en) * 2020-07-06 2020-11-13 武汉卓尔数字传媒科技有限公司 Text classification method and device, electronic equipment and storage medium
CN112069307A (en) * 2020-08-25 2020-12-11 中国人民大学 Legal law citation information extraction system
CN113361252A (en) * 2021-05-27 2021-09-07 山东师范大学 Text depression tendency detection system based on multi-modal features and emotion dictionary
CN113393276A (en) * 2021-06-25 2021-09-14 食亨(上海)科技服务有限公司 Comment data classification method and device and computer readable medium
CN113393276B (en) * 2021-06-25 2023-06-16 食亨(上海)科技服务有限公司 Comment data classification method, comment data classification device and computer-readable medium
CN113469365A (en) * 2021-06-30 2021-10-01 上海寒武纪信息科技有限公司 Inference and compilation method based on neural network model and related products thereof
CN113469365B (en) * 2021-06-30 2024-03-19 上海寒武纪信息科技有限公司 Reasoning and compiling method based on neural network model and related products thereof
CN114168730A (en) * 2021-11-26 2022-03-11 一拓通信集团股份有限公司 Consumption tendency analysis method based on BilSTM and SVM
CN114298019A (en) * 2021-12-29 2022-04-08 中国建设银行股份有限公司 Emotion recognition method, emotion recognition apparatus, emotion recognition device, storage medium, and program product

Similar Documents

Publication Publication Date Title
CN110879938A (en) Text emotion classification method, device, equipment and storage medium
US11216620B1 (en) Methods and apparatuses for training service model and determining text classification category
CN107943784B (en) Relationship extraction method based on generation of countermeasure network
CN110110062B (en) Machine intelligent question and answer method and device and electronic equipment
CN116194912A (en) Method and system for aspect-level emotion classification using graph diffusion transducers
CN109766557B (en) Emotion analysis method and device, storage medium and terminal equipment
CN110619044B (en) Emotion analysis method, system, storage medium and equipment
CN113011186B (en) Named entity recognition method, named entity recognition device, named entity recognition equipment and computer readable storage medium
CN112711948A (en) Named entity recognition method and device for Chinese sentences
CN113326374B (en) Short text emotion classification method and system based on feature enhancement
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
CN113220886A (en) Text classification method, text classification model training method and related equipment
CN110377733B (en) Text-based emotion recognition method, terminal equipment and medium
CN112100377B (en) Text classification method, apparatus, computer device and storage medium
CN111078833A (en) Text classification method based on neural network
CN110968725B (en) Image content description information generation method, electronic device and storage medium
CN109614611B (en) Emotion analysis method for fusion generation of non-antagonistic network and convolutional neural network
CN112667782A (en) Text classification method, device, equipment and storage medium
CN114860930A (en) Text classification method and device and storage medium
CN111739520B (en) Speech recognition model training method, speech recognition method and device
CN112988970A (en) Text matching algorithm serving intelligent question-answering system
CN113158667B (en) Event detection method based on entity relationship level attention mechanism
CN117094383A (en) Joint training method, system, equipment and storage medium for language model
CN107729509B (en) Discourse similarity determination method based on recessive high-dimensional distributed feature representation
CN110851600A (en) Text data processing method and device based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
Application publication date: 20200313