CN110929034A - Commodity comment fine-grained emotion classification method based on improved LSTM - Google Patents
- Publication number: CN110929034A
- Application number: CN201911173494.2A
- Authority: CN (China)
- Prior art keywords: word, commodity, words, emotion, comment
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Databases & Information Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
The invention belongs to the field of natural language processing and provides a commodity comment fine-grained emotion classification method based on improved LSTM. The method comprises the following steps: writing a crawler script to capture commodity comment data from an e-commerce website and preprocessing the data; segmenting the cleaned data into words with the jieba word segmentation tool; training word vectors with word2vec from the gensim natural language processing package to obtain the word vectors corresponding to the comment data; using an existing emotion lexicon as a seed lexicon and expanding it according to word vector similarity; extracting subject words and emotion words from the comments; and constructing an emotion classification model, feeding it the word vector sequences corresponding to the subject words and emotion words of each commodity comment, and classifying the emotion of the comment. By applying deep learning, the method fully mines the emotional tendency expressed in commodity comments and thereby improves the accuracy of their emotion classification.
Description
Technical Field
The invention relates to the technical field of natural language processing, in particular to a commodity comment fine-grained emotion classification method based on improved LSTM.
Background
In recent years, with the rapid development of the Internet, large numbers of online users have gathered on social media, forums, and online shopping platforms such as JD.com and Taobao. According to the 44th Statistical Report on Internet Development in China, as of June 2019 the number of online shopping users in China had reached 639 million, an increase of 28.71 million compared with 2018, accounting for 74.8% of all Internet users; online shopping and online payment have become some of the applications most used by Internet users. When buying a commodity online, people prefer to learn about it from the comparatively objective information in buyers' comments rather than from the merchant's subjective description. Sellers on e-commerce platforms can likewise use the comments to learn what buyers think of one or more commodities, identify problems with those commodities, and set a reasonable sales strategy. Given the huge volume of comment text, obtaining the emotional tendency of comments manually is very time-consuming and labor-intensive, so automatically mining and analyzing the emotional tendency of comment text with artificial intelligence and natural language processing techniques is an important task.
Emotion analysis is the process of analyzing, processing, summarizing, and reasoning about subjective text that carries emotional coloring. Emotion classification assigns a text to two or more classes, such as positive or negative, according to the meaning and emotional information it expresses; that is, it classifies the emotional tendency and the attitude of the text. Traditional emotion classification methods are mainly based either on emotion dictionaries or on machine learning. Dictionary-based methods perform semantic analysis with an emotion dictionary such as HowNet and judge the polarity of the text from the final score: a positive score indicates positive emotion and a negative score indicates negative emotion. The drawbacks of this approach are that it depends heavily on the dictionary, that vocabulary differs greatly between domains, and that few mature Chinese emotion dictionaries exist and their scope is limited, so portability is poor. Machine-learning methods mainly include the naive Bayes classifier, the maximum entropy algorithm, and the support vector machine, but they require large labeled data sets, with positive features selected from positive comment data and negative features selected from negative comment data.
Disclosure of Invention
To overcome the shortcomings that mature emotion dictionaries are scarce, that emotion classification models port poorly across domains, and that large manually labeled data sets are required, the invention provides a commodity comment fine-grained emotion classification method based on an improved LSTM (long short-term memory network). By incorporating a deep neural network model, the method can extract deep emotional information from the text and thereby improve emotion classification accuracy.
The technical solution adopted by the invention is as follows. Text processing techniques from the field of natural language processing are introduced into the emotion classification model and combined with deep learning to improve classification accuracy. Word vectors are trained with the Word2Vec algorithm and the commodity comment texts are represented by these word vectors, which converts the conceptual space of the text into a computable space; the similarity between two words is obtained by calculating the Euclidean distance between their word vectors. Finally, the word vectors corresponding to the subject words and emotion words are input into an emotion classifier for training, which yields the emotional tendency of the commodity comment.
A commodity comment fine-grained emotion classification method based on improved LSTM comprises the following steps:
step 1: capturing commodity comment data from an e-commerce website, wherein the commodity comment data comprise a commodity ID, a commodity category, a commodity name, commodity comment content, and comment time; labeling part of the commodity comment data as positive or negative; and dividing the labeled data into a training set and a test set;
step 2: cleaning the commodity comment data by deleting punctuation that is useless for emotion classification, and segmenting the commodity comments into words;
step 3: converting each word obtained in step 2 into a word vector, and constructing the word vector matrix corresponding to the words;
step 4: converting the words in the subject word seed lexicon and the emotion word seed lexicon into word vectors, the vector matrix corresponding to these words serving as the seed word vector matrix; performing similarity calculation between the seed word vector matrix and the word vector matrix obtained in step 3; where the seed word is a subject word, adding the words whose similarity value is greater than a threshold to the subject word lexicon; and where the seed word is an emotion word, adding the words whose similarity value is greater than the threshold to the emotion word lexicon;
step 5: extracting subject words and emotion words from the commodity comment data, mapping them to word vectors, and splicing the subject word vectors and emotion word vectors to obtain the word vector splicing result that serves as the input of an emotion classifier;
step 6: the emotion classifier comprises a bidirectional long short-term memory network and a softmax function, and the word vector splicing result of step 5 is used as its input; in a two-layer LSTM neural network model, an input gate, an output gate, and a forget gate control the flow of the state matrix across time steps during model training; the network updates node information through its memory unit so as to learn long-range dependencies in the text sequence; an attention layer adjusts the weights of the subject words and the emotion words respectively by calculating the weight corresponding to the output matrix of each network unit, and the weighted sum of the output matrix with the attention weights gives the feature vector of the commodity comment, so that a more accurate emotion classification result is obtained; finally, the emotion category of the commodity comment is output through the softmax function.
Further, in step 1, the commodity comments are collected by writing Python crawler code for the e-commerce website; part of the captured data is labeled manually, each commodity comment sentence being labeled as positive or negative; finally, the labeled data are divided into a training set and a test set.
Further, in step 2, the collected commodity comment data are cleaned by removing punctuation marks that are useless for emotion classification, and a word segmentation tool is used to segment the commodity comment data into words.
Further, in step 3, each word produced by segmenting the commodity comments is mapped to a word vector using Word2Vec, which is trained on the captured commodity comment data, so that feature vectors containing emotional and semantic information are obtained.
Further, in step 4, each subject word in the subject word seed lexicon and each emotion word in the emotion word seed lexicon is mapped to a word vector using Word2Vec; similarity is calculated between the seed words and the word vectors of the commodity comments obtained in step 3; and, according to the results, the words with high similarity are added to the subject word lexicon and the emotion word lexicon respectively as extensions.
Further, in step 5, the components of a commodity comment that are effective for emotion classification are its subject words and emotion words. All words of the user comment are converted to word vectors; the vector similarity between the words in the comment and the subject word lexicon is calculated to pick out the subject words in the comment, and the vector similarity between the words in the comment and the emotion word lexicon is calculated to pick out the emotion words in the comment. The method for picking out subject words and emotion words is the same as the similarity calculation used to expand the subject word lexicon and the emotion word lexicon in step 4. The vectors of the selected subject words and emotion words are then spliced so that the emotion classification information contained in the comment can be input into the emotion classification model.
Further, in step 6, the spliced subject word vectors and emotion word vectors from step 5 are used as the input of the emotion classifier, and the emotion classification model classifies the commodity comment data. The text vectors are first passed through a two-layer long short-term memory network, which computes the matrix corresponding to the comment; the hidden-layer nodes of the two layers are interconnected, and both layers are connected to the same output layer. To highlight the role of the subject words and emotion words in the comment sentence, an attention mechanism is applied to the output-layer matrix, which is summed with the attention weights, improving the final accuracy of emotion classification. Finally, the resulting matrix is input into a softmax function to obtain the softmax values, which determine the emotional tendency of the comment.
Advantageous Effects
The model can be applied in a variety of fields and situations: only a small amount of corpus needs to be labeled for a field before other data in that field can be classified by emotion, and high classification accuracy can be achieved.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram of the emotion classifier of the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
As shown in FIG. 1 and FIG. 2, the commodity comment fine-grained emotion classification method based on improved LSTM obtains the emotion category of a commodity comment through an emotion classifier and mainly includes the following steps:
Step one: Python crawler code is written to capture commodity comment data from an e-commerce website such as JD.com. The commodity comment data include the commodity ID, commodity category, commodity name, commodity comment content, comment time, and the like. The commodity comment data are then labeled manually as expressing a positive or a negative emotional tendency.
After the commodity comment data are labeled, the data set is divided into a test set and a training set. The training set of labeled data is used to train the model and the test set is used to test the trained model; the ratio of test set to training set is 2:8.
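The 2:8 division can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: the function name `split_labeled_comments`, the fixed random seed, and the toy comments are all assumptions.

```python
import random

def split_labeled_comments(labeled, test_ratio=0.2, seed=0):
    """Shuffle labeled comments and return (train, test) with a 2:8 test/train ratio."""
    items = list(labeled)
    random.Random(seed).shuffle(items)  # fixed seed (assumed) for reproducibility
    n_test = int(round(len(items) * test_ratio))
    return items[n_test:], items[:n_test]

# Hypothetical toy data: (comment text, label), 1 = positive, 0 = negative.
comments = [("comment %d" % i, i % 2) for i in range(10)]
train_set, test_set = split_labeled_comments(comments)
print(len(train_set), len(test_set))
```

Shuffling before the cut keeps the two sets from inheriting any ordering in the crawled data, such as comments grouped by commodity.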
Step two: The captured commodity comment data are cleaned by deleting punctuation marks that are useless for emotion classification, such as commas, periods, and ellipses, and the commodity comments are then segmented into words with a word segmentation tool. Stop words, which generally are words that do not affect the meaning or emotional tendency of the whole comment sentence, are filtered out of the commodity comments with a stop word list, which improves model training efficiency.
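A rough sketch of this cleaning step follows. The punctuation set, the stop word list, and the helper names are assumptions made for illustration; the default `segmenter` simply splits on whitespace so the example runs without extra packages, whereas Chinese text would need a real segmenter (for example `jieba.lcut`) passed in its place.

```python
import re

# Punctuation treated as useless for classification (assumed set, covering
# ASCII and full-width marks such as commas, periods, and ellipses).
PUNCTUATION = re.compile(r"[,.!?;:，。！？；：、…]+")

def clean_comment(text):
    """Replace punctuation with spaces and collapse repeated whitespace."""
    return re.sub(r"\s+", " ", PUNCTUATION.sub(" ", text)).strip()

def tokenize(text, segmenter=str.split, stop_words=frozenset()):
    """Clean the text, segment it into words, and drop stop words."""
    return [w for w in segmenter(clean_comment(text)) if w not in stop_words]

# Hypothetical English stand-in for a comment and a stop word list.
stop_words = frozenset({"the", "is", "a"})
tokens = tokenize("the screen is bright, the battery... is weak!",
                  stop_words=stop_words)
print(tokens)  # ['screen', 'bright', 'battery', 'weak']
```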
Step three: Word vectors are trained with the Word2Vec method, for example by calling the gensim library for Python, which can also be used to calculate the similarity between words. The result of segmenting the commodity comment data is converted to word vectors, each word corresponding to one word vector. Word2Vec offers two model variants, the continuous bag-of-words model (CBOW) and the skip-gram model (Skip-Gram), and can be trained efficiently on large volumes of data to obtain word vectors.
Step four: The words in the subject word seed lexicon and the emotion word seed lexicon are converted into word vectors with the Word2Vec method, and their similarity to the word vectors obtained from the commodity comment data in step three is calculated. According to the results, the words with high similarity are added to the subject word lexicon and the emotion word lexicon respectively as extensions.
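The lexicon expansion can be sketched as a threshold test on word vector similarity. The patent derives similarity from Euclidean distance but does not give the mapping, so the form 1 / (1 + distance) used here is an assumption, as are the function names, the threshold of 0.8, and the two-dimensional toy vectors.

```python
import math

def euclidean_similarity(u, v):
    """Map Euclidean distance to a similarity in (0, 1]; 1 means identical vectors."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + distance)

def expand_lexicon(seed_words, vectors, threshold):
    """Return the seeds plus every word whose similarity to some seed exceeds threshold."""
    expanded = set(seed_words)
    for word, vec in vectors.items():
        if word in expanded:
            continue
        if any(euclidean_similarity(vec, vectors[s]) > threshold
               for s in seed_words if s in vectors):
            expanded.add(word)
    return expanded

# Hypothetical word vectors standing in for the Word2Vec output of step three.
vectors = {
    "good": [1.0, 0.0], "great": [0.9, 0.1], "bad": [-1.0, 0.0],
    "screen": [0.0, 1.0], "display": [0.1, 0.9],
}
print(sorted(expand_lexicon({"good"}, vectors, threshold=0.8)))    # ['good', 'great']
print(sorted(expand_lexicon({"screen"}, vectors, threshold=0.8)))  # ['display', 'screen']
```

Candidates are compared only against the original seeds, not against words added during the same pass, which keeps one loosely related word from pulling in a chain of others.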
Step five: All commodity comment data were converted into word vectors in step three. From these, the word vectors corresponding to the subject words and emotion words are extracted and spliced to form the input of the emotion classifier.
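One way to read this step is sketched below. The patent does not fix the splicing layout, so placing the subject word vectors first and the emotion word vectors after them in a single sequence is an assumption, and membership in the expanded lexicons stands in for the similarity filter. The function name and toy data are hypothetical.

```python
def build_classifier_input(tokens, subject_lexicon, emotion_lexicon, vectors):
    """Keep only subject and emotion words and splice their vectors into one sequence."""
    subject = [vectors[w] for w in tokens if w in subject_lexicon and w in vectors]
    emotion = [vectors[w] for w in tokens if w in emotion_lexicon and w in vectors]
    return subject + emotion  # assumed layout: subject vectors, then emotion vectors

# Hypothetical segmented comment, lexicons, and word vectors.
tokens = ["screen", "very", "good", "price", "bad"]
subject_lexicon = {"screen", "price"}
emotion_lexicon = {"good", "bad"}
vectors = {
    "screen": [0.0, 1.0], "price": [0.2, 0.8], "very": [0.5, 0.5],
    "good": [1.0, 0.0], "bad": [-1.0, 0.0],
}
sequence = build_classifier_input(tokens, subject_lexicon, emotion_lexicon, vectors)
print(sequence)  # [[0.0, 1.0], [0.2, 0.8], [1.0, 0.0], [-1.0, 0.0]]
```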
Step six: The subject word and emotion word vectors are input into the emotion classifier for model training, where they are fed to the bidirectional long short-term memory network of the classifier model. The memory unit of the LSTM contains a forget gate $f_t$, an input gate $i_t$, and an output gate $o_t$. Together these gates determine how the current memory cell $C_t$ and the current hidden state $h_t$ are updated.
Given an input sequence $v = (v_1, v_2, \ldots, v_L)$, the LSTM computes the hidden vector sequence $H = [h_1, h_2, \ldots, h_L]$ and the output matrix sequence $X = [x_1, x_2, \ldots, x_L]$.
The forget gate controls whether information is forgotten: it determines, with a certain probability, whether the cell state of the previous time step is discarded. The forget gate is expressed as follows:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$
where $f_t$ is the forget gate at time step $t$, $\sigma$ is the activation function, whose output lies in $[0, 1]$, $W_f$ is the weight of the forget gate, $h_{t-1}$ is the hidden-layer data at time step $t-1$, $x_t$ is the input at time step $t$, and $b_f$ is the bias of the forget gate.
The input gate processes the input at the current sequence position and selectively stores new information in the cell state. The input gate is expressed as follows:
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
where $i_t$ is the input gate at time step $t$, $\sigma$ is the activation function, whose output lies in $[0, 1]$, $W_i$ is the weight of the input gate, $h_{t-1}$ is the hidden-layer data at time step $t-1$, $x_t$ is the input at time step $t$, and $b_i$ is the bias of the input gate.
$$\tilde{C}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)$$
where $\tilde{C}_t$ is the candidate vector, $\tanh$ is the activation function, whose output lies in $[-1, 1]$, $W_c$ is the weight of the candidate vector, $h_{t-1}$ is the hidden-layer data at time step $t-1$, $x_t$ is the input at time step $t$, and $b_c$ is the bias of the candidate vector.
The output gate passes the cell state on as output after the intermediate-layer information has been processed. The output gate is expressed as follows:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
where $o_t$ is the output gate at time step $t$, $\sigma$ is the activation function, whose output lies in $[0, 1]$, $W_o$ is the weight of the output gate, $h_{t-1}$ is the hidden-layer data at time step $t-1$, $x_t$ is the input at time step $t$, and $b_o$ is the bias of the output gate.
The two kinds of memory in the LSTM, long-term (the cell state) and short-term (the hidden state), are expressed as follows, with the cell-state update written in its standard LSTM form:
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
$$h_t = o_t \odot \tanh(C_t)$$
where $C_t$ is the updated cell state at time step $t$, $C_{t-1}$ is the cell state at time step $t-1$, and $h_t$ is the hidden-layer data at time step $t$.
For a sentence $s = \{w_1, w_2, \ldots, w_L\}$, $L$ is the maximum number of words in a sentence. $\mathrm{LSTM}_l$ and $\mathrm{LSTM}_r$ denote the left and right LSTM units respectively. $C_l(v_i)$ is the context vector output by the left LSTM unit and $C_r(v_i)$ is the context vector output by the right LSTM unit; combining $C_l(v_i)$ and $C_r(v_i)$ gives the value of the final state matrix $C$.
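The gate equations and the left/right combination can be traced in plain Python. This is a sketch under stated assumptions, not the patented model: the cell-state update is the standard LSTM one, the left and right outputs are combined by concatenation (the text only says they are combined), and `toy_params` with its constant weights is a hypothetical stand-in for trained parameters.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, z):
    """Compute W·z + b for a weight matrix W (list of rows) and a bias vector b."""
    return [sum(w * x for w, x in zip(row, z)) + bi for row, bi in zip(W, b)]

def lstm_step(params, h_prev, c_prev, x_t):
    """One LSTM time step: gates, candidate vector, cell state, hidden state."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(params["Wf"], params["bf"], z)]    # forget gate
    i = [sigmoid(a) for a in affine(params["Wi"], params["bi"], z)]    # input gate
    g = [math.tanh(a) for a in affine(params["Wc"], params["bc"], z)]  # candidate
    o = [sigmoid(a) for a in affine(params["Wo"], params["bo"], z)]    # output gate
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c

def run_lstm(params, inputs, hidden_size):
    """Run the cell over a sequence from zero initial states; return all hidden states."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    outputs = []
    for x_t in inputs:
        h, c = lstm_step(params, h, c, x_t)
        outputs.append(h)
    return outputs

def run_bilstm(left_params, right_params, inputs, hidden_size):
    """Left unit reads forward, right unit reads backward; outputs are concatenated."""
    left = run_lstm(left_params, inputs, hidden_size)
    right = run_lstm(right_params, inputs[::-1], hidden_size)[::-1]
    return [l + r for l, r in zip(left, right)]

def toy_params(hidden_size, input_size, value):
    """Hypothetical constant-valued parameters, for illustration only."""
    W = [[value] * (hidden_size + input_size) for _ in range(hidden_size)]
    b = [0.0] * hidden_size
    return {"Wf": W, "bf": b, "Wi": W, "bi": b, "Wc": W, "bc": b, "Wo": W, "bo": b}

sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
states = run_bilstm(toy_params(2, 2, 0.1), toy_params(2, 2, -0.1), sequence, 2)
print(len(states), len(states[0]))  # 3 4
```

Each of the three time steps yields four values, two from the left unit and two from the right, so every position carries context from both directions.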
The attention layer calculates the weight that the emotion classification model assigns to each network node during training; the output-layer vectors are then summed with the attention weights to obtain the feature vector of the comment.
Finally, the feature vector is passed through a softmax function to obtain the emotional tendency class of the commodity comment.
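These last two steps, attention-weighted pooling followed by softmax classification, might look as follows. The patent does not state how attention scores are computed, so scoring each output vector by its dot product with a `query` vector is an assumption, as are the linear layer before the softmax and all names and toy numbers.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention_pool(outputs, query):
    """Weighted sum of output vectors; weights are softmax(dot(output, query))."""
    weights = softmax([dot(h, query) for h in outputs])
    dim = len(outputs[0])
    feature = [sum(w * h[d] for w, h in zip(weights, outputs)) for d in range(dim)]
    return feature, weights

def classify(feature, W, b, labels=("negative", "positive")):
    """Linear layer followed by softmax; returns (label, class probabilities)."""
    probs = softmax([dot(row, feature) + bi for row, bi in zip(W, b)])
    return labels[probs.index(max(probs))], probs

# Hypothetical network outputs for a two-word input and toy classifier weights.
outputs = [[1.0, 0.0], [0.0, 1.0]]
feature, weights = attention_pool(outputs, query=[0.0, 0.0])
label, probs = classify(feature, W=[[1.0, 0.0], [0.0, 3.0]], b=[0.0, 0.0])
print(weights, feature, label)  # [0.5, 0.5] [0.5, 0.5] positive
```

With a zero query every position scores the same, so the weights are uniform; a trained query would shift weight toward the subject and emotion words that matter most.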
Although illustrative embodiments of the invention have been described above to help those skilled in the art understand it, the invention is not limited to the scope of these embodiments. Various changes will be apparent to those skilled in the art, and all such changes that make use of the inventive concept remain protected as long as they fall within the spirit and scope of the invention as defined by the appended claims.
Claims (7)
1. A commodity comment fine-grained emotion classification method based on improved LSTM is characterized by comprising the following steps:
step 1: capturing commodity comment data from an e-commerce website, wherein the commodity comment data comprise a commodity ID, a commodity category, a commodity name, commodity comment content, and comment time; labeling part of the commodity comment data as positive or negative; and dividing the labeled data into a training set and a test set;
step 2: cleaning the commodity comment data by deleting punctuation that is useless for emotion classification, and segmenting the commodity comments into words;
step 3: converting each word obtained in step 2 into a word vector, and constructing the word vector matrix corresponding to the words;
step 4: converting the words in the subject word seed lexicon and the emotion word seed lexicon into word vectors, the vector matrix corresponding to these words serving as the seed word vector matrix; performing similarity calculation between the seed word vector matrix and the word vector matrix obtained in step 3; where the seed word is a subject word, adding the words whose similarity value is greater than a threshold to the subject word lexicon; and where the seed word is an emotion word, adding the words whose similarity value is greater than the threshold to the emotion word lexicon;
step 5: extracting subject words and emotion words from the commodity comment data, mapping them to word vectors, and splicing the subject word vectors and emotion word vectors to obtain the word vector splicing result that serves as the input of an emotion classifier;
step 6: the emotion classifier comprises a bidirectional long short-term memory network and a softmax function, and the word vector splicing result of step 5 is used as its input; in a two-layer LSTM neural network model, an input gate, an output gate, and a forget gate control the flow of the state matrix across time steps during model training; the network updates node information through its memory unit so as to learn long-range dependencies in the text sequence; an attention layer adjusts the weights of the subject words and the emotion words respectively by calculating the weight corresponding to the output matrix of each network unit, and the weighted sum of the output matrix with the attention weights gives the feature vector of the commodity comment, so that a more accurate emotion classification result is obtained; finally, the emotion category of the commodity comment is output through the softmax function.
2. The method for classifying the fine-grained emotion of the commodity comments based on the improved LSTM as claimed in claim 1, wherein:
in the step 1, data acquisition is carried out on the commodity comments by compiling Python crawler codes of the E-commerce website, manual marking is carried out on the captured partial data, and each sentence of commodity comment is marked as positive or negative; and finally, dividing the marked data into a training set and a test set.
3. The method for classifying the fine-grained emotion of the commodity comments based on the improved LSTM as claimed in claim 1, wherein:
in the step 2, data cleaning is carried out on the collected commodity comment data, punctuation marks which are useless for sentiment classification in the comment are removed, and a word segmentation tool is used for carrying out word segmentation on the commodity comment data.
4. The method for classifying the fine-grained emotion of the commodity comments based on the improved LSTM as claimed in claim 1, wherein:
in the step 3, each Word is mapped into a Word vector by using Word2Vec as a result of segmenting the commodity comment, and the captured commodity comment data is trained, so that a feature vector containing emotion information and semantic information is obtained.
5. The method for classifying the fine-grained emotion of the commodity comments based on the improved LSTM as claimed in claim 1, wherein:
in the step 4, each subject Word or each emotional Word is mapped into a Word vector by using Word2Vec for the subject Word seed lexicon and the emotional Word seed lexicon, similarity calculation is carried out on the seed words and the Word vectors of the commodity comments obtained in the step 3, and according to the calculation result between the seed words and the Word vectors, the high similarity is respectively used as the expansion lexicons of the subject Word lexicon and the emotional lexicon.
6. The method for classifying the fine-grained emotion of the commodity comments based on the improved LSTM as claimed in claim 1, wherein:
in the step 5, effective components in the sentiment classification of the commodity comment are subject words and sentiment words, all words of the user comment are subjected to word vector conversion, the vector similarity between the words in the comment and the subject word bank is calculated, the subject words in the comment are filtered, the vector similarity between the words in the comment and the sentiment word bank is calculated, and the sentiment words in the comment are filtered; the method for filtering out the subject words and the emotional words is the same as the method for calculating the similarity of the extended subject word library and the emotional word library in the step 4; and splicing the filtered subject term and the emotion word vector in order to input the emotion classification information contained in the comment into the emotion classification model.
7. The method for classifying the fine-grained emotion of the commodity comments based on the improved LSTM as claimed in claim 1, wherein:
in step 6, the subject word vectors and emotion word vectors obtained in step 5 are spliced and used as the input of an emotion classifier; the commodity comment data are input into the emotion classifier and classified by the emotion classification model; the text vectors are input into the emotion classification model, in which the matrix corresponding to the comment is first computed by a two-layer long short-term memory (LSTM) network, the hidden-layer nodes of the two layers being connected to each other and both layers being connected to the same output layer; in order to highlight the role of the subject words and emotion words in the comment sentence, an attention mechanism is applied to the output-layer matrix, which is weighted and summed, so as to improve the final accuracy of emotion classification; and finally the resulting matrix is input into a softmax function to obtain softmax values, from which the emotional tendency of the comment is determined.
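The forward pass of step 6 can be sketched as below. The sketch reads the "two layers connected to the same output layer" as a forward and a backward LSTM whose hidden states are concatenated, which is an assumption; it also assumes a simple dot-product attention score, three emotion classes, omitted bias terms, and random untrained weights. All names and sizes are hypothetical, and the result illustrates only the data flow, not a trained classifier.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def lstm_pass(inputs, weights, hidden):
    """One LSTM direction; weights has shape (4*hidden, input_dim + hidden)."""
    h, c, outputs = [0.0] * hidden, [0.0] * hidden, []
    for x in inputs:
        z = matvec(weights, x + h)  # all four gates from [x; h]
        i = [sigmoid(v) for v in z[:hidden]]                 # input gate
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]       # forget gate
        o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]   # output gate
        g = [math.tanh(v) for v in z[3 * hidden:]]           # candidate cell
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        outputs.append(h)
    return outputs

def classify(inputs, params, hidden):
    """Return (class probabilities, attention weights) for one comment."""
    fwd = lstm_pass(inputs, params["fwd"], hidden)
    bwd = lstm_pass(inputs[::-1], params["bwd"], hidden)[::-1]
    states = [f + b for f, b in zip(fwd, bwd)]  # concatenate both directions
    scores = [sum(a * s for a, s in zip(params["attn"], st)) for st in states]
    alpha = softmax(scores)                     # attention weights over words
    context = [sum(alpha[t] * states[t][k] for t in range(len(states)))
               for k in range(2 * hidden)]      # weighted sum of hidden states
    return softmax(matvec(params["out"], context)), alpha

rng = random.Random(0)
input_dim, hidden, num_classes = 2, 4, 3  # hypothetical sizes
params = {
    "fwd": rand_matrix(4 * hidden, input_dim + hidden, rng),
    "bwd": rand_matrix(4 * hidden, input_dim + hidden, rng),
    "attn": [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden)],
    "out": rand_matrix(num_classes, 2 * hidden, rng),
}
comment_matrix = [[1.0, 0.1], [0.2, 0.9]]  # spliced word vectors from step 5
probs, alpha = classify(comment_matrix, params, hidden)
predicted_class = probs.index(max(probs))
```

The attention weights `alpha` sum to one over the words of the comment, and the class with the largest softmax value is taken as the emotional tendency.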
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911173494.2A CN110929034A (en) | 2019-11-26 | 2019-11-26 | Commodity comment fine-grained emotion classification method based on improved LSTM |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110929034A (en) | 2020-03-27 |
Family
ID=69851952
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911173494.2A Pending CN110929034A (en) | Commodity comment fine-grained emotion classification method based on improved LSTM | 2019-11-26 | 2019-11-26 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110929034A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111666410A (en) * | 2020-05-29 | 2020-09-15 | 中国人民解放军军事科学院国防科技创新研究院 | Emotion classification method and system for commodity user comment text |
CN111695017A (en) * | 2020-06-15 | 2020-09-22 | 山东浪潮云服务信息科技有限公司 | Method and system for analyzing emotional tendency of user based on product comment |
CN111738006A (en) * | 2020-06-22 | 2020-10-02 | 苏州大学 | Commodity comment named entity recognition-based problem generation method |
CN111881298A (en) * | 2020-08-04 | 2020-11-03 | 上海交通大学 | Semi-structured text processing and analyzing method |
CN111881671A (en) * | 2020-09-27 | 2020-11-03 | 华南师范大学 | Attribute word extraction method |
CN112597302A (en) * | 2020-12-18 | 2021-04-02 | 东北林业大学 | False comment detection method based on multi-dimensional comment representation |
CN112818682A (en) * | 2021-01-22 | 2021-05-18 | 深圳大学 | E-commerce data analysis method, equipment, device and computer-readable storage medium |
CN112836052A (en) * | 2021-02-19 | 2021-05-25 | 中国第一汽车股份有限公司 | Automobile comment text viewpoint mining method, equipment and storage medium |
CN113159831A (en) * | 2021-03-24 | 2021-07-23 | 湖南大学 | Comment text sentiment analysis method based on improved capsule network |
CN113761911A (en) * | 2021-03-17 | 2021-12-07 | 中科天玑数据科技股份有限公司 | Domain text labeling method based on weak supervision |
CN114579833A (en) * | 2022-03-03 | 2022-06-03 | 重庆邮电大学 | Microblog public opinion visual analysis method based on topic mining and emotion analysis |
CN117852507A (en) * | 2024-03-07 | 2024-04-09 | 南京信息工程大学 | Restaurant return guest prediction model, method, system and equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170169008A1 (en) * | 2015-12-15 | 2017-06-15 | Le Holdings (Beijing) Co., Ltd. | Method and electronic device for sentiment classification |
CN107544957A (en) * | 2017-07-05 | 2018-01-05 | 华北电力大学 | A kind of Sentiment orientation analysis method of business product target word |
CN109710761A (en) * | 2018-12-21 | 2019-05-03 | 中国标准化研究院 | The sentiment analysis method of two-way LSTM model based on attention enhancing |
CN110472042A (en) * | 2019-07-02 | 2019-11-19 | 桂林电子科技大学 | A kind of fine granularity sensibility classification method |
- 2019-11-26 CN CN201911173494.2A patent/CN110929034A/en active Pending
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111666410B (en) * | 2020-05-29 | 2022-01-28 | 中国人民解放军军事科学院国防科技创新研究院 | Emotion classification method and system for commodity user comment text |
CN111666410A (en) * | 2020-05-29 | 2020-09-15 | 中国人民解放军军事科学院国防科技创新研究院 | Emotion classification method and system for commodity user comment text |
CN111695017A (en) * | 2020-06-15 | 2020-09-22 | 山东浪潮云服务信息科技有限公司 | Method and system for analyzing emotional tendency of user based on product comment |
CN111738006A (en) * | 2020-06-22 | 2020-10-02 | 苏州大学 | Commodity comment named entity recognition-based problem generation method |
CN111881298A (en) * | 2020-08-04 | 2020-11-03 | 上海交通大学 | Semi-structured text processing and analyzing method |
CN111881671A (en) * | 2020-09-27 | 2020-11-03 | 华南师范大学 | Attribute word extraction method |
CN111881671B (en) * | 2020-09-27 | 2020-12-29 | 华南师范大学 | Attribute word extraction method |
CN112597302B (en) * | 2020-12-18 | 2022-04-29 | 东北林业大学 | False comment detection method based on multi-dimensional comment representation |
CN112597302A (en) * | 2020-12-18 | 2021-04-02 | 东北林业大学 | False comment detection method based on multi-dimensional comment representation |
CN112818682A (en) * | 2021-01-22 | 2021-05-18 | 深圳大学 | E-commerce data analysis method, equipment, device and computer-readable storage medium |
CN112836052A (en) * | 2021-02-19 | 2021-05-25 | 中国第一汽车股份有限公司 | Automobile comment text viewpoint mining method, equipment and storage medium |
CN113761911A (en) * | 2021-03-17 | 2021-12-07 | 中科天玑数据科技股份有限公司 | Domain text labeling method based on weak supervision |
CN113159831A (en) * | 2021-03-24 | 2021-07-23 | 湖南大学 | Comment text sentiment analysis method based on improved capsule network |
CN114579833A (en) * | 2022-03-03 | 2022-06-03 | 重庆邮电大学 | Microblog public opinion visual analysis method based on topic mining and emotion analysis |
CN117852507A (en) * | 2024-03-07 | 2024-04-09 | 南京信息工程大学 | Restaurant return guest prediction model, method, system and equipment |
CN117852507B (en) * | 2024-03-07 | 2024-05-17 | 南京信息工程大学 | Restaurant return guest prediction model, method, system and equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110929034A (en) | Commodity comment fine-grained emotion classification method based on improved LSTM | |
CN110245229B (en) | Deep learning theme emotion classification method based on data enhancement | |
CN111222332B (en) | Commodity recommendation method combining attention network and user emotion | |
CN110472042B (en) | Fine-grained emotion classification method | |
CN108446271B (en) | Text emotion analysis method of convolutional neural network based on Chinese character component characteristics | |
Nurrohmat et al. | Sentiment analysis of novel review using long short-term memory method | |
CN111209401A (en) | System and method for classifying and processing sentiment polarity of online public opinion text information | |
CN110427623A (en) | Semi-structured document Knowledge Extraction Method, device, electronic equipment and storage medium | |
CN109726745B (en) | Target-based emotion classification method integrating description knowledge | |
CN110096575B (en) | Psychological portrait method facing microblog user | |
CN107315738A (en) | A kind of innovation degree appraisal procedure of text message | |
CN112487189B (en) | Implicit discourse text relation classification method for graph-volume network enhancement | |
CN111259140A (en) | False comment detection method based on LSTM multi-entity feature fusion | |
CN112256866A (en) | Text fine-grained emotion analysis method based on deep learning | |
CN109766553A (en) | A kind of Chinese word cutting method of the capsule model combined based on more regularizations | |
CN111368082A (en) | Emotion analysis method for domain adaptive word embedding based on hierarchical network | |
Rauf et al. | Using BERT for checking the polarity of movie reviews | |
CN114036298B (en) | Node classification method based on graph convolution neural network and word vector | |
CN114942974A (en) | E-commerce platform commodity user evaluation emotional tendency classification method | |
Chamekh et al. | Sentiment analysis based on deep learning in e-commerce | |
CN114443846A (en) | Classification method and device based on multi-level text abnormal composition and electronic equipment | |
Sinapoy et al. | Comparison of lstm and indobert method in identifying hoax on twitter | |
Imron et al. | Aspect Based Sentiment Analysis Marketplace Product Reviews Using BERT, LSTM, and CNN | |
CN115422362B (en) | Text matching method based on artificial intelligence | |
CN115906824A (en) | Text fine-grained emotion analysis method, system, medium and computing equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200327 |