CN112163091B - CNN-based aspect level cross-domain emotion analysis method - Google Patents

CNN-based aspect level cross-domain emotion analysis method

Info

Publication number
CN112163091B
CN112163091B (application CN202011026500.4A)
Authority
CN
China
Prior art keywords
convolution
formula
word
sentence
emotion
Prior art date
Legal status
Active
Application number
CN202011026500.4A
Other languages
Chinese (zh)
Other versions
CN112163091A (en)
Inventor
孟佳娜 (Meng Jiana)
于玉海 (Yu Yuhai)
吴诗涵 (Wu Shihan)
Current Assignee
Dalian Minzu University
Original Assignee
Dalian Minzu University
Priority date
Filing date
Publication date
Application filed by Dalian Minzu University
Priority to CN202011026500.4A
Publication of CN112163091A
Application granted
Publication of CN112163091B
Status: Active
Anticipated expiration

Classifications

    • G06F16/35 — Information retrieval of unstructured textual data; clustering; classification
    • G06F40/211 — Natural language analysis; syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F40/289 — Recognition of textual entities; phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/30 — Natural language analysis; semantic analysis
    • G06N3/045 — Neural networks; architectures; combinations of networks
    • G06N3/08 — Neural networks; learning methods
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The CNN-based aspect-level cross-domain emotion analysis method belongs to the field of text emotion analysis and aims to solve the problem of obtaining good emotion analysis classification results. The method comprises: S1, constructing an aspect-level emotion analysis model; and S2, performing aspect-level cross-domain emotion analysis. By fusing context features and sentence features, an aspect-level emotion classification model based on a convolutional neural network is constructed, the model trained in the source domain is transferred to the target domain, and aspect-level emotion analysis is performed on the data of the target domain.

Description

CNN-based aspect level cross-domain emotion analysis method
Technical Field
The invention belongs to the field of text emotion analysis, and relates to a CNN-based aspect level cross-domain emotion analysis method.
Background
Emotion analysis has wide application value; it is a challenging task in the field of natural language processing and one of its most active research directions. According to existing studies, emotion analysis can be divided into three levels: document level, sentence level, and aspect level. Document-level and sentence-level emotion analysis are both coarse-grained, while aspect-level emotion analysis is fine-grained and can provide more detailed results than general emotion analysis. Many advanced deep learning methods now exist for aspect-level emotion analysis, but common deep learning models generally depend heavily on large amounts of labeled training data, and manually labeling data costs a great deal of time and money.
Early aspect-level emotion analysis relied mainly on feature engineering to represent sentences; in recent years, deep learning models have performed better on this task. Long short-term memory (LSTM) networks represent sequence information well: Tang et al. use two LSTMs to jointly model a target word and its context, integrating their interrelated information, and Tai et al. propose a tree-LSTM structure that incorporates grammatical features such as dependency relations and phrase constituency, making the semantic representation more accurate. The attention mechanism can effectively improve the emotion classification effect: Ma et al. propose an LSTM structure with a hierarchical attention mechanism that introduces common-sense knowledge of emotion-related concepts into the end-to-end training of deep neural networks, and Ma et al. also propose an interactive attention network that interactively detects the important words of the target and the important words of the context. Memory networks provide long-term, large-scale memory that is easy to read and write: Tang et al. build a memory network from contextual information and capture the information important for emotional tendency in different ways through an attention mechanism, and the RAM model proposed by Peng et al. can capture long-distance emotional features, combining the results of multiple attention steps non-linearly with an RNN to extract more complex features. CNN models are better at extracting n-gram features: the TNet model of Li et al. proposes a feature transformation component that introduces entity information into the semantic representation of words, together with a "context preserving" mechanism that combines context features with the transformed features, and Wei et al. combine a CNN with gating mechanisms so that the model can selectively output emotion features according to the given aspect.
The core idea of transfer learning is to find the similarity between the source domain and the target domain, migrate the model or the labeled data of the source domain to the target domain on the basis of that similarity, and finally retrain using the similarity that exists. Because features differ greatly between domains, many cross-domain approaches start from a feature perspective. Blitzer et al. propose structural correspondence learning, which tries to find a set of pivot features with the same behavior in the source and target domains for alignment. Pan et al. propose spectral feature alignment to align domain-specific words from different domains into unified clusters. Many approaches to the cross-domain problem have also been built on deep neural networks: Glorot et al. use a stacked denoising autoencoder to reconstruct the features of the source and target domains; Chen et al. propose the mSDA (marginalized SDA) algorithm, which retains the strong learning ability of the model without a costly optimization procedure; Yosinski et al. found experimentally that the first few layers of a deep network are better kept fixed for the transfer-learning task, and proposed that fine-tuning can overcome inter-domain data variability well; and Long et al. propose the deep adaptation network (DAN) model, which uses a deep network as the carrier for adaptation transfer.
At present, transfer learning has achieved great success in many fields, such as text mining, speech recognition, computer vision, spam filtering, WiFi positioning, and emotion classification, and it has broad application prospects. Aspect-level emotion analysis can provide finer-grained information than general emotion analysis and has greater research and commercial value. Training an excellent aspect-level emotion analysis model, however, requires a large amount of labeled data, and when the training data are insufficient, differently distributed, or class-imbalanced, the effect of the model drops sharply. Constructing a model and method common to cross-domain emotion analysis is therefore a problem worth studying.
Disclosure of Invention
In order to solve the problem of obtaining good emotion analysis and classification results, the invention provides the following technical scheme: a CNN-based aspect-level cross-domain emotion analysis method, comprising
S1, constructing an aspect-level emotion analysis model, and
S2, performing aspect-level cross-domain emotion analysis.
Step S1 is as follows:
the input of the aspect-level emotion analysis model is divided into two parts, the aspect word and the context, and the corresponding convolution process likewise comprises two parts; the context X contains l words, each word is converted into a d-dimensional word vector, and the sentence X is expressed as a d×l matrix; a convolution kernel W_c of dimension d×k (k < l) performs a unidirectional translation scan over the context matrix, where k is the number of words covered by each scan of the kernel, and each scan yields a convolution result c_i, as shown in formula (2-1):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-1)
where b_c is a bias, f is an activation function, and * denotes the convolution operation, so that after the sentence has been scanned a vector c is obtained, as shown in formula (2-2):
c = [c_1, c_2, ..., c_{l_k}]   (2-2)
where l_k denotes the length of vector c (for a stride-1 scan, l_k = l − k + 1). In the experiments n_k convolution kernels of width k are set; when all sentences have been scanned, an n_k × l_k matrix is obtained, which is processed by max pooling, i.e. the maximum of each row is taken, so that the sentence is represented by an n_k-dimensional vector;
since the aspect word T may consist of one or more words, a small CNN is added: the aspect word T is converted into a word-embedding matrix as shown in formula (2-3), and the features of the aspect word are extracted through convolution and pooling operations as shown in formula (2-4):
T = [T_i, T_{i+1}, ..., T_{i+k}]   (2-3)
v_i = f_relu(T_{i:i+k} * W_v + b_v)   (2-4)
where W_v is a convolution kernel of dimension d×k and b_v is a bias;
two groups of convolution kernels of the same size scan the sentence simultaneously, their results are input into two gate units, and the aspect information and the emotion information are encoded respectively, giving two vectors s_i and a_i; when computing s_i, tanh is used as the activation function, as shown in formula (2-5);
s_i = f_tanh(X_{i:i+k} * W_s + b_s)   (2-5)
where W_s is a convolution kernel of dimension d×k and b_s is a bias;
when computing a_i, the aspect-word embedding vector v_a is added to the input; v_a is obtained from v_i by max pooling, relu is adopted as the activation function as shown in formula (2-6), and a_i is regarded as the aspect feature:
a_i = f_relu(X_{i:i+k} * W_a + V_a v_a + b_a)   (2-6)
after training, through the relu function the model gives a higher weight a_i to emotion words close to the aspect word; conversely, if the two are far apart, the weight may be small or 0; finally the two vectors s_i and a_i are multiplied element-wise, and the result is the final feature vector o_i, as shown in formula (2-7):
o_i = s_i × a_i   (2-7)
o_i is input into a pooling layer for max pooling, the resulting vector is input into a fully connected layer, a Softmax classifier gives the probability of each class, and the class is judged according to the probability;
Step S2 is as follows:
in the first step, a neural network model is trained with the labeled data of the source domain; each word in a sentence X is converted into a d-dimensional word embedding, the maximum sentence length is fixed at L, sentences shorter than L are padded with 0 and sentences longer than L are truncated, so that each sentence has L words and the sentence X is expressed as a d×L matrix, as shown in formula (2-8):
X_s ∈ R^{d×L}   (2-8)
the aspect word is likewise expressed as a d×L matrix, as shown in formula (2-9):
T_s ∈ R^{d×L}   (2-9)
the sentence and the aspect word are input into the convolution layer respectively, and the convolution layer extracts the features in the sentence; the size of the convolution kernel W is set to d×k (k < L), the kernel performs a unidirectional translation scan over the sentence matrix and the aspect-word matrix respectively, where k is the number of words covered by each scan, and after scanning the convolution results c_i and v_i are obtained, as shown in formulas (2-10) and (2-11):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-10)
v_i = f(T_{i:i+k} * W_v + b_v)   (2-11)
where b_c and b_v are biases, f is the activation function, and * denotes the convolution operation;
in the second step, v_a, obtained from v_i after a max pooling operation, and c_i are sent to the gating unit, which matches and fuses the aspect information with the emotion information to obtain a group of emotion vectors O_s, as shown in formula (2-12):
O_s = [o_1, o_2, ..., o_{l_k}]   (2-12)
in the third step, against the overfitting that occurs during model training, dropout is used to improve the structural performance of the neural network; max pooling is selected, taking the maximum of the feature values as the main feature, as shown in formula (2-13):
max(O_s) = (max o_1, max o_2, ..., max o_{l_k})   (2-13)
in the fourth step, the extracted features are input into a fully connected layer, which obtains the probability of each class with a softmax classifier and judges the class according to the probability, as shown in formulas (2-14) and (2-15):
in the fifth step, after the classification result of the source domain is obtained, the model is fine-tuned with a small portion of labeled target-domain data: the convolution layer keeps the kernel weights trained on the source domain and applies a forward propagation algorithm to obtain the feature map, the weights of the fully connected layer are fine-tuned by stochastic gradient descent, and emotion classification is then performed on the target domain to obtain the final classification result, as shown in formulas (2-16) and (2-17):
the beneficial effects are that: according to the method, the context characteristics and sentence characteristics are fused, the aspect-level emotion classification model based on the convolutional neural network is constructed, the model trained in the source field is transferred to the target field, and aspect-level emotion analysis is performed on data in the target field.
Drawings
FIG. 1 is the aspect-level emotion analysis model.
FIG. 2 is a diagram of the model framework.
FIG. 3 shows the accuracy experimental results on the Chinese corpus.
FIG. 4 shows the F1-value experimental results on the Chinese corpus.
FIG. 5 shows the accuracy experimental results on the English corpus.
FIG. 6 shows the F1-value experimental results on the English corpus.
Detailed Description
1 summary of the invention
In recent years, aspect-level emotion analysis has attracted more and more scholars' attention, but aspect-level cross-domain emotion analysis suffers from a lack of labeled data, making good classification results hard to obtain. This work fuses context features and sentence features, constructs an aspect-level emotion classification model based on a convolutional neural network, transfers the model trained in the source domain to the target domain, and performs aspect-level emotion analysis on the data of the target domain. Chinese and English corpora suitable for aspect-level cross-domain emotion analysis were manually labeled; experimental results on these corpora show that in the cross-domain setting the best F1 value reaches 92.19% on the Chinese dataset and 81.57% on the English dataset, so the CNN-based aspect-level cross-domain emotion analysis method can effectively improve the emotion classification accuracy in the target domain. To reduce the model's dependence on large amounts of labeled data, the invention studies aspect-level emotion analysis across domains; its main contributions are as follows:
(1) Chinese and English aspect-level cross-domain emotion analysis corpora are labeled. Cross-domain research on aspect-level emotion analysis is currently scarce, and the publicly available aspect-level emotion analysis datasets cannot meet the needs of this experiment. Two sentence-level emotion transfer-learning corpora are therefore selected; the different aspects in each sentence are extracted and, combined with the semantic information, labeled with the corresponding emotion labels, manually producing corpora suitable for cross-domain aspect-level emotion analysis tasks.
(2) A cross-domain model based on aspect-level emotion analysis is presented. A CNN-based aspect-level emotion analysis method is explored, a transfer-learning model is built on top of it, and the classification performance of the model in different domains is verified through experiments, showing that the proposed method has good generalization capability.
2 Description of the method
2.1 CNN-based aspect-level emotion analysis
Convolutional Neural Networks (CNNs) have made tremendous progress in the field of Natural Language Processing (NLP). A CNN consists mainly of an input layer, a convolution layer, a pooling layer, and a fully connected layer. When a sentence containing multiple emotion aspects is processed, a plain CNN cannot distinguish which entity the emotion words in the current scanning area describe, so a gated activation unit is added on top of the CNN. The model structure is shown in FIG. 1.
The specific design steps are as follows:
the input of the model is divided into two parts, namely an aspect word and a context, and the corresponding convolution process also comprises the two parts.
The context X contains l words, each word is converted into a word vector in d dimensions, and the sentence X can be expressed as a matrix in d X l dimensions.
A convolution kernel W_c of dimension d×k (k < l) performs a unidirectional translation scan over the context matrix, where k is the number of words covered by each scan of the kernel. Each scan yields a convolution result c_i, as shown in formula (2-1).
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-1)
where b_c is the bias, f is the activation function, and * denotes the convolution operation; after the sentence has been scanned, a vector c is obtained, as shown in formula (2-2).
c = [c_1, c_2, ..., c_{l_k}]   (2-2)
where l_k is the length of vector c (for a stride-1 scan, l_k = l − k + 1). In the experiments n_k convolution kernels of width k are set; when all sentences have been scanned, an n_k × l_k matrix is obtained, which is processed by max pooling, i.e. the maximum of each row is taken, so that the sentence is represented by an n_k-dimensional vector.
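As a minimal sketch of the scan in formulas (2-1) and (2-2), assuming a stride-1 "valid" convolution and PyTorch as the framework (the concrete values of d, l, k and n_k below are illustrative, not the settings of Table 3.9):

```python
import torch
import torch.nn as nn

d, l, k, n_k = 300, 40, 3, 100             # embedding dim, sentence length, kernel width, kernel count

X = torch.randn(1, d, l)                   # one sentence as a d x l matrix (batch, channels, positions)

conv = nn.Conv1d(in_channels=d, out_channels=n_k, kernel_size=k)
c = torch.relu(conv(X))                    # formula (2-1): c_i = f(X_{i:i+k-1} * W_c + b_c)
print(c.shape)                             # (1, n_k, l - k + 1): one row per kernel, l_k columns

sentence_vec = torch.max(c, dim=2).values  # max of each row -> n_k-dimensional sentence vector
print(sentence_vec.shape)                  # (1, n_k)
```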
Since the aspect word T may consist of one or more words, a small CNN is added in the experiment: T is converted into a word-embedding matrix as shown in formula (2-3), and the features of the aspect word are extracted through convolution and pooling operations as shown in formula (2-4).
T = [T_i, T_{i+1}, ..., T_{i+k}]   (2-3)
v_i = f_relu(T_{i:i+k} * W_v + b_v)   (2-4)
where W_v is a convolution kernel of dimension d×k and b_v is a bias.
The experiment sets two groups of convolution kernels of the same size that scan the sentence simultaneously; the results are input into two gate units, which encode the aspect information and the emotion information respectively, giving two vectors s_i and a_i.
When computing s_i, tanh is used as the activation function, as shown in formula (2-5);
s_i = f_tanh(X_{i:i+k} * W_s + b_s)   (2-5)
where W_s is a convolution kernel of dimension d×k and b_s is a bias.
When computing a_i, the aspect-word embedding vector v_a is added to the input; v_a is obtained from v_i by max pooling, and relu is used as the activation function, as shown in formula (2-6); a_i can thus be regarded as the aspect feature.
a_i = f_relu(X_{i:i+k} * W_a + V_a v_a + b_a)   (2-6)
After training, through the relu function the model gives a higher weight a_i to emotion words close to the aspect word; conversely, if the two are far apart, the weight may be small or 0. Finally, the two vectors s_i and a_i are multiplied element-wise, and the result is the final feature vector o_i, as shown in formula (2-7).
o_i = s_i × a_i   (2-7)
o_i is input into a pooling layer for max pooling; the resulting vector is input into a fully connected layer, a Softmax classifier gives the probability of each class, and the class is judged according to the probability.
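The gated structure of formulas (2-3) to (2-7) can be sketched as follows. This is a non-authoritative PyTorch re-implementation in the spirit of gated tanh-relu units; in particular, applying V_a as a learned linear map on the pooled aspect vector v_a is an assumption about how the aspect embedding enters formula (2-6):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedAspectCNN(nn.Module):
    """Sketch of the gated convolution of formulas (2-3) to (2-7)."""
    def __init__(self, d, k, n_k, n_classes):
        super().__init__()
        self.conv_s = nn.Conv1d(d, n_k, k)   # W_s: emotion branch, tanh gate, formula (2-5)
        self.conv_a = nn.Conv1d(d, n_k, k)   # W_a: aspect branch, relu gate, formula (2-6)
        self.conv_v = nn.Conv1d(d, n_k, k)   # W_v: small CNN over the aspect word(s), formula (2-4)
        self.V_a = nn.Linear(n_k, n_k)       # maps the pooled aspect vector v_a into the gate
        self.fc = nn.Linear(n_k, n_classes)

    def forward(self, X, T):
        # X: (batch, d, l) context matrix; T: (batch, d, len_T) aspect matrix, padded so len_T >= k
        v = F.relu(self.conv_v(T))               # formula (2-4)
        v_a = torch.max(v, dim=2).values         # max pooling over aspect positions -> v_a
        s = torch.tanh(self.conv_s(X))           # formula (2-5)
        a = F.relu(self.conv_a(X) + self.V_a(v_a).unsqueeze(2))  # formula (2-6), broadcast over positions
        o = s * a                                # formula (2-7): element-wise gating
        o = torch.max(o, dim=2).values           # max pooling over sentence positions
        return self.fc(o)                        # class scores; softmax is applied in the loss
```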
2.2 Cross-Domain emotion analysis
Transfer learning is a branch of machine learning; it does not require the training data to share the same feature space or the same marginal probability distribution, which relaxes the assumptions ordinary machine learning requires. We pre-train the network model on a larger labeled dataset and then use it as an initialization model for tasks in other domains. In our model, after the aspect information and the context information have been extracted into features by convolution, the features are sent to the gated activation unit for selection: emotion features with low similarity to the aspect features are blocked at the gate, while otherwise their scale is correspondingly enlarged. The features are fused at the gating unit, and finally the emotion tendency is predicted through the fully connected layer.
The specific steps are as follows:
In the first step, a neural network model is trained with the labeled data of the source domain. Each word in a sentence X is mapped to a d-dimensional word embedding, and the maximum sentence length is fixed at L (shorter sentences are padded with 0, longer ones are truncated), so that each sentence has L words and the sentence X can be expressed as a d×L matrix, as shown in formula (2-8).
X_s ∈ R^{d×L}   (2-8)
Similarly, the aspect word is also expressed as a d×L matrix, as shown in formula (2-9).
T_s ∈ R^{d×L}   (2-9)
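A sketch of this first-step preprocessing, assuming a pre-built word-to-index vocabulary and a d-dimensional embedding table (both names are illustrative):

```python
import numpy as np

def to_matrix(tokens, vocab, emb, L):
    """Map a tokenized sentence to a d x L matrix as in formula (2-8)."""
    d = emb.shape[1]
    X = np.zeros((d, L))                    # positions beyond the sentence stay zero-padded
    for i, tok in enumerate(tokens[:L]):    # truncate anything past the fixed length L
        X[:, i] = emb[vocab.get(tok, 0)]    # index 0 assumed reserved for unknown words
    return X
```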
The sentence and the aspect word are input into the convolution layer respectively, and the convolution layer extracts the features in the sentence. The size of the convolution kernel W is set to d×k (k < L), and the kernel performs a unidirectional translation scan over the sentence matrix and the aspect-word matrix respectively, where k is the number of words covered by each scan. After scanning, the convolution results c_i and v_i are obtained, as shown in formulas (2-10) and (2-11).
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-10)
v_i = f(T_{i:i+k} * W_v + b_v)   (2-11)
where b_c and b_v are biases, f is the activation function, and * denotes the convolution operation.
In the second step, v_a, obtained from v_i after a max pooling operation, and c_i are sent to the gating unit, which matches and fuses the aspect information with the emotion information as described in Section 2.1, finally obtaining a group of emotion vectors O_s, as shown in formula (2-12).
O_s = [o_1, o_2, ..., o_{l_k}]   (2-12)
In the third step, against the overfitting that may occur during model training, dropout is used to improve the structural performance of the neural network. Max pooling is selected, taking the maximum of the feature values as the main feature, as shown in formula (2-13).
max(O_s) = (max o_1, max o_2, ..., max o_{l_k})   (2-13)
In the fourth step, the extracted features are input into the fully connected layer, which obtains the probability of each class with a softmax classifier and judges the class according to the probability, as shown in formulas (2-14) and (2-15).
In the fifth step, after the classification result of the source domain is obtained, the model is fine-tuned with a small portion of labeled target-domain data. The convolution layer keeps the kernel weights trained on the source domain and applies a forward propagation algorithm to obtain the feature map; the weights of the fully connected layer are fine-tuned by stochastic gradient descent; emotion classification is then performed on the target domain, giving the final classification result, as shown in formulas (2-16) and (2-17).
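The fifth step can be sketched as follows, reusing the GatedAspectCNN sketch above (the checkpoint name and the target_loader yielding (X, T, y) batches are placeholders; following the text, only the fully connected layer is updated, while the convolution kernels keep their source-domain weights):

```python
import torch

model = GatedAspectCNN(d=300, k=3, n_k=100, n_classes=2)
model.load_state_dict(torch.load("source_domain.pt"))    # hypothetical source-domain checkpoint

for name, p in model.named_parameters():
    p.requires_grad = name.startswith("fc")              # freeze every layer except the classifier

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01)
criterion = torch.nn.CrossEntropyLoss()

for X, T, y in target_loader:                            # small labeled target-domain subset (fraction m)
    optimizer.zero_grad()
    loss = criterion(model(X, T), y)                     # forward pass through the frozen feature extractor
    loss.backward()
    optimizer.step()                                     # stochastic gradient descent on the fc weights only
```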
3 experimental results and analysis
3.1 corpus labeling
Because the existing emotion analysis corpora cannot fully meet the requirements of this research, the experiment selects conventional Chinese and English transfer-learning corpora for manual labeling and creates datasets suitable for cross-domain aspect-level emotion analysis tasks. Specifically, the aspect information and emotion information in public sentence-level emotion analysis datasets are analyzed, the aspect words are extracted, and the emotion expressed toward each aspect in the sentence is labeled.
3.1.1 Chinese corpus labeling
The Chinese corpus is the Chinese review text dataset compiled by Tan Songbo and other scholars, drawn from JD.com computer product reviews, book reviews, and travel-site hotel reviews, with 2000 positive and 2000 negative texts in each domain, 12000 in total. Some example review sentences are shown in Table 3.1.
Table 3.1 Tan Songbo corpus example
Analysis of the corpus shows that each comment sentence involves one or more aspects, and the emotion tendencies toward different aspects are not necessarily the same, so the emotion toward each aspect in every comment sentence is labeled. For example, the sentence "The hotel's service is too poor. The geographic location is very good." can be labeled as emotion data for two different aspects: for the "service" aspect the corresponding emotion tendency is negative, and for the "geographic location" aspect it is positive. Examples of labeled sentences are shown in Table 3.2.
TABLE 3.2 Chinese corpus post-labeling data examples
Each manually labeled comment sentence is divided into three parts: sentence, aspect, and emotion tendency. A sentence is copied once for each aspect appearing in the original comment, and each copy is labeled with one aspect and its corresponding emotion tendency; some of the extracted aspect words are shown in Table 3.3. The collated labeled data total 19500 items, as shown in Table 3.4.
TABLE 3.3 Partial aspect words extracted after labeling the Chinese corpus
TABLE 3.4 Statistics after labeling the Chinese corpus
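The duplication rule can be illustrated with a small sketch (the example follows Table 3.2; the data layout is an assumption for illustration):

```python
# One raw comment annotated with two aspects becomes two labeled rows.
raw = {"sentence": "The hotel's service is too poor. The geographic location is very good.",
       "aspects": [("service", "negative"), ("geographic location", "positive")]}

rows = [(raw["sentence"], aspect, polarity)   # the sentence is copied once per aspect
        for aspect, polarity in raw["aspects"]]
# -> [(sentence, "service", "negative"), (sentence, "geographic location", "positive")]
```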
3.1.2 English corpus labeling
The English corpus uses the publicly available Amazon review corpus, which is divided into four domains: Book, DVD, Electronics, and Kitchen. The data of each of the four domains comprise 1000 positive and 1000 negative reviews, 8000 items in total. Some example review sentences are shown in Table 3.5.
Table 3.5 amazon corpus example
Similarly, each manually labeled comment sentence is divided into three parts: sentence, aspect, and emotion tendency; examples of the labeled data are shown in Table 3.6.
Some of the aspect words extracted after labeling are shown in Table 3.7. The finally labeled data total 9090 items, and the collated statistics are shown in Table 3.8.
TABLE 3.6 English corpus post-labeling data example
TABLE 3.7 partial aspect words extracted after labeling English corpus
TABLE 3.8 statistics after labeling English corpus
3.2 Experimental parameter setting
In the experiments, word vectors are built with words as the basic unit: the text is segmented with the jieba tool and corresponding Word2vec word vectors are constructed. The specific hyper-parameter settings of the convolutional neural network are shown in Table 3.9, where the hyper-parameter m is the amount of labeled target-domain data used to fine-tune the model.
Table 3.9 experimental parameter settings
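A minimal sketch of this preprocessing, assuming jieba and gensim as the tools (the hyper-parameter values here are placeholders, not the settings of Table 3.9):

```python
import jieba
from gensim.models import Word2Vec

texts = ["酒店的服务太差了。地理位置很好。"]            # raw review sentences
tokenized = [jieba.lcut(t) for t in texts]             # word segmentation with jieba

w2v = Word2Vec(sentences=tokenized, vector_size=300,   # d-dimensional word vectors
               window=5, min_count=1, sg=1)            # skip-gram; all values illustrative
vec = w2v.wv["服务"]                                    # the 300-dimensional vector of one word
```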
3.3 experimental results and analysis
The experiments use accuracy (Acc) and the F1 value as evaluation indexes.
The calculation formula of accuracy (Acc) is shown in formula (3-1):
Acc = (1/N) · Σ_{i=1}^{N} I(ŷ_i = y_i)   (3-1)
where ŷ_i denotes the predicted label of data sample i, y_i denotes its actual label, I(·) is the indicator function, and N is the size of the test set.
The F1 value balances the recall and precision indexes; its calculation is based on the confusion matrix shown in Table 3.10.
TABLE 3.10 Confusion matrix
Precision (P), also called the precision rate, characterizes the proportion of true positive samples among all results predicted as positive, as shown in formula (3-2):
P = TP / (TP + FP)   (3-2)
Recall (R), also called the recall rate, characterizes the proportion of true positive samples found by the classifier, as shown in formula (3-3):
R = TP / (TP + FN)   (3-3)
The F1 value jointly considers the precision and recall of the classification model and can be regarded as their harmonic mean. It is high only when both precision and recall are high, lies between 0 and 1, and larger values indicate better model performance. The calculation formula is shown in formula (3-4):
F1 = 2PR / (P + R)   (3-4)
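The same quantities can be computed with scikit-learn as a convenience (a sketch with toy labels, not the experimental data):

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]   # actual labels y_i
y_pred = [1, 0, 0, 1, 0, 1]   # predicted labels

acc = accuracy_score(y_true, y_pred)    # formula (3-1)
p = precision_score(y_true, y_pred)     # formula (3-2): TP / (TP + FP)
r = recall_score(y_true, y_pred)        # formula (3-3): TP / (TP + FN)
f1 = f1_score(y_true, y_pred)           # formula (3-4): 2PR / (P + R)
```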
To show the influence of target-domain samples on the transfer effect, part of the labeled target-domain data is extracted for model training. In the experiments on the Chinese dataset, m = 0 means the model trained on the source domain is transferred directly to the target domain; m = 0.05 means 5% of the target-domain data are randomly extracted to retrain the model and adjust the network parameters, and likewise for m = 0.1, 0.2, 0.5. Accuracy and the F1 value are chosen as test indexes, using 10-fold cross-validation.
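Drawing the fine-tuning subset can be sketched as follows (target_data is a placeholder list of labeled target-domain examples; the seeded sampling is an assumption made for reproducibility):

```python
import random

def finetune_split(target_data, m, seed=42):
    """Randomly draw a fraction m of labeled target-domain data (m = 0 means direct transfer)."""
    rng = random.Random(seed)
    n = int(m * len(target_data))
    chosen = set(rng.sample(range(len(target_data)), n))
    finetune = [x for i, x in enumerate(target_data) if i in chosen]
    held_out = [x for i, x in enumerate(target_data) if i not in chosen]
    return finetune, held_out
```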
3.3.1 Chinese corpus experimental results and analysis
The accuracy experimental results on the Chinese corpus are shown in FIG. 3, where C denotes the Computer-domain dataset, B the Book-domain dataset, and H the Hotel-domain dataset; C→B means the source domain is Computer and the target domain is Book, and so on.
As can be seen from FIG. 3, when the convolutional neural network model with the gating unit is used for transfer, the transfer from the Book dataset to the Computer dataset works best, with accuracy reaching 93.4%. As the training data in the target domain increase, the accuracy improves for most datasets, and the largest gain typically occurs when the target-domain sample fraction grows from 0 to 0.05.
As shown in FIG. 4, the transfer from the Book dataset to the Computer dataset is also the best by F1 value, reaching 92.19%. As the training data in the target domain increase, the F1 value rises for most datasets. The performance of the model is expected to grow with the target-domain data, but FIG. 4 shows that the gain is greatest when the target-domain fraction increases from 0 to 0.05; thereafter the performance fluctuates slightly as more target-domain data are added, reaching its best with the maximum amount of target-domain data. Therefore, fine-tuning the model with a small proportion of target-domain data already improves the experimental results markedly while greatly reducing the time and cost of manual labeling.
3.3.2 English corpus experimental results and analysis
The accuracy experimental results on the English corpus are shown in FIG. 5, where B denotes the Book-domain dataset, D the DVD-domain dataset, E the Electronics-domain dataset, and K the Kitchen-domain dataset; B→D means the source domain is Book and the target domain is DVD, and so on.
From FIG. 5 it can be seen that for most dataset pairs the accuracy increases with the training data in the target domain; the best result, an accuracy of 82.45%, is obtained with the Book dataset as the source domain and the Electronics dataset as the target domain.
The F1 experimental results on the English corpus are shown in FIG. 6: the F1 value increases with the training data in the target domain, and the transfer effect is best with the Book dataset as the source domain and the Electronics dataset as the target domain, where the F1 value reaches 81.57%.
In general, the accuracy and the F1 value improve as the target-domain data increase; the model's performance fluctuates slightly during the experiments, but it is best when the target-domain data are at their maximum.
The invention labels an aspect-level emotion transfer-learning corpus, providing an experimental dataset that meets the requirements of this work as well as corpus support for future related research. For cross-domain aspect-level emotion analysis, the invention explores a CNN-based aspect-level emotion analysis model and applies the idea of transfer learning, transferring a model trained on the source domain to the target domain; this addresses the difficulty of obtaining good classification results when the target domain has little labeled data, and experiments prove that the model classifies well on the datasets provided here. In future work, more transfer approaches can be used to refine the model, and its generalization performance can be further verified on more large-scale cross-domain datasets.
While the invention has been described with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (1)

1. A CNN-based aspect-level cross-domain emotion analysis method, characterized by comprising the following steps:
S1, constructing an aspect-level emotion analysis model;
S2, performing aspect-level cross-domain emotion analysis;
Step S1 is as follows:
the input of the aspect-level emotion analysis model is divided into two parts, the aspect word and the context, and the corresponding convolution process likewise comprises two parts; the context X contains l words, each word is converted into a d-dimensional word vector, and the sentence X is expressed as a d×l matrix; a convolution kernel W_c of dimension d×k (k < l) performs a unidirectional translation scan over the context matrix, where k is the number of words covered by each scan of the kernel, and each scan yields a convolution result c_i, as shown in formula (2-1):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-1)
where b_c is a bias, f is an activation function, and * denotes the convolution operation, so that after the sentence has been scanned a vector c is obtained, as shown in formula (2-2):
c = [c_1, c_2, ..., c_{l_k}]   (2-2)
where l_k denotes the length of vector c (for a stride-1 scan, l_k = l − k + 1); n_k convolution kernels of width k are set in the experiments, and when all sentences have been scanned an n_k × l_k matrix is obtained, which is processed by max pooling, i.e. the maximum of each row is taken, so that the sentence is represented by an n_k-dimensional vector;
since the aspect word T may consist of one or more words, a small CNN is added: the aspect word T is converted into a word-embedding matrix as shown in formula (2-3), and the features of the aspect word are extracted through convolution and pooling operations as shown in formula (2-4):
T = [T_i, T_{i+1}, ..., T_{i+k}]   (2-3)
v_i = f_relu(T_{i:i+k} * W_v + b_v)   (2-4)
where W_v is a convolution kernel of dimension d×k and b_v is a bias;
two groups of convolution kernels of the same size scan the sentence simultaneously, their results are input into two gate units, and the aspect information and the emotion information are encoded respectively, giving two vectors s_i and a_i; when computing s_i, tanh is used as the activation function, as shown in formula (2-5);
s_i = f_tanh(X_{i:i+k} * W_s + b_s)   (2-5)
where W_s is a convolution kernel of dimension d×k and b_s is a bias;
when computing a_i, the aspect-word embedding vector v_a is added to the input; v_a is obtained from v_i by max pooling, relu is adopted as the activation function as shown in formula (2-6), and a_i is regarded as the aspect feature:
a_i = f_relu(X_{i:i+k} * W_a + V_a v_a + b_a)   (2-6)
after training, through the relu function the model gives a higher weight a_i to emotion words close to the aspect word; conversely, if the two are far apart, the weight may be small or 0; finally the two vectors s_i and a_i are multiplied element-wise, and the result is the final feature vector o_i, as shown in formula (2-7):
o_i = s_i × a_i   (2-7)
o_i is input into a pooling layer for max pooling, the resulting vector is input into a fully connected layer, a Softmax classifier gives the probability of each class, and the class is judged according to the probability;
Step S2 is as follows:
in the first step, a neural network model is trained with the labeled data of the source domain; each word in a sentence X is converted into a d-dimensional word embedding, the maximum sentence length is fixed at L, sentences shorter than L are padded with 0 and sentences longer than L are truncated, so that each sentence has L words and the sentence X is expressed as a d×L matrix, as shown in formula (2-8):
X_s ∈ R^{d×L}   (2-8)
the aspect word is likewise expressed as a d×L matrix, as shown in formula (2-9):
T_s ∈ R^{d×L}   (2-9)
the sentence and the aspect word are input into the convolution layer respectively, and the convolution layer extracts the features in the sentence; the size of the convolution kernel W is set to d×k with k smaller than L, the kernel performs a unidirectional translation scan over the sentence matrix and the aspect-word matrix respectively, where k is the number of words covered by each scan, and after scanning the convolution results c_i and v_i are obtained, as shown in formulas (2-10) and (2-11):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-10)
v_i = f(T_{i:i+k} * W_v + b_v)   (2-11)
where b_c and b_v are biases, f is the activation function, and * denotes the convolution operation;
in the second step, v_a, obtained from v_i after a max pooling operation, and c_i are sent to the gating unit, which matches and fuses the aspect information with the emotion information to obtain a group of emotion vectors O_s, as shown in formula (2-12):
O_s = [o_1, o_2, ..., o_{l_k}]   (2-12)
in the third step, against the overfitting that occurs during model training, dropout is used to improve the structural performance of the neural network; max pooling is selected, taking the maximum of the feature values as the main feature, as shown in formula (2-13):
max(O_s) = (max o_1, max o_2, ..., max o_{l_k})   (2-13)
in the fourth step, the extracted features are input into a fully connected layer, which obtains the probability of each class with a softmax classifier and judges the class according to the probability, as shown in formulas (2-14) and (2-15):
in the fifth step, after the classification result of the source domain is obtained, the model is fine-tuned with a small portion of labeled target-domain data: the convolution layer keeps the kernel weights trained on the source domain and applies a forward propagation algorithm to obtain the feature map, the weights of the fully connected layer are fine-tuned by stochastic gradient descent, and emotion classification is then performed on the target domain to obtain the final classification result, as shown in formulas (2-16) and (2-17):
CN202011026500.4A 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method Active CN112163091B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011026500.4A CN112163091B (en) 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011026500.4A CN112163091B (en) 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method

Publications (2)

Publication Number Publication Date
CN112163091A CN112163091A (en) 2021-01-01
CN112163091B true CN112163091B (en) 2023-08-22

Family

ID=73864233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011026500.4A Active CN112163091B (en) 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method

Country Status (1)

Country Link
CN (1) CN112163091B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204645B (en) * 2021-04-01 2023-05-16 Wuhan University Knowledge-guided aspect-level emotion analysis model training method
CN113128229B (en) * 2021-04-14 2023-07-18 Hohai University Chinese entity relation joint extraction method
CN113468292B (en) * 2021-06-29 2024-06-25 *** Co., Ltd. Aspect-level emotion analysis method, device and computer-readable storage medium
CN113627550A (en) * 2021-08-17 2021-11-09 Beijing Institute of Computer Technology and Application Image-text emotion analysis method based on multimodal fusion
CN114757183B (en) * 2022-04-11 2024-05-10 Beijing Institute of Technology Cross-domain emotion classification method based on contrastive alignment network


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108614875A (en) * 2018-04-26 2018-10-02 Beijing University of Posts and Telecommunications Chinese emotion tendency classification method based on global average pooling convolutional neural networks
CN108763326A (en) * 2018-05-04 2018-11-06 Nanjing University of Posts and Telecommunications Sentiment analysis model building method based on feature-diversified convolutional neural networks
KR20190136337A (en) * 2018-05-30 2019-12-10 Gachon University Industry-Academic Cooperation Foundation Social Media Contents Based Emotion Analysis Method, System and Computer-readable Medium
CN109753566A (en) * 2019-01-09 2019-05-14 Dalian Minzu University Model training method for cross-domain sentiment analysis based on convolutional neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Multi-source cross-domain sentiment classification based on ensemble deep transfer learning; Zhao Chuanjun; Wang Suge; Li Deyu; Journal of Shanxi University (Natural Science Edition), (04); full text *

Also Published As

Publication number Publication date
CN112163091A (en) 2021-01-01

Similar Documents

Publication Publication Date Title
CN112163091B (en) CNN-based aspect level cross-domain emotion analysis method
CN111160037B (en) Fine-grained emotion analysis method supporting cross-language migration
CN109753566B (en) Model training method for cross-domain emotion analysis based on convolutional neural network
CN110472003B (en) Social network text emotion fine-grained classification method based on graph convolution network
CN108038492A (en) A kind of perceptual term vector and sensibility classification method based on deep learning
CN110502753A (en) Deep learning sentiment analysis model based on semantic enhancement and its analysis method
CN111368086A (en) CNN-BiLSTM + Attention model-based sentiment classification method for case-involved news viewpoint sentences
CN111858935A (en) Fine-grained emotion classification system for flight comment
Al Wazrah et al. Sentiment analysis using stacked gated recurrent unit for Arabic tweets
CN115392259B (en) Microblog text sentiment analysis method and system based on adversarial training fused with BERT
CN111581364B (en) Chinese intelligent question-answer short text similarity calculation method oriented to medical field
CN110851601A (en) Cross-domain emotion classification system and method based on layered attention mechanism
CN111339772B (en) Russian text emotion analysis method, electronic device and storage medium
Al Omari et al. Hybrid CNNs-LSTM deep analyzer for Arabic opinion mining
CN112988970A (en) Text matching algorithm serving intelligent question-answering system
CN115906816A (en) Text emotion analysis method of two-channel Attention model based on Bert
CN115129807A (en) Fine-grained classification method and system for social media topic comments based on self-attention
CN113920379A (en) Zero sample image classification method based on knowledge assistance
CN114238636A (en) Translation matching-based cross-language attribute level emotion classification method
CN113934835A (en) Retrieval type reply dialogue method and system combining keywords and semantic understanding representation
Dang et al. Sentiment analysis for vietnamese–based hybrid deep learning models
He et al. Hierarchical attention and knowledge matching networks with information enhancement for end-to-end task-oriented dialog systems
CN113065350A (en) Biomedical text word sense disambiguation method based on attention neural network
Merayo et al. Social Network Sentiment Analysis Using Hybrid Deep Learning Models
CN115510230A (en) Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant