CN112163091B - CNN-based aspect level cross-domain emotion analysis method - Google Patents

CNN-based aspect level cross-domain emotion analysis method

Info

Publication number
CN112163091B
CN112163091B (application CN202011026500.4A)
Authority
CN
China
Prior art keywords
convolution
formula
word
sentence
emotion
Prior art date
Legal status
Active
Application number
CN202011026500.4A
Other languages
Chinese (zh)
Other versions
CN112163091A (en)
Inventor
孟佳娜 (Meng Jiana)
于玉海 (Yu Yuhai)
吴诗涵 (Wu Shihan)
Current Assignee
Dalian Minzu University
Original Assignee
Dalian Minzu University
Priority date
Filing date
Publication date
Application filed by Dalian Minzu University
Priority to CN202011026500.4A
Publication of CN112163091A
Application granted
Publication of CN112163091B
Status: Active
Anticipated expiration

Classifications

    • G06F16/35 — Information retrieval of unstructured textual data; clustering; classification
    • G06F40/211 — Natural language analysis; syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F40/289 — Recognition of textual entities; phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/30 — Natural language analysis; semantic analysis
    • G06N3/045 — Neural networks; architectures; combinations of networks
    • G06N3/08 — Neural networks; learning methods
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The CNN-based aspect-level cross-domain emotion analysis method belongs to the field of text emotion analysis and aims to solve the problem of obtaining good emotion analysis classification results. The method comprises: S1, constructing an aspect-level emotion analysis model; and S2, performing aspect-level cross-domain emotion analysis. By fusing context features and sentence features, an aspect-level emotion classification model based on a convolutional neural network is constructed, the model trained in the source domain is transferred to the target domain, and aspect-level emotion analysis is performed on the data of the target domain.

Description

CNN-based aspect level cross-domain emotion analysis method
Technical Field
The invention belongs to the field of text emotion analysis, and relates to a CNN-based aspect level cross-domain emotion analysis method.
Background
Emotion analysis has wide application value; it is a challenging task in the field of natural language processing and one of its most active research directions. According to existing studies, emotion analysis can be divided into three levels: document level, sentence level, and aspect level. Document-level and sentence-level emotion analysis are both coarse-grained, while aspect-level emotion analysis is fine-grained and can provide more detailed results than general emotion analysis. Many advanced deep learning methods now exist for aspect-level emotion analysis, but common deep learning models generally depend heavily on large amounts of labeled training data, and manually labeling data costs a great deal of time and money.
Early aspect-level emotion analysis relied mainly on feature engineering to represent sentences; in recent years, deep learning models have performed better on this task. Long short-term memory (LSTM) networks represent sequence information well: Tang et al. use two LSTMs to jointly model a target word and its context, integrating their interrelated information, and Tai et al. propose a tree-LSTM structure that incorporates grammatical features such as dependency relations and phrase constituency, making the semantic representation more accurate. The attention mechanism can effectively improve the emotion classification effect: Ma et al. propose an LSTM structure with a hierarchical attention mechanism that introduces common-sense knowledge of emotion-related concepts into the end-to-end training of deep neural networks, and Ma et al. also propose an interactive attention network that interactively detects the important words of the target and the important words of the context. Memory networks provide long-term, large-scale memory that is easy to read and write: Tang et al. build a memory network from contextual information and capture the information important for emotional tendency in different ways through an attention mechanism, and the RAM model proposed by Peng et al. can capture long-distance emotional features, combining the results of multiple attention steps non-linearly with an RNN to extract more complex features. CNN models are better at extracting n-gram features: the TNet model of Li et al. proposes a feature transformation component that introduces entity information into the semantic representation of words, together with a "context preserving" mechanism that combines context features with the transformed features, and Wei et al. combine a CNN with gating mechanisms so that the model can selectively output emotion features according to the given aspect.
The core idea of transfer learning is to find the similarity between the source domain and the target domain, migrate the model or the labeled data of the source domain to the target domain on the basis of that similarity, and finally retrain using the similarity that exists. Because features differ greatly between domains, many cross-domain approaches start from a feature perspective. Blitzer et al. propose structural correspondence learning, which tries to find a set of pivot features with the same behavior in the source and target domains for alignment. Pan et al. propose spectral feature alignment to align domain-specific words from different domains into unified clusters. Many approaches to the cross-domain problem have also been built on deep neural networks: Glorot et al. use a stacked denoising autoencoder to reconstruct the features of the source and target domains; Chen et al. propose the mSDA (marginalized SDA) algorithm, which retains the strong learning ability of the model without a costly optimization procedure; Yosinski et al. found experimentally that the first few layers of a deep network are better kept fixed for the transfer-learning task, and proposed that fine-tuning can overcome inter-domain data variability well; and Long et al. propose the deep adaptation network (DAN) model, which uses a deep network as the carrier for adaptation transfer.
At present, transfer learning has achieved great success in many fields, such as text mining, speech recognition, computer vision, spam filtering, WiFi positioning, and emotion classification, and it has broad application prospects. Aspect-level emotion analysis can provide finer-grained information than general emotion analysis and has greater research and commercial value. Training an excellent aspect-level emotion analysis model, however, requires a large amount of labeled data, and when the training data are insufficient, differently distributed, or class-imbalanced, the effect of the model drops sharply. Constructing a model and method common to cross-domain emotion analysis is therefore a problem worth studying.
Disclosure of Invention
In order to solve the problem of obtaining good emotion analysis and classification results, the invention provides the following technical scheme: a CNN-based aspect-level cross-domain emotion analysis method, comprising
S1, constructing an aspect-level emotion analysis model, and
S2, performing aspect-level cross-domain emotion analysis.
Step S1 is as follows:
the input of the aspect-level emotion analysis model is divided into two parts, the aspect word and the context, and the corresponding convolution process likewise comprises two parts; the context X contains l words, each word is converted into a d-dimensional word vector, and the sentence X is expressed as a d×l matrix; a convolution kernel W_c of dimension d×k (k < l) performs a unidirectional translation scan over the context matrix, where k is the number of words covered by each scan of the kernel, and each scan yields a convolution result c_i, as shown in formula (2-1):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-1)
where b_c is a bias, f is an activation function, and * denotes the convolution operation, so that after the sentence has been scanned a vector c is obtained, as shown in formula (2-2):
c = [c_1, c_2, ..., c_{l_k}]   (2-2)
where l_k denotes the length of vector c (for a stride-1 scan, l_k = l − k + 1). In the experiments n_k convolution kernels of width k are set; when all sentences have been scanned, an n_k × l_k matrix is obtained, which is processed by max pooling, i.e. the maximum of each row is taken, so that the sentence is represented by an n_k-dimensional vector;
since the aspect word T may consist of one or more words, a small CNN is added: the aspect word T is converted into a word-embedding matrix as shown in formula (2-3), and the features of the aspect word are extracted through convolution and pooling operations as shown in formula (2-4):
T = [T_i, T_{i+1}, ..., T_{i+k}]   (2-3)
v_i = f_relu(T_{i:i+k} * W_v + b_v)   (2-4)
where W_v is a convolution kernel of dimension d×k and b_v is a bias;
two groups of convolution kernels of the same size scan the sentence simultaneously, their results are input into two gate units, and the aspect information and the emotion information are encoded respectively, giving two vectors s_i and a_i; when computing s_i, tanh is used as the activation function, as shown in formula (2-5);
s_i = f_tanh(X_{i:i+k} * W_s + b_s)   (2-5)
where W_s is a convolution kernel of dimension d×k and b_s is a bias;
when computing a_i, the aspect-word embedding vector v_a is added to the input; v_a is obtained from v_i by max pooling, relu is adopted as the activation function as shown in formula (2-6), and a_i is regarded as the aspect feature:
a_i = f_relu(X_{i:i+k} * W_a + V_a v_a + b_a)   (2-6)
after training, through the relu function the model gives a higher weight a_i to emotion words close to the aspect word; conversely, if the two are far apart, the weight may be small or 0; finally the two vectors s_i and a_i are multiplied element-wise, and the result is the final feature vector o_i, as shown in formula (2-7):
o_i = s_i × a_i   (2-7)
o_i is input into a pooling layer for max pooling, the resulting vector is input into a fully connected layer, a Softmax classifier gives the probability of each class, and the class is judged according to the probability;
Step S2 is as follows:
in the first step, a neural network model is trained with the labeled data of the source domain; each word in a sentence X is converted into a d-dimensional word embedding, the maximum sentence length is fixed at L, sentences shorter than L are padded with 0 and sentences longer than L are truncated, so that each sentence has L words and the sentence X is expressed as a d×L matrix, as shown in formula (2-8):
X_s ∈ R^{d×L}   (2-8)
the aspect word is likewise expressed as a d×L matrix, as shown in formula (2-9):
T_s ∈ R^{d×L}   (2-9)
the sentence and the aspect word are input into the convolution layer respectively, and the convolution layer extracts the features in the sentence; the size of the convolution kernel W is set to d×k (k < L), the kernel performs a unidirectional translation scan over the sentence matrix and the aspect-word matrix respectively, where k is the number of words covered by each scan, and after scanning the convolution results c_i and v_i are obtained, as shown in formulas (2-10) and (2-11):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-10)
v_i = f(T_{i:i+k} * W_v + b_v)   (2-11)
where b_c and b_v are biases, f is the activation function, and * denotes the convolution operation;
in the second step, v_a, obtained from v_i after a max pooling operation, and c_i are sent to the gating unit, which matches and fuses the aspect information with the emotion information to obtain a group of emotion vectors O_s, as shown in formula (2-12):
O_s = [o_1, o_2, ..., o_{l_k}]   (2-12)
in the third step, against the overfitting that occurs during model training, dropout is used to improve the structural performance of the neural network; max pooling is selected, taking the maximum of the feature values as the main feature, as shown in formula (2-13):
max(O_s) = (max o_1, max o_2, ..., max o_{l_k})   (2-13)
in the fourth step, the extracted features are input into a fully connected layer, which obtains the probability of each class with a softmax classifier and judges the class according to the probability, as shown in formulas (2-14) and (2-15):
in the fifth step, after the classification result of the source domain is obtained, the model is fine-tuned with a small portion of labeled target-domain data: the convolution layer keeps the kernel weights trained on the source domain and applies a forward propagation algorithm to obtain the feature map, the weights of the fully connected layer are fine-tuned by stochastic gradient descent, and emotion classification is then performed on the target domain to obtain the final classification result, as shown in formulas (2-16) and (2-17):
the beneficial effects are that: according to the method, the context characteristics and sentence characteristics are fused, the aspect-level emotion classification model based on the convolutional neural network is constructed, the model trained in the source field is transferred to the target field, and aspect-level emotion analysis is performed on data in the target field.
Drawings
FIG. 1 is the aspect-level emotion analysis model.
FIG. 2 is a diagram of the model framework.
FIG. 3 shows the accuracy experimental results on the Chinese corpus.
FIG. 4 shows the F1-value experimental results on the Chinese corpus.
FIG. 5 shows the accuracy experimental results on the English corpus.
FIG. 6 shows the F1-value experimental results on the English corpus.
Detailed Description
1 summary of the invention
In recent years, aspect-level emotion analysis has attracted more and more scholars' attention, but aspect-level cross-domain emotion analysis suffers from a lack of labeled data, making good classification results hard to obtain. This work fuses context features and sentence features, constructs an aspect-level emotion classification model based on a convolutional neural network, transfers the model trained in the source domain to the target domain, and performs aspect-level emotion analysis on the data of the target domain. Chinese and English corpora suitable for aspect-level cross-domain emotion analysis were manually labeled; experimental results on these corpora show that in the cross-domain setting the best F1 value reaches 92.19% on the Chinese dataset and 81.57% on the English dataset, so the CNN-based aspect-level cross-domain emotion analysis method can effectively improve the emotion classification accuracy in the target domain. To reduce the model's dependence on large amounts of labeled data, the invention studies aspect-level emotion analysis across domains; its main contributions are as follows:
(1) Chinese and English aspect-level cross-domain emotion analysis corpora are labeled. Cross-domain research on aspect-level emotion analysis is currently scarce, and the publicly available aspect-level emotion analysis datasets cannot meet the needs of this experiment. Two sentence-level emotion transfer-learning corpora are therefore selected; the different aspects in each sentence are extracted and, combined with the semantic information, labeled with the corresponding emotion labels, manually producing corpora suitable for cross-domain aspect-level emotion analysis tasks.
(2) A cross-domain model based on aspect-level emotion analysis is presented. A CNN-based aspect-level emotion analysis method is explored, a transfer-learning model is built on top of it, and the classification performance of the model in different domains is verified through experiments, showing that the proposed method has good generalization capability.
2 Description of the method
2.1 CNN-based aspect-level emotion analysis
Convolutional Neural Networks (CNNs) have made tremendous progress in the field of Natural Language Processing (NLP). A CNN consists mainly of an input layer, a convolution layer, a pooling layer, and a fully connected layer. When a sentence containing multiple emotion aspects is processed, a plain CNN cannot distinguish which entity the emotion words in the current scanning area describe, so a gated activation unit is added on top of the CNN. The model structure is shown in FIG. 1.
The specific design steps are as follows:
the input of the model is divided into two parts, namely an aspect word and a context, and the corresponding convolution process also comprises the two parts.
The context X contains l words, each word is converted into a word vector in d dimensions, and the sentence X can be expressed as a matrix in d X l dimensions.
A convolution kernel W_c of dimension d×k (k < l) performs a unidirectional translation scan over the context matrix, where k is the number of words covered by each scan of the kernel. Each scan yields a convolution result c_i, as shown in formula (2-1).
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-1)
where b_c is the bias, f is the activation function, and * denotes the convolution operation; after the sentence has been scanned, a vector c is obtained, as shown in formula (2-2).
c = [c_1, c_2, ..., c_{l_k}]   (2-2)
where l_k is the length of vector c (for a stride-1 scan, l_k = l − k + 1). In the experiments n_k convolution kernels of width k are set; when all sentences have been scanned, an n_k × l_k matrix is obtained, which is processed by max pooling, i.e. the maximum of each row is taken, so that the sentence is represented by an n_k-dimensional vector.
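As a minimal sketch of the scan in formulas (2-1) and (2-2), assuming a stride-1 "valid" convolution and PyTorch as the framework (the concrete values of d, l, k and n_k below are illustrative, not the settings of Table 3.9):

```python
import torch
import torch.nn as nn

d, l, k, n_k = 300, 40, 3, 100             # embedding dim, sentence length, kernel width, kernel count

X = torch.randn(1, d, l)                   # one sentence as a d x l matrix (batch, channels, positions)

conv = nn.Conv1d(in_channels=d, out_channels=n_k, kernel_size=k)
c = torch.relu(conv(X))                    # formula (2-1): c_i = f(X_{i:i+k-1} * W_c + b_c)
print(c.shape)                             # (1, n_k, l - k + 1): one row per kernel, l_k columns

sentence_vec = torch.max(c, dim=2).values  # max of each row -> n_k-dimensional sentence vector
print(sentence_vec.shape)                  # (1, n_k)
```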
Since the aspect word T may consist of one or more words, a small CNN is added in the experiment: T is converted into a word-embedding matrix as shown in formula (2-3), and the features of the aspect word are extracted through convolution and pooling operations as shown in formula (2-4).
T = [T_i, T_{i+1}, ..., T_{i+k}]   (2-3)
v_i = f_relu(T_{i:i+k} * W_v + b_v)   (2-4)
where W_v is a convolution kernel of dimension d×k and b_v is a bias.
The experiment sets two groups of convolution kernels of the same size that scan the sentence simultaneously; the results are input into two gate units, which encode the aspect information and the emotion information respectively, giving two vectors s_i and a_i.
When computing s_i, tanh is used as the activation function, as shown in formula (2-5);
s_i = f_tanh(X_{i:i+k} * W_s + b_s)   (2-5)
where W_s is a convolution kernel of dimension d×k and b_s is a bias.
When computing a_i, the aspect-word embedding vector v_a is added to the input; v_a is obtained from v_i by max pooling, and relu is used as the activation function, as shown in formula (2-6); a_i can thus be regarded as the aspect feature.
a_i = f_relu(X_{i:i+k} * W_a + V_a v_a + b_a)   (2-6)
After training, through the relu function the model gives a higher weight a_i to emotion words close to the aspect word; conversely, if the two are far apart, the weight may be small or 0. Finally, the two vectors s_i and a_i are multiplied element-wise, and the result is the final feature vector o_i, as shown in formula (2-7).
o_i = s_i × a_i   (2-7)
o_i is input into a pooling layer for max pooling; the resulting vector is input into a fully connected layer, a Softmax classifier gives the probability of each class, and the class is judged according to the probability.
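The gated structure of formulas (2-3) to (2-7) can be sketched as follows. This is a non-authoritative PyTorch re-implementation in the spirit of gated tanh-relu units; in particular, applying V_a as a learned linear map on the pooled aspect vector v_a is an assumption about how the aspect embedding enters formula (2-6):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedAspectCNN(nn.Module):
    """Sketch of the gated convolution of formulas (2-3) to (2-7)."""
    def __init__(self, d, k, n_k, n_classes):
        super().__init__()
        self.conv_s = nn.Conv1d(d, n_k, k)   # W_s: emotion branch, tanh gate, formula (2-5)
        self.conv_a = nn.Conv1d(d, n_k, k)   # W_a: aspect branch, relu gate, formula (2-6)
        self.conv_v = nn.Conv1d(d, n_k, k)   # W_v: small CNN over the aspect word(s), formula (2-4)
        self.V_a = nn.Linear(n_k, n_k)       # maps the pooled aspect vector v_a into the gate
        self.fc = nn.Linear(n_k, n_classes)

    def forward(self, X, T):
        # X: (batch, d, l) context matrix; T: (batch, d, len_T) aspect matrix, padded so len_T >= k
        v = F.relu(self.conv_v(T))               # formula (2-4)
        v_a = torch.max(v, dim=2).values         # max pooling over aspect positions -> v_a
        s = torch.tanh(self.conv_s(X))           # formula (2-5)
        a = F.relu(self.conv_a(X) + self.V_a(v_a).unsqueeze(2))  # formula (2-6), broadcast over positions
        o = s * a                                # formula (2-7): element-wise gating
        o = torch.max(o, dim=2).values           # max pooling over sentence positions
        return self.fc(o)                        # class scores; softmax is applied in the loss
```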
2.2 Cross-Domain emotion analysis
Transfer learning is a branch of machine learning; it does not require the training data to share the same feature space or the same marginal probability distribution, which relaxes the assumptions ordinary machine learning requires. We pre-train the network model on a larger labeled dataset and then use it as an initialization model for tasks in other domains. In our model, after the aspect information and the context information have been extracted into features by convolution, the features are sent to the gated activation unit for selection: emotion features with low similarity to the aspect features are blocked at the gate, while otherwise their scale is correspondingly enlarged. The features are fused at the gating unit, and finally the emotion tendency is predicted through the fully connected layer.
The specific steps are as follows:
In the first step, a neural network model is trained with the labeled data of the source domain. Each word in a sentence X is mapped to a d-dimensional word embedding, and the maximum sentence length is fixed at L (shorter sentences are padded with 0, longer ones are truncated), so that each sentence has L words and the sentence X can be expressed as a d×L matrix, as shown in formula (2-8).
X_s ∈ R^{d×L}   (2-8)
Similarly, the aspect word is also expressed as a d×L matrix, as shown in formula (2-9).
T_s ∈ R^{d×L}   (2-9)
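A sketch of this first-step preprocessing, assuming a pre-built word-to-index vocabulary and a d-dimensional embedding table (both names are illustrative):

```python
import numpy as np

def to_matrix(tokens, vocab, emb, L):
    """Map a tokenized sentence to a d x L matrix as in formula (2-8)."""
    d = emb.shape[1]
    X = np.zeros((d, L))                    # positions beyond the sentence stay zero-padded
    for i, tok in enumerate(tokens[:L]):    # truncate anything past the fixed length L
        X[:, i] = emb[vocab.get(tok, 0)]    # index 0 assumed reserved for unknown words
    return X
```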
The sentence and the aspect word are input into the convolution layer respectively, and the convolution layer extracts the features in the sentence. The size of the convolution kernel W is set to d×k (k < L), and the kernel performs a unidirectional translation scan over the sentence matrix and the aspect-word matrix respectively, where k is the number of words covered by each scan. After scanning, the convolution results c_i and v_i are obtained, as shown in formulas (2-10) and (2-11).
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-10)
v_i = f(T_{i:i+k} * W_v + b_v)   (2-11)
where b_c and b_v are biases, f is the activation function, and * denotes the convolution operation.
In the second step, v_a, obtained from v_i after a max pooling operation, and c_i are sent to the gating unit, which matches and fuses the aspect information with the emotion information as described in Section 2.1, finally obtaining a group of emotion vectors O_s, as shown in formula (2-12).
O_s = [o_1, o_2, ..., o_{l_k}]   (2-12)
In the third step, against the overfitting that may occur during model training, dropout is used to improve the structural performance of the neural network. Max pooling is selected, taking the maximum of the feature values as the main feature, as shown in formula (2-13).
max(O_s) = (max o_1, max o_2, ..., max o_{l_k})   (2-13)
In the fourth step, the extracted features are input into the fully connected layer, which obtains the probability of each class with a softmax classifier and judges the class according to the probability, as shown in formulas (2-14) and (2-15).
In the fifth step, after the classification result of the source domain is obtained, the model is fine-tuned with a small portion of labeled target-domain data. The convolution layer keeps the kernel weights trained on the source domain and applies a forward propagation algorithm to obtain the feature map; the weights of the fully connected layer are fine-tuned by stochastic gradient descent; emotion classification is then performed on the target domain, giving the final classification result, as shown in formulas (2-16) and (2-17).
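The fifth step can be sketched as follows, reusing the GatedAspectCNN sketch above (the checkpoint name and the target_loader yielding (X, T, y) batches are placeholders; following the text, only the fully connected layer is updated, while the convolution kernels keep their source-domain weights):

```python
import torch

model = GatedAspectCNN(d=300, k=3, n_k=100, n_classes=2)
model.load_state_dict(torch.load("source_domain.pt"))    # hypothetical source-domain checkpoint

for name, p in model.named_parameters():
    p.requires_grad = name.startswith("fc")              # freeze every layer except the classifier

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01)
criterion = torch.nn.CrossEntropyLoss()

for X, T, y in target_loader:                            # small labeled target-domain subset (fraction m)
    optimizer.zero_grad()
    loss = criterion(model(X, T), y)                     # forward pass through the frozen feature extractor
    loss.backward()
    optimizer.step()                                     # stochastic gradient descent on the fc weights only
```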
3 experimental results and analysis
3.1 corpus labeling
Because the existing emotion analysis corpora cannot fully meet the requirements of this research, the experiment selects conventional Chinese and English transfer-learning corpora for manual labeling and creates datasets suitable for cross-domain aspect-level emotion analysis tasks. Specifically, the aspect information and emotion information in public sentence-level emotion analysis datasets are analyzed, the aspect words are extracted, and the emotion expressed toward each aspect in the sentence is labeled.
3.1.1 Chinese corpus labeling
The Chinese corpus is the Chinese review text dataset compiled by Tan Songbo and other scholars, drawn from JD.com computer product reviews, book reviews, and travel-site hotel reviews, with 2000 positive and 2000 negative texts in each domain, 12000 in total. Some example review sentences are shown in Table 3.1.
Table 3.1 Tan Songbo corpus example
Analysis of the corpus shows that each comment sentence involves one or more aspects, and the emotion tendencies toward different aspects are not necessarily the same, so the emotion toward each aspect in every comment sentence is labeled. For example, the sentence "The hotel's service is too poor. The geographic location is very good." can be labeled as emotion data for two different aspects: for the "service" aspect the corresponding emotion tendency is negative, and for the "geographic location" aspect it is positive. Examples of labeled sentences are shown in Table 3.2.
TABLE 3.2 Chinese corpus post-labeling data examples
Each manually labeled comment sentence is divided into three parts: sentence, aspect, and emotion tendency. A sentence is copied once for each aspect appearing in the original comment, and each copy is labeled with one aspect and its corresponding emotion tendency; some of the extracted aspect words are shown in Table 3.3. The collated labeled data total 19500 items, as shown in Table 3.4.
TABLE 3.3 Partial aspect words extracted after labeling the Chinese corpus
TABLE 3.4 Statistics after labeling the Chinese corpus
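The duplication rule can be illustrated with a small sketch (the example follows Table 3.2; the data layout is an assumption for illustration):

```python
# One raw comment annotated with two aspects becomes two labeled rows.
raw = {"sentence": "The hotel's service is too poor. The geographic location is very good.",
       "aspects": [("service", "negative"), ("geographic location", "positive")]}

rows = [(raw["sentence"], aspect, polarity)   # the sentence is copied once per aspect
        for aspect, polarity in raw["aspects"]]
# -> [(sentence, "service", "negative"), (sentence, "geographic location", "positive")]
```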
3.1.2 English corpus labeling
The English corpus uses the publicly available Amazon review corpus, which is divided into four domains: Book, DVD, Electronics, and Kitchen. The data of each of the four domains comprise 1000 positive and 1000 negative reviews, 8000 items in total. Some example review sentences are shown in Table 3.5.
Table 3.5 amazon corpus example
Similarly, each manually labeled comment sentence is divided into three parts: sentence, aspect, and emotion tendency; examples of the labeled data are shown in Table 3.6.
Some of the aspect words extracted after labeling are shown in Table 3.7. The finally labeled data total 9090 items, and the collated statistics are shown in Table 3.8.
TABLE 3.6 English corpus post-labeling data example
TABLE 3.7 partial aspect words extracted after labeling English corpus
TABLE 3.8 statistics after labeling English corpus
3.2 Experimental parameter setting
In the experiments, word vectors are built with words as the basic unit: the text is segmented with the jieba tool and corresponding Word2vec word vectors are constructed. The specific hyper-parameter settings of the convolutional neural network are shown in Table 3.9, where the hyper-parameter m is the amount of labeled target-domain data used to fine-tune the model.
Table 3.9 experimental parameter settings
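A minimal sketch of this preprocessing, assuming jieba and gensim as the tools (the hyper-parameter values here are placeholders, not the settings of Table 3.9):

```python
import jieba
from gensim.models import Word2Vec

texts = ["酒店的服务太差了。地理位置很好。"]            # raw review sentences
tokenized = [jieba.lcut(t) for t in texts]             # word segmentation with jieba

w2v = Word2Vec(sentences=tokenized, vector_size=300,   # d-dimensional word vectors
               window=5, min_count=1, sg=1)            # skip-gram; all values illustrative
vec = w2v.wv["服务"]                                    # the 300-dimensional vector of one word
```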
3.3 experimental results and analysis
The experiments use accuracy (Acc) and the F1 value as evaluation indexes.
The calculation formula of accuracy (Acc) is shown in formula (3-1):
Acc = (1/N) · Σ_{i=1}^{N} I(ŷ_i = y_i)   (3-1)
where ŷ_i denotes the predicted label of data sample i, y_i denotes its actual label, I(·) is the indicator function, and N is the size of the test set.
The F1 value balances the recall and precision indexes; its calculation is based on the confusion matrix shown in Table 3.10.
TABLE 3.10 Confusion matrix
Precision (P), also called the precision rate, characterizes the proportion of true positive samples among all results predicted as positive, as shown in formula (3-2):
P = TP / (TP + FP)   (3-2)
Recall (R), also called the recall rate, characterizes the proportion of true positive samples found by the classifier, as shown in formula (3-3):
R = TP / (TP + FN)   (3-3)
The F1 value jointly considers the precision and recall of the classification model and can be regarded as their harmonic mean. It is high only when both precision and recall are high, lies between 0 and 1, and larger values indicate better model performance. The calculation formula is shown in formula (3-4):
F1 = 2PR / (P + R)   (3-4)
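The same quantities can be computed with scikit-learn as a convenience (a sketch with toy labels, not the experimental data):

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]   # actual labels y_i
y_pred = [1, 0, 0, 1, 0, 1]   # predicted labels

acc = accuracy_score(y_true, y_pred)    # formula (3-1)
p = precision_score(y_true, y_pred)     # formula (3-2): TP / (TP + FP)
r = recall_score(y_true, y_pred)        # formula (3-3): TP / (TP + FN)
f1 = f1_score(y_true, y_pred)           # formula (3-4): 2PR / (P + R)
```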
To show the influence of target-domain samples on the transfer effect, part of the labeled target-domain data is extracted for model training. In the experiments on the Chinese dataset, m = 0 means the model trained on the source domain is transferred directly to the target domain; m = 0.05 means 5% of the target-domain data are randomly extracted to retrain the model and adjust the network parameters, and likewise for m = 0.1, 0.2, 0.5. Accuracy and the F1 value are chosen as test indexes, using 10-fold cross-validation.
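Drawing the fine-tuning subset can be sketched as follows (target_data is a placeholder list of labeled target-domain examples; the seeded sampling is an assumption made for reproducibility):

```python
import random

def finetune_split(target_data, m, seed=42):
    """Randomly draw a fraction m of labeled target-domain data (m = 0 means direct transfer)."""
    rng = random.Random(seed)
    n = int(m * len(target_data))
    chosen = set(rng.sample(range(len(target_data)), n))
    finetune = [x for i, x in enumerate(target_data) if i in chosen]
    held_out = [x for i, x in enumerate(target_data) if i not in chosen]
    return finetune, held_out
```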
3.3.1 Chinese corpus experimental results and analysis
The accuracy experimental results on the Chinese corpus are shown in FIG. 3, where C denotes the Computer-domain dataset, B the Book-domain dataset, and H the Hotel-domain dataset; C→B means the source domain is Computer and the target domain is Book, and so on.
As can be seen from FIG. 3, when the convolutional neural network model with the gating unit is used for transfer, the transfer from the Book dataset to the Computer dataset works best, with accuracy reaching 93.4%. As the training data in the target domain increase, the accuracy improves for most datasets, and the largest gain typically occurs when the target-domain sample fraction grows from 0 to 0.05.
As shown in FIG. 4, the transfer from the Book dataset to the Computer dataset is also the best by F1 value, reaching 92.19%. As the training data in the target domain increase, the F1 value rises for most datasets. The performance of the model is expected to grow with the target-domain data, but FIG. 4 shows that the gain is greatest when the target-domain fraction increases from 0 to 0.05; thereafter the performance fluctuates slightly as more target-domain data are added, reaching its best with the maximum amount of target-domain data. Therefore, fine-tuning the model with a small proportion of target-domain data already improves the experimental results markedly while greatly reducing the time and cost of manual labeling.
3.3.2 English corpus experimental results and analysis
The accuracy experimental results on the English corpus are shown in FIG. 5, where B denotes the Book-domain dataset, D the DVD-domain dataset, E the Electronics-domain dataset, and K the Kitchen-domain dataset; B→D means the source domain is Book and the target domain is DVD, and so on.
From FIG. 5 it can be seen that for most dataset pairs the accuracy increases with the training data in the target domain; the best result, an accuracy of 82.45%, is obtained with the Book dataset as the source domain and the Electronics dataset as the target domain.
The F1 experimental results on the English corpus are shown in FIG. 6: the F1 value increases with the training data in the target domain, and the transfer effect is best with the Book dataset as the source domain and the Electronics dataset as the target domain, where the F1 value reaches 81.57%.
In general, the accuracy and the F1 value improve as the target-domain data increase; the model's performance fluctuates slightly during the experiments, but it is best when the target-domain data are at their maximum.
The invention labels an aspect-level emotion transfer-learning corpus, providing an experimental dataset that meets the requirements of this work as well as corpus support for future related research. For cross-domain aspect-level emotion analysis, the invention explores a CNN-based aspect-level emotion analysis model and applies the idea of transfer learning, transferring a model trained on the source domain to the target domain; this addresses the difficulty of obtaining good classification results when the target domain has little labeled data, and experiments prove that the model classifies well on the datasets provided here. In future work, more transfer approaches can be used to refine the model, and its generalization performance can be further verified on more large-scale cross-domain datasets.
While the invention has been described with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (1)

1. A CNN-based aspect-level cross-domain emotion analysis method, characterized by comprising the following steps:
S1, constructing an aspect-level emotion analysis model;
S2, performing aspect-level cross-domain emotion analysis;
Step S1 is as follows:
the input of the aspect-level emotion analysis model is divided into two parts, the aspect word and the context, and the corresponding convolution process likewise comprises two parts; the context X contains l words, each word is converted into a d-dimensional word vector, and the sentence X is expressed as a d×l matrix; a convolution kernel W_c of dimension d×k (k < l) performs a unidirectional translation scan over the context matrix, where k is the number of words covered by each scan of the kernel, and each scan yields a convolution result c_i, as shown in formula (2-1):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-1)
where b_c is a bias, f is an activation function, and * denotes the convolution operation, so that after the sentence has been scanned a vector c is obtained, as shown in formula (2-2):
c = [c_1, c_2, ..., c_{l_k}]   (2-2)
where l_k denotes the length of vector c (for a stride-1 scan, l_k = l − k + 1); n_k convolution kernels of width k are set in the experiments, and when all sentences have been scanned an n_k × l_k matrix is obtained, which is processed by max pooling, i.e. the maximum of each row is taken, so that the sentence is represented by an n_k-dimensional vector;
since the aspect word T may consist of one or more words, a small CNN is added: the aspect word T is converted into a word-embedding matrix as shown in formula (2-3), and the features of the aspect word are extracted through convolution and pooling operations as shown in formula (2-4):
T = [T_i, T_{i+1}, ..., T_{i+k}]   (2-3)
v_i = f_relu(T_{i:i+k} * W_v + b_v)   (2-4)
where W_v is a convolution kernel of dimension d×k and b_v is a bias;
two groups of convolution kernels of the same size scan the sentence simultaneously, their results are input into two gate units, and the aspect information and the emotion information are encoded respectively, giving two vectors s_i and a_i; when computing s_i, tanh is used as the activation function, as shown in formula (2-5);
s_i = f_tanh(X_{i:i+k} * W_s + b_s)   (2-5)
where W_s is a convolution kernel of dimension d×k and b_s is a bias;
when computing a_i, the aspect-word embedding vector v_a is added to the input; v_a is obtained from v_i by max pooling, relu is adopted as the activation function as shown in formula (2-6), and a_i is regarded as the aspect feature:
a_i = f_relu(X_{i:i+k} * W_a + V_a v_a + b_a)   (2-6)
after training, through the relu function the model gives a higher weight a_i to emotion words close to the aspect word; conversely, if the two are far apart, the weight may be small or 0; finally the two vectors s_i and a_i are multiplied element-wise, and the result is the final feature vector o_i, as shown in formula (2-7):
o_i = s_i × a_i   (2-7)
o_i is input into a pooling layer for max pooling, the resulting vector is input into a fully connected layer, a Softmax classifier gives the probability of each class, and the class is judged according to the probability;
Step S2 is as follows:
in the first step, a neural network model is trained with the labeled data of the source domain; each word in a sentence X is converted into a d-dimensional word embedding, the maximum sentence length is fixed at L, sentences shorter than L are padded with 0 and sentences longer than L are truncated, so that each sentence has L words and the sentence X is expressed as a d×L matrix, as shown in formula (2-8):
X_s ∈ R^{d×L}   (2-8)
the aspect word is likewise expressed as a d×L matrix, as shown in formula (2-9):
T_s ∈ R^{d×L}   (2-9)
the sentence and the aspect word are input into the convolution layer respectively, and the convolution layer extracts the features in the sentence; the size of the convolution kernel W is set to d×k with k smaller than L, the kernel performs a unidirectional translation scan over the sentence matrix and the aspect-word matrix respectively, where k is the number of words covered by each scan, and after scanning the convolution results c_i and v_i are obtained, as shown in formulas (2-10) and (2-11):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-10)
v_i = f(T_{i:i+k} * W_v + b_v)   (2-11)
where b_c and b_v are biases, f is the activation function, and * denotes the convolution operation;
in the second step, v_a, obtained from v_i after a max pooling operation, and c_i are sent to the gating unit, which matches and fuses the aspect information with the emotion information to obtain a group of emotion vectors O_s, as shown in formula (2-12):
O_s = [o_1, o_2, ..., o_{l_k}]   (2-12)
in the third step, against the overfitting that occurs during model training, dropout is used to improve the structural performance of the neural network; max pooling is selected, taking the maximum of the feature values as the main feature, as shown in formula (2-13):
max(O_s) = (max o_1, max o_2, ..., max o_{l_k})   (2-13)
in the fourth step, the extracted features are input into a fully connected layer, which obtains the probability of each class with a softmax classifier and judges the class according to the probability, as shown in formulas (2-14) and (2-15):
in the fifth step, after the classification result of the source domain is obtained, the model is fine-tuned with a small portion of labeled target-domain data: the convolution layer keeps the kernel weights trained on the source domain and applies a forward propagation algorithm to obtain the feature map, the weights of the fully connected layer are fine-tuned by stochastic gradient descent, and emotion classification is then performed on the target domain to obtain the final classification result, as shown in formulas (2-16) and (2-17):
CN202011026500.4A 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method Active CN112163091B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011026500.4A CN112163091B (en) 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011026500.4A CN112163091B (en) 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method

Publications (2)

Publication Number Publication Date
CN112163091A CN112163091A (en) 2021-01-01
CN112163091B true CN112163091B (en) 2023-08-22

Family

ID=73864233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011026500.4A Active CN112163091B (en) 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method

Country Status (1)

Country Link
CN (1) CN112163091B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204645B (en) * 2021-04-01 2023-05-16 Wuhan University Knowledge-guided aspect-level emotion analysis model training method
CN113128229B (en) * 2021-04-14 2023-07-18 Hohai University Chinese entity relation joint extraction method
CN113468292B (en) * 2021-06-29 2024-06-25 *** Co., Ltd. Aspect-level emotion analysis method, device and computer-readable storage medium
CN113627550A (en) * 2021-08-17 2021-11-09 Beijing Institute of Computer Technology and Application Image-text emotion analysis method based on multimodal fusion
CN114757183B (en) * 2022-04-11 2024-05-10 Beijing Institute of Technology Cross-domain emotion classification method based on contrastive alignment network


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108614875A (en) * 2018-04-26 2018-10-02 Beijing University of Posts and Telecommunications Chinese emotion tendency classification method based on global average pooling convolutional neural networks
CN108763326A (en) * 2018-05-04 2018-11-06 Nanjing University of Posts and Telecommunications Sentiment analysis model building method based on feature-diversified convolutional neural networks
KR20190136337A (en) * 2018-05-30 2019-12-10 Gachon University Industry-Academic Cooperation Foundation Social Media Contents Based Emotion Analysis Method, System and Computer-readable Medium
CN109753566A (en) * 2019-01-09 2019-05-14 Dalian Minzu University Model training method for cross-domain sentiment analysis based on convolutional neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Multi-source cross-domain sentiment classification based on ensemble deep transfer learning; Zhao Chuanjun; Wang Suge; Li Deyu; Journal of Shanxi University (Natural Science Edition), (04); full text *

Also Published As

Publication number Publication date
CN112163091A (en) 2021-01-01

Similar Documents

Publication Publication Date Title
CN112163091B (en) CNN-based aspect level cross-domain emotion analysis method
CN111160037B (en) Fine-grained emotion analysis method supporting cross-language migration
CN109753566B (en) Model training method for cross-domain emotion analysis based on convolutional neural network
CN110472003B (en) Social network text emotion fine-grained classification method based on graph convolution network
CN108038492A (en) A kind of perceptual term vector and sensibility classification method based on deep learning
CN110502753A (en) Deep learning sentiment analysis model based on semantic enhancement and its analysis method
CN111368086A (en) CNN-BiLSTM + Attention model-based sentiment classification method for case-involved news viewpoint sentences
CN111858935A (en) Fine-grained emotion classification system for flight comment
Al Wazrah et al. Sentiment analysis using stacked gated recurrent unit for Arabic tweets
CN115392259B (en) Microblog text sentiment analysis method and system based on adversarial training fused with BERT
CN111581364B (en) Chinese intelligent question-answer short text similarity calculation method oriented to medical field
CN110851601A (en) Cross-domain emotion classification system and method based on layered attention mechanism
CN111339772B (en) Russian text emotion analysis method, electronic device and storage medium
Al Omari et al. Hybrid CNNs-LSTM deep analyzer for Arabic opinion mining
CN112988970A (en) Text matching algorithm serving intelligent question-answering system
CN115906816A (en) Text emotion analysis method of two-channel Attention model based on Bert
CN115129807A (en) Fine-grained classification method and system for social media topic comments based on self-attention
CN113920379A (en) Zero sample image classification method based on knowledge assistance
CN114238636A (en) Translation matching-based cross-language attribute level emotion classification method
CN113934835A (en) Retrieval type reply dialogue method and system combining keywords and semantic understanding representation
Dang et al. Sentiment analysis for vietnamese–based hybrid deep learning models
He et al. Hierarchical attention and knowledge matching networks with information enhancement for end-to-end task-oriented dialog systems
CN113065350A (en) Biomedical text word sense disambiguation method based on attention neural network
Merayo et al. Social Network Sentiment Analysis Using Hybrid Deep Learning Models
CN115510230A (en) Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant