CN110110323A - A kind of text sentiment classification method and device, computer readable storage medium - Google Patents
- Publication number
- CN110110323A (application CN201910285262.XA)
- Authority
- CN
- China
- Prior art keywords
- text
- feature extraction
- angle
- word
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
Abstract
This application discloses a text sentiment classification method and device, and a computer-readable storage medium. The method includes: obtaining the sentence context of a text through a pre-trained language model, the pre-trained language model being used to predict one or more words randomly masked in the text; randomly initializing an a×n angle embedding matrix, where a is the number of angles and n is the word-embedding dimension, and performing an attention-function calculation on the initialized angle embedding matrix and the sentence context to obtain a corrected angle embedding matrix; and performing classification on the corrected angle embedding matrix to obtain the sentiment of each angle of the text. By obtaining the sentence context through a pre-trained language model and computing an attention function over the angle embedding matrix and the sentence context, this application can handle multi-angle, multi-polarity sentiment analysis tasks without time-consuming feature extraction.
Description
Technical field
This application relates to, but is not limited to, the technical field of natural language processing (Natural Language Processing, NLP), and in particular to a text sentiment classification method and device, and a computer-readable storage medium.
Background technique
Sentiment analysis, sometimes called "opinion mining", is a vital task in NLP. Angle-based sentiment mining is a finer-grained form of sentiment analysis that can reveal deeper opinion tendencies.
Most popular sentiment classification methods determine the sentiment polarity of a whole sentence or article; for a given piece of text, they cannot determine the polarity of each individual angle at a finer granularity. The angle-based sentiment classification that does exist relies on syntactic analysis, linguistic feature extraction, or manually defined rules. Such approaches require a great deal of time to extract features and demand developers with a solid linguistic background.
Summary of the invention
This application provides a text sentiment classification method and device, and a computer-readable storage medium, capable of handling multi-angle, multi-polarity sentiment analysis tasks without time-consuming feature extraction.
This application provides a text sentiment classification method, comprising:
obtaining the sentence context of a text through a pre-trained language model, the pre-trained language model being used to predict one or more words randomly masked in the text;
randomly initializing an a×n angle embedding matrix, where a is the number of angles and n is the word-embedding dimension, and performing an attention-function calculation on the initialized angle embedding matrix and the sentence context to obtain a corrected angle embedding matrix;
performing classification on the corrected angle embedding matrix to obtain the sentiment of each angle of the text.
In an exemplary embodiment, the pre-trained language model includes an embedding layer, a feature extraction layer, and a prediction layer, wherein:
the embedding layer is configured to map each word in the text to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the feature extraction layer;
the feature extraction layer is configured to receive the output of the embedding layer, extract high-dimensional features of the text, and output the sentence context to the prediction layer;
the prediction layer is configured to predict the word at the masked position based on the received sentence context.
In an exemplary embodiment, the embedding layer includes a forward embedding layer and a backward embedding layer, and the feature extraction layer includes a forward feature extraction layer and a backward feature extraction layer, wherein:
the forward embedding layer is configured to map each word to the left of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the forward feature extraction layer;
the forward feature extraction layer is configured to receive the output of the forward embedding layer, extract high-dimensional features of the text to the left of the masked position, and output the sentence context on the left of the masked position to the prediction layer;
the backward embedding layer is configured to map each word to the right of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the backward feature extraction layer;
the backward feature extraction layer is configured to receive the output of the backward embedding layer, extract high-dimensional features of the text to the right of the masked position, and output the sentence context on the right of the masked position to the prediction layer;
the prediction layer is specifically configured to superimpose the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and to predict the word at the masked position from the superimposed result.
In an exemplary embodiment, performing the attention-function calculation on the initialized angle embedding matrix and the sentence context comprises:
Attention(Q, s) = softmax(QKᵀ/√n)K
where Attention() is the attention function, s is a sentence, softmax() is the normalized exponential function, Q is the angle embedding matrix, K is the sentence context, and n is the word-embedding dimension.
This application also provides a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the steps of any of the text sentiment classification methods described above.
This application also provides a text sentiment classification device, including a processor and a memory, wherein the processor is configured to execute a program stored in the memory to implement the steps of any of the text sentiment classification methods described above.
This application also provides a text sentiment classification device, including a context acquisition module, an attention calculation module, and a classification module, wherein:
the context acquisition module is configured to obtain the sentence context of a text through a pre-trained language model, the pre-trained language model being used to predict one or more words randomly masked in the text;
the attention calculation module is configured to randomly initialize an a×n angle embedding matrix, where a is the number of angles and n is the word-embedding dimension, and to perform the attention-function calculation on the initialized angle embedding matrix and the sentence context to obtain a corrected angle embedding matrix;
the classification module is configured to perform classification on the corrected angle embedding matrix to obtain the sentiment of each angle of the text.
In an exemplary embodiment, the pre-trained language model includes an embedding layer, a feature extraction layer, and a prediction layer, wherein:
the embedding layer is configured to map each word in the text to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the feature extraction layer;
the feature extraction layer is configured to receive the output of the embedding layer, extract high-dimensional features of the text, and output the sentence context to the prediction layer;
the prediction layer is configured to predict the word at the masked position based on the received sentence context.
In an exemplary embodiment, the embedding layer includes a forward embedding layer and a backward embedding layer, and the feature extraction layer includes a forward feature extraction layer and a backward feature extraction layer, wherein:
the forward embedding layer is configured to map each word to the left of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the forward feature extraction layer;
the forward feature extraction layer is configured to receive the output of the forward embedding layer, extract high-dimensional features of the text to the left of the masked position, and output the sentence context on the left of the masked position to the prediction layer;
the backward embedding layer is configured to map each word to the right of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the backward feature extraction layer;
the backward feature extraction layer is configured to receive the output of the backward embedding layer, extract high-dimensional features of the text to the right of the masked position, and output the sentence context on the right of the masked position to the prediction layer;
the prediction layer is specifically configured to superimpose the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and to predict the word at the masked position from the superimposed result.
In an exemplary embodiment, the attention calculation module performing the attention-function calculation on the initialized angle embedding matrix and the sentence context comprises:
Attention(Q, s) = softmax(QKᵀ/√n)K
where Attention() is the attention function, s is a sentence, softmax() is the normalized exponential function, Q is the angle embedding matrix, K is the sentence context, and n is the word-embedding dimension.
Compared with the related art, the text sentiment classification method and device and the computer-readable storage medium of this application obtain the sentence context of a text through a pre-trained language model and perform an attention-function calculation on the angle embedding matrix and the sentence context. They can handle multi-angle, multi-polarity sentiment analysis tasks and do not require time-consuming feature extraction.
Other features and advantages will be set forth in the following description and will, in part, become apparent from the specification or be understood by implementing this application. Other advantages of this application can be realized and obtained through the schemes described in the specification, the claims, and the drawings.
Brief description of the drawings
The drawings provide an understanding of the technical scheme of this application and constitute part of the specification. Together with the embodiments, they serve to explain the technical scheme of this application and do not limit it.
Fig. 1 is a flow diagram of a text sentiment classification method according to an embodiment of the present invention;
Fig. 2 is a structural diagram of a text sentiment classification device according to an embodiment of the present invention.
Detailed description of the embodiments
This application describes multiple embodiments, but the description is exemplary rather than restrictive, and it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the embodiments described herein. Although many possible feature combinations are shown in the drawings and discussed in the detailed description, many other combinations of the disclosed features are also possible. Unless specifically limited, any feature or element of any embodiment may be used in combination with, or may substitute for, any other feature or element of any other embodiment.
This application includes and contemplates combinations with features and elements known to those of ordinary skill in the art. The embodiments, features, and elements disclosed in this application may also be combined with any conventional features or elements to form a unique inventive scheme defined by the claims. Any feature or element of any embodiment may also be combined with features or elements from other inventive schemes to form another unique inventive scheme defined by the claims. It should therefore be understood that any of the features shown and/or discussed in this application may be implemented individually or in any suitable combination. Accordingly, the embodiments are not limited except in accordance with the appended claims and their equivalent replacements. Furthermore, various modifications and changes may be made within the scope of protection of the appended claims.
In addition, when describing representative embodiments, the specification may present a method and/or process as a particular sequence of steps. However, to the extent that the method or process does not depend on the particular order of steps described herein, the method or process should not be limited to that particular order. As those of ordinary skill in the art will appreciate, other step orders are also possible. Therefore, the particular order of steps set forth in the specification should not be construed as a limitation on the claims. Moreover, claims directed to the method and/or process should not be limited to performing their steps in the order written; those skilled in the art can readily appreciate that the order may be changed while remaining within the spirit and scope of the embodiments of this application.
Embodiment 1: text sentiment classification method
As shown in Fig. 1, a text sentiment classification method according to an embodiment of the present invention includes the following steps:
Step 101: obtain the sentence context of the text through a pre-trained language model, the pre-trained language model being used to predict one or more words randomly masked in the text;
It should be noted that deep learning has strong expressive power, but its inherent disadvantage is that it requires a large number of labeled training samples, and labeling many high-quality training samples is very time-consuming and costly. When training a neural network from scratch, the labeling task is daunting; in practice there is often no large labeled training set, nor the time and energy to label one manually, which greatly limits the model's learning ability. This application borrows the idea of transfer learning: a pre-trained language model is used to obtain a high-dimensional feature representation of the text, a model is then built on top of that representation, and the network is fine-tuned as a whole.
In one exemplary embodiment, the pre-trained language model includes an embedding layer, a feature extraction layer, and a prediction layer, wherein:
the embedding layer is configured to map each word in the text to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the feature extraction layer;
the feature extraction layer is configured to receive the output of the embedding layer, extract high-dimensional features of the text, and output the sentence context to the prediction layer;
the prediction layer is configured to predict the word at the masked position based on the received sentence context.
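The embedding layer's behavior can be sketched in a few lines. This is a minimal illustration, not the patent's actual implementation: `token_table` and `pos_table` are hypothetical stand-ins for learned embedding tables, represented here as plain-Python lookups.

```python
def embed(tokens, token_table, pos_table):
    """Sum each word's token vector with the position vector for its slot;
    the sums are what the embedding layer hands to the feature extraction layer."""
    summed = []
    for pos, tok in enumerate(tokens):
        word_vec = token_table[tok]   # Token Embedding lookup
        pos_vec = pos_table[pos]      # Position Embedding for this slot
        # element-wise sum of word vector and position vector
        summed.append([w + p for w, p in zip(word_vec, pos_vec)])
    return summed
```

In a trained model both tables are learned parameters; here they are fixed toy values purely to show the summation the text describes.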
In one exemplary embodiment, the embedding layer includes a forward embedding layer and a backward embedding layer, and the feature extraction layer includes a forward feature extraction layer and a backward feature extraction layer, wherein:
the forward embedding layer is configured to map each word to the left of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the forward feature extraction layer;
the forward feature extraction layer is configured to receive the output of the forward embedding layer, extract high-dimensional features of the text to the left of the masked position, and output the sentence context on the left of the masked position to the prediction layer;
the backward embedding layer is configured to map each word to the right of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the backward feature extraction layer;
the backward feature extraction layer is configured to receive the output of the backward embedding layer, extract high-dimensional features of the text to the right of the masked position, and output the sentence context on the right of the masked position to the prediction layer;
the prediction layer is specifically configured to superimpose the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and to predict the word at the masked position from the superimposed result.
In one exemplary embodiment, the prediction layer includes a first fully connected layer and a first Softmax layer, wherein:
the first fully connected layer is configured to receive the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and to output its result to the first Softmax layer;
the first Softmax layer is configured to receive the output of the first fully connected layer and to predict the word at the masked position in the input sentence.
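The first fully connected layer plus the first Softmax layer amount to scoring every candidate word from the masked position's context vector and normalizing the scores into probabilities. A minimal sketch under assumptions not stated in the text: `W` is a hypothetical n×|V| weight matrix and `vocab` a hypothetical candidate-word list.

```python
import math

def predict_masked(context_vec, W, vocab):
    """First fully connected layer + first Softmax layer, sketched:
    returns the most probable word for the masked position and the full
    probability distribution over the candidate vocabulary."""
    # fully connected layer: one logit per vocabulary word
    logits = [sum(c * W[i][j] for i, c in enumerate(context_vec))
              for j in range(len(vocab))]
    # Softmax layer: shift by the max for numerical stability, then normalize
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(vocab)), key=lambda j: probs[j])
    return vocab[best], probs
```

In practice the context vector would be the superimposed forward/backward sentence context and `W` a learned parameter; the toy values below only demonstrate the data flow.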
Step 102: randomly initialize an a×n angle embedding matrix, where a is the number of angles and n is the word-embedding dimension; perform the attention-function calculation on the initialized angle embedding matrix and the sentence context to obtain a corrected angle embedding matrix;
In one exemplary embodiment, performing the attention-function calculation on the initialized angle embedding matrix and the sentence context comprises:
Attention(Q, s) = softmax(QKᵀ/√n)K
where Attention() is the attention function, s is a sentence, softmax() is the normalized exponential function, Q is the angle embedding matrix, K is the sentence context, and n is the word-embedding dimension.
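Given the symbol definitions above, the attention calculation can be sketched directly. This assumes the standard scaled-dot-product form softmax(QKᵀ/√n)K, consistent with the definitions of Q, K, and n, but the exact formula in the original filing may differ; `attention` and `softmax` are illustrative helper names.

```python
import math

def softmax(row):
    """Normalized exponential function over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, n):
    """Q: a x n angle embedding matrix; K: L x n sentence context.
    Returns the corrected a x n angle embedding matrix."""
    # scores = Q K^T / sqrt(n): one row of scores per angle embedding
    scores = [[sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(n)
               for k_row in K] for q_row in Q]
    weights = [softmax(row) for row in scores]   # row-wise normalization
    # each corrected angle embedding is a weighted sum of context vectors in K
    return [[sum(w * K[j][d] for j, w in enumerate(w_row))
             for d in range(n)] for w_row in weights]
```

Each output row is a convex combination of the sentence-context vectors, which is how the randomly initialized angle embeddings get "corrected" toward the text.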
Step 103: perform classification on the corrected angle embedding matrix to obtain the sentiment of each angle of the text.
In one exemplary embodiment, step 103 specifically includes:
inputting the corrected angle embedding matrix into a second fully connected layer L(n, t), where t is the number of polarities, to obtain an a×t result matrix; the a×t result matrix is then input into a second Softmax layer to obtain the final prediction result.
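The second fully connected layer L(n, t) and the second Softmax layer of step 103 can be sketched as follows; the weight matrix `W` (n×t) and the polarity names are hypothetical stand-ins, since the text only fixes the shapes.

```python
import math

def classify_angles(corrected, W, polarities):
    """Map the corrected a x n angle embedding matrix through L(n, t) to an
    a x t result matrix, then take a Softmax per angle to pick a polarity."""
    results = []
    for row in corrected:                 # one corrected embedding per angle
        # second fully connected layer: n-dim row -> t polarity logits
        logits = [sum(x * W[i][j] for i, x in enumerate(row))
                  for j in range(len(polarities))]
        # second Softmax layer, then the most probable polarity for this angle
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        results.append(polarities[max(range(len(probs)),
                                      key=lambda j: probs[j])])
    return results
```

With a = 5 angles and t = 3 polarities this yields one polarity per angle, matching the multi-angle, multi-polarity output the method targets.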
Embodiment 2: computer-readable storage medium
An embodiment of the present invention further provides a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the steps of any of the text sentiment classification methods described above.
Embodiment 3: text sentiment classification device
An embodiment of the present invention further provides a text sentiment classification device, including a processor and a memory, wherein the processor is configured to execute a program stored in the memory to implement the steps of any of the text sentiment classification methods described above.
Embodiment 4: text sentiment classification device
As shown in Fig. 2, an embodiment of the present invention further provides a text sentiment classification device, including a context acquisition module 201, an attention calculation module 202, and a classification module 203, wherein:
the context acquisition module 201 is configured to obtain the sentence context of a text through a pre-trained language model, the pre-trained language model being used to predict one or more words randomly masked in the text;
the attention calculation module 202 is configured to randomly initialize an a×n angle embedding matrix, where a is the number of angles and n is the word-embedding dimension, and to perform the attention-function calculation on the initialized angle embedding matrix and the sentence context to obtain a corrected angle embedding matrix;
the classification module 203 is configured to perform classification on the corrected angle embedding matrix to obtain the sentiment of each angle of the text.
In one exemplary embodiment, the pre-trained language model includes an embedding layer, a feature extraction layer, and a prediction layer, wherein:
the embedding layer is configured to map each word in the text to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the feature extraction layer;
the feature extraction layer is configured to receive the output of the embedding layer, extract high-dimensional features of the text, and output the sentence context to the prediction layer;
the prediction layer is configured to predict the word at the masked position based on the received sentence context.
In one exemplary embodiment, the embedding layer includes a forward embedding layer and a backward embedding layer, and the feature extraction layer includes a forward feature extraction layer and a backward feature extraction layer, wherein:
the forward embedding layer is configured to map each word to the left of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the forward feature extraction layer;
the forward feature extraction layer is configured to receive the output of the forward embedding layer, extract high-dimensional features of the text to the left of the masked position, and output the sentence context on the left of the masked position to the prediction layer;
the backward embedding layer is configured to map each word to the right of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the backward feature extraction layer;
the backward feature extraction layer is configured to receive the output of the backward embedding layer, extract high-dimensional features of the text to the right of the masked position, and output the sentence context on the right of the masked position to the prediction layer;
the prediction layer is specifically configured to superimpose the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and to predict the word at the masked position from the superimposed result.
In one exemplary embodiment, the attention calculation module 202 performing the attention-function calculation on the initialized angle embedding matrix and the sentence context comprises:
Attention(Q, s) = softmax(QKᵀ/√n)K
where Attention() is the attention function, s is a sentence, softmax() is the normalized exponential function, Q is the angle embedding matrix, K is the sentence context, and n is the word-embedding dimension.
Embodiment 5: text sentiment classification method
After the training corpus is obtained, it must first be labeled. The specific labeling method is as follows:
(1) Determine the angle types and their number. For example, taking automotive data as an illustration, the predetermined number of angles is 5, and the angle types are five major categories: interior trim, power, cost performance, fuel consumption, and comfort.
(2) Label each training sample with respect to every angle; that is, every training sample is labeled for all angles, and if a sample does not mention a certain angle, that angle's polarity defaults to "general". For example:
a) Training sample 1: "Starting is sluggish and shifting from 2nd to 3rd gear drags a little; the rear space is passable and three people can sit without being crowded; the appearance and interior trim are the most satisfying, and handsome; fuel consumption is average."
b) Training sample 2: "Looks get full marks and the keys feel great, but the interior rattles, the dual-clutch is slow off the line, and the engine start-stop function cannot be disabled permanently and has to be switched off manually every time, which is not great. Fuel consumption is slightly high, but cost performance is still relatively high; after all, at this price one should not demand too much."
For the two training samples above, the labeling results are shown in Table 1:
Training sample | Interior trim | Power | Cost performance | Fuel consumption | Comfort
Training sample 1 | Positive | Negative | General | Negative | General
Training sample 2 | Positive | Negative | Positive | Negative | General
Table 1
(3) Screen the training samples: remove any sample whose polarity is "general" for every angle, to speed up model training.
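The screening step in (3) is a simple filter. A sketch under an assumed data layout (the text does not specify one): each sample is a dict whose `labels` field maps angle names to polarity strings.

```python
def filter_samples(samples):
    """Keep a training sample only if at least one angle's polarity is not
    'general'; drop samples that are 'general' for every angle."""
    return [s for s in samples
            if any(p != "general" for p in s["labels"].values())]
```

Dropping all-"general" samples shrinks the training set without losing any polarity signal, which is what speeds up training.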
The specific pre-training method is as follows:
I) Unsupervised pre-training on language samples:
To give the language model better feature extraction ability, one or more words in each training sample are randomly masked, and the language model is trained to predict them. For example:
Original training sample: "The car body has an abnormal noise, the ceiling also has an abnormal noise, and the dual-clutch start is slow."
Pre-training sample: "The car body has an abnormal noise, the ceiling also has an abnormal noise, and the dual-[MASK] start is slow."
Language model output: "clutch"
Note that any word may be masked, and the language model may predict word by word. The language model learns both left-to-right (Left-to-Right) and right-to-left (Right-to-Left) context representations to predict the masked word.
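The random masking step can be sketched as follows. Whitespace tokenisation and the function name are illustrative simplifications, since the disclosure does not fix a tokenizer.

```python
# Minimal sketch of the masking step: randomly replace one or more words with
# [MASK]; the language model is then trained to recover the original words.
import random

def mask_words(tokens, n_mask=1, rng=None):
    """Return the masked token list and a map {position: original word}."""
    rng = rng or random.Random(0)  # fixed seed for a reproducible illustration
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = ["[MASK]" if i in positions else t for i, t in enumerate(tokens)]
    targets = {i: tokens[i] for i in positions}
    return masked, targets

tokens = "the car body has abnormal noise".split()
masked, targets = mask_words(tokens, n_mask=1)
# the pre-training objective is to predict `targets` given `masked`
```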
II) Language model definition:
To correctly predict the masked word, the language model needs to learn from both the left and right contexts. The two parts have identical structure but different parameters, and the vectors learned by the two parts are later added together. Correspondingly, the left part is called the forward representation network and the right part is called the backward representation network; the two network structures are identical. Note that all words to the right of the mask are invisible to the forward representation network, and all words to the left of the mask are invisible to the backward representation network.
The forward representation network consists of two parts. The first part is the forward embedding layer: this layer maps each word in the text to the left of the masked position to a word vector (Token Embedding) and embeds a position vector (Position Embedding) for each word vector, and the sum of the Token Embedding and the Position Embedding is used as the input of the second part. The second part is a feature extraction layer whose input is the output of the first part; this layer may be a Transformer encoder (Encoder), a convolutional neural network (Convolutional Neural Network, CNN), or a recurrent neural network (Recurrent Neural Network, RNN) such as a long short-term memory network (Long Short-Term Memory, LSTM) or a gated recurrent unit (Gated Recurrent Unit, GRU). The feature extraction layer can be stacked in multiple layers to improve feature extraction ability (stacking is possible as long as the input and output dimensions are identical, so that the n-dimensional output of the first layer can serve as the n-dimensional input of the second layer). In one exemplary embodiment, the second part of the present application uses the Encoder part of the Transformer with 8 layers.
Note that Google's Transformer model was originally designed for machine translation tasks. The Transformer remedies the slow training for which RNNs are most criticized, using the self-attention (Self-Attention) mechanism to achieve fast parallelism. Moreover, the Transformer can be made very deep, fully exploiting the characteristics of deep neural network (Deep Neural Network, DNN) models and improving model accuracy.
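The forward embedding layer just described (Token Embedding plus Position Embedding, summed per position, feeding a stackable feature extraction layer) can be sketched as follows. The vocabulary, the toy dimension n = 4, and the random embedding tables are assumptions for illustration only.

```python
# Sketch of the forward embedding layer: each word left of the mask gets a
# token embedding, a position embedding is added element-wise, and the summed
# vectors (all n-dimensional) feed the stackable feature extraction layer.
import random

n = 4  # word embedding dimension (the n in the text), toy value
rng = random.Random(0)

def table(rows, dim):
    """Random toy embedding table standing in for learned parameters."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(rows)]

vocab = {"the": 0, "car": 1, "has": 2, "noise": 3}
token_emb = table(len(vocab), n)   # Token Embedding table
pos_emb = table(16, n)             # Position Embedding table, max length 16 assumed

def embed(words):
    """Token Embedding + Position Embedding, summed at each position."""
    return [[t + p for t, p in zip(token_emb[vocab[w]], pos_emb[i])]
            for i, w in enumerate(words)]

x = embed(["the", "car", "has", "noise"])  # 4 vectors, each of dimension n
```

Because the input and output of each feature extraction layer are both n-dimensional, such layers can be stacked as the text notes.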
The backward representation network likewise consists of two parts. The first part is the backward embedding layer: this layer maps each word in the text to the right of the masked position to a word vector and embeds a position vector for each word vector, and the sum of the Token Embedding and the Position Embedding is used as the input of the second part. The second part is a feature extraction layer whose input is the output of the first part; this layer may be a Transformer Encoder, a CNN, or an RNN (LSTM, GRU), and can be stacked in multiple layers to improve feature extraction ability. In one exemplary embodiment, the second part of the present application uses the Encoder part of the Transformer with 8 layers.
In the third part, after the left and right sub-models output their sentence contexts, the two sentence contexts are added and fed to a fully connected layer, and a Softmax prediction is made for the word at the [MASK] position in the input sentence.
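The third part, adding the forward and backward sentence contexts and predicting the masked word through a fully connected layer with Softmax, can be sketched as follows. The toy dimensions and random weights are assumptions for illustration.

```python
# Sketch of the third part: the forward and backward contexts at the [MASK]
# position are added, passed through a fully connected layer, and a softmax
# over the vocabulary gives the prediction. Weights are toy random values.
import math
import random

n, vocab_size = 4, 6  # toy embedding dimension and vocabulary size
rng = random.Random(1)
W = [[rng.uniform(-1, 1) for _ in range(vocab_size)] for _ in range(n)]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict_mask(fwd_ctx, bwd_ctx):
    h = [f + b for f, b in zip(fwd_ctx, bwd_ctx)]  # add the two sentence contexts
    logits = [sum(h[i] * W[i][j] for i in range(n)) for j in range(vocab_size)]
    return softmax(logits)  # probability distribution over the vocabulary

probs = predict_mask([0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1])
```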
III) Model training:
The pre-trained language model is obtained by ordinary neural network model training.
The present application constructs an angle embedding matrix (Aspect Embedding) on top of the pre-trained language model and, in combination with an attention mechanism (Attention Mechanism), models all outputs of the language model. The detailed process is as follows:
The angle embedding matrix is constructed according to the number of angles. For example, with the 5 angles above, a 5*n matrix V is constructed, where n is the word embedding dimension. The angle embedding matrix and all outputs of the second part of the pre-trained model (i.e., the sentence context) are then passed through the attention function:
Attention(Q, K) = softmax(QK^T / √n) K
where Q is the angle embedding matrix, K is the sentence context, and n is the word embedding dimension.
The attention calculation used in the present application may be the scaled dot-product (Scaled Dot-Product) method. Each row of the angle embedding matrix is updated according to the above formula, yielding the modified angle embedding matrix, which is then input to a multi-label classifier. The classifier is a fully connected layer whose output dimension is the number of angle polarities. For example, each angle outputs three polarities, so the 5 angles yield a 5*3 output matrix in which each row represents the three polarities of one angle; Softmax then selects the optimal polarity.
For each angle, the pre-trained model is trained using cross entropy as the loss.
Specifically:
Assume the output of the pre-trained model is M, a p*n matrix, where p is the number of words in the sentence (for example, the Chinese sentence "我喜欢吃苹果。" ("I like eating apples.") has p = 7) and n is the word embedding dimension. Assume the angle embedding matrix is A, an a*n matrix, where a is the number of angles. Let F denote the Attention function (which may use the Scaled Dot-Product method); then the output obtained by feeding M and A into F is an a*n matrix. This is passed through a fully connected network L(n, t), where t is the number of polarities per angle, giving an a*t result matrix. In the present application t = 3. Finally, a Softmax layer produces the final prediction result.
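The shape bookkeeping of this paragraph (M of size p*n, A of size a*n, attention function F, fully connected layer L(n, t), then Softmax) can be checked with a small pure-Python sketch. The scaled dot-product form softmax(QK^T/√n)K and all weight values are illustrative assumptions.

```python
# Shape walk-through of the aspect-attention classifier: M is p*n (sentence
# context), A is a*n (angle embedding matrix), F is scaled dot-product
# attention, L(n, t) a fully connected layer with t = 3 polarities per angle.
import math
import random

p, n, a, t = 7, 4, 5, 3  # toy dimensions following the text
rng = random.Random(2)

def rand_mat(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

M, A = rand_mat(p, n), rand_mat(a, n)

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def attention(Q, K):
    """F: softmax(Q K^T / sqrt(n)) K, i.e. scaled dot-product attention."""
    out = []
    for q in Q:  # one attention-weighted context vector per angle row
        scores = softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(n)
                          for k in K])
        out.append([sum(w * k[j] for w, k in zip(scores, K)) for j in range(n)])
    return out

A_mod = attention(A, M)           # a*n modified angle embedding matrix
L = rand_mat(n, t)                # fully connected layer L(n, t)
logits = [[sum(row[i] * L[i][j] for i in range(n)) for j in range(t)]
          for row in A_mod]       # a*t result matrix
polarity = [max(range(t), key=lambda j, r=r: softmax(r)[j]) for r in logits]
```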
Those skilled in the art will appreciate that all or some of the steps of the methods disclosed above, and the functional modules/units in the systems and devices, may be implemented as software, firmware, hardware, and appropriate combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed jointly by several physical components. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage medium includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Claims (10)
1. A text sentiment classification method, comprising:
obtaining a sentence context of a text through a pre-trained language model, the pre-trained language model being used to predict one or more randomly masked words in the text;
randomly initializing an a*n angle embedding matrix, where a is the number of angles and n is the word embedding dimension, and performing an attention function calculation on the initialized angle embedding matrix and the sentence context to obtain a modified angle embedding matrix; and
performing classification discrimination according to the modified angle embedding matrix to obtain the sentiment of each angle of the text.
2. The text sentiment classification method according to claim 1, wherein the pre-trained language model comprises an embedding layer, a feature extraction layer, and a prediction layer, wherein:
the embedding layer is configured to map each word in the text to a word vector, embed a position vector for each word vector, and output the sum of the word vectors and the position vectors to the feature extraction layer;
the feature extraction layer is configured to receive the output of the embedding layer, extract high-dimensional features of the text, and output the sentence context to the prediction layer; and
the prediction layer is configured to predict the word at the masked position according to the received sentence context.
3. The text sentiment classification method according to claim 2, wherein the embedding layer comprises a forward embedding layer and a backward embedding layer, and the feature extraction layer comprises a forward feature extraction layer and a backward feature extraction layer, wherein:
the forward embedding layer is configured to map each word in the text to the left of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vectors and the position vectors to the forward feature extraction layer;
the forward feature extraction layer is configured to receive the output of the forward embedding layer, extract high-dimensional features of the text to the left of the masked position, and output the sentence context to the left of the masked position to the prediction layer;
the backward embedding layer is configured to map each word in the text to the right of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vectors and the position vectors to the backward feature extraction layer;
the backward feature extraction layer is configured to receive the output of the backward embedding layer, extract high-dimensional features of the text to the right of the masked position, and output the sentence context to the right of the masked position to the prediction layer; and
the prediction layer is specifically configured to superimpose the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and predict the word at the masked position according to the superimposed result.
4. The text sentiment classification method according to claim 1, wherein performing the attention function calculation on the initialized angle embedding matrix and the sentence context comprises:
Attention(s) = softmax(QK^T / √n) K
wherein Attention() is the attention function, s is a sentence, softmax() is the normalized exponential function, Q is the angle embedding matrix, K is the sentence context, and n is the word embedding dimension.
5. A computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs executable by one or more processors to implement the steps of the text sentiment classification method according to any one of claims 1 to 4.
6. A text sentiment classification device, comprising a processor and a memory, wherein the processor is configured to execute a program stored in the memory to implement the steps of the text sentiment classification method according to any one of claims 1 to 4.
7. A text sentiment classification device, comprising a context obtaining module, an attention calculation module, and a classification discrimination module, wherein:
the context obtaining module is configured to obtain a sentence context of a text through a pre-trained language model, the pre-trained language model being used to predict one or more randomly masked words in the text;
the attention calculation module is configured to randomly initialize an a*n angle embedding matrix, where a is the number of angles and n is the word embedding dimension, and perform an attention function calculation on the initialized angle embedding matrix and the sentence context to obtain a modified angle embedding matrix; and
the classification discrimination module is configured to perform classification discrimination according to the modified angle embedding matrix to obtain the sentiment of each angle of the text.
8. The text sentiment classification device according to claim 7, wherein the pre-trained language model comprises an embedding layer, a feature extraction layer, and a prediction layer, wherein:
the embedding layer is configured to map each word in the text to a word vector, embed a position vector for each word vector, and output the sum of the word vectors and the position vectors to the feature extraction layer;
the feature extraction layer is configured to receive the output of the embedding layer, extract high-dimensional features of the text, and output the sentence context to the prediction layer; and
the prediction layer is configured to predict the word at the masked position according to the received sentence context.
9. The text sentiment classification device according to claim 8, wherein the embedding layer comprises a forward embedding layer and a backward embedding layer, and the feature extraction layer comprises a forward feature extraction layer and a backward feature extraction layer, wherein:
the forward embedding layer is configured to map each word in the text to the left of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vectors and the position vectors to the forward feature extraction layer;
the forward feature extraction layer is configured to receive the output of the forward embedding layer, extract high-dimensional features of the text to the left of the masked position, and output the sentence context to the left of the masked position to the prediction layer;
the backward embedding layer is configured to map each word in the text to the right of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vectors and the position vectors to the backward feature extraction layer;
the backward feature extraction layer is configured to receive the output of the backward embedding layer, extract high-dimensional features of the text to the right of the masked position, and output the sentence context to the right of the masked position to the prediction layer; and
the prediction layer is specifically configured to superimpose the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and predict the word at the masked position according to the superimposed result.
10. The text sentiment classification device according to claim 7, wherein the attention calculation module performing the attention function calculation on the initialized angle embedding matrix and the sentence context comprises:
Attention(s) = softmax(QK^T / √n) K
wherein Attention() is the attention function, s is a sentence, softmax() is the normalized exponential function, Q is the angle embedding matrix, K is the sentence context, and n is the word embedding dimension.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910285262.XA CN110110323B (en) | 2019-04-10 | 2019-04-10 | Text emotion classification method and device and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110110323A true CN110110323A (en) | 2019-08-09 |
CN110110323B CN110110323B (en) | 2022-11-11 |
Family
ID=67483800
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910285262.XA Expired - Fee Related CN110110323B (en) | 2019-04-10 | 2019-04-10 | Text emotion classification method and device and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110110323B (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110704622A (en) * | 2019-09-27 | 2020-01-17 | 北京明略软件***有限公司 | Text emotion classification method and device and electronic equipment |
CN110795537A (en) * | 2019-10-30 | 2020-02-14 | 秒针信息技术有限公司 | Method, device, equipment and medium for determining improvement strategy of target commodity |
CN110837733A (en) * | 2019-10-31 | 2020-02-25 | 创新工场(广州)人工智能研究有限公司 | Language model training method and system in self-reconstruction mode and computer readable medium |
CN111241304A (en) * | 2020-01-16 | 2020-06-05 | 平安科技(深圳)有限公司 | Answer generation method based on deep learning, electronic device and readable storage medium |
CN111274807A (en) * | 2020-02-03 | 2020-06-12 | 华为技术有限公司 | Text information processing method and device, computer equipment and readable storage medium |
CN111506702A (en) * | 2020-03-25 | 2020-08-07 | 北京万里红科技股份有限公司 | Knowledge distillation-based language model training method, text classification method and device |
CN111737994A (en) * | 2020-05-29 | 2020-10-02 | 北京百度网讯科技有限公司 | Method, device and equipment for obtaining word vector based on language model and storage medium |
CN112214576A (en) * | 2020-09-10 | 2021-01-12 | 深圳价值在线信息科技股份有限公司 | Public opinion analysis method, device, terminal equipment and computer readable storage medium |
CN112214601A (en) * | 2020-10-21 | 2021-01-12 | 厦门市美亚柏科信息股份有限公司 | Social short text sentiment classification method and device and storage medium |
CN113792143A (en) * | 2021-09-13 | 2021-12-14 | 中国科学院新疆理化技术研究所 | Capsule network-based multi-language emotion classification method, device, equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2973138A1 (en) * | 2014-01-10 | 2015-07-16 | Cluep Inc. | Systems, devices, and methods for automatic detection of feelings in text |
DK201670552A1 (en) * | 2016-06-11 | 2017-09-18 | Apple Inc | Data driven natural language event detection and classification |
CN108170681A (en) * | 2018-01-15 | 2018-06-15 | 中南大学 | Text emotion analysis method, system and computer readable storage medium |
CN109543039A (en) * | 2018-11-23 | 2019-03-29 | 中山大学 | A kind of natural language sentiment analysis method based on depth network |
Also Published As
Publication number | Publication date |
---|---|
CN110110323B (en) | 2022-11-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20221111 |