CN110110323A - Text sentiment classification method and apparatus, and computer-readable storage medium - Google Patents

Text sentiment classification method and apparatus, and computer-readable storage medium

Info

Publication number
CN110110323A
Authority
CN
China
Prior art keywords
text
feature extraction
aspect
word
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910285262.XA
Other languages
Chinese (zh)
Other versions
CN110110323B (en)
Inventor
齐云飞
陈栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mininglamp Software System Co ltd
Original Assignee
Beijing Mininglamp Software System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mininglamp Software System Co ltd filed Critical Beijing Mininglamp Software System Co ltd
Priority to CN201910285262.XA
Publication of CN110110323A
Application granted
Publication of CN110110323B
Legal status: Expired - Fee Related


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

This application discloses a text sentiment classification method and apparatus, and a computer-readable storage medium. The method includes: obtaining the sentence context of a text through a pre-trained language model, where the pre-trained language model is used to predict one or more randomly masked words in the text; randomly initializing an a*n aspect embedding matrix, where a is the number of aspects and n is the word embedding dimension, and applying an attention function to the initialized aspect embedding matrix and the sentence context to obtain a corrected aspect embedding matrix; and performing classification according to the corrected aspect embedding matrix to obtain the sentiment of each aspect of the text. By obtaining the sentence context of the text through a pre-trained language model and computing an attention function between the aspect embedding matrix and the sentence context, the application can handle multi-aspect, multi-polarity sentiment analysis tasks without requiring large amounts of time for feature extraction.

Description

Text sentiment classification method and apparatus, and computer-readable storage medium
Technical field
This application relates to, but is not limited to, the field of natural language processing (Natural Language Processing, NLP) technology, and in particular to a text sentiment classification method and apparatus, and a computer-readable storage medium.
Background technique
Sentiment analysis, sometimes also called "opinion mining", is a vital task in NLP. Aspect-based sentiment mining is a finer-grained form of sentiment analysis that can provide deeper insight into opinion tendencies.
Most currently popular sentiment classification determines the sentiment polarity of a whole sentence or article; it cannot, for a given piece of text, determine the sentiment polarity of each aspect at a finer granularity. Those aspect-based sentiment classification approaches that do exist rely on syntactic analysis, linguistic feature extraction, or manually defined rules; such approaches require large amounts of time for feature extraction and demand that developers have a solid linguistic background.
Summary of the invention
This application provides a text sentiment classification method and apparatus, and a computer-readable storage medium, which can handle multi-aspect, multi-polarity sentiment analysis tasks without requiring large amounts of time for feature extraction.
This application provides a text sentiment classification method, comprising:
obtaining the sentence context of a text through a pre-trained language model, where the pre-trained language model is used to predict one or more randomly masked words in the text;
randomly initializing an a*n aspect embedding matrix, where a is the number of aspects and n is the word embedding dimension, and applying an attention function to the initialized aspect embedding matrix and the sentence context to obtain a corrected aspect embedding matrix;
performing classification according to the corrected aspect embedding matrix to obtain the sentiment of each aspect of the text.
In an exemplary embodiment, the pre-trained language model includes an embedding layer, a feature extraction layer, and a prediction layer, wherein:
the embedding layer is configured to map each word in the text to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the feature extraction layer;
the feature extraction layer is configured to receive the output of the embedding layer, extract high-dimensional features of the text, and output the sentence context to the prediction layer;
the prediction layer is configured to predict the word at the masked position according to the received sentence context.
In an exemplary embodiment, the embedding layer includes a forward embedding layer and a backward embedding layer, and the feature extraction layer includes a forward feature extraction layer and a backward feature extraction layer, wherein:
the forward embedding layer is configured to map each word in the text to the left of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the forward feature extraction layer;
the forward feature extraction layer is configured to receive the output of the forward embedding layer, extract high-dimensional features of the text to the left of the masked position, and output the sentence context to the left of the masked position to the prediction layer;
the backward embedding layer is configured to map each word in the text to the right of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the backward feature extraction layer;
the backward feature extraction layer is configured to receive the output of the backward embedding layer, extract high-dimensional features of the text to the right of the masked position, and output the sentence context to the right of the masked position to the prediction layer;
the prediction layer is specifically configured to superimpose the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and predict the word at the masked position according to the superimposed result.
In an exemplary embodiment, applying the attention function to the initialized aspect embedding matrix and the sentence context comprises:
Attention(Q, K) = softmax(QK^T/√n)·K
where Attention() is the attention function, s is a sentence, softmax() is the normalized exponential function, Q is the aspect embedding matrix, K is the sentence context of s, and n is the word embedding dimension.
This application also provides a computer-readable storage medium storing one or more programs, where the one or more programs can be executed by one or more processors to implement the steps of the text sentiment classification method described in any of the above.
This application also provides a text sentiment classification apparatus, including a processor and a memory, wherein the processor is configured to execute a program stored in the memory to implement the steps of the text sentiment classification method described in any of the above.
This application also provides a text sentiment classification apparatus, including a context obtaining module, an attention computing module, and a classification module, wherein:
the context obtaining module is configured to obtain the sentence context of a text through a pre-trained language model, where the pre-trained language model is used to predict one or more randomly masked words in the text;
the attention computing module is configured to randomly initialize an a*n aspect embedding matrix, where a is the number of aspects and n is the word embedding dimension, and apply an attention function to the initialized aspect embedding matrix and the sentence context to obtain a corrected aspect embedding matrix;
the classification module is configured to perform classification according to the corrected aspect embedding matrix to obtain the sentiment of each aspect of the text.
In an exemplary embodiment, the pre-trained language model includes an embedding layer, a feature extraction layer, and a prediction layer, wherein:
the embedding layer is configured to map each word in the text to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the feature extraction layer;
the feature extraction layer is configured to receive the output of the embedding layer, extract high-dimensional features of the text, and output the sentence context to the prediction layer;
the prediction layer is configured to predict the word at the masked position according to the received sentence context.
In an exemplary embodiment, the embedding layer includes a forward embedding layer and a backward embedding layer, and the feature extraction layer includes a forward feature extraction layer and a backward feature extraction layer, wherein:
the forward embedding layer is configured to map each word in the text to the left of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the forward feature extraction layer;
the forward feature extraction layer is configured to receive the output of the forward embedding layer, extract high-dimensional features of the text to the left of the masked position, and output the sentence context to the left of the masked position to the prediction layer;
the backward embedding layer is configured to map each word in the text to the right of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the backward feature extraction layer;
the backward feature extraction layer is configured to receive the output of the backward embedding layer, extract high-dimensional features of the text to the right of the masked position, and output the sentence context to the right of the masked position to the prediction layer;
the prediction layer is specifically configured to superimpose the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and predict the word at the masked position according to the superimposed result.
In an exemplary embodiment, the attention computing module applying the attention function to the initialized aspect embedding matrix and the sentence context comprises:
Attention(Q, K) = softmax(QK^T/√n)·K
where Attention() is the attention function, s is a sentence, softmax() is the normalized exponential function, Q is the aspect embedding matrix, K is the sentence context of s, and n is the word embedding dimension.
Compared with the related art, the text sentiment classification method and apparatus and the computer-readable storage medium of this application obtain the sentence context of a text through a pre-trained language model and compute an attention function between the aspect embedding matrix and the sentence context, and can therefore handle multi-aspect, multi-polarity sentiment analysis tasks without requiring large amounts of time for feature extraction.
Other features and advantages will be set forth in the following description, and will in part become apparent from the description or be understood by practicing the application. Other advantages of the application can be realized and obtained through the schemes described in the specification, claims, and drawings.
Detailed description of the invention
The drawings are provided to aid understanding of the technical solution of the application and constitute part of the specification. Together with the embodiments of the application, they serve to explain the technical solution of the application and do not limit it.
Fig. 1 is a flow diagram of a text sentiment classification method according to an embodiment of the present invention;
Fig. 2 is a structural diagram of a text sentiment classification apparatus according to an embodiment of the present invention.
Specific embodiment
This application describes multiple embodiments, but the description is exemplary rather than restrictive, and it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the embodiments described herein. Although many possible feature combinations are shown in the drawings and discussed in the specific embodiments, many other combinations of the disclosed features are also possible. Unless specifically restricted, any feature or element of any embodiment may be used in combination with, or substituted for, any other feature or element of any other embodiment.
This application includes and contemplates combinations with features and elements known to those of ordinary skill in the art. The embodiments, features, and elements disclosed in this application may also be combined with any general feature or element to form a unique inventive scheme defined by the claims. Any feature or element of any embodiment may also be combined with features or elements from other inventive schemes to form another unique inventive scheme defined by the claims. It should therefore be understood that any feature shown and/or discussed in this application may be implemented individually or in any suitable combination. Accordingly, the embodiments are not limited except by the appended claims and their equivalents. Furthermore, various modifications and changes may be made within the scope of protection of the appended claims.
In addition, when describing representative embodiments, the specification may present a method and/or process as a particular sequence of steps. However, to the extent that the method or process does not depend on the particular order of steps described herein, the method or process should not be limited to that particular order. As one of ordinary skill in the art will appreciate, other orders of steps are also possible. Therefore, the particular order of steps set forth in the specification should not be construed as a limitation on the claims. In addition, claims directed to the method and/or process should not be limited to performing their steps in the order written; those skilled in the art can readily appreciate that these orders may vary and still remain within the spirit and scope of the embodiments of this application.
Embodiment 1: Text sentiment classification method
As shown in Fig. 1, a text sentiment classification method according to an embodiment of the present invention includes the following steps:
Step 101: obtain the sentence context of the text through a pre-trained language model, where the pre-trained language model is used to predict one or more randomly masked words in the text;
It should be noted that deep learning has strong expressive power, but its inherent disadvantage is that it requires large quantities of labeled training samples, and labeling large quantities of high-quality training samples is very time-consuming and costly. When training a neural network from scratch, the labeling task is paramount; in practice, however, large quantities of labeled training samples are not available, nor are the time and energy to label them manually, which greatly limits the learning ability of the model. This application borrows the idea of transfer learning: a pre-trained language model is used to obtain a high-dimensional feature representation of the text, a model is then built on top of this high-dimensional feature representation, and the network is fine-tuned as a whole.
In one exemplary embodiment, the pre-trained language model includes an embedding layer, a feature extraction layer, and a prediction layer, wherein:
the embedding layer is configured to map each word in the text to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the feature extraction layer;
the feature extraction layer is configured to receive the output of the embedding layer, extract high-dimensional features of the text, and output the sentence context to the prediction layer;
the prediction layer is configured to predict the word at the masked position according to the received sentence context.
In one exemplary embodiment, the embedding layer includes a forward embedding layer and a backward embedding layer, and the feature extraction layer includes a forward feature extraction layer and a backward feature extraction layer, wherein:
the forward embedding layer is configured to map each word in the text to the left of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the forward feature extraction layer;
the forward feature extraction layer is configured to receive the output of the forward embedding layer, extract high-dimensional features of the text to the left of the masked position, and output the sentence context to the left of the masked position to the prediction layer;
the backward embedding layer is configured to map each word in the text to the right of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the backward feature extraction layer;
the backward feature extraction layer is configured to receive the output of the backward embedding layer, extract high-dimensional features of the text to the right of the masked position, and output the sentence context to the right of the masked position to the prediction layer;
the prediction layer is specifically configured to superimpose the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and predict the word at the masked position according to the superimposed result.
In one exemplary embodiment, the prediction layer includes a first fully connected layer and a first Softmax layer, wherein:
the first fully connected layer is configured to receive the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and output the result to the first Softmax layer;
the first Softmax layer is configured to receive the output of the first fully connected layer and predict the word at the masked position in the input sentence.
Step 102: randomly initialize an a*n aspect embedding matrix, where a is the number of aspects and n is the word embedding dimension; apply the attention function to the initialized aspect embedding matrix and the sentence context to obtain a corrected aspect embedding matrix;
In one exemplary embodiment, applying the attention function to the initialized aspect embedding matrix and the sentence context comprises:
Attention(Q, K) = softmax(QK^T/√n)·K
where Attention() is the attention function, s is a sentence, softmax() is the normalized exponential function, Q is the aspect embedding matrix, K is the sentence context of s, and n is the word embedding dimension.
Step 103: perform classification according to the corrected aspect embedding matrix to obtain the sentiment of each aspect of the text.
In one exemplary embodiment, step 103 specifically includes:
inputting the corrected aspect embedding matrix into a second fully connected layer L(n, t), where t is the number of polarities per aspect, to obtain an a*t result matrix; and then inputting the a*t result matrix into a second Softmax layer to obtain the final prediction result.
Embodiment 2: Computer-readable storage medium
An embodiment of the present invention also provides a computer-readable storage medium storing one or more programs, where the one or more programs can be executed by one or more processors to implement the steps of the text sentiment classification method described in any of the above.
Embodiment 3: Text sentiment classification apparatus
An embodiment of the present invention also provides a text sentiment classification apparatus, including a processor and a memory, wherein the processor is configured to execute a program stored in the memory to implement the steps of the text sentiment classification method described in any of the above.
Embodiment 4: Text sentiment classification apparatus
As shown in Fig. 2, an embodiment of the present invention also provides a text sentiment classification apparatus, including a context obtaining module 201, an attention computing module 202, and a classification module 203, wherein:
the context obtaining module 201 is configured to obtain the sentence context of a text through a pre-trained language model, where the pre-trained language model is used to predict one or more randomly masked words in the text;
the attention computing module 202 is configured to randomly initialize an a*n aspect embedding matrix, where a is the number of aspects and n is the word embedding dimension, and apply an attention function to the initialized aspect embedding matrix and the sentence context to obtain a corrected aspect embedding matrix;
the classification module 203 is configured to perform classification according to the corrected aspect embedding matrix to obtain the sentiment of each aspect of the text.
In one exemplary embodiment, the pre-trained language model includes an embedding layer, a feature extraction layer, and a prediction layer, wherein:
the embedding layer is configured to map each word in the text to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the feature extraction layer;
the feature extraction layer is configured to receive the output of the embedding layer, extract high-dimensional features of the text, and output the sentence context to the prediction layer;
the prediction layer is configured to predict the word at the masked position according to the received sentence context.
In one exemplary embodiment, the embedding layer includes a forward embedding layer and a backward embedding layer, and the feature extraction layer includes a forward feature extraction layer and a backward feature extraction layer, wherein:
the forward embedding layer is configured to map each word in the text to the left of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the forward feature extraction layer;
the forward feature extraction layer is configured to receive the output of the forward embedding layer, extract high-dimensional features of the text to the left of the masked position, and output the sentence context to the left of the masked position to the prediction layer;
the backward embedding layer is configured to map each word in the text to the right of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the backward feature extraction layer;
the backward feature extraction layer is configured to receive the output of the backward embedding layer, extract high-dimensional features of the text to the right of the masked position, and output the sentence context to the right of the masked position to the prediction layer;
the prediction layer is specifically configured to superimpose the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and predict the word at the masked position according to the superimposed result.
In one exemplary embodiment, the attention computing module 202 applying the attention function to the initialized aspect embedding matrix and the sentence context comprises:
Attention(Q, K) = softmax(QK^T/√n)·K
where Attention() is the attention function, s is a sentence, softmax() is the normalized exponential function, Q is the aspect embedding matrix, K is the sentence context of s, and n is the word embedding dimension.
Embodiment 5: Text sentiment classification method
After the training corpus is obtained, it must first be labeled. The specific labeling method is as follows:
(1) Determine the aspect types and their number. For example, taking automobile data as an illustration, the predetermined number of aspects is 5, and the aspect types are the five major categories: interior, power, cost-performance, fuel consumption, and comfort.
(2) Label every training sample with respect to each aspect, i.e., every training sample must be labeled for all aspects; if a training sample does not mention a certain aspect, the polarity of that aspect defaults to neutral. For example:
a) Training sample 1: "Acceleration off the line is sluggish, and shifting from 2nd to 3rd gear drags a little. The rear space is satisfactory: three people can sit without being crowded. The exterior and the interior are the most satisfying, really handsome. Fuel consumption is average."
b) Training sample 2: "Looks get full marks and the buttons feel great, but there is a rattle inside the car, the dual clutch is slow off the line, and the engine start-stop function cannot be permanently disabled and has to be turned off manually every time, which is not ideal. Fuel consumption is slightly high, but cost-performance is still relatively high; after all, at this price one should not demand too much."
For the two training samples above, the labeling results are shown in Table 1:

Training sample   | Interior | Power    | Cost-performance | Fuel consumption | Comfort
Training sample 1 | Positive | Negative | Neutral          | Negative         | Neutral
Training sample 2 | Positive | Negative | Positive         | Negative         | Neutral

Table 1
(3) Screen the training samples: remove every training sample whose aspects are all neutral, to speed up model training.
The specific pre-training method is as follows:
I) Unsupervised pre-training of the language model:
To give the language model better feature extraction ability, one or more words in a training sample are masked at random, and the language model is trained to predict them. For example:
Original training sample: there is a rattle inside the car, the ceiling also rattles, and the dual clutch (双离合) is slow off the line.
Pre-training sample: there is a rattle inside the car, the ceiling also rattles, and the 双[MASK]合 is slow off the line (the middle character of 双离合, "dual clutch", is masked).
Language model output: 离.
It should be noted that any word may be masked, and the language model may predict word by word. The language model learns left-to-right (Left-to-Right) and right-to-left (Right-to-Left) context representations in order to predict the masked word.
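By way of illustration, the following is a minimal sketch of this masking step, assuming a whitespace-tokenized sample and a 15% masking probability (both are assumptions of the sketch; the application does not fix these details):

```python
import random

MASK = "[MASK]"

def make_pretraining_sample(tokens, mask_prob=0.15, seed=None):
    """Randomly mask one or more tokens; return the masked sequence
    and the (position, original token) pairs the model must predict."""
    rng = random.Random(seed)
    masked, targets = [], []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets.append((i, tok))
        else:
            masked.append(tok)
    if not targets:  # guarantee at least one masked word
        i = rng.randrange(len(tokens))
        targets.append((i, tokens[i]))
        masked[i] = MASK
    return masked, targets

sample = "there is a rattle inside the car and the dual clutch is slow".split()
masked, targets = make_pretraining_sample(sample, seed=42)
```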
II) Language model definition:
To correctly predict the masked word, the language model needs to learn from the context on both the left and right sides. The two parts are identical in structure but have different parameters, and the vectors learned by the two parts are later added. Correspondingly, the left part is called the forward representation network and the right part is called the backward representation network; the two network structures are identical. It should be noted that all words to the right of the mask are invisible to the forward representation network, and all words to the left of the mask are invisible to the backward representation network.
The forward representation network consists of two parts. The first part is the forward embedding layer, which maps each word in the text to the left of the masked position to a word vector (Token Embedding) and embeds a position vector (Position Embedding) for each word vector; the sum of the Token Embedding and the Position Embedding serves as the input of the second part. The second part is the feature extraction layer, whose input is the output of the first part; this layer may be a Transformer encoder (Encoder), a convolutional neural network (Convolutional Neural Network, CNN), or a recurrent neural network (Recurrent Neural Network, RNN) such as a long short-term memory network (Long Short-Term Memory, LSTM) or a gated recurrent unit (Gated Recurrent Unit, GRU). The feature extraction layer may stack multiple layers to improve feature extraction ability (layers can be stacked as long as the input and output dimensions are identical, so that the n-dimensional output of the first layer can serve as the n-dimensional input of the second layer). In one exemplary embodiment, the second part of this application uses the Encoder part of the Transformer, with 8 layers.
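By way of illustration, the forward representation network can be sketched as follows in PyTorch; the 8 encoder layers match the exemplary embodiment, while vocab_size, n = 768, the number of attention heads, and max_len are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ForwardRepresentationNetwork(nn.Module):
    """Part 1: token + position embeddings, summed.
    Part 2: a stack of Transformer encoder layers (8 in the exemplary embodiment)."""
    def __init__(self, vocab_size=21128, n=768, num_layers=8, num_heads=8, max_len=512):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, n)   # Token Embedding
        self.pos_emb = nn.Embedding(max_len, n)        # Position Embedding
        layer = nn.TransformerEncoderLayer(d_model=n, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len), the words to the left of the masked position
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.token_emb(token_ids) + self.pos_emb(positions)  # summed, as in part 1
        return self.encoder(x)  # sentence context, (batch, seq_len, n)
```

The backward representation network would be a second instance of the same class (identical structure, separate parameters), fed the words to the right of the masked position.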
It should be noted that Google's Transformer model was originally designed for machine translation tasks. The Transformer remedies the slow training for which RNNs are most criticized, using the self-attention (Self-Attention) mechanism to achieve fast parallelism. Moreover, the Transformer can be extended to very deep depths, fully exploiting the characteristics of deep neural network (Deep Neural Networks, DNN) models and improving model accuracy.
The backward representation network likewise consists of two parts. The first part is the backward embedding layer, which maps each word in the text to the right of the masked position to a word vector and embeds a position vector for each word vector; the sum of the Token Embedding and the Position Embedding serves as the input of the second part. The second part is the feature extraction layer, whose input is the output of the first part; this layer may be a Transformer Encoder, a CNN, or an RNN (LSTM, GRU), and may stack multiple layers to improve feature extraction ability. In one exemplary embodiment, the second part of this application uses the Encoder part of the Transformer, with 8 layers.
In the third part, after the left and right sub-models output their sentence contexts, the two sentence contexts are added and fed into a fully connected layer, and a Softmax prediction is made for the word at the [MASK] position in the input sentence.
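Under the same illustrative assumptions, the third part can be sketched as: add the forward and backward sentence contexts at the [MASK] position, apply a fully connected layer, and take a Softmax over the vocabulary:

```python
import torch
import torch.nn as nn

class MaskPredictionHead(nn.Module):
    """Part 3 sketch: sum of forward/backward contexts -> FC -> Softmax over the vocabulary."""
    def __init__(self, n=768, vocab_size=21128):
        super().__init__()
        self.fc = nn.Linear(n, vocab_size)

    def forward(self, fwd_ctx: torch.Tensor, bwd_ctx: torch.Tensor) -> torch.Tensor:
        # fwd_ctx, bwd_ctx: (batch, n) sentence contexts at the [MASK] position
        combined = fwd_ctx + bwd_ctx                      # vector addition of the two parts
        return torch.softmax(self.fc(combined), dim=-1)   # distribution over the vocabulary
```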
III) Model training:
The pre-trained language model is obtained by training in the usual manner for neural network models.
This application constructs an aspect embedding matrix (Aspect Embedding) on the basis of the pre-trained language model, and models all outputs of the language model in combination with the attention mechanism (Attention Mechanism). The detailed process is as follows:
The aspect embedding matrix is constructed according to the number of aspects. For example, with the 5 aspects above, a 5*n matrix V is constructed, where n is the word embedding dimension. The aspect embedding matrix and all outputs of the second part of the pre-trained model (i.e., the sentence context) then enter the attention computation:
Attention(Q, K) = softmax(QK^T/√n)·K
where Q is the aspect embedding matrix, K is the sentence context, and n is the word embedding dimension.
The attention calculation used in this application may take the scaled dot-product (Scaled Dot-Product) form. Every row of the aspect embedding matrix is corrected according to the above equation, yielding the corrected aspect embedding matrix, which is then fed into a multi-label classifier. The classifier is a fully connected layer whose output dimension is the number of aspect polarities. For example, each aspect outputs three polarities, so 5 aspects yield a 5*3 output matrix in which each row represents the three polarities of one aspect; the optimal polarity is then selected through Softmax.
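By way of illustration, a minimal sketch of this scaled dot-product computation, assuming (consistently with the formula above) that the sentence context K also serves as the value matrix:

```python
import math
import torch

def aspect_attention(Q: torch.Tensor, K: torch.Tensor) -> torch.Tensor:
    """Attention(Q, K) = softmax(Q K^T / sqrt(n)) K.

    Q: (a, n) aspect embedding matrix.
    K: (p, n) sentence context output by the pre-trained model.
    Returns the corrected aspect embedding matrix, shape (a, n)."""
    n = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(n)  # (a, p) scaled dot products
    weights = torch.softmax(scores, dim=-1)          # attention over sentence positions
    return weights @ K                               # (a, n) corrected aspect embeddings
```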
Each aspect of the pre-trained model is trained using cross-entropy as the loss.
Specifically, this is expressed as follows:
Assume the output of the pre-trained model is M, a p*n matrix, where p is the number of words in the sentence (for example, the sentence 我喜欢吃苹果。 "I like to eat apples." has p = 7) and n is the word embedding dimension. Assume the aspect embedding matrix is A, an a*n matrix, where a is the number of aspects. Let F be the Attention function (which may take the Scaled Dot-Product form); feeding M and A into F outputs an a*n matrix, which then passes through a fully connected network L(n, t), where t is the number of polarities per aspect, to obtain an a*t result matrix. In this application t = 3. Finally, a Softmax layer yields the final prediction result.
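Putting the pieces together with the dimensions given above (p = 7, a = 5, t = 3; n = 768 is an assumed embedding dimension), the shapes flow as follows, reusing the aspect_attention sketch from above:

```python
import torch

p, n, a, t = 7, 768, 5, 3              # sentence length, embedding dim, aspects, polarities
M = torch.randn(p, n)                  # pre-trained model output (sentence context)
A = torch.randn(a, n)                  # randomly initialized aspect embedding matrix

corrected = aspect_attention(A, M)     # F(A, M): shape (5, 768)
L = torch.nn.Linear(n, t)              # fully connected network L(n, t)
result = torch.softmax(L(corrected), dim=-1)  # (5, 3): one polarity distribution per aspect
```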
Those skilled in the art will appreciate that all or some of the steps of the methods disclosed above, and the functional modules/units in the systems and apparatuses, may be implemented as software, firmware, hardware, and appropriate combinations thereof. In a hardware embodiment, the division between the functional modules/units referred to in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Claims (10)

1. A text sentiment classification method, characterized by comprising:
obtaining the sentence context of a text through a pre-trained language model, the pre-trained language model being used to predict one or more randomly masked words in the text;
randomly initializing an a*n aspect embedding matrix, where a is the number of aspects and n is the word embedding dimension, and applying an attention function to the initialized aspect embedding matrix and the sentence context to obtain a corrected aspect embedding matrix;
performing classification according to the corrected aspect embedding matrix to obtain the sentiment of each aspect of the text.
2. The text sentiment classification method according to claim 1, characterized in that the pre-trained language model includes an embedding layer, a feature extraction layer, and a prediction layer, wherein:
the embedding layer is configured to map each word in the text to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the feature extraction layer;
the feature extraction layer is configured to receive the output of the embedding layer, extract high-dimensional features of the text, and output the sentence context to the prediction layer;
the prediction layer is configured to predict the word at the masked position according to the received sentence context.
3. The text sentiment classification method according to claim 2, characterized in that the embedding layer includes a forward embedding layer and a backward embedding layer, and the feature extraction layer includes a forward feature extraction layer and a backward feature extraction layer, wherein:
the forward embedding layer is configured to map each word in the text to the left of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the forward feature extraction layer;
the forward feature extraction layer is configured to receive the output of the forward embedding layer, extract high-dimensional features of the text to the left of the masked position, and output the sentence context to the left of the masked position to the prediction layer;
the backward embedding layer is configured to map each word in the text to the right of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the backward feature extraction layer;
the backward feature extraction layer is configured to receive the output of the backward embedding layer, extract high-dimensional features of the text to the right of the masked position, and output the sentence context to the right of the masked position to the prediction layer;
the prediction layer is specifically configured to superimpose the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and predict the word at the masked position according to the superimposed result.
4. The text sentiment classification method according to claim 1, characterized in that applying the attention function to the initialized aspect embedding matrix and the sentence context comprises:
Attention(Q, K) = softmax(QK^T/√n)·K
where Attention() is the attention function, s is a sentence, softmax() is the normalized exponential function, Q is the aspect embedding matrix, K is the sentence context of s, and n is the word embedding dimension.
5. A computer-readable storage medium, characterized in that the computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of the text sentiment classification method according to any one of claims 1 to 4.
6. A text sentiment classification apparatus, characterized by including a processor and a memory, wherein: the processor is configured to execute a program stored in the memory to implement the steps of the text sentiment classification method according to any one of claims 1 to 4.
7. A text sentiment classification apparatus, characterized by including a context obtaining module, an attention computing module, and a classification module, wherein:
the context obtaining module is configured to obtain the sentence context of a text through a pre-trained language model, the pre-trained language model being used to predict one or more randomly masked words in the text;
the attention computing module is configured to randomly initialize an a*n aspect embedding matrix, where a is the number of aspects and n is the word embedding dimension, and apply an attention function to the initialized aspect embedding matrix and the sentence context to obtain a corrected aspect embedding matrix;
the classification module is configured to perform classification according to the corrected aspect embedding matrix to obtain the sentiment of each aspect of the text.
8. The text sentiment classification apparatus according to claim 7, characterized in that the pre-trained language model includes an embedding layer, a feature extraction layer, and a prediction layer, wherein:
the embedding layer is configured to map each word in the text to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the feature extraction layer;
the feature extraction layer is configured to receive the output of the embedding layer, extract high-dimensional features of the text, and output the sentence context to the prediction layer;
the prediction layer is configured to predict the word at the masked position according to the received sentence context.
9. The text sentiment classification apparatus according to claim 8, characterized in that the embedding layer includes a forward embedding layer and a backward embedding layer, and the feature extraction layer includes a forward feature extraction layer and a backward feature extraction layer, wherein:
the forward embedding layer is configured to map each word in the text to the left of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the forward feature extraction layer;
the forward feature extraction layer is configured to receive the output of the forward embedding layer, extract high-dimensional features of the text to the left of the masked position, and output the sentence context to the left of the masked position to the prediction layer;
the backward embedding layer is configured to map each word in the text to the right of the masked position to a word vector, embed a position vector for each word vector, and output the sum of the word vector and the position vector to the backward feature extraction layer;
the backward feature extraction layer is configured to receive the output of the backward embedding layer, extract high-dimensional features of the text to the right of the masked position, and output the sentence context to the right of the masked position to the prediction layer;
the prediction layer is specifically configured to superimpose the sentence contexts output by the forward feature extraction layer and the backward feature extraction layer, and predict the word at the masked position according to the superimposed result.
10. The text sentiment classification apparatus according to claim 7, characterized in that the attention computing module applying the attention function to the initialized aspect embedding matrix and the sentence context comprises:
Attention(Q, K) = softmax(QK^T/√n)·K
where Attention() is the attention function, s is a sentence, softmax() is the normalized exponential function, Q is the aspect embedding matrix, K is the sentence context of s, and n is the word embedding dimension.
CN201910285262.XA 2019-04-10 2019-04-10 Text emotion classification method and device and computer readable storage medium Expired - Fee Related CN110110323B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910285262.XA CN110110323B (en) 2019-04-10 2019-04-10 Text emotion classification method and device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910285262.XA CN110110323B (en) 2019-04-10 2019-04-10 Text emotion classification method and device and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN110110323A true CN110110323A (en) 2019-08-09
CN110110323B CN110110323B (en) 2022-11-11

Family

ID=67483800

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910285262.XA Expired - Fee Related CN110110323B (en) 2019-04-10 2019-04-10 Text emotion classification method and device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110110323B (en)



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2973138A1 (en) * 2014-01-10 2015-07-16 Cluep Inc. Systems, devices, and methods for automatic detection of feelings in text
DK201670552A1 (en) * 2016-06-11 2017-09-18 Apple Inc Data driven natural language event detection and classification
CN108170681A (en) * 2018-01-15 2018-06-15 中南大学 Text emotion analysis method, system and computer readable storage medium
CN109543039A (en) * 2018-11-23 2019-03-29 中山大学 A kind of natural language sentiment analysis method based on depth network

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110704622A (en) * 2019-09-27 2020-01-17 北京明略软件***有限公司 Text emotion classification method and device and electronic equipment
CN110795537A (en) * 2019-10-30 2020-02-14 秒针信息技术有限公司 Method, device, equipment and medium for determining improvement strategy of target commodity
CN110795537B (en) * 2019-10-30 2022-10-25 秒针信息技术有限公司 Method, device, equipment and medium for determining improvement strategy of target commodity
CN110837733A (en) * 2019-10-31 2020-02-25 创新工场(广州)人工智能研究有限公司 Language model training method and system in self-reconstruction mode and computer readable medium
CN110837733B (en) * 2019-10-31 2023-12-29 创新工场(广州)人工智能研究有限公司 Language model training method and system of self-reconstruction mode and electronic equipment
CN111241304B (en) * 2020-01-16 2024-02-06 平安科技(深圳)有限公司 Answer generation method based on deep learning, electronic device and readable storage medium
CN111241304A (en) * 2020-01-16 2020-06-05 平安科技(深圳)有限公司 Answer generation method based on deep learning, electronic device and readable storage medium
CN111274807A (en) * 2020-02-03 2020-06-12 华为技术有限公司 Text information processing method and device, computer equipment and readable storage medium
CN111506702A (en) * 2020-03-25 2020-08-07 北京万里红科技股份有限公司 Knowledge distillation-based language model training method, text classification method and device
CN111737994A (en) * 2020-05-29 2020-10-02 北京百度网讯科技有限公司 Method, device and equipment for obtaining word vector based on language model and storage medium
CN111737994B (en) * 2020-05-29 2024-01-26 北京百度网讯科技有限公司 Method, device, equipment and storage medium for obtaining word vector based on language model
CN112214576A (en) * 2020-09-10 2021-01-12 深圳价值在线信息科技股份有限公司 Public opinion analysis method, device, terminal equipment and computer readable storage medium
CN112214576B (en) * 2020-09-10 2024-02-06 深圳价值在线信息科技股份有限公司 Public opinion analysis method, public opinion analysis device, terminal equipment and computer readable storage medium
CN112214601A (en) * 2020-10-21 2021-01-12 厦门市美亚柏科信息股份有限公司 Social short text sentiment classification method and device and storage medium
CN112214601B (en) * 2020-10-21 2022-06-10 厦门市美亚柏科信息股份有限公司 Social short text sentiment classification method and device and storage medium
CN113792143B (en) * 2021-09-13 2023-12-12 中国科学院新疆理化技术研究所 Multi-language emotion classification method, device, equipment and storage medium based on capsule network
CN113792143A (en) * 2021-09-13 2021-12-14 中国科学院新疆理化技术研究所 Capsule network-based multi-language emotion classification method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN110110323B (en) 2022-11-11

Similar Documents

Publication Publication Date Title
CN110110323A (en) A kind of text sentiment classification method and device, computer readable storage medium
CN108763326B (en) Emotion analysis model construction method of convolutional neural network based on feature diversification
Alzantot et al. Generating natural language adversarial examples
CN108595632B (en) Hybrid neural network text classification method fusing abstract and main body characteristics
CN110969020B (en) CNN and attention mechanism-based Chinese named entity identification method, system and medium
CN108717439A Chinese text classification method based on the fusion of attention mechanism and feature strengthening
CN107608956A Reader emotion distribution prediction algorithm based on CNN-GRNN
CN111177374A (en) Active learning-based question and answer corpus emotion classification method and system
US20210375280A1 (en) Systems and methods for response selection in multi-party conversations with dynamic topic tracking
CN110826338B (en) Fine-grained semantic similarity recognition method for single-selection gate and inter-class measurement
CN106776713A Massive short-text clustering method based on word vector semantic analysis
CN105740236A Method and system for recognizing new Chinese sentiment words by combining writing features and sequence features
CN110427461A (en) Intelligent answer information processing method, electronic equipment and computer readable storage medium
Singh et al. AlexNet architecture based convolutional neural network for toxic comments classification
Hu et al. Multimodal DBN for predicting high-quality answers in cQA portals
CN109271537A Text-to-image generation method and system based on distillation learning
CN110415071A Automobile competing-product comparison method based on opinion mining analysis
CN109783794A Text classification method and device
CN106503616A Motor imagery EEG signal classification method based on hierarchical extreme learning machine
Mohamad Nezami et al. Towards generating stylized image captions via adversarial training
Singh Fake News Detection: a comparison between available Deep Learning techniques in vector space
CN109740151A Named entity recognition method for public security records based on iterated dilated convolutional neural networks
Sun et al. Multi-channel CNN based inner-attention for compound sentence relation classification
KR102469679B1 (en) Method and apparatus for recommending customised food based on artificial intelligence
CN111914553A (en) Financial information negative subject judgment method based on machine learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20221111