CN109325112B - Emoji-based cross-lingual sentiment analysis method and apparatus - Google Patents
Emoji-based cross-lingual sentiment analysis method and apparatus
- Publication number
- Publication number: CN109325112B (application CN201810678889.7A / CN201810678889A; also published as CN109325112A)
- Authority
- CN
- China
- Prior art keywords
- text
- language
- emoji
- vector
- characterization
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The present invention relates to an emoji-based cross-lingual sentiment analysis method and apparatus. The method comprises: 1) building word vectors from a large collection of unlabeled source-language and target-language text; 2) selecting, based on the word vectors, the unlabeled texts that contain emoji and using them to set up an emoji prediction task, thereby obtaining a sentence representation model; 3) translating a source-language corpus labeled with sentiment polarity into the target language, using the sentence representation model to obtain document representations of the original texts and their translations, and training a sentiment classification model on these document representations; 4) applying the trained sentiment classification model to new target-language text to obtain its sentiment polarity. By exploiting emoji-bearing text that is easy to crawl from social platforms, the invention performs cross-lingual sentiment analysis and alleviates the scarcity of annotated resources and their imbalance across languages.
Description
Technical field
The present invention relates to an emoji-based cross-lingual sentiment analysis method and apparatus, and belongs to the field of software technology.
Background art
In recent years, with the development of the Internet, a large amount of user-generated text has appeared online, such as blogs, microblogs, forum discussions, and comments. This abundance of user-generated text has drawn researchers' interest in automatic sentiment analysis. Since the early 2000s, sentiment analysis has become one of the most popular research topics in natural language processing, and it has been widely applied in research fields such as Web mining, data mining, information retrieval, ubiquitous computing, and human-computer interaction. Researchers' enthusiasm for sentiment analysis is largely due to its high practical value: sentiment analysis techniques have been applied to many real-world scenarios, including customer feedback and tracking, sales forecasting, product ranking, stock market prediction, opinion aggregation, and election prediction, and have produced considerable practical benefit.
However, most sentiment analysis research to date has been carried out on English text. This state of affairs is largely because early sentiment analysis work was mainly done by researchers from English-speaking countries. Those studies provided annotated corpora and benchmark datasets that made later research convenient, and researchers consequently concentrated on English text, leaving sentiment analysis on other languages stagnant. Yet according to statistics, only 25.3% of Internet users use English (https://www.internetworldstats.com/stats7.html). Other languages therefore also have huge user bases, and sentiment analysis on them is equally important. This situation has prompted a number of researchers to begin studying cross-lingual sentiment analysis, which aims to use the labeled data of a resource-rich language (the source language, usually English) to train a universal model that can also classify the sentiment of text in languages whose labeled resources are scarce (the target language, e.g., Japanese).
The key to cross-lingual sentiment analysis is to find a medium that bridges the vocabularies of the source language and the target language. Most mainstream work chooses parallel text between the source and target languages as this medium, i.e., different textual expressions of the same meaning in the two languages. Generating parallel text relies heavily on machine translation, but current translation technology often loses the sentiment information of the original sentence during translation, which makes cross-lingual sentiment analysis difficult. For example, as shown in Fig. 1, the English idiom "black sheep" is commonly used to refer to a disreputable member of a group, but after translation into Japanese only the literal semantics of the original English (a sheep that is black) remain, and the satirical sentiment is lost. In addition, although the source language (English) has relatively abundant annotated resources compared with other languages, the amount of such data is still very limited for today's deep learning algorithms, which therefore often cannot learn good vector representations of words and sentences. A new learning paradigm is thus urgently needed that can mitigate both the loss of sentiment during translation and the shortage of labeled data.
One possible solution is distant supervision. Distantly supervised learning requires researchers to manually define rules that generate weakly labeled data; by learning from a large amount of weakly labeled data, results close to those of training on genuinely labeled data can be achieved.
Summary of the invention
In view of the problems existing in the current cross-lingual sentiment analysis field, the purpose of the present invention is to exploit the wide use of emoji and provide a semi-supervised representation learning framework, in the form of a cross-lingual sentiment analysis method and apparatus.
For the cross-lingual sentiment classification problem, the chosen weak label must satisfy two properties. First, it must be widely used in every language; second, it must implicitly reveal sentiment information. Under these criteria, the present invention adopts emoji (emoticon-like pictographs) as weak labels. Because emoji have no language barrier and can be used to express different emotions, they are widely used by users of all genders and countries, and can serve as weak labels of genuine sentiment in every language. The invention therefore proposes an emoji-based representation learning method for cross-lingual sentiment analysis, which aims to use the resources of the source language (English) to train a model capable of classifying the sentiment of target-language text.
The technical solution adopted by the invention is as follows:
An emoji-based cross-lingual sentiment analysis method, whose main steps are as follows:
1. Unsupervised learning stage: build word vectors from a large collection of unlabeled source-language and target-language text.
2. Distantly supervised learning stage: based on the created word vectors, select the unlabeled texts that contain emoji and use them to set up an emoji prediction task, thereby obtaining a sentence representation model.
3. Supervised learning stage: translate the source-language corpus labeled with sentiment polarity into the target language, use the sentence representation model to obtain document representations of the original texts and their translations, and train a sentiment classification model on these document representations.
4. Sentiment classification stage: apply the trained sentiment classification model to new target-language text to obtain its sentiment polarity.
Fig. 2 shows the flow chart of the above method. The specific technical solution of each step is as follows:
1. Unsupervised learning stage
At this stage, word vectors are trained with Word2Vec on a large collection of tweets, which can be acquired through the Twitter API (https://developer.twitter.com/). Although the traditional one-hot representation can distinguish words from one another, it represents them discretely and cannot establish semantic relations between words, which increases the difficulty of later text-processing tasks. To solve this problem, the present invention uses the Word2Vec word-vector mechanism to encode each word into a continuous vector space through model training. This process only uses unlabeled corpora to capture and represent the semantic information of words, and is therefore unsupervised. In the concrete implementation, the parameters of the pre-trained word-vector model are used to initialize the word-vector representations of the general framework; these parameters are then fine-tuned in the subsequent distantly supervised learning stage and frozen in the final supervised learning stage.
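The motivation for replacing one-hot codes with dense word vectors can be illustrated with a minimal sketch. The vectors below are toy hand-set values for illustration only, not the output of the Word2Vec training described here: unlike one-hot vectors, whose pairwise similarity is always zero, dense vectors let semantically related words score higher under cosine similarity.

```python
import math

# Toy pre-trained word vectors (hypothetical values; in the method these
# would come from Word2Vec trained on large tweet corpora).
embeddings = {
    "happy": [0.9, 0.8, 0.1],
    "glad":  [0.85, 0.75, 0.2],
    "table": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Semantically close words score higher than unrelated ones,
# a relation one-hot coding cannot express.
sim_close = cosine(embeddings["happy"], embeddings["glad"])
sim_far = cosine(embeddings["happy"], embeddings["table"])
```

In a one-hot space every pair of distinct words would have similarity 0, so no later layer could exploit word relatedness; the continuous space makes it directly available.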
2. Distantly supervised learning stage
Based on the word-level representations (word vectors) created in the unsupervised learning stage, the present invention designs an emoji prediction task as a mechanism for learning representations that carry both the semantic and the sentiment information of text. In the emoji prediction task, sentences that use the same emoji are given similar representations in the vector space. Fig. 3 further illustrates the framework of the emoji prediction model. Sentence-level text encoding is performed with two bidirectional LSTM layers and one Attention layer. A skip-connection mechanism is used so that the input of the Attention layer is the word-vector layer plus the outputs of the two LSTM layers, realizing unimpeded information flow through the whole model. Finally, the output of the Attention layer is used for classification by the Softmax layer.
The bidirectional LSTM layer, the Attention layer, and the Softmax layer are introduced below in turn.
Bidirectional long short-term memory (Bi-LSTM) layer: each training sample can be expressed as (x, e), where x = [d_1, d_2, …, d_L] denotes the word-vector sequence of the text with the emoji removed, and e is the emoji originally contained in the text. At step t, the LSTM computes its node states according to the following formulas:
i^(t) = σ(U_i x^(t) + W_i h^(t-1) + b_i),
f^(t) = σ(U_f x^(t) + W_f h^(t-1) + b_f),
o^(t) = σ(U_o x^(t) + W_o h^(t-1) + b_o),
c^(t) = f^(t) ⊙ c^(t-1) + i^(t) ⊙ tanh(U_c x^(t) + W_c h^(t-1) + b_c),
h^(t) = o^(t) ⊙ tanh(c^(t)),
where x^(t), i^(t), f^(t), o^(t), c^(t) and h^(t) respectively denote the LSTM's input vector, input-gate state, forget-gate state, output-gate state, memory-cell state and hidden state at step t; W, U and b respectively denote the recurrent parameters, the input parameters and the bias terms; and the symbol ⊙ denotes element-wise product. From the model's output, the word-sequence representation vector of each sentence at step t can be obtained.
Further, to capture for each word the relevant contextual information from both the past and the future, a bidirectional LSTM is used to encode the word sequence. The representation vectors of the i-th element produced by the forward and backward LSTMs are directly concatenated into the final representation h_i:
h_i = [h_i1 ; h_i2],
where h_i1 and h_i2 are the forward and backward hidden states. The representation vector h_i obtained in this way captures both the preceding and the following context of the i-th word.
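The LSTM recurrence and the bidirectional concatenation can be sketched in pure Python. This is an illustrative simplification, not the model itself: gates and states are scalars rather than vectors, and the parameter values are arbitrary.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, U, W, b):
    # i = sigma(U_i x + W_i h + b_i), and similarly for f and o
    i = sigmoid(U["i"] * x + W["i"] * h_prev + b["i"])
    f = sigmoid(U["f"] * x + W["f"] * h_prev + b["f"])
    o = sigmoid(U["o"] * x + W["o"] * h_prev + b["o"])
    # c = f*c_prev + i*tanh(U_c x + W_c h + b_c);  h = o*tanh(c)
    c = f * c_prev + i * math.tanh(U["c"] * x + W["c"] * h_prev + b["c"])
    h = o * math.tanh(c)
    return h, c

def run_lstm(xs, U, W, b):
    h = c = 0.0
    states = []
    for x in xs:
        h, c = lstm_step(x, h, c, U, W, b)
        states.append(h)
    return states

def bilstm(xs, U, W, b):
    # Concatenate the i-th forward and backward hidden states: h_i = [h_i1, h_i2]
    fwd = run_lstm(xs, U, W, b)
    bwd = list(reversed(run_lstm(list(reversed(xs)), U, W, b)))
    return [(hf, hb) for hf, hb in zip(fwd, bwd)]

# Arbitrary illustrative parameters (one scalar per gate).
U = {"i": 0.5, "f": 0.5, "o": 0.5, "c": 0.5}
W = {"i": 0.1, "f": 0.1, "o": 0.1, "c": 0.1}
b = {"i": 0.0, "f": 0.0, "o": 0.0, "c": 0.0}
encoded = bilstm([1.0, -1.0, 0.5], U, W, b)
```

Because h = o · tanh(c) with o in (0, 1), every hidden state stays strictly inside (-1, 1), which keeps the recurrent signal bounded regardless of sequence length.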
Attention layer: as mentioned above, the word-vector layer, the forward LSTM layer and the backward LSTM layer are connected through skip-connections as the input of the Attention layer, so the i-th word of an input sentence can be represented as u_i:
u_i = [d_i, h_i1, h_i2],
where d_i, h_i1 and h_i2 respectively denote the representations of the i-th word in the word-vector layer, the forward LSTM layer and the backward LSTM layer. Since not every word contributes equally to the emoji prediction task and to sentiment classification, the present invention introduces an attention mechanism to determine the importance of each word in the current representation learning process. The Attention-layer score a_i of the i-th word is computed according to:
a_i = exp(W_a u_i) / Σ_{j=1}^{L} exp(W_a u_j),
where W_a is the parameter matrix of the Attention layer. Each sentence can then be expressed as a word sequence and further represented as the weighted average of the concatenated word representations, the weights being the attention values computed above. Specifically, the representation s of each sentence takes the following form, where L is the number of words contained in the sentence:
s = Σ_{i=1}^{L} a_i u_i.
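The attention pooling of word representations into a sentence representation can be sketched as follows. For readability the attention parameter matrix is reduced to a single weight vector w_a, an illustrative simplification.

```python
import math

def attention_pool(word_vecs, w_a):
    # Score each word representation u_i with exp(w_a . u_i), normalize the
    # scores into attention values a_i, and return the weighted average.
    scores = [math.exp(sum(w * u for w, u in zip(w_a, vec))) for vec in word_vecs]
    total = sum(scores)
    alphas = [s / total for s in scores]
    dim = len(word_vecs[0])
    sent = [sum(a * vec[k] for a, vec in zip(alphas, word_vecs)) for k in range(dim)]
    return sent, alphas

# Toy word representations (standing in for [d_i, h_i1, h_i2] concatenations).
words = [[0.2, 0.1, 0.0], [0.9, 0.8, 0.7], [0.1, 0.0, 0.1]]
sent, alphas = attention_pool(words, w_a=[1.0, 1.0, 1.0])
```

The word whose representation scores highest against w_a receives the largest attention value, so it dominates the pooled sentence vector, which is the intended effect of weighting words by importance.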
Softmax layer: the sentence representation obtained from the Attention layer is passed into the Softmax layer, which returns a probability vector Y. Each element of the probability vector Y represents the probability that the corresponding sentence contains a particular emoji. Specifically, the i-th element of the probability vector is computed as:
Y_i = exp(w_i^T s + b_i) / Σ_{k=1}^{K} exp(w_k^T s + b_k),
where T denotes matrix transposition, w_i the i-th weight parameter, b_i the i-th bias term, and K the dimension of the probability vector. After the probability vector of each sentence is obtained, cross entropy is used as the loss function and the parameters are updated by gradient descent to minimize the model's prediction error. After the parameters have been adjusted by the above distantly supervised and unsupervised learning, the vector representation of each sentence can be extracted from the output of the Attention layer.
Because the amount of data in the later supervised learning stage is limited, and to avoid overfitting caused by an excessively large number of model parameters, the sentence representation model is frozen while the final document-level vector representations are trained, and its parameters are not adjusted further.
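A minimal pure-Python sketch of the Softmax classification and cross-entropy loss used in this stage. The max-subtraction inside the softmax is a standard numerical-stability device that leaves the probabilities unchanged.

```python
import math

def softmax(logits):
    # Y_i = exp(z_i) / sum_k exp(z_k); subtracting the max does not change Y.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, gold):
    # Negative log-probability of the emoji actually contained in the text.
    return -math.log(probs[gold])

# Toy scores for K = 4 candidate emoji; class 0 is the true emoji.
probs = softmax([2.0, 0.5, 0.1, -1.0])
loss = cross_entropy(probs, gold=0)
```

Raising the logit of the true class lowers the loss, so gradient descent on this loss pushes the model toward predicting the emoji a text really contains.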
3. Supervised learning stage
After the distantly supervised learning stage, sentences with the same semantic and sentiment information in each language are mapped close together in the representation vector space. Since the problem ultimately to be solved is document-level cross-lingual sentiment analysis, a compact method that captures the effective information is still needed to represent documents. Within a document, different sentences have different degrees of importance for the sentiment expressed by the document as a whole, so a document-level attention mechanism is likewise used to aggregate the different sentences of each document. Denoting the representation of each document by r and the representations of the sentences in the document by v, r is computed by the following formulas:
β_i = exp(W_b v_i) / Σ_j exp(W_b v_j),
r = Σ_i β_i v_i,
where W_b is the weight matrix of the Attention layer and β_i is the attention value of the i-th sentence in the document. Google Translate is used to translate each source-language sample x ∈ L_S into the target language, and the vector representation of the translated text is obtained with the same method. For each labeled English text x_s and its corresponding translated text x_t, let the vector representations obtained after the above Attention layer be r_s and r_t; in the supervised learning stage they are directly concatenated into r_c = [r_s, r_t], the resulting r_c is used as the input of the final Softmax layer, and the cross-entropy loss between the network's predictions and the true labels is minimized to update the corresponding network parameters.
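The document-level aggregation and the concatenation of the two language views can be sketched as follows; as before, a plain vector w_b stands in for the attention weight matrix W_b, and all values are illustrative.

```python
import math

def doc_representation(sentence_vecs, w_b):
    # beta_i = exp(w_b . v_i) / sum_j exp(w_b . v_j);  r = sum_i beta_i v_i
    scores = [math.exp(sum(w * x for w, x in zip(w_b, v))) for v in sentence_vecs]
    total = sum(scores)
    betas = [s / total for s in scores]
    dim = len(sentence_vecs[0])
    return [sum(b * v[k] for b, v in zip(betas, sentence_vecs)) for k in range(dim)]

# r_c = [r_s, r_t]: the source-text and translated-text document vectors
# are concatenated before the final Softmax classifier.
r_s = doc_representation([[0.1, 0.2], [0.4, 0.3]], w_b=[1.0, 1.0])
r_t = doc_representation([[0.2, 0.1], [0.3, 0.5]], w_b=[1.0, 1.0])
r_c = r_s + r_t
```

Concatenating the two views lets the classifier see both the original sentiment signal and whatever survives translation, rather than relying on either alone.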
Corresponding to the above method, the present invention also provides an emoji-based cross-lingual sentiment analysis apparatus, comprising:
an unsupervised learning module, responsible for building word vectors from a large collection of unlabeled source-language and target-language text;
a distantly supervised learning module, responsible for selecting, based on the word vectors, the unlabeled texts that contain emoji and setting up an emoji prediction task with the texts containing emoji, thereby obtaining a sentence representation model;
a supervised learning module, responsible for translating the source-language corpus labeled with sentiment polarity into the target language, using the sentence representation model to obtain document representations of the original texts and their translations, and training a sentiment classification model on these document representations;
a sentiment classification module, responsible for applying the trained sentiment classification model to new target-language text to obtain its sentiment polarity.
Compared with the prior art, the beneficial effects of the present invention are as follows:
By exploiting emoji-bearing text that is easy to crawl from social platforms, the invention alleviates the scarcity of annotated resources and their imbalance across languages. Specifically, because distant supervision is used with emoji as weak sentiment labels, the demand for manually annotated corpora is reduced; and because emoji are widely used in every language, using them as the weak label for cross-lingual sentiment analysis is universal across languages.
Brief description of the drawings
Fig. 1 is a schematic diagram of a Google Translate example.
Fig. 2 is the flow chart of the method of the present invention.
Fig. 3 is the framework diagram of distantly supervised learning.
Fig. 4 is an example diagram of extracting samples from a tweet containing multiple emoji.
Specific embodiments
The method of the invention is further explained and verified below on the classical Amazon-review cross-lingual analysis task (https://www.uni-weimar.de/en/media/chairs/computer-science-department/webis/data/corpus-webis-cls-10/). The task uses English as the source language and Japanese, French and German as target languages, and for each language it contains sentiment analysis tasks in three domains: books, DVD and music. Owing to its representativeness, this task has long served as the benchmark dataset of the academic community in the field of cross-lingual sentiment analysis. To verify the method of the invention on this dataset, the model is trained according to the following steps.
First, English, Japanese, French and German tweets were crawled from Twitter and preprocessed as follows:
1) retweets were removed, to guarantee that every sentence appears in its original context;
2) tweets containing URLs were removed, to guarantee that the sentiment of every sentence depends only on its own semantics and not on external resources;
3) all tweets were tokenized and lowercased; since Japanese does not separate words with spaces, this embodiment uses the MeCab tokenizer (http://taku910.github.io/mecab) to process Japanese separately;
4) @-mentions and numbers in tweets were replaced with unified special characters;
5) words with redundant letters were restored to their original form, e.g., both "cooool" and "cooooooool" were converted to "cool".
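The filtering and normalization rules above can be sketched with standard regular expressions. The placeholder tokens `<user>` and `<number>` are hypothetical stand-ins for the "unified special characters" mentioned in the text, and tokenization (including MeCab for Japanese) is omitted.

```python
import re

def preprocess(tweet, is_retweet=False):
    """Apply the five preprocessing rules: drop retweets and URL-bearing
    tweets, lowercase, replace @-mentions and numbers with placeholder
    tokens, and restore elongated words ("cooool" -> "cool")."""
    if is_retweet:
        return None                     # rule 1: keep only original context
    if re.search(r"https?://\S+", tweet):
        return None                     # rule 2: no external resources
    text = tweet.lower()                # rule 3 (lowercasing part)
    text = re.sub(r"@\w+", "<user>", text)      # rule 4: @-mentions
    text = re.sub(r"\d+", "<number>", text)     # rule 4: numbers
    # Rule 5: collapse runs of 3+ repeated letters down to 2.
    text = re.sub(r"(\w)\1{2,}", r"\1\1", text)
    return text
```

The elongation rule is a heuristic: collapsing any run of three or more identical letters to two restores most exaggerated spellings while leaving legitimate double letters ("cool", "happy") intact.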
Word2Vec was applied to these preprocessed tweets to obtain the representation of each word in the source and target languages.
Next, the tweets containing emoji were extracted. For each language, the 64 most frequent emoji in the tweets were identified, and sentences containing none of these emoji were filtered out. In addition, some sentences contain multiple emoji; for each tweet, one sample is created for every kind of emoji it contains. For example, the sentence shown in Fig. 4(a) yields the two samples shown in Fig. 4(b) and Fig. 4(c). Using the emoji samples thus obtained, the emoji prediction task is set up and a representation model is trained separately for the source language and for the target language.
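The per-emoji sample derivation illustrated in Fig. 4 can be sketched as follows (the tokens `"E1"`, `"E2"`, `"E3"` below are placeholder emoji identifiers, not actual emoji from the dataset):

```python
def derive_samples(tweet_tokens, emoji_vocab):
    """From one tokenized tweet, create one (text-without-emoji, emoji)
    sample per distinct tracked emoji it contains, as in the Fig. 4
    example; tweets containing none of the tracked emoji yield nothing."""
    present = {t for t in tweet_tokens if t in emoji_vocab}
    if not present:
        return []                       # filtered out entirely
    text = [t for t in tweet_tokens if t not in emoji_vocab]
    return [(text, e) for e in sorted(present)]

# A tweet with two tracked emoji derives two weakly labeled samples
# sharing the same emoji-free text.
samples = derive_samples(["so", "happy", "E1", "E2"], {"E1", "E2", "E3"})
```

Each derived sample pairs the emoji-free text with one emoji label, which is exactly the (x, e) form used by the prediction task.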
Finally, the English corpus with sentiment labels was translated into the target language to form parallel text. The two sides of the parallel text were fed into the sentence representation models of the corresponding languages obtained in the previous step to produce the sentence representation of every sentence, which was used to train the supervised learning model.
The resulting supervised learning model can be used to classify the sentiment of target-language text. Table 1 reports the classification accuracy of the method of the invention on the 9 tasks of the Amazon benchmark dataset.
Table 1. Classification accuracy (%) of the method of the invention on the 9 tasks of the Amazon benchmark dataset
In the present invention, besides the Word2Vec algorithm, other classical algorithms such as GloVe can also be used to obtain word vectors in the unsupervised learning stage. In the distantly supervised learning stage, a CNN (convolutional neural network) model can be used instead of the bidirectional LSTM layers to encode text, and the number of bidirectional LSTM layers can also be adjusted.
Another embodiment of the present invention provides an emoji-based cross-lingual sentiment analysis apparatus, comprising:
an unsupervised learning module, responsible for building word vectors from a large collection of unlabeled source-language and target-language text;
a distantly supervised learning module, responsible for selecting, based on the word vectors, the unlabeled texts that contain emoji and setting up an emoji prediction task with the texts containing emoji, thereby obtaining a sentence representation model;
a supervised learning module, responsible for translating the source-language corpus labeled with sentiment polarity into the target language, using the sentence representation model to obtain document representations of the original texts and their translations, and training a sentiment classification model on these document representations;
a sentiment classification module, responsible for applying the trained sentiment classification model to new target-language text to obtain its sentiment polarity.
The above embodiments are intended only to illustrate rather than limit the technical solutions of the present invention. A person of ordinary skill in the art may modify or equivalently replace the technical solutions of the present invention without departing from its spirit and scope; the protection scope of the present invention shall be as defined in the claims.
Claims (10)
1. An emoji-based cross-lingual sentiment analysis method, characterized by comprising the following steps:
1) building word vectors from a large collection of unlabeled source-language and target-language text;
2) selecting, based on the word vectors, the unlabeled texts that contain emoji, and setting up an emoji prediction task with the texts containing emoji, thereby obtaining a sentence representation model;
3) translating a source-language corpus labeled with sentiment polarity into the target language, using the sentence representation model to obtain document representations of the original texts and their translations, and training a sentiment classification model on the document representations;
4) applying the trained sentiment classification model to new target-language text to obtain its sentiment polarity.
2. The method according to claim 1, characterized in that step 1) is an unsupervised learning stage, in which word vectors are trained with Word2Vec on a large collection of tweets.
3. The method according to claim 1, characterized in that step 2) is a distantly supervised learning stage, in which, in the emoji prediction task, sentences that use the same emoji are given similar representations in the vector space; the emoji prediction task performs sentence-level text encoding with two bidirectional LSTM layers and one Attention layer, and uses a skip-connection mechanism so that the input of the Attention layer is the word-vector layer plus the outputs of the two LSTM layers, realizing unimpeded information flow through the whole model, the output of the final Attention layer being used for classification by the Softmax layer.
4. The method according to claim 3, characterized in that the bidirectional LSTM layers compute the node states in the network according to the following formulas:
i^(t) = σ(U_i x^(t) + W_i h^(t-1) + b_i),
f^(t) = σ(U_f x^(t) + W_f h^(t-1) + b_f),
o^(t) = σ(U_o x^(t) + W_o h^(t-1) + b_o),
c^(t) = f^(t) ⊙ c^(t-1) + i^(t) ⊙ tanh(U_c x^(t) + W_c h^(t-1) + b_c),
h^(t) = o^(t) ⊙ tanh(c^(t)),
where x^(t), i^(t), f^(t), o^(t), c^(t) and h^(t) respectively denote the LSTM's input vector, input-gate state, forget-gate state, output-gate state, memory-cell state and hidden state at step t; W, U and b respectively denote the recurrent parameters, the input parameters and the bias terms; and the symbol ⊙ denotes element-wise product.
5. The method according to claim 4, characterized in that the bidirectional LSTM layers directly concatenate the representation vectors of the i-th element of the word sequence obtained by the forward and backward LSTMs into the final representation vector h_i, so that the representation vector h_i simultaneously captures the preceding and the following context of the i-th word.
6. The method according to claim 5, characterized in that the Attention layer uses an attention mechanism to determine the importance of each word in the current representation learning process, the Attention-layer score a_i of the i-th word being computed according to:
a_i = exp(W_a u_i) / Σ_{j=1}^{L} exp(W_a u_j),
where W_a is the parameter matrix of the Attention layer; u_i is the representation vector of the i-th word in the input sentence of the Attention layer, u_i = [d_i, h_i1, h_i2], d_i, h_i1 and h_i2 respectively denoting the representations of the i-th word in the word-vector layer, the forward LSTM layer and the backward LSTM layer; and L is the number of words contained in the sentence.
7. The method according to claim 6, characterized in that the Softmax layer obtains a corresponding probability vector Y from the sentence representation produced by the Attention layer; each element of the probability vector Y represents the probability that the corresponding sentence contains a particular emoji; after the probability vector of each sentence is obtained, cross entropy is used as the loss function and the parameters are updated by gradient descent to minimize the model's prediction error.
8. The method according to claim 7, characterized in that the i-th element of the probability vector Y is computed according to:
Y_i = exp(w_i^T s + b_i) / Σ_{k=1}^{K} exp(w_k^T s + b_k),
where T denotes matrix transposition, w_i the i-th weight parameter, b_i the i-th bias term, and K the dimension of the probability vector; and s is the representation of the sentence.
9. The method according to claim 1, characterized in that step 3) is a supervised learning stage, in which a document-level Attention layer is used to aggregate the different sentences in each document; denoting the representation of each document by r and the representations of the sentences in the document by v, r is computed by the following formulas:
β_i = exp(W_b v_i) / Σ_j exp(W_b v_j),
r = Σ_i β_i v_i,
where W_b is the weight matrix of the Attention layer and β_i is the attention value of the i-th sentence in the document; each source-language sample x ∈ L_S is translated into the target language and the vector representation of the translated text is obtained; for each labeled English text x_s and its corresponding translated text x_t, let the vector representations obtained after the above Attention layer be r_s and r_t; in the supervised learning stage they are directly concatenated into r_c = [r_s, r_t], the resulting r_c is used as the input of the final Softmax layer, and the cross-entropy loss between the network's predictions and the true labels is minimized to update the corresponding network parameters.
10. An emoji-based cross-lingual sentiment analysis apparatus, characterized by comprising:
an unsupervised learning module, responsible for building word vectors from a large collection of unlabeled source-language and target-language text;
a distantly supervised learning module, responsible for selecting, based on the word vectors, the unlabeled texts that contain emoji and setting up an emoji prediction task with the texts containing emoji, thereby obtaining a sentence representation model;
a supervised learning module, responsible for translating a source-language corpus labeled with sentiment polarity into the target language, using the sentence representation model to obtain document representations of the original texts and their translations, and training a sentiment classification model on the document representations;
a sentiment classification module, responsible for applying the trained sentiment classification model to new target-language text to obtain its sentiment polarity.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810678889.7A CN109325112B (en) | 2018-06-27 | 2018-06-27 | A kind of across language sentiment analysis method and apparatus based on emoji |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109325112A CN109325112A (en) | 2019-02-12 |
CN109325112B true CN109325112B (en) | 2019-08-20 |
Family
ID=65263553
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810678889.7A Active CN109325112B (en) | 2018-06-27 | 2018-06-27 | Emoji-based cross-lingual sentiment analysis method and apparatus
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109325112B (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110134962A (en) * | 2019-05-17 | 2019-08-16 | 中山大学 | Cross-lingual plain-text irony recognition method based on intra-attention
CN112084295A (en) * | 2019-05-27 | 2020-12-15 | 微软技术许可有限责任公司 | Cross-language task training |
CN110309268B (en) * | 2019-07-12 | 2021-06-29 | 中电科大数据研究院有限公司 | Cross-language information retrieval method based on concept graph |
US11694042B2 (en) * | 2020-06-16 | 2023-07-04 | Baidu Usa Llc | Cross-lingual unsupervised classification with multi-view transfer learning |
CN112348257A (en) * | 2020-11-09 | 2021-02-09 | 中国石油大学(华东) | Election prediction method driven by multi-source data fusion and time sequence analysis |
CN113032559B (en) * | 2021-03-15 | 2023-04-28 | 新疆大学 | Language model fine tuning method for low-resource adhesive language text classification |
CN112860901A (en) * | 2021-03-31 | 2021-05-28 | 中国工商银行股份有限公司 | Emotion analysis method and device integrating emotion dictionaries |
CN113919340A (en) * | 2021-08-27 | 2022-01-11 | 北京邮电大学 | Self-media language emotion analysis method based on unsupervised unknown word recognition |
CN113761204B (en) * | 2021-09-06 | 2023-07-28 | 南京大学 | Emoji text emotion analysis method and system based on deep learning |
CN113792143B (en) * | 2021-09-13 | 2023-12-12 | 中国科学院新疆理化技术研究所 | Multi-language emotion classification method, device, equipment and storage medium based on capsule network |
CN114429143A (en) * | 2022-01-14 | 2022-05-03 | 东南大学 | Cross-language attribute level emotion classification method based on enhanced distillation |
CN116108859A (en) * | 2023-03-17 | 2023-05-12 | 美云智数科技有限公司 | Emotional tendency determination, sample construction and model training methods, devices and equipment |
CN116561325B (en) * | 2023-07-07 | 2023-10-13 | 中国传媒大学 | Multi-language fused media text emotion analysis method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103488623A (en) * | 2013-09-04 | 2014-01-01 | 中国科学院计算技术研究所 | Multilingual text data classification processing method
CN105068988A (en) * | 2015-07-21 | 2015-11-18 | 中国科学院自动化研究所 | Multi-dimension multi-granularity sentiment analysis method
CN106326214A (en) * | 2016-08-29 | 2017-01-11 | 中译语通科技(北京)有限公司 | Method and device for cross-lingual sentiment analysis based on transfer learning
CN107305539A (en) * | 2016-04-18 | 2017-10-31 | 南京理工大学 | Text orientation analysis method based on Word2Vec sentiment new-word discovery
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20170030570A (en) * | 2014-07-07 | 2017-03-17 | 머신 존, 인크. | System and method for identifying and suggesting emoticons |
US20160132607A1 (en) * | 2014-08-04 | 2016-05-12 | Media Group Of America Holdings, Llc | Sorting information by relevance to individuals with passive data collection and real-time injection |
CN107729320B (en) * | 2017-10-19 | 2021-04-13 | 西北大学 | Emoticon recommendation method based on time-series analysis of user session sentiment trend |
- 2018-06-27: CN CN201810678889.7A patent/CN109325112B/en, status: Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103488623A (en) * | 2013-09-04 | 2014-01-01 | 中国科学院计算技术研究所 | Multilingual text data classification processing method
CN105068988A (en) * | 2015-07-21 | 2015-11-18 | 中国科学院自动化研究所 | Multi-dimension multi-granularity sentiment analysis method
CN107305539A (en) * | 2016-04-18 | 2017-10-31 | 南京理工大学 | Text orientation analysis method based on Word2Vec sentiment new-word discovery
CN106326214A (en) * | 2016-08-29 | 2017-01-11 | 中译语通科技(北京)有限公司 | Method and device for cross-lingual sentiment analysis based on transfer learning
Also Published As
Publication number | Publication date |
---|---|
CN109325112A (en) | 2019-02-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109325112B (en) | Emoji-based cross-lingual sentiment analysis method and apparatus | |
CN106980683B (en) | Deep learning-based blog text summarization method | |
Dashtipour et al. | Exploiting deep learning for Persian sentiment analysis | |
CN109670039B (en) | Semi-supervised e-commerce review sentiment analysis method based on tripartite graph and cluster analysis | |
CN108875051A | Automatic knowledge graph construction method and system for massive unstructured text | |
CN110929030A | Joint training method for text summarization and sentiment classification | |
CN107247702A | Text sentiment analysis and processing method and system | |
Lin et al. | Automatic translation of spoken English based on improved machine learning algorithm | |
CN109472026A | Method for accurately extracting sentiment information for multiple named entities simultaneously | |
CN110390018A | LSTM-based social network comment generation method | |
Wu et al. | Sentiment classification using attention mechanism and bidirectional long short-term memory network | |
CN111639176B (en) | Real-time event summarization method based on consistency monitoring | |
Zhang et al. | A BERT fine-tuning model for targeted sentiment analysis of Chinese online course reviews | |
Jian et al. | [Retracted] LSTM‐Based Attentional Embedding for English Machine Translation | |
Guo et al. | Who is answering whom? Finding “Reply-To” relations in group chats with deep bidirectional LSTM networks | |
CN113934835B (en) | Retrieval type reply dialogue method and system combining keywords and semantic understanding representation | |
CN115129807A (en) | Fine-grained classification method and system for social media topic comments based on self-attention | |
Lei et al. | An input information enhanced model for relation extraction | |
CN105095302B (en) | Word-of-mouth-oriented analysis and inspection system, device and method | |
CN116662924A (en) | Aspect-level multi-mode emotion analysis method based on dual-channel and attention mechanism | |
Meng et al. | Regional bullying text recognition based on two-branch parallel neural networks | |
Pradhan et al. | A multichannel embedding and arithmetic optimized stacked Bi-GRU model with semantic attention to detect emotion over text data | |
He et al. | Distant supervised relation extraction via long short term memory networks with sentence embedding | |
Zhang et al. | Conditional pre‐trained attention based Chinese question generation | |
Fu et al. | A study on recursive neural network based sentiment classification of Sina Weibo |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||