CN107967258B - Method and system for emotion analysis of text information - Google Patents

Method and system for emotion analysis of text information

Info

Publication number
CN107967258B
Authority
CN
China
Prior art keywords
word
vector
emotion
words
word vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711183201.XA
Other languages
Chinese (zh)
Other versions
CN107967258A (en)
Inventor
张毅
黄宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ai Media Consulting (Guangzhou) Co.,Ltd.
Original Assignee
Guangzhou Iimedia Information Consulting Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Iimedia Information Consulting Co ltd filed Critical Guangzhou Iimedia Information Consulting Co ltd
Priority to CN201711183201.XA priority Critical patent/CN107967258B/en
Publication of CN107967258A publication Critical patent/CN107967258A/en
Application granted granted Critical
Publication of CN107967258B publication Critical patent/CN107967258B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a method and a system for analyzing emotion of text information, wherein keywords and context associated words of the keywords are extracted from acquired text information, the keywords and the context associated words are analyzed through a preset word vector analysis model to obtain first word vectors of the keywords, then second word vectors of the emotion words are obtained, and emotion values of the text information are obtained according to the first word vectors of the keywords and the second word vectors of the emotion words. In the scheme, the word vector analysis model analyzes the keywords and the context related words, the obtained first word vector of the keywords not only expresses the characteristics of the keywords, but also considers the characteristics of the context related words related to the keywords, accurately reflects the emotional characteristics of the keywords in the text information, and can obtain the emotional value as the emotional tendency of the text information by combining the second word vector of the emotional words, so that an accurate basis is provided for the further processing of the text information.

Description

Method and system for emotion analysis of text information
Technical Field
The invention relates to the technical field of data analysis, in particular to a method and a system for emotion analysis of text information.
Background
With the rapid development of the Internet, the network has become the main channel through which people obtain information. The network is flooded with information of every kind, so sorting that information is essential. For example, public comments posted online about social events, trending public figures and e-commerce products need to be sorted; these comments are extremely varied, and they express the public's attitudes toward the objects being commented on, attitudes that can be represented by specific emotions.
At present, emotion analysis of information generally analyzes one specific word in the text information and judges the emotion of the whole text from it. Because the same word expresses different emotions in different textual contexts, the accuracy of emotion analysis based on a single specific word is low.
Disclosure of Invention
Therefore, it is necessary to provide a method and a system for emotion analysis of text information that address the low accuracy of conventional emotion analysis based on a single specific word.
A method for emotion analysis of text information comprises the following steps:
extracting keywords and context associated words of the keywords from the text information;
analyzing the keywords and the context associated words according to a preset word vector analysis model to obtain a first word vector of the keywords;
and acquiring the emotion value of the text information according to the first word vector and the second word vector, wherein the second word vector is a pre-stored word vector of the emotion words.
An emotion analysis system for text information, comprising:
the word acquisition unit is used for extracting keywords and context associated words of the keywords from the text information;
the word vector analysis unit is used for analyzing the keywords and the context associated words according to a preset word vector analysis model to obtain first word vectors of the keywords;
and the emotion value acquisition unit is used for acquiring the emotion value of the text information according to the first word vector and the second word vector, wherein the second word vector is a pre-stored word vector of the emotion words.
According to the method and the system for analyzing the emotion of the text information, the keywords and the context associated words of the keywords are extracted from the acquired text information, the keywords and the context associated words are analyzed through a preset word vector analysis model, a first word vector of the keywords is acquired, a second word vector of the emotion words is acquired, and the emotion value of the text information is obtained according to the first word vector of the keywords and the second word vector of the emotion words. In the scheme, the word vector analysis model analyzes the keywords and the context related words, the obtained first word vector of the keywords not only expresses the characteristics of the keywords, but also considers the characteristics of the context related words related to the keywords, accurately reflects the emotional characteristics of the keywords in the text information, and can obtain the emotional value as the emotional tendency of the text information by combining the second word vector of the emotional words, so that an accurate basis is provided for the further processing of the text information.
A readable storage medium, on which an executable program is stored, which when executed by a processor implements the steps of the method for emotion analysis of text information as described above.
An analysis device comprises a memory, a processor and an executable program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the steps of the emotion analysis method of the text information.
In accordance with the above emotion analysis method for text information, the invention also provides a readable storage medium and an analysis device. The keywords and the context associated words are analyzed through the word vector analysis model, so the first word vector obtained for a keyword expresses not only the characteristics of the keyword but also those of the context associated words related to it, and accurately reflects the emotional characteristics of the keyword in the text information. Combined with the second word vectors of the emotion words, an emotion value can be obtained as the emotional tendency of the text information, which provides an accurate basis for further processing of the text information.
Drawings
FIG. 1 is a flowchart illustrating a method for emotion analysis of text information according to an embodiment;
FIG. 2 is a schematic structural diagram of a system for emotion analysis of text information according to an embodiment;
FIG. 3 is a schematic structural diagram of a system for emotion analysis of text information according to an embodiment;
FIG. 4 is a simplified diagram of a model training process according to one embodiment;
FIG. 5 is a diagram illustrating a process of modifying an intermediate node vector by a Huffman tree according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the invention and do not limit its scope of protection.
Fig. 1 is a schematic flow chart of a method for emotion analysis of text information according to an embodiment of the present invention. The emotion analysis method for the text information in the embodiment comprises the following steps:
step S110: extracting keywords and context associated words of the keywords from the text information;
in this step, a keyword may be a word that directly expresses an emotion or a word that occurs with high frequency in the text; a context associated word is a word in the text passage that is related to the keyword, is located before or after the keyword in the text, and reflects the language environment in which the keyword occurs;
step S120: analyzing the keywords and the context associated words according to a preset word vector analysis model to obtain a first word vector of the keywords;
in this step, the first word vector is a vector value that can be identified and calculated corresponding to the keyword;
step S130: and acquiring the emotion value of the text information according to the first word vector and the second word vector, wherein the second word vector is a pre-stored word vector of the emotion words.
In this step, the emotion value of the text information can be obtained through the relationship between the first word vector of the keyword and the second word vector of the emotion word.
In this embodiment, the keywords and the context associated words of the keywords are extracted from the obtained text information, the keywords and the context associated words are analyzed through a preset word vector analysis model, a first word vector of the keywords is obtained, then a second word vector of the emotion words is obtained, and the emotion value of the text information is obtained according to the first word vector of the keywords and the second word vector of the emotion words. In the scheme, the word vector analysis model analyzes the keywords and the context related words, the obtained first word vector of the keywords not only expresses the characteristics of the keywords, but also considers the characteristics of the context related words related to the keywords, accurately reflects the emotional characteristics of the keywords in the text information, and can obtain the emotional value as the emotional tendency of the text information by combining the second word vector of the emotional words, so that an accurate basis is provided for the further processing of the text information.
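Step S110 can be illustrated with a minimal sketch. The patent does not state how context associated words are selected, so the fixed-size window, the function name `extract_context` and the sample tokens below are assumptions made purely for illustration.

```python
def extract_context(tokens, keyword, window=2):
    """For each occurrence of `keyword`, collect the words located up to
    `window` positions before and after it (an assumed selection rule)."""
    contexts = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            before = tokens[max(0, i - window):i]
            after = tokens[i + 1:i + 1 + window]
            contexts.append(before + after)
    return contexts


# Hypothetical, already word-cut comment text.
tokens = ["the", "football", "match", "was", "really", "exciting", "tonight"]
contexts = extract_context(tokens, "match", window=2)
```

Each returned list holds the words above and below one occurrence of the keyword; these would be passed to the word vector analysis model together with the keyword.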
It should be noted that there may be a plurality of keywords, and when there are a plurality of keywords, an emotion value may be obtained for each keyword, and then emotion values corresponding to all keywords are synthesized, so as to accurately obtain an emotion value of text information.
Further, the process of obtaining the emotion value of the text information according to the first word vector and the second word vector may be to perform vector distance calculation on the first word vector and the second word vector to obtain the emotion value of the text information.
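The text names a "vector distance calculation" without fixing a particular metric. The sketch below assumes cosine similarity, a common choice for word vectors; the function name and the sample vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


first_word_vector = [0.2, 0.7, -0.1]   # hypothetical keyword vector
second_word_vector = [0.1, 0.9, 0.0]   # hypothetical emotion word vector
emotion_value = cosine_similarity(first_word_vector, second_word_vector)
```

With this metric a value near 1 means the keyword vector points in almost the same direction as the emotion word vector; another distance, such as Euclidean distance, could be substituted under the same interface.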
In one embodiment, the step of analyzing the keywords and the context-related words according to a preset word vector analysis model further comprises the following steps:
establishing a binary neural network model, obtaining an information corpus to be trained, training the binary neural network model by taking the information corpus as a training sample, and obtaining a preset word vector analysis model.
In this embodiment, the information corpus is a set including a plurality of words, and the obtained information corpus can be used as a training sample to train the established binary neural network model, so that the binary neural network model continuously learns itself and is converted into a word vector analysis model capable of analyzing words and obtaining word vectors.
Optionally, the binary neural network model may be a CBOW model (Continuous Bag-of-Words Model) or a Skip-gram model (Continuous Skip-gram Model).
In one embodiment, the step of training the binary neural network model by using the information corpus as a training sample comprises the following steps:
selecting a target word and a related word from the information corpus, initializing an original word vector of the target word and the related word, analyzing the original word vector of the related word through a binary neural network model to obtain an error vector of the original word vector of the target word, and correcting the original word vector of the target word according to the error vector of the original word vector of the target word.
In this embodiment, when the information corpus is used as a training sample, a target word and related words may be selected. A related word is a word associated with the target word in a given language environment, and the relationship between the target word and its related words is analogous to the relationship between a keyword and its context associated words in text information. During training, the initialized original word vectors of the related words are analyzed to obtain an error vector for the original word vector of the target word, and the original word vector of the target word is corrected with that error vector. Repeating this correction over many training passes strengthens the binary neural network model, so that the final word vector analysis model can analyze an input keyword and its context associated words and accurately obtain the first word vector of the keyword.
In one embodiment, the related words are multiple, and the step of analyzing the original word vectors of the related words through the binary neural network model comprises the following steps:
adding the original word vectors of all related words to obtain a sum vector;
constructing a Huffman tree of a binary neural network model by taking the target word and each related word as leaf nodes, acquiring a path from a root node of the Huffman tree to the leaf node corresponding to the target word, and classifying corresponding intermediate nodes according to the sum vector and the vector of the intermediate nodes in the path;
if the classification result of the current intermediate node is different from the trend of the path, correcting the vector of the current intermediate node according to the trend of the path, and acquiring an error vector of the current intermediate node;
and adding the error vectors of all the intermediate nodes to obtain the error vector of the original word vector of the target word.
In this embodiment, there may be a plurality of related words. A Huffman tree of the binary neural network model is constructed with the target word and each related word as leaf nodes, which yields a path from the root node to the leaf node corresponding to the target word. The original word vectors of the related words are added to obtain a sum vector, each intermediate node on the path is classified according to the sum vector and that node's vector, and the node's vector is corrected according to the classification result. The sum of the error vectors obtained while correcting the intermediate nodes is the error vector of the original word vector of the target word. Correcting the original word vector of the target word with this error vector makes the word vector of the target word reflect the information of the related words, and therefore makes it more accurate.
It should be noted that, when constructing the Huffman tree of the binary neural network model, the vectors of the non-leaf nodes of the Huffman tree may be initialized; optionally, the initial value of the vectors of the non-leaf nodes may be a zero vector.
Optionally, when the intermediate nodes are classified according to the sum vector of each related word and the vector of the intermediate node in the path, a logistic regression classification method or other types of regression classification methods may be used.
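The construction of the Huffman tree and the zero initialization of its non-leaf node vectors can be sketched as follows. The use of word frequencies as leaf weights, the function name `build_huffman_codes`, the sample frequencies and the vector dimension are assumptions; the bit convention (1 for a left turn, 0 for a right turn) follows the worked example later in the description.

```python
import heapq
import itertools


def build_huffman_codes(frequencies):
    """Build a Huffman tree over the words and return
    ({word: (code, ids of intermediate nodes on the root-to-leaf path)},
     number of intermediate nodes)."""
    tiebreak = itertools.count()  # keeps heap entries comparable
    heap = [(freq, next(tiebreak), ("leaf", word))
            for word, freq in frequencies.items()]
    heapq.heapify(heap)
    next_id = 0
    while len(heap) > 1:
        w1, _, n1 = heapq.heappop(heap)
        w2, _, n2 = heapq.heappop(heap)
        node = ("inner", next_id, n1, n2)  # n1 is the left child
        next_id += 1
        heapq.heappush(heap, (w1 + w2, next(tiebreak), node))
    root = heap[0][2]

    codes = {}

    def walk(node, code, path):
        if node[0] == "leaf":
            codes[node[1]] = (code, path)
            return
        _, node_id, left, right = node
        walk(left, code + "1", path + [node_id])   # 1 = turn left
        walk(right, code + "0", path + [node_id])  # 0 = turn right

    walk(root, "", [])
    return codes, next_id


# Hypothetical word frequencies and vector dimension.
frequencies = {"football": 5, "match": 3, "goal": 2, "team": 1}
codes, n_inner = build_huffman_codes(frequencies)
dim = 4
inner_vectors = [[0.0] * dim for _ in range(n_inner)]  # zero-initialized
```

Frequent words end up close to the root, so their paths pass through fewer intermediate nodes and need fewer classifications per training step.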
In one embodiment, the step of obtaining the information corpus to be trained includes the following steps:
and acquiring a network data text, filtering noise information of the network data text, and cutting words to generate an information corpus to be trained.
In this embodiment, words from network data text can be used as the information corpus to be trained. Because the words in such a corpus are highly relevant to the keywords in the text information, the training accuracy of the word vector analysis model can be improved. Filtering noise information out of the network data text removes content irrelevant to the words used for model training, which makes word cutting easier and yields an effective information corpus.
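A minimal sketch of noise filtering followed by word cutting is given below. The patent does not define what counts as noise; treating markup tags, URLs and user mentions as noise is an assumption, as are the function name and the sample comment. The regular-expression tokenizer only handles space-delimited text, so Chinese text would need a dedicated word segmenter in its place.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")         # markup remnants
URL_RE = re.compile(r"https?://\S+")    # links
MENTION_RE = re.compile(r"@\w+")        # user mentions
TOKEN_RE = re.compile(r"[A-Za-z']+")    # simple word pattern (assumption)


def clean_and_cut(text):
    """Remove assumed noise patterns, then cut the text into lowercase words."""
    for pattern in (TAG_RE, URL_RE, MENTION_RE):
        text = pattern.sub(" ", text)
    return [tok.lower() for tok in TOKEN_RE.findall(text)]


# Hypothetical crawled comment.
raw = "<p>Great match!!! see https://example.com/x @fan123</p>"
tokens = clean_and_cut(raw)
```

The resulting word lists from all collected comments would together form the information corpus to be trained.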
In one embodiment, the emotion analyzing method for text information further comprises the following steps:
and analyzing the emotional words according to the word vector analysis model, acquiring a second word vector of the emotional words and storing the second word vector.
In this embodiment, the second word vector of an emotion word may also be obtained through the word vector analysis model. When the information corpus to be trained is rich enough, it will also contain emotion words; taking an emotion word as the target word, the emotion word can be analyzed to obtain its second word vector. The second word vector of the emotion word can be stored in advance, before the keyword and the context associated words are analyzed according to the word vector analysis model.
In one embodiment, the step of obtaining the emotion value of the text message according to the first word vector and the second word vector comprises the following steps:
and respectively acquiring relative values of different emotion words corresponding to the text information according to the first word vector and the second word vectors of different emotion words, and taking the maximum relative value as the emotion value of the text information.
In this embodiment, there may be a plurality of emotion words, a plurality of relative values corresponding to different emotion words may be obtained according to the first word vector of the keyword and the second word vector of different emotion words, and the largest relative value may be selected as the emotion value of the text information, so that the emotion value of the text information matches the characteristics of the text information itself.
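Selecting the maximum relative value over several emotion words can be sketched as follows. Cosine similarity is assumed as the relative value, and the emotion word vectors and keyword vector are made-up numbers for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Assumed relative value: cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def emotion_of_keyword(first_word_vector, emotion_vectors):
    """Return the emotion word with the largest relative value,
    that value, and all relative values."""
    relative_values = {
        emotion: cosine_similarity(first_word_vector, vec)
        for emotion, vec in emotion_vectors.items()
    }
    best = max(relative_values, key=relative_values.get)
    return best, relative_values[best], relative_values


# Hypothetical pre-stored second word vectors and a keyword's first word vector.
emotion_vectors = {"joy": [0.9, 0.1], "anger": [-0.8, 0.3], "calm": [0.1, 0.9]}
best, emotion_value, relative_values = emotion_of_keyword([0.8, 0.2], emotion_vectors)
```

Here `best` names the emotion word closest to the keyword and `emotion_value` is the corresponding maximum relative value.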
Furthermore, if a plurality of keywords exist, statistical analysis can be performed on the relative values of all the keywords, and a plurality of relative values are selected as the emotion values of the text information according to a preset proportion.
Optionally, the categories of emotion words may be joy, anger, sorrow, happiness, sadness, fear, disgust, surprise, calm, disappointment, excitement, and so on. The emotion words of each category may also have different forms of expression.
The present invention also provides a text information emotion analysis system according to the text information emotion analysis method, and an embodiment of the text information emotion analysis system of the present invention will be described in detail below.
Fig. 2 is a schematic structural diagram of a system for emotion analysis of text information according to an embodiment of the present invention. The emotion analysis system for text information in this embodiment includes:
a word obtaining unit 210 configured to extract a keyword and a context related word of the keyword from text information;
the word vector analysis unit 220 is configured to analyze the keyword and the context associated word according to a preset word vector analysis model to obtain a first word vector of the keyword;
the emotion value obtaining unit 230 is configured to obtain an emotion value of the text information according to the first word vector and a second word vector, where the second word vector is a word vector of a pre-stored emotion word.
In this embodiment, as shown in fig. 3, the emotion analysis system for text information further includes a model establishing unit 240, configured to establish a binary neural network model, obtain an information corpus to be trained, train the binary neural network model by using the information corpus as a training sample, and obtain a preset word vector analysis model.
In one embodiment, the model building unit 240 selects a target word and a related word from the information corpus, initializes an original word vector of the target word and the related word, analyzes the original word vector of the related word through a binary neural network model, obtains an error vector of the original word vector of the target word, and corrects the original word vector of the target word according to the error vector of the original word vector of the target word.
In one embodiment, the related words are multiple, and the model building unit 240 adds the original word vectors of the related words to obtain a sum vector; constructing a Huffman tree of a binary neural network model by taking the target word and each related word as leaf nodes, acquiring a path from a root node of the Huffman tree to the leaf node corresponding to the target word, and performing logistic classification on corresponding intermediate nodes according to the sum vector and the vector of the intermediate nodes in the path; if the classification result of the current intermediate node is different from the trend of the path, correcting the vector of the current intermediate node according to the trend of the path, and acquiring an error vector of the current intermediate node; and adding the error vectors of all the intermediate nodes to be used as the error vector of the original word vector of the target word.
In one embodiment, the model building unit 240 obtains the web data text, performs noise information filtering on the web data text, and cuts words to generate the information corpus to be trained.
In one embodiment, the emotion value obtaining unit 230 analyzes the emotion words according to the word vector analysis model, obtains a second word vector, and stores the second word vector.
In one embodiment, the emotion value acquisition unit 230 acquires relative values of different emotion words corresponding to the text information according to the first word vector and the second word vectors of the different emotion words, and takes the maximum relative value as the emotion value of the text information.
The emotion analysis system for text information and the emotion analysis method for text information correspond to each other one by one, and technical features and beneficial effects thereof described in the embodiment of the emotion analysis method for text information are all applicable to the embodiment of the emotion analysis system for text information.
The terms "first," "second," and the like are used merely to distinguish one element from another, and do not limit the other elements.
According to the emotion analysis method of the text information, the embodiment of the invention also provides a readable storage medium and analysis equipment.
The readable storage medium stores an executable program, and the program realizes the steps of the emotion analysis method of the text information when being executed by a processor; the analysis device comprises a memory, a processor and an executable program which is stored on the memory and can run on the processor, and the processor realizes the steps of the emotion analysis method of the text information when executing the program.
In a specific embodiment, the scheme of the embodiment of the invention can be applied to scenes such as sentiment analysis of network comment information.
To reflect the attitude of netizens toward a hot event, the comments posted by netizens under news reports of the event can be selected as the corpus. A dedicated crawler module can collect the comment information on the event and store it in a database; the comment information is then passed to a text processing module, which filters noise information and cuts words to generate the corpus to be trained.
In this embodiment, the corpus is processed in the Word2vec manner and trained with a binary neural network model. The training of each word depends on the words that share its context, so context information is taken into account, and all words are trained into word vectors in the same space. Using a dedicated emotion word bank (a word bank built from a coarse-grained grouping of words that express emotion), the value obtained by a vector distance calculation represents the emotion value of a target word, which realizes emotion analysis and calculation that take context information into account.
A word vector, as the name implies, uses a vector to express a word. A machine cannot understand the meaning of a word the way a human does, so words are converted into word vectors that the machine can recognize and compute with. Word2vec is a scheme for converting text into reasonable word vectors, and the training model it uses may be CBOW (Continuous Bag-of-Words Model) or Skip-gram (Continuous Skip-gram Model). Taking CBOW as an example, the model is based on a Huffman tree, in which the initial value of the intermediate vector stored in each non-leaf node may be a zero vector, and the initialization of the word vector of the word corresponding to each leaf node is related to the position and occurrence frequency of that word in the text information. The training process is shown in FIG. 4:
There are three main stages: the input layer (input), the mapping layer (projection) and the output layer (output). The input layer consists of the word vectors of the n-1 words around a certain word A. If n is 5, the two words before and the two words after word A (denoted w(t)) are w(t-2), w(t-1), w(t+1) and w(t+2), and the word vectors of these 4 words are denoted v(w(t-2)), v(w(t-1)), v(w(t+1)) and v(w(t+2)). Going from the input layer to the mapping layer is simple: the n-1 word vectors are added together. Going from the mapping layer to the output layer requires constructing a Huffman tree. Starting from the root node, the value of the mapping layer is repeatedly subjected to Logistic classification along the Huffman tree, and each intermediate vector and the word vector are continuously corrected.
Taking FIG. 5 as an example, in the Huffman tree the middle word is w(t), and the mapping layer input is pro(t) = v(w(t-2)) + v(w(t-1)) + v(w(t+1)) + v(w(t+2)).
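The mapping layer input is an element-wise sum of the context word vectors. A short sketch, with the function name `project` and the three-dimensional sample vectors chosen only for illustration:

```python
def project(context_vectors):
    """Element-wise sum of the context word vectors, i.e. pro(t)."""
    dim = len(context_vectors[0])
    return [sum(vec[k] for vec in context_vectors) for k in range(dim)]


# Hypothetical vectors for w(t-2), w(t-1), w(t+1) and w(t+2).
context_vectors = [
    [1.0, 0.0, 2.0],
    [0.5, 1.0, 0.0],
    [0.0, -1.0, 1.0],
    [0.5, 0.5, 0.5],
]
pro = project(context_vectors)
```

The sum has the same dimension as each word vector, so it can be compared directly with the intermediate vector of every node on the path.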
If the word at this point is "football", that is, w(t) = "football", and its Huffman code is known to be d(t) = "1001", then the path from the root node to the leaf node is "left, right, right, left": starting from the root node, the path first turns left, then turns right twice, and finally turns left to reach the leaf node.
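Reading a Huffman code as a sequence of turns can be written in a few lines. The convention that 1 means a left turn and 0 a right turn is taken from this example; the function name `code_to_path` is hypothetical.

```python
def code_to_path(code):
    """Translate a Huffman code into turns, with 1 = left and 0 = right."""
    directions = {"1": "left", "0": "right"}
    return [directions[bit] for bit in code]


path = code_to_path("1001")  # the code given for "football"
```

For the code "1001" this gives left, right, right, left, so four intermediate nodes are visited before the leaf is reached.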
The intermediate vector of each node on the path is then corrected in sequence from top to bottom. At the first node, Logistic classification is performed on the node's intermediate vector θ(t,1) and pro(t). If the classification result is 0, the classification is wrong (the path should turn left, that is, the result should be 1), so θ(t,1) is corrected and the error amount is recorded.
After the first node has been processed, the second node is processed in the same way: θ(t,2) is corrected and the error amount is accumulated. The subsequent nodes are handled analogously.
After all nodes have been processed and the leaf node has been reached, the word vector v(w(t)) is corrected according to the accumulated error amount.
This completes the processing of one word w(t). If a text contains N words, the above process is repeated N times, from w(0) to w(N-1). After training, a vector is obtained for each word; the model trained in this way can perform word vector analysis on input words and is applied to analyzing the emotion value of text information.
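One training step for a single word w(t) can be sketched as below. The description does not give the correction formula, so this sketch assumes the gradient-style update commonly used with hierarchical softmax, which corrects each intermediate vector in proportion to its classification error instead of only on a hard misclassification. The learning rate `lr`, the function names and the sample values are all assumptions.

```python
import math


def sigmoid(x):
    x = max(-30.0, min(30.0, x))  # clamp to keep exp() in range
    return 1.0 / (1.0 + math.exp(-x))


def train_step(pro, code, path, inner_vectors, target_vector, lr=0.1):
    """Walk the root-to-leaf path, correct each intermediate vector in place,
    accumulate the error, then correct the target word vector in place."""
    dim = len(pro)
    error = [0.0] * dim
    for bit, node_id in zip(code, path):
        theta = inner_vectors[node_id]
        label = 1.0 if bit == "1" else 0.0  # 1 = the path turns left
        score = sigmoid(sum(p * t for p, t in zip(pro, theta)))
        g = lr * (label - score)            # signed classification error
        for k in range(dim):
            error[k] += g * theta[k]        # uses theta before its correction
            theta[k] += g * pro[k]          # correct the intermediate vector
    for k in range(dim):
        target_vector[k] += error[k]        # correct v(w(t))
    return error


# Hypothetical two-dimensional example with zero-initialized intermediate vectors.
pro = [1.0, 2.0]
inner_vectors = [[0.0, 0.0], [0.0, 0.0]]
target_vector = [0.3, -0.3]
first_error = train_step(pro, "10", [0, 1], inner_vectors, target_vector)
second_error = train_step(pro, "10", [0, 1], inner_vectors, target_vector)
```

Because the intermediate vectors start at zero, the first pass changes only those vectors and leaves the word vector untouched; from the second pass onward the accumulated error is non-zero and the word vector begins to move.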
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
Those skilled in the art will appreciate that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a readable storage medium and, when executed, performs the steps of the method described above. The storage medium includes ROM/RAM, a magnetic disk, an optical disk, and the like.
The above embodiments express only several implementations of the present invention. Although they are described specifically and in detail, they are not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the inventive concept, all of which fall within the scope of protection of the present invention. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (10)

1. A method for emotion analysis of text information is characterized by comprising the following steps:
extracting keywords and context associated words of the keywords from text information;
analyzing the keywords and the context associated words according to a preset word vector analysis model to obtain first word vectors of the keywords;
acquiring an emotion value of the text information according to the first word vector and a second word vector, wherein the second word vector is a word vector of a prestored emotion word;
the step of analyzing the keywords and the context associated words according to a preset word vector analysis model further comprises the following steps of:
establishing a binary neural network model, acquiring an information corpus to be trained, and training the binary neural network model by taking the information corpus as a training sample to obtain the preset word vector analysis model;
the step of training the binary neural network model by using the information corpus as a training sample comprises the following steps:
selecting a target word and a related word from the information corpus, initializing an original word vector of the target word and the related word, analyzing the original word vector of the related word through the binary neural network model to obtain an error vector of the original word vector of the target word, and correcting the original word vector of the target word according to the error vector of the original word vector of the target word;
there are a plurality of related words, and the step of analyzing the original word vectors of the related words through the binary neural network model comprises the following steps:
adding the original word vectors of the related words to obtain a sum vector;
constructing a Huffman tree of the binary neural network model by taking the target word and each related word as leaf nodes, acquiring a path from a root node of the Huffman tree to the leaf node corresponding to the target word, and performing Logistic classification on corresponding intermediate nodes according to the sum vector and vectors of the intermediate nodes in the path;
if the classification result of the current intermediate node is different from the trend of the path, correcting the vector of the current intermediate node according to the trend of the path, and acquiring an error vector of the current intermediate node;
and adding the error vectors of all the intermediate nodes to be used as the error vector of the original word vector of the target word.
2. The emotion analysis method for text information according to claim 1, wherein the step of obtaining the corpus of information to be trained includes the steps of:
acquiring a network data text, filtering noise information from the network data text, and performing word segmentation to generate the information corpus to be trained.
3. The emotion analysis method for text information according to claim 1, further comprising the steps of:
analyzing the emotion words according to the preset word vector analysis model to obtain and store the second word vector.
4. The method for emotion analysis of text information according to any one of claims 1 to 3, wherein said step of obtaining the emotion value of the text information based on the first word vector and the second word vector comprises the steps of:
respectively acquiring relative values of different emotion words corresponding to the text information according to the first word vector and the second word vectors of the different emotion words, and taking the maximum relative value as the emotion value of the text information.
5. An emotion analysis system for text information, comprising:
the word acquisition unit is used for extracting keywords and context associated words of the keywords from text information;
the word vector analysis unit is used for analyzing the keywords and the context associated words according to a preset word vector analysis model to obtain first word vectors of the keywords;
the emotion value acquisition unit is used for acquiring the emotion value of the text information according to the first word vector and a second word vector, wherein the second word vector is a pre-stored word vector of emotion words;
the model establishing unit is used for establishing a binary neural network model, acquiring information corpora to be trained, and training the binary neural network model by using the information corpora as training samples to obtain the preset word vector analysis model;
the model establishing unit is further configured to select a target word and a related word from the information corpus, initialize original word vectors of the target word and the related word, analyze the original word vectors of the related word through the binary neural network model, obtain an error vector of the original word vector of the target word, and correct the original word vector of the target word according to the error vector of the original word vector of the target word;
the model establishing unit is further used for adding the original word vectors of the related words to obtain a sum vector; constructing a Huffman tree of the binary neural network model by taking the target word and each related word as leaf nodes, acquiring a path from a root node of the Huffman tree to the leaf node corresponding to the target word, and performing Logistic classification on corresponding intermediate nodes according to the sum vector and vectors of the intermediate nodes in the path; if the classification result of the current intermediate node is different from the trend of the path, correcting the vector of the current intermediate node according to the trend of the path, and acquiring an error vector of the current intermediate node; and adding the error vectors of all the intermediate nodes to be used as the error vector of the original word vector of the target word.
6. The emotion analysis system of text information according to claim 5, wherein the model establishing unit is further configured to obtain a network data text, filter noise information from the network data text, and perform word segmentation to generate the information corpus to be trained.
7. The system for emotion analysis of text information according to claim 5, wherein said emotion value acquisition unit is further configured to analyze the emotion words according to the preset word vector analysis model, and acquire and store the second word vector.
8. The system according to any one of claims 5 to 7, wherein the emotion value acquisition unit is further configured to obtain, according to the first word vector and the second word vectors of different emotion words, relative values of the different emotion words corresponding to the text information, respectively, and to use the maximum relative value as the emotion value of the text information.
9. A readable storage medium on which an executable program is stored, characterized in that the program, when executed by a processor, carries out the steps of the method for emotion analysis of text information as claimed in any one of claims 1 to 4.
10. An analysis device comprising a memory, a processor and an executable program stored on the memory and operable on the processor, the processor implementing the steps of the method for emotion analysis of text information according to any one of claims 1 to 4 when executing the program.
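For illustration only, the emotion-value step recited in claims 4 and 8 might be sketched as follows. The claims do not define how the "relative value" is computed; cosine similarity is used here purely as an assumption, and the names `cosine`, `emotion_value` and `emotion_vecs` are hypothetical.

```python
import math


def cosine(a, b):
    # Cosine similarity, an assumed stand-in for the "relative value".
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def emotion_value(first_vec, emotion_vecs):
    """Return (emotion_word, relative_value) for the largest relative value.

    first_vec    -- first word vector of a keyword
    emotion_vecs -- dict mapping each emotion word to its pre-stored
                    second word vector
    """
    scores = {word: cosine(first_vec, vec) for word, vec in emotion_vecs.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

A call such as `emotion_value([0.9, 0.1], {"happy": [1.0, 0.0], "sad": [-1.0, 0.0]})` selects the emotion word whose stored vector lies closest in direction to the keyword vector.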
CN201711183201.XA 2017-11-23 2017-11-23 Method and system for emotion analysis of text information Active CN107967258B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711183201.XA CN107967258B (en) 2017-11-23 2017-11-23 Method and system for emotion analysis of text information

Publications (2)

Publication Number Publication Date
CN107967258A CN107967258A (en) 2018-04-27
CN107967258B CN107967258B (en) 2021-09-17

Family

ID=62001599

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711183201.XA Active CN107967258B (en) 2017-11-23 2017-11-23 Method and system for emotion analysis of text information

Country Status (1)

Country Link
CN (1) CN107967258B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109446405B (en) * 2018-09-12 2021-04-30 中国科学院自动化研究所 Big data-based tourism industry promotion method and system
CN109783800B (en) * 2018-12-13 2024-04-12 北京百度网讯科技有限公司 Emotion keyword acquisition method, device, equipment and storage medium
CN109766557B (en) * 2019-01-18 2023-07-18 河北工业大学 Emotion analysis method and device, storage medium and terminal equipment
CN110110137A (en) * 2019-03-19 2019-08-09 咪咕音乐有限公司 A kind of method, apparatus, electronic equipment and the storage medium of determining musical features
CN110427454B (en) * 2019-06-21 2024-03-15 平安科技(深圳)有限公司 Text emotion analysis method and device, electronic equipment and non-transitory storage medium
CN110399617A (en) * 2019-08-30 2019-11-01 广西电网有限责任公司南宁供电局 Audit data processing method, system and readable storage medium storing program for executing
CN111274807B (en) * 2020-02-03 2022-05-10 华为技术有限公司 Text information processing method and device, computer equipment and readable storage medium

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012134180A2 (en) * 2011-03-28 2012-10-04 가톨릭대학교 산학협력단 Emotion classification method for analyzing inherent emotions in a sentence, and emotion classification method for multiple sentences using context information
CN105893444A (en) * 2015-12-15 2016-08-24 乐视网信息技术(北京)股份有限公司 Sentiment classification method and apparatus
CN106372058A (en) * 2016-08-29 2017-02-01 中译语通科技(北京)有限公司 Short text emotion factor extraction method and device based on deep learning
CN106502989A (en) * 2016-10-31 2017-03-15 东软集团股份有限公司 Sentiment analysis method and device
CN106547924A (en) * 2016-12-09 2017-03-29 东软集团股份有限公司 The sentiment analysis method and device of text message
CN106547740A (en) * 2016-11-24 2017-03-29 四川无声信息技术有限公司 Text message processing method and device
CN106815192A (en) * 2015-11-27 2017-06-09 北京国双科技有限公司 Model training method and device and sentence emotion identification method and device
CN107066497A (en) * 2016-12-29 2017-08-18 努比亚技术有限公司 A kind of searching method and device
CN107092596A (en) * 2017-04-24 2017-08-25 重庆邮电大学 Text emotion analysis method based on attention CNNs and CCR
CN107229610A (en) * 2017-03-17 2017-10-03 咪咕数字传媒有限公司 The analysis method and device of a kind of affection data
CN107247702A (en) * 2017-05-05 2017-10-13 桂林电子科技大学 A kind of text emotion analysis and processing method and system
CN107291693A (en) * 2017-06-15 2017-10-24 广州赫炎大数据科技有限公司 A kind of semantic computation method for improving term vector model

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060074830A1 (en) * 2004-09-17 2006-04-06 International Business Machines Corporation System, method for deploying computing infrastructure, and method for constructing linearized classifiers with partially observable hidden states
CN105740349B (en) * 2016-01-25 2019-03-08 重庆邮电大学 A kind of sensibility classification method of combination Doc2vec and convolutional neural networks
CN106326212B (en) * 2016-08-26 2019-04-16 北京理工大学 A kind of implicit chapter relationship analysis method based on level deep semantic
CN106557463A (en) * 2016-10-31 2017-04-05 东软集团股份有限公司 Sentiment analysis method and device
CN107273348B (en) * 2017-05-02 2020-12-18 深圳大学 Topic and emotion combined detection method and device for text
CN107220180B (en) * 2017-06-08 2020-08-04 电子科技大学 Code classification method based on neural network language model
CN107357837B (en) * 2017-06-22 2019-10-08 华南师范大学 The electric business excavated based on order-preserving submatrix and Frequent episodes comments on sensibility classification method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
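User sentiment analysis based on product review data from e-commerce websites; Cui Zhigang; China Master's Theses Full-text Database, Information Science and Technology series; 20150215 (No. 02); I138-638 *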

Also Published As

Publication number Publication date
CN107967258A (en) 2018-04-27

Similar Documents

Publication Publication Date Title
CN107967258B (en) Method and system for emotion analysis of text information
CN108304526B (en) Data processing method and device and server
CN110121706B (en) Providing responses in a conversation
US10417329B2 (en) Dialogue act estimation with learning model
CN110222178B (en) Text emotion classification method and device, electronic equipment and readable storage medium
CN104462363B (en) Comment point shows method and apparatus
CN107967261A (en) Interactive question semanteme understanding method in intelligent customer service
CN111026842A (en) Natural language processing method, natural language processing device and intelligent question-answering system
CN108804526B (en) Interest determination system, interest determination method, and storage medium
CN109543176B (en) Method and device for enriching short text semantics based on graph vector representation
KR102042168B1 (en) Methods and apparatuses for generating text to video based on time series adversarial neural network
US20170034111A1 (en) Method and Apparatus for Determining Key Social Information
CN108108354A (en) A kind of microblog users gender prediction's method based on deep learning
CN110765235A (en) Training data generation method and device, terminal and readable medium
CN109614611B (en) Emotion analysis method for fusion generation of non-antagonistic network and convolutional neural network
CN108733652B (en) Test method for film evaluation emotion tendency analysis based on machine learning
CN111813923A (en) Text summarization method, electronic device and storage medium
CN111563161B (en) Statement identification method, statement identification device and intelligent equipment
CN114492423A (en) False comment detection method, system and medium based on feature fusion and screening
CN113934834A (en) Question matching method, device, equipment and storage medium
CN109117471B (en) Word relevancy calculation method and terminal
CN109933787B (en) Text key information extraction method, device and medium
CN111914566A (en) Automatic comment generation method
CN108921213B (en) Entity classification model training method and device
Wakchaure et al. A scheme of answer selection in community question answering using machine learning techniques

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right
Effective date of registration: 20230807
Address after: Room 1102, No. 15 Zhigang Street, Xinzao Town, Panyu District, Guangzhou City, Guangdong Province, 510000
Patentee after: Ai Media Consulting (Guangzhou) Co.,Ltd.
Address before: 510006 room 701, 26 Qinglan street, Xiaoguwei street, Panyu District, Guangzhou City, Guangdong Province
Patentee before: GUANGZHOU IIMEDIA INFORMATION CONSULTING Co.,Ltd.