CN110175325B

CN110175325B - Comment analysis method based on word vector and syntactic characteristics and visual interaction interface

Info

Publication number: CN110175325B
Application number: CN201910343337.5A
Authority: CN
Inventors: 吕奇; 沈楠楠; 胡新春; 陈可佳
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2019-04-26
Filing date: 2019-04-26
Publication date: 2023-07-11
Anticipated expiration: 2039-04-26
Also published as: CN110175325A

Abstract

The invention provides a comment analysis method based on word vectors and syntactic features, in the field of data analysis, comprising the following steps: acquiring comment data from commodity pages of an e-commerce website; preprocessing the acquired target data set; extracting the commendatory and derogatory word sets provided by HowNet and NTU to form a basic emotion dictionary; training word vectors on the preprocessed data set with the Word2Vec tool; establishing a probability transition matrix from the semantic similarity matrix; processing the commodity comment text according to core sentence rules; preprocessing the resulting redundancy-free text; extracting <commodity attribute, negation word, degree word, emotion word> evaluation match pairs from the resulting dependency relation pairs according to part of speech; and combining the evaluation match pairs with the emotion dictionary to compute commendatory/derogatory values for the evaluation objects and rank them from best to worst. The method is finally delivered through a visual interaction interface, achieves accurate, real-time, automatic and convenient processing and analysis of commodity comment data, and is applicable to e-commerce platforms.

Description

Comment analysis method based on word vector and syntactic characteristics and visual interaction interface

Technical Field

The invention belongs to the technical field of data analysis, and particularly relates to an emotion dictionary suitable for commodity comments, constructed from word vectors trained by a neural network model, to an attribute recognition algorithm, and to a comment analysis system based on word vectors and syntactic features.

Background

With the spread of the Internet and the growth of e-commerce, online shopping sites such as JD.com and Taobao have developed rapidly, and more and more consumers choose to shop online. These sites carry a massive number of commodities and serve a large user base, so they generate a huge volume of comment data. Consumer comments usually carry the user's subjective feelings about the purchase, including preferences about the purchased goods and satisfaction with the merchant's service. For consumers, these comment texts give a more objective picture of the goods or services concerned and so support a better-informed choice; for merchants, the experience reported by users helps them improve service or commodity quality in a targeted way and thereby win more customers and profit. However, as the data volume grows explosively, the cost of finding useful information in massive comment data also rises. How to process and analyze user comment text quickly and effectively, and extract valuable information from it, therefore has significant application value and research interest.

At present, large amounts of comment data are not fully exploited, and consumers find it hard to obtain valuable information from them. A comment analysis system based on word vectors and syntactic features is therefore studied: from the analysis results it derives users' satisfaction with each attribute of a commodity, summarizes the commodity's strengths and weaknesses, and then visualizes the results.

Disclosure of Invention

The technical problem to be solved by the invention is how to process and analyze commodity comment data accurately, in real time, automatically and conveniently. To overcome the shortcomings of the prior art, a comment analysis method based on word vectors and syntactic features is provided.

The invention provides a comment analysis method based on word vectors and syntactic features, which comprises the following steps:

1) Acquiring comment data of commodity pages of an e-commerce website;

2) Preprocessing the acquired target data set and constructing a candidate emotion word set;

3) Extracting the commendatory and derogatory word sets provided by HowNet and NTU to form a basic emotion dictionary;

4) Training word vectors on the preprocessed data set with the Word2Vec tool to obtain word vectors, and generating a semantic similarity matrix;

5) Establishing a probability transition matrix from the semantic similarity matrix, combining it with a seed word set, running the LPA label propagation algorithm, and generating the final emotion dictionary after checking against the basic emotion dictionary;

6) Processing the acquired commodity comment text according to the core sentence rules to obtain comment text with redundancy removed;

7) Preprocessing the redundancy-free text, forming a dependency tree over the resulting word-segmented data set based on dependency relations and syntactic features, and generating SBV, VOB, ATT, CMP and COO dependency relation pairs;

8) Extracting <commodity attribute, negation word, degree word, emotion word> evaluation match pairs from the dependency relation pairs according to part of speech;

9) Combining the evaluation match pairs with the emotion dictionary, computing commendatory/derogatory values for the evaluation objects and ranking them from best to worst, and finally delivering the method through a visual interaction interface.

As a further definition of the invention, step 2) specifically comprises:

2-1) removing illegal characters using a character matching algorithm;

2-2) performing word segmentation and part-of-speech tagging on the original data set using LTP;

2-3) extracting the words with the required part of speech and removing duplicates to form candidate emotion word set 1;

2-4) performing word segmentation and part-of-speech tagging on the original data set using NLPIR;

2-5) extracting the words with the required part of speech and removing duplicates to form candidate emotion word set 2;

2-6) merging candidate emotion word set 1 and candidate emotion word set 2 and removing duplicates to obtain the candidate emotion word set.

As a further definition of the invention, step 3) specifically comprises: extracting the commendatory and derogatory words from the HowNet emotion dictionary and the NTU evaluation word dictionary respectively, merging them, and removing duplicates to form the basic emotion dictionary.

As a further definition of the invention, step 4) specifically comprises:

4-1) utilizing a Word2Vec training data set to obtain Word vectors of words;

4-2) for the candidate emotion word set, calculating the semantic similarity between words using the following formula;

4-3) for two n-dimensional word vectors $a = (x_{11}, x_{12}, \dots, x_{1n})$ and $b = (x_{21}, x_{22}, \dots, x_{2n})$, the semantic similarity is calculated as:

$$\cos\theta = \frac{\sum_{k=1}^{n} x_{1k}\, x_{2k}}{\sqrt{\sum_{k=1}^{n} x_{1k}^{2}}\;\sqrt{\sum_{k=1}^{n} x_{2k}^{2}}}$$

wherein $\cos\theta$ represents the semantic similarity value; $x_{1k}$ represents the k-th dimension value of the word vector a; $x_{2k}$ represents the k-th dimension value of the word vector b;

4-4) constructing a semantic similarity matrix from the calculated semantic similarities.

As a further definition of the invention, step 5) specifically comprises:

5-1) regarding each word as a node of a graph, wherein the weight of the edge between two nodes is given by the semantic similarity between the words they represent;

5-2) establishing a probability transition matrix P according to the following formula:

wherein P[i][j] represents the transition probability from word i to word j, SIM(w_i, w_j) represents the similarity of words i and j, and m represents the number of words with the highest semantic similarity to word i;

5-3) counting the frequency in the original comment data of every emotion word in the candidate emotion word set, and selecting the N words with the highest frequency to form seed word set 1; using the emotion vocabulary ontology library to select the words in the candidate emotion word set whose ontology intensity is > m, forming seed word set 2; merging seed word set 1 and seed word set 2 and removing duplicates to form the seed word set, which is then labeled manually with its emotion;

5-4) establishing an L×C label matrix Y_L from the small number of manually labeled seed words, wherein: L represents the number of seed words; C represents the number of classes, here 3: commendatory, derogatory and neutral;

5-5) likewise establishing a U×C label matrix Y_U from the unlabeled sample words, wherein: U represents the number of unlabeled sample words; C represents the number of classes, here 3: commendatory, derogatory and neutral;

5-6) finally, labeling the sample words with the LPA label propagation algorithm, and forming the final emotion dictionary after checking against the basic emotion dictionary.

As a further definition of the invention, step 6) specifically comprises:

a core sentence is obtained mainly by deleting redundancy while retaining the main components relevant to evaluation collocations; a sentence that matches no rule is left unchanged. The method uses core sentences to improve the accuracy of syntactic dependency analysis of the evaluation text. The rules are:

rule 1: deleting sentence-initial components such as "… advantage", "… disadvantage", "… deficiency", "… merit", "… benefit" sequences;

rule 2: deleting sentences with a hypothetical tendency, such as "if …", "hope …", "suppose …", "wish …", "suggest …";

rule 3: deleting sequences consisting of "exactly", "naturally", "particularly", "still further", "especially";

rule 4: deleting assertion words such as "feel" and "consider";

rule 5: deleting every punctuation mark in a run of consecutive punctuation except the first, together with abnormal characters such as emoticons, kaomoji and brackets.
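As an illustration only, rules 3 to 5 can be sketched on an already segmented sentence. The English filler words, the punctuation and bracket sets, and the function name `to_core_sentence` are hypothetical stand-ins chosen for this sketch; the word lists actually used by the method are Chinese and are not reproduced here, and rules 1 and 2 are omitted.

```python
# Hypothetical stand-ins; the method itself works on Chinese word lists.
PUNCT = set(",.!?;:，。！？；")
BRACKETS = set("()[]（）【】")
FILLERS = {"especially", "particularly", "feel", "consider"}

def to_core_sentence(tokens):
    """Drop filler words and brackets, and collapse runs of punctuation."""
    core = []
    for tok in tokens:
        if tok in FILLERS or tok in BRACKETS:
            continue  # rules 3 and 4 (filler words), bracket part of rule 5
        if tok in PUNCT and core and core[-1] in PUNCT:
            continue  # rule 5: keep only the first mark of a run
        core.append(tok)
    return core

tokens = ["I", "feel", "especially", "delivery", "fast",
          "(", "next", "day", ")", ".", ".", "."]
core = to_core_sentence(tokens)
```

Working on tokens rather than raw strings keeps the sketch independent of any particular segmenter; a sentence that contains none of the listed items passes through unchanged, as the text requires.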

As a further definition of the invention, step 7) specifically comprises:

five axioms of dependency syntax:

(1) A sentence has one and only one independent component;

(2) Every other component of the sentence must depend directly on some component;

(3) No component of the sentence may depend on two or more components at the same time;

(4) If component a depends directly on component b, and component c lies between a and b in the sentence, then c depends on a, on b, or on some other component between a and b;

(5) The components to the left and to the right of the central component have no dependency relation with each other;

the dependency tree is characterized by:

(1) The nodes of the tree are the individual components of the sentence;

(2) The root node of the tree is the central component of the whole sentence;

(3) The edges between nodes of the tree are directed, reflecting the asymmetric dependency relations among components;

(4) The five axioms of dependency syntax are satisfied;
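The axioms can be made concrete with a small check, offered here as a sketch under stated assumptions rather than as part of the method. The sentence is assumed to be encoded as a list `heads`, where `heads[i]` is the index of the component that word i depends on and `-1` marks the independent (root) component; this encoding and the function name are hypothetical. With one head per word, axiom (3) holds by construction; the code checks axioms (1), (2) and (4) plus the absence of cycles, and does not test axiom (5) separately.

```python
def is_valid_dependency_tree(heads):
    """Check a head-index list against axioms (1), (2) and (4)."""
    n = len(heads)
    # Axiom (1): exactly one independent component.
    if sum(1 for h in heads if h == -1) != 1:
        return False
    # Axiom (2): every other component depends on an existing component.
    if any(h != -1 and not (0 <= h < n) for h in heads):
        return False
    # Following heads upward from any word must reach the root (no cycles).
    for start in range(n):
        seen, node = set(), start
        while heads[node] != -1:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node]
    # Axiom (4): a word lying between a dependent and its head must
    # itself depend on something inside that span.
    for dep, head in enumerate(heads):
        if head == -1:
            continue
        lo, hi = min(dep, head), max(dep, head)
        for mid in range(lo + 1, hi):
            if not (lo <= heads[mid] <= hi):
                return False
    return True
```

For example, `[1, -1, 1]` (words 0 and 2 both depend on word 1) passes, while `[2, -1, 1, 1]` fails axiom (4) because the root lies between word 0 and its head.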

most sentence dependency relations in comments fall into five categories: subject-predicate (SBV), verb-object (VOB/FOB), attributive (ATT), verb-complement (CMP) and coordinate (COO). Dependency parsing can be carried out with the LTP dependency parser, and dependency relation pairs are extracted in combination with a COO algorithm that identifies coordinate evaluation objects and coordinate evaluation words. The COO algorithm specifically comprises the following steps:

traversing, in the dependency syntax tree, all words between the two nodes of each SBV, VOB, ATT or CMP dependency relation pair obtained from the dependency relations and syntactic features, as well as the words adjacent to them on the left and right;

judging whether any of the traversed words stands in a COO relation;

and expanding the pairs with the evaluation objects and evaluation words linked by COO relations.

As a further definition of the invention, step 8) specifically comprises:

8-1) owing to the characteristics of the Chinese language, most evaluation objects are nouns or verbs, and most evaluation words are adjectives or verbs;

8-2) extracting the evaluation objects and evaluation words, i.e. the commodity attributes and emotion words, according to part of speech;

8-3) traversing the dependency syntax tree to check whether negation words occur between the evaluation object and the evaluation word, incrementing the negation count by 1 for each one found until the traversal ends, and then judging the parity of the count: if the count is odd, the corresponding negation value private is assigned -1; if it is even, private is assigned +1;

8-4) traversing the dependency syntax tree to check whether degree words occur between the evaluation object and the evaluation word, and accumulating their number to obtain the degree-word count of the match pair;

8-5) finally forming the <commodity attribute, negation word, degree word, emotion word> evaluation match pair.

As a further definition of the invention, step 9) specifically comprises:

for a commodity attribute a that appears n times, the commendatory/derogatory value is calculated as follows:

where a.score is the emotion value of commodity attribute a; for the i-th occurrence of the commodity attribute, private is the negation value (-1 or +1) obtained for that occurrence, and degree is the number of degree adverbs for that occurrence; the emotion values of the commodity attributes are calculated, and those of identical evaluation objects are accumulated;

all extracted evaluation objects are divided into two categories, commendatory and derogatory, and the final results are ordered using bubble sort.

A visual interaction interface can execute all of the above steps, displays the emotion values clearly as a bar chart, and adds a number of user-friendly interaction functions, including: loading, logging in, logging out, changing passwords, showing the user's login status, etc.

Compared with the prior art, the technical scheme provided by the invention has the following technical effects:

the invention acquires and preprocesses comment data from commodity pages of an e-commerce website and constructs a basic emotion dictionary; trains word vectors on the preprocessed data set with the Word2Vec tool, generates a semantic similarity matrix, establishes a probability transition matrix from it, and, combined with a seed word set, generates the final emotion dictionary through the LPA label propagation algorithm; processes the commodity comment text according to the core sentence rules to obtain comment text with redundancy removed; preprocesses the redundancy-free text, forms a dependency tree over the word-segmented data set based on dependency relations and syntactic features, generates SBV, VOB, ATT, CMP and COO dependency relation pairs, extracts <commodity attribute, negation word, degree word, emotion word> evaluation match pairs, computes commendatory/derogatory values for the commodity attributes in combination with the emotion dictionary, and finally delivers the method through a visual interaction interface. Comment data can thus be analyzed accurately, in real time, automatically and conveniently.

Drawings

FIG. 1 is a flow chart of the present invention.

Detailed Description

The technical scheme of the invention is further described in detail below with reference to the accompanying drawings:

the technical scheme uses word vectors trained by a neural network model, combined with the LPA label propagation algorithm, to construct an emotion dictionary suitable for commodity comments; designs a commodity attribute recognition and extraction algorithm based on the core sentence rules, dependency relations and syntactic features; and combines these to build a comment analysis system based on word vectors and syntactic features, which derives users' satisfaction with each attribute of a commodity from the analysis results, summarizes the commodity's strengths and weaknesses, and then visualizes the analysis results.

Referring to fig. 1, the invention implements a comment analysis method based on word vectors and syntactic features, and the implementation steps are as follows:

Step S101: acquiring comment data from commodity pages of an e-commerce website.

In a specific implementation, a comment data crawling algorithm is designed to acquire comment data for the various commodities of an e-commerce website and generate an original comment data set.

Step S102: preprocessing the acquired target data set and constructing a candidate emotion word set.

In a specific implementation, illegal characters are removed from the original data set using a character matching algorithm. First, LTP is used for word segmentation and part-of-speech tagging; the words tagged "a" (adjective) are extracted and de-duplicated to form candidate emotion word set 1. Then NLPIR is used for word segmentation and part-of-speech tagging; the words tagged "a" (adjective) are extracted and de-duplicated to form candidate emotion word set 2. Candidate emotion word set 1 and candidate emotion word set 2 are merged and de-duplicated to form the final candidate emotion word set.
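The merge of the two candidate sets can be sketched as follows. The sketch assumes each tagger's output is already available as a list of `(word, tag)` tuples; the function name and the toy data are hypothetical, and no LTP or NLPIR call is shown.

```python
def candidate_emotion_words(tagged_a, tagged_b, target_pos="a"):
    """Union of the adjectives found by two segmenters, de-duplicated."""
    set_1 = {word for word, pos in tagged_a if pos == target_pos}
    set_2 = {word for word, pos in tagged_b if pos == target_pos}
    return set_1 | set_2

# Hypothetical tagger output for the same comment text.
ltp_tagged = [("pixel", "n"), ("good", "a"), ("fast", "a")]
nlpir_tagged = [("pixel", "n"), ("good", "a"), ("clear", "a")]
candidates = candidate_emotion_words(ltp_tagged, nlpir_tagged)
```

Using Python sets makes each de-duplication step implicit, so the three de-duplication passes described in the text reduce to two set comprehensions and one union.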

Step S103: extracting the commendatory and derogatory word sets provided by HowNet and NTU to form a basic emotion dictionary.

In a specific implementation, the commendatory and derogatory words are extracted from the HowNet emotion dictionary and the NTU evaluation word dictionary respectively, and are merged to form the basic emotion dictionary.

Step S104: training word vectors on the preprocessed data set with the Word2Vec tool to obtain word vectors, and generating a semantic similarity matrix.

In a specific implementation, Word2Vec is trained on the data set with the training parameters size=100, window=5, sg=0 and min_count=0, yielding the word vectors of the words.

For the candidate emotion word set, the semantic similarity between words is calculated using the following formula.

For two n-dimensional word vectors $a = (x_{11}, x_{12}, \dots, x_{1n})$ and $b = (x_{21}, x_{22}, \dots, x_{2n})$, the semantic similarity is calculated as:

$$\cos\theta = \frac{\sum_{k=1}^{n} x_{1k}\, x_{2k}}{\sqrt{\sum_{k=1}^{n} x_{1k}^{2}}\;\sqrt{\sum_{k=1}^{n} x_{2k}^{2}}}$$

wherein $\cos\theta$ represents the semantic similarity value; $x_{1k}$ represents the k-th dimension value of the word vector a; $x_{2k}$ represents the k-th dimension value of the word vector b.

All emotion words in the candidate emotion word set are traversed in turn: one emotion word is fixed and its similarity to every other emotion word is calculated. With m candidate emotion words, m×m calculations yield an m×m semantic similarity matrix.

To simplify the subsequent operations, the similarity of an emotion word with itself is defined to be 0.

The semantic similarity matrix is constructed from the calculated semantic similarities.
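A minimal pure-Python sketch of this construction is given below. It assumes the trained word vectors are available as a dictionary from word to list of floats; the function names and the two-dimensional toy vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)

def similarity_matrix(words, vectors):
    """m x m matrix of pairwise similarities with a zero diagonal."""
    m = len(words)
    sim = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i != j:  # similarity of a word with itself is fixed at 0
                sim[i][j] = cosine_similarity(vectors[words[i]],
                                              vectors[words[j]])
    return sim

toy_words = ["good", "great", "bad"]
toy_vectors = {"good": [1.0, 0.0], "great": [1.0, 1.0], "bad": [-1.0, 0.0]}
sim = similarity_matrix(toy_words, toy_vectors)
```

Because the matrix is symmetric, an implementation could compute only the upper triangle; the full double loop is kept here to mirror the m×m calculations described in the text.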

Step S105: establishing a probability transition matrix from the semantic similarity matrix, combining it with a seed word set, running the LPA label propagation algorithm, and generating the final emotion dictionary after checking against the basic emotion dictionary.

In a specific implementation, each word is regarded as a node of a graph, and the weight of the edge between two nodes is given by the semantic similarity between the words they represent.

The probability transition matrix P is established according to the following formula:

wherein P[i][j] represents the transition probability from word i to word j, SIM(w_i, w_j) represents the similarity of words i and j, and m represents the (manually set) number of words with the highest semantic similarity to word i. The probability transition matrix P is established according to this formula.
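One way to read these definitions is sketched below. The normalization used here is an assumption: each row keeps only the m most similar words and divides their similarities by the sum over those m words, so that every row sums to 1; negative similarities are clipped to 0. The exact formula of the method is not reproduced in this text, and the function name and toy matrix are hypothetical.

```python
def transition_matrix(sim, m):
    """Row-normalize each word's m strongest similarities (assumed form)."""
    n = len(sim)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Indices of the m words most similar to word i, excluding i itself.
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: sim[i][j], reverse=True)[:m]
        weights = {j: max(sim[i][j], 0.0) for j in neighbours}
        total = sum(weights.values())
        if total > 0.0:
            for j, w in weights.items():
                P[i][j] = w / total
    return P

toy_sim = [
    [0.0, 0.8, 0.2, 0.1],
    [0.8, 0.0, 0.3, 0.1],
    [0.2, 0.3, 0.0, 0.6],
    [0.1, 0.1, 0.6, 0.0],
]
P = transition_matrix(toy_sim, m=2)
```

Restricting each row to its m nearest neighbours keeps the graph sparse, which limits how far a label can leak to weakly related words in a single propagation step.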

The frequency in the original comment data of every emotion word in the candidate emotion word set is counted, and the 100 words with the highest frequency are selected to form seed word set 1. Using the emotion vocabulary ontology library of Dalian University of Technology, the words in the candidate emotion word set whose ontology intensity is greater than 7 are selected to form seed word set 2. Seed word set 1 and seed word set 2 are merged and de-duplicated to form the seed word set, which is then labeled manually with its emotion.
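The seed selection can be sketched as below, assuming the comments are already segmented into a flat token list and the ontology intensities are available as a word-to-number dictionary. The function name, parameter names and toy data are hypothetical; the defaults mirror the values 100 and 7 given above.

```python
from collections import Counter

def seed_words(candidates, comment_tokens, intensity,
               top_n=100, min_intensity=7):
    """Union of the most frequent candidates and the high-intensity ones."""
    freq = Counter(t for t in comment_tokens if t in candidates)
    seed_1 = {word for word, _ in freq.most_common(top_n)}
    seed_2 = {w for w in candidates if intensity.get(w, 0) > min_intensity}
    return seed_1 | seed_2

toy_candidates = {"good", "bad", "clear", "awful"}
toy_tokens = ["good", "phone", "good", "bad", "clear", "good", "bad"]
toy_intensity = {"awful": 9, "clear": 5}
seeds = seed_words(toy_candidates, toy_tokens, toy_intensity, top_n=2)
```

The manual emotion labeling that follows is not automated here; the function only returns the words that would be handed to the annotator.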

Then, an L×C label matrix Y_L is established from the small number of manually labeled seed words, wherein L represents the number of seed words and C represents the number of classes, typically 3 (commendatory, derogatory, neutral). At the same time, a U×C label matrix Y_U is established from the unlabeled sample words, wherein U represents the number of unlabeled sample words and C again represents the number of classes. Combining the two label matrices gives an N×C soft label matrix F = [Y_L; Y_U], with N = L + U.

The label propagation algorithm is then executed as follows: 1) propagate: F = PF; 2) reset the labels of the labeled samples in F: F_L = Y_L; 3) repeat steps 1) and 2) until F converges.

The purpose of step 1) is to pass the label (emotion attribute) of each node (emotion word) to the other nodes with the probabilities given by the probability transition matrix: the more similar two nodes are, the larger the propagation probability. The purpose of step 2) is to reset the labels of the labeled seed words to their labeled values, so that they are not changed by step 1). In step 3), convergence of F is determined by computing the similarity between the latest F and the F_0 of the previous iteration; F is considered to have converged once this similarity no longer changes.

Finally, the three values in each row of the matrix F are the propagated attribute values of the corresponding emotion word; the largest value is selected, and its attribute is taken as the attribute of the emotion word.
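The propagation loop and the final row-wise maximum can be sketched in pure Python as follows. Several choices are assumptions made for this sketch: unlabeled rows start at zero, convergence is tested with the largest absolute change between successive F matrices rather than a similarity measure, and class indices 0, 1, 2 stand for commendatory, derogatory and neutral. The function name and toy matrix are hypothetical.

```python
def label_propagation(P, seed_labels, num_classes=3,
                      max_iter=1000, tol=1e-9):
    """Iterate F = P F, clamping seed rows, then take the row-wise argmax."""
    n = len(P)
    F = [[0.0] * num_classes for _ in range(n)]
    for i, c in seed_labels.items():
        F[i][c] = 1.0
    for _ in range(max_iter):
        # Step 1: propagate labels along the transition probabilities.
        new_F = [[sum(P[i][k] * F[k][c] for k in range(n))
                  for c in range(num_classes)] for i in range(n)]
        # Step 2: reset the manually labeled seed rows.
        for i, c in seed_labels.items():
            new_F[i] = [1.0 if j == c else 0.0 for j in range(num_classes)]
        # Step 3: stop once F no longer changes (assumed criterion).
        change = max(abs(new_F[i][c] - F[i][c])
                     for i in range(n) for c in range(num_classes))
        F = new_F
        if change < tol:
            break
    labels = [max(range(num_classes), key=lambda c: row[c]) for row in F]
    return F, labels

# Toy graph: word 0 is a commendatory seed, word 3 a derogatory seed.
toy_P = [
    [0.0, 1.0, 0.0, 0.0],
    [0.7, 0.0, 0.3, 0.0],
    [0.0, 0.2, 0.0, 0.8],
    [0.0, 0.0, 1.0, 0.0],
]
F, labels = label_propagation(toy_P, {0: 0, 3: 1})
```

In this toy graph word 1 is pulled mostly toward seed 0 and word 2 mostly toward seed 3, so the two unlabeled words end up in different classes.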

The emotion words with determined attributes are exported to form emotion dictionary 1. All emotion words in emotion dictionary 1 are then traversed: if a word is contained in the basic emotion dictionary of step S103 and its attribute contradicts the attribute recorded there, the attribute is changed to the one in the basic emotion dictionary; otherwise the attribute is left unchanged.

After these steps, the modified emotion dictionary 1 is the final emotion dictionary.
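The check against the basic emotion dictionary amounts to a dictionary override, sketched below. Both dictionaries are assumed to map a word to a polarity string; the function name and the toy entries are hypothetical.

```python
def reconcile(propagated, basic):
    """Prefer the basic dictionary's polarity wherever it lists the word."""
    return {word: basic.get(word, label)
            for word, label in propagated.items()}

# Hypothetical entries: label propagation mislabels "cheap".
emotion_dictionary_1 = {"good": "commendatory",
                        "cheap": "derogatory",
                        "slow": "derogatory"}
basic_dictionary = {"good": "commendatory", "cheap": "commendatory"}
final_dictionary = reconcile(emotion_dictionary_1, basic_dictionary)
```

Words absent from the basic dictionary, such as "slow" here, keep the label assigned by propagation.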

Step S106: processing the acquired commodity comment text according to the core sentence rules to obtain comment text with redundancy removed.

In a specific implementation, the URL of the commodity is entered on the interactive web interface of the system, and the comment data for that commodity on the e-commerce platform are crawled by the web crawler designed in the back end; the system is set to crawl the first 1000 high-quality comments for the commodity.

The acquired commodity comment data undergo redundancy removal according to the core sentence rules, and the main components relevant to evaluation collocations are retained. For example, the comment "The mobile phone has been received, the pixels and sound quality are both good, especially the express delivery is very impressive (arrived the next day), the only disadvantage is that the package is not very good, hope the store can improve. . ." is processed as follows:

(1) Rule 1 matches: the example sentence matches "… disadvantage", and after processing becomes "The mobile phone has been received, the pixels and sound quality are both good, especially the express delivery is very impressive (arrived the next day), the package is not very good, hope the store can improve. . .";

(2) Rule 2 matches: the example sentence matches "hope", and after processing becomes "The mobile phone has been received, the pixels and sound quality are both good, especially the express delivery is very impressive (arrived the next day), the package is not very good, the store can improve. . .";

(3) Rule 3 matches: the example sentence matches "especially", and after processing becomes "The mobile phone has been received, the pixels and sound quality are both good, the express delivery is very impressive (arrived the next day), the package is not very good, the store can improve. . .";

(4) Rule 5 matches: the consecutive punctuation marks and brackets are deleted, and the core sentence finally obtained is "The mobile phone has been received, the pixels and sound quality are both good, the express delivery is very impressive, arrived the next day, the package is not very good, the store can improve." This sentence is referred to below as the example sentence.

Step S107: preprocessing the obtained text with redundancy removed, forming a dependency tree by the obtained word segmentation data set based on the dependency and the syntactic characteristics, and generating SBV, VOB, ATT, CMP, COO dependency pairs.

In a specific implementation, the text with redundancy removed obtained in the step S106 is preprocessed, and 6 clauses are obtained by punctuating clauses. And segmenting each sentence by using an LTP tool, marking parts of speech, and forming a dependency tree based on the dependency and the syntactic characteristics. The dependency relationship is obtained for SBV < mobile phone, received >, SBV < pixel, good >, COO < tone quality, pixel >, SBV < express delivery, force giving >, SBV < package, good >, SBV < store, improvement >.

For example, if the phrase "the pixels and the sound quality are both good" is processed by the above steps, and the dependency pair is extracted again by combining the COO algorithm for identifying the parallel evaluation object and the parallel evaluation word, the obtained dependency pair is < the pixels, good >, < the sound quality, good >.
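The expansion in this example can be sketched as below. The sketch assumes the parser output has already been split into object/word pairs and COO pairs, and it expands only one hop of coordination; the function name is hypothetical.

```python
def expand_coordinates(pairs, coo_pairs):
    """Copy each (object, word) pair to words coordinated with either side."""
    coordinated = {}
    for a, b in coo_pairs:
        coordinated.setdefault(a, set()).add(b)
        coordinated.setdefault(b, set()).add(a)
    expanded = []
    for obj, word in pairs:
        objs = [obj] + sorted(coordinated.get(obj, ()))
        words = [word] + sorted(coordinated.get(word, ()))
        for o in objs:
            for w in words:
                if (o, w) not in expanded:
                    expanded.append((o, w))
    return expanded

pairs = expand_coordinates([("pixels", "good")],
                           [("sound quality", "pixels")])
```

Coordination is treated as symmetric, so the result is the same whether the parser reports COO <sound quality, pixels> or COO <pixels, sound quality>.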

Step S108: extracting <commodity attribute, negation word, degree word, emotion word> evaluation match pairs from the dependency relation pairs according to part of speech.

In a specific implementation, for each extracted relation pair, the words between the evaluation object and the evaluation word are traversed for negation words and their number is counted; the parity of the count determines the sign of the negation value: an odd count assigns private the value -1, and an even count assigns private the value +1. The words between the evaluation object and the evaluation word are then traversed for degree words, and their number is counted. Finally, the <commodity attribute, private, emotion word> evaluation match pair is formed. In the example sentence of step S106, the negation word "not" is identified within the relation pair <package, good>, so the corresponding private value is -1; traversing the degree adverbs between "package" and "good" identifies "very", so the corresponding degree value is 1. The evaluation match pair extracted from this clause is <package, -1, good>.
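The parity rule and the degree count can be sketched on a token list. This is a simplification: the text traverses the dependency syntax tree, whereas the sketch simply scans the tokens lying between the two words. The English word lists and the function name are hypothetical stand-ins, and the returned tuple includes the degree count alongside private.

```python
# Hypothetical English stand-ins for the negation and degree word lists.
NEGATIONS = {"not", "no", "never"}
DEGREE_WORDS = {"very", "quite", "extremely"}

def match_pair(tokens, obj, word):
    """Return (object, private, degree, word) for one relation pair."""
    start, end = sorted((tokens.index(obj), tokens.index(word)))
    between = tokens[start + 1:end]
    negation_count = sum(t in NEGATIONS for t in between)
    private = -1 if negation_count % 2 == 1 else 1  # odd count flips polarity
    degree = sum(t in DEGREE_WORDS for t in between)
    return obj, private, degree, word

single = match_pair(["package", "is", "not", "very", "good"],
                    "package", "good")
double = match_pair(["package", "is", "not", "not", "good"],
                    "package", "good")
```

The second call shows why parity is used: two negation words cancel out, so private returns to +1.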

Step S109: combining the evaluation match pairs with the emotion dictionary, computing commendatory/derogatory values for the evaluation objects and ranking them from best to worst, and finally delivering the method through a visual interaction interface.

In a specific implementation, the commendatory or derogatory attribute of the emotion word in each extracted evaluation match pair is looked up in the emotion dictionary. The commendatory/derogatory value of the commodity attribute is then calculated according to the following formula:

For the evaluation match pair <package, -1, good> obtained in step S108, the commodity attribute is "package"; the commendatory/derogatory calculation is applied to it to obtain its emotion value.

All comment data of the commodity are traversed and processed by the above steps, and identical evaluation objects are accumulated. All commodity attributes of the commodity are finally extracted and divided into two categories, and the final result is obtained by ordering them with bubble sort. Finally, with a front end and a back end, the method is delivered on a web page through the visual interaction interface.
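The final classification and ordering can be sketched as below. The accumulated scores in the example are hypothetical numbers, not outputs of the scoring formula. Ordering each class by the magnitude of the score, and leaving out attributes whose score is exactly 0, are assumptions made for this sketch.

```python
def bubble_sort(items, key):
    """Bubble sort into descending order of key(item)."""
    items = list(items)
    n = len(items)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if key(items[j]) < key(items[j + 1]):
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:
            break  # already ordered
    return items

def rank_attributes(scores):
    """Split attributes by sign, then order each class by magnitude."""
    commendatory = [(a, s) for a, s in scores.items() if s > 0]
    derogatory = [(a, s) for a, s in scores.items() if s < 0]
    by_magnitude = lambda item: abs(item[1])
    return (bubble_sort(commendatory, by_magnitude),
            bubble_sort(derogatory, by_magnitude))

# Hypothetical accumulated emotion values per commodity attribute.
toy_scores = {"pixels": 3.0, "package": -2.0,
              "express delivery": 5.0, "battery": -4.0}
best, worst = rank_attributes(toy_scores)
```

The two ordered lists correspond directly to what a bar chart in the visual interface would display: strongest advantages first and strongest disadvantages first.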

The foregoing merely illustrates embodiments of the present invention, and the scope of the invention is not limited thereto; modifications and substitutions that would readily occur to a person skilled in the art fall within the scope of the present invention, which is defined by the appended claims.

Claims

1. A comment analysis method based on word vectors and syntactic features is characterized by comprising the following steps:

1) Acquiring comment data of commodity pages of an e-commerce website;

4) Carrying out Word vector training on the obtained preprocessed data set through a Word2Vec tool to obtain Word vectors and generate a semantic similarity matrix, wherein the step 4) specifically comprises the following steps:

4-1) utilizing a Word2Vec training data set to obtain Word vectors of words;

4-3) for example two n-dimensional word vectors a (x ₁₁ ,x ₁₂ ,…,x _1n ) And b (x) ₂₁ ,x ₂₂ ,…,x _2n ) The semantic similarity calculation formula is as follows:

wherein cos θ represents the semantic similarity value; x is x _1k Representing a kth dimension value of the word vector a; x is x _2k Representing a k-th dimension value of the word vector b;

4-4) constructing a semantic similarity matrix according to the calculated semantic similarity;

5) Establishing a probability transition matrix from the semantic similarity matrix and, in combination with a seed word set, generating a final emotion dictionary through the label propagation algorithm (LPA) and verification against the basic emotion dictionary, wherein step 5) specifically comprises:

5-6) finally, labelling the parts of speech of the sample words with the label propagation algorithm (LPA), and forming the final emotion dictionary after verification against the basic emotion dictionary;
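The general idea of step 5) can be sketched as a standard label propagation over a row-normalised similarity matrix. This is a generic sketch under stated assumptions and does not reproduce the intermediate sub-steps of the claim: the function name `label_propagation`, the fixed iteration count, the clipping of negative similarities to zero, and the encoding of seed polarities as +1 and -1 are all choices made for illustration.

```python
def label_propagation(sim, seeds, n_iter=50):
    """Propagate seed polarities over a similarity graph.

    sim   : n x n semantic similarity matrix (list of lists)
    seeds : {word index: +1 or -1} for the seed word set
    Returns a list of +1, -1 or 0 (undecided), one entry per word.
    """
    n = len(sim)
    # Row-normalise similarities into a probability transition matrix.
    # ASSUMPTION: negative cosine values are clipped to zero.
    trans = []
    for row in sim:
        weights = [max(w, 0.0) for w in row]
        total = sum(weights)
        trans.append([w / total if total else 0.0 for w in weights])

    # Each word holds [P(positive), P(negative)]; unlabeled words start uniform.
    labels = [[0.5, 0.5] for _ in range(n)]
    for i, polarity in seeds.items():
        labels[i] = [1.0, 0.0] if polarity > 0 else [0.0, 1.0]

    for _ in range(n_iter):
        labels = [
            [sum(trans[i][j] * labels[j][c] for j in range(n)) for c in range(2)]
            for i in range(n)
        ]
        # Clamp the seed words back to their known labels after every step.
        for i, polarity in seeds.items():
            labels[i] = [1.0, 0.0] if polarity > 0 else [0.0, 1.0]

    return [1 if p > q else -1 if q > p else 0 for p, q in labels]


# Words 0 and 1 are seeds; words 2 and 3 are close to 0 and 1 respectively
sim = [
    [1.0, 0.1, 0.9, 0.1],
    [0.1, 1.0, 0.1, 0.9],
    [0.9, 0.1, 1.0, 0.1],
    [0.1, 0.9, 0.1, 1.0],
]
polarities = label_propagation(sim, {0: +1, 1: -1})
```

Words labelled this way would still have to be checked against the basic emotion dictionary before entering the final dictionary, as the claim requires.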

2. The comment analysis method based on word vectors and syntactic features according to claim 1, characterized in that step 2) specifically includes:

2-1) removing illegal characters using a character matching algorithm;

3. The comment analysis method based on word vectors and syntactic features according to claim 1, characterized in that step 3) specifically includes: extracting the commendatory and derogatory words from the HowNet emotion dictionary and the NTU evaluation word dictionary respectively, merging them, and removing duplicates to form a basic emotion dictionary.

4. The comment analysis method based on word vectors and syntactic features according to claim 1, characterized in that step 6) specifically includes:

rule 1: deleting the initial component of the sentence, such as a "… advantage", "… disadvantage", "… deficiency", "… advantage" or "… benefit" sequence;

rule 4: deleting assertion words such as "feel" and "consider";

5. The comment analysis method based on word vectors and syntactic features according to claim 1, characterized in that step 7) specifically includes:

five axioms of dependency syntax:

(1) A sentence has one and only one independent component;

the dependency tree is characterized by:

(1) The nodes of the tree are the individual components of the sentence;

(2) The root node of the tree is the center component of the whole sentence;

(4) Five axioms of the dependency syntax are satisfied;

most sentence dependency relations in comments fall into five types: the subject-predicate relation, the verb-object relation, the attribute-head relation, the verb-complement relation and the parallel (coordinate) relation; dependency syntax analysis can be carried out with the LTP dependency syntax analyzer, and dependency relation pairs are extracted in combination with a parallel relation algorithm for identifying parallel evaluation objects and parallel evaluation words; the parallel relation algorithm for identifying the parallel evaluation objects and the parallel evaluation words specifically comprises the following steps:

traversing, in the dependency syntax tree, all words that lie between the two nodes of a dependency relation pair or are related to the two nodes on their left and right, wherein the pair is a subject-predicate, verb-object, attribute-head or verb-complement relation obtained from the dependency relations and the syntactic features;

judging whether all the traversed words have parallel relations or not;

and expanding the parallel evaluation objects and evaluation words of the parallel relationship.
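The expansion of parallel evaluation objects and evaluation words can be sketched as follows. This is a simplified illustration rather than the claimed traversal: it only follows arcs labelled `COO` (the LTP label for the coordinate relation) instead of inspecting every word between and around the two nodes. The `Arc` structure, the function names and the hand-written English parse are hypothetical; a real system would take the arcs from the LTP analyzer.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    idx: int    # 1-based position of the word in the sentence
    word: str
    head: int   # position of the head word, 0 for the root
    rel: str    # LTP-style relation label, e.g. SBV, VOB, ATT, CMP, COO


def coordinated(arcs, idx):
    """Return the positions linked to idx by COO arcs, including idx itself."""
    group = {idx}
    changed = True
    while changed:
        changed = False
        for arc in arcs:
            if arc.rel != "COO":
                continue
            if arc.head in group and arc.idx not in group:
                group.add(arc.idx)
                changed = True
            elif arc.idx in group and arc.head not in group:
                group.add(arc.head)
                changed = True
    return group


def expand_pairs(arcs, pairs):
    """Expand each (object position, evaluation position) pair over COO groups."""
    words = {arc.idx: arc.word for arc in arcs}
    expanded = []
    for obj_idx, eval_idx in pairs:
        for o in sorted(coordinated(arcs, obj_idx)):
            for e in sorted(coordinated(arcs, eval_idx)):
                expanded.append((words[o], words[e]))
    return expanded


# Hand-written parse of "screen, battery very good, cheap" (illustrative only)
arcs = [
    Arc(1, "screen", 4, "SBV"),
    Arc(2, "battery", 1, "COO"),
    Arc(3, "very", 4, "ADV"),
    Arc(4, "good", 0, "HED"),
    Arc(5, "cheap", 4, "COO"),
]
pairs = expand_pairs(arcs, [(1, 4)])
```

Starting from the single subject-predicate pair (screen, good), the sketch also yields the pairs for the coordinated object "battery" and the coordinated evaluation word "cheap".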

6. The comment analysis method based on word vectors and syntactic features according to claim 1, characterized in that step 8) specifically includes:

8-3) traversing, according to the dependency syntax tree, the words between the obtained evaluation object and the evaluation word to check whether negative words exist; each time a negative word is found, the count of negative words is increased by 1; when the traversal is finished, a parity judgment is carried out on the number of negative words; if the number is odd, the corresponding negative word value privative is assigned -1, and if the number is even, it is assigned +1;
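The parity rule of step 8-3) can be sketched in a few lines. Two simplifications are assumed: the words between the evaluation object and the evaluation word are taken from the linear token sequence rather than from the dependency syntax tree, and the negative word list passed in is a hypothetical example, not the list used by the invention.

```python
def privative(tokens, obj_idx, eval_idx, negation_words):
    """Return -1 for an odd number of negative words between the two positions, else +1."""
    lo, hi = sorted((obj_idx, eval_idx))
    count = sum(1 for token in tokens[lo + 1:hi] if token in negation_words)
    return -1 if count % 2 == 1 else 1


NEGATIONS = {"not", "never"}  # hypothetical negative word list
single = privative(["packaging", "is", "not", "good"], 0, 3, NEGATIONS)
double = privative(["service", "never", "not", "good"], 0, 3, NEGATIONS)
```

A single negation flips the polarity of the evaluation word, while a double negation restores it, which is why only the parity of the count matters.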

7. The comment analysis method based on word vectors and syntactic features according to claim 1, characterized in that step 9) specifically includes:

$$a.\text{score} = x_i.\text{privative} \times \bigl(x_i.\text{score} + x_i.\text{degree} \times x_i.\text{score} \times 0.5\bigr), \quad 0 < i \le n$$

where a.score is the emotion value of commodity attribute a, x_i is the i-th occurrence of the commodity attribute, privative is the value -1 or +1 obtained for the negative words corresponding to the i-th occurrence of the commodity attribute, and degree is the number of degree adverbs corresponding to the i-th occurrence of the commodity attribute; the commodity attribute emotion values are calculated, and the values of identical evaluation objects are accumulated;
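A worked sketch of the calculation in step 9) follows. The function names are hypothetical, and summing the per-occurrence values over i = 1..n is an interpretation of "accumulating the same evaluation objects", since the formula as printed shows only the per-occurrence term. The emotion word score of 1.0 used in the example is an illustrative value, not one taken from the emotion dictionary.

```python
def occurrence_score(privative, score, degree):
    """Emotion value of one occurrence: privative * (score + degree * score * 0.5)."""
    return privative * (score + degree * score * 0.5)


def attribute_score(occurrences):
    """Accumulate the per-occurrence values of one commodity attribute.

    occurrences: iterable of (privative, score, degree) tuples.
    ASSUMPTION: accumulation is a plain sum over all occurrences.
    """
    return sum(occurrence_score(p, s, d) for p, s, d in occurrences)


# One negated occurrence with one degree adverb, plus one plain positive occurrence
total = attribute_score([(-1, 1.0, 1), (1, 1.0, 0)])
```

In this example the first occurrence contributes -1.5 and the second contributes 1.0, so the accumulated value of the attribute is -0.5.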

8. A visual interactive interface, characterized in that it can perform the comment analysis method of claims 1 to 7.