CN115271816A - Bulk commodity price prediction method and device based on emotion index - Google Patents

Info

Publication number
CN115271816A
Authority
CN
China
Prior art keywords
word
emotion
commodity
candidate
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210922285.9A
Other languages
Chinese (zh)
Other versions
CN115271816B (en
Inventor
任俊玲
许英姿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Information Science and Technology University
Original Assignee
Beijing Information Science and Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Information Science and Technology University
Priority to CN202210922285.9A
Publication of CN115271816A
Application granted
Publication of CN115271816B
Current legal status: Active
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0278 Product appraisal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0611 Request for offers or quotes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Finance (AREA)
  • General Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Artificial Intelligence (AREA)
  • Strategic Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a bulk commodity price prediction method and device based on an emotion index. The method comprises the following steps: constructing a bulk commodity emotion dictionary based on bulk commodity news and a general emotion dictionary; obtaining bulk commodity news, and calculating an emotion index of each commodity in each time period based on the bulk commodity emotion dictionary; and predicting the commodity price based on the emotion index of each commodity in each time period. The invention expands the existing general emotion dictionary into a bulk commodity emotion dictionary and uses it to calculate the emotion index of each commodity, thereby improving the accuracy of commodity price prediction based on the emotion index.

Description

Bulk commodity price prediction method and device based on emotion index
Technical Field
The invention belongs to the technical field of price prediction, and particularly relates to a bulk commodity price prediction method and device based on an emotion index.
Background
Investor emotion directly influences market prices through the following mechanism: irrational factors such as subjective perception influence investor behavior, which changes the supply of and demand for commodities; supply and demand in turn affect prices, so investor emotion ultimately influences the market price. Most data reflecting investor emotion, such as financial news and investor comments, is unstructured text that a computer cannot process directly, so the text must be converted into numerical data. An emotion dictionary can be used to quantify text in this way. The emotion dictionary records the polarity and score of most meaningful words; matching a text against the dictionary yields a score for each word in a passage, and summing these word scores gives the score of the passage.
Existing approaches work from stock market news texts or investor comments and either use a general emotion dictionary directly or use an algorithm to expand it into a domain-specific emotion dictionary. The expansion method is as follows: the words of the general emotion dictionary are taken as reference words; the text is preprocessed and the word segmentation results are taken as candidate words; each candidate word is checked in turn against the reference word set; if it is present, the next candidate word is checked; if it is not, an algorithm judges the similarity between the candidate word and the reference words, and candidate words similar to the reference words are selected and added to the general emotion dictionary. To calculate the emotion index, the word segmentation result of a text is matched against the words in the emotion dictionary, the emotion value of the text is calculated from the scores of the positive words, negative words and degree words, and the values are aggregated into the emotion index. The emotion index is then input into a prediction model as one of the prediction features to predict the stock price. However, existing emotion dictionaries are not adapted to the bulk commodity field, so the emotion indexes calculated for bulk commodity varieties are unreliable and do not correctly reflect the emotion of the text, which in turn affects the use of the emotion index in price prediction. For example, the word "diving" generally refers to the sport and is a neutral word, but in the bulk commodity field, when paired with "price", it refers to a sharp price drop and is a negative emotion word.
Therefore, the general emotion dictionary has low adaptability to the bulk commodity field, and cannot meet the calculation precision of the emotion index in the field.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a bulk commodity price prediction method and device based on an emotion index.
In order to achieve the above object, the present invention adopts the following technical solutions.
In a first aspect, the invention provides a bulk commodity price prediction method based on an emotion index, which comprises the following steps:
constructing a bulk commodity emotion dictionary based on bulk commodity news and a universal emotion dictionary;
obtaining bulk commodity news, and calculating an emotion index of each commodity in each time period based on the bulk commodity emotion dictionary;
and predicting the commodity price based on the emotion indexes of each commodity in each time period.
Further, the method for constructing the sentiment dictionary of the bulk commodity comprises the following steps:
acquiring a universal emotion dictionary comprising a positive reference word set and a negative reference word set;
obtaining news from a bulk commodity news corpus, and preprocessing the news, including word segmentation, to obtain candidate words for constructing the bulk commodity emotion dictionary;
combining each candidate word with each other candidate word within a specified range around its position in the sentence to obtain candidate combined words;
combining each candidate word with each positive reference word and each negative reference word in the general emotion dictionary to obtain positive combination words and negative combination words;
judging whether each candidate combined word exists among the positive combination words or the negative combination words, and if so, calculating the emotional tendency coefficient K of each of the two candidate words in the candidate combined word using the emotional tendency pointwise mutual information (SO-PMI) algorithm, wherein when K > 0 the candidate word is a positive emotion word, when K = 0 the candidate word is a neutral emotion word, and when K < 0 the candidate word is a negative emotion word;
merging the candidate words satisfying K > first threshold > 0 into the positive reference word set of the general emotion dictionary, and the candidate words satisfying K < second threshold < 0 into the negative reference word set;
and screening the expanded general emotion dictionary, including deduplication, to obtain the bulk commodity emotion dictionary.
Further, the emotional tendency coefficient K of the candidate word c is calculated by the following formula:
$$K(c) = \sum_{i=1}^{n} \mathrm{PMI}(c, w_i^{+}) - \sum_{j=1}^{m} \mathrm{PMI}(c, w_j^{-})$$
$$\mathrm{PMI}(c, w_i^{+}) = \log \frac{N \cdot \mathrm{count}(c, w_i^{+})}{\mathrm{count}(c)\,\mathrm{count}(w_i^{+})}$$
$$\mathrm{PMI}(c, w_j^{-}) = \log \frac{N \cdot \mathrm{count}(c, w_j^{-})}{\mathrm{count}(c)\,\mathrm{count}(w_j^{-})}$$
where $w_i^{+}$ is the i-th positive reference word in the positive reference word set, i = 1, 2, …, n, and n is the number of positive reference words; $w_j^{-}$ is the j-th negative reference word in the negative reference word set, j = 1, 2, …, m, and m is the number of negative reference words; $\mathrm{count}(c)$, $\mathrm{count}(w_i^{+})$ and $\mathrm{count}(w_j^{-})$ are respectively the numbers of occurrences of the candidate word c, $w_i^{+}$ and $w_j^{-}$ in the corpus; $\mathrm{count}(c, w_i^{+})$ is the number of times the candidate word c and $w_i^{+}$ occur together in the corpus; $\mathrm{count}(c, w_j^{-})$ is the number of times the candidate word c and $w_j^{-}$ occur together in the corpus; and N is the total word frequency.
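As an illustration of how the emotional tendency coefficient K could be computed from corpus counts, the following minimal Python sketch follows the standard SO-PMI definition, which is assumed here to match the formula of the invention. The function name `so_pmi`, the use of sentence-level co-occurrence, the base-2 logarithm and the treatment of pairs that never co-occur are all assumptions made for the sketch and are not stated in the text.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(candidate, pos_refs, neg_refs, sentences):
    """Return K for `candidate`, given tokenized sentences (lists of words)."""
    word_count = Counter()  # count(w): occurrences of each word
    pair_count = Counter()  # count(a, b): sentences in which a and b co-occur
    for sent in sentences:
        word_count.update(sent)
        for a, b in combinations(sorted(set(sent)), 2):
            pair_count[(a, b)] += 1
    total = sum(word_count.values())  # N: total word frequency

    def pmi(a, b):
        co = pair_count[tuple(sorted((a, b)))]
        if co == 0:
            return 0.0  # assumption: pairs that never co-occur contribute nothing
        return math.log2(total * co / (word_count[a] * word_count[b]))

    # K = sum of PMI with positive references minus sum with negative references
    return (sum(pmi(candidate, p) for p in pos_refs)
            - sum(pmi(candidate, q) for q in neg_refs))


# Toy corpus (hypothetical): "diving" co-occurs only with negative references.
corpus = [["price", "diving", "loss"],
          ["price", "diving", "panic"],
          ["price", "rebound", "profit"]]
k_diving = so_pmi("diving", ["profit"], ["loss", "panic"], corpus)
k_rebound = so_pmi("rebound", ["profit"], ["loss", "panic"], corpus)
```

In this toy corpus `k_diving` is negative and `k_rebound` is positive, which agrees with the sign convention for K described above.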
Further, the method for calculating the emotion index of the commodity in a period of time comprises the following steps:
acquiring all news of the commodity in the time period from a bulk commodity news corpus;
dividing each news item into sentences;
segmenting each sentence into words, and calculating the emotion index of each sentence by calculating the emotion index of each word and considering the influence of negation words and degree words;
summing the emotion indexes of each sentence forming each news to obtain the emotion index of each news; and averaging the emotion indexes of the news to obtain the emotion index of the commodity in the time period.
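The aggregation described above (sum over the sentences of a news item, then average over the news items of the period) can be sketched in a few lines of Python. The function names are hypothetical, and returning 0.0 for a period without news is an assumption that the text does not address.

```python
from statistics import mean


def news_index(sentence_indexes):
    """Emotion index of one news item: the sum of its sentence indexes."""
    return sum(sentence_indexes)


def period_index(news_items):
    """Emotion index of a commodity in a period: the mean over its news items.

    `news_items` is a list in which each element is the list of sentence
    emotion indexes of one news item.
    """
    if not news_items:
        return 0.0  # assumption: a period without news is treated as neutral
    return mean(news_index(sentences) for sentences in news_items)


# Three news items with sentence indexes; their sums are -1, 3 and 1.
daily_index = period_index([[1, -2], [3], [0, 1]])
```

With a time period of 1 day, `period_index` would be called once per commodity per day.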
Further, the method of calculating the emotion index of a sentence includes:
S1, setting an emotion index variable word_polar for a word, which takes the value 1, -1 or 0; setting a negation word influence variable dense_sign, which takes the value 1 or -1; setting a degree word influence variable degree_sign, with value range [1, C]; initializing dense_sign = 1, degree_sign = 1, i = 1;
S2, acquiring the i-th word w_i in the sentence; if w_i is a positive word in the bulk commodity emotion dictionary, then word_polar_i = 1 × dense_sign, go to S4; if w_i is a negative word in the bulk commodity emotion dictionary, then word_polar_i = (-1) × dense_sign, go to S4; if w_i is not in the bulk commodity emotion dictionary, then word_polar_i = 0;
S3, if w_i is a negation word, then dense_sign = -1; if w_i is a degree word, then obtaining the degree value degree_sign_i of w_i and setting degree_sign = degree_sign_i;
S4, if i is smaller than M, updating i to i + 1 and then going to S2, otherwise going to S5, wherein M is the number of words in the sentence;
S5, calculating the emotion index of the sentence according to the following formula:
Figure BDA0003778279890000041
wherein Q is the emotion index of the sentence.
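The scan in steps S1 to S5 can be illustrated with the minimal Python sketch below. It is not the formula of the invention: the formula of step S5 is not reproduced in the text, so the sketch assumes that the degree value multiplies the signed polarity of the next emotion word and that both modifiers are reset to 1 after an emotion word has used them. The function name and the argument layout are likewise hypothetical.

```python
def sentence_index(words, positive, negative, negations, degrees):
    """Sketch of steps S1 to S5 for one segmented sentence.

    positive, negative, negations: sets of words.
    degrees: dict mapping a degree word to its degree value in [1, C].
    """
    dense_sign, degree_sign = 1, 1.0  # S1: initial modifier values
    q = 0.0
    for w in words:  # S2 to S4: scan the words in order
        if w in positive or w in negative:
            polar = 1 if w in positive else -1
            # assumption: the degree value scales the signed polarity
            q += polar * dense_sign * degree_sign
            # assumption: modifiers apply only to the next emotion word
            dense_sign, degree_sign = 1, 1.0
        elif w in negations:  # S3: a negation word sets the sign to -1
            dense_sign = -1
        elif w in degrees:  # S3: a degree word sets the weight
            degree_sign = degrees[w]
    return q  # S5 (assumed form): sum of the weighted word polarities


q_example = sentence_index(
    ["price", "not", "very", "stable"],
    positive={"stable"}, negative={"diving"},
    negations={"not"}, degrees={"very": 2.0},
)
```

Here "stable" is positive, "not" sets dense_sign to -1 and "very" sets degree_sign to 2.0, so `q_example` is -2.0 under the stated assumptions.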
In a second aspect, the present invention provides a bulk commodity price prediction device based on an emotional index, including:
the dictionary construction module is used for constructing a bulk commodity emotion dictionary based on bulk commodity news and a general emotion dictionary;
the emotion index calculation module is used for acquiring bulk commodity news and calculating an emotion index of each commodity in each time period based on the bulk commodity emotion dictionary;
and the price prediction module is used for predicting the commodity price based on the emotion index of each commodity in each time period.
Further, the dictionary construction module is specifically configured to:
acquiring a universal emotion dictionary comprising a positive reference word set and a negative reference word set;
obtaining news from a bulk commodity news corpus, and preprocessing the news, including word segmentation, to obtain candidate words for constructing the bulk commodity emotion dictionary;
combining each candidate word with each other candidate word within a specified range around its position in the sentence to obtain candidate combined words;
combining each candidate word with each positive reference word and each negative reference word in the general emotion dictionary to obtain positive combination words and negative combination words;
judging whether each candidate combined word exists among the positive combination words or the negative combination words, and if so, calculating the emotional tendency coefficient K of each of the two candidate words in the candidate combined word using the emotional tendency pointwise mutual information (SO-PMI) algorithm, wherein when K > 0 the candidate word is a positive emotion word, when K = 0 the candidate word is a neutral emotion word, and when K < 0 the candidate word is a negative emotion word;
merging the candidate words satisfying K > first threshold > 0 into the positive reference word set of the general emotion dictionary, and the candidate words satisfying K < second threshold < 0 into the negative reference word set;
and screening the expanded general emotion dictionary, including deduplication, to obtain the bulk commodity emotion dictionary.
Further, the emotional tendency coefficient K of the candidate word c is calculated as:
$$K(c) = \sum_{i=1}^{n} \mathrm{PMI}(c, w_i^{+}) - \sum_{j=1}^{m} \mathrm{PMI}(c, w_j^{-})$$
$$\mathrm{PMI}(c, w_i^{+}) = \log \frac{N \cdot \mathrm{count}(c, w_i^{+})}{\mathrm{count}(c)\,\mathrm{count}(w_i^{+})}$$
$$\mathrm{PMI}(c, w_j^{-}) = \log \frac{N \cdot \mathrm{count}(c, w_j^{-})}{\mathrm{count}(c)\,\mathrm{count}(w_j^{-})}$$
where $w_i^{+}$ is the i-th positive reference word in the positive reference word set, i = 1, 2, …, n, and n is the number of positive reference words; $w_j^{-}$ is the j-th negative reference word in the negative reference word set, j = 1, 2, …, m, and m is the number of negative reference words; $\mathrm{count}(c)$, $\mathrm{count}(w_i^{+})$ and $\mathrm{count}(w_j^{-})$ are respectively the numbers of occurrences of the candidate word c, $w_i^{+}$ and $w_j^{-}$ in the corpus; $\mathrm{count}(c, w_i^{+})$ is the number of times the candidate word c and $w_i^{+}$ occur together in the corpus; $\mathrm{count}(c, w_j^{-})$ is the number of times the candidate word c and $w_j^{-}$ occur together in the corpus; and N is the total word frequency.
Further, the method for calculating the emotion index of the commodity in a period of time comprises the following steps:
acquiring all news of the commodity in the time period from a bulk commodity news corpus;
dividing each news item into sentences;
segmenting each sentence into words, and calculating the emotion index of each sentence by calculating the emotion index of each word and considering the influence of negation words and degree words;
summing the emotion indexes of each sentence forming each news to obtain the emotion index of each news; and averaging the emotion indexes of the news to obtain the emotion index of the commodity in the time period.
Further, the method of calculating the emotion index of a sentence includes:
S1, setting an emotion index variable word_polar for a word, which takes the value 1, -1 or 0; setting a negation word influence variable dense_sign, which takes the value 1 or -1; setting a degree word influence variable degree_sign, with value range [1, C]; initializing dense_sign = 1, degree_sign = 1, i = 1;
S2, acquiring the i-th word w_i in the sentence; if w_i is a positive word in the bulk commodity emotion dictionary, then word_polar_i = 1 × dense_sign, go to S4; if w_i is a negative word in the bulk commodity emotion dictionary, then word_polar_i = (-1) × dense_sign, go to S4; if w_i is not in the bulk commodity emotion dictionary, then word_polar_i = 0;
S3, if w_i is a negation word, then dense_sign = -1; if w_i is a degree word, then obtaining the degree value degree_sign_i of w_i and setting degree_sign = degree_sign_i;
S4, if i is smaller than M, updating i to i + 1 and then going to S2, otherwise going to S5, wherein M is the number of words in the sentence;
S5, calculating the emotion index of the sentence according to the following formula:
Figure BDA0003778279890000061
wherein Q is the sentiment index of the sentence.
Compared with the prior art, the invention has the following beneficial effects.
According to the invention, a bulk commodity emotion dictionary is constructed based on bulk commodity news and a general emotion dictionary, bulk commodity news is obtained, the emotion index of each commodity in each time period is calculated based on the bulk commodity emotion dictionary, and the commodity price is predicted based on the emotion index of each commodity in each time period, thereby realizing commodity price prediction based on the emotion index. The invention expands the existing general emotion dictionary into a bulk commodity emotion dictionary and uses it to calculate the emotion index of each commodity, thereby improving the accuracy of commodity price prediction based on the emotion index.
Drawings
Fig. 1 is a flowchart of a bulk commodity price prediction method based on an emotional index according to an embodiment of the present invention.
FIG. 2 is a flow chart of another embodiment of the present invention.
FIG. 3 is a flow chart of the construction of the bulk commodity emotion dictionary.
Fig. 4 is a flowchart of single news emotion value calculation.
Fig. 5 is a block diagram of a device for predicting prices of bulk goods based on emotional index according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer and more obvious, the present invention is further described below with reference to the accompanying drawings and the detailed description. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a bulk commodity price prediction method based on an emotion index according to an embodiment of the present invention, where the method includes the following steps:
Step 101, constructing a bulk commodity emotion dictionary based on bulk commodity news and a general emotion dictionary;
Step 102, acquiring bulk commodity news, and calculating an emotion index of each commodity in each time period based on the bulk commodity emotion dictionary;
Step 103, predicting the commodity price based on the emotion index of each commodity in each time period.
In this embodiment, step 101 is mainly used to construct the bulk commodity emotion dictionary. Practice shows that market emotion influences factors such as bulk commodity prices. The emotion expressed in text is an important reflection of market emotion, and an emotion dictionary is needed to quantify the emotion of bulk commodity text into an emotion index that a computer can process. Existing general dictionaries have difficulty quantifying the emotion of bulk commodity text accurately, so the emotion calculation is prone to deviation. For example, the word "diving" generally refers to the sport and is a neutral word, but in the bulk commodity field, when paired with "price", it refers to a sharp price drop and is a negative emotion word. The general emotion dictionary therefore adapts poorly to the bulk commodity field and is not sufficient to support the calculation of emotion indexes in this field. For this reason, this embodiment constructs, on the basis of bulk commodity news and a general emotion dictionary, a bulk commodity emotion dictionary for calculating emotion indexes in the bulk commodity field. The construction method extracts candidate words from bulk commodity news, screens them, and adds the selected candidate words to the general emotion dictionary, thereby expanding it into a bulk commodity emotion dictionary. A specific construction method is given in a later embodiment.
In this embodiment, step 102 is mainly used to calculate the emotion index of each commodity. The emotion index of a commodity is obtained by aggregating the emotion indexes of the sentences and words in the news texts about that commodity. Emotion words generally include positive emotion words, negative emotion words and neutral emotion words. The emotion index of a word depends mainly on its emotional attribute (that is, which class of emotion word it belongs to); for example, the emotion index of a positive emotion word can be set to a positive number, that of a negative emotion word to a negative number, and that of a neutral emotion word to 0. To calculate the emotion index of a word more accurately, the influence of any negation word and degree word preceding the word is also considered. The emotional attribute of a word is obtained by querying an emotion dictionary. Once the bulk commodity emotion dictionary has been built, using it in place of the general emotion dictionary allows the commodity emotion index to be calculated more effectively. To predict commodity prices effectively, the emotion index of each commodity must be obtained for different time periods. The length of the time period is determined from industry experience and can generally be taken as 1 day, that is, the emotion index of each commodity is calculated daily. A technical solution for calculating the emotion index of a commodity within a time period is given in a later embodiment.
In this embodiment, step 103 is mainly used to predict the commodity price based on the emotion index. There are many ways to do this. For example, an artificial neural network prediction model can be constructed whose output is the commodity price and whose inputs are the emotion index of the commodity together with other quantities that significantly influence the commodity price, such as the open interest, trading volume and turnover of the commodity. A training set and a test set are constructed after data preprocessing steps such as missing value handling and standardization; the model is trained on the training set and then tested and optimized on the test set. Inputting the emotion index of the commodity and the other features into the trained model yields the predicted commodity price. For a concrete prediction method, see the paper "Coal price prediction research based on BP neural network" by Yinjian Heng, published in Science and Technology and Innovation, 2021, issue 02.
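The data preprocessing mentioned above (missing value handling, standardization, and construction of a training set and a test set) could look like the following minimal Python sketch. The specific choices are assumptions made for illustration only: forward filling of missing values, z-score standardization, and an 80/20 split in time order. The text does not specify any of them, and the prediction model itself is not shown.

```python
from statistics import mean, pstdev


def fill_missing(column):
    """Replace None with the previous value (forward fill).

    A gap at the very start has no previous value and is left as None.
    """
    filled, last = [], None
    for x in column:
        if x is None:
            x = last
        filled.append(x)
        last = x
    return filled


def standardize(column):
    """Scale a fully filled column to zero mean and unit standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma if sigma else 0.0 for x in column]


def chronological_split(rows, train_ratio=0.8):
    """Split time-ordered rows into (training set, test set) without shuffling."""
    cut = int(len(rows) * train_ratio)
    return rows[:cut], rows[cut:]


# Hypothetical daily emotion indexes with one missing day.
emotion = standardize(fill_missing([1.0, None, 3.0, 2.0, 3.0]))
train, test = chronological_split(emotion)
```

Splitting in time order rather than at random keeps future observations out of the training set, which matters for price series.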
The method of this embodiment expands the existing general emotion dictionary into a bulk commodity emotion dictionary and uses it to calculate the commodity emotion index, thereby improving the accuracy of commodity price prediction based on the emotion index.
Fig. 2 shows the flow chart of another embodiment, in which the prices of two commodities, A and B, are predicted.
As an optional embodiment, the method for constructing the bulk commodity emotion dictionary based on the universal emotion dictionary comprises the following steps:
acquiring a general emotion dictionary comprising a positive reference word set and a negative reference word set;
obtaining news from a bulk commodity news corpus, and preprocessing the news, including word segmentation, to obtain candidate words for constructing the bulk commodity emotion dictionary;
combining each candidate word with each other candidate word within a specified range around its position in the sentence to obtain candidate combined words;
combining each candidate word with each positive reference word and each negative reference word in the general emotion dictionary to obtain positive combination words and negative combination words;
judging whether each candidate combined word exists among the positive combination words or the negative combination words, and if so, calculating the emotional tendency coefficient K of each of the two candidate words in the candidate combined word using the emotional tendency pointwise mutual information (SO-PMI) algorithm, wherein when K > 0 the candidate word is a positive emotion word, when K = 0 the candidate word is a neutral emotion word, and when K < 0 the candidate word is a negative emotion word;
merging the candidate words satisfying K > first threshold > 0 into the positive reference word set of the general emotion dictionary, and the candidate words satisfying K < second threshold < 0 into the negative reference word set;
and screening the expanded general emotion dictionary, including deduplication, to obtain the bulk commodity emotion dictionary.
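The last two steps above (merging by threshold and screening with deduplication) can be sketched as follows in Python. The function name, the argument names and the threshold values in the usage lines are hypothetical. Dropping a word that ends up in both sets is an assumption about what "screening" might involve; the text only states that screening includes deduplication.

```python
def expand_dictionary(pos_refs, neg_refs, k_values, pos_threshold, neg_threshold):
    """Merge candidate words into the reference sets by K, then deduplicate.

    k_values maps each candidate word to its emotional tendency coefficient K.
    """
    if not (pos_threshold > 0 > neg_threshold):
        raise ValueError("need pos_threshold > 0 > neg_threshold")
    positive, negative = set(pos_refs), set(neg_refs)  # sets remove duplicates
    for word, k in k_values.items():
        if k > pos_threshold:
            positive.add(word)
        elif k < neg_threshold:
            negative.add(word)
    # assumption: a word found in both sets is ambiguous and is dropped
    clash = positive & negative
    return positive - clash, negative - clash


pos_words, neg_words = expand_dictionary(
    ["profit", "profit"], ["loss"],
    {"diving": -4.3, "rebound": 3.2, "market": 0.1},
    pos_threshold=1.0, neg_threshold=-1.0,
)
```

Candidates whose K lies between the two thresholds, such as "market" in the usage lines, are left out of both sets.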
This embodiment provides a technical scheme for constructing the bulk commodity emotion dictionary, as shown in Fig. 3. Because the bulk commodity emotion dictionary is built on a general emotion dictionary, the existing general emotion dictionary for the financial field is introduced first. The bulk commodity trading market is mainly divided into the spot market, the electronic trading market and the futures market. Futures, which include commodity futures and financial futures, are comparable to stocks in the financial field. Observation of bulk commodity news texts and financial news texts shows that, apart from vocabulary specific to bulk commodities (such as "warehouse-through", "arbitrage" and "hedging"), most bulk commodity vocabulary also appears in the financial field. This embodiment therefore expands an existing financial field emotion dictionary to obtain the bulk commodity emotion dictionary. Yao et al. used a dictionary reorganization method and a long short-term memory model, combined with 4 general emotion dictionaries, to construct a Chinese emotion dictionary suitable for annual reports (formal texts) and social media (informal texts) in the financial field. This emotion dictionary divides Chinese words into four categories according to usage scenario and emotional tendency: annual report positive words, annual report negative words, social media positive words and social media negative words. The specific division and the number of words in each category are shown in Table 1.
Table 1. Categories of the financial field emotion dictionary
Figure BDA0003778279890000101
Since the wording of the bulk commodity news texts used in this embodiment lies between formal and informal language, this embodiment, building on the research of Yao et al., merges the positive words and the negative words of the two contexts (annual reports and social media) and removes duplicates to obtain the positive and negative reference word sets. The positive reference word set contains 3856 positive reference emotion words and the negative reference word set contains 2076 negative reference emotion words, for a total of 5932 reference emotion words. Statistics show that 3000 reference emotion words from the financial field Chinese emotion dictionary appear in the word segmentation results of the bulk commodity corpus, about 50.57% of all reference emotion words. This indicates that the dictionary fits bulk commodity news text reasonably well and that the bulk commodity emotion dictionary can be built on it.
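The counts quoted in this paragraph are consistent with one another, as the following arithmetic check shows:

$$3856 + 2076 = 5932, \qquad \frac{3000}{5932} \approx 0.5057 = 50.57\%$$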
After the general emotion dictionary is determined, news is acquired from the bulk commodity news corpus and the news texts are preprocessed to obtain candidate words for constructing the bulk commodity emotion dictionary. Preprocessing in this embodiment mainly consists of segmenting the text with the jieba library for Python and removing punctuation marks and stop words without obvious meaning, such as "some" and "soon". This embodiment adds the general emotion dictionary to jieba's initial dictionary, which prevents words in the emotion dictionary from being split incorrectly. For example, the financial field emotion dictionary contains the negative phrase "debt rising". The jieba initial dictionary contains the two words "debt" and "rising" but not the phrase "debt rising", so without the financial field emotion dictionary the phrase would be split into the negative noun "debt" and the positive verb "rising"; their emotion values would sum to zero, and the phrase would be judged a neutral phrase that contributes nothing to the emotion value. However, the phrase means that liabilities have increased and is negative, so that judgment misrepresents its meaning. Because the phrase "debt rising" is present in the financial field emotion dictionary, once that dictionary has been added to jieba the phrase is no longer split and its emotion value is calculated correctly.
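The preprocessing described above can be sketched in Python as follows. `jieba.add_word` and `jieba.lcut` are existing functions of the jieba library; the helper names `preprocess` and `jieba_cut_with_dictionary`, the punctuation set and the stop word list in the usage line are hypothetical. The segmenter is passed in as a function so that the filtering logic can be tried without jieba installed.

```python
import string

try:
    import jieba  # third-party segmenter named in the text
except ImportError:  # keeps the sketch importable without jieba
    jieba = None


def preprocess(text, cut, stop_words):
    """Segment `text` with `cut`, then drop stop words and punctuation."""
    punctuation = set(string.punctuation) | set("，。！？；：、“”‘’（）《》")
    return [w for w in cut(text)
            if w.strip() and w not in stop_words and w not in punctuation]


def jieba_cut_with_dictionary(dictionary_words):
    """Register every emotion dictionary word with jieba and return jieba.lcut."""
    if jieba is None:
        raise RuntimeError("jieba is not installed")
    for word in dictionary_words:
        jieba.add_word(word)  # lets jieba keep the phrase as one token
    return jieba.lcut


# Whitespace splitting stands in for the segmenter in this usage line.
tokens = preprocess("debt rising , soon", str.split, {"soon"})
```

With jieba available, `preprocess(text, jieba_cut_with_dictionary(emotion_words), stop_words)` would segment Chinese text after the emotion dictionary words have been registered, where `emotion_words` and `stop_words` are assumed to be loaded beforehand.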
After candidate words are extracted from the news text, each candidate word is combined with every other candidate word within a specified range around its position in the sentence to obtain candidate combined words. For example, if candidate words B and C occur in the same sentence as candidate word A, the candidate combined words AB and AC are obtained. In addition, each candidate word is combined with each positive reference word and each negative reference word in the general emotion dictionary to obtain positive combined words and negative combined words.
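The pairing step can be sketched as follows. The sketch assumes that the "specified range" is a window of `window` token positions on either side of the word; the function names and the window size are illustrative and not taken from the source.

```python
def candidate_pairs(sentence_tokens, window):
    """Pair each candidate word with every other candidate within `window` positions."""
    pairs = []
    for i, word in enumerate(sentence_tokens):
        lo = max(0, i - window)
        hi = min(len(sentence_tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((word, sentence_tokens[j]))
    return pairs


def reference_pairs(candidates, reference_words):
    """Pair every candidate word with every reference word (positive or negative set)."""
    return {(c, r) for c in candidates for r in reference_words}


# The A/B/C example from the text: A is paired with B and with C.
pairs_for_a = [p for p in candidate_pairs(["A", "B", "C"], window=2) if p[0] == "A"]
positive_combined = reference_pairs({"A"}, {"good"})  # "good" is a made-up reference word
```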
The following processing is performed for each candidate combined word: it is judged whether the candidate combined word exists among the positive combined words or the negative combined words; if so, the emotional tendency coefficient K of each of the two candidate words in the candidate combined word is calculated; if not, the candidate combined word is discarded. Whether a candidate word is a positive, neutral or negative emotion word is judged from the value of K. When K > 0, the candidate word is a positive emotion word, and the larger K is, the stronger the positive emotion; when K = 0, the candidate word is a neutral emotion word; when K < 0, the candidate word is a negative emotion word, and the larger the absolute value of K, the stronger the negative emotion. This embodiment calculates the emotional tendency coefficient of a candidate word with the Semantic Orientation Pointwise Mutual Information (SO-PMI) algorithm. SO-PMI measures the mutual information between a candidate word and the positive and negative reference words, where the size of the mutual information value represents how strongly two words are correlated; it then takes the difference between the word's mutual information with the positive reference words and with the negative reference words, and judges the emotional tendency of the candidate word from the sign of that difference. A specific embodiment for calculating the emotional tendency coefficient with SO-PMI is given later.
After the emotional tendency coefficient of each candidate word is obtained, the candidate words are screened on the basis of that coefficient: candidate words with K greater than a first threshold are merged into the positive reference word set of the general emotion dictionary, and candidate words with K less than a second threshold are merged into the negative reference word set of the general emotion dictionary. The first threshold is greater than 0 and the second threshold is less than 0.
Finally, the expanded general emotion dictionary is de-duplicated and manually screened to obtain the bulk commodity emotion dictionary.
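The threshold screening can be sketched as below. The threshold values, the example words and their K values are made up for illustration; the source only requires the first threshold to be positive and the second to be negative.

```python
def screen_candidates(k_values, first_threshold, second_threshold):
    """Split candidates into new positive / negative words by their K value."""
    if not (first_threshold > 0 > second_threshold):
        raise ValueError("expected first_threshold > 0 > second_threshold")
    positive = {w for w, k in k_values.items() if k > first_threshold}
    negative = {w for w, k in k_values.items() if k < second_threshold}
    return positive, negative


def expand_dictionary(pos_ref, neg_ref, k_values, first_threshold, second_threshold):
    """Merge the screened candidates into the reference sets (sets de-duplicate)."""
    new_pos, new_neg = screen_candidates(k_values, first_threshold, second_threshold)
    return set(pos_ref) | new_pos, set(neg_ref) | new_neg


# Hypothetical K values for three candidate words.
k_values = {"surge": 2.4, "steady": 0.1, "slump": -1.8}
new_pos, new_neg = screen_candidates(k_values, 1.0, -1.0)
pos_set, neg_set = expand_dictionary({"gain"}, {"loss"}, k_values, 1.0, -1.0)
```

Words whose K lies between the two thresholds, such as "steady" here, are left out of both sets.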
In practice, the numbers of positive and negative emotion words added to the general emotion dictionary may differ considerably. For example, fewer positive emotion words may be added; a possible reason is that the positive word set of the original financial-field emotion dictionary is already fairly complete, so most of the positive emotion words merged into it duplicate words of the original dictionary and are removed by de-duplication.
As an alternative embodiment, the emotional tendency coefficient K of the candidate word c is calculated by the following formulas:
$$K(c) = \frac{1}{n}\sum_{i=1}^{n}\mathrm{PMI}(c, p_i) - \frac{1}{m}\sum_{j=1}^{m}\mathrm{PMI}(c, q_j)$$
$$\mathrm{PMI}(c, p_i) = \log\frac{N \cdot \mathrm{count}(c, p_i)}{\mathrm{count}(c)\,\mathrm{count}(p_i)}$$
$$\mathrm{PMI}(c, q_j) = \log\frac{N \cdot \mathrm{count}(c, q_j)}{\mathrm{count}(c)\,\mathrm{count}(q_j)}$$
where $p_i$ is the i-th positive reference word in the positive reference word set, i = 1, 2, ..., n, and n is the number of positive reference words; $q_j$ is the j-th negative reference word in the negative reference word set, j = 1, 2, ..., m, and m is the number of negative reference words; count(c), count($p_i$) and count($q_j$) are the numbers of occurrences of the candidate word c, $p_i$ and $q_j$ in the corpus, respectively; count(c, $p_i$) is the number of times c and $p_i$ occur together in the corpus; count(c, $q_j$) is the number of times c and $q_j$ occur together in the corpus; and N is the total word frequency.
This embodiment provides a method for calculating the emotional tendency coefficient of a candidate word. The probability that two words occur together (the co-occurrence rate) can be used to indicate the correlation between them: the higher the co-occurrence rate, the stronger the correlation. In the first formula above, $\mathrm{PMI}(c, p_i)$ is the co-occurrence rate of the i-th positive reference word $p_i$ with the candidate word c, and $\mathrm{PMI}(c, q_j)$ is the co-occurrence rate of the j-th negative reference word $q_j$ with c. They are calculated by the second and third formulas: a positive value indicates that the candidate word and the reference word are correlated, a negative value indicates that they are mutually exclusive (the probability of their occurring together is very small), and a value of 0 indicates that they are neither correlated nor mutually exclusive. The emotional tendency coefficient K of the candidate word c equals the mean co-occurrence rate of c with the positive reference words minus the mean co-occurrence rate of c with the negative reference words.
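As a worked illustration of "mean co-occurrence rate with the positive reference words minus mean co-occurrence rate with the negative reference words", the following sketch computes K from raw counts. It assumes the standard pointwise mutual information form log(N * count(c, w) / (count(c) * count(w))) with base-2 logarithms; the base, the function names, the example words and all counts are assumptions, and the sketch does not address pairs that never co-occur (a zero joint count makes the logarithm undefined).

```python
import math


def pmi(count_c, count_ref, count_joint, total):
    """PMI(c, ref) = log2(N * count(c, ref) / (count(c) * count(ref))).

    Raises ValueError for a zero joint count; smoothing is outside this sketch.
    """
    return math.log2(total * count_joint / (count_c * count_ref))


def so_pmi(c, pos_refs, neg_refs, count, joint, total):
    """K(c): mean PMI with positive references minus mean PMI with negative ones."""
    def mean_pmi(refs):
        values = [pmi(count[c], count[r], joint[(c, r)], total) for r in refs]
        return sum(values) / len(values)

    return mean_pmi(pos_refs) - mean_pmi(neg_refs)


# Hypothetical corpus statistics (N = 1000 tokens in total).
count = {"rally": 10, "gain": 25, "loss": 50}
joint = {("rally", "gain"): 4, ("rally", "loss"): 1}

# PMI(rally, gain) = log2(1000 * 4 / (10 * 25)) = log2(16) = 4
# PMI(rally, loss) = log2(1000 * 1 / (10 * 50)) = log2(2)  = 1
k = so_pmi("rally", ["gain"], ["loss"], count, joint, total=1000)
```

Here K = 4 - 1 = 3 > 0, so the hypothetical candidate word would be classed as a positive emotion word.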
As an alternative embodiment, the method of calculating the emotion index of a commodity over a time period includes:
acquiring all news of the commodity in the time period from the bulk commodity news corpus;
dividing each news item into sentences;
segmenting each sentence into words, and calculating the emotion index of each sentence from the emotion index of each word, taking into account the influence of negation words and degree words;
summing the emotion indexes of the sentences of each news item to obtain the emotion index of that news item; and averaging the emotion indexes of the news items to obtain the emotion index of the commodity in the time period.
This embodiment provides a method for calculating the emotion index of a commodity in a time period. All news of the commodity in the time period is acquired from the bulk commodity news corpus; each news item is divided into sentences and each sentence is segmented into words; the emotion index of each sentence is calculated from the emotion indexes of its words; the emotion indexes of all sentences of a news item are summed to give the emotion index of that news item; and finally the emotion indexes of the news items are averaged to give the emotion index of the commodity in the time period. Note that when the emotion index of a sentence is calculated, emotion words are the main contributors, but the influence of negation words and degree words among the non-emotion words is also taken into account, which improves the accuracy of the emotion index.
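The aggregation described above (sum over the sentences of a news item, then average over the news items of the period) can be sketched as follows. The sketch assumes the per-sentence emotion indexes have already been computed; the function names and sample values are illustrative.

```python
def news_index(sentence_indexes):
    """Emotion index of one news item: the sum of its sentence indexes."""
    return sum(sentence_indexes)


def commodity_index(news_items):
    """Emotion index of a commodity in a period: the mean of its news indexes.

    `news_items` is a list of news items, each given as a list of
    per-sentence emotion indexes.
    """
    if not news_items:
        raise ValueError("no news for this commodity in the time period")
    return sum(news_index(s) for s in news_items) / len(news_items)


# Three hypothetical news items with sentence indexes already computed.
period_news = [[1, -2, 0], [3], [0, 1]]
index = commodity_index(period_news)  # news indexes -1, 3, 1 -> mean 1.0
```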
As an alternative embodiment, the method of calculating the sentiment index of a sentence comprises:
S1, setting a word emotion index variable word_polar, which takes the value 1, -1 or 0; setting a negation word influence variable deny_sign, which takes the value 1 or -1; setting a degree word influence variable degree_sign, whose value range is [1, C]; initializing deny_sign = 1, degree_sign = 1, i = 1;
S2, obtaining the i-th word w_i in the sentence; if w_i is a positive word in the bulk commodity emotion dictionary, then word_polar_i = 1 * deny_sign * degree_sign, go to S4; if w_i is a negative word in the bulk commodity emotion dictionary, then word_polar_i = (-1) * deny_sign * degree_sign, go to S4; if w_i is not in the bulk commodity emotion dictionary, then word_polar_i = 0;
S3, if w_i is a negation word, then deny_sign = -1; if w_i is a degree word, then obtaining the degree value degree_sign_i of w_i and setting degree_sign = degree_sign_i;
S4, if i is smaller than M, updating i to i + 1 and going to S2, otherwise going to S5, wherein M is the number of words in the sentence;
S5, calculating the emotion index of the sentence according to the following formula:
$$Q = \sum_{i=1}^{M} \mathrm{word\_polar}_i$$
wherein Q is the emotion index of the sentence.
This embodiment provides a technical scheme for calculating the emotion index of a single sentence, as shown in Fig. 4. The emotion index of a single word takes the value 1, -1 or 0: positive emotion words and negative emotion words have emotion indexes of 1 and -1 respectively, and a word that is neither has an emotion index of 0. A negation word influence variable deny_sign and a degree word influence variable degree_sign are also set. deny_sign takes the value 1 or -1; if a negation word such as 'not' or 'no' appears in the sentence, deny_sign = -1. The value range [1, C] of degree_sign can be determined empirically, with C generally an integer greater than 1. One way of obtaining degree_sign is as follows: each degree word has a degree value between 0 and 3, given to one decimal place, for example degree = 0.3 for 'mild' and degree = 3 for 'hundred percent'; degree_sign is set to degree + 1, where the degree values are taken from a resource found online. The steps for calculating the emotion index of a sentence are given in detail above and are not repeated here.
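Steps S1 to S5 can be sketched as a single loop over the words of a sentence. The sketch makes three assumptions that should be checked against the original figures: Q is taken to be the sum of the word_polar values, degree_sign is taken to be the degree value plus 1, and the two modifiers are never reset within a sentence because the steps as written do not describe a reset. The word lists are hypothetical English stand-ins; only the degree values for "mild" and "hundred percent" come from the text.

```python
def sentence_index(words, positive, negative, negation_words, degree_values):
    """Emotion index Q of one segmented sentence, following S1-S5 literally."""
    deny_sign = 1        # S1: negation modifier, 1 or -1
    degree_sign = 1.0    # S1: degree modifier, in [1, C]
    q = 0.0
    for w in words:      # S2-S4: visit words w_1 .. w_M in order
        if w in positive:
            q += 1 * deny_sign * degree_sign       # word_polar_i for a positive word
        elif w in negative:
            q += -1 * deny_sign * degree_sign      # word_polar_i for a negative word
        else:
            # word_polar_i = 0; S3 updates the modifiers for later words.
            if w in negation_words:
                deny_sign = -1
            if w in degree_values:
                degree_sign = degree_values[w] + 1  # assumed: degree value + 1
    return q             # S5: sum of all word_polar_i


positive = {"good"}
negative = {"loss"}
negation_words = {"not"}
degree_values = {"mild": 0.3, "hundred percent": 3.0}

q_plain = sentence_index(["good"], positive, negative, negation_words, degree_values)
q_negated = sentence_index(["price", "not", "good"], positive, negative,
                           negation_words, degree_values)
q_degree = sentence_index(["mild", "loss"], positive, negative,
                          negation_words, degree_values)
```

Because deny_sign is only ever set to -1, a second negation word in the same sentence does not flip the sign back in this literal reading; a production implementation might choose to handle that differently.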
Fig. 5 is a schematic diagram illustrating a price prediction apparatus for a bulk commodity based on an emotion index, the apparatus including:
the dictionary construction module 11 is used for constructing a bulk commodity emotion dictionary based on bulk commodity news and a general emotion dictionary;
the emotion index calculation module 12 is used for acquiring bulk commodity news and calculating the emotion index of each commodity in each time period based on the bulk commodity emotion dictionary;
and the price prediction module 13 is used for predicting the price of the commodity based on the emotion index of each commodity in each time period.
The apparatus of this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 1, and the implementation principle and the technical effect are similar, which are not described herein again. The same applies to the following embodiments, which are not further described.
As an optional embodiment, the dictionary building module 11 is specifically configured to:
acquiring a general emotion dictionary comprising a positive reference word set and a negative reference word set;
obtaining news from the bulk commodity news corpus, and preprocessing the news, including word segmentation, to obtain candidate words for constructing the bulk commodity emotion dictionary;
combining each candidate word with each other candidate word within a specified range around its position in the sentence to obtain candidate combined words;
combining each candidate word with each positive reference word and each negative reference word in the general emotion dictionary to obtain positive combined words and negative combined words;
judging whether each candidate combined word exists among the positive combined words or the negative combined words, and if so, calculating the emotional tendency coefficient K of each of the two candidate words in the candidate combined word by using the SO-PMI algorithm, wherein when K > 0 the candidate word is a positive emotion word, when K = 0 the candidate word is a neutral emotion word, and when K < 0 the candidate word is a negative emotion word;
merging the candidate words satisfying K > first threshold > 0 into the positive reference word set of the general emotion dictionary, and the candidate words satisfying K < second threshold < 0 into the negative reference word set of the general emotion dictionary;
and screening the expanded general emotion dictionary, including de-duplication, to obtain the bulk commodity emotion dictionary.
As an alternative embodiment, the emotional tendency coefficient K of the candidate word c is calculated by the following formula:
$$K(c) = \frac{1}{n}\sum_{i=1}^{n}\mathrm{PMI}(c, p_i) - \frac{1}{m}\sum_{j=1}^{m}\mathrm{PMI}(c, q_j)$$
$$\mathrm{PMI}(c, p_i) = \log\frac{N \cdot \mathrm{count}(c, p_i)}{\mathrm{count}(c)\,\mathrm{count}(p_i)}$$
$$\mathrm{PMI}(c, q_j) = \log\frac{N \cdot \mathrm{count}(c, q_j)}{\mathrm{count}(c)\,\mathrm{count}(q_j)}$$
where $p_i$ is the i-th positive reference word in the positive reference word set, i = 1, 2, ..., n, and n is the number of positive reference words; $q_j$ is the j-th negative reference word in the negative reference word set, j = 1, 2, ..., m, and m is the number of negative reference words; count(c), count($p_i$) and count($q_j$) are the numbers of occurrences of the candidate word c, $p_i$ and $q_j$ in the corpus, respectively; count(c, $p_i$) is the number of times c and $p_i$ occur together in the corpus; count(c, $q_j$) is the number of times c and $q_j$ occur together in the corpus; and N is the total word frequency.
As an alternative embodiment, the method of calculating the emotion index of a commodity over a time period includes:
acquiring all news of the commodity in the time period from the bulk commodity news corpus;
dividing each news item into sentences;
segmenting each sentence into words, and calculating the emotion index of each sentence from the emotion index of each word, taking into account the influence of negation words and degree words;
summing the emotion indexes of the sentences of each news item to obtain the emotion index of that news item; and averaging the emotion indexes of the news items to obtain the emotion index of the commodity in the time period.
As an alternative embodiment, the method of calculating the emotion index of a sentence includes:
S1, setting a word emotion index variable word_polar, which takes the value 1, -1 or 0; setting a negation word influence variable deny_sign, which takes the value 1 or -1; setting a degree word influence variable degree_sign, whose value range is [1, C]; initializing deny_sign = 1, degree_sign = 1, i = 1;
S2, obtaining the i-th word w_i in the sentence; if w_i is a positive word in the bulk commodity emotion dictionary, then word_polar_i = 1 * deny_sign * degree_sign, go to S4; if w_i is a negative word in the bulk commodity emotion dictionary, then word_polar_i = (-1) * deny_sign * degree_sign, go to S4; if w_i is not in the bulk commodity emotion dictionary, then word_polar_i = 0;
S3, if w_i is a negation word, then deny_sign = -1; if w_i is a degree word, then obtaining the degree value degree_sign_i of w_i and setting degree_sign = degree_sign_i;
S4, if i is smaller than M, updating i to i + 1 and going to S2, otherwise going to S5, wherein M is the number of words in the sentence;
S5, calculating the emotion index of the sentence according to the following formula:
$$Q = \sum_{i=1}^{M} \mathrm{word\_polar}_i$$
wherein Q is the emotion index of the sentence.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are also within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A bulk commodity price prediction method based on emotion indexes is characterized by comprising the following steps:
constructing a bulk commodity emotion dictionary based on bulk commodity news and a general emotion dictionary;
acquiring bulk commodity news, and calculating the emotion index of each commodity in each time period based on the bulk commodity emotion dictionary;
and predicting the commodity price based on the emotion index of each commodity in each time period.
2. The method for predicting price of bulk commodity based on emotion index as recited in claim 1, wherein the method for constructing emotion dictionary of bulk commodity comprises:
acquiring a general emotion dictionary comprising a positive reference word set and a negative reference word set;
obtaining news from the bulk commodity news corpus, and preprocessing the news, including word segmentation, to obtain candidate words for constructing the bulk commodity emotion dictionary;
combining each candidate word with each other candidate word within a specified range around its position in the sentence to obtain candidate combined words;
combining each candidate word with each positive reference word and each negative reference word in the general emotion dictionary to obtain positive combined words and negative combined words;
judging whether each candidate combined word exists among the positive combined words or the negative combined words, and if so, calculating the emotional tendency coefficient K of each of the two candidate words in the candidate combined word by using the emotional tendency pointwise mutual information (SO-PMI) algorithm, wherein when K > 0 the candidate word is a positive emotion word, when K = 0 the candidate word is a neutral emotion word, and when K < 0 the candidate word is a negative emotion word;
merging the candidate words satisfying K > first threshold > 0 into the positive reference word set of the general emotion dictionary, and the candidate words satisfying K < second threshold < 0 into the negative reference word set of the general emotion dictionary;
and screening the expanded general emotion dictionary, including de-duplication, to obtain the bulk commodity emotion dictionary.
3. The method for predicting price of bulk goods based on emotion index according to claim 2, wherein a formula for calculating emotion tendency coefficient K of candidate word c is:
$$K(c) = \frac{1}{n}\sum_{i=1}^{n}\mathrm{PMI}(c, p_i) - \frac{1}{m}\sum_{j=1}^{m}\mathrm{PMI}(c, q_j)$$
$$\mathrm{PMI}(c, p_i) = \log\frac{N \cdot \mathrm{count}(c, p_i)}{\mathrm{count}(c)\,\mathrm{count}(p_i)}$$
$$\mathrm{PMI}(c, q_j) = \log\frac{N \cdot \mathrm{count}(c, q_j)}{\mathrm{count}(c)\,\mathrm{count}(q_j)}$$
where $p_i$ is the i-th positive reference word in the positive reference word set, i = 1, 2, ..., n, and n is the number of positive reference words; $q_j$ is the j-th negative reference word in the negative reference word set, j = 1, 2, ..., m, and m is the number of negative reference words; count(c), count($p_i$) and count($q_j$) are the numbers of occurrences of the candidate word c, $p_i$ and $q_j$ in the corpus, respectively; count(c, $p_i$) is the number of times c and $p_i$ occur together in the corpus; count(c, $q_j$) is the number of times c and $q_j$ occur together in the corpus; and N is the total word frequency.
4. The method of predicting prices of bulk commodities based on emotion indexes as recited in claim 1, wherein the method of calculating emotion indexes of a commodity over a period of time comprises:
acquiring all news of the commodity in the time period from the bulk commodity news corpus;
dividing each news item into sentences;
segmenting each sentence into words, and calculating the emotion index of each sentence from the emotion index of each word, taking into account the influence of negation words and degree words;
summing the emotion indexes of the sentences of each news item to obtain the emotion index of that news item; and averaging the emotion indexes of the news items to obtain the emotion index of the commodity in the time period.
5. The method for predicting prices of commodities in bulk according to claim 4, wherein the method for calculating the emotion index of a sentence comprises:
S1, setting a word emotion index variable word_polar, which takes the value 1, -1 or 0; setting a negation word influence variable deny_sign, which takes the value 1 or -1; setting a degree word influence variable degree_sign, whose value range is [1, C]; initializing deny_sign = 1, degree_sign = 1, i = 1;
S2, obtaining the i-th word w_i in the sentence; if w_i is a positive word in the bulk commodity emotion dictionary, then word_polar_i = 1 * deny_sign * degree_sign, go to S4; if w_i is a negative word in the bulk commodity emotion dictionary, then word_polar_i = (-1) * deny_sign * degree_sign, go to S4; if w_i is not in the bulk commodity emotion dictionary, then word_polar_i = 0;
S3, if w_i is a negation word, then deny_sign = -1; if w_i is a degree word, then obtaining the degree value degree_sign_i of w_i and setting degree_sign = degree_sign_i;
S4, if i is smaller than M, updating i to i + 1 and going to S2, otherwise going to S5, wherein M is the number of words in the sentence;
S5, calculating the emotion index of the sentence according to the following formula:
$$Q = \sum_{i=1}^{M} \mathrm{word\_polar}_i$$
wherein Q is the emotion index of the sentence.
6. A bulk goods price prediction device based on an emotion index, comprising:
the dictionary building module is used for constructing a bulk commodity emotion dictionary based on bulk commodity news and a general emotion dictionary;
the emotion index calculation module is used for acquiring bulk commodity news and calculating the emotion index of each commodity in each time period based on the bulk commodity emotion dictionary;
and the price prediction module is used for predicting the commodity price based on the emotion index of each commodity in each time period.
7. The sentiment-index-based bulk commodity price prediction device according to claim 6, wherein the dictionary building module is specifically configured to:
acquiring a general emotion dictionary comprising a positive reference word set and a negative reference word set;
obtaining news from the bulk commodity news corpus, and preprocessing the news, including word segmentation, to obtain candidate words for constructing the bulk commodity emotion dictionary;
combining each candidate word with each other candidate word within a specified range around its position in the sentence to obtain candidate combined words;
combining each candidate word with each positive reference word and each negative reference word in the general emotion dictionary to obtain positive combined words and negative combined words;
judging whether each candidate combined word exists among the positive combined words or the negative combined words, and if so, calculating the emotional tendency coefficient K of each of the two candidate words in the candidate combined word by using the emotional tendency pointwise mutual information (SO-PMI) algorithm, wherein when K > 0 the candidate word is a positive emotion word, when K = 0 the candidate word is a neutral emotion word, and when K < 0 the candidate word is a negative emotion word;
merging the candidate words satisfying K > first threshold > 0 into the positive reference word set of the general emotion dictionary, and the candidate words satisfying K < second threshold < 0 into the negative reference word set of the general emotion dictionary;
and screening the expanded general emotion dictionary, including de-duplication, to obtain the bulk commodity emotion dictionary.
8. The apparatus for predicting price of commodity according to claim 7, wherein said emotion tendency coefficient K of candidate word c is calculated by the following formula:
$$K(c) = \frac{1}{n}\sum_{i=1}^{n}\mathrm{PMI}(c, p_i) - \frac{1}{m}\sum_{j=1}^{m}\mathrm{PMI}(c, q_j)$$
$$\mathrm{PMI}(c, p_i) = \log\frac{N \cdot \mathrm{count}(c, p_i)}{\mathrm{count}(c)\,\mathrm{count}(p_i)}$$
$$\mathrm{PMI}(c, q_j) = \log\frac{N \cdot \mathrm{count}(c, q_j)}{\mathrm{count}(c)\,\mathrm{count}(q_j)}$$
where $p_i$ is the i-th positive reference word in the positive reference word set, i = 1, 2, ..., n, and n is the number of positive reference words; $q_j$ is the j-th negative reference word in the negative reference word set, j = 1, 2, ..., m, and m is the number of negative reference words; count(c), count($p_i$) and count($q_j$) are the numbers of occurrences of the candidate word c, $p_i$ and $q_j$ in the corpus, respectively; count(c, $p_i$) is the number of times c and $p_i$ occur together in the corpus; count(c, $q_j$) is the number of times c and $q_j$ occur together in the corpus; and N is the total word frequency.
9. The apparatus for predicting price of commodity according to claim 6, wherein the method for calculating the emotional index of a commodity during a period of time comprises:
acquiring all news of the commodity in the time period from the bulk commodity news corpus;
dividing each news item into sentences;
segmenting each sentence into words, and calculating the emotion index of each sentence from the emotion index of each word, taking into account the influence of negation words and degree words;
summing the emotion indexes of the sentences of each news item to obtain the emotion index of that news item; and averaging the emotion indexes of the news items to obtain the emotion index of the commodity in the time period.
10. The sentiment index-based commodity price prediction device of claim 9, wherein the method of calculating the sentiment index of a sentence comprises:
S1, setting a word emotion index variable word_polar, which takes the value 1, -1 or 0; setting a negation word influence variable deny_sign, which takes the value 1 or -1; setting a degree word influence variable degree_sign, whose value range is [1, C]; initializing deny_sign = 1, degree_sign = 1, i = 1;
S2, obtaining the i-th word w_i in the sentence; if w_i is a positive word in the bulk commodity emotion dictionary, then word_polar_i = 1 * deny_sign * degree_sign, go to S4; if w_i is a negative word in the bulk commodity emotion dictionary, then word_polar_i = (-1) * deny_sign * degree_sign, go to S4; if w_i is not in the bulk commodity emotion dictionary, then word_polar_i = 0;
S3, if w_i is a negation word, then deny_sign = -1; if w_i is a degree word, then obtaining the degree value degree_sign_i of w_i and setting degree_sign = degree_sign_i;
S4, if i is smaller than M, updating i to i + 1 and going to S2, otherwise going to S5, wherein M is the number of words in the sentence;
S5, calculating the emotion index of the sentence according to the following formula:
$$Q = \sum_{i=1}^{M} \mathrm{word\_polar}_i$$
wherein Q is the emotion index of the sentence.
CN202210922285.9A 2022-08-02 2022-08-02 Method and device for predicting commodity price based on emotion index Active CN115271816B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210922285.9A CN115271816B (en) 2022-08-02 2022-08-02 Method and device for predicting commodity price based on emotion index

Publications (2)

Publication Number Publication Date
CN115271816A true CN115271816A (en) 2022-11-01
CN115271816B CN115271816B (en) 2023-12-22

Family

ID=83747827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210922285.9A Active CN115271816B (en) 2022-08-02 2022-08-02 Method and device for predicting commodity price based on emotion index

Country Status (1)

Country Link
CN (1) CN115271816B (en)

Also Published As

Publication number Publication date
CN115271816B (en) 2023-12-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant