CN104239383A - MicroBlog emotion visualization method - Google Patents

Publication number
CN104239383A
Authority
CN
China
Prior art keywords
topic, seed, time, keyword, word
Legal status
Pending
Application number
CN201410254028.8A
Other languages
Chinese (zh)
Inventor
任福继
刘宁
康鑫
Current Assignee
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date
2014-06-09
Filing date
2014-06-09
Publication date
2014-12-24
Application filed by Hefei University of Technology
Priority to CN201410254028.8A
Publication of CN104239383A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/904 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a microblog emotion visualization method. Based on keyword frequency data obtained by statistics and eight-dimensional emotion results obtained by emotion computing, the method produces, according to corresponding strategies, a nationwide attention trend chart for microblog hot events, a nationwide emotion distribution chart for microblog hot events, and a regional distribution chart for microblog hot events.

Description

A microblog emotion visualization method
Technical field
The present invention relates to the field of microblog emotion analysis, and specifically to a microblog emotion visualization method.
Background art
Affective computing has become one of the most active research fields, and text emotion computing is particularly popular. With the rise of microblogging as a short-text messaging format, large volumes of emotionally rich text have become easy to obtain, which facilitates research on text emotion. Because measuring emotion in text is difficult, visualizing text emotion faces many problems, and the same holds for visualizing microblog emotion.
Summary of the invention
The object of the present invention is to provide a microblog emotion visualization method that presents the emotion of microblog text in an intuitive, visual form.
To achieve this object, the technical solution adopted by the present invention is as follows:
A microblog emotion visualization method, characterized in that it comprises the following steps:
(1) Expand the keyword set of the specified topic:
Because microblog content is colloquial, the collected microblog data for the specified topic often does not refer to the topic with the standardized wording of the original seed keywords, so the original seed keywords need to be expanded with colloquial words and slang. The expansion steps are as follows:
(1.1) Segment the microblog text of the specified topic into words, count word frequencies, and determine the original seed keywords of the specified topic;
(1.2) Sort the words by frequency and take the top 20 words as candidate seed keywords for the specified topic;
(1.3) Compute, according to formula (1), the similarity between each of the 20 candidate seed keywords and the original seed keywords of the specified topic:
$$ d = \sum_{j=1}^{n} \log \frac{p(\mathrm{word\_seed}_j,\ \mathrm{word}_i)}{p(\mathrm{word\_seed}_j)\, p(\mathrm{word}_i)} \qquad (1) $$
where word_seed_j denotes the j-th of the n original seed keywords of the specified topic, word_i denotes the i-th candidate seed keyword, p(word_seed_j, word_i) denotes the probability that the original seed keyword and the candidate seed keyword occur together in the microblog text, p(word_seed_j) denotes the probability that the original seed keyword occurs in the microblog text, p(word_i) denotes the probability that the candidate seed keyword occurs in the microblog text, and d denotes the similarity between the candidate seed keyword and the original seed keywords of the specified topic;
(1.4) Based on the results of step (1.3), take the 10 candidate seed keywords with the highest similarity as the expanded seed keywords; the expanded seed keywords together with the original seed keywords of the specified topic form the topic keyword set, denoted K;
(2) Split the microblog data of the specified topic: split the data by the city each microblog belongs to, giving the regional microblog data, denoted D_city; split the data by microblog posting time, in units of days, giving the temporal microblog data, denoted D_time;
(3) Split the regional microblog data obtained in step (2) by time, in units of days, giving the regional daily microblog data, denoted D_city_time;
(4) Count the frequency of the seed keywords in the temporal microblog data D_time obtained by splitting the microblog data of the specified topic. The sum of the daily frequencies of all seed keywords is taken as the attention paid to the topic on that day. Based on these statistics, draw a line chart in which different topics are distinguished by different colors, with keyword frequency on the vertical axis and time on the horizontal axis; this gives the nationwide attention trend chart of the specified topic over the specified time period, in units of days. Count the frequency of the seed keywords in the regional daily microblog data D_city_time in the same way, with keyword frequency on the vertical axis and time and city on the horizontal axis; this gives the regional attention trend comparison chart of the specified topic, in which the comparison is presented as a clustered bar chart;
(5) Draw the nationwide emotion distribution chart and the regional emotion distribution chart of the specified topic, as follows:
(5.1) Perform emotion computing on the temporal microblog data D_time and the regional daily microblog data D_city_time of the specified topic to obtain the daily 8-dimensional microblog emotion result of the specified topic, as shown in formula (2):
$$ E = (e_{hate},\ e_{anger},\ e_{sorrow},\ e_{anxiety},\ e_{surprise},\ e_{love},\ e_{joy},\ e_{expect}) \qquad (2) $$
where the elements of the vector in formula (2) denote, in order, the emotion intensity values of the microblogs of the specified topic for the 8 emotions hate, anger, sorrow, anxiety, surprise, love, joy, and expectation;
(5.2) Represent the daily emotion intensity values of the microblogs of the specified topic with a three-dimensional stacked bar chart, using the RGB colors #EE9572, #9AC0CD, #CD8162, #5CACEE, #5D478B, #6E8B3D, #8B2500, and #3A5FCD to represent hate, anger, sorrow, anxiety, surprise, love, joy, and expectation respectively. With emotion intensity on the horizontal axis and the timeline and region on the vertical axis, draw the regional emotion distribution chart of the microblogs of the specified topic; with emotion intensity on the horizontal axis and the timeline on the vertical axis, draw the nationwide emotion distribution chart of the microblogs of the specified topic.
Based on keyword frequency data obtained by statistics and 8-dimensional emotion results obtained by emotion computing, the present invention produces, according to corresponding strategies, the nationwide attention trend chart, the nationwide emotion distribution chart, and the regional distribution chart of microblog hot events, and can thus present the emotion of microblog text in an intuitive, visual form.
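The keyword expansion of step (1) can be illustrated with a minimal Python sketch. It is not the applicants' implementation. It assumes the posts are already segmented into token lists, estimates each probability in formula (1) as the fraction of posts containing the word or word pair, excludes the seeds themselves from the candidate list, and replaces a zero co-occurrence count with a small pseudo-count so that the logarithm stays defined. The function name `expand_keywords` and the parameter `zero_count` are hypothetical.

```python
import math
from collections import Counter

def expand_keywords(posts, seeds, n_candidates=20, n_keep=10, zero_count=0.5):
    """posts: list of token lists (already segmented); seeds: original seed keywords."""
    n_posts = len(posts)
    doc_sets = [set(tokens) for tokens in posts]
    # Steps (1.1)-(1.2): word frequencies, then the most frequent non-seed words as candidates.
    tf = Counter(tok for tokens in posts for tok in tokens)
    seed_set = set(seeds)
    candidates = [w for w, _ in tf.most_common() if w not in seed_set][:n_candidates]
    # Assumption: p(word) is the fraction of posts that contain the word.
    df = Counter(tok for doc in doc_sets for tok in doc)

    def similarity(cand):
        # Step (1.3): sum over seeds of log p(seed, cand) / (p(seed) p(cand)).
        d = 0.0
        for seed in seeds:
            if df[seed] == 0:
                continue  # assumption: a seed that never occurs contributes nothing
            joint = sum(1 for doc in doc_sets if seed in doc and cand in doc)
            p_joint = max(joint, zero_count) / n_posts  # pseudo-count avoids log(0)
            d += math.log(p_joint / ((df[seed] / n_posts) * (df[cand] / n_posts)))
        return d

    # Step (1.4): keep the highest-scoring candidates and merge them with the seeds to form K.
    ranked = sorted(candidates, key=similarity, reverse=True)
    return list(seeds) + ranked[:n_keep]
```

With the defaults, the returned list corresponds to the keyword set K: the original seeds followed by up to 10 expanded keywords. The text does not say how words that never co-occur with a seed are handled, so the pseudo-count is only one possible choice.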
Description of the drawings
Fig. 1 is the nationwide attention trend chart of the specified topic over the specified time period in the present invention.
Fig. 2 is the regional attention trend comparison chart of the specified topic in the present invention.
Fig. 3 is the regional emotion distribution chart of the microblogs of the specified topic in the present invention.
Fig. 4 is the nationwide emotion distribution chart of the microblogs of the specified topic in the present invention.
Embodiment
A microblog emotion visualization method comprises the following steps:
(1) Expand the keyword set of the specified topic:
Because microblog content is colloquial, the collected microblog data for the specified topic often does not refer to the topic with the standardized wording of the original seed keywords, so the original seed keywords need to be expanded with colloquial words and slang. The expansion steps are as follows:
(1.1) Segment the microblog text of the specified topic into words, count word frequencies, and determine the original seed keywords of the specified topic;
(1.2) Sort the words by frequency and take the top 20 words as candidate seed keywords for the specified topic;
(1.3) Compute, according to formula (1), the similarity between each of the 20 candidate seed keywords and the original seed keywords of the specified topic:
$$ d = \sum_{j=1}^{n} \log \frac{p(\mathrm{word\_seed}_j,\ \mathrm{word}_i)}{p(\mathrm{word\_seed}_j)\, p(\mathrm{word}_i)} \qquad (1) $$
where word_seed_j denotes the j-th of the n original seed keywords of the specified topic, word_i denotes the i-th candidate seed keyword, p(word_seed_j, word_i) denotes the probability that the original seed keyword and the candidate seed keyword occur together in the microblog text, p(word_seed_j) denotes the probability that the original seed keyword occurs in the microblog text, p(word_i) denotes the probability that the candidate seed keyword occurs in the microblog text, and d denotes the similarity between the candidate seed keyword and the original seed keywords of the specified topic;
(1.4) Based on the results of step (1.3), take the 10 candidate seed keywords with the highest similarity as the expanded seed keywords; the expanded seed keywords together with the original seed keywords of the specified topic form the topic keyword set, denoted K;
(2) Split the microblog data of the specified topic: split the data by the city each microblog belongs to, giving the regional microblog data, denoted D_city; split the data by microblog posting time, in units of days, giving the temporal microblog data, denoted D_time;
(3) Split the regional microblog data obtained in step (2) by time, in units of days, giving the regional daily microblog data, denoted D_city_time;
(4) Count the frequency of the seed keywords in the temporal microblog data D_time obtained by splitting the microblog data of the specified topic. The sum of the daily frequencies of all seed keywords is taken as the attention paid to the topic on that day. Based on these statistics, draw a line chart in which different topics are distinguished by different colors, with keyword frequency on the vertical axis and time on the horizontal axis; this gives the nationwide attention trend chart of the specified topic over the specified time period, in units of days, as shown in Fig. 1. Count the frequency of the seed keywords in the regional daily microblog data D_city_time in the same way, with keyword frequency on the vertical axis and time and city on the horizontal axis; this gives the regional attention trend comparison chart of the specified topic, as shown in Fig. 2, in which the comparison is presented as a clustered bar chart;
(5) Draw the nationwide emotion distribution chart and the regional emotion distribution chart of the specified topic, as follows:
(5.1) Perform emotion computing on the temporal microblog data D_time and the regional daily microblog data D_city_time of the specified topic to obtain the daily 8-dimensional microblog emotion result of the specified topic, as shown in formula (2):
$$ E = (e_{hate},\ e_{anger},\ e_{sorrow},\ e_{anxiety},\ e_{surprise},\ e_{love},\ e_{joy},\ e_{expect}) \qquad (2) $$
where the elements of the vector in formula (2) denote, in order, the emotion intensity values of the microblogs of the specified topic for the 8 emotions hate, anger, sorrow, anxiety, surprise, love, joy, and expectation;
(5.2) Represent the daily emotion intensity values of the microblogs of the specified topic with a three-dimensional stacked bar chart, using the RGB colors #EE9572, #9AC0CD, #CD8162, #5CACEE, #5D478B, #6E8B3D, #8B2500, and #3A5FCD to represent hate, anger, sorrow, anxiety, surprise, love, joy, and expectation respectively. With emotion intensity on the horizontal axis and the timeline and region on the vertical axis, draw the regional emotion distribution chart of the microblogs of the specified topic, as shown in Fig. 3; with emotion intensity on the horizontal axis and the timeline on the vertical axis, draw the nationwide emotion distribution chart of the microblogs of the specified topic, as shown in Fig. 4.
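Steps (2) to (5) amount to grouping posts by day and by city, summing keyword occurrences, and laying out the eight emotion intensities as stacked bar segments. The following Python sketch illustrates that bookkeeping only. It does not perform emotion computing or drawing, and it is not the applicants' implementation. The post fields `city`, `date`, and `tokens` and the function names are hypothetical, and counting every occurrence of a keyword in a post is an assumption about what "frequency" means here. The emotion order and colors follow formula (2) and step (5.2).

```python
from collections import defaultdict

# Emotion order from formula (2) and colors from step (5.2).
EMOTION_COLORS = {
    "hate": "#EE9572", "anger": "#9AC0CD", "sorrow": "#CD8162", "anxiety": "#5CACEE",
    "surprise": "#5D478B", "love": "#6E8B3D", "joy": "#8B2500", "expect": "#3A5FCD",
}

def daily_attention(posts, keywords):
    """posts: iterable of dicts with hypothetical keys 'city', 'date', 'tokens'.

    Returns (national, regional): keyword counts per day (from D_time) and
    per (city, day) pair (from D_city_time).
    """
    kw = set(keywords)
    national = defaultdict(int)
    regional = defaultdict(int)
    for post in posts:
        # Assumption: every occurrence of a keyword in the post counts once.
        hits = sum(1 for tok in post["tokens"] if tok in kw)
        national[post["date"]] += hits
        regional[(post["city"], post["date"])] += hits
    return dict(national), dict(regional)

def stacked_segments(emotion_vector):
    """Turn one day's 8-dimensional vector E into (emotion, start, end, color) segments."""
    segments, start = [], 0.0
    for (name, color), value in zip(EMOTION_COLORS.items(), emotion_vector):
        segments.append((name, start, start + value, color))
        start += value
    return segments
```

The `national` counts would feed the line chart of Fig. 1 and the `regional` counts the clustered bar chart of Fig. 2. Each list returned by `stacked_segments` describes one horizontal stacked bar, with emotion intensity along the horizontal axis as in Fig. 3 and Fig. 4.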

Claims (1)

1. A microblog emotion visualization method, characterized in that it comprises the following steps:
(1) Expand the keyword set of the specified topic:
Because microblog content is colloquial, the collected microblog data for the specified topic often does not refer to the topic with the standardized wording of the original seed keywords, so the original seed keywords need to be expanded with colloquial words and slang. The expansion steps are as follows:
(1.1) Segment the microblog text of the specified topic into words, count word frequencies, and determine the original seed keywords of the specified topic;
(1.2) Sort the words by frequency and take the top 20 words as candidate seed keywords for the specified topic;
(1.3) Compute, according to formula (1), the similarity between each of the 20 candidate seed keywords and the original seed keywords of the specified topic:
$$ d = \sum_{j=1}^{n} \log \frac{p(\mathrm{word\_seed}_j,\ \mathrm{word}_i)}{p(\mathrm{word\_seed}_j)\, p(\mathrm{word}_i)} \qquad (1) $$
where word_seed_j denotes the j-th of the n original seed keywords of the specified topic, word_i denotes the i-th candidate seed keyword, p(word_seed_j, word_i) denotes the probability that the original seed keyword and the candidate seed keyword occur together in the microblog text, p(word_seed_j) denotes the probability that the original seed keyword occurs in the microblog text, p(word_i) denotes the probability that the candidate seed keyword occurs in the microblog text, and d denotes the similarity between the candidate seed keyword and the original seed keywords of the specified topic;
(1.4) Based on the results of step (1.3), take the 10 candidate seed keywords with the highest similarity as the expanded seed keywords; the expanded seed keywords together with the original seed keywords of the specified topic form the topic keyword set, denoted K;
(2) Split the microblog data of the specified topic: split the data by the city each microblog belongs to, giving the regional microblog data, denoted D_city; split the data by microblog posting time, in units of days, giving the temporal microblog data, denoted D_time;
(3) Split the regional microblog data obtained in step (2) by time, in units of days, giving the regional daily microblog data, denoted D_city_time;
(4) Count the frequency of the seed keywords in the temporal microblog data D_time obtained by splitting the microblog data of the specified topic. The sum of the daily frequencies of all seed keywords is taken as the attention paid to the topic on that day. Based on these statistics, draw a line chart in which different topics are distinguished by different colors, with keyword frequency on the vertical axis and time on the horizontal axis; this gives the nationwide attention trend chart of the specified topic over the specified time period, in units of days. Count the frequency of the seed keywords in the regional daily microblog data D_city_time in the same way, with keyword frequency on the vertical axis and time and city on the horizontal axis; this gives the regional attention trend comparison chart of the specified topic, in which the comparison is presented as a clustered bar chart;
(5) Draw the nationwide emotion distribution chart and the regional emotion distribution chart of the specified topic, as follows:
(5.1) Perform emotion computing on the temporal microblog data D_time and the regional daily microblog data D_city_time of the specified topic to obtain the daily 8-dimensional microblog emotion result of the specified topic, as shown in formula (2):
$$ E = (e_{hate},\ e_{anger},\ e_{sorrow},\ e_{anxiety},\ e_{surprise},\ e_{love},\ e_{joy},\ e_{expect}) \qquad (2) $$
where the elements of the vector in formula (2) denote, in order, the emotion intensity values of the microblogs of the specified topic for the 8 emotions hate, anger, sorrow, anxiety, surprise, love, joy, and expectation;
(5.2) Represent the daily emotion intensity values of the microblogs of the specified topic with a three-dimensional stacked bar chart, using the RGB colors #EE9572, #9AC0CD, #CD8162, #5CACEE, #5D478B, #6E8B3D, #8B2500, and #3A5FCD to represent hate, anger, sorrow, anxiety, surprise, love, joy, and expectation respectively. With emotion intensity on the horizontal axis and the timeline and region on the vertical axis, draw the regional emotion distribution chart of the microblogs of the specified topic; with emotion intensity on the horizontal axis and the timeline on the vertical axis, draw the nationwide emotion distribution chart of the microblogs of the specified topic.
CN201410254028.8A 2014-06-09 2014-06-09 MicroBlog emotion visualization method Pending CN104239383A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410254028.8A CN104239383A (en) 2014-06-09 2014-06-09 MicroBlog emotion visualization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410254028.8A CN104239383A (en) 2014-06-09 2014-06-09 MicroBlog emotion visualization method

Publications (1)

Publication Number Publication Date
CN104239383A (en) 2014-12-24

Family

ID=52227457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410254028.8A Pending CN104239383A (en) 2014-06-09 2014-06-09 MicroBlog emotion visualization method

Country Status (1)

Country Link
CN (1) CN104239383A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677920A (en) * 2016-03-04 2016-06-15 百度在线网络技术(北京)有限公司 Feedback method and device for self-media quality index based on artificial intelligence
CN105989176A (en) * 2015-03-05 2016-10-05 北大方正集团有限公司 Data processing method and device
CN107704621A (en) * 2017-10-27 2018-02-16 西南财经大学 A kind of internet public feelings map visualization methods of exhibiting
CN107797983A (en) * 2017-04-07 2018-03-13 平安科技(深圳)有限公司 Microblog data processing method, device, computer equipment and storage medium
CN109783815A (en) * 2018-12-28 2019-05-21 华南理工大学 A kind of various dimensions network public-opinion big data comparative analysis method
CN109783800A (en) * 2018-12-13 2019-05-21 北京百度网讯科技有限公司 Acquisition methods, device, equipment and the storage medium of emotion keyword
CN111832573A (en) * 2020-06-12 2020-10-27 桂林电子科技大学 Image emotion classification method based on class activation mapping and visual saliency

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101408883A (en) * 2008-11-24 2009-04-15 电子科技大学 Method for collecting network public feelings viewpoint
CN103500175A (en) * 2013-08-13 2014-01-08 中国人民解放军国防科学技术大学 Method for microblog hot event online detection based on emotion analysis
CN103559233A (en) * 2012-10-29 2014-02-05 中国人民解放军国防科学技术大学 Extraction method for network new words in microblogs and microblog emotion analysis method and system
US20140095148A1 (en) * 2012-10-03 2014-04-03 Kanjoya, Inc. Emotion identification system and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NING LIU et al.: "Microblogging Hot Events Emotion Analysis Based on Ren-CECps", Proceedings of the 2013 IEEE/SICE International Symposium on System Integration *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105989176A (en) * 2015-03-05 2016-10-05 北大方正集团有限公司 Data processing method and device
CN105677920A (en) * 2016-03-04 2016-06-15 百度在线网络技术(北京)有限公司 Feedback method and device for self-media quality index based on artificial intelligence
CN107797983A (en) * 2017-04-07 2018-03-13 平安科技(深圳)有限公司 Microblog data processing method, device, computer equipment and storage medium
CN107704621A (en) * 2017-10-27 2018-02-16 西南财经大学 A kind of internet public feelings map visualization methods of exhibiting
CN109783800A (en) * 2018-12-13 2019-05-21 北京百度网讯科技有限公司 Acquisition methods, device, equipment and the storage medium of emotion keyword
CN109783800B (en) * 2018-12-13 2024-04-12 北京百度网讯科技有限公司 Emotion keyword acquisition method, device, equipment and storage medium
CN109783815A (en) * 2018-12-28 2019-05-21 华南理工大学 A kind of various dimensions network public-opinion big data comparative analysis method
CN111832573A (en) * 2020-06-12 2020-10-27 桂林电子科技大学 Image emotion classification method based on class activation mapping and visual saliency

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20141224
