CN104636425B - Method for predicting and visualizing the emotion cognitive ability of network individuals or groups - Google Patents

Method for predicting and visualizing the emotion cognitive ability of network individuals or groups

Info

Publication number
CN104636425B
Authority
CN
China
Prior art keywords
emotion
network
emotional
words
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410795679.8A
Other languages
Chinese (zh)
Other versions
CN104636425A (en)
Inventor
周建栋
赵燕平
张华平
李想
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Chemical Technology
Beijing Institute of Technology BIT
Original Assignee
Beijing University of Chemical Technology
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Chemical Technology, Beijing Institute of Technology BIT filed Critical Beijing University of Chemical Technology
Priority to CN201410795679.8A priority Critical patent/CN104636425B/en
Publication of CN104636425A publication Critical patent/CN104636425A/en
Application granted granted Critical
Publication of CN104636425B publication Critical patent/CN104636425B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Computing Systems (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method for predicting and visualizing the emotion cognitive ability of network individuals or groups, and belongs to the field of Internet public opinion information mining and analysis. The invention integrates the commonly used emotion words recorded in existing emotion dictionaries and also takes into account the new network emotion words and emoticons that carry emotional tendency in the network environment, so as to cover the emotion elements on social media platforms as fully as possible, and constructs an emotion word ontology library on this basis. It determines the position of the emotion bifurcation point of a network individual, describes the individual's emotion cognitive ability level with an emotion cognitive ability index, and displays the differences in emotion bifurcation points among multiple network individuals visually. The invention can reveal how the emotion cognitive ability level of network individuals or groups evolves, and in particular can predict the dynamic emotional change process of typical network individuals or groups and the critical points at which their emotions change abruptly, helping the relevant managers to guide network public opinion correctly and build a harmonious network environment.

Description

Network individual or group emotion cognition ability prediction and visualization method
Technical Field
The invention relates to a method for predicting and visualizing emotion cognitive ability of network individuals or groups, belonging to the field of Internet public opinion information mining and analysis.
Background
With the rapid development of social networking technologies and applications, people are increasingly accustomed to sharing their emotions, attitudes, opinions and viewpoints over the Internet. Of these, emotion is the decisive force that drives the other three, because emotions are people's intrinsic psychological reactions and feelings, such as joy, anger, sorrow and happiness, and they significantly affect people's decision-making behavior. Meanwhile, network individuals with a certain ability to guide opinion and to respond to topics influence the views and attitudes of their followers through the network platform. For example, in a movie marketing campaign, well-known artists often exert a great influence on the emotional views, and even the expressed opinions, of their fans through network platforms (including microblogs, blogs, etc.). Similarly, typical network individuals such as "big V" microbloggers and celebrity bloggers often have tens of millions of followers, have a stronger voice in network events, influence the direction of public emotion and decision behavior to a greater degree, and play the role of "opinion leaders". It is therefore of great significance to develop a tool for predicting and visualizing the dynamic emotional change process and the emotion bifurcation points of network individuals or groups. From the commercial perspective, an effective marketing and promotion scheme can be formulated by monitoring the regularities of the dynamic emotional change process of network individuals, and, after a product goes on sale, the product can be improved in time and its reputation maintained according to the dynamic emotional changes of network groups.
From the perspective of social governance, analyzing the dynamic emotional change process of network individuals or groups, particularly typical ones, and predicting the bifurcation points at which their emotions change abruptly can help managers manage network users effectively and guide network public opinion reasonably, so as to create a harmonious network environment.
Li Y et al. proposed complexity theory and a method for modeling the dynamic change of affect structure in organizational environments (Li Y, Ashkanasy N M, Ahlstrom D. Complexity theory and affect structure: A dynamic approach to modeling emotional changes in organizations [J]. Research on Emotion in Organizations, 2010, 6: 459-467.), which, as shown in FIG. 2, describes the bifurcation and abrupt-change characteristics of the states of general natural systems, i.e., the phenomenon that the state of a complex system jumps under realistic environmental conditions. Weiss H M et al. proposed affective events theory (Weiss H M, Cropanzano R. Affective events theory: A theoretical discussion of the structure, causes and consequences of affective experiences at work [J]. Research in Organizational Behavior: An Annual Series of Analytical Essays and Critical Reviews, 1996, 18.), which concerns emotional events, emotional responses, and attitudes and behavior. The analyses of Li Y and Weiss H M of the dynamic change model of emotional structure are only qualitative and are not supported by sufficient experimental or survey data. They do not analyze people's emotional change process and emotion cognitive ability through natural language from the perspective of network text; they only analyze, qualitatively and from the perspective of emotion cognition, the dynamic change mechanism of a person's microscopic emotional state and the causal relationship between external emotional events and emotional states, by combining affective events theory with the bifurcation point theory of chaos theory in psychology. Neither these nor later related studies propose how to build sentiment analysis and predictive visualization models for network individuals or groups on Internet social media, such as the method proposed herein for computing emotional structure bifurcation points and emotion cognitive ability indices from network text.
The invention differs in that it introduces affective events theory and the emotional structure bifurcation model into the fields of public opinion analysis and network data mining and applies text sentiment analysis technology, thereby providing a method for predicting and visualizing the emotion cognitive ability of network individuals or groups, and the bifurcation points that precede emotional disorder, in the social network media environment.
Disclosure of Invention
The invention aims to provide an effective and intuitive method for predicting and visualizing the emotion bifurcation points and emotion cognitive ability index levels of network individuals. It helps users understand and monitor the bifurcation points at which the emotions of network individuals or groups change abruptly, so as to predict their emotion evolution states (vanishing state, equilibrium state, approximately equilibrium state and disordered state) and development trends, and it can be used for dynamic analysis and early warning of network mass incidents and in many other fields related to the evolution of network emotion.
By collecting and analyzing the network texts published by network individuals or organizations in a social network environment, the invention provides a method for calculating the emotion bifurcation point of a network individual, establishes a network emotional structure bifurcation point model, and describes, predicts and visualizes the positions of the abrupt-change emotion bifurcation points and the emotion cognitive ability levels of network individuals.
The purpose of the invention is realized by the following technical scheme:
A method for predicting and visualizing the emotion cognitive ability of network individuals or groups comprises the following steps:
Step 1) constructing an emotion word ontology library
In order to calculate the position of a network individual's emotion bifurcation point, a relatively comprehensive emotion word ontology library needs to be constructed. The specific steps are:
1-1) integrating the existing Chinese emotion dictionaries so as to cover the commonly used emotion words more comprehensively;
1-2) mining, from a large-scale corpus, the new network words frequently used by netizens, and removing those without obvious emotional color;
1-3) mining, from a large-scale corpus, the emoticons frequently used by netizens;
1-4) combining the common emotion words, the new network emotion words and the emoticons into the emotion element set used for network text sentiment analysis.
An emotion word ontology library E is constructed from the emotion element set. Each entry of E comprises an emotion word, its polarity and its emotion intensity value, and E can be expressed as:
$E = \langle (W_1, P_1, I_1), (W_2, P_2, I_2), \ldots, (W_i, P_i, I_i), \ldots, (W_n, P_n, I_n) \rangle$
where $W_i$ represents an emotion word; $P_i$ represents the polarity of $W_i$ ($P_i > 0$ indicates a positive emotion word and $P_i < 0$ a negative emotion word); $I_i$ represents the emotion intensity value of $W_i$ (the greater its absolute value, the stronger the emotion); $1 \le i \le n$; and $n$ is the number of emotion words in E.
a. Polarity integration method. The polarity $P_i$ of a commonly used emotion word follows its label in the existing emotion dictionaries; if the same emotion word is labeled inconsistently in different dictionaries, the label is corrected by multi-user voting. Because the number of new network emotion words and emoticons is limited, their polarities are determined by multi-user voting.
b. Emotion intensity determination method. First, a large-scale social network text set U is obtained and the distribution in U of each character of the emotion words is calculated. The emotion weight of each candidate emotion word is then calculated from these distributions, and candidates whose weight exceeds a threshold are taken as emotion words. Finally, the emotion intensity values of the common emotion words are calculated, as described below. Because the number of new network emotion words and emoticons is limited, their emotion intensities are determined by multi-user voting with reference to the intensities of the common emotion words.
Below, U denotes the social network text set, $S_{+}$ and $S_{-}$ denote the sets of positive and negative emotion words in U respectively, and $S^{*}$ denotes either emotion word set. Suppose an emotion word w in $S^{*}$ can be written as the character string $C_1 C_2 \ldots C_i \ldots C_k$, where $C_i$ is one character of the emotion word. The characters of emotion words are divided into positive emotion characters and negative emotion characters, and the polarity of each emotion character is the same as that of the emotion word it belongs to.
Calculate the distribution of characters in the text set. $P(C_i \mid S^{*})$ denotes the probability, in the network text set U, of character $C_i$ from the emotion word set $S^{*}$:
$$P(C_i \mid S^{*}) = \frac{P(S^{*}, C_i)}{P(S^{*})} \qquad (1)$$
where $P(S^{*}, C_i)$ is the probability that the constituent character $C_i$ of words in the emotion word set $S^{*}$ occurs in U; $P(S^{*})$ is the probability that any constituent character of the words in $S^{*}$ occurs in the text set U; $freq(S^{*}, C_i)$ is the frequency with which the constituent character $C_i$ of words in $S^{*}$ occurs in U; and $freq(S^{*})$ is the sum of the frequencies with which all constituent characters of the words in $S^{*}$ occur in U. The probabilities are estimated from these frequencies, with $\delta$ a small value, taken here as the reciprocal of the total number of words in the emotion word set $S^{*}$.
Calculate the distribution $P(w^{*})$ of each candidate emotion word (candidate word for short) $w^{*}$ in the text set U by equation (2), i.e., the probability that the candidate word $w^{*}$ appears in U. Here $|U|$ is the number of words in the text corpus; $w_i$ is any word in the text set U; $freq(w^{*})$ is the frequency with which the candidate word $w^{*}$ occurs in U; $\sum_{i} freq(w_i)$ is the sum of the frequencies with which all $w_i$ occur in U; and $\delta$ has the same meaning as above.
Calculate the emotion weight of the candidate word $w^{*}$. Each candidate word comes from U, and whether it is an emotion word is not known in advance; its emotion weight $r(w^{*} \mid S^{*})$ is calculated in order to judge whether it is an emotion word and to obtain its polarity, its intensity and the emotion word set to which it best belongs. In the emotion weight formula, $\alpha, \beta \in [0, 1]$ are adjustable combination parameters, $C_i$ is the i-th character of $w^{*}$, $w^{*}$ has k characters in total, and $P(C_i \mid S^{*})$ and $P(w^{*})$ are calculated by equations (1) and (2), which also reflect the emotional tendency of the word.
Calculate the emotion membership degree and emotion intensity of the candidate words. As can be seen from the above expression, each candidate word has a positive emotion weight $r(w^{*} \mid S_{+})$ and a negative emotion weight $r(w^{*} \mid S_{-})$, so the emotion membership degree I can be expressed as a combination of the two emotional tendencies:
$I(w^{*}) = r(w^{*} \mid S_{-}) - r(w^{*} \mid S_{+})$
where I is the emotion membership degree of the candidate word, and its magnitude expresses the degree of emotion membership, i.e., the emotion intensity value of the candidate word. Depending on the sign of I, P is marked as +1 or -1. The absolute value of I determines whether $w^{*}$ is an emotion word, and its sign determines whether $w^{*}$ belongs to $S_{+}$ or $S_{-}$.
Through the above calculation, the emotion membership degrees of all candidate words are obtained, and the candidate words are then ranked by these values; candidate words with a larger emotion membership degree have a stronger emotional tendency and a higher emotion intensity. Candidate words whose emotion membership degree lies within a certain range (determined by a threshold ε) are therefore selected as new emotion words; the absolute value of the membership degree is used as the emotion intensity value and its sign as the positive or negative emotion mark, and both are entered into the ontology library.
c. Emoticon intensity calculation. The META value of an emoticon can be extracted and its emotion intensity determined in a similar way.
The construction of the emotion ontology library can be completed by the series of steps. The ontology base is constructed by learning according to a large-scale network text set, so that the method has the rationality of big data statistics.
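The final selection step described above (rank candidates by emotion membership degree, keep those beyond the threshold ε, and store the sign as polarity and the absolute value as intensity) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `select_emotion_words`, the dictionary input and the reading of "within a certain range" as `|I| > epsilon` are assumptions, and the membership values are taken as already computed.

```python
from typing import Dict, List, Tuple

# One ontology entry (W_i, P_i, I_i): word, polarity (+1 / -1), intensity.
OntologyEntry = Tuple[str, int, float]


def select_emotion_words(membership: Dict[str, float],
                         epsilon: float) -> List[OntologyEntry]:
    """Rank candidate words by |I| and keep those above the threshold.

    `membership` maps each candidate word to its emotion membership
    degree I(w*). Assumption: a candidate is accepted when |I| > epsilon.
    """
    # Rank by absolute membership degree, strongest emotion first.
    ranked = sorted(membership.items(), key=lambda kv: abs(kv[1]), reverse=True)
    entries: List[OntologyEntry] = []
    for word, value in ranked:
        if abs(value) <= epsilon:
            continue  # too weak to count as an emotion word
        polarity = 1 if value > 0 else -1  # sign of I becomes the polarity mark
        entries.append((word, polarity, abs(value)))  # |I| becomes the intensity
    return entries
```

For example, with ε = 0.1 a candidate whose membership degree is 0.01 is discarded, while candidates at 0.8 and -0.6 are stored with polarities +1 and -1.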
Step 2) determining the position of the emotion bifurcation point of a network individual and calculating the individual's emotion cognitive ability index value
The set of text information published by a network individual is collected in time order, and the position of the individual's emotion bifurcation point and its change over time are calculated as follows:
Step 2-1) the collected set of text information published by the network individual, in time order, is likewise denoted U and is expressed in terms of T and S,
where T is the time sequence and S is the set of microblog information vectors corresponding to T; the microblog information published at time t is denoted $S_t$.
Step 2-2) preprocessing the text information set. Chinese word segmentation, part-of-speech tagging and so on are performed on the information set U in the order of the time sequence T using the ICTS word segmentation tool, giving the word set $W_t$ of the microblog information $S_t$ published at time t:
$W_t = \langle w_1, w_2, \ldots, w_j, \ldots, w_J \rangle$
where $w_j$ is a word in $W_t$ (whether it is an emotion word has to be determined further) and J is the number of words.
Step 2-3) extracting the emotion words according to the emotion word ontology library E. The specific process is as follows: the word set $W_t$ of the microblog information $S_t$ at time t is matched against the emotion word ontology library E constructed in step 1), and all emotion elements in $W_t$ that match the library are extracted together with their emotion intensity values and polarities.
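As an illustration of this matching step, the sketch below looks each segmented word up in the ontology library and returns the matched emotion elements with their polarity and intensity. The function name `extract_emotion_elements` and the dictionary layout of the ontology (word mapped to a `(polarity, intensity)` pair) are assumptions made for the example; counting a repeated word once per occurrence is also an assumption.

```python
from typing import Dict, List, Tuple


def extract_emotion_elements(
    words: List[str],
    ontology: Dict[str, Tuple[int, float]],
) -> List[Tuple[str, int, float]]:
    """Match the word set W_t of one post against the ontology library E.

    `ontology` maps an emotion word to (polarity, intensity). Words that
    are not in the library are ignored; repeated words are kept once per
    occurrence (an assumption, since the text does not say).
    """
    matched = []
    for word in words:
        if word in ontology:
            polarity, intensity = ontology[word]
            matched.append((word, polarity, intensity))
    return matched
```

The length of the returned list plays the role of $Num_t$ in the following step.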
Step 2-4) constructing a network individual emotion bifurcation point position calculation model and calculating a network emotion cognition ability index value of a network individual changing according to time sequence, wherein the method comprises the following steps:
Step 2-4-1) calculating the proportions of emotion elements in the four emotional states: disordered state, approximately equilibrium state, equilibrium state and vanishing state.
Let $Num_t$ denote the number of emotion elements extracted from $W_t$, and let $dead_t$, $low_t$, $med_t$ and $high_t$ denote the numbers of those emotion elements whose intensity values fall in $(0, a)$, $[a, b)$, $[b, c)$ and $[c, d)$ respectively, where $d = \max(|I(w)|)$ is the maximum absolute emotion intensity in the emotion ontology library, and the demarcation points a, b, c, d correspond to the parameter values of the emotion bifurcation points: $a = 0.25d$, $b = 0.75d$, $c = 0.8925d$. The four intervals are consistent with the emotional states of the emotional structure bifurcation point model (see FIG. 1) and correspond respectively to the "vanishing state (0, 1]", "equilibrium state (1, 3]", "approximately equilibrium state (3, 3.57]" and "disordered state (3.57, 4)". The proportions can then be calculated separately:
$$P(high)_t = \frac{high_t}{Num_t}, \quad P(med)_t = \frac{med_t}{Num_t}, \quad P(low)_t = \frac{low_t}{Num_t}, \quad P(dead)_t = \frac{dead_t}{Num_t}$$
P(high), P(med), P(low) and P(dead) respectively denote the proportions of emotion elements in the microblog information $S_t$ at time t that belong to the disordered state, the approximately equilibrium state, the equilibrium state and the vanishing state, corresponding to high, medium, low and vanishing emotion intensity values.
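The binning described in this step can be sketched as follows, assuming the proportions are the bin counts divided by $Num_t$. The function name `state_proportions` is hypothetical. One further assumption is labeled in the code: the top bin is treated as closed at d, so that an element with the maximum intensity is counted as "high" rather than falling outside every interval.

```python
from typing import Dict, List


def state_proportions(intensities: List[float], d: float) -> Dict[str, float]:
    """Share of emotion elements in each of the four intensity bins.

    `intensities` holds the intensity values of the elements extracted
    from one post; `d` is the maximum absolute intensity in the ontology.
    """
    a, b, c = 0.25 * d, 0.75 * d, 0.8925 * d  # demarcation points from the text
    counts = {"dead": 0, "low": 0, "med": 0, "high": 0}
    for value in intensities:
        x = abs(value)
        if x <= 0:
            continue  # zero intensity lies outside (0, a); not counted
        if x < a:
            counts["dead"] += 1
        elif x < b:
            counts["low"] += 1
        elif x < c:
            counts["med"] += 1
        else:
            counts["high"] += 1  # assumption: includes x == d
    num = sum(counts.values())  # Num_t
    if num == 0:
        return {name: 0.0 for name in counts}
    return {name: count / num for name, count in counts.items()}
```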
Step 2-4-2) defining and calculating the emotion cognitive ability index $R_t$ of the network individual according to the position of the bifurcation point of the emotional structure. The higher the emotion cognitive ability of the network individual, the more easily its emotional state falls into the "disordered state"; the lower the emotion cognitive ability, the more easily its emotional state stays in the "equilibrium state"; and with moderate emotion cognitive ability the emotional state tends to be in the "approximately equilibrium state". The aim is therefore to find
$\max\{P(high)_t, P(med)_t, P(low)_t, P(dead)_t\}$
which identifies the most prominent emotional state of the microblog information $S_t$ at time t. For example, if $P(high)_t$ is the largest, the network individual is in the "disordered" emotional state at time t, and the corresponding emotion cognitive ability index lies within (3.57, 4].
The emotion cognitive ability index of the network individual is defined as $R_t$ and is referred to as the emotion cognitive ability. Its calculation depends on the emotion intensity exhibited at the corresponding emotion cognitive ability level, as follows.
If $P(high)_t$ is the maximum, then $R_t = 3.57 + 0.43 \cdot P(high)_t$
If $P(med)_t$ is the maximum, then $R_t = 3 + 0.57 \cdot P(med)_t$
If $P(low)_t$ is the maximum, then $R_t = 1 + 2 \cdot P(low)_t$
If $P(dead)_t$ is the maximum, then $R_t = P(dead)_t$
In this way, over the time sequence $T = \langle 1, 2, 3, \ldots, t, \ldots, T \rangle$, the network individual has the sequence of emotion cognitive ability index values (defined by the emotion bifurcation point values) $R = \langle R_1, R_2, \ldots, R_t, \ldots, R_T \rangle$. For each microblog $S_t$ the corresponding sequence element $R_t$ can be calculated, and the sequence represents the emotion cognitive ability level of the network individual with respect to different emotional events.
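The piecewise definition of $R_t$ can be written directly as a small function. This is a sketch under one stated assumption: when two proportions tie for the maximum, the function prefers the higher-intensity state (high, then med, then low, then dead), which the text does not specify. The function name `cognition_index` is hypothetical.

```python
def cognition_index(p_high: float, p_med: float,
                    p_low: float, p_dead: float) -> float:
    """Emotion cognitive ability index R_t from the four proportions."""
    # Insertion order decides ties: max() returns the first maximal key.
    proportions = {"high": p_high, "med": p_med, "low": p_low, "dead": p_dead}
    dominant = max(proportions, key=proportions.get)
    if dominant == "high":
        return 3.57 + 0.43 * p_high   # disordered state, (3.57, 4]
    if dominant == "med":
        return 3.0 + 0.57 * p_med     # approximately equilibrium, (3, 3.57]
    if dominant == "low":
        return 1.0 + 2.0 * p_low      # equilibrium, (1, 3]
    return p_dead                     # vanishing state, (0, 1]


def index_sequence(proportion_rows):
    """Map a time-ordered list of (p_high, p_med, p_low, p_dead) to R."""
    return [cognition_index(*row) for row in proportion_rows]
```

Each branch scales the dominant proportion into the width of its interval (0.43, 0.57, 2 and 1 respectively), so $R_t$ always lands in the interval of the dominant state.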
Step 3) constructing a visual layout of the positions of the emotion bifurcation points of the network individual.
With time as the horizontal axis and the emotion bifurcation point position as the vertical axis, the emotion bifurcation point positions and the emotion cognitive ability index sequence of the network individual are displayed visually, which further comprises the following steps:
Step 3-1) taking the sequence of emotion cognitive ability index values $R = \langle R_1, R_2, \ldots, R_t, \ldots, R_T \rangle$ obtained in step 2-4) for the network individual over the time sequence $T = \langle 1, 2, 3, \ldots, t, \ldots, T \rangle$, and constructing a geometric figure in a two-dimensional rectangular coordinate system with the time sequence T as the horizontal axis and R as the vertical axis. In the emotional structure bifurcation dynamic model, the range of the emotion cognitive ability index is divided into (0, 1], (1, 3], (3, 3.57] and (3.57, 4), which correspond respectively to the "vanishing state", "equilibrium state", "approximately equilibrium state" and "disordered state" of the individual's emotion. The vertical axis is accordingly divided into four regions.
Step 3-2) visually displaying the emotion cognitive ability index of the network individual. When the geometric layout is drawn, points are marked in the coordinate system, and the attributes of each point can be expressed as $\langle t, R_t, F \rangle$, where t denotes the t-th instant of the time sequence T, $R_t$ is the emotion cognitive ability index at time t, and F is the label symbol selected when the point is drawn.
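The point attributes $\langle t, R_t, F \rangle$ and the four regions of the vertical axis can be prepared as plain data before any drawing library is involved. The sketch below is one way to do this; the marker symbols in `MARKERS`, the state names and both function names are illustrative assumptions, since the text leaves the choice of label symbol F open.

```python
from typing import List, Tuple

# Hypothetical label symbols, one per emotional state.
MARKERS = {
    "vanishing": "v",
    "equilibrium": "o",
    "approximately equilibrium": "s",
    "disordered": "*",
}


def state_of(r: float) -> str:
    """Region of the vertical axis that an index value R_t falls into."""
    if r <= 1:
        return "vanishing"                  # (0, 1]
    if r <= 3:
        return "equilibrium"                # (1, 3]
    if r <= 3.57:
        return "approximately equilibrium"  # (3, 3.57]
    return "disordered"                     # above 3.57


def plot_points(r_sequence: List[float]) -> List[Tuple[int, float, str]]:
    """Build the <t, R_t, F> attributes, with t counted from 1."""
    return [(t, r, MARKERS[state_of(r)])
            for t, r in enumerate(r_sequence, start=1)]
```

The resulting tuples can be handed to any plotting tool, with horizontal lines at 1, 3 and 3.57 marking the region boundaries.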
Step 4) comparing and analyzing a plurality of network individuals. The changes in the emotion cognitive ability index of several individuals are compared to reveal the differences in their emotion cognitive ability levels, and the differences are displayed visually, which further comprises the following steps:
Step 4-1) calculating the emotion cognitive ability index level of each network individual from its emotion cognitive ability index sequence, and from these determining the emotion cognitive ability index level sequence of the network group. The time-series emotion cognitive ability index sequence $\langle T_\lambda, R_\lambda \rangle$ of each network individual (H network individuals are assumed) is determined according to step 2), where λ is the number of the network individual. For network individual λ, the numbers $C_1$, $C_2$, $C_3$, $C_4$ of emotion cognitive ability index values $R_\lambda$ in the time period $T_\lambda$ that fall in the intervals (0, 1], (1, 3], (3, 3.57] and (3.57, 4) respectively are counted. The mean or median of all R values in the interval corresponding to the largest of the counts $C_k$ (k = 1, 2, 3, 4) is defined as the interval center value.
This center value is defined as the emotion cognitive ability index level of network individual λ; it expresses the long-term regularity with which the individual perceives emotional events, i.e., its characteristic emotion cognitive ability. For a group consisting of several network individuals, the emotion cognitive ability index level sequence of the network group is obtained from the index levels of its members.
Step 4-2) constructing a visual layout that compares the emotion cognitive ability index levels of multiple network individuals, with the number λ of the network individual on the horizontal axis and the individual's emotion cognitive ability index level on the vertical axis. The rest of the geometric figure is constructed as in step 3-1); the index level point corresponding to each network individual number λ is drawn with a specially chosen symbol or the individual's avatar icon as its label symbol, so as to express the differences in the emotion cognitive ability of the network individuals more intuitively. In addition, the vertical axis is divided in the same way as in step 3-2), into four regions according to the emotion bifurcation points. This completes the visual layout of the differences in emotion cognitive ability index level among the network individuals.
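The interval center value of step 4-1) can be sketched as below: the index values of one individual are sorted into the four intervals, the most populated interval is selected, and the mean of its values is returned. The function name `index_level` is hypothetical, the mean is used although the text equally allows the median, and ties between equally populated intervals are resolved in favor of the lower interval, which is an assumption.

```python
from statistics import mean
from typing import List

# Upper-closed intervals; the top one is closed at 4 because R_t can equal 4.
BOUNDS = [(0.0, 1.0), (1.0, 3.0), (3.0, 3.57), (3.57, 4.0)]


def index_level(r_sequence: List[float]) -> float:
    """Emotion cognitive ability index level of one network individual."""
    buckets: List[List[float]] = [[] for _ in BOUNDS]
    for r in r_sequence:
        for k, (lower, upper) in enumerate(BOUNDS):
            if lower < r <= upper:
                buckets[k].append(r)  # contributes to the count C_(k+1)
                break
    # Interval with the largest count; max() keeps the first on ties.
    largest = max(buckets, key=len)
    if not largest:
        raise ValueError("no index values fall inside (0, 4]")
    return mean(largest)


def group_levels(sequences: List[List[float]]) -> List[float]:
    """Index level of every individual in a group, in input order."""
    return [index_level(seq) for seq in sequences]
```

Plotting the result of `group_levels` against the individual number λ gives the comparison layout of step 4-2).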
Advantageous effects
The method integrates the commonly used emotion words recorded in existing emotion dictionaries, also takes into account the new network emotion words and emoticons that carry emotional tendency in the network environment, covers the emotion elements on social media platforms as fully as possible, and constructs an emotion word ontology library on this basis. It determines the position of the emotion bifurcation point of a network individual and describes and predicts that position and its changes in the form of a visual geometric layout. It describes the emotion cognitive ability level of network individuals with the emotion cognitive ability index and displays the differences in emotion bifurcation points among multiple network individuals visually.
According to the invention, the position and the change process of the emotional bifurcation point of the network individual or group can be described in a visual way, and the evolution rule of the emotional cognitive ability level of the network individual or group is revealed through the emotion fluctuation of the network individual or group, so that the relevant users can be helped to more comprehensively and intuitively know the cognitive attitude and the essence of the emotional state of the network individual or group on the sensitive event, and the future cognitive attitude and the possibly generated emotional state can be predicted and early warned. The method can be applied to various application fields such as network public opinion monitoring, microblog emotion analysis, customer evaluation, company and product reputation, stock market and financial crisis outbreak, risk analysis and the like.
Drawings
Embodiments of the invention are further described below with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of a bifurcation model of an emotional structure;
FIG. 2 is a schematic diagram of a bifurcation point theory;
FIG. 3 is a schematic flow chart of the method of the present invention;
FIG. 4 is a schematic diagram showing the change of the accuracy of the emotional word ontology library construction algorithm under the condition of increasing the number of test words;
FIG. 5 is a schematic diagram of the location change of the emotional bifurcation point of the network individual "Cui Yongyuan";
FIG. 6 is a schematic diagram of the location change of the emotional bifurcation point of the individual "Liu Yifei" in the network;
FIG. 7 is a comparison graph of emotional cognitive ability index level visualization of multiple network individuals.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail by embodiments with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the embodiment of the invention, a method for predicting and visualizing the emotion bifurcation point position of a network individual is provided, which can visually describe the process by which the individual's emotion bifurcation point position changes and predict the trend of its emotion cognitive ability index. A network individual is a network user who continuously expresses emotional views on various topics and events on Internet social platforms, such as a "big V" microblogger. The emotion bifurcation point dynamically describes the emotion evolution state, i.e., whether the emotional state associated with a given event is in the vanishing state, the equilibrium state, the approximately equilibrium state or the disordered state.
Fig. 3 is a schematic flow chart of the method of the present invention, which mainly comprises the following steps:
step 1) constructing an ontology library capable of integrating multi-source emotional words;
step 2) determining the position of an emotion bifurcation point aiming at the collected information set of the network individuals;
step 3) establishing a dynamically changed visual graph, coloring and labeling the graph;
step 4) respectively calculating the positions of emotion bifurcation points of a plurality of network individuals and the variation difference of the emotion bifurcation points to perform visual display;
step 5) comparing the emotion bifurcation position change process and the emotion cognition ability index of a plurality of network individuals, and predicting the cognition process and the emotional state for future emotional events.
More specifically, step 1) constructs an emotion word ontology library. In this embodiment, the Chinese emotion vocabulary ontology library manually labeled by the Information Retrieval Laboratory of Dalian University of Technology, the emotion dictionary of National Taiwan University and the expression vocabulary set of Tsinghua University are adopted, and 20,000 common emotion vocabulary entries and basic emoticons are collected. An emotion word ontology library is constructed according to the method provided by the invention: potential emotion words and new emoticons are automatically mined from a corpus, uniform polarity and emotion intensity values are labeled, and the ontology library can be dynamically updated.
In the step, in order to verify the effectiveness of the algorithm, judgment experiments are respectively carried out on corpus data of three groups of different internet fields, and the accuracy of the algorithm is verified. The three groups of experimental corpus data do not have any work of artificial emotion marking, and the basic information is as follows:
a) Review data for "jinkpad" from Jingdong Mall, 16M in size, containing 4000 positive reviews and 4000 negative reviews, stored as text.
b) Review data for 700 TV dramas from Douban, 65M in size; each drama has a certain number of reviews, stored as text.
c) Catering industry review data from Dianping, 407M in size, containing information such as user ID, shop ID, review content and time.
In the experimental process, only comment contents are analyzed, and potential unknown emotional words are mined from comment data, so that the positive and negative information of the corpus has no influence on the experiment, and the rest parts are not processed.
In addition, the common emotion dictionary selected in this step comprises 25651 emotion words, all labeled with emotion polarity, of which 12745 are positive emotion words and 12907 are negative emotion words. To check the validity of the algorithm in this step, 450 positive and 450 negative emotion words are extracted from the dictionary as a test word set, and emotion intensity values are assigned to the extracted words. Because emotion words can be adjectives, nouns or verbs, the 900 selected emotion words are chosen so that adjectives, nouns and verbs number 300 each.
For convenience of expression, the corpus in all the following steps is represented by U, the emotion dictionary by D, and the emotion word ontology library by E.
The following will describe in detail the construction process of the emotional word ontology library E and the related experimental results, which may include the following steps:
step 1-1) integrating the existing Chinese emotion dictionaries so that commonly used emotion words are comprehensively covered;
the integration process comprises merging the emotion words and unifying their polarities; where the same emotion word has inconsistent polarities in different Chinese emotion dictionaries, the polarity is corrected by multi-person voting.
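The merging and voting described in step 1-1) can be sketched as follows. This is a minimal illustration only: the function names, the +1/-1 polarity encoding and the sample words are assumptions, not part of the original disclosure, and ties are simply left unresolved.

```python
from collections import Counter


def unify_polarity(labels):
    """Return the majority polarity (+1 or -1) among the votes, or None on a tie."""
    counts = Counter(labels)
    pos, neg = counts.get(1, 0), counts.get(-1, 0)
    if pos == neg:
        return None  # unresolved: a further round of voting would be needed
    return 1 if pos > neg else -1


def merge_dictionaries(dictionaries, votes):
    """Merge several {word: polarity} dictionaries (hypothetical format).

    Words whose polarity differs between dictionaries are resolved from
    `votes`, a {word: [polarity, ...]} mapping of human judgments.
    """
    seen = {}
    for dictionary in dictionaries:
        for word, polarity in dictionary.items():
            seen.setdefault(word, set()).add(polarity)
    merged = {}
    for word, polarities in seen.items():
        if len(polarities) == 1:
            merged[word] = next(iter(polarities))
        else:
            merged[word] = unify_polarity(votes.get(word, []))
    return merged


# Hypothetical sample data
dict_a = {"good": 1, "proud": 1, "awful": -1}
dict_b = {"good": 1, "proud": -1}
votes = {"proud": [1, 1, -1]}
merged = merge_dictionaries([dict_a, dict_b], votes)
```

Returning `None` on a tie keeps unresolved words visible instead of silently picking a polarity.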
Step 1-2) identifying new network words frequently used by netizens on the basis of an existing large-scale social network corpus, and screening out the new network emotion words that carry an emotional tendency. The basic work comprises text preprocessing, word segmentation, part-of-speech tagging, word frequency statistics, stop word removal and the like. The new-generation ICTCLAS word segmentation tool is used for this work. ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System) is a Chinese lexical analysis system whose main functions include Chinese word segmentation, part-of-speech tagging, new word recognition and word frequency statistics; its word segmentation precision reaches 98.45%. The invention completes the text preprocessing by calling the Java interface of ICTCLAS. The specific steps are as follows:
step 1-2-1) Chinese word segmentation
For the corpora $U_1$ (Jingdong Mall), $U_2$ (Douban) and $U_3$ (Dianping), word segmentation of each corpus is completed by calling the word segmentation interface of ICTCLAS. For simplicity, each corpus is written uniformly as $U = \langle w_1, w_2, \ldots, w_M \rangle$, where $M$ is the total number of words in the word segmentation result and $w_k$ is the $k$-th word ($k = 1, 2, \ldots, M$).
Step 1-2-2) word frequency statistics
The frequency of each word $w_i$ in the word segmentation result is counted as $f_i = m_i / M$, where $m_i$ is the number of occurrences of $w_i$. The whole corpus is then represented by a bag-of-words model of (word, word frequency) pairs:
$$U = \langle (w_1, f_1), (w_2, f_2), \ldots, (w_M, f_M) \rangle$$
Step 1-2-3) removal of stop words
Stop words are words that occur frequently but carry little practical meaning in context, such as the common words "being", "and" and "at"; they play a semantic role only when placed in a sentence. Stop words make no useful contribution to the construction of the emotion word ontology library, and removing them does not affect the experimental result. The corpus U after stop word removal can be expressed as:
$$U = \langle (w_1, f_1), (w_2, f_2), \ldots, (w_N, f_N) \rangle$$
wherein N represents the total number of words in the corpus after the stop words are removed, and the number of the removed stop words is M-N.
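Steps 1-2-2) and 1-2-3) can be illustrated with a short sketch. It assumes the text has already been segmented into tokens (for example by ICTCLAS) and takes M to be the total token count; the function names, token list and stop word list are hypothetical.

```python
from collections import Counter


def word_frequencies(tokens):
    """Return {word: f_i} with f_i = m_i / M, where M is the total token count."""
    total = len(tokens)
    counts = Counter(tokens)
    return {word: count / total for word, count in counts.items()}


def remove_stop_words(frequencies, stop_words):
    """Drop stop words from the (word, frequency) bag-of-words representation."""
    return {w: f for w, f in frequencies.items() if w not in stop_words}


# Hypothetical segmented text and stop word list
tokens = ["screen", "is", "great", "and", "screen", "is", "bright"]
stop_words = {"is", "and"}

freqs = word_frequencies(tokens)
filtered = remove_stop_words(freqs, stop_words)
```

Frequencies are computed before filtering, so the remaining values stay relative to the original total M, as in the formula above.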
On the basis of the social network corpus, different emoticon word sets are collected and subjected to polarity labeling. In this way, the common emotion words, the new network emotion words and the emoticon words jointly form an emotion element set for performing emotion analysis on the network text.
Step 1-3) constructing the emotion word ontology library according to the construction method of the emotion word ontology. The emotion word ontology library E can be expressed as:
$$E = \langle (W_1, P_1, I_1), (W_2, P_2, I_2), \ldots, (W_n, P_n, I_n) \rangle$$
where $W_i$ is an emotion word, a new network emotion word or an emoticon; $P_i$ is the polarity of $W_i$ ($P_i > 0$ indicates a positive emotion word, $P_i < 0$ a negative emotion word); and $I_i$ is the emotional intensity value of $W_i$.
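One possible in-memory form of the triples $(W_i, P_i, I_i)$ is sketched below. The class name, the choice of storing intensity as a non-negative magnitude alongside a separate polarity sign, and the sample entries are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmotionEntry:
    """One triple (W_i, P_i, I_i) of the emotion word ontology library E."""
    word: str
    polarity: int      # > 0 positive, < 0 negative
    intensity: float   # assumed here to be a non-negative magnitude

    def __post_init__(self):
        if self.polarity == 0:
            raise ValueError("polarity must be positive or negative")
        if self.intensity < 0:
            raise ValueError("intensity is stored as a non-negative magnitude")


def build_ontology(triples):
    """Index the triples by word so that lookups during text matching are O(1)."""
    return {word: EmotionEntry(word, polarity, intensity)
            for word, polarity, intensity in triples}


# Hypothetical sample entries
E = build_ontology([("happy", 1, 0.8), ("awful", -1, 0.9)])
```

Indexing by word suits the later matching step, where each segmented word of a post is looked up in E.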
The polarity $P_i$ of a common emotion word is consistent with its polarity in the emotion dictionary. The emotion intensity values $I_i$ and polarities $P_i$ of new network emotion words and emoticons are calculated as follows; where the calculated polarity does not agree with the dictionary, it is corrected by manual labeling.
The construction of the emotional word ontology library adopts the formulas (1), (2) and (3) in the invention content to screen the emotional words. The emotional word ontology library E is obtained by learning according to a large-scale network text set, so that the method has the rationality of big data statistics.
Step 1-3) experimental analysis of the emotional word ontology library construction algorithm, which can comprise the following steps:
step 1-3-1) Experimental results
In the experiment, the emotion word ontology library is constructed from raw corpora, that is, corpora that have not undergone any manual emotion labeling. To verify the accuracy of the algorithm, the following measures are taken: 900 emotion words labeled with emotion intensity values are selected from the emotion dictionary D as the test word set $TestWord = \langle Word_i, IA_i \rangle$, where $Word_i$ is an emotion word, $IA_i$ is the corresponding manually labeled emotion intensity value, and $i \in \{1, 2, \ldots, 900\}$; emotion intensity values are calculated according to the emotion word membership degree (that is, emotional intensity) calculation method described in step 1-2) and denoted $Intensity_i$; and the accuracy is calculated. The intensity values of the 900 selected emotion words in the emotion dictionary D are manually labeled using a positive and negative 7-point scale scoring method:
In addition, the emotion membership calculated by the algorithm is signed: the sign represents the polarity of the corresponding emotion word and the magnitude represents its strength. To check the accuracy and effectiveness of the algorithm, the polarity of each emotion word in the test word set TestWord is compared with the polarity calculated by the algorithm.
The adjectives, nouns and verbs in the emotion word test set are selected as test emotion words, and the three corpora $\langle U_1, U_2, U_3 \rangle$ are used as test corpora. The experimental results are shown in Table 1:
TABLE 1 Emotion ontology library construction algorithm experimental results
The calculated emotional intensity values are then sorted. Positive emotion words, whose calculated intensity is positive, are sorted from largest to smallest; negative emotion words, whose calculated intensity is negative, are sorted from largest to smallest by absolute value. The accuracy over the first 10, 50, 150, 250, 350 and 450 words is calculated, and the results are shown in Table 2:
TABLE 2 Emotion word ontology base construction algorithm accuracy statistics under test word quantity increasing condition
A line graph of the statistics in Table 2 is shown in FIG. 4. For both positive and negative emotion words, the accuracy calculated by the algorithm approaches about 91% as the number of emotion words increases. This indicates that the accuracy of the emotion word ontology library construction algorithm is about 91% and that the method is effective.
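The accuracy check over the first k ranked words can be sketched as follows. The sketch assumes that "accuracy" means agreement between the sign of the calculated intensity and the dictionary polarity, and that it is applied separately to the positive and the negative test words; the function name and the sample scores are hypothetical.

```python
def top_k_accuracy(scored, reference, ks):
    """Polarity accuracy over the first k words ranked by |calculated intensity|.

    scored:    list of (word, signed calculated intensity)
    reference: {word: dictionary polarity (+1 or -1)}
    ks:        cut-off sizes, e.g. [10, 50, 150, 250, 350, 450]
    A calculated intensity of exactly 0 is counted as negative here.
    """
    ranked = sorted(scored, key=lambda item: abs(item[1]), reverse=True)
    result = {}
    for k in ks:
        top = ranked[:k]
        if not top:
            continue
        correct = sum(1 for word, score in top
                      if (score > 0) == (reference[word] > 0))
        result[k] = correct / len(top)
    return result


# Hypothetical positive test words with calculated signed intensities
scored = [("a", 0.9), ("b", 0.7), ("c", -0.2), ("d", 0.1)]
reference = {"a": 1, "b": 1, "c": 1, "d": 1}
accuracy = top_k_accuracy(scored, reference, ks=[2, 4])
```

In this toy data the two strongest words are classified correctly, while one of the four has the wrong sign.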
So far, a construction result of a partial emotional word ontology library is given, and is shown in table 3:
TABLE 3 partial emotional word ontology library
Step 1-4) supplement of emotional word ontology library
In addition, the invention considers that emoticons and new network words with emotional characteristics in the text published by network individuals reflect, to a high degree, the emotional state of those individuals toward current events. Therefore, to make the emotion word ontology library as complete as possible and to calculate the emotion perception capability of network individuals accurately, an emoticon emotion ontology library and a new network emotion word ontology library are constructed as follows:
Taking 1700 pieces of microblog data as the corpus, the new word discovery and word frequency statistics functions of the ICTCLAS 2014 software are used to identify 2182 new network words frequently used by netizens. Words without obvious emotional coloring are eliminated, the first 569 words are extracted, and 458 words are finally selected as new network emotion words. Because the emotional coloring of most new network emotion words is assigned by netizens through usage, the coloring of some words varies, and some words, such as rare new words, originally carried no emotional coloring at all. The new network emotion word ontology library is therefore constructed by manual labeling on the basis of the construction method of the emotion word ontology library, and its format $(W_i, P_i, I_i)$ is consistent with the emotion word ontology library already constructed.
Similarly, 268 emoticons frequently used by netizens are extracted from the microblog corpus, and an emoticon emotion ontology library is constructed by manual labeling.
Therefore, the construction, experimental analysis and supplement of the emotional word ontology library are completed.
In the invention, the extracted emotional words and the emotional intensity values thereof are different according to different corpus sources. Therefore, the construction algorithm of the emotion word ontology base is irrelevant to the application field, the emotion dictionary can be effectively expanded, the fine use of the emotion dictionary is enriched, and the method can be used for mining and analyzing the data value of multi-field texts in a big data era.
Step 2) calculating the position of the emotion bifurcation point of the network individual
In this step, the Sina Weibo user "Cui Yongyuan" is taken as the network individual under study, and a total of 20 microblog posts published from May 2013 to June 2014 are collected as an embodiment. The step is discussed in detail below.
For the collected information set of the network individual, the positions of the emotion bifurcation points are determined and the geometric layout of the dynamically changing visual graph is established. Denoting the collected text information set as U, U can be expressed as:
where $T = 20$ is the length of the time sequence and $S_t$ is the microblog post published at time $t$.
The process comprises the following steps:
step 2-1) determining multi-element emotion classification of emotion words contained in each piece of text information and emotion intensity value on classification dimension
Word segmentation and part-of-speech tagging are performed on the information set U in time order using the ICTCLAS 2014 word segmentation tool, giving the word set $W_t$ of the microblog post $S_t$ published at time $t$:
$$W_t = \langle W_1, W_2, \ldots, W_J \rangle$$
where $W_j$ is one word in $W_t$ (whether it is an emotion word requires further judgment) and $J$ is the number of words.
The word set $W_t$ of the microblog post $S_t$ at time $t$ is matched against the emotion word ontology library constructed in step 1), and all emotion words in $W_t$ are extracted together with their emotional intensity values.
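The matching of a segmented post against the ontology library can be sketched as a dictionary lookup. The ontology is reduced here to a hypothetical `{word: intensity}` mapping, and the words and intensity values are invented for illustration.

```python
def extract_emotion_words(tokens, ontology):
    """Return (word, intensity) for every token found in the ontology.

    Order and repeated occurrences are preserved, so the length of the
    result is the number of emotion words in the post.
    """
    return [(word, ontology[word]) for word in tokens if word in ontology]


# Hypothetical ontology excerpt and segmented post
ontology = {"cheat": 3.8, "fake": 3.6, "nice": 1.5}
tokens = ["this", "is", "a", "fake", "and", "a", "cheat"]

matches = extract_emotion_words(tokens, ontology)
num_t = len(matches)  # number of emotion words in the post
```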
Let $Num_t$ denote the number of emotion words extracted from $W_t$, and let $dead_t$, $low_t$, $med_t$ and $high_t$ denote the numbers of those emotion words whose intensity value falls in $(0, a)$, $[a, b)$, $[b, c)$ and $[c, d]$ respectively. Here $d = \max(|I(w)|)$ is the maximum absolute emotion intensity in the emotion ontology library, and the demarcation points are the parameter values of the corresponding emotion bifurcation points: $a = 0.25d$, $b = 0.75d$, $c = 0.8925d$. The four intervals are consistent with the emotional states of the emotional structure bifurcation point model (shown in FIG. 1) and correspond respectively to the "evanescent state" (0, 1], the "equilibrium state" (1, 3], the "approximate equilibrium state" (3, 3.57] and the "disordered state" (3.57, 4). The following proportions can then be calculated:
$$P(high)_t = \frac{high_t}{Num_t}, \quad P(med)_t = \frac{med_t}{Num_t}, \quad P(low)_t = \frac{low_t}{Num_t}, \quad P(dead)_t = \frac{dead_t}{Num_t}$$
where $P(high)_t$, $P(med)_t$, $P(low)_t$ and $P(dead)_t$ are the proportions of emotion words in the microblog post $S_t$ at time $t$ that correspond to the "disordered state", the "approximate equilibrium state", the "equilibrium state" and the "evanescent state" respectively.
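The interval counting and the proportions can be sketched as below. The sketch assumes the mapping implied by the formulas for the index $R_t$: `dead` for $(0, a)$, `low` for $[a, b)$, `med` for $[b, c)$ and `high` for $[c, d]$, with each proportion taken as count divided by $Num_t$. The function name and sample intensities are hypothetical.

```python
def intensity_proportions(intensities, d):
    """Proportion of emotion words in each intensity interval.

    intensities: intensity values of the emotion words found in one post
    d:           maximum absolute intensity in the ontology library
    Cut points follow the text: a = 0.25d, b = 0.75d, c = 0.8925d.
    """
    a, b, c = 0.25 * d, 0.75 * d, 0.8925 * d
    counts = {"dead": 0, "low": 0, "med": 0, "high": 0}
    for value in intensities:
        x = abs(value)
        if x < a:
            counts["dead"] += 1
        elif x < b:
            counts["low"] += 1
        elif x < c:
            counts["med"] += 1
        else:
            counts["high"] += 1
    num = len(intensities)
    if num == 0:
        return {name: 0.0 for name in counts}
    return {name: count / num for name, count in counts.items()}


# Hypothetical intensities with d = 4, so a = 1, b = 3, c = 3.57
proportions = intensity_proportions([0.5, 2.0, 2.5, 3.2, 3.9], d=4.0)
```

With `d = 4` the cut points coincide with the bifurcation values 1, 3 and 3.57 of the model.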
The network emotion cognition ability index $R_t$ is defined and calculated according to the position of the bifurcation point of the emotion structure. The higher the emotional cognitive ability of a network individual, the more easily the emotional state falls into the "disordered state"; the lower the emotional cognitive ability, the more easily the emotional state stays in the "equilibrium state"; moderate emotional cognitive ability tends to keep the emotional state in the "approximate equilibrium state". In the example described:
$$\max\{P(high)_t, P(med)_t, P(low)_t, P(dead)_t\} = \max\{0, 12/19, 6/19, 1/19\}$$
represents the most significant emotional state of the microblog post $S_t$ at time $t$. For example, if $P(med)_t$ is the maximum, the emotional state of the network individual at time $t$ is in the approximate equilibrium state, and the corresponding emotional cognitive ability lies in (3, 3.57].
The magnitude of emotional cognitive ability is defined as $R_t$, called the emotional cognitive ability index. Its calculation depends on the emotional expression intensity in the corresponding emotional state, as follows.
If $P(high)_t$ is the maximum, then $R_t = 3.57 + 0.43 \cdot P(high)_t$
If $P(med)_t$ is the maximum, then $R_t = 3 + 0.57 \cdot P(med)_t$
If $P(low)_t$ is the maximum, then $R_t = 1 + 2 \cdot P(low)_t$
If $P(dead)_t$ is the maximum, then $R_t = P(dead)_t$
In this example, because $P(med)_t = 12/19$ is the maximum, $R_t = 3 + 0.57 \times 12/19 = 3.36$.
In this way, each microblog post $S_t$ yields a corresponding sequence element value $R_t$, and over the time series $T = \langle 1, 2, 3, \ldots, t, \ldots, 20 \rangle$ the change sequence of the emotional cognitive ability index (defined by the emotional bifurcation point value) is $R = \langle R_1, R_2, \ldots, R_t, \ldots, R_{20} \rangle$.
This sequence represents the emotional cognitive ability values, that is, the emotional bifurcation values, of the network individual "Cui Yongyuan" over the time series.
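The piecewise definition of $R_t$ can be written as a small function. The four formulas are taken from the text; the function name and the tie-breaking order (high, med, low, dead) when two proportions are equal are assumptions, since the text does not address ties.

```python
def cognition_index(p_high, p_med, p_low, p_dead):
    """Emotional cognitive ability index R_t from the four proportions.

    The dominant (largest) proportion selects the formula. On a tie the
    first of high, med, low, dead wins, which is an assumption.
    """
    proportions = {"high": p_high, "med": p_med, "low": p_low, "dead": p_dead}
    dominant = max(proportions, key=proportions.get)
    p = proportions[dominant]
    if dominant == "high":
        return 3.57 + 0.43 * p   # disordered state, (3.57, 4)
    if dominant == "med":
        return 3 + 0.57 * p      # approximate equilibrium state, (3, 3.57]
    if dominant == "low":
        return 1 + 2 * p         # equilibrium state, (1, 3]
    return p                     # evanescent state, (0, 1]


# Values from the example in the text: max{0, 12/19, 6/19, 1/19}
r_t = cognition_index(0, 12 / 19, 6 / 19, 1 / 19)

# A sequence R is obtained by applying the function to each post in time order
# (the second tuple is a hypothetical post)
R = [cognition_index(*p) for p in [(0, 12 / 19, 6 / 19, 1 / 19), (0.0, 0.1, 0.8, 0.1)]]
```

Each formula maps a proportion in [0, 1] onto the interval of the corresponding emotional state, so $R_t$ always lands in the region of the dominant state.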
Step 3) coloring the established geometric layout and labeling labels
The emotional cognitive ability of an individual, that is, the position of the emotional bifurcation point, is relatively stable over a given period if the individual experiences no significant emotional event. In other words, the emotional state tendency shown in the network texts published by a network individual is relatively constant over a long period. The invention therefore establishes a geometric layout for the emotion cognition ability time variation sequence < T, R > of the network individual "Cui Yongyuan" obtained in step 2), so that the emotion bifurcation of the network individual can be visualized more intuitively. The specific steps may comprise:
Step 3-1) constructing a visualization layout of the network individual's emotion cognitive ability index positions, with time as the horizontal axis and the emotion bifurcation point value as the vertical axis. Given the time series $T = \langle 1, 2, 3, \ldots, t, \ldots, T \rangle$ of the network individual obtained in step 2-4) and the change sequence of emotional cognitive ability index values $R = \langle R_1, R_2, \ldots, R_t, \ldots, R_T \rangle$, a geometric figure is constructed in a two-dimensional rectangular coordinate system with the time series T on the horizontal axis and the emotion cognition ability value sequence R on the vertical axis. In the dynamic model of the bifurcation point of the emotional structure, the range of emotional cognitive ability is divided into (0, 1], (1, 3], (3, 3.57] and (3.57, 4), which correspond to the "evanescent state", "equilibrium state", "approximate equilibrium state" and "disordered state" of the individual's mood respectively.
Thus $R \in \{(0, 1], (1, 3], (3, 3.57], (3.57, 4)\}$, and the first quadrant of the coordinate system is divided into four regions.
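The division of the vertical axis into four regions amounts to a lookup from an index value to an emotional state. The sketch below uses the interval bounds given in the text; the function name and the English state labels are illustrative choices.

```python
def emotional_state(r):
    """Map an emotional cognitive ability value R in (0, 4) to its state region."""
    if not 0 < r < 4:
        raise ValueError("R must lie in the open interval (0, 4)")
    if r <= 1:
        return "evanescent"                 # (0, 1]
    if r <= 3:
        return "equilibrium"                # (1, 3]
    if r <= 3.57:
        return "approximate equilibrium"    # (3, 3.57]
    return "disordered"                     # (3.57, 4)


# Region of the highest value reported for FIG. 5
state = emotional_state(3.88)
```

The same function can be used to color or label each plotted point according to its region.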
Step 3-2) visually displaying the emotion cognition ability index positions of the network individual. In this step, so that the change process of the individual's emotion bifurcation position can be compared within the geometric layout, the points in the coordinate system are labeled when the layout is drawn. The attributes of such a point can be expressed as $\langle t, R_t, F \rangle$, where $t$ is the time point, $R_t$ is the value of the point at time $t$, and $F$ is the label symbol selected when the point is drawn.
Thus, the emotional bifurcation point position change geometry of the network individual "Cui Yongyuan" is shown in fig. 5:
as described above, the cognitive results of the network individuals on emotional events are presented in the form of web texts.
FIG. 5 shows the change process of the emotional bifurcation point position over 19 acts of cognition of emotional events by the network individual "Cui Yongyuan". The numbers of emotional cognitive ability values falling in (0, 1], (1, 3], (3, 3.57] and (3.57, 4) are 0, 12, 6 and 1 respectively. The emotional cognitive ability R of this individual is close to 3, at the boundary between the equilibrium state and the approximate equilibrium state, which shows that the individual has relatively strong cognitive ability with respect to emotional events and readily produces rich, complex and fairly intense emotional states. In addition, FIG. 5 shows that the highest emotional cognitive ability is R = 3.88. Inspecting the collected text information set shows that the corresponding microblog post $S_{15}$ was published during a heated microblog dispute with Fang Zhouzi and contains words with high emotional intensity values, such as "bad mark spot", "cheat", "fake" and "rogue".
After an emotional event occurs, a network individual who perceives it with higher emotional cognitive ability may end up in the "approximate equilibrium state" or even the "disordered state". Most emotion words in the network texts published by most network individuals, however, are not very strong, so the normal emotional state is the "equilibrium state", that is, the emotional cognitive ability lies between 1 and 3. On the other hand, the emotional cognitive abilities of different network individuals differ. To better reflect this difference, the microblog information set of the network individual "Liu Yifei" is taken as another embodiment, and the change graph of this individual's emotional bifurcation point, drawn according to the above steps, is shown in FIG. 6. The numbers of times the emotional cognitive ability falls in (0, 1], (1, 3], (3, 3.57] and (3.57, 4) are 0, 13, 3 and 0 respectively. Intuitively, the average emotional cognitive ability is close to 2.5, within the "equilibrium state" interval. Inspection of the collected microblog texts shows that emotion words with moderate intensity values are used frequently in the published texts while emotion words with high intensity values are rarely used, which is consistent with FIG. 6.
It should be noted that understanding the change in emotional cognitive ability of the network individual shown in FIG. 6, that is, the process by which the location of the emotional bifurcation point changes, requires combining emotional event theory with the emotional structure bifurcation model. In view of this, the network individual emotion bifurcation point visualization graph disclosed by the embodiment of the invention can help a user better understand why a network individual shows particular emotional states when perceiving different emotional events at different times.
Step 4) analyzing a plurality of network individuals, comparing the position change of the emotion cognition ability index with the emotion cognition ability index difference, acquiring the emotion cognition intensity index and carrying out visual display, and further comprising the following steps:
Step 4-1) respectively determining the positions of the emotion cognition ability indexes of the network individuals relative to the emotion bifurcation, and then determining the emotion cognition ability index sequence of the network group.
Individuals differ in their cognitive abilities with respect to emotion. To compare the emotional cognitive abilities of different network individuals and show the differences in their emotional bifurcation point positions, the text information sets of 4 network individuals, Cui Yongyuan, Fang Zhouzi, Sima Na and Liu Yifei, are selected as the analysis corpus, and their emotional cognitive ability time change sequences $\langle T_\lambda, R_\lambda \rangle$ are obtained through the calculation steps above, where $\lambda$ is the number of the network individual. Through step 3), the emotional bifurcation point position change layout of each network individual can be established separately; however, the emotional cognitive abilities of the network individuals cannot be compared with each other from these layouts alone.
Therefore, the concept of emotional cognitive strength is introduced so that emotional cognitive abilities can be compared among different network individuals. For network individual $\lambda$, the numbers $C_1$, $C_2$, $C_3$, $C_4$ of emotional cognitive ability values $R_\lambda$ falling in the intervals (0, 1], (1, 3], (3, 3.57] and (3.57, 4) during the time period $T_\lambda$ are counted, and mean value clustering ($k = 1, 2, 3, 4$) is performed on all R values in the interval corresponding to the maximum R value to obtain a central value. This central value is the emotional cognitive ability index level of the network individual and represents the individual's significant cognitive ability with respect to emotional events. By calculating the emotional cognitive ability index levels of different network individuals and numbering the individuals, the emotion cognition ability index level sequence of the multiple network individuals is obtained, where $\lambda = \langle 1, 2, \ldots, H \rangle$ is the number sequence of the network individuals and each element of the sequence is the emotional cognitive ability index level of network individual $\lambda$. The emotional cognitive ability index level has the same meaning as the average emotional cognitive ability described above and describes the significant long-term emotional cognitive ability level of the network individual.
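One way to read the index level calculation is sketched below. It rests on two assumptions that the text does not settle: the interval selected is the one containing the most R values (the largest count $C_k$), and the "central value" of the mean value clustering is taken as the plain arithmetic mean of the values in that interval. The function names and sample values are hypothetical.

```python
from statistics import mean

# Upper bounds of (0, 1], (1, 3], (3, 3.57], (3.57, 4)
UPPER_BOUNDS = [1, 3, 3.57, 4]


def interval_index(r):
    """Return 0..3 for the interval that contains r."""
    for index, upper in enumerate(UPPER_BOUNDS):
        if r <= upper:
            return index
    raise ValueError("R must not exceed 4")


def index_level(r_values):
    """Mean of the R values in the most populated interval (assumed reading)."""
    buckets = [[] for _ in UPPER_BOUNDS]
    for r in r_values:
        buckets[interval_index(r)].append(r)
    dominant = max(buckets, key=len)  # on a tie the lower interval wins
    return mean(dominant)


# Hypothetical index sequence for one network individual
level = index_level([2.0, 2.5, 3.1, 2.9, 3.8])
```

Under this reading, occasional extreme values (such as 3.8 above) do not shift the level, which matches the stated aim of describing the long-term significant ability.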
Step 4-2) establishing a comparison visualization layout of emotion cognitive ability levels of the multiple network individuals;
In this step, the horizontal axis represents the number $\lambda$ of the network individual and the vertical axis represents the emotional cognitive ability index level of network individual $\lambda$. The rest of the geometric layout is constructed as in step 3-1) above. When the layout is drawn, the avatar icon of each network individual is used as the label symbol for the emotion cognition ability index point corresponding to the individual's number $\lambda$, so that differences in emotional cognitive ability are expressed more intuitively. In addition, the vertical axis is divided into four regions according to the emotional bifurcation points, using the same division as in step 3-2). The resulting comparison graph of the emotional cognitive ability index levels of the network individuals is shown in FIG. 7.
FIG. 7 compares the emotion cognition index levels of the network individuals "Cui Yongyuan", "Liu Yifei", "Sima Na" and "Fang Zhouzi", whose values are 3.12, 2.55, 2.89 and 3.59 respectively. The highest level belongs to "Fang Zhouzi", which indicates that the text published by this individual contains more emotion words with high intensity values and that the individual is more sensitive to emotional events, consistent with the emotional structure bifurcation model. The comparison graph provided by the embodiment of the invention can therefore help a user intuitively understand the differences in emotional cognitive ability among network individuals. It should also be understood that step 4-2) is optional: labels are marked on the created visualization so that the image displays more comparative information and helps the user analyze the differences among network individuals more vividly.
Step 5) comparing the emotion bifurcation position change process and the emotion cognition index level of a plurality of network individuals, and reasonably predicting the cognition process and result of the network individuals with respect to future emotional events.
The literature indicates that an individual's emotional cognitive ability level with respect to emotional events follows certain regularities. For emotional events encountered in the future, the invention can therefore predict the emotional state that a network individual is likely to show, based on the individual's emotional bifurcation value sequences and the emotional cognitive ability index level determined from them. The higher the level of emotional cognitive ability of a network individual, the more sensitive the individual is to emotional events and the more likely the emotional state is to become "disordered"; similarly, the lower the level, the more easily the emotional state stays in the "equilibrium state"; a moderate level places the emotional state in the "approximate equilibrium state". This is what enables a reasonable prediction according to the invention.
In conclusion, the invention can enable the user to determine the emotion bifurcation point position of a specific network individual, and can better classify and compare and analyze the individuals in the network group organization.
In order to illustrate the contents and the implementation of the present invention, two specific examples are given in the present specification. The details introduced in the examples are not intended to limit the scope of the claims but to aid in the understanding of the process described herein. Those skilled in the art will understand that: various modifications, changes or substitutions to the preferred embodiment steps are possible without departing from the spirit and scope of the invention and its appended claims. Therefore, the present invention should not be limited to the disclosure of the preferred embodiments and the accompanying drawings.

Claims (7)

1. A method for predicting and visualizing the emotion cognitive ability of network individuals or groups, characterized by comprising the following steps:
step 1) constructing an ontology library capable of integrating multi-source emotional words;
step 2) determining the position of a network individual emotion bifurcation, and calculating an emotion cognitive ability index sequence according to a text information set published by the network individual and collected according to a time sequence;
step 3) visualizing the emotion cognition ability index sequence obtained in the step 2);
step 4) comparing and analyzing the emotion cognition ability index levels of a plurality of network individuals;
the step 1) of constructing an emotional word ontology library further comprises the following steps:
step 1-1) merging common emotion words in the existing Chinese emotion dictionary and network emotion new words and emoticons screened from a corpus set to obtain an emotion element set;
step 1-2) for each word $W_i$ in the emotion element set, determining its emotion intensity $I_i$ and labeling its emotion polarity $P_i$;
step 1-3) screening out the words $W_i$ whose emotion intensity $I_i$ is above a threshold, and adding each such word $W_i$, together with its emotion polarity $P_i$ and emotion intensity $I_i$, to the emotion word ontology library E as a triple, so that E is obtained as:
$E = \langle (W_1, P_1, I_1), (W_2, P_2, I_2), \ldots, (W_i, P_i, I_i), \ldots, (W_n, P_n, I_n) \rangle$;
the step 1-2) further comprises the following steps:
step 1-2-1) emotion polarity labeling: for the polarity $P_i$ of common emotion words, if the same emotion word is labeled inconsistently in different emotion dictionaries, the label is corrected by multi-user voting; because the number of new network emotion words and emoticons is limited, their polarities are determined by multi-user voting;
step 1-2-2) determining the emotional intensity:
(1) determining the emotion intensity of the common emotion words: first, a large-scale social network text set U is obtained, and then the emotion intensity of a common emotion word $w^*$ is calculated according to the following formula:
$I(w^*) = r(w^* \mid S_{neg}) - r(w^* \mid S_{pos})$
wherein $S_{pos}$ and $S_{neg}$ respectively denote the sets of positive emotion words and negative emotion words in the social network text set U, $r(w^* \mid S_{pos})$ denotes the positive emotion weight of $w^*$, and $r(w^* \mid S_{neg})$ denotes the negative emotion weight of $w^*$; the emotion weight is calculated by the following formula;
wherein $S^*$ denotes $S_{pos}$ or $S_{neg}$, $\alpha, \beta \in [0,1]$ are adjustment parameters, $C_i$ is the i-th character of $w^*$, $w^*$ has k characters in total, and $P(C_i \mid S^*)$ and $P(w^*)$ can be calculated by:
wherein $Freq(S^*, C_i)$ denotes the frequency with which the component character $C_i$ of words belonging to $S^*$ occurs in U, and $Freq(S^*)$ denotes the sum of the frequencies in U of all component characters of words belonging to $S^*$; $\delta$ is a small number;
wherein $Freq(w^*)$ denotes the frequency with which $w^*$ occurs in U, $|U|$ denotes the number of words in U, and the summation term denotes the sum of the frequencies with which all words $w_i$ in U occur in U;
(2) Correcting the emotion polarity of the common emotion words:
when the emotion intensity I is greater than 0, positive emotion is represented, and the emotion polarity P = +1;
when the emotion intensity I is less than 0, negative emotion is represented, and the emotion polarity P = -1;
(3) Because the quantity of the network emotion new words and the emoticons is limited, the emotion intensity of the network emotion new words and the emoticons is determined by adopting a multi-user voting mode on the basis of referring to the intensity of the common emotion words;
$\delta$ is the reciprocal of the total number of words in $S^*$.
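Steps 1-2) and 1-3) of claim 1 can be illustrated with a minimal Python sketch. It assumes the signed intensities have already been computed (the weight formula is not reproduced here), that the threshold is compared against the absolute value of the intensity, and that polarity follows the sign rule of item (2). The names `EmotionEntry` and `build_ontology` and the sample words are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, NamedTuple


class EmotionEntry(NamedTuple):
    """One (W_i, P_i, I_i) triple of the emotion word ontology library E."""
    word: str
    polarity: int      # +1 for positive emotion, -1 for negative emotion
    intensity: float   # signed emotion intensity I_i


def build_ontology(intensities: Dict[str, float], threshold: float) -> List[EmotionEntry]:
    """Keep words whose intensity passes the threshold and store them as triples."""
    library = []
    for word, intensity in intensities.items():
        # Assumption: the threshold is applied to |I|, so strongly negative
        # words are kept as well as strongly positive ones.
        if abs(intensity) <= threshold:
            continue
        polarity = 1 if intensity > 0 else -1  # sign rule from item (2)
        library.append(EmotionEntry(word, polarity, intensity))
    return library


# Hypothetical sample data, for illustration only.
sample = {"happy": 0.8, "sad": -0.6, "fine": 0.05}
for entry in build_ontology(sample, threshold=0.1):
    print(entry)
```

Storing the entries as named triples keeps the layout close to the form of E given in step 1-3); a dictionary keyed by word would be an equally reasonable choice when fast lookup during matching matters.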
2. The method for predicting and visualizing the emotional cognitive abilities of individuals or groups on a network according to claim 1, wherein: the step 2) further comprises the following steps:
step 2-1) collecting a text information set U published by the network individuals according to a time sequence:
wherein T is a time sequence, S is the text information vector set corresponding to T, and the microblog information published at time t is $S_t$;
step 2-2) performing word segmentation and part-of-speech tagging preprocessing on the text information set U to obtain the word sets $W_1$~$W_T$ of all microblog information $S_1$~$S_T$ published at times 1 to T, wherein the word set of the microblog information $S_t$ published at time t is denoted $W_t$;
step 2-3) matching the words in the word set $W_t$ ($1 \le t \le T$) of each piece of microblog information one by one against the emotion word ontology library E, and extracting the emotion words together with their emotion polarities and emotion intensity values, wherein $W_{ti}$ denotes the i-th emotion word contained in $S_t$ and $Num_t$ denotes the number of emotion words contained in $W_t$;
step 2-4) constructing a network individual emotion bifurcation point position calculation model and calculating a network emotion cognition ability index value of the network individual according to time sequence change, wherein the network emotion bifurcation point position calculation model specifically comprises the following steps:
step 2-4-1) calculating, by the following formula, the ratios $P(high)_t$, $P(med)_t$, $P(low)_t$ and $P(dead)_t$ of the emotion words contained in $S_t$ that correspond to the four emotional states "equilibrium state", "approximately equilibrium state", "disorder state" and "evanescent state":
wherein $dead_t$, $low_t$, $med_t$ and $high_t$ denote the numbers of emotion words, among the $Num_t$ emotion words, whose emotion intensity values fall in (0, a), [a, b), [b, c) and [c, d) respectively; d = max(|I(w)|) is the maximum of the absolute values of all emotion intensity values in the emotion word ontology library E; the demarcation points a, b, c, d are the parameter values of the corresponding emotion bifurcation points, with the values a = 0.25d, b = 0.5d, c = 0.75d;
step 2-4-2) defining and calculating the emotion cognitive ability index $R_t$ of the network individual according to the position of the bifurcation point of the emotion structure; the calculation process is as follows:
first the following value is determined:
$\max\{P(high)_t, P(med)_t, P(low)_t, P(dead)_t\}$
if $P(high)_t$ is the maximum, then $R_t = 3.57 + 0.43 \cdot P(high)_t$;
if $P(med)_t$ is the maximum, then $R_t = 3 + 0.57 \cdot P(med)_t$;
if $P(low)_t$ is the maximum, then $R_t = 1 + 2 \cdot P(low)_t$;
if $P(dead)_t$ is the maximum, then $R_t = P(dead)_t$;
step 2-4-3) for all microblog information $S_1$~$S_T$ published over the time sequence $T = \langle 1, 2, 3, \ldots, t, \ldots, T \rangle$ ($T \ge 1$), calculating the emotion cognitive ability index sequence $R = \langle R_1, R_2, \ldots, R_t, \ldots, R_T \rangle$ through step 2-4-1) and step 2-4-2).
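Steps 2-4-1) to 2-4-3) can be sketched in Python as follows. This is an illustration under several assumptions rather than the patented implementation: each ratio is taken to be the band count divided by Num_t (the ratio formula itself is not reproduced in the text), the bands are applied to the absolute intensity, the cut points are taken in ascending order as 0.25d, 0.5d and 0.75d, a value equal to d is counted in the top band, ties between ratios are broken in the order high, med, low, dead, and a post without emotion words yields `None`. All function names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

STATES = ("high", "med", "low", "dead")  # assumed tie-break order


def state_ratios(intensities: Sequence[float], d: float) -> Dict[str, float]:
    """Share of emotion words per intensity band (assumed: count / Num_t)."""
    if not intensities:
        raise ValueError("at least one emotion word is required")
    a, b, c = 0.25 * d, 0.5 * d, 0.75 * d  # assumed ordering a < b < c < d
    counts = dict.fromkeys(STATES, 0)
    for value in intensities:
        x = abs(value)  # assumption: bands apply to |I|
        if x < a:
            counts["dead"] += 1
        elif x < b:
            counts["low"] += 1
        elif x < c:
            counts["med"] += 1
        else:
            counts["high"] += 1  # assumption: x == d also counts as high
    total = len(intensities)
    return {state: counts[state] / total for state in STATES}


def cognition_index(intensities: Sequence[float], d: float) -> Optional[float]:
    """R_t for one post, using the piecewise rule of step 2-4-2)."""
    if not intensities:
        return None  # assumption: no emotion words means no index value
    ratios = state_ratios(intensities, d)
    state = max(STATES, key=lambda s: ratios[s])
    p = ratios[state]
    if state == "high":
        return 3.57 + 0.43 * p
    if state == "med":
        return 3.0 + 0.57 * p
    if state == "low":
        return 1.0 + 2.0 * p
    return p


def index_sequence(posts: Sequence[Sequence[float]], d: float) -> List[Optional[float]]:
    """R = <R_1, ..., R_T> for posts given in chronological order."""
    return [cognition_index(post, d) for post in posts]


# Hypothetical intensities of the emotion words matched in three posts.
print(index_sequence([[0.9, 0.8, 0.3], [0.3, 0.4, 0.1], [0.1, -0.2]], d=1.0))
```

Because each ratio lies in [0, 1], the four branches map onto the ranges (3.57, 4], (3, 3.57], (1, 3] and (0, 1], which matches the region boundaries used for the visualization in claim 3.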
3. The method for predicting and visualizing the emotional cognitive abilities of individuals or groups on a network according to claim 1, wherein: step 3) the visualization is accomplished by:
step 3-1) constructing a geometric figure in a two-dimensional rectangular coordinate system with the time sequence T as the horizontal axis and R as the vertical axis, and dividing the coordinate system into four regions according to $R \in \{(0,1], (1,3], (3,3.57], (3.57,4]\}$;
step 3-2) marking points in the coordinate system according to the emotion cognitive ability index sequence of the network individual, wherein each point consists of three attributes $\langle t, R_t, F \rangle$, where t denotes the t-th time of the time sequence T, $R_t$ denotes the emotion cognitive ability index at time t, and F is the label symbol selected when the point is drawn.
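The data preparation behind steps 3-1) and 3-2) can be sketched as follows; the drawing itself is left to whatever plotting library is available. The sketch assumes time indices start at 1 and uses a single placeholder marker string for F. The names `region_of` and `build_points` are hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple

# Upper bounds of the four regions (0,1], (1,3], (3,3.57], (3.57,4].
REGION_UPPER_BOUNDS = (1.0, 3.0, 3.57, 4.0)


def region_of(r: float) -> int:
    """Return the region number 0..3 of an index value r in (0, 4]."""
    if not 0.0 < r <= 4.0:
        raise ValueError("R must lie in (0, 4]")
    # bisect_left places r at the first bound >= r, so each interval is
    # open on the left and closed on the right, as in step 3-1).
    return bisect_left(REGION_UPPER_BOUNDS, r)


def build_points(r_sequence: Sequence[float], marker: str = "o") -> List[Tuple[int, float, str]]:
    """Build the <t, R_t, F> triples of step 3-2), with t starting at 1."""
    return [(t, r, marker) for t, r in enumerate(r_sequence, start=1)]


# Hypothetical index sequence for one network individual.
for t, r, f in build_points([0.5, 2.0, 3.57, 3.9]):
    print(t, r, f, "region", region_of(r))
```

Keeping the region lookup separate from the point list makes it easy to shade the four horizontal bands once and then draw the points with any marker, for example an avatar image as suggested in claim 7.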
4. The method for predicting and visualizing the emotional cognitive abilities of individuals or groups on a network according to claim 1, wherein: the step 4) of comparing and analyzing the emotion cognition ability index levels of a plurality of network individuals is completed by the following steps:
step 4-1) calculating the emotion cognition ability index level of the network individuals according to the emotion cognition ability index sequence of the network individuals, and further determining the emotion cognition ability index level sequence of the network group:
where H denotes the number of network individuals in the network group, and $R_{center}$ of the λ-th network individual ($1 \le \lambda \le H$) denotes the emotion cognitive ability index level of that individual;
step 4-2) constructing a comparison visualization layout of the emotion cognitive ability index levels of multiple network individuals, with the horizontal axis representing the number λ of the network individual and the vertical axis representing the emotion cognitive ability index level of that individual.
5. The method for predicting and visualizing emotional cognitive abilities of individuals or groups according to claim 4, wherein: the emotion cognitive ability index level $R_{center}$ is obtained by the following process: for network individual λ in time period $T_\lambda$, the numbers $C_1$, $C_2$, $C_3$, $C_4$ of emotion cognitive ability index values $R_\lambda$ falling in the intervals (0,1], (1,3], (3,3.57] and (3.57,4] respectively are counted, and the mean or median of all R values in the R interval corresponding to the maximum count $C_k$ (k = 1, 2, 3, 4) is taken.
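The procedure of claim 5 can be sketched in Python as follows. It assumes every R value lies in (0, 4] and that a tie between interval counts is resolved in favor of the lowest interval; the claim does not state a tie rule. The function name `index_level` is hypothetical.

```python
from bisect import bisect_left
from statistics import mean, median
from typing import List, Sequence

# Upper bounds of the intervals (0,1], (1,3], (3,3.57], (3.57,4].
UPPER_BOUNDS = (1.0, 3.0, 3.57, 4.0)


def index_level(r_values: Sequence[float], use_median: bool = False) -> float:
    """Mean or median of the R values in the most populated interval."""
    if not r_values:
        raise ValueError("at least one R value is required")
    buckets: List[List[float]] = [[] for _ in UPPER_BOUNDS]
    for r in r_values:
        if not 0.0 < r <= 4.0:
            raise ValueError("R must lie in (0, 4]")
        buckets[bisect_left(UPPER_BOUNDS, r)].append(r)
    # max() returns the first bucket with the largest count C_k, which
    # implements the assumed tie rule (lowest interval wins).
    fullest = max(buckets, key=len)
    return median(fullest) if use_median else mean(fullest)


# Hypothetical index sequence of one network individual.
print(index_level([0.5, 2.0, 2.5, 3.8]))
print(index_level([1.2, 2.0, 2.9, 3.6], use_median=True))
```

Summarizing with the modal interval rather than a global mean keeps the level inside a single region, so each individual can be assigned to exactly one of the four bands in the group comparison layout.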
6. The method for predicting and visualizing emotional cognitive abilities of individuals or groups on a network according to claim 4 or 5, wherein: the visual layout is completed by the following steps:
step 3-1) constructing a geometric figure in a two-dimensional rectangular coordinate system with the network individual λ as the horizontal axis and $R_{center}$ as the vertical axis, and dividing the coordinate system into four regions according to $R_{center} \in \{(0,1], (1,3], (3,3.57], (3.57,4]\}$;
step 3-2) marking points in the coordinate system according to the emotion cognitive ability index level sequence of the network individuals, wherein each point consists of three attributes, of which F denotes the label symbol selected when the point is drawn.
7. The method for predicting and visualizing emotional cognitive abilities of individuals or groups according to claim 6, wherein: F is the avatar of the network individual, or a schematic cartoon icon or symbol.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410795679.8A CN104636425B (en) 2014-12-18 2014-12-18 A kind of network individual or colony's Emotion recognition ability prediction and method for visualizing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410795679.8A CN104636425B (en) 2014-12-18 2014-12-18 A kind of network individual or colony's Emotion recognition ability prediction and method for visualizing

Publications (2)

Publication Number Publication Date
CN104636425A CN104636425A (en) 2015-05-20
CN104636425B true CN104636425B (en) 2018-02-13

Family

ID=53215171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410795679.8A Expired - Fee Related CN104636425B (en) 2014-12-18 2014-12-18 A kind of network individual or colony's Emotion recognition ability prediction and method for visualizing

Country Status (1)

Country Link
CN (1) CN104636425B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105022725B (en) * 2015-07-10 2018-04-20 河海大学 A kind of text emotion trend analysis method applied to finance Web fields
CN105138570B (en) * 2015-07-26 2019-02-05 吉林大学 The doubtful crime degree calculation method of network speech data
CN105159879A (en) * 2015-08-26 2015-12-16 北京理工大学 Automatic determination method for network individual or group values
CN105843792B (en) * 2015-10-26 2018-12-21 北京宏博知微科技有限公司 A kind of synthesis emotion measure of network event
CN105740565A (en) * 2016-02-16 2016-07-06 合肥学院 Automobile model derivation method based on natural language processing
CN105786991B (en) * 2016-02-18 2019-03-15 中国科学院自动化研究所 In conjunction with the Chinese emotion new word identification method and system of user feeling expression way
CN109977101B (en) * 2016-05-24 2022-01-25 甘肃百合物联科技信息有限公司 Method and system for enhancing memory
CN106095777A (en) * 2016-05-26 2016-11-09 优品财富管理有限公司 The many empty sentiment indicator methods of prediction securities markets based on big data
CN106202047A (en) * 2016-07-15 2016-12-07 国家计算机网络与信息安全管理中心 A kind of character personality depicting method based on microblogging text
CN106775665B (en) * 2016-11-29 2021-02-19 竹间智能科技(上海)有限公司 Emotional state change information obtaining method and device based on emotional indexes
CN109218512A (en) * 2017-07-06 2019-01-15 新华网股份有限公司 Mobile terminal user emotion detection method and system and mobile terminal
CN107862087B (en) * 2017-12-01 2022-02-18 深圳爱数云科技有限公司 Emotion analysis method and device based on big data and deep learning and storage medium
CN108400810B (en) * 2018-01-31 2020-09-25 中国人民解放军陆军工程大学 Communication satellite frequency resource visualization management method based on time frequency
CN108388608B (en) * 2018-02-06 2020-08-04 金蝶软件(中国)有限公司 Emotion feedback method and device based on text perception, computer equipment and storage medium
CN108549630B (en) * 2018-03-29 2021-07-30 西安影视数据评估中心有限公司 Method for identifying turning points of film and television script stories
CN109857852B (en) * 2019-01-24 2021-02-23 安徽商贸职业技术学院 Method and system for screening and judging characteristics of E-commerce online comment training set
CN110083726B (en) * 2019-03-11 2021-10-22 北京比速信息科技有限公司 Destination image perception method based on UGC picture data
CN112232197A (en) * 2020-10-15 2021-01-15 武汉微派网络科技有限公司 Juvenile identification method, device and equipment based on user behavior characteristics
CN113761146A (en) * 2021-01-05 2021-12-07 北京沃东天骏信息技术有限公司 Method and device for recognizing emotional fluctuation of customer

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324662A (en) * 2013-04-18 2013-09-25 中国科学院计算技术研究所 Visual method and equipment for dynamic view evolution of social media event
CN103559176A (en) * 2012-10-29 2014-02-05 中国人民解放军国防科学技术大学 Microblog emotional evolution analysis method and system
CN104216873A (en) * 2014-08-27 2014-12-17 华中师范大学 Method for analyzing network left word emotion fluctuation characteristics of emotional handicap sufferer

Also Published As

Publication number Publication date
CN104636425A (en) 2015-05-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180213

Termination date: 20181218
