CN106484846A - A monitoring method for network public-opinion big data - Google Patents

A monitoring method for network public-opinion big data

Info

Publication number
CN106484846A
Authority
CN
China
Prior art keywords
public sentiment
public
data
network
big data
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610877340.1A
Other languages
Chinese (zh)
Inventor
晋彤
李永康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Special Road Mdt Infotech Ltd
Original Assignee
Guangzhou Special Road Mdt Infotech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-09-30
Filing date
2016-09-30
Publication date
2017-03-08
Application filed by Guangzhou Special Road Mdt Infotech Ltd
Priority to CN201610877340.1A
Publication of CN106484846A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a monitoring method for network public-opinion big data, comprising the following steps: S1, data acquisition; S2, data preprocessing module; S3, public-opinion classification module; S4, public-opinion sensitivity calculation module; S5, public-opinion deduction module. The method overcomes the limited data sources of current public-opinion monitoring applications. It also mines, from a large volume of public-opinion data, the early traces left when public opinion starts to develop and, through scientific reasoning and a probabilistic model, has the computer predict in which direction an event will evolve and which factors are key.

Description

A monitoring method for network public-opinion big data
Technical field
The present invention relates to the field of network public-opinion monitoring, and more particularly to a monitoring method for network public-opinion big data.
Background
With the rapid spread of the Internet, people have grown used to expressing their views online. Because the network is vast and its users are largely invisible to one another, views are expressed more truthfully and boldly, and network public opinion has gradually drawn wide attention. Network public opinion has certain regional characteristics, and hot topics online are also hot topics in society. Finding the connection between network public opinion and social public opinion, and linking the spread of public opinion on the network with its spread across geographic locations, is one research trend in network public opinion.
However, current public-opinion monitoring applications are limited in their data sources. Most current monitoring systems are confined to one particular network form or class of network forms, so monitoring is incomplete. The prior art also remains in the Web 2.0 era: it cannot obtain information sources from the many social tools, and so cannot determine at which node a public-opinion event started, at which point its propagation turned, what path it followed, and so on.
Meanwhile, the main domestic monitoring approach relies on real-time monitoring and after-the-fact handling, and has not yet attempted to predict and analyze the trend of public opinion.
Summary of the invention
The present invention addresses the following defects of the prior art: current public-opinion monitoring applications are limited in their data sources; most current monitoring systems are confined to one particular network form or class of network forms, so monitoring is incomplete; and the prior art remains in the Web 2.0 era, cannot obtain information sources from the many social tools, and so cannot determine at which node a public-opinion event started, at which point its propagation turned, or what path it followed. To address these defects, the invention provides a monitoring method for network public-opinion big data.
The technical solution that the present invention provides for the above technical problem is as follows:
The invention provides a monitoring method for network public-opinion big data, comprising the following steps:
S1, data acquisition: collecting Internet news, forum information, and public-opinion content posted by users on the Internet;
S2, data preprocessing module: preprocessing the collected Internet public-opinion text, including noise filtering according to user level, text word segmentation, vector representation, and feature extraction;
S3, public-opinion classification module: classifying the preprocessed public-opinion data based on the similarity between public-opinion topics;
S4, public-opinion sensitivity calculation module: for the classified public-opinion topics, calculating a public-opinion sensitivity value by combining network attribute information and user level;
S5, public-opinion deduction module: mining, from a large volume of public-opinion big data, the early traces left when public opinion starts to develop and, through scientific reasoning and a probabilistic model, having the computer predict in which direction the event will evolve and which factors are key.
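The patent does not give a formula for the sensitivity value of step S4. The Python sketch below shows one hypothetical way to combine user level with network attributes. The `Post` fields, the 1 to 5 level scale, and the log-damped engagement term are all assumptions made for illustration, not the patented calculation.

```python
import math
from dataclasses import dataclass


@dataclass
class Post:
    user_level: int  # hypothetical account grade, 1 (lowest) to 5 (highest)
    reposts: int     # hypothetical network attribute
    comments: int    # hypothetical network attribute


def sensitivity(posts, max_level=5):
    """Sum a per-post score: normalised user level times damped engagement."""
    total = 0.0
    for post in posts:
        level_weight = post.user_level / max_level
        # log1p keeps a single viral post from dominating the topic score
        engagement = math.log1p(post.reposts + post.comments)
        total += level_weight * (1.0 + engagement)
    return total
```

Under these assumptions, a topic discussed by higher-level users with more reposts and comments receives a higher value, which matches the qualitative description in S4.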
Preferably, the data acquisition in step S1 is performed by several different servers, each of which runs multiple distinct text-collection processes, to collect the public-opinion big data.
Preferably, the noise filtering according to user level in step S2 further includes: obtaining network semantic data and user association data, and deleting useless information.
Preferably, in step S2, the data preprocessing module also compiles statistics on the data variation values of the public-opinion big data.
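The preprocessing chain of step S2 (noise filtering by user level, word segmentation, vector representation, feature extraction) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `(user_level, text)` input shape, the `min_level` cut-off, and the use of raw term counts are hypothetical, and the regular expression is only a stand-in for a real word segmenter, which Chinese text would require.

```python
import re
from collections import Counter


def preprocess(posts, min_level=2, vocab_size=5):
    """posts: list of (user_level, text). Returns (vocabulary, vectors)."""
    # Noise filtering: drop posts from accounts below the assumed level cut-off
    kept = [text for level, text in posts if level >= min_level]
    # Word segmentation (stand-in): split on word characters, lower-cased
    tokenised = [re.findall(r"\w+", text.lower()) for text in kept]
    # Feature extraction: keep the most frequent terms across all kept posts
    counts = Counter(tok for toks in tokenised for tok in toks)
    vocabulary = [word for word, _ in counts.most_common(vocab_size)]
    # Vector representation: one term-count vector per kept post
    vectors = [[toks.count(word) for word in vocabulary] for toks in tokenised]
    return vocabulary, vectors
```

The vectors produced here are the kind of input that a similarity-based classification step such as S3 could consume.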
The public-opinion deduction module in step S5 mines, from a large volume of public-opinion data, the early traces left when public opinion starts to develop and, through scientific reasoning and a probabilistic model, has the computer predict in which direction an event will evolve and which factors are key. One example is the public-opinion storm that a topic holding netizens' attention over a period of time may cause.
On the trend graph of public opinion that has already occurred, we can click any important time point to run a deduction. When a deduction is created, the system automatically analyzes all the factors affecting the public-opinion event at that time point, including:
Whether reports of the event have appeared on forums
Whether online promoters, rights-defense lawyers, or media professionals have intervened
The level of netizen attention
How actively self-media accounts are participating
How widely the media have spread it (degree of diffusion)
Whether top websites have published articles on it
The degree of exposure in traditional broadcast media
In addition to the influence factors, we also compile weight statistics on the measures taken. For example, if we issue a press release or make an official clarification, the impact of the public opinion may be reduced or the growth of negative opinion slowed. From the combined effect of "influence factors + disposal measures", we can project whether the next step of the public-opinion trend will rise, fall, or stay flat, and thereby judge the effectiveness of the measures taken.
Deduction is like weather forecasting: it can only give a simulated judgment of the public-opinion trend over the next 3 days, and unknowable future variables can likewise affect the true trend. As with weather forecasting, accuracy is approached only through the continued accumulation of data and historical cases.
The monitoring method provided by the present invention overcomes the defects of current public-opinion monitoring applications: limited data sources; monitoring systems mostly confined to one particular network form or class of network forms, which makes monitoring incomplete; and prior art that remains in the Web 2.0 era and cannot obtain information sources from the many social tools, and so cannot determine at which node a public-opinion event started, at which point its propagation turned, or what path it followed. With the method, the starting node, the turning points, and the propagation path of a public-opinion event can be identified, forming a complete public-opinion monitoring and tracing system. Specific government departments can use the method to clean up Internet information and build a healthy, green online environment. The method can also detect specified network hot spots in time and mine potential commercial value from them, which facilitates commercial development. Finally, it mines from a large volume of public-opinion data the early traces left when public opinion starts to develop and, through scientific reasoning and a probabilistic model, has the computer predict in which direction an event will evolve and which factors are key.
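The "influence factors + disposal measures" projection can be illustrated with a small weighted-sum sketch in Python. The patent names the factors and measures but gives no weights or model, so the factor names, the weight values, the `threshold`, and the linear form below are all hypothetical choices made only to show the idea.

```python
# Hypothetical weights; the patent does not specify any values.
FACTOR_WEIGHTS = {"forum_report": 0.3, "netizen_attention": 0.5}
MEASURE_WEIGHTS = {"press_release": 0.4, "official_clarification": 0.5}


def deduce_trend(factors, measures, threshold=0.1):
    """Project the next step of the trend as 'rise', 'fall', or 'flat'.

    factors:  dict of factor name -> observed strength in [0, 1]
    measures: list of disposal measures that have been taken
    """
    push = sum(FACTOR_WEIGHTS[name] * value for name, value in factors.items())
    damp = sum(MEASURE_WEIGHTS[name] for name in measures)
    net = push - damp
    if net > threshold:
        return "rise"
    if net < -threshold:
        return "fall"
    return "flat"
```

Comparing the projection with and without a given measure is one simple way to express the "effectiveness of the measures taken" that the text describes; a real system would presumably fit such weights from historical cases.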
Brief description of the drawings
Fig. 1 is the monitoring flow chart of the present invention.
Specific embodiments
To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the accompanying drawing and specific embodiments.
A monitoring method for network public-opinion big data, as shown in Fig. 1, comprises the following steps:
S1, data acquisition: collecting Internet news, forum information, and public-opinion content posted by users on the Internet;
S2, data preprocessing module: preprocessing the collected Internet public-opinion text, including noise filtering according to user level, text word segmentation, vector representation, and feature extraction;
S3, public-opinion classification module: classifying the preprocessed public-opinion data based on the similarity between public-opinion topics;
S4, public-opinion sensitivity calculation module: for the classified public-opinion topics, calculating a public-opinion sensitivity value by combining network attribute information and user level;
S5, public-opinion deduction module: mining, from a large volume of public-opinion big data, the early traces left when public opinion starts to develop and, through scientific reasoning and a probabilistic model, having the computer predict in which direction the event will evolve and which factors are key.
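Step S3 groups topics by similarity but does not name a similarity measure or a grouping algorithm. As one possible reading, the Python sketch below uses cosine similarity over term vectors and a single-pass assignment; both choices, and the `threshold` value, are assumptions rather than the patented procedure.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(vectors, threshold=0.8):
    """Single pass: join the first topic whose seed is similar enough,
    otherwise open a new topic. Returns one topic label per vector."""
    seeds, labels = [], []
    for vec in vectors:
        for topic_id, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                labels.append(topic_id)
                break
        else:
            seeds.append(vec)
            labels.append(len(seeds) - 1)
    return labels
```

A single-pass scheme is order-dependent but cheap, which may suit a stream of incoming posts; batch clustering would be an equally valid reading of the step.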
Preferably, the data acquisition in step S1 is performed by several different servers, each of which runs multiple distinct text-collection processes, to collect the public-opinion big data.
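To illustrate running several distinct collectors side by side on one server, the Python sketch below launches three placeholder collectors concurrently and merges their output. The collector functions and their return values are hypothetical stubs (no network access), and worker threads stand in here for the separate collection processes that the text describes.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stub collectors; real ones would fetch and parse pages.
def collect_news():
    return [("news", "placeholder headline")]


def collect_forum():
    return [("forum", "placeholder thread")]


def collect_user_posts():
    return [("user", "placeholder post")]


COLLECTORS = [collect_news, collect_forum, collect_user_posts]


def run_collectors(collectors=COLLECTORS):
    """Run every collector concurrently and flatten the results in order."""
    with ThreadPoolExecutor(max_workers=len(collectors)) as pool:
        batches = list(pool.map(lambda collector: collector(), collectors))
    return [item for batch in batches for item in batch]
```

Each server in a deployment could run its own set of collectors in this way, with the merged records passed on to preprocessing.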
Preferably, the noise filtering according to user level in step S2 further includes: obtaining network semantic data and user association data, and deleting useless information.
Preferably, in step S2, the data preprocessing module also compiles statistics on the data variation values of the public-opinion big data.
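The text does not define "data variation values". One plausible interpretation, assumed here purely for illustration, is the change in post volume between consecutive time windows. The Python sketch below computes the absolute and relative change for each pair of windows.

```python
def variation(counts):
    """counts: post counts for consecutive time windows.

    Returns a list of (absolute_change, relative_change) pairs, one per
    consecutive pair of windows. relative_change is None when the earlier
    window is empty, since the ratio is undefined.
    """
    changes = []
    for previous, current in zip(counts, counts[1:]):
        delta = current - previous
        rate = delta / previous if previous else None
        changes.append((delta, rate))
    return changes
```

For example, `variation([100, 150, 120])` yields a 50% increase followed by a 20% decrease, the kind of signal that could mark a turning point on a trend graph.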
The public-opinion deduction module in step S5 mines, from a large volume of public-opinion data, the early traces left when public opinion starts to develop and, through scientific reasoning and a probabilistic model, has the computer predict in which direction an event will evolve and which factors are key. One example is the public-opinion storm that a topic holding netizens' attention over a period of time may cause.
On the trend graph of public opinion that has already occurred, we can click any important time point to run a deduction. When a deduction is created, the system automatically analyzes all the factors affecting the public-opinion event at that time point, including:
Whether reports of the event have appeared on forums
Whether online promoters, rights-defense lawyers, or media professionals have intervened
The level of netizen attention
How actively self-media accounts are participating
How widely the media have spread it (degree of diffusion)
Whether top websites have published articles on it
The degree of exposure in traditional broadcast media
In addition to the influence factors, we also compile weight statistics on the measures taken. For example, if we issue a press release or make an official clarification, the impact of the public opinion may be reduced or the growth of negative opinion slowed. From the combined effect of "influence factors + disposal measures", we can project whether the next step of the public-opinion trend will rise, fall, or stay flat, and thereby judge the effectiveness of the measures taken.
Deduction is like weather forecasting: it can only give a simulated judgment of the public-opinion trend over the next 3 days, and unknowable future variables can likewise affect the true trend. As with weather forecasting, accuracy is approached only through the continued accumulation of data and historical cases.
The monitoring method provided by the present invention overcomes the defects of current public-opinion monitoring applications: limited data sources; monitoring systems mostly confined to one particular network form or class of network forms, which makes monitoring incomplete; and prior art that remains in the Web 2.0 era and cannot obtain information sources from the many social tools, and so cannot determine at which node a public-opinion event started, at which point its propagation turned, or what path it followed. With the method, the starting node, the turning points, and the propagation path of a public-opinion event can be identified, forming a complete public-opinion monitoring and tracing system. Specific government departments can use the method to clean up Internet information and build a healthy, green online environment. The method can also detect specified network hot spots in time and mine potential commercial value from them, which facilitates commercial development. Finally, it mines from a large volume of public-opinion data the early traces left when public opinion starts to develop and, through scientific reasoning and a probabilistic model, has the computer predict in which direction an event will evolve and which factors are key.
The embodiments of the present invention have been described above with reference to the accompanying drawing, but the invention is not limited to the specific embodiments above, which are illustrative rather than restrictive. Under the teaching of the present invention, those of ordinary skill in the art can devise many further forms without departing from the spirit of the invention and the scope protected by the claims, and all of these fall within the protection of the present invention.

Claims (4)

1. A monitoring method for network public-opinion big data, characterized in that the method comprises the following steps:
S1, data acquisition: collecting Internet news, forum information, and public-opinion content posted by users on the Internet;
S2, data preprocessing module: preprocessing the collected Internet public-opinion text, including noise filtering according to user level, text word segmentation, vector representation, and feature extraction;
S3, public-opinion classification module: classifying the preprocessed public-opinion data based on the similarity between public-opinion topics;
S4, public-opinion sensitivity calculation module: for the classified public-opinion topics, calculating a public-opinion sensitivity value by combining network attribute information and user level;
S5, public-opinion deduction module: mining, from a large volume of public-opinion big data, the early traces left when public opinion starts to develop and, through scientific reasoning and a probabilistic model, having the computer predict in which direction the event will evolve and which factors are key.
2. The monitoring method for network public-opinion big data according to claim 1, characterized in that the data acquisition in step S1 is performed by several different servers, each of which runs multiple distinct text-collection processes, to collect the public-opinion big data.
3. The monitoring method for network public-opinion big data according to claim 1, characterized in that the noise filtering according to user level in step S2 further includes: obtaining network semantic data and user association data, and deleting useless information.
4. The monitoring method for network public-opinion big data according to claim 1, characterized in that, in step S2, the data preprocessing module also compiles statistics on the data variation values of the public-opinion big data.
CN201610877340.1A 2016-09-30 2016-09-30 A kind of monitoring method of network public-opinion big data Pending CN106484846A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610877340.1A CN106484846A (en) 2016-09-30 2016-09-30 A kind of monitoring method of network public-opinion big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610877340.1A CN106484846A (en) 2016-09-30 2016-09-30 A kind of monitoring method of network public-opinion big data

Publications (1)

Publication Number Publication Date
CN106484846A true CN106484846A (en) 2017-03-08

Family

ID=58268548

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610877340.1A Pending CN106484846A (en) 2016-09-30 2016-09-30 A kind of monitoring method of network public-opinion big data

Country Status (1)

Country Link
CN (1) CN106484846A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107918633A (en) * 2017-03-23 2018-04-17 广州思涵信息科技有限公司 Sensitive public sentiment content identification method and early warning system based on semantic analysis technology
CN109376195A (en) * 2018-11-14 2019-02-22 重庆理工大学 For online social network data mining model numerical value mechanism validation verification method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103885993A (en) * 2012-12-24 2014-06-25 北大方正集团有限公司 Public opinion monitoring method and device for microblog
CN104408157A (en) * 2014-12-05 2015-03-11 四川诚品电子商务有限公司 Funnel type data gathering, analyzing and pushing system and method for online public opinion
CN104809252A (en) * 2015-05-20 2015-07-29 成都布林特信息技术有限公司 Internet data extraction system

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107918633A (en) * 2017-03-23 2018-04-17 广州思涵信息科技有限公司 Sensitive public sentiment content identification method and early warning system based on semantic analysis technology
CN107918633B (en) * 2017-03-23 2021-07-02 广州思涵信息科技有限公司 Sensitive public opinion content identification method and early warning system based on semantic analysis technology
CN109376195A (en) * 2018-11-14 2019-02-22 重庆理工大学 For online social network data mining model numerical value mechanism validation verification method
CN109376195B (en) * 2018-11-14 2019-11-05 重庆理工大学 For online social network data mining model numerical value mechanism validation verification method

Similar Documents

Publication Publication Date Title
CN102214241B (en) Method for detecting burst topic in user generation text stream based on graph clustering
CN103617169B (en) A kind of hot microblog topic extracting method based on Hadoop
CN104239539B (en) A kind of micro-blog information filter method merged based on much information
CN103745000B (en) Hot topic detection method of Chinese micro-blogs
Alsaedi et al. Arabic event detection in social media
CN102790762A (en) Phishing website detection method based on uniform resource locator (URL) classification
CN103345524B (en) Method and system for detecting microblog hot topics
CN110134849A (en) A kind of network public-opinion monitoring method and system
CN101674264B (en) Spam detection device and method based on user relationship mining and credit evaluation
CN108399241B (en) Emerging hot topic detection system based on multi-class feature fusion
CN101408883A (en) Method for collecting network public feelings viewpoint
CN103294818A (en) Multi-information fusion microblog hot topic detection method
CN107609103A (en) It is a kind of based on push away spy event detecting method
CN104933622A (en) Microblog popularity degree prediction method based on user and microblog theme and microblog popularity degree prediction system based on user and microblog theme
CN111538888A (en) Network public opinion intensity evolution analysis system based on active monitoring engine and big data
CN103793503A (en) Opinion mining and classification method based on web texts
CN103268350A (en) Internet public opinion information monitoring system and monitoring method
CN102739679A (en) URL(Uniform Resource Locator) classification-based phishing website detection method
CN101661513A (en) Detection method of network focus and public sentiment
CN103037339A (en) Short message filtering method based on user creditworthiness and short message spam degree
CN103336766A (en) Short text garbage identification and modeling method and device
CN106055604A (en) Short text topic model mining method based on word network to extend characteristics
CN103914491A (en) Data excavating method and system for high quality user generation content (UGC)
CN101393555A (en) Rubbish blog detecting method
CN103324745A (en) Text garbage identifying method and system based on Bayesian model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170308