CN104850549A - Method for monitoring public opinions on Internet - Google Patents

Publication number
CN104850549A
Authority
CN
China
Prior art keywords
opinion
link
supervising
network public
public
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410050402.2A
Other languages
Chinese (zh)
Inventor
屠巍瀚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
XIYI DIGITAL TECHNOLOGY (SHANGHAI) Co Ltd
Original Assignee
XIYI DIGITAL TECHNOLOGY (SHANGHAI) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by XIYI DIGITAL TECHNOLOGY (SHANGHAI) Co Ltd
Priority to CN201410050402.2A
Publication of CN104850549A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method for monitoring public opinion on the Internet. The method comprises the steps of: S1, generating an acquisition script, analyzing the source code of a web page according to the acquisition script, and grabbing links; S2, storing the grabbed links in a link pool and processing the link pool in dequeue fashion; S3, periodically acquiring the data referenced by the link pool through an acquisition cluster, and storing the acquired data as page snapshots in a database; and S4, performing concurrent periodic searches of the stored page snapshots with a search server according to user needs to obtain search results, and monitoring public opinion on the Internet according to the search results. With this method, the starting node of a public opinion event, the turning points of its propagation, its propagation path and the like can be determined, yielding a complete system for monitoring and tracing public opinion.

Description

Method for monitoring online public opinion
Technical field
The present invention relates to the field of online public opinion monitoring, and in particular to a method for monitoring online public opinion.
Background
As the Internet has become widespread, people have grown increasingly accustomed to expressing their views online. Because the network is vast and its users are largely invisible to one another, views are expressed more candidly and boldly, and online public opinion has gradually attracted wide attention. Online public opinion has a regional character, and hot topics online are also hot topics in society. Finding the connection between online public opinion and public opinion in society, and linking the propagation of opinion across the network with its propagation across geographic locations, is a current research trend.
At present, however, public opinion monitoring applications are limited in their data sources. Most current monitoring systems are confined to one particular network form or one class of network forms, so monitoring is not comprehensive. Moreover, the prior art remains in the Web 2.0 era: it cannot obtain information sources from the large number of social tools, and it cannot determine where a public opinion event first started, where the turning points in its propagation lie, or what path the propagation followed.
Summary of the invention
The present invention addresses the following defects of the prior art: public opinion monitoring applications are limited in their data sources; most current monitoring systems are confined to one particular network form or one class of network forms, so monitoring is not comprehensive; and the prior art remains in the Web 2.0 era, cannot obtain information sources from the large number of social tools, and cannot determine where a public opinion event first started, where the turning points in its propagation lie, or what path the propagation followed. To overcome these defects, the invention provides a method for monitoring online public opinion.
The technical solution provided by the present invention for the above technical problem is as follows:
The invention provides a method for monitoring online public opinion, comprising the following steps:
S1, generating an acquisition script, analyzing the source code of a web page according to the acquisition script, and grabbing links;
S2, after the links have been grabbed, storing the obtained links in a link pool, and processing the link pool in dequeue fashion;
S3, periodically acquiring, through an acquisition cluster, the data referenced by the links in the link pool, and storing the acquired data as page snapshots in a database;
S4, performing, by a search server, concurrent periodic searches of the stored page snapshots according to the keywords required by the user to obtain search results, and completing the monitoring of online public opinion according to the search results.
In the method of the present invention, the acquisition script in step S1 comprises PHP acquisition scripts written for the major news websites, microblogs, and forums, or a general-purpose PHP acquisition script adapted to all types of pages.
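As an illustration of the link grabbing in step S1, the following is a minimal sketch only. The patent specifies PHP acquisition scripts and gives no code; Python is used here for illustration, and the names `LinkExtractor` and `grab_links` as well as the example URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag found in a page's source code."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative addresses against the URL of the page itself.
                self.links.append(urljoin(self.base_url, value))


def grab_links(page_source, base_url):
    """Step S1 sketch: analyze page source code and return the links it contains."""
    parser = LinkExtractor(base_url)
    parser.feed(page_source)
    return parser.links


# Hypothetical page fragment with one relative and one absolute link.
sample = '<a href="/news/1.html">A</a> <a href="http://forum.example.com/t/2">B</a>'
links = grab_links(sample, "http://news.example.com/index.html")
```

A site-specific script would presumably narrow this down to the parts of a page that carry article or post links; the sketch simply takes every anchor.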
In the method of the present invention, the acquisition cluster in step S3 is distributed across different Linux servers, each of which runs multiple distinct PHP acquisition processes that acquire the data referenced by the link pool.
In the method of the present invention, step S3 comprises:
the acquisition cluster converting the image and link addresses in the page source code, extracting keywords, storing the keywords in the database, and updating the page data at specified times.
In the method of the present invention, the search server is a Sphinx search server.
In the method of the present invention, completing the monitoring of online public opinion according to the search results in step S4 comprises filing the retrieved content that contains the keywords required by the user according to a predefined rule, or sending it immediately to the client by SMS, e-mail, or similar means.
In the method of the present invention, filing the retrieved content that contains the keywords required by the user according to a predefined rule comprises filing that content in chronological order or according to its propagation path.
The method for monitoring online public opinion provided by the present invention overcomes the following defects: public opinion monitoring applications are limited in their data sources; most current monitoring systems are confined to one particular network form or one class of network forms, so monitoring is not comprehensive; and the prior art remains in the Web 2.0 era, cannot obtain information sources from the large number of social tools, and cannot determine where a public opinion event first started, where the turning points in its propagation lie, or what path the propagation followed. With the method, the node at which a public opinion event first started, the turning points in its propagation, its propagation path and the like can be determined, forming a complete system for monitoring and tracing public opinion. Specific government departments can use the method to clean up information on the Internet and build a healthy online environment. In addition, specified online hot topics can be discovered promptly and their potential commercial value mined, which facilitates commercial development.
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments, in which:
Fig. 1 is a flowchart of the method for monitoring online public opinion according to an embodiment of the present invention.
Detailed description
To help those of ordinary skill in the art understand and implement the present invention, it is described in more detail below with reference to the drawings and specific embodiments.
The present invention addresses the following defects: public opinion monitoring applications are limited in their data sources; most current monitoring systems are confined to one particular network form or one class of network forms, so monitoring is not comprehensive; and the prior art remains in the Web 2.0 era, cannot obtain information sources from the large number of social tools, and cannot determine where a public opinion event first started, where the turning points in its propagation lie, or what path the propagation followed. To overcome these defects, it discloses a method for monitoring online public opinion.
Fig. 1 shows a flowchart of the method for monitoring online public opinion according to an embodiment of the present invention. In the method provided by the embodiment, public opinion monitoring means collecting, by means of PHP scripts, the information published on the various information-publishing platforms on the Internet, storing it in a MySQL database for real-time analysis, and then issuing real-time public opinion notifications by means of the keyword search engine Sphinx. The method comprises the following steps:
S1, generating an acquisition script, analyzing the source code of a web page according to the acquisition script, and grabbing links;
S2, after the links have been grabbed, storing the obtained links in a link pool, and processing the link pool in dequeue fashion;
S3, periodically acquiring, through an acquisition cluster, the data referenced by the links in the link pool, and storing the acquired data as page snapshots in a database;
S4, performing, by a search server, concurrent periodic searches of the stored page snapshots according to the keywords required by the user to obtain search results, and completing the monitoring of online public opinion according to the search results.
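The link pool of step S2, which is filled by the grabbing scripts and emptied in dequeue fashion, might be sketched as follows. This is an in-memory stand-in written in Python for illustration; the class name `LinkPool` is hypothetical, and the deduplication of links is an added assumption that the patent does not state.

```python
from collections import deque


class LinkPool:
    """In-memory stand-in for the link pool: first in, first out."""

    def __init__(self):
        self._queue = deque()
        self._seen = set()

    def put(self, link):
        # Skipping links already seen is an assumption, not part of the source text.
        if link not in self._seen:
            self._seen.add(link)
            self._queue.append(link)

    def get(self):
        """Dequeue the oldest link, or return None when the pool is empty."""
        return self._queue.popleft() if self._queue else None

    def __len__(self):
        return len(self._queue)


pool = LinkPool()
for url in ["http://a.example/1", "http://a.example/2", "http://a.example/1"]:
    pool.put(url)
first = pool.get()  # the oldest link leaves the pool first
```

A shared queue service would take the place of the `deque` when several machines consume the same pool.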
Preferably, in the method provided by the embodiment of the present invention, the acquisition script in step S1 comprises PHP acquisition scripts written for the major news websites, microblogs, and forums, or a general-purpose PHP acquisition script adapted to all types of pages.
Preferably, in the method provided by the embodiment of the present invention, the acquisition cluster in step S3 is distributed across different Linux servers, each of which runs multiple distinct PHP acquisition processes that acquire the data referenced by the link pool.
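The idea of several acquisition processes draining one shared link pool can be illustrated on a single machine. The sketch below uses Python threads in place of PHP processes on separate Linux servers, which is a simplification; `run_workers` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real page download.

```python
import queue
import threading


def run_workers(links, fetch, worker_count=3):
    """Drain a shared link pool with several workers and return {link: snapshot}."""
    pool = queue.Queue()
    for link in links:
        pool.put(link)

    snapshots = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                link = pool.get_nowait()
            except queue.Empty:
                return  # pool drained, this worker exits
            page = fetch(link)
            with lock:  # protect the shared snapshot store
                snapshots[link] = page

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return snapshots


def fake_fetch(link):
    # Placeholder for a real HTTP download of the page behind the link.
    return "<html>snapshot of " + link + "</html>"


result = run_workers(["http://a.example/1", "http://b.example/2"], fake_fetch)
```

Because each link is dequeued exactly once, no two workers acquire the same page in one pass.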
Preferably, in the method provided by the embodiment of the present invention, step S3 comprises:
the acquisition cluster converting the image and link addresses in the page source code, extracting keywords, storing the keywords in the database, and updating the page data at specified times.
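The patent does not say how the image and link addresses are converted or how keywords are extracted. One plausible reading, sketched below in Python, is that relative addresses are rewritten as absolute ones so that a stored snapshot still resolves, and that keywords are picked by frequency. Both readings, the function names `absolutize` and `extract_keywords`, and the English-only word pattern are assumptions made for illustration.

```python
import re
from collections import Counter
from urllib.parse import urljoin

# Matches src="..." and href="..." attributes written with double quotes.
ATTR = re.compile(r'(?P<attr>\b(?:src|href))="(?P<url>[^"]*)"')


def absolutize(page_source, page_url):
    """Rewrite src/href values as absolute URLs relative to the page's own URL."""
    def repl(match):
        return '{}="{}"'.format(match.group("attr"), urljoin(page_url, match.group("url")))
    return ATTR.sub(repl, page_source)


def extract_keywords(page_source, top_n=3):
    """Naive keyword extraction: the most frequent words of four or more letters."""
    text = re.sub(r"<[^>]+>", " ", page_source)  # drop tags, keep visible text
    words = re.findall(r"[a-z]{4,}", text.lower())
    return [word for word, _ in Counter(words).most_common(top_n)]


html = '<img src="img/a.png"><a href="/t/2">flood flood report</a>'
snapshot = absolutize(html, "http://news.example.com/city/index.html")
keywords = extract_keywords(html)
```

Chinese-language pages would need a word segmenter instead of the letter pattern used here.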
Preferably, in the method provided by the embodiment of the present invention, the search server is a Sphinx search server.
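The concurrent keyword search of step S4 can be illustrated without a search server. The sketch below is not the Sphinx API: it runs one plain substring search per keyword in parallel over an in-memory dictionary of snapshots. The function names and sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def search_snapshots(snapshots, keyword):
    """Return the sorted URLs whose stored snapshot text contains the keyword."""
    return sorted(url for url, text in snapshots.items() if keyword in text)


def concurrent_search(snapshots, keywords, max_workers=4):
    """Run one search per keyword in parallel and return {keyword: [urls]}."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        hits = executor.map(lambda kw: search_snapshots(snapshots, kw), keywords)
        return dict(zip(keywords, hits))


snapshots = {
    "http://a.example/1": "flood warning issued",
    "http://b.example/2": "market update",
    "http://c.example/3": "flood relief",
}
results = concurrent_search(snapshots, ["flood", "market"])
```

Calling `concurrent_search` on a timer would give the periodic behavior the step describes; a full-text index replaces the substring scan once the snapshot store grows large.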
Preferably, in the method provided by the embodiment of the present invention, completing the monitoring of online public opinion according to the search results in step S4 comprises filing the retrieved content that contains the keywords required by the user according to a predefined rule, or sending it immediately to the client by SMS, e-mail, or similar means.
Preferably, in the method provided by the embodiment of the present invention, filing the retrieved content that contains the keywords required by the user according to a predefined rule comprises filing that content in chronological order or according to its propagation path.
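The two filing rules, chronological order and propagation path, could look like the following Python sketch. The patent does not describe how a propagation path is recorded; the `reposted_from` field, the record layout, and the function names are hypothetical and serve only to make the idea concrete.

```python
def archive_by_time(hits):
    """File hits in chronological order, earliest first."""
    return sorted(hits, key=lambda hit: hit["time"])


def propagation_path(hits, url):
    """Follow hypothetical 'reposted_from' references from a hit back to its origin."""
    by_url = {hit["url"]: hit for hit in hits}
    path = []
    seen = set()  # guards against reference cycles
    while url in by_url and url not in seen:
        seen.add(url)
        path.append(url)
        url = by_url[url].get("reposted_from")
    return list(reversed(path))  # origin first, queried hit last


hits = [
    {"url": "c", "time": 3, "reposted_from": "b"},
    {"url": "a", "time": 1},
    {"url": "b", "time": 2, "reposted_from": "a"},
]
ordered = [hit["url"] for hit in archive_by_time(hits)]
path = propagation_path(hits, "c")
```

The first element of `path` corresponds to the node at which the event started, under the assumption that repost references are available.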
The principle of the present invention is explained below by way of a more specific embodiment:
First, dedicated PHP acquisition scripts are written for the major news websites, microblogs, and forums, or a general-purpose PHP acquisition script is written that adapts to the more common page types. The script grabs links by analyzing the page source code, and the grabbed links are stored in a link pool. The link pool is a Redis queue that is consumed in dequeue fashion.
Acquisition is then performed by the acquisition cluster. The cluster is distributed across different Linux machines, each running its own distinct PHP acquisition processes. Each process takes data from the link pool, acquires the corresponding page, and stores it as a page snapshot. The snapshot is the page saved locally after the image and link addresses in its source code have been converted; keywords are extracted and stored in the MySQL database. The acquisition cluster is responsible only for acquiring individual pages, storing snapshots, extracting keywords, and updating the data of each page at specified dates. The link-grabbing script, by contrast, must keep collecting and updating the latest data of each site and therefore runs continuously: sites that update infrequently can be refreshed once an hour, while sites that update frequently need to be refreshed once a minute.
The above brings the data into the system. For monitoring, a Sphinx search server is built to query the data; the search service must be powerful enough to handle large volumes of data and a certain level of high concurrency. A customized monitoring system is provided, likewise developed on Linux, nginx, PHP, and MySQL according to business needs. The client supplies keywords, and the system then searches for them in real time. Depending on the client's needs, the searches can run at second-level or minute-level intervals; the client sets this through a management tool and the system carries it out. When content containing the relevant keywords is found, it can be filed according to the rules, or sent directly to the client immediately by SMS, e-mail, or similar means. Likewise, once a client takes an interest in a particular event, the client can see the node at which the event first started, the turning points in its propagation, its propagation path and so on, which forms a complete system for monitoring and tracing public opinion. This solves both the problem of collecting comprehensive information sources from the vast Internet and the problem of tracing the origin of public opinion through analysis.
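The refresh cadence described in this embodiment, hourly for sites that change rarely and every minute for sites that change often, can be sketched as a simple schedule. The Python code below is an illustration under assumed names (`crawl_schedule`, `HOURLY`, `MINUTELY`, the example site names); it computes when each site would be crawled rather than crawling anything.

```python
import heapq

HOURLY = 3600   # seconds, for sites that update infrequently
MINUTELY = 60   # seconds, for sites that update frequently


def crawl_schedule(sites, horizon):
    """Return (time, site) crawl events up to `horizon` seconds, in time order."""
    heap = [(0, name, interval) for name, interval in sorted(sites.items())]
    heapq.heapify(heap)
    events = []
    while heap and heap[0][0] <= horizon:
        when, name, interval = heapq.heappop(heap)
        events.append((when, name))
        # Re-queue the site for its next refresh.
        heapq.heappush(heap, (when + interval, name, interval))
    return events


events = crawl_schedule({"slow.example": HOURLY, "fast.example": MINUTELY}, horizon=120)
```

Over the first two minutes the frequently updated site is visited three times and the slow site once, which matches the intent of giving active sources a shorter interval.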
The embodiments of the present invention have been described above with reference to the drawings, but the present invention is not limited to the specific embodiments above, which are merely illustrative rather than restrictive. Guided by the present invention, those of ordinary skill in the art can devise many further forms without departing from the spirit of the invention and the scope protected by the claims, all of which fall within the protection of the present invention.

Claims (7)

1. A method for monitoring online public opinion, characterized in that the method comprises the following steps:
S1, generating an acquisition script, analyzing the source code of a web page according to the acquisition script, and grabbing links;
S2, after the links have been grabbed, storing the obtained links in a link pool, and processing the link pool in dequeue fashion;
S3, periodically acquiring, through an acquisition cluster, the data referenced by the links in the link pool, and storing the acquired data as page snapshots in a database;
S4, performing, by a search server, concurrent periodic searches of the stored page snapshots according to the keywords required by the user to obtain search results, and completing the monitoring of online public opinion according to the search results.
2. The method for monitoring online public opinion according to claim 1, characterized in that the acquisition script in step S1 comprises PHP acquisition scripts written for the major news websites, microblogs, and forums, or a general-purpose PHP acquisition script adapted to all types of pages.
3. The method for monitoring online public opinion according to claim 1, characterized in that the acquisition cluster in step S3 is distributed across different Linux servers, each of which runs multiple distinct PHP acquisition processes that acquire the data referenced by the link pool.
4. The method for monitoring online public opinion according to claim 3, characterized in that step S3 comprises:
the acquisition cluster converting the image and link addresses in the page source code, extracting keywords, storing the keywords in the database, and updating the page data at specified times.
5. The method for monitoring online public opinion according to claim 4, characterized in that the search server is a Sphinx search server.
6. The method for monitoring online public opinion according to claim 1, characterized in that completing the monitoring of online public opinion according to the search results in step S4 comprises filing the retrieved content that contains the keywords required by the user according to a predefined rule, or sending it immediately to the client by SMS, e-mail, or similar means.
7. The method for monitoring online public opinion according to claim 6, characterized in that filing the retrieved content that contains the keywords required by the user according to a predefined rule comprises filing that content in chronological order or according to its propagation path.
CN201410050402.2A 2014-02-13 2014-02-13 Method for monitoring public opinions on Internet Pending CN104850549A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410050402.2A CN104850549A (en) 2014-02-13 2014-02-13 Method for monitoring public opinions on Internet

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410050402.2A CN104850549A (en) 2014-02-13 2014-02-13 Method for monitoring public opinions on Internet

Publications (1)

Publication Number Publication Date
CN104850549A true CN104850549A (en) 2015-08-19

Family

ID=53850197

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410050402.2A Pending CN104850549A (en) 2014-02-13 2014-02-13 Method for monitoring public opinions on Internet

Country Status (1)

Country Link
CN (1) CN104850549A (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105574191A (en) * 2015-12-26 2016-05-11 中国人民解放军信息工程大学 Online social network multisource point information tracing system and method thereof
CN105574191B (en) * 2015-12-26 2018-10-23 中国人民解放军信息工程大学 Online community network multi-source point information source tracing system and method
CN106302407B (en) * 2016-08-02 2019-05-17 四川秘无痕信息安全技术有限责任公司 A method of monitoring wechat circle of friends sends data
CN106302407A (en) * 2016-08-02 2017-01-04 四川秘无痕信息安全技术有限责任公司 A kind of method monitoring wechat circle of friends transmission data
CN107944019A (en) * 2017-12-11 2018-04-20 中广在线(北京)文化传媒有限公司 A kind of monitoring device of public sentiment overseas based on crawler technology, system and method
CN108268662B (en) * 2018-02-09 2020-11-10 平安科技(深圳)有限公司 Social graph generation method based on H5 page, electronic device and storage medium
CN108268662A (en) * 2018-02-09 2018-07-10 平安科技(深圳)有限公司 Social graph generation method, electronic device and storage medium based on the H5 pages
CN109902454A (en) * 2019-03-15 2019-06-18 北京邮电大学 Using sensitive information extracting method, device, equipment and readable storage medium storing program for executing
CN110162673A (en) * 2019-05-27 2019-08-23 上海吉江数据技术有限公司 Information changing monitoring system, method and device
CN110413681A (en) * 2019-08-01 2019-11-05 上海胜泰信息技术有限公司 A Web end group is in the visualized data processing method of big data technology
CN112395539A (en) * 2020-11-26 2021-02-23 格美安(北京)信息技术有限公司 Public opinion risk monitoring method and system based on natural language processing
CN112395539B (en) * 2020-11-26 2021-12-17 格美安(北京)信息技术有限公司 Public opinion risk monitoring method and system based on natural language processing
CN113434751A (en) * 2021-07-14 2021-09-24 国际关系学院 Network hotspot artificial intelligence early warning system and method
CN113434751B (en) * 2021-07-14 2023-06-02 国际关系学院 Network hotspot artificial intelligent early warning system and method

Similar Documents

Publication Publication Date Title
CN104850549A (en) Method for monitoring public opinions on Internet
US8626835B1 (en) Social identity clustering
Bordin et al. Dspbench: A suite of benchmark applications for distributed data stream processing systems
CN104182506A (en) Log management method
US20140143655A1 (en) Method for adjusting content of a webpage in real time based on users online behavior and profile
CN110502509B (en) Traffic big data cleaning method based on Hadoop and Spark framework and related device
CN101583964A (en) Large-scale aggregating and reporting of ad data
WO2015020922A1 (en) Dynamic collection analysis and reporting of telemetry data
CN103942210A (en) Processing method, device and system of mass log information
CN106951557B (en) Log association method and device and computer system applying log association method and device
Nithya et al. Novel pre-processing technique for web log mining by removing global noise and web robots
CN104133878A (en) User label generation method and device
CN104572976B (en) Website data update method and system
CN103023714A (en) Activeness and cluster structure analyzing system and method based on network topics
CN103810283A (en) Microblog data acquisition method based on user correlation
CN104252532A (en) Website information statistic method and device
CN106407429A (en) File tracking method, device and system
CN105518644A (en) Method for processing and displaying real-time social data on map
CN103902667A (en) Simple network information collector achieving method based on meta-search
CN103745383A (en) Method and system of realizing redirection service based on operator data
CN107704620A (en) A kind of method, apparatus of file administration, equipment and storage medium
US20160188676A1 (en) Collaboration system for network management
Huang et al. A process mining based service composition approach for mobile information systems
CN104166659A (en) Method and system for map data duplication judgment
CN110019152A (en) A kind of big data cleaning method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150819