CN104850549A - Method for monitoring public opinions on Internet - Google Patents
Method for monitoring public opinions on Internet
- Publication number
- CN104850549A (application number CN201410050402.2A)
- Authority
- CN
- China
- Prior art keywords
- opinion
- link
- supervising
- network public
- public
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a method for monitoring public opinion on the Internet. The method comprises the steps of: S1, generating an acquisition script, analyzing the source code of a web page according to the acquisition script, and grabbing links; S2, storing the grabbed links in a link pool, and processing the link pool in a dequeue manner; S3, periodically acquiring data for the links in the link pool through an acquisition cluster, and storing the acquired data as page snapshots in a database; and S4, concurrently and periodically searching the stored page snapshots with a search server according to the keywords required by the user to obtain search results, and monitoring public opinion on the Internet according to the search results. With this method, the starting node of a public opinion event, the turning points of its propagation, its propagation path and the like can be determined, yielding a complete system for monitoring and tracing public opinion.
Description
Technical field
The present invention relates to the field of online public opinion monitoring, and in particular to a method for monitoring online public opinion.
Background art
With the rapid spread of the Internet, people are increasingly accustomed to expressing their views online. Because of the vastness and concealment of the network, views are expressed more truthfully and boldly, and online public opinion has gradually attracted wide attention. Online public opinion has certain regional characteristics: hot topics on the network are also hot topics in society. Finding the connection between online public opinion and public opinion in society, and linking the propagation of public opinion on the network with its propagation across geographic locations, is a research trend in this field.
However, current public opinion monitoring applications are limited in their data sources. Most current monitoring systems are confined to one particular network form, or one class of network forms, so monitoring is not comprehensive. Moreover, the prior art remains at the Web 2.0 stage: it cannot obtain information sources from the large number of social tools, and it cannot determine where a public opinion event first started, where its propagation reached a turning point, or what path the propagation followed.
Summary of the invention
The present invention addresses the following defects of the prior art: current public opinion monitoring applications are limited in their data sources; most current monitoring systems are confined to one particular network form, or one class of network forms, so monitoring is not comprehensive; and the prior art remains at the Web 2.0 stage, cannot obtain information sources from the large number of social tools, and cannot determine where a public opinion event first started, where its propagation reached a turning point, or what path the propagation followed. To overcome these defects, the invention provides a method for monitoring online public opinion.
The technical solution provided by the invention for the above technical problem is as follows:
The invention provides a method for monitoring online public opinion, comprising the following steps:
S1, generating an acquisition script, and analyzing the source code of web pages according to the acquisition script to grab links;
S2, after the links have been grabbed, storing the obtained links in a link pool, and processing the link pool in a dequeue manner;
S3, periodically acquiring data for the links in the link pool through an acquisition cluster, and storing the acquired data as page snapshots in a database;
S4, a search server concurrently and periodically searching the stored page snapshots according to the keywords required by the user to obtain search results, and completing the monitoring of online public opinion according to the search results.
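Step S1 above (analyzing page source code and grabbing links) can be illustrated with a minimal sketch. The patent describes PHP acquisition scripts but gives no code; the Python below is only a stand-in, and the names `LinkExtractor` and `grab_links`, as well as the choice to keep only `http`/`https` links and drop duplicates, are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                # Assumption: only web links are of interest, each one once.
                if absolute.startswith(("http://", "https://")) and absolute not in self.links:
                    self.links.append(absolute)


def grab_links(page_source, base_url):
    """Return the de-duplicated absolute links found in page_source."""
    parser = LinkExtractor(base_url)
    parser.feed(page_source)
    parser.close()
    return parser.links
```

A site-specific script would typically add filtering rules on top of this, for example keeping only links that match a forum's thread URL pattern.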
In the method of the present invention, the acquisition script in step S1 comprises PHP acquisition scripts written for individual major news websites, microblogs and forums, or a general-purpose PHP acquisition script adapted to all kinds of pages.
In the method of the present invention, the acquisition cluster in step S3 is deployed on different Linux servers, and each Linux server runs multiple distinct PHP acquisition processes that acquire data for the links in the link pool.
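The idea of several distinct acquisition processes draining a shared link pool can be sketched on a single machine. This is not the patent's PHP cluster: threads stand in for processes on separate Linux servers, `queue.Queue` stands in for the link pool, and the collector functions, the per-type pools and `run_cluster` are hypothetical names chosen for the example.

```python
import queue
import threading


def collect_news(url):
    """Hypothetical collector for news pages (a real one would fetch the page)."""
    return {"url": url, "collector": "news"}


def collect_forum(url):
    """Hypothetical collector for forum pages."""
    return {"url": url, "collector": "forum"}


def worker(collector, link_pool, results, lock):
    """Take links from the pool until it is empty, collecting each one."""
    while True:
        try:
            url = link_pool.get_nowait()
        except queue.Empty:
            return
        record = collector(url)
        with lock:
            results.append(record)


def run_cluster(pools, collectors, workers_per_collector=2):
    """Start several workers per collector type and wait for all of them."""
    results, lock, threads = [], threading.Lock(), []
    for kind, collector in collectors.items():
        for _ in range(workers_per_collector):
            t = threading.Thread(target=worker, args=(collector, pools[kind], results, lock))
            t.start()
            threads.append(t)
    for t in threads:
        t.join()
    return results
```

Because every worker only ever takes the next link off a queue, adding capacity amounts to starting more workers, which is presumably what spreading the processes over several servers achieves in the described system.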
In the method of the present invention, step S3 comprises:
the acquisition cluster converting the image and link addresses in the page source code and extracting keywords, storing the keywords in the database, and updating the page data at specified times.
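A rough sketch of this sub-step follows, under several assumptions that go beyond the patent text: "converting image and link addresses" is read as rewriting relative `src`/`href` values to absolute URLs so the snapshot stays usable when stored; keyword extraction is reduced to a naive word-frequency count on Latin-script text (Chinese pages would need a word segmenter); and the standard-library `sqlite3` module stands in for the MySQL database. The table and function names are hypothetical.

```python
import re
import sqlite3
from collections import Counter
from urllib.parse import urljoin

ATTR_RE = re.compile(r'\b(src|href)=(["\'])(.*?)\2', re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")
STOPWORDS = {"the", "and", "for", "with", "that", "this"}


def absolutize(page_source, page_url):
    """Rewrite relative src/href values to absolute URLs."""
    def fix(match):
        attr, quote, value = match.groups()
        return f"{attr}={quote}{urljoin(page_url, value)}{quote}"
    return ATTR_RE.sub(fix, page_source)


def extract_keywords(page_source, top_n=3):
    """Naive keyword extraction: most frequent words of the visible text."""
    text = TAG_RE.sub(" ", page_source).lower()
    words = [w for w in re.findall(r"[a-z]{3,}", text) if w not in STOPWORDS]
    return [w for w, _count in Counter(words).most_common(top_n)]


def store_snapshot(db, page_url, page_source):
    """Save the converted page as a snapshot and its keywords in the database."""
    snapshot = absolutize(page_source, page_url)
    db.execute("CREATE TABLE IF NOT EXISTS snapshot (url TEXT PRIMARY KEY, html TEXT)")
    db.execute("CREATE TABLE IF NOT EXISTS keyword (url TEXT, word TEXT)")
    # INSERT OR REPLACE lets a later refresh overwrite the old snapshot.
    db.execute("INSERT OR REPLACE INTO snapshot VALUES (?, ?)", (page_url, snapshot))
    db.execute("DELETE FROM keyword WHERE url = ?", (page_url,))
    db.executemany("INSERT INTO keyword VALUES (?, ?)",
                   [(page_url, w) for w in extract_keywords(page_source)])
    db.commit()
    return snapshot
```

Calling `store_snapshot` again for the same URL replaces the earlier row, which mirrors the "update the page data at specified times" part of the step.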
In the method of the present invention, the search server is a Sphinx search server.
In the method of the present invention, completing the monitoring of online public opinion according to the search results in step S4 comprises archiving the retrieved content that contains the user's keywords according to a predetermined rule, or sending it to the client by SMS, e-mail or similar means.
In the method of the present invention, archiving the retrieved content that contains the user's keywords according to a predetermined rule comprises archiving that content in chronological order or according to the propagation path of the content.
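The two archiving rules named above can be sketched as follows. The patent does not say how a propagation path is recorded; the sketch assumes each search hit carries a `published` timestamp (ISO 8601 strings, which sort correctly as text) and an optional `reposted_from` URL pointing at the page it was copied from. Both field names and both function names are hypothetical.

```python
def archive_chronologically(hits):
    """Order search hits from earliest to latest publication time."""
    return sorted(hits, key=lambda h: h["published"])


def propagation_path(hits, url):
    """Follow reposted_from pointers back from url; return the path origin-first."""
    by_url = {h["url"]: h for h in hits}
    path, seen = [], set()
    # The seen set guards against cycles in badly formed data.
    while url in by_url and url not in seen:
        seen.add(url)
        path.append(url)
        url = by_url[url].get("reposted_from")
    return list(reversed(path))
```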
The method for monitoring online public opinion provided by the invention overcomes the defects described above: the limited data sources of current monitoring applications; the confinement of most current monitoring systems to one particular network form, or one class of network forms, which makes monitoring incomplete; and the inability of the prior art, which remains at the Web 2.0 stage, to obtain information sources from the large number of social tools or to determine where a public opinion event first started, where its propagation reached a turning point, and what path the propagation followed. With the method, the starting node of a public opinion event, the turning points of its propagation, its propagation path and the like can be determined, forming a complete system for monitoring and tracing public opinion. Specific government departments can use the method to clean up Internet information and build a healthy online environment. In addition, specified network hot spots can be discovered in a timely manner and their potential commercial value mined, which facilitates commercial development.
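The patent does not define how the starting node or a turning point is computed. One possible reading, given purely as an assumption, is that the starting node is the earliest matching page and a turning point is the page that was reposted directly most often. The sketch below uses hypothetical `published` and `reposted_from` fields on each hit.

```python
from collections import Counter


def origin_node(hits):
    """Assumed definition: the hit with the earliest publication time."""
    return min(hits, key=lambda h: h["published"])["url"]


def turning_point(hits):
    """Assumed definition: the URL with the most direct reposts, or None."""
    reposts = Counter(h["reposted_from"] for h in hits if h.get("reposted_from"))
    return reposts.most_common(1)[0][0] if reposts else None
```

Other definitions, such as the moment when the repost rate changes most sharply, would fit the text equally well.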
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments, in which:
Fig. 1 is a flowchart of the method for monitoring online public opinion according to an embodiment of the present invention.
Detailed description
To help those of ordinary skill in the art understand and implement the invention, it is described in more detail below with reference to the drawings and specific embodiments.
The invention addresses the defects that current public opinion monitoring applications are limited in their data sources; that most current monitoring systems are confined to one particular network form, or one class of network forms, so monitoring is not comprehensive; and that the prior art remains at the Web 2.0 stage, cannot obtain information sources from the large number of social tools, and cannot determine where a public opinion event first started, where its propagation reached a turning point, or what path the propagation followed. To this end it discloses a method for monitoring online public opinion.
Fig. 1 shows a flowchart of the method according to an embodiment of the present invention. In the method provided by this embodiment, public opinion monitoring means that information published on the various publishing platforms on the Internet is collected by PHP scripts and stored in a MySQL database for real-time analysis, after which real-time public opinion notifications are issued by means of the keyword search engine Sphinx. The method comprises the following steps:
S1, generating an acquisition script, and analyzing the source code of web pages according to the acquisition script to grab links;
S2, after the links have been grabbed, storing the obtained links in a link pool, and processing the link pool in a dequeue manner;
S3, periodically acquiring data for the links in the link pool through an acquisition cluster, and storing the acquired data as page snapshots in a database;
S4, a search server concurrently and periodically searching the stored page snapshots according to the keywords required by the user to obtain search results, and completing the monitoring of online public opinion according to the search results.
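The dequeue-style processing of the link pool in step S2 can be sketched with an in-memory queue. The embodiment names a Redis queue for this role; the standard-library `deque` below is only a stand-in, and the `LinkPool` class, together with its rule of rejecting links that were already enqueued, is an assumption made for the example.

```python
from collections import deque


class LinkPool:
    """First-in, first-out pool of links waiting to be acquired."""

    def __init__(self):
        self._queue = deque()
        self._seen = set()

    def put(self, url):
        """Enqueue url unless it was already enqueued; return True if added."""
        if url in self._seen:
            return False
        self._seen.add(url)
        self._queue.append(url)
        return True

    def get(self):
        """Dequeue the oldest link, or return None when the pool is empty."""
        return self._queue.popleft() if self._queue else None

    def __len__(self):
        return len(self._queue)
```

A link is removed from the pool as soon as a worker takes it, so no two acquisition processes handle the same queued link.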
Preferably, in the method provided by this embodiment, the acquisition script in step S1 comprises PHP acquisition scripts written for individual major news websites, microblogs and forums, or a general-purpose PHP acquisition script adapted to all kinds of pages.
Preferably, in the method provided by this embodiment, the acquisition cluster in step S3 is deployed on different Linux servers, and each Linux server runs multiple distinct PHP acquisition processes that acquire data for the links in the link pool.
Preferably, in the method provided by this embodiment, step S3 comprises:
the acquisition cluster converting the image and link addresses in the page source code and extracting keywords, storing the keywords in the database, and updating the page data at specified times.
Preferably, in the method provided by this embodiment, the search server is a Sphinx search server.
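The concurrent keyword search of step S4 can be sketched without a search server. The code below does not use Sphinx or its API: a plain substring scan over in-memory snapshot text stands in for the index, and a thread pool runs one search per user keyword. The function names and the snapshot dictionary layout are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def search_snapshots(snapshots, keyword):
    """Return the URLs whose snapshot text contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [url for url, text in snapshots.items() if needle in text.lower()]


def search_all(snapshots, keywords, max_workers=4):
    """Run one search per keyword concurrently and map keyword -> matching URLs."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda kw: search_snapshots(snapshots, kw), keywords)
        return dict(zip(keywords, results))
```

To make the search periodic, a caller would invoke `search_all` on a timer and compare each result with the previous round to find newly matching pages.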
Preferably, in the method provided by this embodiment, completing the monitoring of online public opinion according to the search results in step S4 comprises archiving the retrieved content that contains the user's keywords according to a predetermined rule, or sending it to the client by SMS, e-mail or similar means.
Preferably, in the method provided by this embodiment, archiving the retrieved content that contains the user's keywords according to a predetermined rule comprises archiving that content in chronological order or according to the propagation path of the content.
The principle of the invention is explained below through a more specific embodiment:
First, a dedicated PHP acquisition script is written for each major news website, microblog and forum, or a general-purpose PHP acquisition script adapted to the more common page types is written. The script analyzes the page source code and grabs links. Grabbed links are stored in a link pool, which is a Redis queue consumed by dequeuing. Next comes the acquisition cluster, which is deployed across different Linux machines. Each machine runs distinct PHP acquisition processes, and each process takes links from the link pool, acquires the data and stores it as a page snapshot. A snapshot is made by converting the image and link addresses in the page source code and then saving the page locally; keywords are extracted and stored in the MySQL database. The acquisition cluster is responsible only for single pages: it acquires the page, saves the snapshot, extracts keywords and updates the page's data on the specified dates. The link-grabbing script, by contrast, must keep acquiring the latest data of the current site without stopping: a site that is updated infrequently can be refreshed once per hour, while a frequently updated site needs to be refreshed once per minute.
The above brings data into the system. To monitor that data, a Sphinx search server is built to query it; the search service must be powerful enough to support large data volumes and a certain level of high concurrency. We provide a customized monitoring system, likewise developed on Linux, Nginx, PHP and MySQL according to business needs. The client supplies keywords and the system searches for them in real time. Depending on customer demand, the search can run at second-level or minute-level intervals; the client sets this through a management tool and the system carries it out. When content containing a relevant keyword is found, it can be archived according to rules, or sent directly to the client by SMS, e-mail or similar means. Likewise, when the client later needs to look into an event, the client can see where the event first started, where its propagation reached a turning point, the propagation path and so on, forming a complete public opinion monitoring and tracing system. This solves the problem of collecting comprehensive information sources from the vast Internet and the problem of tracing the source of public opinion through analysis.
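The refresh policy described in this embodiment (about once per hour for rarely updated sites, once per minute for frequently updated ones) can be sketched as a small priority-queue scheduler. Times are plain seconds supplied by the caller so that the example stays deterministic; `build_schedule`, `due_links` and the site URLs are hypothetical.

```python
import heapq

HOURLY = 3600       # interval for sites that change rarely
EVERY_MINUTE = 60   # interval for sites that change often


def build_schedule(sites, start=0):
    """sites maps URL -> refresh interval in seconds; everything is due at start."""
    heap = [(start, url) for url in sites]
    heapq.heapify(heap)
    return heap


def due_links(heap, sites, now):
    """Pop every site due at or before now and reschedule it from now."""
    due = []
    while heap and heap[0][0] <= now:
        _when, url = heapq.heappop(heap)
        due.append(url)
        # Rescheduling relative to now ensures each site appears once per call.
        heapq.heappush(heap, (now + sites[url], url))
    return due
```

For example, with `sites = {"http://news.example/": EVERY_MINUTE, "http://blog.example/": HOURLY}`, a loop that calls `due_links(heap, sites, now)` once per second would put the news site back into the link pool every minute and the blog every hour.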
The embodiments of the invention have been described above with reference to the drawing, but the invention is not limited to these specific embodiments, which are merely illustrative rather than restrictive. Under the teaching of the invention, those of ordinary skill in the art can devise many further forms without departing from the spirit of the invention and the scope protected by the claims, and all of these fall within the protection of the invention.
Claims (7)
1. A method for monitoring online public opinion, characterized in that the method comprises the following steps:
S1, generating an acquisition script, and analyzing the source code of web pages according to the acquisition script to grab links;
S2, after the links have been grabbed, storing the obtained links in a link pool, and processing the link pool in a dequeue manner;
S3, periodically acquiring data for the links in the link pool through an acquisition cluster, and storing the acquired data as page snapshots in a database;
S4, a search server concurrently and periodically searching the stored page snapshots according to the keywords required by the user to obtain search results, and completing the monitoring of online public opinion according to the search results.
2. The method for monitoring online public opinion according to claim 1, characterized in that the acquisition script in step S1 comprises PHP acquisition scripts written for individual major news websites, microblogs and forums, or a general-purpose PHP acquisition script adapted to all kinds of pages.
3. The method for monitoring online public opinion according to claim 1, characterized in that the acquisition cluster in step S3 is deployed on different Linux servers, and each Linux server runs multiple distinct PHP acquisition processes that acquire data for the links in the link pool.
4. The method for monitoring online public opinion according to claim 3, characterized in that step S3 comprises:
the acquisition cluster converting the image and link addresses in the page source code and extracting keywords, storing the keywords in the database, and updating the page data at specified times.
5. The method for monitoring online public opinion according to claim 4, characterized in that the search server is a Sphinx search server.
6. The method for monitoring online public opinion according to claim 1, characterized in that completing the monitoring of online public opinion according to the search results in step S4 comprises archiving the retrieved content that contains the user's keywords according to a predetermined rule, or sending it to the client by SMS, e-mail or similar means.
7. The method for monitoring online public opinion according to claim 6, characterized in that archiving the retrieved content that contains the user's keywords according to a predetermined rule comprises archiving that content in chronological order or according to the propagation path of the content.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410050402.2A CN104850549A (en) | 2014-02-13 | 2014-02-13 | Method for monitoring public opinions on Internet |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410050402.2A CN104850549A (en) | 2014-02-13 | 2014-02-13 | Method for monitoring public opinions on Internet |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104850549A (en) | 2015-08-19 |
Family
ID=53850197
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410050402.2A Pending CN104850549A (en) | Method for monitoring public opinions on Internet | 2014-02-13 | 2014-02-13 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104850549A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105574191A (en) * | 2015-12-26 | 2016-05-11 | 中国人民解放军信息工程大学 | Online social network multisource point information tracing system and method thereof |
CN106302407A (en) * | 2016-08-02 | 2017-01-04 | 四川秘无痕信息安全技术有限责任公司 | A kind of method monitoring wechat circle of friends transmission data |
CN107944019A (en) * | 2017-12-11 | 2018-04-20 | 中广在线(北京)文化传媒有限公司 | A kind of monitoring device of public sentiment overseas based on crawler technology, system and method |
CN108268662A (en) * | 2018-02-09 | 2018-07-10 | 平安科技(深圳)有限公司 | Social graph generation method, electronic device and storage medium based on the H5 pages |
CN109902454A (en) * | 2019-03-15 | 2019-06-18 | 北京邮电大学 | Using sensitive information extracting method, device, equipment and readable storage medium storing program for executing |
CN110162673A (en) * | 2019-05-27 | 2019-08-23 | 上海吉江数据技术有限公司 | Information changing monitoring system, method and device |
CN110413681A (en) * | 2019-08-01 | 2019-11-05 | 上海胜泰信息技术有限公司 | A Web end group is in the visualized data processing method of big data technology |
CN112395539A (en) * | 2020-11-26 | 2021-02-23 | 格美安(北京)信息技术有限公司 | Public opinion risk monitoring method and system based on natural language processing |
CN113434751A (en) * | 2021-07-14 | 2021-09-24 | 国际关系学院 | Network hotspot artificial intelligence early warning system and method |
- 2014-02-13: CN application CN201410050402.2A (CN104850549A), status: Pending
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105574191A (en) * | 2015-12-26 | 2016-05-11 | 中国人民解放军信息工程大学 | Online social network multisource point information tracing system and method thereof |
CN105574191B (en) * | 2015-12-26 | 2018-10-23 | 中国人民解放军信息工程大学 | Online community network multi-source point information source tracing system and method |
CN106302407B (en) * | 2016-08-02 | 2019-05-17 | 四川秘无痕信息安全技术有限责任公司 | A method of monitoring wechat circle of friends sends data |
CN106302407A (en) * | 2016-08-02 | 2017-01-04 | 四川秘无痕信息安全技术有限责任公司 | A kind of method monitoring wechat circle of friends transmission data |
CN107944019A (en) * | 2017-12-11 | 2018-04-20 | 中广在线(北京)文化传媒有限公司 | A kind of monitoring device of public sentiment overseas based on crawler technology, system and method |
CN108268662B (en) * | 2018-02-09 | 2020-11-10 | 平安科技(深圳)有限公司 | Social graph generation method based on H5 page, electronic device and storage medium |
CN108268662A (en) * | 2018-02-09 | 2018-07-10 | 平安科技(深圳)有限公司 | Social graph generation method, electronic device and storage medium based on the H5 pages |
CN109902454A (en) * | 2019-03-15 | 2019-06-18 | 北京邮电大学 | Using sensitive information extracting method, device, equipment and readable storage medium storing program for executing |
CN110162673A (en) * | 2019-05-27 | 2019-08-23 | 上海吉江数据技术有限公司 | Information changing monitoring system, method and device |
CN110413681A (en) * | 2019-08-01 | 2019-11-05 | 上海胜泰信息技术有限公司 | A Web end group is in the visualized data processing method of big data technology |
CN112395539A (en) * | 2020-11-26 | 2021-02-23 | 格美安(北京)信息技术有限公司 | Public opinion risk monitoring method and system based on natural language processing |
CN112395539B (en) * | 2020-11-26 | 2021-12-17 | 格美安(北京)信息技术有限公司 | Public opinion risk monitoring method and system based on natural language processing |
CN113434751A (en) * | 2021-07-14 | 2021-09-24 | 国际关系学院 | Network hotspot artificial intelligence early warning system and method |
CN113434751B (en) * | 2021-07-14 | 2023-06-02 | 国际关系学院 | Network hotspot artificial intelligent early warning system and method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104850549A (en) | Method for monitoring public opinions on Internet | |
US8626835B1 (en) | Social identity clustering | |
Bordin et al. | Dspbench: A suite of benchmark applications for distributed data stream processing systems | |
CN104182506A (en) | Log management method | |
US20140143655A1 (en) | Method for adjusting content of a webpage in real time based on users online behavior and profile | |
CN110502509B (en) | Traffic big data cleaning method based on Hadoop and Spark framework and related device | |
CN101583964A (en) | Large-scale aggregating and reporting of ad data | |
WO2015020922A1 (en) | Dynamic collection analysis and reporting of telemetry data | |
CN103942210A (en) | Processing method, device and system of mass log information | |
CN106951557B (en) | Log association method and device and computer system applying log association method and device | |
Nithya et al. | Novel pre-processing technique for web log mining by removing global noise and web robots | |
CN104133878A (en) | User label generation method and device | |
CN104572976B (en) | Website data update method and system | |
CN103023714A (en) | Activeness and cluster structure analyzing system and method based on network topics | |
CN103810283A (en) | Microblog data acquisition method based on user correlation | |
CN104252532A (en) | Website information statistic method and device | |
CN106407429A (en) | File tracking method, device and system | |
CN105518644A (en) | Method for processing and displaying real-time social data on map | |
CN103902667A (en) | Simple network information collector achieving method based on meta-search | |
CN103745383A (en) | Method and system of realizing redirection service based on operator data | |
CN107704620A (en) | A kind of method, apparatus of file administration, equipment and storage medium | |
US20160188676A1 (en) | Collaboration system for network management | |
Huang et al. | A process mining based service composition approach for mobile information systems | |
CN104166659A (en) | Method and system for map data duplication judgment | |
CN110019152A (en) | A kind of big data cleaning method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20150819 |