CN108228431A - Method and system for configurable crawler quality monitoring - Google Patents
Method and system for configurable crawler quality monitoring
- Publication number
- CN108228431A CN201810007604.7A
- Authority
- CN
- China
- Prior art keywords
- threshold value
- monitoring
- website
- alarm threshold
- crawler
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/32—Monitoring with visual or acoustical indication of the functioning of the machine
- G06F11/324—Display of status information
- G06F11/327—Alarm or error message display
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/302—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3089—Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
- G06F11/3093—Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0604—Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time
- H04L41/0622—Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time based on time
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/16—Threshold monitoring
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/2866—Architectures; Arrangements
- H04L67/30—Profiles
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Quality & Reliability (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
This application discloses a method for configurable crawler quality monitoring, comprising: obtaining the authorization record parameters generated when a crawler program crawls each website, and saving the authorization record parameters to a database; reading a configuration file to obtain the website ID, monitoring time period, and alarm threshold to be monitored; reading from the database the authorization record parameters of the website corresponding to the website ID for the monitoring time period; determining whether the authorization record parameters exceed the alarm threshold; and if so, issuing an alarm signal. By obtaining the authorization record parameters for each website crawled, the method can monitor the crawler program's authorization process and the crawler quality from multiple angles; by reading the configuration file to obtain the website ID, monitoring time period, and alarm threshold to be monitored, it achieves personalized monitoring according to user needs. The application also provides a system, a server, and a computer-readable storage medium for configurable crawler quality monitoring, which have the same beneficial effects.
Description
Technical field
This application relates to the field of web crawlers, and in particular to a method, system, server, and computer-readable storage medium for configurable crawler quality monitoring.
Background
With the rapid development of Internet technology, the era of big data has arrived, and data acquisition has become a vital link. Crawler programs, as an important source of acquired data, play an irreplaceable role.
In the prior art, the quality of the data crawled by a crawler program is generally referred to as crawler quality. Crawler quality is judged mainly by the quantity of data crawled within a given time and by the correctness of the crawled data. Typically, after a target website accessed by the crawler program undergoes maintenance or a redesign, crawler quality declines to some degree.
Existing crawler quality monitoring schemes count the authorization results for each website, monitor the output of the crawler program only in broad terms, and periodically generate reports. The existing schemes therefore monitor a single kind of object, produce only general reports, and cannot perform personalized monitoring according to user needs.
Therefore, how to perform personalized monitoring of crawler quality according to user needs is a technical problem that those skilled in the art currently need to solve.
Summary of the invention
The purpose of this application is to provide a method, system, server, and computer-readable storage medium for configurable crawler quality monitoring, which enable personalized monitoring of crawler quality according to user needs.
To solve the above technical problem, this application provides a method for configurable crawler quality monitoring, the method comprising:
Obtaining the authorization record parameters generated when a crawler program crawls each website, and saving the authorization record parameters to a database;
Reading a configuration file to obtain the website ID, monitoring time period, and alarm threshold to be monitored;
Reading from the database the authorization record parameters of the website corresponding to the website ID for the monitoring time period;
Determining whether the authorization record parameters exceed the alarm threshold;
If so, issuing an alarm signal.
Optionally, before obtaining the authorization record parameters generated when the crawler program crawls each website, the method further comprises:
Reading from the configuration file the field names to be validated and the validation method;
When the crawler program crawls data, validating, using the validation method, the data field in the data that corresponds to each field name;
Marking data fields that fail validation as abnormal data.
Optionally, the authorization record parameters include at least one of: serial number, crawler type, HTTP URL, status code, authorization time, and abnormal data count.
Optionally, determining whether the authorization record parameters exceed the alarm threshold comprises:
Calculating the rate of change of the proportion of status codes whose value is "completed successfully", and determining whether the rate of change exceeds a rate-of-change alarm threshold;
If it does not, calculating the average authorization time, and determining whether the average exceeds an authorization-time alarm threshold;
If the average does not exceed the authorization-time alarm threshold, computing the average response time of the HTTP URL, and determining whether the average response time exceeds a response-time alarm threshold;
If the average response time does not exceed the response-time alarm threshold, determining whether the abnormal data count exceeds an abnormal-data-count alarm threshold;
If the abnormal data count exceeds the abnormal-data-count alarm threshold, sending an alarm command.
Optionally, reading the configuration file comprises:
Determining whether a configuration file input by the user has been received;
If so, reading the configuration file input by the user;
If not, reading a default configuration file.
Optionally, the method further comprises:
The database periodically deleting the authorization record parameters.
Optionally, the database includes at least one of a MySQL database, an HBase database, a MongoDB database, and a Redis database.
This application also provides a system for configurable crawler quality monitoring, the system comprising:
An obtaining and saving module, configured to obtain the authorization record parameters generated when a crawler program crawls each website, and to save the authorization record parameters to a database;
A first reading module, configured to read a configuration file to obtain the website ID, monitoring time period, and alarm threshold to be monitored;
A second reading module, configured to read from the database the authorization record parameters of the website corresponding to the website ID for the monitoring time period;
A judging module, configured to determine whether the authorization record parameters exceed the alarm threshold;
An alarm module, configured to issue an alarm signal when the authorization record parameters exceed the alarm threshold.
This application also provides a configurable crawler quality monitoring server, the server comprising:
A memory, configured to store a computer program;
A processor, configured to implement the steps of the method for configurable crawler quality monitoring described in any of the above when executing the computer program.
This application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the method for configurable crawler quality monitoring described in any of the above are implemented.
The method for configurable crawler quality monitoring provided by this application comprises: obtaining the authorization record parameters generated when a crawler program crawls each website, and saving the authorization record parameters to a database; reading a configuration file to obtain the website ID, monitoring time period, and alarm threshold to be monitored; reading from the database the authorization record parameters of the website corresponding to the website ID for the monitoring time period; determining whether the authorization record parameters exceed the alarm threshold; and if so, issuing an alarm signal.
By obtaining the authorization record parameters for each website crawled and saving them to a database, the technical solution provided by this application can monitor the crawler program's authorization process and the crawler quality from multiple angles, making it easier for the user to find the cause of a decline in crawler quality. By reading the configuration file to obtain the website ID, monitoring time period, and alarm threshold to be monitored, it lets the user select, through the settings in the configuration file, the websites to monitor, the monitoring time, and the monitoring criteria, thereby achieving personalized monitoring according to user needs. This application also provides a system, a server, and a computer-readable storage medium for configurable crawler quality monitoring, which have the same beneficial effects and are not described again here.
Brief description of the drawings
To explain the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of this application; those of ordinary skill in the art can obtain other drawings from the drawings provided without creative effort.
Fig. 1 is a flow chart of a method for configurable crawler quality monitoring provided by an embodiment of this application;
Fig. 2 is a flow chart of one practical implementation of S104 in the method for configurable crawler quality monitoring shown in Fig. 1;
Fig. 3 is a structural diagram of a system for configurable crawler quality monitoring provided by an embodiment of this application;
Fig. 4 is a structural diagram of another system for configurable crawler quality monitoring provided by an embodiment of this application;
Fig. 5 is a structural diagram of a configurable crawler quality monitoring server provided by an embodiment of this application.
Detailed description
The core of this application is to provide a method, system, server, and computer-readable storage medium for configurable crawler quality monitoring, which enable personalized monitoring of crawler quality according to user needs.
To make the purpose, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.
Please refer to Fig. 1, which is a flow chart of a method for configurable crawler quality monitoring provided by an embodiment of this application.
The method specifically comprises the following steps:
S101: Obtain the authorization record parameters generated when the crawler program crawls each website, and save the authorization record parameters to a database;
Existing crawler quality monitoring schemes monitor a single kind of object, produce only general reports, and cannot perform personalized monitoring according to user needs. To address this, this application provides a method for configurable crawler quality monitoring that enables personalized monitoring of crawler quality according to user needs;
Optionally, before obtaining the authorization record parameters generated when the crawler program crawls each website, the method may further include:
Reading from the configuration file the field names to be validated and the validation method;
When the crawler program crawls data, validating, using the validation method, the data field in the data that corresponds to each field name;
Marking data fields that fail validation as abnormal data;
The validation program may first read the field names to be validated and the validation method from the configuration system. Then, for each object in the crawled data, it uses reflection to obtain the object type and all field names and field values, and validates the data fields corresponding to the field names using the validation method. The validation program itself supports a set of rules, for example: the data is a valid ID card number; the data is a valid mobile phone number; the data is non-empty; when one of two related fields is non-empty, the other must also be non-empty; and so on. If a validation result is a failure, the data field that failed validation is marked as abnormal data;
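The validation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the rule names, the `CrawledUser` class, the `check_config` mapping, and the simplified regular expressions are all hypothetical, and `dataclasses.fields` stands in for the reflection mentioned in the text. The rule linking two related fields is omitted for brevity.

```python
import re
from dataclasses import dataclass, fields

# Hypothetical rule table: rule name -> predicate. The patterns are simplified
# illustrations (format only), not authoritative validators.
RULES = {
    "non_empty": lambda v: v is not None and str(v).strip() != "",
    "cn_mobile": lambda v: re.fullmatch(r"1[3-9]\d{9}", str(v or "")) is not None,
    "cn_id_number": lambda v: re.fullmatch(r"\d{17}[\dXx]", str(v or "")) is not None,
}


@dataclass
class CrawledUser:
    """Hypothetical object produced by the crawler."""
    name: str
    phone: str
    id_number: str


def validate(obj, check_config):
    """Return the names of configured fields that fail their rule."""
    # Reflection step: collect every field name and value of the object.
    values = {f.name: getattr(obj, f.name) for f in fields(obj)}
    failed = []
    for field_name, rule_name in check_config.items():
        if field_name not in values or not RULES[rule_name](values[field_name]):
            failed.append(field_name)  # this field would be marked as abnormal data
    return failed


# Hypothetical configuration: field name -> validation rule.
check_config = {"name": "non_empty", "phone": "cn_mobile", "id_number": "cn_id_number"}
record = CrawledUser(name="Zhang", phone="12345", id_number="11010519900101123X")
abnormal_fields = validate(record, check_config)
```

Keeping the rules in a table keyed by name means a new rule can be added without touching `validate`, which matches the configuration-driven intent of the text.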
Optionally, the authorization record parameters mentioned here may include at least one of: serial number, crawler type, HTTP URL, status code, authorization time, and abnormal data count;
By obtaining the authorization record parameters generated when the crawler program crawls each website and saving them to a database, the authorization process of the crawler program and the crawler quality can be monitored from multiple angles, making it easier for the user to find the cause of a decline in crawler quality.
S102: Read a configuration file to obtain the website ID, monitoring time period, and alarm threshold to be monitored;
The configuration file mentioned here may be a configuration file input by the user or a preset default configuration file;
On this basis, reading the configuration file may include:
Determining whether a configuration file input by the user has been received;
If so, reading the configuration file input by the user;
If not, reading the default configuration file.
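The fallback from a user-supplied configuration file to a default one might look like the sketch below. The patent does not specify a file format; JSON, the key names, and the in-code `DEFAULT_CONFIG` dictionary (standing in for the preset default configuration file) are assumptions.

```python
import json
import os
import tempfile

# Stand-in for the preset default configuration file; all keys are hypothetical.
DEFAULT_CONFIG = {
    "website_ids": ["site-001"],
    "monitor_period": {"start": "2018-01-01 00:00:00", "end": "2018-01-02 00:00:00"},
    "thresholds": {"abnormal_count": 5},
}


def load_config(user_path=None):
    """Return the user's configuration if one was supplied, else the default."""
    if user_path and os.path.isfile(user_path):
        with open(user_path, encoding="utf-8") as fh:
            return json.load(fh)
    return DEFAULT_CONFIG


# Usage: write a user configuration to a temporary file and load it.
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False, encoding="utf-8"
) as tmp:
    json.dump({"website_ids": ["site-042"], "thresholds": {"abnormal_count": 10}}, tmp)
user_config = load_config(tmp.name)
os.remove(tmp.name)

# No user file received: fall back to the default configuration.
fallback_config = load_config(None)
```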
Correspondingly, the alarm threshold mentioned here includes at least one of: a rate-of-change alarm threshold, an authorization-time alarm threshold, a response-time alarm threshold, and an abnormal-data-count alarm threshold;
Optionally, the database periodically deletes the stored authorization record parameters, and the deletion period may also be obtained by reading the configuration file;
Optionally, the database may include at least one of a MySQL database, an HBase database, a MongoDB database, and a Redis database;
Obtaining the website ID, monitoring time period, and alarm threshold to be monitored by reading the configuration file allows the user to select, through the settings in the configuration file, the websites to monitor, the monitoring time, and the monitoring criteria, thereby achieving personalized monitoring according to user needs.
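The optional periodic deletion of stored authorization record parameters could be implemented as a retention purge, sketched below. Again `sqlite3` is only a stand-in for the databases named in the text, and the `auth_record` table, the `created_at` column, and the `retention_seconds` setting are hypothetical; in practice the function would be called on a schedule with the period read from the configuration file.

```python
import sqlite3
import time


def purge_old_records(conn, retention_seconds, now=None):
    """Delete records older than the retention period; return how many were removed."""
    now = time.time() if now is None else now
    cur = conn.execute(
        "DELETE FROM auth_record WHERE created_at < ?",
        (now - retention_seconds,),
    )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auth_record (serial_no TEXT, created_at REAL)")
conn.executemany(
    "INSERT INTO auth_record VALUES (?, ?)",
    [("A001", 100.0), ("A002", 900.0)],
)
# With now=1000 and a 500-second retention period, the cutoff is 500.
deleted = purge_old_records(conn, retention_seconds=500, now=1000.0)
remaining = [row[0] for row in conn.execute("SELECT serial_no FROM auth_record")]
```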
S103: Read from the database the authorization record parameters of the website corresponding to the website ID for the monitoring time period;
S104: Determine whether the authorization record parameters exceed the alarm threshold;
If so, proceed to step S105.
S105: Issue an alarm signal.
When the authorization record parameters exceed the alarm threshold, the monitored crawler quality does not meet the user's requirements. An alarm signal is issued at this point so that the user can notice promptly and adjust the crawler program accordingly;
Optionally, the alarm signal may be issued using the alarm method obtained by reading the configuration file, such as SMS, email, or a combination of the two; the alarm recipient may likewise be the recipient's email address or SMS number obtained by reading the configuration file.
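Routing the alarm signal through the configured channels can be sketched as below. The `alarm_config` layout and the sender callables are hypothetical; the senders here only append to a list, and a real deployment would replace them with calls to an actual SMS gateway or mail server, which the patent does not specify.

```python
def dispatch_alarm(message, alarm_config, senders):
    """Send the message through every configured channel; return (method, receiver) pairs."""
    sent = []
    for method in alarm_config["methods"]:
        receiver = alarm_config["receivers"].get(method)
        sender = senders.get(method)
        if receiver is None or sender is None:
            continue  # unknown channel, or no receiver configured for it
        sender(receiver, message)
        sent.append((method, receiver))
    return sent


# Stub senders that record what would have been sent.
outbox = []
senders = {
    "sms": lambda to, msg: outbox.append(("sms", to, msg)),
    "mail": lambda to, msg: outbox.append(("mail", to, msg)),
}

# Hypothetical alarm section of the configuration file.
alarm_config = {
    "methods": ["sms", "mail"],
    "receivers": {"sms": "13800138000", "mail": "ops@example.com"},
}
sent = dispatch_alarm("site-001: abnormal data count exceeded", alarm_config, senders)
```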
Based on the above technical solution, the method for configurable crawler quality monitoring provided by this application obtains the authorization record parameters generated when the crawler program crawls each website and saves them to a database, so that the authorization process of the crawler program and the crawler quality can be monitored from multiple angles and the user can more easily find the cause of a decline in crawler quality. By reading the configuration file to obtain the website ID, monitoring time period, and alarm threshold to be monitored, it lets the user select, through the settings in the configuration file, the websites to monitor, the monitoring time, and the monitoring criteria, thereby achieving personalized monitoring according to user needs.
Based on the above embodiment, please refer to Fig. 2, which is a flow chart of one practical implementation of S104 in the method for configurable crawler quality monitoring shown in Fig. 1.
This embodiment addresses S104 of the previous embodiment and describes one specific implementation of it. The flow shown in Fig. 2 specifically comprises the following steps:
S201: Calculate the rate of change of the proportion of status codes whose value is "completed successfully", and determine whether the rate of change exceeds the rate-of-change alarm threshold;
If so, proceed to step S205; if not, proceed to step S202.
S202: Calculate the average authorization time, and determine whether the average exceeds the authorization-time alarm threshold;
If so, proceed to step S205; if not, proceed to step S203.
S203: Compute the average response time of the HTTP URL, and determine whether the average response time exceeds the response-time alarm threshold;
If so, proceed to step S205; if not, proceed to step S204.
S204: Determine whether the abnormal data count exceeds the abnormal-data-count alarm threshold;
If so, proceed to step S205.
S205: Send an alarm command.
When any of the checks in S201 to S204 is exceeded, for example when the abnormal data count exceeds the abnormal-data-count alarm threshold, an alarm command is sent so that the configurable crawler quality monitoring program issues an alarm signal;
It should be noted that this application does not specifically limit the order of steps S201 to S204; the user may set the order of steps S201 to S204 according to their own needs.
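The cascade of checks S201 to S205 can be sketched as follows. The patent does not say how the rate of change is computed; this sketch assumes the relative change of the success proportion compared with a previous monitoring period. The record keys, threshold names, and the `"success"` status value are likewise hypothetical.

```python
from statistics import mean


def success_ratio(records):
    """Proportion of records whose status is 'success'."""
    if not records:
        return 0.0
    return sum(r["status"] == "success" for r in records) / len(records)


def check_quality(current, previous, thresholds):
    """Run S201 to S204 in order; return the name of the first check that trips, or None."""
    if not current:
        return None
    # S201: relative change of the success proportion versus the previous period
    # (assumed definition of "rate of change").
    prev_ratio = success_ratio(previous)
    change = abs(success_ratio(current) - prev_ratio) / prev_ratio if prev_ratio else 0.0
    if change > thresholds["success_change_rate"]:
        return "success_change_rate"
    # S202: average authorization time.
    if mean(r["auth_seconds"] for r in current) > thresholds["auth_seconds"]:
        return "auth_seconds"
    # S203: average response time of the HTTP URL.
    if mean(r["response_seconds"] for r in current) > thresholds["response_seconds"]:
        return "response_seconds"
    # S204: total abnormal data count.
    if sum(r["abnormal_count"] for r in current) > thresholds["abnormal_count"]:
        return "abnormal_count"
    return None  # no threshold exceeded, so no alarm command (S205) is sent


previous = [
    {"status": "success", "auth_seconds": 1.0, "response_seconds": 0.2, "abnormal_count": 0}
    for _ in range(4)
]
current = [
    {"status": "success", "auth_seconds": 1.0, "response_seconds": 0.2, "abnormal_count": 0},
    {"status": "success", "auth_seconds": 2.0, "response_seconds": 0.4, "abnormal_count": 2},
    {"status": "success", "auth_seconds": 3.0, "response_seconds": 0.3, "abnormal_count": 1},
    {"status": "failed", "auth_seconds": 2.0, "response_seconds": 0.3, "abnormal_count": 3},
]
thresholds = {
    "success_change_rate": 0.5,
    "auth_seconds": 5.0,
    "response_seconds": 1.0,
    "abnormal_count": 5,
}
tripped = check_quality(current, previous, thresholds)
```

Because the text allows the user to reorder S201 to S204, a variant could hold the four checks in a list whose order is read from the configuration file; the fixed order here simply follows Fig. 2.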
Based on the above technical solution, this embodiment determines whether crawler quality meets the user's requirements by checking whether the values of the authorization record parameters exceed the alarm thresholds. This achieves multi-dimensional monitoring of crawler quality and makes it easier for the user to find the cause of a decline in crawler quality.
Please refer to Fig. 3, which is a structural diagram of a system for configurable crawler quality monitoring provided by an embodiment of this application.
The system may include:
An obtaining and saving module 100, configured to obtain the authorization record parameters generated when the crawler program crawls each website, and to save the authorization record parameters to a database;
A first reading module 200, configured to read a configuration file to obtain the website ID, monitoring time period, and alarm threshold to be monitored;
A second reading module 300, configured to read from the database the authorization record parameters of the website corresponding to the website ID for the monitoring time period;
A judging module 400, configured to determine whether the authorization record parameters exceed the alarm threshold;
An alarm module 500, configured to issue an alarm signal when the authorization record parameters exceed the alarm threshold.
Please refer to Fig. 4, which is a structural diagram of another system for configurable crawler quality monitoring provided by an embodiment of this application.
The system may also include:
A third reading module, configured to read from the configuration file the field names to be validated and the validation method;
A validation module, configured to validate, using the validation method, the data field in the data that corresponds to each field name when the crawler program crawls data;
A marking module, configured to mark data fields that fail validation as abnormal data.
The judging module 400 may include:
A first judging submodule, configured to calculate the rate of change of the proportion of status codes whose value is "completed successfully", and to determine whether the rate of change exceeds the rate-of-change alarm threshold;
A second judging submodule, configured to calculate the average authorization time when the rate of change does not exceed the rate-of-change alarm threshold, and to determine whether the average exceeds the authorization-time alarm threshold;
A third judging submodule, configured to compute the average response time of the HTTP URL when the average does not exceed the authorization-time alarm threshold, and to determine whether the average response time exceeds the response-time alarm threshold;
A fourth judging submodule, configured to determine whether the abnormal data count exceeds the abnormal-data-count alarm threshold when the average response time does not exceed the response-time alarm threshold;
A sending submodule, configured to send an alarm command to the alarm module when the abnormal data count exceeds the abnormal-data-count alarm threshold.
The first reading module 200 may include:
A fifth judging submodule, configured to determine whether a configuration file input by the user has been received;
A reading submodule, configured to read the configuration file input by the user when one has been received, and to read the default configuration file when no configuration file input by the user has been received.
The components of the above system can be applied in the following practical flow:
The third reading module reads from the configuration file the field names to be validated and the validation method. When the crawler program crawls data, the validation module validates, using the validation method, the data field in the data that corresponds to each field name. The marking module marks data fields that fail validation as abnormal data;
The obtaining and saving module obtains the authorization record parameters generated when the crawler program crawls each website and saves them to a database. The fifth judging submodule determines whether a configuration file input by the user has been received. When one has been received, the reading submodule reads the configuration file input by the user; when none has been received, the reading submodule reads the default configuration file, to obtain the website ID, monitoring time period, and alarm threshold to be monitored. The second reading module reads from the database the authorization record parameters of the website corresponding to the website ID for the monitoring time period;
The first judging submodule calculates the rate of change of the proportion of status codes whose value is "completed successfully" and determines whether the rate of change exceeds the rate-of-change alarm threshold. When the rate of change does not exceed the rate-of-change alarm threshold, the second judging submodule calculates the average authorization time and determines whether the average exceeds the authorization-time alarm threshold. When the average does not exceed the authorization-time alarm threshold, the third judging submodule computes the average response time of the HTTP URL and determines whether the average response time exceeds the response-time alarm threshold. When the average response time does not exceed the response-time alarm threshold, the fourth judging submodule determines whether the abnormal data count exceeds the abnormal-data-count alarm threshold. When the abnormal data count exceeds the abnormal-data-count alarm threshold, the sending submodule sends an alarm command to the alarm module. On receiving the alarm command, the alarm module issues an alarm signal.
Please refer to Fig. 5, which is a structural diagram of a configurable crawler quality monitoring server provided by an embodiment of this application.
The server may vary considerably depending on configuration or performance, and may include one or more central processing units (CPU) 622 (for example, one or more processors), a memory 632, and one or more storage media 630 (for example, one or more mass storage devices) storing application programs 642 or data 644. The memory 632 and the storage medium 630 may provide transient or persistent storage. A program stored in the storage medium 630 may include one or more modules (not marked in the figure), and each module may include a series of instruction operations for the server. Further, the central processing unit 622 may be configured to communicate with the storage medium 630 and to execute, on the configurable crawler quality monitoring server 600, the series of instruction operations in the storage medium 630.
The configurable crawler quality monitoring server 600 may also include one or more power supplies 626, one or more wired or wireless network interfaces 650, one or more input/output interfaces 658, and/or one or more operating systems 641, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The steps of the method for configurable crawler quality monitoring described with reference to Fig. 1 and Fig. 2 are implemented by the configurable crawler quality monitoring server based on the structure shown in Fig. 5.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, server, and modules described above can be found in the corresponding processes of the foregoing method embodiments and are not described again here.
In the embodiments provided in this application, it should be understood that the disclosed system, server, and method may be implemented in other ways. For example, the system embodiments described above are merely illustrative. The division into modules is only a division by logical function, and other divisions are possible in an actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, systems, or modules, and may be electrical, mechanical, or of another form.
Modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of this application may be integrated into one processing module, may each exist physically on their own, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a function call system, a network server, or the like) to perform all or part of the steps of the methods of the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The method, system, server, and computer-readable storage medium for configurable crawler quality monitoring provided by the present application have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. It should be noted that a person of ordinary skill in the art may make several improvements and modifications to the present application without departing from its principles, and these improvements and modifications also fall within the scope of the claims of the present application.
It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or server that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or server. In the absence of further restrictions, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or server that includes the element.
Claims (10)
- 1. A method for configurable crawler quality monitoring, characterized by comprising: obtaining the authorization record parameters generated when a crawler program crawls each website, and saving the authorization record parameters to a database; reading a configuration file to obtain the website ID to be monitored, the monitoring time period, and the alarm threshold; reading from the database the authorization record parameters of the website corresponding to the website ID for the monitoring time period; determining whether the authorization record parameters exceed the alarm threshold; and if so, issuing an alarm signal.
- 2. The method according to claim 1, characterized in that, before obtaining the authorization record parameters generated when the crawler program crawls each website, the method further comprises: reading from the configuration file the field names to be verified and the verification mode; when the crawler program crawls data, verifying, using the verification mode, the data field in the data that corresponds to each field name; and marking any data field that fails verification as abnormal data.
- 3. The method according to claim 2, characterized in that the authorization record parameters include at least one of: serial number, crawler type, http url, status code, authorization time, and abnormal data quantity.
- 4. The method according to any one of claims 1-3, characterized in that determining whether the authorization record parameters exceed the alarm threshold comprises: calculating the rate of change of the proportion of status codes whose state value is "successfully completed", and determining whether the rate of change exceeds a rate-of-change alarm threshold; if not, calculating the average authorization time, and determining whether the average exceeds an authorization-time alarm threshold; if the average is below the authorization-time alarm threshold, computing the average response time of the http url, and determining whether the average response time exceeds a response-time alarm threshold; if the average response time is below the response-time alarm threshold, determining whether the abnormal data quantity exceeds an abnormal-data-quantity alarm threshold; and if the abnormal data quantity exceeds the abnormal-data-quantity alarm threshold, sending an alarm command.
- 5. The method according to claim 1, characterized in that reading the configuration file comprises: determining whether a configuration file input by a user has been received; if so, reading the configuration file input by the user; and if not, reading a default configuration file.
- 6. The method according to claim 5, characterized by further comprising: the database periodically deleting the authorization record parameters.
- 7. The method according to claim 6, characterized in that the database includes at least one of a MySQL database, an HBase database, a MongoDB database, and a Redis database.
- 8. A system for configurable crawler quality monitoring, characterized by comprising: an obtaining and saving module, configured to obtain the authorization record parameters generated when a crawler program crawls each website, and to save the authorization record parameters to a database; a first reading module, configured to read a configuration file to obtain the website ID to be monitored, the monitoring time period, and the alarm threshold; a second reading module, configured to read from the database the authorization record parameters of the website corresponding to the website ID for the monitoring time period; a judgment module, configured to determine whether the authorization record parameters exceed the alarm threshold; and an alarm module, configured to issue an alarm signal when the authorization record parameters exceed the alarm threshold.
- 9. A configurable crawler quality monitoring server, characterized by comprising: a memory for storing a computer program; and a processor for implementing, when executing the computer program, the steps of the method for configurable crawler quality monitoring according to any one of claims 1 to 7.
- 10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the method for configurable crawler quality monitoring according to any one of claims 1 to 7.
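The cascaded threshold check described in claim 4 can be sketched in Python roughly as follows. This is only an illustrative sketch under stated assumptions: the `AuthRecord` fields, the `Thresholds` class, the `first_breach` function, and the "success" status string are hypothetical names that do not appear in the claims. The rate of change is assumed to be the absolute difference between the success proportions of two consecutive monitoring windows, and the abnormal data quantity is assumed to be summed over the window; the claims specify neither.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence


@dataclass
class AuthRecord:
    """One authorization record; all field names are illustrative."""
    status: str              # e.g. "success" or "failed"
    auth_seconds: float      # time consumed by authorization
    response_seconds: float  # response time of the http url
    abnormal_count: int      # data fields that failed verification


@dataclass
class Thresholds:
    """Alarm thresholds, as they might be read from a configuration file."""
    success_rate_change: float
    auth_seconds: float
    response_seconds: float
    abnormal_count: int


def success_ratio(records: Sequence[AuthRecord]) -> float:
    """Share of records whose status is "success" (0.0 for an empty window)."""
    if not records:
        return 0.0
    return sum(r.status == "success" for r in records) / len(records)


def first_breach(previous: Sequence[AuthRecord],
                 current: Sequence[AuthRecord],
                 limits: Thresholds) -> Optional[str]:
    """Check the metrics in the order of claim 4; return the first one that
    exceeds its threshold, or None if no alarm is needed."""
    if not current:
        return None
    # Assumption: rate of change = absolute difference between two windows.
    change = abs(success_ratio(current) - success_ratio(previous))
    if change > limits.success_rate_change:
        return "success_rate_change"
    if mean([r.auth_seconds for r in current]) > limits.auth_seconds:
        return "auth_seconds"
    if mean([r.response_seconds for r in current]) > limits.response_seconds:
        return "response_seconds"
    # Assumption: abnormal data quantity is summed over the window.
    if sum(r.abnormal_count for r in current) > limits.abnormal_count:
        return "abnormal_count"
    return None
```

A caller would issue the alarm signal whenever `first_breach` returns a metric name. Returning the name rather than a bare boolean is a design choice of this sketch, so that the alarm can state which threshold was exceeded.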
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810007604.7A CN108228431A (en) | 2018-01-04 | 2018-01-04 | A kind of method and system of configurationization reptile quality-monitoring |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810007604.7A CN108228431A (en) | 2018-01-04 | 2018-01-04 | A kind of method and system of configurationization reptile quality-monitoring |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108228431A true CN108228431A (en) | 2018-06-29 |
Family ID=62642931
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810007604.7A Pending CN108228431A (en) | 2018-01-04 | 2018-01-04 | A kind of method and system of configurationization reptile quality-monitoring |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108228431A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532152A (en) * | 2019-08-05 | 2019-12-03 | 北明云智(武汉)网软有限公司 | A kind of monitoring alarm processing method and system based on Kapacitor computing engines |
CN112866007A (en) * | 2020-12-31 | 2021-05-28 | 神思旭辉医疗信息技术有限责任公司 | Equipment cloud management and control system |
CN115013297A (en) * | 2022-05-30 | 2022-09-06 | 广西信发铝电有限公司 | Power plant circulating pump abnormity monitoring method and device, electronic equipment and storage medium |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103034725A (en) * | 2012-12-19 | 2013-04-10 | 中国科学院深圳先进技术研究院 | Data acquisition, analysis and pre-warning system and method thereof |
CN103248625A (en) * | 2013-04-27 | 2013-08-14 | 北京京东尚科信息技术有限公司 | Monitoring method and system for abnormal operation of web crawler |
US8868541B2 (en) * | 2011-01-21 | 2014-10-21 | Google Inc. | Scheduling resource crawls |
CN106100902A (en) * | 2016-08-04 | 2016-11-09 | 腾讯科技(深圳)有限公司 | High in the clouds index monitoring method and apparatus |
CN106202467A (en) * | 2016-07-18 | 2016-12-07 | 浪潮集团有限公司 | Peer-to-peer network-oriented web crawler method capable of defining search key points |
CN106326447A (en) * | 2016-08-26 | 2017-01-11 | 北京量科邦信息技术有限公司 | Detection method and system of data captured by crowd sourcing network crawlers |
CN106874487A (en) * | 2017-02-21 | 2017-06-20 | 国信优易数据有限公司 | A kind of distributed reptile management system and its method |
CN107092544A (en) * | 2016-05-24 | 2017-08-25 | 口碑控股有限公司 | monitoring method and device |
CN107329969A (en) * | 2017-05-23 | 2017-11-07 | 合肥智权信息科技有限公司 | It is a kind of that system and method are updated based on the data message repeatedly verified |
2018
- 2018-01-04 CN CN201810007604.7A patent/CN108228431A/en active Pending
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8868541B2 (en) * | 2011-01-21 | 2014-10-21 | Google Inc. | Scheduling resource crawls |
CN103034725A (en) * | 2012-12-19 | 2013-04-10 | 中国科学院深圳先进技术研究院 | Data acquisition, analysis and pre-warning system and method thereof |
CN103248625A (en) * | 2013-04-27 | 2013-08-14 | 北京京东尚科信息技术有限公司 | Monitoring method and system for abnormal operation of web crawler |
CN107092544A (en) * | 2016-05-24 | 2017-08-25 | 口碑控股有限公司 | monitoring method and device |
CN106202467A (en) * | 2016-07-18 | 2016-12-07 | 浪潮集团有限公司 | Peer-to-peer network-oriented web crawler method capable of defining search key points |
CN106100902A (en) * | 2016-08-04 | 2016-11-09 | 腾讯科技(深圳)有限公司 | High in the clouds index monitoring method and apparatus |
CN106326447A (en) * | 2016-08-26 | 2017-01-11 | 北京量科邦信息技术有限公司 | Detection method and system of data captured by crowd sourcing network crawlers |
CN106874487A (en) * | 2017-02-21 | 2017-06-20 | 国信优易数据有限公司 | A kind of distributed reptile management system and its method |
CN107329969A (en) * | 2017-05-23 | 2017-11-07 | 合肥智权信息科技有限公司 | It is a kind of that system and method are updated based on the data message repeatedly verified |
Non-Patent Citations (1)
Title |
---|
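张军强 (Zhang Junqiang): "《面向多爬虫的监控***的设计与实现》" (Design and Implementation of a Monitoring *** for Multiple Crawlers), 《中国优秀硕士学位论文全文数据库》 (China Master's Theses Full-text Database) * |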
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532152A (en) * | 2019-08-05 | 2019-12-03 | 北明云智(武汉)网软有限公司 | A kind of monitoring alarm processing method and system based on Kapacitor computing engines |
CN112866007A (en) * | 2020-12-31 | 2021-05-28 | 神思旭辉医疗信息技术有限责任公司 | Equipment cloud management and control system |
CN112866007B (en) * | 2020-12-31 | 2022-11-04 | 神思旭辉医疗信息技术有限责任公司 | Equipment cloud management and control system |
CN115013297A (en) * | 2022-05-30 | 2022-09-06 | 广西信发铝电有限公司 | Power plant circulating pump abnormity monitoring method and device, electronic equipment and storage medium |
CN115013297B (en) * | 2022-05-30 | 2024-03-29 | 广西信发铝电有限公司 | Power plant circulating pump abnormality monitoring method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104852886B (en) | | The guard method of user account number and device |
US10776498B2 (en) | | End-to-end change tracking for triggering website security review |
CN110177108B (en) | | Abnormal behavior detection method, device and verification system |
US9601000B1 (en) | | Data-driven alert prioritization |
CN110471821B (en) | | Abnormality change detection method, server, and computer-readable storage medium |
CN103593609B (en) | | Trustworthy behavior recognition method and device |
KR20180118597A (en) | | Method and apparatus for identifying network access behavior, servers and storage media |
CN108228431A (en) | | A kind of method and system of configurationization reptile quality-monitoring |
CN104348809A (en) | | Network security monitoring method and system |
CN110602135B (en) | | Network attack processing method and device and electronic equipment |
CN108366012B (en) | | Social relationship establishing method and device and electronic equipment |
CN105516133A (en) | | User identity verification method, server and client |
CN107992738A (en) | | A kind of account logs in method for detecting abnormality, device and electronic equipment |
CN112751711A (en) | | Alarm information processing method and device, storage medium and electronic equipment |
CN107579968A (en) | | Video flowing address detection method, device and server |
CN107248995A (en) | | Account verification method and device |
US9225608B1 (en) | | Evaluating configuration changes based on aggregate activity level |
CN113591068A (en) | | Online login equipment management method and device and electronic equipment |
CN111740865A (en) | | Flow fluctuation trend prediction method and device and electronic equipment |
KR102213460B1 (en) | | System and method for generating software whistlist using machine run |
CN108073703A (en) | | A kind of comment information acquisition methods, device, equipment and storage medium |
CN105471938A (en) | | Server load management method and server load management device |
CN116739618B (en) | | Variable code tracing system based on block chain and data processing method |
CN114285844A (en) | | Method and device for intelligently fusing server interface, electronic equipment and storage medium |
CN109308573A (en) | | A kind of business risk control method, device and electronic equipment based on risk point |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180629 |