CN107766208B - Method, system and device for monitoring business system

Info

Publication number
CN107766208B
Authority
CN
China
Prior art keywords
log
log file
row number
abnormal
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711024783.7A
Other languages
Chinese (zh)
Other versions
CN107766208A (en)
Inventor
胡文彬
刘祥涛
赵彦晖
孙淏添
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zhongrun Sifang Information Technology Co ltd
Original Assignee
Shenzhen Zhongrun Sifang Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Zhongrun Sifang Information Technology Co ltd
Priority to CN201711024783.7A
Publication of CN107766208A
Application granted
Publication of CN107766208B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G06F11/3093 Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a method for monitoring service systems, comprising: collecting the log files of each service system; analyzing each log file to obtain the log information of each log file; judging whether all the log information is normal; and, if so, determining that all the service systems are normal, otherwise determining that the service system corresponding to the abnormal log information is abnormal. Because the method collects and analyzes the log files of each service system separately, an abnormality in any service system within the monitoring range appears in that system's own log information, so the abnormal service system can be located accurately. The invention also discloses a system and a device for monitoring service systems, which achieve the same effect.

Description

Method, system and device for monitoring business system
Technical Field
The present invention relates to the field of computers, and in particular, to a method, a system, and an apparatus for monitoring a service system.
Background
When a business system processes business data, a process often "hangs" because of unexpected conditions such as program bugs or operating-system environment problems. In a hang, the process of the business system still exists, but the business system no longer processes data. Such hangs severely affect the progress of the business, so the running condition of the business system needs to be monitored.
In the prior art, the working condition of a business system is judged by checking how an interface program is processing requests. For example, if 100 pending requests are detected, far more than the normal value, the requests are judged to be backlogged and the service system is judged to be abnormal. However, an interface program is processed by several service systems interacting with one another, so this approach monitors the combined working condition of several service systems. When an abnormality occurs, it cannot locate the abnormal service system accurately, that is, it cannot determine which service system is abnormal. Maintenance personnel must check the systems one by one, which is very inefficient.
Therefore, how to monitor service systems and accurately locate the abnormal service system when an abnormality exists is a problem to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide a method, a system and a device for monitoring a service system, which monitor the service system and accurately locate the abnormal service system when an abnormality exists.
In order to solve the above technical problem, the present invention provides a method for monitoring a service system, including:
collecting log files of all service systems;
analyzing each log file to obtain corresponding log information of each log file;
judging whether all the log information is normal or not;
if so, determining that all the service systems are normal, otherwise, determining that the service system corresponding to the abnormal log information is abnormal.
Preferably, the log information is specifically the number of lines of content in the log file (its row number) and the update time of the log file.
Preferably, the determining whether all the log information is normal specifically includes:
calculating, for each log file, the update time interval between its update time in the current monitoring period and its update time in the previous monitoring period;
calculating, for each log file, the row number increment between its row number in the current monitoring period and its row number in the previous monitoring period;
if all the update time intervals are greater than the update time interval threshold and all the row number increments are greater than the row number increment threshold, determining that all the log information is normal;
and if the update time interval of a log file is less than or equal to the update time interval threshold, or its row number increment is less than or equal to the row number increment threshold, determining that the log information of that log file is abnormal.
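The criterion above can be written compactly. The symbols are introduced here for illustration only and do not appear in the original: for log file $i$, $t_i^{\mathrm{cur}}$ and $t_i^{\mathrm{last}}$ denote its update times in the current and previous monitoring periods, $n_i^{\mathrm{cur}}$ and $n_i^{\mathrm{last}}$ its row numbers in those periods, and $T$ and $N$ the update time interval threshold and the row number increment threshold.

```latex
\Delta t_i = t_i^{\mathrm{cur}} - t_i^{\mathrm{last}}, \qquad
\Delta n_i = n_i^{\mathrm{cur}} - n_i^{\mathrm{last}}

\text{log file } i \text{ is normal} \iff \Delta t_i > T \ \text{ and } \ \Delta n_i > N
```

All the log information is normal only when this condition holds for every $i$.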
Preferably, the determining whether all the log information is normal specifically includes:
calculating, for each log file, the update time interval between its update time and the current system time;
calculating, for each log file, the row number increment between its row number in the current monitoring period and its row number in the previous monitoring period;
if all the update time intervals are less than or equal to the update time interval threshold and all the row number increments are greater than the row number increment threshold, determining that all the log information is normal;
and if the update time interval of a log file is greater than the update time interval threshold, or its row number increment is less than or equal to the row number increment threshold, determining that the log information of that log file is abnormal.
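In this alternative the update time interval is measured against the current system time, so the direction of the time comparison is reversed. As an illustrative formalization (the symbols are assumptions, not taken from the original): $t^{\mathrm{now}}$ is the current system time, $t_i^{\mathrm{cur}}$ the update time of log file $i$, $\Delta n_i$ its row number increment, and $T'$ and $N$ the two thresholds.

```latex
\Delta t_i' = t^{\mathrm{now}} - t_i^{\mathrm{cur}}

\text{log file } i \text{ is normal} \iff \Delta t_i' \le T' \ \text{ and } \ \Delta n_i > N
```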
Preferably, if the service system has an exception, the method further includes:
sending abnormal early warning information;
the abnormal early warning information comprises information of the abnormal business system.
Preferably, after the sending the abnormal early warning information, the method further includes:
and saving the sent abnormal early warning information.
The invention also provides a system for monitoring the service system, which comprises:
the log collection module is used for collecting the log files of each service system;
the log analysis module is used for analyzing each log file to obtain the log information of each log file, and for judging whether all the log information is normal; if so, all the service systems are determined to be normal, otherwise the service system corresponding to the abnormal log information is determined to be abnormal.
Preferably, further comprising:
the early warning sending module is used for sending abnormal early warning information;
the abnormal early warning information comprises information of the abnormal business system.
Preferably, further comprising:
and the database module is used for storing the sent abnormity early warning information.
The invention also provides a device for monitoring the service system, which comprises a processor, wherein the processor is used for realizing the steps of any method for monitoring the service system when executing the program stored in the memory.
The method for monitoring a service system provided by the invention collects the log files of each service system; analyzes each log file to obtain the log information of each log file; judges whether all the log information is normal; and, if so, determines that all the service systems are normal, otherwise determines that the service system corresponding to the abnormal log information is abnormal. Because the method collects and analyzes the log files of each service system separately, an abnormality in any service system within the monitoring range appears in that system's own log information. Compared with the prior art, the method can therefore accurately locate the abnormal service system when an abnormality occurs. The system and the device for monitoring a service system provided by the invention have the same advantages.
Drawings
In order to illustrate the embodiments of the present invention more clearly, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained by those skilled in the art without inventive effort.
Fig. 1 is a flowchart of a method for monitoring a service system according to an embodiment of the present invention;
Fig. 2 is a flowchart of another method for monitoring a service system according to an embodiment of the present invention;
Fig. 3 is a structural diagram of a system for monitoring a service system according to an embodiment of the present invention;
Fig. 4 is a structural diagram of another system for monitoring a service system according to an embodiment of the present invention;
Fig. 5 is a structural diagram of an apparatus for monitoring a service system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without any inventive step, are within the scope of the present invention.
The invention aims to provide a method, a system and a device for monitoring a service system, which monitor the service system and accurately locate the abnormal service system when an abnormality exists.
In order to make the technical solutions of the present invention better understood, the present invention will be further described in detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flowchart of a method for monitoring a service system according to an embodiment of the present invention, and as shown in fig. 1, the method for monitoring a service system includes the following steps:
S10: Collect the log files of each service system.
The log files of the business systems within the monitoring range are collected. Specifically, the log files to be analyzed can be collected from the log directory of each business system according to that system's configuration information, which may include the name, log directory, and log file name of the business system. Collecting according to the configuration information of each business system gathers the log files of every business system within the monitoring range and thereby allows multiple business systems to be monitored.
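The configuration information described above could be represented as follows. This is a minimal sketch assuming a Python implementation; the class name SystemConfig, its field names, and the example values are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class SystemConfig:
    """Configuration for one monitored business system (hypothetical layout)."""
    name: str           # name of the business system
    log_dir: str        # directory where the system writes its log
    log_file_name: str  # file name of the log to analyze

    def source_path(self) -> PurePosixPath:
        """Full path of the log file to collect."""
        return PurePosixPath(self.log_dir) / self.log_file_name


# Illustrative entry only; a real deployment would list one entry per
# business system in the monitoring range.
CONFIGS = [
    SystemConfig(name="syspro", log_dir="/syspro/log", log_file_name="file.log"),
]
```

Keeping one entry per business system is what lets a later abnormality be attributed to a single named system.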
It should be noted that the file attributes of the log files must not be modified during collection, and that each collected log file can be placed in its own analysis processing directory. Specifically, the log files of the business systems can be collected into the corresponding analysis processing directories with a command. The command may be, for example:
cp -p /syspro/log/file.log /logmon/data/syspro/cur_log/file.log
Here, syspro is a directory established according to the name of the corresponding service system, and the log files of different service systems correspond to different directories. cur_log is the subdirectory of the syspro directory that stores the log file of the current monitoring period; correspondingly, last_log is the subdirectory of the syspro directory that stores the log file of the previous monitoring period.
Of course, other command programs may be adopted to collect the log files of each service system to the corresponding analysis processing directories, which is not described herein again.
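As one such alternative, the collection step could be sketched in Python. The function name collect_log and its parameters are hypothetical, and moving the previous cur_log copy into last_log before each collection is an assumption: the description does not state how last_log is populated.

```python
import shutil
from pathlib import Path


def collect_log(source: Path, analysis_root: Path, system_name: str) -> Path:
    """Copy one log file into <analysis_root>/<system_name>/cur_log.

    Assumption: the previous cur_log copy, if any, is first moved to
    last_log so that two consecutive monitoring periods can be compared.
    """
    cur_dir = analysis_root / system_name / "cur_log"
    last_dir = analysis_root / system_name / "last_log"
    cur_dir.mkdir(parents=True, exist_ok=True)
    last_dir.mkdir(parents=True, exist_ok=True)

    cur_copy = cur_dir / source.name
    if cur_copy.exists():
        # Path.replace overwrites any existing last_log copy.
        cur_copy.replace(last_dir / source.name)

    # shutil.copy2 preserves the modification time, like `cp -p`.
    shutil.copy2(source, cur_copy)
    return cur_copy
```

Preserving the modification time matters because the later analysis reads the update time from the collected copy rather than from the original file.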
S11: Analyze each log file to obtain the log information of each log file.
This step analyzes all the collected log files of all the service systems and obtains the log information of each log file.
S12: Judge whether all the log information is normal.
If so, the process proceeds to step S13; if not, the process proceeds to step S14.
S13: Determine that all the service systems are normal.
If all the log information is normal, all the corresponding service systems are normal.
S14: Determine that the service system corresponding to the abnormal log information is abnormal.
If any log information is abnormal, the service system corresponding to that log information is abnormal.
In summary, the method collects the log files of each service system; analyzes each log file to obtain the log information of each log file; judges whether all the log information is normal; and, if so, determines that all the service systems are normal, otherwise determines that the service system corresponding to the abnormal log information is abnormal. Because the log files of each service system are collected and analyzed separately, an abnormality in any service system within the monitoring range appears in that system's own log information, so the abnormal service system can be located accurately.
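The flow of steps S10 to S14 can be sketched as a single loop over the monitored systems. This is only an illustration: the function name find_abnormal_systems and the two callables it receives are hypothetical placeholders for whatever collection, analysis, and judgment logic an implementation uses.

```python
from typing import Callable, List, TypeVar

LogInfo = TypeVar("LogInfo")


def find_abnormal_systems(
    systems: List[str],
    collect_and_analyze: Callable[[str], LogInfo],
    is_normal: Callable[[LogInfo], bool],
) -> List[str]:
    """Return the names of systems whose log information is abnormal.

    An empty result corresponds to step S13 (all systems normal); a
    non-empty result corresponds to step S14.
    """
    abnormal = []
    for name in systems:
        info = collect_and_analyze(name)  # steps S10 and S11
        if not is_normal(info):           # step S12
            abnormal.append(name)
    return abnormal
```

Because each system is judged on its own log information, the result names the abnormal system directly instead of reporting only that something is wrong.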
On the basis of the above embodiment, in order to more accurately judge the operating condition of each business system, as a preferred implementation, the log information is specifically the line number of the content of the log file and the update time of the log file.
Each log file is analyzed to obtain its current update time and the current number of lines of its content. It is then judged whether the update times and line counts of all the current log files are normal. If so, all the service systems are determined to be normal; otherwise, the service system corresponding to the log file with an abnormal update time or an abnormal number of lines is determined to be abnormal.
On the basis of the foregoing embodiment, in order to determine the operating status of each business system more accurately, as a preferred implementation, step S12 specifically includes:
For each log file, calculate the update time interval between its update time in the current monitoring period and its update time in the previous monitoring period.
It should be noted that the log files currently collected are the log files of the current monitoring period. The corresponding log files of the previous monitoring period can be found in the analysis processing directories, and their update times are obtained by analysis.
Taking the directory syspro as an example, the command for obtaining the update time of the log file of the current monitoring period may be, for example:
stat /logmon/data/syspro/cur_log/file.log
Taking the directory syspro as an example, the command for obtaining the update time of the log file of the previous monitoring period may be, for example:
stat /logmon/data/syspro/last_log/file.log
Subtracting the latter update time from the former gives the update time interval.
Of course, other command programs may be used to obtain the update time of the log file, which is not described herein again.
For each log file, calculate the row number increment between its number of lines in the current monitoring period and its number of lines in the previous monitoring period.
Specifically, the log files of the previous monitoring period can be found in the log collection directory, and the number of lines of content in each of them is obtained by analysis.
Taking the directory syspro as an example, the command for obtaining the number of lines of content in the log file of the current monitoring period may be, for example:
wc -l /logmon/data/syspro/cur_log/file.log
Taking the directory syspro as an example, the command for obtaining the number of lines of content in the log file of the previous monitoring period may be, for example:
wc -l /logmon/data/syspro/last_log/file.log
Subtracting the latter line count from the former gives the row number increment.
Of course, other command programs may also be selected to obtain the number of lines of the content of the log file, which is not described herein again.
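As one such alternative, both quantities can be computed in Python without calling external commands. This is a sketch under the assumption that the two copies of the log file sit in cur_log and last_log as described; the function names count_lines and compare_periods are hypothetical.

```python
import os
from typing import Tuple


def count_lines(path: str) -> int:
    """Count newline-terminated lines, the same quantity `wc -l` reports."""
    with open(path, "rb") as f:
        return sum(1 for line in f if line.endswith(b"\n"))


def compare_periods(cur_path: str, last_path: str) -> Tuple[float, int]:
    """Return (update time interval in seconds, row number increment)."""
    # st_mtime is the last modification time, i.e. the update time.
    interval = os.stat(cur_path).st_mtime - os.stat(last_path).st_mtime
    increment = count_lines(cur_path) - count_lines(last_path)
    return interval, increment
```

For the paths used in this description, the call would be compare_periods("/logmon/data/syspro/cur_log/file.log", "/logmon/data/syspro/last_log/file.log").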
If all the update time intervals are greater than the update time interval threshold and all the row number increments are greater than the row number increment threshold, all the log information is determined to be normal.
An update time interval greater than the preset update time interval threshold is normal, and a row number increment greater than the preset row number increment threshold is normal. If both the update time interval and the row number increment of a log file are normal, the log information of that log file is normal. If the update time intervals and row number increments of all the log files are normal, all the log information is determined to be normal.
If the update time interval of a log file is less than or equal to the update time interval threshold, or its row number increment is less than or equal to the row number increment threshold, the log information of that log file is determined to be abnormal.
If the update time interval of a log file is less than or equal to the update time interval threshold, the interval is too small: the update time of the log file in the current monitoring period is too close to its update time in the previous monitoring period and therefore too far from the current system time, so the log file has most likely not been updated for a long time. For example, if the preset update time interval threshold is 1 minute and the calculated update time interval of the log file is 0, the log file has not been updated.
If the row number increment of a log file is less than or equal to the row number increment threshold, for example the preset threshold is 10 rows and the calculated increment is 5 rows, the service system may not be processing business data normally, because normal processing of business data could not generate so few log lines within one monitoring period.
Judging by both the update time interval and the row number increment of the log files between the current and previous monitoring periods therefore determines more accurately whether the log information is normal.
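The judgment of this embodiment can be sketched as a filter over per-system measurements. The function name and the dictionary layout are hypothetical; the comparison operators follow the text, where a value equal to its threshold counts as abnormal.

```python
from typing import Dict, List, Tuple


def abnormal_by_period_comparison(
    deltas: Dict[str, Tuple[float, int]],
    interval_threshold: float,
    increment_threshold: int,
) -> List[str]:
    """Return the systems whose log information is abnormal.

    `deltas` maps a system name to (update time interval in seconds,
    row number increment), both measured between the previous and the
    current monitoring period.
    """
    return [
        name
        for name, (interval, increment) in deltas.items()
        if interval <= interval_threshold or increment <= increment_threshold
    ]
```

With the example thresholds from the text (1 minute and 10 rows), a system whose log shows an interval of 0 or an increment of 5 rows is reported, while one with an interval of 300 seconds and 120 new rows is not.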
On the basis of the foregoing embodiment, in order to determine the operating status of each service system more accurately and conveniently, as a preferred implementation, step S12 specifically includes:
For each log file, calculate the update time interval between its update time and the current system time.
That is, for each log file of the current monitoring period, the interval between its update time and the current system time is calculated. It should be noted that this update time interval differs in meaning from the update time interval of the preceding embodiment.
For each log file, calculate the row number increment between its number of lines in the current monitoring period and its number of lines in the previous monitoring period.
It should be noted that the log files currently collected are the log files of the current monitoring period. The corresponding log files of the previous monitoring period can be found in the analysis processing directories, and the number of lines of content in each of them is obtained.
If all the update time intervals are less than or equal to the update time interval threshold and all the row number increments are greater than the row number increment threshold, all the log information is determined to be normal.
An update time interval less than or equal to the preset update time interval threshold is normal, and a row number increment greater than the preset row number increment threshold is normal. If both the update time interval and the row number increment of a log file are normal, the log information of that log file is normal. If the update time intervals and row number increments of all the log files are normal, all the log information is determined to be normal.
It should be noted that, because the update time interval in this embodiment differs in meaning from that of the preceding embodiment, the update time interval threshold in this embodiment is independent of the threshold in the preceding embodiment. The two may or may not be equal, and the invention does not limit this.
If the update time interval of a log file is greater than the update time interval threshold, or its row number increment is less than or equal to the row number increment threshold, the log information of that log file is determined to be abnormal.
If the update time interval of a log file is greater than the update time interval threshold, the interval is too large: the update time of the log file in the current monitoring period is too far from the current system time, so the log file has most likely not been updated for a long time.
If the row number increment of a log file is less than or equal to the row number increment threshold, for example the preset threshold is 10 rows and the calculated increment is 5 rows, the service system may not be processing business data normally, because normal processing of business data could not generate so few log lines within one monitoring period.
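A sketch of this second judgment for a single log file follows. The function and parameter names are hypothetical, and the optional now argument is an addition made here so the check can be exercised with a fixed clock; by default the current system time is used.

```python
import time
from typing import Optional


def is_log_normal(
    update_time: float,
    line_increment: int,
    max_age_seconds: float,
    increment_threshold: int,
    now: Optional[float] = None,
) -> bool:
    """Judge one log file by its age relative to the current system time."""
    if now is None:
        now = time.time()
    # The update time interval in this embodiment: system time minus update time.
    age = now - update_time
    return age <= max_age_seconds and line_increment > increment_threshold
```

Unlike the preceding embodiment, only the row number increment needs the previous period's copy of the log file; the time check uses the current copy alone.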
On the basis of the above embodiment, in order to notify the operation and maintenance staff to perform maintenance in time after the abnormality occurs in the service system, as a preferred implementation manner, if the service system has an abnormality, the method further includes sending abnormality warning information, where the abnormality warning information includes information of the service system having the abnormality.
It should be noted that the information of the abnormal service system may be the name of the abnormal service system, the number of the abnormal service system, or other information, as long as the service system can be located, and the specific type of the information of the service system is not limited in the present invention.
In addition, the abnormality early warning information may be sent to the operation and maintenance personnel by email or short message, using the stored related information. It may of course also be sent in other ways, which are not described here.
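For the email case, the warning message could be assembled with Python's standard library. This is an illustrative sketch: the function name build_alert_email, the subject format, and the addresses in the usage note are hypothetical.

```python
from email.message import EmailMessage


def build_alert_email(system_name: str, detail: str,
                      sender: str, recipient: str) -> EmailMessage:
    """Build a warning email that names the abnormal business system."""
    msg = EmailMessage()
    msg["Subject"] = f"[log monitor] business system abnormal: {system_name}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(detail)
    return msg
```

A message built with build_alert_email("syspro", "no new log lines in the last period", "monitor@example.com", "ops@example.com") could then be handed to smtplib.SMTP(...).send_message(msg), with the mail server address supplied by the deployment.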
On the basis of the foregoing embodiment, in order to facilitate subsequent checking or confirming of the working condition of the business system, as a preferred implementation, after the sending of the abnormality warning information, the method further includes saving the sent abnormality warning information.
Specifically, the abnormal early warning information may be stored in the database module in the form of an early warning information table, where the early warning information table may include information such as an abnormal business system name, early warning detailed information, early warning sending time, an early warning sending form, and an early warning receiver.
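The early warning information table could look like the following sketch, which uses SQLite purely for illustration. The table name, column names, and the function save_warning are assumptions; the description names only the kinds of information the table may hold.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS early_warning (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    system_name TEXT NOT NULL,  -- name of the abnormal business system
    detail      TEXT NOT NULL,  -- early warning detailed information
    sent_at     TEXT NOT NULL,  -- early warning sending time
    send_form   TEXT NOT NULL,  -- sending form, e.g. email or short message
    receiver    TEXT NOT NULL   -- early warning receiver
)
"""


def save_warning(conn: sqlite3.Connection, system_name: str, detail: str,
                 sent_at: str, send_form: str, receiver: str) -> int:
    """Insert one sent warning and return its row id."""
    conn.execute(SCHEMA)
    cur = conn.execute(
        "INSERT INTO early_warning"
        " (system_name, detail, sent_at, send_form, receiver)"
        " VALUES (?, ?, ?, ?, ?)",
        (system_name, detail, sent_at, send_form, receiver),
    )
    conn.commit()
    return cur.lastrowid
```

Storing each sent warning as one row keeps a history that can be queried by system name or by time when the working condition of a business system is reviewed later.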
In the following, the method for monitoring a service system provided by the present invention is described in detail, taking the monitoring of one service system as an example. Fig. 2 is a flowchart of another method for monitoring a service system provided by an embodiment of the present invention. As shown in Fig. 2, the method includes:
S20: Collect the log file to be analyzed from the log directory of the service system.
S21: Obtain the update time of the log file and compare it with the update time of the log file of the previous monitoring period to obtain the file update time interval.
The log file obtained is the log file of the current monitoring period.
S22: Obtain the number of lines of the log file and compare it with the number of lines of the log file content of the previous monitoring period to obtain the row number increment of the file content.
S23: If the file update time interval does not exceed the update time interval threshold, or the row number increment of the file content does not exceed the row number increment threshold, the business system may be abnormal, and abnormality early warning information is sent.
S24: Save the abnormality early warning information for subsequent checking and confirmation.
The method for monitoring a service system provided in this embodiment monitors the working condition of one service system, and determines whether the service system is normal by comparing the update time and number of lines of the log file with those of the previous monitoring period. For multiple service systems, the working condition of each is monitored in the same way, so that the working conditions of multiple service systems can be monitored.
The foregoing describes in detail an embodiment of a method for monitoring a service system, and based on the method for monitoring a service system described in the foregoing embodiment, an embodiment of the present invention provides a system for monitoring a service system corresponding to the method. Since the embodiment of the system part corresponds to the embodiment of the method part, the embodiment of the system part is described with reference to the embodiment of the method part, and is not described in detail here.
Fig. 3 is a structural diagram of a system for monitoring a service system according to an embodiment of the present invention, as shown in fig. 3, including:
A log collection module 30, configured to collect the log files of each service system.
A log analysis module 31, configured to analyze each log file to obtain the log information of each log file, and to judge whether all the log information is normal; if so, all the service systems are determined to be normal, otherwise the service system corresponding to the abnormal log information is determined to be abnormal.
In the system for monitoring a service system provided by this embodiment, the log collection module collects the log files of each service system, and the log analysis module analyzes each log file to obtain the log information of each log file, judges whether all the log information is normal, and, if so, determines that all the service systems are normal, otherwise determines that the service system corresponding to the abnormal log information is abnormal. It can be seen that, because the log collection module collects the log files of each service system and the log analysis module analyzes the log information of each, a service system within the monitoring range that becomes abnormal is identified as the service system corresponding to the abnormal log information.
On the basis of the foregoing embodiment, in order to more accurately determine the working condition of each service system, as a preferred embodiment, the log information acquired by the log analysis module 31 is specifically the number of lines of content in the log file and the update time of the log file.
On the basis of the foregoing embodiment, in order to more accurately determine the working condition of each service system, as a preferred implementation, the log analysis module 31 specifically includes:
An obtaining submodule, configured to analyze each log file to obtain the number of lines of its content and its update time.
A calculating submodule, configured to calculate, for each log file, the update time interval between its update time in the current monitoring period and its update time in the previous monitoring period, and the row number increment between its number of lines in the current monitoring period and its number of lines in the previous monitoring period.
An analysis submodule, configured to determine that all log information is normal, and therefore that all service systems are normal, if every update time interval is greater than the update time interval threshold and every row number increment is greater than the row number increment threshold; and to determine that the log information of a log file is abnormal, and therefore that the corresponding service system is abnormal, if the update time interval of that log file is less than or equal to the update time interval threshold or its row number increment is less than or equal to the row number increment threshold.
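A minimal Python sketch of this period-to-period judgment is given below. The names `LogSnapshot` and `abnormal_by_period_delta`, and the use of seconds as the time unit, are assumptions made for illustration; the embodiment does not prescribe them.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LogSnapshot:
    update_time: float  # update time of the log file, in seconds (assumed unit)
    line_count: int     # number of lines of content in the log file


def abnormal_by_period_delta(
    previous: Dict[str, LogSnapshot],
    current: Dict[str, LogSnapshot],
    interval_threshold: float,
    increment_threshold: int,
) -> List[str]:
    """Compare each log file's current snapshot with the previous period's.

    A log file is abnormal if its update time interval is less than or equal
    to interval_threshold, or its row number increment is less than or equal
    to increment_threshold. An empty list means all log information is normal.
    """
    abnormal = []
    for system_name, now in current.items():
        before = previous[system_name]
        update_interval = now.update_time - before.update_time
        line_increment = now.line_count - before.line_count
        if update_interval <= interval_threshold or line_increment <= increment_threshold:
            abnormal.append(system_name)
    return abnormal
```

With both thresholds set to 0, a log file that was neither modified nor extended since the previous monitoring period is reported as abnormal.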
On the basis of the foregoing embodiment, as a preferred implementation manner, in order to more accurately and conveniently judge the working condition of each service system, the log analysis module 31 specifically includes:
An obtaining submodule, configured to analyze each log file to obtain the number of lines of its content and its update time.
A calculating submodule, configured to calculate, for each log file, the update time interval between its update time and the current system time, and the row number increment between its number of lines in the current monitoring period and its number of lines in the previous monitoring period.
An analysis submodule, configured to determine that all log information is normal, and therefore that all service systems are normal, if every update time interval is less than or equal to the update time interval threshold and every row number increment is greater than the row number increment threshold; and to determine that the log information of a log file is abnormal, and therefore that the corresponding service system is abnormal, if the update time interval of that log file is greater than the update time interval threshold or its row number increment is less than or equal to the row number increment threshold.
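The variant that measures the update time against the current system time might be sketched as follows. The function name `abnormal_by_staleness`, the tuple layout, the optional `now` parameter, and the use of seconds are all hypothetical choices for this illustration.

```python
import time
from typing import Dict, List, Optional, Tuple


def abnormal_by_staleness(
    logs: Dict[str, Tuple[float, int, int]],
    interval_threshold: float,
    increment_threshold: int,
    now: Optional[float] = None,
) -> List[str]:
    """Judge each log file against the current system time.

    logs maps a service system name to
    (update_time, previous_line_count, current_line_count).
    A log file is abnormal if the time since its last update is greater than
    interval_threshold, or its row number increment is less than or equal to
    increment_threshold.
    """
    current_time = time.time() if now is None else now
    abnormal = []
    for system_name, (update_time, previous_lines, current_lines) in logs.items():
        update_interval = current_time - update_time
        line_increment = current_lines - previous_lines
        if update_interval > interval_threshold or line_increment <= increment_threshold:
            abnormal.append(system_name)
    return abnormal
```

Note that the direction of the time comparison is reversed relative to the period-to-period variant: here a large interval means the log file has gone stale.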
Referring to fig. 4, fig. 4 is a block diagram of another system for monitoring a service system according to an embodiment of the present invention.
On the basis of the foregoing embodiment, in order to notify the operation and maintenance staff to perform maintenance in time after the abnormality occurs in the service system, as a preferred implementation manner, the system further includes an early warning sending module 40, configured to send abnormality early warning information, where the abnormality early warning information includes information of the service system in which the abnormality occurs.
On the basis of the above embodiment, in order to facilitate subsequent checking or confirming the working condition of the business system, as shown in fig. 4, as a preferred embodiment, the system further includes a database module 41 for storing the sent abnormal early warning information.
Preferably, the database module 41 may store not only an early warning information table, but also a service configuration table, an early warning threshold table, an early warning sending table, and the like. The service configuration table may store configuration information of the service systems and is used to specify the objects of log monitoring; it may include a service system name, a log directory, a log file name, and the like. The early warning threshold table may store early warning threshold information, including a file update time interval threshold, a file line number increment threshold, and the like. The early warning sending table may store information related to the sending of early warnings and is used to specify the early warning receivers; it includes information such as an early warning sending type, an early warning receiver, a receiver mailbox, and a receiver mobile phone number.
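For illustration, one row of each of these tables could be modelled with Python dataclasses as below. The class and field names, and the choice of seconds for the time threshold, are assumptions; the embodiment lists only the kinds of information each table may hold.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class ServiceConfig:
    """One row of the service configuration table (names are illustrative)."""
    system_name: str
    log_directory: str
    log_file_name: str

    def log_path(self) -> PurePosixPath:
        # The monitored log file is the file name inside the log directory.
        return PurePosixPath(self.log_directory) / self.log_file_name


@dataclass(frozen=True)
class WarningThreshold:
    """One row of the early warning threshold table."""
    update_interval_threshold: float  # seconds (assumed unit)
    line_increment_threshold: int


@dataclass(frozen=True)
class WarningRecipient:
    """One row of the early warning sending table."""
    sending_type: str
    receiver: str
    mailbox: str
    mobile_number: str
```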
The foregoing describes in detail an embodiment of a method for monitoring a service system, and based on the method for monitoring a service system described in the foregoing embodiment, an embodiment of the present invention further provides a device for monitoring a service system corresponding to the method. Since the embodiment of the apparatus portion and the embodiment of the method portion correspond to each other, the embodiment of the apparatus portion is described with reference to the embodiment of the method portion, and is not described in detail here.
Fig. 5 is a structural diagram of an apparatus for monitoring a service system according to an embodiment of the present invention. As shown in fig. 5, the apparatus includes:
a memory 50 and a processor 51.
A memory 50 for storing a computer program.
The processor 51, when executing the computer program stored in the memory 50, may implement the following steps:
Collecting the log files of each service system.
Analyzing each log file to obtain the log information of the corresponding log file.
Judging whether all of the log information is normal.
If so, determining that all service systems are normal; otherwise, determining that the service system corresponding to the abnormal log information is abnormal.
In some embodiments of the present invention, the processor 51 may be further configured to execute the computer program in the memory 50 to implement the following steps:
Analyzing each log file to obtain the number of lines of its content and its update time.
Judging whether the number of lines and the update time of every log file are normal.
If so, determining that all service systems are normal; otherwise, determining that the service system corresponding to the log file whose number of lines or update time is abnormal is itself abnormal.
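One way the number of lines and the update time might be obtained from a log file is sketched below in Python. The function name `read_log_info` is hypothetical, and treating the file system's last-modification time as the "update time" is an assumption of this sketch rather than something the embodiment specifies.

```python
import os
from typing import Tuple


def read_log_info(path: str) -> Tuple[int, float]:
    """Return (line_count, update_time) for the log file at path.

    The file is read in binary mode so that the count does not depend on the
    log file's text encoding. The update time is the last-modification time
    reported by the file system, in seconds since the epoch.
    """
    with open(path, "rb") as handle:
        line_count = sum(1 for _ in handle)
    update_time = os.path.getmtime(path)
    return line_count, update_time
```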
In some embodiments of the present invention, the processor 51 may be further configured to execute the computer program in the memory 50 to implement the following steps:
Calculating, for each log file, the update time interval between its update time in the current monitoring period and its update time in the previous monitoring period.
Calculating, for each log file, the row number increment between its number of lines in the current monitoring period and its number of lines in the previous monitoring period.
If every update time interval is greater than the update time interval threshold and every row number increment is greater than the row number increment threshold, determining that all log information is normal.
If the update time interval of a log file is less than or equal to the update time interval threshold or its row number increment is less than or equal to the row number increment threshold, determining that the log information of that log file is abnormal.
In some embodiments of the present invention, the processor 51 may be further configured to execute the computer program in the memory 50 to implement the following steps:
Calculating, for each log file, the update time interval between its update time and the current system time.
Calculating, for each log file, the row number increment between its number of lines in the current monitoring period and its number of lines in the previous monitoring period.
If every update time interval is less than or equal to the update time interval threshold and every row number increment is greater than the row number increment threshold, determining that all log information is normal.
If the update time interval of a log file is greater than the update time interval threshold or its row number increment is less than or equal to the row number increment threshold, determining that the log information of that log file is abnormal.
In some embodiments of the present invention, the processor 51 may be further configured to execute the computer program in the memory 50 to implement the following steps:
Sending abnormal early warning information.
The abnormal early warning information includes information about the abnormal service system.
In some embodiments of the present invention, the processor 51 may be further configured to execute the computer program in the memory 50 to implement the following steps:
Saving the sent abnormal early warning information.
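The sending and saving steps could be sketched together as follows. This is a hedged illustration: the function name `send_and_save_warnings`, the `early_warning` table layout, the message wording, and the use of SQLite are all assumptions, and the actual delivery channel is left to a caller-supplied `send` callable.

```python
import sqlite3
from typing import Callable, Iterable, List


def send_and_save_warnings(
    abnormal_systems: Iterable[str],
    send: Callable[[str], None],
    connection: sqlite3.Connection,
) -> List[str]:
    """Send one warning per abnormal service system and store what was sent."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS early_warning ("
        "system_name TEXT NOT NULL, message TEXT NOT NULL)"
    )
    messages = []
    for system_name in abnormal_systems:
        # The warning carries information about the abnormal service system.
        message = f"Service system '{system_name}' is abnormal"
        send(message)
        connection.execute(
            "INSERT INTO early_warning (system_name, message) VALUES (?, ?)",
            (system_name, message),
        )
        messages.append(message)
    connection.commit()
    return messages
```

Storing each message only after it has been handed to `send` keeps the saved table a record of warnings that were actually sent, which matches the purpose of later checking or confirmation.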
In the apparatus for monitoring a service system provided in this embodiment, when the processor executes the computer program in the memory, it collects the log files of each service system; analyzes each log file to obtain the log information of the corresponding log file; and judges whether all of the log information is normal; if so, it determines that all service systems are normal; otherwise, it determines that the service system corresponding to the abnormal log information is abnormal. The apparatus provided by this embodiment can therefore accurately locate an abnormal service system when one exists.
The method, system and apparatus for monitoring a service system provided by the present invention have been described in detail above. The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the others, and for the parts that are the same or similar the embodiments may be referred to one another.
It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
It is further noted that, in the present specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (7)

1. A method of monitoring a business system, comprising:
collecting log files of all service systems;
analyzing each log file to obtain corresponding log information of each log file;
judging whether all the log information is normal or not;
if so, determining that all the service systems are normal, otherwise, determining that the service system corresponding to the abnormal log information is abnormal;
the log information is specifically the line number of the content of the log file and the update time of the log file;
the specific steps of judging whether all the log information is normal are as follows:
calculating each updating time interval of the updating time of each log file in the current monitoring period and the updating time of each log file in the corresponding previous monitoring period;
calculating the increment of each row number between the row number of each log file in the current monitoring period and the row number of each log file in the corresponding previous monitoring period;
if all the updating time intervals are larger than the updating time interval threshold value and all the row number increments are larger than the row number increment threshold value, determining that all the log information is normal;
if the update time interval of the log file is less than or equal to the update time interval threshold or the row number increment is less than or equal to the row number increment threshold, determining that the log information of the log file is abnormal;
or the judging whether all the log information is normal specifically includes:
calculating each updating time interval of each updating time and the corresponding current system time;
calculating the increment of each row number between the row number of each log file in the current monitoring period and the row number of each log file in the corresponding previous monitoring period;
if all the updating time intervals are smaller than or equal to the updating time interval threshold value and all the row number increments are larger than the row number increment threshold value, determining that all the log information is normal;
and if the updating time interval of the log file is larger than the updating time interval threshold or the row number increment is smaller than or equal to the row number increment threshold, determining that the log information of the log file is abnormal.
2. The method of claim 1, wherein, if an abnormal service system exists, the method further comprises:
sending abnormal early warning information;
the abnormal early warning information comprises information of the abnormal business system.
3. The method of claim 2, further comprising, after the sending the abnormal pre-warning information:
saving the sent abnormal early warning information.
4. A system for monitoring a business system, comprising:
the log acquisition module is used for collecting log files of all the service systems;
the log analysis module is used for analyzing each log file to obtain corresponding log information of each log file, and for judging whether all the log information is normal; if so, all the service systems are determined to be normal; otherwise, the service system corresponding to the abnormal log information is determined to be abnormal;
the log information is specifically the line number of the content of the log file and the update time of the log file;
the specific steps of judging whether all the log information is normal are as follows:
calculating each updating time interval of the updating time of each log file in the current monitoring period and the updating time of each log file in the corresponding previous monitoring period;
calculating the increment of each row number between the row number of each log file in the current monitoring period and the row number of each log file in the corresponding previous monitoring period;
if all the updating time intervals are larger than the updating time interval threshold value and all the row number increments are larger than the row number increment threshold value, determining that all the log information is normal;
if the update time interval of the log file is less than or equal to the update time interval threshold or the row number increment is less than or equal to the row number increment threshold, determining that the log information of the log file is abnormal;
or the judging whether all the log information is normal specifically includes:
calculating each updating time interval of each updating time and the corresponding current system time;
calculating the increment of each row number between the row number of each log file in the current monitoring period and the row number of each log file in the corresponding previous monitoring period;
if all the updating time intervals are smaller than or equal to the updating time interval threshold value and all the row number increments are larger than the row number increment threshold value, determining that all the log information is normal;
and if the updating time interval of the log file is larger than the updating time interval threshold or the row number increment is smaller than or equal to the row number increment threshold, determining that the log information of the log file is abnormal.
5. The system of claim 4, further comprising:
the early warning sending module is used for sending abnormal early warning information;
the abnormal early warning information comprises information of the abnormal business system.
6. The system of claim 5, further comprising:
the database module is used for storing the sent abnormal early warning information.
7. An apparatus for monitoring a business system, comprising a processor for implementing the steps of the method for monitoring a business system according to any one of claims 1 to 3 when executing a program stored in a memory.
CN201711024783.7A 2017-10-27 2017-10-27 Method, system and device for monitoring business system Active CN107766208B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711024783.7A CN107766208B (en) 2017-10-27 2017-10-27 Method, system and device for monitoring business system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711024783.7A CN107766208B (en) 2017-10-27 2017-10-27 Method, system and device for monitoring business system

Publications (2)

Publication Number Publication Date
CN107766208A CN107766208A (en) 2018-03-06
CN107766208B true CN107766208B (en) 2021-01-05

Family

ID=61271889

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711024783.7A Active CN107766208B (en) 2017-10-27 2017-10-27 Method, system and device for monitoring business system

Country Status (1)

Country Link
CN (1) CN107766208B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110363640B (en) * 2019-07-08 2023-07-25 中国平安人寿保险股份有限公司 Monitoring system service operation method and device, storage medium and server
CN110704225A (en) * 2019-09-18 2020-01-17 平安科技(深圳)有限公司 Monitoring method, monitoring device, electronic equipment and computer readable storage medium
CN110908885B (en) * 2019-11-21 2022-08-05 苏州浪潮智能科技有限公司 Log collection method and device and related components
CN113138891A (en) * 2020-01-19 2021-07-20 上海臻客信息技术服务有限公司 Service monitoring system based on log
CN111404735A (en) * 2020-03-09 2020-07-10 北京思特奇信息技术股份有限公司 Distributed application monitoring method and monitoring system
CN111736579B (en) * 2020-08-26 2020-12-08 北京安帝科技有限公司 Industrial control equipment safety detection method based on log inquiry and retention
CN114238018B (en) * 2021-12-17 2023-03-24 天翼爱音乐文化科技有限公司 Method, system and device for detecting integrity of log collection file and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102082704A (en) * 2009-11-30 2011-06-01 ***通信集团河北有限公司 Safety monitoring method and system
CN103761165A (en) * 2014-01-15 2014-04-30 北京奇虎科技有限公司 Log backup method and log backup device
CN105721187A (en) * 2014-12-03 2016-06-29 ***通信集团江苏有限公司 Service fault diagnosis method and apparatus
CN106339303A (en) * 2016-08-23 2017-01-18 浪潮电子信息产业股份有限公司 Running log abnormity analysis method
CN106598800A (en) * 2015-10-14 2017-04-26 中兴通讯股份有限公司 Hardware fault analysis system and method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120259675A1 (en) * 2011-04-08 2012-10-11 Roehrs Louis F System and Method for a Retail Collaboration Network Platform

Also Published As

Publication number Publication date
CN107766208A (en) 2018-03-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant