CN112612671A - System monitoring method, device, equipment and storage medium - Google Patents

System monitoring method, device, equipment and storage medium

Info

Publication number
CN112612671A
Authority
CN
China
Prior art keywords
index data
maintenance index
category
maintenance
component
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011493657.8A
Other languages
Chinese (zh)
Inventor
梁永富
熊刚
江旻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN202011493657.8A
Publication of CN112612671A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Game Theory and Decision Science (AREA)
  • Technology Law (AREA)
  • Tourism & Hospitality (AREA)
  • Operations Research (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The method acquires real-time operation and maintenance index data of a system to be monitored and determines a target alarm rule according to the index category of the system's historical operation and maintenance index data. A technician therefore does not need to set an alarm value from historical experience; because the rule is determined from the index category of the system's own historical data, the target alarm rule is consistent with the operation and maintenance index data of the system. Whether the real-time operation and maintenance index data is abnormal is then judged according to the target alarm rule, and an abnormality alarm is raised if it is, which improves system monitoring accuracy. Moreover, when the system to be monitored has a large amount of operation and maintenance index data, the alarm rule can be determined quickly from the index category of the historical data, which shortens the system monitoring period, improves system monitoring efficiency, and makes the approach suitable for practical use.

Description

System monitoring method, device, equipment and storage medium
Technical Field
The present application relates to a system monitoring technology of financial technology (Fintech), and in particular, to a system monitoring method, apparatus, device, and storage medium.
Background
With the development of computer technology, more and more technologies are applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology (Fintech). System monitoring technology is no exception, but the security and real-time requirements of the financial industry place higher demands on it.
Distributed systems are used more and more widely in the financial industry. For example, with the explosive growth of financial data, conventional storage systems can no longer meet storage requirements because of insufficient disk space, limited processing capacity and the like, and distributed storage systems relieve this bottleneck to a certain extent. In the existing approach to system monitoring, a technician sets an alarm value based on historical experience; the operation and maintenance index data monitored in real time is then compared with that alarm value, and an abnormality is reported when an index does not conform to it.
However, because the alarm value is set based on the technician's historical experience, it is easily skewed by subjective factors, so the accuracy of monitoring based on that value is low. Moreover, when the system to be monitored has a large amount of operation and maintenance index data, setting alarm values manually takes technicians a great deal of time, which lengthens the monitoring period and reduces monitoring efficiency.
Disclosure of Invention
In order to solve the problems in the prior art, the present application provides a system monitoring method, apparatus, device and storage medium.
In a first aspect, an embodiment of the present application provides a system monitoring method, where the method includes:
acquiring real-time operation and maintenance index data of a system to be monitored;
determining a target alarm rule according to the index category of historical operation and maintenance index data of the system to be monitored, wherein the historical operation and maintenance index data is the operation and maintenance index data of the system to be monitored in a first preset time period before the real-time operation and maintenance index data is obtained;
judging whether the real-time operation and maintenance index data is abnormal or not according to the target alarm rule;
and if the real-time operation and maintenance index data is abnormal, raising an abnormality alarm.
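The four steps above can be sketched as a single monitoring pass. This is only an illustrative outline, not the claimed implementation: `monitor_once`, `classify`, `rules_by_category` and `alarm` are hypothetical names, and the classifier and the per-category rule builders are passed in as plain callables.

```python
from typing import Callable, Dict, List

# Hypothetical rule type: returns True when the latest value is abnormal.
AlarmRule = Callable[[float], bool]


def monitor_once(
    realtime_value: float,
    history: List[float],
    classify: Callable[[List[float]], str],
    rules_by_category: Dict[str, Callable[[List[float]], AlarmRule]],
    alarm: Callable[[float], None],
) -> bool:
    """Run one monitoring pass; return True if an alarm was raised."""
    category = classify(history)                 # index category of the historical data
    rule = rules_by_category[category](history)  # target alarm rule for that category
    if rule(realtime_value):                     # judge the real-time data
        alarm(realtime_value)                    # raise the abnormality alarm
        return True
    return False
```

Keeping the classifier and the rule builders as parameters means the same loop can serve every index category without per-index thresholds being typed in by hand.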
In a possible implementation manner, before determining a target alarm rule according to an index category of historical operation and maintenance index data of the system to be monitored, the method further includes:
performing time sequence decomposition on the historical operation and maintenance index data to obtain a trend component, a periodic component and a stable component of the historical operation and maintenance index data;
calculating a first similarity of the historical operation and maintenance index data and the trend component, a second similarity of the historical operation and maintenance index data and the periodic component, and a third similarity of the historical operation and maintenance index data and the stable component;
and determining the index category of the historical operation and maintenance index data according to the first similarity, the second similarity and the third similarity.
In a possible implementation manner, the performing time sequence decomposition on the historical operation and maintenance index data to obtain a trend component, a periodic component, and a stable component of the historical operation and maintenance index data includes:
performing detrending processing on the historical operation and maintenance index data to obtain a detrended sequence, and performing locally weighted regression on each subsequence of the detrended sequence to obtain a temporary periodic sequence;
low-pass filtering the temporary periodic sequence to obtain a low-frequency time sequence, and detrending the temporary periodic sequence according to the low-frequency time sequence to obtain the periodic component;
removing the periodic component from the historical operation and maintenance index data, and performing locally weighted regression on the resulting data to obtain the trend component;
and obtaining the stable component according to the historical operation and maintenance index data, the periodic component and the trend component.
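The decomposition steps above follow the shape of a seasonal-trend decomposition. The sketch below mirrors the order of the steps but is deliberately simplified: it substitutes per-phase means and a centered moving average for the locally weighted regression and the low-pass filter, and the names `decompose` and `moving_average`, the fixed iteration count and the `period` parameter are assumptions made for illustration.

```python
from typing import List, Tuple


def moving_average(xs: List[float], window: int) -> List[float]:
    """Centered moving average; the window is truncated at both ends."""
    half = window // 2
    out = []
    for i in range(len(xs)):
        lo, hi = max(0, i - half), min(len(xs), i + half + 1)
        out.append(sum(xs[lo:hi]) / (hi - lo))
    return out


def decompose(
    ys: List[float], period: int, iterations: int = 2
) -> Tuple[List[float], List[float], List[float]]:
    """Return (trend, periodic, stable) with ys = trend + periodic + stable."""
    n = len(ys)
    if period < 1 or n < period:
        raise ValueError("need at least one full period of data")
    trend = [0.0] * n
    periodic = [0.0] * n
    for _ in range(iterations):
        # 1. Detrend the series with the current trend estimate.
        detrended = [y - t for y, t in zip(ys, trend)]
        # 2. Smooth each subsequence (same phase in every cycle).
        #    Stand-in for locally weighted regression: the subsequence mean.
        temp = [0.0] * n
        for phase in range(period):
            sub = detrended[phase::period]
            mean = sum(sub) / len(sub)
            for i in range(phase, n, period):
                temp[i] = mean
        # 3. Low-pass filter the temporary periodic sequence and subtract it.
        low = moving_average(temp, period)
        periodic = [c - l for c, l in zip(temp, low)]
        # 4. Remove the periodic component and smooth what is left.
        deseasonalized = [y - s for y, s in zip(ys, periodic)]
        trend = moving_average(deseasonalized, period)
    # 5. Whatever is left over is the stable component.
    stable = [y - s - t for y, s, t in zip(ys, periodic, trend)]
    return trend, periodic, stable
```

A production system would more likely use a real locally weighted regression for steps 2 and 4; the moving averages here only keep the example dependency-free.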
In a possible implementation manner, before the obtaining the stable component according to the historical operation and maintenance index data, the periodic component, and the trend component, the method further includes:
judging whether the periodic component and the trend component converge;
the obtaining the stable component according to the historical operation and maintenance index data, the periodic component and the trend component includes:
and if the periodic component and the trend component are converged, executing the step of obtaining the stable component according to the historical operation and maintenance index data, the periodic component and the trend component.
In a possible implementation manner, after obtaining the stable component according to the historical operation and maintenance index data, the periodic component, and the trend component, the method further includes:
determining a target robust weight according to the stable component;
the performing locally weighted regression on each subsequence of the detrended sequence includes:
performing locally weighted regression on each subsequence of the detrended sequence according to the target robust weight;
the performing locally weighted regression on the historical operation and maintenance index data from which the periodic component has been removed includes:
performing locally weighted regression on that data according to the target robust weight.
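The text does not give a formula for the target robust weight. One common choice in robust seasonal-trend decomposition is the bisquare weight computed from the leftover (stable) component, sketched below under that assumption; `robust_weights` and the scale factor of 6 times the median absolute residual are taken from that general technique, not from this application.

```python
from statistics import median
from typing import List


def robust_weights(stable: List[float]) -> List[float]:
    """Bisquare weights: points with large residuals get weights near 0."""
    h = 6.0 * median(abs(r) for r in stable)
    if h == 0:
        # All residuals are zero (or nearly all): nothing to down-weight.
        return [1.0] * len(stable)
    weights = []
    for r in stable:
        u = abs(r) / h
        weights.append((1.0 - u * u) ** 2 if u < 1.0 else 0.0)
    return weights
```

Feeding these weights back into the two regression steps makes the next pass largely ignore outliers, so a single spike does not distort the periodic or trend estimate.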
In a possible implementation manner, the calculating a first similarity between the historical operation and maintenance index data and the trend component includes:
constructing a first matrix to be processed according to the historical operation and maintenance index data and the trend component;
searching all warping paths from a first matrix point to a last matrix point in the first matrix to be processed;
obtaining the path with the minimum warping cost among all the warping paths;
and determining the first similarity between the historical operation and maintenance index data and the trend component according to the path with the minimum warping cost.
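The matrix-and-minimum-cost-path procedure above corresponds to dynamic time warping (DTW). The sketch below finds the minimum cost with dynamic programming rather than enumerating every path. How the cost is turned into a similarity is not specified in the text, so the mapping `1 / (1 + cost)` in `similarity` is an assumption, as are both function names.

```python
from typing import List


def dtw_cost(a: List[float], b: List[float]) -> float:
    """Minimum cumulative cost of a path from the first to the last matrix point."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: cheapest way to align a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance at this matrix point
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def similarity(a: List[float], b: List[float]) -> float:
    """Assumed mapping: lower cost means higher similarity, in (0, 1]."""
    return 1.0 / (1.0 + dtw_cost(a, b))
```

The second and third similarities would be obtained the same way, with the periodic and stable components in place of the trend component.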
In a possible implementation manner, the determining an index category of the historical operation and maintenance index data according to the first similarity, the second similarity, and the third similarity includes:
comparing the first similarity, the second similarity, and the third similarity;
if the first similarity is the largest, determining that the index category of the historical operation and maintenance index data is the trend category;
if the second similarity is the largest, determining that the index category of the historical operation and maintenance index data is the periodic category;
and if the third similarity is the largest, determining that the index category of the historical operation and maintenance index data is the stable category.
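The comparison above amounts to picking the label with the largest similarity. A minimal sketch follows; the function name and the label strings are illustrative, and the tie-breaking order (trend, then periodic, then stable) is an assumption because the text does not say what happens when two similarities are equal.

```python
def classify_by_similarity(trend_sim: float, periodic_sim: float, stable_sim: float) -> str:
    """Return the index category whose component is most similar to the data."""
    scores = {"trend": trend_sim, "periodic": periodic_sim, "stable": stable_sim}
    # max() keeps the first key on ties, so ties resolve in insertion order.
    return max(scores, key=scores.get)
```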
In a possible implementation manner, the determining a target alarm rule according to an index category of historical operation and maintenance index data of the system to be monitored includes:
acquiring the corresponding relation between the index category of prestored operation and maintenance index data and an alarm rule;
and determining the target alarm rule corresponding to the index category of the historical operation and maintenance index data according to the corresponding relation.
In a possible implementation manner, the obtaining a corresponding relationship between an index category of the pre-stored operation and maintenance index data and an alarm rule includes:
for the operation and maintenance index data of a stable category, obtaining a stable component and a standard deviation of the operation and maintenance index data of a historical stable category, wherein the operation and maintenance index data of the historical stable category is obtained before the operation and maintenance index data of the stable category is obtained;
and acquiring an alarm rule corresponding to the operation and maintenance index data of the stable category according to the stable component and the standard deviation.
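For a stable index, a rule built from a baseline level and a standard deviation can be sketched as a band around the baseline. The version below is an assumption-laden illustration: it uses the mean of the historical values as the stable level, the population standard deviation, and a multiplier `k = 3`, none of which are fixed by the text; `make_stable_rule` is a hypothetical name.

```python
from statistics import mean, pstdev
from typing import Callable, List


def make_stable_rule(history: List[float], k: float = 3.0) -> Callable[[float], bool]:
    """Build a rule that flags values more than k standard deviations from the baseline."""
    baseline = mean(history)   # assumed stand-in for the stable level
    sigma = pstdev(history)    # standard deviation of the historical data

    def is_abnormal(value: float) -> bool:
        return abs(value - baseline) > k * sigma

    return is_abnormal
```

For example, with a history of 10, 10, 10, 10, 12, 8 the band is roughly 10 ± 3.46, so 13 is accepted and 14 is flagged.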
In a possible implementation manner, the obtaining a corresponding relationship between an index category of the pre-stored operation and maintenance index data and an alarm rule includes:
for the operation and maintenance index data of the periodic category, taking a second preset time period as a cycle, comparing the operation and maintenance index data of the periodic category with the operation and maintenance index data of the historical periodic category to determine a data growth rate, wherein the operation and maintenance index data of the historical periodic category is acquired in the cycle before the operation and maintenance index data of the periodic category is acquired;
determining a predicted value of the operation and maintenance index data of the periodic category according to a preset Long Short-Term Memory network (LSTM) and a mean square error of the operation and maintenance index data of the historical periodic category;
and acquiring an alarm rule corresponding to the operation and maintenance index data of the periodic category according to the data growth rate and the predicted value.
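The rule for this category combines a cycle-over-cycle growth rate with a model prediction. The sketch below does not train an LSTM; it takes the predicted value and the mean square error as inputs. The thresholds `max_growth` and `k`, the choice to require both conditions at once, and the function names are assumptions made for illustration only.

```python
import math


def growth_rate(current: float, previous: float) -> float:
    """Relative change against the value at the same point in the previous cycle."""
    if previous == 0:
        return 0.0 if current == 0 else math.copysign(math.inf, current)
    return (current - previous) / abs(previous)


def is_periodic_abnormal(
    current: float,
    same_point_last_cycle: float,
    predicted: float,
    mse: float,
    max_growth: float = 0.5,
    k: float = 3.0,
) -> bool:
    """Flag the value when it both jumps versus last cycle and misses the prediction."""
    rate = abs(growth_rate(current, same_point_last_cycle))
    deviation = abs(current - predicted)
    # Assumed combination: both signals must fire before the value is flagged.
    return rate > max_growth and deviation > k * math.sqrt(mse)
```

Requiring both signals is one way to cut false alarms on data with a regular daily or weekly swing; an implementation could equally alarm on either signal alone.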
In a possible implementation manner, the obtaining a corresponding relationship between an index category of the pre-stored operation and maintenance index data and an alarm rule includes:
for the operation and maintenance index data of the trend category, comparing the operation and maintenance index data of the trend category with the operation and maintenance index data of the historical trend category to determine a period-over-period growth rate, wherein the operation and maintenance index data of the historical trend category is acquired before the operation and maintenance index data of the trend category is acquired;
determining a binary search tree according to the operation and maintenance index data of the historical trend category, and determining the path length of each item of the operation and maintenance index data of the trend category according to the binary search tree;
and acquiring an alarm rule corresponding to the operation and maintenance index data of the trend category according to the period-over-period growth rate and the path lengths.
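Building binary trees from historical data and scoring a value by its path length resembles the isolation-tree idea, where unusual values are separated after fewer random splits. The sketch below is a simplified one-dimensional version under that reading: it uses the full history for every tree, omits the usual leaf-size correction, and the names, tree count, depth cap and seed are all assumptions.

```python
import random
from typing import List, Optional


class IsoNode:
    """A tree node; split is None for a leaf."""

    def __init__(self, split: Optional[float] = None, left=None, right=None):
        self.split, self.left, self.right = split, left, right


def build_tree(values: List[float], rng: random.Random,
               depth: int = 0, max_depth: int = 12) -> IsoNode:
    # Stop when the node cannot be split any further.
    if len(values) <= 1 or depth >= max_depth or min(values) == max(values):
        return IsoNode()
    split = rng.uniform(min(values), max(values))
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    return IsoNode(split,
                   build_tree(left, rng, depth + 1, max_depth),
                   build_tree(right, rng, depth + 1, max_depth))


def path_length(node: IsoNode, value: float) -> int:
    """Number of splits the value passes before reaching a leaf."""
    depth = 0
    while node.split is not None:
        node = node.left if value < node.split else node.right
        depth += 1
    return depth


def mean_path_length(history: List[float], value: float,
                     trees: int = 100, seed: int = 0) -> float:
    """Average path length of the value over several random trees."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trees):
        total += path_length(build_tree(history, rng), value)
    return total / trees
```

A short mean path length means the value is isolated after few splits, which is the kind of signal an alarm rule could threshold alongside the growth rate.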
In a possible implementation manner, before the performing the exception warning, the method further includes:
sending abnormal data in the real-time operation and maintenance index data to a preset terminal, wherein the abnormal data is used to instruct the preset terminal to review the abnormal data;
the raising an abnormality alarm includes:
and if review-approval information sent by the preset terminal is received, raising the abnormality alarm.
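The review step above acts as a gate in front of the alarm. A minimal sketch follows, assuming the exchange with the preset terminal can be modeled as a blocking call that returns whether the reviewer approved; `send_for_review` and `raise_alarm` are hypothetical callables.

```python
from typing import Callable, List


def alarm_after_review(
    abnormal_points: List[float],
    send_for_review: Callable[[List[float]], bool],
    raise_alarm: Callable[[List[float]], None],
) -> bool:
    """Send abnormal data for review and alarm only if the review approves it."""
    approved = send_for_review(abnormal_points)  # assumed to return True on approval
    if approved:
        raise_alarm(abnormal_points)
    return approved
```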
In a second aspect, an embodiment of the present application provides a system monitoring apparatus, where the apparatus includes:
the acquisition module is used for acquiring real-time operation and maintenance index data of the system to be monitored;
the determining module is used for determining a target alarm rule according to the index category of the historical operation and maintenance index data of the system to be monitored, wherein the historical operation and maintenance index data is the operation and maintenance index data of the system to be monitored in a first preset time period before the real-time operation and maintenance index data is acquired;
the judging module is used for judging whether the real-time operation and maintenance index data is abnormal or not according to the target alarm rule;
and the alarm module is used for raising an abnormality alarm if the real-time operation and maintenance index data is abnormal.
In a possible implementation manner, the determining module is further configured to:
performing time sequence decomposition on the historical operation and maintenance index data to obtain a trend component, a periodic component and a stable component of the historical operation and maintenance index data;
calculating a first similarity of the historical operation and maintenance index data and the trend component, a second similarity of the historical operation and maintenance index data and the periodic component, and a third similarity of the historical operation and maintenance index data and the stable component;
and determining the index category of the historical operation and maintenance index data according to the first similarity, the second similarity and the third similarity.
In a possible implementation manner, the determining module is specifically configured to:
performing detrending processing on the historical operation and maintenance index data to obtain a detrended sequence, and performing locally weighted regression on each subsequence of the detrended sequence to obtain a temporary periodic sequence;
low-pass filtering the temporary periodic sequence to obtain a low-frequency time sequence, and detrending the temporary periodic sequence according to the low-frequency time sequence to obtain the periodic component;
removing the periodic component from the historical operation and maintenance index data, and performing locally weighted regression on the resulting data to obtain the trend component;
and obtaining the stable component according to the historical operation and maintenance index data, the periodic component and the trend component.
In a possible implementation manner, the determining module is specifically configured to:
judging whether the periodic component and the trend component converge;
and if the periodic component and the trend component are converged, executing the step of obtaining the stable component according to the historical operation and maintenance index data, the periodic component and the trend component.
In a possible implementation manner, the determining module is further configured to:
determining a target robust weight according to the stable component;
and performing locally weighted regression on each subsequence of the detrended sequence according to the target robust weight, and performing locally weighted regression, according to the target robust weight, on the historical operation and maintenance index data from which the periodic component has been removed.
In a possible implementation manner, the determining module is specifically configured to:
constructing a first matrix to be processed according to the historical operation and maintenance index data and the trend component;
searching all warping paths from a first matrix point to a last matrix point in the first matrix to be processed;
obtaining the path with the minimum warping cost among all the warping paths;
and determining the first similarity between the historical operation and maintenance index data and the trend component according to the path with the minimum warping cost.
In a possible implementation manner, the determining module is specifically configured to:
comparing the first similarity, the second similarity, and the third similarity;
if the first similarity is the largest, determining that the index category of the historical operation and maintenance index data is the trend category;
if the second similarity is the largest, determining that the index category of the historical operation and maintenance index data is the periodic category;
and if the third similarity is the largest, determining that the index category of the historical operation and maintenance index data is the stable category.
In a possible implementation manner, the determining module is specifically configured to:
acquiring the corresponding relation between the index category of prestored operation and maintenance index data and an alarm rule;
and determining a target alarm rule corresponding to the index category of the historical operation and maintenance index data according to the corresponding relation.
In a possible implementation manner, the determining module is specifically configured to:
for the operation and maintenance index data of the stable category, obtaining a stable component and a standard deviation of the operation and maintenance index data of the stable category;
and acquiring an alarm rule corresponding to the operation and maintenance index data of the stable category according to the stable component and the standard deviation.
In a possible implementation manner, the determining module is specifically configured to:
for the operation and maintenance index data of the periodic category, taking a second preset time period as a cycle, comparing the operation and maintenance index data of the periodic category with the operation and maintenance index data of the historical periodic category to determine a data growth rate, wherein the operation and maintenance index data of the historical periodic category is acquired in the cycle before the operation and maintenance index data of the periodic category is acquired;
determining a predicted value of the operation and maintenance index data of the periodic category according to a preset LSTM and the mean square error of the operation and maintenance index data of the historical periodic category;
and acquiring an alarm rule corresponding to the operation and maintenance index data of the periodic category according to the data growth rate and the predicted value.
In a possible implementation manner, the determining module is specifically configured to:
for the operation and maintenance index data of the trend category, comparing the operation and maintenance index data of the trend category with the operation and maintenance index data of the historical trend category to determine a period-over-period growth rate, wherein the operation and maintenance index data of the historical trend category is acquired before the operation and maintenance index data of the trend category is acquired;
determining a binary search tree according to the operation and maintenance index data of the historical trend category, and determining the path length of each item of the operation and maintenance index data of the trend category according to the binary search tree;
and acquiring an alarm rule corresponding to the operation and maintenance index data of the trend category according to the period-over-period growth rate and the path lengths.
In a possible implementation manner, the alarm module is specifically configured to:
sending abnormal data in the real-time operation and maintenance index data to a preset terminal, wherein the abnormal data is used to instruct the preset terminal to review the abnormal data;
and if review-approval information sent by the preset terminal is received, raising the abnormality alarm.
In a third aspect, an embodiment of the present application provides a system monitoring device, including:
a processor;
a memory; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor, the computer program comprising instructions for performing the method of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and the computer program causes a server to execute the method according to the first aspect.
In a fifth aspect, the present application provides a computer program product including computer instructions that, when executed by a processor, perform the method of the first aspect.
According to the system monitoring method, device, equipment and storage medium provided by the embodiments of the present application, real-time operation and maintenance index data of the system to be monitored is acquired, and a target alarm rule is determined according to the index category of the system's historical operation and maintenance index data, where the historical operation and maintenance index data is the operation and maintenance index data of the system in a first preset time period before the real-time data is acquired. A technician therefore does not need to set an alarm value from historical experience; because the rule is determined from the index category of the system's own historical data, the target alarm rule is consistent with the operation and maintenance index data of the system. Whether the real-time operation and maintenance index data is abnormal is then judged according to the target alarm rule, and an abnormality alarm is raised if it is, which improves system monitoring accuracy. Moreover, when the system to be monitored has a large amount of operation and maintenance index data, the alarm rule can be determined quickly from the index category of the historical data, which shortens the system monitoring period, improves system monitoring efficiency, and makes the approach suitable for practical use.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of a system monitoring system according to an embodiment of the present disclosure;
Fig. 2 is a schematic flowchart of a system monitoring method according to an embodiment of the present disclosure;
Fig. 3 is a schematic flow chart illustrating a time sequence decomposition of historical operation and maintenance index data according to an embodiment of the present disclosure;
Fig. 4A is a schematic diagram of operation and maintenance index data and a periodic component, a trend component, and a stable component thereof according to an embodiment of the present disclosure;
Fig. 4B is a schematic diagram of operation and maintenance index data and a periodic component, a trend component, and a stable component thereof according to an embodiment of the present disclosure;
Fig. 5 is a schematic flow chart of another system monitoring method according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a system monitoring apparatus according to an embodiment of the present disclosure;
Fig. 7 is a schematic diagram of a possible structure of the system monitoring device of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," "third," and "fourth," if any, in the description and claims of this application and the above-described figures are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Distributed systems are currently used more and more widely in the financial industry. For example, with the explosive growth of data in the financial industry, the conventional storage system cannot meet the current data storage requirements due to insufficient disk space, limited processing capacity and the like, and the application of the distributed storage system solves the storage bottleneck of the conventional storage system to a certain extent. The existing system monitoring mode is mainly that a technician sets an alarm value according to historical experience, then compares operation and maintenance index data of a system monitored in real time with the set alarm value, and judges that abnormality occurs when certain index data does not accord with the set alarm value.
However, because the alarm value is set from a technician's historical experience, it is easily biased by that technician's subjective factors. For example, for the operation and maintenance index data of the same system, technician A sets alarm value 1 according to historical experience and technician B sets alarm value 2, and alarm value 1 differs from alarm value 2. When the operation and maintenance index data monitored in real time is compared with the set alarm value, the same index data may therefore be judged not to conform under one technician's alarm value and to conform under another's, so the accuracy of system monitoring based on such alarm values is low.
Moreover, when the system to be monitored has a large amount of operation and maintenance index data, setting alarm values manually takes technicians a long time, which lengthens the system monitoring period and reduces monitoring efficiency. In addition, because many alarm values must be set, manual setting may miss some of them, so monitoring gaps arise and "black swan" events that cause downtime occur frequently. When the system fails, it is also difficult to determine which operation and maintenance index data are abnormal, and therefore the cause of the failure cannot be determined.
Therefore, the embodiment of the present application provides a system monitoring method that determines a target alarm rule according to the index type of the historical operation and maintenance index data of the system to be monitored. A technician does not need to set an alarm value according to historical experience; because the alarm rule is determined from the index type of the system's historical operation and maintenance index data, the target alarm rule is consistent with the operation and maintenance index data of the system. Whether the real-time operation and maintenance index data of the system to be monitored is abnormal is then judged according to the target alarm rule, and an abnormality alarm is issued if it is, which improves the monitoring accuracy of the system. Moreover, when the system to be monitored has a large amount of operation and maintenance index data, the alarm rule can be determined quickly from the index type of the historical operation and maintenance index data, which shortens the system monitoring period, improves system monitoring efficiency, and makes the method suitable for practical application.
The system monitoring method provided by the embodiment of the application can be applied to system monitoring in various scenarios; system monitoring of a distributed storage system is taken as an example below. Here, the distributed storage system generally consists of a load balancer (Load Balance, LB), a database (Data Base, DB), a cluster management node (Configuration server), and a plurality of application nodes (APIs). The database is used for storing the service data. The application nodes in the distributed storage system are undifferentiated peer nodes and are not used for storing the service data, so the distributed storage system is easy to expand to carry more service traffic. The load balancer is arranged at the front-end entrance of the plurality of application nodes and distributes access requests evenly, so that every application node can process requests without distinction. The access request includes at least one of a query request for the database and an update request for data.
Optionally, fig. 1 is a schematic diagram of a system monitoring system architecture provided in an embodiment of the present application. In fig. 1, taking system monitoring of the distributed storage system as an example, the architecture includes a system monitoring device.
Here, the distributed storage system described above includes a load balancer, a database, a cluster management node, a first application node and a second application node. The ellipses in fig. 1 indicate that the distributed storage system may also include one or more other application nodes. As shown in fig. 1, the load balancer can communicate and exchange data with external clients through a network. For example, the load balancer may receive a data query request sent by a client through the network and send the data query request to the first application node for processing. The database serves as the persistent store of the distributed storage system and stores all data. Therefore, all data update requests and data loading requests are ultimately completed in the database.
The cluster management node serves as a configuration center in the distributed storage system, receives a configuration issuing request from the application node, and records the state of the database, so that when the application node detects that the state recorded by the cluster management node is an updated state, the application node requests the database to load the updated data to the built-in storage space of the application node. The state of the database that the cluster management node can characterize may include, but is not limited to: an updated state that indicates that the data in the database is updated.
The first application node and the second application node are stateless nodes and are undifferentiated peers in the distributed storage system. Undifferentiated peers means that the load balancer may send one or more access requests for the database, arriving from outside the distributed system, to any application node for processing: the load balancer sends access requests to application nodes with low load according to the load of each application node, and any application node can access the database and return the access result to the client. An application node in the distributed storage system is responsible for receiving an update request sent by a client and sending the data carried by the update request to the database to modify the database; if the data is sent successfully, the application node synchronizes the time to the cluster management node.
It is to be understood that the illustrated structure of the embodiments of the present application does not form a specific limitation to the system monitoring architecture. In other possible embodiments of the present application, the foregoing architecture may include more or less components than those shown in the drawings, or combine some components, or split some components, or arrange different components, which may be determined according to practical application scenarios, and is not limited herein. The components shown in fig. 1 may be implemented in hardware, software, or a combination of software and hardware.
In a specific implementation process, the system monitoring device obtains real-time operation and maintenance index data of the distributed storage system, where the operation and maintenance index data may include the load of the distributed storage system (such as Central Processing Unit (CPU) utilization and memory utilization), the transaction success rate, the traffic volume, and the like. The system monitoring device determines a target alarm rule according to the index type of the historical operation and maintenance index data of the distributed storage system. A technician does not need to set an alarm value according to historical experience; because the alarm rule is determined from the index type of the system's historical operation and maintenance index data, the target alarm rule is consistent with the operation and maintenance index data of the system. Whether the real-time operation and maintenance index data of the distributed storage system is abnormal is then judged according to the target alarm rule, and an abnormality alarm is issued if it is, which improves system monitoring accuracy. Moreover, when the system to be monitored has a large amount of operation and maintenance index data, the alarm rule can be determined quickly from the index type of the historical operation and maintenance index data, which shortens the system monitoring period, improves system monitoring efficiency, and makes the method suitable for practical application.
In addition, the system architecture and the service scenario described in the embodiment of the present application are for more clearly illustrating the technical solution of the embodiment of the present application, and do not constitute a limitation to the technical solution provided in the embodiment of the present application, and it can be known by a person skilled in the art that along with the evolution of the system architecture and the appearance of a new service scenario, the technical solution provided in the embodiment of the present application is also applicable to similar technical problems.
The technical solutions of the present application are described below with several embodiments as examples, and the same or similar concepts or processes may not be described in detail in some embodiments.
Fig. 2 is a schematic flowchart of a system monitoring method according to an embodiment of the present disclosure, which may be applied to system monitoring processing and may be executed by any device that executes the system monitoring method, where the device may be implemented by software and/or hardware. As shown in fig. 2, based on the system architecture shown in fig. 1, the system monitoring method provided in the embodiment of the present application includes the following steps:
S201: acquire real-time operation and maintenance index data of the system to be monitored.
Here, the system to be monitored may be determined according to actual conditions, such as the distributed storage system in fig. 1.
In this embodiment, taking the execution subject as the system monitoring device in fig. 1 as an example, the system monitoring device obtains real-time operation and maintenance index data of the system to be monitored. The operation and maintenance index data may include the load (for example, CPU usage rate and memory usage rate) of the system to be monitored, transaction success rate, traffic volume, and the like.
S202: determine a target alarm rule according to the index type of the historical operation and maintenance index data of the system to be monitored.
The historical operation and maintenance index data is the operation and maintenance index data of the system to be monitored in a first preset time period before the real-time operation and maintenance index data is obtained. The historical operation and maintenance index data is time sequence data.
Here, the first preset time period may be determined according to actual conditions, for example, the historical operation and maintenance index data is the operation and maintenance index data of the system to be monitored within one week before the real-time operation and maintenance index data is acquired, which is not limited in this embodiment of the application.
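As a minimal sketch of selecting the historical window, assuming the samples are held as (timestamp, value) pairs and the first preset time period is one week. The function name `history_window` and the sample layout are illustrative assumptions, not part of the method as described:

```python
from datetime import datetime, timedelta

def history_window(samples, now, period=timedelta(days=7)):
    """Return the samples whose timestamp lies in [now - period, now).

    `samples` is assumed to be an iterable of (timestamp, value) pairs;
    `period` stands in for the first preset time period.
    """
    start = now - period
    return [(ts, value) for ts, value in samples if start <= ts < now]

# Hypothetical data: one sample per day, going back ten days from `now`.
now = datetime(2020, 12, 17, 12, 0)
samples = [(now - timedelta(days=d), float(d)) for d in range(10)]
recent = history_window(samples, now)
```

The sample taken at `now` itself is excluded, so the window contains only data obtained before the real-time sample.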
For example, the system monitoring device may obtain a correspondence between index types of pre-stored operation and maintenance index data and alarm rules, and further determine the target alarm rule corresponding to the index type of the historical operation and maintenance index data according to the correspondence.
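The correspondence between index types and alarm rules could be held as a simple lookup table. The sketch below is only illustrative: the category keys, the parameter names and the three rule functions are assumptions, and the trend rule shows only the ring ratio threshold, not the isolation forest part.

```python
def stable_rule(value, params):
    """Abnormal when the value falls outside (mu - 3*sigma, mu + 3*sigma)."""
    mu, sigma = params["mu"], params["sigma"]
    return not (mu - 3 * sigma < value < mu + 3 * sigma)

def periodic_rule(value, params):
    """Abnormal when the value falls outside (R - R*Y%, R + R*Y%)."""
    r, y = params["predicted"], params["growth_percent"]
    delta = abs(r) * y / 100.0
    return not (r - delta < value < r + delta)

def trend_rule(value, params):
    """Abnormal when the ring ratio growth rate exceeds a preset threshold.

    Assumes params["previous"] is non-zero.
    """
    previous = params["previous"]
    return abs(value - previous) / abs(previous) > params["threshold"]

# Pre-stored correspondence: index type -> alarm rule (hypothetical keys).
ALARM_RULES = {"stable": stable_rule, "periodic": periodic_rule, "trend": trend_rule}

def target_alarm_rule(index_type):
    """Look up the target alarm rule for an index type."""
    return ALARM_RULES[index_type]

rule = target_alarm_rule("stable")
is_abnormal = rule(14.0, {"mu": 10.0, "sigma": 1.0})
```

Keeping the rules behind one lookup means a new index type only needs a new table entry.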
Here, the correspondence may be determined according to actual conditions. For example, for operation and maintenance index data of the stable category, the system monitoring device may obtain the stable component and the standard deviation of historical stable-category operation and maintenance index data, that is, stable-category data obtained before the operation and maintenance index data in question. The alarm rule corresponding to the operation and maintenance index data of the stable category is then obtained from the stable component and the standard deviation.
Operation and maintenance index data of the stable category may be understood as operation and maintenance index data that is most similar to its own stable component.
Here, the system monitoring device may perform time-series decomposition on each operation and maintenance index data to obtain a trend component, a periodic component, and a stable component of the operation and maintenance index data, further calculate a similarity between the operation and maintenance index data and each component, that is, the trend component, the periodic component, or the stable component, and determine a category of the operation and maintenance index data according to the calculated similarity. And if the operation and maintenance index data is most similar to the stable component of the operation and maintenance index data, namely the similarity value of the operation and maintenance index data and the stable component of the operation and maintenance index data is the maximum, the category of the operation and maintenance index data is a stable category. Similarly, if the operation and maintenance index data is most similar to the periodic component thereof, that is, the similarity value between the operation and maintenance index data and the periodic component thereof is the maximum, the category of the operation and maintenance index data is the periodic category. And if the operation and maintenance index data is most similar to the trend component of the operation and maintenance index data, namely the similarity value of the operation and maintenance index data and the trend component of the operation and maintenance index data is the maximum, the category of the operation and maintenance index data is the trend category.
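The classification step can be sketched as follows. This passage does not say which similarity measure is used, so the absolute Pearson correlation below is an assumption, as are the function names and the synthetic components; a constant component is given similarity 0.

```python
import math

def pearson(a, b):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def classify(series, trend, periodic, stable):
    """Return the category whose component is most similar to the series."""
    similarity = {
        "trend": abs(pearson(series, trend)),
        "periodic": abs(pearson(series, periodic)),
        "stable": abs(pearson(series, stable)),
    }
    return max(similarity, key=similarity.get)

# Synthetic components, for illustration only.
n = 48
wave = [math.sin(2 * math.pi * i / 12) for i in range(n)]
noise = [0.01 * ((i * 7) % 5 - 2) for i in range(n)]
rising = [0.5 * i for i in range(n)]
flat = [5.0] * n
big_wave = [3 * w for w in wave]

trending_series = [t + w + r for t, w, r in zip(rising, wave, noise)]
cyclic_series = [t + w + r for t, w, r in zip(flat, big_wave, noise)]

trending_category = classify(trending_series, rising, wave, noise)
cyclic_category = classify(cyclic_series, flat, big_wave, noise)
```

In practice the three components would come from the time series decomposition rather than being constructed by hand.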
The stable component has the characteristic of small fluctuation amplitude, the periodic component has the characteristic of obvious fluctuation period, and the trend component has the characteristic of relatively gentle change trend.
For the operation and maintenance index data of the stable category, the fluctuation range of the index data is small, and the system monitoring device may obtain the corresponding alarm rule by a constant-threshold method, for example a standard statistical method such as the 3-sigma strategy. For example, the system monitoring device may obtain the stable component and the standard deviation of the historical stable-category operation and maintenance index data, take the stable component as the index data mean μ and the standard deviation as σ, and determine the data range (μ - 3σ, μ + 3σ). Data within this 3-sigma range is identified as normal, and all other data is judged to be abnormal values, which gives the alarm rule corresponding to the operation and maintenance index data of the stable category.
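A minimal sketch of the 3-sigma rule, assuming the mean of the historical stable-category data stands in for the stable component μ and the population standard deviation is used for σ. Both choices, and the sample values, are illustrative assumptions:

```python
import statistics

def three_sigma_band(history):
    """Return (mu - 3*sigma, mu + 3*sigma) for the historical data."""
    mu = statistics.fmean(history)      # assumed stand-in for the stable component
    sigma = statistics.pstdev(history)  # population standard deviation
    return mu - 3 * sigma, mu + 3 * sigma

def is_abnormal(value, band):
    """A value is normal only when it lies strictly inside the band."""
    low, high = band
    return not (low < value < high)

# Hypothetical CPU usage history in percent.
history = [50, 51, 49, 50, 52, 48, 50, 50]
band = three_sigma_band(history)
```

For this history μ is 50 and σ is about 1.118, so the band is roughly (46.65, 53.35).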
In addition, for the operation and maintenance index data of the periodic category, the system monitoring device may take a second preset time period as a cycle and compare the operation and maintenance index data of the periodic category with the historical periodic-category operation and maintenance index data to determine a data growth rate, where the historical periodic-category data is the data obtained in the cycle before the operation and maintenance index data of the periodic category is obtained. A predicted value of the operation and maintenance index data of the periodic category is then determined according to a preset LSTM and the mean square error of the historical periodic-category data, and the alarm rule corresponding to the operation and maintenance index data of the periodic category is obtained from the data growth rate and the predicted value.
Here, operation and maintenance index data of the periodic category has an obvious fluctuation period. The system monitoring device may use a dynamic-threshold method based on time series prediction, for example a same-period comparison (year-over-year style) algorithm combined with an LSTM, to obtain the alarm rule corresponding to the operation and maintenance index data of the periodic category. For example, with the second preset time period as the cycle, such as one day, the operation and maintenance index data of the periodic category is compared with the historical periodic-category data and the data growth rate Y% is calculated. Then an LSTM model can be built with Keras, a commonly used Python neural network library: the historical periodic-category operation and maintenance index data is normalized, its mean square error is used as the loss function, the number of iterations is set to 100, and the model is trained. Finally, the predicted value R of the operation and maintenance index data of the periodic category is obtained; data within the range (R - R × Y%, R + R × Y%) is determined to be normal, and the rest is judged to be abnormal values.
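The dynamic band can be sketched without the model itself. Below, the predicted value R is simply passed in (it would come from the trained LSTM), and Y% is taken as the mean relative change between two aligned cycles. That averaging choice and all names are assumptions made for illustration:

```python
def growth_rate_percent(current_cycle, previous_cycle):
    """Same-period growth rate Y%: mean relative change between aligned samples.

    Samples whose previous value is zero are skipped to avoid division by zero.
    """
    pairs = [(c, p) for c, p in zip(current_cycle, previous_cycle) if p != 0]
    return 100.0 * sum(abs(c - p) / abs(p) for c, p in pairs) / len(pairs)

def periodic_band(predicted, y_percent):
    """Return (R - R*Y%, R + R*Y%) for predicted value R."""
    delta = abs(predicted) * y_percent / 100.0
    return predicted - delta, predicted + delta

def is_abnormal(value, predicted, y_percent):
    low, high = periodic_band(predicted, y_percent)
    return not (low < value < high)

# Hypothetical traffic volumes at the same three times on two consecutive days.
yesterday = [100.0, 200.0, 400.0]
today = [110.0, 180.0, 400.0]
y_percent = growth_rate_percent(today, yesterday)

# R = 300.0 is a made-up prediction standing in for the LSTM output.
band = periodic_band(300.0, y_percent)
```

Here Y% is about 6.67, so the band around R = 300 is (280, 320).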
For the operation and maintenance index data of the trend category, the system monitoring device may compare the operation and maintenance index data of the trend category with the historical trend-category operation and maintenance index data and determine a ring ratio (period-over-period) growth rate, where the historical trend-category data is obtained before the operation and maintenance index data of the trend category is obtained. A binary search tree is then determined according to the historical trend-category data, each path length of the operation and maintenance index data of the trend category is determined according to the binary search tree, and the alarm rule corresponding to the operation and maintenance index data of the trend category is obtained from the ring ratio growth rate and the path lengths.
Here, index data of the trend category changes relatively gently. The system monitoring device may use a ring ratio algorithm together with the isolation forest anomaly detection algorithm to obtain the alarm rule corresponding to the operation and maintenance index data of the trend category. Illustratively, the system monitoring device uses the ring ratio algorithm to calculate the ring ratio growth rate between the trend-category data and the historical trend-category data, and sets an abnormality threshold for the ring ratio growth rate. For the isolation forest algorithm, first, the historical trend-category operation and maintenance index data is taken as training samples, samples are randomly selected from them to create a binary search tree (iTree), and this is repeated iteratively to build a forest of binary trees (iForest). Then, to predict whether operation and maintenance index data of the trend category is abnormal, an in-order traversal is performed on each binary search tree and the path length of the data in each iTree, that is, the path length h(x) from the root node to the leaf node, is recorded. Finally, the expected value E(h(x)) and the variance S(h(x)) of the path lengths are calculated statistically, and data with E(h(x)) → 0 and S(h(x)) → 1 is judged to be an abnormal value.
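The path-length idea can be sketched for one-dimensional data as below. This follows the commonly published isolation forest construction (random split values, depth limit, average-path correction c(n)), which is an assumption about the intended procedure. The score 2^(-E(h(x))/c(n)) is likewise assumed to be what the text denotes S(h(x)), since that score approaches 1 as E(h(x)) approaches 0. All names and data are hypothetical.

```python
import math
import random

def c(n):
    """Average path length of an unsuccessful search in a BST of n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_itree(points, depth, max_depth, rng):
    """Recursively split the points at a random value until isolated or too deep."""
    if depth >= max_depth or len(points) <= 1 or min(points) == max(points):
        return {"size": len(points)}
    split = rng.uniform(min(points), max(points))
    return {
        "split": split,
        "left": build_itree([p for p in points if p < split], depth + 1, max_depth, rng),
        "right": build_itree([p for p in points if p >= split], depth + 1, max_depth, rng),
    }

def path_length(x, node, depth=0):
    """Path length h(x) from the root to the leaf that x falls into."""
    if "split" not in node:
        return depth + c(node["size"])
    child = node["left"] if x < node["split"] else node["right"]
    return path_length(x, child, depth + 1)

def build_iforest(history, n_trees=100, sample_size=64, seed=0):
    rng = random.Random(seed)
    size = min(sample_size, len(history))
    max_depth = math.ceil(math.log2(size))
    return [build_itree(rng.sample(history, size), 0, max_depth, rng)
            for _ in range(n_trees)]

def mean_path_length(x, forest):
    """E(h(x)) over all trees in the forest."""
    return sum(path_length(x, tree) for tree in forest) / len(forest)

def anomaly_score(x, forest, sample_size=64):
    """Score in (0, 1]; values closer to 1 indicate a likelier anomaly."""
    return 2.0 ** (-mean_path_length(x, forest) / c(sample_size))

# Hypothetical historical trend-category data clustered between 10 and 12.
history = [10 + 0.01 * i for i in range(200)]
forest = build_iforest(history)
outlier_score = anomaly_score(100.0, forest)
inlier_score = anomaly_score(11.0, forest)
```

A point far from the historical data is isolated after fewer splits, so its mean path length is shorter and its score is higher than that of a point in the middle of the data.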
In the embodiment of the application, the system monitoring device determines the alarm rule according to the index type of the historical operation and maintenance index data of the system to be monitored. The historical operation and maintenance index data is the operation and maintenance index data of the system in a first preset time period before the real-time operation and maintenance index data is obtained, namely the historical operation and maintenance index data can embody the characteristics of the operation and maintenance index data of the system. Therefore, the system monitoring equipment determines an alarm rule according to the index type of the historical operation and maintenance index data, so that the target alarm rule is in accordance with the operation and maintenance index data of the system, further, whether the real-time operation and maintenance index data is abnormal or not is judged according to the target alarm rule, and if the real-time operation and maintenance index data is abnormal, abnormal alarm is performed, and the system monitoring accuracy is improved.
S203: judge whether the real-time operation and maintenance index data is abnormal according to the target alarm rule.
For example, if the real-time operation and maintenance index data is operation and maintenance index data of the stable category, the system monitoring device may determine that real-time operation and maintenance index data within the 3-sigma range (μ - 3σ, μ + 3σ) is normal data and judge the rest to be abnormal values.
Similarly, if the real-time operation and maintenance index data is operation and maintenance index data of the periodic category, the system monitoring device may determine that real-time operation and maintenance index data within (R - R × Y%, R + R × Y%) is normal data and judge the rest to be abnormal values.
If the real-time operation and maintenance index data is operation and maintenance index data of the trend category, the system monitoring device may judge real-time operation and maintenance index data whose expected value E(h(x)) → 0 and whose variance S(h(x)) → 1 to be abnormal values.
S204: if the real-time operation and maintenance index data is abnormal, issue an abnormality alarm.
Here, in order to reduce the false alarm rate of the abnormal alarm, the embodiment of the present application sets an auditing link before the abnormal alarm is performed.
For example, before issuing the abnormality alarm, the system monitoring device may send the abnormal data in the real-time operation and maintenance index data to a preset terminal. The preset terminal audits the abnormal data and, after confirming that the data is indeed abnormal, sends an audit-passed message to the system monitoring device. After receiving the audit-passed message, the system monitoring device issues the abnormality alarm, which reduces the false alarm rate of the abnormality alarm.
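The audit step can be sketched as a gate in front of the alarm. In the sketch below the preset terminal is modeled as a callable `audit` that returns True when it confirms a point as abnormal; that modeling choice, the function names and the sample data are assumptions made for illustration:

```python
def alarm_with_audit(abnormal_points, audit, send_alarm):
    """Send an alarm only for points that the auditing step confirms.

    `audit` models the preset terminal's review and returns True when a
    point is confirmed as abnormal; `send_alarm` models the alarm channel.
    """
    confirmed = [point for point in abnormal_points if audit(point)]
    if confirmed:
        send_alarm(confirmed)
    return confirmed

sent = []  # collects alarms so the example has no external dependency
confirmed = alarm_with_audit(
    [("cpu_usage", 99.0), ("memory_usage", 61.0)],
    audit=lambda point: point[1] > 90.0,  # hypothetical reviewer decision
    send_alarm=sent.append,
)
```

Points rejected by the audit never reach the alarm channel, which is how the audit lowers the false alarm rate.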
In the embodiment of the application, the system monitoring device acquires the real-time operation and maintenance index data of the system to be monitored and determines the target alarm rule according to the index type of the historical operation and maintenance index data of the system to be monitored, where the historical operation and maintenance index data is the operation and maintenance index data of the system in a first preset time period before the real-time operation and maintenance index data is acquired. A technician does not need to set an alarm value according to historical experience; because the alarm rule is determined from the index type of the historical operation and maintenance index data, the target alarm rule is consistent with the operation and maintenance index data of the system. Whether the real-time operation and maintenance index data is abnormal is then judged according to the target alarm rule, and an abnormality alarm is issued if it is, which improves system monitoring accuracy. Moreover, when the system to be monitored has a large amount of operation and maintenance index data, the system monitoring device can quickly determine the alarm rule according to the index type of the historical operation and maintenance index data, which shortens the system monitoring period, improves system monitoring efficiency, and makes the method suitable for practical application.
In the embodiment shown in fig. 2, before determining the target alarm rule according to the index type of the historical operation and maintenance index data of the system to be monitored, the system monitoring device may perform time sequence decomposition on the historical operation and maintenance index data to obtain a trend component, a periodic component, and a stable component of the historical operation and maintenance index data, further calculate similarities between the historical operation and maintenance index data and the trend component, the periodic component, and the stable component, and determine the index type of the historical operation and maintenance index data according to the calculated similarities.
Here, when the system monitoring device performs time-series decomposition on the historical operation and maintenance index data, the system monitoring device may perform Detrending (Detrending) processing on the historical operation and maintenance index data to obtain a detrended sequence, and perform local weighted regression processing on each subsequence in the detrended sequence to obtain a temporary periodic sequence. Further, the temporary period sequence is low-pass filtered to obtain a low-frequency time sequence, and the temporary period sequence is detrended according to the low-frequency time sequence to obtain the period component. And then, according to the periodic component, performing de-cyclization processing on the historical operation and maintenance index data, and performing local weighted regression processing on the de-cyclization processed historical operation and maintenance index data to obtain the trend component. And finally, obtaining the stable component according to the historical operation and maintenance index data, the periodic component and the trend component.
For example, fig. 3 is a schematic flow chart of time series decomposition of historical operation and maintenance index data according to an embodiment of the present application. Here, analysis of the historical operation and maintenance index data shows that it is time series data that can be approximated by three kinds of components: a periodic component with an obvious fluctuation period, a trend component with a relatively gentle change trend, and a stable component with a relatively small fluctuation amplitude. The system monitoring device therefore performs time series decomposition on the historical operation and maintenance index data, decomposing it into a trend component, a periodic component and a stable component, that is, $Y_v = T_v + S_v + R_v$ $(v = 1, \ldots, N)$.
As shown in fig. 3, the method includes:
S301: perform detrending on the historical operation and maintenance index data $Y_v$ to obtain the detrended sequence $Y_v - T_v$.
Here, $n_{(p)}$ is initialized to 60, where $n_{(p)}$ is the number of samples in one period: for operation and maintenance index data collected at minute granularity, the original sequence length is 1440 and the periodic term is observed in units of one hour.
S302: perform local weighted regression processing, that is, periodic subsequence smoothing, on each subsequence of the detrended sequence: apply local weighted regression smoothing with neighborhood size $q = n_{(s)}$ to each subsequence to obtain a temporary periodic sequence.
Here, $n_{(s)}$ is the smoothing parameter of the periodic term, which determines how quickly the data making up the periodic term can change; it is generally set to the default value 7.
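A sketch of the periodic subsequence smoothing, assuming local weighted regression means a local linear fit with tricube weights over the q nearest points. That is the usual LOESS form, but it is an assumption here. The extension of each subsequence by one period at both ends, found in the standard procedure, is omitted, and all names are illustrative.

```python
def loess(ys, q):
    """Local linear regression with tricube weights over the q nearest points."""
    n = len(ys)
    q = min(q, n)
    fitted = []
    for i in range(n):
        nearest = sorted(range(n), key=lambda j: abs(j - i))[:q]
        radius = max(abs(j - i) for j in nearest) or 1
        weights = [(1 - (abs(j - i) / radius) ** 3) ** 3 for j in nearest]
        total = sum(weights)  # at least 1, because the point itself has weight 1
        mean_x = sum(w * j for w, j in zip(weights, nearest)) / total
        mean_y = sum(w * ys[j] for w, j in zip(weights, nearest)) / total
        sxx = sum(w * (j - mean_x) ** 2 for w, j in zip(weights, nearest))
        sxy = sum(w * (j - mean_x) * (ys[j] - mean_y) for w, j in zip(weights, nearest))
        slope = sxy / sxx if sxx > 0 else 0.0  # fall back to a weighted mean
        fitted.append(mean_y + slope * (i - mean_x))
    return fitted

def smooth_cycle_subseries(detrended, n_p, n_s=7):
    """Smooth each phase of the period separately and reassemble the sequence."""
    temp = [0.0] * len(detrended)
    for phase in range(n_p):
        temp[phase::n_p] = loess(detrended[phase::n_p], n_s)
    return temp

# Hypothetical detrended data with period 4.
detrended = [float([3, -1, -2, 0][i % 4]) for i in range(40)]
temporary_periodic = smooth_cycle_subseries(detrended, n_p=4)
```

Each subsequence collects the samples at the same position within the period, so smoothing it controls how fast that position's value may drift from one period to the next.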
S303: perform low-pass filtering on the temporary periodic sequence, that is, low-pass filtering of the smoothed periodic subsequences: apply low-pass filtering operations of lengths $n_{(p)}$, $n_{(p)}$ and 3 in turn, and then apply local weighted regression smoothing with $q = n_{(l)}$ to obtain the low-frequency time series.
Here, $n_{(l)}$ is the smoothing parameter of the low-pass filter and is set to a value greater than or equal to $n_{(p)}$, so as to prevent the trend component and the periodic (seasonal) component from competing.
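The three filtering passes can be sketched as plain moving averages, which is how the low-pass step is usually realized in this kind of decomposition. Treating the filters as moving averages is an assumption, the final local weighted regression pass with q = n(l) is left out, and the function names are hypothetical.

```python
import math

def moving_average(xs, window):
    """Unweighted moving average; the output is window - 1 samples shorter."""
    return [sum(xs[i:i + window]) / window for i in range(len(xs) - window + 1)]

def low_pass(xs, n_p):
    """Moving averages of length n_p, n_p and 3, applied in turn."""
    return moving_average(moving_average(moving_average(xs, n_p), n_p), 3)

# A pure oscillation with period 6 is removed by a filter of length n_p = 6.
n_p = 6
wave = [math.sin(2 * math.pi * i / n_p) for i in range(40)]
filtered_wave = low_pass(wave, n_p)

# A constant level passes through unchanged.
filtered_level = low_pass([5.0] * 40, n_p)
```

The three passes shorten the sequence by 2 × n_p samples in total, which is why the temporary periodic sequence is normally extended by one period at each end before filtering.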
S304: detrend the temporary periodic sequence according to the low-frequency time series, that is, detrend the smoothed periodic subsequences: subtract the low-frequency time series from the temporary periodic sequence to obtain the periodic component $S_v$.
The low-frequency time series is subtracted to prevent low-frequency components from entering the periodic component.
S305: perform de-cyclization on the historical operation and maintenance index data according to the periodic component to obtain the de-cyclized historical operation and maintenance index data $Y_v - S_v$.
Missing values of the original sequence are ignored here.
S306: perform local weighted regression processing, that is, trend smoothing, on the de-cyclized historical operation and maintenance index data: apply local weighted regression smoothing with $q = n_{(t)}$ to obtain a stable long-term trend component $T_v$.
Here, $n_{(t)}$ is the smoothing parameter of the trend term; the larger $n_{(t)}$ is, the smoother the trend component $T_v$. Generally, $n_{(t)}$ is set to between $1.5\,n_{(p)}$ and $2\,n_{(p)}$, which ensures the robustness of the iteration and reduces the influence of abnormal values.
S307: obtain the stable component according to the historical operation and maintenance index data, the periodic component and the trend component. Here, $Y_v = T_v + S_v + R_v$, and thus $R_v = Y_v - T_v - S_v$.
Here, the system monitoring device further determines whether the periodic component and the trend component have converged. If they have, it obtains the stable component from the historical operation and maintenance index data, the periodic component and the trend component; otherwise, it re-executes the procedure from step S301.
Convergence of the periodic component and the trend component can be understood as these components being the same, or substantially the same, as those obtained last time. That is, the periodic component and trend component obtained in the current execution of steps S301 to S306 are the same or substantially the same as those obtained in the previous execution of steps S301 to S306. The system monitoring device thereby obtains more accurate periodic and trend components and derives the stable component from the converged components, so that the stable component is also more accurate, which improves the accuracy of subsequent processing based on the periodic, trend and stable components.
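The arithmetic of steps S304, S305 and S307 and the convergence test can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the use of `None` for missing values and the tolerance `tol` are assumptions made for the example, and the smoothing steps that produce the temporary periodic sequence, the low-frequency sequence and the trend are taken as given inputs.

```python
from typing import List, Optional

Series = List[Optional[float]]  # None marks a missing observation (assumption)


def periodic_component(temp_cycle: List[float], low_freq: List[float]) -> List[float]:
    """S304: subtract the low-frequency sequence from the temporary periodic sequence."""
    return [c - l for c, l in zip(temp_cycle, low_freq)]


def deperiodize(y: Series, seasonal: List[float]) -> Series:
    """S305: remove the periodic component; missing values stay missing."""
    return [None if v is None else v - s for v, s in zip(y, seasonal)]


def stable_component(y: Series, trend: List[float], seasonal: List[float]) -> Series:
    """S307: R_v = Y_v - T_v - S_v."""
    return [None if v is None else v - t - s for v, t, s in zip(y, trend, seasonal)]


def has_converged(prev: List[float], curr: List[float], tol: float = 1e-6) -> bool:
    """Treat two successive estimates as 'substantially the same' within tol."""
    return max(abs(a - b) for a, b in zip(prev, curr)) <= tol
```

In an iteration loop, `has_converged` would be applied to both the periodic and the trend component of two successive passes; only when both pass is the stable component computed.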
Here, the present application obtains the trend component, periodic component and stable component of the historical operation and maintenance index data through steps S301 to S307, calculates the similarity between the historical operation and maintenance index data and each of these components, determines the index category of the historical operation and maintenance index data from the calculated similarities, determines a target alarm rule according to the index category, and judges whether the real-time operation and maintenance index data is abnormal according to the target alarm rule, raising an abnormality alarm if so. A technician does not need to set an alarm value from historical experience; instead, the alarm rule is determined from the index category of the system's historical operation and maintenance index data, so that the target alarm rule matches the system's operation and maintenance index data and subsequent monitoring accuracy improves. The alarm rule can also be determined quickly, which shortens the system monitoring cycle, improves monitoring efficiency, and makes the method practical to apply.
In addition, to avoid an excessively large stable component $R_v$ caused by outliers in the operation and maintenance index data, after obtaining the stable component from the historical operation and maintenance index data, the periodic component and the trend component, the system monitoring device may further determine a target robust weight from the stable component. The target robust weight is then taken into account when locally weighted regression is performed on each subsequence of the detrended sequence and on the de-periodized historical operation and maintenance index data.
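A locally weighted regression that accepts such robust weights might look like the sketch below. It is an illustration under stated assumptions rather than the method of the application: the tricube proximity weight, the local linear fit, equally spaced observations and the name `loess_smooth` are all choices made for the example.

```python
from typing import List, Optional


def loess_smooth(y: List[float], q: int,
                 robust: Optional[List[float]] = None) -> List[float]:
    """Locally weighted linear regression on equally spaced points 0..n-1.

    q is the number of neighbours in each local fit; robust holds optional
    per-point robustness weights that multiply the proximity weights.
    """
    n = len(y)
    q = max(2, min(q, n))
    if robust is None:
        robust = [1.0] * n
    fitted = []
    for i in range(n):
        # The q points closest to i form the local window.
        window = sorted(range(n), key=lambda j: abs(j - i))[:q]
        d_max = max(abs(j - i) for j in window) or 1
        # Tricube proximity weight (an assumption) times the robustness weight.
        w = [(1 - (abs(j - i) / d_max) ** 3) ** 3 * robust[j] for j in window]
        sw = sum(w)
        if sw == 0:
            fitted.append(y[i])  # nothing usable in the window
            continue
        # Weighted least-squares line through the window, evaluated at i.
        mx = sum(wj * j for wj, j in zip(w, window)) / sw
        my = sum(wj * y[j] for wj, j in zip(w, window)) / sw
        sxx = sum(wj * (j - mx) ** 2 for wj, j in zip(w, window))
        sxy = sum(wj * (j - mx) * (y[j] - my) for wj, j in zip(w, window))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (i - mx))
    return fitted
```

Setting the robustness weight of a suspected outlier to zero removes it from every local fit, which is the effect the robust weight is meant to achieve.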
Illustratively, after obtaining the stable component, the system monitoring device further adjusts the proximity weight q. Let $h = 6 \cdot \mathrm{mean}(|R_v|)$; the target robust weight $\rho_v$ at time v is then:
Figure BDA0002841419050000191
In the locally weighted regression of steps S302 and S306 above, the proximity weight q is multiplied by the robust weight $\rho_v$ to limit the influence of outliers on the regression. The function B is the bisquare function, whose mathematical expression is:
Figure BDA0002841419050000192
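Since the two formula images are not reproduced in this text, the sketch below assumes the form usually given for STL: $\rho_v = B(|R_v| / h)$ with $B(u) = (1 - u^2)^2$ for $0 \le u < 1$ and $B(u) = 0$ otherwise. The scale h follows the wording above (6 times the mean of $|R_v|$); the classic STL description uses the median instead. Function names are illustrative.

```python
from typing import List


def bisquare(u: float) -> float:
    """Bisquare function: (1 - u^2)^2 for 0 <= u < 1, otherwise 0."""
    return (1.0 - u * u) ** 2 if 0.0 <= u < 1.0 else 0.0


def robust_weights(residuals: List[float]) -> List[float]:
    """rho_v = B(|R_v| / h) with h = 6 * mean(|R_v|), following the text."""
    abs_r = [abs(r) for r in residuals]
    h = 6.0 * sum(abs_r) / len(abs_r)
    if h == 0.0:
        return [1.0] * len(residuals)  # no residual at all: keep every point
    return [bisquare(a / h) for a in abs_r]
```

A point whose residual exceeds h receives weight 0 and therefore no longer affects the next round of local regressions.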
Here, to better illustrate the relationship between operation and maintenance index data and its periodic, trend and stable components, fig. 4A shows a schematic diagram of one set of operation and maintenance index data with its periodic, trend and stable components, and fig. 4B shows a schematic diagram of another. In both figures, the abscissa represents time and the ordinate represents the value.
As can be seen from fig. 4A and 4B, operation and maintenance index data can resemble its periodic, trend and stable components to different degrees. To determine these similarities, and thus the index category of the operation and maintenance index data, the system monitoring device may calculate a first similarity between the operation and maintenance index data and its trend component, a second similarity between the operation and maintenance index data and its periodic component, and a third similarity between the operation and maintenance index data and its stable component.
Here, for example, when calculating the first similarity between the operation and maintenance index data and its trend component, the system monitoring device may construct a first matrix to be processed from the operation and maintenance index data and its trend component, and then search the first matrix to be processed for all regular paths (warping paths) from the first matrix point to the last matrix point. It then obtains the path with the minimum warping cost among all warping paths and determines the first similarity from that path.
For example, if the operation and maintenance index data and its trend component are represented as time series $Q = \{q_1, q_2, \ldots, q_n\}$ and $C = \{c_1, c_2, \ldots, c_m\}$, of lengths n and m respectively, the process by which the system monitoring device calculates the first similarity between the operation and maintenance index data and its trend component comprises the following steps:
(1) Construct a first matrix to be processed D of size $n \times m$ with matrix elements $d_{ij} = \mathrm{dist}(q_i, c_j)$, where dist is a distance function; the Euclidean distance may be used here.
(2) Using dynamic programming, search matrix D for all warping paths from the first matrix point $d_{11}$ to the last matrix point $d_{nm}$. Each warping path is denoted W and is the sequence of matrix points it passes through, numbered as {(1,1), (1,2), ..., (i, j), ...}. The k-th element of W is defined as $w_k = (i, j)_k$, which reflects the mapping between sequences Q and C, giving $W = w_1, w_2, \ldots, w_K$ with $\max(m, n) \le K < m + n + 1$.
In the path searching process, the following constraint conditions need to be satisfied:
Boundary conditions: $w_1 = (1, 1)$ and $w_K = (n, m)$. For time series, the order of the elements must not change, so the searched path must start at the lower-left corner of the matrix and end at the upper-right corner.
Continuity: if $w_{k-1} = (a_1, b_1)$, the next matrix point $w_k = (a_2, b_2)$ of the search path must satisfy $(a_2 - a_1) \le 1$ and $(b_2 - b_1) \le 1$, so that each point in path W can only be matched with an adjacent point.
Monotonicity: if $w_{k-1} = (a_1, b_1)$, the next matrix point $w_k = (a_2, b_2)$ of the search path must satisfy $(a_2 - a_1) \ge 0$ and $(b_2 - b_1) \ge 0$, which ensures that each point in path W moves monotonically with time.
For a matrix point (i, j) already on the path, monotonicity and continuity imply that the next matrix point on the search path can only be one of three: (i+1, j), (i+1, j+1), or (i, j+1).
(3) Among all the above warping paths, obtain the path with the minimum warping cost, for example by means of the formula
Figure BDA0002841419050000211
which yields the path with the minimum warping cost. The K in the denominator mainly compensates for warping paths of different lengths.
(4) Determine the first similarity between the operation and maintenance index data and its trend component according to the path with the minimum warping cost. Illustratively, the cumulative distance γ is calculated along the minimum-warping-cost path from $d_{11}$ to $d_{nm}$ in matrix D to obtain the optimal solution of DTW(Q, C), which can be used as the similarity of sequences Q and C.
The cumulative distance $\gamma(i, j)$ is the sum of the Euclidean distance $\mathrm{dist}(q_i, c_j)$ between points $q_i$ and $c_j$ and the minimum cumulative distance of the neighbouring elements from which that point can be reached. Since, from a matrix point (i, j) already on the path, the next matrix point on the search path can only be (i+1, j), (i+1, j+1) or (i, j+1), the cumulative distance $\gamma(i, j)$ is calculated as:
$\gamma(i, j) = \mathrm{dist}(q_i, c_j) + \min\{\gamma(i-1, j),\ \gamma(i-1, j-1),\ \gamma(i, j-1)\}$
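Steps (1) to (4) can be sketched with a short dynamic-programming routine. This is an illustrative sketch only: the function name `dtw`, the use of the absolute difference as the Euclidean distance between scalars, and the padded border row and column are assumptions of the example. The length normalisation by K from step (3) is not applied here.

```python
from typing import Callable, List, Tuple


def dtw(q: List[float], c: List[float],
        dist: Callable[[float, float], float] = lambda a, b: abs(a - b)
        ) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the cumulative distance gamma(n, m) and one minimum-cost path.

    Path points are 1-based (i, j) pairs, matching the text.
    """
    n, m = len(q), len(c)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # Extra border row/column so that gamma(0, 0) = 0 seeds point (1, 1).
    gamma = [[inf] * (m + 1) for _ in range(n + 1)]
    gamma[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(gamma[i - 1][j], gamma[i - 1][j - 1], gamma[i][j - 1])
            gamma[i][j] = dist(q[i - 1], c[j - 1]) + best
    # Backtrack from (n, m) to (1, 1) through the cheapest predecessor.
    path = [(n, m)]
    i, j = n, m
    while (i, j) != (1, 1):
        i, j = min(((i - 1, j - 1), (i - 1, j), (i, j - 1)),
                   key=lambda p: gamma[p[0]][p[1]])
        path.append((i, j))
    path.reverse()
    return gamma[n][m], path
```

The returned path starts at (1, 1), ends at (n, m) and advances only by the three permitted steps, so the boundary, continuity and monotonicity constraints hold by construction.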
Similarly, the calculation of the second similarity between the operation and maintenance index data and its periodic component follows the calculation of the first similarity and may include: constructing a second matrix to be processed from the operation and maintenance index data and its periodic component, searching the second matrix to be processed for all warping paths from the first matrix point to the last matrix point, obtaining the path with the minimum warping cost among all warping paths, and determining the second similarity between the operation and maintenance index data and its periodic component according to that path.
The calculation of the third similarity between the operation and maintenance index data and its stable component may include: constructing a third matrix to be processed from the operation and maintenance index data and its stable component, searching the third matrix to be processed for all warping paths from the first matrix point to the last matrix point, obtaining the path with the minimum warping cost among all warping paths, and determining the third similarity between the operation and maintenance index data and its stable component according to that path.
After the first similarity between the operation and maintenance index data and the trend component thereof, the second similarity between the operation and maintenance index data and the periodic component thereof, and the third similarity between the operation and maintenance index data and the stable component thereof are calculated, the system monitoring device may determine the index type of the operation and maintenance index data.
For example, the system monitoring device may compare the first similarity, the second similarity and the third similarity. If the first similarity is the maximum value, the index category of the operation and maintenance index data is determined to be the trend category. If the second similarity is the maximum value, the index category is determined to be the period category. If the third similarity is the maximum value, the index category is determined to be the stable category. For example, in fig. 4A the similarity between the operation and maintenance index data and its trend component is the maximum value, so its index category is the trend category. In fig. 4B the similarity between the operation and maintenance index data and its periodic component is the maximum value, so its index category is the period category.
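The comparison can be sketched in a few lines. The function name and category labels are illustrative, and the tie-breaking order (trend, then period, then stable) is an assumption, since the text does not say what happens when two similarities are equal.

```python
def classify_index(first: float, second: float, third: float) -> str:
    """Return the category whose similarity is the maximum value."""
    scores = {"trend": first, "period": second, "stable": third}
    # max() returns the first key with the largest score, so ties fall back
    # to the insertion order above (an assumption of this sketch).
    return max(scores, key=scores.get)
```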
Optionally, on the basis of the system monitoring device shown in fig. 1, the system monitoring device may include an offline module and a real-time module. For example, as shown in fig. 5, the real-time module may be configured to obtain real-time operation and maintenance index data of the system to be monitored. The offline module may be configured to determine a target alarm rule according to an index type of the historical operation and maintenance index data of the system to be monitored. The off-line module may perform time sequence decomposition on the historical operation and maintenance index data to obtain a trend component, a periodic component, and a stable component of the historical operation and maintenance index data, and further calculate a first similarity between the historical operation and maintenance index data and the trend component, a second similarity between the historical operation and maintenance index data and the periodic component, and a third similarity between the historical operation and maintenance index data and the stable component, so as to determine an index category of the historical operation and maintenance index data according to the first similarity, the second similarity, and the third similarity.
The specific process by which the offline module performs time sequence decomposition on the historical operation and maintenance index data to obtain its trend, periodic and stable components, calculates the similarity between the historical operation and maintenance index data and each of these components, and determines the index category of the historical operation and maintenance index data from the calculated similarities is described above and is not repeated here.
In addition, the offline module may obtain a corresponding relationship between the index type of the pre-stored operation and maintenance index data and the alarm rule, so as to determine the target alarm rule corresponding to the index type of the historical operation and maintenance index data according to the corresponding relationship. For the alarm rules corresponding to the operation and maintenance index data of the trend category, the alarm rules corresponding to the operation and maintenance index data of the period category, and the specific description of the alarm rules corresponding to the operation and maintenance index data of the stable category, reference is made to the above description, which is not repeated herein.
The real-time module may also send abnormal data in the real-time operation and maintenance index data to a preset terminal. The preset terminal audits the abnormal data and, after confirming that the data is indeed abnormal, sends audit passing information to the real-time module. After receiving the audit passing information, the real-time module raises an abnormality alarm.
In the embodiment of the present application, the real-time module obtains real-time operation and maintenance index data of the system to be monitored, and the offline module determines a target alarm rule according to the index category of the historical operation and maintenance index data of the system to be monitored, where the historical operation and maintenance index data is the operation and maintenance index data of the system in a first preset time period before the real-time operation and maintenance index data is obtained. A technician does not need to set an alarm value from historical experience; instead, the alarm rule is determined from the index category of the system's historical operation and maintenance index data, so that the target alarm rule matches the system's operation and maintenance index data. The real-time module then judges whether the real-time operation and maintenance index data is abnormal according to the target alarm rule and raises an abnormality alarm if it is, which improves monitoring accuracy. Moreover, when the system to be monitored has a large amount of operation and maintenance index data, the alarm rule can be determined quickly from the index category of the historical data, which shortens the system monitoring cycle, improves monitoring efficiency, and makes the method practical to apply.
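The division of work between the two modules can be sketched as below. The class names, the representation of an alarm rule as a callable returning True for abnormal values, and the `review` callback standing in for the preset terminal are all hypothetical choices for this illustration.

```python
from typing import Callable, Dict, List

Rule = Callable[[float], bool]  # returns True when a value is abnormal


class OfflineModule:
    def __init__(self, rules_by_category: Dict[str, Rule]):
        # Pre-stored correspondence between index category and alarm rule.
        self.rules_by_category = rules_by_category

    def target_rule(self, category: str) -> Rule:
        return self.rules_by_category[category]


class RealTimeModule:
    def __init__(self, rule: Rule, review: Callable[[float], bool]):
        self.rule = rule
        self.review = review  # stands in for the preset terminal's audit
        self.alarms: List[float] = []

    def handle(self, value: float) -> bool:
        """Alarm only if the rule flags the value and the audit confirms it."""
        if self.rule(value) and self.review(value):
            self.alarms.append(value)
            return True
        return False
```

Keeping the rule lookup in the offline module means the real-time path only evaluates a ready-made rule per data point.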
Fig. 6 is a schematic structural diagram of a system monitoring apparatus according to an embodiment of the present application, corresponding to the system monitoring method of the foregoing embodiments. For convenience of explanation, only the portions related to the embodiments of the present application are shown. The system monitoring apparatus 60 includes: an obtaining module 601, a determining module 602, a judging module 603 and an alarm module 604. The system monitoring apparatus may be the system monitoring device itself, or a chip or an integrated circuit that implements the functions of the system monitoring device. It should be noted that the division into the obtaining module, the determining module, the judging module and the alarm module is only a division of logical functions; physically, these modules may be integrated or independent.
The obtaining module 601 is configured to obtain real-time operation and maintenance index data of a system to be monitored.
A determining module 602, configured to determine a target alarm rule according to an index category of historical operation and maintenance index data of the system to be monitored, where the historical operation and maintenance index data is operation and maintenance index data of the system to be monitored within a first preset time period before the real-time operation and maintenance index data is acquired.
The judging module 603 is configured to judge whether the real-time operation and maintenance index data is abnormal according to the target alarm rule.
The alarm module 604 is configured to perform an abnormal alarm if the real-time operation and maintenance index data is abnormal.
In a possible implementation manner, the determining module 602 is further configured to:
performing time sequence decomposition on the historical operation and maintenance index data to obtain a trend component, a period component and a stable component of the historical operation and maintenance index data;
calculating a first similarity of the historical operation and maintenance index data and the trend component, a second similarity of the historical operation and maintenance index data and the periodic component, and a third similarity of the historical operation and maintenance index data and the stable component;
and determining the index category of the historical operation and maintenance index data according to the first similarity, the second similarity and the third similarity.
In a possible implementation manner, the determining module 602 is specifically configured to:
performing detrending processing on the historical operation and maintenance index data to obtain a detrending sequence, and performing local weighted regression processing on each subsequence in the detrending sequence to obtain a temporary periodic sequence;
low-pass filtering the temporary periodic sequence to obtain a low-frequency time sequence, and performing de-trending processing on the temporary periodic sequence according to the low-frequency time sequence to obtain the periodic component;
according to the periodic component, performing de-periodization processing on the historical operation and maintenance index data, and performing local weighted regression processing on the de-periodized historical operation and maintenance index data to obtain the trend component;
and obtaining the stable component according to the historical operation and maintenance index data, the periodic component and the trend component.
In a possible implementation manner, the determining module 602 is specifically configured to:
judging whether the periodic component and the trend component converge;
and if the periodic component and the trend component are converged, executing the step of obtaining the stable component according to the historical operation and maintenance index data, the periodic component and the trend component.
In a possible implementation manner, the determining module 602 is further configured to:
determining a target robust weight according to the stable component;
and according to the target robust weight, performing local weighted regression processing on each sub-sequence in the de-trending sequence, and according to the target robust weight, performing local weighted regression processing on the historical operation and maintenance index data after the de-periodization processing.
In a possible implementation manner, the determining module 602 is specifically configured to:
constructing a first matrix to be processed according to the historical operation and maintenance index data and the trend component;
searching all regular paths from a first matrix point to a last matrix point in the first matrix to be processed;
obtaining a path with the minimum regular cost in all the regular paths;
and determining the first similarity between the historical operation and maintenance index data and the trend component according to the path with the minimum regular cost.
In a possible implementation manner, the determining module 602 is specifically configured to:
comparing the first similarity, the second similarity, and the third similarity;
if the first similarity is the maximum value, determining that the index category of the historical operation and maintenance index data is the trend category;
if the second similarity is the maximum value, determining that the index category of the historical operation and maintenance index data is the period category;
and if the third similarity is the maximum value, determining that the index category of the historical operation and maintenance index data is the stable category.
In a possible implementation manner, the determining module 602 is specifically configured to:
acquiring the corresponding relation between the index category of prestored operation and maintenance index data and an alarm rule;
and determining a target alarm rule corresponding to the index category of the historical operation and maintenance index data according to the corresponding relation.
In a possible implementation manner, the determining module 602 is specifically configured to:
for the operation and maintenance index data of the stable category, obtaining a stable component and a standard deviation of the operation and maintenance index data of the stable category;
and acquiring an alarm rule corresponding to the operation and maintenance index data of the stable category according to the stable component and the standard deviation.
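One way to read this implementation is as a band around the stable component whose width is a multiple of the standard deviation. The sketch below assumes exactly that; the multiplier `k = 3`, the use of the population standard deviation and the function name are assumptions for illustration, not values stated here.

```python
from statistics import mean, pstdev
from typing import Callable, List


def stable_rule(stable: List[float], k: float = 3.0) -> Callable[[float], bool]:
    """Build a rule that flags values more than k standard deviations
    away from the centre of the stable component (k is an assumption)."""
    centre = mean(stable)
    sigma = pstdev(stable)
    return lambda value: abs(value - centre) > k * sigma
```

For example, `stable_rule([9.0, 10.0, 11.0, 10.0, 10.0])` accepts 10.5 and flags 13.0.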
In a possible implementation manner, the determining module 602 is specifically configured to:
for the operation and maintenance index data of the cycle category, taking a second preset time period as a cycle, comparing the operation and maintenance index data of the cycle category with the operation and maintenance index data of the historical cycle category to determine a data growth rate, wherein the operation and maintenance index data of the historical cycle category is acquired in the cycle before the operation and maintenance index data of the cycle category is acquired;
determining a predicted value of the operation and maintenance index data of the period category according to a preset LSTM and the mean square error of the operation and maintenance index data of the historical period category;
and acquiring an alarm rule corresponding to the operation and maintenance index data of the period category according to the data growth rate and the predicted value.
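The growth-rate comparison against the previous cycle can be sketched as follows. The predicted value is passed in by the caller and stands in for the output of the preset LSTM, which is not modelled here. How the growth rate and the predicted value are combined into a rule is not spelled out in this passage, so the conjunction used below, the two thresholds and the function names are assumptions.

```python
from typing import Callable, List


def growth_rate(current: List[float], previous: List[float]) -> List[float]:
    """Point-by-point growth of the current cycle over the previous cycle.

    Raises ZeroDivisionError if a previous value is 0.
    """
    return [(c - p) / p for c, p in zip(current, previous)]


def period_rule(max_growth: float,
                max_error: float) -> Callable[[float, float, float], bool]:
    """Flag a value whose growth over the previous cycle AND whose deviation
    from the predicted value both exceed their thresholds (assumed logic)."""
    def is_abnormal(value: float, previous: float, predicted: float) -> bool:
        rate = (value - previous) / previous
        return abs(rate) > max_growth and abs(value - predicted) > max_error
    return is_abnormal
```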
In a possible implementation manner, the determining module 602 is specifically configured to:
for the operation and maintenance index data of the trend category, comparing the operation and maintenance index data of the trend category with the operation and maintenance index data of the historical trend category to determine a ring ratio (period-over-period) growth rate, wherein the operation and maintenance index data of the historical trend category is acquired before the operation and maintenance index data of the trend category is acquired;
determining a binary search tree according to the operation and maintenance index data of the historical trend category, and determining each path length of the operation and maintenance index data of the trend category according to the binary search tree;
and acquiring an alarm rule corresponding to the operation and maintenance index data of the trend category according to the ring ratio growth rate and the path lengths.
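The two quantities named above can be illustrated with a plain binary search tree built by inserting the historical values in order. This is a literal reading made for the example; the description may intend a different tree construction, and the names `build_bst`, `path_length` and `ring_ratio_growth` are hypothetical.

```python
from typing import List, Optional


class Node:
    def __init__(self, value: float):
        self.value = value
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def _insert(node: Optional[Node], v: float) -> Node:
    if node is None:
        return Node(v)
    if v < node.value:
        node.left = _insert(node.left, v)
    else:
        node.right = _insert(node.right, v)
    return node


def build_bst(values: List[float]) -> Optional[Node]:
    """Insert the historical values one by one, in the order given."""
    root: Optional[Node] = None
    for v in values:
        root = _insert(root, v)
    return root


def path_length(root: Optional[Node], v: float) -> int:
    """Number of edges followed before the search for v stops."""
    length, node = 0, root
    while node is not None and node.value != v:
        node = node.left if v < node.value else node.right
        length += 1
    return length


def ring_ratio_growth(current: float, previous: float) -> float:
    """Growth of the current value over the immediately preceding one."""
    return (current - previous) / previous
```

The thresholds applied to the growth rate and the path lengths are left to the caller, since this passage does not state them.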
In a possible implementation manner, the alarm module 604 is specifically configured to:
sending abnormal data in the real-time operation and maintenance index data to a preset terminal, wherein the abnormal data is used for indicating the preset terminal to check the abnormal data;
and if the audit passing information sent by the preset terminal is received, performing abnormal alarm.
The apparatus provided in the embodiment of the present application may be configured to implement the technical solution of the method embodiment, and the implementation principle and the technical effect are similar, which are not described herein again in the embodiment of the present application.
Optionally, fig. 7 schematically shows one possible basic hardware architecture of the system monitoring device described in the present application.
Referring to fig. 7, the system monitoring device 700 includes at least one processor 701 and a communication interface 703. Further optionally, a memory 702 and a bus 704 may also be included.
The system monitoring device 700 may be the processing device, and the present application is not limited thereto. In the system monitoring device 700, the number of processors 701 may be one or more; fig. 7 illustrates only one processor 701. Optionally, the processor 701 may be a CPU, a Graphics Processing Unit (GPU), or a Digital Signal Processor (DSP). If the system monitoring device 700 has a plurality of processors 701, their types may be different or the same. Optionally, the plurality of processors 701 of the system monitoring device 700 may also be integrated into a multi-core processor.
Memory 702 stores computer instructions and data; the memory 702 may store computer instructions and data required to implement the above-described system monitoring methods provided herein, e.g., the memory 702 stores instructions for implementing the steps of the above-described system monitoring methods. Memory 702 can be any one or any combination of the following storage media: nonvolatile memory (e.g., Read Only Memory (ROM), Solid State Disk (SSD), hard disk (HDD), optical disk), volatile memory.
The communication interface 703 may provide information input/output for the at least one processor. Any one or any combination of the following devices may also be included: a network interface (e.g., an ethernet interface), a wireless network card, etc. having a network access function.
Optionally, the communication interface 703 may also be used for the system monitoring apparatus 700 to perform data communication with other computing apparatuses or terminals.
Further optionally, fig. 7 shows the bus 704 as a thick line. The bus 704 may connect the processor 701 with the memory 702 and the communication interface 703. Thus, via the bus 704, the processor 701 may access the memory 702 and may also interact with other computing devices or terminals using the communication interface 703.
In the present application, the system monitoring apparatus 700 executes computer instructions in the memory 702, so that the system monitoring apparatus 700 implements the system monitoring method provided in the present application, or the system monitoring apparatus 700 deploys the system monitoring device.
From the perspective of logical functional division, as shown in fig. 7, the memory 702 may include an obtaining module 601, a determining module 602, a judging module 603 and an alarm module 604. The instructions stored in the memory, when executed, implement the functions of the obtaining module, the determining module, the judging module and the alarm module respectively, and are not limited to a physical structure.
In addition, the system monitoring device may be implemented by software as shown in fig. 7, or may be implemented by hardware as a hardware module or a circuit unit.
The present application provides a computer-readable storage medium comprising computer instructions that instruct a computing device to perform the above system monitoring method provided by the present application.
The present application provides a computer program product comprising computer instructions which, when executed by a processor, perform the above system monitoring method provided by the present application.
The present application provides a chip comprising at least one processor and a communication interface providing information input and/or output for the at least one processor. Further, the chip may also include at least one memory for storing computer instructions. The at least one processor is used for calling and running the computer instructions to execute the system monitoring method provided by the application.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative. For example, the division of the units is only one logical division, and other divisions are possible in practice: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or of another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.

Claims (16)

1. A method for system monitoring, comprising:
acquiring real-time operation and maintenance index data of a system to be monitored;
determining a target alarm rule according to the index category of historical operation and maintenance index data of the system to be monitored, wherein the historical operation and maintenance index data is the operation and maintenance index data of the system to be monitored in a first preset time period before the real-time operation and maintenance index data is obtained;
judging whether the real-time operation and maintenance index data is abnormal or not according to the target alarm rule;
and if the real-time operation and maintenance index data is abnormal, performing abnormal alarm.
2. The method according to claim 1, before determining a target alarm rule according to the index category of the historical operation and maintenance index data of the system to be monitored, further comprising:
performing time sequence decomposition on the historical operation and maintenance index data to obtain a trend component, a periodic component and a stable component of the historical operation and maintenance index data;
calculating a first similarity of the historical operation and maintenance index data and the trend component, a second similarity of the historical operation and maintenance index data and the periodic component, and a third similarity of the historical operation and maintenance index data and the stable component;
and determining the index category of the historical operation and maintenance index data according to the first similarity, the second similarity and the third similarity.
3. The method of claim 2, wherein the performing a time sequence decomposition on the historical operation and maintenance index data to obtain a trend component, a periodic component, and a stable component of the historical operation and maintenance index data comprises:
performing detrending processing on the historical operation and maintenance index data to obtain a detrended sequence, and performing local weighted regression processing on each subsequence in the detrended sequence to obtain a temporary periodic sequence;
performing low-pass filtering on the temporary periodic sequence to obtain a low-frequency time sequence, and performing detrending processing on the temporary periodic sequence according to the low-frequency time sequence to obtain the periodic component;
performing de-periodization processing on the historical operation and maintenance index data according to the periodic component, and performing local weighted regression processing on the de-periodized historical operation and maintenance index data to obtain the trend component;
and obtaining the stable component according to the historical operation and maintenance index data, the periodic component and the trend component.
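As a rough illustration of splitting a series into trend, periodic and stable parts, the sketch below uses a classical additive decomposition (moving-average trend, per-phase means, remainder). This is a simpler stand-in, not the local-weighted-regression procedure recited in claim 3; the function name `decompose` and the `period` parameter are assumptions introduced for the example.

```python
from statistics import mean


def decompose(series, period):
    """Classical additive decomposition: series = trend + periodic + stable."""
    n = len(series)
    half = period // 2
    # Trend: centred moving average spanning roughly one period
    # (the window shrinks near both ends of the series).
    trend = []
    for i in range(n):
        window = series[max(0, i - half): min(n, i + half + 1)]
        trend.append(mean(window))
    # Periodic component: mean detrended value at each phase of the cycle,
    # re-centred so that it sums to zero over one period.
    detrended = [x - t for x, t in zip(series, trend)]
    phase_means = [mean(detrended[p::period]) for p in range(period)]
    offset = mean(phase_means)
    periodic = [phase_means[i % period] - offset for i in range(n)]
    # Stable component: what trend and periodic component leave unexplained.
    stable = [x - t - s for x, t, s in zip(series, trend, periodic)]
    return trend, periodic, stable
```

By construction the three returned lists add back up to the input series, which mirrors the last step of claim 3 (the stable component is derived from the data, the periodic component and the trend component).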
4. The method of claim 3, further comprising, before the obtaining the stable component from the historical operation and maintenance index data, the periodic component, and the trend component:
judging whether the periodic component and the trend component converge;
the obtaining the stable component according to the historical operation and maintenance index data, the periodic component and the trend component includes:
and if the periodic component and the trend component are converged, executing the step of obtaining the stable component according to the historical operation and maintenance index data, the periodic component and the trend component.
5. The method of claim 3, wherein after obtaining the stable component from the historical operation and maintenance index data, the periodic component, and the trend component, further comprising:
determining a target robust weight according to the stable component;
the performing local weighted regression processing on each subsequence in the detrended sequence comprises:
performing local weighted regression processing on each subsequence in the detrended sequence according to the target robust weight;
the performing local weighted regression processing on the de-periodized historical operation and maintenance index data comprises:
and performing local weighted regression processing on the de-periodized historical operation and maintenance index data according to the target robust weight.
6. The method of claim 2, wherein calculating the first similarity of the historical operation and maintenance index data to the trend component comprises:
constructing a first matrix to be processed according to the historical operation and maintenance index data and the trend component;
searching all warping paths from a first matrix point to a last matrix point in the first matrix to be processed;
obtaining a path with the minimum warping cost among all the warping paths;
and determining the first similarity between the historical operation and maintenance index data and the trend component according to the path with the minimum warping cost.
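The path search in claim 6 reads like dynamic time warping (DTW). A minimal sketch under that assumption follows: it finds the minimum cumulative cost from the first to the last matrix point by dynamic programming rather than by enumerating every path. The absolute-difference distance and the cost-to-similarity mapping `1 / (1 + cost)` are assumptions, not taken from the patent.

```python
def dtw_cost(a, b):
    """Minimum cumulative warping cost between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest path cost from cell (0, 0) to cell (i, j)
    cost = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(a[i] - b[j])  # entry of the matrix to be processed
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            # A path may arrive from above, from the left, or diagonally.
            best_prev = min(
                cost[i - 1][j] if i > 0 else inf,
                cost[i][j - 1] if j > 0 else inf,
                cost[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            cost[i][j] = d + best_prev
    return cost[n - 1][m - 1]


def similarity(a, b):
    """Hypothetical mapping: a lower warping cost gives a higher similarity."""
    return 1.0 / (1.0 + dtw_cost(a, b))
```

The same routine would be applied three times, once per component, to obtain the first, second and third similarities.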
7. The method of claim 2, wherein determining the index category of the historical operation and maintenance index data according to the first similarity, the second similarity and the third similarity comprises:
comparing the first similarity, the second similarity, and the third similarity;
if the first similarity is the maximum value, determining that the index category of the historical operation and maintenance index data is the trend category;
if the second similarity is the maximum value, determining that the index category of the historical operation and maintenance index data is the period category;
and if the third similarity is the maximum value, determining that the index category of the historical operation and maintenance index data is the stable category.
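The comparison in claim 7 amounts to picking the label with the largest similarity. A small sketch, with the function name and the label strings chosen for illustration; the claim does not say how ties are resolved, so the tie-breaking here (the first label in dictionary order wins) is an assumption.

```python
def index_category(sim_trend, sim_period, sim_stable):
    """Return the category whose component is most similar to the data."""
    scores = {"trend": sim_trend, "period": sim_period, "stable": sim_stable}
    # max() returns the first key with the largest score, so ties fall to
    # "trend", then "period"; this ordering is an assumption.
    return max(scores, key=scores.get)
```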
8. The method according to any one of claims 1 to 7, wherein the determining a target alarm rule according to an index category of historical operation and maintenance index data of the system to be monitored comprises:
acquiring the corresponding relation between the index category of prestored operation and maintenance index data and an alarm rule;
and determining the target alarm rule corresponding to the index category of the historical operation and maintenance index data according to the corresponding relation.
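Claims 1 and 8 together describe a lookup followed by a check: the index category selects a pre-stored rule, the rule judges the real-time value, and an alarm is raised if the value is abnormal. The sketch below models a rule as a function returning `True` for abnormal values; the table `RULES`, its thresholds and the `alert` callback are placeholders invented for the example.

```python
def judge_and_alert(value, category, rules, alert):
    """Look up the rule for `category`, apply it to `value`, alert if abnormal."""
    rule = rules[category]  # the pre-stored category-to-rule correspondence
    abnormal = rule(value)
    if abnormal:
        alert(f"{category} metric abnormal: {value}")
    return abnormal


# Hypothetical correspondence table; the thresholds are placeholders.
RULES = {
    "stable": lambda v: abs(v - 50.0) > 10.0,
    "period": lambda v: v > 200.0,
    "trend": lambda v: v < 0.0,
}
```

For instance, `judge_and_alert(75.0, "stable", RULES, print)` would print one alarm message and return `True` under these placeholder thresholds.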
9. The method according to claim 8, wherein the obtaining of the correspondence between the index category of the pre-stored operation and maintenance index data and the alarm rule comprises:
for the operation and maintenance index data of a stable category, obtaining a stable component and a standard deviation of the operation and maintenance index data of a historical stable category, wherein the operation and maintenance index data of the historical stable category is obtained before the operation and maintenance index data of the stable category is obtained;
and acquiring an alarm rule corresponding to the operation and maintenance index data of the stable category according to the stable component and the standard deviation.
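One plausible reading of claim 9 is a band around the historical level whose width is a multiple of the standard deviation. The sketch below builds such a rule from historical stable-category data; the multiplier `k = 3` and the use of the plain mean as the centre are assumptions, since the claim does not state how the stable component and the standard deviation are combined.

```python
from statistics import mean, pstdev


def stable_rule(history, k=3.0):
    """Build a rule flagging values outside mean +/- k standard deviations."""
    center = mean(history)
    spread = pstdev(history)  # population standard deviation of the history
    lower, upper = center - k * spread, center + k * spread

    def is_abnormal(value):
        return value < lower or value > upper

    return is_abnormal
```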
10. The method according to claim 8, wherein the obtaining of the correspondence between the index category of the pre-stored operation and maintenance index data and the alarm rule comprises:
for the operation and maintenance index data of the period category, taking a second preset time period as a cycle, comparing the operation and maintenance index data of the period category with the operation and maintenance index data of the historical period category to determine a data growth rate, wherein the operation and maintenance index data of the historical period category is acquired in the cycle before the operation and maintenance index data of the period category is acquired;
determining a predicted value of the operation and maintenance index data of the period category according to a preset long-short term memory network (LSTM) and a mean square error of the operation and maintenance index data of the historical period category;
and acquiring an alarm rule corresponding to the operation and maintenance index data of the period category according to the data growth rate and the predicted value.
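A hedged sketch of how the two quantities in claim 10 might be combined into a rule. The predicted value is taken as an input here; in the claim it comes from an LSTM, which this example does not implement. The thresholds `max_growth` and `tolerance`, and the choice to require both conditions at once, are assumptions.

```python
def growth_rate(current, previous):
    """Relative change of `current` against the value one cycle earlier."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / abs(previous)


def period_rule(current, same_point_last_cycle, predicted,
                max_growth=0.5, tolerance=0.2):
    """Flag a value that jumps versus the last cycle AND misses the forecast."""
    rate = growth_rate(current, same_point_last_cycle)
    # Relative distance from the forecast; the tiny floor avoids dividing by zero.
    deviation = abs(current - predicted) / max(abs(predicted), 1e-12)
    return abs(rate) > max_growth and deviation > tolerance
```

Requiring both conditions means a large cycle-over-cycle jump that the forecast already anticipated is not reported; a deployment could equally well choose to alarm on either condition alone.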
11. The method according to claim 8, wherein the obtaining of the correspondence between the index category of the pre-stored operation and maintenance index data and the alarm rule comprises:
for the operation and maintenance index data of the trend category, comparing the operation and maintenance index data of the trend category with the operation and maintenance index data of the historical trend category to determine a period-over-period growth rate, wherein the operation and maintenance index data of the historical trend category is acquired before the operation and maintenance index data of the trend category is acquired;
determining a binary search tree according to the operation and maintenance index data of the historical trend category, and determining the length of each path of the operation and maintenance index data of the trend category according to the binary search tree;
and acquiring an alarm rule corresponding to the operation and maintenance index data of the trend category according to the period-over-period growth rate and the path lengths.
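The binary tree and path lengths in claim 11 resemble an isolation-tree scheme, in which unusual values are separated from the rest after fewer random splits. The sketch below assumes that reading and works on one-dimensional data; it omits the subsampling and path-length normalization used by full isolation forests, and all names and defaults (`trees`, `seed`) are illustrative.

```python
import random


def build_tree(points, rng):
    """Recursively split 1-D points at random cut values until isolated."""
    lo, hi = min(points), max(points)
    if len(points) <= 1 or lo == hi:
        return None  # leaf: nothing left to separate
    cut = rng.uniform(lo, hi)
    while cut == lo:  # redraw so that both sides stay non-empty
        cut = rng.uniform(lo, hi)
    left = [x for x in points if x < cut]
    right = [x for x in points if x >= cut]
    return (cut, build_tree(left, rng), build_tree(right, rng))


def path_length(tree, value):
    """Number of splits `value` passes before reaching a leaf."""
    depth = 0
    while tree is not None:
        cut, left, right = tree
        tree = left if value < cut else right
        depth += 1
    return depth


def mean_path_length(history, value, trees=200, seed=0):
    """Average path length of `value` over several random trees."""
    rng = random.Random(seed)
    forest = [build_tree(history, rng) for _ in range(trees)]
    return sum(path_length(t, value) for t in forest) / trees
```

A value far outside the historical range tends to get a shorter average path than a typical one, so an alarm rule could combine a path-length threshold with the growth rate; the exact combination is not specified in the claim.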
12. The method according to any one of claims 1 to 7, further comprising, prior to said performing an anomaly alert:
sending abnormal data in the real-time operation and maintenance index data to a preset terminal, wherein the abnormal data is used for instructing the preset terminal to audit the abnormal data;
the performing an abnormal alarm comprises:
and if audit-pass information sent by the preset terminal is received, performing the abnormal alarm.
13. A system monitoring device, comprising:
the acquisition module is used for acquiring real-time operation and maintenance index data of the system to be monitored;
the determining module is used for determining a target alarm rule according to the index category of the historical operation and maintenance index data of the system to be monitored, wherein the historical operation and maintenance index data is the operation and maintenance index data of the system to be monitored in a first preset time period before the real-time operation and maintenance index data is acquired;
the judging module is used for judging whether the real-time operation and maintenance index data is abnormal or not according to the target alarm rule;
and the alarm module is used for carrying out abnormal alarm if the real-time operation and maintenance index data is abnormal.
14. System monitoring equipment, comprising:
a processor;
a memory; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor, the computer program comprising instructions for performing the method of any of claims 1-12.
15. A computer-readable storage medium, characterized in that it stores a computer program that causes a server to execute the method of any of claims 1-12.
16. A computer program product comprising computer instructions which, when executed by a processor, implement the method of any one of claims 1-12.
CN202011493657.8A 2020-12-16 2020-12-16 System monitoring method, device, equipment and storage medium Pending CN112612671A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011493657.8A CN112612671A (en) 2020-12-16 2020-12-16 System monitoring method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112612671A true CN112612671A (en) 2021-04-06

Family

ID=75239989

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011493657.8A Pending CN112612671A (en) 2020-12-16 2020-12-16 System monitoring method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112612671A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination