CN113127311A - Anomaly detection method and device - Google Patents


Info

Publication number
CN113127311A
Authority
CN
China
Prior art keywords
target
application program
alarm information
physical device
abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110523606.3A
Other languages
Chinese (zh)
Inventor
孙晓梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202110523606.3A
Publication of CN113127311A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses an anomaly detection method and device. The method obtains target alarm information of at least one application program within a preset time length, counts the target alarm information of each application program to obtain a statistical result, and determines whether the statistical result meets a preset linkage detection trigger condition. If so, at least one application program corresponding to a target physical device is determined as a target application program according to the topological correspondence between the physical devices in a physical device set and the application programs, and whether the target physical device is abnormal is determined based on the target alarm information of each target application program. The actual fault object can thereby be detected effectively.

Description

Anomaly detection method and device
Technical Field
The invention relates to the technical field of computers, in particular to an anomaly detection method and device.
Background
With the development of science and technology, IT monitoring technology continues to improve.
An IT monitoring system model can be divided, from top to bottom, into a service layer, an application layer and an infrastructure layer, and the prior art can monitor and raise alarms on the working process of each layer separately. Because the monitoring personnel of each layer have different concerns and tolerances, the monitoring indexes and monitoring thresholds configured for each layer may differ.
An abnormal alarm occurring at the application layer may be caused by a fault at the infrastructure layer.
However, in actual monitoring, the monitoring threshold of the application layer is often triggered before that of the infrastructure layer. In the prior art, it is therefore difficult to determine the actual fault object from the abnormal alarm information of the application layer alone.
Disclosure of Invention
In view of the above problems, the present invention provides an anomaly detection method and device that overcome or at least partially solve the above problems. The technical solution is as follows:
An anomaly detection method, comprising:
obtaining target alarm information of at least one application program within a preset time length;
counting the target alarm information of each application program to obtain a statistical result of the target alarm information;
determining whether the statistical result meets a preset linkage detection triggering condition and, if so, determining at least one application program corresponding to a target physical device as a target application program according to a topological correspondence between the physical devices in a physical device set and the application programs, wherein the physical device set is composed of at least one physical device;
and determining whether the target physical device is abnormal based on the target alarm information of each target application program.
Optionally, the target alarm information includes alarm information of a specified type; the obtaining of the target alarm information of at least one application program within the preset time length includes:
obtaining the alarm information of the specified type from the application monitoring data of at least one application program within the preset time length.
Optionally, the linkage detection triggering condition is: the total number of the target alarm information is not less than a first preset threshold value and the total number of the alarm application programs is not less than a second preset threshold value;
the counting of the target alarm information of each application program includes:
and counting the total number of the target alarm information and the total number of the alarm application programs.
Optionally, determining whether the target physical device is abnormal based on the target alarm information of each target application program includes:
determining the equipment abnormality rate of the target physical equipment based on the target alarm information of each target application program;
and determining whether the equipment abnormality rate of the target physical equipment is greater than a third preset threshold, and if so, determining that the target physical equipment is abnormal physical equipment.
Optionally, the determining the device anomaly rate of the target physical device based on the target alarm information of each target application program includes:
determining a first program quantity, wherein the first program quantity is the program quantity of the target application program with the target alarm information in the preset time length;
determining a ratio of the first program quantity to a second program quantity as a device abnormality rate of the target physical device, where the second program quantity is a total program quantity of the target application program corresponding to the target physical device.
Optionally, the determining whether the target physical device is abnormal based on the target alarm information of each target application includes:
if the total number of pieces of target alarm information occurring for the target application programs within the preset time length is greater than a fourth preset threshold, determining that the target physical device is an abnormal physical device.
Optionally, the method further includes:
obtaining monitoring data for the abnormal physical device on an infrastructure layer;
and respectively outputting alarm information to the monitoring equipment of the abnormal physical equipment and the monitoring equipment of an abnormal application program, wherein the abnormal application program is the application program corresponding to the abnormal physical equipment, and the alarm information carries the monitoring data.
An abnormality detection device comprising: a first obtaining unit, a statistical unit, a first determining unit, a second determining unit and a third determining unit, wherein:
the first obtaining unit is configured to perform: obtaining target alarm information of at least one application program within a preset time length;
the statistical unit is configured to perform: counting the target alarm information of each application program to obtain a statistical result of the target alarm information;
the first determination unit is configured to perform: determining whether the statistical result meets a preset linkage detection triggering condition, and if so, triggering the second determining unit;
the second determination unit configured to perform: determining at least one application program corresponding to a target physical device as a target application program according to a topological corresponding relation between the physical devices in a physical device set and the application program, wherein the physical device set is composed of at least one physical device;
the third determination unit is configured to perform: and determining whether the target physical equipment is abnormal or not based on the target alarm information of each target application program.
Optionally, the target alarm information includes alarm information of a specified type;
the first obtaining unit is configured to perform: and acquiring the alarm information of the specified type from the application monitoring data of at least one application program within a preset time length.
Optionally, the linkage detection triggering condition is: the total number of the target alarm information is not less than a first preset threshold value and the total number of the alarm application programs is not less than a second preset threshold value;
the statistical unit is configured to perform: and counting the total number of the target alarm information and the total number of the alarm application programs.
Optionally, the third determining unit includes: a fourth determination unit, a fifth determination unit, and a sixth determination unit, wherein:
the fourth determination unit configured to perform: determining the equipment abnormality rate of the target physical equipment based on the target alarm information of each target application program;
the fifth determination unit configured to perform: determining whether the equipment abnormality rate of the target physical equipment is greater than a third preset threshold, and if so, triggering the sixth determining unit;
the sixth determining unit configured to perform: and determining that the target physical device is an abnormal physical device.
Optionally, the fourth determining unit includes: a seventh determining unit and an eighth determining unit, wherein:
the seventh determining unit configured to perform: determining a first program quantity, wherein the first program quantity is the program quantity of the target application program with the target alarm information in the preset time length;
the eighth determining unit configured to perform: determining a ratio of the first program quantity to a second program quantity as a device abnormality rate of the target physical device, where the second program quantity is a total program quantity of the target application program corresponding to the target physical device.
Optionally, the third determining unit is configured to perform: if the total number of pieces of target alarm information occurring for the target application programs within the preset time length is greater than a fourth preset threshold, determining that the target physical device is an abnormal physical device.
Optionally, the apparatus further comprises: a second obtaining unit and an output unit, wherein:
the second obtaining unit is configured to perform: obtaining monitoring data for the abnormal physical device on an infrastructure layer;
the output unit configured to perform: and respectively outputting alarm information to the monitoring equipment of the abnormal physical equipment and the monitoring equipment of an abnormal application program, wherein the abnormal application program is the application program corresponding to the abnormal physical equipment, and the alarm information carries the monitoring data.
The anomaly detection method and device provided by the invention can obtain target alarm information of at least one application program within a preset time length, count the target alarm information of each application program to obtain a statistical result, and determine whether the statistical result meets a preset linkage detection trigger condition. If so, at least one application program corresponding to a target physical device is determined as a target application program according to the topological correspondence between the physical devices in a physical device set and the application programs, and whether the target physical device is abnormal is determined based on the target alarm information of each target application program. The actual fault object can thereby be detected effectively.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description are only embodiments of the present application; those skilled in the art can obtain other drawings from the provided drawings without creative effort.
FIG. 1 is a flow chart of a first anomaly detection method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a topological correspondence between an application layer and an infrastructure layer provided by an embodiment of the present invention;
FIG. 3 is a flow chart of a second anomaly detection method provided by an embodiment of the present invention;
FIG. 4 is a flow chart of a third anomaly detection method provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a first anomaly detection device provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a second anomaly detection device provided by an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
As shown in FIG. 1, this embodiment proposes a first anomaly detection method, which may include the following steps:
S101, obtaining target alarm information of at least one application program within a preset time length;
It should be noted that the application layer may include one or more application programs. Specifically, the invention can obtain the target alarm information of each application program in the application layer that occurs within the preset time length.
The target alarm information may be any type of alarm information, or may be a specific type of alarm information.
It should be noted that if target alarm information of a plurality of application programs appears at the application layer in a concentrated manner within a short time, the invention may determine that the alarms of some application programs may be caused by a fault at the infrastructure layer. Therefore, the invention can periodically collect the target alarm information of the application layer within a short time window, and use the target alarm information collected in each window to judge whether the alarm of some application program is likely caused by a fault at the infrastructure layer.
Specifically, the preset time length may be this short time window. Its specific duration can be set by a technician according to actual working conditions, which is not limited by the invention.
Specifically, the present invention may configure a corresponding application monitoring program for each application program, and monitor the application program using the application monitoring program to obtain application monitoring data generated by the application monitoring program in the process of monitoring the application program.
Then, the invention can obtain, from the application monitoring data of each application program, the target alarm information of that application program occurring within the preset time length. For example, the invention may obtain the target alarm information of the first application program occurring within the preset time length from the application monitoring data of the first application program, and the target alarm information of the second application program occurring within the preset time length from the application monitoring data of the second application program.
Specifically, the invention may perform this acquisition periodically, and use the target alarm information obtained in each period to determine whether the alarms of some application programs are caused by the infrastructure layer. For example, when the preset time length is 0.5 seconds, the invention may obtain the target alarm information of each application program occurring within the first 0.5 seconds and within the second 0.5 seconds, determine from the target alarm information of the first 0.5 seconds whether the alarms of some application programs are caused by the infrastructure layer, and likewise make the same determination from the target alarm information of the second 0.5 seconds.
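The periodic, window-by-window collection described above can be sketched as follows. This is only an illustrative sketch: the `Alarm` record, its fields, and the helper names `alarms_in_window` and `split_into_windows` are assumptions introduced here, not part of the disclosure, and the half-open window boundary is one possible choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    """Hypothetical alarm record extracted from application monitoring data."""
    app: str          # application program that raised the alarm
    kind: str         # alarm type (assumed label)
    timestamp: float  # seconds since an arbitrary reference point


def alarms_in_window(alarms, window_start, window_length):
    """Return alarms with window_start <= timestamp < window_start + window_length."""
    window_end = window_start + window_length
    return [a for a in alarms if window_start <= a.timestamp < window_end]


def split_into_windows(alarms, start, window_length, window_count):
    """Collect alarms for consecutive windows of the preset time length."""
    return [
        alarms_in_window(alarms, start + i * window_length, window_length)
        for i in range(window_count)
    ]


monitoring_data = [
    Alarm("app1", "no_response", 0.10),
    Alarm("app2", "time_consuming", 0.40),
    Alarm("app1", "waiting_time", 0.70),
]

# Preset time length of 0.5 seconds, as in the example in the text.
first, second = split_into_windows(
    monitoring_data, start=0.0, window_length=0.5, window_count=2
)
```

Each returned window would then be evaluated independently by the later steps.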
Optionally, the target alarm information may include alarm information of a specified type. In this case, step S101 may include:
obtaining the alarm information of the specified type from the application monitoring data of at least one application program within the preset time length.
The alarm information of the specified type may include time-consuming, waiting-time and no-response alarm information.
Specifically, the invention can screen out the alarm information of the designated type from the application monitoring data.
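A minimal sketch of this screening step is given below. The record layout (dictionaries with `app` and `type` keys) and the type labels are assumptions made for illustration; the disclosure only names the three categories of time-consuming, waiting-time and no-response alarms.

```python
# Assumed labels for the specified alarm types named in the text.
SPECIFIED_TYPES = {"time_consuming", "waiting_time", "no_response"}


def select_target_alarms(monitoring_records, specified_types=SPECIFIED_TYPES):
    """Keep only the monitoring records whose alarm type is a specified type."""
    return [r for r in monitoring_records if r["type"] in specified_types]


records = [
    {"app": "app1", "type": "no_response"},
    {"app": "app1", "type": "disk_usage"},   # not a specified type, filtered out
    {"app": "app2", "type": "waiting_time"},
]

target_alarms = select_target_alarms(records)
```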
S102, counting the target alarm information of each application program to obtain a statistical result of the target alarm information;
Specifically, the invention can count the obtained target alarm information of each application program to obtain the statistical result, and then judge from the statistical result whether the alarms of some application programs are caused by a fault at the infrastructure layer.
The invention does not limit the target statistical index items to be obtained by the statistics. For example, the target statistical index items may include the total number of pieces of target alarm information, the total number of application programs in which alarm information occurs, and the number of pieces of target alarm information of each type.
It can be understood that the invention can determine the statistical mode of the target alarm information of each application program according to the target statistical index item to be obtained.
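As an illustration of the example index items mentioned above, the following sketch computes all three from a list of `(application, alarm type)` pairs. The input layout and the function name `summarize` are assumptions for this example only.

```python
from collections import Counter


def summarize(target_alarms):
    """Compute example statistical index items.

    target_alarms: list of (app, alarm_type) tuples for one preset time length.
    """
    per_app = Counter(app for app, _ in target_alarms)
    per_type = Counter(kind for _, kind in target_alarms)
    return {
        "total_alarms": len(target_alarms),  # total pieces of target alarm information
        "alarm_apps": len(per_app),          # application programs with at least one alarm
        "per_type": dict(per_type),          # count of each alarm type
    }


stats = summarize([
    ("app1", "no_response"),
    ("app1", "waiting_time"),
    ("app2", "no_response"),
])
```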
S103, determining whether the statistical result meets a preset linkage detection triggering condition, and if so, executing step S104;
It should be noted that when the statistical result meets the linkage detection triggering condition, the invention can determine that the alarms of some application programs may be caused by a fault at the infrastructure layer; when the statistical result does not meet the condition, the invention can determine that the alarms at the application layer are probably caused by faults of the application layer itself rather than by a fault at the infrastructure layer.
The linkage detection triggering condition may be formulated by a technician according to an actual working condition, working experience and the like, and the invention is not limited thereto.
Optionally, the linkage detection triggering condition may be: the total number of the target alarm information is not less than a first preset threshold value and the total number of the alarm application programs is not less than a second preset threshold value. At this time, the step S102 may include:
and counting the total number of the target alarm information and the total number of the alarm application programs.
The total number of the target alarm information may be the total number of pieces of target alarm information occurring for all application programs within the preset time length.
The total number of the alarm application programs may be the number of alarm application programs, that is, application programs in which target alarm information occurs.
The first preset threshold and the second preset threshold may be set by a technician according to an actual working condition, which is not limited in the present invention.
Optionally, the linkage detection triggering condition may also involve the number of pieces of target alarm information per unit time (that is, the total number of the target alarm information divided by the preset time length).
Optionally, the linkage detection triggering condition may also involve the number of alarm application programs per unit time (that is, the total number of the alarm application programs divided by the preset time length).
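The optional triggering condition and the per-unit-time variants can be sketched as below. The function and parameter names are illustrative assumptions; "not less than" in the text is expressed as `>=`.

```python
def linkage_triggered(total_alarms, alarm_apps, first_threshold, second_threshold):
    """True when both counts are not less than their preset thresholds."""
    return total_alarms >= first_threshold and alarm_apps >= second_threshold


def per_unit_time(count, preset_duration):
    """Count divided by the preset time length (duration assumed to be positive)."""
    return count / preset_duration


# Example: 10 alarms from 3 application programs within a 0.5 second window.
triggered = linkage_triggered(10, 3, first_threshold=10, second_threshold=3)
alarm_rate = per_unit_time(10, 0.5)
```

Whether the thresholds apply to raw counts or to per-unit-time values is left to the technician in the text; the sketch simply exposes both quantities.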
S104, determining at least one application program corresponding to the target physical device as a target application program according to the topological correspondence between the physical devices in the physical device set and the application programs, wherein the physical device set is composed of at least one physical device;
Each physical device in the physical device set may be an electronic device on which an application program can be installed and which supports the running of the application program, such as a server, a mobile phone, a desktop computer or a tablet computer.
The target physical device may be a physical device in the physical device set.
It should be noted that, as application engineering projects grow larger, componentization and modularization have become the trend, and the use of infrastructure related to application programs, such as network and storage, is becoming more complicated. To grasp the overall deployment of application programs at a macro level, managers mostly manage application programs uniformly through registration. Thus, a mesh-like topological correspondence may exist between the application programs in the application layer and the infrastructure in the infrastructure layer.
The infrastructure layer may include a logical deployment unit and each physical device in the physical device set.
The logic deployment unit is a unit which can logically support an application program to work and is deployed on a physical device. For example, the logic deployment unit may be a database of application programs, an application server, a web server, and the like.
It should be noted that, in order to improve the design efficiency of the application program and the interaction efficiency between the components inside the application program, the application program may be logically divided according to the dimensions of the layers, the subsystems, the modules, and the like, and a corresponding logic deployment architecture may be obtained after the division, where the components in the logic deployment architecture may be logic deployment units.
Specifically, in the application layer and the infrastructure layer, there is a topological correspondence between the application program, the logic deployment unit, and the physical device.
In order to better illustrate the topological correspondence between the application program, the logic deployment unit and the physical device, the present invention provides a schematic structural diagram shown in fig. 2, which includes an application layer and an infrastructure layer, for illustration.
In fig. 2, the application layer includes a first application program and a second application program, the logical deployment unit in the infrastructure layer includes a first application server, a second application server, a first web server, a second web server, and a first database, and the physical devices in the infrastructure layer include a first physical device, a second physical device, a third physical device, and a fourth physical device.
The first application server, the first web server and the first database may be a logical deployment unit of a first application program, the second application server, the second web server and the first database may be a logical deployment unit of a second application program, the first application server and the second application server may be deployed on a first physical device, the first web server may be deployed on a second physical device, the first database may be deployed on a third physical device, and the second web server may be deployed on a fourth physical device.
It is understood that there is a topological correspondence between the application, the logical deployment unit, and the physical device in fig. 2. Specifically, the first application program may correspond to the first application server, the first web server, and the first database, respectively, and the second application program may correspond to the second application server, the second web server, and the first database, respectively; the first application server and the second application server may correspond to a first physical device, the first web server may correspond to a second physical device, the first database may correspond to a third physical device, and the second web server may correspond to a fourth physical device.
The logic deployment units of the first application program are respectively deployed on the first physical device, the second physical device and the third physical device, and the logic deployment units of the second application program are respectively deployed on the first physical device, the third physical device and the fourth physical device. Therefore, the present invention may consider that the first application program corresponds to the first physical device, the second physical device, and the third physical device, respectively, and the second application program corresponds to the first physical device, the third physical device, and the fourth physical device, respectively.
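The correspondence described for FIG. 2 can be written down as two lookup tables, from which the application-to-device and device-to-application relations follow. The identifiers below (`app1`, `app_server_1`, `device_1`, and so on) are shorthand assumed for this sketch; the relations themselves are the ones stated in the text.

```python
# Application program -> logical deployment units (per the FIG. 2 description).
APP_TO_UNITS = {
    "app1": ["app_server_1", "web_server_1", "database_1"],
    "app2": ["app_server_2", "web_server_2", "database_1"],
}

# Logical deployment unit -> physical device it is deployed on.
UNIT_TO_DEVICE = {
    "app_server_1": "device_1",
    "app_server_2": "device_1",
    "web_server_1": "device_2",
    "database_1": "device_3",
    "web_server_2": "device_4",
}


def devices_of(app):
    """Physical devices that host at least one logical unit of the application."""
    return {UNIT_TO_DEVICE[unit] for unit in APP_TO_UNITS[app]}


def target_apps_of(device):
    """Application programs corresponding to the given physical device."""
    return {app for app in APP_TO_UNITS if device in devices_of(app)}
```

For example, `target_apps_of("device_1")` yields both application programs, because both application servers are deployed on the first physical device.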
Specifically, the present invention may determine each application program corresponding to the target physical device as the target application program by using the topological correspondence between the application program and the physical device. Then, the present invention may determine whether the target physical device is abnormal according to the target alarm information that occurs in the preset time duration for each target application program, that is, determine whether the target physical device is an abnormal physical device, thereby determining whether the alarm that occurs in each target application program is caused by a fault of the target physical device.
And S105, determining whether the target physical equipment is abnormal or not based on the target alarm information of each target application program.
The method and the device can determine whether the target physical equipment is abnormal physical equipment or not according to the target alarm information of each target application program in the preset time length.
Optionally, the invention may determine whether the target physical device is an abnormal physical device according to the total number of pieces of target alarm information occurring for the target application programs within the preset time length.
Optionally, if this total number is greater than a fourth preset threshold, the target physical device may be determined to be an abnormal physical device; if it is not greater than the fourth preset threshold, the target physical device may be determined to be a normal physical device.
The fourth preset threshold may be set by a technician according to an actual working condition, which is not limited in the present invention.
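A minimal sketch of this count-based check follows. It assumes the per-application alarm counts for the preset time length are available as a dictionary; that layout and the function name are illustrative assumptions.

```python
def abnormal_by_alarm_count(alarm_counts, target_apps, fourth_threshold):
    """True when the target application programs together raised more
    target alarms than the fourth preset threshold.

    alarm_counts: dict mapping application name -> number of target alarms
                  in the preset time length (missing apps count as 0).
    """
    total = sum(alarm_counts.get(app, 0) for app in target_apps)
    return total > fourth_threshold


counts = {"app1": 4, "app2": 3, "app3": 9}  # app3 is not a target application here
is_abnormal = abnormal_by_alarm_count(counts, {"app1", "app2"}, fourth_threshold=6)
```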
Optionally, the invention may determine whether the target physical device is an abnormal physical device according to the number of alarm application programs among the target application programs.
Optionally, if the alarm application programs account for a large proportion of the target application programs within the preset time length, the invention may determine the target physical device to be an abnormal physical device; if they account for a small proportion, the invention may determine the target physical device to be a normal physical device.
Optionally, if the number of alarm application programs among the target application programs within the preset time length is greater than a certain value, the invention may determine the target physical device to be an abnormal physical device; if the number is not greater than that value, the invention may determine the target physical device to be a normal physical device.
Specifically, when the target physical device is determined to be an abnormal physical device, the present invention may consider that the alarms occurring in the target application programs are caused by the abnormality of the target physical device. The actual fault object in this case may be the target physical device, which completes the detection of the actual fault object.
Specifically, when the target physical device is determined to be a normal physical device, the present invention may consider that the alarms occurring in the target application programs are caused by problems in the target application programs themselves. The actual fault object in this case may be the target application programs, which completes the detection of the actual fault object.
According to the invention, after the actual fault object is determined, a technician is reminded to check the fault reason of the actual fault object, so that the fault processing efficiency is effectively improved.
It should be further noted that, once the statistical result satisfies the linkage detection triggering condition, the present invention may take each physical device in the physical device set in turn as the target physical device and determine whether it is abnormal, thereby performing anomaly detection on every physical device in the physical device set and detecting the actual fault object that caused the application programs to raise alarms.
If all the physical devices in the physical device set are normal physical devices, the present invention may determine that the alarms occurring in the application layer are caused by problems in the application programs themselves.
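The per-device sweep described above can be sketched as follows. This is only an illustration under stated assumptions: `locate_fault`, the `topology` mapping, and the pluggable `is_abnormal` rule are hypothetical names introduced here, and any of the per-device criteria in the text could be passed in as the rule.

```python
def locate_fault(topology, alarm_counts, is_abnormal):
    """Check every physical device in turn and name the actual fault object.

    topology: hypothetical mapping of device name -> list of application names.
    alarm_counts: mapping of application name -> alarm count in the window.
    is_abnormal: any per-device rule taking {app: count} and returning a bool.
    """
    abnormal_devices = []
    for device, apps in topology.items():
        per_app = {app: alarm_counts.get(app, 0) for app in apps}
        if is_abnormal(per_app):
            abnormal_devices.append(device)
    if abnormal_devices:
        return "physical_device", abnormal_devices
    # No device was flagged, so the alarms are attributed to the applications.
    alarming = sorted(app for app, n in alarm_counts.items() if n > 0)
    return "application", alarming


topology = {"srv1": ["a", "b"], "srv2": ["c"]}
rule = lambda counts: sum(counts.values()) > 5  # example rule only
print(locate_fault(topology, {"a": 4, "b": 3, "c": 1}, rule))
```

Passing the rule as a callable keeps the sweep independent of which criterion (total count, alarm-application count, or proportion) is chosen.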
The anomaly detection method provided by this embodiment may obtain target alarm information of at least one application program within a preset time length; perform statistics on the target alarm information of each application program to obtain a statistical result; and determine whether the statistical result meets a preset linkage detection triggering condition. If so, the method determines at least one application program corresponding to a target physical device as a target application program according to the topological correspondence between the physical devices in a physical device set and the application programs, and determines whether the target physical device is abnormal based on the target alarm information of each target application program, thereby effectively detecting the actual fault object.
Based on the steps shown in fig. 1, the present embodiment proposes a second abnormality detection method, as shown in fig. 3. In this method, the step S105 may specifically include steps S201, S202, and S203, where:
S201, determining the device abnormality rate of the target physical device based on the target alarm information of each target application program;
It should be noted that, when determining whether the target physical device is abnormal according to the target alarm information of the target application programs, the present invention may determine the target physical device to be an abnormal physical device if alarm application programs account for a large proportion of the target application programs within the preset time length.
The device abnormality rate may be the proportion of alarm application programs among the target application programs.
Optionally, step S201 may specifically include:
determining a first program number, wherein the first program number is the number of target application programs in which target alarm information occurred within the preset time length;
and determining the ratio of the first program number to a second program number as the device abnormality rate of the target physical device, wherein the second program number is the total number of target application programs corresponding to the target physical device.
The first program number may be the number of alarm application programs among the target application programs within the preset time length.
The second program number may be the total number of target application programs.
Specifically, the present invention may determine the device abnormality rate of the target physical device as the value obtained by dividing the first program number by the second program number. For example, if 5 target application programs correspond to the target physical device and 3 of them are alarm application programs within the preset time length, the device abnormality rate of the target physical device may be 3/5, that is, 0.6.
S202, determining whether the device abnormality rate of the target physical device is greater than a third preset threshold, and if so, executing step S203;
The third preset threshold may be set by a technician according to actual working conditions, which is not limited in the present invention.
S203, determining the target physical device as an abnormal physical device.
Specifically, in the physical device set, the physical device whose device abnormality rate is not less than the third preset threshold may be determined as an abnormal physical device, and the physical device whose device abnormality rate is less than the third preset threshold may be determined as a normal physical device.
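Steps S201 to S203 can be sketched as follows. The function names and the data layout are assumptions made for illustration. The comparison follows the wording of step S202 (strictly greater than the third preset threshold); the text also describes a "not less than" variant, which would correspond to replacing `>` with `>=`.

```python
def device_abnormality_rate(alarm_counts):
    """First program number divided by second program number.

    alarm_counts: hypothetical mapping of target application name to the
    number of pieces of target alarm information it produced in the window.
    """
    second_program_number = len(alarm_counts)  # all target applications
    if second_program_number == 0:
        return 0.0  # assumption: a device with no applications is not flagged
    first_program_number = sum(1 for n in alarm_counts.values() if n > 0)
    return first_program_number / second_program_number


def is_abnormal_by_rate(alarm_counts, third_threshold):
    # S202/S203: abnormal when the rate exceeds the third preset threshold.
    return device_abnormality_rate(alarm_counts) > third_threshold


counts = {"a": 2, "b": 0, "c": 5, "d": 1, "e": 0}  # 3 of 5 apps alarmed
print(device_abnormality_rate(counts))
```

With 3 alarm application programs out of 5 target application programs, the rate matches the 0.6 worked example in the text.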
In the anomaly detection method provided in this embodiment, in the process of determining whether the target physical device is anomalous according to the target alert information of the target application program, if the proportion of the alert application program in each target application program is large within the preset time period, the target physical device may be determined as an anomalous physical device, and detection of an actual fault object that causes an alert to occur to the target application program is implemented.
Based on the steps shown in fig. 1, the present embodiment proposes a third abnormality detection method, as shown in fig. 4. The method may further comprise the steps of:
S301, obtaining monitoring data of the abnormal physical device on the infrastructure layer;
It should be noted that, just as an application monitoring program is set for each application program, a corresponding monitoring program also exists for each physical device. After determining an abnormal physical device in the physical device set, the present invention may obtain the monitoring data of the abnormal physical device on the infrastructure layer from the monitoring program of that device.
S302, outputting alarm information to the monitoring device of the abnormal physical device and to the monitoring device of the abnormal application program, respectively, wherein the abnormal application program is an application program corresponding to the abnormal physical device, and the alarm information carries the monitoring data.
The monitoring device may be an electronic device responsible for storing and processing the monitoring data.
Specifically, the monitoring device of the abnormal physical device may be an electronic device for storing and processing monitoring data of the abnormal physical device.
It is understood that the present invention may determine all applications corresponding to the abnormal physical devices as abnormal applications.
Optionally, the present invention may also determine only the alarm application corresponding to the abnormal physical device as the abnormal application.
The monitoring device of the abnormal application program may be an electronic device for saving and processing monitoring data of the abnormal application program.
It is understood that the monitoring device of the abnormal physical device may be the abnormal physical device itself, or may be other electronic devices. The monitoring device of the abnormal application program can also be the corresponding abnormal physical device, and can also be other electronic devices.
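The notification step can be sketched as follows. The message fields, the `("device_monitor", ...)` and `("app_monitor", ...)` recipient tuples, and the `only_alarming` switch are all hypothetical; the switch stands for the two options in the text for choosing which application programs count as abnormal.

```python
def build_alarm_messages(device, monitoring_data, apps_on_device,
                         alarming_apps, only_alarming=False):
    """Build one message per monitoring device; each carries the monitoring data."""
    if only_alarming:
        # Option 2: only the alarm applications on the device are abnormal.
        abnormal_apps = [a for a in apps_on_device if a in alarming_apps]
    else:
        # Option 1: every application on the abnormal device is abnormal.
        abnormal_apps = list(apps_on_device)
    messages = [{
        "to": ("device_monitor", device),
        "abnormal_device": device,
        "monitoring_data": monitoring_data,
    }]
    for app in abnormal_apps:
        messages.append({
            "to": ("app_monitor", app),
            "abnormal_device": device,
            "monitoring_data": monitoring_data,
        })
    return messages


msgs = build_alarm_messages("srv1", {"cpu": 0.97}, ["a", "b"], {"a"})
print(len(msgs))
```

How the messages are delivered is left open here, since the text does not specify a transport.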
Specifically, by outputting alarm information to the monitoring device of the abnormal physical device and the monitoring device of the abnormal application program, the present invention may notify technicians of both the application layer and the infrastructure layer, so that they can identify the actual fault object that caused the application program alarms as soon as possible and handle the fault in time, improving fault handling efficiency and service operation efficiency.
It is understood that steps S301 and S302 may also be applied in the method shown in fig. 3.
The anomaly detection method provided by this embodiment may notify technicians of both the application layer and the infrastructure layer by outputting alarm information to the monitoring device of the abnormal physical device and the monitoring device of the abnormal application program, so that they can identify the actual fault object that caused the application program alarms as soon as possible and handle the fault in time, thereby improving fault handling efficiency and service operation efficiency.
Corresponding to the steps shown in fig. 1, the present embodiment proposes a first abnormality detection device, as shown in fig. 5. The apparatus may include: a first obtaining unit 101, a statistical unit 102, a first determining unit 103, a second determining unit 104, and a third determining unit 105, wherein:
a first obtaining unit 101 configured to perform: obtaining target alarm information of at least one application program within a preset time length;
It should be noted that the application layer may include one or more application programs. Specifically, the present invention may obtain the target alarm information that each application program in the application layer generated within the preset time length.
The target alarm information may be any type of alarm information, or may be a specific type of alarm information.
It should be noted that, if target alarm information from a plurality of application programs appears in a concentrated manner at the application layer within a short time, the present invention may determine that the alarms in some application programs may be caused by a fault in the infrastructure layer. Therefore, the present invention may periodically collect the target alarm information of the application layer over a short time window, and then use the target alarm information collected in each window to judge whether the alarms in some application programs are likely caused by a fault in the infrastructure layer.
Specifically, the preset time period may be the short time period. The specific duration of the preset duration can be set by a technician according to the actual working condition, which is not limited by the invention.
Specifically, the present invention may configure a corresponding application monitoring program for each application program, and monitor the application program using the application monitoring program to obtain application monitoring data generated by the application monitoring program in the process of monitoring the application program.
Then, the invention can respectively obtain the target alarm information of each application program appearing in the preset time length from the application monitoring data of each application program.
Specifically, the present invention may periodically and respectively obtain the target alarm information of each application program appearing in the preset time duration from the application monitoring data of each application program, and then determine whether the alarm appearing in some application programs is caused by the infrastructure layer by using the target alarm information of each application program appearing in the preset time duration obtained each time.
Optionally, the target alarm information may include alarm information of a specified type. In this case, the first obtaining unit 101 is configured to perform: obtaining the alarm information of the specified type from the application monitoring data of at least one application program within the preset time length.
The alarm information of the specified type may include alarm information relating to time consumption, waiting time, and lack of response.
Specifically, the invention can screen out the alarm information of the designated type from the application monitoring data.
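Screening the specified alarm types out of the application monitoring data might look like the sketch below. The record layout (`app`, `type`, `time` keys) and the three type labels are assumptions chosen for illustration; the text does not define a data format.

```python
from datetime import datetime, timedelta

# Hypothetical labels for the specified alarm types named in the text.
SPECIFIED_TYPES = {"time_consuming", "waiting", "no_response"}


def select_target_alarms(records, window_end, window,
                         alarm_types=SPECIFIED_TYPES):
    """Keep records of a specified type that fall inside the preset window."""
    window_start = window_end - window
    return [
        r for r in records
        if r["type"] in alarm_types and window_start < r["time"] <= window_end
    ]


now = datetime(2021, 5, 13, 12, 0, 0)
records = [
    {"app": "a", "type": "no_response", "time": now - timedelta(seconds=30)},
    {"app": "b", "type": "disk_full", "time": now - timedelta(seconds=10)},
    {"app": "c", "type": "waiting", "time": now - timedelta(minutes=10)},
]
print(select_target_alarms(records, now, timedelta(minutes=1)))
```

Only the first record survives: the second has an unspecified type and the third falls outside the one-minute window.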
A statistics unit 102 configured to perform: counting the target alarm information of each application program to obtain a counting result of each target alarm information;
Specifically, the present invention may perform statistics on the obtained target alarm information of each application program to obtain a statistical result, and then judge from the statistical result whether the alarms in some application programs are caused by a fault in the infrastructure layer.
The present invention does not limit the target statistical index items to be obtained by the statistics. For example, the target statistical index items may include the total number of pieces of target alarm information, the total number of application programs in which alarm information occurred, and the number of pieces of target alarm information of each type.
It can be understood that the invention can determine the statistical mode of the target alarm information of each application program according to the target statistical index item to be obtained.
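The example statistical index items listed above can be computed in one pass, as in the following sketch. The record layout and the result keys are assumptions introduced here for illustration.

```python
from collections import Counter


def summarize_alarms(alarms):
    """Compute example index items from a list of alarm records.

    alarms: hypothetical list of dicts with "app" and "type" keys.
    """
    per_app = Counter(a["app"] for a in alarms)
    per_type = Counter(a["type"] for a in alarms)
    return {
        "total_alarms": len(alarms),       # total pieces of alarm information
        "total_alarm_apps": len(per_app),  # applications with at least one alarm
        "per_app": dict(per_app),
        "per_type": dict(per_type),        # count of each alarm type
    }


alarms = [
    {"app": "a", "type": "waiting"},
    {"app": "a", "type": "no_response"},
    {"app": "b", "type": "waiting"},
]
print(summarize_alarms(alarms))
```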
A first determination unit 103 configured to perform: determining whether the statistical result meets a preset linkage detection triggering condition, and if so, triggering a second determining unit 104;
It should be noted that, when the statistical result meets the linkage detection triggering condition, the present invention may determine that the alarms in some application programs may be caused by a fault in the infrastructure layer; when the statistical result does not meet the linkage detection triggering condition, the present invention may determine that the alarms in the application layer are probably caused by faults in the application layer itself rather than by a fault in the infrastructure layer.
The linkage detection triggering condition may be formulated by a technician according to an actual working condition, working experience and the like, and the invention is not limited thereto.
Optionally, the linkage detection triggering condition may be: the total number of pieces of target alarm information is not less than a first preset threshold, and the total number of alarm application programs is not less than a second preset threshold. In this case, the statistics unit 102 is configured to perform: counting the total number of pieces of target alarm information and the total number of alarm application programs.
The total number of pieces of target alarm information may be the total number of pieces of target alarm information generated by all application programs within the preset time length.
The total number of alarm application programs may be the total number of application programs in which target alarm information occurred.
The first preset threshold and the second preset threshold may be set by a technician according to an actual working condition, which is not limited in the present invention.
Optionally, the linkage detection triggering condition may also involve the number of pieces of target alarm information per unit time (that is, the total number of pieces of target alarm information divided by the preset time length).
Optionally, the linkage detection triggering condition may also involve the number of alarm application programs per unit time (that is, the total number of alarm application programs divided by the preset time length).
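The two-threshold triggering condition and the per-unit-time quantities can be sketched as follows. The function names are assumptions, and expressing the preset time length in seconds is a choice made here for illustration.

```python
def linkage_triggered(total_alarms, total_alarm_apps,
                      first_threshold, second_threshold):
    """Both totals must be not less than their preset thresholds."""
    return (total_alarms >= first_threshold
            and total_alarm_apps >= second_threshold)


def per_unit_time(count, window_seconds):
    """A total divided by the preset time length (here in seconds)."""
    return count / window_seconds


print(linkage_triggered(12, 4, first_threshold=10, second_threshold=3))
print(per_unit_time(30, window_seconds=60))
```

Requiring both thresholds means a single noisy application program cannot trigger linkage detection on its own.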
A second determining unit 104 configured to perform: determining at least one application program corresponding to a target physical device as a target application program according to a topological corresponding relation between the physical devices and the application programs in a physical device set, wherein the physical device set is composed of at least one physical device;
Each physical device in the physical device set may be an electronic device on which application programs can be installed and which supports their operation, such as a server, a mobile phone, a desktop computer, or a tablet computer.
The target physical device may be a physical device in the physical device set.
It should be noted that, as the application engineering project becomes larger, the development of componentization and modularization has become a current trend, and the use of the infrastructure, such as network and storage, related to the application program is also becoming more complicated. In order to macroscopically master the overall deployment condition of the application program, managers mostly adopt a registration mode to uniformly manage the application program. Thus, there may be a mesh-like topological correspondence between applications in the application layer and the infrastructure in the infrastructure layer.
The infrastructure layer may include a logical deployment unit and each physical device in the physical device set.
The logic deployment unit is a unit which can logically support an application program to work and is deployed on a physical device. For example, the logic deployment unit may be a database of application programs, an application server, a web server, and the like.
It should be noted that, in order to improve the design efficiency of the application program and the interaction efficiency between the components inside the application program, the application program may be logically divided according to the dimensions of the layers, the subsystems, the modules, and the like, and a corresponding logic deployment architecture may be obtained after the division, where the components in the logic deployment architecture may be logic deployment units.
Specifically, in the application layer and the infrastructure layer, there is a topological correspondence between the application program, the logic deployment unit, and the physical device.
Specifically, the present invention may determine each application program corresponding to the target physical device as the target application program by using the topological correspondence between the application program and the physical device. Then, the present invention may determine whether the target physical device is abnormal according to the target alarm information that occurs in the preset time duration for each target application program, that is, determine whether the target physical device is an abnormal physical device, thereby determining whether the alarm that occurs in each target application program is caused by a fault of the target physical device.
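Resolving the target application programs through the topological correspondence can be sketched as follows. The two mappings (application program to logic deployment units, and logic deployment unit to physical device) and all names in the example are hypothetical; the text only states that such a correspondence exists.

```python
def target_apps_for_device(device, app_to_units, unit_to_device):
    """Return the applications whose deployment units run on the device.

    app_to_units: hypothetical mapping of application -> logic deployment units.
    unit_to_device: hypothetical mapping of logic deployment unit -> device.
    """
    return sorted(
        app for app, units in app_to_units.items()
        if any(unit_to_device.get(u) == device for u in units)
    )


app_to_units = {
    "payments": ["pay_db", "pay_web"],
    "reports": ["rep_db"],
    "login": ["login_web"],
}
unit_to_device = {
    "pay_db": "srv1", "pay_web": "srv2",
    "rep_db": "srv1", "login_web": "srv3",
}
print(target_apps_for_device("srv1", app_to_units, unit_to_device))
```

Because one application program may span several physical devices, the same application program can be a target application program of more than one device, which reflects the mesh-like correspondence described in the text.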
A third determining unit 105 configured to perform: and determining whether the target physical equipment is abnormal or not based on the target alarm information of each target application program.
The present invention may determine whether the target physical device is an abnormal physical device according to the target alarm information that each target application program generated within the preset time length.
Optionally, the present invention may determine whether the target physical device is an abnormal physical device according to the total number of pieces of target alarm information generated by the target application programs within the preset time length.
Optionally, if the total number of pieces of target alarm information generated by the target application programs within the preset time length is greater than a fourth preset threshold, the target physical device is determined to be an abnormal physical device.
Conversely, if that total number is not greater than the fourth preset threshold, the target physical device may be determined to be a normal physical device.
The fourth preset threshold may be set by a technician according to an actual working condition, which is not limited in the present invention.
Optionally, the present invention may determine whether the target physical device is an abnormal physical device according to the number of alarm application programs (that is, target application programs in which target alarm information occurred) among the target application programs.
Optionally, if alarm application programs account for a large proportion of the target application programs within the preset time length, the present invention may determine the target physical device to be an abnormal physical device; if alarm application programs account for a small proportion of the target application programs within the preset time length, the present invention may determine the target physical device to be a normal physical device.
Optionally, within the preset time length, if the number of alarm application programs among the target application programs is greater than a certain value, the present invention may determine the target physical device to be an abnormal physical device; if the number of alarm application programs among the target application programs is not greater than that value, the present invention may determine the target physical device to be a normal physical device.
Specifically, when the target physical device is determined to be an abnormal physical device, the present invention may consider that the alarms occurring in the target application programs are caused by the abnormality of the target physical device. The actual fault object in this case may be the target physical device, which completes the detection of the actual fault object.
Specifically, when the target physical device is determined to be a normal physical device, the present invention may consider that the alarms occurring in the target application programs are caused by problems in the target application programs themselves. The actual fault object in this case may be the target application programs, which completes the detection of the actual fault object.
According to the invention, after the actual fault object is determined, a technician is reminded to check the fault reason of the actual fault object, so that the fault processing efficiency is effectively improved.
It should be further noted that, once the statistical result satisfies the linkage detection triggering condition, the present invention may take each physical device in the physical device set in turn as the target physical device and determine whether it is abnormal, thereby performing anomaly detection on every physical device in the physical device set and detecting the actual fault object that caused the application programs to raise alarms.
If all the physical devices in the physical device set are normal physical devices, the present invention may determine that the alarms occurring in the application layer are caused by problems in the application programs themselves.
The anomaly detection device provided by this embodiment may obtain target alarm information of at least one application program within a preset time length; perform statistics on the target alarm information of each application program to obtain a statistical result; and determine whether the statistical result meets a preset linkage detection triggering condition. If so, the device determines at least one application program corresponding to a target physical device as a target application program according to the topological correspondence between the physical devices in a physical device set and the application programs, and determines whether the target physical device is abnormal based on the target alarm information of each target application program, thereby effectively detecting the actual fault object.
Based on fig. 5, the present embodiment proposes a second abnormality detection device as shown in fig. 6. In the apparatus, the third determining unit 105 may include: a fourth determining unit 201, a fifth determining unit 202, and a sixth determining unit 203, wherein:
a fourth determining unit 201 configured to perform: determining the device abnormality rate of the target physical device based on the target alarm information of each target application program;
It should be noted that, when determining whether the target physical device is abnormal according to the target alarm information of the target application programs, the present invention may determine the target physical device to be an abnormal physical device if alarm application programs account for a large proportion of the target application programs within the preset time length.
The device abnormality rate may be the proportion of alarm application programs among the target application programs.
Optionally, the fourth determining unit 201 may include: a seventh determining unit and an eighth determining unit, wherein:
a seventh determining unit configured to perform: determining a first program number, wherein the first program number is the number of target application programs in which target alarm information occurred within the preset time length;
an eighth determining unit configured to perform: determining the ratio of the first program number to a second program number as the device abnormality rate of the target physical device, wherein the second program number is the total number of target application programs corresponding to the target physical device.
The first program number may be the number of alarm application programs among the target application programs within the preset time length.
The second program number may be the total number of target application programs.
Specifically, the present invention may determine the device abnormality rate of the target physical device as the value obtained by dividing the first program number by the second program number.
A fifth determining unit 202 configured to perform: determining whether the device abnormality rate of the target physical device is greater than a third preset threshold, and if so, triggering a sixth determining unit 203;
The third preset threshold may be set by a technician according to actual working conditions, which is not limited in the present invention.
A sixth determining unit 203 configured to perform: and determining that the target physical device is an abnormal physical device.
Specifically, in the physical device set, the physical device whose device abnormality rate is not less than the third preset threshold may be determined as an abnormal physical device, and the physical device whose device abnormality rate is less than the third preset threshold may be determined as a normal physical device.
In the process of determining whether the target physical device is abnormal according to the target alarm information of the target application program, if the proportion of the alarm application program in each target application program is large within the preset time length, the abnormality detection apparatus provided in this embodiment may determine the target physical device as an abnormal physical device, and implement detection of an actual fault object causing an alarm of the target application program.
Based on fig. 5, the present embodiment proposes a third abnormality detection device, which may further include: a second obtaining unit and an output unit, wherein:
a second obtaining unit configured to perform: obtaining monitoring data for the abnormal physical device on an infrastructure layer;
an output unit configured to perform: outputting alarm information to the monitoring device of the abnormal physical device and to the monitoring device of the abnormal application program, respectively, wherein the abnormal application program is an application program corresponding to the abnormal physical device, and the alarm information carries the monitoring data.
It should be noted that, just as an application monitoring program is set for each application program, a corresponding monitoring program also exists for each physical device. After determining an abnormal physical device in the physical device set, the present invention may obtain the monitoring data of the abnormal physical device on the infrastructure layer from the monitoring program of that device.
The monitoring device may be an electronic device responsible for storing and processing the monitoring data.
Specifically, the monitoring device of the abnormal physical device may be an electronic device for storing and processing monitoring data of the abnormal physical device.
It is understood that the present invention may determine all applications corresponding to the abnormal physical devices as abnormal applications.
Optionally, the present invention may also determine only the alarm application corresponding to the abnormal physical device as the abnormal application.
The monitoring device of the abnormal application program may be an electronic device for saving and processing monitoring data of the abnormal application program.
It is understood that the monitoring device of the abnormal physical device may be the abnormal physical device itself, or may be other electronic devices. The monitoring device of the abnormal application program can also be the corresponding abnormal physical device, and can also be other electronic devices.
Specifically, by outputting alarm information to the monitoring device of the abnormal physical device and the monitoring device of the abnormal application program, the present invention may notify technicians of both the application layer and the infrastructure layer, so that they can identify the actual fault object that caused the application program alarms as soon as possible and handle the fault in time, improving fault handling efficiency and service operation efficiency.
It is to be understood that the second obtaining unit and the output unit may also be applied to the second abnormality detection device described above.
The anomaly detection device provided by this embodiment may notify technicians of both the application layer and the infrastructure layer by outputting alarm information to the monitoring device of the abnormal physical device and the monitoring device of the abnormal application program, so that they can identify the actual fault object that caused the application program alarms as soon as possible and handle the fault in time, thereby improving fault handling efficiency and service operation efficiency.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. An abnormality detection method characterized by comprising:
obtaining target alarm information of at least one application program within a preset time length;
counting the target alarm information of each application program to obtain a statistical result of the target alarm information;
determining whether the statistical result meets a preset linkage detection trigger condition, and if so, determining at least one application program corresponding to a target physical device as a target application program according to a topological correspondence between the physical devices in a physical device set and the application programs, wherein the physical device set is composed of at least one physical device;
and determining whether the target physical equipment is abnormal or not based on the target alarm information of each target application program.
2. The method of claim 1, wherein the target alarm information comprises a specified type of alarm information; the obtaining of the target alarm information of at least one application program within the preset time length includes:
obtaining the alarm information of the specified type from the application monitoring data of the at least one application program within the preset time length.
3. The method of claim 1, wherein the linkage detection trigger condition is: the total number of pieces of target alarm information is not less than a first preset threshold and the total number of alarming application programs is not less than a second preset threshold;
the counting of the target alarm information of each application program includes:
counting the total number of pieces of target alarm information and the total number of alarming application programs.
4. The method of claim 1, wherein determining whether the target physical device is abnormal based on the target alarm information of each of the target application programs comprises:
determining the equipment abnormality rate of the target physical equipment based on the target alarm information of each target application program;
and determining whether the equipment abnormality rate of the target physical equipment is greater than a third preset threshold, and if so, determining that the target physical equipment is abnormal physical equipment.
5. The method of claim 4, wherein determining the device anomaly rate for the target physical device based on the target alarm information for each of the target applications comprises:
determining a first program quantity, wherein the first program quantity is the number of target application programs for which the target alarm information appeared within the preset time length;
determining a ratio of the first program quantity to a second program quantity as the device abnormality rate of the target physical device, wherein the second program quantity is the total number of target application programs corresponding to the target physical device.
6. The method of claim 1, wherein determining whether the target physical device is abnormal based on the target alarm information of each of the target application programs comprises:
if the total number of pieces of target alarm information that appeared for the target application programs within the preset time length is greater than a fourth preset threshold, determining that the target physical device is an abnormal physical device.
7. The method of any of claims 1 to 6, further comprising:
obtaining monitoring data for the abnormal physical device on an infrastructure layer;
and respectively outputting alarm information to the monitoring equipment of the abnormal physical equipment and the monitoring equipment of an abnormal application program, wherein the abnormal application program is the application program corresponding to the abnormal physical equipment, and the alarm information carries the monitoring data.
8. An abnormality detection device characterized by comprising: a first obtaining unit, a statistical unit, a first determining unit, a second determining unit and a third determining unit, wherein:
the first obtaining unit is configured to perform: obtaining target alarm information of at least one application program within a preset time length;
the statistical unit is configured to perform: counting the target alarm information of each application program to obtain a statistical result of the target alarm information;
the first determining unit is configured to perform: determining whether the statistical result meets a preset linkage detection trigger condition, and if so, triggering the second determining unit;
the second determining unit is configured to perform: determining at least one application program corresponding to a target physical device as a target application program according to a topological correspondence between the physical devices in a physical device set and the application programs, wherein the physical device set is composed of at least one physical device;
the third determining unit is configured to perform: determining whether the target physical device is abnormal based on the target alarm information of each target application program.
9. The device of claim 8, wherein the target alarm information comprises a specified type of alarm information;
the first obtaining unit is configured to perform: obtaining the alarm information of the specified type from the application monitoring data of the at least one application program within the preset time length.
10. The device of claim 8, wherein the linkage detection trigger condition is: the total number of pieces of target alarm information is not less than a first preset threshold and the total number of alarming application programs is not less than a second preset threshold;
the statistical unit is configured to perform: counting the total number of pieces of target alarm information and the total number of alarming application programs.
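By way of illustration only, the flow recited in claims 1 and 3 to 5 can be sketched in Python as follows. The function names, the representation of alarm information as `(application, alarm_type)` pairs already restricted to the preset time length, and the `topology` dictionary are assumptions made for this example; the sketch is one possible reading of the claimed steps, not the implementation of the application.

```python
def linkage_triggered(alarms, min_alarms, min_apps):
    """Linkage detection trigger condition as read from claim 3.

    alarms: list of (application, alarm_type) pairs of the specified type,
            already restricted to the preset time length (assumed layout).
    """
    alarming_apps = {app for app, _ in alarms}
    # Both totals must be "not less than" their respective preset thresholds.
    return len(alarms) >= min_alarms and len(alarming_apps) >= min_apps


def device_anomaly_rate(device, topology, alarms):
    """Ratio of alarming target applications to all target applications (claim 5)."""
    target_apps = topology[device]
    if not target_apps:
        return 0.0  # assumption: a device hosting no applications has rate 0
    alarming_apps = {app for app, _ in alarms}
    first_quantity = sum(1 for app in target_apps if app in alarming_apps)
    return first_quantity / len(target_apps)


def detect_abnormal_devices(alarms, topology, min_alarms, min_apps, rate_threshold):
    """Return the devices whose abnormality rate exceeds the third threshold (claim 4)."""
    if not linkage_triggered(alarms, min_alarms, min_apps):
        return []
    return [
        device
        for device in topology
        if device_anomaly_rate(device, topology, alarms) > rate_threshold
    ]
```

For example, with `topology = {"host-1": ["a", "b", "c", "d"], "host-2": ["e", "f"]}`, alarms raised by `a`, `b`, `c` and `e`, and a rate threshold of 0.5, only `host-1` (rate 0.75) is reported, since the rate of `host-2` (0.5) is not greater than the threshold. The alternative criterion of claim 6, which compares the total number of pieces of alarm information with a fourth threshold, is not covered by this sketch.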
CN202110523606.3A 2021-05-13 2021-05-13 Anomaly detection method and device Pending CN113127311A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110523606.3A CN113127311A (en) 2021-05-13 2021-05-13 Anomaly detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110523606.3A CN113127311A (en) 2021-05-13 2021-05-13 Anomaly detection method and device

Publications (1)

Publication Number Publication Date
CN113127311A true CN113127311A (en) 2021-07-16

Family

ID=76781760

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110523606.3A Pending CN113127311A (en) 2021-05-13 2021-05-13 Anomaly detection method and device

Country Status (1)

Country Link
CN (1) CN113127311A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102783087A (en) * 2012-05-23 2012-11-14 华为技术有限公司 Associative alarm method and device based on management layers
CN102937930A (en) * 2012-09-29 2013-02-20 重庆新媒农信科技有限公司 Application program monitoring system and method
CN103559124A (en) * 2013-10-24 2014-02-05 华为技术有限公司 Fast fault detection method and device
KR101580772B1 (en) * 2014-06-24 2015-12-28 주식회사 케이티 Method for monitoring application apparatus therefor
CN106407077A (en) * 2016-09-21 2017-02-15 广州华多网络科技有限公司 A real-time alarm method and system
CN108900353A (en) * 2018-07-18 2018-11-27 平安科技(深圳)有限公司 Fault alarming method and terminal device
CN109783322A (en) * 2018-11-22 2019-05-21 远光软件股份有限公司 A kind of monitoring analysis system and its method of enterprise information system operating status
CN112015618A (en) * 2020-08-17 2020-12-01 杭州指令集智能科技有限公司 Abnormity warning method and device
CN112084055A (en) * 2020-08-19 2020-12-15 广州小鹏汽车科技有限公司 Fault positioning method and device of application system, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郝鹏海;徐成龙;刘一田: "基于Kafka和Kubernetes的云平台监控告警系统", 计算机系统应用, no. 08, pages 121 - 126 *

Similar Documents

Publication Publication Date Title
CN109726072B (en) WebLogic server monitoring and alarming method, device and system and computer storage medium
JP5267736B2 (en) Fault detection apparatus, fault detection method, and program recording medium
CN108919935A (en) Monitoring method, device and equipment for power supply on server mainboard
CN103392176B (en) For predicting the apparatus and method that network event spreads unchecked
EP1291772A2 (en) Failure prediction apparatus and method
CN101222361A (en) Alarm frequency monitor and alarm processing method
EP2085850B1 (en) Alarm management apparatus
CN102937930A (en) Application program monitoring system and method
CN103746831A (en) Alarm analysis method, device and system
EP2360590A2 (en) Apparatus and method for analysing a computer infrastructure
CN112702184A (en) Fault early warning method and device and computer-readable storage medium
CN112380089A (en) Data center monitoring and early warning method and system
CN101989931A (en) Operation alarm processing method and device
CN105549508A (en) Alarm method based on information combination and apparatus thereof
CN115794588A (en) Memory fault prediction method, device and system and monitoring server
US20190362262A1 (en) Information processing device, non-transitory storage medium and information processing method
CN111339466A (en) Interface management method and device, electronic equipment and readable storage medium
CN113127311A (en) Anomaly detection method and device
CN115102838B (en) Emergency processing method and device for server downtime risk and electronic equipment
CN114915541B (en) System fault elimination method and device, electronic equipment and storage medium
CN115118614A (en) Operation abnormality detection method, operation abnormality detection device, electronic device, and storage medium
CN113300918A (en) Fault detection method of intelligent lamp pole, terminal device and storage medium
CN114091702A (en) Event monitoring method and device, electronic equipment and storage medium
JP5586322B2 (en) Plant monitoring system and plant monitoring method
CN114422332B (en) Network slice control method, device, processing equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination