CN115361266A - Alarm root cause positioning method, device, equipment and storage medium - Google Patents

Alarm root cause positioning method, device, equipment and storage medium

Info

Publication number
CN115361266A
Authority
CN
China
Prior art keywords
alarm
time
network element
frequency
association
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110478093.9A
Other languages
Chinese (zh)
Other versions
CN115361266B (en)
Inventor
郭斌洁
严昱超
谢丹
付家乐
李海传
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Zhejiang Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Zhejiang Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Zhejiang Co Ltd
Priority to CN202110478093.9A
Publication of CN115361266A
Application granted
Publication of CN115361266B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/064 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis involving time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12 Discovery or management of network topologies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an alarm root cause positioning method, device, equipment and storage medium, relating to the technical field of equipment alarms. The method comprises: acquiring real-time alarm data; constructing a real-time topological relation according to a preset equipment topological relation; classifying the real-time alarm data with an event division model trained on high-frequency time and/or association time to obtain a plurality of alarm events; for each alarm event, obtaining an alarm propagation graph according to an alarm association rule; and obtaining a root node from the alarm propagation graph to locate the alarm root cause. The invention addresses the difficulty of locating alarm root causes in the prior art, locates the root cause along both the time and space dimensions, and improves the accuracy of fault location.

Description

Alarm root cause positioning method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of equipment alarm, in particular to an alarm root cause positioning method, device, equipment and storage medium.
Background
With the rapid development and deployment of NFV (Network Functions Virtualization) technology, traditional physical communication equipment is gradually being migrated to the cloud. In a network cloud system based on NFV, when network element equipment fails and raises alarms, traditional operation and maintenance methods based on empirical rules struggle to locate the alarm root cause within an alarm storm. The prior art therefore has the technical problem that the alarm root cause is difficult to locate.
Disclosure of Invention
The main purpose of the invention is to provide an alarm root cause positioning method, device, equipment and storage medium, aiming to solve the technical problem that the alarm root cause is difficult to locate in the prior art.
To achieve this purpose, the invention adopts the following technical solution:
In a first aspect, the present invention provides an alarm root cause positioning method, including the following steps:
acquiring real-time alarm data of a network system comprising network element equipment, wherein the real-time alarm data comprises alarm equipment information and alarm occurrence time;
constructing a real-time topological relation of the network element equipment according to a preset equipment topological relation and the alarm equipment information;
classifying the real-time alarm data according to the real-time topological relation and the alarm occurrence time, using a trained event division model, to obtain a plurality of alarm events, wherein the event division model is trained on high-frequency time and/or association time; the high-frequency time is the average interval between alarms, on topology-related network element equipment, whose alarm durations overlap and whose alarm titles are identical; the association time is the average interval between alarm pairs, on topology-related network element equipment, whose alarm durations overlap but whose alarm titles differ;
for each alarm event, obtaining an alarm propagation graph according to an alarm association rule, wherein the alarm association rule is obtained from the association relations among the alarms and their confidence;
and obtaining a root node from the alarm propagation graph so as to locate the alarm root cause.
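The confidence used by the alarm association rule is not defined in this passage. The following is a minimal sketch, assuming the standard association-rule definition (the confidence of a rule a -> b is the number of records containing both a and b divided by the number containing a), computed over historical alarm events represented as sets of alarm titles. The function name and the sample titles are hypothetical and are not taken from the patent.

```python
from collections import Counter
from itertools import permutations


def rule_confidence(events):
    """Return {(a, b): confidence of the rule a -> b}.

    events: iterable of sets of alarm titles (assumed representation).
    """
    single = Counter()   # how many events contain each title
    pair = Counter()     # how many events contain each ordered pair of titles
    for titles in events:
        titles = set(titles)
        single.update(titles)
        pair.update(permutations(sorted(titles), 2))
    return {(a, b): count / single[a] for (a, b), count in pair.items()}


# Hypothetical historical events.
events = [
    {"link down", "port down"},
    {"link down", "port down", "service degraded"},
    {"link down"},
    {"port down"},
]
conf = rule_confidence(events)
```

In this sample, "service degraded" never occurs without "link down", so the rule from the former to the latter has confidence 1.0, while the reverse rule has confidence 1/3. Such an asymmetry is one possible way to weight or orient edges between alarms.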
Optionally, in the above alarm root cause positioning method, before the step of classifying the real-time alarm data with the trained event division model to obtain a plurality of alarm events, the method further includes:
acquiring historical alarm data of the network system and the historical topological relation of the corresponding network element equipment;
performing time sequence mining on the historical alarm data based on the historical topological relation to obtain time aggregation features, wherein the time aggregation features comprise high-frequency time and/or association time;
and training a Bayesian analysis model on the time aggregation features to obtain the event division model.
Optionally, in the above alarm root cause positioning method, the step of performing time sequence mining on the historical alarm data based on the historical topological relation to obtain a time aggregation feature specifically includes:
obtaining a plurality of network element devices related to the topology based on the historical topological relation;
based on the network element equipment related to the topology, acquiring the total alarm duration corresponding to each network element equipment from the historical alarm data;
and obtaining the high-frequency time and/or the association time according to the total alarm duration corresponding to each network element device.
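The first two steps above can be sketched as follows. This is an illustration only: it assumes that "topology-related" means "connected in the historical topology graph" and that each historical alarm is a tuple (device, title, start, end). Both the data layout and the function names are hypothetical.

```python
from collections import defaultdict


def related_groups(edges):
    """Connected components of an undirected topology given as (a, b) edges."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, groups = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:                      # iterative depth-first search
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        groups.append(comp)
    return groups


def durations_by_device(alarms, group):
    """Map each device in the group to its sorted list of (start, end, title)."""
    out = {dev: [] for dev in group}
    for dev, title, start, end in alarms:
        if dev in out:
            out[dev].append((start, end, title))
    for spans in out.values():
        spans.sort()
    return out


# Hypothetical topology and alarms.
edges = [("a", "b"), ("b", "c"), ("x", "y")]
alarms = [
    ("a", "link down", 0, 10),
    ("x", "cpu high", 3, 4),
    ("b", "port down", 2, 8),
    ("a", "link down", 5, 12),
]
groups = related_groups(edges)
spans = durations_by_device(alarms, groups[0])
```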
Optionally, in the above method for positioning an alarm root cause, the step of obtaining the high-frequency time according to the total alarm duration corresponding to each network element device specifically includes:
acquiring high-frequency alarms and the total times thereof according to the total alarm duration corresponding to each network element device, wherein the high-frequency alarms comprise alarms with overlapped alarm durations and consistent alarm titles;
aiming at each network element device, obtaining a high-frequency time period and high-frequency times of the high-frequency alarm according to the alarm duration of each high-frequency alarm, wherein the high-frequency time period comprises a time period with overlapped alarm duration;
acquiring the interval duration of the high-frequency time period according to the starting time of each high-frequency alarm in the high-frequency time period;
performing weighted average processing according to the interval durations of all the high-frequency time periods to obtain the average interval duration of all the high-frequency time periods in the network element equipment;
obtaining high-frequency time according to the high-frequency times and the average interval duration of each network element device, wherein the adopted calculation formula is as follows:
[Formula image BDA0003046979750000031 is not reproduced in the extracted text]
wherein T represents the high-frequency time, N represents the number of network element devices, N_n represents the total number of high-frequency alarms, i_n represents the high-frequency count of a high-frequency time period, with i_n ≤ N_n, and Δt_n represents the average interval duration of all high-frequency time periods.
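The per-device part of this computation can be sketched as follows. The sketch assumes that alarms of one title on one device are given as (start, end) pairs, that a high-frequency time period is a maximal run of alarms whose durations overlap, and that the "weighted average" is the mean of all start-to-start gaps (equivalent to weighting each period by its number of gaps). Because the formula image for T is not available here, the sketch stops at the per-device values and does not attempt the final combination. All names are hypothetical.

```python
def overlap_periods(spans):
    """Group (start, end) spans into maximal runs of overlapping spans.

    Runs containing a single alarm are dropped, since they show no overlap.
    """
    periods, current, current_end = [], [], None
    for start, end in sorted(spans):
        if current and start <= current_end:
            current.append((start, end))
            current_end = max(current_end, end)
        else:
            if len(current) > 1:
                periods.append(current)
            current, current_end = [(start, end)], end
    if len(current) > 1:
        periods.append(current)
    return periods


def device_average_interval(spans):
    """Return (count, dt): alarms inside overlap periods and their mean start gap."""
    periods = overlap_periods(spans)
    gaps = [b[0] - a[0] for p in periods for a, b in zip(p, p[1:])]
    count = sum(len(p) for p in periods)
    return count, (sum(gaps) / len(gaps) if gaps else 0.0)


# Hypothetical same-title alarms on one device, times in seconds.
result = device_average_interval([(0, 10), (4, 12), (9, 15), (30, 35)])
```

Here the first three alarms form one overlap period with start gaps of 4 s and 5 s, and the isolated alarm at 30 s is ignored.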
Optionally, in the above method for positioning an alarm root cause, the step of obtaining the associated time according to the total alarm duration corresponding to each network element device specifically includes:
obtaining associated alarms according to the total alarm duration corresponding to each network element device, wherein the associated alarms comprise alarm pairs with overlapped alarm durations and inconsistent alarm titles, and the alarm pairs comprise primary alarms and secondary alarms;
for each network element device, obtaining the association time period and the association times of the association alarm according to the alarm duration of each association alarm;
acquiring the initial time difference of the associated time period according to the starting time of the primary alarm and the starting time of the secondary alarm in the associated time period;
performing weighted average processing according to the starting time differences of all the associated time periods to obtain average interval time differences of all the associated time periods in the network element equipment;
obtaining the association time of the association alarm according to the association times of each network element device and the average interval time difference, wherein the calculation formula is as follows:
[Formula image BDA0003046979750000032 is not reproduced in the extracted text]
wherein S represents the association time, n represents the number of network element devices, and ΔS_n represents the average interval time difference of all association time periods,
[Formula image BDA0003046979750000033 is not reproduced in the extracted text]
wherein k_n represents the association count of an association time period, and t_v represents the start time difference of an association time period.
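A minimal sketch of the per-device association statistics, under stated assumptions: alarms are (title, start, end) tuples, the primary alarm of a pair is taken to be the one that starts first, and the average is an unweighted mean of the start time differences. The formula images are not available here, so the patent's exact weighting may differ, and the sketch does not combine the per-device values into S. Names and data are hypothetical.

```python
from itertools import combinations


def association_gaps(alarms):
    """Start time differences for alarm pairs with overlapping durations
    and different titles. alarms: list of (title, start, end)."""
    ordered = sorted(alarms, key=lambda a: a[1])
    gaps = []
    for first, second in combinations(ordered, 2):
        overlaps = second[1] <= first[2]   # second starts before first ends
        if overlaps and first[0] != second[0]:
            gaps.append(second[1] - first[1])
    return gaps


def average_association_gap(alarms):
    """Return (pair count, mean start time difference) for one device."""
    gaps = association_gaps(alarms)
    return len(gaps), (sum(gaps) / len(gaps) if gaps else 0.0)


# Hypothetical alarms on one device, times in seconds.
alarms = [
    ("link down", 0, 10),
    ("port down", 3, 8),
    ("link down", 5, 12),
    ("cpu high", 20, 25),
]
stats = average_association_gap(alarms)
```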
Optionally, in the above method for positioning an alarm root cause, the step of obtaining an alarm propagation graph according to an alarm association rule for each alarm event specifically includes:
constructing an alarm undirected graph aiming at each alarm event;
setting a weight value for an edge in the alarm undirected graph according to the alarm association rule to obtain an alarm link topological graph;
and reconstructing the alarm link topological graph by using a maximum spanning tree algorithm to obtain an alarm propagation graph.
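The maximum spanning tree step can be illustrated with Kruskal's algorithm applied to edges in descending weight order. This is a generic sketch, not the patent's implementation: the node labels and weights are hypothetical, and the weights merely stand in for values derived from the alarm association rule.

```python
def maximum_spanning_tree(nodes, weighted_edges):
    """Kruskal's algorithm, taking edges from heaviest to lightest.

    weighted_edges: iterable of (weight, a, b) for an undirected graph.
    Returns the kept edges as (a, b, weight).
    """
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    tree = []
    for weight, a, b in sorted(weighted_edges, reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:                # edge joins two components
            parent[root_a] = root_b
            tree.append((a, b, weight))
    return tree


# Hypothetical alarm link topological graph.
nodes = ["A", "B", "C", "D"]
edges = [
    (0.9, "A", "B"),
    (0.4, "A", "C"),
    (0.8, "B", "C"),
    (0.3, "C", "D"),
    (0.2, "B", "D"),
]
tree = maximum_spanning_tree(nodes, edges)
```

The weaker edges A-C and B-D are discarded because they would close a cycle, leaving the strongest association between each pair of connected alarms.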
In a second aspect, the present invention provides an event partitioning model training method, including the following steps:
acquiring historical alarm data of a network system comprising network element equipment and a historical topological relation of the corresponding network element equipment;
based on the historical topological relation, performing time sequence mining on the historical alarm data to obtain time aggregation characteristics, wherein the time aggregation characteristics comprise high-frequency time and/or associated time, and the high-frequency time is the average interval duration between alarms which are overlapped in alarm duration time and consistent in alarm titles in the network element equipment related to the topology; the association time is the average interval duration between alarm pairs with overlapped alarm durations but inconsistent alarm titles in the network element equipment related to the topology;
and training the Bayesian analysis model according to the time aggregation characteristics to obtain an event division model so as to merge the alarms meeting the high-frequency time and/or the alarms meeting the associated time.
In a third aspect, the present invention provides an alarm root cause positioning device, including:
a data acquisition module, configured to acquire real-time alarm data of a network system comprising network element equipment, wherein the real-time alarm data comprises alarm equipment information and alarm occurrence time;
a topological relation acquisition module, configured to construct a real-time topological relation of the network element equipment according to a preset equipment topological relation and the alarm equipment information;
an event division module, configured to classify the real-time alarm data according to the real-time topological relation and the alarm occurrence time, using a trained event division model, to obtain a plurality of alarm events, wherein the event division model is trained on high-frequency time and/or association time; the high-frequency time is the average interval between alarms, on topology-related network element equipment, whose alarm durations overlap and whose alarm titles are identical; the association time is the average interval between alarm pairs, on topology-related network element equipment, whose alarm durations overlap but whose alarm titles differ;
an event analysis module, configured to obtain, for each alarm event, an alarm propagation graph according to an alarm association rule, wherein the alarm association rule is obtained from the association relations among the alarms and their confidence;
and a root cause positioning module, configured to obtain a root node from the alarm propagation graph so as to locate the alarm root cause.
In a fourth aspect, the present invention provides an alarm root cause positioning device, which includes a processor and a memory, where the memory stores a computer program, and the computer program, when executed by the processor, implements the alarm root cause positioning method as described above.
In a fifth aspect, the present invention provides a storage medium having stored thereon a computer program executable by one or more processors to implement the alarm root cause localization method as described above.
One or more technical solutions provided by the present invention may have the following advantages or at least achieve the following technical effects:
According to the alarm root cause positioning method, device, equipment and storage medium provided by the invention, real-time alarm data and a real-time topological relation are obtained, and the real-time alarm data is classified with an event division model trained on high-frequency time and/or association time to obtain a plurality of alarm events; alarms can thus be aggregated effectively and compressed during an alarm storm. For each alarm event, an alarm propagation graph is obtained according to the alarm association rule, and a root node is obtained from it to locate the alarm root cause. The alarm root cause is thereby located along both the time and space dimensions, which improves the speed and accuracy of fault location.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating a first embodiment of a method for alarm root cause location according to the present invention;
FIG. 2 is a schematic diagram of a hardware structure of an alarm root cause location device according to the present invention;
FIG. 3 is a flowchart illustrating a second embodiment of a method for alarm root cause location according to the present invention;
FIG. 4 is a detailed flowchart of step S40 in FIG. 3;
FIG. 5 is a schematic diagram of the distribution of the total alarm durations of network element devices a and b in one implementation of step S42.3 of the second embodiment of the alarm root cause positioning method of the present invention;
FIG. 6 is a schematic diagram of the distribution of the total alarm durations of the first network element device in another implementation of step S42.3 of the second embodiment of the alarm root cause positioning method of the present invention;
FIG. 7 is a schematic diagram illustrating an acquisition process of an alarm propagation map in step S70 according to a second embodiment of the alarm root cause location method of the present invention;
FIG. 8 is a functional block diagram of an alarm root cause positioning device according to a first embodiment of the present invention;
FIG. 9 is a diagram illustrating the connection of functional modules of the alarm root cause positioning device according to the first embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive efforts based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
It should be noted that, in the present invention, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or system that comprises the element. In addition, "and/or" as used throughout covers three parallel cases; for example, "A and/or B" means A alone, B alone, or both A and B. Suffixes such as "module", "sub-module", or "unit" used to denote elements serve only to facilitate the description of the present invention and have no specific meaning in themselves.
The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific situation. In addition, the technical solutions of the respective embodiments may be combined with each other, provided that the combination can be implemented by those skilled in the art; when a combination of technical solutions is contradictory or cannot be implemented, it should be considered not to exist and falls outside the protection scope of the present invention.
Analysis of the prior art shows that, with the rapid development and deployment of NFV technology, traditional physical communication equipment is gradually being cloudified, which decouples software from hardware and removes the dependence on dedicated hardware. A network cloud system based on NFV offers resource sharing, automated deployment and elastic scaling, but it also brings operation and maintenance problems: many fault points, alarm storms, difficulty in tracing fault root causes, and limited manual expert experience. Because the network cloud is a new technology, there is no mature operation and maintenance solution yet; operation and maintenance personnel in each specialty still have to perform alarm convergence analysis and fault tracing, which demands substantial manpower and creates heavy maintenance pressure. Rule sets derived from fault trees or expert experience likewise depend on sufficient accumulated experience with network cloud technology and consume a great deal of expert effort.
In particular, as 5G networks are cloudified, when network element equipment in an NFV-based network cloud system fails and raises alarms, traditional operation and maintenance methods based on empirical rules cannot quickly compress the alarms, find the core alarm root cause, and pinpoint the faulty network element. Such methods also struggle to discover new fault propagation paths or to quickly feed back or propose architecture management optimizations. Moreover, because they are built on the experience of system experts, their rules are hard to update, customize, or optimize by reusing live-network operating data. Traditional rule-based operation and maintenance methods therefore have difficulty locating the alarm root cause in an alarm storm.
In view of the technical problem that the alarm root cause is difficult to locate in the prior art, the invention provides an alarm root cause locating method, which has the following general idea:
acquiring real-time alarm data of a network system comprising network element equipment, wherein the real-time alarm data comprises alarm equipment information and alarm occurrence time; constructing a real-time topological relation of the network element equipment according to a preset equipment topological relation and the alarm equipment information; classifying the real-time alarm data by utilizing an event division model obtained by training according to the real-time topological relation and the alarm occurrence time to obtain a plurality of alarm events, wherein the event division model is obtained by training based on high-frequency time and/or associated time, and the high-frequency time is the average interval duration between alarms which are overlapped in alarm duration time and have consistent alarm titles in network element equipment related to topology; the association time is the average interval duration between alarm pairs with overlapped alarm durations but inconsistent alarm titles in the network element equipment related to the topology; aiming at each alarm event, obtaining an alarm propagation graph according to an alarm association rule, wherein the alarm association rule is obtained based on the association relation and the confidence degree among all alarms; and acquiring a root node according to the alarm propagation graph so as to position an alarm root cause.
With the above technical solution, real-time alarm data and a real-time topological relation are obtained, and the real-time alarm data is classified with an event division model trained on high-frequency time and/or association time to obtain a plurality of alarm events; alarms can thus be aggregated effectively and compressed during an alarm storm. For each alarm event, an alarm propagation graph is obtained according to the alarm association rule, and a root node is obtained from it to locate the alarm root cause. The alarm root cause is thereby located along both the time and space dimensions, which improves the speed and accuracy of fault location.
Embodiment One
Referring to the flowchart of fig. 1, a first embodiment of the method for locating an alarm root cause according to the present invention is provided.
The alarm root cause positioning device is a terminal device or network device capable of network connection; it may be a terminal device such as a mobile phone, a computer, a tablet computer or an embedded industrial computer, or a network device such as a server.
Fig. 2 is a schematic diagram of a hardware structure of an alarm root cause location device. The apparatus may include: a processor 1001, such as a CPU (Central Processing Unit), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
Those skilled in the art will appreciate that the hardware configuration shown in FIG. 2 does not constitute a limitation of the alarm root cause location apparatus of the present invention and may include more or fewer components than shown, or some components in combination, or a different arrangement of components.
Specifically, the communication bus 1002 is used for realizing connection communication among these components;
the user interface 1003 is used for connecting to a client and performing data communication with the client; the user interface 1003 may include an output unit, such as a display screen, and an input unit, such as a keyboard, and optionally may further include other input/output interfaces, such as a standard wired interface and a wireless interface;
the network interface 1004 is used for connecting to a backend server and performing data communication with the backend server; the network interface 1004 may include an input/output interface, such as a standard wired interface or a wireless interface (for example, a Wi-Fi interface);
the memory 1005 is used for storing various types of data, which may include, for example, the instructions of any application program or method in the device and application-related data; the memory 1005 may be a high-speed RAM or a non-volatile memory such as a disk memory, and optionally may be a storage device independent of the processor 1001;
specifically, with continued reference to fig. 2, the memory 1005 may include an operating system, a network communication module, a user interface module, and a computer program, wherein the network communication module is mainly used for connecting to a server and performing data communication with the server;
the processor 1001 is configured to call the computer program stored in the memory 1005 and perform the following operations:
acquiring real-time alarm data of a network system comprising network element equipment, wherein the real-time alarm data comprises alarm equipment information and alarm occurrence time;
constructing a real-time topological relation of the network element equipment according to a preset equipment topological relation and the alarm equipment information;
classifying the real-time alarm data by utilizing an event division model obtained by training according to the real-time topological relation and the alarm occurrence time to obtain a plurality of alarm events, wherein the event division model is obtained by training based on high-frequency time and/or associated time, and the high-frequency time is the average interval duration between alarms which are overlapped in alarm duration time and have consistent alarm titles in network element equipment related to topology; the association time is the average interval duration between alarm pairs with overlapped alarm durations but inconsistent alarm titles in the network element equipment related to the topology;
for each alarm event, obtaining an alarm propagation graph according to an alarm association rule, wherein the alarm association rule is obtained from the association relations among the alarms and their confidence;
and obtaining a root node from the alarm propagation graph so as to locate the alarm root cause.
Based on the above alarm root cause positioning device, the following describes the alarm root cause positioning method of this embodiment in detail with reference to the flowchart shown in fig. 1. The method may comprise the steps of:
Step S10: acquiring real-time alarm data of a network system comprising network element equipment, wherein the real-time alarm data comprises alarm equipment information and alarm occurrence time.
Specifically, the network element equipment may be physical equipment or a virtual node of virtualized equipment; correspondingly, the network system may be composed of physical network element equipment or of virtual network element equipment. When a fault occurs while the network system is running, the network element equipment that actually failed raises an alarm, and other equipment associated with it raises alarms as well, producing an alarm storm. Among so many alarms, operation and maintenance personnel can hardly identify the alarm root cause and therefore cannot find the faulty equipment to repair it. For this reason, when the network system raises alarms, the alarm data is analyzed to locate the alarm root cause. For example, the alarm data may be sent to an alarm root cause positioning device that can operate independently; this device may exist independently of the network system or be part of it, depending on the actual situation. The network system sends the alarm data to the alarm root cause positioning device in real time, and the device performs root cause positioning analysis once it has acquired the real-time alarm data. The alarm data may include attribute information such as the alarm signal, the information of the equipment on which the alarm occurred (that is, the alarm equipment information), the alarm title, the alarm occurrence time, and the alarm end time.
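As an illustration of the attributes listed above, an alarm record could be modelled as follows. This is a sketch under assumptions: the field names, the use of seconds for times, and the reduction of the alarm equipment information to a device identifier are all hypothetical and not specified by the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    device: str    # alarm equipment information (assumed: a device identifier)
    title: str     # alarm title
    start: float   # alarm occurrence time, in seconds (assumed unit)
    end: float     # alarm end time, in seconds (assumed unit)

    def overlaps(self, other: "Alarm") -> bool:
        """True if the two alarm durations share at least one instant."""
        return self.start <= other.end and other.start <= self.end


# Hypothetical records.
a = Alarm("vnf-1", "link down", 0, 10)
b = Alarm("vnf-2", "port down", 5, 12)
c = Alarm("vnf-2", "port down", 11, 12)
```

The overlap test is the basic predicate behind both the high-frequency time and the association time, which only consider alarms whose durations overlap.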
Step S30: Construct the real-time topological relation of the network element devices according to the preset device topological relation and the alarm device information.
Specifically, the alarm device information in the real-time alarm data is analyzed with reference to the preset topological relations of the different kinds of devices, that is, the pre-constructed topological graph, to obtain the real-time topological relation, that is, a real-time topological graph of the network element devices that generate the real-time alarm data.
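As an illustration of step S30, the following minimal Python sketch derives a real-time topology by restricting a preset topology to the devices that are currently alarming. The adjacency-mapping representation, the device names, and the function name `build_realtime_topology` are assumptions made for this example and are not taken from the patent.

```python
# Hypothetical preset device topology: device -> set of directly connected devices.
PRESET_TOPOLOGY = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c"},
    "e": set(),
}


def build_realtime_topology(preset, alarming_devices):
    """Return the sub-topology induced by the devices that currently raise alarms."""
    active = set(alarming_devices) & set(preset)
    # Keep only the links whose two endpoints are both alarming.
    return {device: preset[device] & active for device in active}


# Alarm device information extracted from the real-time alarm data (made-up values).
realtime = build_realtime_topology(PRESET_TOPOLOGY, ["a", "b", "d"])
print(realtime)
```

In this sketch device d is alarming but has no alarming neighbour, so it appears as an isolated vertex of the real-time graph.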
Step S50: Classify the real-time alarm data according to the real-time topological relation and the alarm occurrence time by using a trained event division model to obtain a plurality of alarm events, wherein the event division model is trained based on high-frequency time and/or association time; the high-frequency time is the average interval duration between alarms that have overlapping alarm durations and the same alarm title in topologically associated network element devices; the association time is the average interval duration between alarm pairs that have overlapping alarm durations but different alarm titles in topologically associated network element devices.
Specifically, in the real-time topological graph, the trained event division model analyzes the real-time alarm data based on the alarm occurrence times of the alarms on topologically associated network element devices. For any two adjacent alarms with the same alarm title and overlapping alarm durations, it is judged whether the interval between their occurrence times satisfies the high-frequency time, and alarms that satisfy the high-frequency time are merged into one alarm event. Alternatively, for any two adjacent alarms with different alarm titles and overlapping alarm durations, it is judged whether the interval between their occurrence times satisfies the association time, and alarms that satisfy the association time are merged into one alarm event. Alarms that satisfy both the high-frequency time and the association time may also be merged into one alarm event. In this way, the real-time alarm data of the topologically associated network element devices is classified into a plurality of alarm events.
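The merging rule described above can be sketched in Python as follows. This is a simplified threshold version, not the trained event division model itself: it assumes that "satisfying" the high-frequency time or the association time means that the gap between the two occurrence times does not exceed that value, it compares every pair of alarms rather than only adjacent ones, and it treats two devices as topologically associated when they are the same device or direct neighbours. The record fields and the function name `partition_events` are hypothetical.

```python
from itertools import combinations


def partition_events(alarms, adjacency, hf_time, assoc_time):
    """Group alarm indices into events using a union-find structure."""
    parent = list(range(len(alarms)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(alarms)), 2):
        a, b = alarms[i], alarms[j]
        related = (a["device"] == b["device"]
                   or b["device"] in adjacency.get(a["device"], set()))
        overlap = a["start"] < b["end"] and b["start"] < a["end"]
        if not (related and overlap):
            continue
        # Same title: compare with the high-frequency time; otherwise the association time.
        limit = hf_time if a["title"] == b["title"] else assoc_time
        if abs(a["start"] - b["start"]) <= limit:
            parent[find(i)] = find(j)

    events = {}
    for index in range(len(alarms)):
        events.setdefault(find(index), []).append(index)
    return sorted(events.values())


# Made-up alarms; times are in arbitrary units.
alarms = [
    {"device": "a", "title": "A", "start": 0, "end": 10},
    {"device": "b", "title": "A", "start": 3, "end": 12},
    {"device": "a", "title": "B", "start": 5, "end": 20},
    {"device": "d", "title": "C", "start": 4, "end": 9},
    {"device": "a", "title": "A", "start": 100, "end": 110},
]
adjacency = {"a": {"b"}, "b": {"a"}, "d": set()}
events = partition_events(alarms, adjacency, hf_time=5, assoc_time=6)
print(events)
```

With these values the first three alarms form one event, while the alarm on the unrelated device d and the much later alarm on device a each remain separate events.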
Step S70: For each alarm event, obtain an alarm propagation graph according to an alarm association rule, wherein the alarm association rule is obtained based on the association relation and the confidence between alarms.
Specifically, for each alarm event, all alarms in the event are converted into an alarm undirected graph comprising a plurality of vertices and the edges connecting them, where each vertex corresponds to an alarm signal and each edge corresponds to the topological association between the two alarm signals it connects. At this stage the edges have no direction, and the alarm undirected graph may contain a ring. Next, a direction and a weight are assigned to each edge by using the pre-established alarm association rule: the direction of the edge connecting any two vertices follows from the primary-secondary relation between the two associated alarms in the rule, and its weight follows from the confidence between them. This yields a directed alarm link topological graph, which may also contain a ring. The alarm link topological graph is then reconstructed into an alarm propagation graph by using a maximum spanning tree algorithm; that is, the edges are arranged by weight from largest to smallest. For example, if the weight from alarm A to alarm B is the largest, the weight from alarm B to alarm C is the second largest, and so on, the link relation alarm A → alarm B → alarm C is formed. This is the final alarm propagation graph, which no longer contains a ring.
Step S90: Acquire a root node according to the alarm propagation graph to locate the alarm root cause.
Specifically, after the alarm propagation graph is obtained, the alarm at its root node is obtained; this alarm is the root cause alarm. Its alarm data is extracted from the real-time alarm data to obtain its attribute information, which completes root cause alarm positioning. Operation and maintenance personnel can then use this information to learn the root cause of a given alarm event in the current alarm storm, handle the fault in time, and prevent larger faults from developing.
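A minimal Python sketch of step S90 follows. It assumes that the alarm propagation graph is given as a list of directed (cause, effect) edges without rings, and it takes the root node to be the vertex that no edge points to. The function name `find_root_alarms` and the edge-list representation are assumptions made for illustration.

```python
def find_root_alarms(edges):
    """Return the alarms that are never the target of a propagation edge."""
    nodes, targets = set(), set()
    for cause, effect in edges:
        nodes.update((cause, effect))
        targets.add(effect)
    return sorted(nodes - targets)


# Propagation graph alarm A -> alarm B -> alarm C, as in the example of step S70.
root = find_root_alarms([("A", "B"), ("B", "C")])
print(root)
```

The attribute information of the returned alarm can then be looked up in the real-time alarm data.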
According to the alarm root cause positioning method of this embodiment, real-time alarm data is acquired, the real-time topological relation is obtained, and the real-time alarm data is classified by the event division model trained based on high-frequency time and/or association time to obtain a plurality of alarm events; alarm events are thus effectively aggregated and alarm compression is achieved during an alarm storm. For each alarm event, an alarm propagation graph is obtained according to the alarm association rule, and its root node is taken to locate the alarm root cause. Root cause positioning is thereby performed in both the time and space dimensions, which makes fault positioning faster and more accurate.
Example two
Based on the same inventive concept, referring to fig. 3 to 4, a second embodiment of the alarm root cause positioning method of the present invention is proposed; the method is applied to alarm root cause positioning equipment.
The alarm root cause positioning method of this embodiment is described in detail below with reference to the flowchart shown in fig. 3. The method may comprise the following steps:
Step S10: Acquire real-time alarm data of a network system comprising network element devices, wherein the real-time alarm data includes alarm device information and alarm occurrence time.
Specifically, the subsequent steps may be performed directly on the acquired real-time alarm data, or after the real-time alarm data has been preprocessed. For example, data cleaning and quality inspection are performed on the acquired real-time alarm data to remove erroneous values, check data quality, and merge duplicate data. This helps ensure the authenticity and accuracy of the real-time alarm data in subsequent processing.
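The preprocessing described above might look like the following Python sketch. The record layout (`device`, `title`, `start`, `end`) and the specific cleaning rules (drop records with missing key fields, drop records that end before they start, merge exact duplicates) are assumptions chosen for illustration; the patent does not specify them.

```python
def clean_alarms(records):
    """Remove invalid alarm records and merge exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for rec in records:
        start, end = rec.get("start"), rec.get("end")
        # Erroneous values: a missing device, title, or occurrence time.
        if rec.get("device") is None or rec.get("title") is None or start is None:
            continue
        # Erroneous values: an alarm that ends before it starts.
        if end is not None and end < start:
            continue
        key = (rec["device"], rec["title"], start, end)
        if key in seen:  # repeated data is kept only once
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


raw = [
    {"device": "a", "title": "A", "start": 0, "end": 10},
    {"device": "a", "title": "A", "start": 0, "end": 10},    # exact duplicate
    {"device": "b", "title": "B", "start": 8, "end": 3},     # ends before it starts
    {"device": None, "title": "C", "start": 5, "end": 9},    # missing device
    {"device": "b", "title": "A", "start": 4, "end": None},  # alarm still active
]
cleaned = clean_alarms(raw)
print(len(cleaned))
```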
In a specific embodiment, a network element device may be a physical device or a virtual node of a virtualized device; correspondingly, the network system may be formed by physical network element devices or by virtual network element devices. When a fault occurs while the network system is running, the network element device that actually failed generates an alarm, and other devices associated with it also generate alarms, which produces an alarm storm. Among so many alarms, it is difficult for operation and maintenance personnel to identify the alarm root cause, and therefore to find the faulty device for maintenance. For this reason, when the network system raises alarms, the alarm data is analyzed to locate the alarm root cause. For example, the alarm data may be sent to an alarm root cause positioning device that can operate independently; this device may exist separately from the network system or be included in it, as set according to the actual situation. The network system sends the alarm data to the alarm root cause positioning device in real time, and the device performs root cause positioning analysis after acquiring the real-time alarm data.
In this embodiment, a network system that uses 5G technology and virtualization technology is taken as an example. The system includes multiple network element devices. When an alarm occurs on an unknown network element device, the other network element devices associated with it also raise alarms because they cannot receive the corresponding signals or instructions. Each network element device generates a large number of alarms, so operation and maintenance personnel cannot tell on which network element device the fault actually occurred, cannot locate the specific alarm root cause, and therefore cannot take measures in time. The acquired real-time alarm data is therefore input into the alarm root cause positioning device of this embodiment; after acquiring the real-time alarm data of the network system, the device performs root cause alarm positioning according to the method of this embodiment to assist operation and maintenance personnel in determining the fault root cause. The real-time alarm data may include attribute information generated in real time, such as the alarm signal, alarm device information, alarm title, alarm occurrence time, and alarm end time.
After the above steps, the following step S20 may be executed in sequence, so that the real-time topological relation of the network element devices can then be constructed according to the preset device topological relation and the alarm device information; step S20 may also be executed as an independent branch to establish the device topological relation.
Step S20: Construct the device topological relation.
Specifically, step S20 may include:
Step S21: Acquire historical alarm data of the network system;
Step S22: Acquire, according to the historical alarm data, the connection relations of network element devices of different types or different architecture layers in the network system;
Step S23: Construct the device topological relation according to the connection relations to obtain the preset device topological relation.
In a specific embodiment, the number of network element devices of each type, or in each layer of the device architecture, is not fixed, but the device types and the architecture layers themselves do not change. For a new network system, a basic topological graph therefore needs to be determined first; that is, it is constructed according to the different types or architecture layers of the network element devices in the network system, which establishes the preset device topological relation. As a result, in practical applications the device topological relation does not need to be rebuilt each time network element devices of an existing type or architecture layer are added, which saves the operating space of the equipment and leaves more space for the alarm root cause positioning analysis of historical alarm data. Specifically, the automatic collection of network element devices may be implemented on a technology stack such as Kafka (a distributed publish-subscribe messaging system) or Hadoop (a distributed system infrastructure) to construct the device topological relation, so that the real-time topological relation can subsequently be analyzed and the alarming network element devices synchronized.
Step S30: Construct the real-time topological relation of the network element devices according to the preset device topological relation and the alarm device information.
In a specific embodiment, the alarm device information in the real-time alarm data is analyzed according to the device topological relation preset in step S20, that is, the pre-constructed topological graph, to obtain the real-time topological relation, that is, the real-time topological graph of the network element devices that actually generate the real-time alarm data. The updating and collection of the real-time topological graph may be implemented based on a graph database (Graph DB).
After the above steps, the following step S40 may be executed in sequence, so that the trained event division model can then be used to classify the real-time alarm data into a plurality of alarm events; step S40 may also be executed as an independent branch to train the event division model.
Step S40: Train an event division model based on high-frequency time and/or association time, wherein the high-frequency time is the average interval duration between alarms that have overlapping alarm durations and the same alarm title in topologically associated network element devices; the association time is the average interval duration between alarm pairs that have overlapping alarm durations but different alarm titles in topologically associated network element devices.
Specifically, the event division model may be trained based on the high-frequency time alone, on the association time alone, or on both. Training on both gives the best results, and the resulting model divides real-time alarm data into alarm events more accurately.
In this embodiment, historical alarm data is used as the data set, and a Bayesian analysis model is trained based on the high-frequency time and the association time to obtain the event division model.
Further, step S40 may include:
Step S41: Acquire historical alarm data of the network system and the historical topological relation of the corresponding network element devices.
In a specific embodiment, the historical alarm data may be the alarm data of the network element devices in the device topological relation of step S20, or may be that data merged with the real-time alarm data of the previous time period once the alarm root cause positioning analysis of that period has finished; the event division model is thus updated by using a deep learning technique, which improves the accuracy of event division. The acquired historical alarm data may be cleaned or merged to ensure the accuracy of the event division model trained on it. At the same time, the historical topological relation, that is, the topological graph, of the network element devices corresponding to the historical alarm data is acquired.
Step S42: Perform time-series mining on the historical alarm data based on the historical topological relation to obtain time aggregation features, wherein the time aggregation features include high-frequency time and/or association time.
In a specific embodiment, the high-frequency time is the average interval duration between alarms that have overlapping alarm durations and the same alarm title in topologically associated network element devices; the association time is the average interval duration between alarm pairs that have overlapping alarm durations but different alarm titles in topologically associated network element devices. The time aggregation features of this embodiment include both the high-frequency time and the association time.
Further, step S42 may include:
Step S42.1: Obtain a plurality of topologically associated network element devices based on the historical topological relation.
Step S42.2: Based on the topologically associated network element devices, acquire the total alarm duration corresponding to each network element device from the historical alarm data;
Step S42.3: Obtain the high-frequency time and/or association time according to the total alarm duration corresponding to each network element device.
In a specific implementation, in the historical topological relation, that is, the topological graph corresponding to the historical alarm data, the alarm data of the topologically associated network element devices is extracted from the historical alarm data and subjected to time-series mining to obtain the total alarm duration of each network element device. The high-frequency time and/or association time is then calculated from the total alarm duration of each network element device. In this embodiment, the high-frequency time is obtained first, and then the association time.
In an embodiment, in step S42.3, the step of obtaining the high-frequency time according to the total alarm duration corresponding to each network element device may include:
Step a1: Obtain the high-frequency alarms and their total number of occurrences according to the total alarm duration corresponding to each network element device, wherein the high-frequency alarms include alarms with overlapping alarm durations and the same alarm title;
Step a2: For each network element device, obtain the high-frequency time periods and the high-frequency count of the high-frequency alarm according to the alarm duration of each high-frequency alarm, wherein a high-frequency time period is a time period in which alarm durations overlap;
Step a3: Obtain the interval duration of each high-frequency time period according to the start time of each high-frequency alarm in that period;
Step a4: Perform weighted averaging over the interval durations of all high-frequency time periods to obtain the average interval duration of all high-frequency time periods on the network element device;
Step a5: Obtain the high-frequency time according to the high-frequency count and the average interval duration of each network element device, using the following calculation formula:
[formula image BDA0003046979750000151]
wherein $T$ denotes the high-frequency time, $n$ denotes the number of network element devices, $N_n$ denotes the total number of high-frequency alarms, $i_n$ denotes the high-frequency count of the high-frequency time periods, with $i_n \le N_n$, and $\Delta t_n$ denotes the average interval duration of all high-frequency time periods.
In this embodiment, the total alarm durations of network element device a and network element device b are taken as an example; fig. 5 is a schematic diagram of the distribution of their total alarm durations. Assume that alarm A, with the same alarm title, occurs on both network element devices a and b. For device a, the start time st and end time ed of each occurrence of alarm A are known, and the figure shows multiple occurrences of alarm A on device a. In this example the alarm durations of alarm A overlap, so alarm A is determined to be a high-frequency alarm, and its total number of occurrences is $N_n = 5$. The figure also shows that the high-frequency time periods of alarm A are st1 to ed3 and st4 to ed5, and that every occurrence falls within a high-frequency time period, so the high-frequency count is $i_n = 5$. The interval duration of the first high-frequency time period is st3 - st1, and that of the second is st5 - st4. After averaging, the average interval duration of all high-frequency time periods of device a is obtained:
[formula image BDA0003046979750000152]
The high-frequency time of device a is thereby obtained:
[formula image BDA0003046979750000153]
This calculation gives the high-frequency time of the high-frequency alarm on the first network element device, device a. For a plurality of topologically associated devices, however, the high-frequency time of a single device is not the final result, so the high-frequency time of device b is obtained in the same way:
[formula image BDA0003046979750000154]
Correspondingly, in this example the number of network element devices is 2, that is, n = 2, and the final high-frequency time T is obtained according to the above calculation formula:
[formula image BDA0003046979750000155]
In the actual training process, the specific time points of the alarm durations of alarm A on device a and device b, taken from the historical alarm data, are substituted in to obtain the final high-frequency time.
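Steps a2 to a4 for a single device can be sketched in Python as follows. The sketch groups occurrences of one alarm title into high-frequency time periods (runs of overlapping occurrences), takes the interval duration of each period as the last start time minus the first start time (matching st3 - st1 and st5 - st4 in the example), and then averages. A plain arithmetic mean is used because the weights of the weighted average are not given in the text; that choice, the numeric times, and the function names are assumptions. The final combination into T is not computed here.

```python
def high_frequency_periods(alarms):
    """Group (start, end) occurrences into runs whose durations overlap."""
    periods = []
    current, current_end = [], None
    for start, end in sorted(alarms):
        if current and start < current_end:
            current.append((start, end))
            current_end = max(current_end, end)
        else:
            if len(current) > 1:  # a single occurrence is not a high-frequency period
                periods.append(current)
            current, current_end = [(start, end)], end
    if len(current) > 1:
        periods.append(current)
    return periods


def average_interval(periods):
    """Mean of (last start - first start) over all periods (unweighted, by assumption)."""
    durations = [period[-1][0] - period[0][0] for period in periods]
    return sum(durations) / len(durations) if durations else 0.0


# Five made-up occurrences of alarm A on device a: (st1, ed1) ... (st5, ed5).
alarm_a = [(0, 10), (4, 12), (8, 15), (30, 40), (36, 45)]
periods = high_frequency_periods(alarm_a)
hf_count = sum(len(period) for period in periods)  # corresponds to i_n
avg = average_interval(periods)                    # corresponds to the average interval duration
print(len(periods), hf_count, avg)
```

Here the two periods span st1 to ed3 and st4 to ed5, with interval durations 8 and 6.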
In another embodiment, in step S42.3, the step of obtaining the association time according to the total alarm duration corresponding to each network element device may include:
Step b1: Obtain the associated alarms according to the total alarm duration corresponding to each network element device, wherein the associated alarms include alarm pairs with overlapping alarm durations and different alarm titles, and each alarm pair includes a primary alarm and a secondary alarm;
Step b2: For each network element device, obtain the association time periods and the association count of the associated alarm according to the alarm duration of each associated alarm;
Step b3: Obtain the start time difference of each association time period according to the start time of the primary alarm and the start time of the secondary alarm in that period;
Step b4: Perform weighted averaging over the start time differences of all association time periods to obtain the average interval time difference of all association time periods on the network element device;
Step b5: Obtain the association time of the associated alarm according to the association count and the average interval time difference of each network element device, using the following calculation formula:
[formula image BDA0003046979750000161]
wherein $S$ denotes the association time, $n$ denotes the number of network element devices, and $\Delta s_n$ denotes the average interval time difference over all association time periods,
[formula image BDA0003046979750000162]
where $k_n$ denotes the association count of the association time periods, $t_v$ denotes the start time difference of an association time period, and $v \le k_n$.
In this embodiment, the total alarm duration of network element device a is taken as an example; fig. 6 is a schematic diagram of its distribution. Assume that two alarms with different alarm titles, alarm A and alarm B, occur on network element device a. Their alarm durations overlap, so the alarm pair formed by alarm A and alarm B is determined to be an associated alarm, in which alarm A is the primary alarm and alarm B is the secondary alarm. The figure also shows that the association time periods of alarm A and alarm B are
[formula image BDA0003046979750000163]
The association count of the association time periods is $k_n = 2$. For the first association time period, the start time difference is
[formula image BDA0003046979750000164]
and for the second association time period it is
[formula image BDA0003046979750000165]
After averaging, the average interval time difference of all association time periods on device a is obtained:
[formula image BDA0003046979750000171]
This calculation gives the association time of the associated alarm on network element device a. For a plurality of topologically associated devices, however, the association time of a single device is not the final result; the association times of the other network element devices are calculated in the same way to obtain the final association time.
Correspondingly, in this example the number of network element devices is 1, that is, n = 1, and the final association time S is obtained according to the above calculation formula:
[formula image BDA0003046979750000172]
In the actual training process, the specific time points of the alarm durations of alarm A and alarm B, taken from the historical alarm data, are substituted in to obtain the final association time.
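Steps b1 to b4 for a single device can be sketched in Python as follows. Each occurrence of the primary alarm is paired with every occurrence of the secondary alarm whose duration overlaps it, the start time difference of each pair is recorded, and the differences are averaged. The plain arithmetic mean (instead of the unspecified weighted average), the numeric times, and the function name `start_time_differences` are assumptions made for this example.

```python
def start_time_differences(primary, secondary):
    """Start time differences of overlapping (primary, secondary) occurrences."""
    diffs = []
    for p_start, p_end in primary:
        for s_start, s_end in secondary:
            overlaps = p_start < s_end and s_start < p_end
            if overlaps:
                diffs.append(abs(s_start - p_start))
    return diffs


# Made-up occurrences on device a: alarm A is the primary alarm, alarm B the secondary alarm.
alarm_a = [(0, 10), (50, 60)]
alarm_b = [(3, 12), (54, 70)]

diffs = start_time_differences(alarm_a, alarm_b)
association_count = len(diffs)                                # corresponds to k_n
avg_diff = sum(diffs) / association_count if diffs else 0.0   # average interval time difference
print(association_count, avg_diff)
```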
Step S43: Train the Bayesian analysis model according to the time aggregation features to obtain the event division model.
In a specific embodiment, a Bayesian analysis model may be established on a machine learning mining platform based on the Bayesian analysis method. The Bayesian analysis model is trained according to the time aggregation features to obtain the event division model, which merges alarms that satisfy the high-frequency time and/or alarms that satisfy the association time. The event division model of this embodiment merges the alarms in the real-time alarm data that satisfy both the high-frequency time and the association time to obtain a plurality of alarm events, thereby classifying the real-time alarm data.
As the historical alarm data is continuously updated, the event division model is updated based on the classification result each time it has been used to classify alarm events, which effectively improves the accuracy of the model.
Step S50: Classify the real-time alarm data by using the event division model according to the real-time topological relation and the alarm occurrence time to obtain a plurality of alarm events.
In a specific implementation, the event division model trained on historical alarm data is used to classify the real-time alarm data: alarms on topologically associated devices that satisfy the high-frequency time and/or the association time are merged into one alarm event, and the real-time alarm data is accordingly divided into a plurality of alarm events. In this embodiment, among topologically associated devices, for example where devices a and b are associated and devices c, d and e are associated, alarms that satisfy both the high-frequency time and the association time are merged into one alarm event. That is, alarms A, B and C on devices a and b that satisfy both are merged into one alarm event; alarms D, E and F on devices a and b that satisfy both are merged into another; alarms G and H on devices c, d and e that satisfy both are merged into a third; and alarms I and J on devices c, d and e that satisfy both are merged into a fourth.
Existing alarm root cause positioning methods that use a clustering algorithm tend to compress different alarm root causes that occur at the same time into the same alarm event, so that root cause positioning is inaccurate or fewer root causes are located than actually exist. The method of the invention calculates the association time between different alarms and the high-frequency time between identical alarms, and divides alarm events based on both, so that valid alarm events are effectively aggregated, root cause positioning is more accurate and comprehensive, the influence of noise points on event aggregation is reduced, and robustness is increased. In addition, the alarm root cause of an alarm event is located by analyzing two dimensions, space (the topologically associated devices) and time (the occurrence time of each alarm); the alarm storm data is thereby compressed, the alarm root cause is located quickly, and the accuracy of alarm event division and of fault positioning is improved.
After the above steps, the following step S60 may be executed in sequence, so that an alarm propagation graph can then be obtained for each alarm event according to the alarm association rule; alternatively, steps S20, S40 and S60 may be executed independently, so that the association statistics of primary and secondary alarms are compiled by using the high-frequency time and the association time, and the alarm association rule is then established.
Step S60: Establish the alarm association rule.
Further, step S60 may include:
Step S61: Acquire the association relation between the alarms in the associated alarms and the confidence of the associated alarms, wherein the confidence is the probability that the secondary alarm occurs when the primary alarm occurs;
Step S62: Establish the alarm association rule according to the association relation between the alarms and the confidence.
In a specific embodiment, only the confidence of the associated alarms may be obtained, or only their support, or both may be obtained at the same time so that the alarm association rule is established according to the confidence and the support. The confidence is the probability that the secondary alarm occurs when the primary alarm occurs, and the support is the probability that the primary alarm and the secondary alarm occur together. An association relation value is obtained for the associated alarms from the confidence and/or the support; for each pair of associated alarms, this gives the value of the relation in which one alarm causes the other, and combining it with the association relation of the two alarms gives the direction and weight between them. These steps are performed for every two alarms on any topologically associated network element devices in the device topological relation of step S20, and the alarm association rule is finally established from all the information obtained.
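The confidence and support defined above can be estimated from historical data as in the following Python sketch. It assumes that the history is available as a list of alarm events, each represented by the set of alarm titles it contains; that representation, the sample data, and the function name `support_and_confidence` are assumptions made for illustration.

```python
def support_and_confidence(events, primary, secondary):
    """Estimate support(primary, secondary) and confidence(primary -> secondary)."""
    total = len(events)
    with_primary = sum(1 for event in events if primary in event)
    with_both = sum(1 for event in events if primary in event and secondary in event)
    # Support: probability that the primary and secondary alarms occur together.
    support = with_both / total if total else 0.0
    # Confidence: probability that the secondary alarm occurs when the primary alarm occurs.
    confidence = with_both / with_primary if with_primary else 0.0
    return support, confidence


# Made-up history: each set lists the alarm titles seen in one historical alarm event.
history = [{"A", "B"}, {"A", "B", "C"}, {"A"}, {"C"}]
support, confidence = support_and_confidence(history, "A", "B")
print(support, confidence)
```

Because confidence is directional, the value for B → A is computed separately and is generally different from the value for A → B.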
Step S70: For each alarm event, obtain an alarm propagation graph according to the alarm association rule, wherein the alarm association rule is obtained based on the association relation and the confidence between alarms.
Further, step S70 may include:
Step S71: Construct an alarm undirected graph for each alarm event;
Step S72: Set a weight for each edge in the alarm undirected graph according to the alarm association rule to obtain an alarm link topological graph;
Step S73: Reconstruct the alarm link topological graph by using a maximum spanning tree algorithm to obtain the alarm propagation graph.
A graph is an abstract data structure for representing associations between objects; it is described by vertices (Vertex), which represent objects, and edges (Edge), which represent the relationships between objects.
In a specific implementation, for each alarm event, an alarm undirected graph is constructed from all alarms in the event. The alarm undirected graph comprises a plurality of vertices, which are the alarms, and the edges connecting them, which are the alarm association relations. At this stage the edges have no direction, and the alarm undirected graph may contain a ring. Next, a direction and a weight are assigned to each edge by using the pre-established alarm association rule to obtain a directed alarm link topological graph, which may also contain a ring. The alarm link topological graph is then reconstructed into an alarm propagation graph by using a maximum spanning tree algorithm, that is, the edges are arranged by weight from largest to smallest; the resulting alarm propagation graph does not contain a ring.
In this embodiment, fig. 7 illustrates how the alarm propagation graph is obtained for an alarm event formed by merging alarm A, alarm B and alarm C, which occur on network element devices A and B and satisfy both the high-frequency time and the association time. For this alarm event, the constructed alarm undirected graph is shown in fig. 7(a): the three vertices are alarm A, alarm B and alarm C, and the three edges are AB, BC and CA. At this point the edges have no direction, and the alarm undirected graph forms a ring.
When step S60 of this embodiment obtains only the confidence of the associated alarms, suppose, for example, that in the established alarm association rule the confidence of A → B is 1, that of B → C is 3, and that of C → A is 2; the confidence of a reverse direction, such as B → A, may differ and is determined by the actual situation. In this example, a direction and a weight are assigned to each edge of the alarm undirected graph according to the alarm association rule. The primary/secondary relationship between two associated alarms in the rule determines the direction of the edge connecting the corresponding vertices, so the edge directions are A to B, B to C, and C to A, and the weights of edges AB, BC and CA are 1, 3 and 2, respectively. This yields the directed alarm link topological graph shown in fig. 7(b), which also forms a ring.
Then, the alarm link topological graph is reconstructed into an alarm propagation graph by a maximum spanning tree algorithm. Specifically, Kruskal's algorithm or Prim's algorithm, both of which apply the greedy idea, may be adopted, selected according to the actual situation. In the above alarm link topological graph, the BC edge has the largest weight, the CA edge the next largest, and the AB edge the smallest, so a link relationship with alarm B as the source is formed. Because the alarm propagation graph must not form a ring, the last edge, AB, is pruned, and the alarm propagation graph B → C → A is finally obtained, as shown in fig. 7(c).
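The pruning in this example can be reproduced with a short Kruskal-style sketch. It is an illustration under stated assumptions rather than the patent's implementation: the function name `maximum_spanning_tree` and the edge tuple layout are hypothetical, and the edge direction is carried along as data but ignored when testing for rings.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (source alarm, target alarm, weight)


def maximum_spanning_tree(edges: List[Edge]) -> List[Edge]:
    """Kruskal's algorithm with edges processed in descending weight order."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    kept: List[Edge] = []
    for src, dst, w in sorted(edges, key=lambda e: e[2], reverse=True):
        root_s, root_d = find(src), find(dst)
        if root_s != root_d:  # adding this edge does not form a ring
            parent[root_s] = root_d
            kept.append((src, dst, w))
    return kept


# Weights from the example: A -> B is 1, B -> C is 3, C -> A is 2.
link_topology = [("A", "B", 1), ("B", "C", 3), ("C", "A", 2)]
print(maximum_spanning_tree(link_topology))
# [('B', 'C', 3), ('C', 'A', 2)]
```

The AB edge is considered last and rejected because A and B are already connected through C, which matches the pruning described for the example.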
The above embodiment is only the simplest example, used to make the specific steps of the present invention easier to understand, and it yields only one alarm link relationship. In practical applications, the number of network element devices, the topological relationship among them, and the structure of the alarm undirected graph are far more complex than in this example, so the maximum spanning tree algorithm needs to be applied according to the actual situation.
Step S90: obtaining a root node according to the alarm propagation graph so as to locate the alarm root cause.
In a specific embodiment, the alarm link relationships in the alarm propagation graph are arranged in descending order of weight to obtain a plurality of ranked alarm link relationships; these are then screened according to a preset output number to obtain an alarm link result, so as to locate the alarm root cause.
In this embodiment, if only the two most likely results are to be output, B → C and C → A are displayed to the operation and maintenance staff in descending order of weight. The staff can see that alarm B is the root node, so the alarm root cause is located as alarm B. The attribute information of alarm B is then obtained and displayed to the staff, which shows the propagation range of the fault in real time and allows the staff to take protective measures in time on the network element device where alarm B occurred, preventing larger faults.
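The ranking and root selection described here can be sketched as follows. The sketch assumes that the root node is the alarm with no incoming edge in the propagation graph, which is one natural reading of the example; the function name `locate_root` and the `top_k` parameter (standing in for the preset output number) are hypothetical.

```python
from typing import List, Tuple

Edge = Tuple[str, str, float]  # (source alarm, target alarm, weight)


def locate_root(propagation_edges: List[Edge], top_k: int) -> Tuple[List[str], List[Edge]]:
    """Return root candidates (no incoming edge) and the top_k heaviest links."""
    sources = {src for src, _, _ in propagation_edges}
    targets = {dst for _, dst, _ in propagation_edges}
    roots = sorted(sources - targets)  # alarms that nothing else points to
    ranked = sorted(propagation_edges, key=lambda e: e[2], reverse=True)
    return roots, ranked[:top_k]


roots, links = locate_root([("C", "A", 2), ("B", "C", 3)], top_k=2)
print(roots)  # ['B']
print(links)  # [('B', 'C', 3), ('C', 'A', 2)]
```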
It should be noted that, in this embodiment, the device topological relationship in step S20, the event partitioning model in step S40, and the alarm association rule in step S60 may all be updated in real time. As historical data accumulates, the alarm association rule may be updated continuously, mining new associated alarms and eliminating invalid ones; the event partitioning model may likewise be retrained on later historical alarm data and iterated continuously, so that alarm events are partitioned more accurately.
The alarm root cause positioning method provided by this embodiment locates the alarm root cause within each alarm event according to the applicable topological relationship. It is intelligent, customizable and accurate; it can reuse alarm data, iterate through self-learning, aggregate alarm events effectively, and improve in accuracy as operating data grows. Traditional algorithms mostly locate the root cause by correlation calculation alone, which easily leads to mislocation when the same alarm occurs on different devices at the same time. The present method introduces the connection topology of the devices and uses the graph structure for both event aggregation and root cause positioning: during event aggregation, the graph structure is used to classify alarm events; during root cause positioning, it is used to build the alarm undirected graph, on which a graph algorithm then mines the root cause. Because the physical connection state of the devices is taken into account, mislocation of alarms occurring simultaneously on different devices is avoided.
Example three
Based on the same inventive concept, referring to fig. 4, a first embodiment of the event partitioning model training method of the present invention is provided, and the training method can be applied to alarm root cause positioning devices.
The event partition model training method of the present embodiment is described in detail below with reference to the flowchart shown in fig. 4. The method specifically comprises the following steps:
Step S41: acquiring historical alarm data of a network system comprising network element devices and a historical topological relation of the corresponding network element devices;
Step S42: performing time sequence mining on the historical alarm data based on the historical topological relation to obtain a time aggregation characteristic, wherein the time aggregation characteristic comprises high-frequency time and/or association time; the high-frequency time is the average interval duration between alarms that have overlapping alarm durations and consistent alarm titles in topologically related network element devices, and the association time is the average interval duration between alarm pairs that have overlapping alarm durations but inconsistent alarm titles in topologically related network element devices;
Step S43: training a Bayesian analysis model according to the time aggregation characteristic to obtain an event partitioning model, so as to merge alarms meeting the high-frequency time and/or alarms meeting the association time.
For more implementation details in the specific implementation of the method steps, reference may be made to the description of the specific implementation of the steps in the second embodiment, and for simplicity of the description, repeated descriptions are not repeated here.
In the event partitioning model training method provided in this embodiment, a Bayesian analysis model is established based on Bayesian analysis and trained according to the time aggregation characteristics to obtain an event partitioning model, which merges alarms meeting the high-frequency time and/or alarms meeting the association time. Partitioning alarm events by high-frequency time and association time aggregates the relevant alarms into events effectively, makes root cause positioning more accurate and comprehensive, reduces the influence of noise points on event aggregation, and increases robustness. Training the model in both the spatial and the temporal dimension further improves the accuracy of alarm event classification. Meanwhile, as historical alarm data is continuously updated, the event partitioning model is iteratively retrained, which can effectively improve its accuracy.
Example four
Based on the same inventive concept, referring to fig. 8 and fig. 9, a first embodiment of the alarm root cause positioning apparatus of the present invention is provided. The apparatus may be a virtual apparatus applied to an alarm root cause positioning device.
The alarm root cause positioning apparatus provided in this embodiment is described in detail below with reference to the functional module diagram shown in fig. 8. The apparatus may include:
the data acquisition module is used for acquiring real-time alarm data of a network system comprising network element equipment, wherein the real-time alarm data comprises alarm equipment information and alarm occurrence time;
the topological relation acquisition module is used for constructing a real-time topological relation of the network element equipment according to a preset equipment topological relation and the alarm equipment information;
the event partitioning module is used for classifying the real-time alarm data by utilizing an event partitioning model obtained by training according to the real-time topological relation and the alarm occurrence time to obtain a plurality of alarm events, wherein the event partitioning model is obtained based on high-frequency time and/or associated time training, and the high-frequency time is the average interval duration between alarms which are overlapped in alarm duration time and have consistent alarm titles in the network element equipment related to the topology; the association time is the average interval duration between alarm pairs with overlapped alarm durations but inconsistent alarm titles in the network element equipment related to the topology;
the event analysis module is used for acquiring an alarm propagation graph according to an alarm association rule aiming at each alarm event, wherein the alarm association rule is acquired based on the association relationship and the confidence degree among the alarms;
and the root cause positioning module is used for obtaining a root node according to the alarm propagation graph so as to position the alarm root cause.
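For the topological relation acquisition module listed above, one plausible reading is that the real-time topological relation is the preset device topology restricted to the devices that are currently raising alarms. The sketch below illustrates that reading only; the adjacency-set representation, the function name `real_time_topology`, and the device names are assumptions and are not specified in the source.

```python
from typing import Dict, Iterable, Set


def real_time_topology(
    preset: Dict[str, Set[str]], alarming_devices: Iterable[str]
) -> Dict[str, Set[str]]:
    """Restrict the preset device topology to devices that currently raise alarms."""
    # Ignore alarming devices that are unknown to the preset topology.
    active = set(alarming_devices) & set(preset)
    # Keep only links whose both endpoints are alarming.
    return {dev: preset[dev] & active for dev in active}


# Hypothetical preset topology as adjacency sets.
preset_topology = {
    "a": {"b", "c"},
    "b": {"a"},
    "c": {"a", "d"},
    "d": {"c"},
}
current = real_time_topology(preset_topology, ["a", "b", "d"])
print(current["a"], current["b"], current["d"])
```

In this toy input, device d is alarming but its only neighbour c is not, so d ends up isolated and would not be grouped with a and b on topological grounds.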
Next, with reference to the schematic connection diagram of the functional module shown in fig. 9, the alarm root cause positioning apparatus provided in this embodiment is described in detail.
Specifically, the data acquisition module, the topological relation acquisition module, the event division module, the event analysis module and the root cause positioning module are connected in sequence.
Further, the apparatus may further include:
the equipment topological relation construction module is connected with the topological relation acquisition module;
and the equipment topological relation construction module is used for acquiring historical alarm data of the network system, acquiring the connection relation of network element equipment of different types or different architecture layers in the network system according to the historical alarm data, and constructing an equipment topological relation according to the connection relation so as to acquire a preset equipment topological relation.
Further, the apparatus may further include:
the event partitioning model training module is connected with the event partitioning module;
and the event partitioning model training module is used for training and obtaining an event partitioning model based on the high-frequency time and/or the associated time.
Specifically, the event partitioning model training module may include:
a historical topological relation obtaining submodule, configured to obtain historical alarm data of the network system and a historical topological relation of a corresponding network element device;
the time sequence mining submodule is used for carrying out time sequence mining on the historical alarm data based on the historical topological relation to obtain time aggregation characteristics, wherein the time aggregation characteristics comprise high-frequency time and/or associated time;
and the model training submodule trains the Bayesian analysis model according to the time aggregation characteristics to obtain an event division model.
Further, in the event partitioning model training module, the timing mining submodule may include:
a topological diagram obtaining unit, configured to obtain, based on the historical topological relation, a plurality of network element devices related to the topology;
a duration obtaining unit, configured to obtain, based on the topology-related network element devices, a total alarm duration corresponding to each network element device from the historical alarm data;
and the time aggregation characteristic acquisition unit is used for acquiring the high-frequency time and/or the associated time according to the total alarm duration corresponding to each network element device.
In one embodiment, the time aggregation feature obtaining unit may include:
a high-frequency time obtaining subunit, configured to: obtain the high-frequency alarms and their total count according to the total alarm duration corresponding to each network element device, where the high-frequency alarms are alarms with overlapping alarm durations and consistent alarm titles; then, for each network element device, obtain the high-frequency time periods and the high-frequency count of the high-frequency alarms according to the alarm duration of each high-frequency alarm, where a high-frequency time period is a time period in which alarm durations overlap; obtain the interval durations of each high-frequency time period according to the start time of each high-frequency alarm in that period; perform weighted averaging over the interval durations of all the high-frequency time periods to obtain the average interval duration of all the high-frequency time periods in the network element device; and finally obtain the high-frequency time according to the high-frequency count and the average interval duration of each network element device, using the following calculation formula:
Figure BDA0003046979750000241
where T represents the high-frequency time, n represents the number of network element devices, N_n represents the total number of high-frequency alarms, i_n represents the number of high-frequency occurrences in a high-frequency time period, with i_n ≤ N_n, and Δt_n represents the average interval time difference of all high-frequency time periods.
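The per-device part of this procedure, that is, finding same-title alarms whose durations overlap and averaging the gaps between their start times, can be sketched as follows. This is an illustrative sketch only: the `Alarm` class, its field names, and the alarm titles are hypothetical, a plain mean over all gaps is assumed in place of the weighting, and the combination across devices into T is not shown because the formula is available only as an image.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Alarm:
    title: str
    start: float  # alarm start time, in seconds
    end: float    # alarm clear time, in seconds


def average_high_frequency_interval(alarms: List[Alarm]) -> Optional[float]:
    """Mean gap between start times of same-title alarms whose durations overlap.

    Covers one network element device; returns None if no such alarms exist.
    """
    by_title: Dict[str, List[Alarm]] = defaultdict(list)
    for alarm in alarms:
        by_title[alarm.title].append(alarm)

    gaps: List[float] = []
    for group in by_title.values():
        group.sort(key=lambda a: a.start)
        period_end = group[0].end
        for prev, cur in zip(group, group[1:]):
            if cur.start <= period_end:  # still inside the overlapping period
                gaps.append(cur.start - prev.start)
                period_end = max(period_end, cur.end)
            else:  # a new high-frequency period starts here
                period_end = cur.end
    return sum(gaps) / len(gaps) if gaps else None


device_alarms = [
    Alarm("LINK_DOWN", 0, 50),
    Alarm("LINK_DOWN", 10, 60),
    Alarm("LINK_DOWN", 30, 40),
    Alarm("LINK_DOWN", 200, 210),  # does not overlap the earlier period
    Alarm("POWER_LOW", 5, 15),     # single alarm, contributes no gap
]
print(average_high_frequency_interval(device_alarms))  # 15.0
```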
In another embodiment, the time aggregation characteristic obtaining unit may include:
an association time obtaining subunit, configured to: obtain the associated alarms according to the total alarm duration corresponding to each network element device, where the associated alarms are alarm pairs with overlapping alarm durations and inconsistent alarm titles, each pair comprising a primary alarm and a secondary alarm; then, for each network element device, obtain the association time periods and the association count of the associated alarms according to the alarm duration of each associated alarm; obtain the start time difference of each association time period according to the start time of the primary alarm and the start time of the secondary alarm in that period; perform weighted averaging over the start time differences of all the association time periods to obtain the average interval time difference of all the association time periods in the network element device; and finally obtain the association time of the associated alarms according to the association count and the average interval time difference of each network element device, using the following calculation formula:
Figure BDA0003046979750000242
where S represents the association time, n represents the number of network element devices, and ΔS_n represents the average interval time difference over all association time periods,
Figure BDA0003046979750000243
where k_n represents the association count of the association time periods and t_v represents the start time difference of an association time period.
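The start time difference t_v and its per-device average can be illustrated with a small sketch. It assumes that the primary and secondary alarms have already been identified, treats two alarms as an associated pair when their durations overlap and their titles differ, and uses a plain mean; the `Alarm` class, the function name, and the alarm titles are hypothetical, and the final combination into S is not shown because the formula is available only as an image.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Alarm:
    title: str
    start: float  # alarm start time, in seconds
    end: float    # alarm clear time, in seconds


def average_start_time_difference(
    primary: List[Alarm], secondary: List[Alarm]
) -> Optional[float]:
    """Mean start-time difference over primary/secondary pairs with overlapping durations."""
    diffs: List[float] = []
    for p in primary:
        for s in secondary:
            overlaps = p.start <= s.end and s.start <= p.end
            if overlaps and p.title != s.title:
                diffs.append(abs(s.start - p.start))
    return sum(diffs) / len(diffs) if diffs else None


primary_alarms = [Alarm("POWER_FAIL", 0, 100), Alarm("POWER_FAIL", 300, 400)]
secondary_alarms = [
    Alarm("LINK_DOWN", 20, 80),
    Alarm("LINK_DOWN", 310, 350),
    Alarm("LINK_DOWN", 500, 600),  # overlaps no primary alarm
]
print(average_start_time_difference(primary_alarms, secondary_alarms))  # 15.0
```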
Further, the apparatus may further include:
an alarm association rule establishing module connected with the event analysis module;
and the alarm association rule establishing module is used for establishing an alarm association rule.
Specifically, the alarm association rule establishing module may include:
a confidence coefficient obtaining submodule, configured to obtain an association relationship between alarms in the association alarms and a confidence coefficient of the association alarms, where the confidence coefficient includes a probability of occurrence of a secondary alarm when the primary alarm occurs;
and the alarm association rule establishing submodule is used for establishing an alarm association rule according to the association relation between the alarms and the confidence coefficient.
Further, the event analysis module may include:
the undirected graph construction submodule is used for constructing an alarm undirected graph aiming at each alarm event;
the alarm link topological graph obtaining submodule is used for setting a weight value for an edge in the alarm undirected graph according to the alarm association rule to obtain an alarm link topological graph;
and the alarm propagation map acquisition submodule is used for reconstructing the alarm link topological map by utilizing a maximum spanning tree algorithm to acquire the alarm propagation map.
It should be noted that, the functions and the corresponding achievable technical effects of the alarm root cause positioning apparatus provided in this embodiment may refer to the description of the specific implementation manners in each embodiment of the alarm root cause positioning method of the present invention, and for the sake of brevity of the description, no further description is given here.
Example five
Based on the same inventive concept, fig. 2 is a schematic diagram of the hardware structure of the alarm root cause positioning device according to embodiments of the present invention. The present embodiment provides an alarm root cause positioning device, which may include a processor and a memory, where the memory stores a computer program that, when executed by the processor, implements all or part of the steps of the method embodiments of the present invention.
Specifically, the alarm root cause positioning device is a terminal device or a network device capable of network connection; it may be a terminal device such as a mobile phone, a computer, a tablet computer, or a portable computer, or a network device such as a server or a cloud platform.
It will be appreciated that the device may also include a communications bus, a user interface and a network interface.
Wherein the communication bus is used for realizing connection communication among the components.
The user interface is used for connecting the client and performing data communication with the client, and may include an output unit such as a display screen and an input unit such as a keyboard, and optionally may also include other input/output interfaces such as a standard wired interface and a wireless interface.
The network interface is used for connecting to a background server and performing data communication with it, and may include a standard wired interface and a wireless interface such as a Wi-Fi interface.
The memory is used to store various types of data, which may include, for example, instructions for any application or method in the device, as well as application-related data. The memory may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk; alternatively, the memory may be a storage device independent of the processor.
The processor may be an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a controller, a microcontroller, a microprocessor, or other electronic components, and is configured to call a computer program stored in the memory and perform all or part of the steps of the above-described method embodiments.
Example six
Based on the same inventive concept, the present embodiment provides a computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an application store, etc., having stored thereon a computer program that is executable by one or more processors and that, when executed by the processors, implements all or part of the steps of the various method embodiments of the present invention.
It should be noted that the above-mentioned serial numbers of the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments.
The above description is only an alternative embodiment of the present invention, and is not intended to limit the scope of the present invention, and all modifications, equivalents and flow changes made by the present invention as described in the specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method for positioning an alarm root cause is characterized by comprising the following steps:
acquiring real-time alarm data of a network system comprising network element equipment, wherein the real-time alarm data comprises alarm equipment information and alarm occurrence time;
constructing a real-time topological relation of the network element equipment according to a preset equipment topological relation and the alarm equipment information;
classifying the real-time alarm data by utilizing an event division model obtained by training according to the real-time topological relation and the alarm occurrence time to obtain a plurality of alarm events, wherein the event division model is obtained by training based on high-frequency time and/or associated time, and the high-frequency time is the average interval duration between alarms which are overlapped in alarm duration time and have consistent alarm titles in network element equipment related to topology; the association time is the average interval duration between alarm pairs with overlapped alarm durations but inconsistent alarm titles in the network element equipment related to the topology;
aiming at each alarm event, obtaining an alarm propagation graph according to an alarm association rule, wherein the alarm association rule is obtained based on the association relation and the confidence degree among all alarms;
and acquiring a root node according to the alarm propagation graph so as to position an alarm root cause.
2. The alarm root cause positioning method according to claim 1, wherein before the step of classifying the real-time alarm data by using the event division model obtained by training according to the real-time topological relation to obtain a plurality of alarm events, the method further comprises:
acquiring historical alarm data of the network system and a historical topological relation of corresponding network element equipment;
performing time sequence mining on the historical alarm data based on the historical topological relation to obtain time aggregation characteristics, wherein the time aggregation characteristics comprise high-frequency time and/or associated time;
and training a Bayesian analysis model according to the time aggregation characteristics to obtain an event division model.
3. The alarm root cause positioning method according to claim 2, wherein the step of performing time sequence mining on the historical alarm data based on the historical topological relation to obtain a time aggregation feature specifically comprises:
obtaining a plurality of network element devices related to the topology based on the historical topological relation;
based on the network element equipment related to the topology, acquiring the total alarm duration corresponding to each network element equipment from the historical alarm data;
and obtaining high-frequency time and/or associated time according to the total alarm duration corresponding to each network element device.
4. The alarm root cause positioning method according to claim 3, wherein the step of obtaining the high-frequency time according to the total alarm duration corresponding to each network element device specifically comprises:
acquiring high-frequency alarms and the total times thereof according to the total alarm duration corresponding to each network element device, wherein the high-frequency alarms comprise alarms with overlapped alarm durations and consistent alarm titles;
aiming at each network element device, obtaining a high-frequency time period and high-frequency times of the high-frequency alarm according to the alarm duration of each high-frequency alarm, wherein the high-frequency time period comprises a time period with overlapped alarm duration;
obtaining the interval duration of the high-frequency time period according to the starting time of each high-frequency alarm in the high-frequency time period;
performing weighted average processing according to the interval duration of all the high-frequency time periods to obtain the average interval duration of all the high-frequency time periods in the network element equipment;
obtaining high-frequency time according to the high-frequency times and the average interval duration of each network element device, wherein the adopted calculation formula is as follows:
Figure FDA0003046979740000021
wherein T represents the high-frequency time, n represents the number of network element devices, N_n represents the total number of high-frequency alarms, i_n represents the number of high-frequency occurrences in a high-frequency time period, with i_n ≤ N_n, and Δt_n represents the average interval time difference of all high-frequency time periods.
5. The alarm root cause positioning method according to claim 3, wherein the step of obtaining the associated time according to the total alarm duration corresponding to each network element device specifically comprises:
obtaining associated alarms according to the total alarm duration corresponding to each network element device, wherein the associated alarms comprise alarm pairs with overlapped alarm durations and inconsistent alarm titles, and the alarm pairs comprise a main alarm and a secondary alarm;
for each network element device, obtaining the association time period and the association times of the association alarm according to the alarm duration of each association alarm;
acquiring the initial time difference of the associated time period according to the starting time of the primary alarm and the starting time of the secondary alarm in the associated time period;
performing weighted average processing according to the starting time differences of all the associated time periods to obtain average interval time differences of all the associated time periods in the network element equipment;
obtaining the association time of the association alarm according to the association times of each network element device and the average interval time difference, wherein the calculation formula is as follows:
Figure FDA0003046979740000031
wherein S represents the association time, n represents the number of network element devices, and ΔS_n represents the average interval time difference over all associated time periods,
Figure FDA0003046979740000032
wherein k_n represents the number of associations of the associated time periods and t_v represents the start time difference of an associated time period.
6. The method for alarm root cause positioning according to claim 1, wherein the step of obtaining an alarm propagation graph according to an alarm association rule for each alarm event specifically comprises:
constructing an alarm undirected graph aiming at each alarm event;
setting a weight value for an edge in the alarm undirected graph according to the alarm association rule to obtain an alarm link topological graph;
and reconstructing the alarm link topological graph by using a maximum spanning tree algorithm to obtain an alarm propagation graph.
7. An event partitioning model training method, characterized by comprising the steps of:
acquiring historical alarm data of a network system comprising network element equipment and a historical topological relation of corresponding network element equipment;
based on the historical topological relation, performing time sequence mining on the historical alarm data to obtain time aggregation characteristics, wherein the time aggregation characteristics comprise high-frequency time and/or associated time, and the high-frequency time is the average interval duration between alarms which are overlapped in alarm duration time and consistent in alarm titles in the network element equipment related to the topology; the association time is the average interval duration between alarm pairs with overlapped alarm durations but inconsistent alarm titles in the network element equipment related to the topology;
and training the Bayesian analysis model according to the time aggregation characteristics to obtain an event division model so as to combine alarms meeting high-frequency time and/or alarms meeting associated time.
8. An alarm root cause positioning apparatus, the apparatus comprising:
the data acquisition module is used for acquiring real-time alarm data of a network system comprising network element equipment, wherein the real-time alarm data comprises alarm equipment information and alarm occurrence time;
the topological relation acquisition module is used for constructing a real-time topological relation of the network element equipment according to a preset equipment topological relation and the alarm equipment information;
the event partitioning module is used for classifying the real-time alarm data by utilizing an event partitioning model obtained by training according to the real-time topological relation and the alarm occurrence time to obtain a plurality of alarm events, wherein the event partitioning model is obtained based on high-frequency time and/or associated time training, and the high-frequency time is the average interval duration between alarms which are overlapped in alarm duration time and have consistent alarm titles in the network element equipment related to the topology; the association time is the average interval duration between alarm pairs with overlapped alarm durations but inconsistent alarm titles in the network element equipment related to the topology;
the event analysis module is used for obtaining an alarm propagation diagram according to an alarm association rule aiming at each alarm event, wherein the alarm association rule is obtained based on the association relation and the confidence coefficient between alarms;
and the root cause positioning module is used for obtaining a root node according to the alarm propagation graph so as to position the alarm root cause.
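The last two modules of this claim (building an alarm propagation graph from association rules, then taking its root node as the root cause) can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of the patent: rules are assumed to be `(cause_title, effect_title, confidence)` tuples, the `min_confidence` threshold of 0.6 is an arbitrary illustrative value, and a root node is assumed to be any node with no incoming edge.

```python
def build_propagation_graph(event_alarms, rules, min_confidence=0.6):
    """Build a directed graph {cause: set(effects)} for one alarm event.

    Only rules whose confidence reaches `min_confidence` and whose two
    alarm titles both occur in the event contribute an edge.
    """
    graph = {title: set() for title in event_alarms}
    for cause, effect, confidence in rules:
        if confidence >= min_confidence and cause in graph and effect in graph:
            graph[cause].add(effect)
    return graph


def root_nodes(graph):
    """Return the nodes that no other node points to, in sorted order."""
    has_parent = {child for children in graph.values() for child in children}
    return sorted(node for node in graph if node not in has_parent)


# Hypothetical alarm event and association rules, for illustration only.
event = ["POWER_LOSS", "LINK_DOWN", "SERVICE_DEGRADED"]
association_rules = [
    ("POWER_LOSS", "LINK_DOWN", 0.9),
    ("LINK_DOWN", "SERVICE_DEGRADED", 0.8),
    ("SERVICE_DEGRADED", "POWER_LOSS", 0.2),  # below threshold, ignored
]
propagation_graph = build_propagation_graph(event, association_rules)
roots = root_nodes(propagation_graph)
```

With these inputs the low-confidence rule is dropped, leaving the chain POWER_LOSS, LINK_DOWN, SERVICE_DEGRADED, so `POWER_LOSS` is the only candidate root. A graph containing a cycle among high-confidence rules would have no root under this simple definition and would need additional handling.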
9. An alarm root cause positioning device, characterized in that the device comprises a memory and a processor, the memory having stored thereon a computer program which, when executed by the processor, implements an alarm root cause positioning method according to any one of claims 1 to 6.
10. A storage medium having stored thereon a computer program executable by one or more processors to implement the alarm root cause positioning method of any one of claims 1 to 6.
CN202110478093.9A 2021-04-29 2021-04-29 Alarm root cause positioning method, device, equipment and storage medium Active CN115361266B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110478093.9A CN115361266B (en) 2021-04-29 2021-04-29 Alarm root cause positioning method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110478093.9A CN115361266B (en) 2021-04-29 2021-04-29 Alarm root cause positioning method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115361266A true CN115361266A (en) 2022-11-18
CN115361266B CN115361266B (en) 2023-08-15

Family

ID=84030739

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110478093.9A Active CN115361266B (en) 2021-04-29 2021-04-29 Alarm root cause positioning method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115361266B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116155692A (en) * 2023-02-24 2023-05-23 北京优特捷信息技术有限公司 Alarm solution recommending method and device, electronic equipment and storage medium
CN116582410A (en) * 2023-05-24 2023-08-11 青岛海信信息科技股份有限公司 Intelligent operation and maintenance service method and device based on ITSM system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014071084A2 (en) * 2012-10-31 2014-05-08 O'malley, Matt System and method for dynamically monitoring, analyzing, managing, and alerting packet data traffic and applications
CN106603317A (en) * 2017-02-20 2017-04-26 山东浪潮商用***有限公司 Alarm monitoring strategy analysis method based on data mining technology
CN108847994A (en) * 2018-07-25 2018-11-20 山东中创软件商用中间件股份有限公司 Alarm localization method, device, equipment and storage medium based on data analysis
CN112039695A (en) * 2020-08-19 2020-12-04 朔黄铁路发展有限责任公司肃宁分公司 Transmission network fault positioning method and device based on Bayesian inference
CN112118141A (en) * 2020-09-21 2020-12-22 中山大学 Communication network-oriented alarm event correlation compression method and device
CN112152852A (en) * 2020-09-23 2020-12-29 创新奇智(北京)科技有限公司 Root cause analysis method, device, equipment and computer storage medium
CN112564949A (en) * 2020-11-27 2021-03-26 中盈优创资讯科技有限公司 Analysis method and device based on cross-professional alarm association rule

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
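Zhao Jigang; Zhang Chao; Ding Jianli; Wang Jing: "Alarm association rule mining for the civil aviation passenger service information ***", Computer Applications and Software *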

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116155692A (en) * 2023-02-24 2023-05-23 北京优特捷信息技术有限公司 Alarm solution recommending method and device, electronic equipment and storage medium
CN116155692B (en) * 2023-02-24 2023-11-24 北京优特捷信息技术有限公司 Alarm solution recommending method and device, electronic equipment and storage medium
CN116582410A (en) * 2023-05-24 2023-08-11 青岛海信信息科技股份有限公司 Intelligent operation and maintenance service method and device based on ITSM system
CN116582410B (en) * 2023-05-24 2023-10-27 青岛海信信息科技股份有限公司 Intelligent operation and maintenance service method and device based on ITSM system

Also Published As

Publication number Publication date
CN115361266B (en) 2023-08-15

Similar Documents

Publication Publication Date Title
US11246045B2 (en) Systems and methods for communications node upgrade and selection
US10205635B2 (en) System and method for diagnosing database network integrity using application business groups and application epicenters
US20200259715A1 (en) Topology-Aware Continuous Evaluation of Microservice-based Applications
CN104424235B (en) The method and apparatus for realizing user profile cluster
CN114567538B (en) Alarm information processing method and device
CN115361266A (en) Alarm root cause positioning method, device, equipment and storage medium
CN110515986B (en) Processing method and device of social network diagram and storage medium
CN109886699A (en) Activity recognition method and device, electronic equipment, storage medium
CN109218080A (en) A kind of method, monitoring system and the terminal device of automatic drafting network topology architecture
CN113204451B (en) Pressure test method, system, storage medium and terminal for Redis cluster
CN114091610A (en) Intelligent decision method and device
CN111859187A (en) POI query method, device, equipment and medium based on distributed graph database
US20160125005A1 (en) Apparatus and Method for Profiling Activities and Transitions
CN109376287A (en) House property map construction method, device, computer equipment and storage medium
CN112784025A (en) Method and device for determining target event
CN117042026A (en) Business visualization model construction method, device, equipment, medium and program product
CN115630073B (en) Electric power Internet of things data processing method and platform based on edge calculation
CN114756301B (en) Log processing method, device and system
CN112579402A (en) Method and device for positioning faults of application system
CN110909191A (en) Graph data processing method and device, storage medium and electronic equipment
CN114500227B (en) Alarm analysis method, device, equipment and computer storage medium
US20230028044A1 (en) Environment change management and risk analysis
Leite et al. Use of distribution network topological fractality and sunburst charts in the online risk assessment
CN117251438A (en) WebGIS-based multi-source data management method and cloud sharing system
CN116992972A (en) Machine learning model training method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant