CN118133952A - Event influence determining method, device, equipment and storage medium of batch system - Google Patents


Info

Publication number
CN118133952A
Authority
CN
China
Prior art keywords
batch
time
batch system
relationship
abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410331136.4A
Other languages
Chinese (zh)
Inventor
张兰霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Unionpay Co Ltd
Original Assignee
China Unionpay Co Ltd


Abstract

The application provides an event influence determining method, device, equipment and storage medium of a batch system. The method comprises the following steps: determining whether an association relationship exists between the batch systems and determining the existence time of the association relationship; taking the batch systems as entities in a time sequence knowledge graph, and constructing the relationships among the entities in the time sequence knowledge graph based on whether the association relationship exists among the batch systems and on the existence time of the association relationship; acquiring an abnormal batch system in which an abnormal event occurs at the current moment; and determining, from all batch systems according to the time sequence knowledge graph, a batch system that is associated with the abnormal batch system at and/or after the current moment, and taking that batch system as an abnormal influence system. In this technical scheme, the association relationships between batch systems are represented by a knowledge graph and a time dimension is introduced, so that changes in the relationships between batch systems are better represented and a more accurate prediction of the event influence can be made.

Description

Event influence determining method, device, equipment and storage medium of batch system
Technical Field
The present application relates to the field of internet technologies, and in particular, to a method, an apparatus, a device, and a storage medium for determining an event impact of a batch system.
Background
With the development of cloud services and micro-service architectures, the number of systems keeps increasing and the association relationships between systems become more and more complex. Because a batch system does not have the high timeliness of an online system, after an event occurs it is necessary to wait until the end of the day to determine, from the correctness of the settlement file, whether each system is operating normally.
In the related art, a configuration management database (CMDB) contains information about the full life cycle of configuration items and the physical relationships between the configuration items, and is used to monitor which systems will be affected after an event occurs.
However, the existing method cannot achieve accurate analysis of the event influence.
Disclosure of Invention
The application provides a method, a device, equipment and a storage medium for determining event influence of a batch system, which are used for solving the problem that the conventional mode cannot accurately analyze the event influence in the batch system.
In a first aspect, an embodiment of the present application provides a method for determining an event impact of a batch system, where the method includes:
determining whether an association relationship exists between the batch systems, and determining the existence time of the association relationship;
taking the batch systems as entities in a time sequence knowledge graph, and constructing the relationship among the entities in the time sequence knowledge graph based on whether the relationship exists among the batch systems and the existence time of the relationship;
Acquiring an abnormal batch system with abnormal events at the current moment;
and determining a batch system which is associated with the abnormal batch system at the current moment and/or after the current moment from all batch systems according to the time sequence knowledge graph, and taking the batch system as an abnormal influence system.
In one possible design of the first aspect, the determining whether there is an association between the batch systems includes:
acquiring a configuration management database, a historical event, a batch system time sequence, system parameters and a calling relation of the batch system;
and determining whether an association relationship exists among all batch systems or not based on at least one of the configuration management database, the historical event, the batch system time sequence, the system parameters and the calling relationship.
In another possible design of the first aspect, the determining whether there is an association relationship between the batch systems based on at least one of the configuration management database, the historical event, the batch system timing, the system parameter, and the call relationship includes:
Determining whether a first batch system is attributed to a second batch system and/or whether a call exists between the first batch system and the second batch system based on at least one of the configuration management database, the historical event, the batch system timing, the system parameters, and the call relationship;
If the first batch system belongs to the second batch system, determining that an association relationship exists between the first batch system and the second batch system, wherein the association relationship is a composition relationship;
If the first batch system is not attributed to the second batch system and a calling relationship exists between the first batch system and the second batch system, determining that an association relationship exists between the first batch system and the second batch system and the association relationship is an influence relationship.
In yet another possible design of the first aspect, the determining the existence time of the association relationship between the batch systems includes:
If the batch systems are in a composition relationship, determining the existence time of the association relationship between the batch systems as each time point;
and if the batch systems are in an influence relationship, acquiring the calling time between the first batch system and the second batch system, and taking the calling time as the existing time of the association relationship.
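The rule above can be illustrated with a minimal Python sketch. The function name `existence_time`, the `(start, end)` interval encoding and the "HH:MM" time strings are assumptions made for illustration only; the application does not prescribe a data format.

```python
ALL_DAY = ("00:00", "24:00")  # hypothetical encoding of "each time point"

def existence_time(relation, call_times=None):
    """Return the (start, end) intervals during which the association exists.

    relation   -- "composition" or "influence"
    call_times -- recorded call times between the two batch systems
                  (only used for an influence relationship)
    """
    if relation == "composition":
        # A composition relationship is treated as existing at every time point.
        return [ALL_DAY]
    if relation == "influence":
        # An influence relationship exists only at the recorded call times.
        return [(t, t) for t in (call_times or [])]
    raise ValueError(f"unknown relation: {relation}")
```

For example, `existence_time("influence", ["02:00"])` yields a single point interval at 02:00, while a composition relationship always yields the whole day.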
In yet another possible design of the first aspect, the determining the existence time of the association relationship between the batch systems includes:
and acquiring a time point and/or a time period of an association relation between the first batch system and the second batch system in the working period based on the working period of the operation automation system formed by all batch systems, and taking the time point and/or the time period as the existence time of the association relation.
In yet another possible design of the first aspect, the method further comprises:
describing each entity in the time sequence knowledge graph with a quadruple, wherein the quadruple is used to characterize the relationship between the entity and other entities at time t.
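As an illustration of the quadruple description, a minimal Python sketch follows. The field names (`subject`, `relation`, `obj`, `time`), the helper `facts_at` and the "HH:MM" time encoding are assumptions; the application only states that a quadruple characterizes the relationship between an entity and other entities at time t.

```python
from typing import NamedTuple

class Quadruple(NamedTuple):
    """One time-stamped fact: `subject` has `relation` with `obj` at `time`."""
    subject: str
    relation: str
    obj: str
    time: str  # hypothetical encoding: "HH:MM" within one working day

# Example facts, taken from the influence rows of Table 3.
facts = [
    Quadruple("System 2", "influence", "System 1", "02:00"),
    Quadruple("System 3", "influence", "System 2", "04:00"),
]

def facts_at(facts, entity, time):
    """Return the quadruples that involve `entity` at exactly `time`."""
    return [q for q in facts if q.time == time and entity in (q.subject, q.obj)]
```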
In yet another possible design of the first aspect, the determining, among all batch systems, a batch system associated with the abnormal batch system at and/or after the current time, as an abnormal influencing system includes:
acquiring a historical quadruple describing an abnormal entity, wherein the abnormal entity is used for representing the abnormal batch system, and the historical quadruple is used for representing the relationship between the abnormal entity and other entities before the current moment;
predicting a current quadruple of the abnormal entity based on the historical quadruple, wherein the current quadruple is used for representing the relationship between the abnormal entity and other entities at the current moment;
determining, based on the current quadruple, other entities that have a relationship with the abnormal entity at and/or after the current moment as abnormal influence entities;
and acquiring a batch system represented by the abnormal influence entity in the time sequence knowledge graph as the abnormal influence system.
In yet another possible design of the first aspect, the predicting the current quadruple of the anomalous entity based on the historical quadruple includes:
constructing a time sequence graph network model based on the historical quadruple, wherein the time sequence graph network model is used for acquiring how the relationships among the entities in the time sequence knowledge graph change over time;
and predicting the current quadruple of the abnormal entity based on the time sequence graph network model.
In yet another possible design of the first aspect, the time sequence graph network model includes at least a memory module, an update function module, an aggregation module, a memory update module, and an embedding module;
The memory module is used for storing state information of each entity, wherein the state information comprises the relation between the entity and other entities at each historical time point;
the updating function module is used for determining updating information of the first entity and updating information of the second entity when the first entity and the second entity have a relation at the moment t, and the updating information is used for updating the state information;
The aggregation module is used for aggregating the update information generated by each entity before the current moment;
The memory updating module is used for updating the state information of the memory module according to the newly generated updating information;
the embedding module is used for generating state information of the entity at the current moment.
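The division of labour between the five modules can be sketched structurally in Python. This is a toy, non-learned stand-in: in an actual graph network model these modules would typically be trainable components, whereas here plain data structures are used so that the data flow is visible. The class name `ToyTemporalGraphMemory`, all method names and the "HH:MM" time encoding are hypothetical.

```python
from collections import defaultdict

class ToyTemporalGraphMemory:
    """Non-learned illustration of the five modules described above."""

    def __init__(self):
        # Memory module: per-entity state, here a list of (time, relation, other entity).
        self.memory = defaultdict(list)
        # Update information that has been generated but not yet applied.
        self.pending = defaultdict(list)

    def update_function(self, first, second, relation, t):
        """Update function module: when two entities are related at time t,
        generate update information for both of them."""
        self.pending[first].append((t, relation, second))
        self.pending[second].append((t, relation, first))

    def aggregate(self, entity):
        """Aggregation module: combine the pending update information of one
        entity (here simply ordered by time) and clear it from the queue."""
        return sorted(self.pending.pop(entity, []))

    def update_memory(self):
        """Memory update module: fold the aggregated update information into
        the stored state of each entity."""
        for entity in list(self.pending):
            self.memory[entity].extend(self.aggregate(entity))

    def embed(self, entity, now):
        """Embedding module: state of the entity at the current moment, here
        the set of entities it has been related to up to `now`."""
        return {other for t, _relation, other in self.memory[entity] if t <= now}
```

A short usage example: after recording that System 2 is related to System 1 at 02:00 and to System 3 at 04:00 and calling `update_memory()`, `embed("System 2", "02:00")` contains only System 1, while `embed("System 2", "04:00")` contains both.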
In yet another possible design of the first aspect, the method further comprises:
And dynamically updating the time sequence knowledge graph and the time sequence graph network model when the batch system and the abnormal event change with time.
In a second aspect, an embodiment of the present application provides an event impact determining apparatus for a batch system, including:
The relationship determining module is used for determining whether an association relationship exists between the batch systems and determining the existence time of the association relationship;
The relationship construction module is used for taking the batch systems as entities in the time sequence knowledge graph and constructing the relationship among the entities in the time sequence knowledge graph based on whether the association relationship exists among the batch systems and the existence time of the association relationship;
The abnormal acquisition module is used for acquiring an abnormal batch system with abnormal events at the current moment;
And the influence determining module is used for determining a batch system which is related to the abnormal batch system at the current moment and/or after the current moment in all batch systems according to the time sequence knowledge graph, and taking the batch system as the abnormal influence system.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, and a memory communicatively coupled to the processor; the memory stores computer-executable instructions; the processor executes the computer-executable instructions stored in the memory to implement the method as described above.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having stored therein computer-executable instructions for performing a method as described above when executed by a processor.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements the method described above.
According to the method, device, equipment and storage medium for determining the event influence of a batch system provided by the embodiments of the present application, the association relationships between batch systems are represented by a knowledge graph and a time dimension is further introduced, so that changes in the relationships between batch systems can be better represented, complex event scenarios can be better analyzed and processed, and a more accurate prediction of the event influence can be made.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application;
FIG. 1 is a schematic flow chart of a method for determining the event influence of a batch system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a knowledge graph provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of a time sequence knowledge graph at time T1 according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a time sequence knowledge graph at time T2 according to an embodiment of the present application;
FIG. 5 is a flowchart of a method for determining event influence based on a time sequence knowledge graph according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an event influence determining apparatus of a batch system according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
With the development of cloud services and micro-service architectures, the number of systems keeps increasing and the association relationships between systems become more and more complex. The processing of a service often needs to flow through a plurality of systems, and when one system is abnormal, the normal processing of the service can be affected, so that other related systems also become abnormal. For this reason, when an event occurs, operation and maintenance personnel need to handle it rapidly and evaluate its influence comprehensively. As the numbers of systems and events keep growing, manual analysis of the influence of events is limited by the expertise of the operation and maintenance personnel and is unlikely to be the direction of future emergency handling. For an online system, the real-time online transaction success rate after an event occurs is a standard that reflects whether the system is normal: a normal success rate indirectly indicates that none of the systems on the transaction link is abnormal, and the operating condition of the online system can be predicted fairly easily by monitoring the transaction curve with a suitable algorithm. A batch system, however, is less time-critical than an online system. After an event occurs, the clearing result file is generated only at the end of the day, and only then can the correctness of the clearing result file reflect whether the system operated normally. Moreover, the final clearing result file is usually formed by summarizing the clearing results of a plurality of batch systems, so an abnormal clearing result in one batch system may cause abnormalities in one or even several other batch systems. For this reason, how to quickly evaluate the influence of an event on the clearing results of the individual batch systems is a major issue in batch system emergency handling.
In the related art, the following three modes are provided for monitoring the influence of events.
Mode (1): a centralized monitoring event influence determination method based on a configuration management database (CMDB), comprising: ① building a hierarchical model; ② constructing comparison matrices for each layer; ③ determining weight vectors and performing a consistency check; ④ determining the combined weight vector and performing a combined consistency check. In mode (1), the CMDB contains information on the full life cycle of the configuration items and the physical relationships among the configuration items, but it does not reflect well the calling relationships among the systems or the transaction flow direction of the service, so accurate analysis of the event influence cannot be achieved through the CMDB alone.
Mode (2): an anomaly sensing method based on ensemble learning, in which the historical data is divided into data fragments according to a specified fragmentation rule, an optimal threshold boundary is acquired on each data fragment through ensemble learning, and monitoring rule parameters are generated in batches. The input is the historical data of the monitored object, and the output is combined rule information such as the monitoring threshold and the backtracking duration of each index of the object in each time zone. The method of mode (2) is described using an online system as an example and cannot be applied directly to a batch system because of the differences between the two. When a clearing system is abnormal, its influence on another system is not real-time: when system A fails, system B may not be affected immediately, but if the fault has still not been resolved by a certain time point, system B is then affected.
Mode (3): a system fault early warning method based on the system call chain, the core of which has two points: ① acquiring the calling relationships between systems; ② performing early warning of system faults through the relationships among the systems. In mode (3), the calling relationships between the systems are collected and the degree of correlation between the systems is evaluated using Dijkstra distances, so that the state of the systems as a whole over a future period of time is predicted. In the case of batch systems, however, two systems may have no calling relationship and yet lie in the same transaction flow direction, so that when one system is abnormal, the other is in fact affected.
In view of the above problems, the present application provides a method, apparatus, device and storage medium for determining the event influence of a batch system, which can infer, from the scenario at the time an event occurs, the influence on each associated batch system at future times. Compared with a CMDB, a knowledge graph can better show the complex relationships between systems. Introducing a time dimension on this basis forms a time sequence knowledge graph, which can better represent the dynamic relationships among batch systems, uncover hidden information in massive numbers of events, better analyze and process complex event scenarios, and make a more accurate prediction of the event influence.
The technical scheme of the application is described in detail through specific embodiments. It should be noted that the following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
FIG. 1 is a schematic flow chart of a method for determining the event influence of a batch system according to an embodiment of the present application. The method may be applied to an operation and maintenance platform so that operation and maintenance personnel can quickly evaluate the influence caused by an event. Taking the operation and maintenance platform as the execution subject, as shown in FIG. 1, the method specifically includes the following steps:
Step S101, determining whether an association relationship exists between the batch systems, and determining the existence time of the association relationship.
In this embodiment, from the system dimension, each cluster, physical machine, virtual machine, database, and system can be regarded as a batch system, and the association between clusters and physical machines, between physical machines and virtual machines, between virtual machines and systems, and between systems and systems represents the association between batch systems.
When determining whether the association relation exists among all batch systems, taking the CMDB as an example, the CMDB contains the information of the full life cycle of the configuration items and the physical relation among the configuration items. The CMDB may be used to learn the relationship between the cluster and the physical machine, between the physical machine and the virtual machine, between the virtual machine and the system, and between the system and the system, for example, the physical machine W1 belongs to a component of the cluster J1, that is, the association relationship between the physical machine W1 and the cluster J1 is: composition relation.
For example, refer to table 1 below:
Cluster 1            Composition    Physical machine 1
Cluster 2            Composition    Physical machine 2
Physical machine 1   Composition    Virtual machine 1
Physical machine 2   Composition    Database 1
Virtual machine 1    Composition    System 1
Database 1           Composition    System 1
TABLE 1
As shown in Table 1 above, from the physical relationships between the configuration items contained in the CMDB, it can be known that the association relationship between cluster 1 and physical machine 1 is a composition relationship, the association relationship between cluster 2 and physical machine 2 is a composition relationship, and the association relationship between database 1 and system 1 is a composition relationship.
The above-mentioned composition relationship can be regarded as a long-term fixed association relationship. In addition, the association between batch systems may also change over time. For example, in the execution process of a certain service, the system 1 may be required to call the function of the system 2, at this time, an association relationship exists between the system 1 and the system 2, and when the service is executed, the system 1 no longer needs to call the function of the system 2, so that the association relationship between the system 1 and the system 2 disappears.
By way of example, by sorting out the occurrence times of historical events, it is possible to know at which times an association relationship exists between batch systems.
For example, refer to table 2 below:
System 1    Influence    System 2
System 2    Influence    System 3
TABLE 2
As shown in Table 2 above, by introducing a historical event whose occurrence time is the time period T1-T2, it is found that system 1 affects system 2 and system 2 affects system 3, so the relationships among system 1, system 2 and system 3 during the time period T1-T2 can be known.
For example, refer to table 3 below:
Virtual machine 1    Composition    00:00-24:00    System 1
System 2             Influence      02:00          System 1
System 3             Influence      04:00          System 2
TABLE 3
As shown in Table 3 above, by using historical events, the business flow between the batch systems can be grasped, and it can thus be determined at what times the batch systems will have an association relationship.
Step S102, taking the batch systems as entities in the time sequence knowledge graph, and constructing the relationship among the entities in the time sequence knowledge graph based on whether the association relationship exists among the batch systems and the existence time of the association relationship.
In this embodiment, the knowledge graph is a technology for modeling association relationships between everything in the world by using a graph model, and fig. 2 is a schematic diagram of the knowledge graph provided in the embodiment of the present application, where, as shown in fig. 2, the knowledge graph is composed of points and edges, the points represent entities, and the edges represent relationships between the entities. Each entity in fig. 2 needs to be defined.
In this embodiment, each batch system may be regarded as one entity, and then the association relationship between batch systems may be regarded as a connection line between the respective entities. For example, the batch system P1 and the batch system P2 have an association relationship, and then an entity representing the batch system P1 and an entity representing the batch system P2 are connected by edges in the knowledge graph.
In addition, each side in the knowledge graph needs to be time stamped based on the time when the association relationship exists among the batch systems, so that a time sequence knowledge graph is formed. The time when the association relationship exists among the batch systems can be specifically referred to the table 3.
Illustratively, in other embodiments, the relationships between entities may also be characterized in the form of Table 3 above. Specifically, taking virtual machine 1 as one entity in the knowledge graph and system 1 as another entity in the knowledge graph, the association relationship between virtual machine 1 and system 1 is a composition relationship, and the existence time of the composition relationship is 00:00-24:00, that is, all time points of each day.
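As a rough illustration of time-stamped edges, the rows of Table 3 can be stored as edges with an existence interval, from which the graph at a given time point (compare FIG. 3 and FIG. 4) is obtained by filtering. The tuple layout, the function name `snapshot` and the zero-padded "HH:MM" encoding are assumptions made for this sketch.

```python
# Edges of the time sequence knowledge graph, one per row of Table 3:
# (entity, relationship, entity, start of existence time, end of existence time)
EDGES = [
    ("Virtual machine 1", "composition", "System 1", "00:00", "24:00"),
    ("System 2", "influence", "System 1", "02:00", "02:00"),
    ("System 3", "influence", "System 2", "04:00", "04:00"),
]

def snapshot(edges, t):
    """Return the (entity, relationship, entity) edges that exist at time t.

    Zero-padded "HH:MM" strings compare correctly as plain strings.
    """
    return [(head, rel, tail)
            for head, rel, tail, start, end in edges
            if start <= t <= end]
```

With these edges, the snapshot at 02:00 contains the composition edge and the System 2 influence edge, whereas the snapshot at 03:00 contains only the composition edge.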
Step S103, an abnormal batch system generating an abnormal event at the current moment is obtained.
Step S104, determining a batch system which is associated with the abnormal batch system at the current moment and/or after the current moment from all batch systems according to the time sequence knowledge graph as an abnormal influence system.
In this embodiment, the time-series knowledge graph may be used as training data to train to obtain a neural network model. When an abnormal event occurs, the influence of the future time on other batch systems is deduced by using a trained neural network model.
In this embodiment, a batch system in which the current time is associated with the abnormal batch system may be determined from all batch systems based on the time-series knowledge graph of the current time.
Specifically, referring to FIG. 2, the time sequence knowledge graph may record whether an association relationship exists between system F and system D at the current moment; if such a relationship exists and system F is the abnormal batch system at the current moment, then system D is the corresponding abnormal influence system.
In other embodiments, the batch systems associated with the abnormal batch system at and/or after the current moment may also be determined based on Table 3 above. Specifically, as shown in Table 3, if the current time is 02:00 and an abnormal event occurs in virtual machine 1, it can be determined from the time information recorded in Table 3 that system 1, which has a composition relationship with virtual machine 1, remains affected at all subsequent times, so system 1 is regarded as an abnormal influence system. System 2 and system 3 have no association relationship with virtual machine 1 at the current moment and are not affected, that is, they do not belong to the abnormal influence systems.
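The Table 3 lookup described above can be sketched as follows. This is a simplified illustration under stated assumptions: the first column of each row is taken to be the entity whose abnormality affects the entity in the last column, only direct associations are considered (no multi-hop propagation), and times are zero-padded "HH:MM" strings. The function name `affected_systems` is hypothetical.

```python
# Rows of Table 3: (entity, relationship, (start, end) of existence time, entity)
TABLE_3 = [
    ("Virtual machine 1", "composition", ("00:00", "24:00"), "System 1"),
    ("System 2", "influence", ("02:00", "02:00"), "System 1"),
    ("System 3", "influence", ("04:00", "04:00"), "System 2"),
]

def affected_systems(rows, abnormal, now):
    """Return the systems directly associated with `abnormal` at or after `now`."""
    affected = set()
    for head, _relation, (_start, end), tail in rows:
        # The association counts if it has not yet ended at the current time.
        if head == abnormal and end >= now:
            affected.add(tail)
    return affected
```

For the scenario in the text, `affected_systems(TABLE_3, "Virtual machine 1", "02:00")` contains System 1 only; System 2 and System 3 are not returned because no row associates them with virtual machine 1.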
According to the embodiment of the application, the association relationships between batch systems are represented by the knowledge graph and a time dimension is further introduced, so that changes in the relationships between batch systems can be better represented, complex event scenarios can be better analyzed and processed, and a more accurate prediction of the event influence can be made.
In some embodiments, the step S101 may be specifically implemented by the following steps: acquiring a configuration management database, a historical event, a batch system time sequence, system parameters and a calling relationship of a batch system; and determining whether an association exists between the batch systems based on at least one of the configuration management database, the historical event, the batch system timing, the system parameters, and the call relationship.
In this embodiment, the configuration management database contains all relevant information about the components of the information systems used by the IT services of an organization, as well as the relationships between these components. Here, the organization may refer to the group that builds the entire micro-service architecture, and the operation and maintenance personnel are part of that group. Each entity can be defined based on the configuration management database, that is, which batch systems are regarded as the same entity and which as different entities.
In addition, based on the configuration management database, it may also be determined whether a composition relationship exists between the entities, for example, taking the cluster 1 as one entity and the physical machine 1 as another entity, where the physical machine 1 belongs to the cluster 1, and the association relationship between the physical machine 1 and the cluster 1 is: composition relation.
In this embodiment, the history event may refer to an event that occurs before the current time, and whether there is an association relationship between the batch systems may also be determined by the batch systems affected by the event after the event occurs.
In this embodiment, through the batch system timing, it may be obtained whether there is an association relationship between each batch system at a certain time point or time period.
In this embodiment, the system parameter may be used to describe the attribute of each batch system, and based on the attribute of each system, it may be further determined whether there is an association relationship between each batch system.
In this embodiment, the call relationship may describe whether there is a call between the batch systems when an event is executed, for example, call a function of a batch system, call data of a batch system, and when there is a call relationship between batch systems, there is a corresponding association relationship.
According to the embodiment of the application, whether the association relationship exists among all batch systems is determined, so that the complex relationship among the systems can be shown, when an abnormal event occurs to a certain batch system, the batch system with the association relationship can be quickly found, whether the associated batch systems are affected or not is further analyzed, and the most accurate prediction on the event influence is made.
Further, in other embodiments, the association relationship may be classified according to actual requirements, specifically into composition relationships and influence relationships. For example, if batch systems need to call each other, the relationship between them is described by the call relationship.
When classifying the association relationship into the composition relationship and the influence relationship, whether the association relationship exists between the respective batch systems can be determined by:
Step A: determining whether the first batch system is attributed to the second batch system and/or whether there is a call between the first batch system and the second batch system based on at least one of the configuration management database, the historical event, the batch system timing, the system parameters, and the call relationship;
Step B: if the first batch system belongs to the second batch system, determining that the first batch system and the second batch system have an association relationship, wherein the association relationship is a composition relationship;
Step C: if the first batch system is not attributed to the second batch system and a calling relationship exists between the first batch system and the second batch system, determining that an association relationship exists between the first batch system and the second batch system, wherein the association relationship is an influence relationship.
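Steps A to C can be sketched as a small decision function. This is a minimal illustration only: the function name `classify_association` and the two boolean inputs are assumptions introduced here, standing in for whatever the configuration management database and call-relationship lookup return in practice.

```python
from typing import Optional


def classify_association(belongs_to: bool, has_call: bool) -> Optional[str]:
    """Return the association type between a first and a second batch system.

    belongs_to: result of Step A, whether the first system is attributed to the second.
    has_call:   result of Step A, whether a call exists between the two systems.
    """
    if belongs_to:
        # Step B: attribution yields a composition relationship.
        return "composition"
    if has_call:
        # Step C: no attribution but a call exists, so it is an influence relationship.
        return "influence"
    # Neither condition holds: no association relationship is recorded.
    return None
```

Checking attribution before calls mirrors the wording of Step C, which applies only when the first batch system is not attributed to the second.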
In this embodiment, the entities may be defined by means of the configuration management database. Each cluster, physical machine, virtual machine, database, and system is an entity in the system dimension, and each entity is regarded as a batch system. From the configuration management database it can be obtained whether a composition relationship exists between entities; for example, a composition relationship exists between physical machine 1 and cluster 1, and no composition relationship exists between physical machine 1 and cluster 2.
In this embodiment, whether there is a call between batch systems may be determined directly by looking up the call relationship. For example, if there is a call between system 1 and system 2, then system 1 and system 2 are a pair of batch systems that influence each other, and the association relationship between system 1 and system 2 can be regarded as an influence relationship.
According to the embodiment of the application, the association relation between the batch systems is classified to distinguish whether the batch systems are the influence relation or the composition relation, so that when an event occurs in an abnormal batch system, the influence on the associated batch system can be more intuitively and rapidly determined based on the subdivided association relation between the event and each batch system.
Further, based on the above embodiments, in some embodiments, the time when the association relationship exists between the batch systems may be determined by: if the batch systems are in a composition relationship, determining the existence time of the association relationship between the batch systems as each time point; and if the batch systems are in an influence relationship, acquiring the calling time between the first batch system and the second batch system, and taking the calling time as the existing time of the association relationship.
In the present embodiment, for batch systems in a composition relationship, such as virtual machine 1 and system 1 described above, the relationship is a long-term fixed relationship that does not depend on time. The composition relationship between virtual machine 1 and system 1 exists at every time point, so the existence time of the association relationship is every time point.
In this embodiment, there is no composition relationship between some batch systems, and only when executing a service, a certain batch system needs to be called, so that there is an association relationship between the batch systems, and this association relationship has been mentioned as an influence relationship. The time of existence of the influence relation is from the start time of the call to the end time of the call.
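The rule for the existence time can be illustrated with a short sketch. The function name, the `"HH:MM"` string encoding, and the use of `("00:00", "24:00")` to stand for "every time point" are assumptions made for this example; the description itself does not prescribe a data format.

```python
from typing import Optional, Tuple

# Assumed encoding of "every time point" for a composition relationship.
ALL_DAY: Tuple[str, str] = ("00:00", "24:00")


def existence_time(
    relation: str,
    call_start: Optional[str] = None,
    call_end: Optional[str] = None,
) -> Tuple[str, str]:
    """Return the (start, end) interval during which the association exists."""
    if relation == "composition":
        # A composition relationship is fixed, so it holds at every time point.
        return ALL_DAY
    if relation == "influence":
        # An influence relationship lasts from the start to the end of the call.
        if call_start is None or call_end is None:
            raise ValueError("an influence relationship needs call start and end times")
        return (call_start, call_end)
    raise ValueError(f"unknown relation type: {relation!r}")
```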
According to the embodiment of the application, the time dimension is introduced into the association relation among the batch systems, so that the dynamic relation among the batch systems can be better represented, the batch systems with the association relation in some time periods or time points can be found out, and hidden information in massive events can be found out, so that complex event scenes can be better analyzed and processed, and the judgment accuracy of the influence on the events can be improved.
In other embodiments, the existing clearing job automation system may be used to mine the existence time of the association relationship between batch systems. Specifically, based on the working cycle of the job automation system formed by all batch systems, the time point and/or time period at which the association relationship exists between the first batch system and the second batch system within the working cycle is acquired and taken as the existence time of the association relationship.
In this embodiment, all batch systems under the micro-service architecture share a common working cycle, e.g., a single day. After the first working cycle ends, each batch system generates a clearing result file for the first working cycle, then waits for the arrival of the second working cycle, and generates a clearing result file for the second working cycle after the second working cycle ends.
Assuming that N working cycles (N is a positive integer greater than 1) have elapsed before the current time, if an association relationship exists between the first batch system and the second batch system at the time point T0 or in the time period T1-T2 in each working cycle, it may be determined that the existence time of the association relationship between the first batch system and the second batch system is the time point T0 or the time period T1-T2.
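As a rough sketch of this mining step, the time points observed in each past working cycle can be intersected, keeping only those at which the association appeared in every cycle. The function name `recurring_times` and the representation of each cycle as a set of `"HH:MM"` strings are assumptions for illustration.

```python
from typing import List, Set


def recurring_times(cycles: List[Set[str]]) -> Set[str]:
    """Keep the time points at which the association was observed in every cycle.

    cycles: one set of observed time points per elapsed working cycle.
    """
    if not cycles:
        return set()
    common = set(cycles[0])
    for observed in cycles[1:]:
        # A time point survives only if it was also seen in this cycle.
        common &= observed
    return common
```

For example, with observations `{"02:00", "04:00"}`, `{"02:00"}` and `{"02:00", "05:00"}` over three cycles, only `"02:00"` is retained as the existence time.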
Taking the time T1 as an example, fig. 3 is a schematic diagram of the time sequence knowledge graph at the time T1 provided by the embodiment of the present application. As shown in fig. 3, at the time T1, an influence relationship exists between system F and system D, between system D and system C, between system C and system B, between system B and database a, between database B and system B, between system a and system C, and between system E and database B.
The edges in the time sequence knowledge graph can change dynamically. Fig. 4 is a schematic diagram of the time sequence knowledge graph at the time T2 in the embodiment of the present application. As shown in fig. 4, at the time T2, there is no influence relationship between database b and system a, and the edge connecting database b and system a disappears.
Further, in other embodiments, the existence time of the association relationship between the first batch system and the second batch system in each of the N operation cycles that have been performed before the current time may be used as training data to train the neural network model, and then the existence time of the association relationship between the first batch system and the second batch system in the next operation cycle after the current time may be predicted based on the trained neural network model.
According to the embodiment of the application, the job automation system is used to mine the existence time of the association relationship between batch systems, so that the existence time can be determined more accurately. This improves the accuracy of the time sequence knowledge graph and ensures that the batch systems affected by an event can subsequently be found accurately from the time sequence knowledge graph.
In some embodiments, after the definition of the entities and edges in the time sequence knowledge graph is completed and the time dimension is introduced, an appropriate representation method is selected to convert the entities, relationships and time into a form which can be understood and processed by a computer. These representations are intended to capture semantic relationships between entities so that computer systems can better understand and infer information about the relationships between the entities. For example, each entity in the time sequence knowledge graph may be described using a quadruple.
The quadruple is used to characterize the relationship between the entity and another entity at time t.
In this embodiment, the time sequence knowledge graph is denoted as $s$, which is a directed graph describing the relationships between entities. Each relationship between entities is a time-stamped edge, and $s_i(t)$ may be formalized as a quadruple $(e_i, r, e_j, t)$ that describes the relationship $r$ between the batch system $e_i$ and the batch system $e_j$ at time $t$.
Illustratively, referring to table 3 above, table 3 above may be described as:
( "virtual machine 1", "composition", "system 1", "00:00-24:00" )
( "System 2", "influence", "System 1", "02:00" )
( "System 3", "influence", "System 2", "04:00" )
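The quadruple representation can be sketched in a few lines. The `Quad` type, the helper names, and the parsing of the time field as either `"HH:MM"` or `"HH:MM-HH:MM"` are assumptions chosen to match the three example quadruples above, not a format defined by the application.

```python
from typing import List, NamedTuple


class Quad(NamedTuple):
    head: str      # batch system e_i
    relation: str  # "composition" or "influence"
    tail: str      # batch system e_j
    time: str      # "HH:MM" time point or "HH:MM-HH:MM" time period


def _minutes(hhmm: str) -> int:
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def active_at(quad: Quad, hhmm: str) -> bool:
    """True if the quadruple's time point or period covers the given time."""
    t = _minutes(hhmm)
    if "-" in quad.time:
        start, end = quad.time.split("-")
        return _minutes(start) <= t <= _minutes(end)
    return _minutes(quad.time) == t


def edges_at(quads: List[Quad], hhmm: str) -> List[Quad]:
    """Snapshot of the time sequence knowledge graph at one time point."""
    return [q for q in quads if active_at(q, hhmm)]


QUADS = [
    Quad("virtual machine 1", "composition", "system 1", "00:00-24:00"),
    Quad("system 2", "influence", "system 1", "02:00"),
    Quad("system 3", "influence", "system 2", "04:00"),
]
```

Calling `edges_at(QUADS, "02:00")` keeps the composition edge, which holds all day, together with the influence edge from system 2 to system 1.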
Further, after converting the entity, the relation and the time into the quadruple, the abnormal influence system can be searched by the following steps:
Step (1): acquiring the historical quadruples describing the abnormal entity. The abnormal entity represents the abnormal batch system, and a historical quadruple represents the relationship between the abnormal entity and another entity before the current moment.
Step (2): predicting the current quadruple of the abnormal entity based on the historical quadruples. The current quadruple represents the relationship between the abnormal entity and another entity at the current moment and/or after the current moment.
Step (3): determining, based on the current quadruple, the other entities that have a relationship with the abnormal entity at the current moment and/or after the current moment as abnormal influence entities.
Step (4): acquiring the batch systems represented by the abnormal influence entities in the time sequence knowledge graph as the abnormal influence systems.
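The four steps can be traced end to end with a deliberately naive stand-in for the prediction step. This sketch does not use a neural network: `predict_current` simply assumes that a relationship seen at least `min_count` times in the history recurs at the current time. That rule, all function names, and the integer time steps are assumptions made purely for illustration.

```python
from collections import Counter
from typing import List, Set, Tuple

# (head, relation, tail, time), with time as an integer step for simplicity.
Quad = Tuple[str, str, str, int]


def history_of(quads: List[Quad], entity: str, now: int) -> List[Quad]:
    """Step (1): quadruples before `now` that involve the abnormal entity."""
    return [q for q in quads if q[3] < now and entity in (q[0], q[2])]


def predict_current(history: List[Quad], now: int, min_count: int = 2) -> List[Quad]:
    """Step (2), naive stand-in: relationships seen at least `min_count`
    times in the history are predicted to hold again at `now`."""
    counts = Counter((h, r, t) for h, r, t, _ in history)
    return [(h, r, t, now) for (h, r, t), c in counts.items() if c >= min_count]


def affected_entities(current: List[Quad], entity: str) -> Set[str]:
    """Step (3): the other endpoint of every predicted quadruple."""
    return {t if h == entity else h for h, _, t, _ in current}


def abnormal_influence_systems(quads: List[Quad], entity: str, now: int) -> Set[str]:
    """Step (4): entities and batch systems share names in this sketch."""
    history = history_of(quads, entity, now)
    return affected_entities(predict_current(history, now), entity)
```

In a real deployment the frequency rule would be replaced by the trained model described in the following paragraphs; the surrounding steps would stay the same.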
In this embodiment, the abnormal influence systems affected at the current time and/or after the current time may be obtained through knowledge reasoning, i.e., influence analysis of the event: when a new event is generated, its influence in a future time period is predicted through knowledge reasoning.
The historical quadruples are denoted by $s_i(t)$ and the current time by $t_s$. Given the set of historical quadruples before the time $t_s$, the goal is to predict the missing entity $e_?$ in $(e_i, r, e_?, t_s)$.
The historical quadruple can be used as training data to train the neural network model continuously. And then predicting the current moment and/or the current quadruple after the current moment by using the trained neural network model.
For example, in some embodiments, an initial timing diagram network model may be constructed, the initial timing diagram network model may be trained based on the historical quadruples, a trained timing diagram network model may be obtained, and a current quadruple of the abnormal entity may be predicted based on the trained timing diagram network model. The time sequence diagram network model is used for acquiring the time change condition of the relation among all the entities in the time sequence knowledge graph.
In this embodiment, the timing diagram network (Temporal Graph Networks, TGN) is a graph neural network model dedicated to processing timing diagram data and includes time-specific reasoning mechanisms. A TGN can capture how the nodes and edges of a timing diagram evolve over time and performs inference through temporal attention mechanisms and temporal embeddings. The timing diagram is the time sequence knowledge graph of the application, the nodes of the timing diagram are the entities of the time sequence knowledge graph, and the edges between nodes are the edges between entities.
Further, the timing diagram network model at least comprises a memory module, an updating function module, an aggregation module, a memory updating module and an embedding module. The memory module is used for storing state information of each entity, wherein the state information comprises the relation between the entity and other entities at each historical time point.
In this embodiment, the memory module is configured to store the state information of all nodes as a compressed representation of each node's past interactions. Each node i has a separate state vector $s_i(t)$, and when a new node appears, its state is initialized to the zero vector.
In this embodiment, the update function module is configured to determine update information of the first entity and update information of the second entity when the first entity and the second entity have a relationship at time t, where the update information is used for updating the state information.
The update function module is an important mechanism for updating the memory. Assuming that node i and node j have an association at time t, the update function computes two messages for updating the memory, $m_i(t)$ for node i and $m_j(t)$ for node j:
$m_i(t) = \mathrm{msg}(s_i(t^-), s_j(t^-), t, e_{ij}(t))$
$m_j(t) = \mathrm{msg}(s_j(t^-), s_i(t^-), t, e_{ij}(t))$
In this embodiment, the aggregation module is configured to aggregate update information generated by each entity before the current time.
To improve efficiency, data is generally processed in batches, so several events in the same batch may involve the same node i. Since each event generates a message, the messages generated at times $t_1, t_2, \ldots, t_b \le t$, namely $m_i(t_1), m_i(t_2), \ldots, m_i(t_b)$, need to be aggregated as $\mathrm{agg}(m_i(t_1), m_i(t_2), \ldots, m_i(t_b))$,
where agg is an aggregation function.
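The message and aggregation steps can be sketched in plain Python. The concrete choices here are assumptions, since the description leaves both functions open: `msg` is a simple concatenation of its inputs, `agg_latest` keeps only the most recent message, and state vectors are ordinary lists of floats.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Event = Tuple[str, str, float, Vector]  # (node i, node j, time t, edge features e_ij)


def msg(s_self: Vector, s_other: Vector, t: float, e_ij: Vector) -> Vector:
    """Message as a plain concatenation of the inputs (an assumed, simple choice)."""
    return s_self + s_other + [t] + e_ij


def agg_latest(messages: List[Tuple[float, Vector]]) -> Vector:
    """Aggregate a node's (time, message) pairs by keeping the latest message."""
    return max(messages, key=lambda pair: pair[0])[1]


def batch_messages(memory: Dict[str, Vector], events: List[Event]) -> Dict[str, Vector]:
    """Compute one aggregated message per node touched by a batch of events."""
    pending: Dict[str, List[Tuple[float, Vector]]] = {}
    for i, j, t, e_ij in events:
        # States are read from memory as they were before the batch, i.e. s(t^-).
        s_i, s_j = memory[i], memory[j]
        pending.setdefault(i, []).append((t, msg(s_i, s_j, t, e_ij)))
        pending.setdefault(j, []).append((t, msg(s_j, s_i, t, e_ij)))
    return {node: agg_latest(msgs) for node, msgs in pending.items()}
```

Averaging the messages would be an equally valid aggregation; keeping the latest one is just the shortest option to write down.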
In this embodiment, the memory update module is configured to update the state information of the memory module according to the newly generated update information.
Wherein,
In this embodiment, the embedding module is configured to generate state information of the entity at the current time and/or after the current time, that is, to generate a state prediction $z_i(t)$ of the node i at the time t.
After $z_i(t)$ is calculated, it can be concluded how many batch systems will be affected in future time periods when batch system i fails.
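For orientation, the memory update and embedding steps are commonly written as follows in published descriptions of temporal graph networks. The symbols $\bar{m}_i(t)$, $\mathrm{mem}$ and $\mathrm{emb}$ are assumptions borrowed from that general formulation and are not taken from this application, whose own formulas for these two modules are not reproduced in the text above.

```latex
% Aggregated message for node i over the current batch (assumed notation)
\bar{m}_i(t) = \mathrm{agg}\bigl(m_i(t_1), \ldots, m_i(t_b)\bigr)

% Memory update: new state from the aggregated message and the previous state
s_i(t) = \mathrm{mem}\bigl(\bar{m}_i(t),\; s_i(t^-)\bigr)

% Embedding: state prediction of node i at time t
z_i(t) = \mathrm{emb}(i, t)
```

In such formulations $\mathrm{mem}$ is typically a learnable recurrent unit, and $\mathrm{emb}$ combines the state of node i with the states of its temporal neighbours; either choice would need to be confirmed against the actual model used.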
According to the embodiment of the application, a timing diagram network model for knowledge reasoning is constructed. When an event is generated, the influence of the event in a future time period is predicted through knowledge reasoning, and the number of batch systems affected in that period is determined. This saves the time needed for manual analysis and judgment, improves event processing efficiency, and helps ensure stable operation of the production environment.
In other embodiments, because the batch systems and events change over time, the time sequence knowledge graph also changes dynamically. Therefore, when the batch systems and abnormal events change over time, the time sequence knowledge graph and the timing diagram network model need to be updated periodically to ensure the accuracy and practicability of the model.
Fig. 5 is a flowchart of a method for determining event influence based on a time sequence knowledge graph according to an embodiment of the present application. As shown in fig. 5, the method may specifically include the following steps: ① Defining entities and relationships. ② Introducing a time dimension. ③ Knowledge representation. ④ Knowledge reasoning (TGN). ⑤ Time-dependent analysis. ⑥ Unknown entities/events. ⑦ Events that have occurred. ⑧ Conclusion. ⑨ Regular updating and maintenance.
The time sequence knowledge graph can be constructed by five steps, namely ①,②,③,④,⑨.
For ①, the knowledge graph is a technical method for modeling the association relationship between everything in the world by using a graph model, and consists of points and edges, wherein the points represent entities, and the edges represent the relationship between the entities, so that the first step of building the knowledge graph is to define the entities and the relationship.
For ②, a time dimension is introduced between each entity and the relationship, i.e., determining their relationship at a point in time or within a period of time.
For ③, after the definition of the entity and the relationship is completed and the time dimension is introduced, an appropriate representation method is selected to convert the entity, the relationship and the time into a form which can be understood and processed by a computer. These representations are intended to capture semantic relationships between entities so that computer systems can better understand and infer information about the relationships between the entities.
For ④, knowledge reasoning, i.e., impact analysis of events, i.e., when a new event is generated, the impact over a future time period is predicted by knowledge reasoning.
For ⑨, since the batch system and events change over time, the time series knowledge graph is dynamic and needs to be updated periodically to ensure accuracy and practicality of the model.
According to the embodiment of the application, the influence of the batch system events is inferred and analyzed by building the time sequence knowledge graph. And constructing an omnibearing batch system time sequence knowledge graph through the elements such as the CMDB, the call relation among batch systems, the transaction flow direction and the like. And meanwhile, by combining the characteristics of the batch system, a time sequence knowledge graph is formed by introducing the time sequence relation of the batch system and the time sequence relation of the historical event occurrence time, the model is trained, and when an abnormal event occurs, the influence of the future time on other systems is deduced by using the model, so that the accurate analysis of the influence of the batch system event is realized.
The following are examples of the apparatus of the present application that may be used to perform the method embodiments of the present application. For details not disclosed in the embodiments of the apparatus of the present application, please refer to the embodiments of the method of the present application.
Fig. 6 is a schematic structural diagram of an event impact determining apparatus of a batch system according to an embodiment of the present application, where the apparatus may be applied to electronic devices such as a computer. As shown in fig. 6, the event impact determination apparatus 600 of the batch system includes a relationship determination module 610, a relationship construction module 620, an anomaly acquisition module 630, and an impact determination module 640.
The relationship determining module 610 is configured to determine whether there is an association relationship between each batch system and a time for which the association relationship exists.
The relationship construction module 620 is configured to take the batch systems as entities in the time sequence knowledge graph, and construct relationships between the entities in the time sequence knowledge graph based on whether there is an association relationship between the batch systems and the existence time of the association relationship.
The anomaly acquisition module 630 is configured to acquire an abnormal batch system in which an abnormal event occurs at the current time.
The influence determining module 640 is configured to determine, from all batch systems and according to the time sequence knowledge graph, the batch systems associated with the abnormal batch system at and/or after the current time as abnormal influence systems.
Optionally, the relationship determination module may specifically be configured to: acquiring a configuration management database, a historical event, a batch system time sequence, system parameters and a calling relationship of a batch system; and determining whether an association exists between the batch systems based on at least one of the configuration management database, the historical event, the batch system timing, the system parameters, and the call relationship.
Optionally, the relationship determination module may specifically be configured to: determining whether the first batch system is attributed to the second batch system and/or whether there is a call between the first batch system and the second batch system based on at least one of the configuration management database, the historical event, the batch system timing, the system parameters, and the call relationship; if the first batch system belongs to the second batch system, determining that the first batch system and the second batch system have an association relationship, wherein the association relationship is a composition relationship; if the first batch system is not attributed to the second batch system and a calling relationship exists between the first batch system and the second batch system, determining that an association relationship exists between the first batch system and the second batch system, wherein the association relationship is an influence relationship.
Optionally, the relationship determination module may specifically be configured to: if the batch systems are in a composition relationship, determining the existence time of the association relationship between the batch systems as each time point; and if the batch systems are in an influence relationship, acquiring the calling time between the first batch system and the second batch system, and taking the calling time as the existing time of the association relationship.
Optionally, the relationship determination module may specifically be configured to: and acquiring a time point and/or a time period of the association relation between the first batch system and the second batch system in the working period based on the working period of the operation automation system formed by all batch systems, and taking the time point and/or the time period as the existence time of the association relation.
Optionally, the apparatus further comprises an entity description module for describing each entity in the time sequence knowledge graph by using a quadruple, wherein the quadruple is used for representing the relationship between the entity and other entities at the time t.
Optionally, the impact determination module may specifically be configured to: acquiring a historical quadruple describing an abnormal entity, wherein the abnormal entity is used for representing an abnormal batch system; predicting a current quadruple of the abnormal entity based on the historical quadruple; based on the current quadruple, determining other entities that have a relationship with the abnormal entity at the current moment and/or after the current moment as abnormal influence entities; and acquiring a batch system represented by the abnormal influence entity in the time sequence knowledge graph as an abnormal influence system.
The historical quadruple is used for representing the relationship between the abnormal entity and other entities before the current moment, and the current quadruple is used for representing the relationship between the abnormal entity and other entities at the current moment and/or after the current moment.
Optionally, the impact determination module may specifically be configured to: constructing a timing diagram network model based on the historical quadruple; and predicting the current quadruple of the abnormal entity based on the timing diagram network model. The timing diagram network model is used for acquiring how the relationships among the entities in the time sequence knowledge graph change over time.
Optionally, the time sequence diagram network model at least comprises a memory module, an updating function module, an aggregation module, a memory updating module and an embedding module. The memory module is used for storing state information of each entity, wherein the state information comprises the relation between the entity and other entities at each historical time point. The update function module is used for determining update information of the first entity and update information of the second entity when the first entity and the second entity have a relation at the time t, wherein the update information is used for updating the state information. The aggregation module is used for aggregating the update information generated by each entity before the current moment. The memory updating module is used for updating the state information of the memory module according to the newly generated updating information. The embedding module is used for generating state information of the entity at the current moment.
Optionally, the apparatus further comprises an updating module for dynamically updating the time sequence knowledge graph and the timing diagram network model when the batch systems and the abnormal events change over time.
The device provided by the embodiment of the application can be used for executing the method in the embodiment shown above, and the implementation principle and technical effects are similar, and are not repeated here.
It should be understood that the division of the above apparatus into modules is merely a division of logical functions; in an actual implementation the modules may be fully or partially integrated into one physical entity or may be physically separated. These modules may all be implemented as software called by a processing element, or all be implemented in hardware; alternatively, some modules may be implemented as software called by a processing element while other modules are implemented in hardware. For example, the relationship determining module may be a separately arranged processing element, may be integrated in a chip of the above apparatus, or may be stored in a memory of the above apparatus in the form of program code, with the functions of the relationship determining module being called and executed by a processing element of the above apparatus. The implementation of the other modules is similar. In addition, all or part of the modules can be integrated together or implemented independently. The processing element here may be an integrated circuit with signal processing capabilities. In implementation, each step of the above method or each of the above modules may be completed by an integrated logic circuit of hardware in a processor element or by instructions in the form of software.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 7, the electronic device 700 includes: at least one processor 701, memory 702, bus 703, and communication interface 704. The processor, the communication interface and the memory are in communication with each other through the bus. The communication interface is used for communicating with other devices. The communication interface comprises a communication interface for data transmission, a display interface or an operation interface for human-computer interaction, and the like. A processor for executing computer-executable instructions, and in particular for performing the relevant steps of the methods described in the above embodiments.
The processor may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present application. The one or more processors included in the electronic device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs. The memory is used for storing computer-executable instructions. The memory may comprise high-speed RAM and may also comprise non-volatile memory, such as at least one disk memory.
The present embodiment also provides a computer-readable storage medium having stored therein computer instructions which, when executed by at least one processor of an electronic device, perform the event impact determination method of a batch system provided by the above-described various embodiments.
The present embodiment also provides a computer program product comprising computer instructions stored in a readable storage medium. At least one processor of an electronic device can read the computer instructions from the readable storage medium, and execution of the computer instructions by the at least one processor causes the electronic device to implement the event impact determination method of a batch system provided by the various embodiments described above.
In the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A alone, both A and B, and B alone, wherein A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship; in a formula, the character "/" indicates that the associated objects before and after it are in a "division" relationship. "At least one of" the following items or a similar expression means any combination of these items, including any combination of a single item or a plurality of items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or plural.
It will be appreciated that the various numbers referred to in the embodiments of the present application are merely for ease of description and are not intended to limit the scope of the embodiments of the present application. In the embodiments of the present application, the sequence numbers of the processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and the sequence numbers should not limit the implementation of the embodiments of the present application in any way.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (14)

1. A method for determining the impact of an event in a batch system, the method comprising:
determining whether an association relationship exists between the batch systems, and determining the existence time of the association relationship;
taking the batch systems as entities in a time sequence knowledge graph, and constructing the relationship among the entities in the time sequence knowledge graph based on whether the relationship exists among the batch systems and the existence time of the relationship;
Acquiring an abnormal batch system with abnormal events at the current moment;
and determining a batch system which is associated with the abnormal batch system at the current moment and/or after the current moment from all batch systems according to the time sequence knowledge graph, and taking the batch system as an abnormal influence system.
2. The method of claim 1, wherein determining whether an association relationship exists between the batch systems comprises:
acquiring a configuration management database, a historical event, a batch system time sequence, system parameters, and a calling relationship of the batch systems;
and determining whether an association relationship exists between the batch systems based on at least one of the configuration management database, the historical event, the batch system time sequence, the system parameters, and the calling relationship.
3. The method of claim 2, wherein determining whether an association relationship exists between the batch systems based on at least one of the configuration management database, the historical event, the batch system time sequence, the system parameters, and the calling relationship comprises:
determining, based on at least one of the configuration management database, the historical event, the batch system time sequence, the system parameters, and the calling relationship, whether a first batch system belongs to a second batch system and/or whether a calling relationship exists between the first batch system and the second batch system;
if the first batch system belongs to the second batch system, determining that an association relationship exists between the first batch system and the second batch system, and that the association relationship is a composition relationship;
if the first batch system does not belong to the second batch system and a calling relationship exists between the first batch system and the second batch system, determining that an association relationship exists between the first batch system and the second batch system, and that the association relationship is an influence relationship.
4. The method of claim 3, wherein determining the existence time of the association relationship between the batch systems comprises:
if the batch systems are in a composition relationship, determining that the association relationship between the batch systems exists at every time point;
and if the batch systems are in an influence relationship, acquiring the calling time between the first batch system and the second batch system, and taking the calling time as the existence time of the association relationship.
5. The method of claim 1, wherein determining the existence time of the association relationship between the batch systems comprises:
acquiring, based on the working period of the operation automation system formed by all the batch systems, a time point and/or a time period within the working period at which an association relationship exists between a first batch system and a second batch system, and taking the time point and/or the time period as the existence time of the association relationship.
6. The method of claim 1, wherein the method further comprises:
describing each entity in the time sequence knowledge graph with a quadruple, wherein the quadruple characterizes the relationship between the entity and other entities at a time t.
7. The method of claim 6, wherein determining, from all the batch systems, a batch system that is associated with the abnormal batch system at the current moment and/or after the current moment as an abnormal influence system comprises:
acquiring a historical quadruple describing an abnormal entity, wherein the abnormal entity represents the abnormal batch system, and the historical quadruple represents the relationship between the abnormal entity and other entities before the current moment;
predicting a current quadruple of the abnormal entity based on the historical quadruple, wherein the current quadruple represents the relationship between the abnormal entity and other entities at the current moment;
determining, based on the current quadruple, other entities that have a relationship with the abnormal entity at the current moment and/or after the current moment as abnormal influence entities;
and acquiring the batch system represented by each abnormal influence entity in the time sequence knowledge graph as the abnormal influence system.
8. The method of claim 7, wherein predicting the current quadruple of the abnormal entity based on the historical quadruple comprises:
constructing a time sequence graph network model based on the historical quadruple, wherein the time sequence graph network model is used for capturing how the relationships among the entities in the time sequence knowledge graph change over time;
and predicting the current quadruple of the abnormal entity based on the time sequence graph network model.
9. The method of claim 8, wherein the time sequence graph network model comprises at least a memory module, an update function module, an aggregation module, a memory update module, and an embedding module;
the memory module is used for storing state information of each entity, wherein the state information comprises the relationship between the entity and other entities at each historical time point;
the update function module is used for determining, when a first entity and a second entity have a relationship at a time t, update information of the first entity and update information of the second entity, wherein the update information is used for updating the state information;
the aggregation module is used for aggregating the update information generated by each entity before the current moment;
the memory update module is used for updating the state information in the memory module according to newly generated update information;
the embedding module is used for generating state information of the entity at the current moment.
10. The method of claim 8 or 9, wherein the method further comprises:
dynamically updating the time sequence knowledge graph and the time sequence graph network model as the batch systems and the abnormal events change over time.
11. An event impact determination apparatus for a batch system, comprising:
a relationship determining module, used for determining whether an association relationship exists between batch systems and determining the existence time of the association relationship;
a relationship construction module, used for taking the batch systems as entities in a time sequence knowledge graph and constructing the relationships among the entities in the time sequence knowledge graph based on whether the association relationship exists among the batch systems and the existence time of the association relationship;
an abnormality acquisition module, used for acquiring an abnormal batch system in which an abnormal event occurs at the current moment;
and an influence determining module, used for determining, from all the batch systems and according to the time sequence knowledge graph, a batch system that is associated with the abnormal batch system at the current moment and/or after the current moment, and taking that batch system as an abnormal influence system.
12. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor; the memory stores computer-executable instructions; the processor executes computer-executable instructions stored in the memory to implement the method of any one of claims 1-10.
13. A computer readable storage medium having stored therein computer executable instructions which when executed by a processor are adapted to carry out the method of any one of claims 1 to 10.
14. A computer program product comprising a computer program which, when executed by a processor, implements the method of any of claims 1-10.
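
As a reading aid for claims 1, 3, 4 and 6, the following minimal sketch shows one way the quadruples and the impact query could fit together. It is not the applicant's implementation. Everything named here is an assumption made for illustration: the `Quadruple` class, the helper functions, the batch system names, and the use of integer hours within a working period as time points. The sketch treats an association as undirected, represents "every time point" for a composition relationship with `None`, and looks up already known calling times instead of predicting the current quadruple with a time sequence graph network model as in claims 7 to 9.

```python
from dataclasses import dataclass
from typing import Optional

COMPOSITION = "composition"
INFLUENCE = "influence"


@dataclass(frozen=True)
class Quadruple:
    """(head, relation, tail, time); time=None means the relation holds at every time point."""
    head: str
    relation: str
    tail: str
    time: Optional[int]


def classify(belongs_to: bool, has_call: bool) -> Optional[str]:
    """Claim 3: belonging gives a composition relationship, otherwise a call gives an influence relationship."""
    if belongs_to:
        return COMPOSITION
    if has_call:
        return INFLUENCE
    return None


def build_graph(observations):
    """Build quadruples from (first, second, belongs_to, call_times) observations (claims 1, 4, 6)."""
    graph = []
    for first, second, belongs_to, call_times in observations:
        relation = classify(belongs_to, bool(call_times))
        if relation == COMPOSITION:
            graph.append(Quadruple(first, relation, second, None))
        elif relation == INFLUENCE:
            graph.extend(Quadruple(first, relation, second, t) for t in call_times)
    return graph


def affected_systems(graph, abnormal: str, now: int):
    """Return systems associated with `abnormal` at or after `now` (last step of claim 1)."""
    affected = set()
    for q in graph:
        if abnormal not in (q.head, q.tail):
            continue
        if q.time is None or q.time >= now:
            affected.add(q.tail if q.head == abnormal else q.head)
    return affected


# Hypothetical batch systems; times are hours within one working period.
observations = [
    ("fee_calc", "settlement", True, []),       # fee_calc belongs to settlement
    ("clearing", "settlement", False, [2, 6]),  # calls at hours 2 and 6
    ("settlement", "reporting", False, [8]),
    ("clearing", "archive", False, [1]),
]
graph = build_graph(observations)
print(sorted(affected_systems(graph, "settlement", now=5)))
# ['clearing', 'fee_calc', 'reporting']
```

In this sketch, `archive` is not reported when `clearing` fails at hour 5, because their only recorded call (hour 1) has already passed; this is the kind of time-dependent distinction that the time dimension is meant to capture.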
CN202410331136.4A 2024-03-21 Event influence determining method, device, equipment and storage medium of batch system Pending CN118133952A (en)

Publications (1)

Publication Number Publication Date
CN118133952A true CN118133952A (en) 2024-06-04

Similar Documents

Publication Publication Date Title
CN112162878B (en) Database fault discovery method and device, electronic equipment and storage medium
CN111553590A (en) Radar embedded health management system
CN111459700A (en) Method and apparatus for diagnosing device failure, diagnostic device, and storage medium
CN106161138A (en) A kind of intelligence automatic gauge method and device
CN111539493B (en) Alarm prediction method and device, electronic equipment and storage medium
CN104796273A (en) Method and device for diagnosing root of network faults
CN114267178B (en) Intelligent operation maintenance method and device for station
Liu et al. CSSAP: Software aging prediction for cloud services based on ARIMA-LSTM hybrid model
CN113361139A (en) Production line simulation rolling optimization system and method based on digital twin
CN112559237B (en) Operation and maintenance system troubleshooting method and device, server and storage medium
JP7442001B1 (en) Comprehensive failure diagnosis method for hydroelectric power generation units
CN112379325A (en) Fault diagnosis method and system for intelligent electric meter
CN115453356A (en) Power equipment running state monitoring and analyzing method, system, terminal and medium
CN115373888A (en) Fault positioning method and device, electronic equipment and storage medium
JP3054039B2 (en) Plant maintenance support equipment
CN113835947A (en) Method and system for determining abnormality reason based on abnormality identification result
CN118133952A (en) Event influence determining method, device, equipment and storage medium of batch system
EP4033421B1 (en) Method and system for predicting a failure of a monitored entity
CN115965354A (en) Operation and maintenance system and method for fuel cell energy station
JP7466479B2 (en) Business improvement support device, program, and storage medium storing the program
CN114385403A (en) Distributed cooperative fault diagnosis method based on double-layer knowledge graph framework
Sun et al. Fault Root Rank Algorithm Based on Random Walk Mechanism in Fault Knowledge Graph
CN112162528A (en) Fault diagnosis method, device, equipment and storage medium of numerical control machine tool
Jha et al. Holistic measurement-driven system assessment
CN113887751A (en) Mechanical fault predictive maintenance method and system based on knowledge graph

Legal Events

Date Code Title Description
PB01 Publication