CN113656287B - Method and device for predicting software instance faults, electronic equipment and storage medium - Google Patents

Publication number
CN113656287B
Authority
CN
China
Prior art keywords
index
fault
alarm
software instance
alarm index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110860029.7A
Other languages
Chinese (zh)
Other versions
CN113656287A (en
Inventor
易存道
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baolande Software Co ltd
Original Assignee
Beijing Baolande Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baolande Software Co ltd
Priority to CN202110860029.7A
Publication of CN113656287A
Application granted
Publication of CN113656287B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G06F11/3608 Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Computer Hardware Design (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The method, device, electronic equipment and storage medium for predicting software instance faults obtain real-time data of the alarm index of a software instance and predict faults of the software instance through a fault prediction model according to that real-time data. The fault prediction model predicts in advance whether the software instance will fail and also predicts the likely fault points, which provides an effective basis for repairing the fault, saves the time spent manually troubleshooting fault causes, and improves fault repair efficiency.

Description

Method and device for predicting software instance faults, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer information technologies, and in particular, to a method and apparatus for predicting a software instance failure, an electronic device, and a storage medium.
Background
As the computer software industry evolves and the calling and deployment relationships among service systems grow more complex, the calling relationships among components also grow more complex, and faults and anomalies in each service system become correspondingly more frequent. Timely repair and prediction of faults in the associated instances of a service system has therefore become particularly important.
At present, software instance faults are mainly handled by inspecting the machine after a software instance has failed, or by manual periodic inspection; after the system goes down, the possible causes are inferred from human experience, and the inferred causes are then checked and repaired.
Therefore, existing methods can only repair a fault after it occurs and cannot predict where and when the fault will occur beforehand. When troubleshooting, the fault cause can only be judged from human experience, so resolving software instance faults takes a long time and is inefficient.
Disclosure of Invention
The invention provides a method, a device, electronic equipment and a storage medium for predicting software instance faults. It addresses the problems that existing methods can only repair a fault after it occurs and cannot predict where and when the fault will occur beforehand, and that fault causes can only be judged from human experience during troubleshooting, which makes resolving software instance faults slow and inefficient. Through a fault prediction model, whether the software instance will fail is predicted in advance, together with the likely fault points, which provides an effective basis for repairing the fault, saves the time spent manually troubleshooting fault causes, and improves fault repair efficiency.
The invention provides a method for predicting faults of a software instance, which comprises the following steps:
acquiring real-time data of an alarm index of a software instance;
predicting the faults of the software instance through a fault prediction model according to the real-time data of the alarm index;
wherein the fault prediction model is obtained by training based on fault indexes of a software instance and alarm indexes associated with the fault indexes.
According to the method for predicting the faults of the software instance provided by the invention, predicting the faults of the software instance through a fault prediction model according to the real-time data of the alarm index comprises the following steps:
comparing the real-time data with a keyword set to determine whether the alarm index is an abnormal alarm index, wherein the keyword set comprises keywords extracted from abnormal historical alarm indexes of the software instance;
if the alarm index is an abnormal alarm index, inputting the real-time data of the alarm index into the fault prediction model to predict the fault of the software instance.
According to the method for predicting the faults of the software instance provided by the invention, before the real-time data of the alarm index is input into the fault prediction model to predict the fault of the software instance, the method comprises the following steps:
determining the occurrence time of any fault index in the historical data of the software instance;
acquiring a preamble alarm index in a first preset time period before the occurrence time, and acquiring a history alarm index in a second preset time period before the occurrence time, wherein the second preset time period is greater than the first preset time period;
generating an association rule of the fault index and any preamble alarm index through the preamble alarm index in the first preset time period and the history alarm index in the second preset time period;
establishing a fault prediction model according to the association rule.
According to the method for predicting the software instance fault provided by the invention, generating the association rule of the fault index and any preamble alarm index through the preamble alarm index in the first preset time period and the history alarm index in the second preset time period comprises the following steps:
acquiring a target alarm index associated with the fault index through a preamble alarm index in the first preset time period;
dividing the historical alarm indexes in the second preset time period into buckets through a time slicing operation to generate historical alarm index buckets;
calculating the association degree of the fault index and any target alarm index based on an Apriori algorithm according to the historical alarm index buckets, wherein the association degree comprises a support degree, a confidence degree and a lift;
generating an association rule of the fault index and any target alarm index according to the association degree.
According to the method for predicting the software instance fault provided by the invention, acquiring the target alarm index associated with the fault index comprises the following steps:
de-duplicating the preamble alarm indexes to generate a preamble alarm index set;
taking any preamble alarm index in the preamble alarm index set as a target alarm index associated with the fault index.
According to the software instance fault prediction method provided by the invention, generating the association rule of the fault index and any preamble alarm index further comprises the following steps:
acquiring the time interval between the occurrence time of any target alarm index and the occurrence time of the fault index, wherein the time interval comprises a maximum time interval, a minimum time interval and a median time interval;
adding the time interval into the association rule to predict the occurrence time of the fault of the software instance.
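The interval statistics described in the steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes that each fault occurrence is paired with the most recent earlier occurrence of the target alarm index, and that times are plain numbers (for example, seconds). The function name `interval_stats` and the pairing rule are assumptions.

```python
from bisect import bisect_left
from statistics import median


def interval_stats(alarm_times, fault_times):
    """Return max/min/median gap between each fault and the latest earlier alarm.

    Assumption: a fault is paired with the most recent alarm that occurred
    strictly before it; faults with no earlier alarm are skipped.
    """
    alarm_times = sorted(alarm_times)
    gaps = []
    for fault_time in fault_times:
        # Alarms strictly before fault_time are alarm_times[:pos].
        pos = bisect_left(alarm_times, fault_time)
        if pos > 0:
            gaps.append(fault_time - alarm_times[pos - 1])
    if not gaps:
        return None
    return {"max": max(gaps), "min": min(gaps), "median": median(gaps)}
```

A rule enriched with these three values could then report a time range in which the fault is expected after the alarm is observed.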
According to the method for predicting the faults of the software instance provided by the invention, after the faults of the software instance are predicted, the method comprises the following steps:
if the software instance is predicted to fail, determining the fault type of the software instance fault;
marking the real-time data of the alarm index of the software instance according to the fault type;
storing, according to the mark, the real-time data of the alarm index and the prediction result into a relational database to be used as a data source for displaying the prediction result of the software instance fault.
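As a rough sketch of the marking and storage steps above, the following uses SQLite as a stand-in relational database. The table name `predictions`, its columns, and the `predicted_probability` field are hypothetical; the text does not specify a schema or a database product.

```python
import sqlite3


def store_prediction(conn, alarm, fault_type, predicted_probability):
    """Mark an alarm record with its predicted fault type and persist it.

    The schema below is an assumption made for illustration only.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions ("
        "eventtime TEXT, item TEXT, instance_id TEXT, "
        "fault_type TEXT, predicted_probability REAL)"
    )
    conn.execute(
        "INSERT INTO predictions VALUES (?, ?, ?, ?, ?)",
        (alarm["eventtime"], alarm["item"], alarm["id"],
         fault_type, predicted_probability),
    )
    conn.commit()
```

A display layer could then query this table by `fault_type` to show the prediction results per software instance.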
The invention also provides a device for predicting the faults of the software instance, which comprises:
an acquisition unit, used for acquiring real-time data of the alarm index of the software instance;
a prediction unit, used for predicting the faults of the software instance through a fault prediction model according to the real-time data of the alarm index;
wherein the fault prediction model is obtained by training based on fault indexes of a software instance and alarm indexes associated with the fault indexes.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the software instance fault prediction method as described in any of the above when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a method of predicting failure of a software instance as described in any one of the above.
The method, device, electronic equipment and storage medium for predicting software instance faults provided by the invention obtain real-time data of the alarm index of a software instance and predict faults of the software instance through a fault prediction model according to that real-time data. The fault prediction model predicts in advance whether the software instance will fail and also predicts the likely fault points, which provides an effective basis for repairing the fault, saves the time spent manually troubleshooting fault causes, and improves fault repair efficiency.
Drawings
In order to illustrate the technical solutions of the invention or of the prior art more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a method for predicting failure of a software instance according to an embodiment of the present invention;
FIG. 2 is a flowchart of a fault prediction method for implementing a software instance index based on an Apriori algorithm according to another embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a software instance failure prediction apparatus according to another embodiment of the present invention;
FIG. 4 is a schematic diagram of the physical structure of the electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
First, a conventional method for determining a software instance index fault will be described.
As the business of the software industry changes and the calling and deployment relationships among business systems grow more complex, the calling relationships among components become complicated, and fault prediction for the associated instances of a business system becomes particularly important. Introducing artificial intelligence makes it possible to predict in advance which faults a software instance may experience, such as JVM memory overflow or instance downtime. Applying artificial intelligence to operation and maintenance gives operation and maintenance personnel enough time to resolve problems in advance, which improves working efficiency, lowers the probability of faults in the production environment, and reduces losses.
Existing software instance fault prediction methods fall mainly into two types. In the first, the machine is inspected after an abnormality occurs in a software instance. For example, if excessive disk usage brings down a service system, the specific cause can only be located after the system has gone down, and the cause is inferred from human experience. This is inefficient, takes a long time to resolve, and causes losses in the production environment. In the second, a large amount of historical alarm data and fault event data is collected, the fault events are subdivided based on manual operation and maintenance experience, and a classification model is trained with a machine learning algorithm. When a new system alarm occurs, the trained classification model is called on the new alarm sample to predict the future fault category, where "no fault" is treated as a separate category. This solution is a great improvement over the first, but labeling the historical sample data requires a great deal of manual effort, and the quality of the manually labeled sample data strongly affects the accuracy of the model.
The prior art offers no complete fault prediction scheme; the cause of a problem can only be located from error information after the fault occurs, relying on human experience, and then resolved. This approach has several disadvantages:
In the traditional approach, operation and maintenance personnel periodically check the server state and the index state of every application instance, and only troubleshoot when a state may exceed a threshold; handling is slow, localization is inaccurate, and human factors have too much influence. When a fault occurs but its severity is low, the problem cannot be handled and resolved in time;
A fault prediction model based on a supervised machine-learning classification algorithm requires a great deal of manual effort to label historical sample data, and the quality of the manually labeled sample data strongly affects the accuracy of the model.
In view of these shortcomings, the software instance fault prediction method provided by the embodiment of the invention can predict faults for indexes based on both static and dynamic thresholds in a software service, and marks the alarm index data, which helps classify the alarm data in subsequent operations.
The method for predicting software instance faults provided by the invention is described below with reference to FIGS. 1-2.
FIG. 1 is a flow chart of a software instance fault prediction method according to an embodiment of the present invention. Referring to FIG. 1, the method for predicting software instance faults includes:
Step 101: acquiring real-time data of the alarm index of the software instance.
As the computer software industry evolves and the calling and deployment relationships among service systems grow more complex, the calling relationships among components also grow more complex, and timely repair and prediction of faults in the associated instances of a service system becomes particularly important. Software instances are the software programs or processes in the business system, and each software instance has one or more indexes.
Specifically, a software instance generates a large amount of data while running; this data may be real-time operating data of the software instance or real-time data of its alarm indexes. The real-time data of the alarm index includes alarm data generated while the software instance is operating normally and alarm data generated when the software instance is actually abnormal.
Step 102: predicting the faults of the software instance through a fault prediction model according to the real-time data of the alarm indexes;
the fault prediction model is trained based on fault indexes of the software instance and historical alarm indexes associated with the fault indexes.
While the software instance is running, the data it generates is stored as historical data; the stored content includes the historical data itself and whether that data is abnormal. In this embodiment, historical data from a certain period is collected, a fault prediction model is established through the Apriori algorithm, and the model is trained on the association between fault indexes and alarm indexes. After real-time data of the alarm index of a software instance is input, the fault prediction model can determine, based on the historical data, whether the software instance will fail after a certain period of time.
The software instance fault prediction method provided by the invention obtains real-time data of the alarm index of the software instance and predicts faults of the software instance through a fault prediction model according to that real-time data. The fault prediction model predicts in advance whether the software instance will fail and also predicts the likely fault points, which provides an effective basis for repairing the fault, saves the time spent manually troubleshooting fault causes, and improves fault repair efficiency.
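One way to picture Step 102 is as a lookup over mined association rules. The sketch below is an assumption about how such a lookup could work, not the patent's stated implementation: the `Rule` record, the `predict_faults` function, and the choice to rank by confidence are all hypothetical. The lift threshold of 1 follows the criterion the description gives for a valid rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical record for one association rule: alarm -> fault."""
    alarm: str
    fault: str
    support: float
    confidence: float
    lift: float


def predict_faults(alarm_item, rules, min_lift=1.0):
    """Return rules triggered by alarm_item, strongest confidence first."""
    matches = [r for r in rules if r.alarm == alarm_item and r.lift > min_lift]
    return sorted(matches, key=lambda r: r.confidence, reverse=True)
```

Each returned rule names a candidate fault, and its confidence can be read as the estimated probability that the fault follows the alarm.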
Further, on the basis of the foregoing embodiment, according to the method for predicting software instance faults provided by the present invention, predicting the fault of the software instance through a fault prediction model according to the real-time data of the alarm index includes:
comparing the real-time data with a keyword set to determine whether the alarm index is an abnormal alarm index, wherein the keyword set comprises keywords extracted from abnormal historical alarm indexes of the software instance;
if the alarm index is an abnormal alarm index, inputting the real-time data of the alarm index into the fault prediction model to predict the fault of the software instance.
Specifically, in this embodiment, whether a fault will occur is predicted from the real-time data generated by the alarm index of the software instance; that is, the cause of a software instance fault is predicted from the alarm index data. The alarm index data of the software instance is acquired; it comprises alarm data generated during normal operation and alarm data generated when the software instance is actually abnormal.
The keyword set may be generated before prediction is performed on the real-time data of the alarm index of the software instance. The data of the historical alarm indexes is obtained from the historical data, the abnormal historical alarm index data generated when an abnormality actually occurred is identified within it, and keywords that indicate an actual abnormality of the alarm index of the software instance are extracted from that abnormal data to generate the keyword set.
When the real-time data of a newly generated alarm index matches a keyword in the keyword set, that real-time data is determined to be abnormal, i.e., the alarm index is an abnormal alarm index. The real-time data of the abnormal alarm index is then input into the fault prediction model, which predicts the faults the software instance may experience, along with their likely locations and occurrence probabilities.
In this embodiment, real-time data of the alarm index of the software instance is acquired, so that when a service of the software instance fails, the likely fault point can be predicted based on the keyword set. This provides an effective basis for repairing the fault, saves the time spent manually troubleshooting fault causes, and improves fault repair efficiency.
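The keyword screening step can be illustrated with a small sketch. It assumes that matching means a case-insensitive substring test against a free-text `message` field; the text does not define the matching rule, and the sample keywords, the `message` field, and both function names are hypothetical.

```python
from __future__ import annotations

# Hypothetical keywords extracted from abnormal historical alarm data.
KEYWORDS = {"OutOfMemoryError", "disk usage", "connection refused"}


def is_abnormal_alarm(alarm_text: str, keyword_set: set[str]) -> bool:
    """Return True if the alarm text contains any keyword (case-insensitive)."""
    text = alarm_text.lower()
    return any(keyword.lower() in text for keyword in keyword_set)


def filter_abnormal(alarms: list[dict], keyword_set: set[str]) -> list[dict]:
    """Keep only alarms whose 'message' field matches the keyword set."""
    return [a for a in alarms if is_abnormal_alarm(a["message"], keyword_set)]
```

Only the alarms that pass this filter would be forwarded to the fault prediction model, which keeps alarms raised during normal operation out of the prediction step.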
Further, on the basis of the foregoing embodiment, according to the method for predicting software instance faults provided by the present invention, before the real-time data of the alarm index is input into the fault prediction model to predict the fault of the software instance, the method includes:
determining the occurrence time of any fault index in the historical data of the software instance;
acquiring a preamble alarm index in a first preset time period before the occurrence time, and acquiring a history alarm index in a second preset time period before the occurrence time, wherein the second preset time period is greater than the first preset time period;
generating an association rule of the fault index and any preamble alarm index through the preamble alarm index in the first preset time period and the history alarm index in the second preset time period;
establishing a fault prediction model according to the association rule.
While a software instance is running, different alarm indexes are generated before a fault index occurs; that is, the occurrence of different alarm indexes may signal the occurrence of different fault indexes. The occurrence of a fault index can therefore be predicted by determining the association between fault indexes and alarm indexes, and the fault prediction model is established from this association.
The historical data of the software instance is acquired, any fault index is selected from it, and the occurrence time of that fault index is determined. The alarm indexes within a first preset time period before the occurrence time are acquired; because these alarm indexes may have led to the fault index, they are taken as the preamble alarm indexes of the fault index, i.e., at least one alarm index associated with the fault index is determined. The historical alarm indexes within a second preset time period before the occurrence time are also acquired; the second preset time period is longer than the first, so the historical alarm indexes in it carry more information about the preamble alarm indexes. Both preset time periods can be set manually.
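The two look-back windows can be sketched as below. This is an illustrative assumption rather than the patent's code: alarms are dicts with an `eventtime` of type `datetime`, both windows are half-open intervals ending just before the fault time, and the default lengths of one hour and 30 days are example values. The function name `split_windows` is hypothetical.

```python
from datetime import datetime, timedelta


def split_windows(alarms, fault_time,
                  first_period=timedelta(hours=1),
                  second_period=timedelta(days=30)):
    """Return (preamble_alarms, history_alarms) preceding fault_time.

    Assumption: both windows cover [fault_time - period, fault_time).
    """
    if second_period <= first_period:
        raise ValueError("second period must be longer than the first")
    preamble = [a for a in alarms
                if fault_time - first_period <= a["eventtime"] < fault_time]
    history = [a for a in alarms
               if fault_time - second_period <= a["eventtime"] < fault_time]
    return preamble, history
```

The preamble window supplies the candidate alarms for a fault, while the longer history window supplies the data from which their association strength is measured.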
The strength of the association between a preamble alarm index and the fault index can be determined from the preamble alarm indexes and the historical alarm indexes, so that an association rule between the fault index and any preamble alarm index is generated and the fault prediction model is established.
For example, set the first preset time period to one hour and the second preset time period to one month. Based on the historical alarm data set and the time slicing unit, a unique (de-duplicated) fault alarm set is compiled. For each fault G, the historical alarm data table A covering the month before its occurrence time and the unique alarm index set B covering the preceding 1 hour are extracted. By analyzing the association between set B and fault G, a result rule set C is obtained, which contains the preamble alarm indexes that lead to fault G together with data such as the time interval and the occurrence probability of the resulting fault index. All fault indexes are traversed to obtain the result rule set C for every fault index.
Specifically, string data transmitted through a RESTful API is first received, and the JSON string is deserialized to obtain a data dictionary (dict). The history alarm list in the dict is then converted into a Python dataframe. Each item in the history alarm list is a Python dict, and each dict comprises fields such as the alarm time eventtime, the alarm index name item, the alarm instance id, the alarm type key words, and whether the alarm is a fault.
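A minimal sketch of the deserialization step follows, using only the standard library. The field names `eventtime`, `item` and `id` come from the description; the top-level key `historyAlarms`, the field names `keywords` and `is_fault`, and the ISO 8601 time format are assumptions. The resulting list of dicts is the shape a dataframe constructor such as `pandas.DataFrame` accepts.

```python
import json
from datetime import datetime


def parse_alarm_payload(payload: str) -> list:
    """Deserialize a JSON string into a list of normalized alarm dicts."""
    data = json.loads(payload)
    rows = []
    for entry in data["historyAlarms"]:  # hypothetical key name
        rows.append({
            "eventtime": datetime.fromisoformat(entry["eventtime"]),
            "item": entry["item"],
            "id": entry["id"],
            "keywords": entry["keywords"],        # assumed field name
            "is_fault": bool(entry["is_fault"]),  # assumed field name
        })
    return rows
```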
The historical alarms of the system over the last N years (for example, 1 year) are collected. The historical alarm data table dataframe is divided into buckets in time order according to the time slice unit maxInternalMs in the subject, all fault indexes in the whole historical alarm data table dataframe are counted, and a unique fault index set GuZhangList is obtained after de-duplication.
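The bucketing and de-duplication can be sketched as follows. The sketch assumes each alarm carries an `eventtime_ms` field holding epoch milliseconds and an `is_fault` flag; both field names are hypothetical. It also assumes that time slices containing no alarm are simply omitted, a choice that affects the total bucket count.

```python
from collections import defaultdict


def bucketize(alarms, max_internal_ms):
    """Group alarms into time-ordered buckets of width max_internal_ms.

    Assumption: empty time slices produce no bucket.
    """
    buckets = defaultdict(list)
    for alarm in alarms:
        buckets[alarm["eventtime_ms"] // max_internal_ms].append(alarm)
    return [buckets[key] for key in sorted(buckets)]


def unique_fault_indexes(alarms):
    """Return de-duplicated fault index names in first-seen order."""
    seen = {}
    for alarm in alarms:
        if alarm["is_fault"]:
            seen.setdefault(alarm["item"], None)
    return list(seen)
```

The second function plays the role of building GuZhangList: one entry per distinct fault index, regardless of how often it occurred.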
An alarm index M that occurs within one hour before any fault index G in the fault index set GuZhangList is obtained, and the alarm index M is determined to be associated with the fault index G. In the historical alarm index data for the month before the fault index G occurs, the association degree of the alarm index M and the fault index G is determined from information such as the occurrence times and occurrence frequency of the alarm index M. In this way, the associations between all fault indexes and alarm indexes of the software instance are established and the fault prediction model is built, so that faults can be predicted.
In this embodiment, an association rule between the fault index and any preamble alarm index is generated through the preamble alarm index in the first preset time period and the history alarm index in the second preset time period, and a fault prediction model is established. Generating association rules for all fault indexes in the software instance improves the precision and efficiency of fault index prediction.
Further, on the basis of the foregoing embodiment, according to the method for predicting software instance faults provided by the present invention, generating the association rule between the fault index and any preamble alarm index through the preamble alarm index in the first preset time period and the history alarm index in the second preset time period includes:
acquiring a target alarm index associated with the fault index through a preamble alarm index in the first preset time period;
dividing the historical alarm indexes in the second preset time period into buckets through a time slicing operation to generate historical alarm index buckets;
calculating the association degree of the fault index and any target alarm index based on an Apriori algorithm according to the historical alarm index buckets, wherein the association degree comprises a support degree, a confidence degree and a lift;
generating an association rule of the fault index and any target alarm index according to the association degree.
The preamble alarm index in the first preset time period may exist for a plurality of times, the preamble alarm index needs to be de-duplicated to generate a target alarm index, and the correlation between the target alarm index and the fault index is determined.
And (3) carrying out barrel separation on the historical alarm indexes through time slicing operation, wherein the specific time slicing unit can be the same as the time slicing unit of the data dictionary. Traversing the obtained historical alarm index barrels, obtaining the total number of all barrels, and calculating the support degree, the confidence degree and the lifting degree of the fault index and the target alarm index according to the occurrence times of the target alarm index in the historical alarm index barrels and the occurrence times of the fault index.
Specifically, each fault index A in the fault index set GuZhangList is traversed. For each A, the time Atime at which fault A first occurs in the whole alarm data table is obtained. All alarm indexes within the 1 hour before Atime are extracted from the history alarm data table dataframe and de-duplicated to obtain the unique alarm index set useFaultList; the rationale is that the time range relevant to the current occurrence of fault A is taken to be one hour. Also based on Atime, the alarm data df-30 for the preceding 30 days is extracted from the history alarm data table dataframe, and df-30 is divided in chronological order into buckets buketList according to the time slice unit maxInternalMs from the original input. Each bucket buket is then traversed to count the total number D of buckets in buketList, the number cntM of buckets in which a unique alarm index M appears, the number cntMA of buckets in which the unique alarm index M and the fault alarm A appear together, and the number cntA of buckets in which A appears.
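The bucketing and counting described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: alarms are represented as (timestamp in milliseconds, index name) pairs, only non-empty time slices are counted as buckets, and the function names `bucket_alarms` and `count_cooccurrence` and the sample index names are hypothetical.

```python
from collections import defaultdict


def bucket_alarms(alarms, slice_ms):
    """Group (timestamp_ms, index_name) pairs into fixed-width time buckets."""
    buckets = defaultdict(set)
    for ts_ms, index_name in alarms:
        # Integer division maps every timestamp to the slice it falls in.
        buckets[ts_ms // slice_ms].add(index_name)
    # Assumption: only slices that contain at least one alarm become buckets.
    return [buckets[key] for key in sorted(buckets)]


def count_cooccurrence(buckets, alarm_m, fault_a):
    """Return (D, cntM, cntA, cntMA) for alarm index M and fault index A."""
    d = len(buckets)
    cnt_m = sum(1 for b in buckets if alarm_m in b)
    cnt_a = sum(1 for b in buckets if fault_a in b)
    cnt_ma = sum(1 for b in buckets if alarm_m in b and fault_a in b)
    return d, cnt_m, cnt_a, cnt_ma


# Hypothetical sample data: one-minute slices, three index names.
alarms = [
    (0, "cpu_high"), (10_000, "jvm_oom"),
    (70_000, "cpu_high"),
    (130_000, "cpu_high"), (150_000, "jvm_oom"),
    (200_000, "disk_full"),
]
buckets = bucket_alarms(alarms, slice_ms=60_000)
counts = count_cooccurrence(buckets, "cpu_high", "jvm_oom")
```

Using a set per bucket means that repeated alarms inside one time slice are counted once, which matches counting buckets rather than individual alarm events.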
The core idea of the Apriori algorithm is:
Support: the support of association rule A→B is the probability that event A and event B occur together, i.e., Support(A→B) = P(AB);
Confidence: Confidence(A→B) = P(B|A) = P(AB)/P(A), the probability that event B occurs given that event A has occurred;
Lift: Lift(A→B) = Confidence(A→B)/P(B) = P(AB)/(P(A)·P(B)), which indicates whether the occurrence of event A actually promotes the occurrence of event B; as long as the lift is greater than 1, the rule A→B can be considered a strong, valid rule.
Following the Apriori algorithm, the pair of alarm index M and fault index A is denoted as rule M→A, and substituting the bucket counts into the formulas gives:
P(M) = cntM/D, P(A) = cntA/D, P(MA) = cntMA/D;
From these, the support, confidence and lift of the alarm index M with respect to the fault index A can be calculated and used as the association rule between the fault index A and the alarm index M.
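As a worked illustration of these formulas, the following minimal sketch computes the three metrics for rule M→A from the bucket counts D, cntM, cntA and cntMA. The function name `association_metrics`, the dictionary keys and the sample counts are assumptions made for the example; returning `None` when a denominator would be zero is also a choice of this sketch.

```python
def association_metrics(d, cnt_m, cnt_a, cnt_ma):
    """Compute support, confidence and lift of rule M -> A from bucket counts."""
    if d == 0 or cnt_m == 0 or cnt_a == 0:
        # Assumption: a rule with an undefined probability is not emitted.
        return None
    p_m = cnt_m / d        # P(M)
    p_a = cnt_a / d        # P(A)
    p_ma = cnt_ma / d      # P(MA)
    support = p_ma                 # Support(M -> A) = P(MA)
    confidence = p_ma / p_m        # Confidence(M -> A) = P(MA) / P(M)
    lift = confidence / p_a        # Lift(M -> A) = P(MA) / (P(M) * P(A))
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical counts: 4 buckets, M in 3 of them, A in 2, both together in 2.
metrics = association_metrics(d=4, cnt_m=3, cnt_a=2, cnt_ma=2)
```

With these sample counts the support is 0.5, the confidence is 2/3 and the lift is 4/3; since the lift exceeds 1, the rule would count as valid under the criterion stated above.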
In the embodiment, the association rule of the fault index and the alarm index is obtained through calculation by the Apriori algorithm, so that the prediction of the fault alarm by the real-time data of the alarm index is realized, an effective basis is provided for repairing the fault, the time for manually checking the fault cause is saved, and the fault repairing efficiency is improved.
Further, on the basis of the foregoing embodiment, according to the method for predicting a software instance fault provided by the present invention, the obtaining a target alarm indicator related to the fault indicator includes:
Performing de-duplication on the preamble alarm index to generate a preamble alarm index set;
and taking any preamble alarm index in the preamble alarm index set as a target alarm index associated with the fault index.
Specifically, after a first preset time period before the occurrence time of the fault index is obtained, a preamble alarm index in the first preset time period is obtained, wherein the same alarm index possibly exists for a plurality of times, and the preamble alarm index needs to be subjected to duplication removal processing. And the preamble alarm index obtained after the duplication removal is the target alarm index associated with the fault index.
In this embodiment, the target alarm index is obtained by de-duplicating the preamble alarm index, the alarm index associated with the fault index is accurately determined, and then the association relationship between the target alarm index and the fault index is determined, so as to complete the establishment of the fault prediction model and realize the prediction of the software instance fault.
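The de-duplication step can be illustrated with a short sketch. It assumes alarms are (timestamp in milliseconds, index name) pairs, that the first preset time period is one hour, that the window is half-open (it excludes the fault's own timestamp), and that the fault index itself is not treated as a target alarm index; the function name `target_alarm_indexes` and the sample names are hypothetical.

```python
def target_alarm_indexes(alarms, fault, fault_time_ms, window_ms=3_600_000):
    """Return the de-duplicated alarm indexes seen in the window before the fault."""
    start_ms = fault_time_ms - window_ms
    targets = []
    for ts_ms, index_name in sorted(alarms):
        in_window = start_ms <= ts_ms < fault_time_ms
        # Keep the first occurrence only, preserving chronological order.
        if in_window and index_name != fault and index_name not in targets:
            targets.append(index_name)
    return targets


# Hypothetical sample: the fault "jvm_oom" occurs at t = 4,000,000 ms.
alarms = [
    (100_000, "disk_full"),     # more than one hour before the fault
    (500_000, "cpu_high"),
    (900_000, "cpu_high"),      # duplicate within the window
    (3_000_000, "gc_slow"),
    (4_000_000, "jvm_oom"),
]
targets = target_alarm_indexes(alarms, "jvm_oom", fault_time_ms=4_000_000)
```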
Further, on the basis of the foregoing embodiment, according to the method for predicting a software instance fault provided by the present invention, the generating the association rule between the fault indicator and any preamble alarm indicator further includes:
acquiring the time interval between the occurrence time of any target alarm index and the occurrence time of the fault index; wherein the time interval comprises: maximum time interval, minimum time interval, median time interval;
and adding the time interval into the association rule to predict the occurrence time of the fault of the software instance.
After the target alarm indexes associated with the fault index are determined, the time interval between any target alarm index and the fault index within the second preset time period is determined. When the same fault index occurs multiple times in the historical data, the time interval between the occurrence time of the target alarm index and the occurrence time of the fault index is determined for each occurrence, the maximum, minimum and median of these intervals are calculated, and the resulting maximum, minimum and median time intervals are added to the association rule of the fault index and the target alarm index. When a fault of the software instance is predicted, the predicted time at which the fault index will occur may also be displayed in the prediction result.
Specifically, the average time interval MEANELAPS of the rule M→A is calculated as follows: in each bucket, the time interval between the occurrence time of the alarm index M and the occurrence time of the fault index A is recorded; after all buckets have been traversed, these intervals form a time interval list, from which the average time interval MEANELAPS of the rule M→A is obtained. The following values are obtained in the same way:
Calculating the total time interval toalElaps of the rule M→A;
Calculating the maximum time interval maxElaps of the rule M→A;
Calculating the minimum time interval minElaps of the rule M→A;
Calculating the median time interval MEDIANELAPS of the rule M→A;
The support, confidence and lift of the association between the alarm index M and the fault index A are assembled with the total, maximum, minimum and median time intervals to form a rule list. The backend then applies the rule list provided by the fault prediction algorithm service to system fault prediction, predicting the faults that may occur, the corresponding time intervals, and the reliability and validity of each prediction.
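The interval statistics can be sketched as below. The sketch assumes each bucket is a list of (timestamp in milliseconds, index name) pairs and that, within a bucket, the interval is measured from the first occurrence of the alarm index to the first occurrence of the fault index; the text does not specify which occurrences are used. The function name `interval_stats` and the result keys are hypothetical.

```python
from statistics import mean, median


def interval_stats(bucket_events, alarm_m, fault_a):
    """Summarize the alarm-to-fault time intervals of rule M -> A over all buckets."""
    intervals = []
    for events in bucket_events:
        m_times = [ts for ts, name in events if name == alarm_m]
        a_times = [ts for ts, name in events if name == fault_a]
        if m_times and a_times:
            # Assumption: use the first occurrence of each index in the bucket.
            intervals.append(min(a_times) - min(m_times))
    if not intervals:
        return None
    return {
        "total": sum(intervals),
        "mean": mean(intervals),
        "max": max(intervals),
        "min": min(intervals),
        "median": median(intervals),
    }


# Hypothetical sample: M and A co-occur in the first and third buckets.
bucket_events = [
    [(0, "cpu_high"), (10_000, "jvm_oom")],
    [(70_000, "cpu_high")],
    [(130_000, "cpu_high"), (150_000, "jvm_oom")],
]
stats = interval_stats(bucket_events, "cpu_high", "jvm_oom")
```

An interval is negative when the fault precedes the alarm inside a bucket; whether such buckets should be discarded is a design decision that this sketch leaves open.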
In this embodiment, by calculating the interval between the occurrence time of the target alarm indicator and the occurrence time of the fault indicator and adding the interval to the association rule, the prediction of the occurrence time of the fault indicator can be implemented, so that a technician can more clearly know the occurrence condition of the fault indicator, repair the fault in time, and improve the fault repair efficiency.
Further, on the basis of the foregoing embodiment, according to the method for predicting a failure of a software instance provided by the present invention, after predicting the failure of the software instance, the method includes:
if the software instance is predicted to fail, determining the failure type of the software instance failure;
marking real-time data of an alarm index of the software instance according to the fault type;
and according to the mark, storing the real-time data of the alarm index and the prediction result into a relational database to be used as a data source for displaying the prediction result of the software instance fault.
Obtaining a prediction result after predicting the faults of the software instance through a fault prediction model, wherein the prediction result comprises the types of the faults; different faults are of different types, real-time data corresponding to the different types of faults are marked, and the real-time data and a prediction result obtained by predicting the real-time data are stored in a relational database according to the marks, so that an association relation between the real-time data and the prediction result is established. When the prediction results are displayed to the technicians through the display interface, the real-time data and the prediction results thereof are obtained through the database, so that the sources of the prediction results can be completely displayed to the technicians.
In this embodiment, by marking the real-time data and storing the real-time data and the prediction result in the relational database, a data source can be provided for displaying the prediction result, and the prediction result is completely displayed to a technician, so that the technician obtains more sufficient prediction content and processes the fault more quickly.
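A minimal sketch of storing marked real-time data together with its prediction result is given below. It uses SQLite from the Python standard library as a stand-in for the relational database, which the text does not name; the table name `prediction`, its columns and the helper functions are assumptions made for illustration.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) prediction table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS prediction (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               instance_id TEXT NOT NULL,
               alarm_item TEXT NOT NULL,
               alarm_time_ms INTEGER NOT NULL,
               fault_type TEXT NOT NULL,
               confidence REAL NOT NULL
           )"""
    )
    return conn


def store_prediction(conn, alarm, fault_type, confidence):
    """Mark one real-time alarm with the predicted fault type and persist it."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO prediction "
            "(instance_id, alarm_item, alarm_time_ms, fault_type, confidence) "
            "VALUES (?, ?, ?, ?, ?)",
            (alarm["id"], alarm["item"], alarm["eventtime"], fault_type, confidence),
        )


def predictions_for(conn, fault_type):
    """Read back the rows a display interface would show for one fault type."""
    cursor = conn.execute(
        "SELECT instance_id, alarm_item, confidence FROM prediction "
        "WHERE fault_type = ? ORDER BY id",
        (fault_type,),
    )
    return cursor.fetchall()


conn = open_store()
store_prediction(conn, {"id": "app-1", "item": "cpu_high", "eventtime": 1000}, "jvm_oom", 0.8)
rows = predictions_for(conn, "jvm_oom")
```

Keeping the alarm fields and the prediction in one row is one way to preserve the link between the real-time data and its prediction result, so that the source of each prediction can be shown alongside it.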
Further, when generating the fault prediction model through the Apriori algorithm, the method further comprises:
Transmitting the history data to the Apriori algorithm in the form of character strings through a REST API service; and converting the format of the history data so that it is suitable for the Apriori algorithm.
After a large amount of history data is acquired, it needs to be passed to the Apriori algorithm backend for processing.
In the embodiment of the invention, a REST API service based on the Sanic framework is developed to standardize the input of history index data and the output of results. The history sample data of the instance indexes is transmitted to the algorithm service backend as character strings, which reduces transmission cost and improves performance. The backend performs basic operations such as parsing, preprocessing and conversion on the history data and then passes it to the standard algorithm method for model training to obtain the model of the corresponding algorithm. Through parsing, preprocessing and conversion, the history data transmitted to the algorithm backend as character strings is converted into a data format that the Apriori algorithm can recognize, completing the processing of the history data.
By transmitting the history data to the Apriori algorithm as character strings through the REST API service and converting its format, the data transmission cost can be reduced, the performance of the algorithm improved, and the generation of the fault prediction model accelerated.
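The parsing and conversion performed by the backend might look like the following sketch, which covers only the string-to-records step and leaves out the web framework. The payload layout (a top-level `alarms` list and a top-level `maxInternalMs` value) is an assumption; the field names `eventtime`, `item`, `id` and `keywords` follow the fields named elsewhere in this description. The function name `parse_history` is hypothetical.

```python
import json

# Fields each history alarm entry is expected to carry.
REQUIRED_FIELDS = ("eventtime", "item", "id", "keywords")


def parse_history(payload):
    """Deserialize a JSON string into a time slice unit and a sorted alarm list."""
    data = json.loads(payload)
    # Preprocessing: drop entries that lack any of the expected fields.
    records = [
        entry for entry in data.get("alarms", [])
        if all(field in entry for field in REQUIRED_FIELDS)
    ]
    # Conversion: order chronologically so later bucketing can run in one pass.
    records.sort(key=lambda entry: entry["eventtime"])
    return data.get("maxInternalMs"), records


# Hypothetical request body as it would arrive over the REST API.
payload = json.dumps({
    "maxInternalMs": 60000,
    "alarms": [
        {"eventtime": 70000, "item": "jvm_oom", "id": "app-1", "keywords": "fault"},
        {"eventtime": 10000, "item": "cpu_high", "id": "app-1", "keywords": "alarm"},
        {"eventtime": 20000, "item": "incomplete"},
    ],
})
slice_ms, records = parse_history(payload)
```

The resulting list of dictionaries is in a shape that a tabular structure such as a pandas DataFrame can be built from directly; the sketch stays with plain Python objects to remain self-contained.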
Further, after predicting the failure of the software instance, the prediction result of the failure of the software instance needs to be displayed; wherein the prediction result comprises a fault type and occurrence probability (i.e. confidence) of the fault.
After predicting that the software instance will fail through the failure prediction model, the prediction result needs to be displayed to technicians. Specifically, in the prediction result of the software instance, the type of the fault to be occurred, the predicted time of occurrence of the fault, the possible fault point when the fault occurs, and the like may be included. By displaying the prediction data, a technician can be reminded to deal with the faults of the software instance in advance, so that the normal operation of the software instance is ensured.
Fig. 2 is a flowchart of a fault prediction method for implementing a software instance index based on an Apriori algorithm according to another embodiment of the present invention. Referring to fig. 2, specifically, the fault prediction method for implementing the software instance index based on the Apriori algorithm includes:
Step 201: acquiring one month of index monitoring data through the user's existing monitoring equipment or other monitoring products;
Step 202: according to the range to be checked by the user, querying the software instance call chain relation from the monitoring database in real time through a timing task, and querying the instance index data between each instance and the instances it calls, for use by the algorithm;
Step 203: developing a REST API service based on the Sanic framework to standardize the input of historical index data and the output of results; the historical alarm sample data of the instance indexes is transmitted to the algorithm service backend as character strings to reduce transmission cost and improve performance, and the backend performs basic operations such as parsing, preprocessing and conversion on the historical data and then passes it to the fault prediction method based on the Apriori algorithm idea;
Step 204: the basic idea of the scheme is as follows. Based on the historical alarm data set and the time slice unit, the set of unique fault alarms is counted. For each fault G, the historical alarm data table A for the month before the occurrence time of G and the unique alarm index set B for the preceding 1 hour are extracted. By analyzing the association between set B and fault alarm G, the preamble alarm indexes that lead to fault G are obtained, together with a result rule set C containing data such as the time interval and occurrence probability with which each preamble alarm index leads to the fault index. Traversing all fault indexes yields the result rule set C for all fault indexes;
Step 205: acquiring the corresponding historical service alarm data according to the service systems within the prediction range;
Step 206: based on the alarm data from step 205, matching the alarm data against the keywords of historically abnormal alarms for the detection range, and if the detection result is abnormal, sending the data to the AI (i.e., the fault prediction model) for detection;
Step 207: marking the current data according to the fault result predicted by the AI, and storing the calculation indexes and the calculation results in a relational database as the data source for the display interface.
Specifically, step 204 includes:
Step 2041: receiving the character string data transmitted through the RESTful API, deserializing the JSON character string to obtain a data dictionary, and converting the history alarm list in the dictionary into the dataframe format of Python; each item in the history alarm list is a Python dictionary containing fields such as the alarm time eventtime, the alarm index name item, the alarm instance id, and the alarm type keywords (such as fault).
Step 2042: collecting the historical alarms of the system over the last N years, dividing the historical alarm dataframe into buckets in chronological order according to the time slice unit maxInternalMs, counting all fault indexes in the whole historical alarm data table dataframe, and de-duplicating them to obtain the unique fault index set GuZhangList.
Step 2043: traversing each fault index A in the fault index set GuZhangList. For each A, the time Atime at which fault A first occurs in the whole alarm data table is obtained. All alarm indexes within the 1 hour before Atime are extracted from the history alarm data table dataframe and de-duplicated to obtain the unique alarm index set useFaultList; the rationale is that the time range relevant to the current occurrence of fault A is taken to be one hour. Also based on Atime, the alarm data df-30 for the preceding 30 days is extracted from the history alarm data table dataframe, and df-30 is divided in chronological order into buckets buketList according to the time slice unit maxInternalMs from the original input. Each bucket buket is then traversed to count the total number D of buckets in buketList, the number cntM of buckets in which a unique alarm index M appears, the number cntMA of buckets in which the unique alarm index M and the fault alarm A appear together, and the number cntA of buckets in which A appears.
Step 2044: according to the core idea of the Apriori algorithm:
Support: the support of association rule A→B is the probability that event A and event B occur together, i.e., Support(A→B) = P(AB);
Confidence: Confidence(A→B) = P(B|A) = P(AB)/P(A), the probability that event B occurs given that event A has occurred;
Lift: Lift(A→B) = Confidence(A→B)/P(B) = P(AB)/(P(A)·P(B)), which indicates whether the occurrence of event A actually promotes the occurrence of event B; as long as the lift is greater than 1, the rule A→B can be considered a strong, valid rule.
The pair of alarm index M and fault index A is denoted here as rule M→A,
and substituting the bucket counts into the formulas yields P(M) = cntM/D, P(A) = cntA/D, P(MA) = cntMA/D;
Calculating the average time interval MEANELAPS of the rule M→A: in each bucket, the time interval between the occurrence time of the alarm index M and the occurrence time of the fault index A is recorded; after all buckets have been traversed, these intervals form a time interval list, from which the average time interval MEANELAPS of the rule M→A is obtained. The following values are obtained in the same way:
Calculating the total time interval toalElaps of the rule M→A;
Calculating the maximum time interval maxElaps of the rule M→A;
Calculating the minimum time interval minElaps of the rule M→A;
Calculating the median time interval MEDIANELAPS of the rule M→A;
Assembling the attribute values that measure the rule M→A from the above steps, and finally returning the result to the backend as a rule list composed of these values.
Step 2045: the backend applies the rule list provided by the fault prediction algorithm service to system fault prediction, predicting the faults that may occur, the corresponding time intervals, and the reliability and validity of each prediction.
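How a backend might apply such a rule list can be sketched as follows. The rule dictionary layout, the threshold values and the function names `filter_rules` and `predict` are all assumptions made for this example; the description mentions filtering thresholds but does not give their values.

```python
def filter_rules(rules, min_support=0.1, min_confidence=0.5, min_lift=1.0):
    """Keep only rules that pass the (hypothetical) filtering thresholds."""
    return [
        rule for rule in rules
        if rule["support"] >= min_support
        and rule["confidence"] >= min_confidence
        and rule["lift"] > min_lift
    ]


def predict(rules, alarm_item, now_ms):
    """List the faults a live alarm may lead to, most confident first."""
    predictions = []
    for rule in rules:
        if rule["alarm"] == alarm_item:
            predictions.append({
                "fault": rule["fault"],
                "confidence": rule["confidence"],
                # The expected time window is derived from the learned intervals.
                "earliest_ms": now_ms + rule["min_interval_ms"],
                "latest_ms": now_ms + rule["max_interval_ms"],
            })
    return sorted(predictions, key=lambda p: p["confidence"], reverse=True)


# Hypothetical rule list as returned by the algorithm service.
rules = [
    {"alarm": "cpu_high", "fault": "jvm_oom", "support": 0.5, "confidence": 0.67,
     "lift": 1.33, "min_interval_ms": 10_000, "max_interval_ms": 20_000},
    {"alarm": "cpu_high", "fault": "disk_full", "support": 0.05, "confidence": 0.2,
     "lift": 0.8, "min_interval_ms": 5_000, "max_interval_ms": 9_000},
]
strong_rules = filter_rules(rules)
result = predict(strong_rules, "cpu_high", now_ms=1_000_000)
```

In this sample the second rule is dropped because its support, confidence and lift all fall below the thresholds, so a live `cpu_high` alarm yields a single predicted fault with its expected time window.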
Fault prediction is performed using the idea of the Apriori algorithm, a statistical probability model. Historical alarm index data is first collected, and fault identification keywords are added to locate fault alarms. The temporal association between alarm indexes and fault alarms is then calculated based on the idea of the Apriori algorithm, that is, by association analysis over time slices, to obtain the association metrics between alarm indexes and fault alarms: the confidence, lift and support with which an alarm index leads to a fault, and the time range of its influence, such as the average minimum time interval, average maximum time interval and average time interval before the fault occurs. The scheme not only saves a large amount of manual labeling cost, but can also accurately predict the types of fault that may occur in the future and give a specific time range for "future", namely the above average minimum time interval, average maximum time interval and average time interval with which the alarm index leads to a fault.
Therefore, the embodiment of the invention is used for solving the following problems:
Based on the Apriori algorithm idea, the association relation between the historical alarms and the faults is analyzed, a fault prediction model service is constructed, various indexes of association degree between the alarm indexes and the fault alarms are counted, reliable index causal rules are filtered according to various filtering thresholds, and therefore fault prediction is achieved.
Fault prediction for cumulative indexes, such as CPU utilization, JVM memory overflow and disk utilization, is supported: the fault prediction model service based on the Apriori algorithm idea predicts the fault types that may occur within a certain future time period and gives early warning information.
The user may create detection ranges in different dimensions for different service views. Service alarm data of all instances within the detection range is collected, and other possible faults are predicted from the service alarms issued by the service system.
The fault prediction model is updated through periodic training and iteration, fully automatically and without manual intervention; that is, the causal rules between indexes, the corresponding association metrics (such as confidence, lift and support) and the metrics related to influence duration (such as the average minimum time interval, average maximum time interval and average time interval with which the alarm index leads to a fault) are updated, and faults within the detection range are predicted in real time.
The software instance fault prediction device provided by the invention is described below, and the software instance fault prediction device described below and the software instance fault prediction method described above can be referred to correspondingly.
Fig. 3 is a schematic structural diagram of a software instance fault prediction device provided by the present invention, referring to fig. 3, the software instance fault prediction device includes:
an acquiring unit 301, configured to acquire real-time data of an alarm indicator of a software instance;
the prediction unit 302 is configured to predict, according to the real-time data of the alarm indicator, a fault of the software instance through a fault prediction model;
the fault prediction model is obtained by training based on fault indexes of a software instance and alarm indexes associated with the fault indexes.
The software instance fault prediction device provided in this embodiment is applicable to the software instance fault prediction method provided in each embodiment, and will not be described herein.
Specifically, according to the device for predicting a software instance fault provided by the present invention, the predicting, according to the real-time data of the alarm indicator, the fault of the software instance by using a fault prediction model includes:
Comparing the real-time data with a keyword set to determine whether the alarm index is an abnormal alarm index; the keyword set comprises keywords extracted from abnormal historical alarm indexes of the software instance;
And if the alarm index is an abnormal alarm index, inputting real-time data of the alarm index into the fault prediction model to predict the fault of the software instance.
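The keyword comparison that gates the model can be sketched as below. The sketch assumes a case-insensitive substring match over the alarm's `item` and `keywords` fields; the matching rule itself is not specified in the text, and the names `is_abnormal` and `route_alarm` are hypothetical.

```python
def is_abnormal(alarm, keyword_set):
    """Return True when any historical-abnormality keyword appears in the alarm."""
    text = f'{alarm["item"]} {alarm["keywords"]}'.lower()
    return any(keyword.lower() in text for keyword in keyword_set)


def route_alarm(alarm, keyword_set, predict_fn):
    """Send only abnormal alarms to the fault prediction model."""
    if not is_abnormal(alarm, keyword_set):
        return None  # normal alarms are not passed to the model
    return predict_fn(alarm)


# Hypothetical keyword set extracted from abnormal historical alarm indexes.
keyword_set = {"cpu", "oom"}
abnormal = {"item": "CPU_high", "keywords": "alarm", "id": "app-1", "eventtime": 1000}
normal = {"item": "heartbeat", "keywords": "info", "id": "app-1", "eventtime": 2000}

# A stand-in for the fault prediction model, used only for this example.
predicted = route_alarm(abnormal, keyword_set, lambda a: f'check {a["item"]}')
skipped = route_alarm(normal, keyword_set, lambda a: f'check {a["item"]}')
```

Filtering before prediction keeps the model from being invoked for alarms that have no history of abnormality, which is the purpose of the comparison step described above.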
According to the software instance fault prediction device provided by the invention, the real-time data of the alarm index is input into the fault prediction model, and before the fault of the software instance is predicted, the device comprises:
determining the occurrence time of any fault index in the historical data of the software instance;
Acquiring a preamble alarm index in a first preset time period before the occurrence time, and acquiring a history alarm index in a second preset time period before the occurrence time; wherein the second preset time period is greater than the first preset time period;
Generating an association rule of the fault index and any preamble alarm index through the preamble alarm index in the first preset time period and the history alarm index in the second preset time period;
And establishing a fault prediction model according to the association rule.
According to the software instance fault prediction device provided by the invention, the generating of the association rule between the fault index and any preamble alarm index through the preamble alarm index in the first preset time period and the history alarm index in the second preset time period includes:
acquiring a target alarm index associated with the fault index through a preamble alarm index in the first preset time period;
Through a time slicing operation, the historical alarm indexes in the second preset time period are divided into buckets, generating historical alarm index buckets;
Calculating the association degree of the fault index and any target alarm index based on an Apriori algorithm according to the historical alarm index buckets; the association degree comprises support, confidence and lift;
And generating an association rule of the fault index and any target alarm index according to the association degree.
According to the software instance fault prediction device provided by the invention, the acquisition of the target alarm index with correlation with the fault index comprises the following steps:
Performing de-duplication on the preamble alarm index to generate a preamble alarm index set;
and taking any preamble alarm index in the preamble alarm index set as a target alarm index associated with the fault index.
According to the software instance fault prediction device provided by the invention, the generation of the association rule between the fault index and any preamble alarm index further comprises:
acquiring the time interval between the occurrence time of any target alarm index and the occurrence time of the fault index; wherein the time interval comprises: maximum time interval, minimum time interval, median time interval;
and adding the time interval into the association rule to predict the occurrence time of the fault of the software instance.
According to the software instance fault prediction device provided by the invention, after predicting the fault of the software instance, the device comprises:
if the software instance is predicted to fail, determining the failure type of the software instance failure;
marking real-time data of an alarm index of the software instance according to the fault type;
and according to the mark, storing the real-time data of the alarm index and the prediction result into a relational database to be used as a data source for displaying the prediction result of the software instance fault.
Fig. 4 illustrates a physical schematic diagram of an electronic device, as shown in fig. 4, which may include: processor 410, communication interface (Communications Interface) 420, memory 430, and communication bus 440, wherein processor 410, communication interface 420, and memory 430 communicate with each other via communication bus 440. Processor 410 may call logic instructions in memory 430 to perform a method of predicting a failure of a software instance, the method comprising: acquiring real-time data of an alarm index of a software instance; predicting the faults of the software instance through a fault prediction model according to the real-time data of the alarm indexes; the fault prediction model is obtained by training based on fault indexes of a software instance and alarm indexes associated with the fault indexes.
Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art or in a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform a method of predicting failure of a software instance provided by the methods described above, the method comprising: acquiring real-time data of an alarm index of a software instance; predicting the faults of the software instance through a fault prediction model according to the real-time data of the alarm indexes; the fault prediction model is obtained by training based on fault indexes of a software instance and alarm indexes associated with the fault indexes.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the above-provided software instance fault prediction method, the method comprising: acquiring real-time data of an alarm index of a software instance; predicting the faults of the software instance through a fault prediction model according to the real-time data of the alarm indexes; the fault prediction model is obtained by training based on fault indexes of a software instance and alarm indexes associated with the fault indexes.
The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (7)

1. A method for predicting a software instance fault, comprising:
acquiring real-time data of an alarm index of a software instance;
predicting a fault of the software instance through a fault prediction model according to the real-time data of the alarm index;
wherein the fault prediction model is obtained by training based on fault indexes of the software instance and alarm indexes associated with the fault indexes;
wherein, before the real-time data of the alarm index is input into the fault prediction model to predict the fault of the software instance, the method further comprises:
determining the occurrence time of any fault index in the historical data of the software instance;
acquiring preamble alarm indexes in a first preset time period before the occurrence time, and acquiring historical alarm indexes in a second preset time period before the occurrence time, wherein the second preset time period is longer than the first preset time period;
determining the association degree between the preamble alarm index and the fault index from the preamble alarm indexes in the first preset time period and the historical alarm indexes in the second preset time period, and generating an association rule between the fault index and any preamble alarm index according to the association degree;
establishing the fault prediction model according to the association rule;
wherein the determining of the association degree between the preamble alarm index and the fault index from the preamble alarm indexes in the first preset time period and the historical alarm indexes in the second preset time period, and the generating of the association rule between the fault index and any preamble alarm index according to the association degree, comprise:
acquiring a target alarm index associated with the fault index from the preamble alarm indexes in the first preset time period;
bucketing the historical alarm indexes in the second preset time period through a time slicing operation to generate historical alarm index buckets;
calculating, based on the Apriori algorithm, the association degree between the fault index and any target alarm index according to the historical alarm index buckets, wherein the association degree comprises support, confidence, and lift;
generating an association rule between the fault index and any target alarm index according to the association degree;
wherein the acquiring of the target alarm index associated with the fault index comprises:
de-duplicating the preamble alarm indexes to generate a preamble alarm index set;
and taking any preamble alarm index in the preamble alarm index set as a target alarm index associated with the fault index.
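The rule-building steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the event representation (timestamp, name), the `"FAULT"` label, and the thresholds in `make_rule` are all hypothetical. It computes only the pairwise support, confidence, and lift used in Apriori-style rule mining, not the full candidate-generation procedure, and it assumes that fault and alarm occurrences are placed in the same time-sliced buckets.

```python
from datetime import datetime, timedelta

def preceding_alarms(events, fault_time, window):
    """De-duplicated set of alarm names seen in [fault_time - window, fault_time)."""
    return {name for ts, name in events if fault_time - window <= ts < fault_time}

def bucketize(events, start, end, slice_len):
    """Time slicing: split [start, end) into fixed slices, one set of names per slice."""
    n = (end - start) // slice_len          # timedelta // timedelta gives an int
    buckets = [set() for _ in range(n)]
    for ts, name in events:
        if start <= ts < end:
            idx = (ts - start) // slice_len
            if idx < n:
                buckets[idx].add(name)
    return buckets

def association(buckets, alarm, fault):
    """Pairwise support, confidence, and lift of the rule alarm -> fault."""
    n = len(buckets)
    n_alarm = sum(alarm in b for b in buckets)
    n_fault = sum(fault in b for b in buckets)
    n_both = sum(alarm in b and fault in b for b in buckets)
    support = n_both / n if n else 0.0
    confidence = n_both / n_alarm if n_alarm else 0.0
    lift = confidence / (n_fault / n) if n_fault else 0.0
    return support, confidence, lift

def make_rule(alarm, fault, metrics, min_confidence=0.5, min_lift=1.0):
    """Keep the rule only if it clears the (hypothetical) thresholds."""
    support, confidence, lift = metrics
    if confidence >= min_confidence and lift > min_lift:
        return {"if": alarm, "then": fault, "support": support,
                "confidence": confidence, "lift": lift}
    return None

# Hypothetical history: 40 minutes of events, sliced into 10-minute buckets.
t0 = datetime(2021, 7, 28)
at = lambda minutes: t0 + timedelta(minutes=minutes)
events = [(at(1), "cpu_high"), (at(5), "FAULT"), (at(12), "cpu_high"),
          (at(25), "disk_full"), (at(31), "cpu_high"), (at(31), "cpu_high"),
          (at(38), "FAULT")]

targets = preceding_alarms(events, at(38), timedelta(minutes=10))
buckets = bucketize(events, t0, at(40), timedelta(minutes=10))
rules = [make_rule(a, "FAULT", association(buckets, a, "FAULT")) for a in targets]
```

In this sketch the short window plays the role of the first preset time period (candidate alarms), and the longer bucketed history plays the role of the second preset time period (statistics). The bucket width is a tuning choice: if it is too narrow, an alarm and the fault it precedes fall into different buckets and the pair is never counted.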
2. The method for predicting a software instance fault according to claim 1, wherein predicting the fault of the software instance through the fault prediction model according to the real-time data of the alarm index comprises:
comparing the real-time data with a keyword set to determine whether the alarm index is an abnormal alarm index, wherein the keyword set comprises keywords extracted from abnormal historical alarm indexes of the software instance;
and if the alarm index is an abnormal alarm index, inputting the real-time data of the alarm index into the fault prediction model to predict the fault of the software instance.
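The keyword pre-filter of claim 2 might look like the following sketch. The case-insensitive substring match, the function names, and the sample keywords are assumptions made for illustration; the claim does not specify how the comparison is performed.

```python
def is_abnormal(alarm_text, keywords):
    """True if the alarm text contains any keyword (case-insensitive substring match)."""
    text = alarm_text.lower()
    return any(kw.lower() in text for kw in keywords)

def predict_if_abnormal(alarm_text, keywords, predict):
    """Call the prediction model only for alarms that pass the keyword filter."""
    if not is_abnormal(alarm_text, keywords):
        return None          # normal alarm: the model is not consulted
    return predict(alarm_text)

# Hypothetical keyword set extracted from abnormal historical alarms.
keywords = {"OutOfMemory", "timeout"}
```

Filtering before prediction keeps routine alarms away from the model, which reduces both load and the chance of spurious predictions.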
3. The method for predicting a software instance fault according to claim 1, wherein generating the association rule between the fault index and any preamble alarm index further comprises:
acquiring a time interval between the occurrence time of any target alarm index and the occurrence time of the fault index, wherein the time interval comprises a maximum time interval, a minimum time interval, and a median time interval;
and adding the time interval to the association rule to predict the occurrence time of the fault of the software instance.
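The interval statistics of claim 3 can be sketched as follows. Times are plain numbers (for example, seconds), and pairing each fault with its nearest preceding alarm inside a window is an assumption of this sketch; the claim only requires the maximum, minimum, and median intervals. All names are hypothetical.

```python
from statistics import median

def interval_stats(alarm_times, fault_times, window):
    """Max, min, and median gap between each fault and its nearest preceding alarm."""
    gaps = []
    for f in fault_times:
        prior = [a for a in alarm_times if 0 < f - a <= window]
        if prior:
            gaps.append(f - max(prior))   # nearest alarm before the fault
    if not gaps:
        return None
    return {"max": max(gaps), "min": min(gaps), "median": median(gaps)}

def predicted_window(alarm_time, stats):
    """Earliest, typical, and latest expected fault time after a new alarm."""
    return (alarm_time + stats["min"],
            alarm_time + stats["median"],
            alarm_time + stats["max"])

# Hypothetical history, in seconds.
stats = interval_stats([100, 400, 900], [160, 520, 930], window=300)
```

Attaching these three numbers to a rule turns "this alarm tends to precede this fault" into an estimate of when the fault is likely to occur.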
4. The method for predicting a software instance fault according to claim 1, wherein, after predicting the fault of the software instance, the method further comprises:
if the software instance is predicted to fail, determining the fault type of the software instance fault;
marking the real-time data of the alarm index of the software instance according to the fault type;
and, according to the mark, storing the real-time data of the alarm index and the prediction result in a relational database to serve as a data source for displaying the prediction result of the software instance fault.
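The storage step of claim 4 can be sketched with Python's built-in `sqlite3` module standing in for the relational database. The table name, column names, and sample values are hypothetical; the claim does not name a database product or schema.

```python
import sqlite3

def store_prediction(conn, instance, alarm, fault_type, predicted_at):
    """Persist one marked alarm record together with its prediction result."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fault_predictions ("
        "instance TEXT, alarm TEXT, fault_type TEXT, predicted_at TEXT)"
    )
    conn.execute(
        "INSERT INTO fault_predictions VALUES (?, ?, ?, ?)",
        (instance, alarm, fault_type, predicted_at),
    )
    conn.commit()

def predictions_by_type(conn, fault_type):
    """Rows for one fault type, e.g. as the data source of a dashboard."""
    cur = conn.execute(
        "SELECT instance, alarm, predicted_at FROM fault_predictions "
        "WHERE fault_type = ? ORDER BY predicted_at",
        (fault_type,),
    )
    return cur.fetchall()

conn = sqlite3.connect(":memory:")   # in-memory database for the example
store_prediction(conn, "app-01", "cpu_high", "overload", "2021-07-28T00:38:00")
store_prediction(conn, "app-02", "disk_full", "storage", "2021-07-28T00:40:00")
```

Using the fault type as the mark means the display layer can filter or group predictions with a simple query rather than re-running the model.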
5. A software instance fault prediction apparatus, comprising:
an acquisition unit, configured to acquire real-time data of an alarm index of a software instance;
a prediction unit, configured to predict a fault of the software instance through a fault prediction model according to the real-time data of the alarm index;
wherein the fault prediction model is obtained by training based on fault indexes of the software instance and alarm indexes associated with the fault indexes;
wherein, before the real-time data of the alarm index is input into the fault prediction model to predict the fault of the software instance, the following steps are performed:
determining the occurrence time of any fault index in the historical data of the software instance;
acquiring preamble alarm indexes in a first preset time period before the occurrence time, and acquiring historical alarm indexes in a second preset time period before the occurrence time, wherein the second preset time period is longer than the first preset time period;
determining the association degree between the preamble alarm index and the fault index from the preamble alarm indexes in the first preset time period and the historical alarm indexes in the second preset time period, and generating an association rule between the fault index and any preamble alarm index according to the association degree;
establishing the fault prediction model according to the association rule;
wherein the determining of the association degree between the preamble alarm index and the fault index from the preamble alarm indexes in the first preset time period and the historical alarm indexes in the second preset time period, and the generating of the association rule between the fault index and any preamble alarm index according to the association degree, comprise:
acquiring a target alarm index associated with the fault index from the preamble alarm indexes in the first preset time period;
bucketing the historical alarm indexes in the second preset time period through a time slicing operation to generate historical alarm index buckets;
calculating, based on the Apriori algorithm, the association degree between the fault index and any target alarm index according to the historical alarm index buckets, wherein the association degree comprises support, confidence, and lift;
generating an association rule between the fault index and any target alarm index according to the association degree;
wherein the acquiring of the target alarm index associated with the fault index comprises:
de-duplicating the preamble alarm indexes to generate a preamble alarm index set;
and taking any preamble alarm index in the preamble alarm index set as a target alarm index associated with the fault index.
6. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the software instance fault prediction method according to any one of claims 1 to 4.
7. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the steps of the software instance fault prediction method of any of claims 1 to 4.
CN202110860029.7A 2021-07-28 2021-07-28 Method and device for predicting software instance faults, electronic equipment and storage medium Active CN113656287B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110860029.7A CN113656287B (en) 2021-07-28 2021-07-28 Method and device for predicting software instance faults, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110860029.7A CN113656287B (en) 2021-07-28 2021-07-28 Method and device for predicting software instance faults, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113656287A CN113656287A (en) 2021-11-16
CN113656287B true CN113656287B (en) 2024-06-04

Family

ID=78490830

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110860029.7A Active CN113656287B (en) 2021-07-28 2021-07-28 Method and device for predicting software instance faults, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113656287B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114637656B (en) * 2022-05-13 2022-09-20 飞狐信息技术(天津)有限公司 Redis-based monitoring method and device, storage medium and equipment

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107918629A (en) * 2016-10-11 2018-04-17 北京神州泰岳软件股份有限公司 The correlating method and device of a kind of alarm failure
CN108304941A (en) * 2017-12-18 2018-07-20 中国软件与技术服务股份有限公司 A kind of failure prediction method based on machine learning
CN109358602A (en) * 2018-10-23 2019-02-19 山东中创软件商用中间件股份有限公司 A kind of failure analysis methods, device and relevant device
CN110166297A (en) * 2019-05-22 2019-08-23 平安信托有限责任公司 O&M method, system, equipment and computer readable storage medium
CN110300011A (en) * 2018-03-23 2019-10-01 ***通信集团有限公司 A kind of alarm root is because of localization method, device and computer readable storage medium
CN110474799A (en) * 2019-07-31 2019-11-19 中国联合网络通信集团有限公司 Fault Locating Method and device
CN110503247A (en) * 2019-08-01 2019-11-26 中国科学院深圳先进技术研究院 Alarm of telecommunication network prediction technique and system
CN110851342A (en) * 2019-11-08 2020-02-28 中国工商银行股份有限公司 Fault prediction method, device, computing equipment and computer readable storage medium
CN112446511A (en) * 2020-11-20 2021-03-05 中国建设银行股份有限公司 Fault handling method, device, medium and equipment
CN112637132A (en) * 2020-12-01 2021-04-09 北京邮电大学 Network anomaly detection method and device, electronic equipment and storage medium

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107918629A (en) * 2016-10-11 2018-04-17 北京神州泰岳软件股份有限公司 The correlating method and device of a kind of alarm failure
CN108304941A (en) * 2017-12-18 2018-07-20 中国软件与技术服务股份有限公司 A kind of failure prediction method based on machine learning
CN110300011A (en) * 2018-03-23 2019-10-01 ***通信集团有限公司 A kind of alarm root is because of localization method, device and computer readable storage medium
CN109358602A (en) * 2018-10-23 2019-02-19 山东中创软件商用中间件股份有限公司 A kind of failure analysis methods, device and relevant device
CN110166297A (en) * 2019-05-22 2019-08-23 平安信托有限责任公司 O&M method, system, equipment and computer readable storage medium
CN110474799A (en) * 2019-07-31 2019-11-19 中国联合网络通信集团有限公司 Fault Locating Method and device
CN110503247A (en) * 2019-08-01 2019-11-26 中国科学院深圳先进技术研究院 Alarm of telecommunication network prediction technique and system
CN110851342A (en) * 2019-11-08 2020-02-28 中国工商银行股份有限公司 Fault prediction method, device, computing equipment and computer readable storage medium
CN112446511A (en) * 2020-11-20 2021-03-05 中国建设银行股份有限公司 Fault handling method, device, medium and equipment
CN112637132A (en) * 2020-12-01 2021-04-09 北京邮电大学 Network anomaly detection method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113656287A (en) 2021-11-16

Similar Documents

Publication Publication Date Title
CN113282461B (en) Alarm identification method and device for transmission network
EP3105644B1 (en) Method of identifying anomalies
CN116450399B (en) Fault diagnosis and root cause positioning method for micro service system
CN114465874B (en) Fault prediction method, device, electronic equipment and storage medium
CN109992484B (en) Network alarm correlation analysis method, device and medium
CN112491611A (en) Fault location system, method, apparatus, electronic device and computer readable medium
CN113037575B (en) Network element abnormal root cause positioning method and device, electronic equipment and storage medium
CN116680113B (en) Equipment detection implementation control system
CN115858794B (en) Abnormal log data identification method for network operation safety monitoring
CN115392812B (en) Abnormal root cause positioning method, device, equipment and medium
CN113656287B (en) Method and device for predicting software instance faults, electronic equipment and storage medium
Fullen et al. Semi-supervised case-based reasoning approach to alarm flood analysis
CN111767193A (en) Server data anomaly detection method and device, storage medium and equipment
CN113535458B (en) Abnormal false alarm processing method and device, storage medium and terminal
CN115640543A (en) Fault occurrence time accurate positioning method based on machine learning algorithm
CN115600695A (en) Fault diagnosis method of metering equipment
CN113825162B (en) Method and device for positioning fault reasons of telecommunication network
CN114416417A (en) System abnormity monitoring method, device, equipment and storage medium
CN114881112A (en) System anomaly detection method, device, equipment and medium
US11243937B2 (en) Log analysis apparatus, log analysis method, and log analysis program
CN114564391A (en) Method and device for determining test case, storage medium and electronic equipment
CN113778875A (en) System test defect classification method, device, equipment and storage medium
CN117708720B (en) Equipment fault diagnosis system based on knowledge graph
RU2777950C1 (en) Detection of emergency situations for predictive maintenance and determination of end results and technological processes based on the data quality
Hu et al. Research on application of equipment fault diagnosis technology based on FTA

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant