CN113420422B - Alarm log proportion prediction method, system, device and medium - Google Patents

Info

Publication number
CN113420422B
Authority
CN
China
Prior art keywords
module
sequence
value
predicted
proportion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110599920.XA
Other languages
Chinese (zh)
Other versions
CN113420422A (en)
Inventor
王崇娇
杨虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Jinan data Technology Co ltd
Original Assignee
Inspur Jinan data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Jinan data Technology Co ltd filed Critical Inspur Jinan data Technology Co ltd
Priority to CN202110599920.XA priority Critical patent/CN113420422B/en
Publication of CN113420422A publication Critical patent/CN113420422A/en
Application granted granted Critical
Publication of CN113420422B publication Critical patent/CN113420422B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/1805Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815Journaling file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Abstract

The invention discloses an alarm log proportion prediction method comprising the following steps: acquiring the alarm logs generated by a module to be predicted in each of a plurality of unit times, and the logs generated by all modules in each of those unit times, to obtain the proportion of alarm logs generated by the module to be predicted in each unit time; forming the proportions into a proportion sequence and establishing an ARIMA model from that sequence; acquiring a plurality of influence factor sequences for each other module and determining a state prediction model for each other module from those sequences; and predicting, from the state prediction models and the ARIMA model, the proportion of alarm logs that the module to be predicted will generate in the next unit time. The invention also discloses a system, a computer device and a readable storage medium. The scheme considers both the regularity of the module to be predicted itself and the mutual influence among all modules of the server, and so predicts the alarm log proportion for time period T+1 more accurately.

Description

Alarm log proportion prediction method, system, device and medium
Technical Field
The invention relates to the field of prediction, in particular to a method, a system, equipment and a storage medium for predicting the proportion of an alarm log.
Background
With the development of computer network technology, daily life and work increasingly depend on computers, which highlights the importance of a stable and secure data center. Business communication among the servers of a data center and the maintenance of its stability are critical, and the stability of the data center presupposes that each server is as safe and stable as possible; the alarm state of each server module is therefore particularly important. Existing alarm state analysis is based mainly on the current number of alarm logs. This approach reflects the state of a server module fairly accurately, but it is not timely, it can generally analyze only a single server module, and it ignores the relationships among the modules.
Disclosure of Invention
In view of this, in order to overcome at least one aspect of the above problem, an embodiment of the present invention provides an alarm log proportion prediction method, including the following steps:
acquiring alarm logs respectively generated by a module to be predicted in a plurality of unit times and acquiring logs respectively generated by all modules in the plurality of unit times to obtain the proportion of the alarm logs generated by the module to be predicted in each unit time;
forming a proportion sequence by the plurality of proportions and establishing an ARIMA model by utilizing the proportion sequence;
acquiring a plurality of influence factor sequences corresponding to each other module and determining a state prediction model of each other module according to the plurality of influence factor sequences;
and predicting the proportion of the alarm log generated by the module to be predicted in the next unit time according to the state prediction model and the ARIMA model.
In some embodiments, forming the plurality of proportions into a proportion sequence and establishing an ARIMA model using the proportion sequence further comprises:
performing successive orders of differencing on the proportion sequence until a stationary differenced proportion sequence is obtained, and recording the current order d;
applying an ACF function to the differenced proportion sequence to determine the parameter p and the autocorrelation coefficients of the ARIMA model, and applying a PACF function to the differenced proportion sequence to determine the parameter q and the partial autocorrelation coefficients of the ARIMA model;
and constructing the ARIMA model using the order d, the parameter p, the parameter q, the autocorrelation coefficients, the partial autocorrelation coefficients and the proportion sequence.
In some embodiments, constructing the ARIMA model using the order d, the parameter p, the parameter q, the autocorrelation coefficients, the partial autocorrelation coefficients and the proportion sequence further comprises constructing the ARIMA model according to the following equations:
$$\varphi(B)\,\nabla^{d} W_t = \theta(B)\, f_t$$
$$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$$
$$\varphi(B) = 1 - \varphi_1 B - \varphi_2 B^2 - \cdots - \varphi_p B^p$$
$$\nabla^{d} = (1 - B)^{d}$$
wherein $W_t$ is the proportion sequence; $B$ is the delay operator; $\theta_1, \theta_2, \theta_3, \ldots, \theta_q$ are the partial autocorrelation coefficients; $\varphi_1, \varphi_2, \ldots, \varphi_p$ are the autocorrelation coefficients; and $f_t$ is the error.
In some embodiments, acquiring a plurality of influence factor sequences corresponding to each other module further comprises:
acquiring, for each other module, the sequence of numbers of abnormalities occurring per unit time $\{AN_1, AN_2, \ldots, AN_n\}$, the sequence of module historical life values $\{MT_1, MT_2, \ldots, MT_n\}$, and the sequence of interval durations between adjacent abnormal conditions $\{IT_1, IT_2, \ldots, IT_n\}$;
wherein n is the total number of other modules.
In some embodiments, determining the state prediction model for each of the other modules from the plurality of sequences of impact factors further comprises:
calculating, according to $Y = m(X_i) + \varepsilon$, the state value corresponding to each element in each influence factor sequence, where $m(\cdot)$ is the regression function, $\varepsilon$ is the error perturbation term, and $X_i$ is each element in each influence factor sequence; according to
Figure DEST_PATH_IMAGE002
calculating the state prediction value when each influence factor takes the value x, and taking
Figure BDA0003092383870000032
as the corresponding state prediction model for each other module; wherein $X_i$ and $X_j$ are respectively the i-th and j-th elements in the abnormality count sequence, the module historical life value sequence or the sequence of interval durations between adjacent abnormal conditions, $Y_i$ is the state value corresponding to $X_i$, $K$ is a second-order Gaussian kernel function,
Figure BDA0003092383870000033
is the state prediction value when the number of abnormalities per unit time takes the value $x_{an}$,
Figure BDA0003092383870000034
is the state prediction value when the module historical life value takes the value $x_{mt}$, and
Figure BDA0003092383870000035
is the state prediction value when the interval duration between adjacent abnormal conditions takes the value $x_{it}$.
In some embodiments, predicting a proportion of an alarm log generated by the module to be predicted in a next unit time according to the state prediction model and the ARIMA model, further comprises:
calculating the correlation between the state value of each other module predicted by the state prediction model and the differenced proportion sequence;
taking each state value whose correlation is greater than a threshold as a linear parameter, and each state value whose correlation is not greater than the threshold as a nonlinear parameter;
calculating a correction value according to $U^{T}\beta + m(T) + \varepsilon$, wherein $U = (U_1, \ldots, U_a)^{T}$, $U_1, \ldots, U_a$ represent the a linear parameters, $T = (T_1, \ldots, T_b)^{T}$, $T_1, \ldots, T_b$ represent the b nonlinear parameters, $\beta$ is a coefficient, and $\varepsilon$ is the error perturbation term.
In some embodiments, the method further comprises predicting the proportion using:
Figure BDA0003092383870000036
wherein
Figure BDA0003092383870000041
is the differenced proportion sequence, $\alpha_t, \alpha_{t-1}, \ldots, \alpha_{t-p}$ are
Figure BDA0003092383870000042
and $\beta_{t-1}, \ldots, \beta_{t-q}$ are $\theta_2 B^2, \ldots, \theta_q B^q$.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides an alarm log proportion prediction system, including:
the acquisition module is configured to acquire alarm logs respectively generated by the module to be predicted in a plurality of unit times and acquire logs respectively generated by all the modules in the plurality of unit times so as to obtain the proportion of the alarm logs generated by the module to be predicted in each unit time;
the first model building module is configured to form the plurality of proportions into a proportion sequence and build an ARIMA model using the proportion sequence;
the second model building module is configured to acquire a plurality of influence factor sequences corresponding to each other module and determine a state prediction model of each other module according to the plurality of influence factor sequences;
and the prediction module is configured to predict the proportion of the alarm log generated by the module to be predicted in the next unit time according to the state prediction model and the ARIMA model.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides a computer apparatus, including:
at least one processor; and
a memory storing a computer program operable on the processor, wherein the processor executes the program to perform any of the steps of the alarm log proportion prediction method described above.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of any of the alarm log proportion prediction methods described above.
The invention has at least one of the following beneficial technical effects: the scheme is based on system logs and on three influence factor data sets, namely the historical service life of each module, the number of abnormalities occurring per unit time, and the interval duration between adjacent abnormal conditions. It considers both the regularity of the module to be predicted itself and the mutual influence among all modules of the server: an NPM factor system is added to an ARIMA main system to establish an SPM correction-combination prediction system that predicts the alarm log proportion of the module to be predicted. Thus, from the system logs collected by ISREST, an ARIMA model and an SPM model are established based on the module's own regularity and the influence of the other factor modules on it. The alarm log proportion of the module to be predicted in time period T+1 is predicted accurately and mapped to an alarm level, which is provided to the server's operation and maintenance personnel as a timely and accurate early judgment so that greater losses can be avoided.
Drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other embodiments from these drawings without creative effort.
FIG. 1 is a schematic flow chart of an alarm log proportion prediction method according to an embodiment of the present invention;
FIG. 2 is an architecture diagram of an alarm log proportion prediction method according to an embodiment of the present invention;
FIG. 3 is a block diagram of a data processing system according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an alarm log proportion prediction system according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a computer device according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two entities that share a name but are not the same entity, or two different parameters. "First" and "second" are used merely for convenience of description, should not be construed as limiting the embodiments of the present invention, and are not explained again in the following embodiments.
According to an aspect of the present invention, an embodiment of the present invention provides an alarm log proportion prediction method, as shown in fig. 1, which may include the steps of:
S1, acquiring the alarm logs respectively generated by a module to be predicted in a plurality of unit times and acquiring the logs respectively generated by all modules in the plurality of unit times, to obtain the proportion of alarm logs generated by the module to be predicted in each unit time;
S2, forming the plurality of proportions into a proportion sequence and establishing an ARIMA model using the proportion sequence;
S3, acquiring a plurality of influence factor sequences corresponding to each other module and determining a state prediction model of each other module according to the plurality of influence factor sequences;
and S4, predicting the proportion of alarm logs generated by the module to be predicted in the next unit time according to the state prediction model and the ARIMA model.
In the scheme provided by the invention, the alarm state of the module to be predicted is predicted from the module's system alarm log proportion and from the alarm state influence factor data sets of the other factor modules of the server. Taking a relative variable, the share of the module's system alarm logs among all logs, as the main analysis factor makes more objective use of the log data. The relationship between the alarm log proportion and its own historical values is also considered: an Autoregressive Integrated Moving Average (ARIMA) model is established after analyzing the independence and stationarity of the data, and future values are predicted from the historical values. Because the modules of a server influence one another and cannot be separated, the module to be predicted is affected not only by its own history but also by the other factor modules of the server. The residual sequence of the ARIMA model's predicted values is therefore corrected using three influence factors of the other factor modules: historical service life, the number of abnormalities occurring per unit time, and the interval duration between adjacent abnormal conditions. This allows the alarm state of the module to be predicted to be predicted more accurately. It should be noted that the user can change which module is the module to be predicted and which are the other factor modules as needed.
In the embodiment of the invention, starting from the time-series proportion of server system alarm logs, the influence factors of module historical service life, number of abnormalities per unit time, and interval duration between adjacent abnormal conditions are added, and an ARIMA model and an SPM model are established to predict the alarm state of the module to be predicted. First, system logs are collected in-band through a server REST tool (ISREST) to obtain the number of warning-level logs and the total number of logs for a plurality of modules such as CPU, DISK, DRIVER, GPU, HBA, MEMORY, NIC, RAID, SYSTEM and FAN. The proportion of warning-level logs of the module to be predicted in each unit time is calculated, and the proportions are arranged in time order to establish an ARIMA model as the main system, which yields a preliminary predicted value of the dependent variable for time period T+1. The modules of a server may be strongly associated with one another, or may be independent with little influence on one another. The alarm state of the module to be predicted is thus related not only to its own historical data but also to the other modules, and because the linearity of the relationship between those modules and the module to be predicted is uncertain, an SPM model is established on top of the main model to correct the error and improve prediction accuracy. The prediction result is provided to the server's operation and maintenance personnel, who can inspect the module to be predicted in advance and avoid more serious losses.
In some embodiments, as shown in FIG. 2, the modular prediction approach of the ARIMA model and the SPM model may be implemented using a data collection system, a data processing system, an ARIMA host system, an NPM factor system, and an SPM correction-combination prediction system.
In some embodiments, the data collection system may use the in-band log collection function of the server management software ISREST to obtain the logs of a plurality of server modules such as CPU, DISK, DRIVER, GPU, HBA, MEMORY, NIC, RAID, SYSTEM and FAN. In general, the more warning-level logs a module has, the more serious its alarm state is considered to be. However, the raw count is affected by non-uniform log collection times and by instability in the numbers of the various logs, so the present invention takes the proportion of warning-level logs of the module to be predicted as the main object of study. One of the modules is taken as the module to be predicted $M_W$, and the other server modules are taken as factor modules; there are n factor modules $\{M_1, M_2, \ldots, M_n\}$. From the large number of logs, the number of warning-level logs of the module to be predicted $N_W$ and the total number of logs of all modules $N_{sum}$ are extracted, and the variables module historical life $\{MT_1, MT_2, \ldots, MT_n, MT_{n+1}\}$, number of abnormalities occurring per unit time $\{AN_1, AN_2, \ldots, AN_n, AN_{n+1}\}$ and interval duration between adjacent abnormal conditions $\{IT_1, IT_2, \ldots, IT_n, IT_{n+1}\}$ are selected as influence factors of the factor module state.
In some embodiments, as shown in FIG. 3, the data processing system may use the data set from the data collection system to calculate, along the time axis, the proportion of warning-level logs of the module to be predicted, which serves as the input parameter of the ARIMA main system. The historical service life of each factor module, the number of abnormalities per unit time, and the interval duration between adjacent abnormal conditions serve as input parameters of the NPM factor system. The output parameters of the ARIMA main system and the NPM factor system serve as input parameters of the SPM correction-combination prediction system.
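The proportion calculation described here can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `proportion_sequence`, the sample counts, and the choice to return 0 for a unit time with no logs are all assumptions made for the example.

```python
from typing import List, Sequence


def proportion_sequence(warning_counts: Sequence[int],
                        total_counts: Sequence[int]) -> List[float]:
    """Share of the target module's warning-level logs among all logs, per unit time."""
    if len(warning_counts) != len(total_counts):
        raise ValueError("both series must cover the same unit times")
    ratios = []
    for n_w, n_sum in zip(warning_counts, total_counts):
        # Assumption: a unit time with no logs at all contributes a proportion of 0.
        ratios.append(n_w / n_sum if n_sum > 0 else 0.0)
    return ratios


# Hypothetical counts for four consecutive unit times.
n_w = [5, 8, 6, 12]          # warning-level logs of the module to be predicted
n_sum = [100, 160, 100, 200]  # logs of all modules
w = proportion_sequence(n_w, n_sum)
print(w)
```

The resulting list, ordered along the time axis, plays the role of the proportion sequence fed to the ARIMA main system.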
In some embodiments, in-band log collection gathers the required logs more directly and more comprehensively than out-of-band collection. The proportion of alarm logs correlates strongly with the time axis; that is, the sequence follows a certain regularity, and future data are influenced by historical data and by error perturbation terms. Based on this characteristic, an ARIMA model can be established from the historical data set to predict the system alarm log proportion for the next time period.
ARIMA modeling requires the time series to be stationary, that is, the statistical characteristics that determine the sequence must not change over time; a non-stationary sequence can be converted into a stationary one by differencing.
In some embodiments, in step S2, forming the plurality of proportions into a proportion sequence and establishing an ARIMA model using the proportion sequence further comprises:
performing successive orders of differencing on the proportion sequence until a stationary differenced proportion sequence is obtained, and recording the current order d;
applying an ACF function to the differenced proportion sequence to determine the parameter p and the autocorrelation coefficients of the ARIMA model, and applying a PACF function to the differenced proportion sequence to determine the parameter q and the partial autocorrelation coefficients of the ARIMA model;
and constructing the ARIMA model using the order d, the parameter p, the parameter q, the autocorrelation coefficients, the partial autocorrelation coefficients and the proportion sequence.
Specifically, after each order of differencing, the stationarity of the data can be judged subjectively from a line chart of the data. If the data as a whole show no upward or downward trend, and no local subset of the data in any region is observed to be obviously affected by time, the data after the current order of differencing are considered stationary. The stationary sequence (i.e., the differenced proportion sequence) is denoted
Figure BDA0003092383870000081
The differencing process may be:
Figure BDA0003092383870000082
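Successive differencing can be sketched as below. The helper name `difference` and the sample values are hypothetical; the sketch only produces the differenced series for each order so that it can be plotted and inspected, since the text judges stationarity from a line chart rather than by a formal test.

```python
from typing import List, Sequence


def difference(series: Sequence[float], order: int = 1) -> List[float]:
    """Apply first-order differencing `order` times: x_t - x_(t-1)."""
    out = list(series)
    for _ in range(order):
        # Each pass shortens the series by one element.
        out = [b - a for a, b in zip(out, out[1:])]
    return out


# Hypothetical proportion sequence with an accelerating upward trend.
w = [0.125, 0.25, 0.5, 0.875]
d1 = difference(w, 1)  # still trending upward
d2 = difference(w, 2)  # constant, so no visible trend remains
print(d1, d2)
```

In this toy series the second-order difference is flat, so an order of d = 2 would be recorded under the visual criterion described above.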
Then, the parameters p and q, the autocorrelation coefficients and the partial autocorrelation coefficients of the ARIMA model can be calculated by applying the ACF function and the PACF function to the differenced proportion sequence.
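A minimal sketch of the two functions follows, using the sample autocorrelation and the Durbin-Levinson recursion for the partial autocorrelation. The function names and the sample series are assumptions for illustration. The sketch only computes the two functions; how their cut-off lags map to p and q follows the text above, while common Box-Jenkins practice reads the autoregressive order from the PACF and the moving-average order from the ACF.

```python
from typing import List, Sequence


def acf(x: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelation r_0..r_max_lag of a series."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    if denom == 0.0:
        raise ValueError("a constant series has no defined autocorrelation")
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]


def pacf(x: Sequence[float], max_lag: int) -> List[float]:
    """Partial autocorrelation at lags 0..max_lag via Durbin-Levinson."""
    r = acf(x, max_lag)
    result = [1.0]
    prev: List[float] = []  # prev[j] holds phi_(k-1, j+1)
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result


# Hypothetical differenced proportion sequence.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
a = acf(x, 2)
p = pacf(x, 2)
print(a, p)
```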
Based on the warning-level alarm log proportion sequence $\{W_1, W_2, \ldots, W_t\}$ of the module to be predicted, processed as above, an ARIMA(p, d, q) model is established:
$$\varphi(B)\,\nabla^{d} W_t = \theta(B)\, f_t$$
$$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$$
$$\varphi(B) = 1 - \varphi_1 B - \varphi_2 B^2 - \cdots - \varphi_p B^p$$
$$\nabla^{d} = (1 - B)^{d}$$
wherein $W_t$ is the proportion sequence; $B$ is the delay operator; $\theta_1, \theta_2, \theta_3, \ldots, \theta_q$ are the partial autocorrelation coefficients; $\varphi_1, \varphi_2, \ldots, \varphi_p$ are the autocorrelation coefficients; and $f_t$ is the error.
Therefore, the ARIMA main system analyzes the proportion sequence of the server module to be predicted and yields a predicted value that reflects the influence of the module's own regularity.
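A one-step forecast from such a model can be sketched as follows. This is an illustration under stated assumptions: it takes the standard ARIMA form, in which an autoregressive polynomial in B applied to the d-th difference of W_t equals θ(B) applied to the error f_t, and it treats the coefficients and past errors as given inputs. The function name `forecast_next` and all numeric values are hypothetical.

```python
from typing import List, Sequence


def forecast_next(w: Sequence[float], d: int,
                  phi: Sequence[float], theta: Sequence[float],
                  residuals: Sequence[float]) -> float:
    """One-step-ahead forecast of the proportion for time period T+1.

    phi[i]    multiplies the differenced value i+1 steps back (AR part).
    theta[j]  multiplies the past error j+1 steps back (MA part, minus sign
              as in theta(B) = 1 - theta_1 B - ...).
    """
    # levels[k] is the k-th order difference of w.
    levels: List[List[float]] = [list(w)]
    for _ in range(d):
        prev = levels[-1]
        levels.append([b - a for a, b in zip(prev, prev[1:])])
    x = levels[-1]
    ar = sum(phi[i] * x[-1 - i] for i in range(len(phi)))
    ma = sum(theta[j] * residuals[-1 - j] for j in range(len(theta)))
    x_next = ar - ma  # the future error is taken as 0 in expectation
    # Undo the differencing one level at a time.
    for k in range(d - 1, -1, -1):
        x_next = levels[k][-1] + x_next
    return x_next


# Hypothetical ARIMA(1, 1, 1) inputs.
w_hist = [0.125, 0.25, 0.5]
w_next = forecast_next(w_hist, d=1, phi=[0.5], theta=[0.5], residuals=[0.125])
print(w_next)
```

With these inputs the last difference is 0.25, so the forecast difference is 0.5 × 0.25 − 0.5 × 0.125 = 0.0625, which is added back to the last observed proportion.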
In some embodiments, the NPM factor system may be utilized to obtain the status value for each of the other factor modules.
Acquiring a plurality of influence factor sequences corresponding to each other module further comprises:
acquiring, for each other module, the sequence of numbers of abnormalities occurring per unit time $\{AN_1, AN_2, \ldots, AN_n\}$, the sequence of module historical life values $\{MT_1, MT_2, \ldots, MT_n\}$, and the sequence of interval durations between adjacent abnormal conditions $\{IT_1, IT_2, \ldots, IT_n\}$;
wherein n is the total number of other modules.
Then a Non-Parametric model, $Y = m(X_i) + \varepsilon$, is used to calculate the state value corresponding to each element in each influence factor sequence, where $m(\cdot)$ is the regression function, $\varepsilon$ is the error perturbation term, and $X_i$ is each element in each influence factor sequence. Because a predicted value necessarily differs from the true value, the equation is written rigorously with $\varepsilon$ marking that difference, but $\varepsilon$ need not be considered when data are substituted in.
Thus the state value $Y_i$ corresponding to each element $X_i$ in each influence factor sequence of each other factor module is obtained.
Then, according to
Figure BDA0003092383870000101
the state prediction value is calculated when each influence factor takes the value x (x can be $AN_{n+1}$, $MT_{n+1}$ or $IT_{n+1}$), and
Figure BDA0003092383870000102
is taken as the corresponding state prediction model for each other module; wherein $X_i$ and $X_j$ are respectively the i-th and j-th elements in the abnormality count sequence, the module historical life value sequence or the sequence of interval durations between adjacent abnormal conditions, $Y_i$ is the state value corresponding to $X_i$, $K$ is a second-order Gaussian kernel function,
Figure BDA0003092383870000103
is the state prediction value when the number of abnormalities per unit time takes the value $x_{an}$,
Figure BDA0003092383870000104
is the state prediction value when the module historical life value takes the value $x_{mt}$, and
Figure BDA0003092383870000105
is the state prediction value when the interval duration between adjacent abnormal conditions takes the value $x_{it}$.
For example, taking the first other factor module as an example, x takes the values $AN_{n+1}$, $MT_{n+1}$ and $IT_{n+1}$ to obtain the corresponding state values
Figure BDA0003092383870000106
The error perturbation terms are denoted $\varepsilon_{11}$, $\varepsilon_{12}$ and $\varepsilon_{13}$ respectively. As before, because a predicted value necessarily differs from the true value, the formula is written rigorously with $\varepsilon_{11}$, $\varepsilon_{12}$ and $\varepsilon_{13}$ marking the differences, but they need not be considered when data are substituted in.
The three state values are then averaged to obtain the state prediction value of the first other factor module.
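The per-factor estimate and the averaging step can be sketched as below. The kernel-weighted average used here (a Nadaraya-Watson style estimator with a standard Gaussian kernel) is an assumption about the estimator, since the formula itself is only referenced as a figure; the `bandwidth` parameter, the function names and all sample values are likewise hypothetical.

```python
import math
from typing import List, Sequence


def gaussian_kernel(u: float) -> float:
    """Standard (second-order) Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_estimate(x: float, xs: Sequence[float], ys: Sequence[float],
                    bandwidth: float = 1.0) -> float:
    """Kernel-weighted average of the state values ys, evaluated at x."""
    weights = [gaussian_kernel((x - xi) / bandwidth) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("x is too far from every observation for this bandwidth")
    return sum(wt * y for wt, y in zip(weights, ys)) / total


def module_state(estimates: Sequence[float]) -> float:
    """Average the per-factor state values into one module state value."""
    return sum(estimates) / len(estimates)


# Hypothetical history for one factor module.
an_hist = [1.0, 2.0, 3.0]      # abnormalities per unit time
mt_hist = [10.0, 20.0, 30.0]   # historical life values
it_hist = [5.0, 4.0, 3.0]      # intervals between adjacent abnormal conditions
state_hist = [0.2, 0.4, 0.6]   # state values paired with each history entry

estimates: List[float] = [
    kernel_estimate(2.0, an_hist, state_hist),
    kernel_estimate(20.0, mt_hist, state_hist, bandwidth=10.0),
    kernel_estimate(4.0, it_hist, state_hist),
]
state_value = module_state(estimates)
print(state_value)
```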
In some embodiments, the ARIMA main system mainly analyzes the influence of the self rule of the module to be predicted of the server on future values, and predicts the proportion of the quantity of the warming level logs of the module to be predicted in the T +1 time period according to historical data; the SNP factor system mainly considers the influence of other factor modules of the server on the module to be predicted, selects three influence factors of the historical service life of the module, the abnormal times appearing in unit time and the interval duration of adjacent abnormal conditions, and establishes a state value of the factor module estimated by a Non-Parametric model and an averaging mode.
And constructing an SPM correction-combined prediction system based on the two model systems, further analyzing the residual error of the SPM correction-combined prediction system on the basis of the model output data of the ARIMA system, and correcting the prediction of the ratio of the ARIMA model to the prediction module warming log. Because the linear correlation relation of the influence of each factor module of the server on the to-be-predicted module is uncertain, firstly, the correlation coefficient of each factor module and the output residual error of the ARIMA system is calculated, and the correlation coefficient of the state value of each factor module and the output residual error sequence (difference ratio sequence) of the ARIMA system is determined by a correlation calculation method. Taking the state value with the correlation larger than the threshold as a linear parameter, taking the state value with the correlation not larger than the threshold as a nonlinear parameter, further obtaining a linear parameters and b nonlinear parameters, and then establishing a Semi-parameter model of the factor module and the residual sequence, wherein the Semi-parameter model meets the following requirements:
$Y = U^{T}\beta + m(T) + \varepsilon$
where $U = (U_1, \ldots, U_a)^{T}$, with $U_1, \ldots, U_a$ representing the $a$ linear parameters; $T = (T_1, \ldots, T_b)^{T}$, with $T_1, \ldots, T_b$ representing the $b$ nonlinear parameters; $\beta$ is a coefficient; and $\varepsilon$ is an error disturbance term.
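The split into linear and nonlinear parameters can be illustrated with a short Python sketch. It assumes the Pearson correlation coefficient as the correlation calculation method and compares its absolute value with the threshold; both choices, along with the function names and the sample module names, are assumptions made for illustration and are not specified in the text.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def split_parameters(
    state_values: Dict[str, Sequence[float]],
    residuals: Sequence[float],
    threshold: float,
) -> Tuple[List[str], List[str]]:
    """Classify each factor module's state series as linear or nonlinear.

    A series whose |correlation| with the ARIMA residual sequence exceeds
    the threshold is treated as a linear parameter (a component of U);
    otherwise it is treated as a nonlinear parameter (a component of T).
    """
    linear: List[str] = []
    nonlinear: List[str] = []
    for name, series in state_values.items():
        if abs(pearson(series, residuals)) > threshold:
            linear.append(name)
        else:
            nonlinear.append(name)
    return linear, nonlinear


# Hypothetical data: "cpu" tracks the residuals, "fan" does not.
linear, nonlinear = split_parameters(
    {"cpu": [2.0, 4.0, 6.0, 8.0], "fan": [1.0, -1.0, -1.0, 1.0]},
    residuals=[1.0, 2.0, 3.0, 4.0],
    threshold=0.5,
)
```

The lengths of `linear` and `nonlinear` play the roles of a and b in the model above.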
In some embodiments, a Local Polynomial Regression method can be applied to obtain an estimate of the Non-Parametric model coefficient β. Combining this with the ARIMA system's prediction of the warning-log proportion of the module to be predicted, and using the differential warning-log ratio sequence of the module to be predicted
Figure BDA0003092383870000111
the error-corrected combined predicted value is obtained as follows and mapped to the corresponding module alarm level:
Figure BDA0003092383870000112
where
Figure BDA0003092383870000113
is the differential ratio sequence, $\alpha_t, \alpha_{t-1}, \ldots, \alpha_{t-p}$ are
Figure BDA0003092383870000114
and $\beta_{t-1}, \ldots, \beta_{t-q}$ are $\theta_2 B^2, \ldots, \theta_q B^q$.
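Since the formula images are not available in this text, the following Python sketch only illustrates the general idea of an error-corrected combination. It assumes that the correction is the Semi-Parametric term $U^{T}\beta + m(T)$ and that it is added to the ARIMA forecast; the additive form, the function names, and the numbers are assumptions and do not reproduce the patent's exact expression.

```python
from typing import Sequence


def semi_parametric_correction(linear_params: Sequence[float],
                               beta: Sequence[float],
                               nonlinear_term: float) -> float:
    """Evaluate U^T * beta + m(T), with m(T) passed in as a precomputed value."""
    if len(linear_params) != len(beta):
        raise ValueError("linear_params and beta must have the same length")
    return sum(u * b for u, b in zip(linear_params, beta)) + nonlinear_term


def combined_prediction(arima_forecast: float, correction: float) -> float:
    """Assumed additive correction of the ARIMA forecast by the predicted residual."""
    return arima_forecast + correction


# Hypothetical values: ARIMA forecasts a proportion of 0.30 for period T+1.
correction = semi_parametric_correction([0.5, 2.0], [0.02, 0.01], 0.005)
prediction = combined_prediction(0.30, correction)
```

Mapping the corrected proportion to an alarm level would need level thresholds, which the text does not give, so that step is left out.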
The scheme provided by the invention is based on the system logs and three influence factor data sets: the historical service life of each module, the number of abnormalities occurring per unit time, and the interval duration between adjacent abnormal conditions. It considers not only the influence of the regularity of the module to be predicted itself but also the mutual influence of all modules of the server. An NPM factor system is added to the ARIMA main system to establish an SPM correction-combined prediction system, which predicts the alarm log proportion of the module to be predicted. Thus, from the system logs collected by ISREST, an ARIMA model and an SPM model are established based on the regularity of the module to be predicted and the influence of the other factor modules on it. The alarm log proportion of the module to be predicted in time period T+1 is accurately predicted, and the corresponding alarm level is obtained and provided to the operation and maintenance personnel of the server, giving them a timely and accurate prejudgment so that greater losses can be avoided.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides an alarm log proportion prediction system 400, as shown in fig. 4, including:
the obtaining module 401 is configured to obtain alarm logs respectively generated by a module to be predicted in a plurality of unit times and obtain logs respectively generated by all modules in a plurality of unit times to obtain a ratio of the alarm logs generated by the module to be predicted in each unit time;
a first model building module 402 configured to form a plurality of the occupation ratios into occupation ratio sequences and build an ARIMA model using the occupation ratio sequences;
a second module establishing module 403, configured to obtain a plurality of influence factor sequences corresponding to each other module and determine a state prediction model of each other module according to the plurality of influence factor sequences;
and the prediction module 404 is configured to predict the proportion of the alarm log generated by the module to be predicted in the next unit time according to the state prediction model and the ARIMA model.
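The data preparation performed by the obtaining module 401, together with the differencing that precedes ARIMA fitting, can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and treating a unit time with no logs at all as a proportion of 0.0 is an assumption not stated in the text.

```python
from typing import List, Sequence


def alarm_proportions(alarm_counts: Sequence[int],
                      total_counts: Sequence[int]) -> List[float]:
    """Proportion of alarm logs of the module to be predicted per unit time.

    alarm_counts[i] is the number of alarm logs produced by the module to be
    predicted in unit time i; total_counts[i] is the number of logs produced
    by all modules in the same unit time.
    """
    if len(alarm_counts) != len(total_counts):
        raise ValueError("both sequences must cover the same unit times")
    # Assumption: a unit time with zero total logs contributes 0.0.
    return [a / t if t else 0.0 for a, t in zip(alarm_counts, total_counts)]


def difference(series: Sequence[float], order: int = 1) -> List[float]:
    """Apply d-th order differencing; each pass shortens the series by one."""
    result = list(series)
    for _ in range(order):
        result = [b - a for a, b in zip(result, result[1:])]
    return result


proportions = alarm_proportions([2, 5, 0], [10, 20, 0])
second_diff = difference([1, 4, 9, 16], order=2)
```

In practice the order d would be chosen by repeating the differencing until a stationarity check passes; the check itself is not shown here.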
The scheme provided by the invention is based on the system logs and three influence factor data sets: the historical service life of each module, the number of abnormalities occurring per unit time, and the interval duration between adjacent abnormal conditions. It considers not only the influence of the regularity of the module to be predicted itself but also the mutual influence of all modules of the server. An NPM factor system is added to the ARIMA main system to establish an SPM correction-combined prediction system, which predicts the alarm log proportion of the module to be predicted. Thus, from the system logs collected by ISREST, an ARIMA model and an SPM model can be established based on the regularity of the module to be predicted and the influence of the other factor modules on it. The alarm log proportion of the module to be predicted in time period T+1 can be accurately predicted, and the corresponding alarm level is obtained and provided to the operation and maintenance personnel of the server, giving them a timely and accurate prejudgment so that greater losses can be avoided.
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 5, an embodiment of the present invention further provides a computer apparatus 501, comprising:
at least one processor 520; and
a memory 510, the memory 510 storing a computer program 511 operable on a processor, the processor 520 when executing the program performing the steps of any of the alarm log proportion prediction methods described above.
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 6, an embodiment of the present invention further provides a computer-readable storage medium 601, where the computer-readable storage medium 601 stores computer program instructions 610, and the computer program instructions 610, when executed by a processor, perform the steps of any one of the above alarm log proportion prediction methods.
Finally, it should be noted that, as will be understood by those skilled in the art, all or part of the processes of the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above.
Further, it should be appreciated that the computer-readable storage media (e.g., memory) herein can be either volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments of the present invention.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
The numbers of the embodiments disclosed in the above embodiments of the present invention are merely for description, and do not represent the advantages or disadvantages of the embodiments.
It will be understood by those skilled in the art that all or part of the steps of implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure, including the claims, of embodiments of the invention is limited to these examples. Within the idea of an embodiment of the invention, technical features in the above embodiment or in different embodiments may also be combined, and there are many other variations of the different aspects of the embodiments of the invention as described above, which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.

Claims (8)

1. A method for predicting the proportion of alarm logs is characterized by comprising the following steps:
acquiring alarm logs respectively generated by a module to be predicted in a plurality of unit times and acquiring logs respectively generated by all modules in a plurality of unit times to obtain the proportion of the alarm logs generated by the module to be predicted in each unit time;
forming a proportion sequence by the plurality of proportions and establishing an ARIMA model by utilizing the proportion sequence;
acquiring a plurality of influence factor sequences corresponding to each other module and determining a state prediction model of each other module according to the plurality of influence factor sequences;
predicting the proportion of an alarm log generated by the module to be predicted in the next unit time according to the state prediction model and the ARIMA model;
obtaining a plurality of influence factor sequences corresponding to each of the other modules, further comprising:
acquiring, for each other module, the sequence of abnormal times appearing in unit time $\{AN_1, AN_2, \ldots, AN_n\}$, the module historical life value sequence $\{MT_1, MT_2, \ldots, MT_n\}$, and the interval duration sequence of adjacent abnormal conditions $\{IT_1, IT_2, \ldots, IT_n\}$;
Wherein n is the total number of other modules;
determining a state prediction model for each of the other modules based on the plurality of sequences of impact factors, further comprising:
calculating, according to $Y = m(X_i) + \varepsilon$, the state value corresponding to each element in each influence factor sequence, where $m(\cdot)$ is the regression function, $\varepsilon$ is the error perturbation term, and $X_i$ is each element in each influence factor sequence;
calculating, according to
Figure FDA0003943709590000015
the state prediction value when each influence factor takes the value $x$, and taking
Figure FDA0003943709590000012
as the state prediction model corresponding to each of the other modules; wherein $X_i$ and $X_j$ are respectively the i-th and j-th elements in the abnormal times sequence, the module historical life value sequence, or the interval duration sequence of adjacent abnormal conditions, $Y_i$ is the state value corresponding to $X_i$, $K$ is a second-order Gaussian kernel function,
Figure FDA0003943709590000013
is the state prediction value when the number of abnormal times appearing in unit time takes the value $x_{an}$,
Figure FDA0003943709590000014
is the state prediction value when the module historical life value takes the value $x_{mt}$, and
Figure FDA0003943709590000021
is the state prediction value when the interval duration of adjacent abnormal conditions takes the value $x_{it}$.
2. The method of claim 1, wherein forming a plurality of said proportions into a proportion sequence and using said proportion sequence to build an ARIMA model further comprises:
carrying out multi-order differential processing on the proportion sequence until a stationary differential ratio sequence is obtained, and recording the current order d;
processing the differential ratio sequence with an ACF function to determine the parameter p and the autocorrelation coefficients of the ARIMA model, and processing the differential ratio sequence with a PACF function to determine the parameter q and the partial autocorrelation coefficients of the ARIMA model;
and constructing the ARIMA model by utilizing the order d, the parameter p, the parameter q, the autocorrelation coefficients, the partial autocorrelation coefficients, and the proportion sequence.
3. The method of claim 2, wherein constructing the ARIMA model using the order d, the parameter p, the parameter q, the autocorrelation coefficients, the partial autocorrelation coefficients, and the proportion sequence further comprises constructing the ARIMA model according to:
Figure FDA0003943709590000022
$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$
Figure FDA0003943709590000023
Figure FDA0003943709590000024
wherein $W_t$ is the proportion sequence; $B$ is the delay operator; $\theta_1, \theta_2, \theta_3, \ldots, \theta_q$ are the partial autocorrelation coefficients;
Figure FDA0003943709590000025
are the autocorrelation coefficients; and $f_t$ is the error.
4. The method of claim 1, wherein predicting a proportion of alarm logs generated by the module to be predicted in a next unit of time based on the state prediction model and the ARIMA model, further comprises:
calculating the correlation between the state value corresponding to each other module predicted by the state prediction model and the difference ratio sequence;
taking the state value of which the correlation is greater than a threshold value as a linear parameter, and taking the state value of which the correlation is not greater than the threshold value as a nonlinear parameter;
calculating a correction value according to $U^{T}\beta + m(T) + \varepsilon$, wherein $U = (U_1, \ldots, U_a)^{T}$, with $U_1, \ldots, U_a$ representing the $a$ linear parameters, $T = (T_1, \ldots, T_b)^{T}$, with $T_1, \ldots, T_b$ representing the $b$ nonlinear parameters, $\beta$ is a coefficient, and $\varepsilon$ is an error disturbance term.
5. The method of claim 4, further comprising predicting the proportion using:
Figure FDA0003943709590000031
wherein
Figure FDA0003943709590000032
is the differential ratio sequence, $\alpha_t, \alpha_{t-1}, \ldots, \alpha_{t-p}$ are
Figure FDA0003943709590000033
and $\beta_{t-1}, \ldots, \beta_{t-q}$ are $\theta_2 B^2, \ldots, \theta_q B^q$.
6. An alarm log proportion prediction system, comprising:
the acquisition module is configured to acquire alarm logs respectively generated by the module to be predicted in a plurality of unit times and acquire logs respectively generated by all the modules in the plurality of unit times so as to obtain the proportion of the alarm logs generated by the module to be predicted in each unit time;
the first model building module is configured to form a proportion sequence by a plurality of proportions and build an ARIMA model by utilizing the proportion sequence;
the second module establishing module is configured to obtain a plurality of influence factor sequences corresponding to each other module and determine a state prediction model of each other module according to the plurality of influence factor sequences;
the prediction module is configured to predict the proportion of the alarm log generated by the module to be predicted in the next unit time according to the state prediction model and the ARIMA model;
the second module establishing module is further configured to:
acquiring, for each other module, the sequence of abnormal times appearing in unit time $\{AN_1, AN_2, \ldots, AN_n\}$, the module historical life value sequence $\{MT_1, MT_2, \ldots, MT_n\}$, and the interval duration sequence of adjacent abnormal conditions $\{IT_1, IT_2, \ldots, IT_n\}$;
Wherein, the value of n is the total number of other modules;
calculating, according to $Y = m(X_i) + \varepsilon$, the state value corresponding to each element in each influence factor sequence, where $m(\cdot)$ is the regression function, $\varepsilon$ is the error perturbation term, and $X_i$ is each element in each influence factor sequence;
calculating, according to
Figure FDA0003943709590000041
the state prediction value when each influence factor takes the value $x$, and taking
Figure FDA0003943709590000042
as the state prediction model corresponding to each of the other modules; wherein $X_i$ and $X_j$ are respectively the i-th and j-th elements in the abnormal times sequence, the module historical life value sequence, or the interval duration sequence of adjacent abnormal conditions, $Y_i$ is the state value corresponding to $X_i$, $K$ is a second-order Gaussian kernel function,
Figure FDA0003943709590000043
is the state prediction value when the number of abnormal times appearing in unit time takes the value $x_{an}$,
Figure FDA0003943709590000044
is the state prediction value when the module historical life value takes the value $x_{mt}$, and
Figure FDA0003943709590000045
is the state prediction value when the interval duration of adjacent abnormal conditions takes the value $x_{it}$.
7. A computer device, comprising:
at least one processor; and
memory storing a computer program operable on the processor, characterized in that the processor executes the program to perform the steps of the method according to any of claims 1-5.
8. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the steps of the method according to any one of claims 1-5.
CN202110599920.XA 2021-05-31 2021-05-31 Alarm log proportion prediction method, system, device and medium Active CN113420422B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110599920.XA CN113420422B (en) 2021-05-31 2021-05-31 Alarm log proportion prediction method, system, device and medium

Publications (2)

Publication Number Publication Date
CN113420422A CN113420422A (en) 2021-09-21
CN113420422B true CN113420422B (en) 2023-04-07

Family

ID=77713291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110599920.XA Active CN113420422B (en) 2021-05-31 2021-05-31 Alarm log proportion prediction method, system, device and medium

Country Status (1)

Country Link
CN (1) CN113420422B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110888788A (en) * 2019-10-16 2020-03-17 平安科技(深圳)有限公司 Anomaly detection method and device, computer equipment and storage medium
CN111314115A (en) * 2020-01-19 2020-06-19 苏州浪潮智能科技有限公司 Alarm method, device and equipment based on IDL log and readable medium

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103116531A (en) * 2013-01-25 2013-05-22 浪潮(北京)电子信息产业有限公司 Storage system failure predicting method and storage system failure predicting device
CN108256898B (en) * 2017-12-26 2021-05-11 深圳索信达数据技术有限公司 Product sales prediction method, system and storage medium
US20190228353A1 (en) * 2018-01-19 2019-07-25 EMC IP Holding Company LLC Competition-based tool for anomaly detection of business process time series in it environments
CN110224865A (en) * 2019-05-30 2019-09-10 宝付网络科技(上海)有限公司 A kind of log warning system based on Stream Processing
FR3098967B1 (en) * 2019-07-15 2022-07-01 Bull Sas Method and device for determining an estimated time before a technical incident in an IT infrastructure based on performance indicator values
CN110458374A (en) * 2019-08-23 2019-11-15 山东浪潮通软信息科技有限公司 A kind of business electrical maximum demand prediction technique based on ARIMA and SVM
CN110688069A (en) * 2019-09-20 2020-01-14 苏州浪潮智能科技有限公司 Service life prediction method, device and equipment of solid state disk and readable storage medium
CN110907984A (en) * 2019-11-21 2020-03-24 中国地震局地震预测研究所 Method for detecting earthquake front infrared long-wave radiation abnormal information based on autoregressive moving average model
CN111008114A (en) * 2019-11-30 2020-04-14 北京浪潮数据技术有限公司 Disk partition monitoring method, device, equipment and readable storage medium
CN111314173B (en) * 2020-01-20 2022-04-08 腾讯科技(深圳)有限公司 Monitoring information abnormity positioning method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN113420422A (en) 2021-09-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant