CN108038049A - Real-time logs control system and control method, cloud computing system and server - Google Patents

Info

Publication number
CN108038049A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711333074.7A
Other languages
Chinese (zh)
Other versions
CN108038049B (en)
Inventor
裴庆祺
赵伟伟
王磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN201711333074.7A
Publication of CN108038049A
Application granted
Publication of CN108038049B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3452 Performance evaluation by statistical analysis
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention belongs to the field of cloud computing technology and discloses a real-time log control system and control method, a cloud computing system, and a server. Log record events are analyzed: error messages are classified, filtered, and aggregated, and then extracted into sequences. A fault model is trained and used to compute the probability that a sequence belongs to the failure sequences and the probability that it belongs to the non-failure sequences, and Bayesian classification theory is applied to reach a result and make a prediction. Compared with matching against a large number of rules, this approach speeds up the decision. Failure prediction research is of great significance for reducing the losses caused by network failures and for easing the burden of network management and maintenance.

Description

Real-time log control system and control method, cloud computing system, and server
Technical field
The invention belongs to the field of cloud computing technology, and in particular relates to a real-time log control system and control method, a cloud computing system, and a server.
Background
With the rapid development of computer technology, cloud computing has become one of the most important areas of computing, and cloud services are now deeply embedded in everyday life and work. By computing over real-time data with machine learning algorithms, failures that may occur in a cloud computing system can be predicted in advance, which leaves time to respond to them. Such a design also supports elastic, horizontal scaling of the cluster's processing capacity to keep up with growing data volumes and user demand. Processing massive log data in real time, and mining the data for system state and failure prediction, is a promising direction with good application prospects.
In summary, existing failure prediction models have two problems. First, the state duration distribution is mostly assumed to be exponential by default, whereas in practice the state probabilities of failures do not follow an exponential distribution. Second, the observation-value probabilities in each failure state are discretized, which adversely affects experimental analysis in big data environments. The present invention therefore treats both the state duration distribution and the state observation-value probability distribution as continuous and assumes that they follow a Weibull distribution; the improved prediction model can raise the probability values obtained in diagnosis and prediction.
Summary of the invention
In view of the problems in the prior art, the present invention provides a real-time log control system and control method, a cloud computing system, and a server.
The invention is realized as a real-time log control method that analyzes log record events: error messages are classified, filtered, and aggregated, and then extracted into sequences; a fault model is trained and used to compute the probability that a sequence belongs to the failure sequences and the probability that it belongs to the non-failure sequences; and Bayesian classification theory is applied to reach a result and make a prediction.
Further, the real-time log control method specifically includes:
Step 1: collect the log file data on each node of the distributed system, and send newly generated log data to the collector in real time by incremental checking;
Step 2: delete events of the same type reported from the same location within a given period, which removes redundant events; a time threshold defines the time window used for this event filtering. Also remove similar events reported from multiple different locations within a given period, which removes the remaining redundant events in the log, and save the data stream to a time series database. Similarity is judged with Sim(D_1, D_2):
$$\mathrm{Sim}(D_1, D_2) = \cos\theta = \frac{\sum_{k=1}^{n} W_{1k} W_{2k}}{\sqrt{\left(\sum_{k=1}^{n} W_{1k}^{2}\right)\left(\sum_{k=1}^{n} W_{2k}^{2}\right)}}$$
where D_1 and D_2 denote two sequences and W_{1k} and W_{2k} denote the vector entries of D_1 and D_2. The similarity is the cosine of the angle between the two vectors; the larger Sim(D_1, D_2) is, the more similar the two sequences are;
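As an illustration of the temporal filtering in step 2, the following minimal sketch keeps an event only if no event of the same type from the same location was kept within the preceding time window. The tuple layout, the name `temporal_filter`, and the choice to measure the window from the last kept event are assumptions made for this example; the text does not specify them.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical event layout: (timestamp in seconds, reporting location, event type).
Event = Tuple[float, str, str]


def temporal_filter(events: Iterable[Event], window: float) -> List[Event]:
    """Drop events that repeat a (location, type) pair kept within `window` seconds."""
    last_kept: Dict[Tuple[str, str], float] = {}
    kept: List[Event] = []
    for ts, location, etype in sorted(events):  # process in chronological order
        key = (location, etype)
        if key in last_kept and ts - last_kept[key] <= window:
            continue  # redundant repeat inside the time window
        last_kept[key] = ts
        kept.append((ts, location, etype))
    return kept
```

Spatial filtering could follow the same pattern by keying on the event type alone, so that repeats from different locations are also collapsed.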
Step 3: when each record is stored in the data table, use SQL statements to split it into timestamp, process ID, log level, process module, separator, and message;
Step 4: persist the processed, standard-format data using SQL statements;
Step 5: extract the log failure sequences;
Step 6: cluster using the sequence likelihood as the metric, and group the failure-related events with a hierarchical clustering algorithm, where:
S = [s_i] denotes a state sequence of length L, the observation probabilities are those of state s_i(k), and π = [π_i] is the initial state probability vector;
Step 7: combine the improved HSMM with a Bayesian network (BayesNet) to make failure predictions on real-time log data;
A standard HSMM is specified by the state transition probability matrix G(t) = [g_{ij}(t)], the observation probability matrix B = b_i(k) of state s_i(k), and the initial state probability vector π = [π_i], and is written λ = (π, G(t), B). The state duration probability distribution is made continuous: the state duration is treated as a continuous variable assumed to follow a Weibull distribution, so the state duration probability distribution f_i(l) of a state is:
$$f_i(l) = \alpha\beta(\alpha l)^{\beta-1} e^{-(\alpha l)^{\beta}}$$
where α and β are the scale parameter and the shape parameter of the Weibull distribution, respectively;
The state observation-value probability distribution is made continuous in the same way. It is likewise assumed to follow a Weibull distribution, so the state observation-value probability distribution function ξ_i(θ) is:
$$\xi_i(\theta) = P(y_t = \theta \mid q_t = x_i) = \alpha_i\beta_i(\alpha_i\theta)^{\beta_i-1} e^{-(\alpha_i\theta)^{\beta_i}}$$
where α_i and β_i are the Weibull parameters of each state; the improved HSMM is described with these Weibull parameters in place of the discrete distributions.
Step 8: train the failure model and the non-failure model to obtain their parameters. The goal is to assess whether a given observation sequence O = [o_1, o_2, ..., o_l] is a failure-related sequence: compute the sequence likelihood under each model, then classify the sequence as failure-free or failure-prone using Bayesian decision theory;
Step 9: failure result prediction:
When the decision inequality holds, the sequence is marked as a failure-related event sequence and the system issues a failure prediction. In the inequality, the cost term is the cost of wrongly judging a failure-related sequence to be failure-unrelated, P(F) is the probability of failure, and the logarithmic term is the logarithm of the sequence likelihood.
Further, extracting the log failure sequences specifically includes:
First step, extract the error event sequence: using SQL statements, extract the records at the ERROR level and above according to log level, keeping the timestamp and the message text;
Second step, merge similar error events: apply the Levenshtein edit distance algorithm to the event sequence and merge error events with high similarity; the minimum edit distance contains the minimum edit distances of its subproblems;
In the recurrence, d[i-1, j] + 1 represents inserting one letter into the target log and d[i, j-1] + 1 represents deleting one letter from the matching log; when x_i = y_j no change is needed, so the cost equals that of the previous step d[i-1, j-1], and otherwise it is d[i-1, j-1] + 1; d[i, j] is the minimum of these three values;
Third step, classify error events: after merging, group similar error events by the keywords in their message text, assign each group an ID, and save it in the database;
Fourth step, extract sequences: in chronological order, extract the events within a time window before each failure and mark them as a failure-related event sequence, where the failure lead time separates the window from the failure and the failure event itself is the related failure event; non-failure-related event sequences are event sequences from intervals in which the system did not fail.
Another object of the present invention is to provide a real-time log control system implementing the real-time log control method, the system including a log information processing module and a log failure analysis module.
Further, the log information processing module includes:
a log information collection unit, which collects the log file data on each node of the distributed system; the collection function allows the log files to be monitored to be customized, and sends newly generated log data to the collector in real time by incremental checking;
a log information filtering unit, which removes redundancy from the data and filters it;
a log information standardization unit, which converts the processed log information into a standard data format;
a log storage unit, which persists the processed, standard-format data.
Further, the log failure analysis module includes:
a log event sequence extraction unit;
a failure-related event clustering unit, which pre-trains a small hidden semi-Markov model on the events and computes the sequence likelihood;
a failure prediction unit, which uses the hidden semi-Markov model and Bayesian decision theory to judge whether a sequence is a failure-related sequence;
a failure result judgment and output unit: when a sequence is judged to be failure-related, the system emits a failure warning stream and outputs a failure early warning.
The log event sequence extraction unit further includes:
an error event record extraction unit, which extracts the records at the ERROR level and above according to log level, keeping the timestamp, the process module, and the message text;
a similar error event merging unit, which applies the Levenshtein edit distance algorithm to the error event sequence and merges error events with high similarity;
an error event classification unit, which applies the Levenshtein edit distance algorithm to the event sequence, groups similar error events, and assigns each group an ID;
a failure-related sequence extraction unit, which extracts, in chronological order, the events within a period before each failure and marks them as the events preceding the failure.
Another object of the present invention is to provide a cloud computing system using the real-time log control method.
Current failure prediction research falls mainly into three classes of methods: fault detection models based on log frequency, fault detection models based on message frequency, and fault detection models based on state transitions.
The invention collects log information in real time while the system is running and clusters it. By analyzing event logs with machine learning algorithms and models, it predicts failures that may occur in the future, so that faults can be investigated and located in advance during operation, which improves operations efficiency and helps prevent emergencies. Log record events are analyzed, all error messages are classified, filtered, and aggregated and then extracted into sequences, a fault model is trained and used to compute the probabilities that a sequence belongs to the failure and non-failure sequences, and Bayesian classification theory is applied to reach a result and make a prediction.
The effectiveness of the method is measured by three parameters: precision, recall, and F-measure. Precision is the proportion of all predictions that are correct, recall is the proportion of all failures that are correctly predicted, and F-measure is a combined metric of precision and recall;
The prediction cases are shown in Table 1:
Prediction \ Actual | System failure | System normal
System failure | True Positive (TP) | False Positive (FP)
System normal | False Negative (FN) | True Negative (TN)
Table 1: Prediction cases
The prediction validity parameters are shown in Table 2:
Table 2: Validity parameter expressions
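The three validity parameters can be computed from the counts in Table 1. The sketch below uses the standard definitions of precision, recall, and F-measure (the balanced F1 form); these formulas are assumed here because the expressions of Table 2 are not reproduced in this text, and the function name is illustrative.

```python
from typing import Tuple


def prediction_metrics(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) from confusion-matrix counts."""
    # Precision: share of failure predictions that were real failures.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of real failures that were predicted.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure: harmonic mean of precision and recall (assumed equal weighting).
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

For example, 8 correct failure predictions, 2 false alarms, and 8 missed failures give a precision of 0.8 and a recall of 0.5.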
The experimental results show that the system achieves better precision than the model before the improvement.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the real-time log control system provided by an embodiment of the present invention;
In the figure: 1, log information processing module; 1-1, log information collection unit; 1-2, log information filtering unit; 1-3, log information standardization unit; 1-4, log storage unit; 2, log failure analysis module; 2-1, log event sequence extraction unit; 2-1-1, error event record extraction unit; 2-1-2, similar error event merging unit; 2-1-3, error event classification unit; 2-1-4, failure-related sequence extraction unit; 2-2, failure-related event clustering unit; 2-3, failure prediction unit; 2-4, failure result judgment and output unit.
Fig. 2 is a flow chart of the real-time log control method provided by an embodiment of the present invention.
Fig. 3 is an implementation flow chart of the real-time log control method provided by an embodiment of the present invention.
Fig. 4 is a schematic diagram of failure sequence extraction provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the purpose, technical solution, and advantages of the present invention clearer, the invention is further described below with reference to embodiments. The specific embodiments described here only illustrate the invention and are not intended to limit it.
The application principle of the invention is explained in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the real-time log control system provided by an embodiment of the present invention includes a log information processing module 1 and a log failure analysis module 2.
The log information processing module 1 includes:
Log information collection unit 1-1: collects the log file data on each node of the distributed system. The collection function allows the log files to be monitored to be customized, and sends newly generated log data to the collector in real time by incremental checking.
Log information filtering unit 1-2: removes redundancy from the data and filters it.
Log information standardization unit 1-3: converts the processed log information into a standard data format with the fields timestamp, process ID, log level, process module, separator, and message. Log levels fall into several classes: ERROR, WARNING, TRACE, INFO, DEBUG, CRITICAL, and AUDIT. The earlier a level appears in this list, the higher its rank, and a higher rank indicates a more important event.
Log storage unit 1-4: persists the processed, standard-format data so that it can later be extracted and analyzed.
The log failure analysis module 2 includes:
Log event sequence extraction unit 2-1;
Failure-related event clustering unit 2-2: pre-trains a small hidden semi-Markov model (HSMM) on the events and computes the sequence likelihood, that is, the likelihood that the trained model produces the given observation sequence;
Failure prediction unit 2-3: uses the hidden semi-Markov model and Bayesian decision theory to judge whether a sequence is a failure-related sequence;
Failure result judgment and output unit 2-4: when a sequence is judged to be failure-related, the system emits a failure warning stream and outputs a failure early warning.
The log event sequence extraction unit 2-1 further includes:
Error event record extraction unit 2-1-1: extracts the records at the ERROR level and above according to log level, keeping information such as the timestamp, the process module, and the message text;
Similar error event merging unit 2-1-2: applies the Levenshtein edit distance algorithm to the error event sequence and merges error events with high similarity;
Error event classification unit 2-1-3: applies the Levenshtein edit distance algorithm to the event sequence, groups similar error events, and assigns each group an ID;
Failure-related sequence extraction unit 2-1-4: extracts, in chronological order, the events within a period before each failure and marks them as the events preceding the failure.
As shown in Fig. 2, the real-time log control method provided by an embodiment of the present invention includes the following steps:
S201: analyze the log record events; classify, filter, and aggregate all error messages, and extract them into sequences;
S202: train a fault model, compute the probability that a sequence belongs to the failure sequences and the probability that it belongs to the non-failure sequences, and apply Bayesian classification theory to reach a result and make a prediction.
The application principle of the invention is further described below with reference to the accompanying drawings.
Instead of matching failure keywords against a large number of rules, the invention uses an improved HSMM (hidden semi-Markov model) and Bayes decision theory to compute directly the probability that an error sequence belongs to the failure sequences, which speeds up the decision.
As shown in Fig. 3, the real-time log control method provided by an embodiment of the present invention proceeds as follows:
1. Log information processing
Step 1: log information collection.
The system should be able to collect the log file data on each node of the distributed system. The collection function should allow the log files to be monitored to be customized, and should send newly generated log data to the collector in real time by incremental checking.
Step 2: log information filtering.
There are two methods: temporal filtering and spatial filtering. When the system detects an anomaly, it may keep emitting a stream of warning messages until the failure actually occurs. Similarly, once the system has failed, the failure information may be logged repeatedly until the problem is resolved.
Temporal filtering deletes events of the same type reported from the same location within a given period, which removes redundant events; a time threshold defines the time window used for this event filtering. Spatial filtering removes similar events reported from multiple different locations within a given period, which removes the remaining redundant events in the log. The data stream is then saved to a time series database, which saves space and improves efficiency. Similarity is usually judged with Sim(D_1, D_2):
$$\mathrm{Sim}(D_1, D_2) = \cos\theta = \frac{\sum_{k=1}^{n} W_{1k} W_{2k}}{\sqrt{\left(\sum_{k=1}^{n} W_{1k}^{2}\right)\left(\sum_{k=1}^{n} W_{2k}^{2}\right)}}$$
where D_1 and D_2 denote two sequences and W_{1k} and W_{2k} denote the vector entries of D_1 and D_2. The similarity is the cosine of the angle between the two vectors; the larger Sim(D_1, D_2) is, the more similar the two sequences are.
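The similarity measure used for spatial filtering can be sketched as a cosine similarity over term-count vectors. How the vector entries are built is not stated in the text; this example assumes each log message is turned into a vector of lowercase whitespace-separated token counts, and the function name is illustrative.

```python
import math
from collections import Counter


def cosine_similarity(d1: str, d2: str) -> float:
    """Cosine of the angle between the token-count vectors of two messages."""
    w1 = Counter(d1.lower().split())  # assumed vectorization: token counts
    w2 = Counter(d2.lower().split())
    # Only tokens present in both messages contribute to the dot product.
    dot = sum(w1[t] * w2[t] for t in w1.keys() & w2.keys())
    norm1 = math.sqrt(sum(v * v for v in w1.values()))
    norm2 = math.sqrt(sum(v * v for v in w2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # an empty message is treated as dissimilar to everything
    return dot / (norm1 * norm2)
```

Two events reported from different locations could then be treated as redundant when their similarity exceeds a chosen threshold; the threshold value is a tuning decision that the text leaves open.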
Step 3: log format standardization.
When each record is stored in the data table, SQL statements split it into fields such as timestamp, process ID, log level, process module, separator, and message.
Step 4: log storage.
The processed, standard-format data is persisted using SQL statements so that it can later be extracted and analyzed.
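Steps 3 and 4 can be sketched with a regular expression and an SQLite table. The line layout matched below (timestamp, process ID, level, module, bracketed separator, message), the table name `log_events`, and the sample line used in the usage note are assumptions for illustration; the text lists the fields but not their exact format or the database in use.

```python
import re
import sqlite3
from typing import Dict, Iterable, Optional

# Assumed layout: "<date> <time> <pid> <LEVEL> <module> [<separator>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?:\.\d+)?)\s+"
    r"(?P<pid>\d+)\s+(?P<level>[A-Z]+)\s+(?P<module>\S+)\s+"
    r"(?P<sep>\[[^\]]*\])\s+(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[Dict[str, object]]:
    """Split one raw log line into the standard fields, or return None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record: Dict[str, object] = dict(match.groupdict())
    record["pid"] = int(match.group("pid"))
    return record


def store(conn: sqlite3.Connection, records: Iterable[Dict[str, object]]) -> None:
    """Persist parsed records in a (hypothetical) log_events table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS log_events "
        "(ts TEXT, pid INTEGER, level TEXT, module TEXT, sep TEXT, message TEXT)"
    )
    conn.executemany(
        "INSERT INTO log_events VALUES (:ts, :pid, :level, :module, :sep, :message)",
        list(records),
    )
    conn.commit()
```

With a made-up line such as `2017-12-13 10:00:01.123 2345 ERROR compute.manager [-] Instance failed to spawn`, `parse_line` yields one record that `store` can insert, and error records can later be selected with a query on the `level` column.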
2. Log failure analysis
A probability-based causal relationship is established between failure symptoms and system states. The hidden semi-Markov process and the Bayesian network are trained with the prior probabilities of failure occurrence. During diagnosis, the posterior probability of each system state given the observed failure symptoms is derived from the prior probabilities. This expresses the joint probability distribution of the variables intuitively and also yields the probability that each feature causes a failure.
Step 1: log failure sequence extraction.
First step, extract the error event sequence: using SQL statements, extract the records at the ERROR level and above according to log level, keeping information such as the timestamp and the message text;
Second step, merge similar error events: apply the Levenshtein edit distance algorithm to the event sequence from the previous step and merge error events with high similarity;
The algorithm uses dynamic programming. The problem has optimal substructure: the minimum edit distance contains the minimum edit distances of its subproblems:
$$d_{[i,j]} = \min\left(d_{[i-1,j]} + 1,\ d_{[i,j-1]} + 1,\ d_{[i-1,j-1]} + \begin{cases} 0, & x_i = y_j \\ 1, & x_i \neq y_j \end{cases}\right)$$
where d[i-1, j] + 1 represents inserting one letter into the target log and d[i, j-1] + 1 represents deleting one letter from the matching log; when x_i = y_j no change is needed, so the cost equals that of the previous step d[i-1, j-1], and otherwise it is d[i-1, j-1] + 1; d[i, j] is the minimum of these three values;
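The dynamic-programming recurrence described above can be sketched as follows. The row-by-row table layout and the normalized `similarity` helper are choices made for this example; the text does not say how edit distance is converted into a merge decision.

```python
def levenshtein(x: str, y: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    prev = list(range(len(y) + 1))  # distances from the empty prefix of x
    for i, cx in enumerate(x, start=1):
        cur = [i]  # distance from x[:i] to the empty prefix of y
        for j, cy in enumerate(y, start=1):
            cost = 0 if cx == cy else 1  # no change needed when characters match
            cur.append(min(prev[j] + 1,          # d[i-1, j] + 1
                           cur[j - 1] + 1,       # d[i, j-1] + 1
                           prev[j - 1] + cost))  # d[i-1, j-1] + cost
        prev = cur
    return prev[-1]


def similarity(x: str, y: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for fully different ones."""
    longest = max(len(x), len(y))
    return 1.0 if longest == 0 else 1.0 - levenshtein(x, y) / longest
```

Error messages whose `similarity` exceeds a chosen threshold could then be merged into one event type.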
Third step, classify error events: after the previous step has merged the error events, group similar error events by the keywords in their message text, assign each group an ID, and save it in the database;
Fourth step, extract sequences: in chronological order, extract the events within a time window before each failure and mark them as a failure-related event sequence, where the failure lead time separates the window from the failure and the failure event itself is the related failure event; non-failure-related event sequences are event sequences from intervals in which the system did not fail, as shown in Fig. 4.
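The extraction of failure-related sequences can be sketched as a window lookup over time-ordered events. The parameter names `lead_time` and `window`, the half-open window boundaries, and the event tuple layout are assumptions for this example; the symbols used in the original formulas are not reproduced in this text.

```python
from typing import Iterable, List, Sequence, Tuple

# Hypothetical event layout: (timestamp in seconds, error-event type ID).
Event = Tuple[float, int]


def failure_related_sequences(
    events: Iterable[Event],
    failure_times: Sequence[float],
    lead_time: float,
    window: float,
) -> List[List[int]]:
    """For each failure, collect the event IDs seen in the window that ends
    `lead_time` seconds before the failure and spans `window` seconds."""
    ordered = sorted(events)  # chronological order
    sequences: List[List[int]] = []
    for t_fail in failure_times:
        end = t_fail - lead_time
        start = end - window
        sequences.append([eid for ts, eid in ordered if start <= ts < end])
    return sequences
```

Non-failure sequences could be drawn in the same way from windows that are not followed by any failure.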
Step 2: failure-related event clustering.
In practice, several different failure-related event sequences may lead to the same system failure, and these sequences have different characteristics, so they need to be clustered.
Clustering can use the sequence likelihood as the metric, and a hierarchical clustering algorithm then groups the failure-related events, where:
S = [s_i] denotes a state sequence of length L, b_{s_i}(o_i) is the observation probability in state s_i(k), and π = [π_i] is the initial state probability vector.
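The grouping step can be sketched as agglomerative hierarchical clustering over a precomputed dissimilarity matrix. How the matrix is derived from sequence likelihoods is not spelled out in this text, so the sketch takes the matrix as given; average linkage and the stop threshold are likewise assumptions.

```python
from typing import List, Sequence


def agglomerative(dist: Sequence[Sequence[float]], threshold: float) -> List[List[int]]:
    """Merge the two closest clusters until the closest pair is farther than `threshold`."""
    clusters: List[List[int]] = [[i] for i in range(len(dist))]

    def linkage(a: List[int], b: List[int]) -> float:
        # Average linkage: mean pairwise dissimilarity between two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        d, p, q = best
        if d > threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters
```

Each resulting cluster would correspond to one group of failure-related sequences, for which a separate model could be trained.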
Step 3: training the prediction model.
The prediction model is the key to network failure prediction, and the features it is built on directly affect its performance. Here a hidden semi-Markov model (HSMM) is combined with a Bayesian network (BayesNet) to make failure predictions on real-time log data.
A standard HSMM is specified by the state transition probability matrix G(t) = [g_{ij}(t)], the observation probability matrix B = b_i(k) of state s_i(k), and the initial state probability vector π = [π_i], and is written λ = (π, G(t), B).
The HSMM is improved in two respects. First, the state duration probability distribution is made continuous: the state duration is treated as a continuous variable assumed to follow a Weibull distribution, so the state duration probability distribution f_i(l) of a state is:
$$f_i(l) = \alpha\beta(\alpha l)^{\beta-1} e^{-(\alpha l)^{\beta}}$$
where α and β are the scale parameter and the shape parameter of the Weibull distribution, respectively;
Second, the state observation-value probability distribution is made continuous. It is likewise assumed to follow a Weibull distribution, so the state observation-value probability distribution function ξ_i(θ) is:
$$\xi_i(\theta) = P(y_t = \theta \mid q_t = x_i) = \alpha_i\beta_i(\alpha_i\theta)^{\beta_i-1} e^{-(\alpha_i\theta)^{\beta_i}}$$
where α_i and β_i are the Weibull parameters of each state; the improved HSMM is therefore described with these Weibull parameters in place of the discrete distributions.
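The Weibull density used for the state duration can be evaluated directly from the formula f_i(l) = αβ(αl)^(β-1) e^(-(αl)^β). The sketch below is a plain transcription of that expression; the function name and the handling of non-positive durations are choices made for this example.

```python
import math


def weibull_pdf(l: float, alpha: float, beta: float) -> float:
    """Evaluate alpha * beta * (alpha*l)**(beta-1) * exp(-(alpha*l)**beta)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if l <= 0:
        # Density taken as 0 outside l > 0; the boundary l = 0 is ignored in this sketch.
        return 0.0
    z = alpha * l
    return alpha * beta * z ** (beta - 1) * math.exp(-(z ** beta))
```

With β = 1 the expression reduces to the exponential density α e^(-αl), which is the default assumption the text sets out to replace; other values of β let the duration distribution take non-exponential shapes.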
Step 4: failure prediction.
The assumed failure model and non-failure model are trained to obtain their parameters. The goal is to assess whether a given observation sequence (error sequence) O = [o_1, o_2, ..., o_l] is a failure-related sequence. The sequence likelihood under each model is computed first, and the sequence is then classified as failure-free or failure-prone using Bayesian decision theory.
Step 5: failure result prediction.
When the decision inequality holds, the sequence is marked as a failure-related event sequence and the system issues a failure prediction. In the inequality, the cost term is the cost of wrongly judging a failure-related sequence to be failure-unrelated, P(F) is the probability of failure, and the logarithmic term is the logarithm of the sequence likelihood; taking the logarithm prevents numerical underflow when the sequence likelihood is very small. In this way every sequence can be judged and a failure prediction made.
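A decision rule of this kind can be sketched with the standard minimum-risk Bayes classifier on log-likelihoods. The inequality of the original is not reproduced in this text, so the form below is an assumption: it sets the cost of correct decisions to zero and compares the log-likelihood difference with a threshold built from the two error costs and the failure prior. All names are illustrative.

```python
import math


def is_failure_prone(
    loglik_failure: float,
    loglik_nonfailure: float,
    p_failure: float,
    cost_false_alarm: float = 1.0,
    cost_missed_failure: float = 1.0,
) -> bool:
    """Return True when predicting a failure has the lower expected cost.

    loglik_failure / loglik_nonfailure: log-likelihood of the sequence under
    the failure model and the non-failure model.
    """
    if not 0.0 < p_failure < 1.0:
        raise ValueError("p_failure must lie strictly between 0 and 1")
    # Predict failure when
    #   cost_missed * P(o|F) * P(F) > cost_false_alarm * P(o|not F) * P(not F),
    # rewritten in log space to avoid underflow of tiny likelihoods.
    threshold = math.log(cost_false_alarm / cost_missed_failure) + math.log(
        (1.0 - p_failure) / p_failure
    )
    return loglik_failure - loglik_nonfailure > threshold
```

Raising `cost_missed_failure` lowers the threshold, so the classifier warns more readily when missing a failure is expensive.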
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (9)

1. A real-time log control method, characterized in that the method analyzes log record events: error messages are classified, filtered, and aggregated, and then extracted into sequences; a fault model is trained and used to compute the probability that a sequence belongs to the failure sequences and the probability that it belongs to the non-failure sequences; and Bayesian classification theory is applied to reach a result and make a prediction.
2. The real-time log control method according to claim 1, characterized in that the method specifically includes:
Step 1: collect the log file data on each node of the distributed system, and send newly generated log data to the collector in real time by incremental checking;
Step 2: delete events of the same type reported from the same location within a given period, which removes redundant events; a time threshold defines the time window used for this event filtering; also remove similar events reported from multiple different locations within a given period, which removes the remaining redundant events in the log, and save the data stream to a time series database; similarity is judged with Sim(D_1, D_2):
$$\mathrm{Sim}(D_1, D_2) = \cos\theta = \frac{\sum_{k=1}^{n} W_{1k} W_{2k}}{\sqrt{\left(\sum_{k=1}^{n} W_{1k}^{2}\right)\left(\sum_{k=1}^{n} W_{2k}^{2}\right)}};$$
where D_1 and D_2 denote two sequences and W_{1k} and W_{2k} denote the vector entries of D_1 and D_2; the similarity is the cosine of the angle between the two vectors, and the larger Sim(D_1, D_2) is, the more similar the two sequences are;
Step 3, every data store to tables of data when, using SQL statement according to timestamp, process number, record rank, into Journey module, separator, record information cutting recording;
Processed standard format data are carried out persistent storage by step 4 using SQL statement;
Step 5, extracts daily record failure sequence;
Step 6, clusters likelihood value of the standard according to sequenceCalculated as metric, it is real using hierarchical clustering algorithm Existing failure dependent event packet, wherein:
S=[si] represent an a length of L status switch,For in state si(k) in initial state probability vector π=[πi] under Observation probability matrix;
Step 7, a hidden semi-Markov model (HSMM) is combined with a Bayesian network (Bayes Net) to predict failures from the real-time log data;
A standard HSMM is described by the state transition probability matrix G(t) = [g_ij(t)], the initial state probability vector π = [π_i], and the observation probability matrix B = b_i(k) of state s_i(k), and is written λ = (π, G(t), B). The state duration probability distribution is made continuous: the state duration is treated as continuously distributed and is assumed to follow a Weibull distribution, so the state duration probability density f_i(l) of a state is:
$$f_i(l) = \alpha\beta(\alpha l)^{\beta-1} e^{-(\alpha l)^{\beta}}$$
where α and β are the scale parameter and the shape parameter of the Weibull distribution, respectively;
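The duration density above can be evaluated directly. The sketch below uses the same parameterization as the formula; the function name `weibull_duration_pdf` and the decision to return 0 for non-positive durations are assumptions for illustration.

```python
import math


def weibull_duration_pdf(l, alpha, beta):
    """Evaluate f(l) = alpha*beta*(alpha*l)**(beta-1) * exp(-(alpha*l)**beta)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if l <= 0:
        # Assumption: a state cannot last for a non-positive duration.
        return 0.0
    x = alpha * l
    return alpha * beta * x ** (beta - 1) * math.exp(-(x ** beta))
```

With β = 1 the expression reduces to the exponential density α·e^(−αl), which is a quick sanity check on any implementation.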
The probability distribution of the state monitoring value is likewise made continuous and is assumed to follow a Weibull distribution; the probability distribution function ξ_i(θ) of the state monitoring value is:
$$\xi_i(\theta) = P(y_t = \theta \mid q_t = x_i) = \alpha_i \beta_i (\alpha_i \theta)^{\beta_i - 1} e^{-(\alpha_i \theta)^{\beta_i}};$$
where α_i and β_i are the Weibull distribution parameters of each state; the improved HSMM model is described using these continuous distributions;
Step 8, a failure model and a non-failure model are trained, each with its own parameter set. The goal is to assess whether a given observation sequence O = [o_1, o_2, ..., o_l] is a failure-related sequence: the sequence likelihood under each model is computed, and the sequence is then classified as failure-free or failure-prone using Bayesian decision theory;
Step 9, failure result prediction:
When a sequence is marked as a failure-related event sequence, the system issues a failure prediction; in the decision rule, one term denotes the cost of wrongly judging a failure-related sequence to be a failure-unrelated sequence, P(F) denotes the probability of failure, and the log-likelihood term denotes the logarithm of the sequence likelihood.
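The classification in steps 8 and 9 can be illustrated with the textbook minimum-risk Bayes rule applied to the two sequence log-likelihoods. This is a sketch under stated assumptions: correct decisions are given zero cost, and the parameter names (`cost_missed`, `cost_false_alarm`, `p_failure`) and the function name are hypothetical; the exact threshold expression used in the claim is not reproduced here.

```python
import math


def is_failure_related(loglik_failure, loglik_nonfailure, p_failure,
                       cost_missed, cost_false_alarm):
    """Decide 'failure-related' when that decision has the lower expected cost.

    loglik_failure / loglik_nonfailure: log-likelihood of the observed
    sequence under the failure and non-failure models.
    Assumption: correct decisions cost nothing.
    """
    if not 0.0 < p_failure < 1.0:
        raise ValueError("p_failure must lie strictly between 0 and 1")
    if cost_missed <= 0 or cost_false_alarm <= 0:
        raise ValueError("costs must be positive")
    threshold = (math.log(cost_false_alarm / cost_missed)
                 + math.log((1.0 - p_failure) / p_failure))
    return loglik_failure - loglik_nonfailure > threshold
```

Because failures are rare, the prior term pushes the threshold up; raising the cost of a missed failure pulls it back down, which is how the rule trades false alarms against missed warnings.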
3. The real-time log control method as claimed in claim 2, characterized in that extracting the log failure sequence specifically comprises:
The first step, extracting the error event sequence: using an SQL statement, records at the ERROR level are filtered out according to the logging level, retaining the timestamp and the text message;
The second step, merging similar error events: the Levenshtein edit distance algorithm is applied to the event sequence, and error events with high similarity are merged; the minimum edit distance d[i, j] is built from the minimum edit distances of its sub-problems;
where d[i-1, j] + 1 corresponds to inserting one letter into the target log, and d[i, j-1] + 1 corresponds to deleting one letter from the matched log; when x_i = y_j no change is needed, so the cost is the same as that of the previous step, d[i-1, j-1], and otherwise it is d[i-1, j-1] + 1; d[i, j] is the minimum of these three values;
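The recurrence described above is the standard Levenshtein dynamic program. A minimal sketch follows; the helper `is_similar` and its `max_ratio` threshold are assumptions added to show how a merge decision might be made, since the claim does not state a cut-off.

```python
def levenshtein(x, y):
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn x into y."""
    rows, cols = len(x) + 1, len(y) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of x
    for j in range(cols):
        d[0][j] = j  # insert all j characters of y
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = 0 if x[i - 1] == y[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,                 # one more character handled on the x side
                d[i][j - 1] + 1,                 # one more character handled on the y side
                d[i - 1][j - 1] + substitution,  # match or substitute
            )
    return d[len(x)][len(y)]


def is_similar(x, y, max_ratio=0.3):
    """Hypothetical merge test: distance relative to the longer message."""
    longest = max(len(x), len(y))
    return longest == 0 or levenshtein(x, y) / longest <= max_ratio
```

Messages that differ only in a device name or a numeric identifier end up with a small relative distance and are merged into one event type.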
The third step, error event classification: after the error events have been merged in the previous step, similar error events are grouped according to the keywords in their text messages, assigned an ID, and saved in the database;
The fourth step, extracting sequences: in chronological order, the events that occur within a time window before a failure are extracted and set as the failure-related event sequence, taking the failure lead time into account; the current failure event is the related failure event. Non-failure-related event sequences are the event sequences within time intervals in which the system does not fail.
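The window-based extraction in the fourth step can be sketched as a slice over time-sorted events. The parameters `window` and `lead_time`, the half-open interval convention, and the function name are assumptions for illustration, because the symbols for these quantities are not legible in the claim text.

```python
from bisect import bisect_left


def failure_sequences(events, failure_times, window, lead_time=0.0):
    """Collect, for each failure, the event IDs that precede it.

    events: list of (timestamp, event_id) tuples sorted by timestamp.
    For a failure at time t, events with
    t - lead_time - window <= timestamp < t - lead_time are returned.
    """
    times = [t for t, _ in events]
    sequences = []
    for failure_time in failure_times:
        end = failure_time - lead_time
        start = end - window
        lo = bisect_left(times, start)
        hi = bisect_left(times, end)
        sequences.append([event_id for _, event_id in events[lo:hi]])
    return sequences
```

Non-failure sequences could be produced with the same function by passing reference times taken from intervals in which no failure occurred.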
4. A real-time log control system for the real-time log control method as claimed in claim 1, characterized in that the real-time log control system comprises: a log information processing module and a log failure analysis module.
5. The real-time log control system as claimed in claim 4, characterized in that the log failure analysis module comprises:
A log information collection unit, for collecting the log file data on each node of the distributed system; the log collection function should allow the log files to be monitored to be user-defined, and uses incremental checking to send newly produced log data to the collecting end in real time;
A log information filtering unit, for de-duplicating and filtering the data;
A log information standard-format unit, for converting the processed log information into a standard data format;
A log storage unit, for persisting the processed standard-format data.
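The incremental check performed by the collection unit can be pictured as remembering a byte offset per monitored file and reading only what was appended since the last poll. The function name `read_new_lines`, the offset bookkeeping, and the handling of truncated files are assumptions for this sketch; the claim does not say how increments are detected.

```python
import os


def read_new_lines(path, offset):
    """Return (complete lines appended since `offset`, new offset)."""
    size = os.path.getsize(path)
    if size < offset:
        # Assumption: a shrinking file was truncated or rotated, so start over.
        offset = 0
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    # Keep only complete lines; a trailing partial line waits for the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

A collector would call this periodically for each user-configured file and forward the returned lines to the collecting end, storing the new offset for the next call.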
6. The real-time log control system as claimed in claim 4, characterized in that the log failure analysis module comprises:
A log event sequence extraction unit;
A failure-related event clustering unit, for pre-training a small hidden semi-Markov model on the events and computing the sequence likelihood values;
A failure prediction unit, which uses the hidden semi-Markov model and Bayesian decision theory to judge whether a sequence is a failure-related sequence;
A failure result judgment and output unit: when a sequence is judged to be a failure-related sequence, the system issues a failure warning stream and outputs a failure early warning.
7. The real-time log control system as claimed in claim 6, characterized in that the log event sequence extraction unit further comprises:
An error event record extraction unit, which filters out the records at the ERROR level according to the logging level, retaining the timestamp, the scheduler module, and the text message;
A similar error event merging unit, which applies the Levenshtein edit distance algorithm to the error event sequence and merges error events with high similarity;
An error event classification unit, which applies the Levenshtein edit distance algorithm to the event sequence, groups similar error events, and assigns them an ID;
A failure-related sequence extraction unit, which, in chronological order, extracts the events within a period of time before a failure and sets them as the failure precursor events.
8. A cloud computing system using the real-time log control method of any one of claims 1 to 3.
9. A cloud computing server using the real-time log control method of any one of claims 1 to 3.
CN201711333074.7A 2017-12-13 2017-12-13 Real-time log control system and control method, cloud computing system and server Active CN108038049B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711333074.7A CN108038049B (en) 2017-12-13 2017-12-13 Real-time log control system and control method, cloud computing system and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711333074.7A CN108038049B (en) 2017-12-13 2017-12-13 Real-time log control system and control method, cloud computing system and server

Publications (2)

Publication Number Publication Date
CN108038049A true CN108038049A (en) 2018-05-15
CN108038049B CN108038049B (en) 2021-11-09

Family

ID=62102328

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711333074.7A Active CN108038049B (en) 2017-12-13 2017-12-13 Real-time log control system and control method, cloud computing system and server

Country Status (1)

Country Link
CN (1) CN108038049B (en)

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109063017A (en) * 2018-07-12 2018-12-21 广州市闲愉凡生信息科技有限公司 Data persistence distribution method of cloud computing platform
CN109218407A (en) * 2018-08-14 2019-01-15 平安普惠企业管理有限公司 Code management-control method and terminal device based on log monitoring technology
CN109343990A (en) * 2018-09-25 2019-02-15 江苏润和软件股份有限公司 A kind of cloud computing system method for detecting abnormality based on deep learning
CN109460362A (en) * 2018-11-06 2019-03-12 北京京航计算通讯研究所 System interface timing knowledge analysis system based on fine granularity Feature Semantics network
CN109460478A (en) * 2018-11-06 2019-03-12 北京京航计算通讯研究所 System interface timing knowledge analysis method based on fine granularity Feature Semantics network
CN109885456A (en) * 2019-02-20 2019-06-14 武汉大学 A kind of polymorphic type event of failure prediction technique and device based on system log cluster
CN110598871A (en) * 2018-05-23 2019-12-20 ***通信集团浙江有限公司 Method and system for flexibly controlling service flow under micro-service architecture
WO2020000763A1 (en) * 2018-06-29 2020-01-02 平安科技(深圳)有限公司 Network risk monitoring method and apparatus, computer device and storage medium
WO2020001642A1 (en) * 2018-06-28 2020-01-02 中兴通讯股份有限公司 Operation and maintenance system and method
CN110647446A (en) * 2018-06-26 2020-01-03 中兴通讯股份有限公司 Log fault association and prediction method, device, equipment and storage medium
CN110704221A (en) * 2019-09-02 2020-01-17 西安交通大学 Data center fault prediction method based on data enhancement
CN111444156A (en) * 2020-04-20 2020-07-24 南阳理工学院 Fault diagnosis method based on cloud computing
CN111585799A (en) * 2020-04-29 2020-08-25 杭州迪普科技股份有限公司 Network fault prediction model establishing method and device
CN111858263A (en) * 2020-06-12 2020-10-30 苏州浪潮智能科技有限公司 Log analysis-based fault prediction method, system and device
CN111858526A (en) * 2020-06-19 2020-10-30 国网福建省电力有限公司信息通信分公司 Failure time space prediction method and system based on information system log
CN111881153A (en) * 2020-07-24 2020-11-03 北京金山云网络技术有限公司 Data processing method and device, electronic equipment and machine-readable storage medium
CN112000502A (en) * 2020-08-11 2020-11-27 杭州安恒信息技术股份有限公司 Processing method and device for mass error logs, electronic device and storage medium
CN112416732A (en) * 2021-01-20 2021-02-26 国能信控互联技术有限公司 Hidden Markov model-based data acquisition operation anomaly detection method
CN112738088A (en) * 2020-12-28 2021-04-30 上海观安信息技术股份有限公司 Behavior sequence anomaly detection method and system based on unsupervised algorithm
CN112800666A (en) * 2021-01-18 2021-05-14 上海派拉软件股份有限公司 Log behavior analysis training method and identity security risk prediction method
CN112988440A (en) * 2021-02-23 2021-06-18 山东英信计算机技术有限公司 System fault prediction method and device, electronic equipment and storage medium
CN113806178A (en) * 2021-09-22 2021-12-17 中国建设银行股份有限公司 Cluster node fault detection method and device
CN114169651A (en) * 2022-02-14 2022-03-11 中国空气动力研究与发展中心计算空气动力研究所 Active prediction method for supercomputer operation failure based on application similarity
CN115033889A (en) * 2022-06-22 2022-09-09 中国电信股份有限公司 Illegal copyright detection method and device, storage medium and computer equipment
CN115426276A (en) * 2022-08-22 2022-12-02 神华准格尔能源有限责任公司 Monitoring method for strip mine 5G major equipment and cloud server
CN116192612A (en) * 2023-04-23 2023-05-30 成都新西旺自动化科技有限公司 System fault monitoring and early warning system and method based on log analysis
CN116520817A (en) * 2023-07-05 2023-08-01 贵州宏信达高新科技有限责任公司 ETC system running state real-time monitoring system and method based on expressway
WO2023231192A1 (en) * 2022-05-31 2023-12-07 ***数智科技有限公司 Srv6-based intelligent network and device fault prediction method and system
CN117348586A (en) * 2023-10-11 2024-01-05 江苏云涌电子科技股份有限公司 Event sequence record SOE implementation method based on energy storage EMS system
JP7504307B1 (en) 2023-05-23 2024-06-21 三菱電機株式会社 Information processing device, analysis system, analysis method, and program
CN113806178B (en) * 2021-09-22 2024-06-28 中国建设银行股份有限公司 Cluster node fault detection method and device

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080004904A1 (en) * 2006-06-30 2008-01-03 Tran Bao Q Systems and methods for providing interoperability among healthcare devices
CN102968556A (en) * 2012-11-08 2013-03-13 重庆大学 Probability distribution-based distribution network reliability judgment method
CN103761173A (en) * 2013-12-28 2014-04-30 华中科技大学 Log based computer system fault diagnosis method and device
CN103825272A (en) * 2014-03-18 2014-05-28 国家电网公司 Reliability determination method for power distribution network with distributed wind power based on analytical method
CN104361169A (en) * 2014-11-12 2015-02-18 武汉科技大学 Method for monitoring reliability of modeling based on decomposition method
CN104537487A (en) * 2014-12-25 2015-04-22 云南电网公司电力科学研究院 Assessment method of operating dynamic risk of electric transmission and transformation equipment
CN104778370A (en) * 2015-04-20 2015-07-15 北京交通大学 Risk analyzing method based on Monte-Carlo simulation solution dynamic fault tree model
CN105095918A (en) * 2015-09-07 2015-11-25 上海交通大学 Multi-robot system fault diagnosis method
CN105653444A (en) * 2015-12-23 2016-06-08 北京大学 Internet log data-based software defect failure recognition method and system
CN105893208A (en) * 2016-03-31 2016-08-24 城云科技(杭州)有限公司 Cloud computing platform system fault prediction method based on hidden semi-Markov models
CN107423205A (en) * 2017-07-11 2017-12-01 北京明朝万达科技股份有限公司 A kind of system failure method for early warning and system for anti-data-leakage system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FELIX SALFNER 等: "Using Hidden Semi-Markov Models for Effective Online Failure Prediction", 《26TH IEEE INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS》 *

Cited By (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598871A (en) * 2018-05-23 2019-12-20 ***通信集团浙江有限公司 Method and system for flexibly controlling service flow under micro-service architecture
CN110647446B (en) * 2018-06-26 2023-02-21 中兴通讯股份有限公司 Log fault association and prediction method, device, equipment and storage medium
CN110647446A (en) * 2018-06-26 2020-01-03 中兴通讯股份有限公司 Log fault association and prediction method, device, equipment and storage medium
CN110659173A (en) * 2018-06-28 2020-01-07 中兴通讯股份有限公司 Operation and maintenance system and method
KR102483025B1 (en) * 2018-06-28 2022-12-29 지티이 코포레이션 Operational maintenance systems and methods
CN110659173B (en) * 2018-06-28 2023-05-26 中兴通讯股份有限公司 Operation and maintenance system and method
US11947438B2 (en) 2018-06-28 2024-04-02 Xi'an Zhongxing New Software Co., Ltd. Operation and maintenance system and method
KR20210019564A (en) * 2018-06-28 2021-02-22 지티이 코포레이션 Operation maintenance system and method
WO2020001642A1 (en) * 2018-06-28 2020-01-02 中兴通讯股份有限公司 Operation and maintenance system and method
WO2020000763A1 (en) * 2018-06-29 2020-01-02 平安科技(深圳)有限公司 Network risk monitoring method and apparatus, computer device and storage medium
CN109063017A (en) * 2018-07-12 2018-12-21 广州市闲愉凡生信息科技有限公司 Data persistence distribution method of cloud computing platform
CN109218407A (en) * 2018-08-14 2019-01-15 平安普惠企业管理有限公司 Code management-control method and terminal device based on log monitoring technology
CN109218407B (en) * 2018-08-14 2022-10-25 平安普惠企业管理有限公司 Code management and control method based on log monitoring technology and terminal equipment
CN109343990A (en) * 2018-09-25 2019-02-15 江苏润和软件股份有限公司 A kind of cloud computing system method for detecting abnormality based on deep learning
CN109460478A (en) * 2018-11-06 2019-03-12 北京京航计算通讯研究所 System interface timing knowledge analysis method based on fine granularity Feature Semantics network
CN109460362A (en) * 2018-11-06 2019-03-12 北京京航计算通讯研究所 System interface timing knowledge analysis system based on fine granularity Feature Semantics network
CN109885456A (en) * 2019-02-20 2019-06-14 武汉大学 A kind of polymorphic type event of failure prediction technique and device based on system log cluster
CN110704221A (en) * 2019-09-02 2020-01-17 西安交通大学 Data center fault prediction method based on data enhancement
CN110704221B (en) * 2019-09-02 2020-10-27 西安交通大学 Data center fault prediction method based on data enhancement
CN111444156A (en) * 2020-04-20 2020-07-24 南阳理工学院 Fault diagnosis method based on cloud computing
CN111444156B (en) * 2020-04-20 2023-01-24 南阳理工学院 Fault diagnosis method based on cloud computing
CN111585799A (en) * 2020-04-29 2020-08-25 杭州迪普科技股份有限公司 Network fault prediction model establishing method and device
CN111858263A (en) * 2020-06-12 2020-10-30 苏州浪潮智能科技有限公司 Log analysis-based fault prediction method, system and device
CN111858263B (en) * 2020-06-12 2022-08-02 苏州浪潮智能科技有限公司 Log analysis-based fault prediction method, system and device
CN111858526B (en) * 2020-06-19 2022-08-16 国网福建省电力有限公司信息通信分公司 Failure time space prediction method and system based on information system log
CN111858526A (en) * 2020-06-19 2020-10-30 国网福建省电力有限公司信息通信分公司 Failure time space prediction method and system based on information system log
CN111881153A (en) * 2020-07-24 2020-11-03 北京金山云网络技术有限公司 Data processing method and device, electronic equipment and machine-readable storage medium
CN112000502A (en) * 2020-08-11 2020-11-27 杭州安恒信息技术股份有限公司 Processing method and device for mass error logs, electronic device and storage medium
CN112738088B (en) * 2020-12-28 2023-03-21 上海观安信息技术股份有限公司 Behavior sequence anomaly detection method and system based on unsupervised algorithm
CN112738088A (en) * 2020-12-28 2021-04-30 上海观安信息技术股份有限公司 Behavior sequence anomaly detection method and system based on unsupervised algorithm
CN112800666A (en) * 2021-01-18 2021-05-14 上海派拉软件股份有限公司 Log behavior analysis training method and identity security risk prediction method
CN112416732A (en) * 2021-01-20 2021-02-26 国能信控互联技术有限公司 Hidden Markov model-based data acquisition operation anomaly detection method
CN112416732B (en) * 2021-01-20 2021-06-01 国能信控互联技术有限公司 Hidden Markov model-based data acquisition operation anomaly detection method
CN112988440A (en) * 2021-02-23 2021-06-18 山东英信计算机技术有限公司 System fault prediction method and device, electronic equipment and storage medium
CN112988440B (en) * 2021-02-23 2023-08-01 山东英信计算机技术有限公司 System fault prediction method and device, electronic equipment and storage medium
CN113806178B (en) * 2021-09-22 2024-06-28 中国建设银行股份有限公司 Cluster node fault detection method and device
CN113806178A (en) * 2021-09-22 2021-12-17 中国建设银行股份有限公司 Cluster node fault detection method and device
CN114169651B (en) * 2022-02-14 2022-04-19 中国空气动力研究与发展中心计算空气动力研究所 Active prediction method for supercomputer operation failure based on application similarity
CN114169651A (en) * 2022-02-14 2022-03-11 中国空气动力研究与发展中心计算空气动力研究所 Active prediction method for supercomputer operation failure based on application similarity
WO2023231192A1 (en) * 2022-05-31 2023-12-07 ***数智科技有限公司 Srv6-based intelligent network and device fault prediction method and system
CN115033889B (en) * 2022-06-22 2023-10-31 中国电信股份有限公司 Illegal right-raising detection method and device, storage medium and computer equipment
CN115033889A (en) * 2022-06-22 2022-09-09 中国电信股份有限公司 Illegal copyright detection method and device, storage medium and computer equipment
CN115426276A (en) * 2022-08-22 2022-12-02 神华准格尔能源有限责任公司 Monitoring method for strip mine 5G major equipment and cloud server
CN115426276B (en) * 2022-08-22 2024-03-12 神华准格尔能源有限责任公司 Method for monitoring 5G major equipment of strip mine and cloud server
CN116192612A (en) * 2023-04-23 2023-05-30 成都新西旺自动化科技有限公司 System fault monitoring and early warning system and method based on log analysis
JP7504307B1 (en) 2023-05-23 2024-06-21 三菱電機株式会社 Information processing device, analysis system, analysis method, and program
CN116520817A (en) * 2023-07-05 2023-08-01 贵州宏信达高新科技有限责任公司 ETC system running state real-time monitoring system and method based on expressway
CN116520817B (en) * 2023-07-05 2023-08-29 贵州宏信达高新科技有限责任公司 ETC system running state real-time monitoring system and method based on expressway
CN117348586A (en) * 2023-10-11 2024-01-05 江苏云涌电子科技股份有限公司 Event sequence record SOE implementation method based on energy storage EMS system
CN117348586B (en) * 2023-10-11 2024-02-27 江苏云涌电子科技股份有限公司 Event sequence record SOE implementation method based on energy storage EMS system

Also Published As

Publication number Publication date
CN108038049B (en) 2021-11-09

Similar Documents

Publication Publication Date Title
CN108038049A (en) Real-time logs control system and control method, cloud computing system and server
KR101984730B1 (en) Automatic predicting system for server failure and automatic predicting method for server failure
CN110958136A (en) Deep learning-based log analysis early warning method
CN107577588A (en) A kind of massive logs data intelligence operational system
CN101516099B (en) Test method for sensor network anomaly
CN105893208A (en) Cloud computing platform system fault prediction method based on hidden semi-Markov models
CN111858526B (en) Failure time space prediction method and system based on information system log
CN106649527B (en) Advertisement click abnormity detection system and detection method based on Spark Streaming
CN104518905A (en) Fault locating method and fault locating device
CN115809183A (en) Method for discovering and disposing information-creating terminal fault based on knowledge graph
CN108549647A (en) The method without accident in mark language material active predicting movement customer service field is realized based on SinglePass algorithms
CN113378990A (en) Traffic data anomaly detection method based on deep learning
CN104777827A (en) Method for diagnosing fault of high-speed railway signal system vehicle-mounted equipment
CN111950660A (en) Alarm prediction method and device for artificial intelligence training platform
CN107229017A (en) A kind of wind generating set pitch control battery abnormal failure Forecasting Methodology
Ouyang et al. Study on the classification of data streams with concept drift
CN111726351B (en) Bagging-improved GRU parallel network flow abnormity detection method
CN112951311A (en) Hard disk fault prediction method and system based on variable weight random forest
CN104579782A (en) Hotspot security event identification method and system
CN115758908A (en) Alarm online prediction method under alarm flooding condition based on deep learning
Zhang et al. Automatic Traffic Anomaly Detection on the Road Network with Spatial‐Temporal Graph Neural Network Representation Learning
Liu et al. An efficient framework for unsupervised anomaly detection over edge-assisted internet of things
Zhang et al. Failure prediction in ibm bluegene/l event logs
CN113778792B (en) Alarm classifying method and system for IT equipment
Barros et al. Leveraging phase transition of topics for event detection in social media

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant