CN114781529A

CN114781529A - KPI (Key Performance Indicator) anomaly detection method, apparatus, device and medium

Info

Publication number: CN114781529A
Application number: CN202210460951.1A
Authority: CN
Inventors: 苏海明
Original assignee: Zhengzhou Yunhai Information Technology Co Ltd
Current assignee: Zhengzhou Yunhai Information Technology Co Ltd
Priority date: 2022-04-28
Filing date: 2022-04-28
Publication date: 2022-07-22
Also published as: WO2023208136A1

Abstract

The application discloses a KPI anomaly detection method, apparatus, device and medium, applied to the technical field of KPI anomaly detection. The method comprises the following steps: acquiring single-dimensional KPI time sequence data of a target interval, where the length of the target interval is a first preset time length and the end time point of the target interval is a designated time point; extracting first data features of the single-dimensional KPI time sequence data; inputting the first data features into a base classifier, and outputting a preliminary anomaly detection result of the single-dimensional KPI time sequence data by using the base classifier; extracting second data features of a target time point, and inputting the second data features and the preliminary anomaly detection result into a label classifier to obtain a classification result, where the target time point is any time point within a second preset time length after the designated time point; and determining a final anomaly detection result of the single-dimensional KPI time sequence data based on the classification result. The accuracy of KPI anomaly detection can thereby be improved.

Description

KPI (Key Performance Indicator) anomaly detection method, apparatus, device and medium

Technical Field

The present application relates to the field of KPI anomaly detection technologies, and in particular, to a KPI anomaly detection method, apparatus, device, and medium.

Background

With the rapid development of cloud computing, bare-metal servers, which combine the performance of physical machines with the elasticity of the cloud, are quietly emerging. To optimize the performance of physical machines and cloud hosts in cloud computing, the analysis of monitoring data provides guidance for machine performance optimization.

Currently, server monitoring data mainly includes performance data for the CPU (central processing unit), memory, storage, network, and the like, including time sequence performance data such as CPU usage, memory usage, and network throughput. Most of these data are single-dimensional KPI (Key Performance Indicator) data, and anomaly detection on single-dimensional time sequence data often faces several challenges: there is no definable pattern of anomaly occurrence; noise may be present in the data; and the data is often non-stationary and changes dynamically. Anomaly detection on single-dimensional time sequence data is therefore very challenging. How to improve the accuracy of KPI anomaly detection is a subject of continuing research in the technical field of KPI anomaly detection.

Disclosure of Invention

In view of the above, an object of the present application is to provide a KPI anomaly detection method, apparatus, device and medium, which can improve accuracy of KPI anomaly detection. The specific scheme is as follows:

in a first aspect, the present application discloses a KPI anomaly detection method, including:

acquiring single-dimensional KPI time sequence data of a target interval; the length of the target interval is a first preset time length, and the ending time point of the target interval is a designated time point;

extracting a first data characteristic of the single-dimensional KPI time sequence data;

inputting the first data characteristics into a base classifier, and outputting a preliminary abnormal detection result of the single-dimensional KPI time sequence data by using the base classifier;

extracting second data characteristics of a target time point, and inputting the second data characteristics and the preliminary abnormal detection result into a label classifier to obtain a classification result; the target time point is any time point within a second preset time length after the designated time point;

and determining a final abnormal detection result of the single-dimensional KPI time sequence data based on the classification result.

Optionally, the extracting a first data feature of the single-dimensional KPI timing data includes:

carrying out normalization processing on the single-dimensional KPI time sequence data to obtain normalized data;

and determining the normalized data and at least one of the statistical characteristic, the prediction characteristic and the frequency domain characteristic of the normalized data as a first data characteristic of the single-dimensional KPI time sequence data.

Optionally, the method further includes:

constructing a candidate root cause set based on a plurality of single-dimensional KPI time sequence data with abnormal final abnormal detection results; wherein the set of candidate root causes includes one or more of the single-dimensional KPI timing data;

taking the candidate root cause set as a node, and constructing a multilayer root cause tree according to the data dimension of the candidate root cause set;

and carrying out layer-by-layer pruning on the multilayer root cause tree based on a preset pruning strategy, and determining an abnormal root cause set based on a ripple effect.

Optionally, the data dimensions of the nodes in the same layer in the multilayer root cause tree are the same, and the data dimension number of the nodes in each layer decreases from top to bottom;

correspondingly, the pruning the multilayer root cause tree layer by layer based on the preset pruning strategy comprises the following steps: and pruning the multilayer root-cause trees layer by layer from top to bottom based on a preset pruning strategy.

Optionally, the pruning the multilayer root cause tree layer by layer based on a preset pruning strategy includes:

and aiming at any layer, determining the influence value of each node based on a preset influence value calculation rule, judging whether the influence value is smaller than a preset influence threshold value, and if so, rejecting the node and child nodes of the rejected node in each layer below the layer.

Optionally, the pruning the multilayer root cause tree layer by layer based on a preset pruning strategy and determining an abnormal root cause set based on a ripple effect includes:

for any layer, after the nodes with the influence values smaller than the preset influence threshold value are removed, potential scores corresponding to the remaining nodes are determined based on the ripple effect; wherein the potential score characterizes a degree of influence of an element of a node on elements of children of the node;

and sequencing the potential scores of the remaining nodes of each layer, and determining an abnormal root cause set based on a sequencing result.

Optionally, the determining the influence value of each node based on the preset influence value calculation rule includes:

and calculating a predicted value of each element in each node based on a preset prediction algorithm, and determining the influence value of each node based on the predicted value of each element and the actual value of each element.

In a second aspect, the present application discloses a KPI anomaly detection apparatus, comprising:

the KPI data acquisition module is used for acquiring single-dimensional KPI time sequence data of a target interval; the length of the target interval is a first preset time length, and the end time point of the target interval is a designated time point;

the data feature extraction module is used for extracting a first data feature of the single-dimensional KPI time sequence data;

the detection result output module is used for inputting the first data characteristics into a base classifier and outputting a preliminary abnormal detection result of the single-dimensional KPI time sequence data by using the base classifier;

the classification result acquisition module is used for extracting second data characteristics of a target time point and inputting the second data characteristics and the preliminary abnormal detection result into a label classifier to obtain a classification result; the target time point is any time point within a second preset time length after the designated time point;

and the detection result determining module is used for determining a final abnormal detection result of the single-dimensional KPI time sequence data based on the classification result.

In a third aspect, the present application discloses an electronic device comprising a processor and a memory; wherein,

the memory is used for storing a computer program;

the processor is configured to execute the computer program to implement the KPI anomaly detection method.

In a fourth aspect, the present application discloses a computer readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the KPI anomaly detection method described above.

It can be seen that the application first acquires single-dimensional KPI time sequence data of a target interval, where the length of the target interval is a first preset time length and the end time point of the target interval is a designated time point. It then extracts first data features of the single-dimensional KPI time sequence data, inputs the first data features into a base classifier, and uses the base classifier to output a preliminary anomaly detection result. Next, it extracts second data features of a target time point, which is any time point within a second preset time length after the designated time point, and inputs the second data features and the preliminary anomaly detection result into a label classifier to obtain a classification result. Finally, it determines the final anomaly detection result of the single-dimensional KPI time sequence data based on the classification result. In other words, the application uses a two-layer classifier: the base classifier first detects the single-dimensional KPI time sequence data of the target interval to obtain a preliminary anomaly detection result, and the label classifier then judges that preliminary result to obtain the final anomaly detection result. In this way, the accuracy of KPI anomaly detection can be improved.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only the embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.

Fig. 1 is a flowchart of a KPI anomaly detection method disclosed in the present application;

Fig. 2 is a flowchart of a specific KPI anomaly detection method disclosed in the present application;

Fig. 3 is a schematic diagram of a specific multi-level root cause tree disclosed herein;

Fig. 4 is a schematic diagram of specific KPI anomaly detection and root cause localization disclosed in the present application;

Fig. 5 is a schematic structural diagram of a KPI anomaly detection device disclosed in the present application;

Fig. 6 is a block diagram of an electronic device disclosed in the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described clearly and completely with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.

Currently, server monitoring data mainly includes performance data for the CPU, memory, storage, network, and the like, including time sequence performance data such as CPU utilization, memory utilization, and network throughput. Most of these data are single-dimensional KPIs, and anomaly detection on single-dimensional time sequence data often faces several challenges: there is no definable pattern of anomaly occurrence; noise may be present in the data; and the data is often non-stationary and changes dynamically. Anomaly detection on single-dimensional time sequence data is therefore very challenging. Anomaly detection methods for these single-dimensional KPIs include time-series-based methods and machine-learning-based methods. The time-series-based methods mainly comprise a family of linear models such as the autoregressive integrated moving average (ARIMA) model and the exponential smoothing model. Currently, machine learning is mainly used for anomaly detection, and it falls into supervised, semi-supervised and unsupervised anomaly detection. Supervised anomaly detection trains a binary classifier, such as an SVM (support vector machine), from instances labeled as normal and abnormal. However, the proportion of abnormal to normal samples in the data is severely imbalanced and the trained model is prone to overfitting, so supervised methods are not as popular as semi-supervised or unsupervised methods. Semi-supervised anomaly detection uses a small amount of labeled data to train a classification model and uses unlabeled data to exploit the structural information implied by the samples.
The most common approach is semi-supervised training of a deep autoencoder on normal samples, so that the autoencoder has a low reconstruction error on normal data; mainstream autoencoders currently in use include the VAE (Variational Auto-Encoder), the AAE (Adversarial Auto-Encoder), and the like. The third category, unsupervised anomaly detection, detects outliers based only on intrinsic properties of the data instances, and its underlying theory also derives from the autoencoder. Such methods can generally be used for automatic labeling of data samples, and common unsupervised algorithms include the Restricted Boltzmann Machine (RBM), the deep belief network (DBN), and the like. How to improve the accuracy of KPI anomaly detection is a subject of continuing research in the technical field of KPI anomaly detection. The application therefore provides a KPI anomaly detection scheme that can improve the accuracy of KPI anomaly detection.

Referring to fig. 1, an embodiment of the present application discloses a KPI anomaly detection method, including:

step S11: acquiring single-dimensional KPI time sequence data of a target interval; the length of the target interval is a first preset time length, and the ending time point of the target interval is a designated time point.

It should be noted that in the KPI anomaly detection scenario, KPI anomalies usually take the form of continuous intervals. Once an anomaly occurs, it may persist for some time rather than occupy a single point in time: when a KPI becomes abnormal at time t, the abnormality lasts until time t + T. The anomaly detection method in the application is therefore divided into two steps: the base classifier detects anomalies in the KPI time sequence data, and the label classifier then examines the detected anomalies further.

For example, at time t, the data within the interval (t-W+1, t) is extracted as $X_t = (x_{t-W+1}, x_{t-W+2}, \ldots, x_t)$, where t is the designated time point and W is the first preset time length. $X_t$ denotes the single-dimensional KPI time sequence data, which may be CPU data, memory data, network data, and the like.
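The window extraction in step S11 can be sketched in a few lines of Python. This is only an illustration: the function name `extract_window`, the 0-based indexing and the sample values are assumptions, not part of the application.

```python
from typing import List, Sequence


def extract_window(series: Sequence[float], t: int, w: int) -> List[float]:
    """Return X_t = (x_{t-W+1}, ..., x_t) for a 0-based index t (assumed convention)."""
    if w <= 0:
        raise ValueError("window length W must be positive")
    start = t - w + 1
    if start < 0 or t >= len(series):
        raise IndexError("window [t-W+1, t] falls outside the series")
    return list(series[start : t + 1])


# Hypothetical CPU usage samples; t is the designated time point, w the window length W.
cpu_usage = [0.21, 0.23, 0.22, 0.25, 0.91, 0.24]
print(extract_window(cpu_usage, t=4, w=3))  # [0.22, 0.25, 0.91]
```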

Step S12: and extracting a first data characteristic of the single-dimensional KPI time sequence data.

In a specific implementation manner, normalization processing may be performed on the single-dimensional KPI time series data to obtain normalized data; and determining the normalized data and at least one of the statistical characteristics, the prediction characteristics and the frequency domain characteristics of the normalized data as the first data characteristics of the single-dimensional KPI time sequence data.

In a specific embodiment, the raw data may be normalized with the min-max method, whose expression is:

$$X_{t\_nom} = \frac{X_t - \min(X_t)}{\max(X_t) - \min(X_t)}$$

where $X_{t\_nom}$ denotes the normalized data. The statistical features may include at least one of the mean, variance, extreme values, quantiles, differences, and the like. The prediction features may be obtained by predicting the normalized data with the EWMA (Exponentially Weighted Moving Average) prediction algorithm. The frequency domain features may be wavelet features obtained by decomposition with the db2 wavelet. Among these features, the normalized data and the statistical features represent short-term characteristics of the KPI time sequence data, the prediction features indicate to some extent how likely the KPI time sequence data is to be abnormal, and the wavelet decomposition features represent the characteristics of the KPI data in the frequency domain.
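As a rough illustration of step S12, the following sketch computes min-max normalized data, a few statistical features and an EWMA-based prediction feature using only the Python standard library. The function names, the smoothing factor `alpha`, the particular choice of features and the handling of a constant window are assumptions; the db2 wavelet features are omitted because they would need a wavelet library.

```python
import statistics
from typing import Dict, List, Sequence


def min_max_normalize(window: Sequence[float]) -> List[float]:
    """Min-max normalization of one window to the range [0, 1]."""
    lo, hi = min(window), max(window)
    if hi == lo:  # constant window: avoid division by zero (assumed convention)
        return [0.0 for _ in window]
    return [(x - lo) / (hi - lo) for x in window]


def ewma_forecast(values: Sequence[float], alpha: float = 0.3) -> float:
    """One-step-ahead EWMA forecast built from all values passed in."""
    smoothed = values[0]
    for x in values[1:]:
        smoothed = alpha * x + (1 - alpha) * smoothed
    return smoothed


def first_data_features(window: Sequence[float], alpha: float = 0.3) -> Dict[str, float]:
    """Illustrative feature set for a window with at least two points."""
    norm = min_max_normalize(window)
    predicted_last = ewma_forecast(norm[:-1], alpha)  # forecast for the last point
    return {
        "mean": statistics.fmean(norm),
        "variance": statistics.pvariance(norm),
        "min": min(norm),
        "max": max(norm),
        "median": statistics.median(norm),
        "last_diff": norm[-1] - norm[-2],
        # A large residual hints that the last point deviates from its forecast.
        "ewma_residual": norm[-1] - predicted_last,
    }


print(first_data_features([1.0, 1.0, 1.0, 5.0]))
```

The resulting dictionary, together with the normalized values themselves, would form the input vector of the base classifier.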

Step S13: and inputting the first data characteristic into a base classifier, and outputting a preliminary abnormal detection result of the single-dimensional KPI time sequence data by using the base classifier.

In a specific implementation, the XGBoost model may be used as the base classifier for anomaly detection, with the first data features as its input and normal or abnormal as its output. The KPI anomaly detection of this embodiment thus converts anomaly detection into a binary classification problem and uses XGBoost as the binary classifier. The XGBoost model may be expressed as:

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i)$$

where $f_k(x)$ denotes the k-th weak learner, K is the total number of weak learners in the XGBoost model, $x_i$ is the i-th sample, and $\hat{y}_i$ is the predicted value of sample $x_i$. To compose a strong classifier from these K weak classifiers, the following function needs to be minimized:

$$Obj = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k), \qquad \Omega(f) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2$$

where $l(\cdot)$ is the loss function and $\Omega(\cdot)$ is the regularization function. $y_i$ is the true value of sample $x_i$. In the regularization term, T is the number of leaf nodes of the tree, w is the weight of the leaf nodes, and $\gamma$ and $\lambda$ are hyper-parameters. At each iteration, only the objective function of the t-th regression tree is optimized:

$$Obj^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t)$$

where $\hat{y}_i^{(t-1)}$ is the output of the first t-1 trees for sample $x_i$ and $f_t(x_i)$ is the output of the current tree. Performing a Taylor expansion of the objective function and keeping the first-order and second-order terms gives an approximation of the objective:

$$Obj^{(t)} \approx \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t) = \sum_{j=1}^{T} \left[ \Big(\sum_{i \in I_j} g_i\Big) w_j + \frac{1}{2} \Big(\sum_{i \in I_j} h_i + \lambda\Big) w_j^2 \right] + \gamma T$$

where $g_i = \partial_{\hat{y}^{(t-1)}} l(y_i, \hat{y}_i^{(t-1)})$ and $h_i = \partial^2_{\hat{y}^{(t-1)}} l(y_i, \hat{y}_i^{(t-1)})$ are the first and second derivatives of the loss function for each sample, $I_j$ denotes the set of samples mapped onto the j-th leaf node ($i \in I_j$), and n is the number of samples. Setting the derivative with respect to $w_j$ equal to 0 gives the optimal solution for $w_j$:

$$w_j^{*} = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}$$

Substituting $w_j^{*}$ into the objective function gives:

$$Obj^{*} = -\frac{1}{2} \sum_{j=1}^{T} \frac{\left(\sum_{i \in I_j} g_i\right)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma T$$

where T is the number of leaf nodes. Through this iteration, the optimal splitting variables and split values of the tree can be found. $Obj^{*}$ is used to find the tree with the optimal structure, which is added to the model; the optimal tree structure is searched for with a greedy algorithm.
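The optimal leaf weight and the structure objective of standard XGBoost can be checked numerically with a small sketch. It assumes a logistic loss for the binary (normal/abnormal) problem; the function names are hypothetical, and the code illustrates the textbook formulas rather than the XGBoost library itself.

```python
import math
from typing import List, Sequence, Tuple


def logistic_grad_hess(y_true: Sequence[int], raw_pred: Sequence[float]) -> Tuple[List[float], List[float]]:
    """First and second derivatives (g_i, h_i) of the logistic loss w.r.t. the raw score."""
    grads, hessians = [], []
    for y, s in zip(y_true, raw_pred):
        p = 1.0 / (1.0 + math.exp(-s))
        grads.append(p - y)
        hessians.append(p * (1.0 - p))
    return grads, hessians


def optimal_leaf_weight(g: Sequence[float], h: Sequence[float], lam: float) -> float:
    """w_j* = -sum(g) / (sum(h) + lambda) for the samples in one leaf."""
    return -sum(g) / (sum(h) + lam)


def leaf_score(g: Sequence[float], h: Sequence[float], lam: float) -> float:
    """(sum g)^2 / (sum h + lambda), the per-leaf term of the structure objective."""
    return sum(g) ** 2 / (sum(h) + lam)


def structure_objective(leaves: Sequence[Tuple[Sequence[float], Sequence[float]]], lam: float, gamma: float) -> float:
    """Obj* = -1/2 * sum_j leaf_score_j + gamma * T, where T is the number of leaves."""
    return -0.5 * sum(leaf_score(g, h, lam) for g, h in leaves) + gamma * len(leaves)


def split_gain(g_left, h_left, g_right, h_right, lam: float, gamma: float) -> float:
    """Reduction of Obj* obtained by splitting one leaf into a left and a right leaf."""
    parent = leaf_score(list(g_left) + list(g_right), list(h_left) + list(h_right), lam)
    return 0.5 * (leaf_score(g_left, h_left, lam) + leaf_score(g_right, h_right, lam) - parent) - gamma


g, h = logistic_grad_hess([1, 0, 1], [0.0, 0.0, 0.0])
print(optimal_leaf_weight(g, h, lam=1.0))
```

A greedy tree builder would evaluate `split_gain` for every candidate split and keep the split with the largest positive gain.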

In this way, the above detection determines whether the single-dimensional KPI time sequence data is abnormal, but it addresses only point anomalies. In a real system, events matter more than individual points. This application is concerned mainly with event anomalies, which appear in a KPI as a continuous interval. The detected results therefore need to be filtered; that is, the preliminary anomaly detection result is judged by using a label classifier.

Step S14: extracting second data characteristics of a target time point, and inputting the second data characteristics and the preliminary abnormal detection result into a label classifier to obtain a classification result; the target time point is any time point within a second preset time length after the designated time point.

The second data feature extraction method may refer to the first data feature extraction method, extract the target time point and KPI timing data within a first preset time length before the target time point, and perform feature extraction to obtain the second data feature of the target time point.

Step S15: and determining a final abnormal detection result of the single-dimensional KPI time sequence data based on the classification result.

It should be noted that the time point at which an abnormality is detected is defined as t. If the abnormality still occurs within the time (t, t + T), the detected abnormality is determined to be a true abnormality; otherwise it is a false abnormality. Classification results can therefore be divided into four categories: true positive, false positive, true negative and false negative cases. A point previously detected as abnormal and also detected as abnormal within time T is a true positive case TN; a point previously detected as abnormal but detected as normal within time T is a false positive case FN; a point previously detected as normal and detected as normal within time T is a true negative case TP; and a point previously detected as normal but detected as abnormal within time T is a false negative case FP. In the label classifier, an abnormality detected within time T after the start of a continuous abnormal section indicates that the entire section is abnormal, so the detected TN and FP cases are true abnormalities and the other cases can be ignored.

The label classifier in the application also uses the XGBoost model. The input of the label classifier is the feature $f_l$, formed by combining the features extracted in the foregoing manner at any time point $t_i$ within time T after the abnormal time point t with the preliminary anomaly detection result $l_t$ corresponding to time t; that is, $t_i \in (t, t + T)$. Whether an abnormal interval exists is then judged as follows: when the feature $f_l$ at time $t_i$ is input to the label classifier, a result $y_i$ is obtained; if $y_i \in \{TN, FP\}$, the continuous interval is determined to be an abnormal interval, and the final detection result is abnormal.
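The decision rule of the label classifier can be sketched as follows. The trained XGBoost label classifier is replaced here by a hypothetical stand-in (`toy_label_classifier`, a simple threshold rule), and the function names and the 0.8 threshold are assumptions made only so that the example runs.

```python
from typing import Callable, Iterable, List, Sequence

# Labels that the description treats as true abnormalities.
ABNORMAL_LABELS = {"TN", "FP"}


def final_detection(
    prelim_abnormal: bool,
    follow_up_features: Iterable[Sequence[float]],
    label_classifier: Callable[[List[float]], str],
) -> bool:
    """Return True if any time point t_i in (t, t+T) is classified as TN or FP."""
    flag = 1.0 if prelim_abnormal else 0.0
    for feats in follow_up_features:
        combined = list(feats) + [flag]  # f_l: features at t_i plus preliminary result l_t
        if label_classifier(combined) in ABNORMAL_LABELS:
            return True
    return False


def toy_label_classifier(combined: List[float]) -> str:
    """Hypothetical stand-in for the trained label classifier (threshold rule)."""
    *feats, flag = combined
    later_abnormal = max(feats) > 0.8
    if flag == 1.0:
        return "TN" if later_abnormal else "FN"
    return "FP" if later_abnormal else "TP"


# Preliminary result is abnormal and the anomaly persists at a later time point.
print(final_detection(True, [[0.1], [0.9]], toy_label_classifier))  # True
```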

It can be seen that the embodiment of the present application first acquires single-dimensional KPI time sequence data of a target interval, where the length of the target interval is a first preset time length and the end time point of the target interval is a designated time point. It then extracts first data features of the single-dimensional KPI time sequence data, inputs the first data features into a base classifier, and uses the base classifier to output a preliminary anomaly detection result. Next, it extracts second data features of a target time point, which is any time point within a second preset time length after the designated time point, and inputs the second data features and the preliminary anomaly detection result into a label classifier to obtain a classification result. Finally, it determines the final anomaly detection result of the single-dimensional KPI time sequence data based on the classification result. In other words, the embodiment uses a two-layer classifier: the base classifier first detects the single-dimensional KPI time sequence data of the target interval to obtain a preliminary anomaly detection result, and the label classifier then judges that preliminary result to obtain the final anomaly detection result. In this way, the accuracy of KPI anomaly detection can be improved.

Referring to fig. 2, the present application discloses a root cause positioning method, including:

step S21: constructing a candidate root cause set based on a plurality of pieces of single-dimensional KPI time sequence data with abnormal final abnormal detection results; wherein the set of candidate root causes includes one or more of the single-dimensional KPI timing data.

That is, within the same target interval, multiple pieces of single-dimensional KPI time sequence data may be abnormal, and a candidate root cause set is constructed based on these pieces of data. Each piece of single-dimensional KPI time sequence data is an element of the candidate root cause set. For the anomaly detection of the single-dimensional KPI time sequence data, reference may be made to the foregoing embodiments; the details are not repeated here.

Step S22: and taking the candidate root cause set as a node, and constructing a multilayer root cause tree according to the data dimension of the candidate root cause set.

The data dimensions of the nodes in the same layer of the multi-layer root cause tree are the same, and the data dimension number of the nodes in each layer decreases progressively from top to bottom. For example, referring to Fig. 3, the embodiment of the present application discloses a specific multi-layer root cause tree diagram, in which K1, K2, K3 and K4 represent four kinds of abnormal single-dimensional KPI time sequence data. The root cause set corresponding to each node in the first layer includes only single-dimensional KPI time sequence data. The root cause set corresponding to each node in the second layer includes 2-dimensional KPI time sequence data, that is, two kinds of single-dimensional KPI time sequence data. The root cause set corresponding to each node in the third layer includes 3-dimensional KPI time sequence data, and the root cause set corresponding to each node in the fourth layer includes 4-dimensional KPI time sequence data.
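The layers described for Fig. 3 can be enumerated with `itertools.combinations`. This is a minimal sketch: the function names are hypothetical, and layers are indexed by the number of KPIs per node, as in the example above (layer 1 holds single KPIs).

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List


def build_root_cause_layers(abnormal_kpis: Iterable[str]) -> List[List[FrozenSet[str]]]:
    """Layer i (1-based) holds every combination of i abnormal KPIs as one node."""
    kpis = list(abnormal_kpis)
    return [
        [frozenset(combo) for combo in combinations(kpis, size)]
        for size in range(1, len(kpis) + 1)
    ]


def children(node: FrozenSet[str], next_layer: List[FrozenSet[str]]) -> List[FrozenSet[str]]:
    """Nodes of the next layer that extend `node` by one more KPI."""
    return [candidate for candidate in next_layer if node < candidate]


layers = build_root_cause_layers(["K1", "K2", "K3", "K4"])
print([len(layer) for layer in layers])  # [4, 6, 4, 1]
```

With four abnormal KPIs the tree already has 15 nodes, which is why the later steps prune it layer by layer.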

It should be noted that, for a set of abnormal KPI time sequence data, the root cause of the abnormality can be obtained according to the correlation between the abnormalities. The root cause can be searched for based on Fig. 3: specifically, the search proceeds layer by layer according to the dimension of the root cause, each node is a root cause combination, and the final root cause falls on a certain leaf node.

Step S23: and carrying out layer-by-layer pruning on the multilayer root cause tree based on a preset pruning strategy, and determining an abnormal root cause set based on a ripple effect.

According to the embodiment of the application, the multi-layer root cause tree is pruned layer by layer from top to bottom based on a preset pruning strategy. In the layer-by-layer pruning process, for any layer, the influence value of each node is determined based on a preset influence value calculation rule, and it is judged whether the influence value is smaller than a preset influence threshold; if so, the node is rejected together with the child nodes of the rejected node in every layer below that layer. Further, for any layer, after the nodes whose influence values are smaller than the preset influence threshold are removed, the potential scores of the remaining nodes are determined based on the ripple effect, where the potential score characterizes the degree of influence of an element of a node on the elements of the children of the node. The potential scores of the remaining nodes of each layer are then sorted, and the abnormal root cause set is determined based on the sorting result.

Specifically, a predicted value of each element in each node is calculated based on a preset prediction algorithm, and the influence value of each node is determined based on the predicted value and the actual value of each element.

It should be noted that, according to the ripple effect, if a group of KPI data can affect the KPI values of other elements, that group of KPI data is a root cause set, and the other KPI data are leaf nodes of that group. Assuming that a candidate root cause set is S, the KPI values of its descendant leaf nodes can be derived according to the ripple effect, and the KPI data in the candidate set is compared with the derived KPI values. The embodiment of the present application therefore designs an evaluation criterion that expresses the comparison between the candidate set KPI and the derived KPI: the closer the two values are, the greater the probability that S is the root cause set. If there are several candidate sets, the set with fewer dimensions should be preferred, in accordance with Occam's razor.

In one implementation, the abnormal root cause positioning method adopted in the embodiment of the present application may be based on the HotSpot algorithm, with a layer-by-layer pruning strategy added to improve efficiency. The specific process is as follows:

When a KPI anomaly is detected, the time t at which the anomaly occurs and the KPI leaf element values in the window of length W before the anomaly time are given, that is, the actual values of the single-dimensional KPI time sequence data whose final detection result is abnormal: $v = \{v_{t-W+1}, v_{t-W+2}, \ldots, v_t\}$. $V_t = \{v(e_1, t), v(e_2, t), \ldots, v(e_n, t)\}$ denotes the actual KPI values of all leaf elements in any node at time t, where e represents an element and n represents the number of elements in the node. The predicted KPI values of any node at time t are $F_t = \{f(e_1, t), f(e_2, t), \ldots, f(e_n, t)\}$, and the prediction algorithm is the EWMA algorithm.

According to the ripple effect, if an element x is abnormal, all of its sub-elements change accordingly, and the variation of the element needs to be distributed proportionally to all of its descendant elements in order to calculate their derived values. Let the variation of x be h(x), with $h(x) = f(x) - v(x)$. The derived value of each element x' is then calculated as:

$$a(x') = f(x') - h(x) \cdot \frac{f(x')}{f(x)}$$

where x' is an element, different from x, in a child node of the node to which the element x belongs, f(x) is the predicted value of x (the prediction algorithm is the EWMA algorithm), and v(x) is the actual value of x.
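The proportional distribution of the variation can be sketched as below. The sketch assumes the form used by the HotSpot algorithm, a(x') = f(x') - h(x) * f(x') / f(x); the function name and the numbers are illustrative only.

```python
from typing import List, Sequence


def derived_values(f_x: float, v_x: float, f_children: Sequence[float]) -> List[float]:
    """Derive a(x') for each sub-element x' by distributing h(x) = f(x) - v(x)
    in proportion to the predicted values f(x') (assumed HotSpot form)."""
    if f_x == 0:
        raise ValueError("predicted value f(x) must be non-zero")
    h_x = f_x - v_x
    return [f_c - h_x * f_c / f_x for f_c in f_children]


# Predicted 100, observed 50: the drop of 50 is split 60/40 across two sub-elements.
print(derived_values(100.0, 50.0, [60.0, 40.0]))  # [30.0, 20.0]
```

When f(x) equals the sum of the predicted values of the sub-elements, the derived values sum to the actual value v(x), which is the intuition behind the ripple effect.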

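Further, the degree of influence of the element x on the other leaf node elements can be represented by a potential score ps:

$$ps = \max\left(1 - \frac{d(\vec{v}, \vec{a})}{d(\vec{v}, \vec{f})},\ 0\right)$$

where $d(\vec{v}, \vec{a})$ is the Euclidean distance between the vectors $\vec{v}$ and $\vec{a}$:

$$d(\vec{v}, \vec{a}) = \sqrt{\sum_{i} (v_i - a_i)^2}$$

and $d(\vec{v}, \vec{f})$ is defined in the same way. Here i represents time point i in the single-dimensional KPI time sequence data. Therefore, if the leaf node element is the root cause, the closer the values of a and v are, the closer $d(\vec{v}, \vec{a})$ is to 0 and the closer ps is to 1.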
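A potential score of this kind can be computed as in the following sketch, which assumes the HotSpot form ps = max(1 - d(v, a) / d(v, f), 0) with Euclidean distances. The function names and the convention of returning 0 when the actual and predicted values coincide are assumptions.

```python
import math
from typing import Sequence


def euclidean(u: Sequence[float], w: Sequence[float]) -> float:
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, w)))


def potential_score(actual: Sequence[float], predicted: Sequence[float], derived: Sequence[float]) -> float:
    """ps = max(1 - d(v, a) / d(v, f), 0); close to 1 when derived values match actual ones."""
    denom = euclidean(actual, predicted)
    if denom == 0:  # no deviation from the forecast (assumed convention)
        return 0.0
    return max(1.0 - euclidean(actual, derived) / denom, 0.0)


# Derived values equal the actual values, so the candidate explains the change fully.
print(potential_score([30.0, 20.0], [60.0, 40.0], [30.0, 20.0]))  # 1.0
```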

It will be appreciated that the ps value of a node is determined from the ps values of its elements. The root cause set S may be determined based on the ranking of the ps values. For example, as shown in Fig. 3, when the ps value of node K1 in the first layer is calculated, K1 is x. If nodes {K1, K2} and {K1, K3} are both remaining nodes, then K2 in {K1, K2} and K3 in {K1, K3} are the elements x'. The ps value of K1 with respect to K2 in {K1, K2} and the ps value of K1 with respect to K3 in {K1, K3} are calculated separately and then summed to obtain the potential score of node K1. For the potential score of node {K1, K2}, K1 and K2 together are x, with f(x) = f(K1) + f(K2) and v(x) = v(K1) + v(K2); if node {K1, K2, K3} is a remaining node, then K3 in {K1, K2, K3} is x'.

It should be noted that the potential score of any set can be determined by the foregoing process. However, locating the abnormal root cause means searching, among the sets of all dimension combinations, for the candidate root cause set with the largest potential score, and this search space is enormous. For one-dimensional root causes the search space is n³, but because the root cause is not necessarily one-dimensional, the search space grows exponentially. To avoid this search space explosion, the present application searches for the root cause set with a layer-by-layer pruning strategy, pruning according to the influence of each root cause set. In one embodiment, the influence E(S) of the root cause set S is defined by a formula in h(S) and e,

where h(S) is the sum of the absolute values of the change values of the elements in the root cause set S, and e denotes the KPI time series data whose final detection result is abnormal. In another embodiment, the influence of the root cause set may also be calculated as E(S) = h(S).

Note that the influence indicates how likely the candidate root cause set S is to be the root cause. In addition, a threshold T_E needs to be determined. When the traversal reaches a node for which E(S) < T_E, the combination in that node has a low probability of being the root cause and is not used as a candidate root cause set. Through this pruning strategy and the potential score calculation, nodes without influence are removed and several candidate root cause sets are obtained in each layer. These are sorted by potential score from largest to smallest, and the root cause set with the highest potential score is the most probable root cause.
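The pruning and ranking steps can be sketched as follows, using the simpler embodiment E(S) = h(S), where h(S) is the sum of the absolute change values of the elements in S. The representation of a node as a `frozenset` of element names, the explicit `children` mapping, and the function names are hypothetical choices for the example; the potential score is passed in as a callable so that no particular scoring formula is implied.

```python
from typing import Callable, Dict, FrozenSet, List, Set

Node = FrozenSet[str]  # a candidate root cause set (ASSUMED representation)


def influence(node: Node, predicted: Dict[str, float], actual: Dict[str, float]) -> float:
    """E(S) = h(S): sum of |f(e) - v(e)| over the elements of the node."""
    return sum(abs(predicted[e] - actual[e]) for e in node)


def prune_layers(layers: List[List[Node]], children: Dict[Node, List[Node]],
                 predicted: Dict[str, float], actual: Dict[str, float],
                 threshold: float) -> List[List[Node]]:
    """Walk the tree top-down and drop nodes whose influence is below T_E.

    Children of a rejected node are rejected as well, layer by layer.
    """
    rejected: Set[Node] = set()
    kept_layers: List[List[Node]] = []
    for layer in layers:
        kept: List[Node] = []
        for node in layer:
            if node in rejected or influence(node, predicted, actual) < threshold:
                rejected.add(node)
                rejected.update(children.get(node, []))  # propagate downwards
            else:
                kept.append(node)
        kept_layers.append(kept)
    return kept_layers


def rank_candidates(kept_layers: List[List[Node]],
                    score: Callable[[Node], float]) -> List[Node]:
    """Sort all surviving candidate sets by potential score, largest first."""
    survivors = [node for layer in kept_layers for node in layer]
    return sorted(survivors, key=score, reverse=True)
```

Passing the parent-child relation explicitly keeps the sketch independent of how the layers of the root cause tree are actually ordered.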

For example, referring to fig. 4, the embodiment of the present application discloses a specific schematic diagram of KPI anomaly detection and root cause localization. In the KPI anomaly detection process, the multi-dimensional KPI is split into single-dimensional KPIs, which are detected separately. The anomaly detection algorithm is designed for continuous-segment anomalies in a KPI and uses a multi-layer classifier: the base classifier detects abnormal points in the KPI, and the label classifier confirms a true anomaly only when abnormal points occur within the time segment. In this way all abnormal KPIs occurring at the same moment are detected. A root cause candidate set is formed from the detected abnormal KPIs, and the root cause set is found by layer-by-layer calculation over the root cause tree based on the ripple effect, using the potential score from HotSpot to quantify the influence of each KPI. Thus the multi-layer classifier converts KPI point anomalies into KPI continuous-interval anomalies, so that the anomaly detector focuses on abnormal events; in the root cause analysis stage, the ripple effect of KPI anomalies is used, the potential scores and influence are quantified, and layer-by-layer pruning accelerates the search and locates the abnormal root cause.

Referring to fig. 5, an embodiment of the present application discloses a KPI abnormality detection apparatus, including:

a KPI data acquisition module 11, configured to acquire single-dimensional KPI timing sequence data of a target interval; the length of the target interval is a first preset time length, and the ending time point of the target interval is a designated time point;

a data feature extraction module 12, configured to extract a first data feature of the single-dimensional KPI timing data;

a detection result output module 13, configured to input the first data feature into a base classifier, and output a preliminary abnormal detection result of the single-dimensional KPI timing data by using the base classifier;

a classification result obtaining module 14, configured to extract a second data feature of the target time point, and input the second data feature and the preliminary anomaly detection result into a label classifier to obtain a classification result; the target time point is any time point within a second preset time length after the designated time point;

and a detection result determining module 15, configured to determine a final anomaly detection result of the single-dimensional KPI time series data based on the classification result.
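The data flow through modules 11 to 15 can be sketched as a single function. This is only an outline under stated assumptions: the feature extractors and both classifiers are passed in as hypothetical callables, the preliminary result is appended to the second data feature as a 0/1 value, and the final result is taken to be the label classifier's output directly. None of these choices is specified by the application.

```python
from typing import Callable, List, Sequence

Features = List[float]


def detect_anomaly(window: Sequence[float], target_value: float,
                   extract_first: Callable[[Sequence[float]], Features],
                   extract_second: Callable[[float], Features],
                   base_classifier: Callable[[Features], bool],
                   label_classifier: Callable[[Features], bool]) -> bool:
    """Two-stage detection for one single-dimensional KPI series.

    window       -- values of the target interval ending at the designated time point
    target_value -- value at a target time point after the designated time point
    """
    # Modules 12 and 13: first data feature -> preliminary result.
    preliminary = bool(base_classifier(extract_first(window)))
    # Module 14: second data feature plus the preliminary result.
    label_input = list(extract_second(target_value)) + [1.0 if preliminary else 0.0]
    # Module 15 (ASSUMPTION): the classification result is the final result.
    return bool(label_classifier(label_input))
```

In practice the two classifiers would be trained models; plain callables are used here so the sketch stays independent of any particular learning library.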

The data feature extraction module 12 specifically includes:

a normalization processing submodule, configured to normalize the single-dimensional KPI time series data to obtain normalized data;

and a data feature extraction submodule, configured to determine the normalized data, together with at least one of the statistical features, prediction features, and frequency-domain features of the normalized data, as the first data feature of the single-dimensional KPI time series data.
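A minimal sketch of the two submodules follows. The application does not state which normalization or which individual features are used, so z-score normalization, the four summary statistics, and the low-frequency DFT magnitudes below are all assumptions chosen for illustration; prediction features are omitted. The function names are hypothetical.

```python
import cmath
import math
import statistics
from typing import List, Sequence


def normalize(series: Sequence[float]) -> List[float]:
    """Z-score normalization (ASSUMED); a constant series maps to all zeros."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    if std == 0:
        return [0.0] * len(series)
    return [(x - mean) / std for x in series]


def first_data_features(series: Sequence[float], n_freq: int = 2) -> List[float]:
    """Concatenate normalized data, statistical features and frequency features."""
    z = normalize(series)
    n = len(z)
    # Statistical features of the normalized window.
    stats = [min(z), max(z), statistics.fmean(z), statistics.pstdev(z)]
    # Frequency-domain features: magnitudes of the lowest non-zero DFT bins,
    # computed with a naive DFT so that only the standard library is needed.
    freq = [
        abs(sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) / n
        for k in range(1, n_freq + 1)
    ]
    return z + stats + freq
```

The resulting vector has length n + 4 + n_freq for a window of n points and could be fed to the base classifier as the first data feature.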

Further, the device further comprises a root cause positioning module, wherein the root cause positioning module specifically comprises:

a candidate root cause set constructing submodule, configured to construct a candidate root cause set based on the plurality of single-dimensional KPI time-series data in which the final anomaly detection result is anomalous; wherein the set of candidate root causes includes one or more of the single-dimensional KPI timing data;

the root cause tree construction submodule is used for taking the candidate root cause sets as nodes and constructing a multi-layer root cause tree according to the data dimensions of the candidate root cause sets;

and the abnormal root cause set determining submodule is used for carrying out layer-by-layer pruning on the multilayer root cause tree based on a preset pruning strategy and determining an abnormal root cause set based on a ripple effect.

The data dimensions of the same layer of nodes in the multi-layer root cause tree are the same, and the data dimension number of each layer of nodes is decreased progressively from top to bottom;

correspondingly, the abnormal root cause set determining submodule is used for carrying out layer-by-layer pruning on the multilayer root cause tree from top to bottom based on a preset pruning strategy.

In a specific embodiment, the abnormal root cause set determining submodule is specifically configured to:

for any layer, determine the influence value of each node based on a preset influence value calculation rule, and judge whether the influence value is smaller than a preset influence threshold; if so, reject the node and, in the layers below that layer, the child nodes of the rejected node.

Further, the abnormal root cause set determining submodule is specifically configured to: and calculating a predicted value of each element in each node based on a preset prediction algorithm, and determining the influence value of each node based on the predicted value of each element and the actual value of each element.

Referring to fig. 6, an embodiment of the present application discloses an electronic device 20, which includes a processor 21 and a memory 22; wherein the memory 22 is used for storing a computer program, and the processor 21 is configured to execute the computer program to implement the KPI anomaly detection method disclosed in the foregoing embodiment.

For the specific process of the KPI abnormality detection method, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not described here again.

The memory 22 is used as a carrier for resource storage, and may be a read-only memory, a random access memory, a magnetic disk or an optical disk, and the storage mode may be a transient storage mode or a permanent storage mode.

In addition, the electronic device 20 further includes a power supply 23, a communication interface 24, an input-output interface 25, and a communication bus 26; the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and a communication protocol followed by the communication interface is any communication protocol applicable to the technical solution of the present application, and is not specifically limited herein; the input/output interface 25 is configured to obtain external input data or output data to the outside, and a specific interface type thereof may be selected according to a specific application requirement, which is not specifically limited herein.

Further, an embodiment of the present application also discloses a computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the KPI anomaly detection method disclosed in the foregoing embodiment.

For the specific process of the KPI abnormality detection method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.

The embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the device disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and the relevant points can be found in the description of the method.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The KPI anomaly detection method, apparatus, device and medium provided by the present application are introduced in detail above, and specific examples are applied herein to explain the principles and embodiments of the present application, and the descriptions of the above embodiments are only used to help understand the method and core ideas of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, the specific implementation manner and the application scope may be changed, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims

1. A KPI anomaly detection method is characterized by comprising the following steps:

2. A KPI anomaly detection method according to claim 1, wherein said extracting a first data characteristic of said single-dimensional KPI timing data comprises:

and determining the normalized data and at least one of the statistical characteristics, the prediction characteristics and the frequency domain characteristics of the normalized data as the first data characteristics of the single-dimensional KPI time sequence data.

3. A KPI abnormality detection method according to claim 1, characterized by further comprising:

and performing layer-by-layer pruning on the multilayer root cause tree based on a preset pruning strategy, and determining an abnormal root cause set based on a ripple effect.

4. A KPI anomaly detection method according to claim 3, wherein the data dimensions of the same layer of nodes in the multi-layer root tree are the same, and the data dimension number of each layer of nodes decreases from top to bottom;

5. A KPI anomaly detection method according to claim 4, wherein said pruning the multi-layer root cause tree layer by layer based on a preset pruning strategy comprises:

6. A KPI anomaly detection method according to claim 5, wherein said pruning said multilayer root cause tree layer by layer based on a preset pruning strategy and determining an anomalous root cause set based on a ripple effect comprises:

7. A KPI anomaly detection method according to claim 5, wherein said determining an influence value for each node based on preset influence value calculation rules comprises:

and calculating a predicted value of each element in each node based on a preset prediction algorithm, and determining an influence value of each node based on the predicted value of each element and the actual value of each element.

8. A KPI abnormality detection apparatus, characterized by comprising:

the KPI data acquisition module is used for acquiring single-dimensional KPI time sequence data of a target interval; the length of the target interval is a first preset time length, and the ending time point of the target interval is a designated time point;

9. An electronic device comprising a processor and a memory; wherein,

the memory is used for storing a computer program;

the processor for executing the computer program to implement a KPI anomaly detection method according to any of claims 1 to 7.

10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements a KPI anomaly detection method according to any one of claims 1 to 7.