CN108804703B - Data anomaly detection method and device - Google Patents

Info

Publication number
CN108804703B
Authority
CN
China
Prior art keywords
data
current service
adjustment
service data
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810631649.1A
Other languages
Chinese (zh)
Other versions
CN108804703A (en)
Inventor
郑琳琳
杨劲锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Bodian Zhihe Technology Co ltd
Original Assignee
Beijing Jiaodian Xinganxian Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiaodian Xinganxian Information Technology Co ltd filed Critical Beijing Jiaodian Xinganxian Information Technology Co ltd
Priority to CN201810631649.1A priority Critical patent/CN108804703B/en
Publication of CN108804703A publication Critical patent/CN108804703A/en
Application granted granted Critical
Publication of CN108804703B publication Critical patent/CN108804703B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a data anomaly detection method and device, which can perform at least one first type adjustment and at least one second type adjustment on current service data under the condition that the obtained current service data is periodic data, and perform at least one of periodic anomaly detection and inflection point anomaly detection on the adjusted current service data, so that whether the anomaly type is periodic anomaly or inflection point anomaly can be determined when the adjusted current service data is abnormal, and the periodic data is subjected to dynamic multiple adjustment through the first type adjustment and the second type adjustment, so that the accuracy of the periodic data anomaly detection is improved. In the case that the acquired current service data is not periodic data, at least one of deviation anomaly detection and inflection point anomaly detection may be directly performed on the current service data to determine whether the anomaly type is a deviation anomaly or an inflection point anomaly when there is an anomaly in the current service data.

Description

Data anomaly detection method and device
Technical Field
The present invention belongs to the field of data processing technologies, and in particular, to a method and an apparatus for detecting data anomalies.
Background
Over time, service data corresponding to any device, such as data acquired by a sensor or data generated by executing a certain service, may have a problem, and therefore, anomaly detection for the service data is an important branch in data analysis, and plays an important role in the fields of data centers and the like, such as event detection, intrusion detection, fraud detection, error detection and the like for the data centers.
The current method for detecting anomalies in service data is as follows: statistical analysis is performed on the current service data (that is, the service data already obtained) to derive its development rule; data prediction is performed according to that rule to obtain predicted service data; and once the actual service data is obtained, its deviation from the predicted service data is computed and used to determine whether the actual service data is abnormal. This approach can only detect whether the actual service data is abnormal; it cannot determine the anomaly type.
Disclosure of Invention
In view of the above, the present invention provides a data anomaly detection method and apparatus, which are used for determining an anomaly type when current service data is abnormal. The technical scheme is as follows:
the invention provides a data anomaly detection method, which comprises the following steps:
acquiring current service data;
under the condition that the current service data is determined to be periodic data, performing at least one first type adjustment and at least one second type adjustment on the current service data to obtain adjusted current service data, wherein the first type adjustment and the second type adjustment adopt different adjustment parameters, the first adjustment is to adjust the current service data, and the other adjustments except the first adjustment are to adjust the data obtained by the last adjustment;
performing at least one of cycle anomaly detection and inflection point anomaly detection on the adjusted current service data to determine whether the adjusted current service data has at least one of cycle anomaly and inflection point anomaly;
in a case where it is determined that the current service data is not periodic data, performing at least one of deviation anomaly detection and inflection point anomaly detection on the current service data to determine whether the current service data has at least one of a deviation anomaly and an inflection point anomaly.
Preferably, the performing at least one first type adjustment and at least one second type adjustment on the current service data when it is determined that the current service data is periodic data includes:
if the ith adjustment is the first type adjustment, acquiring a normal distribution diagram of the data corresponding to the adjustment, and adjusting data whose value in that normal distribution diagram is greater than a first preset threshold to the median of the data corresponding to the adjustment, wherein the data corresponding to the adjustment is the current service data when i is equal to 1, and is the data obtained by the (i-1)th adjustment when i is a natural number greater than 1;
if the ith adjustment is the second type adjustment, acquiring a residual component of the data corresponding to the adjustment through a time series decomposition algorithm, acquiring a normal distribution diagram of that residual component, determining the time point corresponding to each data whose value in the normal distribution diagram of the residual component is greater than a second preset threshold, and for any data whose value is greater than the second preset threshold: adjusting the data to the expected value of the data corresponding to the adjustment at the time point corresponding to the data, wherein the data corresponding to the adjustment is the current service data when i is equal to 1, and is the data obtained by the (i-1)th adjustment when i is a natural number greater than 1, and the second preset threshold is different from the first preset threshold.
Preferably, the method further comprises: if the current service data is the current service data of the main service and the current service data of the main service corresponds to the current service data of at least one sub-service, when the adjusted current service data is determined to have an abnormality, the abnormality of the current service data of the main service is verified through the current service data of the sub-service, and the abnormality is at least one of cycle abnormality and inflection point abnormality.
Preferably, the verifying the abnormality of the current service data of the main service through the current service data of the sub-service includes:
determining the data deviation direction of the current service data of the main service at each corresponding time point, and for the current service data of any sub-service belonging to the main service: determining a data deviation direction of the current service data of the sub-service at each time point, wherein the data deviation direction is used for indicating that the data development trend at the time point is any one of data increase and data decrease;
for any time point: if the data deviation direction of each sub-service at the time point is different from the data deviation direction of the main service at the time point, determining that the abnormality of the current service data of the main service is a pseudo abnormality, and determining the current service data of the main service as normal service data.
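The verification above can be illustrated with a minimal Python sketch. It rests on several assumptions that the text does not state: the deviation direction at a time point is taken as the sign of the change from the previous sample, the check is applied only at the time points already flagged as abnormal, and the names `direction`, `is_pseudo_anomaly` and `anomaly_points` are hypothetical.

```python
def direction(values, t):
    """Deviation direction at time t relative to t - 1 (t must be >= 1).

    Returns +1 for data increase, -1 for data decrease, 0 for no change.
    Using the previous sample as the reference is an assumption.
    """
    delta = values[t] - values[t - 1]
    return (delta > 0) - (delta < 0)


def is_pseudo_anomaly(main, subs, anomaly_points):
    """Return True if, at some flagged time point, every sub-service
    moves in a different direction from the main service."""
    if not subs:
        return False  # no sub-service data to verify against
    for t in anomaly_points:
        main_dir = direction(main, t)
        if all(direction(sub, t) != main_dir for sub in subs):
            return True
    return False
```

For instance, a main service that jumps upward at a time point while all of its sub-services fall at that point would be treated as a pseudo anomaly under these assumptions.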
Preferably, the method further comprises: if the current service data is the current service data of the main service and the current service data of the main service corresponds to the current service data of at least one sub-service, when the adjusted current service data has an abnormal period, calculating an expected value of the current service data of each sub-service at each corresponding time point, and for any time point: determining a deviation value of an actual value of current service data of each sub-service at the time point relative to an expected value at the time point;
determining a deviation value meeting a first preset condition from all the deviation values, and performing anomaly analysis on the current service data of the main service according to the deviation value meeting the first preset condition;
if the current service data is the current service data of the main service and the current service data of the main service corresponds to the current service data of at least one sub-service, acquiring the differential data of the current service data of each sub-service when the inflection point of the current service data of the main service is abnormal;
and determining data meeting a second preset condition from all the differential data, and performing anomaly analysis on the current service data of the main service according to the data meeting the second preset condition.
Preferably, the performing at least one of deviation anomaly detection and inflection point anomaly detection on the current service data in a case where it is determined that the current service data is not periodic data includes:
under the condition that the current service data is determined not to be periodic data, acquiring a normal distribution diagram of the current service data, acquiring a deviation point in the normal distribution diagram of the current service data, and determining whether the current service data has deviation abnormality according to the deviation point in the normal distribution diagram of the current service data;
and under the condition that the current service data is determined not to be periodic data, acquiring a normal distribution diagram of the differential data of the current service data, acquiring a deviation point in the normal distribution diagram of the differential data, and determining whether the inflection point abnormality exists in the current service data according to the deviation point in the normal distribution diagram of the differential data.
The present invention also provides a data anomaly detection apparatus, the apparatus comprising:
the acquisition unit is used for acquiring current service data;
an adjusting unit, configured to perform at least one first type adjustment and at least one second type adjustment on the current service data to obtain adjusted current service data when it is determined that the current service data is periodic data, where adjustment parameters adopted by the first type adjustment and the second type adjustment are different, the first adjustment is to adjust the current service data, and other adjustments except the first adjustment are to adjust data obtained by the last adjustment;
a first anomaly detection unit, configured to perform at least one of a period anomaly detection and an inflection point anomaly detection on the adjusted current service data to determine whether the adjusted current service data has at least one of a period anomaly and an inflection point anomaly;
a second anomaly detection unit, configured to, in a case where it is determined that the current service data is not periodic data, perform at least one of deviation anomaly detection and inflection point anomaly detection on the current service data to determine whether the current service data has at least one of a deviation anomaly and an inflection point anomaly.
Preferably, the adjusting unit is specifically configured to: if the ith adjustment is the first type adjustment, obtain a normal distribution diagram of the data corresponding to the adjustment, and adjust data whose value in that normal distribution diagram is greater than a first preset threshold to the median of the data corresponding to the adjustment, where the data corresponding to the adjustment is the current service data when i is equal to 1, and is the data obtained by the (i-1)th adjustment when i is a natural number greater than 1;
if the ith adjustment is the second type adjustment, acquire a residual component of the data corresponding to the adjustment through a time series decomposition algorithm, acquire a normal distribution diagram of that residual component, determine the time point corresponding to each data whose value in the normal distribution diagram of the residual component is greater than a second preset threshold, and for any data whose value is greater than the second preset threshold: adjust the data to the expected value of the data corresponding to the adjustment at the time point corresponding to the data, where the data corresponding to the adjustment is the current service data when i is equal to 1, and is the data obtained by the (i-1)th adjustment when i is a natural number greater than 1, and the second preset threshold is different from the first preset threshold.
Preferably, the apparatus further comprises: and if the current service data is the current service data of the main service and the current service data of the main service corresponds to the current service data of at least one sub-service, verifying the abnormality of the current service data of the main service through the current service data of the sub-service when the adjusted current service data is determined to have the abnormality, wherein the abnormality is at least one of cycle abnormality and inflection point abnormality.
Preferably, the apparatus further comprises: an offset value determining unit, configured to calculate, if the current service data is current service data of a main service and the current service data of the main service corresponds to current service data of at least one sub-service, an expected value of the current service data of each sub-service at each corresponding time point when it is determined that the adjusted current service data has an abnormal period, and for any time point: determining a deviation value of an actual value of current service data of each sub-service at the time point relative to an expected value at the time point;
the first anomaly analysis unit is used for determining an offset value meeting a first preset condition from all the offset values and carrying out anomaly analysis on the current service data of the main service according to the offset value meeting the first preset condition;
a differential data obtaining unit, configured to obtain differential data of current service data of each sub-service when an inflection point of the current service data of the main service is abnormal, if the current service data is current service data of the main service and the current service data of the main service corresponds to current service data of at least one sub-service;
and the second anomaly analysis unit is used for determining data meeting a second preset condition from all the differential data and carrying out anomaly analysis on the current service data of the main service according to the data meeting the second preset condition.
Compared with the prior art, the technical scheme provided by the invention has the following advantages:
according to the technical scheme, under the condition that the obtained current service data is periodic data, at least one first type adjustment and at least one second type adjustment can be performed on the current service data, and at least one of period anomaly detection and inflection point anomaly detection is performed on the adjusted current service data, so that whether the adjusted current service data has at least one of period anomaly and inflection point anomaly is determined, whether the anomaly type is the period anomaly or the inflection point anomaly can be determined when the adjusted current service data has the anomaly, dynamic multiple adjustment on the periodic data is realized through the first type adjustment and the second type adjustment on the periodic data, and the accuracy of the periodic data anomaly detection is improved. In the case that the obtained current service data is not periodic data, at least one of deviation anomaly detection and inflection point anomaly detection may be directly performed on the current service data, so as to determine whether the current service data has at least one of deviation anomaly and inflection point anomaly, so as to determine whether the anomaly type is the deviation anomaly or the inflection point anomaly when the current service data has the anomaly.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of a data anomaly detection method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an inflection point anomaly provided by an embodiment of the present invention;
FIG. 3 is another flow chart of a data anomaly detection method provided by an embodiment of the invention;
FIG. 4 is a flowchart illustrating a data anomaly detection method according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a data anomaly detection apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a data anomaly detection apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a data anomaly detection apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a flowchart of a data anomaly detection method provided in an embodiment of the present invention is shown, where the method is used to determine an anomaly type when current business data is anomalous, and may include the following steps:
101: Acquire current service data. It can be understood that the current service data is data corresponding to, and obtained from, a certain service, such as data generated when the service is executed. A time range may also be assigned to the service data, for example a detection period, in which case the current service data is the service data that falls within the detection period. For example, the detection period may be 8:00 to 24:00 each day, and the current service data is then the service data within 8:00 to 24:00 each day. In an actual application environment, a service may include a main service and sub-services, and a service configuration table indicates which sub-services a main service corresponds to. Therefore, if the service configuration table shows that the main service corresponds to at least one sub-service, the current service data of those sub-services must also be obtained when the current service data of the main service is obtained.
102: When the current service data is determined to be periodic data, perform at least one first type adjustment and at least one second type adjustment on the current service data to obtain the adjusted current service data. One feasible way to determine whether the current service data is periodic data is to calculate the period of the current service data through a Fast Fourier Transform (FFT): if a period can be obtained, the current service data is periodic data; otherwise it is not.
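As a rough illustration of the periodicity check, the following sketch looks for a dominant frequency in the spectrum of the series. It is only a sketch under stated assumptions: a naive discrete Fourier transform stands in for the FFT (both yield the same coefficients), and the criterion for a period being "obtainable", here the hypothetical `min_power_ratio` cutoff, is not specified in the text.

```python
import cmath
import math


def dominant_period(values, min_power_ratio=0.5):
    """Return the dominant period in samples, or None if none stands out.

    min_power_ratio is a hypothetical cutoff: the strongest frequency must
    carry at least this share of the total non-DC spectral power.
    """
    n = len(values)
    if n < 4:
        return None
    mean = sum(values) / n
    centered = [v - mean for v in values]  # remove the DC component
    # Naive O(n^2) DFT; an FFT computes the same coefficients faster.
    powers = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        powers.append(abs(coeff) ** 2)
    total = sum(powers)
    if total == 0:
        return None  # constant series: no period
    best = max(range(len(powers)), key=powers.__getitem__)
    if powers[best] / total < min_power_ratio:
        return None
    return n / (best + 1)  # frequency index k corresponds to period n / k


def is_periodic(values):
    """True if a dominant period can be obtained from the series."""
    return dominant_period(values) is not None
```

A strong trend concentrates spectral power at the lowest frequencies, so in practice the series may need detrending before a check of this kind.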
Periodic data is adjusted multiple times through the first type adjustment and the second type adjustment. The two types use different adjustment parameters; the first adjustment is applied to the current service data, and every later adjustment is applied to the data obtained by the previous adjustment. In other words, multiple adjustments are performed with different adjustment parameters, and each adjustment operates on different data: the first adjustment operates on the current service data (that is, the data obtained in step 101), and each later adjustment operates on the output of the one before it. Taking the jth adjustment (j being a natural number greater than 1) as an example, the data corresponding to the jth adjustment is the data obtained by the (j-1)th adjustment. Feasible forms of the first type adjustment and the second type adjustment are explained below:
If the ith adjustment is the first type adjustment, obtain a normal distribution diagram of the data corresponding to the adjustment, and adjust the data whose value in that diagram is greater than a first preset threshold to the median of the data corresponding to the adjustment. When i equals 1, the data corresponding to the adjustment is the current service data; when i is a natural number greater than 1, it is the data obtained by the (i-1)th adjustment.
For example, the first preset threshold may be a first preset multiple (such as, but not limited to, six times) of the standard deviation of the data corresponding to the adjustment. Data whose value in the normal distribution diagram of the data corresponding to the adjustment is greater than six times that standard deviation is adjusted to the median of the data corresponding to the adjustment. With the data sorted by value and a data length of N, the median is the ((N+1)/2)-th value when N is odd, and the average of the (N/2)-th and (N/2+1)-th values when N is even.
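A minimal sketch of one first type adjustment follows. It assumes that "greater than the threshold in the normal distribution diagram" means lying farther from the mean than the given multiple of the standard deviation; the function name and the use of the population standard deviation are illustrative choices, not taken from the text.

```python
import statistics


def first_type_adjust(values, multiple=6.0):
    """Replace points farther than multiple * std from the mean with the median.

    Measuring the distance from the mean is an assumed reading of the
    threshold test; multiple=6.0 follows the "six times" example.
    """
    mean = statistics.fmean(values)
    threshold = multiple * statistics.pstdev(values)
    med = statistics.median(values)
    return [med if abs(v - mean) > threshold else v for v in values]
```

With a six-sigma threshold a single extreme point is only flagged when the series is long enough, since one outlier among N points can sit at most about sqrt(N) standard deviations from the mean.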
If the ith adjustment is the second type adjustment, obtain a residual component of the data corresponding to the adjustment through the STL (Seasonal and Trend decomposition using Loess) algorithm, obtain a normal distribution diagram of that residual component, determine the time point corresponding to each data whose value in the normal distribution diagram of the residual component is greater than a second preset threshold, and for any data whose value is greater than the second preset threshold, adjust the data to the expected value of the data corresponding to the adjustment at the time point corresponding to the data. When i equals 1, the data corresponding to the adjustment is the current service data; when i is a natural number greater than 1, it is the data obtained by the (i-1)th adjustment. The second preset threshold is different from the first preset threshold.
For example, the second preset threshold may be a second preset multiple (such as, but not limited to, five times) of the standard deviation of the data corresponding to the adjustment. For any data whose value in the normal distribution diagram of the residual component is greater than five times that standard deviation: first determine the time point corresponding to the data, then obtain the expected value at that time point, and finally adjust the data to that expected value. The expected value is the sum of the period value and the trend value of the data corresponding to the adjustment at the time point. The period value and the trend value can be obtained by decomposing the data corresponding to the adjustment with the STL algorithm into a period component and a trend component; the period value at the time point is taken from the period component, and the trend value from the trend component.
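The second type adjustment can be sketched as below. Note the substitutions: instead of STL, a simple median-based additive decomposition is used as a stand-in so the example needs only the standard library, and the threshold is computed from the standard deviation of the residual component, which is one possible reading of the text. The names `decompose` and `second_type_adjust` are hypothetical, and the period is assumed to be known and no longer than the series.

```python
import statistics


def decompose(values, period):
    """Median-based additive decomposition, a simple stand-in for STL."""
    n = len(values)
    trend = []
    for t in range(n):
        # Window of exactly one period, shifted (not shrunk) at the edges.
        start = min(max(0, t - period // 2), n - period)
        trend.append(statistics.median(values[start:start + period]))
    detrended = [v - tr for v, tr in zip(values, trend)]
    # Period value of each phase: median of the detrended points at that phase.
    phase_median = [statistics.median(detrended[p::period])
                    for p in range(period)]
    seasonal = [phase_median[t % period] for t in range(n)]
    residual = [v - tr - s for v, tr, s in zip(values, trend, seasonal)]
    return trend, seasonal, residual


def second_type_adjust(values, period, multiple=5.0):
    """Replace points with an outlying residual by trend + period value."""
    trend, seasonal, residual = decompose(values, period)
    mean = statistics.fmean(residual)
    threshold = multiple * statistics.pstdev(residual)
    adjusted = list(values)
    for t, r in enumerate(residual):
        if abs(r - mean) > threshold:
            adjusted[t] = trend[t] + seasonal[t]  # expected value at time t
    return adjusted
```

Because the spike itself still influences the local trend estimate, the replaced value is close to, but not necessarily equal to, the undisturbed pattern value; repeating the adjustment on its own output narrows the gap.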
In an actual application scenario, random factors may affect the current service data. Therefore, the trend component obtained by decomposing the data corresponding to the adjustment with the STL algorithm may be treated as an initial trend component, and a moving median algorithm may be applied to the initial trend component to obtain the trend component of the data corresponding to the adjustment, reducing the influence of random variation caused by random factors on the current service data. A feasible form of the moving median algorithm is as follows: assuming the initial trend component is X, the moving window size is 7, and the data length of the initial trend component is n, the moving median is:
X'_t = median(X_{t-3}, X_{t-2}, X_{t-1}, X_t, X_{t+1}, X_{t+2}, X_{t+3}), 3 <= t <= n-4
After the medians X'_t are obtained through this formula, fitting (for example, linear fitting) is performed on them to obtain the trend component of the data corresponding to the adjustment.
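The moving median with a window of 7 and the subsequent linear fit can be sketched as follows. Indices are zero-based, so the medians exist for t from 3 to n-4; ordinary least squares is assumed as the linear fitting method, and the function names are illustrative.

```python
import statistics


def moving_median(x, window=7):
    """Return (time indices, medians) for every full centered window."""
    half = window // 2
    ts = list(range(half, len(x) - half))
    medians = [statistics.median(x[t - half:t + half + 1]) for t in ts]
    return ts, medians


def linear_fit(ts, ys):
    """Least-squares line through the points; returns (slope, intercept)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t
```

For the series `[1, 2, 3, 100, 5, 6, 7, 8, 9, 10]` the medians at t = 3..6 are 5, 6, 7, 8, so the spike at index 3 does not disturb the fitted line.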
In this embodiment, the second preset threshold is preferably smaller than the first preset threshold: the first type adjustment is performed at least once using the first preset threshold, and then the second type adjustment is performed at least once using the second preset threshold. The first type adjustment thus coarsely adjusts the data and the second type adjustment finely adjusts it, reducing the influence of abnormal data in the current service data on the period component and the trend component. When the data volume of the current service data is small, the data tends to fluctuate strongly; coarse adjustment followed by fine adjustment narrows the range affected by such fluctuation and so improves detection accuracy.
It should be noted that the first preset threshold and the second preset threshold may be thresholds that change with the data development trend. For example, if the data development trend of the current service data is increasing, the first and second preset thresholds may be raised; if the trend is decreasing, they may be lowered. In addition, depending on the actual application scenario, different first and second preset thresholds may be set for different time points, for example for time points in the night and daytime periods respectively; this is not described in detail in this embodiment.
The number of times of the first type adjustment and the second type adjustment may be determined according to the condition of the data obtained by the adjustment, for example, after the ith first type adjustment is performed, a normal distribution diagram of the data obtained by the ith adjustment is obtained, a data amount of the data with a value greater than a first preset threshold value in the normal distribution diagram of the data obtained by the ith adjustment is determined, and if the data amount is less than the first preset data amount, the first type adjustment is stopped, and the second type adjustment is started. Similarly, for the second type adjustment, after the ith second type adjustment is performed, the normal distribution diagram of the residual component of the data obtained by the ith adjustment is obtained, the data amount of the data of which the value is greater than the second preset threshold value in the normal distribution diagram of the residual component of the data obtained by the ith adjustment is determined, and if the data amount is less than the second preset data amount, the second type adjustment is stopped, wherein the first preset data amount and the second preset data amount may be determined according to practical applications, and the value is not limited in this embodiment.
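The stopping rule for the first type adjustment might look like the sketch below; the second type adjustment would follow the same pattern on the residual component. The parameter `max_outliers` plays the role of the first preset data amount, while `max_rounds` is an added safety cap that the text does not mention. Distance from the mean is again an assumed reading of the threshold test.

```python
import statistics


def count_outliers(values, multiple):
    """Number of points farther than multiple * std from the mean."""
    mean = statistics.fmean(values)
    threshold = multiple * statistics.pstdev(values)
    return sum(1 for v in values if abs(v - mean) > threshold)


def repeat_first_type(values, multiple=6.0, max_outliers=1, max_rounds=10):
    """Repeat the first type adjustment until fewer than max_outliers
    points remain beyond the threshold; returns (data, rounds used)."""
    data = list(values)
    rounds = 0
    while rounds < max_rounds:
        mean = statistics.fmean(data)
        threshold = multiple * statistics.pstdev(data)
        med = statistics.median(data)
        data = [med if abs(v - mean) > threshold else v for v in data]
        rounds += 1
        if count_outliers(data, multiple) < max_outliers:
            break  # hand over to the second type adjustment
    return data, rounds
```

Repetition matters because a very large outlier inflates the standard deviation and can mask a smaller one, which only exceeds the threshold after the first has been replaced.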
103: Perform at least one of period anomaly detection and inflection point anomaly detection on the adjusted current service data to determine whether at least one of a period anomaly and an inflection point anomaly exists in the adjusted current service data.
In this embodiment, period anomaly detection may be performed by the STL algorithm, and inflection point anomaly detection may be performed as follows: obtain a normal distribution diagram of the differential data (such as first-order differential data) of the adjusted current service data, obtain the deviation points in the normal distribution diagram of the differential data, and determine, according to those deviation points, whether the adjusted current service data has an inflection point anomaly. Specifically, it is determined whether the adjusted current service data belongs to the deviation points: if so, the adjusted current service data has an inflection point anomaly; otherwise it does not. When an inflection point anomaly exists, a mutation point in the adjusted current service data can be located automatically. A mutation point is a deviation point whose value variation is larger than a preset value variation, where the value variation is the difference in value between the data corresponding to the deviation point and the data immediately preceding it. A deviation point is an outlier lying outside the first preset multiple of the standard deviation in the normal distribution diagram of the differential data, that is, data lying beyond that multiple on either the high side or the low side. The first preset multiple may be determined according to the practical application and is not limited in this embodiment.
As can be seen from the above description of the mutation point, a feasible way for determining whether the adjusted current service data has an inflection point anomaly may be as follows: and determining whether a point with a value variation larger than a preset value variation exists in the deviation point, if so, indicating that the inflection point abnormality exists in the adjusted current service data.
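As an illustration only, the inflection point check described above can be sketched in Python as follows. The function names, the default multiple `k=3.0`, and the `min_change` parameter are assumptions introduced for this sketch and are not specified by the embodiment; the "normal distribution diagram" is approximated here by the mean and population standard deviation of the first-order differences.

```python
from statistics import mean, pstdev


def first_order_difference(series):
    """Return the list d where d[i] = series[i + 1] - series[i]."""
    return [b - a for a, b in zip(series, series[1:])]


def inflection_anomalies(series, k=3.0, min_change=0.0):
    """Return indices of samples flagged as inflection point anomalies.

    A difference is treated as a deviation point when it lies more than
    k standard deviations from the mean of all differences (assumed reading
    of the "first preset multiple"). A deviation point is reported as a
    mutation point only if its absolute change from the previous sample
    exceeds min_change (assumed reading of the "preset value variation").
    """
    diffs = first_order_difference(series)
    if len(diffs) < 2:
        return []
    mu = mean(diffs)
    sigma = pstdev(diffs)
    if sigma == 0:
        return []  # all differences identical, nothing deviates
    hits = []
    for i, d in enumerate(diffs):
        is_deviation_point = abs(d - mu) > k * sigma
        if is_deviation_point and abs(d) > min_change:
            hits.append(i + 1)  # difference i belongs to sample i + 1
    return hits
```

For a series that is flat at 10 for twenty samples and then flat at 50, the only large difference is the step itself, so only the first sample after the step is reported.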
104: in a case where it is determined that the current traffic data is not periodic data, at least one of a deviation anomaly detection and an inflection point anomaly detection is performed on the current traffic data to determine whether the current traffic data has at least one of a deviation anomaly and an inflection point anomaly.
In this embodiment, a feasible manner of deviation anomaly detection is: obtain a normal distribution diagram of the current service data, obtain the deviation points in that diagram, and determine, according to those deviation points, whether the current service data has a deviation anomaly. Specifically, if the current service data belongs to the deviation points, it is determined that the current service data has a deviation anomaly; otherwise, it is determined that the current service data does not have a deviation anomaly. Here a deviation point is an outlier lying outside a second preset multiple of the standard deviation in the normal distribution diagram of the current service data, that is, data lying beyond the second preset multiple of the standard deviation on either side of the distribution. The second preset multiple may be determined according to the practical application, which is not limited in this embodiment.
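A minimal sketch of this deviation check, assuming the "normal distribution diagram" can be summarized by the mean and population standard deviation of the data, might look like the following. The function name and the default multiple `k=3.0` are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def deviation_anomalies(series, k=3.0):
    """Return indices of samples lying more than k standard deviations
    from the mean of the series (assumed reading of the "second preset
    multiple"); an empty list means no deviation anomaly was found."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return []  # constant data has no outliers
    return [i for i, x in enumerate(series) if abs(x - mu) > k * sigma]
```

Unlike the inflection point check, this operates on the raw values rather than on their differences, so it flags values that are far from the bulk of the data regardless of how quickly they were reached.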
A feasible manner of inflection point anomaly detection is: obtain a normal distribution diagram of the differential data of the current service data, obtain the deviation points in the normal distribution diagram of the differential data, and determine, according to those deviation points, whether an inflection point anomaly exists in the current service data. For the specific process, refer to the description of step 103, which is not repeated here.
It should be noted that, when it is determined that an inflection point anomaly exists in the adjusted current service data or in the unadjusted current service data, a single inflection point may cause two inflection point anomalies that are consecutive but have opposite data development trends. As shown in fig. 2, the data development trends of the two inflection point anomalies caused by one inflection point are data increase and data decrease, respectively. In this case, the latter of the two inflection point anomalies is regarded as a pseudo-anomaly, and the data corresponding to it is determined to be normal data.
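The pseudo-anomaly rule above can be sketched as a small filter. This is an illustration under stated assumptions: anomalies are represented as hypothetical `(index, change)` pairs, "continuous" is read as adjacent sample indices, and each candidate is compared only with the most recently kept anomaly.

```python
def drop_pseudo_inflections(anomalies):
    """Filter out the later of two adjacent, opposite-direction anomalies.

    anomalies: iterable of (index, change) pairs, where change is the
    signed first-order difference at that index (assumed representation).
    Returns the pairs that remain after pseudo-anomalies are removed.
    """
    kept = []
    for idx, change in sorted(anomalies):
        if kept:
            prev_idx, prev_change = kept[-1]
            adjacent = idx - prev_idx == 1
            opposite = (change > 0) != (prev_change > 0)
            if adjacent and opposite:
                continue  # later anomaly of the pair is a pseudo-anomaly
        kept.append((idx, change))
    return kept
```

A spike that rises at index 20 and falls back at index 21 is therefore reported once, at index 20, while two consecutive rises are both kept.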
According to the technical scheme, under the condition that the obtained current service data is periodic data, at least one first type adjustment and at least one second type adjustment can be performed on the current service data, and at least one of period anomaly detection and inflection point anomaly detection is performed on the adjusted current service data, so that whether the adjusted current service data has at least one of period anomaly and inflection point anomaly is determined, whether the anomaly type is the period anomaly or the inflection point anomaly can be determined when the adjusted current service data has the anomaly, dynamic multiple adjustment on the periodic data is realized through the first type adjustment and the second type adjustment on the periodic data, and the accuracy of the periodic data anomaly detection is improved. In the case that the obtained current service data is not periodic data, at least one of deviation anomaly detection and inflection point anomaly detection may be directly performed on the current service data, so as to determine whether the current service data has at least one of deviation anomaly and inflection point anomaly, so as to determine whether the anomaly type is the deviation anomaly or the inflection point anomaly when the current service data has the anomaly.
Referring to fig. 3, another flow chart of the data anomaly detection method according to the embodiment of the present invention is shown, and on the basis of fig. 1, the method may further include the following steps:
105: if the current service data is the current service data of the main service and the current service data of the main service corresponds to the current service data of at least one sub-service, when the adjusted current service data is determined to have abnormality (at least one of the cycle abnormality and the inflection point abnormality), the abnormality of the current service data of the main service is verified through the current service data of the sub-service, so that whether the abnormality of the current service data of the main service is false abnormality or not is determined, and the misjudgment rate is reduced.
In this embodiment, the process of verifying the anomaly of the current service data of the main service may be: determining the data deviation direction of the current service data of the main service at each corresponding time point, and for the current service data of any sub-service belonging to the main service: and determining a data deviation direction of the current service data of the sub-service at each time point, wherein the data deviation direction is used for indicating that the data development trend at the time point is any one of data increase and data decrease.
For any time point: if the data deviation direction of each sub-service at the time point is different from the data deviation direction of the main service at the time point, determining that the current service data of the main service is a pseudo-anomaly, and determining the current service data of the main service as normal service data, thereby realizing automatic correction of the pseudo-anomaly.
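The verification rule above can be sketched as follows. The function names are hypothetical, and the sketch assumes that the data deviation direction at a time point is the sign of the change from the previous sample; the value 0 for "no change" is an addition of this sketch, since the embodiment only names increase and decrease.

```python
def direction(series, t):
    """Return +1 for increase, -1 for decrease, 0 for no change at index t,
    measured against the sample at t - 1 (assumed definition)."""
    if t < 1:
        raise ValueError("t must be at least 1")
    delta = series[t] - series[t - 1]
    return (delta > 0) - (delta < 0)


def is_pseudo_anomaly(main, subs, t):
    """Return True when every sub-service moves in a different direction
    from the main service at index t, which the text treats as evidence
    that the main-service anomaly is a pseudo-anomaly."""
    if not subs:
        return False  # nothing to verify against
    main_dir = direction(main, t)
    return all(direction(sub, t) != main_dir for sub in subs)
```

If at least one sub-service moves in the same direction as the main service, the anomaly is left in place.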
Each time point corresponding to the current service data may be a generation time or a recording time of the current service data, for example, the service data may be recorded at intervals of a period of time, and the time point corresponding to the current service data may be determined according to an actual application scenario, which is not limited in this embodiment.
According to the technical scheme, for the periodic data of the main service, when the periodic data of the main service is determined to have a period abnormality or an inflection point abnormality, the period abnormality or the inflection point abnormality can be verified through the periodic data of the sub-service corresponding to the main service to determine whether the periodic data of the main service is a pseudo-abnormality, so that the misjudgment rate can be reduced, and when the periodic data of the main service is determined to be the pseudo-abnormality, the current service data of the main service can be determined to be normal service data, so that the automatic correction of the pseudo-abnormality is realized.
Referring to fig. 4, a flowchart of a data anomaly detection method according to another embodiment of the present invention is shown, which includes the following steps:
401: Acquire current service data.
402: When it is determined that the current service data is periodic data, perform at least one first type adjustment and at least one second type adjustment on the current service data to obtain the adjusted current service data, where the first type adjustment and the second type adjustment adopt different adjustment parameters, the first adjustment adjusts the current service data, and every adjustment other than the first adjusts the data obtained by the previous adjustment.
403: Perform at least one of period anomaly detection and inflection point anomaly detection on the adjusted current service data to determine whether at least one of a period anomaly and an inflection point anomaly exists in the adjusted current service data.
In this embodiment, steps 401 to 403 are similar to steps 101 to 103 above and are not described in detail again here.
404: If the current service data is the current service data of the main service and corresponds to the current service data of at least one sub-service, then, when it is determined that the adjusted current service data has a period anomaly, calculate the expected value of the current service data of each sub-service at each corresponding time point and, for any time point, determine the offset value of the actual value of the current service data of each sub-service at that time point relative to its expected value at that time point. The offset value represents the difference between the actual value and the expected value at the same time point. For the expected value of the current service data of each sub-service at each corresponding time point, refer to the description of the expected value in step 102, which is not repeated here.
405: Determine, from all the offset values, the offset values meeting a first preset condition, and perform anomaly analysis on the current service data of the main service according to the offset values meeting the first preset condition. That is, the offset values meeting the first preset condition are selected from all the offset values and used for the anomaly analysis of the current service data of the main service. How the anomaly analysis is performed on the current service data of the main service is not described in detail in this embodiment.
The first preset condition may be determined according to the actual application scenario and is not limited in this embodiment. For example, the first preset condition may select the offset value with the largest value and/or the smallest value among all the offset values, or the offset values whose values fall within a certain range.
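Steps 404 and 405 can be sketched as below, taking the largest and smallest offset as the first preset condition. The dictionary layout keyed by sub-service name and the function names are assumptions of this sketch; the expected values are taken as given inputs because their computation is described in step 102.

```python
def offsets_at(actual_by_sub, expected_by_sub, t):
    """Return {sub_service: actual - expected} at time index t."""
    return {
        name: actual_by_sub[name][t] - expected_by_sub[name][t]
        for name in actual_by_sub
    }


def extreme_offsets(offsets):
    """Return the names of the sub-services with the largest and the
    smallest offset value (one example of a first preset condition)."""
    largest = max(offsets, key=offsets.get)
    smallest = min(offsets, key=offsets.get)
    return largest, smallest
```

The sub-services returned this way are the ones whose actual values depart most from expectation at that time point, which makes them natural candidates when looking for the cause of a period anomaly in the main service.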
406: in a case that it is determined that the current service data is not periodic data, at least one of a deviation anomaly detection and a knee anomaly detection is performed on the current service data to determine whether at least one of a deviation anomaly and a knee anomaly exists in the current service data.
407: if the current service data is the current service data of the main service and the current service data of the main service corresponds to the current service data of at least one sub-service, acquiring the differential data of the current service data of each sub-service when the inflection point of the current service data of the main service is abnormal.
It should be noted that steps 407 and 408 may be applied to both periodic data and aperiodic data: when it is determined that an inflection point anomaly exists in the periodic data or the aperiodic data of a main service, differential data, such as first-order differential data, of the current service data of each sub-service corresponding to that main service is obtained.
408: Determine, from all the differential data, the data meeting a second preset condition, and perform anomaly analysis on the current service data of the main service according to the data meeting the second preset condition. That is, the differential data meeting the second preset condition is selected from all the differential data and used for the anomaly analysis of the current service data of the main service. How the anomaly analysis is performed on the current service data of the main service is not described in detail in this embodiment.
The second preset condition may be determined according to the actual application scenario and is not limited in this embodiment. For example, the second preset condition may select the differential data with the largest value and/or the smallest value among all the differential data, or the differential data whose values fall within a certain range.
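Steps 407 and 408 can be sketched in the same spirit, this time using a value range as the second preset condition. The function names, the dictionary layout, and the choice of evaluating the first-order difference at a single time index are assumptions made for illustration.

```python
def sub_differences_at(subs, t):
    """Return {sub_service: first-order difference at index t}."""
    if t < 1:
        raise ValueError("t must be at least 1")
    return {name: series[t] - series[t - 1] for name, series in subs.items()}


def select_in_range(diffs, low, high):
    """Keep the differences whose value lies in [low, high]
    (one example of a second preset condition)."""
    return {name: d for name, d in diffs.items() if low <= d <= high}
```

With a lower bound of 30 and no upper bound, only sub-services that jumped by at least 30 at the inflection point are retained for the analysis.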
According to the technical scheme, for the cycle abnormality of the current service data of the main service, the cycle abnormality can be analyzed through the deviation value, and for the inflection point abnormality of the current service data of the main service, the inflection point abnormality can be analyzed through the difference data of the current service data of the sub-service, so that the abnormality reasons of the cycle abnormality and the inflection point abnormality can be determined.
In addition, in this embodiment, when it is determined that an anomaly (any one of the period anomaly, inflection point anomaly, and deviation anomaly described above) exists in the current service data, the anomaly is output. For example, the anomaly may be output as an alarm, in at least one of an audio alarm manner, a text alarm manner, and a screen alarm manner. When the cause of the anomaly has been determined, the cause may also be output so that it can be reviewed.
While, for purposes of simplicity of explanation, the foregoing method embodiments have been described as a series of acts or combination of acts, it will be appreciated by those skilled in the art that the present invention is not limited by the illustrated ordering of acts, as some steps may occur in other orders or concurrently with other steps in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Corresponding to the foregoing method embodiment, an embodiment of the present invention further provides a data anomaly detection apparatus, whose structure is shown in fig. 5, and may include: an acquisition unit 11, an adjustment unit 12, a first abnormality detection unit 13, and a second abnormality detection unit 14.
The acquisition unit 11 is configured to acquire current service data. For a description of the current service data, refer to the method embodiment; it is not repeated in this embodiment.
The adjusting unit 12 is configured to perform at least one first type adjustment and at least one second type adjustment on the current service data to obtain the adjusted current service data when it is determined that the current service data is periodic data.
A possible way to determine whether the current service data is periodic data is to calculate the period corresponding to the current service data through a fast Fourier transform (FFT): if a period corresponding to the current service data can be obtained, the current service data is periodic data; otherwise, the current service data is not periodic data.
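The periodicity test can be sketched as follows. Two points are assumptions of this sketch rather than part of the embodiment: a naive discrete Fourier transform written with the standard library stands in for an FFT (it computes the same coefficients, only more slowly), and "a period can be obtained" is read as "one frequency bin holds at least `min_power_share` of the total spectral power".

```python
import cmath
from statistics import mean


def dominant_period(series, min_power_share=0.5):
    """Return the period (in samples) of the dominant frequency, or None
    when no single frequency dominates the spectrum."""
    n = len(series)
    if n < 4:
        return None
    mu = mean(series)
    centered = [x - mu for x in series]  # remove the zero-frequency term
    best_k, best_power, total = None, 0.0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        total += power
        if power > best_power:
            best_k, best_power = k, power
    if total == 0 or best_power / total < min_power_share:
        return None  # treated as non-periodic data
    return n / best_k
```

A series that returns a period would then follow the adjustment and period anomaly branch, while a `None` result would route the data to the deviation and inflection point checks for non-periodic data.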
Periodic data is adjusted multiple times through the first type adjustment and the second type adjustment, where the two types of adjustment adopt different adjustment parameters, the first adjustment adjusts the current service data, and every adjustment other than the first adjusts the data obtained by the previous adjustment. That is, multiple adjustments are performed with different adjustment parameters, and the data to which each adjustment applies also differs: the first adjustment applies to the current service data, and every other adjustment applies to the data obtained by the previous adjustment. Taking the jth adjustment (j is a natural number greater than 1) as an example, the data corresponding to the jth adjustment is the data obtained by the (j-1)th adjustment. Possible ways of performing the first type adjustment and the second type adjustment are explained below:
If the ith adjustment is a first type adjustment, a normal distribution diagram of the data corresponding to this adjustment is obtained, and data whose value is larger than a first preset threshold in that normal distribution diagram is adjusted to the median of the data corresponding to this adjustment. If the ith adjustment is a second type adjustment, a residual component of the data corresponding to this adjustment is obtained through the STL algorithm, a normal distribution diagram of the residual component is obtained, the time point corresponding to each piece of data whose value is larger than a second preset threshold in the normal distribution diagram of the residual component is determined, and each such piece of data is adjusted to the expected value, at its corresponding time point, of the data corresponding to this adjustment. In both cases, the data corresponding to the adjustment is the current service data when i is equal to 1 and the data obtained by the (i-1)th adjustment when i is a natural number larger than 1. The second preset threshold is different from the first preset threshold.
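The two adjustment types can be sketched as below, with several loudly stated assumptions. The thresholds are interpreted as z-score limits. The second function does not run STL: a per-phase median over a known `period` stands in for the expected value, and the difference from it stands in for the STL residual component. All function and parameter names are hypothetical.

```python
from statistics import mean, median, pstdev


def zscores(values):
    """Return the z-score of each value (all zeros for constant input)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def first_type_adjust(series, threshold):
    """Replace values whose absolute z-score exceeds threshold with the
    median of the series (coarse adjustment)."""
    med = median(series)
    return [med if abs(z) > threshold else x
            for x, z in zip(series, zscores(series))]


def second_type_adjust(series, period, threshold):
    """Replace values whose residual z-score exceeds threshold with the
    expected value at that time point (fine adjustment).

    NOT STL: the expected value is the median of all samples that share
    the same phase within the period, used here as a simple stand-in.
    """
    if len(series) < period:
        raise ValueError("series must cover at least one full period")
    phase_median = [median(series[p::period]) for p in range(period)]
    expected = [phase_median[t % period] for t in range(len(series))]
    residual = [x - e for x, e in zip(series, expected)]
    return [e if abs(z) > threshold else x
            for x, e, z in zip(series, expected, zscores(residual))]
```

In a real implementation the residual and expected value would come from an actual STL decomposition; the stand-in is used only so that the sketch runs with the standard library alone.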
In this embodiment, the second preset threshold is preferably smaller than the first preset threshold. After the first type adjustment has been performed at least once with the first preset threshold, the second type adjustment is performed at least once with the second preset threshold, so that the data is coarsely adjusted by the first type adjustment and then finely adjusted by the second type adjustment, which reduces the influence of abnormal data in the current service data on the periodic component and the trend component. Moreover, when the data volume of the current service data is small, the current service data is prone to large fluctuations; the coarse adjustment and the fine adjustment reduce the range affected by such fluctuations and thereby improve the detection accuracy.
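One way to see why a coarse pass followed by a fine pass can help is that a very large outlier inflates the standard deviation and hides smaller outliers until it has been removed. The sketch below illustrates only this chaining effect; it assumes z-score thresholds and uses a single hypothetical median-replacement adjuster for both passes instead of the two distinct adjustment types of the embodiment.

```python
from statistics import mean, median, pstdev


def replace_outliers_with_median(series, threshold):
    """Replace values more than threshold standard deviations from the
    mean with the median of the series."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return list(series)
    med = median(series)
    return [med if abs(x - mu) / sigma > threshold else x for x in series]


def adjust_in_sequence(series, thresholds):
    """Apply one adjustment per threshold; the first adjustment works on
    the original data and each later one on the previous result."""
    data = list(series)
    for threshold in thresholds:
        data = replace_outliers_with_median(data, threshold)
    return data
```

With a series that contains both a value of 1000 and a value of 60 among samples of 10, a single pass removes only the 1000, whereas a pass at threshold 3.0 followed by a pass at threshold 2.0 also removes the 60, because the standard deviation has shrunk by the time the second pass runs.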
A first anomaly detection unit 13, configured to perform at least one of a period anomaly detection and an inflection point anomaly detection on the adjusted current service data to determine whether there is at least one of a period anomaly and an inflection point anomaly in the adjusted current service data.
In this embodiment, period anomaly detection may be performed by the STL algorithm, and inflection point anomaly detection may be performed as follows: obtain a normal distribution diagram of differential data (such as first-order differential data) of the adjusted current service data, obtain the deviation points in the normal distribution diagram of the differential data, and determine, according to those deviation points, whether the adjusted current service data has an inflection point anomaly (for a specific description, refer to the method embodiment). In this way, when an inflection point anomaly exists, the mutation point in the adjusted current service data can be located automatically. A mutation point is a deviation point whose value variation is larger than a preset value variation, where the value variation is the difference in value between the data corresponding to the deviation point and the data immediately preceding it. A deviation point is an outlier lying outside a first preset multiple of the standard deviation in the normal distribution diagram of the differential data, that is, data lying beyond the first preset multiple of the standard deviation on either side of the distribution. The first preset multiple may be determined according to the practical application, which is not limited in this embodiment.
As can be seen from the above description of the mutation point, a feasible way of determining whether the adjusted current service data has an inflection point anomaly is: determine whether, among the deviation points, there is a point whose value variation is larger than the preset value variation; if so, an inflection point anomaly exists in the adjusted current service data.
A second anomaly detection unit 14, configured to, in a case where it is determined that the current traffic data is not periodic data, perform at least one of deviation anomaly detection and inflection point anomaly detection on the current traffic data to determine whether the current traffic data has at least one of deviation anomaly and inflection point anomaly.
In this embodiment, a feasible manner of detecting the deviation anomaly is: obtaining a normal distribution diagram of the current service data, obtaining a deviation point in the normal distribution diagram of the current service data, and determining whether the deviation of the current service data is abnormal according to the deviation point in the normal distribution diagram of the current service data.
Possible ways of inflection point anomaly detection are: obtaining a normal distribution diagram of the differential data of the current service data, obtaining a deviation point in the normal distribution diagram of the differential data, and determining whether the inflection point abnormality exists in the current service data according to the deviation point in the normal distribution diagram of the differential data.
It should be noted that, when it is determined that an inflection point anomaly exists in the adjusted current service data or in the unadjusted current service data, a single inflection point may cause two inflection point anomalies that are consecutive but have opposite data development trends. As shown in fig. 2, the data development trends of the two inflection point anomalies caused by one inflection point are data increase and data decrease, respectively. In this case, the latter of the two inflection point anomalies is regarded as a pseudo-anomaly, and the data corresponding to it is determined to be normal data.
According to the technical scheme, under the condition that the obtained current service data is periodic data, at least one first type adjustment and at least one second type adjustment can be performed on the current service data, and at least one of period anomaly detection and inflection point anomaly detection is performed on the adjusted current service data, so that whether the adjusted current service data has at least one of period anomaly and inflection point anomaly is determined, whether the anomaly type is the period anomaly or the inflection point anomaly can be determined when the adjusted current service data has the anomaly, dynamic multiple adjustment on the periodic data is realized through the first type adjustment and the second type adjustment on the periodic data, and the accuracy of the periodic data anomaly detection is improved. In the case that the obtained current service data is not periodic data, at least one of deviation anomaly detection and inflection point anomaly detection may be directly performed on the current service data, so as to determine whether the current service data has at least one of deviation anomaly and inflection point anomaly, so as to determine whether the anomaly type is the deviation anomaly or the inflection point anomaly when the current service data has the anomaly.
Referring to fig. 6, which shows another structure of the data anomaly detection apparatus according to the embodiment of the present invention, on the basis of fig. 5 the apparatus may further include an anomaly verification unit 15, configured to: if the current service data is the current service data of the main service and corresponds to the current service data of at least one sub-service, verify, when it is determined that an anomaly (at least one of a period anomaly and an inflection point anomaly) exists in the adjusted current service data, the anomaly of the current service data of the main service through the current service data of the sub-service.
In this embodiment, the process of the anomaly verification unit 15 verifying the anomaly of the current service data of the main service may be: determining the data deviation direction of the current service data of the main service at each corresponding time point, and for the current service data of any sub-service belonging to the main service: and determining a data deviation direction of the current service data of the sub-service at each time point, wherein the data deviation direction is used for indicating that the data development trend at the time point is any one of data increase and data decrease.
For any time point: if the data deviation direction of each sub-service at the time point is different from the data deviation direction of the main service at the time point, determining that the current service data of the main service is a pseudo-anomaly, and determining the current service data of the main service as normal service data, thereby realizing automatic correction of the pseudo-anomaly.
Each time point corresponding to the current service data may be a generation time or a recording time of the current service data, for example, the service data may be recorded at intervals of a period of time, and the time point corresponding to the current service data may be determined according to an actual application scenario, which is not limited in this embodiment.
According to the technical scheme, for the periodic data of the main service, when the periodic data of the main service is determined to have a period abnormality or an inflection point abnormality, the period abnormality or the inflection point abnormality can be verified through the periodic data of the sub-service corresponding to the main service to determine whether the periodic data of the main service is a pseudo-abnormality, so that the misjudgment rate can be reduced, and when the periodic data of the main service is determined to be the pseudo-abnormality, the current service data of the main service can be determined to be normal service data, so that the automatic correction of the pseudo-abnormality is realized.
Referring to fig. 7, which shows another structure of the data anomaly detection apparatus according to the embodiment of the present invention, on the basis of fig. 5, the data anomaly detection apparatus may further include: an offset value determining unit 16, a first abnormality analyzing unit 17, a differential data acquiring unit 18, and a second abnormality analyzing unit 19.
The offset value determining unit 16 is configured to: if the current service data is the current service data of the main service and corresponds to the current service data of at least one sub-service, then, when it is determined that the adjusted current service data has a period anomaly, calculate the expected value of the current service data of each sub-service at each corresponding time point and, for any time point, determine the offset value of the actual value of the current service data of each sub-service at that time point relative to its expected value at that time point.
The first anomaly analysis unit 17 is configured to determine, from all the offset values, the offset values meeting a first preset condition, and to perform anomaly analysis on the current service data of the main service according to the offset values meeting the first preset condition. That is, the offset values meeting the first preset condition are selected from all the offset values and used for the anomaly analysis of the current service data of the main service. How the anomaly analysis is performed on the current service data of the main service is not described in detail in this embodiment.
The first preset condition may be determined according to the actual application scenario and is not limited in this embodiment. For example, the first preset condition may select the offset value with the largest value and/or the smallest value among all the offset values, or the offset values whose values fall within a certain range.
The differential data obtaining unit 18 is configured to, if the current service data is current service data of the main service, and the current service data of the main service corresponds to current service data of at least one sub-service, obtain differential data of the current service data of each sub-service when an inflection point of the current service data of the main service is abnormal, for example, obtain first-order differential data of the current service data of each sub-service.
The second anomaly analysis unit 19 is configured to determine, from all the differential data, the data meeting a second preset condition, and to perform anomaly analysis on the current service data of the main service according to the data meeting the second preset condition. That is, the differential data meeting the second preset condition is selected from all the differential data and used for the anomaly analysis of the current service data of the main service. How the anomaly analysis is performed on the current service data of the main service is not described in detail in this embodiment.
The second preset condition may be determined according to the actual application scenario and is not limited in this embodiment. For example, the second preset condition may select the differential data with the largest value and/or the smallest value among all the differential data, or the differential data whose values fall within a certain range.
According to the technical scheme, for the cycle abnormality of the current service data of the main service, the cycle abnormality can be analyzed through the deviation value, and for the inflection point abnormality of the current service data of the main service, the inflection point abnormality can be analyzed through the difference data of the current service data of the sub-service, so that the abnormality reasons of the cycle abnormality and the inflection point abnormality can be determined.
In addition, in this embodiment, when it is determined that an anomaly (any one of the period anomaly, inflection point anomaly, and deviation anomaly described above) exists in the current service data, the data anomaly detection apparatus may further output the anomaly. For example, the anomaly may be output as an alarm, in at least one of an audio alarm manner, a text alarm manner, and a screen alarm manner. When the cause of the anomaly has been determined, the cause may also be output so that it can be reviewed.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device embodiments are basically similar to the method embodiments, they are described relatively briefly; for the relevant points, refer to the corresponding description of the method embodiments.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (8)

1. A method for detecting data anomalies, the method comprising:
acquiring current service data;
under the condition that the current service data is determined to be periodic data, performing at least one first type adjustment and at least one second type adjustment on the current service data to obtain adjusted current service data, wherein the first type adjustment and the second type adjustment adopt different adjustment parameters, the first adjustment is to adjust the current service data, and the other adjustments except the first adjustment are to adjust the data obtained by the last adjustment;
performing at least one of cycle anomaly detection and inflection point anomaly detection on the adjusted current service data to determine whether the adjusted current service data has at least one of cycle anomaly and inflection point anomaly;
in a case where it is determined that the current service data is not periodic data, performing at least one of deviation anomaly detection and inflection point anomaly detection on the current service data to determine whether the current service data has at least one of deviation anomaly and inflection point anomaly;
wherein, in the case that it is determined that the current service data is periodic data, performing at least one first type adjustment and at least one second type adjustment on the current service data includes:
if the ith adjustment is the first type adjustment, acquiring a normal distribution diagram of the data corresponding to the adjustment, and adjusting data with a value larger than a first preset threshold in the normal distribution diagram of the data corresponding to the adjustment to the median of the data corresponding to the adjustment, wherein the data corresponding to the adjustment is the current service data when i is equal to 1, and the data corresponding to the adjustment is the data obtained by the (i-1)th adjustment when i is a natural number larger than 1;
if the ith adjustment is the second type adjustment, acquiring a residual component of the data corresponding to the adjustment through a time series decomposition algorithm, acquiring a normal distribution diagram of the residual component of the data corresponding to the adjustment, determining a time point corresponding to each data of which the value is greater than a second preset threshold in the normal distribution diagram of the residual component of the data corresponding to the adjustment, and for any data of which the value is greater than the second preset threshold: adjusting the data to an expected value of the data corresponding to the adjustment at the time point corresponding to the data, wherein the data corresponding to the adjustment is the current service data when i is equal to 1, the data corresponding to the adjustment is the data obtained by the (i-1)th adjustment when i is a natural number greater than 1, and the second preset threshold is different from the first preset threshold.
2. The method of claim 1, further comprising: if the current service data is the current service data of the main service and the current service data of the main service corresponds to the current service data of at least one sub-service, when it is determined that the adjusted current service data has an anomaly, verifying the anomaly of the current service data of the main service through the current service data of the sub-service, wherein the anomaly is at least one of a cycle anomaly and an inflection point anomaly.
3. The method according to claim 2, wherein the verifying the anomaly of the current service data of the main service by the current service data of the sub-service comprises:
determining the data deviation direction of the current service data of the main service at each corresponding time point, and for the current service data of any sub-service belonging to the main service: determining a data deviation direction of the current service data of the sub-service at each time point, wherein the data deviation direction is used for indicating that the data development trend at the time point is any one of data increase and data decrease;
for any time point: if the data deviation direction of each sub-service at the time point is different from the data deviation direction of the main service at the time point, determining that the abnormality of the current service data of the main service is a pseudo abnormality, and determining the current service data of the main service as normal service data.
4. The method of claim 1, further comprising: if the current service data is the current service data of the main service and the current service data of the main service corresponds to the current service data of at least one sub-service, when it is determined that the adjusted current service data has a cycle anomaly, calculating an expected value of the current service data of each sub-service at each corresponding time point, and for any time point: determining a deviation value of an actual value of current service data of each sub-service at the time point relative to an expected value at the time point;
determining a deviation value meeting a first preset condition from all the deviation values, and performing anomaly analysis on the current service data of the main service according to the deviation value meeting the first preset condition;
if the current service data is the current service data of the main service and the current service data of the main service corresponds to the current service data of at least one sub-service, acquiring the differential data of the current service data of each sub-service when the current service data of the main service has an inflection point anomaly;
and determining data meeting a second preset condition from all the differential data, and performing anomaly analysis on the current service data of the main service according to the data meeting the second preset condition.
5. The method of claim 1, wherein the performing at least one of deviation anomaly detection and inflection point anomaly detection on the current service data in the case that the current service data is determined not to be periodic data comprises:
under the condition that the current service data is determined not to be periodic data, acquiring a normal distribution diagram of the current service data, acquiring a deviation point in the normal distribution diagram of the current service data, and determining whether the current service data has deviation abnormality according to the deviation point in the normal distribution diagram of the current service data;
and under the condition that the current service data is determined not to be periodic data, acquiring a normal distribution diagram of the differential data of the current service data, acquiring a deviation point in the normal distribution diagram of the differential data, and determining whether the inflection point abnormality exists in the current service data according to the deviation point in the normal distribution diagram of the differential data.
6. An apparatus for detecting data abnormality, the apparatus comprising:
the acquisition unit is used for acquiring current service data;
an adjusting unit, configured to perform at least one first type adjustment and at least one second type adjustment on the current service data to obtain adjusted current service data when it is determined that the current service data is periodic data, where adjustment parameters adopted by the first type adjustment and the second type adjustment are different, the first adjustment is to adjust the current service data, and other adjustments except the first adjustment are to adjust data obtained by the last adjustment;
a first anomaly detection unit, configured to perform at least one of cycle anomaly detection and inflection point anomaly detection on the adjusted current service data to determine whether the adjusted current service data has at least one of a cycle anomaly and an inflection point anomaly;
a second anomaly detection unit, configured to, in a case where it is determined that the current service data is not periodic data, perform at least one of deviation anomaly detection and inflection point anomaly detection on the current service data to determine whether the current service data has at least one of a deviation anomaly and an inflection point anomaly;
the adjusting unit is specifically configured to, if the ith adjustment is the first type adjustment, acquire a normal distribution diagram of the data corresponding to the adjustment, and adjust data with a value greater than a first preset threshold in the normal distribution diagram of the data corresponding to the adjustment to the median of the data corresponding to the adjustment, wherein the data corresponding to the adjustment is the current service data when i is equal to 1, and the data corresponding to the adjustment is the data obtained by the (i-1)th adjustment when i is a natural number greater than 1;
and, if the ith adjustment is the second type adjustment, acquire a residual component of the data corresponding to the adjustment through a time series decomposition algorithm, acquire a normal distribution diagram of the residual component of the data corresponding to the adjustment, determine a time point corresponding to each data of which the value is greater than a second preset threshold in the normal distribution diagram of the residual component of the data corresponding to the adjustment, and for any data of which the value is greater than the second preset threshold: adjust the data to an expected value of the data corresponding to the adjustment at the time point corresponding to the data, wherein the data corresponding to the adjustment is the current service data when i is equal to 1, the data corresponding to the adjustment is the data obtained by the (i-1)th adjustment when i is a natural number greater than 1, and the second preset threshold is different from the first preset threshold.
7. The apparatus of claim 6, further comprising a unit configured to: if the current service data is the current service data of the main service and the current service data of the main service corresponds to the current service data of at least one sub-service, verify the anomaly of the current service data of the main service through the current service data of the sub-service when it is determined that the adjusted current service data has an anomaly, wherein the anomaly is at least one of a cycle anomaly and an inflection point anomaly.
8. The apparatus of claim 6, further comprising: a deviation value determining unit, configured to calculate, if the current service data is current service data of a main service and the current service data of the main service corresponds to current service data of at least one sub-service, an expected value of the current service data of each sub-service at each corresponding time point when it is determined that the adjusted current service data has a cycle anomaly, and for any time point: determine a deviation value of an actual value of current service data of each sub-service at the time point relative to an expected value at the time point;
a first anomaly analysis unit, configured to determine a deviation value meeting a first preset condition from all the deviation values and to perform anomaly analysis on the current service data of the main service according to the deviation value meeting the first preset condition;
a differential data obtaining unit, configured to obtain differential data of current service data of each sub-service when the current service data of the main service has an inflection point anomaly, if the current service data is current service data of the main service and the current service data of the main service corresponds to current service data of at least one sub-service;
and the second anomaly analysis unit is used for determining data meeting a second preset condition from all the differential data and carrying out anomaly analysis on the current service data of the main service according to the data meeting the second preset condition.
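The two adjustment types recited in claims 1 and 6 can be illustrated with a short sketch. Several choices here are assumptions rather than statements of the patented method: "a value larger than a preset threshold in the normal distribution diagram" is read as an absolute z-score above the threshold, the time series decomposition is replaced by a simple per-phase median profile with a known period (the claims do not name a specific decomposition algorithm), and the names `first_type_adjust`, `second_type_adjust`, `adjust`, and the default thresholds are hypothetical.

```python
import numpy as np


def zscores(x):
    """Standard scores of x; all zeros if x has no spread."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    if std == 0:
        return np.zeros_like(x)
    return (x - x.mean()) / std


def first_type_adjust(data, threshold):
    """Replace points whose |z-score| exceeds `threshold` with the median of the data."""
    data = np.asarray(data, dtype=float)
    out = data.copy()
    out[np.abs(zscores(data)) > threshold] = np.median(data)
    return out


def second_type_adjust(data, period, threshold):
    """Replace points whose residual |z-score| exceeds `threshold` with the expected value.

    Assumption: the expected value at a time point is the median of all points
    sharing its phase within the period; the residual is data minus that value.
    Requires at least one full period of data.
    """
    data = np.asarray(data, dtype=float)
    phase = np.arange(len(data)) % period
    profile = np.array([np.median(data[phase == p]) for p in range(period)])
    expected = profile[phase]
    residual = data - expected
    mask = np.abs(zscores(residual)) > threshold
    out = data.copy()
    out[mask] = expected[mask]
    return out


def adjust(data, period, first_threshold=3.0, second_threshold=2.5):
    """One first type adjustment followed by one second type adjustment.

    Each step works on the output of the previous one, and the two steps use
    different thresholds, as the claims require.
    """
    step1 = first_type_adjust(data, first_threshold)
    return second_type_adjust(step1, period, second_threshold)
```

In this sketch a gross spike is caught by the first type adjustment, while a milder distortion that is unremarkable against the overall distribution but unusual for its position in the cycle is caught by the second type adjustment. Further adjustments could be chained in the same way, each taking the previous result as input.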
CN201810631649.1A 2018-06-19 2018-06-19 Data anomaly detection method and device Active CN108804703B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810631649.1A CN108804703B (en) 2018-06-19 2018-06-19 Data anomaly detection method and device

Publications (2)

Publication Number Publication Date
CN108804703A CN108804703A (en) 2018-11-13
CN108804703B true CN108804703B (en) 2021-09-17

Family

ID=64083608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810631649.1A Active CN108804703B (en) 2018-06-19 2018-06-19 Data anomaly detection method and device

Country Status (1)

Country Link
CN (1) CN108804703B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110232132B (en) * 2019-06-18 2020-11-06 北京天泽智云科技有限公司 Time series data processing method and device
CN110851338B (en) * 2019-09-23 2022-06-24 平安科技(深圳)有限公司 Abnormality detection method, electronic device, and storage medium
CN110830946B (en) * 2019-11-15 2020-11-06 江南大学 Mixed type online data anomaly detection method
CN113514713B (en) * 2020-04-10 2022-12-20 中车唐山机车车辆有限公司 Method and device for detecting performance of traction converter of motor train unit and terminal equipment
CN112090097B (en) * 2020-08-06 2021-10-19 浙江大学 Performance analysis method and application of traditional Chinese medicine concentrator
CN114936211B (en) * 2022-07-19 2022-11-01 深圳市星卡软件技术开发有限公司 Automobile diagnosis data processing method, device, equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104090848A (en) * 2014-07-16 2014-10-08 云南大学 Memory management method and device for periodic large big data processing
CN104753733A (en) * 2013-12-31 2015-07-01 中兴通讯股份有限公司 Method and device for detecting abnormal network traffic data
CN104915846A (en) * 2015-06-18 2015-09-16 北京京东尚科信息技术有限公司 Electronic commerce time sequence data anomaly detection method and system
CN105243001A (en) * 2014-07-07 2016-01-13 阿里巴巴集团控股有限公司 Abnormal alarm method and apparatus for business object
CN106789837A (en) * 2015-11-20 2017-05-31 腾讯科技(深圳)有限公司 Network anomalous behaviors detection method and detection means

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9852019B2 (en) * 2013-07-01 2017-12-26 Agent Video Intelligence Ltd. System and method for abnormality detection

Also Published As

Publication number Publication date
CN108804703A (en) 2018-11-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231115

Address after: 100190 901-1, Floor 9, Building 3, No. 2 Academy South Road, Haidian District, Beijing

Patentee after: Beijing Bodian Zhihe Technology Co.,Ltd.

Address before: 100086 20 / F, block C, No.2, south academy of Sciences Road, Haidian District, Beijing

Patentee before: BEIJING JIAODIAN XINGANXIAN INFORMATION TECHNOLOGY CO.,LTD.