CN114168586A - Abnormal point detection method and device - Google Patents

Info

Publication number
CN114168586A
Authority
CN
China
Prior art keywords
sequence
index data
spectrum
frequency domain
data sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210123405.9A
Other languages
Chinese (zh)
Inventor
易存道
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baolande Software Co ltd
Original Assignee
Beijing Baolande Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baolande Software Co ltd
Priority to CN202210123405.9A
Publication of CN114168586A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/283Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for detecting an abnormal point. The method comprises the following steps: acquiring a target time sequence index data sequence composed of data points, and performing a fast Fourier transform on it to obtain a frequency domain index data sequence; performing spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result; performing an inverse fast Fourier transform on the frequency domain index data sequence based on the spectrum residual calculation result to obtain a saliency map sequence; and determining the position of a data point to be detected in the saliency map sequence and, if that position meets a preset mutation condition, defining the data point to be detected as an abnormal point. The method and the device achieve a low missed-report rate and a low false-report rate in abnormal point detection without depending on a large number of labeled samples.

Description

Abnormal point detection method and device
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for detecting an abnormal point.
Background
The detection of anomalies in time sequence index data sequences is an important problem in the field of data mining and has developed widely in recent years. Prior-art anomaly detection methods either construct an anomaly detection model with a supervised algorithm, or generate a baseline by calculating a static or dynamic threshold from year-over-year and period-over-period comparisons and judge a data point to be abnormal if it exceeds the baseline. In addition, when the anomaly detection task must monitor many kinds of time sequence index data sequences from different service scenarios, both methods show high false negative and false positive rates in identifying abnormal points.
Disclosure of Invention
The invention provides a method and a device for detecting an abnormal point, which achieve a low missed-report rate and a low false-report rate in abnormal point detection without depending on a large number of labeled samples.
In a first aspect, the present invention provides a method for detecting an abnormal point, including:
acquiring a target time sequence index data sequence, and performing fast Fourier transform on the target time sequence index data sequence to obtain a frequency domain index data sequence, wherein the target time sequence index data sequence consists of data points;
performing spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result;
performing fast Fourier inverse transformation on the frequency domain index data sequence based on the spectrum residual calculation result to obtain a saliency map sequence;
determining the position of a data point to be detected in the saliency map sequence, and if the position of the data point to be detected in the saliency map sequence meets a preset mutation condition, defining the data point to be detected as an abnormal point.
According to the abnormal point detection method provided by the invention, before the target time sequence index data sequence is acquired, the method further comprises:
collecting operation data of each component in a computer system at regular intervals, storing the operation data in a data warehouse, and processing the operation data with a processing and aggregation technology to generate a time sequence index data sequence;
appending, based on an average gradient, a preset number of target data points at the end of the time sequence index data sequence, and generating the target time sequence index data sequence from the time sequence index data sequence and the target data points.
According to the abnormal point detection method provided by the invention, performing spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result comprises:
acquiring a logarithmic magnitude spectrum and a main stream signal magnitude spectrum based on the frequency domain index data sequence;
and performing spectrum residual error calculation based on the logarithmic magnitude spectrum and the main stream signal magnitude spectrum to obtain a spectrum residual error calculation result.
According to the anomaly point detection method provided by the invention, the step of acquiring the logarithmic magnitude spectrum and the mainstream signal magnitude spectrum based on the frequency domain index data sequence comprises the following steps:
determining a magnitude spectrum corresponding to the frequency domain index data sequence based on the frequency domain index data sequence;
carrying out logarithmic transformation processing on the amplitude spectrum to obtain a logarithmic amplitude spectrum;
and acquiring a main stream signal amplitude spectrum corresponding to the main stream signal frequency in the logarithmic amplitude spectrum by using a convolution smoothing mode.
According to the anomaly point detection method provided by the invention, the step of performing inverse fast fourier transform on the frequency domain index data sequence based on the spectrum residual error calculation result to obtain a saliency map sequence comprises the following steps:
determining a phase spectrum corresponding to the frequency domain index data sequence based on the frequency domain index data sequence;
and performing fast Fourier inverse transformation on the frequency domain index data sequence based on the frequency spectrum residual calculation result and the phase spectrum to obtain a saliency map sequence.
According to the abnormal point detection method provided by the invention, performing the inverse fast Fourier transform on the frequency domain index data sequence based on the spectrum residual calculation result and the phase spectrum to obtain the saliency map sequence is realized by the following formula:
$$S(x) = \left\| \mathcal{F}^{-1}\left( \exp\left( R(f) + i\,P(f) \right) \right) \right\|$$
where $S(x)$ denotes the saliency map sequence, $\mathcal{F}^{-1}$ denotes the inverse fast Fourier transform, $\exp$ denotes the exponential function with the natural constant $e$ as its base, $i$ denotes the imaginary unit, $R(f)$ denotes the spectrum residual calculation result, and $P(f)$ denotes the phase spectrum.
According to the abnormal point detection method provided by the invention, whether the position of the data point to be detected in the saliency map sequence meets the preset mutation condition is determined by the following formula:
$$\frac{S(x_i) - \overline{S(x_i)}}{\overline{S(x_i)}} > \tau$$
where $S(x_i)$ denotes the value of the data point to be detected, $\overline{S(x_i)}$ denotes the mean value of a preset number of data points before the data point to be detected on the saliency map sequence, and $\tau$ denotes the sensitivity coefficient.
In a second aspect, the present invention provides an apparatus for outlier detection, comprising:
the acquisition module is used for acquiring a target time sequence index data sequence and performing fast Fourier transform on the target time sequence index data sequence to obtain a frequency domain index data sequence, wherein the target time sequence index data sequence consists of data points;
the calculation module is used for performing spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result;
the inverse transformation module is used for carrying out fast Fourier inverse transformation on the frequency domain index data sequence based on the frequency spectrum residual error calculation result to obtain a saliency map sequence;
and the determining module is used for determining the position of the data point to be detected in the saliency map sequence, and if the position of the data point to be detected in the saliency map sequence meets a preset mutation condition, defining the data point to be detected as an abnormal point.
In a third aspect, the present invention provides an electronic device, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the steps of the method for outlier detection according to any of the above.
In a fourth aspect, the invention provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of anomaly detection as described in any one of the above.
According to the method and the device for detecting the abnormal point, provided by the invention, the target time sequence index data sequence is obtained, the obtained target time sequence index data sequence does not need to be subjected to sample marking, the target time sequence index data sequence is subjected to fast Fourier transform, and the target time sequence index data sequence in a time domain is converted into a frequency domain index data sequence; performing spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result; performing fast Fourier inverse transformation on the frequency domain index data sequence based on the spectrum residual error calculation result to obtain a saliency map sequence; and determining the position of the data point to be detected in the saliency map sequence, and if the position of the data point to be detected in the saliency map sequence meets a preset mutation condition, defining the data point to be detected as an abnormal point. The spectrum residual error abnormal detection method based on Fourier analysis is adopted without depending on a large number of labeled samples, and low missing report rate and low false report rate can be realized in the aspect of abnormal point detection.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flowchart of a method for detecting an abnormal point according to an embodiment of the present invention;
FIG. 2 is the first effect comparison graph provided by the present invention;
FIG. 3 is the second effect comparison graph provided by the present invention;
FIG. 4 is the third effect comparison graph provided by the present invention;
FIG. 5 is a schematic structural diagram of an apparatus for abnormal point detection according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, a schematic flowchart of a method for detecting an abnormal point according to an embodiment of the present invention, the method includes:
Step 110: acquiring a target time sequence index data sequence, and performing a fast Fourier transform on the target time sequence index data sequence to obtain a frequency domain index data sequence, wherein the target time sequence index data sequence is composed of data points.
In this step, the fast Fourier transform (FFT) is the general name for efficient algorithms that compute the discrete Fourier transform (DFT) on a computer.
The fast Fourier transform is a classical method in the field of digital signal processing. Here it transforms the target time sequence index data sequence from the time domain to the frequency domain to obtain the frequency domain index data sequence. Compared with the time domain, which is what the human eye usually sees and perceives, the frequency domain is a dimension better suited to capturing the characteristics of a signal.
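As an illustration of step 110, the following is a minimal sketch of the time-domain to frequency-domain transform using NumPy. The function name `to_frequency_domain` and the sample values are hypothetical and are not taken from the patent.

```python
import numpy as np

def to_frequency_domain(series):
    """Transform a real-valued time series into its complex frequency-domain sequence."""
    x = np.asarray(series, dtype=float)
    return np.fft.fft(x)

# Hypothetical metric samples: a repeating pattern with one spike.
x = np.array([1.0, 2.0, 1.0, 2.0, 1.0, 9.0, 1.0, 2.0])
freq = to_frequency_domain(x)

# The inverse transform recovers the original series up to rounding error.
recovered = np.fft.ifft(freq).real
```

The frequency-domain sequence is complex-valued and has the same length as the input, so the magnitude and the phase of each entry can be examined separately in the later steps.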
Step 120: performing spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result.
In this step, the spectrum residual calculation can be performed with the spectral residual (SR) algorithm. The SR algorithm is an image saliency detection algorithm; image saliency is an important visual feature of an image and reflects how much attention the human eye pays to each region of the image.
Step 130: performing an inverse fast Fourier transform on the frequency domain index data sequence based on the spectrum residual calculation result to obtain a saliency map sequence.
In this step, the inverse fast Fourier transform converts the frequency domain index data sequence, after the spectrum residual calculation, from the frequency domain back to the time domain to obtain the saliency map sequence.
Step 140: determining the position of the data point to be detected in the saliency map sequence and, if that position meets a preset mutation condition, defining the data point to be detected as an abnormal point.
In this step, the saliency map sequence is composed of data points corresponding to random noise and data points corresponding to abnormal signals, and the data points corresponding to random noise follow a normal distribution. Therefore, if the position of the data point to be detected in the saliency map sequence meets the preset mutation condition, the data point to be detected is determined to be an abnormal point.
According to the method for detecting the abnormal point, provided by the invention, a target time sequence index data sequence is obtained, the obtained target time sequence index data sequence does not need to be subjected to sample marking, the target time sequence index data sequence is subjected to fast Fourier transform, and the target time sequence index data sequence in a time domain is converted into a frequency domain index data sequence; performing spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result; performing fast Fourier inverse transformation on the frequency domain index data sequence based on the spectrum residual error calculation result to obtain a saliency map sequence; and determining the position of the data point to be detected in the saliency map sequence, and if the position of the data point to be detected in the saliency map sequence meets a preset mutation condition, defining the data point to be detected as an abnormal point. The spectrum residual error abnormal detection method based on Fourier analysis is adopted without depending on a large number of labeled samples, and low missing report rate and low false report rate can be realized in the aspect of abnormal point detection.
According to any of the above embodiments, before step 110, the method includes the following steps 111-112:
Step 111: collecting operation data of each component in the computer system at regular intervals, storing the operation data in a data warehouse, and processing the operation data with a processing and aggregation technology to generate a time sequence index data sequence.
In this step, the operation data of each component in the computer system can be collected at regular intervals using Agent technology. An Agent is a computer system encapsulated in a certain environment that can act flexibly and autonomously in that environment to achieve its design purpose.
A data warehouse, abbreviated DW or DWH, is a subject-oriented, integrated, time-variant, and relatively stable collection of data used to support management decision-making.
The data warehouse supports processing the operation data with the processing and aggregation technology to generate the time sequence index data sequence.
Step 112: appending, based on an average gradient, a preset number of target data points at the end of the time sequence index data sequence, and generating the target time sequence index data sequence from the time sequence index data sequence and the target data points.
In this step, spectrum residual calculation based on Fourier analysis works better when the data point to be detected lies at the center of the sliding window. A preset number of target data points is therefore appended so that the data point to be detected lies at the center of the sliding window. The preset number is an adjustable hyperparameter and is set according to the specific application scenario.
Based on any of the above embodiments, the step 120 specifically includes the following steps 121 to 122:
Step 121: acquiring a logarithmic magnitude spectrum and a main stream signal magnitude spectrum based on the frequency domain index data sequence.
In this step, the frequency domain index data sequence is obtained by the fast Fourier transform, the corresponding magnitude spectrum is obtained from the frequency domain index data sequence, and a logarithmic transformation is applied to the magnitude spectrum to obtain the corresponding logarithmic magnitude spectrum.
The main stream signal magnitude spectrum is the magnitude spectrum corresponding to the normal signals (data points) in the frequency domain index data sequence. Here, the signal takes the form of structured data.
Step 122: performing spectrum residual calculation based on the logarithmic magnitude spectrum and the main stream signal magnitude spectrum to obtain a spectrum residual calculation result.
Based on any of the above embodiments, the step 121 specifically includes the following steps 1211 to 1213:
Step 1211: determining the magnitude spectrum corresponding to the frequency domain index data sequence based on the frequency domain index data sequence.
In this step, in Fourier analysis, the variation of the amplitude of each component with frequency is called the magnitude spectrum of the signal. Accordingly, the variation with frequency of the amplitude of each data point in the frequency domain index data sequence is taken as the magnitude spectrum.
Step 1212: performing a logarithmic transformation on the magnitude spectrum to obtain the logarithmic magnitude spectrum.
In this step, the logarithmic transformation is performed to increase the discrimination between normal signal frequencies and abnormal signal frequencies.
Step 1213: acquiring, by convolution smoothing, the main stream signal magnitude spectrum corresponding to the main stream signal frequencies in the logarithmic magnitude spectrum.
Steps 1211 to 1213 can be realized by the following formulas:
[Formulas (1) to (5): formula images not preserved in this text]
The symbols in formulas (1) to (5) denote, in order, the magnitude spectrum, the logarithmic magnitude spectrum, the main stream signal magnitude spectrum, a custom convolution kernel function, and the width of the convolution smoothing window, which is a hyperparameter with a default value of 10.
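Steps 1211 to 1213 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's exact formulas: the convolution kernel is assumed to be a uniform moving average of width `q` (default 10, as stated above), a small `eps` is added before the logarithm to avoid `log(0)`, and the function name `log_magnitude_and_mainstream` is hypothetical.

```python
import numpy as np

def log_magnitude_and_mainstream(freq, q=10, eps=1e-8):
    """Return (magnitude, log magnitude, smoothed log magnitude) of a frequency sequence."""
    magnitude = np.abs(freq)                    # step 1211: magnitude spectrum
    log_magnitude = np.log(magnitude + eps)     # step 1212: logarithmic transformation
    q = min(q, log_magnitude.size)              # keep the kernel no longer than the data
    kernel = np.ones(q) / q                     # assumed kernel: uniform moving average
    # step 1213: convolution smoothing; mode="same" keeps the length (zero-padded edges)
    mainstream = np.convolve(log_magnitude, kernel, mode="same")
    return magnitude, log_magnitude, mainstream

# Hypothetical input: a periodic series with one injected spike.
x = np.sin(2 * np.pi * np.arange(64) / 8.0)
x[40] += 4.0
freq = np.fft.fft(x)
magnitude, log_magnitude, mainstream = log_magnitude_and_mainstream(freq)
```

Because `mode="same"` pads with zeros, the smoothed values near both ends of the spectrum are biased toward zero; a different edge treatment may be preferable in practice.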
Based on any of the above embodiments, the step 130 specifically includes the following steps 131 to 132:
Step 131: determining the phase spectrum corresponding to the frequency domain index data sequence based on the frequency domain index data sequence.
In this step, the variation with frequency of the phase of each data point in the frequency domain index data sequence is taken as the phase spectrum. Accordingly, this can be realized by the following formula:
$$P(f) = \mathrm{Phase}\left( \mathcal{F}(x) \right) \tag{6}$$
where $P(f)$ denotes the phase spectrum and $\mathcal{F}(x)$ denotes the frequency domain index data sequence obtained by the fast Fourier transform.
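As a small illustration of step 131, the phase spectrum can be read off the complex frequency-domain sequence with NumPy. The sample values are hypothetical. The check at the end shows why the phase is kept: magnitude and phase together carry the full frequency-domain sequence.

```python
import numpy as np

# Hypothetical samples of a short periodic series.
x = np.array([0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0])
freq = np.fft.fft(x)

magnitude = np.abs(freq)   # magnitude spectrum
phase = np.angle(freq)     # phase spectrum in radians, within [-pi, pi]

# Recombining magnitude and phase reproduces the frequency-domain sequence.
rebuilt = magnitude * np.exp(1j * phase)
```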
Step 132: performing an inverse fast Fourier transform on the frequency domain index data sequence based on the spectrum residual calculation result and the phase spectrum to obtain a saliency map sequence.
In this step, the spectrum residual calculation result can be denoted by $R(f)$. Specifically, the spectrum residual calculation is performed based on the logarithmic magnitude spectrum and the main stream signal magnitude spectrum to obtain the spectrum residual calculation result, which is realized by the following formula:
$$R(f) = L(f) - AL(f) \tag{7}$$
where $L(f)$ denotes the logarithmic magnitude spectrum and $AL(f)$ denotes the main stream signal magnitude spectrum.
Correspondingly, performing the inverse fast Fourier transform on the frequency domain index data sequence based on the spectrum residual calculation result and the phase spectrum to obtain the saliency map sequence is realized by the following formula:
$$S(x) = \left\| \mathcal{F}^{-1}\left( \exp\left( R(f) + i\,P(f) \right) \right) \right\| \tag{8}$$
where $S(x)$ denotes the saliency map sequence, $\mathcal{F}^{-1}$ denotes the inverse fast Fourier transform, $\exp$ denotes the exponential function with the natural constant $e$ as its base, $i$ denotes the imaginary unit, $R(f)$ denotes the spectrum residual calculation result, and $P(f)$ denotes the phase spectrum.
The spectrum residual calculation result $R(f)$ is composed of data points corresponding to random noise and data points corresponding to abnormal signals. Transforming it from the frequency domain to the time domain by the inverse Fourier transform yields the saliency map sequence $S(x)$. Abnormal points can be detected intuitively and clearly from the saliency map sequence $S(x)$.
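The chain from the frequency-domain sequence to the saliency map sequence can be sketched as below. This is a hedged illustration of the spectral residual idea rather than the patent's exact implementation: the smoothing kernel is assumed to be a uniform moving average of width `q`, the modulus of the inverse transform is taken as the saliency value, and `saliency_map` and `eps` are hypothetical names.

```python
import numpy as np

def saliency_map(series, q=10, eps=1e-8):
    """Compute a spectral-residual saliency map for a 1-D time series."""
    x = np.asarray(series, dtype=float)
    freq = np.fft.fft(x)
    log_magnitude = np.log(np.abs(freq) + eps)   # logarithmic magnitude spectrum
    phase = np.angle(freq)                       # phase spectrum
    q = min(q, x.size)
    # Assumed smoothing: uniform moving average over the log magnitude spectrum.
    mainstream = np.convolve(log_magnitude, np.ones(q) / q, mode="same")
    residual = log_magnitude - mainstream        # spectrum residual
    # Back to the time domain; the modulus gives a real, non-negative sequence.
    return np.abs(np.fft.ifft(np.exp(residual + 1j * phase)))

# Hypothetical input: a periodic series with one injected spike.
x = np.sin(2 * np.pi * np.arange(128) / 16.0)
x[64] += 5.0
s = saliency_map(x)
```

The output has one saliency value per input data point, so the position of a data point in the original series is also its position in the saliency map sequence.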
Based on any of the above embodiments, whether the position of the data point to be detected in the saliency map sequence meets the preset mutation condition is determined by the following formula:
$$\frac{S(x_i) - \overline{S(x_i)}}{\overline{S(x_i)}} > \tau \tag{9}$$
where $S(x_i)$ denotes the value of the data point to be detected, $\overline{S(x_i)}$ denotes the mean value of a preset number of data points before the data point to be detected on the saliency map sequence, and $\tau$ denotes the sensitivity coefficient.
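A minimal sketch of such a mutation check follows, assuming the condition compares the relative deviation of a saliency value from the mean of the preceding values against the sensitivity coefficient. The function name `is_abnormal`, the window length of 10, and the coefficient value 3.0 are illustrative assumptions, not values stated in the patent.

```python
import numpy as np

def is_abnormal(saliency, index, window=10, tau=3.0, eps=1e-8):
    """Return True if saliency[index] deviates from the preceding local mean by more than tau."""
    s = np.asarray(saliency, dtype=float)
    start = max(0, index - window)
    preceding = s[start:index]          # up to `window` values before the point
    if preceding.size == 0:
        return False                    # no history to compare against
    local_mean = preceding.mean()
    # Relative deviation from the local mean; eps guards against division by zero.
    return bool((s[index] - local_mean) / (local_mean + eps) > tau)

# Hypothetical saliency values: flat, then one sharp jump.
s = np.array([1.0] * 10 + [10.0])
flag_jump = is_abnormal(s, 10)   # jump compared with ten flat values
flag_flat = is_abnormal(s, 5)    # flat value compared with flat history
```

A larger sensitivity coefficient makes the check stricter and reports fewer points; a smaller one reports more.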
Specifically, the mean value of a preset number of data points before the data point to be detected on the saliency map sequence may be understood as the mean value over the preset number of target data points added based on the average gradient.
The process of obtaining $\overline{S(x_i)}$ is as follows:
$$x_{n+1} = x_{n-m+1} + \bar{g} \cdot m \tag{10}$$
$$\bar{g} = \frac{1}{m} \sum_{i=1}^{m} g(x_n, x_{n-i}) \tag{11}$$
$$g(x_n, x_{n-i}) = \frac{x_n - x_{n-i}}{i} \tag{12}$$
where $x_n$ denotes the value of the $n$-th data point, that is, the last data point of the time sequence index data sequence; $x_{n+1}$ denotes the value of the $(n+1)$-th data point of the target time sequence index data sequence, here specifically the 1st target data point; $x_{n-m+1}$ denotes the value of the $(n-m+1)$-th data point of the target time sequence index data sequence; $g(x_n, x_{n-i})$ denotes the gradient (i.e., the slope) of the straight line between the $n$-th data point and the $(n-i)$-th data point of the target time sequence index data sequence; $\bar{g}$ denotes the average gradient (i.e., the average slope) between the $n$-th data point and the $(n-i)$-th data points of the target time sequence index data sequence; and $m$ denotes the number of preceding data points considered.
In the embodiment provided by the invention, the preset number is 10, so the values of the remaining target data points can be obtained in turn by the average-gradient method described above.
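The average-gradient extension can be sketched as follows. This is one possible reading, stated as an assumption: each target data point is estimated from the slopes between the current last point and the `m` points before it, and the estimate is then appended so that the next target point is obtained in turn. The function name `extend_by_average_gradient` and the default `m=5` are hypothetical; `count=10` follows the preset number stated above.

```python
import numpy as np

def extend_by_average_gradient(series, m=5, count=10):
    """Append `count` estimated points, each derived from the average gradient of the tail."""
    values = [float(v) for v in series]
    if len(values) < 2:
        raise ValueError("at least two data points are required")
    for _ in range(count):
        n = len(values)
        k = min(m, n - 1)               # number of preceding points actually available
        last = values[-1]
        # Slope of the straight line between the last point and the point i steps before it.
        gradients = [(last - values[n - 1 - i]) / i for i in range(1, k + 1)]
        mean_gradient = sum(gradients) / k
        # Start k - 1 steps before the last point and move k steps along the average slope.
        values.append(values[n - k] + mean_gradient * k)
    return np.array(values)

# A perfectly linear series is continued linearly.
extended = extend_by_average_gradient(np.arange(10.0), m=5, count=3)
```

With the points appended, the last original data point is no longer at the edge of the window that is passed to the Fourier transform.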
Further, the embodiment of the present invention is described as follows:
(1) Collect operation data of each component in the computer system at regular intervals using Agent technology, store the operation data in a data warehouse, and process the operation data with a processing and aggregation technology to generate a time sequence index data sequence.
(2) Append, based on an average gradient, a preset number of target data points at the end of the time sequence index data sequence, and generate the target time sequence index data sequence from the time sequence index data sequence and the target data points.
(3) Perform a fast Fourier transform on the target time sequence index data sequence, converting it from the time domain to the frequency domain to obtain the frequency domain index data sequence.
(4) Determine the magnitude spectrum and the phase spectrum corresponding to the frequency domain index data sequence based on the frequency domain index data sequence.
(5) Apply a logarithmic transformation to the magnitude spectrum to obtain the logarithmic magnitude spectrum, and obtain, by convolution smoothing, the main stream signal magnitude spectrum corresponding to the main stream signal frequencies in the logarithmic magnitude spectrum.
(6) Apply the spectral residual (SR) algorithm to the logarithmic magnitude spectrum and the main stream signal magnitude spectrum to obtain the spectrum residual calculation result.
(7) Perform an inverse fast Fourier transform on the frequency domain index data sequence based on the spectrum residual calculation result and the phase spectrum to obtain the saliency map sequence.
(8) Determine the position of the data point to be detected in the saliency map sequence and, if that position meets the preset mutation condition, define the data point to be detected as an abnormal point.
Specifically, the preset mutation condition is as follows:
$$\frac{S(x_i) - \overline{S(x_i)}}{\overline{S(x_i)}} > \tau$$
where $S(x_i)$ denotes the value of the data point to be detected, $\overline{S(x_i)}$ denotes the mean value of a preset number of data points before the data point to be detected on the saliency map sequence, and $\tau$ denotes the sensitivity coefficient.
The abnormal point detection method provided by the invention is based on fast Fourier analysis and combines the frequency domain and the time domain in its analysis. It is a highly general anomaly detection method and is suitable for different waveform types.
The following provides a comparison of the effect of the anomaly detection method provided by the present invention and the anomaly detection algorithm in the prior art.
Fig. 2 is the first effect comparison diagram provided by the present invention. It concerns a general random time sequence index data sequence: the upper half of the diagram shows the result of the abnormal point detection method provided by the present invention, and the lower half shows the result of an abnormal point detection algorithm in the prior art (taking the quartile algorithm as an example). The solid black points marked in the figure represent abnormal points. The method provided by the present invention marks fewer abnormal points than the prior-art quartile algorithm. In other words, the prior-art method mistakes data points that are not abnormal for abnormal points and therefore raises false alarms, whereas the method provided by the present invention has a lower false alarm rate and improved detection accuracy.
Fig. 3 is the second effect comparison diagram provided by the present invention. It concerns a time sequence index data sequence with strong regularity: the upper half of the diagram shows the result of the abnormal point detection method provided by the present invention, and the lower half shows the result of an abnormal point detection algorithm in the prior art (taking the quartile algorithm as an example). The solid black points marked in the figure represent abnormal points. The method provided by the present invention detects the abnormal points, whereas the prior-art quartile algorithm does not, which shows that the method provided by the present invention has higher anomaly detection precision and improved detection performance.
Fig. 4 is the third effect comparison diagram provided by the present invention. It concerns a time sequence index data sequence whose internal regularity is not obvious: the upper half of the diagram shows the result of the abnormal point detection method provided by the present invention, and the lower half shows the result of an abnormal point detection algorithm in the prior art (taking the quartile algorithm as an example). The solid black points marked in the figure represent abnormal points. The method provided by the present invention detects the abnormal points, whereas the prior-art quartile algorithm does not, which shows that the method provided by the present invention has higher anomaly detection precision and improved detection performance.
The effect comparison diagrams show that the abnormal point detection method based on the spectrum residual of Fourier analysis improves detection performance and overall accuracy.
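For reference, the quartile algorithm used as the prior-art baseline in the comparison can be sketched as below. The multiplier `k = 1.5` is the conventional choice and is an assumption here, since the source does not state the parameters used for Figs. 2 to 4; the function name is hypothetical.

```python
import numpy as np


def quartile_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])  # first and third quartiles
    iqr = q3 - q1                             # interquartile range
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return np.flatnonzero((values < lower) | (values > upper))
```

Because this rule looks only at the distribution of values and ignores their order in time, a point that lies inside the overall value range but breaks a periodic pattern cannot be flagged by it.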
The abnormal point detection device provided by the present invention is described below. The device described below and the abnormal point detection method described above may be referred to in correspondence with each other.
Fig. 5 is a schematic structural diagram of an abnormal point detection device according to an embodiment of the present invention. The device includes:
an obtaining module 510, configured to obtain a target time sequence index data sequence, and perform fast Fourier transform on the target time sequence index data sequence to obtain a frequency domain index data sequence, where the target time sequence index data sequence is composed of data points;
a calculating module 520, configured to perform spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result;
an inverse transform module 530, configured to perform inverse fast Fourier transform on the frequency domain index data sequence based on the spectrum residual calculation result to obtain a saliency map sequence;
a determining module 540, configured to determine the position of a data point to be detected in the saliency map sequence, and if the position of the data point to be detected in the saliency map sequence meets a preset mutation condition, define the data point to be detected as an abnormal point.
The abnormal point detection device provided by the present invention obtains a target time sequence index data sequence, which does not need sample labeling, and performs fast Fourier transform on it to convert it from the time domain into a frequency domain index data sequence; performs spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result; performs inverse fast Fourier transform on the frequency domain index data sequence based on the spectrum residual calculation result to obtain a saliency map sequence; and determines the position of the data point to be detected in the saliency map sequence, defining the data point to be detected as an abnormal point if that position meets a preset mutation condition. The spectrum residual anomaly detection method based on Fourier analysis that is adopted does not depend on a large number of labeled samples, and can achieve a low missed alarm rate and a low false alarm rate in abnormal point detection.
Based on any of the above embodiments, the device further includes, preceding the obtaining module 510:
a collecting module, configured to collect operation data of each component in the computer system at regular intervals, store the operation data in a data warehouse, and process the operation data using a processing convergence technology to generate a time sequence index data sequence;
and an adding module, configured to add, based on an average gradient method, a preset number of target data points at the end of the data points in the time sequence index data sequence, and to generate the target time sequence index data sequence based on the time sequence index data sequence and the target data points.
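The adding step can be illustrated with the sketch below. The source says only that target data points are added "based on an average gradient"; the concrete formula used here (average the slopes between the last point and each of the `lookback` preceding points, extrapolate one step, and repeat the estimate `num_points` times) follows the usual formulation of the spectral residual algorithm and is an assumption, as are the function and parameter names.

```python
import numpy as np


def extend_with_average_gradient(series, num_points=5, lookback=5):
    """Append num_points copies of an estimated next value to the series."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    if n == 0:
        raise ValueError("series must contain at least one data point")
    m = min(lookback, n - 1)  # number of preceding points actually available
    if m < 1:
        estimate = x[-1]      # single point: nothing to extrapolate from
    else:
        steps = np.arange(1, m + 1)
        # slope between the last point and each of the m preceding points
        gradients = (x[-1] - x[-1 - steps]) / steps
        mean_gradient = gradients.mean()
        # extrapolate from the point m - 1 steps before the last one
        estimate = x[-m] + mean_gradient * m
    return np.concatenate([x, np.full(num_points, estimate)])
```

On a strictly linear sequence the estimate is simply the next value on the line, so `extend_with_average_gradient([1, 2, 3, 4, 5, 6], num_points=2, lookback=3)` appends two copies of 7.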
Based on any of the above embodiments, the calculating module 520 includes:
an obtaining unit, configured to obtain a logarithmic magnitude spectrum and a main stream signal magnitude spectrum based on the frequency domain index data sequence;
and a calculating unit, configured to perform spectrum residual calculation based on the logarithmic magnitude spectrum and the main stream signal magnitude spectrum to obtain a spectrum residual calculation result.
Based on any of the above embodiments, the obtaining unit is specifically configured to:
determining a magnitude spectrum corresponding to the frequency domain index data sequence based on the frequency domain index data sequence;
carrying out logarithmic transformation processing on the amplitude spectrum to obtain a logarithmic amplitude spectrum;
and acquiring a main stream signal amplitude spectrum corresponding to the main stream signal frequency in the logarithmic amplitude spectrum by using a convolution smoothing mode.
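In symbols, the operations of the obtaining unit and the calculating unit can be written as below. The notation is an assumption borrowed from the usual formulation of the spectral residual algorithm: $x$ is the target time sequence index data sequence, $\mathcal{F}$ is the fast Fourier transform, and $h_q$ is an averaging kernel of length $q$ whose entries all equal $1/q$. The source names the convolution smoothing step but does not give the kernel.

```latex
\begin{aligned}
A(f)  &= \left| \mathcal{F}(x) \right| && \text{(magnitude spectrum)} \\
L(f)  &= \log A(f)                     && \text{(logarithmic magnitude spectrum)} \\
AL(f) &= h_q(f) * L(f)                 && \text{(main stream signal magnitude spectrum)} \\
R(f)  &= L(f) - AL(f)                  && \text{(spectrum residual calculation result)}
\end{aligned}
```

Subtracting the smoothed spectrum removes the slowly varying part of the logarithmic magnitude spectrum, so that $R(f)$ keeps only the part that the smoothing could not predict.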
Based on any of the above embodiments, the inverse transformation module 530 is specifically configured to:
determining a phase spectrum corresponding to the frequency domain index data sequence based on the frequency domain index data sequence;
and performing fast Fourier inverse transformation on the frequency domain index data sequence based on the frequency spectrum residual calculation result and the phase spectrum to obtain a saliency map sequence.
Based on any one of the above embodiments, the frequency domain index data sequence is subjected to inverse fast fourier transform based on the spectrum residual calculation result and the phase spectrum to obtain a saliency map sequence, and the method is implemented by the following formula:
$$S(x) = \left\| \mathcal{F}^{-1}\left( \exp\left( R(f) + i \cdot P(f) \right) \right) \right\|$$
where $S(x)$ represents the saliency map sequence, $\mathcal{F}^{-1}$ represents the inverse fast Fourier transform, $\exp$ represents the exponential function with the natural constant e as the base, $i$ represents the imaginary unit, $R(f)$ represents the spectrum residual calculation result, and $P(f)$ represents the phase spectrum.
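One way to see why an exponential of a logarithmic spectrum plus the phase spectrum is passed to the inverse transform: if the full logarithmic magnitude spectrum were used in place of the spectrum residual, the inverse fast Fourier transform would return the original sequence unchanged. The sketch below checks this identity numerically; the function name and the test data are hypothetical.

```python
import numpy as np


def round_trip(series):
    """Rebuild a series from its log magnitude spectrum and phase spectrum."""
    x = np.asarray(series, dtype=float)
    freq = np.fft.fft(x)
    log_mag = np.log(np.abs(freq))   # logarithmic magnitude spectrum
    phase = np.angle(freq)           # phase spectrum
    # exp(log|F| + i * phase) equals |F| * exp(i * phase), i.e. F itself
    return np.fft.ifft(np.exp(log_mag + 1j * phase))


rng = np.random.default_rng(0)
sample = rng.normal(size=32) + 5.0   # arbitrary test data with non-zero spectrum
rebuilt = round_trip(sample)
```

Replacing the logarithmic magnitude spectrum with the spectrum residual therefore keeps the phase information of the sequence while discarding the smooth part of its spectrum, which is what turns the reconstruction into a saliency map.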
Based on any of the above embodiments, the position of the data point to be detected in the saliency map sequence satisfies a preset mutation condition by the following formula:
$$\frac{S(x_i) - \overline{S(x_i)}}{\overline{S(x_i)}} > \tau$$
where $S(x_i)$ represents the value of the data point to be detected, $\overline{S(x_i)}$ represents the mean value of a preset number of data points before the data point to be detected on the saliency map sequence, and $\tau$ represents the sensitivity coefficient.
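As a worked example with made-up numbers, assume that the condition compares the relative deviation of the value of the data point to be detected from the preceding mean against the sensitivity coefficient. With a preceding mean of 0.2 and a sensitivity coefficient of 3, a saliency value of 0.9 satisfies the condition, while a saliency value of 0.5 does not:

```latex
\frac{0.9 - 0.2}{0.2} = 3.5 > 3 \quad\Rightarrow\quad \text{abnormal point},
\qquad
\frac{0.5 - 0.2}{0.2} = 1.5 \le 3 \quad\Rightarrow\quad \text{not an abnormal point}
```

A larger sensitivity coefficient therefore flags fewer points, and a smaller one flags more.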
Fig. 6 illustrates a physical structure diagram of an electronic device, which may include, as shown in fig. 6: a processor (processor)610, a communication Interface (Communications Interface)620, a memory (memory)630 and a communication bus 640, wherein the processor 610, the communication Interface 620 and the memory 630 communicate with each other via the communication bus 640. The processor 610 may invoke logic instructions in the memory 630 to perform a method of outlier detection comprising: acquiring a target time sequence index data sequence, and performing fast Fourier transform on the target time sequence index data sequence to obtain a frequency domain index data sequence, wherein the target time sequence index data sequence consists of data points; performing spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result; performing fast Fourier inverse transformation on the frequency domain index data sequence based on the spectrum residual calculation result to obtain a saliency map sequence; determining the position of a data point to be detected in the saliency map sequence, and if the position of the data point to be detected in the saliency map sequence meets a preset mutation condition, defining the data point to be detected as an abnormal point.
In addition, the logic instructions in the memory 630 may be implemented in software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program, the computer program being storable on a non-transitory computer-readable storage medium, wherein when the computer program is executed by a processor, the computer is capable of executing a method for outlier detection provided by the above methods, the method comprising: acquiring a target time sequence index data sequence, and performing fast Fourier transform on the target time sequence index data sequence to obtain a frequency domain index data sequence, wherein the target time sequence index data sequence consists of data points; performing spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result; performing fast Fourier inverse transformation on the frequency domain index data sequence based on the spectrum residual calculation result to obtain a saliency map sequence; determining the position of a data point to be detected in the saliency map sequence, and if the position of the data point to be detected in the saliency map sequence meets a preset mutation condition, defining the data point to be detected as an abnormal point.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium, on which a computer program is stored, the computer program, when executed by a processor, implementing a method for anomaly detection provided by the above methods, including: acquiring a target time sequence index data sequence, and performing fast Fourier transform on the target time sequence index data sequence to obtain a frequency domain index data sequence, wherein the target time sequence index data sequence consists of data points; performing spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result; performing fast Fourier inverse transformation on the frequency domain index data sequence based on the spectrum residual calculation result to obtain a saliency map sequence; determining the position of a data point to be detected in the saliency map sequence, and if the position of the data point to be detected in the saliency map sequence meets a preset mutation condition, defining the data point to be detected as an abnormal point.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method of anomaly detection, comprising:
acquiring a target time sequence index data sequence, and performing fast Fourier transform on the target time sequence index data sequence to obtain a frequency domain index data sequence, wherein the target time sequence index data sequence consists of data points;
performing spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result;
performing fast Fourier inverse transformation on the frequency domain index data sequence based on the spectrum residual calculation result to obtain a saliency map sequence;
determining the position of a data point to be detected in the saliency map sequence, and if the position of the data point to be detected in the saliency map sequence meets a preset mutation condition, defining the data point to be detected as an abnormal point.
2. The method of anomaly detection according to claim 1, comprising, before said obtaining a sequence of target timing index data:
collecting operation data of each component in a computer system at regular intervals, storing the operation data in a data warehouse, and processing the operation data using a processing convergence technology to generate a time sequence index data sequence;
adding, based on an average gradient method, a preset number of target data points at the end of the data points in the time sequence index data sequence, and generating the target time sequence index data sequence based on the time sequence index data sequence and the target data points.
3. The method of claim 1, wherein the performing the spectrum residual calculation based on the frequency domain index data sequence to obtain the spectrum residual calculation result comprises:
acquiring a logarithmic magnitude spectrum and a main stream signal magnitude spectrum based on the frequency domain index data sequence;
and performing spectrum residual error calculation based on the logarithmic magnitude spectrum and the main stream signal magnitude spectrum to obtain a spectrum residual error calculation result.
4. The method of outlier detection as recited in claim 3, wherein said obtaining a log-magnitude spectrum and a mainstream signal magnitude spectrum based on said sequence of frequency domain indicator data comprises:
determining a magnitude spectrum corresponding to the frequency domain index data sequence based on the frequency domain index data sequence;
carrying out logarithmic transformation processing on the amplitude spectrum to obtain a logarithmic amplitude spectrum;
and acquiring a main stream signal amplitude spectrum corresponding to the main stream signal frequency in the logarithmic amplitude spectrum by using a convolution smoothing mode.
5. The method of outlier detection as claimed in claim 4 wherein said inverse fast Fourier transforming said frequency domain indicative data sequence based on said spectral residual calculation to obtain a saliency map sequence comprises:
determining a phase spectrum corresponding to the frequency domain index data sequence based on the frequency domain index data sequence;
and performing fast Fourier inverse transformation on the frequency domain index data sequence based on the frequency spectrum residual calculation result and the phase spectrum to obtain a saliency map sequence.
6. The method of outlier detection of claim 5 wherein said inverse fast Fourier transform of said frequency domain index data sequence based on said spectral residual calculation and said phase spectrum results in a saliency map sequence, implemented by the following formula:
$$S(x) = \left\| \mathcal{F}^{-1}\left( \exp\left( R(f) + i \cdot P(f) \right) \right) \right\|$$
where $S(x)$ represents the saliency map sequence, $\mathcal{F}^{-1}$ represents the inverse fast Fourier transform, $\exp$ represents the exponential function with the natural constant e as the base, $i$ represents the imaginary unit, $R(f)$ represents the spectrum residual calculation result, and $P(f)$ represents the phase spectrum.
7. The method for detecting abnormal points according to claim 2, wherein the position of the data point to be detected in the saliency map sequence satisfies a preset mutation condition by the following formula:
$$\frac{S(x_i) - \overline{S(x_i)}}{\overline{S(x_i)}} > \tau$$
where $S(x_i)$ represents the value of the data point to be detected, $\overline{S(x_i)}$ represents the mean value of a preset number of data points before the data point to be detected on the saliency map sequence, and $\tau$ represents the sensitivity coefficient.
8. An apparatus for anomaly detection, comprising:
the acquisition module is used for acquiring a target time sequence index data sequence and performing fast Fourier transform on the target time sequence index data sequence to obtain a frequency domain index data sequence, wherein the target time sequence index data sequence consists of data points;
the calculation module is used for performing spectrum residual calculation based on the frequency domain index data sequence to obtain a spectrum residual calculation result;
the inverse transformation module is used for carrying out fast Fourier inverse transformation on the frequency domain index data sequence based on the frequency spectrum residual error calculation result to obtain a saliency map sequence;
and the determining module is used for determining the position of the data point to be detected in the saliency map sequence, and if the position of the data point to be detected in the saliency map sequence meets a preset mutation condition, defining the data point to be detected as an abnormal point.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of outlier detection according to any of the claims 1 to 7 when executing the program.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of anomaly detection according to any one of claims 1 to 7.
CN202210123405.9A 2022-02-10 2022-02-10 Abnormal point detection method and device Pending CN114168586A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210123405.9A CN114168586A (en) 2022-02-10 2022-02-10 Abnormal point detection method and device

Publications (1)

Publication Number Publication Date
CN114168586A true CN114168586A (en) 2022-03-11

Family

ID=80489579

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210123405.9A Pending CN114168586A (en) 2022-02-10 2022-02-10 Abnormal point detection method and device

Country Status (1)

Country Link
CN (1) CN114168586A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109947812A (en) * 2018-07-09 2019-06-28 平安科技(深圳)有限公司 Consecutive miss value fill method, data analysis set-up, terminal and storage medium
WO2021017665A1 (en) * 2019-07-26 2021-02-04 Telefonaktiebolaget Lm Ericsson (Publ) Methods, devices and computer storage media for anomaly detection
CN111967508A (en) * 2020-07-31 2020-11-20 复旦大学 Time series abnormal point detection method based on saliency map
CN113127716A (en) * 2021-04-29 2021-07-16 南京大学 Sentiment time sequence anomaly detection method based on saliency map

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
岳广军: "《市场调查与预测》", 31 March 2018, 哈尔滨工程大学出版社 *
李庆东 等: "《统计学概论》", 30 April 2013, 东北财经大学出版社 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114844796A (en) * 2022-04-29 2022-08-02 济南浪潮数据技术有限公司 Method, device and medium for detecting abnormity of time-series KPI
CN114799610A (en) * 2022-06-24 2022-07-29 苏芯物联技术(南京)有限公司 Welding quality real-time detection method and system based on inverse Fourier transform and self-encoder
CN114799610B (en) * 2022-06-24 2022-10-04 苏芯物联技术(南京)有限公司 Welding quality real-time detection method and system based on inverse Fourier transform and self-encoder

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220311