CN113127705A

CN113127705A - Heterogeneous bidirectional generative adversarial network model and time series anomaly detection method

Info

Publication number: CN113127705A
Application number: CN202110360734.0A
Authority: CN
Inventors: 陈鹏; 夏云霓; 任建华; 单文煜; 王锐; 于春
Original assignee: Xihua University
Current assignee: Quzhou Haiyi Technology Co ltd
Priority date: 2021-04-02
Filing date: 2021-04-02
Publication date: 2021-07-16
Anticipated expiration: 2041-04-02
Also published as: CN113127705B

Abstract

The invention relates to a heterogeneous bidirectional generative adversarial network model and a time series anomaly detection method. The model comprises a generator G, a discriminator D and a data processing module. The generator G learns the characteristics of a time series, generates abnormal data similar to real data from random noise, and inputs the abnormal data into the discriminator D for judgment, thereby realizing the mapping from the latent space to the original data space. The encoder E is used for calculating the reconstruction error of the generator G. The discriminator D identifies and distinguishes different patterns and discriminates whether the data generated by the generator G is normal data or abnormal data. Through an improved outlier function, the invention fully combines the reconstruction error of the generator with the classification error of the discriminator, which improves the performance of the outlier function and the detection precision. Finally, the improved bidirectional generative adversarial network framework of encoder, generator and discriminator reduces the computational complexity and improves the accuracy of the generator's reconstruction error calculation, and increases the anomaly detection speed.

Description

Heterogeneous bidirectional generative adversarial network model and time series anomaly detection method

Technical Field

The invention relates to the technical field of data processing, in particular to a heterogeneous bidirectional generative adversarial network model and a time series anomaly detection method.

Background

Over the last decade we have entered the big data era, and the amount of available data has increased dramatically. In particular, with the rapid development of information technology, time series data is growing at an unprecedented rate in industries such as medicine, business, biology, finance and the internet; examples include electrocardiogram records, stock price quotations and earthquake activity records. The analysis and application of time series data is now an area of great interest. Among the many research directions in time series data mining, anomaly detection is an emerging one. Simply put, anomalies are values that differ from the majority of the data set. In static data, if the data set is assumed to come from a certain distribution, anomalies are the values that deviate from this distribution; if the observed values are assumed to originate from a model, anomalies are the values that deviate from that model. Anomalies have many causes, such as sudden weather events, policy changes and recording errors. Anomalies are sometimes of great significance in themselves and provide much useful information. For example, credit card fraud may manifest as a card being used almost simultaneously in different places, so fraud can be inferred by analyzing card usage. Sensors are often used to track parameters of various environments, and a sudden change in these parameters may indicate an event of interest nearby. In medicine, normal MRI or PET scans are routinely collected, and a scan that deviates from them may indicate cancer. Satellites and remote sensors collect large amounts of data on weather conditions and climate change, so that sudden weather changes can be predicted from possible abnormal conditions. Finding anomalies in time series is therefore a very meaningful task.

In today's big data era, artificial intelligence technologies represented by deep learning have been introduced to analyze time series data, optimize models and solve practical problems. In recent years deep learning has shown strong capability in learning representations of complex data (high-dimensional, temporal, spatial and graph data). Deep learning does not require a fixed model structure: features are extracted automatically from massive data by the algorithm, and the data is used repeatedly to improve performance until the application requirement or the iteration limit is reached. Practical applications show good applicability and accuracy. Deep learning has been successfully applied in many practical time series fields, and scholars and experts at home and abroad have devoted themselves to this research and achieved many excellent results in important areas including sequence matching, pattern recognition, clustering, trend analysis, similarity detection, classification, and long-term and short-term prediction. When processing complex, heterogeneous, massive data, anomaly detection methods based on deep learning perform better than traditional anomaly detection methods in solving practical problems.

Current deep-learning-based time series anomaly detection methods generally fall into three categories: deep learning for feature extraction, deep feature learning of normal data, and end-to-end deep learning of outlier values. Deep learning for feature extraction either directly uses mature pre-trained deep neural networks such as AlexNet and VGG to extract low-dimensional features, or explicitly trains a separate deep feature extraction model. Deep feature learning of normal data mainly learns the distribution and feature representation of normal data; abnormal data does not conform to this distribution and representation, so normal and abnormal data can be distinguished by calculating outlier values. Methods of this kind include autoencoders that detect abnormal data based on reconstruction error, generative adversarial networks based on generation and discrimination errors, predictive models based on classification error, and fusion methods based on traditional anomaly measures such as distance, one-class classification and clustering. End-to-end deep learning of outlier values uses a deep neural network to learn the outlier value of a data instance directly. It is a method specific to deep learning and mainly considers how to design an effective loss function and how to combine the deep neural network with the anomaly measure; it includes ranking-based models, prior-driven models, end-to-end one-class classification models and the like.

Because anomalies are by nature unknown, heterogeneous and rare, and because high-dimensional data and the correlations between its dimensions are increasingly complex, existing anomaly detection methods still have the following disadvantages: 1) the recall rate on complex data (massive, heterogeneous and noisy) is insufficient; 2) learning the features of normal or abnormal data still depends on a large amount of labeled training data, which is difficult to obtain in practice; 3) detection performance on complex anomalies, particularly contextual anomalies and collective anomalies, is inadequate. Moreover, training for anomaly detection based on generative adversarial networks may suffer from problems such as failure to converge and mode collapse. The generator network may be misled and generate data instances other than normal ones, especially when the true distribution of the given data set is complex or the training data contains unexpected outliers. In addition, an outlier function based on a generative adversarial network is built on a generator whose goal is data synthesis rather than anomaly detection, so the performance of an outlier function that relies only on the generator is deficient.

Disclosure of Invention

The invention aims to overcome the defects of the prior art by providing a heterogeneous bidirectional generative adversarial network model and a time series anomaly detection method that remedy the shortcomings of existing deep-learning-based time series anomaly detection methods.

The purpose of the invention is realized by the following technical scheme: a heterogeneous bidirectional generative adversarial network model comprises a generator G and an encoder E, whose neural networks are multilayer long short-term memory (LSTM) networks, and a discriminator D, whose neural network is a convolutional neural network; the generator G is used for learning the characteristics of the time series, generating abnormal data similar to normal data from random noise and inputting the abnormal data into the discriminator D for judgment, thereby realizing the mapping from the latent space to the original data space; the encoder E is used for calculating the reconstruction error of the generator G; the discriminator D is used for identifying and distinguishing different patterns and discriminating whether the data generated by the generator G is normal data or abnormal data;

through the game between the generator G and the discriminator D, the model for finally achieving Nash equilibrium is as follows:

the calculation of outlier values for the model includes calculating a discrimination error and a reconstruction error; the outlier function

$s_x = \alpha\, l_D(x) - (1-\alpha)\, l_G(x)$

is constructed to calculate the outlier values of the model, where $l_D$ is the discrimination error and $l_G$ is the reconstruction error; the discrimination error is obtained directly from the classification cross entropy of the discriminator.

The encoder E is trained simultaneously with the generator G in the adversarial training process, so that while the generator G is trained to map the latent space to the original data space, the encoder E realizes the inverse mapping from the data space to the latent space, and the reconstruction error $l_G$ can then be calculated quickly and accurately.

A time series anomaly detection method based on the heterogeneous bidirectional generative adversarial network model comprises an anomaly detection step; the anomaly detection step includes:

inputting real data containing abnormal data and normal data into the trained model;

classifying abnormal data and normal data by the discriminator D in the model, and calculating the classification cross entropy to obtain the discrimination error $l_D$;

mapping the data to the latent space through the encoder E in the model to obtain $E(x)$, mapping the latent space back to the original data space through the generator G to obtain $G(E(x))$, and then calculating the reconstruction error $l_G$ by $l_G(x) = \|x - G(E(x))\|_1$;

combining the discrimination error $l_D$ and the reconstruction error $l_G$, and calculating the outlier value through the outlier function $s_x = \alpha\, l_D(x) - (1-\alpha)\, l_G(x)$ to realize anomaly detection.
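The detection step above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the callables `discriminator`, `encoder` and `generator` are hypothetical stand-ins for the trained networks D, E and G, the discriminator output is assumed to be the predicted probability that a sample is normal, the cross entropy is assumed to be taken against the "normal" label, and the weight `alpha = 0.5` is an arbitrary placeholder because the text does not state a value.

```python
import math


def l1_reconstruction_error(x, encoder, generator):
    """l_G(x) = ||x - G(E(x))||_1 for one sample x (a list of floats)."""
    x_rec = generator(encoder(x))
    return sum(abs(a - b) for a, b in zip(x, x_rec))


def discrimination_error(p_normal, eps=1e-12):
    """Cross entropy of the discriminator output against the 'normal' label.

    Assumption: p_normal = D(x) is the predicted probability that x is normal.
    """
    p = min(max(p_normal, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p)


def outlier_value(x, discriminator, encoder, generator, alpha=0.5):
    """s_x = alpha * l_D(x) - (1 - alpha) * l_G(x)."""
    l_d = discrimination_error(discriminator(x))
    l_g = l1_reconstruction_error(x, encoder, generator)
    return alpha * l_d - (1.0 - alpha) * l_g


# Toy stand-ins for the trained networks (hypothetical, for illustration only).
toy_encoder = lambda x: [sum(x) / len(x)]          # E: data space -> latent space
toy_generator = lambda z: [z[0], z[0], z[0]]       # G: latent space -> data space
toy_discriminator = lambda x: 0.9                  # D: probability of "normal"

score = outlier_value([1.0, 2.0, 3.0], toy_discriminator, toy_encoder, toy_generator)
```

How the resulting value is thresholded to decide between normal and abnormal is not specified in the text and is left open here.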

The time series anomaly detection method also comprises the steps of model construction and model training; the model building and training steps are performed before the anomaly detection step.

The model building step comprises:

constructing a generator G and an encoder E whose neural networks are multilayer long short-term memory networks; the generator G is used for learning the characteristics of the time series, generating abnormal data similar to normal data from random noise and inputting the abnormal data into the discriminator D for judgment, thereby realizing the mapping from the latent space to the original data space; the encoder E is used for calculating the reconstruction error of the generator G;

and constructing a discriminator D whose neural network is a convolutional neural network, wherein the discriminator D is used for identifying and distinguishing different patterns and discriminating whether the data generated by the generator G is normal data or abnormal data.

The model training step comprises:

acquiring training data consisting of normal data x, preprocessing the training data, and inputting the preprocessed training data into the model produced by the model building step;

the generator G in the model generates abnormal data x' from random noise z, the encoder E encodes the training data to generate latent space data z', and the discriminator D classifies and discriminates the generated abnormal data x' and the normal data x and performs multiple adversarial iterations with the generator G until the discriminator D accurately discriminates between the abnormal data x' generated by the generator G and the normal data x.

The invention has the following advantages: first, a heterogeneous generative adversarial network is used in which a long short-term memory network serves as the generator and a convolutional neural network serves as the discriminator, so that the respective strengths of the two types of neural network in sequence modeling and pattern recognition are fully exploited, high-dimensional anomalies can be detected by checking the reconstruction performed in the learned low-dimensional latent space, and the convergence of training and the effectiveness of the model are ensured; second, the improved outlier function fully combines the reconstruction error of the generator with the classification error of the discriminator, which improves the performance of the outlier function and the detection precision; finally, the improved bidirectional generative adversarial network framework of encoder, generator and discriminator reduces the computational complexity and improves the accuracy of the generator's reconstruction error calculation, and increases the anomaly detection speed.

Drawings

FIG. 1 is a schematic diagram of the generative adversarial network architecture;

FIG. 2 is a schematic diagram of the single-layer structure of a long short-term memory network;

FIG. 3 is a schematic diagram of a convolutional neural network structure;

FIG. 4 is a schematic diagram of the real data set;

FIG. 5 is a schematic diagram of the heterogeneous bidirectional generative adversarial network architecture of the present invention;

FIG. 6 is a graph of the performance on the data sets;

FIG. 7 is a diagram illustrating the convergence curve of the loss function of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the detailed description of the embodiments of the present application provided below in connection with the appended drawings is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application. The invention is further described below with reference to the accompanying drawings.

The invention realizes unsupervised anomaly detection for non-stationary, nonlinear, multivariate, massive and complex time series by constructing a heterogeneous generative adversarial network. The generator is used to generate potential abnormal data, the discriminator is used to classify normal data and abnormal data, and an outlier function is constructed from the differing reconstruction errors of abnormal and normal data combined with the classification result, thereby realizing more accurate anomaly detection. The details are as follows:

For a given data set $X = \{x_1, x_2, \ldots, x_N\}$, let $Z$ be a low-dimensional representation space. The goal of deep-learning-based anomaly detection is to learn either a feature mapping $\phi: X \to Z$ from the data space to the feature representation space, or an outlier function $\tau$ that scores each instance directly, so that abnormal and normal data can be distinguished through $\phi$ or $\tau$. Both $\phi$ and $\tau$ are constructed from a neural network with $H$ hidden layers whose weight matrices are $\Theta = \{M^1, M^2, \ldots, M^H\}$.

When the feature mapping $\phi$ is learned, the outlier value is:

$s_x = f(x, \phi_{\Theta^*}, \psi_{W^*})$

When the outlier function $\tau$ is learned end to end, the outlier value is:

$s_x = \tau(x; \Theta^*)$

where $\phi$ maps the original data to the representation space $Z$; $\psi$ is the learning task on normal data in the representation space $Z$, with neural network weight matrix $W$; $\Theta^*$ and $W^*$ are the parameters obtained by minimizing the loss function of the model; $f$ is the outlier value calculation function, which calculates the anomaly score using $\phi$ and $\psi$; and $\tau$ is the function that directly calculates the outlier value in end-to-end learning.

As shown in fig. 1, the neural network of the generator G is a multilayer long short-term memory network used to learn the characteristics of the time series, and the discriminator is a convolutional neural network that uses its strong pattern recognition capability to recognize and distinguish different patterns. This heterogeneous generative adversarial network design makes better use of both the LSTM network's ability to learn the memory and time-dependent characteristics of sequences and the convolutional neural network's pattern recognition capability.

Unlike previous methods based on generative adversarial networks, the present invention uses the adversarial idea in such a way that the generator directly generates abnormal data, rather than normal data, from random noise, and the discriminator discriminates whether data is generated abnormal data or original normal data. The goal of the generator is to generate data that is as similar to normal data as possible and cannot be identified by the discriminator; the goal of the discriminator is to distinguish real data from abnormal data as well as possible. The two play a game against each other and finally reach Nash equilibrium.

In the corresponding objective function, $C_0$ and $C_n$ denote the costs of misjudging abnormal data and normal data respectively, and $\zeta(x) \in (0, 1)$ is the outlier function used to minimize the objective function.

As shown in fig. 2, the generation model G is based on the long short-term memory network. For one layer of unidirectional LSTM in the generator, the update equations are:

$f^t = \sigma(b_f + W_f[C^{t-1}, h^{t-1}, x^t])$

$i^t = \sigma(b_i + W_i[C^{t-1}, h^{t-1}, x^t])$

$o^t = \sigma(b_o + W_o[C^{t}, h^{t-1}, x^t])$

$h^t = o^t \odot \tanh(C^t)$

where $x^t$ is the input feature vector at time t within the same time window; $h^{t-1}$, $h^t$, $C^{t-1}$ and $C^t$ are the hidden state vectors and cell state vectors input or output at time t; when t is 0, $h^{t-1}$ and $C^{t-1}$ do not exist; $h^t$ and $C^t$ are passed along within the same layer, and every $h^t$ also serves as an output of the LSTM layer; $W_f$, $W_i$, $W_C$ and $W_o$ are the weight matrices of the forget gate, the input gate, the cell state and the output gate respectively, each written as a concatenation of three weight matrices for brevity; $b_f$, $b_i$, $b_C$ and $b_o$ are the corresponding bias vectors; and $\sigma$ is the sigmoid function.
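As a rough illustration of these update equations, the following standard-library Python sketch performs one LSTM time step and unrolls a layer over a window. Several points are assumptions rather than details given in the text: the cell state update, which is not written out above, is taken to be the usual form $C^t = f^t \odot C^{t-1} + i^t \odot \tanh(b_C + W_C[h^{t-1}, x^t])$; the missing states at t = 0 are replaced by zero vectors; and the parameter dictionary `p` and all function names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, v):
    """Return b + W v for a weight matrix W (list of rows) and a vector v."""
    return [b_k + sum(w * a for w, a in zip(row, v)) for row, b_k in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One time step of the gate equations above (cell update is assumed)."""
    gate_in = c_prev + h_prev + x_t  # concatenation [C^{t-1}, h^{t-1}, x^t]
    f_t = [sigmoid(v) for v in affine(p["W_f"], p["b_f"], gate_in)]
    i_t = [sigmoid(v) for v in affine(p["W_i"], p["b_i"], gate_in)]
    # Assumed standard candidate and cell update; not spelled out in the text.
    cand = [math.tanh(v) for v in affine(p["W_C"], p["b_C"], h_prev + x_t)]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, cand)]
    # The output gate looks at the new cell state C^t, as in the equation above.
    o_t = [sigmoid(v) for v in affine(p["W_o"], p["b_o"], c_t + h_prev + x_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def lstm_layer(xs, hidden_size, p):
    """Unroll one layer over a window; zero initial states are an assumption."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    outputs = []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, p)
        outputs.append(h)  # every h^t is also an output of the layer
    return outputs
```

With hidden size H and input size X, the gate matrices in this sketch have shape H by (2H + X) and `W_C` has shape H by (H + X).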

As shown in fig. 3, the discrimination model D is based on a convolutional neural network. To prevent over-fitting of the generative adversarial network model, the discrimination model D uses a convolutional neural network whose structure is completely different from that of the generation model G. Let x denote the time series data; when it passes through a convolutional layer, the following convolution operation is performed:

$c_j = f(x * W_j + b_j)$

where $W_j$ and $b_j$ are the weight parameters of the convolutional layer and $f$ is the chosen activation function. The features extracted by the convolution operation are then subjected to the following pooling operation:

$p_j = \mathrm{pooling}(c_j) + b_j$
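A minimal sketch of one convolution-plus-pooling stage for a univariate series may make the two formulas concrete. The choices here are assumptions, since the text does not fix them: the convolution is computed as a "valid" sliding dot product (cross-correlation, as is common in deep learning libraries), the activation f is ReLU, pooling is non-overlapping max pooling, and the pooling bias is passed separately as `pool_bias`. All function names are hypothetical.

```python
def conv1d_valid(x, kernel, bias):
    """Sliding dot product of x with kernel, plus bias ('valid' positions only)."""
    k = len(kernel)
    return [sum(x[i + m] * kernel[m] for m in range(k)) + bias
            for i in range(len(x) - k + 1)]


def relu(v):
    return max(0.0, v)


def max_pool(c, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(c[i:i + size]) for i in range(0, len(c) - size + 1, size)]


def conv_pool_feature(x, kernel, conv_bias, pool_bias, pool_size=2):
    """c_j = f(x * W_j + b_j), then p_j = pooling(c_j) + bias."""
    c_j = [relu(v) for v in conv1d_valid(x, kernel, conv_bias)]
    return [p + pool_bias for p in max_pool(c_j, pool_size)]


# Example: a length-2 summing kernel over a short series.
features = conv_pool_feature([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 1.0], 0.0, 0.0)
```

In a full discriminator several such kernels $W_j$ would be applied in parallel and followed by a classification layer; that part is omitted here.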

In the final anomaly detection process, for a given piece of data, the calculation of the outlier value based on the generative adversarial network needs to consider two parts. The first is the discrimination error of the discriminator: the discriminator of the model is essentially a binary classifier that distinguishes generated abnormal data from real normal data, so judging whether data is normal or abnormal with the discriminator amounts to a binary classification result. The second is the reconstruction error of the generator: there is a clear difference between the reconstruction errors of normal data and abnormal data, and because the generator in this model generates abnormal data from random noise, the reconstruction error of normal data is significantly larger than that of abnormal data. The outlier function can therefore be constructed as follows:

$s_x = \alpha\, l_D(x) - (1-\alpha)\, l_G(x)$

where $l_D$ is the discrimination error and $l_G$ is the reconstruction error, i.e. the generation error. Calculating $l_G$ directly from the structure of a generative adversarial network is difficult, because the generator only implements the mapping from the latent space to the original data space. If an inverse mapping $E$ from the original data space to the latent space, $z = E(x)$, can be found, the reconstruction error can be calculated by the following equation:

$l_G(x) = \|x - G(E(x))\|_1$

To this end, the model improves the existing generative adversarial network into a bidirectional generative adversarial network, which generates abnormal data while also implementing the inverse mapping $z = E(x)$ from the data space to the latent space. An encoder E is added to the existing generator G and discriminator D, and the encoder E and the generator G are trained simultaneously in the adversarial training process, so that while the generator G is trained to map the latent space to the data space, the encoder E realizes the inverse mapping from the data space to the latent space, and the reconstruction error $l_G$ can be calculated quickly and accurately. The specific implementation architecture is shown in fig. 5, so that the overall min-max model of the model changes from:

to the improved form:

The invention is explained below by way of corresponding experimental data and examples.

1. Data set

As shown in fig. 4, the data service log of a real data center for the whole month of June 2017 is selected as the data set. The fourteen columns of the log are, in order: timestamp, transmission time, remote host, amount of data transferred, file name, transmission type, special flag bit, transmission direction, access mode, user id, service id, authentication method, authentication id and completion state. The log contains 4094157 entries in total, so the data volume is on the order of millions. It is ensured that the data contains no future information or irrelevant information, and the main data dimensions fluctuate little or not at all during the period. In addition to the real data set, the invention also uses two common open anomaly detection data sets, optdigits and vertebral.

2. Data pre-processing

Let t be the current time and assume the time points $t_0, t_1, t_2, \ldots, t_M$ are available for anomaly detection. Each feature is first normalized individually using z-score normalization: for the current sliding window of length N, the mean and the standard deviation $\sigma_1, \sigma_2, \ldots, \sigma_m$ of each feature are computed, the corresponding mean is subtracted from each feature, and the result is divided by the standard deviation, which removes the differences in scale among the features.
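The per-window z-score normalization described above could look roughly like the following standard-library Python sketch. The function names are hypothetical, the population standard deviation is used as an assumption (the text does not say whether the population or sample form is meant), and a constant feature is mapped to 0.0 to avoid division by zero, which is likewise an assumption.

```python
import statistics


def sliding_windows(series, n):
    """All contiguous windows of length n over a list of feature rows."""
    return [series[i:i + n] for i in range(len(series) - n + 1)]


def zscore_window(window):
    """Normalize each feature (column) of one window to zero mean, unit std.

    window: list of N rows, each row a list of m feature values.
    """
    columns = list(zip(*window))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]  # population std (assumed)
    return [
        [(v - mu) / sd if sd > 0 else 0.0 for v, mu, sd in zip(row, means, stds)]
        for row in window
    ]


# Example: three time steps, two features; the second feature is constant.
normalized = zscore_window([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
```

Normalizing each window independently means that the same raw value can map to different normalized values in different windows, which is the intended behavior for non-stationary series.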

3. Model training process

When the discrimination model D is trained, the data generated by the generation model G in the previous round and the real data are concatenated directly and used as a new input x. The discrimination model D produces a score (a number between 0 and 1) for x, and back-propagation is performed on the loss function formed from the score and the label y. When the generation model G is trained, the generation model G and the discrimination model D are treated as a whole, and this whole still outputs a single score. When a group of noise data is input, the generation model G produces pseudo data, which is then scored by the discrimination model D. The goal of the generation model G is to make the score D(G(z)) of its pseudo data G(z) on the discrimination model D consistent with the score D(x) of the real data x. During this step the parameters of the discrimination model D are not trainable, which ensures that the generation model G is trained against the scoring criterion of the discrimination model D. Through these two mutually adversarial processes and iterative optimization, the performance of both the discrimination model D and the generation model G improves continuously. When the discrimination capability of the discrimination model D has improved to a certain degree and it still cannot correctly identify the data source, the generation model G can be considered to have learned the distribution of the real data. Note that in the present technique the encoder E is trained together with G so that the encoding E(x) produced by the encoder is consistent with the random noise z, which realizes the inverse mapping from the original data to the latent space and facilitates the subsequent calculation of outlier values.

4. Model performance index

The performance comparison of the models uses several main performance indicators for classification based on the confusion matrix: precision, recall, F1-score and the ROC curve. In the formulas below, TP, FP, TN and FN denote the numbers of true positives, false positives, true negatives and false negatives in the confusion matrix.

Precision is the proportion of actually positive samples among the samples predicted to be positive by the model, and is calculated as:

$\mathrm{Precision} = \dfrac{TP}{TP + FP}$

Recall is the proportion of samples predicted to be positive among the samples that are actually positive, and is calculated as:

$\mathrm{Recall} = \dfrac{TP}{TP + FN}$

The F1-score is the harmonic mean of precision and recall, and is calculated as:

$F1 = \dfrac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$

The abscissa of the ROC curve is the false positive rate (FPR) and the ordinate is the true positive rate (TPR). They are calculated respectively as:

$FPR = \dfrac{FP}{FP + TN}, \qquad TPR = \dfrac{TP}{TP + FN}$
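These indicators follow directly from the four confusion matrix counts, as the short Python sketch below shows. The function names are hypothetical and the sketch assumes all denominators are non-zero. As a sanity check, the harmonic mean of the precision 0.996 and recall 0.974 reported for the present method on the real data set rounds to the reported F1 of 0.985.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def confusion_metrics(tp, fp, tn, fn):
    """Precision, recall (= TPR), F1 and FPR from confusion matrix counts.

    Assumes tp + fp, tp + fn and fp + tn are all non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
        "fpr": fp / (fp + tn),
        "tpr": recall,
    }


# Illustrative counts only; these are not taken from the experiments.
metrics = confusion_metrics(tp=8, fp=2, tn=85, fn=5)

# Consistency check against the reported precision and recall.
reported_f1 = round(f1_from(0.996, 0.974), 3)
```

An ROC curve is obtained by sweeping the decision threshold on the outlier value and plotting the resulting (FPR, TPR) pairs.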

5. Results of model comparison

As can be seen from fig. 6 and fig. 7 and from Tables 1 and 2 below, the experimental results of the model on the real data set and the open data sets, compared with existing models, are as follows:

(1) As shown in Table 1, on the real data set the precision, recall, F1-score and ROC of the model are 0.996, 0.974, 0.985 and 0.955 respectively; the overall anomaly detection performance is high and better than that of the existing typical models.

(2) As shown in Table 2, the model also achieves good detection performance on the open data sets optdigits and vertebral. Only its ROC value on optdigits is slightly lower than that of the Isolation Forest model; in all other respects it is superior to the existing typical models.

(3) As shown in fig. 6, the ROC curve of the present model is significantly better than those of the existing models.

(4) As shown in fig. 7, the model converges better and faster during training.

Table 1. Detection performance of the present method compared with four typical machine-learning-based anomaly detection methods on the real data set

Metric	IForest	KNN	AutoEncoder	SOGAAL	OurMethod
Precision	0.555	0.476	0.585	0.756	0.996
Recall	0.562	0.471	0.674	0.869	0.974
F1	0.558	0.472	0.566	0.809	0.985
ROC	0.650	0.420	0.740	0.812	0.955

Table 2. Detection performance of the present method compared with four typical machine-learning-based anomaly detection methods on the two open anomaly detection data sets

To address anomaly detection for multivariate time series data, in particular non-stationary, nonlinear, multivariate, massive and complex time series, the invention designs an unsupervised anomaly detection method based on a heterogeneous bidirectional generative adversarial network structure. Its advantages are: (1) it solves the feature representation and learning of normal data in complex time series and realizes unsupervised anomaly detection, giving it good practicability and application value; (2) it addresses the poor training convergence and mode collapse found in anomaly detection models based on generative adversarial networks; (3) it improves the performance of the outlier function and reduces the computational complexity, so that anomaly detection can be performed more accurately in a shorter time.

The foregoing is illustrative of the preferred embodiments of this invention. It is to be understood that the invention is not limited to the precise form disclosed herein, and that various other combinations, modifications and environments may be resorted to within the scope of the concept disclosed herein, either as described above or as apparent to those skilled in the relevant art. Modifications and variations may be effected by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A heterogeneous bidirectional generative adversarial network model, characterized by: the model comprises a generator G and an encoder E, whose neural networks are multilayer long short-term memory networks, and a discriminator D, whose neural network is a convolutional neural network; the generator G is used for learning the characteristics of the time series, generating abnormal data similar to normal data from random noise and inputting the abnormal data into the discriminator D for judgment, thereby realizing the mapping from the latent space to the original data space; the encoder E is used for calculating the reconstruction error of the generator G; the discriminator D is used for identifying and distinguishing different patterns and discriminating whether the data generated by the generator G is normal data or abnormal data; through the game between the generator G and the discriminator D, the model for finally achieving Nash equilibrium is as follows:

2. A heterogeneous bidirectional generative adversarial network model according to claim 1, wherein: calculating outlier values for the model includes calculating a discrimination error and a reconstruction error; an outlier function $s_x = \alpha\, l_D(x) - (1-\alpha)\, l_G(x)$ is constructed to calculate the outlier values of the model, wherein $l_D$ is the discrimination error and $l_G$ is the reconstruction error; the discrimination error is obtained directly from the classification cross entropy of the discriminator.

3. A heterogeneous bidirectional generative adversarial network model according to claim 1, wherein: the encoder E is trained simultaneously with the generator G in the adversarial training process, so that while the generator G is trained to map the latent space to the original data space, the encoder E realizes the inverse mapping from the data space to the latent space, and the reconstruction error is then calculated quickly and accurately.

4. A time series anomaly detection method based on a heterogeneous bidirectional generative adversarial network model, characterized in that: the time series anomaly detection method includes an anomaly detection step; the anomaly detection step includes:

classifying abnormal data and normal data by the discriminator D in the model, and calculating the classification cross entropy to obtain the discrimination error $l_D$;

mapping the data to the latent space through the encoder E in the model to obtain $E(x)$, mapping the latent space to the original data space through the generator G to obtain $G(E(x))$, and then calculating the reconstruction error $l_G$ by $l_G(x) = \|x - G(E(x))\|_1$;

combining the discrimination error $l_D$ and the reconstruction error $l_G$, and calculating the outlier value through the outlier function $s_x = \alpha\, l_D(x) - (1-\alpha)\, l_G(x)$ to realize anomaly detection.

5. The time series anomaly detection method based on the heterogeneous bidirectional generative adversarial network model according to claim 4, wherein: the time series anomaly detection method also comprises a model building step and a model training step; the model building and training steps are performed before the anomaly detection step.

6. The time series anomaly detection method based on the heterogeneous bidirectional generative adversarial network model according to claim 5, wherein: the model building step comprises:

7. The time series anomaly detection method based on the heterogeneous bidirectional generative adversarial network model according to claim 5, wherein: the model training step comprises: