CN112257917A - Time series abnormal mode detection method based on entropy characteristics and neural network
- Publication number
- CN112257917A (CN202011116876.4A)
- Authority
- CN
- China
- Prior art keywords
- sequence
- score
- abnormal
- sample
- entropy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/08—Learning methods
- G06Q50/02—Agriculture; Fishing; Forestry; Mining
Abstract
The invention provides a time series abnormal pattern detection method based on entropy features and a neural network, comprising the following steps: 1) extracting a second-order difference-rate sample entropy feature sequence from the time series in the training data set; 2) training a generative adversarial network (GAN) model to obtain a generator and a corresponding discriminator; 3) calculating the anomaly score of the feature sequence and constructing a threshold; 4) judging whether the input data to be detected is abnormal according to the threshold. The advantages of the method are that extracting features from the time series data with the difference-rate sample entropy makes abnormal patterns more pronounced, and that a new anomaly score calculation method improves the accuracy and generalization of model identification, giving the method greater practicability and application value.
Description
Technical Field
The invention relates to the prediction of coal mine thermodynamic compound disasters, in particular to a time series abnormal pattern detection method based on entropy features and a neural network, and belongs to the field of emergency safety.
Background
Coal, as a main energy source, occupies an irreplaceable position in China's energy structure. The area left behind after coal mining is the goaf. Ventilation in the goaf is poor and much residual coal remains, which oxidizes continuously and generates combustible gas, so coal mine thermodynamic disasters such as coal spontaneous combustion and gas explosion occur easily. The concentration of the released combustible gas follows a certain pattern over time, so the inflection points of the monitoring data at different stages can be detected effectively; when the gas concentration changes greatly, it can be considered to have entered an abnormal pattern, indicating that a disaster such as coal spontaneous combustion may occur. The amount of gas generated differs between coal mines. If the gas content value alone is used as the criterion for disaster occurrence, large errors can result when the criterion is applied to other coal mines. Detecting abnormal patterns therefore improves the generalization of disaster judgment and provides a new approach to detecting coal mine compound disasters.
As research on artificial intelligence theory has deepened, applying time series prediction methods to coal and gas prediction has become a new trend. Such prediction has been introduced into the quantitative evaluation and analysis of coal and gas disasters and studied in combination with computer technology, support vector machines, artificial neural networks and related theory. However, these prediction methods are difficult to apply to complex data, fall easily into local minima, are prone to overfitting, and have low accuracy and significant limitations.
With the advance of information technology, anomaly detection in time series has become a research focus in recent years. A time series anomaly generally refers to a run of data that differs significantly from the rest of the data; such an anomaly is not a random deviation but a difference produced by a different mechanism. Detecting abnormal patterns in gas time series data can provide a theoretical basis for judging coal mine thermodynamic disasters. If an abnormal pattern exists in the time series data, the trend of the data changes greatly, and the abnormal pattern can serve as a basis for judging that a disaster is occurring.
An existing method (CN201910809956.9) uses a GAN for anomaly detection in time series: it builds the anomaly detection model from an optimized GAN generator and discriminator, and uses the generation residual and the discrimination loss output by the model as the basis for identifying abnormal data. However, most time series do not change significantly, so when they are fed directly into the GAN the features are not pronounced enough. In addition, how to derive a more effective judgment criterion from the generation residual and the discrimination loss output by the model, and how to improve the accuracy and generality of anomaly judgment, remain to be studied.
Disclosure of Invention
The invention aims to provide a time series abnormal pattern detection method based on entropy features and a neural network. The method is divided into 4 stages: extracting a second-order difference-rate sample entropy feature sequence from the time series in the training data set; training a generative adversarial network (GAN) model to obtain a generator and a corresponding discriminator; calculating the anomaly score of the feature sequence and constructing a threshold; and judging whether the input data to be detected is abnormal according to the threshold. Specifically, the method comprises the following steps:
A. extracting a second-order difference-rate sample entropy feature sequence from the time series in the training data set, implemented as follows:
A1. dividing the training data set into two sets, denoted training data set 1 and training data set 2;
training data set 1 contains only normal data, and training data set 2 contains both normal and abnormal data;
A2. sliding a window of size w with step length d over the time series of training data set 1 and segmenting it using formula 1 to obtain a sequence segment set W of length L, where the i-th time series segment is denoted $s_i$;
$s_i = [x_{1+(i-1)d}, x_{2+(i-1)d}, \ldots, x_{1+(i-1)d+w}]$ (formula 1)
where $T_{train}$ denotes the number of time points in the training data set time series and $1 \times T_{train}$ denotes the dimension of that time series;
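The sliding segmentation of step A2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `sliding_segments` and the sample readings are hypothetical, and the sketch assumes each segment holds exactly w consecutive points.

```python
from typing import List, Sequence


def sliding_segments(x: Sequence[float], w: int, d: int) -> List[List[float]]:
    """Cut x into overlapping segments of w consecutive points, moving d points each time."""
    if w <= 0 or d <= 0:
        raise ValueError("w and d must be positive")
    # Segment i (0-based) starts at index i*d, mirroring the 1+(i-1)d offset of formula 1.
    return [list(x[s:s + w]) for s in range(0, len(x) - w + 1, d)]


# Hypothetical readings, only to show the shapes involved.
series = [0.1 * k for k in range(12)]
segments = sliding_segments(series, w=10, d=1)
```

With 12 points, a window of 10 and a step of 1, the call yields three overlapping segments, each shifted by one point relative to the previous one.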
A3. performing the difference-rate operation on each sequence segment in the segment set W to obtain the second-order difference-rate sequences of all segments, implemented as follows:
A3.1. for sequence segment $s_i$, calculating the second-order difference-rate sequence $G = \{g_1, g_2, \ldots, g_{w'}\}$ using formula 2, and computing its standard deviation std;
the two quantities in formula 2 are the e-th-order difference values at time point u and at time point u-1, respectively;
A3.2. dividing the second-order difference-rate sequence of w' data points into subsequences of m consecutive data points each, giving w'-m+1 subsequence segments in total, denoted K2i = $\{q_1, q_2, \ldots, q_{w'-m+1}\}$;
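The bookkeeping of step A3 can be sketched as below. Formula 2 is not legible in this text, so the rate used here, the relative change between consecutive second-order differences, is an assumption chosen only because it depends on the difference values at time points u and u-1 and reproduces the counts reported in the embodiment (348 readings, 345 rates, 340 subsequences for m = 6). All function names and the sample readings are hypothetical.

```python
import math
from typing import List, Sequence


def second_differences(x: Sequence[float]) -> List[float]:
    """Second-order differences: x[k+2] - 2*x[k+1] + x[k]."""
    return [x[k + 2] - 2 * x[k + 1] + x[k] for k in range(len(x) - 2)]


def second_order_difference_rate(x: Sequence[float], eps: float = 1e-12) -> List[float]:
    """ASSUMED form of formula 2: relative change between consecutive second differences."""
    d2 = second_differences(x)
    rates = []
    for prev, cur in zip(d2, d2[1:]):
        # Guard against division by zero on flat stretches (eps is an assumption too).
        denom = prev if abs(prev) > eps else eps
        rates.append((cur - prev) / denom)
    return rates


def subsequences(g: Sequence[float], m: int) -> List[List[float]]:
    """Step A3.2: all w'-m+1 runs of m consecutive points."""
    return [list(g[k:k + m]) for k in range(len(g) - m + 1)]


# Hypothetical series of 348 readings, matching the length used in the embodiment.
readings = [math.sin(0.1 * k) + 0.01 * k * k for k in range(348)]
rates = second_order_difference_rate(readings)
subs = subsequences(rates, m=6)
```

Whatever the exact form of formula 2, the counting argument holds: two points are lost to the second-order difference and one more to the comparison of neighbouring time points.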
A4. extracting sample entropy features from the second-order difference-rate sequences of all segments to obtain their second-order difference-rate sample entropy feature sequences, implemented as follows:
A4.1. calculating the distance $D[q_a, q_b]$ between any two subsequence segments $q_a$ and $q_b$; the distance is the maximum difference between elements at corresponding positions of the two segments;
A4.2. for subsequence segment $q_a$, obtaining by formula 3 the similarity probability, i.e. the proportion of subsequences whose distance to $q_a$ is smaller than the threshold, and obtaining by formula 4 the average similarity probability of the second-order difference-rate sequence;
r is the similarity threshold;
A4.3. following steps A4.1-A4.2, recalculating the average similarity probability $B^{m+1}(r)$ with m+1 as the subsequence length, and obtaining the second-order difference-rate sample entropy feature SE by formula 5;
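Steps A4.1 to A4.3 follow the general shape of sample entropy, which can be sketched as below. Formulas 3 to 5 are not legible in this text, so the sketch assumes the textbook form, SE = ln(B^m(r) / B^(m+1)(r)), with self-matches excluded and every subsequence of each length used as a template. The function names are hypothetical, and returning infinity when no matches exist is a convention of this sketch only.

```python
import math
from typing import Sequence


def max_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Step A4.1: largest absolute difference between corresponding elements."""
    return max(abs(p - q) for p, q in zip(a, b))


def average_similarity(g: Sequence[float], m: int, r: float) -> float:
    """Steps A4.2: fraction of ordered subsequence pairs (a != b) closer than r."""
    subs = [g[k:k + m] for k in range(len(g) - m + 1)]
    n = len(subs)
    if n < 2:
        raise ValueError("sequence too short for this subsequence length")
    matches = sum(
        1
        for a in range(n)
        for b in range(n)
        if a != b and max_distance(subs[a], subs[b]) < r
    )
    return matches / (n * (n - 1))


def sample_entropy(g: Sequence[float], m: int, r: float) -> float:
    """Step A4.3 (assumed form): ln of the ratio of similarity at m to similarity at m+1."""
    b_m = average_similarity(g, m, r)
    b_m1 = average_similarity(g, m + 1, r)
    if b_m == 0.0 or b_m1 == 0.0:
        return float("inf")  # undefined in theory; infinity is this sketch's convention
    return math.log(b_m / b_m1)


flat = sample_entropy([1.0] * 20, m=2, r=0.2)
periodic = sample_entropy([0.0, 1.0] * 20, m=2, r=0.5)
```

A perfectly regular sequence scores zero or close to it, while irregular sequences score higher, which is the property the method relies on to make abnormal patterns stand out.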
A5. applying segmented-average preprocessing to the difference-rate sample entropy sequence to obtain a new difference-rate sample entropy sequence, implemented as follows:
A5.1. starting from $X_t$ ($t = 1, 2, \ldots, t-w$), taking out a sequence segment of length w, $S_t = \{X_t, X_{t+1}, \ldots, X_{w+t-1}\}_{1 \times t}$, summing it according to formula 6, and then averaging it according to formula 7;
$sum_t = X_t + X_{t+1} + \cdots + X_{w+t-1}$ (formula 6)
$sum_t = sum_t / w$ (formula 7)
A5.2. repeating step A5.1 until t-w sequence segments in total have been taken out, and assembling the values $sum_t$ into a new difference-rate sample entropy sequence $S_t' = \{sum_1, sum_2, \ldots, sum_{t-w}\}_{1 \times t}$;
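The segmented averaging of step A5 amounts to a moving average, sketched below. The function name `segment_means` is hypothetical. The sketch follows the text in producing t-w values (start positions 1 to t-w), which means the last full window of the sequence is not used.

```python
from typing import List, Sequence


def segment_means(x: Sequence[float], w: int) -> List[float]:
    """Mean of each length-w segment starting at positions 1..t-w (formulas 6 and 7)."""
    t = len(x)
    if not 0 < w < t:
        raise ValueError("w must satisfy 0 < w < len(x)")
    # 0-based start indices 0..t-w-1 correspond to the 1-based range 1..t-w in step A5.1.
    return [sum(x[s:s + w]) / w for s in range(t - w)]


smoothed = segment_means([1.0, 2.0, 3.0, 4.0, 5.0], w=2)
```

Averaging over a window damps point-to-point noise in the entropy sequence before it is passed to the network.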
B. training a generative adversarial network (GAN) model to obtain a generator and a corresponding discriminator, implemented as follows:
B1. randomly sampling noise data $Z = \{z_i, i = 1, 2, \ldots, n\}$, where n is the number of samples. The generator model G is built from several LSTM memory units, the number of which is set in advance; Z is input into the generator model G to generate the reconstructed sample sequence data G(Z);
B2. inputting the new difference-rate sample entropy sequence $S_t'$ and the generated reconstructed sample sequence data G(Z) into the constructed discriminator model D;
B3. updating the model parameters according to the value of the loss function: first updating the parameters of the discriminator using the stochastic gradient descent algorithm, then updating the parameters of the generator from the noise data using the Adam optimization algorithm;
B4. saving the model parameters and repeating steps B1-B3 iteratively, finally obtaining a trained generator model G that can generate normal time series, together with the corresponding discriminator model D;
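The text does not state which loss function step B3 uses. For orientation only, the standard GAN objective is shown below, on the assumption that the real samples x are drawn from the entropy sequence $S_t'$ and the noise z from Z; the patent's actual loss may differ.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}}\left[\log\left(1 - D(G(z))\right)\right]
```

Under this objective the discriminator update of step B3 increases V while the generator update decreases it, which matches the alternating order described above.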
C. calculating the anomaly score of the feature sequence and constructing a threshold, implemented as follows:
C1. repeating steps A2-A5 on the time series in training data set 2 and extracting features to obtain a new feature sequence;
C2. randomly sampling noise data $Z_{val}$, inputting it into the trained generator G to generate the reconstructed sample $G(Z_{val})$, and using the generation error to calculate the generation anomaly score $R_{score}$ of the input sample, implemented as follows:
C2.1. computing the absolute error E between the reconstructed sample $G(Z_{val})$ of length n and the new feature sequence of training data set 2, sorting its elements from smallest to largest to obtain the sorted absolute error $E_i' = \{e'_1, e'_2, \ldots, e'_n\}$, and computing the mean M of $E_i'$;
C2.2. comparing the elements of $E_i'$ with the mean M and taking out $\{e'_k, e'_{k+1}, \ldots, e'_n\}$, the elements of $E_i'$ greater than the mean M, of which there are n-k+1; initializing the weight sequence $W_i' = \{w'_1, w'_2, \ldots, w'_n\}^T$, in which, besides the initial values of $w'_1, \ldots, w'_{n-2}$, the weight $w'_n$ corresponding to $x'_n$ is $\lambda$ and the weight $w'_{n-1}$ corresponding to $x'_{n-1}$ is $1-\lambda$; and then updating the elements of $W_i'$ by formula 8;
C2.3. using the updated weight sequence $W_i'$ and the sorted errors $E_i'$, calculating the generation anomaly score $R_{score}$ of training data set 2 by formula 9;
$R_{score} = E_i' \cdot W_i'$ (formula 9)
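The generation anomaly score of steps C2.1 to C2.3 can be sketched as a weighted sum of sorted absolute errors. Two points are assumptions of this sketch: the weights other than the last two are taken to start at zero, and the update of formula 8 (not legible in this text) is skipped, so the initial weights are used directly. All function names and sample values are hypothetical.

```python
from typing import List, Sequence


def sorted_abs_errors(reconstructed: Sequence[float], features: Sequence[float]) -> List[float]:
    """Step C2.1: element-wise absolute errors, sorted from smallest to largest."""
    if len(reconstructed) != len(features):
        raise ValueError("sequences must have the same length")
    return sorted(abs(g - x) for g, x in zip(reconstructed, features))


def errors_above_mean(sorted_errors: Sequence[float]) -> List[float]:
    """Step C2.2: the errors that exceed the mean M."""
    mean = sum(sorted_errors) / len(sorted_errors)
    return [e for e in sorted_errors if e > mean]


def initial_weights(n: int, lam: float) -> List[float]:
    """w'_n = lam and w'_{n-1} = 1 - lam; the remaining weights are ASSUMED to be 0."""
    if n < 2:
        raise ValueError("need at least two errors")
    return [0.0] * (n - 2) + [1.0 - lam, lam]


def generation_score(sorted_errors: Sequence[float], weights: Sequence[float]) -> float:
    """Formula 9: dot product of the sorted errors and the weight sequence."""
    return sum(e * w for e, w in zip(sorted_errors, weights))


errors = sorted_abs_errors([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.0, 6.0])
large = errors_above_mean(errors)
r_score = generation_score(errors, initial_weights(len(errors), lam=0.75))
```

Because the errors are sorted in ascending order, placing the weight mass at the end of the sequence makes the score depend mainly on the largest reconstruction errors.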
C3. using the discriminator D trained in step B to output the similarity probability P between the generated sample and the new feature sequence, and calculating the discriminant anomaly score $D_{score} = 1 - P$;
C4. using the discriminant anomaly score $D_{score}$ and the generation anomaly score $R_{score}$ to calculate the anomaly score O by formula 10, and establishing a threshold from training data set 2, implemented as follows:
$O = W_D \times D_{score} + W_G \times R_{score}$ (formula 10)
where $W_D$ and $W_G$ are the weights of the discriminant anomaly score and the sample generation anomaly score, respectively;
C4.1. taking the maximum and minimum anomaly scores in the results for the training data set as the upper and lower boundaries, dividing the interval between them evenly, and calculating the anomaly score of the q-th section of training data set 2 by formula 11;
C4.2. taking the anomaly score corresponding to the maximum F1 score as the threshold; F1 is calculated by formula 12;
Pre is the proportion of samples predicted to be positive that are actually positive, and Rec is the proportion of actually positive samples that are predicted to be positive, that is, $Pre = TP/(TP+FP)$, $Rec = TP/(TP+FN)$ and $F1 = 2 \cdot Pre \cdot Rec/(Pre+Rec)$; TP is the number of positive samples predicted as positive by the model; FP is the number of negative samples predicted as positive by the model; FN is the number of positive samples predicted as negative by the model;
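Steps C3 and C4 can be sketched as below. Formula 11 is not legible in this text, so the candidate thresholds are assumed to be evenly spaced between the minimum and maximum scores; the function names, the number of sections and the sample scores are hypothetical.

```python
from typing import List, Sequence


def anomaly_score(p: float, r_score: float, w_d: float, w_g: float) -> float:
    """Formula 10 with D_score = 1 - P from step C3."""
    return w_d * (1.0 - p) + w_g * r_score


def f1_at(scores: Sequence[float], labels: Sequence[bool], threshold: float) -> float:
    """F1 when every score above the threshold is predicted abnormal (positive)."""
    tp = sum(1 for s, y in zip(scores, labels) if s > threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s <= threshold and y)
    if tp == 0:
        return 0.0
    pre = tp / (tp + fp)
    rec = tp / (tp + fn)
    return 2 * pre * rec / (pre + rec)


def best_threshold(scores: Sequence[float], labels: Sequence[bool], sections: int = 100) -> float:
    """ASSUMED reading of step C4.1: evenly spaced candidates between min and max score."""
    lo, hi = min(scores), max(scores)
    candidates: List[float] = [lo + q * (hi - lo) / sections for q in range(sections + 1)]
    # max() keeps the first candidate that reaches the highest F1.
    return max(candidates, key=lambda c: f1_at(scores, labels, c))


# Hypothetical scores for a labelled validation set (True = abnormal).
scores = [0.1, 0.2, 0.3, 0.8, 0.9]
labels = [False, False, False, True, True]
threshold = best_threshold(scores, labels, sections=10)
combined = anomaly_score(p=0.5, r_score=1.0, w_d=0.5, w_g=0.5)
```

Choosing the threshold by maximum F1 balances missed anomalies against false alarms on training data set 2, which is why that set must contain both normal and abnormal data.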
D. judging whether the input data to be detected is abnormal according to the threshold, implemented as follows:
D1. inputting the time series of the data set to be detected, repeating steps A1-A5, and extracting the difference-rate sample entropy features to obtain a new time series;
D2. repeating steps C1-C4: inputting the new time series into the trained generative adversarial network and calculating the anomaly score $O_{real}$ of the data to be detected using formula 10;
D3. comparing the calculated anomaly score $O_{real}$ with the threshold obtained in step C; if the anomaly score is greater than the threshold, the data to be detected is judged to contain an abnormal pattern; otherwise it is judged not to contain an abnormal pattern.
The advantages of the method are that extracting features from the time series data with the difference-rate sample entropy makes abnormal patterns more pronounced, and that the new anomaly score calculation method improves the accuracy and generalization of time series abnormal pattern detection, giving the method greater practicability and application value.
Drawings
FIG. 1: general flow chart of abnormal pattern detection
Detailed Description
The invention is further described below by way of example, with reference to the accompanying drawing: CO time series prediction is performed on experimental data, and the time series abnormal pattern detection method based on difference-rate entropy features and a generative adversarial network is described in terms of the amount of time series data, the input and output dimensions, and so on.
The general flow chart of the method is shown in FIG. 1. The method comprises the following steps: 1) extracting a second-order difference-rate sample entropy feature sequence from the time series in the training data set; 2) training a generative adversarial network (GAN) model to obtain a generator and a corresponding discriminator; 3) calculating the anomaly score of the feature sequence and constructing a threshold; 4) judging whether the input data to be detected is abnormal according to the threshold. The invention is further described below by way of example according to the following steps:
A. extracting a second-order difference-rate sample entropy feature sequence from the time series in the training data set, implemented as follows:
A1. selecting the experimental data, the object of study being a one-dimensional time series of CO gas concentration; selecting a training data set and dividing it into two sets, denoted training data set 1 and training data set 2;
training data set 1 contains only normal data, and training data set 2 contains both normal and abnormal data;
A2. for training data set 1, which contains only normal data, segmenting the series with a sliding window of size 10 and a step length of 1;
A3. performing the difference-rate operation on each sequence segment in the segment set to obtain the second-order difference-rate sequences of all segments, implemented as follows:
A3.1. the CO gas concentration sequence contains 348 data points in total; applying formula 2 gives a second-order difference-rate sequence of 345 data points, $G = \{g_1, g_2, \ldots, g_{w'}\}$, as shown in Table 2, with standard deviation std = 0.11; partial data are as follows:
A3.2. dividing the second-order difference-rate sequence of 345 data points into subsequences of 6 consecutive data points each, giving 340 subsequences in total, denoted K2i = $\{q_1, q_2, \ldots, q_{w'-m+1}\}$; partial data are as follows:
A4. extracting sample entropy features from the second-order difference-rate sequences of all segments to obtain their second-order difference-rate sample entropy feature sequences, implemented as follows:
A4.1. calculating the second-order difference-rate sample entropy feature of each sequence segment, finally obtaining the complete second-order difference-rate sample entropy sequence; partial data are as follows:
A5. applying segmented-average preprocessing to the difference-rate sample entropy sequence to obtain a new difference-rate sample entropy sequence, implemented as follows:
A5.1. starting from $X_t$ ($t = 1, 2, \ldots, t-w$), taking out a sequence segment of length w, $S_t = \{X_t, X_{t+1}, \ldots, X_{w+t-1}\}_{1 \times t}$, summing it, and then averaging it;
A5.2. repeating step A5.1 until t-w sequence segments in total have been taken out, and assembling the values $sum_t$ into a new sequence $S_t' = \{sum_1, sum_2, \ldots, sum_{t-w}\}_{1 \times t}$; partial data are as follows:
B. training a generative adversarial network (GAN) model to obtain a generator and a corresponding discriminator, implemented as follows:
B1. randomly sampling noise data $Z = \{z_i, i = 1, 2, \ldots, n\}$, where n is 330. The generator model is built from several LSTM memory units, the number of which is set in advance; Z is input into the constructed generator model to generate the reconstructed sample sequence data G(Z);
B2. inputting the new difference-rate sample entropy sequence $S_t'$ and the generated reconstructed sample sequence data G(Z) into the constructed discriminator model D; partial parameter data are as follows:
B3. updating the model parameters according to the value of the loss function: first updating the parameters of the discriminator using the stochastic gradient descent algorithm, then updating the parameters of the generator from the noise data using the Adam optimization algorithm;
B4. saving the model parameters and returning to B2 for 1000 loop iterations with the learning rate set to 0.1, finally obtaining the trained generator model G and discriminator model D;
C. calculating the anomaly score of the feature sequence and constructing a threshold, implemented as follows:
C1. first repeating steps A2-A5 on the time series of training data set 2, which contains both normal and abnormal data, and extracting features to obtain a new feature sequence; partial data are as follows:
C2. using the discriminant anomaly score $D_{score}$ and the sample generation anomaly score $R_{score}$ to calculate the anomaly score O;
C2.1. taking the maximum and minimum anomaly scores in the results for the training data set as the upper and lower boundaries and dividing the interval between them evenly to obtain the anomaly score of the q-th section of training data set 2;
C2.2. the maximum F1 score is 0.8916, and the corresponding anomaly score O is taken as the threshold, so the threshold is 0.375;
D. judging whether the input data to be detected is abnormal according to the threshold, implemented as follows:
D1. inputting the time series samples of the data set to be detected, repeating steps A2-A5, and extracting the difference-rate sample entropy features to obtain a new time series; partial data are as follows:
D2. repeating steps C1-C4: inputting the new time series into the trained generative adversarial network and calculating the anomaly score of the actual data sample, $O_{real}$ = 0.572;
D3. comparing the calculated anomaly score $O_{real}$ with the threshold calculated in step C; if the anomaly score is greater than the threshold, the sample is judged to be abnormal. The actual results for the whole sample are as follows:
The method implements time series abnormal pattern detection based on difference-rate entropy features and a generative adversarial network, and can detect whether a sequence segment contains an abnormal pattern, thereby providing a basis for judging the occurrence of coal mine thermodynamic disasters; the new anomaly score calculation method improves the accuracy and generalization of model identification, giving the method high application value.
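Using the two figures reported in this embodiment (threshold 0.375 and $O_{real}$ = 0.572), the decision rule of step D3 can be sketched as follows. The function and constant names are hypothetical.

```python
def contains_abnormal_pattern(o_real: float, threshold: float) -> bool:
    """Step D3: abnormal only when the score is strictly greater than the threshold."""
    return o_real > threshold


THRESHOLD = 0.375  # threshold reported in step C2.2 of the embodiment
O_REAL = 0.572     # anomaly score reported in step D2 of the embodiment

is_abnormal = contains_abnormal_pattern(O_REAL, THRESHOLD)
```

Since 0.572 exceeds 0.375, the sample in the embodiment is classed as containing an abnormal pattern; a score exactly equal to the threshold would not be.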
Finally, it is noted that the disclosed embodiments are intended to aid further understanding of the invention, but those skilled in the art will appreciate that various substitutions and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the invention should not be limited to the disclosed embodiments; its scope is defined by the appended claims.
Claims (6)
1. A time series abnormal pattern detection method based on entropy features and a neural network, comprising the following steps:
A. extracting a second-order difference-rate sample entropy feature sequence from the time series in the training data set, implemented as follows:
A1. dividing the training data set into two sets, denoted training data set 1 and training data set 2;
training data set 1 contains only normal data, and training data set 2 contains both normal and abnormal data;
A2. sliding a window of size w with step length d over the time series of training data set 1 to segment it, obtaining a sequence segment set W of length L, where the i-th time series segment is denoted $s_i$ and is calculated as follows:
$s_i = [x_{1+(i-1)d}, x_{2+(i-1)d}, \ldots, x_{1+(i-1)d+w}]$
where $T_{train}$ denotes the number of time points in the training data set time series and $1 \times T_{train}$ denotes the dimension of that time series;
A3. performing the difference-rate operation on each sequence segment in the segment set W to obtain the second-order difference-rate sequences of all segments;
A4. extracting sample entropy features from the second-order difference-rate sequences of all segments to obtain their second-order difference-rate sample entropy feature sequences;
A5. applying segmented-average preprocessing to the difference-rate sample entropy sequence to obtain a new difference-rate sample entropy sequence;
B. training a generative adversarial network (GAN) model to obtain a generator and a corresponding discriminator, implemented as follows:
B1. randomly sampling noise data $Z = \{z_i, i = 1, 2, \ldots, n\}$, where n is the number of samples. The generator model G is built from several LSTM memory units, the number of which is set in advance; Z is input into the generator model G to generate the reconstructed sample sequence data G(Z);
B2. inputting the new difference-rate sample entropy sequence $S_t'$ and the generated reconstructed sample sequence data G(Z) into the constructed discriminator model D;
B3. updating the model parameters according to the value of the loss function: first updating the parameters of the discriminator using the stochastic gradient descent algorithm, then updating the parameters of the generator from the noise data using the Adam optimization algorithm;
B4. saving the model parameters and repeating steps B1-B3 iteratively, finally obtaining a trained generator model G that can generate normal time series, together with the corresponding discriminator model D;
C. calculating the anomaly score of the feature sequence and constructing a threshold, implemented as follows:
C1. repeating steps A2-A5 on the time series in training data set 2 and extracting features to obtain a new feature sequence;
C2. randomly sampling noise data $Z_{val}$, inputting it into the trained generator G to generate the reconstructed sample $G(Z_{val})$, and using the generation error to calculate the generation anomaly score $R_{score}$ of the input sample;
C3. using the discriminator D trained in step B to output the similarity probability P between the generated sample and the new feature sequence, and calculating the discriminant anomaly score $D_{score} = 1 - P$;
C4. using the discriminant anomaly score $D_{score}$ and the generation anomaly score $R_{score}$ to calculate the anomaly score O, and establishing a threshold from training data set 2, the calculation formula being:
$O = W_D \times D_{score} + W_G \times R_{score}$
where $W_D$ and $W_G$ are the weights of the discriminant anomaly score and the sample generation anomaly score, respectively;
D. judging whether the input data to be detected is abnormal according to the threshold, specifically implemented as follows:
D1. inputting the time series of the data set to be detected, repeating steps A1-A5, and extracting the difference-rate sample entropy features to obtain a new time series;
D2. repeating steps C1-C4: inputting the new time series into the trained generative adversarial network, and calculating the anomaly score $O_{real}$ of the data to be detected using formula 10;
D3. comparing the calculated anomaly score $O_{real}$ with the threshold obtained in step C; if the anomaly score is greater than the threshold, the data to be detected is judged to contain an abnormal pattern; otherwise, it is judged not to contain an abnormal pattern.
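The scoring and decision steps C3, C4 and D3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the default weights of 0.5 and the example numbers are hypothetical and are not given in the claim.

```python
def discriminant_score(p_similar: float) -> float:
    """D_score = 1 - P, where P is the discriminator's similarity probability."""
    if not 0.0 <= p_similar <= 1.0:
        raise ValueError("P must be a probability in [0, 1]")
    return 1.0 - p_similar


def anomaly_score(d_score: float, r_score: float,
                  w_d: float = 0.5, w_g: float = 0.5) -> float:
    """O = W_D * D_score + W_G * R_score (default weights are assumed values)."""
    return w_d * d_score + w_g * r_score


def contains_abnormal_pattern(o_real: float, threshold: float) -> bool:
    """Step D3: abnormal only if the score is strictly greater than the threshold."""
    return o_real > threshold


# Illustrative numbers only, not taken from the patent.
d = discriminant_score(0.8)
o = anomaly_score(d, r_score=0.6)
print(contains_abnormal_pattern(o, threshold=0.3))
```

A score exactly equal to the threshold is treated as normal here, following the "larger than the threshold" wording of step D3.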
2. The time series abnormal pattern detection method based on entropy features and a neural network as claimed in claim 1, wherein the difference-rate operation is performed on each sequence segment in the sequence segment set W to obtain the second-order difference-rate sequences of all sequence segments, specifically implemented as follows:
A3.1. for a sequence segment $s_i$, calculating its second-order difference-rate sequence $G = \{g_1, g_2, \dots, g_{w'}\}$ and its standard deviation std, wherein the calculation formula is as follows:
where the two difference terms are the e-th order difference value at time point u and the e-th order difference value at time point u−1, respectively;
A3.2. dividing the second-order difference-rate sequence of w' data points into subsequences of m consecutive data points each, giving w' − m + 1 subsequence fragments in total, denoted $K2_i = \{q_1, q_2, \dots, q_{w'-m+1}\}$.
3. The time series abnormal pattern detection method based on entropy features and a neural network as claimed in claim 1, wherein sample entropy features are extracted from the second-order difference-rate sequences of all sequence segments to obtain the second-order difference-rate sample entropy feature sequences of all sequence segments, specifically implemented as follows:
A4.1. calculating the distance $D[q_a, q_b]$ between any two subsequence fragments $q_a$ and $q_b$, the distance being the maximum difference between elements at corresponding positions in the two subsequence fragments;
A4.2. calculating the similarity probability between subsequence fragment $q_a$ and the remaining subsequence fragments, namely the proportion of subsequence fragments whose distance to $q_a$ is smaller than a threshold; the average of these similarity probabilities over the second-order difference-rate sequence is used in computing the second-order difference-rate sample entropy, wherein the calculation formula is as follows:
where r is the similarity threshold;
A4.3. following steps A4.1-A4.2, recalculating the average similarity probability $B^{m+1}(r)$ with m + 1 as the subsequence length, and computing the second-order difference-rate sample entropy feature SE as follows:
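Steps A4.1 to A4.3 follow the general shape of the standard sample entropy computation, which can be sketched as below. The formulas of this claim are not reproduced in the text, so the sketch assumes the textbook definition $SE = -\ln\left(B^{m+1}(r) / B^{m}(r)\right)$; the function names are hypothetical, and the patent's exact normalisation may differ.

```python
import math
from typing import List, Sequence


def subsequences(seq: Sequence[float], m: int) -> List[Sequence[float]]:
    """All len(seq) - m + 1 overlapping windows of length m."""
    return [seq[i:i + m] for i in range(len(seq) - m + 1)]


def chebyshev(a: Sequence[float], b: Sequence[float]) -> float:
    """Step A4.1: largest absolute difference between corresponding elements."""
    return max(abs(x - y) for x, y in zip(a, b))


def avg_similarity(seq: Sequence[float], m: int, r: float) -> float:
    """Mean, over windows q_a, of the share of other windows closer than r."""
    windows = subsequences(seq, m)
    n = len(windows)
    if n < 2:
        return 0.0
    total = 0.0
    for a in range(n):
        close = sum(1 for b in range(n)
                    if b != a and chebyshev(windows[a], windows[b]) < r)
        total += close / (n - 1)
    return total / n


def sample_entropy(seq: Sequence[float], m: int, r: float) -> float:
    """Assumed textbook form: -ln(B^{m+1}(r) / B^m(r))."""
    b_m = avg_similarity(seq, m, r)
    b_m1 = avg_similarity(seq, m + 1, r)
    if b_m == 0.0 or b_m1 == 0.0:
        return float("inf")  # no similar windows: entropy is unbounded
    return -math.log(b_m1 / b_m)
```

A perfectly regular input such as a constant sequence gives an entropy of zero, while irregular inputs give larger values, which is what makes the feature useful for separating normal from abnormal segments.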
4. The time series abnormal pattern detection method based on entropy features and a neural network as claimed in claim 1, wherein piecewise-average preprocessing is performed on the difference-rate sample entropy sequence to obtain a new difference-rate sample entropy sequence, specifically implemented as follows:
A5.1. starting from $X_t$ (t = 1, 2, ..., t−w), extracting a sequence segment of length w, $S_t = \{X_t, X_{t+1}, \dots, X_{w+t-1}\}_{1 \times t}$, summing its elements and then averaging, wherein the calculation formulas are as follows:
$sum_t = X_t + X_{t+1} + \dots + X_{w+t-1}$
$sum_t = sum_t / w$;
A5.2. repeating step A5.1 until t−w sequence segments in total have been extracted, and forming the values $sum_t$ into the new difference-rate sample entropy sequence $S_t' = \{sum_1, sum_2, \dots, sum_{t-w}\}_{1 \times t}$.
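The piecewise averaging of this claim amounts to a sliding-window mean, sketched below in Python. The function name is hypothetical. The sketch returns every full window of length w (N − w + 1 values for an input of length N), which may differ by one from the segment count stated in the claim.

```python
from typing import List, Sequence


def windowed_means(x: Sequence[float], w: int) -> List[float]:
    """For each start t, sum the w values X_t .. X_{t+w-1} and divide by w."""
    if w <= 0 or w > len(x):
        raise ValueError("window length w must satisfy 0 < w <= len(x)")
    return [sum(x[t:t + w]) / w for t in range(len(x) - w + 1)]


# Example: a window of length 2 over five points yields four averages.
print(windowed_means([1, 2, 3, 4, 5], 2))
```

Averaging over overlapping windows smooths the entropy sequence, so isolated spikes are damped before the sequence is passed to the network.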
5. The time series abnormal pattern detection method based on entropy features and a neural network as claimed in claim 1, wherein the noise data $Z_{val}$ is randomly sampled and input into the trained generator G to generate a reconstructed sample $G(Z_{val})$, and the generation anomaly score $R_{score}$ of the input sample is calculated from the generation error, specifically implemented as follows:
C2.1. calculating the absolute error E between the reconstructed sample $G(Z_{val})$ of length n and the new feature sequence of training data set 2; sorting the elements of E in ascending order to obtain the sorted absolute error $E_i' = \{e'_1, e'_2, \dots, e'_n\}$, and calculating the mean M of $E_i'$;
C2.2. comparing the elements of $E_i'$ with the mean M, and taking out the elements $\{e'_k, e'_{k+1}, \dots, e'_n\}$ of $E_i'$ that are larger than M, n − k + 1 in number; initializing the weight sequence $W_i' = \{w'_1, w'_2, \dots, w'_n\}^T$, in which the weights $w'_{1 \sim n-2}$ are set, the weight $w'_n$ corresponding to $x'_n$ is λ, and the weight $w'_{n-1}$ corresponding to $x'_{n-1}$ is 1 − λ; then updating the sizes of the elements of $W_i'$, wherein the calculation formula is as follows:
C2.3. using the updated weight sequence $W_i'$ and the sorted errors $E_i'$ to calculate the generation anomaly score $R_{score}$ of training sample set 2, wherein the calculation formula is as follows:
$R_{score} = E_i' \cdot W_i'$.
6. The time series abnormal pattern detection method based on entropy features and a neural network as claimed in claim 1, wherein the anomaly score O is calculated by formula 10 from the discriminant anomaly score $D_{score}$ and the generation anomaly score $R_{score}$, and a threshold is established according to training data set 2, specifically implemented as follows:
C4.1. taking the maximum and minimum anomaly scores in the results for the training data set as the upper and lower bounds, dividing the interval between them evenly, and calculating the q-th anomaly score for training data set 2, wherein the calculation formula is as follows:
C4.2. taking the anomaly score corresponding to the maximum F1 score as the threshold, wherein F1 is calculated as follows:
$F1 = \frac{2 \times Pre \times Rec}{Pre + Rec}$, $Pre = \frac{TP}{TP + FP}$, $Rec = \frac{TP}{TP + FN}$
where Pre is the proportion of true positive samples among all samples predicted to be positive; Rec is the proportion of samples predicted to be positive among all positive samples; TP is the number of positive samples predicted to be positive by the model; FP is the number of negative samples predicted to be positive by the model; and FN is the number of positive samples predicted to be negative by the model.
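The threshold search of this claim can be sketched as a scan over evenly spaced candidate scores. The candidate formula of step C4.1 is not reproduced in the text, so the linear spacing below is an assumption, as are the function names, the `steps` parameter and the availability of ground-truth labels for training data set 2.

```python
from typing import List, Sequence, Tuple


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2 * Pre * Rec / (Pre + Rec), with Pre = TP/(TP+FP), Rec = TP/(TP+FN)."""
    if tp == 0:
        return 0.0
    pre = tp / (tp + fp)
    rec = tp / (tp + fn)
    return 2 * pre * rec / (pre + rec)


def candidate_thresholds(scores: Sequence[float], steps: int) -> List[float]:
    """Evenly divide [min score, max score] into `steps` intervals (assumed form)."""
    lo, hi = min(scores), max(scores)
    return [lo + q * (hi - lo) / steps for q in range(steps + 1)]


def best_threshold(scores: Sequence[float], labels: Sequence[bool],
                   steps: int = 10) -> Tuple[float, float]:
    """Return the candidate with the highest F1 (first one on ties) and that F1."""
    best_t, best_f1 = min(scores), -1.0
    for t in candidate_thresholds(scores, steps):
        tp = sum(1 for s, y in zip(scores, labels) if s > t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s > t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s <= t and y)
        f1 = f1_score(tp, fp, fn)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


scores = [0.0, 1.0, 2.0, 8.0, 10.0]
labels = [False, False, False, True, True]
print(best_threshold(scores, labels, steps=4))
```

Choosing the first candidate on ties gives the lowest threshold among equally good ones; whether the patent resolves ties this way is not stated.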
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011116876.4A CN112257917B (en) | 2020-10-19 | 2020-10-19 | Time sequence abnormal mode detection method based on entropy characteristics and neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112257917A (en) | 2021-01-22 |
CN112257917B (en) | 2023-05-12 |
Family
ID=74244702
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011116876.4A Active CN112257917B (en) | 2020-10-19 | 2020-10-19 | Time sequence abnormal mode detection method based on entropy characteristics and neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112257917B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2001092990A2 (en) * | 2000-06-01 | 2001-12-06 | Variagenics, Inc. | Structure-based methods for assessing amino acid variances |
CN103886405A (en) * | 2014-02-20 | 2014-06-25 | 东南大学 | Boiler combustion condition identification method based on information entropy characteristics and probability nerve network |
CN109035488A (en) * | 2018-08-07 | 2018-12-18 | 哈尔滨工业大学(威海) | Aero-engine time series method for detecting abnormality based on CNN feature extraction |
CN110071913A (en) * | 2019-03-26 | 2019-07-30 | 同济大学 | A kind of time series method for detecting abnormality based on unsupervised learning |
CN110211114A (en) * | 2019-06-03 | 2019-09-06 | 浙江大学 | A kind of scarce visible detection method of the vanning based on deep learning |
CN110598851A (en) * | 2019-08-29 | 2019-12-20 | 北京航空航天大学合肥创新研究院 | Time series data abnormity detection method fusing LSTM and GAN |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113127705A (en) * | 2021-04-02 | 2021-07-16 | 西华大学 | Heterogeneous bidirectional generation countermeasure network model and time sequence anomaly detection method |
CN113127705B (en) * | 2021-04-02 | 2022-08-05 | 西华大学 | Heterogeneous bidirectional generation countermeasure network model and time sequence anomaly detection method |
CN114386454A (en) * | 2021-12-09 | 2022-04-22 | 首都医科大学附属北京友谊医院 | Medical time sequence signal data processing method based on signal mixing strategy |
CN114386454B (en) * | 2021-12-09 | 2023-02-03 | 首都医科大学附属北京友谊医院 | Medical time sequence signal data processing method based on signal mixing strategy |
CN114844796A (en) * | 2022-04-29 | 2022-08-02 | 济南浪潮数据技术有限公司 | Method, device and medium for detecting abnormity of time-series KPI |
CN115600116A (en) * | 2022-12-15 | 2023-01-13 | 西南石油大学(Cn) | Dynamic detection method, system, storage medium and terminal for time series abnormity |
Also Published As
Publication number | Publication date |
---|---|
CN112257917B (en) | 2023-05-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112257917B (en) | Time sequence abnormal mode detection method based on entropy characteristics and neural network | |
CN113434357B (en) | Log anomaly detection method and device based on sequence prediction | |
CN110213222B (en) | Network intrusion detection method based on machine learning | |
CN111914873A (en) | Two-stage cloud server unsupervised anomaly prediction method | |
CN107194524B (en) | RBF neural network-based coal and gas outburst prediction method | |
KR102361423B1 (en) | Artificial intelligence system and method for predicting maintenance demand | |
CN112199670B (en) | Log monitoring method for improving IFOREST (entry face detection sequence) to conduct abnormity detection based on deep learning | |
CN108446714B (en) | Method for predicting residual life of non-Markov degradation system under multiple working conditions | |
CN113505826B (en) | Network flow anomaly detection method based on joint feature selection | |
CN112761628A (en) | Shale gas yield determination method and device based on long-term and short-term memory neural network | |
CN112329974B (en) | LSTM-RNN-based civil aviation security event behavior subject identification and prediction method and system | |
CN111881299B (en) | Outlier event detection and identification method based on replicated neural network | |
CN114281864A (en) | Correlation analysis method for power network alarm information | |
CN113806889A (en) | Processing method, device and equipment of TBM cutter head torque real-time prediction model | |
CN115018512A (en) | Electricity stealing detection method and device based on Transformer neural network | |
Li et al. | A rockburst prediction model based on extreme learning machine with improved Harris Hawks optimization and its application | |
CN114742165A (en) | Aero-engine gas circuit performance abnormity detection system based on depth self-encoder | |
CN110991363B (en) | Method for extracting CO emission characteristics of coal mine safety monitoring system in different coal mining processes | |
Yu et al. | Anomaly detection in unstructured logs using attention-based Bi-LSTM network | |
US20230401454A1 (en) | Method using weighted aggregated ensemble model for energy demand management of buildings | |
CN115017015B (en) | Method and system for detecting abnormal behavior of program in edge computing environment | |
CN115048873B (en) | Residual service life prediction system for aircraft engine | |
CN114826718A (en) | Multi-dimensional information-based internal network anomaly detection method and system | |
CN113326371B (en) | Event extraction method integrating pre-training language model and anti-noise interference remote supervision information | |
CN111967494B (en) | Multi-source heterogeneous data analysis method for guard security of large movable public security system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||