CN115345202A

CN115345202A - Third-party load aggregation platform interactive data anomaly detection method and system

Info

Publication number: CN115345202A
Application number: CN202210987513.0A
Authority: CN
Inventors: 郭静; 郭雅娟; 黄伟; 姜海涛; 王梓莹; 顾智敏; 李岩; 赵新冬; 秦冉; 冒佳明; 娄征; 徐江涛; 毕晓甜; 孙云晓; 周超
Original assignee: State Grid Jiangsu Electric Power Co Ltd; Electric Power Research Institute of State Grid Jiangsu Electric Power Co Ltd
Current assignee: State Grid Jiangsu Electric Power Co Ltd; Electric Power Research Institute of State Grid Jiangsu Electric Power Co Ltd
Priority date: 2022-08-17
Filing date: 2022-08-17
Publication date: 2022-11-15
Anticipated expiration: 2042-08-17
Also published as: CN115345202B

Abstract

The invention discloses a method and a system for detecting anomalies in the interactive data of a third-party load aggregation platform, in the field of data anomaly detection. The method comprises: inputting the interactive data into a pre-trained data anomaly detection model, and judging through the model whether the interactive data is anomalous. The model is trained as follows: denoising the interactive initial data through a discrete dyadic wavelet transform to obtain training interactive data; extracting the training interactive data in batches k times to form data samples; processing the data samples with a spectral clustering algorithm to obtain a candidate training data set; comparing the Jaccard similarity coefficient with a set threshold to screen out abnormal data points and normal data points, and constructing a training data set; and training a deep residual learning network on the training data set to obtain the data anomaly detection model. The method and the system can protect the interaction security of third-party services in the new-type power system, and offer strong computing capability, high efficiency and robustness to data noise.

Description

Third-party load aggregation platform interactive data anomaly detection method and system

Technical Field

The invention belongs to the field of data anomaly detection, and particularly relates to a third-party load aggregation platform interactive data anomaly detection method and system.

Background

With the rapid development of the national economy, demand for industrial and residential electricity is rising year by year, and the power system is under great supply pressure. On the one hand, the loads on the distribution network and the consumption network are increasing, which brings heavy load pressure and potential safety hazards to the normal supply and operation of the power system. On the other hand, the third-party load aggregation platform on the demand side, which covers load objects including but not limited to distributed photovoltaics, building loads (air conditioning, lighting, power, etc.), industrial loads (such as cement plants, equipment manufacturing plants and component manufacturing plants) and customer-side distribution rooms (mainly 10 kV and 35 kV), requires comprehensive and efficient planning of supply and demand resources and efficient, reasonable allocation of power resources, so as to improve the power utilization efficiency of the third-party platform and the safe operating efficiency of the grid. In addition, when a fault in the supply system or the distribution network causes the supply or distribution voltage and frequency to drop, the third-party load aggregation platform and other demand-side response technologies face voltage and frequency regulation demands from the supply or distribution network, which can seriously affect the data transmission and the data security and stability of that network.

Disclosure of Invention

The invention aims to provide a third-party load aggregation platform interaction data anomaly detection method and system, which can protect the interaction safety of third-party services in a novel power system and have the characteristics of strong computing capability, high efficiency and data noise resistance.

In order to achieve the purpose, the technical scheme adopted by the invention is as follows:

the invention provides a third-party load aggregation platform interactive data anomaly detection method, which comprises the following steps:

inputting the interactive data into a pre-trained data anomaly detection model, and judging whether the interactive data is abnormal data or not through the data anomaly detection model;

the training process of the data anomaly detection model comprises the following steps:

acquiring interactive initial data of a third-party load aggregation platform; performing a discrete dyadic wavelet transform on the interactive initial data, then searching for the modulus maximum points scale by scale; denoising the interactive initial data according to the modulus maximum points to obtain training interactive data;

extracting the training interactive data in batches k times to form data samples; processing the data samples by a spectral clustering algorithm to obtain a candidate training data set; calculating Jaccard similarity coefficients of the candidate training data set; comparing the Jaccard similarity coefficient with a set threshold value to screen out abnormal data points and normal data points, and constructing a training data set;

and training the deep residual learning network by utilizing a training data set to obtain a data anomaly detection model with the classification error within a set range.

Preferably, the discrete dyadic wavelet transform of the interactive initial data is denoted $(W_\Psi f)(a, b)$ and is expressed as:

$$(W_\Psi f)(a, b) = |a|^{-1/2} \int_{-\infty}^{+\infty} f(t)\, \overline{\Psi\!\left(\frac{t - b}{a}\right)}\, dt$$

the admissibility condition on $\Psi$ is:

$$C_\Psi = \int_{-\infty}^{+\infty} \frac{|\hat{\Psi}(\omega)|^2}{|\omega|}\, d\omega < \infty$$

in the formulas, $f(t)$ is the interactive initial data of the third-party load aggregation platform; $t$ is the time-domain variable, $t \in \mathbb{R}$; $a$ is the scale factor; $b$ is the translation factor; $\overline{\Psi}$ is the complex conjugate of $\Psi$; and $\hat{\Psi}(\omega)$ is the base wavelet expressed in the frequency domain.
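The transform above can be approximated numerically on sampled data. The following is a minimal sketch, not the patented implementation: it assumes a real-valued Mexican-hat mother wavelet (the text does not name a specific wavelet), a uniform time grid, and a plain Riemann sum in place of the integral. The names `mexican_hat` and `wavelet_coefficient` are hypothetical.

```python
import math

def mexican_hat(t):
    """Real-valued, unnormalised mother wavelet; an illustrative choice only."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def wavelet_coefficient(samples, times, a, b, psi=mexican_hat):
    """Approximate (W_psi f)(a, b) by a Riemann sum on a uniform time grid."""
    if a == 0:
        raise ValueError("scale factor a must be non-zero")
    dt = times[1] - times[0]
    total = 0.0
    for f_t, t in zip(samples, times):
        # psi is real here, so its complex conjugate is psi itself
        total += f_t * psi((t - b) / a)
    return total * dt / math.sqrt(abs(a))

# Uniform grid on [-20, 20] with step 0.05
times = [-20.0 + 0.05 * n for n in range(801)]
constant = [1.0] * len(times)
wavelet_shaped = [mexican_hat(t) for t in times]

# A constant signal has (almost) no wavelet response, because the wavelet has zero mean
coef_constant = wavelet_coefficient(constant, times, a=1.0, b=0.0)
# A signal shaped like the wavelet itself responds strongly at a = 1, b = 0
coef_matched = wavelet_coefficient(wavelet_shaped, times, a=1.0, b=0.0)
```

Restricting the scale factor to $a = 2^h$ gives the dyadic scales used in the denoising step.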

Preferably, the method for searching for the modulus maximum points scale by scale and denoising the interactive initial data according to the modulus maximum points to obtain the training interactive data comprises the following steps:

screening out, according to amplitude, the modulus maximum points of the interactive initial data on scale $2^1$, the criterion being:

$$\log_2 |Wf(2^h, t)| \le \log_2 A + h\alpha$$

wherein $A$ and $\alpha$ are the smoothing coefficient and the Lipschitz exponent, respectively; $h$ is the scale level; and $Wf(2^h, t)$ is the amplitude of the interactive initial data on scale $2^h$;

determining the positions of the modulus maximum points on scale $2^2$ from the positions of the modulus maximum points on scale $2^1$; starting from scale $2^2$, searching downward from each maximum point on scale $2^2$ for its corresponding maximum line using an ad hoc algorithm;

removing the extreme points on scale $2^h$ ($h \ge 2$) that do not lie on any maximum line; then removing all modulus maximum points on scale $2^1$; and finally recovering the training interactive data through the inverse wavelet transform.
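The building blocks of this denoising step can be sketched in a few lines. The sketch below is a simplification under stated assumptions: `modulus_maxima` finds local maxima of the coefficient magnitude on one scale, `satisfies_decay_bound` evaluates the inequality above for a single coefficient, and `link_to_finer_scale` stands in for the ad hoc maximum-line search by pairing each maximum with the nearest maximum on the neighbouring scale within a hypothetical window `max_shift`. None of these names or the window rule come from the text.

```python
import math

def modulus_maxima(coeffs):
    """Indices where |coeffs| is a local maximum (>= both neighbours, > at least one)."""
    mags = [abs(c) for c in coeffs]
    maxima = []
    for i in range(1, len(mags) - 1):
        left, mid, right = mags[i - 1], mags[i], mags[i + 1]
        if mid >= left and mid >= right and (mid > left or mid > right):
            maxima.append(i)
    return maxima

def satisfies_decay_bound(magnitude, h, A, alpha):
    """Check log2|Wf(2^h, t)| <= log2(A) + h * alpha for one coefficient magnitude."""
    if magnitude == 0:
        return True  # log2(0) tends to -infinity, so the bound holds trivially
    return math.log2(magnitude) <= math.log2(A) + h * alpha

def link_to_finer_scale(coarse_maxima, fine_maxima, max_shift):
    """Pair each coarse-scale maximum with the nearest fine-scale maximum, if close enough."""
    links = {}
    for c in coarse_maxima:
        nearby = [f for f in fine_maxima if abs(f - c) <= max_shift]
        if nearby:
            links[c] = min(nearby, key=lambda f: abs(f - c))
    return links

coeffs = [0.0, 0.2, -1.5, 0.3, 0.1, 0.9, 0.4]
maxima = modulus_maxima(coeffs)
links = link_to_finer_scale([10, 40], [9, 25, 43], max_shift=3)
```

In this simplified picture, a fine-scale maximum that no coarser maximum links to (index 25 in the example) is a candidate for removal as noise.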

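Preferably, the training interactive data is finally recovered through the inverse wavelet transform according to:

$$f(t) = \frac{1}{C_\Psi} \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} \frac{1}{a^2}\, (W_\Psi f)(a, b)\, \Psi_{a,b}(t)\, da\, db$$

in the formula, $\Psi_{a,b}(t) = |a|^{-1/2}\, \Psi\!\left(\frac{t - b}{a}\right)$ is the continuous wavelet sequence obtained by dilating and translating the base wavelet, and $C_\Psi$ is the finite constant given by the admissibility condition on $\Psi$.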

Preferably, the method for processing the data samples by the spectral clustering algorithm to obtain the candidate training data set includes:

calculating the similarity between the data samples in the interactive data, expressed as:

$$s_{ij} = \exp\!\left(-\frac{\lVert e_i - e_j \rVert^2}{2\sigma^2}\right)$$

in the formula, $s_{ij}$ is the similarity between the i-th data sample and the j-th data sample; $e_i$ is the i-th data sample; $e_j$ is the j-th data sample; $\sigma$ is the similarity coefficient;

obtaining a similarity matrix W according to the similarity between the data samples, expressed as:

$$W = (s_{ij})_{m \times m}$$

obtaining a degree matrix D according to the similarity between the data samples, expressed as:

$$D = \mathrm{diag}(d_1, d_2, \ldots, d_m), \qquad d_i = s_{i1} + s_{i2} + \cdots + s_{im}, \quad 1 \le i \le m$$

calculating the eigenvectors x according to the degree matrix D and the similarity matrix W, using:

$$L = D^{-1}(D - W), \qquad Lx = \lambda x$$

in the formula, $\lambda$ is an eigenvalue of $L$; $x$ is the eigenvector corresponding to the eigenvalue $\lambda$; $D^{-1}$ is the inverse of the degree matrix D; $L$ is the Laplacian matrix;

and selecting k eigenvectors x in order to form a new matrix Q, and performing spectral clustering on the data samples through Q to obtain the candidate training data set.
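The spectral clustering steps above can be sketched with NumPy. This is an illustration under assumptions, not the patented implementation: a Gaussian kernel is assumed for the similarity, the k eigenvectors with the smallest eigenvalues are assumed to be the ones selected, and the rows of Q are grouped by a small k-means routine (the text does not say how Q is clustered). All function names are hypothetical.

```python
import numpy as np

def gaussian_similarity(samples, sigma):
    """Similarity matrix W with s_ij = exp(-||e_i - e_j||^2 / (2 sigma^2))."""
    x = np.asarray(samples, dtype=float)
    sq_dist = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def spectral_embedding(w, k):
    """Return the k smallest eigenvalues of L = D^-1 (D - W) and their eigenvectors Q."""
    d = w.sum(axis=1)                      # d_i = s_i1 + ... + s_im
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # L is similar to the symmetric matrix I - D^-1/2 W D^-1/2, so solve that
    # with eigh (real, ascending eigenvalues) and map the eigenvectors back.
    l_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    eigvals, u = np.linalg.eigh(l_sym)
    q = d_inv_sqrt[:, None] * u[:, :k]     # eigenvectors of L itself
    return eigvals[:k], q

def kmeans(points, k, iterations=50):
    """Small Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        dist = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[int(np.argmax(dist))])
    centers = np.array(centers)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iterations):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Two well-separated groups of toy data samples
samples = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
           [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
w = gaussian_similarity(samples, sigma=1.0)
eigvals, q = spectral_embedding(w, k=2)
labels = kmeans(q, k=2)
```

For data this cleanly separated, the rows of Q are nearly constant within each group, so the final grouping step is easy; real interaction data would need a tuned `sigma` and `k`.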

Preferably, the method for calculating the Jaccard similarity coefficient from the candidate training data set comprises the following steps:

calculating the Jaccard similarity coefficient of the candidate abnormal data points as:

$$J(A_i, A_j) = \frac{|A_i \cap A_j|}{|A_i \cup A_j|}$$

in the formula, $A_i$ is the i-th candidate abnormal data point of the third-party load aggregation platform interaction; $A_j$ is the j-th candidate abnormal data point of the third-party load aggregation platform interaction; and $J(A_i, A_j)$ is the Jaccard similarity coefficient of the candidate abnormal data points $A_i$ and $A_j$.
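A short sketch of the Jaccard computation and threshold comparison follows. It assumes each candidate data point is represented as a set of discrete feature tokens, which is a hypothetical encoding chosen for illustration. The text does not state which side of the threshold counts as abnormal, so the sketch only partitions pairs into those at or above the threshold and those below it.

```python
def jaccard(a, b):
    """Jaccard similarity coefficient |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # convention for two empty sets (an assumption)
    return len(a & b) / len(union)

def split_pairs_by_threshold(points, threshold):
    """Partition index pairs (i, j), i < j, by comparing J(A_i, A_j) with the threshold."""
    at_or_above, below = [], []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            target = at_or_above if jaccard(points[i], points[j]) >= threshold else below
            target.append((i, j))
    return at_or_above, below

# Hypothetical tokenised candidate points
candidates = [
    {"p_high", "v_low", "f_ok"},
    {"p_high", "v_low", "f_drop"},
    {"p_low", "v_ok", "f_ok"},
]
similar_pairs, dissimilar_pairs = split_pairs_by_threshold(candidates, threshold=0.4)
```

Here the first two candidates share two of four distinct tokens, giving a coefficient of 0.5, while the third shares at most one token with either of the others.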

Preferably, the method for constructing the deep residual learning network includes:

establishing a cross-layer skip connection of the deep residual network between every two weight layers of the deep convolutional neural network, the skip connections dividing the deep convolutional neural network into a plurality of residual learning structure units;

and constructing the deep residual learning network from the plurality of residual learning structure units and a cascaded feature layer.
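The structure described above can be illustrated with a forward pass in NumPy. This is a sketch under assumptions rather than the network of the invention: fully connected weight layers replace the convolutional layers for brevity, the skip connection is taken to be an identity mapping, and the cascaded feature layer is modelled as a linear layer followed by a softmax over two classes. The names `residual_unit` and `forward` are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_unit(x, w1, w2):
    """One residual learning unit: two weight layers plus an identity skip connection."""
    residual = w2 @ relu(w1 @ x)   # F(x), the mapping learned by the two weight layers
    return relu(residual + x)      # the skip connection adds the input back

def forward(x, units, w_out):
    """Stack residual units, then apply a linear + softmax layer over two classes."""
    for w1, w2 in units:
        x = residual_unit(x, w1, w2)
    logits = w_out @ x
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return x, exp / exp.sum()

x0 = np.array([0.5, 1.0, 0.0, 2.0])
# With all-zero weights every unit reduces to the identity on non-negative input
zero_units = [(np.zeros((4, 4)), np.zeros((4, 4))) for _ in range(3)]
w_out = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
features, probs = forward(x0, zero_units, w_out)
```

The zero-weight case shows why the skip connection helps: a unit that has learned nothing passes its input through unchanged instead of destroying it, so adding depth does not by itself degrade the signal.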

The second aspect of the present invention provides a third party load aggregation platform interaction data anomaly detection system, including:

the identification module is used for inputting the interactive data into a pre-trained data anomaly detection model and judging whether the interactive data is anomalous data or not through the data anomaly detection model;

the acquisition module is used for acquiring the interactive initial data of the third-party load aggregation platform;

the de-noising module is used for performing a discrete dyadic wavelet transform on the interactive initial data and then searching for the modulus maximum points scale by scale; denoising the interactive initial data according to the modulus maximum points to obtain training interactive data;

the classification module is used for extracting the training interactive data in batches k times to form data samples; processing the data samples by a spectral clustering algorithm to obtain a candidate training data set; calculating Jaccard similarity coefficients of the candidate training data set; comparing the Jaccard similarity coefficient with a set threshold value to screen out abnormal data points and normal data points, and constructing a training data set;

and the training module is used for training the deep residual learning network by utilizing the training data set to obtain a data anomaly detection model with the classification error within a set range.

A third aspect of the present invention provides a computer-readable storage medium, characterized in that a computer program is stored thereon, which, when being executed by a processor, carries out the steps of the interactive data anomaly detection method.

Compared with the prior art, the invention has the beneficial effects that:

The invention denoises the data with a discrete dyadic wavelet transform and processes the data samples with a spectral clustering algorithm to obtain a candidate training data set; calculates the Jaccard similarity coefficients of the candidate training data set; and compares the Jaccard similarity coefficient with a set threshold value to screen out abnormal data points and normal data points and construct a training data set. The spectral clustering and the wavelet denoising together make the training data set stable and reliable.

A deep residual learning network is trained on the training data set to obtain a data anomaly detection model whose classification error is within a set range. The interactive data is input into the pre-trained data anomaly detection model, which judges whether the interactive data is abnormal, thereby accomplishing the anomaly detection task for third-party load aggregation platform interactive data. The interactive data anomaly detection system therefore offers strong computing capability, high efficiency and robustness to data noise.

Drawings

Fig. 1 is a flowchart of a third-party load aggregation platform interactive data anomaly detection method according to an embodiment of the present invention;

Fig. 2 is a structural diagram of a residual learning structure unit provided by the present invention;

Fig. 3 is a structural diagram of a deep residual learning network provided by the present invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.

Example one

As shown in Fig. 1, the first aspect of the present invention provides a third-party load aggregation platform interactive data anomaly detection method, including:

acquiring interactive initial data of a third-party load aggregation platform; the discrete dyadic wavelet transform of the interactive initial data is denoted $(W_\Psi f)(a, b)$ and is expressed as:

$$(W_\Psi f)(a, b) = |a|^{-1/2} \int_{-\infty}^{+\infty} f(t)\, \overline{\Psi\!\left(\frac{t - b}{a}\right)}\, dt$$

the admissibility condition on $\Psi$ is:

$$C_\Psi = \int_{-\infty}^{+\infty} \frac{|\hat{\Psi}(\omega)|^2}{|\omega|}\, d\omega < \infty$$

in the formulas, $\overline{\Psi}$ is the complex conjugate of $\Psi$, and $\hat{\Psi}(\omega)$ is the base wavelet expressed in the frequency domain.

The method for searching for the modulus maximum points scale by scale and denoising the interactive initial data according to the modulus maximum points to obtain the training interactive data comprises the following steps:

screening out, according to amplitude, the modulus maximum points of the interactive initial data on scale $2^1$, the criterion being:

$$\log_2 |Wf(2^h, t)| \le \log_2 A + h\alpha$$

wherein $A$ and $\alpha$ are the smoothing coefficient and the Lipschitz exponent, respectively; $h$ is the scale level; and $Wf(2^h, t)$ is the amplitude of the interactive initial data on scale $2^h$;

removing the extreme points on scale $2^h$ ($h \ge 2$) that do not lie on any maximum line; then removing all modulus maximum points on scale $2^1$;

and finally recovering the training interactive data through the inverse wavelet transform according to:

$$f(t) = \frac{1}{C_\Psi} \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} \frac{1}{a^2}\, (W_\Psi f)(a, b)\, \Psi_{a,b}(t)\, da\, db$$

in the formula, $\Psi_{a,b}(t) = |a|^{-1/2}\, \Psi\!\left(\frac{t - b}{a}\right)$ is the continuous wavelet sequence obtained by dilating and translating the base wavelet, and $C_\Psi$ is the finite constant given by the admissibility condition on $\Psi$.

Extracting the training interactive data in batches k times to form data samples; the method for processing the data samples by the spectral clustering algorithm to obtain the candidate training data set comprises the following steps:

calculating the similarity between the data samples in the interactive data, expressed as:

$$s_{ij} = \exp\!\left(-\frac{\lVert e_i - e_j \rVert^2}{2\sigma^2}\right)$$

the similarities $s_{ij}$ form the similarity matrix $W$, and the degree matrix $D$ has the diagonal entries:

$$d_i = s_{i1} + s_{i2} + \cdots + s_{im}, \quad 1 \le i \le m$$

the eigenvectors x are calculated from:

$$L = D^{-1}(D - W), \qquad Lx = \lambda x$$

in the formula, $\lambda$ is an eigenvalue of $L$; $x$ is the eigenvector corresponding to the eigenvalue $\lambda$; $D^{-1}$ is the inverse of the degree matrix D; $L$ is the Laplacian matrix;

selecting k eigenvectors x in order to form a new matrix Q, and performing spectral clustering on the data samples through Q to obtain a candidate training data set. Owing to the spectral clustering algorithm, the interactive data anomaly detection method adapts well to the data distribution and has a small computational cost.

The method for calculating the Jaccard similarity coefficient from the candidate training data set comprises the following steps:

$$J(A_i, A_j) = \frac{|A_i \cap A_j|}{|A_i \cup A_j|}$$

in the formula, $A_i$ is the i-th candidate abnormal data point of the third-party load aggregation platform interaction; $A_j$ is the j-th candidate abnormal data point of the third-party load aggregation platform interaction; and $J(A_i, A_j)$ is the Jaccard similarity coefficient of the candidate abnormal data points $A_i$ and $A_j$.

Comparing the Jaccard similarity coefficient with a set threshold value to screen out abnormal data points and normal data points, and constructing a training data set; training the deep residual learning network by using the training data set to obtain a data anomaly detection model with a classification error within a set range; and inputting the interactive data into the pre-trained data anomaly detection model, and judging whether the interactive data is anomalous through the data anomaly detection model.

As shown in Figs. 2 and 3, the method for constructing the deep residual learning network includes: establishing a cross-layer skip connection of the deep residual network between every two weight layers of the deep convolutional neural network, the skip connections dividing the deep convolutional neural network into a plurality of residual learning structure units; and constructing the deep residual learning network from the plurality of residual learning structure units and a cascaded feature layer; the cascaded feature layer divides the training data set into abnormal data points and normal data points.

Because the attenuation of the error during back-propagation is reduced, the deep network can be trained successfully; the error of each residual learning structure unit is minimized, which ultimately minimizes the error of the data anomaly detection model. The vanishing-gradient phenomenon is thereby reduced, and the detection accuracy for third-party load aggregation platform interaction data is higher.
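Why the skip connection limits error attenuation can be made explicit with a short derivation. It is a sketch that assumes each residual learning structure unit computes an identity skip connection plus a learned mapping $F_l$, with no activation applied after the addition; the symbols $x_l$, $F_l$ and $\mathcal{E}$ (the classification error) are introduced here for illustration and do not appear in the text.

```latex
% Forward pass through units l, l+1, ..., L-1 with identity skip connections
x_{l+1} = x_l + F_l(x_l)
\quad\Longrightarrow\quad
x_L = x_l + \sum_{i=l}^{L-1} F_i(x_i)

% Back-propagated error at unit l (chain rule)
\frac{\partial \mathcal{E}}{\partial x_l}
  = \frac{\partial \mathcal{E}}{\partial x_L}
    \left( I + \frac{\partial}{\partial x_l} \sum_{i=l}^{L-1} F_i(x_i) \right)
```

The identity term $I$ carries the error from the last unit to unit $l$ without passing through any weight layer, so under these assumptions the gradient does not shrink to zero merely because the learned terms are small.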

Example two

As shown in Figs. 1 to 3, this example provides a third-party load aggregation platform interactive data anomaly detection system that can be applied to the interactive data anomaly detection method of Example one, the interactive data anomaly detection system including:

the de-noising module is used for performing a discrete dyadic wavelet transform on the interactive initial data and then searching for the modulus maximum points scale by scale; denoising the interactive initial data according to the modulus maximum points to obtain training interactive data;

the training module is used for training the deep residual learning network by utilizing a training data set to obtain a data anomaly detection model with a classification error within a set range; the deep residual learning network is formed by sequentially connecting a plurality of residual learning structure units and a cascaded feature layer; the cascaded feature layer divides the training data set into abnormal data points and normal data points.

Example three

A computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the interactive data anomaly detection method of Example one.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the technical principle of the present invention, and such improvements and modifications shall also fall within the protection scope of the present invention.

Claims

1. A third party load aggregation platform interactive data anomaly detection method is characterized by comprising the following steps:

inputting the interactive data into a pre-trained data anomaly detection model, and judging whether the interactive data is anomalous data or not through the data anomaly detection model;

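2. The method for detecting the abnormal interactive data of the third-party load aggregation platform according to claim 1, wherein the discrete dyadic wavelet transform of the interactive initial data is denoted $(W_\Psi f)(a, b)$ and is expressed as:

$$(W_\Psi f)(a, b) = |a|^{-1/2} \int_{-\infty}^{+\infty} f(t)\, \overline{\Psi\!\left(\frac{t - b}{a}\right)}\, dt$$

the admissibility condition on $\Psi$ is:

$$C_\Psi = \int_{-\infty}^{+\infty} \frac{|\hat{\Psi}(\omega)|^2}{|\omega|}\, d\omega < \infty$$

in the formulas, $\overline{\Psi}$ is the complex conjugate of $\Psi$, and $\hat{\Psi}(\omega)$ is the base wavelet expressed in the frequency domain.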

3. The method for detecting the abnormal interaction data of the third-party load aggregation platform as claimed in claim 2, wherein the method for searching for the modulus maximum points scale by scale and denoising the interaction initial data according to the modulus maximum points to obtain the training interaction data comprises the following steps:

$$\log_2 |Wf(2^h, t)| \le \log_2 A + h\alpha$$

removing the extreme points on scale $2^h$ ($h \ge 2$) that do not lie on any maximum line; then removing all modulus maximum points on scale $2^1$; and finally recovering the training interaction data through the inverse wavelet transform.

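4. The method for detecting the abnormal interaction data of the third-party load aggregation platform according to claim 3, wherein the training interaction data is recovered through the inverse wavelet transform according to:

$$f(t) = \frac{1}{C_\Psi} \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} \frac{1}{a^2}\, (W_\Psi f)(a, b)\, \Psi_{a,b}(t)\, da\, db$$

in the formula, $\Psi_{a,b}(t) = |a|^{-1/2}\, \Psi\!\left(\frac{t - b}{a}\right)$ is the continuous wavelet sequence obtained by dilating and translating the base wavelet, and $C_\Psi$ is the finite constant given by the admissibility condition on $\Psi$.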

5. The method for detecting the abnormal interaction data of the third-party load aggregation platform according to claim 1 or claim 4, wherein the method for processing the data samples by the spectral clustering algorithm to obtain the candidate training data set comprises the following steps:

$$s_{ij} = \exp\!\left(-\frac{\lVert e_i - e_j \rVert^2}{2\sigma^2}\right)$$

in the formula, $s_{ij}$ is the similarity between the i-th and j-th data samples; $e_i$ is the i-th data sample; $e_j$ is the j-th data sample; $\sigma$ is the similarity coefficient;

obtaining a degree matrix D according to the similarity among the data samples, expressed as:

$$d_i = s_{i1} + s_{i2} + \cdots + s_{im}, \quad 1 \le i \le m$$

$$L = D^{-1}(D - W), \qquad Lx = \lambda x$$

in the formula, $\lambda$ is an eigenvalue of $L$.

6. The method of claim 5, wherein the step of calculating the Jaccard similarity coefficient using the candidate training data set comprises:

$$J(A_i, A_j) = \frac{|A_i \cap A_j|}{|A_i \cup A_j|}$$

in the formula, $A_i$ is the i-th candidate abnormal data point of the third-party load aggregation platform interaction; $A_j$ is the j-th candidate abnormal data point of the third-party load aggregation platform interaction; and $J(A_i, A_j)$ is the Jaccard similarity coefficient of the candidate abnormal data points $A_i$ and $A_j$.

7. The method for detecting the abnormal interaction data of the third-party load aggregation platform according to claim 6, wherein the method for constructing the deep residual error learning network comprises the following steps:

8. A third party load aggregation platform interaction data anomaly detection system is characterized by comprising:

9. A computer-readable storage medium, characterized in that a computer program is stored thereon which, when executed by a processor, carries out the steps of the interactive data anomaly detection method according to any one of claims 1 to 7.