CN109462521B

CN109462521B - Network flow abnormity detection method suitable for source network load interaction industrial control system

Info

Publication number: CN109462521B
Application number: CN201811415563.1A
Authority: CN
Inventors: 吴克河; 张晓良; 何辉; 张明; 朱红勤; 余刚刚; 吴屹浩; 杨东锴
Original assignee: State Grid Jiangsu Electric Power Co Ltd; North China Electric Power University; Nanjing Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Current assignee: State Grid Jiangsu Electric Power Co Ltd; North China Electric Power University; Nanjing Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Priority date: 2018-11-26
Filing date: 2018-11-26
Publication date: 2020-11-20
Anticipated expiration: 2038-11-26
Also published as: CN109462521A

Abstract

The invention discloses a network flow abnormity detection method suitable for a source network load interaction industrial control system. The method adopts a two-layer classification mechanism. An OCSVM model first performs a first classification; this classifier passes most normal flows, and the model is adjusted so that as many abnormal flows as possible are detected. The data judged abnormal by the OCSVM model (which may include some normal flows) is then classified a second time by a GBDT algorithm. The second classification identifies the normal flows that were falsely detected in the first classification, and these flows are added to the samples for retraining, which improves detection accuracy. The invention achieves high detection efficiency while ensuring flow detection accuracy, and meets the flow detection requirements of an industrial control system with source network load interaction.

Description

Network flow abnormity detection method suitable for source network load interaction industrial control system

Technical Field

The invention belongs to the technical field of wireless communication, and particularly relates to a network flow abnormity detection method suitable for a source network load interaction industrial control system.

Background

With the construction of the global energy Internet and the rapid development of extra-high-voltage power grids and distributed energy, new loads with the dual characteristics of source and load, such as electric vehicles and controllable users, continue to emerge. The spatio-temporal distribution of power flow in the grid is becoming increasingly complex, and interaction and cooperative control between the power grid and its users are becoming ever more important and urgent.

Against the background of source network load interaction, industrial control systems are widely deployed in power supply companies, power plants and substations, and continue to extend to the new-energy generation side and the user side. Safety control is multi-level and multi-type, with frequent exchange of monitoring and control information. The acquisition, transmission and execution of operating information and control instructions are exposed to risks such as eavesdropping, tampering and interruption, and the access of large numbers of widely dispersed new-energy generation equipment and user equipment makes system security protection more difficult. Monitoring the network flow of the source network load system in real time and discovering network abnormalities promptly is therefore important to the stability and security of the system.

At present, abnormal flow is mainly detected as follows: labeled flow data is used to train a classifier that distinguishes normal flow data from abnormal flow data, and the classifier is then used to detect abnormal flow.

This approach trains on specific historical flow data; once the historical data is out of date, large errors can occur when judging the real-time network, so detection accuracy in practical applications is low. It is also difficult to achieve both accuracy and efficiency, so the approach cannot be used directly to detect network flow abnormity in an industrial control system with source network load interaction.

Disclosure of Invention

Purpose of the invention: to address the problems in the prior art, the invention provides a network flow abnormity detection method suitable for a source network load interaction industrial control system. The method adopts a self-learning two-layer detection model that updates itself through self-learning and adapts to changes in the environment, improving both detection accuracy and detection efficiency.

The technical scheme is as follows: in order to solve the technical problem, the invention provides a network flow abnormity detection method suitable for a source network load interaction industrial control system, which comprises the following steps:

(1) flow data in the source network load interaction industrial control system is collected in real time by a data acquisition module, the data characteristics of the flow data are counted, and the characteristic data of the flow is input into a data processing module for processing;

(2) the data processing module processes off-line sample data or on-line test data and applies the processed data to the first classification module;

(3) new sample data is formed from the training sample data 1 and the flow data held in the self-learning module obtained in step (6); the data is preprocessed by a data preprocessing module, and the preprocessed data is applied to a first training module;

(4) taking the data processed in step (2) as input, the first classification module, obtained by training in the first training module of step (3), detects whether the flow is normal; if so, the flow is output as normal, and if not, the method proceeds to step (6);

(5) the data processing module processes the training sample data 2, and the processed data is applied to a second training module;

(6) the second training module is trained on this data to obtain the second classification module; the abnormal flow data obtained in step (4) enters the second classification module, which detects whether the flow is normal; if the flow is normal, the data is added to the self-learning module and the method returns to step (3); if the flow is abnormal, the abnormal flow is output and an alarm is given.

Further, when the data processed by the data preprocessing module in the step (3) is applied to the first training module, dimension reduction processing is required, which specifically includes the following steps:

(3.1) centering the original d-dimensional sample data set according to the formula

$$X^{(i)} \leftarrow X^{(i)} - \frac{1}{n}\sum_{j=1}^{n} X^{(j)}$$

wherein the sample data is: $\{(X^{(1)}, y^{(1)}), (X^{(2)}, y^{(2)}), \ldots, (X^{(n)}, y^{(n)})\}$;

(3.2) constructing a covariance matrix of the samples; wherein the covariance formula is

$$\Sigma = \frac{1}{n}\sum_{i=1}^{n} X^{(i)} \left(X^{(i)}\right)^{T}$$

(3.3) calculating the eigenvalues of the covariance matrix and the corresponding eigenvectors; the eigenvectors of the covariance matrix represent the principal components, and the importance of each eigenvector is determined by the magnitude of its eigenvalue;

(3.4) selecting the k eigenvectors corresponding to the k largest eigenvalues;

(3.5) constructing a mapping matrix W from the k eigenvectors;

(3.6) reducing the d-dimensional data to a k-dimensional vector Z through the mapping matrix W: $Z^{(i)} = W^{T} X^{(i)}$.

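Further, the specific steps of detecting whether the flow is normal through the first classification module in step (4) are as follows:

(4.1) selecting an adjustable parameter $\nu$ and a kernel function $K(x, y)$;

(4.2) training with the sample data: solving

$$\min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_i\alpha_j K(x_i, x_j)\quad \text{s.t.}\quad 0 \le \alpha_i \le \frac{1}{\nu l},\ \sum_{i=1}^{l}\alpha_i = 1$$

to obtain $\alpha^{*}$, where $l$ is the number of training samples; choosing any component $\alpha_j^{*}$ that satisfies $0 < \alpha_j^{*} < \frac{1}{\nu l}$ and calculating

$$\rho^{*} = \sum_{i=1}^{l}\alpha_i^{*} K(x_i, x_j)$$

wherein the samples $x_i$ whose $\alpha_i^{*}$ satisfies $\alpha_i^{*} > 0$ are the support vectors;

(4.3) obtaining the decision function $f(x)$, constructed as

$$f(x) = \sum_{i=1}^{N_{sv}}\alpha_i^{*} K(x_i, x) - \rho^{*}$$

If $f(x) > 0$, the data is normal data; if $f(x) < 0$, the data is abnormal data, wherein $N_{sv}$ is the number of support vectors.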

Further, the specific steps of detecting whether the flow is normal through the second classification module in step (6) are as follows:

(6.1) beginning to generate the GBDT classifier;

(6.2) initializing the loss function

$$L(y, F) = \log\left(1 + e^{-2yF}\right),\quad y \in \{-1, 1\}$$

(6.3) initializing the classification model from said loss function:

$$F_0(x) = \frac{1}{2}\log\frac{1+\bar{y}}{1-\bar{y}}$$

where $\bar{y}$ is the mean of the sample labels;

(6.4) calculating the value of the negative gradient of the loss function at the current model and taking it as the estimate of the residual:

$$\tilde{y}_i = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F = F_{m-1}} = \frac{2y_i}{1 + e^{2y_iF_{m-1}(x_i)}}$$

(6.5) constructing a CART tree on the residuals $\tilde{y}_i$; each training sample is finally assigned to a corresponding leaf node, and the predicted value of leaf node $j$ is:

$$\gamma_{jm} = \arg\min_{\gamma}\sum_{x_i \in R_{jm}} L\left(y_i, F_{m-1}(x_i) + \gamma\right)$$

where $R_{jm}$ is the set of samples in leaf node $j$ of the $m$-th tree; updating $F_m(x) = F_{m-1}(x) + \gamma_{jm}$ for $x \in R_{jm}$ to obtain a stronger classification model;

(6.6) judging whether an ending condition is met, namely that the gradient reaches its minimum value or the number of iterations reaches the set value;

(6.7) generating the GBDT classification model $F(x)$; a threshold $\theta$ is set, and when $F(x) < \theta$ the flow is normal flow, and when $F(x) > \theta$ the flow is abnormal flow.

Further, the specific steps for constructing the CART tree in step (6.5) are as follows:

(6.5.1) starting to generate the CART tree;

(6.5.2) selecting a feature, and dividing all samples into left and right subtrees according to the feature value;

(6.5.3) calculating the variances of the left subtree and the right subtree respectively; let $\bar{y}_L$ be the mean of the node labels in the left subtree and $\bar{y}_R$ the mean of the node labels in the right subtree; the variance is calculated as

$$\mathrm{Var} = \sum_{i \in L}\left(y_i - \bar{y}_L\right)^{2} + \sum_{i \in R}\left(y_i - \bar{y}_R\right)^{2}$$

(6.5.4) selecting the split point that minimizes the sum of the variances of the left subtree and the right subtree, and performing one division;

(6.5.5) continuing to divide downward in the same way;

(6.5.6) determining whether an ending condition is satisfied; if the ending condition is met, outputting the generated CART tree and ending, and if the ending condition is not met, returning to step (6.5.4).

Further, the ending condition in step (6.5.6) is any of the following: the node is a pure node, that is, the target variable values of all its records are the same; the depth of the tree reaches a pre-specified maximum value; the maximum decrease in impurity is less than a pre-specified value; the number of records in the node is less than the pre-specified minimum number of records per node; or all records in the node have the same predictor variable values.

Further, the self-learning of the self-learning module in step (3) comprises the following steps:

(3.1) initializing the self-learning module, and setting the sample capacity and the condition that triggers training. The trigger condition may be set to a specific time or a specific state, such as a timed trigger;

(3.2) monitoring whether a flow has been falsely detected;

(3.3) if a flow has been falsely detected, combining it with the original training samples to form new training samples. If the training samples have not reached the set capacity, the flow is added directly; otherwise, it replaces the earliest training sample;

(3.4) judging whether the trigger condition is met. If the training trigger condition is not met, returning to step (3.2);

(3.5) if the trigger condition is met, retraining the classification model.

Compared with the prior art, the invention has the advantages that:

The invention adopts a two-layer classification mechanism. An OCSVM model first performs a first classification; this classifier passes most normal flow, and the model is adjusted so that as much abnormal flow as possible is detected. The data judged abnormal by the OCSVM model (which may include some normal flow) is then classified a second time by a GBDT algorithm. The second classification identifies the normal flow that was falsely detected in the first classification, and this flow is added to the samples for retraining, which improves detection accuracy. The method achieves high detection efficiency while ensuring flow detection accuracy, and meets the flow detection requirements of an industrial control system with source network load interaction. It can be used to detect network flow abnormity both in industrial control systems with source network load interaction and in other industrial control systems.

Drawings

FIG. 1 is an overall flow chart of the present invention;

FIG. 2 is a flow chart of the application of the data processed by the data preprocessing module to the first training module in FIG. 1;

FIG. 3 is a flowchart of the first classification module in FIG. 1 detecting whether the flow is normal;

FIG. 4 is a flowchart of the second classification module in FIG. 1 detecting whether the flow is normal;

FIG. 5 is a flowchart of constructing the CART tree in FIG. 4;

FIG. 6 is a flowchart of the self-learning in FIG. 1.

Detailed Description

The invention is further elucidated with reference to the drawings and the detailed description.

As shown in fig. 1 to 6, the present invention provides a method for detecting network traffic anomaly suitable for a source network load industrial control system, which includes the following steps:

The self-learning training system is composed of a data acquisition module, a data processing module, a first training module, a second training module, a first classification module, a second classification module and a self-learning module.

The data acquisition module acquires flow data in the industrial control system with source network load interaction in real time, counts data characteristics in the flow data, and inputs the characteristic data of the flow into the data processing module for processing.

When data is collected, the invention computes statistics on the data characteristics using a sliding time window. The window size is set to W and the sliding step to L (L < W), where W and L are numbers of data packets. During collection, the earliest L data packets in the window are replaced by L new data packets, statistics are computed, and the resulting characteristic values form a feature vector.
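As an illustration only, the sliding-window mechanism described above might be sketched in Python as follows. The function name `sliding_windows` and its parameters are assumptions introduced for this sketch; the patent does not prescribe an implementation.

```python
from collections import deque


def sliding_windows(packets, window_size, step):
    """Yield windows of `window_size` packets, advancing by `step` packets.

    `window_size` plays the role of W and `step` the role of L (L < W).
    """
    if not 0 < step < window_size:
        raise ValueError("step must satisfy 0 < step < window_size")
    window = deque(maxlen=window_size)  # oldest packets fall out automatically
    pending = 0  # packets received since the last emitted window
    for packet in packets:
        window.append(packet)
        pending += 1
        # Emit once the window is full and at least `step` new packets arrived.
        if len(window) == window_size and pending >= step:
            yield list(window)
            pending = 0


# Usage: W = 4, L = 2 over seven packets gives two full windows.
windows = list(sliding_windows(range(7), window_size=4, step=2))
```

Each yielded window would then be passed to the statistics stage to produce one feature vector.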

The data processing module processes off-line sample data or on-line test data, and the preprocessed data can be applied to the next module.

The first training module obtains the classification model of the first classification module by training on the processed sample data 1.

The second training module obtains the classification model of the second classification module by training on the processed sample data 2.

The first classification module takes the data processed by the data processing module as input and performs a primary classification of the corresponding flow. The first classification mainly addresses recognition efficiency, and its accuracy requirement is low.

The flow judged abnormal by the first classification module is judged more accurately by the second classification module. The second classification module mainly addresses judgment accuracy, and its efficiency requirement is low. If the second classification module judges the flow to be normal, the first classifier has made a false detection, and this flow data is input into the self-learning module.

The self-learning module combines this flow data with the original sample data 1 into new sample data, which is used for retraining by the first training module under a predetermined trigger condition, thereby improving the accuracy of the first classification module. The predetermined trigger condition may be set to a fixed time or the like.

The data acquisition module acquires flow data and extracts the characteristic data of the flow. The industrial control system with source network load interaction is mainly used to monitor and control the source network load system, and transmits data using specific acquisition and control protocols, such as the 104 protocol (IEC 60870-5-104) and the 61850 protocol (IEC 61850). In addition to conventional network attack behaviors, such as abnormal burst flow, port scanning, and vulnerability or service scanning, the source network load system also faces a special attack threat: flow whose attack payload conforms to the standard protocol format of the source network load system.

Normal traffic differs markedly from abnormal traffic. Abnormal traffic typically changes the distributions of the source IP address, destination IP address, source port, destination port and protocol type. In a source network load industrial control system, traffic whose attack payload conforms to the standard protocol format of the source network load system generally changes the distributions of type identifications, application service types and message formats.

To detect network flow abnormity in the source network load system and to detect different attacks, the flow data acquired by the data acquisition module comprises: source address, destination address, source port, destination port, transport layer protocol type, application layer protocol type, and the TCP flag fields ACK and SYN. To detect the special attack threats facing the source network load system, application-layer feature data of the source network load industrial control system must also be collected. Taking the 104 protocol as an example, this includes: type identification (including monitoring, control and file transmission type identifications), message length (messages can be divided into different levels according to length), message transmission reason, application service type, and frame type (including S frames, U frames and I frames).

After the data are collected, statistics is carried out through a sliding time window, and the obtained characteristic attributes comprise the following parts: the entropy value of the source IP address, the entropy value of the destination IP address, the entropy value of the source port, the entropy value of the destination port, the proportion of each transmission layer protocol, the proportion of each application layer protocol, the ratio of SYN number to ACK number, and the ratio of the packet sending number and the packet receiving number of the destination port in the window W. The characteristic attributes suitable for the source network load industrial control system further comprise: entropy of type identification, proportion of each length message, proportion of each transmission reason, entropy of application service type, and proportion of each format frame.
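A minimal sketch of how a few of these window statistics could be computed is given below. The packet dictionary keys (`src_ip`, `dst_ip`, `proto`, `syn`, `ack`), the base-2 logarithm, and the guard against a zero ACK count are all assumptions made for illustration; the patent does not specify them.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (base 2, an assumed choice) of a list of observed values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def proportions(values):
    """Share of each distinct value, e.g. the proportion of each protocol."""
    counts = Counter(values)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


def window_features(window):
    """Compute a subset of the per-window features from a list of packet dicts."""
    syn = sum(1 for p in window if p["syn"])
    ack = sum(1 for p in window if p["ack"])
    return {
        "src_ip_entropy": shannon_entropy([p["src_ip"] for p in window]),
        "dst_ip_entropy": shannon_entropy([p["dst_ip"] for p in window]),
        "protocol_proportions": proportions([p["proto"] for p in window]),
        # max(ack, 1) avoids division by zero; this guard is an assumption.
        "syn_ack_ratio": syn / max(ack, 1),
    }


example_window = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "proto": "TCP", "syn": True, "ack": False},
    {"src_ip": "10.0.0.2", "dst_ip": "10.0.0.9", "proto": "TCP", "syn": False, "ack": True},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "proto": "UDP", "syn": False, "ack": False},
    {"src_ip": "10.0.0.2", "dst_ip": "10.0.0.9", "proto": "TCP", "syn": False, "ack": True},
]
features = window_features(example_window)
```

The application-layer attributes (type identification entropy, proportion of each frame format, and so on) could be computed with the same two helper functions.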

In the first training module training stage, the data processing module performs data standardization and data feature selection.

For data feature selection, principal component analysis (PCA) is used to reduce the dimensionality of the feature data. The sample data is set as: $\{(X^{(1)}, y^{(1)}), (X^{(2)}, y^{(2)}), \ldots, (X^{(n)}, y^{(n)})\}$. Dimensionality reduction by principal component analysis comprises the following steps:

Step 201: center the original d-dimensional sample data set according to the formula

$$X^{(i)} \leftarrow X^{(i)} - \frac{1}{n}\sum_{j=1}^{n} X^{(j)}$$

Step 202: construct the covariance matrix of the samples. The covariance formula is

$$\Sigma = \frac{1}{n}\sum_{i=1}^{n} X^{(i)} \left(X^{(i)}\right)^{T}$$

Step 203: compute the eigenvalues of the covariance matrix and the corresponding eigenvectors. The eigenvectors of the covariance matrix represent the principal components, and the importance of each eigenvector is determined by the magnitude of its eigenvalue.

Step 204: select the k eigenvectors corresponding to the k largest eigenvalues.

Step 205: construct the mapping matrix W from the k eigenvectors.

Step 206: reduce the d-dimensional data to a k-dimensional vector Z through the mapping matrix W: $Z^{(i)} = W^{T} X^{(i)}$.
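The PCA steps above can be sketched in plain Python as follows. This is a hedged illustration, not the patented implementation: the 1/n covariance normalization is an assumption, and the eigenvectors are found by power iteration with deflation, a technique swapped in here so that the example needs no external library (any symmetric eigensolver could be used instead). All function names are hypothetical.

```python
import math


def center(data):
    """Subtract the per-feature mean from every sample."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    return [[row[j] - mean[j] for j in range(d)] for row in data]


def covariance(centered):
    """Covariance matrix of centered data (1/n normalization is assumed)."""
    n, d = len(centered), len(centered[0])
    return [[sum(row[a] * row[b] for row in centered) / n for b in range(d)]
            for a in range(d)]


def mat_vec(m, v):
    """Multiply matrix m by vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def top_eigenvectors(matrix, k, iterations=500):
    """Approximate the k leading eigenvectors by power iteration with deflation."""
    d = len(matrix)
    m = [row[:] for row in matrix]
    vectors = []
    for index in range(k):
        v = [1.0 / (1 + i + index) for i in range(d)]  # arbitrary non-zero start
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = mat_vec(m, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # remaining matrix is numerically zero
                break
            v = [x / norm for x in w]
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))
        vectors.append(v)
        # Deflation: remove the component just found before the next pass.
        m = [[m[a][b] - eigenvalue * v[a] * v[b] for b in range(d)] for a in range(d)]
    return vectors


def pca_project(data, k):
    """Project each centered sample onto the k leading eigenvectors (Z = W^T X)."""
    centered = center(data)
    w = top_eigenvectors(covariance(centered), k)
    return [[sum(c * x for c, x in zip(vec, row)) for vec in w] for row in centered]


# Usage: four 2-dimensional samples lying on one line reduce to 1 dimension.
reduced = pca_project([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]], k=1)
```

Power iteration converges slowly when the leading eigenvalues are close together, so a production system would more likely rely on a library eigensolver.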

The single-class support vector machine (OCSVM) can train an anomaly detection model from samples of only one class, detects anomalies accurately, and is computationally efficient. It addresses cases where only one class of samples is available for training the classifier. The idea of the standard SVM is to construct a generalized optimal separating surface such that the two classes of data points in the training set lie on opposite sides of the surface as far as possible, with the margin between the two classes as large as possible. The OCSVM instead treats the coordinate origin as the abnormal sample and constructs an optimal hyperplane in the feature space that maximizes the margin between the target data and the origin. The task of OCSVM classification is to find a function f(x): if f(x) is positive, the data x is considered normal; if f(x) is negative, the data x is considered abnormal.

In the industrial control system with source network load interaction, most data are normal data, and the single-type support vector machine has higher efficiency in detection. Therefore, in the first training module, a single-class support vector machine (OCSVM) is adopted for training, and a first classification module for classification is obtained. Namely: the OCSVM module comprises an OCSVM training module and an OCSVM classification module.

The single-class support vector machine solves the following quadratic programming problem:

$$\min_{\omega,\,\xi,\,\rho}\ \frac{1}{2}\|\omega\|^{2} + \frac{1}{\nu l}\sum_{i=1}^{l}\xi_i - \rho$$

s.t. $f(x_i) = \omega\cdot\phi(x_i) - \rho \ge -\xi_i$, $\xi_i \ge 0$

wherein $x_i$ is a sample in the original space, $l$ is the number of training samples, $\phi$ is the mapping from the original space to the feature space, $\omega$ and $\rho$ are respectively the normal vector and the offset of the required hyperplane in the feature space, the adjustable parameter $\nu \in (0, 1)$ is an upper bound on the proportion of error samples in the total number of samples, and the slack variable $\xi_i$ measures the degree to which a training sample is misclassified.

Selecting the radial kernel function

$$K(x, y) = \exp\left(-\frac{\|x - y\|^{2}}{2\sigma^{2}}\right)$$

and an adjustable parameter $\nu$, the following optimization problem is solved:

$$\min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_i\alpha_j K(x_i, x_j)\quad \text{s.t.}\quad 0 \le \alpha_i \le \frac{1}{\nu l},\ \sum_{i=1}^{l}\alpha_i = 1$$

to obtain $\alpha^{*}$. Choosing any component $\alpha_j^{*}$ that satisfies $0 < \alpha_j^{*} < \frac{1}{\nu l}$, calculate

$$\rho^{*} = \sum_{i=1}^{l}\alpha_i^{*} K(x_i, x_j)$$

wherein the samples $x_i$ whose $\alpha_i^{*}$ satisfies $\alpha_i^{*} > 0$ are the support vectors.

The decision function is constructed as

$$f(x) = \sum_{i=1}^{N_{sv}}\alpha_i^{*} K(x_i, x) - \rho^{*}$$

If $f(x) > 0$, +1 is returned, meaning the data is normal data; if $f(x) < 0$, -1 is returned, meaning the data is abnormal data. $N_{sv}$ is the number of support vectors.

To ensure the detection efficiency and accuracy of the whole method, a fault tolerance factor ζ > 0 is defined in the OCSVM classification module: when f(x) > ζ, +1 is returned; otherwise, -1 is returned. By adjusting ζ appropriately, a small amount of false detection of normal data is tolerated (that is, some normal flow is judged abnormal) so that abnormal flow is detected as far as possible.
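The effect of the fault tolerance factor can be illustrated with a small sketch of the decision stage only. It assumes that the support vectors, their coefficients and the offset have already been obtained from some OCSVM training procedure (not shown), and that a Gaussian radial kernel with a hypothetical width parameter `sigma` is used; the function names are likewise assumptions.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian radial kernel; the width parameter sigma is an assumed name."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def decision_value(x, support_vectors, alphas, rho, sigma=1.0):
    """f(x): weighted kernel sum over the support vectors minus the offset rho."""
    return sum(alpha * rbf_kernel(sv, x, sigma)
               for sv, alpha in zip(support_vectors, alphas)) - rho


def classify(x, support_vectors, alphas, rho, zeta=0.0, sigma=1.0):
    """Return +1 (normal) when f(x) > zeta, otherwise -1 (abnormal)."""
    return 1 if decision_value(x, support_vectors, alphas, rho, sigma) > zeta else -1


# Usage with a toy model: one support vector at the origin.
sv, alphas = [(0.0, 0.0)], [1.0]
strict = classify((0.0, 0.0), sv, alphas, rho=0.5, zeta=0.6)   # flagged as abnormal
lenient = classify((0.0, 0.0), sv, alphas, rho=0.5, zeta=0.0)  # accepted as normal
```

Raising `zeta` pushes more borderline samples to the second classifier, trading some first-stage false alarms for fewer missed anomalies.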

When the flow data is classified for the first time by the first classification module and the result is abnormal, the flow data must be classified more accurately by the second classifier. The flow judged abnormal by the first classification module may contain a small number of false detections (normal data judged as abnormal flow). The second classification module is more accurate than the first; it identifies the falsely detected flow and supplies it to the self-learning module for learning. The first classification module filters out a large amount of normal flow before the second classification module, which reduces the load on the second classification module.
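The routing between the two classifiers and the self-learning module might be sketched as below. The convention that each classifier is a callable returning +1 for normal and -1 for abnormal, and the use of a plain list as the self-learning store, are assumptions for illustration.

```python
def two_layer_detect(sample, first_classifier, second_classifier, self_learning_store):
    """Route one sample through the two-layer cascade.

    Both classifiers are assumed to return +1 (normal) or -1 (abnormal).
    """
    if first_classifier(sample) == 1:
        return "normal"  # fast path: most normal flow stops here
    if second_classifier(sample) == 1:
        # First-layer false detection: keep it for retraining the first layer.
        self_learning_store.append(sample)
        return "normal"
    return "abnormal"  # confirmed by both layers: raise an alarm


# Usage with toy threshold classifiers on a single numeric feature.
first = lambda s: 1 if s < 5 else -1
second = lambda s: 1 if s < 8 else -1
store = []
results = [two_layer_detect(s, first, second, store) for s in (3, 6, 9)]
```

Only samples rejected by the first layer reach the slower second layer, which is what keeps the overall detection efficient.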

In the second training module, the gradient boosting decision tree (GBDT) method is selected to train the classification model. A GBDT is an ensemble of weak classification models: each weak classifier gives a predicted value, and the predicted values are combined with certain weights to form the final predicted value. Generally, the goal of training is to find a model whose predicted value F(x) for the input variable approaches the true value y. The GBDT algorithm trains only one weak base model (weak classifier) at a time, so that the prediction of each base model approaches the part of the true value it is responsible for, and then combines the predictions of the base models by weighting. GBDT is sensitive to abnormal data and classifies well.

For the sample $\{(X^{(1)}, y^{(1)}), (X^{(2)}, y^{(2)}), \ldots, (X^{(n)}, y^{(n)})\}$, the purpose of the GBDT algorithm is to find a mapping $F(X)$ that minimizes the loss function $L(y, F(X))$, i.e.

$$F^{*} = \arg\min_{F}\sum_{i=1}^{n} L\left(y^{(i)}, F(X^{(i)})\right)$$

In the GBDT algorithm, gradient boosting requires a total of M iterations. Each iteration produces a model, and the model produced in each iteration is required to minimize the loss function on the training set. Using gradient descent, each iteration moves in the direction of the negative gradient of the loss function so that the loss becomes smaller and smaller, yielding an increasingly accurate model.

The GBDT algorithm includes the following steps:

Step 401: start generating the GBDT classifier.

Step 402: initialize the loss function.

In the present invention, the loss function is chosen as $L(y, F) = \log\left(1 + e^{-2yF}\right)$, $y \in \{-1, 1\}$.

Step 403: initialize the classification model from said loss function:

$$F_0(x) = \frac{1}{2}\log\frac{1+\bar{y}}{1-\bar{y}}$$

where $\bar{y}$ is the mean of the sample labels.

Step 404: calculate the value of the negative gradient of the loss function at the current model as the estimate of the residual:

$$\tilde{y}_i = \frac{2y_i}{1 + e^{2y_iF_{m-1}(x_i)}}$$

Step 405: construct a CART tree on the residuals $\tilde{y}_i$; each training sample is finally assigned to a corresponding leaf node with predicted value $\gamma_{jm}$. Update $F_m(x) = F_{m-1}(x) + \gamma_{jm}$ for the samples x in leaf node j to obtain a stronger classification model.

Step 406: judge whether the ending condition is met, that is, whether the gradient reaches its minimum or the number of iterations reaches the set value.

Step 407: generate the GBDT classification model $F(x)$. A threshold $\theta$ is set; when $F(x) < \theta$ the flow is normal flow, and when $F(x) > \theta$ the flow is abnormal flow.
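The loss-related quantities used in these steps can be checked with a short sketch. The initial score and the negative gradient below are obtained by differentiating the stated loss $\log(1 + e^{-2yF})$; the function names are assumptions, and tree fitting is deliberately left out.

```python
import math


def loss(y, score):
    """Loss L(y, F) = log(1 + exp(-2 * y * F)) with labels y in {-1, +1}."""
    return math.log(1.0 + math.exp(-2.0 * y * score))


def initial_score(labels):
    """Constant score minimizing the total loss; needs both classes present."""
    mean = sum(labels) / len(labels)
    return 0.5 * math.log((1.0 + mean) / (1.0 - mean))


def negative_gradient(y, score):
    """Residual estimate: minus the derivative of the loss with respect to F."""
    return 2.0 * y / (1.0 + math.exp(2.0 * y * score))


def is_abnormal(score, theta):
    """Threshold rule from step 407: F(x) > theta means abnormal flow."""
    return score > theta


# Usage: three positive labels and one negative label.
labels = [1, 1, 1, -1]
f0 = initial_score(labels)
residuals = [negative_gradient(y, f0) for y in labels]
```

At the initial score the residuals sum to zero, which is one way to confirm that the constant really minimizes the total loss.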

When the GBDT generates weak classifiers, CART trees are adopted. The CART tree is generated as follows:

Step 501: start generating the CART tree.

Step 502: select a feature, and divide all samples into a left subtree and a right subtree according to the feature value.

Step 503: calculate the variances of the left subtree and the right subtree respectively. Let $\bar{y}_L$ be the mean of the node labels in the left subtree and $\bar{y}_R$ the mean of the node labels in the right subtree. The variance is calculated as

$$\mathrm{Var} = \sum_{i \in L}\left(y_i - \bar{y}_L\right)^{2} + \sum_{i \in R}\left(y_i - \bar{y}_R\right)^{2}$$

Step 504: select the split point that minimizes the sum of the variances of the left subtree and the right subtree, and perform one division.

Step 505: continue dividing downward in the same way.

Step 506: judge whether the ending condition is met. The ending condition is any of the following: the node is a pure node, that is, the target variable values of all its records are the same; the depth of the tree reaches a pre-specified maximum value; the maximum decrease in impurity is less than a pre-specified value; the number of records in the node is less than the pre-specified minimum number of records per node; or all records in the node have the same predictor variable values.

Step 507: if the ending condition is met, output the generated CART tree.

Step 508: end the flow.
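The split selection in steps 502 to 504 can be sketched as an exhaustive search over features and thresholds. This is an illustrative assumption about how the search is organized: the variance is taken as the sum of squared deviations from the subtree mean, and the helper names are hypothetical.

```python
def sum_squared_deviation(labels):
    """Sum of squared deviations of the labels from their mean."""
    if not labels:
        return 0.0
    mean = sum(labels) / len(labels)
    return sum((y - mean) ** 2 for y in labels)


def best_split(samples, labels):
    """Return (feature_index, threshold, cost) with the smallest left+right cost.

    Samples with feature value <= threshold go left, the rest go right.
    Returns None when no split separates the samples.
    """
    best = None
    for feature in range(len(samples[0])):
        for threshold in sorted({row[feature] for row in samples}):
            left = [y for row, y in zip(samples, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(samples, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a valid split needs samples on both sides
            cost = sum_squared_deviation(left) + sum_squared_deviation(right)
            if best is None or cost < best[2]:
                best = (feature, threshold, cost)
    return best


# Usage: feature 0 separates the labels perfectly at threshold 2.0.
samples = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [4.0, 6.0]]
labels = [0.0, 0.0, 1.0, 1.0]
split = best_split(samples, labels)
```

Applying `best_split` recursively to the left and right subsets, until one of the ending conditions holds, would yield the full tree.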

In the detection stage, the data processing module performs feature selection on the real-time flow data and standardizes the data.

The first classification module is obtained by training the first training module, takes the data processed by the data processing module as input, and performs primary classification on the flow corresponding to the data. The OCSVM classifier obtained by training of the first training module is adopted for carrying out flow abnormity detection.

The second classification module is obtained by training a second training module. The second classifier mainly considers the problem of detection accuracy and has higher detection precision. The second classification module performs classification by a GBDT algorithm. If the abnormal flow is detected, alarming is carried out or the detection result is submitted to other systems for security defense.

The self-learning module stores the data judged normal by the second classification module (that is, false detections by the first classifier). Under a preset trigger condition, this data and the original sample data 1 are recombined into a new sample set to retrain the first classifier. Through continuous learning by the self-learning module, the accuracy of the first classification module improves, the method adapts to different network environments, and the detection efficiency of the whole model increases.
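A minimal sketch of such a self-learning store is shown below. It assumes a count-based trigger (retrain after a given number of new false detections) as one possible "specific state"; the patent also allows timed triggers. The class name, the `retrain` callback and all parameters are hypothetical.

```python
from collections import deque


class SelfLearningModule:
    """Bounded sample store that retrains the first classifier when triggered."""

    def __init__(self, initial_samples, capacity, trigger_count, retrain):
        # deque(maxlen=...) drops the earliest sample once capacity is reached.
        self.samples = deque(initial_samples, maxlen=capacity)
        self.trigger_count = trigger_count  # assumed count-based trigger
        self.retrain = retrain              # callback: list of samples -> model
        self.new_since_training = 0
        self.model = retrain(list(self.samples))

    def report_false_detection(self, sample):
        """Store a falsely detected sample; return True if retraining ran."""
        self.samples.append(sample)
        self.new_since_training += 1
        if self.new_since_training >= self.trigger_count:
            self.model = self.retrain(list(self.samples))
            self.new_since_training = 0
            return True
        return False


# Usage: `tuple` stands in for a real training routine.
module = SelfLearningModule([1, 2, 3], capacity=4, trigger_count=2, retrain=tuple)
first_result = module.report_false_detection(4)
second_result = module.report_false_detection(5)
```

Replacing the earliest sample first keeps the training set biased toward recent traffic, which is how the model follows changes in the network environment.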

Specifically, the self-learning training system comprises a data processing module, a first training module, a second training module, a data acquisition module, a first classification module, a second classification module and a self-learning module. The data processing module processes off-line sample data or on-line test data, and the preprocessed data can be applied to the next module. The first training module obtains the classification model of the first classification module by training on the processed sample data 1. The second training module obtains the classification model of the second classification module by training on the processed sample data 2. The data acquisition module acquires flow data in the source network load industrial control system in real time and inputs the characteristic data of the flow into the data processing module for processing. The first classification module takes the data processed by the data processing module as input and performs a primary classification of the corresponding flow. In the first classification, to ensure the detection efficiency and accuracy of the whole method, a small amount of false detection of normal data is allowed (that is, some normal traffic is judged abnormal) so that abnormal traffic is detected as far as possible. The traffic judged abnormal by the first classification module is classified more accurately by the second classification module. If the second classification module judges the flow to be normal, the first classifier has made a false detection, and the data is input into the self-learning module. The self-learning module adds this flow data to sample data 1, and retraining is performed by the first training module at a fixed time, which improves the accuracy of the first classification module.

FIG. 2 is a schematic view of a PCA dimension reduction process according to the present invention. The sample data is set as follows: { (X)⁽¹⁾，y⁽¹⁾)， (X⁽²⁾，y⁽²⁾)，…，(X⁽ⁿ⁾，y⁽ⁿ⁾)}. The method for reducing the dimension by using the principal component analysis method comprises the following steps:

Step 201: Feature dimensionality reduction starts.

Step 202: The original d-dimensional sample data set is centered (mean-removed).

Step 203: A covariance matrix of the samples is constructed.

Step 204: eigenvalues of the covariance matrix and corresponding eigenvectors are computed. The eigenvectors of the covariance matrix represent principal components, and the importance of the eigenvectors is determined according to the magnitude of the eigenvalues.

Step 205: The k eigenvectors corresponding to the k largest eigenvalues are selected.

Step 206: A mapping matrix W is constructed from the k eigenvectors.

Step 207: The d-dimensional data is reduced to a k-dimensional vector Z by the mapping matrix W: $Z^{(i)} = W^{\mathsf{T}} X^{(i)}$.

Step 208: Feature dimension reduction is complete.
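Steps 202 to 207 can be written out in conventional PCA notation as below. This is a sketch under stated assumptions rather than the patent's own formulas: the symbols $\mu$, $\tilde{X}^{(i)}$, $C$, $w_j$ and $\lambda_j$ are introduced here for illustration, the $1/n$ normalisation of the covariance matrix is assumed ($1/(n-1)$ is equally common), and the projection is applied to the centered samples.

```latex
\begin{aligned}
\mu &= \frac{1}{n}\sum_{i=1}^{n} X^{(i)}, \qquad
\tilde{X}^{(i)} = X^{(i)} - \mu
  && \text{(Step 202: centering)} \\
C &= \frac{1}{n}\sum_{i=1}^{n} \tilde{X}^{(i)} \, \tilde{X}^{(i)\mathsf{T}}
  && \text{(Step 203: covariance matrix)} \\
C\, w_j &= \lambda_j w_j, \qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d
  && \text{(Step 204: eigen-decomposition)} \\
W &= [\, w_1, \dots, w_k \,] \in \mathbb{R}^{d \times k}
  && \text{(Steps 205 and 206: mapping matrix)} \\
Z^{(i)} &= W^{\mathsf{T}} \tilde{X}^{(i)} \in \mathbb{R}^{k}
  && \text{(Step 207: projection)}
\end{aligned}
```

Because $W$ has $d$ rows and $k$ columns, $W^{\mathsf{T}}$ maps each $d$-dimensional sample to a $k$-dimensional vector, which matches the dimensions stated in Step 207.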

FIG. 3 is a schematic diagram of the OCSVM learning process according to the present invention. The OCSVM training process is as follows:

Step 301: An adjustable parameter ν and a kernel function K(x, y) are selected. In this method, the radial basis function kernel is chosen.

Step 302: The OCSVM classifier is trained on the processed sample data.

Step 303: A classifier is obtained. Its decision function is expressed in terms of the kernel function K evaluated between the input sample and the support vectors.
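A minimal sketch of how such a decision function can be evaluated is given below. The source text does not reproduce the kernel or the decision formula, so two assumptions are made: the kernel is parametrised as K(x, y) = exp(-gamma * ||x - y||^2), and the decision function has the usual one-class SVM form sgn(sum_i alpha_i K(x_i, x) - rho). The names `rbf_kernel`, `ocsvm_decision`, `alphas`, `rho` and `gamma`, and all numeric values, are hypothetical.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """Radial basis kernel K(x, y) = exp(-gamma * ||x - y||^2) (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def ocsvm_decision(
    x: Sequence[float],
    support_vectors: Sequence[Sequence[float]],
    alphas: Sequence[float],
    rho: float,
    gamma: float,
) -> int:
    """Return +1 (normal) or -1 (abnormal) for sample x."""
    # Weighted sum of kernel values between x and every support vector.
    score = sum(
        a * rbf_kernel(sv, x, gamma) for sv, a in zip(support_vectors, alphas)
    )
    # Subtract the offset rho and take the sign.
    return 1 if score - rho >= 0 else -1


# Hypothetical "trained" parameters, chosen only for illustration.
support_vectors = [[0.0, 0.0], [1.0, 0.0]]
alphas = [0.5, 0.5]
rho = 0.4
gamma = 1.0

near = ocsvm_decision([0.5, 0.0], support_vectors, alphas, rho, gamma)
far = ocsvm_decision([5.0, 5.0], support_vectors, alphas, rho, gamma)
```

A sample close to the support vectors receives a high kernel score and is judged normal, while a distant sample scores near zero and falls below the offset.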

FIG. 4 is a schematic diagram of the process by which GBDT generates its trees according to the present invention. Generating the GBDT prediction model comprises the following steps:

Step 401: Generation of the GBDT classifier starts.

Step 402: A loss function is initialized.

In this invention, the loss function is chosen as $L(y, F) = \log\left(1 + e^{-2yF}\right)$, $y \in \{-1, 1\}$.

Step 403: A classification model is initialized from said loss function.

Step 405: A CART tree is constructed on the residuals, and each training sample is finally assigned to a corresponding leaf node; the predicted value of each leaf node is then determined.

The model is updated as $F_m(x) = F_{m-1}(x) + \gamma_{jm}$, and a stronger learner is obtained.
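The loss function stated above can be made concrete with a short sketch. Only the loss L(y, F) = log(1 + e^(-2yF)) comes from the text; the pseudo-residual (the negative gradient of that loss with respect to F) and the constant initial score are derived here from the loss and are not quoted from the patent. The function names are hypothetical.

```python
import math
from typing import Sequence


def loss(y: int, f: float) -> float:
    """L(y, F) = log(1 + exp(-2 y F)) for y in {-1, +1}."""
    return math.log1p(math.exp(-2.0 * y * f))


def pseudo_residual(y: int, f: float) -> float:
    """Negative gradient of the loss w.r.t. F: 2y / (1 + exp(2 y F))."""
    return 2.0 * y / (1.0 + math.exp(2.0 * y * f))


def initial_score(labels: Sequence[int]) -> float:
    """Constant F_0 that minimises the summed loss over the labels.

    Setting the summed gradient to zero gives
    F_0 = 0.5 * log((1 + y_bar) / (1 - y_bar)), with y_bar the label mean.
    Both classes must be present, otherwise y_bar is +1 or -1.
    """
    y_bar = sum(labels) / len(labels)
    return 0.5 * math.log((1.0 + y_bar) / (1.0 - y_bar))


labels = [1, 1, 1, -1]
f0 = initial_score(labels)
# Residuals that the first CART tree would be fitted to.
residuals = [pseudo_residual(y, f0) for y in labels]
```

At the initial score the residuals sum to zero, which is the first-order condition for the constant model; each later tree is fitted to the residuals of the current model.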

FIG. 5 is a schematic diagram of generating a CART tree according to the present invention. The generation method of the CART tree comprises the following steps:

Step 501: Generation of the CART tree starts.

Is the average of the labels of the nodes in the left sub-tree,

Step 505: The division continues downward according to the above method.

Step 508: The flow ends.
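One division step of a regression tree can be sketched as follows. The splitting criterion is not reproduced in the text above, so this sketch assumes the common least-squares criterion: each candidate threshold splits the samples into a left and a right subset, each subset is predicted by the mean of its labels, and the threshold with the smallest total squared error is kept. The function `best_split` and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def best_split(
    values: Sequence[float], targets: Sequence[float]
) -> Tuple[float, float]:
    """Return (threshold, squared_error) of the best split on one feature."""

    def sse(ys: List[float]) -> float:
        # Squared error of a subset around its own mean label.
        if not ys:
            return 0.0
        mean = sum(ys) / len(ys)
        return sum((y - mean) ** 2 for y in ys)

    best_threshold, best_error = float("nan"), float("inf")
    points = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(points, points[1:]):
        threshold = (lo + hi) / 2.0
        left = [t for v, t in zip(values, targets) if v <= threshold]
        right = [t for v, t in zip(values, targets) if v > threshold]
        error = sse(left) + sse(right)
        if error < best_error:
            best_threshold, best_error = threshold, error
    return best_threshold, best_error


# Toy data: one feature, targets playing the role of residuals.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
targets = [0.5, 0.5, 0.5, -1.5, -1.5, -1.5]
threshold, error = best_split(values, targets)
```

Applying the same search recursively to the left and right subsets corresponds to continuing the division downward until a stopping condition is reached.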

FIG. 6 is a flow chart of the self-learning method of the present invention.

Step 601: The self-learning module is initialized, and the sample capacity and the training trigger condition are set. The trigger condition may be a specific time or a specific state, for example a timed trigger.

Step 602: The module monitors whether any traffic has been falsely detected.

Step 603: If falsely detected traffic exists, it is combined with the original training samples to form a new training set. If the training set has not reached the set sample capacity, the traffic is added directly; otherwise, it replaces the earliest training sample.

Step 604: Whether the trigger condition is met is judged. If it is not met, the flow returns to step 602.

Step 605: If the trigger condition is met, the classification model is retrained.
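Steps 601 to 605 can be sketched as a bounded sample buffer with a retraining trigger. The class `SelfLearner` and its parameters are hypothetical. The trigger used here fires after a fixed number of newly added samples, as a simple stand-in for the "specific state" trigger; a timed trigger would compare timestamps instead.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence

Sample = Sequence[float]


class SelfLearner:
    def __init__(
        self,
        capacity: int,
        retrain_every: int,
        retrain: Callable[[List[Sample]], None],
    ) -> None:
        # Step 601: set the sample capacity and the trigger condition.
        # deque(maxlen=...) discards the oldest sample once it is full.
        self.samples: Deque[Sample] = deque(maxlen=capacity)
        self.retrain_every = retrain_every
        self.retrain = retrain
        self.since_last_training = 0

    def report_false_detection(self, x: Sample) -> None:
        # Steps 602 and 603: a falsely detected record joins the training
        # samples, replacing the earliest one when capacity is reached.
        self.samples.append(x)
        self.since_last_training += 1
        # Steps 604 and 605: retrain once the trigger condition is met.
        if self.since_last_training >= self.retrain_every:
            self.retrain(list(self.samples))
            self.since_last_training = 0


# The "retraining" here only records the sample set it was given.
calls: List[List[Sample]] = []
learner = SelfLearner(capacity=3, retrain_every=2, retrain=calls.append)
for record in ([1.0], [2.0], [3.0], [4.0]):
    learner.report_false_detection(record)
```

With a capacity of three, the fourth record displaces the first, so the second retraining sees only the three most recent samples.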

In summary, the method for network traffic anomaly detection in an industrial control system with source network load interaction adopts a two-layer detection structure that balances detection accuracy and detection efficiency, and uses a self-learning method to further improve detection accuracy. Beyond industrial control systems with source network load interaction, the technique can also be applied in many other fields and has high popularization value.

The above description is only an example of the present invention and is not intended to limit the present invention. All equivalents which come within the spirit of the invention are therefore intended to be embraced therein. Details not described herein are well within the skill of those in the art.

Claims

1. A network flow abnormity detection method suitable for a source network load interaction industrial control system is characterized by comprising the following steps:

(4) taking the data processed in the step (2) as input, training the data through the first training module obtained in the step (3), entering the trained data into a first classification module, and detecting whether the flow is normal through the first classification module; if so, outputting that the flow is normal and ending; if not, entering the step (5);

2. The method for detecting the network traffic abnormality applicable to the source network load interaction industrial control system according to claim 1, wherein the self-learning module in the step (3) has the following specific steps:

(3.1) initializing the self-learning module, and setting the sample capacity and the training trigger condition; the trigger condition may be set to a specific time or a specific state;

(3.2) monitoring whether the flow is detected by mistake;

(3.3) if the flow is detected by mistake, forming a new training sample with the original training sample; if the training sample does not reach the set sample capacity, directly adding the training sample; otherwise, replacing the earliest training sample;

(3.4) judging whether the trigger condition is met, if the trigger condition is not met, returning to the step (3.2);