CN113268730B - Smart power grid false data injection attack detection method based on reinforcement learning - Google Patents


Info

Publication number
CN113268730B
CN113268730B CN202110486653.5A CN202110486653A CN113268730B CN 113268730 B CN113268730 B CN 113268730B CN 202110486653 A CN202110486653 A CN 202110486653A CN 113268730 B CN113268730 B CN 113268730B
Authority
CN
China
Prior art keywords
value
attack
detection
data injection
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110486653.5A
Other languages
Chinese (zh)
Other versions
CN113268730A (en
Inventor
吴争光
张阔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qunzhi Future Artificial Intelligence Technology Research Institute Wuxi Co ltd
Original Assignee
Qunzhi Future Artificial Intelligence Technology Research Institute Wuxi Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qunzhi Future Artificial Intelligence Technology Research Institute Wuxi Co ltd
Priority to CN202110486653.5A
Publication of CN113268730A
Application granted
Publication of CN113268730B
Active
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/556Detecting local intrusion or implementing counter-measures involving covert channels, i.e. data leakage between processes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/11Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S40/00Systems for electrical power generation, transmission, distribution or end-user application management characterised by the use of communication or information technologies, or communication or information technology specific aspects supporting them
    • Y04S40/20Information technology specific aspects, e.g. CAD, simulation, modelling, system security

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Algebra (AREA)
  • Economics (AREA)
  • Databases & Information Systems (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • Water Supply & Treatment (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Public Health (AREA)
  • Computing Systems (AREA)
  • Virology (AREA)
  • Operations Research (AREA)
  • Emergency Alarm Devices (AREA)

Abstract

The invention discloses a smart grid false data injection attack detection method based on reinforcement learning. Observations are constructed separately for the two attack types: for direct false data injection attacks, a residual method is combined with threshold segmentation; for hidden false data injection attacks, the norm of the difference between consecutive measurements is combined with threshold segmentation. A Q table is trained from each kind of observation, and the Q tables are then used to detect attacks. The design detects hidden attacks quickly, increases detection speed and success rate, is simple to implement, and markedly improves detection efficiency.

Description

Smart power grid false data injection attack detection method based on reinforcement learning
Technical Field
The invention relates to the field of smart grids, in particular to a smart grid false data injection attack detection method based on reinforcement learning.
Background
The smart grid is a new power grid technology that separates the information transmission channel from the power transmission channel, giving the grid more efficient allocation of power resources and stronger resistance to interference. As information technology has developed, the information security of the smart grid has become a major concern. False data injection attacks play an important role in research on smart grid information attacks. Their core idea is to exploit loopholes in traditional detection methods and use a constructed attack vector to influence the state estimation of the grid, thereby undermining its safe and stable operation. The traditional way of detecting false data injection attacks is bad data detection. It can detect only direct false data injection attacks, not hidden ones, and because it relies on a single threshold its detection success rate is mediocre. Current machine learning detection methods can detect hidden false data injection attacks, but they require the attack vector to be far larger than the noise and slightly larger than the process noise.
Disclosure of Invention
The invention aims to provide a smart grid false data injection attack detection method based on reinforcement learning that addresses the shortcomings of the prior art.
The aim of the invention is achieved by the following technical scheme: a false data injection attack detection method based on reinforcement learning, comprising the following steps:
Step one: establish a general linear model of the power grid:
$x_t = A x_{t-1} + v_t$ (1)
$y_t = H x_t + w_t$ (2)
where $x_t = [x_{1,t}, \dots, x_{n,t}, \dots, x_{N,t}]$ is the system state at time t, $x_{n,t}$ is the phase angle at the nth node at time t, and N is the total number of system states; the measurement at time t is $y_t = [y_{1,t}, \dots, y_{m,t}, \dots, y_{M,t}]$, where $y_{m,t}$ is the reading of the mth measuring instrument at time t and M is the total number of measuring instruments; $A \in \mathbb{R}^{N \times N}$ is the state transition matrix and $H \in \mathbb{R}^{M \times N}$ is the Jacobian matrix determined by the topology of the network, with $\mathbb{R}$ denoting the set of real numbers; $v_t \sim \mathcal{N}(0, \sigma_v^2 I_N)$ is the system noise at time t, where $\sigma_v^2$ is the variance of the process noise, whose value is determined by the system, and $I_N$ is the N-dimensional identity matrix; $w_t \sim \mathcal{N}(0, \sigma_w^2 I_M)$ is the measurement noise at time t, where $\sigma_w^2$ is the variance of the measurement noise, whose value is determined by the measuring device, and $I_M$ is the M-dimensional identity matrix.
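The model in equations (1) and (2) can be simulated in a few lines. The following is a minimal sketch, not the patented implementation: the function name `simulate_grid`, the zero initial state, the Gaussian noise and the random placeholder for H are all assumptions made for illustration. In practice H would come from the network topology.

```python
import numpy as np

def simulate_grid(A, H, sigma_v2, sigma_w2, steps, rng):
    """Simulate x_t = A x_{t-1} + v_t and y_t = H x_t + w_t for `steps` time steps."""
    M, N = H.shape
    x = np.zeros(N)  # assumed initial state x_0 = 0
    states, measurements = [], []
    for _ in range(steps):
        v = rng.normal(0.0, np.sqrt(sigma_v2), size=N)  # process noise v_t
        w = rng.normal(0.0, np.sqrt(sigma_w2), size=M)  # measurement noise w_t
        x = A @ x + v                                   # equation (1)
        y = H @ x + w                                   # equation (2)
        states.append(x)
        measurements.append(y)
    return np.array(states), np.array(measurements)

rng = np.random.default_rng(0)
N, M = 4, 6                        # toy sizes, chosen only for the example
A = np.eye(N)                      # identity state transition
H = rng.normal(size=(M, N))        # placeholder; a real H depends on the grid topology
X, Y = simulate_grid(A, H, sigma_v2=1e-4, sigma_w2=2e-4, steps=50, rng=rng)
```

Each row of `X` is one state vector and each row of `Y` the matching measurement vector, so later sketches can consume `Y[t]` and `Y[t - 1]` directly.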
step two: virtual attack obtains samples: the attacked measurement can be obtained using equation (3) for direct attacks, equation (4) for hidden attacks,
in which a is t Attack vector Hc representing direct attack at time t t Since H does not change over time, c is used as an attack vector representing a hidden attack t Representing a hidden attack vector, a t And c t Known in sample training, unknown in actual detection, τ is the time the system is under attack,representing step functions, i.e. when t.gtoreq.τ
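To make the two attack types concrete, the sketch below adds a direct attack vector or a hidden attack vector to a clean measurement once the attack time has been reached. The helper names `step` and `attacked_measurement` are hypothetical, and the additive form is an assumption read off the definitions of the attack vectors and the step function in step two.

```python
import numpy as np

def step(t, tau):
    """Unit step: 1.0 once the attack has started (t >= tau), else 0.0."""
    return 1.0 if t >= tau else 0.0

def attacked_measurement(y_clean, t, tau, H=None, a=None, c=None):
    """Return the measurement seen by the detector at time t.

    a: direct attack vector, added to the measurement as is (length M).
    c: hidden attack vector in state space (length N), injected as H @ c.
    """
    y = np.array(y_clean, dtype=float)
    if a is not None:
        y = y + np.asarray(a) * step(t, tau)
    if c is not None:
        y = y + (H @ np.asarray(c)) * step(t, tau)
    return y

H_toy = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # toy 3x2 matrix
y0 = np.zeros(3)
before = attacked_measurement(y0, t=2, tau=3, a=[0.1, 0.1, 0.1])
direct = attacked_measurement(y0, t=5, tau=3, a=[0.1, 0.1, 0.1])
hidden = attacked_measurement(y0, t=5, tau=3, H=H_toy, c=[0.1, 0.2])
```

Before the attack time the measurement is returned unchanged; afterwards the direct attack shifts every meter by `a`, while the hidden attack shifts the measurement by a vector that lies in the column space of H.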
Step three: obtain the observations. The norm of the residual between the measurement $y_t$ and its estimate $\hat{y}_t$ is used to detect direct false data injection attacks, and the norm of the difference between the current measurement $y_t$ and the previous measurement $y_{t-1}$ is used to detect hidden false data injection attacks. Threshold segmentation is applied to the two norms to obtain the instantaneous observation for direct false data injection attacks and the instantaneous observation for hidden false data injection attacks, respectively. A sliding window method then updates the observations with the two instantaneous observations, giving the direct and hidden false data injection attack observations for the corresponding time.
Step four: use an ε-greedy strategy to obtain the detector action $a_t$ at time t. The system has two states: $s_n$, in which the system is not under attack, and $s_a$, in which the system is under attack. The detector likewise has two actions: $a_s$, in which the algorithm considers the system to be under attack and raises an alarm, and $a_c$, in which the algorithm considers the system not to be under attack and raises no alarm. At time t the direct attack detection observation and the hidden attack detection observation are obtained, and the ε-greedy strategy selects the detector action from the Q table $Q^n$ for direct false data injection attack detection and the Q table $Q^s$ for hidden false data injection attack detection. Under the ε-greedy strategy the detector selects the optimal action with probability 1-ε and a random action with probability ε. ε is updated once every d steps according to equation (5):
$\varepsilon = \max(\varepsilon - e^{-1}, \varepsilon_{\min})$ (5)
where e is the number of samples used so far, including the current one, and $\varepsilon_{\min}$ is a manually set minimum value of ε.
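The selection rule and the decay in equation (5) can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, each Q table row is indexed by an observation, "optimal" is taken to mean the largest Q value, and $e^{-1}$ is read as 1/e with e the number of samples used so far.

```python
import numpy as np

A_C, A_S = 0, 1  # column indices: continue without alarm, stop and alarm

def epsilon_greedy(q_row, epsilon, rng):
    """Pick an action index from one row of a Q table."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_row)))  # explore: uniform random action
    return int(np.argmax(q_row))              # exploit: assumes larger Q is better

def decay_epsilon(epsilon, e, eps_min):
    """Equation (5): epsilon = max(epsilon - 1/e, eps_min)."""
    return max(epsilon - 1.0 / e, eps_min)

rng = np.random.default_rng(0)
greedy_action = epsilon_greedy(np.array([0.1, 0.9]), epsilon=0.0, rng=rng)
eps_after = decay_epsilon(0.2, e=100, eps_min=0.01)
```

With `epsilon=0.0` the rule is purely greedy, which is also how the trained tables are used at detection time.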
Step five: train using the Sarsa algorithm, updating the Q table according to equation (6). In the equation, a parameter carrying the superscript i is a parameter used for detecting type-i attacks, with i = n or s: when i = n the parameter is used for detecting direct attacks, and when i = s it is used for detecting hidden attacks. $Q^i$ is the Q table required to detect a type-i attack. The observation is the one used for detecting type-i attacks at time t, and the action is the one that can be taken for type-i attacks once that observation has been obtained. α is the learning rate and $\gamma^i$ is the discount factor used when training for type-i attacks. The return is the one received at time t for the given type-i detection state and action, as defined in equation (7).
In equation (7), $r_0$ and b are preset coefficients of the leading alarm return value and the lagging alarm return value, respectively, and the state is the system state for type-i detection at time t.
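For orientation, one step of the textbook Sarsa update is sketched below. It is the standard on-policy rule, not necessarily identical to equation (6) of the patent, which is not legible in this text. The function name and the integer encoding of observations as row indices are assumptions.

```python
import numpy as np

def sarsa_update(Q, o, a, r, o_next, a_next, alpha, gamma):
    """One Sarsa step: Q[o, a] += alpha * (r + gamma * Q[o', a'] - Q[o, a])."""
    td_target = r + gamma * Q[o_next, a_next]  # uses the action actually chosen next
    Q[o, a] += alpha * (td_target - Q[o, a])
    return Q

Q_demo = np.zeros((3, 2))  # 3 toy observations, 2 actions
sarsa_update(Q_demo, o=0, a=1, r=1.0, o_next=2, a_next=0, alpha=0.1, gamma=1.0)
```

Because the target uses the action that the ε-greedy policy actually selects next, exploratory actions influence the learned values; this is what distinguishes Sarsa from Q-learning.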
Step six: repeat steps one to five until the maximum detection time T of the stage is reached or $a_s$ appears in either the direct-attack or the hidden-attack detector action.
Step seven: repeat steps one to six until all E samples have been used, yielding the complete $Q^n$ table and $Q^s$ table.
Step eight: detection. Obtain the direct-attack and hidden-attack observations using steps one to three, and use equation (8) to obtain the corresponding actions from the $Q^n$ table and the $Q^s$ table, respectively. While both actions are $a_c$, repeat this step; as soon as one of them is $a_s$, stop detecting and raise an alarm. If the action obtained from the $Q^n$ table is $a_s$, the system is considered to be under a direct false data injection attack; if the action obtained from the $Q^s$ table is $a_s$, the system is considered to be under a hidden false data injection attack.
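The detection stage in step eight can be sketched as a loop over paired observations. Several points here are assumptions rather than statements of the patent: equation (8) is taken to be a greedy lookup (largest Q value), observations are pre-encoded as row indices, the function name `detect` is hypothetical, and the direct-attack table is given precedence if both tables alarm at the same time.

```python
import numpy as np

A_C, A_S = 0, 1  # continue without alarm, stop and alarm

def detect(obs_direct, obs_hidden, Q_n, Q_s):
    """Greedy detection over two sequences of observation indices.

    Returns (t, kind) with kind in {"direct", "hidden"},
    or (None, None) if neither table raises an alarm.
    """
    for t, (o_n, o_s) in enumerate(zip(obs_direct, obs_hidden)):
        act_n = int(np.argmax(Q_n[o_n]))  # action from the direct-attack table
        act_s = int(np.argmax(Q_s[o_s]))  # action from the hidden-attack table
        if act_n == A_S:
            return t, "direct"
        if act_s == A_S:
            return t, "hidden"
    return None, None

# Toy tables: observation 0 favours a_c, observation 1 favours a_s.
Q_toy = np.array([[1.0, 0.0], [0.0, 1.0]])
hit_direct = detect([0, 0, 1], [0, 0, 0], Q_toy, Q_toy)
hit_hidden = detect([0, 0, 0], [0, 1, 0], Q_toy, Q_toy)
no_alarm = detect([0], [0], Q_toy, Q_toy)
```

The return value carries both the alarm time and the attack type, which is the information needed to compute the alarm rates reported in the figures.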
Further, step three is carried out through the following substeps:
(3.1) Set the thresholds: set the direct false data injection attack threshold and the hidden false data injection attack threshold according to the power grid structure.
(3.2) Obtain the measurements: obtain the measurement $y_t$ at time t from the measuring instruments and retrieve the measurement $y_{t-1}$ at time t-1.
(3.3) Estimate the measurement using Kalman filtering: obtain the state estimate $\hat{x}_t$ at time t using the least squares algorithm given by equations (9) and (10), then compute the measurement estimate $\hat{y}_t$ at time t.
The equations use the variance matrix of the measurement deviations.
(3.4) Compute the deviation norms: using equations (12) and (13), compute the norm of the deviation between the measurement and its estimate at time t and the norm of the change in the measurement between time t-1 and time t.
(3.5) Obtain the instantaneous observations by threshold segmentation: from the two norms, the instantaneous observations for direct and hidden false data injection attacks are obtained according to equation (14).
Because the threshold segmentation is the same in both cases, the superscript i again stands for the superscripts n and s; that is, i in equation (14) may be either n or s.
(3.6) Obtain the observations by the sliding window method: starting from the direct and hidden false data injection attack observations at time t-1, the sliding window method appends the corresponding instantaneous observation at time t to the observation at time t-1 and then removes the oldest instantaneous observation, giving the direct and hidden false data injection attack observations at time t.
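Substeps (3.5) and (3.6) can be sketched as a quantizer followed by a fixed-length window. This is an illustration only: equation (14) is not legible here, so the sketch assumes the thresholds are an increasing list of cut points and that the instantaneous observation is the index of the interval containing the norm. The class name `SlidingObservation`, the window length and the base-K encoding of the window into a Q table row index are likewise assumptions.

```python
from collections import deque
import numpy as np

def instant_observation(norm_value, thresholds):
    """Threshold segmentation: map a norm to the index of its interval."""
    return int(np.digitize(norm_value, thresholds))

class SlidingObservation:
    """Keeps the last `window` instantaneous observations."""

    def __init__(self, window, levels):
        self.levels = levels  # number of distinct instantaneous values
        self.buf = deque([0] * window, maxlen=window)

    def push(self, instant):
        self.buf.append(instant)  # the oldest value is dropped automatically
        return tuple(self.buf)

    def index(self):
        """Encode the window as a single integer (base-`levels` number)."""
        idx = 0
        for v in self.buf:
            idx = idx * self.levels + v
        return idx

cuts = [0.01, 0.05]  # toy thresholds
obs = SlidingObservation(window=3, levels=len(cuts) + 1)
first = obs.push(instant_observation(0.02, cuts))
second = obs.push(instant_observation(0.20, cuts))
row = obs.index()
```

One such window would be kept per attack type, fed with the residual norm for direct attacks and with the difference norm for hidden attacks.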
The beneficial effects of the method are as follows: false data injection attacks are detected using the Sarsa algorithm, which improves both the detection accuracy and the detection speed; the method works well for hidden false data injection attacks; and both direct and hidden false data injection attacks can be detected conveniently.
Drawings
Fig. 1 is the IEEE 14-node diagram, from which the H matrix can be obtained,
Fig. 2 is a flow chart of training the Q tables,
Fig. 3 is a flow chart of the detection process,
Fig. 4 shows a leading alarm rate chart,
Fig. 5 shows a lagging alarm rate chart,
Fig. 6 shows a total alarm failure rate chart,
Fig. 7 shows a leading alarm rate chart,
Fig. 8 shows a lagging alarm rate chart,
Fig. 9 shows a total alarm failure rate chart,
Fig. 10 shows an alarm category error rate chart,
Fig. 11 shows the immediate detection success rate for hidden false data injection attacks,
Fig. 12 shows the immediate detection success rate for direct false data injection attacks.
Detailed Description
In order to enhance the understanding and appreciation for the invention, the invention will be described in detail below with reference to the drawings and embodiments.
Example 1: referring to fig. 1-4, a smart grid false data injection attack detection method based on reinforcement learning includes the following steps:
Step one: establish a general linear model of the power grid:
$x_t = A x_{t-1} + v_t$ (1)
$y_t = H x_t + w_t$ (2)
where $x_t = [x_{1,t}, \dots, x_{n,t}, \dots, x_{N,t}]$ is the system state at time t, $x_{n,t}$ is the phase angle at the nth node at time t, and N is the total number of system states, taken as 14; the measurement at time t is $y_t = [y_{1,t}, \dots, y_{m,t}, \dots, y_{M,t}]$, where $y_{m,t}$ is the reading of the mth measuring instrument at time t and M is the total number of measuring instruments, taken as 23; $A \in \mathbb{R}^{N \times N}$ is the state transition matrix, set to the identity matrix, and $H \in \mathbb{R}^{M \times N}$ is the Jacobian matrix determined by the topology of the network, with $\mathbb{R}$ denoting the set of real numbers; $v_t \sim \mathcal{N}(0, \sigma_v^2 I_N)$ is the system noise at time t, where $\sigma_v^2$ is the variance of the process noise, taken as $10^{-4}$, and $I_N$ is the N-dimensional identity matrix; $w_t \sim \mathcal{N}(0, \sigma_w^2 I_M)$ is the measurement noise at time t, where $\sigma_w^2$ is the variance of the measurement noise, taken as $2 \times 10^{-4}$, and $I_M$ is the M-dimensional identity matrix.
step two: virtual attack obtains samples: the attacked measurement can be obtained using equation (3) for direct attacks, equation (4) for hidden attacks,
in which a is t Attack vector Hc representing direct attack at time t t Since H does not change over time, c is used as an attack vector representing a hidden attack t Representing a hidden attack vector, a t And c t Known in sample training, unknown in actual detection, τ is the attack time of the system, 10 < τ < 200,representing a step function, i.e. when t is t
Step three: obtain the observations. The norm of the residual between the measurement $y_t$ and its estimate $\hat{y}_t$ is used to detect direct false data injection attacks, and the norm of the difference between the current measurement $y_t$ and the previous measurement $y_{t-1}$ is used to detect hidden false data injection attacks. Threshold segmentation is applied to the two norms to obtain the instantaneous observation for direct false data injection attacks and the instantaneous observation for hidden false data injection attacks, respectively. A sliding window method then updates the observations with the two instantaneous observations, giving the direct and hidden false data injection attack observations for the corresponding time.
This step is the core of the invention and is divided into the following substeps.
3.1 Set the thresholds.
Setting direct false data injection attack threshold according to different power grid structuresThreshold value of attack against hidden false data injection>Taking out
3.2 Obtain the measurements.
Obtain the measurement $y_t$ at time t from the measuring instruments and retrieve the measurement $y_{t-1}$ at time t-1.
3.3 Estimate the measurement using Kalman filtering.
Obtain the state estimate $\hat{x}_t$ at time t using the least squares algorithm given by equations (9) and (10), then compute the measurement estimate $\hat{y}_t$ at time t.
The equations use the variance matrix of the measurement deviations.
3.4 Compute the deviation norms.
Using equations (12) and (13), compute the norm of the deviation between the measurement and its estimate at time t and the norm of the change in the measurement between time t-1 and time t.
3.5 Obtain the instantaneous observations by threshold segmentation.
From the two norms, the instantaneous observations for direct and hidden false data injection attacks are obtained according to equation (14).
Because the threshold segmentation is the same in both cases, the superscript i again stands for the superscripts n and s; that is, i in equation (14) may be either n or s.
3.6 Obtain the observations by the sliding window method.
Starting from the direct and hidden false data injection attack observations at time t-1, the sliding window method appends the corresponding instantaneous observation at time t to the observation at time t-1 and then removes the oldest instantaneous observation, giving the direct and hidden false data injection attack observations at time t.
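Substeps 3.3 and 3.4 can be illustrated with a weighted least squares estimate followed by the two norms. Equations (9) to (13) are not legible in this text, so the sketch uses the standard weighted least squares formula $\hat{x} = (H^T R^{-1} H)^{-1} H^T R^{-1} y$ as an assumption; the function names and the toy matrices are hypothetical.

```python
import numpy as np

def wls_estimate(H, R, y):
    """Weighted least squares: x_hat = (H^T R^-1 H)^-1 H^T R^-1 y, y_hat = H x_hat."""
    R_inv = np.linalg.inv(R)
    gain = np.linalg.solve(H.T @ R_inv @ H, H.T @ R_inv)
    x_hat = gain @ y
    return x_hat, H @ x_hat

def deviation_norms(y_t, y_prev, y_hat):
    """Residual norm (direct attacks) and change norm (hidden attacks)."""
    residual = np.linalg.norm(y_t - y_hat)
    change = np.linalg.norm(y_t - y_prev)
    return residual, change

H_toy = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # toy 3x2 measurement matrix
R_toy = 2e-4 * np.eye(3)                                # measurement variance matrix
y_clean = H_toy @ np.array([1.0, 2.0])                  # noise-free toy measurement
y_hidden = y_clean + H_toy @ np.array([0.1, -0.05])     # hidden attack H c
y_direct = y_clean + np.array([0.1, 0.0, 0.0])          # direct attack a

res_hidden, chg_hidden = deviation_norms(
    y_hidden, y_clean, wls_estimate(H_toy, R_toy, y_hidden)[1])
res_direct, _ = deviation_norms(
    y_direct, y_clean, wls_estimate(H_toy, R_toy, y_direct)[1])
```

In this toy case the hidden attack lies in the column space of H, so the residual norm stays at numerical zero while the change norm jumps. That is why the method pairs the residual with direct attacks and the measurement difference with hidden ones.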
Step four: use an ε-greedy strategy to obtain the detector action $a_t$ at time t. The system has two states: $s_n$, in which the system is not under attack, and $s_a$, in which the system is under attack. The detector likewise has two actions: $a_s$, in which the algorithm considers the system to be under attack and raises an alarm, and $a_c$, in which the algorithm considers the system not to be under attack and raises no alarm. At time t the direct attack detection observation and the hidden attack detection observation are obtained, and the ε-greedy strategy selects the detector action from the Q table $Q^n$ for direct false data injection attack detection and the Q table $Q^s$ for hidden false data injection attack detection. Under the ε-greedy strategy the detector selects the optimal action with probability 1-ε and a random action with probability ε. ε is updated once every d steps, with d = 40, according to equation (5):
$\varepsilon = \max(\varepsilon - e^{-1}, \varepsilon_{\min})$ (5)
where e is the number of samples used so far, including the current one, $\varepsilon_{\min} = 0.01$ is a manually set minimum value of ε, and the initial value of ε is set to 0.2.
Step five: train using the Sarsa algorithm, updating the Q table according to equation (6). In the equation, a parameter carrying the superscript i is a parameter used for detecting type-i attacks, with i = n or s: when i = n the parameter is used for detecting direct attacks, and when i = s it is used for detecting hidden attacks. $Q^i$ is the Q table required to detect a type-i attack. The observation is the one used for detecting type-i attacks at time t, and the action is the one that can be taken for type-i attacks once that observation has been obtained. α is the learning rate, set to 0.1, and $\gamma^i$ is the discount factor used when training for type-i attacks, set to 1 for both direct and hidden false data injection attacks. The return is the one received at time t for the given type-i detection state and action, as defined in equation (7).
In equation (7), $r_0$ and b are preset coefficients of the leading alarm return value and the lagging alarm return value, set to $r_0 = 1$ and b = 0.01, respectively, and the state is the system state for type-i detection at time t.
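Equation (7) is not legible in this text, so the following is only one plausible reading of the description, not the patented formula. It assumes that a leading alarm (alarm while not attacked) costs $r_0$, that each step of lagging (no alarm while attacked) costs b, and that returns are expressed as negative values to be maximized. The function name and the string labels are hypothetical.

```python
def reward(state, action, r0=1.0, b=0.01):
    """Assumed return: penalize leading alarms by r0 and each lagging step by b."""
    if state == "s_n" and action == "a_s":
        return -r0   # alarm raised before any attack (leading alarm)
    if state == "s_a" and action == "a_c":
        return -b    # attack ongoing but no alarm yet (lagging alarm)
    return 0.0       # correct continue or correct alarm

leading = reward("s_n", "a_s")
lagging = reward("s_a", "a_c")
correct = reward("s_a", "a_s")
```

With $r_0 = 1$ and b = 0.01 a single leading alarm weighs as much as 100 steps of delay under this reading, which would push the detector toward avoiding leading alarms.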
Step six: repeat steps one to five until the maximum detection time T = 300 of the stage is reached or $a_s$ appears in either the direct-attack or the hidden-attack detector action.
Step seven: repeat steps one to six until all E = 40000 samples have been used, yielding the complete $Q^n$ table and $Q^s$ table.
Step eight: detection. Obtain the direct-attack and hidden-attack observations using steps one to three, and use equation (8) to obtain the corresponding actions from the $Q^n$ table and the $Q^s$ table, respectively. While both actions are $a_c$, repeat this step; as soon as one of them is $a_s$, stop detecting and raise an alarm. If the action obtained from the $Q^n$ table is $a_s$, the system is considered to be under a direct false data injection attack; if the action obtained from the $Q^s$ table is $a_s$, the system is considered to be under a hidden false data injection attack.
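Steps six and seven amount to an episodic training loop. The sketch below runs such a loop on a deliberately tiny toy problem and is not the patented procedure: observations are reduced to 0 (before the attack) and 1 (after it), ε is held constant, the returns follow an assumed penalty form (1 for a leading alarm, 0.01 per lagging step), and the episode count and horizon are far smaller than T = 300 and E = 40000.

```python
import numpy as np

A_C, A_S = 0, 1  # continue without alarm, stop and alarm

def choose(Q, o, eps, rng):
    """Epsilon-greedy choice over one Q table row."""
    if rng.random() < eps:
        return int(rng.integers(2))
    return int(np.argmax(Q[o]))

def train(episodes, T, alpha=0.1, gamma=1.0, eps=0.1, r0=1.0, b=0.01, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((2, 2))                  # rows: toy observation, columns: action
    for _ in range(episodes):             # step seven: one pass per sample
        tau = int(rng.integers(2, 10))    # toy attack start time
        o = 0                             # t = 0 is always before the attack
        a = choose(Q, o, eps, rng)
        for t in range(T):                # step six: until T or an alarm
            attacked = t >= tau
            if a == A_S:                  # alarm ends the episode
                r = 0.0 if attacked else -r0
                Q[o, a] += alpha * (r - Q[o, a])
                break
            r = -b if attacked else 0.0
            o_next = 1 if (t + 1) >= tau else 0
            a_next = choose(Q, o_next, eps, rng)
            Q[o, a] += alpha * (r + gamma * Q[o_next, a_next] - Q[o, a])
            o, a = o_next, a_next
    return Q

Q_trained = train(episodes=2000, T=15)
```

After training, the greedy policy read from `Q_trained` continues on observation 0 and alarms on observation 1. A realistic setup would run one such loop per attack type with the windowed observations from step three.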
With reference to the drawings, Fig. 4 shows the leading alarm rate, where $a_t$ and $c_t$ follow distributions on [0, 0.075], [0.075, 0.15], [0.1, 0.175], [0.15, 0.225] and [0.175, 0.25], respectively. FAR denotes the leading alarm rate, i.e. the frequency with which an alarm is raised when there is no attack, calculated as the number of leading alarms divided by the total number of detections. "direct attack test" denotes the results of this method on direct false data injection attacks, "stealth attack test" denotes the results of this method on hidden false data injection attacks, and BDD denotes the results of the conventional bad data detection method, which can detect only direct false data injection attacks (detection threshold set to 0.006). lm is the lower limit and um the upper limit of the interval of the distribution followed by $a_t$ and $c_t$.
Fig. 5 shows the lagging alarm rate, with the same distributions as Fig. 4. DAR denotes the lagging alarm rate, i.e. the frequency with which no alarm has been raised more than 10 time steps after the attack begins, calculated as the number of lagging alarms divided by the total number of detections.
Fig. 6 shows the total alarm failure rate, with the same distributions as Fig. 4. TFR denotes the total alarm failure rate, i.e. the total frequency of detection failures, covering FAR, DAR and CER, calculated as the total number of failures divided by the total number of detections.
Fig. 7 shows the leading alarm rate, where $a_t$ and $c_t$ follow uniform distributions on [0, 0.15], [0.1, 0.25], [0.15, 0.3], [0.2, 0.35] and [0.25, 0.4], respectively. Fig. 8 shows the lagging alarm rate and Fig. 9 the total alarm failure rate, both with the same distributions as Fig. 7.
Fig. 10 shows the alarm category error rate. The curve um-lm = 0.075 corresponds to $a_t$ and $c_t$ following the distributions of Fig. 4, the curve um-lm = 0.15 to the distributions of Fig. 7, and the curve um-lm = 0.05 to distributions on [0.03, 0.08], [0.05, 0.1], [0.1, 0.15], [0.15, 0.2] and [0.2, 0.25], respectively. CER denotes the alarm category error rate, i.e. the frequency with which direct and hidden false data injection attacks are assigned the wrong category, calculated as the number of category errors divided by the total number of detections.
Fig. 11 shows the immediate detection success rate for hidden false data injection attacks, with the same three distributions as Fig. 10. SDR denotes the immediate detection success rate, i.e. the frequency with which an alarm is raised immediately after the attack, calculated as the number of immediate alarms divided by the total number of detections. Fig. 12 shows the immediate detection success rate for direct false data injection attacks, with the same three distributions as Fig. 10.
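The rates defined above are simple ratios over detection runs and can be computed as sketched below. The function name `alarm_rates` and the input layout are hypothetical. The sketch also assumes that a run with no alarm at all counts as a lagging alarm and that "immediately after the attack" means an alarm at exactly the attack time.

```python
def alarm_rates(alarm_times, attack_times, max_delay=10):
    """Compute FAR, DAR and SDR from per-run alarm and attack times.

    alarm_times[i] is the time of the alarm in run i, or None if no alarm.
    attack_times[i] is the attack start time tau in run i.
    """
    total = len(alarm_times)
    runs = list(zip(alarm_times, attack_times))
    leading = sum(1 for a, tau in runs if a is not None and a < tau)
    lagging = sum(1 for a, tau in runs if a is None or a > tau + max_delay)
    immediate = sum(1 for a, tau in runs if a is not None and a == tau)
    return {
        "FAR": leading / total,    # alarm before the attack
        "DAR": lagging / total,    # alarm more than max_delay steps late, or missing
        "SDR": immediate / total,  # alarm at the attack time
    }

rates = alarm_rates(alarm_times=[5, 20, 35, None], attack_times=[10, 20, 20, 20])
```

CER and TFR would additionally need the predicted and true attack category per run, which this sketch leaves out.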
It should be noted that the above-mentioned embodiments are not intended to limit the scope of the present invention, and equivalent changes or substitutions made on the basis of the above-mentioned technical solutions fall within the scope of the present invention as defined in the claims.

Claims (5)

1. A smart grid false data injection attack detection method based on reinforcement learning, characterized by comprising the following steps:
step one: establishing a general linear model of the power grid,
step two: obtaining samples by simulated attack,
step three: obtaining the observations,
step four: obtaining the detector action $a_t$ at time t using an ε-greedy strategy,
step five: training using the Sarsa algorithm,
step six: repeating steps one to five until the maximum detection time T of the stage is reached or $a_s$ appears in either the direct-attack or the hidden-attack detector action;
step seven: repeating steps one to six until all E samples have been used, to obtain the complete $Q^n$ table and $Q^s$ table;
step eight: detecting, namely judging whether the system is under a direct false data injection attack;
step one: establishing a general linear model of the power grid:
$x_t = A x_{t-1} + v_t$ (1)
$y_t = H x_t + w_t$ (2)
wherein $x_t = [x_{1,t}, \dots, x_{n,t}, \dots, x_{N,t}]$ is the system state at time t, $x_{n,t}$ is the phase angle at the nth node at time t, and N is the total number of system states; the measurement at time t is $y_t = [y_{1,t}, \dots, y_{m,t}, \dots, y_{M,t}]$, wherein $y_{m,t}$ is the reading of the mth measuring instrument at time t and M is the total number of measuring instruments; $A \in \mathbb{R}^{N \times N}$ is the state transition matrix and $H \in \mathbb{R}^{M \times N}$ is the Jacobian matrix determined by the topology of the network, with $\mathbb{R}$ denoting the set of real numbers; $v_t \sim \mathcal{N}(0, \sigma_v^2 I_N)$ is the system noise at time t, wherein $\sigma_v^2$ is the variance of the process noise, whose value is determined by the system, and $I_N$ is the N-dimensional identity matrix; $w_t \sim \mathcal{N}(0, \sigma_w^2 I_M)$ is the measurement noise at time t, wherein $\sigma_w^2$ is the variance of the measurement noise, whose value is determined by the measuring device, and $I_M$ is the M-dimensional identity matrix;
step two: virtual attack obtains samples: the attacked measurement can be obtained using equation (3) for direct attacks, equation (4) for hidden attacks,
in which a is t Attack vector Hc representing direct attack at time t t Since H does not change over time, c is used as an attack vector representing a hidden attack t Representing a hidden attack vector, a t And c t Known in sample training, unknown in actual detection, τ is the time the system is under attack,representing a step function, i.e. +.when t.gtoreq.tau>
step three: obtaining observations: the norm of the residual between the measurement $y_t$ and its estimate $\hat{y}_t$ is used to detect direct false data injection attacks, and the norm of the residual between the current measurement $y_t$ and the previous measurement $y_{t-1}$ is used to detect hidden false data injection attacks; the two norms are divided into levels by a threshold segmentation method to obtain the instantaneous observation for direct false data injection attacks and the instantaneous observation for hidden false data injection attacks, respectively; a sliding window method then merges the two instantaneous observations into the existing observations, giving the direct false data injection attack observation and the hidden false data injection attack observation for the current time;
step four: obtaining the detector action $a_t$ at time t using an ε-greedy strategy: the system is divided into two states, $s_n$, in which the system is not attacked, and $s_a$, in which the system is attacked; the detector action is likewise divided into two, $a_s$, in which the algorithm considers the system to be attacked and raises an alarm, and $a_c$, in which the algorithm considers the system not to be attacked and raises no alarm; at time t the direct attack detection observation $o_t^n$ and the hidden attack detection observation $o_t^s$ are obtained; the detector actions are selected using the ε-greedy strategy based on the Q table $Q^n$ for direct false data injection attack detection and the Q table $Q^s$ for hidden false data injection attack detection; under the ε-greedy strategy the detector selects the optimal action with probability 1-ε and a random action with probability ε; ε is updated once every d steps according to formula (5),
$\varepsilon=\max(\varepsilon-e^{-1},\varepsilon_{\min})$ (5)
where e is the number of samples used so far, including the current one, and $\varepsilon_{\min}$ is a manually set minimum value of ε;
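The action selection and the ε update of step four could be sketched as below. This is only an illustration under stated assumptions: the "optimal" action is taken to be the one with the largest Q value, a Q-table row is represented as a plain dictionary keyed by the action names `"a_c"` and `"a_s"`, and the function names are hypothetical.

```python
import random

def epsilon_greedy(q_row, epsilon, rng):
    """Pick the highest-valued action with probability 1 - epsilon,
    otherwise pick an action uniformly at random.
    q_row maps action names to Q values (assumed representation)."""
    actions = list(q_row)
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_row[a])

def decay_epsilon(epsilon, e, eps_min):
    """Formula (5): epsilon <- max(epsilon - 1/e, eps_min),
    where e is the number of samples used so far."""
    return max(epsilon - 1.0 / e, eps_min)

rng = random.Random(0)
action = epsilon_greedy({"a_c": 0.3, "a_s": 0.7}, epsilon=0.0, rng=rng)
epsilon = decay_epsilon(0.5, e=10, eps_min=0.05)
```

With ε set to 0 the selection is purely greedy, which matches how the trained tables are used at detection time.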
step five: training using the Sarsa algorithm, updating the Q table using formula (6),
$Q^i(o_t^i,a_t^i)\leftarrow Q^i(o_t^i,a_t^i)+\alpha\left[r_t^i+\gamma^i Q^i(o_{t+1}^i,a_{t+1}^i)-Q^i(o_t^i,a_t^i)\right]$ (6)
wherein a parameter bearing the superscript i is a parameter for detecting a type-i attack, i = n or s, i.e. when i = n the parameter is used for detecting direct attacks and when i = s the parameter is used for detecting hidden attacks; $Q^i$ is the Q table required to detect a type-i attack; $o_t^i$ is the observation for detecting a type-i attack at time t; $a_{t+1}^i$ is the action that can be taken for a type-i attack after $o_{t+1}^i$ is obtained; α is the learning rate; $\gamma^i$ is the discount factor used when training for type-i attacks; and $r_t^i$ is the return when the state for detecting a type-i attack at time t is $s_t^i$ and the action is $a_t^i$, as given by formula (7),
wherein $r_0$ and b are the predetermined coefficients of the early-alarm return value and the delayed-alarm return value, respectively, and $s_t^i$ is the system state for type-i detection at time t.
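For orientation, one tabular Sarsa update in its standard textbook form is sketched below. The table layout (a dictionary of per-observation action rows), the helper names, and the numeric values are assumptions for this example; the return value is passed in as a plain number because the return function of formula (7) is not reproduced here.

```python
from collections import defaultdict

def new_q_table(actions=("a_c", "a_s")):
    """Q table whose rows are created on first access with all values 0.0."""
    return defaultdict(lambda: {a: 0.0 for a in actions})

def sarsa_update(Q, obs, act, reward, next_obs, next_act, alpha, gamma):
    """Move Q[obs][act] toward reward + gamma * Q[next_obs][next_act]."""
    td_target = reward + gamma * Q[next_obs][next_act]
    Q[obs][act] += alpha * (td_target - Q[obs][act])
    return Q[obs][act]

# Observations are tuples of recent instantaneous observations (assumed encoding).
Q = new_q_table()
Q[(0, 1)]["a_s"] = 2.0
updated = sarsa_update(Q, (0, 0), "a_c", reward=1.0,
                       next_obs=(0, 1), next_act="a_s", alpha=0.5, gamma=0.9)
```

Sarsa is on-policy: the next action fed into the update is the one the ε-greedy detector actually chose, not the greedy maximum over the next row.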
2. The smart grid false data injection attack detection method based on reinforcement learning according to claim 1, wherein step six is: repeating steps one to five until the maximum detection time T of the current stage is reached, or until $a_s$ appears in $a_t^n$ or $a_t^s$.
3. The smart grid false data injection attack detection method based on reinforcement learning according to claim 1, wherein step seven is: repeating steps one to six until all E samples have been used, obtaining the complete $Q^d$ table and $Q^s$ table.
4. The smart grid false data injection attack detection method based on reinforcement learning according to claim 1, wherein step eight is: during detection, the observations $o_t^n$ and $o_t^s$ are obtained using steps one to three; the actions $a_t^n$ and $a_t^s$ are obtained from the $Q^d$ table and the $Q^s$ table using formula (8); when both actions are $a_c$, this step is repeated until one of $a_t^n$ and $a_t^s$ is $a_s$, whereupon detection stops and an alarm is raised; when $a_t^n$ is $a_s$ the system is considered to be under a direct false data injection attack, and when $a_t^s$ is $a_s$ the system is considered to be under a hidden false data injection attack.
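The detection loop of step eight might look like the following sketch. It assumes that the action for each attack type is the highest-valued entry of the corresponding trained Q-table row and that observations never seen in training default to `"a_c"`; both choices, like the function names, are assumptions of this example rather than statements of the claimed formula.

```python
def greedy_action(Q, obs, default="a_c"):
    """Return the highest-valued action for obs; unseen observations
    fall back to `default` (an assumption of this sketch)."""
    row = Q.get(obs)
    if not row:
        return default
    return max(row, key=row.get)

def detect(obs_stream, Q_direct, Q_hidden):
    """Scan paired (direct, hidden) observations in time order and stop
    at the first alarm, reporting the time index and the alarm type(s)."""
    for t, (o_n, o_s) in enumerate(obs_stream):
        alarms = []
        if greedy_action(Q_direct, o_n) == "a_s":
            alarms.append("direct")
        if greedy_action(Q_hidden, o_s) == "a_s":
            alarms.append("hidden")
        if alarms:
            return t, alarms
    return None, []

# Tiny hand-made tables, for illustration only.
Q_direct = {(0, 0, 1): {"a_c": 0.2, "a_s": 0.1},
            (0, 1, 1): {"a_c": -1.0, "a_s": 0.5}}
Q_hidden = {(0, 0, 0): {"a_c": 0.3, "a_s": 0.0}}
stream = [((0, 0, 1), (0, 0, 0)), ((0, 1, 1), (0, 0, 0))]
result = detect(stream, Q_direct, Q_hidden)
```

Running the two tables side by side lets a single pass over the measurements report which kind of attack triggered the alarm.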
5. The smart grid false data injection attack detection method based on reinforcement learning according to claim 1, wherein step three is implemented by the following substeps:
(3.1) setting thresholds: setting a direct false data injection attack threshold and a hidden false data injection attack threshold according to the structure of the power grid;
(3.2) obtaining the measurement: obtaining the measurement $y_t$ at time t from each measuring instrument, and retrieving the measurement $y_{t-1}$ from time t-1;
(3.3) estimating the measurement using Kalman filtering: obtaining the state estimate $\hat{x}_t$ at time t using the least squares algorithm expressed by formulas (9) and (10), and calculating the measurement estimate $\hat{y}_t$ at time t,
wherein the matrix in the formulas denotes the variance matrix of the measurement deviation;
(3.4) calculating the deviation norms: using formulas (12) and (13), respectively, calculating the norm of the deviation between the measurement and its estimate at time t, $\|y_t-\hat{y}_t\|$, and the norm of the change in the measurement between time t and time t-1, $\|y_t-y_{t-1}\|$;
(3.5) obtaining instantaneous observations by threshold segmentation: from the two norms, the instantaneous observations for direct and hidden false data injection attacks are obtained according to formula (14);
because the threshold segmentation method is the same for both, the superscript i again stands in for the superscripts n and s, i.e. i in formula (14) may be either n or s;
(3.6) obtaining observations using the sliding window method: given the direct false data injection attack observation and the hidden false data injection attack observation at time t-1, the sliding window method appends the corresponding instantaneous observation at time t to the observation from time t-1 and then discards the oldest instantaneous observation, giving the direct false data injection attack observation and the hidden false data injection attack observation at time t.
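Substeps (3.4) to (3.6) can be pictured with the sketch below. It is illustrative only: the Euclidean norm, the two-level quantization (0 below the threshold, 1 above it), the window length, and the names `norm`, `quantize`, and `SlidingObservation` are all assumptions of this example, since the segmentation formula itself is not reproduced here.

```python
import math
from collections import deque

def norm(u, v):
    """Euclidean norm of the difference u - v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def quantize(value, threshold):
    """Two-level threshold segmentation (assumed): 1 if value exceeds
    the threshold, otherwise 0."""
    return 1 if value > threshold else 0

class SlidingObservation:
    """Keeps the most recent `window` instantaneous observations."""
    def __init__(self, window):
        self.buf = deque([0] * window, maxlen=window)

    def push(self, instant):
        # A deque with maxlen silently drops the oldest entry on append.
        self.buf.append(instant)
        return tuple(self.buf)

# Illustrative use for the hidden-attack statistic ||y_t - y_{t-1}||.
obs = SlidingObservation(window=3)
y_prev, y_now = [1.0, 2.0], [1.0, 2.6]
instant = quantize(norm(y_now, y_prev), threshold=0.5)
observation = obs.push(instant)
```

Returning the window as a tuple makes the observation hashable, so it can index a tabular Q table directly.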
CN202110486653.5A 2021-05-01 2021-05-01 Smart power grid false data injection attack detection method based on reinforcement learning Active CN113268730B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110486653.5A CN113268730B (en) 2021-05-01 2021-05-01 Smart power grid false data injection attack detection method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110486653.5A CN113268730B (en) 2021-05-01 2021-05-01 Smart power grid false data injection attack detection method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN113268730A CN113268730A (en) 2021-08-17
CN113268730B true CN113268730B (en) 2023-07-25

Family

ID=77229967

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110486653.5A Active CN113268730B (en) 2021-05-01 2021-05-01 Smart power grid false data injection attack detection method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN113268730B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114629728B (en) * 2022-05-11 2022-09-09 深圳市永达电子信息股份有限公司 Network attack tracking method and device based on Kalman filtering
CN115134130B (en) * 2022-06-14 2023-04-18 浙江大学 DQN algorithm-based smart grid DoS attack detection method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110930265A (en) * 2019-12-12 2020-03-27 燕山大学 Power system false data injection attack detection method based on moving distance to ground
CN111783845A (en) * 2020-06-12 2020-10-16 浙江工业大学 Hidden false data injection attack detection method based on local linear embedding and extreme learning machine

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10671060B2 (en) * 2017-08-21 2020-06-02 General Electric Company Data-driven model construction for industrial asset decision boundary classification
CN109361678B (en) * 2018-11-05 2021-10-12 浙江工业大学 False data injection attack detection method for intelligent networked automobile automatic cruise system
CN110571787B (en) * 2019-09-26 2021-01-01 国网浙江省电力有限公司嘉兴供电公司 False data injection attack design and defense method for direct-current micro-grid
CN110889111A (en) * 2019-10-23 2020-03-17 广东工业大学 Power grid virtual data injection attack detection method based on deep belief network
CN110942109A (en) * 2019-12-17 2020-03-31 浙江大学 PMU false data injection attack prevention method based on machine learning

Also Published As

Publication number Publication date
CN113268730A (en) 2021-08-17

Similar Documents

Publication Publication Date Title
CN113268730B (en) Smart power grid false data injection attack detection method based on reinforcement learning
CN112783940B (en) Multi-source time sequence data fault diagnosis method and medium based on graph neural network
CN117238058B (en) Starter monitoring method for automobile based on data analysis
CN110087207B (en) Method for reconstructing missing data of wireless sensor network
CN104921736A (en) Continuous blood glucose monitoring device comprising parameter estimation function filtering module
CN117077044B (en) Method and device for judging faults of vacuum circuit breaker for generator
CN116010485B (en) Unsupervised anomaly detection method for dynamic period time sequence
CN117436005B (en) Abnormal data processing method in automatic ambient air monitoring process
CN113114530A (en) Network element health state detection method and equipment
US9613123B2 (en) Data stream processing
CN108960329A (en) A kind of chemical process fault detection method comprising missing data
CN115719294A (en) Indoor pedestrian flow evacuation control method and system, electronic device and medium
CN117454283A (en) State evaluation method for wind turbine generator operation detection data
CN107682354A (en) A kind of network virus detection method, apparatus and equipment
CN117092980B (en) Electrical fault detection control system based on big data
CN114564345A (en) Server abnormity detection method, device, equipment and storage medium
CN116400168A (en) Power grid fault diagnosis method and system based on depth feature clustering
CN116186581A (en) Floor identification method and system based on graph pulse neural network
Xie et al. Adaptive and online fault detection using RPCA algorithm in wireless sensor network nodes
CN111865267B (en) Temperature measurement data prediction method and device
CN114781083A (en) Engine steady-state data hierarchical analysis and steady-state data characteristic value extraction method
CN115168154A (en) Abnormal log detection method, device and equipment based on dynamic baseline
CN111625525B (en) Environment data repairing/filling method and system
Qian et al. Multi channels data fusion algorithm on quantum genetic algorithm for sealed relays
CN112651087A (en) Train motor fault detection method based on distributed estimation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant