CN113268730A - Smart grid false data injection attack detection method based on reinforcement learning - Google Patents

Smart grid false data injection attack detection method based on reinforcement learning

Info

Publication number
CN113268730A
Authority
CN
China
Prior art keywords
attack
value
time
data injection
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110486653.5A
Other languages
Chinese (zh)
Other versions
CN113268730B (en)
Inventor
吴争光
张阔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qunzhi Future Artificial Intelligence Technology Research Institute Wuxi Co ltd
Original Assignee
Qunzhi Future Artificial Intelligence Technology Research Institute Wuxi Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qunzhi Future Artificial Intelligence Technology Research Institute Wuxi Co ltd
Priority to CN202110486653.5A
Publication of CN113268730A
Application granted
Publication of CN113268730B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures
    • G06F21/556: Detecting local intrusion or implementing counter-measures involving covert channels, i.e. data leakage between processes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/11: Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/16: Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures
    • G06F21/56: Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566: Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06: Energy or water supply
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S40/00: Systems for electrical power generation, transmission, distribution or end-user application management characterised by the use of communication or information technologies, or communication or information technology specific aspects supporting them
    • Y04S40/20: Information technology specific aspects, e.g. CAD, simulation, modelling, system security

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Algebra (AREA)
  • Economics (AREA)
  • Databases & Information Systems (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • Water Supply & Treatment (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Public Health (AREA)
  • Computing Systems (AREA)
  • Virology (AREA)
  • Operations Research (AREA)
  • Emergency Alarm Devices (AREA)

Abstract

The invention discloses a reinforcement-learning-based method for detecting false data injection attacks in a smart grid. The method is built on the Sarsa algorithm and divides attacks into direct false data injection attacks and covert (hidden) false data injection attacks, which are detected separately. To construct the observations, direct attack detection combines the residual method with threshold segmentation, while covert attack detection combines the norm of the difference between successive measurements with threshold segmentation. Each set of observations is used to train a Q table, and the Q tables are then used to detect attacks. The design detects covert attacks quickly, increases detection speed and success rate, is simple to implement, and can markedly improve detection efficiency.

Description

Smart grid false data injection attack detection method based on reinforcement learning
Technical Field
The invention relates to the field of smart power grids, in particular to a smart power grid false data injection attack detection method based on reinforcement learning.
Background
The smart grid is a new power grid technology that separates the information transmission channel from the power transmission channel, giving the grid more efficient allocation of power resources and stronger resistance to interference. As information technology has developed, the information security of the smart grid has become an important concern. False data injection attacks play an important role in research on smart grid information attacks. Their core idea is to exploit loopholes in traditional detection methods and use constructed attack vectors to corrupt the state estimation of the grid, thereby undermining the safe and stable operation of the power system. The traditional method for detecting false data injection attacks is bad data detection. It can only detect direct false data injection attacks and cannot detect covert false data injection attacks, and because it relies on a single threshold its detection success rate is mediocre. Current machine learning methods can detect covert false data injection attacks, but a detection method that handles both the case where the attack vector is far larger than the noise and the case where it is only slightly larger than the process noise is still lacking.
Disclosure of Invention
The invention aims to provide a reinforcement-learning-based smart grid false data injection attack detection method that addresses the shortcomings of the prior art.
The purpose of the invention is achieved by the following technical scheme: a false data injection attack detection method based on reinforcement learning, comprising the following steps:
Step one: establish a general linear model of the power grid:
$x_t = A x_{t-1} + v_t$ (1)
$y_t = H x_t + w_t$ (2)
where $x_t = [x_{1,t}, \dots, x_{n,t}, \dots, x_{N,t}]$ is the system state at time t, $x_{n,t}$ is the phase angle of the nth node at time t, and N is the total number of states of the system; the measurement at time t is $y_t = [y_{1,t}, \dots, y_{m,t}, \dots, y_{M,t}]$, where $y_{m,t}$ is the reading of the mth measuring instrument at time t and M is the total number of measuring instruments; $A \in \mathbb{R}^{N \times N}$ is the state transition matrix, $H \in \mathbb{R}^{M \times N}$ is the Jacobian matrix determined by the grid topology, and $\mathbb{R}$ denotes the set of real numbers; $v_t \sim \mathcal{N}(0, \sigma_v^2 I_N)$ is the system noise at time t, where $\sigma_v^2$ is the variance of the process noise, whose value is determined by the system, and $I_N$ is the N-dimensional identity matrix; $w_t \sim \mathcal{N}(0, \sigma_w^2 I_M)$ is the measurement noise at time t, where $\sigma_w^2$ is the variance of the measurement noise, whose value is determined by the measuring device, and $I_M$ is the M-dimensional identity matrix;
Step two: obtain samples through simulated attacks: for a direct attack the attacked measurement is obtained from equation (3), and for a covert attack it is obtained from equation (4),
$y_t = H x_t + w_t + a_t \, \mathbb{1}\{t \ge \tau\}$ (3)
$y_t = H x_t + w_t + H c_t \, \mathbb{1}\{t \ge \tau\}$ (4)
where $a_t$ is the attack vector of a direct attack at time t and $H c_t$ is the attack vector of a covert attack; because H does not change over time, $c_t$ is used to denote the covert attack vector; $a_t$ and $c_t$ are known during sample training and unknown during actual detection; τ is the time at which the system is attacked; and $\mathbb{1}\{t \ge \tau\}$ is the step function, equal to 1 when t ≥ τ and 0 otherwise;
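The reason a covert attack of the form $H c_t$ escapes residual-based bad data detection can be sketched with a short derivation. This is an illustration only: it assumes an ordinary least-squares state estimate and a full-column-rank H, which is a simplification and not necessarily the estimator used by the method.

```latex
\begin{aligned}
\hat{x} &= (H^\top H)^{-1} H^\top y, \qquad P = H (H^\top H)^{-1} H^\top, \\
r &= y - H\hat{x} = (I_M - P)\, y, \\
\text{covert: } r' &= (I_M - P)(y + H c_t) = r + (H - P H)\, c_t = r, \\
\text{direct: } r' &= (I_M - P)(y + a_t) = r + (I_M - P)\, a_t .
\end{aligned}
```

Since $P H = H$, the covert term cancels and the residual is unchanged, whereas a direct attack vector $a_t$ generally leaves a nonzero component in the residual. Under this simplified estimator the residual carries no information about a covert attack, which is consistent with the method using a separate statistic, the change between successive measurements, for covert attacks.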
Step three: obtain the observations: the norm of the residual between the measurement $y_t$ and its estimate $\hat{y}_t$ is used to detect direct false data injection attacks, and the norm of the difference between the current measurement $y_t$ and the previous measurement $y_{t-1}$ is used to detect covert false data injection attacks; threshold segmentation grades each of the two norm values, giving an instantaneous observation for direct false data injection attacks and one for covert false data injection attacks; a sliding window method then folds the two instantaneous observations into the observations, giving the direct-attack observation and the covert-attack observation for the corresponding time;
Step four: derive the detector action $a_t$ at time t using an ε-greedy strategy: the system has two states, $s_n$ (the system is not attacked) and $s_a$ (the system is attacked), and the detector likewise has two actions, $a_s$ (the algorithm considers the system to be under attack and issues an alarm) and $a_c$ (the algorithm considers the system not to be under attack and issues no alarm); at time t the direct-attack observation $o_t^n$ and the covert-attack observation $o_t^s$ are obtained, and the detector actions are selected with the ε-greedy strategy from the Q table $Q^n$ for direct false data injection attack detection and the Q table $Q^s$ for covert false data injection attack detection, respectively; under the ε-greedy strategy the detector selects the optimal action with probability 1-ε and a random action with probability ε; ε is updated once every d steps according to equation (5)
ε=max(ε-e-1,εmin) (5)
where e is the number of samples used so far and εmin is a manually set minimum value of ε;
Step five: train using the Sarsa algorithm, updating the Q table with equation (6),
$Q^i(o_t^i, a_t^i) \leftarrow Q^i(o_t^i, a_t^i) + \alpha \left[ r_t^i + \gamma^i Q^i(o_{t+1}^i, a_{t+1}^i) - Q^i(o_t^i, a_t^i) \right]$ (6)
where a parameter carrying the superscript i is a parameter for detecting type-i attacks, with i = n or s; i = n indicates that the parameter is for detecting direct attacks and i = s indicates that it is for detecting covert attacks; $Q^i$ is the Q table for detecting type-i attacks; $o_t^i$ is the observation at time t used to detect type-i attacks; $a_t^i$ is the action taken for type-i attacks after $o_t^i$ is obtained at time t; α is the learning rate; $\gamma^i$ is the discount factor used when training against type-i attacks; $r_t^i$ is the reward received when, at time t, the state for type-i detection is $s_t^i$ and the action is $a_t^i$, with its value given by equation (7),
[equation image not reproduced] (7)
where $r_0$ and b are the preset early-alarm and late-alarm reward coefficients, and $s_t^i$ is the system state for type-i detection at time t;
Step six: repeat steps one to five until the maximum detection time T of the current episode is reached, or until $a_s$ occurs in either $a_t^n$ or $a_t^s$;
Step seven: repeat steps one to six until all E samples have been used, giving the complete $Q^n$ table and $Q^s$ table;
Step eight: during detection, obtain the observations $o_t^n$ and $o_t^s$ using steps one to three, and obtain the actions $a_t^n$ and $a_t^s$ from the $Q^n$ table and the $Q^s$ table, respectively, using equation (8); while both actions are $a_c$, repeat these steps; as soon as either $a_t^n$ or $a_t^s$ is $a_s$, stop detection and raise an alarm; when $a_t^n$ is $a_s$ the system is considered to be under a direct false data injection attack, and when $a_t^s$ is $a_s$ the system is considered to be under a covert false data injection attack.
[equation image not reproduced] (8)
Further, step three is realized by the following sub-steps:
(3.1) setting the thresholds: set the direct false data injection attack thresholds and the covert false data injection attack thresholds according to the particular grid structure [threshold symbols: images not reproduced];
(3.2) obtaining the measurement: obtain the measurement $y_t$ at time t from the measuring instruments and recall the measurement $y_{t-1}$ from time t-1;
(3.3) estimating the measurement using Kalman filtering: obtain the state estimate $\hat{x}_t$ at time t using the least squares algorithm given by equations (9) and (10), and compute the measurement estimate $\hat{y}_t$ at time t using equation (11)
[equations (9)-(11): images not reproduced]
where the matrix appearing in these equations is the covariance matrix of the measurement deviation;
(3.4) calculating the squared norms of the deviations: using equations (12) and (13), respectively, compute the squared norm of the deviation between the measurement and its estimate at time t and the squared norm of the change in the measurement between time t-1 and time t
[equations (12)-(13): images not reproduced]
(3.5) obtaining the instantaneous observations by threshold segmentation: from the two squared norms, the instantaneous observations for direct and covert false data injection attacks are obtained according to equation (14)
[equation (14): image not reproduced]
because the threshold segmentation is the same for both attack types, the superscript i in equation (14) stands for either n or s;
(3.6) obtaining the observations using a sliding window: let the direct false data injection attack observation at time t-1 be $o_{t-1}^n$ and the covert false data injection attack observation be $o_{t-1}^s$, each made up of the most recent instantaneous observations; using the sliding window method, append the corresponding instantaneous observation at time t to the observation at time t-1 and then remove the oldest instantaneous observation, giving the direct false data injection attack observation $o_t^n$ and the covert false data injection attack observation $o_t^s$ at time t.
The advantages of the invention are that false data injection attacks are detected with the Sarsa algorithm, which improves both the accuracy and the speed of detection; covert false data injection attacks are also detected effectively; and direct and covert false data injection attacks can both be detected conveniently.
Drawings
Figure 1 is a diagram of the IEEE 14-node system, from which the H matrix can be obtained,
Figure 2 is a flow chart of Q table training,
Figure 3 is a flow chart of detection,
Figure 4 shows the early alarm rate,
Figure 5 shows the delayed alarm rate,
Figure 6 shows the total alarm failure rate,
Figure 7 shows the early alarm rate,
Figure 8 shows the delayed alarm rate,
Figure 9 shows the total alarm failure rate,
Figure 10 shows the alarm category error rate,
Figure 11 shows the instantaneous detection success rate for covert false data injection attacks,
Figure 12 shows the instantaneous detection success rate for direct false data injection attacks.
Detailed Description
For the purposes of promoting an understanding and appreciation of the invention, reference will now be made in detail to the present embodiments of the invention illustrated in the accompanying drawings.
Example 1: referring to fig. 1-4, a smart grid false data injection attack detection method based on reinforcement learning includes the following steps:
Step one: establish a general linear model of the power grid:
$x_t = A x_{t-1} + v_t$ (1)
$y_t = H x_t + w_t$ (2)
where $x_t = [x_{1,t}, \dots, x_{n,t}, \dots, x_{N,t}]$ is the system state at time t, $x_{n,t}$ is the phase angle of the nth node at time t, and N is the total number of states of the system, taken as 14; the measurement at time t is $y_t = [y_{1,t}, \dots, y_{m,t}, \dots, y_{M,t}]$, where $y_{m,t}$ is the reading of the mth measuring instrument at time t and M is the total number of measuring instruments, taken as 23; $A \in \mathbb{R}^{N \times N}$ is the state transition matrix, set to the identity matrix; $H \in \mathbb{R}^{M \times N}$ is the Jacobian matrix determined by the grid topology, and $\mathbb{R}$ denotes the set of real numbers; $v_t \sim \mathcal{N}(0, \sigma_v^2 I_N)$ is the system noise at time t, where the process noise variance $\sigma_v^2$ is taken as $10^{-4}$ and $I_N$ is the N-dimensional identity matrix; $w_t \sim \mathcal{N}(0, \sigma_w^2 I_M)$ is the measurement noise at time t, where the measurement noise variance $\sigma_w^2$ is taken as $2 \times 10^{-4}$ and $I_M$ is the M-dimensional identity matrix;
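The linear model of step one can be simulated in a few lines. The sketch below uses the embodiment's stated values (N = 14, M = 23, A equal to the identity, variances 10^-4 and 2 × 10^-4) and assumes Gaussian noise; the random matrix `H`, the zero initial state, and the function name `simulate_grid` are placeholders for illustration, since the actual Jacobian of the IEEE 14-node system is not given in the text.

```python
import numpy as np

def simulate_grid(T, H, sigma_v2=1e-4, sigma_w2=2e-4, seed=0):
    """Simulate x_t = A x_{t-1} + v_t and y_t = H x_t + w_t with A = I."""
    rng = np.random.default_rng(seed)
    M, N = H.shape
    A = np.eye(N)              # state transition matrix (identity in the embodiment)
    x = np.zeros(N)            # hypothetical initial state
    states, measurements = [], []
    for _ in range(T):
        x = A @ x + rng.normal(0.0, np.sqrt(sigma_v2), size=N)   # process noise v_t
        y = H @ x + rng.normal(0.0, np.sqrt(sigma_w2), size=M)   # measurement noise w_t
        states.append(x)
        measurements.append(y)
    return np.array(states), np.array(measurements)

# Placeholder H: a random 23 x 14 matrix stands in for the grid Jacobian.
H = np.random.default_rng(1).normal(size=(23, 14))
states, measurements = simulate_grid(T=300, H=H)
```

With A set to the identity, the state is a slow random walk, so successive measurements differ only by noise until an attack is injected.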
Step two: obtain samples through simulated attacks: for a direct attack the attacked measurement is obtained from equation (3), and for a covert attack it is obtained from equation (4),
$y_t = H x_t + w_t + a_t \, \mathbb{1}\{t \ge \tau\}$ (3)
$y_t = H x_t + w_t + H c_t \, \mathbb{1}\{t \ge \tau\}$ (4)
where $a_t$ is the attack vector of a direct attack at time t and $H c_t$ is the attack vector of a covert attack; because H does not change over time, $c_t$ is used to denote the covert attack vector; $a_t$ and $c_t$ are known during sample training and unknown during actual detection; τ is the time at which the system is attacked, set so that 10 < τ < 200; and $\mathbb{1}\{t \ge \tau\}$ is the step function, equal to 1 when t ≥ τ and 0 when t < τ;
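Generating attacked training samples as described in step two might look like the following sketch. The function name `inject_attack`, the zero-based time index, and drawing every attack component from a single uniform interval `[low, high)` are assumptions made for illustration.

```python
import numpy as np

def inject_attack(y_clean, H, tau, kind, low, high, seed=0):
    """Return a copy of the measurements with an attack added from time index tau on.

    kind="direct" adds a_t to y_t; kind="covert" adds H c_t to y_t.
    """
    rng = np.random.default_rng(seed)
    y = np.array(y_clean, dtype=float)      # copy, so the clean series is untouched
    T, M = y.shape
    N = H.shape[1]
    for t in range(tau, T):                 # step function: attack active for t >= tau
        if kind == "direct":
            y[t] += rng.uniform(low, high, size=M)        # a_t
        elif kind == "covert":
            y[t] += H @ rng.uniform(low, high, size=N)    # H c_t
        else:
            raise ValueError("kind must be 'direct' or 'covert'")
    return y
```

A covert sample always lies in the column space of H, so it looks like a legitimate shift of the state, while a direct sample generally does not.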
Step three: obtain the observations: the norm of the residual between the measurement $y_t$ and its estimate $\hat{y}_t$ is used to detect direct false data injection attacks, and the norm of the difference between the current measurement $y_t$ and the previous measurement $y_{t-1}$ is used to detect covert false data injection attacks; threshold segmentation grades each of the two norm values, giving an instantaneous observation for direct false data injection attacks and one for covert false data injection attacks; a sliding window method then folds the two instantaneous observations into the observations, giving the direct-attack observation and the covert-attack observation for the corresponding time;
This step is the core of the invention and is divided into the following sub-steps.
3.1) Setting the thresholds.
Set the direct false data injection attack thresholds and the covert false data injection attack thresholds according to the particular grid structure [threshold symbols and the values used in this embodiment: images not reproduced].
3.2) Obtaining the measurement.
Obtain the measurement $y_t$ at time t from the measuring instruments and recall the measurement $y_{t-1}$ from time t-1.
3.3) Estimating the measurement using Kalman filtering.
Obtain the state estimate $\hat{x}_t$ at time t using the least squares algorithm given by equations (9) and (10), and compute the measurement estimate $\hat{y}_t$ at time t using equation (11)
[equations (9)-(11): images not reproduced]
where the matrix appearing in these equations is the covariance matrix of the measurement deviation;
3.4) Calculating the squared norms of the deviations.
Using equations (12) and (13), respectively, compute the squared norm of the deviation between the measurement and its estimate at time t and the squared norm of the change in the measurement between time t-1 and time t
[equations (12)-(13): images not reproduced]
3.5) Obtaining the instantaneous observations by threshold segmentation.
From the two squared norms, the instantaneous observations for direct and covert false data injection attacks are obtained according to equation (14)
[equation (14): image not reproduced]
because the threshold segmentation is the same for both attack types, the superscript i in equation (14) stands for either n or s;
3.6) Obtaining the observations using a sliding window.
Let the direct false data injection attack observation at time t-1 be $o_{t-1}^n$ and the covert false data injection attack observation be $o_{t-1}^s$, each made up of the most recent instantaneous observations. Using the sliding window method, append the corresponding instantaneous observation at time t to the observation at time t-1 and then remove the oldest instantaneous observation, giving the direct false data injection attack observation $o_t^n$ and the covert false data injection attack observation $o_t^s$ at time t.
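Sub-steps 3.5 and 3.6 can be sketched as a small helper that grades a squared norm against a list of thresholds and keeps a fixed-length window of the resulting levels. The threshold values, the window length of 3, the integer level encoding, and the names `segment` and `SlidingObservation` are all hypothetical, because equation (14) and the embodiment's threshold values are not legible in the text.

```python
import bisect
from collections import deque

def segment(value, thresholds):
    """Threshold segmentation: index of the interval that `value` falls into."""
    return bisect.bisect_right(thresholds, value)

class SlidingObservation:
    """Keeps the last `window` instantaneous observations as one hashable tuple."""

    def __init__(self, window, thresholds):
        self.thresholds = sorted(thresholds)
        # Start with an all-zero window; deque(maxlen=...) drops the oldest entry.
        self.levels = deque([0] * window, maxlen=window)

    def update(self, squared_norm):
        self.levels.append(segment(squared_norm, self.thresholds))
        return tuple(self.levels)

# One instance per attack type, each with its own (hypothetical) thresholds.
direct_obs = SlidingObservation(window=3, thresholds=[0.01, 0.05])
covert_obs = SlidingObservation(window=3, thresholds=[0.02, 0.08])
```

Returning a tuple makes the observation usable directly as a dictionary key, which is convenient for a tabular Q function.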
Step four: derive the detector action $a_t$ at time t using an ε-greedy strategy: the system has two states, $s_n$ (the system is not attacked) and $s_a$ (the system is attacked), and the detector likewise has two actions, $a_s$ (the algorithm considers the system to be under attack and issues an alarm) and $a_c$ (the algorithm considers the system not to be under attack and issues no alarm); at time t the direct-attack observation $o_t^n$ and the covert-attack observation $o_t^s$ are obtained, and the detector actions are selected with the ε-greedy strategy from the Q table $Q^n$ for direct false data injection attack detection and the Q table $Q^s$ for covert false data injection attack detection, respectively; under the ε-greedy strategy the detector selects the optimal action with probability 1-ε and a random action with probability ε; ε is updated once every d steps, with d = 40, according to equation (5)
ε=max(ε-e-1,εmin) (5)
where e is the number of samples used so far, εmin is a manually set minimum value of ε, taken as 0.01, and the initial value of ε is set to 0.2;
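The ε-greedy selection of step four can be sketched as below. The dictionary-based Q table keyed by `(observation, action)`, the action labels, and the reading of "optimal action" as the action with the largest Q value are assumptions; the ε update schedule of equation (5) is deliberately left out and ε is passed in by the caller.

```python
import random

ACTIONS = ("continue", "alarm")   # stand-ins for a_c (no alarm) and a_s (alarm)

def epsilon_greedy(q_table, obs, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    values = [q_table.get((obs, a), 0.0) for a in ACTIONS]   # unseen pairs count as 0
    return ACTIONS[values.index(max(values))]                # ties go to "continue"
```

For example, `epsilon_greedy(q_direct, obs, epsilon=0.2)` would correspond to the embodiment's initial exploration rate, where `q_direct` is a hypothetical name for the direct-attack Q table.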
Step five: train using the Sarsa algorithm, updating the Q table with equation (6),
$Q^i(o_t^i, a_t^i) \leftarrow Q^i(o_t^i, a_t^i) + \alpha \left[ r_t^i + \gamma^i Q^i(o_{t+1}^i, a_{t+1}^i) - Q^i(o_t^i, a_t^i) \right]$ (6)
where a parameter carrying the superscript i is a parameter for detecting type-i attacks, with i = n or s; i = n indicates that the parameter is for detecting direct attacks and i = s indicates that it is for detecting covert attacks; $Q^i$ is the Q table for detecting type-i attacks; $o_t^i$ is the observation at time t used to detect type-i attacks; $a_t^i$ is the action taken for type-i attacks after $o_t^i$ is obtained at time t; α is the learning rate, set to 0.1; $\gamma^i$ is the discount factor used when training against type-i attacks, set to 1 for both direct and covert false data injection attacks; $r_t^i$ is the reward received when, at time t, the state for type-i detection is $s_t^i$ and the action is $a_t^i$, with its value given by equation (7),
[equation image not reproduced] (7)
where $r_0$ and b are the preset early-alarm and late-alarm reward coefficients, set to $r_0$ = 1 and b = 0.01, and $s_t^i$ is the system state for type-i detection at time t;
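A single Sarsa update for one attack type can be sketched as follows, using the textbook on-policy update with the embodiment's α = 0.1 and γ = 1. The dictionary Q table and the function name `sarsa_update` are assumptions, and the reward is supplied by the caller because the reward definition of equation (7) is not reproduced here.

```python
def sarsa_update(q_table, obs, action, reward, next_obs, next_action,
                 alpha=0.1, gamma=1.0):
    """Textbook Sarsa: Q(o,a) += alpha * (r + gamma * Q(o',a') - Q(o,a))."""
    old = q_table.get((obs, action), 0.0)
    target = reward + gamma * q_table.get((next_obs, next_action), 0.0)
    q_table[(obs, action)] = old + alpha * (target - old)
    return q_table[(obs, action)]
```

Because Sarsa is on-policy, `next_action` should be the action actually chosen by the ε-greedy policy at the next step, not the greedy maximum. Two independent tables would be trained this way, one for direct and one for covert attacks.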
Step six: repeat steps one to five until the maximum detection time T = 300 of the current episode is reached, or until $a_s$ occurs in either $a_t^n$ or $a_t^s$;
Step seven: repeat steps one to six until all E = 40000 samples have been used, giving the complete $Q^n$ table and $Q^s$ table;
Step eight: during detection, obtain the observations $o_t^n$ and $o_t^s$ using steps one to three, and obtain the actions $a_t^n$ and $a_t^s$ from the $Q^n$ table and the $Q^s$ table, respectively, using equation (8); while both actions are $a_c$, repeat these steps; as soon as either $a_t^n$ or $a_t^s$ is $a_s$, stop detection and raise an alarm; when $a_t^n$ is $a_s$ the system is considered to be under a direct false data injection attack, and when $a_t^s$ is $a_s$ the system is considered to be under a covert false data injection attack.
[equation image not reproduced] (8)
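The detection loop of step eight can be sketched as below. The greedy rule (largest Q value), the dictionary Q tables, and the choice to report a direct attack first when both detectors alarm at the same step are assumptions; the text does not say how a simultaneous alarm is resolved.

```python
ACTIONS = ("continue", "alarm")   # stand-ins for a_c and a_s

def greedy_action(q_table, obs):
    """Action with the largest Q value for this observation (ties: 'continue')."""
    values = [q_table.get((obs, a), 0.0) for a in ACTIONS]
    return ACTIONS[values.index(max(values))]

def detect(observations, q_direct, q_covert):
    """Scan (direct_obs, covert_obs) pairs and stop at the first alarm.

    Returns (time index, attack kind), or (None, None) if no alarm is raised.
    """
    for t, (obs_n, obs_s) in enumerate(observations):
        if greedy_action(q_direct, obs_n) == "alarm":
            return t, "direct"
        if greedy_action(q_covert, obs_s) == "alarm":
            return t, "covert"
    return None, None
```

The function returns as soon as one detector alarms, matching the stop-and-alarm behaviour described in the step.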
The results are described with reference to the drawings. FIG. 4 shows the early alarm rate, where $a_t$ and $c_t$ follow, in turn, uniform distributions on [0, 0.075], [0.075, 0.15], [0.1, 0.175], [0.15, 0.225] and [0.175, 0.25]. FAR denotes the early alarm rate, i.e. the frequency with which an alarm is raised when no attack has occurred, calculated as the number of early alarms divided by the total number of detections. "direct attack test" denotes the result of detecting direct false data injection attacks with the present method, and "stealth attack test" denotes the result of detecting concealed false data injection attacks with the present method. BDD denotes the result of the conventional bad data detection method, which can only detect direct false data injection attacks (its detection threshold is set to 0.006). lm is the lower limit and um is the upper limit of the interval on which $a_t$ and $c_t$ are distributed.

FIG. 5 shows the late alarm rate for the same distributions of $a_t$ and $c_t$ as in FIG. 4. DAR denotes the late alarm rate, i.e. the frequency with which no alarm has been raised after more than 10 detections following an attack, calculated as the number of late alarms divided by the total number of detections.

FIG. 6 shows the total alarm failure rate for the same distributions as in FIG. 4. TFR denotes the total alarm failure rate, i.e. the total frequency of detection failures, covering FAR, DAR and CER, calculated as the total number of failures divided by the total number of detections.

FIG. 7 shows the early alarm rate, FIG. 8 the late alarm rate and FIG. 9 the total alarm failure rate, where in each case $a_t$ and $c_t$ follow, in turn, uniform distributions on [0, 0.15], [0.01, 0.25], [0.15, 0.3], [0.2, 0.35] and [0.25, 0.4].

FIG. 10 shows the alarm category error rate, where um - lm = 0.075 denotes $a_t$ and $c_t$ following the distributions of FIG. 4, um - lm = 0.15 denotes $a_t$ and $c_t$ following the distributions of FIG. 7, and um - lm = 0.05 denotes $a_t$ and $c_t$ following, in turn, uniform distributions on [0.03, 0.08], [0.05, 0.1], [0.1, 0.15], [0.15, 0.2] and [0.2, 0.25]. CER denotes the alarm category error rate, i.e. the frequency with which a direct false data injection attack and a concealed false data injection attack are mistaken for one another, calculated as the number of category detection errors divided by the total number of detections.

FIG. 11 shows the instantaneous detection success rate for concealed false data injection attacks, with the same three distributions as in FIG. 10. SDR denotes the instantaneous detection success rate, i.e. the frequency with which an alarm is raised immediately after the attack, calculated as the number of immediate alarms divided by the total number of detections. FIG. 12 shows the instantaneous detection success rate for direct false data injection attacks, with the same three distributions as in FIG. 10.
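The rates defined for FIGS. 4 to 12 can be illustrated with a short Python sketch. It is only one possible reading of those definitions: the `Trial` record, its field names and the rule that an "immediate" alarm is one raised at the attack step itself are assumptions made for illustration, not details taken from the patent.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class Trial:
    """Outcome of one detection run (all field names are hypothetical)."""
    attack_start: int            # detection step at which the attack begins
    alarm_step: Optional[int]    # step at which the alarm was raised, None if never
    true_type: str               # "direct" or "stealth"
    alarmed_type: Optional[str]  # attack type reported together with the alarm


def alarm_rates(trials: Sequence[Trial], max_delay: int = 10) -> Dict[str, float]:
    """Compute FAR, DAR, CER, TFR and SDR as counts divided by total detections."""
    total = len(trials)
    if total == 0:
        raise ValueError("at least one trial is required")
    early = late = wrong_class = immediate = 0
    for tr in trials:
        if tr.alarm_step is not None and tr.alarm_step < tr.attack_start:
            early += 1            # alarm raised although no attack has occurred yet
        elif tr.alarm_step is None or tr.alarm_step - tr.attack_start > max_delay:
            late += 1             # still no alarm more than max_delay detections later
        else:
            if tr.alarmed_type != tr.true_type:
                wrong_class += 1  # alarm in time, but wrong attack category
            if tr.alarm_step == tr.attack_start:
                immediate += 1    # assumed meaning of "immediately after the attack"
    return {
        "FAR": early / total,
        "DAR": late / total,
        "CER": wrong_class / total,
        "TFR": (early + late + wrong_class) / total,
        "SDR": immediate / total,
    }
```

Each trial is counted in at most one failure class here, so TFR is simply the sum of FAR, DAR and CER.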
It should be noted that the above-mentioned embodiments are not intended to limit the scope of the present invention, and all equivalent modifications and substitutions based on the above-mentioned technical solutions are within the scope of the present invention as defined in the claims.

Claims (10)

1. A smart grid false data injection attack detection method based on reinforcement learning, characterized by comprising the following steps:
step one: establish a general linear model of the power grid;
step two: obtain samples by virtual attack;
step three: obtain the observed values;
step four: derive the detector action $a_t$ at time t using an ε-greedy strategy;
step five: train using the Sarsa algorithm;
step six: repeat steps one to five until the maximum detection time T of the stage is reached, or until one of
Figure FDA0003050655560000011
and
Figure FDA0003050655560000012
is $a_s$;
step seven: repeat steps one to six until all E samples have been used, obtaining the complete $Q^d$ table and $Q^s$ table;
step eight: perform detection and determine whether the system is under a direct false data injection attack.
2. The reinforcement learning-based smart grid false data injection attack detection method according to claim 1, characterized in that step one, establishing a general linear model of the power grid, is specifically:
$x_t = A x_{t-1} + v_t$ (1)
$y_t = H x_t + w_t$ (2)
where $x_t = [x_{1,t}, \dots, x_{n,t}, \dots, x_{N,t}]$ is the system state at time t, $x_{n,t}$ is the phase angle of the n-th node at time t, and N is the total number of system states; the measurement at time t is $y_t = [y_{1,t}, \dots, y_{m,t}, \dots, y_{M,t}]$, where $y_{m,t}$ is the value measured by the m-th meter at time t and M is the total number of meters;
$A \in \mathbb{R}^{N \times N}$ is the state transition matrix, $H \in \mathbb{R}^{M \times N}$ is the Jacobian matrix determined by the grid topology, and $\mathbb{R}$ denotes the set of real numbers;
$v_t \sim \mathcal{N}(0, \sigma_v^2 I_N)$ is the system noise at time t, $\sigma_v^2$ is the variance of the process noise, whose value is determined by the system, and $I_N$ is the N-dimensional identity matrix;
$w_t \sim \mathcal{N}(0, \sigma_w^2 I_M)$ is the measurement noise at time t, $\sigma_w^2$ is the variance of the measurement noise, whose value is determined by the measuring device, and $I_M$ is the M-dimensional identity matrix.
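As an illustration of equations (1) and (2), the following minimal Python sketch simulates the linear model. It assumes zero-mean Gaussian process and measurement noise with standard deviations `sigma_v` and `sigma_w`; the function name `simulate_grid`, the matrix sizes and all numerical values are hypothetical and chosen only for the example.

```python
import numpy as np


def simulate_grid(A, H, sigma_v, sigma_w, x0, T, rng):
    """Simulate x_t = A x_{t-1} + v_t and y_t = H x_t + w_t for T steps."""
    N, M = A.shape[0], H.shape[0]
    xs = np.empty((T, N))
    ys = np.empty((T, M))
    x = np.asarray(x0, dtype=float)
    for t in range(T):
        x = A @ x + sigma_v * rng.standard_normal(N)   # equation (1)
        y = H @ x + sigma_w * rng.standard_normal(M)   # equation (2)
        xs[t], ys[t] = x, y
    return xs, ys


# Illustrative usage with made-up dimensions (3 states, 5 meters).
rng = np.random.default_rng(0)
A = np.eye(3)
H = rng.standard_normal((5, 3))
xs, ys = simulate_grid(A, H, sigma_v=1e-2, sigma_w=1e-2, x0=np.zeros(3), T=100, rng=rng)
```

The sketch returns both the hidden states and the measurements so that later processing steps can be checked against the true state.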
3. The reinforcement learning-based smart grid false data injection attack detection method according to claim 1, characterized in that step two, obtaining samples by virtual attack, is specifically: the attacked measurement is obtained using equation (3) for a direct attack and using equation (4) for a concealed attack,
$y_t = H x_t + w_t + a_t \, \mathbb{1}\{t \ge \tau\}$ (3)
$y_t = H x_t + w_t + H c_t \, \mathbb{1}\{t \ge \tau\}$ (4)
where $a_t$ is the attack vector of a direct attack at time t and $H c_t$ is the attack vector of a concealed attack; because H does not change over time, $c_t$ is used to denote the concealed attack vector; $a_t$ and $c_t$ are known during sample training and unknown during actual detection; τ is the time at which the system is attacked; and $\mathbb{1}\{t \ge \tau\}$ is the step function, i.e. $\mathbb{1}\{t \ge \tau\} = 1$ when t ≥ τ.
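The virtual attacks of this claim can be sketched as follows: from the attack time τ onward, a direct attack adds a vector $a_t$ to the measurement and a concealed attack adds $H c_t$. The sketch draws the entries of $a_t$ and $c_t$ from a uniform distribution on [low, high], as in the tests described for the drawings; the function name `inject_attack` and its parameters are assumptions made for illustration.

```python
import numpy as np


def inject_attack(y, H, tau, rng, kind, low, high):
    """Return a copy of the measurements y (T x M) attacked from step tau onward."""
    y_att = np.array(y, dtype=float, copy=True)
    T, M = y_att.shape
    N = H.shape[1]
    for t in range(tau, T):                      # step function: active when t >= tau
        if kind == "direct":
            y_att[t] += rng.uniform(low, high, size=M)       # a_t added to y_t
        elif kind == "concealed":
            y_att[t] += H @ rng.uniform(low, high, size=N)   # H c_t added to y_t
        else:
            raise ValueError("kind must be 'direct' or 'concealed'")
    return y_att
```

Because the concealed attack lies in the column space of H, it is consistent with some shifted state, which is why it is handled separately from the direct attack.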
4. The reinforcement learning-based smart grid false data injection attack detection method according to claim 1, characterized in that step three, obtaining the observed values, is specifically: the norm of the residual between the measured value $y_t$ and its estimate $\hat{y}_t$ is calculated and used to detect direct false data injection attacks, and the norm of the residual between the current measured value $y_t$ and the previous measured value $y_{t-1}$ is used to detect concealed false data injection attacks; a threshold segmentation method grades the two norms to obtain the instantaneous observation of the direct false data injection attack and the instantaneous observation of the concealed false data injection attack, respectively; a sliding window method then updates the two instantaneous observations into the observations, yielding the direct false data injection attack observation and the concealed false data injection attack observation at the corresponding time.
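The two residual signals of this claim can be sketched in Python as below. The estimate of the measurement is produced here by a textbook Kalman filter using the one-step prediction; this is an assumption, since the patent's own estimation equations are not reproduced in this text. The function name `residual_signals` and the initial values of the filter are likewise illustrative.

```python
import numpy as np


def residual_signals(ys, A, H, sigma_v, sigma_w):
    """Squared residual norms for direct and concealed attack detection.

    d_direct[t]    = ||y_t - y_hat_t||^2   (measurement vs. Kalman prediction)
    d_concealed[t] = ||y_t - y_{t-1}||^2   (change between consecutive measurements)
    """
    T, M = ys.shape
    N = A.shape[0]
    x_est, P = np.zeros(N), np.eye(N)            # assumed initial state and covariance
    Qv = sigma_v ** 2 * np.eye(N)
    Rw = sigma_w ** 2 * np.eye(M)
    d_direct = np.zeros(T)
    d_concealed = np.zeros(T)
    for t in range(T):
        x_pred = A @ x_est                       # predict the state
        P_pred = A @ P @ A.T + Qv
        innov = ys[t] - H @ x_pred               # y_t minus its estimate
        d_direct[t] = innov @ innov
        if t > 0:
            diff = ys[t] - ys[t - 1]
            d_concealed[t] = diff @ diff
        S = H @ P_pred @ H.T + Rw                # innovation covariance
        K = np.linalg.solve(S, H @ P_pred).T     # Kalman gain P_pred H^T S^{-1}
        x_est = x_pred + K @ innov               # update the state estimate
        P = (np.eye(N) - K @ H) @ P_pred
    return d_direct, d_concealed
```

A sudden jump in the measurements raises both signals at the step where it occurs, while a constant measurement sequence keeps the second signal at zero.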
5. The reinforcement learning-based smart grid false data injection attack detection method according to claim 1, characterized in that step four, deriving the detector action $a_t$ at time t using an ε-greedy strategy, is specifically: the system has two states, where $s_n$ denotes that the system is not attacked and $s_a$ denotes that the system is attacked; the detector likewise has two actions, where $a_s$ denotes that the algorithm considers the system to be under attack and raises an alarm, and $a_c$ denotes that the algorithm considers the system not to be under attack and raises no alarm; at time t, the direct attack detection observation
Figure FDA0003050655560000024
and the concealed attack detection observation
Figure FDA0003050655560000025
are obtained, and the detector actions are selected with the ε-greedy strategy from the Q table $Q^n$ for direct false data injection attack detection and the Q table $Q^s$ for concealed false data injection attack detection; under the ε-greedy strategy the detector selects the optimal action with probability 1-ε and a random action with probability ε; ε is updated once every d steps according to formula (5):
$\varepsilon = \max(\varepsilon - e^{-1}, \varepsilon_{\min})$ (5)
where e is the number of samples used so far and $\varepsilon_{\min}$ is a manually set minimum value of ε.
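The ε-greedy selection described in this claim can be sketched as follows. Two points are assumptions rather than statements of the patent: the "optimal" action is taken to be the one with the largest Q value (if the table stored costs, `min` would be used instead), and the decay rule is one reading of formula (5), subtracting 1/e and flooring at `eps_min`. The names `epsilon_greedy` and `decay_epsilon` are hypothetical.

```python
import random

ACTIONS = ("a_c", "a_s")  # a_c: continue without alarm, a_s: raise an alarm


def epsilon_greedy(q_row, epsilon, rng):
    """Pick an action for one observation; q_row maps action -> Q value."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)               # explore with probability epsilon
    return max(ACTIONS, key=lambda a: q_row[a])  # exploit with probability 1 - epsilon


def decay_epsilon(epsilon, e, eps_min):
    """Assumed reading of formula (5): epsilon <- max(epsilon - 1/e, eps_min)."""
    return max(epsilon - 1.0 / e, eps_min)


# Illustrative usage: select an action, then decay epsilon after sample e = 10.
rng = random.Random(0)
action = epsilon_greedy({"a_c": 0.2, "a_s": -0.1}, epsilon=0.3, rng=rng)
epsilon = decay_epsilon(0.3, e=10, eps_min=0.05)
```

In a training loop, `decay_epsilon` would be called once every d steps, as stated in the claim.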
6. The reinforcement learning-based smart grid false data injection attack detection method according to claim 1, characterized in that step five, training using the Sarsa algorithm, is specifically: the Q tables are updated using equation (6),
Figure FDA0003050655560000026
where a parameter bearing the superscript i is a parameter for detecting type-i attacks, with i = n or s; i = n indicates that the parameter is for detecting direct attacks and i = s indicates that it is for detecting concealed attacks; $Q^i$ is the Q table used for detecting type-i attacks,
Figure FDA0003050655560000027
is the observation at time t used for detecting type-i attacks,
Figure FDA0003050655560000028
is the action taken for type-i attacks at time t after obtaining the observation
Figure FDA0003050655560000029
α is the learning rate, $\gamma^i$ is the discount factor for training against type-i attacks, and
Figure FDA00030506555600000210
is the reward obtained at time t when the state for detecting type-i attacks is
Figure FDA00030506555600000211
and the action is
Figure FDA00030506555600000212
its value being given by formula (7),
Figure FDA00030506555600000213
where $r_0$ and b are the preset early alarm reward and late alarm reward coefficient, and
Figure FDA0003050655560000031
is the system state for type-i detection at time t.
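For orientation, the textbook Sarsa update that this claim refers to can be sketched in Python as below. This is the standard form of the algorithm, not a transcription of the patent's equation (6); the reward passed in is a plain number supplied by the caller, because the reward formula (7) is not reproduced in this text. The names `make_q_table` and `sarsa_update` are hypothetical.

```python
from collections import defaultdict

ACTIONS = ("a_c", "a_s")


def make_q_table():
    """Q table: observation -> {action: value}, initialised to zero on first use."""
    return defaultdict(lambda: {a: 0.0 for a in ACTIONS})


def sarsa_update(Q, obs, action, reward, next_obs, next_action,
                 alpha, gamma, terminal=False):
    """One on-policy update: Q(o,a) += alpha * (r + gamma * Q(o',a') - Q(o,a))."""
    if terminal:
        target = reward                                   # no successor after an alarm
    else:
        target = reward + gamma * Q[next_obs][next_action]
    Q[obs][action] += alpha * (target - Q[obs][action])
    return Q[obs][action]
```

One table would be kept per attack type (i = n and i = s), each updated with its own observation, action and reward.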
7. The reinforcement learning-based smart grid false data injection attack detection method according to claim 1, characterized in that step six is specifically: repeat steps one to five until the maximum detection time T of the stage is reached, or until one of
Figure FDA0003050655560000032
and
Figure FDA0003050655560000033
is $a_s$.
8. The reinforcement learning-based smart grid false data injection attack detection method according to claim 1, characterized in that step seven is specifically: repeat steps one to six until all E samples have been used, obtaining the complete $Q^d$ table and $Q^s$ table.
9. The reinforcement learning-based smart grid false data injection attack detection method according to claim 1, characterized in that step eight is specifically: during detection, steps one to three are used to obtain the observed values
Figure FDA0003050655560000034
and
Figure FDA0003050655560000035
and formula (8) is used to obtain, from the $Q^d$ table and the $Q^s$ table, the actions
Figure FDA0003050655560000036
and
Figure FDA0003050655560000037
when both actions are $a_c$, the above steps are repeated until one of
Figure FDA0003050655560000038
and
Figure FDA0003050655560000039
is $a_s$, at which point detection stops and an alarm is raised; when
Figure FDA00030506555600000310
is $a_s$, the system is considered to be under a direct false data injection attack, and when
Figure FDA00030506555600000311
is $a_s$, the system is considered to be under a concealed false data injection attack,
Figure FDA00030506555600000312
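The detection stage of this claim can be sketched as a simple loop over the two trained Q tables. The sketch assumes that each table is a plain dictionary mapping an observation to per-action values, that the greedy action is the one with the largest value, that an observation never seen in training defaults to "no alarm", and that a direct alarm takes precedence if both detectors fire at the same step; none of these choices is stated in the patent, and the function names are hypothetical.

```python
def greedy_action(Q, obs, default="a_c"):
    """Return the action with the largest Q value for this observation."""
    row = Q.get(obs)
    if not row:
        return default            # unseen observation: keep monitoring (assumption)
    return max(row, key=row.get)


def detect(observations, Q_d, Q_s):
    """Scan (obs_d, obs_s) pairs; return (step, attack_type) at the first alarm."""
    for t, (obs_d, obs_s) in enumerate(observations):
        if greedy_action(Q_d, obs_d) == "a_s":
            return t, "direct"    # the direct-attack detector raised the alarm
        if greedy_action(Q_s, obs_s) == "a_s":
            return t, "concealed" # the concealed-attack detector raised the alarm
    return None, None             # both detectors chose a_c at every step
```

The returned attack type corresponds to the table whose action was $a_s$, which is how the claim distinguishes the two attack categories.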
10. The reinforcement learning-based smart grid false data injection attack detection method according to claim 1, wherein step three is realized by the following sub-steps:
(3.1) setting thresholds: according to the structure of the power grid, set the direct false data injection attack thresholds
Figure FDA00030506555600000313
Figure FDA00030506555600000314
and the concealed false data injection attack thresholds
Figure FDA00030506555600000315
(3.2) obtaining the measured values: obtain the measured value $y_t$ at time t from each meter, and recall the measured value $y_{t-1}$ at time t-1;
(3.3) estimating the measured value using Kalman filtering: obtain the state estimate at time t
Figure FDA00030506555600000316
using the least squares algorithm given by equations (9) and (10), and calculate the estimate of the measurement at time t
Figure FDA00030506555600000317
using formula (11),
Figure FDA00030506555600000318
Figure FDA00030506555600000319
Figure FDA00030506555600000320
where
Figure FDA00030506555600000321
is the variance matrix of the measurement deviation;
(3.4) calculating the squared norms of the deviations: using equations (12) and (13) respectively, calculate the squared norm of the deviation between the measured value and the estimate at time t
Figure FDA00030506555600000322
and the squared norm of the change between time t and time t-1
Figure FDA00030506555600000323
Figure FDA00030506555600000324
Figure FDA00030506555600000325
(3.5) obtaining the instantaneous observations using a threshold segmentation method: from
Figure FDA00030506555600000326
and
Figure FDA00030506555600000327
the instantaneous observations of the direct false data injection attack and the concealed false data injection attack,
Figure FDA00030506555600000328
and
Figure FDA00030506555600000329
are obtained according to formula (14),
Figure FDA0003050655560000041
because the threshold segmentation is the same for both attack types, the superscript i again stands for the superscripts n and s, so that i in formula (14) may be either n or s;
(3.6) obtaining the observations using a sliding window method: let the direct false data injection attack observation at time t-1
Figure FDA0003050655560000042
be
Figure FDA0003050655560000043
and the concealed false data injection attack observation
Figure FDA0003050655560000044
be
Figure FDA0003050655560000045
using the sliding window method, append the corresponding instantaneous observation at time t to the observation at time t-1 and then remove the oldest instantaneous observation, which gives the direct false data injection attack observation at time t
Figure FDA0003050655560000046
and the concealed false data injection attack observation at time t
Figure FDA0003050655560000047
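Sub-steps (3.5) and (3.6) can be illustrated with the following Python sketch: a squared residual norm is graded by the threshold interval it falls in, and the grades are kept in a fixed-length window that drops its oldest entry whenever a new one is appended. The number of thresholds, their values, the window length and the initial fill value are all assumptions made for the example, as are the names `instant_observation` and `SlidingObservation`.

```python
import bisect
from collections import deque


def instant_observation(value, thresholds):
    """Grade a squared residual norm by the threshold interval it falls in.

    thresholds must be sorted in ascending order; the result ranges from 0
    (below the first threshold) to len(thresholds) (at or above the last).
    """
    return bisect.bisect_right(thresholds, value)


class SlidingObservation:
    """Fixed-length window of instantaneous observations."""

    def __init__(self, window, fill=0):
        self._buf = deque([fill] * window, maxlen=window)

    def push(self, instant):
        """Append the newest grade; the deque discards the oldest one."""
        self._buf.append(instant)
        return tuple(self._buf)   # hashable, so it can index a Q table
```

One window would be maintained per attack type, and the tuple returned by `push` can serve directly as the key of the corresponding Q table.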
CN202110486653.5A 2021-05-01 2021-05-01 Smart power grid false data injection attack detection method based on reinforcement learning Active CN113268730B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110486653.5A CN113268730B (en) 2021-05-01 2021-05-01 Smart power grid false data injection attack detection method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN113268730A true CN113268730A (en) 2021-08-17
CN113268730B CN113268730B (en) 2023-07-25

Family

ID=77229967

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110486653.5A Active CN113268730B (en) 2021-05-01 2021-05-01 Smart power grid false data injection attack detection method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN113268730B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114629728A (en) * 2022-05-11 2022-06-14 深圳市永达电子信息股份有限公司 Network attack tracking method and device based on Kalman filtering
CN115134130A (en) * 2022-06-14 2022-09-30 浙江大学 DQN algorithm-based smart grid DoS attack detection method
CN115134130B (en) * 2022-06-14 2023-04-18 浙江大学 DQN algorithm-based smart grid DoS attack detection method

Also Published As

Publication number Publication date
CN113268730B (en) 2023-07-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant