CN111555297B - Unified time scale voltage control method with tri-state energy unit - Google Patents

Unified time scale voltage control method with tri-state energy unit

Info

Publication number
CN111555297B
Authority
CN
China
Prior art keywords
network
voltage
self
width
adaptive
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010434595.7A
Other languages
Chinese (zh)
Other versions
CN111555297A (en)
Inventor
殷林飞
陆悦江
苏志鹏
陈立春
高放
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi University
Original Assignee
Guangxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi University
Priority to CN202010434595.7A
Publication of CN111555297A
Application granted
Publication of CN111555297B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • H02J3/12 Circuit arrangements for ac mains or ac distribution networks for adjusting voltage in ac networks by changing a characteristic of the network load
    • H02J3/16 Circuit arrangements for ac mains or ac distribution networks for adjusting voltage in ac networks by changing a characteristic of the network load by adjustment of reactive power
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00 Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/30 Reactive power compensation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Power Engineering (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention provides a unified time scale voltage control method with tri-state energy units. It addresses the difficulty of coordinated control in conventional power system voltage control, which operates on multiple time scales in the form of tertiary, secondary and primary voltage control. The invention proposes a real-time voltage optimization and control framework that converts this three-level voltage control to a unified time scale, and the system is optimized through learning with an adaptive deep-width deterministic policy gradient method to obtain a voltage control policy for the tri-state energy units in the power system. The adaptive deep-width deterministic policy gradient method comprises three parts: a deep-width deterministic policy gradient, an adaptive mode and a deep-width neural network. The invention can replace multi-time-scale voltage optimization and control methods.

Description

Unified time scale voltage control method with tri-state energy unit
Technical Field
The invention belongs to the field of operation optimization and control of electric power systems. It relates to a voltage optimization and control method that replaces the traditional multi-time-scale approach, and is suitable for the operation optimization and control of electric power systems.
Background
With the application of renewable energy and distributed generation in smart grids, and according to the comprehensive smart grid research plan sponsored by the European Union, the power system of a future smart grid is decomposed into a plurality of cellular systems that together form a cellular network system. Many cells in such a cellular network system can be regarded as tri-state energy units. Each tri-state energy unit has three states: a reactive power emission state, a reactive power absorption state and a shutdown state. Traditional voltage optimization and control methods are difficult to adapt to future smart grids, which require real-time voltage optimization and regulation.
The traditional voltage optimization and control method is a three-level method operating on multiple time scales. Tertiary voltage control is the highest level: it takes the economic operation of the whole system as its optimization target, takes stability indices into account, and outputs the reference setpoint of the central bus voltage magnitude for secondary voltage control; its time scale is tens of minutes to several hours. Secondary voltage control is a regional control that keeps the central bus voltage within a preset range by changing the reference setpoints of the primary voltage controllers according to a predetermined control law; its time scale is tens of seconds to several minutes. Primary voltage control is a local control whose control devices compensate for rapid random voltage changes by keeping the output variable as close as possible to the setpoint; its time scale is a few seconds. In the conventional method, a voltage regulation instruction must be passed down level by level before the control devices under primary voltage control perform the corresponding voltage regulation action.
Conventional voltage optimization and control methods therefore have two drawbacks. First, because passing the voltage regulation instruction down level by level takes time, the time scale of voltage optimization and control for the whole power system is long. Second, because the three levels operate on different time scales while the voltage fluctuates in real time, the voltage regulation instructions that the three levels issue from the automatically measured voltage values may be inconsistent; such uncoordinated instructions prevent the voltage regulation equipment from optimizing and controlling the voltage effectively, and voltage imbalance can result.
In summary, a modern power system needs a voltage control method with a unified time scale that coordinates control of the whole system: a controller with a unified time scale directly outputs a voltage regulation instruction to each tri-state energy unit, so that the system voltage is optimized and controlled in real time.
Disclosure of Invention
The invention provides a unified time scale voltage control method with tri-state energy units. Unlike the traditional voltage optimization and control framework, the method regulates the tri-state energy units of every cellular system in a unified manner through a real-time voltage optimizer and controller with a unified time scale, which solves the coordination difficulty caused by multiple time scales in the traditional framework. The real-time voltage optimizer and controller, designed on the basis of the adaptive deep-width deterministic policy gradient method, takes the voltage deviation ΔV and the allowable deviation as input and the voltage regulation instructions of the tri-state energy units as output, and requires no other optimization or control instructions.
The adaptive deep-width deterministic policy gradient method designed in the unified time scale voltage control method with tri-state energy units mainly comprises three parts: a deep-width deterministic policy gradient, an adaptive mode and a deep-width neural network. The deep-width deterministic policy gradient part mainly comprises an evaluation network (the critic) and an actuator network (the actor). The evaluation network parameterizes the action value function, and the actuator network guides the update of the policy function parameters according to the value obtained by the evaluation network. A valuation network and a target network are then added to each of the actuator network and the evaluation network, forming four network structures: an actuator valuation network, an evaluation valuation network, an actuator target network and an evaluation target network. In the adaptive mode, the method uses deep-width neural networks to approximate these four network structures, so that the deep-width neural networks fit the four structures and perform their respective roles. A deep-width neural network can rapidly learn the multiple-input, multiple-output relationships in the training data and thus efficiently predict outputs from input data.
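The four-network structure described above can be sketched as follows. This is only an illustration: `init_network` is a hypothetical single-layer stand-in for a deep-width neural network, the dimensions are arbitrary, and initializing each target network as a copy of its valuation network is an assumption borrowed from common practice, not something the text states.

```python
import copy
import numpy as np

def init_network(in_dim, out_dim, rng):
    """Hypothetical single-layer stand-in for a deep-width neural network."""
    return {"W": rng.normal(scale=0.1, size=(out_dim, in_dim)),
            "b": np.zeros(out_dim)}

def build_four_networks(state_dim, action_dim, seed=0):
    """Create the actuator/evaluation valuation networks and their target networks."""
    rng = np.random.default_rng(seed)
    actuator_valuation = init_network(state_dim, action_dim, rng)
    # The evaluation network scores a (state, action) pair with a single Q value.
    evaluation_valuation = init_network(state_dim + action_dim, 1, rng)
    return {
        "actuator_valuation": actuator_valuation,
        "evaluation_valuation": evaluation_valuation,
        # Assumption: each target network starts as an independent copy.
        "actuator_target": copy.deepcopy(actuator_valuation),
        "evaluation_target": copy.deepcopy(evaluation_valuation),
    }

nets = build_four_networks(state_dim=2, action_dim=3)
```

Keeping the target networks as separate copies means that later updates to the valuation networks do not change the targets until an explicit update is applied.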
In the adaptive deep-width deterministic policy gradient method, the deterministic policy specifies the action value $a_t$ determined in state $s_t$, and the action value function describes the expected return after taking $a_t$ in state $s_t$. The deterministic policy $a = \pi(s \mid \theta^{\mu})$ and the action value function $Q(s, a \mid \theta^{Q})$ are each represented by a deep-width neural network, with parameters $\theta^{\mu}$ and $\theta^{Q}$ respectively.
For the evaluation network, a function parameterized by $\theta^{Q}$ approximates the evaluation network, and its parameters are updated by minimizing the loss.
The loss to be minimized is
[Equation (1): formula image BDA0002501763180000021 is not reproduced in this text]
where $s_t$, $a_t$ and $r_t$ are the state, action value and reward value, respectively, and $y_t$ can be expressed as
$y_t = r(s_t, a_t) + \gamma Q(s_{t+1}, \pi(s_{t+1} \mid \theta^{\mu}) \mid \theta^{Q})$ (2)
where $r(s_t, a_t)$ is the reward obtained by taking action value $a_t$ in state $s_t$, and $\gamma$ is the discount coefficient.
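As a numerical illustration of equation (2), the sketch below computes the target value $y_t$ and a loss over a small batch. The loss formula itself is not legible in this text, so the mean squared error between $Q(s_t, a_t)$ and $y_t$ used here is an assumption taken from the usual deterministic policy gradient formulation; the function names and sample numbers are hypothetical.

```python
import numpy as np

def td_target(reward, next_q, gamma=0.99):
    """Equation (2): y_t = r(s_t, a_t) + gamma * Q(s_{t+1}, pi(s_{t+1}))."""
    return np.asarray(reward, dtype=float) + gamma * np.asarray(next_q, dtype=float)

def critic_loss(q_values, targets):
    """Assumed loss: mean squared difference between Q(s_t, a_t) and y_t."""
    diff = np.asarray(q_values, dtype=float) - np.asarray(targets, dtype=float)
    return float(np.mean(diff ** 2))

# Two sampled transitions with made-up rewards and next-state Q values.
y = td_target(reward=[1.0, 0.0], next_q=[2.0, 4.0], gamma=0.5)
loss = critic_loss(q_values=[1.0, 2.0], targets=y)
```

With these numbers both targets equal 2.0, so only the first transition contributes to the loss.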
The valuation network of the evaluation network is updated by gradient descent, which is expressed as
[Equation (3): formula image BDA0002501763180000022 is not reproduced in this text]
where $\pi(s \mid \theta^{\mu})$ is the deterministic policy.
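A gradient-descent update of this kind can be illustrated with a deliberately simple model. The sketch assumes a linear evaluation network $Q = \phi^{\top} w$ over hypothetical features $\phi$ of the state and action, and a mean squared loss against the targets $y_t$; neither the linear form nor the learning rate comes from the text.

```python
import numpy as np

def critic_gradient_step(w, features, targets, lr=0.1):
    """One gradient-descent step on the mean squared loss of a linear critic."""
    q = features @ w                                   # current Q estimates
    grad = 2.0 * features.T @ (q - targets) / len(targets)
    return w - lr * grad

features = np.array([[1.0, 0.0], [0.0, 1.0]])  # hypothetical (state, action) features
targets = np.array([1.0, -1.0])                # target values y_t
w = np.zeros(2)
for _ in range(200):
    w = critic_gradient_step(w, features, targets)
```

Because the features here form an identity matrix, the weights converge directly to the target values, which makes the effect of repeated descent steps easy to check.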
For the actuator network, the valuation network of the actuator network updates the current policy by mapping a state to a specified action through an action function. The return of a state is defined as the total discounted future reward, which is expressed as
[Equation (4): formula image BDA0002501763180000031 is not reproduced in this text]
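The total discounted future reward can be computed as in the following sketch. It assumes the common form in which the reward received k steps ahead is weighted by $\gamma^{k}$; the exact expression in equation (4) is not legible in this text, and the function name is hypothetical.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of future rewards, with the reward k steps ahead weighted by gamma**k."""
    total = 0.0
    # Accumulate from the last reward backwards: R_t = r_t + gamma * R_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total

ret = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25
```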
The objective function is optimized end to end using Silver's deterministic policy gradient method, so that the actuator network parameters are updated in the direction that maximizes the total reward. This policy gradient is expressed as
[Equation (5): formula image BDA0002501763180000032 is not reproduced in this text]
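The idea of moving the actuator parameters toward higher total reward can be illustrated with the chain rule used in deterministic policy gradient methods, where the gradient of Q with respect to the action is multiplied by the gradient of the action with respect to the policy parameters. The sketch assumes a linear actuator with a scalar action, $a = s^{\top}\theta$, and takes the values of $\partial Q / \partial a$ as given; all names and numbers are hypothetical.

```python
import numpy as np

def actor_gradient(states, dq_da):
    """Batch mean of dQ/da * da/dtheta for a linear actor a = states @ theta."""
    # For a linear actor, da/dtheta is simply the state vector.
    return np.mean(dq_da[:, None] * states, axis=0)

def actor_ascent_step(theta, states, dq_da, lr=0.1):
    """Gradient ascent: move theta in the direction that increases Q."""
    return theta + lr * actor_gradient(states, dq_da)

states = np.array([[1.0, 0.0], [0.0, 2.0]])
dq_da = np.array([0.5, -1.0])  # hypothetical dQ/da values from the evaluation network
theta = actor_ascent_step(np.zeros(2), states, dq_da)
```

Note the sign: the evaluation network descends on its loss, whereas the actuator ascends on the estimated value.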
The target network parameters in the actuator-evaluation dual-network structure are updated using experience replay: each four-tuple $(s_t, a_t, r_t, s_{t+1})$ is stored in the experience replay pool and, at a specified time interval, data drawn from the pool are used to perform a soft update of the target network parameters. The soft update process is
$\theta^{Q'} \leftarrow \tau\theta^{Q} + (1-\tau)\theta^{Q'}$ (6)
$\theta^{\mu'} \leftarrow \tau\theta^{\mu} + (1-\tau)\theta^{\mu'}$ (7)
where $\tau$ is the update parameter.
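The soft update in equations (6) and (7) is the same operation applied to two parameter sets, so a single helper can serve both. The sketch below stores parameters in dictionaries of NumPy arrays, which is an illustrative choice; the value of `tau` is arbitrary.

```python
import numpy as np

def soft_update(target_params, valuation_params, tau=0.01):
    """theta' <- tau * theta + (1 - tau) * theta' for every parameter array."""
    return {name: tau * valuation_params[name] + (1.0 - tau) * target_params[name]
            for name in target_params}

target = {"W": np.zeros(2)}
valuation = {"W": np.ones(2)}
target = soft_update(target, valuation, tau=0.1)
```

A small `tau` makes the target networks trail the valuation networks slowly, which is the usual reason for preferring a soft update to a hard copy.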
To add randomness and broaden the coverage of learning during the learning process, random noise $N$ is added to the selected action, so the action that finally interacts with the environment is
$a_t = \pi(s_t \mid \theta^{\mu}) + N$ (8)
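Equation (8) can be sketched as below. The text does not say how the noise $N$ is distributed, so zero-mean Gaussian noise with standard deviation `sigma` is an assumption made only for this example.

```python
import numpy as np

def explore(policy_action, sigma, rng):
    """Equation (8): a_t = pi(s_t) + N, with N assumed to be zero-mean Gaussian."""
    policy_action = np.asarray(policy_action, dtype=float)
    return policy_action + rng.normal(0.0, sigma, size=policy_action.shape)

rng = np.random.default_rng(0)
a = explore([0.2, -0.4], sigma=0.1, rng=rng)         # perturbed action
a_greedy = explore([0.2, -0.4], sigma=0.0, rng=rng)  # zero noise: policy action unchanged
```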
Drawings
FIG. 1 is a schematic structural diagram of the adaptive deep-width deterministic policy gradient method of the present invention.
FIG. 2 is a schematic diagram of the reactive power absorption curve of the tri-state energy unit in the method of the invention.
FIG. 3 is a schematic diagram of a real-time voltage optimization and control framework of the method of the present invention.
Detailed Description
The invention provides a unified time scale voltage control method with a tri-state energy unit, which is described in detail in the following steps in combination with the attached drawings:
FIG. 1 is a schematic structural diagram of the adaptive deep-width deterministic policy gradient method of the present invention. The method comprises three parts: a deep-width deterministic policy gradient, an adaptive mode and a deep-width neural network. The deep-width deterministic policy gradient part mainly comprises an evaluation network and an actuator network; a valuation network and a target network are added to each of them, forming four network structures: the actuator valuation network, the evaluation valuation network, the actuator target network and the evaluation target network, each of which is fitted by a deep-width neural network in the adaptive mode.
The components have the following roles. The actuator valuation network is responsible for updating the policy network parameters $\theta^{\mu}$ and for selecting the current action $a_t$ according to the current state $s_t$; this action interacts with the environment to generate the next state $s_{t+1}$ and the reward $r$. The actuator target network is responsible for selecting the optimal next action $a_{t+1}$ according to the next state $s_{t+1}$ sampled from the experience replay pool, and periodically updates its network parameters from $\theta^{\mu}$. The evaluation valuation network is responsible for updating the value network parameters $\theta^{Q}$ and for calculating the current Q value $Q(s_t, a_t \mid \theta^{Q})$ and $y_t$ in the target Q value. The evaluation target network is responsible for calculating the $Q(s_{t+1}, \pi(s_{t+1} \mid \theta^{\mu}) \mid \theta^{Q'})$ part of the target Q value, and periodically updates its network parameters from $\theta^{Q}$. The experience replay pool is responsible for storing the current state $s_t$, the current action $a_t$, the current reward $r_t$ and the next state $s_{t+1}$, and for providing samples to the target networks. The loss function is responsible for collecting the Q values provided by the evaluation network and calculating the loss value. The gradient descent is responsible for updating the evaluation network according to the loss value. The gradient calculation is responsible for calculating gradient values from the values provided by the evaluation valuation network and passing them to the policy gradient. The policy gradient updates its value according to the gradient values provided by the gradient calculation, and updates the actuator valuation network.
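The experience replay pool described above can be sketched as a bounded buffer of four-tuples with random sampling. The class name, the capacity and the uniform sampling rule are assumptions made for illustration; the text only states what the pool stores and that it supplies samples.

```python
import random
from collections import deque

class ExperienceReplayPool:
    """Stores (s_t, a_t, r_t, s_{t+1}) four-tuples and returns random mini-batches."""

    def __init__(self, capacity, seed=None):
        self._buffer = deque(maxlen=capacity)  # oldest entries are dropped when full
        self._rng = random.Random(seed)

    def store(self, state, action, reward, next_state):
        self._buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Assumption: uniform sampling without replacement.
        k = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), k)

    def __len__(self):
        return len(self._buffer)

pool = ExperienceReplayPool(capacity=3, seed=0)
for t in range(5):
    pool.store(t, 0.0, 1.0, t + 1)
batch = pool.sample(2)
```

After five stores into a pool of capacity three, only the three most recent transitions remain available for sampling.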
The designed adaptive deep-width deterministic policy gradient method is used to predict the voltage regulation instructions of the system. The voltage deviation ΔV and the allowable deviation of the current state are taken as input, the deep-width neural network performs the prediction, and real-time voltage regulation instructions for the tri-state energy units in each cellular system are output.
FIG. 2 is a schematic diagram of the reactive power absorption curve of the tri-state energy unit in the method of the invention. In a cellular system, each tri-state energy unit has three states, namely a reactive power absorption state, a reactive power emission state and a shutdown state, and is controlled through voltage regulation instructions. When the reactive power value at a given moment is positive, the unit is in the reactive power absorption state: the reactive power of the cellular system decreases and the voltage falls. When the value is negative, the unit is in the reactive power emission state: the reactive power of the cellular system increases and the voltage rises. When the value remains 0 throughout a time period, the unit is in the shutdown state and the voltage of the cellular system does not change.
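The sign convention described above can be written as a small classifier. This is a simplified sketch: it classifies a single instantaneous reactive power value, whereas the text defines shutdown as a value that stays at 0 over a time period, and the tolerance `tol` is a hypothetical parameter added to avoid exact floating-point comparison.

```python
from enum import Enum

class UnitState(Enum):
    ABSORBING = "reactive power absorption"
    EMITTING = "reactive power emission"
    SHUTDOWN = "shutdown"

def unit_state(reactive_value, tol=1e-9):
    """Map the sign of a unit's reactive power value to one of its three states."""
    if reactive_value > tol:
        return UnitState.ABSORBING   # positive: absorbs reactive power, voltage falls
    if reactive_value < -tol:
        return UnitState.EMITTING    # negative: emits reactive power, voltage rises
    return UnitState.SHUTDOWN        # zero: no effect on the cellular system voltage

states = [unit_state(q) for q in (0.3, -0.2, 0.0)]
```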
FIG. 3 is a schematic diagram of the real-time voltage optimization and control framework of the method of the present invention. The real-time voltage optimizer and controller adopts a unified time scale framework and is a multi-output control device. The controller takes the voltage deviation ΔV and the allowable deviation provided by the interconnected power grid as input, trains the deep-width neural network, obtains the voltage regulation instruction of the tri-state energy units in each cellular system from the prediction of the adaptive deep-width deterministic policy gradient method, and delivers these voltage regulation instructions as its output.
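The controller's input and output interface can be sketched as follows. Everything beyond the interface is hypothetical: `standin_policy` is a proportional rule with a dead band that merely stands in for the trained network, the sign convention (positive ΔV means overvoltage, answered by a positive, absorbing command) is an assumption, and one command per monitoring point is assumed for simplicity.

```python
import numpy as np

def standin_policy(state, gain=1.0):
    """Hypothetical placeholder for the trained network, not the patented method."""
    n = state.size // 2
    delta_v, allowable = state[:n], state[n:]
    command = gain * delta_v                        # proportional response
    command[np.abs(delta_v) <= allowable] = 0.0     # inside the allowable band: no action
    return command

def control_step(policy, delta_v, allowable_deviation):
    """Build the controller input and return one regulation command per unit."""
    state = np.concatenate([np.asarray(delta_v, dtype=float),
                            np.asarray(allowable_deviation, dtype=float)])
    return policy(state)

commands = control_step(standin_policy,
                        delta_v=[0.03, -0.04, 0.005],
                        allowable_deviation=[0.01, 0.01, 0.01])
```

The point of the sketch is the single-step mapping from measured deviations to per-unit commands, with no intermediate control levels in between.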
Calculation is performed with the adaptive deep-width deterministic policy gradient method after off-line training, so that voltage regulation instructions can be formulated rapidly from the voltage deviation ΔV and the allowable deviation at each monitoring point in the power system. This accelerates the response of the tri-state energy units in each cellular system and allows the power system to achieve real-time reactive power balance and voltage stability under the unified time scale framework. With the method of the invention, the voltage deviation $\Delta V_{i,(t+1)}$ at time $t+1$ is predicted by the deep-width neural network from the voltage deviation $\Delta V_{i,t}$ at time $t$, and the control period of the real-time voltage optimizer and controller can reach the order of seconds.

Claims (1)

1. A unified time scale voltage control method with tri-state energy units, characterized in that a real-time voltage optimization and control framework with a unified time scale can be used to perform real-time voltage optimization and control of an electric power system; the method comprises the following steps in use:
(1) dividing a target power system into a plurality of cellular systems, wherein each cellular system comprises a plurality of tri-state energy units;
(2) establishing a voltage optimization and control model with a unified time scale that can replace the traditional combined method;
the voltage optimization and control model with a unified time scale processes the input provided by the interconnected power grid through the real-time voltage optimizer and controller, the input being the voltage deviation ΔV and the allowable deviation; an optimal voltage regulation policy is predicted using the adaptive deep-width deterministic policy gradient method, and voltage regulation instructions are output directly and in real time to the tri-state energy units in each cellular system, thereby solving the problem that voltage is difficult to coordinate and control because of the multiple time scales in the traditional combined method;
(3) storing the values of the input and output variables of a plurality of groups of the traditional combined method of tertiary, secondary and primary voltage control;
(4) extracting the data obtained in step (3) according to the model in step (2), and training on the extracted data using the adaptive deep-width deterministic policy gradient method;
the adaptive deep-width deterministic policy gradient method can output real-time voltage regulation instructions for a plurality of tri-state energy units; the adaptive deep-width deterministic policy gradient method comprises three parts: a deep-width deterministic policy gradient, an adaptive mode and a deep-width neural network, wherein the deep-width deterministic policy gradient part consists of an evaluation network and an actuator network, and a valuation network and a target network are added to each of the actuator network and the evaluation network to form four network structures: an actuator valuation network, an evaluation valuation network, an actuator target network and an evaluation target network; the adaptive deep-width deterministic policy gradient method uses a deep-width neural network to fit the actuator valuation network, the evaluation valuation network, the actuator target network and the evaluation target network in the adaptive mode, so that the method is suitable for learning and prediction of a continuous action process, and combines the adaptive mode part, the deep-width neural network part and the deep-width deterministic policy gradient part to form the adaptive deep-width deterministic policy gradient method;
(5) performing calculation with the adaptive deep-width deterministic policy gradient method using real-time data to obtain the real-time voltage regulation instructions of each tri-state energy unit of the power system.
CN202010434595.7A 2020-05-21 2020-05-21 Unified time scale voltage control method with tri-state energy unit Active CN111555297B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010434595.7A CN111555297B (en) 2020-05-21 2020-05-21 Unified time scale voltage control method with tri-state energy unit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010434595.7A CN111555297B (en) 2020-05-21 2020-05-21 Unified time scale voltage control method with tri-state energy unit

Publications (2)

Publication Number Publication Date
CN111555297A CN111555297A (en) 2020-08-18
CN111555297B true CN111555297B (en) 2022-04-29

Family

ID=72006541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010434595.7A Active CN111555297B (en) 2020-05-21 2020-05-21 Unified time scale voltage control method with tri-state energy unit

Country Status (1)

Country Link
CN (1) CN111555297B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112564118B (en) * 2020-11-23 2022-03-18 广西大学 Distributed real-time voltage control method capable of expanding quantum deep width learning
CN112906289B (en) * 2021-01-15 2023-04-18 广西大学 Method for coordinating optimization of parameters of power system stabilizer and secondary voltage controller
CN112883547B (en) * 2021-01-15 2023-03-28 广西大学 Virtual cellular cooperative reactive auxiliary service optimization method for comprehensive energy system
CN113255893B (en) * 2021-06-01 2022-07-05 北京理工大学 Self-evolution generation method of multi-agent action strategy

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110429652A (en) * 2019-08-28 2019-11-08 广西大学 A kind of intelligent power generation control method for expanding the adaptive Dynamic Programming of deep width
CN110518580A (en) * 2019-08-15 2019-11-29 上海电力大学 A kind of active distribution network running optimizatin method for considering microgrid and actively optimizing

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104993522B (en) * 2015-06-30 2018-01-19 中国电力科学研究院 A kind of active distribution network Multiple Time Scales coordination optimization dispatching method based on MPC
CN105226664B (en) * 2015-10-14 2018-12-18 中国电力科学研究院 A kind of active distribution network reactive voltage layer distributed control method for coordinating
CN109193641A (en) * 2018-10-11 2019-01-11 广西大学 A kind of tri-state energy control method based on automatic expansion deep learning
CN110535146B (en) * 2019-08-27 2022-09-23 哈尔滨工业大学 Electric power system reactive power optimization method based on depth determination strategy gradient reinforcement learning
CN110970903A (en) * 2019-12-27 2020-04-07 山东大学 Voltage coordination control optimization method and system applied to active power distribution network

Also Published As

Publication number Publication date
CN111555297A (en) 2020-08-18

Similar Documents

Publication Publication Date Title
CN111555297B (en) Unified time scale voltage control method with tri-state energy unit
CN112615379B (en) Power grid multi-section power control method based on distributed multi-agent reinforcement learning
Li et al. Coordinated load frequency control of multi-area integrated energy system using multi-agent deep reinforcement learning
CN113363997B (en) Reactive voltage control method based on multi-time scale and multi-agent deep reinforcement learning
CN110365057B (en) Distributed energy participation power distribution network peak regulation scheduling optimization method based on reinforcement learning
CN111934335A (en) Cluster electric vehicle charging behavior optimization method based on deep reinforcement learning
CN114217524B (en) Power grid real-time self-adaptive decision-making method based on deep reinforcement learning
CN114139354B (en) Electric power system simulation scheduling method and system based on reinforcement learning
WO2023070293A1 (en) Long-term scheduling method for industrial byproduct gas system
CN113363998A (en) Power distribution network voltage control method based on multi-agent deep reinforcement learning
CN115313403A (en) Real-time voltage regulation and control method based on deep reinforcement learning algorithm
CN117039981A (en) Large-scale power grid optimal scheduling method, device and storage medium for new energy
CN114722693A (en) Optimization method of two-type fuzzy control parameter of water turbine regulating system
CN114566971A (en) Real-time optimal power flow calculation method based on near-end strategy optimization algorithm
CN112787331B (en) Deep reinforcement learning-based automatic power flow convergence adjusting method and system
CN117172097A (en) Power distribution network dispatching operation method based on cloud edge cooperation and multi-agent deep learning
CN115133540B (en) Model-free real-time voltage control method for power distribution network
CN114039366B (en) Power grid secondary frequency modulation control method and device based on peacock optimization algorithm
CN113725863A (en) Power grid autonomous control and decision method and system based on artificial intelligence
Ahiakwo et al. Application of Neuro-Swarm Intelligence Technique ToLoad Flow Analysis
CN113837654B (en) Multi-objective-oriented smart grid hierarchical scheduling method
Cao et al. Optimal control with deep reinforcement learning for shunt compensations to enhance voltage stability
CN112398142B (en) Power grid frequency intelligent control method based on empirical mode decomposition
Shi et al. Deep Reinforcement Learning-based Data-Driven Active Power Dispatching for Smart City Grid
Lu et al. Optimal Design of Energy Storage System Assisted AGC Frequency Regulation Based on DDPG Algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant