CN117039981A - Large-scale power grid optimal scheduling method, device and storage medium for new energy - Google Patents

Publication number
CN117039981A
Authority
CN
China
Prior art keywords
new energy
power
power grid
unit
scheduling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310785632.2A
Other languages
Chinese (zh)
Inventor
郝毅
崇志强
徐娜
马世乾
史亚坤
穆朝絮
陈建
商敬安
李振斌
陈亮
于光耀
陈培育
张�杰
黄志刚
黄家凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd
Original Assignee
Tianjin University
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University, State Grid Corp of China SGCC, State Grid Tianjin Electric Power Co Ltd, and Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd
Priority to CN202310785632.2A
Publication of CN117039981A

Classifications

    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00: Circuit arrangements for ac mains or ac distribution networks
    • H02J3/38: Arrangements for parallely feeding a single network by two or more generators, converters or transformers
    • H02J3/381: Dispersed generators
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00: Computer-aided design [CAD]
    • G06F30/20: Design optimisation, verification or simulation
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00: Circuit arrangements for ac mains or ac distribution networks
    • H02J3/003: Load forecast, e.g. methods or systems for forecasting future load demand
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00: Circuit arrangements for ac mains or ac distribution networks
    • H02J3/004: Generation forecast, e.g. methods or systems for forecasting future energy generation
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00: Circuit arrangements for ac mains or ac distribution networks
    • H02J3/04: Circuit arrangements for ac mains or ac distribution networks for connecting networks of the same frequency but supplied from different sources
    • H02J3/06: Controlling transfer of power between connected networks; Controlling sharing of load between connected networks
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00: Circuit arrangements for ac mains or ac distribution networks
    • H02J3/38: Arrangements for parallely feeding a single network by two or more generators, converters or transformers
    • H02J3/46: Controlling of the sharing of output between the generators, converters, or transformers
    • H02J3/466: Scheduling the operation of the generators, e.g. connecting or disconnecting generators to meet a given demand
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00: Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/20: Simulating, e.g. planning, reliability check, modelling or computer assisted design [CAD]
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2300/00: Systems for supplying or distributing electric power characterised by decentralized, dispersed, or local generation
    • H02J2300/20: The dispersed energy generation being of renewable origin
    • H02J2300/22: The renewable source being solar energy
    • H02J2300/24: The renewable source being solar energy of photovoltaic origin
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2300/00: Systems for supplying or distributing electric power characterised by decentralized, dispersed, or local generation
    • H02J2300/20: The dispersed energy generation being of renewable origin
    • H02J2300/28: The renewable source being wind energy
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2300/00: Systems for supplying or distributing electric power characterised by decentralized, dispersed, or local generation
    • H02J2300/40: Systems for supplying or distributing electric power characterised by decentralized, dispersed, or local generation wherein a plurality of decentralised, dispersed or local energy generation technologies are operated simultaneously

Landscapes

  • Engineering & Computer Science (AREA)
  • Power Engineering (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a large-scale power grid optimal scheduling method, device and storage medium for power grids containing new energy. The method comprises the following steps: step 1, constructing a large-scale power grid model containing new energy nodes, predicting new energy generation and load demand based on historical operation data, and constructing a wind power, photovoltaic and load prediction data set at the time scale required by day-ahead scheduling; step 2, improving the standard nodes and converting the optimal scheduling problem into a Markov decision process; step 3, constructing an agent based on a neural network with adaptive parameters, and establishing an optimal scheduling framework based on the TD3 algorithm; step 4, training and testing the agent with the TD3-based optimal scheduling framework established in step 3 to obtain a unit output plan that aims at meeting new energy consumption. On the premise of stable grid operation, the invention can effectively increase new energy generation, reduce the operation of thermal power units, improve the new energy utilization rate and save generation cost.

Description

Large-scale power grid optimal scheduling method, device and storage medium for new energy
Technical Field
The invention belongs to the technical field of stable operation and new energy consumption in large-scale power grid systems containing new energy. It relates to a large-scale power grid optimal scheduling method, device and storage medium, and in particular to a large-scale power grid optimal scheduling method, device and storage medium for power grids containing new energy.
Background
As power systems continue to grow in scale, their dynamic characteristics become increasingly complex. At the same time, as large amounts of new energy are connected to the grid, the complexity of the grid and the difficulty of controlling it also gradually increase. Because new energy is random, volatile and uncontrollable, higher demands are placed on the safety, controllability, flexibility and efficiency of power grids containing new energy. The traditional optimal scheduling method designs an objective function and constraint conditions according to the relevant technical requirements and establishes an optimal scheduling model on that basis.
An optimal scheme is then sought based on the established optimal scheduling model. The traditional optimal scheduling scheme depends on manual experience, and its generalization capability is limited when solving complex problems. Deep reinforcement learning is an effective artificial intelligence technique that has offered a new perspective in many fields. It has seen some development in intelligent fault diagnosis and state evaluation of power transmission and transformation equipment, but it has not yet been applied effectively to the optimal dispatch of large power grids containing new energy.
Therefore, it is necessary to provide an optimal scheduling method based on deep reinforcement learning to solve the scheduling problem of large power grids containing new energy and the problem of new energy consumption.
A search found no prior art publication that is the same as or similar to the present invention.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a large-scale power grid optimal scheduling method, device and storage medium for power grids containing new energy, which can obtain a unit output plan that aims at meeting new energy consumption, effectively increase new energy generation on the premise of stable grid operation, reduce the operation of thermal power units, improve the new energy utilization rate and save generation cost.
The invention solves the practical problem by adopting the following technical scheme:
A large-scale power grid optimal scheduling method for power grids containing new energy comprises the following steps:
Step 1, constructing a large-scale power grid model containing new energy nodes, predicting new energy generation and load demand based on historical operation data, and constructing a wind power, photovoltaic and load prediction data set at the time scale required by day-ahead scheduling;
Step 2, based on the wind power, photovoltaic and load prediction data set constructed in step 1, improving the standard nodes and converting the optimal scheduling problem into a Markov decision process;
Step 3, according to the historical operation data characteristics of the power grid and based on the Markov decision process generated in step 2, constructing an agent based on a neural network with adaptive parameters, and establishing an optimal scheduling framework based on the twin delayed deep deterministic policy gradient (TD3) algorithm;
Step 4, based on the large-scale power grid model containing new energy nodes constructed in step 1 and the TD3-based optimal scheduling framework constructed in step 3, completing the training and testing of the agent to obtain a unit output plan that aims at meeting new energy consumption.
Moreover, the specific steps of step 1 include:
(1) Analyzing the IEEE1888 nodes; extracting key data that reflect the basic characteristics of power grid elements, the power grid connection relationships, and the production and operation management characteristics of the power system; studying the regulation and response characteristics of different types of energy nodes in the grid-connected state; improving the IEEE1888 standard nodes; and performing power flow testing on the improved IEEE1888 node model to complete the construction of the large power grid model containing new energy nodes;
(2) Based on the large power grid model containing new energy nodes constructed in step (1), predicting new energy generation with an LSTM neural network, which includes preprocessing the historical operation data and establishing the LSTM neural network to predict the generation of the new energy units and the load demand, thereby completing the construction of the wind power, photovoltaic and load prediction data set at the time scale required by day-ahead scheduling.
Furthermore, in step 1 (2), the objective function of the LSTM neural network established by the invention is set as follows:
One year of wind power data is selected, of which 80% is used as training data and 20% as test data, and the loss function is optimized during training, so that the wind power, photovoltaic and load prediction data set at the time scale required by day-ahead scheduling is constructed.
Moreover, the specific steps of step 2 include:
(1) Analyzing the wind power generation and photovoltaic power generation network models, and constructing an objective function to meet the day-ahead scheduling target;
Total cost of system operation:
wherein:
where $F$ is the total operating cost of the power system; $T$ is the number of scheduling periods; $F_{1,t}$ is the operating cost of the conventional units in period $t$; $F_{2,t}$ is the active power loss of the system in period $t$; and $F_{3,t}$ is the amount of curtailed wind and photovoltaic power in period $t$. $n$ is the number of thermal power units; $P_{i,h,t}$ is the active output of the $i$-th thermal power unit; $a_i$, $b_i$, $c_i$ are the coefficients of the first-order, second-order and constant terms of the cost function; and $u_{i,t}$ indicates the on/off state of the thermal power unit. $P_{k,loss,t}$ is the active loss generated in period $t$, and $m$ is the number of branches in the power system. $r$ is the number of renewable energy units; $P_{d,t,w}$ is the output of the wind turbine in period $t$; $P_{d,t,w,max}$ is the upper limit of the wind turbine output in period $t$; $P_{d,t,pv}$ is the photovoltaic output in period $t$; and $P_{d,t,pvmax}$ is the upper limit of the photovoltaic output in period $t$.
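The objective formulas themselves are not reproduced in this text. One form that is consistent with the symbol definitions above is sketched here purely as an assumed reconstruction: the summation structure, the pairing of $a_i$, $b_i$, $c_i$ with the quadratic, linear and constant terms, and the absence of any weighting factors between the three cost components are all assumptions, not statements from the patent.

```latex
\min F = \sum_{t=1}^{T} \left( F_{1,t} + F_{2,t} + F_{3,t} \right)

F_{1,t} = \sum_{i=1}^{n} u_{i,t} \left( a_i P_{i,h,t}^{2} + b_i P_{i,h,t} + c_i \right)

F_{2,t} = \sum_{k=1}^{m} P_{k,loss,t}

F_{3,t} = \sum_{d=1}^{r} \left[ \left( P_{d,t,w,max} - P_{d,t,w} \right)
        + \left( P_{d,t,pvmax} - P_{d,t,pv} \right) \right]
```

Read this way, $F_{3,t}$ measures the gap between available and dispatched renewable output, so minimizing $F$ pushes the schedule toward higher new energy consumption.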
(2) Setting constraint conditions, including the power balance constraint, power flow balance constraint, unit output constraint, unit ramp constraint, and the output limits of wind power generation and photovoltaic power generation:
where $P_{i,t}$ and $Q_{i,t}$ are the active and reactive power injected at node $i$, $U_i$ is the voltage magnitude of node $i$, $\theta_{ij}$ is the voltage phase angle difference between node $i$ and node $j$, and $G_{ij}$ and $B_{ij}$ are the nodal conductance and susceptance, respectively.
Unit output constraint and ramp constraint:
where $r_{i,up}$ is the upward ramp rate of the $i$-th thermal power unit and $r_{i,down}$ is its downward ramp rate.
Wind turbine and photovoltaic output constraints:
where $P_{d,t,wmax}$ and $P_{d,t,pvmax}$ are the upper output limits of the wind turbine and the photovoltaic unit, respectively.
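As a minimal sketch of how the unit output, ramp and renewable output limits could be enforced in code, the functions below project a requested set point onto a feasible range. The constraint forms are assumptions, since the formulas are not reproduced in this text: thermal output is taken to lie between a minimum and maximum, its change between consecutive periods is bounded by the downward and upward ramp rates, and wind or photovoltaic output lies between zero and its upper limit. All function and parameter names are hypothetical.

```python
def clip_thermal_output(p_prev, p_target, p_min, p_max, r_up, r_down):
    """Project a requested thermal output onto the assumed feasible range.

    Assumed constraints (not taken verbatim from the patent):
      p_min <= P_t <= p_max
      -r_down <= P_t - P_{t-1} <= r_up
    """
    lower = max(p_min, p_prev - r_down)   # cannot ramp down faster than r_down
    upper = min(p_max, p_prev + r_up)     # cannot ramp up faster than r_up
    return min(max(p_target, lower), upper)


def clip_renewable_output(p_target, p_available):
    """Keep wind or photovoltaic output within [0, upper output limit]."""
    return min(max(p_target, 0.0), p_available)


if __name__ == "__main__":
    # Previous output 100, unit limits [50, 200], ramp up 30, ramp down 20.
    print(clip_thermal_output(100.0, 150.0, 50.0, 200.0, 30.0, 20.0))
    print(clip_renewable_output(120.0, 100.0))
```

Clipping an action after the fact is only one option; a penalty term in the reward is another common choice, and the patent text does not say which is used.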
(3) Converting the optimal scheduling problem into a Markov decision process:
The state observation $s_t$ is set to comprise the thermal power unit output, the system active power loss, the scheduling period $t$, and the new energy unit generation and load predicted in step 1;
$S_t = \{t, P_t^{WF}, P_t^{PV}, P_t^{H}, P_t^{WF}\}$ (7)
The action $a_t$ is the unit decision variable and determines the unit output strategy for the next time step, $a_t = [a_{1,t}, a_{2,t}, \ldots, a_{N,t}]$; the selected action is subject to the minimum up/down time limits of the generators;
$A_t = \{P_t^{WF}, P_t^{PV}, P_t^{G}\}$ (8)
The reward value $R_t$ is given by a reward function determined from the designed objective function, which requires minimum operating cost and maximum new energy consumption.
$R_t = -[a_1 r_{1,t} + a_2 r_{2,t} + a_3 r_{3,t}]$ (9)
where $R_t$ is the total reward function; $r_{1,t}$, $r_{2,t}$ and $r_{3,t}$ represent the operating cost, active loss and new energy consumption of the system, respectively; and $a_1$, $a_2$, $a_3$ are the weights of the respective reward components;
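Equation (9) can be written as a small function. This is an illustrative sketch only: the function name, argument names and default weights are hypothetical, and the third term is treated here as a penalty (for example curtailed wind and photovoltaic energy), on the assumption that a negated term must shrink as new energy consumption improves.

```python
def reward(operating_cost, active_loss, curtailment, weights=(1.0, 1.0, 1.0)):
    """Weighted negative cost, following R_t = -[a1*r1 + a2*r2 + a3*r3].

    `curtailment` stands in for r_{3,t}; treating it as curtailed energy
    is an assumption made for this sketch.
    """
    a1, a2, a3 = weights
    return -(a1 * operating_cost + a2 * active_loss + a3 * curtailment)


if __name__ == "__main__":
    # Lower cost, loss and curtailment give a reward closer to zero.
    print(reward(10.0, 2.0, 3.0, weights=(0.5, 1.0, 2.0)))
```

Because the three terms usually have very different magnitudes, the weights also act as scaling factors; the patent does not state the values used.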
Moreover, the specific steps of establishing the optimal scheduling framework in step 3 include:
(1) Simulating the actual operating conditions of the power grid and building a power grid simulation environment; after the simulation environment performs the power flow calculation, the corresponding observation state space is obtained by calling a program interface;
(2) Using the state generated in the power grid operation scenario as input, searching for the optimal action in the action space: the action $a_t'$ is obtained from the actor network, noise is added to obtain the action $a_t = a_t' + N$, and $a_t$ is input into the environment to obtain $s_{t+1}$ and $r_{t+1}$, giving an experience $(s_t, a_t, s_{t+1}, r_{t+1})$, which is then put into the experience pool;
(3) Adding noise to the action in the policy network evaluation part: a normal distribution is constructed and sampled according to the shape of the action, the sample is multiplied by a parameter to scale the noise, and the noise is finally added to the action, which is then output, thereby obtaining the optimal scheduling framework based on the TD3 algorithm.
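The noise and experience-pool steps above can be sketched in plain Python. This is a hedged illustration, not the patent's implementation: the class and function names are hypothetical, actions are assumed to be normalized to a fixed range, and the clipping of the smoothing noise follows the published TD3 algorithm rather than anything stated in this text.

```python
import random
from collections import deque


def add_exploration_noise(action, sigma, low, high, rng):
    """a_t = a_t' + N with N ~ Normal(0, sigma^2) per component, then clipped."""
    noisy = [a + rng.gauss(0.0, sigma) for a in action]
    return [min(max(a, low), high) for a in noisy]


def smooth_target_action(action, sigma, noise_clip, low, high, rng):
    """Scaled normal noise on the evaluated action, clipped as in standard TD3."""
    smoothed = []
    for a in action:
        eps = rng.gauss(0.0, 1.0) * sigma                 # sample, then scale
        eps = min(max(eps, -noise_clip), noise_clip)      # clip the noise itself
        smoothed.append(min(max(a + eps, low), high))     # clip the action
    return smoothed


class ExperiencePool:
    """Fixed-size pool of (s_t, a_t, s_{t+1}, r_{t+1}) tuples."""

    def __init__(self, capacity):
        self.items = deque(maxlen=capacity)

    def put(self, state, action, next_state, reward):
        self.items.append((state, action, next_state, reward))

    def sample(self, batch_size, rng):
        return rng.sample(list(self.items), min(batch_size, len(self.items)))


if __name__ == "__main__":
    rng = random.Random(0)
    pool = ExperiencePool(capacity=1000)
    a = add_exploration_noise([0.2, -0.9], sigma=0.1, low=-1.0, high=1.0, rng=rng)
    pool.put([0.0], a, [0.1], -1.0)
    print(len(pool.items), len(a))
```

Sampling uniformly from the pool breaks the correlation between consecutive grid states, which is the usual reason for keeping a replay buffer in off-policy methods such as TD3.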
A large-scale power grid optimal scheduling device for power grids containing new energy comprises:
a wind power, photovoltaic and load prediction data set construction module, used for constructing a large-scale power grid model containing new energy nodes, predicting new energy generation and load demand based on historical operation data, and constructing a wind power, photovoltaic and load prediction data set at the time scale required by day-ahead scheduling;
a Markov decision process conversion generation module, used for improving the standard nodes based on the constructed wind power, photovoltaic and load prediction data set and converting the optimal scheduling problem into a Markov decision process;
an optimal scheduling framework construction module, used for constructing an agent based on a neural network with adaptive parameters, according to the historical operation data characteristics of the power grid and based on the Markov decision process generated by the conversion, and establishing an optimal scheduling framework based on the TD3 algorithm;
and a unit output plan output module, used for completing the training and testing of the agent based on the constructed large-scale power grid model containing new energy nodes and the TD3-based optimal scheduling framework, and obtaining a unit output plan that aims at meeting new energy consumption.
Moreover, the large-scale power grid model construction module for new energy nodes further comprises:
a large power grid model construction module containing new energy nodes, used for analyzing the IEEE1888 nodes; extracting key data that reflect the basic characteristics of power grid elements, the power grid connection relationships, and the production and operation management characteristics of the power system; studying the regulation and response characteristics of different types of energy nodes in the grid-connected state; improving the IEEE1888 standard nodes; and performing power flow testing on the improved IEEE1888 node model to complete the construction of the large power grid model containing new energy nodes;
a wind power, photovoltaic and load prediction data set construction module, used for predicting new energy generation with an LSTM neural network based on the constructed large power grid model containing new energy nodes, which includes preprocessing the historical operation data and establishing the LSTM neural network to predict the generation of the new energy units and the load demand, thereby completing the construction of the wind power, photovoltaic and load prediction data set at the time scale required by day-ahead scheduling.
Moreover, the Markov decision process conversion generation module further includes:
an objective function construction module, used for analyzing the wind power generation and photovoltaic power generation network models and constructing an objective function to meet the day-ahead scheduling target;
Total cost of system operation:
wherein:
where $F$ is the total operating cost of the power system; $T$ is the number of scheduling periods; $F_{1,t}$ is the operating cost of the conventional units in period $t$; $F_{2,t}$ is the active power loss of the system in period $t$; $F_{3,t}$ is the amount of curtailed wind and photovoltaic power in period $t$; $n$ is the number of thermal power units; $P_{i,h,t}$ is the active output of the $i$-th thermal power unit; $a_i$, $b_i$, $c_i$ are the coefficients of the first-order, second-order and constant terms of the cost function; $u_{i,t}$ indicates the on/off state of the thermal power unit; $P_{k,loss,t}$ is the active loss generated in period $t$, and $m$ is the number of branches in the power system; $r$ is the number of renewable energy units; $P_{d,t,w}$ is the output of the wind turbine in period $t$; $P_{d,t,w,max}$ is the upper limit of the wind turbine output in period $t$; $P_{d,t,pv}$ is the photovoltaic output in period $t$; and $P_{d,t,pvmax}$ is the upper limit of the photovoltaic output in period $t$;
a constraint condition setting module, used for setting the power balance constraint, power flow balance constraint, unit output constraint, unit ramp constraint, and the output limits of wind power generation and photovoltaic power generation:
where $P_{i,t}$ and $Q_{i,t}$ are the active and reactive power injected at node $i$, $U_i$ is the voltage magnitude of node $i$, $\theta_{ij}$ is the voltage phase angle difference between node $i$ and node $j$, and $G_{ij}$ and $B_{ij}$ are the nodal conductance and susceptance, respectively;
Unit output constraint and ramp constraint:
where $r_{i,up}$ is the upward ramp rate of the $i$-th thermal power unit and $r_{i,down}$ is its downward ramp rate;
Wind turbine and photovoltaic output constraints:
where $P_{d,t,wmax}$ and $P_{d,t,pvmax}$ are the upper output limits of the wind turbine and the photovoltaic unit, respectively;
a Markov decision process conversion generation module, used for converting the optimal scheduling problem into a Markov decision process:
The state observation $s_t$ is set to comprise the thermal power unit output, the system active power loss, the scheduling period $t$, and the predicted new energy unit generation and load;
$S_t = \{t, P_t^{WF}, P_t^{PV}, P_t^{H}, P_t^{WF}\}$ (7)
The action $a_t$ is the unit decision variable and determines the unit output strategy for the next time step, $a_t = [a_{1,t}, a_{2,t}, \ldots, a_{N,t}]$; the selected action is subject to the minimum up/down time limits of the generators;
$A_t = \{P_t^{WF}, P_t^{PV}, P_t^{G}\}$ (8)
The reward value $R_t$ is given by a reward function determined from the designed objective function, which requires minimum operating cost and maximum new energy consumption;
$R_t = -[a_1 r_{1,t} + a_2 r_{2,t} + a_3 r_{3,t}]$ (9)
where $R_t$ is the total reward function; $r_{1,t}$, $r_{2,t}$ and $r_{3,t}$ represent the operating cost, active loss and new energy consumption of the system, respectively; and $a_1$, $a_2$, $a_3$ are the weights of the respective reward components;
thereby converting the optimal scheduling problem into a Markov decision process and completing the setting of the grid state $s_t$, the action $a_t$, the reward value $R_t$ and the transfer function.
Moreover, the optimal scheduling framework establishment module further includes:
a power grid operating condition simulation module, used for simulating the actual operating conditions of the power grid and building a power grid simulation environment; after the simulation environment performs the power flow calculation, the corresponding observation state space is obtained by calling a program interface;
an optimal action searching module, used for searching for the optimal action in the action space with the state generated in the power grid operation scenario as input: the action $a_t'$ is obtained from the actor network, noise is added to obtain the action $a_t = a_t' + N$, and $a_t$ is input into the environment to obtain $s_{t+1}$ and $r_{t+1}$, giving an experience $(s_t, a_t, s_{t+1}, r_{t+1})$, which is then put into the experience pool;
and an optimal scheduling framework building module, which adds noise to the action in the policy network evaluation part: a normal distribution is constructed and sampled according to the shape of the action, the sample is multiplied by a parameter to scale the noise, and the noise is finally added to the action, which is then output, thereby obtaining the optimal scheduling framework based on the TD3 algorithm.
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the large-scale grid optimization scheduling method of any one of claims 1 to 5.
The invention has the advantages and beneficial effects that:
1. For a power grid environment with large-scale access of new energy, the invention provides a large-scale power grid optimal scheduling method for power grids containing new energy. It improves the traditional thermal power model, constructs a large power grid model containing new energy nodes, predicts the generation of the new energy units based on an LSTM network, and converts the optimal scheduling problem into a Markov decision process. An agent with adaptive parameters is constructed according to the historical operating characteristics of the power grid and trained on historical grid operation data to obtain a unit output plan that aims at meeting new energy consumption.
2. The invention improves the traditional standard node model, constructs a large power grid containing new energy, and uses an LSTM network to predict new energy generation and load demand for the next 24 hours based on historical operation data, laying a foundation for the design of an optimal scheduling scheme for the large power grid.
3. The invention fully considers the new energy consumption problem and the robustness problem faced by new energy grid connection, converts the optimal scheduling process into a Markov decision process, and sets the power grid state $s_t$ to be observed, the action $a_t$ taken in the grid scheduling process, the reward value $R_t$ generated after optimal scheduling, and the transfer function.
4. The invention converts the optimal scheduling problem into a Markov decision process, constructs an agent based on a neural network with adaptive parameters, and provides an optimal scheduling framework based on the TD3 algorithm. The adopted algorithm effectively addresses the action space explosion problem and the vulnerability of the power grid system during exploration of the optimal strategy, avoids overestimation of action values, and improves the update speed of the neural network parameters and the convergence speed of the algorithm.
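In the published TD3 algorithm, the overestimation mentioned in item 4 is limited by two mechanisms: the learning target uses the smaller of two critic estimates, and the actor is updated less often than the critics. The sketch below illustrates both with scalar values. The function names, discount factor and delay interval are illustrative assumptions and are not taken from the patent.

```python
def td3_target(reward, q1_next, q2_next, gamma=0.99, done=False):
    """Clipped double-Q target: y = r + gamma * min(Q1', Q2')."""
    if done:
        return reward
    return reward + gamma * min(q1_next, q2_next)


def should_update_actor(step, policy_delay=2):
    """Delayed policy update: refresh the actor every `policy_delay` critic steps."""
    return step % policy_delay == 0


if __name__ == "__main__":
    # The pessimistic critic (8.0) is used, not the optimistic one (10.0).
    print(td3_target(1.0, q1_next=10.0, q2_next=8.0, gamma=0.5))
    print([should_update_actor(s) for s in range(1, 5)])
```

Taking the minimum biases the target downward, which is generally considered safer than the upward bias of a single critic when the learned policy drives real dispatch decisions.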
Drawings
FIG. 1 is a general scheme diagram of a new energy-containing large-scale power grid optimization scheduling method;
FIG. 2 is a network structure diagram of the LSTM of the present invention;
FIG. 3 is a diagram of the internal structure of an LSTM network of the present invention;
FIG. 4 is a schematic diagram of a loss function optimization process in the training process of the present invention;
FIG. 5 is a schematic diagram of a short-term power generation prediction of a new energy unit based on an LSTM network according to the present invention;
FIG. 6 is a diagram of an optimal scheduling framework based on the TD3 algorithm of the present invention;
FIG. 7 is a schematic diagram of a convergence process and an obtained average prize value in a TD3 algorithm-based test process according to the present invention;
FIG. 8 is a schematic view of a thermal power generating unit scheduling strategy according to the present invention;
FIG. 9 is a schematic diagram of a wind turbine scheduling strategy according to the present invention;
FIG. 10 is a schematic view of a photovoltaic unit scheduling strategy according to the present invention.
Detailed Description
Embodiments of the invention are described in further detail below with reference to the attached drawing figures:
a large-scale power grid optimization scheduling method for new energy, as shown in figure 1, comprises the following steps:
and 1, constructing a large-scale power grid model containing new energy nodes, predicting the new energy generating capacity and the required load in each period of 24 hours based on historical operation data, and constructing a wind power, photovoltaic and load prediction data set of a time scale required by day-ahead scheduling.
The specific steps of the step 1 comprise:
(1) Analyzing IEEE1888 nodes, extracting key data capable of reflecting basic characteristics, power grid connection relation and production characteristics of power grid elements and operation management of a power system, researching regulation characteristics and response characteristics of different types of energy nodes in a grid-connected state, improving IEEE1888 standard nodes, performing tide testing on an improved IEEE1888 node model, and completing construction of a large power grid model containing new energy nodes;
in the embodiment, the IEEE1888 nodes comprise 55 generator nodes in total, 25 thermal power nodes are selected to improve the generator nodes, 15 nodes are connected with wind power, 10 nodes are connected with photovoltaic, and the rest 30 nodes keep thermal power generation. And selecting a predicted value and a true value of wind power and photovoltaic output of one week as a group of data for preliminary test, and carrying out tide test on the improved IEEE1888 node model to complete the construction of the large power grid model containing new energy nodes.
(2) Based on the large power grid model with the new energy nodes constructed in the step (1), predicting the new energy generating capacity by utilizing the LSTM neural network, including preprocessing historical operation data and establishing the LSTM neural network to complete the prediction of the generating capacity and the required load of the new energy unit, and completing the construction of a wind power, photovoltaic and load prediction data set of the time scale required by day-ahead scheduling.
In step (2) of step 1, the LSTM neural network established by the invention is configured as follows: the memory length is 12 hours with an interval of 1 hour; the activation functions are the Sigmoid and Tanh functions; the number of training iterations is 100; and there are 5 hidden layers. The objective function is set as:
Wind power data of one year are selected, of which 80% are used as training data and 20% as test data, and the loss function is optimized during training, thereby completing construction of the wind power, photovoltaic and load prediction data set at the time scale required by day-ahead scheduling.
In this embodiment, the generation of the new energy units is predicted based on standardized historical data. The LSTM is a network structure obtained by adding several special computing nodes to the hidden layer of a recurrent neural network; it can selectively forget or record input information, which effectively alleviates the vanishing gradient and exploding gradient problems.
The historical data are preprocessed with a data normalization method applied to the different types of data, so that the objective function converges quickly. Three data parameters, namely historical power, wind speed and temperature, are normalized by Min-Max Normalization:
$$X^{*} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$
wherein $X$ is the data to be processed, $X_{\max}$ is the maximum value of the historical data, $X_{\min}$ is the minimum value of the historical data, and $X^{*}$ is the normalized standard data.
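By way of non-limiting illustration, the Min-Max normalization described above may be sketched in Python as follows. The function name, the feature names and the sample values are assumptions introduced here for illustration only, and the handling of a constant series is a choice made for this sketch.

```python
def min_max_normalize(values):
    """Scale a sequence to [0, 1] using X* = (X - X_min) / (X_max - X_min)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # A constant series carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - x_min) / span for x in values]


# Hypothetical hourly samples; each feature is normalized independently.
history = {
    "power_mw": [12.0, 30.0, 48.0],
    "wind_speed_ms": [3.0, 7.5, 12.0],
    "temperature_c": [-5.0, 10.0, 25.0],
}
normalized = {name: min_max_normalize(series) for name, series in history.items()}
```

Normalizing each feature separately keeps quantities with different units, such as power and temperature, on a comparable scale before they are fed to the network.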
Based on the standardized historical data, the generation of the new energy units is predicted. A long short-term memory network (LSTM) is a specially designed deep recurrent neural network. It is trained by backpropagation through time and overcomes the vanishing gradient and long-term dependence problems. The LSTM is derived by adding several special computing nodes to the hidden layer of a recurrent neural network (RNN) so that input information can be selectively forgotten or recorded, which effectively alleviates the vanishing gradient and exploding gradient problems. The network structure of the LSTM is shown in figure 2.
Input Gate: an information input layer, the switch of which determines whether or not information is input to a Memory Cell at this time.
Output Gate: the information output layer, whether there is information output from the storage unit at a certain time depends on the output gate. The present invention outputs a loss function.
Forget Gate: at each instant, the value in the memory cell undergoes a process of whether it is forgotten.
Information input to the LSTM neural network first reaches the input gate, which decides whether it is admitted; the forget gate then decides whether the information in the Memory Cell is forgotten; finally, the output gate decides whether the information at this moment is output. Two activation functions are used in the LSTM, the tanh function and the sigmoid function, and the internal structure is shown in FIG. 3.
The Memory Cell is the place where memory is stored, similar to the state $s_t$ of a neural network; $h_t$ denotes the hidden layer, $a$ is the output at this instant, and $Z_i$, $Z_o$, $Z_f$ are among the four values that complement each other as inputs to the LSTM network.
Here $Z_i$, $Z_o$, $Z_f$ are gating signals. Each is obtained by taking the dot product of the weight coefficients with the input at this moment and the hidden state of the previous moment and passing the result through a sigmoid activation function, which yields a value between 0 and 1. They serve as the input signal, output signal and forget signal, respectively, where 0 means fully closed and 1 means fully open.
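By way of non-limiting illustration, the gating described above may be sketched for a single scalar unit. The sketch assumes the standard LSTM update, in which the memory cell is updated as $c_t = Z_f c_{t-1} + Z_i \tanh(\cdot)$ and the output as $h_t = Z_o \tanh(c_t)$; this form, the function names and the weight layout are assumptions, since the figures of the embodiment are not reproduced here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, w):
    """One step of a scalar LSTM cell (standard formulation, assumed here).

    `w` maps each of "z", "i", "f", "o" to (input weight, hidden weight, bias).
    """
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x_t + w_h * h_prev + b

    z = math.tanh(pre("z"))        # candidate input
    z_i = sigmoid(pre("i"))        # input gate: 0 = closed, 1 = open
    z_f = sigmoid(pre("f"))        # forget gate
    z_o = sigmoid(pre("o"))        # output gate
    c_t = z_f * c_prev + z_i * z   # memory cell update
    h_t = z_o * math.tanh(c_t)     # hidden state, also the output at this step
    return h_t, c_t
```

With all weights set to zero every gate evaluates to 0.5, so half of the previous cell content is retained; a large positive forget-gate bias would instead keep the memory almost unchanged.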
Several factors that influence the generation of the new energy units are considered, and the wind power is predicted over different time periods based on the obtained standardized data. As in a standard recurrent neural network, the forward propagation of an input sequence x of length T is computed starting from t=1, applying the update equations recursively while t is incremented. The back propagation is computed from t=T using gradient descent, which completes the backward tuning of the network parameters. The LSTM designed by the invention has a memory length of 12 hours with an interval of 1 hour. The activation functions are the Sigmoid and Tanh functions, the number of training iterations is 100, and there are 5 hidden layers. The objective function is set as:
Wind power data of one year are selected, with 80% used for training and 20% for testing. The optimization of the loss function during training is shown in fig. 4, and the prediction result based on LSTM training is shown in fig. 5. This completes construction of the wind power, photovoltaic and load prediction data set at the time scale required by day-ahead scheduling.
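By way of non-limiting illustration, the 12-hour memory with a 1-hour interval and the 80%/20% split may be sketched as follows. The function names and the synthetic series are assumptions; splitting in chronological order without shuffling is also an assumption of this sketch, as the embodiment does not state how the split is drawn.

```python
def make_windows(series, lookback=12):
    """Pair each run of `lookback` hourly values with the value that follows it."""
    samples = []
    for start in range(len(series) - lookback):
        window = series[start:start + lookback]
        target = series[start + lookback]
        samples.append((window, target))
    return samples


def chronological_split(samples, train_fraction=0.8):
    """Split without shuffling so that test samples come after training samples."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# Placeholder series: one synthetic value per hour for 8760 hours (one year).
hourly_power = [float(h % 24) for h in range(8760)]
samples = make_windows(hourly_power, lookback=12)
train_set, test_set = chronological_split(samples)
```

Each sample then holds twelve consecutive hourly values as the network input and the following hour as the prediction target.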
Step 2, based on the wind power, photovoltaic and load prediction data set constructed in step 1, improving the standard nodes, converting the optimal scheduling problem into a Markov decision process, and completing the design of the power grid state $s_t$, the action $a_t$, the reward value $R_t$ and the transfer function.
In this embodiment, in step 2, the standard nodes are improved based on the wind power, photovoltaic and load prediction data set constructed in step 1, and the optimal scheduling problem is converted into a Markov decision process described by the four-tuple (S, A, P, R), where $S_t$ denotes the set of power grid states and $a_t$ denotes the set of actions taken by the grid; the reward value $R_t$ and the transfer function are also designed. The state observation $s_t$ includes the new energy generation and load demand forecasts obtained in step 1, the output of each unit at the current moment, and the like. The action $a_t$ determines the unit output strategy for the next time step, $a_t = [a_{1,t}, a_{2,t}, \ldots, a_{N,t}]$, and the selected action is subject to the minimum up/down time limits of the generators. For the reward value $R_t$, the reward function is determined based on the objective function constructed in step (1) and requires minimum operating cost and maximum new energy consumption. The task of reinforcement learning is to determine a policy that maximizes the sum of rewards in the MDP.
The specific steps of the step 2 include:
(1) Analyzing the network model including wind power generation and photovoltaic power generation, and constructing an objective function that meets the day-ahead scheduling targets: minimizing system operating cost, minimizing active network loss, and minimizing wind and solar curtailment, thereby improving the economy of system operation and the capability to consume new energy.
Total cost of system operation:
wherein:
wherein $F$ is the total operating cost of the power system and $T$ is the number of scheduling periods; $F_{1,t}$ is the operating cost of the conventional units in period $t$, $F_{2,t}$ is the active power loss of the system in period $t$, and $F_{3,t}$ is the amount of curtailed wind and solar power in period $t$. $n$ is the number of thermal power units, $P_{i,h,t}$ is the active output of the $i$-th thermal power unit, and $a_i$, $b_i$, $c_i$ are the coefficients of the first-order, second-order and constant terms of the cost function, respectively. $u_{i,t}$ indicates the on/off state of the thermal power unit. $P_{k,loss,t}$ is the active loss generated in period $t$, and $m$ is the number of branches of the power system. $r$ is the number of renewable energy units, $P_{d,t,w}$ is the output of the wind turbine in period $t$, and $P_{d,t,w,max}$ is the upper output limit of the wind turbine in period $t$. $P_{d,t,pv}$ is the output of the photovoltaic unit in period $t$, and $P_{d,t,pvmax}$ is the upper output limit of the photovoltaic unit in period $t$.
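By way of non-limiting illustration, one way of evaluating the three cost components from the quantities defined above is sketched below. The sketch assumes a quadratic fuel cost gated by the on/off state, a loss term summed over branches, a curtailment term equal to the upper output limit minus the actual output, and an unweighted sum over periods; these forms, the parameter names `quad`, `lin` and `const`, and all numerical values are assumptions rather than the formulas of the embodiment.

```python
def thermal_cost(p, on, quad, lin, const):
    """Fuel cost of one thermal unit in one period; zero when the unit is off."""
    return on * (quad * p ** 2 + lin * p + const)


def period_cost(thermal_units, branch_losses, renewables):
    """Return (F1, F2, F3) for one period t.

    thermal_units: list of dicts with keys p, on, quad, lin, const
    branch_losses: active loss of each branch
    renewables:    list of (actual output, upper output limit) pairs
    """
    f1 = sum(thermal_cost(**u) for u in thermal_units)
    f2 = sum(branch_losses)
    f3 = sum(p_max - p for p, p_max in renewables)  # curtailed wind and solar
    return f1, f2, f3


def total_cost(periods):
    """Sum F1 + F2 + F3 over all scheduling periods (unweighted, by assumption)."""
    return sum(sum(period_cost(*period)) for period in periods)


# Hypothetical single-period example.
units = [{"p": 100.0, "on": 1, "quad": 0.01, "lin": 2.0, "const": 50.0}]
losses = [1.5, 0.5]
renewables = [(40.0, 50.0), (20.0, 20.0)]
f1, f2, f3 = period_cost(units, losses, renewables)
```

Because the three components carry different units, a practical implementation would typically scale or weight them before summation.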
(2) To ensure safe and stable operation of the power system, constraint conditions are set, mainly comprising the power balance constraint, power flow balance constraint, unit output constraint, unit ramp constraint, and the output limits of wind power generation and photovoltaic power generation.
wherein $P_{i,t}$ and $Q_{i,t}$ are the active and reactive power injected at node $i$, $U_i$ is the voltage amplitude of node $i$, $\theta_{ij}$ is the voltage phase angle difference between node $i$ and node $j$, and $G_{ij}$ and $B_{ij}$ are the node conductance and susceptance, respectively.
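By way of non-limiting illustration, the node injections may be evaluated from the quantities defined above using the standard polar-form AC power flow equations, $P_i = U_i \sum_j U_j (G_{ij}\cos\theta_{ij} + B_{ij}\sin\theta_{ij})$ and $Q_i = U_i \sum_j U_j (G_{ij}\sin\theta_{ij} - B_{ij}\cos\theta_{ij})$. Use of this standard form is an assumption of the sketch, and the two-node network is hypothetical.

```python
import math


def injections(u, theta, g, b):
    """Active and reactive power injected at every node (polar form).

    u, theta: voltage magnitudes and angles (radians) per node
    g, b:     conductance and susceptance matrices as nested lists
    """
    n = len(u)
    p, q = [0.0] * n, [0.0] * n
    for i in range(n):
        for j in range(n):
            d = theta[i] - theta[j]
            p[i] += u[i] * u[j] * (g[i][j] * math.cos(d) + b[i][j] * math.sin(d))
            q[i] += u[i] * u[j] * (g[i][j] * math.sin(d) - b[i][j] * math.cos(d))
    return p, q


# Hypothetical two-node network joined by one line with series admittance 1 - j5.
G = [[1.0, -1.0], [-1.0, 1.0]]
B = [[-5.0, 5.0], [5.0, -5.0]]
p, q = injections([1.0, 1.0], [0.1, 0.0], G, B)
```

In this example node 0 exports active power to node 1, and the sum of the two active injections equals the active loss on the line.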
Unit output constraint and ramp constraint:
wherein $r_{i,up}$ is the upward ramp rate of the $i$-th thermal power unit and $r_{i,down}$ is the downward ramp rate of the $i$-th thermal power unit.
Wind turbine and photovoltaic output constraints:
wherein $P_{d,t,wmax}$ and $P_{d,t,pvmax}$ are the upper output limits of the wind turbine and the photovoltaic unit, respectively.
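By way of non-limiting illustration, the unit and renewable constraints may be checked as sketched below. The sketch assumes that the ramp constraint bounds the change in output between consecutive periods by the downward and upward ramp rates and that renewable output lies between zero and its upper limit; the function names and numbers are assumptions.

```python
def unit_feasible(p_now, p_prev, p_min, p_max, ramp_up, ramp_down):
    """Check output limits and ramp limits for one thermal unit."""
    within_limits = p_min <= p_now <= p_max
    within_ramp = -ramp_down <= p_now - p_prev <= ramp_up
    return within_limits and within_ramp


def clip_renewable(p_request, p_upper):
    """Keep a wind or photovoltaic set-point between zero and its upper limit."""
    return max(0.0, min(p_request, p_upper))
```

For example, `unit_feasible(140.0, 100.0, 50.0, 200.0, 30.0, 30.0)` is rejected because the 40 MW increase exceeds the assumed 30 MW upward ramp rate, even though 140 MW lies inside the output limits.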
(3) Converting the optimal scheduling problem into a Markov decision process:
The state observation $s_t$ comprises the new energy generation and load forecasts obtained in step 1, the output of the thermal power units, the active power loss of the system, the scheduling period $t$, and the like;
$$S_t = \{t, P_t^{WF}, P_t^{PV}, P_t^{H}, P_t^{WF}\} \quad (7)$$
The action $a_t$ is the unit decision variable that determines the unit output strategy for the next time step, $a_t = [a_{1,t}, a_{2,t}, \ldots, a_{N,t}]$; the selected action is subject to the minimum up/down time limits of the generators.
$$A_t = \{P_t^{WF}, P_t^{PV}, P_t^{G}\} \quad (8)$$
For the reward value $R_t$, the reward function is determined based on the designed objective function and requires minimum operating cost and maximum new energy consumption.
$$R_t = -[a_1 r_{1,t} + a_2 r_{2,t} + a_3 r_{3,t}] \quad (9)$$
wherein $R_t$ is the total reward function; $r_{1,t}$, $r_{2,t}$, $r_{3,t}$ represent the system operating cost, the active loss and the new energy consumption, respectively; and $a_1$, $a_2$, $a_3$ are the weights of the respective parts of the reward value. The optimal scheduling problem is thereby converted into a Markov decision process, and the design of the power grid state $s_t$, the action $a_t$, the reward value $R_t$ and the transfer function is completed.
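By way of non-limiting illustration, equation (9) may be evaluated as sketched below. The function name and the weight values are assumptions; the example additionally assumes that the third component is supplied as the curtailed amount, so that a smaller value corresponds to greater new energy consumption.

```python
def reward(r1, r2, r3, weights=(1.0, 1.0, 1.0)):
    """Equation (9): negative weighted sum of the three reward components."""
    a1, a2, a3 = weights
    return -(a1 * r1 + a2 * r2 + a3 * r3)


# Hypothetical components: operating cost, active loss, curtailed energy.
weights = (0.01, 1.0, 0.5)
r_high_cost = reward(350.0, 2.0, 10.0, weights)
r_low_cost = reward(300.0, 2.0, 10.0, weights)
```

The negative sign means that an action with lower cost, lower loss and less curtailment receives a larger reward, which is the quantity the agent maximizes.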
Step 3, based on the Markov decision process generated in step 2 and according to the historical operation data characteristics of the power grid, constructing an agent based on a parameter-adaptive neural network, and establishing an optimized scheduling framework based on the twin delayed deep deterministic policy gradient (TD3) algorithm;
In this embodiment, based on the Markov decision process constructed in step 2, the twin delayed deep deterministic policy gradient algorithm (TD3) is adopted. TD3 is a deterministic-policy reinforcement learning algorithm suitable for high-dimensional continuous action spaces. The optimization objective is to find the optimal value of the action-value function Q(s, a); that is, for each state $s_t$ of the power system, the corresponding action, namely the unit output value, can be found. To avoid overestimation of the Q value, the TD3 algorithm estimates the Q value with two sets of networks, comprising 6 network structures in total, and takes the smaller of the two estimates as the update target.
Considering the problems of Q value overestimation and hyper-parameter sensitivity faced by the DDPG algorithm, the TD3 algorithm makes improvements in three aspects. 1. The TD3 algorithm learns two Q functions during operation and uses the smaller of the two Q values to form the target in the Bellman error loss function. 2. The TD3 algorithm updates the policy and the target networks less frequently. The delayed policy update updates the actor network less often than the evaluation (critic) network, which allows the estimation error to decrease before each policy update, while the target networks are updated with soft updates: $\theta' \leftarrow \tau\theta + (1-\tau)\theta'$. 3. The TD3 algorithm adds extra noise to the action when making action decisions; by smoothing the change of Q with respect to the action, this makes it harder for the policy to exploit errors in the Q function.
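By way of non-limiting illustration, the soft update and the added action noise may be sketched on plain lists of numbers. The values of `tau`, `sigma` and `clip`, the action range, and the clipping of the noise are assumptions of this sketch and are not stated in the embodiment.

```python
import random


def soft_update(target, online, tau=0.005):
    """theta' <- tau * theta + (1 - tau) * theta', element by element."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]


def smoothed_target_action(action, sigma=0.2, clip=0.5, low=-1.0, high=1.0, rng=random):
    """Add clipped Gaussian noise to a target action, then clip to the action range."""
    noisy = []
    for a in action:
        eps = max(-clip, min(clip, rng.gauss(0.0, sigma)))
        noisy.append(max(low, min(high, a + eps)))
    return noisy
```

A small `tau` makes the target parameters trail the online parameters slowly, which keeps the update target stable between policy updates.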
The implementation of TD3 includes 6 networks: 2 Q networks, 2 target Q networks, 1 policy network and 1 target policy network. The optimal policy is obtained by solving the recursive Bellman equation, and the formula of the optimal policy of the TD3 algorithm is as follows:
wherein $r(s_t, a_t)$ is the reward obtained by taking action $a_t$ in state $s_t$, and the expectation of the reward is taken over state-action pairs $(s_t, a_t)$ following the probability distribution $\rho_\pi$.
During operation of the TD3 algorithm, the agent maximizes the return function by finding the optimal policy, and the gradient with respect to the policy parameters is computed as:
The two evaluation networks compute two Q values, $Q_1(A')$ and $Q_2(A')$; the smaller value is used to form the update target, and each evaluation network optimizes its parameters by minimizing the loss function, where $y_t$ is the update target:
The update targets of the two evaluation networks in the TD3 algorithm are the same, but because the initial network parameters differ, their estimates differ during operation. This makes it possible to select the smaller Q value estimate and thus avoid overestimating the Q value.
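By way of non-limiting illustration, the shared update target and the evaluation-network loss may be sketched as follows. The sketch assumes the commonly used TD3 form $y_t = r + \gamma \min(Q_1(A'), Q_2(A'))$ with a mean squared error loss; the discount factor and the terminal-step handling are assumptions.

```python
def td3_target(reward, q1_next, q2_next, done, gamma=0.99):
    """y_t = r + gamma * min(Q1', Q2'); no bootstrap term on terminal steps."""
    if done:
        return reward
    return reward + gamma * min(q1_next, q2_next)


def critic_loss(q_pred, y):
    """Mean squared Bellman error of one evaluation network over a batch."""
    return sum((q - t) ** 2 for q, t in zip(q_pred, y)) / len(y)
```

Both evaluation networks are regressed toward the same target `y`, so an overestimate by one network is suppressed whenever the other network returns a smaller value.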
Step 3 relates to the design of a large-scale power grid optimization scheduling scheme based on a TD3 algorithm, and the overall framework diagram is shown in fig. 6.
The specific steps of establishing the optimized dispatching framework in the step 3 include:
Firstly, the actual operating conditions of the power grid are simulated in MATLAB to construct a power grid simulation environment; the total scheduling horizon is 24 h with a step of 1 h. After the simulation environment performs the power flow calculation, the corresponding observation state space is obtained by calling a program interface.
Then the agent is trained. The states generated in the power grid operation scenario are used as inputs, and the optimal action is searched for in the action space. The actor network produces action $a_t'$, and noise is added to obtain the action $a_t = a_t' + N$. Then $a_t$ is input into the environment to obtain $s_{t+1}$ and $r_{t+1}$, which yields an experience $(s_t, a_t, s_{t+1}, r_{t+1})$ that is placed in the experience pool. The Actor network directly determines the quality of the adopted policy. Training a good Actor network requires an accurate Critic network to evaluate it, and the remaining 5 networks of TD3 are designed to make the Critic network as accurate as possible. Finally, noise is added to the action in the policy network evaluation part: a normal distribution is constructed and sampled according to the shape of the action, the samples are multiplied by a parameter to scale the noise, and the noise is added to the action, which is then output. The optimized scheduling framework based on the TD3 algorithm is thereby obtained.
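By way of non-limiting illustration, the noise step $a_t = a_t' + N$ may be sketched as follows: one standard-normal sample is drawn per action entry, multiplied by a scaling parameter, and added to the actor output. The scale value, the normalized action range and the final clipping are assumptions of this sketch.

```python
import random


def explore(action, scale=0.1, low=0.0, high=1.0, rng=random):
    """Return a_t = a_t' + N: one scaled standard-normal sample per action entry."""
    noisy = [a + scale * rng.gauss(0.0, 1.0) for a in action]
    # Clipping keeps set-points inside the allowed range (an added assumption).
    return [max(low, min(high, a)) for a in noisy]
```

Setting `scale` to zero reproduces the deterministic actor output, which is convenient when the trained agent is evaluated.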
Step 4, based on the large-scale power grid model containing new energy nodes constructed in step 1 and the optimized scheduling framework based on the TD3 algorithm constructed in step 3, completing training and testing of the agent and obtaining a unit output plan aimed at new energy consumption.
In this embodiment, the specific contents of step 4 include:
The invention adopts a large-scale power grid model containing new energy nodes, in which the share of new energy generation differs from node to node. The actions available in the large-scale power grid are the active power output values of all units: 25 adjustable new energy units, 29 adjustable thermal power units and 1 adjustable balancing unit. Each line in the system has its own transmission capacity, and a line is disconnected automatically when the transmitted power exceeds that capacity. During simulated grid operation training, either of the following two conditions immediately terminates grid operation:
1) The unbalanced power of the grid drives the balancing machine beyond its power limits, so the power flow calculation cannot converge;
2) The actions performed by the agent cause a load, generator or substation to become an isolated node, which may occur when many lines are disconnected.
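By way of non-limiting illustration, the two termination conditions above may be checked as sketched below. Treating a node as isolated when it can no longer be reached from the balancing node over closed lines is an assumption of this sketch, as are the function names and the example network.

```python
def isolated_nodes(nodes, closed_lines, slack):
    """Return the nodes that are no longer connected to the slack (balance) node."""
    neighbours = {n: set() for n in nodes}
    for a, b in closed_lines:
        neighbours[a].add(b)
        neighbours[b].add(a)
    reached, frontier = {slack}, [slack]
    while frontier:
        current = frontier.pop()
        for nxt in neighbours[current] - reached:
            reached.add(nxt)
            frontier.append(nxt)
    return set(nodes) - reached


def should_terminate(flow_converged, slack_power, slack_min, slack_max, islanded):
    """Episode ends on non-convergence, slack limit violation, or islanding."""
    slack_ok = slack_min <= slack_power <= slack_max
    return (not flow_converged) or (not slack_ok) or bool(islanded)
```

For a four-node example with closed lines (1, 2) and (2, 3) and node 1 as the balancing node, node 4 is reported as isolated and the episode is terminated.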
The training process of the algorithm involves the selection of the software environment observations and the scheduling actions of the grid agent, the setting of the hyper-parameters and training procedure, and the training of the specific algorithm. The algorithm interacts with the environment in real time to generate experience data and then updates the action network and the evaluation network based on these data. First, the action network outputs the grid scheduling action $a_t$ according to the observation $o_t$ provided by the grid software environment, receives the feedback reward value $r_t$ provided by the software environment, and obtains the observation $o_{t+1}$ at the next moment. The experience data $(o_t, a_t, r_t, o_{t+1})$ from the interaction between the grid agent and the software environment are then stored in the experience pool, and the grid agent starts training when the amount of stored data reaches WARMUP_STEPS.
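By way of non-limiting illustration, the experience pool and the WARMUP_STEPS threshold may be sketched as follows. The class name, the capacity, the batch size and the numerical value given to WARMUP_STEPS are assumptions; the embodiment does not state them.

```python
import random
from collections import deque

WARMUP_STEPS = 4   # illustrative value; not stated in the embodiment
BATCH_SIZE = 2     # illustrative value


class ExperiencePool:
    def __init__(self, capacity=10000):
        self.data = deque(maxlen=capacity)  # oldest entries are dropped first

    def store(self, obs, action, reward, next_obs):
        """Store one transition (o_t, a_t, r_t, o_{t+1})."""
        self.data.append((obs, action, reward, next_obs))

    def ready(self):
        """Training may start once WARMUP_STEPS transitions are stored."""
        return len(self.data) >= WARMUP_STEPS

    def sample(self, rng=random):
        """Draw a random mini-batch without replacement."""
        return rng.sample(list(self.data), BATCH_SIZE)
```

Sampling at random from the pool breaks the temporal correlation between consecutive transitions before they are used to update the networks.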
After the grid agent starts training, the action network is updated first, with its weight parameters updated by gradient descent. After the action network has been updated T times, the evaluation network is updated, with its weight parameters updated by minimizing the Bellman error of the agent. Training of the grid agent comprises multiple learning episodes; in each episode one section of data is randomly selected as the initial observation, and a grid scheduling policy is output according to the observation. Because of the randomness of the initial observation and the various uncertainties faced in the decision process, the algorithm needs to be trained for many episodes before the reward curve fully converges.
The trained agent is then tested. To test the generalization performance of the algorithm module, the test uses section data different from those used in training. The algorithm adopts a virtual test environment containing historical section data different from the training data. The grid agent randomly selects 20 sequences for testing, with section data extracted every 5 minutes and any section used as the observation; a sequence ends when the power flow fails to converge or the balancing machine exceeds its limits, and a reward value is obtained. The agent trained with the TD3 algorithm obtains an average reward value over the test. The thermal power unit scheduling strategy is shown in FIG. 8, the wind power unit scheduling strategy in FIG. 9, and the photovoltaic unit scheduling strategy in FIG. 10. The broken lines in FIG. 9 and FIG. 10 are the predicted hourly generation of the wind power units and the photovoltaic units, respectively, and the bars are the hourly generation of each unit. As the figures show, the scheduling strategy generated by the TD3 algorithm can ensure the consumption of new energy and reduce generation cost while the grid operates stably. Training and testing of the agent are thereby completed, and the unit output plan aimed at new energy consumption is obtained.
A large-scale power grid optimization scheduling device for new energy comprises:
The wind power, photovoltaic and load prediction data set construction module, used for constructing a large-scale power grid model containing new energy nodes, predicting the new energy generation and load demand based on historical operation data, and constructing the wind power, photovoltaic and load prediction data set at the time scale required by day-ahead scheduling;
the Markov decision process conversion generation module, used for improving the standard nodes based on the constructed wind power, photovoltaic and load prediction data set and converting the optimal scheduling problem into a Markov decision process;
the optimized scheduling framework construction module, used for constructing an agent based on a parameter-adaptive neural network according to the historical operation data characteristics of the power grid and the Markov decision process generated by conversion, and for establishing the optimized scheduling framework based on the TD3 algorithm;
and the unit output plan output module, used for completing training and testing of the agent based on the constructed large-scale power grid model containing new energy nodes and the optimized scheduling framework based on the TD3 algorithm, and obtaining the unit output plan aimed at new energy consumption.
The large-scale power grid model building module of the new energy node further comprises:
The large power grid model construction module containing new energy nodes, used for analyzing the IEEE1888 nodes, extracting key data that reflect the basic characteristics of power grid elements, the grid connection relations, and the production and operation management characteristics of the power system, studying the regulation and response characteristics of different types of energy nodes in the grid-connected state, improving the IEEE1888 standard nodes, performing power flow testing on the improved IEEE1888 node model, and completing construction of the large power grid model containing new energy nodes;
the wind power, photovoltaic and load prediction data set construction module, used for predicting the new energy generation with the LSTM neural network based on the constructed large power grid model containing new energy nodes, including preprocessing the historical operation data and establishing the LSTM neural network to predict the generation of the new energy units and the load demand, and completing construction of the wind power, photovoltaic and load prediction data set at the time scale required by day-ahead scheduling.
The markov decision process conversion generation module further includes:
the objective function construction module, used for analyzing the network model including wind power generation and photovoltaic power generation and constructing an objective function that meets the day-ahead scheduling target;
Total cost of system operation:
wherein:
wherein $F$ is the total operating cost of the power system and $T$ is the number of scheduling periods; $F_{1,t}$ is the operating cost of the conventional units in period $t$, $F_{2,t}$ is the active power loss of the system in period $t$, and $F_{3,t}$ is the amount of curtailed wind and solar power in period $t$; $n$ is the number of thermal power units, $P_{i,h,t}$ is the active output of the $i$-th thermal power unit, and $a_i$, $b_i$, $c_i$ are the coefficients of the first-order, second-order and constant terms of the cost function, respectively; $u_{i,t}$ indicates the on/off state of the thermal power unit; $P_{k,loss,t}$ is the active loss generated in period $t$, and $m$ is the number of branches of the power system; $r$ is the number of renewable energy units, $P_{d,t,w}$ is the output of the wind turbine in period $t$, and $P_{d,t,w,max}$ is the upper output limit of the wind turbine in period $t$; $P_{d,t,pv}$ is the output of the photovoltaic unit in period $t$, and $P_{d,t,pvmax}$ is the upper output limit of the photovoltaic unit in period $t$;
the constraint condition setting module, used for setting the power balance constraint, power flow balance constraint, unit output constraint, unit ramp constraint, and the output limits of wind power generation and photovoltaic power generation:
wherein $P_{i,t}$ and $Q_{i,t}$ are the active and reactive power injected at node $i$, $U_i$ is the voltage amplitude of node $i$, $\theta_{ij}$ is the voltage phase angle difference between node $i$ and node $j$, and $G_{ij}$ and $B_{ij}$ are the node conductance and susceptance, respectively;
unit output constraint and ramp constraint:
wherein $r_{i,up}$ is the upward ramp rate of the $i$-th thermal power unit and $r_{i,down}$ is the downward ramp rate of the $i$-th thermal power unit;
wind turbine and photovoltaic output constraints:
wherein $P_{d,t,wmax}$ and $P_{d,t,pvmax}$ are the upper output limits of the wind turbine and the photovoltaic unit, respectively;
the Markov decision process conversion generation module, used for converting the optimal scheduling problem into a Markov decision process:
the state observation $s_t$ comprises the output of the thermal power units, the new energy generation and load forecasts, the active power loss of the system, and the scheduling period $t$;
$$S_t = \{t, P_t^{WF}, P_t^{PV}, P_t^{H}, P_t^{WF}\} \quad (7)$$
the action $a_t$ is the unit decision variable that determines the unit output strategy for the next time step, $a_t = [a_{1,t}, a_{2,t}, \ldots, a_{N,t}]$, and the selected action is subject to the minimum up/down time limits of the generators;
$$A_t = \{P_t^{WF}, P_t^{PV}, P_t^{G}\} \quad (8)$$
for the reward value $R_t$, the reward function is determined based on the designed objective function and requires minimum operating cost and maximum new energy consumption;
$$R_t = -[a_1 r_{1,t} + a_2 r_{2,t} + a_3 r_{3,t}] \quad (9)$$
wherein $R_t$ is the total reward function; $r_{1,t}$, $r_{2,t}$, $r_{3,t}$ represent the system operating cost, the active loss and the new energy consumption, respectively; and $a_1$, $a_2$, $a_3$ are the weights of the respective parts of the reward value.
The optimal scheduling frame building module further comprises:
the power grid actual operation condition simulation module, used for simulating the actual operating conditions of the power grid, constructing a power grid simulation environment, and obtaining the corresponding observation state space by calling a program interface after the simulation environment performs the power flow calculation;
The optimal action searching module, used for taking the states generated in the power grid operation scenario as inputs, searching for the optimal action in the action space, obtaining action $a_t'$ through the actor network, adding noise to obtain the action $a_t = a_t' + N$, inputting $a_t$ into the environment to obtain $s_{t+1}$ and $r_{t+1}$, thereby obtaining an experience $(s_t, a_t, s_{t+1}, r_{t+1})$, and placing the experience in the experience pool;
and the optimized scheduling framework establishing module, used for adding noise to the action in the policy network evaluation part: a normal distribution is constructed and sampled according to the shape of the action, the samples are multiplied by a parameter to scale the noise, and the noise is added to the action, which is then output, thereby obtaining the optimized scheduling framework based on the TD3 algorithm.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the large-scale grid optimization scheduling method.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Claims (10)

1. A large-scale power grid optimization scheduling method for new energy is characterized in that: the method comprises the following steps:
step 1, constructing a large-scale power grid model containing new energy nodes, predicting new energy generating capacity and required load based on historical operation data, and constructing a wind power, photovoltaic and load prediction data set of time scale required by day-ahead scheduling;
Step 2, based on the wind power, photovoltaic and load prediction data set constructed in the step 1, improving standard nodes, and converting an optimal scheduling problem into a Markov decision process;
step 3, based on the Markov decision process generated in step 2 and according to the historical operation data characteristics of the power grid, constructing an agent based on a parameter-adaptive neural network, and establishing an optimized scheduling framework based on the twin delayed deep deterministic policy gradient (TD3) algorithm;
step 4, based on the large-scale power grid model containing new energy nodes constructed in step 1 and the optimized scheduling framework based on the TD3 algorithm constructed in step 3, completing training and testing of the agent and obtaining a unit output plan aimed at new energy consumption.
2. The new energy-containing large-scale power grid optimization scheduling method according to claim 1, wherein the method comprises the following steps: the specific steps of the step 1 comprise:
(1) Analyzing the IEEE1888 nodes, extracting key data that reflect the basic characteristics of power grid elements, the grid connection relations, and the production and operation management characteristics of the power system, studying the regulation and response characteristics of different types of energy nodes in the grid-connected state, improving the IEEE1888 standard nodes, performing power flow testing on the improved IEEE1888 node model, and completing construction of the large power grid model containing new energy nodes;
(2) Based on the large power grid model with the new energy nodes constructed in the step (1), predicting the new energy generating capacity by utilizing the LSTM neural network, including preprocessing historical operation data and establishing the LSTM neural network to complete the prediction of the generating capacity and the required load of the new energy unit, and completing the construction of a wind power, photovoltaic and load prediction data set of the time scale required by day-ahead scheduling.
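The claim does not say what the preprocessing of historical operation data consists of. A minimal sketch of one common choice, assumed here purely for illustration, is to fill missing readings and scale the series to [0, 1] before it is fed to a sequence model. The function names `fill_missing` and `min_max_scale` and the sample readings are hypothetical.

```python
def fill_missing(values):
    """Replace None entries with the most recent valid reading (forward fill).

    Leading gaps are filled with the first valid reading.
    """
    last = next(v for v in values if v is not None)
    filled = []
    for v in values:
        if v is None:
            v = last
        filled.append(v)
        last = v
    return filled


def min_max_scale(values):
    """Scale readings to [0, 1]; also return the (min, max) used for scaling."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        return [0.0 for _ in values], (low, high)
    return [(v - low) / span for v in values], (low, high)


# Hypothetical wind power readings in MW, with two missing samples.
raw = [None, 12.0, None, 20.0, 16.0]
filled = fill_missing(raw)
scaled, bounds = min_max_scale(filled)
```

Keeping the `(min, max)` pair makes it possible to map the model's predictions back to physical units.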
3. The optimal scheduling method for a large-scale power grid containing new energy according to claim 2, wherein, in sub-step (2) of step 1, the objective function of the LSTM neural network is set as follows:
one year of wind power data is selected, with 80% used as training data and 20% as test data, and the loss function is optimized during training, thereby completing construction of the wind power, photovoltaic, and load prediction data set at the time scale required for day-ahead scheduling.
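The 80%/20% split and the training loss can be illustrated with a minimal sketch. The objective function of the LSTM is not reproduced in the text, so the mean squared error used here is an assumption, as are the function names `make_windows`, `chronological_split`, and `mse`, the 24-step input window, and the synthetic series standing in for one year of wind power data.

```python
import math


def make_windows(series, lookback):
    """Turn a 1-D series into (input window, next value) pairs for a sequence model."""
    return [(series[i:i + lookback], series[i + lookback])
            for i in range(len(series) - lookback)]


def chronological_split(samples, train_fraction=0.8):
    """Split without shuffling, so the test samples come after the training samples."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


def mse(predictions, targets):
    """Mean squared error, assumed here as the training loss."""
    return sum((p - y) ** 2 for p, y in zip(predictions, targets)) / len(targets)


# Synthetic hourly "wind power" for one year (8760 points); hypothetical data.
series = [50.0 + 30.0 * math.sin(2 * math.pi * h / 24) for h in range(8760)]
samples = make_windows(series, lookback=24)
train, test = chronological_split(samples)

# Persistence baseline on the test set: predict the last value of each window.
baseline = mse([window[-1] for window, _ in test], [y for _, y in test])
```

Splitting in time order rather than at random avoids leaking future values into the training set, which matters for forecasting tasks. The persistence baseline gives a reference error that a trained model should beat.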
4. The optimal scheduling method for a large-scale power grid containing new energy according to claim 1, wherein step 2 specifically comprises:
(1) analyzing the wind power and photovoltaic generation network model, and constructing an objective function that meets the day-ahead scheduling target;
Total cost of system operation:
wherein:
where $F$ is the total operating cost of the power system; $T$ is the number of scheduling periods; $F_{1,t}$ is the operating cost of the conventional units in period $t$; $F_{2,t}$ is the active power loss of the system in period $t$; $F_{3,t}$ is the amount of curtailed wind and photovoltaic power in period $t$; $N$ is the number of thermal power units; $P_{i,h,t}$ is the active output of the $i$-th thermal power unit; $a_i$, $b_i$, $c_i$ are the coefficients of the cost function (its quadratic, linear, and constant terms); $u_{i,t}$ is the on/off state of the thermal power unit; $P_{k,loss,t}$ is the active loss generated in period $t$, and $m$ is the number of branches of the power system; $R$ is the number of renewable energy units; $P_{d,t,w}$ is the output of the wind turbine in period $t$, and $P_{d,t,w,max}$ is its upper output limit in period $t$; $P_{d,t,pv}$ is the output of the photovoltaic unit in period $t$, and $P_{d,t,pvmax}$ is its upper output limit in period $t$;
(2) setting the constraints, including the power balance constraint, the power flow balance constraint, the unit output constraint, the unit ramp constraint, and the output limits of wind power and photovoltaic generation:
where $P_{i,t}$ and $Q_{i,t}$ are the active and reactive power injected at node $i$; $U_i$ is the voltage magnitude of node $i$; $\theta_{ij}$ is the voltage phase angle difference between node $i$ and node $j$; and $G_{ij}$ and $B_{ij}$ are the nodal conductance and susceptance, respectively;
unit output and ramp constraints:
where $r_{i,up}$ is the upward ramp rate of the $i$-th thermal power unit and $r_{i,down}$ is its downward ramp rate;
wind turbine and photovoltaic output constraints:
where $P_{d,t,wmax}$ and $P_{d,t,pvmax}$ are the upper output limits of the wind turbine and the photovoltaic unit, respectively;
(3) converting the optimal scheduling problem into a Markov decision process:
the state observation $s_t$ comprises the thermal power unit output, the active power loss of the system, the scheduling period $t$, and the new energy unit generation and load predicted in step 1;
the action $a_t$ is the unit decision variable, which determines the unit output strategy for the next time step, $a_t = [a_{1,t}, a_{2,t}, \dots, a_{N,t}]$; the selected action is subject to the minimum up/down time limits of the generators;
the reward $R_t$: the reward function is determined from the designed objective function, which requires minimum operating cost and maximum new energy consumption;
$R_t = -[a_1 r_{1,t} + a_2 r_{2,t} + a_3 r_{3,t}]$ (9)
where $R_t$ is the total reward; $r_{1,t}$, $r_{2,t}$, $r_{3,t}$ represent the system operating cost, the active loss, and the new energy consumption, respectively; and $a_1$, $a_2$, $a_3$ are the weights of the respective reward terms.
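The formulas for the objective function and the constraints are not reproduced in the text of this claim. One form consistent with the symbol definitions given in the claim is sketched below. It is an assumption, not the patent's own equations: the summation ranges, the assignment of $a_i$, $b_i$, $c_i$ to the quadratic, linear, and constant terms, the output limits $P_{i,h,\min}$ and $P_{i,h,\max}$, and the use of the standard polar-form power flow equations are all assumed.

```latex
\begin{aligned}
\min F &= \sum_{t=1}^{T} \left( F_{1,t} + F_{2,t} + F_{3,t} \right), \\
F_{1,t} &= \sum_{i=1}^{N} u_{i,t} \left( a_i P_{i,h,t}^{2} + b_i P_{i,h,t} + c_i \right), \\
F_{2,t} &= \sum_{k=1}^{m} P_{k,loss,t}, \\
F_{3,t} &= \sum_{d=1}^{R} \left[ \left( P_{d,t,w,max} - P_{d,t,w} \right)
          + \left( P_{d,t,pvmax} - P_{d,t,pv} \right) \right], \\
P_{i,t} &= U_i \sum_{j} U_j \left( G_{ij} \cos\theta_{ij} + B_{ij} \sin\theta_{ij} \right), \\
Q_{i,t} &= U_i \sum_{j} U_j \left( G_{ij} \sin\theta_{ij} - B_{ij} \cos\theta_{ij} \right), \\
u_{i,t} P_{i,h,\min} \le P_{i,h,t} &\le u_{i,t} P_{i,h,\max}, \\
-r_{i,down} \le P_{i,h,t} - P_{i,h,t-1} &\le r_{i,up}, \\
0 \le P_{d,t,w} \le P_{d,t,wmax}, &\quad 0 \le P_{d,t,pv} \le P_{d,t,pvmax}.
\end{aligned}
```

Under this reading, $F_{3,t}$ measures curtailment, so minimizing $F$ pushes wind and photovoltaic output toward their upper limits.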
5. The optimal scheduling method for a large-scale power grid containing new energy according to claim 1, wherein establishing the optimal scheduling framework in step 3 specifically comprises:
(1) simulating the actual operating conditions of the power grid and building a grid simulation environment; after the simulation environment performs the power flow calculation, obtaining the corresponding observation state space through a program interface call;
(2) taking the states generated in the grid operation scenario as input and searching the action space for the optimal action: the actor network outputs an action $a_t'$, noise is added to obtain the action $a_t = a_t' + N$, $a_t$ is input into the environment to obtain $s_t$, $r_t$, and the resulting experience $(s_t, a_t, s_{t+1}, r_{t+1})$ is stored in the experience pool;
(3) adding noise to the action in the policy network evaluation part: a normal distribution is constructed and sampled according to the shape of the action, the sample is multiplied by a parameter to scale the noise, and the noise is finally added to the action, which is then output, thereby obtaining the TD3-based optimal scheduling framework.
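Sub-steps (2) and (3) describe two places where noise enters a TD3-style agent: exploration noise on the actor's output, and scaled normal noise on the action used during evaluation. A minimal sketch follows. The function names, the noise scales, the normalized action range [-1, 1], and the clipping of the noise (common TD3 practice, not stated in the claim) are all assumptions.

```python
import random


def clip(x, low, high):
    """Limit x to the interval [low, high]."""
    return max(low, min(high, x))


def explore(actor_action, sigma=0.1, low=-1.0, high=1.0, rng=random):
    """a_t = a_t' + N: add Gaussian exploration noise to each action component."""
    return [clip(a + rng.gauss(0.0, sigma), low, high) for a in actor_action]


def smooth_target(target_action, scale=0.2, noise_clip=0.5,
                  low=-1.0, high=1.0, rng=random):
    """Sample standard normal noise with the action's shape, scale it,
    clip it (assumed), add it to the action, and keep the result in range."""
    noisy = []
    for a in target_action:
        eps = clip(rng.gauss(0.0, 1.0) * scale, -noise_clip, noise_clip)
        noisy.append(clip(a + eps, low, high))
    return noisy


rng = random.Random(0)        # fixed seed so that runs are repeatable
a_prime = [0.3, -0.8, 0.95]   # hypothetical actor output for three units
a_t = explore(a_prime, rng=rng)
a_target = smooth_target(a_prime, rng=rng)
```

Scaling a standard normal sample by a parameter is equivalent to sampling from a normal distribution with that standard deviation, which matches the "sample, then multiply" wording of sub-step (3).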
6. An optimal scheduling device for a large-scale power grid containing new energy, characterized by comprising:
a wind power, photovoltaic and load prediction data set construction module, configured to construct a large-scale power grid model containing new energy nodes, predict new energy generation and load demand based on historical operation data, and construct a wind power, photovoltaic, and load prediction data set at the time scale required for day-ahead scheduling;
a Markov decision process conversion generation module, configured to improve the standard nodes based on the constructed wind power, photovoltaic, and load prediction data set and to convert the optimal scheduling problem into a Markov decision process;
an optimal scheduling framework construction module, configured to construct, based on the generated Markov decision process and the characteristics of the historical operation data of the power grid, an agent based on a neural network with adaptive parameters, and to establish an optimal scheduling framework based on the TD3 algorithm;
a unit output plan output module, configured to train and test the agent based on the constructed large-scale power grid model containing new energy nodes and the TD3-based optimal scheduling framework, and to obtain a unit output plan aimed at accommodating new energy generation.
7. The optimal scheduling device for a large-scale power grid containing new energy according to claim 6, wherein the module for constructing the large-scale power grid model containing new energy nodes further comprises:
a large power grid model construction module, configured to analyze the IEEE1888 nodes; extract key data that reflect the basic characteristics of the grid elements, the grid connection relationships, and the production characteristics and operation management of the power system; study the regulation and response characteristics of different types of energy nodes in the grid-connected state; improve the IEEE1888 standard nodes; and run power flow tests on the improved IEEE1888 node model, thereby completing construction of the large power grid model containing new energy nodes;
a wind power, photovoltaic and load prediction data set construction module, configured to predict new energy generation with an LSTM neural network based on the constructed large power grid model containing new energy nodes, which includes preprocessing the historical operation data and building the LSTM neural network to predict the generation of the new energy units and the load demand, thereby completing construction of the wind power, photovoltaic, and load prediction data set at the time scale required for day-ahead scheduling.
8. The optimal scheduling device for a large-scale power grid containing new energy according to claim 6, wherein the Markov decision process conversion generation module further comprises:
an objective function construction module, configured to analyze the wind power and photovoltaic generation network model and construct an objective function that meets the day-ahead scheduling target;
Total cost of system operation:
wherein:
where $F$ is the total operating cost of the power system; $T$ is the number of scheduling periods; $F_{1,t}$ is the operating cost of the conventional units in period $t$; $F_{2,t}$ is the active power loss of the system in period $t$; $F_{3,t}$ is the amount of curtailed wind and photovoltaic power in period $t$; $N$ is the number of thermal power units; $P_{i,h,t}$ is the active output of the $i$-th thermal power unit; $a_i$, $b_i$, $c_i$ are the coefficients of the cost function (its quadratic, linear, and constant terms); $u_{i,t}$ is the on/off state of the thermal power unit; $P_{k,loss,t}$ is the active loss generated in period $t$, and $m$ is the number of branches of the power system; $R$ is the number of renewable energy units; $P_{d,t,w}$ is the output of the wind turbine in period $t$, and $P_{d,t,w,max}$ is its upper output limit in period $t$; $P_{d,t,pv}$ is the output of the photovoltaic unit in period $t$, and $P_{d,t,pvmax}$ is its upper output limit in period $t$;
a constraint setting module, configured to set the power balance constraint, the power flow balance constraint, the unit output constraint, the unit ramp constraint, and the output limits of wind power and photovoltaic generation:
where $P_{i,t}$ and $Q_{i,t}$ are the active and reactive power injected at node $i$; $U_i$ is the voltage magnitude of node $i$; $\theta_{ij}$ is the voltage phase angle difference between node $i$ and node $j$; and $G_{ij}$ and $B_{ij}$ are the nodal conductance and susceptance, respectively;
unit output and ramp constraints:
where $r_{i,up}$ is the upward ramp rate of the $i$-th thermal power unit and $r_{i,down}$ is its downward ramp rate;
wind turbine and photovoltaic output constraints:
where $P_{d,t,wmax}$ and $P_{d,t,pvmax}$ are the upper output limits of the wind turbine and the photovoltaic unit, respectively;
a Markov decision process conversion generation module, configured to convert the optimal scheduling problem into a Markov decision process:
the state observation $s_t$ comprises the thermal power unit output, the active power loss of the system, the scheduling period $t$, and the predicted new energy unit generation and load;
$S_t = \{t, P_t^{WF}, P_t^{PV}, P_t^{H}, P_t^{WF}\}$ (7)
the action $a_t$ is the unit decision variable, which determines the unit output strategy for the next time step, $a_t = [a_{1,t}, a_{2,t}, \dots, a_{N,t}]$; the selected action is subject to the minimum up/down time limits of the generators;
the reward $R_t$: the reward function is determined from the designed objective function, which requires minimum operating cost and maximum new energy consumption;
$R_t = -[a_1 r_{1,t} + a_2 r_{2,t} + a_3 r_{3,t}]$ (9)
where $R_t$ is the total reward; $r_{1,t}$, $r_{2,t}$, $r_{3,t}$ represent the system operating cost, the active loss, and the new energy consumption, respectively; and $a_1$, $a_2$, $a_3$ are the weights of the respective reward terms.
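The reward in equation (9) is the negative weighted sum of three terms, so a higher reward means lower cost, lower loss, and a smaller third term. The sketch below computes it for a single period under assumed forms: a quadratic fuel cost for $r_{1,t}$, a given loss value for $r_{2,t}$, and curtailed wind and photovoltaic power for $r_{3,t}$ (chosen so that a smaller value corresponds to more new energy being consumed). The function names, unit data, and weights are hypothetical, and `apply_ramp_limit` is one simple way to keep an action inside the ramp and output limits, not a method stated in the claim.

```python
def fuel_cost(outputs, on_states, coeffs):
    """Sum of u_i * (a_i * P^2 + b_i * P + c_i) over thermal units (assumed form)."""
    return sum(u * (a * p * p + b * p + c)
               for p, u, (a, b, c) in zip(outputs, on_states, coeffs))


def curtailment(available, used):
    """Curtailed renewable power: upper output limit minus actual output."""
    return sum(p_max - p for p_max, p in zip(available, used))


def apply_ramp_limit(previous, requested, ramp_up, ramp_down, p_min, p_max):
    """Clamp a requested output to the ramp window and the output limits."""
    low = max(p_min, previous - ramp_down)
    high = min(p_max, previous + ramp_up)
    return max(low, min(high, requested))


def reward(cost, loss, curtailed, weights=(1.0, 1.0, 1.0)):
    """R_t = -(a1 * r1 + a2 * r2 + a3 * r3), following equation (9)."""
    a1, a2, a3 = weights
    return -(a1 * cost + a2 * loss + a3 * curtailed)


# Hypothetical single-period example with two thermal units (values in MW).
outputs = [
    apply_ramp_limit(100.0, 140.0, ramp_up=30.0, ramp_down=30.0,
                     p_min=50.0, p_max=200.0),   # request exceeds the ramp window
    80.0,
]
cost = fuel_cost(outputs, on_states=[1, 1],
                 coeffs=[(0.01, 2.0, 10.0), (0.02, 1.5, 5.0)])
curtailed = curtailment(available=[60.0, 40.0], used=[55.0, 40.0])
r_t = reward(cost, loss=3.0, curtailed=curtailed, weights=(0.01, 1.0, 1.0))
```

The weights let the three terms, which have different units and magnitudes, be balanced against each other; the small weight on cost in the example is only there to bring it to the same order of magnitude as the other two terms.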
9. The optimal scheduling device for a large-scale power grid containing new energy according to claim 6, wherein the optimal scheduling framework construction module further comprises:
a grid operating condition simulation module, configured to simulate the actual operating conditions of the power grid and build a grid simulation environment, and, after the simulation environment performs the power flow calculation, to obtain the corresponding observation state space through a program interface call;
an optimal action search module, configured to take the states generated in the grid operation scenario as input and search the action space for the optimal action: the actor network outputs an action $a_t'$, noise is added to obtain the action $a_t = a_t' + N$, $a_t$ is input into the environment to obtain $s_t$, $r_t$, and the resulting experience $(s_t, a_t, s_{t+1}, r_{t+1})$ is stored in the experience pool;
an optimal scheduling framework building module, configured to add noise to the action in the policy network evaluation part: a normal distribution is constructed and sampled according to the shape of the action, the sample is multiplied by a parameter to scale the noise, and the noise is finally added to the action, which is then output, thereby obtaining the TD3-based optimal scheduling framework.
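The experience pool used by the optimal action search module can be sketched as a fixed-capacity buffer with uniform random sampling. This is a minimal illustration under assumptions: the class name `ExperiencePool`, the capacity, the uniform sampling rule, and the scalar states and actions are all hypothetical, and only the tuple layout $(s_t, a_t, s_{t+1}, r_{t+1})$ comes from the claim.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity pool of (s_t, a_t, s_next, r_next) tuples."""

    def __init__(self, capacity, seed=None):
        self._items = deque(maxlen=capacity)   # oldest entries are dropped first
        self._rng = random.Random(seed)

    def put(self, state, action, next_state, reward):
        """Store one experience tuple."""
        self._items.append((state, action, next_state, reward))

    def sample(self, batch_size):
        """Draw a uniform random mini-batch without replacement."""
        return self._rng.sample(list(self._items), batch_size)

    def __len__(self):
        return len(self._items)


pool = ExperiencePool(capacity=3, seed=0)
for step in range(5):
    # Hypothetical scalar states and actions, for illustration only.
    pool.put(state=step, action=0.1 * step, next_state=step + 1,
             reward=-float(step))
batch = pool.sample(2)
```

Sampling at random from stored experiences, rather than training on consecutive steps, reduces the correlation between the samples in each update.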
10. A computer-readable storage medium, characterized by: the storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of the large-scale grid optimization scheduling method according to any one of claims 1 to 5.
CN202310785632.2A 2023-06-29 2023-06-29 Large-scale power grid optimal scheduling method, device and storage medium for new energy Pending CN117039981A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310785632.2A CN117039981A (en) 2023-06-29 2023-06-29 Large-scale power grid optimal scheduling method, device and storage medium for new energy

Publications (1)

Publication Number Publication Date
CN117039981A true CN117039981A (en) 2023-11-10

Family

ID=88634352

Country Status (1)

Country Link
CN (1) CN117039981A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117335414A (en) * 2023-11-24 2024-01-02 杭州鸿晟电力设计咨询有限公司 Method, device, equipment and medium for deciding alternating current optimal power flow of power system
CN117335414B (en) * 2023-11-24 2024-02-27 杭州鸿晟电力设计咨询有限公司 Method, device, equipment and medium for deciding alternating current optimal power flow of power system
CN117522082A (en) * 2024-01-04 2024-02-06 国网山西省电力公司经济技术研究院 Power system operation cost calculation method and system based on standby cost calculation
CN117522082B (en) * 2024-01-04 2024-03-22 国网山西省电力公司经济技术研究院 Power system operation cost calculation method and system based on standby cost calculation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination