CN112330021A - Network coordination control method for a distributed photovoltaic-storage system

Info

Publication number: CN112330021A
Application number: CN202011222997.7A
Authority: CN
Inventors: 吕冬翔; 胡秉晨; 王焘; 朱立宏; 孙子路; 李钊; 贾子熙; 钟豪; 仇海波; 赵彬涛
Original assignee: CETC 18 Research Institute
Current assignee: Cetc Energy Co ltd
Priority date: 2020-11-05
Filing date: 2020-11-05
Publication date: 2021-02-05

Abstract

A network coordination control method for a distributed photovoltaic-storage system, the method comprising the steps of: acquiring and processing real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data of the distributed photovoltaic-storage system; inputting the processed real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data into a long short-term memory (LSTM) neural network model; outputting, by the LSTM neural network model, a predicted photovoltaic power generation value and a predicted load demand value for future moments; acquiring current state-of-charge data of each subsystem in the distributed photovoltaic-storage system; inputting the current state-of-charge data of each subsystem, the predicted photovoltaic power generation value and the predicted load demand value into a deep reinforcement learning model; and performing, by the deep reinforcement learning model, optimized scheduling control of the distributed photovoltaic-storage system. The method can effectively improve the energy utilization efficiency of the system and balance the energy of the power supply modules.

Description

Network coordination control method for a distributed photovoltaic-storage system

Technical Field

The invention belongs to the technical field of distributed energy scheduling control, and particularly relates to a network coordination control method for a distributed photovoltaic-storage system.

Background

A distributed energy network is formed by connecting a large number of power generation, energy storage and power supply devices. Because photovoltaic generation power and load power are often mismatched, surplus electricity is usually stored when photovoltaic output is sufficient and released from the energy storage batteries when photovoltaic output is insufficient, so that the storage stage performs peak shaving and valley filling. However, because photovoltaic cells, energy storage batteries and electrical loads all behave randomly, energy imbalance easily arises in different parts of the energy network, which lowers the utilization of system energy and storage capacity and degrades power quality.

At present, artificial intelligence algorithms are among the popular solutions to these problems, and value-based reinforcement learning algorithms such as Q-learning are widely used in the energy field because they are simple and practical. In Q-learning, an agent interacts with the environment according to preset reward rules and uses the collected rewards to update a Q-table until the Q-table encoding the optimal decisions is found, at which point training is complete. However, because the Q-learning algorithm selects actions from a discrete Q function during training, it has difficulty handling complex problems with continuous or high-dimensional state and action spaces, and a single algorithm without a prediction component has difficulty producing a long-term scheduling plan.
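To make the limitation concrete, the following minimal sketch shows one tabular Q-learning update. The table sizes (5 charge levels, 3 power set-points), the learning rate `alpha` and the discount `gamma` are illustrative assumptions, not values from this document. Because the table is indexed by integers, a continuous gateway power would first have to be discretized.

```python
import numpy as np

def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new Q-value."""
    # Temporal-difference target: immediate reward plus discounted best next value.
    td_target = reward + gamma * np.max(q_table[next_state])
    q_table[state, action] += alpha * (td_target - q_table[state, action])
    return q_table[state, action]

# Hypothetical discretization: 5 charge levels (states), 3 power set-points (actions).
q_table = np.zeros((5, 3))
new_value = q_learning_update(q_table, state=2, action=1, reward=1.0, next_state=3)
```

The table grows with the product of the number of discrete states and actions, which is why a finer discretization of power quickly becomes impractical.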

Disclosure of Invention

In order to solve the above problems, the present invention provides a network coordination control method for a distributed photovoltaic-storage system, the method including:

acquiring and processing real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data of the distributed photovoltaic-storage system;

inputting the processed real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data into a long short-term memory (LSTM) neural network model;

outputting, by the LSTM neural network model, a predicted photovoltaic power generation value and a predicted load demand value for future moments;

acquiring current state-of-charge data of each subsystem in the distributed photovoltaic-storage system;

inputting the current state-of-charge data of each subsystem, the predicted photovoltaic power generation value and the predicted load demand value into a deep reinforcement learning model;

and performing, by the deep reinforcement learning model, optimized scheduling control of the distributed photovoltaic-storage system.

Preferably, the acquiring and processing of the real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data of the distributed photovoltaic-storage system includes the steps of:

collecting the real-time photovoltaic power generation data and the real-time load power sequence of node i in the distributed photovoltaic-storage system at the current moment, and the historical photovoltaic power generation data and the historical load power sequence of the previous k moments;

normalizing the real-time photovoltaic power generation data and the historical photovoltaic power generation data respectively;

and one-hot encoding the real-time load power sequence and the historical load power sequence respectively.

Preferably, the expression of the real-time photovoltaic power generation data of node i at the current moment is as follows:

$$G_{ik}(T) = \{p_{ig}(T),\ p_{ig}(T-1),\ \dots,\ p_{ig}(T-k)\};$$

the expression of the historical load power sequence at the k previous moments is as follows:

$$L_{ik}(T) = \{p_{io}(T),\ p_{io}(T-1),\ \dots,\ p_{io}(T-k)\};$$

wherein $G_{ik}(T)$ represents the real-time photovoltaic power generation data of node i at the current time; $L_{ik}(T)$ represents the historical load power sequence at the k previous times; i = 1, 2, 3; k = 10; T represents the current time; p denotes power; $p_{ig}$ represents the photovoltaic generation power at each time node from the current time back to the k-th previous time node; and $p_{io}$ represents the load power in the historical data at each time node from time T back to the k-th previous time node.
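As an illustration of how such a sequence could be assembled from logged samples, the sketch below slices a one-dimensional power series into the window {x(T), x(T-1), ..., x(T-k)}. The function name `history_window` and the sample data are hypothetical; only the ordering and the value k = 10 come from the expressions above.

```python
import numpy as np

def history_window(series, t, k=10):
    """Return [x(t), x(t-1), ..., x(t-k)] from a 1-D power series."""
    if t - k < 0 or t >= len(series):
        raise ValueError("need k earlier samples and a valid current index")
    # Slice the k+1 most recent samples, then reverse so the newest comes first.
    return np.asarray(series[t - k:t + 1], dtype=float)[::-1]

pv_power = np.arange(20, dtype=float)   # made-up PV samples of one node
g_ik = history_window(pv_power, t=15, k=10)
```

The same helper applies unchanged to a load power series to obtain the load sequence.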

Preferably, the real-time photovoltaic power generation data and the historical photovoltaic power generation data are normalized in the same way, each power value being divided by a set maximum value, so that the normalized data are expressed as:

$$\tilde{G}_{ik}(T) = \left\{\frac{p_{ig}(T)}{p_{ig}^{*}},\ \frac{p_{ig}(T-1)}{p_{ig}^{*}},\ \dots,\ \frac{p_{ig}(T-k)}{p_{ig}^{*}}\right\};$$

wherein $\tilde{G}_{ik}(T)$ represents the normalized photovoltaic power generation data, its first element being the normalized real-time value and its remaining elements the normalized historical values; $p_{ig}$ represents the photovoltaic generation power at each time node from the current time back to the k-th previous time node; $p_{ig}^{*}$ represents the set maximum value of the photovoltaic power generation data; $G_{ik}(T)$ represents the photovoltaic power generation data of node i at the current time; i = 1, 2, 3; k = 10; T represents the current time; and p denotes power.

Preferably, the expression of the real-time load power sequence is:

$$T_k = \{T,\ T-1,\ T-2,\ \dots,\ T-k\};$$

the expression of the historical load power sequence is as follows:

$$M_k = \{M_0,\ M_1,\ M_2,\ \dots,\ M_k\};$$

wherein T represents a time node, $T_k$ represents the real-time load power sequence up to the k-th time node, k represents the number of time nodes, and $M_k$ represents the historical load power sequence up to the k-th time node.

Preferably, the step of inputting the processed real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data into the LSTM neural network model includes the steps of:

designing the LSTM neural network model;

forming an input matrix from the normalized real-time and historical photovoltaic power generation data and the one-hot encoded real-time and historical load power sequences;

and inputting the input matrix into the LSTM neural network model.

Preferably, the predicted photovoltaic power generation value and the predicted load demand value are each expressed as a sequence of predicted power values over the next j time nodes, wherein j is 10; p denotes power; $p_{ig}$ represents the short-term photovoltaic generation power at each time node from the current time back to the k-th previous time node; $p_{io}$ represents the short-term load power in the historical data at each time node from time T back to the k-th previous time node; and T represents the time node.

Preferably, the optimized scheduling control of the distributed photovoltaic-storage system by the deep reinforcement learning model includes the steps of:

combining the current state of charge of each power module of the distributed photovoltaic-storage system, the predicted photovoltaic power generation value and the predicted load demand value into state variables;

defining action variables and a reward function;

performing deep reinforcement learning training with the policy-based proximal policy optimization (PPO) algorithm;

and interacting with the environment in a loop and collecting the corresponding states, actions and advantage function values, so as to find the optimal action policy $\pi_\theta$ by the policy gradient method.

Preferably, the expression of the action variable is:

$$\mathrm{Action}(T) = \{a_1,\ a_2,\ a_3\},\quad |a_l| \in [p_{t\min},\ p_{t\max}];$$

wherein p denotes the power at time T; $a_1$, $a_2$ and $a_3$ represent the action variables at a given time; $p_{t\min}$ represents the minimum power at time t; and $p_{t\max}$ represents the maximum power at time t.

Preferably, the reward function is expressed in terms of the following quantities: p denotes the power at time T, T denotes the current time, C is the compensation factor of the reward function, $w_1$ denotes a coefficient of the reward function, and m denotes the process constants of the different scenarios in the reward function.

By combining a neural network with deep reinforcement learning and applying the combination to energy management of a distributed electric energy network, the network coordination control method for a distributed photovoltaic-storage system can effectively improve the energy utilization efficiency of the system and balance the energy of the power supply modules.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.

Fig. 1 is a schematic diagram of the network coordination control method for a distributed photovoltaic-storage system provided by the present invention;

Fig. 2 is a schematic diagram of the electric energy network in the network coordination control method for a distributed photovoltaic-storage system provided by the present invention;

Fig. 3 is a schematic diagram of the LSTM neural network prediction model in the network coordination control method for a distributed photovoltaic-storage system provided by the present invention;

Fig. 4 is a flowchart of the deep learning algorithm in the network coordination control method for a distributed photovoltaic-storage system provided by the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.

As shown in Figs. 1 to 4, in an embodiment of the present application, the present invention provides a network coordination control method for a distributed photovoltaic-storage system, the method including the steps of:

S1: acquiring and processing real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data of the distributed photovoltaic-storage system;

S2: inputting the processed real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data into a long short-term memory (LSTM) neural network model;

S3: outputting, by the LSTM neural network model, a predicted photovoltaic power generation value and a predicted load demand value for future moments;

S4: acquiring current state-of-charge data of each subsystem in the distributed photovoltaic-storage system;

S5: inputting the current state-of-charge data of each subsystem, the predicted photovoltaic power generation value and the predicted load demand value into a deep reinforcement learning model;

S6: performing, by the deep reinforcement learning model, optimized scheduling control of the distributed photovoltaic-storage system.

When network coordination control is performed on the distributed photovoltaic-storage system, the real-time and historical photovoltaic and load data are first acquired and processed (S1) and input into the LSTM neural network model (S2), which outputs the predicted photovoltaic power generation value and the predicted load demand value for future moments (S3). The current state-of-charge data of each subsystem is then acquired (S4) and input, together with the two predicted values, into the deep reinforcement learning model (S5), which performs optimized scheduling control of the distributed photovoltaic-storage system (S6).

In this embodiment of the present application, the acquiring and processing of the real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data in step S1 proceeds as follows. First, the real-time photovoltaic power generation data and the real-time load power sequence of node i in the distributed photovoltaic-storage system at the current moment are collected, together with the historical photovoltaic power generation data and the historical load power sequence of the previous k moments. The real-time and historical photovoltaic power generation data are then each normalized, and the real-time and historical load power sequences are then each one-hot encoded.

In this embodiment of the present application, the expression of the real-time photovoltaic power generation data of node i at the current time is as follows:

$$G_{ik}(T) = \{p_{ig}(T),\ p_{ig}(T-1),\ \dots,\ p_{ig}(T-k)\};$$

the expression of the historical load power sequence at the k previous moments is as follows:

$$L_{ik}(T) = \{p_{io}(T),\ p_{io}(T-1),\ \dots,\ p_{io}(T-k)\};$$

wherein $G_{ik}(T)$ represents the real-time photovoltaic power generation data of node i at the current time; $L_{ik}(T)$ represents the historical load power sequence at the k previous times; i = 1, 2, 3; k = 10; T represents the current time; p denotes power; $p_{ig}$ represents the photovoltaic generation power at each time node from the current time back to the k-th previous time node; and $p_{io}$ represents the load power in the historical data at each time node from time T back to the k-th previous time node.

In this embodiment of the present application, the photovoltaic power generation data are normalized by dividing each power value by a set maximum value, so that the normalized data are expressed as:

$$\tilde{G}_{ik}(T) = \left\{\frac{p_{ig}(T)}{p_{ig}^{*}},\ \frac{p_{ig}(T-1)}{p_{ig}^{*}},\ \dots,\ \frac{p_{ig}(T-k)}{p_{ig}^{*}}\right\};$$

wherein $\tilde{G}_{ik}(T)$ represents the normalized photovoltaic power generation data, covering the real-time value and the historical values; $p_{ig}$ represents the photovoltaic generation power at each time node from the current time back to the k-th previous time node; $p_{ig}^{*}$ represents the set maximum value of the photovoltaic power generation data; $G_{ik}(T)$ represents the photovoltaic power generation data of node i at the current time; i = 1, 2, 3; k = 10; T represents the current time; and p denotes power.

In this embodiment of the present application, the expression of the real-time load power sequence is:

$$T_k = \{T,\ T-1,\ T-2,\ \dots,\ T-k\};$$

the expression of the historical load power sequence is as follows:

$$M_k = \{M_0,\ M_1,\ M_2,\ \dots,\ M_k\};$$

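In an embodiment of the present application, the inputting of the processed real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data into the LSTM neural network model includes: designing the LSTM neural network model; forming an input matrix from the normalized photovoltaic power generation data and the one-hot encoded load power sequences; and inputting the input matrix into the LSTM neural network model.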

In the embodiment of the present application, the predicted photovoltaic power generation value and the predicted load demand value are each expressed as a sequence of predicted power values over the next j = 10 time nodes.

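In an embodiment of the present application, the optimized scheduling control of the distributed photovoltaic-storage system by the deep reinforcement learning model includes: combining the current state of charge of each power module, the predicted photovoltaic power generation value and the predicted load demand value into state variables; defining action variables and a reward function; performing training with the policy-based proximal policy optimization (PPO) algorithm; and interacting with the environment in a loop and collecting the corresponding states, actions and advantage function values, so as to find the optimal action policy $\pi_\theta$ by the policy gradient method.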

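In the embodiment of the present application, the expression of the action variable is:

$$\mathrm{Action}(T) = \{a_1,\ a_2,\ a_3\},\quad |a_l| \in [p_{t\min},\ p_{t\max}];$$

In the embodiment of the present application, the reward function is expressed in terms of the power p at time T, the current time T, the compensation factor C, the coefficient $w_1$ and the process constants m of the different scenarios.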

The present application is described in detail below with specific examples.

As shown in fig. 1, the distributed photovoltaic-storage system is formed by connecting 3 energy modules, each with power generation, energy storage and energy supply functions, as nodes. The nodes are connected through power gateways, which have a power transmission function and can continuously transmit power from one node to another. For the system to operate efficiently and in a balanced manner, the transmission power and direction of each power gateway must be controlled so that the energy of the nodes complements one another.

A network coordination control method for the distributed photovoltaic-storage system is therefore provided. The basic idea is as follows: a prediction model predicts the photovoltaic generation power values and load power values at 10 future moments from historical information; the predicted power value sequences and the current system state information are then input into a proximal policy optimization (PPO) model of deep reinforcement learning to obtain the control actions of the 3 power gateways. Because predicted future information is used, the actions tend toward the long-term optimum.

The network coordination control method for a distributed photovoltaic-storage system provided by the invention specifically comprises the following steps:

(1) Design of the LSTM neural network energy prediction model

The photovoltaic generation power and the load power are correlated over time, so the photovoltaic and load power values at later moments can be predicted from the power values at the 10 preceding moments. A long short-term memory (LSTM) neural network, which can learn long-term dependencies, is adopted for this purpose. Photovoltaic generation and load power values are continuous variables whose magnitudes can differ greatly. To prevent these differences in scale from distorting the prediction result, the input power values are mapped into the range [0, 1] by normalization.

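$$G_{ik}(T) = \{p_{ig}(T),\ p_{ig}(T-1),\ \dots,\ p_{ig}(T-k)\}$$

$$L_{ik}(T) = \{p_{io}(T),\ p_{io}(T-1),\ \dots,\ p_{io}(T-k)\}$$

$$i = 1, 2, 3, \quad k = 10$$

Wherein $G_{ik}(T)$ and $L_{ik}(T)$ are the photovoltaic generation and load power value sequences of node i at the current time T and the k previous times.

$$\tilde{G}_{ik}(T) = \left\{\frac{p_{ig}(T)}{p_{ig}^{*}},\ \frac{p_{ig}(T-1)}{p_{ig}^{*}},\ \dots,\ \frac{p_{ig}(T-k)}{p_{ig}^{*}}\right\}$$

$$\tilde{L}_{ik}(T) = \left\{\frac{p_{io}(T)}{p_{io}^{*}},\ \frac{p_{io}(T-1)}{p_{io}^{*}},\ \dots,\ \frac{p_{io}(T-k)}{p_{io}^{*}}\right\}$$

Wherein $\tilde{G}_{ik}(T)$ and $\tilde{L}_{ik}(T)$ are respectively the normalized photovoltaic generation and load power sequences, and $p_{ig}^{*}$ and $p_{io}^{*}$ are respectively the set maximum values of the photovoltaic generation power and the load power.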
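A minimal sketch of this scaling step follows, assuming the set maximum is supplied as a positive constant. Clipping values that exceed the set maximum is an added assumption that the text does not state; it only keeps the output inside [0, 1].

```python
import numpy as np

def normalize_by_max(window, p_max):
    """Scale power values into [0, 1] by a preset maximum p_max."""
    if p_max <= 0:
        raise ValueError("p_max must be positive")
    scaled = np.asarray(window, dtype=float) / p_max
    # Assumption: readings above the preset maximum are clipped to 1.0.
    return np.clip(scaled, 0.0, 1.0)

normalized = normalize_by_max([0.0, 2.5, 5.0, 6.0], p_max=5.0)
```

Using a fixed preset maximum rather than the maximum of each window keeps the scaling identical across windows, so the same physical power always maps to the same input value.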

In addition, the time sequence and the task sequence are discrete sequences whose values can only be discrete integers, and the distance between any two values in a sequence should be the same. One-hot encoding is therefore used to map the values of these discrete features into Euclidean space, so that each value of a discrete feature corresponds to one point in that space.

$$T_k = \{T,\ T-1,\ T-2,\ \dots,\ T-k\}$$

$$M_k = \{M_0,\ M_1,\ M_2,\ \dots,\ M_k\}$$

$$k = 10$$

Wherein $T_k$ and $M_k$ are respectively the time sequence and the task number sequence, each of which corresponds to a one-hot code sequence after one-hot encoding.

The normalized photovoltaic generation power values, the load power values $L_{ik}(T)$, the one-hot encoded time sequence and the one-hot encoded task number sequence compose an input matrix. The input matrix is fed into the LSTM neural network model to obtain predictions of the photovoltaic generation power values and the load power values of each node at 10 future moments.
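The encoding and the assembly of the input matrix might look like the sketch below. The column layout (one row per time step, features side by side), the 24 time slots and the 4 task classes are assumptions made for illustration; the text specifies only which four components enter the matrix.

```python
import numpy as np

def one_hot(indices, num_classes):
    """Map integer labels to rows with a single 1.0 at the label position."""
    indices = np.asarray(indices, dtype=int)
    encoded = np.zeros((indices.size, num_classes))
    encoded[np.arange(indices.size), indices] = 1.0
    return encoded

def build_input_matrix(g_norm, l_vals, time_idx, task_idx, n_time=24, n_task=4):
    """Stack per-step features column-wise: [PV, load, one-hot time, one-hot task]."""
    columns = [
        np.asarray(g_norm, dtype=float)[:, None],
        np.asarray(l_vals, dtype=float)[:, None],
        one_hot(time_idx, n_time),
        one_hot(task_idx, n_task),
    ]
    return np.hstack(columns)

k = 10
current_slot = 30                                   # made-up value of T
time_idx = (current_slot - np.arange(k + 1)) % 24   # T, T-1, ..., T-k as hour slots
task_idx = np.arange(k + 1) % 4                     # made-up task numbers M_0..M_k
g_norm = np.linspace(1.0, 0.0, k + 1)               # placeholder normalized PV values
l_vals = np.full(k + 1, 0.5)                        # placeholder load values
x = build_input_matrix(g_norm, l_vals, time_idx, task_idx)
```

With one-hot rows, any two distinct time slots or task numbers are the same Euclidean distance apart, which is the equal-distance property the paragraph above asks for.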

The LSTM neural network model uses 3 LSTM layers, each containing 20 neurons. Each neuron uses a leaky ReLU activation function, and the output of each LSTM layer is followed by a batch normalization operation to help the neural network converge.
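The forward pass of such a stack can be sketched in plain NumPy as follows. Several details are assumptions, since the text does not specify them: leaky ReLU replaces the usual tanh on the candidate and cell outputs, batch normalization is applied without learned scale and shift, weights are random, and the final output layer that would map the features to 10 predicted values is omitted.

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def batch_norm(x, eps=1e-5):
    """Normalize each feature over the batch axis (no learned scale or shift)."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def lstm_layer(x_seq, w, u, b, hidden):
    """Run one LSTM layer over x_seq of shape (steps, batch, features)."""
    steps, batch, _ = x_seq.shape
    h = np.zeros((batch, hidden))
    c = np.zeros((batch, hidden))
    outputs = []
    for t in range(steps):
        gates = x_seq[t] @ w + h @ u + b            # shape (batch, 4 * hidden)
        i, f, o, g = np.split(gates, 4, axis=1)     # input, forget, output, candidate
        c = sigmoid(f) * c + sigmoid(i) * leaky_relu(g)
        h = sigmoid(o) * leaky_relu(c)
        outputs.append(batch_norm(h))               # normalized layer output
    return np.stack(outputs)

def init_stack(n_features, hidden=20, layers=3, seed=0):
    """Create random weights for a stack of LSTM layers (illustrative only)."""
    rng = np.random.default_rng(seed)
    params, size = [], n_features
    for _ in range(layers):
        params.append((rng.normal(0, 0.1, (size, 4 * hidden)),
                       rng.normal(0, 0.1, (hidden, 4 * hidden)),
                       np.zeros(4 * hidden)))
        size = hidden
    return params

def forward(x_seq, params, hidden=20):
    for w, u, b in params:
        x_seq = lstm_layer(x_seq, w, u, b, hidden)
    return x_seq[-1]                                # last step, shape (batch, hidden)

rng = np.random.default_rng(1)
x_seq = rng.normal(size=(11, 8, 30))                # 11 steps, batch of 8, 30 features
features = forward(x_seq, init_stack(n_features=30))
```

In practice a deep learning framework would supply trainable layers and an optimizer; the sketch is meant only to show how the three stated design choices fit together.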

(2) Design of coordination control algorithm based on deep reinforcement learning

On the basis of the prediction information, the deep reinforcement learning algorithm can choose, at the current moment, the action that gives the best overall value over a future period. Here, the current state of charge $SoC_i$ (i = 1, 2, 3) of each power module of the system and the predicted photovoltaic generation and load power values for the next 10 moments are combined into the state variable.
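One way to assemble such a state is to flatten the three components into a single vector, as in the sketch below. The flat layout and the function name `build_state` are assumptions; the text states only which quantities are combined.

```python
import numpy as np

def build_state(soc, pv_forecast, load_forecast):
    """Concatenate SoC (3,), PV forecast (3, 10) and load forecast (3, 10)."""
    soc = np.asarray(soc, dtype=float)
    pv = np.asarray(pv_forecast, dtype=float)
    load = np.asarray(load_forecast, dtype=float)
    if pv.shape != load.shape or pv.shape[0] != soc.shape[0]:
        raise ValueError("forecasts must share a shape with one row per node")
    return np.concatenate([soc.ravel(), pv.ravel(), load.ravel()])

state = build_state(
    soc=[0.4, 0.6, 0.8],                 # made-up states of charge
    pv_forecast=np.full((3, 10), 0.3),   # placeholder predicted PV power
    load_forecast=np.full((3, 10), 0.2), # placeholder predicted load power
)
```

For 3 nodes and 10 predicted moments this gives a vector of length 3 + 30 + 30 = 63.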

The control action is carried out by the power gateways. The transmission power of a power gateway lies in the range $[p_{t\min}, p_{t\max}]$, within which the direction and magnitude of the power flow can be controlled continuously. The action variables are defined as:

$$\mathrm{Action}(T) = \{a_1,\ a_2,\ a_3\},\quad |a_l| \in [p_{t\min},\ p_{t\max}].$$
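The constraint on $|a_l|$ can be enforced when a policy output is converted to a gateway set-point. The sketch below assumes the policy emits one raw value per gateway in [-1, 1], with the sign giving the flow direction; that output convention and the linear mapping are assumptions, not part of the original text.

```python
import numpy as np

def to_gateway_power(raw, p_tmin, p_tmax):
    """Map raw outputs in [-1, 1] to signed powers with |a| in [p_tmin, p_tmax]."""
    raw = np.clip(np.asarray(raw, dtype=float), -1.0, 1.0)
    magnitude = p_tmin + np.abs(raw) * (p_tmax - p_tmin)
    # A raw value of exactly 0 is treated as the positive direction.
    direction = np.where(raw >= 0, 1.0, -1.0)
    return direction * magnitude

actions = to_gateway_power([0.5, -1.0, 0.0], p_tmin=1.0, p_tmax=5.0)
```

Here the three entries correspond to $a_1$, $a_2$ and $a_3$, and a negative value means power flows in the reverse direction through that gateway.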

The reward function considers both the predicted residual energy and the balance within the system under a given action, and so provides the basis for judging that action. The energy of the power supply module with the smallest predicted residual energy in the network is used as the reward term, and the variance across the power supply modules is introduced as a penalty term.

In the reward function, $w_1$ and $w_2$ are the weight coefficients of the respective terms. The first term is the predicted energy value, after 10 moments, of the node with the least energy in the electric energy network, where $C_i$ is the current energy value of node i and $\Delta T$ is the time interval between adjacent moments. The second term, $\varepsilon(SoC_{1\sim3}(T))$, is the variance of the current state of charge of the nodes.
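The exact expression is not reproduced in this text, so the following is only one plausible form consistent with the description. It assumes the predicted energy of node i is $C_i$ plus $\Delta T$ times the summed difference between predicted generation and predicted load over the 10 moments, and that the penalty is subtracted. The effect of the gateway action on each node's energy is left out for brevity.

```python
import numpy as np

def balance_reward(energy_now, pv_forecast, load_forecast, soc, dt,
                   w1=1.0, w2=1.0):
    """Reward = w1 * (lowest predicted node energy) - w2 * (variance of SoC)."""
    energy_now = np.asarray(energy_now, dtype=float)
    net_power = np.asarray(pv_forecast, dtype=float) - np.asarray(load_forecast, dtype=float)
    # Assumed energy model: current energy plus net predicted energy over the horizon.
    predicted_energy = energy_now + dt * net_power.sum(axis=1)
    return w1 * predicted_energy.min() - w2 * np.var(np.asarray(soc, dtype=float))

pv = np.ones((3, 10))                                   # made-up predicted PV power
load = np.array([[1.0] * 10, [1.5] * 10, [0.5] * 10])   # made-up predicted load power
r = balance_reward([10.0, 8.0, 12.0], pv, load, soc=[0.2, 0.5, 0.8], dt=1.0)
```

Rewarding the weakest node rather than the total encourages the policy to move energy toward whichever module is about to run short, while the variance penalty discourages uneven states of charge.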

Based on the above design, the deep reinforcement learning model is constructed and trained with the policy-based proximal policy optimization (PPO) algorithm; the training process is shown in FIG. 3. Through continuous interaction with the environment and collection of the corresponding states, actions and advantage function values, the optimal action policy $\pi_\theta$ is found by the policy gradient method. If the actions generated by the new and old policies differ too much, training may fail to converge; the change between the new and old policies is therefore limited by introducing a KL penalty term whose coefficient is adjusted adaptively. The policy is updated continuously until the optimal policy for system operation is obtained.
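The KL-penalized objective and the adaptive coefficient can be sketched as below. The doubling and halving rule with a 1.5 tolerance follows the commonly published adaptive-KL variant of PPO; the text does not give its own constants, so `factor`, `tolerance` and `kl_target` are assumptions.

```python
import numpy as np

def kl_penalized_objective(ratio, advantage, kl, beta):
    """Mean of ratio * advantage minus beta times the mean KL divergence."""
    ratio = np.asarray(ratio, dtype=float)
    advantage = np.asarray(advantage, dtype=float)
    return float(np.mean(ratio * advantage) - beta * np.mean(kl))

def adapt_beta(beta, mean_kl, kl_target, factor=2.0, tolerance=1.5):
    """Raise beta when the policy moved too far, lower it when it moved too little."""
    if mean_kl > kl_target * tolerance:
        return beta * factor
    if mean_kl < kl_target / tolerance:
        return beta / factor
    return beta

objective = kl_penalized_objective(
    ratio=[1.0, 1.2],        # new-policy / old-policy probability ratios
    advantage=[1.0, -0.5],   # advantage estimates for the sampled actions
    kl=[0.0, 0.02],          # per-sample KL divergence between old and new policy
    beta=2.0,
)
```

After each policy update, `adapt_beta` would be called with the measured mean KL divergence so that the penalty tracks how far the policy actually moved.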

Verification shows that when the electric energy network shown in fig. 1 is controlled with this coordination control algorithm, the generated power and the load power of each power module in the network can be accurately predicted, and energy utilization efficiency and balance are greatly improved.

It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

Claims

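1. A network coordination control method for a distributed photovoltaic-storage system, the method comprising the steps of:

acquiring and processing real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data of the distributed photovoltaic-storage system;

inputting the processed real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data into a long short-term memory (LSTM) neural network model;

outputting, by the LSTM neural network model, a predicted photovoltaic power generation value and a predicted load demand value for future moments;

acquiring current state-of-charge data of each subsystem in the distributed photovoltaic-storage system;

inputting the current state-of-charge data of each subsystem, the predicted photovoltaic power generation value and the predicted load demand value into a deep reinforcement learning model;

and performing, by the deep reinforcement learning model, optimized scheduling control of the distributed photovoltaic-storage system.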

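2. The network coordination control method for a distributed photovoltaic-storage system according to claim 1, wherein the acquiring and processing of the real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data of the distributed photovoltaic-storage system comprises the steps of:

collecting the real-time photovoltaic power generation data and the real-time load power sequence of node i in the distributed photovoltaic-storage system at the current moment, and the historical photovoltaic power generation data and the historical load power sequence of the previous k moments;

normalizing the real-time photovoltaic power generation data and the historical photovoltaic power generation data respectively;

and one-hot encoding the real-time load power sequence and the historical load power sequence respectively.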

3. The network coordination control method for a distributed photovoltaic-storage system according to claim 2, wherein the expression of the real-time photovoltaic power generation data of node i at the current time is as follows:

$$G_{ik}(T) = \{p_{ig}(T),\ p_{ig}(T-1),\ \dots,\ p_{ig}(T-k)\};$$

the expression of the historical load power sequence at the k previous moments is as follows:

$$L_{ik}(T) = \{p_{io}(T),\ p_{io}(T-1),\ \dots,\ p_{io}(T-k)\};$$

4. The network coordination control method for a distributed photovoltaic-storage system according to claim 2, wherein the photovoltaic power generation data are normalized by dividing each power value by a set maximum value, the normalized data being expressed as:

$$\tilde{G}_{ik}(T) = \left\{\frac{p_{ig}(T)}{p_{ig}^{*}},\ \frac{p_{ig}(T-1)}{p_{ig}^{*}},\ \dots,\ \frac{p_{ig}(T-k)}{p_{ig}^{*}}\right\};$$

wherein $\tilde{G}_{ik}(T)$ represents the normalized photovoltaic power generation data, covering the real-time value and the historical values; $p_{ig}$ represents the photovoltaic generation power at each time node from the current time back to the k-th previous time node; $p_{ig}^{*}$ represents the set maximum value of the photovoltaic power generation data; $G_{ik}(T)$ represents the photovoltaic power generation data of node i at the current time; i = 1, 2, 3; k = 10; T represents the current time; and p denotes power.

5. The network coordination control method for a distributed photovoltaic-storage system according to claim 2, wherein the expression of the real-time load power sequence is:

$$T_k = \{T,\ T-1,\ T-2,\ \dots,\ T-k\};$$

the expression of the historical load power sequence is as follows:

$$M_k = \{M_0,\ M_1,\ M_2,\ \dots,\ M_k\};$$

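6. The network coordination control method for a distributed photovoltaic-storage system according to claim 1, wherein the step of inputting the processed real-time photovoltaic data, real-time load data, historical photovoltaic data and historical load data into the LSTM neural network model comprises the steps of:

designing the LSTM neural network model;

forming an input matrix from the normalized real-time and historical photovoltaic power generation data and the one-hot encoded real-time and historical load power sequences;

and inputting the input matrix into the LSTM neural network model.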

7. The network coordination control method for a distributed photovoltaic-storage system according to claim 1, wherein the predicted photovoltaic power generation value and the predicted load demand value are each expressed as a sequence of predicted power values over the next j time nodes, j being 10.

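8. The network coordination control method for a distributed photovoltaic-storage system according to claim 1, wherein the optimized scheduling control of the distributed photovoltaic-storage system by the deep reinforcement learning model comprises the steps of:

combining the current state of charge of each power module of the distributed photovoltaic-storage system, the predicted photovoltaic power generation value and the predicted load demand value into state variables;

defining action variables and a reward function;

performing deep reinforcement learning training with the policy-based proximal policy optimization (PPO) algorithm;

and interacting with the environment in a loop and collecting the corresponding states, actions and advantage function values, so as to find the optimal action policy $\pi_\theta$ by the policy gradient method.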

9. The network coordination control method for a distributed photovoltaic-storage system according to claim 8, wherein the expression of the action variable is:

$$\mathrm{Action}(T) = \{a_1,\ a_2,\ a_3\},\quad |a_l| \in [p_{t\min},\ p_{t\max}];$$

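10. The network coordination control method for a distributed photovoltaic-storage system according to claim 8, wherein the reward function is expressed in terms of the power p at time T, the current time T, a compensation factor C of the reward function, a coefficient $w_1$ of the reward function, and process constants m of the different scenarios in the reward function.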