CN113779871A - Electric heating coupling system scheduling method and device, electronic equipment and storage medium thereof - Google Patents

Electric heating coupling system scheduling method and device, electronic equipment and storage medium thereof

Info

Publication number
CN113779871A
CN113779871A (application CN202110989053.0A)
Authority
CN
China
Prior art keywords
coupling system
reinforcement learning
learning network
control
electric heating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110989053.0A
Other languages
Chinese (zh)
Inventor
孙宏斌
王宣元
席嫣娜
郭庆来
宁卜
张�浩
张宏宇
王彬
刘庆时
赵昊天
刘蓁
韦凌霄
潘昭光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
State Grid Jibei Electric Power Co Ltd
State Grid Beijing Electric Power Co Ltd
Original Assignee
Tsinghua University
State Grid Jibei Electric Power Co Ltd
State Grid Beijing Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, State Grid Jibei Electric Power Co Ltd, State Grid Beijing Electric Power Co Ltd filed Critical Tsinghua University
Priority to CN202110989053.0A priority Critical patent/CN113779871A/en
Publication of CN113779871A publication Critical patent/CN113779871A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00 Computer-aided design [CAD]
    • G06F 30/20 Design optimisation, verification or simulation
    • G06F 30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 Administration; Management
    • G06Q 10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q 10/063 Operations research, analysis or management
    • G06Q 10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q 50/06 Energy or water supply
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2113/00 Details relating to the application field
    • G06F 2113/14 Pipes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F 2119/08 Thermal analysis or thermal optimisation


Abstract

The application belongs to the technical field of integrated energy system operation control, and particularly relates to an electric-thermal coupling system scheduling method and apparatus, an electronic device, and a storage medium. First, a reinforcement learning network for scheduling the electric-thermal coupling system is constructed; measurement data in the electric-thermal coupling system are collected in real time, the reinforcement learning network is trained according to the measurement data and the response of the electric-thermal coupling system to control signals, and the parameters of the reinforcement learning network are updated; finally, the trained reinforcement learning network outputs actions according to the measurement data collected in real time to control the electric-thermal coupling system. The method overcomes the shortcomings of traditional model-based optimization and of conventional reinforcement learning algorithms: reinforcement learning based on additional memory does not depend on an accurate building model, it can cope with the learning difficulty caused by the large time delay of heat transfer in the electric-thermal coupling system, it exploits load-side flexibility to the greatest extent, and it is suitable for online application.

Description

Electric heating coupling system scheduling method and device, electronic equipment and storage medium thereof
Technical Field
The application belongs to the technical field of integrated energy system operation control, and particularly relates to an electric-thermal coupling system scheduling method and apparatus, an electronic device, and a storage medium.
Background Art
The concepts of the energy internet and the integrated energy system were developed to improve the utilization efficiency of various energy sources, fully exploit their flexibility, reduce carbon emissions, and increase the penetration of renewable energy. The most common electric-thermal coupling system is a heating system that uses an electric boiler and a combined heat and power unit as heat sources. In such a heating system the thermal load is mostly building load, which has large thermal inertia. If the thermal inertia of buildings is taken into account in combined electricity-heat dispatch, the total cost can be reduced, flexibility can be provided to the power system, and renewable-energy curtailment can be reduced. Traditional model-based optimization for heating-system scheduling requires accurate models. However, the thermal characteristics of buildings are complex and affected by many factors, and an accurate model is difficult to obtain. As a result, traditional model-based optimization may fail to satisfy the heating requirements of buildings, may produce schedules that are difficult to implement, and incurs high modeling costs.
In recent years, reinforcement learning, a model-free or weak-model control technique, has been widely applied to control problems in many fields, including power systems. Reinforcement learning learns from the real-time interaction of the agent with the environment by observing the relationship between the agent's actions and the system's states. In a heating system, however, the pipeline and heat-transfer processes involve large time delays, so an agent's action affects the system state only after a long time, and traditional reinforcement learning algorithms are not applicable.
Disclosure of Invention
The present application aims to solve problems existing in the prior art, based on the inventors' recognition and understanding of the following facts. Reinforcement learning, a model-free or weak-model control technique, has in recent years been widely applied to control problems in many fields, including power systems; it learns from the real-time interaction of the reinforcement learning network with the environment by observing the relationship between the network's actions and the system's states. In a heating system, however, the pipeline and heat-transfer processes involve large time delays, so the actions of the reinforcement learning network affect the system state only after a long time, and traditional reinforcement learning algorithms are not applicable.
In view of the above, the present disclosure provides a method and an apparatus for scheduling an electrothermal coupling system, an electronic device, and a storage medium thereof, so as to solve technical problems in the related art.
According to a first aspect of the present disclosure, a method for scheduling an electrothermal coupling system is provided, including:
constructing a reinforcement learning network for scheduling the electric heating coupling system;
collecting measurement data in the electric heating coupling system in real time, training the reinforcement learning network, and updating parameters in the reinforcement learning network;
and outputting, by the trained reinforcement learning network, actions according to the measurement data collected in real time, so as to control the electric-thermal coupling system.
Optionally, the reinforcement learning network for scheduling of the electrothermal coupling system includes a generator μ and an evaluator Q, where:
the expression of the generator μ is a_t = μ(o_t | θ^μ), where θ^μ denotes the model parameters of the generator μ, and the input of the generator at control step t is the measurement information o_t of the electric-thermal coupling system:

o_t = (T̃^pipe, T^in, T^a, c, h, t, Π)_t

where T̃^pipe is the vector formed by the temperatures of those pipe micro-elements that carry temperature measurements after the pipes are spatially discretized; T̃^pipe is a proper subset of T^pipe, the vector formed by the temperatures of all pipe micro-elements after spatial discretization; T^in is the vector formed by the indoor temperatures of all buildings; T^a is the outdoor ambient temperature; c is the electricity price; h is the output power of the heat source; t is the discrete time variable of the control process; Π is the additional memory parameter; and (·)_t denotes the value at control step t;
the output of the generator is the action vector a_t of the control strategy to be executed under the measurement information o_t of the electric-thermal coupling system:

a_t = (m, T_s, a_m)

where m is the column vector formed by the mass flows of all pipes, T_s is the supply temperature of the heat source, and a_m is the variable that decides whether the current observation and action are stored into the additional memory;
the specific structure of the generator μ is as follows:
the input layer of the generator μ contains N_o neurons, where N_o is the dimension of the measurement vector o_t;
the hidden part of the generator μ contains b_1 hidden layers; the number of hidden layers b_1 and the number of neurons in each hidden layer are determined by repeated trials according to engineering experience or accuracy requirements, and the activation function of the hidden layers is ReLU;
the output layer of the generator μ contains N_a neurons, where N_a is the dimension of the action vector a_t, and the activation function of the output layer is tanh;
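Because the output layer of the generator uses a tanh activation, its raw outputs lie in [-1, 1] and must be rescaled to the physical ranges of a_t = (m, T_s, a_m) before being sent to the pumps and the heat source. A minimal Python sketch of such a rescaling; the output ordering and the range limits (m_max, Ts_min, Ts_max) are illustrative assumptions, not values given in this disclosure:

```python
def scale_action(a_raw, m_max, Ts_min, Ts_max):
    """Map the generator's tanh outputs in [-1, 1] onto physical ranges.

    The output split (len(m_max) pipe flows, then one supply temperature,
    then one memory flag) and the range limits are illustrative
    assumptions, not values fixed by this disclosure."""
    n = len(m_max)
    # pipe mass flows: [-1, 1] -> [0, m_max_i]
    m = [(x + 1.0) / 2.0 * mx for x, mx in zip(a_raw[:n], m_max)]
    # heat-source supply temperature: [-1, 1] -> [Ts_min, Ts_max]
    Ts = Ts_min + (a_raw[n] + 1.0) / 2.0 * (Ts_max - Ts_min)
    # memory flag a_m: threshold the last tanh output at zero
    a_m = 1 if a_raw[n + 1] > 0.0 else 0
    return m, Ts, a_m

m, Ts, a_m = scale_action([0.0, 1.0, -1.0], m_max=[10.0], Ts_min=60.0, Ts_max=90.0)
```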
the expression of the evaluator Q is Q(o_t, a_t | θ^Q), where θ^Q are the model parameters of the evaluator Q; the inputs of the evaluator Q are o_t and a_t, and its output is the evaluation value Q(o_t, a_t | θ^Q) of executing action a_t under the measurement o_t;
the structure of the evaluator Q is as follows:
the input layer of the evaluator Q contains (N_o + N_a) neurons;
the hidden part of the evaluator Q contains b_2 hidden layers; the number of hidden layers b_2 and the number of neurons in each hidden layer are determined by repeated trials according to engineering experience or accuracy requirements, and the activation function of the hidden layers is ReLU;
the output layer of the evaluator Q contains 1 neuron, and the activation function of the output layer is linear.
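As a rough illustration of the two networks described above, the following NumPy sketch builds a generator (ReLU hidden layers, tanh output of dimension N_a) and an evaluator (ReLU hidden layers, one linear output over the concatenated (o, a)). The dimensions N_o = 12 and N_a = 5 and the two 256-neuron hidden layers are illustrative choices, not values fixed by this disclosure:

```python
import numpy as np

def mlp(sizes, out_act):
    """Fully connected network: ReLU on hidden layers, out_act on the output."""
    rng = np.random.default_rng(0)
    params = [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
              for m, n in zip(sizes[:-1], sizes[1:])]

    def forward(x):
        for i, (W, b) in enumerate(params):
            x = x @ W + b
            if i < len(params) - 1:
                x = np.maximum(x, 0.0)   # hidden layers: ReLU
        return out_act(x)                # output-layer activation
    return forward

N_o, N_a = 12, 5                                      # illustrative dimensions
actor = mlp([N_o, 256, 256, N_a], np.tanh)            # generator mu: tanh output
critic = mlp([N_o + N_a, 256, 256, 1], lambda x: x)   # evaluator Q: linear output

o = np.zeros(N_o)                  # a dummy measurement vector o_t
a = actor(o)                       # action a_t in [-1, 1]^N_a
q = critic(np.concatenate([o, a])) # evaluation value Q(o_t, a_t)
```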
Optionally, the collecting measurement data in the electrothermal coupling system, training the reinforcement learning network according to the measurement data, and updating parameters in the reinforcement learning network includes:
(1) Initialize the reinforcement learning parameters, specifically as follows:
randomly initialize the parameters θ^μ and θ^Q of the generator μ and the evaluator Q; set the maximum-entropy parameter α_f of the reinforcement learning network, a manually chosen constant; initialize the discrete time variable t = 0 and the training episode counter k_s = 0; initialize the vector data set formed by the additional memory parameter Π as the empty set and choose the number k_m of entries the additional memory can hold; initialize the action set a as empty and the experience library D of the reinforcement learning network as the empty set; set the total number of training episodes N_max and the total number of control steps in one day N_pt.
(2) At control step t, execute the following steps to train the reinforcement learning network:
(2-1) collect in real time, from the measurement devices of the electric-thermal coupling system, the pipe measurements T̃^pipe, the building indoor temperatures T^in, and the outdoor ambient temperature T^a; obtain Π from the additional memory of the reinforcement learning network; together with the electricity price c, the output power h of the heat source, and the control time t, denote the vector formed by the collected information as o' = (T̃^pipe, T^in, T^a, c, h, t, Π), the collected measurement vector;
(2-2) examine the action set a: if a is empty, go to step (2-3); if a is not empty, calculate the evaluation value r of executing action a according to the following formula, add an experience sample to the experience library D of the reinforcement learning network, update D ← D ∪ {(o, a, r, o')}, and then go to step (2-3):

r = −c·h/η − Σ_{i∈Φ_L} [ReLU(T_in,i − T̄_in,i) + ReLU(T̲_in,i − T_in,i)]

where η is the electricity-to-heat conversion efficiency of the electric boiler, T̄_in,i and T̲_in,i are the upper and lower limits of the indoor temperature of building i, T_in,i is the indoor temperature of building i, Φ_L is the set of all buildings, and ReLU(x) is the activation function defined as ReLU(x) = max(0, x);
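The evaluation value r combines the electricity cost of the heat source with a ReLU penalty on indoor-temperature limit violations. A hedged Python sketch of this computation; the conversion efficiency eta and the penalty weight w are illustrative, since the extracted text does not show the exact weighting:

```python
def reward(c, h, T_in, T_lo, T_hi, eta=0.9, w=1.0):
    """Evaluation value r: negative electricity cost of the electric boiler
    (c * h / eta) minus a ReLU penalty on indoor-temperature limit
    violations. eta and the penalty weight w are illustrative values."""
    relu = lambda x: max(0.0, x)
    cost = c * h / eta
    # per-building penalty: excess above the upper limit or deficit below the lower
    penalty = sum(relu(t - hi) + relu(lo - t)
                  for t, lo, hi in zip(T_in, T_lo, T_hi))
    return -cost - w * penalty
```

For example, with no comfort violation the reward is just the negative electricity cost, and each degree of violation subtracts w from the reward.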
(2-3) let the measurement information o = o';
(2-4) generate an action a = (m, T_s, a_m) = μ(o | θ^μ) with the generator network μ from the observation information o;
(2-5) examine a_m: if a_m = 0, go to step (2-6); if a_m ≠ 0, further examine the vector data set formed by the additional memory parameter Π: if Π is not full, let Π = Π ∪ {(o, a)}; if Π is full, let Π = Π \ Π_1 ∪ {(o, a)}, where Π_1 is the first element in Π, ∪ denotes the union of sets, and \ denotes the difference of sets;
(2-6) split a = (m, T_s, a_m): send m to the pumps in the electric-thermal coupling system to control the flows of the pipes and the heat source, and send T_s to the heat source in the electric-thermal coupling system to control its supply temperature, thereby controlling the electric-thermal coupling system at control step t;
(2-7) randomly draw a group of experiences D_B ⊆ D from the experience library D of the reinforcement learning network, with B samples in the group;
(2-8) use the samples of D_B to calculate the loss function of the model parameters θ^Q of the evaluator Q:

L(θ^Q) = E[(Q(o, a | θ^Q) − y_f)²]

where E is the expectation operator, denoting the mathematical expectation over the samples in D_B, and y_f satisfies:

y_f = r + γ[Q(o', a' | θ^Q) − α_f log μ(a' | o')]

where a' is an action obeying the probability distribution μ(· | o'), log μ(a' | o') is the entropy term of the policy generated by the generator μ, and γ is the discount factor, a manually chosen constant satisfying 0 < γ ≤ 1;
(2-9) update the parameters θ^Q:

θ^Q ← θ^Q − ρ_f ∇_{θ^Q} L(θ^Q)

where ρ_f is the training step size of the reinforcement learning network and ∇ denotes the gradient of a function;
(2-10) calculate the loss function of the model parameters θ^μ of the generator μ:

L(θ^μ) = E[α_f log μ(a | o) − Q(o, a | θ^Q)]

where a ~ μ(· | o) denotes that a obeys the probability distribution μ(· | o);
(2-11) update the parameters θ^μ:

θ^μ ← θ^μ − ρ_f ∇_{θ^μ} L(θ^μ);
(3) Let t = t + 1 and compare t with N_pt: if t < N_pt, return to step (2); if t ≥ N_pt, let t = 0 and k_s = k_s + 1, and go to step (4);
(4) compare k_s with N_max: if k_s < N_max, return to step (2) and continue the training process; if k_s ≥ N_max, stop the update of the network parameters and take the network parameters θ^μ and θ^Q obtained at this point as the parameters of the final generator network μ and evaluator network Q.
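Steps (2-7) to (2-9) follow the usual pattern of experience replay with a soft (maximum-entropy) Bellman target. The following Python sketch shows the target y_f and the squared-error critic loss on a sampled batch; the capacity of D, γ = 0.99, and α_f = 0.3 are illustrative, and the quantities q_next and logp_next are assumed to be supplied by the evaluator Q and the generator μ:

```python
import random
from collections import deque

D = deque(maxlen=100_000)   # experience library D (capacity is illustrative)

def sample_batch(B):
    """Randomly draw a group D_B of B experiences from D, as in step (2-7)."""
    return random.sample(list(D), min(B, len(D)))

def soft_target(r, q_next, logp_next, gamma=0.99, alpha_f=0.3):
    """y_f = r + gamma * (Q(o', a' | theta_Q) - alpha_f * log mu(a' | o'))."""
    return r + gamma * (q_next - alpha_f * logp_next)

def critic_loss(q_vals, y_vals):
    """Mean squared Bellman error E[(Q - y_f)^2] over the batch."""
    return sum((q - y) ** 2 for q, y in zip(q_vals, y_vals)) / len(q_vals)
```

In a full implementation the gradient step θ^Q ← θ^Q − ρ_f ∇ L(θ^Q) would be taken by an automatic-differentiation framework; only the target and loss arithmetic are sketched here.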
Optionally, the controlling the electrothermal coupling system according to the action output by the reinforcement learning network includes:
(1) Set the initial control time t = 0, set the total number of control steps in one day N_pt, and initialize the additional memory data set Π as the empty set;
(2) collect in real time, from the measurement devices of the electric-thermal coupling system, the pipe measurements T̃^pipe, the building indoor temperatures T^in, and the outdoor ambient temperature T^a; obtain Π from the additional memory data set of the reinforcement learning network; together with the electricity price c, the heat-source output power h, and the control time t, record the measurement information as o = (T̃^pipe, T^in, T^a, c, h, t, Π);
(3) generate an action a = (m, T_s, a_m) = μ(o | θ^μ) with the generator μ from the measurement information o;
(4) examine a_m: if a_m = 0, go to step (5); if a_m ≠ 0 and the additional memory Π is not full, let Π = Π ∪ {(o, a)}; if a_m ≠ 0 and the additional memory is full, let Π = Π \ Π_1 ∪ {(o, a)}, where Π_1 is the first element in Π;
(5) split a = (m, T_s, a_m): send m to the pumps in the electric-thermal coupling system to control the flows of the pipes and the heat source, and send T_s to the heat source in the electric-thermal coupling system to control its supply temperature, thereby controlling the electric-thermal coupling system at control step t; then let t = t + 1 and compare t with N_pt: if t ≥ N_pt, end the run; otherwise, return to step (2).
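The additional memory Π behaves as a bounded buffer: entries are stored only when the generator sets a_m ≠ 0, and once the memory holds k_m entries the oldest element Π_1 is evicted first. A minimal Python sketch of this behaviour; the class name and interface are assumptions, since the disclosure does not prescribe a data structure:

```python
from collections import deque

class AdditionalMemory:
    """Additional memory Pi with capacity k_m: the pair (o, a) is stored only
    when a_m is nonzero; when full, the oldest element Pi_1 is evicted first.
    A minimal sketch; the patent does not prescribe a data structure."""

    def __init__(self, k_m):
        self.buf = deque(maxlen=k_m)   # deque drops the oldest entry itself

    def maybe_store(self, o, a, a_m):
        if a_m != 0:
            self.buf.append((o, a))

    def read(self):
        return list(self.buf)

mem = AdditionalMemory(k_m=2)
mem.maybe_store("o1", "a1", 1)
mem.maybe_store("o2", "a2", 1)
mem.maybe_store("o3", "a3", 0)   # a_m = 0: not stored
mem.maybe_store("o4", "a4", 1)   # memory full: evicts ("o1", "a1")
```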
According to a second aspect of the present disclosure, an electrothermal coupling system scheduling device is provided, including:
the network construction module is used for constructing a reinforcement learning network for scheduling the electric heating coupling system;
the network parameter updating module is used for acquiring measurement data in the electric heating coupling system in real time, training the reinforcement learning network according to the measurement data and the reaction condition of the electric heating coupling system to the control signal, and updating parameters in the reinforcement learning network;
and the control module is used for outputting actions according to the measurement data acquired in real time by utilizing the trained reinforcement learning network so as to control the electric heating coupling system.
According to a third aspect of the present disclosure, an electronic device is presented, comprising:
a memory for storing computer-executable instructions;
a processor configured to perform:
constructing a reinforcement learning network for scheduling the electric heating coupling system;
collecting measurement data in the electric heating coupling system in real time, training the reinforcement learning network, and updating parameters in the reinforcement learning network;
and outputting, by the trained reinforcement learning network, actions according to the measurement data collected in real time, so as to control the electric-thermal coupling system.
A fourth aspect of the present disclosure proposes a computer-readable storage medium having stored thereon a computer program for causing the computer to execute:
constructing a reinforcement learning network for scheduling the electric heating coupling system;
collecting measurement data in the electric heating coupling system in real time, training the reinforcement learning network, and updating parameters in the reinforcement learning network;
and outputting, by the trained reinforcement learning network, actions according to the measurement data collected in real time, so as to control the electric-thermal coupling system.
According to the embodiments of the present disclosure, the shortcomings of the traditional model-based optimization method and of the traditional reinforcement learning algorithm are overcome: reinforcement learning based on additional memory does not depend on an accurate building model, it can cope with the learning difficulty caused by the large time delay of heat transfer in the electric-thermal coupling system, it exploits load-side flexibility to the greatest extent, and it is suitable for online application.
Additional aspects and advantages of the disclosure will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 is a schematic flowchart of a scheduling method of an electrothermal coupling system according to an embodiment of the present disclosure.
Fig. 2 is a block diagram of a scheduling apparatus of an electrothermal coupling system according to an embodiment of the present disclosure.
Detailed description of the invention
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 is a schematic flow diagram illustrating an electrothermal coupling system scheduling method according to one embodiment of the present disclosure. The scheduling method of the electrothermal coupling system in the embodiment can be applied to user equipment, such as a mobile phone, a tablet computer and the like.
As shown in fig. 1, the scheduling method of the electrothermal coupling system may include the following steps:
in the step 1, constructing a reinforcement learning network for scheduling the electric heating coupling system;
in one embodiment, the reinforcement learning network for scheduling of the electrothermal coupling system includes a generator μ (Actor network) and an evaluator Q (criticic network), where:
(a) the expression of the generator μ is a_t = μ(o_t | θ^μ), where θ^μ denotes the model parameters of the generator μ, and the input of the generator at control step t is the measurement information o_t of the electric-thermal coupling system:

o_t = (T̃^pipe, T^in, T^a, c, h, t, Π)_t

where T̃^pipe is the vector formed by the temperatures of those pipe micro-elements that carry temperature measurements after the pipes are spatially discretized; T̃^pipe is a proper subset of T^pipe, the vector formed by the temperatures of all pipe micro-elements after spatial discretization; T^in is the vector formed by the indoor temperatures of all buildings; T^a is the outdoor ambient temperature; c is the electricity price; h is the output power of the heat source; t is the discrete time variable of the control process; Π is the additional memory parameter; and (·)_t denotes the value at control step t. In the present invention pipe measurements are not required; when there are no pipe measurements, T̃^pipe is taken to be empty;
the output of the generator is the action vector a_t of the control strategy to be executed under the measurement information o_t of the electric-thermal coupling system:

a_t = (m, T_s, a_m)

where m is the column vector formed by the mass flows of all pipes, T_s is the supply temperature of the heat source, and a_m is the variable that decides whether the current measurement and action are stored into the additional memory;
the specific structure of the generator μ is as follows:
the input layer of the generator μ contains N_o neurons, where N_o is the dimension of the measurement vector o_t;
the hidden part of the generator μ contains b_1 hidden layers; the number of hidden layers b_1, the number of neurons per hidden layer, and the activation function are determined by repeated trials according to engineering experience or accuracy requirements; in one embodiment of the invention, b_1 = 2, each hidden layer has 256 neurons, and the activation function of the hidden layers is ReLU;
the output layer of the generator μ contains N_a neurons, where N_a is the dimension of the action vector a_t, and the activation function of the output layer is tanh;
(b) the expression of the evaluator Q is Q(o_t, a_t | θ^Q), where θ^Q are the model parameters of the evaluator Q; the inputs of the evaluator Q are o_t and a_t, and its output is the evaluation value Q(o_t, a_t | θ^Q) of executing action a_t under the measurement o_t;
the structure of the evaluator Q is as follows:
the input layer of the evaluator Q contains (N_o + N_a) neurons;
the hidden part of the evaluator Q contains b_2 hidden layers; the number of hidden layers b_2, the number of neurons per hidden layer, and the activation function are determined by repeated trials according to engineering experience or accuracy requirements; in one embodiment of the disclosure, b_2 = 2, each hidden layer has 256 neurons, and the activation function of the hidden layers is ReLU;
the output layer of the evaluator Q contains 1 neuron, and the activation function of the output layer is linear.
In step 2, the measurement data in the electric-thermal coupling system are collected in real time, the reinforcement learning network is trained according to the measurement data and the response of the electric-thermal coupling system to the control signals, and the parameters of the reinforcement learning network are updated.
In one embodiment, the collecting measurement data in the electrothermal coupling system in real time, training the reinforcement learning network according to the measurement data, and updating parameters in the reinforcement learning network includes:
(1) Initialize the reinforcement learning parameters, specifically as follows:
randomly initialize the parameters θ^μ and θ^Q of the generator μ and the evaluator Q; set the maximum-entropy parameter α_f of the reinforcement learning network, a manually chosen constant; in one embodiment of the disclosure, α_f = 0.3; initialize the discrete time variable t = 0 and the training episode counter k_s = 0; initialize the vector data set formed by the additional memory parameter Π as the empty set and choose the number k_m of entries the additional memory can hold; initialize the action set a as empty and the experience library D of the reinforcement learning network as the empty set; set the total number of training episodes N_max and the total number of control steps in one day N_pt.
(2) At control step t, execute the following steps to train the reinforcement learning network:
(2-1) collect in real time, from the measurement devices of the electric-thermal coupling system, the pipe measurements T̃^pipe, the building indoor temperatures T^in, and the outdoor ambient temperature T^a; obtain Π from the additional memory of the reinforcement learning network; together with the electricity price c, the output power h of the heat source, and the control time t, denote the vector formed by the collected information as o' = (T̃^pipe, T^in, T^a, c, h, t, Π), the collected measurement vector;
(2-2) judging the action set a: if a is empty, entering step (2-3), if a is not empty, calculating an evaluation value r of a when the action is executed according to the following formula, adding an experience sample to the reinforcement learning network experience library D, updating D ← D { (o, a, r, o') }, and then entering step (2-3):
r = −c·h/η − Σ_{i∈Φ_L} [relu(T_{in,i} − T̄_{in,i}) + relu(T̲_{in,i} − T_{in,i})]

where η is the electric-to-heat conversion efficiency of the electric boiler, T̄_{in,i} and T̲_{in,i} are the upper and lower limits of the indoor temperature of building i, T_{in,i} is the indoor temperature of building i, Φ_L is the set of all buildings, and relu(x) is an activation function defined as relu(x) = max(0, x);
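The evaluation value r of step (2-2) prices energy use and comfort violations together. A sketch follows; the relative weight between the cost term and the relu penalty is an illustrative assumption not fixed by the text.

```python
def relu(x):
    """relu(x) = max(0, x), as defined in step (2-2)."""
    return max(0.0, x)

def reward(c, h, eta, T_in, T_lo, T_hi, penalty=1.0):
    """Illustrative evaluation value r for step (2-2).

    Energy cost: electricity price c times the electric power h / eta
    drawn by the electric boiler.  Comfort penalty: relu-shaped violation
    of each building's indoor-temperature band [T_lo_i, T_hi_i].  The
    penalty weight is an assumption for illustration.
    """
    cost = c * h / eta
    violation = sum(relu(T - hi) + relu(lo - T)
                    for T, lo, hi in zip(T_in, T_lo, T_hi))
    return -(cost + penalty * violation)
```

Within the comfort band the penalty vanishes and r reduces to the negated energy cost, which is what lets the agent exploit building thermal inertia.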
(2-3) Let the measurement information o = o';
(2-4) Generate an action a = (m, T_s, a_m) = μ(o|θ^μ) using the generator network μ based on the observation information o;
(2-5) Judge a_m: if a_m = 0, proceed to step (2-6); if a_m ≠ 0, further judge the vector data set formed by the additional memory parameter Π: if the additional memory is not full, let Π = Π ∪ {(o, a)}; if the additional memory data set is full, let Π = Π \ Π_1 ∪ {(o, a)}, where Π_1 is the first element in Π, ∪ represents the union operation on sets, and \ represents the difference operation on sets;
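The additional-memory update of step (2-5) is a fixed-capacity first-in-first-out buffer gated by the action component a_m. A sketch:

```python
def update_memory(Pi, o, a, a_m, k_m):
    """Step (2-5): store (o, a) in the additional memory Pi when a_m != 0.

    When Pi already holds k_m items, the first (oldest) element Pi_1 is
    removed before the new pair is appended, i.e. Pi = Pi \ Pi_1 + {(o, a)}.
    """
    if a_m == 0:
        return Pi                # memory unchanged
    if len(Pi) >= k_m:
        Pi = Pi[1:]              # drop Pi_1, the first element
    return Pi + [(o, a)]
```

Because the agent itself emits a_m, it learns which observation-action pairs are worth remembering to compensate for partial observability.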
(2-6) Issue m in a = (m, T_s, a_m), the mass flows of all pipelines in the electric-thermal coupling system, to each pump in the electric-thermal coupling system to control the flow of each pipeline and heat source, and issue T_s in a to the heat source in the electric-thermal coupling system to control its heat supply temperature, thereby controlling the electric-thermal coupling system at control time t;
(2-7) Randomly extract a group of experiences D_B ⊆ D from the experience library D of the reinforcement learning network, the number of samples in the group being B;
(2-8) Use each sample of D_B to calculate the loss function of the model parameters θ^Q of the evaluator Q:

L(θ^Q) = E[(Q(o, a|θ^Q) − y_f)²]

where E is the expectation operator, representing the mathematical expectation over the samples in D_B, and y_f satisfies:

y_f = r + γ[Q(o', a'|θ^Q) − α_f log μ(a'|o')]

where a' is an action obeying the probability distribution μ(·|o'), log μ(a'|o') is the entropy term of the policy generated by the generator μ, and γ is the discount factor, a manually set constant satisfying 0 < γ ≤ 1; in one embodiment of the present disclosure, γ = 0.99;
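The target y_f of step (2-8) combines the observed reward with the discounted, entropy-regularized next-state value, in the style of soft actor-critic. A sketch with scalar stand-ins (the function q_fn and the sampled log-probabilities are placeholders for the evaluator network and the generator's policy):

```python
def soft_target(r, q_next, logp_next, gamma=0.99, alpha_f=0.3):
    """y_f = r + gamma * (Q(o', a') - alpha_f * log mu(a'|o'))."""
    return r + gamma * (q_next - alpha_f * logp_next)

def critic_loss(batch, q_fn, gamma=0.99, alpha_f=0.3):
    """Mean squared error between Q(o, a) and y_f over a sampled batch D_B.

    Each batch element is (o, a, r, o_next, a_next, logp_next); q_fn is a
    placeholder for the evaluator network Q(o, a | theta_Q).
    """
    losses = []
    for (o, a, r, o2, a2, logp2) in batch:
        y_f = soft_target(r, q_fn(o2, a2), logp2, gamma, alpha_f)
        losses.append((q_fn(o, a) - y_f) ** 2)
    return sum(losses) / len(losses)
```

In practice both the target and the loss are computed with automatic differentiation over the network parameters; the sketch only shows the arithmetic of the two formulas.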
(2-9) Update the parameters θ^Q:

θ^Q ← θ^Q − ρ_f ∇_{θ^Q} L(θ^Q)

where ρ_f is the training step size of the reinforcement learning network and ∇ is the Hamilton operator, representing the gradient of the function;
(2-10) Calculate the loss function of the model parameters θ^μ of the generator μ:

L(θ^μ) = E_{a∼μ(·|o)}[α_f log μ(a|o) − Q(o, a|θ^Q)]

where a ∼ μ(·|o) represents that a obeys the probability distribution μ(·|o);
(2-11) Update the parameters θ^μ:

θ^μ ← θ^μ − ρ_f ∇_{θ^μ} L(θ^μ);
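Steps (2-9) and (2-11) are plain gradient-descent updates with step size ρ_f. With a finite-difference gradient standing in for backpropagation, one update can be sketched as:

```python
def numeric_grad(loss_fn, theta, eps=1e-6):
    """Finite-difference approximation of grad L(theta), for illustration."""
    grad = []
    for i in range(len(theta)):
        up = theta[:i] + [theta[i] + eps] + theta[i + 1:]
        dn = theta[:i] + [theta[i] - eps] + theta[i + 1:]
        grad.append((loss_fn(up) - loss_fn(dn)) / (2 * eps))
    return grad

def gradient_step(theta, loss_fn, rho_f=0.1):
    """theta <- theta - rho_f * grad L(theta), as in steps (2-9)/(2-11)."""
    g = numeric_grad(loss_fn, theta)
    return [p - rho_f * gi for p, gi in zip(theta, g)]
```

Repeated application drives the parameters toward a minimizer of the loss; real implementations replace the numeric gradient with backpropagation through the network.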
(3) Let t = t + 1 and compare t with N_pt: if t ≥ N_pt, let t = 0 and k_s = k_s + 1, and return to step (2); if t < N_pt, proceed to step (4);
(4) Compare k_s with N_max: if k_s < N_max, return to step (2) and continue the training process; if k_s ≥ N_max, stop updating the network parameters, and take the network parameters θ^μ and θ^Q at this moment as the parameters of the final generator network μ and evaluator network Q.
The invention formulates the control problem of the electric-thermal coupling system as a partially observable Markov decision process, converting the large time delay of heat transfer in the electric-thermal coupling system into partial observability and avoiding the problem of asynchronous observations and actions during training. The agent is realized with a reinforcement learning algorithm. Training with the additional-memory-based reinforcement learning method provided by the embodiments of the disclosure copes well with the partial observability of the Markov decision process, so that the control strategy approximately converges to the optimal strategy; on the premise of keeping the indoor temperature of the heat loads within a given range, the thermal inertia of the building heat loads is exploited to the greatest extent to reduce the energy supply cost, realizing optimal control of the electric-thermal coupling system.
In step 3, the trained reinforcement learning network is used for outputting actions according to the measurement data acquired in real time, and the electric heating coupling system is controlled.
In one embodiment, the controlling the electric-thermal coupling system according to the action output by the reinforcement learning network includes:
(1) Set the initial control time t = 0, set the total number of control steps in one day N_pt, and initialize the additional memory Π data set as an empty set;
(2) Collect in real time from the measurement devices of the electric-thermal coupling system the pipeline measurement values T̃_pipe, the building indoor temperatures T_in, the outdoor ambient temperature T_a, the electricity price c, the heat source output power h, and the control time t; acquire Π from the additional memory data set of the reinforcement learning network; and record the measurement information as o = (T̃_pipe, T_in, T_a, c, h, t, Π);
(3) Generate an action a = (m, T_s, a_m) = μ(o|θ^μ) using the generator μ according to the measurement information o;
(4) Judge a_m: if a_m = 0, proceed to step (5); if a_m ≠ 0 and the additional memory Π is not full, let Π = Π ∪ {(o, a)}; if a_m ≠ 0 and the additional memory is full, let Π = Π \ Π_1 ∪ {(o, a)}, where Π_1 is the first element in Π;
(5) Issue m in a = (m, T_s, a_m) to each pump in the electric-thermal coupling system to control the flow of each pipeline and heat source, and issue T_s in a to the heat source in the electric-thermal coupling system to control its heat supply temperature, realizing control of the electric-thermal coupling system at control time t; then let t = t + 1 and compare t with N_pt: if t ≥ N_pt, let t = 0 and end the operation; if t ≠ 0, return to step (2).
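The deployment procedure of step 3 reduces to a loop that reads measurements, queries the trained generator, maintains the additional memory, and issues (m, T_s) to the pumps and heat source. A skeleton follows; policy, read_measurements, and issue_controls are hypothetical stubs for the trained generator μ and the plant interface.

```python
def run_control_day(policy, read_measurements, issue_controls, n_pt, k_m):
    """One day of closed-loop control (steps (1)-(5) of the deployment phase).

    policy(o) -> (m, T_s, a_m) stands in for the trained generator mu;
    read_measurements(t, Pi) -> o and issue_controls(m, T_s) are stubs
    for the plant measurement and actuation interfaces.
    """
    Pi = []                          # additional memory, initially empty
    for t in range(n_pt):
        o = read_measurements(t, Pi)
        m, T_s, a_m = policy(o)
        if a_m != 0:                 # store (o, a) with FIFO eviction
            if len(Pi) >= k_m:
                Pi = Pi[1:]
            Pi = Pi + [(o, (m, T_s, a_m))]
        issue_controls(m, T_s)       # mass flows to pumps, T_s to heat source
    return Pi
```

No network parameters are updated here; only the forward pass of the trained generator is evaluated at each control step.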
Corresponding to the electrothermal coupling system scheduling method, the disclosure also provides an embodiment of an electrothermal coupling system scheduling device.
Fig. 2 shows an electrothermal coupling system scheduling apparatus according to an embodiment of the present disclosure, including:
the network construction module is configured to construct a reinforcement learning network for scheduling of the electrothermal coupling system;
the network training module is configured to collect measurement data in the electric-thermal coupling system in real time, train the reinforcement learning network according to the measurement data and the response of the electric-thermal coupling system to the control signals, and update the parameters in the reinforcement learning network;
and the control module is configured to utilize the trained reinforcement learning network to output actions according to the measurement data acquired in real time so as to control the electric heating coupling system.
An embodiment of the present disclosure also provides an electronic device, including:
a memory for storing processor-executable instructions;
a processor configured to:
constructing a reinforcement learning network for scheduling the electric heating coupling system;
collecting measurement data in the electric heating coupling system in real time, training the reinforcement learning network, and updating parameters in the reinforcement learning network;
and outputting actions according to the measurement data acquired in real time by using the trained reinforcement learning network, so as to control the electric heating coupling system.
Embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon a computer program for causing a computer to execute:
constructing a reinforcement learning network for scheduling the electric heating coupling system;
collecting measurement data in the electric heating coupling system in real time, training the reinforcement learning network, and updating parameters in the reinforcement learning network;
and outputting actions according to the measurement data acquired in real time by using the trained reinforcement learning network, so as to control the electric heating coupling system.
It should be noted that, in the embodiments of the present disclosure, the processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor. The memory may be used to store the computer program and/or modules, and the processor realizes the various functions of the electric-thermal coupling system scheduling apparatus by running the computer program and/or modules stored in the memory and calling the data stored in the memory. The memory may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required by at least one function, and the data storage area may store data created according to the use of the device. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash card, at least one magnetic disk storage device, a flash memory device, or other solid-state storage device. If the modules/units of the electric-thermal coupling system scheduling apparatus are realized in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium.
Based on such understanding, all or part of the flow of the method embodiments described above may be completed by instructing relevant hardware through a computer program; the computer program may be stored in a computer-readable storage medium, and when executed by a processor, the steps of the method embodiments described above can be realized. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the device embodiments described above are merely illustrative: units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the device embodiments provided by the present disclosure, the connection relationships between modules indicate communication connections between them, which may be specifically implemented as one or more communication buses or signal lines. A person of ordinary skill in the art can understand and implement this without inventive effort.
While the foregoing is directed to the preferred embodiments of the present disclosure, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the disclosure.

Claims (7)

1. An electrothermal coupling system scheduling method is characterized by comprising the following steps:
step 1, constructing a reinforcement learning network for scheduling an electrothermal coupling system;
step 2, collecting measurement data in the electric heating coupling system in real time, training the reinforcement learning network, and updating parameters in the reinforcement learning network;
and step 3, outputting actions according to the measurement data acquired in real time by using the trained reinforcement learning network to control the electric heating coupling system.
2. The electrothermal coupling system scheduling method of claim 1, wherein the reinforcement learning network for electrothermal coupling system scheduling comprises a generator μ and an evaluator Q, wherein:
(a) The expression of the generator μ is a_t = μ(o_t|θ^μ), where θ^μ represents the model parameters of the generator μ, and the input of the generator is the measurement information o_t of the electric-thermal coupling system at time step t:

o_t = (T̃_pipe, T_in, T_a, c, h, t, Π)_t

where T̃_pipe is a vector formed by the temperatures of the measured pipe infinitesimals after spatial differentiation of the pipelines, and is a proper subset of T_pipe, the vector formed by the temperatures of all pipe infinitesimals after spatial differentiation; T_in is the vector formed by the indoor temperatures of all buildings; T_a is the outdoor ambient temperature; c is the electricity price; h is the output power of the heat source; t is the discrete time variable of the control process; Π is the additional memory parameter; and (·)_t represents the value at control time t;

the output of the generator is the action vector of the control strategy to be executed under the measurement information o_t of the electric-thermal coupling system:

a_t = (m, T_s, a_m)

where m is a column vector formed by the mass flows of all pipelines, T_s is the supply temperature of the heat source, and a_m is a variable determining whether to store the current observation and action in the memory;
the specific structure of the generator μ is as follows:
the input layer of the generator μ contains N_o neurons, where N_o is the dimension of the measurement vector o_t;
the generator μ contains b_1 hidden layers, where the number of hidden layers b_1 and the number of neurons in each hidden layer are determined by repeated trial and error according to human experience or calculation accuracy requirements, and the activation function of the hidden layers is ReLU;
the output layer of the generator μ contains N_a neurons, where N_a is the dimension of the action vector a_t, and the activation function of the output layer is the tanh activation function;
(b) The expression of the evaluator Q is Q(o_t, a_t|θ^Q), where θ^Q denotes the model parameters of the evaluator Q; the input of the evaluator Q is o_t and a_t, and the output of the evaluator Q is the evaluation value of executing action a_t under measurement o_t.
The structure of the evaluator Q is as follows:
the input layer of the evaluator Q contains (N_o + N_a) neurons;
the evaluator Q contains b_2 hidden layers, where the number of hidden layers b_2 and the number of neurons in each hidden layer are determined by repeated trial and error according to human experience or calculation accuracy requirements, and the activation function of the hidden layers is ReLU;
the output layer of the evaluator Q contains 1 neuron, and the activation function of the output layer is a linear activation function.
3. The electrothermal coupling system scheduling method of claim 1, wherein the acquiring measurement data in the electrothermal coupling system, training the reinforcement learning network according to the measurement data, and updating parameters in the reinforcement learning network comprises:
(1) initializing reinforcement learning parameters, specifically as follows:
Randomly initialize the generator μ and evaluator Q parameters θ^μ and θ^Q; set the maximum-entropy parameter α_f of the reinforcement learning network, where α_f is a manually set constant; initialize the discrete time variable t = 0 and the training cycle count k_s = 0; initialize the vector data set formed by the additional memory parameter Π as an empty set, and select the number k_m of data items that the additional memory can store; initialize the action set a as empty, and initialize the reinforcement learning network experience library D as an empty set; set the total number of training cycles N_max and the total number of control steps in one day N_pt;
(2) And executing the following steps at the t control moment to train the reinforcement learning network:
(2-1) Collect in real time from the measurement devices of the electric-thermal coupling system the pipeline measurement values T̃_pipe, the building indoor temperatures T_in, the outdoor ambient temperature T_a, the electricity price c, the heat source output power h, and the control time t; acquire Π from the additional memory of the reinforcement learning network; and record the vector formed by the collected information, i.e. the collected measurement vector, as o' = (T̃_pipe, T_in, T_a, c, h, t, Π);
(2-2) Judge the action set a: if a is empty, proceed to step (2-3); if a is not empty, calculate the evaluation value r of executing action a according to the following formula, add an experience sample to the reinforcement learning network experience library D by updating D ← D ∪ {(o, a, r, o')}, and then proceed to step (2-3):
r = −c·h/η − Σ_{i∈Φ_L} [relu(T_{in,i} − T̄_{in,i}) + relu(T̲_{in,i} − T_{in,i})]

where η is the electric-to-heat conversion efficiency of the electric boiler, T̄_{in,i} and T̲_{in,i} are the upper and lower limits of the indoor temperature of building i, T_{in,i} is the indoor temperature of building i, Φ_L is the set of all buildings, and relu(x) is an activation function defined as relu(x) = max(0, x);
(2-3) Let the measurement information o = o';
(2-4) Generate an action a = (m, T_s, a_m) = μ(o|θ^μ) using the generator network μ based on the observation information o;
(2-5) Judge a_m: if a_m = 0, proceed to step (2-6); if a_m ≠ 0, further judge the vector data set formed by the additional memory parameter Π: if the additional memory is not full, let Π = Π ∪ {(o, a)}; if the additional memory data set is full, let Π = Π \ Π_1 ∪ {(o, a)}, where Π_1 is the first element in Π, ∪ represents the union operation on sets, and \ represents the difference operation on sets;
(2-6) Issue m in a = (m, T_s, a_m) to each pump in the electric-thermal coupling system to control the flow of each pipeline and heat source, and issue T_s in a to the heat source in the electric-thermal coupling system to control its heat supply temperature, thereby controlling the electric-thermal coupling system at control time t;
(2-7) Randomly extract a group of experiences D_B ⊆ D from the experience library D of the reinforcement learning network, the number of samples in the group being B;
(2-8) Use each sample of D_B to calculate the loss function of the model parameters θ^Q of the evaluator Q:

L(θ^Q) = E[(Q(o, a|θ^Q) − y_f)²]

where E is the expectation operator, representing the mathematical expectation over the samples in D_B, and y_f satisfies:

y_f = r + γ[Q(o', a'|θ^Q) − α_f log μ(a'|o')]

where a' is an action obeying the probability distribution μ(·|o'), log μ(a'|o') is the entropy term of the policy generated by the generator μ, and γ is the discount factor, a manually set constant satisfying 0 < γ ≤ 1;
(2-9) Update the parameters θ^Q:

θ^Q ← θ^Q − ρ_f ∇_{θ^Q} L(θ^Q)

where ρ_f is the training step size of the reinforcement learning network and ∇ is the Hamilton operator, representing the gradient of the function;
(2-10) Calculate the loss function of the model parameters θ^μ of the generator μ:

L(θ^μ) = E_{a∼μ(·|o)}[α_f log μ(a|o) − Q(o, a|θ^Q)]

where a ∼ μ(·|o) represents that a obeys the probability distribution μ(·|o);
(2-11) Update the parameters θ^μ:

θ^μ ← θ^μ − ρ_f ∇_{θ^μ} L(θ^μ);
(3) Let t = t + 1 and compare t with N_pt: if t ≥ N_pt, let t = 0 and k_s = k_s + 1, and return to step (2); if t < N_pt, proceed to step (4);
(4) Compare k_s with N_max: if k_s < N_max, return to step (2) and continue the training process; if k_s ≥ N_max, stop updating the network parameters, and take the network parameters θ^μ and θ^Q at this moment as the parameters of the final generator network μ and evaluator network Q.
4. The electrothermal coupling system scheduling method of claim 1, wherein the controlling the electrothermal coupling system according to the action output by the reinforcement learning network comprises:
(1) Set the initial control time t = 0, set the total number of control steps in one day N_pt, and initialize the additional memory Π data set as an empty set;
(2) Collect in real time from the measurement devices of the electric-thermal coupling system the pipeline measurement values T̃_pipe, the building indoor temperatures T_in, the outdoor ambient temperature T_a, the electricity price c, the heat source output power h, and the control time t; acquire Π from the additional memory data set of the reinforcement learning network; and record the measurement information as o = (T̃_pipe, T_in, T_a, c, h, t, Π);
(3) Generate an action a = (m, T_s, a_m) = μ(o|θ^μ) using the generator μ according to the measurement information o;
(4) Judge a_m: if a_m = 0, proceed to step (5); if a_m ≠ 0 and the additional memory Π is not full, let Π = Π ∪ {(o, a)}; if a_m ≠ 0 and the additional memory is full, let Π = Π \ Π_1 ∪ {(o, a)}, where Π_1 is the first element in Π;
(5) Issue m in a = (m, T_s, a_m) to each pump in the electric-thermal coupling system to control the flow of each pipeline and heat source, and issue T_s in a to the heat source in the electric-thermal coupling system to control its heat supply temperature, realizing control of the electric-thermal coupling system at control time t; then let t = t + 1 and compare t with N_pt: if t ≥ N_pt, let t = 0 and end the operation; if t ≠ 0, return to step (2).
5. An electrothermal coupling system scheduling device, comprising:
the network construction module is used for constructing a reinforcement learning network for scheduling the electric heating coupling system;
the network parameter updating module is used for collecting measurement data in the electric-thermal coupling system in real time, training the reinforcement learning network according to the measurement data and the response of the electric-thermal coupling system to the control signals, and updating the parameters in the reinforcement learning network;
and the control module is used for outputting actions according to the measurement data acquired in real time by utilizing the trained reinforcement learning network so as to control the electric heating coupling system.
6. An electronic device comprising a memory for storing computer-executable instructions;
a processor configured to perform any of the electrothermal coupling system scheduling methods of claims 1-4.
7. A computer-readable storage medium, characterized in that a computer program is stored thereon for causing a computer to perform any of the electrothermal coupling system scheduling methods of claims 1-4.
CN202110989053.0A 2021-08-26 2021-08-26 Electric heating coupling system scheduling method and device, electronic equipment and storage medium thereof Pending CN113779871A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110989053.0A CN113779871A (en) 2021-08-26 2021-08-26 Electric heating coupling system scheduling method and device, electronic equipment and storage medium thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110989053.0A CN113779871A (en) 2021-08-26 2021-08-26 Electric heating coupling system scheduling method and device, electronic equipment and storage medium thereof

Publications (1)

Publication Number Publication Date
CN113779871A true CN113779871A (en) 2021-12-10

Family

ID=78839574

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110989053.0A Pending CN113779871A (en) 2021-08-26 2021-08-26 Electric heating coupling system scheduling method and device, electronic equipment and storage medium thereof

Country Status (1)

Country Link
CN (1) CN113779871A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117649102A (en) * 2024-01-30 2024-03-05 大连理工大学 Optimal scheduling method of multi-energy flow system in steel industry based on maximum entropy reinforcement learning

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021007812A1 (en) * 2019-07-17 2021-01-21 深圳大学 Deep neural network hyperparameter optimization method, electronic device and storage medium
CN112290536A (en) * 2020-09-23 2021-01-29 电子科技大学 Online scheduling method of electricity-heat comprehensive energy system based on near-end strategy optimization
CN112529727A (en) * 2020-11-06 2021-03-19 台州宏远电力设计院有限公司 Micro-grid energy storage scheduling method, device and equipment based on deep reinforcement learning
CN112862281A (en) * 2021-01-26 2021-05-28 中国电力科学研究院有限公司 Method, device, medium and electronic equipment for constructing scheduling model of comprehensive energy system
CN113159341A (en) * 2021-04-23 2021-07-23 中国电力科学研究院有限公司 Power distribution network aid decision-making method and system integrating deep reinforcement learning and expert experience
CN113283156A (en) * 2021-03-29 2021-08-20 北京建筑大学 Subway station air conditioning system energy-saving control method based on deep reinforcement learning
US11205124B1 (en) * 2020-12-04 2021-12-21 East China Jiaotong University Method and system for controlling heavy-haul train based on reinforcement learning

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021007812A1 (en) * 2019-07-17 2021-01-21 深圳大学 Deep neural network hyperparameter optimization method, electronic device and storage medium
CN112290536A (en) * 2020-09-23 2021-01-29 电子科技大学 Online scheduling method of electricity-heat comprehensive energy system based on near-end strategy optimization
CN112529727A (en) * 2020-11-06 2021-03-19 台州宏远电力设计院有限公司 Micro-grid energy storage scheduling method, device and equipment based on deep reinforcement learning
US11205124B1 (en) * 2020-12-04 2021-12-21 East China Jiaotong University Method and system for controlling heavy-haul train based on reinforcement learning
CN112862281A (en) * 2021-01-26 2021-05-28 中国电力科学研究院有限公司 Method, device, medium and electronic equipment for constructing scheduling model of comprehensive energy system
CN113283156A (en) * 2021-03-29 2021-08-20 北京建筑大学 Subway station air conditioning system energy-saving control method based on deep reinforcement learning
CN113159341A (en) * 2021-04-23 2021-07-23 中国电力科学研究院有限公司 Power distribution network aid decision-making method and system integrating deep reinforcement learning and expert experience

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Qi; Qiao Ying; Zhang Yujing: "Deep reinforcement learning method for continuous reactive power optimization of distribution networks", Power System Technology, no. 04, pages 294-301 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117649102A (en) * 2024-01-30 2024-03-05 大连理工大学 Optimal scheduling method of multi-energy flow system in steel industry based on maximum entropy reinforcement learning
CN117649102B (en) * 2024-01-30 2024-05-17 大连理工大学 Optimal scheduling method of multi-energy flow system in steel industry based on maximum entropy reinforcement learning

Similar Documents

Publication Publication Date Title
Tian et al. Data driven parallel prediction of building energy consumption using generative adversarial nets
CN112529727A (en) Micro-grid energy storage scheduling method, device and equipment based on deep reinforcement learning
CN111860982A (en) Wind power plant short-term wind power prediction method based on VMD-FCM-GRU
CN112686464A (en) Short-term wind power prediction method and device
CN110866592B (en) Model training method, device, energy efficiency prediction method, device and storage medium
CN112766596A (en) Building energy consumption prediction model construction method, energy consumption prediction method and device
CN116707331B (en) Inverter output voltage high-precision adjusting method and system based on model prediction
CN113657660A (en) Heat source load prediction method based on substation load and heat supply network hysteresis model
He et al. Probabilistic solar irradiance forecasting via a deep learning‐based hybrid approach
CN112100911A (en) Solar radiation prediction method based on deep BISLTM
CN111461445A (en) Short-term wind speed prediction method and device, computer equipment and storage medium
CN113722939A (en) Wind power output prediction method, device, equipment and storage medium
CN114065646B (en) Energy consumption prediction method based on hybrid optimization algorithm, cloud computing platform and system
CN116757465A (en) Line risk assessment method and device based on double training weight distribution model
CN111192158A (en) Transformer substation daily load curve similarity matching method based on deep learning
CN113779871A (en) Electric heating coupling system scheduling method and device, electronic equipment and storage medium thereof
CN114970357A (en) Energy-saving effect evaluation method, system, device and storage medium
CN115100466A (en) Non-invasive load monitoring method, device and medium
CN113552855B (en) Industrial equipment dynamic threshold setting method and device, electronic equipment and storage medium
CN112561180B (en) Short-term wind speed prediction method and device based on meta-learning, computer equipment and storage medium
Khan et al. Deep dive into hybrid networks: A comparative study and novel architecture for efficient power prediction
CN113408808A (en) Training method, data generation method, device, electronic device and storage medium
CN116885711A (en) Wind power prediction method, device, equipment and readable storage medium
CN116777646A (en) Artificial intelligence-based risk identification method, apparatus, device and storage medium
CN113449968B (en) New energy power grid frequency risk assessment method and device based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination