CN113715805B - Rule fusion deep reinforcement learning energy management method based on working condition identification - Google Patents

Info

Publication number
CN113715805B
Authority
CN
China
Prior art keywords
vehicle
port
working condition
torque
battery
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111177978.1A
Other languages
Chinese (zh)
Other versions
CN113715805A (en)
Inventor
***
昌诚程
张自宇
栾众楷
赵万忠
周冠
文凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Tianhang Intelligent Equipment Research Institute Co ltd
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing Tianhang Intelligent Equipment Research Institute Co ltd
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Tianhang Intelligent Equipment Research Institute Co ltd, Nanjing University of Aeronautics and Astronautics filed Critical Nanjing Tianhang Intelligent Equipment Research Institute Co ltd
Priority to CN202111177978.1A priority Critical patent/CN113715805B/en
Publication of CN113715805A publication Critical patent/CN113715805A/en
Application granted granted Critical
Publication of CN113715805B publication Critical patent/CN113715805B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W20/00Control systems specially adapted for hybrid vehicles
    • B60W20/20Control strategies involving selection of hybrid configuration, e.g. selection between series or parallel configuration
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W10/00Conjoint control of vehicle sub-units of different type or different function
    • B60W10/04Conjoint control of vehicle sub-units of different type or different function including control of propulsion units
    • B60W10/06Conjoint control of vehicle sub-units of different type or different function including control of propulsion units including control of combustion engines
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W10/00Conjoint control of vehicle sub-units of different type or different function
    • B60W10/04Conjoint control of vehicle sub-units of different type or different function including control of propulsion units
    • B60W10/08Conjoint control of vehicle sub-units of different type or different function including control of propulsion units including control of electric propulsion units, e.g. motors or generators
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W20/00Control systems specially adapted for hybrid vehicles
    • B60W20/10Controlling the power contribution of each of the prime movers to meet required power demand
    • B60W20/11Controlling the power contribution of each of the prime movers to meet required power demand using model predictive control [MPC] strategies, i.e. control methods based on models predicting performance
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W20/00Control systems specially adapted for hybrid vehicles
    • B60W20/10Controlling the power contribution of each of the prime movers to meet required power demand
    • B60W20/15Control strategies specially adapted for achieving a particular effect
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2710/00Output or target parameters relating to a particular sub-units
    • B60W2710/06Combustion engines, Gas turbines
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2710/00Output or target parameters relating to a particular sub-units
    • B60W2710/08Electric propulsion units
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/60Other road transportation technologies with climate change mitigation effect
    • Y02T10/62Hybrid vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Chemical & Material Sciences (AREA)
  • Combustion & Propulsion (AREA)
  • Automation & Control Theory (AREA)
  • Electric Propulsion And Braking For Vehicles (AREA)

Abstract

The invention discloses a rule-fused deep reinforcement learning energy management method based on working condition identification. Taking a plug-in hybrid electric vehicle as the object, a hybrid power system model is established with a parallel structure connecting the engine and the motor. A working condition library is built from 8 standard working conditions and divided into kinematic segments, and the vehicle working conditions are classified and identified by comparing 9 representative parameters of the segments. The state, action, agent and penalty function of the deep Q-learning algorithm are then designed, and the resulting rule-fused deep reinforcement learning algorithm is trained under three different training working conditions and used for energy distribution. This achieves efficient energy distribution and utilization: the algorithm produces fewer poor samples during training, trains efficiently, and gives the hybrid electric vehicle system high overall performance.

Description

Rule fusion deep reinforcement learning energy management method based on working condition identification
Technical Field
The invention relates to the field of energy management of hybrid power systems, in particular to a rule fusion deep reinforcement learning energy management method based on working condition identification.
Background
The hybrid power system is a relatively mature drive type for the transition from fuel vehicles to pure electric vehicles, and the plug-in hybrid power system, a relatively new drive type, has been widely applied in recent years as battery technology has developed.
The energy management strategies of current hybrid vehicles can be roughly divided into three categories: rule-based, optimization-based and learning-based. A rule-based strategy requires extensive experimental results and experience, is biased toward local optimization at the component level, and cannot achieve overall optimal control of the plug-in hybrid power system; its rules are usually designed for specific working conditions and adapt poorly to others. An optimization-based strategy can only find the optimal solution under known working conditions and does not transfer well to unknown ones; global optimization is prone to the curse of dimensionality and has poor real-time performance, while instantaneous optimization depends heavily on the model and cannot guarantee optimal distribution over a long time horizon. Learning-based strategies do not consider working condition adaptability: the algorithm is generally trained under a standard working condition, and when the working condition characteristics change, the strategy can distribute energy unreasonably and run the hybrid power system inefficiently. Moreover, the learning algorithm hands the entire action space to the machine for exploration during training and gains nothing from expert experience, so training produces many poor samples and is inefficient, and the trained strategy may control poorly under certain conditions and give the hybrid power system low overall performance.
Therefore, to address these problems, the invention provides a rule fusion deep reinforcement learning energy management method based on working condition identification, so that the energy of the hybrid power system is distributed reasonably.
Disclosure of Invention
The invention aims to solve the technical problem of providing a rule fusion deep reinforcement learning energy management method based on working condition identification aiming at the defects of the background technology. The method solves the problems that in the training process of an algorithm, a plurality of poor samples exist, the training efficiency is low, the control effect of the trained energy management strategy is not ideal under certain conditions, and the comprehensive performance of a hybrid power system is low.
The invention adopts the following technical scheme for solving the technical problems:
a rule fusion deep reinforcement learning energy management method based on working condition identification specifically comprises the following steps:
step 1, establishing a hybrid power system model;
step 2, classifying and identifying working conditions;
step 3, designing a rule-fused deep reinforcement learning energy management strategy.
Further, in step 1, the hybrid power system model is established with a plug-in hybrid electric vehicle as the object and a parallel structure as the connection between the engine and the motor. The plug-in hybrid power system comprises a fuel engine, a motor, a vehicle-mounted power battery, a fuel tank, a torque coupler, a clutch and a 5-speed transmission. The fuel engine is connected to the torque coupler, the motor is directly connected to one end of the torque coupler, the output end of the torque coupler is connected to the clutch, and the other end of the clutch is connected to the 5-speed transmission, which transmits power to the front axle to drive the vehicle;
the power battery adopts a Rint equivalent circuit model:
Figure GDA0003933969620000021
in the formula, $I$ is the battery current, positive when discharging and negative when charging; $U_{ocv}$ is the open-circuit voltage of the battery, obtained from an open-circuit voltage test; $R$ is the internal resistance of the battery, which varies with the SOC and is obtained from a lookup table; $P_{bat}$ is the battery power: when the motor torque $T_m$ is positive the battery is discharging, and when $T_m$ is negative the battery is charging; $n_m$ is the motor speed; $\eta_{bat-d}$ is the battery discharge efficiency; $\eta_{bat-c}$ is the battery charging efficiency; $\eta_m$ is the motor efficiency at the current speed and torque; SOC is the state of charge of the battery; $\Delta t$ is the sampling interval; $Q$ is the battery capacity;
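The battery equations themselves are not recoverable from this text (they were embedded as an image), so the following is only a minimal Python sketch of how a Rint model is commonly evaluated. It assumes the usual power balance $P_{bat} = U_{ocv} I - I^2 R$ solved for $I$, and a coulomb-counting SOC update; the function names, SI units and example values are illustrative assumptions, not taken from the patent.

```python
import math


def battery_power_w(t_m, n_m, eta_m, eta_dis, eta_chg):
    """Battery power in W from motor torque (N*m) and motor speed (r/min).

    Assumed convention: positive torque discharges the battery,
    negative torque charges it.
    """
    p_mech = t_m * n_m * 2.0 * math.pi / 60.0  # shaft power in W
    if t_m >= 0:
        # Discharging: the battery must supply the shaft power plus losses.
        return p_mech / (eta_m * eta_dis)
    # Charging: only part of the shaft power reaches the battery.
    return p_mech * eta_m * eta_chg


def rint_current(p_bat, u_ocv, r_int):
    """Solve P_bat = U_ocv * I - I**2 * R for I (assumed Rint power balance)."""
    disc = u_ocv ** 2 - 4.0 * r_int * p_bat
    if disc < 0:
        raise ValueError("requested power exceeds what the battery can deliver")
    return (u_ocv - math.sqrt(disc)) / (2.0 * r_int)


def soc_step(soc, current, dt, capacity_as):
    """Coulomb counting: SOC drops while current > 0 (capacity in A*s)."""
    return soc - current * dt / capacity_as
```

In a simulation loop, `battery_power_w`, `rint_current` and `soc_step` would be called once per sampling interval, with $U_{ocv}$ and $R$ looked up from the current SOC.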
neglecting the vertical motion and handling stability of the vehicle, the longitudinal driving equation of the vehicle is as follows:
Figure GDA0003933969620000022
wherein $T_{con}$ is the torque required by the current working condition; $i_g$ is the transmission ratio in the current gear; $i_0$ is the final drive ratio; $\eta_T$ is the overall transmission efficiency; $r$ is the wheel radius; $m$ is the vehicle mass; $g$ is the gravitational acceleration; $f$ is the rolling resistance coefficient; $\theta$ is the road slope angle; $C_D$ is the air resistance coefficient; $A$ is the frontal area of the vehicle; $u$ is the vehicle speed; $\delta$ is the rotating mass conversion coefficient.
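The symbols above match the standard road-load balance, in which the tractive force at the wheels equals rolling resistance plus grade resistance plus aerodynamic drag plus inertia. The Python sketch below evaluates the required torque under that assumption; the exact form in the patent is not recoverable from this text. SI units are used throughout, and the air density default is an assumed value.

```python
import math


def required_torque(v, a, gear_ratio, final_ratio, eta_t, r_wheel,
                    mass, f_roll, theta, c_d, area, delta,
                    g=9.81, rho_air=1.2258):
    """Torque demanded upstream of the gearbox, in N*m.

    v: speed in m/s, a: acceleration in m/s^2, theta: slope angle in rad.
    Assumed balance: T * i_g * i_0 * eta_T / r = sum of resistances.
    """
    rolling = mass * g * f_roll * math.cos(theta)
    grade = mass * g * math.sin(theta)
    aero = 0.5 * rho_air * c_d * area * v ** 2
    inertia = delta * mass * a
    wheel_force = rolling + grade + aero + inertia
    return wheel_force * r_wheel / (gear_ratio * final_ratio * eta_t)
```

With speed in km/h instead of m/s, the aerodynamic term is often written as $C_D A u^2 / 21.15$, which is the same expression with the unit conversion and air density folded into the constant.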
The vehicle torque coupler adopts three-port two-degree-of-freedom mechanical configuration, a port 1 is used for unidirectional power input, a port 2 and a port 3 are used for bidirectional power input or output, the port 1 is connected with an engine crankshaft, the port 2 is connected with a motor output shaft, and the port 3 is connected with a clutch input end;
the relationship between the torque and the rotating speed of each port of the torque coupler is as follows:
Figure GDA0003933969620000031
in the formula, $T_e$ is the engine torque; $n_e$ is the engine speed; $T_3$ is the coupler output torque; $n_3$ is the coupler output speed; $i_e$ is the transmission ratio where port 1 connects to the engine crankshaft, taken here as $i_e = 1$; $i_m$ is the transmission ratio where port 2 connects to the motor output shaft; because the motor speed is generally high and must be reduced, the invention takes $i_m = 1.7368$;
there are 3 driving modes according to the energy flow direction of the engine and the motor in the torque coupler:
(1) Combined drive mode: in this mode, port 1 and port 2 are power inputs and port 3 is the power output; the engine and the motor jointly drive the vehicle, the motor torque $T_m$ is positive and the battery is discharging;
(2) Pure electric drive mode: in this mode, port 1 has no power input and port 2 is the power input; the motor drives the vehicle alone, the motor torque $T_m$ is positive, the battery is discharging and the engine is stopped. Because port 1 is a unidirectional power input, the engine can be decoupled from the power system, which reduces mechanical loss;
(3) Motor charging mode: in this mode, the motor acts as a generator and the motor torque $T_m$ is negative. Depending on the vehicle state, charging is divided into charging while driving and charging while parked. When charging while driving, the clutch is engaged, port 1 is the power input, and ports 2 and 3 are power outputs; the engine drives the vehicle and simultaneously drives the generator, and the battery is charging. When charging while parked, port 1 is the power input, port 2 is the power output and port 3 has no power output; the clutch is disengaged, which avoids the mechanical loss of the gearbox and front axle, and the engine only powers the generator to charge the battery.
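The three modes can be told apart from the signs of the engine and motor torques plus the clutch state. The Python sketch below uses the coupler torque relation $T_3 = i_e T_e + i_m T_m$ as it appears in the claims, together with the ratios given in the text. The mode labels, and the "other" fallback for torque combinations the text does not list, are assumptions for illustration.

```python
I_E = 1.0      # port 1 ratio, from the text
I_M = 1.7368   # port 2 ratio, from the text


def coupler_output_torque(t_e, t_m):
    """Torque at port 3: T_3 = i_e * T_e + i_m * T_m."""
    return I_E * t_e + I_M * t_m


def drive_mode(t_e, t_m, clutch_engaged):
    """Classify the operating mode from torque signs and clutch state."""
    if t_e > 0 and t_m > 0:
        return "combined"                 # engine and motor both drive
    if t_e == 0 and t_m > 0:
        return "pure_electric"            # engine stopped, motor drives
    if t_e > 0 and t_m < 0:
        # Motor acts as a generator; the clutch state separates the two cases.
        if clutch_engaged:
            return "charging_while_driving"
        return "charging_while_parked"
    return "other"                        # not covered by the three modes
```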
Further, the kinematic segment of the vehicle working condition in step 2 represents the vehicle driving state from the start of one idle period to the start of the next, and comprises an idling process and a driving process; the vehicle is stationary during the idling process, and the driving process includes multiple acceleration, constant-speed and deceleration behaviors. To establish the deep reinforcement learning training working conditions comprehensively, the invention selects 8 standard working conditions to build a working condition library, divides the library into kinematic segments, and then computes the following 9 representative parameters for each segment to characterize it: average vehicle speed, average running vehicle speed, maximum vehicle speed, average acceleration, acceleration ratio, deceleration ratio, constant-speed ratio, maximum acceleration and maximum deceleration;
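The segmentation and the 9 parameters can be sketched in Python with NumPy as below. The text does not give thresholds or exact definitions, so the idle threshold `eps`, the acceleration threshold `a_thr`, the 1 s sampling step and the choice to average acceleration over accelerating samples only are all assumptions.

```python
import numpy as np


def split_segments(v, eps=1e-3):
    """Cut a speed trace into segments running from one idle start to the next."""
    idle = v <= eps
    starts = [i for i in range(len(v))
              if idle[i] and (i == 0 or not idle[i - 1])]
    # The tail after the last idle start is an incomplete segment and is dropped.
    return [v[s:e] for s, e in zip(starts[:-1], starts[1:])]


def segment_features(v, dt=1.0, a_thr=0.1, eps=1e-3):
    """Return the 9 characteristic parameters of one segment (len(v) >= 2)."""
    a = np.diff(v) / dt
    moving = v > eps
    acc = a > a_thr
    dec = a < -a_thr
    cruise = moving[1:] & ~acc & ~dec
    return np.array([
        v.mean(),                                   # average speed
        v[moving].mean() if moving.any() else 0.0,  # average running speed
        v.max(),                                    # maximum speed
        a[acc].mean() if acc.any() else 0.0,        # average acceleration
        acc.mean(),                                 # acceleration ratio
        dec.mean(),                                 # deceleration ratio
        cruise.mean(),                              # constant-speed ratio
        a.max(),                                    # maximum acceleration
        a.min(),                                    # maximum deceleration
    ])
```

Stacking `segment_features` over all segments of the working condition library gives the $n \times 9$ data matrix used in the subsequent analysis.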
The characteristic parameters of each kinematic segment characterize that segment, but they are not independent of one another. The invention therefore uses principal component analysis to reduce the dimension of the segment characteristic parameters while covering the working condition characteristics as fully as possible, which reduces the classification difficulty and improves reliability. The specific procedure is as follows:
(1) Standardizing the data:
Figure GDA0003933969620000041
wherein $x_{ij}$ is the $j$-th characteristic parameter of the $i$-th kinematic segment; $\bar{x}_j$ is the sample mean; $s_j$ is the standard deviation; $i = 1, 2, 3, \ldots, n$; $j = 1, 2, 3, \ldots, m$.
(2) Calculating the covariance matrix C of the Z matrix
Figure GDA0003933969620000043
(3) Eigenvalue decomposition of covariance matrix C
$C = Q \Sigma Q^{-1}$ (6)
where $Q$ is the matrix formed by the eigenvectors, and $\Sigma$ is a diagonal matrix whose diagonal elements are the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$.
(4) Calculating the contribution rates $p_1, p_2, \ldots, p_m$ of the eigenvectors and the cumulative contribution rate, where
Figure GDA0003933969620000044
The cumulative contribution rate $P_j$ is the sum of the contribution rates of the leading principal components:
Figure GDA0003933969620000045
(5) Taking the eigenvectors corresponding to the principal components as the transformation matrix, and multiplying the data matrix by it to map the data onto the principal components, which gives the dimension-reduced characteristic parameters of each kinematic segment;
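Steps (1) to (5) can be sketched with NumPy as follows. The sketch assumes z-score standardization with the sample standard deviation, the sample covariance, and a cumulative contribution threshold of 0.85 for choosing the number of components; the patent text does not state a threshold, so that value is purely illustrative.

```python
import numpy as np


def pca_reduce(x, target=0.85):
    """Standardize x (n segments by m parameters) and project it onto the
    leading principal components whose cumulative contribution reaches target.

    Columns with zero variance are not handled in this sketch.
    """
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)     # step (1)
    c = np.cov(z, rowvar=False)                          # step (2)
    eigvals, eigvecs = np.linalg.eigh(c)                 # step (3), C symmetric
    order = np.argsort(eigvals)[::-1]                    # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    contrib = eigvals / eigvals.sum()                    # step (4)
    cumulative = np.cumsum(contrib)
    k = int(np.searchsorted(cumulative, target)) + 1
    k = min(k, x.shape[1])
    scores = z @ eigvecs[:, :k]                          # step (5)
    return scores, contrib, cumulative
```

`np.linalg.eigh` is used because the covariance matrix is symmetric, in which case $Q^{-1} = Q^T$ in equation (6).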
Then, fuzzy C-means clustering is used to cluster the kinematic segments according to the principal component results, as follows:
(1) Setting the number of clusters $n_c$ and the weighting exponent $b$;
(2) Initializing each cluster center $m_j$;
(3) Calculating membership functions of all samples under the current clustering center:
Figure GDA0003933969620000051
wherein $\mu_j(x_i)$ is the membership of the $i$-th sample in the $j$-th class.
(4) Calculating the cluster centers under the current membership function:
Figure GDA0003933969620000052
(5) Repeating steps (3) and (4) until the algorithm converges or the maximum number of iterations is reached.
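The loop in steps (1) to (5) can be sketched with NumPy as below. The membership and center updates shown are the standard fuzzy C-means formulas, which are assumed to match the equations of the patent (not recoverable from this text). The convergence tolerance, the random initialization and the optional `init_centers` argument are illustrative choices.

```python
import numpy as np


def fuzzy_c_means(x, n_c, b=2.0, max_iter=100, tol=1e-5,
                  init_centers=None, seed=0):
    """Cluster the rows of x into n_c fuzzy classes with weighting exponent b."""
    rng = np.random.default_rng(seed)
    if init_centers is None:
        # Step (2): start from n_c distinct samples chosen at random.
        centers = x[rng.choice(len(x), size=n_c, replace=False)].astype(float)
    else:
        centers = np.array(init_centers, dtype=float)
    for _ in range(max_iter):
        # Step (3): memberships from distances to the current centers.
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)              # avoid division by zero
        inv = d ** (-2.0 / (b - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # Step (4): centers as membership-weighted means.
        w = u ** b
        new_centers = (w.T @ x) / w.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:                       # step (5): convergence check
            break
    return centers, u
```

A hard class label per segment, if needed for building the segment libraries, can be taken as `u.argmax(axis=1)`.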
To determine the number of clusters $n_c$, the function $L(n_c)$ is used as an evaluation index, with the following formula:
Figure GDA0003933969620000053
in the formula, the numerator is the sum of the inter-class distances and the denominator is the sum of the intra-class distances, so a larger $L(n_c)$ indicates a better classification.
According to the fuzzy clustering result, the kinematic segments of each class are collected into a library of 3 segment classes; a certain number of segments are then randomly drawn from each class and randomly arranged to obtain 3 working conditions for training.
Finally, an LVQ neural network is trained to identify the working condition type on the 3 training working conditions, as follows:
(1) Working conditions 1, 2 and 3 are combined for training; a sliding window algorithm computes the 9 corresponding characteristic parameters from the window data as the input of the LVQ neural network, and the working condition category in vector form is used as the training label.
(2) If the window is too long, the window data may contain more than one type of working condition, which makes identification harder. If the window is too short, the working condition characteristic information is incomplete, which reduces the identification accuracy and hence the fuel economy of the vehicle. Weighing both, the method uses a window length of 35 s for rolling extraction of the working condition characteristic parameters.
(3) Training the LVQ neural network. The selected hyper-parameters are: 500 nodes in the LVQ competitive layer, a learning rate of 0.0005, the learning function learnlv1, and 50 training iterations.
(4) Verifying the accuracy of the LVQ neural network. A sliding window of length 35 s is applied to the verification working condition; the characteristic parameters extracted by rolling are used as the input of the trained LVQ neural network, and the output vector is converted to a class index to obtain the identification result for the verification working condition.
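The learning function learnlv1 named above corresponds to the LVQ1 rule: the prototype nearest to a sample moves toward it when their classes match and away from it otherwise. The NumPy sketch below implements that rule under simplifying assumptions; the prototype count, learning rate and initialization from class samples are illustrative and much smaller than the 500 nodes and 0.0005 learning rate quoted in the text.

```python
import numpy as np


def train_lvq1(x, y, n_classes, protos_per_class=2, lr=0.05,
               epochs=50, seed=0):
    """Train LVQ1 prototypes on features x (n by d) with integer labels y."""
    rng = np.random.default_rng(seed)
    protos, labels = [], []
    for c in range(n_classes):
        # Initialize each class's prototypes from samples of that class.
        idx = rng.choice(np.flatnonzero(y == c), size=protos_per_class,
                         replace=False)
        protos.append(x[idx])
        labels += [c] * protos_per_class
    w = np.vstack(protos).astype(float)
    w_lab = np.array(labels)
    for _ in range(epochs):
        for i in rng.permutation(len(x)):
            j = np.argmin(np.linalg.norm(w - x[i], axis=1))  # winning prototype
            sign = 1.0 if w_lab[j] == y[i] else -1.0
            w[j] += sign * lr * (x[i] - w[j])                # attract or repel
    return w, w_lab


def predict_lvq(w, w_lab, x):
    """Assign each row of x the class of its nearest prototype."""
    d = np.linalg.norm(x[:, None, :] - w[None, :, :], axis=2)
    return w_lab[d.argmin(axis=1)]
```

In the setting described above, each row of `x` would hold the 9 characteristic parameters computed over one 35 s window, and `y` the working condition class of that window.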
Furthermore, the design in step 3 comprises the state, the action, the agent and the penalty function. The state space consists of the required torque $T_r$, the battery SOC and the current transmission ratio of the transmission; the action variables are the engine output torque $T_e$ and the gear-shift action $A_g$. The rule-fused agent borrows the idea of rule-based energy distribution: the rules are fused into the deep Q-learning agent to obtain a rule-fused deep Q-learning algorithm, which increases the number of effective samples in the sample pool. A plug-in hybrid electric vehicle generally keeps the battery SOC within a certain working interval to protect the cycle life of the battery and to reserve a small amount of electric energy for special conditions, so the SOC is used as a rule control quantity, with its efficient working range set to 0.2-0.8; the torque of the power system is also used as a rule control quantity;
the penalty function calculation method comprises the following steps:
Figure GDA0003933969620000061
wherein $b$ is the fuel consumption rate, obtained from the engine universal characteristic map according to the current engine torque and speed; $\rho$ is the fuel density; $g$ is the gravitational acceleration; $C_f$ is the price per liter of fuel; $C_e$ is the price of electrical energy per kWh; $\lambda_A$ is the weighting factor of the shift action value; $\lambda_{p1}$ is the penalty factor under a poor shift strategy; $\lambda_{p2}$ is the penalty factor for the SOC exceeding its upper or lower usage limit.
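One common way to fuse rules into a deep Q-learning agent is to mask out actions that violate the rules before the epsilon-greedy choice, so that fewer poor samples enter the sample pool. The Python sketch below illustrates that idea only; the patent does not spell out its rule set here, so the specific rules (no battery discharge at or below SOC 0.2, no charging at or above SOC 0.8, a motor torque limit) and the assumption that the required torque is measured at the coupler output are hypothetical. The gear-shift action is omitted for brevity.

```python
import numpy as np

SOC_MIN, SOC_MAX = 0.2, 0.8   # SOC working range from the text
I_E, I_M = 1.0, 1.7368        # coupler ratios from the text


def rule_mask(t_req, soc, engine_torques, t_m_max):
    """Boolean mask of engine-torque actions allowed by the assumed rules."""
    # Motor torque needed so that i_e * T_e + i_m * T_m meets the demand.
    t_m = (t_req - I_E * engine_torques) / I_M
    ok = np.abs(t_m) <= t_m_max
    if soc <= SOC_MIN:
        ok &= t_m <= 0        # battery low: forbid discharging
    if soc >= SOC_MAX:
        ok &= t_m >= 0        # battery full: forbid charging
    return ok


def select_action(q_values, mask, epsilon, rng):
    """Epsilon-greedy choice restricted to rule-compliant actions."""
    allowed = np.flatnonzero(mask)
    if allowed.size == 0:
        return int(np.argmax(q_values))   # fallback when no action complies
    if rng.random() < epsilon:
        return int(rng.choice(allowed))   # explore within the allowed set
    return int(allowed[np.argmax(q_values[allowed])])
```

Here `q_values` stands for the output of the Q-network for the current state, one value per discretized engine torque.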
Compared with the prior art, the invention adopting the technical scheme has the following beneficial effects:
1. training the designed rule-fused deep reinforcement learning algorithm under three different training working conditions to obtain three deep neural networks net1, net2 and net3 suitable for different working condition categories for energy distribution of a hybrid power system;
2. in the actual use process, a sliding window algorithm is used firstly, 9 corresponding characteristic parameters in window data are calculated and used as input of a trained LVQ neural network to obtain the current working condition type, and then a rule-fused deep reinforcement learning algorithm under the training of the corresponding working condition type is used for distributing the energy of the hybrid power system, so that the purpose of efficient energy distribution and utilization is achieved.
Drawings
FIG. 1 is a block diagram of a plug-in hybrid powertrain system;
FIG. 2 is a battery Rint equivalent circuit model;
FIG. 3 is a flow chart of an energy management policy algorithm.
Detailed Description
The technical scheme of the invention is further explained in detail by combining the attached drawings:
the present invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Fig. 1 shows a plug-in hybrid power system structure diagram, which is composed of a fuel engine, an electric motor, a vehicle-mounted power battery, a fuel tank, a torque coupler, a clutch and a 5-gear transmission. The fuel engine is connected with the torque coupler, the motor is directly connected with one end of the torque coupler, the output end of the torque coupler is connected with the clutch, the other end of the clutch is connected with the 5-gear transmission, and then power is transmitted to the front axle to drive the vehicle to run. The vehicle model comprises a five-gear transmission, and gears
The torque required of the powertrain directly affects the power reserve capacity of the vehicle, so the powertrain torque is also used as a rule control quantity.
The engine of a plug-in hybrid electric vehicle is the power source both for driving the vehicle and for recharging the battery, and is therefore more important than the motor. Accordingly, the engine torque is used as the first-level rule control quantity, the battery SOC as the second-level rule control quantity, and the motor torque as the third-level rule control quantity, because the motor provides large torque and a strong power reserve.
From the battery Rint equivalent circuit model shown in FIG. 2, the following is obtained:
Figure GDA0003933969620000071
in the formula, $I$ is the battery current, positive when discharging and negative when charging; $U_{ocv}$ is the open-circuit voltage of the battery, obtained from an open-circuit voltage test; $R$ is the internal resistance of the battery, which varies with the SOC and is obtained from a lookup table; $P_{bat}$ is the battery power: when the motor torque $T_m$ is positive the battery is discharging, and when $T_m$ is negative the battery is charging; $n_m$ is the motor speed; $\eta_{bat-d}$ is the battery discharge efficiency; $\eta_{bat-c}$ is the battery charging efficiency; $\eta_m$ is the motor efficiency at the current speed and torque; SOC is the state of charge of the battery; $\Delta t$ is the sampling interval; $Q$ is the battery capacity.
FIG. 3 shows the flow chart of the energy management strategy algorithm.
First, principal component analysis is used to reduce the dimension of the characteristic parameters of the kinematic segments in the working conditions, fuzzy clustering classifies the segments, the working conditions are recombined according to the classification results to obtain low-speed, medium-speed and high-speed training working conditions, and an LVQ neural network is trained to identify the working condition type. Next, rules are established with the engine torque, the SOC and the motor torque as rule control quantities and the driving mode as output; the rules are fused into the deep reinforcement learning agent and, together with the designed penalty function, the rule-fused deep reinforcement learning energy management strategy is trained under the three working conditions. In actual use, a sliding window algorithm first extracts the characteristic parameters of the current working condition; these are fed to the trained LVQ neural network to obtain the current working condition category; the rule-fused deep reinforcement learning energy management strategy trained on the corresponding category is then selected to distribute the energy of the hybrid power system.
The above embodiments further describe the objects, technical solutions and advantages of the present invention in detail. It should be understood that they are only illustrative and are not intended to limit the present invention; any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within its protection scope.

Claims (4)

1. A rule fusion deep reinforcement learning energy management method based on working condition identification is characterized by comprising the following steps:
step 1, establishing a hybrid power system model: the model is established with a plug-in hybrid electric vehicle as the object and a parallel structure as the connection between the engine and the motor, wherein the plug-in hybrid power system consists of a fuel engine, a motor, a vehicle-mounted power battery, a fuel tank, a torque coupler, a clutch and a 5-speed gearbox; the fuel engine is connected to the torque coupler, the motor is directly connected to one end of the torque coupler, the output end of the torque coupler is connected to the clutch, and the other end of the clutch is connected to the 5-speed gearbox, which transmits power to the front axle to drive the vehicle;
the power battery adopts a Rint equivalent circuit model:
Figure FDA0003933969610000011
Figure FDA0003933969610000012
Figure FDA0003933969610000013
wherein $I$ is the battery current, positive when discharging and negative when charging; $U_{ocv}$ is the open-circuit voltage of the battery, obtained from an open-circuit voltage test; $R$ is the internal resistance of the battery, which varies with the SOC and is obtained from a lookup table; $P_{bat}$ is the battery power: when the motor torque $T_m$ is positive the battery is discharging, and when $T_m$ is negative the battery is charging; $n_m$ is the motor speed; $\eta_{bat-d}$ is the battery discharge efficiency; $\eta_{bat-c}$ is the battery charging efficiency; $\eta_m$ is the motor efficiency at the current speed and torque; SOC is the state of charge of the battery; $\Delta t$ is the sampling interval; $Q$ is the battery capacity;
the longitudinal running equation of the vehicle is that when the vertical motion and the operation stability of the vehicle are not considered:
Figure FDA0003933969610000014
wherein Tcon is the torque required by the current working condition; ig is the transmission ratio of the gearbox in the current gear; i0 is the transmission ratio of the final drive; η_T is the total transmission efficiency; r is the wheel radius; m is the mass of the whole vehicle; g is the acceleration of gravity; f is the rolling resistance coefficient; θ is the road slope angle; CD is the air resistance coefficient; A is the frontal area of the vehicle; u is the vehicle speed; δ is the rotating mass conversion coefficient;
the vehicle torque coupler adopts a three-port, two-degree-of-freedom mechanical configuration: port 1 is used for unidirectional power input, and ports 2 and 3 are used for bidirectional power input or output; port 1 is connected to the engine crankshaft, port 2 to the motor output shaft, and port 3 to the clutch input end;
the relationship between the torque and the rotating speed of each port of the torque coupler is as follows:
T_3 = i_e T_e + i_m T_m
Figure FDA0003933969610000021
in the formula, T_e is the engine torque; n_e is the engine speed; T_3 is the output torque of the coupler; n_3 is the output speed of the coupler; i_e is the transmission ratio at port 1, which is connected to the engine crankshaft, taken as i_e = 1; i_m is the transmission ratio at port 2, which is connected to the motor output shaft, taken as i_m = 1.7368;
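The powertrain relations of step 1 can be sketched in a few lines. The torque relation and the ratios i_e = 1 and i_m = 1.7368 come from the claim; the battery formulas appear only as images in this text, so the current and SOC expressions below are a commonly used textbook form of the Rint model and are assumptions, as are all function names and the capacity unit (ampere-seconds).

```python
import math

I_E = 1.0      # port 1 ratio, from the claim
I_M = 1.7368   # port 2 ratio, from the claim


def coupler_output_torque(t_e, t_m, i_e=I_E, i_m=I_M):
    """Coupler output torque T_3 = i_e*T_e + i_m*T_m (relation stated in the claim)."""
    return i_e * t_e + i_m * t_m


def motor_torque_for_demand(t_3, t_e, i_e=I_E, i_m=I_M):
    """Motor torque needed to reach output torque t_3 for a given engine torque."""
    return (t_3 - i_e * t_e) / i_m


def rint_current(p_bat, u_ocv, r):
    """Assumed Rint relation P_bat = U_ocv*I - I^2*R, solved for the smaller root.

    Positive current means discharging, negative means charging.
    """
    disc = u_ocv ** 2 - 4.0 * r * p_bat
    if disc < 0.0:
        raise ValueError("requested power exceeds what the battery can deliver")
    return (u_ocv - math.sqrt(disc)) / (2.0 * r)


def soc_step(soc, current, dt, capacity_as):
    """Assumed Coulomb-counting update; SOC falls while the battery discharges."""
    return soc - current * dt / capacity_as
```

A negative motor torque returned by `motor_torque_for_demand` corresponds to the charging case described for T_m in the claim.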
step 2, classifying and identifying working conditions: a kinematic segment of a vehicle working condition represents the driving state of the vehicle from the start of one idling period to the start of the next, and comprises an idling process and a driving process, wherein the vehicle is stationary during the idling process, and the driving process comprises multiple acceleration, constant-speed and deceleration behaviors of the vehicle; to establish the deep reinforcement learning training working conditions, 8 standard working conditions are selected to build a working condition library, the library is divided into kinematic segments, and the following 9 representative parameters are then calculated for each segment to characterize it: average vehicle speed, average running vehicle speed, maximum vehicle speed, average acceleration, acceleration ratio, deceleration ratio, constant speed ratio, maximum acceleration and maximum deceleration;
the characteristic parameters of each kinematic segment represent its characteristics, but they are not independent of one another and are correlated to some degree; principal component analysis is therefore used to reduce the dimension of the characteristic parameters, so as to cover all working condition characteristics as comprehensively as possible while reducing the classification difficulty and improving reliability;
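The segmentation and the 9 characteristic parameters described in step 2 could be computed roughly as follows. This is a minimal sketch: the idle and acceleration thresholds, the km/h speed unit, the 1 s sampling interval, and the reading of "average acceleration" as the mean of the positive accelerations are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def split_kinematic_segments(speed, idle_eps=0.1):
    """Split a speed trace into segments running from one idle start to the next.

    Samples before the first idle start are dropped, as are idle-only tails.
    """
    speed = np.asarray(speed, dtype=float)
    idle = speed <= idle_eps
    # An idle period starts where the vehicle is idle now but was moving before.
    starts = [i for i in range(len(speed)) if idle[i] and (i == 0 or not idle[i - 1])]
    segments = []
    for a, b in zip(starts, starts[1:] + [len(speed)]):
        if np.any(~idle[a:b]):  # keep only segments that contain driving
            segments.append(speed[a:b])
    return segments


def segment_features(speed_kmh, dt=1.0, idle_eps=0.1, acc_eps=0.1):
    """Return the 9 characteristic parameters of one kinematic segment."""
    v = np.asarray(speed_kmh, dtype=float)
    a = np.diff(v / 3.6) / dt            # acceleration in m/s^2
    moving = v > idle_eps
    accel = a > acc_eps
    decel = a < -acc_eps
    cruise = (~accel) & (~decel) & moving[1:]
    n = len(a)
    return np.array([
        v.mean(),                                    # average vehicle speed
        v[moving].mean() if moving.any() else 0.0,   # average running speed
        v.max(),                                     # maximum vehicle speed
        a[accel].mean() if accel.any() else 0.0,     # average acceleration
        accel.sum() / n,                             # acceleration ratio
        decel.sum() / n,                             # deceleration ratio
        cruise.sum() / n,                            # constant speed ratio
        a.max(),                                     # maximum acceleration
        a.min(),                                     # maximum deceleration
    ])
```

Stacking the feature vectors of all segments row by row gives the n × 9 data matrix that the principal component analysis operates on.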
step 3, designing a rule-fused deep reinforcement learning energy management strategy, wherein the design comprises a state, an action, an agent and a penalty function; the state space is selected as the required torque Tr, the battery SOC and the current transmission ratio of the transmission; the action variables are selected as the engine output torque Te and the gear shifting action Ag; the rule-fused agent performs energy management based on the idea of a rule-based algorithm, yielding a rule-fused deep Q-learning algorithm that increases the number of effective samples in the sample pool; the SOC is used as a rule control quantity, with its efficient working range set to 0.2-0.8; and the torque of the power system is also used as a rule control quantity;
the penalty function is calculated as follows:
Figure FDA0003933969610000022
wherein b is the fuel consumption rate, which can be obtained from the engine universal characteristic map according to the current engine torque and speed; ρ is the fuel density; g is the acceleration of gravity; Cf is the fuel price per liter; Ce is the price of electrical energy per kWh; λ_A is the weighting factor of the shift action value; λ_p1 is the penalty factor under a poor shift strategy; λ_p2 is the penalty factor applied when the SOC exceeds its upper or lower usage limit.
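One way to picture how a rule can be fused into a deep Q-learning agent is to mask out actions that violate the rule before the agent picks one. The claim does not spell out the rules, so everything below is a hypothetical illustration: it treats the required torque as the coupler output torque, uses only the 0.2-0.8 SOC range and the coupler ratios from claim 1, and the function names and the epsilon-greedy selection are assumptions.

```python
import numpy as np

I_E, I_M = 1.0, 1.7368        # coupler ratios from claim 1
SOC_MIN, SOC_MAX = 0.2, 0.8   # efficient SOC range from claim 1


def allowed_engine_torques(t_req, soc, te_candidates):
    """Boolean mask over candidate engine torques (hypothetical rule set)."""
    te = np.asarray(te_candidates, dtype=float)
    t_m = (t_req - I_E * te) / I_M   # motor torque implied by each candidate
    mask = np.ones(te.shape, dtype=bool)
    if soc <= SOC_MIN:
        mask &= t_m <= 0.0           # battery low: no further discharging
    if soc >= SOC_MAX:
        mask &= t_m >= 0.0           # battery full: no further charging
    return mask


def select_action(q_values, mask, epsilon, rng):
    """Epsilon-greedy choice restricted to rule-compliant actions."""
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        raise ValueError("no action satisfies the rules")
    if rng.random() < epsilon:
        return int(rng.choice(idx))  # explore, but only among allowed actions
    q = np.where(mask, np.asarray(q_values, dtype=float), -np.inf)
    return int(np.argmax(q))
```

Because exploration never leaves the allowed set, every transition stored in the sample pool respects the SOC range, which is one plausible reading of how the rules raise the share of effective samples.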
2. The rule fusion deep reinforcement learning energy management method based on working condition identification as claimed in claim 1, wherein in step 1 the operation of the system is divided into 3 driving modes according to the energy flow directions of the engine and the motor in the torque coupler:
(1) A combined driving mode: in the mode, the port 1 and the port 2 are power input ends, the port 3 is a power output end, the engine and the motor jointly provide power to drive the vehicle to run, the motor torque Tm is positive, and the battery is in a discharging state;
(2) Pure electric drive mode: in the mode, the port 1 has no power input, the port 2 is a power input end, the motor drives the vehicle independently, the motor torque Tm is positive, the battery is in a discharge state, and the engine is stopped;
(3) Motor charging mode: in this mode the motor of the vehicle operates as a generator and the motor torque Tm is negative; depending on the driving state of the vehicle, charging is divided into charging while driving and charging while parked; when charging while driving, the clutch is engaged, port 1 is the power input end, ports 2 and 3 are power output ends, and the engine drives the generator while providing power to drive the vehicle, the battery being in a charging state; when charging while parked, port 1 is the power input end, port 2 is the power output end, port 3 has no power output, and the clutch is disengaged, which reduces the mechanical loss caused by the gearbox and the front axle, the engine only powering the generator to charge the battery.
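The three modes can be told apart from the signs of the engine and motor torques, as in the minimal sketch below. The enum, the function name and the catch-all `OTHER` case (for example engine-only driving, which the claim does not list) are assumptions added for illustration.

```python
from enum import Enum


class Mode(Enum):
    COMBINED = "combined drive"
    ELECTRIC = "pure electric drive"
    CHARGE_DRIVING = "charging while driving"
    CHARGE_PARKED = "charging while parked"
    OTHER = "not covered by the claim"


# Clutch state in the two charging sub-modes, as described in the claim.
CLUTCH_ENGAGED = {Mode.CHARGE_DRIVING: True, Mode.CHARGE_PARKED: False}


def classify_mode(t_e, t_m, vehicle_moving):
    """Map engine torque, motor torque and vehicle state to a driving mode."""
    if t_m > 0 and t_e > 0:
        return Mode.COMBINED          # ports 1 and 2 both feed port 3
    if t_m > 0 and t_e == 0:
        return Mode.ELECTRIC          # engine stopped, motor drives alone
    if t_m < 0 and t_e > 0:
        # The motor acts as a generator; the sub-mode depends on vehicle motion.
        return Mode.CHARGE_DRIVING if vehicle_moving else Mode.CHARGE_PARKED
    return Mode.OTHER
```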
3. The rule fusion deep reinforcement learning energy management method based on working condition identification as claimed in claim 1, wherein the specific implementation process in step 2 is as follows:
(1) Normalizing the data:
Figure FDA0003933969610000031
wherein x_ij is the j-th characteristic parameter of the i-th kinematic segment;
Figure FDA0003933969610000032
is the sample mean; s_j is the standard deviation; i = 1, 2, 3, …, n; j = 1, 2, 3, …, m;
(2) Calculating the covariance matrix C of the Z matrix
Figure FDA0003933969610000033
(3) Eigenvalue decomposition of covariance matrix C
C = Q Σ Q^(-1)
wherein Q is the matrix formed by the eigenvectors, Σ is a diagonal matrix, and the elements on its diagonal are the eigenvalues λ_1, λ_2, …, λ_m;
(4) Calculating the contribution rates p_1, p_2, …, p_m of the eigenvectors and the cumulative contribution rate;
wherein,
Figure FDA0003933969610000034
the cumulative contribution rate P_j is the accumulated contribution rate of the first k principal components;
Figure FDA0003933969610000035
(5) Taking the eigenvectors corresponding to the principal components as the transformation matrix, and multiplying the data matrix by the transformation matrix to realize the principal component mapping, obtaining the dimension-reduced characteristic parameters of the kinematic segments;
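Steps (1) to (5) can be sketched with NumPy as follows. The formula images are not reproduced in this text, so the sample covariance with n − 1 in the denominator, the contribution rate λ_i / Σλ, and the 0.85 cumulative threshold for choosing the number of components are assumptions; the function name is hypothetical.

```python
import numpy as np


def pca_reduce(x, cum_threshold=0.85):
    """Standardize, eigen-decompose the covariance and project onto the leading PCs.

    x has one kinematic segment per row and one characteristic parameter per
    column; constant columns must be removed first (their s_j would be zero).
    """
    x = np.asarray(x, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)   # (1) standardization
    c = np.cov(z, rowvar=False)                        # (2) covariance of Z
    eigvals, eigvecs = np.linalg.eigh(c)               # (3) C is symmetric
    order = np.argsort(eigvals)[::-1]                  # sort by descending eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    contrib = eigvals / eigvals.sum()                  # (4) contribution rates
    cum = np.cumsum(contrib)
    # Smallest k whose cumulative contribution reaches the threshold.
    k = min(int(np.searchsorted(cum, cum_threshold)) + 1, len(eigvals))
    scores = z @ eigvecs[:, :k]                        # (5) principal component mapping
    return scores, contrib, cum
```

The rows of `scores` are the dimension-reduced segment descriptions that a clustering step can then work on.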
then, fuzzy C-means clustering is used to perform cluster analysis on the kinematic segments according to the obtained principal component results, with the following flow:
(1) Setting the number of clusters n_c and the weighting exponent b;
(2) Initializing each cluster center m_j;
(3) Calculating membership functions of all samples under the current clustering center:
Figure FDA0003933969610000041
wherein μ_j(x_i) is the membership of the i-th sample in the j-th class;
(4) Calculating the cluster centers under the current membership function:
Figure FDA0003933969610000042
(5) Repeating steps (3) and (4) until the algorithm converges or the maximum number of iterations is reached;
to determine the number of clusters n_c, the function L(n_c) is used as an evaluation index, given by:
Figure FDA0003933969610000043
wherein the numerator represents the sum of the inter-class distances and the denominator represents the sum of the intra-class distances, so the larger the value of L(n_c), the better the classification;
according to the fuzzy clustering result, the kinematic segments of each class are combined into 3 classes of kinematic segment libraries; a certain number of kinematic segments are then randomly drawn from each of the 3 libraries and randomly arranged to obtain 3 training working conditions.
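The clustering loop of steps (1) to (5) can be sketched as below. Since the membership and center formulas appear only as images, the code uses the standard fuzzy C-means update equations with squared Euclidean distance, which is an assumption; the initialization from randomly chosen samples, the tolerance and the function name are also assumptions. The evaluation index L(n_c) is not sketched because its formula is not recoverable from this text.

```python
import numpy as np


def fuzzy_c_means(x, n_c, b=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means; returns cluster centers and the membership matrix."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # (2) initialize the centers from randomly chosen samples
    centers = x[rng.choice(len(x), size=n_c, replace=False)]
    for _ in range(max_iter):
        # squared distance of every sample to every center, shape (n, n_c)
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)                     # avoid division by zero
        # (3) memberships: u_ij proportional to d_ij^(-2/(b-1)), rows sum to 1
        inv = d2 ** (-1.0 / (b - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # (4) centers: mean of the samples weighted by u^b
        w = u ** b
        new_centers = (w.T @ x) / w.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:                                # (5) convergence check
            break
    return centers, u
```

Taking `u.argmax(axis=1)` assigns each kinematic segment to the class in which its membership is highest.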
4. The rule fusion deep reinforcement learning energy management method based on working condition identification as claimed in claim 3, wherein step 2 further comprises training an LVQ neural network to identify the working condition class under the 3 training working conditions, with the following specific steps:
(1) Combining training working conditions 1, 2 and 3, calculating the 9 corresponding characteristic parameters of the window data with a sliding window algorithm as the input of the LVQ neural network, and training with the vector form of the working condition class as the label;
(2) If the window is too long, the window data may contain more than one class of working condition, which increases the identification difficulty; if the window is too short, the working condition characteristic information is incomplete, which reduces the identification accuracy and hence the fuel economy of the whole vehicle; the working condition characteristic parameters are therefore extracted in a rolling manner with a window length of 35 s;
(3) Training the LVQ neural network, wherein the selected hyper-parameters are as follows: the number of nodes in the LVQ competitive layer is 500, the learning rate is 0.0005, the learning function is Learnlv1, and the number of training iterations is 50;
(4) Verifying the accuracy of the LVQ neural network: performing a 35 s sliding window operation on the verification working condition, taking the characteristic parameters extracted in a rolling manner as the input of the trained LVQ neural network, and converting the output vector to a class index to obtain the identification result for the verification working condition.
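The window extraction and the LVQ training can be sketched in plain NumPy. The hyper-parameter defaults (500 competitive nodes, learning rate 0.0005, 50 iterations, 35 s window) come from the claim; the LVQ1 update rule is the standard one associated with the name Learnlv1, while the 1 Hz sampling, the even split of prototypes over classes, the seeding of prototypes from class samples and all function names are assumptions.

```python
import numpy as np


def sliding_windows(speed, window=35, step=1):
    """Rolling windows over a speed trace (35 samples = 35 s at an assumed 1 Hz)."""
    speed = np.asarray(speed, dtype=float)
    return [speed[s:s + window] for s in range(0, len(speed) - window + 1, step)]


def train_lvq1(x, y, n_prototypes=500, lr=0.0005, epochs=50, seed=0):
    """LVQ1: pull the winning prototype toward same-class samples, push it away otherwise.

    n_prototypes must be at least the number of classes.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    # Assumption: prototypes are shared evenly over classes and seeded from class samples.
    proto_labels = np.array([classes[i % len(classes)] for i in range(n_prototypes)])
    protos = np.array([x[rng.choice(np.flatnonzero(y == c))] for c in proto_labels])
    for _ in range(epochs):
        for i in rng.permutation(len(x)):
            winner = np.argmin(((protos - x[i]) ** 2).sum(axis=1))
            sign = 1.0 if proto_labels[winner] == y[i] else -1.0
            protos[winner] += sign * lr * (x[i] - protos[winner])
    return protos, proto_labels


def predict_lvq(protos, proto_labels, x):
    """Class of the nearest prototype for every row of x."""
    x = np.asarray(x, dtype=float)
    d2 = ((x[:, None, :] - protos[None, :, :]) ** 2).sum(axis=2)
    return proto_labels[np.argmin(d2, axis=1)]
```

In use, each window would first be turned into its 9 characteristic parameters, and the resulting feature rows passed to `train_lvq1` and `predict_lvq`.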
CN202111177978.1A 2021-10-09 2021-10-09 Rule fusion deep reinforcement learning energy management method based on working condition identification Active CN113715805B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111177978.1A CN113715805B (en) 2021-10-09 2021-10-09 Rule fusion deep reinforcement learning energy management method based on working condition identification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111177978.1A CN113715805B (en) 2021-10-09 2021-10-09 Rule fusion deep reinforcement learning energy management method based on working condition identification

Publications (2)

Publication Number Publication Date
CN113715805A CN113715805A (en) 2021-11-30
CN113715805B CN113715805B (en) 2023-01-06

Family

ID=78685752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111177978.1A Active CN113715805B (en) 2021-10-09 2021-10-09 Rule fusion deep reinforcement learning energy management method based on working condition identification

Country Status (1)

Country Link
CN (1) CN113715805B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116821775A (en) * 2023-08-29 2023-09-29 陕西重型汽车有限公司 Load estimation method based on machine learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104071161A (en) * 2014-04-29 2014-10-01 福州大学 Method for distinguishing working conditions and managing and controlling energy of plug-in hybrid electric vehicle
CN110929920A (en) * 2019-11-05 2020-03-27 中车戚墅堰机车有限公司 Hybrid power train energy management method based on working condition identification
CN112035949A (en) * 2020-08-14 2020-12-04 浙大宁波理工学院 Real-time fuzzy energy management method combined with Q reinforcement learning
CN112116156A (en) * 2020-09-18 2020-12-22 中南大学 Hybrid train energy management method and system based on deep reinforcement learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7954579B2 (en) * 2008-02-04 2011-06-07 Illinois Institute Of Technology Adaptive control strategy and method for optimizing hybrid electric vehicles

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant