CN112766310B - Fuel-saving lane-changing decision-making method and system - Google Patents


Info

Publication number
CN112766310B
CN112766310B
Authority
CN
China
Prior art keywords
decision
data
unit
vehicle
inputting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011613625.7A
Other languages
Chinese (zh)
Other versions
CN112766310A (en)
Inventor
王大维
高令平
李伟
杨睿刚
Current Assignee
Inceptio Star Intelligent Technology Shanghai Co Ltd
Original Assignee
Inceptio Star Intelligent Technology Shanghai Co Ltd
Priority date
Filing date
Publication date
Application filed by Inceptio Star Intelligent Technology Shanghai Co Ltd
Priority to CN202011613625.7A
Publication of CN112766310A
Application granted
Publication of CN112766310B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes
    • G06F18/24137Distances to cluster centroïds
    • G06F18/2414Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods


Abstract

The invention provides a fuel-saving lane-changing decision method and system. The method comprises: acquiring traffic flow data at the current moment and surrounding-vehicle road condition data within a set time range before the current moment; inputting the traffic flow data and the surrounding-vehicle road condition data into a fuel-saving lane-changing decision model, and outputting a decision result; and executing a lane change decision according to the decision result. The fuel-saving lane-changing decision model is trained on historical surrounding-vehicle trajectory sample data, predicted surrounding-vehicle trajectory sample data, and road condition sample data, so that the model outputs a decision result based on traffic flow data and surrounding-vehicle road condition data, making the output decision more accurate and reasonable.

Description

Fuel-saving lane-changing decision-making method and system
Technical Field
The invention relates to the technical field of autonomous driving, and in particular to a fuel-saving lane-changing decision method and system.
Background
The modules in a traditional autonomous driving system framework are decoupled and are generally divided into: perception, surrounding-vehicle trajectory prediction, driving decision making, trajectory planning, and vehicle control. The output of each module is the input of the next, and the data flow is passed backwards in a single direction.
At present, the mature and safe modes of driving decision making (lane change decision) are mostly driver-issued commands (turn-signal stalk lane changes) or rule-based lane change decisions. However, in high-level autonomous driving, in particular L3 and higher systems, a stalk-triggered lane change is no longer appropriate because the driver no longer intervenes. A rule-based lane change strategy requires system developers to preset lane change decision rules, and such rules are often difficult to design completely in the face of ever-changing road and traffic conditions.
Disclosure of Invention
The invention provides a fuel-saving lane-changing decision method and system to remedy the above technical defects in the prior art.
The invention provides a fuel-saving lane-changing decision method, comprising:
acquiring traffic flow data at the current moment and surrounding-vehicle road condition data within a set time range before the current moment;
inputting the traffic flow data and the surrounding-vehicle road condition data into a fuel-saving lane-changing decision model, and outputting a decision result;
executing a lane change decision according to the decision result;
wherein the fuel-saving lane-changing decision model is trained on historical surrounding-vehicle trajectory sample data, predicted surrounding-vehicle trajectory sample data, and road condition sample data.
According to the fuel-saving lane-changing decision method provided by the invention, the fuel-saving lane-changing decision model comprises an encoding unit, and a surrounding-vehicle trajectory prediction unit and a decision unit each connected to the encoding unit;
the training method of the fuel-saving lane-changing decision model comprises:
inputting the historical sample data and the predicted sample data of the surrounding-vehicle trajectory into a supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit, so as to adjust the parameters of the encoding unit and the surrounding-vehicle trajectory prediction unit;
inputting the historical surrounding-vehicle trajectory sample data and the road condition sample data into the encoding unit to generate a first encoding vector, inputting the first encoding vector into the decision unit, outputting a predicted decision value, and updating the parameters of the decision unit;
and alternately training the supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit and a reinforcement learning network formed by the encoding unit and the decision unit, to obtain the trained fuel-saving lane-changing decision model.
According to the fuel-saving lane-changing decision method provided by the invention, inputting the historical sample data and the predicted sample data of the surrounding-vehicle trajectory into the supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit to adjust their parameters comprises:
inputting the historical surrounding-vehicle trajectory sample data into the encoding unit, and outputting a second encoding vector;
and inputting the second encoding vector into the surrounding-vehicle trajectory prediction unit, outputting initial surrounding-vehicle trajectory predictions, and adjusting the parameters of the encoding unit and the surrounding-vehicle trajectory prediction unit based on the error between the initial trajectory predictions and the predicted trajectory sample data.
According to the fuel-saving lane-changing decision method provided by the invention, the surrounding-vehicle trajectory prediction unit comprises a recurrent network layer and a first fully connected layer;
inputting the second encoding vector into the surrounding-vehicle trajectory prediction unit and outputting initial surrounding-vehicle trajectory predictions comprises:
inputting the second encoding vector into the recurrent network layer, and outputting a first intermediate vector;
and inputting the first intermediate vector into the first fully connected layer, and outputting the initial surrounding-vehicle trajectory predictions.
According to the fuel-saving lane-changing decision method provided by the invention, inputting the first encoding vector into the decision unit, outputting a predicted decision value, and updating the parameters of the decision unit comprises:
inputting the first encoding vector into the decision unit, outputting a predicted decision value, and determining a corresponding reward value according to a reward function, wherein the reward function comprises instantaneous fuel consumption, total fuel consumption rate, and a lane change penalty;
and adjusting the parameters of the decision unit based on the reward value.
According to the fuel-saving lane-changing decision method provided by the invention, the fuel-saving lane-changing decision model comprises an encoding unit and a decision unit;
inputting the traffic flow data and the surrounding-vehicle road condition data into the fuel-saving lane-changing decision model and outputting a decision result comprises:
inputting the traffic flow data and the surrounding-vehicle road condition data into the encoding unit to obtain a third encoding vector;
and inputting the third encoding vector into the decision unit, outputting three decision values and corresponding probability values, and taking the decision value with the highest probability as the decision result, wherein the decision values comprise a right lane change, a left lane change, and going straight.
According to the fuel-saving lane-changing decision method provided by the invention, the decision unit comprises a second fully connected layer and a third fully connected layer;
inputting the third encoding vector into the decision unit and outputting three decision values and corresponding probability values comprises:
inputting the third encoding vector into the second fully connected layer, and outputting a second intermediate vector;
and inputting the second intermediate vector into the third fully connected layer, and outputting the three decision values and corresponding probability values.
The invention also provides a fuel-saving lane-changing decision system, comprising:
a lane change data acquisition module, configured to acquire traffic flow data at the current moment and surrounding-vehicle road condition data within a set time range before the current moment;
a decision result output module, configured to input the traffic flow data and the surrounding-vehicle road condition data into the fuel-saving lane-changing decision model and output a decision result;
and a lane change decision execution module, configured to execute a lane change decision according to the decision result;
wherein the fuel-saving lane-changing decision model is trained on historical surrounding-vehicle trajectory sample data, predicted surrounding-vehicle trajectory sample data, and road condition sample data.
The invention also provides an electronic device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the program, implements the steps of the fuel-saving lane-changing decision method as described above.
The invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the fuel-saving lane-changing decision method as described in any of the above.
According to the fuel-saving lane-changing decision method and system provided by the invention, the traffic flow data and the surrounding-vehicle road condition data are input into the fuel-saving lane-changing decision model, a decision result is output, and a lane change decision is then executed according to the decision result, so that the model bases its decision on both traffic flow data and surrounding-vehicle road condition data, making the output decision more accurate and reasonable.
In addition, the fuel-saving lane-changing decision model fuses the surrounding-vehicle trajectory prediction task and the lane change decision task, using surrounding-vehicle trajectory prediction as a subtask to supervise the update of part of the weights in the model, which reduces the difficulty of training the decision model and improves its convergence.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or of the prior art, the drawings needed for the description of the embodiments or of the prior art are briefly introduced below. It is obvious that the following drawings show some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a fuel-saving lane-changing decision method provided by the present invention;
FIG. 2 is a schematic structural diagram of a fuel-saving lane-changing decision model provided by the invention;
FIG. 3 is a schematic flow chart of a training method of the fuel-saving lane-changing decision model provided by the invention;
FIG. 4 is a second flowchart of the fuel-saving lane-changing decision-making method according to the present invention;
FIG. 5 is a schematic structural diagram of the fuel-saving lane-changing decision system provided by the present invention;
FIG. 6 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
First, terms used in this embodiment are explained.
Traffic flow data: motion data of the surrounding vehicles, for example their speed and position data.
Surrounding-vehicle road condition data: road information about the surrounding driving environment, including road traffic data, intersection traffic data, road intersection information, and the like.
Long Short-Term Memory network (LSTM): a special type of recurrent neural network capable of learning long-term dependencies.
Fully connected layer: a layer that integrates the class-discriminative local information in the data it receives.
The embodiment of the invention discloses a fuel-saving lane-changing decision method which, referring to FIG. 1, comprises steps 101 to 103:
Step 101, acquiring traffic flow data at the current moment and surrounding-vehicle road condition data within a set time range before the current moment.
The traffic flow data includes the speed and position data of the surrounding vehicles.
The set time range may be chosen according to actual requirements; for example, if it is set to 5 seconds, this step acquires the surrounding-vehicle road condition data from the 5 seconds before the current moment.
A decision result is generated in the subsequent steps based on the traffic flow data and the surrounding-vehicle road condition data.
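In practice, the set time range amounts to keeping a rolling buffer of recent surrounding-vehicle observations. A minimal sketch in plain Python, using the 5-second window from the example above; the 10 Hz sampling rate and the record format are illustrative assumptions, not from the patent:

```python
from collections import deque

# Rolling buffer of the last few seconds of surrounding-vehicle road
# condition data. WINDOW_S matches the 5 s example in the text; the
# sampling rate and record fields are assumed for illustration.
WINDOW_S, RATE_HZ = 5, 10
buffer = deque(maxlen=WINDOW_S * RATE_HZ)  # oldest samples drop off automatically

def on_sample(record):
    """Append one observation, e.g. {'speed': ..., 'position': ...}."""
    buffer.append(record)

for t in range(100):  # feed 10 s of fake samples
    on_sample({"t": t, "speed": 20.0, "position": (t * 2.0, 0.0)})

# Only the most recent 5 s (50 samples) remain as the model input window.
```

The `deque` with `maxlen` keeps the window bounded without explicit pruning logic.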
Step 102, inputting the traffic flow data and the surrounding-vehicle road condition data into the fuel-saving lane-changing decision model, and outputting a decision result.
The fuel-saving lane-changing decision model is trained on historical surrounding-vehicle trajectory sample data, predicted surrounding-vehicle trajectory sample data, and road condition sample data.
The training process of the fuel-saving lane-changing decision model in this embodiment is explained with reference to FIG. 2, which shows the model of this embodiment: it comprises an encoding unit, a surrounding-vehicle trajectory prediction unit, and a decision unit, where the trajectory prediction unit and the decision unit are each connected to the encoding unit.
The encoding unit and the surrounding-vehicle trajectory prediction unit form a supervised learning network, and the encoding unit and the decision unit form a reinforcement learning network.
When outputting a decision result, only the reinforcement learning network is used. Specifically, step 102 comprises: inputting the traffic flow data and the surrounding-vehicle road condition data into the encoding unit to obtain an encoding vector; and inputting the encoding vector into the decision unit, outputting three decision values and corresponding probability values, and taking the decision value with the highest probability as the decision result, where the decision values are a right lane change, a left lane change, and going straight.
In this embodiment, referring to FIG. 2, the encoding unit is implemented as two LSTM network layers and one fully connected layer connected in sequence; by passing the traffic flow data and the surrounding-vehicle road condition data through these three layers, it outputs an encoding vector that fuses the traffic flow information and the surrounding-vehicle road condition information.
Also referring to FIG. 2, the decision unit is implemented as two fully connected layers connected in sequence; by passing the encoding vector through these two layers, it outputs the three decision values and their probability values.
The decision value with the highest probability is taken as the decision result. For example, if the current output is 0.31 for a left lane change, 0.45 for a right lane change, and 0.24 for going straight, the right lane change is taken as the decision result.
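The selection step can be sketched in a few lines of plain Python; the probability values are the illustrative ones from the example above, not real model output:

```python
# Pick the decision whose probability value is highest, as described in
# the text. The three-way output order (left / right / straight) is an
# assumed convention for illustration.
DECISIONS = ("left lane change", "right lane change", "straight")

def pick_decision(probs):
    """Return the decision corresponding to the largest probability."""
    best = max(range(len(DECISIONS)), key=lambda i: probs[i])
    return DECISIONS[best]

result = pick_decision([0.31, 0.45, 0.24])
print(result)  # right lane change
```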
Step 103, executing a lane change decision according to the decision result.
In this embodiment, the decision result is one of a left lane change, a right lane change, and going straight. According to the final decision result, the autonomous vehicle executes the corresponding lane change; for a left lane change, for example, this includes turning the steering wheel to the left and adjusting the brake pedal and throttle angles.
In the fuel-saving lane-changing decision method provided by this embodiment, the traffic flow data and the surrounding-vehicle road condition data are input into the fuel-saving lane-changing decision model, a decision result is output, and the lane change decision is then executed, so that the model bases its decision on both traffic flow data and surrounding-vehicle road condition data, making the output decision more accurate and reasonable.
To make the model's decisions more reasonable and better informed, several seconds of surrounding-vehicle and road condition history are added to the model's input. Processing data with such temporal structure requires a dedicated network architecture; the Long Short-Term Memory network (LSTM) is widely used in trajectory prediction, natural language processing, and similar fields, but it needs a strong supervision signal, without which it is difficult to train. This embodiment therefore adopts a multi-task training structure, adding the surrounding-vehicle trajectory prediction task as an auxiliary training task. This effectively alleviates the difficulty of training an LSTM with reinforcement learning alone, makes convergence easier, and yields better results, as the following embodiment shows.
The embodiment of the invention also discloses a training method for the fuel-saving lane-changing decision model which, referring to FIG. 2 and FIG. 3, comprises the following three steps:
Step 301, inputting the historical sample data and the predicted sample data of the surrounding-vehicle trajectory into the supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit, so as to adjust the parameters of the encoding unit and the surrounding-vehicle trajectory prediction unit.
The purpose of step 301 is to make the encoding unit extract historical features of the surrounding-vehicle trajectory that can accurately predict the future trajectory of the surrounding vehicles.
Specifically, step 301 comprises steps S311 to S312:
S311, inputting the historical surrounding-vehicle trajectory sample data into the encoding unit, and outputting a second encoding vector.
Specifically, the encoding unit comprises two LSTM network layers and one fully connected (FC) layer connected in sequence, and step S311 comprises: passing the historical surrounding-vehicle trajectory sample data through the two LSTM network layers and the fully connected layer in sequence, and outputting the second encoding vector.
The LSTM network is used to extract temporal information; other recurrent networks, for example a Gated Recurrent Unit (GRU) network, may be used instead.
S312, inputting the second encoding vector into the surrounding-vehicle trajectory prediction unit, outputting initial surrounding-vehicle trajectory predictions, and adjusting the parameters of the encoding unit and the trajectory prediction unit based on the error between the initial predictions and the predicted trajectory sample data.
Specifically, the surrounding-vehicle trajectory prediction unit comprises a recurrent network layer and a first fully connected layer;
step S312 comprises: inputting the second encoding vector into the recurrent network layer, and outputting a first intermediate vector; and inputting the first intermediate vector into the first fully connected layer, and outputting the initial surrounding-vehicle trajectory predictions.
Through steps S311 to S312, the supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit is trained.
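The trajectory prediction unit (recurrent layer followed by a fully connected layer) can be sketched numerically. In this hypothetical NumPy sketch, a vanilla RNN cell stands in for the LSTM/GRU named in the text, and all dimensions, weights, and the prediction horizon are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
CODE_DIM, HID, HORIZON = 32, 16, 10   # encoding size, hidden state, steps ahead

# Randomly initialized stand-in weights (a real unit would learn these
# from the error against the predicted-trajectory sample data).
W_in = rng.normal(0, 0.1, (HID, CODE_DIM))
W_h = rng.normal(0, 0.1, (HID, HID))
W_out = rng.normal(0, 0.1, (2, HID))  # fully connected layer: state -> (x, y)

def predict_trajectory(code):
    """Unroll the recurrent layer HORIZON steps from the encoding vector,
    mapping each hidden state through the FC layer to a predicted position."""
    h = np.zeros(HID)
    traj = []
    for _ in range(HORIZON):
        h = np.tanh(W_in @ code + W_h @ h)  # recurrent network layer
        traj.append(W_out @ h)              # first fully connected layer
    return np.stack(traj)                   # (HORIZON, 2) predicted (x, y)

second_encoding_vector = rng.normal(size=CODE_DIM)
pred = predict_trajectory(second_encoding_vector)
```

Training would then backpropagate the error between `pred` and the predicted-trajectory sample data through both the prediction unit and the encoding unit.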
Step 302, inputting the historical surrounding-vehicle trajectory sample data and the road condition sample data into the encoding unit to generate a first encoding vector, inputting the first encoding vector into the decision unit, outputting a predicted decision value, and updating the parameters of the decision unit.
Specifically, in step 302, inputting the first encoding vector into the decision unit, outputting a predicted decision value, and updating the parameters of the decision unit comprises:
S321, inputting the first encoding vector into the decision unit, outputting a predicted decision value, and determining a corresponding reward value according to a reward function, wherein the reward function comprises instantaneous fuel consumption, total fuel consumption rate, and a lane change penalty.
The instantaneous fuel consumption is calculated from the engine state and provides immediate feedback on each model decision; the total fuel consumption rate is the total fuel consumption over the whole journey and evaluates the fuel use of the trip as a whole; and the lane change penalty applies a small penalty to every lane change decision so that the model does not change lanes too frequently. The coefficients weighting these three terms of the reward function are tuned experimentally to their optimum.
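A minimal sketch of such a three-term reward function follows. The weights and the sign convention (lower fuel use means higher reward) are assumptions to be tuned experimentally, as the text notes:

```python
# Hypothetical reward combining the three terms from the text:
# instantaneous fuel consumption, total fuel consumption rate, and a
# small penalty per lane change. Coefficients are illustrative only.
W_INSTANT, W_TOTAL, W_LANE = 1.0, 0.5, 0.2

def reward(instant_fuel, total_fuel_rate, changed_lane):
    """Higher reward for lower fuel use; small penalty per lane change."""
    r = -W_INSTANT * instant_fuel - W_TOTAL * total_fuel_rate
    if changed_lane:
        r -= W_LANE  # discourages frequent lane changes
    return r

# A lane change only pays off if it saves enough fuel to offset the penalty:
stay = reward(instant_fuel=0.9, total_fuel_rate=0.6, changed_lane=False)
change = reward(instant_fuel=0.7, total_fuel_rate=0.5, changed_lane=True)
```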
S322, adjusting the parameters of the decision unit based on the reward value.
It should be noted that supervised learning and reinforcement learning both learn a mapping from inputs to outputs, but in different ways: a supervised learning network is told which output corresponds to which input, while a reinforcement learning network only receives feedback, through the reward value, on how good its latest output was.
Step 302 trains the reinforcement learning network. Since the encoding unit has already been trained, its weight parameters are fixed at this stage and only the parameters of the decision unit are updated, which lets the network converge quickly (because the part of the network being trained is smaller).
Step 303, alternately training the supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit and the reinforcement learning network formed by the encoding unit and the decision unit, to obtain the trained fuel-saving lane-changing decision model.
In step 303, all network weights are released for training and updating. By alternately training the two learning networks, the encoding layer acquires the ability to extract both the historical surrounding-vehicle trajectory information and the traffic flow information at the current moment, which improves the accuracy of the output decisions.
Steps 301 to 303 implement the training process of the fuel-saving lane-changing decision model and yield a model capable of predicting lane change decisions.
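The three-stage schedule of steps 301 to 303 can be summarized by which parameter groups are trainable at each stage. This is a hypothetical sketch; the group names and the length of the alternating phase are illustrative, not from the patent:

```python
# Which parameter groups are updated at each training stage, per steps
# 301-303: supervised pre-training, RL with the encoder frozen, then
# alternating training with all weights released.
def trainable_groups(stage):
    if stage == "supervised":         # step 301: encoder + trajectory head
        return {"encoder": True, "traj_head": True, "decision": False}
    if stage == "rl_frozen_encoder":  # step 302: decision unit only
        return {"encoder": False, "traj_head": False, "decision": True}
    if stage == "alternate":          # step 303: all weights released
        return {"encoder": True, "traj_head": True, "decision": True}
    raise ValueError(stage)

# One plausible overall schedule: pre-train, freeze-and-RL, then alternate.
schedule = ["supervised", "rl_frozen_encoder"] + ["alternate"] * 4
```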
The embodiment of the invention discloses a fuel-saving lane-changing decision method which, referring to FIG. 4, comprises the following steps:
Step 401, acquiring traffic flow data at the current moment and surrounding-vehicle road condition data within a set time range before the current moment.
The traffic flow data includes the speed and position data of the surrounding vehicles.
The set time range may be chosen according to actual requirements; for example, if it is set to 5 seconds, this step acquires the surrounding-vehicle road condition data from the 5 seconds before the current moment.
Step 402, inputting the traffic flow data and the surrounding-vehicle road condition data into the encoding unit to obtain a third encoding vector.
Specifically, the encoding unit is implemented as two LSTM network layers and one fully connected layer connected in sequence. Step 402 comprises: passing the traffic flow data and the surrounding-vehicle road condition data through the two LSTM network layers and the fully connected layer in sequence, and outputting a third encoding vector that fuses the traffic flow information and the surrounding-vehicle road condition information.
Step 403, inputting the third encoding vector into the decision unit, outputting three decision values and corresponding probability values, and taking the decision value with the highest probability as the decision result, wherein the decision values comprise a right lane change, a left lane change, and going straight.
Specifically, step 403 comprises: inputting the third encoding vector into the second fully connected layer, and outputting a second intermediate vector; and inputting the second intermediate vector into the third fully connected layer, and outputting the three decision values and corresponding probability values.
The decision value with the highest probability is taken as the decision result. For example, if the current output is 0.41 for a left lane change, 0.26 for a right lane change, and 0.33 for going straight, the left lane change is taken as the decision result.
Step 404, executing a lane change decision according to the decision result.
In this embodiment, the decision result is one of a left lane change, a right lane change, and going straight. According to the final decision result, the autonomous vehicle executes the corresponding lane change, which includes control of vehicle actuators such as the throttle, brake pedal, and steering wheel.
In the fuel-saving lane-changing decision method provided by this embodiment, the traffic flow data and the surrounding-vehicle road condition data are input into the fuel-saving lane-changing decision model, a decision result is output, and the lane change decision is then executed, so that the model bases its decision on both traffic flow data and surrounding-vehicle road condition data, making the output decision more accurate and reasonable.
In addition, the fuel-saving lane-changing decision model fuses the surrounding-vehicle trajectory prediction task and the lane change decision task, using surrounding-vehicle trajectory prediction as a subtask to supervise the update of part of the weights in the model, which reduces the difficulty of training the decision model and improves its convergence.
The fuel-saving lane-changing decision system provided by the invention is described below; the system described below and the method described above may be cross-referenced.
An embodiment of the invention discloses a fuel-saving lane-changing decision system which, as shown in Fig. 5, comprises:
a lane-change data acquisition module 501, configured to acquire traffic flow data at the current time and target surrounding-vehicle road condition data within a set time range before the current time;
a decision result output module 502, configured to input the traffic flow data and the target surrounding-vehicle road condition data into a fuel-saving lane-changing decision model and output a decision result;
a lane-change decision execution module 503, configured to execute a lane-change decision according to the decision result;
wherein the fuel-saving lane-changing decision model is trained based on surrounding-vehicle trajectory history sample data, surrounding-vehicle trajectory prediction sample data, and sample road condition data.
Optionally, the fuel-saving lane-changing decision model comprises an encoding unit, together with a surrounding-vehicle trajectory prediction unit and a decision unit that are each connected to the encoding unit;
the system further comprises:
a first training module, configured to input the surrounding-vehicle trajectory history sample data and prediction sample data into a supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit, so as to adjust parameters of the encoding unit and the surrounding-vehicle trajectory prediction unit;
a second training module, configured to input the surrounding-vehicle trajectory history sample data and the sample road condition data into the encoding unit to generate a first encoding vector, input the first encoding vector into the decision unit, output a predicted decision value, and update parameters of the decision unit;
and a third training module, configured to alternately train the supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit and a reinforcement learning network formed by the encoding unit and the decision unit, to obtain the trained fuel-saving lane-changing decision model.
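The alternating schedule described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names and the strict one-to-one alternation of a supervised step and a reinforcement-learning step are assumptions.

```python
# Hypothetical sketch of alternating training: one supervised update of the
# encoder + trajectory-prediction head, then one RL update of the encoder +
# decision head, repeated for a number of rounds.

def alternate_training(train_supervised, train_rl, n_rounds):
    """Run n_rounds of (supervised step, RL step) and collect their results."""
    log = []
    for _ in range(n_rounds):
        log.append(("supervised", train_supervised()))  # encoder + trajectory head
        log.append(("rl", train_rl()))                  # encoder + decision head
    return log
```

Sharing the encoder between the two networks is what lets the trajectory-prediction subtask regularize the decision policy.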
Optionally, the first training module is specifically configured to: input the surrounding-vehicle trajectory history sample data into the encoding unit and output a second encoding vector; and input the second encoding vector into the surrounding-vehicle trajectory prediction unit, output initial surrounding-vehicle trajectory prediction data, and adjust parameters of the encoding unit and the surrounding-vehicle trajectory prediction unit based on the error between the initial prediction data and the surrounding-vehicle trajectory prediction sample data.
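The supervised error between the initial predictions and the prediction sample data could be computed, for example, as a mean squared error. The MSE choice is an assumption for illustration; the patent only says the parameters are adjusted based on the "error" between the two.

```python
import numpy as np

# Illustrative supervised loss: mean squared error between predicted future
# positions and the ground-truth trajectory prediction sample data.

def trajectory_loss(pred, target):
    """pred, target: arrays of shape (T, 2) of future (x, y) positions."""
    pred, target = np.asarray(pred, dtype=float), np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))
```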
Optionally, the surrounding-vehicle trajectory prediction unit comprises: a recurrent network layer and a first fully connected layer; the first training module is specifically configured to: input the second encoding vector into the recurrent network layer and output a first intermediate vector; and input the first intermediate vector into the first fully connected layer and output the initial surrounding-vehicle trajectory prediction data.
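A minimal numpy sketch of this two-layer head is given below: a simple Elman-style recurrence produces the first intermediate vector at each step, and one fully connected layer projects it to predicted (x, y) positions. All weight names, shapes, and the tanh nonlinearity are assumptions, not taken from the patent.

```python
import numpy as np

# Hypothetical trajectory-prediction unit: recurrent layer + first fully
# connected layer, operating over the encoded sequence.

def predict_trajectory(encoded_seq, Wh, Wx, Wo, bo):
    """encoded_seq: (T, d_in) encoding-vector sequence -> (T, 2) positions."""
    h = np.zeros(Wh.shape[0])
    outputs = []
    for x_t in encoded_seq:
        h = np.tanh(Wh @ h + Wx @ x_t)   # recurrent layer: first intermediate vector
        outputs.append(Wo @ h + bo)      # fully connected projection to (x, y)
    return np.stack(outputs)
```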
Optionally, the second training module is specifically configured to: input the first encoding vector into the decision unit, output a predicted decision value, and determine a corresponding reward value according to a reward function, wherein the reward function takes into account instantaneous fuel consumption, total fuel consumption rate, and a lane-change penalty; and adjust parameters of the decision unit based on the reward value.
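One way such a reward could be combined is sketched below. The weights w1..w3 and the negative-weighted-sum form are illustrative assumptions; the patent states only that the reward function involves instantaneous fuel consumption, total fuel consumption rate, and a lane-change penalty.

```python
# Hypothetical reward: penalize fuel use and lane changes, so that lower
# fuel consumption and fewer unnecessary lane changes yield a higher reward.

def fuel_saving_reward(inst_fuel, total_fuel_rate, lane_changed,
                       w1=1.0, w2=1.0, w3=0.5):
    cost = w1 * inst_fuel + w2 * total_fuel_rate
    if lane_changed:
        cost += w3  # fixed penalty discourages lane changes that do not pay off
    return -cost
```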
Optionally, the fuel-saving lane-changing decision model comprises the encoding unit and the decision unit; the decision result output module 502 is specifically configured to:
input the traffic flow data and the target surrounding-vehicle road condition data into the encoding unit to obtain a third encoding vector; and input the third encoding vector into the decision unit, output three decision values and corresponding probability values, and take the decision value with the largest probability value as the decision result, wherein the decision values comprise a right lane change, a left lane change, and going straight.
Optionally, the decision unit comprises: a second fully connected layer and a third fully connected layer; the decision result output module 502 is specifically configured to: input the third encoding vector into the second fully connected layer and output a second intermediate vector; and input the second intermediate vector into the third fully connected layer and output the three decision values and corresponding probability values.
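At inference time the decision unit described above can be sketched as two fully connected layers followed by a softmax over the three decision values, with the largest-probability decision selected. The weight names, the ReLU activation, and the softmax are assumptions for this sketch; the patent specifies only the two layers and the probability-based selection.

```python
import numpy as np

# Hypothetical decision unit: second FC layer -> second intermediate vector,
# third FC layer -> three decision values, softmax -> probabilities, argmax.

def decide(third_encoding, W2, b2, W3, b3,
           actions=("left lane change", "straight", "right lane change")):
    hidden = np.maximum(0.0, W2 @ third_encoding + b2)  # second intermediate vector
    logits = W3 @ hidden + b3                           # three decision values
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                                # corresponding probabilities
    return actions[int(np.argmax(probs))], probs
```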
The fuel-saving lane-changing decision system provided by the invention inputs the traffic flow data and the target surrounding-vehicle road condition data into the fuel-saving lane-changing decision model to output a decision result, and then executes a lane-change decision according to that result. Because the model bases its decision on both the traffic flow data and the surrounding-vehicle road condition data, the output decision result is more accurate and reasonable.
Fig. 6 illustrates a physical structure diagram of an electronic device, which, as shown in Fig. 6, may include: a processor 610, a communications interface 620, a memory 630, and a communication bus 640, wherein the processor 610, the communications interface 620, and the memory 630 communicate with one another via the communication bus 640. The processor 610 may invoke logic instructions in the memory 630 to perform a fuel-saving lane-changing decision method comprising:
acquiring traffic flow data at the current time and target surrounding-vehicle road condition data within a set time range before the current time;
inputting the traffic flow data and the target surrounding-vehicle road condition data into a fuel-saving lane-changing decision model and outputting a decision result;
executing a lane-change decision according to the decision result;
wherein the fuel-saving lane-changing decision model is trained based on surrounding-vehicle trajectory history sample data, surrounding-vehicle trajectory prediction sample data, and sample road condition data.
In addition, the logic instructions in the memory 630 may be implemented as software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, or the part of it that substantially contributes to the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the fuel-saving lane-changing decision method provided by the above methods, comprising:
acquiring traffic flow data at the current time and target surrounding-vehicle road condition data within a set time range before the current time;
inputting the traffic flow data and the target surrounding-vehicle road condition data into a fuel-saving lane-changing decision model and outputting a decision result;
executing a lane-change decision according to the decision result;
wherein the fuel-saving lane-changing decision model is trained based on surrounding-vehicle trajectory history sample data, surrounding-vehicle trajectory prediction sample data, and sample road condition data.
In another aspect, the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the above fuel-saving lane-changing decision method, comprising:
acquiring traffic flow data at the current time and target surrounding-vehicle road condition data within a set time range before the current time;
inputting the traffic flow data and the target surrounding-vehicle road condition data into a fuel-saving lane-changing decision model and outputting a decision result;
executing a lane-change decision according to the decision result;
wherein the fuel-saving lane-changing decision model is trained based on surrounding-vehicle trajectory history sample data, surrounding-vehicle trajectory prediction sample data, and sample road condition data.
The above-described embodiments of the apparatus are merely illustrative. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment, which one of ordinary skill in the art can understand and implement without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and certainly also by hardware. Based on this understanding, the above technical solutions may be embodied in the form of a software product that can be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disc, and that includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute the methods described in the embodiments or in parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (8)

1. A fuel-saving lane-changing decision method, comprising:
acquiring traffic flow data at the current time and target surrounding-vehicle road condition data within a set time range before the current time, wherein the traffic flow data is motion data of surrounding vehicles, comprising speed and position data of the surrounding vehicles, and the target surrounding-vehicle road condition data comprises road information of the driving environment of the surrounding vehicles, road traffic data, intersection traffic data, and road intersection information;
inputting the traffic flow data and the target surrounding-vehicle road condition data into a fuel-saving lane-changing decision model and outputting a decision result;
executing a lane-change decision according to the decision result;
wherein the fuel-saving lane-changing decision model is trained based on surrounding-vehicle trajectory history sample data, surrounding-vehicle trajectory prediction sample data, and sample road condition data;
the fuel-saving lane-changing decision model comprises an encoding unit, together with a surrounding-vehicle trajectory prediction unit and a decision unit that are each connected to the encoding unit;
the training method of the fuel-saving lane-changing decision model comprises:
inputting the surrounding-vehicle trajectory history sample data and prediction sample data into a supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit, so as to adjust parameters of the encoding unit and the surrounding-vehicle trajectory prediction unit;
inputting the surrounding-vehicle trajectory history sample data and the sample road condition data into the encoding unit to generate a first encoding vector, inputting the first encoding vector into the decision unit, outputting a predicted decision value, and updating parameters of the decision unit;
alternately training the supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit and a reinforcement learning network formed by the encoding unit and the decision unit, to obtain the trained fuel-saving lane-changing decision model;
wherein inputting the surrounding-vehicle trajectory history sample data and prediction sample data into the supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit so as to adjust parameters of the encoding unit and the surrounding-vehicle trajectory prediction unit comprises:
inputting the surrounding-vehicle trajectory history sample data into the encoding unit and outputting a second encoding vector;
and inputting the second encoding vector into the surrounding-vehicle trajectory prediction unit, outputting initial surrounding-vehicle trajectory prediction data, and adjusting parameters of the encoding unit and the surrounding-vehicle trajectory prediction unit based on the error between the initial surrounding-vehicle trajectory prediction data and the surrounding-vehicle trajectory prediction sample data.
2. The fuel-saving lane-changing decision method according to claim 1, wherein the surrounding-vehicle trajectory prediction unit comprises: a recurrent network layer and a first fully connected layer;
and inputting the second encoding vector into the surrounding-vehicle trajectory prediction unit and outputting initial surrounding-vehicle trajectory prediction data comprises:
inputting the second encoding vector into the recurrent network layer and outputting a first intermediate vector;
and inputting the first intermediate vector into the first fully connected layer and outputting the initial surrounding-vehicle trajectory prediction data.
3. The fuel-saving lane-changing decision method according to claim 1, wherein inputting the first encoding vector into the decision unit, outputting a predicted decision value, and updating parameters of the decision unit comprises:
inputting the first encoding vector into the decision unit, outputting a predicted decision value, and determining a corresponding reward value according to a reward function, wherein the reward function takes into account instantaneous fuel consumption, total fuel consumption rate, and a lane-change penalty;
and adjusting parameters of the decision unit based on the reward value.
4. The fuel-saving lane-changing decision method according to claim 1, wherein the fuel-saving lane-changing decision model comprises the encoding unit and the decision unit;
and inputting the traffic flow data and the target surrounding-vehicle road condition data into the fuel-saving lane-changing decision model and outputting a decision result comprises:
inputting the traffic flow data and the target surrounding-vehicle road condition data into the encoding unit to obtain a third encoding vector;
and inputting the third encoding vector into the decision unit, outputting three decision values and corresponding probability values, and taking the decision value with the largest probability value as the decision result, wherein the decision values comprise a right lane change, a left lane change, and going straight.
5. The fuel-saving lane-changing decision method according to claim 4, wherein the decision unit comprises: a second fully connected layer and a third fully connected layer;
and inputting the third encoding vector into the decision unit and outputting three decision values and corresponding probability values comprises:
inputting the third encoding vector into the second fully connected layer and outputting a second intermediate vector;
and inputting the second intermediate vector into the third fully connected layer and outputting the three decision values and corresponding probability values.
6. A fuel-saving lane-changing decision system, comprising:
a lane-change data acquisition module, configured to acquire traffic flow data at the current time and target surrounding-vehicle road condition data within a set time range before the current time, wherein the traffic flow data is motion data of surrounding vehicles, comprising speed and position data of the surrounding vehicles, and the target surrounding-vehicle road condition data comprises road information of the driving environment of the surrounding vehicles, road traffic data, intersection traffic data, and road intersection information;
a decision result output module, configured to input the traffic flow data and the target surrounding-vehicle road condition data into a fuel-saving lane-changing decision model and output a decision result;
a lane-change decision execution module, configured to execute a lane-change decision according to the decision result;
wherein the fuel-saving lane-changing decision model is trained based on surrounding-vehicle trajectory history sample data, surrounding-vehicle trajectory prediction sample data, and sample road condition data;
the fuel-saving lane-changing decision model comprises an encoding unit, together with a surrounding-vehicle trajectory prediction unit and a decision unit that are each connected to the encoding unit;
the fuel-saving lane-changing decision system further comprises:
a first training module, configured to input the surrounding-vehicle trajectory history sample data and prediction sample data into a supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit, so as to adjust parameters of the encoding unit and the surrounding-vehicle trajectory prediction unit;
a second training module, configured to input the surrounding-vehicle trajectory history sample data and the sample road condition data into the encoding unit to generate a first encoding vector, input the first encoding vector into the decision unit, output a predicted decision value, and update parameters of the decision unit;
a third training module, configured to alternately train the supervised learning network formed by the encoding unit and the surrounding-vehicle trajectory prediction unit and a reinforcement learning network formed by the encoding unit and the decision unit, to obtain the trained fuel-saving lane-changing decision model;
wherein the first training module is specifically configured to: input the surrounding-vehicle trajectory history sample data into the encoding unit and output a second encoding vector; and input the second encoding vector into the surrounding-vehicle trajectory prediction unit, output initial surrounding-vehicle trajectory prediction data, and adjust parameters of the encoding unit and the surrounding-vehicle trajectory prediction unit based on the error between the initial surrounding-vehicle trajectory prediction data and the surrounding-vehicle trajectory prediction sample data.
7. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the fuel-saving lane-changing decision method according to any one of claims 1 to 5.
8. A non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the fuel-saving lane-changing decision method according to any one of claims 1 to 5.
CN202011613625.7A 2020-12-30 2020-12-30 Fuel-saving lane-changing decision-making method and system Active CN112766310B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011613625.7A CN112766310B (en) 2020-12-30 2020-12-30 Fuel-saving lane-changing decision-making method and system

Publications (2)

Publication Number Publication Date
CN112766310A CN112766310A (en) 2021-05-07
CN112766310B true CN112766310B (en) 2022-09-23

Family

ID=75696167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011613625.7A Active CN112766310B (en) 2020-12-30 2020-12-30 Fuel-saving lane-changing decision-making method and system

Country Status (1)

Country Link
CN (1) CN112766310B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106740457A (en) * 2016-12-07 2017-05-31 镇江市高等专科学校 Vehicle lane-changing decision-making technique based on BP neural network model
CN111483468A (en) * 2020-04-24 2020-08-04 广州大学 Unmanned vehicle lane change decision-making method and system based on confrontation and imitation learning

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3460613B1 (en) * 2016-06-08 2020-10-07 Uisee Technologies (Beijing) Ltd Speed planning method and apparatus and calculating apparatus for automatic driving of vehicle
CN111046919B (en) * 2019-11-21 2023-05-12 南京航空航天大学 Surrounding dynamic vehicle track prediction system and method integrating behavior intention
CN111009153B (en) * 2019-12-04 2021-10-15 珠海深圳清华大学研究院创新中心 Training method, device and equipment of trajectory prediction model
CN111145552B (en) * 2020-01-06 2022-04-29 重庆大学 Planning method for vehicle dynamic lane changing track based on 5G network
CN111238523B (en) * 2020-04-23 2020-08-07 北京三快在线科技有限公司 Method and device for predicting motion trail
CN112071059B (en) * 2020-08-20 2021-07-16 华南理工大学 Intelligent vehicle track changing collaborative planning method based on instantaneous risk assessment
CN112085165A (en) * 2020-09-02 2020-12-15 中国第一汽车股份有限公司 Decision information generation method, device, equipment and storage medium


Also Published As

Publication number Publication date
CN112766310A (en) 2021-05-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220901

Address after: Room 2528, Building 2, Lane 1800, Xinyang Road, Lingang New Area, China (Shanghai) Pilot Free Trade Zone, Pudong New Area, Shanghai, 201801

Applicant after: Inceptio Star Intelligent Technology (Shanghai) Co.,Ltd.

Address before: Room 4, room 001, building 11, Lane 1333, Jiangnan Avenue, Changxing Town, Chongming District, Shanghai 202150

Applicant before: International network technology (Shanghai) Co.,Ltd.

GR01 Patent grant