WO2023004698A1 - Intelligent driving decision-making method, vehicle driving control method, apparatus, and vehicle - Google Patents
- Publication number
- WO2023004698A1 (PCT/CN2021/109331, CN2021109331W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- vehicle
- strategy
- self
- game
- game object
- Prior art date
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W60/00—Drive control systems specially adapted for autonomous road vehicles
- B60W60/001—Planning or execution of driving tasks
- B60W60/0015—Planning or execution of driving tasks specially adapted for safety
- B60W60/0018—Planning or execution of driving tasks specially adapted for safety by employing degraded modes, e.g. reducing speed, in response to suboptimal conditions
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W30/00—Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units
- B60W30/08—Active safety systems predicting or avoiding probable or impending collision or attempting to minimise its consequences
- B60W30/095—Predicting travel path or likelihood of collision
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W50/00—Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
- B60W50/08—Interaction between the driver and the control system
- B60W50/14—Means for informing the driver, warning the driver or prompting a driver intervention
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W60/00—Drive control systems specially adapted for autonomous road vehicles
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W50/00—Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
- B60W50/08—Interaction between the driver and the control system
- B60W50/14—Means for informing the driver, warning the driver or prompting a driver intervention
- B60W2050/146—Display means
Definitions
- the present application relates to intelligent driving technology, in particular to an intelligent driving decision-making method, a vehicle driving control method, a device, and a vehicle.
- driving automation is graded into levels L1 to L5. L1, assisted driving, helps the driver with certain driving tasks but can complete only one driving operation at a time.
- L2, partial automation, can perform acceleration, deceleration, and steering operations automatically and at the same time.
- L3, conditional automation, lets the vehicle accelerate, decelerate, and steer automatically in a specific environment without driver operation.
- L4, high automation, can complete the whole trip without a driver, but with restrictions, for example a vehicle speed capped at a certain value and a relatively fixed driving area.
- L5, full automation, adapts to any driving scene. The higher the level, the more capable the automated driving.
- the present application provides an intelligent driving decision-making method, a vehicle driving control method, a device, a vehicle, and the like, which make driving decisions for the ego vehicle while consuming as little computing power as possible and without sacrificing decision accuracy.
- the first aspect of the present application provides an intelligent driving decision-making method, including: obtaining a game object of the ego vehicle; performing multiple releases of strategy spaces from among the multiple strategy spaces of the ego vehicle and the game object; after one of the multiple releases has been performed, determining the strategy feasible region of the ego vehicle and the game object according to the strategy spaces released so far; and determining the decision result for the ego vehicle's driving according to the strategy feasible region.
- the feasible domain of the strategy of the self-vehicle and the non-game object includes the behavior actions that the self-vehicle can perform relative to the non-game object.
- the decision-making accuracy can be, for example, the execution probability of the result of the decision
- the strategy feasible region can thus be obtained while releasing as few strategy spaces as possible, and a behavior-action pair is selected from the strategy feasible region as the decision result. This minimizes the number of strategy-space releases and the operations performed on them, and reduces the demand on hardware computing power.
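The release-and-check loop described above can be sketched as follows. This is a minimal illustration rather than the application's implementation: the function name `release_until_feasible`, the `is_feasible` predicate, and the representation of a strategy space as a list of (ego action, game-object action) pairs are all assumptions made for the example.

```python
from typing import Callable, Iterable, List, Tuple

# A behavior-action pair: (ego action, game-object action).
# Hypothetical representation chosen for this sketch.
ActionPair = Tuple[float, float]


def release_until_feasible(
    strategy_spaces: Iterable[List[ActionPair]],
    is_feasible: Callable[[ActionPair], bool],
) -> Tuple[List[ActionPair], int]:
    """Release strategy spaces one at a time and stop at the first release
    after which the feasible region is non-empty.

    Returns the feasible region and the number of releases performed.
    """
    released: List[ActionPair] = []
    releases = 0
    for space in strategy_spaces:
        released.extend(space)      # release one more strategy space
        releases += 1
        feasible = [pair for pair in released if is_feasible(pair)]
        if feasible:                # a solution exists: stop releasing
            return feasible, releases
    return [], releases             # nothing feasible in any released space


spaces = [
    [(-1.0, 0.0), (0.0, 0.0)],      # first release
    [(1.0, -1.0), (1.0, 0.0)],      # second release
    [(2.0, -2.0)],                  # third release, not reached below
]
# Toy feasibility rule (assumption): ego accelerates while the object brakes.
region, n = release_until_feasible(spaces, lambda p: p[0] > 0 and p[1] < 0)
print(region, n)  # [(1.0, -1.0)] 2
```

The third space is never touched in this run, which is the point of releasing step by step: work is spent only on as many spaces as are needed to find a feasible behavior-action pair.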
- the dimensions of the multiple strategy spaces include at least one of the following: a longitudinal (vertical) sampling dimension, a lateral (horizontal) sampling dimension, or a time sampling dimension.
- the plurality of strategy spaces include: a longitudinal sampling strategy space spanned by the longitudinal sampling dimensions of the ego vehicle and/or the game object; a lateral sampling strategy space spanned by the lateral sampling dimensions of the ego vehicle and/or the game object; a time-dimension strategy space spanned by the ego vehicle and/or the game object in the time sampling dimension; or a strategy space formed by any combination of two or three of the longitudinal, lateral, and time sampling dimensions.
- the time-dimension strategy space corresponds to the strategy spaces spanned in the multiple single-frame derivations that make up one decision step; in each single-frame derivation, the spanned strategy space may include a longitudinal sampling strategy space and/or a lateral sampling strategy space.
- in this way, the corresponding strategy space can be spanned in at least one sampling dimension and then released.
- performing multiple releases of multiple strategy spaces includes performing the releases in the following order of dimensions: longitudinal sampling dimension, lateral sampling dimension, and time sampling dimension.
- the strategy spaces released over the multiple releases can include the following:
- a longitudinal sampling strategy space spanned by one set of values of the ego vehicle's longitudinal sampling dimension; a longitudinal sampling strategy space spanned by another set of values of the ego vehicle's longitudinal sampling dimension;
- a longitudinal sampling strategy space spanned by one set of values of the ego vehicle's longitudinal sampling dimension together with one set of values of the game object's longitudinal sampling dimension; a longitudinal sampling strategy space spanned by another set of values of the ego vehicle's longitudinal sampling dimension together with another set of values of the game object's longitudinal sampling dimension;
- a strategy space spanned jointly by the lateral sampling strategy space formed from one set of values of the ego vehicle's lateral sampling dimension and the longitudinal sampling strategy space formed from the longitudinal sampling dimension of the ego vehicle and/or the game object;
- a strategy space spanned jointly by the lateral sampling strategy space formed from another set of values of the ego vehicle's lateral sampling dimension and the longitudinal sampling strategy space formed from the longitudinal sampling dimension of the ego vehicle and/or the game object;
- a time-dimension strategy space covering multiple single-frame derivations, where the strategy space spanned in each single-frame derivation can include the aforementioned longitudinal sampling strategy space, the lateral sampling strategy space, or the strategy space spanned jointly by the two.
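To make "spanning" a strategy space concrete, the sketch below builds two of the spaces listed above as Cartesian products of sampled values. It assumes, for illustration only, that the longitudinal dimension is sampled as acceleration values and the lateral dimension as lateral offsets; the function names and the sample values are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def span_longitudinal(ego_acc: Sequence[float],
                      obj_acc: Sequence[float]) -> List[Tuple[float, float]]:
    """Longitudinal sampling strategy space: every (ego acceleration,
    game-object acceleration) pair from the two sets of sampled values."""
    return list(product(ego_acc, obj_acc))


def span_joint(ego_offset: Sequence[float],
               ego_acc: Sequence[float],
               obj_acc: Sequence[float]) -> List[Tuple[float, float, float]]:
    """Strategy space spanned jointly by the ego lateral samples and the
    longitudinal samples of the ego vehicle and the game object."""
    return list(product(ego_offset, ego_acc, obj_acc))


# Illustrative sample values (assumptions, not taken from the application).
longitudinal_space = span_longitudinal([-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0])
joint_space = span_joint([-0.5, 0.0, 0.5], [-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0])
print(len(longitudinal_space), len(joint_space))  # 9 27
```

The size of a space is the product of the sample counts along each dimension, so adding the lateral dimension triples the number of candidates here. That multiplicative growth is why releasing the smaller longitudinal spaces first can save computation.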
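- the total cost value of a behavior-action pair in the strategy feasible region is determined according to one or more of the following costs of the ego vehicle or the game object: safety cost, right-of-way cost, lateral offset cost, passability cost, comfort cost, inter-frame correlation cost, and risk area cost.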
- one or more cost values can be selected according to needs to calculate the total cost value, which is used to determine the feasible region.
- each cost value has a different weight.
- the different weights let the decision emphasize driving safety, right of way, passability, comfort, risk, and so on.
- the weights may be ordered as follows: safety weight > right-of-way weight > lateral offset weight > passability weight > comfort weight > risk area weight > inter-frame correlation weight.
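One plausible reading of the weighted cost is a weighted sum, sketched below. The text only states that each cost has its own weight and gives the ordering of the weights; the summation form, the numeric weight values, and the names `WEIGHTS` and `total_cost` are assumptions made for this example.

```python
from typing import Dict

# Illustrative weights (assumptions). Only their ordering follows the text:
# safety > right of way > lateral offset > passability > comfort
# > risk area > inter-frame correlation.
WEIGHTS: Dict[str, float] = {
    "safety": 10000.0,
    "right_of_way": 1000.0,
    "lateral_offset": 100.0,
    "passability": 50.0,
    "comfort": 10.0,
    "risk_area": 5.0,
    "inter_frame": 1.0,
}


def total_cost(costs: Dict[str, float],
               weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of whichever cost terms are supplied.

    Terms that are not supplied are left out, matching the idea that one or
    more cost values can be selected as needed.
    """
    unknown = set(costs) - set(weights)
    if unknown:
        raise KeyError(f"no weight defined for: {sorted(unknown)}")
    return sum(weights[name] * value for name, value in costs.items())


# Cost of one behavior-action pair using three of the seven terms.
pair_cost = total_cost({"safety": 0.1, "comfort": 0.5, "inter_frame": 1.0})
print(pair_cost)
```

With weights this far apart, a small safety cost outweighs large comfort or inter-frame costs, which is how the ordering expresses priority.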
- the decision result for the ego vehicle's driving is determined according to the strategy feasible regions of the ego vehicle with each game object.
- the final strategy feasible region is determined as the intersection of the separately obtained strategy feasible regions.
- the intersection here refers to the behavior actions that include the same action of the ego vehicle.
- the first aspect also includes: obtaining a non-game object of the ego vehicle; determining the strategy feasible region of the ego vehicle and the non-game object, which includes the behavior actions that the ego vehicle can perform relative to the non-game object; and determining the decision result for the ego vehicle's driving at least according to the strategy feasible region of the ego vehicle and the non-game object.
- the strategy feasible region of the decision result for the ego vehicle's driving is determined according to the intersection of the strategy feasible regions of the ego vehicle with each game object, or according to the intersection of those strategy feasible regions with the strategy feasible regions of the ego vehicle with each non-game object.
- with multiple game objects, the final strategy feasible region and the decision result for the ego vehicle's driving can be obtained from the intersection of the strategy feasible regions of the ego vehicle with each of the game objects.
- with both game objects and non-game objects, the final strategy feasible region and the decision result for the ego vehicle's driving can be obtained from the intersection of the strategy feasible regions of the ego vehicle with the multiple game objects and with the non-game objects.
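The intersection of several strategy feasible regions, indexed by the ego vehicle's action, can be sketched as below. The string action labels, the `ActionPair` representation, and the function name `intersect_by_ego_action` are hypothetical. A non-game object is modelled here as a region whose second element is a fixed placeholder, which is also an assumption.

```python
from typing import Dict, List, Set, Tuple

# (ego action, other object's action); hypothetical labels for this sketch.
ActionPair = Tuple[str, str]


def intersect_by_ego_action(
    regions: List[List[ActionPair]],
) -> Dict[str, List[ActionPair]]:
    """Keep only the behavior-action pairs whose ego action appears in every
    strategy feasible region, grouped by that ego action."""
    if not regions:
        return {}
    ego_sets = [{ego for ego, _ in region} for region in regions]
    common: Set[str] = set.intersection(*ego_sets)
    return {
        ego: [pair for region in regions for pair in region if pair[0] == ego]
        for ego in sorted(common)
    }


region_obj_a = [("yield", "go"), ("go", "yield")]        # game object A
region_obj_b = [("yield", "go"), ("yield", "yield")]     # game object B
region_parked = [("yield", "static"), ("go", "static")]  # non-game object
final_region = intersect_by_ego_action(
    [region_obj_a, region_obj_b, region_parked])
print(sorted(final_region))  # ['yield']
```

Only `yield` is feasible against all three objects, so it is the only ego action left in the final region. An empty result means no single ego action works against every object.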
- the first aspect also includes: obtaining a non-game object of the ego vehicle; and, according to the motion state of the non-game object, constraining the longitudinal sampling strategy space corresponding to the ego vehicle or constraining the lateral sampling strategy space corresponding to the ego vehicle.
- constraining the longitudinal sampling strategy space corresponding to the ego vehicle means constraining the value range of the ego vehicle's longitudinal sampling dimension used when spanning that space;
- constraining the lateral sampling strategy space corresponding to the ego vehicle means constraining the value range of the ego vehicle's lateral sampling dimension used when spanning that space.
- in this way, the motion state of the non-game object, such as position and speed, constrains the range of the ego vehicle's longitudinal acceleration or lateral offset in the spanned strategy space. This reduces the number of behavior actions in the strategy space and further reduces the amount of computation.
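As an illustration of constraining the longitudinal sampling range by a non-game object's position and speed, the sketch below drops ego acceleration samples that would bring the ego vehicle too close to an object ahead. The constant-acceleration model, the constant-speed object, the end-of-horizon check, and all names and default values (`horizon`, `margin`) are simplifying assumptions for the example, not details from the application.

```python
from typing import List


def travel_distance(speed: float, acc: float, horizon: float) -> float:
    """Distance covered in `horizon` seconds at constant acceleration,
    holding the vehicle at rest once its speed reaches zero."""
    if acc < 0 and speed + acc * horizon < 0:
        return speed * speed / (2.0 * -acc)   # stops before the horizon ends
    return speed * horizon + 0.5 * acc * horizon * horizon


def constrain_longitudinal_samples(acc_samples: List[float],
                                   ego_speed: float,
                                   gap: float,
                                   obj_speed: float,
                                   horizon: float = 3.0,
                                   margin: float = 2.0) -> List[float]:
    """Drop ego acceleration samples that would leave less than `margin`
    metres to a non-game object ahead at the end of the horizon."""
    obj_travel = obj_speed * horizon          # object assumed at constant speed
    limit = gap + obj_travel - margin
    return [a for a in acc_samples
            if travel_distance(ego_speed, a, horizon) <= limit]


# Parked vehicle 20 m ahead, ego vehicle at 5 m/s (illustrative numbers).
kept = constrain_longitudinal_samples([-2.0, -1.0, 0.0, 1.0, 2.0],
                                      ego_speed=5.0, gap=20.0, obj_speed=0.0)
print(kept)  # [-2.0, -1.0, 0.0]
```

Two of the five samples are removed before any strategy space is spanned, so every space built from the ego vehicle's longitudinal dimension shrinks accordingly. Checking the gap only at the end of the horizon is enough for a stationary object; a moving object would call for checking intermediate times as well.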
- the first aspect also includes: acquiring a non-game object of the ego vehicle's game object; and, according to the motion state of that non-game object, constraining the longitudinal sampling strategy space corresponding to the ego vehicle's game object or constraining the lateral sampling strategy space corresponding to the ego vehicle's game object.
- constraining the longitudinal sampling strategy space corresponding to the ego vehicle's game object means constraining the value range of the game object's longitudinal sampling dimension used when spanning that space; constraining the lateral sampling strategy space corresponding to the ego vehicle's game object means constraining the value range of the game object's lateral sampling dimension used when spanning that space.
- in this way, the motion state of the non-game object, such as position and speed, constrains the range of longitudinal acceleration or lateral offset of the ego vehicle's game object in the spanned strategy space. This reduces the number of behavior actions in the strategy space and further reduces the amount of computation.
- when the intersection is an empty set, a conservative decision for the ego vehicle's driving is executed; the conservative decision includes an action that brings the ego vehicle to a safe stop or an action that makes the ego vehicle decelerate safely.
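The fallback to a conservative decision when the intersection is empty can be sketched as follows. Choosing the lowest-cost ego action from a non-empty region, the `can_stop_safely` flag, and the action labels are assumptions for illustration; the text only states that a conservative decision (safe stop or safe deceleration) is executed when the intersection is an empty set.

```python
from typing import Dict, List, Tuple

ActionPair = Tuple[str, str]  # (ego action, other object's action)

# Hypothetical labels for the two conservative actions named in the text.
SAFE_STOP = "safe_stop"
SAFE_DECELERATE = "safe_decelerate"


def decide(final_region: Dict[str, List[ActionPair]],
           cost_of: Dict[str, float],
           can_stop_safely: bool = True) -> str:
    """Pick the cheapest ego action from the intersected feasible region, or
    fall back to a conservative action when the intersection is empty."""
    if not final_region:
        return SAFE_STOP if can_stop_safely else SAFE_DECELERATE
    return min(final_region, key=lambda ego: cost_of[ego])


costs = {"yield": 3.0, "go": 1.5}
normal = decide({"yield": [("yield", "go")], "go": [("go", "yield")]}, costs)
fallback = decide({}, costs)
print(normal, fallback)  # go safe_stop
```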
- the game object or non-game object is determined according to the attention method.
- the game object and non-game object can be determined according to the attention allocated by each obstacle to the self-vehicle.
- the attention method can be implemented through algorithms or through neural network inference.
- the first aspect also includes displaying at least one of the following through the human-computer interaction interface: the decision result for the ego vehicle's driving, the strategy feasible region of the decision result, the ego vehicle's driving trajectory corresponding to the decision result, or the game object's driving trajectory corresponding to the decision result.
- in this way, the human-computer interaction interface can display rich content about the driving decision results of the ego vehicle or the game object, which makes the interaction with the user friendlier.
- the second aspect of the present application provides an intelligent driving decision-making device, including: an acquisition module, used to obtain a game object of the ego vehicle; and a processing module, used to perform multiple releases of strategy spaces from among the multiple strategy spaces of the ego vehicle and the game object, to determine, after one of the multiple releases has been performed, the strategy feasible region of the ego vehicle and the game object according to the strategy spaces released so far, and to determine the decision result for the ego vehicle's driving according to the strategy feasible region.
- the dimensions of the multiple policy spaces include at least one of the following: a vertical sampling dimension, a horizontal sampling dimension, or a time sampling dimension.
- performing multiple releases of multiple policy spaces includes performing the releases in the order of the following dimensions: vertical sampling dimension, horizontal sampling dimension, and time sampling dimension.
- the total cost value of a behavior-action pair in the strategy feasible region is determined according to one or more of the following costs of the ego vehicle or the game object: safety cost, right-of-way cost, lateral offset cost, passability cost, comfort cost, inter-frame correlation cost, and risk area cost.
- each cost value has a different weight.
- the decision result for the ego vehicle's driving is determined according to the strategy feasible regions of the ego vehicle with each game object.
- the acquisition module is also used to obtain a non-game object of the ego vehicle;
- the processing module is also used to determine the strategy feasible region of the ego vehicle and the non-game object, which includes the behavior actions that the ego vehicle can perform relative to the non-game object, and to determine the decision result for the ego vehicle's driving at least according to that strategy feasible region.
- the processing module is also used to determine the strategy feasible region of the decision result for the ego vehicle's driving according to the intersection of the strategy feasible regions of the ego vehicle with each game object, or according to the intersection of those strategy feasible regions with the strategy feasible regions of the ego vehicle with each non-game object.
- the acquisition module is also used to acquire a non-game object of the ego vehicle; the processing module is also used to constrain, according to the motion state of the non-game object, the longitudinal sampling strategy space corresponding to the ego vehicle or the lateral sampling strategy space corresponding to the ego vehicle.
- the acquisition module is also used to acquire a non-game object of the ego vehicle's game object; the processing module is also used to constrain the longitudinal sampling strategy space corresponding to the ego vehicle's game object, or to constrain the lateral sampling strategy space corresponding to the ego vehicle's game object.
- when the intersection is an empty set, a conservative decision for the ego vehicle's driving is executed; the conservative decision includes an action that brings the ego vehicle to a safe stop or an action that makes the ego vehicle decelerate safely.
- the game object or non-game object is determined according to the attention method.
- the processing module is further configured to display at least one of the following through the human-computer interaction interface: the decision result for the ego vehicle's driving, the strategy feasible region of the decision result, the ego vehicle's driving trajectory corresponding to the decision result, or the game object's driving trajectory corresponding to the decision result.
- the third aspect of the present application provides a vehicle driving control method, including: acquiring obstacles outside the vehicle; determining the decision result of vehicle driving according to any method of the first aspect for the obstacle; and controlling the driving of the vehicle according to the decision result.
- the fourth aspect of the present application provides a vehicle driving control device, including: an acquisition module, used to acquire obstacles outside the vehicle; a processing module, used to determine the decision result of vehicle driving according to any method of the first aspect for obstacles; The processing module is also used to control the driving of the vehicle according to the decision result.
- a fifth aspect of the present application provides a vehicle, including: the vehicle travel control device of the fourth aspect, and a travel system; the vehicle travel control device controls the travel system.
- the sixth aspect of the present application provides a computing device, including a processor and a memory on which program instructions are stored; when executed by the processor, the program instructions cause the processor to implement any intelligent driving decision-making method of the first aspect or the vehicle driving control method of the third aspect.
- the seventh aspect of the present application provides a computer-readable storage medium on which program instructions are stored; when executed by a processor, the program instructions cause the processor to implement any intelligent driving decision-making method of the first aspect or the vehicle driving control method of the third aspect.
- FIG. 1 is a schematic diagram of a traffic scene of vehicles driving on a road, provided by an embodiment of the present application.
- FIG. 2 is a schematic diagram of an embodiment of the present application applied to a vehicle.
- FIG. 3A to FIG. 3E are schematic diagrams of game objects and non-game objects in different traffic scenarios, provided by an embodiment of the present application.
- FIG. 4 is a flowchart of the intelligent driving decision-making method provided by an embodiment of the present application.
- FIG. 5 is a flowchart of obtaining the game object in FIG. 4.
- FIG. 6 is a flowchart of obtaining the decision result in FIG. 4.
- FIG. 7 is a schematic diagram of multi-frame derivation provided by an embodiment of the present application.
- FIG. 8A to FIG. 8F are schematic diagrams of the cost functions provided by an embodiment of the present application.
- FIG. 9 is a flowchart of driving control provided by another embodiment of the present application.
- FIG. 10 is a schematic diagram of a traffic scene in an embodiment of the present application.
- FIG. 11 is a flowchart of driving control provided by a specific embodiment of the present application.
- FIG. 12 is a schematic diagram of an intelligent driving decision-making device provided by an embodiment of the present application.
- FIG. 13 is a flowchart of a vehicle driving control method provided by an embodiment of the present application.
- FIG. 14 is a schematic diagram of a vehicle driving control device provided by an embodiment of the present application.
- FIG. 15 is a schematic diagram of a vehicle provided by an embodiment of the present application.
- FIG. 16 is a schematic diagram of an embodiment of a computing device of the present application.
- the intelligent driving decision-making solutions include intelligent driving decision-making methods and devices, vehicle driving control methods and devices, vehicles, electronic devices, computing equipment, computer-readable storage media, and computer program products. Since these technical solutions solve problems on the same or similar principles, repeated content may be omitted in the following description of specific embodiments; the embodiments should nonetheless be understood as referring to one another and as combinable with one another.
- Figure 1 shows a traffic scene of vehicles driving on the road.
- the north-south road and the east-west road form an intersection A, where: the first vehicle 901 is on the south side of intersection A and travels from south to north; the second vehicle 902 is on the north side of intersection A and travels from north to south; the third vehicle 903 is on the east side of intersection A and travels from east to south, that is, it will turn left at intersection A and merge onto the north-south road; the fourth vehicle 904 is behind the first vehicle 901 and also travels from south to north; and the fifth vehicle 905 is parked at the side of the north-south road near the southeast corner of intersection A, that is, on the roadside ahead of the first vehicle 901.
- the first vehicle 901 can detect the current traffic scene, decide on a driving strategy for that scene, and then control the vehicle according to the decision result, for example by accelerating, decelerating, or changing lanes according to a decided strategy of preempting, yielding, or avoiding.
- One intelligent driving decision-making scheme decides the driving strategy through a game. For example, the first vehicle 901 decides its own driving strategy through a game in a traffic scene where the second vehicle 902 is driving in the opposite direction. Game-based decision-making of this kind has difficulty handling complex traffic scenarios. For example, in the traffic scenario described above, the third vehicle 903 crosses intersection A and the fifth vehicle 905 is parked on the roadside ahead of the first vehicle 901.
- in that case, the game objects of the first vehicle 901 are the second vehicle 902 and the third vehicle 903 at the same time, so game decision-making needs a multi-dimensional game space, for example one spanned by the lateral and longitudinal driving dimensions of each of the first vehicle 901, the second vehicle 902, and the third vehicle 903.
- the use of multi-dimensional game space will lead to an explosive increase in the number of game decision-making solutions, resulting in a geometric increase in the computational burden, which poses a great challenge to the existing hardware computing power. Therefore, currently limited by hardware computing power, it is difficult to realize productization in intelligent driving scenarios by using multi-dimensional game space for game decision-making.
- the embodiment of the present application provides an improved intelligent driving decision-making scheme.
- the basic principle of the scheme includes: for the self-vehicle, identify obstacles in the current traffic scene; the obstacles may include game objects of the self-vehicle and non-game objects of the self-vehicle.
- the strategy space formed by a single sampling dimension or multiple sampling dimensions is released multiple times, and after each release of the strategy space, the solution of the self-vehicle and its single game object is searched for in the strategy space.
- the decision result of the self-vehicle's driving is determined according to the game result, and the driving of the vehicle can be controlled according to the decision result.
- once a solution is found, the strategy spaces that have not yet been released from the multi-dimensional game space no longer need to be released.
- when the self-vehicle has multiple game objects, the strategy feasible domains of the self-vehicle with respect to each game object can be determined separately as above, and, using the action of the self-vehicle as an index, the intersection of these strategy feasible domains is taken (the intersection means that every feasible domain includes the same action of the self-vehicle) to obtain the driving strategy of the self-vehicle.
- the method can obtain the optimal driving decision in the multi-dimensional game space with the least number of searches and can reduce the use of the strategy space as much as possible, which lowers the requirements for hardware computing power and makes productization on vehicles easier.
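- the intersection of per-object strategy feasible domains, indexed by the action of the self-vehicle, can be illustrated with a minimal Python sketch. The data layout (a dictionary from self-vehicle action to the set of compatible game-object actions), the function name `intersect_by_ego_action`, and the sample values are assumptions made for illustration; they are not taken from the application.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: an action is (longitudinal acceleration, lateral offset).
Action = Tuple[float, float]
# A feasible domain maps each feasible self-vehicle action to the set of
# game-object actions it is compatible with.
FeasibleDomain = Dict[Action, Set[Action]]


def intersect_by_ego_action(domains: Dict[str, FeasibleDomain]) -> Set[Action]:
    """Return the self-vehicle actions that are feasible against every game object."""
    if not domains:
        return set()
    ego_action_sets = [set(domain.keys()) for domain in domains.values()]
    return set.intersection(*ego_action_sets)


# Illustrative values only.
domains = {
    "vehicle_902": {(-1.0, 0.0): {(1.0, 0.0)}, (0.0, 0.0): {(-1.0, 0.0)}},
    "vehicle_903": {(-1.0, 0.0): {(0.0, 0.0)}, (1.0, 0.0): {(-2.0, 0.0)}},
}
common_ego_actions = intersect_by_ego_action(domains)
print(common_ego_actions)
```

- in this sketch only the self-vehicle action shared by both feasible domains survives, which mirrors the idea that each single-object game is solved separately and the results are then combined.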
- the implementation subject of the intelligent driving decision-making scheme of the embodiment of the present application may be an agent that is powered and can move autonomously.
- through the intelligent driving decision-making scheme provided by the embodiment of the present application, the agent can make game decisions with other objects in the traffic scene and generate semantic-level decision labels and the expected driving trajectory of the agent, so that the agent can perform reasonable lateral and longitudinal motion planning.
- the agent can be, for example, a vehicle with an automatic driving function, a robot that can move autonomously, and the like.
- the vehicles here include general motor vehicles, for example land transportation devices such as cars, sport utility vehicles (Sport Utility Vehicle, SUV), multi-purpose vehicles (Multi-purpose Vehicle, MPV), automated guided vehicles (Automated Guided Vehicle, AGV), buses, trucks and other cargo or passenger vehicles, as well as water-surface transportation devices including various ships and boats, and aircraft.
- motor vehicles also include hybrid vehicles, electric vehicles, gasoline vehicles, plug-in hybrid vehicles, fuel cell vehicles and other alternative fuel vehicles.
- a hybrid vehicle refers to a vehicle with two or more power sources, and an electric vehicle includes a pure electric vehicle, an extended-range electric vehicle, and the like.
- the aforementioned robot that can move autonomously may also be regarded as one kind of vehicle.
- the following description takes the application of the intelligent driving decision-making scheme provided by the embodiment of the present application to a vehicle as an example.
- a vehicle in some embodiments may also include a communication device 14, a navigation device 15, or a display device 16.
- the environment information acquiring device 11 can be used to acquire the external environment information of the vehicle.
- the environmental information acquisition device 11 may include a camera, a laser radar, a millimeter-wave radar, an ultrasonic radar, or a Global Navigation Satellite System (GNSS) as described later, and there may be one or more of each. The camera can include a conventional RGB (Red Green Blue) three-color camera sensor, an infrared camera sensor, etc.
- the acquired vehicle external environment information includes road surface information and objects on the road surface. Objects on the road surface include surrounding vehicles, pedestrians, etc.; specifically, the information may include motion state information of surrounding vehicles, which may include vehicle speed, acceleration, heading angle information, trajectory information, etc.
- the motion state information of surrounding vehicles may also be obtained through the communication device 14 of the vehicle 10.
- the environment information outside the vehicle acquired by the environment information acquisition device 11 can be used to form a world model constructed by roads (corresponding to road surface information) and obstacles (corresponding to objects on the road surface).
- the environmental information acquisition device 11 can also be an electronic device that receives the vehicle external environment information transmitted by the camera sensor, infrared night vision camera sensor, laser radar, millimeter-wave radar, ultrasonic radar, etc., such as a data transmission chip, for example a bus data transceiver chip or a network interface chip; the data transmission chip can also be a wireless transmission chip, such as a Bluetooth chip or a Wi-Fi chip.
- the environment information acquisition device 11 may also be integrated into the control device 12 to become an interface circuit or a data transmission module integrated into the processor.
- the control device 12 can be used to make intelligent driving strategy decisions based on the acquired vehicle external environment information (including the constructed world model) and to generate decision results.
- the decision results can include acceleration, braking, and steering (including changing lanes or turning), and can also include the expected driving trajectory of the vehicle in the short term (e.g., within a few seconds).
- the control device 12 can further generate corresponding instructions to control the driving system 13 according to the decision result, so as to execute the driving control of the vehicle through the driving system 13, and control the vehicle to realize the desired driving trajectory according to the decision result.
- the control device 12 can be an electronic device, for example, a processor of a vehicle-mounted processing device such as a head unit, a domain controller, a mobile data center (Mobile Data Center, MDC) or a vehicle-mounted computer, or a conventional chip such as a central processing unit (Central Processing Unit, CPU) or a microcontroller (Micro Control Unit, MCU).
- the driving system 13 may include a power system 131, a steering system 132, and a braking system 133, which will be introduced separately below:
- the power system 131 may include a driving electronic control unit (Electronic Control Unit, ECU) and a driving source.
- the driving ECU controls the driving force (such as torque) of the vehicle 10 by controlling the driving source.
- the driving source may be, for example, an engine, a drive motor, or the like.
- the driving ECU can control the driving source according to the driver's operation of the accelerator pedal, or can control the driving source according to the command sent from the control device 12, thereby being able to control the driving force.
- the driving force of the driving source is transmitted to the wheels via a transmission or the like, thereby driving the vehicle 10 to run.
- the steering system 132 may include a steering electronic control unit (ECU) and an electric power steering system (Electric Power Steering, EPS).
- the steering ECU can control the EPS motor according to the driver's operation of the steering wheel, or can control the EPS motor according to the command sent from the control device 12, thereby controlling the direction of the wheels (specifically, the steering wheels).
- steering can also be performed by changing the torque distribution or braking force distribution to the left and right wheels.
- the braking system 133 may include a braking electronic control unit (ECU) and a braking mechanism.
- the brake mechanism makes the brake components work through the brake motor, hydraulic mechanism, etc.
- the brake ECU can control the brake mechanism according to the driver's operation of the brake pedal, or can control the brake mechanism according to the command sent from the control device 12, and can control the braking force.
- the braking system 133 may also include an energy regenerative braking mechanism.
- a communication device 14 may also be included.
- the communication device 14 can perform data interaction with external objects through wireless communication, and obtain data required by the vehicle 10 for smart driving decisions.
- the communicable external object may include a cloud server, a mobile terminal (such as a mobile phone, a laptop, a tablet, etc.), a roadside device, or a surrounding vehicle.
- the data required for decision-making includes user portraits of vehicles around the vehicle 10 (that is, other cars), which reflect the driving habits of the drivers of the other cars, and may also include the positions of the other cars, the motion state information of the other cars, etc.
- a navigation device 15 may also be included, and the navigation device 15 may include a Global Navigation Satellite System (Global Navigation Satellite System, GNSS) receiver and a map database.
- the navigation device 15 can determine the position of the vehicle 10 from satellite signals received by the GNSS receiver, can generate a route to the destination according to the map information in the map database, and can provide information about the route (including the position of the vehicle 10) to the control device 12.
- the navigation device 15 may also have an inertial measurement unit (Inertial Measurement Unit, IMU), and perform more accurate positioning of the vehicle 10 by fusing the information of the GNSS receiver and the information of the IMU.
- a display device 16 may also be included, for example, it may be a display screen installed in the central control position of the cockpit of the vehicle, or it may be a head-up display device (Head Up Display, HUD).
- the control device 12 may display the decision result on the display device 16 in the cockpit of the vehicle in a manner understandable to the user, for example, in the form of expected driving trajectory, arrows, text, and the like.
- when displaying the desired driving trajectory it may also be displayed on a display device in the vehicle cockpit in the form of a partially enlarged view in combination with the vehicle's current traffic scene (such as a graphical traffic scene).
- the control device 12 can also display information on the route to the destination provided by the navigation device 15.
- a voice playback system may also be included, which informs the user by voice of the decision made for the current traffic scene.
- the intelligent driving vehicle that is in the traffic scene and executes the intelligent driving decision-making method provided in the embodiment of the present application is called the self-vehicle.
- from the perspective of the self-vehicle, other objects in the traffic scene that affect or may affect the driving of the self-vehicle are called obstacles of the self-vehicle.
- the self-vehicle has a certain behavior decision-making ability, and can generate a driving strategy to change its own motion state.
- the driving strategy includes acceleration, braking, and steering (including changing lanes or turning). The self-vehicle also has the ability to execute driving behavior, including executing the driving strategy and driving according to the desired driving trajectory determined by the decision.
- the obstacle of the self-vehicle can also have behavioral decision-making ability to change its own motion state, for example, the obstacle can be a vehicle, a pedestrian, etc. that can move autonomously.
- the obstacle of the self-vehicle may also not have behavioral decision-making ability, or may not change its motion state.
- the obstacle may be a vehicle parked on the side of the road (the vehicle is in a non-starting state), a width-limiting pier on the road, etc.
- the obstacles of the self-vehicle can include pedestrians, bicycles, motor vehicles (such as motorcycles, passenger cars, trucks, buses, etc.), etc., wherein the motor vehicles can include intelligent driving vehicles that can implement the intelligent driving decision-making method.
- the obstacles of the own vehicle can be further divided into game objects of the own vehicle, non-game objects of the own vehicle or irrelevant obstacles of the own vehicle.
- the interaction strength between game objects, non-game objects and irrelevant obstacles and the ego vehicle gradually weakens from strong interaction to no interaction. It should be understood that in multiple traffic scenarios corresponding to different decision-making moments, game objects, non-game objects, and irrelevant obstacles may be converted to each other.
- the position or motion state of an irrelevant obstacle of the own vehicle makes it completely irrelevant to the future behavior of the own vehicle, and there will be no trajectory conflict or intention conflict between the own vehicle and the irrelevant obstacle in the future. Therefore, unless otherwise specified, the obstacles below refer to the game objects of the own vehicle and the non-game objects of the own vehicle.
- there will be a trajectory conflict or intention conflict between a non-game object of the self-vehicle and the self-vehicle in the future, and the self-vehicle needs to unilaterally adjust its own motion state to resolve the possible future trajectory conflict or intention conflict with the non-game object; that is, there is no game interaction relationship between the non-game object and the self-vehicle. In other words, the non-game object of the self-vehicle is not affected by the driving behavior of the self-vehicle, maintains its predetermined motion state, and does not adjust its motion state to resolve the possible future trajectory conflict or intention conflict with the self-vehicle.
- the ego vehicle 101 goes straight through the unprotected intersection.
- the oncoming vehicle 102 (in the left front of the own vehicle 101) turns left and passes through the unprotected intersection.
- there is a trajectory conflict or intention conflict between the oncoming vehicle 102 and the own vehicle 101, and the oncoming vehicle 102 is the game object of the own vehicle 101.
- the ego vehicle 101 is going straight.
- the oncoming vehicle 102 on the left crosses the lane where the own vehicle 101 is located and passes.
- the ego vehicle 101 is going straight.
- the vehicle 102 traveling in the same direction (at the right front of the own vehicle 101) merges into the own vehicle's lane or a lane adjacent to the own vehicle.
- there is a trajectory conflict or intention conflict between the vehicle 102 and the own vehicle 101, a game interaction relationship will be established with the own vehicle 101, and the vehicle 102 is the game object of the own vehicle 101.
- the own vehicle 101 is going straight, the oncoming vehicle 103 is going straight in the left adjacent lane of the own vehicle 101, and there is a stationary vehicle 102 in the right adjacent lane of the own vehicle 101 (at the right front of the own vehicle 101).
- there is a trajectory conflict or intention conflict between the oncoming vehicle 103 and the own vehicle 101, a game interaction relationship will be established with the own vehicle 101, and the oncoming vehicle 103 is the game object of the own vehicle 101.
- the location of the stationary vehicle 102 will conflict with the trajectory of the self-vehicle 101 in the future, but according to the obtained external environment information it can be confirmed that, during the interactive game decision-making process, the stationary vehicle 102 will not switch to the moving state, or, even if it does, it has a relatively high right of way and will not establish a game interaction relationship with the self-vehicle 101. Therefore, the stationary vehicle 102 is a non-game object of the self-vehicle 101.
- the ego vehicle 101 changes lanes to the right from the current lane to merge into the right adjacent lane.
- the first straight-going vehicle 103 has a higher right of way, does not establish a game interaction relationship with the own vehicle 101, and is a non-game object of the own vehicle 101.
- the second straight-going vehicle 102 at the right rear of the own vehicle 101 will have a trajectory conflict with the own vehicle 101 in the future and will establish a game interaction relationship with the own vehicle 101; the second straight-going vehicle 102 is the game object of the own vehicle 101.
- this step may include the following sub-steps:
- the self-vehicle obtains the vehicle external environment information, which includes the motion states and relative position information of the self-vehicle and of the obstacles in the road scene.
- the ego vehicle can acquire the vehicle external environment information through its environmental information acquisition device, such as a camera sensor, infrared night vision camera sensor, laser radar, millimeter-wave radar, ultrasonic radar, GNSS, etc.
- the ego vehicle can also acquire the vehicle external environment information through its communication device, by communicating with a roadside device or with a cloud server.
- the roadside device may have a camera or a communication device, which can obtain vehicle information around it, and the cloud server can receive and store the information reported by each roadside device.
- the above two methods may also be combined to acquire the vehicle external environment information.
- S12 According to the acquired motion state of each obstacle, or its motion state over a period of time, or the driving trajectory formed by the obstacle, and the relative position information with respect to the obstacle, identify the game objects of the self-vehicle from the obstacles.
- non-game objects of the own vehicle may also be identified from the obstacles at the same time, or non-game objects of the own vehicle's game objects may be identified from the obstacles.
- the aforementioned game objects or non-game objects can be identified according to preset judging rules.
- the judgment rule is, for example: if the driving trajectory or driving intention of an obstacle conflicts with that of the own vehicle, and the obstacle has behavioral decision-making ability and can change its own motion state, then it is a game object of the own vehicle. If the driving trajectory or driving intention of an obstacle conflicts with that of the own vehicle, but the obstacle does not actively change its own motion state to avoid the conflict, it is a non-game object of the own vehicle.
- the driving trajectory or driving intention of an obstacle can be judged according to the lane in which it is driving (straight-going or turning lane), whether its turn signal is on, the direction in which the front of the vehicle is pointing, and so on.
- the crossing obstacle and the obstacle merging into the lane both intersect the own vehicle's trajectory at a large angle and thus have a driving trajectory conflict, and are therefore classified as game vehicles.
- the oncoming vehicle 103 passing through the narrow lane in FIG. 3D and the following vehicle 104 in the side-lane traffic flow into which the own vehicle merges in FIG. 3E have intention conflicts with the own vehicle, and thus are classified as game vehicles.
- for the vehicle 103 ahead in the side-lane traffic flow into which the ego vehicle merges, although the trajectories or intentions conflict, the ego vehicle has a lower right of way than the other vehicle, so the other vehicle will not take actions to resolve the conflict and the behavior of the ego vehicle cannot change the behavior of the other vehicle; the other vehicle is therefore a non-game vehicle.
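- the preset judgment rule described above can be sketched as a small rule-based classifier. This is a minimal illustration that assumes the conflict and right-of-way assessments are already available as boolean flags; the names `Obstacle`, `Role`, `classify` and the field names are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    GAME_OBJECT = "game object"
    NON_GAME_OBJECT = "non-game object"
    IRRELEVANT = "irrelevant obstacle"


@dataclass
class Obstacle:
    name: str
    has_conflict: bool   # trajectory or intention conflict with the own vehicle
    can_decide: bool     # has behavioral decision-making ability
    will_adjust: bool    # would change its motion state to resolve the conflict


def classify(obstacle: Obstacle) -> Role:
    """Apply the rule: no conflict -> irrelevant; conflict and reacts -> game object."""
    if not obstacle.has_conflict:
        return Role.IRRELEVANT
    if obstacle.can_decide and obstacle.will_adjust:
        return Role.GAME_OBJECT
    # Conflict exists, but the obstacle keeps its motion state
    # (e.g. a parked vehicle or a vehicle with a higher right of way).
    return Role.NON_GAME_OBJECT


crossing = Obstacle("crossing vehicle", True, True, True)
parked = Obstacle("parked vehicle", True, False, False)
priority = Obstacle("vehicle with higher right of way", True, True, False)
distant = Obstacle("distant vehicle", False, True, True)
for item in (crossing, parked, priority, distant):
    print(item.name, "->", classify(item).value)
```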
- the self-vehicle obtains obstacle information from the vehicle external environment information that it perceives or acquires, based on known algorithms, and identifies from the obstacles the game objects of the self-vehicle, the non-game objects of the self-vehicle, or the non-game objects of the game objects.
- the above algorithm can be, for example, a classification neural network based on deep learning. Since identifying the type of an obstacle is equivalent to classification, a neural network classification model can be used for inference.
- the classification neural network can adopt a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), Bidirectional Encoder Representations from Transformers (BERT), etc.
- when the classification neural network is trained, sample data can be used; the sample data can be pictures or video clips of vehicle driving scenes marked with classification labels, and the classification labels can include game object, non-game object, and non-game object of a game object.
- the above algorithm may also use attention-related algorithms, such as an attention model.
- the attention model is used to output the attention value that each obstacle assigns to the self-vehicle, which is related to the degree of intention conflict or trajectory conflict between the obstacle and the self-vehicle. For example, obstacles that have intention conflicts or trajectory conflicts with the ego vehicle assign more attention to the ego vehicle, while obstacles that do not have intention conflicts or trajectory conflicts with the ego vehicle assign little or zero attention to the ego vehicle; obstacles with a higher right of way than the ego vehicle can also assign little or zero attention to the ego vehicle.
- if an obstacle assigns enough attention to the self-vehicle (for example, above a certain threshold), the obstacle can be identified as a game object of the self-vehicle. If the obstacle does not assign enough attention to the self-vehicle (for example, below a certain threshold), the obstacle can be identified as a non-game object of the self-vehicle.
- the attention model can also be implemented by a neural network, and the output of the neural network at this time is the attention value assigned to the ego vehicle corresponding to the identified obstacle.
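- the threshold step applied to the attention values can be sketched as follows. The attention values are assumed to come from some upstream model; the function name `classify_by_attention`, the threshold of 0.5, and the sample values are illustrative assumptions only.

```python
from typing import Dict


def classify_by_attention(attention: Dict[str, float], threshold: float = 0.5) -> Dict[str, str]:
    """Label each obstacle by the attention it assigns to the ego vehicle.

    The threshold value is a hypothetical choice, not taken from the application.
    """
    return {
        name: "game object" if value >= threshold else "non-game object"
        for name, value in attention.items()
    }


# Illustrative attention values (assumed outputs of an attention model).
attention_values = {"oncoming_103": 0.8, "stationary_102": 0.0, "leading_vehicle": 0.1}
labels = classify_by_attention(attention_values)
print(labels)
```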
- Step S20 For the interactive game task between the ego vehicle and the game object, multiple releases are executed on the multiple strategy spaces of the ego vehicle and the game object; after one of the multiple releases is executed, the strategy feasible region of the ego vehicle and the game object is determined according to the strategy spaces released so far, and the decision result for the driving of the ego vehicle is determined according to the strategy feasible region.
- the decision result refers to an executable behavior action pair between the ego vehicle and the game object in the policy feasible domain.
- step S20 completes the single-vehicle interactive game decision-making process between the self-vehicle and any game object, and determines the feasible domain of the strategy between the self-vehicle and the game object.
- performing multiple releases of the plurality of strategy spaces includes releasing the strategy spaces successively, that is, only one strategy space is released in each release, and after successive releases multiple strategy spaces have been released cumulatively. Each strategy space is spanned by at least one sampling dimension.
- in general, the ego vehicle first accelerates or decelerates in its current lane before changing lanes. Therefore, optionally, in the process of releasing the multiple strategy spaces, they can be released successively in the following dimension order: longitudinal sampling dimension, lateral sampling dimension, and time sampling dimension. Different dimensions can form different strategy spaces. For example, when the longitudinal sampling dimension is released, the longitudinal sampling strategy space can be spanned; when the lateral sampling dimension is released, the lateral sampling strategy space can be spanned, or it can be spanned together with the longitudinal sampling dimension into a strategy space that combines the longitudinal sampling strategy space and the lateral sampling strategy space; when the time sampling dimension is released, multiple strategy spaces composed of multi-frame deduction can be formed.
- the release can also be a sequential release of combinations of parts of different strategy spaces. For example, in the first release, a partial space of the longitudinal sampling strategy space and a partial space of the lateral sampling strategy space are released in sequence, and then the remaining space of the longitudinal sampling strategy space and the remaining space of the lateral sampling strategy space are released in sequence.
- the cumulatively released multiple strategy spaces may include: the longitudinal sampling strategy space; the lateral sampling strategy space; the strategy space formed by combining the longitudinal sampling strategy space and the lateral sampling strategy space; the strategy spaces formed by combining the longitudinal sampling strategy space and the lateral sampling strategy space, respectively, with the time sampling dimension; and the strategy space formed by combining the longitudinal sampling dimension, the lateral sampling dimension, and the time sampling dimension.
- the dimensions constituting the strategy space may include the longitudinal sampling dimension, the lateral sampling dimension, or the time sampling dimension.
- the longitudinal sampling dimension used when the longitudinal sampling strategy space is spanned includes at least one of the following: the longitudinal acceleration of the ego vehicle and the longitudinal acceleration of the game object;
- the lateral sampling dimension used when the lateral sampling strategy space is spanned includes at least one of the following: the lateral offset of the ego vehicle and the lateral offset of the game object;
- the time sampling dimension includes multiple strategy spaces composed of continuous multi-frame deduction corresponding to continuous time points (that is, sequentially increasing the deduction depth). The combination of these three dimensions can constitute the above-mentioned multiple strategy spaces.
- the values in each lateral or longitudinal sampling dimension when each strategy space is spanned correspond to the sampling actions of the ego vehicle or the game object, that is, the behavioral actions.
- this step S20 may include the following substeps S21-S26:
- S21 Execute the first release of the strategy space: release a strategy space of the self-vehicle and the game object, and, according to the released strategy space, take out one by one the behavior-action pairs formed by multiple values of the self-vehicle in at least one sampling dimension and multiple values of the game object in at least one sampling dimension.
- in the first release, the longitudinal sampling dimension is released, including the longitudinal acceleration of the ego vehicle and the longitudinal acceleration of the game object, and the strategy space spanned by the released longitudinal sampling dimension is the longitudinal sampling strategy space, hereinafter referred to, for simplicity, as the longitudinal sampling strategy space of the first release.
- the strategy space consists of the behavior-action pairs to be evaluated, each formed by a longitudinal acceleration of the ego vehicle and a longitudinal acceleration of the game object (namely the other vehicle).
- multiple sampling values may be set in each sampling dimension.
- a sampling interval may be formed by a plurality of sampling values at uniform and continuous sampling intervals.
- a plurality of sampling values scattered on the sampling dimension are discrete sampling points.
- in the longitudinal acceleration dimension of the ego vehicle, uniform sampling is performed at a predetermined sampling interval, and multiple sampling values of the ego vehicle in the longitudinal acceleration dimension are obtained, M1 in number, that is, M1 longitudinal acceleration sampling actions of the ego vehicle. Similarly, sampling in the longitudinal acceleration dimension of the game object yields N1 longitudinal acceleration sampling actions of the game object.
- the longitudinal sampling strategy space released for the first time includes M1*N1 behavior action pairs of the ego vehicle and the game object obtained by combining the ego vehicle’s longitudinal acceleration sampling action and the game object’s longitudinal acceleration sampling action.
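- the construction of the first-release longitudinal sampling strategy space can be sketched in Python as a Cartesian product of the two sets of sampled accelerations. The acceleration ranges, the sampling steps, and the helper name `uniform_samples` are assumptions chosen for illustration; the application does not fix these values in this passage.

```python
from itertools import product
from typing import List


def uniform_samples(low: float, high: float, step: float) -> List[float]:
    """Uniformly sample the closed interval [low, high] at the given step."""
    count = int(round((high - low) / step)) + 1
    return [low + i * step for i in range(count)]


# Assumed acceleration ranges in m/s^2 (illustrative only).
ego_accels = uniform_samples(-4.0, 3.0, 1.0)   # M1 sampling actions of the ego vehicle
obj_accels = uniform_samples(-4.0, 2.0, 2.0)   # N1 sampling actions of the game object

# Each element is one behavior-action pair (ego acceleration, object acceleration).
longitudinal_space = list(product(ego_accels, obj_accels))
print(len(ego_accels), len(obj_accels), len(longitudinal_space))
```

- the number of behavior-action pairs equals the product of the two sample counts, which is the M1*N1 size stated above.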
- a specific example of this strategy space can be found in Table 1 or Table 2 below.
- the first row and the first column of Table 1 or Table 2 are the values of the own car and the game object (that is, the other car O in the table) respectively.
- the game object is the crossing game car
- the game object is the opposing game car.
- S22 Deduce each behavior-action pair in the released strategy space into the sub-traffic scene currently constructed by the self-vehicle and the game object, and determine the cost value corresponding to each behavior-action pair.
- the self-vehicle and each game object can also construct each sub-traffic scene separately, and each sub-traffic scene is a subset of the road scene where the self-vehicle is located.
- the cost value corresponding to each behavior-action pair in the strategy space is determined according to at least one of the following, each corresponding to the self-vehicle and the game object executing the behavior-action pair: safety cost value, comfort cost value, lateral offset cost value, passability cost value, right-of-way cost value, risk area cost value, and inter-frame correlation cost value.
- when several of the above cost values are used, their weighted sum can be used.
- the calculated weighted sum can be called the total cost value. The smaller the total cost value, the greater the benefit of the decision corresponding to the behavior-action pair, and the greater the possibility that the behavior-action pair becomes the decision result.
- the above-mentioned cost values will be further described later.
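- the weighted sum of the individual cost values can be written as a short Python sketch. The cost term names follow the list above, while the weights and the individual cost values are placeholders assumed for illustration; how each cost value is actually computed is not covered here.

```python
from typing import Dict

COST_TERMS = (
    "safety", "comfort", "lateral_offset", "passability",
    "right_of_way", "risk_area", "inter_frame",
)


def total_cost(costs: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the cost values of one behavior-action pair.

    Terms missing from `costs` contribute zero; the weights are assumed values.
    """
    return sum(weights[term] * costs.get(term, 0.0) for term in weights)


# Hypothetical weights and cost values for a single behavior-action pair.
weights = {term: 1.0 for term in COST_TERMS}
weights["safety"] = 10.0
pair_costs = {"safety": 0.2, "comfort": 0.5}
pair_total = total_cost(pair_costs, weights)
print(pair_total)
```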
- S23 Add the behavior action pair whose cost value is not greater than the cost threshold to the strategy feasible region between the self-vehicle and the game object.
- the strategy feasible region is the result of the game between the self-vehicle and the game object when the strategy space is released for the first time.
- the policy feasible domain refers to the set of executable behavior and action pairs.
- the table items whose table content is Cy or Cg in Table 1 below constitute the feasible domain of the strategy.
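- step S23 amounts to filtering the behavior-action pairs by a cost threshold, which can be sketched as follows. The dictionary layout, the function name `feasible_region`, the threshold, and the cost numbers are illustrative assumptions.

```python
from typing import Dict, Tuple

# A behavior-action pair: (ego longitudinal acceleration, object longitudinal acceleration).
Pair = Tuple[float, float]


def feasible_region(pair_costs: Dict[Pair, float], cost_threshold: float) -> Dict[Pair, float]:
    """Keep the behavior-action pairs whose cost is not greater than the threshold."""
    return {pair: cost for pair, cost in pair_costs.items() if cost <= cost_threshold}


# Hypothetical total cost values for four behavior-action pairs.
costs = {(-1.0, 1.0): 0.8, (0.0, 0.0): 5.0, (1.0, -1.0): 1.0, (1.0, 1.0): 9.5}
region = feasible_region(costs, cost_threshold=1.0)
print(sorted(region))
```

- an empty result corresponds to the case in which no solution exists in the strategy space released so far.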
- the second release of the strategy space is performed, that is, the next strategy space among the multiple strategy spaces is released.
- the dimension released in the second release is the lateral sampling dimension.
- the lateral sampling strategy space is spanned by the lateral offset.
- the lateral sampling strategy space released this time and the longitudinal sampling strategy space released the first time are jointly used as the current strategy space.
- the strategy space currently used for the interactive game between the ego vehicle and the game object is the strategy space formed by combining the longitudinal sampling strategy space and the lateral sampling strategy space.
- the lateral sampling strategy space is formed on the lateral offset dimension of the ego vehicle and the lateral offset dimension of the game object.
- uniform sampling is performed at a predetermined sampling interval to obtain multiple sampling values of the ego vehicle on the lateral offset dimension; their number is denoted Q, that is, Q lateral offset sampling actions of the ego vehicle.
- uniform sampling is performed at a predetermined sampling interval to obtain multiple sampling values of the game object on the lateral offset dimension; their number is denoted R, that is, R lateral offset sampling actions of the game object.
- each behavior-action pair of the ego vehicle and the game object is jointly formed by the ego vehicle's lateral offset sampling action, the game object's lateral offset sampling action, the ego vehicle's longitudinal acceleration sampling action, and the game object's longitudinal acceleration sampling action.
- the strategy space of the self-vehicle and the game object released for the second time includes M2*N2*Q*R behavior-action pairs.
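The size M2*N2*Q*R follows from taking the Cartesian product of the four sampled dimensions. The sketch below shows this under assumed ranges: the longitudinal range [-4, 3] m/s² with a 1 m/s² interval matches the worked example later in the text, while the lateral offset range [-1, 1] m with a 1 m interval and the helper names are hypothetical.

```python
from itertools import product


def uniform_samples(low, high, step):
    """Uniformly sample [low, high] at a fixed interval, including both ends."""
    count = int(round((high - low) / step)) + 1
    return [low + i * step for i in range(count)]


def joint_strategy_space(ego_acc, obj_acc, ego_offset, obj_offset):
    """All behavior-action pairs over the longitudinal and lateral sampling dimensions."""
    return list(product(ego_acc, obj_acc, ego_offset, obj_offset))


ego_acc = uniform_samples(-4, 3, 1)     # M2 = 8 longitudinal accelerations
obj_acc = uniform_samples(-4, 3, 1)     # N2 = 8 longitudinal accelerations
ego_offset = uniform_samples(-1, 1, 1)  # Q = 3 lateral offsets (assumed range)
obj_offset = uniform_samples(-1, 1, 1)  # R = 3 lateral offsets (assumed range)
space = joint_strategy_space(ego_acc, obj_acc, ego_offset, obj_offset)
```

With these assumed values the second release contains 8*8*3*3 = 576 pairs, compared with 64 for the longitudinal dimension alone, which is why releasing dimensions one at a time saves search effort.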
- Table 3 below, wherein each table item in the upper horizontal sampling strategy space in Table 3 is associated with a table item in the lower vertical sampling strategy space in Table 3.
- the game object, that is, the other car O in the table, is the opposing game car.
- step S26 When the strategy feasible region (ie game result) in step S25 is not empty, select a behavior-action pair as the decision result, and end the release of the strategy space.
- when the policy feasible domain is empty, the current policy space has no solution; in that case the third release of the policy space is performed, that is, the next policy space among the multiple policy spaces is released. Other strategy spaces can be released in sequence in the same way, so that the game result and the decision result continue to be determined.
- for the multiple releases above, the strategy space formed by the i-th group of values of the ego vehicle and/or game object on the longitudinal acceleration dimension and the i-th group of values of the ego vehicle and/or game object on the lateral offset dimension can be released first; when no strategy feasible region exists in that strategy space, the strategy space formed by the (i+1)-th group of values on the longitudinal acceleration dimension and the (i+1)-th group of values of the ego vehicle and/or game object on the lateral offset dimension is released.
- each strategy space released in this way is a local region whose position moves within the game space formed by all values of the ego vehicle and/or game object on the longitudinal acceleration dimension and all values of the ego vehicle and/or game object on the lateral offset dimension, where i is a positive integer.
- because the strategy spaces corresponding to different parts are released sequentially, the optimal decision result can be found in the multi-dimensional game space with as few searches as possible, which keeps the strategy space in use as small as possible and lowers the requirement on hardware computing power.
- if the strategy feasible region of the self-vehicle and the game object is still empty after the above steps have been executed multiple times to release the strategy spaces, there is still no solution.
- the above-mentioned conservative decision-making includes the behavior of making the self-vehicle brake safely, the behavior of making the self-vehicle decelerate safely, or giving a prompt or warning so that the driver takes over the control of the vehicle.
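The successive release described above can be sketched as a loop that stops at the first strategy space with a non-empty feasible region and falls back to a conservative decision otherwise. This is only an illustration: the names `decide` and `CONSERVATIVE`, the toy cost function, and the choice of the minimum-cost pair as the decision result are assumptions.

```python
CONSERVATIVE = "conservative_decision"  # e.g. safe braking, safe deceleration, or a takeover prompt


def decide(strategy_spaces, cost_fn, cost_threshold):
    """Release strategy spaces in order; return (decision, number of pairs searched)."""
    searched = 0
    for space in strategy_spaces:
        feasible = {}
        for pair in space:
            searched += 1
            cost = cost_fn(pair)
            if cost <= cost_threshold:
                feasible[pair] = cost
        if feasible:
            # Stop releasing as soon as one strategy space yields a solution.
            return min(feasible, key=feasible.get), searched
    return CONSERVATIVE, searched


# Toy data: the first released space has no feasible pair, the second has one.
spaces = [[(0,), (1,)], [(2,), (3,)]]
decision, searched = decide(spaces, lambda p: 0.2 if p[0] == 3 else 1.0, 0.5)
fallback, _ = decide(spaces, lambda p: 1.0, 0.5)
```

Because later strategy spaces are only evaluated when earlier ones fail, the number of pairs searched stays small in the common case.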
- a single-frame deduction is completed.
- determining the policy feasible domain may also include: as the deduction time advances (that is, increasing the deduction depth in sequence over multiple consecutive moments), performing multiple releases of the time sampling dimension and performing multi-frame deduction.
- T1 is used to indicate the initial motion state of the self-vehicle and the game object
- T2 is used to indicate the motion state of the self-vehicle and the game object after the first frame deduction, that is, the deduction result of the first frame
- Tn is used to indicate the motion state of the self-vehicle and the game object after the deduction of frame n-1.
- the deduction time is advanced by a predetermined time interval (e.g., 2 seconds or 5 seconds), that is, moved to the next moment (also referred to as a time point).
- the deduction result of the current frame is used as the initial condition for the deduction of the next frame, to deduce the motion state of the ego vehicle and the game object at the next moment; in this way, the time sampling dimension can continue to be released at subsequent moments, so that subsequent consecutive multi-frame deduction and the determination of game results and decision results continue.
- when releasing the time sampling dimension, it is necessary to evaluate the decision results of the behavior decisions between the self-vehicle and the game object determined by two adjacent single-frame deductions and to determine the inter-frame correlation cost value, which is described in detail later.
- releasing the time sampling dimension helps to improve the behavior consistency of the vehicle. For example, when the motion states of the self-vehicle and the game object, or the intention decisions corresponding to the decision results, are the same or similar across frames, the driving behavior of the intelligent driving vehicle executing the intelligent driving decision-making method is more stable in the time domain, the fluctuation of the driving trajectory is smaller, and the driving comfort of the vehicle is better.
- releasing the time sampling dimension as above means that, at multiple consecutive decision-making moments, the released multiple strategy spaces are used to obtain the strategy feasible regions of the self-vehicle and the game object, and the motion states of the self-vehicle and the game object after executing the executable behavior-action pairs in these strategy feasible regions are deduced in time order.
- the consistency of decision results in time can be achieved.
- the decision result of the first frame in the multi-frame derivation can be used as the decision result of the self-vehicle driving.
- the decision result of the first frame may be reselected.
- the decision result of the first frame corresponds to the decision result of single-frame deduction, and the decision result of the first frame is reselected, that is, another behavior-action pair is selected as the decision result from the policy feasible domain of the decision result of the first frame.
- multi-frame deduction can be performed again to judge whether it can be used as the final decision result.
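The reselection logic can be sketched as follows, assuming two caller-supplied helpers that the source does not define: `single_frame(state)` returns the feasible behavior-action pairs of one frame ordered by preference, and `step(state, pair, dt)` deduces the motion state one time interval later. Later frames greedily take their preferred pair, which is a simplification chosen for brevity.

```python
def multi_frame_decision(state, single_frame, step, num_frames, dt):
    """Return a first-frame pair whose multi-frame deduction stays feasible, else None."""
    for first_pair in single_frame(state):
        current = step(state, first_pair, dt)  # first-frame result seeds the next frame
        consistent = True
        for _ in range(num_frames - 1):
            candidates = single_frame(current)
            if not candidates:
                consistent = False  # a later frame has no solution: reselect the first frame
                break
            current = step(current, candidates[0], dt)
        if consistent:
            return first_pair
    return None


# Toy model: the state is a number that must not exceed 4; actions add 2 or 1.
toy_single_frame = lambda s: [a for a in (2, 1) if s + a <= 4]
toy_step = lambda s, a, dt: s + a
choice = multi_frame_decision(0, toy_single_frame, toy_step, num_frames=3, dt=2.0)
```

In the toy run the preferred first-frame action 2 leads to a dead end in the third frame, so the second candidate is selected and deduced again.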
- the selection can be made according to the ranking of the cost values corresponding to the behavior-action pairs, preferably choosing the behavior-action pair with the smallest total cost value as the corresponding decision result.
- the above-mentioned different cost values may have different weights, which may be referred to respectively as the safety weight, comfort weight, lateral offset weight, passability weight, right-of-way weight, risk area weight, and inter-frame correlation weight.
- the weight distribution may be assigned as follows: safety weight>right of way weight>lateral offset weight>passability weight>comfortability weight>risk area weight>inter-frame correlation weight.
- the above cost values may be normalized respectively, and the value range is [0,1].
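A weighted total cost that respects the weight ordering above might look like the sketch below. The numeric weights are hypothetical; the source gives only their ordering. They are chosen to sum to 1 so that the total cost value also stays in [0, 1].

```python
# Hypothetical weights obeying: safety > right of way > lateral offset >
# passability > comfort > risk area > inter-frame correlation.
WEIGHTS = {
    "safety": 0.30,
    "right_of_way": 0.22,
    "lateral_offset": 0.17,
    "passability": 0.13,
    "comfort": 0.09,
    "risk_area": 0.06,
    "inter_frame": 0.03,
}


def total_cost(cost_values, weights=WEIGHTS):
    """Weighted sum of normalized cost values; each value must lie in [0, 1]."""
    for name, value in cost_values.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} cost {value} is outside [0, 1]")
    return sum(weights[name] * value for name, value in cost_values.items())


worst = total_cost({name: 1.0 for name in WEIGHTS})
only_safety = total_cost({"safety": 1.0})
```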
- the above cost values can be calculated according to different cost functions, which may be referred to respectively as the safety cost function, comfort cost function, passability cost function, lateral offset cost function, and right-of-way cost function.
- the safety cost value can be calculated according to a safety cost function with the relative distance between the self-vehicle and the other car (that is, the game object) as the independent variable, and the safety cost value is negatively correlated with the relative distance. For example, the greater the relative distance between the two vehicles, the smaller the safety cost value.
- a normalized safety cost function is the following piecewise function, where C_dist is the safety cost value and dist is the relative distance between the self-vehicle and the game object; for example, dist is the minimum distance, defined as the minimum polygon distance between the self-vehicle and the game object:
- threLow is the lower threshold of the distance, which is 0.2 in FIG. 8A
- threHigh is the upper threshold of the distance, and is 1.2 in FIG. 8A
- the distance lower threshold threLow and the distance upper threshold threHigh can be dynamically adjusted according to the interaction between the self-vehicle and other vehicles, such as the relative speed, relative distance, and relative angle between the self-vehicle and other vehicles.
- the safety cost value defined by the safety cost function is positively related to relative velocity or relative angle. For example, when meeting cars in opposite directions or laterally (horizontal means that the other car crosses the own car), the greater the relative speed or relative angle of the interaction between the two cars, the greater the corresponding safety cost.
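The piecewise formula itself is not reproduced in this text, so the sketch below shows only one shape consistent with the description: a cost of 1 at or below threLow, 0 at or above threHigh, and linear interpolation in between. The linear middle segment is an assumption; the default thresholds 0.2 and 1.2 are the values quoted for FIG. 8A.

```python
def safety_cost(dist, thre_low=0.2, thre_high=1.2):
    """Normalized safety cost: decreases as the relative distance grows."""
    if dist <= thre_low:
        return 1.0
    if dist >= thre_high:
        return 0.0
    # Assumed linear fall-off between the two thresholds.
    return (thre_high - dist) / (thre_high - thre_low)
```

Dynamic adjustment of the thresholds, as described above, would amount to passing different `thre_low` and `thre_high` values depending on relative speed, distance, or angle.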
- the comfort cost value of the vehicle can be calculated according to a comfort cost function whose independent variable is the absolute value of the acceleration change (ie, jerk).
- a normalized comfort cost function is the following piecewise function, where C_comf is the comfort cost value and jerk is the acceleration change of the ego vehicle or the game object:
- threMiddle is the jerk middle point threshold, which is 2 in the example shown in FIG. 8B
- threHigh is the jerk upper limit threshold, which is 4 in FIG. 8B
- C middle is the jerk cost slope. That is, the greater the acceleration change of the vehicle, the worse the comfort and the greater the comfort cost, and the comfort cost increases faster after the acceleration change of the vehicle is greater than the intermediate point threshold.
- the acceleration variation of the vehicle may be a longitudinal acceleration variation, a lateral acceleration variation, or a weighted sum of the two.
- the comfort cost may be the comfort cost of the self-vehicle, or the comfort cost of the game object, or the weighted sum of the comfort cost of the two.
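As with the safety cost, the comfort formula is not reproduced here, so the following is a hedged sketch of one shape that matches the description: the cost grows with slope C_middle up to threMiddle, then grows faster until it reaches 1 at threHigh. The default thresholds 2 and 4 are the values quoted for FIG. 8B; the slope 0.1 and the linear segments are assumptions.

```python
def comfort_cost(jerk, thre_middle=2.0, thre_high=4.0, c_middle=0.1):
    """Normalized comfort cost as a function of the absolute acceleration change."""
    j = abs(jerk)
    mid_cost = c_middle * thre_middle  # cost reached at the middle-point threshold
    if j <= thre_middle:
        return c_middle * j
    if j >= thre_high:
        return 1.0
    # Steeper assumed segment between the middle and upper thresholds.
    return mid_cost + (1.0 - mid_cost) * (j - thre_middle) / (thre_high - thre_middle)
```

With these assumed values the slope rises from 0.1 below the middle point to 0.4 above it, reflecting the statement that the cost increases faster once the jerk exceeds the middle-point threshold.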
- the passability cost value can be calculated according to a passability cost function with the speed change of the ego vehicle or the game object as the independent variable. For example, a vehicle that yields with a large deceleration incurs a large speed loss (the difference between the current speed and the future speed, that is, the acceleration) or a long waiting time, so its passability cost value increases. Conversely, a vehicle that rushes with a large acceleration gains a large speed increase (the difference between the current speed and the future speed, that is, the acceleration) or a short waiting time, so its passability cost value decreases.
- the passability cost value can also be calculated according to a passability cost function with the speed proportion of the ego vehicle and the game object as the independent variable. For example, suppose that before the behavior-action pair is executed, the absolute value of the ego vehicle's speed accounts for a large proportion of the sum of the absolute values of the speeds of the ego vehicle and the game object, and the absolute value of the game object's speed accounts for a small proportion of that sum. If, after executing the behavior-action pair, the ego vehicle yields with a large deceleration, its speed loss increases and its speed proportion decreases, so the passability cost value of the ego vehicle for that behavior-action pair is larger. If, after executing the behavior-action pair, the game object rushes with a large acceleration, its speed increases and its speed proportion increases, so the passability cost value of the game object for that behavior-action pair is smaller.
- the passability cost value may be the passability cost value of the ego vehicle executing the behavior-action pair, the passability cost value of the game object executing the behavior-action pair, or a weighted sum of the two passability cost values.
- the normalized passability cost function is the following piecewise function, where C_pass is the passability cost value, and speed is the absolute value of the speed of the vehicle:
- the absolute value of the vehicle's intermediate speed is speed0
- the maximum value of the vehicle's absolute speed is speed1
- C middle is the slope of the speed cost. That is, the greater the absolute value of the speed of the vehicle, the better the passability and the smaller the passability cost, and the passability cost decreases faster after the absolute value of the vehicle speed is greater than the intermediate point threshold.
- the right-of-way information of a vehicle can be determined according to the obtained user profile of the ego vehicle or the game object. For example, if the driving behavior of the game object is aggressive and more inclined to adopt a rushing decision, it has a high right of way; if the driving behavior of the game object is conservative and more inclined to adopt a yielding strategy, it has a low right of way. A vehicle with a high right of way tends to maintain its established motion state or driving behavior, and a vehicle with a low right of way tends to change its established motion state or driving behavior.
- the user profile can be determined according to the user's gender, age, or completion of historical actions.
- the data required for determining the user profile may be acquired by the cloud server, which may also determine the user profile. If the self-vehicle and/or the game object executes a behavior-action pair that causes the vehicle with a high right of way to change its motion state, the right-of-way cost value of that behavior-action pair is greater and its benefit is smaller.
- the penalty is increased by assigning a higher right-of-way cost value to behavior decisions that cause the vehicle with a high right of way to change its motion state. That is to say, through this feedback mechanism, a behavior-action pair that lets the high-right-of-way vehicle maintain its current motion state has a greater right-of-way benefit, that is, a smaller right-of-way cost value.
- the normalized right-of-way cost function is the following piecewise function, where C_roadRight is the right-of-way cost value, and acc is the absolute value of the acceleration of the vehicle:
- threHigh is the acceleration upper limit threshold, which is 1 in FIG. 8D. That is, the greater the acceleration of the vehicle, the greater the right-of-way cost value.
- the right-of-way cost function makes the behavior of the high-right-of-way vehicle maintain the current motion state have a small right-of-way cost value, thereby preventing the behavior of the high-right-of-way vehicle from changing the current motion state from becoming a decision result.
- the acceleration of the vehicle may be a longitudinal acceleration or a lateral acceleration. That is to say, in the dimension of lateral offset, a large lateral change will also make the cost of the right of way larger.
- the right-of-way cost value can be the right-of-way cost value of the self-vehicle executing the behavior-action pair, the right-of-way cost value of the game object executing the behavior-action pair, or a weighted sum of the two right-of-way cost values.
- the vehicle is in a risky area on the road (in this area, the vehicle has a greater driving risk and needs to leave the risky area as soon as possible)
- the area will not have a serious impact on traffic.
- the vehicle in the risk area of the road does not give way; that is, a behavior decision that causes the vehicle in the risk area to give way (such a decision has a large risk area cost value) is abandoned, and a decision result in which the vehicle in the risk area rushes so as to leave the risk area as soon as possible (with a small risk area cost value) is chosen, so that the vehicle is not stranded in the risk area with a serious impact on traffic.
- the risk area cost value can be the risk area cost value of the self-vehicle in the risk area of the road executing the behavior-action pair, the risk area cost value of the game object in the risk area of the road executing the behavior-action pair, or a weighted sum of the two risk area cost values.
- the lateral offset cost can be calculated according to the lateral offset of the ego vehicle or the game object.
- the normalized lateral offset cost function is the following piecewise function in the right half-space, where C_offset is the lateral offset cost value and offset is the lateral offset of the vehicle in meters; the expression for the left half-space can be obtained by mirroring the expression for the right half-space of the coordinate plane:
- threMiddle is the middle value of the lateral offset, for example the soft boundary of the road; C_middle is the first lateral offset cost slope; threHigh is the upper threshold of the lateral offset, for example the hard boundary of the road. That is, the larger the lateral offset of the vehicle, the smaller the lateral offset benefit and the greater the lateral offset cost value; once the lateral offset exceeds the middle value, the lateral offset cost value increases faster in order to increase the penalty. Once the lateral offset exceeds the upper threshold, the lateral offset cost value is fixed, for example at 1.2, to increase the penalty.
- the lateral offset cost value can be the lateral offset cost value of the ego vehicle executing the behavior-action pair, the lateral offset cost value of the game object executing the behavior-action pair, or a weighted sum of the two lateral offset cost values.
- the intention decision of the self-vehicle in the previous frame K is to rush ahead of the game object
- the corresponding inter-frame correlation cost value will be small, for example 0.3, while the default value is 0.5, so it is a reward.
- the intention decision of the ego vehicle in the current frame K+1 is to yield to the game object
- the corresponding inter-frame correlation cost value will be larger, for example 0.8, while the default value is 0.5, so it is a penalty.
- the strategy under which the intention decision of the ego vehicle in the current frame is to rush ahead of the game object is therefore selected as the feasible solution of the current frame.
- the inter-frame correlation cost value can be calculated according to the ego vehicle's intention decision in the previous frame and the ego vehicle's intention decision in the current frame, can also be calculated according to the intention decision of the game object in the previous frame and the intention decision of the game object in the current frame, or can be obtained by weighting the values for the ego vehicle and the game object.
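The inter-frame correlation cost can be sketched as a simple comparison of two consecutive intention decisions. The values 0.3 and 0.8 are the example values quoted above, relative to the default 0.5; the intention strings, function names, and the equal weighting of the self-vehicle and the game object are assumptions.

```python
def inter_frame_cost(prev_intention, curr_intention, reward=0.3, penalty=0.8):
    """Reward an intention that is kept between frames, penalize one that flips."""
    return reward if prev_intention == curr_intention else penalty


def weighted_inter_frame_cost(ego_pair, obj_pair, ego_weight=0.5):
    """Weighted combination for the self-vehicle and the game object (weights assumed)."""
    ego_cost = inter_frame_cost(*ego_pair)
    obj_cost = inter_frame_cost(*obj_pair)
    return ego_weight * ego_cost + (1.0 - ego_weight) * obj_cost


kept = inter_frame_cost("rush", "rush")      # intention unchanged from frame K to K+1
flipped = inter_frame_cost("rush", "yield")  # intention reversed between frames
```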
- steps S30 and/or S40 may also be included:
- Step S30 The own vehicle generates the longitudinal/lateral control amount according to the decision result, so that the driving system of the own vehicle executes the longitudinal/lateral control amount to realize the expected driving trajectory of the own vehicle.
- the control device of the own vehicle generates the longitudinal/lateral control amount according to the decision result and sends the longitudinal/lateral control amount to the driving system 13, so that the driving system 13 can execute the driving control of the vehicle, including power control, steering control, and braking control, and the vehicle follows the desired driving trajectory of the own vehicle according to the decision result.
- Step S40 Display the decision result on the display device in a manner understandable to the user.
- the decision result of ego vehicle driving includes the behavior of ego car.
- the intended decision-making of the own vehicle can be predicted, such as rushing, yielding or avoiding, and the expected driving trajectory of the own vehicle can also be predicted.
- the decision result is displayed on the display device in the cockpit of the vehicle in a manner understandable to the user, such as the expected driving trajectory, arrows indicating the decision-making, and words indicating the decision-making.
- when the desired driving trajectory is displayed, it may be combined with the vehicle's current traffic scene (such as a graphical traffic scene) and displayed on a display device in the vehicle cockpit in the form of a partially enlarged view.
- a voice playback system may also be included, which can prompt the user for the intention decision or policy label that has been decided by playing voice.
- considering the one-way interactive decision between the self-vehicle and a non-game object of the self-vehicle, or the one-way interactive decision between a game object of the self-vehicle and a non-game object of that game object, another embodiment, shown in FIG. 9, further includes the following step between the above step S10 and step S20:
- S15 Constrain the strategy space of the ego vehicle or the game object through the motion state of the non-game object.
- the range of values of the self-vehicle in each sampling dimension can be constrained with regard to the motion state of the non-game object of the own vehicle
- the range of values of the game objects of the own vehicle in each sampling dimension may be constrained with respect to the motion states of the non-game objects of the game objects of the own vehicle.
- the value range may be one or more sampling intervals on the sampling dimension, or may be a plurality of discrete sampling points.
- the value range after the constraint may be a partial value range.
- step S15 includes: for the one-way interaction process between the self-vehicle and its non-game object, determining the value range of the self-vehicle on each sampling dimension under the constraint of the motion state of the non-game object; or, for the one-way interaction process between the game object of the self-vehicle and the non-game object of that game object, determining the value range of the game object of the self-vehicle on each sampling dimension under the constraint of the motion state of the non-game object of that game object.
- step S20 is then performed, which helps narrow the game space and strategy space of the self-vehicle and its game object in the single-vehicle interactive game decision-making process and reduces the computing power used in that process.
- step S15 may include: first, receiving the motion state of the non-game object of the self-vehicle and observing the feature quantity of the non-game object; then, calculating the conflict area between the self-vehicle and the non-game object and determining the feature quantity of the self-vehicle, that is, the critical action.
- the feasible interval of the game object that is, the value range of the self-vehicle after being constrained by the non-game object in each sampling dimension.
- when the self-vehicle has a non-game object C, it is also possible to first process the interactive game decision between the self-vehicle A and its game object B and determine the corresponding strategy feasible region AB, then introduce the non-game feasible region AC of the self-vehicle A and the non-game object C, and then take the intersection of the strategy feasible region AB and the non-game feasible region AC to obtain the final strategy feasible region ABC, based on which the decision result for the driving of the self-vehicle is determined.
- the interactive game decision between the self-vehicle A and its game object B can also be processed first to determine the corresponding strategy feasible region AB; the non-game feasible region BD of the game object B and its non-game object D is then introduced, and the intersection of the strategy feasible region AB and the non-game feasible region BD is taken to obtain the final strategy feasible region ABD, based on which the decision result for the driving of the self-vehicle is determined.
- the interactive game decision between the own vehicle A and its game object B can also be processed first to determine the corresponding strategy feasible region AB; the non-game feasible region AC of the self-vehicle A and its non-game object C and the non-game feasible region BD of the game object B and its non-game object D are then introduced, and the intersection of the strategy feasible region AB, the non-game feasible region AC, and the non-game feasible region BD is taken to obtain the final strategy feasible region ABCD, based on which the decision result for the driving of the self-vehicle is determined.
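The intersections ABC, ABD, and ABCD can be sketched with plain sets. The representation is an assumption: AB is taken as a set of (self-vehicle action, game-object action) pairs, AC as the set of self-vehicle actions permitted by non-game object C, and BD as the set of game-object actions permitted by non-game object D.

```python
def final_feasible_region(ab, ac=None, bd=None):
    """Intersect the game feasible region AB with optional non-game regions AC and BD."""
    return {
        (ego_action, obj_action)
        for ego_action, obj_action in ab
        if (ac is None or ego_action in ac) and (bd is None or obj_action in bd)
    }


# Hypothetical longitudinal accelerations in m/s^2.
ab = {(1, -1), (0, -2), (-1, 1)}
ac = {1, 0}    # self-vehicle actions compatible with non-game object C
bd = {-1, 1}   # game-object actions compatible with non-game object D

abc = final_feasible_region(ab, ac=ac)
abd = final_feasible_region(ab, bd=bd)
abcd = final_feasible_region(ab, ac=ac, bd=bd)
```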
- the above specifically exemplifies the steps of determining the executable behavior action pairs of the self-vehicle and the game object by successively releasing the multiple strategy spaces of the self-vehicle and a single game object.
- the intelligent driving decision-making method provided by the embodiment of the present application includes:
- Step 1 From the multiple strategy spaces of the self-vehicle and the first game object, execute successive releases of each of the strategy spaces, and determine the feasible domain of the self-vehicle's driving strategy for the first game object. Determining this feasible domain is similar to the aforementioned step S20 and is not repeated here.
- Step 2 From the plurality of strategy spaces of the self-vehicle and the second game object, execute successive releases of each of the strategy spaces, and determine the feasible domain of the self-vehicle's driving strategy for the second game object.
- determining the feasible region of the self-vehicle driving strategy for the second game object is similar to the aforementioned step S20 and is not repeated here.
- Step 3 Determine the decision result of the driving of the self-vehicle according to the feasible domains of each strategy of the own vehicle and each of the game objects.
- the final policy feasible domain is obtained by intersecting each policy feasible domain, and then the decision result is determined from the policy feasible domain.
- the decision result may be the behavior-action pair with the least cost in the feasible domain of the strategy.
- the ego vehicle acquires the environment information outside the vehicle through the environment information acquisition device 11 .
- S120 The ego vehicle determines game objects and non-game objects.
- This step can be referred to the aforementioned step S12, and will not be repeated here.
- it is determined that one game object of the self-vehicle is a crossing game car and another game object is an opposing game car.
- S130 From the plurality of strategy spaces of the self-vehicle and the traversing game vehicle, perform sequential release of each of the strategy spaces, and determine the game result of the self-vehicle and the traversing game vehicle. Specifically, the following steps S131-S132 may be included:
- the longitudinal acceleration dimension of the self-vehicle and the crossing game car is released, and the first longitudinal sampling strategy space of the self-vehicle and the crossing game car is created.
- the value range of the longitudinal acceleration of the crossing game car is [-4,3], in m/s², where m means meter and s means second.
- the sampling interval of the own vehicle and the crossing game vehicle is determined to be 1 m/s².
- the spanned strategy space is shown in Table 1 when displayed in two-dimensional form.
- the first line of Table 1 lists all the values Ae of the longitudinal acceleration of the own car, and the first column lists all the values Ao1 of the longitudinal acceleration values of the crossing game car. That is to say, the longitudinal sampling strategy space of the self-vehicle and the crossing game car released this time includes 8 times 8, that is, 64 pairs of longitudinal acceleration behaviors of the self-vehicle and the crossing game car.
- S132 According to the pre-defined cost value determination methods, such as each cost function, calculate the cost value corresponding to each behavior action pair in the longitudinal sampling strategy space of the self-vehicle and the crossing game vehicle, and determine the feasible region of the strategy.
- after the self-vehicle and the crossing game car execute 3 plus 13, that is, 16 of the behavior-action pairs in the sub-traffic scene constructed by the self-vehicle and the crossing game car, the cost value determined from the safety cost value, comfort cost value, passability cost value, lateral offset cost value, right-of-way cost value, risk area cost value, and inter-frame correlation cost value is not greater than the preset cost threshold; these pairs are the feasible solutions in the strategy space and constitute the strategy feasible region of the self-vehicle and the crossing game car.
- the cost value of the horizontal offset is zero. Because it is the decision of the current frame and does not involve the decision result of the previous frame, the inter-frame correlation cost value is zero.
- the interactive game between the self-vehicle and the crossing game vehicle finds enough feasible solutions in the vertical sampling strategy space, and it is no longer necessary to continue to find solutions in the horizontal sampling dimension.
- the number of behavior-action pairs searched is 64, so the current round of the game consumes little computing power and little computing time.
- the behavior decision of the ego vehicle and the crossing game vehicle is: the ego vehicle accelerates, and the crossing game vehicle decelerates.
- after the ego vehicle and the crossing game vehicle perform any of these three sampled action pairs, the ego vehicle passes before the crossing game vehicle. Therefore, the intention decision corresponding to these action pairs is determined to be that the ego vehicle rushes ahead of the crossing game vehicle.
- set the ego-vehicle rushing decision label for these three action pairs, that is, "Cg" in Table 1.
- the behavior decision of the ego vehicle and the crossing game vehicle is: the crossing game vehicle accelerates, and the ego vehicle decelerates.
- based on the motion states of the ego vehicle and the crossing game vehicle acquired at the start of the decision-making in the current frame deduction, it can be deduced that after the ego vehicle and the crossing game vehicle perform any of the 13 sampled action pairs, the crossing game vehicle passes before the ego vehicle. Therefore, the intention decision corresponding to these action pairs is determined to be that the crossing game vehicle rushes ahead of the ego vehicle.
- set the corresponding decision label for these action pairs, that is, "Cy" in Table 1.
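The enumeration behind Table 1 and the filtering in S132 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the acceleration sample values, `toy_total_cost`, `toy_ego_first`, and the threshold are hypothetical stand-ins, and the comparison direction simply follows the wording above (a total cost value greater than the threshold counts as feasible).

```python
from itertools import product

# Hypothetical sample values in m/s^2; the text only says each vehicle has 8.
EGO_ACCELS = list(range(-4, 4))
CROSSING_ACCELS = list(range(-4, 4))


def longitudinal_space(ego_accels, other_accels):
    """Every (Ae, Ao1) longitudinal acceleration action pair."""
    return list(product(ego_accels, other_accels))


def feasible_region(space, total_cost, threshold):
    """Keep the pairs whose total cost value is greater than the threshold."""
    return [pair for pair in space if total_cost(pair) > threshold]


def intent_label(pair, ego_passes_first):
    """'Cg' if the ego vehicle passes first, otherwise 'Cy'."""
    return "Cg" if ego_passes_first(pair) else "Cy"


# Toy stand-ins for the real cost functions and motion deduction (assumptions).
def toy_total_cost(pair):
    a_ego, a_other = pair
    return abs(a_ego - a_other) / 7.0  # a larger acceleration gap scores higher


def toy_ego_first(pair):
    a_ego, a_other = pair
    return a_ego > a_other


space = longitudinal_space(EGO_ACCELS, CROSSING_ACCELS)
region = feasible_region(space, toy_total_cost, threshold=0.5)
labels = {pair: intent_label(pair, toy_ego_first) for pair in region}
```

With real cost functions the feasible region would generally be asymmetric, as in the 3 "Cg" and 13 "Cy" pairs described above; the toy functions here only exercise the control flow.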
- S140 From the plurality of strategy spaces of the self-vehicle and the opposing game vehicle, execute successive releases of each of the strategy spaces, and determine the game result of the self-vehicle and the opposing game vehicle. Specifically, the following steps S141-S144 may be included:
- S141 According to the principle of releasing the longitudinal sampling dimension first and the lateral sampling dimension second, release the longitudinal sampling dimension of the ego vehicle and the opposing game vehicle, and span the first longitudinal sampling strategy space of the ego vehicle and the opposing game vehicle.
- the spanned strategy space is displayed in two-dimensional form, as shown in Table 2.
- the first row of Table 2 lists all values Ae of the longitudinal acceleration of the ego vehicle; the first column lists all values Ao2 of the longitudinal acceleration of the opposing game vehicle. That is, the longitudinal sampling strategy space of the ego vehicle and the opposing game vehicle released this time contains 8 × 8 = 64 longitudinal acceleration action pairs of the two vehicles.
- S142 According to the predefined cost value determination methods, such as the cost functions, calculate the cost value corresponding to each action pair in the longitudinal sampling strategy space of the ego vehicle and the opposing game vehicle, and determine the strategy feasible region.
- the safety cost value or the passability cost value is greater than the preset cost threshold; there is no feasible solution in the strategy space released for the first time, and the strategy feasible region of the ego vehicle and the opposing game vehicle is empty.
- S143 Release the lateral offset dimension of the own vehicle, and form a second strategy space of the own vehicle and the opposing game vehicle with the longitudinal acceleration dimension of the own vehicle and the opposing game vehicle.
- release part of the values of the self-vehicle on the lateral offset dimension and part of the values of the self-vehicle and the opposing game car on the longitudinal acceleration dimension and create a second strategy space for the ego-vehicle and the opposing game car.
- Figure 10 shows a schematic diagram of the lateral sampling actions determined for the lateral offset sampling for the two vehicles, respectively. That is, the multiple lateral deviation behaviors correspond to multiple mutually parallel lateral deviation trajectories that the vehicle can execute.
- the first row of the upper sub-table of Table 3 lists all values Oe of the lateral offset of the ego vehicle, and the first column lists all values Oo2 of the lateral offset of the opposing game vehicle. Therefore, the lateral sampling strategy space formed by the ego vehicle and the opposing game vehicle in the lateral offset dimension contains at most 7 × 7 = 49 lateral offset action pairs of the two vehicles.
- according to the cost value determination methods, such as the cost functions, calculate the cost value corresponding to each action pair in the strategy space released the second time, which is formed by some values of the ego vehicle in the lateral offset dimension and some values of the ego vehicle and the opposing game vehicle in the longitudinal acceleration dimension, and determine the strategy feasible region.
- when the lateral offset value of the ego vehicle is 1, among the 64 released longitudinal acceleration action pairs of the ego vehicle and the opposing game vehicle, there are 48 for which, after the ego vehicle and the opposing game vehicle perform the sampled actions, in the sub-traffic scene formed by the ego vehicle and the opposing game vehicle, the weighted sum of the safety, comfort, passability, lateral offset, right-of-way, risk area, and inter-frame correlation cost values is greater than the preset cost threshold; these are the feasible solutions in the strategy space and constitute the strategy feasible region of the ego vehicle and the opposing game vehicle.
- these 48 action pairs are identified with the label "1".
- the inter-frame correlation cost value is zero.
- the label of the corresponding action pair in the sub-table of Table 3 is adjusted from "-1" to "0" when the longitudinal accelerations of the ego vehicle and the opposing game vehicle are both -1. This is because, when the ego vehicle shifts laterally to the right by 1 m, the ego vehicle and the opposing game vehicle no longer have a risk of collision. However, passability is still too poor (because of the braking), so the pair remains an infeasible solution, and only its label changes from "-1" to "0".
- an intention decision can also be determined for the ego vehicle and the opposing game vehicle, and a strategy label can be set; refer to step S132, and details are not repeated here.
- in the release of the strategy space of the ego vehicle and the opposing game vehicle described above, multiple sampling values can also be selected in the lateral offset dimension of the ego vehicle, for example a lateral offset value of 2 or 3, which together with the longitudinal acceleration sampling strategy space of the ego vehicle and the opposing game vehicle span further strategy spaces.
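The successive release in S141 to S143 can be sketched as a loop that first searches the longitudinal pairs with no lateral offset and only then adds one ego lateral offset value at a time while the feasible region is still empty. This is a hedged sketch: the function name, the tuple layout `(a_ego, offset, a_other)`, the sample values, and `toy_total_cost` are assumptions made for illustration.

```python
from itertools import product


def release_until_feasible(ego_accels, other_accels, extra_offsets,
                           total_cost, threshold):
    """Release strategy spaces step by step and stop at the first non-empty
    feasible region. Returns (feasible actions, number of actions searched)."""
    searched = 0
    # First release: longitudinal dimension only (ego lateral offset 0).
    # Later releases: one additional ego lateral offset value each.
    for offset in [0.0] + list(extra_offsets):
        feasible = []
        for a_ego, a_other in product(ego_accels, other_accels):
            searched += 1
            action = (a_ego, offset, a_other)
            if total_cost(action) > threshold:
                feasible.append(action)
        if feasible:
            return feasible, searched
    return [], searched


# Toy cost (assumption): nothing is feasible without a lateral shift of
# at least 1 m, and only non-accelerating ego actions pass afterwards.
def toy_total_cost(action):
    a_ego, offset, _a_other = action
    if offset < 1.0:
        return 0.0
    return 1.0 if a_ego <= 0 else 0.0


accels = list(range(-4, 4))  # hypothetical 8 sample values in m/s^2
feasible, searched = release_until_feasible(
    accels, accels, [1.0, 2.0, 3.0], toy_total_cost, threshold=0.5)
```

Stopping at the first non-empty feasible region is what keeps the search small: the offsets 2 and 3 are never expanded in this toy run.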
- S150 Find the intersection of the strategy feasible region of the ego vehicle and the opposing game vehicle with the strategy feasible region of the ego vehicle and the crossing game vehicle, and determine the game result of the ego vehicle.
- Table 4 shows the feasible solution with the minimum cost value (that is, the best benefit) found in the common feasible region of the strategy feasible region of the ego vehicle and the opposing game vehicle in Table 3 and the strategy feasible region of the ego vehicle and the crossing game vehicle in Table 1.
- the feasible solution is the game decision-making action pair of the self-vehicle, the opposing game vehicle and the crossing game vehicle.
- a multi-action action pair formed by the combination of longitudinal acceleration.
- the ego vehicle decelerates to yield with a longitudinal acceleration of -2 m/s², and shifts laterally to the right by 1 m to avoid the opposing game vehicle; to ensure passability, the crossing game vehicle accelerates through the conflict area with a longitudinal acceleration of 1 m/s²; the opposing game vehicle accelerates through the conflict area with a longitudinal acceleration of 1 m/s².
- the intention decisions are respectively: the crossing game vehicle rushes ahead of the ego vehicle, the opposing game vehicle rushes ahead of the ego vehicle, the ego vehicle avoids the opposing game vehicle laterally to the right, and the ego vehicle yields to the crossing game vehicle.
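The intersection in S150 can be illustrated with a small sketch. It assumes, hypothetically, that each feasible region is a dictionary from an action tuple to a cost value, that the two regions are matched on the ego vehicle's longitudinal acceleration, and that the joint solution with the smallest summed cost is chosen, following the "minimum cost value" wording above. None of these data structures come from the application.

```python
def joint_decision(region_opposing, region_crossing):
    """region_opposing: {(a_ego, ego_offset, a_opposing): cost}
    region_crossing: {(a_ego, a_crossing): cost}
    Returns (a_ego, ego_offset, a_crossing, a_opposing) or None."""
    # Ego longitudinal accelerations that are feasible against both objects.
    common = ({a for (a, _, _) in region_opposing}
              & {a for (a, _) in region_crossing})
    candidates = []
    for (a_ego, offset, a_opp), c_opp in region_opposing.items():
        if a_ego not in common:
            continue
        for (a_ego2, a_cross), c_cross in region_crossing.items():
            if a_ego2 == a_ego:
                candidates.append(
                    (c_opp + c_cross, (a_ego, offset, a_cross, a_opp)))
    # Smallest summed cost wins; None signals an empty intersection.
    return min(candidates)[1] if candidates else None


# Hypothetical cost values, chosen only to reproduce the shape of Table 4.
region_opposing = {(-2, 1.0, 1): 0.2, (-1, 1.0, 1): 0.5, (3, 1.0, -2): 0.1}
region_crossing = {(-2, 1): 0.3, (-1, 2): 0.4}
decision = joint_decision(region_opposing, region_crossing)
```

In this toy data the selected tuple corresponds to the ego vehicle decelerating at -2 m/s² with a 1 m lateral shift while both game vehicles accelerate at 1 m/s².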
- S160 From the game results of the ego vehicle, select the decision result: select the action pair with the smallest cost value and determine from it the action executable by the ego vehicle, which can be used to control the ego vehicle to execute that action.
- an action pair may be selected according to the cost value as the decision result.
- continuous multi-frame deduction can further be carried out for each solution, that is, the time sampling dimension is released, so that action pairs with good consistency in the time dimension are selected as the decision result of ego vehicle driving.
- the present application also provides an embodiment of a corresponding intelligent driving decision-making device.
- the intelligent driving decision-making device 100 includes:
- the obtaining module 110 is used to obtain the game object of the ego vehicle. Specifically, it is used to execute the above step S10 or steps S110-S120, or the optional embodiments corresponding to these steps.
- the processing module 120 is used to perform multiple releases of the multiple strategy spaces of the ego vehicle and the game object; after one of the multiple releases is executed, it determines the strategy feasible region of the ego vehicle and the game object according to each strategy space already released, and determines the decision result of ego vehicle driving according to the strategy feasible region. Specifically, it is used to execute the above steps S20-S40, or the optional embodiments corresponding to these steps.
- the dimensions of the plurality of strategy spaces include at least one of the following: longitudinal sampling dimension, lateral sampling dimension, or time sampling dimension.
- performing multiple releases of the multiple strategy spaces includes performing the releases in the following order of dimensions: longitudinal sampling dimension, lateral sampling dimension, and time sampling dimension.
- the total cost value of an action pair in the strategy feasible region is determined according to one or more of the following: the safety cost value, right-of-way cost value, lateral offset cost value, passability cost value, comfort cost value, inter-frame correlation cost value, and risk area cost value of the ego vehicle or the game object.
- each cost value has a different weight.
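A weighted total cost of this kind can be sketched as follows. The seven term names mirror the list above, but the numeric weights are purely hypothetical placeholders; the text does not give them, nor does it say that they sum to 1.

```python
import math

# Hypothetical weights, one per cost term named in the text (assumption).
WEIGHTS = {
    "safety": 0.30,
    "right_of_way": 0.20,
    "lateral_offset": 0.05,
    "passability": 0.15,
    "comfort": 0.10,
    "inter_frame": 0.08,
    "risk_area": 0.12,
}


def total_cost(costs, weights=WEIGHTS):
    """Weighted sum of the individual cost values of one action pair.
    Terms missing from `costs` are treated as zero."""
    return sum(w * costs.get(name, 0.0) for name, w in weights.items())


# Example: a first-frame, longitudinal-only action pair, for which the
# lateral offset and inter-frame correlation cost values are zero.
example = {"safety": 1.0, "right_of_way": 0.5, "passability": 0.8,
           "comfort": 0.6, "risk_area": 1.0,
           "lateral_offset": 0.0, "inter_frame": 0.0}
example_total = total_cost(example)
```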
- the decision result of driving the own vehicle is determined according to the feasible domains of each strategy of the own vehicle and each game object.
- the obtaining module 110 is also used to obtain a non-game object of the ego vehicle; the processing module 120 is also used to determine the strategy feasible region of the ego vehicle and the non-game object, which includes the behavior actions executable by the ego vehicle relative to the non-game object, and to determine the decision result of ego vehicle driving at least according to the strategy feasible region of the ego vehicle and the non-game object.
- the processing module 120 is also used to determine the strategy feasible region of the decision result of ego vehicle driving according to the intersection of the strategy feasible regions of the ego vehicle and each game object, or according to the intersection of the strategy feasible regions of the ego vehicle and each game object and the strategy feasible regions of the ego vehicle and each non-game object.
- the obtaining module 110 is also used to obtain a non-game object of the ego vehicle; the processing module 120 is also used to constrain, according to the motion state of the non-game object, the longitudinal sampling strategy space corresponding to the ego vehicle, or the lateral sampling strategy space corresponding to the ego vehicle.
- the obtaining module 110 is also used to obtain a non-game object of the game object of the ego vehicle; the processing module 120 is also used to constrain, according to the motion state of the non-game object, the longitudinal sampling strategy space corresponding to the game object of the ego vehicle, or the lateral sampling strategy space corresponding to the game object of the ego vehicle.
- when the intersection is an empty set, a conservative decision for ego vehicle driving is performed; the conservative decision includes an action of making the ego vehicle stop safely, or an action of making the ego vehicle decelerate safely.
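The fallback to a conservative decision can be sketched as below. The representation of a feasible region as a set of ego actions, the `pick_best` callback, and the returned labels are hypothetical; the sketch only shows the branch taken when the intersection is empty.

```python
def decide(feasible_regions, pick_best):
    """feasible_regions: non-empty list of collections of ego actions, one per
    (ego vehicle, object) pair. pick_best: chooses one action from the
    common feasible set."""
    regions = [set(region) for region in feasible_regions]
    common = set.intersection(*regions)
    if not common:
        # Empty intersection: fall back to safe deceleration or a safe stop.
        return ("conservative", "decelerate_or_stop")
    return ("game", pick_best(common))


# Toy ego actions: longitudinal accelerations in m/s^2 (assumption).
normal = decide([{-2, -1}, {-2, 0}], pick_best=min)
fallback = decide([{-2}, {1}], pick_best=min)
```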
- the game object or non-game object is determined according to the attention method.
- the processing module 120 is further configured to display at least one of the following through the human-computer interaction interface: the decision result of the driving of the own vehicle, the policy feasible region of the decision result, and the driving trajectory of the own vehicle corresponding to the decision result of the driving of the own vehicle , or the driving trajectory of the game object corresponding to the decision result of ego vehicle driving.
- the decision result of ego vehicle driving can be the decision result of the current single-frame deduction, or it can be the decision result corresponding to multiple single-frame deduction that has already been executed.
- the decision result can be a behavior action executable by the ego vehicle, a behavior action executable by the game object, or the intention decision corresponding to the ego vehicle's execution of the behavior action, such as Cg or Cy in Table 1, for example rushing, giving way, or avoiding.
- the strategy feasible region of the decision result may be the strategy feasible region of the current single-frame deduction, or the strategy feasible regions corresponding to the multiple single-frame deductions already executed.
- the ego vehicle trajectory corresponding to the decision result of ego vehicle driving can be the ego vehicle trajectory corresponding to the first single-frame deduction in one-step decision-making, such as T1 in Figure 7, or the ego vehicle trajectory formed by sequentially connecting the multiple single-frame deductions already executed in one-step decision-making, such as T1, T2, and Tn in Figure 7.
- the driving trajectory of the game object corresponding to the decision result of ego vehicle driving can be the driving trajectory of the game object corresponding to the first single-frame deduction in one-step decision-making, such as T1 in Figure 7, or the driving trajectory of the game object formed by sequentially connecting the multiple single-frame deductions already executed, such as T1, T2, and Tn in Figure 7.
- the embodiment of the present application also provides a vehicle driving control method, including:
- S220 According to the obstacle information, determine the decision result of vehicle driving according to any one of the above intelligent driving decision-making methods.
- the embodiment of the present application also provides a vehicle driving control device 200, including: an acquisition module 210, used to acquire obstacles outside the vehicle; and a processing module 220, used to determine, for the obstacles, the decision result of vehicle driving according to any one of the above intelligent driving decision-making methods; the processing module is also used to control the driving of the vehicle according to the decision result.
- the embodiment of the present application also provides a vehicle 300 , including: the vehicle travel control device 200 described above, and a travel system 250 ; the vehicle travel control device 200 controls the travel system 250 .
- the driving system 250 may include the aforementioned driving system 13 in FIG. 2 .
- FIG. 16 is a schematic structural diagram of a computing device 400 provided by an embodiment of the present application.
- the computing device 400 includes: a processor 410 , a memory 420 , and may also include a communication interface 430 .
- the communication interface 430 in the computing device 400 shown in FIG. 16 can be used to communicate with other devices.
- the processor 410 may be connected to the memory 420 .
- the memory 420 can be used to store the program code and data. Accordingly, the memory 420 may be a storage unit inside the processor 410, an external storage unit independent of the processor 410, or a component that includes both a storage unit inside the processor 410 and an external storage unit independent of the processor 410.
- computing device 400 may also include a bus.
- the memory 420 and the communication interface 430 may be connected to the processor 410 through a bus.
- the bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
- the bus can be divided into address bus, data bus, control bus and so on.
- the processor 410 may be a central processing unit (CPU).
- the processor can also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
- a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
- the processor 410 adopts one or more integrated circuits for executing related programs, so as to implement the technical solutions provided by the embodiments of the present application.
- the memory 420 may include read-only memory and random-access memory, and provides instructions and data to the processor 410 .
- a portion of processor 410 may also include non-volatile random access memory.
- processor 410 may also store device type information.
- the processor 410 executes computer-implemented instructions in the memory 420 to perform the operation steps of the above method.
- the computing device 400 may correspond to the subject that performs the methods according to the embodiments of the present application, and the above and other operations and/or functions of the modules in the computing device 400 are intended to implement the corresponding processes of the methods in the embodiments; for brevity, they are not repeated here.
- the disclosed systems, devices and methods may be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the units is only a logical function division. In actual implementation, there may be other division methods.
- multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
- the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
- if the functions described above are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
- the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present application.
- the aforementioned storage media include a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.
- the embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program performs the above method, and the method includes at least one of the solutions described in the above embodiments.
- the computer storage medium in the embodiments of the present application may use any combination of one or more computer-readable media.
- the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
- a computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
- a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
- a computer readable signal medium may include a data signal carrying computer readable program code in baseband or as part of a carrier wave. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
- Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
- Computer program code for performing the operations of the present application may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as "C" or similar programming languages.
- the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
- the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
Landscapes
- Engineering & Computer Science (AREA)
- Automation & Control Theory (AREA)
- Transportation (AREA)
- Mechanical Engineering (AREA)
- Human Computer Interaction (AREA)
- Traffic Control Systems (AREA)
Abstract
Description
Claims (31)
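- An intelligent driving decision-making method, characterized by comprising: obtaining a game object of an ego vehicle; and performing, from a plurality of strategy spaces of the ego vehicle and the game object, multiple releases of the plurality of strategy spaces, wherein after one of the multiple releases is performed, a strategy feasible region of the ego vehicle and the game object is determined according to each strategy space already released, and a decision result of ego vehicle driving is determined according to the strategy feasible region.
- The method according to claim 1, characterized in that the dimensions of the plurality of strategy spaces include at least one of the following: a longitudinal sampling dimension, a lateral sampling dimension, or a time sampling dimension.
- The method according to claim 2, characterized in that performing the multiple releases of the plurality of strategy spaces includes performing the releases in the following order of dimensions: longitudinal sampling dimension, lateral sampling dimension, time sampling dimension.
- The method according to any one of claims 1-3, characterized in that, when the strategy feasible region of the ego vehicle and the game object is determined, the total cost value of an action pair in the strategy feasible region is determined according to one or more of the following: a safety cost value, a right-of-way cost value, a lateral offset cost value, a passability cost value, a comfort cost value, an inter-frame correlation cost value, or a risk area cost value of the ego vehicle or the game object.
- The method according to claim 4, characterized in that, when the total cost value of the action pair is determined according to two or more cost values, each of the cost values has a different weight.
- The method according to claim 1, characterized in that, when there are two or more game objects, the decision result of ego vehicle driving is determined according to the strategy feasible regions of the ego vehicle and each of the game objects.
- The method according to any one of claims 1-6, characterized by further comprising: obtaining a non-game object of the ego vehicle; determining a strategy feasible region of the ego vehicle and the non-game object; and determining the decision result of ego vehicle driving at least according to the strategy feasible region of the ego vehicle and the non-game object.
- The method according to claim 6 or 7, characterized in that the strategy feasible region of the decision result of ego vehicle driving is determined according to the intersection of the strategy feasible regions of the ego vehicle and each of the game objects, or according to the intersection of the strategy feasible regions of the ego vehicle and each of the game objects and the strategy feasible regions of the ego vehicle and each of the non-game objects.
- The method according to any one of claims 2-8, characterized by further comprising: obtaining a non-game object of the ego vehicle; and, according to the motion state of the non-game object, constraining the longitudinal sampling strategy space corresponding to the ego vehicle, or constraining the lateral sampling strategy space corresponding to the ego vehicle.
- The method according to any one of claims 2-8, characterized by further comprising: obtaining a non-game object of the game object of the ego vehicle; and, according to the motion state of the non-game object, constraining the longitudinal sampling strategy space corresponding to the game object of the ego vehicle, or constraining the lateral sampling strategy space corresponding to the game object of the ego vehicle.
- The method according to claim 8, characterized in that, when the intersection is an empty set, a conservative decision for ego vehicle driving is performed, the conservative decision including an action of making the ego vehicle stop safely, or an action of making the ego vehicle decelerate safely.
- The method according to claim 1, characterized in that the game object or the non-game object is determined in an attention-based manner.
- The method according to any one of claims 1-12, characterized by further comprising: displaying, through a human-computer interaction interface, at least one of the following: the decision result of ego vehicle driving, the strategy feasible region of the decision result, the ego vehicle trajectory corresponding to the decision result of ego vehicle driving, or the driving trajectory of the game object corresponding to the decision result of ego vehicle driving.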
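- An intelligent driving decision-making device, characterized by comprising: an obtaining module, used to obtain a game object of an ego vehicle; and a processing module, used to perform, from a plurality of strategy spaces of the ego vehicle and the game object, multiple releases of the plurality of strategy spaces, wherein after one of the multiple releases is performed, a strategy feasible region of the ego vehicle and the game object is determined according to each strategy space already released, and a decision result of ego vehicle driving is determined according to the strategy feasible region.
- The device according to claim 14, characterized in that the dimensions of the plurality of strategy spaces include at least one of the following: a longitudinal sampling dimension, a lateral sampling dimension, or a time sampling dimension.
- The device according to claim 15, characterized in that performing the multiple releases of the plurality of strategy spaces includes performing the releases in the following order of dimensions: longitudinal sampling dimension, lateral sampling dimension, time sampling dimension.
- The device according to any one of claims 14-16, characterized in that, when the strategy feasible region of the ego vehicle and the game object is determined, the total cost value of an action pair in the strategy feasible region is determined according to one or more of the following: a safety cost value, a right-of-way cost value, a lateral offset cost value, a passability cost value, a comfort cost value, an inter-frame correlation cost value, or a risk area cost value of the ego vehicle or the game object.
- The device according to claim 17, characterized in that, when the total cost value of the action pair is determined according to two or more cost values, each of the cost values has a different weight.
- The device according to claim 14, characterized in that, when there are two or more game objects, the decision result of ego vehicle driving is determined according to the strategy feasible regions of the ego vehicle and each of the game objects.
- The device according to any one of claims 14-19, characterized in that the obtaining module is further used to obtain a non-game object of the ego vehicle; and the processing module is further used to determine a strategy feasible region of the ego vehicle and the non-game object, and to determine the decision result of ego vehicle driving at least according to the strategy feasible region of the ego vehicle and the non-game object.
- The device according to claim 19 or 20, characterized in that the processing module is further used to: determine the strategy feasible region of the decision result of ego vehicle driving according to the intersection of the strategy feasible regions of the ego vehicle and each of the game objects, or according to the intersection of the strategy feasible regions of the ego vehicle and each of the game objects and the strategy feasible regions of the ego vehicle and each of the non-game objects.
- The device according to any one of claims 15-21, characterized in that the obtaining module is further used to obtain a non-game object of the ego vehicle; and the processing module is further used to constrain, according to the motion state of the non-game object, the longitudinal sampling strategy space corresponding to the ego vehicle, or the lateral sampling strategy space corresponding to the ego vehicle.
- The device according to any one of claims 15-21, characterized in that the obtaining module is further used to obtain a non-game object of the game object of the ego vehicle; and the processing module is further used to constrain, according to the motion state of the non-game object, the longitudinal sampling strategy space corresponding to the game object of the ego vehicle, or the lateral sampling strategy space corresponding to the game object of the ego vehicle.
- The device according to claim 21, characterized in that, when the intersection is an empty set, a conservative decision for ego vehicle driving is performed, the conservative decision including an action of making the ego vehicle stop safely, or an action of making the ego vehicle decelerate safely.
- The device according to claim 14, characterized in that the game object or the non-game object is determined in an attention-based manner.
- The device according to claim 14, characterized in that the processing module is further used to display, through a human-computer interaction interface, at least one of the following: the decision result of ego vehicle driving, the strategy feasible region of the decision result, the ego vehicle trajectory corresponding to the decision result of ego vehicle driving, or the driving trajectory of the game object corresponding to the decision result of ego vehicle driving.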
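- A vehicle driving control method, characterized by comprising: acquiring an obstacle outside the vehicle; determining, for the obstacle, a decision result of vehicle driving according to the method of any one of claims 1-13; and controlling the driving of the vehicle according to the decision result.
- A vehicle driving control device, characterized by comprising: an acquisition module, used to acquire an obstacle outside the vehicle; and a processing module, used to determine, for the obstacle, a decision result of vehicle driving according to the method of any one of claims 1-13; the processing module is further used to control the driving of the vehicle according to the decision result.
- A vehicle, characterized by comprising: the vehicle driving control device according to claim 28, and a driving system; the vehicle driving control device controls the driving system.
- A computing device, characterized by comprising: a processor, and a memory on which program instructions are stored, wherein the program instructions, when executed by the processor, cause the processor to implement the intelligent driving decision-making method of any one of claims 1-13, or the program instructions, when executed by the processor, cause the processor to implement the vehicle driving control method of claim 27.
- A computer-readable storage medium, characterized in that program instructions are stored thereon, wherein the program instructions, when executed by a processor, cause the processor to implement the intelligent driving decision-making method of any one of claims 1-13, or the program instructions, when executed by a processor, cause the processor to implement the vehicle driving control method of claim 27.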
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21951305.8A EP4360976A1 (en) | 2021-07-29 | 2021-07-29 | Method for intelligent driving decision-making, vehicle movement control method, apparatus, and vehicle |
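CN202180008224.5A CN115943354A (zh) | 2021-07-29 | 2021-07-29 | Intelligent driving decision-making method, vehicle traveling control method and apparatus, and vehicle |
PCT/CN2021/109331 WO2023004698A1 (zh) | 2021-07-29 | 2021-07-29 | Intelligent driving decision-making method, vehicle traveling control method and apparatus, and vehicle |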
US18/424,238 US20240166242A1 (en) | 2021-07-29 | 2024-01-26 | Intelligent driving decision-making method, vehicle traveling control method and apparatus, and vehicle |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/109331 WO2023004698A1 (zh) | 2021-07-29 | 2021-07-29 | Intelligent driving decision-making method, vehicle traveling control method and apparatus, and vehicle |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/424,238 Continuation US20240166242A1 (en) | 2021-07-29 | 2024-01-26 | Intelligent driving decision-making method, vehicle traveling control method and apparatus, and vehicle |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023004698A1 (zh) | 2023-02-02 |
Family
ID=85085998
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/109331 WO2023004698A1 (zh) | Intelligent driving decision-making method, vehicle traveling control method and apparatus, and vehicle | 2021-07-29 | 2021-07-29 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240166242A1 (zh) |
EP (1) | EP4360976A1 (zh) |
CN (1) | CN115943354A (zh) |
WO (1) | WO2023004698A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116503025A (zh) * | 2023-06-25 | 2023-07-28 | 深圳高新区信息网有限公司 | Business work order flow processing method based on a workflow engine |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160144838A1 (en) * | 2013-06-14 | 2016-05-26 | Valeo Schalter Und Sensoren Gmbh | Method and device for carrying out collision-avoiding measures |
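CN108595823A (zh) * | 2018-04-20 | 2018-09-28 | 大连理工大学 | Method for calculating the lane-change strategy of an autonomous vehicle combining driving style and game theory |
EP3539837A1 (en) * | 2018-03-13 | 2019-09-18 | Veoneer Sweden AB | A vehicle radar system for detecting preceding vehicles |
CN110362910A (zh) * | 2019-07-05 | 2019-10-22 | 西南交通大学 | Method for establishing a game-theory-based lane-change conflict coordination model for autonomous vehicles |
CN111267846A (zh) * | 2020-02-11 | 2020-06-12 | 南京航空航天大学 | Game-theory-based method for predicting the interaction behavior of surrounding vehicles |
CN112907967A (zh) * | 2021-01-29 | 2021-06-04 | 吉林大学 | Intelligent vehicle lane-change decision-making method based on an incomplete-information game |
CN113160547A (zh) * | 2020-01-22 | 2021-07-23 | 华为技术有限公司 | Autonomous driving method and related device |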
-
2021
- 2021-07-29 CN CN202180008224.5A patent/CN115943354A/zh active Pending
- 2021-07-29 EP EP21951305.8A patent/EP4360976A1/en active Pending
- 2021-07-29 WO PCT/CN2021/109331 patent/WO2023004698A1/zh active Application Filing
-
2024
- 2024-01-26 US US18/424,238 patent/US20240166242A1/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116503025A (zh) * | 2023-06-25 | 2023-07-28 | 深圳高新区信息网有限公司 | Business work order flow processing method based on a workflow engine |
CN116503025B (zh) * | 2023-06-25 | 2023-09-19 | 深圳高新区信息网有限公司 | Business work order flow processing method based on a workflow engine |
Also Published As
Publication number | Publication date |
---|---|
US20240166242A1 (en) | 2024-05-23 |
EP4360976A1 (en) | 2024-05-01 |
CN115943354A (zh) | 2023-04-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
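US11932284B2 (en) | | Trajectory setting device and trajectory setting method |
CN110488802B (zh) | | Dynamic behavior decision-making method for autonomous vehicles in a connected environment |
CN111081065B (zh) | | Cooperative lane-change decision model for intelligent vehicles under mixed road-section traffic conditions |
US11364902B2 (en) | | Testing predictions for autonomous vehicles |
CN108269424B (zh) | | System and method for vehicle congestion estimation |
CN110325823B (zh) | | Rule-based navigation |
CN113071520B (zh) | | Vehicle driving control method and apparatus |
CN110068346A (zh) | | System and method for unprotected maneuver mitigation in autonomous vehicles |
EP4316935A1 (en) | | Method and apparatus for obtaining lane change area |
CN114074681A (zh) | | Probability-based lane change decision and motion planning system and method thereof |
US20230037767A1 (en) | | Behavior planning for autonomous vehicles in yield scenarios |
US20240166242A1 (en) | | Intelligent driving decision-making method, vehicle traveling control method and apparatus, and vehicle |
CN112689024B (zh) | | Vehicle-road cooperative truck platoon lane-change method, apparatus, and system |
US20210078608A1 (en) | | System and method for providing adaptive trust calibration in driving automation |
WO2023087157A1 (zh) | | Intelligent driving method and vehicle applying the method |
CN116142194A (zh) | | Human-like lane-change decision-making method |
CN115810291A (zh) | | Associated target identification method and apparatus, roadside device, and vehicle |
WO2023168630A1 (zh) | | Vehicle control method and related apparatus |
Qin et al. | | Two-lane multipoint overtaking decision model based on vehicle network |
WO2024140381A1 (zh) | | Automatic parking method and apparatus, and intelligent driving device |
US20230278562A1 (en) | | Method to arbitrate multiple automatic lane change requests in proximity to route splits |
CN113954871B (zh) | | Method, apparatus, and medium for testing predictions for autonomous vehicles |
CN115402321A (zh) | | Lane-change strategy determination method, system, electronic device, and storage medium |
CN117760453A (zh) | | Route planning method and apparatus for vehicle merging, and computer device |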
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21951305; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2024504975; Country of ref document: JP; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: 2021951305; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2021951305; Country of ref document: EP; Effective date: 20240126 |
| | NENP | Non-entry into the national phase | Ref country code: DE |