WO2024109763A1 - An autonomous driving method and system based on scene-adaptive recognition - Google Patents

An autonomous driving method and system based on scene-adaptive recognition Download PDF

Info

Publication number
WO2024109763A1
Authority
WO
WIPO (PCT)
Prior art keywords
scene
module
path planning
abnormality
operation information
Prior art date
Application number
PCT/CN2023/133059
Other languages
English (en)
French (fr)
Inventor
黄乐雄
王帅
韩瑞华
王洋
叶可江
须成忠
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院 filed Critical 中国科学院深圳先进技术研究院
Publication of WO2024109763A1 publication Critical patent/WO2024109763A1/zh

Links

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W60/00Drive control systems specially adapted for autonomous road vehicles
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/34Route searching; Route guidance
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • the present invention belongs to the field of information technology, and in particular relates to an automatic driving method and system based on scene adaptive recognition.
  • the commonly used imitation learning algorithm currently uses neural networks to learn the input-output pairs in the data set, and continuously optimizes the neuron parameters to approximate the characteristics and logic of the data set. Finally, the neural network can obtain a logical output based on the input.
  • imitation learning is heavily dependent on the data set, and the decision confidence for scenes that do not appear in the data set is insufficient. Therefore, imitation learning is suitable for single scenes with similar characteristics.
  • the traditional path planning method is to calculate a collision-free optimal trajectory based on the starting position, the target position, and the environmental information through mathematical reasoning, and then approximate the position at the next moment based on the trajectory and the current position, and use this as the basis for calculating the dynamic parameters.
  • path planning is time-consuming in some scenarios and may be difficult to solve, which cannot meet the real-time requirements of autonomous driving.
  • the purpose of the embodiments of this specification is to provide an autonomous driving method and system based on scene adaptive recognition.
  • the present application provides an autonomous driving method based on scene adaptive recognition, the method comprising:
  • the path planning module determines a path planning trajectory for the current driving scenario based on the operation information at the previous moment;
  • the imitation learning module uses the neural network to derive the uncertainty distribution of the decision based on the environmental information
  • the decision module determines the autonomous driving method based on the complexity and abnormality of the scene.
  • the path planning module adopts a model predictive control method, including:
  • the vehicle's motion state and trajectory within a preset period are predicted based on the dynamic model at the current moment. Under the condition of considering the constraints, the control trajectory at each specific moment is optimized to ensure the optimal solution at each specific moment.
  • optimizing the control trajectory at each specific moment is based on a cost function, an acceleration limit constraint, a speed limit constraint, an obstacle avoidance constraint, and a dynamics constraint.
  • the scene complexity is determined based on the parameter space of the path planning trajectory: the number of constraint equations in the parameter space of the path planning trajectory is proportional to the scene complexity.
  • the data set for training the neural network includes historical bird's-eye view images and corresponding driver operation information at historical moments;
  • the historical bird's-eye view is a fusion of the surrounding environment information of the corresponding historical moment of driving.
  • the bird's-eye view is a vehicle-centric view composed of RGB camera images from multiple perspectives.
  • the structure of the neural network is: three convolutional layers with 32, 64, and 64 kernels respectively, and a four-layer fully connected network with the number of nodes of 1024, 512, 128, and 21 respectively.
  • the imitation learning module obtains the uncertainty distribution of the decision by the neural network according to the environmental information, including:
  • the scene abnormality is determined according to the uncertainty distribution: the uncertainty distribution is proportional to the scene abnormality.
  • determining the autonomous driving method according to the scene complexity and the scene abnormality includes:
  • if the scene complexity is greater than a first threshold and the scene abnormality is less than or equal to a second threshold, the operation information determined by the imitation learning module is used to control the execution module;
  • if the scene complexity is less than or equal to the first threshold and the scene abnormality is greater than the second threshold, the operation information determined by the path planning module is used to control the execution module;
  • if both the scene complexity and the scene abnormality exceed their thresholds, then if the first difference (scene complexity minus the first threshold) is less than the second difference (scene abnormality minus the second threshold), the operation information determined by the path planning module is used to control the execution module, and if the first difference is greater than the second difference, the operation information determined by the imitation learning module is used to control the execution module;
  • if both the scene complexity and the scene abnormality are below their thresholds, then if the first difference is less than the second difference, the operation information determined by the path planning module is used to control the execution module; if the first difference is greater than the second difference, the operation information determined by the imitation learning module is used to control the execution module.
  • the present application provides an autonomous driving system based on scene adaptive recognition, the system comprising:
  • the acquisition module is used to obtain the environmental information of the driving scene and the operation information at the last moment;
  • the path planning module is used to determine a path planning trajectory for the current driving scenario based on the operation information at the previous moment;
  • a first determination module used to determine the scene complexity according to the parameter space of the path planning trajectory
  • Imitation learning module used to derive the uncertainty distribution of decisions by the neural network based on environmental information
  • a second determination module is used to determine the abnormality of the scene according to the uncertainty distribution
  • the decision module is used to determine the autonomous driving method based on the complexity and abnormality of the scene.
  • this solution combines the advantages of path planning and imitation learning methods.
  • This method can adaptively identify and analyze scene complexity and scene abnormality according to different scenes, and intelligently choose to use path planning methods to solve constraints, or use imitation learning methods to calculate by neural networks.
  • the advantages and disadvantages of these two methods in autonomous driving are comprehensively considered, and the real-time performance and accuracy performance of autonomous driving are improved.
  • FIG1 is an overall block diagram of an autonomous driving system based on scene adaptive recognition provided by the present application
  • FIG2 is a schematic diagram of a flow chart of an autonomous driving method based on scene adaptive recognition provided by the present application
  • FIG3 is a schematic diagram of a bird's-eye view provided by the present application.
  • FIG4 is a schematic diagram of the structure of the autonomous driving system based on scene adaptive recognition provided in the present application.
  • path planning methods can be roughly divided into four categories: graph search, sampling, interpolation, and optimization.
  • Graph search searches for the best path by constructing an environment map.
  • Sampling represents the environment map by sampling.
  • Interpolation generates trajectories based on existing reference points.
    • Optimization formulates the planning problem as an optimization problem to be solved.
  • the optimization-based path planning method can solve the control parameters and trajectory of vehicle navigation while taking into account the constraints of obstacle avoidance.
  • the more commonly used imitation learning method in autonomous driving is to collect environmental information (such as cameras and lidar) and driver behavior actions (such as throttle, steering, braking) to form a data set, and then hand over a large amount of data sets to neural network training.
  • the neural network will update the neuron parameters through gradient descent and back propagation, allowing the network to continuously fit the input-output pairs, and finally give output that conforms to the driver's logic based on the input.
  • the main disadvantage of current imitation learning-based methods is that imitation learning is strongly restricted by the scenarios it has seen.
  • the model can only respond well to scenarios that have appeared in the data set, and the model loses judgment on scenarios outside the data set. This also determines that imitation learning is heavily dependent on the diversity of the data set and is difficult to generalize to driving behaviors in all scenarios.
  • the present application provides an automatic driving method based on scene adaptive pattern recognition.
  • the method navigates an optimal collision-free path from the starting point to the target location based on the path planning method, and then uses the imitation learning method to calculate the decision under the current scene.
  • the scene analysis comprehensively considers the output of path planning and the output of imitation learning to select the appropriate solution.
  • This decision-making solution fully considers the advantages of each of the two methods.
  • the system mainly includes: a perception module, a path planning module, an imitation learning module, and a decision module.
  • the role of the perception module is to collect sensor data and extract the information needed by subsequent modules.
  • Commonly used sensors for environmental information include cameras (i.e., the camera in Figure 1), radars, and UWB positioning systems (i.e., positioning in Figure 1). Cameras can capture visual information such as the color, brightness, and objects of the environment; the rays emitted by the radar return after hitting an object, and the distance of the object from the radar can be calculated from the return time, thereby obtaining depth and distance information; the positioning system can obtain the coordinate information of its own position. By combining these sensors and corresponding algorithms, a map of the surrounding environment and the coordinates of obstacles can be constructed and handed over to subsequent modules for processing.
  • the function of the path planning module is to determine an optimal trajectory for the current driving scenario, based on the driver's operation information at the previous moment (including throttle, brake and steering, etc.), using an optimization-based path planning method.
  • the imitation learning module uses a neural network to calculate the uncertainty distribution of the decision based on the environmental information obtained from the perception module.
  • the decision module intelligently evaluates the scene abnormality caused by the uncertainty distribution and the scene complexity derived from the parameter space of the path planning trajectory to decide which decision plan to choose.
  • FIG. 2 which shows a flow chart of an autonomous driving method based on scene adaptive recognition applicable to an embodiment of the present application.
  • the autonomous driving method based on scene adaptive recognition may include:
  • environmental information includes visual information captured by the camera, depth and distance information obtained by radar, and the positioning system's own position coordinate information.
  • the operation information at the last moment includes the linear speed, heading angle, vehicle steering angle and other information involved in the accelerator, brake, steering wheel, etc.
  • the path planning module determines a path planning trajectory for the current driving scenario based on the operation information at the previous moment.
  • obstacles and vehicles are modeled as convex sets. Constructing them as convex sets can take into account the robot and environment models in the optimization equation and speed up the solution.
  • the subsequent path planning parts are all based on convex set planning.
  • matrices A and B are determined by the shape and size of the obstacle or robot, and the points that satisfy the inequality constitute the convex set representing the obstacle or vehicle.
  • vehicles are constantly moving, so the convex set of the vehicle at each moment must be calculated based on the current position.
  • the commonly used method is to first calculate the initial convex set of the vehicle (using the constructed generalized linear inequality as the initial convex set of the vehicle), and then, according to the current position of the vehicle, transform the convex set by a rotation matrix and a translation matrix; these two matrices are determined by the current orientation and position of the vehicle. For example, in homogeneous row-vector form, the translation matrix is Translation_{x,y} = [[1, 0, 0], [0, 1, 0], [t_x, t_y, 1]].
  • the path planning module adopts a model predictive control method, including:
  • the vehicle's motion state and trajectory within a preset period are predicted based on the dynamic model at the current moment. Under the condition of considering the constraints, the control trajectory at each specific moment is optimized to ensure the optimal solution at each specific moment.
  • model predictive control method is commonly used in the control algorithm of autonomous driving. Its advantage is that it can make the control program meet certain constraints, such as taking various dynamics and kinematics into consideration as constraints.
  • the main idea of model predictive control is to predict the motion state and trajectory of the vehicle in a future time window based on the current dynamic model. Then, under the condition of considering the constraints, optimize the control trajectory at each specific moment to ensure the optimal solution at this moment. Among them, the optimization problem is divided into five parts: cost function, acceleration limit constraint, speed limit constraint, obstacle avoidance constraint, and dynamic constraint.
  • the final result of the solution is the optimal control command and the predicted optimal trajectory over this period of time, and the predicted control command is passed to the vehicle for execution.
  • this application adopts a hot start method to save computing time, that is, the solution to each problem will be used as the initial value of the problem at the next moment, because the movement of the vehicle will not be too large in a very small time unit, so the difference between each solution is not large.
  • the cost function is the core part of the optimization problem.
  • the setting of the cost function determines the solution direction of the optimization equation.
  • the ultimate goal of optimization is to solve the value that minimizes the cost function.
  • the cost function is: J = Σ_{t=0}^{N} (s_t − s_r)ᵀ P (s_t − s_r) + (u_t − v_r)ᵀ Q (u_t − v_r)
  • s is the state variable of the vehicle, including coordinates and direction;
  • u is the control variable of the vehicle (also called operation information), including linear speed and steering angle;
  • s_r and v_r are the reference trajectory and reference speed;
  • P and Q are weight matrices, which adjust how strongly the vehicle runs along the reference trajectory s_r and the reference speed v_r. The larger these two values are, the more closely the vehicle follows the reference values. Understandably, different navigation tasks correspond to different P and Q.
  • N is the prediction horizon, and the subscript t refers to the t-th moment. This cost function represents the difference between the robot's state and control variables and their reference values.
  • the optimization direction is to minimize that difference.
  • the obstacle avoidance constraint is the core constraint of the planning optimization problem. This constraint ensures that the vehicle's trajectory will not encounter obstacles and collide.
  • the constraint is established based on the convex set given by the environmental information of the perception module.
  • the basis for determining whether there is a collision is the minimum distance between the vehicle and the surrounding obstacles, denoted d_min. In order to ensure obstacle avoidance, this minimum distance must be kept within a safe range.
  • the mathematical form is: d_min(s_t) ≥ d_safe.
  • the Ackerman model is commonly used for vehicles. Its characteristics are that it cannot move laterally, and its trajectory is a combination of straight lines and arcs.
  • the radius of the arc depends on the minimum turning radius rmin of the vehicle, and the turning radius is determined by the center distance between the front and rear wheels of the vehicle and the maximum steering angle.
  • the control commands of the Ackermann vehicle mainly include linear speed and steering angle, and its kinematic model is: x_{t+1} = x_t + v_t·cos(φ_t)·Δt, y_{t+1} = y_t + v_t·sin(φ_t)·Δt, φ_{t+1} = φ_t + (v_t/L)·tan(α_t)·Δt
  • v and φ are the linear velocity and the heading angle
  • α is the steering angle of the vehicle
  • L is the distance between the front and rear wheels of the vehicle
  • S_t and S_{t+1} are the states (x, y, φ) of the vehicle at successive moments.
  • the constraints of the dynamic model can ensure the smoothness and feasibility of the desired trajectory.
  • the vehicle's speed u and acceleration a are subject to maximum and minimum constraints, that is, they are restricted to optimization within a certain range. This also reduces the definition domain and facilitates solving the optimization equation.
  • the planning problem of the path planning module is to require the vehicle to be as close to the ideal trajectory as possible while avoiding collisions.
  • the problem can be abstracted as follows:
  • the following solver is designed for iterative optimization. Each iteration is divided into four steps, and the optimization is performed cyclically to obtain the optimal linear speed, heading angle, and vehicle steering angle:
  • Step 1 Use the control command output by the solver at the previous moment and the state of the vehicle after executing the action as the initial point; this facilitates the rapid solution of subsequent steps.
  • Step 2 Use L1-norm sparsity to dynamically adjust the safe distance d_safe; for sparse environments, d_safe will tend to be larger, and vice versa.
  • Step 3 Use the penalty function to convert the constraint group into a summation form of constraints, eliminate non-convex constraints and make all constraints of the problem linear; linear constraints are easier to solve and facilitate subsequent calculations.
  • Step 4 For non-convex cost functions, use the inequality method to calculate the upper bound of the cost function, and input the upper bound as a proxy function into the interior point method solver. In this way, the original non-convex problem is converted into a convex problem, which is easier to solve.
  • the role of the proxy function is to replace the original function with another function form to facilitate the solution.
  • for an inequality a ≤ b that is difficult to solve directly, an upper bound c of a can be obtained, and the original inequality can be converted to c ≤ b. This is the proxy function of the original inequality.
  • this application uses the method of studying the parameter space of path planning to determine the scene complexity.
  • the solution of path planning depends on the dependency relationship of the established constraint equations, so the more constraint equations (parameters) there are, the higher the scene complexity, that is, the higher the difficulty of calculation and solution.
  • the imitation learning module obtains the uncertainty distribution of the decision by the neural network according to the environmental information.
  • the imitation learning module uses a pre-trained neural network for control.
  • the data set for training the neural network includes historical bird's-eye view images and corresponding driver operation information at historical moments;
  • the historical bird's-eye view is a fusion of the surrounding environment information of the corresponding historical moment of driving.
  • the bird's-eye view is a vehicle-centric view composed of RGB camera images from multiple perspectives.
  • the data collection module mainly uses various sensors in the perception module to obtain environmental information to produce data sets.
  • sensor data include RGB cameras, depth cameras, radars, lidars, etc.
  • RGB cameras can be used to obtain visual information of objects around the vehicle body, from which semantic information, object interaction information, etc. can be extracted.
  • Depth cameras can obtain a matrix composed of depth data of all points in the field of view, thereby constructing a depth map, and querying the distance information of other objects.
  • Lidar measures the propagation distance between the sensor's transmitter and the target object and analyzes the reflected energy, and the amplitude, frequency, and phase of the reflected spectrum from the object's surface, thereby presenting accurate three-dimensional structural information of the target object.
  • More sensors can bring richer information to the system, making the system's judgment of the surrounding environment more accurate, but the data processing and fusion between multiple sensors will become relatively more complicated, making it difficult to train the intelligent model.
  • this application uses RGB camera images from multiple perspectives to form a bird's-eye-view input with the vehicle as the center point, as shown in Figure 3.
  • an experienced human driver is arranged to perform road driving operations, selecting different traffic environments on different streets for driving, avoiding collisions and complying with traffic regulations throughout.
  • the intermediate process maintains a frequency of 15 Hz, that is, 15 samples are recorded per second, and each sample includes a bird's-eye view and the corresponding driver operations (throttle, brake, steering values).
  • a complete driving collection is considered as a set of data.
  • the collected bird's-eye view images are cropped and scaled so that the size of each image is 160*80, which is convenient for network calculation.
  • For the throttle, brake, and steering values we modify the values that exceed the physical range limit to the maximum/minimum value of the physical range limit, and then normalize all data values to make their values between [-1,1]. Then process the action value (i.e., operation value), with each 0.1 value as an interval, and divide [-1,1] into 21 categories from 0 to 20.
  • This application uses a convolution plus fully connected structure to construct a classification network.
  • the task of this network is to obtain the classification values corresponding to each action through the input bird's-eye view and network calculation, and then convert the classification values into control values for application.
  • the specific network structure is three convolutional layers with 32, 64, and 64 kernels respectively, followed by a four-layer fully connected network with node numbers of 1024, 512, 128, and 21, respectively.
  • the training goal of imitation learning is to make the strategy closest to the driver's driving strategy and make the output under different inputs closest to the corresponding values in the data set.
  • the optimization equation is:
  • ⁇ * is the driving strategy parameter that is closest to the driver
  • (s, a) ∼ D are the state and action (input and label) sampled from the data set D
  • loss is the error function
  • π(s|θ) is the output of strategy π under parameter θ when given input s
  • the purpose of parameter optimization is to minimize the difference between the strategy of the current parameters and the strategy in the data set.
  • the bird's-eye view that is fused with the current driving environment information is also cropped and scaled to the input size required by the network, and then input into the trained neural network to calculate the corresponding control result.
  • the imitation learning module obtains the uncertainty distribution of the decision by the neural network according to the environmental information, including:
  • P(y|s,w) represents the probability that the perception model w of the neural network produces the result y after observing the scene s;
  • since the perception model selects the most likely of all results, its final output is y* = argmax_y P(y|s,w); the probability corresponding to this output indicates how familiar the perception model w is with the observed scene. Therefore, the scene uncertainty can be calculated as U(s) = 1 − P(y*|s,w);
  • a larger uncertainty U indicates a more abnormal (less familiar) scene.
  • scene complexity can also be determined by the scene understanding deviation, specifically:
  • scene understanding deviation including classification error rate, detection error rate, and tracking loss probability
  • A represents the understanding deviation of the smart car for a specific scene
  • (b,c) represents the parameters to be fitted
  • T represents the temperature constant of the Gibbs distribution
  • (U,V) represent the second- and first-order derivative matrices of the generalization error with respect to the scene understanding model
  • W represents the number of parameters of the scene understanding model. The larger A is, the more complex the scene.
  • the scene abnormality can also be determined by comprehensively considering the uncertainty distribution U and the scene understanding deviation A.
  • the two can be averaged, or a weighted average can be taken, to obtain the value that determines the scene abnormality. There is no limitation here.
  • the decision module determines the autonomous driving method according to the scene complexity and scene abnormality.
  • the operation information determined by the imitation learning module is used to control the execution module;
  • the operation information determined by the path planning module is used to control the execution module
  • the operation information determined by the path planning module is used to control the execution module, and if the first difference is greater than the second difference, the operation information determined by the imitation learning module is used to control the execution module;
  • the operation information determined by the path planning module is used to control the execution module; if the first difference is greater than the second difference, the operation information determined by the imitation learning module is used to control the execution module.
  • the first threshold and the second threshold can be set according to actual needs.
  • when the scene complexity is high (i.e., greater than the first threshold) and the scene abnormality is low (i.e., less than or equal to the second threshold), the model calculation results of imitation learning can be used for faster and more effective control, which meets the real-time requirements of autonomous driving;
  • when the scene abnormality is high (i.e., greater than the second threshold) and the scene complexity is low (i.e., less than or equal to the first threshold), the current scene is one that rarely appears in the imitation learning data set, and the model's confidence in judging this scene is low;
  • in that case the path planning method can improve the accuracy of decision-making and better avoid possible abnormal situations such as collisions.
  • the automatic driving method based on scene adaptive recognition combines the advantages of path planning and imitation learning methods.
  • the method can adaptively identify and analyze scene complexity and scene abnormality according to different scenes, and intelligently choose to use path planning methods to solve constraints, or use imitation learning methods to calculate by neural networks.
  • the advantages and disadvantages of these two methods in automatic driving are comprehensively considered, and the real-time performance and accuracy performance of automatic driving are improved.
  • the embodiment of the present application combines the advantages of both methods and can stably, reliably and collision-free complete autonomous driving operations on traffic roads. Compared with a simple path planning method, the present application takes less time to calculate; compared with a simple imitation learning method, the present application has better coping performance for training scenarios with lower frequency of occurrence.
  • FIG. 4 shows a schematic diagram of the structure of an autonomous driving system based on scene adaptive recognition according to an embodiment of the present application.
  • the scene adaptive recognition based autonomous driving system 400 may include:
  • the acquisition module 410 is used to acquire the environment information and the operation information at the last moment in the driving scene;
  • a path planning module 420 is used to determine a path planning trajectory in a current driving scenario based on the operation information at the previous moment;
  • a first determination module 430 configured to determine scene complexity according to a parameter space of a path planning trajectory
  • An imitation learning module 440 for deriving a decision uncertainty distribution by a neural network based on environmental information
  • a second determination module 450 is used to determine the abnormality of the scene according to the uncertainty distribution
  • the decision module 460 is used to determine the automatic driving method according to the scene complexity and scene abnormality.
  • the path planning module adopts a model predictive control method, including:
  • the vehicle's motion state and trajectory within a preset period are predicted based on the dynamic model at the current moment. Under the condition of considering the constraints, the control trajectory at each specific moment is optimized to ensure the optimal solution at each specific moment.
  • control trajectory at each specific moment is optimized based on a cost function, an acceleration limit constraint, a speed limit constraint, an obstacle avoidance constraint, and a dynamics constraint.
  • the number of constraint equations in the parameter space of the path planning trajectory is proportional to the scene complexity.
  • the data set for training the neural network includes historical bird's-eye view images and corresponding operation information of drivers at historical moments;
  • the historical bird's-eye view is a fusion of the surrounding environment information of the corresponding historical moment of driving.
  • the bird's-eye view is a vehicle-centric view composed of RGB camera images from multiple perspectives.
  • the structure of the neural network is: three convolutional layers with 32, 64, and 64 kernels respectively, and a four-layer fully connected network with the number of nodes of 1024, 512, 128, and 21 respectively.
  • the imitation learning module 440 is further used to:
  • P(y|s,w) represents the probability that the neural network's perception model w will produce the result y after observing the scene s;
  • the uncertainty distribution is proportional to the scene abnormality.
  • the decision module 460 is further configured to:
  • the operation information determined by the imitation learning module is used to control the execution module
  • the operation information determined by the path planning module is used to control the execution module
  • the operation information determined by the path planning module is used to control the execution module, and if the first difference is greater than the second difference, the operation information determined by the imitation learning module is used to control the execution module;
  • the operation information determined by the path planning module is used to control the execution module; if the first difference is greater than the second difference, the operation information determined by the imitation learning module is used to control the execution module.
  • This embodiment provides an autonomous driving system based on scene adaptive recognition, which can execute the embodiments of the above method. Its implementation principles and technical effects are similar and will not be repeated here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Traffic Control Systems (AREA)

Abstract

This application provides an autonomous driving method and system based on scene-adaptive recognition. The method includes: acquiring environmental information of the driving scene and the operation information at the previous moment; a path planning module determining a path planning trajectory for the current driving scene based on the operation information at the previous moment; determining scene complexity according to the parameter space of the path planning trajectory; an imitation learning module deriving, by a neural network, an uncertainty distribution of the decision from the environmental information; determining scene abnormality according to the uncertainty distribution; and a decision module determining the autonomous driving method according to the scene complexity and the scene abnormality. This solution improves the real-time performance and accuracy of autonomous driving.

Description

An autonomous driving method and system based on scene-adaptive recognition. Technical Field
The present invention belongs to the field of information technology, and in particular relates to an autonomous driving method and system based on scene-adaptive recognition.
Background Art
With the continuous upgrading of vehicle intelligence and electrification, autonomous vehicles have become a major trend in the transformation of the automotive industry. Autonomous vehicles have enormous development potential, and autonomous driving technology has become an important part of strategic emerging industries; its rapid development will profoundly affect the way people, resources, and products move, and disruptively change the way humans live.
Currently, commonly used imitation learning algorithms learn the input-output pairs in a data set through a neural network, continuously optimizing the neuron parameters to approximate the characteristics and logic of the data set, so that the network can finally produce a logical output for a given input. However, imitation learning depends heavily on the data set, and its decision confidence is insufficient for scenes that do not appear in the data set; it is therefore suited to single scenes with similar characteristics. The traditional path planning approach, in contrast, computes a collision-free optimal trajectory by mathematical reasoning from the starting position, the target position, and the environmental information, approximates the position at the next moment from the trajectory and the current position, and computes the dynamic parameters on that basis. However, path planning is computationally time-consuming in some scenes and may be difficult to solve, so it cannot meet the real-time requirements of autonomous driving.
Summary of the Invention
The purpose of the embodiments of this specification is to provide an autonomous driving method and system based on scene-adaptive recognition.
To solve the above technical problem, the embodiments of this application are implemented as follows:
In a first aspect, this application provides an autonomous driving method based on scene-adaptive recognition, the method comprising:
acquiring environmental information of the driving scene and the operation information at the previous moment;
a path planning module determining a path planning trajectory for the current driving scene based on the operation information at the previous moment;
determining scene complexity according to the parameter space of the path planning trajectory;
an imitation learning module deriving an uncertainty distribution of the decision by a neural network according to the environmental information;
determining scene abnormality according to the uncertainty distribution;
a decision module determining the autonomous driving method according to the scene complexity and the scene abnormality.
In one embodiment, the path planning module adopts a model predictive control method, comprising:
predicting the vehicle's motion state and trajectory within a preset period according to the dynamic model at the current moment, and, under the given constraints, optimizing the control trajectory at each specific moment to guarantee the optimal solution at that moment.
In one embodiment, the optimization of the control trajectory at each specific moment is based on a cost function, an acceleration limit constraint, a speed limit constraint, an obstacle avoidance constraint, and a dynamics constraint.
In one embodiment, determining the scene complexity according to the parameter space of the path planning trajectory is: the number of constraint equations in the parameter space of the path planning trajectory is proportional to the scene complexity.
In one embodiment, the data set for training the neural network includes historical bird's-eye views and the corresponding driver operation information at historical moments;
the historical bird's-eye views are fused from the surrounding environment information at the corresponding historical driving moments, where a bird's-eye view is a vehicle-centered view composed of RGB camera images from multiple perspectives.
In one embodiment, the structure of the neural network is: three convolutional layers with 32, 64, and 64 kernels respectively, and a four-layer fully connected network with 1024, 512, 128, and 21 nodes respectively.
In one embodiment, the imitation learning module deriving the uncertainty distribution of the decision by the neural network according to the environmental information comprises:
the perception result output by the neural network is:
y* = argmax_y P(y|s,w)
where P(y|s,w) denotes the probability that the perception model w of the neural network produces the result y after observing the scene s;
the uncertainty distribution of the scene U(s) is:
U(s) = 1 − P(y*|s,w).
In one embodiment, determining the scene abnormality according to the uncertainty distribution is: the uncertainty distribution is proportional to the scene abnormality.
In one embodiment, determining the autonomous driving method according to the scene complexity and the scene abnormality comprises:
if the scene complexity is greater than a first threshold and the scene abnormality is less than or equal to a second threshold, using the operation information determined by the imitation learning module to control an execution module;
if the scene complexity is less than or equal to the first threshold and the scene abnormality is greater than the second threshold, using the operation information determined by the path planning module to control the execution module;
if the scene complexity is greater than the first threshold and the scene abnormality is greater than the second threshold, then, if the first difference between the scene complexity and the first threshold is less than the second difference between the scene abnormality and the second threshold, using the operation information determined by the path planning module to control the execution module, and if the first difference is greater than the second difference, using the operation information determined by the imitation learning module to control the execution module;
if the scene complexity is less than the first threshold and the scene abnormality is less than the second threshold, then, if the first difference is less than the second difference, using the operation information determined by the path planning module to control the execution module, and if the first difference is greater than the second difference, using the operation information determined by the imitation learning module to control the execution module.
In a second aspect, this application provides an autonomous driving system based on scene-adaptive recognition, the system comprising:
an acquisition module, configured to acquire environmental information of the driving scene and the operation information at the previous moment;
a path planning module, configured to determine a path planning trajectory for the current driving scene based on the operation information at the previous moment;
a first determination module, configured to determine scene complexity according to the parameter space of the path planning trajectory;
an imitation learning module, configured to derive an uncertainty distribution of the decision by a neural network according to the environmental information;
a second determination module, configured to determine scene abnormality according to the uncertainty distribution;
a decision module, configured to determine the autonomous driving method according to the scene complexity and the scene abnormality.
As can be seen from the technical solutions provided by the embodiments of this specification above, this solution combines the respective advantages of path planning and imitation learning. The method can adaptively identify and analyze scene complexity and scene abnormality for different scenes and intelligently choose between solving the constraints with the path planning method and computing with the neural network of the imitation learning method. The strengths and weaknesses of the two methods in autonomous driving are weighed comprehensively, improving both the real-time performance and the accuracy of autonomous driving.
Brief Description of the Drawings
In order to explain the technical solutions in the embodiments of this specification or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in this specification; for a person of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Figure 1 is an overall block diagram of the autonomous driving system based on scene-adaptive recognition provided by this application;
Figure 2 is a schematic flow chart of the autonomous driving method based on scene-adaptive recognition provided by this application;
Figure 3 is a schematic diagram of the bird's-eye view provided by this application;
Figure 4 is a schematic structural diagram of the autonomous driving system based on scene-adaptive recognition provided by this application.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification will be described clearly and completely below in conjunction with the drawings in the embodiments of this specification. Obviously, the described embodiments are only some of the embodiments of this specification, not all of them. Based on the embodiments in this specification, all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the scope of protection of this specification.
In the following description, specific details such as particular system structures and technologies are presented for illustration rather than limitation, in order to provide a thorough understanding of the embodiments of this application. However, it should be clear to those skilled in the art that this application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, apparatuses, circuits, and methods are omitted so that unnecessary detail does not obscure the description of this application.
Various improvements and changes can be made to the specific implementations described in this specification without departing from the scope or spirit of this application, as will be apparent to those skilled in the art. Other implementations derived from the specification of this application will likewise be apparent to the skilled person. The specification and examples of this application are merely exemplary.
As used herein, "comprising", "including", "having", "containing" and the like are open-ended terms, meaning including but not limited to.
Unless otherwise specified, "parts" in this application are parts by mass.
Most driving methods already commercialized in the autonomous driving field adopt path planning. These path planning methods fall roughly into four categories: methods based on graph search, sampling, interpolation, and optimization. Graph search searches for the best path by constructing an environment map; sampling represents the environment map by sampling; interpolation generates trajectories from existing reference points; optimization formulates the planning problem as an optimization problem to be solved. Optimization-based path planning can solve for the control parameters and trajectory of vehicle navigation while taking the obstacle avoidance constraints into account.
The imitation learning method commonly used in autonomous driving collects environmental information (e.g., from cameras and lidar) and driver behavior (e.g., throttle, steering, braking) to form a data set and hands a large amount of data to a neural network for training. The neural network updates the neuron parameters through gradient descent and back propagation so that the network continuously fits the input-output pairs and can finally give an output matching the driver's logic for a given input.
Existing optimization-based methods have two main shortcomings. First, the final problem to be solved is mostly non-convex; non-convexity makes the solution difficult, so that in some complex scenes it is hard to find the optimal solution, and the computation takes longer, so the real-time requirements of the application cannot be met. Some methods linearize the non-convex constraints, but this conversion cannot guarantee convergence. Second, most current methods treat the vehicle or obstacles as point-mass models or circles without considering multi-dimensional shapes; a common practice is to model the vehicle as an ellipse and obstacles as polygons. This limits the application of such methods in some special scenes; for example, when a vehicle reverses between two other vehicles, treating the car as an ellipse is unreasonable.
The main shortcoming of existing imitation-learning-based methods is that imitation learning is highly restricted by scenes: the model can respond well only to scenes that have appeared in the data set and loses judgment for scenes outside it. This also means that imitation learning depends heavily on the diversity of the data set and is difficult to generalize to driving behavior in all scenes.
In view of the above defects, this application provides an autonomous driving method based on scene-adaptive pattern recognition. During navigation, the method first uses path planning to navigate an optimal collision-free path from the starting point to the target position, then uses imitation learning to compute the decision for the current scene; scene analysis then comprehensively considers the output of path planning and the output of imitation learning to select the appropriate solution. This decision-making scheme fully exploits the respective advantages of the two methods.
The present invention is described in further detail below in conjunction with the drawings and embodiments.
Referring to Figure 1, which shows an overall block diagram of a system suitable for the autonomous driving method based on scene-adaptive recognition provided by the embodiments of this application. As shown in Figure 1, the system mainly includes a perception module, a path planning module, an imitation learning module, and a decision module.
The role of the perception module is to collect sensor data and extract from it the environmental information needed by subsequent modules. Commonly used sensors include cameras (the "camera" in Figure 1), radar, and a UWB positioning system (the "positioning" in Figure 1). Cameras capture visual information such as the color, brightness, and objects of the environment; rays emitted by the radar return after hitting an object, and the distance of the object from the radar can be calculated from the return time, yielding depth and distance information; the positioning system obtains the coordinates of the vehicle's own position. By combining these sensors with corresponding algorithms, a map of the surrounding environment and the coordinates of obstacles can be constructed and handed to subsequent modules for processing.
The role of the path planning module is to determine an optimal trajectory for the current driving scene, based on the driver's operation information at the previous moment (including throttle, brake, steering, etc.), using an optimization-based path planning method.
The imitation learning module derives the uncertainty distribution of the decision by a neural network from the environmental information obtained from the perception module.
The decision module intelligently weighs the scene abnormality derived from the uncertainty distribution against the scene complexity derived from the parameter space of the path planning trajectory, and decides which decision scheme to choose.
Referring to Figure 2, which shows a schematic flow chart of the autonomous driving method based on scene-adaptive recognition provided by the embodiments of this application.
As shown in Figure 2, the autonomous driving method based on scene-adaptive recognition may include:
S210: acquiring environmental information of the driving scene and the operation information at the previous moment.
Specifically, the environmental information includes visual information captured by the camera, depth and distance information measured by the radar, the vehicle's own position coordinates obtained by the positioning system, and similar information.
The operation information at the previous moment includes the linear speed, heading angle, vehicle steering angle, and other quantities involved in the throttle, brake, steering wheel, and so on.
S220: the path planning module determines a path planning trajectory for the current driving scene based on the operation information at the previous moment.
In the path planning module, obstacles and the vehicle are both modeled as convex sets; constructing them as convex sets allows the robot and environment models to be taken into account in the optimization equations and speeds up the solution, and the subsequent path planning is all based on convex-set programming. A convex set captures the shape and position of an obstacle or the vehicle as a generalized linear inequality O = {x | Ax ⪯_K B}, where the matrices A and B are determined by the shape and size of the obstacle or robot, and the points satisfying the inequality constitute the convex set representing the obstacle or vehicle. Unlike obstacles, the vehicle is constantly moving, so the vehicle's convex set at each moment must be computed from its current position. The common approach is to first compute the vehicle's initial convex set (taking the constructed generalized linear inequality as the initial convex set of the vehicle) and then transform the convex set by a rotation matrix and a translation matrix according to the vehicle's current position; these two matrices are determined by the vehicle's current orientation and position. For example, in homogeneous row-vector form the translation matrix is Translation_{x,y} = [[1, 0, 0], [0, 1, 0], [t_x, t_y, 1]], and the translation transform is [x, y, 1]_new = [x, y, 1]_old · Translation_{x,y}.
Similarly, the rotation transform is [x, y, 1]_new = [x, y, 1]_old · Rotation_{x,y}, where the specific form of Rotation_{x,y} differs according to whether the transform is taken about the x axis or the y axis.
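To make the transform concrete, here is a small sketch in Python; the footprint vertices and pose values are hypothetical, and the row-vector homogeneous convention follows the formulas above:

```python
import numpy as np

def pose_transform(theta: float, tx: float, ty: float) -> np.ndarray:
    """Combined rotation-then-translation matrix for homogeneous row
    vectors [x, y, 1], following [x,y,1]_new = [x,y,1]_old @ T."""
    c, s = np.cos(theta), np.sin(theta)
    rotation = np.array([[  c,   s, 0.0],
                         [ -s,   c, 0.0],
                         [0.0, 0.0, 1.0]])
    translation = np.array([[1.0, 0.0, 0.0],
                            [0.0, 1.0, 0.0],
                            [ tx,  ty, 1.0]])
    return rotation @ translation

# Hypothetical 2 m x 1 m vehicle footprint as the initial convex set,
# described by its corner vertices around the origin.
vertices = np.array([[-1.0, -0.5, 1.0],
                     [ 1.0, -0.5, 1.0],
                     [ 1.0,  0.5, 1.0],
                     [-1.0,  0.5, 1.0]])

# Move the convex set to the vehicle's current pose (heading 30 degrees,
# position (5, 2)); each row of the result is a transformed corner.
current = vertices @ pose_transform(np.deg2rad(30.0), 5.0, 2.0)
print(current[:, :2])
```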
Optionally, the path planning module adopts a model predictive control method, comprising:
predicting the vehicle's motion state and trajectory within a preset period according to the dynamic model at the current moment, and, under the given constraints, optimizing the control trajectory at each specific moment to guarantee the optimal solution at that moment.
Specifically, model predictive control is commonly used in control algorithms for autonomous driving. Its advantage is that it can make the control program satisfy certain constraints, for example taking various dynamics and kinematics into account as constraints. The main idea of model predictive control is to predict the vehicle's motion state and trajectory over a future time window from the current dynamic model, and then, under the given constraints, to optimize the control trajectory at each specific moment to guarantee the optimal solution at that moment. The optimization problem is divided into five parts: the cost function, the acceleration limit constraint, the speed limit constraint, the obstacle avoidance constraint, and the dynamics constraint. The desired result of the solution is the optimal control commands over this period of time and the predicted optimal trajectory, and the predicted control commands are passed to the vehicle for execution. This application also adopts warm starting to save computing time: the solution of each problem is used as the initial value of the problem at the next moment, because the vehicle's movement within a very small time unit is small, so successive solutions differ little.
The cost function is the core part of the optimization problem; its setting determines the solution direction of the optimization equation, and the ultimate goal of the optimization is to find the value that minimizes the cost function. The cost function is:
J = Σ_{t=0}^{N} (s_t − s_r)ᵀ P (s_t − s_r) + (u_t − v_r)ᵀ Q (u_t − v_r)
where s is the vehicle's state variable, including coordinates and orientation; u is the vehicle's control variable (also called operation information), including linear speed and steering angle; s_r and v_r are the reference trajectory and the reference speed; P and Q are weight matrices that adjust how strongly the vehicle follows the reference trajectory s_r and the reference speed v_r (the larger these two values, the more closely the vehicle follows the reference values; understandably, different navigation tasks correspond to different P and Q); N is the prediction horizon, and the subscript t refers to the t-th moment. This cost function represents the difference between the robot's state and control variables and their reference values, and the optimization direction is to make this difference as small as possible.
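As an illustration, a minimal numeric sketch of this quadratic tracking cost; the horizon length, reference values, and weight matrices below are hypothetical, not taken from the patent:

```python
import numpy as np

def tracking_cost(states, controls, s_refs, u_refs, P, Q):
    """J = sum_t (s_t - s_r)^T P (s_t - s_r) + (u_t - v_r)^T Q (u_t - v_r)."""
    cost = 0.0
    for s_t, u_t, s_r, u_r in zip(states, controls, s_refs, u_refs):
        ds = s_t - s_r            # state deviation from the reference trajectory
        du = u_t - u_r            # control deviation from the reference speed/steering
        cost += ds @ P @ ds + du @ Q @ du
    return cost

# Hypothetical horizon N = 3, state (x, y, heading), control (v, alpha).
states = [np.array([0.0, 0.0, 0.0]), np.array([0.9, 0.1, 0.05]), np.array([1.8, 0.2, 0.1])]
controls = [np.array([1.0, 0.05])] * 3
s_refs = [np.array([t * 1.0, 0.0, 0.0]) for t in range(3)]
u_refs = [np.array([1.0, 0.0])] * 3
P = np.diag([1.0, 1.0, 0.1])      # larger P -> track the reference trajectory more tightly
Q = np.diag([0.5, 0.5])           # larger Q -> track the reference controls more tightly
print(tracking_cost(states, controls, s_refs, u_refs, P, Q))
```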
The obstacle avoidance constraint is the core constraint of the planning optimization problem; it guarantees that the vehicle's trajectory will not meet an obstacle and collide. The constraint is established on the convex sets given by the environmental information of the perception module. The basis for judging whether a collision occurs is the minimum distance between the vehicle and the surrounding obstacles, denoted d_min. To guarantee obstacle avoidance, this minimum distance must be kept within a safe range; the mathematical form is d_min(s_t) ≥ d_safe for every moment t.
This is the obstacle avoidance constraint of the optimization equation; satisfying it avoids collisions.
Different robots have different dynamic models. The Ackermann model is commonly used for vehicles; its characteristics are that it cannot move laterally and its trajectory is a combination of straight lines and arcs, where the radius of an arc depends on the vehicle's minimum turning radius r_min, and the turning radius is determined by the distance between the centers of the front and rear wheels and the maximum steering angle. The control commands of an Ackermann vehicle are mainly the linear speed and the steering angle, and its kinematic model is:
x_{t+1} = x_t + v_t · cos(φ_t) · Δt
y_{t+1} = y_t + v_t · sin(φ_t) · Δt
φ_{t+1} = φ_t + (v_t / L) · tan(α_t) · Δt
where v and φ are the linear velocity and the heading angle, α is the vehicle's steering angle, L is the distance between the front and rear wheels, and S_t and S_{t+1} are the vehicle's states (x, y, φ) at successive moments. The constraints of the dynamic model guarantee the smoothness and feasibility of the desired trajectory.
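One step of this kinematic model can be sketched as follows; the helper name ackermann_step, the wheelbase value, and the use of the 15 Hz recording period as the time step are illustrative assumptions:

```python
import numpy as np

def ackermann_step(state, v, alpha, L=2.5, dt=1.0 / 15.0):
    """Propagate (x, y, heading) one step under the Ackermann kinematic
    model: no lateral motion, arcs set by steering angle alpha and wheelbase L."""
    x, y, phi = state
    x += v * np.cos(phi) * dt
    y += v * np.sin(phi) * dt
    phi += (v / L) * np.tan(alpha) * dt
    return np.array([x, y, phi])

# Driving at 5 m/s with a 10-degree steering angle for one 15 Hz tick.
print(ackermann_step(np.array([0.0, 0.0, 0.0]), 5.0, np.deg2rad(10.0)))
```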
Limited by the actual physical model, the vehicle's speed u and acceleration a both have maximum and minimum constraints; that is, they are restricted to optimization within a certain range, which also shrinks the domain and makes the optimization equation easier to solve.
In summary, the planning problem of the path planning module requires the vehicle to stay as close as possible to the ideal trajectory while avoiding collisions. The problem can be abstracted as:
minimize J = Σ_{t=0}^{N} (s_t − s_r)ᵀ P (s_t − s_r) + (u_t − v_r)ᵀ Q (u_t − v_r)
subject to S_{t+1} = f(S_t, u_t) (dynamics constraint)
u_min ≤ u_t ≤ u_max (speed limit constraint)
a_min ≤ a_t ≤ a_max (acceleration limit constraint)
d_min(s_t) ≥ d_safe (obstacle avoidance constraint)
For this problem, the following solver is designed for iterative optimization. Each iteration is divided into four steps, executed cyclically, to obtain the optimal linear speed, heading angle, and vehicle steering angle:
Step 1: use the control command output by the solver at the previous moment, and the vehicle's state after executing that action, as the initial point; this facilitates the rapid solution of the subsequent steps.
Step 2: use L1-norm sparsity to dynamically adjust the safe distance d_safe; in sparse environments d_safe tends to a larger value, and vice versa.
Step 3: use a penalty function to convert the constraint group into a summed form of constraints, eliminating the non-convex constraints so that all constraints of the problem are linear; linear constraints are easier to solve and facilitate subsequent computation.
Step 4: for the non-convex cost function, compute an upper bound on the cost function using inequalities and feed this upper bound, as a proxy function, into an interior-point solver. The original non-convex problem is thereby converted into a convex one, which is easier to solve. The role of the proxy function is to replace the original function with another functional form that is easier to solve: for a problem a ≤ b that is hard to solve directly, an upper bound c of a can first be obtained, and the original inequality becomes c ≤ b; this is the proxy function of the original inequality.
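The outer loop of such a solver might be organized as in the following skeleton; solve_convex_subproblem and the sparsity-based d_safe rule (including its 0.5/2.0 bounds) are hypothetical stand-ins for Steps 2-4, not the patent's actual implementation:

```python
import numpy as np

def plan_step(prev_solution, prev_state, obstacles, solve_convex_subproblem):
    """One iteration of the four-step scheme: warm start, adjust d_safe,
    then solve the penalized/convexified subproblem (Steps 3-4)."""
    # Step 1: warm start from the previous solution and post-action state.
    init = (prev_solution, prev_state)

    # Step 2: a sparsity measure of the environment sets the safe distance;
    # sparser surroundings -> larger d_safe, denser -> smaller.
    sparsity = 1.0 / (1.0 + len(obstacles))
    d_safe = float(np.clip(2.0 * sparsity + 0.5, 0.5, 2.0))

    # Steps 3-4: penalty conversion and the proxy upper bound happen inside
    # the convex subproblem solver, injected here as a callable.
    controls, trajectory = solve_convex_subproblem(init, obstacles, d_safe)
    return controls, trajectory
```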
S230: determining the scene complexity according to the parameter space of the path planning trajectory; specifically, the number of constraint equations in the parameter space of the path planning trajectory is proportional to the scene complexity.
Specifically, this application judges scene complexity by studying the parameter space of path planning. The solution of the path planning problem depends on the dependency relationships of the established constraint equations, so the more constraint equations (parameters) there are, the higher the scene complexity, that is, the harder the problem is to solve.
S240: the imitation learning module derives the uncertainty distribution of the decision by a neural network according to the environmental information.
Specifically, the imitation learning module uses a pre-trained neural network for control.
In one embodiment, the data set for training the neural network includes historical bird's-eye views and the corresponding driver operation information at historical moments;
the historical bird's-eye views are fused from the surrounding environment information at the corresponding historical driving moments, where a bird's-eye view is a vehicle-centered view composed of RGB camera images from multiple perspectives.
Specifically, the data collection module mainly uses the various sensors of the perception module to obtain environmental information and produce the data set. Commonly used sensor data come from RGB cameras, depth cameras, radar, lidar, and so on. RGB cameras capture visual information about objects around the vehicle body, from which semantic information, object interaction information, and the like can be extracted. A depth camera obtains a matrix of depth values for all points in its field of view, from which a depth map is constructed and the distances to other objects can be queried. Lidar measures the propagation distance between the sensor's transmitter and the target object and analyzes the reflected energy, and the amplitude, frequency, and phase of the reflected spectrum from the object's surface, thereby presenting accurate three-dimensional structural information of the target. More sensors bring richer information to the system and make its judgment of the surrounding environment more accurate, but the data processing and fusion across multiple sensors becomes correspondingly more complicated, making the intelligent model harder to train. We therefore use RGB camera images from multiple perspectives to form a single bird's-eye (bird-view) input centered on the vehicle, as shown in Figure 3. An experienced human driver is then arranged to perform road driving, selecting different traffic environments on different streets, avoiding collisions and complying with traffic rules throughout. The process is kept at a frequency of 15 Hz, that is, 15 samples are recorded per second, each sample including a bird's-eye view and the corresponding driver operations (throttle, brake, and steering values). Everything collected in one complete drive is treated as one group of data. The larger the data volume, the more diversity the data set contains, and the better the trained model performs. After collection, the data are preprocessed once. The collected bird's-eye views are cropped and scaled so that every image is 160×80, which is convenient for network computation. For the throttle, brake, and steering values, values exceeding the physical range limits are modified to the maximum/minimum of the physical range, and all data values are then normalized to lie in [-1, 1]. The action values (i.e., operation values) are then processed with each 0.1 as one interval, dividing [-1, 1] into 21 classes numbered 0 to 20; the specific formula is:
pred=value*10.0+10
The collected data set is then used to train the neural network. This application uses a convolutional-plus-fully-connected structure to build a classification network whose task is to take the input bird's-eye view, compute through the network, obtain the class value corresponding to each action, and then convert the class values into control values for use. The specific network structure is three convolutional layers with 32, 64, and 64 kernels respectively, followed by a four-layer fully connected network with 1024, 512, 128, and 21 nodes. The last fully connected layer has twenty-one nodes, outputting in turn the probability of each action class; the class with the highest probability among these nodes is taken as the selected class, and the class value is converted into a control value by the following formula and handed to the vehicle's actuation system for execution.
value=(pred-10)/10.0
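A minimal sketch of this encode/decode pair (assuming nearest-bin rounding, which the text does not spell out):

```python
import numpy as np

def encode_action(value: float) -> int:
    """Map a normalized action value in [-1, 1] to one of 21 classes (0..20),
    one class per 0.1 interval: pred = value * 10.0 + 10."""
    value = float(np.clip(value, -1.0, 1.0))
    return int(round(value * 10.0 + 10.0))

def decode_action(pred: int) -> float:
    """Inverse mapping applied at inference time: value = (pred - 10) / 10.0."""
    return (pred - 10) / 10.0

# A steering value of -0.43 falls into class 6 and decodes back to -0.4.
assert encode_action(-0.43) == 6
assert decode_action(6) == -0.4
```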
Understandably, the training goal of imitation learning is to make the policy approximate the driver's driving policy as closely as possible, so that the outputs for different inputs are as close as possible to the corresponding values in the data set. The optimization equation is:
θ* = argmin_θ E_{(s,a)∼D}[ loss(π(s|θ), a) ]
where θ* is the policy parameter closest to the driver's driving policy, (s, a) ∼ D are the state and action (input and label) sampled from the data set D, loss is the error function, and π(s|θ) is the output of policy π with parameter θ given input s; the purpose of parameter optimization is to minimize the difference between the policy under the current parameters and the policy in the data set.
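A compact PyTorch reading of the stated architecture and one training step under this objective is sketched below; the class name BEVPolicy, the kernel sizes and strides, the single 21-way output head (the patent classifies throttle, brake, and steering each into 21 classes), and the use of cross-entropy as the loss are assumptions the patent does not fix:

```python
import torch
import torch.nn as nn

class BEVPolicy(nn.Module):
    """Conv layers (32, 64, 64 kernels) + FC layers (1024, 512, 128, 21)
    classifying 160x80 bird's-eye-view images into 21 action classes."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 10 * 20, 1024), nn.ReLU(),  # 80x160 input -> 10x20 feature map
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, 128), nn.ReLU(),
            nn.Linear(128, 21),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = BEVPolicy()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()       # stands in for "loss" in the objective

# One optimization step on a hypothetical batch of 8 BEV images and labels.
images = torch.randn(8, 3, 80, 160)     # (batch, channels, H=80, W=160)
labels = torch.randint(0, 21, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```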
When using the trained neural network, the bird's-eye view fused from the current driving environment information is likewise cropped and scaled to the input size required by the network, and then fed into the trained network to compute the corresponding control result.
In one embodiment, the imitation learning module deriving the uncertainty distribution of the decision by the neural network according to the environmental information comprises:
the perception result output by the neural network is:
y* = argmax_y P(y|s,w)
where P(y|s,w) denotes the probability that the perception model w of the neural network produces the result y after observing the scene s;
the uncertainty distribution of the scene U(s) is:
U(s) = 1 − P(y*|s,w).
Specifically, P(y|s,w) denotes the probability that the perception model w produces the result y after observing the scene s. Since the perception model selects the most likely of all results, the model's final output is y* = argmax_y P(y|s,w); the probability corresponding to this output indicates how familiar the perception model w is with the observed scene. The scene uncertainty can therefore be computed with the following formula:
U(s) = 1 − P(y*|s,w).
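In sketch form, with the 21-way softmax output of a classifier such as the one above standing in for P(y|s,w):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def scene_uncertainty(model, bev_image):
    """U(s) = 1 - P(y* | s, w): one minus the top softmax probability
    of the perception/policy network for the observed scene."""
    probs = F.softmax(model(bev_image.unsqueeze(0)), dim=1)
    top_prob, y_star = probs.max(dim=1)
    return (1.0 - top_prob).item(), y_star.item()

# e.g. uncertainty, predicted_class = scene_uncertainty(model, torch.randn(3, 80, 160))
```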
S250: determining the scene abnormality according to the uncertainty distribution, namely: the uncertainty distribution is proportional to the scene abnormality.
Specifically, the larger the uncertainty distribution U, the more abnormal (less familiar) the scene.
In addition, the scene complexity can also be determined through the scene understanding deviation, specifically:
applying general perception-complexity analysis to intelligent-driving scene understanding, the scene understanding deviation (comprising the classification error rate, the detection error rate, and the tracking loss probability) is preliminarily found to bear the following functional relationship to the parameter count of the scene understanding model:
where A denotes the intelligent vehicle's understanding deviation for a specific scene, (b, c) denote parameters to be fitted, T denotes the temperature constant of the Gibbs distribution, (U, V) denote the second- and first-order derivative matrices of the generalization error with respect to the scene understanding model, and W denotes the parameter count of the scene understanding model. The larger A is, the more complex the scene.
Understandably, the scene abnormality can also be determined by jointly considering the uncertainty distribution U and the scene understanding deviation A; for example, the two can be averaged, or a weighted average can be taken, to obtain the value that determines the scene abnormality. No limitation is imposed here.
S260: the decision module determines the autonomous driving method according to the scene complexity and the scene abnormality.
Specifically, if the scene complexity is greater than a first threshold and the scene abnormality is less than or equal to a second threshold, the operation information determined by the imitation learning module is used to control the execution module;
if the scene complexity is less than or equal to the first threshold and the scene abnormality is greater than the second threshold, the operation information determined by the path planning module is used to control the execution module;
if the scene complexity is greater than the first threshold and the scene abnormality is greater than the second threshold, then, if the first difference between the scene complexity and the first threshold is less than the second difference between the scene abnormality and the second threshold, the operation information determined by the path planning module is used to control the execution module, and if the first difference is greater than the second difference, the operation information determined by the imitation learning module is used to control the execution module;
if the scene complexity is less than the first threshold and the scene abnormality is less than the second threshold, then, if the first difference is less than the second difference, the operation information determined by the path planning module is used to control the execution module, and if the first difference is greater than the second difference, the operation information determined by the imitation learning module is used to control the execution module.
Specifically, both the first threshold and the second threshold can be set according to actual needs.
Considering the two indicators of scene complexity and scene abnormality together: when the scene complexity is high (i.e., greater than the first threshold) and the scene abnormality is low (i.e., less than or equal to the second threshold), using the model computation results of imitation learning gives faster and more effective control, meeting the real-time requirements of autonomous driving. When the scene abnormality is high (i.e., greater than the second threshold) and the scene complexity is low (i.e., less than or equal to the first threshold), the current scene rarely appears in the imitation learning data set and the model's confidence in judging it is low; adopting the path planning method then improves the accuracy of decision-making and better avoids possible abnormal situations such as collisions.
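The four-way arbitration reads directly as code; the sketch below uses hypothetical threshold values and a literal signed-difference comparison of the two margins:

```python
def choose_controller(complexity: float, abnormality: float,
                      t1: float = 0.5, t2: float = 0.5) -> str:
    """Return which module's operation information controls execution,
    following the four cases of step S260."""
    if complexity > t1 and abnormality <= t2:
        return "imitation_learning"      # complex but familiar: fast NN control
    if complexity <= t1 and abnormality > t2:
        return "path_planning"           # unfamiliar but tractable: accurate planning
    # Both above, or both below, the thresholds: compare the margins.
    first_diff, second_diff = complexity - t1, abnormality - t2
    return "path_planning" if first_diff < second_diff else "imitation_learning"

print(choose_controller(complexity=0.8, abnormality=0.3))  # -> imitation_learning
```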
The autonomous driving method based on scene-adaptive recognition provided by the embodiments of this application combines the respective advantages of path planning and imitation learning. It can adaptively identify and analyze the scene complexity and scene abnormality of different scenes and intelligently choose between solving the constraints with the path planning method and computing with the neural network of the imitation learning method. The strengths and weaknesses of the two methods in autonomous driving are weighed comprehensively, improving both the real-time performance and the accuracy of autonomous driving.
The embodiments of this application combine the advantages of both methods and can complete autonomous driving operations on traffic roads stably, reliably, and without collision. Compared with a pure path planning method, this application takes less computation time; compared with a pure imitation learning method, this application copes better with scenes that appear infrequently in training.
Referring to Figure 4, which shows a schematic structural diagram of the autonomous driving system based on scene-adaptive recognition described in one embodiment of this application.
As shown in Figure 4, the autonomous driving system 400 based on scene-adaptive recognition may include:
an acquisition module 410, configured to acquire environmental information of the driving scene and the operation information at the previous moment;
a path planning module 420, configured to determine a path planning trajectory for the current driving scene based on the operation information at the previous moment;
a first determination module 430, configured to determine scene complexity according to the parameter space of the path planning trajectory;
an imitation learning module 440, configured to derive an uncertainty distribution of the decision by a neural network according to the environmental information;
a second determination module 450, configured to determine scene abnormality according to the uncertainty distribution;
a decision module 460, configured to determine the autonomous driving method according to the scene complexity and the scene abnormality.
Optionally, the path planning module adopts a model predictive control method, comprising:
predicting the vehicle's motion state and trajectory within a preset period according to the dynamic model at the current moment, and, under the given constraints, optimizing the control trajectory at each specific moment to guarantee the optimal solution at that moment.
Optionally, the optimization of the control trajectory at each specific moment is based on a cost function, an acceleration limit constraint, a speed limit constraint, an obstacle avoidance constraint, and a dynamics constraint.
Optionally, the number of constraint equations in the parameter space of the path planning trajectory is proportional to the scene complexity.
Optionally, the data set for training the neural network includes historical bird's-eye views and the corresponding driver operation information at historical moments;
the historical bird's-eye views are fused from the surrounding environment information at the corresponding historical driving moments, where a bird's-eye view is a vehicle-centered view composed of RGB camera images from multiple perspectives.
Optionally, the structure of the neural network is: three convolutional layers with 32, 64, and 64 kernels respectively, and a four-layer fully connected network with 1024, 512, 128, and 21 nodes respectively.
Optionally, the imitation learning module 440 is further configured such that:
the perception result output by the neural network is:
y* = argmax_y P(y|s,w)
where P(y|s,w) denotes the probability that the perception model w of the neural network produces the result y after observing the scene s;
the uncertainty distribution of the scene U(s) is:
U(s) = 1 − P(y*|s,w).
Optionally, the uncertainty distribution is proportional to the scene abnormality.
Optionally, the decision module 460 is further configured such that:
if the scene complexity is greater than a first threshold and the scene abnormality is less than or equal to a second threshold, the operation information determined by the imitation learning module is used to control the execution module;
if the scene complexity is less than or equal to the first threshold and the scene abnormality is greater than the second threshold, the operation information determined by the path planning module is used to control the execution module;
if the scene complexity is greater than the first threshold and the scene abnormality is greater than the second threshold, then, if the first difference between the scene complexity and the first threshold is less than the second difference between the scene abnormality and the second threshold, the operation information determined by the path planning module is used to control the execution module, and if the first difference is greater than the second difference, the operation information determined by the imitation learning module is used to control the execution module;
if the scene complexity is less than the first threshold and the scene abnormality is less than the second threshold, then, if the first difference is less than the second difference, the operation information determined by the path planning module is used to control the execution module, and if the first difference is greater than the second difference, the operation information determined by the imitation learning module is used to control the execution module.
The autonomous driving system based on scene-adaptive recognition provided by this embodiment can execute the embodiments of the method described above; its implementation principles and technical effects are similar and will not be repeated here.
It should be noted that the terms "comprise", "include", or any of their other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively simple, and the relevant parts can be found in the description of the method embodiment.

Claims (10)

  1. An autonomous driving method based on scene-adaptive recognition, characterized in that the method comprises:
    acquiring environmental information of the driving scene and the operation information at the previous moment;
    a path planning module determining a path planning trajectory for the current driving scene based on the operation information at the previous moment;
    determining scene complexity according to the parameter space of the path planning trajectory;
    an imitation learning module deriving an uncertainty distribution of the decision by a neural network according to the environmental information;
    determining scene abnormality according to the uncertainty distribution;
    a decision module determining the autonomous driving method according to the scene complexity and the scene abnormality.
  2. The method according to claim 1, characterized in that the path planning module adopts a model predictive control method, comprising:
    predicting the vehicle's motion state and trajectory within a preset period according to the dynamic model at the current moment, and, under the given constraints, optimizing the control trajectory at each specific moment to guarantee the optimal solution at that moment.
  3. The method according to claim 2, characterized in that the optimization of the control trajectory at each specific moment is based on a cost function, an acceleration limit constraint, a speed limit constraint, an obstacle avoidance constraint, and a dynamics constraint.
  4. The method according to claim 1, characterized in that determining the scene complexity according to the parameter space of the path planning trajectory is: the number of constraint equations in the parameter space of the path planning trajectory is proportional to the scene complexity.
  5. The method according to claim 1, characterized in that the data set for training the neural network includes historical bird's-eye views and the corresponding driver operation information at historical moments;
    the historical bird's-eye views are fused from the surrounding environment information at the corresponding historical driving moments, where a bird's-eye view is a vehicle-centered view composed of RGB camera images from multiple perspectives.
  6. The method according to claim 1, characterized in that the structure of the neural network is: three convolutional layers with 32, 64, and 64 kernels respectively, and a four-layer fully connected network with 1024, 512, 128, and 21 nodes respectively.
  7. The method according to claim 1, characterized in that the imitation learning module deriving the uncertainty distribution of the decision by the neural network according to the environmental information comprises:
    the perception result output by the neural network is:
    y* = argmax_y P(y|s,w)
    where P(y|s,w) denotes the probability that the perception model w of the neural network produces the result y after observing the scene s;
    the uncertainty distribution of the scene U(s) is:
    U(s) = 1 − P(y*|s,w).
  8. The method according to claim 1, characterized in that determining the scene abnormality according to the uncertainty distribution is: the uncertainty distribution is proportional to the scene abnormality.
  9. The method according to claim 1, characterized in that determining the autonomous driving method according to the scene complexity and the scene abnormality comprises:
    if the scene complexity is greater than a first threshold and the scene abnormality is less than or equal to a second threshold, using the operation information determined by the imitation learning module to control an execution module;
    if the scene complexity is less than or equal to the first threshold and the scene abnormality is greater than the second threshold, using the operation information determined by the path planning module to control the execution module;
    if the scene complexity is greater than the first threshold and the scene abnormality is greater than the second threshold, then, if a first difference between the scene complexity and the first threshold is less than a second difference between the scene abnormality and the second threshold, using the operation information determined by the path planning module to control the execution module, and if the first difference is greater than the second difference, using the operation information determined by the imitation learning module to control the execution module;
    if the scene complexity is less than the first threshold and the scene abnormality is less than the second threshold, then, if the first difference is less than the second difference, using the operation information determined by the path planning module to control the execution module, and if the first difference is greater than the second difference, using the operation information determined by the imitation learning module to control the execution module.
  10. An autonomous driving system based on scene-adaptive recognition, characterized in that the system comprises:
    an acquisition module, configured to acquire environmental information of the driving scene and the operation information at the previous moment;
    a path planning module, configured to determine a path planning trajectory for the current driving scene based on the operation information at the previous moment;
    a first determination module, configured to determine scene complexity according to the parameter space of the path planning trajectory;
    an imitation learning module, configured to derive an uncertainty distribution of the decision by a neural network according to the environmental information;
    a second determination module, configured to determine scene abnormality according to the uncertainty distribution;
    a decision module, configured to determine the autonomous driving method according to the scene complexity and the scene abnormality.
PCT/CN2023/133059 2022-11-25 2023-11-21 An autonomous driving method and system based on scene-adaptive recognition WO2024109763A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211497070.3 2022-11-25
CN202211497070.3A CN115743178A (zh) 2022-11-25 An autonomous driving method and system based on scene-adaptive recognition

Publications (1)

Publication Number Publication Date
WO2024109763A1 true WO2024109763A1 (zh) 2024-05-30

Family

ID=85338860

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/133059 WO2024109763A1 (zh) 2023-11-21 An autonomous driving method and system based on scene-adaptive recognition

Country Status (2)

Country Link
CN (1) CN115743178A (zh)
WO (1) WO2024109763A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115743178A (zh) 2022-11-25 2023-03-07 中国科学院深圳先进技术研究院 Autonomous driving method and system based on scene-adaptive recognition

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109669461A * 2019-01-08 2019-04-23 南京航空航天大学 Decision-making system for autonomous vehicles under complex working conditions and trajectory planning method thereof
US20190332110A1 (en) * 2018-04-27 2019-10-31 Honda Motor Co., Ltd. Reinforcement learning on autonomous vehicles
CN112418237A * 2020-12-07 2021-02-26 苏州挚途科技有限公司 Vehicle driving decision-making method and apparatus, and electronic device
CN113635909A * 2021-08-19 2021-11-12 崔建勋 Autonomous driving control method based on generative adversarial imitation learning
US20220204020A1 (en) * 2020-12-31 2022-06-30 Honda Motor Co., Ltd. Toward simulation of driver behavior in driving automation
CN115743178A * 2022-11-25 2023-03-07 中国科学院深圳先进技术研究院 Autonomous driving method and system based on scene-adaptive recognition

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190332110A1 (en) * 2018-04-27 2019-10-31 Honda Motor Co., Ltd. Reinforcement learning on autonomous vehicles
CN109669461A * 2019-01-08 2019-04-23 南京航空航天大学 Decision-making system for autonomous vehicles under complex working conditions and trajectory planning method thereof
CN112418237A * 2020-12-07 2021-02-26 苏州挚途科技有限公司 Vehicle driving decision-making method and apparatus, and electronic device
US20220204020A1 (en) * 2020-12-31 2022-06-30 Honda Motor Co., Ltd. Toward simulation of driver behavior in driving automation
CN113635909A * 2021-08-19 2021-11-12 崔建勋 Autonomous driving control method based on generative adversarial imitation learning
CN115743178A * 2022-11-25 2023-03-07 中国科学院深圳先进技术研究院 Autonomous driving method and system based on scene-adaptive recognition

Also Published As

Publication number Publication date
CN115743178A (zh) 2023-03-07

Similar Documents

Publication Publication Date Title
CN110834644B (zh) Vehicle control method and device, vehicle to be controlled, and storage medium
JP7200371B2 (ja) Method and apparatus for determining vehicle speed
Botteghi et al. On reward shaping for mobile robot navigation: A reinforcement learning and SLAM based approach
WO2024109763A1 (zh) 一种基于场景自适应识别的自动驾驶方法及***
Li et al. Importance weighted Gaussian process regression for transferable driver behaviour learning in the lane change scenario
Aradi et al. Policy gradient based reinforcement learning approach for autonomous highway driving
Lu et al. Hierarchical reinforcement learning for autonomous decision making and motion planning of intelligent vehicles
Nallaperuma et al. Intelligent detection of driver behavior changes for effective coordination between autonomous and human driven vehicles
Xiao et al. Vehicle trajectory prediction based on motion model and maneuver model fusion with interactive multiple models
Babu et al. Model predictive control for autonomous driving considering actuator dynamics
Chen et al. ES-DQN: A learning method for vehicle intelligent speed control strategy under uncertain cut-in scenario
Jiang et al. Event-triggered shared lateral control for safe-maneuver of intelligent vehicles
Reda et al. Path planning algorithms in the autonomous driving system: A comprehensive review
Chen et al. Automatic overtaking on two-way roads with vehicle interactions based on proximal policy optimization
Wei et al. End-to-end vision-based adaptive cruise control (ACC) using deep reinforcement learning
CN113311828A (zh) Local path planning method, apparatus, device, and storage medium for an unmanned vehicle
Sharma et al. Kernelized convolutional transformer network based driver behavior estimation for conflict resolution at unsignalized roundabout
Wang et al. End-to-end driving simulation via angle branched network
Zhang et al. A convolutional neural network method for self-driving cars
CN116909287A (zh) Vehicle lane-change and obstacle-avoidance trajectory planning method integrating DMPs and APF
CN111160089A (zh) Trajectory prediction system and method based on different vehicle types
CN114537435B (zh) Real-time vehicle trajectory planning method in autonomous driving
Qiu et al. Learning a steering decision policy for end-to-end control of autonomous vehicle
US11794780B2 (en) Reward function for vehicles
Wang et al. LSTM-based prediction method of surrounding vehicle trajectory

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23893853

Country of ref document: EP

Kind code of ref document: A1