CN115107767A

CN115107767A - Artificial intelligence-based automatic driving brake and anti-collision control method

Info

Publication number: CN115107767A
Application number: CN202210859898.2A
Authority: CN
Inventors: 犹杰
Original assignee: Mingshang Technology Co ltd
Current assignee: Mingshang Technology Co ltd
Priority date: 2022-07-21
Filing date: 2022-07-21
Publication date: 2022-09-27

Abstract

The invention discloses an artificial-intelligence-based automatic driving braking and anti-collision control method comprising the following steps: S1, establishing an environment model based on a lane change by the front vehicle; S2, establishing an environment model based on braking by the front vehicle; S3, establishing an environment model based on pedestrians or other obstacles ahead; S4, studying passenger comfort during emergency braking caused by a front-vehicle lane change, front-vehicle braking, or pedestrians or other obstacles ahead; S5, establishing an AI algorithm model; S6, establishing two deep reinforcement learning models. The method has better environmental adaptability and capacity to improve with learning; algorithm instances for different application scenarios can be obtained by learning from a large amount of sampled data; the emergencies covered are more comprehensive; the control actions include braking to decelerate or steering to change lanes; and an optimal control strategy can be reached through model training and learning for a purely autonomous driving environment, an environment combining autonomous and assisted driving, and a mixed environment that includes human driving.

Description

Artificial intelligence-based automatic driving brake and anti-collision control method

Technical Field

The invention belongs to the field of safety assistance and emergency automatic braking of automatic driving, and particularly relates to an automatic driving brake and anti-collision control method based on artificial intelligence.

Background

In the field of vehicle driving, investigations and analyses of traffic accidents show that road traffic accidents are often caused by driver inattention, misjudgment, or misoperation, and that rear-end collisions are often caused by the driver of the rear vehicle failing to stop completely or in time. In addition, a sudden lane change by the front vehicle leaves the driver of the rear vehicle too little time to respond properly, which is another main cause of collisions.

An ADAS mainly uses sensing equipment arranged around the vehicle body, such as long-range radar, medium- and short-range radar, lidar, ultrasonic sensors, and cameras, to identify pedestrians, other vehicles, lane lines, road barriers and signs, and the like in the driving environment and to measure their distance in real time, so that the driver can be warned by a buzzer or indicator lamp in an emergency, or the vehicle can be stopped automatically in an extreme situation.

Traditional driver assistance systems (ADAS) and emergency braking systems (AEB) adapt poorly to the environment, are unreliable and unstable in use, and cannot make accurate and effective control decisions in a complex, changing traffic environment, especially on mixed roads where autonomous and human-driven vehicles coexist.

Disclosure of Invention

The invention aims to overcome the defects of the prior art by providing an artificial-intelligence-based automatic driving braking and anti-collision control method. The method has better environmental adaptability and capacity to improve with learning, and algorithm instances for different application scenarios can be obtained by learning from a large amount of sampled data. The emergencies covered are more comprehensive, including a lane change by the front vehicle, emergency braking by the front vehicle, and obstacles such as pedestrians appearing ahead; the control actions include braking to decelerate or steering to change lanes. An optimal control strategy can be reached through model training and learning in a purely autonomous driving environment, an environment combining autonomous and assisted driving, and an environment that includes human driving. The method can serve as a core system of autonomous driving technology, and by deploying and training it across a wide range of road driving environments, it can help the road traffic system avoid accidents and achieve a safer road traffic environment.

In order to achieve the purpose, the invention provides the following technical scheme: the control method of automatic driving brake and anti-collision based on artificial intelligence comprises the following steps:

S1, for a front vehicle that suddenly changes from a different lane into the lane in which the rear vehicle is traveling, establishing an environment model based on the lane change of the front vehicle;

S2, for two vehicles traveling in the same lane where the front vehicle suddenly brakes and decelerates, establishing an environment model based on the braking of the front vehicle;

S3, for pedestrians or other obstacles appearing ahead in the lane in which the vehicle is traveling, establishing an environment model based on the pedestrians or other obstacles ahead, wherein the obstacles ahead are walking pedestrians, cyclists, and other objects whose collision with the rear vehicle could endanger life or damage the vehicle;

S4, studying passenger comfort during emergency braking caused by a front-vehicle lane change, front-vehicle braking, or pedestrians or other obstacles ahead;

S5, establishing an AI algorithm model: when the rear vehicle detects that the front vehicle has suddenly changed into its lane, that the front vehicle in the same lane is braking, or that a pedestrian or obstacle is ahead, the rear vehicle must perform braking control according to the current relative distance, relative speed, and relative acceleration between itself and the front vehicle or the obstacle ahead, so as to decelerate or change to an adjacent lane to avoid a collision;

S6, establishing two deep reinforcement learning models, wherein the deep reinforcement learning models learn adaptively from real-time changes in the driving state and interact with the environment in real time to determine an anti-collision control strategy.

Preferably, in S1, the x-axis is the direction in which the front and rear vehicles travel and the y-axis is the direction perpendicular to it. The rear vehicle HC travels at speed $V_{xHC}(t)$, and its speed in the y-axis direction is 0 ($V_{yHC}(t) = 0$); the front vehicle QC travels at speeds $V_{xQC}(t)$ and $V_{yQC}(t)$. While the front vehicle is changing lanes, its speed in the y-axis direction satisfies $V_{yQC}(t) \neq 0$, where t is the point in time at which the front vehicle starts the lane change; the lane change then produces an offset angle $\theta(t)$ on the y-axis, given by formula (1).

Let $X_{HC}(t)$ be the distance the rear vehicle moves from the moment the emergency is detected and braking begins until braking ends, $X_{QC}(t)$ the distance the front vehicle moves in the same direction as the rear vehicle, and $L_{QC}$ the body length of the front vehicle. To avoid a collision between the rear vehicle and the front vehicle, the following condition must be satisfied:

$$X_{HC}(t) < X_{QC}(t) - L_{QC}\cos[\theta(t)] \qquad (2)$$

When $\cos[\theta(t)]$ takes its maximum value, the minimum safe distance between the front and rear vehicles is given by formula (3).

In formula (3), $a_{xHC}(\tau)$ and $a_{xQC}(\tau)$ are the accelerations of the rear and front vehicles, respectively, in the direction of travel (x-axis), and $v_{xHC}(0)$ and $v_{xQC}(0)$ are the initial speeds of the rear and front vehicles, respectively.

Preferably, in S2, if a front vehicle traveling in the same lane as the rear vehicle suddenly brakes, the rear vehicle must likewise recognize the situation, make an appropriate judgment, and brake to avoid a collision. The braking process of a driver-operated vehicle is divided into three stages: the driver reaction time t1, the brake-force build-up time t2, and the sustained braking time t3. Because braking here is automatic, t1 = 0, and the braking distance is given by formula (4).

For a lane change or braking by the front vehicle, the dynamic distance between the front and rear vehicles is:

$$x_\Delta(t) = x_\Delta(0) + x_{QC}(t) - x_{HC}(t) = x_\Delta(0) + x_{QC}(t) - x_{bHC}(t) \qquad (5)$$

In formula (5), $x_\Delta(0)$ is the initial distance between the front and rear vehicles in their common direction of travel. Let the average vehicle body length be $L_p$. According to formula (5), to prevent a collision, the distance between the front and rear vehicles after braking must satisfy:

$$x_\Delta(t) > L_p \qquad (6)$$
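The gap test in formulas (5) and (6) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `gap` and `is_safe` and the sample numbers in the usage note are hypothetical.

```python
def gap(x0: float, x_qc: float, x_bhc: float) -> float:
    """Formula (5): initial gap plus front-vehicle travel minus
    rear-vehicle braking travel, all along the driving direction (metres)."""
    return x0 + x_qc - x_bhc


def is_safe(x0: float, x_qc: float, x_bhc: float, l_p: float) -> bool:
    """Formula (6): the remaining gap must stay strictly larger than
    the average vehicle body length L_p."""
    return gap(x0, x_qc, x_bhc) > l_p
```

For example, with an initial gap of 30 m, 20 m of front-vehicle travel, and 40 m of rear-vehicle braking travel, the remaining gap is 10 m, which passes the test for an assumed average body length of 4.5 m.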

Preferably, in S3, a pedestrian or other stationary obstacle suddenly appears directly ahead of the traveling vehicle HC. Because the pedestrian or obstacle is almost stationary in the direction of travel relative to the vehicle, formula (5) can be simplified by setting $x_{QC}(t) = 0$, and the distance between the rear vehicle and the pedestrian or other obstacle XR during braking is:

$$x_\Delta(t) = x_\Delta(0) - x_{bHC}(t) \qquad (7)$$

To prevent the vehicle from colliding with the pedestrian or other obstacle, the following must then be satisfied:

$$x_\Delta(t) > 0 \qquad (8)$$

Preferably, in S4, sudden braking in an emergency may cause passengers discomfort, and violent inertia may even injure them. Passenger discomfort is related not only to the acceleration or deceleration of the vehicle but also to the jerk caused by discontinuous braking, which can be quantified as the rate of change of acceleration according to formula (9).

In formula (9), a is acceleration, v is speed, x is distance, and t is time. To avoid uncontrolled movement of passengers' bodies or discomfort, the maximum acceleration or deceleration ($a_{max}$) and the maximum jerk ($j_{max}$) during control can be limited according to formula (10),

wherein the maximum value of the acceleration is 4 m/s².
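As an illustration of the comfort limits, the sketch below approximates jerk as the first difference of sampled acceleration (jerk being the time derivative of acceleration) and checks both limits. The function names, the sampling scheme, and the default `j_max` of 2 m/s³ are assumptions made for the example; only the 4 m/s² acceleration limit comes from the text.

```python
from typing import List, Sequence


def jerk_series(accel: Sequence[float], dt: float) -> List[float]:
    """Approximate jerk j = da/dt by first differences of acceleration
    samples taken every dt seconds."""
    return [(a1 - a0) / dt for a0, a1 in zip(accel, accel[1:])]


def within_comfort_limits(accel: Sequence[float], dt: float,
                          a_max: float = 4.0, j_max: float = 2.0) -> bool:
    """True if every |a| <= a_max and every |j| <= j_max.
    a_max = 4 m/s^2 is the value stated in the text;
    j_max = 2 m/s^3 is a placeholder, not a value from the source."""
    jerks = jerk_series(accel, dt)
    return (all(abs(a) <= a_max for a in accel)
            and all(abs(j) <= j_max for j in jerks))
```

A gentle ramp such as `[0.0, -1.0, -2.0]` sampled every 0.5 s stays within both limits, whereas a step from 0 to -4 m/s² within one 0.5 s sample violates the assumed jerk limit.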

Preferably, in S5, in each calculation period T, noise in inter-vehicle communication or in on-board sensor signal acquisition may prevent the rear vehicle HC from accurately determining its distance to the front vehicle QC or to a pedestrian or obstacle XR ahead, so that it brakes too early or too late, or does not brake or change lanes at all; alternatively, a braking action may be carried out but cause passenger discomfort or even an accident;

in the anti-collision strategy algorithm, the state of the environment at any time t is represented as:

$$S(t) = \{x_{HC}(t),\ v_{HC}(t),\ a_{HC}(t),\ x_{QC}(t),\ v_{QC}(t),\ a_{QC}(t)\} \qquad (11)$$

which is a tensor composed of the current positions ($x_{HC}(t)$, $x_{QC}(t)$), velocities ($v_{HC}(t)$, $v_{QC}(t)$), and accelerations ($a_{HC}(t)$, $a_{QC}(t)$) of the rear vehicle HC and the front vehicle QC;

the control strategy includes braking or steering into an adjacent lane to avoid a collision, and the corresponding action in the reinforcement learning model is represented as:

$$A(t) = \{a_d(t),\ \theta(t)\} \qquad (12)$$

wherein $a_d(t)$ is the deceleration of the rear vehicle and $\theta(t)$ is the steering angle of the rear vehicle. The deceleration value is quantized to the range [0, 10], where 0 means no braking and 10 means the maximum braking deceleration of $-4\ \mathrm{m/s^2}$. The steering angle $\theta(t)$ takes values in [-1, 1], where negative numbers represent left turns, positive numbers represent right turns, and 0 represents no turn;
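The quantized action of formula (12) might be decoded as in the sketch below. The linear mapping from brake level to deceleration is an assumption (the text only fixes the endpoints 0 and 10), and the names `decode_action` and `A_MAX_DECEL` are hypothetical.

```python
from typing import Tuple

A_MAX_DECEL = 4.0  # m/s^2, magnitude of the maximum braking deceleration in the text


def decode_action(level: int, steer: float) -> Tuple[float, str]:
    """Map a quantized brake level in [0, 10] and a steering value in [-1, 1]
    to (deceleration in m/s^2, turn direction).
    Assumes a linear scale between level 0 (no braking) and level 10 (-4 m/s^2)."""
    if not 0 <= level <= 10:
        raise ValueError("brake level must be in [0, 10]")
    if not -1.0 <= steer <= 1.0:
        raise ValueError("steering value must be in [-1, 1]")
    decel = -A_MAX_DECEL * level / 10.0
    if steer < 0:
        direction = "left"
    elif steer > 0:
        direction = "right"
    else:
        direction = "none"
    return decel, direction
```

Under this assumed linear scale, level 5 with a steering value of -0.3 decodes to a deceleration of -2 m/s² and a left turn.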

the reward function R(t) in the algorithm is the incentive that guides the control strategy to take the optimal action A(t) in environment state S(t). For braking too early, the reward function is set as:

$$R_1(t) = -\delta\left[(x_\Delta(t) - L_p - X)^2 + \alpha\right] \qquad (13)$$

wherein $x_\Delta(t)$ is the distance between the rear vehicle and the front vehicle (or pedestrian or obstacle) along the rear vehicle's direction of travel, $L_p$ is the average vehicle length, X is a safe-distance threshold, and δ and α are weighting factors; δ > 0 when $x_\Delta(t) > L_p + X$, and δ = 0 otherwise;

for braking too late or not braking at all, the reward function is set as:

$$R_2(t) = -\rho\left[(x_\Delta(t) - L_p)^2 + (v_{HC}(t) - v_{QC}(t))^2 + \beta\right] \qquad (14)$$

wherein $x_\Delta(t)$ and $L_p$ are the distance between the two vehicles and the average vehicle length, respectively, $v_{HC}(t)$ and $v_{QC}(t)$ are the forward speeds of the rear and front vehicles, respectively, and ρ and β are weighting factors. According to formula (14), the higher the relative speed of the two vehicles and the later the braking, the larger the penalty reflected by the reward function;

for the passenger comfort objective, the reward function is set as:

$$R_3(t) = -\sigma\left[(a_{HC}(t) - a_{limit})^2 + (j_{HC}(t) - j_{limit})^2 + \gamma\right] \qquad (15)$$

wherein σ and γ are weighting factors; σ > 0 when $a_{HC}(t) > a_{limit}$ or $j_{HC}(t) > j_{limit}$, and σ = 0 otherwise. The weighting factors in formulas (13) to (15) are adjustable;

the three reward functions are combined to obtain the overall reward function:

$$R(t) = \omega_1 R_1(t) + \omega_2 R_2(t) + \omega_3 R_3(t) \qquad (16)$$

wherein $\omega_1$, $\omega_2$, and $\omega_3$ are the weights of the early-braking, late-braking, and passenger-comfort optimization objectives, respectively.
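A literal transcription of formulas (13) to (16) into Python might look as follows. It is a sketch only: the function and parameter names are hypothetical, all weights are caller-supplied, and the gating of δ and σ follows the conditions stated in the text (no gating condition is stated for ρ, so `r2` is always evaluated).

```python
def r1(x_gap: float, l_p: float, x_safe: float, delta: float, alpha: float) -> float:
    """Formula (13): early-braking penalty, active only when gap > L_p + X."""
    if x_gap > l_p + x_safe:
        return -delta * ((x_gap - l_p - x_safe) ** 2 + alpha)
    return 0.0  # delta = 0 outside the stated condition


def r2(x_gap: float, l_p: float, v_hc: float, v_qc: float,
       rho: float, beta: float) -> float:
    """Formula (14): late-braking / no-braking penalty."""
    return -rho * ((x_gap - l_p) ** 2 + (v_hc - v_qc) ** 2 + beta)


def r3(a_hc: float, j_hc: float, a_limit: float, j_limit: float,
       sigma: float, gamma: float) -> float:
    """Formula (15): comfort penalty, active only when a limit is exceeded."""
    if a_hc > a_limit or j_hc > j_limit:
        return -sigma * ((a_hc - a_limit) ** 2 + (j_hc - j_limit) ** 2 + gamma)
    return 0.0  # sigma = 0 when both limits are respected


def total_reward(r1_val: float, r2_val: float, r3_val: float,
                 w1: float, w2: float, w3: float) -> float:
    """Formula (16): weighted sum of the three objectives."""
    return w1 * r1_val + w2 * r2_val + w3 * r3_val
```

Because every term is a non-positive penalty, the best achievable overall reward under this transcription is 0, which matches the text's later remark that the average reward theoretically approaches 0.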

Preferably, in S6, the two deep reinforcement learning models comprise an Actor-Critic model and a Deep Q-Learning model;

wherein the Actor of the Actor-Critic model is a deep neural network whose input is the environment state S(t) and whose output is the action tensor, and the Critic is a deep neural network whose input is {S(t), A(t)} and whose output is the value function Q(S, A), the value function being used to calculate the loss function for neural network training:

wherein N is the number of samples and $\gamma \in (0, 1]$ is the reward discount factor. The Actor network outputs the specific action in state S(t), namely deceleration ($a_d(t)$) or steering ($\theta(t)$), and maps the current state to the optimal action; denoting this mapping by $\pi: S(t) \to A^*$, the Actor updates itself with the gradient:

wherein, in each calculation period T, the parameters of the Actor and Critic networks in the main network are iteratively updated to the corresponding Actor and Critic networks in the target network:

wherein μ is the update rate; μ takes a small value to avoid model overfitting or convergence to a local optimum and thereby make model learning more robust.
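The update formula itself is not reproduced in this text. A common way to copy main-network parameters into a target network with a small update rate is the soft (Polyak) update, target ← μ·main + (1 − μ)·target; the sketch below assumes that form, and the function name `soft_update` and the flat-list parameter representation are hypothetical.

```python
from typing import List, Sequence


def soft_update(main_params: Sequence[float], target_params: Sequence[float],
                mu: float = 0.01) -> List[float]:
    """Assumed soft update: blend each main-network parameter into the
    matching target-network parameter with update rate mu (small mu means
    the target network changes slowly)."""
    if len(main_params) != len(target_params):
        raise ValueError("parameter lists must have the same length")
    return [mu * m + (1.0 - mu) * t for m, t in zip(main_params, target_params)]
```

With a small μ, each call moves the target parameters only slightly toward the main network, which is the slow-tracking behaviour the text attributes to a small update rate.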

Preferably, the Q network in the main network and the Q network in the target network of the Deep Q-Learning model are each a deep neural network whose input is the environment state S(t) and whose output is the value function Q(S, A), the value function being used to calculate the loss function for neural network training:

wherein N is the number of samples and $\gamma \in (0, 1]$ is the reward discount factor. In each calculation iteration, that is, each calculation period T, the model determines the action A in the current state S(t) based on an ε-greedy strategy: the system generates a random number Random; if Random < ε, an action $A_\varepsilon$ is selected at random; otherwise the action $A_Q$ that maximizes the Q value is obtained by taking the argmax of $Q(S_t, A_t)$ from the Q network.
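The ε-greedy selection step can be sketched as follows. The sketch assumes a discrete action set whose Q values have already been computed by the Q network and are passed in as a list; the function name `epsilon_greedy` is hypothetical.

```python
import random
from typing import Sequence


def epsilon_greedy(q_values: Sequence[float], epsilon: float, rng=random) -> int:
    """Return an action index: a uniformly random one with probability
    epsilon, otherwise the index with the largest Q value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # exploratory action A_eps
    # greedy action A_Q: argmax over the Q values
    return max(range(len(q_values)), key=q_values.__getitem__)
```

Setting `epsilon` to 0 makes the choice purely greedy, while a value of 1 makes it purely random; passing a seeded `random.Random` instance as `rng` makes runs reproducible.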

The technical effects and advantages of the invention are as follows. Compared with the prior art, the artificial-intelligence-based automatic driving braking and anti-collision control method has better environmental adaptability and capacity to improve with learning, and algorithm instances for different application scenarios can be obtained by learning from a large amount of sampled data. The emergencies covered are more comprehensive, including a lane change by the front vehicle, emergency braking by the front vehicle, and obstacles such as pedestrians appearing ahead; the control actions include braking to decelerate or steering to change lanes. An optimal control strategy can be reached through model training and learning in a purely autonomous driving environment, an environment combining autonomous and assisted driving, and an environment that includes human driving. The method can serve as a core system of autonomous driving technology, and by deploying and training it across a wide range of road driving environments, it can help the road traffic system avoid accidents and achieve a safer road traffic environment.

Drawings

FIG. 1 is a schematic view of an emergency scenario faced by the automatic driving anti-collision control system of the present invention;

FIG. 2 is a communication method for data sharing between vehicles according to the present invention;

FIG. 3 is a schematic view of a preceding lane change emergency situation in accordance with the present invention;

FIG. 4 is a schematic diagram of a reinforcement learning model structure employed by the control strategy algorithm of the present invention;

FIG. 5 is a diagram of an Actor-Critic model structure employed by the control strategy algorithm of the present invention;

FIG. 6 is a schematic diagram of the Deep Q-Learning model structure used in the control strategy algorithm of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1-6, the present invention provides an automatic driving braking and anti-collision control method based on artificial intelligence, comprising the following steps:

S6, establishing two deep reinforcement learning models, wherein the deep reinforcement learning models learn adaptively from real-time changes in the driving state and interact with the environment in real time to determine an anti-collision control strategy.

The automatic vehicle control strategy, which comprehensively considers the emergencies of a front-vehicle lane change, sudden braking by the front vehicle, and the sudden appearance of obstacles such as pedestrians directly ahead of the traveling vehicle, is a set of artificial intelligence software that formulates an optimized control strategy according to the vehicle's current driving environment. The software works together with the on-board communication system and sensor system to form a complete solution for automatic emergency braking and collision avoidance. The current driving environment state is obtained in real time through the on-board sensors and communication system; emergencies are detected from the current state, a decision is made, and the vehicle is controlled to brake and decelerate or to steer and change lanes to avoid a collision.

The decision algorithm also takes into account the harm caused by braking too early or too late, by overly harsh braking, and by discontinuous braking, and it adapts to different environments by continuously learning from them, which ensures that the system can keep improving and adapting. A series of configurable parameters is also provided, which can be adjusted in a specific implementation to suit different vehicle types, road environments, application scenarios (such as specific business scenarios), and so on, to achieve optimal results.

In S1, the direction of travel of the front and rear vehicles is defined as the x-axis and the direction perpendicular to it as the y-axis. The rear vehicle HC travels at speed $V_{xHC}(t)$, and its speed in the y-axis direction is 0 ($V_{yHC}(t) = 0$); the front vehicle QC travels at speeds $V_{xQC}(t)$ and $V_{yQC}(t)$. While the front vehicle is changing lanes, its speed in the y-axis direction satisfies $V_{yQC}(t) \neq 0$, where t is the point in time at which the front vehicle starts the lane change; the lane change then produces an offset angle $\theta(t)$ on the y-axis, given by formula (1).

$$X_{HC}(t) < X_{QC}(t) - L_{QC}\cos[\theta(t)] \qquad (2)$$

In formula (3), $a_{xHC}(\tau)$ and $a_{xQC}(\tau)$ are the accelerations of the rear and front vehicles, respectively, in the direction of travel (x-axis), and $v_{xHC}(0)$ and $v_{xQC}(0)$ are the initial speeds of the rear and front vehicles, respectively.

In S2, if a front vehicle traveling in the same lane as the rear vehicle suddenly brakes, the rear vehicle must likewise recognize the situation, make an appropriate judgment, and brake to avoid a collision. The braking process of a driver-operated vehicle is divided into three stages: the driver reaction time t1, the brake-force build-up time t2, and the sustained braking time t3. Because braking here is automatic, t1 = 0, and the braking distance is given by formula (4).

$$x_\Delta(t) = x_\Delta(0) + x_{QC}(t) - x_{HC}(t) = x_\Delta(0) + x_{QC}(t) - x_{bHC}(t) \qquad (5)$$

$$x_\Delta(t) > L_p \qquad (6)$$

In S3, a pedestrian or other stationary obstacle suddenly appears directly ahead of the traveling vehicle HC. Because the pedestrian or obstacle is almost stationary in the direction of travel relative to the vehicle, formula (5) can be simplified by setting $x_{QC}(t) = 0$, and the distance between the rear vehicle and the pedestrian or other obstacle XR during braking is:

$$x_\Delta(t) = x_\Delta(0) - x_{bHC}(t) \qquad (7)$$

$$x_\Delta(t) > 0 \qquad (8)$$

In S4, sudden braking in an emergency may cause passengers physical and psychological discomfort, and violent inertia may even injure the vehicle occupants. Passenger discomfort is related not only to the acceleration or deceleration of the vehicle but also to the jerk caused by discontinuous braking, which can be quantified as the rate of change of acceleration according to formula (9),

wherein the maximum value of the acceleration is 4 m/s².

In S5, in each calculation period T, noise in inter-vehicle communication or in on-board sensor signal acquisition may prevent the rear vehicle HC from accurately determining its distance to the front vehicle QC or to a pedestrian or obstacle XR ahead, so that it brakes too early or too late, or does not brake or change lanes at all; alternatively, a braking action may be carried out but cause passenger discomfort or even an accident;

$$A(t) = \{a_d(t),\ \theta(t)\} \qquad (12)$$

wherein $a_d(t)$ is the deceleration of the rear vehicle and $\theta(t)$ is the steering angle of the rear vehicle. The deceleration value is quantized to the range [0, 10], where 0 means no braking and 10 means the maximum braking deceleration of $-4\ \mathrm{m/s^2}$. The steering angle $\theta(t)$ takes values in [-1, 1], where negative numbers represent left turns, positive numbers represent right turns, and 0 represents no turn;

$$R_1(t) = -\delta\left[(x_\Delta(t) - L_p - X)^2 + \alpha\right] \qquad (13)$$

wherein $x_\Delta(t)$ is the distance between the rear vehicle and the front vehicle (or pedestrian or obstacle) along the rear vehicle's direction of travel, $L_p$ is the average vehicle length, X is a safe-distance threshold, and δ and α are weighting factors; δ > 0 when $x_\Delta(t) > L_p + X$, and δ = 0 otherwise;

for braking too late or not braking at all, the reward function is set as:

$$R_2(t) = -\rho\left[(x_\Delta(t) - L_p)^2 + (v_{HC}(t) - v_{QC}(t))^2 + \beta\right] \qquad (14)$$

for the passenger comfort objective, the reward function is set as:

$$R_3(t) = -\sigma\left[(a_{HC}(t) - a_{limit})^2 + (j_{HC}(t) - j_{limit})^2 + \gamma\right] \qquad (15)$$

$$R(t) = \omega_1 R_1(t) + \omega_2 R_2(t) + \omega_3 R_3(t) \qquad (16)$$

In S6, the two deep reinforcement learning models comprise an Actor-Critic model and a Deep Q-Learning model.

The invention uses two indices to monitor the maturity of the Actor-Critic model: the average reward value and the accident rate.

The average reward value is defined as the mean of the reward values (formula 16) obtained from the learning samples over all training rounds (in a deployed system, one sample is collected every period T); in theory this value approaches 0. The accident rate is the number of accidents occurring in the scenarios represented by the training data over all training rounds, divided by the maximum number of accidents those scenarios contain. In the model training phase, all samples are divided into E training rounds according to the amount of sample data provided, and each round comprises N iterations (each one calculation period T), so that each round corresponds to an actual duration of N × T.
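The two maturity indices can be computed as in the sketch below, which assumes that rewards are stored as one list per training round and that accident counts are recorded per round; the function names and data layout are hypothetical.

```python
from typing import Sequence


def average_reward(rewards_per_round: Sequence[Sequence[float]]) -> float:
    """Mean of the formula-(16) rewards over every sample in every round."""
    all_rewards = [r for round_rewards in rewards_per_round for r in round_rewards]
    return sum(all_rewards) / len(all_rewards)


def accident_rate(accidents_per_round: Sequence[int], max_accidents: int) -> float:
    """Accidents observed over all rounds divided by the maximum number
    of accidents the training scenarios contain."""
    return sum(accidents_per_round) / max_accidents
```

With E rounds of N iterations each, `rewards_per_round` would hold E lists of N values, and a well-trained model would be expected to show an average reward near 0 and a low accident rate.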

The Q network in the main network and the Q network in the target network of the Deep Q-Learning model each take the environment state S(t) as input and output the value function Q(S, A), the value function being used to calculate the loss function for neural network training:

wherein N is the number of samples and $\gamma \in (0, 1]$ is the reward discount factor. In each calculation iteration, that is, each calculation period T, the model determines the action A in the current state S(t) based on an ε-greedy strategy: the system generates a random number Random; if Random < ε, an action $A_\varepsilon$ is selected at random; otherwise the action $A_Q$ that maximizes the Q value is obtained by taking the argmax of $Q(S_t, A_t)$ from the Q network.

In conclusion, the method has better environmental adaptability and capacity to improve with learning, and algorithm instances for different application scenarios can be obtained by learning from a large amount of sampled data. The emergencies covered are more comprehensive, including a lane change by the front vehicle, emergency braking by the front vehicle, and obstacles such as pedestrians appearing ahead; the control actions include braking to decelerate or steering to change lanes. An optimal control strategy can be reached through model training and learning for a purely autonomous driving environment, an environment combining autonomous and assisted driving, and a mixed environment that includes human driving. The method can serve as a core system of autonomous driving technology, and by deploying and training it across a wide range of road driving environments, it can help the road traffic system avoid accidents and achieve a safer road traffic environment.

Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments or portions thereof without departing from the spirit and scope of the invention.

Claims

1. An automatic driving brake and anti-collision control method based on artificial intelligence is characterized in that: the method comprises the following steps:

S5, establishing an AI algorithm model: when the rear vehicle detects that the front vehicle has suddenly changed into its lane, that the front vehicle in the same lane is braking, or that a pedestrian or obstacle is ahead, the rear vehicle must perform braking control according to the current relative distance, relative speed, and relative acceleration between itself and the front vehicle or the obstacle ahead, so as to decelerate or change to an adjacent lane to avoid a collision;

2. The artificial-intelligence-based automatic driving braking and anti-collision control method as claimed in claim 1, wherein: in S1, the direction of travel of the front and rear vehicles is defined as the x-axis and the direction perpendicular to it as the y-axis; the rear vehicle HC travels at speed $V_{xHC}(t)$, and its speed in the y-axis direction is 0 ($V_{yHC}(t) = 0$); the front vehicle QC travels at speeds $V_{xQC}(t)$ and $V_{yQC}(t)$; while the front vehicle is changing lanes, its speed in the y-axis direction satisfies $V_{yQC}(t) \neq 0$, where t is the point in time at which the front vehicle starts the lane change; the lane change then produces an offset angle $\theta(t)$ on the y-axis, given by formula (1);

$X_{HC}(t)$ is the distance the rear vehicle moves from the moment the emergency is detected and braking begins until braking ends, $X_{QC}(t)$ is the distance the front vehicle moves in the same direction as the rear vehicle, and $L_{QC}$ is the body length of the front vehicle; to avoid a collision between the rear vehicle and the front vehicle, the following condition must be satisfied:

$$X_{HC}(t) < X_{QC}(t) - L_{QC}\cos[\theta(t)] \qquad (2)$$

3. The artificial-intelligence-based automatic driving braking and anti-collision control method as claimed in claim 1, wherein: in S2, if a front vehicle traveling in the same lane as the rear vehicle suddenly brakes, the rear vehicle must likewise recognize the situation, make an appropriate judgment, and brake to avoid a collision; the braking process of a driver-operated vehicle is divided into three stages: the driver reaction time t1, the brake-force build-up time t2, and the sustained braking time t3; because braking here is automatic, t1 = 0, and the braking distance is given by formula (4);

$$x_\Delta(t) = x_\Delta(0) + x_{QC}(t) - x_{HC}(t) = x_\Delta(0) + x_{QC}(t) - x_{bHC}(t) \qquad (5)$$

$$x_\Delta(t) > L_p \qquad (6)$$

4. The artificial-intelligence-based automatic driving braking and anti-collision control method as claimed in claim 1, wherein: in S3, a pedestrian or other stationary obstacle suddenly appears directly ahead of the traveling vehicle HC; because the pedestrian or obstacle is almost stationary in the direction of travel relative to the vehicle, formula (5) can be simplified by setting $x_{QC}(t) = 0$, and the distance between the rear vehicle and the pedestrian or other obstacle XR during braking is:

$$x_\Delta(t) = x_\Delta(0) - x_{bHC}(t) \qquad (7)$$

to prevent the vehicle from colliding with the pedestrian or other obstacle, the following must then be satisfied:

$$x_\Delta(t) > 0 \qquad (8)$$

5. The artificial-intelligence-based automatic driving braking and anti-collision control method as claimed in claim 1, wherein: in S4, sudden braking in an emergency may cause passengers physical and psychological discomfort, and violent inertia may even injure the vehicle occupants; passenger discomfort is related not only to the acceleration or deceleration of the vehicle but also to the jerk caused by discontinuous braking, which can be quantified as the rate of change of acceleration according to formula (9);

in formula (9), a is acceleration, v is speed, x is distance, and t is time; to avoid uncontrolled movement of passengers' bodies or discomfort, the maximum acceleration or deceleration ($a_{max}$) and the maximum jerk ($j_{max}$) during control can be limited according to formula (10),

wherein the maximum value of the acceleration is 4 m/s².

6. The artificial-intelligence-based automatic driving braking and anti-collision control method as claimed in claim 1, wherein: in S5, in each calculation period T, noise in inter-vehicle communication or in on-board sensor signal acquisition may prevent the rear vehicle HC from accurately determining its distance to the front vehicle QC or to a pedestrian or obstacle XR ahead, so that it brakes too early or too late, or does not brake or change lanes at all; alternatively, a braking action may be carried out but cause passenger discomfort or even an accident;

$$A(t) = \{a_d(t),\ \theta(t)\} \qquad (12)$$

$$R_1(t) = -\delta\left[(x_\Delta(t) - L_p - X)^2 + \alpha\right] \qquad (13)$$

for braking too late or not braking at all, the reward function is set as:

$$R_2(t) = -\rho\left[(x_\Delta(t) - L_p)^2 + (v_{HC}(t) - v_{QC}(t))^2 + \beta\right] \qquad (14)$$

for the passenger comfort objective, the reward function is set as:

$$R_3(t) = -\sigma\left[(a_{HC}(t) - a_{limit})^2 + (j_{HC}(t) - j_{limit})^2 + \gamma\right] \qquad (15)$$

$$R(t) = \omega_1 R_1(t) + \omega_2 R_2(t) + \omega_3 R_3(t) \qquad (16)$$

7. The artificial-intelligence-based automatic driving braking and anti-collision control method as claimed in claim 1, wherein: in S6, the two deep reinforcement learning models comprise an Actor-Critic model and a Deep Q-Learning model;

wherein N is the number of samples and $\gamma \in (0, 1]$ is the reward discount factor; the Actor network outputs the specific action in state S(t), namely deceleration ($a_d(t)$) or steering ($\theta(t)$), and maps the current state to the optimal action; denoting this mapping by $\pi: S(t) \to A^*$, the Actor updates itself with the gradient:

8. The artificial-intelligence-based automatic driving braking and anti-collision control method as claimed in claim 7, wherein: the Q networks in the main network and the target network of the Deep Q-Learning model are each a deep neural network whose input is the environment state S(t) and whose output is the value function Q(S, A), the value function being used to calculate the loss function for neural network training:

wherein N is the number of samples and $\gamma \in (0, 1]$ is the reward discount factor; in each calculation iteration, that is, each calculation period T, the model determines the action A in the current state S(t) based on an ε-greedy strategy: the system generates a random number Random; if Random < ε, an action $A_\varepsilon$ is selected at random; otherwise the action $A_Q$ that maximizes the Q value is obtained by taking the argmax of $Q(S_t, A_t)$ from the Q network.