CN109556609B - Artificial intelligence-based collision avoidance method and device

Info

Publication number: CN109556609B
Application number: CN201811358583.XA
Authority: CN
Inventors: 徐应年; 铁井华; 邹绍云; 罗永涛
Original assignee: Wuhan Nanhua Industrial Equipment Engineering Co ltd
Current assignee: Wuhan Nanhua Industrial Equipment Engineering Co ltd
Priority date: 2018-11-15
Filing date: 2018-11-15
Publication date: 2020-11-17
Anticipated expiration: 2038-11-15
Also published as: CN109556609A

Abstract

The invention discloses a collision avoidance method and device based on artificial intelligence. The method comprises the following steps: establishing a collision avoidance strategy model; sending current motion data to the collision avoidance strategy model to obtain an optimal solution space; and applying a Monte Carlo tree search algorithm to the optimal solution space and calculating adjustment parameters. No analytic motion model needs to be established, which overcomes the low control precision and poor robustness that arise when an accurate mathematical control model cannot be established under large disturbances such as wind, waves and currents, and achieves the technical effects of improved intelligence, stability and applicability.

Description

Artificial intelligence-based collision avoidance method and device

Technical Field

The invention relates to the technical field of artificial intelligence and collision avoidance, in particular to a collision avoidance method and device based on artificial intelligence.

Background

Research on control technology for unmanned surface vessels started late in China, but it is gradually moving from the initial concept design stage to the practical application stage. Because the working environment of an unmanned surface vessel is complex and unknown, the existing intelligent system architecture needs continuous improvement: the system's ability to predict future states must be improved and its autonomous learning capability strengthened, so that the intelligent system becomes more forward-looking.

Ship collision avoidance requires processing a large amount of information, the operations involved are quite complex, and research on the related collision avoidance algorithms needs to be deepened. Because current collision avoidance algorithms have limitations, ship collision avoidance is not yet intelligent, and most algorithms address only collision avoidance in open sea areas with few dynamic obstacles.

Therefore, no existing collision avoidance method offers good intelligence and strong stability while being suitable for various complex sea conditions.

Disclosure of Invention

The invention provides a collision avoidance method and device based on artificial intelligence, which solve the technical problems that the prior art has low intelligence and poor stability and cannot be applied to various complex sea conditions, and achieve the technical effects of improved intelligence, stability and applicability.

The invention provides an artificial intelligence-based collision avoidance method, which comprises the following steps:

establishing a collision avoidance strategy model;

sending the current motion data to the collision avoidance strategy model to obtain an optimal solution space;

and applying a Monte Carlo tree search algorithm to the optimal solution space, and calculating to obtain an adjusting parameter.

Further, the sending the current motion data to the collision avoidance strategy model to obtain an optimal solution space includes:

sending the current motion data to the collision avoidance strategy model to obtain the optimal solution space and the estimated motion data;

further comprising:

establishing an evaluation network model related to the autonomous obstacle avoidance performance index;

sending the estimated motion data to the evaluation network model to obtain an evaluation value;

the applying the Monte Carlo tree search algorithm to the optimal solution space and calculating to obtain the adjustment parameters comprises the following steps:

applying the Monte Carlo tree search algorithm to the optimal solution space, and calculating to obtain the adjustment parameters;

and adjusting the adjustment parameters based on the evaluation values to obtain optimal adjustment parameters.

Further, the establishing of the evaluation network model related to the autonomous obstacle avoidance performance index includes:

by the formula

$$f_1 = \sum_i d_i$$

establishing an evaluation network model related to the path length $f_1$;

wherein $d_i$ is the distance between the path point at time $i$ and the path point at time $i+1$, with the expression

$$d_i = \sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}$$

In the formula, $P_i$ is the path point at time $i$, with coordinates $(x_i, y_i)$; $P_{i+1}$ is the path point at time $i+1$, with coordinates $(x_{i+1}, y_{i+1})$.

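by a formula defined for $n > 0$, establishing an evaluation network model related to the path smoothness $f_2$;

wherein $\alpha_i$ is the corner at the path point $P_i$ at time $i$, with the expression

$$\alpha_i = \arccos\frac{\overrightarrow{P_{i-1}P_i} \cdot \overrightarrow{P_iP_{i+1}}}{\left|P_{i-1}P_i\right| \left|P_iP_{i+1}\right|}$$

In the formula, $P_i$ is the path point at time $i$, with coordinates $(x_i, y_i)$; $P_{i+1}$ is the path point at time $i+1$, with coordinates $(x_{i+1}, y_{i+1})$; $\overrightarrow{P_{i-1}P_i}$ is the vector from the path point $P_{i-1}$ to the path point $P_i$, and $|P_{i-1}P_i|$ is the length of that vector; $\overrightarrow{P_iP_{i+1}}$ is the vector from the path point $P_i$ to the path point $P_{i+1}$, and $|P_iP_{i+1}|$ is the length of that vector; $k$ is the number of corners greater than or equal to $\pi/2$.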

by a formula, establishing an evaluation network model related to the path security $f_3$;

wherein $\lambda$ is a weight adjustment coefficient; $m$ is a second penalty parameter; $d$ is the average shortest distance between all the path points and the obstacle, with the expression

$$d = \frac{1}{n}\sum_{i=1}^{n} d_i$$

In the formula, $n$ is the number of path points in the path excluding the starting point and the target point; $d_i$ is the shortest distance between the path point and the obstacle, its expression being defined in terms of two points that are respectively the two end points of the connecting line at the path point $P_i$.

The invention also provides a collision avoidance device based on artificial intelligence, which comprises:

the collision avoidance strategy model establishing module is used for establishing a collision avoidance strategy model;

the collision avoidance strategy model operation module is used for sending the current motion data to the collision avoidance strategy model to obtain an optimal solution space;

and the calculation module is used for applying the Monte Carlo tree search algorithm to the optimal solution space and calculating to obtain the adjustment parameters.

Further, the collision avoidance strategy model operation module includes:

the first model operation unit is used for sending the current motion data to the collision avoidance strategy model to obtain the optimal solution space;

the second model operation unit is used for sending the current motion data to the collision avoidance strategy model to obtain estimated motion data;

further comprising:

the evaluation network model establishing module is used for establishing an evaluation network model related to the autonomous obstacle avoidance performance index;

the evaluation network model operation module is used for sending the estimated motion data to the evaluation network model to obtain an evaluation value;

the calculation module comprises:

the calculation unit is used for applying the Monte Carlo tree search algorithm to the optimal solution space and calculating to obtain the adjustment parameters;

and the adjusting unit is used for adjusting the adjusting parameters based on the evaluation values to obtain the optimal adjusting parameters.

Further, the evaluation network model building module includes:

a first evaluation network model establishing unit, used for establishing, by the formula

$$f_1 = \sum_i d_i$$

an evaluation network model related to the path length $f_1$; wherein $d_i$ is the distance between the path point at time $i$ and the path point at time $i+1$, with the expression

$$d_i = \sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}$$

Further, the evaluation network model building module includes:

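a second evaluation network model establishing unit, used for establishing, by a formula defined for $n > 0$, an evaluation network model related to the path smoothness $f_2$; wherein $\alpha_i$ is the corner at the path point $P_i$ at time $i$, with the expression

$$\alpha_i = \arccos\frac{\overrightarrow{P_{i-1}P_i} \cdot \overrightarrow{P_iP_{i+1}}}{\left|P_{i-1}P_i\right| \left|P_iP_{i+1}\right|}$$

In the formula, $P_i$ is the path point at time $i$, with coordinates $(x_i, y_i)$; $P_{i+1}$ is the path point at time $i+1$, with coordinates $(x_{i+1}, y_{i+1})$; $|P_{i-1}P_i|$ is the length of the vector from the path point $P_{i-1}$ to the path point $P_i$, and $|P_iP_{i+1}|$ is the length of the vector from the path point $P_i$ to the path point $P_{i+1}$; $k$ is the number of corners greater than or equal to $\pi/2$.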

Further, the evaluation network model building module includes:

a third evaluation network model establishing unit, used for establishing, by a formula, an evaluation network model related to the path security $f_3$; wherein $\lambda$ is a weight adjustment coefficient; $m$ is a second penalty parameter; $d$ is the average shortest distance between all the path points and the obstacle, with the expression

$$d = \frac{1}{n}\sum_{i=1}^{n} d_i$$

wherein the shortest distance $d_i$ between a path point and the obstacle is defined in terms of two points that are respectively the two end points of the connecting line at the path point $P_i$.

One or more technical schemes provided by the invention at least have the following technical effects or advantages:

The invention establishes a collision avoidance strategy model based on deep learning (a convolutional neural network) and sends the current motion data to the collision avoidance strategy model to obtain an optimal solution space; the Monte Carlo tree search algorithm is then applied to the optimal solution space to calculate the adjustment parameters. As a result, no analytic motion model needs to be established, and the low control precision and poor robustness that arise when an accurate mathematical control model cannot be established under large disturbances such as wind, waves and currents are overcome, achieving the technical effects of improved intelligence, stability and applicability.

Drawings

Fig. 1 is a flowchart of an artificial intelligence-based collision avoidance method according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of the artificial intelligence-based collision avoidance method according to an embodiment of the present invention;

Fig. 3 is a flowchart of building the Monte Carlo tree in the artificial intelligence-based collision avoidance method according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of a corner in the artificial intelligence-based collision avoidance method according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of the shortest distance between a path point and an obstacle in the artificial intelligence-based collision avoidance method according to an embodiment of the present invention;

Fig. 6 is a block diagram of an artificial intelligence-based collision avoidance device according to an embodiment of the present invention.

Detailed Description

The embodiment of the invention provides an artificial intelligence-based collision avoidance method and device, solves the technical problems of low intelligence, poor stability and incapability of being suitable for various complex sea conditions in the prior art, and achieves the technical effects of improving intelligence, stability and applicability.

In order to solve the above problems, the technical solution in the embodiments of the present invention has the following general idea:

The embodiment of the invention establishes a collision avoidance strategy model based on deep learning (a convolutional neural network) and sends the current motion data to the collision avoidance strategy model to obtain an optimal solution space; the Monte Carlo tree search algorithm is then applied to the optimal solution space to calculate the adjustment parameters. As a result, no analytic motion model needs to be established, and the low control precision and poor robustness that arise when an accurate mathematical control model cannot be established under large disturbances such as wind, waves and currents are overcome, achieving the technical effects of improved intelligence, stability and applicability.

For better understanding of the above technical solutions, the following detailed descriptions will be provided in conjunction with the drawings and the detailed description of the embodiments.

Referring to fig. 1 and fig. 2, an artificial intelligence-based collision avoidance method provided in an embodiment of the present invention includes:

step S110: establishing a collision avoidance strategy model;

This step is explained as follows:

receiving raw navigation data;

and establishing the collision avoidance strategy model based on the raw navigation data.

In this embodiment, the raw navigation data is at least any one of the following:

the speed of the vessel, the heading of the vessel, the wind speed, the wind direction, the position of a static obstacle, the bearing of the static obstacle relative to the vessel, the speed of the static obstacle relative to the vessel, the position of a dynamic obstacle, the bearing of the dynamic obstacle relative to the vessel, the speed of the dynamic obstacle relative to the vessel, and the number of obstacles.
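The raw navigation data listed above can be grouped into a small record type before it is fed to a model. The following is only a minimal sketch: the class names `Obstacle` and `NavigationSample`, the field names, the units and the zero-padded feature layout are all assumptions made for illustration and are not specified in the source.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Obstacle:
    # Hypothetical representation of one obstacle; units are assumed.
    position: Tuple[float, float]   # (x, y) in a local frame
    relative_bearing_deg: float     # bearing relative to own craft
    relative_speed: float           # speed relative to own craft
    dynamic: bool = False           # False for a static obstacle


@dataclass
class NavigationSample:
    # Hypothetical container for one raw navigation record.
    speed: float
    heading_deg: float
    wind_speed: float
    wind_direction_deg: float
    obstacles: List[Obstacle] = field(default_factory=list)

    @property
    def obstacle_count(self) -> int:
        return len(self.obstacles)

    def to_features(self, max_obstacles: int = 4) -> List[float]:
        """Flatten the record into a fixed-length, zero-padded vector."""
        feats = [self.speed, self.heading_deg, self.wind_speed,
                 self.wind_direction_deg, float(self.obstacle_count)]
        kept = self.obstacles[:max_obstacles]
        for ob in kept:
            feats += [ob.position[0], ob.position[1],
                      ob.relative_bearing_deg, ob.relative_speed,
                      1.0 if ob.dynamic else 0.0]
        # Pad so that every sample has the same length.
        feats += [0.0] * (5 * (max_obstacles - len(kept)))
        return feats
```

A fixed-length vector is used here only because most network inputs need a constant size; the source does not say how the data is encoded.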

In order to improve the operation accuracy of the collision avoidance strategy model, the established collision avoidance strategy model is trained.

The specific training process comprises the following steps:

inputting historical collision avoidance case data into the collision avoidance strategy model to determine the optimal weights and activation functions inside the model network. The collision avoidance strategy model at this point is a deep learning network.

In this embodiment, the historical collision avoidance case data is at least any one of the following:

the longitude and latitude of the obstacle, the longitude and latitude of the collision avoidance route, the heading during collision avoidance, and the speed during collision avoidance.

It should be noted here that data such as expert steering data, ship collision accident case data and the international maritime collision avoidance rules may also be collected to construct a data sample library for supervised learning of the collision avoidance strategy model.

In order to make the embodiment of the invention extensible, an interface is reserved in the data sample library so that other rules can be added as samples at any time.

In order to further improve the operation accuracy of the collision avoidance strategy model, constraint conditions are set for the trained collision avoidance strategy model.

In this embodiment, the constraint is a solution space for the heading and/or speed of the vessel.
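A constraint of this kind can be pictured as a bounded, discretised set of heading and speed candidates. The sketch below is an illustration under stated assumptions: the function name `candidate_actions`, the turn limit, the step sizes and the speed bounds are hypothetical values chosen for the example, not parameters given in the source.

```python
from itertools import product
from typing import List, Sequence, Tuple


def candidate_actions(
    heading_deg: float,
    speed: float,
    max_turn_deg: float = 30.0,                        # assumed turn limit
    turn_step_deg: float = 10.0,                       # assumed grid step
    speed_deltas: Sequence[float] = (-1.0, 0.0, 1.0),  # assumed speed changes
    min_speed: float = 0.0,
    max_speed: float = 10.0,
) -> List[Tuple[float, float]]:
    """Enumerate (heading, speed) pairs that satisfy the constraints."""
    steps = int(max_turn_deg // turn_step_deg)
    turns = [k * turn_step_deg for k in range(-steps, steps + 1)]
    actions = []
    for turn, dv in product(turns, speed_deltas):
        new_speed = speed + dv
        # Discard candidates that violate the speed bounds.
        if min_speed <= new_speed <= max_speed:
            actions.append(((heading_deg + turn) % 360.0, new_speed))
    return actions
```

Each returned pair is one point of the constrained solution space; a search procedure can then be restricted to these candidates.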

Step S120: sending the current motion data to a collision avoidance strategy model to obtain an optimal solution space;

specifically, the current motion data is sent to a collision avoidance strategy model, and autonomous learning is performed based on constraint conditions to obtain an optimal solution space.

Step S130: and applying the Monte Carlo tree search algorithm to the optimal solution space, and calculating to obtain the adjustment parameters.

This step is explained as follows:

Referring to Fig. 3, the rectangle represents the root node, from which the tree is built downward. The state of the root node generally refers to an obstacle that the vessel encounters and must avoid. The ellipses represent child nodes, i.e. ordinary nodes at which state transitions occur during collision avoidance. When the vessel selects an action, a transition between nodes is generated, and the transition records the collision avoidance strategy (i.e. the solution space) adopted over a period of time. The triangles represent leaf nodes, which indicate that the search tree has reached the boundary of the vessel's motion or an uncertain environment; the state of a leaf node is either collision avoidance success or collision avoidance failure. The Monte Carlo tree search algorithm is generally divided into four stages: selection, expansion, simulation and backtracking update (backpropagation). The algorithm repeats these four stages until a specified collision avoidance condition is met.
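The four stages described above can be illustrated with a generic Monte Carlo tree search on a toy problem. This is a sketch only, not the patented procedure: the grid world, the action set `ACTIONS`, the obstacle cells, the 0/1 reward and the UCB1 selection rule are all assumptions introduced for the example.

```python
import math
import random

ACTIONS = (-1, 0, 1)                  # assumed lateral shift per step
WIDTH, GOAL_X = 5, 6                  # lateral cells 0..4, goal column
OBSTACLES = {(2, 2), (3, 2), (4, 1)}  # assumed obstacle cells


def step(state, action):
    x, y = state
    return (x + 1, y + action)        # always advance one column


def is_collision(state):
    return not (0 <= state[1] < WIDTH) or state in OBSTACLES


def is_terminal(state):
    return is_collision(state) or state[0] >= GOAL_X


def reward(state):
    return 0.0 if is_collision(state) else 1.0


class Node:
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action
        self.children, self.untried = [], list(ACTIONS)
        self.visits, self.value = 0, 0.0


def select(node, c=1.4):
    # Selection: descend through fully expanded nodes using UCB1.
    while not is_terminal(node.state) and not node.untried:
        parent = node
        node = max(parent.children, key=lambda ch: ch.value / ch.visits
                   + c * math.sqrt(math.log(parent.visits) / ch.visits))
    return node


def expand(node):
    # Expansion: add one unexplored child, unless the node is terminal.
    if is_terminal(node.state):
        return node
    action = node.untried.pop()
    child = Node(step(node.state, action), parent=node, action=action)
    node.children.append(child)
    return child


def simulate(state, rng):
    # Simulation: random rollout until success or collision.
    while not is_terminal(state):
        state = step(state, rng.choice(ACTIONS))
    return reward(state)


def backpropagate(node, value):
    # Backtracking update: push the rollout result up to the root.
    while node is not None:
        node.visits += 1
        node.value += value
        node = node.parent


def mcts(root_state, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        leaf = expand(select(root))
        backpropagate(leaf, simulate(leaf.state, rng))
    # Return the most visited first action as the recommended adjustment.
    return max(root.children, key=lambda ch: ch.visits).action
```

Starting from `(1, 2)`, the cell straight ahead is an obstacle, so the search should concentrate its visits on one of the two turning actions. A fixed iteration count stands in for the stopping condition, which the source leaves unspecified.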

In order to optimize the finally obtained adjustment parameter, step S120 specifically includes:

sending the current motion data to a collision avoidance strategy model to obtain an optimal solution space and estimated motion data;

the embodiment of the invention also comprises the following steps:

sending the estimated motion data to an evaluation network model to obtain an evaluation value;

step S130 specifically includes:

applying a Monte Carlo tree search algorithm to the optimal solution space, and calculating to obtain an adjustment parameter;

and adjusting the adjustment parameters based on the evaluation values to obtain the optimal adjustment parameters.
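The source does not state how the evaluation value modifies the search result. One possible reading, shown purely as an assumption, is to blend a per-candidate search score with the evaluation value and keep the best candidate. The function name `select_adjustment`, the `(heading, speed)` tuple representation and the blending weight are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Action = Tuple[float, float]  # assumed form: (heading in degrees, speed)


def select_adjustment(
    candidates: Sequence[Action],
    search_score: Dict[Action, float],  # e.g. normalised search statistics
    evaluation: Dict[Action, float],    # value from the evaluation network
    weight: float = 0.5,                # assumed blending weight in [0, 1]
) -> Action:
    """Pick the candidate with the best blended score (illustrative only)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")

    def combined(action: Action) -> float:
        return (1.0 - weight) * search_score[action] + weight * evaluation[action]

    return max(candidates, key=combined)
```

With `weight=0.0` the choice falls back to the search score alone, which makes the effect of the evaluation value easy to compare.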

It should be noted that the evaluation network model may be established at the same time as, before, or after the collision avoidance strategy model, which is not specifically limited in this embodiment of the present invention.

In this embodiment, the autonomous obstacle avoidance performance index at least includes: path length, path smoothness, and path security.

Specifically, the specific process of establishing the evaluation network model related to the path length is as follows:

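by the formula

$$f_1 = \sum_i d_i$$

establishing an evaluation network model related to the path length $f_1$, where $d_i$ is the distance between the path point at time $i$ and the path point at time $i+1$;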

In the same case, the shorter the navigation path length, the shorter the navigation time, and the higher the evaluation.
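The path length index can be computed directly from the waypoint coordinates as the sum of the Euclidean distances between consecutive path points. The sketch below assumes a path given as a list of `(x, y)` tuples; the function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def segment_lengths(path: Sequence[Point]) -> List[float]:
    """Distance d_i between each path point and the next one."""
    return [math.hypot(x2 - x1, y2 - y1)
            for (x1, y1), (x2, y2) in zip(path, path[1:])]


def path_length(path: Sequence[Point]) -> float:
    """Path length f1 as the sum of the consecutive distances d_i."""
    return sum(segment_lengths(path))
```

For example, the path `[(0, 0), (3, 4), (3, 10)]` has segments of length 5 and 6, so its length is 11.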

The specific process of establishing the evaluation network model related to the path smoothness is as follows:

by a formula defined for $n > 0$, establishing an evaluation network model related to the path smoothness $f_2$;

wherein, referring to Fig. 4, $\alpha_i$ is the corner at the path point $P_i$ at time $i$, with the expression

$$\alpha_i = \arccos\frac{\overrightarrow{P_{i-1}P_i} \cdot \overrightarrow{P_iP_{i+1}}}{\left|P_{i-1}P_i\right| \left|P_iP_{i+1}\right|}$$

In the formula, $P_i$ is the path point at time $i$, with coordinates $(x_i, y_i)$; $P_{i+1}$ is the path point at time $i+1$, with coordinates $(x_{i+1}, y_{i+1})$; $|P_{i-1}P_i|$ is the length of the vector from the path point $P_{i-1}$ to the path point $P_i$, and $|P_iP_{i+1}|$ is the length of the vector from the path point $P_i$ to the path point $P_{i+1}$; $k$ is the number of corners greater than or equal to $\pi/2$, also called the first penalty parameter: when a corner is greater than or equal to $\pi/2$, a penalty is applied to the target value; when $n = 0$, the path is the straight line connecting the starting point to the target point, and the path smoothness $f_2$ is 0.

The path smoothness $f_2$ is expressed as a corner average: the smaller the corner average, the gentler the turns, the smoother the path, and the higher the evaluation.
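The ingredients of the smoothness index can be sketched as follows. The corner at each interior path point is taken here as the angle between the incoming and outgoing segment vectors, which is an interpretation of the text; the exact way the penalty count `k` enters the smoothness formula is not recoverable from the source, so the sketch returns the corner average and `k` separately rather than a single value.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def turning_angles(path: Sequence[Point]) -> List[float]:
    """Corner at each interior path point, in radians (0 = straight on)."""
    angles = []
    for p0, p1, p2 in zip(path, path[1:], path[2:]):
        ux, uy = p1[0] - p0[0], p1[1] - p0[1]
        vx, vy = p2[0] - p1[0], p2[1] - p1[1]
        norm = math.hypot(ux, uy) * math.hypot(vx, vy)
        if norm == 0.0:
            # Assumption: a repeated path point counts as no turn.
            angles.append(0.0)
            continue
        # Clamp to guard against rounding just outside [-1, 1].
        cos_a = max(-1.0, min(1.0, (ux * vx + uy * vy) / norm))
        angles.append(math.acos(cos_a))
    return angles


def smoothness_terms(path: Sequence[Point]) -> Tuple[float, int]:
    """Return (corner average, k), with k the count of corners >= pi/2."""
    angles = turning_angles(path)
    if not angles:
        # No interior path points: the path is a straight line.
        return 0.0, 0
    k = sum(1 for a in angles if a >= math.pi / 2)
    return sum(angles) / len(angles), k
```

A path that runs straight and then doubles back, such as `[(0, 0), (1, 0), (2, 0), (1, 0)]`, has corners of 0 and π, giving a corner average of π/2 and one penalised corner.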

The specific process of establishing the evaluation network model related to the path security comprises the following steps:

by a formula, establishing an evaluation network model related to the path security $f_3$;

wherein, referring to Fig. 5, $\lambda$ is a weight adjustment coefficient, used to compensate for the value becoming too small after the reciprocal of the average distance is taken, and $\lambda$ is randomly selected from 0 to 1; $m$ is a second penalty parameter; in this embodiment, $m$ is the number of path points whose shortest distance to the obstacle is zero. $d$ is the average shortest distance between all the path points and the obstacle, with the expression

$$d = \frac{1}{n}\sum_{i=1}^{n} d_i$$

wherein $d_i$, the shortest distance between the path point $P_i$ and the obstacle, is taken as the smaller of two distances, the two points concerned being respectively the two end points of the connecting line at the path point $P_i$.

In the same case, the longer the average shortest distance of all the route points from the obstacle, the safer the route, the higher the rating.
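The two quantities that feed the security index, the average shortest distance and the count of path points with zero clearance, can be sketched as below. Obstacles are modelled here as a list of points, which is a simplifying assumption; the source's own construction of the per-point shortest distance (shown in Fig. 5) is not reproduced, and the final security formula is likewise not reconstructed.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def safety_terms(path: Sequence[Point],
                 obstacles: Sequence[Point]) -> Tuple[float, int]:
    """Return (d, m) for the path points excluding start and target.

    d is the mean of the per-point shortest obstacle distances and
    m is the number of path points whose shortest distance is zero.
    """
    interior = path[1:-1]
    if not interior or not obstacles:
        raise ValueError("need at least one interior path point and one obstacle")
    shortest = [min(math.hypot(px - ox, py - oy) for ox, oy in obstacles)
                for px, py in interior]
    d = sum(shortest) / len(shortest)
    m = sum(1 for s in shortest if s == 0.0)
    return d, m
```

For the path `[(0, 0), (1, 0), (2, 0), (3, 0)]` with obstacle points `(1, 3)` and `(2, 0)`, the interior path points have shortest distances 1 and 0, so the average is 0.5 and one path point is penalised.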

The estimated motion data may be sent to any one or any two or three of an evaluation network model related to path length, an evaluation network model related to path smoothness, and an evaluation network model related to path security. Correspondingly, the adjustment parameter is adjusted based on one evaluation value or two evaluation values or three evaluation values, and an optimal adjustment parameter is obtained. From this, it is understood that the adjustment parameter adjusted based on the three evaluation values is optimal.

Referring to fig. 6, the collision avoidance apparatus based on artificial intelligence provided in the embodiment of the present invention includes:

a collision avoidance strategy model establishing module 100 for establishing a collision avoidance strategy model;

specifically, the collision avoidance strategy model building module 100 includes:

the data receiving unit is used for receiving original navigation data;

and the establishing execution unit is used for establishing a collision avoidance strategy model based on the original navigation data.

In order to improve the operation accuracy of the collision avoidance strategy model, the device further comprises:

a model learning module, used for training the established collision avoidance strategy model.

In this embodiment, the model learning module is specifically configured to input the historical collision avoidance case data into the collision avoidance strategy model to determine the optimal weights and activation functions inside the model network. The collision avoidance strategy model at this point is a deep learning network.

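Specifically, the historical collision avoidance case data is at least any one of the following:

the longitude and latitude of the obstacle, the longitude and latitude of the collision avoidance route, the heading during collision avoidance, and the speed during collision avoidance.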

In order to perform supervised learning of the collision avoidance strategy model, the device further comprises:

a data collection module, used for collecting data such as expert steering data, ship collision accident case data and the international maritime collision avoidance rules;

and a sample library construction module, used for constructing a data sample library.

In order to further improve the operation accuracy of the collision avoidance strategy model, the device further comprises:

a constraint condition setting module, used for setting constraint conditions for the trained collision avoidance strategy model.

The collision avoidance strategy model operation module 200 is used for sending the current motion data to the collision avoidance strategy model to obtain an optimal solution space;

In this embodiment, the collision avoidance strategy model operation module 200 is specifically configured to send the current motion data to the collision avoidance strategy model and to perform autonomous learning based on the constraint conditions to obtain the optimal solution space.

A calculation module 300, used for applying the Monte Carlo tree search algorithm to the optimal solution space and calculating to obtain the adjustment parameters.

This step is explained:

In this embodiment, the collision avoidance strategy model operation module 200 includes:

the first model operation unit is used for sending the current motion data to the collision avoidance strategy model to obtain an optimal solution space;

in order to optimize the finally obtained adjustment parameters, the embodiment of the present invention further includes:

a calculation module 300 comprising:

the calculation unit is used for applying the Monte Carlo tree search algorithm to the optimal solution space and calculating to obtain an adjustment parameter;

Specifically, the evaluation network model building module includes:

In the formula, $P_i$ is the path point at time $i$, with coordinates $(x_i, y_i)$; $P_{i+1}$ is the path point at time $i+1$, with coordinates $(x_{i+1}, y_{i+1})$. In the same case, the shorter the navigation path, the shorter the navigation time and the higher the evaluation.

$|P_{i-1}P_i|$ is the length of the vector from the path point $P_{i-1}$ to the path point $P_i$, and $|P_iP_{i+1}|$ is the length of the vector from the path point $P_i$ to the path point $P_{i+1}$; $k$ is the number of corners greater than or equal to $\pi/2$, also called the first penalty parameter: when a corner is greater than or equal to $\pi/2$, a penalty is applied to the target value; when $n = 0$, the path is the straight line connecting the starting point to the target point, and the path smoothness $f_2$ is 0. In the present embodiment, the path smoothness $f_2$ is expressed as a corner average: the smaller the corner average, the gentler the turns, the smoother the path, and the higher the evaluation.

An evaluation network model related to the path security $f_3$ is established; wherein $\lambda$ is a weight adjustment coefficient, used to compensate for the value becoming too small after the reciprocal of the average distance is taken, and $\lambda$ is randomly selected from 0 to 1; $m$ is a second penalty parameter; in this embodiment, $m$ is the number of path points whose shortest distance to the obstacle is zero. $d$ is the average shortest distance between all the path points and the obstacle, with the expression

$$d = \frac{1}{n}\sum_{i=1}^{n} d_i$$

wherein $d_i$ is taken as the smaller of two distances, the two points concerned being respectively the two end points of the connecting line at the path point $P_i$. In the same case, the longer the average shortest distance between the path points and the obstacle, the safer the path and the higher the evaluation.

It should be noted that the estimated motion data may be sent to any one or any two or three of the three evaluation network models, i.e., the first evaluation network model building unit, the second evaluation network model building unit, and the third evaluation network model building unit. Correspondingly, the adjustment parameter is adjusted based on one evaluation value or two evaluation values or three evaluation values, and an optimal adjustment parameter is obtained. From this, it is understood that the adjustment parameter adjusted based on the three evaluation values is optimal.

Technical effects

1. The embodiment of the invention establishes a collision avoidance strategy model based on deep learning (a convolutional neural network) and sends the current motion data to the collision avoidance strategy model to obtain an optimal solution space; the Monte Carlo tree search algorithm is then applied to the optimal solution space to calculate the adjustment parameters. As a result, no analytic motion model needs to be established, and the low control precision and poor robustness that arise when an accurate mathematical control model cannot be established under large disturbances such as wind, waves and currents are overcome, achieving the technical effects of improved intelligence, stability and applicability.

2. The evaluation value given by the evaluation network model is used to evaluate the action selected by the vessel, and, in combination with the Monte Carlo tree search algorithm, the evaluation value of the whole path under that action is updated until the searched heading and speed values converge. The converged values are sent to the actuator as the optimal adjustment parameters, which improves the effectiveness of collision avoidance.

3. A training sample library is established from data such as expert steering data (including actual sea trials, historical experience and the like), accident cases and the international maritime collision avoidance rules, and a collision avoidance strategy model based on deep reinforcement learning is established. Expert steering experience and the historical information of accident cases are thereby fully combined, which improves the feasibility of collision avoidance decisions; incorporating the international maritime collision avoidance rules makes the decisions better match actual collision avoidance situations and further improves applicability.

4. An interface is reserved in the data sample library so that other rules can be added as samples at any time, which makes the embodiment of the invention extensible and further improves its applicability.

The embodiment of the invention is not only suitable for wide sea areas, but also suitable for complex marine environments such as high sea conditions, narrow water areas and the like, and effectively ensures the navigation safety of ships.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. An artificial intelligence-based collision avoidance method is characterized by comprising the following steps:

establishing a collision avoidance strategy model;

the sending the current motion data to the collision avoidance strategy model to obtain an optimal solution space includes:

further comprising:

2. The method of claim 1, wherein establishing an evaluation network model related to an autonomous obstacle avoidance performance metric comprises:

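by the formula

$$f_1 = \sum_i d_i$$

establishing an evaluation network model related to the path length $f_1$;

wherein $d_i$ is the distance between the path point at time $i$ and the path point at time $i+1$, with the expression

$$d_i = \sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}$$

In the formula, $P_i$ is the path point at time $i$, with coordinates $(x_i, y_i)$; $P_{i+1}$ is the path point at time $i+1$, with coordinates $(x_{i+1}, y_{i+1})$.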

3. The method of claim 1, wherein establishing an evaluation network model related to an autonomous obstacle avoidance performance metric comprises:

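by a formula defined for $n > 0$, establishing an evaluation network model related to the path smoothness $f_2$;

wherein $\alpha_i$ is the corner at the path point $P_i$ at time $i$, with the expression

$$\alpha_i = \arccos\frac{\overrightarrow{P_{i-1}P_i} \cdot \overrightarrow{P_iP_{i+1}}}{\left|P_{i-1}P_i\right| \left|P_iP_{i+1}\right|}$$

wherein $|P_{i-1}P_i|$ is the length of the vector from the path point $P_{i-1}$ to the path point $P_i$, and $|P_iP_{i+1}|$ is the length of the vector from the path point $P_i$ to the path point $P_{i+1}$; $k$ is the number of corners greater than or equal to $\pi/2$.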

4. The method of claim 1, wherein establishing an evaluation network model related to an autonomous obstacle avoidance performance metric comprises:

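by a formula, establishing an evaluation network model related to the path security $f_3$;

wherein $\lambda$ is a weight adjustment coefficient; $m$ is a second penalty parameter; $d$ is the average shortest distance between all the path points and the obstacle, with the expression

$$d = \frac{1}{n}\sum_{i=1}^{n} d_i$$

wherein the shortest distance $d_i$ between a path point and the obstacle is defined in terms of two points that are respectively the two end points of the connecting line at the path point $P_i$.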

5. An artificial intelligence-based collision avoidance device, characterized by comprising:

the calculation module is used for applying a Monte Carlo tree search algorithm to the optimal solution space and calculating to obtain an adjustment parameter;

the collision avoidance strategy model operation module comprises:

further comprising:

the calculation module comprises:

6. The apparatus of claim 5, wherein the evaluation network model building module comprises: