CN112327923B

CN112327923B - Multi-unmanned aerial vehicle collaborative path planning method

Info

Publication number: CN112327923B
Application number: CN202011303855.3A
Authority: CN
Inventors: 周志浪; 刘小波; 郑可心; 张超超; 杨建峰
Original assignee: China University of Geosciences
Current assignee: China University of Geosciences
Priority date: 2020-11-19
Filing date: 2020-11-19
Publication date: 2022-04-01
Anticipated expiration: 2040-11-19
Also published as: CN112327923A

Abstract

The invention provides a multi-unmanned aerial vehicle collaborative path planning method, which comprises the following steps: the unmanned aerial vehicle group obtains a starting point position, an obstacle point and existing radar coordinates in a map environment, and establishes a two-dimensional grid map of a combat environment; according to the obtained two-dimensional grid map, establishing an optimization target of the multiple unmanned aerial vehicles according to the path length of each unmanned aerial vehicle and the threat strength; initializing the population by combining random generation path points and Q learning to obtain an initial population of each unmanned aerial vehicle path; optimizing the initial population by adopting an improved particle swarm algorithm according to the optimization target; the method takes the overall benefits of the multiple unmanned aerial vehicles as an optimization target, and ensures the safety and the high efficiency of the flight path of each unmanned aerial vehicle.

Description

Multi-unmanned aerial vehicle collaborative path planning method

Technical Field

The invention relates to the technical field of unmanned aerial vehicles, in particular to a multi-unmanned aerial vehicle collaborative path planning method.

Background

Cooperation among multiple unmanned aerial vehicles is the main mode of future unmanned aerial vehicle applications. The performance of multi-unmanned aerial vehicle cooperative path planning largely determines the effectiveness of that cooperation, and it is currently a focus of unmanned aerial vehicle path planning research.

The cooperative path planning of the multiple unmanned aerial vehicles refers to planning paths for the multiple unmanned aerial vehicles under the condition of considering cooperative constraint, so that the multiple unmanned aerial vehicles can cooperatively complete tasks at minimum cost. Compared with a single unmanned aerial vehicle, the implementation of path planning of multiple unmanned aerial vehicles is more limited and more difficult, which is a key point and a difficult point of current path planning.

Traditional algorithms such as A* and Voronoi diagrams are computationally efficient and simple to plan with. MA P B et al use Voronoi diagrams and collaborative strategies to achieve collaborative track planning that satisfies time constraints. Lidewei et al, by improving the search sequence and optimizing the valuation function, change the undirected search in the A* algorithm into a directed search and the global valuation into a local valuation, improving the efficiency of the algorithm. However, traditional algorithms have many limitations in solving the path planning problem. For example, exact methods are only suitable for small-scale path planning, and when the objective function and the constraint conditions are complicated they have difficulty providing an effective solution. Heuristic algorithms easily fall into local optima and are likewise unsuitable for large-scale problems.

In view of the limitations of traditional algorithms in solving multi-unmanned aerial vehicle path planning, more and more researchers use intelligent optimization algorithms to solve unmanned aerial vehicle cluster path planning. Intelligent optimization algorithms are strongly self-organizing, robust and scalable, and can be used to solve the unmanned aerial vehicle cooperation problem under complex constraints. The ant colony algorithm, the particle swarm algorithm and the genetic algorithm are the three most widely used types of method.

In one quantum ant colony method, all quantum ants complete the path search according to tabu search and node selection probabilities updated by the quantum pheromone; quantum rotation angles are updated according to the comprehensive cost of the optimal paths, the quantum pheromone is updated using simulated quantum rotation gates, the output optimal paths are stored in a path set, whether the number of paths in the path set has reached the maximum is judged, and the paths in the set are sorted by length for selection by the unmanned aerial vehicle. Yu et al generate a solution sequence of the unmanned aerial vehicle cluster path using the track point numbers, with the multiple sequences divided by division points representing the paths of multiple unmanned aerial vehicles; they introduce a differential evolution operation to maintain particle diversity and address premature convergence of the particles, and solve the problem with a hybrid particle swarm algorithm combined with an adaptive inertia weight adjustment strategy. XIAO et al address path planning for multiple unmanned aerial vehicles traversing multiple target points under obstacle-area restrictions, with the lowest total flight cost as the goal. In solving with a genetic algorithm, they design a new crossover operator that randomly selects a sub-path from the parent generation, places it first, and generates paths traversing the other target points according to a certain rule, thereby improving the convergence speed while retaining the excellent chromosomes of the parent generation.

In the invention, the particle swarm algorithm based on single route point evaluation can realize learning more potential better route points in the unmanned aerial vehicle cluster, but cross route points can be generated, and in this case, the obtained routes need to be subjected to conflict processing, so that the convergence speed is slowed.

The UUV path planning method based on the particle swarm algorithm does not consider threats possibly encountered in the running process of the unmanned ship, and is a single-target path planning method.

The unmanned aerial vehicle cluster path planning based on the improved Q learning algorithm only plans a path for each unmanned aerial vehicle independently, and does not consider the overall benefit of the unmanned aerial vehicle cluster.

The path planning method based on the ant colony algorithm judges whether the cooperation can be achieved or not by smoothing the initial path of each unmanned aerial vehicle, does not consider the cooperation problem among the unmanned aerial vehicles in the algorithm process, and only plans the path for each unmanned aerial vehicle independently.

Given the existing unmanned aerial vehicle path planning methods and the multi-objective, cooperative-optimization character of multi-unmanned aerial vehicle path planning in practice, the cooperative nature of multi-unmanned aerial vehicle path planning needs deeper study, so that the overall benefit of the unmanned aerial vehicle cluster can be increased while each unmanned aerial vehicle still reaches its destination smoothly.

The invention provides a multi-unmanned aerial vehicle collaborative path planning method based on an improved particle swarm algorithm. In the initialization of particle swarm, the invention adopts a mode of combining random initialization and Q learning initialization, thereby ensuring the diversity of the initial population, accelerating the convergence speed of the algorithm, and improving the position and speed updating mode of the particle swarm algorithm, so that the method is suitable for the path planning problem in a coordinate form; the path lengths and threat strengths of the multiple unmanned aerial vehicles are considered in the flight process of the unmanned aerial vehicles, so that the high efficiency and safety of the multiple unmanned aerial vehicles for executing tasks are ensured; the weighted path length and the threat intensity of the multiple unmanned aerial vehicles are used as optimization targets, and the overall benefit of the unmanned aerial vehicle cluster is increased.

Disclosure of Invention

In view of the above, the present invention provides a method for planning a collaborative path of multiple drones, including the following steps:

S1, the unmanned aerial vehicle cluster acquires the starting point position, the obstacle points and the coordinates of existing radars in the map environment, and establishes a two-dimensional grid map of the battle environment;

S2, constructing an optimization target of the multiple unmanned aerial vehicles from the path length and the threat strength of each unmanned aerial vehicle, according to the two-dimensional grid map established in S1;

S3, initializing the population by combining randomly generated path points and Q learning to obtain an initial population of each unmanned aerial vehicle path;

S4, optimizing the initial population obtained in S3 by adopting an improved particle swarm algorithm according to the optimization target constructed in S2;

S5, obtaining the optimal particles of the population, and outputting the routes of all unmanned aerial vehicles in the unmanned aerial vehicle cluster.

The technical scheme provided by the invention has the beneficial effects that: 1. the mixed mode of the random generation path and the Q learning generation path is adopted, so that the superiority of the initial solution of the population is ensured;

2. the variation operation is based on the distance and the included angle between the point to be selected and the end point, different selection probabilities are given to the points to be selected, and the convergence speed of the algorithm is accelerated;

3. the overall benefit of the multiple unmanned aerial vehicles is taken as an optimization target, and the safety and the high efficiency of flight paths of the unmanned aerial vehicles are guaranteed.

Drawings

Fig. 1 is a flow chart of the multi-unmanned aerial vehicle collaborative path planning method of the present invention;

Fig. 2 is a grid environment diagram of the multi-unmanned aerial vehicle collaborative path planning method of the present invention;

Fig. 3 is a schematic diagram of the action mutation of the multi-unmanned aerial vehicle collaborative path planning method of the present invention;

Fig. 4 shows crossover mode 1 between a particle and its individual optimal particle in the multi-unmanned aerial vehicle collaborative path planning method of the present invention;

Fig. 5 shows crossover mode 2 between a particle and its individual optimal particle in the multi-unmanned aerial vehicle collaborative path planning method of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be further described with reference to the accompanying drawings.

Referring to fig. 1, the present invention provides a method for planning a collaborative path of multiple drones, including the following steps:

The path points are represented in the form of coordinates in a two-dimensional grid map, and the flight environment of the unmanned aerial vehicle is described as {(x, y) | x_min ≤ x ≤ x_max, y_min ≤ y ≤ y_max}, where x_min, x_max, y_min and y_max are the boundaries of the flight area of the unmanned aerial vehicle;

As shown in fig. 2, the environment is divided into 20 × 20 cells of equal area, and each cell carries one of the parameter values 0, -1, 1 and 2. A cell parameter of 0 indicates the position of the starting point; a cell parameter of 1 indicates that the area has no obstacle; a cell parameter of -1 indicates that the area contains an obstacle or a threat source; and a cell parameter of 2 indicates the position of the end point. Constructing the two-dimensional grid map in this way captures the environment information well.
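As a minimal sketch of such a grid, the following Python builds a 20 × 20 map. It assumes the cell coding listed in claim 2 (0 for the starting point, 1 for a free cell, -1 for an obstacle or threat source, 2 for the end point). The function name `build_grid` and the example coordinates are hypothetical and do not come from the patent.

```python
FREE, BLOCKED, START, GOAL = 1, -1, 0, 2  # cell codes as listed in claim 2


def build_grid(size, start, goal, blocked):
    """Return a size x size grid (list of rows), indexed as grid[y][x]."""
    grid = [[FREE] * size for _ in range(size)]
    for x, y in blocked:
        grid[y][x] = BLOCKED
    sx, sy = start
    gx, gy = goal
    grid[sy][sx] = START
    grid[gy][gx] = GOAL
    return grid


# Hypothetical example layout: three blocked cells on a 20 x 20 map.
grid = build_grid(20, start=(0, 0), goal=(19, 19),
                  blocked=[(5, 5), (5, 6), (10, 3)])
```

Storing the map as a list of rows keeps the lookup `grid[y][x]` cheap, which matters because every later step (random generation, mutation, repair) queries cell states repeatedly.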

The number of steps taken by unmanned aerial vehicle i is taken as f_step. Considering the turning-angle constraint of the unmanned aerial vehicle, a penalty f_penalty is imposed if the path contains oblique flight, giving the path length expression of a single unmanned aerial vehicle:

f_i＝f_step+f_penalty

the path comprehensive weighting length of the multiple unmanned aerial vehicles is calculated by the following formula:

wherein m is the number of unmanned aerial vehicles;

A complete path from the starting point to the target point is divided into three segments, and the contributions of each threat point at the 1/4, 2/4 and 3/4 points of each segment are summed to obtain the threat intensity of threat point j on unmanned aerial vehicle i, expressed as:

where t indexes the three segments of the path, i_t is the t-th segment of the path of unmanned aerial vehicle i, the distance terms are the fourth powers of the distances from threat point j to the 1/4, 2/4 and 3/4 points of the t-th segment, and K is the adjustment factor. The threat strength of a single unmanned aerial vehicle under the path is:

where N is the number of threat points. The comprehensive threat intensity experienced by the unmanned aerial vehicles is defined as:

The overall optimization target f of the multiple unmanned aerial vehicles is constructed as f = w₁*f₁ + w₂*f₂, where w₁ and w₂ are the weights of the total weighted path length and the total weighted threat strength, respectively.
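The objective can be sketched in Python as follows. The aggregation formulas are not reproduced in this text, so several details here are assumptions rather than the patented formulas: a fixed penalty per oblique move, a K / d⁴ threat contribution at each sample point, index-based sampling of the 1/4, 2/4 and 3/4 points, and simple means over the m unmanned aerial vehicles for f₁ and f₂. All function names and default values are hypothetical.

```python
def path_length(path, penalty=0.5):
    """f_i = f_step + f_penalty for one path given as a list of (x, y) cells."""
    steps = len(path) - 1
    # Assumption: a fixed penalty is charged for every oblique (diagonal) move.
    oblique = sum(1 for (x0, y0), (x1, y1) in zip(path, path[1:])
                  if x0 != x1 and y0 != y1)
    return steps + penalty * oblique


def threat_intensity(path, threats, K=1.0):
    """Sum K / d^4 over the 1/4, 2/4 and 3/4 points of each of three segments."""
    n = len(path) - 1
    total = 0.0
    for tx, ty in threats:
        for t in range(3):            # the three path segments
            for q in (1, 2, 3):       # quarter points within a segment
                # Assumption: use the waypoint nearest to this fraction of the path.
                x, y = path[round(n * (t + q / 4) / 3)]
                d2 = (x - tx) ** 2 + (y - ty) ** 2
                total += K / max(d2 * d2, 1e-9)  # guard against d = 0
    return total


def overall_objective(paths, threats, w1=0.5, w2=0.5):
    """f = w1*f1 + w2*f2, with f1 and f2 taken as means over the UAVs (assumed)."""
    m = len(paths)
    f1 = sum(path_length(p) for p in paths) / m
    f2 = sum(threat_intensity(p, threats) for p in paths) / m
    return w1 * f1 + w2 * f2
```

Because the threat term falls off with the fourth power of distance, a path that passes close to a single threat point can dominate f₂, so the weights w₁ and w₂ would need tuning against the scale of the path lengths.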

The invention obtains part of the initial population by generating random path points. When the initial population is generated, to obtain a continuous path the unmanned aerial vehicle selects a free grid from the 8 grids adjacent to it as the next grid. This selection method ensures that the path is continuous and that the generated paths are diverse, but the generated paths are longer and their threat intensity is higher, and the number of generations needed for evolution increases, which slows convergence.
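A minimal sketch of this random generation step is given below. It assumes that blocked cells are marked -1 (as in claim 2), that a walk never revisits a cell, and that a walk which reaches a dead end is discarded and restarted; none of these details is stated in the text, and the name `random_path` is hypothetical.

```python
import random

NEIGHBOURS = [(-1, -1), (0, -1), (1, -1), (-1, 0),
              (1, 0), (-1, 1), (0, 1), (1, 1)]


def random_path(grid, start, goal, rng, max_tries=1000):
    """Random walk over free 8-neighbours; returns a path or None."""
    size = len(grid)
    for _ in range(max_tries):
        path, visited = [start], {start}
        while path[-1] != goal:
            x, y = path[-1]
            options = [(x + dx, y + dy) for dx, dy in NEIGHBOURS
                       if 0 <= x + dx < size and 0 <= y + dy < size
                       and grid[y + dy][x + dx] != -1      # skip blocked cells
                       and (x + dx, y + dy) not in visited]
            if not options:   # dead end: abandon this walk and retry
                break
            nxt = rng.choice(options)
            path.append(nxt)
            visited.add(nxt)
        if path[-1] == goal:
            return path
    return None
```

Each step moves to one of the 8 neighbours, so every returned path is continuous by construction, which matches the property the text attributes to this selection method.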

Therefore, the invention obtains another part of the initial population by using a Q learning algorithm, the purpose of the Q learning algorithm is to obtain the maximum accumulated reward value in the interaction with the environment, and therefore, the construction of a reasonable reward function is very important. In the present invention, the reward function is designed as shown in table 1;

Table 1 Reward function settings

Unmanned aerial vehicle state	Starting point	Target point	Obstacle	Threat point	Feasible point
Reward value	0	20	-1	1	r

In order to accelerate the convergence rate of Q learning, a feasible point reward function based on a sparse function is constructed in Q learning:

where r₀ is the base reward value, r_max is the maximum reward value when threats are taken into account, r_threat is the threat coefficient, x_UAV and y_UAV are the coordinates of the unmanned aerial vehicle, x_goal and y_goal are the coordinates of the end point, x_ithreat and y_ithreat are the coordinates of each threat source, and k_goal and k_threat are control coefficients for adjusting the reward function; the parameter settings must guarantee r > 0;

To ensure good exploration, the action selection strategy must balance exploration and exploitation. The action selection strategy adopted by Q learning in the invention is the Boltzmann distribution method, expressed as:

where V_i represents the value function of either the state s or the state-action pair (s, a_i), p(a_i | s) represents the probability that action a_i is selected in state s, T is a control parameter, and k is the number of iterations;
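The expression itself is not reproduced in this text. The sketch below therefore uses the standard Boltzmann (softmax) form, in which p(a_i | s) is proportional to exp(V_i / T). How the control parameter depends on the iteration number k is an assumption here (a geometric decay with hypothetical parameters `t0` and `decay`), not something taken from the patent.

```python
import math
import random


def boltzmann_probabilities(values, temperature):
    """p(a_i | s) proportional to exp(V_i / T)."""
    peak = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp((v - peak) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(values, k, rng, t0=10.0, decay=0.99):
    """Draw one action index from the Boltzmann distribution at iteration k."""
    # Assumption: the temperature shrinks geometrically with the iteration k.
    temperature = max(t0 * decay ** k, 1e-3)
    probs = boltzmann_probabilities(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]
```

A high temperature makes the probabilities nearly uniform (exploration), while a low temperature concentrates them on the highest-valued action (exploitation), which is the balance the text asks for.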

each unmanned aerial vehicle path obtained through the Q learning algorithm has the characteristics of shorter path length and smaller threat intensity compared with the path obtained in a random generation mode.

Selecting a path as a candidate initial solution of a population from the unmanned aerial vehicle path solutions obtained by the Q learning algorithm, prescribing the number of path solutions of each unmanned aerial vehicle in different initial directions by using priori knowledge, and combining the initial solution obtained by the Q learning and the initial solution obtained by randomly generating path points, namely obtaining the initial population of the particle swarm algorithm, wherein the form of the initial solution of the population is a coordinate.

The velocity and position update operators of the basic particle swarm algorithm are difficult to apply to a discrete domain such as coordinate-based collaborative path planning. The position and velocity update formula of the particle swarm algorithm is therefore analyzed and an improvement strategy is proposed;

the updating formula of the position and the speed of the particle swarm algorithm is as follows:

where x_i,j(t) = (x_i,j(1), x_i,j(2), ..., x_i,j(t)) is the position of the i-th particle at the t-th iteration, p_i,j,best(t) = (p_i,j,best(1), p_i,j,best(2), ..., p_i,j,best(3)) is the optimum value of the particle in t iterations, g_i,j,best(t) = (g_i,j,best(1), g_i,j,best(2), ..., g_i,j,best(3)) is the global optimum of the particles in t iterations, w is the inertia weight, c₁ and c₂ are non-negative acceleration coefficients, F₁ is the operation of the particle on itself, F₂ is the learning of particle x_i,j(t) from its individual historical optimum, and F₃ is the learning of particle x_i,j(t) from the global historical optimum;

In the evolution operation, F₁ is defined as a mutation operation: particle i mutates with probability w and remains unchanged with probability 1 - w, expressed as:

the convergence of the algorithm is accelerated by adopting a linear time-varying inertia weight strategy:

where the value of w decreases linearly from the initial value w_max to the final value w_min, t is the current iteration number of the algorithm, and iter_max is the maximum number of iterations allowed;
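The weight formula is not reproduced in this text. The sketch below assumes the usual linear schedule, w = w_max - (w_max - w_min) * t / iter_max, which matches the description of a linear decrease from w_max to w_min. The default values 0.9 and 0.4 and the helper `maybe_mutate` are hypothetical.

```python
import random


def inertia_weight(t, iter_max, w_max=0.9, w_min=0.4):
    """Linearly decrease w from w_max (t = 0) to w_min (t = iter_max)."""
    return w_max - (w_max - w_min) * t / iter_max


def maybe_mutate(particle, t, iter_max, mutate, rng=random):
    """Apply F1: mutate with probability w, otherwise keep the particle."""
    w = inertia_weight(t, iter_max)
    return mutate(particle) if rng.random() < w else particle
```

Since w doubles as the mutation probability, early iterations mutate often and explore widely, while late iterations mostly keep particles unchanged and let the crossover operations refine them.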

the mutation operation comprises the following specific steps:

S41, randomly selecting a point in the path coordinates, excluding the starting point and the end point;

S42, as shown in fig. 3, the unmanned aerial vehicle has 8 choices for its next action from the point where it is located; if the original coordinate point in the path is 1, the candidate coordinate points are 2 to 8, giving 7 candidate coordinate points. To reduce the blindness of the search, the distance D_i between each candidate point i and the target point is calculated, and the probability P_i that point i is selected is calculated:

where k_i is an adjustment coefficient determined by the included angle between the candidate action point and the target point; in particular, when the candidate point is an obstacle point, its selection probability is 0;

S43, randomly generating a number P between 0 and 1; if P₁ + P₂ + ... + P_{i-1} < P ≤ P₁ + P₂ + ... + P_i, point i is selected as the next action point;

S44, adding the current grid to a search tabu table to avoid repeated selection, and replacing the current grid serial number with the next grid serial number;

S45, if all the candidate coordinate points are obstacle points, cancelling the mutation operation for this point and jumping directly to S41 to reselect the mutation coordinates;

S46, judging whether the new action point is continuous with the original path and, if not, inserting a new non-obstacle coordinate point at the discontinuity; whether the path is continuous is judged according to the following formula:

δ = max{abs(x_{i+1} - x_i), abs(y_{i+1} - y_i)}

where (x_{i+1}, y_{i+1}) is a coordinate point of the original path and (x_i, y_i) is the point adjacent to it. If δ = 1, the path is continuous and the mutation operation is finished; if δ > 1, the path is discontinuous, and a new coordinate is inserted between the two coordinate points by the average value method; the coordinate expression of the inserted point is as follows:

Whether the inserted point is an obstacle is then judged. If it is, the coordinates to the upper right, lower left and lower right of the insertion point are checked in turn until a free coordinate point is selected. Whether the inserted point is continuous with the original path points is then judged again, and if not, step S45 is repeated until the path is continuous.
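Two parts of the mutation steps above can be sketched in Python: the roulette-wheel choice of S43 and the continuity test of S46. The formula for P_i and the coordinate expression of the inserted point are not reproduced in this text, so the sketch takes the probabilities as given input and assumes that the "average value method" is the integer mean of the two cells. All function names are hypothetical.

```python
import random


def roulette_select(probabilities, rng):
    """Return index i with P_1+...+P_{i-1} < P <= P_1+...+P_i for a random P."""
    # Rescale in case the P_i do not sum exactly to 1.
    p = rng.random() * sum(probabilities)
    cumulative = 0.0
    for i, p_i in enumerate(probabilities):
        cumulative += p_i
        if p_i > 0 and p <= cumulative:
            return i
    # Floating-point fallback: last candidate with non-zero probability.
    # The all-zero case is expected to be handled by the caller (step S45).
    return max(i for i, p_i in enumerate(probabilities) if p_i > 0)


def is_continuous(a, b):
    """delta = max(|dx|, |dy|) equals 1 for adjacent grid cells."""
    return max(abs(b[0] - a[0]), abs(b[1] - a[1])) == 1


def midpoint(a, b):
    """Candidate insertion point between two non-adjacent cells."""
    # Assumption: the "average value method" is the integer mean of the cells.
    return ((a[0] + b[0]) // 2, (a[1] + b[1]) // 2)
```

Candidates with probability 0, such as obstacle points, can never be returned by `roulette_select`, which mirrors the rule in S42.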

F₂ is defined as a crossover operation: particle i crosses over with its individual optimum with probability c₁ and remains unchanged with probability 1 - c₁, expressed as:

F₃ is defined as a crossover operation: particle i crosses over with the global optimum with probability c₂ and remains unchanged with probability 1 - c₂, expressed as:

The steps of the crossover operation between a particle in the population and the globally optimal particle are similar to those of F₂;

and obtaining the optimal particles of each sub-population through the evolution operation, judging whether the optimal particles meet the cycle ending condition, if so, storing the optimal particles of each sub-population, terminating the evolution operation, and if not, re-performing the evolution operation.

S401, judging whether particle i and the individual optimal particle share an identical coordinate point; if so, proceeding to the next step, otherwise ending the operation;

S402, if there is only one identical point, the coordinates of the individual optimal particle from that point to the end point are exchanged with particle i, as shown in fig. 4; if there are two or more identical points, two of them are randomly selected and the section of the individual optimal particle between the two points is exchanged with particle i, as shown in fig. 5.
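Steps S401 and S402 can be sketched as follows, with each particle represented as a list of (x, y) cells. Two points are assumptions rather than statements from the text: the shared starting point and end point are excluded when looking for identical points, and the new particle takes the section from the individual optimal particle (reversed if the two shared points occur in the opposite order there). The name `crossover` is hypothetical.

```python
import random


def crossover(particle, best, rng=random):
    """Cross a path with its individual optimal path at shared cells."""
    # Assumption: the common start and end cells do not count as shared points.
    interior = set(particle[1:-1]) & set(best[1:-1])
    common = list(dict.fromkeys(p for p in particle[1:-1] if p in interior))
    if not common:                       # S401: nothing shared, no change
        return list(particle)
    if len(common) == 1:                 # S402, mode 1: take the tail of best
        ia, ja = particle.index(common[0]), best.index(common[0])
        return particle[:ia] + best[ja:]
    a, b = rng.sample(common, 2)         # S402, mode 2: take a middle section
    ia, ib = sorted((particle.index(a), particle.index(b)))
    ja, jb = best.index(particle[ia]), best.index(particle[ib])
    if ja <= jb:
        section = best[ja:jb + 1]
    else:                                # shared points occur in reverse order
        section = best[jb:ja + 1][::-1]
    return particle[:ia] + section + particle[ib + 1:]
```

Because both cut points lie on both paths, the spliced result stays continuous whenever the two input paths are continuous, so no repair step is needed after this operation.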

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A multi-unmanned aerial vehicle collaborative path planning method is characterized by comprising the following steps:

S2, constructing an optimization target of the multiple unmanned aerial vehicles from the path length and the threat strength of each unmanned aerial vehicle, according to the two-dimensional grid map established in S1:

f_i＝f_step+f_penalty

the specific calculation formula of the path comprehensive weighting length of the multiple unmanned aerial vehicles is as follows:

wherein m is the number of unmanned aerial vehicles;

wherein t indexes the three segments of the path, and i_t is the t-th segment of the unmanned aerial vehicle path;

constructing an overall optimization target f of the multiple unmanned aerial vehicles: f = w₁*f₁ + w₂*f₂, wherein w₁ and w₂ are the weights of the total weighted path length and the total weighted threat strength, respectively;

2. The method for planning the collaborative path for multiple unmanned aerial vehicles according to claim 1, wherein the S1 is specifically as follows:

dividing the flight environment into 20 × 20 grids with the same area, wherein each grid carries different parameter information of 0, -1, 1 and 2, and when the grid parameter is 0, the grid parameter represents the position information of a starting point; when the grid parameter is 1, the area is free of obstacles, when the grid parameter is-1, the area contains obstacles or threat sources, and when the grid parameter is 2, the area represents the position information of the end point.

3. The method for planning the collaborative path for the multiple unmanned aerial vehicles according to claim 1, wherein the specific method for obtaining the initial population in S3 is as follows:

firstly, obtaining a part of initial population by adopting a random path point generation mode, then obtaining a part of initial population by adopting Q learning, and constructing a feasible point reward function based on a sparse function in the Q learning:

The action selection strategy adopted in Q learning is a Boltzmann distribution method, and the expression of the action selection strategy is as follows:

wherein V_i represents the value function of either the state s or the state-action pair (s, a_i), p(a_i | s) represents the probability that action a_i is selected in state s, T is a control parameter, and k is the number of iterations;

selecting a path as a candidate initial solution of a population in the unmanned aerial vehicle path solutions obtained by the Q learning algorithm, prescribing the number of path solutions of each unmanned aerial vehicle in different initial directions by using prior knowledge, combining the initial population obtained by the Q learning and the initial population obtained by randomly generating path points, namely the initial population of the particle swarm algorithm, wherein the form of the initial solution of the population is a coordinate.

4. The method for planning the collaborative path of the multiple unmanned aerial vehicles according to claim 1, wherein an initial population obtained in step S3 is optimized by using an improved particle swarm algorithm according to the optimization target constructed in step S2, and the update formula of the position and the speed of the particle swarm algorithm is as follows:

the steps of the crossover operation between a particle and the globally optimal particle are similar to those of F₂;

5. The method for planning the collaborative path of multiple unmanned aerial vehicles according to claim 4, wherein the mutation operation comprises the following steps:

S41, randomly selecting a point in the path coordinates;

S42, calculating the distance D_i between each candidate point i and the target point, and calculating the probability P_i that point i is selected:

wherein k_i is an adjustment coefficient determined according to prior knowledge; in particular, when the candidate point is an obstacle point, its probability is 0;

S43, randomly generating a number P between 0 and 1; if P₁ + P₂ + ... + P_{i-1} < P ≤ P₁ + P₂ + ... + P_i, selecting point i as the next action point;

δ = max{abs(x_{i+1} - x_i), abs(y_{i+1} - y_i)}

6. The method of claim 4, wherein the specific steps of the crossover operation between particle i, with probability c₁, and its individual optimal particle are as follows:

S402, if only one identical point exists, the coordinates of the individual optimal particle from that point to the end point are exchanged with particle i; if two or more identical points exist, two of them are randomly selected, and the section of the individual optimal particle between the two points is exchanged with particle i.