CN110686695A

CN110686695A - Adaptive ant colony A-star hybrid algorithm based on target evaluation factor

Info

Publication number: CN110686695A
Application number: CN201911042714.8A
Authority: CN
Inventors: 陆敬怡; 梁志伟; 祝子健; 李欣昱
Original assignee: Nanjing Post and Telecommunication University
Current assignee: Nanjing Post and Telecommunication University; Nanjing University of Posts and Telecommunications
Priority date: 2019-10-30
Filing date: 2019-10-30
Publication date: 2020-01-14

Abstract

A self-adaptive ant colony-satellite A hybrid algorithm based on a target evaluation factor is a dynamic path planning algorithm applied to RCRSS, and the advantages of an ant colony algorithm and an A-satellite algorithm are extended and fused by introducing the target evaluation factor, so that an optimal path is planned in a complex real-time dynamic environment with roadblocks. The invention combines the heuristic path planning of the A star with the ant colony algorithm of the colony intelligent algorithm, so that the local optimization of the division of labor cooperation of the intelligent agent is realized under the condition that the intelligent agent is limited by the environment, and the global situation is further analyzed and predicted and the colony decision is optimized in view of the local optimization. Compared with various popular path planning algorithms, the adaptive ant colony A-satellite hybrid algorithm not only considers the path length factor, but also focuses on the aspects of the accessibility of the path, the dynamic property of the road condition and the like, and can achieve a good effect in a dynamic complex unknown environment.

Description

Adaptive ant colony A-star hybrid algorithm based on target evaluation factor

Technical Field

The invention relates to the technical field of intelligent dynamic path planning, in particular to a self-adaptive ant colony A-satellite hybrid algorithm based on a target evaluation factor.

Background

Since 1959, industrial robots are emerging, and through 60 years of development, the robots are widely applied, especially in the newly-developed field of disaster relief. Researchers have developed and designed a robot city disaster rescue simulation platform, such as a RoboCup robot world cup rescue simulation system (RCRSS).

In the robot world cup rescue simulation, due to limited perception of an intelligent agent, unsmooth communication and complexity, instantaneity and dynamics of the surrounding environment, the problem of cooperation and division of labor in an unknown environment is solved. From the aspect of engineering application, RCRSS can provide reliable decision guidance for human post-disaster rescue, so that the system has a far-reaching significance.

In RCRSS, there are many dynamically changing factors. The dynamic path planning in the RCRSS is to generate a better path to a target node by using a path planning algorithm according to the current task sequence of an agent, and is specifically embodied as follows: short stroke, safety, high passing rate and the like. The traditional path planning algorithm can only solve the problem that the path length and direction are only optimized under the condition that the global environment is known. However, RCRSS needs to solve the problem that under the non-ideal environment with complex dynamics and incomplete information, multiple indexes are comprehensively considered, for example: optimization problems of road safety, roadblock size, algorithm complexity and the like.

In the traditional path planning algorithm, the genetic algorithm and the neural network algorithm have the defects of low search efficiency, large data volume and the like. However, the heuristic algorithm, such as the a-star algorithm, is a mature and widely applied algorithm, has the characteristics of fast search, monotonous change and the like, can achieve a good effect in a static environment, but is lack of dynamics. In contrast, the ant colony algorithm has better support for the real-time changing environment, but has inherent defects, and the result convergence is easy to occur at the later stage of the search, so that the ant colony algorithm has the risks of being premature, stagnant and falling into local extreme values.

Disclosure of Invention

The invention provides a dynamic path planning algorithm applied to RCRSS, which extends and integrates the advantages of ant colony and A-star algorithms by introducing a Target Evaluation Factor (TEF), and aims to plan an optimal path in a complex real-time dynamic environment with roadblocks. Different from previous researches, the invention focuses on how to combine heuristic path planning of the A star with an ant colony algorithm of a colony intelligent algorithm, so that the local optimization of the division of labor and cooperation of the intelligent agent is realized under the condition that the intelligent agent is limited by the environment, and further the analysis and prediction and the colony decision optimization are carried out on the global situation in view of the local optimization.

The self-adaptive ant colony A-satellite hybrid algorithm based on the target evaluation factor comprises the following steps:

step 1, initializing system parameters and sharing and updating pheromone;

step 2, initializing ants;

step 3, selecting a next node, calculating ant transfer probability, calculating path length of each ant, and recording a current optimal solution;

step 4, judging whether the terminal is reached, if so, entering the next step, and if not, returning to the previous step;

step 5, updating local pheromones and evaluating ant colonies; updating the global pheromone; performing shared updating of the pheromone among the ant colonies;

step 6, judging whether the maximum iteration times is reached, if so, entering the next step, and if not, returning to the initialization ant;

step 7, comparing the cost value of the optimal solution of each circulation; traversing the candidate nodes, selecting the edge with the minimum cost, and adding the candidate nodes into an OPEN table; sorting the OPEN table according to the cost;

step 8, judging whether a target node is found, if so, entering the next step, and if not, returning to traverse the candidate node;

and 9, outputting a path and finishing algorithm circulation.

Further, the star algorithm a is based on Dijkstra algorithm, guides selection of a next node by introducing global information, and makes a cost estimation of a target node by a current node, and the general formula is as follows:

f(x)＝g(x)+h(x) (1)

wherein g (x) represents the cost from the current node to the target node x, and Manhattan distance calculation is adopted; and h (x) represents a heuristic cost evaluation function from the target point, and the cost of h (x) is evaluated by adopting an adaptive ant colony algorithm.

Further, in step 2, a target evaluation factor is introduced, and the target evaluation factor is defined as:

in the formula (2), the first and second groups,

is the Manhattan distance, pass, from the candidate node j to the target node target_ijRepresents the ratio of agent passing path (i, j);

the final ant state transition probability to the next node is:

wherein affected_kN-1 represents the set of next candidate nodes for ant k, while taboo_kRepresenting the set of nodes it has visited; eta_ijA heuristic function for path (i, j), i.e. the expectation of reaching j from i, d_ijDistance between nodes i.j; α and β represent the magnitude of the effect of the pheromone and heuristic function on node selection, respectively.

Further, the pheromone updating method is shown in formula (4):

wherein, t_ijIndicates the pheromone concentration on path (i, j), and T indicates the time required for the ant colony to complete one search; rho is the pheromone persistence, and 1-rho is the pheromone volatility.

Further, in step 5, the specific steps of local pheromone updating are as follows:

releasing a certain amount of pheromones on the passed path (i, j) by all ants in the ant colony according to a local updating rule, wherein in the searching process, the pheromone increment on the path (i, j) is as follows:

wherein N is the total number of ants,

represents the pheromone released by ant k on path (i, j) at this search, which is defined as follows:

in the formula, q₁Is a local pheromone intensity constant, L_kThe path length of the ant k in the search is obtained; the locality of the pheromone updating rule reflects the positive feedback characteristic of the pheromone, namely, more pheromones are released by ants on a quick road;

the following improvements were made:

wherein s is a constant with a small value. Setup 1, 2 means that when more or less than N/s ants in the ant colony select the path, the pheromone will be adaptively updated.

Further, in step 5, the global pheromone updating specifically comprises the following steps:

after one search is finished, the pheromone is updated according to the quality degree in the candidate path set obtained by ant colony planning according to the target function, and the global pheromone increment on the path (i, j) is as follows:

wherein the content of the first and second substances,

the pheromone content which represents that the kth ant finally releases on the path (i, j) after the search is finished according to the global updating rule is defined as:

wherein q is₂Is a global pheromone strength constant, L^kThe path length searched for by ant k; introduction of sigma_kIndicating the pheromone update influence degree of the candidate path obtained by the ant k on the path (i, j):

σ_k＝(1-μ_ij)·n_ij-rank[k](10)

in the formula, n_ijMu is the total number of ants passing through path (i, j)_ijThe weight of path (i, j) is in direct proportion to the concentration of road section pheromone; c_tThe number of segments, C, that pass the path (i, j) in all candidate paths_pThe total number of candidate paths searched for the ant colony; all the candidate paths are sorted in descending order according to the objective function value to generate rank array, rank [ k ]]Rank, rank [ k ], representing candidate paths derived from ant k]Smaller values indicate a relatively better solution.

Further, in step 5, the specific steps of shared updating of the ant colony pheromone are as follows:

in order to obtain a global better path, a communication mechanism is utilized to share a better solution searched by a single ant among ant groups, the perception of an intelligent agent to the global environment is enlarged, and therefore a candidate solution obtained by dynamic path planning is optimized and improved, and the formula is as follows:

wherein q is_commIntensity constant, L, updated for ant colony sharing_commThe path length of the pareto solution generated in the whole environment in the ant colony.

The invention achieves the following beneficial effects: compared with various popular path planning algorithms, the adaptive ant colony A-satellite hybrid algorithm not only considers the path length factor, but also focuses on the aspects of the accessibility of the path, the dynamic property of the road condition and the like, and can achieve a good effect in a dynamic complex unknown environment.

Drawings

Fig. 1 is a flow chart of the hybrid algorithm in the embodiment of the present invention.

Fig. 2 is a schematic table of the average value of the optimal solution, the value of the optimal solution, and the number of average convergence iterations obtained after the experiment is performed independently in the embodiment of the present invention.

FIG. 3 is a schematic table of 10 independent experiments performed and averaged in an example of the present invention.

Fig. 4 is a diagram illustrating comparison between success count and blocking count of path planning in each algorithm according to an embodiment of the present invention.

Fig. 5 is a schematic diagram of a rescue simulation result in the embodiment of the invention.

Detailed Description

The technical scheme of the invention is further explained in detail by combining the drawings in the specification.

The dynamic path planning algorithm applied to the RCRSS extends and integrates the advantages of ant colony and A-star algorithms by introducing a Target Evaluation Factor (TEF), and aims to plan an optimal path in a complex real-time dynamic environment with roadblocks. Different from previous researches, the invention focuses on how to combine heuristic path planning of the A star with an ant colony algorithm of a colony intelligent algorithm, so that the local optimization of the division of labor and cooperation of the intelligent agent is realized under the condition that the intelligent agent is limited by the environment, and further the analysis and prediction and the colony decision optimization are carried out on the global situation in view of the local optimization.

From the foregoing, in RCRSS, there are many dynamically changing factors. The invention combines the advantages of the two methods, finds out a more ideal global better solution before the algorithm convergence, and is feasible through experimental verification.

The dynamic path planning in the RCRSS is to generate a better path to a target node by using a path planning algorithm according to the current task sequence of an agent, and is specifically embodied as follows: short stroke, safety, high passing rate and the like. The traditional path planning algorithm can only solve the problem that the path length and direction are only optimized under the condition that the global environment is known. However, RCRSS needs to solve the problem that under the non-ideal environment with complex dynamics and incomplete information, multiple indexes are comprehensively considered, for example: optimization problems of road safety, roadblock size, algorithm complexity and the like.

The A star algorithm is based on the Dijkstra algorithm, introduces global information to guide the selection of a next node, and makes cost estimation of a current node on a target node, so that a path which is probably an optimal solution is guaranteed to be searched preferentially, and the searching efficiency is improved. The general formula of the A star algorithm is as follows:

f(x)＝g(x)+h(x) (1)

wherein g (x) represents the cost from the current node to the target node x, and the Manhattan distance is adopted for calculation; and h (x) represents a heuristic cost evaluation function from a target point, the cost of h (x) is evaluated by adopting a self-adaptive ant colony algorithm, and the high efficiency and the reliability of the algorithm are ensured while the dynamic property of the algorithm is increased. The algorithm flow chart is shown in fig. 1, and the steps are as follows:

step 1, initializing system parameters and sharing and updating pheromone.

And 2, initializing ants.

And 3, selecting a next node, calculating ant transfer probability, calculating path length of each ant, and recording the current optimal solution.

And 4, judging whether the end point is reached, if so, entering the next step, and if not, returning to the previous step.

Step 5, updating local pheromones and evaluating ant colonies; updating the global pheromone; and (4) performing inter-ant pheromone sharing updating.

And 6, judging whether the maximum iteration times is reached, if so, entering the next step, and if not, returning to the initialization ant.

Step 7, comparing the cost value of the optimal solution of each circulation; traversing the candidate nodes, selecting the edge with the minimum cost, and adding the candidate nodes into an OPEN table; the OPEN table is sorted by cost.

And 8, judging whether the target node is found, if so, entering the next step, and if not, returning to traverse the candidate node.

And 9, outputting a path and finishing algorithm circulation.

The ant routing in the ant colony algorithm has a certain guiding function on the rescue path search of the rescue agent, and particularly has strong adaptability and robustness to a dynamic environment. However, the traditional ant colony algorithm is easy to generate phenomena of premature convergence, local optimum trapping, low algorithm convergence speed and the like during searching.

In RCRSS, in the urban rescue environment, due to the fact that the road conditions after fire and earthquake are complex and changeable, uncertain factors such as fire spreading, aftershock and road conditions which are influenced by factors such as wind power and building materials and the like and dynamically change along with the progress of rescue work can exist. In addition to this, the map information is usually noisy, locally unknown. Therefore, how to search for the optimal path in a complex and variable environment, the rescue time is reduced, and the response speed is improved is the key of the rescue work of the intelligent agent.

When planning a path by a traditional ant colony algorithm, the heuristic function in the formula (1) is generally defined as eta_ij＝1/d_ijSince it only takes into account the distance between nodes, it is easy to cause a fall into local optimality. To overcome this phenomenon, a Target Evaluation Factor (TEF) is introduced instead of the above-described heuristic function η_ij。

In the InRCRSS, the path planning problem for the rescue of the intelligent agent needs to consider not only the path length, but also the influence of the roadblock condition and the real-time fire on the health condition (HP value) of the intelligent agent, and the like, and the intelligent agent obtains a fast and safe path through a path planning model. Defining the target evaluation factor as:

in the formula (2), the first and second groups,

is the manhattan distance of the candidate node j to the target node target. Therefore, the distance between the places in the city map can be more truly and accurately represented. pass_ijRepresenting the ratio of agent passing path (i, j).

The Target Evaluation Factor (TEF) mainly considers two aspects, namely the distance between a candidate node and a target and the passing rate from a current node to the candidate node, which reflects the quality of the candidate node. After the intelligent agent generates the candidate route according to the algorithm, the intelligent agent is limited by the self driving force and the receptive field, and roadblocks formed by factors such as building collapse and the like cause certain probability of blocking the traffic, namely, the intelligent agent has insufficient confidence degree on real-time road conditions, so that multiple searches are possibly needed to find a passable route. And each search guides the intelligent agent to approach the target continuously through the control of the TEF, and finally a rapid and safe passable road is found.

However, the above two factorsThe method and the system can not be obtained at all, and different proportions of the distance degree and the passing rate are selected according to the category, the division of labor and different behavior modes of the agent. For example, police agents are responsible for clearing obstacles, so only a single factor of distance is considered; when the rescue agent moves towards the wounded, if the HP value of the rescue agent is predicted to be higher, the rescue agent prefers a road section with few roadblocks and high passing rate, and if the health condition of the rescue agent is predicted to be worse, the rescue agent prefers a path with shorter distance. Heuristic function η substituted with a Target Evaluation Factor (TEF)_ijAnd finally, the state transition probability of the ants to the next node is as follows:

wherein affected_kN-1 represents the set of next candidate nodes for ant k, while taboo_kRepresenting the set of nodes it has visited. Eta_ijA heuristic function for path (i, j), i.e. the expectation of reaching j from i, d_ijIs the distance between nodes i.j. α and β represent the magnitude of the effect of the pheromone and heuristic function on node selection, respectively.

In the path searching process, ants trigger the updating of pheromones, and a better path with the characteristics of short distance and high passing rate can obtain more pheromones, so that the globality of updating the pheromones of the algorithm is embodied, the quality degree of each candidate path in the globalsearch range is expressed, and the method is an information positive feedback phenomenon. The pheromone updating method defined by the invention is shown as a formula (4):

wherein, t_ijIndicates the pheromone concentration on path (i, j), and T indicates the time required for an ant colony to complete a search. Rho is the pheromone persistence, and 1-rho is the pheromone volatility. As is apparent from equation (4), the pheromone updates are global and local, and the ant colony will also share the updated pheromone. However, it is worth noting that when the concentration of the pheromone in the path segment is low, the positive feedback phenomenon of the pheromone is relatively unobvious, and the path search embodies strongerThe randomness of the algorithm is low in convergence speed; on the contrary, when the concentration of the pheromone is higher, the randomness is weakened, the positive feedback effect is strengthened, the convergence speed of the algorithm is accelerated, but the algorithm is easy to fall into local optimum. To avoid such problems, the present invention defines the pheromone adaptive update rule as follows:

local updating of pheromone:

wherein N is the total number of ants,

in the formula, q₁Is a local pheromone intensity constant, L_kThe path length of ant k in this search is obtained. The locality of the pheromone updating rule reflects the positive feedback characteristic of the pheromone, namely, more pheromones are released by ants on a fast road.

Considering the lack of control over the global environment by the pheromone local update rule, a "short-sighted" phenomenon with short and shallow eyes can be caused. The method is characterized in that a locally planned path cannot be guaranteed to reach a target position on the algorithm level, a globally ideal solution cannot be guaranteed to be obtained, the problems of local optimum, dead zones, algorithm precocity and the like cannot be avoided, and therefore the following improvements are made:

The rationale for doing this is that the candidate paths generated by each ant in the later stage of the search will gradually approach the path with high pheromone value, resulting in the rapid increase of pheromone value of the road segment and premature convergence of the algorithm. Therefore, in the searching process, the pheromone concentration of the road sections is regularly reduced, and the possibility that other paths are searched and the diversity of feasible solutions can be increased.

And (3) pheromone global updating:

after the search is finished, the pheromone is updated according to the quality degree from the candidate path set obtained by ant colony planning according to the objective function, and the global pheromone increment on the path (i, j) is as follows:

wherein the content of the first and second substances,

wherein q is₂Is a global pheromone strength constant, L^kThe path length searched for by ant k. Introduction of sigma_kIndicating the pheromone update influence degree of the candidate path obtained by the ant k on the path (i, j):

σ_k＝(1-μ_ij)·n_ij-rank[k](10)

in the formula, n_ijMu is the total number of ants passing through path (i, j)_ijThe weight of path (i, j) is proportional to the road segment pheromone concentration. C_tFor all candidatesNumber of links in the route that pass through path (i, j), C_pThe total number of candidate paths searched for the ant colony. All the candidate paths are sorted in descending order according to the objective function value to generate rank array, rank [ k ]]Rank, rank [ k ], representing candidate paths derived from ant k]Smaller values indicate a relatively better solution.

The rationality of the above formula will be explained from several points of view:

if the weight of path (i, j) is μ_ijIf the value is larger, the pheromone value of the road section is larger (1-mu)_kj)·n_ijIs relatively small, is not a bad solution rank k]Relatively small, ultimately resulting in σ_kRelatively large, and embodies the positive feedback characteristic of pheromones. Rank k for sub-optimal solution]Greater, σ_kRelatively small, may negatively impact the road segment, resulting in a reduction in its pheromone content.

If the weight of path (i, j) is μ_ijSmaller, then (1- μ)_kj)·n_ijThe value is relatively large, and the non-optimal solution is given a certain power for increasing the pheromone, so that the concentration of the global pheromone is ensured to have a certain dispersity, the over-precocity of the algorithm is relieved, the best Partorian of intelligent cooperation and decision making is achieved, and the dynamic adjustment of the pheromone is realized.

From the path that each ant passes through in the search process, if the path searched by the kth ant is shorter, rank [ k ] is smaller, and the larger the pheromone enhancement strength is, the pheromone of the road section can be effectively enhanced.

When the number of paths that the current node i passes through is large, the difference between the paths is small, and the influence on each candidate path is relatively uniform. On the contrary, when the paths passed by the node i are few, the pheromone concentration difference of each path is obvious, and the pheromone updating intensity on the optimal path is larger, so that the concentration value of the pheromone concentrated on the optimal path is higher. Therefore, the pheromone concentration of the better road section is maintained while the global pheromone is prevented from being excessively concentrated.

And (3) performing inter-ant colony pheromone sharing updating:

because the receptive field and the capability of a single intelligent body are limited, the superiority of an information sharing mechanism is embodied. In order to obtain a global better path, a communication mechanism needs to be fully utilized, a better solution searched by a single ant is shared among ant groups, perception of an intelligent agent to the global environment is enlarged, and therefore a candidate solution obtained by dynamic path planning is optimized and improved, and the formula is as follows:

In order to verify the reliability of the adaptive pheromone updating strategy and the applicability of an ant colony-a star hybrid algorithm (AACA & a star), the latest robot world cup rescue simulation system is subjected to experimental verification.

Consider that the hybrid algorithm contains multiple variables, such as pheromone persistence ρ, heuristic factors α, β, pheromone intensity constant Q (Q)₁,q₂,q_comm). In the following experiment, values of the variables are discussed by adopting a method for controlling the variables, a better value is selected for rescue simulation again, and the rescue effect is compared with the final score in multiple angles according to the path planning effect, the success times, the passing rate (blocking times) and the scores of all periods. In order to make the results of multiple experiments more comparative, the unified simulation map is used for carrying out comparison experiments for the Shenhu, and the parameter omega in the formula (2) is₁,ω₂Ambulance team or fire agent (0.5 ), police agent (1, 0), ant number

And the final simulation result is the average value of indexes to be compared in each search period of the three intelligent agents.

As can be easily seen from the formula (4), the sending speed of the pheromone is directly reflected by rho, and the larger the rho is, the pheromones of two adjacent generations areThe larger the difference is, the stronger the positive feedback effect is, and the faster the algorithm convergence speed is. But otherwise, the pheromone can not be volatilized and accumulated normally, and the feedback effect is not obvious. In experiment 1, the influence of the pheromone persistence rho on the ant colony A-satellite mixing algorithm is analyzed through a simulation experiment by using the model provided by the invention. To ensure that the test result is only affected by a single variable ρ, the other variables in the simulation are set as: q ═ Q₁＝q₂＝q_comm＝10，α＝1，β＝5,ρ∈[0.1,0.9]And sampling rho once every 0.2, repeatedly simulating the rho of each sampling value for 10 times, searching the optimal path length of each round, and taking the average value as an experimental result. When the difference between the optimal solutions of the adjacent generations is less than 0.01 or the maximum iteration number is reached (set to 50), the loop is exited.

Taking the ambulance squad as an example, the average value of the optimal solution, the optimal solution value (path length, throughput rate) and the average convergence iteration number obtained after each group of experiments run independently are shown in the table of fig. 2.

As can be seen from the table of fig. 2, the pheromone persistence ρ has a large influence on the convergence of the algorithm. If rho is small, the accumulation of pheromones is weakened, the search is quicker, but the search is mostly in a local extreme value; if rho is large, the pheromone residue causes unobvious positive feedback characteristic and slow convergence. The following conclusions can be drawn in conjunction with the above data: when rho epsilon is 0.5,0.7, the searching efficiency and the result of the algorithm are both ideal.

The heuristic factor alpha reflects the action strength of the residual pheromone, the larger the alpha is, the larger the ant is attracted by the historical experience is, and the attenuation of the randomness possibly causes the algorithm to be premature; beta reflects the strength of heuristic information, the coupling with alpha is larger, one of the beta and the alpha is kept unchanged in the experiment, and the result is analyzed by adjusting another factor. Independent experiments were performed 10 times with ρ 0.6 and other variables the same as in experiment 1 and averaged, the results are shown in the table of fig. 3.

Due to space limitations, only representative experimental results are presented here, and the comprehensive analysis leads to the conclusion that: if the heuristic factors alpha and beta are strongly coupled, if the heuristic factors alpha and beta are both large, the algorithm is excessively dependent on experience and heuristic information, and the algorithm is trapped in a local optimal solution. When beta > > alpha, the degradation is greedy algorithm and premature phenomenon occurs; alpha > > beta, and the searching effect is poor. The search result can be optimized only when the alpha and the beta are in a reasonable range, and the effect is best when the alpha is 1 and the beta belongs to the [2,5] through experiments.

The concentration of pheromones in the ant colony algorithm has an important influence on path planning, and in a simulation experiment, the successful search numbers of the rescue intelligent agent in the ant colony algorithm, the A-star algorithm and the self-adaptive ant colony A-star hybrid algorithm provided by the invention are counted respectively under the same environment. Each algorithm was run independently 10 times, with samples taken at 20 cycles intervals for each run, and the final results were averaged. The "success count" is defined as one count from the beginning of the exploration to the end of the planning program, when the end of the exploration reaches the end. As shown in fig. 4 (left), it is obvious that the ant colony-a-star hybrid algorithm has a higher path planning success rate than the ACA and a-star, and the advantages of the hybrid algorithm become increasingly apparent as the number of cycles increases.

In addition, in the RCRSS, roadblocks on the road segment side become obstacles for the intelligent agent to move forward, so that the rescue is delayed and the efficiency is reduced, so that it is necessary to compare the blocking numbers of the corresponding time periods of the above algorithms. As shown in fig. 4 (right), the optimization differences of the previous algorithms on the blocking times are not obvious. After the 60 th period, the blocking number of the hybrid algorithm proposed by the invention is generally lower than that of the ACA and A star algorithms, which is attributed to the good dynamic and adaptive performance of the algorithm.

A better path planning algorithm can help the intelligent agent to reach the target position in a shorter time to carry out rescue actions, and the probability of blocking the intelligent agent is reduced, so that the final score is greatly improved. Under the condition that simulation parameters are not changed, rescue simulation is carried out on different scenes, each scene independently runs for 10 times, an average value is obtained, and statistics is shown in fig. 5 (left). Meanwhile, for further comparison, a score curve was made thereto, as shown in fig. 5 (right). When the two graphs are combined, the mixing algorithm has better score retentivity, especially in the 40 th-80 th period.

The above description is only a preferred embodiment of the present invention, and the scope of the present invention is not limited to the above embodiment, but equivalent modifications or changes made by those skilled in the art according to the present disclosure should be included in the scope of the present invention as set forth in the appended claims.

Claims

1. The self-adaptive ant colony A-satellite hybrid algorithm based on the target evaluation factor is characterized in that: the method comprises the following steps:

step 1, initializing system parameters and sharing and updating pheromone;

step 2, initializing ants;

and 9, outputting a path and finishing algorithm circulation.

2. The adaptive ant colony-a-star hybrid algorithm based on the target evaluation factor as claimed in claim 1, wherein: the A star algorithm is based on Dijkstra algorithm, guides selection of a next node by introducing global information, and estimates the cost of a target node by a current node, and the general formula is as follows:

f(x)＝g(x)+h(x) (1)

3. The adaptive ant colony-a-star hybrid algorithm based on the target evaluation factor as claimed in claim 1, wherein: in step 2, introducing a target evaluation factor, and defining the target evaluation factor as follows:

in the formula (2), the first and second groups,

the final ant state transition probability to the next node is:

4. The adaptive ant colony-a-star hybrid algorithm based on the target evaluation factor as claimed in claim 1, wherein: the pheromone updating method is shown in formula (4):

5. The adaptive ant colony-a-star hybrid algorithm based on the target evaluation factor as claimed in claim 1, wherein: in step 5, the local pheromone updating specifically comprises the following steps:

wherein N is the total number of ants,

the following improvements were made:

6. The adaptive ant colony-a-star hybrid algorithm based on the target evaluation factor as claimed in claim 1, wherein: in step 5, the global pheromone is updated specifically as follows:

wherein the content of the first and second substances,

σ_k＝(1-μ_ij)·n_ij-rank[k](10)

7. The adaptive ant colony-a-star hybrid algorithm based on the target evaluation factor as claimed in claim 1, wherein: in step 5, the specific steps for sharing and updating the pheromone among the ant colonies are as follows: