CN114115285A - Multi-agent search emotion target path planning method and device

Publication number: CN114115285A
Authority: CN (China)
Legal status: Pending
Application number: CN202111472609.5A
Other languages: Chinese (zh)
Inventors: 岳伟, 辛弘, 刘中常, 邹存名, 李莉莉, 王丽媛
Assignee: Dalian Maritime University (original and current)
Application filed by Dalian Maritime University
Priority to CN202111472609.5A
Publication of CN114115285A

Classifications

    • G05D1/0223 — Control of position or course in two dimensions, specially adapted to land vehicles, with means for defining a desired trajectory, involving speed control of the vehicle
    • G05D1/0214 — … with means for defining a desired trajectory in accordance with safety or protection criteria, e.g. avoiding hazardous areas
    • G05D1/0221 — … with means for defining a desired trajectory involving a learning process
    • G05D1/0276 — … using signals provided by a source external to the vehicle


Abstract

The invention provides a multi-agent path planning method and device for searching an emotion-driven target, wherein the method comprises the following steps: calculating the emotion-based probability of target movement at a given moment from the initial-moment emotional state distribution probability matrix, the basic-emotion self-transition probability matrix, and the emotion-displacement probability matrix; obtaining the emotion-based target probability distribution map model at that moment, obtaining the grid where the target is located at the initial moment, and constructing the initial-moment target probability map model; iterating the target probability and calculating each grid in sequence so as to update the target probability map; constructing a real-time multi-objective function for agent search collaborative optimization from the probability gain, the repeated-path cost, the energy-loss cost, the steering-adjustment cost, and a dynamic self-adaptive cost weight coefficient; and solving the agent search collaborative optimization multi-objective function with an improved multi-wolf-pack algorithm to obtain the final search path planning scheme.

Description

Multi-agent search emotion target path planning method and device
Technical Field
The invention relates to the field of multi-agent search, in particular to a multi-agent search emotion target path planning method and device.
Background
An agent is a computing entity that resides in a certain environment, can continuously and autonomously function, and has the characteristics of residence, reactivity, sociality, initiative and the like. Common agents in the field of actual engineering include unmanned aerial vehicles, unmanned ships, robots, and the like.
At present, research in the technical field of multi-agent search for dynamic targets generally considers only the influence of the target's mobility on its future probable position and ignores the influence of the target's emotion on its movement decisions, so probability map models based on mobility alone are incomplete. In addition, the conventional multi-objective function for collaborative optimization is usually fixed: its weight coefficients cannot be adjusted in real time during the task, so the benefit and cost terms may drift onto different orders of magnitude, the function no longer balances them, and it loses its guiding effect.
Disclosure of Invention
In view of the technical problem that conventional agent path planning schemes cannot effectively execute the task of searching for an emotion-driven target, a multi-agent emotion target search path planning method and system are provided. The invention establishes an emotional state transition model of the target by a Markov analysis method, builds and updates a real-time target probability map by combining the emotion-displacement decision probability with a sensor detection probability model, and then plans paths for multiple agents searching for emotional dynamic targets in an unknown area with an improved multi-wolf-pack algorithm.
The technical means adopted by the invention are as follows:
a multi-agent search emotion target path planning method comprises the following steps:
acquiring the emotional state probability distribution matrix at a given moment through a Markov chain emotion self-transition model, based on the basic emotion set, the emotional state self-transition probability matrix, and the initial-moment emotional state distribution probability matrix;
constructing a grid system over the moving range of the search target, iterating the target probability of each grid and, combining this with the emotion-displacement conversion probability matrix, calculating each grid in sequence so as to update the target probability map;
defining the initial moment as the moment the target disappears after the agents' first early warning, acquiring the grid where the target is located at that moment, and constructing the initial-moment target probability map model;
constructing a real-time adaptive multi-objective function for agent search collaborative optimization from the probability gain, the repeated-path cost, the energy-loss cost, the steering-adjustment cost, and a real-time dynamic adaptive cost weight coefficient;
and solving the multi-agent search collaborative optimization real-time multi-objective function with an improved multi-wolf-pack algorithm to obtain the final search path planning scheme.
Further, iterating the target probability of each grid includes:
setting the displacement decision of the target at each time step to have at most nine cases, and defining in turn the divergent displacement set of the current grid, the grid set corresponding to the divergent displacement set, the gathering displacement set, and the grid set corresponding to the gathering displacement set;
calculating the probability that every grid in the gathering displacement grid set moves to the corresponding central grid at a given moment, i.e. summing the gathering displacement probability set of that grid; solving, via the emotional state and displacement decision, the probability that the corresponding central grid contains the target; and using this probability to update, in real time during the task, the target-presence probability of grids not detected at that moment.
Furthermore, the method also comprises designing an agent sensor detection probability and false-alarm probability model according to weather visibility, used to update, in real time during the task, the target-presence probability of grids detected at a given moment.
Further, solving the multi-agent search collaborative optimization real-time multi-objective function with the improved multi-wolf-pack algorithm includes calculating the step factor from the base value of the artificial wolf's step factor, the potential field function value at a common artificial wolf's position after an iteration, and a preset potential field influence factor; the potential field function value at that position is obtained from the attractive potential field function and the repulsive potential field function at the common artificial wolf's position during the iteration.
Further, solving the multi-agent search collaborative optimization real-time multi-objective function with the improved multi-wolf-pack algorithm includes setting up a howling link to realize information sharing among wolf packs, specifically:
a. comparing to obtain the scent concentration corresponding to the candidate optimal solution within the wolf pack;
b. receiving optimal solution information among other wolf groups;
c. judging whether the solution meets the global requirement of the algorithm: if it overlaps with the exploration range of other known artificial wolves, penalizing its function value and returning to step a; otherwise, going to step d;
d. judging whether the solution meets the constraint conditions: if not, selecting a suboptimal solution and repeating step d; if so, going to step e;
e. this solution is distributed among all wolf groups.
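The five howling steps can be sketched as a single selection routine. This is a minimal sketch under stated assumptions: the scent values, the penalty factor, and the feasibility predicate are illustrative, not taken from the patent.

```python
def howl_share(candidates, explored_by_others, feasible, penalty=0.5):
    """Sketch of the howling link: penalise candidate solutions that
    duplicate other packs' explored ranges, then broadcast the
    best-scented solution that satisfies the constraints.

    candidates: list of (solution, scent) pairs (scent = fitness value).
    explored_by_others: set of solutions already explored by other packs.
    feasible: predicate implementing the constraint check of step d.
    """
    scores = dict(candidates)
    # Step c: penalise the function value of duplicated solutions.
    for sol in scores:
        if sol in explored_by_others:
            scores[sol] *= penalty
    # Steps a and d: take the best-scented solution that meets the
    # constraints, falling back to the next-best otherwise.
    for sol in sorted(scores, key=scores.get, reverse=True):
        if feasible(sol):
            return sol  # step e: this solution is broadcast to all packs
    return None
```

In this sketch the penalty simply re-ranks the duplicated solution rather than looping back explicitly, which has the same effect as steps a–c of the text.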
Further, solving the agent search collaborative optimization multi-objective function with the improved multi-wolf-pack algorithm includes eliminating artificial wolves according to both the value and the rate of change of their scent concentration, and generating the same number of new artificial wolves to replace those eliminated.
The invention also discloses an electronic device comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor executes the above multi-agent search emotion target path planning method by running the computer program.
Compared with the prior art, the invention has the following advantages:
1. The target probability map update model adopted by the invention can update the target probability map during the task. Compared with the prior art, which can search only on the fixed information of an a priori target probability map, the method is real-time and can locate the target's probable position more accurately in the task process.
2. The invention adopts a real-time self-adaptive multi-objective function to improve the real-time performance of the multi-objective function during the task and preserve its guiding role for the multi-agent system.
3. Addressing the shortcomings of the traditional wolf pack algorithm, the invention improves the algorithm in three aspects: 1) the step factor is adjusted with an artificial potential field method, the potential function value being negatively correlated with the step factor and positively correlated with the step, so that wolves continually imitate and learn the exploration rules of better wolves during the search; the optimization process is thereby more flexible and stable and avoids overshooting the optimal solution; 2) multiple wolf packs are established to solve for the optimal tracks of multiple agents, and a howling link is added to strengthen information exchange among packs and prevent duplicated exploration of the space; 3) a sound artificial-wolf update-and-elimination mechanism keeps wolves with good exploration performance in the pack as far as possible, prevents the algorithm from degenerating into random search through excessive elimination, and preserves the diversity of individuals in the pack.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a graph of single step emotion target divergent displacement in an embodiment of the present invention.
FIG. 2 is a single step emotion target gathering displacement diagram in an embodiment of the present invention.
Fig. 3 is a structural diagram of multi-wolf group collaborative search in the embodiment of the present invention.
FIG. 4 is a flowchart of the IMWPA algorithm in an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The invention provides a multi-agent search emotion target path planning method which mainly comprises: a Markov-analysis emotion target probability map modeling step; a step of constructing a decoupled model of the sensor detection probability and false-alarm probability; a step of designing the multi-objective function of the search performance index; and a step of solving the objective function with an Improved Multi-Wolf Pack Algorithm (IMWPA).
A multi-agent search emotion target path planning method comprises the following steps:
s1, acquiring a preset emotional state probability distribution matrix at a certain moment based on the basic emotional set, the emotional state self-transition probability matrix and the initial moment emotional state distribution probability matrix, and acquiring the emotional state probability distribution matrix at a certain moment through the Markov chain emotional self-transition model. Specifically, the method comprises the following steps:
Firstly, the emotion target probability map is modeled by the Markov analysis method, as follows:
Let $\pi = (\pi_1, \pi_2, \ldots, \pi_n)_{1\times n}$ be the emotional state distribution probability matrix at the initial moment, $E = \{E_1, E_2, \ldots, E_n\}$ the set of $n$ basic emotions, and $D = \{D_1, D_2, \ldots, D_g\}$ ($g \ge n$) the displacement decision set, i.e. each emotional state corresponds to one or more displacement decisions.

$$A_n = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}$$

$$a_{ij} = P\big(E(t_{k+1}) = E_j \mid E(t_k) = E_i\big), \quad i, j \in \{1, 2, \ldots, n\} \qquad (1)$$

where $A_n$ is the emotional state self-transition probability matrix and $a_{ij}$ is the transition probability from emotional state $E(t_k) = E_i$ at one moment to $E(t_{k+1}) = E_j$ at the next. Each $a_{ij}$ is non-negative, and each row (i.e. at any moment) sums to 1 over all possible next emotional states.
According to the Markov chain, the emotional state distribution after $k$ steps from the initial moment equals the initial distribution multiplied by the successive transition matrices, i.e.

$$\Pi(t_k) = \pi A_n^{k-1} \qquad (2)$$

In summary, the emotional state probability distribution matrix from the initial moment to a moment $t_k$ is expressed as

$$\Pi(t_k) = \big(\pi_1(t_k), \pi_2(t_k), \ldots, \pi_n(t_k)\big) \qquad (3)$$

The emotional state probability distribution matrix $\Pi(t_k)$ gives, at $t_k$, the probability corresponding to each emotion in the emotional state set. On this basis, mapping each emotion to its displacements yields the displacement probability of the target at a given moment.
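As a minimal sketch of the Markov emotion self-transition of equations (1)–(3), the following pure-Python snippet propagates an initial distribution through a self-transition matrix for $k$ steps. The three-emotion set and all numeric values are illustrative assumptions, not taken from the patent.

```python
def mat_vec(pi, A):
    """One Markov step: new_pi[j] = sum_i pi[i] * A[i][j]."""
    n = len(A)
    return [sum(pi[i] * A[i][j] for i in range(n)) for j in range(n)]

def emotion_distribution(pi0, A, k):
    """Emotional state distribution after k self-transition steps
    starting from the initial distribution pi0 (a row vector)."""
    pi = list(pi0)
    for _ in range(k):
        pi = mat_vec(pi, A)
    return pi

# Hypothetical 3-emotion set (e.g. calm, fear, anger); values illustrative.
pi0 = [0.6, 0.3, 0.1]          # initial-moment distribution pi
A = [[0.7, 0.2, 0.1],          # self-transition matrix: rows sum to 1
     [0.3, 0.5, 0.2],
     [0.2, 0.3, 0.5]]
pi_k = emotion_distribution(pi0, A, 4)
```

Because each row of the transition matrix sums to 1, the propagated vector remains a probability distribution at every step.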
S2, constructing a grid system over the moving range of the search target, iterating the target probability of each grid and, combining this with the emotion-displacement conversion probability matrix, calculating each grid in sequence so as to update the target probability map. The initial moment is defined as the moment the target disappears after the agents' first early warning; the grid where the target is located at that moment is acquired, and the initial-moment target probability map model is constructed.
Specifically, the target probability map is first initialized. Defining the initial moment $t_0$ as the moment the target disappears after the first early warning, and the grid where the target is located as $(x_T(t_0), y_T(t_0))$, then

$$P_{(x_m, y_m)}(t_0) = \begin{cases} 1, & (x_m, y_m) = (x_T(t_0), y_T(t_0)) \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$

where $(x_m, y_m)$ numbers the grids in the task area by coordinates. This equation establishes the target probability map model at the initial moment and serves as the basis for the subsequent iterative updates of the map.
The target probability of grids not participating in the search at a given moment is then updated. Specifically, the target probability $P_{(x_m, y_m)}(t_k)$ of grid $(x_m, y_m)$ at $t_k$ is iteratively updated according to the Markov chain emotion self-transition model.
Further, iterating the target probability of each grid includes:
a. setting the displacement decision of the target at each time step to have at most nine cases, and defining in turn the divergent displacement set of the current grid, the grid set corresponding to the divergent displacement set, the gathering displacement set, and the grid set corresponding to the gathering displacement set;
b. calculating the probability that every grid in the gathering displacement grid set moves to the corresponding central grid at a given moment, i.e. summing the gathering displacement probability set of that grid; solving, via the emotional state and displacement decision, the probability that the corresponding central grid contains the target; and using this probability to update, in real time during the task, the target-presence probability of grids not detected at that moment.
Specifically, the invention sets the displacement decision of the target within each time step to have at most nine cases, and defines the divergent displacement set of grid $(x_m, y_m)$ as central grid as $D = \{D_1, D_2, \ldots, D_j, \ldots, D_9\}$, $D_j = (\Delta x_j, \Delta y_j)$, where $\Delta x_j, \Delta y_j \in \{0, \pm 1\}$ $(j = 1, 2, \ldots, 9)$; the single-step divergent displacement cases are shown in FIG. 1. The grid set corresponding to the divergent displacement set of central grid $(x_m, y_m)$ is defined as $G_m = \{G_1, G_2, \ldots, G_j, \ldots, G_9\}$, $G_j = (x_m + \Delta x_j, y_m + \Delta y_j)$, where $\Delta x_j, \Delta y_j \in \{0, \pm 1\}$ $(j = 1, 2, \ldots, 9)$. The displacement grid set $G_m$ corresponds one-to-one with the divergent displacement set $D$.

The gathering displacement set is defined as $\bar{D} = \{\bar{D}_1, \bar{D}_2, \ldots, \bar{D}_j, \ldots, \bar{D}_9\}$, $\bar{D}_j = (-\Delta x_j, -\Delta y_j)$, where $\Delta x_j, \Delta y_j \in \{0, \pm 1\}$ $(j = 1, 2, \ldots, 9)$; the set $\bar{D}$ takes the directions opposite to the displacements in $D$, i.e. from the displacement grids $G_m$ toward grid $(x_m, y_m)$, as shown in FIG. 2.
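The nine-case displacement sets described above can be enumerated directly. The sketch below builds the divergent set, the neighbour grid set, and the gathering set (opposite directions) for an arbitrary centre grid; the centre coordinates are illustrative.

```python
def displacement_sets(xm, ym):
    """Enumerate the nine single-step displacements (dx, dy in {-1, 0, 1}),
    the corresponding neighbour grid set G_m around centre (xm, ym),
    and the gathering set taking the opposite directions."""
    D = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]   # divergent set
    G = [(xm + dx, ym + dy) for dx, dy in D]                   # grid set G_m
    D_bar = [(-dx, -dy) for dx, dy in D]                       # gathering set
    return D, G, D_bar

D, G, D_bar = displacement_sets(5, 5)
```

Since the nine displacements are symmetric about the origin, the gathering set contains the same nine vectors, only paired with opposite directions element-wise.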
The gathering displacement probability distribution matrix at $t_k$, $\bar{P}_m(t_k)$, represents the set of probabilities that the grids in the displacement grid set $G_m$ move toward $(x_m, y_m)$, and is calculated as follows:

$$\bar{P}_m(t_k) = \Pi(t_k) \, B \qquad (5)$$

where

$$B = \big( b_{ij} \big)_{n \times 9}, \quad b_{ij} = P\big(\bar{D}_j \mid E_i\big) \qquad (6)$$

This equation links the emotional states with the displacement decisions: $B$ is the emotion-displacement probability matrix, whose row vectors give the probability set of the nine displacements under each emotion and whose column vectors give the probability values of a given displacement under the different emotions; $b_{ij}$ denotes the probability of performing displacement $\bar{D}_j$ in emotional state $E_i$. Each $b_{ij}$ is non-negative, and the probabilities of performing all displacements sum to 1 in any emotional state.
Since a target located on grid $(x_m, y_m)$ can only have arrived there within one time step from its displacement grid set $G_m$, calculating the probability that a grid contains the target at $t_k$ is equivalent to summing, over all grids in $G_m$, the probability of moving toward $(x_m, y_m)$ at $t_k$. The probability $P_{(x_m, y_m)}(t_k)$ is therefore calculated as

$$P_{(x_m, y_m)}(t_k) = \sum_{j=1}^{9} P_{G_j}(t_{k-1}) \, \bar{p}_{G_j}(t_k), \quad (x_m, y_m) \notin G_S(t_k) \qquad (7)$$

where $P_{G_j}(t_{k-1})$ is the probability value of grid $G_j$ of the displacement grid set $G_m$ at the previous moment, and $\bar{p}_{G_j}(t_k)$ is the probability that grid $G_j$ moves toward $(x_m, y_m)$ at $t_k$, i.e. the entry of the gathering displacement probability distribution matrix $\bar{P}_m(t_k)$ of equation (5) corresponding to displacement $\bar{D}_j$. $G_S(t_k)$ denotes the set of grids searched by the agents at $t_k$.

The target probability of every grid not participating in the multi-agent search at a given moment is updated according to equation (7).
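The update of undetected grids can be sketched as follows. For simplicity this assumes a single uniform probability of moving between adjacent grids; the patent instead derives the per-displacement probabilities from the emotional state distribution and the emotion-displacement matrix, so the `move_prob` argument is an illustrative stand-in.

```python
def update_unsearched(P_prev, move_prob, searched):
    """One iteration of the target probability map for grids not searched
    at t_k: the new probability of a grid is the sum, over its nine
    neighbours g (including itself), of P_prev[g] times the probability
    that a target on g moves onto the grid (the gathering-displacement sum).

    P_prev: 2-D list of previous-step probabilities.
    move_prob: probability of each single-step move (illustrative, uniform).
    searched: set of (x, y) grids searched this step (updated elsewhere).
    """
    rows, cols = len(P_prev), len(P_prev[0])
    P_new = [[0.0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            if (x, y) in searched:
                continue                       # detected grids use the sensor update
            total = 0.0
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    gx, gy = x + dx, y + dy
                    if 0 <= gx < rows and 0 <= gy < cols:
                        total += P_prev[gx][gy] * move_prob
            P_new[x][y] = total
    return P_new
```

With a uniform `move_prob` of 1/9, probability mass placed on an interior grid is exactly redistributed over its nine-grid neighbourhood.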
The steps further include:
S21, designing an agent sensor detection probability and false-alarm probability model according to weather visibility, used to update, in real time during the task, the target probability of grids detected at a given moment.
In particular, during the task the agent's sensors are affected by the environment, which reduces search accuracy. This patent designs a set of sensor detection probability and false-alarm probability models according to weather visibility (equations (8) and (9); the published formula images are not recoverable from the source), characterized as follows:

the detection probability $p_d \in [0, 1]$ is the probability that, when a target really exists in the grid, the sensor's detection result reports a target. $\rho_\chi$ denotes the fog concentration detected by the sensor, $\mu_\chi$ the influence coefficient of fog on the detection probability, and the dense-fog concentration $\rho_{\max}$ is a constant: when the fog concentration exceeds $\rho_{\max}$, the sensor loses its detection capability. The false-alarm rate $p_f \in [0, 1]$ is the probability that, when the grid actually contains no target, the sensor's detection result reports a target; when the fog concentration is below a lower threshold, no false alarm can occur, and when it exceeds $\rho_{\max}$, the sensor loses detection capability and its results are not trusted.
The agent sensor system's judgment of the probability that grid $(x_m, y_m)$ contains a target is jointly determined by the event that agent $s$ detects a target in the grid at $t_k$ and the event that the grid actually contains a target. Based on the Bayes rule, the probability update $P^s_{(x_m, y_m)}(t_k)$ of the grid $(x_m, y_m)$ searched by agent $s$ at $t_k$ is designed as

$$P^s_{(x_m, y_m)}(t_k) = \begin{cases} \dfrac{p_d \, P_{(x_m,y_m)}(t_{k-1})}{p_d \, P_{(x_m,y_m)}(t_{k-1}) + p_f \big(1 - P_{(x_m,y_m)}(t_{k-1})\big)}, & \text{target detected} \\[2ex] \dfrac{(1 - p_d) \, P_{(x_m,y_m)}(t_{k-1})}{(1 - p_d) \, P_{(x_m,y_m)}(t_{k-1}) + (1 - p_f) \big(1 - P_{(x_m,y_m)}(t_{k-1})\big)}, & \text{target not detected} \end{cases} \qquad (10)$$

Every grid participating in the multi-agent search at a given moment is updated according to equation (10); combined with equation (7), the full target probability map at that moment is obtained.
S3, constructing a real-time adaptive multi-objective function for agent search collaborative optimization from the probability gain, the repeated-path cost, the energy-loss cost, the steering-adjustment cost, and a real-time dynamic adaptive cost weight coefficient.
In particular, before the performance index, the mobility constraint $C_k$ and the collision-avoidance constraint $C_d$ of the agents must be considered:

$$C_k: \; \big|\Delta\varphi_s(t_k)\big| \le \Delta\varphi_{\max}, \qquad C_d: \; d_{ab}(t_k) \ge d_{\min}, \;\; a, b = 1, 2, \ldots, N_S, \; a \ne b$$

where $\Delta\varphi_s(t_k)$ is the steering angle of the agent at $t_k$, $\Delta\varphi_{\max}$ is the agent's maximum steering angle, $d_{ab}(t_k)$ is the distance between agents $a$ and $b$ at $t_k$, and $d_{\min}$ is the minimum safe distance between agents.
According to the actual situation, the collaborative optimization problem of the system at $t_k$ is described as the multi-objective function $F(t_k)$:

$$F(t_k) = R_P(t_k) - \omega(t_k)\big[J_O(t_k) + J_E(t_k) + J_A(t_k)\big] \qquad (11)$$

where $R_P$ is the probability gain, $J_O$ the repeated-path cost, $J_E$ the energy-loss cost, $J_A$ the steering-adjustment cost, and $\omega(t_k)$ the dynamic self-adaptive cost weight coefficient, represented by the average probability value of the grids so as to keep the probability gain and the other costs on the same order of magnitude; by means of this real-time multi-objective function, the path benefit can be better measured. It is calculated as

$$\omega(t_k) = \frac{1}{N_{cell}} \sum_{(x_m, y_m) \in \Omega} P_{(x_m, y_m)}(t_k) \qquad (12)$$

where the sum runs over the probabilities of all grids in the task area, $N_{cell}$ is the total number of grids in the task area, and $\Omega$ is the task area.
(1) Probability gain $R_P(t_k)$

The probability gain $R_P(t_k)$ describes the probability-value gain corresponding to grid $(x_m, y_m)$ at $t_k$, calculated as

$$R_P(t_k) = k_p \, P_{(x_m, y_m)}(t_k) \qquad (13)$$

where $k_p$ is the probability gain coefficient.
(2) Path repetition cost $J_O(t_k)$

To optimize the search path and avoid the collision risk, wasted search time, and energy loss caused by repeated paths, the problem of repeated path selection must be considered, and the path repetition cost function $J_O(t_k)$ is introduced:

$$J_O(t_k) = k_o \, \mathrm{card}\big(L_a(t_k) \cap L_b(t_k)\big) \qquad (14)$$

where $k_o$ is the path repetition cost coefficient, $L_{(\cdot)}$ is the grid set covered by an agent's search path, and $\mathrm{card}(\cdot)$ is the number of elements in a set.
(3) Energy loss cost $J_E(t_k)$

In a sea area it is difficult to resupply the agent while it executes its task, and the agent's task range is limited, so energy consumption must be optimized in view of the agent's endurance. The cost function $J_E(t_k)$ is introduced to describe the energy consumed by the agent in executing the task:

$$J_E(t_k) = J_k(t_k) + J_f(t_k) \qquad (15)$$

where $J_k(t_k)$ is the mechanical energy loss cost, expressed as fuel consumption, and $J_f(t_k)$ is the electrical power loss cost, including the power consumption of the various electronic instruments.
(4) Steering adjustment cost J_A(t_k)
An agent moves quickly while executing the task, so an excessive change of steering angle can introduce safety instability, and overly large steering angles are also undesirable for track smoothness and energy consumption. The method therefore designs a track adjustment cost J_A(t_k), expressed as follows:
[formula rendered as an image in the original]
where k_a is the track adjustment cost coefficient.
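Taken together, the probability profit and the three cost terms form the agent's real-time objective. Since the expressions for R_P, J_O, and J_A are rendered as images in the source, the Python sketch below uses plausible stand-in forms (profit proportional to grid probability, repetition cost counting re-covered grids via card(·), steering cost proportional to the heading change); only J_E = J_k + J_f follows formula (15) directly. It illustrates the structure, not the patented expressions.

```python
# Hedged sketch of the per-grid objective from this section.
# The exact expressions for R_P, J_O and J_A are images in the source,
# so the forms below are plausible stand-ins, not the patented formulas.

def probability_profit(k_p, p_grid):
    """Assumed form: reward proportional to the grid's target probability."""
    return k_p * p_grid

def path_repetition_cost(k_o, candidate_cells, visited_cells):
    """Assumed form: k_o times card(L ∩ visited), the number of re-covered grids."""
    return k_o * len(set(candidate_cells) & set(visited_cells))

def energy_cost(j_mech, j_elec):
    """Formula (15): J_E = J_k + J_f (fuel plus electronic-instrument power)."""
    return j_mech + j_elec

def steering_cost(k_a, heading_change_rad):
    """Assumed form: penalty growing with the magnitude of the turn."""
    return k_a * abs(heading_change_rad)

def objective(k_p, p_grid, k_o, cand, seen, j_mech, j_elec, k_a, dpsi):
    """Profit minus the three costs. The patent additionally weights these
    terms with real-time dynamic adaptive coefficients, omitted here."""
    return (probability_profit(k_p, p_grid)
            - path_repetition_cost(k_o, cand, seen)
            - energy_cost(j_mech, j_elec)
            - steering_cost(k_a, dpsi))
```

In the patent these terms also carry the real-time dynamic adaptive cost weight coefficients of step S3, which this sketch leaves out.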
S4, solving the multi-agent search collaborative optimization real-time multi-objective function based on the improved multi-wolf-pack algorithm to obtain the final search path planning scheme. This specifically comprises the following steps:
S41, calculating a step-size factor based on the base value of the artificial-wolf step-size factor, the potential field function value at the position of a given ordinary artificial wolf after an iteration, and a preset potential field influence factor; the potential field function value at that wolf's position is obtained from the attractive potential field function and the repulsive potential field function at its position during the iteration.
Specifically, the step length of an artificial wolf is adjusted with an artificial potential field method. The head wolf h exerts a corresponding attractive force on each ordinary artificial wolf i, and repulsive forces arise between ordinary artificial wolves. The step-size factor S(i) is proportional to the fineness of the search and inversely proportional to the step length, and adjusting it appropriately makes the algorithm more flexible. The step-size factor S(i) is designed as follows:
[formula rendered as an image in the original]
where S_0 represents the base value of the artificial-wolf step-size factor, U_i(I) represents the potential field function value at the position of wolf i at the I-th iteration, and λ represents the potential field influence factor. This expression indicates that the step-size factor is influenced by the potential function value at artificial wolf i's position.
The potential field function U_i(I) takes the following form:
[formula rendered as an image in the original]
in which the first symbol (an image in the original) denotes the attractive potential field function at the position of wolf i at the I-th iteration, the second (also an image) denotes the repulsive potential field function at that position, d(h, i) represents the distance between ordinary wolf i and head wolf h, and Q(I) represents the distance threshold between ordinary wolves.
The attractive potential field function (its symbol is rendered as an image in the original) is calculated as follows:
[formula rendered as an image in the original]
where ζ represents the attractive gain. IMWPA computes ζ as follows:
[formula rendered as an image in the original]
where k_h is the head-wolf attraction coefficient, the image-rendered quantity represents the number of times the generation-I artificial wolf has selected the head wolf, and D represents the dimension of the algorithm's exploration space.
The repulsive potential field function (its symbol is rendered as an image in the original) is expressed as follows:
[formula rendered as an image in the original]
where μ represents the repulsive gain and D_i(I) denotes the distance between the position of wolf i and its nearest ordinary wolf at iteration I; beyond this distance no repulsive force is generated.
In IMWPA the potential function value is negatively correlated with the step-size factor and positively correlated with the step length, and during exploration each wolf continually imitates and learns the exploration rules of better wolves, making the optimization process more flexible and stable.
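The step-size adjustment of S41 can be sketched with the standard artificial-potential-field forms. The patent's own formulas are images in the source, so the quadratic attraction, threshold-limited repulsion, and the particular S(i) below are assumptions that merely realize the stated monotonicity: a larger potential value gives a smaller step-size factor and hence a larger step.

```python
# Hedged sketch of the APF-based step-size adjustment in S41. The
# attractive/repulsive forms are the standard APF choices, not the
# patent's (image-rendered) formulas.

def attractive_potential(zeta, d_head):
    """Standard APF attraction toward head wolf h: U_a = 0.5 * zeta * d^2."""
    return 0.5 * zeta * d_head ** 2

def repulsive_potential(mu, d_nearest, d_threshold):
    """Standard APF repulsion from the nearest ordinary wolf; zero beyond
    the threshold distance, as stated after the repulsive-field formula."""
    if d_nearest >= d_threshold:
        return 0.0
    return 0.5 * mu * (1.0 / d_nearest - 1.0 / d_threshold) ** 2

def step_size_factor(s0, potential, lam):
    """Assumed form: negatively correlated with the potential value U_i(I),
    consistent with the monotonicity the text describes."""
    return s0 / (1.0 + lam * potential)
```

The total potential would then be U_i(I) = attractive_potential(...) + repulsive_potential(...), fed into step_size_factor together with the base value S_0 and influence factor λ.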
S42, a "howling" link is set up to realize information sharing among the wolf packs.
Specifically, as a multi-pack algorithm IMWPA needs a howling link to share information among the packs; adding it prevents overlap of the exploration space, improves the algorithm's global exploration capability, and reduces the computational complexity of a single pack. IMWPA defines the howling steps executed by a wolf pack as follows:
a. Compare to obtain the odor concentration (its symbol is rendered as an image in the original) corresponding to the candidate head wolf of pack WP_ξ.
b. Receive the maximum odor-concentration information from the other packs.
c. Judge whether the solution meets the algorithm's global requirement: if it repeats the search range of other known artificial wolves, penalize the function value (as shown in formula (21)) to obtain the candidate wolf's final odor concentration, then go to step a; otherwise, go to step d.
d. Judge whether the solution satisfies the constraints: if constraints C_k and C_d are not satisfied, select the suboptimal solution and return to step d; if they are satisfied, take this wolf as the head wolf and its solution as the head-wolf solution, then go to step e.
e. The pack howls the head-wolf position x_id to all wolf packs.
The penalty applied to the odor concentration of the howling candidate head wolf is as follows:
[formula (21), rendered as an image in the original]
where k_z ∈ [0, 1] represents the search-space repetition penalty coefficient and the image-rendered quantity denotes the number of artificial wolves in pack WP_ξ. The structure of the multi-wolf-pack collaborative search is shown in fig. 3.
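The penalty in step c can be sketched as follows. Formula (21) is an image in the source, so a simple multiplicative penalty with k_z ∈ [0, 1] is assumed, and the overlap test is an illustrative one-dimensional stand-in.

```python
# Hedged sketch of the howling penalty in S42. Formula (21) is an image in
# the source; a multiplicative penalty with k_z in [0, 1] is assumed here.

def overlaps(solution, explored_ranges, radius):
    """Assumption: a solution 'repeats' another pack's search range when it
    lies within `radius` of any point that pack has already explored."""
    return any(abs(solution - p) <= radius
               for pts in explored_ranges for p in pts)

def howl_concentration(y_candidate, solution, explored_ranges, k_z, radius):
    """Penalize the candidate head wolf's odor concentration when its
    solution duplicates a region other packs already cover (step c)."""
    if overlaps(solution, explored_ranges, radius):
        return k_z * y_candidate
    return y_candidate
```

Only after the (possibly penalized) concentration survives the constraint check of step d is the solution howled to the other packs.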
In addition, as a heuristic algorithm, randomness is one of its defining characteristics; using that randomness more systematically allows artificial wolves to be screened more flexibly and drives up the quality of future generations.
The elimination update in traditional wolf-pack search uses only a last-place elimination mechanism based on odor concentration. Because the number of wolves eliminated affects the algorithm's performance, IMWPA refines the elimination-update mechanism to prevent the algorithm from degenerating into random search when too many wolves are eliminated and to preserve the diversity of individuals in the pack.
IMWPA starts from the elimination condition itself, imposing requirements on an artificial wolf in terms of both the value and the rate of change of its odor concentration. IMWPA stipulates that an artificial wolf is eliminated only when it meets both of the following conditions simultaneously:
Value criterion: the odor concentration value of the artificial wolf ranks among the smallest R, with
[formula for R, rendered as an image in the original]
where γ is the population update scale factor and S_num is the number of scout wolves in the pack;
Rate criterion: the per-iteration increase of the objective function, ΔY_i^ξ(I) = Y_i^ξ(I) − Y_i^ξ(I−1), I ∈ [1, I_max], also ranks among the smallest R. Here Y_i^ξ(I) is the odor concentration of the generation-I artificial wolf and I_max is the maximum number of iterations.
After these artificial wolves are eliminated, the pack randomly generates new artificial wolves equal in number to those eliminated.
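The dual-criterion elimination can be sketched as below. The formula defining R is an image in the source; R = ⌈γ · S_num⌉ is assumed here, and a wolf is removed only when it ranks among the worst R by both odor-concentration value and per-iteration improvement.

```python
import math
import random

# Hedged sketch of the dual-criterion elimination update. The formula giving
# R (an image in the source) is assumed to be R = ceil(gamma * S_num); a wolf
# is removed only if it is in the smallest R by BOTH concentration value and
# per-iteration improvement, then replaced by a random newcomer.

def eliminate_and_refill(values, deltas, gamma, s_num, lo, hi, rng=random):
    r = math.ceil(gamma * s_num)
    worst_by_value = set(sorted(range(len(values)), key=values.__getitem__)[:r])
    worst_by_delta = set(sorted(range(len(deltas)), key=deltas.__getitem__)[:r])
    doomed = worst_by_value & worst_by_delta      # must satisfy BOTH criteria
    new_values = list(values)
    for i in doomed:                              # same number regenerated
        new_values[i] = rng.uniform(lo, hi)
    return new_values, doomed
```

Requiring both criteria keeps the eliminated count small, which is exactly the safeguard against drifting into random search that the text describes.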
The wolf pack searches for and captures prey in a D-dimensional space, and the optimization variables of the wolf-pack algorithm range over [min_d, max_d], d ∈ [1, D]. The iteration counter of the algorithm is I; the odor concentration of artificial wolf i of pack ξ is Y_i^ξ and that of the head wolf is rendered as an image in the original; the position of an artificial wolf is x_id; the number of head-wolf wandering steps is T; the distance between head wolf h and ordinary wolf i is d(h, i); the encirclement judgment distance is d_near; and the number of targets found by the agents is likewise rendered as an image in the original. The flow chart of applying IMWPA to agent search is shown in Fig. 4.
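The end-to-end flow of Fig. 4 can be condensed into a minimal one-dimensional loop. Everything here is an illustrative stand-in: the patent's wandering, summoning, and encircling behaviors, the APF step adjustment, and the howling link are collapsed into a fixed step toward the head wolf plus a replace-the-worst update.

```python
import random

# Hedged, heavily simplified skeleton of the IMWPA search loop (Fig. 4).
# Function names and the 1-D objective are illustrative; this is not the
# patented algorithm, only its coarse shape.

def imwpa_minimal(objective, lo, hi, n_wolves=10, iters=50, seed=0):
    rng = random.Random(seed)
    wolves = [rng.uniform(lo, hi) for _ in range(n_wolves)]
    for _ in range(iters):
        head = max(wolves, key=objective)      # head wolf = best odor concentration
        step = 0.1 * (hi - lo)                 # fixed step; the patent adapts it via APF
        moved = []
        for w in wolves:
            if w == head:
                moved.append(w)                # head wolf holds its position
            else:                              # others drift toward the head wolf
                w2 = w + step * rng.random() * (1 if head > w else -1)
                moved.append(min(hi, max(lo, w2)))
        wolves = moved
        worst = min(range(n_wolves), key=lambda i: objective(wolves[i]))
        wolves[worst] = rng.uniform(lo, hi)    # elimination update: respawn the worst
    return max(wolves, key=objective)
```

With a fixed seed the run is deterministic, which makes the sketch easy to experiment with even though it omits every IMWPA refinement described above.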
The invention also discloses an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the above multi-agent search emotion target path planning method by running the computer program.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (7)

1. A multi-agent search emotion target path planning method is characterized by comprising the following steps:
acquiring a preset emotional state probability distribution matrix at a certain moment through a Markov chain emotional self-transfer model based on a basic emotional set, an emotional state self-transfer probability matrix and an initial moment emotional state distribution probability matrix;
constructing a grid system in the moving range of a search target, iterating the target probability of each grid, combining an emotion-displacement conversion probability matrix on the basis, sequentially calculating each grid so as to update a target probability graph, defining the initial moment when the search target disappears after the first early warning of the intelligent agent, acquiring the grid of the target at the moment, and constructing a target probability graph model of the initial moment;
constructing an intelligent search collaborative optimization real-time adaptive multi-target function based on the probability gain, the cost of a repeated path, the energy loss cost, the steering adjustment cost and the real-time dynamic adaptive cost weight coefficient;
and solving the multi-agent search collaborative optimization real-time multi-objective function based on an improved multi-wolf pack algorithm to obtain a final search path planning scheme.
2. The multi-agent search emotion target path planning method of claim 1, wherein iterating the target probabilities for each grid comprises:
setting that at most nine conditions exist in displacement decision of each time step of a target, and sequentially defining a divergent displacement set corresponding to a current grid, a grid set corresponding to the divergent displacement set, a gathered displacement set and a grid set corresponding to the gathered displacement set;
calculating the probability that all grids in the convergent-displacement grid set move to the corresponding central grid at a given moment, namely summing the convergent-displacement probability set of a grid, solving the target-existence probability of the corresponding central grid from the emotional state and the displacement decision, and using that probability as the real-time updated target-existence probability of an undetected grid at a given moment during the task.
3. The multi-agent search emotion target path planning method of claim 1, wherein the method further comprises designing an agent sensor detection probability and false alarm probability model according to weather visibility as a grid existence target probability detected at a certain moment in time in a task.
4. The multi-agent search emotion target path planning method of claim 1, wherein solving the multi-agent search collaborative optimization real-time multi-target function based on an improved multi-wolf swarm algorithm comprises calculating a step factor based on an artificial wolf step factor base value, a potential field function value of a position where a certain general artificial wolf is located after iteration, and a preset potential field influence factor; and the potential field function value of the position of the certain common artificial wolf after iteration is obtained according to the attraction potential field function of the position of the common artificial wolf during iteration and the repulsion potential field function of the position of the common artificial wolf during iteration.
5. The multi-agent search emotion target path planning method of claim 1, wherein solving the multi-agent search collaborative optimization real-time multi-objective function based on an improved multi-wolf pack algorithm includes setting a "howling" link to achieve information sharing between wolf packs, and specifically includes the steps of:
a. comparing to obtain the odor concentration corresponding to the candidate optimal solution in the wolf pack;
b. receiving optimal solution information among other wolf groups;
c. judging whether the solution meets the global requirement of the algorithm, if the solution is repeated with other known artificial wolf exploration ranges, punishing the function value and then turning to the step a; otherwise, go to step d;
d. judging whether the solution meets the constraint condition: if not, selecting a suboptimal solution and returning to step d; if so, going to step e;
e. this solution is distributed among all wolf groups.
6. The multi-agent search emotion target path planning method of claim 1, wherein, in solving the agent search collaborative optimization multi-objective function based on an improved multi-wolf-pack algorithm, artificial wolves are eliminated according to both the value and the rate of change of odor concentration, and new artificial wolves equal in number to those eliminated are generated accordingly.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes a multi-agent search emotion target path planning method according to any one of claims 1 to 6 by executing the computer program.
CN202111472609.5A 2021-11-29 2021-11-29 Multi-agent search emotion target path planning method and device Pending CN114115285A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111472609.5A CN114115285A (en) 2021-11-29 2021-11-29 Multi-agent search emotion target path planning method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111472609.5A CN114115285A (en) 2021-11-29 2021-11-29 Multi-agent search emotion target path planning method and device

Publications (1)

Publication Number Publication Date
CN114115285A true CN114115285A (en) 2022-03-01

Family

ID=80367030

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111472609.5A Pending CN114115285A (en) 2021-11-29 2021-11-29 Multi-agent search emotion target path planning method and device

Country Status (1)

Country Link
CN (1) CN114115285A (en)


Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114610047A (en) * 2022-03-09 2022-06-10 大连海事大学 QMM-MPC underwater robot vision docking control method for on-line depth estimation
CN114610047B (en) * 2022-03-09 2024-05-28 大连海事大学 QMM-MPC underwater robot vision docking control method for online depth estimation
CN114578827A (en) * 2022-03-22 2022-06-03 北京理工大学 Distributed multi-agent cooperative full coverage path planning method
CN114722946A (en) * 2022-04-12 2022-07-08 中国人民解放军国防科技大学 Unmanned aerial vehicle asynchronous action and cooperation strategy synthesis method based on probability model detection
CN115390584A (en) * 2022-04-15 2022-11-25 中国人民解放军战略支援部队航天工程大学 Multi-machine collaborative search method
CN115390584B (en) * 2022-04-15 2023-12-26 中国人民解放军战略支援部队航天工程大学 Multi-machine collaborative searching method
CN114942637A (en) * 2022-05-17 2022-08-26 北方工业大学 Cognitive learning method for maze robot autonomous search with emotion and memory mechanism
CN114942637B (en) * 2022-05-17 2024-05-28 北方工业大学 Cognitive learning method for autonomous search of maze robot with emotion and memory mechanism
CN116300985A (en) * 2023-05-24 2023-06-23 清华大学 Control method, control device, computer device and storage medium
CN116300985B (en) * 2023-05-24 2023-09-05 清华大学 Control method, control device, computer device and storage medium

Similar Documents

Publication Publication Date Title
CN114115285A (en) Multi-agent search emotion target path planning method and device
Yijing et al. Q learning algorithm based UAV path learning and obstacle avoidence approach
Grefenstette et al. Learning sequential decision rules using simulation models and competition
Gad et al. An improved binary sparrow search algorithm for feature selection in data classification
Groba et al. Integrating forecasting in metaheuristic methods to solve dynamic routing problems: Evidence from the logistic processes of tuna vessels
Yan et al. Comparative study and improvement analysis of sparrow search algorithm
CN112486200B (en) Multi-unmanned aerial vehicle cooperative confrontation online re-decision method
CN116360503B (en) Unmanned plane game countermeasure strategy generation method and system and electronic equipment
Gheraibia et al. Penguins search optimisation algorithm for association rules mining
Yüzgeç et al. Multi-objective harris hawks optimizer for multiobjective optimization problems
Feng et al. Towards human-like social multi-agents with memetic automaton
CN111061165B (en) Verification method of ship relative collision risk degree model
Liu et al. Self-attention-based multi-agent continuous control method in cooperative environments
Hou et al. Evolutionary multiagent transfer learning with model-based opponent behavior prediction
CN110703759B (en) Ship collision prevention processing method for multi-ship game
Yang et al. A knowledge based GA for path planning of multiple mobile robots in dynamic environments
Niu et al. Three-dimensional UCAV path planning using a novel modified artificial ecosystem optimizer
Ma et al. Convex combination multiple populations competitive swarm optimization for moving target search using UAVs
CN115909027B (en) Situation estimation method and device
CN109523838B (en) Heterogeneous cooperative flight conflict solution method based on evolutionary game
CN109658742B (en) Dense flight autonomous conflict resolution method based on preorder flight information
Yang et al. Multi-actor-attention-critic reinforcement learning for central place foraging swarms
Pan et al. A Graph-Based Soft Actor Critic Approach in Multi-Agent Reinforcement Learning
CN114297529A (en) Moving cluster trajectory prediction method based on space attention network
Zdiri et al. Inertia weight strategies in Multiswarm Particle swarm Optimization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination