CN111538349A

CN111538349A - Long-range AUV autonomous decision-making method oriented to multiple tasks

Info

Publication number: CN111538349A
Application number: CN202010304780.4A
Authority: CN
Inventors: 何波; 陈关忠; 沈钺; 王殿蕊; 曲南竹; 邱翔宇
Original assignee: Ocean University of China
Current assignee: Ocean University of China
Priority date: 2020-04-17
Filing date: 2020-04-17
Publication date: 2020-08-14
Anticipated expiration: 2040-04-17
Also published as: CN111538349B

Abstract

The invention provides a multi-task-oriented autonomous decision-making method for a long-range AUV, which mainly addresses task planning and scheduling. (1) Through an event-triggering mechanism, events are represented by index data, and decision calculation is performed only when an event is triggered. Decision calculation thus changes from periodic to aperiodic, unnecessary optimization calculation is reduced, and emergencies can be responded to in real time. (2) A utility function is constructed to quantify safety and efficiency, and these two objectives are optimized simultaneously, realizing adaptive dynamic planning for the long-range AUV. Operating efficiency is improved while the navigation safety of the long-range AUV is ensured, energy waste during navigation is reduced, the optimal task execution sequence is found, and marine survey tasks of long duration in complex navigation environments are completed efficiently. Faced with the multi-task autonomous decision problem, the scheme reasonably balances the importance and priority of the tasks, improves the autonomy of the vehicle, realizes a robust task management strategy, and offers strong real-time performance and a small computational load.

Description

Long-range AUV autonomous decision-making method oriented to multiple tasks

Technical Field

The invention belongs to the field of underwater robot control, is applied to a long-range autonomous underwater vehicle for multiple tasks, and particularly relates to a long-range AUV autonomous decision-making method for multiple tasks.

Background

Because the marine environment is dynamic, variable and of unknown complexity, a long-range AUV is affected by many factors during navigation. When highly complex tasks are executed, resources must be managed and orchestrated effectively and rationally so that the AUV can adapt to complex marine environments. Owing to the high uncertainty of the marine environment, current autonomous decision-making technology for long-range AUVs cannot adapt to complex and variable marine environments, and its multi-task scheduling capability needs to be improved.

Autonomy is a characteristic that an AUV must have in order to complete various tasks. Existing autonomous decision technology targets only AUVs of ordinary range and generally adopts a hierarchical decision method in which task coordination and allocation are carried out by exhaustively enumerating tasks and events. For example, for the task planning problem in AUV autonomous decision-making, Zadeh et al. designed an AUV autonomous decision framework (Zadeh S M, Powers D M W, Sammut K, et al. Persistent AUV Operations Using a Robust Reactive Mission and Path Planning (RRMPP) Architecture [J]. 2016), in which AUV task planning and time management are completed through lower-level motion planning that takes into account ocean current data, uncertain obstacles and the dynamic constraints of the vehicle. The framework consists of several modules arranged hierarchically, and each module uses a series of algorithms to study the efficiency and robustness of the autonomous decision method in different task scenarios.

However, when implementing multi-task scheduling, the prior art generally adopts periodic sampling and continuously performs decision calculation with a dynamic programming algorithm. To make decisions in real time, the sampling period is short, the amount of data acquired per unit time is large, and the decision calculation frequency is high. Because the computing and storage resources of an AUV are limited, a large share of system resources is easily occupied, which affects the overall computation and communication performance of the system. In addition, an ordinary AUV stays closely connected to its mother ship, works for short periods and operates in sea areas with good sea conditions; its self-sustaining capability is therefore weak, and it is suited only to simple marine environments, cannot respond in real time to unpredictable emergencies, and is difficult to apply in dynamic, complex marine environments.

A long-range AUV faces more varied environmental conditions and harder problems than an ordinary AUV, so a higher-level autonomous decision system is required to handle the execution of multiple tasks under complex conditions.

Disclosure of Invention

Aiming at the defects of the existing autonomous decision making technology, the invention provides a long-range AUV autonomous decision making method facing multiple tasks, mainly solving the problems of task planning and scheduling, not only effectively managing resources of the long-range AUV and reasonably and comprehensively planning multiple tasks, but also being capable of adapting to more complex environment without manual participation and completing the execution of complex tasks.

The invention is realized by adopting the following technical scheme: a long-range AUV autonomous decision-making method for multitasking comprises the following steps:

Step A: executing a basic task sequence, wherein the basic task sequence comprises a planning task, an obstacle avoidance task, a tracking task, a control task and an emergency task, so as to ensure task execution in the normal state of the long-range AUV (autonomous underwater vehicle);

Step B: representing the trigger events of the long-range AUV by index data, and designing a task triggering mechanism to trigger the corresponding task sequence;

Step C: constructing an autonomous decision system and performing dynamic adaptive planning, wherein the autonomous decision system comprises a control variable mapping module, an AUV state prediction module and a task sequence effect evaluation module, specifically:

Step C1: generating control variables from the AUV state, based on the control variable mapping module;

Step C2: calculating the AUV state at the next moment from the control variable and the AUV state at the current moment, based on the AUV state prediction module;

Step C3: constructing a utility function to evaluate the control variable and judging whether the current task sequence is the optimal task sequence, based on the task sequence effect evaluation module;

Step D: finally, executing the optimal task sequence according to the task sequence triggering result and the evaluation result.

Further, the step B specifically adopts the following manner:

(1) using index data to represent task sequences, and periodically sampling the index data, including:

acquiring the distance between the vehicle and an obstacle in real time, and using the distance data as index data to represent an obstacle avoidance task sequence;

collecting the speed and direction of the surrounding ocean current, and using the current speed and direction data as index data to represent a current avoidance event;

acquiring current and voltage data of the AUV and data from other sensors, and using them as index data to represent a fault event;

(2) designing the task triggering mechanism

The task sequence trigger threshold is set to e_T, the sampling reference value is d_b, and the sampled value is d_k, where the sampling reference value is defined as the safety distance;

for the obstacle avoidance task sequence, the sampled value is the real-time distance between the AUV (autonomous underwater vehicle) and the obstacle, and the trigger error of the task sequence is e_k = |d_k - d_b|. If e_k < e_T, a task sequence event is triggered, and that moment is called the trigger time.

Further, in the step C1, the relationship between the AUV state and the control variable is mapped based on the control variable mapping module:

u(k) = f(x(k))

where x(k) represents the system state at time k, and u(k) represents the control variables generated from the state of the long-range AUV system; the state includes, but is not limited to, system voltage, system current, communication frequency, mission status, speed, heading, altitude and depth.

Further, in the step C2, calculating the AUV state at the next time according to the control variable and the AUV state at the current time is specifically implemented by:

x(k+1)＝F(x(k)，u(k)，k)

where x(k) is the system state at time k, x(k) = {W_k, I_k, F_k, S_k, V_k, A_k, H_k, D_k}, in which W is the system voltage, I the system current, F the communication frequency, S the mission status, V the navigation speed, A the heading, H the height of the AUV above the sea floor, and D the depth of the AUV below the sea surface; u(k) is the control variable, u(k) = {t_{1,j,k}, t_{2,q,k}, …, t_{n,l,k}}, k = 0, 1, …, N-1, where N is a positive integer, j, q, l ∈ [1, m], and m is a positive integer.

Further, in step C3, generating a utility function based on the task sequence effect evaluation module, and evaluating the control variable specifically includes:

1) outputting an approximate performance index function according to the current AUV state:

J(x(k)) = Σ_{i=k}^{N-1} γ^{i-k} U(x(i), u(i))

wherein U(x(i), u(i)) is the utility function, γ is the discount factor, and 0 < γ ≤ 1;

wherein, in the utility function, V(i) is the navigation speed of the vehicle at time i; t_s is the reaction time required for the vehicle to receive feedback and take action, set to a constant; DB(i) is the distance between the vehicle and the obstacle at time i; MDT is the distance from the task starting point to the task end point; DT(i) is the distance between the vehicle and the task end point at time i;

3) judging the current task sequence

When judging whether the current task sequence is the optimal task sequence, the optimization objective is to find a decision sequence u(i), i = k, k+1, …, N-1, that minimizes the performance index function;

assume that the optimal performance index function value for all AUV states x(k+1) from time k+1 is J^*(x(k+1)), and that the optimal decision sequence starting from time k+1 is u^*(k+1), u^*(k+2), …, u^*(N-1); the performance index function at time k is then expressed as:

U(x(k), u(k)) + γJ^*(x(k+1));

wherein x(k) is the state at time k and x(k+1) is determined by the system equation; the optimal performance index function value at time k is expressed as:

J^*(x(k)) = min_{u(k)} { U(x(k), u(k)) + γJ^*(x(k+1)) }

and the resulting optimal solution u^*(k) is:

u^*(k) = arg min_{u(k)} { U(x(k), u(k)) + γJ^*(x(k+1)) }

that is, the optimal decision at time k is u^*(k), and the task sequence corresponding to u^*(k) is the optimal task sequence, which gives the optimal task sequence that the long-range AUV currently needs to execute.

Compared with the prior art, the invention has the advantages and positive effects that:

faced with the multi-task autonomous decision problem, the autonomous decision method provided by this scheme reasonably balances task importance and priority, improves the autonomy of the vehicle, realizes a robust task management strategy, and offers strong real-time performance and a small computational load;

through the event-triggering mechanism, events are represented by index data and decision calculation is performed only when an event is triggered, so that decision calculation changes from periodic to aperiodic. Unnecessary optimization calculation is reduced, emergencies can be responded to in real time, a large amount of computing resources is saved, the various tasks can be managed as a whole, the autonomous decision level of the long-range AUV is raised in practice, and the safety of the AUV is guaranteed while tasks are completed;

meanwhile, during long-range AUV task execution, factors such as ocean currents strongly affect the energy consumption of the vehicle, and the long-range AUV carries limited resources. By constructing a utility function, the scheme optimizes the two objectives of safety and efficiency simultaneously and thereby realizes adaptive dynamic planning for the long-range AUV. Operating efficiency is improved while navigation safety is ensured, energy waste during navigation is reduced, the optimal task execution sequence is found, and marine survey tasks of long duration in complex navigation environments are completed efficiently.

Drawings

Fig. 1 is a schematic diagram illustrating an autonomous decision method according to an embodiment of the present invention.

Detailed Description

In order to make the above objects, features and advantages of the present invention more clearly understood, the present invention will be further described with reference to the accompanying drawings and examples. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those described herein, and thus, the present invention is not limited to the specific embodiments disclosed below.

This embodiment provides a multi-task-oriented autonomous decision-making method for a long-range autonomous underwater vehicle (AUV) which, as shown in Fig. 1, mainly comprises the following steps:

Step A: executing a basic task sequence, wherein the basic task sequence comprises a planning task, an obstacle avoidance task, a tracking task, a control task, an emergency task and the like, so as to ensure task execution in the normal state of the long-range AUV;

Step B: representing the trigger events of the long-range AUV by index data, and designing a task triggering mechanism to trigger the corresponding task sequence;

Step C: designing an autonomous decision system and performing dynamic adaptive planning, wherein the autonomous decision system comprises a control variable mapping module, an AUV state prediction module and a task sequence effect evaluation module;

(1) generating control variables from the AUV state, based on the control variable mapping module;

(2) calculating the AUV state at the next moment from the control variable and the AUV state at the current moment, based on the AUV state prediction module;

(3) constructing a utility function to evaluate the control variable and judging whether the current task sequence is the optimal task sequence, based on the task sequence effect evaluation module;

Step D: finally, executing the optimal task sequence according to the task sequence triggering result and the evaluation result.

Specifically, this embodiment takes the Sailfish 210 AUV of Ocean University of China as the experimental platform and describes the implementation of the long-range AUV autonomous decision method in detail:

I. Executing the basic task sequence

The basic task sequence is the basic guarantee for task execution: the planning, obstacle avoidance, tracking, control and emergency tasks are all started when the mission begins.

In this embodiment, the planning task has three modes: no planning, traversal planning and end-to-end planning. The obstacle avoidance task has three modes: no obstacle avoidance, static obstacle avoidance and dynamic obstacle avoidance. The tracking task has two modes: single-point tracking and multipoint continuous tracking. The control task has three modes: rudder blade control, propeller control and synchronous rudder-propeller control. The emergency task has three modes: light emergency, moderate emergency and high-risk emergency. Each task mode represents a specific way of executing the task, and each mode is numbered as shown in Table 1, which also lists entries for no tracking, no control and no emergency.

TABLE 1 task numbering

No.	Task name
1	No planning
2	Traversal planning
3	End-to-end planning
4	Without obstacle avoidance
5	Static obstacle avoidance
6	Dynamic obstacle avoidance
7	Without tracking
8	Single point tracking
9	Multipoint continuous tracking
10	Without control
11	Rudder blade control
12	Propeller control
13	Synchronous control of rudder propeller
14	Without emergency
15	Light emergency
16	Moderate emergency
17	High-risk emergency

The experiment in this embodiment is a seabed target search and detection mission. According to the experimental task, the basic task sequence is determined to be 2, 5, 8, 13 and 14. After the AUV is deployed, the basic task sequence is executed to ensure that the AUV starts its mission normally.
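As a minimal sketch of how the numbering in Table 1 and the basic task sequence 2, 5, 8, 13, 14 might be held in software, the following Python fragment stores the table as a dictionary and checks that a sequence selects exactly one mode from each task group. The grouping of numbers into five groups, and the names `TASK_MODES`, `CATEGORIES`, `category_of` and `is_valid_sequence`, are illustrative assumptions and do not appear in the patent.

```python
# Task-mode numbers and names copied from Table 1.
TASK_MODES = {
    1: "No planning", 2: "Traversal planning", 3: "End-to-end planning",
    4: "Without obstacle avoidance", 5: "Static obstacle avoidance",
    6: "Dynamic obstacle avoidance",
    7: "Without tracking", 8: "Single point tracking",
    9: "Multipoint continuous tracking",
    10: "Without control", 11: "Rudder blade control", 12: "Propeller control",
    13: "Synchronous control of rudder propeller",
    14: "Without emergency", 15: "Light emergency",
    16: "Moderate emergency", 17: "High-risk emergency",
}

# Assumption: the five task groups follow the row order of Table 1.
CATEGORIES = {
    "planning": range(1, 4),
    "obstacle_avoidance": range(4, 7),
    "tracking": range(7, 10),
    "control": range(10, 14),
    "emergency": range(14, 18),
}


def category_of(mode: int) -> str:
    """Return the task group that a Table 1 number belongs to."""
    for name, numbers in CATEGORIES.items():
        if mode in numbers:
            return name
    raise ValueError(f"unknown task mode number: {mode}")


def is_valid_sequence(sequence) -> bool:
    """True if the sequence picks exactly one mode from every task group."""
    chosen = [category_of(mode) for mode in sequence]
    return sorted(chosen) == sorted(CATEGORIES)


BASIC_SEQUENCE = [2, 5, 8, 13, 14]  # basic task sequence of this embodiment
for mode in BASIC_SEQUENCE:
    print(mode, TASK_MODES[mode])
```

Under this assumed grouping, `is_valid_sequence(BASIC_SEQUENCE)` holds, whereas a sequence containing two planning modes, such as 2 and 3, would be rejected.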

II. Representing trigger events by index data and designing a task triggering mechanism to trigger the corresponding task sequence

(1) using index data to represent task sequences, and periodically sampling the index data:

for example, the distance between the vehicle and an obstacle is collected in real time and used as index data to represent an obstacle avoidance task sequence (also called an event); the speed and direction of the surrounding ocean current are collected and used as index data to represent a current avoidance event; current and voltage data of the vehicle and data from other sensors are collected and used as index data to represent fault events, and so on;

(2) design task triggering mechanism

The task sequence trigger threshold is set to e_T, the sampling reference value is d_b, and the sampled value is d_k, where the sampling reference value is defined as the safety distance. For example, for the obstacle avoidance task sequence, the sampled value is the real-time distance between the vehicle and the obstacle, and the trigger error of the task sequence is e_k = |d_k - d_b|. If e_k < e_T, a task sequence event is triggered, and that moment is called the trigger time.

For example, in this embodiment an altimeter mounted on the AUV collects the distance between the AUV and the obstacle in real time. With the obstacle avoidance decision trigger threshold e_T = 2, the safety distance d_b = 4 and the current distance between the AUV and the obstacle d_k = 5, the event trigger error is e_k = |d_k - d_b| = 1. Since e_k < e_T, an obstacle avoidance event is triggered and the AUV starts autonomous decision-making.
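The trigger condition and the numerical example above can be checked with a few lines of Python. This is only a sketch of the stated rule e_k = |d_k - d_b| < e_T; the function names `trigger_error` and `is_triggered` are hypothetical and are not taken from the patent.

```python
def trigger_error(sample: float, reference: float) -> float:
    """Trigger error e_k = |d_k - d_b|."""
    return abs(sample - reference)


def is_triggered(sample: float, reference: float, threshold: float) -> bool:
    """A task sequence event is triggered when e_k < e_T."""
    return trigger_error(sample, reference) < threshold


# Values of the embodiment: e_T = 2, d_b = 4, d_k = 5.
e_T, d_b, d_k = 2.0, 4.0, 5.0
print(trigger_error(d_k, d_b))      # 1.0
print(is_triggered(d_k, d_b, e_T))  # True
```

The same two functions could serve other index data, for example a sampled voltage compared against a reference voltage, provided a threshold is chosen for that event type.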

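A conventional decision scheme uses a fixed-period decision method, which easily occupies a large amount of system computing resources; in this scheme, decision calculation is performed only when an event is triggered, so unnecessary optimization calculation is reduced.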
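To illustrate the difference between fixed-period and event-triggered decision-making, the sketch below counts how many decision calculations each scheme would perform over one run of distance samples. The sample values and the helper name `count_decisions` are hypothetical; only the trigger rule e_k = |d_k - d_b| < e_T comes from the text.

```python
def count_decisions(samples, reference, threshold):
    """Return (fixed_period_count, event_triggered_count) for a run of samples."""
    # A fixed-period scheme runs the decision calculation at every sample.
    fixed_period = len(samples)
    # An event-triggered scheme runs it only when |d_k - d_b| < e_T.
    event_triggered = sum(1 for d in samples if abs(d - reference) < threshold)
    return fixed_period, event_triggered


# Hypothetical distances (same unit as d_b) sampled while passing an obstacle.
samples = [20.0, 15.0, 10.0, 7.0, 5.0, 4.5, 8.0, 12.0]
fixed, triggered = count_decisions(samples, reference=4.0, threshold=2.0)
print(fixed, triggered)  # 8 2
```

In this made-up run, sampling stays periodic but only two of the eight samples lead to a decision calculation.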

III. Adaptive dynamic programming

An autonomous decision system is designed and dynamic adaptive planning is performed; the autonomous decision system comprises a control variable mapping module, an AUV (autonomous underwater vehicle) state prediction module and a task sequence effect evaluation module. In this embodiment, the long-range AUV autonomous decision system is regarded as a discrete nonlinear system whose state variables are variables capable of describing the complete system. The system here is a discrete task system represented by the set T = {T_1, T_2, …, T_n}. Each task in the system is necessary for the normal operation of the AUV and must be executed synchronously; the tasks, such as the path planning task, the tracking task and the underlying control task, run in parallel. Each task can be divided into several subtasks, i.e. the i-th task T_i ∈ {t_{i1}, t_{i2}, …, t_{im}}. The subtasks of a task cannot be executed synchronously; only one subtask can be selected and executed at a time. For example, the path planning task includes traversal planning, obstacle avoidance planning and current avoidance planning, which are mutually exclusive: when one is executed, the others must be closed.
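The task system described above, with parallel tasks and mutually exclusive subtasks, can be pictured as choosing exactly one subtask per task. The following sketch enumerates all such choices with `itertools.product`. The dictionary `TASKS`, its three example tasks and the function `candidate_controls` are illustrative assumptions; the subtask names are borrowed from modes mentioned in this embodiment.

```python
from itertools import product

# Hypothetical task system T = {T_1, ..., T_n}: each task maps to its
# mutually exclusive subtasks t_i1, ..., t_im.
TASKS = {
    "path_planning": ["traversal", "obstacle_avoidance", "current_avoidance"],
    "tracking": ["single_point", "multipoint_continuous"],
    "control": ["rudder", "propeller", "rudder_and_propeller"],
}


def candidate_controls(tasks):
    """List every way of selecting exactly one subtask for each task."""
    names = list(tasks)
    return [dict(zip(names, choice)) for choice in product(*tasks.values())]


controls = candidate_controls(TASKS)
print(len(controls))  # 3 * 2 * 3 = 18 candidate task sequences
print(controls[0])
```

Each dictionary in `controls` runs all tasks in parallel while activating only one subtask per task, which is one way to read the mutual-exclusion rule.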

The dynamic adaptive planning specifically includes:

(1) mapping the relationship between the AUV state and the control variable based on a control variable mapping module:

u(k)＝f(x(k))

wherein x(k) represents the system state at time k, and u(k) represents the control variables, which are generated from the state of the long-range AUV system; the state includes, but is not limited to, system voltage, system current, communication frequency, mission status, speed, heading, altitude and depth;

the state variable and the control variable are key elements capable of reasonably expressing a decision-making system, and the embodiment extracts key index data from a long-range AUV complex system, constructs the state variable and the control variable and lays a foundation for solving a multi-task-oriented autonomous decision-making problem;

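(2) based on the AUV state prediction module, the AUV state at the next moment is calculated from the control variable and the AUV state at the current moment:

x(k+1) = F(x(k), u(k), k)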
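The mapping u(k) = f(x(k)) and the prediction x(k+1) = F(x(k), u(k), k) compose into a step-by-step rollout of the state. The sketch below shows that composition on a deliberately simple scalar example. The toy state (a single distance to an obstacle), the toy policy `f` and the toy dynamics `F` are assumptions made purely for illustration and are not the models used in the patent.

```python
from typing import Callable, List, TypeVar

X = TypeVar("X")  # state type
U = TypeVar("U")  # control type


def rollout(x0: X, f: Callable[[X], U], F: Callable[[X, U, int], X],
            steps: int) -> List[X]:
    """Apply u(k) = f(x(k)) and x(k+1) = F(x(k), u(k), k) for `steps` steps."""
    states = [x0]
    for k in range(steps):
        u = f(states[-1])
        states.append(F(states[-1], u, k))
    return states


def f(x: float) -> str:
    """Toy mapping: avoid when within 2 of a safety distance of 4."""
    return "avoid" if abs(x - 4.0) < 2.0 else "cruise"


def F(x: float, u: str, k: int) -> float:
    """Toy dynamics: avoiding opens the distance, cruising closes it."""
    return x + 2.0 if u == "avoid" else x - 1.0


trajectory = rollout(8.0, f, F, steps=5)
print(trajectory)  # [8.0, 7.0, 6.0, 5.0, 7.0, 6.0]
```

In a real system the state would carry the quantities listed for x(k), such as voltage, current, speed, heading, altitude and depth, and F would be a vehicle model rather than the two-line rule used here.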

(3) Based on the task sequence effect evaluation module, a utility function is generated, the control variable is evaluated, and whether the current task sequence is the optimal task sequence is judged. The utility function is used to evaluate whether the AUV state is safe and whether the task sequence is optimal; the constructed utility function takes both safety and efficiency into account, so that the task sequence is optimized for multiple objectives. Specifically:

1) outputting an approximate performance index function according to the current AUV state, based on the evaluation network module:

J(x(k)) = Σ_{i=k}^{N-1} γ^{i-k} U(x(i), u(i))

wherein U(x(i), u(i)) is the utility function, γ is the discount factor, and 0 < γ ≤ 1;

wherein, in the utility function, V(i) is the navigation speed of the vehicle at time i; t_s is the reaction time required for the vehicle to receive feedback and take action, set to a constant; DB(i) is the distance between the vehicle and the obstacle at time i; MDT is the distance from the task starting point to the task end point; DT(i) is the distance between the vehicle and the task end point at time i;

3) judging the current task sequence

assume that the optimal performance index function value for all possible AUV states x(k+1) starting from time k+1 is J^*(x(k+1)), and that the optimal decision sequence starting from time k+1 is u^*(k+1), u^*(k+2), …, u^*(N-1); the performance index function at time k is then expressed as:

U(x(k), u(k)) + γJ^*(x(k+1));

the resulting optimal solution u^*(k) is:

u^*(k) = arg min_{u(k)} { U(x(k), u(k)) + γJ^*(x(k+1)) }

that is, the optimal decision at time k is u^*(k). The task sequence corresponding to u^*(k) is the optimal task sequence, which gives the optimal task sequence that the long-range AUV currently needs to execute.
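The recursion "utility at time k plus γ times the optimal value from time k+1" can be evaluated by backward dynamic programming when the state and decision sets are small. The following sketch does this on a toy model. The state (remaining distance to the task end point), the two decisions, the numbers in `STEP` and `RISK`, and the form of the toy utility `U` are all hypothetical stand-ins; they are not the utility function of the patent.

```python
from functools import lru_cache

GAMMA = 0.9   # discount factor, 0 < gamma <= 1
N = 4         # decisions are taken at k = 0, ..., N-1
MDT = 4       # hypothetical start-to-end distance
STEP = {"slow": 1, "fast": 2}       # hypothetical distance covered per step
RISK = {"slow": 0.1, "fast": 0.4}   # hypothetical safety penalty per step


def F(x: int, u: str, k: int) -> int:
    """Toy system equation: remaining distance after applying decision u."""
    return max(x - STEP[u], 0)


def U(x: int, u: str) -> float:
    """Toy utility: safety penalty plus normalised remaining distance."""
    return RISK[u] + x / MDT


@lru_cache(maxsize=None)
def J_star(x: int, k: int) -> float:
    """Optimal performance index from state x at time k."""
    if k == N:
        return 0.0  # nothing left to decide after the last step
    return min(U(x, u) + GAMMA * J_star(F(x, u, k), k + 1) for u in STEP)


def u_star(x: int, k: int) -> str:
    """Decision that minimises U(x, u) + gamma * J*(x(k+1)) at time k."""
    return min(STEP, key=lambda u: U(x, u) + GAMMA * J_star(F(x, u, k), k + 1))


print(u_star(4, 0), round(J_star(4, 0), 4))
print(u_star(4, N - 1))
```

With these toy numbers the minimising decision from x = 4 at k = 0 is "fast", because closing the distance early lowers later costs, while at the last step k = N-1 it is "slow", because only the immediate safety penalty matters. This shows how a single utility can trade safety against efficiency.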

Step D: finally, the autonomous decision system transmits the information of the optimal task sequence to the actuators through the communication element, and execution of the optimal task sequence begins. At each moment, an optimal task sequence is generated according to the above procedure, and the subtasks of each stage are executed according to the selected task sequence until the mission is completed.

The above description is only a preferred embodiment of the present invention and does not limit the invention to other forms. Any person skilled in the art may use the technical content disclosed above to produce equivalent embodiments through modification or variation. Any simple modification, equivalent change or adaptation made to the above embodiment according to the technical essence of the invention, without departing from the technical content of the invention, still falls within the protection scope of the technical solution of the invention.

Claims

1. A long-range AUV autonomous decision-making method for multitasking is characterized by comprising the following steps:

2. The multitask-oriented long-range AUV autonomous decision method according to claim 1, characterized in that: the step B specifically adopts the following mode:

(1) using the index data to represent the task sequence, and periodically sampling the index data, including:

(2) design task triggering mechanism

3. The multitask-oriented long-range AUV autonomous decision method according to claim 1, characterized in that: in the step C1, the relationship between the AUV state and the control variable is mapped based on the control variable mapping module:

u(k)＝f(x(k))

4. The multitask-oriented long-range AUV autonomous decision method according to claim 1, characterized in that: in the step C2, calculating the AUV state at the next time according to the control variable and the AUV state at the current time is specifically implemented by the following method:

x(k+1)＝F(x(k)，u(k)，k)

5. The multitask-oriented long-range AUV autonomous decision method according to claim 1, characterized in that: in step C3, a utility function is generated based on the task sequence effect evaluation module, and the evaluation of the control variable specifically includes:

2) judging the current task sequence

U(x(k)，u(k))+γJ^*(x(k+1))；

the obtained optimal solution u^*(k) Comprises the following steps: