CN109241552B - Underwater robot motion planning method based on multiple constraint targets - Google Patents

Underwater robot motion planning method based on multiple constraint targets

Info

Publication number
CN109241552B
Authority
CN
China
Prior art keywords: robot, constraint, training, action, underwater
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810764979.8A
Other languages
Chinese (zh)
Other versions
CN109241552A (en)
Inventor
张国成
程俊涵
孙玉山
盛明伟
冉祥瑞
王力锋
焦文龙
王子楷
贾晨凯
吴凡宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University filed Critical Harbin Engineering University
Priority to CN201810764979.8A
Publication of CN109241552A
Application granted
Publication of CN109241552B
Legal status: Active

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00 — Computer-aided design [CAD]
    • G06F 30/20 — Design optimisation, verification or simulation
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2111/00 — Details relating to CAD techniques
    • G06F 2111/04 — Constraint-based CAD

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)
  • Manipulator (AREA)

Abstract

An underwater robot motion planning method based on multiple constraint targets belongs to the fields of machine learning and underwater robot motion planning. Model construction stage: the signals of the robot's obstacle-avoidance sonar and the flow-velocity signals of the flow-velocity sensor are converted into the current environment state; a discrete action space is established according to the dynamic constraints; a reward function is established with underwater obstacles as constraints; and a Markov decision process based on multi-objective constraints is established as the basis for realizing the algorithm. Training stage: training is carried out with a Q-learning algorithm; actions are executed in the current environment according to a greedy strategy, and the strategy is evaluated and updated after each execution, improving it until it is adapted to the environment and the planning objective is achieved. The invention considers multiple constraint targets such as the water flow, obstacles, and the target, and combines the reinforcement learning method with the underwater multi-constraint targets to realize motion planning of the underwater robot; it has strong real-time performance and is applicable to various environments.

Description

Underwater robot motion planning method based on multiple constraint targets
Technical Field
The invention belongs to the field of machine learning and underwater robot motion planning, and particularly relates to an underwater robot motion planning method based on multiple constraint targets.
Background
The intelligent underwater robot has broad application prospects in marine scientific research, marine development, underwater engineering, military use, and other fields. It generally works in a complex marine environment; to better complete various tasks and ensure its own safety, it needs autonomous motion planning capability in an unknown environment, avoiding obstacles and navigating to a target point.
Traditional underwater robot motion planning techniques need to construct a global map in advance. When the environment changes, the model must be re-established; adaptability is poor and practicality is limited. Reinforcement learning is an unsupervised learning method and a process of continual trial and error: knowledge is obtained through continual action and evaluation, and the strategy is improved to adapt to the environment, so that the final evaluation-function value is maximized and the learning objective is achieved.
Reinforcement learning has already been applied to underwater robots, but traditional reinforcement-learning-based motion planning methods for underwater robots consider a single constraint target and do not consider the influence on underwater robot motion under multi-objective constraints such as the water-flow constraint, target constraint, and obstacle constraint.
Disclosure of Invention
The aim of the invention is to provide a motion planning method for an underwater robot based on multiple constraint targets. An underwater robot dynamics model under the influence of water flow is constructed; multiple constraint targets are fused in combination with a reinforcement learning method; reasonable reward signals and action spaces are constructed; and an optimal control strategy for the underwater robot is output through training. In addition, the underwater multi-constraint targets are combined with the Q-learning algorithm in reinforcement learning, so that the underwater robot can acquire environmental characteristics in an unknown underwater environment, carry out strategy iteration, and complete its motion planning.
The purpose of the invention is realized as follows:
an underwater robot motion planning method based on multiple constraint targets is divided into a model construction stage and an algorithm training stage, and specifically comprises the following steps:
(1) Model construction stage: specifically, construction of a Markov decision process E. A reinforcement learning task can generally be described by a Markov decision process; owing to the particularity of the underwater environment, the Markov decision process is constructed here considering multi-objective constraints such as the environment constraint, obstacle constraint, and target-point constraint, specifically:
(1-1) establishing the current environment x_t from the sensor signals; letting the distance to the obstacle in the i-th degree of freedom of the robot be l_i, and setting l_i to infinity if there is no obstacle in the i-th degree of freedom; setting the flow velocity at the robot's position as vc; positioning the robot in real time and calculating the Euclidean distance d between the robot and a target point;
(1-2) establishing an action space A of the robot according to the maximum forward speed of the underwater robot, wherein A consists of five motion commands, namely forward, forward-left, forward-right, push-left, and push-right, with linear velocity v_a and angular velocity ω_a;
(1-3) considering the obstacle constraint: setting an underwater alert safety distance h_i for the i-th degree of freedom; if l_i < h_i is detected, a collision is considered to have occurred and a negative reward r_ter is set;
(1-4) considering the target-point constraint with target-point threshold d': if d is detected to be increasing, a negative reward r_opp is set; if d is detected to be decreasing, a positive reward r_move is set; if d < d' is detected, the robot has arrived at the target point and a positive reward r_arr is set.
(2) Algorithm training stage: the robot undergoes continual trial and error in computer simulation and learns a strategy, specifically:
(2-1) initializing t = 0, where t denotes the number of steps the robot has moved in the current training episode; initializing r_t = 0, where r_t denotes the reward obtained when the robot performs the t-th action;
(2-2) initializing a matrix Q(x, a), which records the Q value obtainable by selecting action a in state x, initialized to 0;
(2-3) initializing a counter count = 0 to record the total number of training episodes; setting a value M, meaning the robot needs to be trained M times in total;
(2-4) when count is less than the specified number of training episodes M, executing (2-5); otherwise executing (2-14);
(2-5) acquiring the sensor signals to obtain the current state x_t, including the obstacle distance l_i in the i-th degree-of-freedom direction of the robot (set to infinity if there is no obstacle) and the ocean-current velocity vc_t at the current position; obtaining the robot's own position information and calculating the Euclidean distance d between the target point and the robot;
(2-6) selecting action a_t according to the matrix Q;
(2-7) considering the kinematic constraint and the water-flow constraint, combining the selected action a_t with the flow velocity according to the velocity-composition formula for the speed the robot actually presents outward, simulating with the combined velocity, and updating l_i;
(2-8) if l_i < h_i, executing (2-9); otherwise executing (2-10);
(2-9) a collision has occurred: setting r_t = r_ter, ending the episode, setting x_{t+1} empty, updating the matrix Q, setting count = count + 1 and t = 0, and re-executing training from (2-4);
(2-10) if d' < d, executing (2-11); otherwise the target point has been reached: ending the episode, setting r_t = r_arr, setting x_{t+1} empty, updating the matrix Q, setting count = count + 1 and t = 0, and re-executing training from (2-4);
(2-11) if d_t < d_{t-1}, executing (2-12); otherwise executing (2-13);
(2-12) d has decreased: setting r_t = r_move, updating x_{t+1}, updating the matrix Q, setting t = t + 1, and re-executing training from (2-5);
(2-13) d has increased: setting r_t = r_opp, updating x_{t+1}, updating the matrix Q, setting t = t + 1, and re-executing training from (2-5);
(2-14) finishing training to obtain the trained matrix Q;
(2-15) outputting the underwater robot motion planning strategy.
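As an illustration, the Q-learning loop of steps (2-1) to (2-15) can be sketched on a toy one-dimensional world in place of the simulated ocean environment. Everything numeric here is an illustrative assumption, not from the patent: the world size, the reward values, the learning rate, the discount factor, and the exploration rate; the five motion commands are reduced to two actions.

```python
import random

# Toy stand-in for the training loop of steps (2-1)-(2-15): a 1-D world of N
# cells, two actions (left / right), rewards shaped like r_move / r_opp / r_arr.
ACTIONS = [-1, +1]           # stand-ins for the five motion commands
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1   # assumed learning rate, discount, exploration
GOAL, START, N = 9, 0, 10    # assumed target cell, start cell, world length

def train(M=200, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N)]      # (2-2): matrix Q(x, a), initialised to 0
    for count in range(M):                  # (2-3)/(2-4): M training episodes
        x = START
        for t in range(100):
            # (2-6): greedy selection of a_t with occasional random exploration
            if rng.random() < EPS:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda i: Q[x][i])
            x1 = min(max(x + ACTIONS[a], 0), N - 1)   # (2-7): execute the action
            done = x1 == GOAL
            # reward shaping analogous to r_arr / r_move / r_opp
            r = 10.0 if done else (1.0 if abs(GOAL - x1) < abs(GOAL - x) else -1.0)
            # (2-9)/(2-12)/(2-13): update the matrix Q
            target = r if done else r + GAMMA * max(Q[x1])
            Q[x][a] += ALPHA * (target - Q[x][a])
            if done:
                break
            x = x1
    return Q                                # (2-14): the trained matrix Q

Q = train()
# (2-15): the motion planning strategy is the greedy policy read from Q.
policy = [max((0, 1), key=lambda i: Q[c][i]) for c in range(N)]
```

After training, the greedy policy read from the matrix Q steps toward the goal from every cell, which is the analogue of step (2-15).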
The kinematic constraint, i.e. the motion constraint on the underwater robot during training, is as follows: assuming the coordinates of the vehicle's centre of gravity in the fixed coordinate system are (x, y), the velocity of the robot in the fixed coordinate system is:
[Two velocity-component equations, rendered as images in the original document, are not reproduced here.]
where θ is the longitudinal inclination (trim) angle, φ is the transverse inclination (heel) angle, and α is the influence coefficient of the motion constraint on the speed of the underwater robot.
The water-flow constraint is considered as follows when selecting actions during training: during learning and training, in state x_t the flow velocity vc_t is obtained by the ADCP; according to the strategy, the robot selects an action a_t from the action set, with its own velocity
[equation rendered as an image in the original document; not reproduced]
When the robot executes the action, the water-flow constraint is considered, and the navigation speed actually presented outward is V_it = v_at + β·vc_t, where β is the influence coefficient of the water flow on the speed of the underwater robot.
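A minimal sketch of this velocity composition, applied component-wise to planar velocity vectors; the value of the influence coefficient β is an illustrative assumption, since the patent does not fix it.

```python
# Sketch of the water-flow constraint: the robot's commanded velocity v_at is
# combined with the local current vc_t, scaled by an assumed coefficient beta.
BETA = 0.5  # assumed influence coefficient of the current (illustrative)

def presented_velocity(va, vc, beta=BETA):
    """V_it = v_at + beta * vc_t, applied component-wise to 2-D vectors."""
    return tuple(a + beta * c for a, c in zip(va, vc))
```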
The specific method for selecting action a_t is: adopting a greedy strategy, setting a threshold ε and generating a random number ε' by computer; if the random number is smaller than the threshold, i.e. ε' < ε, the robot executes the action corresponding to the maximum element of Q(x_t, a) in the Q matrix, i.e. a_t = max_a Q(x_t, a); if the random number is larger than the threshold, i.e. ε' > ε, the robot randomly selects an action to execute, i.e. a_t = random_a Q(x_t, a).
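A minimal sketch of this greedy selection, following the comparison direction stated here (ε' < ε exploits the Q matrix; otherwise a random action is taken); the number of actions is left generic and ε is an illustrative parameter.

```python
import random

# Greedy selection of a_t as described above: generate epsilon', compare it
# with the threshold epsilon, then exploit or explore accordingly.
def select_action(q_row, eps, rng=random):
    """q_row[a] holds Q(x_t, a) for each action a in the current state x_t."""
    eps_prime = rng.random()                  # the random number epsilon'
    if eps_prime < eps:                       # exploit: a_t = max_a Q(x_t, a)
        return max(range(len(q_row)), key=q_row.__getitem__)
    return rng.randrange(len(q_row))          # explore: random action
```

Under this comparison direction the robot explores with probability 1 - ε, so ε would be chosen close to 1 while exploration is still desired.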
The method for updating the matrix Q is as follows: assume the state of the robot before executing the action is x_t, the action executed is a_t, the reward obtained from feedback is r_t, and the state reached after executing the action is x_{t+1}; then

Q(x_t, a_t) ← (1 − α)·Q(x_t, a_t) + α·(r_t + γ·max_a' Q(x_{t+1}, a'))

where α is the learning rate and γ is the discount factor.
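The update rule can be sketched in standard Q-learning form: the first term keeps a (1 − α) share of the previous estimate Q(x_t, a_t), and a terminal transition (x_{t+1} empty, as in steps (2-9) and (2-10)) drops the max term. The α and γ values below are illustrative assumptions.

```python
# Sketch of the Q-value update above; Q maps a state to a list of Q values,
# one per action.  x_next=None models the "x_{t+1} empty" terminal case.
def update_q(Q, x, a, r, x_next, alpha=0.5, gamma=0.9):
    """Q(x_t,a_t) <- (1-alpha)*Q(x_t,a_t) + alpha*(r_t + gamma*max_a' Q(x_{t+1},a'))."""
    future = max(Q[x_next]) if x_next is not None else 0.0
    Q[x][a] = (1 - alpha) * Q[x][a] + alpha * (r + gamma * future)
```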
The invention has the beneficial effects that:
(1) The invention considers multiple constraint targets such as the water flow, obstacles, and the target, whereas traditional reinforcement-learning planning methods do not consider multiple constraint targets simultaneously; the training method is therefore practical and robust;
(2) The invention combines the reinforcement learning method with the underwater multi-constraint targets to realize motion planning of the underwater robot; it has strong real-time performance and is applicable to various environments.
Drawings
FIG. 1 is a model construction schematic diagram of an underwater robot motion planning method based on multiple constraint targets;
fig. 2 is a flow chart executed in a training phase of a multi-constraint target-based underwater robot motion planning method.
Detailed Description
The following further describes embodiments of the present invention with reference to the accompanying drawings:
the invention relates to a motion planning method for an underwater robot, in particular to a method for combining multi-target constraint and reinforcement learning and used for motion planning of the underwater robot. A model construction stage: converting the signals of the obstacle avoidance sonar of the robot and the flow velocity signals of the flow velocity sensor into the current environment; establishing a discrete action space based on the dynamic constraint of the underwater robot; establishing a reward function by taking an underwater obstacle as a constraint; and establishing a Markov decision process based on multi-target constraint to establish a basis for algorithm realization. A training stage: training is carried out based on a Q learning algorithm, actions are executed based on a greedy strategy in the current environment, the strategy is evaluated and updated once each time the strategy is executed, the strategy is improved until the strategy is adaptive to the environment, and the planning purpose is achieved. The invention combines the reinforcement learning method with the underwater multi-constraint target, realizes the motion planning of the underwater robot, has stronger real-time performance and can be suitable for various environments.
In view of the particularity of the underwater environment, the invention trains the motion planning strategy of the underwater robot by combining multiple constraint targets with a reinforcement learning method. The method comprises a model construction stage and a strategy training stage, as follows:
1. the model construction stage is shown in fig. 1, and comprises the following specific steps:
the reinforcement learning task may be generally described by a Markov decision process. Due to the particularity of the underwater environment, a Markov decision process is constructed by considering multi-target constraints such as environment constraint, obstructive object constraint, target point constraint and the like.
The specific composition of the state space X: first, the obstacle-avoidance sonar of the underwater robot processes the obstacle information of the robot's environment, namely the obstacle distance l_i in the i-th degree-of-freedom direction of the robot; second, the ADCP processes the ocean-current information of the robot's environment, namely the flow velocity vc at the robot's position; third, the GPS processes the relative position information of the robot and the target point, namely the Euclidean distance d between the robot and the target point.
The specific composition of the action space A: the action space in the invention comprises five control commands, named forward, forward-left, forward-right, push-left, and push-right. The linear velocity of the robot is a fixed value v_a.
The specific composition of the reward function R: if the robot collides, the reward value is r_ter; if the robot does not collide but moves farther and farther from the target point, the reward value is r_opp; if the robot does not collide and moves closer to the target point, the reward value is r_move; if the robot arrives at the target point, the reward value is r_arr.
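The reward composition above can be sketched as a single function combining the obstacle constraint (1-3) and the target-point constraint (1-4); the numeric reward values are illustrative assumptions, as the patent leaves them unspecified.

```python
# Hypothetical values for the four rewards named in the text (illustrative).
R_TER = -10.0   # collision penalty r_ter
R_OPP = -1.0    # moving away from the target, r_opp
R_MOVE = 1.0    # moving toward the target, r_move
R_ARR = 10.0    # target reached, r_arr

def reward(obstacle_dists, safety_dists, d, d_prev, d_prime):
    """Return (reward, episode_done) for one step.

    obstacle_dists[i] -- distance l_i to the nearest obstacle in DOF i
    safety_dists[i]   -- alert safety distance h_i for DOF i
    d, d_prev         -- current and previous Euclidean distance to the target
    d_prime           -- target-point threshold d'
    """
    # Obstacle constraint: any l_i < h_i counts as a collision.
    if any(l < h for l, h in zip(obstacle_dists, safety_dists)):
        return R_TER, True
    # Target-point constraint: arrival, then approach / retreat.
    if d < d_prime:
        return R_ARR, True
    return (R_MOVE, False) if d < d_prev else (R_OPP, False)
```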
2. In the strategy training phase, the process is shown in fig. 2, and the specific steps are as follows:
firstly, establishing a virtual environment for training, wherein the specific method comprises the following steps:
a simulated marine environment is established by using robot motion simulation software, and obstacles, target points and ocean currents are set in the virtual environment. The obstacles and target points may be randomly defined and 6-12 different robot starting points defined.
The two-dimensional plane space is rasterized, and the ocean current within each grid cell can be regarded as uniform. A flow field is randomly generated by a stream function ψ(x, y), and the velocity field of the ocean current can be obtained from the stream function:
vc_x = ∂ψ/∂y,  vc_y = −∂ψ/∂x

which, due to the incompressibility of the fluid, satisfies

∂vc_x/∂x + ∂vc_y/∂y = 0
where vc_x and vc_y are the velocity components of the ocean current along the X-axis and Y-axis directions at position (x, y), the centre point of each grid cell being taken as (x, y).
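The stream-function relations above can be checked numerically. The particular ψ below is an illustrative choice (not from the patent), and the derivatives are taken by central finite differences, as one might do at each grid-cell centre.

```python
import math

# Derive the current field from a stream function psi(x, y):
# vc_x = d(psi)/dy, vc_y = -d(psi)/dx, which automatically satisfies the
# incompressibility condition d(vc_x)/dx + d(vc_y)/dy = 0.
def psi(x, y):
    return math.sin(x) * math.cos(y)     # assumed stream function (illustrative)

def current_at(x, y, h=1e-5):
    """Central-difference evaluation of (vc_x, vc_y) at a grid-cell centre."""
    vcx = (psi(x, y + h) - psi(x, y - h)) / (2 * h)    # d(psi)/dy
    vcy = -(psi(x + h, y) - psi(x - h, y)) / (2 * h)   # -d(psi)/dx
    return vcx, vcy
```

For this ψ, vc_x = −sin(x)·sin(y) and vc_y = −cos(x)·cos(y), so the divergence −cos(x)·sin(y) + cos(x)·sin(y) vanishes identically, as the incompressibility condition requires.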
Performing strategy training, which comprises the following specific steps:
1) Initialize t = 0, where t denotes the number of steps the robot has moved in the current training episode, and initialize r_t = 0, where r_t denotes the reward obtained when the robot performs the t-th action. Define a matrix Q(x, a), recording the Q value obtainable by selecting action a in state x, initialized to 0. Initialize the counter count = 0, recording the total number of training episodes. Set the value M, meaning the robot is to be trained M times in total. Initialize the safety radius h_i for the i-th degree-of-freedom direction of the underwater robot. Set the value d', the threshold of the distance between the robot and the target point.
2) Initializing the state of the robot, and randomly selecting a starting point to begin exploration.
3) The robot acquires the environment information x_t, including the distance l_i between the obstacle and the robot in the i-th degree-of-freedom direction (set to infinity if there is no obstacle) and the ocean-current velocity vc at the current position; it obtains its own position information and calculates the Euclidean distance d between the target point and the robot.
4) Set a threshold ε and generate a random number ε' by computer. If ε' < ε, the robot randomly selects an action in the action space to execute, i.e. a_t = random_a Q(x_t, a); if ε' > ε, the robot selects the action a with the largest value in state x_t according to the matrix Q(x, a), i.e. a_t = max_a Q(x_t, a).
5) Considering the kinematic constraint and the water-flow constraint, the robot moves in the simulation environment at the actually presented speed V_it given by the velocity-composition formula.
6) After the robot has performed action a_t, the environment information x_{t+1} is acquired again.
6-1) If l_i < h_i, a collision has occurred and the training episode ends; the counter is incremented (count + 1) and the matrix Q is updated according to

Q(x_t, a_t) ← (1 − α)·Q(x_t, a_t) + α·(r_t + γ·max_a' Q(x_{t+1}, a'))

The training step count t is reset to 0; if count < M, retraining starts from step 2); if count = M, step 7) is executed.
6-2) If l_i > h_i, no collision has occurred; continue to judge whether the target point has been reached.
6-2-1) If d ≤ d', the target point has been reached and the training episode ends; the counter is incremented (count + 1), the matrix Q is updated, and t is reset to 0; if count < M, retraining starts from step 2); if count = M, step 7) is executed.
6-2-2) If d > d', the target point has not been reached; set t = t + 1 and continue training from step 3).
7) After training is finished, output the motion planning strategy of the underwater robot.
The method considers multiple constraint targets such as the water flow, obstacles, and the target; traditional reinforcement-learning planning methods do not consider multiple constraint targets simultaneously, and such training methods lack practicality and robustness. The invention fuses the features of the multiple constraint targets through reinforcement learning and can train a more practical underwater robot motion planning strategy.

Claims (5)

1. A motion planning method for an underwater robot based on multiple constraint targets, characterized by comprising a model construction stage and an algorithm training stage, and comprising the following steps:
(1) a model construction stage: model construction of a Markov decision process E, comprising the following steps:
(1-1) establishing the current environment x_t from the sensor signals; letting the distance to the obstacle in the i-th degree of freedom of the robot be l_i, and setting l_i to infinity if there is no obstacle in the i-th degree of freedom; setting the flow velocity at the robot's position as vc; positioning the robot in real time and calculating the Euclidean distance d between the robot and a target point;
(1-2) establishing an action space A of the robot according to the maximum forward speed of the underwater robot, wherein A comprises five motion commands, namely forward, forward-left, forward-right, push-left, and push-right, with linear velocity v_a and angular velocity ω_a;
(1-3) considering the obstacle constraint: setting an underwater alert safety distance h_i for the i-th degree of freedom; if l_i < h_i is detected, a collision is considered to have occurred and a negative reward r_ter is set;
(1-4) considering the target-point constraint: setting the target-point threshold as d'; if d is detected to be increasing, setting a negative reward r_opp; if d is detected to be decreasing, setting a positive reward r_move; if d < d' is detected, the robot has arrived at the target point and a positive reward r_arr is set;
(2) an algorithm training stage: the robot undergoes continual trial and error in computer simulation to learn a strategy, comprising the following steps:
(2-1) initializing t = 0, where t denotes the number of steps the robot has moved in the current training episode; initializing r_t = 0, where r_t denotes the reward obtained when the robot performs the t-th action;
(2-2) initializing a matrix Q(x, a), which records the Q value obtained by selecting action a in state x;
(2-3) initializing a counter count = 0 to record the total number of training episodes; setting a value M, meaning the robot needs to be trained M times in total;
(2-4) when count is less than the specified number of training episodes M, executing (2-5); otherwise executing (2-14);
(2-5) acquiring the sensor signals to obtain the current state x_t, wherein x_t includes the obstacle information, namely the obstacle distance l_i in the i-th degree-of-freedom direction of the robot, the ocean-current velocity vc_t at the current position, and the robot's own position information, from which the Euclidean distance d between the target point and the robot is calculated;
(2-6) selecting action a_t according to the matrix Q;
(2-7) considering the kinematic constraint and the water-flow constraint, combining the selected action a_t with the flow velocity, simulating with the actually presented navigation speed obtained from the combination, and updating l_i;
(2-8) if l_i < h_i, executing (2-9); otherwise executing (2-10);
(2-9) a collision has occurred: setting r_t = r_ter, ending the episode, setting x_{t+1} empty, updating the matrix Q, setting count = count + 1 and t = 0, and re-executing training from (2-4);
(2-10) if d' < d, executing (2-11); otherwise the target point has been reached: ending the episode, setting r_t = r_arr, setting x_{t+1} empty, updating the matrix Q, setting count = count + 1 and t = 0, and re-executing training from (2-4);
(2-11) if d_t < d_{t-1}, executing (2-12); otherwise executing (2-13);
(2-12) d has decreased: setting r_t = r_move, updating x_{t+1}, updating the matrix Q, setting t = t + 1, and re-executing training from (2-5);
(2-13) d has increased: setting r_t = r_opp, updating x_{t+1}, updating the matrix Q, setting t = t + 1, and re-executing training from (2-5);
(2-14) finishing training to obtain the trained matrix Q;
(2-15) outputting the underwater robot motion planning strategy.
2. The underwater robot motion planning method based on multiple constraint targets according to claim 1, characterized in that the kinematic constraint, i.e. the motion constraint on the underwater robot during training, is: assuming the coordinates of the vehicle's centre of gravity in the fixed coordinate system are (x, y), the velocity of the robot in the fixed coordinate system is:
[Two velocity-component equations, rendered as images in the original document, are not reproduced here.]
where θ is the longitudinal inclination (trim) angle, φ is the transverse inclination (heel) angle, and α is the influence coefficient of the motion constraint on the speed of the underwater robot.
3. The underwater robot motion planning method based on multiple constraint targets according to claim 1, characterized in that the water-flow constraint is considered as follows when selecting actions during training: during learning and training, in state x_t the flow velocity vc_t is obtained by the ADCP; according to the strategy, the robot selects an action a_t from the action set, with its own velocity
[equation rendered as an image in the original document; not reproduced]
When the robot executes the action, the water-flow constraint is considered, and the navigation speed actually presented outward is V_it = v_at + β·vc_t, where β is the influence coefficient of the water flow on the speed of the underwater robot.
4. The underwater robot motion planning method based on multiple constraint targets according to claim 1, characterized in that the specific method for selecting action a_t is: adopting a greedy strategy, setting a threshold ε and generating a random number ε' by computer; if the random number is smaller than the threshold, i.e. ε' < ε, the robot executes the action corresponding to the maximum element of Q(x_t, a) in the Q matrix, i.e. a_t = max_a Q(x_t, a); if the random number is larger than the threshold, i.e. ε' > ε, the robot randomly selects an action to execute, i.e. a_t = random_a Q(x_t, a).
5. The underwater robot motion planning method based on multiple constraint targets according to claim 1, characterized in that the method for updating the matrix Q is: assuming the state of the robot before executing the action is x_t, the action executed is a_t, the reward obtained from feedback is r_t, and the state reached after executing the action is x_{t+1}; then

Q(x_t, a_t) ← (1 − α)·Q(x_t, a_t) + α·(r_t + γ·max_a' Q(x_{t+1}, a'))

where α is the learning rate and γ is the discount factor.
CN201810764979.8A 2018-07-12 2018-07-12 Underwater robot motion planning method based on multiple constraint targets Active CN109241552B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810764979.8A CN109241552B (en) 2018-07-12 2018-07-12 Underwater robot motion planning method based on multiple constraint targets


Publications (2)

Publication Number Publication Date
CN109241552A CN109241552A (en) 2019-01-18
CN109241552B true CN109241552B (en) 2022-04-05

Family

ID=65072571

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810764979.8A Active CN109241552B (en) 2018-07-12 2018-07-12 Underwater robot motion planning method based on multiple constraint targets

Country Status (1)

Country Link
CN (1) CN109241552B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110196605B (en) * 2019-04-26 2022-03-22 Dalian Maritime University Method for cooperative search of multiple dynamic targets in an unknown sea area by a reinforcement-learning unmanned aerial vehicle cluster
CN110333739B (en) * 2019-08-21 2020-07-31 Harbin Engineering University AUV (Autonomous Underwater Vehicle) behavior planning and action control method based on reinforcement learning
JP7221839B2 (en) * 2019-10-08 2023-02-14 National University Corporation Shizuoka University Autonomous mobile robot and control program for autonomous mobile robot
CN110765686B (en) * 2019-10-22 2020-09-11 PLA Strategic Support Force Information Engineering University Method for designing shipborne sonar sounding lines using limited-band seabed topography
CN110955239B (en) * 2019-11-12 2021-03-02 China University of Geosciences (Wuhan) Multi-target trajectory planning method and system for unmanned ships based on inverse reinforcement learning
CN110779132A (en) * 2019-11-13 2020-02-11 Yaokong Technology (Shanghai) Co., Ltd. Reinforcement-learning-based operation control system for water pump equipment in an air-conditioning system
CN112945232B (en) * 2019-12-11 2022-11-01 Shenyang Institute of Automation, Chinese Academy of Sciences Target value planning method for near-bottom terrain tracking of an underwater robot
CN113222166A (en) * 2020-01-21 2021-08-06 Xiamen Yitong Software Technology Co., Ltd. Machine heuristic learning method, system, and device for operation behavior record management
CN112052511A (en) * 2020-06-15 2020-12-08 Chengdu Rongao Technology Co., Ltd. Air combat maneuver strategy generation technique based on deep stochastic games
CN112149354A (en) * 2020-09-24 2020-12-29 Harbin Engineering University Reinforcement learning algorithm research platform for UUV clusters
CN112650246B (en) * 2020-12-23 2022-12-09 Wuhan University of Technology Ship autonomous navigation method and device
CN112925319B (en) * 2021-01-25 2022-06-07 Harbin Engineering University Dynamic obstacle avoidance method for an autonomous underwater vehicle based on deep reinforcement learning
CN113052257B (en) * 2021-04-13 2024-04-16 Information Science Academy of China Electronics Technology Group Corporation Deep reinforcement learning method and device based on a vision Transformer
CN113110493B (en) * 2021-05-07 2022-09-30 Beijing University of Posts and Telecommunications Path planning device and path planning method based on a photonic neural network
CN114559439B (en) * 2022-04-27 2022-07-26 Nantong Kemei Automation Technology Co., Ltd. Intelligent obstacle avoidance control method and device for a mobile robot, and electronic device
CN116295449B (en) * 2023-05-25 2023-09-12 Jilin University Method and device for path indication of an autonomous underwater vehicle
CN117079118B (en) * 2023-10-16 2024-01-16 Guangzhou Huaxia Huihai Technology Co., Ltd. Underwater walking detection method and system based on visual detection

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102117071A (en) * 2009-12-30 2011-07-06 Shenyang Institute of Automation, Chinese Academy of Sciences Semi-physical simulation system for multiple underwater robots and control method thereof
CN105512769A (en) * 2015-12-16 2016-04-20 Shanghai Jiao Tong University Unmanned aerial vehicle route planning system and method based on genetic programming
CN106950969A (en) * 2017-04-28 2017-07-14 Shenzhen Weiteshi Technology Co., Ltd. Continuous control method for a mobile robot based on a mapless motion planner
CN107065881A (en) * 2017-05-17 2017-08-18 Tsinghua University Global path planning method for robots based on deep reinforcement learning
CN107168309A (en) * 2017-05-02 2017-09-15 Harbin Engineering University Behavior-based path planning method for multiple underwater robots
WO2017161632A1 (en) * 2016-03-24 2017-09-28 Zhangjiagang Institute of Industrial Technologies, Soochow University Cleaning robot optimal target path planning method based on model learning
CN107883961A (en) * 2017-11-06 2018-04-06 Harbin Engineering University Underwater robot path optimization method based on a smooth RRT algorithm
CN107918396A (en) * 2017-11-30 2018-04-17 Shenzhen Institute of Intelligent Robotics Path planning method and system for an underwater cleaning robot based on a hull model
CN108268031A (en) * 2016-12-30 2018-07-10 Shenzhen Kuang-Chi Hezhong Technology Co., Ltd. Path planning method, device, and robot

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A Novel Adaptive Second Order Sliding Mode Path Following Control for a Portable AUV; Zhang Guo-cheng; Ocean Engineering; 2018-03-01; pp. 1-11 *
Research on Recovery Strategy and Motion Control for Autonomous Underwater Vehicle; Yushan Sun; Proceedings of 2018 International Conference on Advanced Control, Automation and Artificial Intelligence (ACAAI 2018); 2018 *
Research on global path planning for AUVs based on a hierarchical Markov decision process; Hong Ye et al.; Journal of System Simulation; 2008-09-05; Vol. 20, No. 9, pp. 2361-2367 *

Also Published As

Publication number Publication date
CN109241552A (en) 2019-01-18

Similar Documents

Publication Publication Date Title
CN109241552B (en) Underwater robot motion planning method based on multiple constraint targets
CN109540151B (en) AUV three-dimensional path planning method based on reinforcement learning
Sun et al. Mapless motion planning system for an autonomous underwater vehicle using policy gradient-based deep reinforcement learning
CN109765929B (en) UUV real-time obstacle avoidance planning method based on improved RNN
CN109782779B (en) AUV path planning method in ocean current environment based on population hyperheuristic algorithm
Cao et al. Target search control of AUV in underwater environment with deep reinforcement learning
CN110632931A (en) Mobile robot collision avoidance planning method based on deep reinforcement learning in dynamic environment
CN106338919B (en) Unmanned surface vehicle trajectory tracking control method based on a reinforcement-learning intelligent algorithm
CN109784201B (en) AUV dynamic obstacle avoidance method based on four-dimensional risk assessment
Wang et al. Improved quantum particle swarm optimization algorithm for offline path planning in AUVs
An et al. Uncertain moving obstacles avoiding method in 3D arbitrary path planning for a spherical underwater robot
Xue et al. Proximal policy optimization with reciprocal velocity obstacle based collision avoidance path planning for multi-unmanned surface vehicles
CN107168309A (en) Behavior-based path planning method for multiple underwater robots
CN112612290B (en) Underwater vehicle three-dimensional multi-task path planning method considering ocean currents
Sun et al. A novel fuzzy control algorithm for three-dimensional AUV path planning based on sonar model
CN110716574A (en) UUV real-time collision avoidance planning method based on deep Q network
Sun et al. Three dimensional D* Lite path planning for Autonomous Underwater Vehicle under partly unknown environment
Yan et al. Reinforcement Learning‐Based Autonomous Navigation and Obstacle Avoidance for USVs under Partially Observable Conditions
Zhou et al. A hybrid path planning and formation control strategy of multi-robots in a dynamic environment
Yu et al. A traversal multi-target path planning method for multi-unmanned surface vessels in space-varying ocean current
Zhang et al. Intelligent vector field histogram based collision avoidance method for auv
CN114779801A (en) Autonomous remote control underwater robot path planning method for target detection
Pandey et al. Real time navigation strategies for webots using fuzzy controller
Mohanty et al. A new intelligent approach for mobile robot navigation
CN116027796A (en) Multi-autonomous underwater robot formation control system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20190118

Assignee: Osenda (Shandong) Offshore Engineering Co., Ltd.

Assignor: Harbin Engineering University

Contract record no.: X2023230000005

Denomination of invention: A Motion Planning Method of Underwater Vehicle Based on Multi-constrained Targets

Granted publication date: 20220405

License type: Exclusive License

Record date: 20230130
