CN109409592A - Optimal strategy solution method of mobile robot in dynamic environment - Google Patents


Info

Publication number
CN109409592A
Authority
CN
China
Prior art keywords
state
point
task
represent
mdp
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811196536.XA
Other languages
Chinese (zh)
Other versions
CN109409592B (en)
Inventor
欧林林
范振雍
禹鑫燚
陆文祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201811196536.XA
Publication of CN109409592A
Application granted
Publication of CN109409592B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q10/047: Optimisation of routes or paths, e.g. travelling salesman problem

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Feedback Control In General (AREA)

Abstract

An optimal policy solution method for a mobile robot in a dynamic environment comprises the following steps. First, an improved weighted transition system is constructed from the robot's operating environment; the task requirements are expressed mathematically in linear temporal logic (LTL), and the LTL task formula is converted into a Büchi automaton with the LTL2BA toolkit. Next, the Cartesian product of the two is taken to obtain a product automaton that contains both the task requirements and the environment information. Useless points are then removed from the task-feasible network topology graph, and the availability of each remaining state point is further judged by dual labels and a behavior-constraint criterion, which reduces the number of state points. The remaining points are used to build an MDP model, and policy iteration yields the optimal policy. The invention handles cases in which no DRA exists and also reduces the number of usable points, so the resulting MDP is less complex and the optimal policy is obtained faster.

Description

Optimal policy solution method for a mobile robot in a dynamic environment
Technical field
The invention relates to a method for generating the optimal policy of a mobile robot in a dynamic environment.
Background
In recent years, with the development of science and technology, demand for intelligent robots in production and daily life has grown, and so have the requirements on their level of intelligence. Applying intelligent robots necessarily involves robot motion, i.e. path planning. Existing path planning methods such as genetic algorithms, particle swarm optimization, ant colony optimization, and simulated annealing plan an optimal path in a static environment from a given operating environment, and their path search assumes that every single step is deterministic. Methods such as artificial neural network algorithms, heuristic search algorithms, and sampling-based path planning algorithms can adapt to dynamically changing environments, but they still cannot complete tasks well when the task is complex and a single step offers multiple choices. Path planning for mobile robots based on linear temporal logic (LTL) describes the complex task requirements of practical applications with LTL task formulas and fuses environment information with task information, which guarantees that the path found is optimal while satisfying both the environment information and the task requirements. When a single step offers multiple choices, however, what is needed is not an optimal path but an optimal policy that satisfies the task requirements.
The traditional solution to this problem is the LTL-DRA method (DRA: deterministic Rabin automaton), which combines the environment information with a DRA. It guarantees that the resulting optimal policy completes the given task and has a lower search cost than the dynamic planning algorithms mentioned above. Using a DRA has a drawback, however: in some cases the LTL formula cannot be converted into a DRA, so the traditional solution cannot handle every case. In addition, the MDP obtained by the traditional method contains redundant states that can be reduced further.
The NBA (nondeterministic Büchi automaton) is introduced to address this shortcoming of the DRA: an NBA guarantees that every LTL formula can be converted into an automaton graph, which simplifies the subsequent operations. For the case in which a single step offers multiple choices, the model is constructed as an MDP (Markov decision process), and the optimal policy is solved by policy iteration.
Summary of the invention
To overcome the above problems of the prior art, the invention provides an optimal policy solution method for a mobile robot in a dynamic environment.
The invention describes complex task requirements in linear temporal logic (LTL), uses an NBA instead of the traditional DRA to convert the LTL formula into a graph representation, and applies dual labels and a behavior-constraint criterion to remove redundant unusable states, which simplifies the solution of the MDP problem. The flow of the invention is shown in Fig. 1. First, an improved weighted transition system is constructed from the robot's operating environment; the task requirements are expressed mathematically in LTL, and the LTL task formula is converted into a Büchi automaton with the LTL2BA toolkit. Next, the Cartesian product of the two is taken to obtain a product automaton that contains both the task requirements and the environment information. Useless points on the task-feasible network topology graph (points that have only incoming or only outgoing edges) are removed, and the availability of each remaining state point is further judged by the dual labels and the behavior-constraint criterion, which reduces the number of state points. The remaining points are used to build an MDP model, and policy iteration yields the optimal policy. The method handles cases in which no DRA exists and also reduces the number of usable points, so the resulting MDP is less complex and the optimal policy is obtained faster.
The specific steps of the optimal policy solution method for a mobile robot in a dynamic environment are as follows:
Step 1: construct the improved weighted transition system;
The robot's environment is modeled as an improved weighted transition system. A weighted transition system is a model of the environment, defined as a tuple $T := (Q, q_0, R, \Pi, L, w_T)$, where $Q$ is a finite set of states, namely the nodes selected in the environment; $q_0 \in Q$ is the initial state, i.e. the robot's starting point; $R: Q \to 2^Q$ is the transition relation, which describes the connectivity between states (between path points); $\Pi$ is the set of atomic propositions, i.e. the actions to be completed at the state points; $L: Q \to 2^\Pi$ is the labeling function; and $w_T$ is the transition weight, which serves as the metric value, i.e. the second label. Atomic propositions describe the properties of each state: $\pi \in L(q)$ holds if and only if atomic proposition $\pi$ is true at state $q$. If $q_2 \in R(q_1)$, then $q_2$ is a successor of $q_1$. Any trajectory $r_T$ of the weighted transition system is a sequence of states of $T$, $r_T = q_0 q_1 q_2 \ldots$, with $q_{i+1} \in R(q_i)$ for all $i \ge 0$, and it generates a word $o = o_1 o_2 o_3 \ldots$ with $o_i \in L(q_i)$. Fig. 2 shows the MDP process of a robot; it is converted into the weighted transition system shown in Fig. 3, in which the pickup action is executed at $q_0$ and the dropoff action at $q_9$.
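As an illustration of the tuple above, the following minimal Python sketch stores a weighted transition system and checks whether a state sequence is a valid trajectory. The class name `WeightedTransitionSystem`, the helper `build_corridor`, and the straight-corridor layout with unit weights are assumptions made for the example; only the pickup label at q0 and the dropoff label at q9 come from the text.

```python
from dataclasses import dataclass


@dataclass
class WeightedTransitionSystem:
    states: set      # Q
    initial: str     # q0
    succ: dict       # R: state -> set of successor states
    labels: dict     # L: state -> set of atomic propositions
    weights: dict    # w_T: (q, q') -> transition cost

    def successors(self, q):
        return self.succ.get(q, set())

    def is_trajectory(self, run):
        """True if the run starts at q0 and every step follows R."""
        if not run or run[0] != self.initial:
            return False
        return all(b in self.successors(a) for a, b in zip(run, run[1:]))

    def word(self, run):
        """Label sets produced along a run."""
        return [frozenset(self.labels.get(q, set())) for q in run]


def build_corridor(n=10):
    """Hypothetical layout: q0 - q1 - ... - q{n-1}, unit weights,
    pickup at q0 and dropoff at the last state."""
    states = {f"q{i}" for i in range(n)}
    succ = {q: set() for q in states}
    weights = {}
    for i in range(n - 1):
        a, b = f"q{i}", f"q{i + 1}"
        succ[a].add(b)
        succ[b].add(a)
        weights[(a, b)] = weights[(b, a)] = 1
    labels = {"q0": {"pickup"}, f"q{n - 1}": {"dropoff"}}
    return WeightedTransitionSystem(states, "q0", succ, labels, weights)


ts = build_corridor()
```

Storing R as a successor map keeps the check $q_{i+1} \in R(q_i)$ a single set lookup per step.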
Step 2: express the complex task mathematically;
Complex tasks can be expressed mathematically using linear temporal logic. Linear temporal logic (LTL) is a high-level language close to natural language. It combines the temporal operators G (always), F (eventually), X (next), and U (until) with the Boolean operators ¬ (not), ∧ (and), ∨ (or), → (implication), and ↔ (equivalence) to describe the complex tasks of a mobile robot precisely. Take, for example, the following task.
The task requires that, after visiting pickup, the robot must reach dropoff before it may return to pickup; likewise, after visiting dropoff, it must pass through pickup before it may return to dropoff.
Step 3: generate the Büchi automaton;
To combine the environment information with the task information, the LTL task formula $\varphi$ from step 2 is converted with the LTL2BA toolkit into a task feasibility graph, i.e. a Büchi automaton, as shown in Fig. 4. A Büchi automaton is a five-tuple $B := (S_B, S_{B0}, \Sigma_B, \delta_B, F_B)$, where $S_B$ is a finite set of states; $S_{B0} \in S_B$ is the initial state; $\Sigma_B$ is the input alphabet; $\delta_B \subseteq S_B \times \Sigma_B \times S_B$ is the transition relation; and $F_B \subseteq S_B$ is the set of accepting (final) states.
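The five-tuple can be represented directly in code. The sketch below is hand-written for illustration and is not the output of LTL2BA: the state names `wait_pick`, `picked`, and `wait_drop`, the guard functions, and the choice of accepting set are all assumptions that encode the pickup/dropoff alternation described in step 2. The automaton happens to be deterministic, which is a special case of a nondeterministic Büchi automaton.

```python
from dataclasses import dataclass


@dataclass
class BuchiAutomaton:
    states: set      # S_B
    initial: str     # S_B0
    delta: list      # delta_B as (source, guard, target); guard tests a label set
    accepting: set   # F_B

    def post(self, s, label):
        """All successors of s when the label set `label` is read."""
        return {t for (src, guard, t) in self.delta if src == s and guard(label)}

    def run_prefix(self, word):
        """States reachable after reading a finite prefix of an input word."""
        current = {self.initial}
        for label in word:
            current = {t for s in current for t in self.post(s, label)}
        return current


def alternation_automaton():
    """Hypothetical automaton: pickup and dropoff must alternate forever."""
    pick = lambda l: "pickup" in l
    drop = lambda l: "dropoff" in l
    idle = lambda l: "pickup" not in l and "dropoff" not in l
    delta = [
        ("wait_pick", pick, "picked"),
        ("wait_pick", idle, "wait_pick"),
        ("picked", idle, "wait_drop"),
        ("picked", drop, "wait_pick"),
        ("wait_drop", idle, "wait_drop"),
        ("wait_drop", drop, "wait_pick"),
    ]
    states = {"wait_pick", "picked", "wait_drop"}
    # "picked" is entered only on a pickup that follows a dropoff (or the start),
    # so visiting it infinitely often forces infinitely many alternations.
    return BuchiAutomaton(states, "wait_pick", delta, {"picked"})
```

An empty result from `run_prefix` means the prefix already violates the task, for example when pickup is seen twice without a dropoff in between.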
Step 4: construct the task-feasible network topology graph;
Taking the Cartesian product of the weighted transition system and the Büchi automaton gives the task-feasible network topology graph $P$, which contains both the environment information and the task information, i.e. $P = T \times B$. $P$ is a tuple $(S_P, S_{P0}, \delta_P, w_P, F_P)$, where $S_P = Q \times S_B$ is the finite set of states; $\delta_P: S_P \to 2^{S_P}$ is the transition function, defined by $(q_j, s_l) \in \delta_P((q_i, s_k))$ if and only if $q_j \in R(q_i)$ and $s_l \in \delta_B(s_k, L(q_i))$; $w_P$ is the weight inherited from $T$, i.e. $w_P((q_i, s_k), (q_j, s_l)) = w_T(q_i, q_j)$ whenever $(q_j, s_l) \in \delta_P((q_i, s_k))$; and $F_P = Q \times F_B$ is the set of accepting states. The MDP is built from useful points selected on the task-feasible network topology graph, which guarantees that the resulting decision policy satisfies both the environment information and the task requirements.
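The product definition translates almost line by line into code. The following sketch assumes plain dictionaries for $R$, $L$, and $w_T$ and a Python function for $\delta_B$; the function name `product` and the two-state toy inputs are hypothetical and serve only to show the construction.

```python
def product(Q, R, L, w, SB, deltaB, FB):
    """Build P = T x B.

    R: dict state -> set of successors, L: dict state -> frozenset of propositions,
    w: dict (q, q') -> weight, deltaB: function (s, label) -> set of successors.
    """
    SP = {(q, s) for q in Q for s in SB}
    deltaP, wP = {}, {}
    for (qi, sk) in SP:
        succ = set()
        for qj in R.get(qi, ()):
            # The automaton reads the label of the source state q_i.
            for sl in deltaB(sk, L.get(qi, frozenset())):
                succ.add((qj, sl))
                wP[((qi, sk), (qj, sl))] = w[(qi, qj)]  # weight inherited from T
        deltaP[(qi, sk)] = succ
    FP = {(q, s) for q in Q for s in FB}
    return SP, deltaP, wP, FP


# Toy inputs (assumptions): two environment states and a two-state automaton
# that moves to state 1 exactly when proposition "p" is read.
Q = {"a", "b"}
R = {"a": {"b"}, "b": {"a"}}
L = {"a": frozenset({"p"}), "b": frozenset()}
w = {("a", "b"): 2, ("b", "a"): 3}
SB, FB = {0, 1}, {1}


def deltaB(s, label):
    return {1} if "p" in label else {0}


SP, deltaP, wP, FP = product(Q, R, L, w, SB, deltaB, FB)
```

The product has $|Q| \cdot |S_B|$ states before pruning, which is why the state-point deletion of the next step matters for the size of the MDP.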
Step 5: delete state points;
On the task-feasible network topology graph obtained in step 4, useless points are removed first. These are unreachable points: some points have only incoming edges or only outgoing edges, and selecting them would interrupt the policy so that no optimal result could be obtained. Dual labels are then introduced. The first label is the state label: before a turning point, the state label of a state must match that of the previous state. For example, if the state at P1 is pickup and the turning point is at P10, then the states P2-P9 are all labeled pickup; likewise, after P10 and before P1, the states P9-P2 are all labeled dropoff. States with different labels cannot be connected, so each state carries exactly one state label. The second label is the metric value, for which distance is chosen, i.e. the distance between each state and the other states. The idea of the behavior-constraint criterion comes from legal constraints in real life: when a candidate next state makes the metric worse than at the current state, that next state is rejected. For example, suppose the robot receives its task at q1 and must move toward p10. When it reaches q4, two states, q5 and q6, can be selected for the next step. The metric of q4 is 3, i.e. q4 is 3 steps from the target point, whereas q5 is 4 steps and q6 is 2 steps from the target point, so q5 is discarded and q6 is selected. Together, the dual labels and the behavior constraint further simplify the task-feasible topology. As shown in Fig. 5, unusable state points are removed by the dual labels and the behavior-constraint criterion, and the remaining points are used to build the MDP.
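The first part of this step, removing points that have only incoming or only outgoing edges, can be sketched as a simple fixed-point loop. The function name `prune_useless`, the decision to exempt the initial node from the "no incoming edge" test, and the toy graph are assumptions of this example rather than details given in the text.

```python
def prune_useless(succ, initial):
    """Repeatedly drop nodes that have no outgoing edge, or no incoming edge
    (the initial node is allowed to have no incoming edge)."""
    graph = {u: set(vs) for u, vs in succ.items()}
    # Make sure every edge target also appears as a key.
    for vs in list(graph.values()):
        for v in vs:
            graph.setdefault(v, set())

    changed = True
    while changed:
        changed = False
        has_in = {v for vs in graph.values() for v in vs}
        for u in list(graph):
            no_out = not graph[u]
            no_in = u not in has_in and u != initial
            if no_out or no_in:
                del graph[u]
                for vs in graph.values():
                    vs.discard(u)
                changed = True
    return graph


# Toy graph (assumption): "d" is a dead end and "o" can never be entered.
example = {
    "s0": {"s1", "d"},
    "s1": {"s2"},
    "s2": {"s1"},
    "d": set(),
    "o": {"s1"},
}
pruned = prune_useless(example, "s0")
```

The loop repeats because deleting one node can leave a neighbor with no incoming or outgoing edge of its own.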
Step 6: construct the Markov decision model;
The MDP is built from the remaining state points. The five-tuple $M := (T, S, A(i), p(\cdot \mid i, a), r(i, a))$ is called a Markov decision process (MDP), where $T$ is the set of all decision epochs, i.e. the time points at which actions are chosen; $S$ is the finite state space; $A(i)$ is the set of actions available in state $i$, called the action space; $p(\cdot \mid i, a)$ is the probability distribution of the system state at the next decision epoch; and $r(i, a)$ is the reward obtained by the decision maker. Once the MDP has been built, it is solved by the policy iteration algorithm to obtain the optimal policy.
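For reference, a textbook form of policy iteration is sketched below. The text does not state the optimality criterion, so the discount factor `gamma`, the stopping tolerance, the function name `policy_iteration`, and the two-state toy MDP are assumptions of this example and not details of the patented method.

```python
def policy_iteration(states, actions, P, r, gamma=0.9, tol=1e-8):
    """P[(s, a)] = {s': probability}, r[(s, a)] = reward, actions[s] = list of actions."""
    policy = {s: actions[s][0] for s in states}
    V = {s: 0.0 for s in states}

    def q(s, a):
        return r[(s, a)] + gamma * sum(p * V[t] for t, p in P[(s, a)].items())

    while True:
        # Policy evaluation: iterate the Bellman equation of the current policy.
        while True:
            delta = 0.0
            for s in states:
                v = q(s, policy[s])
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break
        # Policy improvement: switch to a strictly better action where one exists.
        stable = True
        for s in states:
            best = max(actions[s], key=lambda a: q(s, a))
            if q(s, best) > q(s, policy[s]) + 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


# Toy MDP (assumption): from s0 the "safe" action reaches the goal at cost 2,
# the "risky" action costs 1 but succeeds only half of the time.
states = ["s0", "goal"]
actions = {"s0": ["safe", "risky"], "goal": ["stay"]}
P = {
    ("s0", "safe"): {"goal": 1.0},
    ("s0", "risky"): {"goal": 0.5, "s0": 0.5},
    ("goal", "stay"): {"goal": 1.0},
}
r = {("s0", "safe"): -2.0, ("s0", "risky"): -1.0, ("goal", "stay"): 0.0}
policy, V = policy_iteration(states, actions, P, r)
```

In the toy MDP the value of "risky" solves $v = -1 + 0.9 \cdot 0.5 \cdot v$, giving $v = -1/0.55$, which is better than the value $-2$ of "safe", so the returned policy selects "risky" in s0.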
The invention describes the complex task requirements of practical applications with LTL formulas and fuses the task information with the environment information into a task-feasible network topology graph, so that the resulting policy satisfies the task requirements and the task is executed more efficiently. The invention improves on the traditional LTL-DRA method (DRA: deterministic Rabin automaton) by proposing the LTL-DBA method, which avoids the failure of the conventional method to obtain an optimal policy when the task requires always eventually reaching a point P, $\varphi = \mathrm{G}\,\mathrm{F}\,P$. Because the efficiency of policy search depends directly on the complexity of the environment and the number of task nodes, the invention also proposes the dual-label model, which uses the rule that different states are not connected, together with the behavior-constraint criterion, to obtain a more concise task-feasible network topology graph. The MDP is built from the remaining feasible states, and the policy iteration algorithm yields the optimal policy.
The advantage of the invention is that, compared with the traditional LTL-DRA method, it is more widely applicable and obtains the optimal policy reliably.
Brief description of the drawings:
Fig. 1 is the LTL-MDP policy generation flowchart of the invention.
Fig. 2 is the Markov decision model of the invention.
Fig. 3 is the weighted transition system T of the invention.
Fig. 4 is the Büchi automaton corresponding to the formula φ of the invention.
Fig. 5 is the network topology graph of the invention.
Detailed description of the embodiments
The LTL-MDP solution method of the invention is further described below through a simplified example, with reference to the drawings.
The flow of the invention is shown in Fig. 1. First, the improved weighted transition system (Fig. 3) is constructed from the robot's operating environment (Fig. 2). The task requirement is as follows: after visiting pickup, the robot must reach dropoff before it may return to pickup; likewise, after visiting dropoff, it must pass through pickup before it may return to dropoff. The task requirement is expressed mathematically in linear temporal logic (LTL), and the LTL task formula is converted into a Büchi automaton with the LTL2BA toolkit. Next, the Cartesian product of the two is taken to obtain a product automaton that contains both the task requirements and the environment information. Useless points on the task-feasible network topology graph (points that have only incoming or only outgoing edges) are removed, and the availability of each remaining state point is further judged by the dual labels and the behavior-constraint criterion, which reduces the number of state points. The remaining points are used to build an MDP model, and policy iteration yields the optimal policy. The method handles cases in which no DRA exists and also reduces the number of usable points, so the resulting MDP is less complex and the optimal policy is obtained faster. The specific steps are as follows:
Step 1: construct the improved weighted transition system;
The robot's environment is modeled as an improved weighted transition system. A weighted transition system is a model of the environment, defined as a tuple $T := (Q, q_0, R, \Pi, L, w_T)$, where $Q$ is a finite set of states, namely the nodes selected in the environment; $q_0 \in Q$ is the initial state, i.e. the robot's starting point; $R: Q \to 2^Q$ is the transition relation, which describes the connectivity between states (between path points); $\Pi$ is the set of atomic propositions, i.e. the actions to be completed at the state points; $L: Q \to 2^\Pi$ is the labeling function; and $w_T$ is the transition weight, which serves as the metric value, i.e. the second label. Atomic propositions describe the properties of each state: $\pi \in L(q)$ holds if and only if atomic proposition $\pi$ is true at state $q$. If $q_2 \in R(q_1)$, then $q_2$ is a successor of $q_1$. Any trajectory $r_T$ of the weighted transition system is a sequence of states of $T$, $r_T = q_0 q_1 q_2 \ldots$, with $q_{i+1} \in R(q_i)$ for all $i \ge 0$, and it generates a word $o = o_1 o_2 o_3 \ldots$ with $o_i \in L(q_i)$. Fig. 2 shows the MDP process of a robot; it is converted into the weighted transition system shown in Fig. 3, in which the pickup action is executed at $q_1$ and the dropoff action at $q_{10}$.
Step 2: express the complex task mathematically;
The complex task is expressed mathematically using linear temporal logic. Linear temporal logic (LTL) is a high-level language close to natural language. It combines the temporal operators G (always), F (eventually), X (next), and U (until) with the Boolean operators ¬ (not), ∧ (and), ∨ (or), → (implication), and ↔ (equivalence) to describe the complex tasks of a mobile robot precisely. The task of Fig. 2 is the following.
The task requires that, after visiting pickup, the robot must reach dropoff before it may return to pickup; likewise, after visiting dropoff, it must pass through pickup before it may return to dropoff.
Step 3: generate the Büchi automaton;
To combine the environment information with the task information, the LTL task formula $\varphi$ from step 2 is converted with the LTL2BA toolkit into a task feasibility graph, i.e. a Büchi automaton, as shown in Fig. 4. A Büchi automaton is a five-tuple $B := (S_B, S_{B0}, \Sigma_B, \delta_B, F_B)$, where $S_B$ is a finite set of states; $S_{B0} \in S_B$ is the initial state; $\Sigma_B$ is the input alphabet; $\delta_B \subseteq S_B \times \Sigma_B \times S_B$ is the transition relation; and $F_B \subseteq S_B$ is the set of accepting (final) states.
Step 4: construct the task-feasible network topology graph;
Taking the Cartesian product of the weighted transition system and the Büchi automaton gives the task-feasible network topology graph $P$, which contains both the environment information and the task information, i.e. $P = T \times B$. $P$ is a tuple $(S_P, S_{P0}, \delta_P, w_P, F_P)$, where $S_P = Q \times S_B$ is the finite set of states; $\delta_P: S_P \to 2^{S_P}$ is the transition function, defined by $(q_j, s_l) \in \delta_P((q_i, s_k))$ if and only if $q_j \in R(q_i)$ and $s_l \in \delta_B(s_k, L(q_i))$; $w_P$ is the weight inherited from $T$, i.e. $w_P((q_i, s_k), (q_j, s_l)) = w_T(q_i, q_j)$ whenever $(q_j, s_l) \in \delta_P((q_i, s_k))$; and $F_P = Q \times F_B$ is the set of accepting states. The MDP is built from useful points selected on the task-feasible network topology graph, which guarantees that the resulting decision policy satisfies both the environment information and the task requirements.
Step 5: delete state points;
On the task-feasible network topology graph obtained in step 4, useless points are removed first, and dual labels are then introduced. The first label is the state label: before a turning point, the state label of a state must match that of the previous state. If the state at P1 is pickup and the turning point is at P10, then the states P2-P9 are all labeled pickup; likewise, after P10 and before P1, the states P9-P2 are all labeled dropoff. States with different labels cannot be connected, so each state carries exactly one state label. The second label is the metric value; this model chooses distance, i.e. the distance between each state and the other states. The idea of the behavior-constraint criterion comes from legal constraints in real life: when a candidate next state makes the metric worse than at the current state, that next state is rejected. Suppose the robot receives its task at q1 and must move toward p10. When it reaches q6, three states, q5, q7, and q9, can be selected for the next step. The metric of q5 is 3, i.e. q5 is 3 steps from the target point, whereas q7 and q9 are each 2 steps from the target point, so q5 is discarded and q7 and q9 are selected. Together, the dual labels and the behavior constraint further simplify the task-feasible topology. As shown in Fig. 5, unusable state points are removed by the dual labels and the behavior-constraint criterion, and the remaining points are used to build the MDP.
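The q6 example can be reproduced with a breadth-first search that computes the step distance to the target. This is a sketch of one possible reading of the criterion: a successor is kept only if it is strictly closer to the target than the current state. The helper names, the intermediate nodes `a` and `b`, and the edge list are hypothetical; only the distances of q5, q7, and q9 to the target (3, 2, and 2 steps) come from the text.

```python
from collections import deque


def undirected(edges):
    """Adjacency map of an undirected graph given as a list of node pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def hop_distance(adj, target):
    """Number of steps from every reachable node to the target (BFS)."""
    dist = {target: 0}
    queue = deque([target])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def behavior_filter(current, candidates, dist):
    """Keep only candidates strictly closer to the target than the current state
    (an interpretation of the behavior-constraint criterion)."""
    return [c for c in candidates if dist[c] < dist[current]]


# Hypothetical neighborhood of q6; "a" and "b" are made-up intermediate nodes.
adj = undirected([
    ("q6", "q5"), ("q6", "q7"), ("q6", "q9"),
    ("q7", "a"), ("q9", "a"), ("a", "p10"),
    ("q5", "b"), ("b", "a"),
])
dist = hop_distance(adj, "p10")
kept = behavior_filter("q6", ["q5", "q7", "q9"], dist)
```

Under this reading a successor at the same distance as the current state is also dropped, which is what removes q5 here, since q6 and q5 are both 3 steps from p10 in the assumed graph.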
Step 6: construct the Markov decision model;
The MDP is built from the remaining state points. The five-tuple $M := (T, S, A(i), p(\cdot \mid i, a), r(i, a))$ is called a Markov decision process (MDP), where $T$ is the set of all decision epochs, i.e. the time points at which actions are chosen; $S$ is the finite state space; $A(i)$ is the set of actions available in state $i$, called the action space; $p(\cdot \mid i, a)$ is the probability distribution of the system state at the next decision epoch; and $r(i, a)$ is the reward obtained by the decision maker. Once the MDP has been built, it is solved by the policy iteration algorithm to obtain the optimal policy.
The invention uses an NBA as the automaton synthesized from the LTL formula, which avoids the cases in which no DRA exists for some LTL formulas. It combines the environment information with the task formula to obtain the task-feasible network topology graph, removes useless points by the dual labels and the behavior-constraint criterion, builds the MDP from the remaining state points, and obtains the optimal policy by policy iteration. Experimental results show that the proposed method solves this class of problems well.
The content described in the embodiments of this specification merely enumerates forms in which the inventive concept may be realized. The scope of protection of the invention should not be regarded as limited to the specific forms stated in the embodiments; it also extends to equivalent technical means that a person skilled in the art could conceive on the basis of the inventive concept.

Claims (1)

1. An optimal policy solution method for a mobile robot in a dynamic environment, comprising the following steps:
Step 1: construct the improved weighted transition system;
The robot's environment is modeled as an improved weighted transition system. A weighted transition system is a model of the environment, defined as a tuple $T := (Q, q_0, R, \Pi, L, w_T)$, where $Q$ is a finite set of states, namely the nodes selected in the environment; $q_0 \in Q$ is the initial state, i.e. the robot's starting point; $R: Q \to 2^Q$ is the transition relation, which describes the connectivity between states (between path points); $\Pi$ is the set of atomic propositions, i.e. the actions to be completed at the state points; $L: Q \to 2^\Pi$ is the labeling function; and $w_T$ is the transition weight, which serves as the metric value, i.e. the second label. Atomic propositions describe the properties of each state: $\pi \in L(q)$ holds if and only if atomic proposition $\pi$ is true at state $q$. If $q_2 \in R(q_1)$, then $q_2$ is a successor of $q_1$. Any trajectory $r_T$ of the weighted transition system is a sequence of states of $T$, $r_T = q_0 q_1 q_2 \ldots$, with $q_{i+1} \in R(q_i)$ for all $i \ge 0$, and it generates a word $o = o_1 o_2 o_3 \ldots$ with $o_i \in L(q_i)$;
Step 2: express the complex task mathematically;
Complex tasks can be expressed mathematically using linear temporal logic. Linear temporal logic (LTL) is a high-level language close to natural language. It combines the temporal operators G (always), F (eventually), X (next), and U (until) with the Boolean operators ¬ (not), ∧ (and), ∨ (or), → (implication), and ↔ (equivalence) to describe the complex tasks of a mobile robot precisely;
Step 3: generate the Büchi automaton;
To combine the environment information with the task information, the LTL task formula $\varphi$ is converted with the LTL2BA toolkit into a task feasibility graph, i.e. a Büchi automaton. A Büchi automaton is a five-tuple $B := (S_B, S_{B0}, \Sigma_B, \delta_B, F_B)$, where $S_B$ is a finite set of states; $S_{B0} \in S_B$ is the initial state; $\Sigma_B$ is the input alphabet; $\delta_B \subseteq S_B \times \Sigma_B \times S_B$ is the transition relation; and $F_B \subseteq S_B$ is the set of accepting (final) states;
Step 4: construct the task-feasible network topology graph;
Taking the Cartesian product of the weighted transition system and the Büchi automaton gives the task-feasible network topology graph $P$, which contains both the environment information and the task information, i.e. $P = T \times B$. $P$ is a tuple $(S_P, S_{P0}, \delta_P, w_P, F_P)$, where $S_P = Q \times S_B$ is the finite set of states; $\delta_P: S_P \to 2^{S_P}$ is the transition function, defined by $(q_j, s_l) \in \delta_P((q_i, s_k))$ if and only if $q_j \in R(q_i)$ and $s_l \in \delta_B(s_k, L(q_i))$; $w_P$ is the weight inherited from $T$, i.e. $w_P((q_i, s_k), (q_j, s_l)) = w_T(q_i, q_j)$ whenever $(q_j, s_l) \in \delta_P((q_i, s_k))$; and $F_P = Q \times F_B$ is the set of accepting states. The MDP is built from useful points selected on the task-feasible network topology graph, which guarantees that the resulting decision policy satisfies both the environment information and the task requirements;
Step 5: delete state points;
On the task-feasible network topology graph obtained in step 4, useless points are removed first. These are unreachable points: some points have only incoming edges or only outgoing edges, and selecting them would interrupt the policy so that no optimal result could be obtained. Dual labels are then introduced. The first label is the state label: before a turning point, the state label of a state must match that of the previous state; states with different labels cannot be connected, so each state carries exactly one state label. The second label is the metric value, for which distance is chosen, i.e. the distance between each state and the other states. The idea of the behavior-constraint criterion comes from legal constraints in real life: when a candidate next state makes the metric worse than at the current state, that next state is rejected;
Step 6: construct the Markov decision model;
The MDP is built from the remaining state points. The five-tuple $M := (T, S, A(i), p(\cdot \mid i, a), r(i, a))$ is called a Markov decision process (MDP), where $T$ is the set of all decision epochs, i.e. the time points at which actions are chosen; $S$ is the finite state space; $A(i)$ is the set of actions available in state $i$, called the action space; $p(\cdot \mid i, a)$ is the probability distribution of the system state at the next decision epoch; and $r(i, a)$ is the reward obtained by the decision maker. Once the MDP has been built, it is solved by the policy iteration algorithm to obtain the optimal policy.
CN201811196536.XA 2018-10-15 2018-10-15 Optimal strategy solution method of mobile robot in dynamic environment Active CN109409592B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811196536.XA CN109409592B (en) 2018-10-15 2018-10-15 Optimal strategy solution method of mobile robot in dynamic environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811196536.XA CN109409592B (en) 2018-10-15 2018-10-15 Optimal strategy solution method of mobile robot in dynamic environment

Publications (2)

Publication Number Publication Date
CN109409592A (en) 2019-03-01
CN109409592B (en) 2021-08-24

Family

ID=65467171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811196536.XA Active CN109409592B (en) 2018-10-15 2018-10-15 Optimal strategy solution method of mobile robot in dynamic environment

Country Status (1)

Country Link
CN (1) CN109409592B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113031593A (en) * 2021-02-25 2021-06-25 上海交通大学 Active sensing task path planning method and system, robot and controller
CN113064429A (en) * 2021-03-15 2021-07-02 江南大学 Independent driving control system for magnetic micro-robot group
CN113672362A (en) * 2021-07-20 2021-11-19 中国科学技术大学先进技术研究院 Intelligent cooperative operation method and system for epidemic prevention machine group in complex and multi-environment
CN114722946A (en) * 2022-04-12 2022-07-08 中国人民解放军国防科技大学 Unmanned aerial vehicle asynchronous action and cooperation strategy synthesis method based on probability model detection

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1738291A (en) * 2005-08-26 2006-02-22 电子科技大学 Ad hoc network subsequent multi-path routing method based on load balancing
CN102819264A (en) * 2012-07-30 2012-12-12 山东大学 Q-learning initialization method for mobile robot path planning
CN103686965A (en) * 2013-12-27 2014-03-26 北京农业信息技术研究中心 Wireless sensor network sequence fan-shaped area topology control method
CN106126688A (en) * 2016-06-29 2016-11-16 厦门趣处网络科技有限公司 Intelligent network information acquisition system and method based on web content and structure mining
CN106211139A (en) * 2016-08-30 2016-12-07 单洪 Method for identifying node types in an encrypted MANET
CN106650172A (en) * 2017-01-05 2017-05-10 电子科技大学 MDP (Markov decision process) based airborne collision avoidance system logical unit design method
CN107832882A (en) * 2017-11-03 2018-03-23 上海交通大学 Passenger-seeking strategy recommendation method for taxis based on Markov decision process
CN108092891A (en) * 2017-12-07 2018-05-29 重庆邮电大学 Data scheduling method based on Markov decision process
CN108075975A (en) * 2017-12-28 2018-05-25 吉林大学 Method and system for determining the routing transmission path in an Internet of Things environment
CN108494601A (en) * 2018-04-04 2018-09-04 西安电子科技大学 Multi-constraint dual-path routing method in hierarchical deterministic networks
CN108520326A (en) * 2018-04-20 2018-09-11 湖北工业大学 Real-time synthesis method for a supervisory controller based on AGV task path scheduling
CN108594858A (en) * 2018-07-16 2018-09-28 河南大学 Unmanned aerial vehicle search method and device for a Markov moving target

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
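仵博 et al.: "Point-based online value iteration algorithm for POMDPs", Journal of Software *
唐涛 et al.: "Model-based design and verification methods for train operation control ***", 31 March 2014, China Railway Publishing House *
杨柳 et al.: "Research on intelligent planning theory and methods", 31 July 2017, Metallurgical Industry Press *
禹鑫燚 et al.: "Path planning for warehouse robots based on linear temporal logic theory", High Technology Letters *
陈魁: "Research on path planning for palletizing robots based on Markov decision process", China Master's Theses Full-text Database, Information Science and Technology *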

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113031593A (en) * 2021-02-25 2021-06-25 上海交通大学 Active sensing task path planning method and system, robot and controller
CN113064429A (en) * 2021-03-15 2021-07-02 江南大学 Independent driving control system for magnetic micro-robot group
CN113672362A (en) * 2021-07-20 2021-11-19 中国科学技术大学先进技术研究院 Intelligent cooperative operation method and system for epidemic prevention machine group in complex and multi-environment
CN113672362B (en) * 2021-07-20 2023-11-07 中国科学技术大学先进技术研究院 Intelligent collaborative operation method and system under complex multi-environment of epidemic prevention machine group
CN114722946A (en) * 2022-04-12 2022-07-08 中国人民解放军国防科技大学 Unmanned aerial vehicle asynchronous action and cooperation strategy synthesis method based on probability model detection
CN114722946B (en) * 2022-04-12 2022-12-20 中国人民解放军国防科技大学 Unmanned aerial vehicle asynchronous action and cooperation strategy synthesis method based on probability model detection

Also Published As

Publication number Publication date
CN109409592B (en) 2021-08-24

Similar Documents

Publication Publication Date Title
CN109409592A (en) The optimal policy solution of mobile robot under dynamic environment
Carletti et al. Challenging the time complexity of exact subgraph isomorphism for huge and dense graphs with VF3
Lee et al. Effective white-box testing of deep neural networks with adaptive neuron-selection strategy
CN101093559B (en) Method for constructing expert system based on knowledge discovery
Hsu et al. A knowledge-based engineering system for assembly sequence planning
Chen et al. A three-stage integrated approach for assembly sequence planning using neural networks
CN110046810A (en) Shop-floor multi-objective scheduling method based on timed Petri nets
CN107016077B (en) Optimization method for Web service combination
CN116402002B (en) Multi-target layered reinforcement learning method for chip layout problem
CN110428015A (en) Model training method and related device
Kamalian et al. Reducing human fatigue in interactive evolutionary computation through fuzzy systems and machine learning systems
Fritsche et al. The analysis of a cooperative hyper-heuristic on a constrained real-world many-objective continuous problem
Liu et al. Adaptive particle swarm optimizer combining hierarchical learning with variable population
Batouche et al. Semantic web services composition optimized by multi-objective evolutionary algorithms
Wang et al. Assembly sequence optimization based on hybrid symbiotic organisms search and ant colony optimization
CN101894063A (en) Method and device for generating test program for verifying function of microprocessor
Polnik et al. Ant colony optimization–evolutionary hybrid optimization with translation of problem representation
Diao et al. Fuzzy-rough classifier ensemble selection
CN113672362B (en) Intelligent collaborative operation method and system under complex multi-environment of epidemic prevention machine group
Campbell et al. A stochastic graph grammar algorithm for interactive search
Cheng et al. Using case-based reasoning to support Web service composition
Makarova et al. A case-based reasoning approach with fuzzy linguistic rules: Accuracy validation and application in interface design-support intelligent system
Schlebusch Solving the job-shop scheduling problem with reinforcement learning
CN103793769A (en) Semantic-based cloud scheduling system
Devadas et al. Review Of Different Fuzzy Logic Approaches For Prioritizing Software Requirements

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant