CN114296440A - AGV real-time scheduling method integrating online learning - Google Patents

AGV real-time scheduling method integrating online learning

Info

Publication number
CN114296440A
Authority
CN
China
Prior art keywords
agv
trolley
real
scheduling method
transportation
Prior art date
Legal status
Granted
Application number
CN202111158873.1A
Other languages
Chinese (zh)
Other versions
CN114296440B (en)
Inventor
吕玉江
王延忠
陈燕燕
姚依铭
王姝濛
Current Assignee
Beijing Great Wall Aviation Measurement And Control Technology Research Institute Co ltd
Beijing Ruisai Chang Cheng Aeronautical M & C Technology Co ltd
China Aviation Industry Corp of Beijing Institute of Measurement and Control Technology
Original Assignee
Beijing Great Wall Aviation Measurement And Control Technology Research Institute Co ltd
Beijing Ruisai Chang Cheng Aeronautical M & C Technology Co ltd
China Aviation Industry Corp of Beijing Institute of Measurement and Control Technology
Priority date
Filing date
Publication date
Application filed by Beijing Great Wall Aviation Measurement And Control Technology Research Institute Co ltd, Beijing Ruisai Chang Cheng Aeronautical M & C Technology Co ltd, and China Aviation Industry Corp of Beijing Institute of Measurement and Control Technology
Priority to CN202111158873.1A
Publication of CN114296440A
Application granted
Publication of CN114296440B
Legal status: Active

Landscapes

  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The invention belongs to the technical field of aviation intelligent manufacturing and discloses an AGV real-time scheduling method integrating online learning. The method comprises the following steps: S1, acquiring position information, environment information and path planning information of the on-site AGVs; S2, using a genetic algorithm to match the most suitable cart to the distribution task according to the information obtained in S1; S3, having the matched cart execute the transportation task, and planning the shortest path for the cart with a reinforcement learning algorithm during execution; and S4, the cart completes the transportation task along the shortest path planned in S3. The method combines a genetic algorithm with a reinforcement learning system: the genetic algorithm solves the AGV dispatching problem, while the reinforcement learning system handles AGV collision avoidance and path planning during dispatching, so that tasks are allocated to the AGVs reasonably. The method reduces algorithmic complexity, improves the working efficiency of the AGVs, and can bring greater economic benefits.

Description

AGV real-time scheduling method integrating online learning
Technical Field
The invention belongs to the technical field of aviation intelligent manufacturing, and particularly relates to an AGV real-time scheduling method integrating online learning.
Background
At present, domestic intelligent logistics has become a key link in realizing intelligent manufacturing. In a logistics system, the AGV carries parts and delivers them to the proper place in time according to the system's requirements. Improving the working efficiency of the AGV cart is therefore of great significance to enterprise development. Domestic logistics demand is larger than that of other countries, so research on real-time regulation and control is particularly important.
At present, the real-time scheduling of domestic AGVs is strongly influenced by the algorithm used, and complex spatial environments reduce scheduling efficiency. Increasing computation speed by upgrading the AGV cart's hardware would require huge cost and is difficult to realize; optimizing the current algorithm, however, can greatly improve the AGV's working efficiency and achieve optimal arrangement during distribution, making the approach suitable for popularization.
Disclosure of Invention
In view of the above problems, the invention aims to provide an AGV real-time scheduling method integrating online learning, which combines a genetic algorithm with a reinforcement learning system: the genetic algorithm solves the AGV dispatching problem, while the reinforcement learning system handles AGV collision avoidance and path planning during dispatching, so that tasks are allocated to the AGVs reasonably. The method reduces algorithmic complexity, improves the working efficiency of the AGVs, and can bring greater economic benefits.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
an AGV real-time scheduling method integrating online learning, the method comprising:
s1, acquiring position information, environment information and path planning information of the on-site AGV;
s2, matching the most suitable trolley by using a genetic algorithm according to the information acquired in the S1 to complete the distribution task;
s3, adopting the most suitable trolley matched in the S2 to execute a transportation task; in the process of executing the transportation task, planning a shortest path for the trolley by adopting a reinforcement learning algorithm;
and S4, the trolley completes the transportation task according to the shortest path planned in the S3.
The technical scheme of the invention has the characteristics and further improvements that:
(1) and S4, in the process of completing the transportation task, the trolley uses an improved artificial potential field method to realize autonomous obstacle avoidance.
(2) The position information, the environment information and the path planning information of the on-site AGV in the S1 are specifically as follows: the position information of the AGV car refers to a random initial position set for the AGV car;
the environment information refers to the random initial positions of a plurality of obstacles in the area where the AGV trolley can reach and the random initial positions of a plurality of workpieces needing to be transported;
the path planning information refers to a three-point initial path formed by the current position of the AGV trolley, the position of the workpiece to be transported and the destination position to which the workpiece is to be transported.
(3) In S2, in the process of matching the most suitable cart by using the genetic algorithm, the objective function is designed as: f = min_{1≤k≤M}{ max_{1≤i≤N}{ L_ik } };
wherein f is the objective function, representing, for each transportation scheme, the longest time needed to complete the delivery of any single workpiece; among all transportation schemes, the one whose longest time is minimal is selected as the optimal transportation scheme;
L is the workpiece delivery time, and L_ik denotes the time at which the delivery of workpiece i on AGV cart k is completed; i is a delivery target point, i = 1, 2, …, N; N is the total number of workpieces to be distributed; k is the k-th AGV cart, k = 1, 2, …, M; M is the total number of AGV carts.
(4) In S3, in the process of executing the transportation task, a process of planning the shortest path for the car using a reinforcement learning algorithm specifically includes:
s31, building a grid map with a preset size as a two-dimensional simulation environment;
s32, setting the motion of the AGV trolley in the environment established in the two-dimensional grid map;
s33, representing the instant reward function by a nonlinear piecewise function, representing by a scalar R, and feeding back the current state and reward value when the AGV encounters different objects through the design of the reward function so as to change the behavior of the AGV;
s34, initializing an environment state and starting an exploration environment, starting from a starting point, acquiring a Q value corresponding to the current state by the trolley, finding out an action corresponding to the Q value through greedy decision, and recording the coordinate of the current state; judging whether the currently selected action is favorable or not through the advantage function, if the action obtains a positive reward value, executing the action, transferring to the next state to obtain the reward value, and storing the reward value in a sample playback cache region;
and S35, evaluating whether the current strategy is the optimal strategy or the optimal path according to the average reward value of the scene number.
(5) In S32, the setting of the motion of the AGV in the environment established in the two-dimensional grid map is specifically:
defining the action space model of the AGV cart as four discrete actions, namely A = [0,1; 0,-1; -1,0; 1,0]; the AGV cart is treated as a particle, represented by a circle, and the target points are represented by boxes.
(6) In the step S34, the first step,
when the trolley is in the exploration state, if the current state coordinate has an obstacle, the reward value is-1; if no obstacle exists on the current state coordinate, the reward value is returned to be 1, and the next state is entered; if the current state is the target point, the reward value is returned to be 2, and a final path is planned;
(7) in the S4, in the process of implementing autonomous obstacle avoidance by using the artificial potential field method, the artificial potential field method specifically includes:
virtual potential fields are introduced among the AGV trolleys and between the AGV trolleys and surrounding obstacles;
when the distance between the AGV trolleys or the distance between the AGV trolleys and the obstacle is smaller than the expected distance, the AGV trolleys are far away;
when the distance between the AGV carts is greater than the desired distance, the AGV carts are brought closer together by the attractive force.
(8) The method for realizing autonomous obstacle avoidance by using the artificial potential field method specifically comprises the following steps:
regarding each AGV as a mass point, with communication radius r, desired inter-AGV distance d_α, and desired AGV-obstacle distance d_β; the force borne by an AGV is the resultant of the forces from the other AGV carts within its communication radius and the repulsive force of obstacles, and when the resultant force is 0, the AGV reaches an equilibrium state;
when all AGV carts are in the equilibrium state, autonomous obstacle avoidance is realized.
Compared with existing algorithms, the invention has the following advantages: reinforcement learning is adopted as a method for learning a mapping from environment states to behaviors; it can be used in uncertain environments and adapts automatically to environmental change. The Q-learning algorithm in reinforcement learning updates states in real time, so the agent can optimize its model in real time according to the current state. A genetic algorithm is used for multi-AGV scheduling; it can solve optimization problems with complex structure, and its search performance is not restricted by analytic properties of the objective function. In addition, cloud-edge cooperation technology is used to realize quantitative evaluation of the AGV cart's on-site operation data and of the shortest path and time of logistics transportation.
Drawings
Fig. 1 is a schematic flowchart of an AGV real-time scheduling method incorporating online learning according to an embodiment of the present invention;
FIG. 2 is an environmental two-dimensional map of the present invention;
FIG. 3 is a diagram of the AGV traveling trajectory according to the present invention;
FIG. 4 is a block diagram of the Q-learning algorithm of the present invention;
fig. 5 is a table showing the planning time of the cart path of the workshop and the distribution sequence of the workpieces.
Detailed Description
The technical solution of the present invention is further explained with reference to the drawings and the detailed description.
The embodiment of the invention provides an AGV real-time scheduling method integrating online learning, and as shown in FIG. 1, the method comprises the following steps:
s1, acquiring position information, environment information and path planning information of the on-site AGV;
s2, matching the most suitable trolley by using a genetic algorithm according to the information acquired in the S1 to complete the distribution task;
s3, adopting the most suitable trolley matched in the S2 to execute a transportation task; in the process of executing the transportation task, planning a shortest path for the trolley by adopting a reinforcement learning algorithm;
and S4, the trolley completes the transportation task according to the shortest path planned in the S3.
And S4, in the process of completing the transportation task, the trolley uses an improved artificial potential field method to realize autonomous obstacle avoidance.
The position information, the environment information and the path planning information of the on-site AGV in the S1 are specifically as follows: the position information of the AGV car refers to a random initial position set for the AGV car;
the environment information refers to the random initial positions of a plurality of obstacles in the area where the AGV trolley can reach and the random initial positions of a plurality of workpieces needing to be transported;
the path planning information refers to a three-point initial path formed by the current position of the AGV trolley, the position of the workpiece to be transported and the destination position to which the workpiece is to be transported.
Specifically, a two-dimensional simulation environment is built using the Python language and the Tkinter library. The experimental environment for the path-planning algorithm is a 30 × 30 grid map; the obstacles are set essentially the same as those of the machining workshop under study, and the start and target points are set randomly. The constructed two-dimensional environment map is shown in figure 2, where the black cells are obstacles and the white cells are obstacle-free. The MATLAB simulation algorithm then initializes the environment state and begins exploring the environment.
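The grid environment described above can be sketched as a NumPy array; the obstacle coordinates below are illustrative placeholders, since the real workshop layout behind figure 2 is not published:

```python
import numpy as np

GRID_SIZE = 30   # the patent's experiments use a 30 x 30 grid map

def build_grid(obstacles, start, goal):
    """Build a 2-D grid world: 0 = free cell, 1 = obstacle.

    The coordinates passed in below are invented placeholders for
    illustration only.
    """
    grid = np.zeros((GRID_SIZE, GRID_SIZE), dtype=int)
    for r, c in obstacles:
        grid[r, c] = 1
    assert grid[start] == 0 and grid[goal] == 0, "start/goal must be free cells"
    return grid

grid = build_grid(obstacles=[(5, 5), (5, 6), (10, 12)],
                  start=(0, 0), goal=(29, 29))
```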
Setting the action space: this sets the motion of the AGV cart within the environment established in the two-dimensional grid map. The experiment defines the robot's real action space model as four discrete actions, namely A = [0,1; 0,-1; -1,0; 1,0]. The AGV cart is approximated as a particle, represented by a circle, with the target points represented by boxes.
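The four-action space and the map border can be sketched as follows; reading each pair of A as a (row, column) displacement is an assumption about the coordinate convention:

```python
# The patent's four discrete actions, A = [0,1; 0,-1; -1,0; 1,0],
# interpreted here as (d_row, d_col) displacements.
ACTIONS = [(0, 1), (0, -1), (-1, 0), (1, 0)]

def step(state, action_idx, grid_size=30):
    """Apply one action to a (row, col) state, clipping at the map border
    so the cart never leaves the 30 x 30 grid."""
    dr, dc = ACTIONS[action_idx]
    r = min(max(state[0] + dr, 0), grid_size - 1)
    c = min(max(state[1] + dc, 0), grid_size - 1)
    return (r, c)
```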
The reward function is feedback that evaluates how good the action performed by the AGV cart is when transitioning from the current state to another, usually expressed as a scalar R. For the test environment in this work, a nonlinear piecewise function represents the instant reward: when the cart passes through a feasible region during environment exploration, it obtains an instant reward of 1; when it encounters an obstacle, it obtains a reward of -1; and when it reaches the target position, it obtains a reward of 2. Finally, whether the strategy is optimal is judged by the total reward value.
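A minimal sketch of this piecewise reward (+1 feasible cell, -1 obstacle, +2 goal); `demo_grid` is an invented toy map, not the workshop layout:

```python
def immediate_reward(grid, state, goal):
    """Nonlinear piecewise reward as described: +2 on reaching the goal,
    -1 on an obstacle cell (marked 1), +1 on a feasible cell."""
    if state == goal:
        return 2
    if grid[state[0]][state[1]] == 1:
        return -1
    return 1

# A tiny invented map: 1 = obstacle, 0 = free.
demo_grid = [[0, 0, 1],
             [0, 1, 0],
             [0, 0, 0]]
```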
Fig. 3 shows a path simulation result of one of the paths, and after an optimal Q function is obtained through multiple operations, an optimal path can be obtained. The specific flow of the Q learning algorithm is shown in fig. 4.
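A self-contained tabular Q-learning sketch in the spirit of the algorithm of fig. 4, on an invented 5 × 5 grid with assumed learning parameters. One deliberate deviation is flagged in the comments: a small step cost replaces the patent's +1 reward per free cell, since a positive per-step reward can make endless wandering more valuable than reaching the terminal goal:

```python
import random

random.seed(0)
N = 5                                   # small invented grid (patent uses 30 x 30)
OBSTACLES = {(1, 1), (2, 3)}            # illustrative obstacle cells
GOAL = (4, 4)
ACTIONS = [(0, 1), (0, -1), (-1, 0), (1, 0)]
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2       # assumed learning-rate/discount/epsilon

def reward_of(s):
    # Patent values for goal (+2) and obstacle (-1); a -0.04 step cost
    # replaces the patent's +1 per free cell so that reaching the terminal
    # goal, not wandering, maximizes return in this sketch.
    if s == GOAL:
        return 2.0
    return -1.0 if s in OBSTACLES else -0.04

def step(s, a):
    dr, dc = ACTIONS[a]
    return (min(max(s[0] + dr, 0), N - 1), min(max(s[1] + dc, 0), N - 1))

Q = {}
def q(s, a):
    return Q.get((s, a), 0.0)

for _ in range(3000):                   # training episodes
    s = (0, 0)
    for _ in range(100):                # cap on episode length
        # epsilon-greedy choice over the current Q-table
        a = random.randrange(4) if random.random() < EPS \
            else max(range(4), key=lambda x: q(s, x))
        s2 = step(s, a)
        r = reward_of(s2)
        target = r if s2 == GOAL else r + GAMMA * max(q(s2, x) for x in range(4))
        Q[(s, a)] = q(s, a) + ALPHA * (target - q(s, a))
        s = s2
        if s == GOAL:
            break

# Extract the learned path by following greedy actions from the start.
path, s = [(0, 0)], (0, 0)
for _ in range(50):
    s = step(s, max(range(4), key=lambda x: q(s, x)))
    path.append(s)
    if s == GOAL:
        break
```

After training, `path` holds the greedy route from the start cell to the goal, mirroring the optimal path extracted from the converged Q function in the patent's fig. 3.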
For the scheduling problem of multiple AGVs, a genetic algorithm is used. To state the problem more clearly, the cart scheduling problem is expressed with the following notation: the set of workpieces to be delivered is J_i = {j_1, j_2, …, j_N}, and the set of AGV machines used for distribution is M_k = {m_1, m_2, …, m_M}. Each delivered workpiece J_i has a different number of deliveries, numbered V_i = {V_i^1, V_i^2, …, V_i^{J_i}}; each delivery V_i^{J_i} must be made at a specified target point m(V_i^{J_i}) ∈ M_k, i.e. the target point m(V_i^{J_i}) of each step V_i^{J_i} is fixed, where m(V_i^{J_i}) denotes that delivery V_i^{J_i} of workpiece J_i is carried out by cart robot m_i, and k = 1, 2, …, M. The actual workshop scheduling system is complex; for this study, the problem is simplified during modeling in order to meet the scheduling target.
The following assumptions were made:
1) in the distribution process, the conditions of damage of processing equipment and failure of the trolley are not considered.
2) Each AGV car can only complete one delivery task at a time.
3) The distribution path and the distribution time of each workpiece are planned in real time, and other auxiliary processing time is not considered.
4) All AGV cart machines are available at time t = 0.
5) All workpieces can be dispensed at time t = 0.
6) The delivery of the individual workpieces cannot be interrupted once the delivery has started, except in the case of an emergency.
7) The sequence of the delivered work pieces must satisfy the precedence constraint condition.
Because the researched trolley dispatching problem takes the shortest final delivery completion time of all workpieces as the optimal dispatching target, an objective function can be designed as follows:
f = min_{1≤k≤M}{ max_{1≤i≤N}{ L_ik } }
in the formula: f is the objective function, representing, for each transportation scheme, the longest time needed to complete the delivery of any single workpiece; among all transportation schemes, the one whose longest time is minimal is selected as the optimal transportation scheme. L is the workpiece delivery time, and L_ik denotes the time at which the delivery of workpiece i on AGV cart k is completed; i is a delivery target point, i = 1, 2, …, N; N is the total number of workpieces to be distributed; k is the k-th AGV cart, k = 1, 2, …, M; M is the total number of AGV carts.
Each transportation scheme is a scheme of transporting the workpieces to be transported to a destination by adopting different trolleys under the current environment.
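The min–max objective f = min_k max_i L_ik can be evaluated directly once candidate schemes are enumerated; the completion times below are invented for illustration:

```python
def makespan(scheme):
    """Longest single-workpiece completion time, max_i L_ik, within one
    scheme; `scheme` maps each cart to the completion times of its workpieces."""
    return max(t for times in scheme.values() for t in times)

def best_scheme(schemes):
    """f = min over schemes of (max over workpieces of L_ik)."""
    return min(schemes, key=makespan)

# Two invented candidate schemes (completion times are placeholders).
schemes = [
    {"agv1": [8, 12], "agv2": [15]},      # slowest workpiece finishes at t = 15
    {"agv1": [9],     "agv2": [10, 11]},  # slowest workpiece finishes at t = 11
]
best = best_scheme(schemes)
```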
The constraint of the objective function is:
L_ik - T_ik ≥ L_ih (5)
L_jk - L_ik ≥ T_jk (6)
Equation (5) describes the precedence constraint on workpiece delivery: suppose AGV cart h delivers workpiece i before cart k; then L_ik denotes the time at which the delivery of workpiece i on cart k is completed (the time of reaching the target point), L_ih denotes the completion time of workpiece i delivered on cart h (the time of reaching the target point), and T_ik denotes the delivery time (the time the delivery takes) of workpiece i on cart k.
Equation (6) describes the priority constraint on AGV selection: suppose workpieces i and j both need to be delivered by cart k at a certain moment; if the task of workpiece i is issued before that of workpiece j, workpiece i is delivered first. Here L_ik denotes the completion time of workpiece i on cart k, T_ik the delivery time of workpiece i on cart k, L_ih the completion time of workpiece i on cart h, L_jk the completion time of workpiece j on cart k, and T_jk the delivery time of workpiece j on cart k; M is the number of carts and N the number of workpieces.
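Constraints (5) and (6) translate into two straightforward feasibility checks; the variable names follow the patent's notation:

```python
def precedence_ok(L_ik, T_ik, L_ih):
    """Constraint (5): the delivery of workpiece i on cart k (finishing at
    L_ik, taking T_ik) cannot start before its earlier delivery on cart h
    has finished at L_ih."""
    return L_ik - T_ik >= L_ih

def cart_order_ok(L_jk, L_ik, T_jk):
    """Constraint (6): on the same cart k, workpiece j (taking T_jk) cannot
    start before workpiece i has finished at L_ik."""
    return L_jk - L_ik >= T_jk
```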
A simulation experiment is carried out, taking the 6 × 10 scheduling problem as an example, with the following parameters: workpiece delivery tasks J_i = {1, 2, 3, 4, 5, 6}, where i = 1, 2, …, 6; AGV cart numbers M_j = {1, 2, …, 10}, where j = 1, 2, …, 10; and each workpiece has a final target point and a set of delivery sequence points along the way to that target. Fig. 5 shows the delivery order of each workpiece and the time spent on each delivery.
Through repeated simulations, with a population size of 100, selection probability 0.8, crossover probability 0.8, mutation probability 0.6 and 50 iterations, the fitness is optimal and the optimal solution is 42.
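A compact genetic-algorithm sketch for the 6 × 10 assignment using the reported parameters (population 100, crossover 0.8, mutation 0.6, 50 iterations). The per-workpiece delivery times are invented, and interpreting the reported "selection probability 0.8" as a binary-tournament win probability is an assumption:

```python
import random

random.seed(1)

N_JOBS, N_CARTS = 6, 10            # the 6 x 10 instance from the simulation
TIMES = [3, 5, 2, 7, 4, 6]         # assumed per-workpiece delivery times (not published)
POP, GENS = 100, 50                # population size and iteration count from the patent
P_SEL, P_CROSS, P_MUT = 0.8, 0.8, 0.6   # reported GA probabilities

def fitness(chrom):
    """Makespan of an assignment: chrom[job] is the cart carrying that job."""
    load = [0.0] * N_CARTS
    for job, cart in enumerate(chrom):
        load[cart] += TIMES[job]
    return max(load)               # lower is better

def select(pop):
    """Binary tournament; the fitter individual wins with probability P_SEL."""
    a, b = random.sample(pop, 2)
    fitter, other = (a, b) if fitness(a) <= fitness(b) else (b, a)
    return fitter if random.random() < P_SEL else other

def crossover(p1, p2):
    if random.random() < P_CROSS:
        cut = random.randrange(1, N_JOBS)
        return p1[:cut] + p2[cut:]
    return p1[:]

def mutate(chrom):
    if random.random() < P_MUT:
        chrom[random.randrange(N_JOBS)] = random.randrange(N_CARTS)
    return chrom

pop = [[random.randrange(N_CARTS) for _ in range(N_JOBS)] for _ in range(POP)]
for _ in range(GENS):
    elite = min(pop, key=fitness)  # elitism keeps the best schedule alive
    pop = [elite] + [mutate(crossover(select(pop), select(pop)))
                     for _ in range(POP - 1)]
best = min(pop, key=fitness)
```

With these invented times the optimum spreads the six jobs over six different carts, so the best makespan equals the longest single delivery (7).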
In cloud-edge cooperation, the central cloud issues instructions to robots at different positions, and the robots fetch the goods and send them back to the designated place. To complete autonomous obstacle avoidance during distribution, an improved artificial potential field method is used: topology switching during travel is taken into account, and the control protocol is improved so that the obstacle-avoidance algorithm remains optimized while the AGV cart cluster completes its formation tasks.
In the artificial potential field method, virtual potential fields are introduced among the intelligent AGV carts and between the carts and the surrounding environment. When the distance between two carts, or between a cart and an obstacle, is smaller than the desired distance, the field acts as a repulsive force that drives them apart; when the distance between carts is larger than the desired distance, it acts as an attractive force that draws them together. Each intelligent AGV is treated as a mass point, ignoring the complex mechanical model and aerodynamics, with communication radius r, desired inter-AGV distance d_α, and desired AGV-obstacle distance d_β. The force on an intelligent AGV is then the resultant of the forces from the other intelligent AGVs within its communication radius and the repulsive forces of obstacles; when the resultant force is 0, the AGV reaches an equilibrium state. When all intelligent AGV carts are in the equilibrium state, safe obstacle-avoiding formation travel is realized.
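The equilibrium condition above can be sketched with a spring-like pairwise force. The gain K and the numeric distances are assumptions, since the patent does not give the potential functions:

```python
import math

R_COMM = 5.0    # communication radius r (assumed value)
D_ALPHA = 2.0   # desired inter-AGV distance d_alpha (assumed value)
D_BETA = 1.5    # desired AGV-obstacle distance d_beta (assumed value)
K = 1.0         # force gain (assumed; the patent gives no potential function)

def pair_force(p, q, desired, attract=True):
    """Spring-like force on the agent at p exerted by the point q:
    repulsive inside the desired distance, attractive beyond it
    (attraction applies only between AGVs, never to obstacles)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    d = math.hypot(dx, dy)
    if d == 0 or d > R_COMM:
        return (0.0, 0.0)           # outside the communication radius
    mag = K * (d - desired)         # < 0 inside the desired distance
    if not attract:
        mag = min(mag, 0.0)         # obstacles only repel
    return (mag * dx / d, mag * dy / d)

def resultant(p, agvs, obstacles):
    """Resultant force on the AGV at p; (0, 0) means equilibrium."""
    fx = fy = 0.0
    for q in agvs:
        dfx, dfy = pair_force(p, q, D_ALPHA, attract=True)
        fx, fy = fx + dfx, fy + dfy
    for q in obstacles:
        dfx, dfy = pair_force(p, q, D_BETA, attract=False)
        fx, fy = fx + dfx, fy + dfy
    return (fx, fy)
```

A cart whose only neighbor sits exactly at the desired distance d_α experiences zero resultant force, i.e. the equilibrium state described above.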
After cloud edge cooperative computing is used, logistics distribution of the AGV is achieved, and the description of goods distribution, motion state conditions and environment conditions of the intelligent AGV is as follows:
(1) The neighborhood of an intelligent AGV is the area around its center that contains the nearby obstacles and the other intelligent AGVs.
(2) Dynamic obstacles in the environment can be detected by the intelligent AGV.
(3) Every intelligent AGV cart can receive the information of the intelligent AGV carts and the obstacles within its field, including position, velocity, obstacle radius, etc.
(4) One cart leads the team and guides the others to realize autonomous dynamic obstacle avoidance while driving, recording in real time the current motion states of the carts, the obstacle state of the surrounding environment, and so on.
(5) After a cart finishes its first goods-distribution task, it starts the second distribution task from its current position; the cart can complete the delivery task to the target point from any starting point.
The embodiment of the invention provides a construction method of an AGV real-time scheduling model integrating online learning, which optimizes the real-time performance of AGV scheduling and realizes the function of reasonably distributing tasks for a plurality of AGV trolleys. The genetic algorithm and the reinforcement learning system are combined, the genetic algorithm is used for solving the problem of dispatching the AGV, the reinforcement learning system is used for solving the problems of AGV avoidance and path planning in the dispatching process, and tasks are reasonably distributed to the AGV. And a cloud edge cooperative computing technology is introduced to complete the tasks of trolley scheduling and path planning of an aeronautical processing workshop, and the aims of reasonably matching resources and requirements and minimizing the path planning distance of the trolley are fulfilled in the scheduling process. The method reduces the complexity of the algorithm, improves the working efficiency of the AGV trolley, and can bring greater economic benefit for companies.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. An AGV real-time scheduling method fused with online learning is characterized by comprising the following steps:
s1, acquiring position information, environment information and path planning information of the on-site AGV;
s2, matching the most suitable trolley by using a genetic algorithm to complete a distribution task according to the information obtained in the S1;
s3, adopting the most suitable trolley matched in the S2 to execute a transportation task; in the process of executing the transportation task, planning a shortest path for the trolley by adopting a reinforcement learning algorithm;
and S4, the trolley completes the transportation task according to the shortest path planned in the S3.
2. The AGV real-time scheduling method integrating online learning of claim 1, wherein in S4, the improved artificial potential field method is used to realize autonomous obstacle avoidance during the transportation task of the car.
3. The AGV real-time scheduling method integrating online learning according to claim 1, wherein the position information, the environment information and the path planning information of the field AGV in S1 specifically comprise: the position information of the AGV car refers to a random initial position set for the AGV car;
the environment information refers to the random initial positions of a plurality of obstacles in the area where the AGV trolley can reach and the random initial positions of a plurality of workpieces needing to be transported;
the path planning information refers to a three-point initial path formed by the current position of the AGV trolley, the position of the workpiece to be transported and the destination position to which the workpiece is to be transported.
4. The AGV real-time scheduling method integrating online learning according to claim 1, wherein in the process of matching the most suitable cart by using the genetic algorithm in S2, the objective function is designed as: f = min_{1≤k≤M}{ max_{1≤i≤N}{ L_ik } };
wherein f is the objective function, representing, for each transportation scheme, the longest time needed to complete the delivery of any single workpiece; among all transportation schemes, the one whose longest time is minimal is selected as the optimal transportation scheme;
L is the workpiece delivery time, and L_ik denotes the time at which the delivery of workpiece i on AGV cart k is completed; i is a delivery target point, i = 1, 2, …, N; N is the total number of workpieces to be distributed; k is the k-th AGV cart, k = 1, 2, …, M; M is the total number of AGV carts.
5. The AGV real-time scheduling method integrating online learning according to claim 1, wherein in S3, in the process of executing the transportation task, a process of planning a shortest path for the car by using a reinforcement learning algorithm specifically includes:
s31, building a grid map with a preset size as a two-dimensional simulation environment;
s32, setting the motion of the AGV trolley in the environment established in the two-dimensional grid map;
s33, representing the instant reward function by a nonlinear piecewise function, representing by a scalar R, and feeding back the current state and reward value when the AGV encounters different objects through the design of the reward function so as to change the behavior of the AGV;
s34, initializing an environment state and starting an exploration environment, starting from a starting point by the trolley, acquiring a Q value corresponding to the current state, finding out an action corresponding to the Q value through greedy decision, and recording the coordinate of the current state; judging whether the currently selected action is favorable or not through the advantage function, if the action obtains a positive reward value, executing the action, transferring to the next state to obtain the reward value, and storing the reward value in a sample playback cache region;
and S35, evaluating whether the current strategy is the optimal strategy or the optimal path according to the average reward value of the scene number.
6. The AGV real-time scheduling method integrating online learning according to claim 5, wherein in S32, the motion of the AGV car in the environment established in the two-dimensional grid map is set, specifically:
defining the action space model of the AGV cart as four discrete actions, namely A = [0,1; 0,-1; -1,0; 1,0]; the AGV cart is treated as a particle, represented by a circle, and the target points are represented by boxes.
7. The AGV real-time scheduling method with integrated online learning of claim 5, wherein in S34,
when the trolley is in the exploration state, if the current state coordinate has an obstacle, the reward value is-1; if no barrier exists on the current state coordinate, returning the reward value to be 1, and entering the next state; if the current state is the target point, the reward value is returned to be 2, and a final path is planned.
8. The AGV real-time scheduling method integrating online learning according to claim 2, wherein in S4, autonomous obstacle avoidance is implemented using the artificial potential field method, which specifically comprises:
introducing virtual potential fields between the AGVs themselves and between the AGVs and surrounding obstacles;
when the distance between two AGVs, or between an AGV and an obstacle, is smaller than the desired distance, a repulsive force drives them apart;
when the distance between two AGVs is greater than the desired distance, an attractive force draws them closer together.
9. The AGV real-time scheduling method integrating online learning according to claim 8, wherein the implementation of autonomous obstacle avoidance using the artificial potential field method specifically comprises:
regarding each AGV as a particle with communication radius r, desired inter-AGV distance dα, and desired AGV-to-obstacle distance dβ; the force on an AGV is the resultant of the forces from the other AGVs within its communication radius and the repulsive forces of obstacles, and when the resultant force is 0 the AGV reaches an equilibrium state;
autonomous obstacle avoidance is achieved when all AGVs are in the equilibrium state.
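A sketch of the resultant-force and equilibrium condition in claims 8 and 9. The gains, distances, and the spring-like force law are assumed for illustration; the patent does not disclose its exact potential functions:

```python
import math

R_COMM = 5.0    # communication radius r (assumed value)
D_ALPHA = 2.0   # desired inter-AGV distance d_alpha (assumed)
D_BETA = 1.5    # desired AGV-obstacle distance d_beta (assumed)
K = 0.5         # force gain (assumed)

def pair_force(p, q, desired):
    """Spring-like force on p from q: repulsive inside the desired
    distance, attractive outside it, zero exactly at it."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    d = math.hypot(dx, dy)
    if d == 0:
        return (0.0, 0.0)
    mag = K * (d - desired)          # > 0 attracts, < 0 repels
    return (mag * dx / d, mag * dy / d)

def net_force(p, neighbours, obstacles):
    """Resultant force on one AGV: neighbour terms inside the
    communication radius plus purely repulsive obstacle terms (claim 9).
    Equilibrium is reached when this resultant is (0, 0)."""
    fx = fy = 0.0
    for q in neighbours:
        if math.hypot(q[0] - p[0], q[1] - p[1]) <= R_COMM:
            f = pair_force(p, q, D_ALPHA)
            fx, fy = fx + f[0], fy + f[1]
    for o in obstacles:
        dx, dy = p[0] - o[0], p[1] - o[1]
        d = math.hypot(dx, dy)
        if 0 < d < D_BETA:           # obstacles only ever push away
            mag = K * (D_BETA - d)
            fx, fy = fx + mag * dx / d, fy + mag * dy / d
    return (fx, fy)
```

An AGV sitting exactly at the desired distance from its only neighbour, with no obstacle in range, feels zero resultant force, which is the equilibrium state the claim describes.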
CN202111158873.1A 2021-09-30 2021-09-30 AGV real-time scheduling method integrating online learning Active CN114296440B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111158873.1A CN114296440B (en) 2021-09-30 2021-09-30 AGV real-time scheduling method integrating online learning

Publications (2)

Publication Number Publication Date
CN114296440A true CN114296440A (en) 2022-04-08
CN114296440B CN114296440B (en) 2024-04-09

Family

ID=80964371

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111158873.1A Active CN114296440B (en) 2021-09-30 2021-09-30 AGV real-time scheduling method integrating online learning

Country Status (1)

Country Link
CN (1) CN114296440B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090327011A1 (en) * 2008-06-30 2009-12-31 Autonomous Solutions, Inc. Vehicle dispatching method and system
CN108776483A (en) * 2018-08-16 2018-11-09 圆通速递有限公司 AGV paths planning methods and system based on ant group algorithm and multiple agent Q study
CN110443412A (en) * 2019-07-18 2019-11-12 华中科技大学 The intensified learning method of Logistic Scheduling and path planning in dynamic optimization process
CN112232548A (en) * 2020-09-14 2021-01-15 中国船舶重工集团公司第七一六研究所 Machining workshop intelligent scheduling optimization method and system based on genetic algorithm
CN112344944A (en) * 2020-11-24 2021-02-09 湖北汽车工业学院 Reinforced learning path planning method introducing artificial potential field
JP2021034050A (en) * 2019-08-21 2021-03-01 哈爾浜工程大学 Auv action plan and operation control method based on reinforcement learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG Bennian et al.: "RLGA: A Genetic Algorithm Based on a Reinforcement Learning Mechanism", Acta Electronica Sinica, vol. 34, no. 5, pages 856 - 866 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114952839A (en) * 2022-05-27 2022-08-30 西南交通大学 Cloud edge cooperation-based two-stage robot motion decision technology framework
CN114952839B (en) * 2022-05-27 2024-02-06 西南交通大学 Cloud edge cooperation-based two-stage robot motion decision system
CN115145285A (en) * 2022-07-29 2022-10-04 陕西科技大学 Multi-point goods taking and delivering optimal path planning method and system for storage AGV
CN115640986A (en) * 2022-12-13 2023-01-24 北京云迹科技股份有限公司 Robot scheduling method, device, equipment and medium based on rewards
CN115660386A (en) * 2022-12-13 2023-01-31 北京云迹科技股份有限公司 Robot scheduling reward method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114296440B (en) 2024-04-09

Similar Documents

Publication Publication Date Title
CN114296440A (en) AGV real-time scheduling method integrating online learning
Yang et al. Multi‐robot path planning based on a deep reinforcement learning DQN algorithm
Fu et al. An improved A* algorithm for the industrial robot path planning with high success rate and short length
Xinwei et al. A review on carrier aircraft dispatch path planning and control on deck
CN110989576B (en) Target following and dynamic obstacle avoidance control method for differential slip steering vehicle
Liu et al. Coordinated motion planning for multiple mobile robots along designed paths with formation requirement
Cai et al. Mobile robot path planning in dynamic environments: A survey
CN109978272B (en) Path planning system and method based on multiple omnidirectional mobile robots
CN113687659B (en) Optimal trajectory generation method and system based on digital twinning
CN109581987B (en) AGV (automatic guided vehicle) scheduling path planning method and system based on particle swarm optimization
Liao et al. AGV path planning model based on reinforcement learning
CN112147960A (en) Optimized scheduling method and device for flexible manufacturing system
CN108897316A (en) A kind of cluster storage robot system control method based on pheromones navigation
Chen et al. Path planning for vehicle-borne system consisting of multi air–ground robots
Ju et al. A hybrid systems-based hierarchical control architecture for heterogeneous field robot teams
CN117093009A (en) Logistics AGV trolley navigation control method and system based on machine vision
Hani et al. Simulation based optimization of a train maintenance facility
Zhang et al. Application of Automated Guided Vehicles in Smart Automated Warehouse Systems: A Survey.
Hong et al. Logistics in the sky: A two-phase optimization approach for the drone package pickup and delivery system
Jie et al. A homogenization-planning-tracking method to solve cooperative autonomous motion control for heterogeneous carrier dispatch systems
Tubaileh Layout of flexible manufacturing systems based on kinematic constraints of the autonomous material handling system
Suemitsu et al. Fast simulation-based order sequence optimization assisted by pre-trained bayesian recurrent neural network
CN113485418B (en) Flexible rope system constraint multi-robot track generation method
Jungbluth et al. Reinforcement Learning-based Scheduling of a Job-Shop Process with Distributedly Controlled Robotic Manipulators for Transport Operations
CN114537435A (en) Real-time whole vehicle track planning method in automatic driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant