CN115446867A - Industrial mechanical arm control method and system based on digital twinning technology - Google Patents
Industrial mechanical arm control method and system based on digital twinning technology
- Publication number
- CN115446867A (application number CN202211217890.2A)
- Authority
- CN
- China
- Prior art keywords
- mechanical arm
- digital twin
- industrial mechanical
- digital
- industrial
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J17/00—Joints
- B25J17/02—Wrist joints
- B25J17/0258—Two-dimensional joints
- B25J17/0266—Two-dimensional joints comprising more than two actuating or connecting rods
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J18/00—Arms
Landscapes
- Engineering & Computer Science (AREA)
- Robotics (AREA)
- Mechanical Engineering (AREA)
- Numerical Control (AREA)
Abstract
The invention discloses an industrial mechanical arm control method and system based on a digital twinning technology. The method comprises the following steps: constructing a digital twin model of a six-degree-of-freedom industrial mechanical arm by using digital twin technology; training the digital twin model with a data-driven deep reinforcement learning algorithm to learn an optimal policy that maximizes the cumulative reward, and determining a motion trajectory of the digital twin model; and driving the industrial mechanical arm to rotate adaptively based on the motion trajectory of the digital twin model, thereby realizing intelligent control of the industrial mechanical arm. According to the invention, a digital twin is established for the industrial mechanical arm, and the proximal policy optimization (PPO) algorithm is used in reinforcement learning training to realize self-learning, adaptive rotation of the arm, which improves the degree of automation and the flexibility of mechanical arm control.
Description
Technical Field
The invention relates to the technical field of intelligent robot control, in particular to an industrial mechanical arm control method and system based on a digital twinning technology.
Background
The mechanical arm is a high-precision, multi-input multi-output, highly nonlinear and strongly coupled complex system. Owing to its operational flexibility, it has been widely applied in fields such as industrial assembly and explosion-proof safety work. At present, industrial production generally controls a mechanical arm by teaching: the arm is moved to each target position in advance, either by manual dragging or by adjustment with a teach pendant, and the position of each target is stored; in use, the arm then moves through the target points in sequence. In a new application the target positions change, so teaching must be performed again. This method is labor-intensive, inflexible and poorly adaptable, and it cannot be adjusted dynamically on demand. Moreover, as a complex system the mechanical arm is subject to uncertainties such as parameter perturbation, external disturbance and unmodeled dynamics, so re-teaching is often required.
Therefore, conventional control methods in the prior art model the mechanical arm system and control the arm through motion planning theory, which reduces the waste of manpower and material resources and improves flexibility and adaptability. Current motion planning theory includes forward kinematics and inverse kinematics: forward kinematics calculates the position of the end of the mechanical arm from the rotation angle of each axis, and inverse kinematics calculates the rotation angle required of each axis from the target position of the end of the arm. However, the existing methods still have a number of problems, including:
1. Traditional modeling is difficult. For a complex physical mechanical arm, the traditional physical modeling process is complicated and difficult, and it cannot accurately describe the motion state of the arm;
2. The motion of the mechanical arm is discontinuous. When the arm is driven by traditional forward and inverse kinematics, it may lock up and be unable to rotate at a singular point;
3. Real-time information interaction is not possible. The mechanical arm controller cannot learn the operating condition of the industrial mechanical arm immediately, so the real-time performance of the arm is difficult to guarantee.
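As a non-limiting illustration of the forward and inverse kinematics referred to above, the following Python sketch uses a planar two-link arm rather than the six-degree-of-freedom arm of the invention. The link lengths, function names and the choice of one of the two inverse solutions are assumptions made only for this example. The Jacobian determinant of such an arm is l1·l2·sin(θ2), which vanishes when the arm is fully stretched or folded; this is the kind of singular point mentioned in problem 2.

```python
import math

def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """End position of a planar two-link arm from its joint angles (radians)."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

def inverse_kinematics(x, y, l1=1.0, l2=1.0):
    """One of the two joint solutions that reach (x, y); raises if unreachable."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0:
        raise ValueError("target is outside the reachable workspace")
    theta2 = math.acos(c2)
    theta1 = math.atan2(y, x) - math.atan2(l2 * math.sin(theta2),
                                           l1 + l2 * math.cos(theta2))
    return theta1, theta2

def jacobian_determinant(theta2, l1=1.0, l2=1.0):
    """Determinant of the arm's Jacobian; zero marks a singular configuration."""
    return l1 * l2 * math.sin(theta2)
```

Near a configuration where `jacobian_determinant` approaches zero, small end-point displacements require very large joint velocities, which is why a purely kinematic controller can stall there.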
Disclosure of Invention
In order to solve the defects of the prior art, the invention provides an industrial mechanical arm control method and system based on a digital twinning technology.
In a first aspect, the present disclosure provides an industrial robot arm control method based on a digital twinning technology, including:
constructing a digital twin model of the industrial mechanical arm with six degrees of freedom by utilizing a digital twin technology; the digital twin model comprises 6 rotary joints and 5 driven parts, and a parent-child logic relationship between the rotary joints and the driven parts is set;
training the digital twin model with a data-driven deep reinforcement learning algorithm to learn an optimal policy that maximizes the cumulative reward, and determining a motion trajectory of the digital twin model;
and the industrial mechanical arm is driven to rotate in a self-adaptive manner based on the motion trail of the digital twin model, so that the intelligent control of the industrial mechanical arm is realized.
In a further technical solution, the 6 rotary joints and 5 driven parts of the digital twin model form parent-child logical relationships in pairs: the base joint, the first driven part, the shoulder joint, the second driven part, the elbow joint, the third driven part, the first wrist joint, the fourth driven part, the second wrist joint, the fifth driven part and the third wrist joint form, in that order, a parent-child logical relationship between each adjacent pair.
In a further technical solution, the parent-child logical relationship means that when one object is set as a child of another object, the former is the child object and the latter is the parent object; the child object follows any rotation of the parent object while its position relative to the parent remains unchanged, whereas the parent object does not follow when the child object rotates.
The further technical scheme is that in the modeling process, the method further comprises the following steps: setting basic parameters of the digital twin model based on actual parameters of the industrial mechanical arm; the basic parameters include joint sensitivity, joint range of motion, linear velocity of each joint, and upper limit of acceleration.
According to the further technical scheme, in the modeling process, communication connection between a digital twin model and an industrial mechanical arm is established, the industrial mechanical arm sends industrial mechanical arm synchronous parameters to the digital twin model, and the digital twin model receives target data for simulation and returns execution data to the industrial mechanical arm after training;
the execution data includes industrial robot arm pose adjustment data, that is, the rotation angle, the rotation angular velocity, and the rotation angular acceleration of each rotary joint.
In a further technical solution, the optimal motion trajectory of the digital twin model is planned based on the proximal policy optimization (PPO) algorithm, realizing a self-learning function.
In a further technical solution, an initial coordinate position and initial joint angles of the digital twin model are set according to the six-degree-of-freedom mechanical arm;
based on the target position, the digital twin model selects policies from a preset policy set to perform simulation runs, obtains an observation result for each run, assigns a reward based on the observation result, and selects the policy with the largest reward for the next iteration, until the optimal policy with the largest cumulative reward is obtained;
and the motion trajectory of the digital twin model is determined based on the obtained optimal policy.
In a second aspect, the present disclosure provides an industrial robot arm control system based on a digital twinning technology, including a six-degree-of-freedom industrial robot arm and a digital twinning simulation platform;
the digital twin simulation platform is used for executing the industrial mechanical arm control method based on the digital twinning technology provided by the first aspect, and drives the six-degree-of-freedom industrial mechanical arm to rotate adaptively, thereby realizing intelligent control of the industrial mechanical arm.
In a third aspect, the present disclosure also provides an electronic device comprising a memory and a processor, and computer instructions stored on the memory and executed on the processor, wherein the computer instructions, when executed by the processor, perform the steps of the method of the first aspect.
In a fourth aspect, the present disclosure also provides a computer-readable storage medium for storing computer instructions which, when executed by a processor, perform the steps of the method of the first aspect.
The above one or more technical solutions have the following beneficial effects:
1. The invention provides an industrial mechanical arm control method and system based on a digital twinning technology. A digital twin model of the mechanical arm is constructed with digital twin technology, connecting the digital world with the physical world and realizing bidirectional data interaction between the physical object and the virtual object. This addresses the nonlinearity and uncertainty that a traditional mechanism model cannot handle, and greatly improves the real-time performance and generalization capability of the mechanical arm.
2. The invention trains the digital twin model of the mechanical arm and uses a Markov reward process so that the digital twin learns to select, in each environmental state, the action that maximizes the reward, thereby realizing self-learning of the industrial mechanical arm and greatly improving the degree of automation.
3. The invention uses the proximal policy optimization algorithm to realize self-learning, adaptive rotation of the mechanical arm through reinforcement learning training, which solves the problem of discontinuous motion in traditional mechanical arms, realizes intelligent control of the mechanical arm, and improves the degree of automation and industrial production efficiency.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention and not to limit it.
Fig. 1 is an overall flowchart of an industrial robot arm control method based on a digital twinning technology according to an embodiment of the present invention;
Fig. 2 is a functional diagram of the five layers according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a six-degree-of-freedom industrial robot arm according to an embodiment of the present invention.
In the figures: 1, base joint; 2, shoulder joint; 3, elbow joint; 4, first wrist joint; 5, second wrist joint; 6, third wrist joint; 7, base; 8, tool end.
Detailed Description
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should further be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
Example one
The embodiment provides an industrial mechanical arm control method based on a digital twinning technology, as shown in fig. 1, comprising the following steps:
S1, constructing a digital twin model of the six-degree-of-freedom industrial mechanical arm by using digital twin technology; the digital twin model comprises 6 rotary joints and 5 driven parts, and a parent-child logical relationship between the rotary joints and the driven parts is set;
S2, training the digital twin model with a data-driven deep reinforcement learning algorithm to learn an optimal policy that maximizes the cumulative reward, and determining a motion trajectory of the digital twin model;
S3, driving the industrial mechanical arm to rotate adaptively based on the motion trajectory of the digital twin model, thereby realizing intelligent control of the industrial mechanical arm.
In this embodiment, an industrial grabbing mechanical arm is taken as an example, and analysis and setting are performed in five layers, namely, a physical layer, a data layer, a transmission layer, a virtual layer and a service layer, so that intelligent control over the industrial mechanical arm is realized. As shown in fig. 2, in terms of physical layer, a six-degree-of-freedom industrial mechanical arm is used as a basis for virtual entity, service and twin data interaction; in the aspect of a data layer, dividing data into an inherent information layer and a pose adjusting layer; in the aspect of a transmission layer, data transmission between the six-degree-of-freedom industrial mechanical arm and the Unity3D platform is realized based on a Socket Modbus communication protocol; in the aspect of a virtual layer, a digital twin model of the industrial grabbing mechanical arm is established on a Unity3D platform, and the rotation angle and speed data of each joint are acquired and output through the reinforcement learning training of the digital twin, so that the accurate control of the industrial mechanical arm in a physical space is realized; in the aspect of a service layer, data related to the motion trail of the industrial mechanical arm are visually displayed.
In the step S1, a digital twin model of the industrial mechanical arm with six degrees of freedom is constructed by utilizing a digital twin technology.
In the present embodiment, the structure of the industrial robot arm is shown in fig. 3, and the robot arm includes 6 rotary joints, 5 driven members, a base, and a tool end. Wherein each rotary joint represents a degree of freedom, including a base joint 1, a shoulder joint 2, an elbow joint 3, a first wrist joint 4, a second wrist joint 5, and a third wrist joint 6; a part driven by rotation, namely a driven part, is arranged between every two joints, 5 driven parts are arranged in total, a first driven part is arranged between the base joint 1 and the shoulder joint 2, a second driven part is arranged between the shoulder joint 2 and the elbow joint 3, a third driven part is arranged between the elbow joint 3 and the first wrist joint 4, a fourth driven part is arranged between the first wrist joint 4 and the second wrist joint 5, and a fifth driven part is arranged between the second wrist joint 5 and the third wrist joint 6; besides, the industrial mechanical arm further comprises a base 7 and a tool end 8, wherein the base is connected with the base joint 1 and used for connecting the mechanical arm body with a robot base, and the tool end is connected with the third wrist joint 6 and used for connecting the mechanical arm body with a tool.
On the basis of the six-degree-of-freedom industrial mechanical arm, a digital twin model of the arm is constructed in the virtual layer by using digital twin technology. The digital twin model comprises 6 rotary joints, 5 driven parts, a fixed base and a tool end at the end of the arm, and parent-child logical relationships between the rotary joints and the driven parts are set.
The parent-child logical relationship means that when one object is set as a child of another object, the former is the child object and the latter is the parent object; the child object follows any rotation of the parent object while its position relative to the parent remains unchanged, whereas the parent object does not follow when the child object rotates. One parent object can have a plurality of child objects, but a child object can have only one parent object, and a child object can in turn be the parent object of other objects.
In this embodiment, the 11 components (6 rotary joints and 5 driven components) of the digital twin model respectively form a parent-child logical relationship in pairs, specifically, the base joint 1 and the first driven component are in a parent-child logical relationship, the first driven component and the shoulder joint 2 are in a parent-child logical relationship, and so on, and finally, the base joint 1, the first driven component, the shoulder joint 2, the second driven component, the elbow joint 3, the third driven component, the first wrist joint 4, the fourth driven component, the second wrist joint 5, the fifth driven component and the third wrist joint 6 sequentially form a parent-child logical relationship in pairs.
In addition, the industrial mechanical arm further comprises a base and a tool end; the base and the base joint 1 form a parent-child logical relationship, and the third wrist joint 6 and the tool end form a parent-child logical relationship.
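As a non-limiting illustration of the parent-child logical relationship described above, the following Python sketch builds the chain of base, joints, driven parts and tool end as a planar hierarchy. It is a simplified stand-in for the Unity3D object hierarchy, not the platform's own interface; the `Node` class, the part names, the unit link length and the restriction to a plane are assumptions made only for this example.

```python
import math

class Node:
    """One part of the chain: a single parent, a fixed offset, a local rotation."""
    def __init__(self, name, offset=(0.0, 0.0), parent=None):
        self.name = name
        self.offset = offset    # position relative to the parent, in the parent's frame
        self.angle = 0.0        # local rotation in radians
        self.parent = parent    # a child has exactly one parent (None for the root)

    def world_angle(self):
        base = self.parent.world_angle() if self.parent else 0.0
        return base + self.angle

    def world_position(self):
        if self.parent is None:
            return self.offset
        px, py = self.parent.world_position()
        a = self.parent.world_angle()
        ox, oy = self.offset
        return (px + ox * math.cos(a) - oy * math.sin(a),
                py + ox * math.sin(a) + oy * math.cos(a))

# Order taken from the description: each part is the child of the one before it.
CHAIN = ["base", "base_joint", "driven_1", "shoulder_joint", "driven_2",
         "elbow_joint", "driven_3", "wrist_1", "driven_4", "wrist_2",
         "driven_5", "wrist_3", "tool_end"]

def build_chain(names, link_length=1.0):
    """Link the named parts so that each is the child of the previous one."""
    nodes, parent = {}, None
    for name in names:
        offset = (0.0, 0.0) if parent is None else (link_length, 0.0)
        parent = nodes[name] = Node(name, offset, parent)
    return nodes
```

Rotating `shoulder_joint` in this sketch moves every part below it (for example `elbow_joint`) while leaving `base_joint` where it was, which is the one-directional behaviour the description attributes to parent and child objects.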
In the modeling process, the method further comprises: setting basic parameters of the model based on the actual parameters of the industrial mechanical arm, including joint sensitivity, joint range of motion (-175° to 175° in this embodiment), and the upper limits of linear velocity and acceleration of each joint, so that the motion trajectory of the digital twin model is closer to the motion of the actual industrial mechanical arm.
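As a non-limiting illustration of how such basic parameters might constrain the model, the following Python sketch clamps a commanded joint state to configured limits. The ±175 degree range is read from this embodiment; the speed and acceleration values, the field names and the `limit_command` helper are placeholders assumed only for this example and are not taken from the Aubo-i10 data sheet.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class JointLimits:
    min_angle_deg: float = -175.0   # joint range of motion, lower bound
    max_angle_deg: float = 175.0    # joint range of motion, upper bound
    max_speed: float = 150.0        # placeholder upper limit of joint speed
    max_accel: float = 300.0        # placeholder upper limit of joint acceleration

def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))

def limit_command(limits, angle, speed, accel):
    """Return the commanded joint state after applying the configured limits."""
    return (clamp(angle, limits.min_angle_deg, limits.max_angle_deg),
            clamp(speed, -limits.max_speed, limits.max_speed),
            clamp(accel, -limits.max_accel, limits.max_accel))
```

Applying the same limits in the simulation as on the physical arm keeps a learned trajectory within what the real joints can execute.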
Specifically, in this embodiment the virtual layer uses the Unity3D platform to construct the digital twin model of the industrial robot arm. Unity3D is a multi-platform, integrated game development tool developed by Unity Technologies that lets users easily create interactive content such as three-dimensional video games, architectural visualizations and real-time three-dimensional animations. In this embodiment, Blender is used to build a model replicating the six-degree-of-freedom mechanical arm Aubo-i10, the model is imported into the Unity3D space, and the parent-child logical relationships between the joints are set; that is, in Unity, dragging one object onto another object in the object list forms a parent-child relationship between the two, which become the parent object and the child object respectively. The physical properties of the Aubo-i10 are then mapped into the virtual space and the model is reconstructed in the Unity3D space, completing the construction of the digital twin model of the six-degree-of-freedom mechanical arm Aubo-i10.
In the process of establishing the digital twin model, a communication connection is established between the digital twin model and the industrial mechanical arm. The industrial mechanical arm sends the mechanical arm synchronization parameters, namely the inherent information layer of the data layer, to the digital twin model; these include parameters of the arm such as each component and its corresponding parameters, system resolution, load capacity, maximum speed, and other actual parameters of the industrial mechanical arm. The digital twin model receives the data for simulation and, after training through the intelligent framework, returns data to the industrial mechanical arm. The returned data is the pose adjustment layer of the data layer and comprises the pose adjustment data of the industrial mechanical arm, namely the rotation angle, rotation angular velocity, rotation angular acceleration and the like of each rotary joint.
In this embodiment, a two-ended connection is established between the six-degree-of-freedom mechanical arm Aubo-i10 and the Unity3D platform based on a Socket Modbus communication protocol, so as to ensure stable and reliable data transmission. With the Aubo-i10 as the active end and the Unity3D platform as the passive end, sockets are set up at both ends and the connection is confirmed. During formal communication, the active end Aubo-i10 sends the mechanical arm synchronization parameters to the passive end; the Unity3D platform receives the data for simulation, returns data to the active end after training through the intelligent framework, and this cycle repeats. When the passive end (the Unity3D platform) fails to process a request, it feeds back an exception function code to the active end.
In this embodiment, a digital twin model of the mechanical arm is constructed with digital twin technology, connecting the digital world with the physical world and realizing bidirectional data interaction between the physical object and the virtual object. This addresses the nonlinearity and uncertainty that a traditional mechanism model cannot handle, and greatly improves the real-time performance and generalization capability of the mechanical arm.
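As a non-limiting illustration of returning pose adjustment data over a socket, the following Python sketch packs the rotation angle, angular velocity and angular acceleration of six joints into one fixed-size message and reads it back on the other end. The message layout (18 little-endian doubles) and all function names are assumptions made only for this example; the sketch does not reproduce Modbus framing or function codes, and it uses a local socket pair in place of the real connection between the Aubo-i10 and the Unity3D platform.

```python
import socket
import struct

JOINT_COUNT = 6
POSE_FORMAT = "<" + "d" * (3 * JOINT_COUNT)   # angles, velocities, accelerations
POSE_SIZE = struct.calcsize(POSE_FORMAT)

def encode_pose(angles, velocities, accelerations):
    """Pack the pose adjustment data of all joints into one binary message."""
    values = list(angles) + list(velocities) + list(accelerations)
    if len(values) != 3 * JOINT_COUNT:
        raise ValueError("expected six values per quantity")
    return struct.pack(POSE_FORMAT, *values)

def decode_pose(payload):
    """Unpack a message into (angles, velocities, accelerations) tuples."""
    values = struct.unpack(POSE_FORMAT, payload)
    return (values[:JOINT_COUNT],
            values[JOINT_COUNT:2 * JOINT_COUNT],
            values[2 * JOINT_COUNT:])

def recv_exact(sock, size):
    """Read exactly size bytes, since a stream socket may deliver partial data."""
    buffer = bytearray()
    while len(buffer) < size:
        chunk = sock.recv(size - len(buffer))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buffer.extend(chunk)
    return bytes(buffer)

def demo_exchange(angles, velocities, accelerations):
    """Send one pose message from the twin end and decode it at the arm end."""
    arm_end, twin_end = socket.socketpair()
    with arm_end, twin_end:
        twin_end.sendall(encode_pose(angles, velocities, accelerations))
        return decode_pose(recv_exact(arm_end, POSE_SIZE))
```

A fixed-size message keeps the receiving side simple: it always knows how many bytes make up one complete set of joint commands.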
As another embodiment, before performing the reinforcement learning training of the digital twin model, the method further includes: detecting whether a digital twin model obtained by digital twin modeling has the physical characteristics of the six-degree-of-freedom mechanical arm Aubo-i10 and whether the digital twin model has the physical characteristics which the six-degree-of-freedom mechanical arm Aubo-i10 does not have; and if the problem exists, the construction of the digital twin model of the industrial mechanical arm is carried out again, so that the accurate completion of the subsequent learning training and control is ensured.
In step S2, the digital twin model is trained with a data-driven deep reinforcement learning algorithm, an optimal policy that maximizes the cumulative reward is learned, and the motion trajectory of the digital twin model is determined.
In this embodiment, the optimal motion trajectory of the digital twin model is planned based on the proximal policy optimization (PPO) algorithm, realizing a self-learning function.
Specifically, an initial coordinate position and initial joint angles of the digital twin model are first set according to the six-degree-of-freedom mechanical arm;
based on the target position, the digital twin model selects policies from a preset policy set to perform simulation runs, obtains an observation result for each run, assigns a reward based on the observation result, and selects the policy with the largest reward for the next iteration, until the optimal policy with the largest cumulative reward is obtained;
and the motion trajectory of the digital twin model is determined based on the obtained optimal policy.
The PPO algorithm is an existing algorithm that improves on the TRPO algorithm (one of the policy gradient (PG) family of algorithms). Each iteration of the TRPO algorithm attempts to select an appropriate step size from the current policy such that the cumulative reward obtained by the new policy increases monotonically. The objective function is shown in equation (1):
$$\max_{\theta}\ \hat{\mathbb{E}}_t\!\left[ r_t(\theta)\,\hat{A}_t \right] \quad \text{s.t.} \quad \hat{\mathbb{E}}_t\!\left[ \mathrm{KL}\!\left[ \pi_{\theta_{old}}(\cdot \mid s_t),\ \pi_{\theta}(\cdot \mid s_t) \right] \right] \le \delta, \qquad r_t(\theta) = \frac{\pi_{\theta}(a_t \mid s_t)}{\pi_{\theta_{old}}(a_t \mid s_t)} \tag{1}$$
where $\hat{A}_t$ is the advantage function; $r_t(\theta)$ is the importance sampling weight; $\pi_{\theta}(a_t \mid s_t)$ represents the probability distribution of the new policy; $\pi_{\theta_{old}}(a_t \mid s_t)$ represents the probability distribution of the old policy; $s_t$ denotes the current state; $a_t$ denotes the action currently taken; $\pi$ denotes the policy, which is a function of the state $s$; in deep reinforcement learning the policy $\pi$ is represented by a neural network with parameters $\theta$ and is written $\pi_{\theta}$; KL denotes the KL divergence; and $\delta$ is the bound on the KL divergence that defines the trust region.
In reinforcement learning, a policy is denoted by π and represents the probability distribution with which the agent selects one action from the set of actions in the current state. The aim is to find a function f that takes the current state as input and outputs the policy π, from which the next action of the agent is obtained, that is, π = f(state). If an action helps the agent reach the target value sooner, the probability of selecting that action should be increased, that is, the reward is increased; otherwise, the probability of selecting that action is reduced, that is, the reward is reduced. On the basis of a neural network model constructed in this way, the expected return of each action is estimated, the model parameter θ is updated through the objective function so that the expected return becomes higher, and the action of the mechanical arm is output.
The TRPO algorithm defines a trust region for each round of model optimization. To keep the optimization stable, each update stays within this region, whose extent is controlled by the KL divergence, so that the probability distributions of the new and old policies do not differ greatly and improvement remains steady.
In order to control the updating amplitude of the strategy, the PPO algorithm adopts a truncated proxy objective function to realize repeated sampling and accelerate the training speed. The algorithm compares the ratio of the old strategy to the new strategyLimited to one area, the stride of the update is limited by controlling the size of the area. Compared with the method using KL divergence in TRPO for limitation, in PPOThe limitation of (2) is simpler and easier to implement. The objective function of the PPO algorithm is shown in equation (2):
where the importance sampling weight is constrained to the range $(1-\epsilon, 1+\epsilon)$ and $\epsilon$ is a hyperparameter; the min function takes the smaller of the original term and the clipped term, so that the clipped term acts as a limit when the policy update moves the weight outside the predetermined interval.
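The clipping described here can be sketched in a few lines of Python. The sketch assumes the standard form from the PPO literature, the batch mean of min(r·A, clip(r, 1-ε, 1+ε)·A), where r is the importance sampling weight and A the advantage; the function name and the default ε = 0.2 are illustrative assumptions, not values taken from this document.

```python
def clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over a batch."""
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Keep the importance sampling weight inside [1 - eps, 1 + eps].
        clipped_r = max(1.0 - epsilon, min(r, 1.0 + epsilon))
        # Take the smaller (more pessimistic) of the two terms.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)
```

With a positive advantage, a weight of 1.5 contributes only 1.2 times the advantage, so pushing the policy further brings no extra gain; with a negative advantage the unclipped, more negative term is kept.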
The PPO algorithm further improves its performance with an advantage-function estimation method and an additional entropy bonus. Constructing the advantage function with generalized advantage estimation (GAE) reduces the variance, so that the algorithm does not fluctuate strongly. The calculation formula of GAE is shown in formula (3):
where $\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t)$.
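A minimal Python sketch of generalized advantage estimation follows. It uses the temporal-difference residual $\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t)$ and assumes the standard GAE recursion $A_t = \delta_t + \gamma\lambda A_{t+1}$; the smoothing parameter `lam` (λ), the default values, and the function name are assumptions, since they are not given in the text.

```python
def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for one trajectory.

    rewards: r_0 .. r_{T-1}
    values:  V(s_0) .. V(s_T), one more entry than rewards
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # Temporal-difference residual delta_t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # A_t = delta_t + gamma * lam * A_{t+1}
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

Setting `lam` to 0 reduces each estimate to the one-step residual, while values close to 1 accumulate residuals over the rest of the trajectory.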
When the PPO algorithm is applied to a network structure in which the policy and the value function share parameters, the objective function contains, in addition to the clipped surrogate term, an error term for the value-function estimate and an entropy regularization term of the policy model that encourages exploration. The optimized objective function is therefore as shown in equation (4):
where $c_1$ and $c_2$ are two constant hyperparameters; $c_1 (V_\theta(s) - V_{target})^2$ is the mean squared error of the state-value function, and the smaller this error the better; $H(s, \pi_\theta)$ denotes the entropy of the policy $\pi_\theta$, and the larger the entropy the better.
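How the three terms combine can be sketched for a single sample as follows. The signs are an assumption based on the statement that a smaller value error and a larger entropy are better, so the objective is maximized as clipped term minus $c_1$ times the squared error plus $c_2$ times the entropy; the function names and the default values of `c1` and `c2` are illustrative only.

```python
import math


def entropy(probs):
    """Shannon entropy of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ppo_objective(l_clip, value_pred, value_target, action_probs,
                  c1=0.5, c2=0.01):
    """Clipped term minus weighted value error plus weighted entropy bonus."""
    value_error = (value_pred - value_target) ** 2
    return l_clip - c1 * value_error + c2 * entropy(action_probs)
```

A uniform action distribution has the highest entropy, so it receives the largest bonus, which is the mechanism that encourages exploration.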
PPO iterates on the optimized objective function until training completes, outputs the optimal policy of the mechanical arm for the current state, executes the corresponding actions according to that policy, and thereby plans the optimal motion trajectory of the digital twin model through self-learning.
In this embodiment, a reinforcement learning algorithm gives the mechanical arm self-learning, self-adaptive rotation and a continuous motion process. This avoids the situation in which the arm locks up and cannot rotate because of a singularity when it is driven by traditional forward and inverse kinematics. The proximal policy optimization algorithm lets the digital twin model interact with the environment and obtain feedback, which provides experience for action selection, so that the model learns an optimal policy that maximizes the accumulated reward. Self-learning, self-adaptive motion obtained in this way addresses two problems: the discontinuous motion process of traditional mechanical arms, and contact points that appear or disappear at any time and are difficult to handle with traditional optimization or gradient descent methods. The result is intelligent control of the mechanical arm and a higher degree of automation and industrial production efficiency.
In step S3, the industrial mechanical arm is driven to rotate adaptively based on the motion trajectory of the digital twin model, which achieves intelligent control of the industrial mechanical arm. Specifically, after its motion trajectory has been obtained, the digital twin model returns pose adjustment data (the rotation angle, rotation angular velocity, rotation angular acceleration and so on of each rotary joint) to the industrial mechanical arm and drives it to rotate adaptively.
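One possible shape for the pose adjustment data is sketched below in Python. The six joint names follow the joints listed in the claims (base, shoulder, elbow and three wrist joints); the class name, field names, units, and the validation rule are hypothetical and are not specified by the method.

```python
from dataclasses import dataclass
from typing import List

# Joint order assumed to follow the chain from the base to the third wrist joint.
JOINT_NAMES = ["base", "shoulder", "elbow", "wrist_1", "wrist_2", "wrist_3"]


@dataclass
class JointCommand:
    name: str
    angle: float                 # rotation angle (unit assumed: degrees)
    angular_velocity: float      # rotation angular velocity
    angular_acceleration: float  # rotation angular acceleration


def build_pose_adjustment(angles, velocities, accelerations) -> List[JointCommand]:
    """Bundle one command per rotary joint; reject incomplete data."""
    n = len(JOINT_NAMES)
    if not (len(angles) == len(velocities) == len(accelerations) == n):
        raise ValueError("expected one value per rotary joint")
    return [JointCommand(name, a, v, acc)
            for name, a, v, acc in zip(JOINT_NAMES, angles, velocities, accelerations)]
```

Checking that all six joints are present before sending is a design choice of this sketch, intended to prevent a partial command from reaching the physical arm.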
In another implementation, while the rotation of the digital twin model drives the industrial mechanical arm to rotate adaptively, the data generated during the rotation is displayed visually on the digital twin model, so that the operator can follow the operating condition of the industrial mechanical arm immediately and real-time performance is ensured.
Example two
The embodiment provides an industrial mechanical arm control system based on a digital twinning technology, which comprises a six-degree-of-freedom industrial mechanical arm and a digital twinning simulation platform;
the digital twinning simulation platform is used for executing the industrial mechanical arm control method based on the digital twinning technology provided by the first embodiment, drives the industrial mechanical arm with six degrees of freedom to rotate in a self-adaptive mode, and achieves intelligent control over the industrial mechanical arm.
Specifically, the digital twinning simulation platform uses digital twin technology to construct a digital twin model of the six-degree-of-freedom industrial mechanical arm; trains the digital twin model with a data-driven deep reinforcement learning algorithm to learn the optimal policy that maximizes the accumulated reward and to determine the motion trajectory of the digital twin model; and drives the industrial mechanical arm to rotate adaptively based on that motion trajectory, thereby achieving intelligent control of the industrial mechanical arm.
Example three
The present embodiment provides an electronic device comprising a memory and a processor and computer instructions stored on the memory and executed on the processor, wherein the computer instructions, when executed by the processor, perform the steps of the digital twinning technology based industrial robot control method as described above.
Example four
The present embodiment also provides a computer readable storage medium for storing computer instructions, which when executed by a processor, perform the steps of the digital twinning technology-based industrial robot arm control method as described above.
The steps involved in the second to fourth embodiments correspond to the first embodiment of the method, and the detailed description thereof can be found in the relevant description of the first embodiment. The term "computer-readable storage medium" should be taken to include a single medium or multiple media containing one or more sets of instructions; it should also be understood to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by a processor and that causes the processor to perform any of the methods of the present invention.
Those skilled in the art will appreciate that the modules or steps of the present invention described above can be implemented using general purpose computer means, or alternatively, they can be implemented using program code that is executable by computing means, such that they are stored in memory means for execution by the computing means, or they are separately fabricated into individual integrated circuit modules, or multiple modules or steps of them are fabricated into a single integrated circuit module. The present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, it is not intended to limit the scope of the invention, and it should be understood by those skilled in the art that various modifications and variations can be made without inventive changes in the technical solutions of the present invention.
Claims (10)
1. A control method of an industrial mechanical arm based on a digital twinning technology is characterized by comprising the following steps:
constructing a digital twin model of the industrial mechanical arm with six degrees of freedom by utilizing a digital twin technology; the digital twin model comprises 6 rotary joints and 5 driven parts, and a parent-child logic relationship between the rotary joints and the driven parts is set;
training the digital twin model with a data-driven deep reinforcement learning algorithm, learning the optimal policy that maximizes the accumulated reward, and determining the motion trajectory of the digital twin model;
and the industrial mechanical arm is driven to rotate in a self-adaptive manner based on the motion trail of the digital twin model, so that the intelligent control of the industrial mechanical arm is realized.
2. The industrial mechanical arm control method based on the digital twinning technology as claimed in claim 1, wherein the 6 rotary joints and 5 driven parts of the digital twin model form parent-child logical relationships in adjacent pairs; specifically, the base joint, the first driven part, the shoulder joint, the second driven part, the elbow joint, the third driven part, the first wrist joint, the fourth driven part, the second wrist joint, the fifth driven part and the third wrist joint form, in that order, a parent-child logical relationship between each pair of neighbours.
3. The method as claimed in claim 1, wherein the parent-child logical relationship means that when one object is set as a child of another object, the former is the child object and the latter is the parent object; the child object follows any change in the rotation of the parent object while its position relative to the parent object remains unchanged, and the parent object does not follow a change in the rotation of the child object.
4. The method for controlling an industrial mechanical arm based on the digital twinning technology as claimed in claim 1, wherein in the modeling process, the method further comprises: setting basic parameters of the digital twin model based on actual parameters of the industrial mechanical arm; the basic parameters include joint sensitivity, joint range of motion, linear velocity of each joint, and upper limit of acceleration.
5. The industrial mechanical arm control method based on the digital twin technology as claimed in claim 1, wherein in the modeling process, a communication connection between a digital twin model and the industrial mechanical arm is established, the industrial mechanical arm sends industrial mechanical arm synchronous parameters to the digital twin model, and the digital twin model receives target data for simulation and returns execution data to the industrial mechanical arm after training;
the execution data includes industrial robot arm pose adjustment data, that is, the rotation angle, the rotation angular velocity, and the rotation angular acceleration of each rotary joint.
6. The industrial mechanical arm control method based on the digital twin technology as claimed in claim 1, wherein the optimal motion trajectory of the digital twin model is planned with the proximal policy optimization (PPO) algorithm, so that a self-learning function is realized.
7. The method as claimed in claim 6, wherein an initial coordinate position and an initial joint angle of the digital twin model are set according to a six-degree-of-freedom robot arm;
based on the target position, the digital twin model selects strategies from a preset strategy set to perform simulation operation, obtains an observation result of each strategy operation, gives a reward based on the observation result, and selects the strategy with the maximum reward to perform iterative operation until the optimal strategy with the maximum accumulated reward is obtained;
and determining the motion trail of the digital twin model based on the obtained optimal strategy.
8. An industrial mechanical arm control system based on a digital twinning technology is characterized by comprising an industrial mechanical arm with six degrees of freedom and a digital twinning simulation platform;
the digital twinning simulation platform is used for executing the industrial mechanical arm control method based on the digital twinning technology and claimed in any one of claims 1-7, drives the industrial mechanical arm with six degrees of freedom to rotate in a self-adaptive mode, and achieves intelligent control over the industrial mechanical arm.
9. An electronic device, characterized by: comprising a memory and a processor and computer instructions stored on the memory and executed on the processor, which when executed by the processor, perform the steps of a method of controlling an industrial robot arm based on digital twinning as claimed in any one of claims 1-7.
10. A computer-readable storage medium characterized by: for storing computer instructions which, when executed by a processor, carry out the steps of a method of controlling an industrial robot arm based on digital twinning technology as claimed in any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211217890.2A CN115446867A (en) | 2022-09-30 | 2022-09-30 | Industrial mechanical arm control method and system based on digital twinning technology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115446867A (en) | 2022-12-09 |
Family
ID=84308747
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211217890.2A Pending CN115446867A (en) | 2022-09-30 | 2022-09-30 | Industrial mechanical arm control method and system based on digital twinning technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115446867A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115781685A (en) * | 2022-12-26 | 2023-03-14 | 广东工业大学 | High-precision mechanical arm control method and system based on reinforcement learning |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020241037A1 (en) * | 2019-05-24 | 2020-12-03 | 株式会社エクサウィザーズ | Learning device, learning method, learning program, automatic control device, automatic control method, and automatic control program |
CN112192614A (en) * | 2020-10-09 | 2021-01-08 | 西南科技大学 | Man-machine cooperation based shaft hole assembling method for nuclear operation and maintenance robot |
US20210394359A1 (en) * | 2020-06-18 | 2021-12-23 | John David MATHIEU | Robotic Intervention Systems |
CN114407015A (en) * | 2022-01-28 | 2022-04-29 | 青岛理工大学 | Teleoperation robot online teaching system and method based on digital twins |
CN114942633A (en) * | 2022-04-28 | 2022-08-26 | 华南农业大学 | Multi-agent cooperative anti-collision picking method based on digital twins and reinforcement learning |
Non-Patent Citations (1)
Title |
---|
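Xiong Juntao (熊俊涛): "Obstacle-avoidance path planning for a virtual picking robot based on deep reinforcement learning", Transactions of the Chinese Society for Agricultural Machinery (农业机械学报), vol. 51, no. 2, 31 December 2020 (2020-12-31) * |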
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109960880B (en) | Industrial robot obstacle avoidance path planning method based on machine learning | |
CN112904728B (en) | Mechanical arm sliding mode control track tracking method based on improved approach law | |
US11529733B2 (en) | Method and system for robot action imitation learning in three-dimensional space | |
CN114083539B (en) | Mechanical arm anti-interference motion planning method based on multi-agent reinforcement learning | |
JP2022061022A (en) | Technique of assembling force and torque guidance robot | |
CN117103282B (en) | Double-arm robot cooperative motion control method based on MATD3 algorithm | |
CN116533249A (en) | Mechanical arm control method based on deep reinforcement learning | |
Vladareanu et al. | The navigation of mobile robots in non-stationary and non-structured environments | |
CN115446867A (en) | Industrial mechanical arm control method and system based on digital twinning technology | |
Kim et al. | Learning and generalization of dynamic movement primitives by hierarchical deep reinforcement learning from demonstration | |
KR20240052808A (en) | Multi-robot coordination using graph neural networks | |
CN115781685A (en) | High-precision mechanical arm control method and system based on reinforcement learning | |
Dong et al. | A novel human-robot skill transfer method for contact-rich manipulation task | |
CN116079747A (en) | Robot cross-body control method, system, computer equipment and storage medium | |
Peng et al. | Moving object grasping method of mechanical arm based on deep deterministic policy gradient and hindsight experience replay | |
CN110434854A (en) | A kind of redundancy mechanical arm Visual servoing control method and apparatus based on data-driven | |
CN116009542A (en) | Dynamic multi-agent coverage path planning method, device, equipment and storage medium | |
CN114967472A (en) | Unmanned aerial vehicle trajectory tracking state compensation depth certainty strategy gradient control method | |
Kallmann et al. | A skill-based motion planning framework for humanoids | |
KR20230010746A (en) | Training an action selection system using relative entropy Q-learning | |
Zhou et al. | Intelligent Control of Manipulator Based on Deep Reinforcement Learning | |
CN117140527B (en) | Mechanical arm control method and system based on deep reinforcement learning algorithm | |
Andersson | Simulation-Driven Machine Learning Control of a Forestry Crane Manipulator | |
CN116079730B (en) | Control method and system for operation precision of arm of elevator robot | |
Xiang et al. | Interactive natural motion planning for robot systems based on representation space |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||