CN112743540B - Hexapod robot impedance control method based on reinforcement learning
- Publication number: CN112743540B
- Application number: CN202011430098.6A
- Authority: CN (China)
- Legal status: Active
Classifications
- B25J9/1656: Programme controls characterised by programming, planning systems for manipulators
- B25J9/1664: Programme controls characterised by motion, path, trajectory planning
- B25J17/0258: Two-dimensional joints (wrist joints)
- B25J9/1602: Programme controls characterised by the control system, structure, architecture
Abstract
The invention discloses a hexapod robot impedance control method based on reinforcement learning, comprising the following steps: establishing a hexapod robot dynamic system with noise parameters based on dynamic motion primitives; determining a torque control expression based on impedance control; determining the form of a variable gain table; determining a cost function of the control system; and determining a parameter update rule based on the path integral learning algorithm. The ultimate aim of the control method is to learn and update the system parameters through the path integral learning algorithm so that the value of the cost function becomes as small as possible; the robot can then continuously adjust the reference trajectory of the foot-end motion and the controller gains under the interference of an uncertain force field, obtaining a good variable impedance control effect and moving to the desired target point in the desired manner.
Description
Technical Field
The invention relates to the field of robot control and reinforcement learning, in particular to a hexapod robot impedance control method based on reinforcement learning.
Background
In the field of hexapod robot control, the control objective is usually stable motion of the robot's foot end along a given desired trajectory, and the controller reduces the error between the desired and actual joint angles through position control. However, on uneven, complex ground the foot end of a hexapod robot may become unstable due to uneven contact forces, so compliant control is difficult to achieve with position control alone.
Impedance control is one of the most widely used methods in compliant control of hexapod robots: by varying the damping and stiffness of the end effector, both position and force are made to satisfy the desired dynamic relationship. Conventional impedance control, however, has a drawback: its control parameters are fixed, making it difficult to cope with nonlinear, time-varying disturbances in unstructured environments. Researchers have therefore proposed variable impedance control, in which the control parameters are planned and adjusted dynamically through interaction with the environment. How to adjust these parameters accurately and adaptively has become the key to intelligent control of hexapod robots.
Combining artificial intelligence techniques with variable impedance control to achieve adaptive parameter adjustment has produced good results. For example, Li Zheng et al. proposed a neural-network-based impedance control algorithm in the thesis "robot impedance control method adapted to unknown or variable environmental stiffness and damping parameters", giving the robot variable impedance capability; but the neural network approach has two disadvantages: first, a relatively complex network model must be built; second, gradients must be computed for backpropagation, which is computationally expensive. Reinforcement learning is a newer intelligent learning technique: an expression called the return function is defined, and a parameter update strategy that obtains high return is found through continual trial and error and iteration. It requires neither a system model of the controlled object nor prior knowledge of the working environment, which makes it well suited to combination with variable impedance control of robots.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a hexapod robot impedance control method based on reinforcement learning, so as to realize adaptive, smooth motion of the hexapod robot's foot end under uncertain force-field interference.
The invention is realized by at least one of the following technical solutions.
A hexapod robot impedance control method based on reinforcement learning comprises the following steps:
s1, establishing a hexapod robot dynamic system with noise parameters based on dynamic motion primitives;
s2, determining a torque control expression based on impedance control;
s3, determining the table form of the variable gain table;
s4, determining a cost function of the control system;
and S5, determining a parameter updating rule based on the path integral learning algorithm.
Preferably, in step S1, the dynamic-motion-primitive-based hexapod robot dynamic system with noise parameters is expressed as:

τ·ẍ_t = α(β(g − x_t) − ẋ_t) + g_t^T(θ + ε_{t,m}),  [g_t]_k = ω_k(s_t)·s_t·(g − x_0) / Σ_{l=1}^{K} ω_l(s_t)

where x_t is the position of the motion system, and ẋ_t and ẍ_t are the corresponding velocity and acceleration; x_0 is the initial position of the system; g is the target point, i.e. the desired motion position; τ is a scaling factor; α and β are the damping parameters of the canonical system; θ is the adjustable shape update parameter; ε_{t,m} is a noise parameter; g_t is the nonlinear forcing function; ω_k(s_t) is a basis function based on a Gaussian kernel, k indexes the k-th basis function and K is the total number of basis functions; s_t is the phase variable, which decays from 1 to 0 under first-order canonical dynamics, and ṡ_t is the corresponding phase derivative.
Preferably, ε_{t,m} is sampled randomly from a Gaussian distribution with standard deviation σ.
Preferably, the dynamic system based on dynamic motion primitives describes the change of the position x_t of the hexapod robot as it moves from the initial point x_0 to the target point g: s_t = 1 indicates that the whole motion system is at the initial point, while s_t approaching 0 indicates that it has reached the target position g. Adjusting the value of τ controls the decay rate of s_t; the desired trajectory is generated before x_t converges to g, and the trajectory shape is determined by θ.
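As an illustration of this behavior, the following is a minimal sketch (not the patented implementation) of integrating such a noisy dynamic-motion-primitive system; the basis-function spacing, widths, and default damping constants are assumptions chosen for the example:

```python
import numpy as np

def dmp_rollout(x0, g, theta, tau=1.0, alpha=25.0, beta=6.25, alpha_s=8.0,
                dt=0.001, T=1000, sigma_eps=0.0, rng=None):
    """Euler-integrate tau*xdd = alpha*(beta*(g - x) - xd) + g_t^T(theta + eps),
    with the forcing term built from phase-dependent Gaussian basis functions."""
    rng = rng if rng is not None else np.random.default_rng(0)
    K = len(theta)
    centers = np.exp(-alpha_s * np.linspace(0.0, 1.0, K))  # centers along the phase decay (assumed spacing)
    widths = np.full(K, 50.0)                              # assumed basis widths
    x, xd, s = float(x0), 0.0, 1.0
    traj = [x]
    for _ in range(T):
        w = np.exp(-widths * (s - centers) ** 2)           # omega_k(s_t)
        eps = rng.normal(0.0, sigma_eps, K) if sigma_eps > 0 else np.zeros(K)
        f = (w / w.sum()) @ (theta + eps) * s * (g - x0)   # g_t^T (theta + eps)
        xdd = (alpha * (beta * (g - x) - xd) + f) / tau    # transformation system
        xd += xdd * dt
        x += xd * dt
        s += -alpha_s * s / tau * dt                       # first-order phase decay
        traj.append(x)
    return np.array(traj)
```

With theta = 0 the forcing term vanishes and the system reduces to a critically damped spring-damper that converges from x0 to g, as the description above requires.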
Preferably, in step S2, based on the impedance control principle, the torque control expression is determined as:

u = K_P(q_{r,t} − q_t) + K_D(q̇_{r,t} − q̇_t) + f

where u is the torque control input; q_t is the actual position of the robot joint and q̇_t the actual velocity of the corresponding joint; q_{r,t} is the reference position of the robot joint and q̇_{r,t} the reference velocity of the corresponding joint; K_P is the position gain; K_D is the velocity gain, taken as K_D = c·√K_P with c a constant proportionality factor; f is a feed-forward term that compensates gravity and inertial forces, obtained from the inverse dynamics equation.
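The torque law above can be sketched as follows; the value of the proportionality constant c and the elementwise (diagonal-gain) treatment are illustrative assumptions:

```python
import numpy as np

def impedance_torque(q, qd, q_ref, qd_ref, K_P, c=0.1, f_ff=0.0):
    """u = K_P*(q_ref - q) + K_D*(qd_ref - qd) + f, with K_D = c*sqrt(K_P).

    q, qd: actual joint position/velocity; q_ref, qd_ref: reference position/
    velocity of the joint; f_ff: feed-forward compensation term (gravity, inertia).
    """
    K_P = np.asarray(K_P, dtype=float)
    K_D = c * np.sqrt(K_P)                 # velocity gain tied to the position gain
    return K_P * (np.asarray(q_ref) - q) + K_D * (np.asarray(qd_ref) - qd) + f_ff
```

For a unit position error and K_P = 100 this yields a pure stiffness torque of 100; a unit velocity error contributes through K_D = c·√K_P instead.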
Preferably, in step S3: because the position gain K_P has no specific target point in the motion system, the gain is not represented as a transformation system converging on a target point; and since K_D is tied to K_P, only K_P needs to be learned. Based on the extra dimension of dynamic motion primitives, function approximation is performed directly on K_P, giving the gain table the representation:

K_{P,t} = g_t^T(θ^K + ε_{t,m})

where θ^K is the adjustable gain-table update parameter produced by the extended dimension.
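A minimal sketch of reading such a basis-function gain table at a given phase value; the centers and widths are assumptions for the example:

```python
import numpy as np

def gain_from_table(s, theta_K, centers, widths):
    """K_P(s) = g(s)^T theta_K: normalized Gaussian-basis readout of the gain table."""
    w = np.exp(-widths * (s - centers) ** 2)   # omega_k(s)
    return float((w / w.sum()) @ theta_K)      # normalized weighted sum of gain parameters
```

Because the readout is a normalized weighted average, a constant parameter vector θ^K reproduces a constant gain at every phase, while a non-constant θ^K lets the stiffness vary smoothly along the trajectory.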
Preferably, in step S4, according to the three quantities of interest of the robot control system (position error, gain, and acceleration), the cost function is determined as:

J = w_1 Σ_t d(x_t) + w_2 Σ_t Σ_{j=1}^{4} (K_{P,t}^j − K_{P,min}^j) + w_3 Σ_t |ẍ_t|

where the cost function J is divided into three terms. In the first term, d(x_t) is the position error, i.e. the deviation from the desired motion trajectory while the robot's foot end moves from the start point to the end point; a small position error is desired to ensure accuracy. In the second term, K_{P,t}^j is the gain of the j-th joint and K_{P,min}^j the minimum gain of the j-th joint; the term sums, over the gain tables of the robot's four joints, the gains minus their corresponding minimum values, and smaller gains are desired so as to generate smaller control torques. In the third term, |ẍ_t| is the absolute value of the foot-end acceleration; large accelerations are undesirable because they can damage the motors. w_1, w_2 and w_3 are weighting coefficients.
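The three-term cost can be sketched as below; the weighting coefficients are assumptions, since the source gives no numeric weights:

```python
import numpy as np

def rollout_cost(pos_err, gains, gain_min, acc, w=(1.0, 1e-3, 1e-2)):
    """Three-term cost J: tracking error, plus gain magnitude above the allowed
    floor, plus foot-end acceleration magnitude, each summed over the rollout."""
    return (w[0] * np.sum(np.abs(pos_err))                       # position-error term
            + w[1] * np.sum(np.asarray(gains) - np.asarray(gain_min))  # gain term
            + w[2] * np.sum(np.abs(acc)))                        # acceleration term
```

Each term is nonnegative by construction (gains are floored at their minimum values), so lowering J simultaneously tightens tracking, softens the gains, and smooths the motion.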
Preferably, in step S5, the path integral learning algorithm in reinforcement learning is used to update and learn the adjustable shape parameter θ and the gain-table parameter θ^K, which are jointly denoted as the parameter vector Θ.
Preferably, the parameter update rule is determined as:

S(τ_i, m) = φ_{t_N, m} + Σ_{j=i}^{N−1} q_{t_j, m} + (1/2) Σ_{j=i}^{N−1} (θ + M_{t_j} ε_{t_j, m})^T R (θ + M_{t_j} ε_{t_j, m})
M_{t_j} = R^{−1} g_{t_j} g_{t_j}^T / (g_{t_j}^T R^{−1} g_{t_j})
P(τ_i, m) = exp(−S(τ_i, m)/λ) / Σ_{m'=1}^{M} exp(−S(τ_i, m')/λ)
δθ_{t_i} = Σ_{m=1}^{M} P(τ_i, m) M_{t_i} ε_{t_i, m}
[δθ]_k = Σ_i ω_k(s_{t_i}) [δθ_{t_i}]_k / Σ_i ω_k(s_{t_i})
Θ_new = Θ + δθ

where m denotes the m-th update (rollout) and M is the total number of updates; t_i and t_j are the i-th and j-th time instants, and t_N is the N-th, i.e. final, instant; τ_i is the cost variable of the algorithm; S(τ_i, m) is the update cost function of the path integral learning algorithm; φ_{t_N, m} is the terminal cost at t_N; q_{t_j, m} is the instantaneous cost at t_j; R is a constant positive-definite matrix; g_{t_j} is the nonlinear forcing function at t_j and g_{t_j}^T its transpose; M_{t_j} is the projection matrix onto the space of g_{t_j} with respect to R^{−1}; P(τ_i, m) is the probability variable; λ is the exponential weighting adjustment parameter; ω_k is the k-th Gaussian kernel function; δθ is the parameter update increment and δθ_{t_i} its value at time t_i; [δθ]_k is the k-th component of δθ and [δθ_{t_i}]_k its value at t_i; Θ_new is the updated parameter vector.
Preferably, in the parameter update rule, the procedure within one update period for the parameter vector Θ is:
(1) compute the update cost function S(τ_i, m) of the path integral learning algorithm;
(2) compute the probability variables P(τ_i, m) from S(τ_i, m);
(3) average the exploration noise over all rollouts, weighted by P(τ_i, m), to obtain the parameter update increment δθ;
(4) weight each component of δθ with the Gaussian kernel functions;
(5) add the parameter update increment to the original parameter vector to obtain the updated parameter vector Θ_new, completing one period of parameter updating.
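Steps (1) to (3) of the cycle above, exponentiated costs turned into probability weights and then a probability-weighted average of the exploration noise, can be sketched as follows (a simplified, single-time-step variant of the full path-integral update, without the projection matrix or kernel weighting):

```python
import numpy as np

def pi2_update(S, eps, lam=0.1):
    """Map rollout costs S (shape (M,)) to normalized probability weights
    P = exp(-S/lam), then return the probability-weighted average of the
    exploration noise eps (shape (M, K)) as the parameter increment."""
    S = np.asarray(S, dtype=float)
    S = S - S.min()                 # shifting S leaves P unchanged; avoids overflow
    P = np.exp(-S / lam)
    P /= P.sum()                    # probability variables P(tau_i, m)
    return P @ np.asarray(eps, dtype=float)   # weighted noise average: delta-theta
```

A rollout whose cost is far below the others receives essentially all the probability mass, so the update moves the parameters toward that rollout's noise; equal costs produce a zero average update.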
For the robot system, the ultimate goal is to learn and update the parameter vector Θ, i.e. θ and θ^K, through the path integral learning algorithm so that the value of the cost function J is minimized; the robot can then continuously adjust the reference trajectory x_t of the foot-end motion and the controller gain K_P under uncertain force-field interference, obtaining a good variable impedance control effect and reaching the desired target point in the desired manner.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention adopts a reinforcement learning method and uses the extra-dimension idea of dynamic motion primitives to update the impedance control parameters and realize variable impedance control, so that the hexapod robot can cope with random force-field interference in unstructured environments, generate a suitable reference trajectory, and move to the specified target point.
(2) The invention adopts a dynamic motion primitive model; the established model can generate a smooth motion trajectory of arbitrary shape, which helps realize smooth foot-end motion in unstructured environments.
(3) A model-free reinforcement learning algorithm is adopted, so no complex system model of the controlled object or environment model needs to be established; moreover, the update rule requires neither gradient computation nor backpropagation of functions, so the computational complexity is low.
Drawings
FIG. 1 is a schematic flow chart of the hexapod robot impedance control method based on reinforcement learning according to the present invention;
FIG. 2 is a scene diagram of the single-leg branched-chain experiment of the hexapod robot according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the parameter update strategy of the hexapod robot impedance control method based on reinforcement learning according to the present invention.
Detailed Description
For a better understanding of the inventive concept by those skilled in the art, the objects of the invention are described in further detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the described embodiments are only some but not all of the embodiments of the present invention, and the embodiments of the present invention are not limited to the following embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment provides a hexapod robot impedance control method based on reinforcement learning, as shown in fig. 1, the method includes the following steps:
s1, establishing a hexapod robot dynamic system with noise parameters based on dynamic motion primitives;
s2, determining a torque control expression based on impedance control;
s3, determining the table form of the variable gain table;
s4, determining a cost function of the control system;
and S5, determining a parameter updating rule based on the path integral learning algorithm.
In step S1, the dynamic-motion-primitive-based hexapod robot dynamic system with noise parameters is expressed as:

τ·ẍ_t = α(β(g − x_t) − ẋ_t) + g_t^T(θ + ε_{t,m}),  [g_t]_k = ω_k(s_t)·s_t·(g − x_0) / Σ_{l=1}^{K} ω_l(s_t)

where x_t is the position of the motion system, and ẋ_t and ẍ_t are the corresponding velocity and acceleration; x_0 is the initial position of the system; g is the target point, i.e. the desired motion position; τ is a scaling factor; α and β are the damping parameters of the canonical system; θ is the adjustable shape update parameter; ε_{t,m} is a noise parameter; g_t is the nonlinear forcing function; ω_k(s_t) is a basis function based on a Gaussian kernel, k indexes the k-th basis function and K is the total number of basis functions; s_t is the phase variable and ṡ_t the corresponding phase derivative.
In this embodiment, ε_{t,m} is sampled randomly from a Gaussian distribution with standard deviation σ = 0.2886.
The dynamic system based on dynamic motion primitives describes the change of the position x_t as the system moves from the starting point x_0 to the target point g: when s_t = 1 the whole system is at the initial point, and when s_t approaches 0 the whole system has reached the target position g. The decay rate of s_t can be controlled by adjusting the value of τ; the system generates the desired trajectory before x_t converges to g, and the trajectory shape is determined by θ.
In step S2, according to the impedance control principle, the torque control expression of the system is determined as:

u = K_P(q_{r,t} − q_t) + K_D(q̇_{r,t} − q̇_t) + f

where u is the torque control input; q_t is the actual position of the robot joint, solved by inverse kinematics from the actual foot-end position x_t, and q̇_t is the actual velocity of the corresponding joint; q_{r,t} is the reference position of the robot joint, solved by inverse kinematics from the reference foot-end position x_{r,t}, and q̇_{r,t} is the reference velocity of the corresponding joint; K_P is the stiffness coefficient, which also represents the position gain; K_D is the damping coefficient, which also represents the velocity gain and is taken as K_D = c·√K_P with c a constant proportionality factor; f is a feed-forward term that compensates gravity and inertial forces, obtained from the inverse dynamics equation.
In step S3: because the position gain K_P has no specific target point in the motion system, the gain is not represented as a transformation system converging on a target point; and since K_D is tied to K_P, only K_P needs to be learned. Using the extra-dimension idea of dynamic motion primitives, function approximation is performed directly on K_P, giving the gain table the representation:

K_{P,t} = g_t^T(θ^K + ε_{t,m})

where θ^K is the adjustable gain-table update parameter produced by the extended dimension.
In step S4, according to the three quantities of interest of the robot control system (position error, gain, and acceleration), the cost function of the system is determined as:

J = w_1 Σ_t d(x_t) + w_2 Σ_t Σ_{j=1}^{4} (K_{P,t}^j − K_{P,min}^j) + w_3 Σ_t |ẍ_t|

The cost function J is divided into three terms. In the first term, d(x_t) is the position error, i.e. the deviation from the desired motion trajectory while the robot's foot end moves from the start point to the end point; a small position error is desired to ensure accuracy, so this term should be as small as possible. In the second term, K_{P,t}^j is the gain of the j-th joint and K_{P,min}^j the minimum gain of the j-th joint; the term sums, over the gain tables of the robot's four joints, the gains minus their corresponding minimum values. The system is expected to use the smallest torque possible for control, so smaller gains, and hence a smaller second term, are desired. In the third term, |ẍ_t| is the absolute value of the foot-end acceleration; large accelerations are undesirable because they can damage the motors, so this term should also be as small as possible. w_1, w_2 and w_3 are weighting coefficients.
In this embodiment, a minimum value is imposed on the gain table, i.e. the minimum gain allowed when updating in the reinforcement learning algorithm, to prevent the tracking performance from degrading because the gain is too low.
In step S5, the path integral learning algorithm in reinforcement learning is used to update and learn θ and θ^K, jointly denoted as the parameter vector Θ, and the parameter update rule is determined as:

S(τ_i, m) = φ_{t_N, m} + Σ_{j=i}^{N−1} q_{t_j, m} + (1/2) Σ_{j=i}^{N−1} (θ + M_{t_j} ε_{t_j, m})^T R (θ + M_{t_j} ε_{t_j, m})
M_{t_j} = R^{−1} g_{t_j} g_{t_j}^T / (g_{t_j}^T R^{−1} g_{t_j})
P(τ_i, m) = exp(−S(τ_i, m)/λ) / Σ_{m'=1}^{M} exp(−S(τ_i, m')/λ)
δθ_{t_i} = Σ_{m=1}^{M} P(τ_i, m) M_{t_i} ε_{t_i, m}
[δθ]_k = Σ_i ω_k(s_{t_i}) [δθ_{t_i}]_k / Σ_i ω_k(s_{t_i})
Θ_new = Θ + δθ

where m denotes the m-th update (rollout) and M is the total number of updates; t_i and t_j are the i-th and j-th time instants, and t_N is the N-th, i.e. final, instant; τ_i is the cost variable of the algorithm; S(τ_i, m) is the update cost function of the path integral learning algorithm; φ_{t_N, m} is the terminal cost at t_N; q_{t_j, m} is the instantaneous cost at t_j; R is a constant positive-definite matrix; g_{t_j} is the nonlinear forcing function at t_j and g_{t_j}^T its transpose; M_{t_j} is the projection matrix onto the space of g_{t_j} with respect to R^{−1}; P(τ_i, m) is the probability variable; λ is the exponential weighting adjustment parameter; ω_k is the k-th Gaussian kernel function; δθ is the parameter update increment and δθ_{t_i} its value at time t_i; [δθ]_k is the k-th component of δθ and [δθ_{t_i}]_k its value at t_i; Θ_new is the updated parameter vector.
In the parameter update rule, the procedure within one update period for the parameter vector Θ is:
(1) compute the update cost function S(τ_i, m) of the path integral learning algorithm;
(2) compute the probability variables P(τ_i, m) from S(τ_i, m);
(3) average the exploration noise over all rollouts, weighted by P(τ_i, m), to obtain the parameter update increment δθ;
(4) weight each component of δθ with the Gaussian kernel functions;
(5) add the parameter update increment to the original parameter vector to obtain the updated parameter vector Θ_new, completing one period of parameter updating.
For the robot system, the ultimate goal is to learn and update the parameter vector Θ, i.e. θ and θ^K, through the path integral learning algorithm so that the value of the cost function J is minimized; the robot can then continuously adjust the reference trajectory x_t of the foot-end motion and the controller gain K_P under uncertain force-field interference, obtaining a good variable impedance control effect and reaching the desired target point in the desired manner.
In this embodiment, the experimental scenario shown in fig. 2 is adopted. The experiment takes a four-joint single-leg branched chain of the hexapod robot as a case: the foot end moves in a straight line along the desired trajectory from the coordinate (0, 0.7), the motion target point g is set to the coordinate (0, 0.5), the motion distance is 0.2 m, and the duration is 1 second. A simulated random force field is added to the foot end during the motion:

F_x = β·ẏ_t

where F_x is the disturbance force field applied to the foot end along the x-axis direction; ẏ_t is the velocity of the foot end along the y-axis; β is a scaling parameter randomly sampled from the Gaussian distribution N(1, σ), with the standard deviation chosen as σ = 0.2886. A random force field along the x-axis, which easily disturbs the robot's balance, is simulated in this way, and this scene is used as the reinforcement learning training scenario of the embodiment.
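A sketch of such a velocity-dependent random disturbance follows; the unit proportionality between F_x and ẏ is an assumed reading, since the formula itself appears only as an image in the source:

```python
import numpy as np

def random_force_field(yd, sigma=0.2886, rng=None):
    """Disturbance F_x = beta * yd, with the scaling parameter beta sampled
    from N(1, sigma) on every call (a fresh random field each time step)."""
    rng = rng if rng is not None else np.random.default_rng()
    return rng.normal(1.0, sigma) * yd
```

Because beta has mean 1, the disturbance averages to the foot-end y-velocity itself while fluctuating rollout to rollout, which is what makes the fixed-gain controller's task hard.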
In this embodiment, the parameter update strategy shown in fig. 3 is used for learning. For the initial parameter vector Θ_init, noise ε_{t,m} randomly sampled from a Gaussian distribution with standard deviation σ = 0.2886 is first added, yielding noisy parameters. The extended dynamic motion primitive model is then executed to obtain the reference position trajectory x_{r,t} and the gain table K_{P,t}, and the system cost function is computed from this trajectory and gain. The path integral learning algorithm is then executed to update the parameters, giving the updated parameter vector Θ_new; this ends the first update period, and a new period begins on this basis. In this embodiment, 100 updates are performed in total. Each time, the random noisy parameters generate a different reference position trajectory {x_{r,t}}_{m=1,2,…,100} and gain table {K_{P,t}}_{m=1,2,…,100}, so different values of the system cost function are obtained, and the learning algorithm always changes the parameter vector in the direction that reduces the cost function. After updating, the improved reference position trajectory x_{r,t} is solved by inverse kinematics to obtain q_{r,t}, and q_{r,t} and K_{P,t} are substituted into the impedance control torque input expression to obtain a good control effect; finally, the foot end of the hexapod robot moves along a smooth trajectory to the desired target point g under the action of the disturbing force field.
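The repeated update cycle can be illustrated on a toy one-dimensional cost. This is a simplified stand-in for the full system (no robot dynamics, no gain table, no kernel weighting); it shows only the sample, score, reweight, average structure of the learning loop, with all constants chosen for the example:

```python
import numpy as np

def pi2_train(theta0, cost_fn, M=10, iters=60, sigma=0.2886, lam=0.1, rng=None):
    """Repeat the update cycle: sample M noisy parameter rollouts, score each
    with cost_fn, turn costs into probability weights, and take the
    probability-weighted average of the noise as the parameter update."""
    rng = rng if rng is not None else np.random.default_rng(0)
    theta = np.array(theta0, dtype=float)
    for _ in range(iters):
        eps = rng.normal(0.0, sigma, (M, theta.size))    # exploration noise
        S = np.array([cost_fn(theta + e) for e in eps])  # one cost per rollout
        P = np.exp(-(S - S.min()) / lam)
        P /= P.sum()                                     # probability weights
        theta = theta + P @ eps                          # weighted update
    return theta
```

On a simple quadratic cost the loop walks the parameter to the minimizer without computing any gradients, which is the model-free property the description emphasizes.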
The above description is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art may, within the scope disclosed by the invention, replace or modify the technical solution and its inventive concept with equivalents, which shall all fall within the protection scope of the present invention.
Claims (9)
1. A hexapod robot impedance control method based on reinforcement learning is characterized by comprising the following steps:
s1, establishing a hexapod robot dynamic system with noise parameters based on dynamic motion primitives;
the dynamic-motion-primitive-based hexapod robot dynamic system with noise parameters being expressed as:

τ·ẍ_t = α(β(g − x_t) − ẋ_t) + g_t^T(θ + ε_{t,m}),  [g_t]_k = ω_k(s_t)·s_t·(g − x_0) / Σ_{l=1}^{K} ω_l(s_t)

where x_t is the position of the motion system, and ẋ_t and ẍ_t are the corresponding velocity and acceleration; x_0 is the initial position of the system; g is the target point, i.e. the desired motion position; τ is a scaling factor; α and β are the damping parameters of the canonical system; θ is the adjustable shape update parameter; ε_{t,m} is a noise parameter; g_t is the nonlinear forcing function; ω_k(s_t) is a basis function based on a Gaussian kernel, k indexes the k-th basis function and K is the total number of basis functions; s_t is the phase variable and ṡ_t the corresponding phase derivative;
s2, determining a torque control expression based on impedance control;
s3, determining the table form of the variable gain table;
s4, determining a cost function of the control system;
and S5, determining a parameter updating rule based on the path integral learning algorithm.
2. The impedance control method for the hexapod robot based on reinforcement learning according to claim 1, wherein ε_{t,m} is sampled randomly from a Gaussian distribution with standard deviation σ.
3. The impedance control method for the hexapod robot based on reinforcement learning according to claim 2, wherein the dynamic system based on dynamic motion primitives describes the change of the position x_t from the initial point x_0 to the target point g: s_t = 1 indicates that the whole motion system is at the initial point, while s_t approaching 0 indicates that the whole motion system has reached the target position g; the decay rate of s_t is controlled by adjusting the value of τ, the desired trajectory is generated before x_t converges to g, and the trajectory shape is determined by θ.
4. The impedance control method for the hexapod robot based on reinforcement learning according to claim 3, wherein in step S2, based on the impedance control principle, the torque control expression is determined as:

u = K_P(q_{r,t} − q_t) + K_D(q̇_{r,t} − q̇_t) + f

where u is the torque control input; q_t is the actual position of the robot joint and q̇_t the actual velocity of the corresponding joint; q_{r,t} is the reference position of the robot joint and q̇_{r,t} the reference velocity of the corresponding joint; K_P is the position gain; K_D is the velocity gain, taken as K_D = c·√K_P with c a constant proportionality factor; f is a feed-forward term that compensates gravity and inertial forces, obtained from the inverse dynamics equation.
5. The impedance control method for the hexapod robot based on reinforcement learning according to claim 4, wherein in step S3, because the position gain K_P has no specific target point in the motion system, the gain is not represented as a transformation system converging on a target point, and K_D is tied to K_P; based on the extra dimension of dynamic motion primitives, function approximation is performed directly on K_P, giving the gain table the representation:

K_{P,t} = g_t^T(θ^K + ε_{t,m})

where θ^K is the adjustable gain-table update parameter produced by the extended dimension.
6. The impedance control method for the hexapod robot based on reinforcement learning according to claim 5, wherein in step S4, according to the three quantities of interest of the robot control system (position error, gain, and acceleration), the cost function is determined as:

J = w_1 Σ_t d(x_t) + w_2 Σ_t Σ_{j=1}^{4} (K_{P,t}^j − K_{P,min}^j) + w_3 Σ_t |ẍ_t|

where the cost function J is divided into three terms: in the first term, d(x_t) is the position error, i.e. the deviation from the desired motion trajectory while the robot's foot end moves from the start point to the end point; in the second term, K_{P,t}^j is the gain of the j-th joint, K_{P,min}^j is the minimum gain of the j-th joint, and the term sums the gain tables of the robot's four joints after the corresponding minimum values are subtracted, smaller gains being desired so as to generate smaller control torques; in the third term, |ẍ_t| is the absolute value of the foot-end acceleration, large accelerations being undesirable because they can damage the motors; w_1, w_2 and w_3 are weighting coefficients.
7. The impedance control method for the hexapod robot based on reinforcement learning according to claim 6, wherein in step S5, the path integral learning algorithm in reinforcement learning is used to update and learn the adjustable shape parameter θ and the gain-table parameter θ^K, which are jointly denoted as the parameter vector Θ.
8. The impedance control method for the hexapod robot based on reinforcement learning according to claim 7, wherein the parameter update rule is determined as:

S(τ_i, m) = φ_{t_N, m} + Σ_{j=i}^{N−1} q_{t_j, m} + (1/2) Σ_{j=i}^{N−1} (θ + M_{t_j} ε_{t_j, m})^T R (θ + M_{t_j} ε_{t_j, m})
M_{t_j} = R^{−1} g_{t_j} g_{t_j}^T / (g_{t_j}^T R^{−1} g_{t_j})
P(τ_i, m) = exp(−S(τ_i, m)/λ) / Σ_{m'=1}^{M} exp(−S(τ_i, m')/λ)
δθ_{t_i} = Σ_{m=1}^{M} P(τ_i, m) M_{t_i} ε_{t_i, m}
[δθ]_k = Σ_i ω_k(s_{t_i}) [δθ_{t_i}]_k / Σ_i ω_k(s_{t_i})
Θ_new = Θ + δθ

where m denotes the m-th update (rollout) and M is the total number of updates; t_i and t_j are the i-th and j-th time instants, and t_N is the N-th, i.e. final, instant; τ_i is the cost variable of the algorithm; S(τ_i, m) is the update cost function of the path integral learning algorithm; φ_{t_N, m} is the terminal cost at t_N; q_{t_j, m} is the instantaneous cost at t_j; R is a constant positive-definite matrix; g_{t_j} is the nonlinear forcing function at t_j and g_{t_j}^T its transpose; M_{t_j} is the projection matrix onto the space of g_{t_j} with respect to R^{−1}; P(τ_i, m) is the probability variable; λ is the exponential weighting adjustment parameter; ω_k is the k-th Gaussian kernel function; δθ is the parameter update increment and δθ_{t_i} its value at time t_i; [δθ]_k is the k-th component of δθ and [δθ_{t_i}]_k its value at t_i; Θ_new is the updated parameter vector.
9. The impedance control method for the hexapod robot based on reinforcement learning according to claim 8, wherein in the parameter update rule, the procedure within one update period for the parameter vector Θ comprises:
(1) compute the update cost function S(τ_i, m) of the path integral learning algorithm;
(2) compute the probability variables P(τ_i, m) from S(τ_i, m);
(3) average the exploration noise over all rollouts, weighted by P(τ_i, m), to obtain the parameter update increment δθ;
(4) weight each component of δθ with the Gaussian kernel functions;
(5) add the parameter update increment to the original parameter vector to obtain the updated parameter vector Θ_new, completing one period of parameter updating.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011430098.6A CN112743540B (en) | 2020-12-09 | 2020-12-09 | Hexapod robot impedance control method based on reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112743540A CN112743540A (en) | 2021-05-04 |
CN112743540B true CN112743540B (en) | 2022-05-24 |
Family
ID=75649119
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011430098.6A Active CN112743540B (en) | 2020-12-09 | 2020-12-09 | Hexapod robot impedance control method based on reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112743540B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113995629B (en) * | 2021-11-03 | 2023-07-11 | 中国科学技术大学先进技术研究院 | Mirror image force field-based upper limb double-arm rehabilitation robot admittance control method and system |
CN114393579B (en) * | 2022-01-04 | 2023-09-22 | 南京航空航天大学 | Robot control method and device based on self-adaptive fuzzy virtual model |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009257580A (en) * | 2008-03-25 | 2009-11-05 | Tokai Rubber Ind Ltd | Adjustment device of mechanical impedance, control method therefor, standing assisting chair and rocking arm using adjustment device of mechanical impedance |
CN105690388A (en) * | 2016-04-05 | 2016-06-22 | 南京航空航天大学 | Impedance control method and device for restraining tendon tensile force of tendon driving mechanical arm |
DE102016004788A1 (en) * | 2016-04-20 | 2017-10-26 | Kastanienbaum GmbH | Method for producing a robot and device for carrying out this method |
CN108115690A (en) * | 2017-12-31 | 2018-06-05 | 芜湖哈特机器人产业技术研究院有限公司 | A kind of robot adaptive control system and method |
CN108153153A (en) * | 2017-12-19 | 2018-06-12 | 哈尔滨工程大学 | A kind of study impedance control system and control method |
CN109434830A (en) * | 2018-11-07 | 2019-03-08 | 宁波赛朗科技有限公司 | A kind of industrial robot platform of multi-modal monitoring |
CN109848990A (en) * | 2019-01-28 | 2019-06-07 | 南京理工大学 | Knee joint ectoskeleton gain-variable model-free angle control method based on PSO |
Also Published As
Publication number | Publication date |
---|---|
CN112743540A (en) | 2021-05-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112904728B (en) | Mechanical arm sliding mode control track tracking method based on improved approach law | |
CN112743540B (en) | Hexapod robot impedance control method based on reinforcement learning | |
CN111941432B (en) | Artificial intelligence output feedback control method for high-performance mechanical arm | |
CN109176525A (en) | A kind of mobile manipulator self-adaptation control method based on RBF | |
CN112207834B (en) | Robot joint system control method and system based on disturbance observer | |
CN104589349A (en) | Combination automatic control method with single-joint manipulator under mixed suspension microgravity environments | |
Šuster et al. | Tracking trajectory of the mobile robot Khepera II using approaches of artificial intelligence | |
CN115990888B (en) | Mechanical arm control method with dead zone and time-varying constraint function | |
CN116533249A (en) | Mechanical arm control method based on deep reinforcement learning | |
CN109605377A (en) | A kind of joint of robot motion control method and system based on intensified learning | |
CN114310851B (en) | Dragging teaching method of robot moment-free sensor | |
US6768927B2 (en) | Control system | |
CN107511830B (en) | Adaptive adjustment realization method for parameters of five-degree-of-freedom hybrid robot controller | |
Sanders et al. | The addition of neural networks to the inner feedback path in order to improve on the use of pre-trained feed forward estimators | |
CN113641099A (en) | Impedance control imitation learning training method for surpassing expert demonstration | |
CN113219825A (en) | Single-leg track tracking control method and system for quadruped robot | |
CN116834014A (en) | Intelligent cooperative control method and system for capturing non-cooperative targets by space dobby robot | |
Qu et al. | Fractional-order finite-time sliding mode control for uncertain teleoperated cyber–physical system with actuator fault | |
Wei et al. | Sensorimotor coordination and sensor fusion by neural networks | |
Ak et al. | Fuzzy sliding mode controller with neural network for robot manipulators | |
Jung et al. | On reference trajectory modification approach for Cartesian space neural network control of robot manipulators | |
Hamavand et al. | Trajectory control of robotic manipulators by using a feedback-error-learning neural network | |
Jung et al. | New neural network control technique for non-model based robot manipulator control | |
Zhang et al. | Biped walking on rough terrain using reinforcement learning |
Jung et al. | Neural network reference compensation technique for position control of robot manipulators |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||