CN112948464B

CN112948464B - Collision avoidance intelligent robot based on reinforcement learning

Info

Publication number: CN112948464B
Application number: CN202110237543.5A
Authority: CN
Inventors: 张晓琴
Original assignee: Chongqing Industry Polytechnic College
Current assignee: Chongqing Industry Polytechnic College
Priority date: 2021-03-04
Filing date: 2021-03-04
Publication date: 2021-09-17
Anticipated expiration: 2041-03-04
Also published as: CN112948464A

Abstract

The invention discloses a collision avoidance intelligent robot based on reinforcement learning. A data acquisition module acquires data information of the robot and information about the surrounding environment; a positioning module acquires the coordinates of the robot's movement and the coordinates of obstacles; a data processing module receives and processes the data information and the environment information and sends the results to a data analysis module; the data analysis module receives the data sent by the data processing module and performs analysis and calculation to obtain a forward movement sorting set and a barrier shadow sorting set; a statistics and early warning module receives the forward movement sorting set and the barrier shadow sorting set and performs statistics and early warning operations; and a regulation and control module regulates and controls the operation of the robot. The invention addresses the problem that existing schemes cannot comprehensively analyze the moving state of the robot together with the state of obstacles in order to give early warning of the robot's operation and to learn and adjust in time.

Description

Collision avoidance intelligent robot based on reinforcement learning

Technical Field

The invention relates to the technical field of intelligent robots, in particular to an intelligent robot for collision avoidance based on reinforcement learning.

Background

An intelligent robot has at least the following three elements: first, a sensory element for recognizing the state of the surrounding environment; second, a movement element for acting in response to the outside world; and third, a thinking element for deciding which action to take according to the information obtained by the sensory element. The sensory elements include non-contact sensors capable of sensing vision, proximity, distance, and the like, and contact sensors capable of sensing force, pressure, touch, and the like. These elements are roughly equivalent to human sense organs such as the eyes, nose and ears, and their functions can be realized by electromechanical components such as a camera, an image sensor, an ultrasonic transducer, a laser, conductive rubber, a piezoelectric element, a pneumatic element, a travel switch and the like.

An advanced intelligent robot has the capabilities of sensing, recognizing, reasoning and judging, and can automatically modify its program within a certain range according to changes in external conditions. The difference is that the principle for modifying the program is not specified by a human; rather, the robot itself learns and summarizes experience to obtain that principle.

The existing collision avoidance intelligent robot has the following defect: it cannot comprehensively analyze the moving state of the robot together with the state of obstacles in order to give early warning of the robot's operation and to learn and adjust in time.

Disclosure of Invention

The invention aims to provide a collision avoidance intelligent robot based on reinforcement learning. The technical problem to be solved by the invention is as follows:

how to overcome the inability of existing schemes to comprehensively analyze the moving state of the robot together with the state of obstacles in order to give early warning of the robot's operation and to learn and adjust in time.

The purpose of the invention can be realized by the following technical scheme: an intelligent robot for avoiding collision based on reinforcement learning comprises a data acquisition module, a positioning module, a data processing module, a data analysis module, a statistic and early warning module and a regulation and control module;

the data acquisition module is used for acquiring data information of the robot and surrounding environment information, wherein the data information comprises size data, movement data and electric quantity data of the robot; the environment information comprises type data of the obstacles and contact data between the obstacles, and the data information and the environment information are sent to the data processing module;

the positioning module is used for acquiring the moving coordinates of the robot to obtain a first coordinate set, acquiring the coordinates of the obstacle to obtain a second coordinate set, classifying and combining the first coordinate set and the second coordinate set to obtain a coordinate information set, and sending the coordinate information set to the data analysis module;

the data processing module is used for receiving the data information and the environment information for processing to obtain size processing data, movement processing data, electric quantity processing data, type processing data and contact processing data, and sending the size processing data, the movement processing data, the electric quantity processing data, the type processing data and the contact processing data to the data analysis module;

the data analysis module is used for receiving size processing data, movement processing data, electric quantity processing data, type processing data, contact processing data and a coordinate information set, analyzing and calculating to obtain a forward movement sorting set and a barrier shadow sorting set;

the statistics early warning module is used for receiving the forward movement sorting set and the barrier shadow sorting set and carrying out statistics and early warning operation, and the specific steps comprise:

step one: receiving the forward movement sorting set and the barrier shadow sorting set, marking a preset standard forward-movement threshold as P1 and a preset standard barrier shadow threshold as P2, and comparing P1 and P2 respectively with the forward-movement value Q_qy in the forward movement sorting set and the barrier shadow value Q_zy in the barrier shadow sorting set;

step two: if Q_qy ≥ P1 and Q_zy ≥ P2, judging that the robot can move efficiently and can normally avoid the obstacle, and generating a first early warning signal; if Q_qy < P1 and Q_zy ≥ P2, judging that the robot moves inefficiently but can normally avoid the obstacle, and generating a second early warning signal; if Q_qy ≥ P1 and Q_zy < P2, judging that the robot can move efficiently but cannot avoid the obstacle, generating a third early warning signal, and marking the forward-movement value and the barrier shadow value corresponding to the third early warning signal as a first statistical forward-movement value and a first statistical barrier shadow value respectively; if Q_qy < P1 and Q_zy < P2, judging that the robot moves inefficiently and cannot avoid the obstacle, generating a fourth early warning signal, and marking the forward-movement value and the barrier shadow value corresponding to the fourth early warning signal as a second statistical forward-movement value and a second statistical barrier shadow value respectively;

step three: sending the first statistical forward shift value, the first statistical barrier shadow value, the second statistical forward shift value and the second statistical barrier shadow value to a regulation module;

the regulation and control module is used for regulating and controlling the operation of the robot.
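The comparison in steps one to three can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `WarningSignal`, `classify` and `collect_statistics` are hypothetical, and the assumption that each forward-movement value is paired with one barrier shadow value is made here only for illustration, since the text does not state how the two sorting sets are paired.

```python
from enum import Enum


class WarningSignal(Enum):
    FIRST = 1   # efficient movement, obstacle can be avoided
    SECOND = 2  # inefficient movement, obstacle can be avoided
    THIRD = 3   # efficient movement, obstacle cannot be avoided
    FOURTH = 4  # inefficient movement, obstacle cannot be avoided


def classify(q_qy, q_zy, p1, p2):
    """Compare Q_qy with threshold P1 and Q_zy with threshold P2 (step two)."""
    moves_efficiently = q_qy >= p1
    avoids_obstacle = q_zy >= p2
    if avoids_obstacle:
        return WarningSignal.FIRST if moves_efficiently else WarningSignal.SECOND
    return WarningSignal.THIRD if moves_efficiently else WarningSignal.FOURTH


def collect_statistics(pairs, p1, p2):
    """Gather the value pairs that step three sends to the regulation module.

    `pairs` is an iterable of (Q_qy, Q_zy) tuples; the pairing is an assumption.
    """
    first_stats, second_stats = [], []
    for q_qy, q_zy in pairs:
        signal = classify(q_qy, q_zy, p1, p2)
        if signal is WarningSignal.THIRD:
            first_stats.append((q_qy, q_zy))
        elif signal is WarningSignal.FOURTH:
            second_stats.append((q_qy, q_zy))
    return first_stats, second_stats
```

Only the pairs behind the third and fourth early warning signals are kept, which mirrors step three: the first and second signals indicate that the obstacle can be avoided, so nothing is forwarded for regulation.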

Preferably, the specific steps of the data processing module for receiving and processing the data information and the environment information include:

s21: receiving the data information and the environment information, and acquiring size data, movement data and electric quantity data of the robot in the data information;

s22: setting the largest width in the size data as a first measurement value and marking it as YCi, i = 1, 2, 3, ..., n; setting the largest thickness in the size data as a second measurement value and marking it as ECi, i = 1, 2, 3, ..., n; setting the height in the size data as a third measurement value and marking it as SCi, i = 1, 2, 3, ..., n; normalizing the marked first, second and third measurement values and combining their values to obtain the size processing data;

s23: setting the maximum moving speed in the movement data as movement upper limit data and marking it as YSi, i = 1, 2, 3, ..., n; setting the maximum acceleration in the movement data as movement acceleration data and marking it as YJi, i = 1, 2, 3, ..., n; normalizing the marked movement upper limit data and movement acceleration data and combining their values to obtain the movement processing data;

s24: marking the real-time electric quantity in the electric quantity data as first electrical measurement data, denoted CDYi, i = 1, 2, 3, ..., n; marking the standby power consumption data in the electric quantity data as second electrical measurement data, denoted CDEi, i = 1, 2, 3, ..., n; marking the moving power consumption data in the electric quantity data as third electrical measurement data, denoted CDSi, i = 1, 2, 3, ..., n; normalizing the marked first, second and third electrical measurement data and combining their values to obtain the electric quantity processing data;

s25: acquiring the type data of the obstacles in the environment information and the contact data between the obstacles;

s26: setting different obstacle types to correspond to different obstacle preset values, matching the obstacle types in the type data against all the obstacle types to obtain the corresponding obstacle preset values, and marking them as ZLik, i = 1, 2, 3, ..., n; k = 1, 2; normalizing the obstacle preset values and combining their values to obtain the type processing data; wherein ZLik comprises an obstacle preset value for a movable obstacle and an obstacle preset value for a non-movable obstacle;

s27: setting the space height in the contact data between the obstacles as first obstacle measurement data and marking it as YZCi, i = 1, 2, 3, ..., n; setting the maximum space width in the contact data between the obstacles as second obstacle measurement data and marking it as EZCi, i = 1, 2, 3, ..., n; setting the minimum space width in the contact data between the obstacles as third obstacle measurement data and marking it as SZCi, i = 1, 2, 3, ..., n; and normalizing the marked first, second and third obstacle measurement data and combining their values to obtain the contact processing data.
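The "normalization processing and value combination" repeated in S22 to S27 might look like the sketch below. The text does not say which normalization is used or what form the combined values take, so both choices here are assumptions: min-max scaling to [0, 1], and one tuple per index i. The function names and sample figures are hypothetical.

```python
def min_max_normalize(values):
    """Scale a series to [0, 1]; min-max scaling is an assumed choice."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no spread; map every entry to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combine(*series):
    """Normalize each marked series, then group the values by index i."""
    normalized = [min_max_normalize(s) for s in series]
    return list(zip(*normalized))


# Hypothetical samples: widths YCi, thicknesses ECi and heights SCi in metres.
yc = [0.40, 0.50, 0.60]
ec = [0.20, 0.20, 0.30]
sc = [1.00, 1.20, 1.10]
size_processing_data = combine(yc, ec, sc)
```

The same two helpers would apply unchanged to the movement, electric quantity, type and contact series, which is why they take an arbitrary number of input series.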

Preferably, the specific steps of the data analysis module for performing the analysis operation include:

s31: acquiring size processing data, movement processing data, electric quantity processing data, type processing data, contact processing data and a coordinate information set which are subjected to normalization processing;

s32: obtaining the forward-movement value of the robot's movement by using a formula, wherein the formula is as follows:

wherein Q_qy denotes the forward-movement value, μ denotes a preset forward-movement correction factor, a1 and a2 denote different proportionality coefficients, YSi denotes the movement upper limit data, CDYi denotes the first electrical measurement data, CDEi denotes the second electrical measurement data, CDSi denotes the third electrical measurement data, t1 denotes the duration of standby power consumption of the robot, t2 denotes the duration of power consumption while the robot is moving, and t3 denotes the duration of acceleration while the robot is moving;

s33: arranging the forward-movement values in descending order to obtain the forward movement sorting set;

s34: obtaining, from the real-time coordinates of the robot's movement in the first coordinate set and the coordinates of the obstacle in the second coordinate set of the coordinate information set, the distance value between the real-time coordinates and the coordinates of the obstacle, and marking the distance value as D1;

s35: obtaining the barrier shadow value of the obstacle by using a formula, wherein the formula is as follows:

wherein Q_zy denotes the barrier shadow value, α denotes a preset barrier shadow correction factor, b1, b2, b3 and b4 denote different proportionality coefficients, YCi denotes the first measurement value, ECi denotes the second measurement value, SCi denotes the third measurement value, YZCi denotes the first obstacle measurement data, EZCi denotes the second obstacle measurement data, SZCi denotes the third obstacle measurement data, and ZLik denotes the obstacle preset value;

s36: arranging the barrier shadow values in descending order to obtain the barrier shadow sorting set.
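Two of the steps above are concrete enough to sketch without knowing the formulas: the distance value D1 of S34 and the descending arrangement of S33 and S36. The sketch assumes planar (x, y) coordinates and a Euclidean distance, neither of which is stated in the text; the function names are hypothetical, and the Q_qy and Q_zy formulas themselves are deliberately not reproduced.

```python
import math


def distance_d1(robot_xy, obstacle_xy):
    """Distance value D1 between the robot's real-time coordinate and an obstacle.

    Euclidean distance in the plane is an assumption made for illustration.
    """
    return math.dist(robot_xy, obstacle_xy)


def sorting_set(values):
    """Arrange computed values in descending order (S33 and S36)."""
    return sorted(values, reverse=True)


# Hypothetical coordinates: one robot position and two obstacle positions.
robot = (0.0, 0.0)
obstacles = [(3.0, 4.0), (6.0, 8.0)]
distances = [distance_d1(robot, obs) for obs in obstacles]
```

`sorting_set` would be applied once to the forward-movement values and once to the barrier shadow values, giving the two sets passed on to the statistics and early warning module.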

Preferably, the specific steps by which the regulation and control module regulates and controls the operation of the robot include:

s41: receiving a first statistical forward shift value, a first statistical barrier shadow value, a second statistical forward shift value and a second statistical barrier shadow value;

s42: acquiring a moving speed corresponding to a first statistical forward-moving value and marking the moving speed as a first early warning speed, acquiring real-time electric quantity corresponding to the first statistical forward-moving value and marking the real-time electric quantity as a first early warning electric quantity, acquiring contact data corresponding to a first statistical obstacle shadow value and marking the contact data as a first early warning size, and marking a distance value between the robot and the obstacle as a first early warning distance; controlling the moving speed and the moving direction of the robot when the robot meets an obstacle according to the first early warning distance, the first early warning size, the first early warning electric quantity and the first early warning speed;

s43: acquiring a moving speed corresponding to the second statistical forward-moving value and marking the moving speed as a second early warning speed, acquiring a real-time electric quantity corresponding to the second statistical forward-moving value and marking the real-time electric quantity as a second early warning electric quantity, acquiring contact data corresponding to the second statistical barrier shadow value and marking the contact data as a second early warning size, and marking a distance value between the robot and the barrier as a second early warning distance; and controlling the moving direction of the robot when the robot meets the obstacle according to the second early warning distance, the second early warning size, the second early warning electric quantity and the second early warning speed.
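The bookkeeping in S42 and S43 can be sketched as a small record plus a lookup of what is regulated in each case. The text does not specify how the new speed or direction is computed from these quantities, so no control law is shown; the names `EarlyWarning` and `regulated_quantities`, and the use of signal numbers 3 and 4 as keys, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EarlyWarning:
    """Quantities looked up for one pair of statistical values (S42 or S43)."""
    distance: float  # early warning distance between robot and obstacle
    size: float      # early warning size, taken from the contact data
    charge: float    # early warning electric quantity (real-time charge)
    speed: float     # early warning speed


def regulated_quantities(signal):
    """Return what the regulation module adjusts for an early warning signal."""
    if signal == 3:
        # First statistical values (S42): speed and direction are both controlled.
        return ("speed", "direction")
    if signal == 4:
        # Second statistical values (S43): only the direction is controlled.
        return ("direction",)
    # Signals 1 and 2 are not forwarded to the regulation module.
    return ()
```

The difference between the two cases follows the text directly: S42 controls both the moving speed and the moving direction, while S43 controls the moving direction only.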

The invention has the beneficial effects that:

according to the aspects disclosed by the invention, the data acquisition module, the positioning module, the data processing module, the data analysis module, the statistics and early warning module and the regulation and control module are used together, so that the moving state of the robot and the state of obstacles can be comprehensively analyzed in order to give early warning of the robot's operation and to learn and adjust in time;

the data acquisition module acquires data information of the robot and information about the surrounding environment, wherein the data information comprises size data, movement data and electric quantity data of the robot, and the environment information comprises type data of the obstacles and contact data between the obstacles; the data information and the environment information are sent to the data processing module; collecting, processing and analyzing this information provides effective data support for collision avoidance, early warning, learning and adjustment of the robot;

the positioning module acquires the coordinates of the robot's movement to obtain a first coordinate set, acquires the coordinates of the obstacles to obtain a second coordinate set, classifies and combines the first coordinate set and the second coordinate set to obtain a coordinate information set, and sends the coordinate information set to the data analysis module; positioning the robot's movement and the positions of the obstacles provides data support for the robot to change its running direction;

the data processing module receives and processes the data information and the environment information to obtain size processing data, movement processing data, electric quantity processing data, type processing data and contact processing data, and sends these to the data analysis module; processing the data information and the environment information together makes it convenient to establish relationships among the data items and improves the efficiency and accuracy of data processing;

the data analysis module receives the size processing data, movement processing data, electric quantity processing data, type processing data, contact processing data and the coordinate information set, and performs analysis and calculation to obtain a forward movement sorting set and a barrier shadow sorting set; calculating the forward-movement value and the barrier shadow value establishes relationships among the processed data, which makes it convenient to analyze the state between the robot's movement and the obstacles;

the statistics and early warning module receives the forward movement sorting set and the barrier shadow sorting set and performs statistics and early warning operations, and the regulation and control module regulates and controls the operation of the robot; by performing statistics and early warning on the data after a collision, the purpose of intelligent learning is achieved.

Drawings

The invention will be further described with reference to the accompanying drawings.

Fig. 1 is a block diagram of an intelligent collision avoidance robot based on reinforcement learning.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1, the invention relates to an intelligent robot for collision avoidance based on reinforcement learning, which comprises a data acquisition module, a positioning module, a data processing module, a data analysis module, a statistical early warning module and a regulation and control module;

the data processing module is used for receiving the data information and the environment information for processing to obtain size processing data, movement processing data, electric quantity processing data, type processing data and contact processing data, and sending the size processing data, the movement processing data, the electric quantity processing data, the type processing data and the contact processing data to the data analysis module; the data processing module is used for receiving the data information and the environment information for processing, and comprises the following specific steps:

receiving the data information and the environment information, and acquiring size data, movement data and electric quantity data of the robot in the data information;

setting the largest width in the size data as a first measurement value and marking it as YCi, i = 1, 2, 3, ..., n; setting the largest thickness in the size data as a second measurement value and marking it as ECi, i = 1, 2, 3, ..., n; setting the height in the size data as a third measurement value and marking it as SCi, i = 1, 2, 3, ..., n; normalizing the marked first, second and third measurement values and combining their values to obtain the size processing data;

setting the maximum moving speed in the movement data as movement upper limit data and marking it as YSi, i = 1, 2, 3, ..., n; setting the maximum acceleration in the movement data as movement acceleration data and marking it as YJi, i = 1, 2, 3, ..., n; normalizing the marked movement upper limit data and movement acceleration data and combining their values to obtain the movement processing data;

marking the real-time electric quantity in the electric quantity data as first electrical measurement data, denoted CDYi, i = 1, 2, 3, ..., n; marking the standby power consumption data in the electric quantity data as second electrical measurement data, denoted CDEi, i = 1, 2, 3, ..., n; marking the moving power consumption data in the electric quantity data as third electrical measurement data, denoted CDSi, i = 1, 2, 3, ..., n; normalizing the marked first, second and third electrical measurement data and combining their values to obtain the electric quantity processing data;

acquiring the type data of the obstacles in the environment information and the contact data between the obstacles;

setting different obstacle types to correspond to different obstacle preset values, matching the obstacle types in the type data against all the obstacle types to obtain the corresponding obstacle preset values, and marking them as ZLik, i = 1, 2, 3, ..., n; k = 1, 2; normalizing the obstacle preset values and combining their values to obtain the type processing data; wherein ZLik comprises an obstacle preset value for a movable obstacle and an obstacle preset value for a non-movable obstacle;

setting the space height in the contact data between the obstacles as first obstacle measurement data and marking it as YZCi, i = 1, 2, 3, ..., n; setting the maximum space width in the contact data between the obstacles as second obstacle measurement data and marking it as EZCi, i = 1, 2, 3, ..., n; setting the minimum space width in the contact data between the obstacles as third obstacle measurement data and marking it as SZCi, i = 1, 2, 3, ..., n; normalizing the marked first, second and third obstacle measurement data and combining their values to obtain the contact processing data;

the data analysis module is used for receiving size processing data, movement processing data, electric quantity processing data, type processing data, contact processing data and a coordinate information set, analyzing and calculating to obtain a forward movement sorting set and a barrier shadow sorting set; the specific steps of the data analysis module for performing analysis operation comprise:

acquiring size processing data, movement processing data, electric quantity processing data, type processing data, contact processing data and a coordinate information set which are subjected to normalization processing;

obtaining the forward-movement value of the robot's movement by using a formula, wherein the formula is as follows:

arranging the forward-movement values in descending order to obtain the forward movement sorting set;

obtaining, from the real-time coordinates of the robot's movement in the first coordinate set and the coordinates of the obstacle in the second coordinate set of the coordinate information set, the distance value between the real-time coordinates and the coordinates of the obstacle, and marking the distance value as D1;

obtaining the barrier shadow value of the obstacle by using a formula, wherein the formula is as follows:

arranging the barrier shadow values in descending order to obtain the barrier shadow sorting set;

step one: receiving the forward movement sorting set and the barrier shadow sorting set, marking a preset standard forward-movement threshold as P1 and a preset standard barrier shadow threshold as P2, and comparing P1 and P2 respectively with the forward-movement value Q_qy in the forward movement sorting set and the barrier shadow value Q_zy in the barrier shadow sorting set;

the regulation and control module is used for regulating and controlling the operation of the robot, and the specific steps comprise:

receiving a first statistical forward shift value, a first statistical barrier shadow value, a second statistical forward shift value and a second statistical barrier shadow value;

acquiring a moving speed corresponding to a first statistical forward-moving value and marking the moving speed as a first early warning speed, acquiring real-time electric quantity corresponding to the first statistical forward-moving value and marking the real-time electric quantity as a first early warning electric quantity, acquiring contact data corresponding to a first statistical obstacle shadow value and marking the contact data as a first early warning size, and marking a distance value between the robot and the obstacle as a first early warning distance; controlling the moving speed and the moving direction of the robot when the robot meets an obstacle according to the first early warning distance, the first early warning size, the first early warning electric quantity and the first early warning speed;

acquiring a moving speed corresponding to the second statistical forward-moving value and marking the moving speed as a second early warning speed, acquiring a real-time electric quantity corresponding to the second statistical forward-moving value and marking the real-time electric quantity as a second early warning electric quantity, acquiring contact data corresponding to the second statistical barrier shadow value and marking the contact data as a second early warning size, and marking a distance value between the robot and the barrier as a second early warning distance; controlling the moving direction of the robot when the robot meets the obstacle according to the second early warning distance, the second early warning size, the second early warning electric quantity and the second early warning speed;

the above formulas are obtained by collecting a large amount of data and performing software simulation, and the coefficients in the formulas are set by those skilled in the art according to actual conditions.

The working principle of the invention is as follows: in the embodiment of the invention, the data acquisition module, the positioning module, the data processing module, the data analysis module, the statistic and early warning module and the regulation and control module are used in a matched manner, so that the aims of carrying out comprehensive analysis on the moving state and the barrier state of the robot to carry out early warning on the operation of the robot and timely learning and adjusting can be achieved;

the data analysis module receives the size processing data, movement processing data, electric quantity processing data, type processing data, contact processing data and the coordinate information set and performs analysis and calculation: the forward-movement value of the robot's movement is obtained by using the forward-movement formula, and the forward-movement values are arranged in descending order to obtain the forward movement sorting set; the barrier shadow value of the obstacle is obtained by using the barrier shadow formula, and the barrier shadow values are arranged in descending order to obtain the barrier shadow sorting set; calculating the forward-movement value and the barrier shadow value establishes relationships among the processed data, which makes it convenient to analyze the state between the robot's movement and the obstacles.

In the embodiments provided by the present invention, it should be understood that the disclosed system and method can be implemented in other ways. For example, the above-described embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the method of the embodiment.

In addition, each functional module in each embodiment of the present invention may be integrated into one control module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module can be realized in a hardware form, and can also be realized in a form of hardware and a software functional module.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.

The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

Furthermore, it is to be understood that the word "comprising" does not exclude other modules or steps, and the singular does not exclude the plural. A plurality of modules or means recited in the system claims may also be implemented by one module or means in software or hardware. The terms first, second, etc. are used to denote names and do not indicate any particular order.

Finally, it should be noted that the above examples are only intended to illustrate the technical process of the present invention and not to limit the same, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made to the technical process of the present invention without departing from the spirit and scope of the technical process of the present invention.

Claims

1. An intelligent robot for avoiding collision based on reinforcement learning is characterized by comprising a data acquisition module, a positioning module, a data processing module, a data analysis module, a statistic and early warning module and a regulation and control module;

2. The intelligent robot for collision avoidance based on reinforcement learning of claim 1, wherein the data processing module is configured to receive data information and environmental information for processing, and the specific steps of the data processing module include:

3. The intelligent robot for collision avoidance based on reinforcement learning of claim 1, wherein the specific steps of the data analysis module for performing the analysis operation comprise:

4. The intelligent collision avoidance robot based on reinforcement learning of claim 1, wherein the control module is used for controlling the operation of the robot, and the specific steps comprise: