CN113920322A - Modular robot kinematic chain configuration identification method and system - Google Patents

Modular robot kinematic chain configuration identification method and system

Info

Publication number
CN113920322A
Authority
CN
China
Prior art keywords
module
center point
kinematic chain
input image
bounding box
Prior art date
Legal status
Granted
Application number
CN202111228925.8A
Other languages
Chinese (zh)
Other versions
CN113920322B (en)
Inventor
李伟昌
黄尚樱
梁梓熙
Current Assignee
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202111228925.8A
Publication of CN113920322A
Application granted
Publication of CN113920322B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a modular robot kinematic chain configuration identification method and system. The method comprises: acquiring a modular robot image to obtain an input image; inputting the input image into a preset target detection model for multi-module identification to obtain the module center point position, the module type and the module bounding box size, wherein the preset target detection model comprises a preliminary feature extraction unit, a feature fusion unit and an output unit; and processing the module center point position, the module type and the module bounding box size based on a nearest-point search method, determining the module connection relations and completing identification of the kinematic chain configuration. The system comprises an input module, a detection module and a configuration identification module. With the method and the system, the kinematic chain configuration of the robot can be identified without installing additional detection components on the robot, and the approach is highly robust to noisy environments. The invention can be widely applied in the field of visual identification.

Description

Modular robot kinematic chain configuration identification method and system
Technical Field
The invention relates to the field of visual identification, in particular to a modular robot kinematic chain configuration identification method and system.
Background
Robots are widely adopted in modern large-scale industrial production because of their efficiency, accuracy and low cost, and modular robot systems offer high extensibility and reconfigurability, allowing them to be applied in different fields in a variety of configurations. However, configuration recognition for a modular robot system is a time-consuming and error-prone task, especially for modular robot systems that lack sensing devices.
Disclosure of Invention
In order to solve the above technical problems, an object of the present invention is to provide a method and a system for identifying the kinematic chain configuration of a modular robot. The algorithm is a semi-supervised learning method that can identify the kinematic chain configuration of the robot without installing additional detection components on the robot, and it is highly robust to noisy environments.
The first technical scheme adopted by the invention is as follows: a modular robot kinematic chain configuration identification method comprises the following steps:
acquiring a modular robot image to obtain an input image;
inputting an input image into a preset target detection model for multi-module identification to obtain a module central point position, a module type and a module bounding box size;
the preset target detection model comprises a preliminary feature extraction unit, a feature fusion unit and an output unit;
and processing the module center point position, the module type and the module bounding box size based on a nearest-point search method, determining the module connection relations and completing identification of the kinematic chain configuration.
Further, the output unit includes:
a central point predicting section for predicting a position of a central point of a module in an input image and a type of the module;
an offset predicting section for predicting an offset of a center point position of a module in an input image;
and a bounding box prediction section for predicting a size of a bounding box of the module in the input image.
Further, the step of inputting the input image into a preset target detection model for multi-module identification to obtain the module center point position, the module type and the module bounding box size specifically includes:
inputting the input image into the preset target detection model;
performing preliminary feature extraction on the input image based on the preliminary feature extraction unit to obtain an intermediate feature tensor;
performing deep feature fusion on the intermediate feature tensor based on the feature fusion unit to obtain a high-level feature tensor;
sampling the high-level feature tensor and adjusting its size to obtain a feature map;
respectively inputting the feature map into the center point prediction part, the offset prediction part and the bounding box prediction part to obtain the module center point output, the module center point offset correction output and the module bounding box output;
obtaining the module center point position and the module type based on the module center point output and the module center point offset correction output;
and obtaining the module bounding box size based on the module bounding box output.
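For illustration only, the following is a minimal sketch of how outputs of this kind could be decoded into module detections; the array layouts, the confidence threshold and the function name are assumptions for this example and are not taken from the patent.

```python
import numpy as np

def decode_detections(heatmap, offset, size, R=4, conf_thresh=0.5):
    """Decode module detections from the three output heads.

    heatmap: (H, W, C) center-point confidences in [0, 1]
    offset:  (H, W, 2) sub-pixel center-point corrections (dx, dy)
    size:    (H, W, 2) bounding-box width and height per location
    R:       down-sampling ratio between the input image and the feature map
    """
    detections = []
    ys, xs, cs = np.where(heatmap > conf_thresh)          # candidate center points
    for y, x, c in zip(ys, xs, cs):
        dx, dy = offset[y, x]                              # offset correction
        w, h = size[y, x]                                  # bounding-box size
        cx, cy = (x + dx) * R, (y + dy) * R                # center in image coordinates
        detections.append({
            "class": int(c),
            "center": (float(cx), float(cy)),
            "box": (float(cx - w / 2), float(cy - h / 2),
                    float(cx + w / 2), float(cy + h / 2)),
            "score": float(heatmap[y, x]),
        })
    return detections
```

A full implementation would additionally keep only local maxima of the heat map; simple thresholding is used here to keep the sketch short.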
Further, the module center point output is represented as follows:
$\hat{Y}_h \in [0,1]^{\frac{w_1}{R} \times \frac{h_1}{R} \times C}$
In the above formula, R represents the ratio between the feature map and the input image size, C represents the number of module types, $w_1$ is the width of the image, and $h_1$ is the height of the image.
Further, the method also comprises a construction step of the center point prediction part, which specifically comprises:
constructing a confidence label of the module center point, where the confidence of a class-c module center point at position (x, y) is denoted $Y_{h;x,y,c} \in [0,1]$;
acquiring a training image;
decomposing the real center point p = (x, y) in the training image into a low-resolution center point p' and a center point offset δp;
and training the center point prediction part based on the low-resolution center point p' and the confidence $Y_{h;x,y,c}$.
Further, the mean square error is used as the loss function for training the center point prediction part, formulated as:
$L_h = \mathbb{E}_x\left[\left(\hat{Y}_{h;x,y,c} - Y_{h;x,y,c}\right)^2\right]$
In the above formula, $\mathbb{E}_x$ represents the mathematical expectation operator, $Y_{h;x,y,c}$ represents the confidence label, $L_h$ represents the center point prediction loss function, x, y represent the position of the training real center point, and c represents the class.
Further, the method also comprises a construction step of the offset prediction part, which specifically comprises:
training the offset prediction part based on the center point offsets δp and an L1 loss function;
the L1 loss function is
$L_o = \frac{1}{N}\sum_{p}\left|\hat{O}_{p'} - \delta p\right|$,
where $L_o$ represents the center point offset loss, N represents the total number of modules, $|\cdot|$ represents the L1 distance operator, and $\hat{O}_{p'}$ represents the predicted offset at the (x, y) position of the center point.
Further, the method also comprises a construction step of the bounding box prediction part, which specifically comprises:
letting $(x_1^{(k)}, y_1^{(k)}, x_2^{(k)}, y_2^{(k)})$ denote the coordinates of the upper-left and lower-right corners of the detection box of a module of class $c_k$ in the image, from which the box center point $p_k = \left(\frac{x_1^{(k)} + x_2^{(k)}}{2}, \frac{y_1^{(k)} + y_2^{(k)}}{2}\right)$ and the bounding box size $s_k = \left(x_2^{(k)} - x_1^{(k)},\; y_2^{(k)} - y_1^{(k)}\right)$ are obtained;
taking the heights h and widths w of the bounding boxes s of all modules of all classes in the training image as the values of two feature maps, so that the low-resolution bounding box sizes w and h of a module whose center point lies at a given position (x, y) correspond to that position in the feature maps;
and training the bounding box prediction part based on the L1 loss function.
Further, the step of processing the module center point position, the module type and the module bounding box size based on the nearest-point search method, determining the module connection relations, and completing identification of the kinematic chain configuration specifically includes:
constructing a strongly connected graph with the center points of all modules of all classes in the input image as nodes;
taking the main control module as the starting point and attempting to connect all modules in sequence to obtain the linked list with the minimum weight, wherein the weight is the distance between two center points;
determining the connection relations of all modules according to the center point positions and the intersection-over-union between bounding boxes;
and completing identification of the chain-type kinematic configuration.
The second technical scheme adopted by the invention is as follows: a modular robot kinematic chain configuration identification system comprising:
the input module is used for acquiring the image of the modular robot to obtain an input image;
the system comprises a detection module, a detection module and a processing module, wherein the detection module is used for inputting an input image into a preset target detection model for multi-module identification to obtain a module central point position, a module type and a module bounding box size, and the preset target detection model comprises a preliminary feature extraction unit, a feature fusion unit and an output unit;
and a configuration identification module for processing the module center point position, the module type and the module bounding box size based on the nearest-point search method, determining the module connection relations and completing identification of the kinematic chain configuration.
The method and the system have the following beneficial effects: following the idea of target detection, a deep learning method is used to obtain the robot module information in a given image, and configuration matching based on nearest-neighbor search is performed to find the current configuration. The kinematic chain configuration of the robot can thus be identified without installing additional detection components on the robot, and the approach is highly robust to noisy environments.
Drawings
FIG. 1 is a flow chart illustrating the steps of a modular robot kinematic chain configuration identification method of the present invention;
FIG. 2 is a block diagram of a modular robot kinematic chain configuration identification system according to the present invention;
FIG. 3 is a schematic representation of a configuration identified in accordance with an embodiment of the present invention;
FIG. 4 is a schematic illustration of an identification process according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the experimental validation process of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
The invention provides a chain-type modular robot kinematic configuration identification method based on visual recognition. A target detection neural network is used to identify the different module types of the modular robot and their bounding boxes in an image, and the algorithm then identifies the kinematic chain by searching for the nearest module center point and comparing the intersection-over-union of the bounding boxes.
Referring to fig. 1 and 4, the present invention provides a modular robot kinematic chain configuration identification method, including the following steps:
s1, acquiring a modular robot image to obtain an input image;
s2, inputting the input image into a preset target detection model for multi-module recognition to obtain a module central point position, a module type and a module bounding box size;
the preset target detection model comprises a preliminary feature extraction unit, a feature fusion unit and an output unit;
specifically, the feature fusion units are connected in series by adopting a tree structure (HDA), feature information between a shallow layer and a deep layer is effectively combined to encourage the model to learn richer combinations, so that more feature hierarchical structures are expanded, residual connection is introduced into each HDA output to prevent the model from generating a deep degradation problem, and the condition that the gradient of the model disappears or the gradient explodes in the training process can be effectively prevented.
And S3, processing the module center point position, the module type and the module bounding box size based on the nearest-point search method, determining the module connection relations and completing identification of the kinematic chain configuration.
Specifically, the RRMS is a modular robot system with a highly integrated and centralized control system, and the modules of the RRMS may be divided into control modules and function modules according to the definition of the control system. A given RGB photograph is denoted $X \in \mathbb{R}^{w_1 \times h_1 \times 3}$, where $w_1$ is the width of the image, $h_1$ is the height of the image, and 3 is the number of feature channels of the image tensor. We extract high-level semantic information from the given image tensor by a feature extractor h(·), giving $z = h(X) \in \mathbb{R}^{w_2 \times h_2 \times c}$. In general, since the feature extractor performs down-sampling, the inequalities $w_1 \geq w_2$ and $h_1 \geq h_2$ hold. The output unit then maps the high-level semantic information z to obtain the outputs described below.
Fig. 4(a) shows a given input image, fig. 4(b) shows that the module center point position, the module type, and the module bounding box size are obtained, fig. 4(c) shows that the module center point position, the module type, and the module bounding box size are processed based on the nearest neighbor search method to determine the module connection relationship, and fig. 4(d) shows the result of the kinematic chain configuration recognition.
Further as a preferred embodiment of the method, the output unit includes:
a central point predicting section for predicting a position of a central point of a module in an input image and a type of the module;
an offset predicting section for predicting an offset of a center point position of a module in an input image;
and a bounding box prediction section for predicting a size of a bounding box of the module in the input image.
Specifically, we name these three outputs, in turn, the center point output $\hat{Y}_h \in [0,1]^{\frac{w_1}{R} \times \frac{h_1}{R} \times C}$, the bounding box size output $\hat{S} \in \mathbb{R}^{\frac{w_1}{R} \times \frac{h_1}{R} \times 2}$, and the center point offset correction output $\hat{O} \in \mathbb{R}^{\frac{w_1}{R} \times \frac{h_1}{R} \times 2}$.
As a preferred embodiment of the method, the step of inputting the input image into a preset target detection model for multi-module identification to obtain the module center point position, the module type and the module bounding box size specifically includes:
S21, inputting the input image into the preset target detection model;
S22, performing preliminary feature extraction on the input image based on the preliminary feature extraction unit to obtain an intermediate feature tensor;
S23, performing deep feature fusion on the intermediate feature tensor based on the feature fusion unit to obtain a high-level feature tensor;
S24, sampling the high-level feature tensor and adjusting its size to obtain a feature map;
S25, respectively inputting the feature map into the center point prediction part, the offset prediction part and the bounding box prediction part to obtain the module center point output, the module center point offset correction output and the module bounding box output;
specifically, for an image X containing the RRMS module, it is first converted into an intermediate feature tensor H with four times down-sampling by the preliminary feature extraction module1And then, the feature tensor sequentially carries out deep fusion of features through three feature fusion units so as to extract a feature tensor H with high-level semantic information. Then we perform successive 2 times up-sampling process on H to restore the 16 times down-sampled feature map to 4 times down-sampled size, so as to prevent the accuracy from decreasing due to too small size of the output feature map. Finally, the 4 times down-sampled feature maps are respectively used as the input of three output modules, and the output which finally contains the detection result is obtained
Figure BDA0003315278410000054
S26, obtaining the position of the module center point and the type of the module based on the module center point output and the module center point offset correction quantity output;
specifically, since the minimum scale of the prediction result is 4 pixels, an offset prediction part needs to be separately constructed for correction, that is, the position of the module center point is based on the position of the center point after the offset correction.
And S27, obtaining the size of the module bounding box based on the output of the module bounding box.
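A minimal PyTorch sketch of this pipeline is given below (a 4×-down-sampling stem, further fusion stages down to 16×, up-sampling back to 4×, and the three output heads). The layer choices, channel width and class/head names are assumptions for illustration; the real preliminary feature extraction and fusion units are more elaborate.

```python
import torch
import torch.nn as nn

class KinematicChainDetector(nn.Module):
    """Sketch of the detector described above: 4x-down-sampling stem, fusion
    stages down to 16x, up-sampling back to 4x, and three prediction heads."""
    def __init__(self, num_classes, ch=64):
        super().__init__()
        self.stem = nn.Sequential(                       # 4x down-sampling
            nn.Conv2d(3, ch, 7, stride=2, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.fuse = nn.Sequential(                       # 4x -> 16x (stand-in for the fusion units)
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.up = nn.Sequential(                         # 16x -> 4x via two 2x up-samplings
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.head_heatmap = nn.Conv2d(ch, num_classes, 1)  # center-point confidences
        self.head_offset = nn.Conv2d(ch, 2, 1)             # center-point offset (dx, dy)
        self.head_size = nn.Conv2d(ch, 2, 1)               # bounding-box size (w, h)

    def forward(self, x):
        feat = self.up(self.fuse(self.stem(x)))
        return (torch.sigmoid(self.head_heatmap(feat)),
                self.head_offset(feat),
                self.head_size(feat))
```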
Further, as a preferred embodiment of the method, in order for the model to predict the positions of multiple module center points and the module types, the center point output is expressed as follows:
$\hat{Y}_h \in [0,1]^{\frac{w_1}{R} \times \frac{h_1}{R} \times C}$
In the above formula, R represents the ratio between the feature map and the input image size, and C represents the number of module types, i.e. the feature map of each channel of the center point output predicts the center point positions of modules of one class; $w_1$ is the width of the image and $h_1$ is the height of the image.
Under ideal conditions, the center point output of the model equals $\hat{Y}_{h;x,y,c} = 1$ at the center point of a module of the correct class and $\hat{Y}_{h;x,y,c} = 0$ at all other positions.
Further, as a preferred embodiment of the method, the method further comprises a construction step of the center point prediction part, which specifically comprises:
constructing the confidence label of the module center point based on a Gaussian kernel
$Y_{h;x,y,c} = \exp\!\left(-\frac{(x - p'_x)^2 + (y - p'_y)^2}{2\sigma^2}\right)$,
with the standard deviation σ set to a constant, where $Y_{h;x,y,c} \in [0,1]$ is the confidence of a class-c module center point at position (x, y);
acquiring a training image;
decomposing the real center point p = (x, y) in the training image into a low-resolution center point p' and a center point offset δp;
and training the center point prediction part based on the low-resolution center point p' and the confidence $Y_{h;x,y,c}$.
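A short sketch of this label construction is given below, assuming a fixed σ and NumPy arrays; the function name, argument layout and default values are illustrative assumptions.

```python
import numpy as np

def gaussian_heatmap_labels(centers, classes, feat_w, feat_h, num_classes,
                            sigma=2.0, R=4):
    """Build the center-point confidence labels Y_h with a Gaussian kernel
    placed around each low-resolution center point (sigma held constant)."""
    Y = np.zeros((feat_h, feat_w, num_classes), dtype=np.float32)
    xs = np.arange(feat_w)[None, :]          # (1, W) grid of x coordinates
    ys = np.arange(feat_h)[:, None]          # (H, 1) grid of y coordinates
    for (px, py), c in zip(centers, classes):
        cx, cy = px / R, py / R              # low-resolution center point
        g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
        Y[:, :, c] = np.maximum(Y[:, :, c], g)   # keep the max where labels overlap
    return Y
```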
Further, as a preferred embodiment of the method, the mean square error is used as the loss function for training the center point prediction part, formulated as:
$L_h = \mathbb{E}_x\left[\left(\hat{Y}_{h;x,y,c} - Y_{h;x,y,c}\right)^2\right]$
In the above formula, $\mathbb{E}_x$ represents the mathematical expectation operator, $Y_{h;x,y,c}$ represents the confidence label, $L_h$ represents the center point prediction loss function, x, y represent the position of the training real center point, and c represents the class.
Further, as a preferred embodiment of the method, the method further comprises a construction step of the offset prediction part, which specifically comprises:
training the offset prediction part based on the center point offsets δp and an L1 loss function;
the loss is evaluated only at positions where a module actually exists in the ground-truth labels, and the L1 loss function is
$L_o = \frac{1}{N}\sum_{p}\left|\hat{O}_{p'} - \delta p\right|$,
where $L_o$ represents the center point offset loss, N represents the total number of modules, $|\cdot|$ represents the L1 distance operator, and $\hat{O}_{p'}$ represents the predicted offset at the (x, y) position of the center point.
Specifically, the real center point p = (x, y) of a target is decomposed into two parts, a low-resolution center point p' and a center point offset δp, which correspond respectively to the center point output $\hat{Y}_h$ and the center point offset correction output $\hat{O}$, with p = p' + δp. For the center point output $\hat{Y}_h$, the confidence of a class-c module center point at the (x, y) position is expressed in the form of a heat map $Y_{h;x,y,c}$. The abscissa x and ordinate y of the center point offsets δp of all modules of all classes are used as the center point offset correction output $\hat{O}$, where the coordinates (x, y) of the two feature maps correspond respectively to the module center point offsets δx and δy of the center point at that position.
Further, as a preferred embodiment of the method, the method further includes a construction step of the bounding box prediction part, which specifically includes:
letting $(x_1^{(k)}, y_1^{(k)}, x_2^{(k)}, y_2^{(k)})$ denote the coordinates of the upper-left and lower-right corners of the detection box of a module of class $c_k$ in the image, from which the box center point $p_k = \left(\frac{x_1^{(k)} + x_2^{(k)}}{2}, \frac{y_1^{(k)} + y_2^{(k)}}{2}\right)$ and the bounding box size $s_k = \left(x_2^{(k)} - x_1^{(k)},\; y_2^{(k)} - y_1^{(k)}\right)$ are obtained;
taking the heights h and widths w of the bounding boxes s of all modules of all classes in the training image as the values of two feature maps, so that the low-resolution bounding box sizes w and h of a module whose center point lies at a given position (x, y) correspond to that position in the feature maps;
and training the bounding box prediction part based on the L1 loss function, i.e.
$L_s = \frac{1}{N}\sum_{k=1}^{N}\left|\hat{S}_{p_k} - s_k\right|$.
In order to unify the relative magnitudes of the loss values of the different terms, two hyper-parameters $\lambda_s$ and $\lambda_o$ are added, i.e.
$L = L_p + \lambda_o \cdot L_o + \lambda_s \cdot L_s$,
where $L_p$ is the center point prediction loss (denoted $L_h$ above). Through this module identification method, the class c, the center point p and the bounding box s of every module in the image are obtained.
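The combined objective can be written as a short function; the tensor layout (batch, height, width, channel), the mask handling, the function name and the numeric values of λ_o and λ_s below are assumptions made only for this sketch.

```python
import torch
import torch.nn.functional as F

def total_loss(pred_heat, gt_heat, pred_off, gt_off, pred_size, gt_size,
               center_mask, lambda_o=1.0, lambda_s=0.1):
    """Combined objective L = L_p + lambda_o * L_o + lambda_s * L_s.

    pred_heat, gt_heat:  (B, H, W, C) center-point confidence maps
    pred_off,  gt_off:   (B, H, W, 2) center-point offsets
    pred_size, gt_size:  (B, H, W, 2) bounding-box sizes
    center_mask:         (B, H, W) bool, True only at ground-truth module centers,
                         so the offset and size terms are evaluated only where a
                         module actually exists.
    """
    l_p = F.mse_loss(pred_heat, gt_heat)                       # center-point MSE loss
    n = center_mask.sum().clamp(min=1)                         # number of modules N
    l_o = (pred_off - gt_off).abs()[center_mask].sum() / n     # L1 offset loss
    l_s = (pred_size - gt_size).abs()[center_mask].sum() / n   # L1 size loss
    return l_p + lambda_o * l_o + lambda_s * l_s
```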
Further, as a preferred embodiment of the method, the step of processing the module center point position, the module type and the module bounding box size based on the nearest-point search method, determining the module connection relations, and completing identification of the kinematic chain configuration specifically includes:
S31, constructing a strongly connected graph with the center points of all modules of all classes in the input image as nodes;
S32, determining the connection relations of all modules according to the center point positions and the intersection-over-union between bounding boxes;
S33, determining a starting point and attempting to connect all modules in sequence to obtain the linked list with the minimum weight, wherein the weight is the distance between two center points;
S34, completing identification of the chain-type kinematic configuration; the identification result is shown in FIG. 3.
Specifically, because of the shooting angle the distances between modules may differ only slightly, and directly searching by the distance between two center points may then give unsatisfactory results. We therefore heuristically use the magnitude of the intersection-over-union IoU between two modules as an auxiliary criterion, where IoU is the ratio of the intersection of the two bounding boxes to their union. The more the bounding boxes of two modules overlap, the closer IoU is to 1; otherwise it approaches 0. Taking the module with the largest IoU as the next joint of the robot configuration ensures the accuracy of the prediction result to the greatest extent.
In a specific implementation, we use breadth-first search to accomplish this: with the control module (F) as the root node, the other modules not yet in the linked list are traversed in turn and screened. The screening criterion is: the module whose center point is closest to that of the linked-list tail module and whose intersection-over-union IoU with it is largest becomes the new tail module of the linked list. When the distance ranking and the IoU ranking disagree, or when abnormal behaviour is found, such as the center point distance being far larger than the bounding box size of the linked-list tail module (corresponding to a recognized RRMS module that is not in the kinematic chain), the algorithm refuses the identification and prompts the user to perform manual configuration identification.
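A sketch of this search is given below, operating on detection dictionaries of the kind produced by the earlier decoding sketch (fields "center" and "box"); the rejection threshold dist_factor and the function names are assumptions, not values taken from the patent.

```python
import numpy as np

def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def build_kinematic_chain(modules, root_idx, dist_factor=2.0):
    """Greedy chain search: starting from the control module, repeatedly attach
    the unvisited module that is nearest to the chain tail and has the largest
    IoU with it; refuse recognition when the two criteria disagree or the
    nearest module is implausibly far from the tail module."""
    chain = [root_idx]
    remaining = set(range(len(modules))) - {root_idx}
    while remaining:
        tail = modules[chain[-1]]
        dists = {j: np.hypot(modules[j]["center"][0] - tail["center"][0],
                             modules[j]["center"][1] - tail["center"][1])
                 for j in remaining}
        ious = {j: box_iou(modules[j]["box"], tail["box"]) for j in remaining}
        nearest = min(dists, key=dists.get)       # closest center point
        best_iou = max(ious, key=ious.get)        # largest bounding-box overlap
        tail_size = max(tail["box"][2] - tail["box"][0],
                        tail["box"][3] - tail["box"][1])
        if nearest != best_iou or dists[nearest] > dist_factor * tail_size:
            raise ValueError("Ambiguous configuration: manual identification required")
        chain.append(nearest)
        remaining.remove(nearest)
    return chain
```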
Experimental verification was carried out with a specific example:
we randomly picked a photograph and visualized the result as shown in fig. 5. In the actual search, we directly construct two adjacency matrixes based on all modules, namely between every two modulesIntersection ratio IoU matrix HIoUAnd center point spacing HdWhere the subscript (i, j) indicates that the result is given by index i and j, so that the two adjacent matrices are symmetric matrices. For the results shown in fig. 5, there are:
Figure BDA0003315278410000081
Figure BDA0003315278410000082
since we artificially select the master control module (F) as the starting point of the linked list, the serial number of the master control module is 0, as can be obtained from fig. 5 (c). And from HdAnd HIoUIt can be seen that the sequence number 0 module is closest to the sequence number 2 module, and both have the largest intersection ratio IoU, so we add the sequence number 2 module to the linked list. Repeating the above process, and finally obtaining a linked list as follows: [0214]. It can be easily found that the serial number of the linked list is consistent with the configuration of the real module even in the case of large background noise (such as reflection).
Experimental results show that the method can realize the identification of the kinematic chain configuration of the robot under the condition that no additional detection component is required to be installed on the robot, and has high robustness for a noise environment.
As shown in fig. 2, a modular robot kinematic chain configuration recognition system includes:
the input module is used for acquiring the image of the modular robot to obtain an input image;
the system comprises a detection module, a detection module and a processing module, wherein the detection module is used for inputting an input image into a preset target detection model for multi-module identification to obtain a module central point position, a module type and a module bounding box size, and the preset target detection model comprises a preliminary feature extraction unit, a feature fusion unit and an output unit;
and a configuration identification module for processing the module center point position, the module type and the module bounding box size based on the nearest-point search method, determining the module connection relations and completing identification of the kinematic chain configuration.
The contents in the above method embodiments are all applicable to the present system embodiment, the functions specifically implemented by the present system embodiment are the same as those in the above method embodiment, and the beneficial effects achieved by the present system embodiment are also the same as those achieved by the above method embodiment.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A modular robot kinematic chain configuration identification method is characterized by comprising the following steps:
acquiring a modular robot image to obtain an input image;
inputting an input image into a preset target detection model for multi-module identification to obtain a module central point position, a module type and a module bounding box size;
the preset target detection model comprises a preliminary feature extraction unit, a feature fusion unit and an output unit;
and processing the module center point position, the module type and the module bounding box size based on a nearest-point search method, determining the module connection relations and completing identification of the kinematic chain configuration.
2. The modular robot kinematic chain configuration identification method according to claim 1, characterized in that the output unit comprises:
a central point predicting section for predicting a position of a central point of a module in an input image and a type of the module;
an offset predicting section for predicting an offset of a center point position of a module in an input image;
and a bounding box prediction section for predicting a size of a bounding box of the module in the input image.
3. The modular robot kinematic chain configuration identification method according to claim 2, wherein the step of inputting the input image into a preset target detection model for multi-module identification to obtain the module center point position, the module type and the module bounding box size specifically comprises:
inputting an input image into a preset target detection model;
performing primary feature extraction processing on the input image based on a primary feature extraction unit to obtain an intermediate feature tensor;
performing depth feature fusion on the intermediate feature tensor based on the feature fusion unit to obtain a high-level feature tensor;
sampling the high-level feature tensor and adjusting the size to obtain a feature map;
respectively inputting the feature map into the center point prediction part, the offset prediction part and the bounding box prediction part to obtain the module center point output, the module center point offset correction output and the module bounding box output;
obtaining the position of the center point of the module and the type of the module based on the output of the center point of the module and the output of the offset correction of the center point of the module;
and obtaining the size of the module bounding box based on the output of the module bounding box.
4. The modular robot kinematic chain configuration identification method of claim 3, characterized in that the module center point output is represented as follows:
$\hat{Y}_h \in [0,1]^{\frac{w_1}{R} \times \frac{h_1}{R} \times C}$
In the above formula, R represents the ratio between the feature map and the input image size, C represents the number of module types, $w_1$ is the width of the image, and $h_1$ is the height of the image.
5. The modular robot kinematic chain configuration identification method according to claim 4, further comprising a construction step of the center point prediction part, which specifically comprises:
constructing a confidence label of the module center point, where the confidence of a class-c module center point at position (x, y) is denoted $Y_{h;x,y,c} \in [0,1]$;
acquiring a training image;
decomposing the real center point p = (x, y) in the training image into a low-resolution center point p' and a center point offset δp;
and training the center point prediction part based on the low-resolution center point p' and the confidence $Y_{h;x,y,c}$.
6. The modular robot kinematic chain configuration identification method according to claim 5, characterized in that the mean square error is used as the loss function for training the center point prediction part, formulated as:
$L_h = \mathbb{E}_x\left[\left(\hat{Y}_{h;x,y,c} - Y_{h;x,y,c}\right)^2\right]$
In the above formula, $\mathbb{E}_x$ represents the mathematical expectation operator, $Y_{h;x,y,c}$ represents the confidence label, $L_h$ represents the center point prediction loss function, x, y represent the position of the training real center point, and c represents the class.
7. The modular robot kinematic chain configuration identification method according to claim 6, further comprising a construction step of the offset prediction part, which specifically comprises:
training the offset prediction part based on the center point offsets δp and an L1 loss function;
the L1 loss function is
$L_o = \frac{1}{N}\sum_{p}\left|\hat{O}_{p'} - \delta p\right|$,
where $L_o$ represents the center point offset loss, N represents the total number of modules, $|\cdot|$ represents the L1 distance operator, and $\hat{O}_{p'}$ represents the predicted offset at the (x, y) position of the center point.
8. The modular robot kinematic chain configuration identification method according to claim 7, further comprising a construction step of the bounding box prediction part, which specifically comprises:
letting $(x_1^{(k)}, y_1^{(k)}, x_2^{(k)}, y_2^{(k)})$ denote the coordinates of the upper-left and lower-right corners of the detection box of a module of class $c_k$ in the image, from which the box center point $p_k = \left(\frac{x_1^{(k)} + x_2^{(k)}}{2}, \frac{y_1^{(k)} + y_2^{(k)}}{2}\right)$ and the bounding box size $s_k = \left(x_2^{(k)} - x_1^{(k)},\; y_2^{(k)} - y_1^{(k)}\right)$ are obtained;
taking the heights h and widths w of the bounding boxes s of all modules of all classes in the training image as the values of two feature maps, so that the low-resolution bounding box sizes w and h of a module whose center point lies at a given position (x, y) correspond to that position in the feature maps;
and training the bounding box prediction part based on the L1 loss function.
9. The modular robot kinematic chain configuration identification method according to claim 8, wherein the step of processing the module center point position, the module type and the module bounding box size based on the nearest-point search method, determining the module connection relations, and completing identification of the kinematic chain configuration specifically comprises:
constructing a strongly connected graph with the center points of all modules of all classes in the input image as nodes;
determining the connection relations of all modules according to the center point positions and the intersection-over-union between bounding boxes;
determining a starting point and attempting to connect all modules in sequence to obtain the linked list with the minimum weight, wherein the weight is the distance between two center points;
and completing identification of the chain-type kinematic configuration.
10. A modular robot kinematic chain configuration identification system, comprising:
the input module is used for acquiring the image of the modular robot to obtain an input image;
the system comprises a detection module, a detection module and a processing module, wherein the detection module is used for inputting an input image into a preset target detection model for multi-module identification to obtain a module central point position, a module type and a module bounding box size, and the preset target detection model comprises a preliminary feature extraction unit, a feature fusion unit and an output unit;
and a configuration identification module for processing the module center point position, the module type and the module bounding box size based on the nearest-point search method, determining the module connection relations and completing identification of the kinematic chain configuration.
CN202111228925.8A 2021-10-21 2021-10-21 Modular robot kinematic chain configuration identification method and system Active CN113920322B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111228925.8A CN113920322B (en) 2021-10-21 2021-10-21 Modular robot kinematic chain configuration identification method and system

Publications (2)

Publication Number Publication Date
CN113920322A true CN113920322A (en) 2022-01-11
CN113920322B CN113920322B (en) 2024-06-14

Family

ID=79242239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111228925.8A Active CN113920322B (en) 2021-10-21 2021-10-21 Modular robot kinematic chain configuration identification method and system

Country Status (1)

Country Link
CN (1) CN113920322B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108115654A (en) * 2017-12-27 2018-06-05 清华大学 Become drive motion chain and the parallel robot containing the kinematic chain containing V belt translation
CN109829893A (en) * 2019-01-03 2019-05-31 武汉精测电子集团股份有限公司 A kind of defect object detection method based on attention mechanism
CN110378325A (en) * 2019-06-20 2019-10-25 西北工业大学 A kind of object pose recognition methods during robot crawl
CN110503112A (en) * 2019-08-27 2019-11-26 电子科技大学 A kind of small target deteection of Enhanced feature study and recognition methods
US20200293050A1 (en) * 2017-09-22 2020-09-17 Lg Electronics Inc. Mobile robot and method of controlling the same
US20210064876A1 (en) * 2019-08-27 2021-03-04 Ricoh Company, Ltd. Output control apparatus, display control system, and output control method

Also Published As

Publication number Publication date
CN113920322B (en) 2024-06-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant