CN114546114A - Control method and control device for mobile robot and mobile robot - Google Patents

Control method and control device for mobile robot and mobile robot Download PDF

Info

Publication number
CN114546114A
Authority
CN
China
Prior art keywords
image
target
frame
mobile robot
key point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210137890.5A
Other languages
Chinese (zh)
Inventor
刘三军
梅江元
蒋思凡
区志财
唐剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Midea Group Co Ltd
Midea Group Shanghai Co Ltd
Original Assignee
Midea Group Co Ltd
Midea Group Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Midea Group Co Ltd, Midea Group Shanghai Co Ltd filed Critical Midea Group Co Ltd
Priority to CN202210137890.5A priority Critical patent/CN114546114A/en
Publication of CN114546114A publication Critical patent/CN114546114A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J11/00Manipulators not otherwise provided for

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Manipulator (AREA)

Abstract

The application relates to the field of robots and provides a control method and a control device for a mobile robot, and a mobile robot. The control method of the mobile robot comprises the following steps: acquiring a to-be-processed image set comprising a target object; performing key point detection on at least one frame of first image in the to-be-processed image set to obtain the relative positional relationship of a plurality of target key points within the same frame of the first image; determining, from the first image and based on the relative positional relationship, a first target image corresponding to the target object being in a hand-up state; performing gesture recognition on at least one frame of second image in the to-be-processed image set to obtain a gesture recognition result; and controlling the mobile robot based on the gesture recognition result corresponding to the first target image when the at least one frame of second image and the first target image satisfy a target condition. The method and the device can accurately recognize valid gestures in various scenes, effectively avoid false triggering of the mobile robot, and improve the user experience.

Description

Control method and control device for mobile robot and mobile robot
Technical Field
The present application relates to the field of robot technology, and in particular, to a mobile robot control method, a mobile robot control device, and a mobile robot.
Background
In the related art, when a mobile robot interacts with a target user, the mobile robot can give corresponding feedback according to the target user's instructions, enriching people's daily lives.
As technology advances, mobile robots are increasingly used in a variety of scenarios. In some scenes, however, the robot cannot accurately identify the target user or the instruction issued by the target user, so false triggering occurs, which brings inconvenience and degrades the user experience.
Disclosure of Invention
The present application is directed to solving at least one of the problems in the prior art. To this end, the application proposes a control method for a mobile robot that effectively avoids false triggering of the mobile robot.
The application also provides a control device of the mobile robot.
The application also provides a mobile robot.
The application also provides an electronic device.
The present application also proposes a non-transitory computer-readable storage medium.
The present application also proposes a computer program product.
The control method of the mobile robot according to the first aspect of the present application includes:
acquiring a to-be-processed image set comprising a target object;
performing key point detection on at least one frame of first image in the image set to be processed to obtain the relative position relation of a plurality of target key points in the first image in the same frame;
determining a first target image corresponding to the target object being in a hand-up state from the first image based on the relative position relation;
performing gesture recognition on at least one frame of second image in the image set to be processed to obtain a gesture recognition result;
and controlling the mobile robot based on a gesture recognition result corresponding to the first target image when the at least one frame of second image and the first target image meet a target condition.
According to the control method of the mobile robot, the hand-up state of the target user and the target gesture of the target user can be accurately identified in various scenes, effectively avoiding false triggering of the mobile robot and improving the user experience.
According to an embodiment of the application, before said controlling said mobile robot, said method further comprises:
and identifying the target object through at least one third image in the image set to be processed, and determining the target object as an effective object.
According to the control method of the mobile robot, the target user can be accurately identified in various scenes, the phenomenon of false triggering of the mobile robot is further avoided, and the user experience is improved.
According to an embodiment of the present application, the performing keypoint detection on at least one frame of first image in the to-be-processed image set to obtain a relative position relationship of a plurality of target keypoints in the first image in the same frame includes:
performing key point detection on the at least one frame of first image to obtain the positions of a plurality of target key points in the first image in the same frame;
determining a target threshold value for distinguishing the relative position relation based on the result of the identity recognition;
determining relative positional relationships of the plurality of target keypoints based on the target threshold and the positions of the plurality of target keypoints.
According to the control method of the mobile robot, different setting modes can be matched according to different target users, flexibility is enhanced, and user experience is improved.
According to an embodiment of the present application, the performing gesture recognition on at least one frame of second image in the to-be-processed image set to obtain a gesture recognition result includes:
determining the corresponding relation between the gesture and the control instruction based on the identity recognition result;
identifying a target gesture in the at least one frame of second image;
and obtaining the gesture recognition result based on the target gesture and the correspondence between the gesture and the control instruction.
According to an embodiment of the application, the identification of the target object through the at least one frame of third image in the to-be-processed image set, determining the target object to be a valid object, is performed before the first target image corresponding to the target object being in the hand-up state is determined from the first image based on the relative positional relationship;
the determination of the first target image from the first image is in turn performed before the gesture recognition of the at least one frame of second image in the to-be-processed image set that yields the gesture recognition result;
and the first frame of the at least one frame of second image is the last frame of the at least one frame of first image.
According to the control method of the mobile robot, the processing sequence of identity recognition, hand-raising state recognition and gesture recognition can be optimized, the processing flow is simplified, resources are saved, and the user experience is improved.
According to an embodiment of the present application, the identifying the target object through at least one third image in the to-be-processed image set includes:
performing face recognition on the at least one frame of third image; or performing ReID (re-identification) on the at least one frame of third image.
According to an embodiment of the present application, the at least one frame of the second image and the first target image satisfy a target condition, including: the at least one frame of second image comprises the first target image; or, the at least one frame of second image is a subsequent frame of the first target image.
According to an embodiment of the application, gesture recognition is performed on the at least one frame of second image in the to-be-processed image set to obtain the gesture recognition result while the first target image corresponding to the target object being in the hand-up state is determined from the first image based on the relative positional relationship;
the at least one frame of second image comprises n consecutive frames of images, and the first frame of the at least one frame of first image is the m-th frame of the second image;
wherein m is constrained by a formula (rendered as an image in the original publication) to lie near the middle of the n frames, and n and m are positive integers.
According to the control method of the mobile robot, the processing time can be shortened and the user experience improved by processing the hand-up state recognition and the gesture recognition in parallel.
According to an embodiment of the present application, the performing keypoint detection on at least one frame of a first image in the to-be-processed image set to obtain a relative position relationship of a plurality of target keypoints in the first image in the same frame includes: performing key point detection on the at least one frame of first image to obtain positions of a wrist key point, an elbow key point, a shoulder key point and a waist key point; determining the relative position relationship among the wrist key point, the elbow key point, the shoulder key point and the waist key point in the same frame based on the positions of the wrist key point, the elbow key point, the shoulder key point and the waist key point in the first image;
the determining, from the first image based on the relative positional relationship, of the first target image corresponding to the target object being in the hand-up state includes: determining, as the first target image, a first image in which the wrist key point is higher than the elbow key point and both the wrist key point and the elbow key point are located between the shoulder key point and the waist key point.
According to the control method of the mobile robot, the target key points can be selected from the human body key points so that only their relative positional relationship is processed, which shortens the processing time, saves processing steps, accurately identifies the hand-raising motion of the target object, and improves the user experience.
A control device for a mobile robot according to an embodiment of a second aspect of the present application includes:
the acquisition module is used for acquiring an image set to be processed comprising a target object;
the first processing module is used for performing key point detection on at least one frame of first image in the image set to be processed to obtain the relative position relation of a plurality of target key points in the first image in the same frame;
the second processing module is used for determining a first target image corresponding to the target object in the hand-up state from the first image based on the relative position relation;
the third processing module is used for performing gesture recognition on at least one frame of second image in the image set to be processed to obtain a gesture recognition result;
and the fourth processing module is used for controlling the mobile robot based on the gesture recognition result corresponding to the first target image under the condition that the at least one frame of second image and the first target image meet the target condition.
According to the control device of the mobile robot, the hand-up state of the target user and the target gesture of the target user can be accurately identified in various scenes, effectively avoiding false triggering of the mobile robot and improving the user experience.
An electronic device according to an embodiment of the third aspect of the present application includes a memory, a processor, and a computer program stored on the memory, wherein the processor, when executing the computer program, implements the steps of the control method for a mobile robot according to any one of the above.
A non-transitory computer-readable storage medium according to an embodiment of the fourth aspect of the present application has stored thereon a computer program which, when executed by a processor, implements the steps of the control method for a mobile robot according to any one of the above.
A computer program product according to an embodiment of the fifth aspect of the present application comprises a computer program which, when executed by a processor, implements the steps of the method for controlling a mobile robot as described in any one of the above.
One or more technical solutions in the embodiments of the present application have at least one of the following technical effects:
furthermore, a gesture recognition result is obtained by combining multiple judgment conditions, so that the target user and the target user's gesture can be accurately recognized in various scenes, false triggering of the mobile robot is largely avoided, and the user experience is improved.
Furthermore, by combining the sequence of the judgment conditions, the processing steps and time can be greatly simplified, resources are saved, and the user experience is improved.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a control method of a mobile robot according to an embodiment of the present disclosure;
fig. 2 is a human body key point detection schematic diagram of a control method of a mobile robot according to an embodiment of the present application;
fig. 3 is a first schematic diagram of a valid hand-raising state in a control method of a mobile robot according to an embodiment of the present disclosure;
fig. 4 is a second schematic diagram of a valid hand-raising state in the control method of the mobile robot according to the embodiment of the present application;
fig. 5 is a schematic diagram of an invalid hand-raising state in a control method of a mobile robot according to an embodiment of the present disclosure;
fig. 6 is a schematic flowchart of a control method of a mobile robot according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of a control device of a mobile robot according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
Embodiments of the present application will be described in further detail below with reference to the drawings and examples. The following examples are intended to illustrate the present application but are not intended to limit the scope of the present application.
In the description of the embodiments of the present application, it should be noted that the terms "center", "longitudinal", "lateral", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of describing the embodiments of the present application and simplifying the description, but do not indicate or imply that the referred devices or elements must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the embodiments of the present application. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the embodiments of the present application, it should be noted that the terms "connected" and "connected" are to be interpreted broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected, unless explicitly stated or limited otherwise; can be mechanically or electrically connected; may be directly connected or indirectly connected through an intermediate. Specific meanings of the above terms in the embodiments of the present application can be understood in specific cases by those of ordinary skill in the art.
In the description of the present application, reference to the description of "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" or the like is intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the embodiments of the present application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
A control method, a control device, a mobile robot, and a readable storage medium of a mobile robot according to an embodiment of the present application are described in detail below with reference to fig. 1 to 8.
The control method of the mobile robot provided in the embodiments of the present application can be applied to a terminal, and can be specifically executed by hardware or software in the terminal. The execution subject of the control method may be the terminal, a control device of the terminal, or the like.
The terminal includes intelligent robots such as a mobile service robot or an emotional robot.
In the control method of the mobile robot provided in the embodiments of the present application, the execution subject may be an intelligent device, or a functional module or functional entity in the intelligent device capable of implementing the method. The intelligent devices mentioned in the embodiments of the present application include, but are not limited to, mobile service robots, emotional robots, and the like. The control method is described below with the intelligent device as the execution subject.
The control method of the mobile robot in the embodiments of the present application can be applied to various scenes, such as an open environment with multiple targets or a closed environment with a single target. In these different scenes, the control method accurately identifies the gesture actions of the target user among the various actions of a multi-target crowd, realizing intelligent control of the mobile robot.
As shown in fig. 1, the method for controlling a mobile robot includes: step 110, step 120, step 130, step 140 and step 150.
According to the control method of the mobile robot in the embodiment, the target object and the effective gesture of the target object can be intelligently distinguished, and different working states can be realized according to the effective gesture.
Step 110, a set of images to be processed including the target object is obtained.
In this embodiment, the mobile robot may acquire a set of images to be processed including the target object through the camera.
The target object may be a human object included in an image acquired by the mobile robot.
The image set to be processed may be a multi-frame image set including a target object.
The image set to be processed may contain parts of the body of the target object, for example, a whole-body or upper-body picture of the target object.
The image set to be processed may comprise multiple frames of images, the multiple frames being consecutive images.
The successive frames of images in the set of images to be processed form successive actions, so that the action of the target object in each frame of image can be the same or different or partly the same.
And 120, performing key point detection on at least one frame of first image in the image set to be processed to obtain the relative position relation of a plurality of target key points in the first image in the same frame.
In this embodiment, the mobile robot may perform human body key point detection on the target object in the at least one frame of the first image, and acquire a relative position relationship between the target key points of the target object human body.
Wherein, the at least one frame of first image is at least one frame of image in the image set to be processed.
The first image is used for detecting the relative position relation of the target key points by carrying out key point detection on the target object.
The first image is at least one frame.
The first image may be a multi-frame image, which is a continuous plurality of images.
For example, the set time is 1s, the set frame number is 30 frames, and the first image includes consecutive 30 frames of images within 1 s.
The successive frames of images in the first image form successive motions, and the motions of the target object contained in the multi-frame images may be the same or partially the same.
The object of the key point detection can be a human skeleton point of the target object, and the key point detection modes are various and can be whole body skeleton point detection or local skeleton point detection.
For example, as shown in fig. 2, the human key points of the target object may be the detection of skeleton points of the whole body, including 3 key points of the trunk and 8 key points of the limbs.
For example, the body keypoints of the target object may be local skeletal point detection, and the local skeletal points may include torso keypoints, limb keypoints, or local combination keypoints of the target object.
As shown in fig. 3, the upper body keypoints of the target object may include 3 keypoints of the torso and 4 keypoints of the arms.
The target key points are the specific key points, among the human body key points, that are involved in a specific motion when the target object is detected making that motion.
In some embodiments, the target object may set an action for controlling the mobile robot to perform a corresponding working mode, and the system may determine a target key point to be detected according to the action, thereby simplifying the algorithm steps.
The relative position relationship of the target key points may be a relative position between specific key points in a frame of the first image after the key point detection is performed on the frame of the first image.
In some embodiments, when detecting an arm motion, an arm key point may be set as a target key point, the arm key point includes a wrist key point and an elbow key point, position information of the wrist key point and the elbow key point may be obtained after performing key point detection on a target object in at least one frame of the first image, and a relative position relationship between the wrist key point and the elbow key point may be obtained based on the position information of the wrist key point and the elbow key point in the frame of the first image.
It should be noted that a mutual positional relationship may exist only when a plurality of target keypoints are located in the same frame of the first image, and the relative positional relationship of the plurality of target keypoints is obtained through positional information of a plurality of specific keypoints in the frame of the first image.
That is, the position information of the specific key points required for the relative positional relationship of the target key points is located in the same frame of the first image.
For example, the relative position between the wrist key point and the elbow key point may be that the wrist key point is higher than, lower than, or level with the elbow key point.
In this embodiment, the required target key points may be selected from the detected human body key points, and the motion of the target object may be determined using the relative positional relationship of relatively few target key points, which simplifies the determination step without affecting the result.
In some embodiments, step 120 further comprises: and performing key point detection on at least one frame of first image to obtain positions of a wrist key point, an elbow key point, a shoulder key point and a waist key point.
In this step, as shown in fig. 3, the mobile robot may perform human body key point detection on the upper half of the human body of the target object in at least one frame of the first image, and detect positions of a wrist key point, an elbow key point, a shoulder key point, and a waist key point.
Based on the positions of the wrist keypoints, the elbow keypoints, the shoulder keypoints and the waist keypoints in the first image, the relative positional relationships of the wrist keypoints, the elbow keypoints, the shoulder keypoints and the waist keypoints in the same frame can be determined.
By the action of the target object, the target key point can be determined in the plurality of human key points.
For example, when the target object makes a hand-up motion, it may be determined that the target keypoints associated with the hand-up motion include wrist keypoints, elbow keypoints, shoulder keypoints, and waist keypoints.
And after determining the plurality of target key points, acquiring the position information of the plurality of target key points.
The relative position relationship between the plurality of target key points can be obtained based on the determined position information of the target key points.
For example, the relative position information of the 4 target key points may be obtained from the position information of the target key points including the wrist key point, the elbow key point, the shoulder key point, and the waist key point.
In this embodiment, the target key points are selected from the plurality of key points, so as to obtain the relative positions of the target key points, and the required action is determined according to the relative positions, so that the determination of the positional relationship of the key points of the whole body can be avoided, and the algorithm steps can be simplified.
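As a minimal sketch of this step (not part of the original disclosure), the following Python fragment selects target key points from a pose-estimation result and derives their pairwise relative positions; the keypoint names and the image-coordinate convention (y grows downward) are illustrative assumptions.

```python
# Sketch: selecting target key points and deriving their relative positions.
# Keypoint names and the (x, y) convention are assumptions for illustration.

TARGET_KEYPOINTS = ("wrist", "elbow", "shoulder", "waist")

def relative_positions(pose: dict) -> dict:
    """Return pairwise above/below relations among the target key points
    detected in one and the same frame of the first image."""
    pts = {name: pose[name] for name in TARGET_KEYPOINTS}
    names = list(pts)
    relations = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            # Smaller y means the point is higher in the image.
            relations[(a, b)] = "above" if pts[a][1] < pts[b][1] else "not_above"
    return relations

# Example: one detected pose (single arm) in a frame.
pose = {"wrist": (210, 180), "elbow": (200, 260),
        "shoulder": (180, 220), "waist": (185, 400)}
print(relative_positions(pose))
```

Restricting the comparison to the few target key points, rather than all detected key points, is what simplifies the determination step.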
And step 130, determining a first target image corresponding to the target object in the hand-up state from the first image based on the relative position relation.
In the step, the mobile robot determines the action of the target object in the first image according to the relative position relationship of the plurality of target key points, and judges whether the target object is in a hand-up state.
The first target image is at least one frame of image corresponding to the target object in the multi-frame first image in the hand-up state.
The hand-raising state can be expressed as the position relation of the arm key point relative to the trunk key point.
In the related art, the motion of the target human body is usually detected by adopting the height of the key point relative to the ground, and the detection method is limited by different human body conditions, such as height or human body proportion, so that the motion detection result is inaccurate.
The relative position relation of the key points is adopted to detect the action of the target human body, and the relative position relation is objective and is not influenced by human body conditions, so that the problem of inaccurate detection result caused by different human body conditions can be avoided, and the accuracy of detecting the human body action is improved.
In some embodiments, the target keypoints for the hand-up state may include wrist keypoints, elbow keypoints, shoulder keypoints, and waist keypoints, and the hand-up state may be such that the wrist keypoints are higher than the elbow keypoints.
The relative position relationship of the target key points of the target object in the hand-up state at least comprises the following three states according to different action amplitudes:
firstly, the wrist key point and the elbow key point are higher than the shoulder key point;
secondly, the wrist key point and the elbow key point are positioned between the shoulder key point and the waist key point;
and thirdly, the wrist key point is higher than the shoulder key point, and the elbow key point is positioned between the shoulder key point and the waist key point.
In some embodiments, step 130 may further include: and determining a first image with a wrist key point higher than the elbow key point, a wrist key point between the shoulder key point and the waist key point and an elbow key point between the shoulder key point and the waist key point as a first target image.
In this embodiment, the criterion for determining the hand-up state of the target object may be that the wrist keypoint is higher than the elbow keypoint and that the wrist keypoint and the elbow keypoint are located between the shoulder keypoint and the waist keypoint.
When the target object in the first image meets the criterion of the hand-up state, the first image corresponding to the target object at this time is the first target image.
In the embodiment, the hand-raising standard of the target object is measured by a quantitative standard, so that whether the target object is in a specified hand-raising state can be more accurately judged, and the phenomenon of false triggering of the mobile robot is further prevented.
The wrist key point is higher than the elbow key point, and the wrist key point and the elbow key point are located between the shoulder key point and the waist key point, and at least two forms can be included:
one, one hand satisfies the condition.
As shown in fig. 3, among the detected target key points of the target object, a single hand satisfies the hand-raising criterion and amplitude, and the hand-raising state of the target object can be determined to be a valid hand raise.
Two, both hands satisfy the condition.
As shown in fig. 4, among the detected target key points of the target object, both hands satisfy the hand-raising criterion and amplitude, and the hand-raising state of the target object can be determined to be a valid hand raise.
By judging the hand-raising state against these different conditions, the hand-raising state of the target object can be determined flexibly, making the judgment more sensitive and improving the user experience.
If any of the above hand-raising conditions is not satisfied, the hand-up state is invalid.
For example, in this embodiment, as shown in fig. 5, when the wrist key point of the left hand of the target object is higher than the elbow key point, but the wrist key point and the elbow key point of the raised hand are not located between the shoulder key point and the waist key point, the hand-up state of the target object is invalid.
The hand-raising state detection provided in this embodiment judges the hand-raising state of the target object more accurately by constraining both the hand-raising criterion and the hand-raising amplitude.
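The criterion can be sketched as follows, with one valid arm (fig. 3) or both arms (fig. 4) counting as a valid hand raise; the keypoint naming and coordinate convention are illustrative assumptions, not taken from the original disclosure.

```python
# Sketch of the hand-raising criterion: wrist above elbow, and both wrist
# and elbow between the shoulder and waist key points.

def arm_is_raised(wrist, elbow, shoulder, waist) -> bool:
    """(x, y) image coordinates; smaller y is higher in the image."""
    wrist_above_elbow = wrist[1] < elbow[1]
    wrist_between = shoulder[1] < wrist[1] < waist[1]
    elbow_between = shoulder[1] < elbow[1] < waist[1]
    return wrist_above_elbow and wrist_between and elbow_between

def is_valid_hand_raise(pose: dict) -> bool:
    """pose maps keypoint names to (x, y); a single valid arm suffices."""
    return any(
        arm_is_raised(pose[f"{side}_wrist"], pose[f"{side}_elbow"],
                      pose[f"{side}_shoulder"], pose["waist"])
        for side in ("left", "right")
    )

pose = {
    "left_wrist": (150, 250), "left_elbow": (140, 330), "left_shoulder": (160, 220),
    "right_wrist": (260, 500), "right_elbow": (250, 420), "right_shoulder": (240, 220),
    "waist": (200, 430),
}
print(is_valid_hand_raise(pose))  # True: the left arm satisfies the criterion
```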
And 140, performing gesture recognition on at least one frame of second image in the image set to be processed to obtain a gesture recognition result.
In the step, the mobile robot performs gesture recognition on at least one frame of second image including the gesture of the target object, and obtains a corresponding gesture recognition result.
Wherein the second image comprises an image of the target object making a particular gesture.
The second image is at least one frame image in the image set to be processed.
The second image may be a multi-frame image, which is a continuous plurality of images.
For example, the set time is 1s, the set frame number is 30 frames, and the second image includes consecutive 30 frames of images within 1 s.
The continuous frames of the images in the second image form continuous gesture actions, and the gesture of the target object contained in the multi-frame images can be a sliding gesture or a static gesture, so the gesture in each frame of the images can be the same or partially the same.
The second image may be used to recognize the gesture of the target object.
The gesture recognition result is the gesture semantic information obtained by performing gesture recognition on the specific gesture of the target object.
And 150, controlling the mobile robot based on the gesture recognition result corresponding to the first target image under the condition that the at least one frame of second image and the first target image meet the target condition.
In this step, the mobile robot is controlled based on a gesture recognition result obtained by gesture recognition of the first target image in which the target object is in the hand-up state.
The gesture recognition result may be used to control the mobile robot.
The gesture is a valid gesture only when the target condition is satisfied.
The target condition is a mutual relation between at least one frame of second image and the first target image, and when the at least one frame of second image and the first target image meet the set condition, the mobile robot can be controlled to make a corresponding working state based on a gesture recognition result corresponding to the first target image.
In some embodiments, the target condition includes at least the following two cases:
one, at least one frame of the second image may comprise the first target image.
In this embodiment, when it is determined that the target object is in the hand-up state according to the at least one frame of the first image, and the gesture made by the target object in the first target image included in the at least one frame of the second image is a correct gesture, the gesture is an effective gesture, and the mobile robot may be controlled according to a gesture recognition result corresponding to the effective gesture.
In this case, the mobile robot is controlled by the valid gesture made by the target object in the hand-up state, so that false triggering of the mobile robot can be avoided.
Two, the at least one frame of second image may be a subsequent frame of the first target image.
In this embodiment, in the case where it is determined from the first target image that the target object is in the hand-up state and the second image is a frame subsequent to the first target image, even if the target object in the second image is not in the hand-up state, gesture recognition may be performed on the gesture of the target object in the second image.
The subsequent frames of the first target image are the N image frames after the target object in the first target image ends the hand-up state; they immediately follow the last frame of the first target image but do not belong to the first image.
The N frames may be the 1st to 5th frames immediately following the last frame of the first target image. For example, the second image is the 2nd or 3rd frame after the last frame of the first target image.
The target object in the second image is not in the hand-up state, but since the second image is the first image frame after the hand-up state of the target object is ended, it can also be determined that the target object has made the hand-up state, and therefore the second image is an image frame meeting the requirements and can be used for gesture recognition.
When it is judged that the target object has made the hand-raising motion and the gesture recognition result in the subsequent second image is a valid gesture, the mobile robot is controlled based on the gesture recognition result.
In this case, the mobile robot is controlled by the effective gesture of the target object in the subsequent frame of the hand-up state, so that the hand-up time of the target object can be reduced, the sensitivity of the control method is improved, and the use experience of the user is improved under the condition of avoiding false triggering of the mobile robot.
In the embodiment, the gesture recognition result is obtained through the judgment of at least two effective conditions to control the mobile robot, so that the phenomenon of false triggering of the mobile robot is effectively avoided, and the use experience of a user is improved.
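The two cases of the target condition can be sketched as below; the frame indices and the size of the subsequent-frame window (N = 5, per the 1st-to-5th-frame example above) are illustrative assumptions.

```python
# Sketch of the target condition: the second images include the first target
# image, or they lie within the N frames following its last frame.

def satisfies_target_condition(second_frames: range, first_target_frames: range,
                               n_follow: int = 5) -> bool:
    """Frames are identified by integer indices within the image set."""
    contains_target = any(f in second_frames for f in first_target_frames)
    follow_window = range(first_target_frames[-1] + 1,
                          first_target_frames[-1] + 1 + n_follow)
    is_subsequent = all(f in follow_window for f in second_frames)
    return contains_target or is_subsequent

first_target = range(10, 20)  # frames in which the hand raise is held
print(satisfies_target_condition(range(15, 25), first_target))  # True: overlap
print(satisfies_target_condition(range(20, 23), first_target))  # True: subsequent frames
```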
The judgment order of the at least two valid conditions that the gesture recognition result needs to satisfy may include at least: performing hand-up state detection first and then gesture recognition, or performing hand-up state detection and gesture recognition simultaneously.
The specific implementation method of the above judgment sequence will be specifically described in the following embodiments.
Different orders of determination of the two valid conditions may produce different effects.
For example, when two valid conditions are sequentially executed, if the preceding condition is not satisfied, the subsequent condition determination is stopped, so that the determination conditions can be simplified, and resources can be saved.
For example, in the case of parallel processing of two effective conditions, the determination time can be shortened, the sensitivity of the mobile robot can be improved, and the user experience can be improved.
It should be noted that, in this embodiment, the gesture effective condition determination order is not limited, and the user may set the determination order according to specific situations.
According to the control method of the mobile robot, the hand-up state of the target user and the target gesture of the target user can be accurately identified in various scenes, effectively avoiding false triggering of the mobile robot and improving the user experience.
In some embodiments, before controlling the mobile robot, the control method of the mobile robot may further include: and identifying the target object through at least one frame of third image in the image set to be processed, and determining the target object as an effective object.
In this embodiment, the mobile robot may determine whether the identity of the target object meets a preset condition for controlling the mobile robot by performing identity recognition on the target object in at least one frame of third image of the to-be-processed image set.
The third image may be at least one frame in the image set to be processed.
The third image may be used for identification of the target object.
Wherein the third image is at least one frame.
The third image may be a multi-frame image, which is a continuous plurality of images.
When the target object satisfies the preset identity condition and is a valid object, and also satisfies the other validity conditions, the gesture made by the target object can be determined to be valid, and the gesture can achieve the expected control effect on the mobile robot.
If the target object is an invalid object, the target object cannot control the mobile robot.
A gesture that validly controls the mobile robot must satisfy at least the following conditions: the target object is a valid object, the target object is in the hand-up state, and the gesture is a correct gesture.
In some embodiments, the target object is identified through at least one frame of third image in the to-be-processed image set, and multiple identification methods may be adopted, such as face recognition, ReID recognition, or a combination of the two.
The identification of the target object at least comprises the following three modes:
Firstly, face recognition.
In this embodiment, before the mobile robot is controlled, face recognition may be performed on the at least one frame of third image to extract the facial feature information of the target user and check whether the target object is a valid object, thereby preventing false touch operation of the mobile robot by a wrong object.
Secondly, ReID recognition.
In this embodiment, before the mobile robot is controlled, ReID recognition may be performed on the at least one frame of third image to extract the human body feature information of the target user and check whether the target object is a valid object, thereby further preventing false touch operation of the mobile robot by a wrong object.
The ReID technique extracts the human body feature information of the target user and compares it with stored human body feature information; the stored feature information may be collected when the mobile robot is started, or pre-stored in the mobile robot.
Thirdly, face recognition combined with ReID recognition.
In this embodiment, before the mobile robot is controlled, face recognition may be performed on the target object first, switching to the ReID recognition mode when the face information of the target object cannot be collected; whether the target object is a valid object can thus be checked under different postures of the target object, greatly preventing false touch operation of the mobile robot by a wrong object.
In this embodiment, the case where the face information of the target object cannot be acquired may include that the face is blocked or that the target object faces away from the mobile robot.
The face may be occluded by a mask, sunglasses, or other objects, in which case face recognition cannot collect the face information of the target object.
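The combined strategy can be sketched as follows; the recognizer callables are placeholders for whatever face-recognition and person-ReID models the robot actually uses, not APIs named in the original disclosure.

```python
# Sketch: try face recognition first, fall back to ReID when no face is found.

from typing import Callable, Optional

def identify(frame,
             recognize_face: Callable[[object], Optional[str]],
             recognize_reid: Callable[[object], Optional[str]],
             valid_ids: set) -> bool:
    """Return True if the target object in the frame is a valid object."""
    identity = recognize_face(frame)      # None if no usable face is found
    if identity is None:
        identity = recognize_reid(frame)  # body-feature re-identification
    return identity is not None and identity in valid_ids

# Example with stub recognizers: the face is occluded, ReID succeeds.
print(identify("frame", lambda f: None, lambda f: "user_1", {"user_1"}))  # True
```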
In some embodiments, step 120 may further include the steps of:
and performing key point detection on at least one frame of first image to obtain the positions of a plurality of target key points in the first image in the same frame.
In this step, the mobile robot may perform human body keypoint detection on the target object in at least one frame of the first image, and obtain position information of a plurality of target keypoints of the target object in the same frame of the first image according to a preset action.
The human body key points can be human body joint points and can comprise whole body skeleton point detection or local skeleton point detection.
The position information may be a height from the ground of a position where the target key point is located, or a distance from the target key point to other key points of the human body.
It should be noted that, because the human body conditions are different, the position information of the key points of different human bodies is different, and the human body conditions may include human body proportions or action habits.
And determining a target threshold value for distinguishing the relative position relation based on the identification result.
In this step, the mobile robot may retrieve, according to the identity recognition result of the target object, the preset human body condition corresponding to the target object, and then retrieve, according to that condition, the target threshold for the position information of the target object's key points.
The target threshold may be the boundary of the range between the highest and lowest values that a specific key point of the target object can reach when performing a specific action, or the boundary of the position range in which that key point is most likely to appear.
The target threshold may be a threshold preset in the mobile robot or a threshold acquired when the mobile robot is started. The mobile robot can call out a target threshold corresponding to the identity recognition result according to the identity recognition result of the target object.
It should be noted that the target threshold value differs from human body to human body due to different human body conditions.
For example, when the target object is tall, the distances between the target object's human body key points are large, and the target threshold corresponding to the target object is correspondingly large.
For example, in a case where the target object habitually makes a specific motion, when the target object makes the specific motion, the probability that the human body key point position of the target object appears at a specific position is high, and the target threshold corresponding to the target object corresponds to the specific position.
In the step, the target threshold value of the key point corresponding to the target object can be selected according to the identity recognition result, whether the effective action is made by the effective object can be judged more accurately, and the mobile robot can be effectively prevented from being triggered by mistake in the relevant environment.
Based on the target threshold and the positions of the plurality of target key points, determining the relative position relationship of the plurality of target key points.
In this step, the mobile robot may determine, through a target threshold corresponding to the target object and the positions of the target key points of the target object, the relative positions of the target key points corresponding to the motion when the target object makes a specific motion.
The target threshold may be used to determine whether the action is made by the target object.
The relative positional relationship of the plurality of target key points can be used to judge the action state of the target object.
In the embodiment, the relative position relation of the corresponding target key points when the target object makes the specific action is judged according to the identity recognition result, so that whether the target object makes the effective action can be judged more accurately, and the mobile robot is further prevented from being triggered by the wrong object or the meaningless action in the relevant scene.
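A sketch of an identity-dependent threshold applied to a relative-position decision follows; the threshold table and pixel values are illustrative assumptions.

```python
# Sketch: "above" is asserted only when the vertical separation exceeds the
# threshold retrieved for the recognized user (per body condition).

USER_THRESHOLDS = {"user_1": 12.0, "user_2": 20.0}  # pixels; assumed values
DEFAULT_THRESHOLD = 15.0

def is_above(point_a, point_b, identity: str) -> bool:
    """True if point_a is higher than point_b by more than the user's margin
    (image coordinates, smaller y is higher)."""
    threshold = USER_THRESHOLDS.get(identity, DEFAULT_THRESHOLD)
    return (point_b[1] - point_a[1]) > threshold

print(is_above((100, 200), (100, 230), "user_1"))  # True: 30 px > 12 px
print(is_above((100, 200), (100, 210), "user_1"))  # False: 10 px <= 12 px
```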
In some embodiments, step 140 may further include the steps of:
and determining the corresponding relation between the gesture and the control instruction based on the identification result.
In this step, the mobile robot may retrieve, according to the identity recognition result, the settings corresponding to that identity, including the correspondence between gestures and control instructions.
Wherein the control instructions may be for controlling the mobile robot.
The corresponding relationship between the gesture and the control instruction may be instruction content for controlling the mobile robot corresponding to the gesture.
The control instructions may include a variety of instructions, such as a stop move instruction, a start move instruction, a power on instruction, or a power off instruction.
It should be noted that, the mobile robot in this embodiment may set multiple sets of control instructions, different objects may set their own control instructions, and the mobile robot may call the control instruction corresponding to the mobile robot according to the identification result, so that the setting is more humanized, and the use experience of the mobile robot is improved.
The mobile robot may recognize a target gesture of the target object in the at least one frame of the second image.
In this step, the mobile robot may recognize a target gesture made by the target object based on the at least one frame of the second image.
The target gesture is a specific gesture that can control the mobile robot to enter a specific working state.
The target gesture may be any gesture set for the target object or defaulted for the mobile robot.
Wherein the at least one frame of second image may include a gesture made by the target object.
The second image may be at least one image in the set of images to be processed.
The second image may be used to identify a gesture of the target object.
The second image is at least one frame and may comprise a plurality of frames. The multi-frame second image may be a plurality of consecutive images.
And obtaining a gesture recognition result based on the corresponding relation between the gesture and the control instruction and the target gesture.
In this step, the mobile robot may determine the gesture recognition result of the target object based on the recognized target gesture and the instruction content that the gesture corresponds to in the correspondence between gestures and control instructions; that is, the mobile robot determines the control instruction corresponding to the gesture of the target object, and thereby the instruction content of the gesture.
The control instruction can be various gesture instructions preset in the mobile robot for the target object.
The corresponding relation between the gesture and the control instruction can be multiple sets, and different target objects correspond to different corresponding relations between the gesture and the control instruction.
In some embodiments, after the mobile robot identifies the target object, the mobile robot calls the corresponding relationship between the corresponding gesture and the control instruction according to the identification result.
In this embodiment, the mobile robot may call the corresponding control instruction according to the identity recognition result to obtain the instruction content of the target gesture corresponding to the control instruction list, and then determine the gesture recognition result of the target object, which may improve the use experience of the mobile robot, and further avoid that the mobile robot is falsely triggered by the wrong object in the relevant scene.
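A sketch of per-user gesture-to-instruction tables selected by the identity recognition result follows; the gesture names and instructions are illustrative assumptions.

```python
# Sketch: the identity selects a gesture-to-command table, and the recognized
# target gesture is looked up in it.

GESTURE_COMMANDS = {
    "user_1": {"open_palm": "stop_moving", "thumbs_up": "start_moving"},
    "user_2": {"open_palm": "power_off",  "fist":      "power_on"},
}

def gesture_result(identity: str, target_gesture: str):
    """Return the control instruction for this user's gesture, or None."""
    return GESTURE_COMMANDS.get(identity, {}).get(target_gesture)

print(gesture_result("user_1", "open_palm"))  # stop_moving
print(gesture_result("user_2", "open_palm"))  # power_off: same gesture, different user
```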
In some embodiments, step 130 may further include performing motion preservation detection on the hand-up state of the target object using a multi-frame fusion technique.
In this embodiment, hand-raising hold detection may be performed on consecutive frames of fourth images starting from the frame in which the hand-up state begins; the hand-up state is determined to be valid when the detected hand-raising motion remains unchanged across the consecutive fourth images, and invalid when the hand-raising motion is detected to change across them.
The fourth image may be a continuous multi-frame image in the image set to be processed.
The fourth image of the plurality of frames may be used to determine whether the hand-up state is maintained.
In this embodiment, using the multi-frame fusion technique to detect the hand-up state can avoid false triggering of the mobile robot caused by an inadvertent hand-raising motion of the target object, further improving the mobile robot's resistance to false triggering and the user experience.
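The multi-frame fusion check can be sketched as below; the per-frame predicate stands in for the hand-raising criterion above, and the hold length of 10 frames is an illustrative assumption.

```python
# Sketch: the hand raise only counts when it is held across a run of
# consecutive fourth images.

def hand_raise_is_held(poses: list, per_frame_check, hold_frames: int = 10) -> bool:
    """poses: consecutive per-frame keypoint data starting at the frame in
    which the hand-up state first appears."""
    if len(poses) < hold_frames:
        return False  # not enough frames to confirm the hold
    return all(per_frame_check(p) for p in poses[:hold_frames])

# Example with a stub predicate: the raise lapses after 6 frames -> invalid.
frames = [{"raised": True}] * 6 + [{"raised": False}] * 6
print(hand_raise_is_held(frames, lambda p: p["raised"]))  # False
```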
The overall flow of the control method of the mobile robot will be specifically described with reference to fig. 6.
A gesture can be determined to be valid only when at least the following conditions are met: the target object is a valid object, the target gesture is correct, and the target object is in the hand-up state; only then can the gesture recognition result achieve the expected control effect on the mobile robot.
The conditions for determining whether a gesture is valid can be processed in several orders: for example, each judgment condition may be processed sequentially, or several conditions may be processed in parallel, as described in detail below.
First, each judgment condition is processed in sequence.
It should be noted that the determination order of the gesture recognition result may be a permutation and combination of the determination conditions, for example, the determination order of the gesture effectiveness at least includes the following three cases:
firstly, judging an effective object, judging a hand-lifting state and judging a target gesture;
secondly, judging the hand-lifting state, judging the target gesture and judging the effective object;
and thirdly, judging target gestures, judging effective objects and judging the hand-lifting state.
The control method of the mobile robot according to the present application does not limit the processing order of the judgment conditions, and the user can set the processing order according to different situations.
In the control method of the mobile robot provided in this embodiment, the determination conditions may be sequentially determined according to a certain order, and if the previous condition is not satisfied, the subsequent operation of the false touch processing is stopped, so that the processing efficiency can be greatly improved, and the operation steps can be simplified.
The first case will be described in detail below.
In some embodiments, identifying the target object and determining that the target object is a valid object may precede step 130, and step 130 may precede step 140.
In other words, the determination order of the mobile robot obtaining the gesture recognition result can be effective object determination, hand-up state determination, and target gesture determination.
Specifically, the mobile robot first performs identity recognition on the target object;
if identity recognition judges the target object to be an invalid object, the state of the mobile robot is not changed;
if identity recognition judges the target object to be a valid object, whether the target object is in the hand-up state is judged based on the relative position relation of the plurality of target key points;
if the target object is not in the hand-up state, the state of the mobile robot is not changed;
if the target object is judged to be in the hand-up state, the gesture of the target object is recognized to obtain a gesture recognition result;
if the recognized gesture is incorrect, the state of the mobile robot is not changed;
and if the gesture content is the target gesture and the gesture is correct, the mobile robot is controlled based on the gesture recognition result.
In this embodiment, identity recognition is performed first, hand-up detection second, and gesture validity judgment last; this greatly improves the efficiency of false-trigger processing, simplifies the judgment steps, and reduces resource consumption.
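The early-exit behavior described above can be pictured as a short-circuit pipeline; the minimal Python sketch below assumes three caller-supplied callables for identity recognition, hand-up detection, and gesture recognition, plus a `robot.execute` control interface, none of which are specified by the present application.

```python
def process_frames(frames, robot, is_valid_object, is_hand_up, recognize_gesture):
    """Evaluate the three validity conditions in sequence, stopping at the
    first failure so later, costlier checks are skipped (robot unchanged)."""
    if not is_valid_object(frames):      # condition 1: identity recognition
        return None
    if not is_hand_up(frames):           # condition 2: hand-up key-point test
        return None
    gesture = recognize_gesture(frames)  # condition 3: gesture recognition
    if gesture is not None:
        robot.execute(gesture)           # assumed control interface
    return gesture
```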
In some embodiments, the first frame of the at least one frame of second image may be the last frame of the at least one frame of first image.
In this embodiment, the first frame of the second images used for gesture recognition on the target object may be the last frame of the first images used for detecting the target object's hand-up state.
In this embodiment, the steps of the control method may be executed sequentially: after hand-up state detection, the last frame of the multi-frame images used for hand-up detection serves as the first frame for gesture recognition, so that gesture recognition does not reprocess all of the hand-up detection images. This reduces algorithm steps and resource consumption, speeds up recognition, and improves the user experience.
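A minimal sketch of this frame handoff, assuming the frames are held in Python lists (the function and variable names below are illustrative only):

```python
def handoff_frames(first_images, later_frames):
    """Reuse the last hand-up detection frame as the first gesture
    recognition frame instead of reprocessing all detection frames."""
    anchor = first_images[-1]          # last frame of the first images
    return [anchor, *later_frames]     # second images for gesture recognition
```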
Second, the case in which multiple judgment conditions are processed in parallel.
In some embodiments, step 140 may be performed while step 130 is performed.
For example, the mobile robot may perform valid-object determination first, and then run target gesture recognition in parallel with hand-up state determination;
or it may run hand-up state determination and target gesture recognition in parallel first, and perform valid-object determination afterwards.
In this embodiment, hand-up state judgment and target gesture judgment are processed in parallel, recognizing the gesture while the hand-up state is being judged, which reduces processing time and improves the user experience.
In some embodiments, the at least one frame of second image may comprise n consecutive frames of images, and the first frame of the at least one frame of first image may be the m-th frame of the second images, where n and m are positive integers representing frame counts. The exact bound on m appears in the source only as an unrecoverable formula image; the surrounding description indicates that it constrains m to frames near the middle of the n-frame window.
In this embodiment, the second images used for gesture recognition may comprise n consecutive frames. Since the gesture of the target object across these consecutive frames may be a sliding gesture, a frame close to the middle of the second images is selected as the first frame of the first images used for hand-up recognition.
In this embodiment, the admissible range of frame numbers, namely the frames close to the intermediate frame, may be defined by the bound on m described above (the defining formula survives only as an unrecoverable image in the source).
For example, the gesture of the target object may be a sliding gesture whose motion differs from frame to frame of the second images, including continuous raising and lowering motions. Selecting a frame close to the middle of the second images as the first frame of the first images maximizes the chance that the selected image contains the gesture actually being made, avoiding invalid judgments.
In this embodiment, the control method may process hand-up state detection and gesture recognition in parallel, and a frame of the second images close to the intermediate frame is selected as the first frame of the first images for hand-up recognition. This prevents the first images used for hand-up recognition from being invalid images, reduces algorithm steps, improves recognition efficiency, and further improves the user experience.
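Since the exact bound on m survives only as an unrecoverable formula image, the sketch below simply takes "close to the middle frame" at face value and uses the midpoint index; this midpoint choice is an assumed reading, not the patent's precise formula.

```python
def pick_hand_up_anchor(second_images):
    """Pick the m-th of the n gesture frames as the first frame for
    hand-up recognition; m = n // 2 is an assumed 'middle frame' reading."""
    n = len(second_images)
    m = n // 2
    return m, second_images[m]
```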
In the related art, a mobile robot cannot accurately distinguish a valid instruction issued by a valid target user when multiple targets or multiple actions are present in the scene, so false touches occur.
According to the control method of the mobile robot provided by the present application, the gesture recognition result is obtained by combining multiple judgment conditions, so the target user and the target user's target gesture can be accurately recognized in various scenes, the false triggering phenomenon of the mobile robot is largely avoided, and the user experience is improved.
The following describes a control device of a mobile robot provided in an embodiment of the present application, and the control device of a mobile robot described below and the control method of a mobile robot described above may be referred to in correspondence with each other.
As shown in fig. 7, the control apparatus of the mobile robot includes an acquisition module 710, a first processing module 720, a second processing module 730, a third processing module 740, and a fourth processing module 750.
An obtaining module 710, configured to obtain a set of images to be processed including a target object;
the first processing module 720 is configured to perform keypoint detection on at least one frame of first image in the to-be-processed image set, so as to obtain a relative position relationship of a plurality of target keypoints in the first image in the same frame;
the second processing module 730 is configured to determine, based on the relative position relationship, a first target image corresponding to the target object in the hand-up state from the first image;
the third processing module 740 is configured to perform gesture recognition on at least one frame of second image in the to-be-processed image set to obtain a gesture recognition result;
and a fourth processing module 750, configured to control the mobile robot based on a gesture recognition result corresponding to the first target image when the at least one frame of the second image and the first target image satisfy the target condition.
In some embodiments, the control device of the mobile robot may further be configured to, before controlling the mobile robot, identify the target object through at least one frame of third image in the to-be-processed image set and determine that the target object is a valid object.
In some embodiments, the first processing module 720 may be further configured to perform keypoint detection on at least one frame of the first image, so as to obtain positions of multiple target keypoints in the first image in the same frame;
determining a target threshold value for distinguishing the relative position relation based on the result of the identity recognition;
based on the target threshold and the positions of the plurality of target key points, the relative position relationship of the plurality of target key points is determined.
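One plausible realization of an identity-dependent target threshold is a per-user margin applied to the key-point comparisons, as in the hedged sketch below; the user categories and pixel margins are invented placeholders, since the present application does not specify them.

```python
# Invented per-identity pixel margins for the relative-position tests.
USER_THRESHOLDS = {"adult": 15, "child": 8}

def relative_position(p1, p2, user_type="adult"):
    """Classify p1 as 'above', 'below', or 'level' relative to p2 using an
    identity-dependent margin; image y grows downward (assumed convention)."""
    margin = USER_THRESHOLDS.get(user_type, 10)
    dy = p2[1] - p1[1]                 # positive when p1 sits above p2
    if dy > margin:
        return "above"
    if dy < -margin:
        return "below"
    return "level"
```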
In some embodiments, the third processing module 740 may be further configured to determine a correspondence between the gesture and the control instruction based on the result of the identification;
identifying a target gesture in at least one frame of the second image;
and obtaining a gesture recognition result based on the corresponding relation between the gesture and the control command and the target gesture.
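A per-user lookup table is one simple way to realize this identity-dependent correspondence between gestures and control instructions; the user IDs, gesture names, and commands below are invented for illustration.

```python
# Hypothetical per-user gesture-to-command tables.
GESTURE_COMMANDS = {
    "user_a": {"palm_up": "follow_me", "fist": "stop"},
    "user_b": {"palm_up": "go_home", "fist": "stop"},
}

def gesture_to_command(user_id, gesture):
    """Resolve a recognized gesture into a command for this user; returns
    None (no state change) for unknown users or gestures."""
    return GESTURE_COMMANDS.get(user_id, {}).get(gesture)
```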
In some embodiments, in the control device of the mobile robot, the identity recognition of the target object through at least one third image in the to-be-processed image set (determining that the target object is a valid object) may be performed before the first target image corresponding to the target object being in the hand-up state is determined from the first image based on the relative position relationship;
determining the first target image may in turn be performed before the gesture recognition of the at least one frame of second image that yields the gesture recognition result;
and the first frame of the at least one frame of second image is the last frame of the at least one frame of first image.
In some embodiments, in the control device of the mobile robot, identifying the target object through at least one frame of third image in the to-be-processed image set may include:
performing face recognition on the at least one frame of third image; or,
performing REID (person re-identification) on the at least one frame of third image.
In some embodiments, the third processing module 740 may be executed while the second processing module 730 is executed; the at least one frame of second image comprises n consecutive frames of images, and the first frame of the at least one frame of first image is the m-th frame of the second images, where n and m are positive integers and, as above, the bound on m survives only as an unrecoverable formula image constraining m to lie near the middle of the n frames.
In some embodiments, the first processing module 720 may be further configured to perform keypoint detection on at least one frame of the first image, so as to obtain positions of a wrist keypoint, an elbow keypoint, a shoulder keypoint, and a waist keypoint; determining relative position relations among the wrist key point, the elbow key point, the shoulder key point and the waist key point in the same frame based on the positions of the wrist key point, the elbow key point, the shoulder key point and the waist key point in the first image;
the second processing module 730 may be further configured to determine, as the first target image, a first image in which the wrist key point is higher than the elbow key point, the wrist key point is located between the shoulder key point and the waist key point, and the elbow key point is located between the shoulder key point and the waist key point.
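The relative-position rule used by these two modules reduces to a few coordinate comparisons; the sketch below assumes 2-D key points as (x, y) tuples in image coordinates where y increases downward, so "higher" means a smaller y.

```python
def is_hand_up_pose(wrist, elbow, shoulder, waist):
    """Hand-up test: wrist above elbow, and both wrist and elbow lying
    between shoulder and waist in height (y grows downward, assumed)."""
    def between(point, top, bottom):
        return top[1] <= point[1] <= bottom[1]
    return (wrist[1] < elbow[1]                  # wrist higher than elbow
            and between(wrist, shoulder, waist)  # wrist between shoulder, waist
            and between(elbow, shoulder, waist)) # elbow between shoulder, waist
```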
The control device for a mobile robot provided in the embodiment of the present application can implement each process implemented by the method embodiments of fig. 1 to fig. 6, and is not described herein again to avoid repetition.
The control device of the mobile robot provided by the embodiment of the application can obtain the gesture recognition result through the combination of various judgment conditions, can accurately recognize the target user and the target gesture of the target user in various scenes, greatly avoids the false triggering phenomenon of the mobile robot, and improves the user experience.
An embodiment of the present application further provides a mobile robot, including: camera and controller.
The camera is used for collecting images.
The image collected by the camera can be used for various identification processing of the controller.
The controller is used for controlling the mobile robot to realize the control method of the mobile robot.
The mobile robot provided by the embodiment of the application can obtain the gesture recognition result through the combination of various judgment conditions, can accurately recognize the target user and the target gesture of the target user in various scenes, greatly avoids the false triggering phenomenon of the mobile robot, and improves the user experience.
Fig. 8 illustrates a physical structure diagram of an electronic device, which may include a processor 810 and a memory 820, where the processor 810 and the memory 820 are coupled. The processor 810 may invoke logic instructions in the memory 820 to perform a method of controlling a mobile robot, the method comprising: acquiring a to-be-processed image set comprising a target object; performing key point detection on at least one frame of first image in the image set to be processed to obtain the relative position relation of a plurality of target key points in the first image in the same frame; determining a first target image corresponding to the target object in the hand-up state from the first image based on the relative position relation; performing gesture recognition on at least one frame of second image in the image set to be processed to obtain a gesture recognition result; and controlling the mobile robot based on the gesture recognition result corresponding to the first target image when the at least one frame of second image and the first target image meet the target condition.
Furthermore, the logic instructions in the memory 820 may be implemented in software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as a stand-alone product. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, and various media capable of storing program codes.
Further, the present application also provides a computer program product, the computer program product comprising a computer program, the computer program being storable on a non-transitory computer-readable storage medium, the computer program, when executed by a processor, being capable of executing the method for controlling a mobile robot provided by the above-mentioned method embodiments, the method comprising: acquiring a to-be-processed image set comprising a target object; performing key point detection on at least one frame of first image in an image set to be processed to obtain the relative position relation of a plurality of target key points in the first image in the same frame; determining a first target image corresponding to the target object in the hand-lifting state from the first image based on the relative position relation; performing gesture recognition on at least one frame of second image in the image set to be processed to obtain a gesture recognition result; and controlling the mobile robot based on the gesture recognition result corresponding to the first target image when the at least one frame of second image and the first target image meet the target condition.
In another aspect, embodiments of the present application further provide a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program is implemented by a processor to perform the method for controlling a mobile robot provided in the foregoing embodiments, where the method includes: acquiring a to-be-processed image set comprising a target object; performing key point detection on at least one frame of first image in an image set to be processed to obtain the relative position relation of a plurality of target key points in the first image in the same frame; determining a first target image corresponding to the target object in the hand-up state from the first image based on the relative position relation; performing gesture recognition on at least one frame of second image in the image set to be processed to obtain a gesture recognition result; and controlling the mobile robot based on the gesture recognition result corresponding to the first target image when the at least one frame of second image and the first target image meet the target condition.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods of the various embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.
The above embodiments are merely illustrative of the present application and are not intended to limit the present application. Although the present application has been described in detail with reference to the embodiments, it should be understood by those skilled in the art that various combinations, modifications or equivalents may be made to the technical solutions of the present application without departing from the spirit and scope of the technical solutions of the present application, and the technical solutions of the present application should be covered by the claims of the present application.

Claims (13)

1. A method for controlling a mobile robot, comprising:
acquiring a to-be-processed image set comprising a target object;
performing key point detection on at least one frame of first image in the image set to be processed to obtain the relative position relation of a plurality of target key points in the first image in the same frame;
determining a first target image corresponding to the target object in the hand-up state from the first image based on the relative position relation;
performing gesture recognition on at least one frame of second image in the image set to be processed to obtain a gesture recognition result;
and controlling the mobile robot based on a gesture recognition result corresponding to the first target image when the at least one frame of second image and the first target image meet a target condition.
2. The method of controlling a mobile robot according to claim 1, wherein before said controlling the mobile robot, the method further comprises:
and identifying the target object through at least one third image in the image set to be processed, and determining the target object as an effective object.
3. The method according to claim 2, wherein the detecting key points of at least one frame of the first image in the set of images to be processed to obtain the relative position relationship of a plurality of target key points in the first image in the same frame comprises:
performing key point detection on the at least one frame of first image to obtain the positions of a plurality of target key points in the first image in the same frame;
determining a target threshold value for distinguishing the relative position relation based on the result of the identity recognition;
determining relative positional relationships of the plurality of target keypoints based on the target threshold and the positions of the plurality of target keypoints.
4. The method according to claim 2, wherein the performing gesture recognition on at least one frame of second image in the to-be-processed image set to obtain a gesture recognition result includes:
determining the corresponding relation between the gesture and the control instruction based on the identity recognition result;
identifying a target gesture in the at least one frame of second image;
and obtaining the gesture recognition result based on the corresponding relation between the gesture and the control instruction and the target gesture.
5. The method according to claim 2, wherein the identifying the target object through at least one third image in the to-be-processed image set and determining that the target object is a valid object are performed before the determining, from the first image based on the relative position relationship, the first target image corresponding to the target object being in the hand-up state;
the determining of the first target image is in turn performed before the performing gesture recognition on at least one frame of second image in the to-be-processed image set to obtain the gesture recognition result; and
the first frame of the at least one frame of second image is the last frame of the at least one frame of first image.
6. The method of claim 2, wherein the identifying the target object through at least one third image in the set of images to be processed comprises:
performing face recognition on the at least one frame of third image; or,
and performing REID identification on the at least one frame of third image.
7. The method of controlling a mobile robot according to claim 2, wherein the at least one frame of the second image and the first target image satisfy a target condition, including:
the at least one frame of second image comprises the first target image; or,
the at least one frame of second image is a subsequent frame of the first target image.
8. The method according to any one of claims 1 to 7, wherein the performing gesture recognition on at least one frame of second image in the to-be-processed image set to obtain a gesture recognition result is performed while the first target image corresponding to the target object being in the hand-up state is determined from the first image based on the relative positional relationship;
the at least one frame of second image comprises n consecutive frames of images, and the first frame of the at least one frame of first image is the m-th frame of the second image;
wherein n and m are positive integers, and m satisfies a bound, given in the source only as an unrecoverable formula image, that places it near the middle of the n frames.
9. The method for controlling a mobile robot according to any one of claims 1 to 7, wherein the detecting key points of at least one first image of the set of images to be processed to obtain the relative position relationship of a plurality of target key points in the first image in the same frame comprises: performing key point detection on the at least one frame of first image to obtain positions of a wrist key point, an elbow key point, a shoulder key point and a waist key point; determining the relative position relationship among the wrist key point, the elbow key point, the shoulder key point and the waist key point in the same frame based on the positions of the wrist key point, the elbow key point, the shoulder key point and the waist key point in the first image;
the determining, from the first image based on the relative positional relationship, a first target image corresponding to the target object being in a hand-up state comprises: determining, as the first target image, a first image in which the wrist key point is higher than the elbow key point, the wrist key point is located between the shoulder key point and the waist key point, and the elbow key point is located between the shoulder key point and the waist key point.
10. A control device for a mobile robot, comprising:
the acquisition module is used for acquiring an image set to be processed comprising a target object;
the first processing module is used for performing key point detection on at least one frame of first image in the image set to be processed to obtain the relative position relation of a plurality of target key points in the first image in the same frame;
the second processing module is used for determining a first target image corresponding to the target object in the hand-up state from the first image based on the relative position relation;
the third processing module is used for performing gesture recognition on at least one frame of second image in the image set to be processed to obtain a gesture recognition result;
and the fourth processing module is used for controlling the mobile robot based on the gesture recognition result corresponding to the first target image under the condition that the at least one frame of second image and the first target image meet the target condition.
11. A mobile robot, comprising:
the camera is used for collecting images;
a controller for controlling the mobile robot to implement the control method of the mobile robot according to any one of claims 1 to 9.
12. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the control method of a mobile robot according to any one of claims 1 to 9.
13. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, realizes the steps of the control method of a mobile robot according to any of the claims 1 to 9.