CN114035683A - User capturing method, device, equipment, storage medium and computer program product - Google Patents


Info

Publication number
CN114035683A
Authority
CN
China
Prior art keywords
model
feedback
capture
user
suspected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111314304.1A
Other languages
Chinese (zh)
Other versions
CN114035683B (en)
Inventor
林楠
李健龙
葛瀚丞
石磊
徐昭吉
翟忆蒙
郝志雄
张苗昌
张茜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Shanghai Xiaodu Technology Co Ltd
Original Assignee
Baidu Online Network Technology Beijing Co Ltd
Shanghai Xiaodu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu Online Network Technology Beijing Co Ltd, Shanghai Xiaodu Technology Co Ltd filed Critical Baidu Online Network Technology Beijing Co Ltd
Priority to CN202111314304.1A priority Critical patent/CN114035683B/en
Publication of CN114035683A publication Critical patent/CN114035683A/en
Application granted granted Critical
Publication of CN114035683B publication Critical patent/CN114035683B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The disclosure provides a user capturing method, a user capturing apparatus, an electronic device, a computer-readable storage medium, and a computer program product, and relates to artificial intelligence fields such as smart home and gesture recognition. The method comprises the following steps: scanning whole-body bone key points of a user entering a capture area to generate a bone posture model; highlighting, in a preset display area of the intelligent mirror, the bone posture model corresponding to the user closest to the intelligent mirror as a suspected model; and in response to receiving no error capture feedback on the displayed suspected model within a preset time period, locking the suspected model as the effective model corresponding to the target user, and controlling the capture component to take the target user as the capture object and adjust the capture posture. The method shortens the time consumed in rendering the posture model, and the mechanism of highlighting the suspected model helps the intelligent mirror capture the correct target user.

Description

User capturing method, device, equipment, storage medium and computer program product
Technical Field
The present disclosure relates to the field of human-computer interaction technology, and in particular, to the field of artificial intelligence technology such as smart home and gesture recognition, and in particular, to a user capture method, apparatus, electronic device, computer-readable storage medium, and computer program product.
Background
With the continuous popularization of the smart home concept and people's pursuit of an increasingly better life, in addition to the small-screen smart devices that are already widely familiar, various smart large-screen devices, including intelligent mirrors, have gradually taken shape by combining services with large-screen devices in homes or dedicated venues.
Taking an intelligent fitness mirror installed in a fitness classroom as an example, how to provide a good human-computer interaction experience and good fitness guidance for the user in front of the mirror is an urgent problem for those skilled in the art.
Disclosure of Invention
The embodiment of the disclosure provides a user capturing method and device, an electronic device, a computer readable storage medium and a computer program product.
In a first aspect, an embodiment of the present disclosure provides a user capture method, including: scanning key points of bones of the whole body of a user entering a capture area to generate a bone posture model; highlighting the bone posture model corresponding to the user closest to the intelligent mirror as a suspected model in a preset display area of the intelligent mirror; and in response to not receiving the wrong capturing feedback of the displayed suspected model within the preset time length, locking the suspected model into an effective model corresponding to the target user, and controlling the capturing component to take the target user as a capturing object and adjust the capturing posture.
In a second aspect, an embodiment of the present disclosure provides a user capture apparatus, including: a skeleton attitude model generation unit configured to perform whole body skeleton key point scanning on a user entering a capture area to generate a skeleton attitude model; the suspected model highlight display unit is configured to highlight a bone posture model corresponding to a user closest to the intelligent mirror in a preset display area of the intelligent mirror as a suspected model; and the effective model determining and capturing posture adjusting unit is configured to lock the suspected model into an effective model corresponding to the target user in response to not receiving the error capturing feedback of the displayed suspected model within a preset time length, and control the capturing component to take the target user as a capturing object and adjust the capturing posture.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the user capture method as described in any one of the implementations of the first aspect when executed.
In a fourth aspect, the disclosed embodiments provide a non-transitory computer-readable storage medium storing computer instructions for enabling a computer to implement a user capture method as described in any implementation manner of the first aspect when executed.
In a fifth aspect, the embodiments of the present disclosure provide a computer program product comprising a computer program, which when executed by a processor is capable of implementing the user capture method as described in any implementation manner of the first aspect.
For a large-screen intelligent device such as the intelligent mirror, the technical scheme provided by the disclosure represents the user's posture as a bone posture model, which simplifies the rendering computation and shortens the rendering time, thereby improving real-time performance. Meanwhile, to prevent a user from being mistakenly captured as the target user by the nearest-distance algorithm, the candidate model is first highlighted as a suspected model, and whether it was captured in error is confirmed by whether error capture feedback is received. When no error capture feedback is received, the suspected model is locked as the effective model corresponding to the target user, and the target user is taken as the capture object, which then guides the capture component to adjust its capture posture according to the capture object, so that more comprehensive user posture information can be captured and the use experience of the intelligent mirror is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture to which the present disclosure may be applied;
FIG. 2 is a flowchart of a user capture method provided by an embodiment of the present disclosure;
FIG. 3 is a flowchart of a processing method of error capture feedback based on a speech form in a user capture method provided by an embodiment of the present disclosure;
FIG. 4 is a flowchart of a posture comparing and feedback method according to an embodiment of the disclosure;
FIGS. 5-1 to 5-8 are schematic diagrams illustrating a user starting to use the intelligent fitness mirror and entering a fitness class according to an embodiment of the present disclosure;
FIG. 6-1 is a schematic diagram illustrating the effect of prompting and correcting a user's erroneous motion by the intelligent fitness mirror according to the embodiment of the present disclosure;
FIG. 6-2 is a schematic diagram illustrating an effect of the intelligent fitness mirror on the problem that normal capturing cannot be performed due to lens occlusion according to the embodiment of the present disclosure;
FIGS. 7-1 to 7-3 are schematic diagrams illustrating the effect of the intelligent fitness mirror giving different positive incentive feedback according to the degree of posture consistency, according to an embodiment of the present disclosure;
FIG. 8 is a block diagram of a user capture device according to an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of an electronic device suitable for executing a user capture method according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness. It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict.
In the technical scheme of the disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other processing of the personal information of the related user are all in accordance with the regulations of related laws and regulations and do not violate the good customs of the public order.
FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the user capture methods, apparatus, electronic devices, and computer-readable storage media of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include a smart mirror 101 and a user 102 using the smart mirror 101.
The intelligent mirror 101 may interact with other terminal devices and servers through a network or the like, so as to provide more functions for the user 102 by using the other terminal devices and servers, and the user 102 may also perform human-computer interaction with the intelligent mirror 101 through various ways, such as picture display, voice interaction, gesture interaction, and the like. The intelligent mirror 101 can be provided with various functional components, such as a camera component for shooting images, a three-dimensional scanning component for scanning object structures, a loudspeaker for producing sound, a display screen for presenting pictures, and the like.
The smart mirror 101 may be installed with various applications or programs to implement the above functions, such as news-based applications, voice interaction-based applications, fitness-based applications, smart clothing-based applications, and the like.
The intelligent mirror 101 can provide various services through various built-in applications, and taking fitness applications that can provide fitness services as an example, the intelligent mirror 101 can achieve the following effects when running the fitness applications: firstly, scanning key points of the whole skeleton of a user entering a capture area to generate a skeleton posture model; then, highlighting the bone posture model corresponding to the user closest to the intelligent mirror as a suspected model in a preset display area of the intelligent mirror to display the suspected model to the user 102; next, if the false capture feedback of the displayed suspected model is not received within the preset time length, the suspected model is locked into an effective model corresponding to the target user, and the capture component is controlled to take the target user as a capture object and adjust the capture posture.
The user capture method provided in the subsequent embodiments of the present disclosure is generally executed by the smart mirror 101, and accordingly, the user capture device is also generally disposed in the smart mirror 101.
It should be understood that the form, size, and number of the smart mirrors in FIG. 1, and the positional relationship to the users, are merely illustrative. Adaptive adjustments can be made according to implementation requirements.
Referring to fig. 2, fig. 2 is a flowchart of a user capturing method according to an embodiment of the disclosure, where the process 200 includes the following steps:
step 201: scanning key points of bones of the whole body of a user entering a capture area to generate a bone posture model;
this step is intended to generate a skeletal pose model by a whole body skeletal keypoint scan of a user entering the capture area by the executing agent of the user capture method (e.g., the smart mirror 101 shown in fig. 1).
When the capturing device is an image capturing device of a camera type, the capturing area is represented as a shooting visual field corresponding to the camera; when the capture device is a scanner-like point capture device, the capture area will appear as the corresponding scan range of the scanner.
Correspondingly, the whole-body bone key point scanning method changes according to the raw material captured by the capturing device. When the capturing device is a camera, the raw material is a whole-body image of the user, so whole-body bone key point scanning first identifies the bones from the whole-body image and then determines which positions on the bones are bone key points. When the capturing device is a three-dimensional laser scanner or a three-dimensional structured-light scanner, the raw material obtained by scanning is a human-shaped point set or point cloud; to determine the bone key points, the discrete points are first fitted into a continuous human-shaped contour, the bones within the contour are then determined, and finally the bone key point positions on those bones are determined.
After the bone key points are scanned, the generated bone posture model is a posture model that restores the real posture of the corresponding user using the bone key points. Because the model is generated from bones and does not require rendering the user's identity details (such as the face and body shape), the rendering computation is reduced, the rendering time is shortened, and the linkage between the rendering result seen by the user and the user's actual motion is improved.
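As an illustrative sketch only (not the patent's actual implementation), the bone posture model described in this step can be pictured as scanned key points joined by a predefined bone topology; the joint names and edge list below are assumptions for the example.

```python
# Hypothetical subset of a full-body bone topology: pairs of joints to connect.
BONE_EDGES = [
    ("head", "neck"), ("neck", "l_shoulder"), ("neck", "r_shoulder"),
    ("neck", "pelvis"), ("pelvis", "l_knee"), ("pelvis", "r_knee"),
]

def build_skeleton_model(keypoints):
    """Turn scanned key points {joint: (x, y)} into drawable bone segments.

    Only edges whose two endpoints were both detected are emitted, so a
    partially occluded user still yields a (partial) bone posture model.
    """
    segments = []
    for a, b in BONE_EDGES:
        if a in keypoints and b in keypoints:
            segments.append((keypoints[a], keypoints[b]))
    return segments
```

Because only key points and connecting segments are rendered, no face or body-shape detail needs to be drawn, which is the source of the rendering savings described above.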
Step 202: highlighting the bone posture model corresponding to the user closest to the intelligent mirror as a suspected model in a preset display area of the intelligent mirror;
On the basis of step 201, this step is intended to select, by the execution subject, from the bone posture models of all users present in the capture area, the bone posture model of the user closest to the execution subject (the intelligent mirror body) as the suspected model according to a preset nearest-distance positioning principle, and to draw the attention of the target user by highlighting the suspected model in the preset display area.
The nearest-distance positioning principle is preset because, in general, multiple users will not frequently pass in front of the intelligent mirror at the same time, so the target user actually using the intelligent mirror can be successfully located by this principle in most cases. Meanwhile, the intelligent mirror of the present disclosure does not need to collect additional user identity information to distinguish different users, and in a public place (such as a fitness classroom or gym) it provides the same service without distinguishing between users.
The present disclosure considers that, under the nearest-distance positioning principle, a non-target user passing through the capture area may in some cases be locked incorrectly. Therefore, the candidate is first highlighted in the preset display area as a suspected model to draw the attention of the target user, and the target user gives feedback upon finding that the locking is incorrect, thereby providing a mechanism for correcting incorrect locking.
That is, in this step, the period after the suspected model is displayed in highlighted form in the preset display area serves as the feedback time for the target user to report a possible error capture or error lock, and the specific duration may be set according to the actual situation, for example 5 seconds or 7 seconds.
The preset display area of the intelligent mirror is the area used in the present disclosure for displaying a user's bone posture model; the mirror position that most attracts the user's attention, for example the middle or lower half of the mirror body, may be determined as the preset display area. In addition to highlighting, other ways of distinguishing the currently locked suspected model from the other unlocked bone posture models, such as outlining, differentiated color rendering, adding a selection frame, or adding an arrow-indication special effect, may be added or substituted according to the actual situation. The preset distinguishing reminder may be adjusted according to the installation place of the intelligent mirror and the type of crowd, and is not specifically limited here, as long as the currently locked suspected model can be clearly distinguished from the other unlocked bone posture models.
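The nearest-distance positioning principle of step 202 can be sketched minimally as follows. This is an assumed data layout for illustration: each candidate carries an estimated distance to the mirror, however it was measured (depth sensor, bone-point scale, etc.).

```python
def pick_suspected_model(models):
    """Select the suspected model under the nearest-distance principle.

    models: list of dicts, each with an 'id' and a 'distance_m' giving the
    estimated distance from that user to the intelligent mirror.
    Returns the id of the nearest candidate, which would then be highlighted
    in the preset display area.
    """
    return min(models, key=lambda m: m["distance_m"])["id"]
```

The selected id identifies the bone posture model to highlight; all other models remain displayed but unlocked.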
Step 203: in response to receiving no error capture feedback on the displayed suspected model within the preset time period, locking the suspected model as the effective model corresponding to the target user, and controlling the capture component to take the target user as the capture object and adjust the capture posture.
In this step, the execution subject locks the suspected model as the effective model corresponding to the target user; that is, the suspected model is confirmed to be the bone posture model corresponding to the target user. Because the target user has been successfully captured, the suspected model can be confirmed as the effective model, and any other candidate bone posture models can be excluded and hidden.
After determining the valid model, the execution subject controls the capture component to take the target user as the capture object and adjust the capture gesture, i.e. the capture component can adjust its capture gesture (e.g. capture angle, capture area, etc.) according to the position change of the target user, so as to better capture the subsequent motion information of the target user.
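The lock-or-recapture decision of step 203 reduces to a timeout rule, sketched here under an assumed event representation (timestamped feedback events relative to when the suspected model was first highlighted):

```python
def resolve_suspected(feedback_events, window_s=5.0):
    """Decide whether to lock the suspected model as the effective model.

    feedback_events: list of (t_seconds, kind) tuples, where t_seconds is the
    time since the suspected model was highlighted. If an 'error_capture'
    event arrives within the feedback window, a new suspected model must be
    determined; otherwise the suspected model is locked as the effective model.
    """
    for t, kind in feedback_events:
        if kind == "error_capture" and t <= window_s:
            return "recapture"
    return "lock_valid"
```

The window length corresponds to the preset duration discussed above (e.g. 5 or 7 seconds); feedback arriving after the window has closed no longer prevents locking.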
For the large-screen intelligent device, i.e., the intelligent mirror, the user capturing method provided by the embodiment of the disclosure represents the user's posture as a bone posture model, which simplifies rendering computation and shortens rendering time, thereby improving real-time performance. Meanwhile, to prevent a user from being mistakenly captured as the target user by the nearest-distance algorithm, the candidate model is first highlighted as a suspected model, and whether it was captured in error is confirmed by whether error capture feedback is received. When no error capture feedback is received, the suspected model is locked as the effective model corresponding to the target user, and the target user is taken as the capture object, which then guides the capture component to adjust its capture posture according to the capture object, so that more comprehensive user posture information can be captured and the use experience of the intelligent mirror is improved.
Unlike the case of step 203 in process 200, another case may arise: when error capture feedback on the displayed suspected model is received within the preset time period, a recapture indication may be extracted from the error capture feedback according to its feedback form, and a new suspected model may be determined according to the recapture indication.
The feedback forms may include voice feedback, gesture feedback, body motion feedback, touch feedback, and the like. It should be understood that the amount of information contained in different feedback forms often differs greatly, and the actual difficulty of conveying the same amount of information also differs between forms. Therefore, when extracting an effective recapture indication from error capture feedback, the feedback form must be taken into account so that an accurate recapture indication is extracted according to the key-information extraction manner corresponding to that form, which helps to re-determine a new suspected model.
To further enhance the understanding of the implementation of extracting the recapture indication from the error capture feedback and determining the new suspected model, the present disclosure further provides an implementation of a method for processing the error capture feedback based on the speech form in conjunction with fig. 3, where the process 300 includes the following steps:
step 301: carrying out semantic recognition on the received voice feedback signal, and extracting action information and emotional tendency information from a semantic recognition result;
the execution main body firstly carries out semantic recognition on the received voice feedback signal to obtain a semantic recognition result which can be convenient for the execution main body to clarify the meaning which a user wants to express, and then extracts action information and emotional tendency information from the semantic recognition result. The action information refers to indication information for re-determining a new suspected model, such as "select the left" and "next", and the emotional tendency information is content of expressing the emotion of the target user, such as "stubborn egg, wrong selection", "right selection, stick" and the like, and it can be determined whether the new suspected model currently selected by the smart mirror selects the desired one of the users according to the emotional tendency.
Step 302: determining the position relation of a bone posture model for indicating a target user relative to a suspected model according to the emotional tendency information and the action information;
step 303: and determining a new suspected model according to the current position and the position relation of the suspected model.
Steps 302-303 are directed to determining, by the executing body, a position relationship of the bone posture model for indicating the target user relative to the suspected model according to the emotional tendency information and the action information, and then determining a new suspected model according to the current position and the position relationship of the suspected model.
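Steps 301-303 can be sketched as a toy keyword lookup; the cue lexicons below are assumptions for illustration, not the patent's recognition vocabulary, and a real system would use full semantic recognition rather than word matching.

```python
# Hypothetical lexicons: words signalling negative emotional tendency, and
# action words mapped to a positional offset relative to the suspected model.
NEGATIVE_CUES = {"wrong", "no", "nope"}
ACTION_OFFSETS = {"left": -1, "right": 1, "next": 1}

def new_suspected_index(transcript, current_idx, n_models):
    """Derive a new suspected-model index from recognized voice feedback.

    If the emotional tendency is not negative, the current pick is kept.
    Otherwise the action information gives the positional relation of the
    target user's model relative to the current suspected model.
    """
    words = transcript.lower().replace(",", " ").split()
    if not any(w in NEGATIVE_CUES for w in words):
        return current_idx
    for word, offset in ACTION_OFFSETS.items():
        if word in words:
            return (current_idx + offset) % n_models
    return current_idx
```

The index here stands in for the "current position plus positional relation" computation of step 303, with candidates assumed to be ordered left to right.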
As can be seen from the above processing example of the error capturing feedback in the form of voice, other feedback forms may also embody similar action information and emotional tendency information in different manners, and further determine a new suspected model according to similar processing and analyzing manners, which is not listed here.
Based on any of the above embodiments, the present disclosure further provides an actual usage manner of the user acting following the standard action displayed by the smart mirror through fig. 4 after successfully determining the valid model corresponding to the target user, wherein the process 400 includes the following steps:
step 401: rendering the skeleton posture of the effective model in real time according to the captured real-time action information of the target user;
That is, in this step, the execution subject renders the bone posture of the effective model in real time according to the captured real-time action information of the target user, so that the posture of the rendered bone posture model is consistent with the target user's real-time action. The intelligent mirror can then present this bone posture model to the target user, who can thereby confirm the actual action he or she is currently performing.
Step 402: comparing the skeleton posture with the standard skeleton posture at the same time point to obtain the posture consistency degree;
On the basis of step 401, this step aims at the execution subject comparing the bone posture actually presented by the target user with the standard bone posture at the same time point to obtain a posture consistency degree; that is, the posture consistency degree describes the consistency between the actual bone posture and the standard bone posture. The standard bone posture may be the bone posture represented by a bone posture model obtained in the same manner from a fitness action demonstrator.
Step 403: presenting a posture feedback corresponding to the posture consistency degree;
on the basis of step 402, this step is intended to present posture feedback corresponding to the degree of posture consistency by the execution subject described above. Specifically, the gesture consistency degree can be simply divided into gesture consistency and gesture inconsistency, and then forward excitation feedback and error action prompt are given.
For example, when the posture consistency degree exceeds a preset degree, corresponding positive incentive feedback is presented according to how far the posture consistency degree exceeds the preset degree. For example, when the preset degree is 80% consistency, different positive incentive feedback is presented when the actual posture consistency degree is 85% and 95%: "well done" feedback is given at 85%, and "truly excellent" feedback is given at 95%, so as to stimulate the user's motivation to keep improving consistency with the standard action.
For another example, when the posture consistency degree does not exceed the preset degree, the erroneous bones corresponding to the erroneous posture can be determined and rendered differentially on the effective model to prompt the target user. The differentiated rendering can adopt various modes such as color difference and highlighting. Furthermore, corresponding posture-correction guidance can be generated for the differentially rendered effective model, so that the target user can quickly correct the erroneous posture toward the standard posture. Specifically, the posture-correction guidance may be text guidance presented beside the bone posture model, animation guidance embodied as a continuous motion, or voice guidance from an exercise leader.
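Steps 402-403 can be sketched as follows. The consistency metric (a normalized joint-angle difference) and the tier thresholds are assumptions chosen to match the 80%/95% example above, not the patent's actual comparison algorithm.

```python
def pose_consistency(user_angles, standard_angles, tol_deg=180.0):
    """Posture consistency degree between the user's joint angles and the
    standard posture at the same time point (both in degrees, same joint
    order). Returns 1.0 for identical postures; assumed metric."""
    diffs = [min(abs(u - s), 360.0 - abs(u - s))
             for u, s in zip(user_angles, standard_angles)]
    return 1.0 - sum(diffs) / (tol_deg * len(diffs))

def pose_feedback(score, threshold=0.80):
    """Map the consistency degree to tiered feedback, per the 80% example."""
    if score >= 0.95:
        return "truly excellent"
    if score >= threshold:
        return "well done"
    return "correct the highlighted bones"
```

Below the threshold, the joints with the largest angle differences would be the "erroneous bones" to render differentially on the effective model.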
To deepen understanding, an application method is further provided below, taking as an example an intelligent fitness mirror arranged in a fitness scene to provide fitness action guidance for the user:
as shown in FIG. 5-1, the user first selects a desired exercise program from the intelligent exercise mirror;
as shown in FIG. 5-2, the intelligent fitness mirror presents a prompt that requires the target user to stand 1-1.5 meters from the mirror;
At this moment, the intelligent fitness mirror starts a camera to scan and identify human body data of all users in the visual field; by capturing, analyzing, and abstracting human action postures, key bone point information of the human body is obtained, a bone posture model is generated by connecting the key bone points, and the human body outline of the bone posture model is drawn in the form of a stick figure;
As shown in FIG. 5-3, the intelligent fitness mirror recognizes the user closest to the mirror based on visual nearest human bone point detection, and highlights that user's bone posture model in white;
after the intelligent fitness scope waits for the error capture feedback accompanied by the reciprocal 321 as shown in fig. 5-4, 5-5 and 5-6, if the error capture feedback is not received, the intelligent fitness scope will prompt that the target user is locked as shown in fig. 5-7, and will formally enter the subsequent fitness class after passing through the transition picture of fig. 5-8.
After formally entering the fitness course, the user tries to perform the same actions as the standard fitness actions presented in the mirror, and the intelligent fitness mirror continuously monitors the locked user's real-time posture and renders the corresponding bone posture model accordingly, i.e., the bone posture model is displayed as a stick-figure animation.
Meanwhile, a circular stage can be arranged around the stick figure, and the stage is controlled to show a corresponding special effect according to the state of the stick figure and the course stage; for example, when the user walks out of the motion capture area, the stick figure changes into a red dotted outline and the stage presents a light red effect.
When the user's action differs significantly from the standard action, this can be embodied on the stick figure by rendering the erroneous limbs differentially (see the red lower half of the white stick figure shown in FIG. 6-1); if the user leaves the motion capture area in front of the intelligent fitness mirror, or the camera is occluded, a corresponding prompt can also be issued (see the lens-occlusion prompt shown in FIG. 6-2).
When the user's action comes close to the standard action, a quantified action score is presented, and stage fantasy special effects corresponding to the score are displayed in ascending order of the score; for example, the user is successively shown prompts such as "good" and "perfect" (corresponding to figs. 7-1, 7-2, and 7-3, respectively).
Specifically, the action score may be calculated as follows:
The smart fitness mirror collects images of the human body in front of the screen, obtains human skeleton key points with a visual recognition algorithm, matches the user's skeleton key frames against the coach's standard-action key frames in the course using an action matching algorithm, and returns the matching result in real time. The matching result is converted into prompts the user can easily understand, such as "the elbow joint should be kept close to the torso". If the user's action posture matches the standard action closely, and the reaction keeps a certain rhythm (concretely, the user's key data frame must appear within a certain time window), a corresponding AI score is computed. The user's action amplitude is also recognized (a larger range of motion indicates a higher activity value); the activity value is converted into a corresponding course score, and the user's exercise score is recorded.
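As a rough illustration of the scoring logic just described — a key frame only scores when the pose match is high enough and the reaction lands inside the rhythm window, with an amplitude-based activity bonus on top — one could write the following; every threshold and weight here is invented for the sketch:

```python
def action_score(match_degree, reaction_delay_s, amplitude,
                 match_threshold=0.8, rhythm_window_s=1.0):
    """Score one key frame. match_degree in [0, 1] comes from pose matching;
    reaction_delay_s is how late the user's key frame arrived; amplitude is a
    normalized range-of-motion measure. Values are illustrative only."""
    # No score unless the pose matches well AND the reaction keeps the rhythm.
    if match_degree < match_threshold or reaction_delay_s > rhythm_window_s:
        return 0.0
    base = match_degree * 100.0                   # "AI score" from pose matching
    activity_bonus = min(amplitude, 1.0) * 10.0   # larger motion => higher activity value
    return base + activity_bonus
```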
With further reference to fig. 8, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of a user capturing apparatus, which corresponds to the method embodiment shown in fig. 2 and is applicable to various electronic devices.
As shown in fig. 8, the user capturing apparatus 800 of the present embodiment may include: a skeleton posture model generating unit 801, a suspected model highlighting unit 802, and an effective model determination and capture posture adjustment unit 803. The skeleton posture model generating unit 801 is configured to scan the whole-body skeleton key points of a user entering the capture area to generate a skeleton posture model; the suspected model highlighting unit 802 is configured to highlight, in a preset display area of the smart mirror, the skeleton posture model corresponding to the user closest to the smart mirror as a suspected model; and the effective model determination and capture posture adjustment unit 803 is configured to, in response to not receiving error capture feedback on the displayed suspected model within a preset time period, lock the suspected model as the effective model corresponding to the target user, and control the capture component to take the target user as the capture object and adjust its capture posture.
In the present embodiment, for the detailed processing and technical effects of the skeleton posture model generating unit 801, the suspected model highlighting unit 802, and the effective model determination and capture posture adjustment unit 803 of the user capturing apparatus 800, reference may be made to the related descriptions of steps 201 to 203 in the embodiment corresponding to fig. 2, which are not repeated here.
In some optional implementations of this embodiment, the user capture device 800 may further include:
an error capture feedback processing unit configured to, in response to receiving error capture feedback on the displayed suspected model within the preset time period, extract a recapture indication from the error capture feedback according to the feedback form of the error capture feedback, and re-determine a new suspected model according to the recapture indication.
In some optional implementations of this embodiment, the error capture feedback processing unit may include a recapture indication extracting subunit configured to extract a recapture indication from the error capture feedback according to a feedback form of the error capture feedback, and the recapture indication extracting subunit may be further configured to:
in response to the feedback form of the error capture feedback being voice, performing semantic recognition on the received voice feedback signal, and extracting action information and emotional tendency information from the semantic recognition result;
determining, according to the emotional tendency information and the action information, a positional relation indicating where the target user's skeleton posture model lies relative to the suspected model;
correspondingly, the error capture feedback processing unit may comprise a new suspected model re-determination subunit configured to re-determine a new suspected model according to the recapture indication, and the new suspected model re-determination subunit may be further configured to:
determine a new suspected model according to the current position of the suspected model and the positional relation.
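A toy version of this voice-driven reselection might look as follows; simple keyword matching stands in for real semantic recognition, and all names and phrases are hypothetical:

```python
def parse_recapture(utterance):
    """Derive a left/right positional relation from negative voice feedback
    such as "not me, the person on my left". A real system would use a
    semantic-recognition model; these keywords are stand-ins."""
    negative = any(w in utterance for w in ("not me", "wrong"))  # emotional tendency
    if not negative:
        return None
    if "left" in utterance:   # action information
        return "left"
    if "right" in utterance:
        return "right"
    return None

def reselect(models, current, relation):
    """Pick the model adjacent to the current suspected model in the
    indicated direction; models are sorted left-to-right by x position,
    and the index is clamped at the edges."""
    idx = models.index(current)
    step = -1 if relation == "left" else 1
    return models[max(0, min(len(models) - 1, idx + step))]
```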
In some optional implementations of this embodiment, the user capture device 800 may further include:
a real-time rendering unit configured to render a skeletal pose of the effective model in real time according to the captured real-time action information of the target user;
the posture consistency comparison unit is configured to compare the skeleton posture with a standard skeleton posture at the same time point to obtain a posture consistency degree;
a posture feedback unit configured to present posture feedback corresponding to the degree of posture consistency.
In some optional implementations of the present embodiment, the gesture feedback unit may be further configured to:
in response to the posture consistency degree exceeding a preset degree, presenting corresponding forward excitation feedback according to the margin by which the posture consistency degree exceeds the preset degree;
and in response to the posture consistency degree not exceeding the preset degree, determining the erroneous bone corresponding to the erroneous posture and rendering that bone on the effective model in a differentiated way.
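The two feedback branches can be sketched as below; the threshold, the praise tiers, and the per-bone error measure are illustrative assumptions rather than values from the disclosure:

```python
def pose_feedback(consistency, per_bone_error, threshold=0.7):
    """Above the threshold, grade the margin into escalating praise
    (forward excitation feedback); below it, flag the worst-matching
    bones for differentiated (e.g. red) rendering.
    per_bone_error: bone name -> error in [0, 1]."""
    if consistency >= threshold:
        margin = consistency - threshold
        if margin > 0.25:
            return ("praise", "perfect")
        if margin > 0.1:
            return ("praise", "great")
        return ("praise", "good")
    # Below threshold: collect bones whose error is large enough to highlight.
    wrong = [bone for bone, err in per_bone_error.items() if err > 0.5]
    return ("highlight_errors", wrong)
```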
In some optional implementations of this embodiment, the user capture device 800 may further include:
a posture correction guidance generating unit configured to generate corresponding posture correction guidance for effective models containing differentiated rendering.
This embodiment exists as an apparatus embodiment corresponding to the method embodiment described above.
For a large-screen intelligent device such as the smart mirror, the user capturing apparatus provided by this embodiment of the present disclosure represents the user's posture as a skeleton posture model, which simplifies the rendering computation and shortens the rendering time, thereby improving real-time performance. Meanwhile, to guard against the nearest-user algorithm capturing the wrong user as the target, the skeleton posture model is first highlighted as a suspected model, so that whether the capture is erroneous can be confirmed by whether error capture feedback is received. When no error capture feedback is received, the suspected model is locked as the effective model corresponding to the target user and the target user is taken as the capture object, which can then guide the capture component to adjust its capture posture according to the capture object, capturing more comprehensive user posture information and improving the smart mirror experience.
According to an embodiment of the present disclosure, the present disclosure also provides an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the user capture method described in any of the above embodiments when executed.
According to an embodiment of the present disclosure, there is also provided a readable storage medium storing computer instructions for enabling a computer to implement the user capturing method described in any of the above embodiments when executed.
According to an embodiment of the present disclosure, there is also provided a computer program product, which when executed by a processor is capable of implementing the user capture method described in any of the embodiments above.
FIG. 9 illustrates a schematic block diagram of an example electronic device 900 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 9, the apparatus 900 includes a computing unit 901, which can perform various appropriate actions and processes in accordance with a computer program stored in a Read Only Memory (ROM)902 or a computer program loaded from a storage unit 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the device 900 can also be stored. The calculation unit 901, ROM 902, and RAM 903 are connected to each other via a bus 904. An input/output (I/O) interface 905 is also connected to bus 904.
A number of components in the device 900 are connected to the I/O interface 905, including: an input unit 906 such as a keyboard, a mouse, and the like; an output unit 907 such as various types of displays, speakers, and the like; a storage unit 908 such as a magnetic disk, optical disk, or the like; and a communication unit 909 such as a network card, a modem, a wireless communication transceiver, and the like. The communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 901 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 901 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 901 performs the respective methods and processes described above, such as the user capture method. For example, in some embodiments, the user capture method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 900 via ROM 902 and/or communications unit 909. When the computer program is loaded into RAM 903 and executed by computing unit 901, one or more steps of the user capture method described above may be performed. Alternatively, in other embodiments, the computing unit 901 may be configured to perform the user capture method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server (also called a cloud computing server or cloud host), a host product in a cloud computing service system that overcomes the drawbacks of difficult management and weak service scalability found in conventional physical hosts and Virtual Private Server (VPS) services.
For a large-screen intelligent device such as the smart mirror, the technical solution provided by the embodiments of the present disclosure represents the user's posture as a skeleton posture model, which simplifies the rendering computation and shortens the rendering time, thereby improving real-time performance. Meanwhile, to guard against the nearest-user algorithm capturing the wrong user as the target, the skeleton posture model is first highlighted as a suspected model, so that whether the capture is erroneous can be confirmed by whether error capture feedback is received. When no error capture feedback is received, the suspected model is locked as the effective model corresponding to the target user and the target user is taken as the capture object, which can then guide the capture component to adjust its capture posture according to the capture object, capturing more comprehensive user posture information and improving the smart mirror experience.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (15)

1. A user capturing method is applied to an intelligent mirror and comprises the following steps:
scanning key points of bones of the whole body of a user entering a capture area to generate a bone posture model;
highlighting the bone posture model corresponding to the user closest to the intelligent mirror as a suspected model in a preset display area of the intelligent mirror;
and in response to not receiving error capture feedback of the displayed suspected model within a preset time length, locking the suspected model as an effective model corresponding to a target user, and controlling a capture component to take the target user as a capture object and adjust a capture posture.
2. The method of claim 1, further comprising:
and in response to receiving the error capturing feedback of the displayed suspected model within the preset time length, extracting a re-capturing instruction from the error capturing feedback according to the feedback form of the error capturing feedback, and re-determining a new suspected model according to the re-capturing instruction.
3. The method of claim 2, wherein said extracting a recapture indication from the error capture feedback according to a feedback form of the error capture feedback comprises:
in response to the feedback form of the error capture feedback being voice, performing semantic recognition on the received voice feedback signal, and extracting action information and emotional tendency information from the semantic recognition result;
determining a position relation of a bone posture model used for indicating the target user relative to the suspected model according to the emotional tendency information and the action information;
correspondingly, the re-determining a new suspected model according to the re-capturing indication includes:
and determining a new suspected model according to the current position of the suspected model and the position relation.
4. The method of any of claims 1-3, further comprising:
rendering the skeleton posture of the effective model in real time according to the captured real-time action information of the target user;
comparing the skeleton posture with a standard skeleton posture at the same time point to obtain a posture consistency degree;
presenting a gesture feedback corresponding to the gesture consistency degree.
5. The method of claim 4, wherein said presenting the gesture feedback corresponding to the degree of gesture consistency comprises:
responding to the gesture consistency degree exceeding a preset degree, and presenting corresponding forward excitation feedback according to the amplitude of the gesture consistency degree exceeding the preset degree;
and in response to the gesture consistency degree not exceeding the preset degree, determining an error bone corresponding to the error gesture, and performing differential rendering on the error bone on the effective model.
6. The method of claim 5, further comprising:
generating corresponding posture correction guidance for the effective models with the differentiated renderings.
7. A user capturing device applied to a smart mirror comprises:
a skeleton attitude model generation unit configured to perform whole body skeleton key point scanning on a user entering a capture area to generate a skeleton attitude model;
the suspected model highlight display unit is configured to highlight a bone posture model corresponding to a user closest to the intelligent mirror in a preset display area of the intelligent mirror as a suspected model;
an effective model determination and capture pose adjustment unit configured to lock the suspected model as an effective model corresponding to a target user in response to not receiving an erroneous capture feedback of the displayed suspected model within a preset time period, and control a capture component to take the target user as a capture object and adjust a capture pose.
8. The apparatus of claim 7, further comprising:
and the error capturing feedback processing unit is configured to respond to the received error capturing feedback of the displayed suspected model within the preset time length, extract a re-capturing instruction from the error capturing feedback according to the feedback form of the error capturing feedback, and re-determine a new suspected model according to the re-capturing instruction.
9. The apparatus of claim 8, wherein the error capture feedback processing unit comprises a recapture indication extraction subunit configured to extract a recapture indication from the error capture feedback according to a feedback form of the error capture feedback, the recapture indication extraction subunit further configured to:
in response to the feedback form of the error capture feedback being voice, performing semantic recognition on the received voice feedback signal, and extracting action information and emotional tendency information from the semantic recognition result;
determining a position relation of a bone posture model used for indicating the target user relative to the suspected model according to the emotional tendency information and the action information;
correspondingly, the error-capture feedback processing unit comprises a new suspected model re-determination subunit configured to re-determine a new suspected model in accordance with the re-capture indication, the new suspected model re-determination subunit further configured to:
and determining a new suspected model according to the current position of the suspected model and the position relation.
10. The apparatus of any of claims 7-9, further comprising:
a real-time rendering unit configured to render the bone pose of the effective model in real time according to the captured real-time action information of the target user;
a posture consistency comparison unit configured to compare the bone posture with a standard bone posture at the same time point to obtain a posture consistency degree;
a gesture feedback unit configured to present a gesture feedback corresponding to the gesture consistency degree.
11. The apparatus of claim 10, wherein the gesture feedback unit is further configured to:
responding to the gesture consistency degree exceeding a preset degree, and presenting corresponding forward excitation feedback according to the amplitude of the gesture consistency degree exceeding the preset degree;
and in response to the gesture consistency degree not exceeding the preset degree, determining an error bone corresponding to the error gesture, and performing differential rendering on the error bone on the effective model.
12. The apparatus of claim 11, further comprising:
a gesture correction guidance generation unit configured to generate a corresponding gesture correction guidance for the valid model in which the differentiated rendering exists.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the user capture method of any one of claims 1-6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the user capture method of any one of claims 1-6.
15. A computer program product comprising a computer program which, when executed by a processor, carries out the steps of the user capture method according to any one of claims 1-6.
CN202111314304.1A 2021-11-08 2021-11-08 User capturing method, apparatus, device, storage medium and computer program product Active CN114035683B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111314304.1A CN114035683B (en) 2021-11-08 2021-11-08 User capturing method, apparatus, device, storage medium and computer program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111314304.1A CN114035683B (en) 2021-11-08 2021-11-08 User capturing method, apparatus, device, storage medium and computer program product

Publications (2)

Publication Number Publication Date
CN114035683A true CN114035683A (en) 2022-02-11
CN114035683B CN114035683B (en) 2024-03-29

Family

ID=80143375

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111314304.1A Active CN114035683B (en) 2021-11-08 2021-11-08 User capturing method, apparatus, device, storage medium and computer program product

Country Status (1)

Country Link
CN (1) CN114035683B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116153510A (en) * 2023-02-17 2023-05-23 河南翔宇医疗设备股份有限公司 Correction mirror control method, device, equipment, storage medium and intelligent correction mirror

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011065578A (en) * 2009-09-18 2011-03-31 Konami Digital Entertainment Co Ltd Image generating device, image generating method and program
CN103916592A (en) * 2013-01-04 2014-07-09 三星电子株式会社 Apparatus and method for photographing portrait in portable terminal having camera
CN107050774A (en) * 2017-05-17 2017-08-18 上海电机学院 A kind of body-building action error correction system and method based on action collection
US20170273639A1 (en) * 2014-12-05 2017-09-28 Myfiziq Limited Imaging a Body
WO2019013736A1 (en) * 2017-07-10 2019-01-17 Siemens Mobility GmbH Calibration of model data based on user feedback
CN110427100A (en) * 2019-07-03 2019-11-08 武汉子序科技股份有限公司 A kind of movement posture capture system based on depth camera
CN111639612A (en) * 2020-06-04 2020-09-08 浙江商汤科技开发有限公司 Posture correction method and device, electronic equipment and storage medium
CN111985393A (en) * 2020-08-18 2020-11-24 深圳市瓴鹰智能科技有限公司 Intelligent mirror for correcting motion posture and motion posture correcting method thereof
CN111986775A (en) * 2020-08-03 2020-11-24 深圳追一科技有限公司 Body-building coach guiding method and device for digital person, electronic equipment and storage medium
CN112016509A (en) * 2020-09-07 2020-12-01 中国银行股份有限公司 Personnel station position abnormity reminding method and device
CN112882575A (en) * 2021-02-24 2021-06-01 宜春职业技术学院(宜春市技术工人学校) Panoramic dance action modeling method and dance teaching auxiliary system
CN113420719A (en) * 2021-07-20 2021-09-21 北京百度网讯科技有限公司 Method and device for generating motion capture data, electronic equipment and storage medium
CN113450438A (en) * 2020-03-24 2021-09-28 深圳市灼华互娱科技有限公司 Virtual character driving method and device based on motion capture and computer equipment
DE112019006278T5 (en) * 2018-12-18 2021-10-14 4D Health Science Llc FULLY INTERACTIVE, VIRTUAL SPORTS AND WELLNESS TRAINER IN REAL TIME AND PHYSIOTHERAPY SYSTEM


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Li Jianwen; Chen Xianlu; Yu Kai: "Research on the structural principles of smart mirror *** and analysis of application development trends", Electronic World, no. 11, pages 11-12 *
Li Hongbo; Sun Boyuan; Li Shuangsheng: "Virtual character control method based on skeletal information", Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), no. 01, pages 81-89 *
Dada, Zhou Sen: "Don't underestimate Xiaodu: its trendy brand Tiantian adds a smart fitness mirror", https://zhuanlan.zhihu.com/p/429752282, pages 1-11 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116153510A (en) * 2023-02-17 2023-05-23 河南翔宇医疗设备股份有限公司 Correction mirror control method, device, equipment, storage medium and intelligent correction mirror
CN116153510B (en) * 2023-02-17 2024-04-16 河南翔宇医疗设备股份有限公司 Correction mirror control method, device, equipment, storage medium and intelligent correction mirror

Also Published As

Publication number Publication date
CN114035683B (en) 2024-03-29

Similar Documents

Publication Publication Date Title
CN106845335B (en) Gesture recognition method and device for virtual reality equipment and virtual reality equipment
US8830292B2 (en) Enhanced interface for voice and video communications
US20180314329A1 (en) Gaze detection in a 3D mapping environment
US20220150285A1 (en) Communication assistance system, communication assistance method, communication assistance program, and image control program
Varona et al. Hands-free vision-based interface for computer accessibility
EP3872766A2 (en) Method and device for processing image, related electronic device and storage medium
CN108304762B (en) Human body posture matching method and device, storage medium and terminal
KR20190030140A (en) Method for eye-tracking and user terminal for executing the same
KR102642866B1 (en) Image recognition method and apparatus, electronic device, and medium
KR102316165B1 (en) Apparatus and method for generating attack image of deep learning based face recognition system
CN114035683A (en) User capturing method, device, equipment, storage medium and computer program product
CN112866577B (en) Image processing method and device, computer readable medium and electronic equipment
CN113055593B (en) Image processing method and device
JP2013161406A (en) Data input device, display device, data input method, and data input program
US20220244788A1 (en) Head-mounted display
CN114630190A (en) Joint posture parameter determining method, model training method and device
CN115061577A (en) Hand projection interaction method, system and storage medium
CN113379879A (en) Interaction method, device, equipment, storage medium and computer program product
CN113963355A (en) OCR character recognition method, device, electronic equipment and storage medium
CN113658307A (en) Image processing method and device
CN113327311A (en) Virtual character based display method, device, equipment and storage medium
CN111461005A (en) Gesture recognition method and device, computer equipment and storage medium
CN113840177B (en) Live interaction method and device, storage medium and electronic equipment
CN113625878B (en) Gesture information processing method, device, equipment, storage medium and program product
US20230019181A1 (en) Device and method for device localization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant