CN111191632B - Gesture recognition method and system based on infrared reflective glove - Google Patents

Gesture recognition method and system based on infrared reflective glove

Info

Publication number
CN111191632B
Authority
CN
China
Prior art keywords
infrared
gesture
hand
glove
finger
Prior art date
Legal status
Active
Application number
CN202010019629.6A
Other languages
Chinese (zh)
Other versions
CN111191632A (en)
Inventor
梁正
Current Assignee
Individual
Original Assignee
Individual
Priority date
Filing date
Publication date
Application filed by Individual
Priority to CN202010019629.6A
Publication of CN111191632A
Application granted
Publication of CN111191632B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G06V40/28 - Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 - Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/014 - Hand-worn input/output arrangements, e.g. data gloves
    • G06F3/017 - Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G06N3/08 - Learning methods
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/10 - Image acquisition
    • G06V10/12 - Details of acquisition arrangements; Constructional details thereof
    • G06V10/14 - Optical characteristics of the device performing the acquisition or on the illumination arrangements
    • G06V10/145 - Illumination specially adapted for pattern recognition, e.g. using gratings
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present disclosure provides a gesture recognition method and system based on an infrared reflective glove. The method comprises: acquiring hand motion images of a user wearing the infrared reflective glove through an infrared camera of a VR system; inputting the acquired hand motion image into a pre-trained contour recognition neural network model to obtain the corresponding gesture contour region; obtaining the 3D coordinates of the hand key joint points according to the gesture contour region and the glove position; and inputting the 3D coordinates of the hand key joint points into a pre-trained gesture recognition neural network model to obtain a gesture recognition result. The infrared reflective glove comprises a palm cover and finger cuffs; an infrared reflection block is arranged on the front surface of the palm cover, an infrared reflection band is arranged on the front surface of each finger cuff, and different infrared reflection blocks and bands have different infrared reflection patterns. This design makes each finger cuff of the infrared reflective glove distinguishable, so the user's gestures can be recognized effectively.

Description

Gesture recognition method and system based on infrared reflective glove
Technical Field
The invention relates to the VR field, in particular to a gesture recognition method and system based on infrared reflective gloves.
Background
In existing gesture recognition technology, a camera collects the user's hand motions; edge detection and normalization extract the main gesture features, which are fed into a model for gesture recognition; the preprocessed image is tracked in depth, a sensor captures the specific motion direction, and the spatial position of the user's fingers is determined; the tracked data are collected, stored, and analyzed; a classifier sorts the data according to extracted rule features; the classified data are rapidly matched against a big-data network; and the matching yields the recognized gesture. This approach requires the exact spatial position of the user's fingers, yet bare fingers have low reflectivity under an infrared camera and easily occlude one another, causing confusion and recognition errors. Existing VR gloves, moreover, can only locate the user's hand as a whole, not the individual fingers; a glove also narrows the gaps between the fingers, which aggravates the occlusion and merging problems of other gesture recognition techniques, for example causing two fingers to be recognized as one.
Therefore, designing an infrared reflective glove whose position an infrared camera can obtain more conveniently and with higher precision, together with a VR system and a gesture recognition method using such a glove, has become an urgent problem to be solved.
Disclosure of Invention
According to embodiments of the disclosure, a gesture recognition scheme based on an infrared reflective glove is provided.
In a first aspect of the present disclosure, there is provided a gesture recognition method based on an infrared reflective glove, the method comprising: capturing, by an infrared camera of a VR system, images of the hand motions of a user wearing the infrared reflective glove, the hand motion image comprising infrared reflection images of the palm and of each finger; inputting the acquired hand motion image into a contour recognition neural network model trained in advance to obtain the corresponding gesture contour region, the gesture contour region comprising a palm contour region and the contour region of each finger; obtaining the 3D coordinates of the hand key joint points according to the gesture contour region and the glove position; and inputting the 3D coordinates of the hand key joint points into a pre-trained gesture recognition neural network model to obtain a gesture recognition result. The infrared reflective glove comprises a palm cover and finger cuffs; an infrared reflection block is arranged on the front surface of the palm cover, an infrared reflection band is arranged on the front surface of each finger cuff, and different infrared reflection blocks and bands have different infrared reflection patterns.
In a second aspect of the present disclosure, there is provided a gesture recognition system based on an infrared reflective glove, the system comprising: an acquisition module for acquiring, via an infrared camera of a VR system, images of the hand motions of a user wearing the infrared reflective glove, the hand motion image comprising infrared reflection images of the palm and of each finger; a contour recognition module for inputting the acquired hand motion image into a contour recognition neural network model trained in advance to obtain the corresponding gesture contour region, the gesture contour region comprising a palm contour region and the contour region of each finger; a key joint point acquisition module for obtaining the 3D coordinates of the hand key joint points according to the gesture contour region and the glove position; and a gesture recognition module for inputting the 3D coordinates of the hand key joint points into a pre-trained gesture recognition neural network model to obtain a gesture recognition result. The infrared reflective glove comprises a palm cover and finger cuffs; an infrared reflection block is arranged on the front surface of the palm cover, an infrared reflection band is arranged on the front surface of each finger cuff, and different infrared reflection blocks and bands have different infrared reflection patterns.
In a third aspect of the present disclosure, an electronic device is provided. The electronic device includes: a memory and a processor, the memory having stored thereon a computer program, the processor implementing the method as described above when executing the program.
In a fourth aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method according to the first aspect of the present disclosure.
It should be understood that what is described in this summary is not intended to limit the critical or essential features of the embodiments of the disclosure nor to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The above and other features, advantages and aspects of embodiments of the present invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. In the drawings, like or similar reference numerals denote like or similar elements:
FIG. 1 shows a schematic structural diagram of an infrared reflective glove provided by an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a VR system according to an embodiment of the present invention;
fig. 3 illustrates a flowchart of a method of gesture recognition based on infrared reflective gloves in accordance with an embodiment of the present disclosure;
FIG. 4 illustrates a block diagram of an infrared reflective glove-based gesture recognition system in accordance with an embodiment of the present disclosure;
fig. 5 illustrates a block diagram of an exemplary electronic device capable of implementing embodiments of the present disclosure.
The correspondence between the reference numerals and the component names in fig. 1 is:
100 infrared reflecting gloves, 1 wrist strap, 12 signal transmitter, 2 palm cover, 22 infrared reflecting block, 3 finger cover, 32 infrared reflecting band.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In addition, the term "and/or" herein is merely an association relationship describing an association object, and means that three relationships may exist, for example, a and/or B may mean: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
An infrared reflective glove 100 provided by an embodiment of the present invention is described below with reference to fig. 1.
As shown in fig. 1, an infrared reflective glove 100 according to an embodiment of the first aspect of the present invention includes a wristband 1, a palm cover 2, and at least one finger cuff 3. Specifically, at least three signal transmitters 12 are provided on the wristband 1; one end of the palm cover 2 is joined to the wristband 1 as a whole; the at least one finger cuff 3 is connected to the end of the palm cover 2 away from the wristband 1, and an infrared reflection band 32 is arranged on the front surface of each finger cuff 3.
An infrared reflective glove 100 according to an embodiment of the present invention includes a wristband 1, a palm cover 2, and at least one finger cover 3, wherein the wristband 1, the palm cover 2, and the at least one finger cover 3 are connected to each other to form a complete conventional glove structure so that a user can wear the glove on a hand normally. The at least three signal transmitters 12 arranged on the wrist strap 1 are used for emitting infrared signals, so that devices such as an external infrared camera can accurately position the infrared reflective glove 100 according to the signals emitted by the signal transmitters 12, and the spatial position of the infrared reflective glove 100 can be accurately obtained according to the principle of an active infrared optical positioning method. And the signal emitter 12 is arranged on the wrist strap 1, so that the signal shielding of the glove on the signal emitter 12 can be reduced, and the space position of the infrared reflecting glove 100 can be acquired more accurately. The infrared reflection bands 32 arranged on the front surface of each finger stall 3 are used for reflecting infrared light to mark the finger positions of the user, so that the VR system can determine the finger positions of the user according to the infrared signals reflected by the infrared reflection bands 32 and perform subsequent infrared imaging, and the VR system can accurately recognize gesture actions of the user according to the positions and the formed infrared images.
The front surface of the glove is the surface of the glove opposite to the palm center, and the front surface of the fingerstall 3 is the outer surface of a layer of the fingerstall 3 contacted with the fingers of the user. And the front surface of the palm sleeve 2 is the outer surface of a layer of the palm sleeve 2 which is contacted with the palm of the user.
On the basis of the above embodiment, preferably, as shown in fig. 1, the number of finger cuffs 3 is plural, and each finger cuff 3 is provided with an infrared reflection band 32; wherein the plurality of infrared reflection bands 32 are each provided with reflection surfaces of different shapes.
In this embodiment, the number of finger cuffs 3 is plural, preferably 5, but may be fewer than 5. Preferably, an infrared reflection band 32 is provided on each finger cuff 3, so that every finger cuff 3 can be position-marked by its own band. The plurality of infrared reflection bands 32 have reflection surfaces of different shapes, i.e. the shape of the reflection surface of each band differs; for example, different line patterns can be arranged on each band to form differently shaped reflection surfaces. Reflection surfaces of different shapes form different infrared images, so in subsequent image recognition each infrared image can be assigned to the corresponding finger and the finger cuffs 3 can be distinguished from one another. This reduces confusion and recognition errors, avoids mistakes such as recognizing two fingers as one, and thereby improves the recognition accuracy of the glove.
On the basis of any of the above embodiments, preferably, as shown in fig. 1, a plurality of protruding structures are provided on the wristband 1 at intervals, and a signal transmitter 12 is provided on each protruding structure. Different signal emitters 12 are distinguished by different colors or infrared bands.
In these embodiments, it may be preferable to provide a plurality of protruding structures on the wristband 1 at equal intervals, and then mount each signal emitter 12 on a protruding structure, so that the signal emitter 12 can protrude from the wristband 1 through the protruding structure, and thus, the signal emitter 12 can be prevented from being blocked by the wristband 1, so that the probability of blocking the signal is reduced to a great extent, and the accuracy of positioning the infrared reflective glove 100 can be improved. Of course, the plurality of protruding structures may be provided at random or at unequal intervals on the wristband 1.
On the basis of any of the above embodiments, it is preferable that the infrared reflection bands 32 are disposed along the length direction of the finger stall 3 as shown in fig. 1, and the length of the infrared reflection bands 32 is equal to or less than the length of the finger stall 3, and/or the width of the infrared reflection bands 32 is smaller than the width of the finger stall 3.
In these embodiments, the infrared reflection band 32 is disposed along the length of the finger cuff 3, so the band can be made longer, increasing the reflectivity of the user's fingers under the infrared camera and reducing mutual occlusion between fingers. Making the band's length no greater than that of the finger cuff 3 avoids the waste of an overlong band, and making its width smaller than the cuff's width makes adjacent fingers easier to distinguish, reducing the recognition error rate.
On the basis of any of the above embodiments, it is preferable that an infrared reflecting block 22 is provided on the front surface of the palm cover 2 as shown in fig. 1.
In these embodiments, the palm of the user can be positioned by the infrared reflecting block 22 on the palm cover 2, so that the palm of the user can be distinguished in the subsequent image recognition.
On the basis of any of the above embodiments, a position sensor (not shown in the figure) is preferably provided in the wristband 1.
In these embodiments, the provision of a position sensor in the wristband 1 enables accurate determination of the position and orientation of the hand, thereby enabling further positional positioning of the glove.
Further preferably, the position sensor is a gyroscope. Of course, the position sensor may be other types of sensors.
On the basis of any of the above embodiments, preferably, as shown in fig. 1, the number of finger stalls 3 is 5, and the front surfaces of the 5 finger stalls 3 are each provided with an infrared reflection band 32.
As shown in fig. 2, a second aspect of the present invention provides a VR system, including: an infrared reflective glove 100 provided in any of the embodiments of the first aspect; an infrared camera 102 for receiving a first signal from the signal emitters 12 on the infrared reflective glove 100, sending a second signal to the glove, and receiving the signal reflected back from it; and a server 104 for carrying out position and gesture recognition according to the infrared images acquired by the infrared camera.
According to the VR system provided in the second aspect of the present invention, the system includes the infrared reflective glove 100, infrared cameras 102 and a server 104. The infrared cameras are plural; they send infrared signals to the reflective elements on the glove, such as the infrared reflection block 22 and the infrared reflection bands 32, receive the infrared signals these elements reflect, and forward the received signals to the server 104, so that the server 104 can form infrared images of the user's fingers, palm, etc. from the reflected signals and determine the user's gesture. The infrared cameras also receive the infrared signals emitted by the plurality of signal emitters 12 on the glove, so that the server 104 can determine the spatial position of the glove from these signals. Since the VR system includes the infrared reflective glove 100 of any embodiment of the first aspect, it also has the advantages of that glove, which are not repeated here.
In this embodiment, the server 104 is specifically configured to analyze and process infrared signals received by the infrared camera, so as to accurately determine the spatial position of the infrared reflective glove 100 and the position of the finger, palm, etc. of the user, and then analyze the gesture of the user according to these positions.
FIG. 3 shows a flowchart of a method 300 of infrared reflective glove-based gesture recognition, according to an embodiment of the present disclosure, including the steps of:
at block 302, capturing a hand motion image of a user by an infrared camera of a VR system; the hand action image comprises infrared reflection images of the palm and each finger;
inputting, at block 304, the acquired hand motion image into a contour recognition neural network model trained in advance to obtain the corresponding gesture contour region; the gesture contour region comprises a palm contour region and the contour region of each finger;
in some embodiments, the pre-trained neural network model is an SSD network, where SSD is based on a forward-propagating CNN network, generating a series of fixed-size (fixed-size) bounding boxes, and each box contains the possibility of an object instance, i.e., score, followed by a Non-maximal suppression (Non-maximum suppression) to obtain the final predictors.
The SSD network comprises a basic network layer, an additional layer and a prediction layer;
the basic network completes feature extraction, generates feature maps of larger resolution, and uses them to generate default bounding boxes of set sizes and aspect ratios. It is a modified VGG16: all convolutional layers of the VGG16 network are retained, and its two fully connected layers are replaced by two ordinary convolutional layers.
The additional layers consist of a series of convolution layers grouped in pairs; their main function is to generate feature maps of smaller resolution and use them to generate default bounding boxes of set sizes and aspect ratios;
the prediction layer is composed of convolution layers and is mainly used for carrying out two convolution filtering processes on each feature map and respectively predicting the position deviation of a default boundary box on the feature map and the category confidence of the default boundary box, namely the probability of containing hands in the default boundary box. The convolution layer for predicting the default bounding box position offset consists of 4×q convolution kernels with a size of 3×3×p, where the parameter q is the default bounding box number (4 or 6 in the embodiment) generated at each point of the feature map and the parameter p is the channel number of the feature map; the convolution layer predicting the confidence of the default bounding box class consists of c×q convolution kernels of size 3×3×p, where the parameter c is the total number of predicted classes (2 in the embodiment, i.e., hand and background classes).
In some embodiments, the contour regions of the palm and each finger may be obtained through the SSD network, so as to further determine the 3D coordinates of the corresponding hand key nodes.
At block 306, obtaining 3D coordinates of the hand key joint points according to the gesture contour region and the glove position;
the glove position is that at least three signal transmitters 12 arranged on the wrist strap 1 send out infrared signals, the infrared camera 102 acquires the signals sent by the signal transmitters 12, the server 104 is used for accurately positioning the infrared reflective glove 100, and the spatial position of the infrared reflective glove 100 is accurately acquired according to the principle of an infrared optical positioning method.
In some embodiments, the spatial positions of the palm and fingers are obtained according to the principle of the infrared optical positioning method and then corrected using the spatial position of the infrared reflective glove 100 determined precisely from the signal transmitters 12.
Because the infrared reflection images of the palm and of different fingers differ, the joint points of the palm and of each finger can be determined precisely, avoiding confusion between joint points. For example, when the user extends different fingers, existing recognition methods are prone to misjudgment, whereas with the present method the extended finger can be uniquely identified from its reflection image.
The 3D coordinates of the key joint points are the positions of the joint points in the image coordinate system and can be converted into the camera coordinate system; when the camera is calibrated, the camera coordinate system serves as the world coordinate system.
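The image-to-camera conversion can be sketched with the standard pinhole model; the intrinsic parameters (fx, fy, cx, cy) below are hypothetical calibration values, not from the patent.

```python
def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) at a known depth into 3D camera
    coordinates using the pinhole model: x = (u - cx) * z / fx, etc."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# A joint imaged at the principal point lies on the optical axis,
# so its x and y camera coordinates are zero.
on_axis = pixel_to_camera(320, 240, 0.5, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
# A joint 600 pixels to the right at 0.5 m depth sits 0.5 m to the side.
offset = pixel_to_camera(920, 240, 0.5, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
```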
In some embodiments, the hand keypoints comprise root keypoints and node keypoints, wherein the root keypoints are located in the palm root center position, the node keypoints comprise finger joint keypoints and finger end keypoints, the finger joint keypoints are located at joints of fingers, and the finger end keypoints are located at finger tips. The number of gesture key points per hand totals 21.
In some embodiments, the method further comprises the steps of:
connecting each key joint point in a relation set of the key joint points of the hand in each frame of image to obtain a space diagram of the key joint points of the hand; connecting the same key joint points on each frame in the relation set of the hand key joint points of each frame to obtain a time chart of the hand key joint points; and combining the time diagram of the hand key joint point and the space diagram of the hand key joint point together to construct a time-space diagram of the hand key joint point.
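The space-time graph construction above can be sketched as follows; the three-joint skeleton is a toy stand-in for the 21-joint hand.

```python
def build_st_graph(num_frames, bones, num_joints):
    """Build the space-time graph: spatial edges (skeleton bones) inside
    each frame, temporal edges linking the same joint in consecutive
    frames. Nodes are indexed as frame * num_joints + joint."""
    spatial, temporal = [], []
    for t in range(num_frames):
        base = t * num_joints
        for a, b in bones:                      # spatial edges in frame t
            spatial.append((base + a, base + b))
        if t + 1 < num_frames:                  # temporal edges t -> t+1
            for j in range(num_joints):
                temporal.append((base + j, base + num_joints + j))
    return spatial, temporal

# Toy skeleton: root joint 0 connected to joints 1 and 2, over 3 frames.
spatial_edges, temporal_edges = build_st_graph(3, bones=[(0, 1), (0, 2)], num_joints=3)
```

Combining the two edge sets over the same node set gives the space-time graph that the later graph convolution operates on.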
In some embodiments, further comprising: and normalizing the values of each key node in the time diagram and the space diagram of the hand key node time space diagram to obtain the data to be calculated of each key node. The normalization processing is a way of simplifying computation, namely, an expression with dimension is converted into a non-dimension expression to become a scalar.
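As one possible reading of the normalization step, min-max scaling maps each joint's values into a dimensionless range; the patent does not fix the exact formula, so the choice of min-max scaling is an assumption.

```python
def normalize(values):
    """Min-max scale a list of coordinate values into [0, 1], turning a
    dimensioned quantity into a dimensionless one."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # constant input: map everything to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled = normalize([2.0, 4.0, 6.0])
```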
In block 308, inputting the 3D coordinates of the hand key node into a preset gesture recognition neural network model to obtain a recognition result;
in some embodiments, the preset gesture recognition neural network model is trained by:
performing gesture labeling on the 3D coordinates of hand key joint points obtained from pre-recorded images of different gestures, to generate training samples for the gesture recognition neural network model;
training the neural network model on these samples; the model can be a deep convolutional neural network whose parameters are trained and learned by stochastic gradient descent (SGD): when computing the direction of steepest descent, SGD randomly selects a single sample instead of scanning the whole training set, which speeds up iteration;
and ending training when the error between the network output value and the target value is smaller than the expected value.
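The training loop described in the steps above (one randomly drawn sample per update, stopping once the error falls below the expected value) can be sketched on a toy one-parameter model; the model, learning rate, and data are stand-ins, not the patent's network.

```python
import random

def sgd_fit(samples, lr=0.1, target_error=1e-4, max_steps=10000, seed=0):
    """Fit y = w * x by SGD on randomly drawn (x, y) pairs, stopping
    when the worst-case error over the samples drops below target_error."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(max_steps):
        x, y = rng.choice(samples)          # one random sample, not a full scan
        err = w * x - y
        w -= lr * err * x                   # gradient of 0.5 * err**2 w.r.t. w
        if max(abs(w * xs - ys) for xs, ys in samples) < target_error:
            break
    return w

# Data generated from y = 2x; training should recover w close to 2.
w = sgd_fit([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```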
The convolutional neural network comprises convolution layers, pooling layers and a fully connected layer. The convolution layers extract high-dimensional key joint point information from the data to be calculated. The pooling layers compress the data and the number of parameters, reduce overfitting, and strip redundant information, retaining the scale-invariant information that best expresses the key joint point features; they down-sample the high-dimensional key joint point information to obtain the information to be classified. The fully connected layer performs classification on the information to be classified, yielding a probability for each gesture type, and the gesture type with the maximum probability is taken as the recognition result.
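The final classification step described above (per-class probabilities, then taking the maximum) can be sketched as follows; the gesture labels and raw scores are invented for illustration.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, labels):
    """Return the label with the highest probability and that probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best], probs[best]

labels = ["fist", "open_palm", "point"]      # hypothetical gesture set
gesture, prob = classify([0.2, 2.5, 1.0], labels)
```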
In some embodiments, the spatiotemporal graph of the hand joint points is input into a gesture recognition neural network model deployed in advance in the cloud to obtain the recognition result.
The spatiotemporal graph convolutional gesture recognition model is constructed in advance using a graph convolutional neural network and a number of gesture data sets.
In some embodiments, the parameters of the graph convolutional neural network model may likewise be learned by stochastic gradient descent (Stochastic gradient descent, SGD), which randomly selects a single sample when computing the direction of steepest descent, rather than scanning the entire training data set, thereby speeding up iteration.
The graph convolutional neural network model includes six spatiotemporal convolution units, three pooling layers, and one support vector machine (Support Vector Machine, SVM) classifier.
Any two spatiotemporal convolution units and one pooling layer may be combined into a computing unit. Within a computing unit, the two spatiotemporal convolution units operate in sequence, with the latter unit processing the output of the former, and extract high-dimensional finger joint point features from the data to be calculated. The pooling layer compresses the data and the number of parameters, reduces overfitting, and removes redundant information, retaining the scale-invariant information that best expresses the finger features; it does so by downsampling the high-dimensional finger joint point features to obtain the information to be classified.
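The computing unit described above (two spatiotemporal convolution units in sequence, then a pooling layer) can be sketched as follows. The hand-skeleton adjacency, joint count, channel widths, and kernel length are all assumptions made for the illustration, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical hand skeleton: 21 joints (wrist + 4 joints per finger),
# with edges along the kinematic chain and self-loops.
n_joints, n_frames, channels = 21, 16, 3
A = np.eye(n_joints)
for s in (1, 5, 9, 13, 17):            # finger base joints, linked to the wrist (joint 0)
    A[0, s] = A[s, 0] = 1
    for b in (s, s + 1, s + 2):        # chain along each finger
        A[b, b + 1] = A[b + 1, b] = 1
A = A / A.sum(axis=1, keepdims=True)   # row-normalise neighbour aggregation

def st_conv_unit(x, w_g, w_t):
    # Spatiotemporal convolution unit: graph convolution over joints,
    # followed by a temporal convolution (kernel length 3) over frames.
    x = np.einsum('ij,tjc->tic', A, x) @ w_g          # spatial graph convolution
    x = np.maximum(x, 0.0)
    out = sum(x[k: x.shape[0] - 2 + k] @ w_t[k] for k in range(3))
    return np.maximum(out, 0.0)

def temporal_pool(x, size=2):
    # Pooling layer: downsample frames, keeping the strongest responses.
    t = x[: (x.shape[0] // size) * size]
    return t.reshape(-1, size, *x.shape[1:]).max(axis=1)

x = rng.normal(size=(n_frames, n_joints, channels))   # joint coordinates per frame
w_g1, w_t1 = rng.normal(size=(channels, 8)) * 0.1, rng.normal(size=(3, 8, 8)) * 0.1
w_g2, w_t2 = rng.normal(size=(8, 8)) * 0.1, rng.normal(size=(3, 8, 8)) * 0.1

h = st_conv_unit(x, w_g1, w_t1)   # first spatiotemporal convolution unit
h = st_conv_unit(h, w_g2, w_t2)   # second unit processes the first unit's output
h = temporal_pool(h)              # pooling layer closes the computing unit
print(h.shape)
```

Note how the second unit consumes the first unit's result directly, and the pooling step halves the temporal resolution, matching the sequential operation described above.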
The SVM classifier performs a classification calculation on the information to be classified, obtaining a probability for each gesture type, and the gesture type with the maximum probability is taken as the recognition result.
In some embodiments, each spatiotemporal convolution unit includes an attention model, a graph convolution model, and a temporal convolution model. The attention model restricts the recognition range of the graph convolution model; the graph convolution model identifies the data to be calculated within that range to obtain the set of spatial structures among the finger joint points; and the temporal convolution model operates on that set to obtain the high-dimensional finger joint point information.
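One common way to realise the attention model's "restriction of the recognition range" is to gate the adjacency matrix before the graph convolution aggregates neighbours, as in the sketch below. The gate form, skeleton structure, and shapes are illustrative assumptions, not the patent's actual design.

```python
import numpy as np

rng = np.random.default_rng(3)

n_joints, channels = 21, 8
x = rng.normal(size=(n_joints, channels))      # per-joint features at one time step

# Hand-skeleton adjacency with self-loops (structure is illustrative).
A = np.eye(n_joints)
for s in (1, 5, 9, 13, 17):                    # finger base joints, linked to the wrist
    A[0, s] = A[s, 0] = 1
    for b in (s, s + 1, s + 2):                # chain along each finger
        A[b, b + 1] = A[b + 1, b] = 1

# Attention model: a sigmoid gate per joint that restricts which joints the
# graph convolution effectively attends to (its recognition range).
attn = 1 / (1 + np.exp(-rng.normal(size=n_joints)))
A_masked = A * attn                            # broadcast the gate over columns

# Graph convolution model: aggregate gated neighbours, then mix channels,
# yielding the spatial structure among the finger joint points.
W = rng.normal(size=(channels, channels))
spatial = np.maximum(A_masked @ x @ W, 0.0)

# The temporal convolution model would then filter `spatial` along the frame
# axis (omitted here for brevity; it is a 1-D convolution over frames).
print(spatial.shape)
```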
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present disclosure is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present disclosure. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all alternative embodiments, and that the acts and modules referred to are not necessarily required by the present disclosure.
The foregoing is a description of method embodiments, and the following further describes the aspects of the disclosure with reference to system embodiments.
FIG. 4 illustrates a block diagram of an infrared reflective glove-based gesture recognition system 400, which may be included in the server 104 of FIG. 2 or implemented as the server 104, in accordance with embodiments of the present disclosure. As shown in FIG. 4, the system 400 includes:
the acquisition module 402 is used for acquiring hand action images of a user wearing infrared reflective gloves through an infrared camera of the VR system; the hand action image comprises infrared reflection images of the palm and each finger;
the contour recognition module 404 is configured to input the collected hand motion image into a contour recognition neural network model obtained by training in advance, so as to obtain a corresponding gesture contour region; the gesture contour area comprises a palm contour area and each finger contour area;
the key joint point obtaining module 406 is configured to obtain 3D coordinates of a key joint point of the hand according to the gesture contour region and the glove position;
the gesture recognition module 408 is configured to input the 3D coordinates of the hand key node into a gesture recognition neural network model obtained by training in advance, so as to obtain a gesture recognition result;
the infrared reflective glove comprises a palm sleeve and finger sleeves, wherein an infrared reflection block is arranged on the front surface of the palm sleeve; the front of each finger sleeve is provided with infrared reflection bands, and different infrared reflection blocks and infrared reflection bands have different infrared reflection patterns.
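The data flow between the four modules of system 400 can be wired together as in the following sketch. All names, signatures, and the stub implementations are hypothetical, intended only to show how the acquisition, contour recognition, key joint point, and gesture recognition modules hand data to one another.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class GestureRecognitionSystem:
    acquire: Callable[[], Any]                 # acquisition module: infrared camera frames
    find_contours: Callable[[Any], Any]        # contour recognition module
    locate_joints: Callable[[Any, Any], Any]   # key joint point module (uses glove position)
    classify: Callable[[Any], str]             # gesture recognition module

    def run(self, glove_position):
        image = self.acquire()                             # hand action image
        contours = self.find_contours(image)               # gesture contour regions
        joints = self.locate_joints(contours, glove_position)  # 3D key joint points
        return self.classify(joints)                       # gesture recognition result

# Stub implementations, just to exercise the end-to-end flow.
system = GestureRecognitionSystem(
    acquire=lambda: "ir_frame",
    find_contours=lambda img: {"palm": [], "fingers": []},
    locate_joints=lambda c, pos: [(0.0, 0.0, 0.0)] * 21,
    classify=lambda joints: "fist" if len(joints) == 21 else "unknown",
)
print(system.run(glove_position=(0, 0, 0)))  # -> fist
```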
Fig. 5 shows a schematic block diagram of an electronic device 500 that may be used to implement embodiments of the present disclosure. The device 500 may be used to implement the server 104 of fig. 2. As shown, the device 500 includes a Central Processing Unit (CPU) 501 that may perform various suitable actions and processes in accordance with computer program instructions stored in a Read Only Memory (ROM) 502 or loaded from a storage unit 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the device 500 can also be stored. The CPU 501, ROM 502, and RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
Various components in the device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, etc.; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508 such as a magnetic disk, an optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the device 500 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processing unit 501 performs the various methods and processes described above, such as method 300. For example, in some embodiments, the method 300 may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into RAM 503 and executed by CPU 501, one or more steps of method 300 described above may be performed. Alternatively, in other embodiments, CPU 501 may be configured to perform method 300 by any other suitable means (e.g., by means of firmware).
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a complex programmable logic device (CPLD), etc.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (10)

1. A gesture recognition method based on an infrared reflective glove, characterized by comprising the following steps:
acquiring hand action images of a user wearing infrared reflective gloves through an infrared camera of a VR system; the hand action image comprises infrared reflection images of the palm and each finger;
inputting the acquired hand action image into a contour recognition neural network model which is trained in advance to obtain a corresponding gesture contour region; the gesture contour area comprises a palm contour area and each finger contour area;
according to the gesture outline area and the glove positions, obtaining 3D coordinates of hand key joint points, wherein the hand key joint points comprise root key points and node key points;
inputting the 3D coordinates of the hand key joint points into a gesture recognition neural network model obtained through pre-training to obtain a gesture recognition result;
the infrared reflective glove comprises a palm sleeve and finger sleeves, wherein an infrared reflection block is arranged on the front surface of the palm sleeve; the front of each finger sleeve is provided with infrared reflection bands, and different infrared reflection blocks and infrared reflection bands have different infrared reflection patterns.
2. The method of claim 1, wherein a plurality of infrared cameras are used to capture the hand motions of the user and to avoid occlusion.
3. The method of claim 2, wherein the contour recognition neural network model is a feed-forward CNN.
4. The method according to claim 2, wherein the glove position is obtained by an optical positioning method after the infrared camera receives infrared signals emitted by at least three signal emitters arranged on a wrist strap of the glove.
5. The method of claim 4, wherein deriving 3D coordinates of a hand critical node from the gesture contour region and glove position comprises:
acquiring the spatial positions of the palm and the fingers according to the principle of the infrared optical positioning method, and correcting the spatial positions of the palm and the fingers according to the glove position; and distinguishing the joint points of different fingers according to the infrared reflection patterns of the different reflection blocks and reflection bands.
6. The method of claim 4, wherein deriving 3D coordinates of a hand critical node from the gesture contour region and glove position further comprises:
constructing a time diagram of the hand key articulation point and a space diagram of the hand key articulation point, and further constructing a time-space diagram of the hand key articulation point.
7. The method of claim 6, wherein inputting the 3D coordinates of the hand critical node into a pre-trained gesture recognition neural network model to obtain a gesture recognition result comprises:
and inputting the time-space diagram of the key joint points of the hand into a pre-trained gesture recognition neural network model to obtain a recognition result.
8. A gesture recognition system based on infrared reflective gloves, comprising:
the acquisition module is used for acquiring hand action images of a user wearing infrared reflective gloves through an infrared camera of the VR system; the hand action image comprises infrared reflection images of the palm and each finger;
the contour recognition module is used for inputting the acquired hand action image into a contour recognition neural network model which is trained in advance to obtain a corresponding gesture contour region; the gesture contour area comprises a palm contour area and each finger contour area;
the key joint point acquisition module is used for acquiring 3D coordinates of a hand key joint point according to the gesture outline area and the glove position, wherein the hand key joint point comprises a root key point and a node key point;
the gesture recognition module is used for inputting the 3D coordinates of the hand key joint points into a gesture recognition neural network model obtained through pre-training to obtain a gesture recognition result;
the infrared reflective glove comprises a palm sleeve and finger sleeves, wherein an infrared reflection block is arranged on the front surface of the palm sleeve; the front of each finger sleeve is provided with infrared reflection bands, and different infrared reflection blocks and infrared reflection bands have different infrared reflection patterns.
9. An electronic device comprising a memory and a processor, the memory having stored thereon a computer program, characterized in that the processor, when executing the program, implements the method of any of claims 1-7.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any one of claims 1-7.
CN202010019629.6A 2020-01-08 2020-01-08 Gesture recognition method and system based on infrared reflective glove Active CN111191632B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010019629.6A CN111191632B (en) 2020-01-08 2020-01-08 Gesture recognition method and system based on infrared reflective glove


Publications (2)

Publication Number Publication Date
CN111191632A CN111191632A (en) 2020-05-22
CN111191632B true CN111191632B (en) 2023-10-13

Family

ID=70708073

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010019629.6A Active CN111191632B (en) 2020-01-08 2020-01-08 Gesture recognition method and system based on infrared reflective glove

Country Status (1)

Country Link
CN (1) CN111191632B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112115799B (en) * 2020-08-24 2023-12-26 青岛小鸟看看科技有限公司 Three-dimensional gesture recognition method, device and equipment based on marked points
CN112861783A (en) * 2021-03-08 2021-05-28 北京华捷艾米科技有限公司 Hand detection method and system
CN115546824B (en) * 2022-04-18 2023-11-28 荣耀终端有限公司 Taboo picture identification method, apparatus and storage medium

Citations (4)

Publication number Priority date Publication date Assignee Title
CN103984928A (en) * 2014-05-20 2014-08-13 桂林电子科技大学 Finger gesture recognition method based on field depth image
CN106991398A (en) * 2017-04-01 2017-07-28 北京工业大学 A kind of gesture identification method based on image recognition cooperation figure gloves
CN107103613A (en) * 2017-03-28 2017-08-29 深圳市未来媒体技术研究院 A kind of three-dimension gesture Attitude estimation method
CN107924166A (en) * 2015-04-03 2018-04-17 绿仕环保科技(上海)有限公司 Environmental control system

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US9213890B2 (en) * 2010-09-17 2015-12-15 Sony Corporation Gesture recognition system for TV control


Non-Patent Citations (1)

Title
Simulation of motion accuracy for robot tracking of gesture posture images; Wang Mingyan; Hu Ming; Yang Wenji; Computer Simulation (08); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant