CN110717385A - Dynamic gesture recognition method - Google Patents
- Publication number
- CN110717385A (application CN201910816005.4A)
- Authority
- CN
- China
- Prior art keywords
- gesture
- node
- palm
- dynamic
- distance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/59—Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
- G06V20/597—Recognising the driver's state or behaviour, e.g. attention or drowsiness
Abstract
The invention provides a method for recognizing dynamic gestures. Palm position information is extracted by combining human skeleton node information with depth image information, and the palm image is separated from the background; the segmented gesture image is recognized with a support vector machine (SVM) algorithm; finally, optimal matching of dynamic gestures is performed with the dynamic time warping (DTW) algorithm over the motion sequences of the arm skeleton nodes.
Description
Technical Field
The invention relates to intelligent recognition technology, and is particularly applicable to gesture recognition in the field of transportation.
Background
Train driver gesture recognition is an important component of intelligent traffic management systems. As a non-contact, computer-vision-based gesture acquisition approach, it has low equipment cost, better meets the naturalness and comfort required by human-computer interaction, and is a current research hotspot.
Compared with static gestures, dynamic gestures are intuitive and convenient and are better suited to flexible human-machine interaction; however, because dynamic gestures are numerous in type, complex in characteristics and fast-changing, research on dynamic gesture recognition is difficult. In the prior art, one approach performs a distance transformation on the binarized image to generate a hand-region image with a skeleton-extraction effect and connects the central points to obtain a hand skeleton, thereby recognizing and classifying gestures with an accuracy of almost 100%. Some methods can only recognize left-hand gestures and the human trunk, and lack recognition of the palm skeleton. Still other methods use the Kinect to recognize hand arithmetic gestures (Arabic numerals and operator signs) and rock-paper-scissors: an accurate hand-region image is obtained by depth-threshold segmentation, and the Finger-Earth Mover's Distance (FEMD) metric is used to measure the difference between hand shapes for recognition and classification; the highest recognition rate of this method reaches 93.9%. However, during gesture recognition the black wrist strap worn by the tester influences the recognition result, and the recognition accuracy is low when the wrist strap is not worn.
Most existing methods target static, general-purpose gesture recognition in common scenes; they recognize dynamic professional gestures in certain specific scenarios poorly and cannot effectively judge and recognize the gestures in such scenes.
Disclosure of Invention
Based on the above problems, the invention provides a dynamic gesture recognition method, in particular a machine-vision-based dynamic gesture recognition method for train drivers. The Kinect is used to obtain human skeleton node information; a distance difference threshold is set to determine the approximate palm node position and obtain a gesture segmentation image; a support vector machine (SVM) then performs gesture recognition and evaluation; and, combined with the motion sequences of the skeleton nodes, a dynamic time warping (DTW) algorithm recognizes and detects the arm actions of the train driver, finally yielding effective gesture information.
The invention provides a gesture recognition method, which specifically comprises the following steps: step S1, determining the position of the palm node: the position coordinates of all white pixel points in a circle centred on the palm node, with radius equal to the distance r between the palm node and the wrist node, are averaged, and this average represents the palm node position, so that the palm node position (x_p, y_p) is:

x_p = (1/T) Σ_{i=1}^{T} x_i,   y_p = (1/T) Σ_{i=1}^{T} y_i

In the formula, T represents the number of white pixel points in the circle, x_i represents the abscissa of the i-th white pixel point, and y_i represents the ordinate of the i-th white pixel point;
step S2, after finding the position of the palm node, searching gesture pixel points, and segmenting the gesture by judging the difference between the distance from the palm node to the Kinect and the distance from pixel points in the surrounding area to the Kinect;
when the gesture recognition object is a palm gesture,
recognizing the segmented gesture image by adopting an SVM algorithm and evaluating the gesture standardness; the SVM classification result is the confidence between the test gesture and the standard gesture, and can be used as an evaluation criterion of palm gesture standardness, as shown in the formula:

Score = round( (1/T) Σ_{i=1}^{T} s_i )

In the formula, round(·) represents rounding to an integer, T is the total number of frames in the dynamic gesture sequence, and s_i represents the SVM output result for the gesture image of the i-th frame; the palm gesture score is the average of the gesture scores over the whole sequence.
Further, the gesture recognition also comprises recognition of arm actions, which specifically comprises: extracting key skeleton node coordinate data from the data acquired by the Kinect sensor; the coordinate of the key skeleton node is P_s = (x_s, y_s, z_s), and the coordinates of the remaining arm skeleton nodes are P_i = (x_i, y_i, z_i), i = 1, 2, 3, 4, so the distance between node P_i and key skeleton node P_s is:

D_si = sqrt( (x_i − x_s)² + (y_i − y_s)² + (z_i − z_s)² )
Within a certain time T, the motion sequence of an arm skeleton node is represented as (D_si^1, D_si^2, ..., D_si^T), i = 1, 2, 3, 4, and according to the arm skeleton node motion sequences, optimal matching of dynamic gestures can be performed with a DTW algorithm.
Further, wherein the key bone node is specifically one of: palm, wrist, elbow, shoulder center.
Further, searching for a gesture pixel in S2 specifically includes:
the method comprises the following steps of searching gesture pixel points in a large rectangular area with a palm node as a center, and the method comprises the following steps:
Let the distance from the skeleton palm node extracted by the Kinect to the Kinect camera be d_p, the position of the palm node be (x_p, y_p, d_p), and the position of the wrist node be (x_r, y_r, d_r); the gesture pixel point search is performed within a rectangular pixel area centred on the palm node with width w and height h,
Let S_0 denote the initial set of gesture pixel points and d_ij denote the distance from the pixel g_ij in row i, column j of the rectangular area to the Kinect camera; then:

S_k = S_{k−1} ∪ { g_ij | abs(d_p − d_ij) < threshold }

where k represents the number of search iterations, threshold represents the threshold on the difference between the distance from the palm node to the Kinect and the distance from gesture pixel points in the rectangular area to the Kinect, abs(d_p − d_ij) represents the absolute value of the distance difference between the palm node and the gesture pixel area, and S_k represents the finally detected set of gesture pixel points.
Further, the DTW algorithm specifically comprises: the time sequence of the standard dynamic gesture sample is X = (D_s^1, D_s^2, ..., D_s^m) and the test gesture time sequence is Y = (D_t^1, D_t^2, ..., D_t^n); let the point-pair relationship between the two sequences be φ(k) = (φ_s(k), φ_t(k)), where 1 ≤ φ_s(k) ≤ m, 1 ≤ φ_t(k) ≤ n, max(m, n) ≤ k ≤ m + n; find the optimal point-pair relation φ(k) between the two sequences such that the sum of the distances between corresponding points is minimum, expressed as:

DTW(X, Y) = min_φ Σ_k d( D_s^{φ_s(k)}, D_t^{φ_t(k)} )
and recognizing the input dynamic gesture according to the obtained DTW distance to obtain a gesture recognition result.
Further wherein the size of the rectangular area is greater than 3 times the distance between the palm node and the wrist node.
Further, the gesture recognition result is the sample class in the standard dynamic gesture library with the minimum DTW distance, represented as:

O = argmin_i DTW(X_i, Y)

In the formula, X_i is a standard dynamic gesture sample, Y is the input dynamic gesture, i is the dynamic gesture sample category, and O is the finally recognized dynamic gesture category;
The arm dynamic gesture score is measured by the DTW distance: the closer the test gesture sequence is to the standard gesture sample sequence, the smaller the DTW distance. In the score formula, X_i represents a standard dynamic gesture sequence, Y is the test gesture sequence, N is the number of standard gesture sequence samples, and α is the average DTW distance between the standard gesture sequence samples.
A computer storage medium having a computer program stored thereon, the computer program being executable by a processor to implement the method described above.
Drawings
The features and advantages of the present disclosure will be more clearly understood by reference to the accompanying drawings, which are illustrative and should not be construed as limiting the disclosure in any way, in which
FIG. 1 is a schematic diagram of a gesture image segmentation method
FIG. 2 is a diagram illustrating a method for determining a palm node position
FIG. 3 is a flow chart of palm gesture recognition
FIG. 4 is a flow chart of train driver arm action recognition
Detailed Description
These and other features and characteristics of the present disclosure, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, will be better understood by reference to the following description and drawings, which form a part of this specification. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the disclosure. It will be understood that the figures are not drawn to scale. Various block diagrams are used in this disclosure to illustrate various variations of embodiments according to the disclosure.
Example 1
The gesture detection method combines human skeleton node information with depth image information to segment the gesture image: human skeleton node data are obtained through the Kinect, the position of the palm node is found, and the gesture is searched within the range of the palm node. Since all pixel points on the palm are at similar distances from the camera, the gesture information can be separated from the background by setting a distance difference threshold. The flow of the method is shown in fig. 1.
Because node drift easily occurs when the Kinect tracks human skeleton nodes, the reported distance from the palm node to the Kinect may differ from the actual distance, and segmentation with the distance difference threshold then fails. Therefore, an approximate palm node position determination method is adopted, as shown in fig. 2: the position coordinates of all white pixel points in a circle centred on the tracked palm node, with radius equal to the distance r between the palm node and the wrist node, are averaged, and this average represents the palm node position, so that the palm node position (x_p, y_p) is:

x_p = (1/T) Σ_{i=1}^{T} x_i,   y_p = (1/T) Σ_{i=1}^{T} y_i

In the formula, T represents the number of white pixel points in the circle, x_i represents the abscissa of the i-th white pixel point, and y_i represents the ordinate of the i-th white pixel point. After the position of the palm node is found, the gesture is segmented by judging the difference between the distance from the palm node to the Kinect and the distance from pixel points in the surrounding area to the Kinect.
After the palm node is found, gesture pixel points need to be searched around it. To prevent deviations in the gesture pixel search caused by palm node drift, the search is carried out in a large rectangular area centred on the palm node. The algorithm proceeds as follows:
Let the distance from the skeleton palm node extracted by the Kinect to the Kinect camera be d_p, the position of the palm node be (x_p, y_p, d_p), and the position of the wrist node be (x_r, y_r, d_r); the gesture pixel point search is performed within a rectangular pixel area centred on the palm node with width w and height h,
Let S_0 denote the initial set of gesture pixel points and d_ij denote the distance from the pixel g_ij in row i, column j of the rectangular area to the Kinect camera; then:

S_k = S_{k−1} ∪ { g_ij | abs(d_p − d_ij) < threshold }

where k represents the number of search iterations, threshold represents the threshold on the difference between the distance from the palm node to the Kinect and the distance from gesture pixel points in the rectangular area to the Kinect, abs(d_p − d_ij) represents the absolute value of the distance difference between the palm node and the gesture pixel area, and S_k represents the finally detected set of gesture pixel points.
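A minimal sketch of the rectangular depth-threshold search, assuming a NumPy depth image in millimetres. For simplicity it tests every pixel of the rectangle in a single pass rather than iterating k times; the function and parameter names are assumptions:

```python
import numpy as np

def segment_gesture(depth, px, py, dp, w, h, threshold=20):
    """Collect the pixels in the w x h rectangle centred on the palm node
    (px, py) whose depth differs from the palm-node depth dp by less than
    `threshold` (i.e. abs(d_p - d_ij) < threshold). Returns (row, col) pairs."""
    rows = slice(max(py - h // 2, 0), py + h // 2 + 1)
    cols = slice(max(px - w // 2, 0), px + w // 2 + 1)
    window = depth[rows, cols]
    hit = np.abs(window - dp) < threshold              # the distance-difference test
    ii, jj = np.nonzero(hit)
    return set(zip((ii + rows.start).tolist(), (jj + cols.start).tolist()))
```

Restricting the test to the rectangle keeps background pixels that happen to share the palm's depth (e.g. the other arm) out of the segmented gesture.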
The gesture action is then recognized: the segmented gesture image is recognized with the SVM algorithm and the gesture standardness is evaluated. The SVM classification result is the confidence between the test gesture and the standard gesture, and can be used as an evaluation criterion of palm gesture standardness, as shown in formula (3):

Score = round( (1/T) Σ_{i=1}^{T} s_i )    (3)

In the formula, round(·) represents rounding to an integer, T is the total number of frames in the dynamic gesture sequence, and s_i represents the SVM output result for the gesture image of the i-th frame, so the palm gesture score is the average of the gesture scores over the whole sequence.
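The scoring step reduces to averaging and rounding the per-frame SVM scores. A sketch, assuming the per-frame scores are already scaled to 0–100 (the scaling is an assumption, not stated in the text):

```python
def palm_gesture_score(frame_scores):
    """Score = round((1/T) * sum of per-frame SVM scores), T = sequence length."""
    T = len(frame_scores)
    return round(sum(frame_scores) / T)
```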
Gesture detection first processes the depth image collected by the Kinect sensor: on the one hand, pixel points are searched in the palm region; on the other hand, the human skeleton node image is used to judge the palm node position. Because the Kinect detects skeleton nodes with low precision and the nodes drift easily, the size threshold of the rectangular search region cannot be set too small, or palm detection may be incomplete. In the experiments, when t ≥ 3, i.e. the size of the rectangular search region is greater than 3 times the distance between the palm node and the wrist node, the gesture search effect is ideal. The setting of the distance difference threshold between the palm node and the gesture pixel region is also important to the segmentation result: when the threshold is too small, the gesture segmentation is incomplete; when the threshold is too large, the wrist and other regions are easily included. The experimental results show that when threshold ∈ (15, 25) mm, the gesture detection effect is ideal.
The Kinect sensor acquires image depth data as well as driver skeletal data. The driver completes a whole set of gesture actions that includes not only the palm gesture but also the arm action, so the data acquired by the Kinect sensor should include the coordinate data of several key skeleton nodes such as the palm, wrist, elbow and shoulder centre. When the driver performs different gesture actions, the relative position of the shoulder centre node generally remains basically unchanged. Let the coordinate of the shoulder centre node be P_s = (x_s, y_s, z_s) and the coordinates of the remaining arm skeleton nodes be P_i = (x_i, y_i, z_i), i = 1, 2, 3, 4; the distance between node P_i and the shoulder centre node P_s is then:

D_si = sqrt( (x_i − x_s)² + (y_i − y_s)² + (z_i − z_s)² )
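The per-frame distance D_si can be computed as below (a sketch; representing each frame as an (x, y, z) tuple per joint is an assumption):

```python
import math

def arm_distance_sequence(shoulder_frames, node_frames):
    """Per-frame Euclidean distance D_si between one arm node and the
    shoulder-centre node, yielding the motion sequence that is fed to DTW."""
    return [math.dist(ps, pi) for ps, pi in zip(shoulder_frames, node_frames)]
```

Using distances to the (nearly stationary) shoulder centre, rather than raw coordinates, makes the feature sequence invariant to the driver's position in front of the sensor.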
Within a certain time T, the motion sequence of an arm skeleton node is represented as (D_si^1, D_si^2, ..., D_si^T), i = 1, 2, 3, 4. Based on the arm skeleton node motion sequences, the DTW algorithm can perform optimal matching of dynamic gestures. Because DTW handles two time sequences of different lengths, it is well suited to train driver dynamic gesture recognition. Let the time sequence of a standard driver dynamic gesture sample be X = (D_s^1, D_s^2, ..., D_s^m) and the test gesture time sequence be Y = (D_t^1, D_t^2, ..., D_t^n), and let the point-pair relationship between the two sequences be φ(k) = (φ_s(k), φ_t(k)), where 1 ≤ φ_s(k) ≤ m, 1 ≤ φ_t(k) ≤ n, max(m, n) ≤ k ≤ m + n. The aim of the DTW algorithm is to find the optimal point-pair relation φ(k) between the two sequences such that the sum of the distances between corresponding points is minimum, expressed as:

DTW(X, Y) = min_φ Σ_k d( D_s^{φ_s(k)}, D_t^{φ_t(k)} )    (5)

In the formula, d(·, ·) denotes the distance between a pair of corresponding points. Equation (5) is usually solved by dynamic programming to reduce the algorithm complexity.
The input dynamic gesture is recognized and evaluated according to the obtained DTW distance. The recognition result of the driver's dynamic gesture is the sample class in the standard dynamic gesture library with the minimum DTW distance, represented as:

O = argmin_i DTW(X_i, Y)

In the formula, X_i is a standard dynamic gesture sample, Y is the input dynamic gesture, i is the dynamic gesture sample category, and O is the finally recognized dynamic gesture category.
The arm dynamic gesture score is measured by the DTW distance: the closer the test gesture sequence is to the standard gesture sample sequence, the smaller the DTW distance. In the score formula, X_i represents a standard dynamic gesture sequence, Y is the test gesture sequence, N is the number of standard gesture sequence samples, and α is the average DTW distance between the standard gesture sequence samples.
Detection of the train driver's palm-area actions is taken as an example. The palm-area actions generally comprise four gestures: (a) a fist with the index and middle fingers extended, (b) a fist with the thumb raised, (c) a fist with the thumb and little finger extended, and (d) all five fingers extended and held together.
When a train driver performs a gesture action, the palm usually undergoes corresponding deformation and rotation. Therefore, during gesture recognition the detected gesture must be size-normalized. To reduce the influence of gesture rotation within a sequence on the recognition effect, rotated samples participate in the offline training of the SVM classifier, which increases the robustness of the classifier. Table 1 shows the classification recognition rates of the 4 palm gestures when the train driver performs a hand-raising action.
TABLE 1
As can be seen from Table 1, the detection rate is high when the method provided by the present application is used to detect the palm and the gesture. First, by determining the palm centre node and searching gesture pixel points around it, missed detections can be effectively avoided, and thresholding the difference between the distance from surrounding gesture pixels to the Kinect and that from the palm centre pixel reduces the possibility that the algorithm detects the wrist. For recognition, palm gesture images from multiple gesture sequences are trained with the SVM algorithm, reducing misrecognition caused by image rotation; the average recognition rate of the 4 palm gestures of the train driver reaches over 88%. For the gesture score, the average of the confidences over the recognized gesture sequence gives the final score, which effectively judges how standard the train driver's palm gesture is.
The arm actions of the train driver comprise a number of gestures with different meanings, for example 4 dynamic gestures: arm raising, arm forward, elbow bending, and left-right arm swinging.
TABLE 2
Table 2 shows the recognition effect of the proposed gesture recognition method on 4 train driver arm actions: the average recognition rate of the 4 common dynamic gestures exceeds 85%, and the DTW algorithm adopted by the invention is well suited to dynamic arm action recognition for train drivers. The arm action score, obtained by analysing the DTW distance between the test gesture sequence and the standard gesture sequences in the sample library, effectively improves the recognition degree and accuracy of the algorithm.
Table 3 compares the classical HMM algorithm with the method of the present invention on the 4 arm dynamic gestures of the train driver. As can be seen from Table 3, the average recognition rate of the DTW algorithm is 4.3% higher than that of the HMM algorithm. Because the length of the driver's dynamic gesture sequence changes continually, the DTW algorithm of the invention solves the matching problem of motion sequences of different lengths by dynamic programming. Therefore, compared with the HMM algorithm, the DTW algorithm is better suited to the dynamic arm action recognition problem of train drivers.
TABLE 3
Meanwhile, according to the simulation experiment results, the gesture action recognition system provided by the application is reliable and stable across multiple tests, runs at up to 25 frames per second, and is well suited to gesture recognition and standardness evaluation for train drivers.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk (Hard Disk Drive, abbreviated as HDD) or a Solid State Drive (SSD), etc.; the storage medium may also comprise a combination of memories of the kind described above.
Claims (8)
1. A dynamic gesture recognition method, characterized by comprising: step S1, determining the position of the palm node: the position coordinates of all white pixel points in a circle centred on the palm node, with radius equal to the distance r between the palm node and the wrist node, are averaged, and this average represents the palm node position, so that the palm node position (x_p, y_p) is:

x_p = (1/T) Σ_{i=1}^{T} x_i,   y_p = (1/T) Σ_{i=1}^{T} y_i

In the formula, T represents the number of white pixel points in the circle, x_i represents the abscissa of the i-th white pixel point, and y_i represents the ordinate of the i-th white pixel point;
step S2, after finding the position of the palm node, searching gesture pixel points, and segmenting the gesture by judging the difference between the distance from the palm node to the Kinect and the distance from pixel points in the surrounding area to the Kinect;
when the dynamic gesture recognition object is a palm gesture,
recognizing the segmented gesture image by adopting a support vector machine (SVM) algorithm and evaluating the gesture standardness; the SVM classification result is the confidence between the test gesture and the standard gesture, and can be used as an evaluation criterion of palm gesture standardness, as shown in the formula:

Score = round( (1/T) Σ_{i=1}^{T} s_i )

In the formula, round(·) represents rounding to an integer, T is the total number of frames in the dynamic gesture sequence, and s_i represents the SVM output result for the gesture image of the i-th frame.
2. The method of claim 1, wherein the dynamic gesture recognition further comprises recognition of arm actions, specifically: extracting key skeleton node coordinate data from the data acquired by the Kinect sensor; the coordinate of the key skeleton node is P_s = (x_s, y_s, z_s), and the coordinates of the remaining arm skeleton nodes are P_i = (x_i, y_i, z_i), i = 1, 2, 3, 4, so the distance between node P_i and key skeleton node P_s is:

D_si = sqrt( (x_i − x_s)² + (y_i − y_s)² + (z_i − z_s)² )

Within a certain time T, the motion sequence of an arm skeleton node is represented as (D_si^1, D_si^2, ..., D_si^T), i = 1, 2, 3, 4, and according to the arm skeleton node motion sequences, optimal matching of dynamic gestures can be performed with the DTW dynamic time warping algorithm.
3. The method of claim 2, wherein the key bone node is specifically one of: palm, wrist, elbow, shoulder center.
4. The method according to any one of claims 1-3, wherein the searching for gesture pixels in S2 specifically comprises:
the method comprises the following steps of searching gesture pixel points in a large rectangular area with a palm node as a center, and the method comprises the following steps:
Let the distance from the skeleton palm node extracted by the Kinect to the Kinect camera be d_p, the position of the palm node be (x_p, y_p, d_p), and the position of the wrist node be (x_r, y_r, d_r); the gesture pixel point search is performed within a rectangular pixel area centred on the palm node with width w and height h,
Let S_0 denote the initial set of gesture pixel points and d_ij denote the distance from the pixel g_ij in row i, column j of the rectangular area to the Kinect camera; then:

S_k = S_{k−1} ∪ { g_ij | abs(d_p − d_ij) < threshold }

where k represents the number of search iterations, threshold represents the threshold on the difference between the distance from the palm node to the Kinect and the distance from gesture pixel points in the rectangular area to the Kinect, abs(d_p − d_ij) represents the absolute value of the distance difference between the palm node and the gesture pixel area, and S_k represents the finally detected set of gesture pixel points.
5. The method of claim 3, wherein the DTW algorithm specifically comprises: the time sequence of the standard dynamic gesture sample is X = (D_s^1, D_s^2, ..., D_s^m) and the test gesture time sequence is Y = (D_t^1, D_t^2, ..., D_t^n); let the point-pair relationship between the two sequences be φ(k) = (φ_s(k), φ_t(k)), where 1 ≤ φ_s(k) ≤ m, 1 ≤ φ_t(k) ≤ n, max(m, n) ≤ k ≤ m + n; find the optimal point-pair relation φ(k) between the two sequences such that the sum of the distances between corresponding points is minimum, expressed as:

DTW(X, Y) = min_φ Σ_k d( D_s^{φ_s(k)}, D_t^{φ_t(k)} )
and recognizing the input dynamic gesture according to the obtained DTW distance to obtain a gesture recognition result.
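The optimal point-pair matching of claim 5 is the standard DTW dynamic program. Below is a minimal Python sketch; the function name and the Euclidean local distance d(·,·) are illustrative assumptions, since the patent does not fix the local metric in this text.

```python
import numpy as np

def dtw_distance(X, Y):
    """DTW distance between sequences X (length m) and Y (length n).

    D[i, j] holds the minimum cumulative cost of aligning the first i
    points of X with the first j points of Y; the answer is D[m, n].
    """
    m, n = len(X), len(Y)
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Local cost: Euclidean distance between the two frames.
            cost = np.linalg.norm(
                np.asarray(X[i - 1], dtype=float) - np.asarray(Y[j - 1], dtype=float)
            )
            # Extend the cheapest of the three admissible predecessors.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[m, n]
```

The quadratic table makes the m + n bound on the warping-path length from claim 5 concrete: any monotone path from D[0, 0] to D[m, n] takes at most m + n steps.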
6. The method of claim 5, wherein the size of the rectangular area is greater than 3 times the distance between the palm node and the wrist node.
7. The method of claim 6, wherein the gesture recognition result corresponds to the sample class with the smallest DTW distance in the standard dynamic gesture library, expressed as:
O = argmin_i D_DTW(X_i, Y)
where X_i is a standard dynamic gesture sample, Y is the input dynamic gesture, i is the dynamic gesture sample category, and O is the finally recognized dynamic gesture category;
the arm dynamic gesture score is measured by the DTW distance: the closer the test gesture sequence is to the standard gesture sample sequence, the smaller the DTW distance, and the arm dynamic gesture score is expressed as:
score = (1 - (1/(N·α)) Σ_{i=1}^{N} D_DTW(X_i, Y)) × 100
where X_i denotes a standard dynamic gesture sequence, Y is the test gesture sequence, N is the number of standard gesture sequence samples, and α is the average DTW distance between the standard gesture sequence samples.
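Claim 7's classification rule (nearest template by DTW distance) plus a distance-based score can be sketched as below. The score normalization by alpha, the mean inter-template DTW distance, is an assumption: the patent's exact score formula is not fully recoverable from this text. `dtw_distance` is repeated here so the sketch is self-contained.

```python
import numpy as np

def dtw_distance(X, Y):
    # Standard DTW dynamic program with Euclidean local cost.
    m, n = len(X), len(Y)
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = np.linalg.norm(
                np.asarray(X[i - 1], dtype=float) - np.asarray(Y[j - 1], dtype=float)
            )
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[m, n]

def classify_and_score(templates, Y, alpha):
    """Recognize Y as the template class with minimum DTW distance
    (the argmin of claim 7) and attach a score in [0, 100] that shrinks
    as the DTW distance grows.  `alpha` normalizes the distance; using
    the mean inter-template DTW distance for it is an assumption."""
    dists = {label: dtw_distance(X, Y) for label, X in templates.items()}
    best = min(dists, key=dists.get)
    score = max(0.0, 1.0 - dists[best] / alpha) * 100.0
    return best, score
```

An exact match gets score 100; a test sequence whose best DTW distance reaches alpha (or beyond) gets score 0.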
8. A computer storage medium having stored thereon a computer program which, when executed by a processor, performs the method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910816005.4A CN110717385A (en) | 2019-08-30 | 2019-08-30 | Dynamic gesture recognition method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110717385A true CN110717385A (en) | 2020-01-21 |
Family
ID=69210196
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910816005.4A Pending CN110717385A (en) | 2019-08-30 | 2019-08-30 | Dynamic gesture recognition method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110717385A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120068917A1 (en) * | 2010-09-17 | 2012-03-22 | Sony Corporation | System and method for dynamic gesture recognition using geometric classification |
CN103455794A (en) * | 2013-08-23 | 2013-12-18 | 济南大学 | Dynamic gesture recognition method based on frame fusion technology |
CN107169411A (en) * | 2017-04-07 | 2017-09-15 | 南京邮电大学 | A kind of real-time dynamic gesture identification method based on key frame and boundary constraint DTW |
CN107463873A (en) * | 2017-06-30 | 2017-12-12 | 长安大学 | A kind of real-time gesture analysis and evaluation methods and system based on RGBD depth transducers |
CN108664877A (en) * | 2018-03-09 | 2018-10-16 | 北京理工大学 | A kind of dynamic gesture identification method based on range data |
Non-Patent Citations (2)
Title |
---|
HE CHAO; HU ZHANGFANG; WANG YAN: "A dynamic gesture recognition method based on an improved DTW algorithm", Digital Communication, no. 03, 25 June 2013 (2013-06-25) *
GUO XIAOLI; YANG TINGTING; ZHANG YACHAO: "Dynamic gesture recognition based on Kinect depth information", Journal of Northeast Electric Power University, no. 02, 15 April 2016 (2016-04-15) *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111401166A (en) * | 2020-03-06 | 2020-07-10 | 中国科学技术大学 | Robust gesture recognition method based on electromyographic information decoding |
CN111723688A (en) * | 2020-06-02 | 2020-09-29 | 北京的卢深视科技有限公司 | Human body action recognition result evaluation method and device and electronic equipment |
CN111723688B (en) * | 2020-06-02 | 2024-03-12 | 合肥的卢深视科技有限公司 | Human body action recognition result evaluation method and device and electronic equipment |
CN113283314A (en) * | 2021-05-11 | 2021-08-20 | 桂林电子科技大学 | Unmanned aerial vehicle night search and rescue method based on YOLOv3 and gesture recognition |
CN114705236A (en) * | 2022-02-11 | 2022-07-05 | 清华大学深圳国际研究生院 | Thermal comfort degree measuring method and device, air conditioner control method and device and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5845365B2 (en) | Improvements in or related to 3D proximity interaction | |
CN110717385A (en) | Dynamic gesture recognition method | |
Cai et al. | A scalable approach for understanding the visual structures of hand grasps | |
CN108268838B (en) | Facial expression recognition method and facial expression recognition system | |
KR20200111617A (en) | Gesture recognition method, device, electronic device, and storage medium | |
US20160171293A1 (en) | Gesture tracking and classification | |
TW201926140A (en) | Method, electronic device and non-transitory computer readable storage medium for image annotation | |
Wachs et al. | A real-time hand gesture interface for medical visualization applications | |
CN106648078B (en) | Multi-mode interaction method and system applied to intelligent robot | |
Kalsh et al. | Sign language recognition system | |
CN112114675B (en) | Gesture control-based non-contact elevator keyboard using method | |
KR101559502B1 (en) | Method and recording medium for contactless input interface with real-time hand pose recognition | |
JP2016014954A (en) | Method for detecting finger shape, program thereof, storage medium of program thereof, and system for detecting finger shape | |
TW202201275A (en) | Device and method for scoring hand work motion and storage medium | |
Weber et al. | Distilling location proposals of unknown objects through gaze information for human-robot interaction | |
KR20120089948A (en) | Real-time gesture recognition using mhi shape information | |
Ghadhban et al. | Segments interpolation extractor for finding the best fit line in Arabic offline handwriting recognition words | |
Huang et al. | Real-time automated detection of older adults' hand gestures in home and clinical settings | |
Li et al. | Recognizing hand gestures using the weighted elastic graph matching (WEGM) method | |
Tara et al. | Sign language recognition in robot teleoperation using centroid distance Fourier descriptors | |
Oszust et al. | Isolated sign language recognition with depth cameras | |
Abdulghani et al. | Discover human poses similarity and action recognition based on machine learning | |
Li et al. | A novel art gesture recognition model based on two channel region-based convolution neural network for explainable human-computer interaction understanding | |
Rubin Bose et al. | In-situ identification and recognition of multi-hand gestures using optimized deep residual network | |
Shitole et al. | Dynamic hand gesture recognition using PCA, Pruning and ANN |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||