CN110889387A - Real-time dynamic gesture recognition method based on multi-track matching - Google Patents
- Publication number
- CN110889387A CN110889387A CN201911215465.8A CN201911215465A CN110889387A CN 110889387 A CN110889387 A CN 110889387A CN 201911215465 A CN201911215465 A CN 201911215465A CN 110889387 A CN110889387 A CN 110889387A
- Authority
- CN
- China
- Prior art keywords
- gesture recognition
- real
- image
- method based
- recognition method
- Prior art date
- Legal status: Pending (an assumed status, not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Abstract
The invention relates to a real-time dynamic gesture recognition method based on multi-track matching. A positive sample containing all fingertip points is obtained by a convolutional neural network built on a FAST corner detection algorithm; the positive sample is clustered, based on the obtained fingertip points and another unprocessed video image, to obtain the minimum point set in one frame; the fingertips in two frames are matched by a global nearest neighbor matching algorithm; and finally an LSTM neural network performs multi-track classification to dynamically recognize the gesture. By combining CNN with the FAST corner detection algorithm, the invention can quickly detect the positions of fingertip points; by fusing the asymmetric point set matching algorithm with LSTM, dynamic gestures are classified with high robustness, so the gesture recognition efficiency is high and the recognition effect is good.
Description
Technical Field
The present invention relates to the technical field of computing, calculating and counting, and in particular to a real-time dynamic gesture recognition method based on multi-track matching in the fields of human-computer interaction and computer vision.
Background
In the field of human-computer interaction, gesture interaction is one of the most common and important interaction forms, and gesture recognition is a crucial link within it: it directly determines the accuracy and robustness of gesture interaction.
Currently, from the perspective of whether a gesture carries a temporal meaning, gesture recognition can be divided into static gesture recognition and dynamic gesture recognition. Static gesture recognition aims at classifying the gesture in a single frame through its morphological characteristics, while dynamic gesture recognition classifies gesture actions over a time sequence. In practical applications, gestures almost always carry temporal meaning, such as rotation and grabbing, so dynamic gesture recognition has more practical significance.
A monocular camera or a sensor can be used for information acquisition in dynamic gesture recognition. Common sensors for acquiring gesture information include Kinect, Leap Motion, and the like. Although such sensors can acquire rich characteristics such as depth information and joint information of the hand, they generally suffer from high price, a large amount of computation, and inapplicability to mobile terminals. Gesture recognition with only a monocular camera avoids these problems well: the cost is low, it can be widely applied to various mobile platforms, and no other hardware support is needed. However, a monocular camera only provides RGB information, cannot directly segment gestures, and is easily affected by noise.
In the prior art, the method for performing dynamic gesture recognition through a monocular camera mainly includes two methods:
(1) training frames in an interval as a group through 3D convolution to achieve the purpose of classifying actions in a period of time, wherein the method has the defect that long-time gestures are difficult to recognize;
(2) after the gesture is segmented through the color space, the mass center of the hand is tracked, and the track is predicted by using the DTW or the HMM.
Disclosure of Invention
The invention solves the problems in the prior art, provides an optimized real-time dynamic gesture recognition method based on multi-track matching, can dynamically recognize gestures in real time, and has high robustness.
The invention adopts the technical scheme that a real-time dynamic gesture recognition method based on multi-track matching comprises the following steps:
step 1: acquiring a video stream;
step 2: copying a video image, and obtaining a hand area image by dividing the video image;
step 3: constructing a convolutional neural network based on a FAST corner detection algorithm and acquiring a positive sample containing all the fingertip points;
step 4: clustering based on the obtained positive sample of the fingertip points and another unprocessed video image;
step 5: matching the fingertips in the two frames through a global nearest neighbor matching algorithm;
step 6: carrying out multi-track classification by using an LSTM neural network and dynamically identifying the gesture.
Preferably, the step 2 comprises the steps of:
step 2.1: compressing each frame of video image of the video stream to a preset resolution;
step 2.2: converting the compressed video images from an RGB color space to a YCrCb color space in sequence;
step 2.3: and taking the Cb component and the Cr component, and segmenting to obtain a hand region image.
Preferably, the step 3 comprises the steps of:
step 3.1: acquiring the hand region image segmented in the step 2;
step 3.2: detecting all corners of the hand region image by using a FAST corner detection algorithm, and cutting the original image by taking each corner as a center to obtain a plurality of image slices;
step 3.3: and constructing a lightweight convolutional neural network, classifying the image slices, and if the classification probability is greater than or equal to 50%, determining the image slices as positive samples, otherwise, determining the image slices as negative samples.
Preferably, in step 3.2, the image slices are 32 × 32 pixels.
Preferably, in the step 3.3, the lightweight convolutional neural network comprises four groups of sub-blocks connected in sequence, wherein each sub-block comprises a depthwise convolution layer and a max pooling layer, and the depthwise convolution layers of the second to fourth groups of sub-blocks are 1 × 1 convolutional layers; the fourth group of sub-blocks is followed in sequence by a 1 × 1 depthwise convolution layer, a global mean pooling layer and a fully-connected layer; the image slices are input through the depthwise convolution layer of the first group of sub-blocks, and the classification probability is output from the fully-connected layer.
Preferably, the step 4 comprises the steps of:
step 4.1: traversing all the positive samples obtained in step 3 and constructing a set C, wherein any positive sample r_i has the coordinates (x_i, y_i), D_1 is a distance threshold, and i and j respectively index different positive samples;
step 4.2: constructing a set T for keeping the scores of all the elements in the set C and updating the set T, wherein n is the length of the set T;
step 4.3: reordering the elements in the set T in descending order, and correspondingly modifying the ordering of the elements in the set C;
step 4.4: letting the set R be the minimum point set after clustering that contains all fingertips.
preferably, the step 5 comprises the steps of:
step 5.1: constructing, for two frames of images, the two minimum point sets A and B obtained after step 4 and their distance matrix D, wherein D_{i,j} = ||A_i - B_j||_2, 0 <= i <= the number of elements in the point set A, and 0 <= j <= the number of elements in the point set B; taking col and row as the number of columns and the number of rows of D, respectively;
step 5.2: if col > row, transposing D and carrying out the next step; otherwise, directly carrying out the next step;
step 5.3: constructing sets seq, Dis, gloseq and glodis for storing temporary variables;
step 5.4: starting the search from D_{i,j}, adding i to seq, letting Dis_i = Dis_i + D_{i,j}, and adding seq to gloseq;
step 5.5: letting j = j + 1 and Dis_i = Dis_i - D_{i,j}; if j = row - 1, carrying out the next step, otherwise returning to step 5.4;
step 5.6: obtaining the minimum element glodis_i in glodis and taking glodis_i as the optimal solution of the point matching.
Preferably, in step 6, after the matching solution between two frames is obtained, the direction angle of each pair of corresponding points in the two frames is calculated and input as one unit of the LSTM sequence.
Preferably, the direction angle is encoded, mapping it from (0°, 360°] to an integer from 1 to 12.
Preferably, if the corresponding points of the two frames cannot be matched, the point pair which cannot be matched is marked as "-1".
The invention relates to an optimized real-time dynamic gesture recognition method based on multi-track matching, which comprises the steps of obtaining video streams, copying video images, obtaining hand region images by dividing one video image, constructing a convolutional neural network based on a FAST corner detection algorithm, obtaining a positive sample containing all fingertip points, clustering the positive sample based on the obtained fingertip points and the other unprocessed video image to obtain a minimum point set in one frame, matching fingertips in two frames by a global nearest neighbor point matching algorithm, and finally performing multi-track classification by using an LSTM neural network to dynamically recognize gestures.
The method combines the corner detection algorithm of CNN and FAST, can quickly detect the position of the fingertip point, and simultaneously integrates the asymmetric point set matching algorithm and the LSTM, thereby realizing high-robustness classification of the dynamic gesture, and having high gesture recognition efficiency and good recognition effect.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a schematic diagram of a lightweight convolutional neural network according to the present invention;
FIG. 3 is a schematic diagram of clustering positive samples of a lightweight convolutional neural network according to the present invention, where dots at the cusps are the clustered positive sample points;
fig. 4 shows fingertip points matched between different frames, where the position each marker points to is the matched position.
Detailed Description
The present invention is described in further detail with reference to the following examples, but the scope of the present invention is not limited thereto.
The invention relates to a real-time dynamic gesture recognition method based on multi-track matching, which combines a FAST corner detection algorithm with an improved deep CNN, uses a new global nearest neighbor matching algorithm to match the fingertips in two frames of images so that the fingertips form tracks, and finally uses an LSTM to classify the multiple matched tracks to obtain the dynamic gesture recognition result.
The method comprises the following steps.
Step 1: a video stream is acquired.
Step 2: and copying the video image, and segmenting the video image to obtain a hand area image.
The step 2 comprises the following steps:
step 2.1: compressing each frame of video image of the video stream to a preset resolution;
step 2.2: converting the compressed video images from an RGB color space to a YCrCb color space in sequence;
step 2.3: and taking the Cb component and the Cr component, and segmenting to obtain a hand region image.
In the invention, the human skin color has strong aggregation property on the Cb component and the Cr component of YCrCb, so that the non-skin color area can be effectively removed by utilizing the Cb component and the Cr component.
In the present invention, specifically, the Cb component is a blue chrominance component and the Cr component is a red chrominance component.
In the invention, before processing, the resolution of the image can be compressed to 480 × 640, and the effectiveness of segmentation is ensured.
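The colour-space segmentation of steps 2.1 to 2.3 can be sketched as below. The RGB-to-YCrCb conversion follows the standard ITU-R BT.601 full-range formula; the Cr/Cb threshold window is an assumption (a commonly used skin-colour range), since the patent does not state numeric thresholds.

```python
import numpy as np

def rgb_to_ycrcb(img):
    # ITU-R BT.601 full-range RGB -> YCrCb for 8-bit images
    img = img.astype(np.float64)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128.0
    cb = (b - y) * 0.564 + 128.0
    return np.stack([y, cr, cb], axis=-1)

def skin_mask(img, cr_range=(133, 173), cb_range=(77, 127)):
    # Keep only pixels whose Cr and Cb components fall in the
    # (assumed) skin-colour window; Y is ignored, as the method
    # segments on the two chrominance components only.
    ycrcb = rgb_to_ycrcb(img)
    cr, cb = ycrcb[..., 1], ycrcb[..., 2]
    return ((cr >= cr_range[0]) & (cr <= cr_range[1]) &
            (cb >= cb_range[0]) & (cb <= cb_range[1]))

# One skin-like pixel and one pure-blue pixel
frame = np.array([[[210, 150, 120], [0, 0, 255]]], dtype=np.uint8)
mask = skin_mask(frame)  # [[True, False]]
```

In practice the mask would be applied to the 480 × 640 compressed frame to blank out non-skin regions before corner detection.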
Step 3: constructing a convolutional neural network based on a FAST corner detection algorithm and acquiring a positive sample containing all the fingertip points.
The step 3 comprises the following steps:
step 3.1: acquiring the hand region image segmented in the step 2;
step 3.2: detecting all corners of the hand region image by using a FAST corner detection algorithm, and cutting the original image by taking each corner as a center to obtain a plurality of image slices;
in said step 3.2, the image slice is 32 x 32 pixels.
Step 3.3: and constructing a lightweight convolutional neural network, classifying the image slices, and if the classification probability is greater than or equal to 50%, determining the image slices as positive samples, otherwise, determining the image slices as negative samples.
In the step 3.3, the lightweight convolutional neural network comprises four groups of sub-blocks connected in sequence, wherein each sub-block comprises a depthwise convolution layer and a max pooling layer, and the depthwise convolution layers of the second to fourth groups of sub-blocks are 1 × 1 convolutional layers; the fourth group of sub-blocks is followed in sequence by a 1 × 1 depthwise convolution layer, a global mean pooling layer and a fully-connected layer; the image slices are input through the depthwise convolution layer of the first group of sub-blocks, and the classification probability is output from the fully-connected layer.
In the invention, the corner is a category of characteristic points, the FAST corner detection algorithm is the prior art in the field, and the person skilled in the art can detect the corner according to the requirement.
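The slicing of step 3.2 can be sketched as follows. The corner coordinates here are placeholders standing in for the FAST detector's output; the crop window is clamped at image borders so every slice keeps the full 32 × 32 size stated in step 3.2.

```python
import numpy as np

def crop_slices(image, corners, size=32):
    # Cut a fixed-size slice centred on each detected corner (x, y).
    half = size // 2
    h, w = image.shape[:2]
    slices = []
    for (x, y) in corners:
        # Clamp the window so corners near the border still yield
        # a full-size slice rather than a truncated one.
        x0 = min(max(x - half, 0), w - size)
        y0 = min(max(y - half, 0), h - size)
        slices.append(image[y0:y0 + size, x0:x0 + size])
    return slices

img = np.zeros((480, 640), dtype=np.uint8)          # segmented hand image
patches = crop_slices(img, [(5, 5), (320, 240), (639, 479)])
```

Each patch would then be scored by the lightweight CNN of step 3.3 to decide whether its centre corner is a fingertip candidate.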
In the present invention, the positive samples are all candidate points that may be fingertips, the negative samples are points that are not necessarily fingertips, and the positive samples obtained in step 3 include fingertips and interference points near the fingertips, which need to be excluded in step 4.
In the invention, the lightweight convolutional neural network is constructed on the assumption that the cross-channel correlation mapping and the spatial correlation mapping in a feature map are completely decoupled, which greatly reduces the number of parameters of the network. To further reduce the parameters, depthwise separable convolution is not used directly; instead, a max pooling layer is first added after the depthwise convolution layer, and the number of channels is then expanded with a 1 × 1 convolution. In addition, multiple fully-connected layers are not adopted; a global mean pooling layer followed by a single fully-connected layer is used instead, which weakens overfitting of the network and reduces the parameter count of the fully-connected layer.
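A dimension walk-through of the four-block structure can make this concrete. Only the block structure (depthwise convolution + max pooling repeated four times, then global mean pooling and one fully-connected layer) comes from the description; the assumption here is 'same'-padded convolutions and 2 × 2 pooling, which the patent does not state explicitly.

```python
def feature_map_sizes(input_size=32, blocks=4):
    # Trace the spatial size of a 32x32 slice through the four
    # sub-blocks: a 'same'-padded depthwise conv keeps the size,
    # and each 2x2 max pool halves it.
    sizes = [input_size]
    s = input_size
    for _ in range(blocks):
        s //= 2
        sizes.append(s)
    return sizes

sizes = feature_map_sizes()
# 32 -> 16 -> 8 -> 4 -> 2; the global mean pooling layer then
# collapses the final 2x2 map to 1x1 before the fully-connected layer.
```

This shows why a single fully-connected layer suffices: after global mean pooling there is only one value per channel left to classify.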
Step 4: clustering is performed based on the positive sample of the acquired fingertip points and another unprocessed video image.
The step 4 comprises the following steps:
step 4.1: traversing all the positive samples obtained in step 3 and constructing a set C, wherein any positive sample r_i has the coordinates (x_i, y_i), D_1 is a distance threshold, and i and j respectively index different positive samples;
step 4.2: constructing a set T for keeping the scores of all the elements in the set C and updating the set T, wherein n is the length of the set T;
step 4.3: reordering the elements in the set T in descending order, and correspondingly modifying the ordering of the elements in the set C;
step 4.4: letting the set R be the minimum point set after clustering that contains all fingertips.
in the invention, because the corners obtained by the FAST corner detection algorithm are gathered at the fingertip points and the areas nearby the fingertip points, a plurality of fingertip points can be detected, and the CNN cannot eliminate redundant points, the points need to be combined by the clustering algorithm.
In the present invention, the formula of step 4.1 means: when the condition is satisfied, an element in the set is replaced; when the condition is not satisfied, the element is added to the set. At the outset the set is empty, i.e. {c_j} = ∅. After traversing to the first positive sample point, c_j satisfies the condition and is replaced by r_i; when subsequent points are traversed, whether to add a new point to the set is judged according to the condition. The formula of step 4.2 works in the same way.
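The clustering of step 4 can be sketched in simplified form: positive-sample points within D_1 of an existing cluster representative are merged into it (the representative's score is incremented), and farther points open a new cluster; the representatives ordered by descending score approximate the minimal fingertip set R. This omits the patent's exact set-T bookkeeping and the replacement rule of step 4.1, and the D_1 value is an illustrative assumption.

```python
import math

def cluster_positives(points, d1=20.0):
    reps = []    # cluster representatives (the set C)
    scores = []  # votes per representative (the set T)
    for (x, y) in points:
        for k, (cx, cy) in enumerate(reps):
            if math.hypot(x - cx, y - cy) <= d1:
                scores[k] += 1   # point merged into existing cluster
                break
        else:
            reps.append((x, y))  # point opens a new cluster
            scores.append(1)
    # order representatives by descending score (cf. step 4.3)
    order = sorted(range(len(reps)), key=lambda k: -scores[k])
    return [reps[k] for k in order]

# Three noisy detections around one fingertip, two around another
pts = [(100, 100), (103, 98), (101, 102), (300, 200), (302, 201)]
centers = cluster_positives(pts)  # [(100, 100), (300, 200)]
```

The redundant near-fingertip detections produced by FAST thus collapse to one representative point per fingertip.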
Step 5: matching the fingertips in the two frames by a global nearest neighbor matching algorithm.
The step 5 comprises the following steps:
step 5.1: constructing the two minimum point sets A and B obtained after step 4 for the two frames of images and their distance matrix D, wherein D_{i,j} = ||A_i - B_j||_2, 0 <= i <= the number of elements in the point set A, and 0 <= j <= the number of elements in the point set B; taking col and row as the number of columns and the number of rows of D, respectively;
step 5.2: if col > row, transposing D and carrying out the next step; otherwise, directly carrying out the next step;
step 5.3: constructing sets seq, Dis, gloseq and glodis for storing temporary variables;
step 5.4: starting the search from D_{i,j}, adding i to seq, letting Dis_i = Dis_i + D_{i,j}, and adding seq to gloseq;
step 5.5: letting j = j + 1 and Dis_i = Dis_i - D_{i,j}; if j = row - 1, carrying out the next step, otherwise returning to step 5.4;
step 5.6: obtaining the minimum element glodis_i in glodis and taking glodis_i as the optimal solution of the point matching.
In the invention, step 4 obtains the minimum point set within one frame, and step 5 matches the minimum point sets of two frames. The main difficulty in combining multiple points into multiple tracks is the asymmetric matching between the two point sets, i.e. the two point sets generally do not contain the same number of elements. If the fingertips in two independent frames are to be matched into tracks, it must be considered that fingertips cannot be detected with 100% accuracy and misrecognition is inevitable; the global nearest neighbor matching algorithm is adopted to solve this problem.
In the present invention, D is the matrix of Euclidean distances between all points of the point sets A and B.
In the present invention, step 5.2 ensures that the number of columns of D is greater than or equal to the number of rows; if not, D is transposed.
In the present invention, starting the search from D_{i,j} means starting the search with both i and j equal to 0, and incrementing the corresponding subscript by one each time a cycle is completed.
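The goal of steps 5.1 to 5.6 can be stated equivalently as: find the assignment of the smaller point set onto the larger one that minimises the total Euclidean distance. The sketch below finds that optimum by brute-force enumeration rather than by traversing the distance matrix as the patent does; it is only practical for the handful of fingertips involved, and the point coordinates are illustrative.

```python
import math
from itertools import permutations

def match_point_sets(a, b):
    # Asymmetric global nearest-neighbour matching: assign every point
    # of the smaller set to a distinct point of the larger set so that
    # the summed Euclidean distance is minimal.
    if len(a) > len(b):  # cf. step 5.2: orient so that a is the smaller set
        a, b = b, a
    dist = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in b] for p in a]
    best_cost, best = float('inf'), None
    for perm in permutations(range(len(b)), len(a)):
        cost = sum(dist[i][j] for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best = cost, perm
    return best, best_cost

prev = [(10, 10), (50, 50)]                 # fingertips in frame t-1
curr = [(52, 48), (11, 9), (200, 200)]      # fingertips in frame t (one extra)
assignment, cost = match_point_sets(prev, curr)
# assignment == (1, 0): prev[0] pairs with curr[1], prev[1] with curr[0]
```

The extra point in the larger set (here a spurious detection at (200, 200)) is simply left unmatched, which is exactly the asymmetric case the method is designed to handle.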
Step 6: and carrying out multi-track classification by using an LSTM neural network, and dynamically identifying the gesture.
In step 6, after the matching solution between two frames is obtained, the direction angle of each pair of corresponding points in the two frames is calculated and input as one unit of the LSTM sequence.
The direction angle is encoded and mapped from (0°, 360°] to an integer from 1 to 12.
And if the corresponding points of the two frames cannot be matched, marking the point pairs which cannot be matched as '-1'.
In the invention, after the matching solution between two frames is obtained, the direction angles of the corresponding points of the two frames are calculated to form the input data of the LSTM. However, since the fingertips between two frames are not always completely matched, a track can be discontinuous. To enable the LSTM to recognize a dynamic gesture from a sparse track, a point pair that cannot be matched is marked as "-1", while the direction angle of each matchable point pair is calculated and input as one unit of the LSTM sequence.
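The encoding described above can be sketched directly: the angle of the displacement between two matched fingertip positions is mapped from (0°, 360°] onto the integers 1 to 12 (30° bins), and an unmatched pair yields -1. The use of atan2 on the displacement vector is an assumption about how the direction angle is computed; the 12-bin mapping and the -1 marker are as the description states.

```python
import math

def encode_direction(p_prev, p_curr):
    # One LSTM sequence unit for a single fingertip between two frames.
    if p_prev is None or p_curr is None:
        return -1                            # unmatched point pair
    dx, dy = p_curr[0] - p_prev[0], p_curr[1] - p_prev[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    if angle == 0.0:                         # interval is (0, 360], so 0 -> 360
        angle = 360.0
    return math.ceil(angle / 30.0)           # 30-degree bins: 1..12

codes = [encode_direction((0, 0), (1, 1)),   # 45 degrees  -> bin 2
         encode_direction((0, 0), (1, 0)),   # 360 degrees -> bin 12
         encode_direction(None, (5, 5))]     # unmatched   -> -1
```

A gesture track thus becomes a short integer sequence, which keeps the LSTM input compact and tolerant of missed detections.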
The method comprises the steps of obtaining video streams, copying video images, obtaining hand region images by dividing one video image, constructing a convolutional neural network based on a FAST corner detection algorithm, obtaining a positive sample containing all fingertip points, clustering the obtained positive sample of the fingertip points and another unprocessed video image to obtain a minimum point set in one frame, matching fingertips in two frames by a global nearest neighbor point matching algorithm, and finally performing multi-track classification by using an LSTM neural network to dynamically identify gestures.
The method combines the corner detection algorithm of CNN and FAST, can quickly detect the position of the fingertip point, and simultaneously integrates the asymmetric point set matching algorithm and the LSTM, thereby realizing high-robustness classification of the dynamic gesture, and having high gesture recognition efficiency and good recognition effect.
Claims (10)
1. A real-time dynamic gesture recognition method based on multi-track matching is characterized by comprising the following steps: the method comprises the following steps:
step 1: acquiring a video stream;
step 2: copying a video image, and obtaining a hand area image by dividing the video image;
step 3: constructing a convolutional neural network based on a FAST corner detection algorithm and acquiring a positive sample containing all the fingertip points;
step 4: clustering based on the obtained positive sample of the fingertip points and another unprocessed video image;
step 5: matching the fingertips in the two frames through a global nearest neighbor matching algorithm;
step 6: carrying out multi-track classification by using an LSTM neural network and dynamically identifying the gesture.
2. The real-time dynamic gesture recognition method based on multi-track matching according to claim 1, characterized in that: the step 2 comprises the following steps:
step 2.1: compressing each frame of video image of the video stream to a preset resolution;
step 2.2: converting the compressed video images from an RGB color space to a YCrCb color space in sequence;
step 2.3: and taking the Cb component and the Cr component, and segmenting to obtain a hand region image.
3. The real-time dynamic gesture recognition method based on multi-track matching according to claim 1, characterized in that: the step 3 comprises the following steps:
step 3.1: acquiring the hand region image segmented in the step 2;
step 3.2: detecting all corners of the hand region image by using a FAST corner detection algorithm, and cutting the original image by taking each corner as a center to obtain a plurality of image slices;
step 3.3: and constructing a lightweight convolutional neural network, classifying the image slices, and if the classification probability is greater than or equal to 50%, determining the image slices as positive samples, otherwise, determining the image slices as negative samples.
4. The real-time dynamic gesture recognition method based on multi-track matching according to claim 3, characterized in that: in said step 3.2, the image slice is 32 x 32 pixels.
5. The real-time dynamic gesture recognition method based on multi-track matching according to claim 3, characterized in that: in the step 3.3, the lightweight convolutional neural network comprises four groups of sub-blocks connected in sequence, wherein each sub-block comprises a depthwise convolution layer and a max pooling layer, and the depthwise convolution layers of the second to fourth groups of sub-blocks are 1 × 1 convolutional layers; the fourth group of sub-blocks is followed in sequence by a 1 × 1 depthwise convolution layer, a global mean pooling layer and a fully-connected layer; the image slices are input through the depthwise convolution layer of the first group of sub-blocks, and the classification probability is output from the fully-connected layer.
6. The real-time dynamic gesture recognition method based on multi-track matching according to claim 1, characterized in that: the step 4 comprises the following steps:
step 4.1: traversing all the positive samples obtained in step 3 and constructing a set C, wherein any positive sample r_i has the coordinates (x_i, y_i), D_1 is a distance threshold, and i and j respectively index different positive samples;
step 4.2: constructing a set T for keeping the scores of all the elements in the set C and updating the set T, wherein n is the length of the set T;
step 4.3: reordering the elements in the set T in descending order, and correspondingly modifying the ordering of the elements in the set C;
7. the real-time dynamic gesture recognition method based on multi-track matching according to claim 1, characterized in that: the step 5 comprises the following steps:
step 5.1: constructing the two minimum point sets A and B obtained after step 4 for the two frames of images and their distance matrix D, wherein D_{i,j} = ||A_i - B_j||_2, 0 <= i <= the number of elements in the point set A, and 0 <= j <= the number of elements in the point set B; taking col and row as the number of columns and the number of rows of D, respectively;
step 5.2: if col > row, transposing D and carrying out the next step; otherwise, directly carrying out the next step;
step 5.3: constructing sets seq, Dis, gloseq and glodis for storing temporary variables;
step 5.4: starting the search from D_{i,j}, adding i to seq, letting Dis_i = Dis_i + D_{i,j}, and adding seq to gloseq;
step 5.5: letting j = j + 1 and Dis_i = Dis_i - D_{i,j}; if j = row - 1, carrying out the next step, otherwise returning to step 5.4;
step 5.6: obtaining the minimum element glodis_i in glodis and taking glodis_i as the optimal solution of the point matching.
8. The real-time dynamic gesture recognition method based on multi-track matching according to claim 1, characterized in that: in the step 6, after the matching solution between two frames is obtained, the direction angle of each pair of corresponding points in the two frames is calculated and input as one unit of the LSTM sequence.
9. The real-time dynamic gesture recognition method based on multi-track matching according to claim 8, characterized in that: the direction angle is encoded and mapped from (0°, 360°] to an integer from 1 to 12.
10. The real-time dynamic gesture recognition method based on multi-track matching according to claim 8, characterized in that: and if the corresponding points of the two frames cannot be matched, marking the point pairs which cannot be matched as '-1'.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911215465.8A CN110889387A (en) | 2019-12-02 | 2019-12-02 | Real-time dynamic gesture recognition method based on multi-track matching |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911215465.8A CN110889387A (en) | 2019-12-02 | 2019-12-02 | Real-time dynamic gesture recognition method based on multi-track matching |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110889387A true CN110889387A (en) | 2020-03-17 |
Family
ID=69749985
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911215465.8A Pending CN110889387A (en) | 2019-12-02 | 2019-12-02 | Real-time dynamic gesture recognition method based on multi-track matching |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110889387A (en) |
- 2019-12-02: Application CN201911215465.8A filed in China; published as CN110889387A (status: active, Pending)
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5454043A (en) * | 1993-07-30 | 1995-09-26 | Mitsubishi Electric Research Laboratories, Inc. | Dynamic and static hand gesture recognition through low-level image analysis |
US20120113241A1 (en) * | 2010-11-09 | 2012-05-10 | Qualcomm Incorporated | Fingertip tracking for touchless user interface |
CN102622601A (en) * | 2012-03-12 | 2012-08-01 | 李博男 | Fingertip detection method |
CN202815864U (en) * | 2012-03-12 | 2013-03-20 | 李博男 | Gesture identification system |
US20140010441A1 (en) * | 2012-07-09 | 2014-01-09 | Qualcomm Incorporated | Unsupervised movement detection and gesture recognition |
CN103984928A (en) * | 2014-05-20 | 2014-08-13 | 桂林电子科技大学 | Finger gesture recognition method based on field depth image |
CN107180226A (en) * | 2017-04-28 | 2017-09-19 | 华南理工大学 | A kind of dynamic gesture identification method based on combination neural net |
CN107977604A (en) * | 2017-11-06 | 2018-05-01 | 浙江工业大学 | A kind of hand detection method based on improvement converging channels feature |
CN108171133A (en) * | 2017-12-20 | 2018-06-15 | 华南理工大学 | A kind of dynamic gesture identification method of feature based covariance matrix |
CN109614922A (en) * | 2018-12-07 | 2019-04-12 | 南京富士通南大软件技术有限公司 | A kind of dynamic static gesture identification method and system |
CN110443128A (en) * | 2019-06-28 | 2019-11-12 | 广州中国科学院先进技术研究所 | One kind being based on SURF characteristic point accurately matched finger vein identification method |
CN110458059A (en) * | 2019-07-30 | 2019-11-15 | 北京科技大学 | A kind of gesture identification method based on computer vision and identification device |
Non-Patent Citations (15)
Title |
---|
CHEN TIANDING: "A solution of computer vision based real-time hand pointing recognition", 2008 27th Chinese Control Conference, 22 August 2008, pages 384-388
CHENGFENG JIAN ET AL.: "Mobile terminal gesture recognition based on improved FAST corner detection", IET Image Processing, vol. 13, no. 6, 11 April 2019, page 1
CHENGFENG JIAN ET AL.: "Mobile terminal trajectory recognition based on improved LSTM model", IET Image Processing, vol. 13, no. 11, 1 August 2019, pages 1914-1921, DOI: 10.1049/iet-ipr.2019.0183
CHENGFENG JIAN ET AL.: "Real-time multi-trajectory matching for dynamic hand gesture recognition", IET Image Processing, vol. 14, no. 2, 28 November 2019, pages 3-5
HO-SUB YOON ET AL.: "Hand gesture recognition using combined features of location, angle and velocity", Pattern Recognition, vol. 34, no. 7, 7 June 2001, pages 1491-1501, DOI: 10.1016/S0031-3203(00)00096-0
NASSER H. DARDAS ET AL.: "Real-Time Hand Gesture Detection and Recognition Using Bag-of-Features and Support Vector Machine Techniques", IEEE Transactions on Instrumentation and Measurement, vol. 60, no. 11, 15 August 2011, pages 3592-3607, DOI: 10.1109/TIM.2011.2161140
XIANG GAO ET AL.: "RGBD finger detection based on the 3D K-curvature", 2017 2nd International Conference on Advanced Robotics and Mechatronics (ICARM), 1 February 2018, pages 120-125
JIANG XIAOHENG (姜晓恒): "Real-time fingertip detection *** based on convex hull analysis" (基于凸包分析的实时指尖检测***), China Masters' Theses Full-text Database, Information Science and Technology, 24 April 2014, pages 138-2563
ZHANG MEIYU ET AL. (张美玉 等): "A hand detection method based on improved ACF features" (一种基于改进ACF特征的手部检测方法), 《小型微型计算机***》, no. 7, July 2018, pages 1574-1578
ZHANG MEIYU ET AL. (张美玉 等): "A fast gesture segmentation optimization method for mobile terminals" (一种面向移动端的快速手势分割优化方法), 《小型微型计算机***》, no. 6, June 2019, pages 1346-1349
ZHU ZHENGTAO (朱正涛): "Application of 3D multi-fingertip detection and recognition technology in human-computer interaction" (三维多手指点检测与识别技术在人机交互中的应用), China Masters' Theses Full-text Database, Information Science and Technology, 25 December 2012, pages 138-817
PAN ZHENGRONG (潘峥嵘): "Gesture recognition and classification based on Kinect depth images" (基于Kinect深度图像的手势识别分类), 《自动化技术与应用》, vol. 38, no. 4, April 2019, pages 143-147
CHEN JIAJUN (陈佳俊): "Research on a gesture recognition algorithm based on Harris corner detection and optical flow" (基于Harris角点检测和光流的手势识别算法的研究), China Masters' Theses Full-text Database, Information Science and Technology, vol. 2018, no. 1, 15 January 2018, pages 138-1260
MA JIANPING ET AL. (马建平 等): "An adaptive gesture recognition method for Android smartphones" (Android智能手机自适应手势识别方法), 《小型微型计算机***》, no. 7, July 2013, pages 1703-1707
GAO YAPING (高雅萍): "Research on gesture recognition methods based on a monocular camera" (基于单目摄像头的手势识别方法研究), China Masters' Theses Full-text Database, Information Science and Technology, vol. 2014, no. 8, 15 August 2014, pages 138-1330
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111460933A (en) * | 2020-03-18 | 2020-07-28 | 哈尔滨拓博科技有限公司 | Method for real-time recognition of continuous handwritten pattern |
CN111460933B (en) * | 2020-03-18 | 2022-08-09 | 哈尔滨拓博科技有限公司 | Method for real-time recognition of continuous handwritten pattern |
WO2021227933A1 (en) * | 2020-05-14 | 2021-11-18 | 索尼集团公司 | Image processing apparatus, image processing method, and computer-readable storage medium |
CN111680618A (en) * | 2020-06-04 | 2020-09-18 | 西安邮电大学 | Dynamic gesture recognition method based on video data characteristics, storage medium and device |
CN111680618B (en) * | 2020-06-04 | 2023-04-18 | 西安邮电大学 | Dynamic gesture recognition method based on video data characteristics, storage medium and device |
CN111797709A (en) * | 2020-06-14 | 2020-10-20 | 浙江工业大学 | Real-time dynamic gesture track recognition method based on regression detection |
CN112052724A (en) * | 2020-07-23 | 2020-12-08 | 深圳市玩瞳科技有限公司 | Finger tip positioning method and device based on deep convolutional neural network |
CN112985415A (en) * | 2021-04-15 | 2021-06-18 | 武汉光谷信息技术股份有限公司 | Indoor positioning method and system |
CN113420752A (en) * | 2021-06-23 | 2021-09-21 | 湖南大学 | Three-finger gesture generation method and system based on grabbing point detection |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110889387A (en) | Real-time dynamic gesture recognition method based on multi-track matching | |
CN106960195B (en) | Crowd counting method and device based on deep learning | |
CN110738207B (en) | Character detection method for fusing character area edge information in character image | |
CN109344701B (en) | Kinect-based dynamic gesture recognition method | |
CN107808143B (en) | Dynamic gesture recognition method based on computer vision | |
CN108304765B (en) | Multi-task detection device for face key point positioning and semantic segmentation | |
CN110097044B (en) | One-stage license plate detection and identification method based on deep learning | |
CN108062525B (en) | Deep learning hand detection method based on hand region prediction | |
Zhou et al. | Robust vehicle detection in aerial images using bag-of-words and orientation aware scanning | |
CN112784810B (en) | Gesture recognition method, gesture recognition device, computer equipment and storage medium | |
Yan et al. | Crowd counting via perspective-guided fractional-dilation convolution | |
CN1207924C (en) | Method for testing face by image | |
CN109377441B (en) | Tongue image acquisition method and system with privacy protection function | |
CN111639577A (en) | Method for detecting human faces of multiple persons and recognizing expressions of multiple persons through monitoring video | |
CN109977834B (en) | Method and device for segmenting human hand and interactive object from depth image | |
CN112949440A (en) | Method for extracting gait features of pedestrian, gait recognition method and system | |
Yi et al. | Human action recognition based on action relevance weighted encoding | |
CN111414910A (en) | Small target enhancement detection method and device based on double convolutional neural network | |
CN109840498B (en) | Real-time pedestrian detection method, neural network and target detection layer | |
CN109615610B (en) | Medical band-aid flaw detection method based on YOLO v2-tiny | |
CN112183148B (en) | Batch bar code positioning method and identification system | |
CN106022226B (en) | A kind of pedestrian based on multi-direction multichannel strip structure discrimination method again | |
CN110490210B (en) | Color texture classification method based on t sampling difference between compact channels | |
Zhu et al. | Scene text relocation with guidance | |
CN110490170A (en) | A kind of face candidate frame extracting method |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200317 |