CN110889387A - Real-time dynamic gesture recognition method based on multi-track matching

Real-time dynamic gesture recognition method based on multi-track matching

Info

Publication number
CN110889387A
CN110889387A (application CN201911215465.8A)
Authority
CN
China
Prior art keywords
gesture recognition
real-time
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911215465.8A
Other languages
Chinese (zh)
Inventor
简琤峰 (Jian Chengfeng)
李俊杰 (Li Junjie)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN201911215465.8A priority Critical patent/CN110889387A/en
Publication of CN110889387A publication Critical patent/CN110889387A/en
Legal status: Pending

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a real-time dynamic gesture recognition method based on multi-track matching. A positive sample containing all fingertip points is obtained by a convolutional neural network based on the FAST corner detection algorithm; the positive sample of the obtained fingertip points is clustered together with another unprocessed video image to obtain the minimum point set in one frame; the fingertips in two frames are matched by a global nearest neighbor matching algorithm; and finally an LSTM neural network carries out multi-track classification and dynamically recognizes the gesture. By combining CNN with the FAST corner detection algorithm, the invention can quickly detect the positions of the fingertip points, and by fusing the asymmetric point set matching algorithm with LSTM it classifies dynamic gestures with high robustness; the gesture recognition efficiency is high and the recognition effect is good.

Description

Real-time dynamic gesture recognition method based on multi-track matching
Technical Field
The present invention relates to the technical field of computing, calculating and counting, and in particular to a real-time dynamic gesture recognition method based on multi-track matching in the fields of human-computer interaction and computer vision.
Background
In the field of human-computer interaction, gesture interaction is one of the most common and important interaction forms, and gesture recognition is a crucial link within it, directly determining the accuracy and robustness of gesture interaction.
Currently, from the perspective of whether it carries a temporal meaning, gesture recognition can be divided into static gesture recognition and dynamic gesture recognition: static gesture recognition aims at classifying the gesture in a single frame through its morphological characteristics, while dynamic gesture recognition classifies gesture actions over a time sequence. In practical applications, gestures almost always carry a temporal meaning, such as rotating and grabbing, so dynamic gesture recognition has more practical significance.
A monocular camera or a sensor can be used for information acquisition in dynamic gesture recognition. Common sensors for acquiring gesture information include Kinect, Leap Motion and the like; although such sensors can acquire rich features such as depth information and joint information of the hand, they are generally expensive, computationally heavy and unsuitable for mobile terminals. Gesture recognition using only a monocular camera avoids these problems well: it is low-cost, can be widely applied on various mobile platforms, and needs no support from other hardware. However, a monocular camera provides only RGB information, cannot directly segment gestures, and is easily affected by noise.
In the prior art, the method for performing dynamic gesture recognition through a monocular camera mainly includes two methods:
(1) training the frames within an interval as one group through 3D convolution so as to classify the actions within a period of time; the drawback of this method is that long-duration gestures are difficult to recognize;
(2) after segmenting the gesture through a color space, tracking the centroid of the hand and predicting the track with DTW or an HMM.
Disclosure of Invention
The invention solves the problems in the prior art and provides an optimized real-time dynamic gesture recognition method based on multi-track matching which can dynamically recognize gestures in real time with high robustness.
The invention adopts the technical scheme that a real-time dynamic gesture recognition method based on multi-track matching comprises the following steps:
step 1: acquiring a video stream;
step 2: copying a video image, and obtaining a hand region image by segmenting one of the video images;
step 3: constructing a convolutional neural network based on the FAST corner detection algorithm, and acquiring a positive sample containing all the fingertip points;
step 4: clustering based on the acquired positive sample of the fingertip points and another unprocessed video image;
step 5: matching the fingertips in the two frames through a global nearest neighbor matching algorithm;
step 6: carrying out multi-track classification by using an LSTM neural network, and dynamically recognizing the gesture.
Preferably, the step 2 comprises the steps of:
step 2.1: compressing each frame of video image of the video stream to a preset resolution;
step 2.2: converting the compressed video images from an RGB color space to a YCrCb color space in sequence;
step 2.3: and taking the Cb component and the Cr component, and segmenting to obtain a hand region image.
Preferably, the step 3 comprises the steps of:
step 3.1: acquiring the hand region image segmented in the step 2;
step 3.2: detecting all corners of the hand region image by using a FAST corner detection algorithm, and cutting the original image by taking each corner as a center to obtain a plurality of image slices;
step 3.3: and constructing a lightweight convolutional neural network, classifying the image slices, and if the classification probability is greater than or equal to 50%, determining the image slices as positive samples, otherwise, determining the image slices as negative samples.
Preferably, in step 3.2, the image slices are 32 × 32 pixels.
Preferably, in the step 3.3, the lightweight convolutional neural network comprises four groups of sub-blocks connected in sequence, each sub-block comprising a depth convolutional layer and a max pooling layer, the depth convolutional layers of the second to fourth groups of sub-blocks being 1 × 1 convolutional layers; the fourth group of sub-blocks is followed in sequence by a 1 × 1 depth convolutional layer, a global mean pooling layer and a fully-connected layer; the image slices are input through the depth convolutional layer of the first group of sub-blocks, and the classification probabilities are output from the fully-connected layer.
Preferably, the step 4 comprises the steps of:
step 4.1: traversing all the positive samples obtained in the step 3, constructing a set C,
(the two construction formulas for set C are rendered as images in the original: Figure BDA0002299382770000031, Figure BDA0002299382770000032)
wherein any positive sample r_i has the coordinates (x_i, y_i), D_1 is a distance threshold, and i and j index different positive samples;
step 4.2: constructing a set T for keeping the scores of all the elements in the set C, updating the set T,
(the update formula for set T is rendered as an image in the original: Figure BDA0002299382770000033)
wherein n is the index of the element t_n, incremented up to the length of set T;
step 4.3: reordering the elements in the set T according to a descending order, and correspondingly modifying the ordering of the elements in the set C;
step 4.4: let the minimum set of points after clustering that contains all fingertips be set R,
(the definition of set R is rendered as images in the original: Figure BDA0002299382770000034, Figure BDA0002299382770000035)
preferably, the step 5 comprises the steps of:
step 5.1: constructing two frame imagesTwo minimum point sets A, B and distance matrices D, D obtained after step 4i,j=||Ai,Bj||2Wherein i is less than or equal to the number of elements in the point set A and is greater than or equal to 0, and j is less than or equal to the number of elements in the point set B and is greater than or equal to 0; taking col and row as the row number and column number of D respectively;
step 5.2: if col > row, transpose D (the transposition formula is rendered as an image in the original: Figure BDA0002299382770000041) and carry out the next step; otherwise, directly carry out the next step;
step 5.3: constructing sets seq, Dis, gloseq and glodis for storing temporary variables;
step 5.4: starting the search from D_{i,j}, add i to seq, let Dis_i = Dis_i + D_{i,j}, and add seq to gloseq;
if the condition (rendered as an image in the original: Figure BDA0002299382770000042) is satisfied, add Dis to glodis; otherwise, let i = i + 1;
step 5.5: let j = j + 1 and Dis_i = Dis_i - D_{i,j}; if j = row - 1, carry out the next step, otherwise return to step 5.4;
step 5.6: obtain the minimum element glodis_i in glodis, and take glodis_i as the optimal solution for point matching.
Preferably, in the step 6, after the matching solution between two frames is obtained, the direction angles of the corresponding points of the two frames are calculated and input as one unit of the LSTM sequence.
Preferably, the direction angle is encoded by mapping it from (0, 360°] to an integer from 1 to 12.
Preferably, if the corresponding points of the two frames cannot be matched, the point pair which cannot be matched is marked as "-1".
The invention relates to an optimized real-time dynamic gesture recognition method based on multi-track matching, which comprises the steps of obtaining video streams, copying video images, obtaining hand region images by dividing one video image, constructing a convolutional neural network based on a FAST corner detection algorithm, obtaining a positive sample containing all fingertip points, clustering the positive sample based on the obtained fingertip points and the other unprocessed video image to obtain a minimum point set in one frame, matching fingertips in two frames by a global nearest neighbor point matching algorithm, and finally performing multi-track classification by using an LSTM neural network to dynamically recognize gestures.
The method combines CNN with the FAST corner detection algorithm to quickly detect the positions of the fingertip points, and at the same time fuses the asymmetric point set matching algorithm with LSTM, thereby classifying dynamic gestures with high robustness; the gesture recognition efficiency is high and the recognition effect is good.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a schematic diagram of a lightweight convolutional neural network according to the present invention;
FIG. 3 is a schematic diagram of clustering the positive samples of the lightweight convolutional neural network according to the present invention, where the dots at the fingertips are the clustered positive sample points;
FIG. 4 is a schematic diagram of matching fingertip points between different frames, where the positions indicated by the arrows are the matched positions.
Detailed Description
The present invention is described in further detail with reference to the following examples, but the scope of the present invention is not limited thereto.
The invention relates to a real-time dynamic gesture recognition method based on multi-track matching, which combines the FAST corner detection algorithm with an improved deep CNN, uses a new global nearest neighbor matching algorithm to match the fingertips in two frames of images and form tracks from them, and finally uses LSTM to classify the matched tracks to obtain the dynamic gesture recognition result.
The method comprises the following steps.
Step 1: a video stream is acquired.
Step 2: and copying the video image, and segmenting the video image to obtain a hand area image.
The step 2 comprises the following steps:
step 2.1: compressing each frame of video image of the video stream to a preset resolution;
step 2.2: converting the compressed video images from an RGB color space to a YCrCb color space in sequence;
step 2.3: and taking the Cb component and the Cr component, and segmenting to obtain a hand region image.
In the invention, human skin color clusters strongly in the Cb component and the Cr component of YCrCb, so non-skin-color regions can be effectively removed by using the Cb component and the Cr component.
In the present invention, specifically, the Cb component is a blue chrominance component and the Cr component is a red chrominance component.
In the invention, before processing, the resolution of the image can be compressed to 480 × 640, and the effectiveness of segmentation is ensured.
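As a non-limiting illustration of step 2, the following Python sketch (using OpenCV) performs the compression, color-space conversion and Cb/Cr segmentation described above; the skin-color thresholds (Cr in 133-173, Cb in 77-127) are commonly used default values assumed here for illustration, not values specified by the invention.

import cv2
import numpy as np

def segment_hand(frame_bgr):
    # step 2.1: compress the frame to a preset resolution (480 x 640 assumed)
    frame = cv2.resize(frame_bgr, (480, 640))
    # step 2.2: convert from the RGB (BGR in OpenCV) color space to YCrCb
    ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
    # step 2.3: take the Cr and Cb components and keep only the skin-color range
    _, cr, cb = cv2.split(ycrcb)
    mask = ((cr >= 133) & (cr <= 173) & (cb >= 77) & (cb <= 127)).astype(np.uint8) * 255
    # apply the mask to obtain the hand region image
    return cv2.bitwise_and(frame, frame, mask=mask), mask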
Step 3: a convolutional neural network based on the FAST corner detection algorithm is constructed, and a positive sample containing all the fingertip points is acquired.
The step 3 comprises the following steps:
step 3.1: acquiring the hand region image segmented in the step 2;
step 3.2: detecting all corners of the hand region image by using a FAST corner detection algorithm, and cutting the original image by taking each corner as a center to obtain a plurality of image slices;
in said step 3.2, the image slice is 32 x 32 pixels.
Step 3.3: and constructing a lightweight convolutional neural network, classifying the image slices, and if the classification probability is greater than or equal to 50%, determining the image slices as positive samples, otherwise, determining the image slices as negative samples.
In the step 3.3, the lightweight convolutional neural network comprises four groups of sub-blocks connected in sequence, each sub-block comprising a depth convolutional layer and a max pooling layer, the depth convolutional layers of the second to fourth groups of sub-blocks being 1 × 1 convolutional layers; the fourth group of sub-blocks is followed in sequence by a 1 × 1 depth convolutional layer, a global mean pooling layer and a fully-connected layer; the image slices are input through the depth convolutional layer of the first group of sub-blocks, and the classification probabilities are output from the fully-connected layer.
In the invention, a corner is a category of feature points; the FAST corner detection algorithm is prior art in the field, and a person skilled in the art can detect corners as required.
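As a non-limiting illustration of step 3.2, the sketch below detects FAST corners with OpenCV and crops a 32 x 32 slice centred on each corner; the detector threshold of 20 is an assumed value, and corners whose slice would cross the image border are simply skipped here.

import cv2

def corner_slices(hand_img, size=32):
    # detect all corners of the hand region image with the FAST algorithm
    fast = cv2.FastFeatureDetector_create(threshold=20)
    keypoints = fast.detect(hand_img, None)
    half = size // 2
    h, w = hand_img.shape[:2]
    slices, centers = [], []
    for kp in keypoints:
        x, y = int(kp.pt[0]), int(kp.pt[1])
        # cut the original image with the corner as the centre of the slice
        if half <= x < w - half and half <= y < h - half:
            slices.append(hand_img[y - half:y + half, x - half:x + half])
            centers.append((x, y))
    return slices, centers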
In the present invention, the positive samples are all candidate points that may be fingertips, and the negative samples are points that are not fingertips; the positive samples obtained in step 3 include the fingertips and interference points near the fingertips, which need to be excluded in step 4.
In the invention, the lightweight convolutional neural network is constructed on the assumption that the cross-channel correlation mapping and the spatial correlation mapping in a feature map can be completely decoupled, which greatly reduces the number of parameters of the network. To reduce the parameters further, depthwise separable convolution is not used directly; instead, a max pooling layer is first added after the depth convolution layer, and the number of channels is then expanded with a 1 × 1 convolution. In addition, multiple fully-connected layers are not adopted; a global mean pooling layer followed by a single fully-connected layer is used instead, which weakens overfitting of the network and reduces the parameter count of the fully-connected layer.
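The PyTorch sketch below shows one plausible reading of this lightweight network; the channel counts, kernel sizes and activation functions are assumptions, since the text specifies only the block structure (a depth convolution plus a max pooling layer per sub-block, 1 x 1 convolutions for channel expansion, then a final 1 x 1 convolutional layer, global mean pooling and a single fully-connected layer).

import torch
import torch.nn as nn

def sub_block(in_ch, out_ch):
    # depthwise convolution -> max pooling -> 1 x 1 convolution (channel expansion)
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
        nn.Conv2d(in_ch, out_ch, kernel_size=1),
        nn.ReLU(inplace=True),
    )

class FingertipNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            sub_block(3, 16), sub_block(16, 32), sub_block(32, 64), sub_block(64, 128),
            nn.Conv2d(128, 128, kernel_size=1),  # final 1 x 1 convolutional layer
            nn.AdaptiveAvgPool2d(1),             # global mean pooling layer
        )
        self.fc = nn.Linear(128, 2)              # single fully-connected layer

    def forward(self, x):                        # x: (N, 3, 32, 32) image slices
        f = self.features(x).flatten(1)
        return torch.softmax(self.fc(f), dim=1)  # classification probabilities

A slice is then taken as a positive sample when its positive-class probability is greater than or equal to 50%, as in step 3.3.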
Step 4: clustering is performed based on the acquired positive sample of fingertip points and another unprocessed video image.
The step 4 comprises the following steps:
step 4.1: traversing all the positive samples obtained in the step 3, constructing a set C,
(the two construction formulas for set C are rendered as images in the original: Figure BDA0002299382770000071, Figure BDA0002299382770000072)
wherein any positive sample r_i has the coordinates (x_i, y_i), D_1 is a distance threshold, and i and j index different positive samples;
step 4.2: constructing a set T for keeping the scores of all the elements in the set C, updating the set T,
(the update formula for set T is rendered as an image in the original: Figure BDA0002299382770000073)
wherein n is the index of the element t_n, incremented up to the length of set T;
step 4.3: reordering the elements in the set T according to a descending order, and correspondingly modifying the ordering of the elements in the set C;
step 4.4: let the minimum set of points after clustering that contains all fingertips be set R,
(the definition of set R is rendered as images in the original: Figure BDA0002299382770000074, Figure BDA0002299382770000075)
in the invention, because the corners obtained by the FAST corner detection algorithm are gathered at the fingertip points and the areas nearby the fingertip points, a plurality of fingertip points can be detected, and the CNN cannot eliminate redundant points, the points need to be combined by the clustering algorithm.
In the present invention, the formula of step 4.1 means: when the condition is satisfied, an element in the set is replaced, and when the condition is not satisfied, the element is added to the set. At the outset the set is an empty set, i.e. {c_j} is empty; after the first positive sample point is traversed, the c_j satisfying the condition is replaced by r_i; when subsequent points are traversed, whether a new point is added to the set is judged according to the condition. The formula of step 4.2 works in the same way.
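A Python sketch of one plausible reading of this clustering step follows; because the exact formulas of steps 4.1-4.2 are rendered as images in the original, the score kept in set T is reduced here to a simple hit counter, and the distance threshold D1 = 20 pixels is an assumed value.

import numpy as np

def cluster_fingertips(points, d1=20.0):
    reps, scores = [], []              # set C of representatives and its score set T
    for p in points:                   # step 4.1: traverse all positive samples
        p = np.asarray(p, dtype=float)
        if reps:
            dists = [np.linalg.norm(p - r) for r in reps]
            k = int(np.argmin(dists))
            if dists[k] < d1:          # condition satisfied: replace the element
                reps[k] = p
                scores[k] += 1         # step 4.2: update the score of the element
                continue
        reps.append(p)                 # condition not satisfied: add a new element
        scores.append(1)
    # step 4.3: reorder by descending score; the result approximates the minimum set R
    order = np.argsort(scores)[::-1]
    return [reps[i] for i in order]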
Step 5: the fingertips in the two frames are matched by the global nearest neighbor matching algorithm.
The step 5 comprises the following steps:
step 5.1: constructing two minimum point sets A, B and distance matrixes D, D obtained after step 4 of two-frame imagei,j=||Ai,Bj||2Wherein i is less than or equal to the number of elements in the point set A and is greater than or equal to 0, and j is less than or equal to the number of elements in the point set B and is greater than or equal to 0; taking col and row as the row number and column number of D respectively;
step 5.2: if col > row, transpose D (the transposition formula is rendered as an image in the original: Figure BDA0002299382770000081) and carry out the next step; otherwise, directly carry out the next step;
step 5.3: constructing sets seq, Dis, gloseq and glodis for storing temporary variables;
step 5.4: starting the search from D_{i,j}, add i to seq, let Dis_i = Dis_i + D_{i,j}, and add seq to gloseq;
if the condition (rendered as an image in the original: Figure BDA0002299382770000082) is satisfied, add Dis to glodis; otherwise, let i = i + 1;
step 5.5: let j = j + 1 and Dis_i = Dis_i - D_{i,j}; if j = row - 1, carry out the next step, otherwise return to step 5.4;
step 5.6: obtain the minimum element glodis_i in glodis, and take glodis_i as the optimal solution for point matching.
In the invention, step 4 obtains the minimum point set in one frame, and step 5 matches the minimum point sets of two frames. The main problem in combining multiple points into multiple tracks is the asymmetric matching between two point sets, i.e. the lengths of the two point sets are not equal. If the fingertips in two independent frames are to be matched into tracks, it must be considered that fingertips cannot be recognized with 100% reliability and misrecognition is inevitable; the global nearest neighbor matching algorithm is adopted to solve this problem.
In the present invention, D holds the Euclidean distances between all points of the point sets A and B.
In the present invention, step 5.2 means to ensure that the number of columns of D is necessarily greater than the number of rows, otherwise, the transposition is performed.
In the present invention, starting the search from D_{i,j} means starting the search from i = 0 and j = 0, and the corresponding subscript is incremented by one each time a cycle is completed.
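Because the branch conditions of steps 5.4-5.5 are rendered as images in the original, the sketch below reaches the same goal, globally optimal matching between two point sets of unequal length, by exhaustively enumerating every injective assignment of the smaller set into the larger one and keeping the assignment with minimum total Euclidean distance; this brute force is affordable because a hand has at most five fingertips.

from itertools import permutations
import numpy as np

def match_fingertips(set_a, set_b):
    a, b = np.asarray(set_a, dtype=float), np.asarray(set_b, dtype=float)
    if len(a) > len(b):                # step 5.2: ensure cols >= rows (transpose)
        pairs = match_fingertips(set_b, set_a)
        return [(j, i) for i, j in pairs]
    # distance matrix D with D[i, j] = ||A_i - B_j||_2
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    best_cost, best = float("inf"), None
    for perm in permutations(range(len(b)), len(a)):
        cost = sum(d[i, j] for i, j in enumerate(perm))
        if cost < best_cost:           # keep the globally minimal total distance
            best_cost, best = cost, list(enumerate(perm))
    return best                        # list of (index in A, index in B) pairs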
Step 6: and carrying out multi-track classification by using an LSTM neural network, and dynamically identifying the gesture.
In step 6, after the matching solution between two frames is obtained, the direction angles of the corresponding points of the two frames are calculated and input as one unit of the LSTM sequence.
The direction angle is encoded by mapping it from (0, 360°] to an integer from 1 to 12.
If the corresponding points of the two frames cannot be matched, the point pairs that cannot be matched are marked as '-1'.
In the invention, after the matching solution between two frames is obtained, the direction angles of the corresponding points of the two frames are calculated to represent the input data of the LSTM; however, because the fingertips between two frames are not matched completely, the track is discontinuous. To enable the LSTM to recognize a dynamic gesture with a sparse track, the point pairs that cannot be matched are marked as '-1', and the azimuth angle of each matchable point pair is calculated and input as one unit of the LSTM sequence.
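As an illustration of this encoding, the sketch below turns one matched point pair into one unit of the LSTM input sequence; the 30-degree binning (so that (0, 360°] maps onto the integers 1 to 12) and the angle convention are assumptions made for this sketch.

import math

def encode_direction(p_prev, p_curr):
    if p_prev is None or p_curr is None:
        return -1                           # unmatched point pair is marked as -1
    dx, dy = p_curr[0] - p_prev[0], p_curr[1] - p_prev[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    if angle == 0.0:
        angle = 360.0                       # keep the angle inside (0, 360]
    return math.ceil(angle / 30.0)          # integer from 1 to 12

A sequence of such integers per track can then be fed to a standard LSTM classifier (for example torch.nn.LSTM followed by a linear layer) for the multi-track classification of step 6.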
The method comprises the steps of obtaining video streams, copying video images, obtaining hand region images by dividing one video image, constructing a convolutional neural network based on a FAST corner detection algorithm, obtaining a positive sample containing all fingertip points, clustering the obtained positive sample of the fingertip points and another unprocessed video image to obtain a minimum point set in one frame, matching fingertips in two frames by a global nearest neighbor point matching algorithm, and finally performing multi-track classification by using an LSTM neural network to dynamically identify gestures.
The method combines CNN with the FAST corner detection algorithm to quickly detect the positions of the fingertip points, and at the same time fuses the asymmetric point set matching algorithm with LSTM, thereby classifying dynamic gestures with high robustness; the gesture recognition efficiency is high and the recognition effect is good.

Claims (10)

1. A real-time dynamic gesture recognition method based on multi-track matching, characterized by comprising the following steps:
step 1: acquiring a video stream;
step 2: copying a video image, and obtaining a hand region image by segmenting one of the video images;
step 3: constructing a convolutional neural network based on the FAST corner detection algorithm, and acquiring a positive sample containing all the fingertip points;
step 4: clustering based on the acquired positive sample of the fingertip points and another unprocessed video image;
step 5: matching the fingertips in the two frames through a global nearest neighbor matching algorithm;
step 6: carrying out multi-track classification by using an LSTM neural network, and dynamically recognizing the gesture.
2. The real-time dynamic gesture recognition method based on multi-track matching according to claim 1, characterized in that: the step 2 comprises the following steps:
step 2.1: compressing each frame of video image of the video stream to a preset resolution;
step 2.2: converting the compressed video images from an RGB color space to a YCrCb color space in sequence;
step 2.3: and taking the Cb component and the Cr component, and segmenting to obtain a hand region image.
3. The real-time dynamic gesture recognition method based on multi-track matching according to claim 1, characterized in that: the step 3 comprises the following steps:
step 3.1: acquiring the hand region image segmented in the step 2;
step 3.2: detecting all corners of the hand region image by using a FAST corner detection algorithm, and cutting the original image by taking each corner as a center to obtain a plurality of image slices;
step 3.3: and constructing a lightweight convolutional neural network, classifying the image slices, and if the classification probability is greater than or equal to 50%, determining the image slices as positive samples, otherwise, determining the image slices as negative samples.
4. The real-time dynamic gesture recognition method based on multi-track matching according to claim 3, characterized in that: in said step 3.2, the image slice is 32 x 32 pixels.
5. The real-time dynamic gesture recognition method based on multi-track matching according to claim 3, characterized in that: in the step 3.3, the lightweight convolutional neural network comprises four groups of sub-blocks connected in sequence, each sub-block comprising a depth convolutional layer and a max pooling layer, the depth convolutional layers of the second to fourth groups of sub-blocks being 1 × 1 convolutional layers; the fourth group of sub-blocks is followed in sequence by a 1 × 1 depth convolutional layer, a global mean pooling layer and a fully-connected layer; the image slices are input through the depth convolutional layer of the first group of sub-blocks, and the classification probabilities are output from the fully-connected layer.
6. The real-time dynamic gesture recognition method based on multi-track matching according to claim 1, characterized in that: the step 4 comprises the following steps:
step 4.1: traversing all the positive samples obtained in the step 3, constructing a set C,
(the construction formula for set C is rendered as an image in the original: Figure FDA0002299382760000021)
wherein any positive sample r_i has the coordinates (x_i, y_i), D_1 is a distance threshold, and i and j index different positive samples;
step 4.2: constructing a set T for keeping the scores of all the elements in the set C, updating the set T,
(the update formula for set T is rendered as an image in the original: Figure FDA0002299382760000022)
wherein n is the index of the element t_n, incremented up to the length of set T;
step 4.3: reordering the elements in the set T according to a descending order, and correspondingly modifying the ordering of the elements in the set C;
step 4.4: let the minimum set of points after clustering that contains all fingertips be set R,
(the definition of set R is rendered as images in the original: Figure FDA0002299382760000031, Figure FDA0002299382760000032)
7. the real-time dynamic gesture recognition method based on multi-track matching according to claim 1, characterized in that: the step 5 comprises the following steps:
step 5.1: constructing two minimum point sets A, B and distance matrixes D, D obtained after step 4 of two-frame imagei,j=||Ai,Bj||2Wherein i is less than or equal to the number of elements in the point set A and is greater than or equal to 0, and j is less than or equal to the number of elements in the point set B and is greater than or equal to 0; taking col and row as the row number and column number of D respectively;
step 5.2: if col > row, transpose D (the transposition formula is rendered as an image in the original: Figure FDA0002299382760000033) and carry out the next step; otherwise, directly carry out the next step;
step 5.3: constructing sets seq, Dis, gloseq and glodis for storing temporary variables;
step 5.4: starting the search from D_{i,j}, add i to seq, let Dis_i = Dis_i + D_{i,j}, and add seq to gloseq;
if the condition (rendered as an image in the original: Figure FDA0002299382760000034) is satisfied, add Dis to glodis; otherwise, let i = i + 1;
step 5.5: let j = j + 1 and Dis_i = Dis_i - D_{i,j}; if j = row - 1, carry out the next step, otherwise return to step 5.4;
step 5.6: obtain the minimum element glodis_i in glodis, and take glodis_i as the optimal solution for point matching.
8. The real-time dynamic gesture recognition method based on multi-track matching according to claim 1, characterized in that: in the step 6, after the matching solution between two frames is obtained, the direction angles of the corresponding points of the two frames are calculated and input as one unit of the LSTM sequence.
9. The real-time dynamic gesture recognition method based on multi-track matching according to claim 8, characterized in that: the direction angle is encoded by mapping it from (0, 360°] to an integer from 1 to 12.
10. The real-time dynamic gesture recognition method based on multi-track matching according to claim 8, characterized in that: if the corresponding points of the two frames cannot be matched, the point pairs that cannot be matched are marked as '-1'.
CN201911215465.8A 2019-12-02 2019-12-02 Real-time dynamic gesture recognition method based on multi-track matching Pending CN110889387A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911215465.8A CN110889387A (en) 2019-12-02 2019-12-02 Real-time dynamic gesture recognition method based on multi-track matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911215465.8A CN110889387A (en) 2019-12-02 2019-12-02 Real-time dynamic gesture recognition method based on multi-track matching

Publications (1)

Publication Number Publication Date
CN110889387A 2020-03-17

Family

ID=69749985

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911215465.8A Pending CN110889387A (en) 2019-12-02 2019-12-02 Real-time dynamic gesture recognition method based on multi-track matching

Country Status (1)

Country Link
CN (1) CN110889387A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111460933A (en) * 2020-03-18 2020-07-28 哈尔滨拓博科技有限公司 Method for real-time recognition of continuous handwritten pattern
CN111680618A (en) * 2020-06-04 2020-09-18 西安邮电大学 Dynamic gesture recognition method based on video data characteristics, storage medium and device
CN111797709A (en) * 2020-06-14 2020-10-20 浙江工业大学 Real-time dynamic gesture track recognition method based on regression detection
CN112052724A (en) * 2020-07-23 2020-12-08 深圳市玩瞳科技有限公司 Finger tip positioning method and device based on deep convolutional neural network
CN112985415A (en) * 2021-04-15 2021-06-18 武汉光谷信息技术股份有限公司 Indoor positioning method and system
CN113420752A (en) * 2021-06-23 2021-09-21 湖南大学 Three-finger gesture generation method and system based on grabbing point detection
WO2021227933A1 (en) * 2020-05-14 2021-11-18 索尼集团公司 Image processing apparatus, image processing method, and computer-readable storage medium


Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5454043A (en) * 1993-07-30 1995-09-26 Mitsubishi Electric Research Laboratories, Inc. Dynamic and static hand gesture recognition through low-level image analysis
US20120113241A1 (en) * 2010-11-09 2012-05-10 Qualcomm Incorporated Fingertip tracking for touchless user interface
CN102622601A (en) * 2012-03-12 2012-08-01 李博男 Fingertip detection method
CN202815864U (en) * 2012-03-12 2013-03-20 李博男 Gesture identification system
US20140010441A1 (en) * 2012-07-09 2014-01-09 Qualcomm Incorporated Unsupervised movement detection and gesture recognition
CN103984928A (en) * 2014-05-20 2014-08-13 桂林电子科技大学 Finger gesture recognition method based on field depth image
CN107180226A (en) * 2017-04-28 2017-09-19 华南理工大学 A kind of dynamic gesture identification method based on combination neural net
CN107977604A (en) * 2017-11-06 2018-05-01 浙江工业大学 A kind of hand detection method based on improvement converging channels feature
CN108171133A (en) * 2017-12-20 2018-06-15 华南理工大学 A kind of dynamic gesture identification method of feature based covariance matrix
CN109614922A (en) * 2018-12-07 2019-04-12 南京富士通南大软件技术有限公司 A kind of dynamic static gesture identification method and system
CN110443128A (en) * 2019-06-28 2019-11-12 广州中国科学院先进技术研究所 One kind being based on SURF characteristic point accurately matched finger vein identification method
CN110458059A (en) * 2019-07-30 2019-11-15 北京科技大学 A kind of gesture identification method based on computer vision and identification device

Non-Patent Citations (15)

* Cited by examiner, † Cited by third party
Title
CHEN TIANDING: "A solution of computer vision based real-time hand pointing recognition", 2008 27th Chinese Control Conference, 22 August 2008, pages 384-388 *
CHENGFENG JIAN ET AL.: "Mobile terminal gesture recognition based on improved FAST corner detection", IET Image Processing, vol. 13, no. 6, 11 April 2019, page 1 *
CHENGFENG JIAN ET AL.: "Mobile terminal trajectory recognition based on improved LSTM model", IET Image Processing, vol. 13, no. 11, 1 August 2019, pages 1914-1921, XP006082437, DOI: 10.1049/iet-ipr.2019.0183 *
CHENGFENG JIAN ET AL.: "Real-time multi-trajectory matching for dynamic hand gesture recognition", IET Image Processing, vol. 14, no. 2, 28 November 2019, pages 3-5 *
HO-SUB YOON ET AL.: "Hand gesture recognition using combined features of location, angle and velocity", Pattern Recognition, vol. 34, no. 7, 7 June 2001, pages 1491-1501, XP004362560, DOI: 10.1016/S0031-3203(00)00096-0 *
NASSER H. DARDAS ET AL.: "Real-Time Hand Gesture Detection and Recognition Using Bag-of-Features and Support Vector Machine Techniques", IEEE Transactions on Instrumentation and Measurement, vol. 60, no. 11, 15 August 2011, pages 3592-3607, XP011384965, DOI: 10.1109/TIM.2011.2161140 *
XIANG GAO ET AL.: "RGBD finger detection based on the 3D K-curvature", 2017 2nd International Conference on Advanced Robotics and Mechatronics (ICARM), 1 February 2018, pages 120-125 *
JIANG XIAOHENG: "Real-time fingertip detection system based on convex-hull analysis" (in Chinese), China Master's Theses Full-text Database, Information Science and Technology, 24 April 2014, pages 138-2563 *
ZHANG MEIYU ET AL.: "A hand detection method based on improved ACF features" (in Chinese), Journal of Chinese Computer Systems, no. 7, July 2018, pages 1574-1578 *
ZHANG MEIYU ET AL.: "A fast gesture segmentation optimization method for mobile terminals" (in Chinese), Journal of Chinese Computer Systems, no. 6, June 2019, pages 1346-1349 *
ZHU ZHENGTAO: "Application of 3D multi-fingertip detection and recognition technology in human-computer interaction" (in Chinese), China Master's Theses Full-text Database, Information Science and Technology, 25 December 2012, pages 138-817 *
PAN ZHENGRONG: "Gesture recognition and classification based on Kinect depth images" (in Chinese), Techniques of Automation and Applications, vol. 38, no. 4, April 2019, pages 143-147 *
CHEN JIAJUN: "Research on gesture recognition algorithms based on Harris corner detection and optical flow" (in Chinese), China Master's Theses Full-text Database, Information Science and Technology, vol. 2018, no. 1, 15 January 2018, pages 138-1260 *
MA JIANPING ET AL.: "Adaptive gesture recognition method for Android smartphones" (in Chinese), Journal of Chinese Computer Systems, no. 7, July 2013, pages 1703-1707 *
GAO YAPING: "Research on gesture recognition methods based on a monocular camera" (in Chinese), China Master's Theses Full-text Database, Information Science and Technology, vol. 2014, no. 8, 15 August 2014, pages 138-1330 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111460933A (en) * 2020-03-18 2020-07-28 哈尔滨拓博科技有限公司 Method for real-time recognition of continuous handwritten pattern
CN111460933B (en) * 2020-03-18 2022-08-09 哈尔滨拓博科技有限公司 Method for real-time recognition of continuous handwritten pattern
WO2021227933A1 (en) * 2020-05-14 2021-11-18 索尼集团公司 Image processing apparatus, image processing method, and computer-readable storage medium
CN111680618A (en) * 2020-06-04 2020-09-18 西安邮电大学 Dynamic gesture recognition method based on video data characteristics, storage medium and device
CN111680618B (en) * 2020-06-04 2023-04-18 西安邮电大学 Dynamic gesture recognition method based on video data characteristics, storage medium and device
CN111797709A (en) * 2020-06-14 2020-10-20 浙江工业大学 Real-time dynamic gesture track recognition method based on regression detection
CN112052724A (en) * 2020-07-23 2020-12-08 深圳市玩瞳科技有限公司 Finger tip positioning method and device based on deep convolutional neural network
CN112985415A (en) * 2021-04-15 2021-06-18 武汉光谷信息技术股份有限公司 Indoor positioning method and system
CN113420752A (en) * 2021-06-23 2021-09-21 湖南大学 Three-finger gesture generation method and system based on grabbing point detection

Similar Documents

Publication Publication Date Title
CN110889387A (en) Real-time dynamic gesture recognition method based on multi-track matching
CN106960195B (en) Crowd counting method and device based on deep learning
CN110738207B (en) Character detection method for fusing character area edge information in character image
CN109344701B (en) Kinect-based dynamic gesture recognition method
CN107808143B (en) Dynamic gesture recognition method based on computer vision
CN108304765B (en) Multi-task detection device for face key point positioning and semantic segmentation
CN110097044B (en) One-stage license plate detection and identification method based on deep learning
CN108062525B (en) Deep learning hand detection method based on hand region prediction
Zhou et al. Robust vehicle detection in aerial images using bag-of-words and orientation aware scanning
CN112784810B (en) Gesture recognition method, gesture recognition device, computer equipment and storage medium
Yan et al. Crowd counting via perspective-guided fractional-dilation convolution
CN1207924C (en) Method for testing face by image
CN109377441B (en) Tongue image acquisition method and system with privacy protection function
CN111639577A (en) Method for detecting human faces of multiple persons and recognizing expressions of multiple persons through monitoring video
CN109977834B (en) Method and device for segmenting human hand and interactive object from depth image
CN112949440A (en) Method for extracting gait features of pedestrian, gait recognition method and system
Yi et al. Human action recognition based on action relevance weighted encoding
CN111414910A (en) Small target enhancement detection method and device based on double convolutional neural network
CN109840498B (en) Real-time pedestrian detection method, neural network and target detection layer
CN109615610B (en) Medical band-aid flaw detection method based on YOLO v2-tiny
CN112183148B (en) Batch bar code positioning method and identification system
CN106022226B (en) A kind of pedestrian based on multi-direction multichannel strip structure discrimination method again
CN110490210B (en) Color texture classification method based on t sampling difference between compact channels
Zhu et al. Scene text relocation with guidance
CN110490170A (en) A kind of face candidate frame extracting method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200317