CN104573621A - Dynamic gesture learning and identifying method based on Chebyshev neural network - Google Patents

Dynamic gesture learning and identifying method based on Chebyshev neural network

Info

Publication number
CN104573621A
Authority
CN
China
Prior art keywords
gesture
network
neural network
vector
dynamic gesture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410514602.9A
Other languages
Chinese (zh)
Inventor
李文生
邓春健
吕燚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN201410514602.9A priority Critical patent/CN104573621A/en
Publication of CN104573621A publication Critical patent/CN104573621A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Social Psychology (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Psychiatry (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a dynamic gesture learning and identifying method based on a Chebyshev neural network. Chebyshev orthogonal polynomials serve as hidden-layer neuron excitation functions for constructing a multi-input multi-output three-layer feedforward neural network, and a weights direct determination method and a hidden-layer node number adaptive determination algorithm are given; a fingertip detection algorithm based on a color histogram and a fingertip tracking algorithm based on bigraph optimal matching are given for obtaining a dynamic gesture track in real time; an MIMO-CNN (multi-input multi-output Chebyshev neural network) is subjected to input output structure design and network weights learning training according to the dynamic gesture identifying requirements, and a dynamic gesture is identified by the trained MIMO-CNN. A test result shows that the MIMO-CNN can increase the network training speed and improve the network training precision, so that the dynamic gesture learning speed is increased and the dynamic gesture identifying accuracy is improved; moreover, relatively good robustness and generalization ability in the aspect of dynamic gesture identification are achieved.

Description

Dynamic gesture learning and recognition method based on a Chebyshev neural network
Technical field
The invention belongs to the technical field of image processing and relates to a dynamic gesture learning and recognition method based on a Chebyshev neural network.
Background art
Gesture-recognition human-computer interaction operates a computer mainly by recognizing the user's gesture motions, and it offers a more natural means of human-computer interaction [1~3]. Machine-vision-based dynamic gesture recognition mainly uses a camera to track a moving target (a fingertip) and then determines the interaction semantics of the gesture by computing the correlation coefficient between the fingertip trajectory and preset templates [4,5].
A dynamic gesture recognition system is a fairly complex nonlinear dynamic system. The functional mapping between its input (the fingertip motion trajectory) and its output (the gesture class) is hard to determine, so dynamic gesture recognition must begin with system identification, that is, determining the model structure and model parameters of the recognition system from a set of experimental data. Because of their high parallelism, adaptive learning ability, and ability to approximate nonlinear systems, neural networks are widely used in the identification and control of nonlinear systems and in pattern recognition [6]. The neural networks currently used for system identification and pattern recognition are mainly multilayer feedforward networks trained with the BP algorithm. The traditional BP algorithm, however, has some inherent drawbacks: it is easily trapped in local minima, converges slowly, cannot make effective use of earlier learning experience, and lacks theoretical guidance for choosing the number of hidden layers and hidden neurons. Many improved algorithms have been proposed to overcome these drawbacks. Some modify the iteration rule of BP training through gradient descent or numerical optimization, such as gradient descent with momentum and the Levenberg-Marquardt method [6]; some select the training samples by orthogonal design to reduce the number of training samples [7]; and some apply particle swarm or genetic algorithm ideas to BP training [6,8]. These improved algorithms still cannot completely avoid lengthy training, and their training speed depends on the chosen training samples, so they are poorly suited to online learning of user gestures.
Document [9] proposes a theory and construction method for neural networks based on orthogonal polynomial functions and shows that, for the same highest degree, a network built from Chebyshev orthogonal polynomials performs best. Document [10] proposes identifying nonlinear dynamic systems with a Chebyshev orthogonal basis neural network and gives an adaptive training algorithm for it that greatly reduces the training computation relative to a traditional BP network. Document [11] proposes an adaptive Chebyshev neural network for nonlinear dynamic system identification and proves the stability and convergence of the method. Documents [12~15] construct orthogonal basis feedforward networks from several families of orthogonal basis functions for approximation in the single-input, single-output (SISO) case. They also propose a method that computes the network weights directly and an algorithm that adaptively determines the number of hidden neurons from the required training precision, which greatly increases training speed.
Machine-vision-based dynamic gesture recognition is a typical multi-input, multi-output nonlinear system. Its input is the dynamic gesture trajectory vector, which has multiple components, and its output is the recognition result, a vector indicating the likelihood of each gesture. Document [16] constructs a multi-input, multi-output orthogonal basis neural network from Laguerre orthogonal basis functions and obtains good results in dynamic gesture learning and recognition. The present work generalizes the Chebyshev feedforward neural network to the multi-input, multi-output (MIMO) case, proposes a direct weight calculation method and an algorithm that adaptively determines the number of hidden neurons from the required training precision, and applies them to dynamic gesture learning and recognition, improving both the learning speed and the recognition accuracy.
Summary of the invention
The object of the invention is to provide a dynamic gesture learning and recognition method based on a Chebyshev neural network that solves the slow learning and low recognition accuracy of existing methods for dynamic gestures.
The technical solution adopted by the invention comprises the following steps:
Step 1: Establish the MIMO-CNN neural network model;
The input layer has n nodes, and $X_k=(x_{1,k},x_{2,k},\dots,x_{n,k})^T\in R^n$ is the input vector of the network. The output layer has m nodes, and $Y_k=(y_{1,k},y_{2,k},\dots,y_{m,k})^T\in R^m$ is the output vector of the network; the Σ in the output layer nodes indicates that the output neurons use a linear activation function. The hidden layer has h neurons and uses the Chebyshev orthogonal polynomials $T_j(x)$ (j = 0, 1, 2, ..., h-1) as activation functions. The weights from the input layer to the hidden neurons are all 1, and $W\in R^{h\times m}$ is the weight matrix from the hidden layer to the output layer, where $w_{j,i}$ is the connection weight between the j-th hidden neuron and the i-th output node;
Step 2: Train the MIMO-CNN model with p learning samples. The inputs of these samples are $X_1, X_2, \dots, X_p$, the corresponding expected outputs are $D_1, D_2, \dots, D_p$, and the actual outputs are $Y_1, Y_2, \dots, Y_p$. The actual input/output relation of the network is expressed as:
$$y_{j,k}=\sum_{i=1}^{n} w_{j,i}\,T_i(x),\quad j=1,2,\dots,m;\ k=1,2,\dots,p$$
Define the error function:
$$E_{j,k}=y^{d}_{j,k}-y_{j,k}$$
where $y^{d}_{j,k}$ is the ideal output of the learning sample and $y_{j,k}$ is the actual output;
Step 3: Determine the MIMO-CNN weights with an iterative formula in which $W_{j,k}=(w_{j,1},w_{j,2},\dots,w_{j,k})$ and η is the learning rate, required to satisfy 0 < η < 1. The matrix form of the weight iteration is:
$$W(r+1)=W(r)-\eta\,\Phi^{T}\left(\Phi W(r)-D\right)$$
where Φ is the input transformation matrix and r = 0, 1, 2, ... is the iteration count. Once network training reaches a steady state (that is, when r is large enough):
$$W(r+1)=W(r)=\lim_{k\to\infty}W(k)=W$$
Substituting this into the iteration gives:
$$\Phi^{T}(\Phi W-D)=0$$
This yields the direct weight formula of the MIMO-Chebyshev network:
$$W=(\Phi^{T}\Phi)^{-1}\Phi^{T}D$$
where $\mathrm{pinv}(\Phi)=(\Phi^{T}\Phi)^{-1}\Phi^{T}$ is called the pseudoinverse of Φ. The pseudoinverse-based method obtains the MIMO-CNN weights in a single matrix operation, avoids lengthy iterative training, and suits fast learning and recognition of dynamic gestures;
Step 4: Adaptively determine the number of hidden neurons;
Given a target network precision ε > 0, the algorithm that adaptively determines the minimum number of hidden neurons meeting this precision is as follows:
(1) Set the target precision, taking ε = 0.005; set the maximum number of hidden neurons MaxHideNode = n + m + 20 and the initial value h = n + m + 1;
(2) If h > MaxHideNode, the target precision cannot be met within MaxHideNode neurons and the procedure exits; otherwise go to (3);
(3) Compute the weight matrix W and the mean squared error (MSE) of the network for h hidden neurons;
(4) If MSE ≤ ε, the minimum number of hidden neurons h that meets the target precision has been found and the procedure exits; otherwise set h = h + 1 and go to (2);
Step 5: Track the fingertips. The fingertips in two consecutive frames are matched with an optimal bipartite graph matching algorithm, such as the Kuhn-Munkres algorithm, which indirectly achieves fingertip tracking;
Step 6: Extract the dynamic gesture trajectory. A dynamic gesture trajectory is formed by the motion of two fingertips of one hand (thumb and index finger). Recording starts (pen down) when the two fingertips are detected approaching each other and their distance falls below a threshold, and ends (pen up) when the two fingertips are detected moving apart and their distance exceeds a threshold. From pen down until pen up, the midpoint between the two fingertips is recorded at a fixed time interval, which yields the set of points the gesture passes through, i.e. the gesture trajectory. For the subsequent learning and recognition, a fixed number of representative points is chosen from this set. If the trajectory has more than 15 points, the two closest points are found and replaced by their midpoint, and this is repeated until 15 points remain; these are the representative points of the trajectory. The 15 representative points yield 14 vectors, which are normalized, and concatenating these stroke vectors in order forms the vector that represents the trajectory;
Step 7: Dynamic gesture learning and recognition based on MIMO-CNN;
(1) Neural network input and output design:
The number of input nodes is determined by the number of components of the gesture trajectory vector, and each component is used as one network input. The number of output nodes equals the number m of dynamic gesture types, and the output vector of length m serves as the basis for judging the gesture;
(2) Determination of the number of hidden neurons and the network weights:
The key to dynamic gesture learning is to determine, from the gesture training samples, the number of hidden neurons and the network weights. The input vectors of the training samples are obtained by the trajectory extraction method described above. For the expected output vectors, since there are m kinds of dynamic gesture, m unit vectors of length m are set up, one per gesture, and the expected output vector $(y_1,y_2,\dots,y_m)^T$ of the N-th gesture is given by:
$$y_j=\begin{cases}0, & j\neq N\\ 1, & j=N\end{cases}$$
With the network precision target set to 0.005 and the given training samples, the minimum hidden neuron count algorithm determines the number of hidden neurons. Feeding the input vectors and expected output vectors of the training samples into the network and applying $W=(\Phi^{T}\Phi)^{-1}\Phi^{T}D$ gives the optimal weight matrix W of the MIMO-CNN, which can then be used for dynamic gesture recognition;
(3) Recognition of dynamic gestures:
After the dynamic gesture trajectory is obtained, the trajectory vector $(x_{1,k},x_{2,k},\dots,x_{n,k})^T$ is fed as the input vector to the MIMO-CNN with determined weights, and the formula
$$y_{j,k}=\sum_{i=1}^{n} w_{j,i}\,T_i(x),\quad j=1,2,\dots,m;\ k=1,2,\dots,p$$
gives the network output vector $(y_{1,k},y_{2,k},\dots,y_{m,k})^T$. Recognition is based on this output vector: find its largest element, and if that element exceeds a threshold, its position in the vector is the number of the recognized dynamic gesture.
The beneficial effects of the invention are fast learning of dynamic gestures and high recognition accuracy.
Brief description of the drawings
Fig. 1 is a schematic diagram of the MIMO-CNN model;
Fig. 2 is a flow chart of the adaptive algorithm for determining the number of MIMO-CNN hidden neurons;
Fig. 3 is a flow chart of fingertip detection and tracking;
Fig. 4 shows the gesture trajectory of the digit 6 and its sample points;
Fig. 5 illustrates the robustness of the MIMO-CNN to gesture input.
Detailed description of the embodiments
The invention is described in detail below with reference to the drawings and specific embodiments.
1 MIMO-Chebyshev BP network model
1.1 Chebyshev orthogonal polynomials. The definition of the Chebyshev orthogonal polynomials and their relevant properties are given below.
Definition 1. The Chebyshev polynomials are defined by the following recurrence:
$$T_0(x)=1,\quad T_1(x)=x,\quad T_{h+2}(x)=2x\,T_{h+1}(x)-T_h(x)\qquad(1)$$
$\{T_j(x)\}$ is a system of orthogonal polynomials on the interval [-1,1] with respect to the weight function $\rho(x)=1/\sqrt{1-x^2}$, and $T_j(x)$ is the degree-j orthogonal polynomial with respect to ρ(x) on [-1,1], that is:
$$\int_{-1}^{1}\rho(x)\,T_i(x)\,T_j(x)\,dx=\begin{cases}0, & i\neq j\\ \pi/2, & i=j>0\\ \pi, & i=j=0\end{cases}\qquad(2)$$
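The recurrence in formula (1) can be evaluated directly. The following is a minimal Python sketch; the function name `chebyshev_basis` is illustrative and does not come from the patent.

```python
def chebyshev_basis(x, h):
    """Return [T_0(x), ..., T_{h-1}(x)] using the three-term recurrence (1)."""
    if h <= 0:
        return []
    values = [1.0]                       # T_0(x) = 1
    if h > 1:
        values.append(float(x))          # T_1(x) = x
    for _ in range(2, h):
        # T_{k+2}(x) = 2x * T_{k+1}(x) - T_k(x)
        values.append(2.0 * x * values[-1] - values[-2])
    return values


print(chebyshev_basis(0.5, 5))  # [1.0, 0.5, -0.5, -1.0, -0.5]
```

Evaluating by recurrence costs one multiplication and one subtraction per extra polynomial, so the h hidden-layer activations for a single input are obtained in O(h) operations.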
Definition 2. Suppose φ(x) and all functions of a given function system are continuous on the interval [a, b], the function system is linearly independent on [a, b], and ρ(x) is a weight function on [a, b]. Determine the coefficients $w_0, w_1, \dots, w_{h-1}$ of the generalized polynomial formed from the function system so that its weighted squared error against φ(x) is minimal; the function obtained in this way is called the least squares approximation of φ(x) on [a, b] with respect to the weight function ρ(x).
If the linearly independent function system in Definition 2 is taken to be the Chebyshev orthogonal polynomial system, the following theorem holds.
Theorem 1. Let φ(x) be continuous on [-1,1] with a continuous first derivative. If the weighted sum of the first n Chebyshev polynomials is the least squares approximation of φ(x) on [-1,1], then as n → ∞ the series converges uniformly on [-1,1], and
$$\varphi(x)=w_0T_0(x)+w_1T_1(x)+w_2T_2(x)+\cdots\qquad(3)$$
Moreover, $w_k$ tends to 0 quickly as k increases. Theorem 1 shows that a weighted sum of finitely many Chebyshev orthogonal basis functions can approximate φ(x) to any precision.
1.2 MIMO-Chebyshev neural network (MIMO-CNN) model and weight determination. The MIMO-CNN model is shown in Fig. 1. The input layer has n nodes, and $X_k=(x_{1,k},x_{2,k},\dots,x_{n,k})^T\in R^n$ is the input vector of the network. The output layer has m nodes, and $Y_k=(y_{1,k},y_{2,k},\dots,y_{m,k})^T\in R^m$ is the output vector of the network; the Σ in the output layer nodes indicates that the output neurons use a linear activation function. The hidden layer has h neurons and uses the Chebyshev orthogonal polynomials $T_j(x)$ (j = 0, 1, 2, ..., h-1) as activation functions. The weights from the input layer to the hidden neurons are all 1, and $W\in R^{h\times m}$ is the weight matrix from the hidden layer to the output layer, where $w_{j,i}$ is the connection weight between the j-th hidden neuron and the i-th output node.
The Chebyshev polynomials are functions of a single variable, and the Chebyshev feedforward network converges only when its input lies in the interval [-1,1]; otherwise training cannot converge normally. To solve this problem, x is defined by the following S-shaped (sigmoid) function:
$$x=\frac{1}{1+e^{-\sigma Z}}\qquad(4)$$
where Z is formed from the network input. Formula (4) transforms the input from (-∞, ∞) into [0,1] and realizes a mapping from $R^n$ to $R$.
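A minimal Python sketch of the squashing step in formula (4). The text here does not give the definition of Z, so the sketch assumes Z is the plain sum of the input components, which would be consistent with the unit input-to-hidden weights; σ is treated as a gain parameter. The name `squash` and the default `sigma=1.0` are illustrative only.

```python
import math


def squash(inputs, sigma=1.0):
    """Map an n-component input to a scalar in (0, 1) as in formula (4).

    Assumption: Z is the sum of the components (unit input weights).
    """
    t = sigma * sum(inputs)
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    # Algebraically identical form that avoids overflow for very negative t.
    e = math.exp(t)
    return e / (1.0 + e)


print(squash([0.2, -0.1, 0.4]))
```

Because the result always lies between 0 and 1, it falls inside [-1,1], the interval on which the Chebyshev basis is defined.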
The MIMO-CNN shown in Fig. 1 is trained with p learning samples. The inputs of these samples are $X_1, X_2, \dots, X_p$, the corresponding expected outputs are $D_1, D_2, \dots, D_p$, and the actual outputs are $Y_1, Y_2, \dots, Y_p$. The actual input/output relation of the network can be expressed as:
$$y_{j,k}=\sum_{i=1}^{n} w_{j,i}\,T_i(x),\quad j=1,2,\dots,m;\ k=1,2,\dots,p\qquad(5)$$
Define the error function:
$$E_{j,k}=y^{d}_{j,k}-y_{j,k}\qquad(6)$$
where $y^{d}_{j,k}$ is the ideal output of the learning sample and $y_{j,k}$ is the actual output. The MIMO-CNN weights can then be determined by the iterative formula (7), in which $W_{j,k}=(w_{j,1},w_{j,2},\dots,w_{j,k})$ and η is the learning rate, required to satisfy 0 < η < 1.
The matrix form of the weight iteration is:
$$W(r+1)=W(r)-\eta\,\Phi^{T}\left(\Phi W(r)-D\right)\qquad(8)$$
where Φ is the input transformation matrix and r = 0, 1, 2, ... is the iteration count.
As with the SISO-Chebyshev neural network, the convergence and stability of this MIMO-Chebyshev weight iteration can be proved from the properties of the Chebyshev orthogonal basis functions. Like the traditional BP algorithm, however, it still needs lengthy iterative training to reach the optimal weights.
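To make the cost of the iteration concrete, here is a small Python sketch of update (8) on plain nested lists. It assumes Φ has one row per training sample and one column per hidden neuron, and D has one row per sample and one column per output; that layout is an assumption, since the definition of Φ is not reproduced in the text. The function name `iterate_weights` is illustrative.

```python
def iterate_weights(phi, d, eta, steps):
    """Repeat W <- W - eta * Phi^T (Phi W - D), starting from W = 0."""
    p, h, m = len(phi), len(phi[0]), len(d[0])
    w = [[0.0] * m for _ in range(h)]
    for _ in range(steps):
        # Residual R = Phi W - D, one row per sample.
        resid = [[sum(phi[k][j] * w[j][i] for j in range(h)) - d[k][i]
                  for i in range(m)] for k in range(p)]
        # Step along -Phi^T R, scaled by eta.
        for j in range(h):
            for i in range(m):
                w[j][i] -= eta * sum(phi[k][j] * resid[k][i] for k in range(p))
    return w


# Two samples, two basis functions (T_0 and T_1), one output.
phi = [[1.0, 0.5], [1.0, -0.5]]
d = [[1.0], [0.0]]
print(iterate_weights(phi, d, eta=0.5, steps=200))
```

In this toy case the iterates approach W = (0.5, 1.0), for which ΦW equals D exactly. The step size matters: the iteration contracts only when η is small relative to the largest eigenvalue of ΦᵀΦ, so 0 < η < 1 alone does not guarantee convergence for every Φ.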
Because the iteration (8) converges, once network training reaches a steady state (that is, when r is large enough):
$$W(r+1)=W(r)=\lim_{k\to\infty}W(k)=W$$
Substituting this into formula (8) gives:
$$\Phi^{T}(\Phi W-D)=0\qquad(9)$$
This yields the direct weight formula of the MIMO-Chebyshev network:
$$W=(\Phi^{T}\Phi)^{-1}\Phi^{T}D\qquad(10)$$
where $\mathrm{pinv}(\Phi)=(\Phi^{T}\Phi)^{-1}\Phi^{T}$ is called the pseudoinverse of Φ. The pseudoinverse-based method obtains the MIMO-CNN weights in a single matrix operation, avoids lengthy iterative training, and suits fast learning and recognition of dynamic gestures.
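The one-step solve of formula (10) can be sketched in pure Python by solving the normal equations (ΦᵀΦ)W = ΦᵀD with Gauss-Jordan elimination. This is an illustration under stated assumptions, not the patent's implementation: row k of Φ is assumed to hold $T_0(x_k),\dots,T_{h-1}(x_k)$ for the scalar hidden-layer input $x_k$ of sample k, and the names `chebyshev_row`, `direct_weights`, and `predict` are hypothetical.

```python
def chebyshev_row(x, h):
    """[T_0(x), ..., T_{h-1}(x)] by the three-term recurrence."""
    row = [1.0, float(x)][:h]
    while len(row) < h:
        row.append(2.0 * x * row[-1] - row[-2])
    return row


def direct_weights(xs, targets, h):
    """Return W (h x m) solving (Phi^T Phi) W = Phi^T D, i.e. formula (10)."""
    phi = [chebyshev_row(x, h) for x in xs]            # p x h (assumed layout)
    m = len(targets[0])
    # Augmented system [Phi^T Phi | Phi^T D] of size h x (h + m).
    aug = []
    for a in range(h):
        gram = [sum(r[a] * r[b] for r in phi) for b in range(h)]
        rhs = [sum(r[a] * t[c] for r, t in zip(phi, targets)) for c in range(m)]
        aug.append(gram + rhs)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(h):
        pivot = max(range(col, h), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("Phi^T Phi is singular for this h and sample set")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(h):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [v - factor * q for v, q in zip(aug[r], aug[col])]
    return [row[h:] for row in aug]


def predict(x, weights):
    """Network output: y_i = sum_j T_j(x) * w[j][i]."""
    basis = chebyshev_row(x, len(weights))
    return [sum(b * w_row[i] for b, w_row in zip(basis, weights))
            for i in range(len(weights[0]))]


xs = [0.1, 0.4, 0.6, 0.9]
targets = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
weights = direct_weights(xs, targets, h=3)
print(predict(0.2, weights))
```

The inverse in (10) exists only when ΦᵀΦ is nonsingular, which requires at least as many distinct samples as hidden neurons; the sketch raises an error otherwise instead of returning unreliable weights.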
1.3 Adaptive determination of the number of hidden neurons. For traditional BP networks there is no sound theoretical guidance for setting the number of hidden layers and hidden neurons, and no good method for choosing a specific network structure for a given problem; designers mostly rely on experience and trial and error. Too few hidden neurons cannot reach the required precision, while too many make hidden neurons redundant, and it is hard to guarantee an optimal structure. To address this, an adaptive algorithm for setting the number of hidden neurons is designed on top of direct weight determination. It determines the network structure automatically, quickly, and effectively for the problem at hand, meeting the target precision with the fewest hidden neurons.
Given a target network precision ε > 0, the algorithm that adaptively determines the minimum number of hidden neurons meeting this precision is shown in Fig. 2:
(1) Set the target precision, e.g. ε = 0.005; set the maximum number of hidden neurons MaxHideNode = n + m + 20 and the initial value h = n + m + 1;
(2) If h > MaxHideNode, the target precision cannot be met within MaxHideNode neurons and the procedure exits; otherwise go to (3);
(3) Compute the weight matrix W and the mean squared error (MSE) of the network for h hidden neurons;
(4) If MSE ≤ ε, the minimum number of hidden neurons h that meets the target precision has been found and the procedure exits; otherwise set h = h + 1 and go to (2).
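Steps (1) to (4) amount to a linear search over h. The sketch below follows the bounds stated in the text, h = n + m + 1 up to n + m + 20; `train_mse` is a hypothetical callback that trains the network with h hidden neurons (for example by direct weight determination) and returns its MSE.

```python
def select_hidden_count(train_mse, n, m, eps=0.005):
    """Return (h, mse) for the first h meeting the target, or None if none does."""
    h = n + m + 1                     # initial value from step (1)
    max_hide_node = n + m + 20        # upper bound from step (1)
    while h <= max_hide_node:         # step (2)
        mse = train_mse(h)            # step (3)
        if mse <= eps:                # step (4)
            return h, mse
        h += 1
    return None


# Stand-in error curve for illustration only: the target is met from h = 7 on.
print(select_hidden_count(lambda h: 0.01 if h < 7 else 0.001, n=2, m=1))
```

Because the search starts at the lower bound and increases h by one, the first h that meets the target is also the smallest one within the searched range.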
2 Acquisition and recognition of dynamic gesture samples
2.1 Fingertip tracking. Fig. 3 gives the processing flow of the fingertip detection and tracking method based on color histograms and optimal bipartite graph matching.
Because different fingertips have very similar color distributions and shapes, tracking algorithms based on color histograms or shape features, such as CamShift, extract nearly identical image features for each fingertip and easily confuse them during tracking. In addition, because different fingertips may cross, motion prediction algorithms such as the Kalman filter are easily confused when matching predicted and measured values. Note that tracking at the current moment only needs the fingertip positions of the previous moment: the fingertip positions at the two moments form two independent sets, and a fingertip position at one moment is associated with at most one position at the other, i.e. the association is one-to-one. Fingertip tracking between frames can therefore be converted entirely into an optimal bipartite graph matching problem, and an optimal bipartite matching algorithm such as the Kuhn-Munkres algorithm can match the fingertips in two frames, which indirectly achieves fingertip tracking.
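The one-to-one association can be illustrated with a brute-force stand-in for the Kuhn-Munkres algorithm: with only a handful of fingertips, trying every assignment is feasible and gives the same optimum. The sketch assumes the matching cost is the total Euclidean distance between associated positions, which the text does not specify; `match_fingertips` is an illustrative name.

```python
import math
from itertools import permutations


def match_fingertips(prev_points, curr_points):
    """Pair each previous fingertip with a distinct current fingertip.

    Returns a list of (prev_index, curr_index) pairs with minimum total
    distance. Exhaustive search, so only suitable for a few fingertips.
    """
    if len(prev_points) > len(curr_points):
        raise ValueError("need at least as many current points as previous ones")
    best = min(
        permutations(range(len(curr_points)), len(prev_points)),
        key=lambda perm: sum(math.dist(prev_points[i], curr_points[j])
                             for i, j in enumerate(perm)),
    )
    return list(enumerate(best))


# The detector reports the fingertips in swapped order in the second frame.
prev_frame = [(10, 10), (50, 50)]
curr_frame = [(52, 49), (11, 12)]
print(match_fingertips(prev_frame, curr_frame))  # [(0, 1), (1, 0)]
```

For larger point sets the factorial cost of enumeration becomes prohibitive, which is where a polynomial-time method such as Kuhn-Munkres is needed.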
2.2 Dynamic gesture trajectory extraction. A dynamic gesture trajectory is formed by the motion of two fingertips of one hand (thumb and index finger). Recording starts (pen down) when the two fingertips are detected approaching each other and their distance falls below a threshold, and ends (pen up) when the two fingertips are detected moving apart and their distance exceeds a threshold. From pen down until pen up, the midpoint between the two fingertips is recorded at a fixed time interval, which yields the set of points the gesture passes through, i.e. the gesture trajectory. For the subsequent learning and recognition, a fixed number of representative points (for example 15) is chosen from this set. If the trajectory has more than 15 points, the two closest points are found and replaced by their midpoint, and this is repeated until 15 points remain; these are the representative points of the trajectory. The 15 representative points yield 14 vectors, which are normalized. Concatenating these stroke vectors in order forms the vector (28 components) that represents the trajectory. For example, the vector corresponding to the digit "6" gesture trajectory in Fig. 4 is:
{-0.93,0.36,-0.72,0.69,-0.47,0.88,-0.40,0.92,-0.26,0.97,-0.07,1.00,0.11,0.99,0.65,0.76,
0.98,0.18,0.86,-0.51,0.46,-0.89,-0.49,-0.87,-0.98,-0.18,-1.00,0.00}.
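The reduction to 15 representative points and the 28-component vector can be sketched as follows. Two points are assumptions rather than statements from the patent: the "two closest points" are taken to be consecutive points along the trajectory, so that point order is preserved, and each of the 14 displacement vectors is scaled to unit length, which matches the example vector above. The function names are illustrative.

```python
import math


def reduce_points(points, target=15):
    """Merge the closest consecutive pair into its midpoint until `target` remain."""
    pts = [(float(x), float(y)) for x, y in points]
    while len(pts) > target:
        i = min(range(len(pts) - 1), key=lambda k: math.dist(pts[k], pts[k + 1]))
        mid = ((pts[i][0] + pts[i + 1][0]) / 2.0, (pts[i][1] + pts[i + 1][1]) / 2.0)
        pts[i:i + 2] = [mid]
    return pts


def trajectory_vector(points, target=15):
    """Concatenate unit displacement vectors between representative points."""
    pts = reduce_points(points, target)
    features = []
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        dx, dy = x1 - x0, y1 - y0
        norm = math.hypot(dx, dy)
        # A zero-length step has no direction; encode it as (0, 0).
        features += [dx / norm, dy / norm] if norm > 0 else [0.0, 0.0]
    return features


# 20 samples along a horizontal stroke give 14 unit vectors pointing right.
stroke = [(i, 0) for i in range(20)]
vec = trajectory_vector(stroke)
print(len(vec), vec[:4])  # 28 [1.0, 0.0, 1.0, 0.0]
```

The text does not say what happens when a trajectory has fewer than 15 points; in this sketch such a trajectory simply yields a shorter vector, which a real system would need to reject or resample.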
Gesture trajectories can be defined by the user. Sixteen default gestures are defined here: the digits 0 to 9, confirm, cancel, forward, back, page up, and page down.
2.3 Dynamic gesture learning and recognition based on MIMO-CNN. The following discusses how the MIMO-CNN introduced above is used to learn and recognize dynamic gestures.
(1) neural network input and output design
For the MIMO-CNN shown in Fig. 1, the number of input nodes is determined by the number of components of the gesture trajectory vector, and each component is used as one network input. The number of output nodes equals the number m of dynamic gesture types, and the output vector of length m serves as the basis for judging the gesture: the closer the output vector is to the expected output vector of the N-th gesture, the more likely the input gesture is the N-th gesture.
(2) determination of hidden layer neuron quantity and neural network weight
The key to dynamic gesture learning is to determine, from the gesture training samples, the number of hidden neurons and the network weights. If m kinds of dynamic gesture are to be recognized, m × 5 training samples are prepared, 5 for each gesture.
The input vectors of these training samples are obtained by the trajectory extraction method described above. For the expected output vectors, since there are m kinds of dynamic gesture, m unit vectors of length m are set up, one per gesture, and the expected output vector $(y_1,y_2,\dots,y_m)^T$ of the N-th gesture is given by the following formula.
$$y_j=\begin{cases}0, & j\neq N\\ 1, & j=N\end{cases}\qquad(11)$$
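Formula (11) is a one-hot encoding with gesture numbers starting at 1. A minimal Python sketch, with the illustrative name `expected_output`:

```python
def expected_output(gesture_number, m):
    """Expected output vector of formula (11): 1 at position N, 0 elsewhere."""
    if not 1 <= gesture_number <= m:
        raise ValueError("gesture number must be between 1 and m")
    return [1.0 if j == gesture_number else 0.0 for j in range(1, m + 1)]


print(expected_output(3, 5))  # [0.0, 0.0, 1.0, 0.0, 0.0]
```

Stacking these vectors as rows, one per training sample, gives the expected output matrix D used in formula (10).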
With the network precision target set to 0.005 and the given training samples, the minimum hidden neuron count algorithm discussed above determines the number of hidden neurons.
Feeding the input vectors and expected output vectors of the training samples into the network and applying formula (10) gives the optimal weight matrix W of the MIMO-CNN, which can then be used for dynamic gesture recognition.
(3) identification of dynamic gesture
After the dynamic gesture trajectory is obtained, the trajectory vector $(x_{1,k},x_{2,k},\dots,x_{n,k})^T$ is fed as the input vector to the MIMO-CNN with determined weights, and formula (5) gives the network output vector $(y_{1,k},y_{2,k},\dots,y_{m,k})^T$. Recognition is based on this output vector: find its largest element, and if that element exceeds a threshold (typically 0.5), its position in the vector is the number of the recognized dynamic gesture. For example, if a dynamic gesture vector is fed to the network and the output vector is (0.01,0.00,0.00,0.00,0.01,0.00,0.00,0.00,0.00,0.00,0.99,0.00,0.00,0.00,0.01,0.01), the largest element, 0.99, is at position 11, so the gesture is judged to be gesture number 11. If the largest element of the output vector is below the threshold, the gesture is judged not to be a predefined valid gesture.
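The decision rule is an arg-max with a rejection threshold. The Python sketch below reuses the output vector from the example in the text; the name `recognize` and the use of `None` for a rejected gesture are illustrative choices.

```python
def recognize(output, threshold=0.5):
    """Return the 1-based gesture number, or None if no element exceeds the threshold."""
    best = max(range(len(output)), key=lambda i: output[i])
    return best + 1 if output[best] > threshold else None


output = [0.01, 0.00, 0.00, 0.00, 0.01, 0.00, 0.00, 0.00,
          0.00, 0.00, 0.99, 0.00, 0.00, 0.00, 0.01, 0.01]
print(recognize(output))  # 11
```

The threshold lets the system reject trajectories that resemble none of the trained gestures instead of forcing them into the nearest class.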
3 Test results and analysis
To verify the effectiveness of the algorithm for dynamic gesture learning and recognition, a prototype system for dynamic gesture tracking and recognition was developed with Visual Studio and OpenCV. The system first collected 5 groups of training samples and 200 groups of recognition test samples. Dynamic gesture training and recognition were then carried out with a traditional BP neural network (trained separately with gradient descent with momentum, GDX, and with the LM algorithm) and with the MIMO-CNN (trained by direct weight determination), and the training and recognition results of the three were compared.
3.1 Network training test
Table 1. Comparison of training results for MIMO-CNN and conventional BP neural networks
As Table 1 shows, the MIMO-CNN determines its weights in one step with the direct weight determination method, which shortens the training time to only 0.31 s, whereas the BP networks trained with GDX and LM take on average 15.3 s and 6.5 s over 10 simulation runs. After training with GDX and LM (iteration limit 3000, performance goal 0.005), the BP network errors are 0.0049925 and 0.0042041 respectively; although they reach the 0.005 goal, the precision then stagnates and cannot progress to a smaller error. The MIMO-CNN error obtained by direct weight determination is 0.0003453. Compared with a conventional BP network, the MIMO-CNN therefore has a large advantage in training speed and also reaches better network precision.
3.2 Gesture recognition test: After the neural network has been trained, the gesture test samples to be recognized are input to the MIMO-CNN for recognition testing. The test results show that the MIMO-CNN has good robustness and generalization ability for dynamic gesture recognition and can tolerate a certain range of variation in the gesture input, as shown in Figure 5.
For the 200 groups of recognition test samples, recognition tests were carried out with the MIMO-CNN and with the BP neural networks whose weights had been determined. The average gesture recognition accuracy of the BP neural networks trained with the GDX and LM methods was 89.7% and 91.9% respectively, while the average gesture recognition accuracy of the MIMO-CNN reached 96.7%, a significant improvement.
Advantages of the present invention: fast dynamic gesture learning and real-time, robust dynamic gesture recognition are the basis for intuitive and natural gesture-based human-computer interaction. Building on a fast dynamic gesture detection and tracking method, the present invention constructs a MIMO-CNN based on Chebyshev orthogonal basis functions for learning and recognizing dynamic gestures. By determining the network weights directly and determining the number of hidden-layer neurons adaptively, it overcomes shortcomings of the traditional BP neural network such as slow convergence and low training precision, greatly increases the network training speed, and makes it convenient for users to train gestures online. At the same time it substantially improves the precision and generalization ability of the network, thereby improving the robustness of dynamic gesture recognition.
The above is only a preferred embodiment of the present invention and does not limit the present invention in any form; any simple modification, equivalent change, or variation made to the above embodiment according to the technical essence of the present invention falls within the scope of the technical solution of the present invention.
References:
[1] Alexander T C, Ahmed H S, Anagnostopoulos G C. An Open Source Framework for Real-Time, Incremental, Static and Dynamic Hand Gesture Learning and Recognition. Jacko J A. Human-Computer Interaction, Part II. Berlin: Springer-Verlag, 2009, 5611: 123~130.
[2] Jose M M, Ainhoa M E, Alvaro A, et al. Low-Cost Gesture-Based Interaction for Intelligent Environments. Omatu S. The 10th International Work Conference on Artificial Neural Networks, Part II. Berlin: Springer-Verlag, 2009, 5518: 752~755.
[3] Jiyoung P, Juneho Y, Crowley J L. Efficient Fingertip Tracking and Mouse Pointer Control for a Human Mouse. Crowley J L, et al. The 3rd International Conference on Computer Vision Systems. Berlin: Springer-Verlag, 2003, 2626: 88~97.
[4] Kaustubh S. Hand gesture modelling and recognition involving changing shapes and trajectories, using a Predictive EigenTracker. Pattern Recognition Letters, 2007(28): 329~334.
[5] Hardy F, Javier R S, Rodrigo V. Real-Time Hand Gesture Detection and Recognition Using Boosted Classifiers and Active Learning. Pirhonen A, Brewster S. 2007 IEEE Pacific-Rim Symposium on Image and Video Technology. Berlin: Springer-Verlag, 2007, 4872: 533~547.
[6] Martin T. Hagan, Howard B. Demuth, Mark H. Beale. Neural Network Design. Beijing: China Machine Press, 2002. 197~257.
[7] Zhou Y, Xu B L. Orthogonal Method for Training Neural Networks. Journal of Nanjing University (Natural Science), 2001, 37(1): 72~78 (in Chinese).
[8] Wang Z J, Yu W D, Chen Z Q, et al. A BP Neural Network Algorithm Based on Genetic Algorithms and Its Application. Journal of Nanjing University (Natural Science), 2003, 39(5): 459~465 (in Chinese).
[9] Wu X J, Wang S T, Yang J Y. The Study on the Orthogonal Polynomials-based Neural Networks and Its Properties. Computer Engineering and Application, 2002, 38(9): 25~26 (in Chinese).
[10] Jagdish C. Patra, Alex C. Kot. Nonlinear Dynamic System Identification Using Chebyshev Functional Link Artificial Neural Networks. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2002, 32(4): 505~511.
[11] Mu Li, Yigang He. Nonlinear system identification using adaptive Chebyshev neural networks. Intelligent Computing and Intelligent Systems (ICIS), 2010: 243~247.
[12] Zhang Y N, Chen Y L, Jiang X H, et al. Weights-directly-determined and Structure adaptively tuned Neural Network Based on Chebyshev Basis Functions. Computer Science, 2009, 36(6): 210~213 (in Chinese).
[13] Zhang Y N, Xiao X C, Chen Y W, et al. Number determination of hidden-layer nodes for Hermite feed-forward neural network. Journal of Zhejiang University (Engineering Science), 2010, 44(2): 271~275 (in Chinese).
[14] Zhang Y N, Zhong T K, Li W, et al. Laguerre orthogonal basis feed-forward neural network with its weights determined directly. Journal of Jinan University (Natural Science and Medicine Edition), 2008, 29(3): 249~253 (in Chinese).
[15] Xiao X C, Zhang Y N, Jiang X H, et al. Weights-direct-determination and structure-adaptive-determination of feedforward neural network activated with the 2nd-class Chebyshev orthogonal polynomials. Journal of Dalian Maritime University, 2009, 35(1): 80~84 (in Chinese).
[16] Li W S, Xie M, Yao Q. Dynamic Gesture Recognition Based on Laguerre Orthogonal Basis Neural Network. Journal of Nanjing University (Natural Science), 2011, 47(5): 515~523 (in Chinese).

Claims (1)

1. A dynamic gesture learning and recognition method based on a Chebyshev neural network, characterized in that it is carried out according to the following steps:
Step 1: establish the MIMO-CNN neural network model;
The input layer has n nodes, and $X_k = (x_{1,k}, x_{2,k}, \dots, x_{n,k})^T \in R^n$ is the input vector of the network; the output layer has m nodes, and $Y_k = (y_{1,k}, y_{2,k}, \dots, y_{m,k})^T \in R^m$ is the output vector of the network; the Σ in each output-layer node indicates that the output-layer neurons use a linear activation function; the hidden layer has h neurons, which use the Chebyshev orthogonal polynomials $T_j(x)$ ($j = 0, 1, 2, \dots, h-1$) as their activation functions; the weights from the input layer to the hidden-layer neurons are all 1; $W \in R^{h \times m}$ is the weight matrix from the hidden layer to the output layer, where $w_{j,i}$ is the connection weight between the j-th hidden-layer neuron and the i-th output-layer node;
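The model of Step 1 can be illustrated with a short sketch. It is written under two assumptions that the claim does not state explicitly: that $T_j$ denotes the Chebyshev polynomials of the first kind, generated by the recurrence $T_0(x) = 1$, $T_1(x) = x$, $T_{j+1}(x) = 2xT_j(x) - T_{j-1}(x)$, and that unit input-to-hidden weights mean each hidden neuron receives the sum of the input components. The function names `chebyshev_basis` and `forward` are hypothetical.

```python
from typing import List, Sequence


def chebyshev_basis(x: float, h: int) -> List[float]:
    """Return [T_0(x), ..., T_{h-1}(x)] via T_{j+1} = 2x T_j - T_{j-1}."""
    values = [1.0, x][:h]
    while len(values) < h:
        values.append(2.0 * x * values[-1] - values[-2])
    return values


def forward(inputs: Sequence[float], weights: Sequence[Sequence[float]]) -> List[float]:
    """Compute the m linear outputs from an h x m hidden-to-output weight matrix."""
    # Assumption: with all input-to-hidden weights equal to 1, every hidden
    # neuron sees the same net input, the sum of the input components.
    net = sum(inputs)
    hidden = chebyshev_basis(net, len(weights))
    m = len(weights[0])
    # Output node i is a linear (sum) neuron over the hidden activations.
    return [sum(hidden[j] * weights[j][i] for j in range(len(weights)))
            for i in range(m)]
```

For instance, `chebyshev_basis(0.5, 4)` evaluates the first four polynomials at 0.5. Because Chebyshev polynomials are orthogonal on [-1, 1], a practical system would presumably scale the inputs so that the net input stays in that interval.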
Step 2: train the MIMO-CNN neural network model with p learning samples; the inputs of these learning samples are $X_1, X_2, \dots, X_p$, the corresponding expected outputs are $D_1, D_2, \dots, D_p$, and the actual outputs are $Y_1, Y_2, \dots, Y_p$; the actual input/output relation of the network is expressed as:
$$y_{j,k} = \sum_{i=1}^{n} w_{j,i} T_i(x), \quad j = 1, 2, \dots, m; \; k = 1, 2, \dots, p$$
Define the error function: $$E_{j,k} = y^{d}_{j,k} - y_{j,k}$$
where $y^{d}_{j,k}$ is the ideal output of the learning sample and $y_{j,k}$ is the actual output;
Step 3: determine the MIMO-CNN weights by the weight iteration formula,
where $W_{j,k} = (w_{j,1}, w_{j,2}, \dots, w_{j,k})$ and η is the learning rate, with $0 < \eta < 1$; the matrix form of the weight iteration formula is $W(r+1) = W(r) - \eta \Phi^T (\Phi W(r) - D)$, where
$\Phi \in R^{p \times h}$ is the input transformation matrix and $r = 0, 1, 2, \dots$ is the iteration count; once network training reaches the steady state (that is, when r is sufficiently large):
$W(r+1) = W(r) = W$; substituting this into $W(r+1) = W(r) - \eta \Phi^T (\Phi W(r) - D)$ gives:
$$\Phi^T (\Phi W - D) = 0$$
The direct weight calculation formula of the MIMO-Chebyshev network follows:
$$W = (\Phi^T \Phi)^{-1} \Phi^T D$$
where $\mathrm{pinv}(\Phi) = (\Phi^T \Phi)^{-1} \Phi^T$ is called the pseudoinverse of Φ; the pseudoinverse-based weight calculation obtains the network weights of the MIMO-CNN directly in one step of matrix operations, avoids lengthy iterative training, and is suited to fast learning and recognition of dynamic gestures;
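The one-step weight formula of Step 3 can be sketched in plain Python as follows. This is only an illustration under stated assumptions: Φ is taken to have full column rank so that $\Phi^T\Phi$ is invertible, and the normal equations $\Phi^T\Phi W = \Phi^T D$ are solved by Gauss-Jordan elimination rather than by a library pseudoinverse routine. The helper names `transpose`, `matmul`, `solve`, and `direct_weights` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a: Matrix, b: Matrix) -> Matrix:
    """Solve a @ w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("Phi^T Phi is singular; Phi lacks full column rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def direct_weights(phi: Matrix, d: Matrix) -> Matrix:
    """Return W = (Phi^T Phi)^{-1} Phi^T D for a p x h Phi and a p x m D."""
    phi_t = transpose(phi)
    return solve(matmul(phi_t, phi), matmul(phi_t, d))
```

As a small check, with rows of Φ equal to (1, 0), (1, 1), (1, 2) and targets 1, 3, 5, the data lie exactly on 1 + 2x, so the sketch returns weights close to 1 and 2. For larger or ill-conditioned Φ, an SVD-based pseudoinverse from a numerical library would likely be a more robust choice.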
Step 4: adaptively determine the number of hidden-layer neurons of the network;
Given a network target precision ε > 0, the algorithm for adaptively determining the minimum number of hidden-layer neurons that meets this precision requirement is as follows:
(1) set the target precision, taking ε = 0.005; set the maximum number of hidden-layer neurons of the neural network $MaxHideNode = n + m + 20$ and the initial value $h = n + m + 1$;
(2) if h > MaxHideNode, the network target precision cannot be met within the range of MaxHideNode and the program exits; otherwise go to (3);
(3) compute the weight matrix W and the mean squared error (MSE) of the neural network when the number of hidden-layer neurons is h;
(4) if MSE ≤ ε, the minimum number of hidden-layer neurons h that meets the network target precision has been found and the program exits; otherwise set h = h + 1 and go to (2);
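The search loop of Step 4 can be sketched as below. The callback `mse_for` is a hypothetical stand-in for sub-step (3): it is assumed to compute the weights for a given h and return the resulting MSE. The function name `find_min_hidden` is likewise an illustration only.

```python
from typing import Callable, Optional


def find_min_hidden(n: int, m: int, mse_for: Callable[[int], float],
                    eps: float = 0.005) -> Optional[int]:
    """Return the first h in [n+m+1, n+m+20] with MSE <= eps, else None."""
    h = n + m + 1              # initial value from sub-step (1)
    max_hidden = n + m + 20    # MaxHideNode from sub-step (1)
    while h <= max_hidden:     # sub-step (2): stop once h exceeds the cap
        if mse_for(h) <= eps:  # sub-steps (3)-(4): train with h neurons, test MSE
            return h
        h += 1
    return None                # target precision not reachable within the cap
```

Because h increases by one from the initial value, the first h that meets the target is the smallest such h in the searched range; a return value of `None` corresponds to the program exiting without meeting the target precision.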
Step 5: perform fingertip tracking; an optimal matching algorithm for bipartite graphs, such as the Kuhn-Munkres algorithm, is used to match the fingertips in two frames of images, thereby indirectly achieving fingertip tracking;
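The matching idea of Step 5 can be illustrated with a small sketch. It does not implement the Kuhn-Munkres algorithm: since a hand has only a few fingertips, it simply enumerates all assignments and keeps the one with the smallest total distance, which gives the same optimal matching on such small inputs. Using Euclidean distance as the matching cost is an assumption, and the function name `match_fingertips` is hypothetical.

```python
from itertools import permutations
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def match_fingertips(prev: Sequence[Point], curr: Sequence[Point]) -> List[int]:
    """Return match[i] = index in curr assigned to prev[i], minimizing total distance."""
    if len(prev) > len(curr):
        raise ValueError("this sketch assumes len(prev) <= len(curr)")
    # Brute-force search over all injective assignments (fine for <= 5 fingertips).
    best = min(
        permutations(range(len(curr)), len(prev)),
        key=lambda perm: sum(dist(p, curr[j]) for p, j in zip(prev, perm)),
    )
    return list(best)
```

For example, fingertips at (0, 0) and (10, 0) in the previous frame and detections at (9, 1) and (1, 1) in the current frame are matched as `[1, 0]`. For larger point sets, a true Kuhn-Munkres (Hungarian) implementation would be the appropriate replacement for the enumeration.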
Step 6: extract the dynamic gesture trajectory; a dynamic gesture trajectory is formed by the motion of two fingertips of one hand (thumb and index finger); gesture recording starts (pen down) when the two fingertips are detected approaching each other and their distance falls below a threshold, and ends (pen up) when the two fingertips are detected moving apart and their distance exceeds a threshold; from pen down until pen up, the midpoint of the two fingertips is recorded at a fixed time interval, which yields the set of points that the dynamic gesture passes through, that is, the gesture trajectory; for the subsequent dynamic gesture learning and recognition, a fixed number of representative points is chosen from the point set of the gesture trajectory: if the trajectory has more than 15 points, the two points with the smallest distance between them are found and replaced by their midpoint, and this is repeated until the trajectory has 15 points, which serve as the representative points of the gesture trajectory; the 15 representative points yield 14 vectors, which are normalized, and the vectors of the gesture stroke are then concatenated in order to form the vector representing the gesture trajectory;
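The point reduction and vectorization of Step 6 can be sketched as follows. Two details are assumptions, since the claim leaves them open: the closest pair is searched only among consecutive trajectory points, and "normalized" is taken to mean scaling each of the 14 vectors to unit length. The function names `reduce_points` and `trajectory_vector` are hypothetical.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


def reduce_points(points: List[Point], target: int = 15) -> List[Point]:
    """Merge the closest pair of consecutive points until `target` points remain."""
    pts = list(points)
    while len(pts) > target:
        i = min(range(len(pts) - 1),
                key=lambda k: hypot(pts[k + 1][0] - pts[k][0],
                                    pts[k + 1][1] - pts[k][1]))
        mid = ((pts[i][0] + pts[i + 1][0]) / 2.0,
               (pts[i][1] + pts[i + 1][1]) / 2.0)
        pts[i:i + 2] = [mid]  # replace the two points with their midpoint
    return pts


def trajectory_vector(points: List[Point], target: int = 15) -> List[float]:
    """Concatenate the unit vectors between consecutive representative points."""
    pts = reduce_points(points, target)
    features: List[float] = []
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = hypot(dx, dy) or 1.0  # avoid dividing by zero for repeated points
        features.extend([dx / length, dy / length])
    return features
```

With 15 representative points in the plane, the result has 14 × 2 = 28 components, which would then be the number n of network input nodes. Trajectories with fewer than 15 points are passed through unchanged in this sketch; the claim does not say how they are handled.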
Step 7: dynamic gesture learning and recognition based on the MIMO-CNN;
(1) design of the neural network inputs and outputs:
The number of input nodes is determined by the number of components of the gesture trajectory vector, and each component of the gesture trajectory vector is used as an input of the neural network; the number of output-layer nodes is set to the number m of dynamic gesture types, and the output vector of length m serves as the basis for judging the dynamic gesture;
(2) determination of the number of hidden-layer neurons and the neural network weights:
The key to dynamic gesture learning is to determine, from the gesture training samples, the number of hidden-layer neurons and the network weights of the corresponding neural network; the input vectors of the training samples are obtained by the dynamic gesture trajectory extraction method described above; as for the expected output vectors of the training samples, because the dynamic gestures are divided into m types, m unit vectors of length m are set to correspond to the m gesture types, where the expected output vector $(y_1, y_2, \dots, y_m)^T$ of the N-th gesture type is determined by:
$$y_j = \begin{cases} 0, & j \neq N \\ 1, & j = N \end{cases}$$
The network precision target is set to 0.005; using the training samples provided, the number of hidden-layer neurons is determined by the adaptive algorithm for the minimum number of hidden-layer neurons; the input vectors and expected output vectors of the training samples are fed to the neural network, and the formula $W = (\Phi^T \Phi)^{-1} \Phi^T D$ yields the optimal weight matrix W of the MIMO-CNN, after which this MIMO-CNN can be used for dynamic gesture recognition;
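The expected output vectors described in sub-step (2) are unit (one-hot) vectors, which a minimal sketch can build as follows. The function name `expected_output` and the example label list are hypothetical.

```python
from typing import List


def expected_output(gesture_number: int, m: int) -> List[float]:
    """Return the length-m target with 1 at position N (1-based) and 0 elsewhere."""
    if not 1 <= gesture_number <= m:
        raise ValueError("gesture number must lie in 1..m")
    return [1.0 if j == gesture_number else 0.0 for j in range(1, m + 1)]


# Stacking one target row per training sample gives the p x m matrix D.
labels = [3, 1, 4]  # hypothetical gesture numbers of three training samples
targets = [expected_output(label, 4) for label in labels]
```

Each row of `targets` then pairs with the corresponding row of Φ when the weight formula is evaluated.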
(3) recognition of dynamic gestures:
After the dynamic gesture trajectory is obtained, the gesture trajectory vector $(x_{1,k}, x_{2,k}, \dots, x_{n,k})^T$ is fed as the input vector to the MIMO-CNN whose weights have been determined, and the formula $y_{j,k} = \sum_{i=1}^{n} w_{j,i} T_i(x)$, $j = 1, 2, \dots, m$; $k = 1, 2, \dots, p$, is used to compute the network output vector $(y_{1,k}, y_{2,k}, \dots, y_{m,k})^T$; gesture recognition is performed on this output vector: find the largest element of the vector; if this element is greater than a threshold, the position of this element in the vector is the number of the recognized dynamic gesture.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410514602.9A CN104573621A (en) 2014-09-30 2014-09-30 Dynamic gesture learning and identifying method based on Chebyshev neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410514602.9A CN104573621A (en) 2014-09-30 2014-09-30 Dynamic gesture learning and identifying method based on Chebyshev neural network

Publications (1)

Publication Number Publication Date
CN104573621A true CN104573621A (en) 2015-04-29

Family

ID=53089647

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410514602.9A Pending CN104573621A (en) 2014-09-30 2014-09-30 Dynamic gesture learning and identifying method based on Chebyshev neural network

Country Status (1)

Country Link
CN (1) CN104573621A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834922A (en) * 2015-05-27 2015-08-12 电子科技大学 Hybrid neural network-based gesture recognition method
CN105740823A (en) * 2016-02-01 2016-07-06 北京高科中天技术股份有限公司 Dynamic gesture trace recognition method based on depth convolution neural network
CN106354262A (en) * 2016-09-09 2017-01-25 哈尔滨理工大学 Optimized-neural-network gesture-recognition human-computer interaction method based on GL
CN106961104A (en) * 2017-04-06 2017-07-18 新疆大学 Wind power forecasting method based on data analysis and combination basis function neural network
CN107483813A (en) * 2017-08-08 2017-12-15 深圳市明日实业股份有限公司 A kind of method, apparatus and storage device that recorded broadcast is tracked according to gesture
CN107526438A (en) * 2017-08-08 2017-12-29 深圳市明日实业股份有限公司 The method, apparatus and storage device of recorded broadcast are tracked according to action of raising one's hand
CN108734329A (en) * 2017-04-21 2018-11-02 北京微影时代科技有限公司 A kind of method and device at prediction film next day box office
CN108877409A (en) * 2018-07-24 2018-11-23 王钦 The deaf-mute's auxiliary tool and its implementation shown based on gesture identification and VR
CN108885721A (en) * 2016-03-15 2018-11-23 学校法人冲绳科学技术大学院大学学园 Utilize the direct reverse intensified learning of density compared estimate
CN110692032A (en) * 2017-06-01 2020-01-14 奥迪股份公司 Method and device for automatic gesture recognition
CN110781742A (en) * 2019-09-23 2020-02-11 中国地质大学(武汉) Intelligent pedestrian flow identification travel management system
CN112132260A (en) * 2020-09-03 2020-12-25 深圳索信达数据技术有限公司 Training method, calling method, device and storage medium of neural network model
CN112183216A (en) * 2020-09-02 2021-01-05 温州大学 Auxiliary system for communication of disabled people

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834922B (en) * 2015-05-27 2017-11-21 电子科技大学 Gesture identification method based on hybrid neural networks
CN104834922A (en) * 2015-05-27 2015-08-12 电子科技大学 Hybrid neural network-based gesture recognition method
CN105740823A (en) * 2016-02-01 2016-07-06 北京高科中天技术股份有限公司 Dynamic gesture trace recognition method based on depth convolution neural network
CN105740823B (en) * 2016-02-01 2019-03-29 北京高科中天技术股份有限公司 Dynamic gesture track recognizing method based on depth convolutional neural networks
CN108885721B (en) * 2016-03-15 2022-05-06 学校法人冲绳科学技术大学院大学学园 Direct inverse reinforcement learning using density ratio estimation
CN108885721A (en) * 2016-03-15 2018-11-23 学校法人冲绳科学技术大学院大学学园 Utilize the direct reverse intensified learning of density compared estimate
CN106354262B (en) * 2016-09-09 2019-06-07 哈尔滨理工大学 Neural network gesture identification man-machine interaction method based on GL optimization
CN106354262A (en) * 2016-09-09 2017-01-25 哈尔滨理工大学 Optimized-neural-network gesture-recognition human-computer interaction method based on GL
CN106961104A (en) * 2017-04-06 2017-07-18 新疆大学 Wind power forecasting method based on data analysis and combination basis function neural network
CN106961104B (en) * 2017-04-06 2021-07-02 新疆大学 Wind power prediction method based on data analysis and combined basis function neural network
CN108734329A (en) * 2017-04-21 2018-11-02 北京微影时代科技有限公司 A kind of method and device at prediction film next day box office
CN110692032A (en) * 2017-06-01 2020-01-14 奥迪股份公司 Method and device for automatic gesture recognition
CN107526438A (en) * 2017-08-08 2017-12-29 深圳市明日实业股份有限公司 The method, apparatus and storage device of recorded broadcast are tracked according to action of raising one's hand
CN107483813A (en) * 2017-08-08 2017-12-15 深圳市明日实业股份有限公司 A kind of method, apparatus and storage device that recorded broadcast is tracked according to gesture
CN108877409A (en) * 2018-07-24 2018-11-23 王钦 The deaf-mute's auxiliary tool and its implementation shown based on gesture identification and VR
CN110781742A (en) * 2019-09-23 2020-02-11 中国地质大学(武汉) Intelligent pedestrian flow identification travel management system
CN112183216A (en) * 2020-09-02 2021-01-05 温州大学 Auxiliary system for communication of disabled people
CN112132260A (en) * 2020-09-03 2020-12-25 深圳索信达数据技术有限公司 Training method, calling method, device and storage medium of neural network model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150429