CN106778576A - An action recognition method based on SEHM feature map sequences - Google Patents
An action recognition method based on SEHM feature map sequences
- Publication number
- CN106778576A CN106778576A CN201611110573.5A CN201611110573A CN106778576A CN 106778576 A CN106778576 A CN 106778576A CN 201611110573 A CN201611110573 A CN 201611110573A CN 106778576 A CN106778576 A CN 106778576A
- Authority
- CN
- China
- Prior art keywords
- sehm
- action
- sequence
- frame
- characteristic patterns
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
Abstract
In the action recognition method provided by the present invention, recognition is performed with the SEHM (segment energy history map) feature maps proposed by the present invention as the low-level feature. By reasonably selecting parameters such as the time-slice length in the algorithm, the corresponding SEHM feature map sequence is computed and fed to a neural network for prediction, so that both offline and online recognition can be realised. Because the constructed SEHM feature maps are related to the change of the overall posture before and after an action, the motion information within the course of the action is fully used, which improves recognition accuracy. Moreover, since the raw data is compressed to some extent when the SEHM feature maps are computed, the computational complexity and hardware requirements of the method are low, and online real-time action recognition is attainable.
Description
Technical field
The present invention relates to the field of image recognition, and more particularly to an action recognition method based on SEHM feature map sequences.
Background technology
With the development of camera sensor technology, camera resolution has generally improved, and the number of cameras deployed in all kinds of scenes has grown sharply. In today's internet era, massive amounts of image and video data emerge in daily life, which has also driven the development of image processing technology. Action recognition, as one field within image processing, is widely used in many scenarios, including video surveillance, motion-sensing games, health care and social assistance. For example, in 2010 Microsoft released the Kinect, a motion-sensing peripheral for the Xbox 360, whose depth camera captures the player's limb movements for interaction with console games; developers can also use its development kit to build their own applications on the Windows platform, such as virtual fitting.
Despite these broad application scenarios, the development of action recognition has always faced many technical difficulties and constraints.
The first is the restriction of objective conditions. In video image sequences, the actual shooting situation frequently introduces unavoidable obstacles: the subject may be occluded by other objects (occlusion); the camera may not be fixed, causing the picture to shake (viewpoint jitter); the same person's colours change between light and dark (illumination conditions); and lens quality differs across cameras, producing large differences in image sharpness (resolution). In the field of action recognition, and indeed in image processing generally, all of the above must be taken into account.
The second is the influence of subjective conditions. As the subject processed by action recognition, different people have their own definition and understanding of the same action, and even the same action shows subtle differences. Concretely, when different people perform the same action, the duration, amplitude and pauses of the action often introduce many differences across the whole image sequence. Besides these differences caused by the subject's motion, different people also differ somewhat in body shape due to age and gender, and the distance from the camera and the angle towards the camera can all cause large differences between recorded actions. Each of the factors described above may increase the diversity of the data. Meanwhile, to implement an action recognition algorithm and provide concrete interfaces and applications for different industries and scenes, one must consider not only the accuracy of the algorithm but also other constraints, such as cost and real-time performance.
Action recognition algorithms generally take sensor output as the raw input data and combine preprocessing, feature computation and a classification model to judge the class of an action. Traditional methods typically take a conventional RGB camera as the input, but with the emergence of new sensors, more and more kinds of sensors are being applied to action recognition, such as depth cameras, infrared cameras and acceleration sensors. New sensors bring new kinds of input data into action recognition methods and have even given rise to many model-fusion methods. The depth map, a new data type distinct from the traditional RGB image, records at each pixel not a colour value but the distance from the camera. Because it carries distance information, research and algorithms based on it have attracted growing attention and interest.
Reference 1 discloses an action recognition method that takes depth maps as input and, according to their distance information, projects each depth map onto the three mutually orthogonal planes of a Cartesian coordinate system: the front view, the side view and the top view. Reference 1 proposes a new feature map, the depth energy map, then computes the corresponding HOG features of the depth energy maps under the different views and feeds them to an SVM classifier for prediction. The method directly combines the entire depth video sequence into a single depth energy map; it neither accounts for the overlap and redundancy between the earlier and later parts of an action nor considers the change of human posture over the whole action. For a video containing several different actions in succession, the method cannot correctly segment them and generate an energy map per action, so the multiple actions cannot be identified (multi-action video recognition); likewise, because no end frame can be chosen during online recognition, the depth energy map cannot be synthesised, so the real-time requirement cannot be met.
Reference 2 discloses an action recognition method that likewise first projects the depth maps onto the three coordinate planes and computes the corresponding depth energy maps, then applies another feature operator, LBP, as a higher-level feature. After computing the LBP features of the depth energy maps, it performs action recognition with an improved extreme learning machine model. This method also reduces the whole video sequence to a single depth energy map, ignoring the internal relation between the earlier and later postures of an action, and thus cannot meet the demands of multi-action video recognition, online recognition and real-time operation.
Reference 3 discloses an action recognition method that also projects the depth maps onto three view planes. Unlike the depth energy maps of References 1 and 2, which are computed over the whole video to represent distance change, this method computes a historical trajectory map of the active depth regions, taking into account the order in which postures appear; it further proposes a static posture map and an average energy map to enrich the feature input. However, although the method considers the order in which postures appear, it does not fully consider that earlier postures in the whole video sequence may be covered by later ones, so the first half of some actions is occluded by the second half and much information is lost. Synthesising the whole video sequence into a historical trajectory map does, to some extent, reflect the order of postures, but it ignores the interference of redundant motions. And although a static-region computation is added, only the absolute value of the motion energy map is considered, not the positive or negative direction of the motion energy. Like References 1 and 2, Reference 3 likewise cannot meet the demands of multi-action video recognition, online recognition and real-time operation.
Reference 1: Yang, Xiaodong, C. Zhang, and Y. L. Tian. "Recognizing Actions Using Depth Motion Maps-Based Histograms of Oriented Gradients." ACM International Conference on Multimedia, 2012: 1057-1060.
Reference 2: Chen, Chen, R. Jafari, and N. Kehtarnavaz. "Action Recognition from Depth Sequences Using Depth Motion Maps-Based Local Binary Patterns." IEEE Applications of Computer Vision, 2015: 1092-1099.
Reference 3: Liang, Bin, and L. Zheng. "3D Motion Trail Model Based Pyramid Histograms of Oriented Gradient for Action Recognition." International Conference on Pattern Recognition, IEEE Computer Society, 2014: 1952-1957.
Summary of the invention
To solve the above problems of the prior art, the present invention provides an action recognition method based on SEHM feature map sequences; the method can realise both offline and online recognition, with good real-time performance.
To achieve the above objective, the following technical scheme is adopted:
An action recognition method based on SEHM feature map sequences, comprising the following steps:
S1. For a time segment of N frames selected from the video, project the depth map of each frame in the depth map sequence onto the three mutually orthogonal planes of a Cartesian coordinate system, obtaining three orthogonal view maps: the front view, the side view and the top view;
S2. For the depth map sequence under each view, compute the difference of every pair of adjacent frames as the energy map, each energy map representing the distance change between the two frames; then, according to the concrete values of the energy map and a set threshold, divide the energy map into three state maps: a binary map of the forward state, a binary map of the backward state, and a binary map of the static state. Specifically:

E_t^v = Map_t^v - Map_{t-1}^v

EM_t^{v,1}(x, y) = 1 if E_t^v(x, y) > ε, else 0
EM_t^{v,2}(x, y) = 1 if E_t^v(x, y) < -ε, else 0
EM_t^{v,3}(x, y) = 1 if |E_t^v(x, y)| ≤ ε, else 0

where E_t^v is the energy map of frame t under view v; ε is the set threshold; |E_t^v| is the absolute value of the difference of the later frame minus the former frame; and i = 1, 2, 3 denotes, respectively, the binary map of the forward state, the binary map of the backward state, and the binary map of the static state. The state maps EM_t^{v,1}, EM_t^{v,2}, EM_t^{v,3} of frame t are represented together by a three-channel matrix EM_t;
S3. After step S2 is performed, a state map sequence is obtained under each of the three views. Divide each of the three N-frame state map sequences evenly into S time slices in temporal order, with S = N/K, where K is the length of each time slice. For the state map sequence under each view, take the state maps of one time slice at a time, in temporal order, and compute its SEHM feature map:
S31. Suppose the state maps of the time slice selected in the p-th round run from frame (p-1)*K+1 to frame p*K of the N-frame state map sequence; the SEHM feature map of this time slice is then computed by the following formula together with step S32:
SEHM_p = max(SEHM_p, EM_{(p-1)*K+k} · k)
where k has initial value 1 and SEHM_p is a three-channel matrix initialised to zero;
S32. Set k = k+1 and apply the formula of step S31 again until k > K; after a final standardization, output SEHM_p as the SEHM feature map of the time slice selected in the p-th round;
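The recurrence of steps S31 and S32 can be sketched as below. The final "standardization" is not spelled out in the text; division by K, which maps the values into [0, 1], is assumed here:

```python
import numpy as np

def sehm_for_slice(EM, p, K):
    """SEHM feature map of the p-th time slice (1-indexed), following
    SEHM_p = max(SEHM_p, EM_{(p-1)*K+k} * k) for k = 1..K.

    EM: (N, 3, H, W) state-map sequence for one view.
    The division by K at the end stands in for the unspecified
    standardization step (an assumption).
    """
    sehm = np.zeros(EM.shape[1:], dtype=np.float64)  # three-channel, zero-initialised
    for k in range(1, K + 1):
        frame = EM[(p - 1) * K + k - 1]  # 0-based index of frame (p-1)*K+k
        sehm = np.maximum(sehm, frame * k)  # later frames dominate via the weight k
    return sehm / K  # assumed standardization
```

Weighting each state map by its position k means that, where postures overlap, the more recent posture wins the max, which is what gives the map its "history" character.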
S4. Obtain the SEHM feature maps of every time slice under the three views through steps S31 and S32;
S5. Merge the SEHM feature maps of the mutually corresponding time slices across the three views, yielding merged SEHM feature maps with the time slice as the unit;
S6. The merged SEHM feature maps of the successive time slices constitute the SEHM feature map sequence. Input the SEHM feature map sequence into a neural network; the network outputs a probability vector P whose entries represent the likelihood of each action, and the action recognition result for the current N-frame depth map sequence is determined from the output probability vector P.
In the above scheme, action recognition is carried out on the basis of SEHM feature maps. By reasonably selecting parameters such as the time-slice length in the algorithm, the corresponding SEHM feature map sequence is computed and fed to a neural network for prediction, so both offline and online recognition can be realised. Because the constructed SEHM feature maps are related to the change of the overall posture before and after an action, the motion information within the course of the action can be fully used, improving recognition accuracy. Moreover, since the raw data is compressed to some extent when the SEHM feature maps are computed, the computational complexity and hardware requirements of the method are low, and online real-time recognition is attainable.
Preferably, the SEHM feature map computation is additionally applied to the full N-frame state map sequence under each of the three views, and the resulting per-view SEHM feature maps are merged to obtain a global SEHM feature map. In step S6, the global SEHM feature map together with the SEHM feature maps of the individual time slices constitutes the SEHM feature map sequence that is input into the neural network for action recognition. With this arrangement, the motion features of the whole time segment are also taken into account, further improving recognition accuracy.
Preferably, the N-frame depth map sequence used for action recognition in step S1 is selected by a sliding window. The sliding window includes a value m, which denotes the time span between the start frame of the next selected depth map sequence and the start frame of the previously selected one. Through the sliding window, multiple N-frame time segments can be selected from one video for action recognition, and the model gives a prediction result for each segment.
Preferably, ε = 30.
Preferably, K = 10.
Preferably, N = 80.
Preferably, in step S5, the SEHM feature maps of mutually corresponding time slices from the front view, side view and top view are merged in a 2:1:1 ratio.
Preferably, the neural network comprises convolutional layers, pooling layers, an LSTM layer, a fully connected layer and a Softmax layer;
the convolutional and pooling layers are used to extract high-level features from the SEHM feature map sequence;
the LSTM layer performs context processing on the high-level features of the extracted feature map sequence, outputting high-level features that carry timing information and recognise better;
the fully connected layer and the Softmax layer receive the high-level features output by the LSTM layer or by the convolutional and pooling layers, and output a prediction probability vector P.
Preferably, the probability vector P contains several probabilities p_i, where p_i denotes the probability that the recognised action is action i. The action recognition result in step S6 is then determined as follows: set a threshold ρ with a value between 0 and 1; if no probability in P exceeds ρ, the action in the N-frame depth map sequence is regarded as a meaningless action; otherwise the action with the largest probability is output as the recognition result.
Preferably, ρ = 0.5.
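The thresholded decision can be sketched as:

```python
def decide_action(P, rho=0.5):
    """Map the network's probability vector P to a recognition result.

    Returns the index of the most probable action, or None when no
    probability exceeds rho, i.e. the segment is treated as a
    meaningless action (rho = 0.5 per the preferred embodiment).
    """
    best = max(range(len(P)), key=lambda i: P[i])
    return best if P[best] > rho else None
```

Returning `None` for the "meaningless action" case is an illustrative convention; the text does not prescribe how that case is reported.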
Compared with the prior art, the beneficial effects of the invention are as follows:
In the action recognition method provided by the present invention, recognition is carried out on the basis of SEHM feature maps. By reasonably selecting parameters such as the time-slice length in the algorithm, the corresponding SEHM feature map sequence is computed and fed to a neural network for prediction, so both offline and online recognition can be realised. Because the constructed SEHM feature maps are related to the change of the overall posture before and after an action, the motion information within the course of the action can be fully used, improving recognition accuracy. Moreover, since the raw data is compressed to some extent when the SEHM feature maps are computed, the computational complexity and hardware requirements of the method are low, and online real-time recognition is attainable.
Brief description of the drawings
Fig. 1 is the construction decomposition diagram of the SEHM feature map sequence for a waving action.
Fig. 2 is the overall structure diagram of the neural network with an LSTM layer used in the embodiment.
Detailed description of the embodiments
The accompanying drawings are for illustration only and shall not be construed as limiting this patent.
The present invention is further elaborated below in conjunction with the drawings and embodiments.
Embodiment 1
Different people have their own definition and understanding of the same action, one of the most obvious manifestations being the difference in action duration; for subjective reasons, even the same person takes a different amount of time to perform the same action at different times. Most existing methods simply merge the entire depth video sequence into one new feature map. The consequence is that a large part of the spatio-temporal information in the video sequence is lost; actions with much postural overlap, such as drawing a cross in front of the body with one hand, lose particularly much information. To reduce this loss of information, the present invention proposes SEHM feature maps (segment energy history maps).
For a time segment of N frames of the depth video sequence, the depth map of each frame is first projected onto three orthogonal view maps (Map_f, Map_s, Map_t): the front view, the side view and the top view. For the depth map sequence under each view, the energy maps are then computed: the present invention calculates the difference of each pair of adjacent frames in the sequence (the later frame minus the former frame) as the energy map, each energy map representing the distance change between the two frames. According to the concrete values of each energy map, the present invention divides it by a threshold into three binary state maps: the forward state, the backward state and the static state. Specifically:

E_t^v = Map_t^v - Map_{t-1}^v

EM_t^{v,1}(x, y) = 1 if E_t^v(x, y) > ε, else 0
EM_t^{v,2}(x, y) = 1 if E_t^v(x, y) < -ε, else 0
EM_t^{v,3}(x, y) = 1 if |E_t^v(x, y)| ≤ ε, else 0

where E_t^v is the energy map of frame t under view v; ε is the set threshold; |E_t^v| is the absolute value of the difference of the later frame minus the former frame; and i = 1, 2, 3 denotes, respectively, the binary map of the forward state, the binary map of the backward state, and the binary map of the static state. The three state maps of frame t are represented together by a three-channel matrix EM_t.
By this computation, the energy map sequence under each view is obtained. However, the energy maps cannot be fed directly to the neural network as input data, because:
1. Image recognition generally requires datasets of millions of samples to achieve good results, whereas a simple action typically spans only tens to hundreds of frames, and every subject is a person of similar silhouette; compared with image recognition, action recognition would need a far larger dataset to reach a similar effect. If every frame were used as an input, a very large dataset would therefore be needed during training to obtain a considerable result.
2. Since the LSTM layer of the neural network must consider the context of the whole input sequence, taking each frame of the video as an input unit would keep the chosen time segment suitable but make the amount of computation large and the hardware demand high, while choosing too short a time segment would harm the training result.
In summary, the present invention first applies an appropriate compression and merging to the original depth sequence: the SEHM feature map computation.
After all energy map sequences under each view for the current time segment have been computed, the per-frame energy maps can be synthesised into SEHM feature maps. To keep the algorithm real-time, suitable values of N and K must be chosen according to the concrete properties of the action data. Meanwhile, to achieve multi-action video recognition, online recognition and real-time operation, the video is split by a sliding window into multiple time segments that are recognised separately. For example, if a video is 120 frames long, the segment length is 80 frames and the sliding window is 40 frames, then the SEHM feature map sequences are computed separately for the depth map sequences of frames 1 to 80 and frames 41 to 120. The computation yields the SEHM feature map sequences of the two segments, and the neural network model gives the action recognition result of each segment, thereby realising functions such as online recognition.
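The 120-frame example can be reproduced with a small helper; the 1-indexed start frames follow the convention of the text:

```python
def window_starts(num_frames, window=80, stride=40):
    """Start frames (1-indexed) of the sliding windows that split a
    video into fixed-length segments; defaults follow the example in
    the text (80-frame segments, 40-frame slide).
    """
    starts, s = [], 1
    while s + window - 1 <= num_frames:  # window must fit entirely in the video
        starts.append(s)
        s += stride
    return starts
```

For a 120-frame video this yields start frames 1 and 41, i.e. the segments 1-80 and 41-120 of the example.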
For the energy map sequence of a time segment of N frames, each of the three N-frame state map sequences of the views is divided evenly into S time slices in temporal order, with S = N/K, where K is the length of each time slice. For the state map sequence under each view, the state maps of one time slice are taken at a time, in temporal order, and its SEHM feature map is computed:
S31. Suppose the state maps of the time slice selected in the p-th round run from frame (p-1)*K+1 to frame p*K of the N-frame state map sequence; the SEHM feature map of this time slice is computed by the following formula together with step S32:
SEHM_p = max(SEHM_p, EM_{(p-1)*K+k} · k)
where k has initial value 1 and SEHM_p is a three-channel matrix initialised to zero;
S32. Set k = k+1 and apply the formula of step S31 again until k > K; after a final standardization, output SEHM_p as the SEHM feature map of the time slice selected in the p-th round.
After the feature maps of the time slices have been computed, the global SEHM feature map of the whole time segment is computed in the same manner, starting from the first frame of the segment and ending at its last frame. Through the above operations, the SEHM feature maps compress the video while retaining its key posture information. For actions of ordinary speed and complexity, N = 80 and K = 10 may be considered.
To obtain the final recognition result, the SEHM feature maps under the three views must also be merged. Considering that a neural network is able to handle the relation between the parts of a picture and the whole, the present invention merges the SEHM feature maps of corresponding time slices under the several views, or the global SEHM feature maps of the time segment, into one final SEHM feature map in a 2:1:1 ratio. The final SEHM feature maps are then handed to the neural network for feature extraction. Fig. 1 shows the structural composition of the final SEHM feature map sequence after merging.
For a pattern recognition method, besides the feature extraction applied to the raw data, the algorithm model is the most important part. Since the SEHM feature map sequence has undergone compression while preserving temporal order, an LSTM (long short-term memory) layer, a model capable of processing ordered input, works rather well. LSTM has achieved considerable success in natural language and speech, and in recent years has begun to be introduced to the image domain.
A deep neural network performs better the larger the dataset, so a pretrained model may be chosen. The AlexNet network model is an image recognition model for RGB images, and the person is one of the recognition types of its task. Considering that the SEHM feature map is a three-channel feature map carrying the rough contour features of the human body, it is believed that retraining with the parameters of AlexNet's convolutional and pooling layers as initial values can yield good results for the SEHM feature maps of the invention. The convolutional and pooling layer structure of AlexNet is used as the front section of the neural network of the invention, while the LSTM layers are attached to the rear section of the model; this both accelerates the training of the front section and improves precision. The overall structure of the neural network model is shown in Fig. 2.
As can be seen from Fig. 2, both the global SEHM feature map and the SEHM feature map sequence pass through the convolutional and pooling layers to extract high-level features. The difference is that the SEHM feature map sequence, because it carries context information, is further processed by the LSTM layers, which produce better high-level features; the global SEHM feature map, because it already covers the information of the whole time period, does not need to pass through the LSTM layers. Finally, their high-level features are fed into the fully connected layer and the Softmax layer to obtain a probability vector P (where each p_i in the vector represents the probability of being judged as class i).
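For illustration, the rear half of the model (an LSTM over the per-time-slice features produced by the convolutional/pooling front end, followed by a fully connected layer and Softmax) can be sketched in plain numpy; this single-cell LSTM with placeholder weights is a minimal sketch, not the trained network of the invention.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def lstm_step(x, h, c, W, b):
    # One standard LSTM cell step; W maps [x; h] to the four gate
    # pre-activations (input, forget, output, candidate).
    z = W @ np.concatenate([x, h]) + b
    d = h.size
    i = 1.0 / (1.0 + np.exp(-z[:d]))        # input gate
    f = 1.0 / (1.0 + np.exp(-z[d:2*d]))     # forget gate
    o = 1.0 / (1.0 + np.exp(-z[2*d:3*d]))   # output gate
    g = np.tanh(z[3*d:])                    # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def classify_sequence(features, W, b, W_fc, b_fc):
    # features: per-time-slice feature vectors from the CNN front end.
    # Runs the LSTM over the sequence, then the fully connected layer
    # and Softmax; returns the probability vector P over action classes.
    d = len(b) // 4
    h, c = np.zeros(d), np.zeros(d)
    for x in features:
        h, c = lstm_step(x, h, c, W, b)
    return softmax(W_fc @ h + b_fc)
```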
For the probability vector P of a certain time period, a threshold ρ between 0 and 1 can be defined: if no class in the probability vector has p_i greater than ρ, the action of that time period is regarded as a meaningless action; otherwise the class with the maximum probability p_i is taken as the predicted action.
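This decision rule reads directly as a few lines of code; `decide_action` and its `labels` argument are illustrative names, not part of the patent.

```python
import numpy as np

def decide_action(P, rho, labels):
    # If no class probability exceeds the threshold rho (0 < rho < 1),
    # the segment is a meaningless action (None is returned); otherwise
    # the most probable action label is returned.
    i = int(np.argmax(P))
    return labels[i] if P[i] > rho else None
```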
Obviously, the above embodiments of the present invention are merely examples given to illustrate the present invention clearly, and are not a limitation on the embodiments of the present invention. For those of ordinary skill in the art, other changes in different forms may also be made on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments exhaustively. Any modification, equivalent replacement and improvement made within the spirit and principle of the present invention shall be included within the protection scope of the claims of the present invention.
Claims (9)
1. An action recognition method based on SEHM feature map sequences, characterized by comprising the following steps:
S1. For a selected time segment of the video consisting of a depth map sequence of N frames, project the depth map of each frame in the depth map sequence onto the three mutually perpendicular planes of an orthogonal coordinate system, obtaining three orthogonal views: a front view, a side view and a top view;
S2. For the depth map sequence under each view, compute the difference between every two adjacent frames as an energy map, where each frame's energy map represents the change in distance between the two frames; then, according to the concrete values of the energy map and a set threshold, divide the energy map into three state maps: a forward-state binary map, a backward-state binary map, and a static-state binary map, specifically as follows:
where E_t^v is the energy map of frame t under view v; ε is the set threshold; the energy map is the absolute value of the difference obtained by subtracting the former frame from the later frame; i = 1, 2, 3 denote respectively the forward-state binary map, the backward-state binary map and the static-state binary map; the state map of frame t is represented by a three-channel matrix EM_t;
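As an aid to reading the claim, step S2 might be realized as below; since the original piecewise formula is not reproduced here, the sign convention (depth increase counted as "forward", decrease as "backward") is an assumption of this sketch.

```python
import numpy as np

def state_map(prev_depth, next_depth, eps=30):
    # Three-channel state map EM_t for one frame pair under one view:
    # channel 0: forward state  (depth change greater than +eps),
    # channel 1: backward state (depth change less than -eps),
    # channel 2: static state   (absolute change at most eps).
    diff = next_depth.astype(np.int64) - prev_depth.astype(np.int64)
    forward = diff > eps
    backward = diff < -eps
    static = ~forward & ~backward
    return np.stack([forward, backward, static], axis=-1).astype(np.uint8)
```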
S3. After step S2, a state map sequence is obtained under each of the three views. Divide the N-frame state map sequence of each view into S equal time slices according to temporal order, where S = N/K and K denotes the length of each time slice. For the state map sequence under each view, select the state map sequence of one time slice at a time, in temporal order, and compute its SEHM feature map:
S31. Let the state map sequence of the time slice selected for the p-th computation start at frame (p-1)*K+1 of the N-frame state map sequence and end at frame p*K; the SEHM feature map of this time slice is then computed by the following formula together with step S32:
SEHM_p = max(SEHM_p, EM_{(p-1)*K+k} · k)
where the initial value of k is 1, and SEHM_p is a three-channel matrix whose initial value is set to zero;
S32. Let k = k + 1 and execute the formula of step S31 again until k > K; after a final normalization, output SEHM_p as the SEHM feature map of the time slice selected for the p-th computation;
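The recursion of steps S31 and S32 (a weighted maximum over the K state maps of a time slice, then normalization) can be sketched as follows; the exact normalization is not specified in the claim, so division by the maximum is an assumption here.

```python
import numpy as np

def sehm_for_slice(state_maps, p, K):
    # state_maps: list of three-channel state maps EM_1..EM_N, stored
    # 0-indexed. Implements SEHM_p = max(SEHM_p, EM_{(p-1)*K+k} * k)
    # for k = 1..K (p starting at 1), then scales to [0, 1].
    sehm = np.zeros_like(state_maps[0], dtype=np.float64)
    for k in range(1, K + 1):
        em = state_maps[(p - 1) * K + k - 1].astype(np.float64)
        sehm = np.maximum(sehm, em * k)
    m = sehm.max()
    return sehm / m if m > 0 else sehm
```

Weighting EM by k makes later frames in the slice dominate, so the map encodes both where and in what order motion occurred.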
S4. Obtain the SEHM feature maps of every time slice under the three views through steps S31 and S32;
S5. Fuse the SEHM feature maps of mutually corresponding time slices under the three views, obtaining merged SEHM feature maps in units of time slices;
S6. The merged SEHM feature maps of the time slices constitute an SEHM feature map sequence; input the SEHM feature map sequence into a neural network, which outputs a probability vector P representing the possibility of each action; determine the action recognition result of the current N-frame depth map sequence according to the output probability vector P.
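Purely as an illustration of step S1, one way to project a depth frame onto the three orthogonal planes is sketched below, assuming integer depth values in [1, z_max] (0 meaning no measurement) and binary side/top projections; the claim itself fixes only that three orthogonal views are obtained.

```python
import numpy as np

def project_three_views(depth, z_max):
    # depth: (H, W) integer depth map. Front view = the map itself
    # (xy-plane); side view marks occupied (y, z) cells (yz-plane);
    # top view marks occupied (z, x) cells (xz-plane).
    H, W = depth.shape
    front = depth.copy()
    side = np.zeros((H, z_max), dtype=np.uint8)
    top = np.zeros((z_max, W), dtype=np.uint8)
    ys, xs = np.nonzero(depth)
    zs = depth[ys, xs] - 1          # depth value d -> plane index d-1
    side[ys, zs] = 1
    top[zs, xs] = 1
    return front, side, top
```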
2. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: the computation of SEHM feature maps is also carried out on the entire N-frame state map sequence under each of the three views, and the resulting SEHM feature maps under the three views are fused to obtain a global SEHM feature map; in step S6, the global SEHM feature map and the SEHM feature maps of the time slices together constitute the SEHM feature map sequence, which is input into the neural network for action recognition.
3. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: in step S1, the N-frame depth map sequence used for action recognition is selected through a sliding window, the sliding window including a window size value m that represents the time span between the start frame of the next selected depth map sequence and the start frame of the previously selected depth map sequence.
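The sliding-window selection of this claim amounts to stepping the start frame by m; a short sketch (the function name is illustrative):

```python
def sliding_windows(total_frames, N, m):
    # Start indices of the successive N-frame segments; consecutive
    # starts are m frames apart, so windows overlap whenever m < N.
    return list(range(0, total_frames - N + 1, m))
```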
4. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: ε = 30.
5. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: K = 10.
6. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: N = 80.
7. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: in step S5, the SEHM feature maps of mutually corresponding time slices of the front view, side view and top view are fused in a 2:1:1 ratio.
8. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: the neural network comprises convolutional layers, pooling layers, LSTM layers, a fully connected layer and a Softmax layer;
wherein the convolutional layers and pooling layers are used to extract high-level features from the SEHM feature map sequence;
the LSTM layers are used to perform context processing on the extracted high-level features of the feature map sequence, outputting high-level features with timing information that yield a better recognition effect;
the fully connected layer and Softmax layer are used to receive the high-level features output by the LSTM layers or by the convolutional and pooling layers, and to output a prediction probability vector P.
9. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: the probability vector P comprises several probabilities p_i, where p_i represents the probability that the recognized action is action i;
the process of determining the action recognition result in step S6 is then as follows:
set a threshold ρ with a value between 0 and 1; if the probability of no action in the probability vector P is greater than ρ, the action in the N-frame depth map sequence is regarded as a meaningless action; otherwise the action with the maximum recognition probability value is output as the recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611110573.5A CN106778576B (en) | 2016-12-06 | 2016-12-06 | Motion recognition method based on SEHM characteristic diagram sequence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106778576A true CN106778576A (en) | 2017-05-31 |
CN106778576B CN106778576B (en) | 2020-05-26 |
Family
ID=58874488
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611110573.5A Active CN106778576B (en) | 2016-12-06 | 2016-12-06 | Motion recognition method based on SEHM characteristic diagram sequence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106778576B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103886293A (en) * | 2014-03-21 | 2014-06-25 | 浙江大学 | Human body behavior recognition method based on history motion graph and R transformation |
CN104636725A (en) * | 2015-02-04 | 2015-05-20 | 华中科技大学 | Gesture recognition method based on depth image and gesture recognition system based on depth images |
CN105608421A (en) * | 2015-12-18 | 2016-05-25 | 中国科学院深圳先进技术研究院 | Human movement recognition method and device |
CN105631415A (en) * | 2015-12-25 | 2016-06-01 | 中通服公众信息产业股份有限公司 | Video pedestrian recognition method based on convolution neural network |
CN105740773A (en) * | 2016-01-25 | 2016-07-06 | 重庆理工大学 | Deep learning and multi-scale information based behavior identification method |
Non-Patent Citations (4)
Title |
---|
BIN LIANG et al.: "3D Motion Trail Model based Pyramid Histograms of Oriented Gradient for Action Recognition", 2014 22nd International Conference on Pattern Recognition * |
MARCUS EDEL et al.: "Binarized-BLSTM-RNN based Human Activity Recognition", 2016 International Conference on Indoor Positioning and Indoor Navigation (IPIN) * |
RUI YANG et al.: "DMM-Pyramid Based Deep Architectures for Action Recognition with Depth Cameras", ACCV 2014 * |
XIAODONG YANG et al.: "Recognizing Actions Using Depth Motion Maps-based Histograms of Oriented Gradients", MM '12 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107944376A (en) * | 2017-11-20 | 2018-04-20 | 北京奇虎科技有限公司 | The recognition methods of video data real-time attitude and device, computing device |
CN110633004A (en) * | 2018-06-21 | 2019-12-31 | 杭州海康威视数字技术股份有限公司 | Interaction method, device and system based on human body posture estimation |
CN110633004B (en) * | 2018-06-21 | 2023-05-26 | 杭州海康威视数字技术股份有限公司 | Interaction method, device and system based on human body posture estimation |
CN109002780A (en) * | 2018-07-02 | 2018-12-14 | 深圳码隆科技有限公司 | A kind of shopping process control method, device and user terminal |
CN109002780B (en) * | 2018-07-02 | 2020-12-18 | 深圳码隆科技有限公司 | Shopping flow control method and device and user terminal |
CN110138681A (en) * | 2019-04-19 | 2019-08-16 | 上海交通大学 | A kind of network flow identification method and device based on TCP message feature |
Also Published As
Publication number | Publication date |
---|---|
CN106778576B (en) | 2020-05-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||