CN106778576A - An action recognition method based on SEHM feature map sequences - Google Patents
An action recognition method based on SEHM feature map sequences
- Publication number
- CN106778576A CN106778576A CN201611110573.5A CN201611110573A CN106778576A CN 106778576 A CN106778576 A CN 106778576A CN 201611110573 A CN201611110573 A CN 201611110573A CN 106778576 A CN106778576 A CN 106778576A
- Authority
- CN
- China
- Prior art keywords
- sehm
- action
- sequence
- frame
- characteristic patterns
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
Abstract
In the action recognition method provided by the present invention, recognition is performed with the SEHM (segment energy history map) feature maps proposed by the present invention as the low-level feature. By reasonably selecting parameters such as the time-slice length in the algorithm, the corresponding SEHM feature map sequence is computed and fed to a neural network for prediction, so that both offline and online recognition can be realised. Because the constructed SEHM feature maps are related to the change of the overall posture before and after an action, the motion information within the course of the action is fully used, which improves recognition accuracy. Moreover, since the raw data is compressed to some extent when the SEHM feature maps are computed, the computational complexity and hardware requirements of the method are low, and online real-time action recognition is attainable.
Description
Technical field
The present invention relates to the field of image recognition, and more particularly to an action recognition method based on SEHM feature map sequences.
Background technology
With the development of camera sensor technology, camera resolution has generally improved, and the number of cameras deployed in all kinds of scenes has grown sharply. In today's internet era, massive amounts of image and video data emerge in daily life, which has also driven the development of image processing technology. Action recognition, as one field within image processing, is widely used in many scenarios, including video surveillance, motion-sensing games, health care and social assistance. For example, in 2010 Microsoft released the Kinect, a motion-sensing peripheral for the Xbox 360, whose depth camera captures the player's limb movements for interaction with console games; developers can also use its development kit to build their own applications on the Windows platform, such as virtual fitting.
Despite these broad application scenarios, the development of action recognition has always faced many technical difficulties and constraints.
The first is the restriction of objective conditions. In video image sequences, the actual shooting situation frequently introduces unavoidable obstacles: the subject may be occluded by other objects (occlusion); the camera may not be fixed, causing the picture to shake (viewpoint jitter); the same person's colours change between light and dark (illumination conditions); and lens quality differs across cameras, producing large differences in image sharpness (resolution). In the field of action recognition, and indeed in image processing generally, all of the above must be taken into account.
The second is the influence of subjective conditions. As the subject processed by action recognition, different people have their own definition and understanding of the same action, and even the same action shows subtle differences. Concretely, when different people perform the same action, the duration, amplitude and pauses of the action often introduce many differences across the whole image sequence. Besides these differences caused by the subject's motion, different people also differ somewhat in body shape due to age and gender, and the distance from the camera and the angle towards the camera can all cause large differences between recorded actions. Each of the factors described above may increase the diversity of the data. Meanwhile, to implement an action recognition algorithm and provide concrete interfaces and applications for different industries and scenes, one must consider not only the accuracy of the algorithm but also other constraints, such as cost and real-time performance.
Action recognition algorithms generally take sensor output as the raw input data and combine preprocessing, feature computation and a classification model to judge the class of an action. Traditional methods typically take a conventional RGB camera as the input, but with the emergence of new sensors, more and more kinds of sensors are being applied to action recognition, such as depth cameras, infrared cameras and acceleration sensors. New sensors bring new kinds of input data into action recognition methods and have even given rise to many model-fusion methods. The depth map, a new data type distinct from the traditional RGB image, records at each pixel not a colour value but the distance from the camera. Because it carries distance information, research and algorithms based on it have attracted growing attention and interest.
Reference 1 discloses an action recognition method that takes depth maps as input and, according to their distance information, projects each depth map onto the three mutually orthogonal planes of a Cartesian coordinate system: the front view, the side view and the top view. Reference 1 proposes a new feature map, the depth energy map, then computes the corresponding HOG features of the depth energy maps under the different views and feeds them to an SVM classifier for prediction. The method directly combines the entire depth video sequence into a single depth energy map; it neither accounts for the overlap and redundancy between the earlier and later parts of an action nor considers the change of human posture over the whole action. For a video containing several different actions in succession, the method cannot correctly segment them and generate an energy map per action, so the multiple actions cannot be identified (multi-action video recognition); likewise, because no end frame can be chosen during online recognition, the depth energy map cannot be synthesised, so the real-time requirement cannot be met.
Reference 2 discloses an action recognition method that likewise first projects the depth maps onto the three coordinate planes and computes the corresponding depth energy maps, then applies another feature operator, LBP, as a higher-level feature. After computing the LBP features of the depth energy maps, it performs action recognition with an improved extreme learning machine model. This method also reduces the whole video sequence to a single depth energy map, ignoring the internal relation between the earlier and later postures of an action, and thus cannot meet the demands of multi-action video recognition, online recognition and real-time operation.
Reference 3 discloses an action recognition method that also projects the depth maps onto three view planes. Unlike the depth energy maps of References 1 and 2, which are computed over the whole video to represent distance change, this method computes a historical trajectory map of the active depth regions, taking into account the order in which postures appear; it further proposes a static posture map and an average energy map to enrich the feature input. However, although the method considers the order in which postures appear, it does not fully consider that earlier postures in the whole video sequence may be covered by later ones, so the first half of some actions is occluded by the second half and much information is lost. Synthesising the whole video sequence into a historical trajectory map does, to some extent, reflect the order of postures, but it ignores the interference of redundant motions. And although a static-region computation is added, only the absolute value of the motion energy map is considered, not the positive or negative direction of the motion energy. Like References 1 and 2, Reference 3 likewise cannot meet the demands of multi-action video recognition, online recognition and real-time operation.
Reference 1: Yang, Xiaodong, C. Zhang, and Y. L. Tian. "Recognizing Actions Using Depth Motion Maps-Based Histograms of Oriented Gradients." ACM International Conference on Multimedia, 2012: 1057-1060.
Reference 2: Chen, Chen, R. Jafari, and N. Kehtarnavaz. "Action Recognition from Depth Sequences Using Depth Motion Maps-Based Local Binary Patterns." IEEE Applications of Computer Vision, 2015: 1092-1099.
Reference 3: Liang, Bin, and L. Zheng. "3D Motion Trail Model Based Pyramid Histograms of Oriented Gradient for Action Recognition." International Conference on Pattern Recognition, IEEE Computer Society, 2014: 1952-1957.
Summary of the invention
To solve the above problems of the prior art, the present invention provides an action recognition method based on SEHM feature map sequences; the method can realise both offline and online recognition, with good real-time performance.
To achieve the above objective, the following technical scheme is adopted:
An action recognition method based on SEHM feature map sequences, comprising the following steps:
S1. For a time segment of N frames selected from the video, project the depth map of each frame in the depth map sequence onto the three mutually orthogonal planes of a Cartesian coordinate system, obtaining three orthogonal view maps: the front view, the side view and the top view;
S2. For the depth map sequence under each view, compute the difference of every pair of adjacent frames as the energy map, each energy map representing the distance change between the two frames; then, according to the concrete values of the energy map and a set threshold, divide the energy map into three state maps: a binary map of the forward state, a binary map of the backward state, and a binary map of the static state. Specifically:

E_t^v = Map_t^v - Map_{t-1}^v

EM_t^{v,1}(x, y) = 1 if E_t^v(x, y) > ε, else 0
EM_t^{v,2}(x, y) = 1 if E_t^v(x, y) < -ε, else 0
EM_t^{v,3}(x, y) = 1 if |E_t^v(x, y)| ≤ ε, else 0

where E_t^v is the energy map of frame t under view v; ε is the set threshold; |E_t^v| is the absolute value of the difference of the later frame minus the former frame; and i = 1, 2, 3 denotes, respectively, the binary map of the forward state, the binary map of the backward state, and the binary map of the static state. The state maps EM_t^{v,1}, EM_t^{v,2}, EM_t^{v,3} of frame t are represented together by a three-channel matrix EM_t;
S3. After step S2 is performed, a state map sequence is obtained under each of the three views. Divide each of the three N-frame state map sequences evenly into S time slices in temporal order, with S = N/K, where K is the length of each time slice. For the state map sequence under each view, take the state maps of one time slice at a time, in temporal order, and compute its SEHM feature map:
S31. Suppose the state maps of the time slice selected in the p-th round run from frame (p-1)*K+1 to frame p*K of the N-frame state map sequence; the SEHM feature map of this time slice is then computed by the following formula together with step S32:
SEHM_p = max(SEHM_p, EM_{(p-1)*K+k} · k)
where k has initial value 1 and SEHM_p is a three-channel matrix initialised to zero;
S32. Set k = k+1 and apply the formula of step S31 again until k > K; after a final standardization, output SEHM_p as the SEHM feature map of the time slice selected in the p-th round;
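The recurrence of steps S31 and S32 can be sketched as below. The final "standardization" is not spelled out in the text; division by K, which maps the values into [0, 1], is assumed here:

```python
import numpy as np

def sehm_for_slice(EM, p, K):
    """SEHM feature map of the p-th time slice (1-indexed), following
    SEHM_p = max(SEHM_p, EM_{(p-1)*K+k} * k) for k = 1..K.

    EM: (N, 3, H, W) state-map sequence for one view.
    The division by K at the end stands in for the unspecified
    standardization step (an assumption).
    """
    sehm = np.zeros(EM.shape[1:], dtype=np.float64)  # three-channel, zero-initialised
    for k in range(1, K + 1):
        frame = EM[(p - 1) * K + k - 1]  # 0-based index of frame (p-1)*K+k
        sehm = np.maximum(sehm, frame * k)  # later frames dominate via the weight k
    return sehm / K  # assumed standardization
```

Weighting each state map by its position k means that, where postures overlap, the more recent posture wins the max, which is what gives the map its "history" character.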
S4. Obtain the SEHM feature maps of every time slice under the three views through steps S31 and S32;
S5. Merge the SEHM feature maps of the mutually corresponding time slices across the three views, yielding merged SEHM feature maps with the time slice as the unit;
S6. The merged SEHM feature maps of the successive time slices constitute the SEHM feature map sequence. Input the SEHM feature map sequence into a neural network; the network outputs a probability vector P whose entries represent the likelihood of each action, and the action recognition result for the current N-frame depth map sequence is determined from the output probability vector P.
In the above scheme, action recognition is carried out on the basis of SEHM feature maps. By reasonably selecting parameters such as the time-slice length in the algorithm, the corresponding SEHM feature map sequence is computed and fed to a neural network for prediction, so both offline and online recognition can be realised. Because the constructed SEHM feature maps are related to the change of the overall posture before and after an action, the motion information within the course of the action can be fully used, improving recognition accuracy. Moreover, since the raw data is compressed to some extent when the SEHM feature maps are computed, the computational complexity and hardware requirements of the method are low, and online real-time recognition is attainable.
Preferably, the SEHM feature map computation is additionally applied to the full N-frame state map sequence under each of the three views, and the resulting per-view SEHM feature maps are merged to obtain a global SEHM feature map. In step S6, the global SEHM feature map together with the SEHM feature maps of the individual time slices constitutes the SEHM feature map sequence that is input into the neural network for action recognition. With this arrangement, the motion features of the whole time segment are also taken into account, further improving recognition accuracy.
Preferably, the N-frame depth map sequence used for action recognition in step S1 is selected by a sliding window. The sliding window includes a value m, which denotes the time span between the start frame of the next selected depth map sequence and the start frame of the previously selected one. Through the sliding window, multiple N-frame time segments can be selected from one video for action recognition, and the model gives a prediction result for each segment.
Preferably, ε = 30.
Preferably, K = 10.
Preferably, N = 80.
Preferably, in step S5, the SEHM feature maps of mutually corresponding time slices from the front view, side view and top view are merged in a 2:1:1 ratio.
Preferably, the neural network comprises convolutional layers, pooling layers, an LSTM layer, a fully connected layer and a Softmax layer;
the convolutional and pooling layers are used to extract high-level features from the SEHM feature map sequence;
the LSTM layer performs context processing on the high-level features of the extracted feature map sequence, outputting high-level features that carry timing information and recognise better;
the fully connected layer and the Softmax layer receive the high-level features output by the LSTM layer or by the convolutional and pooling layers, and output a prediction probability vector P.
Preferably, the probability vector P contains several probabilities p_i, where p_i denotes the probability that the recognised action is action i. The action recognition result in step S6 is then determined as follows: set a threshold ρ with a value between 0 and 1; if no probability in P exceeds ρ, the action in the N-frame depth map sequence is regarded as a meaningless action; otherwise the action with the largest probability is output as the recognition result.
Preferably, ρ = 0.5.
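The thresholded decision can be sketched as:

```python
def decide_action(P, rho=0.5):
    """Map the network's probability vector P to a recognition result.

    Returns the index of the most probable action, or None when no
    probability exceeds rho, i.e. the segment is treated as a
    meaningless action (rho = 0.5 per the preferred embodiment).
    """
    best = max(range(len(P)), key=lambda i: P[i])
    return best if P[best] > rho else None
```

Returning `None` for the "meaningless action" case is an illustrative convention; the text does not prescribe how that case is reported.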
Compared with the prior art, the beneficial effects of the invention are as follows:
In the action recognition method provided by the present invention, recognition is carried out on the basis of SEHM feature maps. By reasonably selecting parameters such as the time-slice length in the algorithm, the corresponding SEHM feature map sequence is computed and fed to a neural network for prediction, so both offline and online recognition can be realised. Because the constructed SEHM feature maps are related to the change of the overall posture before and after an action, the motion information within the course of the action can be fully used, improving recognition accuracy. Moreover, since the raw data is compressed to some extent when the SEHM feature maps are computed, the computational complexity and hardware requirements of the method are low, and online real-time recognition is attainable.
Brief description of the drawings
Fig. 1 is the construction decomposition diagram of the SEHM feature map sequence for a waving action.
Fig. 2 is the overall structure diagram of the neural network with an LSTM layer used in the embodiment.
Detailed description of the embodiments
The accompanying drawings are for illustration only and shall not be construed as limiting this patent.
The present invention is further elaborated below in conjunction with the drawings and embodiments.
Embodiment 1
Different people have their own definition and understanding of the same action, one of the most obvious manifestations being the difference in action duration; for subjective reasons, even the same person takes a different amount of time to perform the same action at different times. Most existing methods simply merge the entire depth video sequence into one new feature map. The consequence is that a large part of the spatio-temporal information in the video sequence is lost; actions with much postural overlap, such as drawing a cross in front of the body with one hand, lose particularly much information. To reduce this loss of information, the present invention proposes SEHM feature maps (segment energy history maps).
For a time segment of N frames of the depth video sequence, the depth map of each frame is first projected onto three orthogonal view maps (Map_f, Map_s, Map_t): the front view, the side view and the top view. For the depth map sequence under each view, the energy maps are then computed: the present invention calculates the difference of each pair of adjacent frames in the sequence (the later frame minus the former frame) as the energy map, each energy map representing the distance change between the two frames. According to the concrete values of each energy map, the present invention divides it by a threshold into three binary state maps: the forward state, the backward state and the static state. Specifically:

E_t^v = Map_t^v - Map_{t-1}^v

EM_t^{v,1}(x, y) = 1 if E_t^v(x, y) > ε, else 0
EM_t^{v,2}(x, y) = 1 if E_t^v(x, y) < -ε, else 0
EM_t^{v,3}(x, y) = 1 if |E_t^v(x, y)| ≤ ε, else 0

where E_t^v is the energy map of frame t under view v; ε is the set threshold; |E_t^v| is the absolute value of the difference of the later frame minus the former frame; and i = 1, 2, 3 denotes, respectively, the binary map of the forward state, the binary map of the backward state, and the binary map of the static state. The three state maps of frame t are represented together by a three-channel matrix EM_t.
By this computation, the energy map sequence under each view is obtained. However, the energy maps cannot be fed directly to the neural network as input data, because:
1. Image recognition generally requires datasets of millions of samples to achieve good results, whereas a simple action typically spans only tens to hundreds of frames, and every subject is a person of similar silhouette; compared with image recognition, action recognition would need a far larger dataset to reach a similar effect. If every frame were used as an input, a very large dataset would therefore be needed during training to obtain a considerable result.
2. Since the LSTM layer of the neural network must consider the context of the whole input sequence, taking each frame of the video as an input unit would keep the chosen time segment suitable but make the amount of computation large and the hardware demand high, while choosing too short a time segment would harm the training result.
In summary, the present invention first applies an appropriate compression and merging to the original depth sequence: the SEHM feature map computation.
After all energy map sequences under each view for the current time segment have been computed, the per-frame energy maps can be synthesised into SEHM feature maps. To keep the algorithm real-time, suitable values of N and K must be chosen according to the concrete properties of the action data. Meanwhile, to achieve multi-action video recognition, online recognition and real-time operation, the video is split by a sliding window into multiple time segments that are recognised separately. For example, if a video is 120 frames long, the segment length is 80 frames and the sliding window is 40 frames, then the SEHM feature map sequences are computed separately for the depth map sequences of frames 1 to 80 and frames 41 to 120. The computation yields the SEHM feature map sequences of the two segments, and the neural network model gives the action recognition result of each segment, thereby realising functions such as online recognition.
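The 120-frame example can be reproduced with a small helper; the 1-indexed start frames follow the convention of the text:

```python
def window_starts(num_frames, window=80, stride=40):
    """Start frames (1-indexed) of the sliding windows that split a
    video into fixed-length segments; defaults follow the example in
    the text (80-frame segments, 40-frame slide).
    """
    starts, s = [], 1
    while s + window - 1 <= num_frames:  # window must fit entirely in the video
        starts.append(s)
        s += stride
    return starts
```

For a 120-frame video this yields start frames 1 and 41, i.e. the segments 1-80 and 41-120 of the example.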
For the energy map sequence of a time segment of N frames, each of the three N-frame state map sequences of the views is divided evenly into S time slices in temporal order, with S = N/K, where K is the length of each time slice. For the state map sequence under each view, the state maps of one time slice are taken at a time, in temporal order, and its SEHM feature map is computed:
S31. Suppose the state maps of the time slice selected in the p-th round run from frame (p-1)*K+1 to frame p*K of the N-frame state map sequence; the SEHM feature map of this time slice is computed by the following formula together with step S32:
SEHM_p = max(SEHM_p, EM_{(p-1)*K+k} · k)
where k has initial value 1 and SEHM_p is a three-channel matrix initialised to zero;
S32. Set k = k+1 and apply the formula of step S31 again until k > K; after a final standardization, output SEHM_p as the SEHM feature map of the time slice selected in the p-th round.
After the feature maps of the time slices have been computed, the global SEHM feature map of the whole time segment is computed in the same manner, starting from the first frame of the segment and ending at its last frame. Through the above operations, the SEHM feature maps compress the video while retaining its key posture information. For actions of ordinary speed and complexity, N = 80 and K = 10 may be considered.
To obtain the final recognition result, the SEHM feature maps under the three views must also be merged. Considering that a neural network is able to handle the relation between the parts of a picture and the whole, the present invention merges the SEHM feature maps of corresponding time slices under the several views, or the global SEHM feature maps of the time segment, into one final SEHM feature map in a 2:1:1 ratio. The final SEHM feature maps are then handed to the neural network for feature extraction. Fig. 1 shows the structural composition of the final SEHM feature map sequence after merging.
For a pattern recognition method, besides the feature extraction applied to the raw data, the algorithm model is the most important part. Since the SEHM feature map sequence has undergone compression while preserving temporal order, an LSTM (long short-term memory) layer, a model capable of processing ordered input, works rather well. LSTM has achieved considerable success in natural language and speech, and in recent years has begun to be introduced to the image domain.
A deep neural network performs better the larger the dataset, so a pretrained model may be chosen. The AlexNet network model is an image recognition model for RGB images, and the person is one of the recognition types of its task. Considering that the SEHM feature map is a three-channel feature map carrying the rough contour features of the human body, it is believed that retraining with the parameters of AlexNet's convolutional and pooling layers as initial values can yield good results for the SEHM feature maps of the invention. The convolutional and pooling layer structure of AlexNet is used as the front section of the neural network of the invention, while the LSTM layers are attached to the rear section of the model; this both accelerates the training of the front section and improves precision. The overall structure of the neural network model is shown in Fig. 2.
As can be seen from Fig. 2, both the global SEHM feature map and the SEHM feature map sequence pass through the convolutional and pooling layers to extract high-level features. The difference is that the SEHM feature map sequence, because it carries context information, is further processed by the LSTM layers, which produce better high-level features; the global SEHM feature map, because it already covers the information of the whole time period, does not need to pass through the LSTM layers. Finally, their high-level features are fed into the fully connected layer and the Softmax layer to obtain a probability vector P (where each p_i in the vector represents the probability of being judged as class i).
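For illustration, the rear half of the model (an LSTM over the per-time-slice features produced by the convolutional/pooling front end, followed by a fully connected layer and Softmax) can be sketched in plain numpy; this single-cell LSTM with placeholder weights is a minimal sketch, not the trained network of the invention.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def lstm_step(x, h, c, W, b):
    # One standard LSTM cell step; W maps [x; h] to the four gate
    # pre-activations (input, forget, output, candidate).
    z = W @ np.concatenate([x, h]) + b
    d = h.size
    i = 1.0 / (1.0 + np.exp(-z[:d]))        # input gate
    f = 1.0 / (1.0 + np.exp(-z[d:2*d]))     # forget gate
    o = 1.0 / (1.0 + np.exp(-z[2*d:3*d]))   # output gate
    g = np.tanh(z[3*d:])                    # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def classify_sequence(features, W, b, W_fc, b_fc):
    # features: per-time-slice feature vectors from the CNN front end.
    # Runs the LSTM over the sequence, then the fully connected layer
    # and Softmax; returns the probability vector P over action classes.
    d = len(b) // 4
    h, c = np.zeros(d), np.zeros(d)
    for x in features:
        h, c = lstm_step(x, h, c, W, b)
    return softmax(W_fc @ h + b_fc)
```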
For the probability vector P of a certain time period, a threshold ρ between 0 and 1 can be defined: if no class in the probability vector has p_i greater than ρ, the action of that time period is regarded as a meaningless action; otherwise the class with the maximum probability p_i is taken as the predicted action.
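This decision rule reads directly as a few lines of code; `decide_action` and its `labels` argument are illustrative names, not part of the patent.

```python
import numpy as np

def decide_action(P, rho, labels):
    # If no class probability exceeds the threshold rho (0 < rho < 1),
    # the segment is a meaningless action (None is returned); otherwise
    # the most probable action label is returned.
    i = int(np.argmax(P))
    return labels[i] if P[i] > rho else None
```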
Obviously, the above embodiments of the present invention are merely examples given to illustrate the present invention clearly, and are not a limitation on the embodiments of the present invention. For those of ordinary skill in the art, other changes in different forms may also be made on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments exhaustively. Any modification, equivalent replacement and improvement made within the spirit and principle of the present invention shall be included within the protection scope of the claims of the present invention.
Claims (9)
1. An action recognition method based on SEHM feature map sequences, characterized by comprising the following steps:
S1. For a selected time segment of the video consisting of a depth map sequence of N frames, project the depth map of each frame in the depth map sequence onto the three mutually perpendicular planes of an orthogonal coordinate system, obtaining three orthogonal views: a front view, a side view and a top view;
S2. For the depth map sequence under each view, compute the difference between every two adjacent frames as an energy map, where each frame's energy map represents the change in distance between the two frames; then, according to the concrete values of the energy map and a set threshold, divide the energy map into three state maps: a forward-state binary map, a backward-state binary map, and a static-state binary map, specifically as follows:
where E_t^v is the energy map of frame t under view v; ε is the set threshold; the energy map is the absolute value of the difference obtained by subtracting the former frame from the later frame; i = 1, 2, 3 denote respectively the forward-state binary map, the backward-state binary map and the static-state binary map; the state map of frame t is represented by a three-channel matrix EM_t;
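As an aid to reading the claim, step S2 might be realized as below; since the original piecewise formula is not reproduced here, the sign convention (depth increase counted as "forward", decrease as "backward") is an assumption of this sketch.

```python
import numpy as np

def state_map(prev_depth, next_depth, eps=30):
    # Three-channel state map EM_t for one frame pair under one view:
    # channel 0: forward state  (depth change greater than +eps),
    # channel 1: backward state (depth change less than -eps),
    # channel 2: static state   (absolute change at most eps).
    diff = next_depth.astype(np.int64) - prev_depth.astype(np.int64)
    forward = diff > eps
    backward = diff < -eps
    static = ~forward & ~backward
    return np.stack([forward, backward, static], axis=-1).astype(np.uint8)
```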
S3. After step S2, a state map sequence is obtained under each of the three views. Divide the N-frame state map sequence of each view into S equal time slices according to temporal order, where S = N/K and K denotes the length of each time slice. For the state map sequence under each view, select the state map sequence of one time slice at a time, in temporal order, and compute its SEHM feature map:
S31. Let the state map sequence of the time slice selected for the p-th computation start at frame (p-1)*K+1 of the N-frame state map sequence and end at frame p*K; the SEHM feature map of this time slice is then computed by the following formula together with step S32:
SEHM_p = max(SEHM_p, EM_{(p-1)*K+k} · k)
where the initial value of k is 1, and SEHM_p is a three-channel matrix whose initial value is set to zero;
S32. Let k = k + 1 and execute the formula of step S31 again until k > K; after a final normalization, output SEHM_p as the SEHM feature map of the time slice selected for the p-th computation;
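The recursion of steps S31 and S32 (a weighted maximum over the K state maps of a time slice, then normalization) can be sketched as follows; the exact normalization is not specified in the claim, so division by the maximum is an assumption here.

```python
import numpy as np

def sehm_for_slice(state_maps, p, K):
    # state_maps: list of three-channel state maps EM_1..EM_N, stored
    # 0-indexed. Implements SEHM_p = max(SEHM_p, EM_{(p-1)*K+k} * k)
    # for k = 1..K (p starting at 1), then scales to [0, 1].
    sehm = np.zeros_like(state_maps[0], dtype=np.float64)
    for k in range(1, K + 1):
        em = state_maps[(p - 1) * K + k - 1].astype(np.float64)
        sehm = np.maximum(sehm, em * k)
    m = sehm.max()
    return sehm / m if m > 0 else sehm
```

Weighting EM by k makes later frames in the slice dominate, so the map encodes both where and in what order motion occurred.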
S4. Obtain the SEHM feature maps of every time slice under the three views through steps S31 and S32;
S5. Fuse the SEHM feature maps of mutually corresponding time slices under the three views, obtaining merged SEHM feature maps in units of time slices;
S6. The merged SEHM feature maps of the time slices constitute an SEHM feature map sequence; input the SEHM feature map sequence into a neural network, which outputs a probability vector P representing the possibility of each action; determine the action recognition result of the current N-frame depth map sequence according to the output probability vector P.
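Purely as an illustration of step S1, one way to project a depth frame onto the three orthogonal planes is sketched below, assuming integer depth values in [1, z_max] (0 meaning no measurement) and binary side/top projections; the claim itself fixes only that three orthogonal views are obtained.

```python
import numpy as np

def project_three_views(depth, z_max):
    # depth: (H, W) integer depth map. Front view = the map itself
    # (xy-plane); side view marks occupied (y, z) cells (yz-plane);
    # top view marks occupied (z, x) cells (xz-plane).
    H, W = depth.shape
    front = depth.copy()
    side = np.zeros((H, z_max), dtype=np.uint8)
    top = np.zeros((z_max, W), dtype=np.uint8)
    ys, xs = np.nonzero(depth)
    zs = depth[ys, xs] - 1          # depth value d -> plane index d-1
    side[ys, zs] = 1
    top[zs, xs] = 1
    return front, side, top
```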
2. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: the computation of SEHM feature maps is also carried out on the entire N-frame state map sequence under each of the three views, and the resulting SEHM feature maps under the three views are fused to obtain a global SEHM feature map; in step S6, the global SEHM feature map and the SEHM feature maps of the time slices together constitute the SEHM feature map sequence, which is input into the neural network for action recognition.
3. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: in step S1, the N-frame depth map sequence used for action recognition is selected through a sliding window, the sliding window including a window size value m that represents the time span between the start frame of the next selected depth map sequence and the start frame of the previously selected depth map sequence.
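The sliding-window selection of this claim amounts to stepping the start frame by m; a short sketch (the function name is illustrative):

```python
def sliding_windows(total_frames, N, m):
    # Start indices of the successive N-frame segments; consecutive
    # starts are m frames apart, so windows overlap whenever m < N.
    return list(range(0, total_frames - N + 1, m))
```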
4. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: ε = 30.
5. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: K = 10.
6. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: N = 80.
7. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: in step S5, the SEHM feature maps of mutually corresponding time slices of the front view, side view and top view are fused in a 2:1:1 ratio.
8. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: the neural network comprises convolutional layers, pooling layers, LSTM layers, a fully connected layer and a Softmax layer;
wherein the convolutional layers and pooling layers are used to extract high-level features from the SEHM feature map sequence;
the LSTM layers are used to perform context processing on the extracted high-level features of the feature map sequence, outputting high-level features with timing information that yield a better recognition effect;
the fully connected layer and Softmax layer are used to receive the high-level features output by the LSTM layers or by the convolutional and pooling layers, and to output a prediction probability vector P.
9. The action recognition method based on SEHM feature map sequences according to claim 1, characterized in that: the probability vector P comprises several probabilities p_i, where p_i represents the probability that the recognized action is action i;
the process of determining the action recognition result in step S6 is then as follows:
set a threshold ρ with a value between 0 and 1; if the probability of no action in the probability vector P is greater than ρ, the action in the N-frame depth map sequence is regarded as a meaningless action; otherwise the action with the maximum recognition probability value is output as the recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611110573.5A CN106778576B (en) | 2016-12-06 | 2016-12-06 | Motion recognition method based on SEHM characteristic diagram sequence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106778576A true CN106778576A (en) | 2017-05-31 |
CN106778576B CN106778576B (en) | 2020-05-26 |
Family
ID=58874488
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611110573.5A Active CN106778576B (en) | 2016-12-06 | 2016-12-06 | Motion recognition method based on SEHM characteristic diagram sequence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106778576B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103886293A (en) * | 2014-03-21 | 2014-06-25 | 浙江大学 | Human body behavior recognition method based on history motion graph and R transformation |
CN104636725A (en) * | 2015-02-04 | 2015-05-20 | 华中科技大学 | Gesture recognition method based on depth image and gesture recognition system based on depth images |
CN105608421A (en) * | 2015-12-18 | 2016-05-25 | 中国科学院深圳先进技术研究院 | Human movement recognition method and device |
CN105631415A (en) * | 2015-12-25 | 2016-06-01 | 中通服公众信息产业股份有限公司 | Video pedestrian recognition method based on convolution neural network |
CN105740773A (en) * | 2016-01-25 | 2016-07-06 | 重庆理工大学 | Deep learning and multi-scale information based behavior identification method |
Non-Patent Citations (4)
Title |
---|
BIN LIANG et al.: "3D Motion Trail Model based Pyramid Histograms of Oriented Gradient for Action Recognition", 2014 22nd International Conference on Pattern Recognition * |
MARCUS EDEL et al.: "Binarized-BLSTM-RNN based Human Activity Recognition", 2016 International Conference on Indoor Positioning and Indoor Navigation (IPIN) * |
RUI YANG et al.: "DMM-Pyramid Based Deep Architectures for Action Recognition with Depth Cameras", ACCV 2014 * |
XIAODONG YANG et al.: "Recognizing Actions Using Depth Motion Maps-based Histograms of Oriented Gradients", MM '12 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107944376A (en) * | 2017-11-20 | 2018-04-20 | 北京奇虎科技有限公司 | The recognition methods of video data real-time attitude and device, computing device |
CN110633004A (en) * | 2018-06-21 | 2019-12-31 | 杭州海康威视数字技术股份有限公司 | Interaction method, device and system based on human body posture estimation |
CN110633004B (en) * | 2018-06-21 | 2023-05-26 | 杭州海康威视数字技术股份有限公司 | Interaction method, device and system based on human body posture estimation |
CN109002780A (en) * | 2018-07-02 | 2018-12-14 | 深圳码隆科技有限公司 | A kind of shopping process control method, device and user terminal |
CN109002780B (en) * | 2018-07-02 | 2020-12-18 | 深圳码隆科技有限公司 | Shopping flow control method and device and user terminal |
CN110138681A (en) * | 2019-04-19 | 2019-08-16 | 上海交通大学 | A kind of network flow identification method and device based on TCP message feature |
Also Published As
Publication number | Publication date |
---|---|
CN106778576B (en) | 2020-05-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||