CN110110686A - Human action recognition method based on a multi-loss two-stream convolutional neural network - Google Patents

Human action recognition method based on a multi-loss two-stream convolutional neural network

Info

Publication number
CN110110686A
CN110110686A (application CN201910400344.4A)
Authority
CN
China
Prior art keywords
network
neural network
convolutional
two-stream
action recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910400344.4A
Other languages
Chinese (zh)
Inventor
吴春雷
曹海文
王雷全
魏燚伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Petroleum East China
Original Assignee
China University of Petroleum East China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Petroleum East China filed Critical China University of Petroleum East China
Priority to CN201910400344.4A priority Critical patent/CN110110686A/en
Publication of CN110110686A publication Critical patent/CN110110686A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a human action recognition method based on a multi-loss two-stream convolutional neural network. It belongs to the technical field of action recognition and addresses the problems that traditional two-stream networks lose motion detail and cannot extract spatio-temporal features. The invention improves on the temporal segment network with a multi-loss spatial network and a multi-loss temporal network; architecturally, the multi-loss two-stream convolutional neural network consists of three branches: action recognition, action recovery, and difference penalty. The action recovery branch adds a recovery loss that retains motion detail and balances the extracted motion features. The difference penalty branch classifies actions from differences between appearance features, so that effective spatio-temporal features are obtained. The multi-loss two-stream convolutional neural network is trained end to end, and the action recognition loss, recovery loss, and difference loss jointly assist the recognition module in extracting rich video representations, which improves the accuracy of action recognition.

Description

Human action recognition method based on a multi-loss two-stream convolutional neural network
Technical field
The present invention relates to the fields of computer vision and pattern recognition, and in particular to a human action recognition method based on a multi-loss two-stream convolutional neural network. It belongs to the field of action recognition.
Background art
Action recognition identifies the actions performed by people in a video. With the internet and digital devices becoming ever more ubiquitous, the processing and analysis of video, and video action recognition in particular, are widely studied topics in computer vision, with applications in many fields such as intelligent video surveillance, human-computer interaction, and human behavior analysis. Because convolutional neural networks have been enormously successful in image classification, and video-based action recognition can be treated as a classification task, action recognition methods are no longer limited to traditional hand-crafted features but are instead built on convolutional neural networks. The field still faces many challenges: camera motion, background clutter, and changes in lighting all affect recognition accuracy.
In recent years, progress in video action recognition has focused mainly on fusing the static and dynamic information of a video. Given the effectiveness of convolutional neural networks in computer vision tasks, they were naturally adopted as the spatial feature extraction network for action recognition. However, capturing only the static appearance of a video is insufficient for complex action recognition tasks. Optical flow, as a complementary input modality, captures the dynamic information of a video in a temporal network and has proven very effective for action recognition. The two-stream convolutional network proposed by Karen Simonyan et al. combines a spatial network and a temporal network and has become one of the mainstream approaches to action recognition, but it is limited to single-frame and single-optical-flow inputs and does not consider the sequential structure of video. As an improvement, Limin Wang et al. proposed the temporal segment network, which models long-range temporal structure using a sparse temporal sampling strategy and video-level supervision, allowing the entire action video to be learned efficiently. However, the temporal segment network still has difficulty distinguishing similar actions, easily loses motion detail, and does not take the spatio-temporal features of the video into account. The present invention therefore builds on the temporal segment network and introduces a recovery loss and a difference loss that help the action recognition module extract spatio-temporal features and retain motion detail.
Summary of the invention
The purpose of the present invention is to solve the low recognition accuracy of traditional two-stream convolutional neural networks, which is caused by their inability to extract spatio-temporal features and their loss of motion detail.
The technical solution adopted by the present invention to solve the above technical problem is as follows:
S1. Divide each video V in the dataset evenly into K segments {S_1, S_2, ..., S_K} (K is an empirical value, K = 3), and randomly sample one frame and one optical flow image from each segment as the input of the multi-loss two-stream convolutional neural network.
S2. Build the multi-loss two-stream convolutional neural network architecture.
S3. Feed the frames and optical flow images sampled in step S1 into the multi-loss two-stream convolutional neural network and train it so that the loss function is minimized.
S4. Feed test frames and optical flow images into the trained multi-loss two-stream convolutional neural network, fuse the two streams, and complete video-based human action recognition.
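For illustration, a minimal Python sketch of the sparse sampling in step S1 follows; the function name sample_segments and the list-of-images video representation are assumptions made for the example, not part of the patent.

    import random

    def sample_segments(frames, flows, K=3):
        """Step S1: divide a video into K equal segments and randomly pick
        one RGB frame and one optical-flow image from each segment.
        Assumes the video has at least K frames."""
        seg_len = len(frames) // K
        rgb_samples, flow_samples = [], []
        for k in range(K):
            idx = random.randrange(k * seg_len, (k + 1) * seg_len)
            rgb_samples.append(frames[idx])
            flow_samples.append(flows[idx])
        return rgb_samples, flow_samples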
Specifically, building the multi-loss two-stream convolutional neural network includes the following:
The multi-loss two-stream convolutional neural network is an improvement on the temporal segment network. The spatial network and the temporal network share the same architecture (only the input modality differs: frames for one, optical flow for the other), and in the multi-loss two-stream convolutional neural network each of the two networks is divided into three branches: action recognition, action recovery, and difference penalty.
(1) Action recognition
The action recognition branch uses a network based on BN-Inception. To model long-range temporal structure, the invention samples sparsely over the entire video and aggregates the segment features for action recognition.
(2) Action recovery
The input data is recovered from the output of the last convolutional layer of the action recognition branch. The invention uses four deconvolution layers and four skip connections for the recovery and computes the recovery error with a Euclidean distance loss, ensuring that part of the motion detail is retained in the action recognition network.
(3) Difference penalty
The difference penalty branch shares the feature encoding network with the action recognition and action recovery branches; the difference penalty is applied after the last convolutional layer of the action recognition branch. The invention performs action recognition on the feature differences between adjacent segments, helping the recognition network extract rich spatio-temporal features.
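For concreteness, a schematic PyTorch sketch of one stream with its three branches follows. A two-layer toy encoder stands in for BN-Inception, the four-deconvolution recovery decoder is shortened to two layers, and the skip connections are omitted; all layer sizes and names are illustrative assumptions rather than the patent's exact configuration.

    import torch
    import torch.nn as nn

    class MultiLossStream(nn.Module):
        """One stream (spatial or temporal): a shared feature encoder feeding
        the action recognition, action recovery, and difference penalty branches."""

        def __init__(self, in_channels=3, num_classes=101):
            super().__init__()
            # Shared encoder: a toy stand-in for BN-Inception.
            self.encoder = nn.Sequential(
                nn.Conv2d(in_channels, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            )
            # Action recognition head on the last convolutional features.
            self.classifier = nn.Sequential(
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, num_classes),
            )
            # Action recovery decoder: deconvolutions back to the input size
            # (the patent uses four deconvolution layers plus skip connections).
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, in_channels, 4, stride=2, padding=1),
            )

        def forward(self, x):
            feat = self.encoder(x)           # last-conv features f_k
            scores = self.classifier(feat)   # recognition branch
            recon = self.decoder(feat)       # recovery branch
            return scores, recon, feat       # feat also feeds the difference penalty

The temporal stream would be built the same way, with in_channels set to the number of stacked optical flow channels.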
Specifically, the multi-loss two-stream convolutional neural network is trained by using a model pre-trained on the ImageNet dataset, first training the action recognition module and then, once the recognition network is trained, training the whole network, optimized with stochastic gradient descent.
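A minimal sketch of one such training stage under SGD follows; the function signature, epoch count, learning rate, and momentum are illustrative assumptions.

    import torch

    def train_stage(model, loader, loss_fn, epochs=10, lr=1e-3):
        """One training stage optimized with stochastic gradient descent:
        run first for the recognition module, then for the whole network."""
        opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
        for _ in range(epochs):
            for inputs, labels in loader:
                loss = loss_fn(model, inputs, labels)  # recognition loss alone,
                opt.zero_grad()                        # or the full multi-loss
                loss.backward()
                opt.step()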
The loss functions of the multi-loss two-stream convolutional neural network are computed as follows:
(1) Action recognition
The entire video V is divided evenly into K segments {S_1, S_2, ..., S_K}, and one frame {I_1, I_2, ..., I_K} is sampled at random from each segment as the network input; the predicted score of the video for each action class is:
R(I_1, I_2, ..., I_K) = P(h(C(I_1; W), C(I_2; W), ..., C(I_K; W)))   (1)
where W are the network parameters, C(I_k; W) computes the class scores of each input through the network, k ∈ {1, 2, ..., K}, h fuses the outputs of the K segments into the final class scores, and P is the softmax operation.
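A minimal sketch of Eq. (1) follows, assuming the fusion function h is the mean over segments (a common choice for temporal segment networks; the patent only describes h as fusing the K segment outputs).

    import torch
    import torch.nn.functional as F

    def video_prediction(segment_scores):
        """Eq. (1): fuse per-segment class scores C(I_k; W) with h (here the
        mean over the K segments) and apply the softmax operation P."""
        # segment_scores: tensor of shape (K, num_classes)
        fused = segment_scores.mean(dim=0)   # h(C(I_1; W), ..., C(I_K; W))
        return F.softmax(fused, dim=-1)      # P(h(...))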
Therefore, the loss function of the action recognition module is:
L_r(y, H_r) = -∑_{i=1}^{n} y_i ( H_i - log ∑_{j=1}^{n} exp H_j )   (2)
where n is the total number of action classes, y_i is the ground-truth label, and H_r = h(C(I_1; W), C(I_2; W), ..., C(I_K; W)); that is, H_i = h(C_i(I_1), C_i(I_2), ..., C_i(I_K)) is the fused score of the K segments for action class i.
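Read with a one-hot label y, Eq. (2) is the standard cross-entropy over the fused scores; a sketch follows, assuming fused_scores holds the pre-softmax scores H_r for a batch and labels holds class indices.

    import torch
    import torch.nn.functional as F

    def recognition_loss(fused_scores, labels):
        """Eq. (2): cross-entropy between the fused segment scores H_r
        and the ground-truth action labels y."""
        # fused_scores: (batch, n) pre-softmax scores; labels: (batch,) class ids
        return F.cross_entropy(fused_scores, labels)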
(2) Action recovery
This module is optimized with a Euclidean distance loss:
L_g = ∑_{k=1}^{K} || Î_k - I_k ||²   (3)
where Î_k is the feature map output by the recovery network for the k-th segment and I_k is the original feature map.
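A sketch of the recovery loss in Eq. (3) follows; summing (rather than averaging) the squared distances over segments is our reading of the formula.

    import torch

    def recovery_loss(recovered, original):
        """Eq. (3): squared Euclidean distance between the recovered
        feature maps I_hat_k and the originals I_k, summed over segments."""
        # recovered, original: tensors of shape (K, C, H, W)
        return ((recovered - original) ** 2).sum()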
(3) Difference penalty
First, compute the difference between the features (f_k, f_{k+1}) that two adjacent frames (or optical flow images) produce at the last convolutional layer of the action recognition network:
d_k = f_{k+1} - f_k   (4)
Then perform action recognition on the difference features d_k; the loss function of the difference penalty is:
L_d(y, H_d) = -∑_{i=1}^{n} y_i ( H_{d,i} - log ∑_{j=1}^{n} exp H_{d,j} )   (5)
where H_d is the fused class score computed from the difference features.
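A sketch of Eqs. (4) and (5) for a single video follows, assuming the K-1 difference-feature scores are fused by averaging before the cross-entropy; the patent does not spell out the fusion for this branch.

    import torch
    import torch.nn.functional as F

    def difference_loss(features, classify, label):
        """Eqs. (4)-(5): difference features of adjacent segments are
        classified and scored with the same cross-entropy loss."""
        # features: (K, C, H, W) last-conv features of one video's K segments
        # label: zero-dim long tensor holding the ground-truth class id
        diffs = features[1:] - features[:-1]    # Eq. (4): d_k = f_{k+1} - f_k
        scores = classify(diffs).mean(dim=0)    # fuse the K-1 difference scores
        return F.cross_entropy(scores.unsqueeze(0), label.unsqueeze(0))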
In conclusion the total losses functions for losing double-current convolutional neural networks more are as follows:
L=Lr(y,Hr)+Lg+Ld(y,Hd) (6)
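Combining the three terms with equal weight, as Eq. (6) is written (any re-weighting would be a further design choice), gives:

    import torch
    import torch.nn.functional as F

    def total_loss(fused_scores, labels, recovered, original, diff_scores):
        """Eq. (6): L = L_r(y, H_r) + L_g + L_d(y, H_d), with equal weights."""
        l_r = F.cross_entropy(fused_scores, labels)   # Eq. (2), recognition
        l_g = ((recovered - original) ** 2).sum()     # Eq. (3), recovery
        l_d = F.cross_entropy(diff_scores, labels)    # Eq. (5), difference penalty
        return l_r + l_g + l_d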
Compared with the prior art, the above technical solution of the present invention has the following beneficial effects:
(1) The invention samples sparsely over the entire video, obtaining a long-range temporal description of the video.
(2) The invention adds a recovery loss, which reduces the loss of motion detail to a certain extent and balances the extracted video representation.
(3) The invention proposes a difference penalty term: adjacent segment features are subtracted to obtain difference features, which are then used for action recognition, helping the recognition network extract spatio-temporal features.
(4) The invention optimizes the action recognition network with multiple losses: the action recognition loss, the recovery loss, and the difference loss jointly assist the recognition module in extracting better spatio-temporal video representations, substantially improving the precision of action recognition.
Brief description of the drawings
Fig. 1 is a structural diagram of the multi-loss two-stream convolutional neural network used in an embodiment of the present invention.
Fig. 2 is a structural comparison of the temporal segment network and the multi-loss two-stream convolutional neural network provided in an embodiment of the present invention.
Detailed description of the embodiments
The accompanying drawings are for illustration only and shall not be construed as limiting this patent.
The present invention is further elaborated below with reference to the drawings and embodiments.
Fig. 1 is a structural diagram of the multi-loss two-stream convolutional neural network used in an embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps:
S1. Divide each video V in the dataset evenly into K segments {S_1, S_2, ..., S_K} (K is an empirical value, K = 3), and randomly sample one frame and one optical flow image from each segment as the input of the multi-loss two-stream convolutional neural network.
S2. Build the multi-loss two-stream convolutional neural network architecture.
S3. Feed the frames and optical flow images sampled in step S1 into the multi-loss two-stream convolutional neural network and train it so that the loss function is minimized.
S4. Feed test frames and optical flow images into the trained multi-loss two-stream convolutional neural network, fuse the two streams, and complete video-based human action recognition.
Specifically, building the multi-loss two-stream convolutional neural network includes the following:
The multi-loss two-stream convolutional neural network is an improvement on the temporal segment network. The spatial network and the temporal network share the same architecture (only the input modality differs: frames for one, optical flow for the other), and in the multi-loss two-stream convolutional neural network each of the two networks is divided into three branches: action recognition, action recovery, and difference penalty.
(1) Action recognition
The action recognition branch uses a network based on BN-Inception. To model long-range temporal structure, the invention samples sparsely over the entire video and aggregates the segment features for action recognition.
(2) Action recovery
The input data is recovered from the output of the last convolutional layer of the action recognition branch. The invention uses four deconvolution layers and four skip connections for the recovery and computes the recovery error with a Euclidean distance loss, ensuring that part of the motion detail is retained in the action recognition network.
(3) Difference penalty
The difference penalty branch shares the feature encoding network with the action recognition and action recovery branches; the difference penalty is applied after the last convolutional layer of the action recognition branch. The invention performs action recognition on the feature differences between adjacent segments, helping the recognition network extract rich spatio-temporal features.
The loss functions of the multi-loss two-stream convolutional neural network are computed as follows:
(1) Action recognition
The entire video V is divided evenly into K segments {S_1, S_2, ..., S_K}, and one frame {I_1, I_2, ..., I_K} is sampled at random from each segment as the network input; the predicted score of the video for each action class is:
R(I_1, I_2, ..., I_K) = P(h(C(I_1; W), C(I_2; W), ..., C(I_K; W)))   (1)
where W are the network parameters, C(I_k; W) computes the class scores of each input through the network, k ∈ {1, 2, ..., K}, h fuses the outputs of the K segments into the final class scores, and P is the softmax operation.
Therefore, the loss function of the action recognition module is:
L_r(y, H_r) = -∑_{i=1}^{n} y_i ( H_i - log ∑_{j=1}^{n} exp H_j )   (2)
where n is the total number of action classes, y_i is the ground-truth label, and H_r = h(C(I_1; W), C(I_2; W), ..., C(I_K; W)); that is, H_i = h(C_i(I_1), C_i(I_2), ..., C_i(I_K)) is the fused score of the K segments for action class i.
(2) Action recovery
This module is optimized with a Euclidean distance loss:
L_g = ∑_{k=1}^{K} || Î_k - I_k ||²   (3)
where Î_k is the feature map output by the recovery network for the k-th segment and I_k is the original feature map.
(3) Difference penalty
First, compute the difference between the features (f_k, f_{k+1}) that two adjacent frames (or optical flow images) produce at the last convolutional layer of the action recognition network:
d_k = f_{k+1} - f_k   (4)
Then perform action recognition on the difference features d_k; the loss function of the difference penalty is:
L_d(y, H_d) = -∑_{i=1}^{n} y_i ( H_{d,i} - log ∑_{j=1}^{n} exp H_{d,j} )   (5)
where H_d is the fused class score computed from the difference features.
In summary, the total loss function of the multi-loss two-stream convolutional neural network is:
L = L_r(y, H_r) + L_g + L_d(y, H_d)   (6)
Fig. 2 is a structural comparison of the temporal segment network and the multi-loss two-stream convolutional neural network provided in an embodiment of the present invention. As shown in Fig. 2, Fig. 2(a) is the temporal segment network and Fig. 2(b) is the multi-loss two-stream convolutional neural network. The present invention adds a recovery loss and a difference penalty term on top of the temporal segment network and uses the multiple losses to optimize the extracted video features, giving higher robustness and accuracy.
In this work, the present invention provides a new method for video-based human action recognition. Compared with existing methods, the invention uses a recovery loss and a difference penalty term on top of the traditional temporal segment network to assist the optimization of the action recognition module, extracting video representations that carry spatio-temporal information and reducing the loss of motion detail, so that action recognition accuracy is greatly improved.
Finally, the details of the above embodiment merely illustrate an example of the invention. For those skilled in the art, any modification, improvement, or substitution of the above embodiment shall fall within the protection scope of the claims of the present invention.

Claims (4)

1. A human action recognition method based on a multi-loss two-stream convolutional neural network, characterized in that the method comprises the following steps:
S1. Divide each video V in the dataset evenly into K segments {S_1, S_2, ..., S_K} (K is an empirical value, K = 3), and randomly sample one frame and one optical flow image from each segment as the input of the multi-loss two-stream convolutional neural network.
S2. Build the multi-loss two-stream convolutional neural network architecture.
S3. Feed the frames and optical flow images sampled in step S1 into the multi-loss two-stream convolutional neural network and train it so that the loss function is minimized.
S4. Feed test frames and optical flow images into the trained multi-loss two-stream convolutional neural network, fuse the two streams, and complete video-based human action recognition.
2. The human action recognition method based on a multi-loss two-stream convolutional neural network according to claim 1, characterized in that the detailed process of S2 is as follows:
The multi-loss two-stream convolutional neural network is an improvement on the temporal segment network. The spatial network and the temporal network share the same architecture (only the input modality differs: frames for one, optical flow for the other), and in the multi-loss two-stream convolutional neural network each of the two networks is divided into three branches: action recognition, action recovery, and difference penalty.
(1) Action recognition
The action recognition branch uses a network based on BN-Inception. To model long-range temporal structure, the invention samples sparsely over the entire video and aggregates the segment features for action recognition.
(2) Action recovery
The input data is recovered from the output of the last convolutional layer of the action recognition branch. The invention uses four deconvolution layers and four skip connections for the recovery and computes the recovery error with a Euclidean distance loss, ensuring that part of the motion detail is retained in the action recognition network.
(3) Difference penalty
The difference penalty branch shares the feature encoding network with the action recognition and action recovery branches; the difference penalty is applied after the last convolutional layer of the action recognition branch. The invention performs action recognition on the feature differences between adjacent segments, helping the recognition network extract rich spatio-temporal features.
3. The human action recognition method based on a multi-loss two-stream convolutional neural network according to claim 1, characterized in that the detailed process of S3 is as follows:
Specifically, the multi-loss two-stream convolutional neural network is trained by using a model pre-trained on the ImageNet dataset, first training the action recognition module and then, once the recognition network is trained, training the whole network, optimized with stochastic gradient descent.
The loss functions of the multi-loss two-stream convolutional neural network are computed as follows:
(1) Action recognition
The entire video V is divided evenly into K segments {S_1, S_2, ..., S_K}, and one frame {I_1, I_2, ..., I_K} is sampled at random from each segment as the network input; the predicted score of the video for each action class is:
R(I_1, I_2, ..., I_K) = P(h(C(I_1; W), C(I_2; W), ..., C(I_K; W)))   (1)
where W are the network parameters, C(I_k; W) computes the class scores of each input through the network, k ∈ {1, 2, ..., K}, h fuses the outputs of the K segments into the final class scores, and P is the softmax operation.
Therefore, the loss function of the action recognition module is:
L_r(y, H_r) = -∑_{i=1}^{n} y_i ( H_i - log ∑_{j=1}^{n} exp H_j )   (2)
where n is the total number of action classes, y_i is the ground-truth label, and H_r = h(C(I_1; W), C(I_2; W), ..., C(I_K; W)); that is, H_i = h(C_i(I_1), C_i(I_2), ..., C_i(I_K)) is the fused score of the K segments for action class i.
(2) Action recovery
This module is optimized with a Euclidean distance loss:
L_g = ∑_{k=1}^{K} || Î_k - I_k ||²   (3)
where Î_k is the feature map output by the recovery network for the k-th segment and I_k is the original feature map.
(3) Difference penalty
First, compute the difference between the features (f_k, f_{k+1}) that two adjacent frames (or optical flow images) produce at the last convolutional layer of the action recognition network:
d_k = f_{k+1} - f_k   (4)
Then perform action recognition on the difference features d_k; the loss function of the difference penalty is:
L_d(y, H_d) = -∑_{i=1}^{n} y_i ( H_{d,i} - log ∑_{j=1}^{n} exp H_{d,j} )   (5)
where H_d is the fused class score computed from the difference features.
In summary, the total loss function of the multi-loss two-stream convolutional neural network is:
L = L_r(y, H_r) + L_g + L_d(y, H_d)   (6).
4. The human action recognition method based on a multi-loss two-stream convolutional neural network according to claim 1, characterized in that, when the trained multi-loss two-stream convolutional neural network is tested in S4, each video uses one frame or one optical flow image as the input of the corresponding stream of the multi-loss two-stream network to predict the action recognition scores, and the scores output by the spatial network and the temporal network are finally fused as the final test score of the multi-loss two-stream convolutional neural network.
CN201910400344.4A 2019-05-14 2019-05-14 Human action recognition method based on a multi-loss two-stream convolutional neural network Pending CN110110686A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910400344.4A CN110110686A (en) 2019-05-14 2019-05-14 Human action recognition method based on a multi-loss two-stream convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910400344.4A CN110110686A (en) 2019-05-14 2019-05-14 Human action recognition method based on a multi-loss two-stream convolutional neural network

Publications (1)

Publication Number Publication Date
CN110110686A true CN110110686A (en) 2019-08-09

Family

ID=67490072

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910400344.4A Pending CN110110686A (en) Human action recognition method based on a multi-loss two-stream convolutional neural network

Country Status (1)

Country Link
CN (1) CN110110686A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006133866A (en) * 2004-11-02 2006-05-25 Advanced Telecommunication Research Institute International Robot device and position memory unit
CN108427713A (en) * 2018-02-01 2018-08-21 宁波诺丁汉大学 Video summarization method and system for user-generated videos
CN108764128A (en) * 2018-05-25 2018-11-06 华中科技大学 Video action recognition method based on a sparse temporal segment network
CN108805078A (en) * 2018-06-11 2018-11-13 山东大学 Video pedestrian re-identification method and system based on average pedestrian state
CN109492581A (en) * 2018-11-09 2019-03-19 中国石油大学(华东) Human action recognition method based on the TP-STG framework

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张晶晶 (Zhang Jingjing): "Salient foreground segmentation for gait recognition", China Excellent Master's Theses Full-text Database *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598598A (en) * 2019-08-30 2019-12-20 西安理工大学 Two-stream convolutional neural network human behavior recognition method based on a limited sample set
CN111027377A (en) * 2019-10-30 2020-04-17 杭州电子科技大学 Two-stream neural network temporal action localization method
CN110969191A (en) * 2019-11-07 2020-04-07 吉林大学 Glaucoma prevalence probability prediction method based on similarity-preserving metric learning
CN110852273A (en) * 2019-11-12 2020-02-28 重庆大学 Behavior recognition method based on a reinforcement learning attention mechanism
CN111539290A (en) * 2020-04-16 2020-08-14 咪咕文化科技有限公司 Video motion recognition method and device, electronic equipment and storage medium
CN111539290B (en) * 2020-04-16 2023-10-20 咪咕文化科技有限公司 Video motion recognition method and device, electronic equipment and storage medium
CN112258381A (en) * 2020-09-29 2021-01-22 北京达佳互联信息技术有限公司 Model training method, image processing method, device, equipment and storage medium
CN112258381B (en) * 2020-09-29 2024-02-09 北京达佳互联信息技术有限公司 Model training method, image processing method, device, equipment and storage medium
CN113139467A (en) * 2021-04-23 2021-07-20 西安交通大学 Fine-grained video action recognition method based on a hierarchical structure
CN113139467B (en) * 2021-04-23 2023-04-25 西安交通大学 Fine-grained video action recognition method based on a hierarchical structure

Similar Documents

Publication Publication Date Title
CN110110686A (en) Human action recognition method based on a multi-loss two-stream convolutional neural network
CN107330362B (en) Video classification method based on space-time attention
Zhang et al. Real-time action recognition with enhanced motion vector CNNs
Simonyan et al. Two-stream convolutional networks for action recognition in videos
CN106096568B (en) Pedestrian re-identification method based on CNN and convolutional LSTM networks
CN108830252A (en) Convolutional neural network human action recognition method fusing global spatio-temporal features
Biswas et al. Structural recurrent neural network (SRNN) for group activity analysis
CN109101896A (en) Video behavior recognition method based on spatio-temporal fusion features and an attention mechanism
CN110188637A (en) Behavior recognition method based on deep learning
CN110147743A (en) Real-time online pedestrian analysis and counting system and method in complex scenes
CN111626171B (en) Group behavior recognition method based on a video-segment attention mechanism and interaction-relation activity graph modeling
CN110119703A (en) Human action recognition method fusing an attention mechanism and spatio-temporal graph convolutional neural networks in security surveillance scenes
CN109815785A (en) Facial emotion recognition method based on two-stream convolutional neural networks
CN109829443A (en) Video behavior recognition method based on image enhancement and 3D convolutional neural networks
CN109891897A (en) Method for analyzing media content
CN105138953B (en) Method for action recognition in video based on continuous multiple-instance learning
CN105574510A (en) Gait recognition method and device
CN109214285A (en) Fall detection method based on deep convolutional neural networks and long short-term memory networks
CN107025420A (en) Method and apparatus for human behavior recognition in video
CN106650694A (en) Face recognition method using a convolutional neural network as feature extractor
CN106529477A (en) Video human behavior recognition method based on salient trajectories and spatio-temporal evolution information
CN110348364A (en) Basketball video group behavior recognition method combining unsupervised clustering with spatio-temporal deep networks
CN113239801B (en) Cross-domain action recognition method based on multi-scale feature learning and multi-level domain alignment
CN107392131A (en) Action recognition method based on skeleton node distances
Xu et al. Scene image and human skeleton-based dual-stream human action recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20231215

AD01 Patent right deemed abandoned