CN106650674B - A method for action recognition using deep convolutional features with a mixed pooling strategy - Google Patents
A method for action recognition using deep convolutional features with a mixed pooling strategy Download PDF Info
- Publication number
- CN106650674B CN106650674B CN201611229368.0A CN201611229368A CN106650674B CN 106650674 B CN106650674 B CN 106650674B CN 201611229368 A CN201611229368 A CN 201611229368A CN 106650674 B CN106650674 B CN 106650674B
- Authority
- CN
- China
- Prior art keywords
- time
- video
- depth
- space
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 43
- 230000009471 action Effects 0.000 title claims abstract description 21
- 238000010586 diagram Methods 0.000 claims abstract description 18
- 230000009467 reduction Effects 0.000 claims abstract description 14
- 238000004458 analytical method Methods 0.000 claims abstract description 10
- 238000012706 support-vector machine Methods 0.000 claims description 9
- 238000013527 convolutional neural network Methods 0.000 description 6
- 230000008569 process Effects 0.000 description 6
- 230000000694 effects Effects 0.000 description 5
- 238000011160 research Methods 0.000 description 4
- 230000008901 benefit Effects 0.000 description 3
- 230000003542 behavioural effect Effects 0.000 description 2
- 238000013480 data collection Methods 0.000 description 2
- 230000007812 deficiency Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 239000012530 fluid Substances 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 238000000513 principal component analysis Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The present invention discloses an action recognition method using deep convolutional features with a mixed pooling strategy, comprising: 1) applying a spatial-stream deep network model to each frame of the input video to obtain per-frame appearance features, and applying a temporal-stream deep network model to every 10 consecutive frames of the video to extract motion features; 2) applying temporal filter pooling to the deep feature maps output by the last convolutional layer of the spatial-stream and temporal-stream deep networks to obtain a corresponding feature representation, then reducing its dimensionality with principal component analysis (PCA) to obtain a first descriptor feature; in parallel, applying spatio-temporal pyramid pooling to the same feature maps to obtain another feature representation, and reducing its dimensionality with PCA to obtain a second descriptor feature; 3) concatenating the first and second descriptor features from step 2) to form the feature descriptor of the input video, and classifying it with a linear support vector machine to obtain the recognition accuracy.
Description
Technical field
The present invention relates to the field of computer vision, and in particular to an action recognition method using deep convolutional features with a mixed pooling strategy.
Background art
With the development of science and technology, camera equipment has become ubiquitous, and enormous volumes of video data are produced as a result. Applications targeting this video have emerged accordingly: intelligent video surveillance, video classification, advanced human-computer interaction, and so on. In these applications, understanding human actions is the central problem and the focus of research.
Because human action recognition has great potential value, it has remained a research hotspot for at least a decade, and many methods have been proposed, for example methods based on dense trajectories (DT), methods based on spatio-temporal interest points, and methods based on convolutional neural networks (CNN). Among these, CNN-based techniques are the most numerous and currently achieve the best results. However, most deep CNNs treat each feature map as a whole and ignore the local information within it. Our action recognition research therefore targets a method based on multi-channel pyramid pooling of deep convolutional features, so as to extract the local information in deep features.
The main idea of CNN-based methods is as follows: first, multiple convolutional, pooling, and fully connected layers are applied to the video to extract its descriptor features; these features are then fed to a classifier to complete the final recognition process. Many researchers have explored and improved on this basis. Annane et al. proposed a two-stream convolutional network for action recognition, consisting of a spatial stream and a temporal stream: the spatial stream extracts appearance features of video frames, the temporal stream extracts motion features of consecutive frames, and the two are fused to boost recognition performance. Wang et al. fused deep convolutional features with hand-crafted features, learning the complementary advantages of the two feature types. The above methods all achieve good results, but existing research based on deep networks usually treats each deep feature map as a whole and ignores the local information within it, even though this cue helps improve the recognition accuracy of deep networks.
Summary of the invention
To overcome the above shortcomings of the prior art, the present invention provides an action recognition method using deep convolutional features with a mixed pooling strategy. The method takes videos from a video dataset as input, performs feature extraction and recognition, and finally outputs the classification result of each video. It is simple to implement and achieves good recognition performance.
To achieve the above object, the technical solution adopted by the present invention is as follows.
An action recognition method using deep convolutional features with a mixed pooling strategy comprises the following steps:
(1) Input the video to be recognized. Apply a spatial-stream deep network model to each frame of the input video to obtain the appearance features of every frame; at the same time, apply a temporal-stream deep network model to every 10 consecutive frames of the input video to obtain motion features. Both the spatial-stream and temporal-stream deep network models contain 5 convolutional layers, 3 pooling layers, and 3 fully connected layers.
(2) Apply temporal filter pooling to the deep feature maps output by the last convolutional layer of the spatial-stream and temporal-stream deep network models to obtain a corresponding feature representation; temporal sequences with intervals of different lengths capture both the global and the local motion of the video. Reduce the feature dimensionality with principal component analysis (PCA) to obtain the first descriptor feature.
At the same time, apply spatio-temporal pyramid pooling to the same deep feature maps to obtain another feature representation; a 4-level spatio-temporal pyramid structure captures the local information in the deep feature maps and is robust to target and geometric deformation. Likewise reduce the dimensionality with PCA to obtain the second descriptor feature.
(3) Concatenate the first and second descriptor features extracted in step (2) to form the final vector representation of the video. Classify the features with a support vector machine (SVM) and output the classification result, giving the action recognition result of the video; an accuracy of 90.8% is achieved on the UCF50 human action dataset.
The present invention is based on deep convolutional neural networks. By exploiting the local information and motion information in deep feature maps, it proposes a new deep convolutional feature based on a mixed pooling strategy, which effectively captures the local and motion information of feature maps at different scales and significantly improves action recognition accuracy.
Preferably, in step (1), the spatial-stream and temporal-stream deep network models take each video frame as input and apply multiple layers of convolution and pooling to the raw image; the output of every layer is a set of deep feature maps, forming increasingly abstract image features.
Preferably, in step (2), the feature maps output by the last convolutional layer of the spatial-stream and temporal-stream networks are subjected to temporal filter pooling. Specifically, filters with 4 different temporal intervals (1, 4, 8, 16) analyze the motion of the deep features in the time domain: interval 1 corresponds to temporal motion over the entire video range, i.e. global motion, while interval 16 corresponds to local temporal motion at the largest scale. For each interval, the deep features over the whole temporal extent of the video are divided into multiple time slices. For the features within each time slice, max pooling and sum pooling are applied simultaneously to obtain the most representative features of that slice, and the two pooling results are concatenated to represent the motion within the slice. PCA dimensionality reduction is then applied to the video features obtained after temporal filter pooling.
Preferably, in step (2), the multi-channel feature maps output by the last convolutional layer of the spatial-stream and temporal-stream networks are subjected to spatio-temporal pyramid pooling. Specifically, a 4-level spatio-temporal pyramid structure (1 × 1 × 1, 2 × 2 × 2, 3 × 3 × 3, 4 × 4 × 4) is applied to the feature maps: the first level (1 × 1 × 1) corresponds to the feature map over the entire temporal and spatial range, while the fourth level (4 × 4 × 4) corresponds to local spatio-temporal feature blocks at the largest scale. The pyramid structure thus yields the local blocks of the feature map at different temporal and spatial scales. Max pooling is applied to each local spatio-temporal block, taking the maximum value within the block as the feature representation of that block. Since the feature map of each channel extracts different image/video information, the features of the local blocks at the same spatio-temporal position across all channels are concatenated to form the multi-channel feature descriptor of that block. Finally, all spatio-temporal block features in the video are concatenated to form the feature representation of the video, and PCA dimensionality reduction is applied to the video features obtained after spatio-temporal pyramid pooling.
Preferably, in step (3), the two kinds of deep video features obtained after temporal filter pooling and spatio-temporal pyramid pooling are concatenated to obtain the final feature representation of the video. The features are classified with a support vector machine to obtain the action class label of the video.
Compared with the prior art, the present invention has the following advantages and effects:
1. The invention proposes a new descriptor feature that fully captures the motion information and local information at different scales, improving recognition performance.
2. The invention applies pooling jointly to the same region of the feature maps across different channels, capturing different aspects of that region, such as edges or texture.
Brief description of the drawings
Fig. 1 is the overall flow chart of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and do not limit it. In addition, the technical features involved in the various embodiments described below can be combined with each other as long as they do not conflict.
The accompanying drawing shows the operating process of the invention. As shown, an action recognition method using deep convolutional features with a mixed pooling strategy comprises the following steps:
(1) Input the video to be recognized. Apply a spatial-stream deep network model to each frame of the input video to obtain the appearance features of every frame; at the same time, apply a temporal-stream deep network model to every 10 consecutive frames of the input video to obtain motion features. Both the spatial-stream and temporal-stream deep network models contain 5 convolutional layers, 3 pooling layers, and 3 fully connected layers.
(2) Apply temporal filter pooling to the deep feature maps output by the last convolutional layer of the spatial-stream and temporal-stream network models to obtain a corresponding feature representation; temporal sequences with intervals of different lengths capture the global and local motion of the video, and the feature dimensionality is reduced with principal component analysis.
(3) Apply spatio-temporal pyramid pooling to the deep feature maps output by the last convolutional layer of the spatial-stream and temporal-stream network models to obtain a corresponding feature representation; a 4-level spatio-temporal pyramid structure captures the local information in the deep feature maps and is robust to target and geometric deformation. Likewise, the dimensionality is reduced with principal component analysis.
(4) Concatenate the descriptor features extracted in steps (2) and (3) to form the final vector representation of the video. Classify the features with a support vector machine (SVM), output the classification result, and predict the action class label of the video; an accuracy of 90.8% is achieved on the UCF50 human action dataset.
Further, the detailed process of step (1) is as follows: the spatial-stream and temporal-stream deep network models take each video frame as input and apply multiple layers of convolution and pooling to the raw image; the output of every layer is a set of deep feature maps, forming increasingly abstract image features.
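As an illustration of the step (1) inputs, the sketch below assembles the two stream inputs for a toy video. The 20-channel layout for the temporal stream (horizontal and vertical optical flow of 10 consecutive frames) follows the common two-stream convention and is an assumption, as are all array shapes and the random placeholder data:

```python
import numpy as np

# Hypothetical toy video: T frames of H x W RGB (random stand-in values).
T, H, W = 25, 224, 224
rng = np.random.default_rng(0)
video = rng.random((T, H, W, 3)).astype(np.float32)

# Spatial-stream input: one RGB frame, channels-first (3 x H x W).
spatial_input = video[0].transpose(2, 0, 1)

# Temporal-stream input: horizontal/vertical optical flow of 10 consecutive
# frames stacked along the channel axis (2 x 10 = 20 channels). Real flow
# would come from an optical-flow estimator; random arrays stand in here.
flow = rng.random((10, 2, H, W)).astype(np.float32)
temporal_input = flow.reshape(20, H, W)

print(spatial_input.shape)   # (3, 224, 224)
print(temporal_input.shape)  # (20, 224, 224)
```

Each network then applies its convolution and pooling layers to these tensors to produce the deep feature maps used in steps (2) and (3).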
The detailed process of step (2) is as follows: the feature maps output by the last convolutional layer of the spatial-stream and temporal-stream networks are subjected to temporal filter pooling. Filters with 4 different temporal intervals (1, 4, 8, 16) analyze the motion of the deep features in the time domain: interval 1 corresponds to temporal motion over the entire video range, i.e. global motion, while interval 16 corresponds to local temporal motion at the largest scale. For each interval, the deep features over the whole temporal extent of the video are divided into multiple time slices; for the features within each time slice, max pooling and sum pooling are applied simultaneously to obtain the most representative features of the slice, and the two pooling results are concatenated to represent the motion within the slice. PCA dimensionality reduction is then applied to the video features obtained after temporal filter pooling.
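The temporal filter pooling described above can be sketched as follows. The patent gives no implementation, so this is a minimal NumPy interpretation in which each interval value is read as the number of time slices (1 slice = global motion, 16 slices = the finest local scale); the feature dimension and frame count are illustrative:

```python
import numpy as np

def temporal_filter_pool(feats, num_slices=(1, 4, 8, 16)):
    """Pool a sequence of per-frame feature vectors at several temporal scales.

    feats: (T, D) array of last-conv-layer features over time.
    For each scale k the sequence is split into k time slices; each slice is
    summarized by its max-pooled and sum-pooled vectors, concatenated.
    """
    T, _ = feats.shape
    parts = []
    for k in num_slices:
        bounds = np.linspace(0, T, k + 1).astype(int)
        for s, e in zip(bounds[:-1], bounds[1:]):
            sl = feats[s:e]
            parts.append(sl.max(axis=0))  # max pooling over the slice
            parts.append(sl.sum(axis=0))  # sum pooling over the slice
    return np.concatenate(parts)

# 32 frames of 512-dim features (random stand-ins for conv features).
feats = np.random.default_rng(0).random((32, 512)).astype(np.float32)
desc = temporal_filter_pool(feats)
# (1 + 4 + 8 + 16) slices x 2 pooling types x 512 dims = 29696
print(desc.shape)  # (29696,)
```

The resulting descriptor would then be reduced with PCA as the text describes.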
The detailed process of step (3) is as follows: the multi-channel feature maps output by the last convolutional layer of the spatial-stream and temporal-stream networks are subjected to spatio-temporal pyramid pooling, using a 4-level spatio-temporal pyramid structure (1 × 1 × 1, 2 × 2 × 2, 3 × 3 × 3, 4 × 4 × 4). The first level (1 × 1 × 1) corresponds to the feature map over the entire temporal and spatial range, while the fourth level (4 × 4 × 4) corresponds to local spatio-temporal feature blocks at the largest scale; the pyramid structure thus yields the local blocks of the feature map at different temporal and spatial scales. Max pooling is applied to each local spatio-temporal block, taking the maximum value within the block as the feature representation of that block. Since the feature map of each channel extracts different image/video information, the features of the local blocks at the same spatio-temporal position across all channels are concatenated to form the multi-channel feature descriptor of that block. Finally, all spatio-temporal block features in the video are concatenated to form the feature representation of the video, and PCA dimensionality reduction is applied to the video features obtained after spatio-temporal pyramid pooling.
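A minimal NumPy sketch of this spatio-temporal pyramid pooling is given below. The (C, T, H, W) layout, channel count, and map sizes are assumptions for illustration; the patent only fixes the 4-level block structure and per-block max pooling:

```python
import numpy as np

def st_pyramid_pool(maps, levels=(1, 2, 3, 4)):
    """Spatio-temporal pyramid max pooling of conv feature maps.

    maps: (C, T, H, W) per-channel feature maps stacked over time.
    At level n the (T, H, W) volume is split into n x n x n blocks; each
    block is max-pooled per channel, and the C channel responses of a block
    stay together, so each block contributes a C-dim local descriptor.
    """
    C, T, H, W = maps.shape
    parts = []
    for n in levels:
        tb = np.linspace(0, T, n + 1).astype(int)
        hb = np.linspace(0, H, n + 1).astype(int)
        wb = np.linspace(0, W, n + 1).astype(int)
        for t0, t1 in zip(tb[:-1], tb[1:]):
            for h0, h1 in zip(hb[:-1], hb[1:]):
                for w0, w1 in zip(wb[:-1], wb[1:]):
                    block = maps[:, t0:t1, h0:h1, w0:w1]
                    parts.append(block.max(axis=(1, 2, 3)))  # per-channel max
    return np.concatenate(parts)

# 256 channels over 8 time steps of 13 x 13 maps (random stand-ins).
maps = np.random.default_rng(0).random((256, 8, 13, 13)).astype(np.float32)
desc = st_pyramid_pool(maps)
# (1 + 8 + 27 + 64) blocks x 256 channels = 25600
print(desc.shape)  # (25600,)
```

Again, PCA would then be applied to the concatenated descriptor as described.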
The detailed process of step (4) is as follows: the two kinds of deep video features obtained after temporal filter pooling and spatio-temporal pyramid pooling are concatenated to obtain the final feature representation of the video. The features are classified with a support vector machine to obtain the action class label of the video.
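The final assembly of step (4) can be sketched as follows, with PCA implemented via SVD. The descriptor widths, the target dimensionality of 16, and the number of videos are arbitrary placeholders, and the random arrays stand in for the pooled features of a real dataset:

```python
import numpy as np

def pca_reduce(X, dim):
    """PCA via SVD: project row vectors X (N, D) onto `dim` principal components."""
    Xc = X - X.mean(axis=0, keepdims=True)   # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:dim].T

rng = np.random.default_rng(0)
N = 20                               # hypothetical number of videos
tf_desc = rng.random((N, 29696))     # temporal-filter-pooled descriptors
sp_desc = rng.random((N, 25600))     # pyramid-pooled descriptors

# Reduce each descriptor with PCA, then concatenate into the final vector.
final = np.hstack([pca_reduce(tf_desc, 16), pca_reduce(sp_desc, 16)])
print(final.shape)  # (20, 32)
```

In the method as claimed, these final vectors would then be fed to a linear SVM (e.g. scikit-learn's `LinearSVC`) to predict each video's action class.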
Obviously, the above embodiments are merely examples given to clearly illustrate the present invention and are not a limitation on its implementation. For those of ordinary skill in the art, other variations or changes in different forms can be made on the basis of the above description; it is neither necessary nor possible to exhaust all embodiments here. Any modifications, equivalent replacements, and improvements made within the spirit and principle of the invention shall fall within the protection scope of the claims of the present invention.
Claims (3)
1. An action recognition method using deep convolutional features with a mixed pooling strategy, characterized by comprising the following steps:
(1) inputting the video to be recognized, applying a spatial-stream deep network model to each frame of the input video to obtain the appearance features of every frame, and at the same time applying a temporal-stream deep network model to every N consecutive frames of the input video to obtain motion features, wherein both the spatial-stream and temporal-stream deep network models contain 5 convolutional layers, 3 pooling layers, and 3 fully connected layers;
(2) applying temporal filter pooling to the deep feature maps output by the last convolutional layer of the spatial-stream and temporal-stream deep network models to obtain a corresponding feature representation, using temporal sequences with intervals of different lengths to capture the global and local motion of the video, and reducing the feature dimensionality with principal component analysis to obtain a first descriptor feature;
at the same time, applying spatio-temporal pyramid pooling to the deep feature maps output by the last convolutional layer of the spatial-stream and temporal-stream deep network models to obtain a corresponding feature representation, using a 4-level spatio-temporal pyramid structure to capture the local information in the deep feature maps, which is robust to target and geometric deformation, and likewise reducing the dimensionality with principal component analysis to obtain a second descriptor feature;
(3) concatenating the first and second descriptor features extracted in step (2) to form the final vector representation of the video, classifying the features with a support vector machine (SVM), and outputting the classification result to obtain the action recognition result of the video;
in said step (2), the feature maps output by the last convolutional layer of the spatial-stream and temporal-stream deep networks are subjected to temporal filter pooling, specifically using filters with 4 different temporal intervals (1, 4, 8, 16) to analyze the motion of the deep features in the time domain, wherein interval 1 corresponds to temporal motion over the entire video range, i.e. global motion, and interval 16 corresponds to local temporal motion at the largest scale; for each interval, the deep features over the whole temporal extent of the video are divided into multiple time slices; for the features within each time slice, max pooling and sum pooling are applied simultaneously to obtain the most representative features of the slice, and the two pooling results are concatenated to represent the motion within the slice; PCA dimensionality reduction is then applied to the video features obtained after temporal filter pooling;
in said step (2), the multi-channel feature maps output by the last convolutional layer of the spatial-stream and temporal-stream deep networks are subjected to spatio-temporal pyramid pooling, specifically using a 4-level spatio-temporal pyramid structure (1 × 1 × 1, 2 × 2 × 2, 3 × 3 × 3, 4 × 4 × 4), wherein the first level (1 × 1 × 1) corresponds to the feature map over the entire temporal and spatial range and the fourth level (4 × 4 × 4) corresponds to local spatio-temporal feature blocks at the largest scale, so that the pyramid structure yields the local blocks of the feature map at different temporal and spatial scales; max pooling is applied to each local spatio-temporal block, taking the maximum value within the block as the feature representation of that block; since the feature map of each channel extracts different image/video information, the features of the local blocks at the same spatio-temporal position across all channels are concatenated to form the multi-channel feature descriptor of that block; finally, all spatio-temporal block features in the video are concatenated to form the feature representation of the video, and PCA dimensionality reduction is applied to the video features obtained after spatio-temporal pyramid pooling.
2. The action recognition method using deep convolutional features with a mixed pooling strategy according to claim 1, characterized in that in said step (1), the spatial-stream and temporal-stream deep network models take each video frame as input and apply multiple layers of convolution and pooling to the raw image, the output of every layer being a set of deep feature maps that form increasingly abstract image features.
3. The action recognition method using deep convolutional features with a mixed pooling strategy according to claim 1, characterized in that in said step (3), the two kinds of deep video features obtained after temporal filter pooling and spatio-temporal pyramid pooling are concatenated to obtain the final feature representation of the video, and the features are classified with a support vector machine to obtain the action class label of the video.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611229368.0A CN106650674B (en) | 2016-12-27 | 2016-12-27 | A method for action recognition using deep convolutional features with a mixed pooling strategy
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611229368.0A CN106650674B (en) | 2016-12-27 | 2016-12-27 | A method for action recognition using deep convolutional features with a mixed pooling strategy
Publications (2)
Publication Number | Publication Date |
---|---|
CN106650674A CN106650674A (en) | 2017-05-10 |
CN106650674B true CN106650674B (en) | 2019-09-10 |
Family
ID=58832925
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611229368.0A Active CN106650674B (en) | 2016-12-27 | 2016-12-27 | A method for action recognition using deep convolutional features with a mixed pooling strategy |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106650674B (en) |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108305240B (en) * | 2017-05-22 | 2020-04-28 | 腾讯科技(深圳)有限公司 | Image quality detection method and device |
CN107609460B (en) * | 2017-05-24 | 2021-02-02 | 南京邮电大学 | Human body behavior recognition method integrating space-time dual network flow and attention mechanism |
CN107292247A (en) * | 2017-06-05 | 2017-10-24 | 浙江理工大学 | A human behavior recognition method and device based on residual networks |
CN107437083B (en) * | 2017-08-16 | 2020-09-22 | 广西荷福智能科技有限公司 | Self-adaptive pooling video behavior identification method |
CN107944488B (en) * | 2017-11-21 | 2018-12-11 | 清华大学 | Long time series data processing method based on stratification depth network |
CN108416795B (en) * | 2018-03-04 | 2022-03-18 | 南京理工大学 | Video action identification method based on sorting pooling fusion space characteristics |
CN108647625A (en) * | 2018-05-04 | 2018-10-12 | 北京邮电大学 | A kind of expression recognition method and device |
CN109308444A (en) * | 2018-07-16 | 2019-02-05 | 重庆大学 | An abnormal behavior recognition method for indoor environments |
CN110032942B (en) * | 2019-03-15 | 2021-10-08 | 中山大学 | Action identification method based on time domain segmentation and feature difference |
CN110163286B (en) * | 2019-05-24 | 2021-05-11 | 常熟理工学院 | Hybrid pooling-based domain adaptive image classification method |
CN111460876B (en) * | 2019-06-05 | 2021-05-25 | 北京京东尚科信息技术有限公司 | Method and apparatus for identifying video |
CN112241673B (en) * | 2019-07-19 | 2022-11-22 | 浙江商汤科技开发有限公司 | Video processing method and device, electronic equipment and storage medium |
CN110991617B (en) * | 2019-12-02 | 2020-12-01 | 华东师范大学 | Construction method of kaleidoscope convolution network |
CN111325149B (en) * | 2020-02-20 | 2023-05-26 | 中山大学 | Video action recognition method based on time sequence association model of voting |
CN113111822B (en) * | 2021-04-22 | 2024-02-09 | 深圳集智数字科技有限公司 | Video processing method and device for congestion identification and electronic equipment |
CN113536683B (en) * | 2021-07-21 | 2024-01-12 | 北京航空航天大学 | Feature extraction method based on fusion of artificial features and convolution features of deep neural network |
CN113537164B (en) * | 2021-09-15 | 2021-12-07 | 江西科技学院 | Real-time action time sequence positioning method |
CN114926905B (en) * | 2022-05-31 | 2023-12-26 | 江苏濠汉信息技术有限公司 | Cable accessory procedure discriminating method and system based on gesture recognition with glove |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8345984B2 (en) * | 2010-01-28 | 2013-01-01 | Nec Laboratories America, Inc. | 3D convolutional neural networks for automatic human action recognition |
CN103164694B (en) * | 2013-02-20 | 2016-06-01 | 上海交通大学 | A human action recognition method |
CN103927561B (en) * | 2014-04-29 | 2017-02-22 | 东南大学 | Behavior recognition method based on probability fusion and dimensionality reduction technology |
CN104268568B (en) * | 2014-09-17 | 2018-03-23 | 电子科技大学 | Activity recognition method based on Independent subspace network |
CN105354528A (en) * | 2015-07-15 | 2016-02-24 | 中国科学院深圳先进技术研究院 | Depth image sequence based human body action identification method and system |
CN105678216A (en) * | 2015-12-21 | 2016-06-15 | 中国石油大学(华东) | Spatio-temporal data stream video behavior recognition method based on deep learning |
CN105894045B (en) * | 2016-05-06 | 2019-04-26 | 电子科技大学 | A recognition method using a deep network model based on spatial pyramid pooling |
-
2016
- 2016-12-27 CN CN201611229368.0A patent/CN106650674B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN106650674A (en) | 2017-05-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106650674B (en) | A method for action recognition using deep convolutional features with a mixed pooling strategy | |
Ullah et al. | Efficient activity recognition using lightweight CNN and DS-GRU network for surveillance applications | |
CN106845329A (en) | An action recognition method based on multi-channel pyramid pooling of deep convolutional features | |
CN108734290B (en) | Convolutional neural network construction method based on attention mechanism and application | |
CN111523462B (en) | Video sequence expression recognition system and method based on self-attention enhanced CNN | |
CN105956517B (en) | An action recognition method based on dense trajectories | |
CN112784798A (en) | Multi-modal emotion recognition method based on feature-time attention mechanism | |
CN109767456A (en) | A target tracking method based on the SiameseFC framework and a PFP neural network | |
CN103971095B (en) | Large-scale facial expression recognition method based on multiscale LBP and sparse coding | |
CN112597985B (en) | Crowd counting method based on multi-scale feature fusion | |
CN109858407B (en) | Video behavior recognition method based on multiple information flow characteristics and asynchronous fusion | |
CN110390952A (en) | City sound event classification method based on bicharacteristic 2-DenseNet parallel connection | |
CN103955682B (en) | Activity recognition method and device based on SURF points of interest | |
CN113011504A (en) | Virtual reality scene emotion recognition method based on visual angle weight and feature fusion | |
CN111387974A (en) | Electroencephalogram feature optimization and epileptic seizure detection method based on depth self-coding | |
CN109389035A (en) | Low latency video actions detection method based on multiple features and frame confidence score | |
CN106778444A (en) | An expression recognition method based on multi-view convolutional neural networks | |
CN110458235A (en) | Movement posture similarity comparison method in a kind of video | |
CN109635812A (en) | The example dividing method and device of image | |
Dar et al. | Efficient-SwishNet based system for facial emotion recognition | |
CN114863572B (en) | Myoelectric gesture recognition method of multi-channel heterogeneous sensor | |
CN113192076B (en) | MRI brain tumor image segmentation method combining classification prediction and multi-scale feature extraction | |
CN103345623B (en) | A kind of Activity recognition method based on robust relative priority | |
CN105956604A (en) | Action identification method based on two layers of space-time neighborhood characteristics | |
CN112801009B (en) | Facial emotion recognition method, device, medium and equipment based on double-flow network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |