CN109389035A - Low latency video actions detection method based on multiple features and frame confidence score - Google Patents
- Publication number: CN109389035A (application CN201810998778.4A)
- Authority
- CN
- China
- Prior art keywords
- frame
- picture
- optical flow
- confidence score
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The present invention provides a low-latency video action detection method based on multiple features and frame confidence scores, comprising: step 1, preprocessing the data set to obtain RGB pictures and optical flow pictures; step 2, constructing a 3D convolution-deconvolution (CDC) neural network model; step 3, separately inputting the RGB-picture and optical-flow-picture training sets into the network model of step 2 for training, obtaining two trained models; step 4, feeding the RGB-picture and optical-flow-picture test sets into the two trained models of step 3, fusing the outputs of the two models to obtain a confidence score for each frame, and generating action segments; step 5, for each action segment obtained in step 4, selecting different percentages of its frames in temporal order and comparing them with the ground truth to obtain the low-latency action detection result.
Description
Technical field
The present invention relates to video-based human action detection in computer vision, and in particular to a low-latency video action detection method based on multiple features and frame confidence scores.
Background technique
With the progress of science and the advance of computer technology, people place ever higher demands on the acquisition and analysis of information; increasingly, computers are expected to perceive the world through vision as people do, i.e., computer vision. Human action recognition has become a research hotspot in the field of computer vision, and its techniques are now highly developed. Action detection grew out of action recognition: its goal is to localize the positions of actions in a long, untrimmed video and, at the same time, assign the correct label to each action in the video.
Researchers have proposed the concept of low-latency detection. Latency is originally a key metric of interactive experience systems: the time between a user performing an action and receiving the system's feedback. Extended to the recognition field, it can be understood as the time between observing data and obtaining a correct recognition result, and can be regarded as a generalization of early, real-time, continuous, and online recognition. Simply put, low-latency action detection means that, for a long untrimmed video, the actions observed so far are recognized and localized during playback, including the start and end of each action. The difficulty of low-latency recognition and detection mainly comes from two aspects: 1) incomplete observability of the data, i.e., the type of a behavior must be identified, and the start and end of each action localized, with only part of the behavioral data observed; 2) the timeliness requirement on the algorithm, i.e., the algorithm must detect and identify the behavior type as soon as possible while the video is being acquired. These two difficulties prevent many traditional algorithms from being applied directly to this problem.
Automatic detection of human activity in video has many potential applications, such as video understanding and retrieval, automatic video surveillance, and human-computer interaction. Moreover, many applications require activities to be detected as early as possible. Low-latency human motion analysis is increasingly important in diverse human-computer interaction systems, where minimizing the system's reaction delay is a crucial consideration. Excessive delay not only severely degrades the user experience of an interactive system, but can also make certain interactive systems, such as gesture-controlled or augmented-perception video games, lose their appeal and fail to gain adoption. Low-latency detection is particularly important in robotics: for example, before deploying a robot to help a patient stand up, the robot must first detect what movement the patient intends to make; likewise, a robot that communicates emotionally with humans must quickly and accurately infer a person's emotional state from facial expressions in order to respond appropriately and in time. In addition, low-latency detection allows a system to issue predictions in advance: if a warning can be raised before a dangerous behavior actually occurs, some hazardous events may be prevented. In summary, video-based low-latency human action detection has become a very important research direction with great commercial value and practical significance.
Summary of the invention
The purpose of the present invention is to provide a low-latency video action detection method based on multiple features and frame confidence scores that can operate under incomplete data observation and compute in real time.
The technical solution realizing the purpose of the invention is: a low-latency video action detection method based on multiple features and frame confidence scores, comprising the following steps:
Step 1: preprocess the data set to obtain RGB pictures and optical flow pictures;
Step 2: construct a 3D convolution-deconvolution (CDC) neural network model;
Step 3: separately input the RGB-picture and optical-flow-picture training sets into the network model of step 2 for training, obtaining two trained models;
Step 4: feed the RGB-picture and optical-flow-picture test sets into the two trained models of step 3, fuse the outputs of the two models to obtain a confidence score for each frame, and generate action segments;
Step 5: for each action segment obtained in step 4, select different percentages of its frames in temporal order and compare them with the ground truth to obtain the low-latency action detection result.
Compared with the prior art, the invention has the following advantages. Unlike traditional full-video action detection, which must first extract action segments and then feed them into a network for classification, the present invention only needs to input the frame sequence into the network in temporal order to obtain the action class of every frame; it is a frame-based action detection method. Meanwhile, the invention introduces a Rank loss function that constrains the detection score the model outputs for the correct label to be monotonically non-decreasing, so that the start of an action can be detected as early as possible, realizing low-latency action detection. Furthermore, the invention trains the network on two kinds of data: RGB pictures, which fully exploit spatial features, and optical flow pictures, which fully exploit temporal features. Finally, the spatio-temporal training data are combined to extract action information and raise the confidence score of the frame classification, thereby improving the precision of action detection.
The invention is further described below with reference to the accompanying drawings of the specification.
Detailed description of the invention
Fig. 1 is the basic framework schematic diagram of video low latency motion detection technology.
Fig. 2 shows optical flow pictures.
Fig. 3 is CDC network structure.
Fig. 4 is frame confidence score figure.
Specific embodiment
The present invention is described in more detail below with reference to the accompanying drawings.
The present invention proposes a low-latency video action detection method based on multiple features and frame confidence scores. It comprises building a multi-layer 3D convolutional network, extracting RGB and optical flow pictures, extracting frame confidence scores, and performing low-latency detection. A series of computations is carried out on a long untrimmed video so that the occurrence of an action can be detected, and its class judged, while the video is playing. The basic framework of the video low-latency action detection technique is shown in Fig. 1, and the present invention follows this framework.
Step 1: read the long untrimmed videos, including the training set and the test set, as png pictures at a frame rate of 25 FPS.
Step 2: as shown in Fig. 2, obtain optical flow pictures from the consecutive RGB pictures read from the untrimmed long videos using the TV-L1 optical flow algorithm. Every two consecutive RGB frames produce a pair of single-channel optical flow pictures, one for the x direction and one for the y direction. The optical flow algorithm is as follows:
Assume that at time t the gray value of a point m(x, y) on the image is I(x, y, t). After time dt, the point moves to a new position m'(x+dx, y+dy), whose gray value is I(x+dx, y+dy, t+dt). Assuming that the gray value at the arrival position after the movement equals the gray value at the position before the movement, we have:
I(x, y, t) = I(x+dx, y+dy, t+dt)
Expanding the right side of the equation by Taylor's formula gives:
I(x+dx, y+dy, t+dt) = I(x, y, t) + (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt + ε
where ε denotes the second-order infinitesimal terms; since dt → 0, ε can be ignored, giving
(∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt = 0
Let u and v be the velocity components of the optical flow along the X and Y axes, respectively, so that u = dx/dt and v = dy/dt. Writing Ix = ∂I/∂x, Iy = ∂I/∂y, and It = ∂I/∂t, the basic optical flow constraint equation is obtained:
Ixu + Iyv + It = 0
To solve the above equation uniquely for u and v, additional constraints must be added. The TV-L1 algorithm relies on the smoothness assumption that the motion of each pixel varies smoothly with that of its neighboring points. Adding a smoothness term, the optical flow model is established as:
E(u, v) = ∫(|∇u| + |∇v|)dxdy + λ∫|Ixu + Iyv + It|dxdy
where E is the energy functional of the optical flow estimate, λ is the weight constant of the data term, and ∇u and ∇v are two-dimensional gradients. Minimizing the energy functional E yields u and v.
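The brightness-constancy constraint above can be checked numerically. The sketch below is an illustration only (the helper names are ours, and it is not the TV-L1 solver itself): it builds a smooth synthetic frame, shifts it by one pixel, and verifies that Ixu + Iyv + It stays close to zero.

```python
import numpy as np

def flow_constraint_residual(I0, I1, u, v):
    """Residual of the optical-flow constraint Ix*u + Iy*v + It = 0.

    I0, I1: consecutive gray-level frames; u, v: assumed flow (pixels/frame).
    Spatial gradients use central differences on the first frame.
    """
    Iy, Ix = np.gradient(I0)          # np.gradient returns d/drow, d/dcol
    It = I1 - I0                      # temporal derivative with dt = 1
    return Ix * u + Iy * v + It

# Smooth synthetic frame and a copy shifted by one pixel in x (u=1, v=0).
x = np.linspace(0, 2 * np.pi, 128)
I = np.sin(x)[None, :] * np.ones((128, 1))   # horizontal sine pattern
I_shift = np.roll(I, 1, axis=1)              # pattern moved right by 1 pixel

r = flow_constraint_residual(I, I_shift, u=1.0, v=0.0)
inner = r[2:-2, 2:-2]                        # ignore wrap-around border
print(float(np.abs(inner).max()))            # small residual for smooth motion
```

In practice the dense TV-L1 flow itself would be computed with an existing implementation (e.g., the one shipped with OpenCV's optflow module) rather than by hand.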
Step 3: construct the CDC network; its structure is shown in Fig. 3. The CDC network uses the conv1a-conv5b layers of the C3D network structure as its first part, with the pooling of the fifth layer changed to 1 × 2 × 2. The fully connected layers after the 3D convolutional network of C3D are then replaced with CDC filters. Layer CDC6 takes the output of the convolution, of shape (512, L/8, 4, 4), downsamples it spatially and upsamples it temporally to (4096, L/4, 1, 1); layer CDC7 upsamples the output of CDC6 in time to (4096, L/2, 1, 1); layer CDC8 continues to upsample the previous layer's output in time to (K+1, L, 1, 1); finally, a softmax layer produces the classification result for each of the L frames.
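The layer shapes above can be traced with a small bookkeeping sketch. This models only the tensor shapes, not the network weights; `cdc_head_shapes` is an illustrative name of ours, and shapes are (channels, time, height, width) with K action classes plus one background channel, as in the text.

```python
def cdc_head_shapes(L, K):
    """Trace tensor shapes through the CDC head for an L-frame input clip."""
    assert L % 8 == 0, "clip length must be divisible by 8 (three temporal poolings)"
    shapes = [("conv5b out", (512, L // 8, 4, 4))]   # after C3D conv1a-conv5b
    shapes.append(("CDC6", (4096, L // 4, 1, 1)))    # spatial 4x4 -> 1x1, time x2
    shapes.append(("CDC7", (4096, L // 2, 1, 1)))    # time x2
    shapes.append(("CDC8", (K + 1, L, 1, 1)))        # time x2, per-class channels
    return shapes

for name, shape in cdc_head_shapes(L=32, K=20):
    print(f"{name}: {shape}")
```

Note how the three temporal upsamplings exactly undo the 8x temporal pooling of the convolutional part, so CDC8 emits one score vector per input frame.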
Step 4: the whole network has two loss functions, one a classification loss based on cross entropy, the other a loss based on Rank loss. The overall loss function is computed as:
Lt = Lct + λr·Lrt
where Lct is the classification loss function, Lrt is the Rank loss function, and λr is a constant, set to 6 here. The classification loss function is computed with the cross entropy, as follows:
Lct = -log pt(yt)
where yt is the true label of frame t in the training sequence and pt(yt) is the detection score of frame t for the correct class yt, i.e., the softmax output of the network model.
On the basis of the classification loss function, the present invention further proposes a Rank loss function. As shown in Fig. 4, during low-latency detection of a video, the more frames of an action have been seen, the higher the detection score for the correct class and the greater the confidence; conversely, the detection score of this action for a wrong class becomes lower and its confidence smaller. Therefore, while an action is occurring, its detection score should follow a monotonically non-decreasing curve. According to this property, if no action change occurs within time t, the detection score of frame t should certainly be no lower than the detection score of the previous frame. A Rank loss function is therefore constructed. If no action change occurs at frame t, the loss function is computed as follows:
Lrt = max(0, pt-1(yt) - pt(yt))
If the action changes at frame t, i.e., frames t and t-1 do not belong to the same class, the loss function is computed as follows:
Lrt = max(0, pt(yt-1) - pt-1(yt-1))
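Assuming the hinge form of the ranking loss described above, the per-frame loss can be sketched as follows. The helper `frame_loss` and the toy score arrays are illustrative, not part of the patent; `scores[t]` holds the softmax output of frame t over the K+1 classes.

```python
import numpy as np

LAMBDA_R = 6.0  # weight of the Rank loss, as set in the text

def frame_loss(scores, labels, t):
    """Total loss for frame t: cross entropy + lambda_r * ranking hinge.

    scores: (T, K+1) array of per-frame softmax scores.
    labels: (T,) array of ground-truth class indices.
    Assumes the hinge form of the Rank loss sketched in the text.
    """
    y_t = labels[t]
    l_cls = -np.log(scores[t, y_t])
    if t == 0:
        return l_cls                       # no previous frame to rank against
    y_prev = labels[t - 1]
    if y_t == y_prev:
        # no action change: correct-class score must not decrease
        l_rank = max(0.0, scores[t - 1, y_t] - scores[t, y_t])
    else:
        # action change: the previous class's score must not increase
        l_rank = max(0.0, scores[t, y_prev] - scores[t - 1, y_prev])
    return l_cls + LAMBDA_R * l_rank

scores = np.array([[0.7, 0.2, 0.1],
                   [0.5, 0.4, 0.1],   # correct-class score dropped: hinge fires
                   [0.3, 0.6, 0.1]])
labels = np.array([0, 0, 1])
print(frame_loss(scores, labels, 1))  # -log(0.5) + 6 * 0.2
```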
Step 5: the input to the first layer of the CDC network is 32 video frames; every 32 frames of a video are fed into the network as one slice, with frames (1~32), (33~64), ... as inputs and no overlap between slice windows. The RGB pictures and optical flow pictures are used as training sets, each fed in the above manner into its own copy of the constructed CDC network, and training starts. The objective function is optimized with stochastic gradient descent (SGD), the initial learning rate is set to 1e-6, the batch size is set to 4, and after 25 epochs of iteration two trained models are finally obtained.
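The non-overlapping 32-frame slicing can be sketched as follows (the helper name is ours; indices are 0-based with exclusive ends, whereas the text counts frames from 1):

```python
def slice_clips(num_frames, clip_len=32):
    """Split a frame sequence into non-overlapping clips of clip_len frames.

    Frames beyond the last full clip are dropped, mirroring the fixed-length
    network input; returns (start, end) index pairs, 0-based, end exclusive.
    """
    return [(s, s + clip_len) for s in range(0, num_frames - clip_len + 1, clip_len)]

print(slice_clips(100))  # [(0, 32), (32, 64), (64, 96)] - last 4 frames dropped
```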
Step 6: classify the RGB and optical-flow test-set pictures with the two trained models of step 5, respectively. Extract the output of the second-to-last layer of the network to obtain, for each frame, the confidence score of every class, and take the maximum confidence score as the detection score of that class. Finally, the output scores of the RGB picture and the optical flow picture are fused (averaged) to form the final frame confidence score.
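The two-stream fusion step, assuming a simple average of per-class scores as described here and in claim 6, can be sketched as follows (helper name and toy arrays are ours):

```python
import numpy as np

def fuse_scores(rgb_scores, flow_scores):
    """Average the per-frame class scores of the RGB and optical-flow streams.

    rgb_scores, flow_scores: (T, K+1) arrays of per-class confidence scores.
    Returns the fused scores and the per-frame predicted class (argmax).
    """
    fused = 0.5 * (rgb_scores + flow_scores)
    return fused, fused.argmax(axis=1)

rgb = np.array([[0.6, 0.4], [0.2, 0.8]])
flow = np.array([[0.8, 0.2], [0.4, 0.6]])
fused, pred = fuse_scores(rgb, flow)
print(pred)  # [0 1]
```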
Step 7: obtain the class of every frame from the frame confidence scores of step 6. Within a run of consecutive video frames, if two adjacent frames belong to the same class, merge them successively into small segments.
Step 8: if two small segments of step 7 are close in the time series, i.e., the number of frames between them is fewer than 20, merge them further into one large segment, which becomes a final action segment.
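Steps 7 and 8 can be sketched together. This is an illustrative reading, assuming class 0 denotes background (the patent does not name a background label) and that only same-class small segments are bridged:

```python
BACKGROUND = 0

def frames_to_segments(frame_labels, max_gap=20):
    """Merge per-frame class labels into action segments.

    Adjacent frames with the same non-background label form small segments
    (start, end, label), end exclusive; small segments of the same class
    fewer than max_gap frames apart are merged into one large segment.
    """
    runs = []
    for i, lab in enumerate(frame_labels):
        if lab == BACKGROUND:
            continue
        if runs and runs[-1][2] == lab and runs[-1][1] == i:
            runs[-1][1] = i + 1                      # extend the current run
        else:
            runs.append([i, i + 1, lab])
    merged = []
    for seg in runs:
        if merged and merged[-1][2] == seg[2] and seg[0] - merged[-1][1] < max_gap:
            merged[-1][1] = seg[1]                   # bridge the short gap
        else:
            merged.append(seg)
    return [tuple(s) for s in merged]

labels = [1] * 10 + [0] * 5 + [1] * 10   # 5-frame background gap inside action 1
print(frames_to_segments(labels))         # [(0, 25, 1)]
```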
Step 9: for each action segment obtained in step 8, select different percentages of its frames in temporal order; for example, from a 50-frame action segment take the first 3/10 of the frames, i.e., the first 15 frames, for low-latency action detection. Intersect these first 15 frames with the first 3/10 of the frames of the true action segment to obtain the overlap of the two, then compute the average precision (AP) at different IoU thresholds, and finally average over the classes to obtain the mean average precision (mAP). The low-latency detection performance is evaluated by this mAP: the higher the mAP, the better the low-latency detection effect; that is, the mAP is the result of the low-latency detection.
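The prefix comparison of step 9 can be sketched as follows. The interval endpoints and helper names are illustrative; intervals are (start, end) frame indices with exclusive ends, and the AP/mAP averaging over thresholds and classes is not modeled here:

```python
def temporal_iou(seg_a, seg_b):
    """IoU of two temporal intervals given as (start, end), end exclusive."""
    inter = max(0, min(seg_a[1], seg_b[1]) - max(seg_a[0], seg_b[0]))
    union = (seg_a[1] - seg_a[0]) + (seg_b[1] - seg_b[0]) - inter
    return inter / union if union else 0.0

def prefix_iou(pred_seg, true_seg, fraction):
    """Overlap of the leading `fraction` of a predicted segment with the
    leading `fraction` of the ground-truth segment, as in the low-latency
    comparison of step 9."""
    pred_len = int((pred_seg[1] - pred_seg[0]) * fraction)
    true_len = int((true_seg[1] - true_seg[0]) * fraction)
    pred_prefix = (pred_seg[0], pred_seg[0] + pred_len)
    true_prefix = (true_seg[0], true_seg[0] + true_len)
    return temporal_iou(pred_prefix, true_prefix)

# A 50-frame prediction starting 5 frames late, compared on its first 3/10.
print(prefix_iou((5, 55), (0, 50), 0.3))  # 0.5
```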
Claims (8)
1. A low-latency video action detection method based on multiple features and frame confidence scores, characterized by comprising the following steps:
Step 1: preprocess the data set to obtain RGB pictures and optical flow pictures;
Step 2: construct a 3D convolution-deconvolution (CDC) neural network model;
Step 3: separately input the RGB-picture and optical-flow-picture training sets into the network model of step 2 for training, obtaining two trained models;
Step 4: feed the RGB-picture and optical-flow-picture test sets into the two trained models of step 3, fuse the outputs of the two models to obtain a confidence score for each frame, and generate action segments;
Step 5: for each action segment obtained in step 4, select different percentages of its frames in temporal order and compare them with the ground truth to obtain the low-latency action detection result.
2. The method according to claim 1, characterized in that step 1 specifically comprises:
Step 1.1: read the long untrimmed videos, including the training set and the test set, as png pictures at a frame rate of 25 FPS; these form the RGB pictures;
Step 1.2: obtain optical flow pictures from the consecutive RGB pictures read from the untrimmed long videos using the TV-L1 optical flow algorithm.
3. The method according to claim 2, characterized in that the optical flow algorithm in step 1.2 is:
Step 1.2.1: assume that at time t the gray value of a point m(x, y) on the image is I(x, y, t); after time dt, the point moves to a new position m'(x+dx, y+dy), whose gray value is I(x+dx, y+dy, t+dt);
assuming that the gray value at the arrival position after the movement equals the gray value at the position before the movement, i.e.,
I(x, y, t) = I(x+dx, y+dy, t+dt) (1)
Step 1.2.2: expand the right side of formula (1) by Taylor's formula, i.e.,
I(x+dx, y+dy, t+dt) = I(x, y, t) + (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt + ε (2)
where ε denotes the second-order infinitesimal terms; since dt → 0, ε is ignored, giving
(∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt = 0 (3)
Step 1.2.3: let u and v be the velocity components of the optical flow along the X and Y axes, respectively, so that u = dx/dt and v = dy/dt; writing Ix = ∂I/∂x, Iy = ∂I/∂y, and It = ∂I/∂t, the basic optical flow constraint equation is obtained:
Ixu + Iyv + It = 0 (4)
Step 1.2.4: the fields (u, v) form the optical flow pictures.
4. The method according to claim 1, characterized in that step 2 specifically comprises the following steps:
Step 2.1: the CDC network uses the conv1a-conv5b layers of the C3D network structure as its first part, with the pooling of the fifth layer changed to 1 × 2 × 2; the fully connected layers after the 3D convolutional network of C3D are replaced with CDC filters; layer CDC6 takes the output of the convolution, of shape (512, L/8, 4, 4), downsamples it spatially and upsamples it temporally to (4096, L/4, 1, 1); layer CDC7 upsamples the output of CDC6 in time to (4096, L/2, 1, 1); layer CDC8 continues to upsample the previous layer's output in time to (K+1, L, 1, 1); finally, a softmax layer produces the classification result for each of the L frames;
Step 2.2: design the overall loss function Lt:
Lt = Lct + λr·Lrt (5)
where Lct is the classification loss function, Lrt is the Rank loss function, and λr is a constant;
the classification loss function is
Lct = -log pt(yt) (6)
where yt is the true label of frame t in the training sequence and pt(yt) is the detection score of frame t for the correct class yt;
the Rank loss function is computed in two cases: if no action change occurs at frame t, the loss function is computed as formula (7)
Lrt = max(0, pt-1(yt) - pt(yt)) (7)
if the action changes at frame t, i.e., frames t and t-1 do not belong to the same class, the loss function is computed as formula (8):
Lrt = max(0, pt(yt-1) - pt-1(yt-1)) (8)
5. The method according to claim 4, characterized in that step 3 specifically comprises the following steps:
the input to the first layer of the CDC network is 32 video frames; every 32 frames of a video are fed into the network as one slice, with frames (1~32), (33~64), ... as inputs and no overlap between slice windows; the RGB pictures and optical flow pictures obtained in step 1 are used as training sets, each fed in the above manner into its own copy of the constructed CDC network, and training is started, yielding two trained models.
6. The method according to claim 5, characterized in that step 4 specifically comprises the following steps:
Step 4.1: classify the RGB and optical-flow test-set pictures with the two trained models of step 3, respectively; extract the output of the second-to-last layer of the network to obtain, for each frame, the confidence score of every class, and take the maximum confidence score as the detection score of that class; finally, average the output scores of the RGB picture and the optical flow picture to form the final frame confidence score;
Step 4.2: obtain the class of every frame from the frame confidence scores of step 4.1; within a run of consecutive video frames, if two adjacent frames belong to the same class, merge them successively into small segments;
Step 4.3: if the number of frames between two small segments of step 4.2 is fewer than 20, merge them further into one large segment, which becomes a final action segment.
7. The method according to claim 6, characterized in that in step 4.2 each frame is given a prediction score of belonging to each class, and the class with the highest prediction score is the class of the frame.
8. The method according to claim 6, characterized in that step 5 specifically comprises the following steps:
Step 5.1: for each action segment obtained in step 4, select different percentages of its frames in temporal order for low-latency action detection;
Step 5.2: intersect the frames extracted in step 5.1 with the same frames of the true action segment to obtain the overlap of the two, then compute the average precision at different IoU thresholds, and finally average over the classes to obtain the mean average precision, which gives the low-latency detection result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810998778.4A CN109389035A (en) | 2018-08-30 | 2018-08-30 | Low latency video actions detection method based on multiple features and frame confidence score |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109389035A true CN109389035A (en) | 2019-02-26 |
Family
ID=65418545
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810998778.4A Pending CN109389035A (en) | 2018-08-30 | 2018-08-30 | Low latency video actions detection method based on multiple features and frame confidence score |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109389035A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107292912A * | 2017-05-26 | 2017-10-24 | Zhejiang University | Optical flow estimation method based on multi-scale correspondence structure learning |
US20170336398A1 (en) * | 2016-04-26 | 2017-11-23 | Washington State University | Compositions and methods for antigen detection incorporating inorganic nanostructures to amplify detection signals |
Non-Patent Citations (3)
Title |
---|
VAN-MINH KHONG等: "Improving human action recognition with two-stream 3D convolutional neural network", 《2018 1ST INTERNATIONAL CONFERENCE ON MULTIMEDIA ANALYSIS AND PATTERN RECOGNITION (MAPR)》 * |
ZHENG SHOU等: "CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos", 《2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION(CVPR)》 * |
ZHAO Qian et al.: "Intelligent Video Image Processing Technology and Applications", 30 September 2016, Xidian University Press *
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109886225A (en) * | 2019-02-27 | 2019-06-14 | 浙江理工大学 | A kind of image gesture motion on-line checking and recognition methods based on deep learning |
CN109886225B (en) * | 2019-02-27 | 2020-09-15 | 浙江理工大学 | Image gesture action online detection and recognition method based on deep learning |
CN110007675A (en) * | 2019-04-12 | 2019-07-12 | 北京航空航天大学 | A kind of Vehicular automatic driving decision system based on driving situation map and the training set preparation method based on unmanned plane |
CN113678137A (en) * | 2019-08-18 | 2021-11-19 | 聚好看科技股份有限公司 | Display device |
CN113678137B (en) * | 2019-08-18 | 2024-03-12 | 聚好看科技股份有限公司 | Display apparatus |
US20210158483A1 (en) * | 2019-11-26 | 2021-05-27 | Samsung Electronics Co., Ltd. | Jointly learning visual motion and confidence from local patches in event cameras |
US11694304B2 (en) * | 2019-11-26 | 2023-07-04 | Samsung Electronics Co., Ltd. | Jointly learning visual motion and confidence from local patches in event cameras |
CN111027472A (en) * | 2019-12-09 | 2020-04-17 | 北京邮电大学 | Video identification method based on fusion of video optical flow and image space feature weight |
CN113221633A (en) * | 2021-03-24 | 2021-08-06 | 西安电子科技大学 | Weak supervision time sequence behavior positioning method based on hierarchical category model |
CN113221633B (en) * | 2021-03-24 | 2023-09-19 | 西安电子科技大学 | Weak supervision time sequence behavior positioning method based on hierarchical category model |
CN116453010A (en) * | 2023-03-13 | 2023-07-18 | 彩虹鱼科技(广东)有限公司 | Ocean biological target detection method and system based on optical flow RGB double-path characteristics |
CN116453010B (en) * | 2023-03-13 | 2024-05-14 | 彩虹鱼科技(广东)有限公司 | Ocean biological target detection method and system based on optical flow RGB double-path characteristics |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190226 |