CN110866458A - Multi-person action detection and recognition method and device based on three-dimensional convolutional neural network - Google Patents

Multi-person action detection and recognition method and device based on three-dimensional convolutional neural network

Info

Publication number
CN110866458A
CN110866458A (application CN201911032206.1A)
Authority
CN
China
Prior art keywords
person, sequence, video, neural network, convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911032206.1A
Other languages
Chinese (zh)
Inventor
宋波 (Song Bo)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yingpu Technology Co Ltd
Original Assignee
Beijing Yingpu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yingpu Technology Co Ltd
Priority to CN201911032206.1A
Publication of CN110866458A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20: Movements or behaviour, e.g. gesture recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The present application discloses a method and a device for multi-person action detection and recognition based on a three-dimensional convolutional neural network. The method comprises: preprocessing the video of a training data set to obtain a sequence of body actions of each person; and inputting the extracted sequence of each person's body actions into a three-dimensional convolutional neural network model for action detection and recognition, wherein the model comprises two 3D convolution-pooling units, two convolutional layers, two max-pooling layers, a flatten layer, two fully connected layers and an output layer. The device comprises a preprocessing module and a detection and recognition module. The method and the device can be applied to multi-person action detection, and the three-dimensional convolutional neural network model shows good generalization and higher recognition accuracy when applied to different types of video data.

Description

Multi-person action detection and recognition method and device based on three-dimensional convolutional neural network
Technical Field
The present application relates to the field of video processing, and in particular to a method and a device for multi-person action detection and recognition based on a three-dimensional convolutional neural network.
Background
Human action detection and recognition is a specific application of video processing that aims to recognize human motion and activity from a series of observations of the subject and the surrounding environment; it is a hot topic in the field of computer vision. Human action detection is an important task in many applications, such as video surveillance systems, video retrieval, and multimedia applications of human motion. Human action recognition methods can be divided into two broad categories: detection-recognition and classification. Detection-recognition methods first detect a person's motion and then recognize the action. These methods are validated on conventional surveillance data sets such as KTH, Weizmann, IXMAS, UCF-ARG, and PETS. These data sets are recorded under controlled conditions, so that each person is optimally framed by the camera against a simple static background with similar lighting across recordings. Classification methods, by contrast, generally classify a video according to the action it contains. These methods are evaluated on newly developed data sets of videos collected from the web (e.g., from YouTube) or recorded with mobile cameras against complex backgrounds under practical, uncontrolled lighting conditions, including Hollywood, Hollywood2, UCF Sports, UCF50, UCF101, HMDB51, and HMDB; such data sets exercise the diversity of video content (variations in the size and position of the human body, etc.), camera motion, and background.

To improve human action recognition performance, most recent studies adopt various deep learning models. Since human behavior is expressed through multiple movements of the body or its parts, the recognition process must involve video processing in order to understand patterns of change in visual appearance. Many approaches feed several input streams into deep learning models to recognize human activities: one line of work recognizes human behavior using CNN models combined with long short-term memory (LSTM); another feeds raw frames, optical flow, and stacked motion difference images as inputs to a CNN for behavior recognition; yet another uses six-stream features for general action recognition, with inputs including the full image, a human image containing only the human body, and the optical flow of each of the preceding features.
The methods described above have the following drawbacks:
1. Most existing methods target single-person action detection and recognition; because multi-person action detection is considerably more complex, few methods address it.
2. Owing to the complexity and diversity of multi-person scenes, existing multi-person action detection and recognition methods often suffer from low recognition accuracy.
3. Existing methods are generally built for one specific type of video data set, e.g., they are tested only on conventional surveillance data sets or only on video data sets collected from the web; few models can be applied to both types of data sets simultaneously, i.e., existing models generalize poorly.
Disclosure of Invention
It is an object of the present application to overcome the above problems or to at least partially solve or mitigate the above problems.
According to one aspect of the present application, a multi-person action detection and recognition method based on a three-dimensional convolutional neural network is provided, the method comprising the following steps:
preprocessing a video of a training data set to obtain a sequence of body actions of each person;
and inputting the extracted sequence of each person's body actions into a three-dimensional convolutional neural network model for action detection and recognition, wherein the three-dimensional convolutional neural network model comprises two 3D convolution-pooling units, two convolutional layers, two max-pooling layers, a flatten layer, two fully connected layers and an output layer.
Optionally, the preprocessing of the video of the training data set includes:
separating the human body from the video of the training data set;
and extracting a sequence of the body action of each person from the video after the human body is separated.
Optionally, the separating the human body from the video of the training data set includes:
determining a background frame using a background estimation method based on an initialization value of the sum of absolute differences (SAD) between every two images and the entropy of each block;
and computing a difference absolute image against the background frame, and then performing structure-texture decomposition on the difference absolute image to obtain the regions of moving objects and complete the human body separation.
Optionally, the sequence of each person's body actions is extracted from the human-separated video using an extended version of a tracking method based on the kernelized correlation filter (KCF).
Optionally, the preprocessing further comprises extracting motion history images from the video of the training data set;
and combining the extracted motion history images with the sequence of each person's body actions, and inputting them together into the three-dimensional convolutional neural network model for action detection and recognition.
According to another aspect of the present application, a multi-person action detection and recognition apparatus based on a three-dimensional convolutional neural network is provided, comprising:
a pre-processing module configured to pre-process the video of the training data set to obtain a sequence of body movements of each person;
and a detection and recognition module configured to input the extracted sequence of each person's body actions into a three-dimensional convolutional neural network model for action detection and recognition, wherein the three-dimensional convolutional neural network model comprises two 3D convolution-pooling units, two convolutional layers, two max-pooling layers, a flatten layer, two fully connected layers and an output layer.
Optionally, the preprocessing module includes:
a human body separation submodule configured to separate a human body from a video of a training data set;
a sequence extraction sub-module configured to extract a sequence of each person's body actions from the human-separated video.
Optionally, the human body separation submodule includes:
a background frame sub-module configured to determine a background frame using an initialization value of the sum of absolute differences (SAD) between every two images and a background estimation method based on the entropy of each block;
and a structure-texture decomposition sub-module configured to compute a difference absolute image against the background frame, and then perform structure-texture decomposition on the difference absolute image to obtain the regions of moving objects and complete the human body separation.
Optionally, the sequence of each person's body actions is extracted from the human-separated video using an extended version of a tracking method based on the kernelized correlation filter (KCF).
Optionally, the preprocessing module further includes:
a history image extraction sub-module configured to extract motion history images from the video of the training data set;
and in the detection and recognition module, the extracted motion history images are combined with the sequence of each person's body actions and jointly input into the three-dimensional convolutional neural network model for action detection and recognition.
The present application discloses a method and a device for multi-person action detection and recognition based on a three-dimensional convolutional neural network. Because the training data are preprocessed, a sequence is extracted for each person, and an improved three-dimensional convolutional neural network model is used to detect and recognize human actions, the method and the device can be applied to multi-person action detection, and the model shows good generalization and higher recognition accuracy when applied to different types of video data.
The above and other objects, advantages and features of the present application will become more apparent to those skilled in the art from the following detailed description of specific embodiments thereof, taken in conjunction with the accompanying drawings.
Drawings
Some specific embodiments of the present application will be described in detail hereinafter by way of illustration and not limitation with reference to the accompanying drawings. The same reference numbers in the drawings identify the same or similar elements or components. Those skilled in the art will appreciate that the drawings are not necessarily drawn to scale. In the drawings:
FIG. 1 is a schematic flow chart of a method for multi-person action detection and recognition based on a three-dimensional convolutional neural network according to an embodiment of the present application;
FIG. 2 is a schematic block diagram of a multi-person action detection and recognition apparatus based on a three-dimensional convolutional neural network according to an embodiment of the present application;
FIG. 3 is a block schematic diagram of a computing device of one embodiment of the present application;
FIG. 4 is a schematic block diagram of a computer-readable storage medium according to an embodiment of the present application.
Detailed Description
Fig. 1 is a schematic flow diagram of a method for multi-person motion detection and recognition based on a three-dimensional convolutional neural network, according to an embodiment of the present application, which may generally include:
s1, preprocessing the video of the training data set to obtain a sequence of the body actions of each person:
Before the training data set is used for model training, it must be preprocessed: the human body is separated from each video, and a sequence of body movements during the action is extracted. Preprocessing is performed with a background modeling technique, specifically as follows: a background frame can be determined quickly and efficiently using an initialization value based on the sum of absolute differences (SAD) between consecutive images, followed by a background estimation method based on the entropy of each block. To minimize the information in each frame and reduce noise and falsely modeled areas, the cvSub() function in OpenCV may be used to compute a difference absolute image against the background frame; a structure-texture decomposition of this difference image is then calculated, and only the structural component, i.e., the uniform part of the image, is used in subsequent processing. This yields the regions of moving objects, after which the preprocessed data are resized.
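For illustration, a minimal Python/OpenCV sketch of this preprocessing step follows. It is not the exact pipeline of the embodiment: the legacy cvSub() C function is replaced by its modern counterpart cv2.absdiff, edge-preserving smoothing stands in for the structure-texture decomposition, and the threshold and filter parameters are assumed values.

```python
import cv2

def extract_moving_regions(frame, background, thresh=25):
    """Sketch of the preprocessing stage: difference the current frame
    against the estimated background frame, keep the smooth 'structure'
    component, and threshold it to obtain moving-object regions.

    `background` is assumed to come from the SAD/block-entropy background
    estimation step described above (not shown here)."""
    # Difference absolute image; cv2.absdiff is the modern counterpart of
    # the legacy cvSub() call followed by taking absolute values.
    diff = cv2.absdiff(frame, background)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)

    # Crude stand-in for structure-texture decomposition: edge-preserving
    # smoothing retains the uniform structural component and suppresses
    # fine texture and noise.
    structure = cv2.bilateralFilter(gray, 9, 75, 75)

    # Threshold the structural component to obtain moving-object regions,
    # then resize to the model's input resolution.
    _, mask = cv2.threshold(structure, thresh, 255, cv2.THRESH_BINARY)
    return cv2.resize(mask, (32, 32))
```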
When a scene contains many moving objects or people, each person in the scene must be detected and tracked in order to generate a per-person sequence. For this purpose, an extended version of a tracking method based on the kernelized correlation filter (KCF) is used, and a sequence representing each person's body actions is extracted from the tracking results. However, these sequences, being RGB video, may still contain redundant information such as static background; therefore, motion history images (MHI) are extracted from the videos, combined with the sequences, and jointly fed into a three-dimensional convolutional neural network model (3DCNN) for training. Using MHIs improves recognition accuracy and reduces recognition time.
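The per-person sequence extraction and the MHI computation could look like the sketch below. It uses OpenCV's stock KCF tracker (shipped with opencv-contrib builds) rather than the extended KCF variant this embodiment refers to, and implements the standard MHI update rule directly; the decay duration tau and the 32 × 32 crop size are assumptions.

```python
import cv2
import numpy as np

def track_person(frames, init_box):
    """Track one person through a list of frames with OpenCV's stock KCF
    tracker (a stand-in for the embodiment's extended KCF) and return the
    per-frame crops forming that person's action sequence."""
    tracker = cv2.TrackerKCF_create()  # cv2.TrackerKCF.create() in newer builds
    tracker.init(frames[0], init_box)  # init_box = (x, y, w, h)
    sequence = []
    for frame in frames[1:]:
        ok, box = tracker.update(frame)
        if ok:
            x, y, w, h = (int(v) for v in box)
            sequence.append(cv2.resize(frame[y:y + h, x:x + w], (32, 32)))
    return sequence

def update_mhi(mhi, motion_mask, tau=10):
    """Standard motion history image update: pixels moving in the current
    frame are set to the maximum duration tau, while all other pixels
    decay by one, so the MHI encodes where and how recently motion
    occurred while suppressing the static background."""
    return np.where(motion_mask > 0, tau, np.maximum(mhi - 1, 0))
```

Run over a whole clip, update_mhi collapses the motion history into a single image per time step, which can then be stacked with the tracked crops before being fed to the network.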
S2, inputting the extracted sequence of each person's body actions into a three-dimensional convolutional neural network model for action detection and recognition, the three-dimensional convolutional neural network model comprising two 3D convolution-pooling units, two convolutional layers, two max-pooling layers, a flatten layer, two fully connected layers and an output layer:
A three-dimensional convolutional neural network (3DCNN) is a supervised learning model with a multi-level deep architecture that can learn multiple invariant features from input video; convolution and pooling are the main components of the 3DCNN model. In this embodiment, the 3DCNN architecture comprises two 3D convolution-pooling units, two convolutional layers, two max-pooling layers, one flatten layer, two fully connected layers, and an output layer, where the output layer contains ten neurons, one per action class, and the activation function is the ReLU function. The MHIs obtained in the previous step are combined with each person's sequence and jointly fed into the three-dimensional convolutional neural network model for training.
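A plausible Keras rendering of this architecture is sketched below. The layer sequence follows the embodiment (two Conv3D + MaxPooling3D units, a flatten layer, two fully connected layers, and a ten-neuron output); the filter counts, kernel sizes, dense-layer widths, and the softmax on the output are assumptions, and the 10 × 32 × 32 input shape anticipates the temporal depth and resolution reported in the next paragraph.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_3dcnn(num_actions=10, channels=1):
    """Sketch of the described 3DCNN: two 3D convolution-pooling units,
    a flatten layer, two fully connected layers, and an output layer with
    one neuron per action class."""
    return keras.Sequential([
        # Clips of 10 frames at 32 x 32; `channels` covers the combined
        # per-person sequence and MHI inputs.
        keras.Input(shape=(10, 32, 32, channels)),
        # Unit 1: 3D convolution + max pooling (spatial pooling only, to
        # preserve temporal resolution early on).
        layers.Conv3D(32, (3, 3, 3), activation="relu", padding="same"),
        layers.MaxPooling3D(pool_size=(1, 2, 2)),
        # Unit 2: 3D convolution + spatio-temporal max pooling.
        layers.Conv3D(64, (3, 3, 3), activation="relu", padding="same"),
        layers.MaxPooling3D(pool_size=(2, 2, 2)),
        layers.Flatten(),
        # Two fully connected layers with ReLU activations.
        layers.Dense(256, activation="relu"),
        layers.Dense(128, activation="relu"),
        # Output layer: ten neurons, one per action.
        layers.Dense(num_actions, activation="softmax"),
    ])
```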
The method of this embodiment trains the model on the KTH, Weizmann, and UCF-ARG data sets, which are three conventional surveillance data sets. Model testing uses the PETS, UCF101, Hollywood, and MHAD data sets, i.e., one conventional surveillance data set and three video data sets collected from the web; the test data are fed directly into the trained 3DCNN for multi-person detection and recognition without any preprocessing. The input resolution is 32 × 32, the temporal depth is 10 frames, the batch size during training is 128, and the learning rate is 1. Testing on these different types of data sets shows that, compared with the prior art, the 3DCNN model of this embodiment improves both generalization across data sets and recognition accuracy.
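Training under the reported settings might be wired up as follows, reusing build_3dcnn from the previous sketch. The SGD optimizer and the epoch count are assumptions (the embodiment reports only the 32 × 32 resolution, temporal depth 10, batch size 128, and learning rate 1), and the random arrays merely stand in for the preprocessed clips and labels.

```python
import numpy as np
from tensorflow import keras

# Placeholder data standing in for the preprocessed per-person clips
# (N x depth x H x W x channels) and their integer action labels.
x_train = np.random.rand(256, 10, 32, 32, 1).astype("float32")
y_train = np.random.randint(0, 10, size=(256,))

model = build_3dcnn(num_actions=10)
# Batch size 128 and learning rate 1 follow the embodiment; plain SGD is
# an assumption, since the embodiment does not name an optimizer.
model.compile(optimizer=keras.optimizers.SGD(learning_rate=1.0),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=128, epochs=20)
```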
Fig. 2 is a schematic block diagram of a multi-person motion detection and recognition apparatus based on a three-dimensional convolutional neural network according to an embodiment of the present application, which may generally include:
a pre-processing module 1 configured to pre-process the video of the training data set, obtaining a sequence of body movements of each person:
Before the training data set is used for model training, it must be preprocessed: the human body is separated from each video, and a sequence of body movements during the action is extracted. Preprocessing is performed with a background modeling technique, specifically as follows: a background frame can be determined quickly and efficiently using an initialization value based on the sum of absolute differences (SAD) between consecutive images, followed by a background estimation method based on the entropy of each block. To minimize the information in each frame and reduce noise and falsely modeled areas, the cvSub() function in OpenCV may be used to compute a difference absolute image against the background frame; a structure-texture decomposition of this difference image is then calculated, and only the structural component, i.e., the uniform part of the image, is used in subsequent processing. This yields the regions of moving objects, after which the preprocessed data are resized.
When a scene contains many moving objects or people, each person in the scene must be detected and tracked in order to generate a per-person sequence. For this purpose, an extended version of a tracking method based on the kernelized correlation filter (KCF) is used, and a sequence representing each person's body actions is extracted from the tracking results. However, these sequences, being RGB video, may still contain redundant information such as static background; therefore, motion history images (MHI) are extracted from the videos, combined with the sequences, and jointly fed into a three-dimensional convolutional neural network model (3DCNN) for training. Using MHIs improves recognition accuracy and reduces recognition time.
A detection and recognition module 2 configured to input the extracted sequence of each person's body actions into a three-dimensional convolutional neural network model for action detection and recognition, the three-dimensional convolutional neural network model comprising two 3D convolution-pooling units, two convolutional layers, two max-pooling layers, one flatten layer, two fully connected layers, and an output layer:
A three-dimensional convolutional neural network (3DCNN) is a supervised learning model with a multi-level deep architecture that can learn multiple invariant features from input video; convolution and pooling are the main components of the 3DCNN model. In this embodiment, the 3DCNN architecture comprises two 3D convolution-pooling units, two convolutional layers, two max-pooling layers, one flatten layer, two fully connected layers, and an output layer, where the output layer contains ten neurons, one per action class, and the activation function is the ReLU function. The MHIs obtained in the previous step are combined with each person's sequence and jointly fed into the three-dimensional convolutional neural network model for training.
The apparatus of this embodiment trains the model on the KTH, Weizmann, and UCF-ARG data sets, which are three conventional surveillance data sets. Model testing uses the PETS, UCF101, Hollywood, and MHAD data sets, i.e., one conventional surveillance data set and three video data sets collected from the web; the test data are fed directly into the trained 3DCNN for multi-person detection and recognition without any preprocessing. The input resolution is 32 × 32, the temporal depth is 10 frames, the batch size during training is 128, and the learning rate is 1. Testing on these different types of data sets shows that, compared with the prior art, the 3DCNN model of this embodiment improves both generalization across data sets and recognition accuracy.
Embodiments of the present application also provide a computing device. Referring to FIG. 3, the computing device comprises a memory 1120, a processor 1110, and a computer program stored in the memory 1120 and executable by the processor 1110; the computer program is stored in a space 1130 for program code in the memory 1120 and, when executed by the processor 1110, implements the method steps 1131 for performing any of the methods described in the present application.
An embodiment of the present application also provides a computer-readable storage medium. Referring to FIG. 4, the computer-readable storage medium comprises a storage unit for program code, provided with a program 1131' that performs the steps of the methods described in the present application and is executed by a processor.
An embodiment of the present application also provides a computer program product containing instructions which, when run on a computer, cause the computer to carry out the steps of the methods described in the present application.
In the above embodiments, the implementation may be realized wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions which, when loaded and executed by a computer, cause the computer to perform, in whole or in part, the procedures or functions described in the embodiments of the present application. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) links. The computer-readable storage medium may be any available medium accessible by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., DVD), or semiconductor media (e.g., solid state disk (SSD)).
Those of skill in the art will further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both; to clearly illustrate this interchangeability of hardware and software, the various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be implemented by a program stored in a computer-readable storage medium, where the storage medium is a non-transitory medium such as random access memory, read-only memory, flash memory, a hard disk, a solid state disk, magnetic tape, a floppy disk, an optical disk, or any combination thereof.
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A multi-person action detection and recognition method based on a three-dimensional convolutional neural network, comprising the following steps:
preprocessing a video of a training data set to obtain a sequence of body actions of each person;
and inputting the extracted sequence of each person's body actions into a three-dimensional convolutional neural network model for action detection and recognition, wherein the three-dimensional convolutional neural network model comprises two 3D convolution-pooling units, two convolutional layers, two max-pooling layers, a flatten layer, two fully connected layers and an output layer.
2. The method of claim 1, wherein preprocessing the video of the training data set comprises:
separating the human body from the video of the training data set;
and extracting a sequence of the body action of each person from the video after the human body is separated.
3. The method of claim 2, wherein the separating the human body from the video of the training data set comprises:
determining a background frame using a background estimation method based on an initialization value of the sum of absolute differences (SAD) between every two images and the entropy of each block;
and computing a difference absolute image against the background frame, and then performing structure-texture decomposition on the difference absolute image to obtain the regions of moving objects and complete the human body separation.
4. The method of claim 2, wherein the sequence of each person's body actions is extracted from the human-separated video using an extended version of a tracking method based on the kernelized correlation filter (KCF).
5. The method according to any one of claims 1 to 4,
the preprocessing further comprises extracting motion history images from the video of the training data set;
and combining the extracted motion history images with the sequence of each person's body actions, and inputting them together into the three-dimensional convolutional neural network model for action detection and recognition.
6. A multi-person action detection and recognition device based on a three-dimensional convolutional neural network, comprising:
a pre-processing module configured to pre-process the video of the training data set to obtain a sequence of body movements of each person;
and a detection and recognition module configured to input the extracted sequence of each person's body actions into a three-dimensional convolutional neural network model for action detection and recognition, wherein the three-dimensional convolutional neural network model comprises two 3D convolution-pooling units, two convolutional layers, two max-pooling layers, a flatten layer, two fully connected layers and an output layer.
7. The apparatus of claim 6, wherein the preprocessing module comprises:
a human body separation submodule configured to separate a human body from a video of a training data set;
a sequence extraction sub-module configured to extract a sequence of each person's body actions from the human-separated video.
8. The apparatus of claim 7, wherein the human body separating submodule comprises:
a background frame sub-module configured to determine a background frame using an initialization value of the sum of absolute differences (SAD) between every two images and a background estimation method based on the entropy of each block;
and a structure-texture decomposition sub-module configured to compute a difference absolute image against the background frame, and then perform structure-texture decomposition on the difference absolute image to obtain the regions of moving objects and complete the human body separation.
9. The apparatus of claim 7, wherein the sequence of each person's body actions is extracted from the human-separated video using an extended version of a tracking method based on the kernelized correlation filter (KCF).
10. The apparatus according to any one of claims 6-9,
the preprocessing module further comprises:
a history image extraction sub-module configured to extract motion history images from the video of the training data set;
and in the detection and recognition module, the extracted motion history images are combined with the sequence of each person's body actions and jointly input into the three-dimensional convolutional neural network model for action detection and recognition.
CN201911032206.1A 2019-10-28 2019-10-28 Multi-person action detection and recognition method and device based on three-dimensional convolutional neural network Pending CN110866458A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911032206.1A CN110866458A (en) 2019-10-28 2019-10-28 Multi-person action detection and recognition method and device based on three-dimensional convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911032206.1A CN110866458A (en) 2019-10-28 2019-10-28 Multi-person action detection and recognition method and device based on three-dimensional convolutional neural network

Publications (1)

Publication Number Publication Date
CN110866458A (en) 2020-03-06

Family

ID=69653544

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911032206.1A Pending CN110866458A (en) Multi-person action detection and recognition method and device based on three-dimensional convolutional neural network

Country Status (1)

Country Link
CN (1) CN110866458A (en)



Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919011A (en) * 2019-01-28 2019-06-21 浙江工业大学 A kind of action video recognition methods based on more duration informations

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
NOOR ALMAADEED et al.: "A Novel Approach for Robust Multi Human Action Detection and Recognition based on 3-Dimentional Convolutional Neural Networks", Pattern Recognition Letters *
OMAR ELHARROUSS et al.: "Moving object detection zone using a block based background model", IET Computer Vision *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112131995A (en) * 2020-09-16 2020-12-25 北京影谱科技股份有限公司 Action classification method and device, computing equipment and storage medium
CN112530144A (en) * 2020-11-06 2021-03-19 华能国际电力股份有限公司上海石洞口第一电厂 Method and system for warning violation behaviors of thermal power plant based on neural network
CN112530144B (en) * 2020-11-06 2022-06-28 华能国际电力股份有限公司上海石洞口第一电厂 Method and system for warning violation behaviors of thermal power plant based on neural network

Similar Documents

Publication Publication Date Title
Teoh et al. Face recognition and identification using deep learning approach
CN109558832B (en) Human body posture detection method, device, equipment and storage medium
CN112446270B (en) Training method of pedestrian re-recognition network, pedestrian re-recognition method and device
Wang et al. Hierarchical attention network for action recognition in videos
CN111709311B (en) Pedestrian re-identification method based on multi-scale convolution feature fusion
CN109472191B (en) Pedestrian re-identification and tracking method based on space-time context
CN107067413B (en) A kind of moving target detecting method of time-space domain statistical match local feature
Seal et al. Human face recognition using random forest based fusion of à-trous wavelet transform coefficients from thermal and visible images
Allaert et al. Micro and macro facial expression recognition using advanced local motion patterns
CN113378649A (en) Identity, position and action recognition method, system, electronic equipment and storage medium
CN111353399A (en) Tamper video detection method
CN111814690A (en) Target re-identification method and device and computer readable storage medium
CN110866458A (en) Multi-user action detection and identification method and device based on three-dimensional convolutional neural network
Ali et al. Deep Learning Algorithms for Human Fighting Action Recognition.
Sowmyayani et al. Fall detection in elderly care system based on group of pictures
Putra A Novel Method for Handling Partial Occlusion on Person Re-identification using Partial Siamese Network
Al-Dmour et al. Masked face detection and recognition system based on deep learning algorithms
Ghafoor et al. Egocentric video summarization based on people interaction using deep learning
Manssor et al. TIRFaceNet: thermal IR facial recognition
Zhang et al. ATMLP: Attention and Time Series MLP for Fall Detection
Phang et al. Real-time multi-camera multi-person action recognition using pose estimation
Pushparaj et al. Using 3D convolutional neural network in surveillance videos for recognizing human actions.
Wong et al. Multi-Camera Face Detection and Recognition in Unconstrained Environment
Muhamad et al. A comparative study using improved LSTM/GRU for human action recognition
Voronin et al. Action recognition using the 3D dense microblock difference

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200306