CN113486706B - Online action recognition method based on human body posture estimation and historical information - Google Patents

Online action recognition method based on human body posture estimation and historical information

Info

Publication number
CN113486706B
CN113486706B (application CN202110558936.6A)
Authority
CN
China
Prior art keywords
action
model
training
skeleton
stgcn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110558936.6A
Other languages
Chinese (zh)
Other versions
CN113486706A (en)
Inventor
冯伟
孙佳敏
边存灵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN202110558936.6A priority Critical patent/CN113486706B/en
Publication of CN113486706A publication Critical patent/CN113486706A/en
Application granted granted Critical
Publication of CN113486706B publication Critical patent/CN113486706B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an online action recognition method based on human body posture estimation and historical information, comprising the following steps. Constructing and training an online action recognition model: for an input video, a skeleton sequence is extracted by a human body posture estimation algorithm, online action recognition is then performed, and the action category of the recognition result is given; this comprises collecting original motion video data, with the 3D skeleton data generated by the posture estimation algorithm used as the original data set; constructing a high-quality action recognition guidance module; constructing a low-quality robust action recognition module; and constructing an online action recognition module. Finally, the online action recognition model is tested.

Description

Online action recognition method based on human body posture estimation and historical information
Technical Field
The invention is mainly applied to the field of action recognition and relates to graph convolutional neural network technology, long short-term memory (LSTM) neural network technology and knowledge distillation technology in the field of artificial intelligence. The method can be used for human action recognition applications in the field of video processing.
Background
In recent years, with the rapid development of artificial intelligence, human action recognition has made great progress. It plays an increasingly important role in application scenarios such as intelligent security monitoring, human-computer interaction, education and intelligent medical treatment, has received the attention of numerous scholars and researchers, and has become an active research field.
The background art related to the invention is as follows:
(1) Human body posture estimation: human body posture estimation extracts the motion and action data of a human body in a video through a posture estimation algorithm. The extracted motion data are presented as a 3D skeleton sequence formed by connecting a number of human joint points, each joint point containing the spatial coordinate data of a human joint. A sequence of consecutive multi-frame 3D skeletons can simply and efficiently represent human motion characteristics, i.e. action information. Human posture estimation can therefore effectively help a model classifier perform high-precision action recognition.
(2) Knowledge distillation algorithm: knowledge distillation can transfer the action representation capability of one model to a target network model (here, the robust action recognition model). The robust action recognition model is then used to initialize the continuous action recognition model, so that online recognition of human actions in video can be realized and the online action recognition task is supported.
(3) STGCN network: the STGCN network performs well on human action recognition tasks [1]. It is a classic behavior recognition network model applied to the human skeleton, with strong generalization ability by design; by extracting and exploiting features of the skeleton sequence in both the spatial and temporal dimensions, it can improve the accuracy of human action recognition.
The related documents are:
[1] Yan S, Xiong Y, Lin D. Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition [J]. 2018.
Disclosure of the Invention
The invention provides an online action recognition method. Skeleton coordinates are extracted for each frame of a video using a posture estimation algorithm, a skeleton space-time graph is constructed from these coordinates using a deep-learning spatio-temporal graph convolutional network model, the action representation capability of this model is transferred to a target network model using a knowledge distillation method, and finally a continuous action recognition prediction model is constructed. The robust action recognition model is used to initialize the continuous action model, so that online recognition of human actions in video can be realized and the online action recognition task is supported.
The invention adopts the following technical scheme:
an online action recognition method based on human body posture estimation and historical information comprises the following steps:
(1) Constructing and training an online action recognition model: for an input video, a skeleton sequence is extracted through a human body posture estimation algorithm, then online action recognition is realized, and the type of an action recognition result is given, wherein the method comprises the following steps:
a) Collecting original motion video data: obtaining 3D skeleton coordinate data through a human body posture estimation algorithm, and taking the 3D skeleton data generated by the posture estimation algorithm as an original data set;
b) Constructing a high-quality action recognition guidance module: the original 3D skeleton data extracted in step a) are processed to construct an accurate action segmentation training set V1, which is mainly used to train a teacher network model. An STGCN network is selected to construct the teacher network model, the teacher network is trained with a single-label training strategy, and a high-quality action representation guidance model is finally obtained.
c) Constructing a low-quality robust action recognition module: the 3D skeleton data of the original data set are processed to generate a preceding-action training set V2. An STGCN network is selected to build the student network model, which is trained on V2 with a single-label training strategy; a knowledge distillation method is used so that the high-quality action representation guidance model generated in step b) guides the training of the student network model, and a robust action recognition model is then obtained.
d) Constructing an online action recognition module: the original 3D skeleton data are processed to generate a new training set V3, a continuous action training set comprising continuous multi-segment action 3D skeleton sequences. An LSTM-STGCN is then built using an STGCN network and an LSTM network; in the concrete structure of the constructed action prediction model, the LSTM network is connected after the fully connected layer of the STGCN, and the output of the STGCN serves as the input of the LSTM. The LSTM-STGCN model is first initialized: the robust action recognition model obtained in step c) is used as the LSTM-STGCN skeleton feature extraction module, and the parameters of the robust action recognition model are loaded into the LSTM-STGCN. The model is then trained on the constructed continuous action training set, and the final target model, an online action recognition model, is obtained through a multi-label classification training strategy.
(2) Testing the online action recognition model: for an input online action video, a 3D skeleton data sequence of the human action is obtained through a human body posture estimation algorithm, the 3D skeleton sequence is fed into the online action recognition model, and the action category is output, completing the recognition of the online action.
Further, in step a), the 3D skeleton data is extracted using a Kinect v2 sensor pose estimation algorithm.
Further, in steps b), c) and d), a training set in tensor form is constructed from the 3D skeleton coordinate data with the data structure C × T × V × M, where C denotes 3 channels, T denotes the number of frames of data, V denotes the number of skeleton joint points, and M denotes the number of persons in the video. The constructed accurate action segmentation training set is segmented according to action category labels, so that the 3D skeleton data of each action sequence contains only one action category. The constructed preceding-action training set is obtained by changing the size of T so that a segment of the original 3D skeleton sequence containing the previous action is cut out; the number of frames of an action sequence in the preceding-action training set is T + B, where B is a set number, and the preceding-action training set in tensor form is generated according to this data format.
Furthermore, in step c), the high-quality action representation guidance model guides the training of the student model. This is realized in the construction of the loss function of the training model: the total loss function of the student network model is L_total = αL_student + βL_kl + γL_mse, where L_kl is the divergence loss function, L_student is the loss term of the student network's spatio-temporal graph convolutional network, and L_mse is the mean-square loss function used to compute the loss between the features extracted by the student network model and the features extracted by the teacher network model; α, β and γ are three set hyperparameters. The robust action recognition model is obtained by training on this data set with a single-label training strategy.
Further, in step d), a continuous action recognition model is constructed: an LSTM network is built onto the STGCN network to realize continuous sequence label classification, giving the LSTM-STGCN model. The model initializes the LSTM-STGCN skeleton feature extraction module with the robust action recognition model obtained in the previous training, models historical behavior information with the LSTM, and is trained on the constructed continuous action training set, finally obtaining the final target model: an online action recognition model.
Compared with the prior art, the method has the following advantages:
(1) The invention realizes online recognition of human actions in video. Previously, action recognition could only be performed after observing the complete action video. The advantage of this method is that it achieves online action recognition: actions can be classified and recognized without observing the complete action in the video, so the action can be judged in advance in practical application scenarios.
(2) The invention uses a knowledge distillation training method to improve the robustness and recognition accuracy of the network model. A teacher network model with strong action representation capability guides the training of the student network model, so that the student network model learns good action representation capability, improving both recognition accuracy and model robustness.
(3) The online action recognition model of the invention combines historical action information in recognition, realizing recognition and prediction for every frame of the video.
Drawings
FIG. 1: flow chart of online action recognition method based on human body posture estimation and historical information
Detailed Description
The invention provides an online action recognition method based on human body posture estimation and historical information. It differs from existing methods in that online action recognition is realized using a knowledge distillation technique and an LSTM sequence classification technique, while the robustness and generalization ability of the network are improved. The technical scheme of the invention is described clearly and completely below with reference to the accompanying drawings. The technical method and the beneficial progress of the invention are all within the protection scope of the invention.
1. Constructing and training an online action recognition model:
As shown in FIG. 1, for an input video, the skeleton sequence of the human motion in the video is first extracted and then constructed into training sets that can be used for model training. Different models are then built, and a knowledge distillation method and different training strategies are used, thereby obtaining the target model required by the invention: an online action recognition model.
1) And (3) construction of a training set:
the method is mainly divided into three types, namely, a precise action segmentation training set V for a teacher network model is constructed 1 And a preceding text action training set V for the student network model 2 Continuous action training set V for training on-line action recognition model 3 . Wherein V 1 The method is characterized in that each action in a collected video data set is accurately divided, namely each piece of video action skeleton data and an action label are single and do not contain data of other actions. V for training student network model 2 It means that each action segment of the training set includes not only the current action but also the last action segment before the action starts and the 3D skeleton data of the current action. V 3 The training set is a 3D skeleton data sequence comprising a plurality of action sequences and is not a single segmented action sequence.
2) Data structure of training set:
the data structure of the training set is set as: c is multiplied by T by V and M is multiplied by M, wherein C represents 3 channels, T represents the frame number of data, V represents the number of skeleton joint points, 25 3D skeleton joint points are selected, and M represents the number of people in a video. And constructing an accurate action segmentation training set, segmenting according to the action class labels, wherein the 3D skeleton data of each action sequence only comprises one action class. The constructed preamble action training set is used for intercepting a section of 3D framework sequence containing the previous action in the original 3D framework sequence by changing the size of T, and then generating a tensor preamble action training set according to the data structure set by the invention. The constructed continuous motion data set generates a complete 3D skeleton sequence training set by all motion sequences in the whole video.
3) A high-quality action recognition guidance module:
a teacher network model needs to be built in the high-quality action recognition guidance module, and the teacher network model is built by selecting an STGCN network.
STGCN is formally expressed as follows:
F_out = Λ^(−1/2) (A + I) Λ^(−1/2) F_in W    (1)
where F_out denotes the model output, A the adjacency matrix, I the identity matrix, F_in the input, and W the weight matrix, with Λ^(ii) = Σ_j (A^(ij) + I^(ij)).
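A minimal PyTorch sketch of the normalized graph convolution in equation (1); the single-adjacency simplification and the layer interface are assumptions for illustration, not the full multi-partition STGCN:

```python
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """One spatial graph-convolution step: F_out = Λ^-1/2 (A + I) Λ^-1/2 F_in W."""
    def __init__(self, in_channels, out_channels, adjacency):
        super().__init__()
        A_hat = adjacency + torch.eye(adjacency.size(0))      # A + I
        deg = A_hat.sum(dim=1)                                 # Λ_ii = Σ_j (A_ij + I_ij)
        d_inv_sqrt = torch.diag(deg.pow(-0.5))
        self.register_buffer("norm_adj", d_inv_sqrt @ A_hat @ d_inv_sqrt)
        self.weight = nn.Linear(in_channels, out_channels, bias=False)   # W

    def forward(self, x):
        # x: (batch, T, V, in_channels) -> aggregate over joints, then mix channels
        x = torch.einsum("vw,btwc->btvc", self.norm_adj, x)
        return self.weight(x)
```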
4) A low-quality robust motion recognition module:
the low-quality robust action recognition module needs to construct a student network model, and the STGCN network is also selected for the construction of the student network model. The low-quality robust motion recognition module applies a knowledge distillation method, and the knowledge distillation method is described in detail in a loss function of model training.
5) An online action identification module:
the line action identification module needs to construct an LSTM-STGCN model and is mainly constructed by using an STGCN network and an LSTM network. With particular regard to the LSTM-STGCN formalized expression, the following:
f t =δ(W·[h t-1 ,F t ]+b) (2)
where δ represents the sigmod function, f t For the output of LSTM-STGCN at the t-th frame, h t-1 Output label, F for t-1 frame t For the output label of the STGCN network at the t frame, W is the weight and b is the bias parameter.
6) The model training method comprises the following steps:
the invention mainly relates to the training of three models, namely the training of a teacher network model, the training of a student network model and the training of an online action recognition model. Each model was trained using a different strategy and three different training sets constructed. And the teacher network model is trained by using a training strategy of a single label and a training set of accurate segmentation. And (3) training the student network model, wherein the teacher network model is used for guiding the training of the student network model by using a knowledge distillation method. The training strategy is a single label training strategy, the training set uses a preceding action training set, and then a robust action recognition model is obtained. The training of the linear motion recognition model is guided by using a robust motion recognition model, and the training strategy is a sequence label classification method. The constructed continuous motion training set is used for training to obtain a continuous motion recognition model, and the model is a final model required by the invention and can realize online video human motion recognition and recognition.
7) Teacher network model loss function:
teacher network model training oss
L teacher =L crossentroy (P teacher ,Q) (3)
Wherein L is crossentroy As a cross-entropy loss function, P teacher And Q is label of the output of the teacher network model.
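A minimal sketch of the teacher's single-label training step with the cross-entropy loss of equation (3); the `teacher` module and the `loader` yielding (skeleton tensor, label) batches are placeholder names, not part of the patent:

```python
import torch
import torch.nn.functional as F

def train_teacher_epoch(teacher, loader, optimizer, device="cuda"):
    teacher.train()
    for skeletons, labels in loader:             # skeletons: (N, C, T, V, M), labels: (N,)
        skeletons, labels = skeletons.to(device), labels.to(device)
        logits = teacher(skeletons)              # P_teacher, shape (N, num_classes)
        loss = F.cross_entropy(logits, labels)   # L_teacher = L_crossentropy(P_teacher, Q)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```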
8) Student network model loss function:
and (4) training a student network model, and selecting a single label classification method for a training strategy. Different from the teacher network model, the knowledge distillation method is added for the training of the student network model, and simply speaking, the reasoning ability of the teacher network is transferred to the student network model. The specific implementation is mainly characterized in that the reasoning ability of the teacher model is transferred to the student model through the improved loss function. Entire student network
The total loss of the model is as follows:
L total =αL student +βL kl +γL mse (4)
L student alpha, beta and gamma are three hyper-parameters set by experiments for loss terms of a student network model space-time graph convolutional network, the numerical values are mainly adjusted according to the effects in the experiments, and the initial values are all 1.
L student =L crossentroy (P student ,Q) (5)
Wherein L is crossentroy As a cross-entropy loss function, P student Q is label for the output of the student network model.
L kl To measure the loss terms of the teacher network model output and the student network model output:
L kl =D(l softmax (P student ),l softmax (P teacher) )) (6)
wherein l softmax (P student ) Is the probability distribution of the student network model output by the softmax function, l softmax (P teacher) ) The probability distribution of the teacher network model output through the softmax function is shown, and D is the KL divergence function.
L mse For measuring the extraction characteristics of the teacher network model and the extraction characteristic loss items of the student network model:
L mse =l mse (P student ,P teacher ) (7)
l mse is a mean square loss function used for calculating the loss of the square of the extracted features of the student network model and the square of the extracted features of the teacher network model. P student Output representing extracted features of a student network model, P teacher And (4) output representing the extracted features of the teacher network model.
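A minimal PyTorch sketch of the combined distillation loss of equations (4)–(7); the equal default weights follow the text, while the way the intermediate features are obtained (here passed in as arguments) is an assumption:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits,
                      student_feat, teacher_feat, labels,
                      alpha=1.0, beta=1.0, gamma=1.0):
    # L_student: cross-entropy between student output and ground-truth label (eq. 5)
    l_student = F.cross_entropy(student_logits, labels)
    # L_kl: KL divergence between the softmax outputs of student and teacher (eq. 6)
    l_kl = F.kl_div(F.log_softmax(student_logits, dim=1),
                    F.softmax(teacher_logits, dim=1),
                    reduction="batchmean")
    # L_mse: mean-square loss between student and teacher extracted features (eq. 7)
    l_mse = F.mse_loss(student_feat, teacher_feat)
    # L_total = α·L_student + β·L_kl + γ·L_mse (eq. 4)
    return alpha * l_student + beta * l_kl + gamma * l_mse
```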
9) The LSTM-STGCN model is realized by the following steps:
the LSTM-STGCN model needs to be realized by fully utilizing historical information to realize continuous action identification and realize identification of each frame, so that the model is built by using the graph space-time convolution network and the LSTM network. The LSTM network is accessed to the full connection layer of the space-time graph convolutional network, the output of the STGCN is used as the input of the LSTM, and the LSTM network capable of identifying the action sequence is designed in the input and output module, so that the online action identification and recognition are realized.
The formalized expression of LSTM-STGCN is as follows:
f t =δ(W·[h t-1 ,F t ]+b) (8)
where δ represents the sigmod function, f t For the output of LSTM-STGCN at the t-th frame, h t-1 Output label, F for t-1 frame t For the output label of the STGCN network at the t frame, W is the weight and b is the bias parameter.
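A minimal sketch of attaching an LSTM after the STGCN's fully connected output so that each step's prediction combines F_t with the history h_(t−1); the `stgcn_backbone` module, hidden size and per-window batching are assumptions for illustration, not the patent's exact wiring:

```python
import torch
import torch.nn as nn

class LSTMSTGCN(nn.Module):
    """STGCN backbone followed by an LSTM over its per-step outputs, for online recognition."""
    def __init__(self, stgcn_backbone, num_classes, hidden=128):
        super().__init__()
        self.backbone = stgcn_backbone          # initialized from the robust action recognition model
        self.lstm = nn.LSTM(num_classes, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, skeleton_windows):
        # skeleton_windows: (N, steps, C, T, V, M) -- one STGCN input per time step
        n, steps = skeleton_windows.shape[:2]
        per_step = [self.backbone(skeleton_windows[:, t]) for t in range(steps)]  # F_t per step
        seq = torch.stack(per_step, dim=1)      # (N, steps, num_classes)
        out, _ = self.lstm(seq)                 # combines F_t with the history h_(t-1)
        return self.head(out)                   # per-step class scores f_t
```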
10) LSTM-STGCN model loss function:
The LSTM-STGCN model training loss is:
L_LSTM-STGCN = L_crossentropy(P, Q)    (9)
where L_LSTM-STGCN denotes the loss term of the model, L_crossentropy is the cross-entropy loss function, P is the output of the current model, and Q is the label.
2. Experimental setup:
and (3) setting specific parameters of an experiment, and realizing motion recognition of the video sequence by sliding a window on the video sequence in the recognition process of the three constructed models. The larger the sliding window is, the larger the number of frames of the model extracted from the training data in the identification process is, for example, when the sliding window is set to 50, the larger the number of frames of the model extracted from the training data in the identification process is, the larger the number of frames of the model extracted from a data sample in the identification process is, the larger the sliding window is, the larger the number of frames of the model extracted from the training data in the identification process is, and the size of the sliding window becomes an important influence factor in the experimental part of the present invention. Regarding the size of the sliding window, the present invention mainly sets 4 values, 50, 100, 150, and 200.
3. Model test and result evaluation:
and (4) evaluating the results: the experimental part of the invention adopts two indexes for evaluating the experimental result: accuracy and average accuracy. In the experiment of the high-quality action representation guidance model and the experiment of the low-quality robust action recognition model, the evaluation index of the experiment result is the accuracy, and the experiment accuracy of the method is top1 precision. top1 precision mainly refers to probability output of an identified object, and if the maximum probability value in the output is a correct label, prediction is successful. The method comprises the steps of identifying each frame of a video during testing of an action identification model to obtain an identification result of each frame, then adding the identification accuracy of all the frames to divide the total number of the identified frames, and finally obtaining the average accuracy of the identification of the whole video. In the comparison of the LSTM-STGCN ablation experiment and the experiment of other existing methods, the average precision is adopted in the experiment to evaluate the result, the average precision can better evaluate the recognition and prediction capabilities of the model for the online actions, when the model only observes incomplete actions, namely a part of the actions, the model can give the accuracy to each frame of the video, and the recognition capabilities of the model can be objectively evaluated.
The final experimental results are given in the following tables:
Table 1 (reproduced as an image in the original publication): experimental results of the knowledge-distillation-based robust motion feature extraction module
Table 2 (reproduced as an image in the original publication): LSTM-STGCN ablation experiment results
Model              ST-LSTM   FSNet    SSNet    LSTM-STGCN
Average accuracy   53.46%    53.96%   59.03%   62.37%
Table 3: results of the present invention and other existing methods
The online action recognition provided by the invention mainly refers to recognizing human actions in a video input sequence. Unlike traditional action recognition, online action recognition has the advantage of judging human actions in advance: the model recognizes the action before the complete action has been observed. This can help decision makers with analysis, early warning and so on, which has important practical application value and significance; for example, in the field of intelligent security monitoring, online action recognition can predict actions in advance.

Claims (3)

1. An online action recognition method based on human body posture estimation and historical information comprises the following steps:
(1) Constructing and training an online action recognition model: for an input video, a skeleton sequence is extracted through a human body posture estimation algorithm, then online action recognition is realized, and the type of an action recognition result is given, wherein the method comprises the following steps:
a) Collecting original motion video data: obtaining 3D skeleton coordinate data through a human body posture estimation algorithm, and taking the 3D skeleton data generated by the posture estimation algorithm as an original data set;
b) Constructing a high-quality action recognition guidance module: processing the original 3D skeleton data extracted in step a) to construct an accurate action segmentation training set V1 for the teacher network model; an STGCN network is selected for construction of the teacher network model, the teacher network is trained with a single-label training strategy, and a high-quality action representation guidance model is finally obtained;
c) Constructing a low-quality robust action recognition module: processing the 3D skeleton data of the original data set to generate a preceding-action training set V2 for the student network model; each action segment of the training set V2 includes not only the current action but also the 3D skeleton data of the last action segment before the current action starts; an STGCN network is selected to build the student network model, which is trained on the training set V2 with a single-label training strategy; a knowledge distillation method is used, whereby the high-quality action representation guidance model generated in step b) guides the training of the student network model, and a robust action recognition model is then obtained; the high-quality action representation guidance model guides the training of the student model as follows: in the construction of the loss function of the training model, the total loss function of the student network model is L_total = αL_student + βL_kl + γL_mse, where L_kl is the divergence loss function, L_student is the loss term of the student network's spatio-temporal graph convolutional network, L_mse is the mean-square loss function used to compute the loss between the features extracted by the student network model and the features extracted by the teacher network model, and α, β and γ are three set hyperparameters; the robust action recognition model is obtained by training on the aforementioned data set with a single-label training strategy;
d) Constructing an online action recognition module: processing the original 3D skeleton data to generate a new continuous action training set V3 for training the online action recognition model; the training set V3 is not a segmented single action sequence but comprises continuous multi-segment action 3D skeleton sequences; an LSTM-STGCN is then built using the STGCN and LSTM networks, the LSTM network being built onto the STGCN to realize continuous sequence label classification: the LSTM network is connected after the fully connected layer of the STGCN and the output of the STGCN serves as the input of the LSTM, yielding a continuous action recognition model, namely the LSTM-STGCN model; the LSTM-STGCN model is initialized by taking the robust action recognition model obtained in step c) as the LSTM-STGCN skeleton feature extraction module and loading the parameters of the robust action recognition model into the LSTM-STGCN; the model is trained with the constructed continuous action training set, and the final target model, an online action recognition model, is obtained through a multi-label classification training strategy;
(2) Testing the online action recognition model: for an input online action video, a 3D skeleton data sequence of the human action is obtained through a human body posture estimation algorithm, the 3D skeleton sequence is fed into the online action recognition model, and the action category is output, completing the recognition of the online action.
2. The online action recognition method as claimed in claim 1, wherein in step a), the 3D skeleton data is extracted using a Kinect v2 sensor pose estimation algorithm.
3. The online action recognition method according to claim 1, wherein in steps b), c) and d), a training set in tensor form is constructed from the 3D skeleton coordinate data with the data structure C × T × V × M, where C denotes 3 channels, T denotes the number of frames of data, V denotes the number of skeleton joint points, and M denotes the number of persons in the video; the constructed accurate action segmentation training set V1 is segmented according to action category labels, and the 3D skeleton data of each action sequence contains only one action category; the constructed preceding-action training set V2 is obtained by changing the size of T so that a 3D skeleton sequence containing the previous action is cut out of the original 3D skeleton sequence; the number of frames of an action sequence in the preceding-action training set V2 is T + B, where B is a set number, and the preceding-action training set V2 in tensor form is generated according to this data format.
CN202110558936.6A 2021-05-21 2021-05-21 Online action recognition method based on human body posture estimation and historical information Active CN113486706B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110558936.6A CN113486706B (en) 2021-05-21 2021-05-21 Online action recognition method based on human body posture estimation and historical information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110558936.6A CN113486706B (en) 2021-05-21 2021-05-21 Online action recognition method based on human body posture estimation and historical information

Publications (2)

Publication Number Publication Date
CN113486706A CN113486706A (en) 2021-10-08
CN113486706B true CN113486706B (en) 2022-11-15

Family

ID=77932972

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110558936.6A Active CN113486706B (en) 2021-05-21 2021-05-21 Online action recognition method based on human body posture estimation and historical information

Country Status (1)

Country Link
CN (1) CN113486706B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160277A (en) * 2019-12-31 2020-05-15 深圳中兴网信科技有限公司 Behavior recognition analysis method and system, and computer-readable storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108133188B (en) * 2017-12-22 2021-12-21 武汉理工大学 Behavior identification method based on motion history image and convolutional neural network
CN110503077B (en) * 2019-08-29 2022-03-11 郑州大学 Real-time human body action analysis method based on vision
CN111444879A (en) * 2020-04-10 2020-07-24 广东工业大学 Joint strain autonomous rehabilitation action recognition method and system
CN111582095B (en) * 2020-04-27 2022-02-01 西安交通大学 Light-weight rapid detection method for abnormal behaviors of pedestrians
CN111814719B (en) * 2020-07-17 2024-02-20 江南大学 Skeleton behavior recognition method based on 3D space-time diagram convolution
CN112036379A (en) * 2020-11-03 2020-12-04 成都考拉悠然科技有限公司 Skeleton action identification method based on attention time pooling graph convolution
CN112597883B (en) * 2020-12-22 2024-02-09 武汉大学 Human skeleton action recognition method based on generalized graph convolution and reinforcement learning

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160277A (en) * 2019-12-31 2020-05-15 深圳中兴网信科技有限公司 Behavior recognition analysis method and system, and computer-readable storage medium

Also Published As

Publication number Publication date
CN113486706A (en) 2021-10-08

Similar Documents

Publication Publication Date Title
CN110532900B (en) Facial expression recognition method based on U-Net and LS-CNN
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
CN107609009B (en) Text emotion analysis method and device, storage medium and computer equipment
CN109492099B (en) Cross-domain text emotion classification method based on domain impedance self-adaption
CN108681752B (en) Image scene labeling method based on deep learning
CN113496217B (en) Method for identifying human face micro expression in video image sequence
CN110046656B (en) Multi-mode scene recognition method based on deep learning
CN111126488B (en) Dual-attention-based image recognition method
CN109101938B (en) Multi-label age estimation method based on convolutional neural network
CN107229914B (en) Handwritten digit recognition method based on deep Q learning strategy
CN110046671A (en) A kind of file classification method based on capsule network
CN107506722A (en) One kind is based on depth sparse convolution neutral net face emotion identification method
CN113065460B (en) Establishment method of pig face facial expression recognition framework based on multitask cascade
CN112949740B (en) Small sample image classification method based on multilevel measurement
CN111783879B (en) Hierarchical compressed graph matching method and system based on orthogonal attention mechanism
CN109508686B (en) Human behavior recognition method based on hierarchical feature subspace learning
CN111401105B (en) Video expression recognition method, device and equipment
CN113297936B (en) Volleyball group behavior identification method based on local graph convolution network
CN111028319A (en) Three-dimensional non-photorealistic expression generation method based on facial motion unit
CN112529063B (en) Depth domain adaptive classification method suitable for Parkinson voice data set
CN115966010A (en) Expression recognition method based on attention and multi-scale feature fusion
CN107341471A (en) A kind of Human bodys' response method based on Bilayer condition random field
CN113255543B (en) Facial expression recognition method based on graph convolution network
CN113408418A (en) Calligraphy font and character content synchronous identification method and system
CN117765432A (en) Motion boundary prediction-based middle school physical and chemical life experiment motion detection method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant