CN116052276A - Human body posture estimation behavior analysis method


Info

Publication number: CN116052276A
Application number: CN202310045400.3A
Authority: CN (China)
Prior art keywords: data, posture, gesture, hypothesis, image
Legal status: Pending
Inventors: 史金余, 孙悦琪
Current/Original Assignee: Dalian Maritime University
Priority/Filing date: 2023-01-30
Publication date: 2023-05-02

Classifications

    • G06V 40/20: Recognition of biometric, human-related or animal-related patterns in image or video data; movements or behaviour, e.g. gesture recognition
    • G06N 3/08: Computing arrangements based on biological models; neural networks; learning methods
    • G06V 10/42: Extraction of image or video features; global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G06V 10/766: Recognition using pattern recognition or machine learning; using regression, e.g. by projecting features on hyperplanes
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA], independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/776: Validation; performance evaluation
    • G06V 10/778: Active pattern-learning, e.g. online learning of image or video features
    • G06V 10/82: Recognition using neural networks
    • Y02T 10/40: Climate change mitigation technologies related to transportation; internal combustion engine [ICE] based vehicles; engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a human body posture estimation behavior analysis method comprising the following steps: processing character image information into picture data, acquiring the global pose features of each group of pictures, and obtaining an image frame sequence carrying two-dimensional human keypoint information based on the global pose features; establishing a motion model from the image frame sequence and estimating the motion state of each person in the sequence; constructing a Transformer model, generating pose hypotheses with a multi-hypothesis generator based on the Transformer model, and regressing each group of pose hypotheses to generate pose hypothesis information; constructing a behavior prediction network and learning from multiple groups of pre-collected pose data to obtain the optimal network parameters; and applying the optimal parameters to the behavior prediction network, predicting the pose hypothesis information with the network, and finally outputting the predicted pose estimation data. The method requires no manual parameter setting, improves the efficiency and accuracy of the neural network's parameter search, is simple to operate, and is convenient for workers to use.

Description

Human body posture estimation behavior analysis method
Technical Field
The invention relates to the technical field of posture estimation, in particular to a human body posture estimation behavior analysis method.
Background
Human body pose estimation is an important branch of computer vision with a wide range of applications; it estimates the human pose by correctly linking the human body keypoints detected in a picture. Human body keypoints generally correspond to joints with a certain degree of freedom, such as the neck, shoulders, elbows, wrists, waist, knees and ankles. Typical applications include pedestrian pose detection and motion prediction in street scenes in the autonomous driving industry, pedestrian re-identification and the monitoring of specific actions in special scenes in the security field, and special effects in the film industry.
A prior-art disclosure describes a behavior recognition method that fuses human body pose information. The method is highly stable, overcomes the drawback that the recognition ability of a graph convolutional neural network is strongly affected by translations of the skeleton keypoint coordinates, and fuses information from the frames before and after an image with the human keypoint information, which helps improve action recognition performance. However, it requires parameters to be set manually, which reduces the efficiency and accuracy of the neural network's parameter search, and it must be operated by technicians with some experience, making it inconvenient for ordinary users.
Disclosure of Invention
The invention provides a human body posture estimation behavior analysis method. The method automatically obtains the optimal parameters for the estimation network from multiple groups of uploaded pose data and then performs pose prediction, so no manual parameter setting is needed; at the same time, the efficiency and accuracy of the neural network's parameter search are improved, the operation is simple, and the method is convenient for workers to use.
The invention adopts the following technical means:
a human body pose estimation behavior analysis method, comprising:
s1, processing character image information into picture data, preprocessing the picture data, acquiring global posture features of each group of pictures, and acquiring an image frame sequence with two-dimensional key point information of a human body based on the global posture features;
s2, offline processing single-camera video or image sequence frames with fixed frame rate, establishing a motion model, estimating task motion states in an image frame sequence to obtain 2D gesture data, establishing a transducer model, processing the 2D gesture data based on a multi-hypothesis generator of the transducer model to generate gesture hypothesis, and carrying out regression on each group of gesture hypotheses to generate gesture hypothesis information;
s3, constructing a behavior prediction network, and learning based on a plurality of groups of pre-collected gesture data to obtain the optimal parameters of the network;
and S4, applying the optimal parameters to a behavior prediction network, predicting the posture assumption information based on the behavior prediction network, and finally outputting predicted posture estimation data.
Further, preprocessing the picture data includes:
converting each group of picture data from image space to frequency space by a forward Fourier transform, and filtering out its high-frequency components to reduce noise interference;
then converting each group of filtered picture data back from frequency space to image space by an inverse Fourier transform.
Further, acquiring the global pose features of each group of pictures and obtaining the two-dimensional human keypoints based on the global pose features includes:
passing each group of picture data through several ShuffleBlocks of the acquisition network to obtain its global pose features;
mapping the global pose features back to the keypoint feature map by a deconvolution operation;
and decoding the keypoint feature map and collecting the two-dimensional human keypoints produced by the decoding.
Further, establishing a motion model from the image frame sequence, estimating the motion state of each person in the sequence to obtain 2D pose data, constructing a Transformer model, processing the 2D pose data with a multi-hypothesis generator based on the Transformer model to generate pose hypotheses, and regressing each group of pose hypotheses to generate pose hypothesis information, includes the following steps:
S201, processing the video information or image sequence frames of the current character images offline, calculating and recording the actual inter-frame interval, and establishing a motion model according to Kalman filtering theory;
S202, assigning an ID to every person in the image information; after assignment, defining each person's motion state in the video frame through the motion model under a linear-motion assumption, collecting the motion state of each person in the current video frame, and constructing a prediction equation to estimate the motion state of each tracked target in the next video frame, thereby obtaining 2D pose data;
S203, constructing a Transformer model and feeding the 2D pose data into it, where a multi-hypothesis generator in the Transformer model receives each group of 2D pose data, generates different representations of the pose hypotheses at different layers of the model, and models single-hypothesis dependencies through several parallel self-attention blocks to form self-hypothesis communication;
S204, extracting all concatenated hypothesis features with a hypothesis-mixing MLP and splitting them to obtain each refined hypothesis, interactively modeling the information of the different hypotheses with a cross-hypothesis interaction module, and finally regressing all groups of pose hypotheses with the Transformer regression module to obtain the final 3D pose data.
Further, constructing a behavior prediction network and learning from the pre-collected groups of pose data to obtain the optimal parameters of the network includes:
S301, the behavior prediction network collects multiple groups of pre-uploaded pose data, selects one group as validation data, fits the remaining data into a test model, verifies the detection accuracy of the test model with the validation data, and then swaps in a different validation group and verifies again until all the pose data have been validated;
S302, initializing the parameter range and enumerating all candidate parameter combinations according to a preset learning rate and step size; for each group of data, selecting any subset as the test set, training the test model on the remaining subsets, predicting the test set after training, and recording the root mean square error of the test results;
S303, replacing the test set with another subset and using the remaining subsets as the training set, recording the root mean square error again until every subset has been predicted once, and selecting the parameter combination with the smallest root mean square error within the data interval as the optimal parameters.
Further, applying the optimal parameters to the behavior prediction network, predicting the pose hypothesis information with the behavior prediction network, and finally outputting the predicted pose estimation data includes:
S401, the behavior prediction network receives the 3D pose data generated by the Transformer model, replaces its original parameters with the optimal parameters, and then imports the keypoint information of every character in the current image information into the behavior prediction network;
S402, dividing the keypoint information of every character in the current image information into a training set and a test set, standardizing the training set, importing the standardized training samples into the behavior prediction network, training the behavior prediction network by long-term iteration, feeding the test set into the trained model, outputting the prediction percentage of each 3D pose, and outputting the 3D pose with the highest percentage as the prediction result.
Compared with the prior art, the invention has the following advantages:
1. Compared with existing behavior analysis methods, in the human body posture estimation behavior analysis method provided by the invention, the behavior prediction network collects multiple groups of pose data uploaded by workers, selects one group as validation data, then fits the remaining data into a test model and verifies its detection accuracy with the validation data. All candidate parameter combinations are then enumerated according to the manually set or system-default learning rate and step size, each group of data results is predicted and its root mean square error recorded, and the combination with the smallest root mean square error is selected as the optimal parameters. The keypoint information of every character in the current image information and the 3D pose data generated by the Transformer model are then imported into the behavior prediction network for pose prediction; the prediction percentage of each 3D pose is output, and the pose with the highest percentage is output as the prediction result. No parameters need to be set manually, the efficiency and accuracy of the neural network's parameter search are improved, the operation is simple, and the method is convenient for workers to use.
2. In the human body posture estimation behavior analysis method provided by the invention, the interval between actual video frames is calculated and recorded, a motion model is established according to Kalman filtering theory to define each character's motion state in the video frames, and a prediction equation is constructed to estimate the motion state of each tracked target in the next video frame, yielding 2D pose data. The 2D pose data is fed into a Transformer model, whose multi-hypothesis generator receives each group of 2D pose data and generates different representations of the pose hypotheses at different layers; single-hypothesis dependencies are modeled through several parallel self-attention blocks, each hypothesis feature is extracted to obtain the refined hypotheses, the information of the different hypotheses is modeled interactively, and finally the Transformer regression module regresses each group of pose hypotheses to obtain the final 3D pose data. Obtaining the 3D pose prediction data of each character through the Transformer model gives workers more intuitive estimation results to inspect, improves the user experience, and effectively improves the accuracy of predicting the characters' subsequent poses.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a human body posture estimation behavior analysis method according to the present invention.
Fig. 2 is a flowchart of an algorithm of a human body posture estimation behavior analysis method according to the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
As shown in fig. 1-2, the embodiment of the invention discloses a human body posture estimation behavior analysis method, which comprises the following specific steps:
s1, collecting character image information and extracting key points.
Specifically, the collected character image information is extracted frame by frame according to a manually set time frame to obtain the corresponding picture data, and the number of blocks for the character image information is determined from the picture display ratio, i.e. the height-to-width pixel ratio, which is determined from the height and width of the picture according to a user setting or the system default. The pictures are then processed block by block according to the number of blocks: each group of blocked picture data is converted from image space to frequency space by a forward Fourier transform and its high-frequency components are filtered out to reduce noise interference, after which each group of picture data is converted back from frequency space to image space by an inverse Fourier transform. The acquisition network then applies several ShuffleBlocks to obtain the global pose feature F_mid of each group of picture data, a deconvolution operation maps the global pose feature F_mid back onto the keypoint feature map F_0, the keypoint feature map F_0 is decoded, and the two-dimensional human keypoints produced by the decoding are collected.
In this embodiment, the Fourier transform pair is expressed as follows:

F(u,v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \, e^{-j 2\pi (ux + vy)/N}   (1)

f(x,y) = \frac{1}{N^2} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} F(u,v) \, e^{j 2\pi (ux + vy)/N}   (2)

where u and v are the frequency variables, x and y are the coordinates of each pixel in the picture data, and N is the size of the Fourier transform. Formula (1) is the forward Fourier transform and formula (2) is the inverse Fourier transform.
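By way of illustration, this frequency-domain denoising step can be sketched as follows; the ideal low-pass mask and the `cutoff` parameter are assumptions for illustration, since the filter shape is not specified here:

```python
import numpy as np

def lowpass_denoise(img: np.ndarray, cutoff: float = 0.25) -> np.ndarray:
    """Forward FFT -> suppress high frequencies -> inverse FFT.

    `cutoff` is a hypothetical parameter: the kept radius as a fraction
    of the half-spectrum; the patent does not specify its value.
    """
    F = np.fft.fftshift(np.fft.fft2(img))          # formula (1), centered spectrum
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    # Ideal low-pass mask: keep only frequencies within the cutoff radius.
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    mask = dist <= cutoff * min(h, w) / 2
    F_filtered = F * mask
    # Back to image space, formula (2); the imaginary residue is numerical noise.
    return np.real(np.fft.ifft2(np.fft.ifftshift(F_filtered)))
```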
The global pose feature F_mid is expressed as:

F_mid = \rho(M, w, b)   (3)

where \rho(\cdot) denotes several ShuffleBlocks, M is the input picture, and w and b are the learnable convolution kernels and offsets.

The deconvolution operation is expressed as:

F_0 = f^{-1}(F_mid, w, b)   (4)

where F_0 is a tensor of shape (17, n, m), 17 being the number of keypoints and n and m the width and height of the feature map.

The decoding process is expressed as:

J_i = Max(F_0(i))   (5)

where J_i is the i-th keypoint and F_0(i) is the feature map (heatmap) of the i-th keypoint.
It should be further noted that the ShuffleBlock mainly consists of a Channel Split (channel splitting) operation and a Channel Shuffle (channel shuffling) operation, and that the character keypoints comprise the nose, left eye, right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle and right ankle.
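By way of illustration, the decoding of formula (5) can be sketched as follows, assuming F_0 is a (17, n, m) heatmap tensor; taking each keypoint as the arg-max of its heatmap is the straightforward reading of Max(F_0(i)), and any sub-pixel refinement is not described here:

```python
import numpy as np

KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

def decode_keypoints(f0: np.ndarray) -> list[tuple[str, int, int, float]]:
    """Decode a (17, n, m) heatmap tensor into (name, x, y, score) tuples.

    Implements J_i = Max(F_0(i)): each keypoint is the location of the
    maximum response in its own heatmap.
    """
    keypoints = []
    for i, name in enumerate(KEYPOINT_NAMES):
        heatmap = f0[i]
        y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
        keypoints.append((name, int(x), int(y), float(heatmap[y, x])))
    return keypoints
```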
S2, acquiring pose hypothesis information from the current image information.
Specifically, referring to fig. 2, single-camera video or image sequence frames with a fixed frame rate are processed offline; these frames contain the motion information of the target persons. The interval between actual video frames is calculated and recorded, a motion model is established according to Kalman filtering theory, an ID is assigned to every person in the image information, each person's motion state in the video frame is defined through the motion model under a linear-motion assumption, the motion state of each person in the current video frame is collected, and a prediction equation is constructed to estimate the motion state of each tracked target in the next video frame, thereby obtaining 2D pose data. The prediction equation is:
\hat{x}_{k+1,k} = A_{k+1} \hat{x}_{k,k}

P_{k+1,k} = A_{k+1} P_{k,k} A_{k+1}^{T} + Q_{k+1}

where \hat{x}_{k+1,k} is the mean of the tracked target's motion state in the next video frame as predicted by the linear motion model, \hat{x}_{k,k} is the best estimate of the tracked target's motion state in the current video frame, A_{k+1} is the state transition matrix at time k+1, P_{k+1,k} is the predicted covariance matrix of the tracked target's motion state in the next video frame, P_{k,k} is the covariance matrix of the motion state in the current video frame, and Q_{k+1} is the motion model noise matrix at time k+1.
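By way of illustration, this prediction step can be sketched as follows under a constant-velocity linear motion model; the four-dimensional state layout (u, v, du, dv) and the noise scale q are assumptions for illustration, as only the linear-motion model and the two prediction equations above are specified:

```python
import numpy as np

def kalman_predict(x_est, P_est, dt, q=1e-2):
    """One Kalman prediction step for a constant-velocity motion model.

    x_est : (4,) best state estimate in the current frame, (u, v, du, dv)
    P_est : (4, 4) state covariance in the current frame
    dt    : recorded interval between actual video frames
    q     : hypothetical process-noise scale (not specified in the patent)
    """
    # State transition matrix A_{k+1}: position advances by velocity * dt.
    A = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1,  0],
                  [0, 0, 0,  1]], dtype=float)
    Q = q * np.eye(4)                     # motion model noise Q_{k+1}
    x_pred = A @ x_est                    # predicted state mean
    P_pred = A @ P_est @ A.T + Q          # predicted covariance
    return x_pred, P_pred
```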
A Transformer model is constructed and the 2D pose data is fed into it. The multi-hypothesis generator in the model receives each group of 2D pose data and generates different representations of the pose hypotheses at different layers of the model; single-hypothesis dependencies are then modeled through several parallel self-attention blocks to form self-hypothesis communication, and the generated pose hypotheses are used to obtain the keypoint positions of each human body so as to confirm the hypothesis features. A hypothesis-mixing MLP extracts all the concatenated hypothesis features and splits them to obtain each refined hypothesis, a cross-hypothesis interaction module interactively models the information of the different hypotheses, and finally the Transformer regression module regresses all the pose hypotheses to obtain the final 3D pose data.
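By way of illustration, the multi-hypothesis stage can be sketched as follows in PyTorch; the embedding size, hypothesis count, and module structure are simplified assumptions, not the exact architecture of the model described above:

```python
import torch
import torch.nn as nn

class MultiHypothesisBlock(nn.Module):
    """Toy multi-hypothesis stage: one self-attention block per hypothesis,
    run in parallel (self-hypothesis communication), then a mixing MLP over
    the concatenated hypothesis features, and a 3D pose regression head."""

    def __init__(self, dim: int = 64, num_hyp: int = 3, heads: int = 4):
        super().__init__()
        self.num_hyp = num_hyp
        # Parallel self-attention blocks, one per pose hypothesis.
        self.self_attn = nn.ModuleList(
            nn.MultiheadAttention(dim, heads, batch_first=True)
            for _ in range(num_hyp)
        )
        # Hypothesis-mixing MLP over the concatenated hypothesis features.
        self.mix_mlp = nn.Sequential(
            nn.Linear(dim * num_hyp, dim * num_hyp), nn.GELU(),
            nn.Linear(dim * num_hyp, dim * num_hyp),
        )
        self.regress = nn.Linear(dim * num_hyp, 17 * 3)  # 17 keypoints in 3D

    def forward(self, hyps: list[torch.Tensor]) -> torch.Tensor:
        # hyps: num_hyp tensors of shape (batch, frames, dim)
        refined = [attn(h, h, h)[0] for attn, h in zip(self.self_attn, hyps)]
        mixed = self.mix_mlp(torch.cat(refined, dim=-1))  # cross-hypothesis mixing
        pooled = mixed.mean(dim=1)                        # pool over frames
        return self.regress(pooled).view(-1, 17, 3)       # final 3D pose
```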
S3, constructing a behavior prediction network and searching for optimal parameters.
Specifically, the behavior prediction network collects multiple groups of pose data uploaded by workers, selects one group as validation data, fits the remaining data into a test model, and verifies the detection accuracy of the test model with the validation data; the validation group is then swapped and verification is repeated until all the pose data have been validated. The parameter range is initialized, the learning rate and step size are set manually or by system default, and all candidate parameter combinations are enumerated. For each group of data, any subset is selected as the test set and the test model is trained on the remaining subsets; after training, the test set is predicted and the root mean square error of the test results is recorded. The test set is then replaced with another subset, the remaining subsets are used as the training set, and the root mean square error is recorded again, until every subset has been predicted once. The parameter combination with the smallest root mean square error within the data interval is selected as the optimal parameters.
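By way of illustration, this parameter search can be sketched as follows, assuming a generic model factory with fit/predict methods; the candidate learning rates and step sizes are placeholder values, not the patent's grid:

```python
import numpy as np
from itertools import product

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def search_optimal_params(groups, make_model,
                          learning_rates=(0.1, 0.01, 0.001),  # placeholder grid
                          step_sizes=(1, 2, 4)):              # placeholder grid
    """Leave-one-group-out search: each group of pose data serves once as the
    held-out validation set; the (lr, step) pair with the smallest mean RMSE
    is returned as the optimal parameters."""
    best_params, best_err = None, float("inf")
    for lr, step in product(learning_rates, step_sizes):
        errs = []
        for i, (X_val, y_val) in enumerate(groups):
            # Train on all remaining groups, validate on the held-out one.
            X_tr = np.concatenate([g[0] for j, g in enumerate(groups) if j != i])
            y_tr = np.concatenate([g[1] for j, g in enumerate(groups) if j != i])
            model = make_model(lr=lr, step=step)   # hypothetical factory
            model.fit(X_tr, y_tr)
            errs.append(rmse(y_val, model.predict(X_val)))
        if np.mean(errs) < best_err:
            best_err, best_params = float(np.mean(errs)), (lr, step)
    return best_params, best_err
```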
S4, performing pose estimation for the current character.
Specifically, the behavior prediction network receives the 3D pose data generated by the Transformer model and replaces its original parameters with the optimal parameters. The keypoint information of every character in the current image information is then imported into the behavior prediction network and divided into a training set and a test set. The training set is standardized, the standardized training samples are imported into the behavior prediction network, and the network is trained by long-term iteration. The test set is then fed into the trained model, the prediction percentage of each 3D pose is output, and the 3D pose with the highest percentage is output as the prediction result.
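By way of illustration, this final stage can be sketched as follows with scikit-learn, where a generic MLP classifier stands in for the behavior prediction network; the split ratio and classifier settings are assumptions for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

def predict_pose(keypoint_features: np.ndarray, pose_labels: np.ndarray,
                 max_iter: int = 2000):
    """Split, standardize, train with long iteration, and output the 3D pose
    class with the highest prediction percentage for each test sample."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        keypoint_features, pose_labels, test_size=0.2, random_state=0)
    scaler = StandardScaler().fit(X_tr)        # standardize the training set
    clf = MLPClassifier(max_iter=max_iter)     # "long-term iteration" stand-in
    clf.fit(scaler.transform(X_tr), y_tr)
    probs = clf.predict_proba(scaler.transform(X_te))  # prediction percentages
    best = probs.argmax(axis=1)                # highest percentage per sample
    return clf.classes_[best], probs.max(axis=1) * 100.0
```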
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (6)

1. A human body posture estimation behavior analysis method, characterized by comprising:
s1, processing character image information into picture data, preprocessing the picture data, acquiring global posture features of each group of pictures, and acquiring an image frame sequence with two-dimensional key point information of a human body based on the global posture features;
s2, offline processing single-camera video or image sequence frames with fixed frame rate, establishing a motion model, estimating task motion states in an image frame sequence to obtain 2D gesture data, establishing a transducer model, processing the 2D gesture data based on a multi-hypothesis generator of the transducer model to generate gesture hypothesis, and carrying out regression on each group of gesture hypotheses to generate gesture hypothesis information;
s3, constructing a behavior prediction network, and learning based on a plurality of groups of pre-collected gesture data to obtain the optimal parameters of the network;
and S4, applying the optimal parameters to a behavior prediction network, predicting the posture assumption information based on the behavior prediction network, and finally outputting predicted posture estimation data.
2. The human body posture estimation behavior analysis method of claim 1, wherein preprocessing the picture data comprises:
converting each group of picture data from image space to frequency space by a forward Fourier transform, and filtering out its high-frequency components to reduce noise interference;
then converting each group of filtered picture data back from frequency space to image space by an inverse Fourier transform.
3. The human body posture estimation behavior analysis method of claim 2, wherein acquiring the global pose features of each group of pictures and obtaining the two-dimensional human keypoints based on the global pose features comprises:
passing each group of picture data through several ShuffleBlocks of the acquisition network to obtain its global pose features;
mapping the global pose features back to the keypoint feature map by a deconvolution operation;
and decoding the keypoint feature map and collecting the two-dimensional human keypoints produced by the decoding.
4. The method of claim 1, wherein establishing a motion model from the image frame sequence, estimating the motion state of each person in the sequence to obtain 2D pose data, constructing a Transformer model, processing the 2D pose data with a multi-hypothesis generator based on the Transformer model to generate pose hypotheses, and regressing each group of pose hypotheses to generate pose hypothesis information comprises:
S201, processing the video information or image sequence frames of the current character images offline, calculating and recording the actual inter-frame interval, and establishing a motion model according to Kalman filtering theory;
S202, assigning an ID to every person in the image information; after assignment, defining each person's motion state in the video frame through the motion model under a linear-motion assumption, collecting the motion state of each person in the current video frame, and constructing a prediction equation to estimate the motion state of each tracked target in the next video frame, thereby obtaining 2D pose data;
S203, constructing a Transformer model and feeding the 2D pose data into it, where a multi-hypothesis generator in the Transformer model receives each group of 2D pose data, generates different representations of the pose hypotheses at different layers of the model, and models single-hypothesis dependencies through several parallel self-attention blocks to form self-hypothesis communication;
S204, extracting all concatenated hypothesis features with a hypothesis-mixing MLP and splitting them to obtain each refined hypothesis, interactively modeling the information of the different hypotheses with a cross-hypothesis interaction module, and finally regressing all groups of pose hypotheses with the Transformer regression module to obtain the final 3D pose data.
5. The human body posture estimation behavior analysis method of claim 1, wherein constructing a behavior prediction network and learning from the pre-collected groups of pose data to obtain the optimal parameters of the network comprises:
S301, the behavior prediction network collects multiple groups of pre-uploaded pose data, selects one group as validation data, fits the remaining data into a test model, verifies the detection accuracy of the test model with the validation data, and then swaps in a different validation group and verifies again until all the pose data have been validated;
S302, initializing the parameter range and enumerating all candidate parameter combinations according to a preset learning rate and step size; for each group of data, selecting any subset as the test set, training the test model on the remaining subsets, predicting the test set after training, and recording the root mean square error of the test results;
S303, replacing the test set with another subset and using the remaining subsets as the training set, recording the root mean square error again until every subset has been predicted once, and selecting the parameter combination with the smallest root mean square error within the data interval as the optimal parameters.
6. The human body posture estimation behavior analysis method of claim 1, wherein applying the optimal parameters to the behavior prediction network, predicting the pose hypothesis information with the behavior prediction network, and finally outputting the predicted pose estimation data comprises:
S401, the behavior prediction network receives the 3D pose data generated by the Transformer model, replaces its original parameters with the optimal parameters, and then imports the keypoint information of every character in the current image information into the behavior prediction network;
S402, dividing the keypoint information of every character in the current image information into a training set and a test set, standardizing the training set, importing the standardized training samples into the behavior prediction network, training the behavior prediction network by long-term iteration, feeding the test set into the trained model, outputting the prediction percentage of each 3D pose, and outputting the 3D pose with the highest percentage as the prediction result.
CN202310045400.3A (priority and filing date 2023-01-30): Human body posture estimation behavior analysis method. Status: Pending. Published as CN116052276A (en).

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310045400.3A CN116052276A (en) 2023-01-30 2023-01-30 Human body posture estimation behavior analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310045400.3A CN116052276A (en) 2023-01-30 2023-01-30 Human body posture estimation behavior analysis method

Publications (1)

Publication Number Publication Date
CN116052276A (en) 2023-05-02

Family

ID=86113000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310045400.3A Pending CN116052276A (en) 2023-01-30 2023-01-30 Human body posture estimation behavior analysis method

Country Status (1)

Country Link
CN (1) CN116052276A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116805423A (en) * 2023-08-23 2023-09-26 江苏源驶科技有限公司 Lightweight human body posture estimation algorithm based on structural heavy parameterization
CN116805423B (en) * 2023-08-23 2023-11-17 江苏源驶科技有限公司 Lightweight human body posture estimation algorithm based on structural heavy parameterization
CN117423138A (en) * 2023-12-19 2024-01-19 四川泓宝润业工程技术有限公司 Human body falling detection method, device and system based on multi-branch structure
CN117423138B (en) * 2023-12-19 2024-03-15 四川泓宝润业工程技术有限公司 Human body falling detection method, device and system based on multi-branch structure
CN117456612A (en) * 2023-12-26 2024-01-26 西安龙南铭科技有限公司 Cloud computing-based body posture automatic assessment method and system
CN117456612B (en) * 2023-12-26 2024-03-12 西安龙南铭科技有限公司 Cloud computing-based body posture automatic assessment method and system

Similar Documents

Publication Publication Date Title
Villegas et al. Hierarchical long-term video prediction without supervision
CN116052276A (en) Human body posture estimation behavior analysis method
CN111414797B (en) System and method for estimating pose and pose information of an object
US20210049371A1 (en) Localisation, mapping and network training
Mall et al. A deep recurrent framework for cleaning motion capture data
CN110659573B (en) Face recognition method and device, electronic equipment and storage medium
CN110599395A (en) Target image generation method, device, server and storage medium
CN110942006A (en) Motion gesture recognition method, motion gesture recognition apparatus, terminal device, and medium
CN110852256A (en) Method, device and equipment for generating time sequence action nomination and storage medium
CN114663593B (en) Three-dimensional human body posture estimation method, device, equipment and storage medium
CN114581613B (en) Trajectory constraint-based human model posture and shape optimization method and system
CN111539262B (en) Motion transfer method and system based on single picture
CN114973097A (en) Method, device, equipment and storage medium for recognizing abnormal behaviors in electric power machine room
CN112597824A (en) Behavior recognition method and device, electronic equipment and storage medium
CN111724370A (en) Multi-task non-reference image quality evaluation method and system based on uncertainty and probability
CN112419419A (en) System and method for human body pose and shape estimation
Liu et al. ACDnet: An action detection network for real-time edge computing based on flow-guided feature approximation and memory aggregation
CN112990154B (en) Data processing method, computer equipment and readable storage medium
CN112446253A (en) Skeleton behavior identification method and device
CN116052264B (en) Sight estimation method and device based on nonlinear deviation calibration
CN117152815A (en) Student activity accompanying data analysis method, device and equipment
CN116416678A (en) Method for realizing motion capture and intelligent judgment by using artificial intelligence technology
CN117137435A (en) Rehabilitation action recognition method and system based on multi-mode information fusion
CN115100745A (en) Swin transform model-based motion real-time counting method and system
CN111681270A (en) Method, device and storage medium for realizing registration between image frames

Legal Events

Code | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination