CN110163116A - Method by accelerating OpenPose reasoning to obtain human body attitude - Google Patents
Method by accelerating OpenPose reasoning to obtain human body attitude
- Publication number
- CN110163116A (application CN201910347091.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- openpose
- human body
- model
- body attitude
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a method for obtaining human pose by accelerating OpenPose inference, comprising the following steps: S1: use OpenCV to read the input video stream containing human pose information and extract single frames; S2: process the single frame to obtain the input data of the optimized model; S3: restructure the model; S4: reduce the precision of the OpenPose model parameters; S5: obtain the optimized result; S6: obtain the output data; S7: obtain the human pose data. The method uses TensorRT to restructure the OpenPose network and to reduce the precision of the network parameters, yielding an optimized model with fast inference and accurate results. With this optimized model, human pose data can be obtained quickly, which lays a good foundation for deploying the model in practical applications.
Description
Technical field
The present invention relates to the fields of computer science and deep learning, and specifically provides a method for obtaining human pose by accelerating OpenPose inference, which speeds up network inference by restructuring the network structure of the deep model and reducing the precision of the model parameters.
Background art
Applications based on deep learning are growing explosively. Image recognition, speech recognition, natural language processing, and image retrieval have become everyday tools, which in turn places stricter demands on the inference efficiency and response speed of deep learning. Deep learning work divides into two parts, training and deployment. Training is generally done offline and consumes a large amount of GPU resources. Because training has relatively low real-time requirements, a comparatively large batch size is usually used (128 is a typical value), which makes full use of the GPU. Inference is different: it needs only a single forward pass, in which the input is passed through the neural network to obtain a prediction. Inference may also be deployed in many ways. It may be deployed in the cloud: voice input on an ordinary mobile phone, for example, is currently still handled in the cloud, with the speaker's audio first sent to the cloud and the result returned after processing. It may also be deployed on embedded devices such as embedded cameras, drones, robots, or in-vehicle autonomous driving systems, which are characterized by very strict real-time requirements.
During training, a slow model can be handled by using a larger cluster, more machines, and more data parallelism or even model parallelism. At the deployment end, the problem goes beyond cost: if the approach is wrong, even a very good GPU cannot meet the real-time requirements of inference. If the model is poorly built and not optimized, a single inference may take two to three hundred milliseconds, which rules out use on embedded devices with strict real-time requirements.
Summary of the invention
In view of this, the object of the present invention is to provide a method for obtaining human pose by accelerating OpenPose inference, so as to solve the problem that the OpenPose model takes a long time to run inference in actual deployment.
The method provided by the present invention places low demands on hardware. By restructuring the model, it speeds up OpenPose inference and provides a good foundation for applying OpenPose in real-world settings.
The technical solution provided by the present invention is a method for obtaining human pose by accelerating OpenPose inference, comprising the following steps:
S1: Acquire the video stream: use OpenCV to read the input video stream containing human pose information and extract single frames, where each frame is a 3-channel image in BGR format;
S2: Process the single frame: create a TensorRT data buffer, which is used to transfer data between the GPU and host memory; create an input array of size N × C × W × H, where N is the number of images fed to TensorRT at one time and C, H, and W are the number of channels, the height, and the width of the input image, respectively; extract the data of each channel of the frame and store it in the input array channel by channel in B, G, R order; copy the input array into the TensorRT data buffer as the input to the optimized model;
S3: Restructure the model: load the original OpenPose model to obtain its network structure, then use TensorRT to restructure the convolution, bias, and activation layers of the network, merging them into a single layer;
S4: Reduce data precision: use TensorRT to convert the single-precision fp32 parameters of the OpenPose model into half-precision fp16 parameters, yielding the optimized OpenPose model;
S5: Obtain the optimized result: feed the buffered data obtained in S2 into the optimized OpenPose model obtained in S4; after inference by the optimized network, the result is obtained and the TensorRT data buffer is then updated with it;
S6: Obtain the output data: copy the updated contents of the data buffer from the GPU to host memory; create an output array of the same size as the OpenPose network output and save the data from the data buffer into it;
S7: Obtain the human pose data: process the output array with the pose-generation part of the OpenPose algorithm to obtain the human pose for subsequent pose analysis.
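The channel rearrangement in step S2 can be illustrated with a minimal sketch. It uses only the Python standard library and does not touch OpenCV or TensorRT; the function name `pack_bgr_planar` and the tiny 2 × 2 test frame are assumptions made for illustration. The text writes the array size as N × C × W × H; the sketch stores each channel plane row by row (N × C × H × W memory order), which is an assumption rather than something the text states.

```python
from array import array


def pack_bgr_planar(frames, height, width, channels=3):
    """Pack interleaved BGR frames into one flat float32 input array.

    Each frame is a bytes object of length height * width * channels holding
    pixels as B, G, R, B, G, R, ... in row-major order. The result stores,
    for every frame, the full B plane, then the G plane, then the R plane.
    """
    plane = height * width
    packed = array("f")
    for frame in frames:
        if len(frame) != plane * channels:
            raise ValueError("frame length must equal height * width * channels")
        for c in range(channels):  # c = 0, 1, 2 selects B, G, R
            # Every `channels`-th byte starting at offset c is one colour plane.
            packed.extend(float(v) for v in frame[c::channels])
    return packed


# A 2 x 2 test frame with pixels (1,2,3), (4,5,6), (7,8,9), (10,11,12).
frame = bytes(range(1, 13))
input_array = pack_bgr_planar([frame], height=2, width=2)
print(list(input_array))
```

In a real pipeline the frame bytes would presumably come from an OpenCV capture (for example the image returned by `cv2.VideoCapture.read()`, serialized with `tobytes()`), and the packed array would then be copied into the TensorRT input buffer; neither of those steps is shown here.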
The method provided by the present invention speeds up network inference by changing the structure of the OpenPose network and reducing the precision of the network parameters, so that accurate human pose data is obtained faster. The input data is first converted into the data format used by TensorRT, the OpenPose model is then optimized with TensorRT, and finally the buffered data produced by inference is processed with the methods of the OpenPose algorithm to obtain the human pose data. The present invention accelerates network inference through TensorRT. On the one hand, the optimized model occupies less space and is easier to deploy in real environments; on the other hand, the optimized model places lower demands on hardware, which can save considerable cost in large-scale deployment.
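The claim that the reduced-precision model occupies less space follows from the storage formats: an fp32 value takes 4 bytes and an fp16 value takes 2. The sketch below shows this, and the rounding error it introduces, using the standard-library `struct` module. It is not how TensorRT performs the conversion internally; the helper names and the four sample weights are hypothetical.

```python
import struct


def to_fp32_bytes(weights):
    """Serialize weights as IEEE 754 single precision (4 bytes each)."""
    return struct.pack(f"<{len(weights)}f", *weights)


def to_fp16_bytes(weights):
    """Serialize weights as IEEE 754 half precision (2 bytes each)."""
    return struct.pack(f"<{len(weights)}e", *weights)


def from_fp16_bytes(blob):
    """Read half-precision values back as Python floats."""
    return list(struct.unpack(f"<{len(blob) // 2}e", blob))


# Hypothetical weights; a real model has millions of parameters.
weights = [0.5, -1.25, 0.1, 3.0]
fp32_blob = to_fp32_bytes(weights)
fp16_blob = to_fp16_bytes(weights)
restored = from_fp16_bytes(fp16_blob)

# Largest absolute difference between original and half-precision values.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(len(fp32_blob), len(fp16_blob), max_error)
```

Values such as 0.5 survive the conversion exactly, while 0.1 is rounded to the nearest representable half-precision value. Whether that rounding is acceptable for a given model has to be checked on the model's own detection results.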
Specific embodiment
The present invention is further explained below with reference to a specific embodiment, which does not limit the present invention.
The present invention provides a method for obtaining human pose by accelerating OpenPose inference, comprising the following steps:
S1: Acquire the video stream: use OpenCV to read the input video stream containing human pose information and extract single frames, where each frame is a 3-channel image in BGR format;
S2: Process the single frame: create a TensorRT data buffer, which is used to transfer data between the GPU and host memory; create an input array of size N × C × W × H, where N is the number of images fed to TensorRT at one time and C, H, and W are the number of channels, the height, and the width of the input image, respectively; extract the data of each channel of the frame and store it in the input array channel by channel in B, G, R order; copy the input array into the TensorRT data buffer as the input to the optimized model;
S3: Restructure the model: load the original OpenPose model to obtain its network structure, then use TensorRT to restructure the convolution, bias, and activation layers of the network, merging them into a single layer;
In ordinary deep learning, a convolution layer, a bias layer, and an activation layer require three separate calls to the corresponding cuDNN interfaces, whereas TensorRT can merge certain network layers. Current networks are growing deeper and also wider, and may perform several convolutions of the same size in parallel; these convolutions can in fact be merged and computed together. Take the Concat layer in OpenPose as an example: one branch of the network produces a matrix of size N × 38 × 45 × 80 and another branch produces a matrix of size N × 19 × 45 × 80, where N is the number of input images, and the two are merged into a matrix of size N × 57 × 45 × 80. TensorRT can also merge these two layers directly, so no merge operation needs to be defined in the network;
S4: Reduce data precision: use TensorRT to convert the single-precision fp32 parameters of the OpenPose model into half-precision fp16 parameters, yielding the optimized OpenPose model;
To preserve numerical precision during training, networks are trained entirely with single-precision fp32 data. A drawback of high-precision data, however, is that inference then also requires a large amount of computation. Experiments show that inference at lower precision can still achieve good detection results, so TensorRT is used to convert the single-precision fp32 parameters of the OpenPose model into half-precision fp16 parameters, yielding the optimized OpenPose model. TensorRT can also invoke the Tensor Core units of the GPU to further speed up network inference;
S5: Obtain the optimized result: feed the buffered data obtained in S2 into the optimized OpenPose model obtained in S4; after inference by the optimized network, the result is obtained and the TensorRT data buffer is then updated with it;
S6: Obtain the output data: copy the updated contents of the data buffer from the GPU to host memory; create an output array of the same size as the OpenPose network output (following the example given under S3, the array size here is N × 57 × 45 × 80) and save the data from the data buffer into it;
S7: Obtain the human pose data: process the output array with the pose-generation part of the OpenPose algorithm to obtain the human pose for subsequent pose analysis.
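The layer merging described under S3 can be pictured with a small pure-Python sketch: running convolution, bias addition, and activation as three passes gives the same result as a single fused pass that never materializes the intermediate feature maps. This illustrates the idea only and says nothing about how TensorRT implements fusion; the function names, the single-channel "valid" convolution, and the choice of ReLU as the activation are assumptions.

```python
def conv2d(image, kernel):
    """Single-channel 'valid' 2D convolution (cross-correlation form)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def add_bias(fmap, bias):
    """Second pass: add a scalar bias to every element."""
    return [[v + bias for v in row] for row in fmap]


def relu(fmap):
    """Third pass: clamp negative values to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def fused_conv_bias_relu(image, kernel, bias):
    """Convolution, bias, and ReLU computed in one pass per output element."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            max(0.0, sum(image[y + i][x + j] * kernel[i][j]
                         for i in range(kh) for j in range(kw)) + bias)
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]
bias = -10.0

three_passes = relu(add_bias(conv2d(image, kernel), bias))
one_pass = fused_conv_bias_relu(image, kernel, bias)
print(three_passes == one_pass, one_pass)
```

The fused version does the same arithmetic but reads and writes each output element once, which is where the saving in memory traffic and kernel launches would come from on a GPU.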
The method for obtaining human pose by accelerating OpenPose inference speeds up network inference by changing the structure of the OpenPose network and reducing the precision of the network parameters, so that accurate human pose data is obtained faster. The input data is first converted into the data format used by TensorRT, the OpenPose model is then optimized with TensorRT, and finally the buffered data produced by inference is processed with the methods of the OpenPose algorithm to obtain the human pose data. The present invention accelerates network inference through TensorRT. On the one hand, the optimized model occupies less space and is easier to deploy in real environments; on the other hand, the optimized model places lower demands on hardware, which can save considerable cost in large-scale deployment.
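The sizes quoted in the embodiment can be checked with a few lines of arithmetic. The sketch below computes the shape produced by concatenating the N × 38 × 45 × 80 and N × 19 × 45 × 80 branch outputs along the channel axis and the number of values the output array in S6 must hold. The helper name `concat_channels`, the batch size of 1, and the 4-byte element size (fp32 output) are assumptions for illustration.

```python
from math import prod


def concat_channels(shape_a, shape_b):
    """Shape after concatenating two N x C x H x W arrays along C."""
    n_a, c_a, h_a, w_a = shape_a
    n_b, c_b, h_b, w_b = shape_b
    if (n_a, h_a, w_a) != (n_b, h_b, w_b):
        raise ValueError("batch size and spatial size must match")
    return (n_a, c_a + c_b, h_a, w_a)


batch = 1  # assumed number of images per inference call
output_shape = concat_channels((batch, 38, 45, 80), (batch, 19, 45, 80))

# Number of values the output array must hold, and its size if each
# value is stored as a 4-byte float (an assumption about the output type).
num_values = prod(output_shape)
num_bytes = num_values * 4
print(output_shape, num_values, num_bytes)
```

Sizing the host-side output array from the same shape tuple that describes the network output keeps the copy from the GPU buffer and the later pose-generation step in agreement about the layout.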
The specific embodiments of the present invention are described in a progressive manner; each embodiment emphasizes its differences from the others, and the similar parts may be cross-referenced.
The embodiments of the present invention have been described in detail above, but the present invention is not limited to these embodiments. Within the scope of knowledge possessed by those of ordinary skill in the art, various changes may also be made without departing from the purpose of the present invention.
Claims (1)
1. A method for obtaining human pose by accelerating OpenPose inference, characterized by comprising the following steps:
S1: Acquire the video stream: use OpenCV to read the input video stream containing human pose information and extract single frames, where each frame is a 3-channel image in BGR format;
S2: Process the single frame: create a TensorRT data buffer, which is used to transfer data between the GPU and host memory; create an input array of size N × C × W × H, where N is the number of images fed to TensorRT at one time and C, H, and W are the number of channels, the height, and the width of the input image, respectively; extract the data of each channel of the frame and store it in the input array channel by channel in B, G, R order; copy the input array into the TensorRT data buffer as the input to the optimized model;
S3: Restructure the model: load the original OpenPose model to obtain its network structure, then use TensorRT to restructure the convolution, bias, and activation layers of the network, merging them into a single layer;
S4: Reduce data precision: use TensorRT to convert the single-precision fp32 parameters of the OpenPose model into half-precision fp16 parameters, yielding the optimized OpenPose model;
S5: Obtain the optimized result: feed the buffered data obtained in S2 into the optimized OpenPose model obtained in S4; after inference by the optimized network, the result is obtained and the TensorRT data buffer is then updated with it;
S6: Obtain the output data: copy the updated contents of the data buffer from the GPU to host memory; create an output array of the same size as the OpenPose network output and save the data from the data buffer into it;
S7: Obtain the human pose data: process the output array with the pose-generation part of the OpenPose algorithm to obtain the human pose for subsequent pose analysis.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910347091.9A CN110163116A (en) | 2019-04-26 | 2019-04-26 | Method by accelerating OpenPose reasoning to obtain human body attitude |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910347091.9A CN110163116A (en) | 2019-04-26 | 2019-04-26 | Method by accelerating OpenPose reasoning to obtain human body attitude |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110163116A true CN110163116A (en) | 2019-08-23 |
Family
ID=67638803
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910347091.9A Pending CN110163116A (en) | 2019-04-26 | 2019-04-26 | Method by accelerating OpenPose reasoning to obtain human body attitude |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110163116A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110796242A (en) * | 2019-11-01 | 2020-02-14 | 广东三维家信息科技有限公司 | Neural network model reasoning method and device, electronic equipment and readable medium |
CN111368791A (en) * | 2020-03-18 | 2020-07-03 | 南通大学 | Pull-up test counting method and system based on Quick-OpenPose model |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103198492A (en) * | 2013-03-28 | 2013-07-10 | 沈阳航空航天大学 | Human motion capture method |
CN108537136A (en) * | 2018-03-19 | 2018-09-14 | 复旦大学 | The pedestrian's recognition methods again generated based on posture normalized image |
CN109325469A (en) * | 2018-10-23 | 2019-02-12 | 北京工商大学 | A kind of human posture recognition method based on deep neural network |
US20190122424A1 (en) * | 2017-10-23 | 2019-04-25 | Fit3D, Inc. | Generation of Body Models and Measurements |
Non-Patent Citations (1)
Title |
---|
HAANJACK: "openpose-trt-optimize_update README.md", 《HTTPS://GITHUB.COM/HAANJACK/OPENPOSE-TRT-OPTIMIZE/COMMIT/3D8A992ED840C3DBBE6C1F37114C4D0CBEE749CA#》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE102019122180A1 (en) | METHOD AND SYSTEM FOR KEY EXPRESSION DETECTION BASED ON A NEURONAL NETWORK | |
KR102555057B1 (en) | Method for formatting weight matrix, accelerator using the formatted weight matrix and system including the same | |
EP3665676B1 (en) | Speaking classification using audio-visual data | |
CN111242844B (en) | Image processing method, device, server and storage medium | |
CN113344188A (en) | Lightweight neural network model based on channel attention module | |
CN111709493B (en) | Object classification method, training device, object classification equipment and storage medium | |
CN110942502B (en) | Voice lip fitting method and system and storage medium | |
CN110930342A (en) | Depth map super-resolution reconstruction network construction method based on color map guidance | |
RU2770748C1 (en) | Method and apparatus for image processing, device and data carrier | |
CN110163116A (en) | Method by accelerating OpenPose reasoning to obtain human body attitude | |
CN111079923A (en) | Spark convolution neural network system suitable for edge computing platform and circuit thereof | |
KR20220098991A (en) | Method and apparatus for recognizing emtions based on speech signal | |
CN111369430A (en) | Mobile terminal portrait intelligent background replacement method based on mobile deep learning engine | |
CN115797835A (en) | Non-supervision video target segmentation algorithm based on heterogeneous Transformer | |
US20240071070A1 (en) | Algorithm and method for dynamically changing quantization precision of deep-learning network | |
WO2022083165A1 (en) | Transformer-based automatic speech recognition system incorporating time-reduction layer | |
CN113610192A (en) | Neural network lightweight method and system based on continuous pruning | |
CN111832720B (en) | Configurable neural network reasoning and online learning fusion calculation circuit | |
JP2019197445A (en) | Image recognition device, image recognition method, and program | |
CN116580184A (en) | YOLOv 7-based lightweight model | |
CN116861963A (en) | Automatic driving equipment of photon convolution reserve pool based on multipath light injection laser | |
CN115861841A (en) | SAR image target detection method combined with lightweight large convolution kernel | |
KR102393761B1 (en) | Method and system of learning artificial neural network model for image processing | |
CN116468902A (en) | Image processing method, device and non-volatile computer readable storage medium | |
CN111832336B (en) | Improved C3D video behavior detection method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| TA01 | Transfer of patent application right | Effective date of registration: 20201106. Address after: Room d09-629, international software park, No. 863-9, shangshengou village, Hunnan District, Shenyang City, Liaoning Province. Applicant after: Shenyang Tuwei Technology Co.,Ltd. Address before: 110136, Liaoning, Shenyang, Shenbei New Area moral South Avenue No. 37. Applicant before: SHENYANG AEROSPACE University |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190823 |