CN110163116A - Method by accelerating OpenPose reasoning to obtain human body attitude - Google Patents

Method by accelerating OpenPose reasoning to obtain human body attitude

Info

Publication number
CN110163116A
Authority
CN
China
Prior art keywords
data
openpose
human body
model
body attitude
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910347091.9A
Other languages
Chinese (zh)
Inventor
张德园
王俊远
石祥滨
刘芳
武卫东
刘翠微
李照奎
吴杰宏
毕静
颜卓
李浩文
代海龙
杨啸宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Tuwei Technology Co ltd
Original Assignee
Shenyang Aerospace University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-04-26
Filing date
2019-04-26
Publication date
2019-08-23
Application filed by Shenyang Aerospace University
Priority to CN201910347091.9A
Publication of CN110163116A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/29 Graphical models, e.g. Bayesian networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for obtaining human poses by accelerating OpenPose inference, comprising the following steps: S1: obtain the input video stream containing human pose information through OpenCV and extract single frames; S2: process the single frame to obtain the input data for the optimized model; S3: restructure the model; S4: reduce the precision of the OpenPose model parameters; S5: obtain the optimized result; S6: obtain the output data; S7: obtain the human pose data. The method uses TensorRT to restructure the OpenPose network and to reduce the precision of the network parameters, yielding an optimized model with fast inference and accurate results. The optimized model can obtain human pose data quickly and lays a good foundation for deploying the model in practical applications.

Description

Method by accelerating OpenPose reasoning to obtain human body attitude
Technical field
The present invention relates to the fields of computer science and deep learning, and specifically provides a method for obtaining human poses by accelerating OpenPose inference. The method speeds up network inference by restructuring the network of the deep model and reducing the precision of the model parameters.
Background
Deep-learning applications are currently growing explosively. Image recognition, speech recognition, natural language processing, and image retrieval have become everyday tools, which in turn places higher demands on the inference efficiency and response speed of deep learning. Deep learning is divided into training and deployment. Training is generally done offline and consumes a large amount of GPU resources; because its real-time requirements are relatively low, a comparatively large batch size can be used (typically 128 for training), which makes full use of the GPU. Inference is different: it needs only one forward pass, in which the input is fed through the neural network to obtain a prediction. Inference can also be deployed in many ways. It may run in the cloud: voice input on ordinary mobile phones, for example, is at present still handled in the cloud, that is, the speaker's voice is first sent to the cloud and the result is returned after processing. It may also run on embedded devices such as embedded cameras, drones, robots, or in-vehicle autonomous driving systems, which are characterized by very strict real-time requirements.
During training, if the model is slow, a larger cluster and more machines can be used, with larger-scale data-parallel or even model-parallel training. At deployment, the problem is more than cost: with an unsuitable method, even a very good GPU cannot meet the real-time requirements of inference. If the model is not optimized, a single inference may take 200 to 300 milliseconds, which makes the model unusable on embedded devices with strict real-time requirements.
Summary of the invention
In view of this, the object of the present invention is to provide a method for obtaining human poses by accelerating OpenPose inference, in order to solve the problem that the inference time of the OpenPose model is long in actual deployment.
The method provided by the invention places low demands on hardware. By restructuring the model, it speeds up OpenPose inference and provides a good basis for applying OpenPose in real-world settings.
The technical solution provided by the present invention is a method for obtaining human poses by accelerating OpenPose inference, comprising the following steps:
S1: obtain the video stream: obtain the input video stream containing human pose information through OpenCV and extract single frames, where each frame is a 3-channel image in BGR format;
S2: process the single frame: create a TensorRT data buffer, which is used to transfer data between the GPU and host memory; create an input array of size N × C × W × H, where N is the number of images fed into TensorRT at one time and C, H, and W are the number of channels, the image height, and the image width of the input image, respectively; take out the data of each channel of the single frame and save each channel's data into the input array in B, G, R order; pass the input array into the TensorRT data buffer as the input of the optimized model;
S3: restructure the model: load the original OpenPose model to obtain the network structure, then use TensorRT to restructure the convolution, bias, and activation layers in the network, merging them into a single layer;
S4: reduce data precision: use TensorRT to convert the single-precision fp32 parameters in the OpenPose model into half-precision fp16 parameters, giving the optimized OpenPose model;
S5: obtain the optimized result: feed the buffered data obtained in S2 into the optimized OpenPose model obtained in S4; after inference by the optimized network, the result is obtained and the TensorRT data buffer is then updated with it;
S6: obtain the output data: copy the updated data in the data buffer from the GPU to host memory; create an array with the same size as the OpenPose network output to serve as the output array, and save the data from the data buffer into the output array;
S7: obtain the human pose data: process the output array with the pose-generation part of the OpenPose algorithm to obtain the human pose for subsequent pose analysis.
The method provided by the invention speeds up network inference by changing the structure of the OpenPose network and reducing the precision of the network parameters, so that accurate human pose data is obtained faster. The input data is first converted into the data format used by TensorRT, the OpenPose model is then optimized with TensorRT, and finally the buffered data obtained from inference is processed with the method in the OpenPose algorithm to obtain the human pose data. The invention accelerates network inference through TensorRT. On the one hand, the optimized model takes up less space and is easier to deploy in real environments; on the other hand, the optimized model places lower demands on hardware, which can save considerable cost in large-scale deployment.
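The input conversion in step S2 can be sketched as follows. This is a minimal illustration in plain Python, not the patent's implementation: it assumes each frame arrives as rows of interleaved (B, G, R) pixels, as OpenCV delivers them, and packs the frames into one flat channel-first buffer with height before width. The source writes the array size as N × C × W × H, so the axis order used here is an assumption, and the name `pack_frames_nchw` is hypothetical. A real pipeline would read frames with OpenCV and hand the buffer to TensorRT, neither of which is shown.

```python
from typing import List, Sequence

Pixel = Sequence[int]               # one (B, G, R) triple
Frame = Sequence[Sequence[Pixel]]   # frame[row][col] -> (B, G, R)


def pack_frames_nchw(frames: Sequence[Frame]) -> List[float]:
    """Flatten N interleaved BGR frames into one planar N x C x H x W buffer.

    Assumption: all frames share the same height and width.
    """
    n = len(frames)
    h = len(frames[0])
    w = len(frames[0][0])
    c = 3  # channel order is kept as B, G, R
    buf = [0.0] * (n * c * h * w)
    for i, frame in enumerate(frames):
        for y, row in enumerate(frame):
            for x, pixel in enumerate(row):
                for ch in range(c):
                    # Row-major offset into the (n, c, h, w) layout.
                    buf[((i * c + ch) * h + y) * w + x] = float(pixel[ch])
    return buf


# A tiny 2 x 2 frame standing in for one decoded video frame.
frame = [
    [(10, 20, 30), (11, 21, 31)],
    [(12, 22, 32), (13, 23, 33)],
]
packed = pack_frames_nchw([frame])
print(packed)
```

After packing, all blue values come first, then all green values, then all red values, which is the per-channel ordering that step S2 describes.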
Detailed description of the embodiments
The present invention is further explained below with reference to specific embodiments, which do not limit the invention.
The present invention provides a method for obtaining human poses by accelerating OpenPose inference, comprising the following steps:
S1: obtain the video stream: obtain the input video stream containing human pose information through OpenCV and extract single frames, where each frame is a 3-channel image in BGR format;
S2: process the single frame: create a TensorRT data buffer, which is used to transfer data between the GPU and host memory; create an input array of size N × C × W × H, where N is the number of images fed into TensorRT at one time and C, H, and W are the number of channels, the image height, and the image width of the input image, respectively; take out the data of each channel of the single frame and save each channel's data into the input array in B, G, R order; pass the input array into the TensorRT data buffer as the input of the optimized model;
S3: restructure the model: load the original OpenPose model to obtain the network structure, then use TensorRT to restructure the convolution, bias, and activation layers in the network, merging them into a single layer;
In ordinary deep-learning inference, a convolution layer, a bias layer, and an activation layer require three separate calls to the corresponding cuDNN interfaces, whereas TensorRT can merge certain network layers. Current networks are becoming deeper on the one hand and wider on the other, and may compute several convolutions of the same size in parallel; these convolutions can in fact be merged and computed together. Take the Concat layer in OpenPose as an example: one branch of the network produces a matrix of size N × 38 × 45 × 80 and another branch produces a matrix of size N × 19 × 45 × 80, where N is the number of input images; merged, they form a matrix of size N × 57 × 45 × 80. TensorRT can combine these two layers directly, so no merge operation needs to be defined in the network;
S4: reduce data precision: use TensorRT to convert the single-precision fp32 parameters in the OpenPose model into half-precision fp16 parameters, giving the optimized OpenPose model;
To preserve numerical precision during training, network training uses single-precision fp32 data throughout. A drawback of high-precision data, however, is that inference then also requires a large amount of computation. Experiments show that inference at lower precision can achieve equally good detection results, so TensorRT is used to convert the fp32 parameters in the OpenPose model into fp16 parameters, giving the optimized OpenPose model. TensorRT can also invoke the Tensor Core units in the GPU, which further speeds up network inference;
S5: obtain the optimized result: feed the buffered data obtained in S2 into the optimized OpenPose model obtained in S4; after inference by the optimized network, the result is obtained and the TensorRT data buffer is then updated with it;
S6: obtain the output data: copy the updated data in the data buffer from the GPU to host memory; create an array with the same size as the OpenPose network output to serve as the output array (in the example above, the array size is N × 57 × 45 × 80), and save the data from the data buffer into the output array;
S7: obtain the human pose data: process the output array with the pose-generation part of the OpenPose algorithm to obtain the human pose for subsequent pose analysis.
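As a rough illustration of the layer merging described under S3, the following sketch (plain Python, not TensorRT) checks that running a convolution, a bias addition, and an activation as three passes gives the same values as one fused pass, and that concatenating the two branch outputs along the channel axis yields 57 channels. The 1-D convolution, the ReLU activation, and all function names are assumptions made for this illustration; the source does not name the activation function, and TensorRT's actual fused kernels are not shown.

```python
from typing import List, Tuple


def conv1d(x: List[float], k: List[float]) -> List[float]:
    """Valid 1-D cross-correlation, standing in for a convolution layer."""
    n = len(x) - len(k) + 1
    return [sum(x[i + j] * k[j] for j in range(len(k))) for i in range(n)]


def add_bias(x: List[float], b: float) -> List[float]:
    """Bias layer: a second pass over the data."""
    return [v + b for v in x]


def relu(x: List[float]) -> List[float]:
    """Activation layer (ReLU assumed): a third pass over the data."""
    return [max(0.0, v) for v in x]


def fused_conv_bias_relu(x: List[float], k: List[float], b: float) -> List[float]:
    """Same arithmetic in a single pass, with no intermediate arrays."""
    n = len(x) - len(k) + 1
    return [
        max(0.0, sum(x[i + j] * k[j] for j in range(len(k))) + b)
        for i in range(n)
    ]


def concat_channels(a: Tuple[int, ...], b: Tuple[int, ...]) -> Tuple[int, ...]:
    """Shape of two (N, C, H, W) tensors concatenated along the channel axis."""
    if a[0] != b[0] or a[2:] != b[2:]:
        raise ValueError("batch and spatial dimensions must match")
    return (a[0], a[1] + b[1]) + a[2:]


x = [1.0, -2.0, 3.0, 0.5, -1.0]
k = [0.5, -1.0, 2.0]
b = -1.0

unfused = relu(add_bias(conv1d(x, k), b))
fused = fused_conv_bias_relu(x, k, b)
print(unfused, fused)

# The two branch output shapes from the example, with N = 1.
merged_shape = concat_channels((1, 38, 45, 80), (1, 19, 45, 80))
print(merged_shape)
```

The fused version produces the same numbers while touching the data once instead of three times, which is the kind of saving the description attributes to merging layers.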
This method of obtaining human poses by accelerating OpenPose inference speeds up network inference by changing the structure of the OpenPose network and reducing the precision of the network parameters, so that accurate human pose data is obtained faster. The input data is first converted into the data format used by TensorRT, the OpenPose model is then optimized with TensorRT, and finally the buffered data obtained from inference is processed with the method in the OpenPose algorithm to obtain the human pose data. The invention accelerates network inference through TensorRT. On the one hand, the optimized model takes up less space and is easier to deploy in real environments; on the other hand, the optimized model places lower demands on hardware, which can save considerable cost in large-scale deployment.
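The effect of reducing parameter precision from fp32 to fp16 can be sketched with the standard library alone. The example below is only an illustration under stated assumptions: it rounds a few made-up weight values to IEEE 754 half precision with the `struct` module, then compares the storage size and the rounding error. The weight values and the helper name `to_fp16` are hypothetical, and Python floats are double precision, so this is a stand-in for the conversion that the source attributes to TensorRT, not a reproduction of it.

```python
import struct
from typing import List


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value.

    Values beyond the fp16 range (about +/-65504) raise OverflowError.
    """
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Made-up example weights; real model parameters are not shown in the source.
weights: List[float] = [0.1, -0.333333, 1.0, 1234.5678]
weights_fp16 = [to_fp16(w) for w in weights]

# Storage per parameter: 4 bytes for fp32 versus 2 bytes for fp16.
fp32_bytes = len(weights) * struct.calcsize("<f")
fp16_bytes = len(weights) * struct.calcsize("<e")

# Largest relative rounding error introduced by the conversion.
max_rel_err = max(abs(h - w) / abs(w) for w, h in zip(weights, weights_fp16))

print(weights_fp16)
print(fp32_bytes, fp16_bytes, max_rel_err)
```

For values in the normal fp16 range, the relative rounding error stays within 2^-11, and storage is halved, which is consistent with the smaller model footprint described above.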
The specific embodiments of the invention are written in a progressive manner, each emphasizing its differences from the others; similar parts may be cross-referenced.
Embodiments of the present invention have been described in detail above, but the invention is not limited to these embodiments. Within the knowledge of a person of ordinary skill in the art, various changes can be made without departing from the purpose of the invention.

Claims (1)

1. A method for obtaining human poses by accelerating OpenPose inference, characterized by comprising the following steps:
S1: obtain the video stream: obtain the input video stream containing human pose information through OpenCV and extract single frames, where each frame is a 3-channel image in BGR format;
S2: process the single frame: create a TensorRT data buffer, which is used to transfer data between the GPU and host memory; create an input array of size N × C × W × H, where N is the number of images fed into TensorRT at one time and C, H, and W are the number of channels, the image height, and the image width of the input image, respectively; take out the data of each channel of the single frame and save each channel's data into the input array in B, G, R order; pass the input array into the TensorRT data buffer as the input of the optimized model;
S3: restructure the model: load the original OpenPose model to obtain the network structure, then use TensorRT to restructure the convolution, bias, and activation layers in the network, merging them into a single layer;
S4: reduce data precision: use TensorRT to convert the single-precision fp32 parameters in the OpenPose model into half-precision fp16 parameters, giving the optimized OpenPose model;
S5: obtain the optimized result: feed the buffered data obtained in S2 into the optimized OpenPose model obtained in S4; after inference by the optimized network, the result is obtained and the TensorRT data buffer is then updated with it;
S6: obtain the output data: copy the updated data in the data buffer from the GPU to host memory; create an array with the same size as the OpenPose network output to serve as the output array, and save the data from the data buffer into the output array;
S7: obtain the human pose data: process the output array with the pose-generation part of the OpenPose algorithm to obtain the human pose for subsequent pose analysis.
CN201910347091.9A 2019-04-26 2019-04-26 Method by accelerating OpenPose reasoning to obtain human body attitude Pending CN110163116A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910347091.9A CN110163116A (en) 2019-04-26 2019-04-26 Method by accelerating OpenPose reasoning to obtain human body attitude

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910347091.9A CN110163116A (en) 2019-04-26 2019-04-26 Method by accelerating OpenPose reasoning to obtain human body attitude

Publications (1)

Publication Number Publication Date
CN110163116A true CN110163116A (en) 2019-08-23

Family

ID=67638803

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910347091.9A Pending CN110163116A (en) 2019-04-26 2019-04-26 Method by accelerating OpenPose reasoning to obtain human body attitude

Country Status (1)

Country Link
CN (1) CN110163116A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110796242A (en) * 2019-11-01 2020-02-14 广东三维家信息科技有限公司 Neural network model reasoning method and device, electronic equipment and readable medium
CN111368791A (en) * 2020-03-18 2020-07-03 南通大学 Pull-up test counting method and system based on Quick-OpenPose model

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103198492A (en) * 2013-03-28 2013-07-10 沈阳航空航天大学 Human motion capture method
CN108537136A (en) * 2018-03-19 2018-09-14 复旦大学 The pedestrian's recognition methods again generated based on posture normalized image
CN109325469A (en) * 2018-10-23 2019-02-12 北京工商大学 A kind of human posture recognition method based on deep neural network
US20190122424A1 (en) * 2017-10-23 2019-04-25 Fit3D, Inc. Generation of Body Models and Measurements

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HAANJACK: "openpose-trt-optimize_update README.md", 《HTTPS://GITHUB.COM/HAANJACK/OPENPOSE-TRT-OPTIMIZE/COMMIT/3D8A992ED840C3DBBE6C1F37114C4D0CBEE749CA#》 *

Similar Documents

Publication Publication Date Title
DE102019122180A1 (en) METHOD AND SYSTEM FOR KEY EXPRESSION DETECTION BASED ON A NEURONAL NETWORK
KR102555057B1 (en) Method for formatting weight matrix, accelerator using the formatted weight matrix and system including the same
EP3665676B1 (en) Speaking classification using audio-visual data
CN111242844B (en) Image processing method, device, server and storage medium
CN113344188A (en) Lightweight neural network model based on channel attention module
CN111709493B (en) Object classification method, training device, object classification equipment and storage medium
CN110942502B (en) Voice lip fitting method and system and storage medium
CN110930342A (en) Depth map super-resolution reconstruction network construction method based on color map guidance
RU2770748C1 (en) Method and apparatus for image processing, device and data carrier
CN110163116A (en) Method by accelerating OpenPose reasoning to obtain human body attitude
CN111079923A (en) Spark convolution neural network system suitable for edge computing platform and circuit thereof
KR20220098991A (en) Method and apparatus for recognizing emtions based on speech signal
CN111369430A (en) Mobile terminal portrait intelligent background replacement method based on mobile deep learning engine
CN115797835A (en) Non-supervision video target segmentation algorithm based on heterogeneous Transformer
US20240071070A1 (en) Algorithm and method for dynamically changing quantization precision of deep-learning network
WO2022083165A1 (en) Transformer-based automatic speech recognition system incorporating time-reduction layer
CN113610192A (en) Neural network lightweight method and system based on continuous pruning
CN111832720B (en) Configurable neural network reasoning and online learning fusion calculation circuit
JP2019197445A (en) Image recognition device, image recognition method, and program
CN116580184A (en) YOLOv 7-based lightweight model
CN116861963A (en) Automatic driving equipment of photon convolution reserve pool based on multipath light injection laser
CN115861841A (en) SAR image target detection method combined with lightweight large convolution kernel
KR102393761B1 (en) Method and system of learning artificial neural network model for image processing
CN116468902A (en) Image processing method, device and non-volatile computer readable storage medium
CN111832336B (en) Improved C3D video behavior detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201106

Address after: Room d09-629, international software park, No. 863-9, shangshengou village, Hunnan District, Shenyang City, Liaoning Province

Applicant after: Shenyang Tuwei Technology Co.,Ltd.

Address before: 110136, Liaoning, Shenyang, Shenbei New Area moral South Avenue No. 37

Applicant before: SHENYANG AEROSPACE University

RJ01 Rejection of invention patent application after publication

Application publication date: 20190823