CN113111850A - Human body key point detection method, device and system based on region-of-interest transformation - Google Patents

Info

Publication number
CN113111850A
Authority
CN
China
Prior art keywords
face
image
human body
key points
test
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110478213.5A
Other languages
Chinese (zh)
Other versions
CN113111850B (en)
Inventor
杨帆
郝强
潘鑫淼
胡建国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiaoshi Technology Jiangsu Co ltd
Original Assignee
Nanjing Zhenshi Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Zhenshi Intelligent Technology Co Ltd
Priority to CN202110478213.5A (patent CN113111850B)
Publication of CN113111850A
Application granted
Publication of CN113111850B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method, a device and a system for detecting human body key points based on region-of-interest transformation. During model training, region-of-interest transformation is applied to the human body key point data, and the human body key point model is trained on the transformed data. During detection, human body key points are detected with the trained model and then inverse-transformed to obtain the key points of the image before transformation. The invention standardizes the data to a uniform form, which mitigates the large data variation found in open scenes and reduces training difficulty. The region-of-interest transformation also increases the proportion of the image occupied by the face, which benefits face key point prediction and in turn improves the overall precision of the human body key points. Compared with methods that predict body and face key points separately, the method needs only one face detector and one key point detector, so its computational cost is low.

Description

Human body key point detection method, device and system based on region-of-interest transformation
Technical Field
The invention relates to the technical field of image processing, in particular to human face detection and recognition, and specifically relates to a human body key point detection method, device and system based on region-of-interest transformation.
Background
The task of human body key point detection is to detect the positions of the key points of the face and the limbs in a human body image. Human body image data captured in uncontrolled scenes vary widely, for example in the people, clothing, poses, occlusions and background environments, and the face occupies only a small proportion of the image, all of which makes training a human body key point detection model difficult.
Existing human body key point detection methods fall mainly into two types. The first type detects the position of the human body in the image, crops the human body image, and then detects the key points in the cropped image. Because the face occupies a small proportion of the image, face key point prediction is inaccurate; and because face key points are usually numerous while limb key points are few, the overall precision suffers.
The second type detects the positions of both the human body and the face: the body and face images are cropped first, and the limb key points and face key points are then detected separately. Although this approach is accurate, it requires predictions from multiple models and is computationally time-consuming.
Disclosure of Invention
The invention aims to provide a method, a device and a system for detecting human key points based on region-of-interest transformation.
In order to achieve the above object, a first aspect of the present invention provides a method for detecting human key points based on region of interest transformation, including the following steps:
step 1, obtaining M color images containing a human body, wherein M is a natural number more than 1000;
step 2, marking N human body key points on each color image to obtain marking data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
step 3, determining a face boundary frame of the color image according to the coordinates of the labeled face key points;
step 4, performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain transformed images and transformed human body key point coordinates; the face central point and the face size are determined according to the face bounding box;
step 5, training a human body key point detection model for detecting the human body key points based on the image after the region of interest is transformed and the transformed human body key point coordinates;
step 6, detecting a human face boundary frame by using a human face detector for the input image to be detected containing the human body, and then carrying out region-of-interest transformation according to the method in the step 4, so as to improve the proportion of the human face in the image and obtain a transformed image;
step 7, detecting the human key points in the transformed image by using the human key point detection model obtained by training in the step 5; and
step 8, carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
The second aspect of the present invention further provides a human body key point detection device based on region of interest transformation, including:
a module for acquiring M color images containing a human body, M being a natural number greater than 1000;
a module for labeling N human body key points on each color image to obtain labeling data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
a module for determining a face bounding box of the color image according to the coordinates of the labeled face key points;
a module for performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain a transformed image and transformed coordinates of key points of the human body; the face central point and the face size are determined according to the face bounding box;
a module for training a human body key point detection model for detecting human body key points based on the image after the transformation of the region of interest and the transformed human body key point coordinates;
a module for detecting a human face boundary box by using a human face detector for an input image to be detected containing a human body, then carrying out region-of-interest transformation, improving the proportion of the human face in the image and obtaining a transformed image;
a module for detecting human key points in the transformed image using a trained human key point detection model; and
a module for carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
The third aspect of the present invention further provides a system for human body keypoint detection based on region of interest transformation, comprising:
one or more processors;
a memory storing instructions that are operable, when executed by the one or more processors, to cause the one or more processors to perform operations comprising a flow of a human keypoint detection method based on region of interest transformation as previously described.
Compared with the prior art, the technical scheme of the invention has the following remarkable beneficial effects:
Human body image data in open environments vary widely between scenes, and the face occupies a small proportion of the image, both of which hinder human body key point detection. To address this, the method performs a face-centered region-of-interest transformation on the data and trains the human body key point detection model on the transformed data. During actual detection, the face is first detected by the face detector, the region-of-interest transformation is applied to the image, the human body key points are detected, and an inverse transformation finally yields the key point data of the original image to be detected. On the one hand, this brings the data into a uniform form and reduces training difficulty. On the other hand, because face key points far outnumber body key points, increasing the proportion of the face in the image through the transformation allows the face key points to be predicted more accurately, which improves the overall accuracy of the human body key point detection model. Meanwhile, the method needs only one face detector and one key point detector, so the computational cost is low.
It should be understood that all combinations of the foregoing concepts and additional concepts described in greater detail below can be considered as part of the inventive subject matter of this disclosure unless such concepts are mutually inconsistent. In addition, all combinations of claimed subject matter are considered a part of the presently disclosed subject matter.
The foregoing and other aspects, embodiments and features of the present teachings can be more fully understood from the following description taken in conjunction with the accompanying drawings. Additional aspects of the present invention, such as features and/or advantages of exemplary embodiments, will be apparent from the description which follows, or may be learned by practice of specific embodiments in accordance with the teachings of the present invention.
Drawings
The drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures may be represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. Embodiments of various aspects of the present invention will now be described, by way of example, with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of a training process of a human body key point detection model based on region of interest transformation according to an exemplary embodiment of the present invention.
FIG. 2 is a schematic diagram of a model structure of the human body key point detection model of the present invention.
FIG. 3 is a schematic diagram illustrating a process of detecting key points of a human body by using the model trained as shown in FIG. 1 according to the embodiment of the invention.
Detailed Description
In order to better understand the technical content of the present invention, specific embodiments are described below with reference to the accompanying drawings.
In this disclosure, aspects of the present invention are described with reference to the accompanying drawings, in which a number of illustrative embodiments are shown. Embodiments of the present disclosure are not necessarily intended to include all aspects of the invention. It should be appreciated that the various concepts and embodiments described above, as well as those described in greater detail below, may be implemented in any of numerous ways, as the disclosed concepts and embodiments are not limited to any one implementation. In addition, some aspects of the present disclosure may be used alone, or in any suitable combination with other aspects of the present disclosure.
Referring to FIG. 1, FIG. 2 and FIG. 3, the method for detecting key points of a human body based on region of interest transformation provided by the invention comprises a model training process and a model detection process. In the model training process, region-of-interest transformation is applied to the human body key point data, and the human body key point model is trained on the transformed data. In the model detection process, human body key points are detected with the trained model and then inverse-transformed to obtain the human body key points of the image before transformation.
In the model training process, standardizing the data to a uniform form mitigates the large data variation in open scenes and reduces training difficulty. At the same time, the region-of-interest transformation increases the proportion of the face in the image, which benefits the prediction of the face key points and in turn improves the overall precision of the human body key points. Compared with methods that predict body and face key points separately, the method needs only one face detector and one key point detector, so the computational cost is low.
As shown in fig. 1 and 3, the method for detecting human key points based on region of interest transformation according to the embodiment of the present disclosure includes the following steps:
step 1, obtaining M color images containing a human body, wherein M is a natural number more than 1000;
step 2, marking N human body key points on each color image to obtain marking data;
step 3, determining a face boundary frame of the color image according to the coordinates of the labeled face key points;
step 4, performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain transformed images and transformed human body key point coordinates; the face central point and the face size are determined according to the face bounding box;
step 5, training a human body key point detection model for detecting the human body key points based on the image after the region of interest is transformed and the transformed human body key point coordinates;
step 6, detecting a human face boundary frame by using a human face detector for the input image to be detected containing the human body, and then carrying out region-of-interest transformation according to the method in the step 4, so as to improve the proportion of the human face in the image and obtain a transformed image;
step 7, detecting the human key points in the transformed image by using the human key point detection model trained in the step 5; and
step 8, carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
Human body key point data acquisition and labeling
In step 1, the base images of the training set are constructed by acquiring M color images containing a human body, M being greater than 1000. In particular, the image data should cover as many scenes as possible, such as different people, clothing, poses, occlusions and background environments.
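In step 2, N human body key points are labeled on each color image, and the labeling data are expressed as:
$\{[I^{src}_0,(p^{src}_{0,0},p^{src}_{0,1},\ldots,p^{src}_{0,N-1})],[I^{src}_1,(p^{src}_{1,0},p^{src}_{1,1},\ldots,p^{src}_{1,N-1})],\ldots,[I^{src}_{M-1},(p^{src}_{M-1,0},p^{src}_{M-1,1},\ldots,p^{src}_{M-1,N-1})]\}$
where $p^{src}_{m,n}=(x^{src}_{m,n},y^{src}_{m,n})$ is the coordinate of the n-th human body key point of the m-th image $I^{src}_m$, with m = 0, 1, 2, ..., M-1 and n = 0, 1, 2, ..., N-1.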
The labeled human key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points. From the face key points, a face bounding box for the face can be determined.
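As a minimal sketch of how a face bounding box, face center point and face size could be derived from labeled face key points: the tight min/max box, the function names and the sample coordinates below are illustrative assumptions, not details given in the text.

```python
def face_box_from_keypoints(face_points):
    """Return (x_min, y_min, x_max, y_max) enclosing the labeled face key points.

    ASSUMPTION: a tight axis-aligned box with no margin; the text does not
    say whether any padding is applied.
    """
    xs = [x for x, _ in face_points]
    ys = [y for _, y in face_points]
    return min(xs), min(ys), max(xs), max(ys)


def face_center_and_size(box):
    """Face center = center of the box; face size = length of its long side."""
    x_min, y_min, x_max, y_max = box
    center = ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)
    size = max(x_max - x_min, y_max - y_min)
    return center, size


# Hypothetical face key points (x, y) in pixel coordinates.
face_points = [(120.0, 80.0), (160.0, 82.0), (140.0, 100.0),
               (128.0, 130.0), (152.0, 131.0)]
box = face_box_from_keypoints(face_points)
center, size = face_center_and_size(box)
```

The center and size computed this way are the two quantities that drive the region-of-interest transformation in step 4.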
Region of interest transformation
In step 4, the region-of-interest transformation is performed on each color image and its labeled data according to the face center point and the face size to obtain the transformed image and the transformed key point coordinates, as follows:
The center point of the face bounding box is taken as the face center point $(x_{face,m},y_{face,m})$, and the length of the long side of the bounding box is taken as the face size $a_m$. According to the face center point and size, the region-of-interest transformation is applied to the image $I^{src}_m$ (the image before transformation) and its corresponding human body key points, and the transformed data are expressed as:
$\{[I_0,(p_{0,0},p_{0,1},\ldots,p_{0,N-1})],[I_1,(p_{1,0},p_{1,1},\ldots,p_{1,N-1})],\ldots,[I_{M-1},(p_{M-1,0},p_{M-1,1},\ldots,p_{M-1,N-1})]\}$
where $p_{m,n}=(x_{m,n},y_{m,n})$ is the coordinate of the n-th transformed human body key point of the m-th transformed image $I_m$; the side length of the transformed image is L, a positive integer; in a preferred example, L ≥ 64; in the present example, L is 64 or 128.
Each pixel value of the transformed image $I_m$ is sampled from the image $I^{src}_m$; $x_{indices,m}$ is the list of abscissas of the sampling positions in $I^{src}_m$, and $y_{indices,m}$ is the list of ordinates of the sampling positions in $I^{src}_m$. They are obtained as follows:
$x_{indices,m}=(x_{face,m}+warp_{RoI,m}(0),\;x_{face,m}+warp_{RoI,m}(1),\;\ldots,\;x_{face,m}+warp_{RoI,m}(L-1))$
$y_{indices,m}=(y_{face,m}+warp_{RoI,m}(0),\;y_{face,m}+warp_{RoI,m}(1),\;\ldots,\;y_{face,m}+warp_{RoI,m}(L-1))$
$warp_{RoI,m}(t)=\frac{a_m}{2}\cdot\operatorname{arctanh}(2t/L-0.9)$
where $warp_{RoI,m}(t)$ is the region-of-interest transformation function of the m-th image and t is the function input, t = 0, 1, 2, ..., L-1.
The region-of-interest transformation of the image uses the remap method of the OpenCV image processing library, with the parameter map1 set to $x_{indices,m}$ and the parameter map2 set to $y_{indices,m}$.
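The sampling scheme above can be sketched in plain Python as follows. Several things here are assumptions rather than details from the text. First, with the offset written as arctanh(2t/L - 0.9), the argument reaches 1.1 - 2/L at t = L-1, which is outside the domain of arctanh for L = 64 or 128; the sketch therefore assumes a symmetric argument 0.9·(2t/(L-1) - 1) that stays within [-0.9, 0.9]. Second, a nearest-neighbour lookup stands in for OpenCV's remap so the example has no dependencies; with cv2.remap the two 1-D lists would typically need to be expanded into full L × L float32 maps first. The function names and the toy image are hypothetical.

```python
import math


def warp_roi(t, a, L):
    """Sampling offset (in source pixels) for output index t.

    ASSUMPTION: symmetric argument 0.9 * (2t/(L-1) - 1) instead of the
    printed 2t/L - 0.9, so that arctanh is defined for every t in [0, L-1].
    """
    u = 0.9 * (2.0 * t / (L - 1) - 1.0)
    return a / 2.0 * math.atanh(u)


def sampling_indices(x_face, y_face, a, L):
    """Build the lists of source abscissas and ordinates to sample from."""
    x_idx = [x_face + warp_roi(t, a, L) for t in range(L)]
    y_idx = [y_face + warp_roi(t, a, L) for t in range(L)]
    return x_idx, y_idx


def roi_transform(image, x_idx, y_idx):
    """Nearest-neighbour stand-in for remap: out[r][c] = image[y_idx[r]][x_idx[c]]."""
    h, w = len(image), len(image[0])
    out = []
    for y in y_idx:
        yi = min(max(int(round(y)), 0), h - 1)  # clamp to image bounds
        row = []
        for x in x_idx:
            xi = min(max(int(round(x)), 0), w - 1)
            row.append(image[yi][xi])
        out.append(row)
    return out


# Toy 200x200 grayscale image whose pixel value equals its column index.
image = [[col for col in range(200)] for _ in range(200)]
L = 64
x_idx, y_idx = sampling_indices(x_face=100.0, y_face=100.0, a=40.0, L=L)
warped = roi_transform(image, x_idx, y_idx)
```

Because arctanh is nearly linear around zero and steep near ±1, neighbouring output pixels sample closely spaced source pixels around the face center and widely spaced ones toward the border, which is what enlarges the face relative to the rest of the body.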
The human body key point $(x_{m,n},y_{m,n})$ after the region-of-interest transformation is calculated by a traversal method. The traversal calculation process comprises the following steps:
traverse t over the range t = 0, 1, 2, ..., L-1 to find the t that satisfies
[Equation image BDA0003048079120000055]
the value of t at this point is the abscissa $x_{m,n}$ of the transformed key point; and
traverse t in the same way to find the t that satisfies
[Equation image BDA0003048079120000056]
the value of t at this point is the ordinate $y_{m,n}$ of the transformed key point.
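A minimal sketch of such a traversal follows. The matching criterion is not reproduced in the text, so the sketch assumes the natural choice of picking the t whose sampling position lies nearest to the original coordinate; it also assumes a symmetric arctanh argument so that the warp is defined for every t. All function names are hypothetical.

```python
import math


def warp_roi(t, a, L):
    """ASSUMED symmetric warp: (a/2) * arctanh(0.9 * (2t/(L-1) - 1))."""
    u = 0.9 * (2.0 * t / (L - 1) - 1.0)
    return a / 2.0 * math.atanh(u)


def transform_coordinate(src_coord, face_coord, a, L):
    """Traverse t = 0..L-1 and return the t whose sampling position is
    nearest to src_coord (ASSUMED criterion)."""
    best_t, best_gap = 0, float("inf")
    for t in range(L):
        gap = abs(face_coord + warp_roi(t, a, L) - src_coord)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t


def transform_keypoint(src_point, face_center, a, L):
    """Apply the traversal independently to the abscissa and the ordinate."""
    return (transform_coordinate(src_point[0], face_center[0], a, L),
            transform_coordinate(src_point[1], face_center[1], a, L))


L = 64
a = 40.0
face_center = (100.0, 100.0)
# A source key point that sits exactly on the sampling positions t=40 and t=10.
src_point = (100.0 + warp_roi(40, a, L), 100.0 + warp_roi(10, a, L))
p = transform_keypoint(src_point, face_center, a, L)
```

Under this criterion, a key point that falls outside the sampled region is mapped to the nearest border index (0 or L-1). Since the warp is monotonic, a binary search would give the same result faster; the linear scan is kept to mirror the traversal described in the text.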
Human body key point training detection model
In step 5, the human body key point detection model for detecting the human body key points is implemented as a CNN. As in the model structure shown in FIG. 2, it is made up of convolutional layers, max pooling layers and fully connected layers.
The convolution kernel size of each convolutional layer is 3 × 3 with a stride of 1 and "same" zero padding; the number of convolution kernels is indicated in parentheses for each convolutional layer in FIG. 2.
The pooling window size of each max pooling layer is 2 × 2 with a stride of 2.
The first fully connected layer has 1024 neurons and the second fully connected layer has 2N neurons.
Each convolutional layer and the first fully connected layer are followed by a ReLU activation function.
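The layer settings above fix how the feature map shrinks: a 3 × 3 convolution with stride 1 and same padding keeps the spatial size, and each 2 × 2 pooling with stride 2 halves it. The sketch below works out the resulting fully connected layer shapes. The number of pooling layers, the channel count of the last convolutional layer and the value of N are placeholders chosen for illustration, since those details appear only in FIG. 2.

```python
def feature_map_side(L, num_pools):
    """Side length after num_pools 2x2/stride-2 poolings.

    Same-padded 3x3 stride-1 convolutions leave the side length unchanged,
    so only the pooling layers matter here.
    """
    side = L
    for _ in range(num_pools):
        side //= 2
    return side


def fc_layer_shapes(L, num_pools, last_channels, N):
    """Return (inputs, outputs) for the two fully connected layers."""
    side = feature_map_side(L, num_pools)
    flat = side * side * last_channels   # flattened conv features
    return [(flat, 1024), (1024, 2 * N)]  # 2N outputs: one (x, y) pair per key point


# ASSUMED values: 4 pooling layers, 128 channels in the last conv layer, N = 30.
shapes = fc_layer_shapes(L=64, num_pools=4, last_channels=128, N=30)
```

The final layer has 2N outputs because the network regresses one (x, y) coordinate pair for each of the N key points.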
During the training process, the loss function of the m-th data item is expressed as:
[Equation image BDA0003048079120000061]
where $(x_{m,n},y_{m,n})$ is the n-th human body key point of the m-th training sample in the data set after the region-of-interest transformation, and $(x'_{m,n},y'_{m,n})$ is the model's prediction of the n-th human body key point for the m-th region-of-interest-transformed training image.
In this way, a detection model for detecting key points in region-of-interest-transformed human body images is trained from the transformed images and the transformed human body key point coordinates.
Human body key point detection application
As an example shown in fig. 3, the human body key point detection process for an input image to be detected containing a human body includes:
firstly, detecting a human face boundary frame by using a human face detector, then carrying out region-of-interest transformation according to the method in the step 4, and improving the proportion of the human face in the image to obtain a transformed image;
then, detecting the human key points in the transformed image by using a human key point detection model obtained by training; and
and finally, carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
The face detector may be, for example, the Dlib tool, which is used to detect the face and determine its bounding box. It should be understood that, in implementing the present invention, face detection is not limited to the Dlib tool and may also be performed with other pre-trained face detection models.
According to the center point $(x_{test,face},y_{test,face})$ of the face bounding box and the length $a_{test}$ of its long side, the region-of-interest transformation of the image to be detected is performed using the remap method of the OpenCV image processing library, with the parameter map1 set to $x_{test,indices}$ and the parameter map2 set to $y_{test,indices}$, calculated as follows:
$x_{test,indices}=(x_{test,face}+warp_{RoI,test}(0),\;x_{test,face}+warp_{RoI,test}(1),\;\ldots,\;x_{test,face}+warp_{RoI,test}(L-1))$
$y_{test,indices}=(y_{test,face}+warp_{RoI,test}(0),\;y_{test,face}+warp_{RoI,test}(1),\;\ldots,\;y_{test,face}+warp_{RoI,test}(L-1))$
$warp_{RoI,test}(t)=\frac{a_{test}}{2}\cdot\operatorname{arctanh}(2t/L-0.9)$
where $warp_{RoI,test}(t)$ is the region-of-interest transformation function of the image to be detected and t is the function input, t = 0, 1, 2, ..., L-1.
The key points $(x_{test,n},y_{test,n})$ in the transformed image are detected using the human body key point detection model trained in step 5.
Then the region-of-interest inverse transformation is applied to the key points in the transformed image to obtain the human body key points $(x_{src,test,n},y_{src,test,n})$ of the image to be detected before transformation:
$x_{src,test,n}=x_{test,face}+warp_{RoI,test}(x_{test,n})$
$y_{src,test,n}=y_{test,face}+warp_{RoI,test}(y_{test,n})$
The human body key point data of the image to be detected before transformation are thereby obtained.
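The inverse transformation simply evaluates the same warp function at the predicted coordinates and adds the face center. A minimal sketch follows, assuming a symmetric arctanh argument so that the warp is defined over the whole output range; this is an illustrative variant, not the formula as printed, and the function names and sample values are hypothetical.

```python
import math


def warp_roi(t, a, L):
    """ASSUMED symmetric warp: (a/2) * arctanh(0.9 * (2t/(L-1) - 1))."""
    u = 0.9 * (2.0 * t / (L - 1) - 1.0)
    return a / 2.0 * math.atanh(u)


def inverse_transform(points, face_center, a, L):
    """Map key points predicted in the transformed image back to the
    coordinate system of the image before transformation."""
    x_face, y_face = face_center
    return [(x_face + warp_roi(x, a, L), y_face + warp_roi(y, a, L))
            for x, y in points]


L = 64
face_center = (100.0, 100.0)
a_test = 40.0
# Hypothetical model outputs: the image center and two opposite extremes.
predicted = [((L - 1) / 2.0, (L - 1) / 2.0), (0.0, float(L - 1))]
restored = inverse_transform(predicted, face_center, a_test, L)
```

Model outputs are real-valued, so the warp is evaluated at non-integer t here; the center of the transformed image maps back to the face center, as expected.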
It should be understood that in step 4 and step 6, the side length L of the image after the region of interest transformation has the same value.
Test procedure
Following steps 1 and 2, 12000 groups of labeled human body key point data were prepared, comprising 10000 groups of training data and 2000 groups of test data. The data cover a variety of people, clothing, poses, occlusions and background environments. Region-of-interest transformation was applied to the 10000 groups of training data, a detection model was trained on the transformed data, and the trained human body key point model was verified on the test data. For comparison, a human body key point detection model trained directly on the original data was used to detect key points on the same test data.
The normalized mean error is used as the evaluation index, namely the Euclidean distance between the predicted coordinate and the labeled coordinate divided by the length of the diagonal of the human body bounding box. The comparison results are shown in Table 1.
TABLE 1 Comparison of test results of the existing method and the method of the invention
Method                     Normalized mean error
Existing method            6.32%
Method of the invention    4.94%
As can be seen from the comparison, the model training method of the invention effectively improves model precision: compared with the existing method, the test error falls by 1.38 percentage points (from 6.32% to 4.94%), a relative reduction of about 22%.
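The evaluation index described above can be sketched as follows. The averaging over the key points of one sample and the sample coordinates are assumptions for illustration; the text does not state how errors are aggregated across key points or images.

```python
import math


def normalized_mean_error(pred, truth, body_box):
    """Mean Euclidean distance between predicted and labeled key points,
    divided by the diagonal of the human body bounding box.

    ASSUMPTION: the mean is taken over the key points of a single sample;
    a dataset-level score would average this value over all test samples.
    """
    x_min, y_min, x_max, y_max = body_box
    diagonal = math.hypot(x_max - x_min, y_max - y_min)
    distances = [math.hypot(px - tx, py - ty)
                 for (px, py), (tx, ty) in zip(pred, truth)]
    return sum(distances) / len(distances) / diagonal


# Hypothetical sample: two key points, body box with a 100-pixel diagonal.
truth = [(10.0, 10.0), (50.0, 90.0)]
pred = [(13.0, 14.0), (50.0, 85.0)]
nme = normalized_mean_error(pred, truth, body_box=(0.0, 0.0, 60.0, 80.0))
```

Normalizing by the body box diagonal makes the error comparable across people who appear at different scales in the image.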
Human key point detection device based on region of interest transformation
According to the disclosure of the present invention, there is also provided a human body key point detection device based on region of interest transformation, comprising:
a module for acquiring M color images containing a human body, M being a natural number greater than 1000;
a module for labeling N human body key points on each color image to obtain labeling data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
a module for determining a face bounding box of the color image according to the coordinates of the labeled face key points;
a module for performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain a transformed image and transformed coordinates of key points of the human body; the face central point and the face size are determined according to the face bounding box;
a module for training a human body key point detection model for detecting human body key points based on the image after the transformation of the region of interest and the transformed human body key point coordinates;
a module for detecting a human face boundary box by using a human face detector for an input image to be detected containing a human body, then carrying out region-of-interest transformation, improving the proportion of the human face in the image and obtaining a transformed image;
a module for detecting human key points in the transformed image using a trained human key point detection model; and
a module for carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
It should be understood that the functions and implementation of the modules of the human body key point detection apparatus based on region of interest transformation of the present embodiment can be implemented based on the specific operations of the aforementioned human body key point detection method based on region of interest transformation.
System for human body key point detection based on region of interest transformation
According to the disclosure of the present invention, there is also provided a system for human keypoint detection based on region of interest transformation, comprising:
one or more processors;
a memory storing instructions that are operable, when executed by the one or more processors, to cause the one or more processors to perform operations comprising a flow of a region of interest transformation based human keypoint detection method as previously described, in particular the procedures of the detection method as implemented in connection with FIG. 1 and FIG. 3.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to be limited thereto. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention should be determined by the appended claims.

Claims (10)

1. A human body key point detection method based on region of interest transformation is characterized by comprising the following steps:
step 1, obtaining M color images containing a human body, wherein M is a natural number more than 1000;
step 2, marking N human body key points on each color image to obtain marking data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
step 3, determining a face boundary frame of the color image according to the coordinates of the labeled face key points;
step 4, performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain transformed images and transformed human body key point coordinates; the face central point and the face size are determined according to the face bounding box;
step 5, training a human body key point detection model for detecting the human body key points based on the image after the region of interest is transformed and the transformed human body key point coordinates;
step 6, detecting a human face boundary frame by using a human face detector for an input image to be detected containing a human body, and then carrying out region-of-interest transformation on the image to be detected according to the method in the step 4, so as to improve the proportion of the human face in the image and obtain a transformed image;
step 7, detecting the human key points in the transformed image by using the human key point detection model trained in the step 5; and
step 8, carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
2. The method for detecting key points of a human body based on region-of-interest transformation according to claim 1, wherein in step 2, N human body key points are labeled on each color image, and the obtained labeled data is expressed as:
[formula image RE-FDA0003073926600000011]
wherein [formula image RE-FDA0003073926600000012] is the n-th human body key point coordinate of the m-th image [formula image RE-FDA0003073926600000013], m = 0, 1, 2, ..., M-1, and n = 0, 1, 2, ..., N-1.
3. The method for detecting key points of a human body based on region-of-interest transformation according to claim 1, wherein in step 4, performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain the transformed image and the transformed human body key point coordinates comprises:
taking the center point of the face bounding box as the face center point and the length of the long side of the bounding box as the face size, and, according to the face center point and the face size, performing region-of-interest transformation on the image [formula image RE-FDA0003073926600000014] and its corresponding human body key points, the transformed data being expressed as:
$\{[I_0,(p_{0,0},p_{0,1},\ldots,p_{0,N-1})],[I_1,(p_{1,0},p_{1,1},\ldots,p_{1,N-1})],\ldots,[I_{M-1},(p_{M-1,0},p_{M-1,1},\ldots,p_{M-1,N-1})]\}$
wherein $p_{m,n}=(x_{m,n},y_{m,n})$ is the n-th transformed human body key point coordinate of the m-th transformed image $I_m$, the side length of the transformed image is L, and L is a positive integer;
each pixel value of the transformed image $I_m$ is obtained by sampling from the image [formula image RE-FDA0003073926600000021], $x_{indices,m}$ is the list of abscissas of the positions sampled from the image [formula image RE-FDA0003073926600000022], and $y_{indices,m}$ is the list of ordinates of the positions sampled from the image [formula image RE-FDA0003073926600000023], specifically obtained as follows:
$x_{indices,m}=(x_{face,m}+\mathrm{warp}_{RoI,m}(0),\;x_{face,m}+\mathrm{warp}_{RoI,m}(1),\;\ldots,\;x_{face,m}+\mathrm{warp}_{RoI,m}(L-1))$
$y_{indices,m}=(y_{face,m}+\mathrm{warp}_{RoI,m}(0),\;y_{face,m}+\mathrm{warp}_{RoI,m}(1),\;\ldots,\;y_{face,m}+\mathrm{warp}_{RoI,m}(L-1))$
$\mathrm{warp}_{RoI,m}(t)=a_m/2\cdot\operatorname{arctanh}(2t/L-0.9)$
wherein $\mathrm{warp}_{RoI,m}(t)$ is the region-of-interest transformation function of the m-th image, t is the function input, and t = 0, 1, 2, ..., L-1;
the region-of-interest transformation of the image uses the remap method of the OpenCV image processing library, with the parameter map1 set to $x_{indices,m}$ and the parameter map2 set to $y_{indices,m}$; and
the human body key points $(x_{m,n},y_{m,n})$ after the region-of-interest transformation are calculated by a traversal method.
4. The method for detecting key points of a human body based on region-of-interest transformation according to claim 3, wherein in step 4, the human body key points $(x_{m,n},y_{m,n})$ after the region-of-interest transformation are calculated by a traversal method, and the traversal calculation process comprises:
traversing t over the range t = 0, 1, 2, ..., L-1 to find
[formula image RE-FDA0003073926600000024]
the value of t at this point being the abscissa $x_{m,n}$ of the transformed key point; and
traversing t to find
[formula image RE-FDA0003073926600000025]
the value of t at this point being the ordinate $y_{m,n}$ of the transformed key point.
5. The method for detecting human body key points based on region-of-interest transformation according to claim 3, wherein in step 5, the human body key point detection model for detecting human body key points is implemented with a CNN-based network, and during training the loss function of the m-th data item is expressed as:
[formula image RE-FDA0003073926600000026]
wherein $(x_{m,n},y_{m,n})$ is the n-th human body key point of the m-th training sample in the data set after the region-of-interest transformation, and $(x'_{m,n},y'_{m,n})$ is the n-th human body key point predicted by the model for the m-th region-of-interest-transformed training image.
6. The method for detecting human body key points based on region-of-interest transformation according to claim 3, wherein in step 8, performing region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation comprises:
for the human body key points $(x_{test,n},y_{test,n})$ output by the human body key point detection model of step 5, obtaining the human body key points $(x_{src,test,n},y_{src,test,n})$ of the image to be detected before transformation by using the following region-of-interest inverse transformation formulas:
$x_{src,test,n}=x_{test,face}+\mathrm{warp}_{RoI,test}(x_{test,n})$
$y_{src,test,n}=y_{test,face}+\mathrm{warp}_{RoI,test}(y_{test,n})$
wherein $(x_{test,face},y_{test,face})$ is the center point of the face bounding box, and $a_{test}$ is the length of the long side of the face bounding box; $x_{test,indices}$ and $y_{test,indices}$ are the positions sampled from the image to be detected before transformation when it is subjected to the region-of-interest transformation, $x_{test,indices}$ being the list of sampled position abscissas and $y_{test,indices}$ the list of sampled position ordinates;
wherein the region-of-interest transformation of the image to be detected before transformation uses the remap method of the OpenCV image processing library, with the parameter map1 set to $x_{test,indices}$ and the parameter map2 set to $y_{test,indices}$:
$x_{test,indices}=(x_{test,face}+\mathrm{warp}_{RoI,test}(0),\;x_{test,face}+\mathrm{warp}_{RoI,test}(1),\;\ldots,\;x_{test,face}+\mathrm{warp}_{RoI,test}(L-1))$
$y_{test,indices}=(y_{test,face}+\mathrm{warp}_{RoI,test}(0),\;y_{test,face}+\mathrm{warp}_{RoI,test}(1),\;\ldots,\;y_{test,face}+\mathrm{warp}_{RoI,test}(L-1))$
$\mathrm{warp}_{RoI,test}(t)=a_{test}/2\cdot\operatorname{arctanh}(2t/L-0.9)$
wherein $\mathrm{warp}_{RoI,test}(t)$ is the region-of-interest transformation function of the image to be detected, t is the function input, and t = 0, 1, 2, ..., L-1.
7. The method for detecting human body key points based on region of interest transformation according to claim 3, wherein in the step 4 and the step 6, the side lengths L of the images after the region of interest transformation have the same value.
8. The method for detecting human key points based on region of interest transformation according to claim 6, wherein the side length L of the image after the region of interest transformation is 64 or 128.
9. A human key point detection device based on region of interest transform is characterized by comprising:
a module for acquiring M color images containing a human body, M being a natural number greater than 1000;
a module for labeling N human body key points on each color image to obtain labeling data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
a module for determining a face bounding box of the color image according to the coordinates of the labeled face key points;
a module for performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain a transformed image and transformed coordinates of key points of the human body; the face central point and the face size are determined according to the face bounding box;
a module for training a human body key point detection model for detecting human body key points based on the image after the transformation of the region of interest and the transformed human body key point coordinates;
a module for detecting, for an input image to be detected that contains a human body, a face bounding box by using a face detector, and then performing region-of-interest transformation to increase the proportion of the image occupied by the face and obtain a transformed image;
a module for detecting human body key points in the transformed image by using the trained human body key point detection model; and
a module for performing region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
10. A system for human keypoint detection based on region of interest transformations, comprising:
one or more processors;
a memory storing instructions that are operable, when executed by the one or more processors, to cause the one or more processors to perform operations comprising a flow of a region of interest transform based human keypoint detection method according to any of claims 1-7.
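
By way of illustration only, the sampling lists of claim 3, the key point traversal of claim 4 and the inverse transformation of claim 6 might be sketched in Python with NumPy as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The function names, the constant CLIP and the example values are hypothetical. The nearest-position criterion used in the traversal is an assumption, because the corresponding formula in claim 4 survives only as an image reference. In addition, the argument 2t/L - 0.9 as printed exceeds 1 once t is greater than 0.95·L, where arctanh is undefined, so the sketch clips the argument; that clipping is an assumption of the sketch and is not stated in the claims.

```python
import numpy as np

CLIP = 0.999  # assumption: keeps the arctanh argument strictly inside (-1, 1)


def warp_roi(t, a, L):
    """warp_RoI(t) = a/2 * arctanh(2t/L - 0.9), as printed in claims 3 and 6."""
    arg = np.clip(2.0 * np.asarray(t, dtype=np.float64) / L - 0.9, -CLIP, CLIP)
    return a / 2.0 * np.arctanh(arg)


def sampling_indices(center, a, L):
    """Sampling positions along one axis: center + warp_RoI(t), t = 0..L-1."""
    return center + warp_roi(np.arange(L), a, L)


def build_maps(x_face, y_face, a, L):
    """Expand the two 1-D lists into the L x L float32 maps cv2.remap accepts."""
    xs = sampling_indices(x_face, a, L).astype(np.float32)
    ys = sampling_indices(y_face, a, L).astype(np.float32)
    map_x, map_y = np.meshgrid(xs, ys)  # map_x[r, c] = xs[c], map_y[r, c] = ys[r]
    return map_x, map_y


def forward_keypoint(src_xy, face_xy, a, L):
    """Traverse t = 0..L-1 per axis and keep the t whose sampling position is
    nearest to the source coordinate (assumed criterion, see lead-in)."""
    return tuple(
        int(np.argmin(np.abs(sampling_indices(center, a, L) - src)))
        for src, center in zip(src_xy, face_xy)
    )


def inverse_keypoint(warped_xy, face_xy, a, L):
    """Claim 6: source coordinate = face center + warp_RoI(warped coordinate)."""
    return tuple(
        float(center + warp_roi(t, a, L)) for t, center in zip(warped_xy, face_xy)
    )


# Hypothetical example values: side length, face size and face center.
L, a, face = 64, 40.0, (100.0, 80.0)
map_x, map_y = build_maps(face[0], face[1], a, L)
# With OpenCV installed, the warped image would be obtained with something like:
#   warped = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)
t_xy = forward_keypoint((110.0, 75.0), face, a, L)
back_xy = inverse_keypoint(t_xy, face, a, L)
```

Note that cv2.remap takes two-dimensional maps with the shape of the output image, which is why the one-dimensional lists are expanded with np.meshgrid here. Because the traversal snaps each key point to an integer position, the round trip recovers the source coordinate only to within the local sampling spacing.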
CN202110478213.5A 2021-04-30 2021-04-30 Human body key point detection method, device and system based on region-of-interest transformation Active CN113111850B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110478213.5A CN113111850B (en) 2021-04-30 2021-04-30 Human body key point detection method, device and system based on region-of-interest transformation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110478213.5A CN113111850B (en) 2021-04-30 2021-04-30 Human body key point detection method, device and system based on region-of-interest transformation

Publications (2)

Publication Number Publication Date
CN113111850A true CN113111850A (en) 2021-07-13
CN113111850B CN113111850B (en) 2022-08-16

Family

ID=76720661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110478213.5A Active CN113111850B (en) 2021-04-30 2021-04-30 Human body key point detection method, device and system based on region-of-interest transformation

Country Status (1)

Country Link
CN (1) CN113111850B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117422721A (en) * 2023-12-19 2024-01-19 天河超级计算淮海分中心 Intelligent labeling method based on lower limb CT image

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190370537A1 (en) * 2018-05-29 2019-12-05 Umbo Cv Inc. Keypoint detection to highlight subjects of interest
CN110807448A (en) * 2020-01-07 2020-02-18 南京甄视智能科技有限公司 Human face key point data enhancement method, device and system and model training method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190370537A1 (en) * 2018-05-29 2019-12-05 Umbo Cv Inc. Keypoint detection to highlight subjects of interest
CN110807448A (en) * 2020-01-07 2020-02-18 南京甄视智能科技有限公司 Human face key point data enhancement method, device and system and model training method
CN111178337A (en) * 2020-01-07 2020-05-19 南京甄视智能科技有限公司 Human face key point data enhancement method, device and system and model training method

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117422721A (en) * 2023-12-19 2024-01-19 天河超级计算淮海分中心 Intelligent labeling method based on lower limb CT image
CN117422721B (en) * 2023-12-19 2024-03-08 天河超级计算淮海分中心 Intelligent labeling method based on lower limb CT image

Also Published As

Publication number Publication date
CN113111850B (en) 2022-08-16

Similar Documents

Publication Publication Date Title
WO2022002150A1 (en) Method and device for constructing visual point cloud map
CN108388896B (en) License plate identification method based on dynamic time sequence convolution neural network
CN108256394B (en) Target tracking method based on contour gradient
CN111104816B (en) Object gesture recognition method and device and camera
CN109118473B (en) Angular point detection method based on neural network, storage medium and image processing system
CN106599830B (en) Face key point positioning method and device
WO2020177432A1 (en) Multi-tag object detection method and system based on target detection network, and apparatuses
CN111445459B (en) Image defect detection method and system based on depth twin network
CN111709980A (en) Multi-scale image registration method and device based on deep learning
CN111461113B (en) Large-angle license plate detection method based on deformed plane object detection network
CN112818969A (en) Knowledge distillation-based face pose estimation method and system
WO2018035794A1 (en) System and method for measuring image resolution value
CN113808180B (en) Heterologous image registration method, system and device
CN111415339B (en) Image defect detection method for complex texture industrial product
CN110930378A (en) Emphysema image processing method and system based on low data demand
CN113011401A (en) Face image posture estimation and correction method, system, medium and electronic equipment
CN113111850B (en) Human body key point detection method, device and system based on region-of-interest transformation
CN111523586A (en) Noise-aware-based full-network supervision target detection method
CN108992033B (en) Grading device, equipment and storage medium for vision test
CN117541652A (en) Dynamic SLAM method based on depth LK optical flow method and D-PROSAC sampling strategy
CN117253062A (en) Relay contact image characteristic quick matching method under any gesture
CN113111849B (en) Human body key point detection method, device, system and computer readable medium
CN114419716B (en) Calibration method for face image face key point calibration
CN111768436B (en) Improved image feature block registration method based on fast-RCNN
CN112633078B (en) Target tracking self-correction method, system, medium, equipment, terminal and application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: No.568 longmian Avenue, gaoxinyuan, Jiangning District, Nanjing City, Jiangsu Province, 211000

Patentee after: Xiaoshi Technology (Jiangsu) Co.,Ltd.

Address before: No.568 longmian Avenue, gaoxinyuan, Jiangning District, Nanjing City, Jiangsu Province, 211000

Patentee before: NANJING ZHENSHI INTELLIGENT TECHNOLOGY Co.,Ltd.