CN113111850B - Human body key point detection method, device and system based on region-of-interest transformation - Google Patents

Info

Publication number
CN113111850B
CN113111850B CN202110478213.5A CN202110478213A CN113111850B CN 113111850 B CN113111850 B CN 113111850B CN 202110478213 A CN202110478213 A CN 202110478213A CN 113111850 B CN113111850 B CN 113111850B
Authority
CN
China
Prior art keywords
face
test
image
human body
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110478213.5A
Other languages
Chinese (zh)
Other versions
CN113111850A (en)
Inventor
杨帆
郝强
潘鑫淼
胡建国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiaoshi Technology Jiangsu Co ltd
Original Assignee
Nanjing Zhenshi Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Zhenshi Intelligent Technology Co Ltd
Priority to CN202110478213.5A
Publication of CN113111850A
Application granted
Publication of CN113111850B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method, a device and a system for detecting human body key points based on region-of-interest transformation. During model training, a region-of-interest transformation is applied to the human body key point data, and the human body key point model is trained on the transformed data. During detection, human body key points are detected with the trained model and then inverse-transformed to obtain the human body key points of the image before transformation. The invention normalizes the data to a uniform form, which mitigates the large data variation found in open scenes and reduces the training difficulty. The region-of-interest transformation also increases the proportion of the image occupied by the face, which benefits the prediction of face key points and thereby improves the overall accuracy of the human body key points. Compared with methods that predict body and face key points separately, the method needs only one face detector and one key point detector, so its computational cost is low.

Description

Human body key point detection method, device and system based on region-of-interest transformation
Technical Field
The invention relates to the technical field of image processing, in particular to human face detection and recognition, and specifically relates to a human body key point detection method, device and system based on region-of-interest transformation.
Background
The task of human body key point detection is to locate the key points of the face and limbs in a human body image. Human body image data captured in uncontrolled scenes vary widely, for example in the people themselves, their clothing, poses, occlusions and background environments, and the face occupies only a small proportion of the image, all of which makes training a human body key point detection model difficult.
Existing human body key point detection methods fall mainly into two types. The first type detects the position of the human body in the image, crops the human body image, and then detects the key points in the cropped image. Because the face occupies only a small proportion of the image, face key point prediction is inaccurate; and because face key points are usually numerous while limb key points are few, the overall accuracy suffers.
The second type detects the positions of both the human body and the face: the human body image and the face image are cropped first, and then the limb key points and the face key points are detected separately. Although this approach is accurate, it requires predictions from multiple models and is computationally time-consuming.
Disclosure of Invention
The invention aims to provide a method, a device and a system for detecting human key points based on region-of-interest transformation.
In order to achieve the above object, a first aspect of the present invention provides a method for detecting human key points based on region of interest transformation, including the following steps:
step 1, obtaining M color images containing a human body, wherein M is a natural number greater than 1000;
step 2, labeling N human body key points on each color image to obtain labeling data; the human body key points comprise face key points and limb key points, and the number of face key points is greater than the number of limb key points;
step 3, determining a face bounding box of the color image according to the coordinates of the labeled face key points;
step 4, performing region-of-interest transformation on each color image and its labeling data according to the face center point and the face size to obtain transformed images and transformed human body key point coordinates; the face center point and the face size are determined from the face bounding box;
step 5, training a human body key point detection model for detecting human body key points based on the region-of-interest-transformed images and the transformed human body key point coordinates;
step 6, for an input image to be detected containing a human body, detecting a face bounding box by using a face detector, and then performing region-of-interest transformation according to the method in step 4, so as to increase the proportion of the face in the image and obtain a transformed image;
step 7, detecting the human body key points in the transformed image by using the human body key point detection model trained in step 5; and
step 8, performing region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
The second aspect of the present invention further provides a human body key point detection device based on region of interest transformation, including:
a module for acquiring M color images including a human body, M being a natural number greater than 1000;
a module for labeling N human body key points on each color image to obtain labeling data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
a module for determining a face bounding box of the color image according to the coordinates of the labeled face key points;
a module for performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain a transformed image and transformed coordinates of key points of the human body; the face central point and the face size are determined according to the face bounding box;
a module for training a human body key point detection model for detecting human body key points based on the image after the transformation of the region of interest and the transformed human body key point coordinates;
a module for detecting a human face boundary box by using a human face detector for an input image to be detected containing a human body, then carrying out region-of-interest transformation, improving the proportion of the human face in the image and obtaining a transformed image;
a module for detecting human key points in the transformed image using a trained human key point detection model; and
a module for performing region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
The third aspect of the present invention further provides a system for human body keypoint detection based on region of interest transformation, comprising:
one or more processors;
a memory storing instructions that are operable, when executed by the one or more processors, to cause the one or more processors to perform operations comprising a flow of a human keypoint detection method based on region of interest transformation as previously described.
Compared with the prior art, the technical scheme of the invention has the following remarkable beneficial effects:
The invention addresses the difficulty of human body key point detection caused by large scene variation and a small face proportion in human body image data captured in open environments. It applies a face-centered region-of-interest transformation to the data and trains the human body key point detection model on the transformed data. During actual detection, after the face detector has detected the face, the region-of-interest transformation is applied to the image, the human body key points are detected, and an inverse transformation yields the key point data of the original image to be detected. On the one hand, this adjusts the data to a uniform form and reduces the training difficulty. On the other hand, because the face key points far outnumber the body key points, increasing the proportion of the face in the image allows the face key points to be predicted more accurately, which improves the overall performance and accuracy of the human body key point detection model. In addition, the method needs only one face detector and one key point detector, so its computational cost is low.
It should be understood that all combinations of the foregoing concepts and additional concepts described in greater detail below can be considered as part of the inventive subject matter of this disclosure unless such concepts are mutually inconsistent. In addition, all combinations of claimed subject matter are considered a part of the presently disclosed subject matter.
The foregoing and other aspects, embodiments and features of the present teachings can be more fully understood from the following description taken in conjunction with the accompanying drawings. Additional aspects of the present invention, such as features and/or advantages of exemplary embodiments, will be apparent from the description which follows, or may be learned by practice of specific embodiments in accordance with the teachings of the present invention.
Drawings
The drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures may be represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. Embodiments of various aspects of the present invention will now be described, by way of example, with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of a training process of a human key point detection model based on region of interest transformation according to an exemplary embodiment of the present invention.
FIG. 2 is a schematic diagram of a model structure of the human body key point detection model of the present invention.
FIG. 3 is a schematic diagram illustrating a process of detecting key points of a human body by using the model shown in FIG. 1 according to the embodiment of the invention.
Detailed Description
In order to better understand the technical content of the present invention, specific embodiments are described below with reference to the accompanying drawings.
In this disclosure, aspects of the present invention are described with reference to the accompanying drawings, in which a number of illustrative embodiments are shown. Embodiments of the present disclosure are not necessarily intended to include all aspects of the invention. It should be appreciated that the various concepts and embodiments described above, as well as those described in greater detail below, may be implemented in any of numerous ways, as the disclosed concepts and embodiments are not limited to any one implementation. In addition, some aspects of the present disclosure may be used alone, or in any suitable combination with other aspects of the present disclosure.
Referring to FIGS. 1, 2 and 3, the method for detecting human body key points based on region-of-interest transformation provided by the invention comprises a model training process and a model detection process. During model training, a region-of-interest transformation is applied to the human body key point data, and the human body key point model is trained on the transformed data. During detection, human body key points are detected with the trained model and then inverse-transformed to obtain the human body key points of the image before transformation.
The model training process normalizes the data to a uniform form, which mitigates the large data variation in open scenes and reduces the training difficulty. The region-of-interest transformation also increases the proportion of the face in the image, which benefits the prediction of face key points and thereby improves the overall accuracy of the human body key points. Compared with methods that predict body and face key points separately, the method needs only one face detector and one key point detector, so its computational cost is low.
As shown in FIGS. 1 and 3, the method for detecting human body key points based on region-of-interest transformation according to the embodiment of the present disclosure includes the following steps:
step 1, obtaining M color images containing a human body, wherein M is a natural number more than 1000;
step 2, marking N human body key points on each color image to obtain marking data;
step 3, determining a face boundary frame of the color image according to the coordinates of the labeled face key points;
step 4, performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain transformed images and transformed human body key point coordinates; the face central point and the face size are determined according to the face bounding box;
step 5, training a human body key point detection model for detecting the human body key points based on the image after the region of interest is transformed and the transformed human body key point coordinates;
step 6, for an input image to be detected containing a human body, detecting a face bounding box by using a face detector, and then performing region-of-interest transformation according to the method in step 4, so as to increase the proportion of the face in the image and obtain a transformed image;
step 7, detecting the human body key points in the transformed image by using the human body key point detection model trained in step 5; and
step 8, performing region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
Human body key point data acquisition and labeling
In step 1, the base images of the training set are constructed by acquiring M color images containing a human body, where M is greater than 1000. In particular, the image data should cover as many scenes as possible, such as different people, clothing, poses, occlusions, and background environments.
In step 2, N human body key points are labeled on each color image, and the resulting labeling data are expressed as:
$$\{[I^{src}_0,(p^{src}_{0,0},p^{src}_{0,1},\dots,p^{src}_{0,N-1})],[I^{src}_1,(p^{src}_{1,0},p^{src}_{1,1},\dots,p^{src}_{1,N-1})],\dots,[I^{src}_{M-1},(p^{src}_{M-1,0},p^{src}_{M-1,1},\dots,p^{src}_{M-1,N-1})]\}$$
wherein $p^{src}_{m,n}=(x^{src}_{m,n},y^{src}_{m,n})$ is the n-th key point coordinate of the m-th image $I^{src}_m$, m = 0, 1, 2, ..., M-1, n = 0, 1, 2, ..., N-1.
The labeled human key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points. From the face key points, a face bounding box for the face can be determined.
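As a minimal sketch of how a face bounding box could be derived from the labeled face key points, the Python below takes the tightest axis-aligned box around the points and then reads off the face center point and face size (center of the box and length of its long side, as used for the region-of-interest transformation in step 4). The tight axis-aligned box, the function names and the sample coordinates are assumptions made for illustration; the source does not specify how the box is fitted to the key points.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def face_box_from_keypoints(face_points: List[Point]) -> Box:
    """Tightest axis-aligned box enclosing the face key points (assumed rule)."""
    xs = [p[0] for p in face_points]
    ys = [p[1] for p in face_points]
    return min(xs), min(ys), max(xs), max(ys)


def face_center_and_size(box: Box) -> Tuple[Point, float]:
    """Face center = box center; face size = length of the box's long side."""
    x_min, y_min, x_max, y_max = box
    center = ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)
    size = max(x_max - x_min, y_max - y_min)
    return center, size


# Hypothetical face key points, for illustration only.
points = [(40.0, 50.0), (60.0, 48.0), (50.0, 80.0)]
box = face_box_from_keypoints(points)
center, size = face_center_and_size(box)
print(box, center, size)
```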
Region of interest transformation
In step 4, performing region-of-interest transformation on each color image and its labeling data according to the face center point and the face size to obtain transformed images and transformed key point coordinates includes:
taking the center point of the face bounding box as the face center point and the length of the long side of the bounding box as the face size, and, according to the face center point and the face size, performing region-of-interest transformation on the image $I^{src}_m$ (the m-th image before transformation) and its corresponding human body key points, the transformed data being expressed as:
$$\{[I_0,(p_{0,0},p_{0,1},\dots,p_{0,N-1})],[I_1,(p_{1,0},p_{1,1},\dots,p_{1,N-1})],\dots,[I_{M-1},(p_{M-1,0},p_{M-1,1},\dots,p_{M-1,N-1})]\}$$
wherein $p_{m,n}=(x_{m,n},y_{m,n})$ is the n-th transformed human body key point coordinate of the m-th transformed image $I_m$; the side length of the transformed image is L, and L is a positive integer; in a preferred example, $L \ge 64$; in the present example, L is 64 or 128;
each pixel value of the transformed image $I_m$ is sampled from the image $I^{src}_m$ (i.e. the image before transformation); $x_{indices,m}$ is the list of abscissas of the sampling positions in the image $I^{src}_m$, and $y_{indices,m}$ is the list of ordinates of the sampling positions in the image $I^{src}_m$, obtained as follows:
$$x_{indices,m}=(x_{face,m}+warp_{RoI,m}(0),\ x_{face,m}+warp_{RoI,m}(1),\ \dots,\ x_{face,m}+warp_{RoI,m}(L-1))$$
$$y_{indices,m}=(y_{face,m}+warp_{RoI,m}(0),\ y_{face,m}+warp_{RoI,m}(1),\ \dots,\ y_{face,m}+warp_{RoI,m}(L-1))$$
$$warp_{RoI,m}(t)=\frac{a_m}{2}\cdot\operatorname{arctanh}\left(\frac{2t}{L}-0.9\right)$$
wherein $warp_{RoI,m}(t)$ is the region-of-interest transformation function of the m-th image, t is the function input, and t = 0, 1, 2, ..., L-1; here $(x_{face,m}, y_{face,m})$ denotes the face center point and $a_m$ the face size of the m-th image.
The region-of-interest transformation of the image uses the remap method of the OpenCV image processing library, with the parameter map1 set to $x_{indices,m}$ and the parameter map2 set to $y_{indices,m}$.
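The sampling lists above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patented implementation: the function names and the sample values are hypothetical. Note that with the formula exactly as printed, the argument 2t/L - 0.9 reaches 1 at about t = 0.95L, where arctanh is undefined; the sketch therefore returns None for indices whose argument falls outside (-1, 1). That handling is a choice made for this sketch and is not specified by the source.

```python
import math
from typing import List, Optional


def warp_roi(t: float, a: float, L: int) -> Optional[float]:
    """Printed formula: warp(t) = a/2 * arctanh(2t/L - 0.9).

    Returns None when the argument is outside (-1, 1), where arctanh
    is undefined (handling chosen for this sketch, not by the source).
    """
    u = 2.0 * t / L - 0.9
    if not -1.0 < u < 1.0:
        return None
    return a / 2.0 * math.atanh(u)


def sampling_positions(face_coord: float, a: float, L: int) -> List[Optional[float]]:
    """Source-image coordinate sampled for each output index t = 0..L-1."""
    positions: List[Optional[float]] = []
    for t in range(L):
        offset = warp_roi(t, a, L)
        positions.append(None if offset is None else face_coord + offset)
    return positions


# Hypothetical values: face center abscissa 100, face size 40, output side 100.
x_indices = sampling_positions(100.0, 40.0, 100)
print(x_indices[0], x_indices[45], x_indices[99])
```

Because arctanh is steep near the ends of its domain and flat near zero, output pixels near t = 0.45L sample the source densely around the face center, while pixels toward the borders sample it ever more sparsely; this is what enlarges the face relative to the rest of the body. When the lists are handed to OpenCV, `cv2.remap` expects `map1` and `map2` to have the size of the destination image, so the one-dimensional lists would typically be expanded into L×L float32 grids first.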
The human body key points $(x_{m,n}, y_{m,n})$ after region-of-interest transformation are calculated by a traversal method, as follows:
traverse t over the range t = 0, 1, 2, ..., L-1 to find
Figure BDA0003048079120000055
the value of t at this point is the abscissa $x_{m,n}$ of the transformed key point; and
traverse t in the same way to find
Figure BDA0003048079120000056
the value of t at this point is the ordinate $y_{m,n}$ of the transformed key point.
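The traversal can be sketched as below. The exact search condition is contained in equation images that are not reproduced in this text, so the sketch assumes it selects the index t whose sampling position, face coordinate plus warp(t), lies nearest to the key point's original coordinate. That criterion, the function names and the sample values are assumptions for illustration.

```python
import math
from typing import Optional


def warp_roi(t: float, a: float, L: int) -> Optional[float]:
    """Printed formula a/2 * arctanh(2t/L - 0.9); None outside its domain."""
    u = 2.0 * t / L - 0.9
    if not -1.0 < u < 1.0:
        return None
    return a / 2.0 * math.atanh(u)


def transform_coordinate(src_coord: float, face_coord: float, a: float, L: int) -> Optional[int]:
    """Traverse t = 0..L-1 and return the t whose sampling position
    face_coord + warp(t) is closest to src_coord (assumed criterion)."""
    best_t: Optional[int] = None
    best_err = math.inf
    for t in range(L):
        offset = warp_roi(t, a, L)
        if offset is None:  # index outside the arctanh domain, skipped
            continue
        err = abs(face_coord + offset - src_coord)
        if err < best_err:
            best_t, best_err = t, err
    return best_t


# Hypothetical values: face center abscissa 100, face size 40, output side 100.
x_transformed = transform_coordinate(100.0, 100.0, 40.0, 100)
print(x_transformed)
```

The same function would be applied to the ordinate using the face center ordinate. A key point lying exactly at the face center lands at t = 0.45L under the printed formula, not at L/2.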
Human body key point training detection model
In step 5, the human body key point detection model for detecting human body key points is implemented as a CNN; for example, the model structure shown in FIG. 2 consists of convolutional layers, max pooling layers and fully connected layers.
Each convolutional layer has a 3 × 3 kernel, a stride of 1 and Same Padding as its zero-padding method; the number of kernels is indicated in parentheses for each convolutional layer in FIG. 2.
Each max pooling layer has a 2 × 2 pooling window and a stride of 2.
The first fully connected layer has 1024 neurons and the second fully connected layer has 2N neurons.
Each convolutional layer and the first fully connected layer is followed by a ReLU activation function.
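To make the layer arithmetic concrete, the sketch below walks the feature-map size through such a network. FIG. 2 is not reproduced in this text, so the number of blocks, the one-convolution-per-pooling arrangement and the filter counts used here are hypothetical; only the 3 × 3 Same Padding convolutions, the 2 × 2 stride-2 pooling, the 1024-unit layer and the 2N-unit output come from the description.

```python
from typing import List, Tuple


def conv_same(side: int) -> int:
    """3x3 convolution, stride 1, Same Padding: spatial size unchanged."""
    return side


def max_pool(side: int) -> int:
    """2x2 max pooling, stride 2: spatial size halved (floor)."""
    return side // 2


def flattened_size(L: int, filters_per_block: List[int]) -> int:
    """Input size of the first fully connected layer, assuming each
    block is one convolution followed by one pooling layer."""
    side = L
    for _ in filters_per_block:
        side = max_pool(conv_same(side))
    return side * side * filters_per_block[-1]


def head_summary(L: int, N: int, filters_per_block: List[int]) -> List[Tuple[str, int]]:
    """Sizes of the flatten output and the two fully connected layers."""
    return [
        ("flatten", flattened_size(L, filters_per_block)),
        ("fc1 + ReLU", 1024),
        ("fc2 (x, y per key point)", 2 * N),
    ]


# Hypothetical configuration: L = 64, N = 68 key points, three blocks.
for name, units in head_summary(64, 68, [32, 64, 128]):
    print(name, units)
```

The output layer has 2N units because the model regresses one (x, y) pair per key point.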
During training, the loss function for the m-th sample is expressed as:
Figure BDA0003048079120000061
wherein $(x_{m,n}, y_{m,n})$ is the n-th human body key point of the m-th training sample in the data set after region-of-interest transformation, and $(x'_{m,n}, y'_{m,n})$ is the n-th human body key point predicted by the model for the m-th region-of-interest-transformed training image.
In this way, a detection model for detecting key points in region-of-interest-transformed human body images is trained from the transformed images and the transformed human body key point coordinates.
Human body key point detection application
As an example shown in fig. 3, the human key point detection process for an input image to be detected containing a human body includes:
firstly, detecting a human face boundary frame by using a human face detector, then carrying out region-of-interest transformation according to the method in the step 4, and improving the proportion of the human face in the image to obtain a transformed image;
then, detecting the human key points in the transformed image by using a human key point detection model obtained by training; and
and finally, carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
The face detector may use a tool such as Dlib to determine the bounding box of the face. It should be understood that, in implementing the present invention, face detection is not limited to the Dlib tool and may also be performed with other pre-trained face detection models.
According to the center point $(x_{test,face}, y_{test,face})$ of the face bounding box and the length $a_{test}$ of the long side of the face bounding box, the remap method of the OpenCV image processing library is used to perform region-of-interest transformation on the image to be detected, with the parameter map1 set to $x_{test,indices}$ and the parameter map2 set to $y_{test,indices}$. These are calculated as follows:
$$x_{test,indices}=(x_{test,face}+warp_{RoI,test}(0),\ x_{test,face}+warp_{RoI,test}(1),\ \dots,\ x_{test,face}+warp_{RoI,test}(L-1))$$
$$y_{test,indices}=(y_{test,face}+warp_{RoI,test}(0),\ y_{test,face}+warp_{RoI,test}(1),\ \dots,\ y_{test,face}+warp_{RoI,test}(L-1))$$
$$warp_{RoI,test}(t)=\frac{a_{test}}{2}\cdot\operatorname{arctanh}\left(\frac{2t}{L}-0.9\right)$$
wherein $warp_{RoI,test}(t)$ is the region-of-interest transformation function of the image to be detected, t is the function input, and t = 0, 1, 2, ..., L-1.
The key points $(x_{test,n}, y_{test,n})$ in the transformed image are detected using the human body key point detection model trained in step 5.
Then, region-of-interest inverse transformation is performed on the key points in the transformed image to obtain the human body key points $(x_{src,test,n}, y_{src,test,n})$ of the image to be detected before transformation:
$$x_{src,test,n}=x_{test,face}+warp_{RoI,test}(x_{test,n})$$
$$y_{src,test,n}=y_{test,face}+warp_{RoI,test}(y_{test,n})$$
The human body key point data of the image to be detected before transformation are thereby obtained.
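The inverse transformation is a direct evaluation of the warp function at the predicted coordinates, which can be sketched as follows. The function names and sample values are hypothetical. The sketch applies the printed formula as-is, so a predicted coordinate whose argument 2t/L - 0.9 falls outside (-1, 1) makes `math.atanh` raise `ValueError`; how such predictions should be handled is not stated in the source.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def warp_roi(t: float, a: float, L: int) -> float:
    """Printed formula a/2 * arctanh(2t/L - 0.9).

    math.atanh raises ValueError when the argument is outside (-1, 1).
    """
    return a / 2.0 * math.atanh(2.0 * t / L - 0.9)


def inverse_transform(points: List[Point], face_center: Point, a_test: float, L: int) -> List[Point]:
    """Map key points predicted in the transformed image back to the
    coordinates of the image to be detected."""
    x_face, y_face = face_center
    return [
        (x_face + warp_roi(x, a_test, L), y_face + warp_roi(y, a_test, L))
        for x, y in points
    ]


# Hypothetical prediction: two key points in a 100 x 100 transformed image,
# face center (100, 200), face size 40.
restored = inverse_transform([(45.0, 45.0), (70.0, 30.0)], (100.0, 200.0), 40.0, 100)
print(restored)
```

Unlike the forward mapping of labels during training, no traversal is needed here, because the predicted coordinate is already the function input t.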
It should be understood that in step 4 and step 6, the side length L of the image after the region of interest transformation has the same value.
Test procedure
According to steps 1 and 2, 12000 groups of labeled human body key point data are prepared, comprising 10000 groups of training data and 2000 groups of test data. The data cover a variety of people, clothing, poses, occlusions and background environments. Region-of-interest transformation is applied to the 10000 groups of training data, the detection model is trained on the transformed data, and the trained human body key point model is verified on the test data. For comparison, a human body key point detection model trained directly on the original data is evaluated on the same test data.
The normalized mean error is used as the evaluation index, namely the Euclidean distance between the predicted coordinate and the labeled coordinate divided by the length of the diagonal of the human body bounding box. The comparison results are shown in Table 1.
TABLE 1. Comparison of test results of the existing method and the method of the invention
Method                      Normalized mean error
Existing method             6.32%
Method of the invention     4.94%
As the comparison shows, the model training method of the invention effectively improves model accuracy: compared with the existing method, the test error falls by 1.38 percentage points (from 6.32% to 4.94%), a relative reduction of about 21.8%.
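The evaluation index used in the test can be sketched for a single image as follows. Averaging the per-key-point distances before normalizing is an assumption suggested by the word "mean"; the function name, the box format and the sample values are likewise hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def normalized_mean_error(pred: List[Point], truth: List[Point], body_box: Box) -> float:
    """Mean Euclidean distance between predicted and labeled key points,
    divided by the diagonal length of the human body bounding box."""
    x_min, y_min, x_max, y_max = body_box
    diagonal = math.hypot(x_max - x_min, y_max - y_min)
    distances = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    return sum(distances) / len(distances) / diagonal


# Hypothetical predictions and labels for one image.
nme = normalized_mean_error(
    pred=[(3.0, 4.0), (0.0, 0.0)],
    truth=[(0.0, 0.0), (0.0, 0.0)],
    body_box=(0.0, 0.0, 30.0, 40.0),
)
print(f"{nme:.2%}")
```

A data-set level figure such as those in Table 1 would then be the average of this value over all test images.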
Human key point detection device based on region of interest transformation
According to the disclosure of the present invention, there is also provided a human body key point detection device based on region of interest transformation, comprising:
a module for acquiring M color images containing a human body, M being a natural number greater than 1000;
a module for labeling N human body key points on each color image to obtain labeling data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
a module for determining a face bounding box of the color image according to the coordinates of the labeled face key points;
a module for performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain a transformed image and transformed coordinates of key points of the human body; the face central point and the face size are determined according to the face bounding box;
a module for training a human body key point detection model for detecting human body key points based on the image after the transformation of the region of interest and the transformed human body key point coordinates;
a module for detecting a human face boundary box by using a human face detector for an input image to be detected containing a human body, then carrying out region-of-interest transformation, improving the proportion of the human face in the image and obtaining a transformed image;
a module for detecting human key points in the transformed image using a trained human key point detection model; and
a module for performing region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
It should be understood that the functions and implementation of the modules of the human body key point detection apparatus based on region of interest transformation of the present embodiment can be implemented based on the specific operations of the aforementioned human body key point detection method based on region of interest transformation.
System for human body key point detection based on region of interest transformation
According to the disclosure of the present invention, there is also provided a system for human keypoint detection based on region of interest transformation, comprising:
one or more processors;
a memory storing instructions that are operable, when executed by the one or more processors, to cause the one or more processors to perform operations comprising the flow of the human body key point detection method based on region-of-interest transformation as previously described, in particular the procedures of the detection method described with reference to FIGS. 1 and 3.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to be limited thereto. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention should be determined by the appended claims.

Claims (7)

1. A human body key point detection method based on region of interest transformation is characterized by comprising the following steps:
step 1, obtaining M color images containing a human body, wherein M is a natural number more than 1000;
step 2, marking N human body key points on each color image to obtain marking data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
step 3, determining a face boundary frame of the color image according to the coordinates of the labeled face key points;
step 4, performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain transformed images and transformed human body key point coordinates; the face central point and the face size are determined according to the face bounding box;
step 5, training a human body key point detection model for detecting the human body key points based on the image after the region of interest is transformed and the transformed human body key point coordinates;
step 6, for the input image to be detected containing the human body, detecting a human face bounding box by using a human face detector, and then carrying out region-of-interest transformation on the image to be detected according to the method in the step 4, so as to improve the proportion of the human face in the image and obtain a transformed image;
step 7, detecting the human key points in the transformed image by using the human key point detection model trained in the step 5; and
step 8, performing region-of-interest inverse transformation on the human body key points in the transformed image to obtain human body key points of the image to be detected before transformation;
in step 2, N human body key points are labeled on each color image, and the obtained labeling data are expressed as:
Figure FDA0003735779160000011
wherein
Figure FDA0003735779160000012
is the n-th key point coordinate of the m-th image
Figure FDA0003735779160000013
with m = 0, 1, 2, ..., M-1 and n = 0, 1, 2, ..., N-1;
in step 4, performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain transformed images and transformed key point coordinates comprises:
taking the center point of the face bounding box as the face center point and the length of the long side of the bounding box as the face size, and, according to the face center point and the face size, performing region-of-interest transformation on image
Figure FDA0003735779160000014
and the corresponding human body key points, the transformed data being expressed as:
$\{[I_0,(p_{0,0},p_{0,1},\ldots,p_{0,N-1})],\ [I_1,(p_{1,0},p_{1,1},\ldots,p_{1,N-1})],\ \ldots,\ [I_{M-1},(p_{M-1,0},p_{M-1,1},\ldots,p_{M-1,N-1})]\}$
wherein $p_{m,n}=(x_{m,n},y_{m,n})$ is the n-th transformed human body key point coordinate of the m-th transformed image $I_m$, the side length of the transformed image is L, and L is a positive integer;
each pixel value of the transformed image $I_m$ is sampled from image
Figure FDA0003735779160000015
where $x_{indices,m}$ is the list of abscissas of the positions sampled from image
Figure FDA0003735779160000016
and $y_{indices,m}$ is the list of ordinates of the positions sampled from image
Figure FDA0003735779160000021
these lists being obtained as follows:
$x_{indices,m} = (x_{face,m} + warp_{RoI,m}(0),\ x_{face,m} + warp_{RoI,m}(1),\ \ldots,\ x_{face,m} + warp_{RoI,m}(L-1))$
$y_{indices,m} = (y_{face,m} + warp_{RoI,m}(0),\ y_{face,m} + warp_{RoI,m}(1),\ \ldots,\ y_{face,m} + warp_{RoI,m}(L-1))$
$warp_{RoI,m}(t) = \frac{a_m}{2} \cdot \operatorname{arctanh}\left(\frac{2t}{L} - 0.9\right)$
wherein $warp_{RoI,m}(t)$ is the region-of-interest transformation function of the m-th image, t is the function input, and t = 0, 1, 2, ..., L-1;
the region-of-interest transformation of the image uses the remap method of the OpenCV image processing library, with the parameter map1 set to $x_{indices,m}$ and the parameter map2 set to $y_{indices,m}$;
the human body key points $(x_{m,n}, y_{m,n})$ after the region-of-interest transformation are calculated by a traversal method.
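The face-box and sampling-list computations of claim 1 can be sketched in plain Python as follows. This is an illustrative sketch only, not the patented implementation. The function names are hypothetical, and the bounding box is assumed to be the tightest axis-aligned box around the face key points, which the claim does not specify. The warp expression is evaluated exactly as it appears in the claim; because arctanh is only defined on (-1, 1), the sketch marks any t whose argument falls outside that interval as nan, a choice the claim does not state. OpenCV's remap expects map1 and map2 as 2-D float arrays of the output size, so in practice the two lists would presumably be tiled into such arrays before the call.

```python
import math

def face_box_from_keypoints(face_points):
    """Axis-aligned box around the face key points (assumption: tightest box)."""
    xs = [x for x, _ in face_points]
    ys = [y for _, y in face_points]
    return min(xs), min(ys), max(xs), max(ys)

def face_center_and_size(box):
    """Center of the box and the length of its long side."""
    x_min, y_min, x_max, y_max = box
    center = ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)
    size = max(x_max - x_min, y_max - y_min)
    return center, size

def warp_roi(t, a, L):
    """a/2 * arctanh(2t/L - 0.9) as printed in the claim; nan outside (-1, 1)."""
    u = 2.0 * t / L - 0.9
    return a / 2.0 * math.atanh(u) if -1.0 < u < 1.0 else math.nan

def sampling_positions(face_coord, a, L):
    """Source-image coordinate sampled for each output index t = 0..L-1."""
    return [face_coord + warp_roi(t, a, L) for t in range(L)]

# Hypothetical face key points in source-image pixel coordinates.
face_points = [(100.0, 80.0), (140.0, 82.0), (120.0, 130.0)]
box = face_box_from_keypoints(face_points)
(x_face, y_face), a = face_center_and_size(box)
x_indices = sampling_positions(x_face, a, 64)
y_indices = sampling_positions(y_face, a, 64)
```

Because arctanh grows steeply near the ends of its domain, consecutive output indices near the image border sample source positions that are far apart, while indices near the face sample densely; this is what raises the proportion of the face in the transformed image.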
2. The method for detecting human body key points based on region-of-interest transformation according to claim 1, wherein in step 4, the human body key points $(x_{m,n}, y_{m,n})$ after the region-of-interest transformation are calculated by a traversal method, the traversal calculation process comprising:
traversing t over the range t = 0, 1, 2, ..., L-1 to find
Figure FDA0003735779160000022
the value of t at this point being the abscissa $x_{m,n}$ of the transformed key point; and
traversing t to find
Figure FDA0003735779160000023
the value of t at this point being the ordinate $y_{m,n}$ of the transformed key point.
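The traversal of claim 2 might look like the sketch below. The selection criterion is contained in formula images that are not reproduced in this text, so the sketch assumes the traversal picks the t whose sampled source position is closest to the labeled source coordinate; that criterion, the function names, and the skipping of t values outside the domain of arctanh are all assumptions of this example.

```python
import math

def warp_roi(t, a, L):
    """a/2 * arctanh(2t/L - 0.9) as printed in claim 1; None outside (-1, 1)."""
    u = 2.0 * t / L - 0.9
    if not -1.0 < u < 1.0:
        return None
    return a / 2.0 * math.atanh(u)

def find_transformed_coord(src_coord, face_coord, a, L):
    """Traverse t = 0..L-1 and return the t whose sampled position is nearest.

    Assumed criterion: minimise |face_coord + warp(t) - src_coord|.
    """
    best_t, best_err = None, math.inf
    for t in range(L):
        offset = warp_roi(t, a, L)
        if offset is None:
            continue  # argument outside the domain of arctanh
        err = abs(face_coord + offset - src_coord)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Hypothetical values: face center abscissa 120, face size 50, L = 64.
x_src = 120.0 + warp_roi(20, 50.0, 64)
x_t = find_transformed_coord(x_src, 120.0, 50.0, 64)
```

The same function applies to the ordinate by passing the face center ordinate and the labeled ordinate instead.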
3. The method for detecting human body key points based on region-of-interest transformation according to claim 1, wherein in step 5, the human body key point detection model for detecting the human body key points is implemented with a CNN-based network, and in the training process the loss function of the m-th data sample is expressed as:
Figure FDA0003735779160000024
wherein $(x_{m,n}, y_{m,n})$ is the n-th human body key point of the m-th training sample in the data set after region-of-interest transformation, and $(x'_{m,n}, y'_{m,n})$ is the n-th human body key point predicted by the model for the m-th training image after region-of-interest transformation.
4. The method for detecting human body key points based on region-of-interest transformation according to claim 1, wherein in step 8, performing region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation comprises:
for the human body key points $(x_{test,n}, y_{test,n})$ output by the human body key point detection model of step 5, obtaining the human body key points $(x_{src,test,n}, y_{src,test,n})$ of the image to be detected before transformation using the following region-of-interest inverse transformation formulas:
$x_{src,test,n} = x_{test,face} + warp_{RoI,test}(x_{test,n})$
$y_{src,test,n} = y_{test,face} + warp_{RoI,test}(y_{test,n})$
wherein $(x_{test,face}, y_{test,face})$ represents the center point of the face bounding box and $a_{test}$ represents the length of the long side of the face bounding box; $x_{test,indices}$ and $y_{test,indices}$ respectively represent the positions sampled in the image to be detected before transformation when the region-of-interest transformation is performed on it, $x_{test,indices}$ being the list of sampling position abscissas and $y_{test,indices}$ the list of sampling position ordinates;
wherein the region-of-interest transformation of the image to be detected before transformation uses the remap method of the OpenCV image processing library, with the parameter map1 set to $x_{test,indices}$ and the parameter map2 set to $y_{test,indices}$:
$x_{test,indices} = (x_{test,face} + warp_{RoI,test}(0),\ x_{test,face} + warp_{RoI,test}(1),\ \ldots,\ x_{test,face} + warp_{RoI,test}(L-1))$
$y_{test,indices} = (y_{test,face} + warp_{RoI,test}(0),\ y_{test,face} + warp_{RoI,test}(1),\ \ldots,\ y_{test,face} + warp_{RoI,test}(L-1))$
$warp_{RoI,test}(t) = \frac{a_{test}}{2} \cdot \operatorname{arctanh}\left(\frac{2t}{L} - 0.9\right)$
wherein $warp_{RoI,test}(t)$ is the region-of-interest transformation function of the image to be detected, t is the function input, and t = 0, 1, 2, ..., L-1.
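The inverse transformation of claim 4 amounts to evaluating the warp function at each predicted coordinate and adding the face center. A minimal sketch under that reading follows; the function names and sample values are hypothetical, and the warp expression is used as printed. Python's math.atanh raises ValueError when its argument lies outside (-1, 1), so predicted coordinates for which 2t/L - 0.9 reaches 1 are not handled here.

```python
import math

def warp_roi(t, a, L):
    """a/2 * arctanh(2t/L - 0.9) as printed in the claim.

    math.atanh raises ValueError if the argument is outside (-1, 1).
    """
    return a / 2.0 * math.atanh(2.0 * t / L - 0.9)

def inverse_roi_transform(keypoints, face_center, a, L):
    """Map key points predicted in the transformed image back to the source image."""
    x_face, y_face = face_center
    return [
        (x_face + warp_roi(x, a, L), y_face + warp_roi(y, a, L))
        for x, y in keypoints
    ]

# Hypothetical detector output in a 64x64 transformed image, with a face
# bounding box centered at (120, 105) whose long side is 50 pixels.
predicted = [(20.0, 30.0), (29.0, 29.0)]
restored = inverse_roi_transform(predicted, (120.0, 105.0), 50.0, 64)
```

Unlike the forward key point mapping of claim 2, no traversal is needed in this direction, because the warp function gives the source coordinate directly.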
5. The method for detecting human body key points based on region of interest transformation according to claim 1, wherein in the step 4 and the step 6, the side lengths L of the images after the region of interest transformation have the same value.
6. The method for detecting human key points based on region of interest transformation according to claim 5, wherein the side length L of the image after the region of interest transformation is 64 or 128.
7. A system for human keypoint detection based on region of interest transformations, comprising:
one or more processors;
a memory storing instructions that are operable, when executed by the one or more processors, to cause the one or more processors to perform operations comprising a flow of a region of interest transform based human keypoint detection method according to any of claims 1-6.
CN202110478213.5A 2021-04-30 2021-04-30 Human body key point detection method, device and system based on region-of-interest transformation Active CN113111850B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110478213.5A CN113111850B (en) 2021-04-30 2021-04-30 Human body key point detection method, device and system based on region-of-interest transformation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110478213.5A CN113111850B (en) 2021-04-30 2021-04-30 Human body key point detection method, device and system based on region-of-interest transformation

Publications (2)

Publication Number Publication Date
CN113111850A CN113111850A (en) 2021-07-13
CN113111850B true CN113111850B (en) 2022-08-16

Family

ID=76720661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110478213.5A Active CN113111850B (en) 2021-04-30 2021-04-30 Human body key point detection method, device and system based on region-of-interest transformation

Country Status (1)

Country Link
CN (1) CN113111850B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117422721B (en) * 2023-12-19 2024-03-08 天河超级计算淮海分中心 Intelligent labeling method based on lower limb CT image

Citations (1)

Publication number Priority date Publication date Assignee Title
CN110807448A (en) * 2020-01-07 2020-02-18 南京甄视智能科技有限公司 Human face key point data enhancement method, device and system and model training method

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20190370537A1 (en) * 2018-05-29 2019-12-05 Umbo Cv Inc. Keypoint detection to highlight subjects of interest

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN110807448A (en) * 2020-01-07 2020-02-18 南京甄视智能科技有限公司 Human face key point data enhancement method, device and system and model training method
CN111178337A (en) * 2020-01-07 2020-05-19 南京甄视智能科技有限公司 Human face key point data enhancement method, device and system and model training method

Also Published As

Publication number Publication date
CN113111850A (en) 2021-07-13

Similar Documents

Publication Publication Date Title
WO2022002150A1 (en) Method and device for constructing visual point cloud map
CN108256394B (en) Target tracking method based on contour gradient
CN108388896B (en) License plate identification method based on dynamic time sequence convolution neural network
CN106599830B (en) Face key point positioning method and device
CN109118473B (en) Angular point detection method based on neural network, storage medium and image processing system
CN108108764B (en) Visual SLAM loop detection method based on random forest
CN108918536B (en) Tire mold surface character defect detection method, device, equipment and storage medium
CN111709980A (en) Multi-scale image registration method and device based on deep learning
CN111461113B (en) Large-angle license plate detection method based on deformed plane object detection network
CN110766041A (en) Deep learning-based pest detection method
CN107123130B (en) Kernel correlation filtering target tracking method based on superpixel and hybrid hash
CN112818969A (en) Knowledge distillation-based face pose estimation method and system
CN111415339B (en) Image defect detection method for complex texture industrial product
CN112381175A (en) Circuit board identification and analysis method based on image processing
CN113808180B (en) Heterologous image registration method, system and device
CN110659637A (en) Electric energy meter number and label automatic identification method combining deep neural network and SIFT features
CN114332942A (en) Night infrared pedestrian detection method and system based on improved YOLOv3
CN113111850B (en) Human body key point detection method, device and system based on region-of-interest transformation
CN109919215B (en) Target detection method for improving characteristic pyramid network based on clustering algorithm
CN110516527B (en) Visual SLAM loop detection improvement method based on instance segmentation
CN111523586A (en) Noise-aware-based full-network supervision target detection method
CN114841992A (en) Defect detection method based on cyclic generation countermeasure network and structural similarity
CN113111849B (en) Human body key point detection method, device, system and computer readable medium
CN114419716B (en) Calibration method for face image face key point calibration
CN111768436B (en) Improved image feature block registration method based on fast-RCNN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: No.568 longmian Avenue, gaoxinyuan, Jiangning District, Nanjing City, Jiangsu Province, 211000

Patentee after: Xiaoshi Technology (Jiangsu) Co.,Ltd.

Address before: No.568 longmian Avenue, gaoxinyuan, Jiangning District, Nanjing City, Jiangsu Province, 211000

Patentee before: NANJING ZHENSHI INTELLIGENT TECHNOLOGY Co.,Ltd.