CN116831560A

CN116831560A - Human height detection method based on skeleton key point recognition

Info

Publication number: CN116831560A
Application number: CN202310538428.0A
Authority: CN
Inventors: 郑煜涵; 蒋婉玥; 刘晓瑞; 葛树志; 刘银华; 张中浩; 张瑞
Original assignee: Qingdao University
Current assignee: Qingdao University
Priority date: 2023-05-12
Filing date: 2023-05-12
Publication date: 2023-10-03

Abstract

The invention discloses a human height detection method based on skeleton key point recognition. An image containing the target to be detected is first acquired in real time by a depth camera, and a human skeleton detector obtains the pixel coordinates of the target's skeleton key points in the image. The relative height from the nose to the ankle in the image is calculated from these pixel coordinates. The pixel coordinates of any skeleton key point are then passed to the depth camera to obtain the distance between that key point and the camera, the visible longitudinal distance of the depth camera is calculated, and finally the true height of the target is calculated. The method uses a depth camera together with skeleton detection to measure human height accurately, and can also do so when the person is in a non-standing posture. It needs only one depth image to output a reliable result, saving considerable manpower and time.

Description

Human height detection method based on skeleton key point recognition

Technical field:

The invention belongs to the technical field of human height detection, and in particular relates to a human height detection method based on skeleton key point recognition.

Background art:

In daily life, a person's height is a commonly needed piece of information. Judging height by eye is unreliable, whereas estimating it with a depth camera both improves accuracy and lets a computer acquire the height automatically for further processing. In recent years, height detection technology has developed alongside deep learning and image processing. Existing height detection techniques are nevertheless scarce: few of them use neural networks, and most require comparison against a reference object, which limits both accuracy and practicality. The field is moving toward more accurate, more diverse and more personalized methods. Most current height detection methods need a calibrated reference object and obtain the person's height by converting the scale ratio between the person and the reference object, so the reference object is difficult to determine and the algorithm is inconvenient to implement in a fixed scene.

Recently, researchers at the University of California, San Diego and Adobe proposed a monocular-vision measurement method that recovers the absolute dimensions of a scene and its targets by measuring the height of the targets in a photograph, the height of the camera, and the viewing-angle orientation parameters, and that achieves accurate monocular measurement in unrestricted environments. Fig. 1 shows its detection results. The detection principle is to scale against the white stool: when the size of the white stool changes, the detected human height changes; when no stool appears in the environment, the human height cannot be detected; and when the person adopts a different posture, the detected height also changes. Its real-time performance is also poor. Patent CN 115797432A, a method and device for estimating the absolute depth of an image, calculates the Marshall trunk index of the target from skeleton key points and then looks up the target's absolute height in a table that maps the Marshall trunk index to height.

In fields such as three-dimensional reconstruction, medical treatment and clothing sizing, human height data are indispensable. In most cases the person being measured is asked to stand upright and the height is then measured with a ruler or other tool, which consumes a lot of time and manpower. In practice, if no measuring tool is available, or if the person to be measured is a child or is injured and cannot stand straight, measuring the height is very difficult.

Summary of the invention:

The invention aims to provide a human height detection method based on skeleton key point recognition that solves two technical problems of existing height measurement techniques: the need for a reference object, and the inability to measure the height of a person in a non-standing posture. The method obtains the height of the target to be detected from an image of the target acquired by a depth camera, filling the current gap in height recognition by deep learning.

In order to achieve the above purpose, the invention relates to a human height detection method based on skeleton key point recognition, which specifically comprises the following steps:

(1) Acquiring, in real time through a depth camera, an image containing the target to be detected, the image size being M pixels × N pixels; when the target is standing, step (2) is performed, and when the target is not standing, step (3) is performed;

(2) Acquiring the pixel coordinates of the skeleton key points of the target to be detected in the image by using a human skeleton detector, the key points comprising the nose (X₀, Y₀) and the right ankle (X₄, Y₄), or the nose (X₀, Y₀) and the left ankle (X₇, Y₇).

The relative height Y from the nose to the ankle in the image is calculated from the nose (X₀, Y₀) and the right ankle (X₄, Y₄), or from the nose (X₀, Y₀) and the left ankle (X₇, Y₇):

[two alternative formulas, one per ankle; formula images not reproduced in this text]

(3) Acquiring the pixel coordinates of the skeleton key points of the target to be detected in the image by using a human skeleton detector, the key points comprising the nose (X₀, Y₀), neck (X₁, Y₁), right hip (X₂, Y₂), right knee (X₃, Y₃) and right ankle (X₄, Y₄), or the nose (X₀, Y₀), neck (X₁, Y₁), left hip (X₅, Y₅), left knee (X₆, Y₆) and left ankle (X₇, Y₇).

The relative distance A from the neck to the hip in the image is calculated:

[two alternative formulas, right side or left side; formula images not reproduced in this text]

The relative distance B from the hip to the knee in the image is calculated:

[two alternative formulas, right side or left side; formula images not reproduced in this text]

The relative distance C from the knee to the ankle in the image is calculated:

[two alternative formulas, right side or left side; formula images not reproduced in this text]

The relative height Y from the nose to the ankle in the image is then:

Y = Y₁ − Y₀ + A + B + C;

(4) Passing the pixel coordinates of any skeleton key point from step (2) or (3) to the depth camera to obtain the distance Z between that key point and the depth camera, and calculating the visible longitudinal distance H of the depth camera according to the following formula:

[formula image not reproduced in this text]

wherein Z is the distance between the target to be detected and the camera, and θ is the longitudinal angle of view of the image captured by the depth camera;

(5) Finally, calculating the true height S of the target to be detected according to the following formula:

[formula image not reproduced in this text]

where L is the true nose-to-ankle height, and μ is the proportionality constant between the true height S of the target and the nose-to-ankle height L.
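The formula images for steps (2) to (5) are not reproduced in this text. One reading that is consistent with the surrounding definitions is sketched below. It is an assumption, not the patent's verbatim formulas: the standing-posture height is taken as the vertical pixel difference (by analogy with the Y₁ − Y₀ term in step (3)), the segment lengths A, B and C as Euclidean pixel distances, H from the usual field-of-view geometry, and L from the proportion L : H = Y : N. Right-side formulas are shown; the left-side versions replace indices 2, 3, 4 with 5, 6, 7.

```latex
\begin{aligned}
\text{standing (assumed):}\quad & Y = Y_4 - Y_0 \\
\text{non-standing (assumed):}\quad
  & A = \sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2} \\
  & B = \sqrt{(X_3 - X_2)^2 + (Y_3 - Y_2)^2} \\
  & C = \sqrt{(X_4 - X_3)^2 + (Y_4 - Y_3)^2} \\
\text{given in the text:}\quad & Y = Y_1 - Y_0 + A + B + C \\
\text{visible height (assumed):}\quad & H = 2\,Z\,\tan\!\left(\tfrac{\theta}{2}\right) \\
\text{true height (assumed):}\quad & L = \frac{Y}{N}\,H, \qquad S = \mu\,L
\end{aligned}
```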

Specifically, the depth camera is a RealSense, and the human skeleton detector is OpenPose.

Specifically, the actual distance Z between the target to be detected and the camera is obtained from the function aligned_depth_frame.get_distance(x, y) of the RealSense depth camera.
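A minimal Python sketch of this depth lookup follows. It assumes a depth frame object exposing `get_distance(x, y)` in metres, as the pyrealsense2 `depth_frame` does; the helper name `keypoint_distance` and the fallback over several key points are illustrative additions, not part of the patent. The commented lines show one common way to obtain a depth frame aligned to the colour image.

```python
# Typical acquisition with pyrealsense2 (needs a connected camera, so left as comments):
#   import pyrealsense2 as rs
#   pipeline = rs.pipeline()
#   pipeline.start()
#   align = rs.align(rs.stream.color)            # align depth pixels to the RGB image
#   frames = align.process(pipeline.wait_for_frames())
#   aligned_depth_frame = frames.get_depth_frame()


def keypoint_distance(depth_frame, keypoints):
    """Return the distance Z (metres) of the first key point with a valid depth.

    depth_frame: any object with get_distance(x, y), e.g. a RealSense depth frame.
    keypoints:   iterable of (x, y) pixel coordinates from the skeleton detector.
    """
    for x, y in keypoints:
        z = depth_frame.get_distance(int(round(x)), int(round(y)))
        if z > 0:  # a reading of 0 means no depth was measured at this pixel
            return z
    raise ValueError("no key point has a valid depth reading")
```

Trying several key points is a design choice made here because a single pixel may have no depth reading; the patent itself only requires one key point.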

Compared with the prior art, the invention uses a depth camera together with skeleton detection to measure human height accurately, and can also measure height when the person is in a non-standing posture. The invention needs only one depth image to output a reliable result, saving considerable manpower and time.

Description of the drawings:

FIG. 1 is a diagram showing a method for measuring height of a human body based on monocular vision in the prior art.

FIG. 2 is a flow chart of a method for detecting height of a human body based on skeletal key point recognition according to the present invention.

Fig. 3 is a photograph of the RealSense camera used in Example 1.

Fig. 4 is a network structure diagram of OpenPose as used in Example 1.

Fig. 5 is a schematic diagram of coordinates of key points of bones of a human body in a standing posture.

Fig. 6 is a schematic diagram of coordinates of key points of bones of a human body in a non-standing posture.

FIG. 7 is a RealSense RGB image size diagram in a standing position.

Fig. 8 is a diagram of the dimensions between the subject and the camera when Fig. 7 was taken.

Detailed description of the embodiments:

The invention is further described below by way of examples.

Example 1:

In this example, the human height detection method based on skeleton key point recognition uses OpenPose as the human skeleton detector and a RealSense depth camera to obtain the distance between the camera and the person; the human height is then determined by combining the key points with this distance information. The specific flow is shown in Fig. 2.

In the image of the target to be detected acquired by the depth camera, the two-dimensional position of each pixel is represented by its pixel coordinates, and the depth camera can also return the distance between each pixel and the camera from those pixel coordinates. The target to be detected is a person. The depth camera used in this embodiment is the RealSense, manufactured by Intel. Current depth cameras are based on three main principles: structured light, time of flight (ToF) and binocular imaging. The RealSense uses a structured light scheme. Fig. 3 shows the RealSense camera, which has four components on its front, from left to right: the left infrared camera, the infrared dot projector, the right infrared camera and the RGB camera.

In this embodiment, the human skeleton detector detects the joints of the human body in the image of the target to be detected as skeleton key points, and the human skeleton is described through these key points. OpenPose is an open-source library based on convolutional neural networks and supervised learning, written on the Caffe framework. It can track a person's facial expression, torso, limbs and even fingers, works for both single and multiple people, and is robust. It can be regarded as the world's first deep-learning-based real-time multi-person two-dimensional pose estimation system, a milestone in human-computer interaction that gives robots a high-quality source of information for understanding people.
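As a sketch of how detector output could be mapped to the eight key points the method uses, the Python below assumes OpenPose's BODY_25 output format, where each person is a list of 25 rows of (x, y, confidence). The patent does not state which OpenPose model is used, and the names `select_keypoints` and `min_conf` are hypothetical.

```python
# Row indices in OpenPose's BODY_25 output (assumed model; not named in the patent).
BODY_25 = {
    "nose": 0, "neck": 1,
    "r_hip": 9, "r_knee": 10, "r_ankle": 11,
    "l_hip": 12, "l_knee": 13, "l_ankle": 14,
}

# The patent's own numbering of its key points: subscripts 0..7.
PATENT_ORDER = ["nose", "neck", "r_hip", "r_knee", "r_ankle",
                "l_hip", "l_knee", "l_ankle"]


def select_keypoints(person, min_conf=0.1):
    """Map one person's BODY_25 rows to the patent's points 0..7.

    Returns a list of eight (x, y) tuples; an entry is None when the detector's
    confidence for that joint is below min_conf (an illustrative threshold).
    """
    selected = []
    for name in PATENT_ORDER:
        x, y, conf = person[BODY_25[name]]
        selected.append((x, y) if conf >= min_conf else None)
    return selected
```

A None entry for one ankle is a natural cue to use the other side's key points, which matches the "right side or left side" alternatives in steps (2) and (3).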

The embodiment relates to a human height detection method based on skeleton key point recognition, which specifically comprises the following steps:

(1) Acquiring, in real time through the depth camera (RealSense), an image containing the target to be detected, the image size being M pixels × N pixels; when the target is standing, step (2) is performed, and when the target is not standing, step (3) is performed;

In this embodiment, the image is a RealSense RGB image of size M pixels × N pixels; the pixel coordinate of the upper right corner of the image is (0, 0), and the pixel coordinate of the lower left corner of the image is (M, N);

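(2) Acquiring the pixel coordinates of the skeleton key points of the target to be detected in the image by using a human skeleton detector (such as OpenPose), the key points comprising the nose (X₀, Y₀) and the right ankle (X₄, Y₄), or the nose (X₀, Y₀) and the left ankle (X₇, Y₇):

[two alternative formulas for the relative height Y, one per ankle; formula images not reproduced in this text]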

(3) Acquiring the pixel coordinates of the skeleton key points of the target to be detected in the image by using a human skeleton detector (such as OpenPose), the key points comprising the nose (X₀, Y₀), neck (X₁, Y₁), right hip (X₂, Y₂), right knee (X₃, Y₃) and right ankle (X₄, Y₄), or the nose (X₀, Y₀), neck (X₁, Y₁), left hip (X₅, Y₅), left knee (X₆, Y₆) and left ankle (X₇, Y₇).

The relative distance A from the neck to the hip in the image is calculated:

[two alternative formulas, right side or left side; formula images not reproduced in this text]

The relative distance B from the hip to the knee in the image is calculated:

[two alternative formulas, right side or left side; formula images not reproduced in this text]

The relative distance C from the knee to the ankle in the image is calculated:

[two alternative formulas, right side or left side; formula images not reproduced in this text]

The relative height Y from the nose to the ankle in the image is then:

Y = Y₁ − Y₀ + A + B + C;
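The relative-height calculation of steps (2) and (3) can be sketched in Python as follows. Because the formula images are missing from this text, the sketch rests on two assumptions: the standing-posture height is the vertical pixel difference between ankle and nose, and the segment lengths A, B and C are Euclidean pixel distances. Only the sum Y = Y₁ − Y₀ + A + B + C is taken directly from the text. All function names are illustrative.

```python
import math


def segment_length(p, q):
    """Euclidean pixel distance between two (x, y) key points (assumed form of A, B, C)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def relative_height_standing(nose, ankle):
    """Step (2), assumed form: vertical pixel difference between ankle and nose."""
    return ankle[1] - nose[1]


def relative_height_non_standing(nose, neck, hip, knee, ankle):
    """Step (3): Y = Y1 - Y0 + A + B + C, using the key points of one body side."""
    a = segment_length(neck, hip)    # neck to hip
    b = segment_length(hip, knee)    # hip to knee
    c = segment_length(knee, ankle)  # knee to ankle
    return (neck[1] - nose[1]) + a + b + c


# Usage with made-up pixel coordinates for a seated person:
y_pixels = relative_height_non_standing(
    nose=(0, 0), neck=(0, 10), hip=(0, 40), knee=(30, 80), ankle=(30, 120)
)
```

Summing the limb segments rather than taking a single nose-to-ankle difference is what allows a bent posture to be measured: a folded leg still contributes its full length.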

(4) Passing the pixel coordinates of any skeleton key point from step (2) or step (3) to the depth camera to obtain the distance Z between that key point and the depth camera, and calculating the visible longitudinal distance H of the depth camera, where H corresponds to the N pixels of the image of the target to be detected. Any key point can be used because the target is essentially parallel to the depth camera during shooting, so every skeleton key point is at approximately the same distance from the camera.

In this embodiment, the actual distance Z between the target to be detected and the camera is obtained from the function aligned_depth_frame.get_distance(x, y) of the RealSense depth camera, and the visible longitudinal distance H of the RealSense camera is then calculated according to the following formula:

[formula image not reproduced in this text]

wherein Z is the distance between the target to be detected and the camera, and θ is the longitudinal angle of view of the RealSense RGB image captured by the RealSense camera;
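A short Python sketch of the visible longitudinal distance follows. It assumes the usual field-of-view relation H = 2·Z·tan(θ/2), which this text cannot confirm because the formula image is missing. The function name and the 90° angle in the usage line are hypothetical; the patent does not give a value for θ here.

```python
import math


def visible_height(z_metres, fov_vertical_deg):
    """Height of the scene (metres) covered by the image at distance Z.

    Assumed relation: H = 2 * Z * tan(theta / 2), with theta the vertical
    angle of view of the RGB image.
    """
    return 2.0 * z_metres * math.tan(math.radians(fov_vertical_deg) / 2.0)


# Usage: at 2 m with a hypothetical 90-degree vertical angle of view,
# the image spans about 4 m from top to bottom.
h_visible = visible_height(2.0, 90.0)
```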

where L is the true nose-to-ankle height and corresponds to Y in the image, μ is the proportionality constant between the true height S of the target to be detected and the nose-to-ankle height L, and μ = 1.07 was determined from a large amount of experimental data.
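The final conversion can be sketched in Python as below. It assumes L = (Y / N) · H, from the statements that H corresponds to N and L corresponds to Y, and S = μ · L from the definition of μ; the exact formula image is not reproduced in this text. The function name and the numbers in the usage lines are illustrative only.

```python
def true_height(y_pixels, n_pixels, h_visible, mu=1.07):
    """Estimate the true height S in the same unit as h_visible.

    y_pixels:  relative nose-to-ankle height Y in the image (pixels)
    n_pixels:  image height N (pixels)
    h_visible: visible longitudinal distance H at the target's depth
    mu:        ratio of full height to nose-to-ankle height (1.07 in Example 1)
    """
    nose_to_ankle = y_pixels / n_pixels * h_visible  # L, assumed proportion
    return mu * nose_to_ankle                        # S = mu * L


# Usage with made-up values: Y = 360 px in a 480 px tall image that spans 2.0 m
# gives L = 1.5 m and S of about 1.605 m.
s_metres = true_height(360, 480, 2.0)
```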

Claims

1. The human height detection method based on skeleton key point recognition is characterized by comprising the following steps of:

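(2) Acquiring the pixel coordinates of the skeleton key points of the target to be detected in the image by using a human skeleton detector, the key points comprising the nose (X₀, Y₀) and the right ankle (X₄, Y₄), or the nose (X₀, Y₀) and the left ankle (X₇, Y₇):

[two alternative formulas for the relative height Y, one per ankle; formula images not reproduced in this text]

The relative distance A from the neck to the hip in the image is calculated:

[two alternative formulas, right side or left side; formula images not reproduced in this text]

The relative distance B from the hip to the knee in the image is calculated:

[two alternative formulas, right side or left side; formula images not reproduced in this text]

The relative distance C from the knee to the ankle in the image is calculated:

[two alternative formulas, right side or left side; formula images not reproduced in this text]

The relative height Y from the nose to the ankle in the image is then:

Y = Y₁ − Y₀ + A + B + C;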

2. The human height detection method based on skeleton key point recognition according to claim 1, wherein the depth camera is a RealSense and the human skeleton detector is OpenPose.

3. The human height detection method based on skeleton key point recognition according to claim 1, wherein the actual distance Z between the target to be detected and the camera is obtained from the function aligned_depth_frame.get_distance(x, y) of the RealSense depth camera.