CN112102947A - Apparatus and method for body posture assessment - Google Patents

Apparatus and method for body posture assessment

Publication number: CN112102947A (granted as CN112102947B)
Application number: CN202010283948.8A
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 冯强, 王富百慧
Applicant and current assignee: CHINA INSTITUTE OF SPORT SCIENCE
Legal status: Active (granted)

Classifications

    • G16H 50/30: ICT specially adapted for calculating health indices; for individual health risk assessment
    • G16H 50/20: ICT specially adapted for computer-aided diagnosis, e.g. based on medical expert systems
    • G06F 18/24: Pattern recognition; classification techniques
    • G06N 3/045: Neural networks; combinations of networks
    • G06N 3/084: Learning methods; backpropagation, e.g. using gradient descent
    • G06V 40/10: Recognition of human or animal bodies or body parts in image or video data


Abstract

The application discloses an apparatus and a method for body posture assessment. The apparatus includes: a data acquisition module that acquires an image including a body to be evaluated and detects the body from the image; a prediction module that predicts the positions of key points of the body according to a preset keypoint detection model; a calculation module that calculates health indicators corresponding to the keypoint positions according to a preset health standard database; and an evaluation module that compares the calculated health indicators with reference indicators in the health standard database and evaluates the body based on the comparison result. In an embodiment, the apparatus may further comprise a training module that trains on a data set of labeled positions of a plurality of body samples, the labeled positions corresponding to human skeletal positions located by palpation (touch), to obtain the preset keypoint detection model. With this body posture assessment apparatus, normal-value standards for the body postures of children and adolescents can be established, providing a feasible basis for child and adolescent health screening.

Description

Apparatus and method for body posture assessment
Technical Field
The present application relates to the field of medical devices, and in particular to devices and methods for body posture assessment.
Background
The body posture of children and adolescents is of great importance to their health. Existing methods for testing the body posture of a child or adolescent include: visual inspection, radiographic imaging, photogrammetry, grating projection, 3D imaging, and the like.
The visual inspection method tests with instruments such as a vertical measuring instrument, a scoliometer, a goniometer, an inclinometer, and a movable scale. However, visual inspection takes a long time per test, and because the instruments contact the skin directly, subjects easily alter their natural posture. In addition, the method places high technical demands on the testers and is strongly influenced by their subjective judgment.
Radiographic methods evaluate posture using X-ray, computed tomography (CT), and magnetic resonance imaging (MRI) results. However, radiographic methods expose the subject to radiation and are often expensive.
The grating projection method is a moire technique in which a rectangular grating is projected onto the surface to be measured and the height of each point on the free-form surface is derived from the projected pattern. However, its measurement accuracy for scoliosis is low, and its results correlate poorly with the Cobb angle.
The 3D imaging method captures a 3D body shape by laser scanning and tests from it. However, its testing cost is high, and the technique is still under further development.
In summary, current technology lacks a stable, reliable method suitable for testing large-scale populations.
Disclosure of Invention
The present application provides an apparatus and method for body posture assessment, suitable for testing the body posture of children and adolescents, that overcomes at least one of the above deficiencies in the prior art. The present application provides an image-based computational solution for key posture indicators such as forward neck protrusion, high and low shoulders, pelvic tilt (forward and backward), thoracic kyphosis, lumbar lordosis, long and short legs, body center of gravity, knee joint Q angle, and spinal curvature, thereby providing a viable solution for large-scale screening and evaluation.
A first aspect of the present application provides an apparatus for body posture assessment, which may comprise: a data acquisition module that acquires an image including a body to be evaluated and detects the body from the image; a prediction module that predicts the positions of key points of the body according to a preset keypoint detection model; a calculation module that calculates health indicators corresponding to the keypoint positions according to a preset health standard database; and an evaluation module that compares the calculated health indicators with reference indicators in the health standard database and evaluates the body based on the comparison results.
In some optional embodiments, the apparatus for body posture assessment may further comprise a training module that trains on a data set of labeled positions of a plurality of body samples to obtain the preset keypoint detection model, wherein the labeled positions are determined from the positions of the human bones of each body sample as located by palpation.
In some alternative embodiments, the data acquisition module may acquire a sample image of the body sample, and the training module may include: a marking unit that marks, on the sample image, marking positions corresponding to the positions of human bones; and a training unit that trains the keypoint detection model using the marked positions of the sample images.
In some alternative embodiments, the marking unit may mark the marking position manually.
In some alternative embodiments, the marking unit may mark the marking position by different colors.
In some alternative embodiments, the data acquisition module may acquire the sample images by taking front, side, and back photographs of the body sample.
In some optional embodiments, the training module may further comprise a cleaning unit that cleans the sample images to screen out qualified sample images suitable for machine learning, with the marking unit marking only the qualified sample images.
In some alternative embodiments, the body and the body sample may be human bodies from a specific population having particular appearance characteristics.
In certain alternative embodiments, the body and the body sample may be human bodies from the child and adolescent population.
In certain alternative embodiments, the health standard database may be preset specifically for the postures of the child and adolescent population.
In certain alternative embodiments, the health indicator may include at least one of: high and low shoulders, forward neck protrusion, pelvic forward or backward tilt, thoracic kyphosis angle, lumbar lordosis angle, long and short legs, body center-of-gravity line, knee joint Q angle, and spinal curvature.
A second aspect of the present application provides a method for body posture assessment, which may comprise: acquiring an image including a body to be evaluated and detecting the body from the image; predicting the positions of key points of the body according to a preset keypoint detection model; calculating health indicators corresponding to the keypoint positions according to a preset health standard database; and comparing the calculated health indicators with reference indicators in the health standard database and evaluating the body based on the comparison result.
In certain alternative embodiments, the method may further comprise: training on a data set of labeled positions of a plurality of body samples to obtain the preset keypoint detection model, wherein the labeled positions are determined from the positions of the human bones of each body sample as located by palpation.
In some alternative embodiments, the training data set may include: acquiring a sample image of a body sample; marking a marking position corresponding to the position of the human skeleton on the sample image; and training the key point detection model by using the marked positions of the sample images.
In some alternative embodiments, acquiring the sample image may include: sample images are acquired by taking a front, side and back photograph of a body sample.
In some alternative embodiments, training the data set may include: cleaning the sample images to screen out qualified sample images suitable for machine learning; and marking the marked positions may then be performed only on the qualified sample images.
By adopting the technical solution of the present application, at least one of the following technical effects can be achieved:
1) Using an image data set obtained by photographing labeled human bodies and training a keypoint detection model with a deep neural network, a body posture health detection model is provided that is more accurate than existing methods and targets a specific population.
2) The keypoint positions and health detection results of the subject can be obtained directly, without manual intervention, improving data analysis efficiency.
3) An automatic detection model suitable for the physical health of children and adolescents is provided.
4) A large-scale child and adolescent posture health detection data set is collected and labeled.
5) A posture detection standard and a posture detection method for children and adolescents are provided.
Drawings
Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments when taken in conjunction with the accompanying drawings. In the drawings:
fig. 1 is a block diagram of a body posture estimation device according to an embodiment of the present application.
Fig. 2 is a flowchart of a key point prediction process of the body posture estimation device according to the embodiment of the present application.
Fig. 3 is a flowchart of a training data set of a body posture estimation device according to an embodiment of the present application.
FIG. 4 is a table that schematically illustrates a correspondence of health indicators calculated by the calculation module to keypoint locations.
Fig. 5A is a deep neural network architecture diagram for keypoint detection, according to an embodiment of the present application.
Fig. 5B illustrates the first-stage down-sampling network of the deep neural network architecture.
Fig. 6A and 6B are schematic diagrams of an optimized deep neural network architecture and a VGG16 network front-end architecture, respectively, in accordance with a specific embodiment of the present application.
Detailed Description
Various aspects of the present application will be described in detail below with reference to the attached figures to provide a better understanding of the present application. It should be understood that the detailed description is merely illustrative of exemplary embodiments of the present application and does not limit the scope of the present application.
Throughout this specification and throughout the drawings, like reference numerals refer to like elements. For convenience of description, only portions related to the technical subject are shown in the drawings. Further, in the drawings, the size and shape of some elements, components or parts may be exaggerated for convenience of explanation. The figures are purely diagrammatic and not drawn to scale.
The present application will be further described with reference to the following detailed description with reference to the accompanying drawings.
The present application relates to an apparatus and method for body posture assessment implemented on the basis of a deep neural network. Specifically, a keypoint detection model of posture is trained from human skeletal positions, and the model is then used to predict the keypoint positions of a body to be assessed and to calculate the corresponding health indicators.
First, an apparatus for body posture assessment according to an embodiment of the present application will be described with reference to fig. 1. Fig. 1 is a block diagram of a body posture estimation device 100 according to an embodiment of the present application.
As shown in fig. 1, an apparatus 100 for body posture assessment according to an embodiment of the present application may include a data acquisition module 110, a prediction module 120, a calculation module 130, and an assessment module 140.
The body posture assessment apparatus 100 according to the embodiment of the present application is used to evaluate whether a person's body posture is standard or healthy, thereby providing appropriate advice for treatment or care. For example, the apparatus 100 may be a medical instrument used in a hospital, a school, or a home.
Alternatively, the body posture assessment apparatus 100 may be adapted to the human bodies of a specific population having particular appearance characteristics, such as the elderly, the disabled, or children and adolescents. For example, the apparatus 100 according to the embodiment of the present application may be used for testing/assessing the body posture of children and adolescents aged 6-19 years. Herein, the child and adolescent population is described as the applicable subject, but those skilled in the art will appreciate that this is only an example and not a limitation.
The data acquisition module 110 may acquire an image including a body to be evaluated and detect the body from the image. For example, the data acquisition module 110 may be a camera or a video camera that captures at least one of a front, side, and back photograph of the human body as the image to be examined. The data acquisition module 110 may include a human body detection model 210 to detect the human body from the photograph; this model can be built on a RetinaNet structure. The data acquisition module 110 may further include a detector to perform the detection operation. However, the data acquisition module 110 is not so limited, as long as it supports an image-based computing scheme; for example, it may instead be implemented as a body scanner.
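As an illustration of this detection step, the sketch below uses torchvision's pretrained RetinaNet as a stand-in for the human body detection model 210 (the patent's own model would be trained on its own data); the helper name detect_person is hypothetical.

```python
# Sketch: person detection with a RetinaNet, standing in for the human body
# detection model 210. Requires torchvision >= 0.13 for the weights API.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.retinanet_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect_person(photo_path: str, score_thresh: float = 0.5) -> torch.Tensor:
    """Return bounding boxes (x1, y1, x2, y2) of detected persons."""
    img = to_tensor(Image.open(photo_path).convert("RGB"))
    with torch.no_grad():
        out = model([img])[0]
    keep = (out["labels"] == 1) & (out["scores"] > score_thresh)  # COCO label 1 = person
    return out["boxes"][keep]
```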
In addition to capturing images of the body to be evaluated during testing, the data acquisition module 110 may also acquire (e.g., photograph) sample images of the body samples used as training subjects (the training process is described in detail below). For example, volunteers from primary and middle schools can be selected as training subjects for acquiring training data on child and adolescent health indicators, with front, side, and back photographs taken of each. For example, 10,000 sample bodies may be selected, for a total of some 30,000 photographs. It should be understood, however, that these numbers are exemplary only and not limiting.
The prediction module 120 may predict the location of the key points of the body according to a preset key point detection model 220. For example, the prediction module 120 may predict the keypoint locations of the body detected by the data acquisition module 110 from the images it takes according to rules or algorithms set by the keypoint detection model 220.
A keypoint position can be a position corresponding to the human body part to be examined, either a point or a region; alternatively, it may be the coordinates of several critical points used to assess the health of that part. For example, keypoint positions may correspond to human bone positions. In particular embodiments, the keypoint positions may be positions near at least one of the neck, shoulders, pelvis, thoracic spine, lumbar spine, legs, body center-of-gravity line, and knee joints, or the coordinates of key points for assessing the health of these parts.
The keypoint detection model 220 may be pre-trained before testing is performed. For example, the keypoint detection model 220 may be trained in advance by the producer of the body posture assessment apparatus 100, while in use the consumer/owner/operator assesses the human body with the trained preset model. In an alternative embodiment, besides being trained in advance, the keypoint detection model 220 may be dynamically trained in use with previously detected, predicted, or evaluated data.
To implement the training operation, the body posture assessment apparatus 100 may further include a training module 150. The training module 150 may train on a data set of labeled positions of a plurality of body samples to obtain the preset keypoint detection model 220. The body samples may be the bodies of the 10,000 volunteers selected from the primary and middle schools described above. The marked positions may be certain body parts of the volunteers, for example at least one of the neck, shoulders, pelvis, thoracic spine, lumbar spine, legs, body center-of-gravity line, and knee joints. The marker positions are determined from the human bone positions of each body sample as located by palpation; for example, a worker may palpate a volunteer's body to find and mark the positions corresponding to skeletal positions.
As shown in FIG. 1, training module 150 may include a labeling unit 154 and a training unit 156.
The marking unit 154 may mark, on the sample image, marking positions corresponding to the positions of human bones. Taking the lumbar vertebrae as an example, a worker may palpate a volunteer's waist to locate a given lumbar vertebra and then operate the marking unit 154 to mark the corresponding position on the volunteer's photograph taken by the data acquisition module 110. The marked position on the sample image (e.g., the photograph of the volunteer) thus corresponds to the position of that lumbar vertebra on the volunteer's own body. Herein, the marking unit 154 is described as performing the marking operation under a worker's manual control; however, this is only an example and not a limitation. For example, the marking unit 154 may perform the marking operation automatically according to computer instructions.
The marking unit 154 may mark positions in different colors and may mark several different positions at the same time. For example, if the sample image (e.g., a photograph of a volunteer) is predominantly black and white, the marking unit 154 may mark the lumbar vertebrae in blue, the cervical vertebrae in green, the pelvis in red, the thoracic vertebrae in purple, and so on.
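As an illustration, such color-coded marking might be realized as in the following sketch (the BGR values and helper name are assumptions; purple is approximated as magenta):

```python
# Sketch: color-coded marking of palpated skeletal positions on a sample image.
import cv2

MARK_COLORS_BGR = {
    "lumbar_vertebra":   (255, 0, 0),    # blue
    "cervical_vertebra": (0, 255, 0),    # green
    "pelvis":            (0, 0, 255),    # red
    "thoracic_vertebra": (255, 0, 255),  # purple (magenta)
}

def mark_position(image, part, xy):
    """Draw a filled dot of the part's color at the marked (x, y) position."""
    cv2.circle(image, xy, radius=5, color=MARK_COLORS_BGR[part], thickness=-1)
    return image
```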
The training unit 156 may train the keypoint detection model 220 using the marked positions of the sample images. For example, the training unit 156 may train the keypoint detection model 220 through deep-neural-network-based machine learning using the photographs of the 10,000 volunteers and the marker positions on them. The structure of the keypoint detection model 220 may be built on a high-resolution network (HRNet). This is described later.
As shown in FIG. 1, the training module 150 may also include a cleaning unit 152. The cleaning unit 152 may clean the sample images to screen out qualified sample images suitable for machine learning, and the marking unit 154 marks only the qualified sample images. For example, the cleaning unit 152 may filter the photographs of the 10,000 volunteers taken by the data acquisition module 110 according to preset conditions, discarding unqualified photographs and keeping only qualified ones. For example, the cleaning unit 152 may discard blurred photographs, keeping only those whose sharpness exceeds a preset threshold. After screening, the cleaning unit 152 passes the retained photographs to the marking unit 154, which marks the received clear photographs.
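The patent does not fix the sharpness measure used for this screening; one common heuristic, sketched below under that assumption, is the variance of the Laplacian, with blurred photographs falling below a threshold:

```python
# Sketch: blur screening via variance of the Laplacian (threshold is illustrative).
import cv2

def is_sharp_enough(image_path: str, thresh: float = 100.0) -> bool:
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        return False  # unreadable files are discarded as well
    return cv2.Laplacian(gray, cv2.CV_64F).var() >= thresh
```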
Data cleaning is an important step in algorithm improvement. It comprises operations such as splitting, association and aggregation, regrouping, and removal; a high-quality data set keeps the data consistent and ensures that the model converges in the correct direction.
In the field of image recognition, a high-quality data set requires that the pictures for each target (id) be as diverse as possible; for faces, for example, the similarity variance should be as large as possible, and ideal data should cover various scenes, accessories, age spans, and the like, so as to improve the robustness of the model. The specific algorithm of the cleaning unit 152 will be described later.
The calculation module 130 may calculate the health index corresponding to the location of the key point according to a preset health standard database 160.
The health standard database 160 may be preset specifically for the postures of the child and adolescent population. For example, the health standard database 160 may cover the poor postures that occur more frequently among children and adolescents, such as at least one or all of high and low shoulders, forward neck protrusion, pelvic forward or backward tilt, thoracic kyphosis, lumbar lordosis, long and short legs, body center-of-gravity line, knee joint Q angle, and spinal curvature.
In addition, the health standard database 160 may store, for each poor posture, the keypoint coordinates of the corresponding body part and the parameters to be measured. The data in the health standard database 160 are correlated or mapped, so that by consulting or invoking the database the calculation module 130 knows, for each poor posture, which part of the subject's body (specifically, which coordinate positions) to measure and which operation to perform on those coordinates, thereby deriving a corresponding solution value. These operations may follow preset formulas, may follow non-formula algorithms or rules, or the solution value may be read directly from the keypoint coordinates without any further operation. The solution values may be numerical values or angles. Specific algorithms for the various poor postures are described in detail in the examples below.
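A minimal sketch of how such a database entry might be structured, with each indicator mapped to its keypoints, its computation, and its threshold (all field names here are hypothetical; the patent specifies only the mapping itself):

```python
# Sketch: one entry of a health-standard mapping, illustrated with the
# long-and-short-legs indicator described later in the text.
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

@dataclass
class IndicatorSpec:
    name: str
    keypoints: Sequence[str]  # which keypoint coordinates to read
    compute: Callable[[Dict[str, Tuple[float, float]]], float]  # solution value
    threshold: float          # reference value "c"

LEG_LENGTH = IndicatorSpec(
    name="long and short legs",
    keypoints=["left_greater_trochanter", "right_greater_trochanter"],
    compute=lambda kp: kp["left_greater_trochanter"][1]
                       - kp["right_greater_trochanter"][1],  # a - b
    threshold=1.0,  # illustrative units
)
```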
The health standard database 160 may be designed to be closed, so that it cannot be dynamically modified by the user of the apparatus 100 in service after training is complete, or semi-closed, so that it can be dynamically modified during use according to the results of each test.
The evaluation module 140 may compare the calculated health indicator with the reference indicators in the health standard database 160 and evaluate the body based on the comparison. For a given poor posture, the evaluation module 140 may compare the solution value obtained by the calculation module 130 with the reference value for that posture: if the comparison result exceeds a preset threshold, the subject is evaluated as having the poor posture (i.e., unhealthy); if it falls within the threshold range, the subject is evaluated as not having it (i.e., healthy). Taking long and short legs as an example, the calculation module 130 calculates the leg-length difference (a-b) between the subject's left and right legs; the evaluation module 140 compares the measured difference with a preset threshold range (0, c), and if |a-b| > c, the subject is evaluated as having long and short legs; otherwise, the subject is evaluated as healthy.
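Reusing the hypothetical IndicatorSpec above, the threshold comparison of this example reduces to a few lines:

```python
# Sketch: the evaluation step for the long-and-short-legs example, |a - b| > c.
def evaluate(spec: IndicatorSpec, keypoints) -> str:
    value = spec.compute(keypoints)   # e.g. the leg height difference a - b
    if abs(value) > spec.threshold:   # |(a - b)| > c
        return f"unhealthy: {spec.name}"
    return "healthy"
```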
Next, the operation of each module of the body posture assessment apparatus 100, the cooperation between the modules, and the neural network architecture are explained in detail with reference to figs. 2 to 6B in conjunction with specific embodiments.
Keypoint detection
The relationship between the data collection module 110, the prediction module 120, the calculation module 130, and the evaluation module 140 is explained below by describing the keypoint detection process in conjunction with a particular embodiment with reference to FIG. 2. Fig. 2 is a flowchart of a key point prediction process of the body posture estimation device 100 according to the embodiment of the present application.
As shown in FIG. 2, the keypoint detection model 220 is built using the top-down method. The data acquisition module 110 uses the human body detection model 210 to detect the persons in the subject's photograph and form detection boxes, and the prediction module 120 uses the keypoint detection model 220 to predict the key points within each detection box.
Specifically, referring to fig. 1 and 2, taking the example of using the body posture evaluating apparatus 100 to test a child or adolescent tester, the body posture evaluating process may include the following steps.
1) The professional takes a picture of the tester using the data acquisition module 110 of the body posture-evaluating apparatus 100.
2) The data collection module 110 selects (e.g., frames) people in the photograph using the human detection model 210 to form a detection frame, and passes the detection frame to the prediction module 120.
3) The prediction module 120 predicts a keypoint location (e.g., coordinates 230 of the keypoint location) of the human body in the detection box corresponding to a certain bad posture (e.g., long and short legs) according to the keypoint detection model 220, and transmits the keypoint location coordinates 230 to the calculation module 130.
4) The calculation module 130 inputs the keypoint location coordinates 230 into a preset algorithm for poor posture of long and short legs to calculate the solution values (a-b), and transmits the solution values (a-b) to the evaluation module 140.
5) The evaluation module 140 compares the calculated solution values (a-b) with the leg length difference threshold values (0-c), and evaluates the health condition of the tester based on the comparison result.
Although only the case of automatically evaluating the bad posture of the long and short legs is described above, it will be understood by those skilled in the art that a plurality of bad postures may be evaluated at the same time.
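Chaining the hypothetical helpers from the sketches above, steps 1) to 5) reduce to a short pipeline (predict_keypoints stands in for the prediction module 120 and is not defined here):

```python
# Sketch: end-to-end flow of the five steps above for one indicator.
def assess_posture(photo_path: str) -> str:
    boxes = detect_person(photo_path)                    # data acquisition module 110
    keypoints = predict_keypoints(photo_path, boxes[0])  # prediction module 120 (hypothetical)
    return evaluate(LEG_LENGTH, keypoints)               # modules 130 and 140
```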
Further, the body posture assessment apparatus 100 may include an output device to output the evaluation result. For example, the apparatus may include a display that shows "healthy" or "long and short legs" to the user according to the evaluation result. As another example, it may include a speaker that outputs the evaluation result as sound.
Training data set
The relationship between the data acquisition module 110, the prediction module 120, the calculation module 130, the evaluation module 140 and the training module 150 is explained below by describing the process of training a data set in connection with two specific embodiments with reference to fig. 3. Fig. 3 is a flowchart of a training data set of the body posture estimation device 100 according to an embodiment of the present application.
Specifically, referring to fig. 1 and 3, taking as an example that the body posture evaluating apparatus 100 is used to test a child or adolescent tester for a bad posture of long and short legs, the key point detecting process of the body posture evaluating apparatus 100 may include the following steps.
1) The data collection module 110 takes a picture of the tester and selects (e.g., frame-selects) a detection box including a person therein, and transfers the detection box to the washing unit 152 of the training module 150 (S302).
2) The washing unit 152 performs data washing on the detection frames to screen out clear detection frames, and transfers the screened detection frames to the prediction module 120 (S304).
3) The prediction module 120 performs a keypoint location prediction on the detection box received from the data acquisition module 110 according to the keypoint detection model 220 (S306).
4) The prediction module 120 transmits the predicted keypoint location data to the calculation module 130 (S308).
5) The calculation module 130 calculates a health indicator based on the keypoint location data (S310).
Alternatively, the body posture estimation device 100 may dynamically train the key point detection model 220 using data of a previous tester during the test, and then predict and estimate the current tester using the trained key point detection model 220. In this case, the process of training data set may be nested in the above-described key point detection process of the body posture estimation device 100. Specifically, the following steps may also be included.
6) After performing data cleaning, the washing unit 152 of the training module 150 passes the screened detection frames of previous testers to the labeling unit 154 of the training module 150 (S304).
7) The labeling unit 154 performs data labeling on the body in the detection frame, and transmits the detection frame in which the key point position has been labeled to the training unit 156 of the training module 150 (S312).
8) The training unit 156 trains the keypoint detection model 220 through deep-neural-network-based machine learning, using the detection-frame photographs and the keypoint positions marked on them, and passes the trained keypoint detection model 220 to the prediction module 120 (S314).
9) The prediction module 120 predicts the keypoint location of the body image of the current tester by using the trained keypoint detection model 220 (S306).
Steps S308 and S310 are repeated.
In this embodiment, the above steps may be repeated dynamically and continuously during the test to train the keypoint detection model 220 using the physical data of the previous testers and to predict the keypoint location of the current tester using the trained keypoint detection model 220.
As another example, consider a developer or maintenance person (collectively, a worker) of the body posture assessment apparatus 100 performing data acquisition and training on it. The training-data-set process of the apparatus 100 may include the following steps.
1) The data collection module 110 takes pictures of 10,000 child and adolescent volunteers, selects (e.g., frames) a detection box including a person in each picture, and passes the detection boxes to the washing unit 152 of the training module 150 (S302).
2) The washing unit 152 performs data washing on the detection frames to screen out clear detection frames, and transfers the screened detection frames to the prediction module 120 (S304).
3) On the other hand, the washing unit 152 of the training module 150 passes the screened detection frames of the 10,000 volunteers to the labeling unit 154 of the training module 150 (S304).
4) The staff locate, by palpating each volunteer's body, the body positions corresponding to human skeletal positions; these skeletal positions may serve as the keypoint positions of particular body parts.
5) The worker operates the marking unit 154 to manually label the body in the detection frame, and the detection frame with the marked keypoint positions is passed to the training unit 156 of the training module 150 (S312).
6) The training unit 156 trains the keypoint detection model 220 through deep-neural-network-based machine learning, using the detection-frame photographs and the keypoint positions marked on them, and passes the trained keypoint detection model 220 to the prediction module 120 (S314).
7) The prediction module 120 predicts the keypoint locations of the detection boxes of the volunteer' S body received from the washing unit 152 using the trained keypoint detection model 220 (S306).
8) The prediction module 120 transmits the predicted keypoint location data to the calculation module 130 (S308).
9) The calculation module 130 calculates a health indicator based on the keypoint location data (S310).
Optionally, the process of training the data set may further comprise the following steps.
10) A worker manually evaluates each volunteer's health condition by palpating the volunteer's skeletal positions.
11) The manually evaluated health condition is recorded and compared with the result given by the body posture assessment apparatus 100, and the keypoint detection model 220 is debugged and/or adjusted according to the comparison. The recording, comparison, debugging, and adjustment in this step can be performed manually or by machine; for example, the manually evaluated health data may be entered into the machine, which then completes the subsequent operations.
It is noted that although the steps of the apparatus 100 and its modules/units are described specifically herein to explain the principles of the invention, these steps are merely exemplary and not limiting. For example, in alternative embodiments, some steps may be added or omitted, alternative steps may be employed, or the order of some steps may be changed.
Unlike prior-art data sets, the data set trained in the present application focuses on keypoint positions and targets the specific population of children and adolescents. In this way, at least one of the following can be achieved: filling the gap of a large-scale, high-quality annotated data set for child and adolescent keypoint detection and laying a foundation for future research in related directions; achieving higher accuracy and practicality than keypoint data sets labeled by sight, because labeling follows the positions of skeletal points; and fully supporting child and adolescent posture detection tasks, because labeling follows the child and adolescent posture health detection standard.
Health index calculation
The relationships between the data prediction module 120, the calculation module 130, and the evaluation module 140 are explained below by describing a health indicator calculation process in conjunction with a specific embodiment of a plurality of poor poses with reference to FIG. 4. Fig. 4 is a table schematically showing the correspondence of the health index calculated by the calculation module 130 to the position of the key point.
The table may be predetermined and may form at least part of the preset health standard database 160. The calculation module 130 computes each (or at least one) of the poor postures in the table according to the table's mapping, and the evaluation module 140 evaluates the subject's health from the calculation results. The operations and/or algorithms of the prediction module 120, calculation module 130, and evaluation module 140 are explained below for each poor posture.
1) Forward neck protrusion. The prediction module 120 predicts the coordinate positions of the acromion and the earlobe as the keypoint positions according to the keypoint detection model 220. The calculation module 130 computes with the coordinates: taking the vertical line through the acromion on one side as a and the vertical line through the earlobe on the same side as b, the absolute distance between lines a and b in the image is the horizontal distance between the acromion and the earlobe. The evaluation module 140 compares this horizontal distance with a threshold range to evaluate the health condition. Details in common with the foregoing are not repeated here or below.
2) High and low shoulders. The prediction module 120 predicts the left and right shoulder heights as the keypoint positions according to the keypoint detection model 220. The calculation module 130 computes the height difference between the left and right shoulders from these heights. The evaluation module 140 compares the difference with a threshold range to evaluate the health condition.
3) The pelvis is inclined. Pelvic tilt may include, for example, forward and backward pelvic tilt, and pelvic roll.
The prediction module 120 predicts the coordinate positions of the anterior superior iliac spine, the posterior superior iliac spine, and the pubic symphysis as the keypoint positions according to the keypoint detection model 220. The calculation module 130 computes the angle between the line connecting the anterior and posterior superior iliac spines and the horizontal, and determines whether the anterior superior iliac spine and the pubic symphysis lie in the same plane. The evaluation module 140 compares the result with a threshold range and evaluates the pelvis as tilted forward if it protrudes forward, and as tilted backward if it protrudes backward.
In addition, the prediction module 120 predicts the coordinate positions of the bilateral anterior superior iliac spines as the keypoint positions according to the keypoint detection model 220. The calculation module 130 computes the heights of the two anterior superior iliac spines from their coordinates. The evaluation module 140 compares the computed heights: if they are equal, the subject is evaluated as healthy; if not, as having pelvic roll.
4) Thoracic kyphosis. The prediction module 120 predicts the coordinate positions of the seventh cervical vertebra (C7) and the twelfth thoracic vertebra (T12) as the keypoint positions according to the keypoint detection model 220. The calculation module 130 fits an arc through C7 and T12 from their coordinates and computes the angle between the tangent at C7 and the tangent at T12 (i.e., the thoracic kyphosis angle); see the geometric sketch after item 9) below. The evaluation module 140 compares the kyphosis angle with a threshold range to evaluate the health condition.
5) Lumbar lordosis. The prediction module 120 predicts the coordinate positions of the first lumbar vertebra (L1) and the second sacral vertebra (S2) as the keypoint positions according to the keypoint detection model 220. The calculation module 130 fits an arc through L1 and S2 from their coordinates and computes the angle between the tangent at L1 and the tangent at S2 (i.e., the lumbar lordosis angle). The evaluation module 140 compares the lordosis angle with a threshold range to evaluate the health condition.
6) Long and short legs. The prediction module 120 predicts the coordinate positions of the bilateral greater trochanters as the keypoint positions according to the keypoint detection model 220, e.g., the height of the left greater trochanter is a and that of the right is b. The calculation module 130 computes the height difference (a-b) between the two sides. The evaluation module 140 compares |a-b| with a threshold range to evaluate the health condition: for example, if |a-b| > c, the subject is evaluated as having long and short legs; otherwise, the subject is evaluated as healthy.
7) Body center-of-gravity line deviation. The prediction module 120 predicts the coordinate positions of the external ear canal, the acromion, the anterior superior iliac spine, and the lateral malleolus as the keypoint positions according to the keypoint detection model 220. The calculation module 130 connects the four points into a line and compares the connection with a plumb line passing through one of the four points, determining how many of the points deviate from the plumb line and by what distance. In the calculation, the point through which the plumb line is drawn must be a fixed reference point. The evaluation module 140 compares the number of deviating points and the deviation distances with threshold ranges to evaluate the health condition.
8) Knee joint Q angle. The prediction module 120 predicts the coordinate positions of the patellar midpoint and the quadriceps femoris line of force as the keypoint positions according to the keypoint detection model 220. The calculation module 130 draws the quadriceps line of force and the vertical (perpendicular to the ground) through the patellar midpoint, and solves for the angle between the two lines, i.e., the knee joint Q angle. The evaluation module 140 compares the Q angle with a threshold range to evaluate the health condition.
9) Spinal curvature. The prediction module 120 predicts the coordinate positions of the spinous processes of the spine as the keypoint positions according to the keypoint detection model 220. The calculation module 130 traces the spinal curve through the spinous-process coordinates and computes the curvature of the curve. The evaluation module 140 compares the curvature with a threshold range to evaluate the health condition.
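Several of the indicators above reduce to two small geometric computations: the angle between two lines (items 4, 5, and 8) and horizontal offsets from a plumb line (item 7). A minimal sketch, assuming key points are given as (x, y) image coordinates:

```python
# Sketch: geometric primitives for the tangent-angle and plumb-line indicators.
import math

def angle_between(u, v):
    """Angle in degrees between two direction vectors, e.g. the tangents at
    C7 and T12 (thoracic kyphosis), at L1 and S2 (lumbar lordosis), or the
    quadriceps line of force and the vertical (knee joint Q angle)."""
    cos = (u[0] * v[0] + u[1] * v[1]) / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def plumb_line_offsets(points, ref_index=3):
    """Horizontal offsets of each landmark (external ear canal, acromion,
    anterior superior iliac spine, lateral malleolus) from the vertical plumb
    line through the fixed reference point points[ref_index]."""
    ref_x = points[ref_index][0]
    return [p[0] - ref_x for p in points]
```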
It should be understood that the tables and contents of the health standard database 160 described herein are merely exemplary and not limiting; for example, poor-posture items may be added to, deleted from, or modified in the table, and the references measured or calculated for a given poor posture may also be modified, as long as the same evaluation effect is achieved.
Deep neural network architecture for key point detection
The deep neural network architecture for keypoint detection is explained in detail below with reference to figs. 5A and 5B. Fig. 5A is a deep neural network architecture diagram for keypoint detection according to an embodiment of the present application. Fig. 5B illustrates the first-stage down-sampling network of the deep neural network architecture.
The deep neural network architecture is a convolutional neural network ResNet50 (residual network). The network uses a U-shaped network structure.
Specifically, as shown in fig. 5A, after a picture is input into the network, four stages of down-sampling are performed to extract picture features, followed by up-sampling. In fig. 5A, each solid square represents the convolution block of one stage, and the dashed squares represent lateral convolutions.
During up-sampling, the down-sampled features are merged in. Down-sampling shrinks the feature map and up-sampling enlarges it again; since this resembles a U shape, the structure is called U-shaped. In the up-sampling stage, the output layer produces heatmaps as the keypoint feature maps.
Next, the down-sampling network and the up-sampling network in the deep neural network architecture are explained in detail in turn.
1) A downsampling network.
First, as shown in fig. 5A, the picture to be sampled is input to the network. The input image may be a 3-channel BGR image of size 384 x 288.
Thereafter, as shown in fig. 5A, the input picture is down-sampled in the first stage. As shown in FIG. 5B, the first-stage network may include a convolutional layer, a batch normalization layer, a scale-and-shift layer, and a nonlinear activation function layer. As shown in fig. 5B, the output image after first-stage down-sampling may be 192 x 144; compared with the input image, the image features have been down-sampled once.
Then, the down-sampling of the second stage to the fourth stage is performed on the image output after the down-sampling of the first stage. The second to fourth stage networks are specifically as follows:
one down-sampling by a max-pooling layer with a 3x3 kernel, stride 2, and padding 1, with 64 output channels;
pooling layer pool1: 2x2 kernel, max pooling;
convolutional layer conv2_1: 3x3 kernel, output feature-map channels 128;
convolutional layer conv2_2: 3x3 kernel, output feature-map channels 128;
pooling layer pool2: 2x2 kernel, max pooling;
convolutional layer conv3_1: 3x3 kernel, output feature-map channels 256;
convolutional layer conv3_2: 3x3 kernel, output feature-map channels 256;
convolutional layer conv3_3: 3x3 kernel, output feature-map channels 256;
pooling layer pool3: 2x2 kernel, max pooling;
convolutional layer conv4_1: 3x3 kernel, output feature-map channels 512; and
convolutional layer conv4_2: 3x3 kernel, output feature-map channels 512.
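As an illustration, the second-to-fourth-stage stack just listed can be written out directly; in the sketch below, ReLU activations and 'same' padding are assumptions (the text states only kernels, strides, and channel counts), and the 64-channel input is assumed to come from the first-stage network of fig. 5B.

```python
# Sketch: the listed downsampling layers as a PyTorch module.
import torch.nn as nn

stages_2_to_4 = nn.Sequential(
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),  # one downsampling, 64 channels
    nn.MaxPool2d(kernel_size=2),                       # pool1
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(True),   # conv2_1
    nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(True),  # conv2_2
    nn.MaxPool2d(2),                                   # pool2
    nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(True),  # conv3_1
    nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(True),  # conv3_2
    nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(True),  # conv3_3
    nn.MaxPool2d(2),                                   # pool3
    nn.Conv2d(256, 512, 3, padding=1), nn.ReLU(True),  # conv4_1
    nn.Conv2d(512, 512, 3, padding=1), nn.ReLU(True),  # conv4_2
)
```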
2) Up-sampling network.
The up-sampling network completes the human body feature point localization task. Its layers are as follows:
convolutional layer conv5_1: 3x3 kernel, output feature-map channels 256;
convolutional layer conv5_2: 3x3 kernel, output feature-map channels 256;
convolutional layer conv5_3: 3x3 kernel, output feature-map channels 256;
convolutional layer conv5_4: 1x1 kernel, output feature-map channels 256; and
convolutional layer conv5_5: 1x1 kernel, output feature-map channels 16.
The network output is a 16 x 96 x 72 heatmap tensor, where 16 is the number of key points; the 16 maps correspond, in order, to the different key points. During testing, the maximum score is taken from each 96 x 72 heatmap and the position (x, y) of that maximum is obtained. The position of the key point in the original image is then (4x + 2, 4y + 2), where the factor of 4 maps from heatmap scale back to the original size and the offset of 2 reduces the quantization error.
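The decoding rule just described, as a sketch (the heatmaps are assumed to arrive as a NumPy array of shape (16, 96, 72)):

```python
# Sketch: decode each keypoint heatmap to coordinates in the original image.
import numpy as np

def decode_heatmaps(heatmaps: np.ndarray):
    """heatmaps: (16, 96, 72); returns one (x, y) per keypoint, in order."""
    coords = []
    for hm in heatmaps:
        y, x = np.unravel_index(np.argmax(hm), hm.shape)  # row = y, col = x
        coords.append((4 * x + 2, 4 * y + 2))  # map back to original scale
    return coords
```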
According to an embodiment of the present application, the loss function may be:
$$f_t^j = \sum_{p} W(p)\,\bigl\| S_t^j(p) - S_*^j(p) \bigr\|_2^2$$
where W is a weight: W(p) = 0 when the position p is not labeled, so that true positive predictions are not penalized during training. S_t^j(p) is the confidence value at point p on the j-th position confidence map output by Branch 1 of the stage-t network, and S_*^j(p) is the confidence value at point p on the j-th body part confidence map of the ground truth.
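As a PyTorch sketch of this loss for a single stage t (summation over stages and any batch dimension is left to the caller; the tensor shapes are assumptions):

```python
# Sketch: weighted L2 loss over confidence maps, masking unlabeled positions.
import torch

def stage_loss(pred: torch.Tensor, gt: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """pred, gt: (J, H, W) confidence maps; w: (H, W), zero where p is unlabeled."""
    return (w * (pred - gt) ** 2).sum()
```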
In the network model, a prediction is counted as a hit when the error between (x, y) and the ground truth is smaller than 1/20 of the picture width and height, respectively. The total hit count is computed to obtain the model accuracy. Specifically, Table 1 illustrates the model accuracy for keypoint detection on the front, sides, and back of the human body.
TABLE 1
(The accuracy values are shown as an image in the original publication and are not reproduced here.)
Human body detection model 210 and key point detection model 220
The algorithms of the human detection model 210 of the data acquisition module 110 and the keypoint detection model 220 of the training module 150 will be explained below with reference to fig. 6A and 6B. Fig. 6A and 6B are schematic diagrams of an optimized deep neural network architecture and a VGG16 network front-end architecture, respectively, according to particular embodiments of the present application.
The human keypoint detection model 220 is obtained by constructing a deep neural network architecture and training it with already-labeled images. Predicting the subject's keypoint positions may include the following substeps.
1) Human body partial image detection is performed by the human body detection model 210.
In this embodiment, the keypoint detection model 220 is constructed by the top-down method: each human body in the picture is first detected by the human body detection model 210, and keypoint detection is then performed on each detected, framed human body partial image. This step detects each human body in the picture. The human body detection model 210 adopts a RetinaNet model, which is a single-stage (one-stage) detection network. The RetinaNet structure is a single network for detecting a target image, consisting of a backbone network and two task-specific subnetworks: the backbone network computes convolutional features over the whole image; the first subnetwork performs image classification on the backbone output; and the second subnetwork performs bounding-box regression. The first (classification) subnetwork may comprise cascaded convolutional layers and fully connected layers. The loss function L_C of the RetinaNet structure comprises the loss function L_cls of the classification subnetwork and the loss function L_bb of the regression subnetwork, i.e., L_C = L_bb + L_cls.
2) Normalizing the labeled image.
In the embodiment of the application, data preprocessing plays an irreplaceable role in deep-network training. When training a network, if raw data are fed directly into the deep network, the activation functions often cause a large amount of feature information to be lost; therefore, before network training, the input image must be normalized as a data preprocessing step. Normalization here uses cropping and mean subtraction: the original image (the framed, labeled human body partial image) is uniformly cropped to 368 x 368, and the three-channel image mean {104, 117, 123} is then subtracted channel by channel; the normalized image is then sent to the network for training.
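The stated preprocessing, as a sketch (BGR channel order is an assumption consistent with the {104, 117, 123} mean and OpenCV conventions, and resizing stands in for the unspecified cropping procedure):

```python
# Sketch: normalize a framed body image to 368 x 368 with mean subtraction.
import cv2
import numpy as np

def normalize(image_bgr: np.ndarray) -> np.ndarray:
    img = cv2.resize(image_bgr, (368, 368)).astype(np.float32)
    img -= np.array([104.0, 117.0, 123.0], dtype=np.float32)  # per-channel mean
    return img
```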
3) The deep neural network architecture is constructed.
In the embodiment of the present application, as shown in Fig. 6A, the deep neural network is based on the convolutional neural network VGG16 (shown in Fig. 6B), truncated after its fourth convolutional block. The purpose is to speed up body posture assessment while still meeting the posture assessment accuracy requirement. The architecture comprises a front-end network and a back-end network: the front-end network extracts image features and passes the feature maps on, while the back-end network locates the human body feature points and links the human skeleton.
the front-end network specifically comprises:
convolutional layer conv1_1, 3x3 convolution kernel, 64 output feature map channels; convolutional layer conv1_2, 3x3 kernel, 64 output channels; pooling layer pool1, 2x2 kernel, max pooling; convolutional layer conv2_1, 3x3 kernel, 128 output channels; convolutional layer conv2_2, 3x3 kernel, 128 output channels; pooling layer pool2, 2x2 kernel, max pooling; convolutional layer conv3_1, 3x3 kernel, 256 output channels; convolutional layer conv3_2, 3x3 kernel, 256 output channels; convolutional layer conv3_3, 3x3 kernel, 256 output channels; pooling layer pool3, 2x2 kernel, max pooling; convolutional layer conv4_1, 3x3 kernel, 512 output channels; convolutional layer conv4_2, 3x3 kernel, 512 output channels.
The back-end network comprises:
branch 1, used to complete the human body feature point localization task, specifically comprising: convolutional layer conv5_1, 3x3 kernel, 512 output channels; convolutional layer conv5_2, 3x3 kernel, 512 output channels; convolutional layer conv5_3, 3x3 kernel, 256 output channels; convolutional layer conv5_4, 1x1 kernel, 256 output channels; convolutional layer conv5_5, 1x1 kernel, 38 output channels;
and branch 2, used to complete the human skeleton linking task, specifically comprising: convolutional layer conv6_1, 3x3 kernel, 512 output channels; convolutional layer conv6_2, 3x3 kernel, 512 output channels; convolutional layer conv6_3, 3x3 kernel, 256 output channels; convolutional layer conv6_4, 1x1 kernel, 256 output channels; convolutional layer conv6_5, 1x1 kernel, 19 output channels.
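For illustration, the layer configuration listed above could be expressed as the following PyTorch sketch. The patent does not specify activation placement, so ReLU after every convolution (including the prediction layers) is an assumption, as are all names.

```python
import torch
import torch.nn as nn

def make_stage(cfg):
    """Build conv+ReLU layers from (in, out, kernel) tuples; 'M' is 2x2 max pool."""
    layers = []
    for item in cfg:
        if item == "M":
            layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
        else:
            cin, cout, k = item
            layers.append(nn.Conv2d(cin, cout, k, padding=k // 2))
            layers.append(nn.ReLU(inplace=True))
    return nn.Sequential(*layers)

# Front-end: VGG16 truncated after conv4_2, for feature extraction.
FRONT_END = [(3, 64, 3), (64, 64, 3), "M",
             (64, 128, 3), (128, 128, 3), "M",
             (128, 256, 3), (256, 256, 3), (256, 256, 3), "M",
             (256, 512, 3), (512, 512, 3)]

# Branch 1 (feature-point localization, 38 channels) and
# branch 2 (skeleton linking, 19 channels), per the layer lists above.
BRANCH1 = [(512, 512, 3), (512, 512, 3), (512, 256, 3), (256, 256, 1), (256, 38, 1)]
BRANCH2 = [(512, 512, 3), (512, 512, 3), (512, 256, 3), (256, 256, 1), (256, 19, 1)]

class TwoBranchPoseNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.front_end = make_stage(FRONT_END)
        self.branch1 = make_stage(BRANCH1)
        self.branch2 = make_stage(BRANCH2)

    def forward(self, x):              # x: (B, 3, 368, 368), normalized
        feats = self.front_end(x)      # shared feature maps
        return self.branch1(feats), self.branch2(feats)

# A 368x368 input yields 46x46 output maps after the three 2x2 pools,
# matching the normalization size described above.
# maps1, maps2 = TwoBranchPoseNet()(torch.randn(1, 3, 368, 368))
```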
In the embodiment of the present application, the loss functions are:

$$f_S^t = \sum_{j=1}^{J}\sum_{p} W(p)\cdot \left\lVert S_j^t(p)-S_j^*(p)\right\rVert_2^2$$

$$f_L^t = \sum_{c=1}^{C}\sum_{p} W(p)\cdot \left\lVert L_c^t(p)-L_c^*(p)\right\rVert_2^2$$

where $W$ is a weight, and $W(p)$ is 0 when the position $p$ is not labeled, so as to avoid penalizing true positive predictions during training. $S_j^t(p)$ denotes the confidence value at point $p$ on the $j$-th part confidence map output by Branch 1 at the $t$-th stage, and $S_j^*(p)$ denotes the confidence value at point $p$ on the $j$-th part confidence map of the ground truth. $L_c^t(p)$ is the vector at point $p$ on the $c$-th part affinity vector field output by Branch 2 at the $t$-th stage, and $L_c^*(p)$ is the vector at point $p$ on the $c$-th part affinity vector field of the ground truth.
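A minimal Python sketch of this weighted L2 loss for a single stage, assuming PyTorch tensors with the channel counts listed above (38 from Branch 1, 19 from Branch 2); the function name and shapes are illustrative:

```python
import torch

def masked_stage_loss(S_t, S_star, L_t, L_star, W):
    """Stage-t loss f_S^t + f_L^t: weighted L2 distance between predicted
    and ground-truth confidence maps (S) and part affinity fields (L).

    Assumed shapes: S_t, S_star: (B, 38, H, W); L_t, L_star: (B, 19, H, W);
    W: (B, 1, H, W) binary mask that is 0 wherever p is unlabeled.
    """
    f_S = (W * (S_t - S_star).pow(2)).sum()
    f_L = (W * (L_t - L_star).pow(2)).sum()
    return f_S + f_L
```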
4) The labeled human body partial images extracted from the detection frames are used as a training set, and the training set is input into the constructed deep neural network for training to obtain the human keypoint detection model 220.
In the embodiment of the present application, the training process of a convolutional neural network (one type of deep neural network) learns the parameters in the network structure through the backpropagation algorithm on a large amount of labeled data (supervised learning). The basic idea is as follows: starting from a set of initialized model parameters (for example, the parameters in the network structure are randomly initialized from a Gaussian distribution), the input data is propagated forward through the convolutional neural network to obtain a predicted output. If the predicted output differs from the actual class label of the data, the error is propagated backward layer by layer toward the input layer, and the neurons in each layer update the parameters in the network structure according to this error. For a convolutional neural network, the parameters to be learned include the convolution kernel parameters, the connection parameters between layers, and the bias of each layer. The trained model can then compute the class label corresponding to new input data, thereby completing classification or prediction tasks.
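A schematic training step along these lines, reusing the sketches above and assuming a hypothetical DataLoader `loader` that yields normalized images together with their ground-truth maps and weight masks; the optimizer and hyperparameters are assumptions, not values from this application:

```python
import torch

model = TwoBranchPoseNet()                   # from the sketch above
opt = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

for images, S_star, L_star, W in loader:     # labeled training batches
    S_t, L_t = model(images)                 # forward propagation
    loss = masked_stage_loss(S_t, S_star, L_t, L_star, W)
    opt.zero_grad()
    loss.backward()                          # backpropagate the error
    opt.step()                               # update network parameters
```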
The present application estimates the human body posture with a deep convolutional neural network, effectively completing feature-point regression and human posture regression in a bottom-up manner; it completes human body posture estimation in real time with high accuracy by means of a deep neural network optimization algorithm; and it adopts a custom included-angle similarity evaluation method, which effectively obtains the similarity between two postures and accurately computes the similarity estimation function.
Further, in the present application, the human body detection model 210 is constructed on a RetinaNet structure and has been trained to accurately frame out human body partial images from the annotated images.
Data cleansing
The data cleansing algorithm of the cleansing unit 152 of the training module 150 will be explained below.
To improve the consistency of the data in the image dataset, the cleaning unit 152 cleans the acquired images and passes the screened images to the labeling unit 154. The cleaning process may include the following substeps: a comparison step, a condition-proportion obtaining step, and a cleaning step.
In the comparison step, the similarity of one or more images to be detected and all basic images in a basic image set is compared to obtain the comparison similarity.
The image to be detected and all M basic images can be compared through a neural network model to obtain the pairwise comparison similarities. In one example, if there is only one image to be detected, it is compared with each basic image to obtain M comparison similarities; in another example, if there are N images to be detected, each of them is compared pairwise with each basic image, yielding M × N comparison similarities.
In the condition-proportion obtaining step, each comparison similarity is compared with a comparison threshold to obtain the condition proportion of the one or more images to be detected, where the condition proportion is the ratio of the number of comparison similarities (between the images to be detected and all the basic images) that are higher than the comparison threshold to the total number of comparisons.
The comparison similarities obtained in the comparison step are compared with a comparison threshold, which can be set to different values for different situations. When determining whether an image already in the image dataset belongs to the dataset, the comparison threshold may be set relatively low; when determining whether an image from another image dataset belongs to the present image dataset, the comparison threshold may be set relatively high.
After each comparison similarity is compared with the comparison threshold, the number of comparison similarities higher than the threshold is counted, and the ratio of this number to the total number of comparisons is the condition proportion. In one example, after the comparison step, M comparison similarities are obtained between one image to be detected and all M basic images; comparing them with the comparison threshold, if X of them are higher than the threshold, the condition proportion of that image is X/M. In another example, M × N comparison similarities are obtained for N images to be detected; if Y of them are higher than the comparison threshold, the condition proportion of the N images as a whole is Y/(M × N).
In the cleaning step, the one or more images to be detected are merged into the basic image set according to the condition proportion and a preset ratio.
Based on the condition proportion obtained in the previous step, it is judged whether the one or more images to be detected depict the same target as the images in the basic image set: the condition proportion is compared with a preset ratio, and if it is higher, the images are considered to depict the same target and the images to be detected are merged into the basic image set. In one example, if the condition proportion X/M of one image to be detected is higher than the preset ratio, that image is merged into the basic image set. In another example, if the condition proportion Y/(M × N) of the N images to be detected is higher than the preset ratio, the N images can be merged together into the basic image set.
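Combining the three substeps, the following minimal Python sketch illustrates the cleaning decision; the `similarity` function (for example, backed by the comparison neural network mentioned above) and all other names are assumptions for illustration.

```python
import numpy as np

def clean(candidates, base_set, similarity, threshold, preset_ratio):
    """Merge candidate images into the base set when the fraction of
    pairwise similarities above `threshold` exceeds `preset_ratio`.

    similarity(a, b) is assumed to return a score in [0, 1].
    """
    # Comparison step: pairwise similarities (M x N values in total).
    sims = [similarity(c, b) for c in candidates for b in base_set]
    # Condition-proportion step: Y / (M * N).
    condition_proportion = np.mean([s > threshold for s in sims])
    # Cleaning step: merge when the proportion exceeds the preset ratio.
    if condition_proportion > preset_ratio:
        base_set.extend(candidates)  # treated as images of the same target
    return condition_proportion
```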
According to another embodiment of the present application, an apparatus 100 for body posture assessment may include a memory and a processor. The memory stores instructions. The processor is coupled to the memory and can execute the instructions stored in the memory, such that their execution causes the operations described above to be performed. The specific operations have been set forth in the description of the respective modules above and are not repeated here.
The advantages of the present application will be described below by comparing with the prior art, wherein the solution of the present application can achieve at least one of the following advantages.
1) The data sets are acquired in a different way. Conventional keypoint annotation marks visually apparent joint locations of moving structures. In contrast, the keypoints in the dataset collected in the present application are points located by touch according to the bone positions and then labeled. Meanwhile, the keypoint labels are not the keypoint positions in the traditional sense (as in datasets such as COCO) but a set of keypoint positions suited to the body posture health indicators of children and adolescents.
2) In the prior art, anatomical landmark points need to be calibrated manually on the captured image; automatic identification and automatic landmark placement are not possible, and the precision is poor. In contrast, the present application uses a large-scale, high-precision dataset to train the keypoint detection model with a deep neural network, providing a more accurate body posture health detection model for children and adolescents than existing methods.
3) The prior art requires a great deal of manual intervention to obtain the keypoint positions and the health detection result. In contrast, in the present application the result can be obtained directly without manual intervention, which improves data analysis efficiency.
4) Prior art schemes impose strict requirements on the photo collection scene, for example requiring a measuring scale in the background for auxiliary position marking. The present scheme needs no scale background, so the project is easier to implement.
In addition, the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
Throughout this document, terms are not limited to the meanings literally defined, but encompass different means for achieving the same or similar functionality, without departing from the scope of the application, as defined in the appended claims.
It is to be understood that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. Additionally, terms (e.g., terms defined in commonly used dictionaries) should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
For example, ordinal terms such as "first," "second," etc., are used only to distinguish one element from another, and do not limit their order or importance; spatially relative terms such as "upper," "lower," and the like are not limited to the orientation shown in the drawings, but include different orientations of the device in use; the term "and/or" includes any and all combinations of one or more of the associated listed items; the terms "comprises," "comprising," and/or "having," when used in this specification, specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof; the term "exemplary" means serving as an example or illustration; the terms "substantially," "about," and the like denote approximation, not degree, and indicate the inherent deviations in measured or calculated values that will be recognized by those of ordinary skill in the art; in describing embodiments of the present application, the term "may" means "one or more embodiments of the present application"; when appearing after a list of features, terms such as "at least one of ..." modify the entire list rather than individual elements of the list. In addition, in the embodiments of the present application, the singular form may include the plural unless the context clearly indicates otherwise.
It should also be noted that some of the steps described herein do not necessarily have to occur in the written sequential order unless explicitly stated otherwise. For example, in some alternative embodiments, the steps may occur in reverse order, in parallel order, or certain steps may be omitted or added.
Furthermore, those skilled in the art will appreciate that the subject technology can be implemented as a system, method, or computer program product. Accordingly, this application may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.), or an embodiment combining software and hardware aspects, any of which may generally be referred to as a "circuit," "module," or "system." Furthermore, the present application may take the form of a computer program product embodied in any tangible medium of expression having computer-usable program code embodied in the medium.
The present application is described in terms of flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate systems, methods, functions, and operations according to various embodiments of the present application. It should be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The above description covers only the preferred embodiments of the present application and the principles applied. Those skilled in the art will appreciate that the scope of the application is not limited to embodiments with the specific combination of the above features; without departing from the concept of the present application, it also covers other embodiments formed by any combination of the above features or their equivalents, for example, technical solutions formed by mutually replacing the above features with technical features of similar functions disclosed in the present application.

Claims (12)

1. Apparatus for body posture assessment, wherein the apparatus comprises:
a data acquisition module that acquires an image including a body to be evaluated, and detects the body from the image;
the prediction module predicts the positions of the key points of the body according to a preset key point detection model;
the calculation module is used for calculating a health index corresponding to the key point position according to a preset health standard database;
an assessment module to compare the calculated health indicator with reference indicators in the health criteria database and to assess the body based on a result of the comparison.
2. The apparatus of claim 1, further comprising:
a training module to train a data set based on the labeled positions of a plurality of body samples to obtain the preset keypoint detection model,
wherein the marker position is determined from a human bone position of the body sample obtained by means of touch.
3. The apparatus of claim 2, wherein,
the data acquisition module acquires a sample image of the body sample; and
the training module comprises:
a marking unit for marking a marking position corresponding to the position of the human skeleton on the sample image; and
a training unit for training the keypoint detection model by using the marked positions of the sample images.
4. The apparatus of claim 3, wherein the data acquisition module acquires the sample image by taking a front, side, and back view of the body sample.
5. The apparatus of claim 3, wherein,
the training module further comprises:
a cleaning unit for cleaning the sample image to screen out a qualified sample image suitable for machine learning; and
the labeling unit labels the qualified sample image among the sample images.
6. The apparatus of claim 2, wherein the body and the body sample are human bodies of a particular population having particular appearance characteristics, and the body and the body sample are human bodies of a population of children and adolescents.
7. The apparatus of claim 1, wherein the health indicator comprises at least one of: high and low shoulders, neck extension, pelvis forward or backward tilting, pelvis side tilting, thoracic vertebra backward convex angle, lumbar vertebra forward convex angle, long and short legs, human body gravity center line, knee joint Q angle and spine curvature.
8. A method for body posture assessment, comprising:
acquiring an image including a body to be evaluated, and detecting the body from the image;
predicting the positions of key points of the body according to a preset key point detection model;
calculating a health index corresponding to the position of the key point according to a preset health standard database;
comparing the calculated health indicator with reference indicators in the health criteria database and evaluating the body based on the result of the comparison.
9. The method of claim 8, further comprising:
training a data set based on marker positions of a plurality of body samples to obtain the preset keypoint detection model,
wherein the marker position is determined from a human bone position of the body sample obtained by means of touch.
10. The method of claim 9, wherein training the data set comprises:
acquiring a sample image of the body sample;
marking a marking position corresponding to the human skeleton position on the sample image; and
training the keypoint detection model by using the marked positions of the sample images.
11. The method of claim 10, wherein acquiring the sample image comprises: the sample image is acquired by taking a front, side and back photograph of the body sample.
12. The method of claim 10, wherein,
training the data set comprises: cleaning the sample images to screen out qualified sample images suitable for machine learning; and
marking the mark position comprises: labeling the qualified sample images among the sample images.
CN202010283948.8A 2020-04-13 2020-04-13 Apparatus and method for body posture assessment Active CN112102947B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010283948.8A CN112102947B (en) 2020-04-13 2020-04-13 Apparatus and method for body posture assessment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010283948.8A CN112102947B (en) 2020-04-13 2020-04-13 Apparatus and method for body posture assessment

Publications (2)

Publication Number Publication Date
CN112102947A true CN112102947A (en) 2020-12-18
CN112102947B CN112102947B (en) 2024-02-13

Family

ID=73749637

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010283948.8A Active CN112102947B (en) 2020-04-13 2020-04-13 Apparatus and method for body posture assessment

Country Status (1)

Country Link
CN (1) CN112102947B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112686854A (en) * 2020-12-25 2021-04-20 四川大学华西医院 Method and system for automatically measuring scoliosis Cobb angle
CN113066549A (en) * 2021-04-06 2021-07-02 青岛瑞斯凯尔生物科技有限公司 Clinical effectiveness evaluation method and system of medical instrument based on artificial intelligence
CN113139962A (en) * 2021-05-26 2021-07-20 北京欧应信息技术有限公司 System and method for scoliosis probability assessment
CN117357103A (en) * 2023-12-07 2024-01-09 山东财经大学 CV-based limb movement training guiding method and system

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062526A (en) * 2017-12-15 2018-05-22 厦门美图之家科技有限公司 A kind of estimation method of human posture and mobile terminal
CN109256212A (en) * 2018-08-17 2019-01-22 上海米因医疗器械科技有限公司 Bone health assessment models construction method, device, equipment, medium and appraisal procedure
CN109508681A (en) * 2018-11-20 2019-03-22 北京京东尚科信息技术有限公司 The method and apparatus for generating human body critical point detection model
CN109758756A (en) * 2019-02-28 2019-05-17 国家体育总局体育科学研究所 Gymnastics video analysis method and system based on 3D camera
CN109902659A (en) * 2019-03-15 2019-06-18 北京字节跳动网络技术有限公司 Method and apparatus for handling human body image
US20190347828A1 (en) * 2018-05-09 2019-11-14 Beijing Kuangshi Technology Co., Ltd. Target detection method, system, and non-volatile storage medium
CN110495889A (en) * 2019-07-04 2019-11-26 平安科技(深圳)有限公司 Postural assessment method, electronic device, computer equipment and storage medium
CN110633608A (en) * 2019-03-21 2019-12-31 广州中科凯泽科技有限公司 Human body limb similarity evaluation method of posture image
US20200042776A1 (en) * 2018-08-03 2020-02-06 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for recognizing body movement
CN110858277A (en) * 2018-08-22 2020-03-03 阿里巴巴集团控股有限公司 Method and device for obtaining attitude classification model
WO2020052169A1 (en) * 2018-09-12 2020-03-19 深圳云天励飞技术有限公司 Clothing attribute recognition detection method and apparatus

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062526A (en) * 2017-12-15 2018-05-22 厦门美图之家科技有限公司 A kind of estimation method of human posture and mobile terminal
US20190347828A1 (en) * 2018-05-09 2019-11-14 Beijing Kuangshi Technology Co., Ltd. Target detection method, system, and non-volatile storage medium
US20200042776A1 (en) * 2018-08-03 2020-02-06 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for recognizing body movement
CN109256212A (en) * 2018-08-17 2019-01-22 上海米因医疗器械科技有限公司 Bone health assessment models construction method, device, equipment, medium and appraisal procedure
CN110858277A (en) * 2018-08-22 2020-03-03 阿里巴巴集团控股有限公司 Method and device for obtaining attitude classification model
WO2020052169A1 (en) * 2018-09-12 2020-03-19 深圳云天励飞技术有限公司 Clothing attribute recognition detection method and apparatus
CN110895702A (en) * 2018-09-12 2020-03-20 深圳云天励飞技术有限公司 Clothing attribute identification detection method and device
CN109508681A (en) * 2018-11-20 2019-03-22 北京京东尚科信息技术有限公司 The method and apparatus for generating human body critical point detection model
CN109758756A (en) * 2019-02-28 2019-05-17 国家体育总局体育科学研究所 Gymnastics video analysis method and system based on 3D camera
CN109902659A (en) * 2019-03-15 2019-06-18 北京字节跳动网络技术有限公司 Method and apparatus for handling human body image
CN110633608A (en) * 2019-03-21 2019-12-31 广州中科凯泽科技有限公司 Human body limb similarity evaluation method of posture image
CN110495889A (en) * 2019-07-04 2019-11-26 平安科技(深圳)有限公司 Postural assessment method, electronic device, computer equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BAKKEN RH et al.: "Real-time three-dimensional skeletonisation using general-purpose computing on graphics processing units applied to computer vision-based human pose estimation", The International Journal of High Performance Computing Applications, vol. 31, no. 4, pages 259-273 *
QIANG Baohua et al.: "Lightweight human skeleton keypoint detection model based on improved CPMs and SqueezeNet" (in Chinese), Journal of Computer Applications, vol. 40, no. 06, pages 1806-1811 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112686854A (en) * 2020-12-25 2021-04-20 四川大学华西医院 Method and system for automatically measuring scoliosis Cobb angle
CN113066549A (en) * 2021-04-06 2021-07-02 青岛瑞斯凯尔生物科技有限公司 Clinical effectiveness evaluation method and system of medical instrument based on artificial intelligence
CN113066549B (en) * 2021-04-06 2022-07-26 青岛瑞斯凯尔生物科技有限公司 Clinical effectiveness evaluation method and system of medical instrument based on artificial intelligence
CN113139962A (en) * 2021-05-26 2021-07-20 北京欧应信息技术有限公司 System and method for scoliosis probability assessment
CN113139962B (en) * 2021-05-26 2021-11-30 北京欧应信息技术有限公司 System and method for scoliosis probability assessment
CN117357103A (en) * 2023-12-07 2024-01-09 山东财经大学 CV-based limb movement training guiding method and system
CN117357103B (en) * 2023-12-07 2024-03-19 山东财经大学 CV-based limb movement training guiding method and system

Also Published As

Publication number Publication date
CN112102947B (en) 2024-02-13

Similar Documents

Publication Publication Date Title
CN112102947B (en) Apparatus and method for body posture assessment
US11017547B2 (en) Method and system for postural analysis and measuring anatomical dimensions from a digital image using machine learning
CN110969114B (en) Human body action function detection system, detection method and detector
CN107533370B (en) Image processing apparatus, image processing method, and program
CN111881705A (en) Data processing, training and recognition method, device and storage medium
CN110074788B (en) Body data acquisition method and device based on machine learning
CN110263768A (en) A kind of face identification method based on depth residual error network
Campomanes-Alvarez et al. Computer vision and soft computing for automatic skull–face overlay in craniofacial superimposition
CN113139962B (en) System and method for scoliosis probability assessment
CN110544302A (en) Human body action reconstruction system and method based on multi-view vision and action training system
Yang et al. Human upper limb motion analysis for post-stroke impairment assessment using video analytics
Campomanes-Álvarez et al. Modeling facial soft tissue thickness for automatic skull-face overlay
CN112016497A (en) Single-view Taijiquan action analysis and assessment system based on artificial intelligence
JP2009230703A (en) Object detection method, object detection device, and object detection program
CN109241881A (en) A kind of estimation method of human posture
CN110349206B (en) Method and related device for detecting human body symmetry
Tran et al. MBNet: A multi-task deep neural network for semantic segmentation and lumbar vertebra inspection on X-ray images
Zhang Innovation of English teaching model based on machine learning neural network and image super resolution
Škorvánková et al. Automatic estimation of anthropometric human body measurements
CN111275754B (en) Face acne mark proportion calculation method based on deep learning
CN113298783A (en) Hip joint rotation center detection method and imaging method under multi-posture condition
CN109740458B (en) Method and system for measuring physical characteristics based on video processing
Bermejo et al. FacialSCDnet: a deep learning approach for the estimation of subject-to-camera distance in facial photographs
CN115631155A (en) Bone disease screening method based on space-time self-attention
CN112861699A (en) Method for estimating height of human body in any posture based on single depth image and multi-stage neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant