WO2023037401A1 - Skeleton recognition method, skeleton recognition program, and gymnastics scoring assistance device - Google Patents


Info

Publication number
WO2023037401A1
Authority
WO
WIPO (PCT)
Prior art keywords
person
information
point cloud
twisting motion
skeleton
Application number
PCT/JP2021/032788
Other languages
French (fr)
Japanese (ja)
Inventor
一成 井上
Original Assignee
Fujitsu Limited (富士通株式会社)
Application filed by Fujitsu Limited (富士通株式会社)
Priority to PCT/JP2021/032788
Publication of WO2023037401A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/70 - Determining position or orientation of objects or cameras

Definitions

  • The present disclosure relates to a skeleton recognition method, a skeleton recognition program, and a gymnastics scoring support device.
  • Skeleton recognition technology identifies the joint positions of the human body from point cloud information, that is, a set of points on the surface of the human body acquired by a 3D sensor.
  • A human body model, which is a geometric model, is fitted to the point cloud to determine the joint positions of the model. Fitting means optimizing an objective function that represents the degree of matching between the point cloud and the human body model; the optimization is achieved by minimizing the distance between the point cloud and the model.
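  • As an illustrative aside, fitting can be sketched as objective-function minimization. The sketch below is an assumption-laden toy, not the disclosed implementation: it reduces the body model to a single limb segment of assumed radius and recovers its endpoints, which play the role of the pose parameters, by minimizing the summed squared point-to-surface distance with SciPy.

```python
import numpy as np
from scipy.optimize import minimize

def point_to_segment_dist(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

def objective(params, points, radius=0.05):
    """Sum of squared residuals between observed points and the model surface."""
    a, b = params[:3], params[3:]
    return sum((point_to_segment_dist(p, a, b) - radius) ** 2 for p in points)

rng = np.random.default_rng(0)
true_a, true_b = np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
ts = rng.uniform(0.0, 1.0, 200)
dirs = rng.normal(size=(200, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
points = true_a + ts[:, None] * (true_b - true_a) + 0.05 * dirs  # noisy surface points

init = np.array([0.1, -0.1, 0.1, 0.1, 0.1, 0.9])  # "initial information" (pose guess)
result = minimize(objective, init, args=(points,), method="Nelder-Mead")
print(result.x.reshape(2, 3))  # recovered endpoints, close to true_a and true_b
```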
  • As one aspect, the present disclosure aims to improve the accuracy of skeleton recognition based on a 3D sensor point cloud during a twisting motion.
  • In one embodiment, 3D point cloud information of a person and an object that the person contacts is acquired, and the acquired 3D point cloud information is input to a machine learning model to generate skeleton information of the person.
  • When it is determined that the person is performing a twisting motion, 3D point cloud information is generated in which the posture information of the person in the acquired 3D point cloud information is rotated by a predetermined angle in the direction of the twisting motion. The generated skeleton information of the person is then corrected based on the rotated 3D point cloud information and on 3D models representing the person and the object.
  • FIG. 1 is an exemplary functional configuration diagram of the gymnastics scoring support device of this embodiment;
  • FIG. 2 is an exemplary conceptual diagram representing an integrated 3D point cloud of a subject;
  • FIG. 3 is an exemplary conceptual diagram illustrating acquisition of multi-view depth images of a subject and an object;
  • FIG. 4 is an exemplary conceptual diagram illustrating fitting of a human body model to an integrated 3D point cloud;
  • FIG. 5 is an exemplary conceptual diagram illustrating a three-dimensional model of a subject and an object;
  • FIG. 6 is an exemplary conceptual diagram representing a multi-angle view;
  • FIG. 7 is an exemplary conceptual diagram representing a technique recognition view;
  • FIG. 8 is an exemplary conceptual diagram representing referencing of scoring support information;
  • FIG. 9 is an exemplary functional configuration diagram of the twisting motion recognition unit of this embodiment;
  • FIGS. 10A to 10D are exemplary conceptual diagrams illustrating a twisting motion;
  • FIGS. 11A to 11D, 12A to 12C, and 13A to 13C are exemplary conceptual diagrams explaining determination of a twisting motion;
  • FIGS. 14A, 14B, 15A, and 15B are exemplary conceptual diagrams illustrating adjustment of initial information;
  • FIGS. 16A to 16E and FIG. 17 are exemplary conceptual diagrams explaining differences in optimization results due to differences in initial information;
  • FIG. 18A is an exemplary conceptual diagram explaining noise that occurs in an integrated 3D point cloud of a twisting motion;
  • FIG. 18B is an exemplary conceptual diagram explaining occlusion that occurs in an integrated 3D point cloud of a twisting motion;
  • FIG. 18C is an exemplary conceptual diagram explaining a stitching defect that occurs in an integrated 3D point cloud of a twisting motion;
  • FIG. 19 is an exemplary conceptual diagram explaining a difference in optimization results due to a difference in initial information;
  • FIG. 20 is an exemplary hardware configuration diagram of the gymnastics scoring support device of this embodiment;
  • FIG. 21 is an exemplary flowchart of the gymnastics scoring support processing of this embodiment;
  • FIG. 22 is an exemplary flowchart of the twisting motion recognition processing of this embodiment.
  • FIG. 1 illustrates the functional configuration of the gymnastics scoring support device 1.
  • The gymnastics scoring support device 1 includes a point cloud generation unit 12, a skeleton recognition unit 14, a technique recognition unit 16, and a scoring support unit 18.
  • The point cloud generation unit 12 is an example of a point cloud acquisition unit; it uses a plurality of detection devices 32 to measure the distances from the detection devices 32 to the target person and the target object and generates depth images.
  • The detection device 32 may be, for example, a three-dimensional laser sensor.
  • The three-dimensional laser sensor may be, for example, a MEMS (Micro Electro Mechanical Systems) mirror-type laser sensor that employs LiDAR (Light Detection and Ranging) technology.
  • The target person may be, for example, a person such as a gymnast (hereinafter, athlete), and the target object may be gymnastics equipment. In this embodiment, the gymnastics equipment is a horizontal bar.
  • The point cloud generation unit 12 measures the distance to the target person and the target object based on the time from when a laser pulse is projected by the light projecting unit of each of the plurality of detection devices 32 until the light reflected by the target person and the target object is received by the light receiving unit, and generates depth images.
  • The point cloud generation unit 12 generates a three-dimensional point cloud from the depth image generated by each of the plurality of detection devices 32, and integrates the generated point clouds into an integrated three-dimensional point cloud.
  • FIG. 2 exemplifies the integrated 3D point cloud of the target person.
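  • For illustration, the back-projection of a depth image into a 3D point cloud and the integration of clouds from multiple sensors can be sketched as follows, assuming pinhole intrinsics (fx, fy, cx, cy) and known 4x4 sensor-to-world extrinsics; the actual sensor geometry and calibration are not specified in the disclosure.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters) to an N x 3 point cloud."""
    v, u = np.nonzero(depth > 0)          # pixel rows/cols with valid depth
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.column_stack([x, y, z])

def integrate(clouds, extrinsics):
    """Transform each sensor's cloud into a common frame and concatenate."""
    merged = []
    for pts, T in zip(clouds, extrinsics):        # T: 4x4 sensor-to-world matrix
        homo = np.column_stack([pts, np.ones(len(pts))])
        merged.append((homo @ T.T)[:, :3])
    return np.vstack(merged)

# Example with a flat synthetic depth image and identity extrinsics.
depth = np.full((4, 4), 2.0)
cloud = depth_to_points(depth, fx=500.0, fy=500.0, cx=2.0, cy=2.0)
merged = integrate([cloud, cloud], [np.eye(4), np.eye(4)])
```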
  • In order to acquire multi-view depth images of the target person and the target object, a plurality of detection devices 32 are used as illustrated in FIG. 3; in FIG. 1, only one detection device 32 is shown for simplicity.
  • Although two detection devices 32 are illustrated in FIG. 3, three or more detection devices may be installed as appropriate so as not to interfere with the competition, the spectators, or the referees.
  • The skeleton recognition unit 14 is an example of a skeleton generation unit. For example, by combining skeleton recognition and fitting, it extracts the three-dimensional coordinates of each joint constituting the human body from the integrated 3D point cloud generated by the point cloud generation unit 12. In skeleton recognition, for example, a trained machine learning model is used to estimate the 3D skeleton coordinates.
  • The machine learning model may be created, for example, on a CNN (Convolutional Neural Network)-based deep learning network.
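  • The disclosure states only that a CNN-based deep learning network may be used. Purely as a hypothetical illustration, a PyTorch model that regresses joint coordinates from a depth image could look like the following; the architecture, joint count, and input format are assumptions.

```python
import torch
import torch.nn as nn

class SkeletonCNN(nn.Module):
    """Toy CNN that regresses num_joints 3D joint coordinates from a depth image."""
    def __init__(self, num_joints=21):
        super().__init__()
        self.num_joints = num_joints
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.head = nn.Linear(32 * 4 * 4, num_joints * 3)

    def forward(self, depth):                        # depth: (B, 1, H, W)
        f = self.features(depth).flatten(1)
        return self.head(f).view(-1, self.num_joints, 3)  # (B, J, 3) coordinates

model = SkeletonCNN()
coords = model(torch.zeros(1, 1, 128, 128))
print(coords.shape)  # torch.Size([1, 21, 3])
```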
  • In fitting, posture information obtained as the fitting result for a frame acquired earlier than the target frame is used as initial information, and a 3D model representing the target person and the target object is fitted to the integrated 3D point cloud generated by the point cloud generation unit 12.
  • The 3D skeleton coordinates are determined by defining an objective function that represents the likelihood of matching between the coordinates of the integrated 3D point cloud and the surface coordinates of the 3D model, and by finding, through optimization, the joint angles with the highest likelihood.
  • In the example of FIG. 4, a human body model, which is a 3D model representing the target person, is fitted to the integrated 3D point cloud of the target person.
  • As illustrated in FIG. 5, the human body model is composed of cylinders, elliptic cylinders, and the like; the length and radius of each cylinder and the length, major axis, and minor axis of each elliptic cylinder are optimized in advance to match the body shape of the target person. For example, in the horizontal bar event, the bar is also observed as a point cloud by the detection devices 32, so a 3D model of the object that the target person contacts is added; that is, a 3D model combining the human body model and a 3D model of the bar is used.
  • Since there are states in which the target person and the object are not in contact, the 3D model of the human body and the 3D model of the bar are not connected to each other. Being in contact means a state in which the target person and the object are connected, and includes, for example, a state in which the target person grips the object.
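  • A minimal data sketch of such a combined model is shown below; the segment dimensions are illustrative only (the real parameters are pre-optimized per athlete, and a standard horizontal bar is roughly 2.4 m long), and the body and bar are deliberately kept as separate, unconnected objects.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    name: str
    length: float       # m
    major_axis: float   # m (equals minor_axis for circular cylinders)
    minor_axis: float   # m

@dataclass
class CombinedModel:
    # Human body model: cylinders and elliptic cylinders (illustrative subset).
    body: list = field(default_factory=lambda: [
        Segment("torso", 0.50, 0.17, 0.11),       # elliptic cylinder
        Segment("upper_arm", 0.30, 0.05, 0.05),   # circular cylinder
    ])
    # Bar model, intentionally not linked to the body: the athlete may release it.
    bar: Segment = field(default_factory=lambda: Segment("bar", 2.4, 0.014, 0.014))

model = CombinedModel()
print(model.bar)
```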
  • The technique recognition unit 16 recognizes breaks between basic motions from the time-series data of the 3D skeleton coordinates obtained as the fitting result, and determines the basic motion and feature amounts for the divided time-series data.
  • Basic motions, their breaks, feature amounts, and the like are determined by rule bases and machine learning.
  • The technique recognition unit 16 recognizes basic techniques using the feature amounts related to the basic motions as parameters, and matches sequences of basic techniques in chronological order against the technique dictionary 34, a database created in advance, to recognize the technique information to be scored.
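  • As a hypothetical illustration of the chronological matching step, the sketch below greedily matches recognized basic-technique sequences against dictionary entries; the entries and the matching policy are assumptions, not taken from the disclosure.

```python
# Hypothetical dictionary: sequences of basic techniques -> technique names.
TECHNIQUE_DICTIONARY = {
    ("giant_swing", "twist_change_grip"): "example technique A",
    ("giant_swing", "release", "regrasp"): "example technique B",
}

def recognize_techniques(basic_sequence):
    """Greedily match the longest dictionary entry at each position."""
    i, found = 0, []
    while i < len(basic_sequence):
        for n in range(len(basic_sequence) - i, 0, -1):
            key = tuple(basic_sequence[i:i + n])
            if key in TECHNIQUE_DICTIONARY:
                found.append(TECHNIQUE_DICTIONARY[key])
                i += n
                break
        else:
            i += 1   # no entry starts here; skip this basic motion
    return found

print(recognize_techniques(["giant_swing", "twist_change_grip", "giant_swing"]))
# -> ['example technique A']
```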
  • The scoring support unit 18 uses the 3D skeleton coordinates acquired by the skeleton recognition unit 14 and the technique information recognized by the technique recognition unit 16 to generate, for example, the multi-angle view illustrated in FIG. 6 and the technique recognition view illustrated in FIG. 7, and displays them on the display device 36.
  • In the multi-angle view, for example, the joint angles in each frame of the athlete's performance can be examined in detail; the technique recognition view shows, for each performed technique, the technique name obtained from the technique recognition result.
  • The scoring support unit 18 performs scoring using the 3D skeleton coordinates, based on scoring rules defined in terms of the joint bending angles determined from the 3D coordinate positions, and displays the scoring result on the display device 36.
  • The multi-angle view may display the 3D skeleton coordinates from the front, side, top, and other viewpoints.
  • The technique recognition view may display, for example, the technique recognition results in chronological order, the group number of each technique, its difficulty level, its difficulty value score, a score indicating the difficulty of the entire performance, and the like.
  • As illustrated in FIG. 8, the referees can score while referring to the scoring support information displayed on the display device 36, such as the multi-angle view, the technique recognition view, and the scoring result produced by the scoring support unit 18.
  • FIG. 9 illustrates the functional configuration of the twisting motion recognition unit 20 included in the skeleton recognition unit 14.
  • The twisting motion recognition unit 20 performs skeleton recognition while the athlete is performing a twisting motion.
  • The twisting motion recognition unit 20 includes a twisting motion determination unit 22, an initial information adjustment unit 24, and an optimization unit 26.
  • The twisting motion determination unit 22 determines whether the athlete is performing a twisting motion.
  • In this embodiment, a twisting motion is, for example, a motion in the horizontal bar event in which the athlete twists the body to change direction while both hands are closer to the bar than both shoulders.
  • FIGS. 10A to 10D show an example of a twisting motion: the athlete changes direction by twisting the body half a turn or more while switching the hands gripping the bar.
  • The twisting motion determination unit 22 determines that the athlete's motion is a twisting motion candidate when, for example, the distance between the two positions on the bar corresponding to the positions of the athlete's hands has shortened and is shorter than a predetermined length, or when the athlete's arms are crossed. When the motion is a candidate and the average rotation angle of a predetermined body part is greater than a predetermined angle, the unit determines that the athlete is performing a twisting motion.
  • Specifically, as illustrated in FIGS. 11A to 11D, let Vhandn be the vector connecting the athlete's two hands n frames before (n = 1, 2), and let Xhandn be the projection of Vhandn onto the bar. In the examples of FIGS. 11A to 11D, both hands are in contact with the bar, so Vhandn and Xhandn are equal.
  • The twisting motion determination unit 22 determines that the distance between the two positions on the bar corresponding to the positions of the athlete's hands has shortened when formula (1) is satisfied:

    |Xhand1| - |Xhand2| < 0  ... (1)

    That is, in the example of FIGS. 11A and 11B, the magnitude of the component of the between-hands vector along the bar is smaller one frame before than two frames before.
  • The twisting motion determination unit 22 determines that the athlete's motion is a twisting motion candidate when formula (2) is satisfied in addition to formula (1):

    |Xhand1| < Lth  ... (2)

    That is, as illustrated in FIG. 11C, the motion is a twisting motion candidate when the magnitude of the component of the between-hands vector along the bar one frame before is shorter than a predetermined length Lth. The predetermined length Lth may be, for example, 30 cm.
  • The twisting motion determination unit 22 also determines whether the athlete's arms are crossed. Let Vshon be the vector connecting the shoulders n frames before (n = 2, 1) and Xshon the component of Vshon along the bar; as illustrated in FIG. 11D, when the directions of the vectors Xsho1 and Vsho1 are opposite, the arms are crossed and the motion is determined to be a twisting motion candidate.
  • When the athlete's motion is determined to be a twisting motion candidate, the twisting motion determination unit 22 determines that the candidate is a twisting motion if the average rotation angle of a predetermined body part is greater than a predetermined threshold θth.
  • The predetermined part may be, for example, the torso or the chest.
  • Specifically, the motion is determined to be a twisting motion when the average rotation angle Δθave, obtained by dividing the sum of the angle differences between adjacent frames over m frames by m, is greater than the predetermined threshold θth. The predetermined threshold θth may be, for example, 3°.
  • The direction of rotation, that is, the direction of the twisting motion, is determined from the sign of the average rotation angle Δθave.
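  • A minimal sketch of this determination is shown below, assuming the bar direction is known, that hand positions and torso rotation angles are available from preceding skeleton-recognition frames, and using the example thresholds above (the absolute value of Δθave gives the magnitude and its sign the direction).

```python
import numpy as np

BAR_DIR = np.array([1.0, 0.0, 0.0])  # unit vector along the bar (assumed known)
L_TH = 0.30                          # predetermined length Lth (30 cm)
THETA_TH = 3.0                       # predetermined threshold θth (3 degrees)

def along_bar(vec):
    """Magnitude of the component of vec along the bar direction."""
    return abs(float(np.dot(vec, BAR_DIR)))

def is_twist_candidate(hands_1_before, hands_2_before):
    """Formulas (1) and (2): hands_n_before = (left_pos, right_pos) n frames ago."""
    x1 = along_bar(hands_1_before[1] - hands_1_before[0])  # |Xhand1|
    x2 = along_bar(hands_2_before[1] - hands_2_before[0])  # |Xhand2|
    return (x1 - x2 < 0.0) and (x1 < L_TH)

def twist_decision(candidate, torso_angles_deg):
    """Average rotation Δθave over the last m frames; sign gives the direction."""
    if not candidate:
        return False, 0.0
    diffs = np.diff(torso_angles_deg)   # Δθ between adjacent frames
    avg = float(diffs.mean())           # Δθave = (sum of differences) / m
    return abs(avg) > THETA_TH, float(np.sign(avg))
```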
  • FIGS. 12A to 13C illustrate the twisting motion determination for m = 2: FIG. 12A shows the predetermined part three frames before, FIG. 12B two frames before, and FIG. 12C one frame before.
  • The predetermined part rotates about its own axis SA according to the athlete's movements, and the position and inclination of the axis SA also change with those movements.
  • FIG. 13A illustrates the bottom surface of the predetermined part in FIG. 12A, FIG. 13B that in FIG. 12B, and FIG. 13C that in FIG. 12C.
  • FIG. 13A shows the rotation angle θ3 of the predetermined part from the reference line CL three frames before; FIG. 13B shows the rotation angle θ2 two frames before and the angle difference Δθ2 between θ3 and θ2.
  • FIG. 13C shows the rotation angle θ1 from the reference line CL one frame before and the angle difference Δθ1 between θ2 and θ1.
  • The average rotation angle Δθave can then be calculated as (Δθ2 + Δθ1)/2.
  • Note that n = 1, 2 is an example; n = 1, 3 or n = 2, 4 may also be used, and m = 2 is likewise an example (m = 3, m = 4, and so on are possible). Whether the athlete's hands are closer to the bar than the shoulders can be determined by whether the start and end points of Vhandn are closer to the bar than those of Vshon. The twisting motion determination unit 22 may also omit the candidate determination and instead determine a twisting motion based on, for example, changes in the position and inclination of the predetermined body part's own axis and in its rotation angle.
  • When the athlete's motion is determined to be a twisting motion, the initial information adjustment unit 24 adjusts the initial information for optimizing the objective function by rotating, by a predetermined angle in the direction of the twisting motion, the posture information acquired before the integrated 3D point cloud was generated, that is, in a past frame.
  • Specifically, the posture information k frames before, illustrated in FIG. 14A, is rotated by a predetermined angle θcon about the straight line SL connecting the midpoint HC of the two hands and the waist point WP, in the rotation direction given by the sign of the average rotation angle Δθave. Typically k = 1, but k = 2 or the like is also possible.
  • The posture information adjusted as illustrated in FIG. 14B is set as the initial information.
  • the predetermined angle ⁇ con may be a constant angle, or may be the average rotation angle ⁇ ave calculated when determining whether or not the twisting motion is performed.
  • the predetermined angle ⁇ con is a constant angle, it may be 30°, for example.
  • the angle difference ⁇ fn (fn is the frame number) may be predicted as an ARMA (AutoRegressive Moving Average) model, and the average rotation angle ⁇ ave may be calculated using the predicted ⁇ fn. .
  • ARo is the AR order
  • MAo is the MA order
  • aj is the AR coefficient
  • bj is the MA coefficient
  • vfn is the white noise
  • the coefficients are based on frame data containing twisting motion, e.g. can be obtained by known methods such as
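  • A minimal sketch of the one-step ARMA prediction follows; the coefficients and recent data are hypothetical, standing in for values estimated offline by, for example, maximum likelihood.

```python
def predict_delta_theta(history, residuals, ar_coef, ma_coef):
    """One-step ARMA forecast of the next angle difference Δθfn.

    history:   past Δθ values, newest last (len >= len(ar_coef))
    residuals: past white-noise residuals v, newest last (len >= len(ma_coef))
    """
    ar_part = sum(a * history[-j] for j, a in enumerate(ar_coef, start=1))
    ma_part = sum(b * residuals[-j] for j, b in enumerate(ma_coef, start=1))
    return ar_part + ma_part  # the expected value of the new noise term is zero

# Hypothetical coefficients and recent data, for illustration only.
predicted = predict_delta_theta(history=[2.5, 3.1, 3.4],
                                residuals=[0.1, -0.2, 0.05],
                                ar_coef=[0.6, 0.2], ma_coef=[0.3])
print(predicted)  # predicted Δθ for the next frame
```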
  • The rotation axis of the initial information is not limited to the straight line SL illustrated in FIG. 14A. As illustrated in FIGS. 15A and 15B, a composite vector Vax of the vector Vwn connecting the waist point WP and the base of the neck NP and the vector Vwh connecting the waist point WP and the midpoint HC of the two hands may be used instead.
  • The optimization unit 26 performs fitting of the 3D model to the 3D point cloud, and determines the 3D skeleton coordinates by optimizing the objective function from the adjusted initial information to obtain the joint angles with the highest likelihood.
  • In general, posture information obtained in advance is used as the initial information.
  • For example, posture information acquired one frame before the target frame, or posture information obtained by linear prediction from a plurality of past frames, is used as the initial information.
  • Optimization yields different results depending on the initial information: good initial information leads to good optimization results, while poor initial information does not.
  • For example, when posture information in which the upper end of the geometric model M11 is in front of the upper end of the geometric model M12 is used as the initial information, the optimization result also places the upper end of M11 in front of M12, as illustrated in FIG. 16C.
  • Conversely, when posture information in which the upper end of M11 is behind the upper end of M12 is used as the initial information, the optimization result places the upper end of M11 behind M12, as illustrated in FIG. 16E.
  • In the former case, the optimization converges to the local minimum R01 of the objective function OF01; in the latter case, it converges to the local minimum R02.
  • The local minimum R02 is the smallest value of the objective function OF01 and is the appropriate optimization result.
  • In a twisting motion, point cloud defects may occur due to (1) noise, (2) occlusion, and (3) stitching failure, described below.
  • (1) As illustrated in FIG. 18A, the detection device S01 projects a laser onto the target person and the target object. When part of the laser spot hits the target and the remaining part hits a different object farther from the detection device S01, noise N01 is generated, and the target person and the target object may be recognized as farther from the detection device S01 than they actually are.
  • (2) Occlusion is a hidden portion N02 that occurs when the line of sight of the detection device S02 is blocked by an object in front of it, as illustrated in FIG. 18B. In a twisting motion, occlusion occurs when a first body part moves so as to block the line of sight to a second body part, and the point cloud may be partially missing as a result.
  • (3) Stitching failure is the generation of gaps due to improper stitching, that is, improper integration of the 3D point clouds acquired by the plurality of detection devices. FIG. 18C exemplifies a gap N03 that occurs when the 3D point clouds acquired by the two detection devices S03 and S04 are not properly stitched together.
  • When posture information that has not been adjusted is used as the initial information, the optimization result converges to the local minimum R11 of the objective function OF02.
  • When the adjusted initial information is used, the optimization result converges to the local minimum R12 of the objective function OF02.
  • The local minimum R12 is the smallest value of the objective function OF02 and is the appropriate optimization result.
  • Therefore, in the present embodiment, posture information of the target person rotated by a predetermined angle in the direction of the twisting motion is used as the initial information for optimizing the objective function.
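  • The dependence on the starting point can be reproduced with a toy one-dimensional objective, purely for illustration and not taken from the disclosure; the same optimizer reaches different local minima from different initial values.

```python
from scipy.optimize import minimize

def f(x):
    """1-D objective with a shallow local minimum and a deeper global one."""
    return float(0.1 * x[0]**4 - x[0]**2 + 0.4 * x[0])

print(minimize(f, x0=[2.0]).x)    # poor start: converges near x ≈ 2.1 (local)
print(minimize(f, x0=[-2.0]).x)   # good start: converges near x ≈ -2.3 (global)
```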
  • FIG. 20 illustrates the hardware configuration of the gymnastics scoring support device 1.
  • The gymnastics scoring support device 1 includes, for example, a CPU (Central Processing Unit) 52, a RAM (Random Access Memory) 54, an SSD (Solid State Drive) 56, and an external interface 58.
  • The CPU 52 is an example of a hardware processor.
  • The CPU 52, the RAM 54, the SSD 56, and the external interface 58 are interconnected via a bus 72.
  • The CPU 52 may be a single processor or multiple processors, and a GPU (Graphics Processing Unit), for example, may be used instead of the CPU 52.
  • The RAM 54 is a volatile memory and an example of a primary storage device.
  • The primary storage device may include a ROM (Read Only Memory) in which programs are stored in advance.
  • The SSD 56 is a non-volatile memory and an example of a secondary storage device.
  • The secondary storage device may be an HDD (Hard Disk Drive) or the like, in addition to or instead of the SSD.
  • The secondary storage device includes a program storage area and a data storage area.
  • The program storage area stores programs such as the gymnastics scoring support program.
  • The data storage area may store, for example, 3D point cloud data, the technique dictionary, gymnastics scoring results, and the like.
  • The CPU 52 loads a program such as the gymnastics scoring support program from the program storage area via the RAM 54 and executes it, thereby operating as the point cloud generation unit 12, the skeleton recognition unit 14, the technique recognition unit 16, and the scoring support unit 18.
  • The gymnastics scoring support program includes a skeleton recognition program as a part, and the skeleton recognition program includes a twisting motion recognition program as a part.
  • The CPU 52 also operates as the twisting motion determination unit 22, the initial information adjustment unit 24, and the optimization unit 26 included in the twisting motion recognition unit 20.
  • A program such as the gymnastics scoring support program may be stored in an external server and loaded into the CPU 52 via a network, or may be recorded on a non-transitory computer-readable recording medium such as a DVD (Digital Versatile Disc) and loaded into the CPU 52 via a recording medium reader.
  • FIG. 20 shows an example in which a three-dimensional laser sensor 62, as an example of the detection device 32, and a display 64, as an example of the display device 36, are connected to the external interface 58.
  • A communication device, an external storage device, or the like may also be connected to the external interface 58.
  • The gymnastics scoring support device 1 may be a personal computer, a server, or the like, and may be on-premises or in the cloud.
  • FIG. 21 illustrates the flow of the gymnastics scoring support processing.
  • First, the CPU 52 detects the athlete and the horizontal bar with each of the plurality of three-dimensional laser sensors 62.
  • The CPU 52 then generates a three-dimensional point cloud from the depth image acquired by each of the plurality of three-dimensional laser sensors 62 and integrates the generated point clouds into an integrated three-dimensional point cloud.
  • In step 106, the CPU 52 extracts the three-dimensional coordinates of each joint constituting the human body from the integrated three-dimensional point cloud by fitting the three-dimensional models of the athlete and the bar to the integrated three-dimensional point cloud, and determines the three-dimensional skeleton coordinates.
  • In step 108, the CPU 52 recognizes basic techniques from the time-series data of the three-dimensional skeleton coordinates obtained in step 106 and matches the time series against the technique dictionary 34 to recognize the techniques to be scored.
  • The CPU 52 then performs scoring using the technique recognition results obtained in step 108 and, in step 112, displays the multi-angle view, the technique recognition view, and the like on the display 64 to support scoring by the referees.
  • FIG. 22 illustrates the flow of the twisting motion recognition processing, which is part of the skeleton recognition processing in step 106.
  • In steps 122 to 126, the CPU 52 determines whether the athlete's motion is a twisting motion. Specifically, in step 122, it determines whether the first condition for a twisting motion candidate (condition 1) is satisfied: that the distance between the athlete's hands one frame before is shorter than two frames before and is shorter than a predetermined length.
  • If the determination in step 122 is negative, the CPU 52 determines in step 124 whether the second condition for a twisting motion candidate (condition 2) is satisfied: that the athlete's arms are crossed. If the determination in step 122 or step 124 is affirmative, that is, if the first or the second condition is satisfied, the CPU 52 determines in step 126 whether the average rotation angle of the predetermined part of the athlete exceeds the predetermined threshold.
  • If the determination in step 126 is affirmative, that is, if the athlete's motion is determined to be a twisting motion, the CPU 52, in step 128, rotates the posture information of the preceding frame by the predetermined angle in the direction of the twisting motion and sets the result as the initial information for optimizing the objective function.
  • If the determination in step 124 or step 126 is negative, that is, if the athlete's motion is determined not to be a twisting motion, the CPU 52, in step 130, sets the posture information of the preceding frame as the initial information without rotating it. In step 132, the CPU 52 determines the athlete's three-dimensional skeleton coordinates by optimizing the objective function using the initial information set in step 128 or step 130. The processing of steps 122 to 132 is applied to each frame acquired by the three-dimensional laser sensors 62.
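  • The per-frame control flow of steps 122 to 132 can be sketched as follows; the predicates and actions are passed in as callables because the disclosure specifies them only at the level described above.

```python
def recognize_frame(frame, prev_pose, cond1, cond2, rotation_ok,
                    rotate_pose, optimize):
    """Per-frame flow of steps 122-132; cond1/cond2/rotation_ok are the
    condition checks and rotate_pose/optimize the actions described above."""
    if (cond1(frame) or cond2(frame)) and rotation_ok(frame):   # steps 122-126
        init = rotate_pose(prev_pose, frame)   # step 128: rotated initial info
    else:
        init = prev_pose                       # step 130: unrotated initial info
    return optimize(frame, init)               # step 132: optimize the objective
```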
  • In step 122, it is determined whether the distance between the athlete's hands one frame before is shorter than two frames before and shorter than a predetermined length, but the determination is not limited to this. It suffices that the distance between the athlete's hands a first number of frames before is shorter than that a second number of frames before, where the second point in time precedes the first.
  • The interval between the first and second points in time may be two frames or more.
  • The flowcharts of FIGS. 21 and 22 are exemplary; the order of the steps may be changed, any step may be replaced with another, and steps may be added or deleted.
  • The present embodiment is not limited to a scoring support device for the gymnastics horizontal bar event and may be applied to scoring support and training support for various sports. It may also be applied to the creation of entertainment content such as movies, to skill analysis of handicrafts, to training support, and the like.
  • As described above, 3D point cloud information of a person and an object that the person contacts is acquired, and the acquired 3D point cloud information is input to a machine learning model to generate skeleton information of the person.
  • When it is determined that the person is performing a twisting motion, 3D point cloud information is generated in which the posture information of the person in the acquired 3D point cloud information is rotated by a predetermined angle in the direction of the twisting motion, and the generated skeleton information of the person is corrected based on the rotated 3D point cloud information and on 3D models representing the person and the object.
  • This embodiment thus makes it possible to improve the accuracy of skeleton recognition based on a 3D sensor point cloud during a twisting motion.


Abstract

The present invention acquires the three-dimensional point cloud information of a person and an object with which the person comes into contact and inputs the acquired three-dimensional point cloud information to a machine learning model to thereby generate skeleton information of the person. When it is assessed that the person is performing a twist motion, the present invention generates three-dimensional point cloud information in which orientation information of the person in the acquired three-dimensional point cloud information is rotated by a prescribed angle in the direction of the twist motion. The present invention corrects the generated skeleton information of the person on the basis of the generated three-dimensional point cloud information that was rotated by the prescribed angle and three-dimensional models that represent the person and the object.

Description

骨格認識方法、骨格認識プログラム、及び体操競技採点支援装置Skeleton Recognition Method, Skeleton Recognition Program, and Gymnastics Scoring Support Device
 本開示は、骨格認識方法、骨格認識プログラム、及び体操競技採点支援装置に関する。 The present disclosure relates to a skeleton recognition method, a skeleton recognition program, and a gymnastics scoring support device.
 骨格認識技術は、3次元センサから取得される人体の表面上の複数の点である点群情報から、人体の関節位置を特定する技術である。点群に幾何モデルである人体モデルをフィッティングし、人体モデルの関節位置を決定する。フィッティングとは、点群と人体モデルとの適合度合いを表す目的関数を最適化することであり、最適化は点群と人体モデルとの距離を最小化することで実現される。 Skeleton recognition technology is a technology that identifies the joint positions of the human body from point cloud information, which is multiple points on the surface of the human body acquired from a 3D sensor. A human body model, which is a geometric model, is fitted to the point cloud to determine the joint positions of the human body model. Fitting means optimizing an objective function representing the degree of matching between the point cloud and the human body model, and optimization is achieved by minimizing the distance between the point cloud and the human body model.
 本開示は、1つの側面として、ひねり動作における、3次元センサ点群に基づいた骨格認識の精度向上を目的とする。 As one aspect, the present disclosure aims to improve the accuracy of skeletal recognition based on the 3D sensor point cloud in a twisting motion.
 1つの実施形態では、人物及び人物が接触する対象物の3次元点群情報を取得し、取得した3次元点群情報を機械学習モデルに入力することで、人物の骨格情報を生成する。人物がひねり動作を行っていることが判定された場合に、取得した3次元点群情報を、人物の姿勢情報をひねり動作の方向に所定角度回転した3次元点群情報を生成する。生成された所定角度回転した3次元点群情報と、人物及び対象物を表す3次元モデルとに基づいて、生成された人物の骨格情報を修正する。 In one embodiment, 3D point cloud information of a person and an object that the person comes into contact with is acquired, and the acquired 3D point cloud information is input to a machine learning model to generate skeleton information of the person. When it is determined that the person is performing a twisting motion, the acquired 3D point group information is generated by rotating the posture information of the person by a predetermined angle in the direction of the twisting motion. Based on the generated 3D point group information rotated by a predetermined angle and the 3D model representing the person and the object, the generated skeletal information of the person is corrected.
 本開示は、1つの側面として、ひねり動作における、3次元センサ点群に基づいた骨格認識の精度向上を可能とする。 As one aspect of the present disclosure, it is possible to improve the accuracy of skeletal recognition based on the 3D sensor point cloud in a twisting motion.
本実施形態の体操競技採点支援装置の例示的な機能構成図である。1 is an exemplary functional configuration diagram of a gymnastics scoring support device of this embodiment; FIG. 対象者の統合3次元点群を表す例示的な概念図である。1 is an exemplary conceptual diagram representing an integrated 3D point cloud of a subject; FIG. 対象者及び対象物の多視点の深度画像の取得を説明する例示的な概念図である。FIG. 2 is an exemplary conceptual diagram illustrating acquisition of multi-view depth images of a subject and an object; 統合3次元点群に対する人体モデルの当てはめを説明する例示的な概念図である。FIG. 4 is an exemplary conceptual diagram illustrating fitting of a human body model to an integrated 3D point cloud; 対象者及び対象物の3次元モデルを説明する例示的な概念図である。1 is an exemplary conceptual diagram illustrating a three-dimensional model of a subject and an object; FIG. マルチアングルビューを表す例示的な概念図である。FIG. 4 is an exemplary conceptual diagram representing a multi-angle view; 技認識ビューを表す例示的な概念図である。FIG. 4 is an exemplary conceptual diagram representing a technique recognition view; 採点支援情報の参照を表す例示的な概念図である。FIG. 11 is an exemplary conceptual diagram representing referencing of scoring support information; 本実施形態のひねり動作認識部の例示的な機能構成図である。FIG. 4 is an exemplary functional configuration diagram of a twisting motion recognition unit according to the embodiment; ひねり動作を説明する例示的な概念図である。FIG. 10 is an exemplary conceptual diagram illustrating twisting motion; ひねり動作を説明する例示的な概念図である。FIG. 10 is an exemplary conceptual diagram illustrating twisting motion; ひねり動作を説明する例示的な概念図である。FIG. 10 is an exemplary conceptual diagram illustrating twisting motion; ひねり動作を説明する例示的な概念図である。FIG. 10 is an exemplary conceptual diagram illustrating twisting motion; ひねり動作の判定を説明する例示的な概念図である。FIG. 5 is an exemplary conceptual diagram for explaining determination of a twisting motion; ひねり動作の判定を説明する例示的な概念図である。FIG. 5 is an exemplary conceptual diagram for explaining determination of a twisting motion; ひねり動作の判定を説明する例示的な概念図である。FIG. 5 is an exemplary conceptual diagram for explaining determination of a twisting motion; ひねり動作の判定を説明する例示的な概念図である。FIG. 5 is an exemplary conceptual diagram for explaining determination of a twisting motion; ひねり動作の判定を説明する例示的な概念図である。FIG. 5 is an exemplary conceptual diagram for explaining determination of a twisting motion; ひねり動作の判定を説明する例示的な概念図である。FIG. 5 is an exemplary conceptual diagram for explaining determination of a twisting motion; ひねり動作の判定を説明する例示的な概念図である。FIG. 5 is an exemplary conceptual diagram for explaining determination of a twisting motion; ひねり動作の判定を説明する例示的な概念図である。FIG. 5 is an exemplary conceptual diagram for explaining determination of a twisting motion; ひねり動作の判定を説明する例示的な概念図である。FIG. 5 is an exemplary conceptual diagram for explaining determination of a twisting motion; ひねり動作の判定を説明する例示的な概念図である。FIG. 5 is an exemplary conceptual diagram for explaining determination of a twisting motion; 初期情報の調整を説明する例示的な概念図である。FIG. 4 is an exemplary conceptual diagram illustrating adjustment of initial information; 初期情報の調整を説明する例示的な概念図である。FIG. 4 is an exemplary conceptual diagram illustrating adjustment of initial information; 初期情報の調整を説明する例示的な概念図である。FIG. 4 is an exemplary conceptual diagram illustrating adjustment of initial information; 初期情報の調整を説明する例示的な概念図である。FIG. 4 is an exemplary conceptual diagram illustrating adjustment of initial information; 初期情報の違いによる最適化結果の違いを説明する例示的な概念図である。FIG. 11 is an exemplary conceptual diagram explaining a difference in optimization result due to a difference in initial information; 初期情報の違いによる最適化結果の違いを説明する例示的な概念図である。FIG. 
11 is an exemplary conceptual diagram explaining a difference in optimization result due to a difference in initial information; 初期情報の違いによる最適化結果の違いを説明する例示的な概念図である。FIG. 11 is an exemplary conceptual diagram explaining a difference in optimization result due to a difference in initial information; 初期情報の違いによる最適化結果の違いを説明する例示的な概念図である。FIG. 11 is an exemplary conceptual diagram explaining a difference in optimization result due to a difference in initial information; 初期情報の違いによる最適化結果の違いを説明する例示的な概念図である。FIG. 11 is an exemplary conceptual diagram explaining a difference in optimization result due to a difference in initial information; 初期情報の違いによる最適化結果の違いを説明する例示的な概念図である。FIG. 11 is an exemplary conceptual diagram explaining a difference in optimization result due to a difference in initial information; ひねり動作の統合3次元点群に生じるノイズを説明する例示的な概念図である。FIG. 4 is an exemplary conceptual diagram illustrating noise that occurs in a twisting motion integrated 3D point cloud; ひねり動作の統合3次元点群に生じるオクルージョンを説明する例示的な概念図である。FIG. 4 is an exemplary conceptual diagram illustrating occlusion that occurs in a twisting motion integrated 3D point cloud; ひねり動作の統合3次元点群に生じる貼り合わせ不良を説明する例示的な概念図である。FIG. 11 is an exemplary conceptual diagram explaining a stitching defect that occurs in an integrated three-dimensional point cloud of twisting motion; 初期情報の違いによる最適化結果の違いを説明する例示的な概念図である。FIG. 11 is an exemplary conceptual diagram explaining a difference in optimization result due to a difference in initial information; 本実施形態の体操競技採点支援装置の例示的なハードウェア構成図である。1 is an exemplary hardware configuration diagram of a gymnastics scoring support device according to this embodiment; FIG. 本実施形態の体操競技採点支援処理の例示的なフローチャートである。7 is an exemplary flowchart of gymnastics scoring support processing according to the present embodiment. 本実施形態のひねり動作認識処理の例示的なフローチャートである。7 is an exemplary flowchart of twisting motion recognition processing according to the present embodiment;
〔機能構成〕 [Functional configuration]
 図1に体操競技採点支援装置1の機能構成を例示する。体操競技採点支援装置1は、点群生成部12、骨格認識部14、技認識部16、及び採点支援部18を含む。 Fig. 1 illustrates the functional configuration of the gymnastics scoring support device 1. The gymnastics scoring support device 1 includes a point group generation unit 12 , a skeleton recognition unit 14 , a technique recognition unit 16 and a scoring support unit 18 .
 点群生成部12は、点群取得部の一例であり、複数の検知装置32を使用して、検知装置32から対象者及び対象物までの距離を計測し、深度画像を生成する。検知装置32は、例えば、3次元レーザセンサであってよい。3次元レーザセンサは、例えば、LiDAR(Light Detection and Ranging)技術を採用したMEMS(Micro Electro Mechanical Systems)ミラー型レーザセンサであってよい。対象者は、例えば、体操競技者(以下、競技者)などの人物であってよく、対象物は体操器具であってよい。本実施形態では、体操器具は鉄棒である。 The point cloud generation unit 12 is an example of a point cloud acquisition unit, and uses a plurality of detection devices 32 to measure the distances from the detection devices 32 to the target person and objects, and generate a depth image. Sensing device 32 may be, for example, a three-dimensional laser sensor. The three-dimensional laser sensor may be, for example, a MEMS (Micro Electro Mechanical Systems) mirror-type laser sensor that employs LiDAR (Light Detection and Ranging) technology. The target person may be, for example, a person such as a gymnast (hereinafter, athlete), and the target object may be gymnastics equipment. In this embodiment, the gymnastics equipment is a horizontal bar.
 点群生成部12は、複数の検知装置32の各々の投光ユニットからレーザパルスが投射されてから、対象者及び対象物で反射された反射光が受光ユニットで受光されるまでの時間に基づいて、対象者及び対象物までの距離を計測し、深度画像を生成する。点群生成部12は、複数の検知装置32の各々で生成された深度画像から3次元点群を生成し、生成した3次元点群を統合して統合3次元点群を生成する。図2に対象者の統合3次元点群を例示する。 The point group generation unit 12 calculates the time from when the laser pulse is projected from each of the light projecting units of the plurality of detection devices 32 to when the reflected light reflected by the target person and the target object is received by the light receiving unit. to measure the distance to the subject and the object, and generate a depth image. The point cloud generating unit 12 generates a three-dimensional point cloud from the depth images generated by each of the plurality of detection devices 32, and integrates the generated three-dimensional point clouds to generate an integrated three-dimensional point cloud. FIG. 2 exemplifies the integrated 3D point cloud of the subject.
 対象者及び対象物の多視点の深度画像を取得するため、検知装置32は、図3に例示するように、複数使用されるが、図1では、表示を簡潔にするため、1つの検知装置32を示す。また、図3では、2つの検知装置32を例示しているが、3つ以上の検知装置が競技、観戦、または審判などを妨げないように適宜設置されてもよい。 In order to acquire multi-view depth images of the subject and object, multiple detectors 32 are used as illustrated in FIG. 3, but only one detector is shown in FIG. 32. In addition, although two detection devices 32 are illustrated in FIG. 3, three or more detection devices may be appropriately installed so as not to interfere with the competition, spectator or referee.
 骨格認識部14は、骨格生成部の一例であり、例えば、骨格認識とフィッティングとを組み合わせることで、点群生成部12で生成した統合3次元点群から人体を構成する各関節の3次元座標を抽出する。骨格認識では、例えば、学習済みの機械学習モデルを使用して3次元骨格座標を推定する。機械学習モデルは、例えば、CNN(Convolutional Neural Network)系Deep Learningネットワーク上に作成されてもよい。 The skeleton recognition unit 14 is an example of a skeleton generation unit. For example, by combining skeleton recognition and fitting, the three-dimensional coordinates of each joint constituting the human body are calculated from the integrated three-dimensional point cloud generated by the point cloud generation unit 12. to extract In skeleton recognition, for example, a trained machine learning model is used to estimate three-dimensional skeleton coordinates. The machine learning model may be created, for example, on a CNN (Convolutional Neural Network) deep learning network.
 フィッティングでは、対象フレームより以前の時点で取得されたフレームにおけるフィッティングの結果である姿勢情報などを初期情報として、点群生成部12で生成した統合3次元点群に対して、対象者及び対象物を表す3次元モデルを当てはめる。統合3次元点群の座標と3次元モデルの表面座標の一致度を表す尤度を表す目的関数を定義し、最も尤度が高い関節角度を最適化により求めることで、3次元骨格座標を決定する。図4の例示では、対象者の統合3次元点群に対して、対象者を表す3次元モデルである人体モデルを当てはめている。 In fitting, posture information, which is the result of fitting in a frame acquired at a time earlier than the target frame, is used as initial information, and the integrated 3D point cloud generated by the point cloud generation unit 12 is combined with the target person and the target object. A three-dimensional model representing is fitted. The 3D skeletal coordinates are determined by defining an objective function representing the likelihood of matching between the coordinates of the integrated 3D point cloud and the surface coordinates of the 3D model, and optimizing the joint angle with the highest likelihood. do. In the illustration of FIG. 4, a human body model, which is a three-dimensional model representing the subject, is applied to the integrated three-dimensional point cloud of the subject.
 図5に例示するように、人体モデルは、円柱、楕円柱などで構成されており、円柱の長さ及び半径、並びに楕円柱の長さ、長径、及び短径などは対象者の体型に合わせて予め最適化されている。例えば、鉄棒競技においては、鉄棒のバーも検知装置32で点群として観測されるため、対象者の3次元モデルに対象者が接触する対象物の3次元モデル、即ち、人体の3次元モデルにバーの3次元モデルを加えた3次元モデルを使用する。 As illustrated in FIG. 5, the human body model is composed of a cylinder, an elliptical cylinder, etc., and the length and radius of the cylinder, and the length, major axis, minor axis, etc. of the elliptical cylinder are adjusted according to the body shape of the subject. pre-optimized for For example, in the horizontal bar competition, the horizontal bar is also observed as a point cloud by the detection device 32, so that the three-dimensional model of the object that the target person comes into contact with, that is, the three-dimensional model of the human body. Use a 3D model plus a 3D model of the bar.
 対象者と対象物とが接触していない状態があるため、人体の3次元モデルとバーの3次元モデルとは相互に接続されていないモデルを使用する。接触している、とは、対象者と対象物とが接続している状態であり、例えば、対象者が対象物を把持している状態を含む。  Because there is a state where the subject and the object are not in contact, the 3D model of the human body and the 3D model of the bar are not connected to each other. Being in contact means a state in which the subject and the object are connected, and includes, for example, a state in which the subject is holding the object.
 技認識部16は、フィッティング結果である3次元骨格座標の時系列データから、基本運動の切れ目を認識し、分割された時系列データに対して基本運動及び特徴量を決定する。基本運動、基本運動の切れ目、特徴量などは、ルールベース及び機械学習によって決定される。技認識部16は、基本運動に関連した特徴量をパラメータとして基本技を認識し、連続する基本技を予め作成されたデータベースである技の辞書34と時系列照合して採点の対象となる技情報を認識する。 The technique recognition unit 16 recognizes breaks in the basic motion from the time-series data of the three-dimensional skeletal coordinates that are the fitting results, and determines the basic motion and the feature amount for the divided time-series data. Basic motions, breaks in basic motions, feature amounts, etc. are determined by rule base and machine learning. The technique recognition unit 16 recognizes the basic technique using the feature amount related to the basic technique as a parameter, and compares consecutive basic techniques against the technique dictionary 34, which is a database created in advance, in chronological order to determine the technique to be graded. Recognize information.
 採点支援部18は、骨格認識部14で取得された3次元骨格座標と技認識部16で認識された技情報とから、例えば、図6に例示するマルチアングルビュー、及び図7に例示する技認識ビューなどを生成し、表示装置36に表示する。マルチアングルビューでは、例えば、競技者の演技におけるフレーム毎の関節角度などを詳細に確認でき、技認識ビューでは、実施された技毎に技認識結果により取得される技の名称などを示す。採点支援部18は、3次元座標位置によって定まる関節の曲がり角度に基づいて定義された採点規則に基づいて、3次元骨格座標を使用して採点を行い、採点結果を表示装置36に表示する。 The scoring support unit 18 uses the three-dimensional skeleton coordinates acquired by the skeleton recognition unit 14 and the technique information recognized by the technique recognition unit 16 to obtain, for example, the multi-angle view illustrated in FIG. 6 and the technique illustrated in FIG. A recognition view or the like is generated and displayed on the display device 36 . In the multi-angle view, for example, the joint angles of each frame in the performance of the competitor can be confirmed in detail, and in the technique recognition view, the name of the technique obtained from the technique recognition result for each technique performed is displayed. The scoring support unit 18 performs scoring using the three-dimensional skeletal coordinates based on the scoring rule defined based on the bending angle of the joint determined by the three-dimensional coordinate position, and displays the scoring result on the display device 36.
 マルチアングルビューは、例えば、3次元骨格座標を、正面、側面、平面などの視点から表示してもよい。技認識ビューは、例えば、時系列での技認識結果、技のグループ番号、技の難易度、難易度価値点、全演技技の難度を示すスコアなどを表示してもよい。審判らは、図8に例示するように、表示装置36に表示されるマルチアングルビュー、技認識ビュー及び採点支援部18による採点結果などの採点支援情報を参照して、採点を行うことができる。 In the multi-angle view, for example, 3D skeleton coordinates may be displayed from front, side, plane, and other perspectives. The technique recognition view may display, for example, the technique recognition results in chronological order, the group number of the technique, the difficulty level of the technique, the difficulty level score, the score indicating the difficulty level of all performance techniques, and the like. The referees can score by referring to the scoring support information such as the multi-angle view displayed on the display device 36, the technique recognition view, and the scoring result by the scoring support unit 18, as illustrated in FIG. .
 図9に、骨格認識部14に含まれるひねり動作認識部20の機能構成を例示する。ひねり動作認識部20は、競技者がひねり動作を行っている場合の骨格認識を行う。ひねり動作認識部20は、ひねり動作判定部22、初期情報調整部24、及び最適化部26を含む。 FIG. 9 illustrates the functional configuration of the twisting motion recognition section 20 included in the skeleton recognition section 14. As shown in FIG. The twisting motion recognition unit 20 performs skeleton recognition when the player is performing a twisting motion. The twisting motion recognition unit 20 includes a twisting motion determination unit 22 , an initial information adjustment unit 24 and an optimization unit 26 .
 ひねり動作判定部22は、競技者がひねり動作を行っているか否か判定する。本実施形態においてひねり動作とは、例えば、鉄棒競技において、競技者の両手の手先が両肩よりバーに近接した状態で、向きを転換するために身体をひねる動作である。図10A~図10Dに、ひねり動作の一例を示す。図10A~図10Dでは、バーを把持する手を入れ替えながら身体を半回転以上ひねることで方向転換するひねり動作を例示している。 The twisting motion determination unit 22 determines whether or not the athlete is performing a twisting motion. In this embodiment, the twisting motion is, for example, a motion of twisting the player's body in order to change direction while the hands of both hands are closer to the bar than both shoulders in a horizontal bar competition. 10A-10D show an example of a twisting motion. FIGS. 10A to 10D illustrate a twisting motion in which the direction is changed by twisting the body more than half a turn while changing hands gripping the bar.
 ひねり動作判定部22は、例えば、競技者の両手の手先の位置に対応するバー上の2つの位置の間の距離が短縮され、当該距離が所定長さより短い場合、または、競技者の両腕が交差している場合に、競技者の動作がひねり動作候補であると判定する。また、ひねり動作候補である場合、競技者の所定部位の平均回転角度が所定角度より大きい場合に、競技者がひねり動作を行っていると判定する。 For example, if the distance between two positions on the bar corresponding to the positions of the fingers of both hands of the player is shortened and the distance is shorter than a predetermined length, or if the distance is shorter than a predetermined length, are crossed, the player's motion is determined to be a twisting motion candidate. In the case of a twist motion candidate, if the average rotation angle of the predetermined part of the player is greater than the predetermined angle, it is determined that the player is performing a twist motion.
 詳細には、図11A~図11Dに例示するように、nフレーム前(n=1,2)の両手の手先を結ぶベクトルをVhandn、ベクトルVhandnがバーに射影されたベクトルをベクトルXhandnとする。図11A~図11Dの例では、両手の手先がバーに接触しているため、ベクトルVhandnとベクトルXhandnとが等しい。 Specifically, as illustrated in FIGS. 11A to 11D, let Vhandn be a vector connecting the hands of both hands n frames before (n=1, 2), and let Xhandn be a vector obtained by projecting the vector Vhandn onto the bar. In the example of FIGS. 11A to 11D, the hands of both hands are in contact with the bar, so vector Vhandn and vector Xhandn are equal.
 ひねり動作判定部22は、式(1)が満たされている場合、競技者の両手の手先の位置に対応するバー上の2つの位置の間の距離が短縮されたと判定する。即ち、図11A及び図11Bの例では、両方の手先の間のベクトルのバーに沿った成分の大きさは、2フレーム前より1フレーム前の方が小さい。
  |Xhand1|-|Xhand2|<0  …(1)
The twisting motion determination unit 22 determines that the distance between two positions on the bar corresponding to the positions of the fingers of both hands of the player has been shortened when the formula (1) is satisfied. That is, in the example of FIGS. 11A and 11B, the magnitude of the component along the bar of the vector between both hands is smaller one frame earlier than two frames earlier.
|Xhand1|-|Xhand2|<0 (1)
 ひねり動作判定部22は、式(1)に加え、式(2)が満たされている場合、競技者の動作がひねり動作候補であると判定する。即ち、図11Cに例示するように、1フレーム前の両方の手先の間のベクトルのバーに沿った成分の大きさが所定長さLthより短い場合、ひねり動作候補であると判定する。所定長さLthは、例えば、30cmであってよい。
  |Xhand1|<Lth  …(2)
The twisting motion determination unit 22 determines that the motion of the athlete is a twisting motion candidate when the formula (2) is satisfied in addition to the formula (1). That is, as illustrated in FIG. 11C, when the magnitude of the component along the bar of the vector between both hands one frame before is shorter than the predetermined length Lth, it is determined to be a twist motion candidate. The predetermined length Lth may be, for example, 30 cm.
|Xhand1|<Lth (2)
 ひねり動作判定部22は、競技者の両腕が交差しているか否か判定する。図11A~図11Dに例示するように、nフレーム前(n=2,1)の両肩の間を結ぶベクトルをVshon、ベクトルVshonのバーに沿った成分をXshonとする。図11Dに例示するように、ベクトルXsho1とベクトルVsho1との向きが反対である場合、両腕が交差しており、ひねり動作候補であると判定する。 The twisting motion determination unit 22 determines whether or not the athlete's arms are crossed. As illustrated in FIGS. 11A to 11D, let Vshon be a vector connecting both shoulders n frames before (n=2, 1), and Xshon be a component of the vector Vshon along the bar. As illustrated in FIG. 11D, when the directions of the vector Xsho1 and the vector Vsho1 are opposite to each other, it is determined that both arms are crossed and that the motion is a twist motion candidate.
 ひねり動作判定部22は、競技者の動作がひねり動作候補であると判定された場合、身体の所定部位の平均回転角度が所定閾値θthより大きい場合に、当該ひねり動作候補がひねり動作であると判定する。所定部位は、例えば、胴部、胸部などであってよい。詳細には、mフレームの隣接するフレームとの角度差分の合計をmで除算することで取得される平均回転角度Δθaveが所定閾値θthより大きい場合にひねり動作であると判定する。所定閾値θthは、例えば、3°であってよい。また、平均回転角度Δθaveの符号に基づいて、回転方向、即ち、ひねり動作の方向を判定する。 When the motion of the athlete is determined to be a twisting motion candidate, the twisting motion determination unit 22 determines that the twisting motion candidate is a twisting motion when the average rotation angle of a predetermined part of the body is greater than a predetermined threshold value θth. judge. The predetermined part may be, for example, the torso, chest, or the like. Specifically, when the average rotation angle Δθave obtained by dividing the sum of angle differences between m frames and adjacent frames by m is greater than a predetermined threshold θth, it is determined that a twisting motion has occurred. The predetermined threshold θth may be 3°, for example. Also, the direction of rotation, that is, the direction of the twisting motion is determined based on the sign of the average rotation angle Δθave.
 図12A~図13Cにm=2である場合のひねり動作判定を例示する。図12Aに、3フレーム前の所定部位、図12Bに2フレーム前の所定部位、図12Cに1フレーム前の所定部位を例示する。所定部位は、競技者の動作にしたがって自軸SAを中心に回転しており、自軸SAの位置及び傾きも、競技者の動作にしたがって変化する。図13Aに図12Aの所定部位の底面、図13Bに図12Bの所定部位の底面、図13Cに図12Cの所定部位の底面を例示する。 Figs. 12A to 13C illustrate twisting motion determination when m = 2. Figs. FIG. 12A shows an example of a predetermined site three frames before, FIG. 12B shows a predetermined site two frames before, and FIG. 12C a predetermined site one frame before. The predetermined part rotates around its own axis SA according to the movements of the player, and the position and inclination of the own axis SA also change according to the movements of the player. 13A illustrates the bottom surface of the predetermined portion of FIG. 12A, FIG. 13B illustrates the bottom surface of the predetermined portion of FIG. 12B, and FIG. 13C illustrates the bottom surface of the predetermined portion of FIG. 12C.
 図13Aは、3フレーム前の所定部位の基準線CLからの回転角度θ3を表し、図13Bは、2フレーム前の所定部位の基準線CLからの回転角度θ2及び回転角度θ3と回転角度θ2との角度差分Δθ2を表す。図13Cは、1フレーム前の所定部位の基準線CLからの回転角度θ1及び回転角度θ2と回転角度θ1との角度差分Δθ1を表す。平均回転角度Δθaveは(Δθ2+Δθ1)/2で算出することができる。 FIG. 13A shows the rotation angle θ3 from the reference line CL of the predetermined part three frames before, and FIG. represents the angular difference Δθ2 of . FIG. 13C shows the rotation angle θ1 from the reference line CL of the predetermined portion one frame before and the angle difference Δθ1 between the rotation angle θ2 and the rotation angle θ1. The average rotation angle Δθave can be calculated by (Δθ2+Δθ1)/2.
n = 1, 2 is an example; n = 1, 3, n = 2, 4, and so on may also be used. Likewise, m = 2 is an example, and m = 3, m = 4, and so on may be used. Whether the fingertips of the athlete's hands are closer to the bar than the athlete's shoulders can be determined by whether the start and end points of Vhandn are closer to the bar than the start and end points of Vshon. The twisting motion determination unit 22 may also determine a twisting motion without determining the presence of a twisting motion candidate, based on, for example, variations in the position and inclination of the own axis of the predetermined body part and in its rotation angle.
When it is determined that the athlete's motion is a twisting motion, the initial information adjustment unit 24 adjusts the initial information for optimizing the objective function by rotating posture information acquired before the integrated 3D point cloud was generated, that is, in a past frame, by a predetermined angle in the direction of the twisting motion. Specifically, the posture information from k frames before, illustrated in FIG. 14A, is rotated by the predetermined angle θcon about the straight line SL connecting the midpoint HC of the fingertips of both hands and the waist point WP, in the rotation direction determined by the sign of the average rotation angle Δθave. The posture information adjusted as illustrated in FIG. 14B is set as the initial information. k may be 1, but may also be 2 or another value.
The predetermined angle θcon may be a constant angle, or may be the average rotation angle Δθave calculated when determining whether the motion is a twisting motion. When θcon is a constant angle, it may be, for example, 30°. Alternatively, as exemplified by formula (3), the angle difference Δθfn (fn is the frame number) may be predicted with an ARMA (AutoRegressive Moving Average) model, and the predicted Δθfn used to calculate the average rotation angle Δθave.

Δθfn = Σ(j=1…ARo) aj·Δθfn−j + vfn + Σ(j=1…MAo) bj·vfn−j (3)
Here, ARo is the AR order, MAo is the MA order, aj are the AR coefficients, bj are the MA coefficients, and vfn is white noise. The coefficients can be obtained from frame data containing twisting motions by known methods such as maximum likelihood estimation.
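A minimal sketch of such a prediction follows, assuming the recent angle differences are collected in `delta_theta` and using the ARIMA implementation of the `statsmodels` package with a zero differencing order as one off-the-shelf ARMA estimator; the orders and variable names are illustrative, not taken from the disclosure.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def predict_delta_theta(delta_theta, ar_order=2, ma_order=1):
    # Fit an ARMA(ARo, MAo) model to past angle differences; the
    # coefficients are estimated by maximum likelihood, and the next
    # Δθfn is predicted one step ahead
    model = ARIMA(np.asarray(delta_theta, dtype=float),
                  order=(ar_order, 0, ma_order))
    fitted = model.fit()
    return float(fitted.forecast(steps=1)[0])
```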
The rotation axis for the initial information is not limited to the straight line SL illustrated in FIG. 14A. As illustrated in FIGS. 15A and 15B, the composite vector Vax of the vector Vwn connecting the waist point WP and the base of the neck NP and the vector Vwh connecting the waist point WP and the midpoint HC of the fingertips of both hands may also be used.
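The rotation of the past-frame posture about such an axis can be sketched with Rodrigues' rotation formula; a minimal sketch, assuming joint positions are the rows of a NumPy array and the axis is given by two points, e.g., the waist point WP and the hand midpoint HC (or a point along the composite vector Vax).

```python
import numpy as np

def rotate_pose(joints, axis_p0, axis_p1, angle_deg):
    # Rotate all joint positions by angle_deg about the axis passing
    # through axis_p0 and axis_p1 (Rodrigues' rotation formula)
    k = axis_p1 - axis_p0
    k = k / np.linalg.norm(k)                     # unit axis vector
    theta = np.radians(angle_deg)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])            # cross-product matrix
    R = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
    return (joints - axis_p0) @ R.T + axis_p0     # rotate about the axis point
```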
The optimization unit 26 performs fitting of the 3D point cloud to the 3D model and determines the 3D skeletal coordinates by optimizing the objective function using the adjusted initial information to obtain the joint angles with the highest likelihood.
In fitting, when optimizing the match between the integrated 3D point cloud and the human body model, posture information acquired in advance is generally used as the initial information. For example, posture information acquired from the frame one frame before the target frame, or posture information obtained by linear prediction over multiple past frames, is used. However, the result of the optimization depends on the initial information used: if the initial information is appropriate, an appropriate optimization result can be obtained, but if it is not, an appropriate result cannot be obtained.
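A minimal sketch of the linear-prediction variant mentioned above, assuming poses are NumPy arrays of joint coordinates and using a simple two-frame extrapolation; this is one possible realization, not the disclosed implementation.

```python
import numpy as np

def linear_predict_pose(pose_prev2, pose_prev1):
    # Extrapolate the next pose from the two most recent frames:
    # x(t) ≈ x(t-1) + (x(t-1) - x(t-2))
    return pose_prev1 + (pose_prev1 - pose_prev2)
```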
For example, for the integrated 3D point cloud illustrated in FIG. 16A, suppose that posture information in which the upper end of geometric model M11 is in front of the upper end of geometric model M12, as illustrated in FIG. 16B, is used as the initial information. In this case, in the optimization result the upper end of geometric model M11 remains in front of geometric model M12, as illustrated in FIG. 16C.
On the other hand, for the same integrated 3D point cloud in FIG. 16A, suppose that posture information in which the upper end of geometric model M11 is behind the upper end of geometric model M12, as illustrated in FIG. 16D, is used as the initial information. In this case, in the optimization result the upper end of geometric model M11 is behind geometric model M12, as illustrated in FIG. 16E.
Also, as illustrated in FIG. 17, when the initial information is set to II01, the optimization converges to the local minimum R01 of the objective function OF01, whereas when the initial information is set to II02, it converges to the local minimum R02. Here, the local minimum R02 is the global minimum of the objective function OF01 and is the appropriate optimization result.
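The dependence on the initial information can be reproduced with any local optimizer; the following minimal sketch uses `scipy.optimize.minimize` on a toy one-dimensional objective with two local minima, standing in for the point-cloud fitting objective.

```python
import numpy as np
from scipy.optimize import minimize

def objective(x):
    # Toy stand-in for the fitting objective: two local minima
    return 0.1 * x[0] ** 4 - x[0] ** 2 + 0.3 * x[0]

res_a = minimize(objective, x0=np.array([2.0]))   # initial information in one basin
res_b = minimize(objective, x0=np.array([-2.0]))  # initial information in the other

# The two runs converge to different local minima; only one of them is
# the global minimum, which is why the initial information matters.
print(res_a.x, res_a.fun)
print(res_b.x, res_b.fun)
```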
When the athlete is performing a twisting motion, point cloud defects may occur for reasons (1) to (3) below.
(1) When the sampling rate of the 3D sensor is low and the twisting motion is fast, it is difficult for the sampling to keep up with the speed of the motion.
(2) Noise may occur. As illustrated in FIG. 18A, the detection device S01 projects a laser onto the subject and the object. When only part of the spot, that is, the cross section of the laser, hits the subject or the object, the remaining part of the spot hits a different object farther from the detection device S01 than the subject or the object, generating noise N01. As a result, the subject and the object may be recognized as being farther from the detection device S01 than they actually are.
(3) Parts of the point cloud may be missing due to occlusion or stitching failure. Occlusion is a hidden portion N02 that occurs when the line of sight of the detection device S02 is blocked by an object in front of it, as illustrated in FIG. 18B. That is, when a twisting motion is performed, a first position of a body part moves so as to block the line of sight to a second position, causing occlusion. Stitching failure, as illustrated in FIG. 18C, is the generation of gaps when the 3D point clouds acquired by multiple detection devices are not properly stitched together, that is, integrated. FIG. 18C illustrates a gap N03 caused by improper stitching of the 3D point clouds acquired by the two detection devices S03 and S04.
When point cloud defects occur, the shape of the objective function used for the optimization becomes complicated, and even if posture information acquired immediately before, for example one frame earlier, is used as the initial information, the likelihood of obtaining an appropriate optimization result decreases.
As illustrated in FIG. 19, for the objective function OF02 with a complicated shape, setting the initial information to II11 makes the optimization converge to the local minimum R11 of OF02, whereas setting it to II12 makes it converge to the local minimum R12. Here, the local minimum R12 is the global minimum of the objective function OF02 and is the appropriate optimization result.
In this embodiment, in order to set the initial information appropriately, when it is determined that the subject is performing a twisting motion, information obtained by rotating the subject's posture information by a predetermined angle in the direction of the twisting motion is used as the initial information for optimizing the objective function.
[Hardware Configuration]
FIG. 20 illustrates the hardware configuration of the gymnastics scoring support device 1. The gymnastics scoring support device 1 includes, as an example, a CPU (Central Processing Unit) 52, a RAM (Random Access Memory) 54, an SSD (Solid State Drive) 56, and an external interface 58.
The CPU 52 is an example of a processor, which is hardware. The CPU 52, the RAM 54, the SSD 56, and the external interface 58 are interconnected via a bus 72. The CPU 52 may be a single processor or multiple processors. Also, instead of the CPU 52, a GPU (Graphics Processing Unit), for example, may be used.
The RAM 54 is a volatile memory and an example of a primary storage device. The primary storage device may include a ROM (Read Only Memory) in which programs are stored in advance. The SSD 56 is a non-volatile memory and an example of a secondary storage device. The secondary storage device may be an HDD (Hard Disk Drive) or the like, in addition to or instead of the SSD.
The secondary storage device includes a program storage area and a data storage area. The program storage area stores programs such as a gymnastics scoring support program. The data storage area may store, for example, 3D point cloud data, a dictionary of techniques, and gymnastics scoring results.
The CPU 52 loads a program such as the gymnastics scoring support program from the program storage area into the RAM 54 and executes it, thereby operating as the point cloud generation unit 12, the skeleton recognition unit 14, the technique recognition unit 16, and the scoring support unit 18 of FIG. 1. The gymnastics scoring support program includes a skeleton recognition program as a part, and the skeleton recognition program includes a twisting motion recognition program as a part. The CPU 52 also operates as the twisting motion determination unit 22, the initial information adjustment unit 24, and the optimization unit 26 included in the twisting motion recognition unit 20.
A program such as the gymnastics scoring support program may be stored on an external server and loaded into the CPU 52 via a network. Alternatively, such a program may be recorded on a non-transitory computer-readable recording medium such as a DVD (Digital Versatile Disc) and loaded into the CPU 52 via a recording medium reader.
External devices are connected to the external interface 58, which handles the transmission and reception of various information between the external devices and the CPU 52. FIG. 20 shows an example in which a 3D laser sensor 62, an example of the detection device 32, and a display 64, an example of the display device 36, are connected to the external interface 58. A communication device, an external storage device, or the like may also be connected to the external interface. The gymnastics scoring support device 1 may be a personal computer, a server, or the like, and may be on-premises or in the cloud.
[Gymnastics Scoring Support Processing]
FIG. 21 illustrates the flow of the gymnastics scoring support processing. In step 102, the CPU 52 detects the athlete and the bar of the horizontal bar with each of the multiple 3D laser sensors 62. In step 104, the CPU 52 generates a 3D point cloud from the depth image acquired by each of the 3D laser sensors 62 and integrates the generated point clouds into an integrated 3D point cloud. In step 106, the CPU 52 extracts the 3D coordinates of each joint constituting the human body from the integrated 3D point cloud and fits the 3D models of the athlete and the bar to the integrated 3D point cloud.
An objective function representing the likelihood, that is, the degree of matching between the 3D point cloud coordinates and the surface coordinates of the athlete's 3D model, is defined, and the 3D skeletal coordinates are determined by finding the joint angles with the highest likelihood through optimization. In step 108, the CPU 52 recognizes basic techniques from the time-series data of the 3D skeletal coordinates acquired in step 106 and recognizes the techniques to be scored by time-series matching against the technique dictionary 34. In step 110, the CPU 52 performs scoring using the technique recognition results obtained in step 108, and in step 112, displays a multi-angle view, a technique recognition view, and the like on the display 64 to support scoring by the judges.
FIG. 22 illustrates the flow of the twisting motion recognition processing, which is part of the skeleton recognition processing in step 106. In steps 122 to 126, the CPU 52 determines whether the athlete's motion is a twisting motion. Specifically, in step 122, the CPU 52 determines whether the first condition (condition 1) for a twisting motion candidate is satisfied, namely, that the distance between the fingertips of the athlete's hands one frame before is shorter than two frames before and that the distance is shorter than a predetermined length.
If the determination in step 122 is negative, that is, if the first condition is not satisfied, the CPU 52 determines in step 124 whether the second condition (condition 2) for a twisting motion candidate is satisfied, namely, that the athlete's arms are crossed. If the determination in step 122 or step 124 is affirmative, that is, if the first or second condition is satisfied, the CPU 52 determines in step 126 whether the condition that the average rotation angle of the predetermined part of the athlete exceeds the predetermined threshold is satisfied.
If the determination in step 126 is affirmative, that is, if the athlete's motion is determined to be a twisting motion, the CPU 52, in step 128, rotates the posture information of one frame before by the predetermined angle in the direction of the twisting motion and sets it as the initial information for the objective function optimization.
On the other hand, if the determination in step 124 or step 126 is negative, that is, if the athlete's motion is determined not to be a twisting motion, the CPU 52, in step 130, sets the posture information of one frame before as the initial information for the objective function optimization without rotating it. In step 132, the CPU 52 determines the athlete's 3D skeletal coordinates by optimizing the objective function using the initial information set in step 128 or step 130. The processing of steps 122 to 132 is applied to each frame acquired by the 3D laser sensors 62.
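The selection of the initial information in steps 122 to 130 can be summarized as a short function; a minimal sketch reusing the `is_twist_candidate`, `is_twisting`, and `rotate_pose` sketches above, with all names and defaults being illustrative assumptions.

```python
def choose_initial_info(prev_pose, candidate, theta, axis_p0, axis_p1,
                        m=2, theta_con=30.0):
    # Steps 122-126: a twisting motion is confirmed only when a candidate
    # was detected and the average rotation angle exceeds the threshold
    twisting, direction = is_twisting(theta, m)
    if candidate and twisting:
        # Step 128: rotate the previous-frame pose in the twist direction
        return rotate_pose(prev_pose, axis_p0, axis_p1, direction * theta_con)
    # Step 130: use the previous-frame pose unchanged
    return prev_pose
```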
In step 122, it is determined whether the distance between the fingertips of the athlete's hands one frame before is shorter than two frames before and whether the distance is shorter than a predetermined length, but the present embodiment is not limited to this. That is, it suffices that the distance between the fingertips of the athlete's hands a first number of frames before is shorter than a second number of frames before, where the second time point precedes the first; the number of frames between the first and second time points may be two or more.
The flowcharts of FIGS. 21 and 22 are examples; the order of the steps may be changed, any step may be replaced with another, and steps may be added or deleted.
The present embodiment is not limited to a scoring support device for the horizontal bar in gymnastics and may be applied to scoring support and training support for various sports. It may also be applied to the creation of entertainment material such as movies, to skill analysis of handicrafts and the like, and to training support.
In this embodiment, 3D point cloud information of a person and an object that the person contacts is acquired, and the skeleton information of the person is generated by inputting the acquired 3D point cloud information into a machine learning model. When it is determined that the person is performing a twisting motion, 3D point cloud information is generated from the acquired 3D point cloud information by rotating the person's posture information by a predetermined angle in the direction of the twisting motion. The generated skeleton information of the person is then corrected based on the generated 3D point cloud information rotated by the predetermined angle and the 3D model representing the person and the object.
This enables the present embodiment to improve the accuracy of skeleton recognition based on the 3D sensor point cloud during twisting motions.
1 Gymnastics scoring support device
14 Skeleton recognition unit
20 Twisting motion recognition unit
22 Twisting motion determination unit
24 Initial information adjustment unit
26 Optimization unit
52 CPU
54 RAM
56 SSD

Claims (15)

1.  A skeleton recognition method in which a computer
     acquires 3D point cloud information of a person and an object that the person contacts, and
     generates skeleton information of the person by inputting the acquired 3D point cloud information into a machine learning model,
     wherein, when it is determined that the person is performing a twisting motion, 3D point cloud information is generated from the acquired 3D point cloud information by rotating the posture information of the person by a predetermined angle in the direction of the twisting motion, and
     the generated skeleton information of the person is corrected based on the generated 3D point cloud information rotated by the predetermined angle and a 3D model representing the person and the object.
2.  The skeleton recognition method according to claim 1, wherein
     the skeleton information of the person is recognized by optimizing, based on the 3D point cloud information and the 3D model representing the person and the object, an objective function representing the match between the coordinates of the 3D point cloud information and the surface coordinates of the 3D model, thereby acquiring the joint angles of the person, and
     when it is determined that the person is performing a twisting motion, information obtained by rotating the posture information of the person by a predetermined angle in the direction of the twisting motion is used as the initial information for optimizing the objective function.
3.  The skeleton recognition method according to claim 2, wherein
     the posture information of the person is acquired from a frame acquired at a time point before the target frame for which the skeleton information of the person is recognized.
4.  The skeleton recognition method according to claim 2 or 3, wherein
     the person is an athlete,
     the object is the bar of a horizontal bar, and
     the athlete is determined to be performing the twisting motion when, in a state in which the fingertips of both of the athlete's hands are closer to the bar than the athlete's shoulders, the distance between two positions on the bar corresponding to the positions of the fingertips of the athlete's hands is shortened and the distance is shorter than a predetermined length, or the athlete's arms are crossed, and, in addition, the average rotation angle of a predetermined part of the athlete is greater than a predetermined threshold.
5.  The skeleton recognition method according to claim 4, wherein the predetermined angle is the average rotation angle of the predetermined part of the athlete.
6.  The skeleton recognition method according to any one of claims 1 to 5, wherein
     the 3D point cloud information is acquired by a 3D laser sensor.
7.  The skeleton recognition method according to any one of claims 4 to 6, wherein
     scoring support information relating to a competition technique acquired based on the recognized skeleton information is displayed on a display device.
8.  A skeleton recognition program that causes a computer to execute skeleton recognition processing of
     acquiring 3D point cloud information of a person and an object that the person contacts, and
     generating skeleton information of the person by inputting the acquired 3D point cloud information into a machine learning model,
     wherein, when it is determined that the person is performing a twisting motion, 3D point cloud information is generated from the acquired 3D point cloud information by rotating the posture information of the person by a predetermined angle in the direction of the twisting motion, and
     the generated skeleton information of the person is corrected based on the generated 3D point cloud information rotated by the predetermined angle and a 3D model representing the person and the object.
9.  The skeleton recognition program according to claim 8, wherein
     the skeleton information of the person is recognized by optimizing, based on the 3D point cloud information and the 3D model representing the person and the object, an objective function representing the match between the coordinates of the 3D point cloud information and the surface coordinates of the 3D model, thereby acquiring the joint angles of the person, and
     when it is determined that the person is performing a twisting motion, information obtained by rotating the posture information of the person by a predetermined angle in the direction of the twisting motion is used as the initial information for optimizing the objective function.
10.  The skeleton recognition program according to claim 9, wherein
     the person is an athlete,
     the object is the bar of a horizontal bar, and
     the skeleton recognition processing determines that the athlete is performing the twisting motion when, in a state in which the fingertips of both of the athlete's hands are closer to the bar than the athlete's shoulders, the distance between two positions on the bar corresponding to the positions of the fingertips of the athlete's hands is shortened and the distance is shorter than a predetermined length, or the athlete's arms are crossed, and, in addition, the average rotation angle of a predetermined part of the athlete is greater than a predetermined threshold.
11.  The skeleton recognition program according to claim 10, wherein the predetermined angle is the average rotation angle of the predetermined part of the athlete.
12.  A gymnastics scoring support device including
     a point cloud acquisition unit that acquires 3D point cloud information of a person and an object that the person contacts,
     a skeleton generation unit that generates skeleton information of the person by inputting the acquired 3D point cloud information into a machine learning model, and
     a scoring support unit that displays, on a display device, scoring support information generated based on the generated skeleton information and technique information recognized based on the skeleton information,
     wherein the skeleton generation unit,
     when it is determined that the person is performing a twisting motion, generates 3D point cloud information from the acquired 3D point cloud information by rotating the posture information of the person by a predetermined angle in the direction of the twisting motion, and
     corrects the generated skeleton information of the person based on the generated 3D point cloud information rotated by the predetermined angle and a 3D model representing the person and the object.
13.  The gymnastics scoring support device according to claim 12, wherein
     the skeleton information of the person is recognized by optimizing, based on the 3D point cloud information and the 3D model representing the person and the object, an objective function representing the match between the coordinates of the 3D point cloud information and the surface coordinates of the 3D model, thereby acquiring the joint angles of the person, and
     when it is determined that the person is performing a twisting motion, information obtained by rotating the posture information of the person by a predetermined angle in the direction of the twisting motion is used as the initial information for optimizing the objective function.
14.  The gymnastics scoring support device according to claim 13, wherein
     the person is an athlete,
     the object is the bar of a horizontal bar, and
     the skeleton generation unit determines that the athlete is performing the twisting motion when, in a state in which the fingertips of both of the athlete's hands are closer to the bar than the athlete's shoulders, the distance between two positions on the bar corresponding to the positions of the fingertips of the athlete's hands is shortened and the distance is shorter than a predetermined length, or the athlete's arms are crossed, and, in addition, the average rotation angle of a predetermined part of the athlete is greater than a predetermined threshold.
15.  The gymnastics scoring support device according to claim 14, wherein the predetermined angle is the average rotation angle of the predetermined part of the athlete.