WO2022208859A1 - Technique recognition method, technique recognition device, and gymnastics scoring support system - Google Patents
- Publication number
- WO2022208859A1 (PCT/JP2021/014248)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- technique
- recognition
- technique recognition
- techniques
- skeleton
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63B—APPARATUS FOR PHYSICAL TRAINING, GYMNASTICS, SWIMMING, CLIMBING, OR FENCING; BALL GAMES; TRAINING EQUIPMENT
- A63B71/00—Games or sports accessories not covered in groups A63B1/00 - A63B69/00
- A63B71/06—Indicating or scoring devices for games or players, or for other sports activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63B—APPARATUS FOR PHYSICAL TRAINING, GYMNASTICS, SWIMMING, CLIMBING, OR FENCING; BALL GAMES; TRAINING EQUIPMENT
- A63B5/00—Apparatus for jumping
- A63B5/12—Bolster vaulting apparatus, e.g. horses, bucks, tables
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63B—APPARATUS FOR PHYSICAL TRAINING, GYMNASTICS, SWIMMING, CLIMBING, OR FENCING; BALL GAMES; TRAINING EQUIPMENT
- A63B71/00—Games or sports accessories not covered in groups A63B1/00 - A63B69/00
- A63B71/06—Indicating or scoring devices for games or players, or for other sports activities
- A63B71/0605—Decision makers and devices using detection means facilitating arbitration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30221—Sports video; Sports image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
- G06V2201/033—Recognition of patterns in medical or anatomical images of skeletal patterns
Definitions
- the present invention relates to technique recognition technology.
- the skeletal information of people such as athletes and patients is used to automatically recognize the movements of people.
- The current scoring method in gymnastics is based on visual observation by multiple referees. However, advances in equipment and improvements in training methods have led to more sophisticated techniques involving more complex movements, and there are times when it becomes difficult for the referees to recognize the techniques. As a result, there are concerns about maintaining the fairness and accuracy of scoring, for example because an athlete's scoring results differ from referee to referee.
- For this reason, automatic scoring technology using the athlete's 3D skeletal coordinates (hereinafter sometimes referred to as "skeletal information") is used.
- For example, a 3D (Three-Dimensional) laser sensor acquires 3D point cloud data of an athlete, and the 3D point cloud data is used to calculate skeletal information of the athlete. Then, from the time-series data of the skeletal information, feature values indicating the characteristics of the posture corresponding to a "technique" are calculated, and the technique performed by the athlete is automatically recognized based on the time-series data of the skeletal information and the feature values. Providing referees with the automatic scoring results is intended to improve the fairness and accuracy of scoring.
- the performance score is calculated as the sum of the D (Difficulty) score and the E (Execution) score.
- the D score is a score calculated based on whether or not a technique is established.
- The E score is a score calculated by a deduction method according to the degree of perfection of the technique. Whether or not a technique is established is determined by the referees' visual inspection based on the rule book that describes the scoring rules.
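The relationship between the D score, the E score, and the performance score can be sketched as follows. This is a minimal illustration only: the class `JudgedTechnique`, its field names, the sample values, and the start value of 10.0 from which deductions are subtracted are assumptions made for the example, not values taken from this publication.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JudgedTechnique:
    name: str             # hypothetical label for a recognized technique
    value_points: float   # credited to the D score when the technique is established
    deductions: float     # execution deductions charged against the E score


def d_score(techniques: List[JudgedTechnique]) -> float:
    """Sum the value points of the established techniques."""
    return sum(t.value_points for t in techniques)


def e_score(techniques: List[JudgedTechnique], start_value: float = 10.0) -> float:
    """Deduction method: subtract deductions from an assumed start value."""
    return max(0.0, start_value - sum(t.deductions for t in techniques))


def performance_score(techniques: List[JudgedTechnique],
                      start_value: float = 10.0) -> float:
    """Performance score = D score + E score."""
    return d_score(techniques) + e_score(techniques, start_value)


# Illustrative routine with made-up values.
routine = [
    JudgedTechnique("technique A", value_points=0.5, deductions=0.25),
    JudgedTechnique("technique B", value_points=0.25, deductions=0.5),
]
```

Because the D score depends on which techniques are established, a misrecognized technique changes the total directly, which is why recognition accuracy matters for scoring.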
- JP 2020-89539 A Japanese Patent Application Laid-Open No. 2020-38440
- The above-mentioned feature values range from those common to many events, such as hip and knee postures, to those unique to a specific event, such as the hand support position on the pommel horse.
- Such various feature amounts include those that are easy to obtain with high accuracy and those that are difficult to obtain with high accuracy.
- an object of the present invention is to provide a technique recognition method, a technique recognition device, and a gymnastics scoring support system that can improve the accuracy of technique recognition.
- In one aspect of the technique recognition method, a computer executes a process of: acquiring skeleton information obtained by skeleton detection; performing, based on the skeleton information, a first technique recognition that narrows the techniques included in gymnastics down to a subset of techniques; and performing a second technique recognition that recognizes which of the subset of techniques has been performed, according to a specialized algorithm specialized in recognizing the subset of techniques narrowed down by the first technique recognition.
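The two-stage flow summarized above can be sketched roughly as follows. All names here (`recognize_technique`, the toy recognizers, the `grip_feature` key, and the technique labels) are hypothetical placeholders; the sketch only shows how a narrowing step can hand over to a specialized algorithm chosen by the narrowed candidate set.

```python
from typing import Callable, Dict, FrozenSet, List

Frame = Dict[str, float]        # hypothetical per-frame values derived from skeleton information
Candidates = FrozenSet[str]     # set of technique names left after narrowing


def recognize_technique(
    frames: List[Frame],
    first_recognition: Callable[[List[Frame]], Candidates],
    specialized_algorithms: Dict[Candidates, Callable[[List[Frame], Candidates], str]],
) -> str:
    """Narrow down the candidates, then let a specialized algorithm decide."""
    candidates = first_recognition(frames)
    if len(candidates) == 1:
        # Already unique: no second recognition is needed.
        (only,) = candidates
        return only
    second_recognition = specialized_algorithms[candidates]
    return second_recognition(frames, candidates)


def toy_first_recognition(frames: List[Frame]) -> Candidates:
    """Toy narrowing step that always returns the same two candidates."""
    return frozenset({"technique A", "technique B"})


def toy_second_recognition(frames: List[Frame], candidates: Candidates) -> str:
    """Toy specialized step: decide from the mean of an assumed grip feature."""
    mean_grip = sum(f["grip_feature"] for f in frames) / len(frames)
    return "technique A" if mean_grip >= 0.5 else "technique B"


ALGORITHMS = {frozenset({"technique A", "technique B"}): toy_second_recognition}
```

Keying the specialized algorithms by the candidate set keeps the first stage generic while each second-stage function only has to separate the few techniques it was written for.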
- FIG. 1 is a diagram showing a configuration example of a gymnastics scoring support system.
- FIG. 2 is a schematic diagram showing the skeleton recognition technology.
- FIG. 3 is a schematic diagram showing the technique recognition technology.
- FIG. 4 is a block diagram showing a functional configuration example of the technique recognition device.
- FIG. 5 is a diagram showing an example of dictionary data of tentative techniques.
- FIG. 6 is a schematic diagram showing an example of an inverted twist.
- FIG. 7 is a diagram showing an example of rotation information.
- FIG. 8 is a diagram showing an example of rotation information.
- FIG. 9 is a schematic diagram showing an example of rear wheels and wheels.
- FIG. 10 is a diagram showing an example of dictionary data of tricks.
- FIG. 11 is a flow chart showing the procedure of technique recognition processing.
- FIG. 12 is a diagram showing an example of the specialized algorithm of the first series.
- FIG. 13 is a diagram showing an example of the second series of specialized algorithms.
- FIG. 14 is a diagram
- FIG. 1 is a diagram showing a configuration example of a gymnastics scoring support system.
- A gymnastics scoring support system 1 shown in FIG. 1 captures three-dimensional data of a performer 3 as a subject, recognizes the skeleton and the like, and scores techniques accurately.
- the gymnastics scoring support system 1 may include a 3D laser sensor 5, a skeleton detection device 7, and a technique recognition device 10.
- the 3D laser sensor 5 is an example of a sensor device that uses an infrared laser or the like to measure the distance to an object, the so-called depth, for each pixel corresponding to a scanning point.
- The 3D laser sensor 5 may be a depth image camera or a laser sensor using LIDAR (Light Detection and Ranging) technology, such as a MEMS (Micro-Electro-Mechanical Systems) mirror-type laser sensor.
- the skeletal detection device 7 is an example of a computer that provides a skeletal detection function that uses depth images measured by the 3D laser sensor 5 to detect skeletal information such as skeletal parts of the performer 3, such as joint positions. Note that skeleton detection is sometimes called skeleton recognition or skeleton estimation.
- the 3D laser sensor 5 and the skeleton detection device 7 realize 3D sensing that performs three-dimensional measurement of the movement of the performer 3 without markers.
- the technique recognition device 10 is an example of a computer that provides a technique recognition function for recognizing techniques performed by the performer 3 using time-series data of skeleton information obtained by skeleton detection by the skeleton detection device 7 .
- Such a technique recognition function can further be packaged with an automatic scoring function that scores the techniques and the performance of the performer 3, for example by calculating the D score and the E score, based on the technique recognition result for the performer 3.
- the technique recognition result is used for automatic scoring, but the method of using the technique recognition result is not limited to this.
- Skeleton information and technique recognition results can be output to a scoring support application (hereinafter referred to as a "scoring support app").
- The scoring support app can realize displays such as a multi-angle view, which displays the joint angles for each frame of the performance of performer 3 from multiple viewpoints such as the front, side, and plan views, and a recognition view.
- technique recognition results can be used in various usage scenes such as training applications and broadcast/entertainment content.
- FIG. 2 is a schematic diagram showing the skeleton recognition technology. As shown in FIG. 2, merely as an example, the skeleton recognition function can be realized by a hybrid method that combines skeleton recognition using a machine learning model with fitting.
- For skeleton recognition, a machine learning model 7m, such as a CNN (Convolutional Neural Network), that takes a depth image as input and outputs estimated values of 3D skeletal coordinates can be used.
- A data set 7TR, including training data in which depth images are associated with correct-label 3D skeletal coordinates, can be used to train the machine learning model 7m.
- The training data can be prepared by generating depth images from 3D skeletal coordinates of gymnastics movements using computer graphics or the like.
- With the depth image as the explanatory variable and the label as the objective variable, the machine learning model 7m can be trained according to any machine learning algorithm, such as deep learning. This yields a trained machine learning model 7M.
- multi-viewpoint depth images output from multi-viewpoint 3D laser sensors 5A to 5N installed so as to overcome occlusion by the gymnastics equipment and the performer 3 himself are input to the machine learning model 7M.
- the machine learning model 7M to which the multi-viewpoint depth images are input in this way outputs the 3D skeleton coordinates of the performer 3 .
- Using the 3D skeleton coordinates output by the machine learning model 7M and the fitting result of the previous frame as initial values, a human body model is fitted to the 3D point cloud in which the multi-viewpoint depth images are integrated. Through this fitting, the 3D skeleton coordinates are determined.
- FIG. 3 is a schematic diagram showing the technique recognition technology.
- FIG. 3 shows an example of pommel horse technique recognition as an example of gymnastics.
- the technique recognition function divides the 3D skeleton coordinate time series data at the breaks between basic movements recognized from the 3D skeleton coordinate time series data (S1).
- The "basic movement" referred to here is a basic movement common to the techniques that constitute a performance. A dictionary of such basic movements can be created by registering them.
- For each piece of 3D skeletal coordinate time-series data divided in this way, the basic movement included in it is identified and feature amounts are extracted (S2 and S3).
- The basic technique is recognized based on the basic motion identified in step S2 and the feature amount extracted in step S3 (S4). Then, the time-series pattern of the basic techniques obtained as the recognition result in step S4 is compared with the time-series patterns registered in the technique dictionary data 13B, thereby determining the technique performed by the performer 3 (S5). For example, in the example shown in FIG. 3, as a result of recognizing the "crossed handstand" as the first basic movement and the "lower and open leg support" as the second basic movement, the performed technique is determined to be the "sea handstand".
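Steps S1 and S5 can be illustrated with a short sketch. The function names, the representation of the dictionary as a mapping from a tuple of basic techniques to a technique name, and the use of frame indices as "breaks" are assumptions for this example; the single dictionary entry simply repeats the FIG. 3 example described above.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Toy stand-in for the technique dictionary data 13B: time-series pattern -> technique.
TECHNIQUE_DICTIONARY: Dict[Tuple[str, ...], str] = {
    ("crossed handstand", "lower and open leg support"): "sea handstand",
}


def split_at_breaks(frames: Sequence, break_indices: Sequence[int]) -> List[Sequence]:
    """S1: divide the time series at the breaks between basic movements."""
    bounds = [0, *sorted(break_indices), len(frames)]
    # Skip empty segments that would arise from a break at index 0 or at the end.
    return [frames[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


def determine_technique(
    basic_techniques: Sequence[str],
    dictionary: Dict[Tuple[str, ...], str] = TECHNIQUE_DICTIONARY,
) -> Optional[str]:
    """S5: compare the recognized pattern with the registered patterns."""
    return dictionary.get(tuple(basic_techniques))
```

A pattern that is not registered returns `None`, which a caller could treat as "no technique established".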
- the D-score and E-score are calculated by aggregating the value points and implementation points of the techniques determined in step S5 according to the scoring rules (S6 and S8).
- This kind of technique recognition has realized automatic scoring for five events: rings, pommel horse, men's and women's vault, and balance beam.
- the above feature values include various items such as the posture of the hips and knees, which are common to many events, as well as those unique to a specific event, such as the position of the hand support in the case of a pommel horse.
- Among these various feature amounts, some are easy to obtain with high accuracy and others are difficult to obtain with high accuracy.
- In gymnastics, since various movements are performed within a single event, it is difficult to calculate feature amounts by a uniform method.
- For example, the ways to grip the horizontal bar or the uneven bars can include the forward hand, the reverse hand, and the large reverse hand.
- The reverse hand refers to a state in which the hand is twisted outward by 180° from the forward hand.
- The large reverse hand refers to a state in which the hand is twisted inward by 180° from the forward hand.
- In these two grips the arm is twisted in opposite directions, but because the twist of the arm is difficult to observe in an image, it can be difficult even for an expert such as a referee to distinguish between them from an image in which the grip is fixed.
- As approaches to distinguishing the grips, Reference Technique 1, which acquires finger joint positions, and Reference Technique 2, which acquires arm rotation information, can be cited.
- These Reference Techniques 1 and 2 are distinguished from the known prior art.
- In Reference Technique 1, 3D skeletal coordinates are obtained that further include finger joint positions in addition to major joints such as the head, shoulders, spine, elbows, wrists, hips, knees, and ankles.
- However, because fingers are smaller than other skeletal parts, they appear smaller and thinner than the other skeletal parts in the depth image.
- In addition, because the fingers are captured while in contact with the bar, occlusion and the like are likely to occur even in multi-viewpoint depth images. For these reasons, it is inherently difficult in Reference Technique 1 to acquire correct finger joint positions. Moreover, even if correct finger joint positions can be acquired, it is still difficult to distinguish between the overhand and the great overhand, because the finger joint positions are unlikely to differ between the two.
- In Reference Technique 2, the rotation information of the arm bones is acquired.
- the calculation accuracy of arm rotation information varies depending on the degree of bending of the arm. For example, when the arm is extended, the calculation accuracy of arm rotation information is lower than when the arm is bent, so it is difficult to obtain highly accurate rotation information. In this case, since it is still not possible to distinguish between grips, the accuracy of technique recognition and automatic scoring is degraded.
- Therefore, the technique recognition function according to this embodiment narrows the techniques included in gymnastics down to a subset of techniques based on the skeleton information obtained by skeleton detection, selects a specialized algorithm specialized in recognizing the narrowed-down subset, and recognizes which of those techniques has been performed.
- In this way, the problem is addressed by applying a specialized algorithm that specializes in recognizing some of the techniques.
- Taking the horizontal bar as an example of a gymnastics event, consider the case in which the basic motions are recognized in the order of basic motion 1 "front wheel handstand" and basic motion 2 "single twist".
- In this case, all the techniques included in the gymnastics event "horizontal bar" can be narrowed down to two techniques, "front wheel single twist single overhand" and "front wheel single twist large overhand". Since the difficulty levels of these two techniques are different, there is also a difference in the value points that are added when totaling the D score.
- With the candidates narrowed down in this way, it is possible to apply an algorithm that calculates the feature amount of the grip, which is the decisive factor in distinguishing the above two techniques.
- Such an algorithm may be built, in one aspect, on logic that holds under constraints such as the composition of a performance or the rules. That is, there is a heuristic that, under the constraint condition that the hand that is not the pivot grips the bar during the handstand twist, the elbow is more likely to be flexed than extended. Therefore, under the above constraint, the logic holds that arm rotation information used for fitting when the elbow is flexed is more reliable than arm rotation information used for fitting when the elbow is extended.
- Accordingly, an algorithm is applied that uses, as auxiliary information when calculating the feature amount of the grip, the rotation information obtained when the arm is flexed, in addition to the time-series data of the skeleton information of the performer 3.
- As a result, the grip feature amount can be calculated with higher accuracy than when it is calculated from the time-series data of the skeleton information of the performer 3 alone.
- Technique recognition can then be performed using highly accurate feature amounts.
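One way to act on this heuristic is to keep arm rotation values only from frames in which the elbow is flexed. The sketch below is an assumption-laden illustration: the frame keys (`shoulder`, `elbow`, `wrist`, `arm_rotation_deg`), the function names, and the 150° cut-off between "flexed" and "extended" are all hypothetical and are not specified in this publication.

```python
import math
from typing import Dict, List, Sequence


def elbow_angle_deg(shoulder: Sequence[float], elbow: Sequence[float],
                    wrist: Sequence[float]) -> float:
    """Angle at the elbow between upper arm and forearm (180 = fully extended)."""
    u = [s - e for s, e in zip(shoulder, elbow)]
    v = [w - e for w, e in zip(wrist, elbow)]
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0.0:
        raise ValueError("degenerate arm: coincident joint positions")
    cos_angle = sum(a * b for a, b in zip(u, v)) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def flexed_rotation_samples(frames: List[Dict], max_angle_deg: float = 150.0) -> List[float]:
    """Keep arm rotation values only from frames where the elbow is flexed."""
    return [
        f["arm_rotation_deg"]
        for f in frames
        if elbow_angle_deg(f["shoulder"], f["elbow"], f["wrist"]) <= max_angle_deg
    ]
```

The retained samples could then serve as the auxiliary information for the grip feature, while rotation values from extended-arm frames are ignored as less reliable.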
- FIG. 4 is a block diagram showing a functional configuration example of the technique recognition device 10. FIG. 4 schematically shows blocks corresponding to the technique recognition functions of the technique recognition device 10. As shown in FIG. 4, the technique recognition device 10 has a communication interface unit 11, a storage unit 13, and a control unit 15. It should be noted that FIG. 1 shows only an excerpt of the functional units related to the above-mentioned technique recognition function; in addition to the skeleton detection function and the automatic scoring function, functions that existing computers are equipped with by default or as an option may be provided in the technique recognition device 10.
- the communication interface unit 11 corresponds to an example of a communication control unit that controls communication with another device such as the skeleton detection device 7.
- the communication interface unit 11 can be realized by a network interface card such as a LAN (Local Area Network) card.
- The communication interface unit 11 receives skeleton information including 3D skeleton coordinates, or 3D skeleton coordinates after fitting, from the skeleton detection device 7, and outputs technique recognition results or automatic scoring results to an external device (not shown).
- the storage unit 13 is a functional unit that stores various data.
- the storage unit 13 is implemented by storage, such as internal, external or auxiliary storage.
- the storage unit 13 stores provisional technique dictionary data 13A and technique dictionary data 13B.
- the storage unit 13 can store various data such as technique recognition results and automatic scoring results. The description of each data of the tentative technique dictionary data 13A and the technique dictionary data 13B will be given later together with the description of the process in which the data is referred to or generated.
- the control unit 15 is a processing unit that performs overall control of the technique recognition device 10.
- the control unit 15 is realized by a hardware processor.
- the control unit 15 includes an acquisition unit 15A, a first calculation unit 15B, a first recognition unit 15C, a selection unit 15D, a second calculation unit 15E, and a second recognition unit 15F.
- FIG. 1 shows only an excerpt of the functions corresponding to the technique recognition function; the control unit 15 may further include the skeleton detection function, and may further include back-end functions such as automatic scoring, scoring support, training, and entertainment content.
- the acquisition unit 15A is a processing unit that acquires skeleton information.
- the acquisition unit 15A can acquire time-series data of skeleton information from the skeleton detection device 7 .
- the information source from which the acquisition unit 15A acquires the skeleton information may be any information source, and is not limited to communication via the network NW.
- the acquisition unit 15A may acquire skeleton information from a storage included in the technique recognition device 10, or removable media that can be attached to and removed from the technique recognition device 10, such as a memory card or USB (Universal Serial Bus) memory.
- The first calculation unit 15B is a processing unit that calculates a first feature amount used for the first technique recognition, which narrows down the gymnastics techniques. As an example only, the first calculation unit 15B calculates the first feature amount from the time-series data of the skeleton information. At this time, the first calculation unit 15B can calculate feature amounts for all the items defined in the technique dictionary data 13B, for example the items exemplified in the figures, but it can also narrow the calculation down to the first feature amount.
- The "first feature amount" referred to here is a feature amount used to narrow all the techniques of gymnastics down to a subset, that is, one or more techniques, and need not include feature amounts for all items.
- For example, a feature amount that is easily obtained with high accuracy, among the feature amounts defined in the technique dictionary data 13B, can be used as the first feature amount.
- As one example, a feature amount whose calculation accuracy is equal to or higher than a threshold value can be used as the first feature amount.
- Alternatively, a feature amount whose accuracy is stable, for example one whose variation in calculation accuracy, such as its variance, is less than a threshold, can be used as the first feature amount.
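The two selection criteria for the first feature amount can be sketched as simple filters. This is only an illustration under assumptions: the per-feature accuracy samples (for example, accuracies measured on several validation runs), the feature names, and both threshold values are hypothetical.

```python
from statistics import mean, pvariance
from typing import Dict, List, Sequence


def accurate_features(samples: Dict[str, Sequence[float]],
                      min_accuracy: float = 0.9) -> List[str]:
    """Features whose mean calculation accuracy is at or above a threshold."""
    return [name for name, acc in samples.items() if mean(acc) >= min_accuracy]


def stable_features(samples: Dict[str, Sequence[float]],
                    max_variance: float = 0.01) -> List[str]:
    """Features whose accuracy varies little (population variance below a threshold)."""
    return [name for name, acc in samples.items() if pvariance(acc) < max_variance]


# Made-up accuracy measurements for three illustrative feature amounts.
ACCURACY_SAMPLES = {
    "hip_posture": [0.95, 0.96, 0.94],
    "knee_posture": [0.8, 0.8, 0.8],
    "grip": [0.5, 0.9, 0.3],
}
```

With these sample values, the grip feature fails both criteria, which matches the idea that it is left to the second technique recognition.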
- the first recognition unit 15C is a processing unit that executes the first technique recognition.
- the technique recognition technology described in WO2019/116495 can be used for the first technique recognition.
- the first recognition unit 15C can perform the first technique recognition using the time-series data of skeleton information and the first feature amount calculated by the first calculation unit 15B. More specifically, the first recognition unit 15C divides the 3D skeleton coordinate time-series data at the breaks between basic motions recognized from the 3D skeleton coordinate time-series data. Then, the first recognition unit 15C identifies the basic motion included in the partial time-series data for each divided partial time-series data. After that, the first recognition unit 15C recognizes the basic technique based on the identified basic motion and the first feature amount calculated by the first calculation unit 15B.
- The first recognition unit 15C compares the time-series pattern of the basic techniques obtained as the recognition result with the time-series patterns registered in the dictionary data 13A of provisional techniques, thereby narrowing down, from among all the techniques of gymnastics, the candidates for the technique performed by performer 3.
- The technique temporarily narrowed down by the first technique recognition may be referred to as a "provisional technique" to distinguish it from the demonstration technique uniquely identified by the second technique recognition described later.
- FIG. 5 is a diagram showing an example of the temporary technique dictionary data 13A.
- FIG. 5 shows, as an example only, dictionary data 13A of provisional techniques relating to the gymnastics "horizontal bar".
- the temporary technique dictionary data 13A can employ data in which a set of technique candidates and a time-series pattern of basic techniques are associated with each temporary technique.
- basic techniques may include items such as basic motions and feature amounts.
- the tentative move dictionary data 13A is used for narrowing down the tentative moves as one aspect. From this aspect, the dictionary data 13A of provisional techniques may not necessarily include the second feature amount used in the second technique recognition for uniquely identifying the demonstration technique from among the provisional techniques.
- As an example, consider the case in which the time-series pattern of the recognized basic techniques matches the pattern in which the basic motion "front wheel handstand" and the basic motion "1/2 twist" are performed in this order.
- In this case, the candidates are narrowed down to the tentative technique identified by the tentative technique ID "002", that is, candidate 1 "front wheel twisting handstand".
- Because the front wheel twisting handstand is a technique for which the grip does not matter, the candidates are narrowed down to one without waiting for the execution of the second technique recognition.
- The selection unit 15D is a processing unit that selects a specialized algorithm that specializes in recognizing the subset of techniques narrowed down by the first recognition unit 15C. As an example only, when the tentative technique recognition result by the first recognition unit 15C is obtained, the selection unit 15D can call the specialized algorithm for the technique candidates narrowed down as the tentative technique, according to the function name associated with the tentative technique obtained as the recognition result.
- For example, by registering a function name in association with the tentative technique ID in the tentative technique dictionary data 13A, the specialized algorithm can be called.
- Alternatively, a database, for example a lookup table, in which the correspondence between tentative techniques and function names is defined may be used.
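A lookup table of this kind might look like the sketch below. The function `first_series_grip`, the table names, and the ID "101" are hypothetical; only the ID "002" (for which the grip does not matter, so no second recognition is needed) comes from the example in the text.

```python
from typing import Callable, Dict, List, Optional


def first_series_grip(frames: List[dict]) -> str:
    """Placeholder for a specialized algorithm; a real one would compute the grip feature."""
    return "grip feature (placeholder)"


# Registry of callable specialized algorithms, keyed by function name.
FUNCTIONS: Dict[str, Callable[[List[dict]], str]] = {
    "first_series_grip": first_series_grip,
}

# Lookup table: tentative technique ID -> function name (None = no second recognition).
LOOKUP: Dict[str, Optional[str]] = {
    "002": None,                 # already narrowed down to one technique
    "101": "first_series_grip",  # hypothetical ID used only for illustration
}


def select_specialized_algorithm(tentative_id: str) -> Optional[Callable[[List[dict]], str]]:
    """Return the specialized algorithm registered for a tentative technique, if any."""
    function_name = LOOKUP[tentative_id]   # raises KeyError for an unregistered ID
    if function_name is None:
        return None
    return FUNCTIONS[function_name]
```

Storing names rather than function objects in the table mirrors the description, in which the dictionary data holds a function name that is resolved at call time.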
- the second calculation unit 15E is a processing unit that calculates a second feature amount.
- According to the specialized algorithm selected by the selection unit 15D, the second calculation unit 15E calculates the second feature amount, which is the decisive factor for distinguishing the demonstration technique from among the technique candidates narrowed down by the first technique recognition.
- Such specialized algorithms may be built on logic that holds under constraints such as performance constructs or rules.
- FIG. 6 is a schematic diagram showing an example of an inverted twist.
- the flow of time t is indicated by arrows, and poses P11 to P14 of performer 3 from time t11 to time t14 are schematically illustrated.
- The left hand of performer 3 is used as the pivot, while the right hand, which is not the pivot, is released from the bar and performer 3 transitions to an inverted posture.
- performer 3 performs one twist while standing upside down with the left hand as the pivot.
- the right elbow joint at time t11 under the constraint condition that the right elbow, which is not the axis, grabs the bar at the time of transition to the inverted twist, the right elbow is flexed rather than extended. It is clear that there is a heuristic that the probability that the Therefore, for the technique candidates belonging to the first series, under the above constraints, the credibility of the arm rotation information used for fitting when the elbow is flexed is higher than the arm rotation information used for fitting when the elbow is extended.
- Logic can work. Based on such a logic, a specialized algorithm is constructed that uses the rotation information when the elbow is flexed together with the time-series data of the skeletal information of the performer 3 as auxiliary information when calculating the second feature amount. .
- The second calculation unit 15E identifies the axis hand of performer 3. For example, the hand with the smaller distance between the wrist joint position and the position of the horizontal bar can be estimated to be the axis hand. The second calculation unit 15E then estimates the grip of the axis hand of performer 3 based on, among other things, a specific type of feature amount, such as the body rotation direction or the body rotation amount, among the first feature amounts from which the basic motion "single twist" was recognized in the first technique recognition. If the grip of the axis hand is the el-grip (large reverse grip), the second calculation unit 15E executes the following processing.
- The second calculation unit 15E estimates the grip of the non-axis hand of performer 3 based on the arm rotation information used for fitting in the skeleton detection over the interval in which the distance between the non-axis wrist and the horizontal bar is equal to or greater than a threshold. If the grip of the non-axis hand is the el-grip, the second calculation unit 15E calculates the grip of the second feature amount as "el-grip". If the grip of the non-axis hand is not the el-grip, the second calculation unit 15E calculates the grip of the second feature amount as "other than el-grip".
- FIGS. 7 and 8 are diagrams showing examples of rotation information.
- FIGS. 7 and 8 show, merely as an example of the rotation information, the rotation values (for example, the total of the rotation angles) of the upper arm and forearm of the right hand, which is not the axis hand of performer 3 performing the handstand twist.
- FIG. 7 shows an example in which the right hand of performer 3 grips the horizontal bar in a reverse grip.
- FIG. 8 shows an example in which the right hand of performer 3 grips the horizontal bar in a reverse grip.
- The vertical axis of each graph indicates the rotation value.
- The horizontal axis of each graph indicates time.
- FIGS. 7 and 8 use the rotation values of both the upper arm and the forearm, but this is only an example; at least one of the upper-arm rotation value and the forearm rotation value can be used.
- The second calculation unit 15E determines whether the previous technique, for example the most recent technique recognition result among those for which the second technique recognition has already been performed, is an Adler-type technique. If the previous technique is not an Adler-type technique, the second calculation unit 15E determines whether the previous technique is a handstand twist. If the previous technique is a handstand twist, the second calculation unit 15E determines whether the grip is the el-grip (large reverse grip) based on the second feature amount used for the second technique recognition of the previous technique. If the grip is the el-grip, the second calculation unit 15E determines whether a re-grip was performed before completion of the technique being recognized; this determination uses a comparison against a threshold.
- If no re-grip was performed, the second calculation unit 15E calculates the grip of the second feature amount as "el-grip".
- Otherwise, the second calculation unit 15E calculates the grip of the second feature amount as "other than el-grip".
- Next, an example is given of calculating the second feature amount that serves as the decisive factor for distinguishing technique candidates of the third series, for which it is difficult to formulate logic, such as conditional determinations, that calculates the second feature amount with high accuracy.
- Combinations of such technique candidates include "rear wheel" and "normal wheel", "reverse wheel" and "front wheel", "reverse wheel" and "rear wheel", and the like.
- FIG. 9 is a schematic diagram showing an example of the "rear wheel" and the "normal wheel".
- In FIG. 9, postures P21 to P22 of performer 3A performing the "rear wheel" and postures P31 to P32 of performer 3B performing the "normal wheel" are shown side by side.
- As a comparison of postures P21 to P22 of performer 3A with postures P31 to P32 of performer 3B in FIG. 9 suggests, it is difficult to distinguish these techniques with high accuracy by comparing the armpit angle with a threshold.
- For this reason, a specialized algorithm is used that employs a machine learning model which takes skeleton information, or time-series data of skeleton information, as input and outputs a class corresponding to the value of the second feature amount, for example, whether the armpit is open or closed.
- For training such a machine learning model, skeleton information labeled with the correct armpit open/closed state is used as training data.
- The skeleton information serves as the explanatory variable of the machine learning model and the label as its objective variable, and the model can be trained according to any machine learning algorithm, such as deep learning. This yields a trained machine learning model.
- At inference time, the skeleton information obtained as a result of fitting is input to the trained machine learning model.
- The machine learning model that receives the skeleton information outputs the class corresponding to the open or closed state of the armpit.
- In this way, the second feature amount can be obtained with high accuracy.
- The example above applies a specialized algorithm using a machine learning model to technique candidates belonging to the third series.
- Specialized algorithms using machine learning models can also be applied to the first series or the second series. This is easily realized by replacing the label, which is the objective variable of the machine learning model, with the second feature amount corresponding to the first series or the second series.
- The second recognition unit 15F is a processing unit that executes the second technique recognition.
- The technique recognition technology described in WO2019/116495 can also be used for the second technique recognition.
- The second recognition unit 15F can execute the second technique recognition using the tentative technique recognition result obtained in the first technique recognition and the second feature amount calculated by the second calculation unit 15E.
- This description does not preclude using the time-series data of the skeleton information and the first feature amount for the second technique recognition.
- Processing that overlaps with the first technique recognition can be skipped. For example, the division of the time-series data of 3D skeleton coordinates and the recognition of basic motions can be omitted.
- The second recognition unit 15F targets, among the basic techniques defined in the technique dictionary data 13B, those of the techniques corresponding to the technique candidates narrowed down in the first technique recognition, and recognizes among them the basic technique corresponding to the second feature amount calculated by the second calculation unit 15E. The second recognition unit 15F then compares the time-series pattern of the basic techniques obtained as the recognition result with the time-series patterns registered in the technique dictionary data 13B, and thereby recognizes which of the technique candidates narrowed down by the first technique recognition was performed by performer 3.
- FIG. 10 is a diagram showing an example of the technique dictionary data 13B.
- FIG. 10 shows, as an example only, the technique dictionary data 13B for the gymnastics event "horizontal bar".
- As the technique dictionary data 13B, data in which a time-series pattern of basic techniques is associated with each technique can be used.
- A basic technique may include basic motions, feature amounts, and the like.
- For example, suppose the first technique recognition narrows the candidates down to two techniques belonging to the first series: candidate 1, "front wheel single twist one-arm el-grip", and candidate 2, "front wheel single twist el-grip" (el-grip: large reverse grip).
- Depending on the second feature amount, the second technique recognition recognizes either the technique name "front wheel single twist one-arm el-grip" or the technique name "front wheel single twist el-grip".
- The technique recognition result obtained by the second technique recognition can be output to the following destinations.
- Examples include an automatic scoring function that scores the techniques and the performance of performer 3, for example by calculating the D score and the E score, as well as back-end functions and services such as scoring support, training, and entertainment content.
- FIG. 11 is a flowchart showing the procedure of the technique recognition processing.
- The technique recognition processing can be executed repeatedly for as long as skeleton information continues to be output from the skeleton detection device 7.
- The technique recognition processing may be real-time processing in which skeleton information is acquired frame by frame, or batch processing in which time-series data of skeleton information accumulated over a certain period or a specific number of frames is acquired collectively.
- The first calculation unit 15B calculates, from the skeleton information acquired in step S101, a first feature amount used for the first technique recognition that narrows down the gymnastics techniques (step S102).
- The first recognition unit 15C uses the skeleton information acquired in step S101 and the first feature amount calculated in step S102 to execute the first technique recognition, which narrows down, from among all the gymnastics techniques, the candidates for the technique performed by performer 3 (step S103).
- The selection unit 15D selects a specialized algorithm dedicated to recognizing the subset of techniques narrowed down in step S103 (step S104).
- In accordance with the specialized algorithm selected in step S104, the second calculation unit 15E calculates a second feature amount that is the decisive factor for distinguishing the performed technique from among the technique candidates narrowed down by the first technique recognition (step S105).
- The second recognition unit 15F uses the tentative technique recognition result obtained in step S103 and the second feature amount calculated in step S105 to execute the second technique recognition, which recognizes the technique performed by performer 3 from among the technique candidates narrowed down in the first technique recognition (step S106).
- FIG. 12 is a diagram showing an example of the specialized algorithm for the first series. This processing corresponds to step S105 shown in FIG. 11 and is started, for example, when the specialized algorithm for the first series is selected in step S104.
- The second calculation unit 15E identifies the axis hand of performer 3 (step S301). For example, the hand with the smaller distance between the wrist joint position and the position of the horizontal bar can be estimated to be the axis hand.
- The second calculation unit 15E estimates the grip of the axis hand of performer 3 based on, among other things, a specific type of feature amount, such as the body rotation direction or the body rotation amount, among the first feature amounts from which the basic motion "single twist" was recognized in the first technique recognition (step S302).
- If the grip of the axis hand is the el-grip (step S303 Yes), the second calculation unit 15E estimates the grip of the non-axis hand of performer 3 based on the arm rotation information used for fitting in the skeleton detection over the interval in which the distance between the non-axis wrist and the horizontal bar is equal to or greater than a threshold (step S304).
- If the grip of the non-axis hand is the el-grip (step S305 Yes), the second calculation unit 15E calculates the grip of the second feature amount as "el-grip" (step S306).
- If the grip of the axis hand of performer 3 is not the el-grip, or if the grip of the non-axis hand is not the el-grip (step S303 No or step S305 No), the second calculation unit 15E calculates the grip of the second feature amount as "other than el-grip" (step S307).
- FIG. 13 is a diagram showing an example of the specialized algorithm for the second series. This processing corresponds to step S105 shown in FIG. 11 and is started, for example, when the specialized algorithm for the second series is selected in step S104.
- The second calculation unit 15E determines whether the previous technique, for example the most recent technique recognition result among those for which the second technique recognition has already been performed, is an Adler-type technique (step S501). If the previous technique is an Adler-type technique (step S501 Yes), the processing proceeds to step S504.
- If the previous technique is not an Adler-type technique (step S501 No), the second calculation unit 15E determines whether the previous technique is a handstand twist (step S502). If the previous technique is a handstand twist (step S502 Yes), the second calculation unit 15E determines whether the grip is the el-grip based on the second feature amount used for the second technique recognition of the previous technique (step S503).
- If the grip is the el-grip (step S503 Yes), the second calculation unit 15E determines whether a re-grip was performed before completion of the technique being recognized (step S504).
- If no re-grip was performed (step S504 No), the second calculation unit 15E calculates the grip of the second feature amount as "el-grip" (step S505).
- Otherwise, the second calculation unit 15E calculates the grip of the second feature amount as "other than el-grip" (step S506). Step S502 No, step S503 No, and step S504 Yes are the branches that proceed to step S506.
- As described above, the technique recognition device 10 narrows down the techniques included in the technique dictionary to a subset based on the skeleton information obtained by skeleton detection, selects a specialized algorithm dedicated to recognizing that subset, and recognizes which technique of the subset was performed.
- Therefore, the technique recognition device 10 according to the present embodiment can improve the accuracy of technique recognition. As a result, the accuracy of back-end functions and services such as automatic scoring, scoring support, training, and entertainment content can also be improved.
- In the first embodiment, as an example of a specialized algorithm, the second feature amount for distinguishing the technique candidates is calculated with high accuracy before the second technique recognition is performed, but the second feature amount does not necessarily have to be calculated. For example, it is possible to skip the calculation of the second feature amount and execute the second technique recognition directly.
- In one case, the second recognition unit 15F recognizes the technique performed by performer 3, among the technique candidates narrowed down in the first technique recognition, as "front wheel single twist el-grip".
- In the other case, the second recognition unit 15F recognizes the technique performed by performer 3, among the technique candidates narrowed down in the first technique recognition, as "front wheel single twist one-arm el-grip". In this way, the calculation of the second feature amount may be skipped.
- When the flow takes the branch of step S501 Yes or step S504 No shown in FIG. 13, the second recognition unit 15F recognizes the technique performed by performer 3 as the corresponding candidate among the technique candidates narrowed down in the first technique recognition.
- When the flow takes the branch of step S502 No, step S503 No, or step S504 Yes, the second recognition unit 15F recognizes the technique performed by performer 3, among the technique candidates narrowed down in the first technique recognition, as "front wheel". In this way, the calculation of the second feature amount may be skipped.
- Alternatively, a machine learning model is used that takes skeleton information, or time-series data of skeleton information, as input and outputs a class corresponding to the technique name, for example, "rear wheel" or "normal wheel".
- For training such a model, skeleton information labeled with the correct technique name, "rear wheel" or "normal wheel", is used as training data.
- The skeleton information serves as the explanatory variable of the machine learning model and the label as its objective variable, and the model can be trained according to any machine learning algorithm, such as deep learning.
- Each component of each illustrated device does not necessarily have to be physically configured as illustrated.
- That is, the specific form of distribution and integration of each device is not limited to the one shown in the figures; all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions.
- For example, the acquisition unit 15A, the first calculation unit 15B, the first recognition unit 15C, the selection unit 15D, the second calculation unit 15E, or the second recognition unit 15F may be connected via a network as an external device of the technique recognition device 10.
- Alternatively, separate devices may each have the acquisition unit 15A, the first calculation unit 15B, the first recognition unit 15C, the selection unit 15D, the second calculation unit 15E, or the second recognition unit 15F, and may realize the functions of the technique recognition device 10 by being connected to a network and cooperating with each other.
- Likewise, separate devices may each have all or part of the tentative technique dictionary data 13A or the technique dictionary data 13B stored in the storage unit 13, and may realize the functions of the technique recognition device 10 by being connected to a network and cooperating with each other.
- FIG. 13 is a diagram showing a hardware configuration example.
- The computer 100 has an operation unit 110a, a speaker 110b, a camera 110c, a display 120, and a communication unit. The computer 100 further has a CPU 150, a ROM 160, an HDD 170, and a RAM 180. These units 110 to 180 are connected via a bus 140.
- The HDD 170 stores a technique recognition program 170a that provides the same functions as the acquisition unit 15A, the first calculation unit 15B, the first recognition unit 15C, the selection unit 15D, the second calculation unit 15E, and the second recognition unit 15F shown in the first embodiment.
- The technique recognition program 170a may be integrated or separated in the same manner as the components of the acquisition unit 15A, the first calculation unit 15B, the first recognition unit 15C, the selection unit 15D, the second calculation unit 15E, and the second recognition unit 15F. That is, the HDD 170 does not necessarily have to store all the data shown in the first embodiment; it only needs to store the data used for processing.
- The CPU 150 reads the technique recognition program 170a from the HDD 170 and loads it into the RAM 180.
- As a result, the technique recognition program 170a functions as a technique recognition process 180a.
- The technique recognition process 180a loads various data read from the HDD 170 into the area of the RAM 180 allocated to the technique recognition process 180a, and executes various processing using the loaded data.
- Examples of the processing executed by the technique recognition process 180a include the processing shown in FIGS. 11 to 13.
- Note that the CPU 150 does not necessarily have to run all the processing units described in the first embodiment; it suffices that the processing units corresponding to the processing to be executed are virtually realized.
- For example, each program may be stored on a "portable physical medium" inserted into the computer 100, such as a flexible disk (so-called FD), CD-ROM, DVD, magneto-optical disk, or IC card. The computer 100 may then acquire each program from such a portable physical medium and execute it.
- Alternatively, each program may be stored in another computer or server device connected to the computer 100 via a public line, the Internet, a LAN, a WAN, or the like, and the computer 100 may acquire each program from there and execute it.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Psychiatry (AREA)
- Evolutionary Computation (AREA)
- Physical Education & Sports Medicine (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
Description
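FIG. 1 is a diagram showing a configuration example of the gymnastics scoring support system. The gymnastics scoring support system 1 shown in FIG. 1 captures three-dimensional data of performer 3, who is the subject, recognizes the skeleton and the like, and scores techniques accurately.
FIG. 2 is a schematic diagram showing the skeleton recognition technology. As shown in FIG. 2, the skeleton recognition function can be realized, merely as an example, by a hybrid method that combines skeleton recognition using a machine learning model with fitting.
FIG. 3 is a schematic diagram showing the technique recognition technology. FIG. 3 shows an example in which technique recognition is performed for the pommel horse as one example of a gymnastics event. As shown in FIG. 3, the technique recognition function divides the time-series data of 3D skeleton coordinates at the breaks between basic motions recognized from that time-series data (S1). A "basic motion" here refers to a fundamental movement common to the techniques that make up a performance. For example, as illustrated in the technique dictionary data 13B, basic motions can be compiled into a dictionary by registering one or more basic motions in association with each technique.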
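As a rough illustration of the division in step S1, the following sketch splits a frame sequence wherever the recognized basic motion changes. The per-frame labels and the function name `split_by_basic_motion` are assumptions made for this example only; the description does not specify how the breaks between basic motions are detected.

```python
from itertools import groupby


def split_by_basic_motion(frames, labels):
    """Split frames into segments at each change of basic-motion label.

    frames: per-frame 3D skeleton coordinates (any objects).
    labels: one basic-motion label per frame (hypothetical input).
    Returns a list of (label, segment_frames) tuples in time order.
    """
    if len(frames) != len(labels):
        raise ValueError("frames and labels must have the same length")
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))  # number of consecutive frames with this label
        segments.append((label, frames[start:start + length]))
        start += length
    return segments
```

Each resulting segment can then be matched against the basic motions registered for each technique in a dictionary.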
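However, variation in the calculation accuracy of feature amounts may hinder extending the gymnastics events covered by the above technique recognition beyond the five events.
Therefore, the technique recognition function according to this embodiment narrows down the techniques included in a gymnastics event to a subset based on the skeleton information obtained by skeleton detection, selects a specialized algorithm dedicated to recognizing that subset, and recognizes which technique of the subset was performed. In other words, instead of using a technique recognition algorithm that covers every technique in the technique dictionary, the problem is solved by applying a specialized algorithm dedicated to recognizing a subset of techniques.
FIG. 4 is a block diagram showing an example of the functional configuration of the technique recognition device 10. FIG. 4 schematically shows the blocks corresponding to the technique recognition function of the technique recognition device 10. As shown in FIG. 4, the technique recognition device 10 has a communication interface unit 11, a storage unit 13, and a control unit 15. Note that FIG. 1 shows only an excerpt of the functional units related to the above technique recognition function; the technique recognition device 10 may also be provided with a skeleton detection function, an automatic scoring function, and functions that existing computers have by default or as options.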
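(1) "Rear jump wheel 3/2 twist one-arm el-grip" (後方とび車輪3/2ひねり片大逆手) and "rear jump wheel 3/2 twist el-grip" (後方とび車輪3/2ひねり大逆手), where el-grip denotes the large reverse grip (大逆手)
(2) "Stalder twist handstand" (シュタルダーひねり倒立) and "Stalder twist el-grip" (シュタルダーひねり大逆手)
(3) "Stalder jump wheel 3/2 twist one-arm el-grip" (シュタルダーとび車輪3/2ひねり片大逆手) and "Stalder jump wheel 3/2 twist el-grip" (シュタルダーとび車輪3/2ひねり大逆手)
(1) "Endo" (エンドー) and "el-grip Endo" (大逆手エンドー)
(2) "Endo single twist one-arm el-grip" (エンドー1回ひねり片大逆手) and "el-grip Endo single twist one-arm el-grip handstand" (大逆手エンドー1回ひねり片大逆手倒立)
(a) The movements other than the grip are identical.
(b) Changing to the el-grip is not easy and must be accompanied by a specific motion.
(c) Examples of the specific motion include Adler-type techniques and the handstand twist.
(d) If the specific motion is not performed immediately before the motion corresponding to the technique candidate, the grip can be judged not to be the el-grip. Even if the specific motion is performed, the grip can be judged not to be the el-grip if the performer releases the hand and re-grips before completing the motion regarded as the technique candidate.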
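The following describes (1) the technique recognition processing executed by the technique recognition device 10 according to this embodiment. In addition, as examples of the specialized algorithms used to calculate the second feature amount in step S105 of the technique recognition processing, (2) the specialized algorithm applied to the first series and (3) the specialized algorithm applied to the second series are described.
FIG. 11 is a flowchart showing the procedure of the technique recognition processing. As an example only, the technique recognition processing can be executed repeatedly for as long as skeleton information continues to be output from the skeleton detection device 7. The technique recognition processing may be real-time processing in which skeleton information is acquired frame by frame, or batch processing in which time-series data of skeleton information accumulated over a certain period or a specific number of frames is acquired collectively.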
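The flow of steps S102 to S106 can be sketched as a two-stage pipeline with a lookup table from tentative technique IDs to specialized algorithms. This is a minimal sketch under stated assumptions: the function `recognize_technique`, its parameters, and the simple dictionary that maps each technique name to a single second-feature value are all hypothetical and stand in for the first calculation unit, the first recognition unit, the selection unit, and the technique dictionary data.

```python
def recognize_technique(skeleton_series, first_features_fn, narrow_down_fn,
                        algorithm_table, technique_dictionary):
    """Two-stage technique recognition following steps S102 to S106.

    Every callable and table is a placeholder supplied by the caller:
      first_features_fn(series) -> first feature amounts
      narrow_down_fn(series, features) -> (tentative_id, candidate_names)
      algorithm_table[tentative_id] -> specialized algorithm (callable)
      technique_dictionary[name] -> second-feature value for that technique
    """
    # S102: first feature amounts, used only for narrowing down.
    first_features = first_features_fn(skeleton_series)
    # S103: first technique recognition yields a tentative technique
    # and the narrowed-down candidates.
    tentative_id, candidates = narrow_down_fn(skeleton_series, first_features)
    if len(candidates) == 1:
        # Already narrowed down to one technique; no second stage needed.
        return candidates[0]
    # S104: select the specialized algorithm registered for the tentative technique.
    specialized = algorithm_table[tentative_id]
    # S105: second feature amount that separates the candidates.
    second_feature = specialized(skeleton_series, first_features)
    # S106: second technique recognition among the candidates only.
    for name in candidates:
        if technique_dictionary[name] == second_feature:
            return name
    return None
```

Keeping the specialized algorithms in a table keyed by the tentative technique means a new series can be supported by registering one more entry, without touching the first stage.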
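FIG. 12 is a diagram showing an example of the specialized algorithm for the first series. This processing corresponds to step S105 shown in FIG. 11 and is started, for example, when the specialized algorithm for the first series is selected in step S104.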
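The branch structure of steps S301 to S307 can be sketched as follows. This is only an illustration under assumptions: the horizontal bar is simplified to a single reference point, the grip of the axis hand (step S302) is passed in as a boolean, and the rule that a peak rotation value at or above `rotation_threshold` indicates the el-grip is a hypothetical stand-in, since the description does not state how the grip is derived from the rotation information.

```python
import math

EL_GRIP = "el-grip"
OTHER = "other than el-grip"


def identify_axis_hand(left_wrist, right_wrist, bar_point):
    """S301: the hand whose wrist is closer to the bar is taken as the axis hand.

    bar_point is a single reference point on the bar (a simplification).
    """
    left_distance = math.dist(left_wrist, bar_point)
    right_distance = math.dist(right_wrist, bar_point)
    return "left" if left_distance <= right_distance else "right"


def first_series_grip(axis_grip_is_el, non_axis_rotation_values, rotation_threshold):
    """S303 to S307: decide the grip reported as the second feature amount.

    axis_grip_is_el: outcome of S302 for the axis hand (bool).
    non_axis_rotation_values: arm rotation values used for fitting, sampled
        while the non-axis wrist is away from the bar.
    rotation_threshold: hypothetical cut-off, not taken from the source.
    """
    if not axis_grip_is_el:                      # S303 No
        return OTHER                             # S307
    # S304: estimate the non-axis grip from the rotation information.
    peak = max(non_axis_rotation_values, default=0.0)
    non_axis_is_el = peak >= rotation_threshold  # assumed decision rule
    return EL_GRIP if non_axis_is_el else OTHER  # S305 -> S306 or S307
```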
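FIG. 13 is a diagram showing an example of the specialized algorithm for the second series. This processing corresponds to step S105 shown in FIG. 11 and is started, for example, when the specialized algorithm for the second series is selected in step S104.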
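The decision flow of steps S501 to S506 reduces to a few conditional checks on the previous technique and on whether a re-grip occurred. The sketch below assumes these facts are already available as booleans and a grip string; the function name `second_series_grip` and its parameters are hypothetical.

```python
EL_GRIP = "el-grip"
OTHER = "other than el-grip"


def second_series_grip(previous_is_adler, previous_is_handstand_twist,
                       previous_grip, regripped_before_completion):
    """Return the grip reported as the second feature amount (S501 to S506)."""
    if not previous_is_adler:                    # S501 No
        if not previous_is_handstand_twist:      # S502 No
            return OTHER                         # S506
        if previous_grip != EL_GRIP:             # S503 No
            return OTHER                         # S506
    # Reached via S501 Yes or S503 Yes.
    if regripped_before_completion:              # S504 Yes
        return OTHER                             # S506
    return EL_GRIP                               # S504 No -> S505
```

For example, a previous Adler-type technique with no re-grip yields "el-grip", while any re-grip before completion yields "other than el-grip" regardless of the previous technique.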
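As described above, the technique recognition device 10 according to this embodiment narrows down the techniques included in the technique dictionary to a subset based on the skeleton information obtained by skeleton detection, selects a specialized algorithm dedicated to recognizing that subset, and recognizes which technique of the subset was performed. Therefore, the technique recognition device 10 according to this embodiment can improve the accuracy of technique recognition. In turn, the accuracy of back-end functions and services such as automatic scoring, scoring support, training, and entertainment content is also improved.
In the first embodiment above, as an example of a specialized algorithm, the second feature amount for distinguishing the technique candidates is calculated with high accuracy before the second technique recognition is executed, but the second feature amount does not necessarily have to be calculated. For example, the calculation of the second feature amount can be skipped and the second technique recognition executed directly.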
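One way to picture a model that maps skeleton-derived input directly to a technique name is the toy classifier below. It is not the method of the description, which allows any machine learning algorithm such as deep learning; a nearest-centroid classifier is swapped in here only because it fits in a few lines. The feature vectors and the function names are assumptions for illustration.

```python
import math


def train_nearest_centroid(samples, labels):
    """Return {technique_name: centroid} from labeled feature vectors.

    samples: feature vectors derived from skeleton information (explanatory variable).
    labels: technique names such as "rear wheel" (objective variable).
    """
    sums, counts = {}, {}
    for vector, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(vector))
        for i, value in enumerate(vector):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_technique(model, vector):
    """Return the technique name whose centroid is closest to the input vector."""
    return min(model, key=lambda label: math.dist(model[label], vector))
```

With this arrangement the class output by the model is itself the recognition result, so no separate second feature amount is computed.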
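Each component of each illustrated device does not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of each device is not limited to the one illustrated; all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions. For example, the acquisition unit 15A, the first calculation unit 15B, the first recognition unit 15C, the selection unit 15D, the second calculation unit 15E, or the second recognition unit 15F may be connected via a network as an external device of the technique recognition device 10. Alternatively, separate devices may each have the acquisition unit 15A, the first calculation unit 15B, the first recognition unit 15C, the selection unit 15D, the second calculation unit 15E, or the second recognition unit 15F, and may realize the functions of the technique recognition device 10 by being connected to a network and cooperating with each other. Likewise, separate devices may each have all or part of the tentative technique dictionary data 13A or the technique dictionary data 13B stored in the storage unit 13, and may realize the functions of the technique recognition device 10 by being connected to a network and cooperating with each other.
The various kinds of processing described in the above embodiments can be realized by executing a program prepared in advance on a computer such as a personal computer or a workstation. Accordingly, an example of a computer that executes a technique recognition program having the same functions as the first and second embodiments is described below with reference to FIG. 13.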
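3 Performer
5 3D laser sensor
7 Skeleton detection device
10 Technique recognition device
11 Communication interface unit
13 Storage unit
13A Tentative technique dictionary data
13B Technique dictionary data
15 Control unit
15A Acquisition unit
15B First calculation unit
15C First recognition unit
15D Selection unit
15E Second calculation unit
15F Second recognition unit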
Claims (13)
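- A technique recognition method in which a computer executes processing comprising: acquiring skeleton information obtained by skeleton detection; executing first technique recognition that narrows down the techniques included in a gymnastics event to a subset of techniques based on the skeleton information; and executing second technique recognition that recognizes which technique of the subset was performed, in accordance with a specialized algorithm dedicated to recognizing the subset of techniques narrowed down by the first technique recognition.
- The technique recognition method according to claim 1, wherein the first technique recognition includes narrowing down to the subset of techniques based on a first feature amount whose calculation accuracy is equal to or greater than a threshold among the feature amounts relating to the techniques included in the gymnastics event, and the second technique recognition includes calculating, in accordance with the specialized algorithm, a second feature amount that distinguishes the techniques of the subset narrowed down by the first technique recognition, and recognizing which technique of the subset was performed based on the calculated second feature amount.
- The technique recognition method according to claim 2, wherein the second technique recognition includes calculating a grip as the second feature amount based on the skeleton information and on rotation information at the time of elbow flexion used in the skeleton detection.
- The technique recognition method according to claim 2, wherein the second technique recognition includes calculating a grip as the second feature amount based on the presence or absence of a specific motion in the technique obtained as the most recent technique recognition result among the technique recognition results for which the second technique recognition has been executed, and on the presence or absence of a re-grip after the specific motion.
- The technique recognition method according to claim 2, wherein the second technique recognition includes calculating the second feature amount by inputting the skeleton information to a machine learning model on which machine learning has been executed with skeleton information as the explanatory variable and, as the objective variable, a label of the second feature amount that distinguishes the techniques of the subset narrowed down by the first technique recognition.
- The technique recognition method according to claim 1, wherein the second technique recognition includes recognizing which technique of the subset was performed by inputting the skeleton information to a machine learning model on which machine learning has been executed with skeleton information as the explanatory variable and, as the objective variable, labels of the names of the techniques of the subset narrowed down by the first technique recognition.
- A technique recognition device including a control unit that executes processing comprising: acquiring skeleton information obtained by skeleton detection; executing first technique recognition that narrows down the techniques included in a gymnastics event to a subset of techniques based on the skeleton information; and executing second technique recognition that recognizes which technique of the subset was performed, in accordance with a specialized algorithm dedicated to recognizing the subset of techniques narrowed down by the first technique recognition.
- The technique recognition device according to claim 7, wherein the first technique recognition includes narrowing down to the subset of techniques based on a first feature amount whose calculation accuracy is equal to or greater than a threshold among the feature amounts relating to the techniques included in the gymnastics event, and the second technique recognition includes calculating, in accordance with the specialized algorithm, a second feature amount that distinguishes the techniques of the subset narrowed down by the first technique recognition, and recognizing which technique of the subset was performed based on the calculated second feature amount.
- The technique recognition device according to claim 8, wherein the second technique recognition includes calculating a grip as the second feature amount based on the skeleton information and on rotation information at the time of elbow flexion used in the skeleton detection.
- The technique recognition device according to claim 8, wherein the second technique recognition includes calculating a grip as the second feature amount based on the presence or absence of a specific motion in the technique obtained as the most recent technique recognition result among the technique recognition results for which the second technique recognition has been executed, and on the presence or absence of a re-grip after the specific motion.
- The technique recognition device according to claim 8, wherein the second technique recognition includes calculating the second feature amount by inputting the skeleton information to a machine learning model on which machine learning has been executed with skeleton information as the explanatory variable and, as the objective variable, a label of the second feature amount that distinguishes the techniques of the subset narrowed down by the first technique recognition.
- The technique recognition device according to claim 7, wherein the second technique recognition includes recognizing which technique of the subset was performed by inputting the skeleton information to a machine learning model on which machine learning has been executed with skeleton information as the explanatory variable and, as the objective variable, labels of the names of the techniques of the subset narrowed down by the first technique recognition.
- A gymnastics scoring support system comprising: a sensor device that acquires a depth image; and a technique recognition device having a skeleton detection unit that executes skeleton detection on the depth image, an acquisition unit that acquires skeleton information obtained by the skeleton detection, a first recognition unit that executes first technique recognition that narrows down the techniques included in a gymnastics event to a subset of techniques based on the skeleton information, a second recognition unit that executes second technique recognition that recognizes which technique of the subset was performed in accordance with a specialized algorithm dedicated to recognizing the subset of techniques narrowed down by the first technique recognition, and a scoring unit that scores the technique obtained by the second technique recognition.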
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2023510126A JPWO2022208859A1 (ja) | 2021-04-01 | 2021-04-01 | |
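CN202180094916.6A CN116963808A (zh) | 2021-04-01 | 2021-04-01 | Skill recognition method, skill recognition device, and gymnastics scoring support system |
PCT/JP2021/014248 WO2022208859A1 (ja) | 2021-04-01 | 2021-04-01 | Technique recognition method, technique recognition device, and gymnastics scoring support system |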
EP21935021.2A EP4316614A4 (en) | 2021-04-01 | 2021-04-01 | SKILL RECOGNITION METHOD, SKILL RECOGNITION APPARATUS AND GYMNASTICS SCORING SUPPORT SYSTEM |
US18/456,990 US20230405433A1 (en) | 2021-04-01 | 2023-08-28 | Element recognition method, element recognition device, and gymnastics scoring support system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2021/014248 WO2022208859A1 (ja) | 2021-04-01 | 2021-04-01 | Technique recognition method, technique recognition device, and gymnastics scoring support system |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/456,990 Continuation US20230405433A1 (en) | 2021-04-01 | 2023-08-28 | Element recognition method, element recognition device, and gymnastics scoring support system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022208859A1 (ja) | 2022-10-06 |
Family
ID=83458258
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/014248 WO2022208859A1 (ja) | 2021-04-01 | 2021-04-01 | Technique recognition method, technique recognition device, and gymnastics scoring support system |
Country Status (5)
Country | Link |
---|---|
US (1) | US20230405433A1 (ja) |
EP (1) | EP4316614A4 (ja) |
JP (1) | JPWO2022208859A1 (ja) |
CN (1) | CN116963808A (ja) |
WO (1) | WO2022208859A1 (ja) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
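WO2018070414A1 (ja) * | 2016-10-11 | 2018-04-19 | Fujitsu Limited | Motion recognition device, motion recognition program, and motion recognition method |
JP2018068516A (ja) * | 2016-10-26 | 2018-05-10 | Nagoya University | Exercise motion evaluation system |
WO2019116495A1 (ja) | 2017-12-14 | 2019-06-20 | Fujitsu Limited | Technique recognition program, technique recognition method, and technique recognition system |
JP2020038440A (ja) | 2018-09-03 | 2020-03-12 | The University of Tokyo | Motion recognition method and device |
JP2020089539A (ja) | 2018-12-05 | 2020-06-11 | Fujitsu Limited | Display method, display program, and information processing device |
CN111527520A (zh) * | 2017-12-27 | 2020-08-11 | Fujitsu Limited | Extraction program, extraction method, and information processing device |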
Non-Patent Citations (2)
Title |
---|
See also references of EP4316614A4 |
TOMIMORI, HIDEKI ET AL.: "A Judging Support System for Gymnastics Using 3D Sensing", JOURNAL OF THE ROBOTICS SOCIETY OF JAPAN, vol. 38, no. 4, 15 May 2020 (2020-05-15), pages 37 - 42, XP009538617, ISSN: 0289-1824, DOI: 10.7210/jrsj.38.339 * |
Also Published As
Publication number | Publication date |
---|---|
EP4316614A4 (en) | 2024-05-01 |
CN116963808A (zh) | 2023-10-27 |
US20230405433A1 (en) | 2023-12-21 |
EP4316614A1 (en) | 2024-02-07 |
JPWO2022208859A1 (ja) | 2022-10-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21935021 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2023510126 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 202180094916.6 Country of ref document: CN |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2021935021 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 2021935021 Country of ref document: EP Effective date: 20231102 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |