WO2022003989A1 - Action identification device, action identification method, and action identification program - Google Patents

Action identification device, action identification method, and action identification program

Info

Publication number
WO2022003989A1
Authority
WO
WIPO (PCT)
Prior art keywords
behavior
mutual
subject
subjects
individual
Prior art date
Application number
PCT/JP2020/029244
Other languages
French (fr)
Japanese (ja)
Inventor
浩平 望月
勝大 草野
誠司 奥村
Original Assignee
三菱電機株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 三菱電機株式会社 filed Critical 三菱電機株式会社
Priority to JP2021503612A (JP6887586B1)
Publication of WO2022003989A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion

Definitions

  • This disclosure relates to a technique for identifying human behavior based on skeletal information indicating the positions of joints in the human skeleton.
  • Patent Document 1 describes a human behavior recognition technique using skeletal information.
  • In the technique described in Patent Document 1, image information around the target person is acquired for each person shown in the video, the target person's skeleton information is extracted, and image information from which the target person's movement can be confirmed is generated from the skeleton information. Then, based on the generated image information and image information of a predetermined person attribute stored in advance, it is determined whether the attribute of the target person matches the predetermined person attribute.
  • The behavior identification device according to this disclosure includes: a skeleton information acquisition unit that acquires skeleton information indicating the positions of skeletal joints for each of a plurality of subjects appearing in video data; and a behavior identification unit that identifies, from the skeleton information about each of the plurality of subjects acquired by the skeleton information acquisition unit, the behavior of the plurality of subjects in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another.
  • In this disclosure, the behavior of a plurality of subjects as a group is identified in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. As a result, even behaviors with similar postures and movements are likely to be correctly discriminated, which improves the accuracy of behavior recognition.
  • FIG. 1 is a block diagram of the behavior identification device 10 according to Embodiment 1.
  • FIG. 2 is a flowchart showing the overall operation of the behavior identification device 10 according to Embodiment 1.
  • FIG. 3 is a flowchart of the behavior identification process according to Embodiment 1.
  • A block diagram of the behavior identification device 10 according to Modification 3.
  • FIG. 5 is a block diagram of the learning device 50 according to Embodiment 2.
  • FIG. 6 is a flowchart showing the operation by which the learning device 50 according to Embodiment 2 generates an individual model.
  • FIG. 7 is a flowchart showing the operation by which the learning device 50 according to Embodiment 2 generates a mutual model.
  • FIG. 9 is a block diagram of the behavior identification device 10 according to Embodiment 3.
  • the behavior identification device 10 is a computer.
  • the behavior identification device 10 includes hardware such as a processor 11, a memory 12, a storage 13, and a communication interface 14.
  • the processor 11 is connected to other hardware via a signal line and controls these other hardware.
  • the processor 11 is an IC (Integrated Circuit) that performs processing. Specific examples of the processor 11 are a CPU (Central Processing Unit), a DSP (Digital Signal Processor), and a GPU (Graphics Processing Unit).
  • the memory 12 is a storage device that temporarily stores data.
  • the memory 12 is a SRAM (Static Random Access Memory) or a DRAM (Dynamic Random Access Memory).
  • the storage 13 is a storage device for storing data.
  • the storage 13 is an HDD (Hard Disk Drive).
  • The storage 13 may also be a portable recording medium such as an SD (registered trademark, Secure Digital) memory card, CF (CompactFlash, registered trademark), NAND flash, flexible disk, optical disc, compact disc, Blu-ray (registered trademark) disc, or DVD (Digital Versatile Disc).
  • the communication interface 14 is an interface for communicating with an external device.
  • Specifically, the communication interface 14 is a port for Ethernet (registered trademark), USB (Universal Serial Bus), or HDMI (registered trademark, High-Definition Multimedia Interface).
  • the action specifying device 10 is connected to the camera 31 via the communication interface 14.
  • The camera 31 may be a general 2D (two-dimensional) camera, or it may be a 3D camera. With a 3D camera, depth information can also be obtained, so the positions of human joints can be identified more appropriately in the processing described later.
  • the action specifying device 10 includes a video acquisition unit 21, a skeleton information acquisition unit 22, a correlation determination unit 23, and an action identification unit 24 as functional components.
  • the action specifying unit 24 includes an individual specifying unit 25 and a mutual specifying unit 26.
  • the functions of each functional component of the action specifying device 10 are realized by software.
  • the storage 13 stores a program that realizes the functions of each functional component of the action specifying device 10. This program is read into the memory 12 by the processor 11 and executed by the processor 11. As a result, the functions of each functional component of the action specifying device 10 are realized.
  • In FIG. 1, only one processor 11 is shown. However, there may be a plurality of processors 11, and the plurality of processors 11 may cooperate to execute the programs that realize each function.
  • the operation of the action specifying device 10 according to the first embodiment will be described with reference to FIGS. 2 and 3.
  • the operation procedure of the action specifying device 10 according to the first embodiment corresponds to the action specifying method according to the first embodiment.
  • the program that realizes the operation of the action specifying device 10 according to the first embodiment corresponds to the action specifying program according to the first embodiment.
  • Step S11 Video acquisition process
  • the video acquisition unit 21 acquires video data acquired by the camera 31.
  • the video acquisition unit 21 writes the video data to the memory 12.
  • Step S12 Skeleton information acquisition process
  • The skeleton information acquisition unit 22 acquires skeleton information indicating the positions of skeletal joints for each of the one or more subjects appearing in the video data acquired in step S11. Specifically, the skeleton information acquisition unit 22 reads the video data from the memory 12 and treats each of the one or more subjects appearing in the video data as the target subject. The skeleton information acquisition unit 22 identifies the positions of the joints of the target subject's skeleton, assigns an index that identifies the subject, and generates the skeleton information. Each joint position is represented by a coordinate value or the like. The skeleton information acquisition unit 22 writes the skeleton information to the memory 12.
  • The skeleton information acquisition unit 22 may include in the skeleton information the joint positions identified from a single frame of the video data, or the joint positions identified from a plurality of frames of the video data.
  • As methods of extracting the joint positions of a person shown in the video data, there are, for example, a method using deep learning and a method of physically attaching markers to the subject's joints and identifying the joints by detecting the markers. One possible data layout for the resulting skeleton information is sketched below.
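  • For illustration only, the following is a minimal sketch of how skeleton information with a per-subject index and per-joint coordinates might be represented. The names SkeletonInfo, joints, and the joint keys are hypothetical and not taken from the patent; they simply show one plausible layout under the assumptions stated in the comments.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical 2D joint coordinates keyed by joint name (a 3D camera would add a z value).
Joint = Tuple[float, float]

@dataclass
class SkeletonInfo:
    subject_index: int        # index that identifies the subject appearing in the video
    frame: int                # frame number the joint positions were extracted from
    joints: Dict[str, Joint]  # joint name -> (x, y) coordinate value

# Example: skeleton information for one subject in one frame.
skeleton = SkeletonInfo(
    subject_index=0,
    frame=42,
    joints={
        "neck": (312.0, 140.5),
        "right_shoulder": (290.3, 165.0),
        "right_wrist": (355.2, 260.1),
    },
)
```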
  • Step S13 Number of people determination process
  • the correlation determination unit 23 determines whether or not the skeleton information of two or more persons has been acquired in step S12. That is, the correlation determination unit 23 determines whether or not two or more people are shown in the video data. When the skeleton information of two or more people is extracted, the correlation determination unit 23 determines that the skeleton information of two or more people has been acquired, and proceeds to the process in step S14. On the other hand, if not, the correlation determination unit 23 returns the process to step S11.
  • Step S14 Correlation determination process
  • the correlation determination unit 23 determines whether or not the plurality of subjects whose skeleton information has been acquired in step S12 are performing mutual actions, which are actions that affect each other.
  • Mutual behavior is behavior that influences each other among multiple people. Specific examples are actions such as a handshake in which two people reach out and hold each other, and a violent act in which one of the two hits the other.
  • Specifically, the correlation determination unit 23 targets each set of two or more pieces of skeleton information, and if the distance between the skeletons indicated by the skeleton information of the target set is smaller than a set threshold value, determines that the skeletons indicated by that set of skeleton information form a set performing mutual behavior.
  • Alternatively, the correlation determination unit 23 may target each set of two or more pieces of skeleton information and, if the amounts of change or the timings of change in the joint positions of the skeletons indicated by the skeleton information of the target set are correlated with each other, determine that those skeletons form a set performing mutual behavior. When there are sets determined to be performing mutual behavior, the correlation determination unit 23 writes to the memory 12, for each such set, the indices of the skeleton information included in that set. Then, the correlation determination unit 23 advances the process to step S15. On the other hand, if there is no set determined to be performing mutual behavior, the correlation determination unit 23 returns the process to step S11. A simple distance-based check is sketched below.
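  • As a rough illustration of the distance-based criterion described above, the sketch below pairs skeletons whose reference joints are closer than a threshold. The threshold value, the use of the neck joint as the reference point, and the function name are assumptions for illustration, not values from the patent.

```python
import itertools
from math import dist

def find_interaction_pairs(skeletons, threshold=150.0):
    """Return index pairs of skeletons that may be performing mutual behavior,
    judged by the distance between a reference joint (here: the neck)."""
    pairs = []
    for (i, a), (j, b) in itertools.combinations(enumerate(skeletons), 2):
        if dist(a["neck"], b["neck"]) < threshold:
            pairs.append((i, j))
    return pairs

# Usage: skeletons is a list of dicts mapping joint names to (x, y) coordinates.
skeletons = [
    {"neck": (310.0, 140.0)},
    {"neck": (395.0, 150.0)},
    {"neck": (900.0, 130.0)},
]
print(find_interaction_pairs(skeletons))  # [(0, 1)] with the assumed threshold
```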
  • Step S15 Action identification process
  • the action specifying unit 24 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • The behavior identification unit 24 identifies, from the skeleton information acquired in step S12 about each of the plurality of subjects included in the target set, the behavior of the plurality of subjects in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another.
  • The behavior identification process (step S15 in FIG. 2) according to Embodiment 1 will be described with reference to FIG. 3. (Step S21: Individual identification process)
  • the individual identification unit 25 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • the individual identification unit 25 specifies the behavior of the target subject as individual behavior from the skeleton information of the target subject for each of the plurality of subjects included in the target group.
  • Specifically, the individual identification unit 25 identifies the individual behavior by using an individual model that takes a person's skeleton information as input and outputs an individual label indicating that person's behavior. The individual model is assumed to be a trained model generated using a neural network or the like and stored in the storage 13 in advance.
  • the individual identification unit 25 acquires an individual label indicating the individual behavior of the target subject by inputting the skeleton information of the target subject into the individual model.
  • the individual identification unit 25 writes the individual label in the memory 12.
  • The individual behavior indicated by an individual label is behavior as a single person. Individual behaviors are, for example, actions such as "stretching the arm forward", "falling down", and "recoiling".
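  • The patent does not give a concrete interface for the individual model, so the following is only a sketch assuming a scikit-learn-style classifier that maps a flattened joint-coordinate vector to an individual label. The label strings and helper names are illustrative assumptions.

```python
import numpy as np

# Assumed label set for illustration only.
INDIVIDUAL_LABELS = ["stretching_arm_forward", "falling_down", "recoiling"]

def flatten_joints(joints, joint_order):
    """Flatten per-joint (x, y) coordinates into one feature vector."""
    return np.array([c for name in joint_order for c in joints[name]])

def identify_individual_behavior(individual_model, joints, joint_order):
    """Return the individual label predicted by a trained classifier
    (any object exposing a scikit-learn-style predict method)."""
    x = flatten_joints(joints, joint_order).reshape(1, -1)
    return individual_model.predict(x)[0]
```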
  • Step S22 Mutual identification process
  • the mutual identification unit 26 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • The mutual identification unit 26 identifies, from the individual behavior of each of the plurality of subjects included in the target set identified in step S21, the behavior of the plurality of subjects included in the target set as a whole, in consideration of mutual behavior.
  • Considering mutual behavior means considering the behavior of another subject when identifying the behavior of one subject. In other words, considering mutual behavior means specifying the behavior of a certain subject based on the behavior of another subject.
  • Specifically, the mutual identification unit 26 identifies the subjects' behavior by using a mutual model that takes as input a set of individual labels indicating the individual behavior of each of a plurality of people and outputs a mutual label indicating the behavior of the plurality of people in consideration of mutual behavior. The mutual model is assumed to be a trained model generated using a neural network or the like and stored in the storage 13 in advance. That is, the mutual identification unit 26 obtains a mutual label indicating the behavior of the plurality of subjects included in the target set as a whole by inputting to the mutual model the set of individual labels identified in step S21 for each of the plurality of subjects included in the target set. The mutual identification unit 26 writes the mutual label to the memory 12.
  • The behavior indicated by a mutual label is behavior as multiple people. The behavior indicated by a mutual label is, for example, an action such as "shaking hands" or "one person hitting and the other person being hit".
  • For example, when the individual behavior of both subjects is "stretching the arm forward", the behavior indicated by the mutual label is "handshake". On the other hand, when the individual behavior of one subject is "stretching the arm forward" and the individual behavior of the other subject is "recoiling", the behavior indicated by the mutual label is "violence". Even when three or more subjects are included in the target set, the behavior can be identified in the same way from the combination of the individual behaviors.
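  • To make this step concrete, here is a minimal sketch in which the mutual model consumes an order-insensitive encoding of the set of individual labels and outputs a mutual label such as "handshake" or "violence". The encoding scheme and label strings are assumptions for illustration, not the patent's specification.

```python
import numpy as np

INDIVIDUAL_LABELS = ["stretching_arm_forward", "falling_down", "recoiling"]  # assumed label set

def encode_label_set(labels):
    """Order-insensitive encoding: count of each individual label in the set."""
    vec = np.zeros(len(INDIVIDUAL_LABELS))
    for lab in labels:
        vec[INDIVIDUAL_LABELS.index(lab)] += 1
    return vec

def identify_mutual_behavior(mutual_model, individual_labels):
    """Feed the encoded label set to a trained classifier and return the mutual label."""
    x = encode_label_set(individual_labels).reshape(1, -1)
    return mutual_model.predict(x)[0]
```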
  • As described above, the behavior identification device 10 according to Embodiment 1 identifies the behavior of a plurality of subjects in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. As a result, even behaviors with similar postures and movements are likely to be correctly discriminated, which improves the accuracy of behavior recognition.
  • Modification 1: In Embodiment 1, the behavior is identified by using an individual model and a mutual model, which are trained models generated using a neural network or the like. However, a rule in which inputs and outputs are associated may be used instead of each model.
  • the rule used instead of the individual model is an individual rule in which the human skeleton information and the individual label indicating the human behavior are associated with each other. That is, the individual rule is a rule in which an individual label is obtained as an output when human skeleton information is given as an input.
  • In this case, the individual identification unit 25 refers to the individual rule and acquires the individual label corresponding to the skeleton information of the target subject as information indicating the individual behavior of the target subject. At this time, the individual identification unit 25 acquires the individual label associated with the stored skeleton information that is most similar to the skeleton information of the target subject.
  • the rule used instead of the mutual model is a mutual rule in which a set of individual labels indicating the individual behavior of each of a plurality of people and a mutual label indicating the behavior of a plurality of people are associated with each other. That is, a mutual rule is a rule in which, when a set of individual labels is given as an input, a mutual label indicating an action as a plurality of people is obtained as an output.
  • Likewise, the mutual identification unit 26 refers to the mutual rule and acquires the mutual label corresponding to the set of individual labels of the plurality of subjects as information indicating the behavior of the plurality of subjects as a whole.
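  • A rule can be as simple as a lookup table. The sketch below shows one possible reading of Modification 1: a nearest-neighbour individual rule over stored skeleton vectors, and a dictionary-based mutual rule over sets of individual labels. The stored entries and label strings are illustrative examples, not data from the patent.

```python
import numpy as np

# Individual rule: stored skeleton vectors paired with individual labels (illustrative entries).
individual_rule = [
    (np.array([0.0, 0.0, 1.0, 0.0, 2.0, 0.0]), "stretching_arm_forward"),
    (np.array([0.0, 0.0, 0.5, 0.8, 0.9, 1.6]), "falling_down"),
]

def apply_individual_rule(skeleton_vector):
    """Return the label of the stored skeleton most similar to the input (smallest distance)."""
    _, label = min(individual_rule, key=lambda entry: np.linalg.norm(entry[0] - skeleton_vector))
    return label

# Mutual rule: a set of individual labels mapped to a mutual label (illustrative entries).
mutual_rule = {
    frozenset(["stretching_arm_forward"]): "handshake",  # both subjects stretch an arm forward
    frozenset(["stretching_arm_forward", "recoiling"]): "violence",
}

def apply_mutual_rule(individual_labels):
    return mutual_rule.get(frozenset(individual_labels), "unknown")

print(apply_mutual_rule(["stretching_arm_forward", "stretching_arm_forward"]))  # handshake
print(apply_mutual_rule(["stretching_arm_forward", "recoiling"]))               # violence
```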
  • Modification 2: In Embodiment 1, the behavior of the plurality of subjects as a whole is identified. However, the behavior identification device 10 may further identify which behavior each subject is performing within the overall behavior.
  • In this case, the mutual identification unit 26 of the behavior identification device 10 targets each subject and identifies, from the overall behavior and the individual label of the target subject, the behavior of the target subject within the overall behavior.
  • For example, suppose the overall behavior is "violence", the individual behavior of one subject is "stretching the arm forward", and the individual behavior of the other subject is "recoiling". In this case, the behavior of the subject whose individual behavior is "stretching the arm forward" is identified as "hitting the opponent", and the behavior of the subject whose individual behavior is "recoiling" is identified as "being hit by the opponent".
  • Modification 3: In Embodiment 1, the individual model and the mutual model are stored in the storage 13.
  • the individual model and the mutual model may be stored in an external storage device of the behavior identification device 10.
  • the behavior identification device 10 may access the individual model and the mutual model via the communication interface 14.
  • Modification 4: In Embodiment 1, each functional component is realized by software. However, each functional component may be realized by hardware. The differences of Modification 4 from Embodiment 1 will be described below.
  • the action specifying device 10 includes an electronic circuit 15 in place of the processor 11, the memory 12, and the storage 13.
  • the electronic circuit 15 is a dedicated circuit that realizes the functions of each functional component, the memory 12, and the storage 13.
  • A single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, a logic IC, a GA (Gate Array), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array) is assumed as the electronic circuit 15.
  • Each functional component may be realized by one electronic circuit 15, or each functional component may be distributed and realized by a plurality of electronic circuits 15.
  • Modification 5: Some functional components may be realized by hardware, and the other functional components may be realized by software.
  • the processor 11, the memory 12, the storage 13, and the electronic circuit 15 are called processing circuits. That is, the function of each functional component is realized by the processing circuit.
  • Embodiment 2: In Embodiment 2, the process of generating the individual model and the mutual model is described.
  • the configuration of the learning device 50 according to the second embodiment will be described with reference to FIG.
  • the learning device 50 is a computer.
  • the learning device 50 includes hardware such as a processor 51, a memory 52, a storage 53, and a communication interface 54.
  • the processor 51 is connected to other hardware via a signal line and controls these other hardware.
  • the processor 51 is an IC that performs processing.
  • the memory 52 is a storage device that temporarily stores data.
  • the storage 53 is a storage device for storing data, like the storage 13.
  • the storage 53 may be a portable recording medium like the storage 13.
  • the communication interface 54 is an interface for communicating with an external device.
  • the learning device 50 is connected to the action specifying device 10 via the communication interface 54.
  • the learning device 50 includes a learning data acquisition unit 61 and a model generation unit 62 as functional components.
  • the functions of each functional component of the learning device 50 are realized by software.
  • The storage 53 stores programs that realize the functions of the functional components of the learning device 50. These programs are read into the memory 52 by the processor 51 and executed by the processor 51. As a result, the functions of the functional components of the learning device 50 are realized.
  • In FIG. 5, only one processor 51 is shown. However, there may be a plurality of processors 51, and the plurality of processors 51 may cooperate to execute the programs that realize each function.
  • the operation of the learning device 50 according to the second embodiment will be described with reference to FIGS. 6 and 7.
  • the operation procedure of the learning device 50 according to the second embodiment corresponds to the learning method according to the second embodiment.
  • the program that realizes the operation of the learning device 50 according to the second embodiment corresponds to the learning program according to the second embodiment.
  • Step S31 Learning data acquisition process
  • The learning data acquisition unit 61 acquires learning data in which skeleton information indicating the joint positions of a person's skeleton is associated with that person's behavior.
  • learning data is generated by identifying skeletal information from video data obtained by imaging a person who actually performed a specified action. That is, the extracted skeletal information and the specified action are associated with each other to obtain learning data.
  • the skeleton information may be vector data including only the joint positions specified from one frame of the video data, or may be matrix data including the joint positions specified from a plurality of frames.
  • Step S32 Model generation process
  • the model generation unit 62 receives the learning data acquired in step S31 as an input, performs learning, and generates an individual model.
  • the model generation unit 62 writes the individual model in the storage 13 of the action specifying device 10.
  • Specifically, the model generation unit 62 inputs the learning data and causes a neural network to learn the relationship between the joint positions of the skeleton and the behavior. For example, the model generation unit 62 learns that skeleton information in which the shoulder, elbow, and wrist are aligned and their vertical positions are roughly equal represents the behavior "stretching the arm forward".
  • the configuration of the neural network used may be a well-known one such as DNN (deep neural network), CNN (convolutional neural network), and RNN (recurrent neural network).
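  • As one possible concrete form of this learning step, the sketch below trains a small feed-forward neural network (scikit-learn's MLPClassifier) to map flattened joint coordinates to individual labels. The array shapes, label names, placeholder data, and choice of library are assumptions for illustration; the patent does not prescribe a specific framework or architecture.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Placeholder learning data: each row is a flattened joint-coordinate vector,
# each label is the behavior performed when the skeleton was recorded.
X = np.random.rand(200, 34)  # e.g. 17 joints x (x, y) coordinates
y = np.random.choice(["stretching_arm_forward", "falling_down", "recoiling"], size=200)

# A simple feed-forward network standing in for the "individual model".
individual_model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
individual_model.fit(X, y)  # learn the relationship: joint positions -> individual label

# After training, the model could be stored (e.g. with joblib) and used as in step S21.
```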
  • Step S41 Learning data acquisition process
  • the learning data acquisition unit 61 acquires learning data in which a set of a plurality of individual labels is associated with the behavior of each of the plurality of people in consideration of mutual behavior.
  • the learning data is generated by associating an individual label indicating the individual behavior of each of the plurality of people when the designated mutual action is actually performed with the behavior of the plurality of people in the mutual action.
  • Step S42 Model generation process
  • the model generation unit 62 receives the learning data acquired in step S41 as an input, performs learning, and generates a mutual model.
  • the model generation unit 62 writes the mutual model in the storage 13 of the behavior identification device 10.
  • Specifically, the model generation unit 62 inputs the learning data and causes a neural network to learn the relationship between a set of individual labels and the behavior of the plurality of people in consideration of mutual behavior. For example, for a set of two subjects, the model generation unit 62 learns that when the individual behavior of both subjects is "stretching the arm forward", the behavior indicated by the mutual label is "handshake".
  • the configuration of the neural network used may be a well-known one such as DNN (deep neural network), CNN (convolutional neural network), and RNN (recurrent neural network).
  • As described above, the learning device 50 according to Embodiment 2 generates, based on the learning data, the individual model and the mutual model used by the behavior identification device 10. Thereby, by providing appropriate learning data, the recognition accuracy of the individual model and the mutual model used by the behavior identification device 10 can be improved.
  • Modification 6: As described in Modification 1, the behavior identification device 10 may use an individual rule instead of the individual model, and a mutual rule instead of the mutual model.
  • When an individual rule is used instead of the individual model, the model generation unit 62 generates an individual rule instead of the individual model in step S32 of FIG. 6. Specifically, the model generation unit 62 generates, as the individual rule, a database in which the skeleton information indicating the joint positions of the person's skeleton shown in each piece of learning data acquired in step S31 is associated with the individual label indicating that person's behavior.
  • When a mutual rule is used instead of the mutual model, the model generation unit 62 generates a mutual rule instead of the mutual model in step S42 of FIG. 7. Specifically, the model generation unit 62 generates, as the mutual rule, a database in which the set of individual labels shown in each piece of learning data acquired in step S41 is associated with the behavior of the plurality of people in consideration of mutual behavior.
  • Modification 7: In Embodiment 2, each functional component is realized by software. However, each functional component may be realized by hardware. The differences of Modification 7 from Embodiment 2 will be described below.
  • the learning device 50 includes an electronic circuit 55 instead of the processor 51, the memory 52, and the storage 53.
  • the electronic circuit 55 is a dedicated circuit that realizes the functions of each functional component, the memory 52, and the storage 53.
  • A single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, a logic IC, a GA (Gate Array), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array) is assumed as the electronic circuit 55.
  • Each functional component may be realized by one electronic circuit 55, or each functional component may be distributed and realized by a plurality of electronic circuits 55.
  • Modification 8: Some functional components may be realized by hardware, and the other functional components may be realized by software.
  • the processor 51, the memory 52, the storage 53, and the electronic circuit 55 are called processing circuits. That is, the function of each functional component is realized by the processing circuit.
  • Embodiment 3: Embodiment 3 differs from Embodiment 1 in that the behavior of a plurality of subjects as a whole is identified, in consideration of mutual behavior, from a feature quantity calculated from the plurality of pieces of skeleton information. In Embodiment 3, these differences are described and the common points are omitted.
  • the configuration of the behavior specifying device 10 according to the third embodiment will be described with reference to FIG. 9.
  • the action specifying device 10 is different from the action specifying device 10 shown in FIG. 1 in that the action specifying unit 24 includes a feature amount calculation unit 27 instead of the individual specifying unit 25.
  • the function of the feature amount calculation unit 27 is realized by software or hardware like other functions.
  • The behavior identification process (step S15 in FIG. 2) according to Embodiment 3 will be described with reference to FIG. 10. (Step S51: Feature quantity calculation process)
  • the feature amount calculation unit 27 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • the feature amount calculation unit 27 calculates the feature amount based on the skeletal information of each of the plurality of subjects included in the target set. Specifically, the feature amount calculation unit 27 calculates the feature amount by integrating the skeletal information about each of the plurality of subjects included in the target set. Alternatively, the feature amount calculation unit 27 may extract the feature amount from the skeletal information about each of the plurality of subjects included in the target set.
  • The feature quantity calculation is performed so that information about the positional relationship of the joints between the plurality of skeletons is retained.
  • For example, suppose each person's skeleton information contains m coordinate values indicating the joint positions of the skeleton, so that one skeleton is represented by an m-dimensional vector.
  • In this case, the feature quantity for n subjects is an (m × n)-dimensional vector in which the n m-dimensional vectors are concatenated, or a matrix with m rows and n columns.
  • Alternatively, the feature quantity may be a vector or matrix whose elements are the temporal changes in the distances between arbitrary joints of different skeletons.
  • A distance between arbitrary joints of different skeletons is, for example, the distance between the neck of skeleton A and the wrist of skeleton B. Both forms are sketched below.
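  • The following sketch illustrates the two feature forms mentioned above: concatenating the m-dimensional skeleton vectors of n subjects, and collecting distances between joints of different skeletons. The joint names, array shapes, and example values are assumptions for illustration.

```python
import numpy as np

def concatenate_skeletons(skeleton_vectors):
    """Concatenate n skeleton vectors of length m into one (m * n)-dimensional feature."""
    return np.concatenate(skeleton_vectors)

def cross_skeleton_distances(skeleton_a, skeleton_b, pairs):
    """Distances between selected joints of two different skeletons,
    e.g. the neck of skeleton A and the wrist of skeleton B."""
    return np.array([
        np.linalg.norm(np.array(skeleton_a[ja]) - np.array(skeleton_b[jb]))
        for ja, jb in pairs
    ])

a = {"neck": (310.0, 140.0), "right_wrist": (355.0, 260.0)}
b = {"neck": (395.0, 150.0), "right_wrist": (370.0, 255.0)}
print(cross_skeleton_distances(a, b, [("neck", "right_wrist"), ("right_wrist", "neck")]))
```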
  • Step S52 Mutual identification process
  • the mutual identification unit 26 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • The mutual identification unit 26 identifies the behavior of the plurality of subjects as a whole, in consideration of mutual behavior, by using as input the feature quantity calculated in step S51 from the skeleton information of the plurality of subjects included in the target set.
  • Specifically, the mutual identification unit 26 identifies the subjects' behavior by using a mutual model that takes as input the feature quantity of the skeleton information of a plurality of people and outputs a mutual label indicating the behavior of the plurality of people in consideration of mutual behavior.
  • the mutual model is a trained model generated by using a neural network or the like and is stored in the storage 13 in advance. That is, by inputting the feature amount calculated in step S51 into the mutual model, the mutual identification unit 26 acquires a mutual label indicating the behavior of the plurality of subjects included in the target set as a whole.
  • the mutual identification unit 26 writes the mutual label in the memory 12.
  • As described above, the behavior identification device 10 according to Embodiment 3, like the behavior identification device 10 according to Embodiment 1, identifies the behavior of a plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. As a result, even behaviors with similar postures and movements are likely to be correctly discriminated, which improves the accuracy of behavior recognition.
  • In Embodiment 3, the behavior is identified by using a mutual model, which is a trained model generated using a neural network or the like. However, a mutual rule may be used instead of the mutual model.
  • the mutual rule is a rule in which the feature amount of the skeletal information of a plurality of people and the mutual label indicating the behavior as a plurality of people are associated with each other.
  • In this case, in step S52 of FIG. 10, the mutual identification unit 26 refers to the mutual rule and acquires the mutual label corresponding to the feature quantity as information indicating the behavior of the plurality of subjects as a whole.
  • In Embodiment 3, the behavior of the plurality of subjects as a whole is identified.
  • However, as in Modification 2, the behavior identification device 10 may identify which behavior each subject is performing within the overall behavior.
  • In this case, the mutual identification unit 26 of the behavior identification device 10 targets each subject and identifies, from the overall behavior and the skeleton information of the target subject, the behavior of the target subject within the overall behavior. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the target subject's skeleton information, and then identifies the behavior of the target subject within the overall behavior from the overall behavior and that individual behavior.
  • Embodiment 4 is different from the second embodiment in that a mutual model according to the third embodiment is generated. In the fourth embodiment, these different points will be described, and the same points will be omitted. Since the individual model is not used in the third embodiment, the individual model is not generated in the fourth embodiment.
  • the operation of the learning device 50 according to the fourth embodiment will be described with reference to FIG. 7.
  • the operation procedure of the learning device 50 according to the fourth embodiment corresponds to the learning method according to the fourth embodiment.
  • the program that realizes the operation of the learning device 50 according to the fourth embodiment corresponds to the learning program according to the fourth embodiment.
  • Step S41 Learning data acquisition process
  • the learning data acquisition unit 61 acquires learning data in which the feature amounts of the skeletal information of a plurality of people and the behaviors of the plurality of people are associated with each other.
  • the learning data is generated by calculating the feature amount from the video data obtained by imaging a plurality of people who actually performed the specified mutual action. That is, the calculated feature amount and the behavior of each person in the designated mutual behavior are associated with each other to obtain learning data.
  • Step S42 Model generation process
  • The model generation unit 62 receives the learning data acquired in step S41 as input, performs learning, and generates a mutual model.
  • the model generation unit 62 writes the mutual model in the storage 13 of the behavior identification device 10.
  • As described above, the learning device 50 according to Embodiment 4 generates, based on the learning data, the mutual model used by the behavior identification device 10. Thereby, by providing appropriate learning data, the recognition accuracy of the mutual model used by the behavior identification device 10 can be improved.
  • the behavior specifying device 10 may use a mutual rule instead of the mutual model.
  • When a mutual rule is used instead of the mutual model, the model generation unit 62 generates a mutual rule instead of the mutual model in step S42 of FIG. 7. Specifically, the model generation unit 62 generates, as the mutual rule, a database in which the feature quantity shown in each piece of learning data acquired in step S41 is associated with the behavior of the plurality of people in consideration of mutual behavior.
  • Embodiment 5: Embodiment 5 differs from Embodiment 3 in the method of calculating the feature quantity from the skeleton information. In Embodiment 5, these differences are described and the common points are omitted.
  • In Embodiment 5, the skeleton information from at least one time step earlier is required when calculating the feature quantity from the skeleton information. Therefore, in Embodiment 5, after the skeleton information is acquired in step S12 of FIG. 2, the skeleton information is stored in a skeleton information database realized by the storage 13.
  • Step S51 Feature calculation process
  • the feature amount calculation unit 27 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • the feature amount calculation unit 27 calculates the feature amount based on the skeletal information of each of the plurality of subjects included in the target set.
  • the feature amount calculation unit 27 writes the feature amount in the feature amount database realized by the storage 13. Specifically, the feature amount calculation unit 27 calculates the feature amount from the skeletal information of each of the plurality of subjects included in the target set. Then, the feature amount calculation unit 27 adds the current time t as an index to the calculated feature amount and writes it in the feature amount database.
  • the calculated feature amount and the calculation method thereof will be described later.
  • Step S52 Mutual identification process
  • the mutual identification unit 26 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • The mutual identification unit 26 identifies the behavior of the plurality of subjects as a whole, in consideration of mutual behavior, by using as input the feature quantity calculated in step S51 from the skeleton information of the plurality of subjects included in the target set.
  • the mutual identification unit 26 acquires the feature quantities of a plurality of subjects included in the target set from the feature quantity database.
  • The mutual identification unit 26 identifies the subjects' behavior by using a mutual model that takes as input the feature quantities of a plurality of people and outputs a mutual label indicating the behavior of the plurality of people in consideration of mutual behavior.
  • the mutual model is a trained model generated by using a neural network or the like and is stored in the storage 13 in advance. That is, by inputting the feature amount calculated in step S51 into the mutual model, the mutual identification unit 26 acquires a mutual label indicating the behavior of the plurality of subjects included in the target set as a whole. The mutual identification unit 26 writes the mutual label in the memory 12.
  • the feature amount acquired by the mutual identification unit 26 from the feature amount database may not be one calculated at a certain time, but may be a plurality of consecutive feature amounts in a time series.
  • In this case, the mutual identification unit 26 identifies the behavior of the plurality of subjects included in the target set based on the transition of the feature quantities and acquires the corresponding mutual label.
  • the mutual model is a model that inputs the transition of the feature quantity of a plurality of people and outputs a mutual label indicating the behavior as a plurality of people in consideration of the mutual behavior.
  • The feature quantity calculation process (step S51 in FIG. 10) according to Embodiment 5 will be described. (Step S61: Skeleton information acquisition process)
  • the feature amount calculation unit 27 sets each group determined to be a group performing mutual action as a target group.
  • The feature quantity calculation unit 27 acquires, from the skeleton information database, the skeleton information at the current time and the skeleton information one time step earlier for each of the plurality of subjects included in the target set.
  • Step S62 Speed calculation process
  • The feature quantity calculation unit 27 calculates the feature quantity by using, for each of the plurality of subjects, the skeleton information at the current time and the skeleton information one time step earlier acquired in step S61. Specifically, the feature quantity calculation unit 27 calculates a vector or matrix whose elements are the movement distances of each joint of the subject's skeleton between the two consecutive frames acquired in step S61. Since each movement distance calculated in this way is the distance the joint moves over the time interval between the two frames, it can be regarded as the speed of that joint. Then, the feature quantity calculation unit 27 takes the sum or average of the joint speeds to obtain a scalar, treats this scalar as the speed of the subject's entire skeleton, and uses this speed as the feature quantity.
  • Alternatively, the feature quantity calculation unit 27 may acquire skeleton information for a time width N, from the current time t back to the past time t-N.
  • In this case, the feature quantity calculation unit 27 generates, for each pair of consecutive times, a vector or matrix whose elements are the movement distances of each joint of the skeleton.
  • The feature quantity calculation unit 27 then sums the movement distances of each joint in the time direction and divides by the time width N, thereby calculating the average movement distance from the current time t to the past time t-N as the speed of each joint. That is, for each joint, the feature quantity calculation unit 27 totals the movement distances between consecutive times calculated for that joint, divides the total by the time width N, and obtains the average movement distance of the joint.
  • The feature quantity calculation unit 27 treats this average movement distance as the speed of the joint. Then, the feature quantity calculation unit 27 takes the sum or average of the joint speeds to obtain a scalar, treats this scalar as the speed of the subject's entire skeleton, and uses this speed as the feature quantity. A minimal sketch of this calculation follows.
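  • A minimal sketch of this speed feature, assuming the skeleton information is stored as an array of joint coordinates per time step. The array shapes and the averaging choices follow the description above, but the function name and example data are illustrative assumptions.

```python
import numpy as np

def skeleton_speed(positions):
    """positions: array of shape (T, J, 2) holding the (x, y) positions of J joints over T times.
    Returns a scalar: the average per-joint speed over the time width T - 1."""
    step_distances = np.linalg.norm(np.diff(positions, axis=0), axis=2)  # (T-1, J) per-step distances
    joint_speeds = step_distances.sum(axis=0) / step_distances.shape[0]  # average distance per joint
    return joint_speeds.mean()                                           # sum or average over joints

# Two subjects over N + 1 = 5 time steps with 17 joints each; one speed per subject.
subject_positions = [np.random.rand(5, 17, 2), np.random.rand(5, 17, 2)]
feature = np.array([skeleton_speed(p) for p in subject_positions])
print(feature)
```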
  • In the above description, the feature quantity is a scalar.
  • However, the feature quantity calculation unit 27 may use, as the feature quantity, vector data whose elements are the speeds of the individual joints, without taking the sum or average over all joints.
  • The feature quantity calculation unit 27 may also calculate the feature quantity from any number of the extracted joints of the subject's skeleton. Alternatively, the feature quantity calculation unit 27 may add or average the feature quantities calculated for an arbitrary number of joints, thereby obtaining a smaller number of feature quantities than the number of joints from which they were extracted. Further, the feature quantity calculation unit 27 may add or average the feature quantities calculated for the individual subjects to obtain a single feature quantity.
  • When a joint position cannot be acquired, the feature quantity calculation unit 27 may complement the missing joint position, or the feature quantity related to the missing joint, based on past feature quantities stored in the feature quantity database or based on the joints whose positions could be acquired.
  • For example, it is conceivable to use, as the feature quantity at the time when the joint position could not be acquired, the feature quantity from one time step earlier, or to calculate it by linearly interpolating the displacement of the feature quantity over the past several time steps.
  • Alternatively, the feature quantity calculation unit 27 may calculate the average speed per joint from the speeds of all the joints whose positions could be acquired, and use this average as the speed of a joint whose position could not be acquired. The average speed per joint may also be calculated only from those joints around the missing joint whose positions could be acquired.
  • Further, the feature quantity calculation unit 27 may complement the position of a joint that could not be acquired with the position of the joint that forms a left-right pair with it, or with the position of a connected joint; for example, the position of the right knee that could not be acquired may be complemented with the position of the left knee. Two of these strategies are sketched below.
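  • The following sketch shows two of the complementing strategies mentioned above: reusing the value from one time step earlier, and substituting the position of the left-right paired joint. The joint names and the pairing table are assumptions for illustration.

```python
def complement_joint(joints_now, joints_prev, joint_name):
    """Return a position for joint_name, falling back to the previous time step
    or to the left/right paired joint when the position could not be acquired."""
    paired = {"right_knee": "left_knee", "left_knee": "right_knee",
              "right_wrist": "left_wrist", "left_wrist": "right_wrist"}  # assumed pairing table
    if joints_now.get(joint_name) is not None:
        return joints_now[joint_name]
    if joints_prev.get(joint_name) is not None:   # value from one time step earlier
        return joints_prev[joint_name]
    partner = paired.get(joint_name)
    if partner is not None and joints_now.get(partner) is not None:
        return joints_now[partner]                # left/right paired joint
    return None                                   # leave the gap if nothing is available

now = {"right_knee": None, "left_knee": (210.0, 420.0)}
prev = {"right_knee": None, "left_knee": (208.0, 418.0)}
print(complement_joint(now, prev, "right_knee"))  # (210.0, 420.0), taken from the left knee
```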
  • As described above, the behavior identification device 10 according to Embodiment 5, like the behavior identification device 10 according to Embodiment 1, identifies the behavior of a plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. As a result, even behaviors with similar postures and movements are likely to be correctly discriminated, which improves the accuracy of behavior recognition.
  • Further, the behavior identification device 10 according to Embodiment 5 uses, as the feature quantity, the speed calculated from skeleton information acquired over two or more frames. For example, if the skeleton speed calculated from time-series skeleton information over a relatively long time width, such as several seconds, is used as the feature quantity, the behavior is likely to be correctly determined even if the subject's skeletal joints are erroneously extracted due to the person's orientation or partial concealment of the body by occlusion.
  • the behavior specifying device 10 may use a mutual rule instead of the mutual model.
  • In Embodiment 5, the behavior of the plurality of subjects as a whole is identified.
  • However, as in Modification 2, the behavior identification device 10 may identify which behavior each subject is performing within the overall behavior.
  • In this case, the mutual identification unit 26 of the behavior identification device 10 targets each subject and identifies, from the overall behavior and the skeleton information of the target subject, the behavior of the target subject within the overall behavior. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the target subject's skeleton information, and then identifies the behavior of the target subject within the overall behavior from the overall behavior and that individual behavior.
  • Embodiment 6: Embodiment 6 differs from Embodiments 3 and 5 in the method of calculating the feature quantity from the skeleton information. In Embodiment 6, these differences are described and the common points are omitted. Embodiment 6 is explained mainly in terms of its differences from Embodiment 5.
  • Step S71 Skeleton information acquisition process
  • the feature amount calculation unit 27 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • The feature quantity calculation unit 27 acquires, from the skeleton information database, the skeleton information from the current time t back to N time steps earlier for each of the plurality of subjects included in the target set.
  • the feature amount calculation unit 27 sets data in which the acquired skeleton information is arranged in time series as time series data.
  • The time-series data is data in which the skeleton information for a target period of a certain length, for example several seconds, is arranged in chronological order. The data should contain skeleton information for at least two times arranged in chronological order, and desirably contains skeleton information for three or more times.
  • Step S72 Travel distance calculation process
  • the feature amount calculation unit 27 calculates the movement distance of each joint of the skeleton of the subject subject between the skeleton information of two consecutive times in the time series. Specifically, the feature amount calculation unit 27 calculates the movement distance of the target joint by calculating the difference in the position of the target joint between the skeletal information at two times for each joint.
  • the feature amount calculation unit 27 generates a vector or a matrix having the movement distance of each joint as an element. In the following, it will be described assuming that a vector having the movement distance of each joint as an element is generated.
  • Step S73 Momentum calculation process
  • The feature quantity calculation unit 27 sums, in the time direction, the vectors generated in step S72 whose elements are the movement distances of each joint. That is, the feature quantity calculation unit 27 totals, for each joint, the movement distances between consecutive times calculated for that joint.
  • The value calculated in this way is the total movement distance of each joint over the time width N from the current time t to the past time t-N. Therefore, this value can be regarded as the momentum of each joint in the time width N.
  • The feature quantity calculation unit 27 obtains a scalar by summing or averaging the momentums of all the joints, and regards this scalar as the momentum of the subject's entire skeleton in the time width N. Then, the feature quantity calculation unit 27 uses this momentum as the feature quantity, as sketched below.
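  • A minimal sketch of the momentum feature under the same array layout assumed in Embodiment 5: the per-step movement distances are summed over the time width and then aggregated over joints. The function name, shapes, and synthetic example are illustrative assumptions.

```python
import numpy as np

def skeleton_momentum(positions):
    """positions: array of shape (T, J, 2) with joint positions from time t-N to t (T = N + 1).
    Returns a scalar regarded as the momentum of the whole skeleton over the time width N."""
    step_distances = np.linalg.norm(np.diff(positions, axis=0), axis=2)  # (T-1, J)
    joint_momentum = step_distances.sum(axis=0)   # total movement distance of each joint
    return joint_momentum.mean()                  # sum or average over all joints

positions = np.cumsum(np.random.rand(6, 17, 2), axis=0)  # a synthetic drifting skeleton
print(skeleton_momentum(positions))
```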
  • the feature amount calculation unit 27 may use vector data having the momentum of each joint as an element as the feature amount without taking the total or average value of the momentums of all the joints.
  • The feature quantity calculation unit 27 may calculate the feature quantity from any number of the extracted joints of the subject's skeleton. Alternatively, the feature quantity calculation unit 27 may add or average the feature quantities calculated for an arbitrary number of joints, thereby obtaining a smaller number of feature quantities than the number of joints from which they were extracted.
  • the feature amount calculation unit 27 may supplement the position of the joint that could not be acquired or the feature amount related to the joint that could not be acquired.
  • As described above, the behavior identification device 10 according to Embodiment 6, like the behavior identification device 10 according to Embodiment 1, identifies the behavior of a plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. As a result, even behaviors with similar postures and movements are likely to be correctly discriminated, which improves the accuracy of behavior recognition.
  • Further, the behavior identification device 10 according to Embodiment 6 uses, as the feature quantity, the momentum calculated from skeleton information acquired over past frames. For example, by using as the feature quantity the skeleton momentum calculated from time-series skeleton information over a relatively long time width, such as several seconds, the behavior is likely to be correctly determined even if the subject's skeletal joints are erroneously extracted due to the person's orientation or partial concealment of the body by occlusion.
  • the behavior specifying device 10 may use a mutual rule instead of the mutual model.
  • In Embodiment 6, as in Embodiment 1, the behavior of the plurality of subjects as a whole is identified.
  • However, as in Modification 2, the behavior identification device 10 may identify which behavior each subject is performing within the overall behavior.
  • In this case, the mutual identification unit 26 of the behavior identification device 10 targets each subject and identifies, from the overall behavior and the skeleton information of the target subject, the behavior of the target subject within the overall behavior. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the target subject's skeleton information, and then identifies the behavior of the target subject within the overall behavior from the overall behavior and that individual behavior.
  • Embodiment 7: Embodiment 7 differs from Embodiments 3, 5, and 6 in the feature quantity calculated from the skeleton information. In Embodiment 7, these differences are described and the common points are omitted. Embodiment 7 is explained mainly in terms of its differences from Embodiment 6.
  • The feature quantity calculation process (step S51 in FIG. 10) according to Embodiment 7 will be described. (Step S81: Skeleton information acquisition process)
  • the feature amount calculation unit 27 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • The feature quantity calculation unit 27 acquires, from the skeleton information database, the skeleton information from the current time t back to N time steps earlier for each of the plurality of subjects included in the target set.
  • the feature amount calculation unit 27 sets data in which the acquired skeleton information is arranged in time series as time series data.
  • Step S82 Trajectory calculation process
  • The feature quantity calculation unit 27 generates, as the feature quantity, a vector or matrix in which the position information of the joints of the target subject's skeleton at each time from the current time t back to the past time t-N, represented by the time-series skeleton information generated in step S81, is arranged in chronological order. In the following, it is assumed that a vector in which the joint position information is arranged in chronological order is generated. The vector generated in this way has, as its elements, the joint positions of the skeleton arranged in time series, and therefore represents the movement path of each joint from time t-N to time t, that is, its trajectory.
  • For example, when the joint positions are two-dimensional, each joint position is expressed as (x, y) using a coordinate value x representing the horizontal position and a coordinate value y representing the vertical position. A minimal sketch of this trajectory feature follows.
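  • A minimal sketch of the trajectory feature: the joint positions over the time window are simply arranged in chronological order into one vector. The function name, array shapes, and example data are illustrative assumptions.

```python
import numpy as np

def skeleton_trajectory(positions):
    """positions: array of shape (T, J, 2) with (x, y) joint positions from time t-N to t.
    Returns a 1-D vector in which the joint positions are arranged in chronological order,
    i.e. the movement path (trajectory) of the joints over the window."""
    return positions.reshape(-1)

positions = np.random.rand(5, 17, 2)     # N + 1 = 5 times, 17 joints, 2-D coordinates
trajectory_feature = skeleton_trajectory(positions)
print(trajectory_feature.shape)          # (5 * 17 * 2,) = (170,)
```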
  • The feature quantity calculation unit 27 may calculate the feature quantity for any number of the extracted joints of the subject's skeleton. Further, for positive integers M and m with m ≤ M, when the skeleton information contains M-dimensional joint position information, the feature quantity calculation unit 27 may calculate the feature quantity using only m of the coordinate values.
  • the feature amount calculation unit 27 may supplement the position of the joint that could not be acquired or the feature amount related to the joint that could not be acquired.
  • As described above, the behavior identification device 10 according to Embodiment 7, like the behavior identification device 10 according to Embodiment 1, identifies the behavior of a plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. As a result, even behaviors with similar postures and movements are likely to be correctly discriminated, which improves the accuracy of behavior recognition.
  • Further, the behavior identification device 10 according to Embodiment 7 uses, as the feature quantity, the trajectory calculated from skeleton information acquired over past frames. For example, by using as the feature quantity the skeleton trajectory calculated from time-series skeleton information over a relatively long time width, such as several seconds, the behavior is likely to be correctly determined even if the subject's skeletal joints are erroneously extracted due to the person's orientation or partial concealment of the body by occlusion.
  • The behavior specifying device 10 may use a mutual rule instead of the mutual model.
  • <Modification 17> In the seventh embodiment, as in the first embodiment, the behavior of the plurality of subjects as a whole is specified. However, the behavior specifying device 10 may further specify which behavior each subject is performing within the behavior as a whole, as in Modification 2.
  • In this case, the mutual identification unit 26 of the behavior specifying device 10 targets each subject and identifies, from the behavior as a whole and the skeleton information of the target subject, the behavior of that subject within the overall behavior. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the skeleton information of the target subject, and then identifies the behavior of that subject within the overall behavior from the behavior as a whole and the individual behavior.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

In the present invention, a skeletal information acquisition unit (22) targets each of multiple subjects, who are a plurality of persons appearing in video data acquired by a video acquisition unit (21), and acquires, for the targeted subject, skeletal information indicating the positions of the joints of the skeleton. An action identification unit (24) identifies, from the skeletal information acquired by the skeletal information acquisition unit (22) for each of the plurality of subjects, an action taken by the plurality of subjects as a whole, taking into consideration a mutual action, which is an action in which the plurality of subjects influence one another.

Description

Behavior identification device, behavior identification method, and behavior identification program
This disclosure relates to a technique for identifying human behavior based on skeleton information indicating the positions of the joints of the human skeleton.
Patent Document 1 describes a human behavior recognition technique that uses skeleton information. In the technique described in Patent Document 1, for each person shown in the video, image information around the target person is acquired, the skeleton information of the target person is extracted, and image information from which the movement of the target person can be confirmed is generated from the skeleton information. Then, based on the generated image information and image information of a determination person attribute stored in advance, it is determined whether the attribute of the target person is the determination person attribute.
Japanese Unexamined Patent Publication No. 2019-046481
In the technique described in Patent Document 1, behavior is recognized based on the skeleton information of a single person, the target person. Therefore, behaviors whose postures and movements are similar in that, for example, "an arm is stretched forward", such as "shaking hands" and "hitting", may not be discriminated correctly.
The purpose of this disclosure is to improve the accuracy of behavior recognition.
The behavior identification device according to this disclosure includes:
a skeleton information acquisition unit that acquires, for each of a plurality of subjects who are a plurality of people shown in video data, skeleton information indicating the positions of the joints of the subject's skeleton; and
a behavior identification unit that identifies, from the skeleton information acquired by the skeleton information acquisition unit for each of the plurality of subjects, the behavior of the plurality of subjects as a whole, in consideration of mutual behavior, which is behavior in which the plurality of subjects influence each other.
In this disclosure, the behavior of a plurality of subjects as a whole is identified in consideration of mutual behavior, which is behavior in which the plurality of subjects influence each other. As a result, there is a high possibility that even behaviors with similar postures and movements can be correctly discriminated, and the accuracy of behavior recognition can be improved.
A configuration diagram of the behavior specifying device 10 according to Embodiment 1.
A flowchart showing the overall operation of the behavior specifying device 10 according to Embodiment 1.
A flowchart of the behavior identification process according to Embodiment 1.
A configuration diagram of the behavior specifying device 10 according to Modification 3.
A configuration diagram of the learning device 50 according to Embodiment 2.
A flowchart showing the operation in which the learning device 50 according to Embodiment 2 generates an individual model.
A flowchart showing the operation in which the learning device 50 according to Embodiment 2 generates a mutual model.
A configuration diagram of the learning device 50 according to Modification 6.
A configuration diagram of the behavior specifying device 10 according to Embodiment 3.
A flowchart showing the operation of the behavior specifying device 10 according to Embodiment 3.
A flowchart of the feature amount calculation process according to Embodiment 5.
A flowchart of the feature amount calculation process according to Embodiment 6.
A flowchart of the feature amount calculation process according to Embodiment 7.
Embodiment 1.
*** Explanation of configuration ***
The configuration of the behavior specifying device 10 according to the first embodiment will be described with reference to FIG. 1.
The behavior specifying device 10 is a computer.
The behavior specifying device 10 includes, as hardware, a processor 11, a memory 12, a storage 13, and a communication interface 14. The processor 11 is connected to the other hardware via signal lines and controls the other hardware.
The processor 11 is an IC (Integrated Circuit) that performs processing. Specific examples of the processor 11 are a CPU (Central Processing Unit), a DSP (Digital Signal Processor), and a GPU (Graphics Processing Unit).
The memory 12 is a storage device that temporarily stores data. Specific examples of the memory 12 are an SRAM (Static Random Access Memory) and a DRAM (Dynamic Random Access Memory).
The storage 13 is a storage device that stores data. A specific example of the storage 13 is an HDD (Hard Disk Drive). The storage 13 may also be a portable recording medium such as an SD (registered trademark, Secure Digital) memory card, CF (CompactFlash, registered trademark), NAND flash, flexible disk, optical disk, compact disc, Blu-ray (registered trademark) disc, or DVD (Digital Versatile Disk).
The communication interface 14 is an interface for communicating with an external device. Specific examples of the communication interface 14 are Ethernet (registered trademark), USB (Universal Serial Bus), and HDMI (registered trademark, High-Definition Multimedia Interface) ports.
The behavior specifying device 10 is connected to a camera 31 via the communication interface 14. The camera 31 may be a general 2D (Dimension) camera or a 3D camera. Using a 3D camera as the camera 31 also provides depth information, so the positions of a person's joints can be identified appropriately in the processing described later.
The behavior specifying device 10 includes, as functional components, a video acquisition unit 21, a skeleton information acquisition unit 22, a correlation determination unit 23, and a behavior identification unit 24. The behavior identification unit 24 includes an individual identification unit 25 and a mutual identification unit 26. The functions of the functional components of the behavior specifying device 10 are realized by software.
The storage 13 stores a program that realizes the functions of the functional components of the behavior specifying device 10. This program is read into the memory 12 by the processor 11 and executed by the processor 11. As a result, the functions of the functional components of the behavior specifying device 10 are realized.
Although only one processor 11 is shown in FIG. 1, there may be a plurality of processors 11, and the plurality of processors 11 may cooperate to execute the programs that realize the functions.
*** Explanation of operation ***
The operation of the behavior specifying device 10 according to the first embodiment will be described with reference to FIGS. 2 and 3.
The operation procedure of the behavior specifying device 10 according to the first embodiment corresponds to the behavior identification method according to the first embodiment. The program that realizes the operation of the behavior specifying device 10 according to the first embodiment corresponds to the behavior identification program according to the first embodiment.
The overall operation of the behavior specifying device 10 according to the first embodiment will be described with reference to FIG. 2.
(Step S11: Video acquisition process)
The video acquisition unit 21 acquires the video data captured by the camera 31 and writes the video data to the memory 12.
(Step S12: Skeleton information acquisition process)
The skeleton information acquisition unit 22 acquires, for each subject who is one of the one or more people shown in the video data acquired in step S11, skeleton information indicating the positions of the joints of that subject's skeleton.
Specifically, the skeleton information acquisition unit 22 reads the video data from the memory 12 and sets each of the one or more subjects shown in the video data as the target subject. The skeleton information acquisition unit 22 identifies the positions of the joints of the target subject's skeleton, assigns an index that distinguishes the subject, and generates the skeleton information. The positions of the joints are represented by coordinate values or the like. The skeleton information acquisition unit 22 writes the skeleton information to the memory 12.
The skeleton information acquisition unit 22 may include in the skeleton information the joint positions identified from a single frame of the video data, or the joint positions identified from a plurality of frames of the video data.
Methods of extracting the positions of a person's joints from video data include a method using deep learning and a method of physically attaching markers to the subject's joints and identifying each joint by recognizing its marker.
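As a rough illustration of the kind of data step S12 produces, the following Python sketch builds per-person skeleton records from the output of an arbitrary pose estimator. The names SkeletonInfo, acquire_skeleton_info, and the pose_estimator callable are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SkeletonInfo:
    person_index: int                   # index that distinguishes each subject
    joints: List[Tuple[float, float]]   # (x, y) position of each joint

def acquire_skeleton_info(frame, pose_estimator) -> List[SkeletonInfo]:
    """pose_estimator is any routine (e.g. a deep-learning pose model or a
    marker tracker) that returns one list of joint coordinates per person
    shown in the frame."""
    skeletons = []
    for idx, joints in enumerate(pose_estimator(frame)):
        skeletons.append(SkeletonInfo(person_index=idx, joints=list(joints)))
    return skeletons
```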
(Step S13: Number-of-people determination process)
The correlation determination unit 23 determines whether skeleton information for two or more people was acquired in step S12, that is, whether two or more people are shown in the video data.
If skeleton information for two or more people has been extracted, the correlation determination unit 23 determines that skeleton information for two or more people has been acquired and advances the process to step S14. Otherwise, the correlation determination unit 23 returns the process to step S11.
(Step S14: Correlation determination process)
The correlation determination unit 23 determines whether the plurality of subjects whose skeleton information was acquired in step S12 are performing a mutual action, that is, an action in which they influence each other. A mutual action is an action through which multiple people affect one another; specific examples are a handshake in which two people reach out and grip each other's hands, and a violent act in which one of two people hits the other.
Specifically, the correlation determination unit 23 takes each set of two or more pieces of skeleton information as a target set, and determines that the skeletons indicated by the skeleton information of the target set form a group performing a mutual action if the distance between those skeletons is smaller than a set threshold value. Alternatively, the correlation determination unit 23 may determine that the skeletons of the target set form a group performing a mutual action if the amounts of change, or the times of change, in the positions of certain joints of those skeletons are correlated with each other.
When there are groups determined to be performing a mutual action, the correlation determination unit 23 writes, for each such group, the indexes of the skeleton information included in that group to the memory 12, and then advances the process to step S15. When there is no group determined to be performing a mutual action, the correlation determination unit 23 returns the process to step S11.
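For illustration, the distance-based test of step S14 could look like the Python sketch below. The helper names and the choice of the mean joint position as the inter-skeleton distance are assumptions made for this example, not definitions taken from this disclosure.

```python
import numpy as np
from itertools import combinations

def skeleton_distance(a, b):
    """Distance between two skeletons, taken here as the distance
    between their mean joint positions (one possible definition)."""
    return float(np.linalg.norm(np.mean(a, axis=0) - np.mean(b, axis=0)))

def find_interacting_groups(skeletons, threshold):
    """skeletons: dict mapping person index -> list of (x, y) joints.
    Returns pairs of indices judged to be performing a mutual action
    because their skeletons are closer than the threshold."""
    groups = []
    for (i, a), (j, b) in combinations(skeletons.items(), 2):
        if skeleton_distance(np.asarray(a, dtype=float), np.asarray(b, dtype=float)) < threshold:
            groups.append((i, j))
    return groups

# Example: two people close together, one far away.
skeletons = {0: [(0.0, 0.0), (0.1, 0.5)],
             1: [(0.3, 0.0), (0.4, 0.5)],
             2: [(5.0, 0.0), (5.1, 0.5)]}
print(find_interacting_groups(skeletons, threshold=1.0))  # [(0, 1)]
```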
(Step S15: Behavior identification process)
The behavior identification unit 24 sets, as the target group, each group determined in step S14 to be a group performing a mutual action. From the skeleton information acquired in step S12 for each of the plurality of subjects included in the target group, the behavior identification unit 24 identifies the behavior of each of the plurality of subjects in consideration of mutual behavior, which is behavior in which the plurality of subjects influence each other.
The behavior identification process (step S15 in FIG. 2) according to the first embodiment will be described with reference to FIG. 3.
(Step S21: Individual identification process)
The individual identification unit 25 sets, as the target group, each group determined in step S14 to be a group performing a mutual action. For each of the plurality of subjects included in the target group, the individual identification unit 25 identifies, from the skeleton information of the target subject, the behavior of that subject as an individual behavior.
Specifically, the individual identification unit 25 identifies the individual behavior using an individual model that takes a person's skeleton information as input and outputs an individual label indicating that person's behavior. The individual model is a trained model generated using a neural network or the like and is assumed to be stored in the storage 13 in advance. That is, the individual identification unit 25 obtains the individual label indicating the individual behavior of the target subject by inputting the skeleton information of the target subject into the individual model. The individual identification unit 25 writes the individual label to the memory 12.
The individual behavior indicated by an individual label is behavior as a single person, for example, "stretching an arm forward", "falling down", or "bending backward".
(Step S22: Mutual identification process)
The mutual identification unit 26 sets, as the target group, each group determined in step S14 to be a group performing a mutual action. From the individual behaviors identified in step S21 for each of the plurality of subjects included in the target group, the mutual identification unit 26 identifies the behavior of the plurality of subjects in the target group as a whole, in consideration of mutual behavior. Considering mutual behavior means that, when the behavior of one subject is identified, the behavior of the other subjects is taken into account; in other words, the behavior of one subject is identified based on the behavior of the other subjects.
Specifically, the mutual identification unit 26 identifies the subjects' behavior using a mutual model that takes as input a set of individual labels, each indicating the individual behavior of one of the plurality of people, and outputs a mutual label indicating the behavior of the plurality of people in consideration of mutual behavior. The mutual model is a trained model generated using a neural network or the like and is assumed to be stored in the storage 13 in advance. That is, the mutual identification unit 26 obtains the mutual label indicating the behavior of the plurality of subjects in the target group as a whole by inputting, into the mutual model, the set of individual labels identified in step S21 for the plurality of subjects included in the target group. The mutual identification unit 26 writes the mutual label to the memory 12.
The behavior indicated by a mutual label is behavior as a plurality of people, for example, "shaking hands" or "one person hits and the other person is hit". As a specific example, when the target group contains two subjects and the individual behavior of both subjects is "stretching an arm forward", the behavior indicated by the mutual label is "handshake". When the target group contains two subjects, the individual behavior of one subject is "stretching an arm forward", and the individual behavior of the other subject is "bending backward", the behavior indicated by the mutual label is "violence". Even when the target group contains three or more subjects, the behavior can likewise be identified from the combination of their individual actions.
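Steps S21 and S22 together could be sketched in Python as follows, assuming trained model objects that expose a predict() method returning a label for each input; the interface and names are illustrative only and are not the actual trained models of this embodiment.

```python
import numpy as np

def identify_group_action(individual_model, mutual_model, skeleton_vectors):
    """skeleton_vectors: one flattened joint-position vector per subject in the group.
    Step S21: obtain an individual label for each subject from the individual model.
    Step S22: feed the set of individual labels to the mutual model to obtain the
    label of the group's behaviour as a whole (e.g. "handshake", "violence")."""
    individual_labels = [individual_model.predict(np.asarray([v]))[0]
                         for v in skeleton_vectors]
    mutual_label = mutual_model.predict([individual_labels])[0]
    return individual_labels, mutual_label
```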
*** Effects of Embodiment 1 ***
As described above, the behavior specifying device 10 according to the first embodiment identifies the behavior of a plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence each other. As a result, there is a high possibility that even behaviors with similar postures and movements can be correctly discriminated, and the accuracy of behavior recognition can be improved.
*** Other configurations ***
<Modification 1>
In the first embodiment, behavior is identified using the individual model and the mutual model, which are trained models generated using a neural network or the like. However, instead of at least one of the individual model and the mutual model, a rule that associates inputs with outputs may be used.
The rule used instead of the individual model is an individual rule that associates a person's skeleton information with an individual label indicating that person's behavior. In other words, the individual rule is a rule from which an individual label is obtained as output when a person's skeleton information is given as input.
When an individual rule is used instead of the individual model, in step S21 of FIG. 3 the individual identification unit 25 refers to the individual rule and obtains, as information indicating the individual behavior of the target subject, the individual label corresponding to the skeleton information of the target subject. At this time, the individual identification unit 25 obtains the individual label associated with the skeleton information that is most similar to the skeleton information of the target subject.
The rule used instead of the mutual model is a mutual rule that associates a set of individual labels, each indicating the individual behavior of one of a plurality of people, with a mutual label indicating the behavior of the plurality of people as a whole. In other words, the mutual rule is a rule from which a mutual label indicating the behavior of the plurality of people is obtained as output when a set of individual labels is given as input.
When a mutual rule is used instead of the mutual model, in step S22 of FIG. 3 the mutual identification unit 26 refers to the mutual rule and obtains, as information indicating the behavior of the plurality of subjects as a whole, the mutual label corresponding to the set of individual labels of the plurality of subjects.
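A minimal sketch of this rule-based variant might look as follows, assuming the individual rule is held as (reference skeleton vector, label) pairs matched by nearest neighbour and the mutual rule as a lookup table; the label strings are illustrative only.

```python
import numpy as np

def apply_individual_rule(individual_rule, skeleton_vector):
    """individual_rule: list of (reference skeleton vector, individual label) pairs.
    Returns the label whose reference skeleton is most similar (smallest
    Euclidean distance) to the observed skeleton."""
    best_label, best_dist = None, float("inf")
    for ref, label in individual_rule:
        dist = float(np.linalg.norm(np.asarray(ref) - np.asarray(skeleton_vector)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def apply_mutual_rule(mutual_rule, individual_labels):
    """mutual_rule: mapping from a sorted tuple of individual labels to a mutual label."""
    return mutual_rule.get(tuple(sorted(individual_labels)), "unknown")

# Example mutual rule (labels are illustrative only).
mutual_rule = {("stretch arm forward", "stretch arm forward"): "handshake",
               ("bend backward", "stretch arm forward"): "violence"}
print(apply_mutual_rule(mutual_rule, ["stretch arm forward", "stretch arm forward"]))  # handshake
```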
<Modification 2>
In the first embodiment, the behavior of the plurality of subjects as a whole is identified. However, the behavior specifying device 10 may further identify which behavior each subject is performing within the behavior as a whole. In this case, the mutual identification unit 26 of the behavior specifying device 10 targets each subject and identifies, from the behavior as a whole and the individual label of the target subject, the behavior of that subject within the overall behavior.
In the first embodiment, an example was described in which, for a group of two people, the behavior indicated by the mutual label is "violence" when the individual behavior of one subject is "stretching an arm forward" and the individual behavior of the other subject is "bending backward". In this example, the behavior of the subject whose individual behavior is "stretching an arm forward" is "hitting the other person", and the behavior of the subject whose individual behavior is "bending backward" is "being hit by the other person".
<Modification 3>
In the first embodiment, the individual model and the mutual model are described as being stored in the storage 13. However, the individual model and the mutual model may be stored in a storage device external to the behavior specifying device 10. In this case, the behavior specifying device 10 accesses the individual model and the mutual model via the communication interface 14.
<Modification 4>
In the first embodiment, the functional components are realized by software. However, as Modification 4, the functional components may be realized by hardware. The points of Modification 4 that differ from the first embodiment will be described.
The configuration of the behavior specifying device 10 according to Modification 4 will be described with reference to FIG. 4.
When the functional components are realized by hardware, the behavior specifying device 10 includes an electronic circuit 15 instead of the processor 11, the memory 12, and the storage 13. The electronic circuit 15 is a dedicated circuit that realizes the functions of the functional components, the memory 12, and the storage 13.
The electronic circuit 15 is assumed to be a single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, a logic IC, a GA (Gate Array), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array).
The functional components may be realized by one electronic circuit 15, or may be distributed over and realized by a plurality of electronic circuits 15.
<Modification 5>
As Modification 5, some of the functional components may be realized by hardware and the other functional components may be realized by software.
The processor 11, the memory 12, the storage 13, and the electronic circuit 15 are referred to as processing circuitry. That is, the functions of the functional components are realized by the processing circuitry.
Embodiment 2.
In the second embodiment, the process of generating the individual model and the mutual model will be described.
*** Explanation of configuration ***
The configuration of the learning device 50 according to the second embodiment will be described with reference to FIG. 5.
The learning device 50 is a computer.
The learning device 50 includes, as hardware, a processor 51, a memory 52, a storage 53, and a communication interface 54. The processor 51 is connected to the other hardware via signal lines and controls the other hardware.
Like the processor 11, the processor 51 is an IC that performs processing. Like the memory 12, the memory 52 is a storage device that temporarily stores data. Like the storage 13, the storage 53 is a storage device that stores data and may be a portable recording medium. Like the communication interface 14, the communication interface 54 is an interface for communicating with an external device.
The learning device 50 is connected to the behavior specifying device 10 via the communication interface 54.
The learning device 50 includes, as functional components, a learning data acquisition unit 61 and a model generation unit 62. The functions of the functional components of the learning device 50 are realized by software.
The storage 53 stores a program that realizes the functions of the functional components of the learning device 50. This program is read into the memory 52 by the processor 51 and executed by the processor 51. As a result, the functions of the functional components of the learning device 50 are realized.
Although only one processor 51 is shown in FIG. 5, there may be a plurality of processors 51, and the plurality of processors 51 may cooperate to execute the programs that realize the functions.
*** Explanation of operation ***
The operation of the learning device 50 according to the second embodiment will be described with reference to FIGS. 6 and 7.
The operation procedure of the learning device 50 according to the second embodiment corresponds to the learning method according to the second embodiment. The program that realizes the operation of the learning device 50 according to the second embodiment corresponds to the learning program according to the second embodiment.
The operation in which the learning device 50 according to the second embodiment generates the individual model will be described with reference to FIG. 6.
(Step S31: Learning data acquisition process)
The learning data acquisition unit 61 acquires learning data that associates skeleton information indicating the positions of the joints of a person's skeleton with that person's behavior.
For example, the learning data is generated by identifying skeleton information from video data obtained by imaging a person actually performing a specified behavior; the extracted skeleton information is associated with the specified behavior to form the learning data. The skeleton information may be vector data containing only the joint positions identified from one frame of the video data, or matrix data containing the joint positions identified from a plurality of frames.
(Step S32: Model generation process)
The model generation unit 62 performs learning using the learning data acquired in step S31 as input, and generates the individual model. The model generation unit 62 writes the individual model to the storage 13 of the behavior specifying device 10.
In the second embodiment, the model generation unit 62 uses the learning data to make a neural network learn the relationship between the positions of the joints of the skeleton and the behavior. For example, if the skeleton information indicates that the shoulder, elbow, and wrist are aligned in a straight line and that their vertical positions are roughly equal, the neural network learns that this represents the action of "stretching an arm forward". The configuration of the neural network used may be a well-known one such as a DNN (deep neural network), a CNN (convolutional neural network), or an RNN (recurrent neural network).
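As an informal sketch only, the training of step S32 could be written as below, here using PyTorch and a small fully connected network as a stand-in for the DNN/CNN/RNN mentioned above; the function name and the assumed data layout are not taken from this disclosure.

```python
import torch
from torch import nn

def train_individual_model(features, labels, num_classes, epochs=50):
    """features: float tensor of shape (num_samples, joint_dim) holding joint positions.
    labels: long tensor of shape (num_samples,) holding individual-action class ids.
    Trains a small classifier mapping skeleton information to an individual label."""
    model = nn.Sequential(nn.Linear(features.shape[1], 64), nn.ReLU(),
                          nn.Linear(64, num_classes))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(features), labels)  # full-batch training for simplicity
        loss.backward()
        optimizer.step()
    return model
```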
The operation in which the learning device 50 according to the second embodiment generates the mutual model will be described with reference to FIG. 7.
(Step S41: Learning data acquisition process)
The learning data acquisition unit 61 acquires learning data that associates a set of individual labels with the behavior of each of the plurality of people in which mutual behavior is taken into account.
For example, the learning data is generated by associating the individual labels indicating the individual behaviors of the people who actually performed a specified mutual action with their behavior as a plurality of people in that mutual action.
(Step S42: Model generation process)
The model generation unit 62 performs learning using the learning data acquired in step S41 as input, and generates the mutual model. The model generation unit 62 writes the mutual model to the storage 13 of the behavior specifying device 10.
In the second embodiment, the model generation unit 62 uses the learning data to make a neural network learn the relationship between a set of individual labels and the behavior as a plurality of people in which mutual behavior is taken into account. For example, for a group of two people whose individual behaviors are both "stretching an arm forward", the neural network learns that the behavior indicated by the mutual label for both subjects is "handshake". The configuration of the neural network used may be a well-known one such as a DNN (deep neural network), a CNN (convolutional neural network), or an RNN (recurrent neural network).
*** Effects of Embodiment 2 ***
As described above, the learning device 50 according to the second embodiment generates the individual model and the mutual model used by the behavior specifying device 10 based on learning data. Therefore, by providing appropriate learning data, the recognition accuracy of the individual model and the mutual model used by the behavior specifying device 10 can be increased.
*** Other configurations ***
<Modification 6>
As described in Modification 1, the behavior specifying device 10 may use an individual rule instead of the individual model, or a mutual rule instead of the mutual model.
When an individual rule is used instead of the individual model, in step S32 of FIG. 6 the model generation unit 62 generates an individual rule instead of the individual model. Specifically, the model generation unit 62 generates, as the individual rule, a database that associates the skeleton information indicating the positions of the joints of a person's skeleton, indicated by each piece of learning data acquired in step S31, with the individual label indicating that person's behavior.
When a mutual rule is used instead of the mutual model, in step S42 of FIG. 7 the model generation unit 62 generates a mutual rule instead of the mutual model. Specifically, the model generation unit 62 generates, as the mutual rule, a database that associates the set of individual labels indicated by each piece of learning data acquired in step S41 with the behavior as a plurality of people in which mutual behavior is taken into account.
<Modification 7>
In the second embodiment, the functional components are realized by software. However, as Modification 7, the functional components may be realized by hardware. The points of Modification 7 that differ from the second embodiment will be described.
The configuration of the learning device 50 according to Modification 7 will be described with reference to FIG. 8.
When the functional components are realized by hardware, the learning device 50 includes an electronic circuit 55 instead of the processor 51, the memory 52, and the storage 53. The electronic circuit 55 is a dedicated circuit that realizes the functions of the functional components, the memory 52, and the storage 53.
The electronic circuit 55 is assumed to be a single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, a logic IC, a GA (Gate Array), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array).
The functional components may be realized by one electronic circuit 55, or may be distributed over and realized by a plurality of electronic circuits 55.
<Modification 8>
As Modification 8, some of the functional components may be realized by hardware and the other functional components may be realized by software.
The processor 51, the memory 52, the storage 53, and the electronic circuit 55 are referred to as processing circuitry. That is, the functions of the functional components are realized by the processing circuitry.
Embodiment 3.
The third embodiment differs from the first embodiment in that the behavior of the plurality of subjects as a whole is identified, in consideration of mutual behavior, from a feature amount calculated from the plurality of pieces of skeleton information. In the third embodiment, only these differences are described, and descriptions of the common points are omitted.
*** Explanation of configuration ***
The configuration of the behavior specifying device 10 according to the third embodiment will be described with reference to FIG. 9.
The behavior specifying device 10 differs from the behavior specifying device 10 shown in FIG. 1 in that the behavior identification unit 24 includes a feature amount calculation unit 27 instead of the individual identification unit 25. The function of the feature amount calculation unit 27 is realized by software or hardware, like the other functions.
*** Explanation of operation ***
The operation of the behavior specifying device 10 according to the third embodiment will be described with reference to FIG. 10.
The operation procedure of the behavior specifying device 10 according to the third embodiment corresponds to the behavior identification method according to the third embodiment. The program that realizes the operation of the behavior specifying device 10 according to the third embodiment corresponds to the behavior identification program according to the third embodiment.
The behavior identification process (step S15 in FIG. 2) according to the third embodiment will be described with reference to FIG. 10.
(Step S51: Feature amount calculation process)
The feature amount calculation unit 27 sets, as the target group, each group determined in step S14 to be a group performing a mutual action. The feature amount calculation unit 27 calculates a feature amount based on the skeleton information of each of the plurality of subjects included in the target group.
Specifically, the feature amount calculation unit 27 calculates the feature amount by integrating the skeleton information of the plurality of subjects included in the target group. Alternatively, the feature amount calculation unit 27 may extract the feature amount from the skeleton information of each of the plurality of subjects included in the target group.
Here, the feature amount is calculated so that information on the positional relationship of the joints between the plural skeletons is retained. For example, suppose the skeleton information contains m coordinate values indicating the joint positions per person, so that one skeleton is represented by an m-dimensional vector. When the skeleton information of n people is integrated, the feature amount is an (m × n)-dimensional vector obtained by concatenating the n m-dimensional vectors, or a matrix with m rows and n columns. Alternatively, the feature amount is a vector or matrix whose elements are the temporal changes of the distances between arbitrary joints of different skeletons; the distance between arbitrary joints of different skeletons is, for example, the distance between the neck of skeleton A and the wrist of skeleton B.
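For illustration, the two kinds of feature amounts described above could be computed as in the following Python sketch, assuming each skeleton is given as a flat vector of (x, y) joint coordinates; the helper names are hypothetical.

```python
import numpy as np

def concatenate_skeletons(skeleton_vectors):
    """Each of the n subjects is an m-dimensional vector of joint coordinates;
    the group feature is the (m * n)-dimensional concatenation, which preserves
    the positional relationship between the skeletons."""
    return np.concatenate([np.asarray(v, dtype=float) for v in skeleton_vectors])

def cross_joint_distance(skeleton_a, skeleton_b, joint_a, joint_b):
    """Distance between one joint of skeleton A (e.g. the neck) and one joint of
    skeleton B (e.g. the wrist), assuming flat (x1, y1, x2, y2, ...) layouts;
    its change over time can also serve as a feature element."""
    a = np.asarray(skeleton_a, dtype=float).reshape(-1, 2)[joint_a]
    b = np.asarray(skeleton_b, dtype=float).reshape(-1, 2)[joint_b]
    return float(np.linalg.norm(a - b))
```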
(Step S52: Mutual identification process)
The mutual identification unit 26 sets, as the target group, each group determined in step S14 to be a group performing a mutual action. Using as input the feature amount calculated in step S51 from the skeleton information of the plurality of subjects included in the target group, the mutual identification unit 26 identifies the behavior of the plurality of subjects as a whole in consideration of mutual behavior.
Specifically, the mutual identification unit 26 identifies the subjects' behavior using a mutual model that takes as input the feature amount of the skeleton information of a plurality of people and outputs a mutual label indicating their behavior as a plurality of people in consideration of mutual behavior. The mutual model is a trained model generated using a neural network or the like and is assumed to be stored in the storage 13 in advance. That is, the mutual identification unit 26 obtains the mutual label indicating the behavior of the plurality of subjects in the target group as a whole by inputting the feature amount calculated in step S51 into the mutual model. The mutual identification unit 26 writes the mutual label to the memory 12.
*** Effects of Embodiment 3 ***
As described above, the behavior specifying device 10 according to the third embodiment, like the behavior specifying device 10 according to the first embodiment, identifies the behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence each other. As a result, there is a high possibility that even behaviors with similar postures and movements can be correctly discriminated, and the accuracy of behavior recognition can be improved.
*** Other configurations ***
<Modification 9>
In the third embodiment, behavior is identified using the mutual model, which is a trained model generated using a neural network or the like. However, as in Modification 1, a mutual rule may be used instead of the mutual model.
In this case, the mutual rule is a rule that associates the feature amount of the skeleton information of a plurality of people with a mutual label indicating their behavior as a plurality of people. When a mutual rule is used instead of the mutual model, in step S52 of FIG. 10 the mutual identification unit 26 refers to the mutual rule and obtains, as information indicating the behavior of the plurality of subjects as a whole, the mutual label corresponding to the feature amount.
<Modification 10>
In the third embodiment, as in the first embodiment, the behavior of the plurality of subjects as a whole is identified. However, as in Modification 2, the behavior specifying device 10 may further identify which behavior each subject is performing within the behavior as a whole. In this case, the mutual identification unit 26 of the behavior specifying device 10 targets each subject and identifies, from the behavior as a whole and the skeleton information of the target subject, the behavior of that subject within the overall behavior. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the skeleton information of the target subject, and then identifies the behavior of that subject within the overall behavior from the behavior as a whole and the individual behavior.
Embodiment 4.
The fourth embodiment differs from the second embodiment in that the mutual model according to the third embodiment is generated. In the fourth embodiment, only these differences are described, and descriptions of the common points are omitted.
Since the individual model is not used in the third embodiment, no individual model is generated in the fourth embodiment.
*** Explanation of operation ***
The operation of the learning device 50 according to the fourth embodiment will be described with reference to FIG. 7.
The operation procedure of the learning device 50 according to the fourth embodiment corresponds to the learning method according to the fourth embodiment. The program that realizes the operation of the learning device 50 according to the fourth embodiment corresponds to the learning program according to the fourth embodiment.
The operation in which the learning device 50 according to the fourth embodiment generates the mutual model will be described with reference to FIG. 7.
(Step S41: Learning data acquisition process)
The learning data acquisition unit 61 acquires learning data that associates the feature amount of the skeleton information of a plurality of people with their behavior as a plurality of people.
For example, the learning data is generated by calculating the feature amount from video data obtained by imaging a plurality of people actually performing a specified mutual action; the calculated feature amount is associated with each person's behavior in the specified mutual action to form the learning data.
(Step S42: Model generation process)
The model generation unit 62 performs learning using the learning data acquired in step S41 as input, and generates the mutual model. The model generation unit 62 writes the mutual model to the storage 13 of the behavior specifying device 10.
*** Effects of Embodiment 4 ***
As described above, the learning device 50 according to the fourth embodiment generates the mutual model used by the behavior specifying device 10 based on learning data. Therefore, by providing appropriate learning data, the recognition accuracy of the mutual model used by the behavior specifying device 10 can be increased.
*** Other configurations ***
<Modification 11>
As described in Modification 9, the behavior specifying device 10 may use a mutual rule instead of the mutual model.
When a mutual rule is used instead of the mutual model, in step S42 of FIG. 7 the model generation unit 62 generates a mutual rule instead of the mutual model. Specifically, the model generation unit 62 generates, as the mutual rule, a database that associates the feature amount indicated by each piece of learning data acquired in step S41 with the behavior as a plurality of people in which mutual behavior is taken into account.
Embodiment 5.
The fifth embodiment differs from the third embodiment in the method of calculating the feature amount from the skeleton information. In the fifth embodiment, only these differences are described, and descriptions of the common points are omitted.
In the fifth embodiment, skeleton information from at least one time step earlier is required when calculating the feature amount from the skeleton information. Therefore, in the fifth embodiment, after the skeleton information is acquired in step S12 of FIG. 2, the skeleton information is stored in a skeleton information database realized by the storage 13.
The behavior identification process according to the fifth embodiment (step S15 in FIG. 2) will be described with reference to FIG. 10.
(Step S51: Feature amount calculation process)
The feature amount calculation unit 27 sets, as a target set, each set determined in step S14 to be a set performing mutual behavior. The feature amount calculation unit 27 calculates a feature amount based on the skeleton information of each of the plurality of subjects included in the target set. The feature amount calculation unit 27 writes the feature amount into a feature amount database realized by the storage 13.
Specifically, the feature amount calculation unit 27 calculates the feature amount from the skeleton information of each of the plurality of subjects included in the target set. The feature amount calculation unit 27 then attaches the current time t to the calculated feature amount as an index and writes it into the feature amount database.
The calculated feature amount and the method of calculating it are described later.
(Step S52: Mutual identification process)
The mutual identification unit 26 sets, as a target set, each set determined in step S14 to be a set performing mutual behavior. The mutual identification unit 26 takes as input the feature amounts of the skeleton information of the plurality of subjects included in the target set, calculated in step S51, and identifies the behavior of the plurality of subjects as a whole in consideration of the mutual behavior.
Specifically, the mutual identification unit 26 acquires the feature amounts of the plurality of subjects included in the target set from the feature amount database. The mutual identification unit 26 then identifies the behavior of the subjects using a mutual model that takes the feature amounts of a plurality of people as input and outputs a mutual label indicating the behavior of the plurality of people as a whole in consideration of the mutual behavior. The mutual model is a trained model generated using a neural network or the like and is assumed to be stored in the storage 13 in advance. That is, by inputting the feature amounts calculated in step S51 into the mutual model, the mutual identification unit 26 acquires a mutual label indicating the behavior of the plurality of subjects included in the target set as a whole. The mutual identification unit 26 writes the mutual label into the memory 12.
The feature amounts that the mutual identification unit 26 acquires from the feature amount database need not be a single value calculated at one time; they may be a plurality of feature amounts consecutive in time series. When a plurality of time-series feature amounts are acquired, the mutual identification unit 26 identifies the behavior of the plurality of subjects included in the target set based on the transition of the feature amounts and acquires the mutual label. That is, in this case, the mutual model is a model that takes the transition of the feature amounts of a plurality of people as input and outputs a mutual label indicating the behavior of the plurality of people as a whole in consideration of the mutual behavior.
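A minimal sketch of step S52 is given below, assuming the kind of mutual model sketched for the fourth embodiment; the storage access is abstracted away, and the database layout and function names are assumptions of this sketch, not part of the disclosure.

```python
import numpy as np


def mutual_identification(mutual_model, feature_db, target_set, time_index):
    """Step S52: obtain a mutual label for one set of subjects.

    feature_db is assumed to map (subject_id, time_index) to a feature value;
    concatenating the subjects' features into one input vector is likewise an
    assumption of this sketch.
    """
    features = [feature_db[(subject_id, time_index)] for subject_id in target_set]
    x = np.asarray(features, dtype=np.float32).reshape(1, -1)
    mutual_label = mutual_model.predict(x)[0]
    return mutual_label  # written to memory in the device
```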
The feature amount calculation process according to the fifth embodiment (step S51 in FIG. 10) will be described with reference to FIG. 11.
(Step S61: Skeleton information acquisition process)
The feature amount calculation unit 27 sets, as a target set, each set determined to be a set performing mutual behavior. The feature amount calculation unit 27 acquires, from the skeleton information database, the skeleton information at the current time and the skeleton information one time step earlier for each of the plurality of subjects included in the target set.
(Step S62: Speed calculation process)
The feature amount calculation unit 27 calculates the feature amount using the skeleton information at the current time and the skeleton information one time step earlier for each of the plurality of subjects acquired in step S61.
Specifically, the feature amount calculation unit 27 calculates a vector or matrix whose elements are the movement distances of the joints of the subject's skeleton between the two time-series-consecutive frames acquired in step S61. Since each movement distance calculated in this way is the distance a joint moves during the time between the two frames, it can be regarded as the speed of that joint. The feature amount calculation unit 27 then takes the sum or average of the speeds of the joints to obtain a scalar, treats this scalar as the speed of the subject's skeleton as a whole, and uses this speed as the feature amount.
In step S61, the feature amount calculation unit 27 may acquire skeleton information over a time width N, from the current time t back to the past time t-N. In this case, in step S62, the feature amount calculation unit 27 generates a vector or matrix whose elements are the movement distances of the joints of the skeleton between each pair of consecutive times. The feature amount calculation unit 27 sums the movement distances of each joint in the time direction, divides the sum by the time width N, and calculates the average movement distance from the current time t to the past time t-N as the speed of each joint. That is, for each joint, the feature amount calculation unit 27 sums the movement distances calculated between consecutive times for that joint, divides the sum by the time width N, and obtains the average movement distance of that joint, which it treats as the speed of that joint. The feature amount calculation unit 27 then takes the sum or average of the speeds of the joints to obtain a scalar, treats this scalar as the speed of the subject's skeleton as a whole, and uses this speed as the feature amount.
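The speed feature of steps S61 and S62 could be computed along the following lines. The sketch assumes skeleton information is given as a NumPy array of joint coordinates per frame; the array layout and the choice of averaging over joints are assumptions for illustration only.

```python
import numpy as np


def skeleton_speed(skeletons: np.ndarray) -> float:
    """Speed feature from time-series skeleton information.

    skeletons: array of shape (T, J, D) holding the positions of J joints in
    D dimensions at T consecutive times (T >= 2). With T == 2 this reduces to
    the two-frame case; with T == N + 1 it averages over the time width N.
    """
    # Movement distance of each joint between consecutive times: shape (T-1, J).
    step_distances = np.linalg.norm(np.diff(skeletons, axis=0), axis=-1)
    # Average movement distance per joint over the time width, i.e. joint speed.
    joint_speeds = step_distances.sum(axis=0) / step_distances.shape[0]
    # Scalar speed of the whole skeleton (here: the mean over joints).
    return float(joint_speeds.mean())
```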
In the above description, the feature amount is a scalar. However, the feature amount calculation unit 27 may instead use, as the feature amount, vector data whose elements are the speeds of the individual joints, without taking the sum or average of the speeds over all joints.
The feature amount calculation unit 27 may calculate the feature amount from any number of the joints of the extracted skeleton of the subject. Alternatively, the feature amount calculation unit 27 may add or average the feature amounts calculated for an arbitrary number of joints, thereby obtaining fewer feature amounts than the number of joints from which they were extracted.
Further, the feature amount calculation unit 27 may sum or average the feature amounts calculated for the individual subjects to obtain a single feature amount.
When calculating the feature amount, it may happen that the positions of some joints in the skeleton information cannot be acquired. In this case, the feature amount calculation unit 27 may complement the position of a joint that could not be acquired, or the feature amount related to such a joint, based on past feature amounts stored in the feature amount database or based on the joints whose positions could be acquired.
Possible complementing methods include using the feature amount of one time step earlier as the feature amount at the time when the joint position could not be acquired, or computing the feature amount at that time by linear interpolation from the displacement of the feature amounts over the past several time steps. Alternatively, the feature amount calculation unit 27 may calculate the average speed per joint from the speed of the whole group of joints whose positions could be acquired and use it as the speed of the joint whose position could not be acquired, or may calculate the average speed per joint from the speed of the acquired joints surrounding the missing joint and use it as the speed of the missing joint. The feature amount calculation unit 27 may also complement a missing joint with the position of the joint paired with it on the opposite side or connected to it, for example complementing a right knee position that could not be acquired with the position of the left knee.
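One way to realize the complementing described above is sketched below; carrying the previous value forward and linear extrapolation from recent values are the two simplest options mentioned in the text, and the array layout and window handling are assumptions of this sketch.

```python
import numpy as np


def complement_missing(history: list, current: np.ndarray) -> np.ndarray:
    """Fill NaN joint positions in `current` (shape (J, D)) using past frames.

    history: list of earlier (J, D) arrays, oldest first. If at least two past
    frames are available, a missing joint is linearly extrapolated from the
    displacement over the last two frames; otherwise the last known value is
    carried forward.
    """
    filled = current.copy()
    missing = np.isnan(filled).any(axis=-1)  # joints with no position
    for j in np.flatnonzero(missing):
        if len(history) >= 2:
            # Linear extrapolation from the displacement over the last two frames.
            filled[j] = history[-1][j] + (history[-1][j] - history[-2][j])
        elif history:
            # Carry the previous position forward.
            filled[j] = history[-1][j]
        # If there is no history at all, the joint is left missing.
    return filled
```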
*** Effect of Embodiment 5 ***
As described above, the behavior identification device 10 according to the fifth embodiment, like the behavior identification device 10 according to the first embodiment, identifies the behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. This increases the possibility of correctly discriminating even behaviors whose postures and movements are similar. As a result, the accuracy of behavior recognition can be improved.
In particular, the behavior identification device 10 according to the fifth embodiment uses, as the feature amount, a speed calculated from skeleton information acquired over two or more frames. If the skeleton speed calculated from time-series skeleton information over a somewhat long time width, for example several seconds, is used as the feature amount, the behavior is more likely to be discriminated correctly even when joints of the subject's skeleton are extracted erroneously because of the person's orientation or partial occlusion of the body.
*** Other configurations ***
<Modification 12>
As described in Modification 9, the behavior identification device 10 may use a mutual rule instead of the mutual model.
<Modification 13>
In the fifth embodiment, as in the first embodiment, the behavior of the plurality of subjects as a whole is identified. However, as in Modification 2, the behavior identification device 10 may further identify which behavior each subject is performing within the behavior as a whole. In this case, the mutual identification unit 26 of the behavior identification device 10 identifies, for each subject, the behavior of the target subject within the behavior as a whole from the behavior as a whole and the skeleton information of the target subject. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the skeleton information of the target subject, and identifies the behavior of the target subject within the behavior as a whole from the behavior as a whole and the individual behavior of the target subject.
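Modification 13 can be pictured as a simple mapping from the pair (behavior as a whole, individual behavior) to the role each subject plays within that behavior. The labels in the table below are purely hypothetical, since the disclosure does not fix any particular label set.

```python
# Hypothetical example labels; the actual label sets are not specified in the text.
ROLE_TABLE = {
    ("assault", "raising_arm"): "attacker",
    ("assault", "crouching"): "victim",
    ("handshake", "extending_arm"): "participant",
}


def identify_role(overall_behavior: str, individual_behavior: str) -> str:
    """Identify a subject's behavior within the behavior of the group as a whole."""
    return ROLE_TABLE.get((overall_behavior, individual_behavior), "unspecified")
```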
Embodiment 6.
The sixth embodiment differs from the third and fifth embodiments in the method of calculating the feature amount from the skeleton information. The sixth embodiment describes this difference; description of the common points is omitted.
The sixth embodiment is described in terms of its differences from the fifth embodiment.
The feature amount calculation process according to the sixth embodiment (step S51 in FIG. 10) will be described with reference to FIG. 12.
(Step S71: Skeleton information acquisition process)
The feature amount calculation unit 27 sets, as a target set, each set determined in step S14 to be a set performing mutual behavior. The feature amount calculation unit 27 acquires, from the skeleton information database, the skeleton information from the current time t back to N time steps earlier for each of the plurality of subjects included in the target set. The feature amount calculation unit 27 sets, as time-series data, the acquired skeleton information arranged in chronological order.
The time-series data is skeleton information arranged in chronological order over a target period of a certain length, for example several seconds. It is desirable that it contain skeleton information at two or more times arranged in chronological order, and more desirable that it contain skeleton information at three or more times.
(Step S72: Movement distance calculation process)
In the time-series data of skeleton information generated in step S71, the feature amount calculation unit 27 calculates the movement distance of each joint of the target subject's skeleton between the skeleton information at each pair of consecutive times. Specifically, for each joint, the feature amount calculation unit 27 calculates the movement distance of that joint by computing the difference in its position between the skeleton information at the two times. The feature amount calculation unit 27 generates a vector or matrix whose elements are the movement distances of the joints. The following description assumes that a vector whose elements are the movement distances of the joints has been generated.
(Step S73: Momentum calculation process)
The feature amount calculation unit 27 sums, in the time direction, the vectors generated in step S72 whose elements are the movement distances of the joints. That is, for each joint, the feature amount calculation unit 27 sums the movement distances calculated between consecutive times for that joint. The value calculated in this way is the total movement distance of each joint over the time width N from the current time t to the past time t-N. This value can therefore be regarded as the momentum of each joint over the time width N.
The feature amount calculation unit 27 sums or averages the momentums of all the joints to obtain a scalar, regards this scalar as the momentum of the subject's skeleton as a whole over the time width N, and uses this momentum as the feature amount.
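A sketch of steps S71 to S73, under the same assumed array layout as the speed sketch above, is given below; summing over joints rather than averaging is an arbitrary choice made here for illustration.

```python
import numpy as np


def skeleton_momentum(skeletons: np.ndarray) -> float:
    """Momentum feature over a time width N.

    skeletons: array of shape (N + 1, J, D) with the positions of J joints in
    D dimensions from time t - N up to the current time t.
    """
    # Step S72: movement distance of each joint between consecutive times, shape (N, J).
    step_distances = np.linalg.norm(np.diff(skeletons, axis=0), axis=-1)
    # Step S73: total movement distance of each joint over the time width N, shape (J,).
    joint_momentum = step_distances.sum(axis=0)
    # Scalar momentum of the whole skeleton (here: the sum over joints).
    return float(joint_momentum.sum())
```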
In the above description, the momentum is a scalar. However, the feature amount calculation unit 27 may instead use, as the feature amount, vector data whose elements are the momentums of the individual joints, without taking the sum or average of the momentums over all joints.
The feature amount calculation unit 27 may calculate the feature amount from any number of the joints of the extracted skeleton of the subject. Alternatively, the feature amount calculation unit 27 may add or average the feature amounts calculated for an arbitrary number of joints, thereby obtaining fewer feature amounts than the number of joints from which they were extracted.
When calculating the feature amount, it may happen that the positions of some joints in the skeleton information cannot be acquired. In this case, as in the fifth embodiment, the feature amount calculation unit 27 may complement the position of a joint that could not be acquired, or the feature amount related to such a joint.
*** Effect of Embodiment 6 ***
As described above, the behavior identification device 10 according to the sixth embodiment, like the behavior identification device 10 according to the first embodiment, identifies the behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. This increases the possibility of correctly discriminating even behaviors whose postures and movements are similar. As a result, the accuracy of behavior recognition can be improved.
In particular, the behavior identification device 10 according to the sixth embodiment uses, as the feature amount, a momentum calculated from skeleton information acquired over past frames. By using as the feature amount the skeleton momentum calculated from time-series skeleton information over a somewhat long time width, for example several seconds, the behavior is more likely to be discriminated correctly even when joints of the subject's skeleton are extracted erroneously because of the person's orientation or partial occlusion of the body.
*** Other configurations ***
<Modification 14>
As described in Modification 9, the behavior identification device 10 may use a mutual rule instead of the mutual model.
<Modification 15>
In the sixth embodiment, as in the first embodiment, the behavior of the plurality of subjects as a whole is identified. However, as in Modification 2, the behavior identification device 10 may further identify which behavior each subject is performing within the behavior as a whole. In this case, the mutual identification unit 26 of the behavior identification device 10 identifies, for each subject, the behavior of the target subject within the behavior as a whole from the behavior as a whole and the skeleton information of the target subject. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the skeleton information of the target subject, and identifies the behavior of the target subject within the behavior as a whole from the behavior as a whole and the individual behavior of the target subject.
Embodiment 7.
The seventh embodiment differs from the third, fifth, and sixth embodiments in the feature amount calculated from the skeleton information. The seventh embodiment describes this difference; description of the common points is omitted.
The seventh embodiment is described in terms of its differences from the sixth embodiment.
The feature amount calculation process according to the seventh embodiment (step S51 in FIG. 10) will be described with reference to FIG. 13.
(Step S81: Skeleton information acquisition process)
The feature amount calculation unit 27 sets, as a target set, each set determined in step S14 to be a set performing mutual behavior. The feature amount calculation unit 27 acquires, from the skeleton information database, the skeleton information from the current time t back to N time steps earlier for each of the plurality of subjects included in the target set. The feature amount calculation unit 27 sets, as time-series data, the acquired skeleton information arranged in chronological order.
(Step S82: Trajectory calculation process)
The feature amount calculation unit 27 generates, as the feature amount, a vector or matrix in which the positions of the joints of the subject's skeleton at each time from the current time t back to the past time t-N, represented by the time-series data of the target subject's skeleton information generated in step S81, are arranged in chronological order. The following description assumes that a vector in which the joint position information is arranged in chronological order has been generated. A vector generated in this way has, as its elements, the positions of the joints of the skeleton arranged in chronological order, and therefore represents the path along which the joints move between time t-N and time t, that is, the trajectory of the motion.
At this time, if the skeleton information is extracted from a two-dimensional image, the position of a joint is expressed as (x, y), using a coordinate value x representing the horizontal position and a coordinate value y representing the vertical position.
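Under the same assumed array layout as the earlier sketches, the trajectory feature of step S82 amounts to flattening the time-ordered joint positions into one vector, as sketched below.

```python
import numpy as np


def skeleton_trajectory(skeletons: np.ndarray) -> np.ndarray:
    """Trajectory feature from time-series skeleton information.

    skeletons: array of shape (N + 1, J, D) with the positions of J joints in
    D dimensions from time t - N up to the current time t. The result is a
    vector whose elements are the joint positions arranged in chronological
    order, i.e. the trajectory of the motion.
    """
    return skeletons.reshape(-1).astype(np.float32)
```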
In calculating the feature amount, the feature amount calculation unit 27 may calculate the feature amount for any number of the joints of the extracted skeleton of the subject. Further, for positive integers M and m, when the skeleton information holds M-dimensional joint position information, the feature amount calculation unit 27 may calculate the feature amount using m coordinate values such that m ≤ M.
When calculating the feature amount, it may happen that the positions of some joints in the skeleton information cannot be acquired. In this case, as in the sixth embodiment, the feature amount calculation unit 27 may complement the position of a joint that could not be acquired, or the feature amount related to such a joint.
*** Effect of Embodiment 7 ***
As described above, the behavior identification device 10 according to the seventh embodiment, like the behavior identification device 10 according to the first embodiment, identifies the behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. This increases the possibility of correctly discriminating even behaviors whose postures and movements are similar. As a result, the accuracy of behavior recognition can be improved.
In particular, the behavior identification device 10 according to the seventh embodiment uses, as the feature amount, a trajectory calculated from skeleton information acquired over past frames. By using as the feature amount the skeleton trajectory calculated from time-series skeleton information over a somewhat long time width, for example several seconds, the behavior is more likely to be discriminated correctly even when joints of the subject's skeleton are extracted erroneously because of the person's orientation or partial occlusion of the body.
*** Other configurations ***
<Modification 16>
As described in Modification 9, the behavior identification device 10 may use a mutual rule instead of the mutual model.
<Modification 17>
In the seventh embodiment, as in the first embodiment, the behavior of the plurality of subjects as a whole is identified. However, as in Modification 2, the behavior identification device 10 may further identify which behavior each subject is performing within the behavior as a whole. In this case, the mutual identification unit 26 of the behavior identification device 10 identifies, for each subject, the behavior of the target subject within the behavior as a whole from the behavior as a whole and the skeleton information of the target subject. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the skeleton information of the target subject, and identifies the behavior of the target subject within the behavior as a whole from the behavior as a whole and the individual behavior of the target subject.
The embodiments and modifications of the present disclosure have been described above. Several of these embodiments and modifications may be carried out in combination. Any one or several of them may also be carried out partially. The present disclosure is not limited to the above embodiments and modifications, and various changes can be made as necessary.
10 behavior identification device, 11 processor, 12 memory, 13 storage, 14 communication interface, 15 electronic circuit, 21 video acquisition unit, 22 skeleton information acquisition unit, 23 correlation determination unit, 24 behavior identification unit, 25 individual identification unit, 26 mutual identification unit, 27 feature amount calculation unit, 31 camera, 50 learning device, 51 processor, 52 memory, 53 storage, 54 communication interface, 55 electronic circuit, 61 learning data acquisition unit, 62 model generation unit.

Claims (17)

1. A behavior identification device comprising: a skeleton information acquisition unit that acquires, for each of a plurality of subjects who are a plurality of people appearing in video data, skeleton information indicating positions of joints of a skeleton of the target subject; and a behavior identification unit that identifies, from the skeleton information acquired by the skeleton information acquisition unit for each of the plurality of subjects, behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another.
2. The behavior identification device according to claim 1, wherein the behavior identification unit comprises: an individual identification unit that identifies, for each of the plurality of subjects, behavior of the target subject as an individual behavior from the skeleton information of the target subject; and a mutual identification unit that identifies the behavior of the plurality of subjects as a whole, in consideration of the mutual behavior, from the individual behaviors identified by the individual identification unit for the plurality of subjects.
3. The behavior identification device according to claim 2, wherein the individual identification unit acquires an individual label indicating the individual behavior of the target subject by inputting the skeleton information of the target subject into an individual model that takes a person's skeleton information as input and outputs an individual label indicating the person's behavior.
4. The behavior identification device according to claim 2, wherein the individual identification unit refers to an individual rule that associates a person's skeleton information with an individual label indicating the person's behavior, and acquires the individual label corresponding to the skeleton information of the target subject as information indicating the individual behavior of the target subject.
5. The behavior identification device according to any one of claims 2 to 4, wherein the mutual identification unit acquires a mutual label indicating the behavior of the plurality of subjects as a whole by inputting the set of individual labels for the plurality of subjects identified by the individual identification unit into a mutual model that takes a set of individual labels indicating the individual behaviors of a plurality of people as input and outputs a mutual label indicating the behavior of the plurality of people as a whole in consideration of the mutual behavior.
6. The behavior identification device according to any one of claims 2 to 4, wherein the mutual identification unit refers to a mutual rule that associates a set of individual labels indicating the individual behaviors of a plurality of people with a mutual label indicating the behavior of the plurality of people as a whole, and acquires the mutual label corresponding to the set of individual labels for the plurality of subjects identified by the individual identification unit as information indicating the behavior of the plurality of subjects as a whole.
7. The behavior identification device according to any one of claims 2 to 6, wherein the mutual identification unit identifies, for each of the plurality of subjects, the behavior of the target subject within the behavior of the plurality of subjects as a whole, from the behavior of the plurality of subjects as a whole and the individual behavior of the target subject.
8. The behavior identification device according to claim 1, wherein the behavior identification unit comprises: a feature amount calculation unit that calculates a feature amount based on the skeleton information of each of the plurality of subjects; and a mutual identification unit that identifies the behavior of the plurality of subjects as a whole, in consideration of the mutual behavior, using the feature amount generated by the feature amount calculation unit as input.
9. The behavior identification device according to claim 8, wherein the feature amount calculation unit calculates, for each of the plurality of subjects, a speed of the target subject as the feature amount from time-series-consecutive skeleton information of the target subject.
10. The behavior identification device according to claim 8, wherein the feature amount calculation unit calculates, for each of the plurality of subjects, a momentum of the target subject as the feature amount from time-series-consecutive skeleton information of the target subject.
11. The behavior identification device according to claim 8, wherein the feature amount calculation unit calculates, for each of the plurality of subjects, a trajectory of motion of the target subject as the feature amount from time-series-consecutive skeleton information of the target subject.
12. The behavior identification device according to any one of claims 8 to 11, wherein the mutual identification unit acquires a mutual label indicating the behavior of the plurality of subjects as a whole by inputting the feature amount calculated by the feature amount calculation unit into a mutual model that takes feature amounts of skeleton information of a plurality of people as input and outputs a mutual label indicating the behavior of the plurality of people as a whole in consideration of the mutual behavior.
13. The behavior identification device according to any one of claims 8 to 11, wherein the mutual identification unit refers to a mutual rule that associates feature amounts of skeleton information of a plurality of people with a mutual label indicating the behavior of the plurality of people as a whole, and acquires the mutual label corresponding to the feature amount calculated by the feature amount calculation unit as information indicating the behavior of the plurality of subjects as a whole.
14. The behavior identification device according to any one of claims 8 to 13, wherein the mutual identification unit identifies, for each of the plurality of subjects, the behavior of the target subject within the behavior of the plurality of subjects as a whole, from the behavior of the plurality of subjects as a whole and the skeleton information of the target subject.
15. The behavior identification device according to any one of claims 1 to 14, further comprising a correlation determination unit that determines whether or not the plurality of subjects are performing mutual behavior, which is behavior in which the plurality of subjects influence one another, wherein the behavior identification unit identifies the behavior of the plurality of subjects as a whole, in consideration of the mutual behavior, when the correlation determination unit determines that the mutual behavior is being performed.
16. A behavior identification method comprising: acquiring, by a skeleton information acquisition unit of a behavior identification device, for each of a plurality of subjects who are a plurality of people appearing in video data, skeleton information indicating positions of joints of a skeleton of the target subject; and identifying, by a behavior identification unit of the behavior identification device, from the skeleton information of each of the plurality of subjects, behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another.
17. A behavior identification program that causes a computer to function as a behavior identification device that performs: a skeleton information acquisition process of acquiring, for each of a plurality of subjects who are a plurality of people appearing in video data, skeleton information indicating positions of joints of a skeleton of the target subject; and a behavior identification process of identifying, from the skeleton information acquired by the skeleton information acquisition process for each of the plurality of subjects, behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another.
PCT/JP2020/029244 2020-07-03 2020-07-30 Action identification device, action identification method, and action identification program WO2022003989A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2021503612A JP6887586B1 (en) 2020-07-03 2020-07-30 Behavior identification device, behavior identification method and behavior identification program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JPPCT/JP2020/026277 2020-07-03
PCT/JP2020/026277 WO2022003981A1 (en) 2020-07-03 2020-07-03 Action specification device, action specification method, and action specification program

Publications (1)

Publication Number Publication Date
WO2022003989A1 true WO2022003989A1 (en) 2022-01-06

Family

ID=79315027

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/JP2020/026277 WO2022003981A1 (en) 2020-07-03 2020-07-03 Action specification device, action specification method, and action specification program
PCT/JP2020/029244 WO2022003989A1 (en) 2020-07-03 2020-07-30 Action identification device, action identification method, and action identification program

Family Applications Before (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/026277 WO2022003981A1 (en) 2020-07-03 2020-07-03 Action specification device, action specification method, and action specification program

Country Status (1)

Country Link
WO (2) WO2022003981A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018089221A1 (en) * 2016-11-09 2018-05-17 Microsoft Technology Licensing, Llc Neural network-based action detection
JP2020027496A (en) * 2018-08-14 2020-02-20 富士ゼロックス株式会社 Monitoring device, monitoring system, and program
JP6692086B1 (en) * 2019-12-16 2020-05-13 株式会社アジラ Abnormal behavior detector

Also Published As

Publication number Publication date
WO2022003981A1 (en) 2022-01-06

Similar Documents

Publication Publication Date Title
JP6887586B1 (en) Behavior identification device, behavior identification method and behavior identification program
Ullah et al. Activity recognition using temporal optical flow convolutional features and multilayer LSTM
Song et al. Richly activated graph convolutional network for robust skeleton-based action recognition
Zhang et al. Fusing geometric features for skeleton-based action recognition using multilayer LSTM networks
Liao et al. Natural language guided visual relationship detection
Zhu et al. A cuboid CNN model with an attention mechanism for skeleton-based action recognition
Wang et al. Hidden‐Markov‐models‐based dynamic hand gesture recognition
Sun et al. Conditional regression forests for human pose estimation
Liu et al. Estimation of missing markers in human motion capture
Elmadany et al. Information fusion for human action recognition via biset/multiset globality locality preserving canonical correlation analysis
Nakai et al. Prediction of basketball free throw shooting by openpose
Xu et al. Spatiotemporal decouple-and-squeeze contrastive learning for semisupervised skeleton-based action recognition
Li et al. Time3d: End-to-end joint monocular 3d object detection and tracking for autonomous driving
Drumond et al. An LSTM recurrent network for motion classification from sparse data
Das et al. MMHAR-EnsemNet: A multi-modal human activity recognition model
Ding et al. Profile HMMs for skeleton-based human action recognition
Hachaj et al. Effectiveness comparison of Kinect and Kinect 2 for recognition of Oyama karate techniques
Malawski et al. Recognition of action dynamics in fencing using multimodal cues
Hachaj et al. Dependence of Kinect sensors number and position on gestures recognition with Gesture Description Language semantic classifier
Ben Tamou et al. Automatic learning of articulated skeletons based on mean of 3D joints for efficient action recognition
Baumann et al. Action graph a versatile data structure for action recognition
Earp et al. Face detection with feature pyramids and landmarks
Pham et al. An efficient feature fusion of graph convolutional networks and its application for real-time traffic control gestures recognition
JP6972434B1 (en) Behavior identification device, behavior identification method and behavior identification program
Tian et al. Joints kinetic and relational features for action recognition

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021503612

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20943352

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20943352

Country of ref document: EP

Kind code of ref document: A1