WO2022003989A1 - Action identification device, action identification method, and action identification program - Google Patents

Action identification device, action identification method, and action identification program

Info

Publication number
WO2022003989A1
Authority
WO
WIPO (PCT)
Prior art keywords
behavior
mutual
subject
subjects
individual
Prior art date
Application number
PCT/JP2020/029244
Other languages
French (fr)
Japanese (ja)
Inventor
浩平 望月
勝大 草野
誠司 奥村
Original Assignee
三菱電機株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 三菱電機株式会社 filed Critical 三菱電機株式会社
Priority to JP2021503612A (JP6887586B1)
Publication of WO2022003989A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion

Definitions

  • This disclosure relates to a technique for identifying human behavior based on skeletal information indicating the positions of joints in the human skeleton.
  • Patent Document 1 describes a human behavior recognition technique using skeletal information.
  • In the technique described in Patent Document 1, image information around the target person is acquired for each person shown in the video, the target person's skeleton information is extracted, and image information from which the target person's movement can be confirmed is generated from the skeleton information. Then, based on the generated image information and image information of a predetermined person attribute stored in advance, it is determined whether the attribute of the target person matches the predetermined person attribute.
  • The behavior identification device according to this disclosure includes: a skeleton information acquisition unit that acquires skeleton information indicating the positions of skeletal joints for each of a plurality of subjects appearing in video data; and a behavior identification unit that identifies, from the skeleton information about each of the plurality of subjects acquired by the skeleton information acquisition unit, the behavior of the plurality of subjects in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another.
  • In this disclosure, the behavior of a plurality of subjects as a group is identified in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. As a result, even behaviors with similar postures and movements are likely to be correctly discriminated, which improves the accuracy of behavior recognition.
  • FIG. 1 is a block diagram of the behavior identification device 10 according to Embodiment 1.
  • FIG. 2 is a flowchart showing the overall operation of the behavior identification device 10 according to Embodiment 1.
  • FIG. 3 is a flowchart of the behavior identification process according to Embodiment 1.
  • A block diagram of the behavior identification device 10 according to Modification 3.
  • FIG. 5 is a block diagram of the learning device 50 according to Embodiment 2.
  • FIG. 6 is a flowchart showing the operation by which the learning device 50 according to Embodiment 2 generates an individual model.
  • FIG. 7 is a flowchart showing the operation by which the learning device 50 according to Embodiment 2 generates a mutual model.
  • FIG. 9 is a block diagram of the behavior identification device 10 according to Embodiment 3.
  • the behavior identification device 10 is a computer.
  • the behavior identification device 10 includes hardware such as a processor 11, a memory 12, a storage 13, and a communication interface 14.
  • the processor 11 is connected to other hardware via a signal line and controls these other hardware.
  • the processor 11 is an IC (Integrated Circuit) that performs processing. Specific examples of the processor 11 are a CPU (Central Processing Unit), a DSP (Digital Signal Processor), and a GPU (Graphics Processing Unit).
  • the memory 12 is a storage device that temporarily stores data.
  • the memory 12 is a SRAM (Static Random Access Memory) or a DRAM (Dynamic Random Access Memory).
  • the storage 13 is a storage device for storing data.
  • the storage 13 is an HDD (Hard Disk Drive).
  • The storage 13 may also be a portable recording medium such as an SD (registered trademark, Secure Digital) memory card, CF (CompactFlash, registered trademark), NAND flash, flexible disk, optical disc, compact disc, Blu-ray (registered trademark) disc, or DVD (Digital Versatile Disc).
  • the communication interface 14 is an interface for communicating with an external device.
  • Specifically, the communication interface 14 is a port for Ethernet (registered trademark), USB (Universal Serial Bus), or HDMI (registered trademark, High-Definition Multimedia Interface).
  • the action specifying device 10 is connected to the camera 31 via the communication interface 14.
  • The camera 31 may be a general 2D (two-dimensional) camera, or it may be a 3D camera. With a 3D camera, depth information can also be obtained, so the positions of human joints can be identified more appropriately in the processing described later.
  • the action specifying device 10 includes a video acquisition unit 21, a skeleton information acquisition unit 22, a correlation determination unit 23, and an action identification unit 24 as functional components.
  • the action specifying unit 24 includes an individual specifying unit 25 and a mutual specifying unit 26.
  • the functions of each functional component of the action specifying device 10 are realized by software.
  • the storage 13 stores a program that realizes the functions of each functional component of the action specifying device 10. This program is read into the memory 12 by the processor 11 and executed by the processor 11. As a result, the functions of each functional component of the action specifying device 10 are realized.
  • In FIG. 1, only one processor 11 is shown. However, there may be a plurality of processors 11, and the plurality of processors 11 may cooperate to execute the programs that realize each function.
  • the operation of the action specifying device 10 according to the first embodiment will be described with reference to FIGS. 2 and 3.
  • the operation procedure of the action specifying device 10 according to the first embodiment corresponds to the action specifying method according to the first embodiment.
  • the program that realizes the operation of the action specifying device 10 according to the first embodiment corresponds to the action specifying program according to the first embodiment.
  • Step S11 Video acquisition process
  • the video acquisition unit 21 acquires video data acquired by the camera 31.
  • the video acquisition unit 21 writes the video data to the memory 12.
  • Step S12 Skeleton information acquisition process
  • The skeleton information acquisition unit 22 acquires skeleton information indicating the positions of skeletal joints for each of the one or more subjects appearing in the video data acquired in step S11. Specifically, the skeleton information acquisition unit 22 reads the video data from the memory 12 and treats each of the one or more subjects appearing in the video data as the target subject. The skeleton information acquisition unit 22 identifies the positions of the joints of the target subject's skeleton, assigns an index that identifies the subject, and generates the skeleton information. Each joint position is represented by a coordinate value or the like. The skeleton information acquisition unit 22 writes the skeleton information to the memory 12.
  • The skeleton information acquisition unit 22 may include in the skeleton information the joint positions identified from a single frame of the video data, or the joint positions identified from a plurality of frames of the video data.
  • As methods of extracting the joint positions of a person shown in the video data, there are, for example, a method using deep learning and a method of physically attaching markers to the subject's joints and identifying the joints by detecting the markers. One possible data layout for the resulting skeleton information is sketched below.
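  • For illustration only, the following is a minimal sketch of how skeleton information with a per-subject index and per-joint coordinates might be represented. The names SkeletonInfo, joints, and the joint keys are hypothetical and not taken from the patent; they simply show one plausible layout under the assumptions stated in the comments.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical 2D joint coordinates keyed by joint name (a 3D camera would add a z value).
Joint = Tuple[float, float]

@dataclass
class SkeletonInfo:
    subject_index: int        # index that identifies the subject appearing in the video
    frame: int                # frame number the joint positions were extracted from
    joints: Dict[str, Joint]  # joint name -> (x, y) coordinate value

# Example: skeleton information for one subject in one frame.
skeleton = SkeletonInfo(
    subject_index=0,
    frame=42,
    joints={
        "neck": (312.0, 140.5),
        "right_shoulder": (290.3, 165.0),
        "right_wrist": (355.2, 260.1),
    },
)
```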
  • Step S13 Number of people determination process
  • the correlation determination unit 23 determines whether or not the skeleton information of two or more persons has been acquired in step S12. That is, the correlation determination unit 23 determines whether or not two or more people are shown in the video data. When the skeleton information of two or more people is extracted, the correlation determination unit 23 determines that the skeleton information of two or more people has been acquired, and proceeds to the process in step S14. On the other hand, if not, the correlation determination unit 23 returns the process to step S11.
  • Step S14 Correlation determination process
  • the correlation determination unit 23 determines whether or not the plurality of subjects whose skeleton information has been acquired in step S12 are performing mutual actions, which are actions that affect each other.
  • Mutual behavior is behavior that influences each other among multiple people. Specific examples are actions such as a handshake in which two people reach out and hold each other, and a violent act in which one of the two hits the other.
  • Specifically, the correlation determination unit 23 targets each set of two or more pieces of skeleton information, and if the distance between the skeletons indicated by the skeleton information of the target set is smaller than a set threshold value, determines that the skeletons indicated by that set of skeleton information form a set performing mutual behavior.
  • Alternatively, the correlation determination unit 23 may target each set of two or more pieces of skeleton information and, if the amounts of change or the timings of change in the joint positions of the skeletons indicated by the skeleton information of the target set are correlated with each other, determine that those skeletons form a set performing mutual behavior. When there are sets determined to be performing mutual behavior, the correlation determination unit 23 writes to the memory 12, for each such set, the indices of the skeleton information included in that set. Then, the correlation determination unit 23 advances the process to step S15. On the other hand, if there is no set determined to be performing mutual behavior, the correlation determination unit 23 returns the process to step S11. A simple distance-based check is sketched below.
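  • As a rough illustration of the distance-based criterion described above, the sketch below pairs skeletons whose reference joints are closer than a threshold. The threshold value, the use of the neck joint as the reference point, and the function name are assumptions for illustration, not values from the patent.

```python
import itertools
from math import dist

def find_interaction_pairs(skeletons, threshold=150.0):
    """Return index pairs of skeletons that may be performing mutual behavior,
    judged by the distance between a reference joint (here: the neck)."""
    pairs = []
    for (i, a), (j, b) in itertools.combinations(enumerate(skeletons), 2):
        if dist(a["neck"], b["neck"]) < threshold:
            pairs.append((i, j))
    return pairs

# Usage: skeletons is a list of dicts mapping joint names to (x, y) coordinates.
skeletons = [
    {"neck": (310.0, 140.0)},
    {"neck": (395.0, 150.0)},
    {"neck": (900.0, 130.0)},
]
print(find_interaction_pairs(skeletons))  # [(0, 1)] with the assumed threshold
```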
  • Step S15 Action identification process
  • the action specifying unit 24 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • The behavior identification unit 24 identifies, from the skeleton information acquired in step S12 about each of the plurality of subjects included in the target set, the behavior of the plurality of subjects in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another.
  • The behavior identification process (step S15 in FIG. 2) according to Embodiment 1 will be described with reference to FIG. 3. (Step S21: Individual identification process)
  • the individual identification unit 25 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • the individual identification unit 25 specifies the behavior of the target subject as individual behavior from the skeleton information of the target subject for each of the plurality of subjects included in the target group.
  • Specifically, the individual identification unit 25 identifies the individual behavior by using an individual model that takes a person's skeleton information as input and outputs an individual label indicating that person's behavior. The individual model is assumed to be a trained model generated using a neural network or the like and stored in the storage 13 in advance.
  • the individual identification unit 25 acquires an individual label indicating the individual behavior of the target subject by inputting the skeleton information of the target subject into the individual model.
  • the individual identification unit 25 writes the individual label in the memory 12.
  • The individual behavior indicated by an individual label is behavior as a single person. Individual behaviors are, for example, actions such as "stretching the arm forward", "falling down", and "recoiling".
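  • The patent does not give a concrete interface for the individual model, so the following is only a sketch assuming a scikit-learn-style classifier that maps a flattened joint-coordinate vector to an individual label. The label strings and helper names are illustrative assumptions.

```python
import numpy as np

# Assumed label set for illustration only.
INDIVIDUAL_LABELS = ["stretching_arm_forward", "falling_down", "recoiling"]

def flatten_joints(joints, joint_order):
    """Flatten per-joint (x, y) coordinates into one feature vector."""
    return np.array([c for name in joint_order for c in joints[name]])

def identify_individual_behavior(individual_model, joints, joint_order):
    """Return the individual label predicted by a trained classifier
    (any object exposing a scikit-learn-style predict method)."""
    x = flatten_joints(joints, joint_order).reshape(1, -1)
    return individual_model.predict(x)[0]
```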
  • Step S22 Mutual identification process
  • the mutual identification unit 26 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • The mutual identification unit 26 identifies, from the individual behavior of each of the plurality of subjects included in the target set identified in step S21, the behavior of the plurality of subjects included in the target set as a whole, in consideration of mutual behavior.
  • Considering mutual behavior means considering the behavior of another subject when identifying the behavior of one subject. In other words, considering mutual behavior means specifying the behavior of a certain subject based on the behavior of another subject.
  • Specifically, the mutual identification unit 26 identifies the subjects' behavior by using a mutual model that takes as input a set of individual labels indicating the individual behavior of each of a plurality of people and outputs a mutual label indicating the behavior of the plurality of people in consideration of mutual behavior. The mutual model is assumed to be a trained model generated using a neural network or the like and stored in the storage 13 in advance. That is, the mutual identification unit 26 obtains a mutual label indicating the behavior of the plurality of subjects included in the target set as a whole by inputting to the mutual model the set of individual labels identified in step S21 for each of the plurality of subjects included in the target set. The mutual identification unit 26 writes the mutual label to the memory 12.
  • The behavior indicated by a mutual label is behavior as multiple people. The behavior indicated by a mutual label is, for example, an action such as "shaking hands" or "one person hitting and the other person being hit".
  • For example, when the individual behavior of both subjects is "stretching the arm forward", the behavior indicated by the mutual label is "handshake". On the other hand, when the individual behavior of one subject is "stretching the arm forward" and the individual behavior of the other subject is "recoiling", the behavior indicated by the mutual label is "violence". Even when three or more subjects are included in the target set, the behavior can be identified in the same way from the combination of the individual behaviors.
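  • To make this step concrete, here is a minimal sketch in which the mutual model consumes an order-insensitive encoding of the set of individual labels and outputs a mutual label such as "handshake" or "violence". The encoding scheme and label strings are assumptions for illustration, not the patent's specification.

```python
import numpy as np

INDIVIDUAL_LABELS = ["stretching_arm_forward", "falling_down", "recoiling"]  # assumed label set

def encode_label_set(labels):
    """Order-insensitive encoding: count of each individual label in the set."""
    vec = np.zeros(len(INDIVIDUAL_LABELS))
    for lab in labels:
        vec[INDIVIDUAL_LABELS.index(lab)] += 1
    return vec

def identify_mutual_behavior(mutual_model, individual_labels):
    """Feed the encoded label set to a trained classifier and return the mutual label."""
    x = encode_label_set(individual_labels).reshape(1, -1)
    return mutual_model.predict(x)[0]
```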
  • As described above, the behavior identification device 10 according to Embodiment 1 identifies the behavior of a plurality of subjects in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. As a result, even behaviors with similar postures and movements are likely to be correctly discriminated, which improves the accuracy of behavior recognition.
  • Modification 1: In Embodiment 1, the behavior is identified by using an individual model and a mutual model, which are trained models generated using a neural network or the like. However, a rule in which inputs and outputs are associated may be used instead of each model.
  • the rule used instead of the individual model is an individual rule in which the human skeleton information and the individual label indicating the human behavior are associated with each other. That is, the individual rule is a rule in which an individual label is obtained as an output when human skeleton information is given as an input.
  • In this case, the individual identification unit 25 refers to the individual rule and acquires the individual label corresponding to the skeleton information of the target subject as information indicating the individual behavior of the target subject. At this time, the individual identification unit 25 acquires the individual label associated with the stored skeleton information that is most similar to the skeleton information of the target subject.
  • the rule used instead of the mutual model is a mutual rule in which a set of individual labels indicating the individual behavior of each of a plurality of people and a mutual label indicating the behavior of a plurality of people are associated with each other. That is, a mutual rule is a rule in which, when a set of individual labels is given as an input, a mutual label indicating an action as a plurality of people is obtained as an output.
  • Likewise, the mutual identification unit 26 refers to the mutual rule and acquires the mutual label corresponding to the set of individual labels of the plurality of subjects as information indicating the behavior of the plurality of subjects as a whole.
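  • A rule can be as simple as a lookup table. The sketch below shows one possible reading of Modification 1: a nearest-neighbour individual rule over stored skeleton vectors, and a dictionary-based mutual rule over sets of individual labels. The stored entries and label strings are illustrative examples, not data from the patent.

```python
import numpy as np

# Individual rule: stored skeleton vectors paired with individual labels (illustrative entries).
individual_rule = [
    (np.array([0.0, 0.0, 1.0, 0.0, 2.0, 0.0]), "stretching_arm_forward"),
    (np.array([0.0, 0.0, 0.5, 0.8, 0.9, 1.6]), "falling_down"),
]

def apply_individual_rule(skeleton_vector):
    """Return the label of the stored skeleton most similar to the input (smallest distance)."""
    _, label = min(individual_rule, key=lambda entry: np.linalg.norm(entry[0] - skeleton_vector))
    return label

# Mutual rule: a set of individual labels mapped to a mutual label (illustrative entries).
mutual_rule = {
    frozenset(["stretching_arm_forward"]): "handshake",  # both subjects stretch an arm forward
    frozenset(["stretching_arm_forward", "recoiling"]): "violence",
}

def apply_mutual_rule(individual_labels):
    return mutual_rule.get(frozenset(individual_labels), "unknown")

print(apply_mutual_rule(["stretching_arm_forward", "stretching_arm_forward"]))  # handshake
print(apply_mutual_rule(["stretching_arm_forward", "recoiling"]))               # violence
```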
  • Modification 2: In Embodiment 1, the behavior of the plurality of subjects as a whole is identified. However, the behavior identification device 10 may further identify which behavior each subject is performing within the overall behavior.
  • In this case, the mutual identification unit 26 of the behavior identification device 10 targets each subject and identifies, from the overall behavior and the individual label of the target subject, the behavior of the target subject within the overall behavior.
  • For example, suppose the overall behavior is "violence", the individual behavior of one subject is "stretching the arm forward", and the individual behavior of the other subject is "recoiling". In this case, the behavior of the subject whose individual behavior is "stretching the arm forward" is identified as "hitting the opponent", and the behavior of the subject whose individual behavior is "recoiling" is identified as "being hit by the opponent".
  • Modification 3: In Embodiment 1, the individual model and the mutual model are stored in the storage 13.
  • the individual model and the mutual model may be stored in an external storage device of the behavior identification device 10.
  • the behavior identification device 10 may access the individual model and the mutual model via the communication interface 14.
  • Modification 4: In Embodiment 1, each functional component is realized by software. However, each functional component may be realized by hardware. The differences of Modification 4 from Embodiment 1 will be described below.
  • the action specifying device 10 includes an electronic circuit 15 in place of the processor 11, the memory 12, and the storage 13.
  • the electronic circuit 15 is a dedicated circuit that realizes the functions of each functional component, the memory 12, and the storage 13.
  • A single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, a logic IC, a GA (Gate Array), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array) is assumed as the electronic circuit 15.
  • Each functional component may be realized by one electronic circuit 15, or each functional component may be distributed and realized by a plurality of electronic circuits 15.
  • Modification 5: Some functional components may be realized by hardware, and the other functional components may be realized by software.
  • the processor 11, the memory 12, the storage 13, and the electronic circuit 15 are called processing circuits. That is, the function of each functional component is realized by the processing circuit.
  • Embodiment 2: In Embodiment 2, the process of generating the individual model and the mutual model is described.
  • the configuration of the learning device 50 according to the second embodiment will be described with reference to FIG.
  • the learning device 50 is a computer.
  • the learning device 50 includes hardware such as a processor 51, a memory 52, a storage 53, and a communication interface 54.
  • the processor 51 is connected to other hardware via a signal line and controls these other hardware.
  • the processor 51 is an IC that performs processing.
  • the memory 52 is a storage device that temporarily stores data.
  • the storage 53 is a storage device for storing data, like the storage 13.
  • the storage 53 may be a portable recording medium like the storage 13.
  • the communication interface 54 is an interface for communicating with an external device.
  • the learning device 50 is connected to the action specifying device 10 via the communication interface 54.
  • the learning device 50 includes a learning data acquisition unit 61 and a model generation unit 62 as functional components.
  • the functions of each functional component of the learning device 50 are realized by software.
  • The storage 53 stores programs that realize the functions of the functional components of the learning device 50. These programs are read into the memory 52 by the processor 51 and executed by the processor 51. As a result, the functions of the functional components of the learning device 50 are realized.
  • In FIG. 5, only one processor 51 is shown. However, there may be a plurality of processors 51, and the plurality of processors 51 may cooperate to execute the programs that realize each function.
  • the operation of the learning device 50 according to the second embodiment will be described with reference to FIGS. 6 and 7.
  • the operation procedure of the learning device 50 according to the second embodiment corresponds to the learning method according to the second embodiment.
  • the program that realizes the operation of the learning device 50 according to the second embodiment corresponds to the learning program according to the second embodiment.
  • Step S31 Learning data acquisition process
  • The learning data acquisition unit 61 acquires learning data in which skeleton information indicating the joint positions of a person's skeleton is associated with that person's behavior.
  • learning data is generated by identifying skeletal information from video data obtained by imaging a person who actually performed a specified action. That is, the extracted skeletal information and the specified action are associated with each other to obtain learning data.
  • the skeleton information may be vector data including only the joint positions specified from one frame of the video data, or may be matrix data including the joint positions specified from a plurality of frames.
  • Step S32 Model generation process
  • the model generation unit 62 receives the learning data acquired in step S31 as an input, performs learning, and generates an individual model.
  • the model generation unit 62 writes the individual model in the storage 13 of the action specifying device 10.
  • Specifically, the model generation unit 62 inputs the learning data and causes a neural network to learn the relationship between the joint positions of the skeleton and the behavior. For example, the model generation unit 62 learns that skeleton information in which the shoulder, elbow, and wrist are aligned and their vertical positions are roughly equal represents the behavior "stretching the arm forward".
  • the configuration of the neural network used may be a well-known one such as DNN (deep neural network), CNN (convolutional neural network), and RNN (recurrent neural network).
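  • As one possible concrete form of this learning step, the sketch below trains a small feed-forward neural network (scikit-learn's MLPClassifier) to map flattened joint coordinates to individual labels. The array shapes, label names, placeholder data, and choice of library are assumptions for illustration; the patent does not prescribe a specific framework or architecture.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Placeholder learning data: each row is a flattened joint-coordinate vector,
# each label is the behavior performed when the skeleton was recorded.
X = np.random.rand(200, 34)  # e.g. 17 joints x (x, y) coordinates
y = np.random.choice(["stretching_arm_forward", "falling_down", "recoiling"], size=200)

# A simple feed-forward network standing in for the "individual model".
individual_model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
individual_model.fit(X, y)  # learn the relationship: joint positions -> individual label

# After training, the model could be stored (e.g. with joblib) and used as in step S21.
```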
  • Step S41 Learning data acquisition process
  • the learning data acquisition unit 61 acquires learning data in which a set of a plurality of individual labels is associated with the behavior of each of the plurality of people in consideration of mutual behavior.
  • the learning data is generated by associating an individual label indicating the individual behavior of each of the plurality of people when the designated mutual action is actually performed with the behavior of the plurality of people in the mutual action.
  • Step S42 Model generation process
  • the model generation unit 62 receives the learning data acquired in step S41 as an input, performs learning, and generates a mutual model.
  • the model generation unit 62 writes the mutual model in the storage 13 of the behavior identification device 10.
  • Specifically, the model generation unit 62 inputs the learning data and causes a neural network to learn the relationship between a set of individual labels and the behavior of the plurality of people in consideration of mutual behavior. For example, for a set of two subjects, the model generation unit 62 learns that when the individual behavior of both subjects is "stretching the arm forward", the behavior indicated by the mutual label is "handshake".
  • the configuration of the neural network used may be a well-known one such as DNN (deep neural network), CNN (convolutional neural network), and RNN (recurrent neural network).
  • As described above, the learning device 50 according to Embodiment 2 generates, based on the learning data, the individual model and the mutual model used by the behavior identification device 10. Thereby, by providing appropriate learning data, the recognition accuracy of the individual model and the mutual model used by the behavior identification device 10 can be improved.
  • Modification 6: As described in Modification 1, the behavior identification device 10 may use an individual rule instead of the individual model, and a mutual rule instead of the mutual model.
  • When an individual rule is used instead of the individual model, the model generation unit 62 generates an individual rule instead of the individual model in step S32 of FIG. 6. Specifically, the model generation unit 62 generates, as the individual rule, a database in which the skeleton information indicating the joint positions of the person's skeleton shown in each piece of learning data acquired in step S31 is associated with the individual label indicating that person's behavior.
  • When a mutual rule is used instead of the mutual model, the model generation unit 62 generates a mutual rule instead of the mutual model in step S42 of FIG. 7. Specifically, the model generation unit 62 generates, as the mutual rule, a database in which the set of individual labels shown in each piece of learning data acquired in step S41 is associated with the behavior of the plurality of people in consideration of mutual behavior.
  • Modification 7: In Embodiment 2, each functional component is realized by software. However, each functional component may be realized by hardware. The differences of Modification 7 from Embodiment 2 will be described below.
  • the learning device 50 includes an electronic circuit 55 instead of the processor 51, the memory 52, and the storage 53.
  • the electronic circuit 55 is a dedicated circuit that realizes the functions of each functional component, the memory 52, and the storage 53.
  • A single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, a logic IC, a GA (Gate Array), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array) is assumed as the electronic circuit 55.
  • Each functional component may be realized by one electronic circuit 55, or each functional component may be distributed and realized by a plurality of electronic circuits 55.
  • Modification 8: Some functional components may be realized by hardware, and the other functional components may be realized by software.
  • the processor 51, the memory 52, the storage 53, and the electronic circuit 55 are called processing circuits. That is, the function of each functional component is realized by the processing circuit.
  • Embodiment 3: Embodiment 3 differs from Embodiment 1 in that the behavior of a plurality of subjects as a whole is identified, in consideration of mutual behavior, from a feature quantity calculated from the plurality of pieces of skeleton information. In Embodiment 3, these differences are described and the common points are omitted.
  • the configuration of the behavior specifying device 10 according to the third embodiment will be described with reference to FIG. 9.
  • the action specifying device 10 is different from the action specifying device 10 shown in FIG. 1 in that the action specifying unit 24 includes a feature amount calculation unit 27 instead of the individual specifying unit 25.
  • the function of the feature amount calculation unit 27 is realized by software or hardware like other functions.
  • The behavior identification process (step S15 in FIG. 2) according to Embodiment 3 will be described with reference to FIG. 10. (Step S51: Feature quantity calculation process)
  • the feature amount calculation unit 27 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • the feature amount calculation unit 27 calculates the feature amount based on the skeletal information of each of the plurality of subjects included in the target set. Specifically, the feature amount calculation unit 27 calculates the feature amount by integrating the skeletal information about each of the plurality of subjects included in the target set. Alternatively, the feature amount calculation unit 27 may extract the feature amount from the skeletal information about each of the plurality of subjects included in the target set.
  • The feature quantity calculation is performed so that information about the positional relationship of the joints between the plurality of skeletons is retained.
  • For example, suppose each person's skeleton information contains m coordinate values indicating the joint positions of the skeleton, so that one skeleton is represented by an m-dimensional vector.
  • In this case, the feature quantity for n subjects is an (m × n)-dimensional vector in which the n m-dimensional vectors are concatenated, or a matrix with m rows and n columns.
  • Alternatively, the feature quantity may be a vector or matrix whose elements are the temporal changes in the distances between arbitrary joints of different skeletons.
  • A distance between arbitrary joints of different skeletons is, for example, the distance between the neck of skeleton A and the wrist of skeleton B. Both forms are sketched below.
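  • The following sketch illustrates the two feature forms mentioned above: concatenating the m-dimensional skeleton vectors of n subjects, and collecting distances between joints of different skeletons. The joint names, array shapes, and example values are assumptions for illustration.

```python
import numpy as np

def concatenate_skeletons(skeleton_vectors):
    """Concatenate n skeleton vectors of length m into one (m * n)-dimensional feature."""
    return np.concatenate(skeleton_vectors)

def cross_skeleton_distances(skeleton_a, skeleton_b, pairs):
    """Distances between selected joints of two different skeletons,
    e.g. the neck of skeleton A and the wrist of skeleton B."""
    return np.array([
        np.linalg.norm(np.array(skeleton_a[ja]) - np.array(skeleton_b[jb]))
        for ja, jb in pairs
    ])

a = {"neck": (310.0, 140.0), "right_wrist": (355.0, 260.0)}
b = {"neck": (395.0, 150.0), "right_wrist": (370.0, 255.0)}
print(cross_skeleton_distances(a, b, [("neck", "right_wrist"), ("right_wrist", "neck")]))
```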
  • Step S52 Mutual identification process
  • the mutual identification unit 26 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • The mutual identification unit 26 identifies the behavior of the plurality of subjects as a whole, in consideration of mutual behavior, by using as input the feature quantity calculated in step S51 from the skeleton information of the plurality of subjects included in the target set.
  • Specifically, the mutual identification unit 26 identifies the subjects' behavior by using a mutual model that takes as input the feature quantity of the skeleton information of a plurality of people and outputs a mutual label indicating the behavior of the plurality of people in consideration of mutual behavior.
  • the mutual model is a trained model generated by using a neural network or the like and is stored in the storage 13 in advance. That is, by inputting the feature amount calculated in step S51 into the mutual model, the mutual identification unit 26 acquires a mutual label indicating the behavior of the plurality of subjects included in the target set as a whole.
  • the mutual identification unit 26 writes the mutual label in the memory 12.
  • As described above, the behavior identification device 10 according to Embodiment 3, like the behavior identification device 10 according to Embodiment 1, identifies the behavior of a plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. As a result, even behaviors with similar postures and movements are likely to be correctly discriminated, which improves the accuracy of behavior recognition.
  • In Embodiment 3, the behavior is identified by using a mutual model, which is a trained model generated using a neural network or the like. However, a mutual rule may be used instead of the mutual model.
  • the mutual rule is a rule in which the feature amount of the skeletal information of a plurality of people and the mutual label indicating the behavior as a plurality of people are associated with each other.
  • In this case, in step S52 of FIG. 10, the mutual identification unit 26 refers to the mutual rule and acquires the mutual label corresponding to the feature quantity as information indicating the behavior of the plurality of subjects as a whole.
  • In Embodiment 3, the behavior of the plurality of subjects as a whole is identified.
  • However, as in Modification 2, the behavior identification device 10 may identify which behavior each subject is performing within the overall behavior.
  • In this case, the mutual identification unit 26 of the behavior identification device 10 targets each subject and identifies, from the overall behavior and the skeleton information of the target subject, the behavior of the target subject within the overall behavior. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the target subject's skeleton information, and then identifies the behavior of the target subject within the overall behavior from the overall behavior and that individual behavior.
  • Embodiment 4 is different from the second embodiment in that a mutual model according to the third embodiment is generated. In the fourth embodiment, these different points will be described, and the same points will be omitted. Since the individual model is not used in the third embodiment, the individual model is not generated in the fourth embodiment.
  • the operation of the learning device 50 according to the fourth embodiment will be described with reference to FIG. 7.
  • the operation procedure of the learning device 50 according to the fourth embodiment corresponds to the learning method according to the fourth embodiment.
  • the program that realizes the operation of the learning device 50 according to the fourth embodiment corresponds to the learning program according to the fourth embodiment.
  • Step S41 Learning data acquisition process
  • the learning data acquisition unit 61 acquires learning data in which the feature amounts of the skeletal information of a plurality of people and the behaviors of the plurality of people are associated with each other.
  • the learning data is generated by calculating the feature amount from the video data obtained by imaging a plurality of people who actually performed the specified mutual action. That is, the calculated feature amount and the behavior of each person in the designated mutual behavior are associated with each other to obtain learning data.
  • Step S42 Model generation process
  • The model generation unit 62 receives the learning data acquired in step S41 as input, performs learning, and generates a mutual model.
  • the model generation unit 62 writes the mutual model in the storage 13 of the behavior identification device 10.
  • As described above, the learning device 50 according to Embodiment 4 generates, based on the learning data, the mutual model used by the behavior identification device 10. Thereby, by providing appropriate learning data, the recognition accuracy of the mutual model used by the behavior identification device 10 can be improved.
  • the behavior specifying device 10 may use a mutual rule instead of the mutual model.
  • When a mutual rule is used instead of the mutual model, the model generation unit 62 generates a mutual rule instead of the mutual model in step S42 of FIG. 7. Specifically, the model generation unit 62 generates, as the mutual rule, a database in which the feature quantity shown in each piece of learning data acquired in step S41 is associated with the behavior of the plurality of people in consideration of mutual behavior.
  • Embodiment 5: Embodiment 5 differs from Embodiment 3 in the method of calculating the feature quantity from the skeleton information. In Embodiment 5, these differences are described and the common points are omitted.
  • In Embodiment 5, the skeleton information from at least one time step earlier is required when calculating the feature quantity from the skeleton information. Therefore, in Embodiment 5, after the skeleton information is acquired in step S12 of FIG. 2, the skeleton information is stored in a skeleton information database realized by the storage 13.
  • Step S51 Feature calculation process
  • the feature amount calculation unit 27 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • the feature amount calculation unit 27 calculates the feature amount based on the skeletal information of each of the plurality of subjects included in the target set.
  • the feature amount calculation unit 27 writes the feature amount in the feature amount database realized by the storage 13. Specifically, the feature amount calculation unit 27 calculates the feature amount from the skeletal information of each of the plurality of subjects included in the target set. Then, the feature amount calculation unit 27 adds the current time t as an index to the calculated feature amount and writes it in the feature amount database.
  • the calculated feature amount and the calculation method thereof will be described later.
  • Step S52 Mutual identification process
  • the mutual identification unit 26 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • The mutual identification unit 26 identifies the behavior of the plurality of subjects as a whole, in consideration of mutual behavior, by using as input the feature quantity calculated in step S51 from the skeleton information of the plurality of subjects included in the target set.
  • the mutual identification unit 26 acquires the feature quantities of a plurality of subjects included in the target set from the feature quantity database.
  • The mutual identification unit 26 identifies the subjects' behavior by using a mutual model that takes as input the feature quantities of a plurality of people and outputs a mutual label indicating the behavior of the plurality of people in consideration of mutual behavior.
  • the mutual model is a trained model generated by using a neural network or the like and is stored in the storage 13 in advance. That is, by inputting the feature amount calculated in step S51 into the mutual model, the mutual identification unit 26 acquires a mutual label indicating the behavior of the plurality of subjects included in the target set as a whole. The mutual identification unit 26 writes the mutual label in the memory 12.
  • the feature amount acquired by the mutual identification unit 26 from the feature amount database may not be one calculated at a certain time, but may be a plurality of consecutive feature amounts in a time series.
  • In this case, the mutual identification unit 26 identifies the behavior of the plurality of subjects included in the target set based on the transition of the feature quantities and acquires the corresponding mutual label.
  • the mutual model is a model that inputs the transition of the feature quantity of a plurality of people and outputs a mutual label indicating the behavior as a plurality of people in consideration of the mutual behavior.
  • The feature quantity calculation process (step S51 in FIG. 10) according to Embodiment 5 will be described. (Step S61: Skeleton information acquisition process)
  • the feature amount calculation unit 27 sets each group determined to be a group performing mutual action as a target group.
  • The feature quantity calculation unit 27 acquires, from the skeleton information database, the skeleton information at the current time and the skeleton information one time step earlier for each of the plurality of subjects included in the target set.
  • Step S62 Speed calculation process
  • The feature quantity calculation unit 27 calculates the feature quantity by using, for each of the plurality of subjects, the skeleton information at the current time and the skeleton information one time step earlier acquired in step S61. Specifically, the feature quantity calculation unit 27 calculates a vector or matrix whose elements are the movement distances of each joint of the subject's skeleton between the two consecutive frames acquired in step S61. Since each movement distance calculated in this way is the distance the joint moves over the time interval between the two frames, it can be regarded as the speed of that joint. Then, the feature quantity calculation unit 27 takes the sum or average of the joint speeds to obtain a scalar, treats this scalar as the speed of the subject's entire skeleton, and uses this speed as the feature quantity.
  • Alternatively, the feature quantity calculation unit 27 may acquire skeleton information for a time width N, from the current time t back to the past time t-N.
  • In this case, the feature quantity calculation unit 27 generates, for each pair of consecutive times, a vector or matrix whose elements are the movement distances of each joint of the skeleton.
  • The feature quantity calculation unit 27 then sums the movement distances of each joint in the time direction and divides by the time width N, thereby calculating the average movement distance from the current time t to the past time t-N as the speed of each joint. That is, for each joint, the feature quantity calculation unit 27 totals the movement distances between consecutive times calculated for that joint, divides the total by the time width N, and obtains the average movement distance of the joint.
  • The feature quantity calculation unit 27 treats this average movement distance as the speed of the joint. Then, the feature quantity calculation unit 27 takes the sum or average of the joint speeds to obtain a scalar, treats this scalar as the speed of the subject's entire skeleton, and uses this speed as the feature quantity. A minimal sketch of this calculation follows.
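  • A minimal sketch of this speed feature, assuming the skeleton information is stored as an array of joint coordinates per time step. The array shapes and the averaging choices follow the description above, but the function name and example data are illustrative assumptions.

```python
import numpy as np

def skeleton_speed(positions):
    """positions: array of shape (T, J, 2) holding the (x, y) positions of J joints over T times.
    Returns a scalar: the average per-joint speed over the time width T - 1."""
    step_distances = np.linalg.norm(np.diff(positions, axis=0), axis=2)  # (T-1, J) per-step distances
    joint_speeds = step_distances.sum(axis=0) / step_distances.shape[0]  # average distance per joint
    return joint_speeds.mean()                                           # sum or average over joints

# Two subjects over N + 1 = 5 time steps with 17 joints each; one speed per subject.
subject_positions = [np.random.rand(5, 17, 2), np.random.rand(5, 17, 2)]
feature = np.array([skeleton_speed(p) for p in subject_positions])
print(feature)
```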
  • In the above description, the feature quantity is a scalar.
  • However, the feature quantity calculation unit 27 may use, as the feature quantity, vector data whose elements are the speeds of the individual joints, without taking the sum or average over all joints.
  • The feature quantity calculation unit 27 may also calculate the feature quantity from any number of the extracted joints of the subject's skeleton. Alternatively, the feature quantity calculation unit 27 may add or average the feature quantities calculated for an arbitrary number of joints, thereby obtaining a smaller number of feature quantities than the number of joints from which they were extracted. Further, the feature quantity calculation unit 27 may add or average the feature quantities calculated for the individual subjects to obtain a single feature quantity.
  • When a joint position cannot be acquired, the feature quantity calculation unit 27 may complement the missing joint position, or the feature quantity related to the missing joint, based on past feature quantities stored in the feature quantity database or based on the joints whose positions could be acquired.
  • For example, it is conceivable to use, as the feature quantity at the time when the joint position could not be acquired, the feature quantity from one time step earlier, or to calculate it by linearly interpolating the displacement of the feature quantity over the past several time steps.
  • Alternatively, the feature quantity calculation unit 27 may calculate the average speed per joint from the speeds of all the joints whose positions could be acquired, and use this average as the speed of a joint whose position could not be acquired. The average speed per joint may also be calculated only from those joints around the missing joint whose positions could be acquired.
  • Further, the feature quantity calculation unit 27 may complement the position of a joint that could not be acquired with the position of the joint that forms a left-right pair with it, or with the position of a connected joint; for example, the position of the right knee that could not be acquired may be complemented with the position of the left knee. Two of these strategies are sketched below.
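  • The following sketch shows two of the complementing strategies mentioned above: reusing the value from one time step earlier, and substituting the position of the left-right paired joint. The joint names and the pairing table are assumptions for illustration.

```python
def complement_joint(joints_now, joints_prev, joint_name):
    """Return a position for joint_name, falling back to the previous time step
    or to the left/right paired joint when the position could not be acquired."""
    paired = {"right_knee": "left_knee", "left_knee": "right_knee",
              "right_wrist": "left_wrist", "left_wrist": "right_wrist"}  # assumed pairing table
    if joints_now.get(joint_name) is not None:
        return joints_now[joint_name]
    if joints_prev.get(joint_name) is not None:   # value from one time step earlier
        return joints_prev[joint_name]
    partner = paired.get(joint_name)
    if partner is not None and joints_now.get(partner) is not None:
        return joints_now[partner]                # left/right paired joint
    return None                                   # leave the gap if nothing is available

now = {"right_knee": None, "left_knee": (210.0, 420.0)}
prev = {"right_knee": None, "left_knee": (208.0, 418.0)}
print(complement_joint(now, prev, "right_knee"))  # (210.0, 420.0), taken from the left knee
```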
  • As described above, the behavior identification device 10 according to Embodiment 5, like the behavior identification device 10 according to Embodiment 1, identifies the behavior of a plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. As a result, even behaviors with similar postures and movements are likely to be correctly discriminated, which improves the accuracy of behavior recognition.
  • Further, the behavior identification device 10 according to Embodiment 5 uses, as the feature quantity, the speed calculated from skeleton information acquired over two or more frames. For example, if the skeleton speed calculated from time-series skeleton information over a relatively long time width, such as several seconds, is used as the feature quantity, the behavior is likely to be correctly determined even if the subject's skeletal joints are erroneously extracted due to the person's orientation or partial concealment of the body by occlusion.
  • the behavior specifying device 10 may use a mutual rule instead of the mutual model.
  • In Embodiment 5, the behavior of the plurality of subjects as a whole is identified.
  • However, as in Modification 2, the behavior identification device 10 may identify which behavior each subject is performing within the overall behavior.
  • In this case, the mutual identification unit 26 of the behavior identification device 10 targets each subject and identifies, from the overall behavior and the skeleton information of the target subject, the behavior of the target subject within the overall behavior. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the target subject's skeleton information, and then identifies the behavior of the target subject within the overall behavior from the overall behavior and that individual behavior.
  • Embodiment 6: Embodiment 6 differs from Embodiments 3 and 5 in the method of calculating the feature quantity from the skeleton information. In Embodiment 6, these differences are described and the common points are omitted. Embodiment 6 is explained mainly in terms of its differences from Embodiment 5.
  • Step S71 Skeleton information acquisition process
  • the feature amount calculation unit 27 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • The feature quantity calculation unit 27 acquires, from the skeleton information database, the skeleton information from the current time t back to N time steps earlier for each of the plurality of subjects included in the target set.
  • the feature amount calculation unit 27 sets data in which the acquired skeleton information is arranged in time series as time series data.
  • The time-series data is data in which the skeleton information for a target period of a certain length, for example several seconds, is arranged in chronological order. The data should contain skeleton information for at least two times arranged in chronological order, and desirably contains skeleton information for three or more times.
  • Step S72 Travel distance calculation process
  • the feature amount calculation unit 27 calculates the movement distance of each joint of the skeleton of the subject subject between the skeleton information of two consecutive times in the time series. Specifically, the feature amount calculation unit 27 calculates the movement distance of the target joint by calculating the difference in the position of the target joint between the skeletal information at two times for each joint.
  • the feature amount calculation unit 27 generates a vector or a matrix having the movement distance of each joint as an element. In the following, it will be described assuming that a vector having the movement distance of each joint as an element is generated.
  • Step S73 Momentum calculation process
  • The feature quantity calculation unit 27 sums, in the time direction, the vectors generated in step S72 whose elements are the movement distances of each joint. That is, the feature quantity calculation unit 27 totals, for each joint, the movement distances between consecutive times calculated for that joint.
  • The value calculated in this way is the total movement distance of each joint over the time width N from the current time t to the past time t-N. Therefore, this value can be regarded as the momentum of each joint in the time width N.
  • The feature quantity calculation unit 27 obtains a scalar by summing or averaging the momentums of all the joints, and regards this scalar as the momentum of the subject's entire skeleton in the time width N. Then, the feature quantity calculation unit 27 uses this momentum as the feature quantity, as sketched below.
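  • A minimal sketch of the momentum feature under the same array layout assumed in Embodiment 5: the per-step movement distances are summed over the time width and then aggregated over joints. The function name, shapes, and synthetic example are illustrative assumptions.

```python
import numpy as np

def skeleton_momentum(positions):
    """positions: array of shape (T, J, 2) with joint positions from time t-N to t (T = N + 1).
    Returns a scalar regarded as the momentum of the whole skeleton over the time width N."""
    step_distances = np.linalg.norm(np.diff(positions, axis=0), axis=2)  # (T-1, J)
    joint_momentum = step_distances.sum(axis=0)   # total movement distance of each joint
    return joint_momentum.mean()                  # sum or average over all joints

positions = np.cumsum(np.random.rand(6, 17, 2), axis=0)  # a synthetic drifting skeleton
print(skeleton_momentum(positions))
```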
  • the feature amount calculation unit 27 may use vector data having the momentum of each joint as an element as the feature amount without taking the total or average value of the momentums of all the joints.
  • The feature quantity calculation unit 27 may calculate the feature quantity from any number of the extracted joints of the subject's skeleton. Alternatively, the feature quantity calculation unit 27 may add or average the feature quantities calculated for an arbitrary number of joints, thereby obtaining a smaller number of feature quantities than the number of joints from which they were extracted.
  • the feature amount calculation unit 27 may supplement the position of the joint that could not be acquired or the feature amount related to the joint that could not be acquired.
  • As described above, the behavior identification device 10 according to Embodiment 6, like the behavior identification device 10 according to Embodiment 1, identifies the behavior of a plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. As a result, even behaviors with similar postures and movements are likely to be correctly discriminated, which improves the accuracy of behavior recognition.
  • Further, the behavior identification device 10 according to Embodiment 6 uses, as the feature quantity, the momentum calculated from skeleton information acquired over past frames. For example, by using as the feature quantity the skeleton momentum calculated from time-series skeleton information over a relatively long time width, such as several seconds, the behavior is likely to be correctly determined even if the subject's skeletal joints are erroneously extracted due to the person's orientation or partial concealment of the body by occlusion.
  • the behavior specifying device 10 may use a mutual rule instead of the mutual model.
  • In Embodiment 6, as in Embodiment 1, the behavior of the plurality of subjects as a whole is identified.
  • However, as in Modification 2, the behavior identification device 10 may identify which behavior each subject is performing within the overall behavior.
  • In this case, the mutual identification unit 26 of the behavior identification device 10 targets each subject and identifies, from the overall behavior and the skeleton information of the target subject, the behavior of the target subject within the overall behavior. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the target subject's skeleton information, and then identifies the behavior of the target subject within the overall behavior from the overall behavior and that individual behavior.
  • Embodiment 7: Embodiment 7 differs from Embodiments 3, 5, and 6 in the feature quantity calculated from the skeleton information. In Embodiment 7, these differences are described and the common points are omitted. Embodiment 7 is explained mainly in terms of its differences from Embodiment 6.
  • The feature quantity calculation process (step S51 in FIG. 10) according to Embodiment 7 will be described. (Step S81: Skeleton information acquisition process)
  • the feature amount calculation unit 27 sets each group determined to be a group performing mutual action in step S14 as the target group.
  • The feature quantity calculation unit 27 acquires, from the skeleton information database, the skeleton information from the current time t back to N time steps earlier for each of the plurality of subjects included in the target set.
  • the feature amount calculation unit 27 sets data in which the acquired skeleton information is arranged in time series as time series data.
  • Step S82 Trajectory calculation process
  • The feature quantity calculation unit 27 generates, as the feature quantity, a vector or matrix in which the position information of the joints of the target subject's skeleton at each time from the current time t back to the past time t-N, represented by the time-series skeleton information generated in step S81, is arranged in chronological order. In the following, it is assumed that a vector in which the joint position information is arranged in chronological order is generated. The vector generated in this way has, as its elements, the joint positions of the skeleton arranged in time series, and therefore represents the movement path of each joint from time t-N to time t, that is, its trajectory.
  • For example, when the joint positions are two-dimensional, each joint position is expressed as (x, y) using a coordinate value x representing the horizontal position and a coordinate value y representing the vertical position. A minimal sketch of this trajectory feature follows.
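  • A minimal sketch of the trajectory feature: the joint positions over the time window are simply arranged in chronological order into one vector. The function name, array shapes, and example data are illustrative assumptions.

```python
import numpy as np

def skeleton_trajectory(positions):
    """positions: array of shape (T, J, 2) with (x, y) joint positions from time t-N to t.
    Returns a 1-D vector in which the joint positions are arranged in chronological order,
    i.e. the movement path (trajectory) of the joints over the window."""
    return positions.reshape(-1)

positions = np.random.rand(5, 17, 2)     # N + 1 = 5 times, 17 joints, 2-D coordinates
trajectory_feature = skeleton_trajectory(positions)
print(trajectory_feature.shape)          # (5 * 17 * 2,) = (170,)
```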
  • The feature quantity calculation unit 27 may calculate the feature quantity for any number of the extracted joints of the subject's skeleton. Further, for positive integers M and m with m ≤ M, when the skeleton information contains M-dimensional joint position information, the feature quantity calculation unit 27 may calculate the feature quantity using only m of the coordinate values.
  • the feature amount calculation unit 27 may supplement the position of the joint that could not be acquired or the feature amount related to the joint that could not be acquired.
  • As described above, the behavior identification device 10 according to Embodiment 7, like the behavior identification device 10 according to Embodiment 1, identifies the behavior of a plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. As a result, even behaviors with similar postures and movements are likely to be correctly discriminated, which improves the accuracy of behavior recognition.
  • Further, the behavior identification device 10 according to Embodiment 7 uses, as the feature quantity, the trajectory calculated from skeleton information acquired over past frames. For example, by using as the feature quantity the skeleton trajectory calculated from time-series skeleton information over a relatively long time width, such as several seconds, the behavior is likely to be correctly determined even if the subject's skeletal joints are erroneously extracted due to the person's orientation or partial concealment of the body by occlusion.
  • The behavior specifying device 10 may use a mutual rule instead of the mutual model.
  • <Modification 17> In the seventh embodiment, as in the first embodiment, the behavior of the plurality of subjects as a whole is specified. However, the behavior specifying device 10 may further specify which behavior each subject is performing within the behavior as a whole, as in Modification 2.
  • In this case, the mutual identification unit 26 of the behavior specifying device 10 targets each subject and identifies, from the behavior as a whole and the skeleton information of the target subject, the behavior of that subject within the overall behavior. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the skeleton information of the target subject, and then identifies the behavior of that subject within the overall behavior from the behavior as a whole and the individual behavior.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

In the present invention, a skeletal information acquisition unit (22) targets each of multiple subjects, who are a plurality of persons appearing in video data acquired by a video acquisition unit (21), and acquires, for the targeted subject, skeletal information indicating the positions of the joints of the skeleton. An action identification unit (24) identifies, from the skeletal information acquired by the skeletal information acquisition unit (22) for each of the plurality of subjects, an action taken by the plurality of subjects as a whole, taking into consideration a mutual action, which is an action in which the plurality of subjects influence one another.

Description

Behavior identification device, behavior identification method, and behavior identification program
This disclosure relates to a technique for identifying human behavior based on skeleton information indicating the positions of the joints of the human skeleton.
Patent Document 1 describes a human behavior recognition technique that uses skeleton information. In the technique described in Patent Document 1, for each person shown in the video, image information around the target person is acquired, the skeleton information of the target person is extracted, and image information from which the movement of the target person can be confirmed is generated from the skeleton information. Then, based on the generated image information and image information of a determination person attribute stored in advance, it is determined whether the attribute of the target person is the determination person attribute.
Japanese Unexamined Patent Publication No. 2019-046481
In the technique described in Patent Document 1, behavior is recognized based on the skeleton information of a single person, the target person. Therefore, behaviors whose postures and movements are similar in that, for example, "an arm is stretched forward", such as "shaking hands" and "hitting", may not be discriminated correctly.
The purpose of this disclosure is to improve the accuracy of behavior recognition.
The behavior identification device according to this disclosure includes:
a skeleton information acquisition unit that acquires, for each of a plurality of subjects who are a plurality of people shown in video data, skeleton information indicating the positions of the joints of the subject's skeleton; and
a behavior identification unit that identifies, from the skeleton information acquired by the skeleton information acquisition unit for each of the plurality of subjects, the behavior of the plurality of subjects as a whole, in consideration of mutual behavior, which is behavior in which the plurality of subjects influence each other.
In this disclosure, the behavior of a plurality of subjects as a whole is identified in consideration of mutual behavior, which is behavior in which the plurality of subjects influence each other. As a result, there is a high possibility that even behaviors with similar postures and movements can be correctly discriminated, and the accuracy of behavior recognition can be improved.
A configuration diagram of the behavior specifying device 10 according to Embodiment 1.
A flowchart showing the overall operation of the behavior specifying device 10 according to Embodiment 1.
A flowchart of the behavior identification process according to Embodiment 1.
A configuration diagram of the behavior specifying device 10 according to Modification 3.
A configuration diagram of the learning device 50 according to Embodiment 2.
A flowchart showing the operation in which the learning device 50 according to Embodiment 2 generates an individual model.
A flowchart showing the operation in which the learning device 50 according to Embodiment 2 generates a mutual model.
A configuration diagram of the learning device 50 according to Modification 6.
A configuration diagram of the behavior specifying device 10 according to Embodiment 3.
A flowchart showing the operation of the behavior specifying device 10 according to Embodiment 3.
A flowchart of the feature amount calculation process according to Embodiment 5.
A flowchart of the feature amount calculation process according to Embodiment 6.
A flowchart of the feature amount calculation process according to Embodiment 7.
Embodiment 1.
*** Explanation of configuration ***
The configuration of the behavior specifying device 10 according to the first embodiment will be described with reference to FIG. 1.
The behavior specifying device 10 is a computer.
The behavior specifying device 10 includes, as hardware, a processor 11, a memory 12, a storage 13, and a communication interface 14. The processor 11 is connected to the other hardware via signal lines and controls the other hardware.
The processor 11 is an IC (Integrated Circuit) that performs processing. Specific examples of the processor 11 are a CPU (Central Processing Unit), a DSP (Digital Signal Processor), and a GPU (Graphics Processing Unit).
The memory 12 is a storage device that temporarily stores data. Specific examples of the memory 12 are an SRAM (Static Random Access Memory) and a DRAM (Dynamic Random Access Memory).
The storage 13 is a storage device that stores data. A specific example of the storage 13 is an HDD (Hard Disk Drive). The storage 13 may also be a portable recording medium such as an SD (registered trademark, Secure Digital) memory card, CF (CompactFlash, registered trademark), NAND flash, flexible disk, optical disk, compact disc, Blu-ray (registered trademark) disc, or DVD (Digital Versatile Disk).
The communication interface 14 is an interface for communicating with an external device. Specific examples of the communication interface 14 are Ethernet (registered trademark), USB (Universal Serial Bus), and HDMI (registered trademark, High-Definition Multimedia Interface) ports.
The behavior specifying device 10 is connected to a camera 31 via the communication interface 14. The camera 31 may be a general 2D (Dimension) camera or a 3D camera. Using a 3D camera as the camera 31 also provides depth information, so the positions of a person's joints can be identified appropriately in the processing described later.
The behavior specifying device 10 includes, as functional components, a video acquisition unit 21, a skeleton information acquisition unit 22, a correlation determination unit 23, and a behavior identification unit 24. The behavior identification unit 24 includes an individual identification unit 25 and a mutual identification unit 26. The functions of the functional components of the behavior specifying device 10 are realized by software.
The storage 13 stores a program that realizes the functions of the functional components of the behavior specifying device 10. This program is read into the memory 12 by the processor 11 and executed by the processor 11. As a result, the functions of the functional components of the behavior specifying device 10 are realized.
Although only one processor 11 is shown in FIG. 1, there may be a plurality of processors 11, and the plurality of processors 11 may cooperate to execute the programs that realize the functions.
*** Explanation of operation ***
The operation of the behavior specifying device 10 according to the first embodiment will be described with reference to FIGS. 2 and 3.
The operation procedure of the behavior specifying device 10 according to the first embodiment corresponds to the behavior identification method according to the first embodiment. The program that realizes the operation of the behavior specifying device 10 according to the first embodiment corresponds to the behavior identification program according to the first embodiment.
The overall operation of the behavior specifying device 10 according to the first embodiment will be described with reference to FIG. 2.
(Step S11: Video acquisition process)
The video acquisition unit 21 acquires the video data captured by the camera 31 and writes the video data to the memory 12.
(Step S12: Skeleton information acquisition process)
The skeleton information acquisition unit 22 acquires, for each subject who is one of the one or more people shown in the video data acquired in step S11, skeleton information indicating the positions of the joints of that subject's skeleton.
Specifically, the skeleton information acquisition unit 22 reads the video data from the memory 12 and sets each of the one or more subjects shown in the video data as the target subject. The skeleton information acquisition unit 22 identifies the positions of the joints of the target subject's skeleton, assigns an index that distinguishes the subject, and generates the skeleton information. The positions of the joints are represented by coordinate values or the like. The skeleton information acquisition unit 22 writes the skeleton information to the memory 12.
The skeleton information acquisition unit 22 may include in the skeleton information the joint positions identified from a single frame of the video data, or the joint positions identified from a plurality of frames of the video data.
Methods of extracting the positions of a person's joints from video data include a method using deep learning and a method of physically attaching markers to the subject's joints and identifying each joint by recognizing its marker.
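As a rough illustration of the kind of data step S12 produces, the following Python sketch builds per-person skeleton records from the output of an arbitrary pose estimator. The names SkeletonInfo, acquire_skeleton_info, and the pose_estimator callable are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SkeletonInfo:
    person_index: int                   # index that distinguishes each subject
    joints: List[Tuple[float, float]]   # (x, y) position of each joint

def acquire_skeleton_info(frame, pose_estimator) -> List[SkeletonInfo]:
    """pose_estimator is any routine (e.g. a deep-learning pose model or a
    marker tracker) that returns one list of joint coordinates per person
    shown in the frame."""
    skeletons = []
    for idx, joints in enumerate(pose_estimator(frame)):
        skeletons.append(SkeletonInfo(person_index=idx, joints=list(joints)))
    return skeletons
```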
(Step S13: Number-of-people determination process)
The correlation determination unit 23 determines whether skeleton information for two or more people was acquired in step S12, that is, whether two or more people are shown in the video data.
If skeleton information for two or more people has been extracted, the correlation determination unit 23 determines that skeleton information for two or more people has been acquired and advances the process to step S14. Otherwise, the correlation determination unit 23 returns the process to step S11.
(Step S14: Correlation determination process)
The correlation determination unit 23 determines whether the plurality of subjects whose skeleton information was acquired in step S12 are performing a mutual action, that is, an action in which they influence each other. A mutual action is an action through which multiple people affect one another; specific examples are a handshake in which two people reach out and grip each other's hands, and a violent act in which one of two people hits the other.
Specifically, the correlation determination unit 23 takes each set of two or more pieces of skeleton information as a target set, and determines that the skeletons indicated by the skeleton information of the target set form a group performing a mutual action if the distance between those skeletons is smaller than a set threshold value. Alternatively, the correlation determination unit 23 may determine that the skeletons of the target set form a group performing a mutual action if the amounts of change, or the times of change, in the positions of certain joints of those skeletons are correlated with each other.
When there are groups determined to be performing a mutual action, the correlation determination unit 23 writes, for each such group, the indexes of the skeleton information included in that group to the memory 12, and then advances the process to step S15. When there is no group determined to be performing a mutual action, the correlation determination unit 23 returns the process to step S11.
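For illustration, the distance-based test of step S14 could look like the Python sketch below. The helper names and the choice of the mean joint position as the inter-skeleton distance are assumptions made for this example, not definitions taken from this disclosure.

```python
import numpy as np
from itertools import combinations

def skeleton_distance(a, b):
    """Distance between two skeletons, taken here as the distance
    between their mean joint positions (one possible definition)."""
    return float(np.linalg.norm(np.mean(a, axis=0) - np.mean(b, axis=0)))

def find_interacting_groups(skeletons, threshold):
    """skeletons: dict mapping person index -> list of (x, y) joints.
    Returns pairs of indices judged to be performing a mutual action
    because their skeletons are closer than the threshold."""
    groups = []
    for (i, a), (j, b) in combinations(skeletons.items(), 2):
        if skeleton_distance(np.asarray(a, dtype=float), np.asarray(b, dtype=float)) < threshold:
            groups.append((i, j))
    return groups

# Example: two people close together, one far away.
skeletons = {0: [(0.0, 0.0), (0.1, 0.5)],
             1: [(0.3, 0.0), (0.4, 0.5)],
             2: [(5.0, 0.0), (5.1, 0.5)]}
print(find_interacting_groups(skeletons, threshold=1.0))  # [(0, 1)]
```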
(Step S15: Behavior identification process)
The behavior identification unit 24 sets, as the target group, each group determined in step S14 to be a group performing a mutual action. From the skeleton information acquired in step S12 for each of the plurality of subjects included in the target group, the behavior identification unit 24 identifies the behavior of each of the plurality of subjects in consideration of mutual behavior, which is behavior in which the plurality of subjects influence each other.
The behavior identification process (step S15 in FIG. 2) according to the first embodiment will be described with reference to FIG. 3.
(Step S21: Individual identification process)
The individual identification unit 25 sets, as the target group, each group determined in step S14 to be a group performing a mutual action. For each of the plurality of subjects included in the target group, the individual identification unit 25 identifies, from the skeleton information of the target subject, the behavior of that subject as an individual behavior.
Specifically, the individual identification unit 25 identifies the individual behavior using an individual model that takes a person's skeleton information as input and outputs an individual label indicating that person's behavior. The individual model is a trained model generated using a neural network or the like and is assumed to be stored in the storage 13 in advance. That is, the individual identification unit 25 obtains the individual label indicating the individual behavior of the target subject by inputting the skeleton information of the target subject into the individual model. The individual identification unit 25 writes the individual label to the memory 12.
The individual behavior indicated by an individual label is behavior as a single person, for example, "stretching an arm forward", "falling down", or "bending backward".
(Step S22: Mutual identification process)
The mutual identification unit 26 sets, as the target group, each group determined in step S14 to be a group performing a mutual action. From the individual behaviors identified in step S21 for each of the plurality of subjects included in the target group, the mutual identification unit 26 identifies the behavior of the plurality of subjects in the target group as a whole, in consideration of mutual behavior. Considering mutual behavior means that, when the behavior of one subject is identified, the behavior of the other subjects is taken into account; in other words, the behavior of one subject is identified based on the behavior of the other subjects.
Specifically, the mutual identification unit 26 identifies the subjects' behavior using a mutual model that takes as input a set of individual labels, each indicating the individual behavior of one of the plurality of people, and outputs a mutual label indicating the behavior of the plurality of people in consideration of mutual behavior. The mutual model is a trained model generated using a neural network or the like and is assumed to be stored in the storage 13 in advance. That is, the mutual identification unit 26 obtains the mutual label indicating the behavior of the plurality of subjects in the target group as a whole by inputting, into the mutual model, the set of individual labels identified in step S21 for the plurality of subjects included in the target group. The mutual identification unit 26 writes the mutual label to the memory 12.
The behavior indicated by a mutual label is behavior as a plurality of people, for example, "shaking hands" or "one person hits and the other person is hit". As a specific example, when the target group contains two subjects and the individual behavior of both subjects is "stretching an arm forward", the behavior indicated by the mutual label is "handshake". When the target group contains two subjects, the individual behavior of one subject is "stretching an arm forward", and the individual behavior of the other subject is "bending backward", the behavior indicated by the mutual label is "violence". Even when the target group contains three or more subjects, the behavior can likewise be identified from the combination of their individual actions.
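Steps S21 and S22 together could be sketched in Python as follows, assuming trained model objects that expose a predict() method returning a label for each input; the interface and names are illustrative only and are not the actual trained models of this embodiment.

```python
import numpy as np

def identify_group_action(individual_model, mutual_model, skeleton_vectors):
    """skeleton_vectors: one flattened joint-position vector per subject in the group.
    Step S21: obtain an individual label for each subject from the individual model.
    Step S22: feed the set of individual labels to the mutual model to obtain the
    label of the group's behaviour as a whole (e.g. "handshake", "violence")."""
    individual_labels = [individual_model.predict(np.asarray([v]))[0]
                         for v in skeleton_vectors]
    mutual_label = mutual_model.predict([individual_labels])[0]
    return individual_labels, mutual_label
```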
*** Effects of Embodiment 1 ***
As described above, the behavior specifying device 10 according to the first embodiment identifies the behavior of a plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence each other. As a result, there is a high possibility that even behaviors with similar postures and movements can be correctly discriminated, and the accuracy of behavior recognition can be improved.
*** Other configurations ***
<Modification 1>
In the first embodiment, behavior is identified using the individual model and the mutual model, which are trained models generated using a neural network or the like. However, instead of at least one of the individual model and the mutual model, a rule that associates inputs with outputs may be used.
The rule used instead of the individual model is an individual rule that associates a person's skeleton information with an individual label indicating that person's behavior. In other words, the individual rule is a rule from which an individual label is obtained as output when a person's skeleton information is given as input.
When an individual rule is used instead of the individual model, in step S21 of FIG. 3 the individual identification unit 25 refers to the individual rule and obtains, as information indicating the individual behavior of the target subject, the individual label corresponding to the skeleton information of the target subject. At this time, the individual identification unit 25 obtains the individual label associated with the skeleton information that is most similar to the skeleton information of the target subject.
The rule used instead of the mutual model is a mutual rule that associates a set of individual labels, each indicating the individual behavior of one of a plurality of people, with a mutual label indicating the behavior of the plurality of people as a whole. In other words, the mutual rule is a rule from which a mutual label indicating the behavior of the plurality of people is obtained as output when a set of individual labels is given as input.
When a mutual rule is used instead of the mutual model, in step S22 of FIG. 3 the mutual identification unit 26 refers to the mutual rule and obtains, as information indicating the behavior of the plurality of subjects as a whole, the mutual label corresponding to the set of individual labels of the plurality of subjects.
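A minimal sketch of this rule-based variant might look as follows, assuming the individual rule is held as (reference skeleton vector, label) pairs matched by nearest neighbour and the mutual rule as a lookup table; the label strings are illustrative only.

```python
import numpy as np

def apply_individual_rule(individual_rule, skeleton_vector):
    """individual_rule: list of (reference skeleton vector, individual label) pairs.
    Returns the label whose reference skeleton is most similar (smallest
    Euclidean distance) to the observed skeleton."""
    best_label, best_dist = None, float("inf")
    for ref, label in individual_rule:
        dist = float(np.linalg.norm(np.asarray(ref) - np.asarray(skeleton_vector)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def apply_mutual_rule(mutual_rule, individual_labels):
    """mutual_rule: mapping from a sorted tuple of individual labels to a mutual label."""
    return mutual_rule.get(tuple(sorted(individual_labels)), "unknown")

# Example mutual rule (labels are illustrative only).
mutual_rule = {("stretch arm forward", "stretch arm forward"): "handshake",
               ("bend backward", "stretch arm forward"): "violence"}
print(apply_mutual_rule(mutual_rule, ["stretch arm forward", "stretch arm forward"]))  # handshake
```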
<Modification 2>
In the first embodiment, the behavior of the plurality of subjects as a whole is identified. However, the behavior specifying device 10 may further identify which behavior each subject is performing within the behavior as a whole. In this case, the mutual identification unit 26 of the behavior specifying device 10 targets each subject and identifies, from the behavior as a whole and the individual label of the target subject, the behavior of that subject within the overall behavior.
In the first embodiment, an example was described in which, for a group of two people, the behavior indicated by the mutual label is "violence" when the individual behavior of one subject is "stretching an arm forward" and the individual behavior of the other subject is "bending backward". In this example, the behavior of the subject whose individual behavior is "stretching an arm forward" is "hitting the other person", and the behavior of the subject whose individual behavior is "bending backward" is "being hit by the other person".
<Modification 3>
In the first embodiment, the individual model and the mutual model are described as being stored in the storage 13. However, the individual model and the mutual model may be stored in a storage device external to the behavior specifying device 10. In this case, the behavior specifying device 10 accesses the individual model and the mutual model via the communication interface 14.
<Modification 4>
In the first embodiment, the functional components are realized by software. However, as Modification 4, the functional components may be realized by hardware. The points of Modification 4 that differ from the first embodiment will be described.
The configuration of the behavior specifying device 10 according to Modification 4 will be described with reference to FIG. 4.
When the functional components are realized by hardware, the behavior specifying device 10 includes an electronic circuit 15 instead of the processor 11, the memory 12, and the storage 13. The electronic circuit 15 is a dedicated circuit that realizes the functions of the functional components, the memory 12, and the storage 13.
The electronic circuit 15 is assumed to be a single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, a logic IC, a GA (Gate Array), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array).
The functional components may be realized by one electronic circuit 15, or may be distributed over and realized by a plurality of electronic circuits 15.
<Modification 5>
As Modification 5, some of the functional components may be realized by hardware and the other functional components may be realized by software.
The processor 11, the memory 12, the storage 13, and the electronic circuit 15 are referred to as processing circuitry. That is, the functions of the functional components are realized by the processing circuitry.
Embodiment 2.
In the second embodiment, the process of generating the individual model and the mutual model will be described.
*** Explanation of configuration ***
The configuration of the learning device 50 according to the second embodiment will be described with reference to FIG. 5.
The learning device 50 is a computer.
The learning device 50 includes, as hardware, a processor 51, a memory 52, a storage 53, and a communication interface 54. The processor 51 is connected to the other hardware via signal lines and controls the other hardware.
Like the processor 11, the processor 51 is an IC that performs processing. Like the memory 12, the memory 52 is a storage device that temporarily stores data. Like the storage 13, the storage 53 is a storage device that stores data and may be a portable recording medium. Like the communication interface 14, the communication interface 54 is an interface for communicating with an external device.
The learning device 50 is connected to the behavior specifying device 10 via the communication interface 54.
The learning device 50 includes, as functional components, a learning data acquisition unit 61 and a model generation unit 62. The functions of the functional components of the learning device 50 are realized by software.
The storage 53 stores a program that realizes the functions of the functional components of the learning device 50. This program is read into the memory 52 by the processor 51 and executed by the processor 51. As a result, the functions of the functional components of the learning device 50 are realized.
Although only one processor 51 is shown in FIG. 5, there may be a plurality of processors 51, and the plurality of processors 51 may cooperate to execute the programs that realize the functions.
*** Explanation of operation ***
The operation of the learning device 50 according to the second embodiment will be described with reference to FIGS. 6 and 7.
The operation procedure of the learning device 50 according to the second embodiment corresponds to the learning method according to the second embodiment. The program that realizes the operation of the learning device 50 according to the second embodiment corresponds to the learning program according to the second embodiment.
The operation in which the learning device 50 according to the second embodiment generates the individual model will be described with reference to FIG. 6.
(Step S31: Learning data acquisition process)
The learning data acquisition unit 61 acquires learning data that associates skeleton information indicating the positions of the joints of a person's skeleton with that person's behavior.
For example, the learning data is generated by identifying skeleton information from video data obtained by imaging a person actually performing a specified behavior; the extracted skeleton information is associated with the specified behavior to form the learning data. The skeleton information may be vector data containing only the joint positions identified from one frame of the video data, or matrix data containing the joint positions identified from a plurality of frames.
(Step S32: Model generation process)
The model generation unit 62 performs learning using the learning data acquired in step S31 as input, and generates the individual model. The model generation unit 62 writes the individual model to the storage 13 of the behavior specifying device 10.
In the second embodiment, the model generation unit 62 uses the learning data to make a neural network learn the relationship between the positions of the joints of the skeleton and the behavior. For example, if the skeleton information indicates that the shoulder, elbow, and wrist are aligned in a straight line and that their vertical positions are roughly equal, the neural network learns that this represents the action of "stretching an arm forward". The configuration of the neural network used may be a well-known one such as a DNN (deep neural network), a CNN (convolutional neural network), or an RNN (recurrent neural network).
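As an informal sketch only, the training of step S32 could be written as below, here using PyTorch and a small fully connected network as a stand-in for the DNN/CNN/RNN mentioned above; the function name and the assumed data layout are not taken from this disclosure.

```python
import torch
from torch import nn

def train_individual_model(features, labels, num_classes, epochs=50):
    """features: float tensor of shape (num_samples, joint_dim) holding joint positions.
    labels: long tensor of shape (num_samples,) holding individual-action class ids.
    Trains a small classifier mapping skeleton information to an individual label."""
    model = nn.Sequential(nn.Linear(features.shape[1], 64), nn.ReLU(),
                          nn.Linear(64, num_classes))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(features), labels)  # full-batch training for simplicity
        loss.backward()
        optimizer.step()
    return model
```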
The operation in which the learning device 50 according to the second embodiment generates the mutual model will be described with reference to FIG. 7.
(Step S41: Learning data acquisition process)
The learning data acquisition unit 61 acquires learning data that associates a set of individual labels with the behavior of each of the plurality of people in which mutual behavior is taken into account.
For example, the learning data is generated by associating the individual labels indicating the individual behaviors of the people who actually performed a specified mutual action with their behavior as a plurality of people in that mutual action.
(Step S42: Model generation process)
The model generation unit 62 performs learning using the learning data acquired in step S41 as input, and generates the mutual model. The model generation unit 62 writes the mutual model to the storage 13 of the behavior specifying device 10.
In the second embodiment, the model generation unit 62 uses the learning data to make a neural network learn the relationship between a set of individual labels and the behavior as a plurality of people in which mutual behavior is taken into account. For example, for a group of two people whose individual behaviors are both "stretching an arm forward", the neural network learns that the behavior indicated by the mutual label for both subjects is "handshake". The configuration of the neural network used may be a well-known one such as a DNN (deep neural network), a CNN (convolutional neural network), or an RNN (recurrent neural network).
*** Effects of Embodiment 2 ***
As described above, the learning device 50 according to the second embodiment generates the individual model and the mutual model used by the behavior specifying device 10 based on learning data. Therefore, by providing appropriate learning data, the recognition accuracy of the individual model and the mutual model used by the behavior specifying device 10 can be increased.
*** Other configurations ***
<Modification 6>
As described in Modification 1, the behavior specifying device 10 may use an individual rule instead of the individual model, or a mutual rule instead of the mutual model.
When an individual rule is used instead of the individual model, in step S32 of FIG. 6 the model generation unit 62 generates an individual rule instead of the individual model. Specifically, the model generation unit 62 generates, as the individual rule, a database that associates the skeleton information indicating the positions of the joints of a person's skeleton, indicated by each piece of learning data acquired in step S31, with the individual label indicating that person's behavior.
When a mutual rule is used instead of the mutual model, in step S42 of FIG. 7 the model generation unit 62 generates a mutual rule instead of the mutual model. Specifically, the model generation unit 62 generates, as the mutual rule, a database that associates the set of individual labels indicated by each piece of learning data acquired in step S41 with the behavior as a plurality of people in which mutual behavior is taken into account.
<Modification 7>
In the second embodiment, the functional components are realized by software. However, as Modification 7, the functional components may be realized by hardware. The points of Modification 7 that differ from the second embodiment will be described.
The configuration of the learning device 50 according to Modification 7 will be described with reference to FIG. 8.
When the functional components are realized by hardware, the learning device 50 includes an electronic circuit 55 instead of the processor 51, the memory 52, and the storage 53. The electronic circuit 55 is a dedicated circuit that realizes the functions of the functional components, the memory 52, and the storage 53.
The electronic circuit 55 is assumed to be a single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, a logic IC, a GA (Gate Array), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array).
The functional components may be realized by one electronic circuit 55, or may be distributed over and realized by a plurality of electronic circuits 55.
<Modification 8>
As Modification 8, some of the functional components may be realized by hardware and the other functional components may be realized by software.
The processor 51, the memory 52, the storage 53, and the electronic circuit 55 are referred to as processing circuitry. That is, the functions of the functional components are realized by the processing circuitry.
Embodiment 3.
The third embodiment differs from the first embodiment in that the behavior of the plurality of subjects as a whole is identified, in consideration of mutual behavior, from a feature amount calculated from the plurality of pieces of skeleton information. In the third embodiment, only these differences are described, and descriptions of the common points are omitted.
*** Explanation of configuration ***
The configuration of the behavior specifying device 10 according to the third embodiment will be described with reference to FIG. 9.
The behavior specifying device 10 differs from the behavior specifying device 10 shown in FIG. 1 in that the behavior identification unit 24 includes a feature amount calculation unit 27 instead of the individual identification unit 25. The function of the feature amount calculation unit 27 is realized by software or hardware, like the other functions.
*** Explanation of operation ***
The operation of the behavior specifying device 10 according to the third embodiment will be described with reference to FIG. 10.
The operation procedure of the behavior specifying device 10 according to the third embodiment corresponds to the behavior identification method according to the third embodiment. The program that realizes the operation of the behavior specifying device 10 according to the third embodiment corresponds to the behavior identification program according to the third embodiment.
The behavior identification process (step S15 in FIG. 2) according to the third embodiment will be described with reference to FIG. 10.
(Step S51: Feature amount calculation process)
The feature amount calculation unit 27 sets, as the target group, each group determined in step S14 to be a group performing a mutual action. The feature amount calculation unit 27 calculates a feature amount based on the skeleton information of each of the plurality of subjects included in the target group.
Specifically, the feature amount calculation unit 27 calculates the feature amount by integrating the skeleton information of the plurality of subjects included in the target group. Alternatively, the feature amount calculation unit 27 may extract the feature amount from the skeleton information of each of the plurality of subjects included in the target group.
Here, the feature amount is calculated so that information on the positional relationship of the joints between the plural skeletons is retained. For example, suppose the skeleton information contains m coordinate values indicating the joint positions per person, so that one skeleton is represented by an m-dimensional vector. When the skeleton information of n people is integrated, the feature amount is an (m × n)-dimensional vector obtained by concatenating the n m-dimensional vectors, or a matrix with m rows and n columns. Alternatively, the feature amount is a vector or matrix whose elements are the temporal changes of the distances between arbitrary joints of different skeletons; the distance between arbitrary joints of different skeletons is, for example, the distance between the neck of skeleton A and the wrist of skeleton B.
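For illustration, the two kinds of feature amounts described above could be computed as in the following Python sketch, assuming each skeleton is given as a flat vector of (x, y) joint coordinates; the helper names are hypothetical.

```python
import numpy as np

def concatenate_skeletons(skeleton_vectors):
    """Each of the n subjects is an m-dimensional vector of joint coordinates;
    the group feature is the (m * n)-dimensional concatenation, which preserves
    the positional relationship between the skeletons."""
    return np.concatenate([np.asarray(v, dtype=float) for v in skeleton_vectors])

def cross_joint_distance(skeleton_a, skeleton_b, joint_a, joint_b):
    """Distance between one joint of skeleton A (e.g. the neck) and one joint of
    skeleton B (e.g. the wrist), assuming flat (x1, y1, x2, y2, ...) layouts;
    its change over time can also serve as a feature element."""
    a = np.asarray(skeleton_a, dtype=float).reshape(-1, 2)[joint_a]
    b = np.asarray(skeleton_b, dtype=float).reshape(-1, 2)[joint_b]
    return float(np.linalg.norm(a - b))
```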
(Step S52: Mutual identification process)
The mutual identification unit 26 sets, as the target group, each group determined in step S14 to be a group performing a mutual action. Using as input the feature amount calculated in step S51 from the skeleton information of the plurality of subjects included in the target group, the mutual identification unit 26 identifies the behavior of the plurality of subjects as a whole in consideration of mutual behavior.
Specifically, the mutual identification unit 26 identifies the subjects' behavior using a mutual model that takes as input the feature amount of the skeleton information of a plurality of people and outputs a mutual label indicating their behavior as a plurality of people in consideration of mutual behavior. The mutual model is a trained model generated using a neural network or the like and is assumed to be stored in the storage 13 in advance. That is, the mutual identification unit 26 obtains the mutual label indicating the behavior of the plurality of subjects in the target group as a whole by inputting the feature amount calculated in step S51 into the mutual model. The mutual identification unit 26 writes the mutual label to the memory 12.
*** Effects of Embodiment 3 ***
As described above, the behavior specifying device 10 according to the third embodiment, like the behavior specifying device 10 according to the first embodiment, identifies the behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence each other. As a result, there is a high possibility that even behaviors with similar postures and movements can be correctly discriminated, and the accuracy of behavior recognition can be improved.
*** Other configurations ***
<Modification 9>
In the third embodiment, behavior is identified using the mutual model, which is a trained model generated using a neural network or the like. However, as in Modification 1, a mutual rule may be used instead of the mutual model.
In this case, the mutual rule is a rule that associates the feature amount of the skeleton information of a plurality of people with a mutual label indicating their behavior as a plurality of people. When a mutual rule is used instead of the mutual model, in step S52 of FIG. 10 the mutual identification unit 26 refers to the mutual rule and obtains, as information indicating the behavior of the plurality of subjects as a whole, the mutual label corresponding to the feature amount.
<Modification 10>
In the third embodiment, as in the first embodiment, the behavior of the plurality of subjects as a whole is identified. However, as in Modification 2, the behavior specifying device 10 may further identify which behavior each subject is performing within the behavior as a whole. In this case, the mutual identification unit 26 of the behavior specifying device 10 targets each subject and identifies, from the behavior as a whole and the skeleton information of the target subject, the behavior of that subject within the overall behavior. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the skeleton information of the target subject, and then identifies the behavior of that subject within the overall behavior from the behavior as a whole and the individual behavior.
Embodiment 4.
The fourth embodiment differs from the second embodiment in that the mutual model according to the third embodiment is generated. In the fourth embodiment, only these differences are described, and descriptions of the common points are omitted.
Since the individual model is not used in the third embodiment, no individual model is generated in the fourth embodiment.
*** Explanation of operation ***
The operation of the learning device 50 according to the fourth embodiment will be described with reference to FIG. 7.
The operation procedure of the learning device 50 according to the fourth embodiment corresponds to the learning method according to the fourth embodiment. The program that realizes the operation of the learning device 50 according to the fourth embodiment corresponds to the learning program according to the fourth embodiment.
The operation in which the learning device 50 according to the fourth embodiment generates the mutual model will be described with reference to FIG. 7.
(Step S41: Learning data acquisition process)
The learning data acquisition unit 61 acquires learning data that associates the feature amount of the skeleton information of a plurality of people with their behavior as a plurality of people.
For example, the learning data is generated by calculating the feature amount from video data obtained by imaging a plurality of people actually performing a specified mutual action; the calculated feature amount is associated with each person's behavior in the specified mutual action to form the learning data.
(Step S42: Model generation process)
The model generation unit 62 performs learning using the learning data acquired in step S41 as input, and generates the mutual model. The model generation unit 62 writes the mutual model to the storage 13 of the behavior specifying device 10.
*** Effects of Embodiment 4 ***
As described above, the learning device 50 according to the fourth embodiment generates the mutual model used by the behavior specifying device 10 based on learning data. Therefore, by providing appropriate learning data, the recognition accuracy of the mutual model used by the behavior specifying device 10 can be increased.
*** Other configurations ***
<Modification 11>
As described in Modification 9, the behavior specifying device 10 may use a mutual rule instead of the mutual model.
When a mutual rule is used instead of the mutual model, in step S42 of FIG. 7 the model generation unit 62 generates a mutual rule instead of the mutual model. Specifically, the model generation unit 62 generates, as the mutual rule, a database that associates the feature amount indicated by each piece of learning data acquired in step S41 with the behavior as a plurality of people in which mutual behavior is taken into account.
Embodiment 5.
The fifth embodiment differs from the third embodiment in the method of calculating the feature amount from the skeleton information. In the fifth embodiment, only these differences are described, and descriptions of the common points are omitted.
In the fifth embodiment, skeleton information from at least one time step earlier is required when calculating the feature amount from the skeleton information. Therefore, in the fifth embodiment, after the skeleton information is acquired in step S12 of FIG. 2, the skeleton information is stored in a skeleton information database realized by the storage 13.
The behavior identification process according to the fifth embodiment (step S15 in FIG. 2) will be described with reference to FIG. 10.
(Step S51: Feature amount calculation process)
The feature amount calculation unit 27 sets, as a target set, each set determined in step S14 to be a set performing mutual behavior. The feature amount calculation unit 27 calculates a feature amount based on the skeleton information of each of the plurality of subjects included in the target set. The feature amount calculation unit 27 writes the feature amount into a feature amount database realized by the storage 13.
Specifically, the feature amount calculation unit 27 calculates the feature amount from the skeleton information of each of the plurality of subjects included in the target set. The feature amount calculation unit 27 then attaches the current time t to the calculated feature amount as an index and writes it into the feature amount database.
The calculated feature amount and the method of calculating it are described later.
(Step S52: Mutual identification process)
The mutual identification unit 26 sets, as a target set, each set determined in step S14 to be a set performing mutual behavior. The mutual identification unit 26 takes as input the feature amounts of the skeleton information of the plurality of subjects included in the target set, calculated in step S51, and identifies the behavior of the plurality of subjects as a whole in consideration of the mutual behavior.
Specifically, the mutual identification unit 26 acquires the feature amounts of the plurality of subjects included in the target set from the feature amount database. The mutual identification unit 26 then identifies the behavior of the subjects using a mutual model that takes the feature amounts of a plurality of people as input and outputs a mutual label indicating the behavior of the plurality of people as a whole in consideration of the mutual behavior. The mutual model is a trained model generated using a neural network or the like and is assumed to be stored in the storage 13 in advance. That is, by inputting the feature amounts calculated in step S51 into the mutual model, the mutual identification unit 26 acquires a mutual label indicating the behavior of the plurality of subjects included in the target set as a whole. The mutual identification unit 26 writes the mutual label into the memory 12.
The feature amounts that the mutual identification unit 26 acquires from the feature amount database need not be a single value calculated at one time; they may be a plurality of feature amounts consecutive in time series. When a plurality of time-series feature amounts are acquired, the mutual identification unit 26 identifies the behavior of the plurality of subjects included in the target set based on the transition of the feature amounts and acquires the mutual label. That is, in this case, the mutual model is a model that takes the transition of the feature amounts of a plurality of people as input and outputs a mutual label indicating the behavior of the plurality of people as a whole in consideration of the mutual behavior.
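A minimal sketch of step S52 is given below, assuming the kind of mutual model sketched for the fourth embodiment; the storage access is abstracted away, and the database layout and function names are assumptions of this sketch, not part of the disclosure.

```python
import numpy as np


def mutual_identification(mutual_model, feature_db, target_set, time_index):
    """Step S52: obtain a mutual label for one set of subjects.

    feature_db is assumed to map (subject_id, time_index) to a feature value;
    concatenating the subjects' features into one input vector is likewise an
    assumption of this sketch.
    """
    features = [feature_db[(subject_id, time_index)] for subject_id in target_set]
    x = np.asarray(features, dtype=np.float32).reshape(1, -1)
    mutual_label = mutual_model.predict(x)[0]
    return mutual_label  # written to memory in the device
```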
The feature amount calculation process according to the fifth embodiment (step S51 in FIG. 10) will be described with reference to FIG. 11.
(Step S61: Skeleton information acquisition process)
The feature amount calculation unit 27 sets, as a target set, each set determined to be a set performing mutual behavior. The feature amount calculation unit 27 acquires, from the skeleton information database, the skeleton information at the current time and the skeleton information one time step earlier for each of the plurality of subjects included in the target set.
(Step S62: Speed calculation process)
The feature amount calculation unit 27 calculates the feature amount using the skeleton information at the current time and the skeleton information one time step earlier for each of the plurality of subjects acquired in step S61.
Specifically, the feature amount calculation unit 27 calculates a vector or matrix whose elements are the movement distances of the joints of the subject's skeleton between the two time-series-consecutive frames acquired in step S61. Since each movement distance calculated in this way is the distance a joint moves during the time between the two frames, it can be regarded as the speed of that joint. The feature amount calculation unit 27 then takes the sum or average of the speeds of the joints to obtain a scalar, treats this scalar as the speed of the subject's skeleton as a whole, and uses this speed as the feature amount.
In step S61, the feature amount calculation unit 27 may acquire skeleton information over a time width N, from the current time t back to the past time t-N. In this case, in step S62, the feature amount calculation unit 27 generates a vector or matrix whose elements are the movement distances of the joints of the skeleton between each pair of consecutive times. The feature amount calculation unit 27 sums the movement distances of each joint in the time direction, divides the sum by the time width N, and calculates the average movement distance from the current time t to the past time t-N as the speed of each joint. That is, for each joint, the feature amount calculation unit 27 sums the movement distances calculated between consecutive times for that joint, divides the sum by the time width N, and obtains the average movement distance of that joint, which it treats as the speed of that joint. The feature amount calculation unit 27 then takes the sum or average of the speeds of the joints to obtain a scalar, treats this scalar as the speed of the subject's skeleton as a whole, and uses this speed as the feature amount.
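The speed feature of steps S61 and S62 could be computed along the following lines. The sketch assumes skeleton information is given as a NumPy array of joint coordinates per frame; the array layout and the choice of averaging over joints are assumptions for illustration only.

```python
import numpy as np


def skeleton_speed(skeletons: np.ndarray) -> float:
    """Speed feature from time-series skeleton information.

    skeletons: array of shape (T, J, D) holding the positions of J joints in
    D dimensions at T consecutive times (T >= 2). With T == 2 this reduces to
    the two-frame case; with T == N + 1 it averages over the time width N.
    """
    # Movement distance of each joint between consecutive times: shape (T-1, J).
    step_distances = np.linalg.norm(np.diff(skeletons, axis=0), axis=-1)
    # Average movement distance per joint over the time width, i.e. joint speed.
    joint_speeds = step_distances.sum(axis=0) / step_distances.shape[0]
    # Scalar speed of the whole skeleton (here: the mean over joints).
    return float(joint_speeds.mean())
```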
In the above description, the feature amount is a scalar. However, the feature amount calculation unit 27 may instead use, as the feature amount, vector data whose elements are the speeds of the individual joints, without taking the sum or average of the speeds over all joints.
The feature amount calculation unit 27 may calculate the feature amount from any number of the joints of the extracted skeleton of the subject. Alternatively, the feature amount calculation unit 27 may add or average the feature amounts calculated for an arbitrary number of joints, thereby obtaining fewer feature amounts than the number of joints from which they were extracted.
Further, the feature amount calculation unit 27 may sum or average the feature amounts calculated for the individual subjects to obtain a single feature amount.
When calculating the feature amount, it may happen that the positions of some joints in the skeleton information cannot be acquired. In this case, the feature amount calculation unit 27 may complement the position of a joint that could not be acquired, or the feature amount related to such a joint, based on past feature amounts stored in the feature amount database or based on the joints whose positions could be acquired.
Possible complementing methods include using the feature amount of one time step earlier as the feature amount at the time when the joint position could not be acquired, or computing the feature amount at that time by linear interpolation from the displacement of the feature amounts over the past several time steps. Alternatively, the feature amount calculation unit 27 may calculate the average speed per joint from the speed of the whole group of joints whose positions could be acquired and use it as the speed of the joint whose position could not be acquired, or may calculate the average speed per joint from the speed of the acquired joints surrounding the missing joint and use it as the speed of the missing joint. The feature amount calculation unit 27 may also complement a missing joint with the position of the joint paired with it on the opposite side or connected to it, for example complementing a right knee position that could not be acquired with the position of the left knee.
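One way to realize the complementing described above is sketched below; carrying the previous value forward and linear extrapolation from recent values are the two simplest options mentioned in the text, and the array layout and window handling are assumptions of this sketch.

```python
import numpy as np


def complement_missing(history: list, current: np.ndarray) -> np.ndarray:
    """Fill NaN joint positions in `current` (shape (J, D)) using past frames.

    history: list of earlier (J, D) arrays, oldest first. If at least two past
    frames are available, a missing joint is linearly extrapolated from the
    displacement over the last two frames; otherwise the last known value is
    carried forward.
    """
    filled = current.copy()
    missing = np.isnan(filled).any(axis=-1)  # joints with no position
    for j in np.flatnonzero(missing):
        if len(history) >= 2:
            # Linear extrapolation from the displacement over the last two frames.
            filled[j] = history[-1][j] + (history[-1][j] - history[-2][j])
        elif history:
            # Carry the previous position forward.
            filled[j] = history[-1][j]
        # If there is no history at all, the joint is left missing.
    return filled
```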
*** Effect of Embodiment 5 ***
As described above, the behavior identification device 10 according to the fifth embodiment, like the behavior identification device 10 according to the first embodiment, identifies the behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. This increases the possibility of correctly discriminating even behaviors whose postures and movements are similar. As a result, the accuracy of behavior recognition can be improved.
In particular, the behavior identification device 10 according to the fifth embodiment uses, as the feature amount, a speed calculated from skeleton information acquired over two or more frames. If the skeleton speed calculated from time-series skeleton information over a somewhat long time width, for example several seconds, is used as the feature amount, the behavior is more likely to be discriminated correctly even when joints of the subject's skeleton are extracted erroneously because of the person's orientation or partial occlusion of the body.
*** Other configurations ***
<Modification 12>
As described in Modification 9, the behavior identification device 10 may use a mutual rule instead of the mutual model.
<Modification 13>
In the fifth embodiment, as in the first embodiment, the behavior of the plurality of subjects as a whole is identified. However, as in Modification 2, the behavior identification device 10 may further identify which behavior each subject is performing within the behavior as a whole. In this case, the mutual identification unit 26 of the behavior identification device 10 identifies, for each subject, the behavior of the target subject within the behavior as a whole from the behavior as a whole and the skeleton information of the target subject. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the skeleton information of the target subject, and identifies the behavior of the target subject within the behavior as a whole from the behavior as a whole and the individual behavior of the target subject.
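Modification 13 can be pictured as a simple mapping from the pair (behavior as a whole, individual behavior) to the role each subject plays within that behavior. The labels in the table below are purely hypothetical, since the disclosure does not fix any particular label set.

```python
# Hypothetical example labels; the actual label sets are not specified in the text.
ROLE_TABLE = {
    ("assault", "raising_arm"): "attacker",
    ("assault", "crouching"): "victim",
    ("handshake", "extending_arm"): "participant",
}


def identify_role(overall_behavior: str, individual_behavior: str) -> str:
    """Identify a subject's behavior within the behavior of the group as a whole."""
    return ROLE_TABLE.get((overall_behavior, individual_behavior), "unspecified")
```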
Embodiment 6.
The sixth embodiment differs from the third and fifth embodiments in the method of calculating the feature amount from the skeleton information. The sixth embodiment describes this difference; description of the common points is omitted.
The sixth embodiment is described in terms of its differences from the fifth embodiment.
The feature amount calculation process according to the sixth embodiment (step S51 in FIG. 10) will be described with reference to FIG. 12.
(Step S71: Skeleton information acquisition process)
The feature amount calculation unit 27 sets, as a target set, each set determined in step S14 to be a set performing mutual behavior. The feature amount calculation unit 27 acquires, from the skeleton information database, the skeleton information from the current time t back to N time steps earlier for each of the plurality of subjects included in the target set. The feature amount calculation unit 27 sets, as time-series data, the acquired skeleton information arranged in chronological order.
The time-series data is skeleton information arranged in chronological order over a target period of a certain length, for example several seconds. It is desirable that it contain skeleton information at two or more times arranged in chronological order, and more desirable that it contain skeleton information at three or more times.
(Step S72: Movement distance calculation process)
In the time-series data of skeleton information generated in step S71, the feature amount calculation unit 27 calculates the movement distance of each joint of the target subject's skeleton between the skeleton information at each pair of consecutive times. Specifically, for each joint, the feature amount calculation unit 27 calculates the movement distance of that joint by computing the difference in its position between the skeleton information at the two times. The feature amount calculation unit 27 generates a vector or matrix whose elements are the movement distances of the joints. The following description assumes that a vector whose elements are the movement distances of the joints has been generated.
(Step S73: Momentum calculation process)
The feature amount calculation unit 27 sums, in the time direction, the vectors generated in step S72 whose elements are the movement distances of the joints. That is, for each joint, the feature amount calculation unit 27 sums the movement distances calculated between consecutive times for that joint. The value calculated in this way is the total movement distance of each joint over the time width N from the current time t to the past time t-N. This value can therefore be regarded as the momentum of each joint over the time width N.
The feature amount calculation unit 27 sums or averages the momentums of all the joints to obtain a scalar, regards this scalar as the momentum of the subject's skeleton as a whole over the time width N, and uses this momentum as the feature amount.
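A sketch of steps S71 to S73, under the same assumed array layout as the speed sketch above, is given below; summing over joints rather than averaging is an arbitrary choice made here for illustration.

```python
import numpy as np


def skeleton_momentum(skeletons: np.ndarray) -> float:
    """Momentum feature over a time width N.

    skeletons: array of shape (N + 1, J, D) with the positions of J joints in
    D dimensions from time t - N up to the current time t.
    """
    # Step S72: movement distance of each joint between consecutive times, shape (N, J).
    step_distances = np.linalg.norm(np.diff(skeletons, axis=0), axis=-1)
    # Step S73: total movement distance of each joint over the time width N, shape (J,).
    joint_momentum = step_distances.sum(axis=0)
    # Scalar momentum of the whole skeleton (here: the sum over joints).
    return float(joint_momentum.sum())
```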
In the above description, the momentum is a scalar. However, the feature amount calculation unit 27 may instead use, as the feature amount, vector data whose elements are the momentums of the individual joints, without taking the sum or average of the momentums over all joints.
The feature amount calculation unit 27 may calculate the feature amount from any number of the joints of the extracted skeleton of the subject. Alternatively, the feature amount calculation unit 27 may add or average the feature amounts calculated for an arbitrary number of joints, thereby obtaining fewer feature amounts than the number of joints from which they were extracted.
When calculating the feature amount, it may happen that the positions of some joints in the skeleton information cannot be acquired. In this case, as in the fifth embodiment, the feature amount calculation unit 27 may complement the position of a joint that could not be acquired, or the feature amount related to such a joint.
*** Effect of Embodiment 6 ***
As described above, the behavior identification device 10 according to the sixth embodiment, like the behavior identification device 10 according to the first embodiment, identifies the behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. This increases the possibility of correctly discriminating even behaviors whose postures and movements are similar. As a result, the accuracy of behavior recognition can be improved.
In particular, the behavior identification device 10 according to the sixth embodiment uses, as the feature amount, a momentum calculated from skeleton information acquired over past frames. By using as the feature amount the skeleton momentum calculated from time-series skeleton information over a somewhat long time width, for example several seconds, the behavior is more likely to be discriminated correctly even when joints of the subject's skeleton are extracted erroneously because of the person's orientation or partial occlusion of the body.
*** Other configurations ***
<Modification 14>
As described in Modification 9, the behavior identification device 10 may use a mutual rule instead of the mutual model.
<Modification 15>
In the sixth embodiment, as in the first embodiment, the behavior of the plurality of subjects as a whole is identified. However, as in Modification 2, the behavior identification device 10 may further identify which behavior each subject is performing within the behavior as a whole. In this case, the mutual identification unit 26 of the behavior identification device 10 identifies, for each subject, the behavior of the target subject within the behavior as a whole from the behavior as a whole and the skeleton information of the target subject. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the skeleton information of the target subject, and identifies the behavior of the target subject within the behavior as a whole from the behavior as a whole and the individual behavior of the target subject.
Embodiment 7.
The seventh embodiment differs from the third, fifth, and sixth embodiments in the feature amount calculated from the skeleton information. The seventh embodiment describes this difference; description of the common points is omitted.
The seventh embodiment is described in terms of its differences from the sixth embodiment.
The feature amount calculation process according to the seventh embodiment (step S51 in FIG. 10) will be described with reference to FIG. 13.
(Step S81: Skeleton information acquisition process)
The feature amount calculation unit 27 sets, as a target set, each set determined in step S14 to be a set performing mutual behavior. The feature amount calculation unit 27 acquires, from the skeleton information database, the skeleton information from the current time t back to N time steps earlier for each of the plurality of subjects included in the target set. The feature amount calculation unit 27 sets, as time-series data, the acquired skeleton information arranged in chronological order.
(Step S82: Trajectory calculation process)
The feature amount calculation unit 27 generates, as the feature amount, a vector or matrix in which the positions of the joints of the subject's skeleton at each time from the current time t back to the past time t-N, represented by the time-series data of the target subject's skeleton information generated in step S81, are arranged in chronological order. The following description assumes that a vector in which the joint position information is arranged in chronological order has been generated. A vector generated in this way has, as its elements, the positions of the joints of the skeleton arranged in chronological order, and therefore represents the path along which the joints move between time t-N and time t, that is, the trajectory of the motion.
At this time, if the skeleton information is extracted from a two-dimensional image, the position of a joint is expressed as (x, y), using a coordinate value x representing the horizontal position and a coordinate value y representing the vertical position.
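Under the same assumed array layout as the earlier sketches, the trajectory feature of step S82 amounts to flattening the time-ordered joint positions into one vector, as sketched below.

```python
import numpy as np


def skeleton_trajectory(skeletons: np.ndarray) -> np.ndarray:
    """Trajectory feature from time-series skeleton information.

    skeletons: array of shape (N + 1, J, D) with the positions of J joints in
    D dimensions from time t - N up to the current time t. The result is a
    vector whose elements are the joint positions arranged in chronological
    order, i.e. the trajectory of the motion.
    """
    return skeletons.reshape(-1).astype(np.float32)
```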
In calculating the feature amount, the feature amount calculation unit 27 may calculate the feature amount for any number of the joints of the extracted skeleton of the subject. Further, for positive integers M and m, when the skeleton information holds M-dimensional joint position information, the feature amount calculation unit 27 may calculate the feature amount using m coordinate values such that m ≤ M.
When calculating the feature amount, it may happen that the positions of some joints in the skeleton information cannot be acquired. In this case, as in the sixth embodiment, the feature amount calculation unit 27 may complement the position of a joint that could not be acquired, or the feature amount related to such a joint.
*** Effect of Embodiment 7 ***
As described above, the behavior identification device 10 according to the seventh embodiment, like the behavior identification device 10 according to the first embodiment, identifies the behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another. This increases the possibility of correctly discriminating even behaviors whose postures and movements are similar. As a result, the accuracy of behavior recognition can be improved.
In particular, the behavior identification device 10 according to the seventh embodiment uses, as the feature amount, a trajectory calculated from skeleton information acquired over past frames. By using as the feature amount the skeleton trajectory calculated from time-series skeleton information over a somewhat long time width, for example several seconds, the behavior is more likely to be discriminated correctly even when joints of the subject's skeleton are extracted erroneously because of the person's orientation or partial occlusion of the body.
*** Other configurations ***
<Modification 16>
As described in Modification 9, the behavior identification device 10 may use a mutual rule instead of the mutual model.
<Modification 17>
In the seventh embodiment, as in the first embodiment, the behavior of the plurality of subjects as a whole is identified. However, as in Modification 2, the behavior identification device 10 may further identify which behavior each subject is performing within the behavior as a whole. In this case, the mutual identification unit 26 of the behavior identification device 10 identifies, for each subject, the behavior of the target subject within the behavior as a whole from the behavior as a whole and the skeleton information of the target subject. Specifically, the mutual identification unit 26 identifies the individual behavior of the target subject from the skeleton information of the target subject, and identifies the behavior of the target subject within the behavior as a whole from the behavior as a whole and the individual behavior of the target subject.
The embodiments and modifications of the present disclosure have been described above. Several of these embodiments and modifications may be carried out in combination. Any one or several of them may also be carried out partially. The present disclosure is not limited to the above embodiments and modifications, and various changes can be made as necessary.
10 behavior identification device, 11 processor, 12 memory, 13 storage, 14 communication interface, 15 electronic circuit, 21 video acquisition unit, 22 skeleton information acquisition unit, 23 correlation determination unit, 24 behavior identification unit, 25 individual identification unit, 26 mutual identification unit, 27 feature amount calculation unit, 31 camera, 50 learning device, 51 processor, 52 memory, 53 storage, 54 communication interface, 55 electronic circuit, 61 learning data acquisition unit, 62 model generation unit.

Claims (17)

1. A behavior identification device comprising: a skeleton information acquisition unit that acquires, for each of a plurality of subjects who are a plurality of people appearing in video data, skeleton information indicating positions of joints of a skeleton of the target subject; and a behavior identification unit that identifies, from the skeleton information acquired by the skeleton information acquisition unit for each of the plurality of subjects, behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another.
2. The behavior identification device according to claim 1, wherein the behavior identification unit comprises: an individual identification unit that identifies, for each of the plurality of subjects, behavior of the target subject as an individual behavior from the skeleton information of the target subject; and a mutual identification unit that identifies the behavior of the plurality of subjects as a whole, in consideration of the mutual behavior, from the individual behaviors identified by the individual identification unit for the plurality of subjects.
3. The behavior identification device according to claim 2, wherein the individual identification unit acquires an individual label indicating the individual behavior of the target subject by inputting the skeleton information of the target subject into an individual model that takes a person's skeleton information as input and outputs an individual label indicating the person's behavior.
4. The behavior identification device according to claim 2, wherein the individual identification unit refers to an individual rule that associates a person's skeleton information with an individual label indicating the person's behavior, and acquires the individual label corresponding to the skeleton information of the target subject as information indicating the individual behavior of the target subject.
5. The behavior identification device according to any one of claims 2 to 4, wherein the mutual identification unit acquires a mutual label indicating the behavior of the plurality of subjects as a whole by inputting the set of individual labels for the plurality of subjects identified by the individual identification unit into a mutual model that takes a set of individual labels indicating the individual behaviors of a plurality of people as input and outputs a mutual label indicating the behavior of the plurality of people as a whole in consideration of the mutual behavior.
6. The behavior identification device according to any one of claims 2 to 4, wherein the mutual identification unit refers to a mutual rule that associates a set of individual labels indicating the individual behaviors of a plurality of people with a mutual label indicating the behavior of the plurality of people as a whole, and acquires the mutual label corresponding to the set of individual labels for the plurality of subjects identified by the individual identification unit as information indicating the behavior of the plurality of subjects as a whole.
7. The behavior identification device according to any one of claims 2 to 6, wherein the mutual identification unit identifies, for each of the plurality of subjects, the behavior of the target subject within the behavior of the plurality of subjects as a whole, from the behavior of the plurality of subjects as a whole and the individual behavior of the target subject.
8. The behavior identification device according to claim 1, wherein the behavior identification unit comprises: a feature amount calculation unit that calculates a feature amount based on the skeleton information of each of the plurality of subjects; and a mutual identification unit that identifies the behavior of the plurality of subjects as a whole, in consideration of the mutual behavior, using the feature amount generated by the feature amount calculation unit as input.
9. The behavior identification device according to claim 8, wherein the feature amount calculation unit calculates, for each of the plurality of subjects, a speed of the target subject as the feature amount from time-series-consecutive skeleton information of the target subject.
10. The behavior identification device according to claim 8, wherein the feature amount calculation unit calculates, for each of the plurality of subjects, a momentum of the target subject as the feature amount from time-series-consecutive skeleton information of the target subject.
11. The behavior identification device according to claim 8, wherein the feature amount calculation unit calculates, for each of the plurality of subjects, a trajectory of motion of the target subject as the feature amount from time-series-consecutive skeleton information of the target subject.
12. The behavior identification device according to any one of claims 8 to 11, wherein the mutual identification unit acquires a mutual label indicating the behavior of the plurality of subjects as a whole by inputting the feature amount calculated by the feature amount calculation unit into a mutual model that takes feature amounts of skeleton information of a plurality of people as input and outputs a mutual label indicating the behavior of the plurality of people as a whole in consideration of the mutual behavior.
13. The behavior identification device according to any one of claims 8 to 11, wherein the mutual identification unit refers to a mutual rule that associates feature amounts of skeleton information of a plurality of people with a mutual label indicating the behavior of the plurality of people as a whole, and acquires the mutual label corresponding to the feature amount calculated by the feature amount calculation unit as information indicating the behavior of the plurality of subjects as a whole.
14. The behavior identification device according to any one of claims 8 to 13, wherein the mutual identification unit identifies, for each of the plurality of subjects, the behavior of the target subject within the behavior of the plurality of subjects as a whole, from the behavior of the plurality of subjects as a whole and the skeleton information of the target subject.
15. The behavior identification device according to any one of claims 1 to 14, further comprising a correlation determination unit that determines whether or not the plurality of subjects are performing mutual behavior, which is behavior in which the plurality of subjects influence one another, wherein the behavior identification unit identifies the behavior of the plurality of subjects as a whole, in consideration of the mutual behavior, when the correlation determination unit determines that the mutual behavior is being performed.
16. A behavior identification method comprising: acquiring, by a skeleton information acquisition unit of a behavior identification device, for each of a plurality of subjects who are a plurality of people appearing in video data, skeleton information indicating positions of joints of a skeleton of the target subject; and identifying, by a behavior identification unit of the behavior identification device, from the skeleton information of each of the plurality of subjects, behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another.
17. A behavior identification program that causes a computer to function as a behavior identification device that performs: a skeleton information acquisition process of acquiring, for each of a plurality of subjects who are a plurality of people appearing in video data, skeleton information indicating positions of joints of a skeleton of the target subject; and a behavior identification process of identifying, from the skeleton information acquired by the skeleton information acquisition process for each of the plurality of subjects, behavior of the plurality of subjects as a whole in consideration of mutual behavior, which is behavior in which the plurality of subjects influence one another.
PCT/JP2020/029244 2020-07-03 2020-07-30 Action identification device, action identification method, and action identification program WO2022003989A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2021503612A JP6887586B1 (en) 2020-07-03 2020-07-30 Behavior identification device, behavior identification method and behavior identification program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JPPCT/JP2020/026277 2020-07-03
PCT/JP2020/026277 WO2022003981A1 (en) 2020-07-03 2020-07-03 Action specification device, action specification method, and action specification program

Publications (1)

Publication Number Publication Date
WO2022003989A1 true WO2022003989A1 (en) 2022-01-06

Family

ID=79315027

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/JP2020/026277 WO2022003981A1 (en) 2020-07-03 2020-07-03 Action specification device, action specification method, and action specification program
PCT/JP2020/029244 WO2022003989A1 (en) 2020-07-03 2020-07-30 Action identification device, action identification method, and action identification program

Family Applications Before (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/026277 WO2022003981A1 (en) 2020-07-03 2020-07-03 Action specification device, action specification method, and action specification program

Country Status (1)

Country Link
WO (2) WO2022003981A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018089221A1 (en) * 2016-11-09 2018-05-17 Microsoft Technology Licensing, Llc Neural network-based action detection
JP2020027496A (en) * 2018-08-14 2020-02-20 富士ゼロックス株式会社 Monitoring device, monitoring system, and program
JP6692086B1 (en) * 2019-12-16 2020-05-13 株式会社アジラ Abnormal behavior detector

Also Published As

Publication number Publication date
WO2022003981A1 (en) 2022-01-06

Similar Documents

Publication Publication Date Title
JP6887586B1 (en) Behavior identification device, behavior identification method and behavior identification program
Ullah et al. Activity recognition using temporal optical flow convolutional features and multilayer LSTM
Song et al. Richly activated graph convolutional network for robust skeleton-based action recognition
Zhang et al. Fusing geometric features for skeleton-based action recognition using multilayer LSTM networks
Liao et al. Natural language guided visual relationship detection
Zhu et al. A cuboid CNN model with an attention mechanism for skeleton-based action recognition
Wang et al. Hidden‐Markov‐models‐based dynamic hand gesture recognition
Sun et al. Conditional regression forests for human pose estimation
Liu et al. Estimation of missing markers in human motion capture
Elmadany et al. Information fusion for human action recognition via biset/multiset globality locality preserving canonical correlation analysis
Nakai et al. Prediction of basketball free throw shooting by openpose
Xu et al. Spatiotemporal decouple-and-squeeze contrastive learning for semisupervised skeleton-based action recognition
Li et al. Time3d: End-to-end joint monocular 3d object detection and tracking for autonomous driving
Drumond et al. An LSTM recurrent network for motion classification from sparse data
Das et al. MMHAR-EnsemNet: A multi-modal human activity recognition model
Ding et al. Profile HMMs for skeleton-based human action recognition
Hachaj et al. Effectiveness comparison of Kinect and Kinect 2 for recognition of Oyama karate techniques
Malawski et al. Recognition of action dynamics in fencing using multimodal cues
Hachaj et al. Dependence of Kinect sensors number and position on gestures recognition with Gesture Description Language semantic classifier
Ben Tamou et al. Automatic learning of articulated skeletons based on mean of 3D joints for efficient action recognition
Baumann et al. Action graph a versatile data structure for action recognition
Earp et al. Face detection with feature pyramids and landmarks
Pham et al. An efficient feature fusion of graph convolutional networks and its application for real-time traffic control gestures recognition
JP6972434B1 (en) Behavior identification device, behavior identification method and behavior identification program
Tian et al. Joints kinetic and relational features for action recognition

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021503612

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20943352

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20943352

Country of ref document: EP

Kind code of ref document: A1