JP6977551B2

JP6977551B2 - Information processing equipment, information processing methods, and information processing programs

Info

Publication number: JP6977551B2
Application number: JP2017249607A
Authority: JP
Inventors: 大気関井
Original assignee: Konica Minolta Inc
Current assignee: Konica Minolta Inc
Priority date: 2017-12-26
Filing date: 2017-12-26
Publication date: 2021-12-08
Anticipated expiration: 2037-12-26
Also published as: JP2019114211A

Description

The present invention relates to an information processing technique used in machine learning.

There is a technique for recognizing a person's actions from video captured by a camera. To make action recognition more robust, the use of machine learning for this technique has been proposed. For example, a technique has been proposed that recognizes actions by deep learning using a video showing a person's action together with a video obtained by playing that video in reverse (a reverse-playback video) (see, for example, Non-Patent Document 1).

Non-Patent Document 1: Bharat Singh and 4 others, "A Multi-Stream Bi-Directional Recurrent Neural Network for Fine-Grained Action Detection", [online], pp. 1-8, [retrieved December 12, 2017], Internet <URL: http://www.cs.umd.edu/~bharat/cvpr2016.pdf>

In supervised learning, a learner builds a learning model by learning a dataset in which many pairs of data and teacher data (labels) have been collected. To improve the accuracy of supervised learning, the number of pairs of data and teacher data must be increased. In the action recognition technique described above, the data are videos showing a person's actions.

If reverse-playback videos can be used as data, the amount of data can be increased. For example, the reverse-playback video of a video showing a walking person can be used for learning to recognize a "walking person". Depending on the motion, however, it can be more appropriate to regard the motion shown in the reverse-playback video as a different motion from the one shown in the original video. For example, the reverse-playback video of a video showing a person sitting down on a chair shows a motion that can be regarded as a person standing up from a chair. If the teacher data for this reverse-playback video were set to "a person sitting down on a chair", the accuracy of machine learning would deteriorate.

An object of the present invention is to provide an information processing apparatus, an information processing method, and an information processing program that can improve the accuracy of machine learning when reverse-playback videos are used for machine learning.

An information processing apparatus according to a first aspect of the present invention includes: a conversion unit that, of a first motion and a second motion that is the reverse of the first motion, converts a first video showing the first motion into a second video obtained by playing the first video in reverse; a determination unit that, based on motion information indicating the motion of the object shown in one of the first video and the second video, determines motion information indicating the motion of the object shown in the other video; and a linking unit that links the other video with the motion information determined by the determination unit.

The first motion and the second motion, which is the reverse of the first motion, are a pair of motions in which one is the reverse of the other. Examples are a person sitting down on a chair and a person standing up from a chair, or a person taking an object placed on a table and a person returning an object to the table.

For such a pair of motions, the reverse-playback video (second video) of a first video showing an object performing the first motion is a video showing an object performing a motion that can be regarded as the second motion. For example, the reverse-playback video (second video) of a first video showing a person sitting down on a chair plays the sitting motion from the future toward the past, but it shows a motion that can be regarded as a person standing up from a chair. Conversely, the reverse-playback video (second video) of a first video showing a person standing up from a chair plays the standing motion from the future toward the past, but it shows a motion that can be regarded as a person sitting down on a chair.

If the first video and the second video are given the same motion information (for example, a label, in other words teacher data) indicating the motion of the object shown in the video, the accuracy of machine learning deteriorates.

The determination unit therefore determines, based on motion information indicating the motion of the object shown in one of the first video and the second video, motion information indicating the motion of the object shown in the other video. For example, when the motion information for the object shown in one video is "a person sitting down on a chair", the determination unit determines the motion information for the object shown in the other video to be "a person standing up from a chair".

Therefore, the information processing apparatus according to the first aspect of the present invention can improve the accuracy of machine learning when reverse-playback videos are used for machine learning.

The information processing apparatus according to the first aspect of the present invention has the following first to third modes.

The first mode further includes a storage unit that stores in advance, linked to each other, a first label that is motion information indicating the first motion and a second label that is motion information indicating the second motion. For a pair consisting of the first video, which is the one video, and the first label, the conversion unit converts the first video of the pair into the second video. The determination unit determines the second label stored in association with the first label of the pair to be the motion information indicating the motion of the object shown in the second video, which is the other video. The linking unit links the second video with the second label.

In the first mode, the one video is the first video, the other video is the second video, and the motion information is a label. According to the first mode, a pair linking the second video with the second label can be generated without capturing the second video. This pair can be used, for example, as one of the pairs making up a dataset.
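The label determination described above amounts to a lookup in a table of paired labels held in advance. The following is a minimal sketch of that idea, not the patented implementation: the table name `PAIRED_LABELS`, the function `label_for_reversed`, and the string labels are hypothetical, and raising an error for a label with no stored partner is an assumption made for illustration.

```python
from typing import Dict

# Hypothetical table of label pairs stored in advance. Each key is the label of
# a motion and each value is the label of the motion regarded as its reverse.
PAIRED_LABELS: Dict[str, str] = {
    "sit down on chair": "stand up from chair",
    "take object from table": "return object to table",
}


def label_for_reversed(label: str) -> str:
    """Return the label to attach to the reverse-playback video.

    Assumption: a label without a stored partner is treated as an error
    rather than being reused for the reversed video.
    """
    if label not in PAIRED_LABELS:
        raise KeyError(f"no paired label stored for {label!r}")
    return PAIRED_LABELS[label]


print(label_for_reversed("sit down on chair"))
```

Keeping the pairs in a table means the reversed video never has to be inspected to decide its label; the label follows from the original pair alone.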

The second mode further includes a machine learning unit that, for the first video, which is the one video, calculates an estimate that the motion of the object shown in the first video is the first motion and an estimate that it is the second motion. The determination unit takes the estimate for the first motion calculated by the machine learning unit as the estimate for the second motion and the estimate for the second motion calculated by the machine learning unit as the estimate for the first motion, and determines this combination to be the motion information indicating the motion of the object shown in the second video, which is the other video. The linking unit links the second video with the combination.

In the second mode, the one video is the first video, the other video is the second video, and the motion information is an estimate (for example, a likelihood or a probability).

The machine learning unit learns the first video and calculates an estimate that the motion of the object shown in the first video is the first motion and an estimate that it is the second motion. Because the motion of the object shown in the first video is the first motion, the estimate for the first motion (for example, 90%) is higher than the estimate for the second motion (for example, 10%).

The determination unit takes the estimate for the first motion (for example, 90%) as the estimate for the second motion and the estimate for the second motion (for example, 10%) as the estimate for the first motion, regards this combination as motion information indicating the second motion, and determines it to be the motion information of the object shown in the second video.

According to the second mode, a pair linking the second video with the motion information of the object shown in the second video can be generated without the machine learning unit learning the second video. This pair can be used, for example, as one of the pairs making up a dataset.
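The exchange of estimates described above can be sketched in a few lines. This is an illustrative sketch only: the dictionary keys `"first"` and `"second"` and the function name `swap_estimates` are hypothetical, and the 90%/10% values are the example figures from the text written as fractions.

```python
from typing import Dict


def swap_estimates(estimates: Dict[str, float]) -> Dict[str, float]:
    """Exchange the estimates for the first and second motions.

    The input holds the estimates calculated for the first video; the result
    is the combination to be linked with the reverse-playback (second) video.
    """
    return {"first": estimates["second"], "second": estimates["first"]}


first_video_estimates = {"first": 0.9, "second": 0.1}
second_video_estimates = swap_estimates(first_video_estimates)
print(second_video_estimates)
```

The input dictionary is left unchanged, so the original estimates remain available for the first video.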

The third mode further includes a machine learning unit that, for the first video, calculates an estimate that the motion of the object shown in the first video is the first motion and an estimate that it is the second motion, and that, for the second video, calculates an estimate that the motion of the object shown in the second video is the second motion and an estimate that it is the first motion. For the first video, the linking unit links the first video with a first combination, which is the combination of the estimate for the first motion and the estimate for the second motion calculated by the machine learning unit, as the motion information indicating the motion of the object shown in the first video. For the second video, which is the one video, the determination unit takes the estimate for the second motion calculated by the machine learning unit as the estimate for the first motion and the estimate for the first motion calculated by the machine learning unit as the estimate for the second motion, and determines this second combination to be the motion information indicating the motion of the object shown in the first video, which is the other video. The linking unit links the first video with the second combination.

In the third mode, the one video is the second video, the other video is the first video, and the motion information is an estimate (for example, a likelihood or a probability).

For the first video, the machine learning unit calculates an estimate that the motion of the object shown in the first video is the first motion and an estimate that it is the second motion. Because the motion of the object shown in the first video is the first motion, the estimate for the first motion (for example, 90%) is higher than the estimate for the second motion (for example, 10%). The linking unit links the first video with the combination of these estimates (the first combination).

For the second video, the machine learning unit calculates an estimate that the motion of the object shown in the second video is the second motion and an estimate that it is the first motion. Because the motion of the object shown in the second video (the reverse-playback video) is a motion regarded as the second motion, the estimate for the second motion (for example, 80%) is higher than the estimate for the first motion (for example, 20%).

The determination unit takes the estimate for the second motion (for example, 80%) as the estimate for the first motion and the estimate for the first motion (for example, 20%) as the estimate for the second motion, regards this second combination as motion information indicating the first motion, and determines it to be the motion information of the object shown in the first video. The linking unit links the first video with the second combination.

As described above, according to the third mode, two pairs can be generated for the same first video: a pair of the first video and motion information (estimate of 90% for the first motion, estimate of 10% for the second motion), and a pair of the first video and motion information (estimate of 80% for the first motion, estimate of 20% for the second motion). This makes it possible, for example, to double the number of pairs contained in a dataset.
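The way one first video yields two pairs can be sketched as follows. This is a sketch under stated assumptions, not the patented implementation: `toy_estimator` is a hypothetical stand-in for a trained model that simply returns the example figures from the text, frames are represented by placeholder strings, and `make_pairs` is an illustrative name.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Estimates = Dict[str, float]
Estimator = Callable[[Sequence[str]], Estimates]


def make_pairs(
    first_video: List[str], estimate: Estimator
) -> List[Tuple[List[str], Estimates]]:
    """Return two (first video, motion information) pairs for one first video."""
    # First combination: estimates calculated directly on the first video.
    first_combination = estimate(first_video)
    # Estimates calculated on the reverse-playback (second) video.
    reversed_estimates = estimate(list(reversed(first_video)))
    # Second combination: exchange the two estimates so that they describe
    # the first video again.
    second_combination = {
        "first": reversed_estimates["second"],
        "second": reversed_estimates["first"],
    }
    return [(first_video, first_combination), (first_video, second_combination)]


def toy_estimator(frames: Sequence[str]) -> Estimates:
    """Hypothetical model: treats alphabetically ordered frames as forward play."""
    if list(frames) == sorted(frames):
        return {"first": 0.9, "second": 0.1}
    return {"first": 0.2, "second": 0.8}


pairs = make_pairs(["A", "B", "C", "D"], toy_estimator)
for video, info in pairs:
    print(video, info)
```

Both pairs refer to the same first video; only the motion information differs, which is what allows the number of pairs to double.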

An information processing method according to a second aspect of the present invention includes: a conversion step of, of a first motion and a second motion that is the reverse of the first motion, converting a first video showing the first motion into a second video obtained by playing the first video in reverse; a determination step of, based on motion information indicating the motion of the object shown in one of the first video and the second video, determining motion information indicating the motion of the object shown in the other video; and a linking step of linking the other video with the motion information determined in the determination step.

The information processing method according to the second aspect of the present invention defines the information processing apparatus according to the first aspect from the viewpoint of a method, and has the same effects as that apparatus.

An information processing program according to a third aspect of the present invention causes a computer to execute: a conversion step of, of a first motion and a second motion that is the reverse of the first motion, converting a first video showing the first motion into a second video obtained by playing the first video in reverse; a determination step of, based on motion information indicating the motion of the object shown in one of the first video and the second video, determining motion information indicating the motion of the object shown in the other video; and a linking step of linking the other video with the motion information determined in the determination step.

The information processing program according to the third aspect of the present invention defines the information processing apparatus according to the first aspect from the viewpoint of a program, and has the same effects as that apparatus.

According to the present invention, the accuracy of machine learning can be improved when reverse-playback videos are used for machine learning.

A functional block diagram of the information processing apparatus according to the embodiment.
An explanatory diagram illustrating an example of the dataset of the first motion.
A block diagram showing the hardware configuration of the information processing apparatus shown in FIG. 1.
A flowchart illustrating the process by which the embodiment generates the dataset of the second motion from the dataset of the first motion.
An explanatory diagram illustrating, for the embodiment, the relationship between the dataset of the first motion and the dataset of the second motion.
The first half of a flowchart illustrating the process by which the first modification generates the dataset of the second motion from a set of first videos.
The second half of the flowchart illustrating the process by which the first modification generates the dataset of the second motion from a set of first videos.
An explanatory diagram illustrating, for the first modification, the relationship among the set of first videos, the dataset of the first motion, and the dataset of the second motion.
A flowchart illustrating the process by which the second modification generates the dataset of the first motion from a set of second videos.
An explanatory diagram illustrating, for the second modification, the relationship between the set of first videos and the dataset of the first motion.
An explanatory diagram illustrating, for the second modification, the relationship among the set of first videos, the set of second videos, the dataset of the second motion, and the dataset of the first motion.

Embodiments of the present invention are described in detail below with reference to the drawings. In the drawings, components given the same reference sign are the same component, and descriptions already given for a component are not repeated. In this specification, a reference sign without a suffix is used when referring to components collectively (for example, first video V1), and a reference sign with a suffix is used when referring to an individual component (for example, first video V1-1).

FIG. 1 is a functional block diagram of the information processing apparatus 1 according to the embodiment. The information processing apparatus 1 includes, as functional blocks, a main body unit 2, an input unit 3, and an output unit 4.

The main body unit 2 is a computer with enough performance to execute machine learning. It includes, as functional blocks, a control processing unit 21, a machine learning unit 22, a storage unit 23, an extraction unit 24, a conversion unit 25, a determination unit 26, and a linking unit 27.

The control processing unit 21 is a device that controls each unit of the main body unit 2 (the machine learning unit 22, the storage unit 23, the extraction unit 24, the conversion unit 25, the determination unit 26, and the linking unit 27) according to the function of that unit.

Machine learning has a learning phase (creating a model by learning) and a prediction/recognition phase (applying the model to data to obtain results). The machine learning unit 22 executes these phases for predetermined motions of an object. The object only needs to be able to move and may be a human, an animal, or an artificial object (for example, an automobile).

The predetermined motions of the object are the first motion and the second motion. The first motion is the reverse of the second motion; in other words, the second motion is the reverse of the first motion. For example, when a person sitting down on a chair is the first motion, a person standing up from the chair is the second motion. The assignment may also be the other way around: when a person standing up from a chair is the first motion, a person sitting down on the chair is the second motion.

The storage unit 23 stores the various videos, data, information, and the like needed for the processing and control executed by the information processing apparatus 1. One of the items of data stored in the storage unit 23 is the dataset of the first motion, which is a dataset (training data) for learning the first motion.

FIG. 2 is an explanatory diagram illustrating an example of the dataset DS-1 of the first motion. The dataset DS-1 of the first motion contains n pairs of a first video V1 and the first label, where n is the number of pairs required for machine learning of the first motion. A first video V1 is a video showing an object performing the first motion. The n first videos V1 (V1-1 to V1-n) are not the same video; each was captured separately.

The first label is motion information indicating that the motion of the object shown in a first video V1 is the first motion (for example, "0"). Because each of the n first videos V1 shows an object performing the first motion, the label of each is the first label. Each of the n first videos V1 is linked with the first label, forming n pairs.
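As an illustration of the dataset structure just described, the sketch below represents each pair as a small record. The class `LabeledVideo`, the function `build_first_dataset`, and the placeholder frame strings are hypothetical; only the example label value `0` and the names V1-1, V1-2 follow the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

FIRST_LABEL = 0  # example value the text gives for the first label


@dataclass(frozen=True)
class LabeledVideo:
    """One pair of a video and the label linked with it."""

    name: str                # reference sign such as "V1-1"
    frames: Tuple[str, ...]  # placeholder frame identifiers
    label: int


def build_first_dataset(
    videos: List[Tuple[str, Tuple[str, ...]]]
) -> List[LabeledVideo]:
    """Link every separately captured first video with the first label."""
    return [LabeledVideo(name, frames, FIRST_LABEL) for name, frames in videos]


ds_1 = build_first_dataset(
    [
        ("V1-1", ("A", "B", "C")),
        ("V1-2", ("D", "E", "F")),
    ]
)
print(ds_1)
```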

Referring to FIG. 1, the extraction unit 24 copies a pair contained in the dataset DS-1 of the first motion, sends the first video V1 of the pair to the conversion unit 25, and sends the first label to the determination unit 26. The extraction unit 24 performs this processing for each of the n pairs.

The conversion unit 25 converts the first video V1 sent from the extraction unit 24 into a second video V2 (a reverse-playback video) by reversing the order of the frames that make up the first video V1. Reversing the order of the frames means, for example, turning a video in which the frames are arranged in the order frame A, frame B, frame C, frame D (the first video V1) into a video in which they are arranged in the order frame D, frame C, frame B, frame A (the second video V2). Because the second video V2 is the first video V1 played in reverse, the motion of the object shown in the second video V2 is not the second motion itself but a motion regarded as the second motion.
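The frame reversal described above can be sketched as follows, with frames represented by placeholder strings. The function name `reverse_video` is hypothetical, and a real implementation would operate on decoded image frames rather than strings.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def reverse_video(frames: Sequence[Frame]) -> List[Frame]:
    """Return a new list holding the frames in reverse temporal order."""
    return list(reversed(frames))


first_video = ["A", "B", "C", "D"]
second_video = reverse_video(first_video)
print(second_video)  # ['D', 'C', 'B', 'A']
```

The function returns a new list, so the first video stays available in its original order for use alongside the reversed one.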

The determination unit 26 determines, based on motion information indicating the motion of the object shown in one of the first video V1 and the second video V2, motion information indicating the motion of the object shown in the other video. More specifically, in the embodiment the one video is the first video V1 and the other video is the second video V2. The storage unit 23 stores the first label and the second label in advance, linked to each other. The second label is motion information indicating that the motion of the object shown in the second video V2 is the second motion (for example, "1"). Because the label sent from the extraction unit 24 is the first label, the determination unit 26 determines the second label stored in association with the first label to be the motion information indicating the motion of the object shown in the second video V2.

The linking unit 27 links the second video V2 with the second label (the motion information determined by the determination unit 26). This generates a pair of the second video V2 and the second label.
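Taken together, the extraction, conversion, determination, and linking described above can be sketched as a single loop over the dataset of the first motion. This is a minimal sketch, not the patented implementation: the names `PAIRED_LABEL` and `generate_second_dataset` are hypothetical, frames are placeholder strings, and the label values 0 and 1 are the examples given in the text.

```python
from typing import Dict, List, Sequence, Tuple

FIRST_LABEL = 0   # example value given in the text for the first label
SECOND_LABEL = 1  # example value given in the text for the second label

# Label pair held in advance (the role the text assigns to the storage unit).
PAIRED_LABEL: Dict[int, int] = {FIRST_LABEL: SECOND_LABEL}

Pair = Tuple[List[str], int]  # (frames, label)


def generate_second_dataset(first_dataset: Sequence[Pair]) -> List[Pair]:
    """Build pairs of (reverse-playback video, second label) from DS-1."""
    second_dataset: List[Pair] = []
    for frames, label in first_dataset:            # extraction: take each pair
        reversed_frames = list(reversed(frames))   # conversion: reverse frames
        new_label = PAIRED_LABEL[label]            # determination: paired label
        second_dataset.append((reversed_frames, new_label))  # linking
    return second_dataset


ds_1 = [(["A", "B", "C", "D"], FIRST_LABEL), (["E", "F", "G"], FIRST_LABEL)]
ds_2 = generate_second_dataset(ds_1)
print(ds_2)
```

Because the frames are copied before being reversed, the original dataset is left unchanged and both datasets can be used for learning.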

The input unit 3 is a device for inputting commands, data, and the like into the information processing apparatus 1 from outside. The output unit 4 is a device that outputs results such as the recognition results produced by the machine learning unit 22.

FIG. 3 is a block diagram showing the hardware configuration of the information processing apparatus 1 shown in FIG. 1. The information processing apparatus 1 includes a CPU (Central Processing Unit) 1a, a GPU (Graphics Processing Unit) 1b, a RAM (Random Access Memory) 1c, a ROM (Read Only Memory) 1d, an HDD (Hard Disk Drive) 1e, a liquid crystal display 1f, a keyboard or the like 1g, and a bus 1h connecting these.

Referring to FIGS. 1 and 3, the liquid crystal display 1f is the hardware that realizes the output unit 4. An organic EL display (Organic Light Emitting Diode display), a plasma display, or the like may be used instead of the liquid crystal display 1f. The keyboard or the like 1g is the hardware that realizes the input unit 3. A touch panel may be used instead of a keyboard.

The HDD 1e is the hardware that realizes the storage unit 23. The HDD 1e also stores programs for realizing each of the functional blocks: the control processing unit 21, the machine learning unit 22, the extraction unit 24, the conversion unit 25, the determination unit 26, and the linking unit 27. These programs are expressed using the definitions of the functional blocks. The conversion unit 25 and the conversion program serve as an example. The conversion unit 25 converts, of a first motion and a second motion that is the reverse of the first motion, a first video V1 showing the first motion into a second video V2 obtained by playing the first video V1 in reverse. The conversion program is a program that converts, of a first motion and a second motion that is the reverse of the first motion, a first video V1 showing the first motion into a second video V2 obtained by playing the first video V1 in reverse.

これらのプログラムは、ＨＤＤ１ｅに予め記憶されているが、これに限定されない。例えば、これらのプログラムを記録している記録媒体（例えば、磁気ディスク、光学ディスクのような外部記録媒体）が用意されており、この記録媒体に記憶されているプログラムがＨＤＤ１ｅに記憶されてもよい。また、これらのプログラムは、情報処理装置１とネットワーク接続されたサーバに格納されており、ネットワークを介して、これらのプログラムがＨＤＤ１ｅに送られ、ＨＤＤ１ｅに記憶されてもよい。これらのプログラムは、ＨＤＤ１ｅの替わりにＲＯＭ１ｄに記憶してもよい。情報処理装置１は、ＨＤＤ１ｅの替わりに、フラッシュメモリを備え、これらのプログラムはフラッシュメモリに記憶してもよい。 These programs are stored in the HDD 1e in advance, but the configuration is not limited to this. For example, a recording medium on which these programs are recorded (for example, an external recording medium such as a magnetic disk or an optical disk) may be prepared, and the programs stored on that recording medium may be stored in the HDD 1e. Alternatively, these programs may be stored on a server connected to the information processing apparatus 1 via a network, sent to the HDD 1e over the network, and stored in the HDD 1e. These programs may be stored in the ROM 1d instead of the HDD 1e. The information processing apparatus 1 may include a flash memory instead of the HDD 1e, and these programs may be stored in the flash memory.

ＣＰＵ１ａは、これらのプログラムを、ＨＤＤ１ｅから読み出してＲＡＭ１ｃに展開させ、展開されたプログラムを実行することによって、制御処理部２１、機械学習部２２、取出部２４、変換部２５、決定部２６、および、紐付け部２７が実現される。但し、これらの機能について、各機能の一部又は全部は、ＣＰＵ１ａによる処理に替えて、又は、これと共に、ＤＳＰ（ＤｉｇｉｔａｌＳｉｇｎａｌＰｒｏｃｅｓｓｏｒ）による処理によって実現されてもよい。又、同様に、各機能の一部又は全部は、ソフトウェアによる処理に替えて、又は、これと共に、専用のハードウェア回路による処理によって実現されてもよい。 The CPU 1a reads these programs from the HDD 1e, loads them into the RAM 1c, and executes the loaded programs, thereby realizing the control processing unit 21, the machine learning unit 22, the extraction unit 24, the conversion unit 25, the determination unit 26, and the linking unit 27. However, some or all of each of these functions may be realized by processing by a DSP (Digital Signal Processor) in place of, or together with, processing by the CPU 1a. Similarly, some or all of each function may be realized by processing by a dedicated hardware circuit in place of, or together with, processing by software.

ＣＰＵ１ａによって実行されるこれらのプログラム（変換プログラム等）のフローチャートが、後で説明する図４、図６、図７および図９のフローチャートである。 The flowcharts of these programs (conversion programs and the like) executed by the CPU 1a are the flowcharts of FIGS. 4, 6, 7, and 9 which will be described later.

ＧＰＵ１ｂは、例えば、ＣＰＵ１ａの制御の下で、機械学習部２２が機械学習をする際に必要な各種処理（例えば、画像処理）を実行する。 The GPU 1b, for example, under the control of the CPU 1a, executes various processes (for example, image processing) necessary for the machine learning unit 22 to perform machine learning.

実施形態では、第１動作のデータセットＤＳ−１を基にして、第２動作のデータセットＤＳ−２を生成する。図４は、これを説明するフローチャートである。図５は、実施形態において、第１動作のデータセットＤＳ−１と第２動作のデータセットＤＳ−２との関係を説明する説明図である。 In the embodiment, the second operation data set DS-2 is generated based on the first operation data set DS-1. FIG. 4 is a flowchart illustrating this. FIG. 5 is an explanatory diagram illustrating the relationship between the data set DS-1 of the first operation and the data set DS-2 of the second operation in the embodiment.

図５について説明する。図５に示す第１動作のデータセットＤＳ−１は、図２に示す第１動作のデータセットＤＳ−１と同じである。第２動作のデータセットＤＳ−２は、第２動作を学習するためのデータセット（学習セット）である。第２動作のデータセットＤＳ−２は、第２動画Ｖ２と第２ラベルとのペアをｎ個備える。これらのペアの数ｎは、第１動作のデータセットＤＳ−１に備えられるペアの数ｎと同じである。 FIG. 5 will be described. The data set DS-1 of the first operation shown in FIG. 5 is the same as the data set DS-1 of the first operation shown in FIG. The second operation data set DS-2 is a data set (learning set) for learning the second operation. The second operation data set DS-2 includes n pairs of the second moving image V2 and the second label. The number n of these pairs is the same as the number n of pairs provided in the data set DS-1 of the first operation.

ｎ個の第２動画Ｖ２（Ｖ２−１〜Ｖ２−ｎ）は、それぞれ、ｎ個の第１動画Ｖ１（Ｖ１−１〜Ｖ１−ｎ）を逆再生した動画である。すなわち、第２動画Ｖ２−１は、第１動画Ｖ１−１を逆再生した動画であり、第２動画Ｖ２−２は、第１動画Ｖ１−２を逆再生した動画であり、第２動画Ｖ２−３は、第１動画Ｖ１−３を逆再生した動画であり、・・・、第２動画Ｖ２−ｎは、第１動画Ｖ１−ｎを逆再生した動画である。第２ラベルは、上述したように、第２動画Ｖ２に写された物体の動作が第２動作であることを示す動作情報である（例えば、「１」）。ｎ個の第２動画Ｖ２のそれぞれは、第２ラベルと紐付けられて、ｎ個のペアが構成されている。 The n second moving images V2 (V2-1 to V2-n) are moving images in which n first moving images V1 (V1-1 to V1-n) are reproduced in reverse. That is, the second moving image V2-1 is a moving image in which the first moving image V1-1 is played in reverse, and the second moving image V2-2 is a moving image in which the first moving image V1-2 is played in reverse, and the second moving image V2. -3 is a moving image in which the first moving image V1-3 is played in reverse, ..., The second moving image V2-n is a moving image in which the first moving image V1-n is played in reverse. As described above, the second label is motion information indicating that the motion of the object captured in the second moving image V2 is the second motion (for example, "1"). Each of the n second moving images V2 is associated with the second label to form n pairs.

実施形態が、第１動作のデータセットＤＳ−１を基にして、第２動作のデータセットＤＳ−２を生成する処理を説明する。図１、図４および図５を参照して、取出部２４は、第１動作のデータセットＤＳ−１に含まれる１番目のペア［第１動画Ｖ１−１，第１ラベル］をコピーし、このペアを構成する第１動画Ｖ１−１、第１ラベルをそれぞれ、変換部２５、決定部２６に送る（ステップＳ１）。 The process of generating the data set DS-2 of the second operation based on the data set DS-1 of the first operation will be described. With reference to FIGS. 1, 4 and 5, the fetching unit 24 copies the first pair [first moving image V1-1, first label] included in the data set DS-1 of the first operation. The first moving image V1-1 and the first label constituting this pair are sent to the conversion unit 25 and the determination unit 26, respectively (step S1).

変換部２５は、取出部２４から送られてきた第１動画Ｖ１−１を第２動画Ｖ２−１に変換する（ステップＳ２）。このように、変換部２５は、一方の動画である第１動画Ｖ１と第１ラベルのペアに対して、ペアを構成する第１動画Ｖ１を第２動画Ｖ２に変換する。 The conversion unit 25 converts the first moving image V1-1 sent from the taking-out unit 24 into the second moving image V2-1 (step S2). In this way, the conversion unit 25 converts the first moving image V1 constituting the pair into the second moving image V2 for the pair of the first moving image V1 and the first label, which is one of the moving images.

決定部２６は、取出部２４から送られてきたラベルが第１ラベルなので、第１ラベルと紐付けて記憶されている第２ラベルを、第２動画Ｖ２−１に写された物体の動作を示す動作情報に決定する（ステップＳ３）。 Because the label sent from the extraction unit 24 is the first label, the determination unit 26 determines the second label, which is stored in association with the first label, to be the motion information indicating the motion of the object shown in the second moving image V2-1 (step S3).

紐付け部２７は、第２動画Ｖ２−１と第２ラベルとを紐付ける（ステップＳ４）。これにより、１つのペア［第２動画Ｖ２−１，第２ラベル］が生成される。 The linking unit 27 links the second moving image V2-1 to the second label (step S4). As a result, one pair [second moving image V2-1, second label] is generated.

取出部２４は、第１動作のデータセットＤＳ−１に含まれる全てのペアについて、ステップＳ１〜Ｓ４の処理が終了したか否かを判断する（ステップＳ５）。取出部２４は、第１動作のデータセットＤＳ−１に含まれる全てのペアについて、ステップＳ１〜Ｓ４の処理が終了していないと判断したとき（ステップＳ５でＮｏ）、ステップＳ１に戻り、ステップＳ１の処理をする。 The extraction unit 24 determines whether or not the processing of steps S1 to S4 has been completed for all the pairs included in the data set DS-1 of the first operation (step S5). When the extraction unit 24 determines that the processing of steps S1 to S4 has not been completed for all the pairs included in the data set DS-1 of the first operation (No in step S5), the extraction unit 24 returns to step S1 and steps. Process S1.

取出部２４は、第１動作のデータセットＤＳ−１に含まれる全てのペアについて、ステップＳ１〜Ｓ４の処理が終了したと判断したとき（ステップＳ５でＹｅｓ）、図５に示すように、第２動作のデータセットＤＳ−２が完成される。 When the extraction unit 24 determines that the processing of steps S1 to S4 has been completed for all the pairs included in the data set DS-1 of the first operation (Yes in step S5), the data set DS-2 of the second operation is complete, as shown in FIG. 5.
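
The loop of steps S1 to S5 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation described in the patent: a moving image is modeled as a list of frame identifiers, the first label is assumed to be 0 and the second label 1, and the names `OPPOSITE_LABEL`, `reverse_video`, and `build_reversed_dataset` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Frame = str                 # stand-in for one image frame (assumption)
Video = List[Frame]         # a moving image is modeled as an ordered frame list
Pair = Tuple[Video, int]    # (moving image, label)

# Hypothetical table held by the determination unit: each label is stored
# together with the label of the reverse operation (0 = first, 1 = second).
OPPOSITE_LABEL: Dict[int, int] = {0: 1, 1: 0}


def reverse_video(video: Sequence[Frame]) -> Video:
    """Conversion unit 25: return the frames in reverse playback order."""
    return list(reversed(video))


def build_reversed_dataset(ds1: Sequence[Pair]) -> List[Pair]:
    """Generate DS-2 from DS-1; DS-1 itself is left unchanged."""
    ds2: List[Pair] = []
    for video, label in ds1:               # steps S1 and S5: visit every pair
        video2 = reverse_video(video)      # step S2: reverse playback
        label2 = OPPOSITE_LABEL[label]     # step S3: look up the paired label
        ds2.append((video2, label2))       # step S4: link video and label
    return ds2
```

Because each pair of DS-1 yields exactly one pair of DS-2, the two data sets contain the same number n of pairs, as stated for FIG. 5.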

実施形態の主な効果を説明する。第２動作は、第１動作の逆の動作である。このため、第１動作をする物体が写された第１動画Ｖ１の逆再生動画（第２動画Ｖ２）は、第２動作と見なすことができる動作をする物体が写された動画である。例えば、人間が椅子に座る動作が写された第１動画Ｖ１の逆再生動画（第２動画Ｖ２）は、人間が椅子に座る動作を未来から過去へ再生する動画であるが、人間が椅子から立つ動作と見なすことができる動作が写された動画である。逆に、人間が椅子から立つ動作が写された第１動画Ｖ１の逆再生動画（第２動画Ｖ２）は、人間が椅子から立つ動作を未来から過去へ再生する動画であるが、人間が椅子に座る動作と見なすことができる動作が写された動画である。 The main effects of the embodiments will be described. The second operation is the reverse of the first operation. Therefore, the reverse playback moving image (second moving image V2) of the first moving image V1 in which the object performing the first operation is shown is a moving image in which the object performing the action that can be regarded as the second movement is shown. For example, the reverse playback video of the first video V1 (second video V2), which shows the movement of a human sitting on a chair, is a video of the movement of a human sitting on a chair played from the future to the past. This is a video showing a movement that can be regarded as a standing movement. Conversely, the reverse playback video of the first video V1 (second video V2), which shows the movement of a human standing from a chair, is a video that reproduces the movement of a human standing from a chair from the future to the past. This is a video showing a movement that can be regarded as a movement of sitting on a chair.

第１動画Ｖ１および第２動画Ｖ２において、動画に写されてる物体の動作を示す動作情報であるラベル（言い換えれば、教師データ）を同じにすれば、機械学習の精度が悪くなる。そこで、決定部２６は、第１動画Ｖ１に写された物体の動作を示す動作情報（第１ラベル）を基にして、第２動画Ｖ２に写された物体の動作を示す動作情報（第２ラベル）を決定する。従って、実施形態によれば、逆再生動画を機械学習に利用した場合に、機械学習の精度を向上させることができる。 If the first moving image V1 and the second moving image V2 are given the same label (in other words, the same teacher data), where the label is the motion information indicating the motion of the object shown in the moving image, the accuracy of machine learning deteriorates. The determination unit 26 therefore determines the motion information (second label) indicating the motion of the object shown in the second moving image V2 on the basis of the motion information (first label) indicating the motion of the object shown in the first moving image V1. Thus, according to the embodiment, the accuracy of machine learning can be improved when reverse-playback moving images are used for machine learning.

実施形態には、第１変形例および第２変形例がある。これらの変形例において、動作情報はラベルでなく、推定値（推定値の組み合わせ）であり、動画に写された物体の動作が第１動作である推定値と第２動作である推定値とが求められる。例えば、第１動画Ｖ１に写された物体の動作が第１動作である推定値が９０％であり、第２動作である推定値が１０％である。推定値は、例えば、尤度、確率である。 The embodiment has a first modification and a second modification. In these modifications, the motion information is not a label but estimated values (a combination of estimated values): an estimated value that the motion of the object shown in the moving image is the first operation and an estimated value that it is the second operation are obtained. For example, the estimated value that the motion of the object shown in the first moving image V1 is the first operation is 90%, and the estimated value that it is the second operation is 10%. The estimated value is, for example, a likelihood or a probability.

第１変形例について、実施形態と相違する点を主に説明する。第１変形例は、第１動画Ｖ１のセットを基にして、第２動作のデータセットＤＳ−４を生成する。図６は、これを説明するフローチャートの前半である。図７は、これを説明するフローチャートの後半である。図８は、第１変形例において、第１動画Ｖ１のセットと、第１動作のデータセットＤＳ−３と、第２動作のデータセットＤＳ−４との関係を説明する説明図である。 The first modification will be mainly described as being different from the embodiment. In the first modification, the data set DS-4 of the second operation is generated based on the set of the first moving image V1. FIG. 6 is the first half of the flowchart illustrating this. FIG. 7 is the latter half of the flowchart illustrating this. FIG. 8 is an explanatory diagram illustrating the relationship between the set of the first moving image V1, the data set DS-3 of the first operation, and the data set DS-4 of the second operation in the first modification.

図１、図６および図８を参照して、記憶部２３は、第１動画Ｖ１のセットを予め記憶している。第１動画Ｖ１のセットは、実施形態で説明したｎ個の第１動画Ｖ１を備える。 With reference to FIGS. 1, 6 and 8, the storage unit 23 stores the set of the first moving image V1 in advance. The set of the first moving images V1 includes n first moving images V1 described in the embodiment.

取出部２４は、第１動画Ｖ１のセットに含まれる１番目の第１動画Ｖ１（第１動画Ｖ１−１）をコピーし、これを機械学習部２２に送る（ステップＳ１１）。 The extraction unit 24 copies the first first moving image V1 (first moving image V1-1) included in the set of the first moving image V1 and sends it to the machine learning unit 22 (step S11).

機械学習部２２は、取出部２４から送られてきた第１動画Ｖ１−１を学習（言い換えれば、第１動画Ｖ１−１の特徴抽出）して、第１動画Ｖ１−１に写された物体の動作が第１動作である推定値（例えば、９０％）と第２動作である推定値（例えば、１０％）とをそれぞれ算出する（ステップＳ１２）。これらの推定値の組み合わせが、第１動画Ｖ１−１に写された物体の動作を示す動作情報となる。 The machine learning unit 22 learns the first moving image V1-1 sent from the taking-out unit 24 (in other words, the feature extraction of the first moving image V1-1), and the object copied to the first moving image V1-1. The estimated value (for example, 90%) in which the operation is the first operation and the estimated value (for example, 10%) in which the operation is the second operation are calculated (step S12). The combination of these estimated values becomes the motion information indicating the motion of the object captured in the first moving image V1-1.

紐付け部２７は、第１動画Ｖ１−１と、機械学習部２２が算出した第１動作の推定値および第２動作の推定値の組み合わせと、を紐づける（ステップＳ１３）。これにより、ペア［第１動画Ｖ１−１，（第１動作の推定値９０％，第２動作の推定値１０％）］が生成される。 The linking unit 27 links the first moving image V1-1 with the combination of the estimated value of the first operation and the estimated value of the second operation calculated by the machine learning unit 22 (step S13). As a result, the pair [first moving image V1-1, (estimated value of the first operation 90%, estimated value of the second operation 10%)] is generated.

取出部２４は、第１動画Ｖ１のセットに含まれる全ての第１動画Ｖ１について、ステップＳ１１〜Ｓ１３の処理が終了したか否かを判断する（ステップＳ１４）。取出部２４は、第１動画Ｖ１のセットに含まれる全ての第１動画Ｖ１について、ステップＳ１１〜Ｓ１３の処理が終了していないと判断したとき（ステップＳ１４でＮｏ）、ステップＳ１１に戻り、ステップＳ１１の処理をする。 The extraction unit 24 determines whether or not the processing of steps S11 to S13 has been completed for all the first moving images V1 included in the set of first moving images V1 (step S14). When the extraction unit 24 determines that the processing of steps S11 to S13 has not been completed for all the first moving images V1 included in the set (No in step S14), it returns to step S11 and performs the processing of step S11.

取出部２４は、第１動画Ｖ１のセットに含まれる全ての第１動画Ｖ１について、ステップＳ１１〜Ｓ１３の処理が終了したと判断したとき（ステップＳ１４でＹｅｓ）、図８に示すように、第１動作のデータセットＤＳ−３が完成される。 When the extraction unit 24 determines that the processing of steps S11 to S13 has been completed for all the first moving images V1 included in the set of the first moving image V1 (Yes in step S14), as shown in FIG. The data set DS-3 for one operation is completed.

図１、図７および図８を参照して、取出部２４は、第１動画Ｖ１のセットに含まれる１番目の第１動画Ｖ１（第１動画Ｖ１−１）をコピーし、変換部２５に送る（ステップＳ１５）。 With reference to FIGS. 1, 7, and 8, the fetching unit 24 copies the first first moving image V1 (first moving image V1-1) included in the set of the first moving image V1 to the conversion unit 25. Send (step S15).

変換部２５は、取出部２４から送られてきた第１動画Ｖ１−１を第２動画Ｖ２−１に変換する（ステップＳ１６）。 The conversion unit 25 converts the first moving image V1-1 sent from the taking-out unit 24 into the second moving image V2-1 (step S16).

取出部２４は、第１動作のデータセットＤＳ−３に含まれる１番目のペアに含まれる推定値の組み合わせ（第１動作の推定値９０％，第２動作の推定値１０％）をコピーし、決定部２６に送る（ステップＳ１７）。 The fetching unit 24 copies the combination of the estimated values included in the first pair included in the data set DS-3 of the first operation (estimated value of the first operation 90%, estimated value of the second operation 10%). , Sent to the determination unit 26 (step S17).

決定部２６は、第２動画Ｖ２−１とペアを組む推定値の組み合わせを決定する（ステップＳ１８）。詳しく説明すると、決定部２６は、取出部２４から送られてきた推定値の組み合わせ（第１動作の推定値９０％，第２動作の推定値１０％）について、第１動作の推定値９０％を第２動作の推定値９０％とし、第２動作の推定値１０％を第１動作の推定値１０％とした組み合わせを生成し、この組み合わせを第２動画Ｖ２−１に写された物体の動作を示す動作情報に決定する。このように、決定部２６は、第１動画Ｖ１について、機械学習部２２が算出した第１動作である推定値を、第２動作である推定値とし、第１動画Ｖ１について、機械学習部２２が算出した第２動作である推定値を、第１動作である推定値とした組み合わせを、第２動画Ｖ２に写された物体の動作を示す動作情報に決定する。 The determination unit 26 determines the combination of estimated values to be paired with the second moving image V2-1 (step S18). More specifically, for the combination of estimated values sent from the extraction unit 24 (estimated value of the first operation 90%, estimated value of the second operation 10%), the determination unit 26 generates a combination in which the 90% estimated value of the first operation becomes the estimated value of the second operation and the 10% estimated value of the second operation becomes the estimated value of the first operation, and determines this combination to be the motion information indicating the motion of the object shown in the second moving image V2-1. In this way, the determination unit 26 takes the estimated value of the first operation calculated by the machine learning unit 22 for the first moving image V1 as the estimated value of the second operation, takes the estimated value of the second operation calculated by the machine learning unit 22 for the first moving image V1 as the estimated value of the first operation, and determines the resulting combination to be the motion information indicating the motion of the object shown in the second moving image V2.

紐付け部２７は、第２動画Ｖ２−１と、ステップＳ１８で決定された推定値の組み合わせ（第２動作の推定値９０％、第１動作の推定値１０％）と、を紐付ける（ステップＳ１９）。これにより、１つのペア［第２動画Ｖ２−１，（第２動作の推定値９０％，第１動作の推定値１０％）］が生成される。 The linking unit 27 associates the second moving image V2-1 with the combination of the estimated values determined in step S18 (estimated value of the second operation 90%, estimated value of the first operation 10%) (step). S19). As a result, one pair [second moving image V2-1 (estimated value of the second operation 90%, estimated value of the first operation 10%)] is generated.

取出部２４は、第１動画Ｖ１のセットに含まれる全ての第１動画Ｖ１について、ステップＳ１５〜Ｓ１９の処理が終了したか否かを判断する（ステップＳ２０）。取出部２４は、第１動画Ｖ１のセットに含まれる全ての第１動画Ｖ１について、ステップＳ１５〜Ｓ１９の処理が終了していないと判断したとき（ステップＳ２０でＮｏ）、ステップＳ１５に戻り、ステップＳ１５の処理をする。 The extraction unit 24 determines whether or not the processing of steps S15 to S19 has been completed for all the first moving images V1 included in the set of first moving images V1 (step S20). When the extraction unit 24 determines that the processing of steps S15 to S19 has not been completed for all the first moving images V1 included in the set (No in step S20), it returns to step S15 and performs the processing of step S15.

取出部２４は、第１動画Ｖ１のセットに含まれる全ての第１動画Ｖ１について、ステップＳ１５〜Ｓ１９の処理が終了したと判断したとき（ステップＳ２０でＹｅｓ）、図８に示すように、第２動作のデータセットＤＳ−４が完成される。 When the extraction unit 24 determines that the processing of steps S15 to S19 has been completed for all the first moving images V1 included in the set of the first moving images V1 (Yes in step S20), as shown in FIG. The two-operation data set DS-4 is completed.

第１変形例の主な効果を説明する。機械学習部２２は、第１動画Ｖ１を学習して、第１動画Ｖ１に写された物体の動作が第１動作である推定値と第２動作である推定値とを算出する（ステップＳ１２）。第１動画Ｖ１に写された物体の動作は、第１動作なので、第１動作である推定値（例えば、９０％）は、第２動作である推定値（例えば、１０％）より高くなる。これらの推定値の組み合わせが、第１動画Ｖ１とペアを組む推定値の組み合わせとなる。 The main effects of the first modification will be described. The machine learning unit 22 learns the first moving image V1 and calculates an estimated value in which the motion of the object captured in the first moving image V1 is the first motion and an estimated value in which the motion is the second motion (step S12). .. Since the motion of the object captured in the first moving image V1 is the first motion, the estimated value of the first motion (for example, 90%) is higher than the estimated value of the second motion (for example, 10%). The combination of these estimated values is the combination of the estimated values paired with the first moving image V1.

決定部２６は、第１動作である推定値（例えば、９０％）を第２動作である推定値とし、第２動作である推定値（例えば、１０％）を第１動作である推定値とし、これらの推定の組み合わせを、第２動作を示す動作情報と見なし、これを第２動画Ｖ２に写された物体の動作情報と決定する（ステップＳ１８）。 The determination unit 26 uses the estimated value of the first operation (for example, 90%) as the estimated value of the second operation, and the estimated value of the second operation (for example, 10%) as the estimated value of the first operation. , The combination of these estimates is regarded as the motion information indicating the second motion, and this is determined as the motion information of the object copied in the second moving image V2 (step S18).

第１変形例によれば、機械学習部２２が第２動画Ｖ２を学習することなく、第２動画Ｖ２と、第２動画Ｖ２に写された物体の動作情報（推定値の組み合わせ）と、を紐付けたペアを生成することができる。これにより、機械学習部２２が第２動画Ｖ２を学習することなく、第２動作のデータセットＤＳ−４を生成することができる。 According to the first modification, the machine learning unit 22 obtains the second moving image V2 and the motion information (combination of estimated values) of the object captured in the second moving image V2 without learning the second moving image V2. You can generate linked pairs. As a result, the machine learning unit 22 can generate the second operation data set DS-4 without learning the second moving image V2.
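
The first modification (steps S11 to S20) can be sketched in Python as below. This is a hedged sketch rather than the patented implementation: a moving image is modeled as a list of frame identifiers, the machine learning unit 22 is represented by an arbitrary callable `estimate` that returns a tuple (first-operation estimate, second-operation estimate), and the names `swap_estimates` and `build_ds3_and_ds4` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Video = List[str]                    # a moving image as a list of frame ids
Estimates = Tuple[float, float]      # (first-operation, second-operation)
Estimator = Callable[[Video], Estimates]
Pair = Tuple[Video, Estimates]


def swap_estimates(estimates: Estimates) -> Estimates:
    """Determination unit 26 (step S18): exchange the two estimated values."""
    first, second = estimates
    return (second, first)


def build_ds3_and_ds4(
    videos: Sequence[Video], estimate: Estimator
) -> Tuple[List[Pair], List[Pair]]:
    # Steps S11 to S14: estimate each first moving image and link the result.
    ds3: List[Pair] = [(list(v), estimate(list(v))) for v in videos]
    # Steps S15 to S20: reverse each video and reuse the swapped estimates,
    # so the estimator is never run on a reversed video.
    ds4: List[Pair] = [
        (list(reversed(v)), swap_estimates(e)) for v, e in ds3
    ]
    return ds3, ds4
```

With an estimator that returns (0.9, 0.1) for a first moving image, the corresponding pair in DS-4 carries (0.1, 0.9), that is, 90% for the second operation and 10% for the first operation.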

第２変形例について、実施形態および第１変形例と相違する点を主に説明する。第２変形例は、第１動画Ｖ１のセットを基にして、図１０に示す第１動作のデータセットＤＳ−３を生成し、かつ、第２動画Ｖ２のセットを基にして、図１１に示す第１動作のデータセットＤＳ−６を生成する。図９は、第２変形例において、第２動画Ｖ２のセットを基にして、第１動作のデータセットＤＳ−６を生成する処理を説明するフローチャートである。図１０は、第２変形例において、第１動画Ｖ１のセットと第１動作のデータセットＤＳ−３との関係を説明する説明図である。図１１は、第２変形例において、第１動画Ｖ１のセットと、第２動画Ｖ２のセットと、第２動作のデータセットＤＳ−５と、第１動作のデータセットＤＳ−６との関係を説明する説明図である。 The second modification will mainly explain the differences from the embodiment and the first modification. In the second modification, the data set DS-3 of the first operation shown in FIG. 10 is generated based on the set of the first moving image V1, and the set of the second moving image V2 is used as the basis in FIG. The data set DS-6 of the first operation shown is generated. FIG. 9 is a flowchart illustrating a process of generating a data set DS-6 for the first operation based on the set of the second moving image V2 in the second modification. FIG. 10 is an explanatory diagram illustrating the relationship between the set of the first moving image V1 and the data set DS-3 of the first operation in the second modification. FIG. 11 shows the relationship between the set of the first moving image V1, the set of the second moving image V2, the data set DS-5 of the second operation, and the data set DS-6 of the first operation in the second modification. It is explanatory drawing to explain.

図１０を参照して、第２変形例に係る情報処理装置１は、第１動画Ｖ１のセットを基にして、第１動作のデータセットＤＳ−３を生成する。第１動作の推定値と第２動作の推定値との組み合わせが第１組み合わせである。第１動画Ｖ１と第１組み合わせとが紐付けられている。図１０は、第１変形例が、第１動画Ｖ１のセットを基にして、第１動作のデータセットＤＳ−３を生成する処理と同じである（図８に示す第１動画のセットから第１動作のデータセットＤＳ−３の生成、図６のステップＳ１１〜Ｓ１４）。よって、説明を省略する。 With reference to FIG. 10, the information processing apparatus 1 according to the second modification generates the data set DS-3 of the first operation based on the set of first moving images V1. The combination of the estimated value of the first operation and the estimated value of the second operation is the first combination. Each first moving image V1 is linked with a first combination. The processing shown in FIG. 10 is the same as the processing by which the first modification generates the data set DS-3 of the first operation from the set of first moving images V1 (the generation of the data set DS-3 of the first operation from the set of first moving images shown in FIG. 8; steps S11 to S14 in FIG. 6). Its description is therefore omitted.

図１、図９および図１１を参照して、取出部２４は、第１動画Ｖ１のセットに含まれる１番目の第１動画Ｖ１（第１動画Ｖ１−１）をコピーし、これを変換部２５に送る（ステップＳ３１）。 With reference to FIGS. 1, 9 and 11, the extraction unit 24 copies the first first moving image V1 (first moving image V1-1) included in the set of the first moving image V1, and converts the first moving image V1 (first moving image V1-1). 25 (step S31).

変換部２５は、取出部２４から送られてきた第１動画Ｖ１−１を第２動画Ｖ２−１に変換する（ステップＳ３２）。 The conversion unit 25 converts the first moving image V1-1 sent from the taking-out unit 24 into the second moving image V2-1 (step S32).

取出部２４は、第１動画Ｖ１のセットに含まれる全ての第１動画Ｖ１について、ステップＳ３１およびステップＳ３２の処理が終了したか否かを判断する（ステップＳ３３）。取出部２４は、第１動画Ｖ１のセットに含まれる全ての第１動画Ｖ１について、ステップＳ３１およびステップＳ３２の処理が終了していないと判断したとき（ステップＳ３３でＮｏ）、ステップＳ３１に戻り、ステップＳ３１の処理をする。 The extraction unit 24 determines whether or not the processing of steps S31 and S32 has been completed for all the first moving images V1 included in the set of first moving images V1 (step S33). When the extraction unit 24 determines that the processing of steps S31 and S32 has not been completed for all the first moving images V1 included in the set (No in step S33), it returns to step S31 and performs the processing of step S31.

取出部２４は、第１動画Ｖ１のセットに含まれる全ての第１動画Ｖ１について、ステップＳ３１およびステップＳ３２の処理が終了したと判断したとき（ステップＳ３３でＹｅｓ）、図１１に示すように、第２動画Ｖ２のセットが完成される。 When the extraction unit 24 determines that the processes of steps S31 and S32 have been completed for all the first moving images V1 included in the set of the first moving images V1 (Yes in step S33), as shown in FIG. The set of the second moving image V2 is completed.

第２動画Ｖ２のセットの完成後、取出部２４は、第２動画Ｖ２のセットに含まれる１番目の第２動画Ｖ２（第２動画Ｖ２−１）をコピーし、これを機械学習部２２に送る（ステップＳ３４）。 After the set of the second moving image V2 is completed, the extraction unit 24 copies the first second moving image V2 (second moving image V2-1) included in the set of the second moving image V2, and transfers this to the machine learning unit 22. Send (step S34).

機械学習部２２は、取出部２４から送られてきた第２動画Ｖ２−１を学習（言い換えれば、第２動画Ｖ２−１の特徴抽出）して、第２動画Ｖ２−１に写された物体の動作が第２動作である推定値（例えば、８０％）と第１動作である推定値（例えば、２０％）とをそれぞれ算出する（ステップＳ３５）。これらの推定値の組み合わせが、第２動画Ｖ２−１に写された物体の動作を示す動作情報となる。 The machine learning unit 22 learns the second moving image V2-1 sent from the taking-out unit 24 (in other words, the feature extraction of the second moving image V2-1), and the object copied to the second moving image V2-1. The estimated value (for example, 80%) in which the operation is the second operation and the estimated value (for example, 20%) in which the operation is the first operation are calculated (step S35). The combination of these estimated values becomes the motion information indicating the motion of the object captured in the second moving image V2-1.

紐付け部２７は、第２動画Ｖ２−１と、機械学習部２２が算出した第２動作の推定値および第１動作の推定値の組み合わせと、を紐づける（ステップＳ３６）。これにより、ペア［第２動画Ｖ２−１，（第２動作の推定値８０％，第１動作の推定値２０％）］が生成される。 The associating unit 27 associates the second moving image V2-1 with the combination of the estimated value of the second operation and the estimated value of the first operation calculated by the machine learning unit 22 (step S36). As a result, a pair [second moving image V2-1 (estimated value of the second operation 80%, estimated value of the first operation 20%)] is generated.

取出部２４は、第２動画Ｖ２のセットに含まれる全ての第２動画Ｖ２について、ステップＳ３４〜Ｓ３６の処理が終了したか否かを判断する（ステップＳ３７）。取出部２４は、第２動画Ｖ２のセットに含まれる全ての第２動画Ｖ２について、ステップＳ３４〜Ｓ３６の処理が終了していないと判断したとき（ステップＳ３７でＮｏ）、ステップＳ３４に戻り、ステップＳ３４の処理をする。 The extraction unit 24 determines whether or not the processing of steps S34 to S36 has been completed for all the second moving images V2 included in the set of the second moving images V2 (step S37). When the extraction unit 24 determines that the processing of steps S34 to S36 has not been completed for all the second moving images V2 included in the set of the second moving image V2 (No in step S37), the extraction unit 24 returns to step S34 and steps. Process S34.

取出部２４は、第２動画Ｖ２のセットに含まれる全ての第２動画Ｖ２について、ステップＳ３４〜Ｓ３６の処理が終了したと判断したとき（ステップＳ３７でＹｅｓ）、図１１に示すように、第２動作のデータセットＤＳ−５が完成される。 When the extraction unit 24 determines that the processing of steps S34 to S36 has been completed for all the second moving images V2 included in the set of the second moving image V2 (Yes in step S37), as shown in FIG. The two-operation data set DS-5 is completed.

第２動作のデータセットＤＳ−５の完成後、取出部２４は、第２動作のデータセットＤＳ−５に含まれる１番目のペアにおいて、推定値の組み合わせ（第２動作の推定値８０％，第１動作の推定値２０％）をコピーし、決定部２６に送る（ステップＳ３８）。 After the completion of the second operation data set DS-5, the fetching unit 24 sets the estimated value combination (estimated value 80% of the second operation, 80%, in the first pair included in the second operation data set DS-5). The estimated value of the first operation (20%) is copied and sent to the determination unit 26 (step S38).

決定部２６は、第１動画Ｖ１−１とペアを組む推定値の組み合わせ（第２組み合わせ）を決定する（ステップＳ３９）。詳しく説明すると、決定部２６は、取出部２４から送られてきた推定値の組み合わせ（第２動作の推定値８０％，第１動作の推定値２０％）について、第２動作の推定値８０％を第１動作の推定値８０％とし、第１動作の推定値２０％を第２動作の推定値２０％とした第２組み合わせを生成し、第２組み合わせを第１動画Ｖ１−１に写された物体の動作を示す動作情報に決定する。このように、決定部２６は、第２動画Ｖ２について、機械学習部２２が算出した第２動作である推定値を、第１動作である推定値とし、第２動画Ｖ２について、機械学習部２２が算出した第１動作である推定値を、第２動作である推定値とした第２組み合わせを、第１動画Ｖ１に写された物体の動作を示す動作情報に決定する。 The determination unit 26 determines the combination of estimated values (second combination) to be paired with the first moving image V1-1 (step S39). More specifically, for the combination of estimated values sent from the extraction unit 24 (estimated value of the second operation 80%, estimated value of the first operation 20%), the determination unit 26 generates a second combination in which the 80% estimated value of the second operation becomes the estimated value of the first operation and the 20% estimated value of the first operation becomes the estimated value of the second operation, and determines the second combination to be the motion information indicating the motion of the object shown in the first moving image V1-1. In this way, the determination unit 26 takes the estimated value of the second operation calculated by the machine learning unit 22 for the second moving image V2 as the estimated value of the first operation, takes the estimated value of the first operation calculated by the machine learning unit 22 for the second moving image V2 as the estimated value of the second operation, and determines the resulting second combination to be the motion information indicating the motion of the object shown in the first moving image V1.

紐付け部２７は、第１動画Ｖ１のセットに含まれる第１動画Ｖ１−１と、ステップＳ３９で決定された推定値の組み合わせである第２組み合わせ（第１動作の推定値８０％、第２動作の推定値２０％）と、を紐付ける（ステップＳ４０）。これにより、１つのペア［第１動画Ｖ１−１，（第１動作の推定値８０％，第２動作の推定値２０％）］が生成される。 The linking unit 27 is a second combination (estimated value 80% of the first operation, second) which is a combination of the first moving image V1-1 included in the set of the first moving image V1 and the estimated value determined in step S39. (Estimated value of operation 20%) is associated with (step S40). As a result, one pair [first moving image V1-1, (estimated value of first operation 80%, estimated value of second operation 20%)] is generated.

取出部２４は、第１動画Ｖ１のセットに含まれる全ての第１動画Ｖ１について、ステップＳ３８〜Ｓ４０の処理が終了したか否かを判断する（ステップＳ４１）。取出部２４は、第１動画Ｖ１のセットに含まれる全ての第１動画Ｖ１について、ステップＳ３８〜Ｓ４０の処理が終了していないと判断したとき（ステップＳ４１でＮｏ）、ステップＳ３８に戻り、ステップＳ３８の処理をする。 The extraction unit 24 determines whether or not the processing of steps S38 to S40 has been completed for all the first moving images V1 included in the set of first moving images V1 (step S41). When the extraction unit 24 determines that the processing of steps S38 to S40 has not been completed for all the first moving images V1 included in the set (No in step S41), it returns to step S38 and performs the processing of step S38.

取出部２４は、第１動画Ｖ１のセットに含まれる全ての第１動画Ｖ１について、ステップＳ３８〜Ｓ４０の処理が終了したと判断したとき（ステップＳ４１でＹｅｓ）、図１１に示すように、第１動作のデータセットＤＳ−６が完成される。 When the extraction unit 24 determines that the processing of steps S38 to S40 has been completed for all the first moving images V1 included in the set of the first moving images V1 (Yes in step S41), as shown in FIG. The data set DS-6 for one operation is completed.

第２変形例の主な効果を説明する。図６に示すフローチャートは、第２変形例にも適用される。第１変形例と同様に、機械学習部２２は、第１動画Ｖ１について、第１動画Ｖ１に写された物体の動作が第１動作である推定値と第２動作である推定値とを算出する（図６のステップＳ１２）。図１０を参照して、第１動画Ｖ１に写された物体の動作は、第１動作なので、第１動作である推定値（例えば、９０％）は、第２動作である推定値（例えば、１０％）より高くなる。紐付け部２７は、第１動画Ｖ１と、これらの推定値の組み合わせ（第１組み合わせ）と、を紐付ける。 The main effects of the second modification will be described. The flowchart shown in FIG. 6 is also applied to the second modification. Similar to the first modification, the machine learning unit 22 calculates, for the first moving image V1, an estimated value in which the motion of the object captured in the first moving image V1 is the first motion and an estimated value in which the motion is the second motion. (Step S12 in FIG. 6). With reference to FIG. 10, since the motion of the object captured in the first moving image V1 is the first motion, the estimated value of the first motion (for example, 90%) is the estimated value of the second motion (for example, 90%). It will be higher than 10%). The linking unit 27 associates the first moving image V1 with a combination of these estimated values (first combination).

機械学習部２２は、第２動画Ｖ２について、第２動画Ｖ２に写された物体の動作が第２動作である推定値と第１動作である推定値とを算出する（ステップＳ３５）。図１１を参照して、第２動画Ｖ２（逆再生動画）に写された物体の動作は、第２動作と見なされる動作なので、第２動作である推定値（例えば、８０％）は、第１動作である推定値（例えば、２０％）より高くなる。 The machine learning unit 22 calculates, for the second moving image V2, an estimated value in which the motion of the object captured in the second moving image V2 is the second motion and an estimated value in which the motion is the first motion (step S35). With reference to FIG. 11, since the motion of the object captured in the second moving image V2 (reverse playback moving image) is regarded as the second motion, the estimated value (for example, 80%) which is the second motion is the second motion. It is higher than the estimated value (for example, 20%) which is one operation.

The determination unit 26 forms a second combination by taking the estimated value for the second motion (for example, 80%) as the estimated value for the first motion, and the estimated value for the first motion (for example, 20%) as the estimated value for the second motion. It regards this second combination as motion information indicating the first motion and determines it to be the motion information of the object captured in the first moving image V1 (step S39). The linking unit 27 links the first moving image V1 with the second combination (step S40).
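The swap performed in step S39 can be illustrated with a minimal Python sketch. The names `MotionEstimate` and `swap_estimates` are hypothetical and do not appear in the specification; the numeric values are the example figures given above (80% and 20%), expressed as fractions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionEstimate:
    """Pair of estimated values for one moving image (hypothetical type)."""
    first: float   # estimated value that the motion is the first motion
    second: float  # estimated value that the motion is the second motion


def swap_estimates(est_v2: MotionEstimate) -> MotionEstimate:
    """Form the second combination from the estimates computed on V2.

    The second-motion estimate of V2 becomes the first-motion estimate,
    and the first-motion estimate of V2 becomes the second-motion estimate.
    """
    return MotionEstimate(first=est_v2.second, second=est_v2.first)


# Estimates computed on the reverse-playback moving image V2 (step S35).
est_v2 = MotionEstimate(first=0.20, second=0.80)

# Step S39: determine the motion information of V1 by swapping.
second_combination = swap_estimates(est_v2)

# Step S40: link the first moving image (represented here by an ID string)
# with the second combination.
pair = ("V1", second_combination)
print(pair)
```

The sketch only models the exchange of the two estimated values; how the machine learning unit 22 actually computes them is outside its scope.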

As described above, according to the second modification, two pairs can be generated for the same first moving image V1: a pair of the first moving image V1 and motion information (first-motion estimate 90%, second-motion estimate 10%), and a pair of the first moving image V1 and motion information (first-motion estimate 80%, second-motion estimate 20%). These correspond to the data set DS-3 of the first motion shown in FIG. 10 and the data set DS-6 of the first motion shown in FIG. 11. Therefore, according to the second modification, the data set DS-6 of the first motion can be generated in addition to the data set DS-3 of the first motion. Combining these into a single data set doubles the number of pairs in the data set of the first motion compared with generating the data set DS-3 alone.
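As a rough illustration of how the two data sets arise and why merging them doubles the number of pairs, the following Python sketch walks through the flow for a small set of first moving images. Everything here is an assumption made for illustration: a video is modeled as a tuple of integers standing in for frames, and `toy_classifier` is a hypothetical stand-in for the machine learning unit 22, not the patent's model.

```python
from typing import Callable, List, Sequence, Tuple

Estimate = Tuple[float, float]  # (first-motion estimate, second-motion estimate)
Video = Sequence[int]           # stand-in for a sequence of frames
Pair = Tuple[Video, Estimate]


def generate_datasets(
    videos: List[Video], classify: Callable[[Video], Estimate]
) -> Tuple[List[Pair], List[Pair]]:
    """Return (DS-3-like, DS-6-like) lists of (V1, motion information) pairs."""
    ds3: List[Pair] = []
    ds6: List[Pair] = []
    for v1 in videos:
        # First combination: estimates computed directly on V1.
        ds3.append((v1, classify(v1)))
        # Convert V1 into V2 by reversing the frame order.
        v2 = v1[::-1]
        first_est, second_est = classify(v2)
        # Second combination: swap the estimates computed on V2, link to V1.
        ds6.append((v1, (second_est, first_est)))
    return ds3, ds6


def toy_classifier(video: Video) -> Estimate:
    # Hypothetical rule: rising frame values count as the first motion.
    return (0.9, 0.1) if video[-1] > video[0] else (0.2, 0.8)


videos = [(0, 1, 2), (3, 5, 8)]
ds3, ds6 = generate_datasets(videos, toy_classifier)
merged = ds3 + ds6  # one data set of the first motion
print(len(ds3), len(merged))
```

With this toy classifier, each first moving image contributes one pair to each list, so the merged list holds twice as many pairs as the DS-3-like list alone.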

The embodiment, the first modification, and the second modification have been described using the generation of a data set as an example. However, the present invention is not limited to this and can be applied to machine learning that uses reverse-playback moving images and to deep learning that uses reverse-playback moving images.

1 Information processing device
V1, V1-1 to V1-n First moving image
V2, V2-1 to V2-n Second moving image

Claims

An information processing device comprising:
a conversion unit that converts a first moving image capturing a first motion, out of the first motion and a second motion that is the reverse of the first motion, into a second moving image obtained by playing the first moving image in reverse;
a determination unit that determines, based on motion information indicating the motion of the object captured in one of the first moving image and the second moving image, motion information indicating the motion of the object captured in the other moving image;
a linking unit that links the other moving image with the motion information determined by the determination unit; and
a machine learning unit that calculates, for the first moving image serving as the one moving image, an estimated value that the motion of the object captured in the first moving image is the first motion and an estimated value that it is the second motion,
wherein the determination unit determines, as the motion information indicating the motion of the object captured in the second moving image serving as the other moving image, a combination in which the estimated value for the first motion calculated by the machine learning unit is taken as the estimated value for the second motion and the estimated value for the second motion calculated by the machine learning unit is taken as the estimated value for the first motion, and
the linking unit links the second moving image with the combination.

An information processing device comprising:
a conversion unit that converts a first moving image capturing a first motion, out of the first motion and a second motion that is the reverse of the first motion, into a second moving image obtained by playing the first moving image in reverse;
a determination unit that determines, based on motion information indicating the motion of the object captured in one of the first moving image and the second moving image, motion information indicating the motion of the object captured in the other moving image;
a linking unit that links the other moving image with the motion information determined by the determination unit; and
a machine learning unit that calculates, for the first moving image, an estimated value that the motion of the object captured in the first moving image is the first motion and an estimated value that it is the second motion, and calculates, for the second moving image, an estimated value that the motion of the object captured in the second moving image is the second motion and an estimated value that it is the first motion,
wherein the linking unit links, with the first moving image, a first combination that is a combination of the estimated value for the first motion and the estimated value for the second motion calculated by the machine learning unit for the first moving image, as motion information indicating the motion of the object captured in the first moving image,
the determination unit determines, for the second moving image serving as the one moving image, a second combination, in which the estimated value for the second motion calculated by the machine learning unit is taken as the estimated value for the first motion and the estimated value for the first motion calculated by the machine learning unit is taken as the estimated value for the second motion, as the motion information indicating the motion of the object captured in the first moving image serving as the other moving image, and
the linking unit links the first moving image with the second combination.

An information processing method executed by a computer, the method comprising:
a conversion step of converting a first moving image capturing a first motion, out of the first motion and a second motion that is the reverse of the first motion, into a second moving image obtained by playing the first moving image in reverse;
a determination step of determining, based on motion information indicating the motion of the object captured in one of the first moving image and the second moving image, motion information indicating the motion of the object captured in the other moving image;
a linking step of linking the other moving image with the motion information determined in the determination step; and
a calculation step of calculating, by a machine learning unit, for the first moving image serving as the one moving image, an estimated value that the motion of the object captured in the first moving image is the first motion and an estimated value that it is the second motion,
wherein, in the determination step, a combination in which the estimated value for the first motion calculated in the calculation step is taken as the estimated value for the second motion and the estimated value for the second motion calculated in the calculation step is taken as the estimated value for the first motion is determined as the motion information indicating the motion of the object captured in the second moving image serving as the other moving image, and
in the linking step, the second moving image is linked with the combination.

An information processing program that causes a computer to execute:
a conversion step of converting a first moving image capturing a first motion, out of the first motion and a second motion that is the reverse of the first motion, into a second moving image obtained by playing the first moving image in reverse;
a determination step of determining, based on motion information indicating the motion of the object captured in one of the first moving image and the second moving image, motion information indicating the motion of the object captured in the other moving image;
a linking step of linking the other moving image with the motion information determined in the determination step; and
a calculation step of calculating, by a machine learning unit, for the first moving image serving as the one moving image, an estimated value that the motion of the object captured in the first moving image is the first motion and an estimated value that it is the second motion,
wherein, in the determination step, a combination in which the estimated value for the first motion calculated in the calculation step is taken as the estimated value for the second motion and the estimated value for the second motion calculated in the calculation step is taken as the estimated value for the first motion is determined as the motion information indicating the motion of the object captured in the second moving image serving as the other moving image, and
in the linking step, the second moving image is linked with the combination.

An information processing method executed by a computer, the method comprising:
a conversion step of converting a first moving image capturing a first motion, out of the first motion and a second motion that is the reverse of the first motion, into a second moving image obtained by playing the first moving image in reverse;
a determination step of determining, based on motion information indicating the motion of the object captured in one of the first moving image and the second moving image, motion information indicating the motion of the object captured in the other moving image;
a linking step of linking the other moving image with the motion information determined in the determination step; and
a calculation step of calculating, by a machine learning unit, for the first moving image, an estimated value that the motion of the object captured in the first moving image is the first motion and an estimated value that it is the second motion, and, for the second moving image, an estimated value that the motion of the object captured in the second moving image is the second motion and an estimated value that it is the first motion,
wherein, in the linking step, a first combination that is a combination of the estimated value for the first motion and the estimated value for the second motion calculated in the calculation step for the first moving image is linked with the first moving image as motion information indicating the motion of the object captured in the first moving image,
in the determination step, for the second moving image serving as the one moving image, a second combination, in which the estimated value for the second motion calculated in the calculation step is taken as the estimated value for the first motion and the estimated value for the first motion calculated in the calculation step is taken as the estimated value for the second motion, is determined as the motion information indicating the motion of the object captured in the first moving image serving as the other moving image, and
in the linking step, the first moving image is linked with the second combination.

An information processing program that causes a computer to execute:
a conversion step of converting a first moving image capturing a first motion, out of the first motion and a second motion that is the reverse of the first motion, into a second moving image obtained by playing the first moving image in reverse;
a determination step of determining, based on motion information indicating the motion of the object captured in one of the first moving image and the second moving image, motion information indicating the motion of the object captured in the other moving image;
a linking step of linking the other moving image with the motion information determined in the determination step; and
a calculation step of calculating, by a machine learning unit, for the first moving image, an estimated value that the motion of the object captured in the first moving image is the first motion and an estimated value that it is the second motion, and, for the second moving image, an estimated value that the motion of the object captured in the second moving image is the second motion and an estimated value that it is the first motion,
wherein, in the linking step, a first combination that is a combination of the estimated value for the first motion and the estimated value for the second motion calculated in the calculation step for the first moving image is linked with the first moving image as motion information indicating the motion of the object captured in the first moving image,
in the determination step, for the second moving image serving as the one moving image, a second combination, in which the estimated value for the second motion calculated in the calculation step is taken as the estimated value for the first motion and the estimated value for the first motion calculated in the calculation step is taken as the estimated value for the second motion, is determined as the motion information indicating the motion of the object captured in the first moving image serving as the other moving image, and
in the linking step, the first moving image is linked with the second combination.