JP6953818B2

JP6953818B2 - Operation judgment device

Info

Publication number: JP6953818B2
Application number: JP2017117239A
Authority: JP
Inventors: 和弥毛利; 倫史葉石
Original assignee: Aisin Seiki Co Ltd; Aisin Corp
Current assignee: Aisin Corp
Priority date: 2016-11-14
Filing date: 2017-06-14
Publication date: 2021-10-27
Anticipated expiration: 2037-06-14
Also published as: JP2018085087A

Description

An embodiment of the present invention relates to a motion determination device.

Conventionally, motion determination devices are known that determine the motion of a person based on an image captured by an imaging device.

Japanese Unexamined Patent Publication No. 2012-003364

In this type of motion determination device, further improvement in the accuracy of motion determination may be desired, as one example.

As an example, a motion determination device according to an embodiment of the present invention includes: an extraction unit that extracts feature points of an object from a captured image; a tracking unit that generates tracking information indicating the moving direction of the object, based on the feature points respectively extracted from temporally successive captured images; and a determination unit that determines whether or not a motion has been performed, based on the result of comparing a tracking information group, in which a plurality of pieces of the tracking information are accumulated in time series, with a registered information group registered in advance in association with the motion of the object. Thus, as an example, by treating the tracking information for a plurality of frames as a single block (the tracking information group) and comparing it with the pre-registered registered information group, whether or not the motion corresponding to the registered information group has been performed can be determined more accurately than when the motion is determined from a single piece of tracking information.

In the above motion determination device, as an example, the determination unit determines, after determining that the similarity between the tracking information group and the registered information group is equal to or greater than a threshold, whether or not the object is stationary, and if it determines that the object is stationary, determines that the motion corresponding to the registered information group has been performed. Thus, as an example, a motion intended as the "motion corresponding to the registered information group" can be distinguished from a motion that merely resembles it and happens to be included in a series of motions. Erroneous determination of the "motion corresponding to the registered information group" can therefore be reduced.
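The two-stage decision described above (similarity check first, stationary check second) can be sketched as follows. This is only an illustrative sketch: the function name, the representation of post-match movement as per-frame displacements, and the tolerance parameter are assumptions and are not taken from the specification.

```python
def gesture_performed(similarity, similarity_threshold,
                      displacements_after_match, still_tolerance):
    """Return True only when a match is followed by the object holding still.

    similarity: score between the tracking information group and the
        registered information group (how it is computed is not shown here).
    displacements_after_match: hypothetical list of (dx, dy) frame-to-frame
        displacements observed after the similarity condition was met.
    still_tolerance: hypothetical bound below which movement counts as "still".
    """
    # Stage 1: the accumulated tracking information must resemble the
    # registered motion closely enough.
    if similarity < similarity_threshold:
        return False
    # Without any frames after the match, stillness cannot be confirmed.
    if not displacements_after_match:
        return False
    # Stage 2: the object must be stationary after the matching movement.
    return all(abs(dx) <= still_tolerance and abs(dy) <= still_tolerance
               for dx, dy in displacements_after_match)
```

A movement that matches the registered pattern but continues into another movement fails stage 2, which is how an incidental look-alike motion is rejected.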

In the above motion determination device, as an example, the tracking unit sets, around the object in the captured image, a tracking area that is wide in a direction corresponding to the motion corresponding to the registered information group, and generates the tracking information based on the feature points included in the set tracking area. Thus, as an example, failures to track the object can be made less likely.

As an example, the above motion determination device includes an extraction parameter changing unit that changes the parameters used for processing by the extraction unit according to distance information from the imaging position of the captured image to the object. Thus, as an example, the number of extracted feature points is optimized according to the distance from the imaging position to the person, making it less likely that feature points extracted in excess become noise and degrade the accuracy of motion determination.

As an example, the above motion determination device includes a tracking parameter changing unit that changes the parameters used for processing by the tracking unit according to distance information from the imaging position of the captured image to the object. Thus, as an example, the tracking range of the object is optimized according to the distance from the imaging position to the person, making it less likely that the object is missed during tracking.

As an example, the above motion determination device includes a person identification unit that identifies a target person based on the behavior history of persons included in the captured image, and the extraction unit extracts the feature points of the object from the target person identified by the person identification unit. Thus, as an example, the processing by the extraction unit, the tracking unit, and the determination unit is not executed for persons other than the target person, so that an increase in processing load can be suppressed when the captured image includes a plurality of persons. Further, since the influence of motions by persons other than the target person is eliminated, a decrease in determination accuracy can be prevented even when the captured image includes a plurality of persons.

In the above motion determination device, as an example, the person identification unit generates the behavior history based on the captured image, and identifies the target person based on the similarity between the generated behavior history and behavior pattern registration information registered in advance. Thus, as an example, a person behaving in a manner similar to a pre-registered behavior pattern can be identified as the target person.

FIG. 1 is a block diagram showing an example of the configuration of the motion determination device according to the first embodiment.
FIG. 2A is a diagram showing an example of the area setting process.
FIG. 2B is a diagram showing an example of the area setting process.
FIG. 3 is a diagram showing an example of feature points extracted from a captured image in which the distance from the imaging position to the person is D1.
FIG. 4A is a diagram showing an example of feature points extracted from a captured image in which the distance from the imaging position to the person is D2, which is shorter than D1, when the extraction threshold is fixed.
FIG. 4B is a diagram showing an example of feature points extracted from a captured image in which the distance from the imaging position to the person is D2, which is shorter than D1, when the extraction threshold is changed according to the distance information.
FIG. 5 is a diagram showing an example of the tracking information generation process.
FIG. 6A is a diagram showing an example of a registered gesture.
FIG. 6B is a diagram showing an example of a tracking area.
FIG. 7 is a diagram showing an example of the comparison process.
FIG. 8 is a diagram showing an example of the stationary determination process.
FIG. 9 is a flowchart showing an example of the procedure of the processing executed by the parameter changing unit, the extraction unit, the tracking unit, and the determination unit.
FIG. 10A is a diagram showing an example of the narrowing process according to a modified example.
FIG. 10B is a diagram showing an example of the narrowing process according to a modified example.
FIG. 11 is a block diagram showing an example of the configuration of the motion determination device according to the second embodiment.
FIG. 12 is a block diagram showing an example of the configuration of the person identification unit.
FIG. 13A is a diagram showing an example of the person identification process.
FIG. 13B is a diagram showing an example of the person identification process.
FIG. 14 is a plan view showing an example of a vehicle equipped with the motion determination device.
FIG. 15 is a diagram showing how a target person is identified from among a plurality of persons present behind the vehicle.

(First Embodiment)
[1. Configuration of the motion determination device]
First, the configuration of the motion determination device according to the first embodiment will be described with reference to FIG. 1. FIG. 1 is a block diagram showing an example of the configuration of the motion determination device according to the first embodiment.

As shown in FIG. 1, the motion determination device 1 according to the first embodiment determines the motion of a person based on the captured image input from the imaging device 10, and outputs the determination result to an external device.

The motion determination device 1 includes, for example, a microcomputer having a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), input/output ports, and the like, as well as various circuits.

The motion determination device 1 includes a plurality of processing units that function when the CPU executes a drive control program stored in the ROM, using the RAM as a work area. Specifically, the motion determination device 1 includes a parameter changing unit 2, an extraction unit 3, a tracking unit 4, and a determination unit 5. The motion determination device 1 also includes a storage unit 6.

Each processing unit included in the motion determination device 1 may be partially or wholly implemented in hardware such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

(Imaging device 10)
The imaging device 10 is, for example, a digital camera with a built-in image sensor such as a CCD (Charge Coupled Device) or a CIS (CMOS Image Sensor). The imaging device 10 outputs the image data of captured images taken at a predetermined frame rate (hereinafter simply referred to as "captured images") to the parameter changing unit 2 and the extraction unit 3.

(Parameter changing unit 2)
The parameter changing unit 2 changes various parameters used in the processing of the extraction unit 3 and the tracking unit 4, described later, according to the distance from the imaging position of the captured image to the person in the captured image. Specifically, the parameter changing unit 2 includes a distance estimation unit 21, an extraction parameter changing unit 22, and a tracking parameter changing unit 23.

The distance estimation unit 21 estimates the distance from the imaging position of the captured image to the person in the captured image. As an example, the distance estimation unit 21 detects the person's feet in the captured image and estimates the distance to the person from the detected position of the feet (that is, the person's standing position). The distance estimation unit 21 outputs the estimation result as distance information to the extraction parameter changing unit 22 and the tracking parameter changing unit 23. The distance estimation unit 21 performs this distance estimation process for every frame.

Although an example of estimating the distance to the person based on the captured image has been shown here, the distance estimation unit 21 may estimate the distance to the person based on information input from a scanning laser, an ultrasonic sensor, a stereo camera, a TOF (Time Of Flight) camera, or the like.

Further, the motion determination device 1 need not necessarily include the distance estimation unit 21, and may be configured to acquire the distance information from outside.

Each time distance information is input from the distance estimation unit 21, the extraction parameter changing unit 22 changes the various parameters used in the processing of the extraction unit 3, described later, according to the distance information, and outputs the changed parameters to the extraction unit 3.

The parameters used in the processing of the extraction unit 3 may include, for example, the size of the processing target area R (see FIG. 2B) set by the area setting unit 31 described later, the detection threshold used in the moving object detection process by the moving object detection unit 32, and the extraction threshold and the maximum number of extracted feature points used in the feature point extraction process by the extraction processing unit 33.

Each time distance information is input from the distance estimation unit 21, the tracking parameter changing unit 23 changes the various parameters used in the processing of the tracking unit 4, described later, according to the distance information, and outputs the changed parameters to the tracking unit 4.

The parameters used in the processing of the tracking unit 4 may include, for example, for the processing by the tracking information generation unit 41 described later, the range within which a plurality of feature points are clustered, the minimum number of feature points to be clustered, the minimum number of clusters, and the range within which a cluster is tracked between two frames. The parameters used in the processing of the tracking unit 4 may also include, for example, for the processing by the accumulation unit 42 described later, the tracking range used when a cluster is tracked over a plurality of frames.

As an example, the extraction parameter changing unit 22 and the tracking parameter changing unit 23 use conversion information 61 stored in the storage unit 6 to convert the distance information input from the distance estimation unit 21 into the various parameters. The conversion information 61 is information, such as a conversion table or a conversion map, that indicates the relationship between the distance information and each parameter, obtained in advance by experiment or simulation. Alternatively, the extraction parameter changing unit 22 and the tracking parameter changing unit 23 may convert the distance information into the parameters using an equation, or an approximation thereof, that expresses the relationship between the distance information and each parameter, obtained in advance by experiment or simulation.
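One possible form of such a conversion table is sketched below: a short list of (distance, parameter) pairs with linear interpolation between entries. The table values, the choice of the detection threshold as the parameter, and the use of linear interpolation are all assumptions made for illustration; the specification only states that the relationship is obtained in advance by experiment or simulation.

```python
from bisect import bisect_right

# Hypothetical conversion table: (distance in metres, detection threshold).
# The values are made up for illustration only.
DETECTION_THRESHOLD_TABLE = [(1.0, 40.0), (2.0, 20.0), (4.0, 10.0)]


def distance_to_parameter(distance, table):
    """Convert a distance into a parameter value using a sorted lookup table.

    Distances outside the table are clamped to the nearest entry; distances
    between two entries are linearly interpolated (an assumed policy).
    """
    distances = [d for d, _ in table]
    if distance <= distances[0]:
        return table[0][1]
    if distance >= distances[-1]:
        return table[-1][1]
    i = bisect_right(distances, distance)
    (d0, p0), (d1, p1) = table[i - 1], table[i]
    t = (distance - d0) / (d1 - d0)
    return p0 + t * (p1 - p0)
```

For example, `distance_to_parameter(1.5, DETECTION_THRESHOLD_TABLE)` falls halfway between the 1 m and 2 m entries. The same function could serve every parameter by passing a different table.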

By optimizing the parameters used in the processing of the extraction unit 3 according to the distance from the imaging position to the person in this way, the extraction unit 3 can extract feature points in a manner appropriate to that distance. Although details will be described later, for example, optimizing the number of extracted feature points according to the distance from the imaging position to the person makes it less likely that feature points extracted in excess become noise and degrade the accuracy of motion determination.

Similarly, by optimizing the parameters used in the processing of the tracking unit 4 according to the distance from the imaging position to the person, the tracking unit 4 can track feature points in a manner appropriate to the distance to the person. Although details will be described later, for example, optimizing the tracking range of the object according to the distance from the imaging position to the person makes it less likely that the object is missed during tracking.

(Extraction unit 3)
The extraction unit 3 extracts feature points of the object from the captured image. Specifically, the extraction unit 3 includes an area setting unit 31, a moving object detection unit 32, and an extraction processing unit 33.

The area setting unit 31 sets a processing target area for the captured image input from the imaging device 10. The processing target area is an area set around the person in the captured image.

The area setting process by the area setting unit 31 will now be described with reference to FIGS. 2A and 2B. FIGS. 2A and 2B are diagrams showing an example of the area setting process. They show how a processing target area is set around the person H appearing in the captured image X1.

For example, as shown in FIG. 2A, information on the position P of the feet of the person H is input from the distance estimation unit 21 to the area setting unit 31. In addition, a parameter indicating the size of the processing target area is input from the extraction parameter changing unit 22 to the area setting unit 31. Then, as shown in FIG. 2B, the area setting unit 31 sets, on the captured image X1, a processing target area R of the size indicated by the parameter input from the extraction parameter changing unit 22, using the foot position P input from the distance estimation unit 21 as a reference.

As a result, the processing target area R surrounding the person H is set on the captured image X1, and the processing executed by the moving object detection unit 32 and the subsequent processing units is performed based on the information within the processing target area R of the captured image X1.

Setting the processing target area R around the person H in this way reduces the influence of noise from objects other than the person H that appear in the captured image X1. Setting the processing target area R also reduces the number of pixels that need to be processed, which improves processing efficiency.

The size of the processing target area R is optimized according to the distance information. Specifically, a person relatively close to the imaging position appears larger in the captured image than a person relatively far from the imaging position. Therefore, the extraction parameter changing unit 22 changes the size of the processing target area R so that it becomes larger as the person approaches the imaging position and smaller as the person moves away.
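The placement and scaling of the processing target area R might look roughly like the following sketch. Several points are assumptions rather than statements from the specification: the area is taken to be a rectangle whose bottom edge is centered on the foot position P, its size is scaled inversely with distance from a hypothetical reference size, and it is clipped to the image bounds.

```python
def processing_region(foot_xy, distance_m, image_size,
                      ref_size=(200, 400), ref_distance_m=2.0):
    """Return (left, top, right, bottom) of area R in pixel coordinates.

    foot_xy: foot position P as (x, y) pixels.
    distance_m: estimated distance to the person (must be positive).
    image_size: (width, height) of the captured image.
    ref_size, ref_distance_m: hypothetical size of R at a reference distance.
    """
    # Apparent size is assumed to scale inversely with distance.
    scale = ref_distance_m / distance_m
    width = ref_size[0] * scale
    height = ref_size[1] * scale
    fx, fy = foot_xy
    # The feet sit at the bottom centre of R; clip R to the image.
    left = max(0, round(fx - width / 2))
    right = min(image_size[0], round(fx + width / 2))
    top = max(0, round(fy - height))
    bottom = min(image_size[1], round(fy))
    return left, top, right, bottom
```

With these assumed defaults, halving the distance doubles the nominal width and height of R, after which clipping limits the area to what is actually visible.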

Optimizing the size of the processing target area R according to the distance information in this way further reduces the influence of noise from objects other than the person H, and further improves processing efficiency, compared with, for example, a case where the size of the processing target area R is fixed.

The moving object detection unit 32 detects an object that moves within the processing target area R (hereinafter sometimes referred to as a "moving object").

As a method for detecting a moving object, the inter-frame difference method can be used, for example. The inter-frame difference method identifies changed locations by comparing the pixel values of a plurality of temporally successive captured images, for example, the captured image of the current frame and the captured image of the immediately preceding frame. The moving object detection unit 32 detects, as a moving object, each location where the amount of change in pixel value exceeds the detection threshold, together with its surrounding region.
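A minimal sketch of the inter-frame difference method follows, using plain nested lists of grayscale values so that it has no dependencies. The function name and the `margin` parameter, which stands in for "the surrounding region", are assumptions; the specification does not say how large that surrounding region is.

```python
def detect_moving_region(prev_frame, curr_frame, detection_threshold, margin=1):
    """Return a boolean mask marking the moving object.

    prev_frame, curr_frame: 2D lists of grayscale pixel values, same shape.
    detection_threshold: minimum absolute change that counts as movement.
    margin: hypothetical number of pixels added around each changed pixel.
    """
    height, width = len(curr_frame), len(curr_frame[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            change = abs(curr_frame[y][x] - prev_frame[y][x])
            if change > detection_threshold:
                # Mark the changed pixel and its neighbourhood.
                for yy in range(max(0, y - margin), min(height, y + margin + 1)):
                    for xx in range(max(0, x - margin), min(width, x + margin + 1)):
                        mask[yy][xx] = True
    return mask
```

In practice the two frames would be cropped to the processing target area R before the comparison, so that only pixels inside R are examined.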

The detection threshold is one of the parameters input from the extraction parameter changing unit 22 and, as described above, is optimized according to the distance information. Specifically, the movement of a person relatively close to the imaging position (that is, the amount of change in pixel values between two frames) appears larger than the movement of a person relatively far from the imaging position. Therefore, the extraction parameter changing unit 22 changes the detection threshold so that it becomes larger as the person approaches the imaging position and smaller as the person moves away.

Optimizing the detection threshold according to the distance information in this way improves the accuracy of moving object detection compared with, for example, a case where the detection threshold is fixed.

The moving object detection unit 32 may detect a moving object using a method other than the inter-frame difference method. For example, when the imaging device 10 is installed at a fixed position, a moving object can also be detected using the background subtraction method. The background subtraction method prepares a reference image in advance and identifies changed locations by comparing the pixel values of the captured image input from the imaging device 10 with those of the reference image.

The extraction processing unit 33 extracts feature points from the moving object detected by the moving object detection unit 32. As a method for extracting feature points, the Harris corner detection method can be used, for example. The Harris corner detection method is one of the methods that detect corners in a captured image as feature points.

The extraction processing unit 33 compares the value calculated by the Harris corner detection method with the extraction threshold and, if the value is larger than the extraction threshold, extracts the location corresponding to that value as a feature point. It then outputs information such as the positions of the extracted feature points to the tracking unit 4.
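The comparison against the extraction threshold can be illustrated with a small dependency-free sketch of the Harris corner response. The central-difference gradients, the unweighted 3x3 window, and the constant k = 0.04 are common textbook choices assumed here for illustration; the specification does not give these details, and the function names are hypothetical.

```python
def harris_response(image, k=0.04):
    """Compute the Harris response R = det(M) - k * trace(M)^2 per pixel.

    image: 2D list of grayscale values. Border pixels keep a response of 0.
    """
    height, width = len(image), len(image[0])
    grad_x = [[0.0] * width for _ in range(height)]
    grad_y = [[0.0] * width for _ in range(height)]
    # Central-difference image gradients.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            grad_x[y][x] = (image[y][x + 1] - image[y][x - 1]) / 2.0
            grad_y[y][x] = (image[y + 1][x] - image[y - 1][x]) / 2.0
    response = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Structure tensor M summed over an unweighted 3x3 window.
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx = grad_x[y + dy][x + dx]
                    gy = grad_y[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            response[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return response


def extract_feature_points(image, extraction_threshold):
    """Return (x, y) positions whose Harris response exceeds the threshold."""
    response = harris_response(image)
    return [(x, y)
            for y, row in enumerate(response)
            for x, value in enumerate(row)
            if value > extraction_threshold]
```

Flat regions give a response of zero and straight edges give a negative response, so only corner-like locations can exceed a positive extraction threshold. Lowering the threshold admits more locations, which is why the threshold controls the number of extracted feature points.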

The extraction threshold is one of the parameters input from the extraction parameter changing unit 22 and, as described above, is optimized according to the distance information. Therefore, the motion determination device 1 according to the first embodiment can detect feature points more appropriately than when the extraction threshold is fixed.

This point will be described with reference to FIGS. 3, 4A, and 4B. FIG. 3 is a diagram showing an example of feature points extracted from a captured image in which the distance from the imaging position to the person H is D1. FIGS. 4A and 4B are diagrams showing examples of feature points extracted from a captured image in which the distance from the imaging position to the person H is D2, which is shorter than D1. FIG. 4A shows an example in which the extraction threshold is fixed, and FIG. 4B shows an example in which the extraction threshold is changed according to the distance information.

As shown in FIG. 3, assume that when feature points are extracted from the captured image X2, in which the distance from the imaging position to the person H is D1 (for example, 2 meters), using the extraction threshold for the distance D1, an optimum number (for example, five) of feature points F are extracted.

Now suppose that the extraction threshold for the distance D1 is used unchanged to extract feature points from the captured image X3, in which the distance from the imaging position to the person H is D2 (for example, 1 meter), as shown in FIG. 4A.

In this case, the number of extracted feature points exceeds the optimum number of five. This is because the closer the person H is to the imaging position, the more complex the contour of the person H becomes, and the more locations are detected as corners. When the number of extracted feature points exceeds the optimum number, the influence of noise increases and the accuracy of motion determination may decrease.

In contrast, in the motion determination device 1 according to the first embodiment, the extraction threshold is optimized according to the distance information. Specifically, the extraction threshold used when the distance to the person H is D2 is larger than the extraction threshold used when the distance to the person H is D1, so that fewer locations exceed it. When the distance to the person H is D2, feature points are thus extracted using an extraction threshold suited to the distance D2, so that, as shown in FIG. 4B, an optimum number of feature points can be extracted regardless of the distance from the imaging position to the person H. Therefore, the motion determination device 1 according to the first embodiment can suppress a decrease in the accuracy of motion determination when the distance from the imaging position to the person H changes.

When the number of locations where the value calculated by the Harris corner detection method exceeds the extraction threshold is greater than the maximum number of extracted feature points input from the extraction parameter changing unit 22, the extraction processing unit 33 limits the number of feature points to be extracted to the maximum number of extracted feature points. The maximum number of extracted feature points is also one of the parameters input from the extraction parameter changing unit 22 and, as described above, is optimized according to the distance information.
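The limiting step might be sketched as follows. The specification does not say which feature points are kept when the limit is exceeded; keeping the candidates with the strongest corner response is an assumption made here, and the function name and the candidate tuple layout are hypothetical.

```python
def limit_feature_points(candidates, max_points):
    """Keep at most max_points candidates, preferring strong responses.

    candidates: list of (response, x, y) tuples that already exceeded the
        extraction threshold.
    max_points: maximum number of extracted feature points.
    """
    if len(candidates) <= max_points:
        return list(candidates)
    # Assumed policy: keep the candidates with the largest response values.
    strongest = sorted(candidates, key=lambda c: c[0], reverse=True)
    return strongest[:max_points]
```

Other policies, such as spreading the kept points evenly over the moving object, would fit the description equally well.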

ここでは、抽出処理部３３が、Ｈａｒｒｉｓコーナー検出法を用いて特徴点を抽出する場合の例について説明したが、抽出処理部３３は、Ｈａｒｒｉｓコーナー検出法に限らず、たとえば、ＦＡＳＴ、ＤｏＧ、ＳＩＦＴ、ＳＵＲＦといった他の手法を用いて特徴点を抽出してもよい。 Although an example in which the extraction processing unit 33 extracts feature points using the Harris corner detection method has been described here, the extraction processing unit 33 is not limited to the Harris corner detection method and may extract feature points using another technique such as FAST, DoG, SIFT, or SURF.

（追跡部４について） (About tracking unit 4)
追跡部４は、抽出部３によって抽出された特徴点を追跡する。具体的には、追跡部４は、追跡情報生成部４１と、蓄積部４２とを備える。 The tracking unit 4 tracks the feature points extracted by the extraction unit 3. Specifically, the tracking unit 4 includes a tracking information generation unit 41 and a storage unit 42.

追跡情報生成部４１は、時間的に前後する２つの撮像画像からそれぞれ抽出される特徴点に基づき、２フレーム間における対象物の移動方向を示す追跡情報を生成する。 The tracking information generation unit 41 generates tracking information indicating the moving direction of the object between two frames, based on the feature points extracted from two temporally successive captured images.

ここで、追跡情報生成部４１による追跡情報生成処理の一例について図５を参照して説明する。図５は、追跡情報生成処理の一例を示す図である。 Here, an example of the tracking information generation process by the tracking information generation unit 41 will be described with reference to FIG. FIG. 5 is a diagram showing an example of tracking information generation processing.

図５に示すように、追跡情報生成部４１は、まず、複数の特徴点Ｆを一つのかたまり（クラスターＣ）と見なすクラスタリング処理を行う。 As shown in FIG. 5, the tracking information generation unit 41 first performs a clustering process in which a plurality of feature points F are regarded as one block (cluster C).

クラスタリングの手法としては、たとえば、Ｗａｒｄ法を用いることができる。Ｗａｒｄ法では、まず、クラスタリングの対象となる複数（図の例では５個）の特徴点Ｆ間のユークリッド距離をそれぞれ算出する。つづいて、最小距離にある２個の特徴点Ｆを１個のクラスターとし、２個の特徴点Ｆの重心をこのクラスターの位置とする。つづいて、１個にまとめたクラスターを含めた各クラスター間のユークリッド距離を算出し、最小距離にある２個のクラスターをまとめて１個のクラスターとする。以上の処理を、複数の特徴点Ｆが人体の部位（手、足、頭など）ごとに１個のクラスターＣとなるまで繰り返す。これにより、人体の部位（手、足、頭など）ごとに１個のクラスターＣが得られる。 As a clustering method, for example, Ward's method can be used. In Ward's method, first, the Euclidean distances between a plurality of feature points F (five in the example of the figure) to be clustered are calculated. Subsequently, the two feature points F at the minimum distance are set as one cluster, and the center of gravity of the two feature points F is set as the position of this cluster. Next, the Euclidean distance between each cluster including the clusters combined into one is calculated, and the two clusters at the minimum distance are combined into one cluster. The above process is repeated until the plurality of feature points F become one cluster C for each part of the human body (hand, foot, head, etc.). As a result, one cluster C is obtained for each part of the human body (hands, feet, head, etc.).

追跡情報生成部４１は、クラスタリングの最大範囲（クラスターの最大サイズ）、最小特徴点数、最小クラスター数といった各種のパラメータを用いて上記のクラスタリング処理を実行する。これらのパラメータは、追跡用パラメータ変更部２３から入力されるパラメータの一部であり、上述したように距離情報に応じて最適化される。 The tracking information generation unit 41 executes the above clustering process using various parameters such as the maximum clustering range (maximum size of clusters), the minimum number of feature points, and the minimum number of clusters. These parameters are a part of the parameters input from the tracking parameter changing unit 23, and are optimized according to the distance information as described above.

これにより、第１の実施形態に係る動作判定装置１では、撮像位置から人物までの距離に応じた適切なクラスターＣを得ることができる。たとえば、撮像位置から人物までの距離が遠いほど、クラスタリングの最大範囲を小さくし、最小特徴点数を少なくすることで、対象物（たとえば、手）以外の物の特徴点がクラスターＣに含まれ難くすることができる。 As a result, the motion determination device 1 according to the first embodiment can obtain a cluster C appropriate to the distance from the imaging position to the person. For example, the farther the person is from the imaging position, the smaller the maximum clustering range and the minimum number of feature points are made, which makes it less likely that feature points of things other than the object (for example, a hand) are included in the cluster C.
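The merge steps described above (find the nearest pair, merge it, place the merged cluster at the center of gravity, repeat) can be sketched as below. This is a hedged illustration that follows the steps as the text describes them; it is not a full variance-minimizing Ward implementation. The stopping rule based on `max_range` is an assumption standing in for the "maximum clustering range" parameter, and the member-weighted centroid for later merges is likewise an assumption.

```python
import math


def cluster_points(points, max_range):
    """Agglomerative clustering of 2-D feature points.

    Returns a list of (centroid_x, centroid_y, member_count) tuples.
    max_range is a hypothetical stand-in for the maximum clustering range:
    merging stops once the nearest pair is farther apart than this.
    """
    clusters = [(float(x), float(y), 1) for x, y in points]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest Euclidean distance.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(clusters[i][:2], clusters[j][:2])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_range:
            break  # remaining clusters are treated as separate parts
        xi, yi, ni = clusters[i]
        xj, yj, nj = clusters[j]
        n = ni + nj
        # The merged cluster sits at the member-weighted center of gravity.
        merged = ((xi * ni + xj * nj) / n, (yi * ni + yj * nj) / n, n)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```

With a smaller `max_range` for distant subjects, points belonging to nearby background objects are less likely to be pulled into the same cluster, which matches the intent stated in the text.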

つづいて、追跡情報生成部４１は、クラスターＣの２フレーム間の動きを追跡する処理を行う。 Subsequently, the tracking information generation unit 41 performs a process of tracking the movement between the two frames of the cluster C.

追跡手法としては、たとえば、Ｌｕｃａｓ−Ｋａｎａｄｅ法を用いることができる。Ｌｕｃａｓ−Ｋａｎａｄｅ法は、２つの画像の対応点を探索してその速度ベクトルを求める手法である。追跡情報生成部４１は、Ｌｕｃａｓ−Ｋａｎａｄｅ法を用いてクラスターＣの２フレーム間における移動方向および移動速度の情報を含む追跡情報を生成し、生成した追跡情報を蓄積部４２へ出力する。 As the tracking method, for example, the Lucas-Kanade method can be used. The Lucas-Kanade method is a method of searching for the corresponding points of two images and obtaining the velocity vector thereof. The tracking information generation unit 41 generates tracking information including information on the moving direction and moving speed between two frames of the cluster C by using the Lucas-Kanade method, and outputs the generated tracking information to the storage unit 42.

なお、追跡情報生成部４１は、Ｌｕｃａｓ−Ｋａｎａｄｅ法に限らず、たとえば、ブロックマッチング法等の他の手法を用いてクラスターＣの追跡を行ってもよい。 The tracking information generation unit 41 may track the cluster C by using not only the Lucas-Kanade method but also another method such as a block matching method.

ここで、追跡情報生成部４１は、２フレーム間におけるクラスターＣを追跡する範囲（以下、「追跡領域」と記載する）を設定し、設定した追跡領域内においてクラスターＣの追跡を行う。第１の実施形態に係る動作判定装置１では、登録された動作（以下、「登録ジェスチャ」と記載する）の動作方向に幅広い追跡領域が用いられる。この点について図６Ａおよび図６Ｂを参照して説明する。 Here, the tracking information generation unit 41 sets a range for tracking the cluster C between two frames (hereinafter, referred to as a “tracking area”), and tracks the cluster C within the set tracking area. In the motion determination device 1 according to the first embodiment, a wide tracking area is used in the motion direction of the registered motion (hereinafter, referred to as “registered gesture”). This point will be described with reference to FIGS. 6A and 6B.

図６Ａに示すように、登録ジェスチャとして、たとえば、手を上げて下げる動作が登録されているとする。この場合、追跡情報生成部４１は、登録ジェスチャの動作方向である上下方向に幅広い矩形状の追跡領域Ｗを対象物（ここでは、手）の周囲に設定する。 As shown in FIG. 6A, suppose that, for example, the action of raising and lowering the hand is registered as a registered gesture. In this case, the tracking information generation unit 41 sets, around the object (here, the hand), a rectangular tracking area W that is wide in the vertical direction, which is the motion direction of the registered gesture.

図６Ｂに示すように、追跡情報生成部４１は、設定した追跡領域Ｗ内においてクラスターＣの追跡を行う。たとえば、追跡情報生成部４１は、現在のフレームにおけるクラスターＣの位置を基準に追跡領域Ｗを設定し、設定した追跡領域Ｗ内に存在する１フレーム前のクラスターＣ（破線で示したクラスターＣ）と現在のフレームのクラスターＣ（実線で示したクラスターＣ）とを対応付けることによって追跡情報を生成する。 As shown in FIG. 6B, the tracking information generation unit 41 tracks the cluster C within the set tracking area W. For example, the tracking information generation unit 41 sets the tracking area W with reference to the position of the cluster C in the current frame, and generates tracking information by associating the cluster C of one frame earlier that lies within the set tracking area W (the cluster C shown by the broken line) with the cluster C of the current frame (the cluster C shown by the solid line).

このように、追跡情報生成部４１は、クラスターＣの追跡を、登録ジェスチャの動作方向に応じた方向に幅広い追跡領域Ｗ内において行うことにより、対象物を追跡し損ねる事態を生じにくくすることができる。また、追跡領域Ｗは、言い換えれば、登録ジェスチャの動作方向と直交する方向に幅狭の領域でもあるため、対象物以外の物の影響を受けにくくすることができる。 In this way, by tracking the cluster C within a tracking area W that is wide in the direction corresponding to the motion direction of the registered gesture, the tracking information generation unit 41 can make it less likely that the object is lost during tracking. In other words, the tracking area W is also narrow in the direction orthogonal to the motion direction of the registered gesture, so it is less affected by things other than the object.
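The association step inside a direction-dependent tracking area can be illustrated with the sketch below. It does not implement the Lucas-Kanade method; it only shows, under simplifying assumptions, how a rectangular window centered on the current cluster restricts which previous-frame cluster is matched, and how a direction and speed could then be derived from the displacement. The function names, the nearest-candidate rule, and the `(direction, speed)` output format are hypothetical.

```python
import math


def in_tracking_area(center, point, half_width, half_height):
    """True if point lies inside the rectangle centered on center."""
    return (abs(point[0] - center[0]) <= half_width
            and abs(point[1] - center[1]) <= half_height)


def make_tracking_info(prev_clusters, current, half_width, half_height):
    """Associate the current cluster with a previous-frame cluster.

    Returns (direction_in_degrees, speed_in_pixels_per_frame), or None
    when no previous cluster falls inside the tracking area. Image
    coordinates are assumed, so negative dy means upward motion.
    """
    candidates = [p for p in prev_clusters
                  if in_tracking_area(current, p, half_width, half_height)]
    if not candidates:
        return None
    # Assumption: the nearest candidate is the same physical object.
    prev = min(candidates, key=lambda p: math.dist(p, current))
    dx, dy = current[0] - prev[0], current[1] - prev[1]
    return math.degrees(math.atan2(dy, dx)), math.hypot(dx, dy)
```

A tall window (small `half_width`, large `half_height`) accepts the hand's own vertical motion while ignoring a cluster that appears to the side, which is the effect the text attributes to the tracking area W.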

追跡情報生成部４１は、図６Ａに示すように、複数フレーム間においてクラスターＣを追跡する範囲（以下、「ジェスチャ領域Ｚ」と記載する）を設定し、設定したジェスチャ領域ＺにおいてクラスターＣの追跡を行う。言い換えれば、追跡情報生成部４１は、ジェスチャ領域Ｚから外れたクラスターＣについては追跡を行わない。このジェスチャ領域Ｚも、追跡領域Ｗと同様に、登録ジェスチャの動作方向に幅広い形状を有する。したがって、対象物を追跡し損ねる事態を生じにくくすることができる。また、対象物以外の物の影響を受けにくくすることができる。 As shown in FIG. 6A, the tracking information generation unit 41 sets a range in which the cluster C is tracked across a plurality of frames (hereinafter referred to as the "gesture area Z") and tracks the cluster C within the set gesture area Z. In other words, the tracking information generation unit 41 does not track a cluster C that has left the gesture area Z. Like the tracking area W, the gesture area Z has a shape that is wide in the motion direction of the registered gesture. This makes it less likely that the object is lost during tracking and also reduces the influence of things other than the object.

追跡領域Ｗおよびジェスチャ領域Ｚは、追跡用パラメータ変更部２３から入力されるパラメータの一つであり、上述したように、距離情報に応じて最適化される。具体的には、追跡領域Ｗおよびジェスチャ領域Ｚは、人物Ｈが撮像位置に近づくほど大きくなり、遠ざかるほど小さくなる。このように、距離情報に応じて追跡領域Ｗおよびジェスチャ領域Ｚの大きさを最適化することにより、追跡領域Ｗおよびジェスチャ領域Ｚの大きさを固定とした場合と比較して、対象物を追跡し損ねる事態がより生じにくくなるとともに、対象物以外の物の影響をより受けにくくすることができる。 The tracking area W and the gesture area Z are among the parameters input from the tracking parameter changing unit 23 and, as described above, are optimized according to the distance information. Specifically, the tracking area W and the gesture area Z become larger as the person H approaches the imaging position and smaller as the person H moves away. Optimizing the sizes of the tracking area W and the gesture area Z according to the distance information in this way makes it less likely that the object is lost during tracking, and reduces the influence of things other than the object, compared with the case where the sizes of the tracking area W and the gesture area Z are fixed.

ここで、登録ジェスチャに関する情報は、登録ジェスチャ情報６２として記憶部６に記憶されている（図１参照）。登録ジェスチャ情報６２は、たとえば、登録ジェスチャに対応する人体の部位（手、足、頭など）、追跡領域Ｗの形状、ジェスチャ領域Ｚの形状および後述する登録情報群等の情報を含み得る。 Here, the information regarding the registered gesture is stored in the storage unit 6 as the registered gesture information 62 (see FIG. 1). The registered gesture information 62 may include, for example, information such as a part of the human body (hand, foot, head, etc.) corresponding to the registered gesture, the shape of the tracking area W, the shape of the gesture area Z, and the registration information group described later.

一例として、追跡情報生成部４１は、距離推定部２１によって検出される人物の足部の位置から、その人物の手や頭といった各部位の存在範囲を予測し、予測した存在範囲ごとに、その部位に対応付けられた追跡領域Ｗおよびジェスチャ領域Ｚを設定する。たとえば、動体検出部３２によって検出された動体が「手」の存在範囲に含まれる場合、追跡情報生成部４１は、対象物「手」に対応付けられた登録ジェスチャを登録ジェスチャ情報６２から特定し、特定した登録ジェスチャに対応する追跡領域Ｗおよびジェスチャ領域Ｚを対象物「手」の周囲に設定する。 As an example, the tracking information generation unit 41 predicts, from the position of the person's feet detected by the distance estimation unit 21, the range in which each body part such as the person's hand or head is present, and sets, for each predicted range, the tracking area W and the gesture area Z associated with that body part. For example, when the moving object detected by the moving object detection unit 32 is included in the range in which the "hand" is present, the tracking information generation unit 41 identifies the registered gesture associated with the object "hand" from the registered gesture information 62, and sets the tracking area W and the gesture area Z corresponding to the identified registered gesture around the object "hand".

対象物「手」に対応付けられた登録ジェスチャが複数登録されている場合、追跡情報生成部４１は、対象物「手」に対応付けられた各登録ジェスチャにそれぞれ対応する複数の追跡領域Ｗおよびジェスチャ領域Ｚを対象物「手」の周囲に設定し、それぞれの領域についてクラスターＣの追跡を行う。たとえば、対象物「手」に対し、上述した「手を上げて下げる動作」の他に、「手を横に伸ばす動作」が登録ジェスチャとして登録されているとする。この場合、追跡情報生成部４１は、「手を上げて下げる動作」に対応する上下方向に幅広い追跡領域Ｗおよびジェスチャ領域Ｚと、「手を横に伸ばす動作」に対応する左右方向に幅広い追跡領域Ｗおよびジェスチャ領域Ｚとを対象物「手」の周囲に設定し、設定した領域ごとにクラスターＣの追跡を行う。 When a plurality of registered gestures are associated with the object "hand", the tracking information generation unit 41 sets, around the object "hand", a plurality of tracking areas W and gesture areas Z corresponding respectively to the registered gestures associated with the object "hand", and tracks the cluster C in each of these areas. For example, suppose that, in addition to the "action of raising and lowering the hand" described above, an "action of extending the hand sideways" is registered as a registered gesture for the object "hand". In this case, the tracking information generation unit 41 sets, around the object "hand", a tracking area W and a gesture area Z that are wide in the vertical direction, corresponding to the "action of raising and lowering the hand", and a tracking area W and a gesture area Z that are wide in the horizontal direction, corresponding to the "action of extending the hand sideways", and tracks the cluster C in each of the set areas.

蓄積部４２は、追跡情報生成部４１によって生成された追跡情報を時系列に蓄積した追跡情報群を生成する。 The storage unit 42 generates a tracking information group in which the tracking information generated by the tracking information generation unit 41 is accumulated in chronological order.

具体的には、蓄積部４２は、図示しないバッファに複数フレーム分の追跡情報を時系列に蓄積し、蓄積した複数フレーム分の追跡情報を「追跡情報群」として判定部５の比較部５１に出力する。蓄積部４２は、この処理を追跡情報生成部４１から追跡情報が入力されるごとに実行する。すなわち、蓄積部４２は、追跡情報生成部４１から新たな追跡情報が入力されると、バッファに蓄積されている追跡情報のうち最も古いものを破棄し、追跡情報生成部４１から入力された新たな追跡情報をバッファに追加する。そして、蓄積部４２は、バッファに記憶された追跡情報群を判定部５へ出力する。 Specifically, the storage unit 42 accumulates tracking information for a plurality of frames in time series in a buffer (not shown), and outputs the accumulated tracking information for the plurality of frames to the comparison unit 51 of the determination unit 5 as a "tracking information group". The storage unit 42 executes this process each time tracking information is input from the tracking information generation unit 41. That is, when new tracking information is input from the tracking information generation unit 41, the storage unit 42 discards the oldest tracking information held in the buffer and adds the new tracking information to the buffer. The storage unit 42 then outputs the tracking information group held in the buffer to the determination unit 5.

蓄積部４２は、上記の処理を登録ジェスチャごとに実行する。なお、蓄積するフレーム数は、登録ジェスチャごとに異ならせてもよい。 The storage unit 42 executes the above process for each registered gesture. The number of frames to be accumulated may be different for each registered gesture.
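The sliding buffer behavior described above (drop the oldest entry, append the newest, emit the whole group, with a possibly different length per registered gesture) maps naturally onto a fixed-length deque. The sketch below is one possible arrangement; the class name `TrackingBuffer`, the gesture keys, and the decision to emit the group even before the buffer is full are assumptions.

```python
from collections import deque


class TrackingBuffer:
    """Per-gesture sliding window of tracking information."""

    def __init__(self, frame_counts):
        # frame_counts: {gesture_name: number_of_frames_to_keep}
        self.buffers = {name: deque(maxlen=n)
                        for name, n in frame_counts.items()}

    def push(self, gesture, tracking_info):
        """Add one frame of tracking info and return the current group."""
        buf = self.buffers[gesture]
        # A deque with maxlen discards its oldest item when it is full.
        buf.append(tracking_info)
        return list(buf)  # oldest first, newest last
```

For example, `TrackingBuffer({"raise_lower": 9})` would keep nine frames for that gesture, matching the nine-frame group used in the FIG. 7 example.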

（判定部５について） (About determination unit 5)
判定部５は、追跡部４による特徴点の追跡結果に基づいて登録ジェスチャが行われたか否かを判定する。かかる判定部５は、比較部５１と、静止判定部５２とを備える。 The determination unit 5 determines whether or not the registered gesture has been performed based on the result of tracking the feature points by the tracking unit 4. The determination unit 5 includes a comparison unit 51 and a stationary determination unit 52.

比較部５１は、蓄積部４２から追跡情報群が入力されるごとに、入力された追跡情報群と、記憶部６に記憶された登録ジェスチャ情報６２に含まれる登録情報群とを比較する。 Each time the tracking information group is input from the storage unit 42, the comparison unit 51 compares the input tracking information group with the registered information group included in the registered gesture information 62 stored in the storage unit 6.

ここで、比較部５１による比較処理について図７を参照して説明する。図７は、比較処理の一例を示す図である。 Here, the comparison process by the comparison unit 51 will be described with reference to FIG. 7. FIG. 7 is a diagram showing an example of the comparison process.

図７に示すように、追跡情報群は、複数フレーム（ここでは、９フレーム）分の追跡情報を時系列に蓄積した情報である。図７では、理解を容易にするために、最も古い追跡情報Ｔ１から最新の追跡情報Ｔ９までを紙面左側から順に並べたものを追跡情報群として示している。また、登録情報群は、登録ジェスチャに対応付けて予め登録される情報であって、登録ジェスチャが理想的に行われたと仮定した場合に得られる仮想的な追跡情報を複数フレーム分蓄積した情報である。登録情報群のフレーム数は、必ずしも追跡情報群のフレーム数と同数でなくてもよく、追跡情報群のフレーム数と異なるフレーム数であってもよい。 As shown in FIG. 7, the tracking information group is information in which tracking information for a plurality of frames (here, nine frames) is accumulated in time series. To facilitate understanding, FIG. 7 shows the tracking information group as the oldest tracking information T1 through the latest tracking information T9 arranged in order from the left side of the page. The registered information group is information registered in advance in association with a registered gesture, and is obtained by accumulating, for a plurality of frames, the virtual tracking information that would be obtained if the registered gesture were performed ideally. The number of frames in the registered information group does not necessarily have to equal the number of frames in the tracking information group and may differ from it.

比較部５１は、追跡情報群と登録情報群とを比較し、これらの類似度（尤度）を算出する。そして、比較部５１は、算出した類似度が閾値以上である場合には、登録ジェスチャが行われたと仮判定する。仮判定の手法としては、たとえば、ＤＰ（Dynamic Programming）マッチング法を用いることができる。比較部５１は、この仮判定処理を登録ジェスチャごとに実行する。 The comparison unit 51 compares the tracking information group with the registered information group, and calculates the degree of similarity (likelihood) between them. Then, when the calculated similarity is equal to or higher than the threshold value, the comparison unit 51 tentatively determines that the registration gesture has been performed. As a tentative determination method, for example, a DP (Dynamic Programming) matching method can be used. The comparison unit 51 executes this provisional determination process for each registered gesture.
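A DP matching comparison of this kind can be sketched as a dynamic time warping cost between two sequences of motion vectors, which also tolerates the two groups having different frame counts. Everything specific here is an assumption for illustration: representing each piece of tracking information as a 2-D vector, using Euclidean distance as the local cost, and mapping the accumulated cost to a similarity with `1 / (1 + cost)`.

```python
import math


def dp_matching_cost(seq_a, seq_b):
    """Accumulated alignment cost between two sequences of 2-D vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def is_provisional_match(tracked, registered, threshold):
    """Return (matched, similarity) for one registered gesture."""
    similarity = 1.0 / (1.0 + dp_matching_cost(tracked, registered))
    return similarity >= threshold, similarity
```

Running `is_provisional_match` once per registered gesture corresponds to the per-gesture provisional determination described in the text.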

このように、第１の実施形態に係る動作判定装置１では、複数の追跡情報を時系列に蓄積した追跡情報群と予め登録された登録情報群との比較結果に基づいて登録ジェスチャが行われたか否かを仮判定する。すなわち、第１の実施形態に係る動作判定装置１では、複数フレーム分の追跡情報を一つのかたまり（追跡情報群）として、予め登録された登録情報群と比較することとしたため、一つの追跡情報から動作を判定する場合と比較して、登録ジェスチャが行われたか否かを精度よく仮判定することができる。 As described above, the motion determination device 1 according to the first embodiment tentatively determines whether or not the registered gesture has been performed based on the result of comparing the tracking information group, in which a plurality of pieces of tracking information are accumulated in time series, with the registered information group registered in advance. That is, because the motion determination device 1 according to the first embodiment compares the tracking information for a plurality of frames, treated as one unit (the tracking information group), with the registered information group registered in advance, it can tentatively determine whether or not the registered gesture has been performed more accurately than when the motion is determined from a single piece of tracking information.

静止判定部５２は、比較部５１によって登録ジェスチャが行われたと仮判定された後、対象物が所定フレーム静止しているか否かを判定する。 After the comparison unit 51 tentatively determines that the registered gesture has been performed, the stationary determination unit 52 determines whether or not the object remains stationary for a predetermined number of frames.

ここで、静止判定部５２による静止判定処理について図８を参照して説明する。図８は、静止判定処理の一例を示す図である。なお、図８には、図７に示す追跡情報群から４フレーム後の追跡情報群を示している。 Here, the stationary determination process performed by the stationary determination unit 52 will be described with reference to FIG. 8. FIG. 8 is a diagram showing an example of the stationary determination process. Note that FIG. 8 shows the tracking information group four frames after the tracking information group shown in FIG. 7.

一例として、静止判定部５２は、比較部５１によって登録ジェスチャが行われたと仮判定された場合に、その後に追跡部４から入力される追跡情報群を監視する。そして、図８に示すように、クラスターＣの移動量が閾値以下であることを示す追跡情報、たとえば、クラスターＣの位置が変化していないことを示す追跡情報Ｔ１０〜Ｔ１３が所定フレーム数（たとえば、４フレーム）連続した場合に、対象物が静止していると判定する。 As an example, when the comparison unit 51 tentatively determines that the registered gesture has been performed, the stationary determination unit 52 monitors the tracking information groups subsequently input from the tracking unit 4. Then, as shown in FIG. 8, when tracking information indicating that the amount of movement of the cluster C is equal to or less than a threshold, for example the tracking information T10 to T13 indicating that the position of the cluster C has not changed, continues for a predetermined number of frames (for example, four frames), the stationary determination unit 52 determines that the object is stationary.

そして、判定部５は、静止判定部５２によって対象物が静止していると判定された場合に、登録ジェスチャが行われたことを判定し、判定結果を外部へ出力する。 Then, when the rest determination unit 52 determines that the object is stationary, the determination unit 5 determines that the registration gesture has been performed, and outputs the determination result to the outside.

このように、判定部５は、追跡情報群と登録情報群との類似度が閾値以上であると判定した後、対象物が静止しているか否かを判定し、静止していると判定したならば、登録情報群に対応する動作が行われたと判定する。これにより、「登録ジェスチャを意図した動作」と、一連の動作の中にたまたま含まれる「登録ジェスチャに似た動作」とを切り分けることができるため、登録ジェスチャの誤判定を低減することができる。 In this way, the determination unit 5 determines whether or not the object is stationary after determining that the similarity between the tracking information group and the registered information group is equal to or higher than the threshold value, and determines that the object is stationary. If so, it is determined that the operation corresponding to the registration information group has been performed. As a result, it is possible to separate "an operation intended for a registered gesture" from "an operation similar to a registered gesture" that happens to be included in a series of operations, so that it is possible to reduce erroneous determination of the registered gesture.
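The two-stage decision (provisional match first, then a run of near-zero movement) can be sketched as follows. The per-frame movement amounts, the default threshold of 1.0, and the function names are assumptions; the four-frame default follows the example given in the text.

```python
def is_stationary(movements, move_threshold, required_frames):
    """True if movement stays at or below the threshold for enough
    consecutive frames.

    movements: per-frame movement amounts of the cluster observed after
    the provisional determination (hypothetical representation).
    """
    run = 0
    for amount in movements:
        # Any frame with larger movement resets the consecutive count.
        run = run + 1 if amount <= move_threshold else 0
        if run >= required_frames:
            return True
    return False


def final_decision(provisional_match, movements_after,
                   move_threshold=1.0, required_frames=4):
    """Confirm the gesture only if it matched and the object then stopped."""
    return provisional_match and is_stationary(
        movements_after, move_threshold, required_frames)
```

Requiring the pause after the match is what separates an intended gesture from a similar-looking fragment of a longer continuous movement.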

なお、ここでは、静止判定部５２が、追跡情報群に基づいて静止判定を行う場合の例について説明したが、静止判定の手法は、これに限定されない。たとえば、静止判定部５２は、比較部５１によって登録ジェスチャが行われたと仮判定された後、動体検出部３２によって動体が検出されない期間が所定フレーム数継続した場合に、対象物が静止していると判定してもよい。 Although an example in which the stationary determination unit 52 performs the stationary determination based on the tracking information group has been described here, the method of stationary determination is not limited to this. For example, the stationary determination unit 52 may determine that the object is stationary when, after the comparison unit 51 tentatively determines that the registered gesture has been performed, a period in which no moving object is detected by the moving object detection unit 32 continues for a predetermined number of frames.

なお、判定部５は、静止判定部５２による静止判定処理を必ずしも実行することを要しない。すなわち、判定部５は、比較部５１による仮判定結果を最終的な判定結果として外部装置へ出力するようにしてもよい。この場合、判定部５は、静止判定部５２を備えない構成であってもよい。 The determination unit 5 does not necessarily have to execute the stationary determination process by the stationary determination unit 52. That is, the determination unit 5 may output the provisional determination result by the comparison unit 51 to the external device as the final determination result. In this case, the determination unit 5 may be configured not to include the stationary determination unit 52.

（記憶部６について） (About storage unit 6)
記憶部６は、たとえば、ＲＡＭ、フラッシュメモリ等の半導体メモリ素子、または、ＨＤＤ（Hard Disk Drive）、光ディスク等の記憶装置であり、変換情報６１と、登録ジェスチャ情報６２とを記憶する。 The storage unit 6 is, for example, a semiconductor memory element such as a RAM or a flash memory, or a storage device such as an HDD (Hard Disk Drive) or an optical disk, and stores conversion information 61 and registered gesture information 62.

変換情報６１は、予め実験またはシミュレーションによって求めておいた距離情報と各パラメータとの関係を示す変換テーブルや変換マップ等の情報である。また、登録ジェスチャ情報６２は、登録ジェスチャに対応する人体の部位（手、足、頭など）、追跡領域Ｗの形状、ジェスチャ領域Ｚの形状および登録情報群等の情報を含む。 The conversion information 61 is information such as a conversion table or a conversion map showing the relationship between the distance information and each parameter obtained in advance by experiment or simulation. Further, the registered gesture information 62 includes information such as a part of the human body (hand, foot, head, etc.) corresponding to the registered gesture, the shape of the tracking area W, the shape of the gesture area Z, and the registered information group.
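A conversion table of this kind can be sketched as a small lookup keyed by distance. The table values below are invented placeholders, not data from the patent, and linear interpolation between entries is an assumption; the text only says the relationship is obtained in advance by experiment or simulation. The example parameter is a tracking-area height that grows as the person gets closer, consistent with the behavior described for the tracking area W.

```python
from bisect import bisect_left

# Hypothetical table: distance in metres -> tracking-area height in pixels.
TRACKING_HEIGHT_TABLE = [(1.0, 200.0), (2.0, 120.0), (4.0, 60.0)]


def lookup(table, distance):
    """Look up a parameter value for a distance.

    table must be sorted by distance. Values are clamped outside the
    table range and linearly interpolated inside it (an assumption).
    """
    distances = [d for d, _ in table]
    if distance <= distances[0]:
        return table[0][1]
    if distance >= distances[-1]:
        return table[-1][1]
    i = bisect_left(distances, distance)
    (d0, v0), (d1, v1) = table[i - 1], table[i]
    t = (distance - d0) / (d1 - d0)
    return v0 + t * (v1 - v0)
```

One such table per parameter (extraction threshold, maximum number of feature points, clustering range, and so on) would let each parameter changing unit map the same distance estimate to its own setting.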

〔２．動作判定装置の具体的動作〕 [2. Specific operation of the motion determination device]
次に、上述した動作判定装置１の具体的動作について図９を参照して説明する。図９は、パラメータ変更部２、抽出部３、追跡部４および判定部５が実行する処理の手順の一例を示すフローチャートである。 Next, the specific operation of the motion determination device 1 described above will be described with reference to FIG. 9. FIG. 9 is a flowchart showing an example of the processing procedure executed by the parameter changing unit 2, the extraction unit 3, the tracking unit 4, and the determination unit 5.

図９に示すように、パラメータ変更部２の距離推定部２１は、撮像装置１０から入力される撮像画像に基づいて距離情報を生成する（ステップＳ１０１）。つづいて、パラメータ変更部２の抽出用パラメータ変更部２２および追跡用パラメータ変更部２３は、抽出部３および追跡部４の処理に用いられる各種のパラメータを距離情報に応じて変更する（ステップＳ１０２）。 As shown in FIG. 9, the distance estimation unit 21 of the parameter changing unit 2 generates distance information based on the captured image input from the imaging device 10 (step S101). Subsequently, the extraction parameter changing unit 22 and the tracking parameter changing unit 23 of the parameter changing unit 2 change the various parameters used in the processing of the extraction unit 3 and the tracking unit 4 according to the distance information (step S102).

つづいて、抽出部３の領域設定部３１は、抽出用パラメータ変更部２２から入力される変更後のパラメータを用い、撮像装置１０から入力される撮像画像に対して処理対象領域Ｒ（図２Ｂ参照）を設定する（ステップＳ１０３）。 Subsequently, the area setting unit 31 of the extraction unit 3 uses the changed parameters input from the extraction parameter changing unit 22 to set the processing target area R (see FIG. 2B) for the captured image input from the imaging device 10 (step S103).

つづいて、抽出部３の動体検出部３２は、抽出用パラメータ変更部２２から入力される変更後のパラメータを用い、処理対象領域Ｒの中から動体を検出し（ステップＳ１０４）、抽出処理部３３は、動体検出部３２によって検出された動体から特徴点を抽出する（ステップＳ１０５）。 Subsequently, the moving object detection unit 32 of the extraction unit 3 uses the changed parameters input from the extraction parameter changing unit 22 to detect a moving object in the processing target area R (step S104), and the extraction processing unit 33 extracts feature points from the moving object detected by the moving object detection unit 32 (step S105).

つづいて、追跡部４の追跡情報生成部４１は、追跡用パラメータ変更部２３から入力される変更後のパラメータを用い、抽出処理部３３によって抽出された複数の特徴点をクラスタリングして（ステップＳ１０６）、２フレーム間におけるクラスターＣの追跡情報を生成する（ステップＳ１０７）。 Subsequently, the tracking information generation unit 41 of the tracking unit 4 uses the changed parameters input from the tracking parameter changing unit 23 to cluster the plurality of feature points extracted by the extraction processing unit 33 (step S106), and generates tracking information for the cluster C between two frames (step S107).

つづいて、追跡部４の蓄積部４２は、追跡情報生成部４１によって生成された追跡情報を時系列に蓄積した追跡情報群を生成する（ステップＳ１０８）。 Subsequently, the storage unit 42 of the tracking unit 4 generates a tracking information group in which the tracking information generated by the tracking information generation unit 41 is accumulated in chronological order (step S108).

つづいて、比較部５１は、追跡情報群と登録情報群との類似度を算出し、算出した類似度が閾値以上であるか否かを判定して（ステップＳ１０９）、閾値以上であると判定した場合には（ステップＳ１０９，Ｙｅｓ）、処理をステップＳ１１０へ進める。 Subsequently, the comparison unit 51 calculates the similarity between the tracking information group and the registered information group, determines whether or not the calculated similarity is equal to or greater than the threshold value (step S109), and determines that the similarity is equal to or greater than the threshold value. If so (step S109, Yes), the process proceeds to step S110.

ステップＳ１１０において、静止判定部５２は、対象物が所定フレーム数以上静止しているか否かを判定する。そして、静止判定部５２は、対象物が所定フレーム数以上静止していると判定した場合には（ステップＳ１１０，Ｙｅｓ）、判定結果を外部装置へ出力する（ステップＳ１１１）。 In step S110, the stationary determination unit 52 determines whether or not the object has been stationary for a predetermined number of frames or more. When the stationary determination unit 52 determines that the object has been stationary for the predetermined number of frames or more (step S110, Yes), it outputs the determination result to the external device (step S111).

ステップＳ１１１の処理を終えても、たとえば外部装置から終了指示を受け付けていない場合（ステップＳ１１２，Ｎｏ）、ステップＳ１０９において類似度が閾値以上でない場合（ステップＳ１０９，Ｎｏ）またはステップＳ１１０において対象物が所定フレーム数以上静止していない場合（ステップＳ１１０，Ｎｏ）、動作判定装置１は、処理をステップＳ１０１へ戻す。動作判定装置１は、ステップＳ１０１〜Ｓ１１１の処理をたとえば外部装置から終了指示を受け付けるまで繰り返す。ステップＳ１１１の処理を終えて、たとえば外部装置から終了指示を受け付けた場合（ステップＳ１１２，Ｙｅｓ）、動作判定装置１は、一連の処理を終了する。 If no end instruction has been received, for example from an external device, after the process of step S111 is finished (step S112, No), if the similarity is not equal to or greater than the threshold in step S109 (step S109, No), or if the object has not been stationary for the predetermined number of frames or more in step S110 (step S110, No), the motion determination device 1 returns the process to step S101. The motion determination device 1 repeats the processes of steps S101 to S111 until it receives an end instruction, for example from an external device. If an end instruction is received, for example from an external device, after the process of step S111 is finished (step S112, Yes), the motion determination device 1 ends the series of processes.
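The control flow of steps S101 to S112 can be summarized in the skeleton below. It only mirrors the branching of the flowchart as described; the `stages` dictionary of callables, its key names, and the `should_stop` callback are hypothetical placeholders for the actual units.

```python
def run_loop(frames, stages, should_stop):
    """Drive one pass of the flowchart per captured frame.

    stages      : dict of callables standing in for the processing units.
    should_stop : callable returning True once an end instruction arrived.
    Returns the list of determination results that were output.
    """
    results = []
    for frame in frames:
        distance = stages["estimate_distance"](frame)        # S101
        params = stages["update_params"](distance)           # S102
        region = stages["set_region"](frame, params)         # S103
        moving = stages["detect_moving"](region, params)     # S104
        points = stages["extract_points"](moving, params)    # S105
        clusters = stages["cluster"](points, params)         # S106
        info = stages["track"](clusters, params)             # S107
        group = stages["accumulate"](info)                   # S108
        if not stages["similar"](group):                     # S109, No
            continue
        if not stages["stationary"](group):                  # S110, No
            continue
        results.append(stages["output"](group))              # S111
        if should_stop():                                    # S112, Yes
            break
    return results
```

Note that, as in the described procedure, the end instruction is only checked after a result has been output in step S111.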

〔３．変形例〕 [3. Modification]
動作判定装置１は、対象物に対して複数の登録ジェスチャが対応付けられている場合に、追跡情報を用いて登録ジェスチャの絞り込みを行ってもよい。かかる点について図１０Ａおよび図１０Ｂを参照して説明する。図１０Ａおよび図１０Ｂは、変形例に係る絞り込み処理の一例を示す図である。 The motion determination device 1 may narrow down the registered gestures by using the tracking information when a plurality of registered gestures are associated with the object. This point will be described with reference to FIGS. 10A and 10B. FIGS. 10A and 10B are diagrams showing an example of the narrowing process according to the modification.

図１０Ａに示すように、たとえば、対象物「手」に対し、「手を上げて下げる動作」と「手を横に伸ばす動作」とが登録ジェスチャとして登録されているとする。上述したように、「手を上げて下げる動作」には上下方向に幅広い追跡領域Ｗ１が設定され、「手を横に伸ばす動作」には左右方向に幅広い追跡領域Ｗ２が設定される。 As shown in FIG. 10A, for example, it is assumed that "the action of raising and lowering the hand" and "the action of extending the hand sideways" are registered as registration gestures with respect to the object "hand". As described above, the "movement of raising and lowering the hand" is set with a wide tracking area W1 in the vertical direction, and the "movement of extending the hand sideways" is set with a wide tracking area W2 in the horizontal direction.

ここで、人物Ｈが、手を上げる動作を行ったとすると、上向きのベクトルをもった追跡情報が多く蓄積されることとなる。そこで、図１０Ｂに示すように、動作判定装置１は、複数の追跡情報または追跡情報群から人物Ｈの動作を予測することにより、登録ジェスチャを絞り込んでもよい。すなわち、上向きのベクトルをもった追跡情報が多い場合には、「手を上げて下げる動作」の登録ジェスチャおよび「手を横に伸ばす動作」の登録ジェスチャのうち、「手を横に伸ばす動作」の登録ジェスチャを判定対象から除外するようにしてもよい。 Here, if the person H raises a hand, a large amount of tracking information having an upward vector is accumulated. Therefore, as shown in FIG. 10B, the motion determination device 1 may narrow down the registered gestures by predicting the motion of the person H from a plurality of pieces of tracking information or from the tracking information group. That is, when much of the tracking information has an upward vector, the registered gesture for "the action of extending the hand sideways" may be excluded from the determination targets, leaving only the registered gesture for "the action of raising and lowering the hand".

このように、複数の登録ジェスチャの中から判定対象とする登録ジェスチャを絞り込むことで、処理負荷を抑えることができる。 In this way, the processing load can be suppressed by narrowing down the registered gestures to be determined from the plurality of registered gestures.
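The narrowing step can be sketched as a vote over the accumulated motion vectors. This is only an illustration under assumptions: each registered gesture is tagged with a hypothetical axis label ("vertical" or "horizontal"), tracking information is reduced to `(dx, dy)` vectors, and `min_ratio` is an invented cutoff for deciding that one direction dominates.

```python
def narrow_gestures(tracking_vectors, gesture_axes, min_ratio=0.6):
    """Return the names of gestures that remain determination targets.

    tracking_vectors : list of (dx, dy) motion vectors.
    gesture_axes     : {gesture_name: "vertical" or "horizontal"}.
    """
    total = len(tracking_vectors)
    if total == 0:
        return list(gesture_axes)  # nothing observed yet, keep everything
    vertical = sum(1 for dx, dy in tracking_vectors if abs(dy) > abs(dx))
    horizontal = sum(1 for dx, dy in tracking_vectors if abs(dx) > abs(dy))
    if vertical / total >= min_ratio:
        dominant = "vertical"
    elif horizontal / total >= min_ratio:
        dominant = "horizontal"
    else:
        return list(gesture_axes)  # no clear direction, keep everything
    # Drop gestures whose motion axis disagrees with the observed motion.
    return [name for name, axis in gesture_axes.items() if axis == dominant]
```

Only the gestures returned here would then go through the more expensive comparison against their registered information groups, which is where the reduction in processing load comes from.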

上述してきたように、第１の実施形態に係る動作判定装置１は、抽出部３と、追跡部４と、判定部５とを備える。抽出部３は、撮像画像から対象物の特徴点を抽出する。追跡部４は、時間的に前後する撮像画像からそれぞれ抽出される特徴点に基づき、対象物の移動方向を示す追跡情報を生成する。判定部５は、複数の追跡情報を時系列に蓄積した追跡情報群と対象物の動作に対応付けて予め登録された登録情報群との比較結果に基づき、当該動作が行われたか否かを判定する。 As described above, the motion determination device 1 according to the first embodiment includes the extraction unit 3, the tracking unit 4, and the determination unit 5. The extraction unit 3 extracts feature points of the object from the captured image. The tracking unit 4 generates tracking information indicating the moving direction of the object based on the feature points extracted from temporally successive captured images. The determination unit 5 determines whether or not a motion has been performed based on the result of comparing the tracking information group, in which a plurality of pieces of tracking information are accumulated in time series, with the registered information group registered in advance in association with that motion of the object.

よって、第１の実施形態に係る動作判定装置１によれば、一例としては、複数フレーム分の追跡情報を一つのかたまり（追跡情報群）として、予め登録された登録情報群と比較することにより、一つの追跡情報から動作を判定する場合と比較して、登録情報群に対応する動作が行われたか否かを精度よく判定することができる。 Therefore, according to the motion determination device 1 according to the first embodiment, as an example, comparing the tracking information for a plurality of frames, treated as one unit (the tracking information group), with the registered information group registered in advance makes it possible to determine whether or not the motion corresponding to the registered information group has been performed more accurately than when the motion is determined from a single piece of tracking information.

なお、上述した第１の実施形態では、動作判定装置１を用いて人物の動作を判定する場合の例について説明したが、動作判定装置１は、人物以外の動作の判定に用いてもよい。たとえば、動作判定装置１は、踏切やＥＴＣレーンに設置される遮断機の動作の判定に用いることができる。この場合、動作判定装置１は、遮断機が備える遮断桿を対象物として、遮断桿が降下する動作や上昇する動作が行われたか否かを判定する。その他、動作判定装置１は、犬猫などの動物やロボット等の動作の判定に用いることもできる。 In the first embodiment described above, an example in which the motion determination device 1 is used to determine the motion of a person has been described, but the motion determination device 1 may also be used to determine the motion of something other than a person. For example, the motion determination device 1 can be used to determine the motion of a barrier gate installed at a railroad crossing or in an ETC lane. In this case, the motion determination device 1 treats the barrier bar of the gate as the object and determines whether or not the barrier bar has been lowered or raised. In addition, the motion determination device 1 can also be used to determine the motion of animals such as dogs and cats, robots, and the like.

（第２の実施形態） (Second Embodiment)
次に、第２の実施形態について、図１１〜図１３Ｂを参照して説明する。なお、以下の説明では、既に説明した部分と同様の部分については、既に説明した部分と同一の符号を付し、重複する説明を省略する。同じ符号が付された複数の構成要素は、全ての機能及び性質が共通するとは限らず、各実施形態に応じた異なる機能及び性質を有していても良い。 Next, the second embodiment will be described with reference to FIGS. 11 to 13B. In the following description, parts that are the same as those already described are given the same reference numerals, and duplicate description is omitted. Components with the same reference numeral do not necessarily share all functions and properties, and may have different functions and properties depending on the embodiment.

まず、第２の実施形態に係る動作判定装置の構成について図１１および図１２を参照して説明する。図１１は、第２の実施形態に係る動作判定装置の構成の一例を示すブロック図である。また、図１２は、人物特定部の構成の一例を示すブロック図である。 First, the configuration of the operation determination device according to the second embodiment will be described with reference to FIGS. 11 and 12. FIG. 11 is a block diagram showing an example of the configuration of the operation determination device according to the second embodiment. Further, FIG. 12 is a block diagram showing an example of the configuration of the person identification unit.

図１１に示すように、第２の実施形態に係る動作判定装置１Ａは、撮像画像に含まれる人物の中から、動作判定の対象となる人物（以下、対象人物と記載する）を特定する人物特定部７をさらに備える。 As shown in FIG. 11, the motion determination device 1A according to the second embodiment further includes a person identification unit 7 that identifies, from among the persons included in the captured image, the person whose motion is to be determined (hereinafter referred to as the target person).

人物特定部７は、図１２に示すように、一例として、人物検出部７１と、履歴生成部７２と、特定処理部７３とを備える。 As shown in FIG. 12, the person identification unit 7 includes, as an example, a person detection unit 71, a history generation unit 72, and an identification processing unit 73.

人物検出部７１は、撮像画像に含まれる人物の検出および追跡を行う。人物を検出および追跡する手法は、たとえばパターン認識によるものなど何れの従来技術を用いても構わない。 The person detection unit 71 detects and tracks a person included in the captured image. As a method for detecting and tracking a person, any conventional technique such as a pattern recognition method may be used.

なお、人物検出部７１は、顔や手といった人物の一部を検出するのではなく、人物全体を検出するものとする。また、人物検出部７１は、撮像装置１０から撮像画像が入力されるごとに、人物を検出および追跡する処理を行う。 The person detection unit 71 does not detect a part of a person such as a face or a hand, but detects the entire person. In addition, the person detection unit 71 performs a process of detecting and tracking a person each time a captured image is input from the image pickup device 10.

履歴生成部７２は、人物検出部７１によって検出された人物の行動履歴を生成する。たとえば、履歴生成部７２は、各撮像画像から、人物検出部７１によって検出された人物の体の向き、撮像画像における位置ならびに大きさ、視線等の情報を抽出する。また、履歴生成部７２は、時間的に前後する複数の撮像画像から、人物検出部７１によって検出された人物の移動方向、動いている状態か静止している状態かの別等の情報を抽出する。 The history generation unit 72 generates an action history of the person detected by the person detection unit 71. For example, the history generation unit 72 extracts, from each captured image, information such as the body orientation of the person detected by the person detection unit 71, the person's position and size in the captured image, and the line of sight. In addition, the history generation unit 72 extracts, from a plurality of temporally successive captured images, information such as the moving direction of the person detected by the person detection unit 71 and whether the person is moving or stationary.

そして、履歴生成部７２は、抽出したこれらの情報を含む行動履歴６３を生成して記憶部６Ａに記憶させる。これにより、記憶部６Ａには、人物ごとの行動履歴６３が蓄積される。なお、人物検出部７１は、必ずしも上述した全ての情報を抽出することを要しない。 Then, the history generation unit 72 generates an action history 63 including the extracted information and stores it in the storage unit 6A. As a result, the action history 63 for each person is accumulated in the storage unit 6A. The person detection unit 71 does not necessarily have to extract all the above-mentioned information.

特定処理部７３は、記憶部６Ａに記憶された行動履歴６３と行動パターン登録情報６４とを比較し、これらの類似度に基づいて対象人物を特定する。行動パターン登録情報６４は、これから登録ジェスチャを行おうとする人物が登録ジェスチャを行う前に取ると予想される行動パターンに関する情報であり、予め記憶部６Ａに登録される。 The identification processing unit 73 compares the action history 63 stored in the storage unit 6A with the behavior pattern registration information 64, and identifies the target person based on their degree of similarity. The behavior pattern registration information 64 is information on the behavior pattern that a person who is about to perform a registered gesture is expected to exhibit before performing it, and is registered in the storage unit 6A in advance.

ここで、人物特定部７による人物特定処理の一例について図１３Ａおよび図１３Ｂを参照して説明する。図１３Ａおよび図１３Ｂは、人物特定処理の一例を示す図である。 Here, an example of the person identification process by the person identification unit 7 will be described with reference to FIGS. 13A and 13B. 13A and 13B are diagrams showing an example of the person identification process.

図１３Ａに示すように、撮像装置１０によって撮像された撮像画像Ｘ５に複数の人物Ｈ１〜Ｈ３が写り込んでいるとする。この場合、人物特定部７は、撮像画像Ｘ５から人物Ｈ１〜Ｈ３を検出し、各人物Ｈ１〜Ｈ３の行動履歴６３を生成して記憶部６Ａに記憶させる。 As shown in FIG. 13A, it is assumed that a plurality of persons H1 to H3 are reflected in the captured image X5 captured by the imaging device 10. In this case, the person identification unit 7 detects the persons H1 to H3 from the captured image X5, generates the action history 63 of each person H1 to H3, and stores it in the storage unit 6A.

つづいて、人物特定部７は、人物Ｈ１〜Ｈ３ごとに、行動履歴６３と行動パターン登録情報６４との比較を行い、これらの類似度が閾値を超える人物を対象人物として特定する。 Subsequently, the person identification unit 7 compares the action history 63 with the action pattern registration information 64 for each person H1 to H3, and identifies a person whose similarity exceeds the threshold value as the target person.

たとえば、これから登録ジェスチャを行おうとする人物は、撮像装置１０に対して正面を向いている可能性が高い。そこで、人物特定部７は、体が所定時間正面を向いている人物を対象人物として特定してもよい。 For example, a person who is about to perform a registration gesture is likely to face the image pickup device 10. Therefore, the person identification unit 7 may specify a person whose body is facing the front for a predetermined time as a target person.

この場合、行動パターン登録情報６４には、「体が所定時間正面を向いていること」の項目が含まれる。これにより、撮像画像に含まれる人物Ｈ１〜Ｈ３のうち、体が正面を向いている人物Ｈ１を対象人物として特定することができる。 In this case, the behavior pattern registration information 64 includes the item "the body is facing the front for a predetermined time". Thereby, among the persons H1 to H3 included in the captured image, the person H1 whose body is facing the front can be specified as the target person.
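
The "body facing the front for a predetermined time" item can be sketched as a check over each person's recorded body orientation. Everything concrete here is an assumption for illustration: the yaw-angle representation (0 degrees meaning the body faces the imaging device), the 20-degree tolerance, and counting the predetermined time in frames are not given in the specification.

```python
from typing import Dict, List, Sequence


def trailing_frontal_frames(yaw_history_deg: Sequence[float],
                            max_yaw_deg: float = 20.0) -> int:
    """Count how many of the most recent frames show the body facing the camera."""
    count = 0
    for yaw in reversed(yaw_history_deg):
        if abs(yaw) > max_yaw_deg:
            break  # the frontal run ended here when looking back in time
        count += 1
    return count


def front_facing_people(histories: Dict[str, Sequence[float]],
                        min_frames: int,
                        max_yaw_deg: float = 20.0) -> List[str]:
    """Return the people who have been facing front for at least min_frames frames."""
    return [person for person, history in histories.items()
            if trailing_frontal_frames(history, max_yaw_deg) >= min_frames]


# Hypothetical per-frame body yaw (degrees) taken from each person's action history.
histories = {
    "H1": [40.0, 5.0, 3.0, -4.0, 2.0],     # turned to the camera and stayed
    "H2": [80.0, 85.0, 90.0, 88.0, 87.0],  # walking past sideways
    "H3": [5.0, 3.0, 2.0, 1.0, 60.0],      # was facing front, then turned away
}
targets = front_facing_people(histories, min_frames=3)
```

In this example only H1 is selected, mirroring the case in FIG. 13A where the person H1 facing the front is identified as the target person.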

人物Ｈ１が対象人物として特定された場合、人物特定部７以降の各処理部によって実行される処理は、人物Ｈ１についてのみ行われる。具体的には、パラメータ変更部２は、撮像画像の撮像位置から人物Ｈ１までの距離を推定し、推定した距離に応じて抽出部３および追跡部４の処理に用いられる各種のパラメータを変更する。また、抽出部３は、人物Ｈ１の周囲に処理対象領域Ｒ１を設定し、設定した処理対象領域Ｒ１において特徴点の抽出を行う（図１３Ｂ参照）。 When the person H1 is identified as the target person, the processing executed by each processing unit downstream of the person identification unit 7 is performed only for the person H1. Specifically, the parameter changing unit 2 estimates the distance from the imaging position of the captured image to the person H1, and changes the various parameters used in the processing of the extraction unit 3 and the tracking unit 4 according to the estimated distance. Further, the extraction unit 3 sets a processing target area R1 around the person H1 and extracts feature points within the set processing target area R1 (see FIG. 13B).

一方、人物特定部７以降の各処理部によって実行される処理は、対象人物として特定されなかった人物Ｈ２，Ｈ３については実行されない。したがって、撮像画像に複数の人物が含まれる場合における処理負荷の増加を抑えることができる。また、対象人物以外の人物Ｈ２，Ｈ３が登録ジェスチャに似た動作を行ったとしても、登録ジェスチャが行われたと判定されることがないため、判定精度の低下を防止することが可能である。 On the other hand, the processing executed by each processing unit after the person identification unit 7 is not executed for the persons H2 and H3 not specified as the target person. Therefore, it is possible to suppress an increase in the processing load when a plurality of people are included in the captured image. Further, even if the persons H2 and H3 other than the target person perform an operation similar to the registered gesture, it is not determined that the registered gesture has been performed, so that it is possible to prevent the determination accuracy from being lowered.

ところで、行動パターン登録情報６４には、体の向き以外の項目を含めることも可能である。たとえば、これから登録ジェスチャを行おうとする人物は、撮像装置１０に対して正対している、言い換えれば、撮像画像の中央に写り込んでいる可能性が高い。そこで、行動パターン登録情報６４には、撮像画像における人物の位置に関する項目を含めてもよい。たとえば、撮像画像の中央に近い人物ほど類似度が高くなるようにしてもよい。これにより、これから登録ジェスチャを行おうとする人物をさらに精度良く特定することができる。 By the way, the behavior pattern registration information 64 can include items other than the body orientation. For example, it is highly possible that the person who is about to perform the registration gesture is facing the image pickup device 10, in other words, is reflected in the center of the captured image. Therefore, the behavior pattern registration information 64 may include an item relating to the position of a person in the captured image. For example, a person closer to the center of the captured image may have a higher degree of similarity. This makes it possible to more accurately identify the person who is going to perform the registration gesture.

また、これから登録ジェスチャを行おうとする人物は、撮像装置１０に向かって移動してくる可能性が高い。そこで、行動パターン登録情報６４には、人物の移動方向に関する項目を含めてもよい。たとえば、移動方向が撮像装置１０を向いている人物ほど類似度が高くなるようにしてもよい。これにより、たとえば、撮像装置１０の前を横切る通行人や撮像装置１０から遠ざかる人物を対象人物から除外し易くなるため、これから登録ジェスチャを行おうとする人物をさらに精度良く特定することができる。 Further, the person who is going to perform the registration gesture is likely to move toward the image pickup apparatus 10. Therefore, the action pattern registration information 64 may include an item relating to the movement direction of the person. For example, the degree of similarity may be higher for a person whose moving direction is facing the imaging device 10. As a result, for example, a passerby who crosses in front of the image pickup device 10 and a person who moves away from the image pickup device 10 can be easily excluded from the target person, so that the person who is going to perform the registration gesture from now on can be identified more accurately.

また、これから登録ジェスチャを行おうとする人物は、視線が撮像装置１０の方を向いている可能性が高い。そこで、行動パターン登録情報６４には、視線に関する項目を含めてもよい。たとえば、視線が撮像装置１０を向いている人物ほど類似度が高くなるようにしてもよい。これにより、これから登録ジェスチャを行おうとする人物をさらに精度良く特定することができる。 Further, it is highly possible that the person who is going to perform the registration gesture is looking toward the image pickup apparatus 10. Therefore, the behavior pattern registration information 64 may include an item related to the line of sight. For example, the degree of similarity may be higher for a person whose line of sight is facing the imaging device 10. This makes it possible to more accurately identify the person who is going to perform the registration gesture.

また、これから登録ジェスチャを行おうとする人物は、撮像装置１０に比較的近い位置に存在している可能性が高い。そこで、行動パターン登録情報６４には、人物の大きさに関する項目を含めてもよい。たとえば、撮像画像に大きく写り込んでいる人物ほど類似度が高くなるようにしてもよい。これにより、撮像装置１０から遠く離れた場所にいる通行人等を対象人物から除外し易くなるため、これから登録ジェスチャを行おうとする人物をさらに精度良く特定することができる。 Further, it is highly possible that the person who is going to perform the registration gesture is present at a position relatively close to the image pickup apparatus 10. Therefore, the behavior pattern registration information 64 may include an item relating to the size of the person. For example, the degree of similarity may be higher for a person who appears larger in the captured image. As a result, it becomes easy to exclude a passerby or the like who is far away from the image pickup apparatus 10 from the target person, so that the person who is going to perform the registration gesture from now on can be identified more accurately.

人物特定部７は、上述したこれらの項目ごとの類似度をそれぞれ算出するとともに算出した類似度を点数化し、それらの合計点が閾値を超える人物を対象人物として特定するようにしてもよい。また、類似度を点数化する際には、項目ごとに重み付けを行ってもよい。 The person identification unit 7 may calculate the similarity degree for each of these items described above and score the calculated similarity degree, and specify a person whose total score exceeds the threshold value as the target person. Further, when scoring the similarity, weighting may be performed for each item.
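
The scoring described above can be sketched as a weighted sum of per-item similarities compared against a threshold. The item names, the assumption that each similarity lies in the range 0 to 1, the weight values, and the threshold are all hypothetical and chosen only to make the example concrete.

```python
from typing import Dict, List


def total_score(similarities: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-item similarities; items missing from the history score 0."""
    return sum(weight * similarities.get(item, 0.0)
               for item, weight in weights.items())


def identify_targets(people: Dict[str, Dict[str, float]],
                     weights: Dict[str, float],
                     threshold: float) -> List[str]:
    """Return every person whose total score exceeds the threshold."""
    return [person for person, sims in people.items()
            if total_score(sims, weights) > threshold]


# Hypothetical weights: body orientation counts twice as much as the other items.
weights = {"orientation": 2.0, "position": 1.0, "movement": 1.0,
           "gaze": 1.0, "size": 1.0}

# Hypothetical per-item similarities (0 = no match, 1 = full match) per person.
people = {
    "H1": {"orientation": 0.9, "position": 0.9, "movement": 0.9,
           "gaze": 0.9, "size": 0.9},
    "H2": {"orientation": 0.1, "position": 0.5, "movement": 0.2,
           "gaze": 0.1, "size": 0.6},
}
targets = identify_targets(people, weights, threshold=4.0)
```

Because the history generation unit need not extract every item, the sketch treats a missing item as a similarity of 0; another reasonable choice would be to renormalize the weights over the items that are present.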

また、人物特定部７は、たとえば、顔認証や歩容認証といった個人認証をさらに用いて対象人物の特定を行ってもよい。顔認証とは、顔の特徴から個人を特定する手法であり、歩容認証とは、歩き方から個人を特定する手法である。 In addition, the person identification unit 7 may identify the target person by additionally using personal authentication such as face authentication or gait authentication. Face authentication is a method of identifying an individual from facial features, and gait authentication is a method of identifying an individual from the way the individual walks.

たとえば、人物特定部７は、行動履歴６３と行動パターン登録情報６４との比較に基づいて特定した人物について、予め登録しておいた顔情報や歩容情報を用いた個人認証を行い、個人が認証された場合に、その人物を対象人物として特定するようにしてもよい。 For example, the person identification unit 7 may perform personal authentication, using face information or gait information registered in advance, on the person identified based on the comparison between the action history 63 and the behavior pattern registration information 64, and may identify that person as the target person when the individual is authenticated.

このように、予め登録しておいた人物のみを対象人物として特定するようにすることで、精度の向上とともにセキュリティ面の向上を図ることができる。また、行動履歴６３と行動パターン登録情報６４との比較に基づいて特定された人物（たとえば人物Ｈ１）についてのみ個人認証を行うようにすることで、撮像画像に含まれる全ての人物（たとえば人物Ｈ１〜Ｈ３）について個人認証を行う場合に比べ、個人認証による処理負荷の増加を抑制することができる。 In this way, by identifying only a person registered in advance as the target person, both accuracy and security can be improved. Further, by performing personal authentication only on the person identified based on the comparison between the action history 63 and the behavior pattern registration information 64 (for example, the person H1), the increase in processing load due to personal authentication can be suppressed compared with performing personal authentication on all persons included in the captured image (for example, the persons H1 to H3).
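
The two-stage flow described above, in which the behavior-based comparison runs first and personal authentication runs only on the people it selects, can be sketched as follows. The `authenticate` callback stands in for face or gait authentication and is purely hypothetical, as are the scores and the threshold.

```python
from typing import Callable, Dict, List


def identify_registered_targets(behavior_scores: Dict[str, float],
                                threshold: float,
                                authenticate: Callable[[str], bool]) -> List[str]:
    """Stage 1: filter by behavior score. Stage 2: authenticate only the survivors."""
    candidates = [person for person, score in behavior_scores.items()
                  if score > threshold]
    # The (potentially expensive) authentication runs only for the candidates.
    return [person for person in candidates if authenticate(person)]


# Record which people the stand-in authenticator was actually asked about.
authenticated_calls: List[str] = []
registered_people = {"H1"}  # hypothetical set of people enrolled in advance


def fake_authenticate(person: str) -> bool:
    authenticated_calls.append(person)
    return person in registered_people


behavior_scores = {"H1": 5.4, "H2": 1.6, "H3": 2.0}
targets = identify_registered_targets(behavior_scores, 4.0, fake_authenticate)
```

Here the authenticator is invoked once, for H1 only, rather than three times, which is the load reduction the passage describes.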

（第３の実施形態） (Third Embodiment)
第３の実施形態では、上述した第２の実施形態に係る動作判定装置１Ａを車両の周辺を監視する周辺監視装置として用いる場合の例について図１４および図１５を参照して説明する。図１４は、動作判定装置１Ａを搭載する車両の一例を示す平面図である。また、図１５は、車両後方に存在する複数の人物の中から対象人物を特定する様子を示す図である。 In the third embodiment, an example in which the motion determination device 1A according to the second embodiment described above is used as a periphery monitoring device that monitors the surroundings of a vehicle will be described with reference to FIGS. 14 and 15. FIG. 14 is a plan view showing an example of a vehicle equipped with the motion determination device 1A. FIG. 15 is a diagram showing how a target person is identified from among a plurality of persons behind the vehicle.

図１４に示すように、動作判定装置１Ａを搭載する車両１００は、たとえば、不図示の内燃機関を駆動源とする自動車、すなわち内燃機関自動車であってもよいし、不図示の電動機を駆動源とする自動車、すなわち電気自動車や燃料電池自動車等であってもよい。また、それらの双方を駆動源とするハイブリッド自動車であってもよいし、他の駆動源を備えた自動車であってもよい。また、車両１００は、種々の変速装置を搭載することができるし、内燃機関や電動機を駆動するのに必要な種々の装置、たとえばシステムや部品等を搭載することができる。また、車両１００における車輪の駆動に関わる装置の方式や、数、レイアウト等は、種々に設定することができる。 As shown in FIG. 14, the vehicle 100 equipped with the motion determination device 1A may be, for example, an automobile that uses an internal combustion engine (not shown) as its drive source, that is, an internal combustion engine automobile, or an automobile that uses an electric motor (not shown) as its drive source, that is, an electric vehicle, a fuel cell vehicle, or the like. It may also be a hybrid vehicle that uses both as drive sources, or an automobile with another drive source. The vehicle 100 can be equipped with various transmissions, and with the various devices needed to drive the internal combustion engine or the electric motor, such as systems and components. In addition, the type, number, layout, and the like of the devices involved in driving the wheels of the vehicle 100 can be set in various ways.

車体２００には、複数の撮像装置１０として、たとえば四つの撮像装置１０ａ〜１０ｄが設けられる。撮像装置１０ａ〜１０ｄは、それぞれ、広角レンズまたは魚眼レンズを有し、水平方向にはたとえば１４０°〜１９０°の範囲を撮影することができる。また、撮像装置１０ａ〜１０ｄの光軸は斜め下方に向けて設定されている。よって、撮像装置１０ａ〜１０ｄは、車両１００が移動可能な路面や車両１００が駐車可能な領域を含む車両１００の周辺の外部の環境を逐次撮像可能である。 The vehicle body 200 is provided with, for example, four imaging devices 10a to 10d as a plurality of imaging devices 10. The imaging devices 10a to 10d have a wide-angle lens or a fisheye lens, respectively, and can capture a range of, for example, 140 ° to 190 ° in the horizontal direction. Further, the optical axes of the imaging devices 10a to 10d are set obliquely downward. Therefore, the imaging devices 10a to 10d can sequentially image the external environment around the vehicle 100 including the road surface on which the vehicle 100 can move and the area where the vehicle 100 can park.

撮像装置１０ａは、たとえば、車体２００の後側の端部に配置される。撮像装置１０ｂは、たとえば、車体２００の右側のドアミラー２０１に設けられる。撮像装置１０ｃは、たとえば、車体２００の前側、すなわち車両前後方向の前方側の端部に配置される。撮像装置１０ｄは、たとえば、車体２００の左側のドアミラー２０２に設けられる。 The image pickup apparatus 10a is arranged, for example, at the rear end of the vehicle body 200. The image pickup device 10b is provided, for example, on the door mirror 201 on the right side of the vehicle body 200. The image pickup apparatus 10c is arranged, for example, at the front side of the vehicle body 200, that is, at the front end portion in the front-rear direction of the vehicle. The image pickup device 10d is provided, for example, on the door mirror 202 on the left side of the vehicle body 200.

第３の実施形態において、動作判定装置１Ａは、一例として、車体２００の後方に設けられた撮像装置１０ａから入力される撮像画像に含まれる人物の中から対象人物を特定する。 In the third embodiment, the motion determination device 1A identifies a target person from among the persons included in the captured image input from the image pickup device 10a provided behind the vehicle body 200, as an example.

たとえば、図１５に示すように、撮像装置１０ａの撮像範囲内に人物Ｈ５〜Ｈ８が存在する場合、すなわち、撮像装置１０ａの撮像画像に人物Ｈ５〜Ｈ８が写り込んでいる場合、動作判定装置１Ａは、たとえば、撮像装置１０ａに向かって接近している人物Ｈ６を対象人物として特定し、人物Ｈ６について一連の動作判定処理を行う。動作判定装置１Ａから出力される判定結果は、たとえば、車体２００のリアトランクのドアを自動的に開ける処理等に用いられる。 For example, as shown in FIG. 15, when persons H5 to H8 are present within the imaging range of the imaging device 10a, that is, when the persons H5 to H8 appear in the image captured by the imaging device 10a, the motion determination device 1A identifies, for example, the person H6 who is approaching the imaging device 10a as the target person, and performs a series of motion determination processes on the person H6. The determination result output from the motion determination device 1A is used, for example, in a process of automatically opening the rear trunk door of the vehicle body 200.

なお、ここでは、第２の実施形態に係る動作判定装置１Ａを周辺監視装置として用いる場合の例について説明したが、第１の実施形態に係る動作判定装置１を周辺監視装置として用いることも可能である。 Here, an example in which the motion determination device 1A according to the second embodiment is used as the periphery monitoring device has been described, but the motion determination device 1 according to the first embodiment can also be used as the periphery monitoring device.

以上、本発明の実施形態を例示したが、上記実施形態および変形例はあくまで一例であって、発明の範囲を限定することは意図していない。上記実施形態や変形例は、その他の様々な形態で実施されることが可能であり、発明の要旨を逸脱しない範囲で、種々の省略、置き換え、組み合わせ、変更を行うことができる。また、各実施形態や各変形例の構成や形状は、部分的に入れ替えて実施することも可能である。 Although the embodiments of the present invention have been illustrated above, the above-described embodiments and modifications are merely examples, and the scope of the invention is not intended to be limited. The above-described embodiment and modification can be implemented in various other forms, and various omissions, replacements, combinations, and changes can be made without departing from the gist of the invention. Further, the configuration and shape of each embodiment and each modification can be partially replaced.

１，１Ａ…動作判定装置、２…パラメータ変更部、３…抽出部、４…追跡部、５…判定部、６，６Ａ…記憶部、１０…撮像装置、２１…距離推定部、２２…抽出用パラメータ変更部、２３…追跡用パラメータ変更部、３１…領域設定部、３２…動体検出部、３３…抽出処理部、４１…追跡情報生成部、４２…蓄積部、５１…比較部、５２…静止判定部、６１…変換情報、６２…登録ジェスチャ情報、Ｃ…クラスター、Ｆ…特徴点、Ｈ…人物、Ｒ…処理対象領域、Ｗ…追跡領域、Ｚ…ジェスチャ領域。 1,1A ... Motion determination device, 2 ... Parameter change unit, 3 ... Extraction unit, 4 ... Tracking unit, 5 ... Judgment unit, 6,6A ... Storage unit, 10 ... Imaging device, 21 ... Distance estimation unit, 22 ... Extraction Parameter change unit, 23 ... Tracking parameter change unit, 31 ... Area setting unit, 32 ... Moving object detection unit, 33 ... Extraction processing unit, 41 ... Tracking information generation unit, 42 ... Accumulation unit, 51 ... Comparison unit, 52 ... Static determination unit, 61 ... conversion information, 62 ... registered gesture information, C ... cluster, F ... feature point, H ... person, R ... processing target area, W ... tracking area, Z ... gesture area.

Claims

撮像画像から対象物の特徴点を抽出する抽出部と、
時間的に前後する前記撮像画像からそれぞれ抽出される前記特徴点に基づき、前記対象物の移動方向を示す追跡情報を生成する追跡部と、
複数の前記追跡情報を時系列に蓄積した追跡情報群と前記対象物の動作に対応付けて予め登録された登録情報群との比較結果に基づき、当該動作が行われたか否かを判定する判定部と、
を備え、
前記追跡部は、
時間的に前後する前記撮像画像のうち後の撮像画像から抽出された前記特徴点の位置に基づいて追跡領域を設定し、設定した前記追跡領域に存在する、時間的に前後する前記撮像画像のうち前の撮像画像から抽出された前記特徴点と、前記後の撮像画像から抽出された前記特徴点とを対応付けることによって前記追跡情報を生成する、動作判定装置。
A motion determination device comprising:
an extraction unit that extracts feature points of an object from a captured image;
a tracking unit that generates tracking information indicating a moving direction of the object based on the feature points respectively extracted from temporally successive captured images; and
a determination unit that determines whether or not a motion has been performed based on a result of comparing a tracking information group, in which a plurality of pieces of the tracking information are accumulated in time series, with a registered information group registered in advance in association with the motion of the object,
wherein the tracking unit sets a tracking area based on the position of the feature point extracted from the later of the temporally successive captured images, and generates the tracking information by associating the feature point that is extracted from the earlier of the temporally successive captured images and is present in the set tracking area with the feature point extracted from the later captured image.

前記判定部は、
前記追跡情報群と前記登録情報群との類似度が閾値以上であると判定した後、前記対象物が静止しているか否かを判定し、静止していると判定したならば、前記登録情報群に対応する動作が行われたと判定する、
請求項１に記載の動作判定装置。
The motion determination device according to claim 1, wherein the determination unit, after determining that the similarity between the tracking information group and the registered information group is equal to or higher than a threshold value, determines whether or not the object is stationary, and, if the object is determined to be stationary, determines that the motion corresponding to the registered information group has been performed.

前記追跡部は、
前記撮像画像における前記対象物の周囲に、前記登録情報群に対応する動作に応じた方向に幅広い追跡領域を設定し、設定した前記追跡領域に含まれる前記特徴点に基づいて前記追跡情報を生成する、
請求項１または２に記載の動作判定装置。
The motion determination device according to claim 1 or 2, wherein the tracking unit sets, around the object in the captured image, a tracking area that is wide in a direction corresponding to the motion associated with the registered information group, and generates the tracking information based on the feature points included in the set tracking area.

前記抽出部による処理に用いられるパラメータを前記撮像画像の撮像位置から前記対象物までの距離情報に応じて変更する抽出用パラメータ変更部
を備える請求項１〜３のいずれか一つに記載の動作判定装置。 The motion determination device according to any one of claims 1 to 3, further comprising an extraction parameter changing unit that changes a parameter used for processing by the extraction unit according to distance information from the imaging position of the captured image to the object.

前記追跡部による処理に用いられるパラメータを前記撮像画像の撮像位置から前記対象物までの距離情報に応じて変更する追跡用パラメータ変更部
を備える請求項１〜４のいずれか一つに記載の動作判定装置。 The motion determination device according to any one of claims 1 to 4, further comprising a tracking parameter changing unit that changes a parameter used for processing by the tracking unit according to distance information from the imaging position of the captured image to the object.

前記撮像画像に含まれる人物の行動履歴に基づいて対象人物を特定する人物特定部
を備え、
前記抽出部は、
前記人物特定部によって特定された対象人物から前記対象物の特徴点を抽出する、
請求項１〜５のいずれか一つに記載の動作判定装置。
The motion determination device according to any one of claims 1 to 5, further comprising a person identification unit that identifies a target person based on an action history of a person included in the captured image, wherein the extraction unit extracts the feature points of the object from the target person identified by the person identification unit.

前記人物特定部は、
前記撮像画像に基づいて前記行動履歴を生成し、生成した前記行動履歴と予め登録された行動パターン登録情報との類似度に基づいて前記対象人物を特定する、
請求項６に記載の動作判定装置。
The motion determination device according to claim 6, wherein the person identification unit generates the action history based on the captured image, and identifies the target person based on the degree of similarity between the generated action history and behavior pattern registration information registered in advance.