JP6579657B2

JP6579657B2 - Judgment device, program, and remote communication support device

Info

Publication number: JP6579657B2
Application number: JP2015246703A
Authority: JP
Inventors: 建鋒徐; 茂之酒澤
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2015-12-17
Filing date: 2015-12-17
Publication date: 2019-09-25
Anticipated expiration: 2035-12-17
Also published as: JP2017111685A

Description

The present invention relates to a determination device, a program, and a remote communication support device that can determine the movement of a moving object in a video, such as a pedestrian, even when the object is moving straight toward the camera viewpoint of the video.

Many conventional techniques exist for detecting and tracking pedestrians and other objects in camera images and video. For example, Non-Patent Document 2 discloses a technique for detecting people using HoG features. Non-Patent Document 1 discloses a traffic monitoring technique for highways, the vicinity of intersections, and the like, in which all motion is tracked in the video of an installed camera and the attributes of each object (such as truck, passenger car, or motorcycle) are estimated from the tracking result.

Non-Patent Document 1: Kilger, M. "Video-based traffic monitoring." Image Processing and its Applications, 1992. International Conference on. IET, 1992.
Non-Patent Document 2: Dalal, N.; Triggs, B., "Histograms of oriented gradients for human detection," in Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol. 1, pp. 886-893, 25-25 June 2005.
Non-Patent Document 3: Kalman, R. E. (1960). "A New Approach to Linear Filtering and Prediction Problems". Journal of Basic Engineering 82: 35. doi:10.1115/1.3662552
Non-Patent Document 4: Zivkovic, Z., "Improved adaptive Gaussian mixture model for background subtraction," Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on, vol. 2, pp. 28-31, 23-26 Aug. 2004.
Non-Patent Document 5: Manuela Zuger and Thomas Fritz. 2015. Interruptibility of Software Developers and its Prediction Using Psycho-Physiological Sensors. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI '15). ACM, New York, NY, USA, 2981-2990.

However, the prior art could not appropriately determine whether a pedestrian in an office is walking.

FIG. 1 schematically illustrates an example of a pedestrian in an office and the camera capturing that pedestrian, a case the prior art could not judge appropriately. In FIG. 1, (part of) an office space O1 contains mutually perpendicular walls W1 and W2, a horizontal floor F1, and a passage P1. A person H1 who works in the office moves along the passage P1 for work-related reasons such as going to a meeting, either toward the door D1 in the wall W2 or in the opposite direction, as indicated by the double-headed arrow A1. A camera C1 mounted on the wall W2 above the door D1 is assumed to shoot so that it faces the movement direction of the person H1, indicated by the arrow A1, head-on. With the camera C1 arranged in this way, the prior art could not always determine the walking of the person H1 appropriately.

FIG. 2 shows example camera images that schematically illustrate why the prior art cannot appropriately judge a pedestrian walking straight toward a camera placed in an office as in FIG. 1. Part [1] of FIG. 2 shows the images P(t1), P(t2), and P(t3) captured in the camera video as time advances through t1, t2, and t3; they show a pedestrian in the office approaching the camera. Part [2] of FIG. 2 illustrates why it is difficult for the prior art to determine walking from such images: it shows the pedestrian detected as regions R(t1), R(t2), and R(t3) in the images P(t1), P(t2), and P(t3), respectively.

Even if the prior art can detect the regions R(t1), R(t2), and R(t3) shown in this example, it has difficulty determining from these regions that the pedestrian is walking. The reasons include the following.

First, between times t1 and t2 the region changes from R(t1) to R(t2), and although the size of the region changes, the region itself moves very little. In particular, comparing R(t1) with R(t2) shows that the lower edge (corresponding to the bottom of the pedestrian's feet) has moved, while the upper edge (corresponding to the top of the pedestrian's head) has hardly moved. With the prior art it is difficult to judge a pedestrian from such regions.

The same tendency becomes more pronounced between times t2 and t3, after the pedestrian has come closer to the camera, because the pedestrian is then so close that the whole body can no longer be captured as a region in the image. For example, region R(t2) captures only the part of the pedestrian above the knees, and region R(t3) captures only the upper body.

When the pedestrian is this close to the camera, as between t2 and t3, the whole body does not fit in the image, so the regions R(t2) to R(t3) change in size but show no movement. With the prior art it is difficult to judge a pedestrian from such regions.

As described above, for video from an office camera in which a pedestrian walks straight toward the camera, the prior art has difficulty judging the pedestrian both when the pedestrian is still some distance from the camera, as between t1 and t2, and when the pedestrian is very close to it, as between t2 and t3.

Moreover, as becomes clear by considering the times t1, t2, and t3 advancing in the reverse order to that described for FIG. 2, the prior art also has difficulty judging a pedestrian when a similarly placed camera captures the pedestrian moving straight away from it.

The present invention solves the above problems of the prior art. Its object is to provide a determination device, a program, and a remote communication support device that can make the determination even when a moving object in a video is moving straight relative to the camera viewpoint of the video.

To achieve this object, the present invention is a determination device comprising: a detection and tracking unit that analyzes a video to detect and track a moving object; and a determination unit that determines the movement of the moving object based on changes in the region of the tracked moving object, including determination of movement straight toward or straight away from the camera viewpoint of the video.

The present invention is also a program that causes a computer to function as the determination device.

The present invention is further a remote communication support device comprising the determination device and a notification unit, wherein, when the determination device analyzes a video of a first location and determines that a moving object is moving straight toward or straight away from the camera viewpoint of that video, the notification unit sends a notification to a second location.

The determination device or program can determine the movement of a moving object based on changes in the region of the tracked moving object, including movement straight toward or straight away from the camera viewpoint of the video.

The remote communication support device supports remote communication between the first location and the second location by notifying the second location when the determination device makes such a determination.

FIG. 1 is a diagram schematically illustrating an example of a pedestrian in an office and the camera capturing the pedestrian.
FIG. 2 is a diagram showing an example of video of a pedestrian captured by a camera arranged as in FIG. 1.
FIG. 3 is a functional block diagram of a determination device according to an embodiment.
FIG. 4 is a functional block diagram of the detection and tracking unit 1 according to an embodiment.
FIG. 5 is a diagram for explaining image coordinates.
FIG. 6 is a diagram for explaining the processing of the tracking unit.
FIG. 7 is a functional block diagram of the determination unit according to an embodiment.
FIG. 8 is a flowchart of the processing of the determination unit according to an embodiment.
FIG. 9 is a diagram showing an example in which a telepresence robot is placed in an office space.
FIG. 10 is a functional block diagram of a remote communication support device according to an embodiment.
FIG. 11 is a diagram for explaining the processing of the control unit.

FIG. 3 is a functional block diagram of a determination device according to an embodiment. The determination device 10 includes a detection and tracking unit 1 and a determination unit 4. It receives as input a video in which a moving object such as a pedestrian has been captured, analyzes the video in units 1 and 4, and determines and outputs the state of the moving object at each point in time of the video.

In particular, the determination device 10 can determine, as the state of the moving object, whether a moving object such as a pedestrian in the input video is moving (by walking or the like) straight toward the camera capturing the video and/or straight away from that camera. In other words, it can produce a determination for the situation described with reference to FIG. 2.

The moving object in the input video need not be a person; it may be an animal, a robot, a vehicle, or any other moving object that behaves as described with reference to FIG. 2. For any such object, the determination device 10 can determine that it is approaching the camera straight on or moving away in the opposite direction. Likewise, the shooting environment is not limited to an office and may be outdoors, for example, and the mode of movement is not limited to walking and may be running or the like.

In the following description, however, for the sake of a concrete and easily understood example, the main case considered is the one described with reference to FIG. 2: the moving object is a pedestrian in an office environment, and the determination is whether the pedestrian is walking straight toward the camera or in the opposite direction. That is, determining a pedestrian's walking is only one example of a moving object moving straight toward or straight away from the camera. The processing described below can determine similar movement of objects other than people and, for people, movement by running as well as by walking.

An overview of units 1 and 4 of the determination device 10 follows.

The detection and tracking unit 1 reads the video in which the moving object has been captured, detects and tracks the moving object in the image at each time (each frame) of the video, and outputs the tracking result to the determination unit 4.

The determination unit 4 applies a classifier and rule-based determination methods to the tracking result at each time of the video output by the detection and tracking unit 1 (the region occupied by the moving object at each time and the position, velocity, size change, and so on of that region). It outputs a determination of whether, at each time, the moving object is approaching the camera straight on, moving straight away from it, or in some other state.

Units 1 and 4 are described in detail below.

FIG. 4 is a functional block diagram of the detection and tracking unit 1 according to an embodiment. The detection and tracking unit 1 includes a detection unit 2 and a tracking unit 3, outlined as follows.

For the image at each time of the input video, the detection unit 2 detects the region of the moving object as a foreground region (as distinguished from the remaining background region) and outputs the foreground region detected at each time to the tracking unit 3. Because the foreground region at each time contains noise relative to the true region of the moving object, the tracking unit 3 applies a Kalman filter to obtain the tracking result for the moving object at each time and outputs it to the determination unit 4. Units 2 and 3 are described in detail below.

The detection unit 2 obtains the foreground region by performing the following first to third processes in order, and outputs it to the tracking unit 3.

In the first process, the well-known background subtraction method is applied to detect the foreground region in the image at each time of the video. Various background subtraction methods can be used; one example is the method disclosed in Non-Patent Document 4, which models the background with a mixture of Gaussian distributions (MoG) and detects the foreground region (the region with motion) of the current frame while sequentially updating the background model with each newly observed image.
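
As a rough illustration of background subtraction with a sequentially updated model, the sketch below keeps one running Gaussian per pixel. This is a deliberately simplified stand-in, not the Gaussian mixture method of Non-Patent Document 4. The function name `update_background`, the learning rate `alpha`, the threshold factor `k`, and the variance floor `min_var` are all assumptions made for this example.

```python
def update_background(mean, var, frame, alpha=0.05, k=2.5, min_var=4.0):
    """Classify each pixel and update a per-pixel running Gaussian in place.

    mean, var, frame: 2-D lists of floats with the same shape.
    Returns a mask with 0 for foreground and 1 for background, matching the
    0 = foreground convention used for the binary images in this description.
    """
    mask = []
    for y, row in enumerate(frame):
        mask_row = []
        for x, value in enumerate(row):
            diff = value - mean[y][x]
            # A pixel is foreground if it lies more than k standard
            # deviations away from the background mean.
            is_foreground = diff * diff > (k * k) * var[y][x]
            mask_row.append(0 if is_foreground else 1)
            if not is_foreground:
                # Assumption: only background pixels update the model.
                mean[y][x] += alpha * diff
                var[y][x] = max(min_var,
                                (1 - alpha) * var[y][x] + alpha * diff * diff)
        mask.append(mask_row)
    return mask
```

Calling the function once per frame, with `mean` and `var` carried over between calls, gives the sequential update described above. A mixture model would instead keep several such Gaussians per pixel.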

In the second process, noise is reduced. The foreground region obtained in the first process contains, in addition to the true region of the moving object, noise such as so-called salt-and-pepper regions. To reduce its influence, expansion and contraction, which are well known as noise reduction for binary images, are applied to the foreground region obtained in the first process. The expansion (erode) process is given by equation (1), and the contraction (dilate) process by equation (2).

In equations (1) and (2), dst(x, y) denotes a pixel of the output image (the output foreground region) of the expansion or contraction process, and src(x, y) denotes a pixel of the input image (the input foreground region). (x, y) are coordinates within the image (that is, the region). As shown in FIG. 5, and as is customary in image processing, the origin is the top-left corner of the image P (of width Nx and height Ny), with the x axis pointing right and the y axis pointing down. In FIG. 5, the point A at coordinates (x, y) = (a1, a2) in the image P is shown as an example of specifying a pixel position in this coordinate system. The same coordinate system is used throughout the following description.

In equations (1) and (2), each pixel position (x, y) takes the value 0 (black, the minimum) if the foreground region is present there and the value 1 (white, the maximum) otherwise, as is done in binary image processing. Both equations refer to the pixels (x+x', y+y') in a predetermined neighborhood of each pixel position (x, y). In equation (1), if even one pixel in the neighborhood of (x, y) is 0, that is, belongs to the foreground region, the position (x, y) is replaced with foreground, which expands the foreground. In equation (2), the reverse operation (replacing the position with background if even one neighboring pixel is background) contracts the foreground. The neighborhood may be defined, for example, as the pixels within a predetermined distance.
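
The behavior described above can be sketched in plain Python. Equations (1) and (2) are not reproduced in this text, so the code reads equation (1) as a neighborhood minimum and equation (2) as a neighborhood maximum, which is an interpretation consistent with the 0 = foreground convention. The square neighborhood of radius `radius`, the handling of image borders, and the helper `remove_specks` (contract first, then expand) are assumptions for illustration.

```python
def _neighborhood_filter(src, radius, pick):
    """Apply pick (min or max) over a square neighborhood of every pixel."""
    height, width = len(src), len(src[0])
    dst = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbors = [
                src[y + dy][x + dx]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
                if 0 <= y + dy < height and 0 <= x + dx < width
            ]
            row.append(pick(neighbors))
        dst.append(row)
    return dst


def erode(src, radius=1):
    """Neighborhood minimum: expands the 0-valued foreground."""
    return _neighborhood_filter(src, radius, min)


def dilate(src, radius=1):
    """Neighborhood maximum: contracts the 0-valued foreground."""
    return _neighborhood_filter(src, radius, max)


def remove_specks(mask, radius=1):
    """Contract, then expand, so that isolated foreground specks vanish."""
    return erode(dilate(mask, radius), radius)
```

With this ordering a 3x3 foreground block survives unchanged while a single isolated foreground pixel is removed. The description does not fix the order or the number of passes, so both remain tuning choices.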

In the third process, fragments are merged. The second process removes salt-and-pepper noise from the foreground region, but a moving object that is really a single object may have been split into two or more foreground regions close to one another, so such split regions are merged as belonging to one moving object. For example, when the moving object is a pedestrian swinging the arms widely, the region of the head, torso, and legs and the region of an arm may be separate foreground regions at the end of the second process; the third process merges them into the single region of the pedestrian.

Specifically, the third process may, for example, merge those foreground regions obtained in the second process whose centers are no farther apart than a predetermined threshold. The center of a region may be any predetermined position defined for the region, such as its centroid or the center (the intersection of the diagonals) of the smallest rectangle enclosing it. The enclosing rectangle may be the same as the one in FIG. 6, described later for the tracking unit 3.
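
A minimal sketch of this merging step follows, assuming each foreground region is already summarized by its enclosing rectangle `(x_min, y_min, x_max, y_max)` and that the rectangle center is used as the region center. The function names, the union-find grouping, and the choice to return the enclosing rectangle of each merged group are assumptions for illustration.

```python
import math


def rect_center(rect):
    """Center of a rectangle: the intersection of its diagonals."""
    x0, y0, x1, y1 = rect
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def merge_close_regions(rects, max_center_distance):
    """Merge rectangles whose centers lie within max_center_distance."""
    parent = list(range(len(rects)))

    def find(i):
        # Follow parent links to the group representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(rects)):
        for j in range(i + 1, len(rects)):
            (xi, yi), (xj, yj) = rect_center(rects[i]), rect_center(rects[j])
            if math.hypot(xi - xj, yi - yj) <= max_center_distance:
                parent[find(i)] = find(j)

    groups = {}
    for i, rect in enumerate(rects):
        groups.setdefault(find(i), []).append(rect)
    # Each merged region is reported as the rectangle enclosing its group.
    return [
        (min(r[0] for r in g), min(r[1] for r in g),
         max(r[2] for r in g), max(r[3] for r in g))
        for g in groups.values()
    ]
```

Distances are measured between the original centers only, so the merged rectangles are not re-tested against each other. Repeating the call until the result stops changing would be one way to handle chains of fragments.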

The tracking unit 3 receives as input the foreground region produced by the third process, the last process of the detection unit 2, applies to it the well-known Kalman filter disclosed in Non-Patent Document 3 and elsewhere, and outputs the result to the determination unit 4. Applying the Kalman filter requires defining a state and a state transition, which may be done as follows, for example.

The state may be defined as follows. For the foreground region F(t) in the image P(t) at each time, the center C(t) (the intersection of the diagonals) of the smallest rectangle R(t) enclosing F(t), as shown in FIG. 6, is defined as the position (x(t), y(t)) of F(t). Its velocity (v_x(t), v_y(t)) is also defined, and the position and velocity together, (x(t), y(t), v_x(t), v_y(t)), are defined as the state. The velocity (v_x(t), v_y(t)) may be obtained, for example, as the difference from the position (x(t-1), y(t-1)) at the immediately preceding time t-1, as in equations (3) and (4).
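
The state definition can be sketched as follows. Equations (3) and (4) are not reproduced in this text; the code reads them as v_x(t) = x(t) - x(t-1) and v_y(t) = y(t) - y(t-1), which follows from the description of the velocity as a difference of positions. The function names and the zero velocity used for the first frame are assumptions.

```python
def bounding_rect(foreground_pixels):
    """Smallest axis-aligned rectangle enclosing a set of (x, y) pixels."""
    xs = [x for x, _ in foreground_pixels]
    ys = [y for _, y in foreground_pixels]
    return (min(xs), min(ys), max(xs), max(ys))


def make_state(rect, previous_position=None):
    """Return the state (x, y, v_x, v_y) for the rectangle R(t)."""
    x0, y0, x1, y1 = rect
    # Position: center of the rectangle (intersection of its diagonals).
    x, y = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    if previous_position is None:
        # Assumption: no velocity is known on the first frame.
        return (x, y, 0.0, 0.0)
    prev_x, prev_y = previous_position
    # Velocity: difference from the position one frame earlier.
    return (x, y, x - prev_x, y - prev_y)
```

For a pedestrian walking toward the camera whose feet leave the bottom of the image, the rectangle center moves little, which is why the determination unit also receives the rectangle R(t) itself.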

The state transition from time t to time t+1 of the state defined by position and velocity may be defined as in equations (5) to (8), where n_x(t), n_y(t), n_vx(t), and n_vy(t) are noise terms.
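
Equations (5) to (8) are not reproduced in this text. A common reading, assumed here, is the constant-velocity model x(t+1) = x(t) + v_x(t) + n_x(t) and v_x(t+1) = v_x(t) + n_vx(t), with the same form for y. Under that assumption the two image axes decouple, so one small filter per axis suffices. The class name `AxisKalman`, the noise variances `q_pos`, `q_vel`, and `r`, and the initial covariance `p0` are illustrative choices, not values from the patent.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one image axis."""

    def __init__(self, position, q_pos=0.01, q_vel=0.01, r=1.0, p0=10.0):
        self.p, self.v = float(position), 0.0
        # Covariance matrix entries, P = [[a, b], [c, d]].
        self.a, self.b, self.c, self.d = p0, 0.0, 0.0, p0
        self.q_pos, self.q_vel, self.r = q_pos, q_vel, r

    def step(self, measured_position):
        """Predict one frame ahead, then correct with the measured center."""
        # Predict: position advances by the velocity, velocity is kept.
        p_pred = self.p + self.v
        v_pred = self.v
        a = self.a + self.b + self.c + self.d + self.q_pos
        b = self.b + self.d
        c = self.c + self.d
        d = self.d + self.q_vel
        # Update: only the position is observed.
        s = a + self.r
        k_pos, k_vel = a / s, c / s
        residual = measured_position - p_pred
        self.p = p_pred + k_pos * residual
        self.v = v_pred + k_vel * residual
        self.a, self.b = (1 - k_pos) * a, (1 - k_pos) * b
        self.c, self.d = c - k_vel * a, d - k_vel * b
        return self.p, self.v
```

A tracker built this way would hold one `AxisKalman` for x and one for y per moving object and feed each the corresponding center coordinate of R(t) every frame.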

Based on the state (x(t), y(t), v_x(t), v_y(t)) of the moving object and the information on the rectangular region R(t) output by the tracking unit 3 of the detection and tracking unit 1 as described above, the determination unit 4 outputs which of the following three states W, T, and N the moving object is in.
W: walking, that is, the object is moving by walking or the like.
T: toward camera, that is, the object is moving straight toward the camera viewpoint.
N: non-walking, that is, the object is not moving by walking or the like, or is not a moving object.

The symbol W above is taken from "walking", but as noted earlier, walking is only one example of movement. State T covers not only movement straight toward the camera viewpoint but also movement in the opposite direction, straight away from the camera viewpoint (away from camera).

When the tracking unit 3 has obtained states for two or more moving objects in the image P(t) at the current time, the determination unit 4 outputs which of the states W, T, and N applies to each moving object. The processing by which the determination unit 4 determines which of the three states applies to a moving object in the image P(t) at the current time t is described in detail below.

FIG. 7 is a functional block diagram of the determination unit 4 according to an embodiment. The determination unit 4 includes a first determination unit 41 that determines whether the state is N (not a moving object), a second determination unit 42 that determines whether the state is W (moving), and a third determination unit 43 that determines whether the state is T (moving straight relative to the camera viewpoint).

FIG. 8 is a flowchart of an embodiment in which the first to third determination units 41 to 43 of FIG. 7 cooperate to make the determination. Units 41 to 43 are described in detail below, step by step through FIG. 8.

In step S1, the first determination unit 41 applies a pre-trained classifier, such as random trees or an SVM (support vector machine), to the state (x(t), y(t), v_x(t), v_y(t)) of the moving object to determine whether the object is in fact not a moving object and is therefore in state N. The process then proceeds to step S2.

The purpose of step S1 is to eliminate, as noise, cases that the detection and tracking unit 1 detected as a moving object but that do not involve continuous movement like a pedestrian's: for example, an object in the office space (such as a personal computer) that happened to move slightly, or a person whose head or hands moved without the displacement that accompanies walking. Accordingly, to train the classifier in advance, the detection and tracking unit 1 is applied to training video to obtain a series of tracking results, each tracked object is labeled (manually, for example) as to whether it really is a moving object, and the classifier is built from the resulting training data.

In step S2, it is checked whether the determination result of step S1 was state N (not a moving object). If it was, the process proceeds to step S11, state N is output as the determination result of the determination unit 4, and the flow ends. If it was not, the process proceeds to step S3.

In step S3, the second determination unit 42 determines by a rule-based method whether the moving object is in state W, and the process then proceeds to step S4. Step S3 is detailed below.

For the state (x(t), y(t), v_x(t), v_y(t)) of the moving object, the series of states (x(i), y(i), v_x(i), v_y(i)) at each time i (i = t0, t0+1, t0+2, ..., t-2, t-1, t) from a past time t0 up to the current time t is referenced to obtain the movement amount D_x and the average velocity avg_v_x in the x direction between times t0 and t. If the rule-based equation (9) holds, the state is determined to be W; otherwise it is determined not to be W.

In expression (9), THD and THV are predetermined thresholds for the displacement D_x and the average velocity avg_v_x, respectively. In other words, expression (9) determines state W when the x-direction displacement D_x and the average velocity avg_v_x are both large.

As Fig. 5 explains for the image coordinates (x, y) and Fig. 2 illustrates with actual images, the y direction (vertical on the image) is the direction of moving straight toward or away from the camera viewpoint, whereas the x direction (horizontal on the image) is the direction of moving across the camera viewpoint (parallel to the camera). State W as determined by the second determination unit 42 in step S3 is therefore the state in which the moving target is moving across the camera viewpoint.

The past time t0 in expression (9) may be the time a predetermined number of frames before the current time t, or the time at which the moving target under consideration at the current time t (the target of the state determination) was first detected.

The second determination unit 42 may also require an additional condition, on top of expression (9), before determining state W: the moving target under consideration at the current time t must have been tracked continuously by the detection and tracking unit 1 for at least a fixed number of frames. Imposing this condition takes stability along the time axis into account.
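The rule-based check of step S3 can be sketched as follows. Expression (9) itself is not reproduced in this text, so the sketch assumes the form suggested by the description: state W holds when the magnitude of the net x-direction displacement exceeds THD and the magnitude of the average x-direction velocity exceeds THV. The names `TrackState` and `is_state_w`, the use of absolute values, and the strict inequalities are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TrackState:
    """Kalman-filter output for one frame: position and velocity on the image."""
    x: float
    y: float
    vx: float
    vy: float


def is_state_w(
    states: Sequence[TrackState],
    thd: float,
    thv: float,
    tracked_frames: Optional[int] = None,
    min_tracked_frames: int = 0,
) -> bool:
    """Return True if the track from t0 (states[0]) to t (states[-1]) looks like state W.

    Assumed form of expression (9): |D_x| > THD and |avg_v_x| > THV.
    """
    if len(states) < 2:
        return False
    # Optional stability condition: the target must have been tracked long enough.
    if tracked_frames is not None and tracked_frames < min_tracked_frames:
        return False
    d_x = abs(states[-1].x - states[0].x)                   # net x displacement
    avg_v_x = abs(sum(s.vx for s in states) / len(states))  # mean x velocity
    return d_x > thd and avg_v_x > thv
```

For example, a track that moves 10 pixels per frame along x for six frames gives D_x = 50 and avg_v_x = 10, which passes with THD = 30 and THV = 5.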

In step S4, it is determined whether the result of step S3 was state W (a target moving in the x direction). If so, the process proceeds to step S13, outputs state W as the determination result of the determination unit 4, and the flow ends. Otherwise, the process proceeds to step S5.

In step S5, the third determination unit 43 obtains a feature quantity Q(t) of the moving target's region R(t) at the current time t from the change amounts r(i) of the series of regions R(i) (i = t0, t0+1, ..., t-1, t) from a past time t0, chosen as for expression (9), to the current time t. It then applies a pre-trained classifier such as a random tree or an SVM to Q(t) to determine whether the moving target at the current time t is in state T, and the process proceeds to step S6. The details of step S5 are as follows.

Here, the region R(t) is, as described with reference to Fig. 6, the rectangle enclosing the moving target's region F(t), which is the foreground region at the current time t. As a feature quantity Q(t) from which state T as described in Fig. 2 can be determined, the following can be used, for example.

In a first embodiment, the change amount r1(i) of the area S(i) of the region R(i) at each time i is obtained as in equation (10), and the average of r1(i) over the interval from t0 to t is obtained as the feature quantity Q(t) as in equation (11).

In a second embodiment, the change amount r2(i) of the width w(i) of the region R(i) at each time i (the horizontal width in the coordinates defined in Figs. 5, 6, etc.) is obtained as in equation (12), and the average of r2(i) over the interval from t0 to t is obtained as the feature quantity Q(t) as in equation (13).

In equations (10) and (12) of the first and second embodiments, the change amounts r1(i) and r2(i) are obtained as the rate of change of the area S(i) and the width w(i), respectively, but they may instead be obtained as the absolute value of the difference, or as any other quantity that expresses the change. In a third embodiment, with Q1(t) denoting the feature quantity of the first embodiment and Q2(t) that of the second, the pair (Q1(t), Q2(t)) may be used as the feature quantity.
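A minimal sketch of the three feature variants follows. Equations (10) to (13) are not reproduced in this text, so the sketch assumes one plausible reading of "rate of change": the ratio of each value to the value one frame earlier, averaged over the interval from t0 to t. The function names and the frame-to-frame ratio are assumptions, not the patent's exact formulas.

```python
from typing import List, Sequence, Tuple


def change_ratios(values: Sequence[float]) -> List[float]:
    """Frame-to-frame ratio values[i] / values[i-1] (assumed form of r1(i), r2(i))."""
    return [cur / prev for prev, cur in zip(values, values[1:]) if prev > 0]


def mean_change(values: Sequence[float]) -> float:
    """Average change ratio over the window (assumed form of Q(t)).

    Returns 1.0 (no change) when fewer than two usable samples exist.
    """
    ratios = change_ratios(values)
    return sum(ratios) / len(ratios) if ratios else 1.0


def combined_feature(areas: Sequence[float], widths: Sequence[float]) -> Tuple[float, float]:
    """Third embodiment: the pair (Q1(t), Q2(t)) built from area and width."""
    return mean_change(areas), mean_change(widths)
```

Under this reading, a bounding rectangle that grows steadily as a person walks toward the camera yields a feature above 1, and one that shrinks yields a feature below 1. The classifier then decides state T from these values.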

The feature quantity Q(t) is not limited to quantities based on the change in area or width as in equations (10) to (14) above. Any quantity calculated so as to reflect the change in the region R(t) described in Fig. 2, in particular the change in its size, can be used.

When the third determination unit 43 determines in step S5 whether the target is in state T, it may impose a further condition in addition to applying the classifier to the feature quantity Q(t). As with the additional condition available to the second determination unit 42 in step S3, the moving target under consideration at the current time t must have been tracked continuously by the detection and tracking unit 1 for at least a fixed number of frames. This condition likewise takes stability along the time axis into account.

In step S6, it is determined whether the result of step S5 was state T (a target moving straight relative to the camera viewpoint). If so, the process proceeds to step S12, outputs state T as the determination result of the determination unit 4, and the flow ends. If the result of step S5 was not state T, the process proceeds to step S11, outputs state N (not a moving target) as the determination result of the determination unit 4, and the flow ends.
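The flow of steps S1 to S6 amounts to a three-stage cascade. The sketch below shows only the control flow; the three predicates stand in for the first, second, and third determination units and are passed in as callables, since their internals (the trained classifiers and the rule of expression (9)) are outside this illustration. The names `State` and `determine_state` are hypothetical.

```python
from enum import Enum
from typing import Any, Callable


class State(Enum):
    N = "N"  # not a moving target
    W = "W"  # moving across the camera viewpoint (x direction)
    T = "T"  # moving straight toward or away from the camera viewpoint


def determine_state(
    track: Any,
    is_moving_target: Callable[[Any], bool],   # stands in for unit 41 (step S1)
    is_crossing: Callable[[Any], bool],        # stands in for unit 42 (step S3)
    is_straight_motion: Callable[[Any], bool], # stands in for unit 43 (step S5)
) -> State:
    """Cascade of steps S1 to S6, ending in S11 (N), S13 (W), or S12 (T)."""
    if not is_moving_target(track):   # S1, S2 -> S11
        return State.N
    if is_crossing(track):            # S3, S4 -> S13
        return State.W
    if is_straight_motion(track):     # S5, S6 -> S12
        return State.T
    return State.N                    # S6 -> S11
```

Omitting a stage, as the supplementary notes allow, corresponds to passing a predicate that always lets the track through (for example `lambda _: True` for the first stage or `lambda _: False` for the second).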

An embodiment of the determination unit 4 has been described above with reference to Figs. 7 and 8. Several supplementary points follow.

(Supplement 1) In the embodiment above, the first determination unit 41 and the second determination unit 42 act as screening stages ahead of the determination by the third determination unit 43, but the third determination unit 43 may be applied with the first determination unit 41 and/or the second determination unit 42 omitted. For example, when the first determination unit 41 does not find state N, the second determination unit 42 may be skipped and the third determination unit 43 applied directly, or only the third determination unit 43 may be applied from the start. The order of the screening by the first determination unit 41 and the second determination unit 42 may also be reversed.

When the first determination unit 41 is omitted, the processing of steps S1 and S2 is skipped and the branch at step S2 always proceeds to step S3. When the second determination unit 42 is omitted, the processing of steps S3 and S4 is skipped and the branch at step S4 always proceeds to step S5.

(Supplement 2) The third determination unit 43 uses quantities such as the area S(i) and the width w(i) of the region R(i) at each time i, and these may be affected by noise. The influence of noise may therefore be removed first, by applying a low-pass filter or function fitting to the time series of the area S(i) or the width w(i), before the change amounts r1(i) and r2(i) of equations (10) and (12) are calculated.
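One simple low-pass filter that fits this purpose is a causal moving average. The sketch below is only one possible choice (the text does not name a specific filter), and the function name and default window are assumptions.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int = 3) -> List[float]:
    """Causal moving average: each output averages the current and up to
    window-1 preceding samples, so no future frames are needed."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start : i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

The smoothed series of S(i) or w(i) would then be passed to the change-amount calculation in place of the raw series.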

(Supplement 3) When the third determination unit 43 determines state T, it may additionally calculate the velocity of the moving target in the y direction, the direction straight toward or away from the camera viewpoint. The state (x(i), y(i), v_x(i), v_y(i)) at each time i is available as the output of the Kalman filter in the tracking unit 3, but when the current time t is determined to be state T, the situation is the one described in Fig. 2, so the y-direction velocity v_y(t) in the Kalman filter output is unlikely to reflect the actual velocity. The third determination unit 43 may therefore calculate the y-direction velocity of the moving target as an estimate of the actual value, as follows.

Specifically, assuming that r1(t) of equation (10) or r2(t) of equation (12) correlates with the y-direction velocity V(t), the velocity can be calculated as in equation (14) or (15) below.
V(t) = k1 * r1(t) …(14)
V(t) = k2 * r2(t) …(15)

The y-direction velocity V(t) estimated above is the y-direction velocity that would be detected on the image if the situation of Fig. 2 had not occurred.

Here, k1 and k2 are both constants given in advance with the correlation in mind. They can be set taking into account the camera calibration and the 3D distance in depth (the y direction) of the space in which the video is actually captured.

As an alternative to giving k1 or k2 as a constant as above, k can also be calculated from the video actually being captured, as follows.

Specifically, if the time t-1 immediately before the current time t at which state T was determined (one frame earlier) was determined to be state W (movement in the x direction), the correlation is taken to be obtainable from the x-direction velocity v_x(t-1) at that time in the Kalman filter output, and k is calculated as in equation (16) or (17) below.
k1 = v_x(t-1) / r1(t-1) …(16)
k2 = v_x(t-1) / r2(t-1) …(17)

If the immediately preceding time t-1 is not in state W but a recent time t-k within a fixed range is, k may be calculated at that time t-k in the same way as in equations (16) and (17). Alternatively, the average of all moving-target velocities measured so far may be used as v_x(t-1). To use this average, the determination device 10 is provided with a storage unit (not shown in Fig. 3, etc.) that stores, each time state W is determined, the x-direction velocity v_x(i) at that time i (as output by the Kalman filter), and the average is calculated from the stored values. The average may be restricted to a predetermined past period (for example, the past week only).
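Equations (14) to (17) and the fallback rules can be sketched together as below. The per-frame history format (a list of `(label, v_x, r)` tuples, oldest first and ending at time t-1), the function names, and the handling of a zero change amount are illustrative assumptions.

```python
from typing import List, Optional, Tuple

# One entry per past frame: (state label, Kalman v_x, change amount r1 or r2).
Frame = Tuple[str, float, float]


def calibrate_k(
    history: List[Frame],
    max_lookback: int,
    fallback_vx: Optional[float] = None,
) -> Optional[float]:
    """Estimate k from past frames, following equations (16)/(17).

    Searches back from t-1 over at most max_lookback frames for a state-W frame
    and returns v_x / r at that frame. If none is found and fallback_vx (for
    example the stored average of past state-W velocities) is given, it is used
    in place of v_x(t-1). Returns None when no estimate is possible.
    """
    if max_lookback < 1 or not history:
        return None
    for label, v_x, r in reversed(history[-max_lookback:]):
        if label == "W" and r != 0:
            return v_x / r
    r_prev = history[-1][2]
    if fallback_vx is not None and r_prev != 0:
        return fallback_vx / r_prev
    return None


def estimate_velocity(k: float, r_t: float) -> float:
    """Equations (14)/(15): V(t) = k * r(t)."""
    return k * r_t
```

For instance, with v_x = 6.0 and r = 1.5 at the most recent state-W frame, k = 4.0, and a current change amount of 2.0 gives V(t) = 8.0.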

Below, as an application of the determination device 10 described above, an embodiment is described in which all or part of the determination device 10 is incorporated into a remote communication support device, or in which the determination device 10 is used in combination with a remote communication support device.

Fig. 9 schematically shows a so-called neck-type telepresence robot TR1, which can serve as part or all of a remote communication support device or be used in combination with one, placed in the same office space as in Fig. 1. That is, Fig. 9 shows the same office space O1 as Fig. 1, except that the telepresence robot TR1 (hereinafter, robot TR1) is placed at the position of the door D1 in place of the door D1.

The robot TR1 has a camera C1 with which it can capture the worker H1 and others in the office O1, and a display DP1 with which it can show, on the office O1 side, the face image of a remote worker working somewhere other than the office O1, such as at home. The robot TR1 also has a microphone and a speaker, so that the remote worker and the office side can communicate remotely through a videophone function. Although not shown in Fig. 9, a device with the same videophone function as the robot TR1 must also be placed on the remote worker's side.

The robot TR1 may also have a function for changing, through operation by the remote worker or others, the range captured by the camera C1 in the office space O1 where it is placed (that is, changing the viewpoint of the robot TR1), and a function for moving within the office space O1 by driving actuators such as wheels.

Fig. 10 is a functional block diagram of a remote communication support device according to an embodiment. The remote communication support device 20 (hereinafter, support device 20) includes a determination device 10 as described with reference to Figs. 3 to 8, a control unit 11 that controls the operation of the robot TR1, and a notification unit 12 that sends notifications to the remote worker's side.

In the embodiment of Fig. 10, the determination device 10 receives video of a first location (an office space, for example) as input and determines whether a moving target is present at the first location and is in state T. When state T is determined, the control unit 11 and the notification unit 12 can operate as follows.

When a moving target (such as a pedestrian in the office) is detected in state T, the control unit 11 controls the robot TR1 so that its line of sight (the camera C1 in Fig. 10) faces the moving target, that is, so that the robot TR1 looks toward the moving target. For this control, the robot TR1 is assumed to have at least an actuator mechanism that moves and adjusts the line of sight of the camera C1 within the horizontal plane.

Specifically, the current target angle θ is first calculated by equation (18) at fixed time intervals (for example, every 0.5 seconds). This assumes that, as in the example of Fig. 9, the front direction corresponding to the face of the robot TR1 coincides with the optical-axis direction of the camera C1. In the example of Fig. 9, the camera C1 is integrated with the head of the robot TR1, so this assumption holds.

Fig. 11 shows the arrangement assumed by equation (18) above, in a cross-section cutting the image P horizontally. Here, x is the number of pixels between the center of the pedestrian and the center of the image, L is half the width of the image P in the video (Nx in Fig. 5), that is, L = Nx/2, and Φ is half the viewing angle of the camera C1. In Fig. 11, point P1 is the camera position, point P2 is the center of the image P, and point P3 is the center position of the moving target detected in the image P.
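Equation (18) is not reproduced in this text. Under a pinhole-camera reading of the Fig. 11 geometry (an assumption), the distance from P1 to P2 is L / tan Φ in pixel units, so the angle to the target satisfies tan θ = (x / L) tan Φ. The sketch below computes that relation and should be taken as a reconstruction rather than the patent's exact formula. The function and parameter names are illustrative.

```python
import math


def target_angle(x_pixels: float, half_width_pixels: float, half_fov_rad: float) -> float:
    """Angle (radians) from the optical axis to a target x_pixels off image center.

    Assumed pinhole relation: tan(theta) = (x / L) * tan(Phi), where L is half
    the image width in pixels and Phi is half the horizontal viewing angle.
    The sign of the result follows the sign of x_pixels.
    """
    if half_width_pixels <= 0:
        raise ValueError("half_width_pixels must be positive")
    return math.atan((x_pixels / half_width_pixels) * math.tan(half_fov_rad))
```

Under this relation, a target at the image center gives θ = 0 and a target at the image edge (x = L) gives θ = Φ.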

Next, the current gaze angle of the robot TR1 is compared with the target angle, and the robot TR1 is controlled accordingly. When the difference from the target angle is at or above a fixed value (TH1) or at or below a fixed value (TH2), the robot TR1 is controlled so as not to move. Otherwise, if the target angle can be reached within the fixed time interval, the robot moves to the target angle; if not, it moves at maximum speed for the fixed time interval.
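One control step of this gaze adjustment might look like the sketch below. It assumes that the comparison uses the magnitude of the angular difference, that TH2 is a small dead-band value and TH1 a large cut-off, and that the actuator is described by a single maximum angular speed. These are illustrative assumptions, as are the function and parameter names.

```python
import math


def next_gaze_angle(
    current: float,
    target: float,
    th1: float,
    th2: float,
    max_speed: float,
    interval: float,
) -> float:
    """Return the gaze angle to reach by the end of one control interval.

    Angles share one unit (e.g. degrees); max_speed is in that unit per second.
    """
    diff = target - current
    # Do not move when the difference is too large (>= TH1) or too small (<= TH2).
    if abs(diff) >= th1 or abs(diff) <= th2:
        return current
    max_step = max_speed * interval
    if abs(diff) <= max_step:
        return target  # reachable within this interval
    # Otherwise rotate at maximum speed for the whole interval.
    return current + math.copysign(max_step, diff)
```

With a 0.5 second interval and a maximum speed of 30 degrees per second, a 10 degree difference is closed in one step, while a 40 degree difference produces a 15 degree move.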

The control for aligning the robot's line of sight with a target such as a moving target is not limited to the above, and various existing techniques may be used.

When a moving target such as a pedestrian is determined by the determination device 10 to be in state T and has come within a fixed distance of the robot TR1, the notification unit 12 notifies a second location that the moving target exists and is approaching. In telework support, for example, a telecommuter is present at the second location and receives the notification. The notification unit 12 can choose the form of the notification depending on whether the telecommuter is busy.

To determine whether the target is within a fixed distance, for example within 2 meters, the notification unit 12 can use the aspect ratio and size of the region R(t) of the moving target, such as a pedestrian, detected in the video. The target is judged to be within that distance when both of the following conditions 1 and 2 are satisfied.
(Condition 1) The aspect ratio of the moving target's region is at or below a fixed value; that is, the threshold test finds that the region is not vertically elongated.
(Condition 2) The size (the width and/or height of the region R(t)) is at or above a fixed value.
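The two conditions can be sketched as a simple predicate on the bounding rectangle. The text does not say which way the aspect ratio is formed, so the sketch assumes height divided by width, which makes "at or below a fixed value" agree with "not vertically elongated". It also checks only the width for condition 2, which is one of the allowed choices. Names and thresholds are hypothetical.

```python
def is_within_distance(
    width: float,
    height: float,
    max_aspect: float,
    min_width: float,
) -> bool:
    """Proximity test on the region R(t) using conditions 1 and 2.

    Assumptions: aspect ratio = height / width; size is judged by width alone.
    """
    if width <= 0 or height <= 0:
        return False
    aspect = height / width
    not_elongated = aspect <= max_aspect   # condition 1
    large_enough = width >= min_width      # condition 2
    return not_elongated and large_enough
```

The intuition is that a distant pedestrian appears as a small, tall rectangle, while a nearby one fills much of the frame and is often cut off at the bottom, which makes the rectangle both larger and less elongated.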

To determine whether the telecommuter is busy, the notification unit 12 can use, for example, the method disclosed in Non-Patent Document 5 cited above. In Non-Patent Document 5, a subject such as a telecommuter wears specific sensors that acquire psycho-physiological data (for example, the Neurosky Mindband (trade name), which carries an EEG (electroencephalogram) sensor, and the Empatica E3 (trade name), which carries an EDA (electrodermal activity) sensor). The data acquired by the sensors are input to a pre-trained naive Bayes classifier to obtain a determination of whether the subject is busy.

When the notification unit 12 determines that the telecommuter is busy, it displays only a visual notification such as text. When it determines that the telecommuter is not busy, it displays the visual notification and also plays a sound effect. The content of the notification may be information that a moving target exists at the first location and is in state T, or other predetermined content suited to the use of the support device 20. When the telecommuter is busy, the control unit 11 may also suppress robot motions that give workers on the office side (the first location) the feeling of being watched by the robot TR1.
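The notification decision described above can be sketched as follows. The `Notification` structure and the function name are hypothetical. The busy flag is assumed to come from an external estimator such as the one in Non-Patent Document 5.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Notification:
    show_text: bool   # visual notification such as text
    play_sound: bool  # additional sound effect


def plan_notification(state: str, within_distance: bool, is_busy: bool) -> Optional[Notification]:
    """Decide whether and how to notify the second location.

    A notification is issued only for state T within the fixed distance.
    A busy telecommuter gets the visual notification only.
    """
    if state != "T" or not within_distance:
        return None
    return Notification(show_text=True, play_sound=not is_busy)
```

Keeping the decision separate from the actual display and audio output makes it easy to swap in a different busy-state estimator or other notification channels.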

Further supplementary matters of the present invention follow, continuing from (Supplement 1) to (Supplement 3) above.

(Supplement 4) For each of the determination device 10 and the support device 20, the hardware configuration of an ordinary computer can be used to implement its units (the functional elements described in Fig. 3, Fig. 10, etc.).

That is, the hardware of the determination device 10 and the support device 20 may be a mobile terminal such as a smartphone or tablet, or a desktop, laptop, or other general-purpose computer. Such a computer has a CPU (central processing unit), a temporary storage device that provides a work area for the CPU, a secondary storage device that stores programs and other data, various input/output devices, and a bus that carries data between them. The units shown in Fig. 3, Fig. 10, etc. are realized when the CPU reads and executes a program stored in the secondary storage device. The present invention can also be provided as such a program. The input/output devices can be chosen as needed from among a camera for acquiring images, a display, a touch panel or keyboard for receiving user input, a microphone and speaker for audio input and output, and a communication interface for wired or wireless communication with the outside.

(Supplement 5) The third determination unit 43 determines state T by applying a classifier to the feature quantity Q(t), which is calculated for the current time t so as to reflect changes in the region R(i) such as r1(i) and r2(i) at each time i. As described for the first determination unit 41, this classifier can be built by training on pre-labeled training data provided manually or by other means.

In addition to the feature quantity Q(t), the third determination unit 43 may also use the state (x(t), y(t), v_x(t), v_y(t)) at the current time t output by the Kalman filter as features, and determine state T with a classifier trained in advance in the same way. That is, the whole of (Q(t), x(t), y(t), v_x(t), v_y(t)) may be used as the feature quantity.

The third determination unit 43 may also distinguish between the state of approaching the camera viewpoint straight on and the state of moving away in the opposite direction. In that case, a classifier for the approaching state and a classifier for the receding state are each trained in advance separately, in the same manner as described above. Separate feature quantities Q(t), etc. may also be calculated for each.

(Supplement 6) As is clear from the description above, the determination device 10 is also applicable to the following kinds of video. When the target is stationary and the camera moves relative to it, producing video like that of Fig. 2, the determination device 10 can be applied and yields the determination that the target is approaching or receding straight relative to the camera. In this case, however, so that background subtraction remains applicable, it is preferable that the target changes in size while the background remains almost unchanged (for example, because it is very far away). Likewise, if both the target and the camera move and video like that of Fig. 2 results, the determination device 10 can make its determination on that video as well.

Similarly, the determination device 10 is not limited to live-action video captured by a camera; it can also make determinations on video created with computer graphics, animated video, and combinations of these. That is, regardless of how the video was generated, the determination device 10 can be applied to any video in which the moving target changes as in Fig. 2, and can determine whether the moving target is approaching or receding straight relative to the viewpoint of the video (the camera viewpoint, in the case of live-action video).

(Supplement 7) For the detection and tracking unit 1, detecting and tracking regions by motion features as described above is one example of a method well suited to video captured in an office environment or the like. In an office environment, lighting is relatively stable and there is little movement of trees or objects caused by wind, so almost all motion can be assumed to originate from people: people walking, hand and head movements, and objects such as PCs and chairs being moved by people. Motion can therefore be detected stably in an office environment, and in most cases its meaning and type are clear. On the other hand, occlusion by furniture and other people is frequent in an office environment, so texture-related features such as the HoG of Non-Patent Document 2 cited above, color histograms, and edges can be expected to lack temporal continuity.

However, when the video is captured in an environment where texture-related features are better suited to detecting and tracking regions, the detection and tracking unit 1 may detect and track regions based on those texture-related features. More generally, any existing method suited to the characteristics of the video may be used for the detection and tracking unit 1 to detect and track the moving region F(t) in the video. As described above, the region R(t) is then detected as the rectangle enclosing the region F(t), and the Kalman filter is applied to output the state (x(t), y(t), v_x(t), v_y(t)).

10 ... determination device, 1 ... detection and tracking unit, 4 ... determination unit
20 ... remote communication support device, 11 ... control unit, 12 ... notification unit

Claims

1. A determination device comprising:
a detection and tracking unit that analyzes a video to detect and track a moving object; and
a determination unit that determines movement of the moving object based on a change in the region of the tracked moving object, including determining movement straight toward or away from the camera viewpoint of the video,
wherein the determination unit determines, based on the amount of lateral movement over a predetermined period and/or the lateral movement speed of the tracked moving object, whether the tracked moving object is moving laterally, and, when the tracked moving object is not determined to be moving laterally, determines movement straight toward or away from the camera viewpoint of the video.

2. The determination device according to claim 1, wherein the determination unit determines movement straight toward or away from the camera viewpoint of the video based on a change in the area of the region of the tracked moving object.

3. The determination device according to claim 1 or 2, wherein the determination unit determines movement straight toward or away from the camera viewpoint of the video based on a change in the width of the region of the tracked moving object.

4. The determination device according to any one of claims 1 to 3, wherein, when it has been determined that there is movement straight toward or away from the camera viewpoint of the video, the determination unit further estimates the speed of that movement based on the change in the region of the tracked moving object.

5. The determination device according to claim 4, wherein, in estimating the speed, the determination unit further bases the estimate on the lateral speed of the tracked region detected at the most recent past time before the time at which it was determined that there is movement straight toward or away from the camera viewpoint of the video.

6. The determination device according to any one of claims 1 to 5, wherein the determination unit determines, based on the position and speed of the tracked moving object, whether the tracked moving object is an actual moving object, and, when it is determined to be an actual moving object, determines movement straight toward or away from the camera viewpoint of the video.

7. The determination device according to any one of claims 1 to 6, wherein the detection and tracking unit extracts a foreground region from the image at each time of the video by background subtraction and tracks the moving object based on the extracted foreground region.

8. A program that causes a computer to function as the determination device according to any one of claims 1 to 7.

9. A remote communication support device comprising the determination device according to any one of claims 1 to 7 and a notification unit,
wherein, when the determination device analyzes video of a first location and thereby determines that there is a moving object moving straight toward or away from the camera viewpoint of that video,
the notification unit issues a notification to a second location.
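
As an illustration only, the order of tests recited in the claims (a lateral-movement test first, then an area-based test for straight approach or retreat) might be sketched in Python as follows. The function name `judge_movement`, the per-frame tuple layout, and all threshold values are hypothetical and are not taken from this document. Reading a growing region as approach and a shrinking region as retreat is an assumption based on perspective projection.

```python
def judge_movement(track,
                   lateral_shift_thresh=20.0,  # pixels over the period (assumed)
                   lateral_speed_thresh=1.0,   # pixels per frame (assumed)
                   area_ratio_thresh=1.2):     # relative area change (assumed)
    """Classify a tracked region over a predetermined period.

    track: list of (x, v_x, width, height) tuples, one per frame.
    Returns "lateral", "approaching", "receding", or "none".
    """
    x_first, _, w_first, h_first = track[0]
    x_last, vx_last, w_last, h_last = track[-1]

    # Step 1: lateral test on displacement over the period and/or lateral speed
    if (abs(x_last - x_first) >= lateral_shift_thresh
            or abs(vx_last) >= lateral_speed_thresh):
        return "lateral"

    # Step 2: reached only when the object was not judged to be moving laterally;
    # compare the area of the tracked region at the start and end of the period
    ratio = (w_last * h_last) / (w_first * h_first)
    if ratio >= area_ratio_thresh:
        return "approaching"
    if ratio <= 1.0 / area_ratio_thresh:
        return "receding"
    return "none"
```

For example, a region that stays at the same horizontal position while growing from 20 x 40 to 30 x 60 pixels is classified as approaching, whereas the same growth combined with a 50-pixel horizontal shift is classified as lateral, because the lateral test takes precedence.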