JP5886594B2

JP5886594B2 - Camera system

Info

Publication number: JP5886594B2
Application number: JP2011235737A
Authority: JP
Inventors: 裕二中沢
Original assignee: Secom Co Ltd
Current assignee: Secom Co Ltd
Priority date: 2011-10-27
Filing date: 2011-10-27
Publication date: 2016-03-16
Anticipated expiration: 2031-10-27
Also published as: JP2013093787A

Description

The present invention relates to a camera system that detects the position of a moving object, such as a person, using a plurality of arranged cameras, and in particular to a camera system in which the external parameters of the cameras are calibrated separately for subsets of the cameras.

When the behavior of moving objects is analyzed based on images of a target space, arranging a plurality of cameras so that adjacent cameras share a common field of view has two advantages. Analyzing a moving object three-dimensionally in the common field of view allows the object position to be detected accurately even when occlusion occurs, and linking multiple fields of view allows the object position to be tracked over a wide area.

To exchange object position information among multiple cameras, calibration that measures the external parameters (camera position and orientation) of each camera using a common field of view must be performed accurately. When there are many cameras, however, it is difficult to set up a field of view common to all of them. The imaging apparatus calibration method described in Patent Document 1 therefore first calibrates each group of cameras that share a field of view, and then converts these partial calibration results into a single coordinate system.

In doing so, the method of Patent Document 1 corrects the external parameters of each camera little by little in order to even out the errors that arise between the partial calibration results, thereby distributing the error across all cameras.

Patent Document 1: JP 2011-86111 A

To perform tracking with a plurality of cameras properly, the three-dimensional position of an object must be obtained accurately from the images of those cameras. This requires highly accurate camera calibration within each group of cameras (cluster) that has a common field of view. If the error is distributed across all cameras as described above to ensure consistency of the camera system as a whole, the external parameters within a cluster, such as camera positions, are also modified. As a result, tracking accuracy within the cluster is degraded. Furthermore, the process of associating and integrating the tracking results of the clusters presupposes high tracking accuracy within each cluster. A drop in intra-cluster tracking accuracy therefore also lowers the accuracy of the inter-cluster integration process, and the overall tracking accuracy is degraded.

In addition, the calibration result of each cluster contains its own quantization error and the like, and these errors are superimposed when object position information is exchanged between clusters. Consequently, when the clusters are connected in a loop, for example, the consistency of object positions can no longer be maintained for the system as a whole.

Thus, tracking with a large number of cameras involves a trade-off between local object position detection accuracy and global consistency of object positions.

The present invention has been made in view of the above problems, and its object is to provide a camera system that can detect object positions with high accuracy in a wide space using a large number of cameras.

A camera system according to the present invention includes a plurality of clusters, each cluster being a group of cameras having a common field of view, the clusters being chained together such that adjacent clusters share a common camera. The camera system comprises: a cluster calibration information storage unit that stores a cluster coordinate system set for each cluster and the position and orientation of each camera in that cluster coordinate system; an integrated coordinate system storage unit that stores an integrated coordinate system spanning all clusters, in which the mutual arrangement of the cluster coordinate systems has been adjusted by allowing an error within a predetermined range between the positions and orientations that the common camera shared by adjacent clusters has in the respective cluster coordinate systems; an object position detection unit that, for each cluster, analyzes images of an object captured by the cameras and detects the object position in the cluster coordinate system of that cluster; and an object position integration unit that converts the object position detected in the cluster coordinate system by the object position detection unit into the integrated coordinate system and outputs it.

In another camera system according to the present invention, the plurality of clusters are chained in a loop, and the integrated coordinate system is determined so as to minimize the error accumulated over all the common cameras on the loop.

In the camera system according to the present invention, the object position integration unit makes the position and orientation of the common camera in the cluster coordinate systems coincide between the adjacent clusters, collates the object positions detected from each of the adjacent clusters, and associates those that belong to the same object.

In yet another camera system according to the present invention, among the object positions associated with the same object, the object position integration unit replaces a plurality of object positions detected at the same time from the respective adjacent clusters with an internally dividing point of those object positions in the integrated coordinate system, and outputs that point.

According to the present invention, object positions can be detected with high accuracy in a wide space using a large number of cameras.

FIG. 1 is a schematic plan view showing an example of the camera arrangement in a camera system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a schematic configuration of the camera system according to the embodiment of the present invention.
FIG. 3 is a block diagram showing a schematic configuration of the cluster processing unit.
FIG. 4 is a block diagram showing a schematic configuration of the integration processing unit.
FIG. 5 is a schematic diagram showing, in table form, the cluster configuration information for the camera system shown in FIG. 1.
FIG. 6 is a schematic plan view illustrating the accumulation of errors in the camera configuration shown in FIG. 1.
FIG. 7 is a schematic plan view of the camera system for explaining the error matrix.
FIG. 8 is a schematic diagram illustrating the function of the coordinate integration processing unit.
FIG. 9 is a schematic flowchart of the camera calibration of each cluster.
FIG. 10 is a schematic flowchart of the loop calibration.
FIG. 11 is a schematic flowchart of the coordinate integration process performed during the tracking operation.

An embodiment of the present invention (hereinafter, "the embodiment") is described below with reference to the drawings. The embodiment is a camera system in which a large number of cameras are arranged in a monitored space to track moving objects such as persons three-dimensionally over a wide area.

First, the multi-camera tracking used in this system is described. Multi-camera tracking is a technique for obtaining a three-dimensional position from images captured from different viewpoints by a plurality of cameras. Because the same object is observed in multiple images taken from different angles, the position detection accuracy is higher than when only one camera is used. For example, when an object position is detected with a single camera, the detection accuracy in the depth direction as seen from the camera is generally poor. If another camera is also used, the same object can be seen from a different angle, which constrains the depth direction. This constraint becomes stronger as the number of cameras increases, and the object position detection accuracy improves accordingly.
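The depth constraint described above can be illustrated with a minimal two-dimensional sketch. It assumes, purely for illustration, that each camera sits at a known floor-plane position and measures only a bearing angle to the object; the function name `intersect_rays` and this simplified measurement model are assumptions and do not appear in the patent.

```python
import math

def intersect_rays(c1, theta1, c2, theta2):
    """Intersect two 2D viewing rays (hypothetical helper).

    c1, c2: camera positions (x, y) on the floor plane.
    theta1, theta2: bearing angles in radians toward the object.
    A single ray leaves the depth along the ray unknown; the second
    ray fixes it, provided the rays are not parallel.
    """
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    # Solve c1 + s*d1 = c2 + t*d2 for s using the 2D cross product.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth is unconstrained")
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    s = (dx * d2[1] - dy * d2[0]) / denom
    return (c1[0] + s * d1[0], c1[1] + s * d1[1])

# Two cameras observing an object located at (4, 3).
position = intersect_rays((0.0, 0.0), math.atan2(3, 4),
                          (10.0, 0.0), math.atan2(3, -6))
print(position)
```

With more than two cameras, the same idea would typically become a least-squares fit over all rays, which is one way to read the statement that the constraint strengthens with the number of cameras.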

The detection accuracy, however, depends on the accuracy of the external parameters, namely the position and orientation of each camera. Camera calibration must therefore be performed in advance on the cameras that share a common field of view so that the external parameters of each camera are obtained with high accuracy.

Ideally, multi-camera tracking would be applied to all the cameras constituting the camera system so that objects could be tracked in a unified manner over the entire monitored space. In a large space such as a building, however, there may be situations in which a common field of view cannot be secured for the camera system as a whole because of occluding structures such as corridors. In that case, multi-camera tracking, which presupposes a common field of view, cannot be applied to the monitored space in a unified manner.

In this case, the concept of a cluster, which is a set of cameras having a common field of view, is introduced, and the entire monitored space is divided among a plurality of camera clusters. Preferably, three or more clusters are used to monitor a large space. Multi-camera tracking is performed independently for each camera cluster, and the tracking results of the clusters are then integrated to obtain a tracking result for the entire monitored space. Adjacent clusters are provided with a camera common to both, and the integration process uses this camera.

FIG. 1 is a schematic plan view showing an example of a camera arrangement, and the camera system of the embodiment is described using this arrangement. In the arrangement of FIG. 1, six cameras e1 to e6 are placed in the monitored space. For example, an obstacle (not shown) exists in the area surrounded by these cameras, so that for any given camera there is another camera whose field of view it cannot capture because that field of view is hidden behind the obstacle. For example, camera e2 cannot capture the space within the field of view of camera e5.

In the camera system of FIG. 1, three clusters C1 to C3 are set, and camera calibration is performed for each cluster. Performing both calibration and multi-camera tracking on a per-cluster basis allows multi-camera tracking to be carried out with high accuracy. C1 consists of cameras e1 to e3, C2 of cameras e3 to e5, and C3 of cameras e5, e6, and e1. The three cameras of each cluster share a common field of view with one another. For example, in C1 the pairs e1 and e2, e2 and e3, and e3 and e1 each have a common field of view, and these together form a continuous common field of view (intra-cluster common field of view). In FIG. 1, the rectangles indicated by the reference signs C1 to C3 schematically represent the intra-cluster common field of view of each cluster.

In the camera system of FIG. 1, each pair of mutually adjacent clusters (adjacent clusters) is provided with one camera common to both (a common camera), and the adjacent clusters have a region in which their intra-cluster common fields of view overlap. Specifically, cameras e1, e3, and e5 are the common cameras. The camera system of FIG. 1 is thus an example in which three clusters are connected in a loop.

As described above, camera calibration is performed independently for each cluster. Each cluster therefore has its own local coordinate system (cluster coordinate system) describing the three-dimensional world. Coordinate conversion between cluster coordinate systems is performed via the camera coordinate system of the common camera shared by the clusters concerned.

The above is an overview of the tracking process in this system, which uses multi-camera tracking. The details are described below.

FIG. 2 is a block diagram showing a schematic configuration of the camera system according to the embodiment. The system comprises the camera arrangement and an image processing unit 1. The image processing unit 1 includes a cluster processing unit 2 and an integration processing unit 3, and an input device 4 and an output device 5 are connected to the image processing unit 1.

The camera configuration shown in FIG. 2 is the same as that shown in FIG. 1: it consists of six cameras e1 to e6, for which three clusters C1 to C3 are set. As already mentioned, no individual camera can capture the entire monitored space, but the cameras are installed so that their fields of view together cover the entire monitored space and so that multi-camera tracking is possible. During the monitoring operation, in which moving objects are tracked, each camera captures its field of view at predetermined time intervals. The cameras are used not only during the monitoring operation but also for the camera calibration performed beforehand. The monitoring images captured by the cameras are output sequentially to the cluster processing unit 2 corresponding to the cluster to which each camera belongs. Because the aim is to grasp the position and movement of persons, who move mainly along a reference plane such as a floor or the ground, the cameras are basically installed at a height from which persons can be captured from above.

The cluster processing unit 2 and the integration processing unit 3 are configured using an arithmetic device such as a CPU (Central Processing Unit), DSP (Digital Signal Processor), or MCU (Micro Control Unit), and a storage device such as a ROM (Read Only Memory), RAM (Random Access Memory), or hard disk. The storage device stores the programs and data used by the arithmetic device. The arithmetic device reads the programs from the storage device and executes them, performing the moving object tracking process during the monitoring operation and the calibration-related processing when, for example, the camera system is installed.

The input device 4 is a user interface device such as a keyboard, mouse, or touch panel display, and is operated by the user to enter various settings for the image processing unit 1. The input device 4 is also used by an administrator or similar person to enter settings for the image processing unit 1 when the camera system is installed or maintained. In particular, during calibration of the camera external parameters, the input device 4 can be used as a means of designating feature points on the image of each installed camera.

The output device 5 includes display means for displaying the images captured by the cameras and sound output means for outputting voice messages, warning sounds, and the like that notify the user of an abnormality.

FIG. 3 is a block diagram showing a schematic configuration of the cluster processing unit 2, which is described further with reference to this figure. A cluster processing unit 2 is provided for each of the clusters C1 to C3 and performs the processing for its camera cluster. The cluster processing unit 2 includes a calibration execution unit 20, a three-dimensional tracking unit 21, and a storage unit 22.

The calibration execution unit 20 performs camera calibration using, among other things, information obtained from the images captured by the cameras constituting the cluster. For example, a reference object placed in the common field of view is captured by each camera, and the position and orientation of each camera in the cluster coordinate system are calculated from the coordinates of feature points on the reference object as they appear in each image. To specify the feature points, the operator performing the calibration identifies them in the image displayed on the output device 5 and enters their coordinates through the input device 4. Alternatively, the image processing unit 1 may be configured to extract the feature points automatically by image recognition. The calibration results calculated by the calibration execution unit 20, such as the external parameters of each camera, are stored in the storage unit 22.

The three-dimensional tracking unit 21 (object position detection unit) performs multi-camera tracking for each cluster using the images input from the cameras belonging to that cluster, and detects the person position (object position) at each time. It uses the camera calibration information stored in the storage unit 22 for this purpose. For example, the three-dimensional tracking unit 21 uses a background image stored in advance and extracts, by background subtraction from the input image of each camera, the image region showing a person, that is, the moving object to be tracked. In parallel, a three-dimensional person model is placed in a virtual three-dimensional space and projected onto the image of each camera based on the camera calibration information to obtain a projection region. The degree of coincidence between the projection region of the person model and the extracted region of the person image is then calculated, and the position of the person model on the floor at which the sum of the degrees of coincidence over the cameras is largest is searched for and taken as the person position. The person position detected by the three-dimensional tracking unit 21 is a coordinate value in the respective cluster coordinate system. The person position is output to the integration processing unit 3 together with the person's identification number and the cluster number of the cluster processing unit 2.
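The search described above, which picks the floor position whose projected person model best matches the extracted foreground in all cameras, can be sketched as follows. This is a toy illustration under strong assumptions: regions are represented as sets of pixel indices, the degree of coincidence is taken to be intersection-over-union (the patent does not specify the measure), and the two "cameras" `cam_a` and `cam_b` are hypothetical one-dimensional projections rather than calibrated camera models.

```python
def iou(region_a, region_b):
    """Intersection-over-union of two pixel sets (assumed coincidence measure)."""
    union = region_a | region_b
    return len(region_a & region_b) / len(union) if union else 0.0

def locate_person(candidates, project_fns, foregrounds):
    """Return the candidate floor position with the largest summed coincidence.

    candidates:  iterable of (x, y) floor positions to test.
    project_fns: one function per camera mapping a floor position to the
                 pixel set covered by the projected person model.
    foregrounds: one pixel set per camera from background subtraction.
    """
    def total_score(pos):
        return sum(iou(project(pos), fg)
                   for project, fg in zip(project_fns, foregrounds))
    return max(candidates, key=total_score)

# Hypothetical toy cameras: each sees only one floor coordinate.
def cam_a(pos):
    # Looks along the x axis, so the image column depends on y only.
    return {pos[1] - 1, pos[1], pos[1] + 1}

def cam_b(pos):
    # Looks along the y axis, so the image column depends on x only.
    return {pos[0] - 1, pos[0], pos[0] + 1}

grid = [(x, y) for x in range(10) for y in range(10)]
observed = [cam_a((3, 5)), cam_b((3, 5))]  # foregrounds for a person at (3, 5)
print(locate_person(grid, [cam_a, cam_b], observed))
```

In this toy setup each camera alone leaves one floor coordinate undetermined, and only the sum over both cameras yields a unique maximum, which mirrors the role of the summed degree of coincidence in the text.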

The storage unit 22 stores the cluster number and the person positions calculated by the three-dimensional tracking unit 21. The storage unit 22 also serves as the cluster calibration information storage unit and stores the camera calibration information, such as the external parameters, calculated by the calibration execution unit 20.

Next, the integration processing unit 3 is described. FIG. 4 is a block diagram showing a schematic configuration of the integration processing unit 3. The integration processing unit 3 includes a coordinate conversion unit 30, a transformation matrix generation unit 31, an optimization processing unit 32, a coordinate integration processing unit 33, and a storage unit 34.

Whereas the cluster processing unit 2 performs per-cluster processing, the integration processing unit 3 performs processing that handles the clusters together. Specifically, when the camera system is installed, for example, the integration processing unit 3 performs loop calibration over all clusters based on the camera calibration information of each cluster stored in the storage unit 22 of the cluster processing unit 2. During the monitoring operation, it performs an object position integration process that integrates the person positions input from the cluster processing units 2. This object position integration process includes a process of associating the same person across clusters and a process of integrating the person positions obtained in the cluster coordinate system of each cluster by converting them into a common coordinate system (integrated coordinate system). The tracking result, including the integrated person positions, can be output to the output device 5.

The storage unit 34 stores cluster configuration information 340, transformation matrices 341, error matrices 342, and the like.

The cluster configuration information 340 describes the relationship between cameras and clusters and the relationship between clusters. FIG. 5 is a schematic diagram showing, in table form, the cluster configuration information 340 for the camera system shown in FIG. 1. Cameras and clusters are identified by camera numbers and cluster numbers, respectively, and each pair of mutually adjacent clusters (adjacent cluster pair) is identified by an adjacency number. The cluster configuration information 340 holds the correspondence between each cluster and its cameras together with the number of cameras constituting the cluster (FIG. 5(a)), and the correspondence between the clusters constituting each adjacent cluster pair and their common camera (FIG. 5(b)). The cluster configuration information 340 is stored in the storage unit 34 in advance.
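As a rough illustration of how such cluster configuration information might be held in memory, the sketch below encodes the FIG. 1 example (clusters C1 to C3, cameras e1 to e6) and derives the adjacent cluster pairs and their common cameras. The dictionary layout and the function name `adjacent_cluster_pairs` are assumptions for illustration; the patent only says that the information is stored in advance in tabular form.

```python
# Cluster -> member cameras, following the FIG. 1 example in the text.
CLUSTER_CAMERAS = {
    "C1": ["e1", "e2", "e3"],
    "C2": ["e3", "e4", "e5"],
    "C3": ["e5", "e6", "e1"],
}

def adjacent_cluster_pairs(cluster_cameras):
    """Map each pair of clusters sharing a camera to its common cameras.

    Returns a dict {(cluster_a, cluster_b): [common cameras]} that plays
    the role of the adjacency table (hypothetical layout).
    """
    names = sorted(cluster_cameras)
    pairs = {}
    for index, first in enumerate(names):
        for second in names[index + 1:]:
            common = sorted(set(cluster_cameras[first]) &
                            set(cluster_cameras[second]))
            if common:
                pairs[(first, second)] = common
    return pairs

# Camera count per cluster, as in the first table.
camera_counts = {name: len(cams) for name, cams in CLUSTER_CAMERAS.items()}
print(camera_counts)
print(adjacent_cluster_pairs(CLUSTER_CAMERAS))
```

Because every cluster in this example shares a camera with both of the others, the derived adjacency forms the three-cluster loop described for FIG. 1.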

The transformation matrices 341 comprise a matrix T for coordinate conversion between clusters and a matrix S for coordinate conversion between common cameras in the same cluster. That is, a matrix T representing the rotation and translation from the cluster coordinate system of one cluster of an adjacent cluster pair to the cluster coordinate system of the other cluster of that pair is stored for each adjacent cluster pair, and a matrix S representing the rotation and translation from the camera coordinate system of any common camera in a cluster to the camera coordinate system of another common camera in that cluster is stored for each cluster. These transformation matrices 341 are generated by the transformation matrix generation unit 31 from the camera calibration information of each cluster processing unit 2.

The matrix T of the transformation matrices 341 and the coordinate conversion that uses it are now described further.

The coordinate conversion realized by the matrix T corresponds to aligning the cluster coordinate system of one cluster of an adjacent cluster pair with the cluster coordinate system of the other cluster of that pair so that the position and orientation of the common camera shared by the pair coincide, thereby bringing both into a common coordinate system. Hereinafter, alignment that makes the position and orientation of the common camera coincide between adjacent clusters is called "local alignment".

The relationship between the cluster coordinate system of each cluster and the camera coordinate system of a common camera is given by the calibration information of that cluster. That is, once the position and orientation of the common camera in the cluster coordinate system are determined, the conversion from that cluster coordinate system to the camera coordinate system of the common camera is uniquely determined. The conversion from coordinates $(X, Y, Z)$ in the cluster coordinate system to coordinates $(x, y, z)$ in the camera coordinate system can be formulated as $Q = W P$, using a homogeneous transformation matrix $W$ that represents the rotation and translation between the two coordinate systems. Here $Q = (x, y, z, 1)^{\mathsf{T}}$, $P = (X, Y, Z, 1)^{\mathsf{T}}$, and $W$ is a $4 \times 4$ matrix.
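A minimal sketch of the relation Q = W·P follows. It assumes the camera pose is given as a rotation `R_cam` whose columns are the camera axes expressed in cluster coordinates plus a camera position `cam_pos`, in which case W has the block form [Rᵀ | −Rᵀc]. The function names and this pose convention are illustrative assumptions; the patent only states that W is a 4×4 homogeneous matrix.

```python
import math

def rot_z(yaw):
    """3x3 rotation about the z axis by `yaw` radians."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def cluster_to_camera(R_cam, cam_pos):
    """Build the 4x4 matrix W mapping cluster coordinates to camera coordinates.

    Assumed convention: R_cam rotates camera axes into the cluster frame and
    cam_pos is the camera position in cluster coordinates.
    """
    Rt = [[R_cam[j][i] for j in range(3)] for i in range(3)]  # transpose of R_cam
    t = [-sum(Rt[i][k] * cam_pos[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def apply(W, P):
    """Compute Q = W * P for a homogeneous 4-vector P."""
    return [sum(W[i][k] * P[k] for k in range(4)) for i in range(4)]

# Camera at (1, 2, 0) in the cluster frame, rotated 90 degrees about z.
W = cluster_to_camera(rot_z(math.pi / 2), [1.0, 2.0, 0.0])
# A point one unit along the camera's x axis, expressed in cluster coordinates.
print(apply(W, [1.0, 3.0, 0.0, 1.0]))
```

The camera position itself maps to the origin of the camera coordinate system, which is a quick sanity check on any W built this way.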

When the coordinate systems are aligned so that the position and orientation of the common camera coincide, the coordinates in the camera coordinate system of the common camera are identical for the adjacent clusters. Therefore, for cluster $i$ and cluster $j$ sharing common camera $m$, the conversion from the cluster coordinate system $P_i$ of cluster $i$ to the cluster coordinate system $P_j$ of cluster $j$ and the transformation matrix $T_{ij}$ are given by equations (1) and (2), respectively, using the homogeneous transformation matrix $W_{im}$ representing the conversion from $P_i$ to the camera coordinate system $Q_m$ of common camera $m$ and the homogeneous transformation matrix $W_{jm}$ representing the conversion from $P_j$ to $Q_m$.
$$P_j = T_{ij}\,P_i \tag{1}$$
$$T_{ij} = W_{jm}^{-1}\,W_{im} \tag{2}$$
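Equation (2) can be exercised with a short pure-Python sketch. The helper names `matmul`, `rigid_inverse`, and `local_alignment` are hypothetical, and the example matrices are made-up rigid transforms rather than calibration results. The inverse uses the closed form for a rigid transform [R | t], namely [Rᵀ | −Rᵀt].

```python
def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rigid_inverse(W):
    """Inverse of a 4x4 rigid transform [R | t], i.e. [R^T | -R^T t]."""
    Rt = [[W[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(Rt[i][k] * W[k][3] for k in range(3)) for i in range(3)]
    return [Rt[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def local_alignment(W_im, W_jm):
    """T_ij = W_jm^-1 * W_im (equation (2)): cluster i coords -> cluster j coords."""
    return matmul(rigid_inverse(W_jm), W_im)

def apply(M, P):
    """Apply a 4x4 matrix to a homogeneous 4-vector."""
    return [sum(M[i][k] * P[k] for k in range(4)) for i in range(4)]

# Made-up example: cluster i sees camera m through a pure translation,
# cluster j through a 90-degree rotation about z plus a translation.
W_im = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
W_jm = [[0.0, -1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 2.0],
        [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
T_ij = local_alignment(W_im, W_jm)
P_i = [1.0, 2.0, 3.0, 1.0]
P_j = apply(T_ij, P_i)
print(P_j)
```

A useful consistency check is that the point has the same camera coordinates whichever cluster it is expressed in, that is, W_jm·P_j equals W_im·P_i.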

Another coordinate conversion, which uses the matrices T and S of the transformation matrices 341, is described next.

By chaining the local alignments described above one after another, the cluster coordinate systems of clusters that are not directly adjacent can also be aligned with each other. Such alignment is referred to as "chain alignment". Chain alignment, however, requires coordinate transformation between the common cameras within the same cluster in addition to the coordinate transformation between adjacent clusters. The transformation between the camera coordinate systems of these common cameras is uniquely determined from the calibration information. That is, the transformation from the camera coordinate system Q_m of common camera m to the camera coordinate system Q_n of common camera n is formulated as Q_n = V_mn · Q_m, using a homogeneous transformation matrix V_mn that represents the rotation and translation between these coordinate systems. If cluster i is adjacent to cluster j on the common camera m side and cluster k is adjacent to cluster j on the common camera n side, the transformation from the cluster coordinate system P_i of cluster i to the cluster coordinate system P_k of cluster k and its transformation matrix S_ik are given by equations (3) and (4), respectively.
P_k = S_ik · P_i   (3)
S_ik = W_kn^{-1} · V_mn · W_im   (4)
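The following sketch illustrates equation (4) under the same assumptions as before: rigid transforms as 4 × 4 nested lists, with hypothetical helper names and poses. Here V_mn is computed as W_jn · W_jm^{-1}, which follows from Q_m = W_jm · P_j and Q_n = W_jn · P_j; the symbol W_jn (cluster j to common camera n) is introduced only for this illustration.

```python
import math

def rigid(yaw, tx, ty, tz):
    """4x4 homogeneous transform: rotation by yaw [rad] about Z plus a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, tx], [s, c, 0.0, ty], [0.0, 0.0, 1.0, tz], [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def invert_rigid(m):
    """Inverse of [R t; 0 1] is [R^T, -R^T t; 0 1] (valid for rigid transforms only)."""
    rt = [[m[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(rt[i][k] * m[k][3] for k in range(3)) for i in range(3)]
    return [rt[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

# Hypothetical calibration results (cluster coordinates -> common camera coordinates).
W_im = rigid(0.3, 1.0, 2.0, 0.0)    # cluster i -> camera m
W_jm = rigid(-0.4, -3.0, 0.5, 0.0)  # cluster j -> camera m
W_jn = rigid(0.8, 4.0, -1.0, 0.0)   # cluster j -> camera n
W_kn = rigid(0.1, 0.5, 6.0, 0.0)    # cluster k -> camera n

# Camera m -> camera n inside cluster j, derived from the calibration of cluster j.
V_mn = matmul(W_jn, invert_rigid(W_jm))

# Equation (4): chain alignment from cluster i to cluster k.
S_ik = matmul(invert_rigid(W_kn), matmul(V_mn, W_im))

# The same matrix obtained by chaining two local alignments (equation (2) twice).
T_ij = matmul(invert_rigid(W_jm), W_im)
T_jk = matmul(invert_rigid(W_kn), W_jn)
S_chained = matmul(T_jk, T_ij)
```

In this sketch `S_ik` and `S_chained` coincide up to rounding, which is one way to read "chaining" the local alignments.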

Equations (3) and (4) give the transformation to a cluster two positions away, but the transformation to a cluster three or more positions away can likewise be calculated by concatenating products of the matrices V and W as appropriate.

The calibration error is now described. The calibration result of each cluster contains its own quantization error, measurement error, and the like. Consequently, when clusters are aligned by making the position and orientation of the common camera coincide, errors are superimposed on the relative relationship of the aligned cluster coordinate systems. FIG. 6 schematically shows the relative relationship of the cluster coordinate systems when cluster C2 and then cluster C3 are chain-aligned to cluster C1. Between the camera coordinate system Q_1 of cluster C1 and the camera coordinate systems Q_4 and Q_5 of cluster C2, an error arises relative to the relationship among the cameras in real space, and an error likewise arises between Q_2 and Q_4, Q_5. As a result, person position information exchanged between clusters deviates from the position in real space. Errors between clusters propagate through alignment and can accumulate and grow with distance along the chain. For example, a larger error can arise between Q_1 and Q_1' than between Q_1 and Q_5. Such error propagation enlarges the deviation and becomes apparent as a mismatch in person position at the end of the camera loop. In other words, it becomes difficult to keep person positions consistent across the entire camera system. In the example of FIG. 6, therefore, a person position detected in cluster C3 and a person position detected in cluster C1 may fail to be identified as the same person, or several nearby persons may be mistaken for one another. The same problem arises when local alignment is repeated; in that case the pattern in which errors are superimposed also changes with the path the person takes, which makes consistency of person positions across the entire camera system even harder to achieve.

The error matrix 342 is provided to deal with camera calibration error and is a characteristic part of the present invention. The error matrix 342 is information that defines the error arising between the cluster coordinate systems when each cluster coordinate system is aligned at the position and orientation of the common cameras while the arrangement of camera positions and orientations within that cluster coordinate system is preserved. Unlike the local alignment described above, this alignment covers all clusters and is therefore referred to as "overall alignment". The cluster coordinate systems of all clusters subjected to overall alignment are transformed into one common coordinate system. This coordinate system, common to all clusters and generated by the overall alignment, is referred to as the "integrated coordinate system". In the present embodiment, clusters C2 and C3 are aligned to cluster C1 in the overall alignment, so the XYZ axes and origin of the integrated coordinate system coincide with those of the cluster coordinate system of cluster C1. The storage unit 34 functions as an integrated coordinate system storage unit that stores the error matrix 342 between the clusters, the transformation matrix 341 described above, and the fact that the reference of the integrated coordinate system is the cluster coordinate system of cluster C1.

The problem caused by camera calibration error has already been described with reference to FIG. 6, including the fact that, particularly in a loop-shaped cluster arrangement, the misalignment between the end points of the loop becomes apparent. Note that the more often coordinate transformation between clusters is repeated, the more the error components contained in the transformation matrix 341 accumulate and the larger the total error becomes.

FIG. 7 is a schematic plan view of the camera system for explaining the error matrix 342. As shown in the figure, in the integrated coordinate system into which the error matrix 342 is introduced, each common camera is given separate positions and orientations in the two adjacent clusters that share it. That is, in FIG. 7, e3 in C1 and e3' in C2, e5 in C2 and e5' in C3, and e1 in C1 and e1' in C3 are physically the same cameras e3, e5, and e1, respectively.

The error matrix 342 is defined between the positions set separately in each cluster for the same common camera. Like the transformation matrix 341, it can be defined as a homogeneous transformation matrix E consisting of a rotation component and a translation component in the camera coordinate system, and it represents both an error and a coordinate transformation. Specifically, the transformation matrix E_1 from camera e3' to e3 and the transformation matrix E_2 from camera e5' to e5 are stored as the error matrix 342. The transformation matrix between cameras e1 and e1' is uniquely determined as a dependent quantity once E_1 and E_2 are fixed, so it does not need to be defined separately.

The advantage of defining the error matrix 342 is that, because the error is assigned to the junctions between clusters, the calibration error can be distributed and absorbed across the entire camera system without modifying the transformation matrix 341, that is, while the camera calibration information within each cluster is preserved. The object positions from the individual clusters can thus be integrated without degrading the object position detection accuracy within each cluster, which enables accurate tracking as a whole.

As an example of coordinate transformation using the error matrix 342 and the transformation matrix 341, the transformation from the cluster coordinate system X_3 Y_3 Z_3 of cluster C3 to the integrated coordinate system X_1 Y_1 Z_1 is described. From the camera calibration information, the camera coordinates Q_5' of camera e5' are obtained from the cluster coordinates P_3 of cluster C3. Next, the transformation from the camera coordinates Q_5' to the camera coordinates Q_5 of camera e5 of cluster C2 is
Q_5 = E_2 · Q_5'   (5)
The transformation from the camera coordinates Q_5 to the camera coordinates Q_3' is
Q_3' = T_53 · Q_5   (6)
Further, the transformation from the camera coordinates Q_3' to the camera coordinates Q_3 is
Q_3 = E_1 · Q_3'   (7)
Finally, the transformation from the camera coordinates Q_3 to the integrated coordinate system P_1 is obtained from the camera calibration information. In this example the cluster coordinates of cluster C3 were transformed to the integrated coordinate system P_1; the transformation from the camera coordinate system Q_1' of camera e1' to the integrated coordinate system P_1 is carried out in the same way.
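The chain of equations (5) to (7) can be sketched as follows. This is an illustration only: the helper functions, the matrix names `W_C3_e5` and `W_C1_e3` (cluster coordinates to camera coordinates), and all numeric values are assumptions, and the error matrices would in practice come from the optimization described later.

```python
import math

def rigid(yaw, tx, ty, tz):
    """4x4 homogeneous transform: rotation by yaw [rad] about Z plus a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, tx], [s, c, 0.0, ty], [0.0, 0.0, 1.0, tz], [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def invert_rigid(m):
    """Inverse of [R t; 0 1] is [R^T, -R^T t; 0 1] (valid for rigid transforms only)."""
    rt = [[m[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(rt[i][k] * m[k][3] for k in range(3)) for i in range(3)]
    return [rt[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def apply(m, p):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))

# Hypothetical values standing in for calibration and optimization results.
W_C3_e5 = rigid(0.2, 1.0, 0.0, 0.0)   # cluster C3 coordinates -> camera e5'
E_2 = rigid(0.01, 0.02, -0.01, 0.0)   # error matrix: e5' -> e5
T_53 = rigid(-0.7, 2.0, 1.0, 0.0)     # camera e5 -> camera e3' (T_53 in equation (6))
E_1 = rigid(-0.02, 0.0, 0.03, 0.0)    # error matrix: e3' -> e3
W_C1_e3 = rigid(0.5, -1.0, 3.0, 0.0)  # cluster C1 (integrated) coordinates -> camera e3

def c3_to_integrated(p3):
    """Transform a point from cluster C3 coordinates to the integrated coordinate system."""
    q5_prime = apply(W_C3_e5, p3)
    q5 = apply(E_2, q5_prime)         # equation (5)
    q3_prime = apply(T_53, q5)        # equation (6)
    q3 = apply(E_1, q3_prime)         # equation (7)
    return apply(invert_rigid(W_C1_e3), q3)

# The same chain folded into a single matrix (factors applied right to left).
M = invert_rigid(W_C1_e3)
for step in (E_1, T_53, E_2, W_C3_e5):
    M = matmul(M, step)
```

Folding the chain into one matrix `M` is merely a convenience when many points are transformed; the step-by-step function mirrors the text more directly.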

The error matrix 342 is obtained by the optimization processing unit 32 described later.

Using the camera calibration information of each cluster, the transformation matrix 341, and the error matrix 342, the coordinate conversion unit 30 performs coordinate transformation between adjacent clusters according to equation (1), coordinate transformation between arbitrary clusters according to equation (3) and the like, and transformation from a cluster coordinate system to the integrated coordinate system according to equations (5) to (7) and the like.

The transformation matrix generation unit 31 generates the transformation matrix 341 from the camera calibration information of each cluster. As described above, the transformation matrix 341 is calculated as follows: from the position and orientation of the common camera in each cluster coordinate system, the rotation and translation from that cluster coordinate system to the camera coordinate system of the common camera are obtained, the matrices V and W are computed, and equations (2) and (4) are applied to them.

The optimization processing unit 32 generates the integrated coordinate system by determining the error matrix 342 so that the error between the clusters is minimized over the entire camera system. For example, in the camera system shown in FIG. 7, the sum of the errors between clusters C1 and C2, between clusters C2 and C3, and between clusters C3 and C1 is minimized. Specifically, let θ_m1 and θ_m2 be the rotation angles of the rotation components of the error matrices E_1 and E_2, let d_m1 and d_m2 be the translation amounts of their translation components, and let θ_e and d_e be the rotation angle and translation amount of the transformation from camera e1 of cluster C1 to camera e1' of cluster C3. Then, for example, the accumulated error U_SUM defined by the following equation is minimized.
U_SUM = U_θ + α × U_d   (8)

Here, U_θ and U_d are the sum of the rotation errors and the sum of the translation errors, respectively, defined by the following equations. α is a weighting coefficient; for example, it can be set to compensate for the difference in units between U_θ and U_d so that the rotation angle error and the translation error are evaluated equally.
U_θ = |θ_m1| + |θ_m2| + |θ_e|   (9)
U_d = d_m1 + d_m2 + d_e   (10)
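A minimal sketch of evaluating equations (8) to (10) is given below. It assumes that each error is available as a 4 × 4 rigid transform, that the rotation angle is recovered from the trace of the rotation block, and that the translation amount is the Euclidean norm of the translation component; these choices, the function names, and the numeric values are illustrative assumptions.

```python
import math

def rigid(yaw, tx, ty, tz):
    """4x4 homogeneous transform: rotation by yaw [rad] about Z plus a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, tx], [s, c, 0.0, ty], [0.0, 0.0, 1.0, tz], [0.0, 0.0, 0.0, 1.0]]

def rotation_angle(m):
    """Rotation angle [rad] of the 3x3 rotation block, using trace(R) = 1 + 2 cos(theta)."""
    c = (m[0][0] + m[1][1] + m[2][2] - 1.0) / 2.0
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding

def translation_amount(m):
    """Euclidean norm of the translation component (assumed meaning of the amount d)."""
    return math.sqrt(m[0][3] ** 2 + m[1][3] ** 2 + m[2][3] ** 2)

def accumulated_error(matrices, alpha):
    u_theta = sum(abs(rotation_angle(m)) for m in matrices)  # equation (9)
    u_d = sum(translation_amount(m) for m in matrices)       # equation (10)
    return u_theta + alpha * u_d                             # equation (8)

# Hypothetical error transforms: E_1, E_2 and the dependent e1 -> e1' transform.
E_1 = rigid(0.1, 3.0, 4.0, 0.0)
E_2 = rigid(-0.2, 0.0, 0.0, 2.0)
E_e = rigid(0.05, 1.0, 0.0, 0.0)

U_SUM = accumulated_error([E_1, E_2, E_e], alpha=0.5)
```

With these values U_θ is 0.35 rad and U_d is 8.0, so the weight α = 0.5 gives U_SUM = 4.35.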

θ_e and d_e are obtained as follows. The origin and the coordinate axis vectors of the camera coordinate system of camera e1' are transformed into the integrated coordinate system by the coordinate conversion unit 30. The coordinate conversion unit 30 sets cluster C1 as the start cluster and cluster C3 as the end cluster, generates the transformation matrix S_31 by chain alignment from the start cluster to the end cluster, and performs the transformation into the integrated coordinate system using S_31. Meanwhile, the origin and the coordinate axis vectors of the camera coordinate system of camera e1 are expressed in the reference coordinate system. The offset between these two origin coordinates is taken as the translation amount d_e, and the rotation angle between the coordinate axis vectors is taken as θ_e.

As already described, the transformation between cameras e1 and e1' is uniquely determined as a dependent quantity once E_1 and E_2 are fixed. That is, θ_e and d_e are not independent parameters but parameters calculated from E_1 and E_2. U_SUM is therefore minimized by optimizing E_1 and E_2. The optimization can be performed with a commonly used method such as steepest descent or a Markov chain Monte Carlo (MCMC) method, and the error matrix 342 is determined by this optimization.

The coordinate integration processing unit 33 functions as an object position integration unit. This function includes an inter-cluster tracking inheritance function, which collates the person positions input from the cluster processing units 2 between adjacent clusters and associates positions belonging to the same person, and a tracking integration function, which integrates the person positions of the individual clusters into the integrated coordinate system and outputs them.

First, the inter-cluster tracking inheritance function is described. Using the coordinate conversion unit 30, this function locally aligns the person positions detected at the same time in each of the adjacent clusters. Aligned person positions that are closer to each other than a preset identification threshold are judged to belong to the same person, and the same object is thereby associated between clusters.

FIG. 8 is a schematic diagram explaining the functions of the coordinate integration processing unit 33, and the inter-cluster tracking inheritance function and the tracking integration function are described concretely with reference to it. Consider the association of one person between cluster C1 and cluster C2. FIG. 8(a) is a schematic plan view showing the arrangement of the two clusters under local alignment. FIG. 8(b) schematically shows the arrangement of the two clusters under overall alignment; because the error is included, the position of the common camera e3 is shown displaced between clusters C1 and C2.

FIG. 8(c) shows the position of one person detected in cluster C1 with a triangle (▲), and FIG. 8(d) shows the position of the same person detected in cluster C2 with a square (■). These person positions in C1 and C2 are assumed to have been obtained at the same time within the field of view of the common camera e3.

FIG. 8(e) shows FIGS. 8(c) and 8(d) after local alignment, with the clusters arranged as in FIG. 8(a). FIG. 8(f) shows them after overall alignment, with the clusters arranged as in FIG. 8(b). As a comparison of these two figures shows, when the error component is included (FIG. 8(f)), the positions of the triangle (▲) and the square (■) obtained in the respective clusters lie farther apart, owing to the error component, than when it is not included (FIG. 8(e)).

The error matrix 342 serves to achieve consistency over the camera system as a whole, so its error component need not be taken into account in the local process of associating detection results between clusters. Moreover, when many people are present, using detection positions that have been pushed apart by taking this error component into account may lead to incorrect associations.

In the inter-cluster tracking inheritance function, therefore, as shown in FIG. 8(e), person positions are aligned and identified by local alignment, in which the cluster coordinate systems of the adjacent clusters sharing a common camera are aligned so that the external parameters of that common camera coincide. That is, the person positions in each cluster tracked within the field of view of the common camera are associated in the locally aligned coordinate system, and the same person is determined between the clusters.

On the other hand, alignment that makes the common camera coincide suffers from the calibration error problem. In the tracking integration function, therefore, the coordinate integration processing unit 33 takes the per-cluster person positions judged by the inter-cluster tracking inheritance function to belong to the same person, calculates a single integrated object position from their positions in the integrated coordinate system, and uses it as the tracking result at the junction between the clusters. The integrated object position must be consistent with the positional relationship of all clusters. The coordinate integration processing unit 33 therefore calculates the integrated object position from the person positions in the clusters arranged so that the accumulated error has been minimized by the optimization processing unit 32. In other words, the new coordinates of the person after association are calculated in an arrangement that takes the error component between clusters into account. In the example of FIG. 8, the new coordinates of the person are thus calculated in the state of FIG. 8(f). As one way of calculating these coordinates, the mean position of the two associated points (the position marked "×" in FIG. 8(f)) can be used as the coordinates after association.

As already noted, association is performed only for persons located in the region where clusters overlap. To handle the case where several persons are present in the overlapping region, the correspondence is obtained by combinatorial optimization, with the cost set so that a smaller distance between coordinates makes association more likely.

Next, the operation of the camera system of the present embodiment during camera calibration and its moving-object tracking operation are described. As already mentioned, the image processing unit 1 performs calibration within each cluster and loop calibration between clusters, for example when the camera system is installed, and uses the results to track moving objects with images from multiple cameras.

[Camera calibration for each cluster]
FIG. 9 is a schematic flowchart of the camera calibration of each cluster. Images showing a reference object are input to the cluster processing unit 2 from each camera constituting the cluster (step S50). The positions of feature points in the input images are identified (step S51). The feature point positions may be specified by an operator who recognizes the feature points in the images displayed on the output device 5 and designates them through the input device 4, or the cluster processing unit 2 may be configured to extract them automatically. The calibration execution unit 20 performs camera calibration based on the positions of the feature points in the images, and the calibration results, such as the calculated external parameters of each camera, are stored in the storage unit 22 (step S52).

[Loop calibration]
FIG. 10 is a schematic flowchart of loop calibration. The integration processing unit 3 reads the camera calibration information of each cluster stored in the storage unit 22 of the cluster processing unit 2 (step S60) and reads the cluster configuration information 340 stored in the storage unit 34 of the integration processing unit 3 (step S61). The transformation matrix generation unit 31 then obtains the transformation matrix 341 from the camera calibration information that was read (step S62), and the optimization processing unit 32 performs the optimization to obtain the error matrix 342 (step S63).

[Tracking operation]
FIG. 11 is a schematic flowchart of the coordinate integration processing performed during the tracking operation. Each cluster processing unit 2 receives images from the cameras of the cluster it is responsible for (step S70), executes tracking within that cluster using the three-dimensional tracking unit 21 to detect person positions in the cluster coordinate system, and outputs the detected person positions to the integration processing unit 3 (step S71). The integration processing unit 3 applies overall alignment to each input person position using the coordinate conversion unit 30, converting it to a value in the integrated coordinate system, and stores each converted person position in the storage unit 34 (step S72). Among the person positions in the integrated coordinate system calculated here, those of the same person detected separately in adjacent clusters are merged into one by the subsequent processing.

The integration processing unit 3 associates the same person between adjacent clusters in which person positions have been detected (steps S73 to S82). Specifically, while incrementing the adjacency number I, which designates each adjacent cluster pair, by one starting from 1 (steps S73, S82), it reads the information on the corresponding adjacent clusters from the cluster configuration information 340 and identifies the two clusters i and j that constitute adjacent cluster pair I (step S74). If a person position from cluster i and a person position from cluster j have both been input (YES in step S75), association is performed between these clusters. If, on the other hand, no person position has been input from cluster i or none has been input from cluster j (NO in step S75), adjacent cluster pair I is regarded as having nothing to associate, and processing proceeds to the next adjacent cluster pair.

When there is something to associate, the integration processing unit 3 reads the matrix T_ij corresponding to the combination of cluster i and cluster j from the transformation matrix 341 and performs local alignment by having the coordinate conversion unit 30 transform the person positions detected in cluster i into the cluster coordinate system of cluster j (step S76). As described above, this is done so that association is carried out with the error component excluded. Note that the values processed in step S76 are not the integrated coordinate system values calculated in step S72 but the values in each cluster coordinate system. Using the coordinate integration processing unit 33, the integration processing unit 3 associates the locally aligned person positions of cluster i with the person positions of cluster j by combinatorial optimization (step S77). That is, all pairwise combinations of person positions between the adjacent clusters are formed and the distance between the person positions is calculated for each combination; for each person position of one cluster, only the combination with the smallest distance is selected; and among the selected combinations, those whose distance is at or below the preset identification threshold are associated.
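The association rule of step S77 can be sketched as below, assuming that the person positions of cluster i have already been transformed into the cluster coordinate system of cluster j. The function name `associate` and the sample coordinates are hypothetical. As in the description, each position of one cluster simply takes its nearest candidate, so the sketch does not prevent two positions from selecting the same partner.

```python
import math

def associate(positions_i, positions_j, threshold):
    """Pair each position of cluster i with its nearest position of cluster j.

    Both lists hold 3D points in the same (locally aligned) coordinate system.
    A pair is kept only if its distance is at or below the identification threshold.
    Returns a list of (index_in_i, index_in_j) tuples.
    """
    if not positions_j:
        return []
    pairs = []
    for a, p in enumerate(positions_i):
        # Nearest candidate among all pairwise combinations for this position.
        b = min(range(len(positions_j)), key=lambda k: math.dist(p, positions_j[k]))
        if math.dist(p, positions_j[b]) <= threshold:
            pairs.append((a, b))
    return pairs

# Hypothetical person positions after local alignment (metres).
pos_i = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
pos_j = [(5.2, 0.0, 0.0), (0.1, 0.0, 0.0), (20.0, 0.0, 0.0)]

pairs = associate(pos_i, pos_j, threshold=0.5)
```

In this example the third position of cluster i has no partner within the threshold and stays unassociated.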

If there is a combination of person positions associated by local alignment between the adjacent clusters (YES in step S78), the coordinate integration processing unit 33 reads the same combination, for each of those adjacent clusters, from among the integrated coordinate system person positions written to the storage unit 34 in step S72. It then calculates the mean of the integrated coordinate system values read for the clusters constituting the adjacent cluster pair as the integrated object position (step S79) and replaces the person position read for each adjacent cluster with that integrated object position (step S80). If there is no associated person position (NO in step S78), steps S79 and S80 are skipped.
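Steps S79 and S80 can be sketched as follows, assuming the integrated coordinate system positions of each cluster are kept in lists indexed in the same way as the positions used for association. The function name `merge_associated` and the sample data are hypothetical.

```python
def merge_associated(integrated_i, integrated_j, pairs):
    """Replace each associated pair of positions with their mean (steps S79, S80).

    integrated_i / integrated_j: lists of 3D points in the integrated coordinate
    system for the two adjacent clusters; they are updated in place.
    pairs: (index_in_i, index_in_j) tuples obtained from local alignment.
    """
    for a, b in pairs:
        merged = tuple((u + v) / 2.0 for u, v in zip(integrated_i[a], integrated_j[b]))
        integrated_i[a] = merged
        integrated_j[b] = merged

# Hypothetical positions already converted to the integrated coordinate system.
integrated_i = [(1.0, 2.0, 0.0)]
integrated_j = [(9.0, 9.0, 9.0), (1.2, 2.0, 0.0)]

merge_associated(integrated_i, integrated_j, [(0, 1)])
```

After the call both clusters report the same integrated object position for the associated person, while unassociated positions are left unchanged.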

Until association has been completed for all adjacent cluster pairs (step S81), the integration processing unit 3 increments the adjacency number I and repeats the association processing of steps S74 to S80.

Through the coordinate integration processing described above, the per-cluster tracking results for a person are integrated so as to be consistent across the entire camera system. The image processing unit 1 judges a person's behavior spanning multiple clusters on the basis of the integrated tracking results, and the integrated tracking results are also used for displaying tracking results on the output device 5. Because tracking results in the integrated coordinate system, in which the error between cluster coordinate systems has been minimized, yield trajectories that are smooth across cluster boundaries, the accuracy of a person's movement speed and movement pattern derived from the integrated tracking results improves, and so does the accuracy of behavior recognition based on them. The smoother trajectories also make the displayed movement trajectories easier to read.

以上説明したように本発明に係るカメラシステムでは、カメラシステムを構成する全カメラで共通視野が確保できない場合に、共通視野を有するクラスタごとに物体位置を求め、それを全クラスタに亘る統合座標系にまとめることで、広範囲での物体位置を求め、移動物体の行動分析等を可能としている。統合座標系は、クラスタ内のカメラの位置関係は保持し、カメラキャリブレーションの誤差をクラスタ間の複数の接続部分にて、最適化しつつ分散するように設定され、これにより、クラスタ内のトラッキング処理での移動物体の座標について高い検知精度が実現される。また、移動物体がクラスタをまたぐときは共通カメラで位置合わせして同一物体を同定するので追跡結果を高精度のまま引き継ぐことができる。 As described above, when a common field of view cannot be secured across all the cameras constituting the camera system, the camera system according to the present invention obtains object positions for each cluster having a common field of view and gathers them into an integrated coordinate system spanning all clusters, thereby obtaining object positions over a wide range and enabling behavior analysis and the like of moving objects. The integrated coordinate system is set so that the positional relationships of the cameras within each cluster are preserved and the camera calibration error is distributed, while being optimized, over the multiple connections between clusters; this achieves high detection accuracy for the coordinates of moving objects in the tracking processing within each cluster. Furthermore, when a moving object crosses between clusters, the same object is identified by alignment using the common camera, so the tracking result can be handed over with its accuracy intact.

隣接クラスタ間での同一人物の対応付け処理では、座標間の距離に基づく対応付けという非常に簡単な処理により人物位置のクラスタ間での統合を行っているが、上述のようにクラスタ内での人物位置の検出精度が高いため、このような簡単な処理であっても対応付けを誤る可能性が低く抑えられる。すなわち、クラスタ内での人物位置の高い検知精度が、座標の対応付け精度を担保している。さらに言えば、クラスタ内での人の検知座標の精度は、上述したようにカメラキャリブレーション精度に依存することから、最終的に、クラスタ内のカメラキャリブレーションの精度が、対応付け精度を担保している。 In the processing that associates the same person between adjacent clusters, person positions are integrated across clusters by a very simple process, namely association based on the distance between coordinates. Because the detection accuracy of person positions within a cluster is high, as described above, the likelihood of an erroneous association is kept low even with such a simple process. In other words, the high detection accuracy of person positions within a cluster underpins the accuracy of the coordinate association. Moreover, since the accuracy of the detected coordinates of a person within a cluster depends on the camera calibration accuracy, as described above, it is ultimately the accuracy of the camera calibration within each cluster that underpins the association accuracy.
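A distance-based association of the kind described above might be sketched in Python as follows. The greedy closest-pair-first strategy, the `max_dist` gating threshold, and the function name `associate_by_distance` are assumptions introduced for illustration; the specification states only that the association is based on the distance between coordinates.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, ...]


def associate_by_distance(
    positions_a: Dict[str, Point],
    positions_b: Dict[str, Point],
    max_dist: float,
) -> List[Tuple[str, str]]:
    """Pair person positions of two adjacent clusters by Euclidean distance.

    Pairs are accepted closest-first; each position is used at most once,
    and pairs farther apart than max_dist (hypothetical gate) are rejected.
    """
    candidates = sorted(
        (math.dist(pa, pb), id_a, id_b)
        for id_a, pa in positions_a.items()
        for id_b, pb in positions_b.items()
    )
    used_a, used_b = set(), set()
    matches: List[Tuple[str, str]] = []
    for dist, id_a, id_b in candidates:
        if dist > max_dist:
            break  # remaining candidates are even farther apart
        if id_a in used_a or id_b in used_b:
            continue  # one of the two positions is already matched
        matches.append((id_a, id_b))
        used_a.add(id_a)
        used_b.add(id_b)
    return matches
```

Such a simple rule is workable only because the positions being compared are already accurate within each cluster; with noisier positions a smaller `max_dist` or a globally optimal assignment would likely be needed.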

なお、上述の実施形態ではクラスタがループ状に接続される例を説明した。ループ状のクラスタ接続では、ループを一周したときのずれによりカメラキャリブレーション全体の誤差を評価することができ、本発明では当該誤差を各クラスタ間に分散する。ここで、カメラキャリブレーション全体の誤差を各クラスタ間にて最適化しつつ分散するという本発明の構成は、全体の誤差を何らかの形で検知できれば、クラスタがループ状に閉合した形状に接続される場合に限らず、例えば、直鎖状やツリー状の開放した端部を有する形状に接続される場合にも適用することができる。例えば、部屋や廊下など形状が把握できる監視空間においては、監視空間の形状とクラスタの配置とを対比し、監視空間の壁からの距離などに基づいてクラスタのキャリブレーション誤差を検知して、本発明を適用することが可能である。 The embodiment above described an example in which the clusters are connected in a loop. With a loop-shaped cluster connection, the error of the camera calibration as a whole can be evaluated from the discrepancy that arises after going once around the loop, and the present invention distributes that error among the clusters. The configuration of the present invention, which distributes the overall camera calibration error among the clusters while optimizing it, is not limited to clusters connected in a closed, loop-like shape; provided that the overall error can be detected in some way, it can also be applied to clusters connected in a shape with open ends, such as a linear chain or a tree. For example, in a monitored space whose shape is known, such as a room or a corridor, the present invention can be applied by comparing the shape of the monitored space with the arrangement of the clusters and detecting the cluster calibration error from, for instance, distances from the walls of the monitored space.
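The idea of evaluating the overall calibration error from the discrepancy after one trip around the loop can be illustrated with a small Python sketch. It assumes, purely for illustration, that each cluster-to-cluster relation is a 2D rigid transform `(theta, tx, ty)`; the names `compose` and `loop_residual` are hypothetical, and the optimization that distributes the residual among the clusters is not shown.

```python
import math
from typing import Iterable, Tuple

Transform = Tuple[float, float, float]  # (rotation in radians, tx, ty)


def compose(t1: Transform, t2: Transform) -> Transform:
    """Return the transform that applies t2 first and then t1."""
    th1, x1, y1 = t1
    th2, x2, y2 = t2
    c, s = math.cos(th1), math.sin(th1)
    return (th1 + th2, x1 + c * x2 - s * y2, y1 + s * x2 + c * y2)


def loop_residual(transforms: Iterable[Transform]) -> Transform:
    """Chain the transforms around a closed loop of clusters.

    With perfect calibration the chained transform is the identity, so the
    returned (angle, tx, ty) is the discrepancy after one full circuit.
    """
    total: Transform = (0.0, 0.0, 0.0)
    for t in transforms:
        total = compose(total, t)
    # Wrap the accumulated angle so a full turn reads as (almost) zero.
    angle = math.atan2(math.sin(total[0]), math.cos(total[0]))
    return (angle, total[1], total[2])
```

For three clusters arranged as an equilateral triangle, each step being a 120-degree turn with a unit translation, the residual is numerically zero; lengthening one translation by 0.05 leaves a residual translation of magnitude 0.05, which is the quantity that would then be spread over the cluster connections.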

また、上記実施形態では、座標統合処理部３３による隣接クラスタ間の対応付け後の新たな座標（統合物体位置）は、対応付けられた２点の中点に設定している。しかし、統合物体位置は中点以外とすることもできる。例えば、対応付けられた２点を結ぶ線分を内分する点に設定することができる。また、２つのクラスタ間の共通カメラの視野内にて対応付けられた２点又はその中点がどちらのクラスタに近いかを評価し、近い方のクラスタにて得られた物体位置の重みを他方のクラスタにて得られた物体位置より大きくする重み付け平均とすることもできる。 In the embodiment above, the new coordinates (integrated object position) after the coordinate integration processing unit 33 associates positions between adjacent clusters are set to the midpoint of the two associated points. However, the integrated object position may be something other than the midpoint. For example, it can be set to a point that internally divides the line segment connecting the two associated points. Alternatively, it can be a weighted average: the system evaluates which cluster the two associated points, or their midpoint, are closer to within the field of view of the camera common to the two clusters, and gives the object position obtained in the closer cluster a larger weight than the object position obtained in the other cluster.
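The internal-division and weighted-average variants can be sketched in Python as below. The function names, the `weight_a` parameter, and in particular the rule in `closeness_weight` that turns two distances into a weight are assumptions for illustration; the specification says only that the closer cluster receives the larger weight.

```python
from typing import Tuple

Point = Tuple[float, ...]


def internal_division(pa: Point, pb: Point, weight_a: float) -> Point:
    """Point on segment pa-pb, weighting pa by weight_a and pb by 1 - weight_a.

    weight_a = 0.5 gives the midpoint used in the embodiment.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return tuple(weight_a * x + (1.0 - weight_a) * y for x, y in zip(pa, pb))


def closeness_weight(dist_to_a: float, dist_to_b: float) -> float:
    """Hypothetical weight for cluster A: larger when the point is nearer to A."""
    total = dist_to_a + dist_to_b
    if total == 0.0:
        return 0.5  # equally close to both clusters
    return dist_to_b / total
```

For example, a point at distance 1 from cluster A and 3 from cluster B gets `closeness_weight(1.0, 3.0) == 0.75`, so the integrated position lies three quarters of the way toward the position reported by cluster A.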

上述の実施形態では、簡単な例としてクラスタが３台のカメラからなる場合を示したが、クラスタを構成する複数のカメラの台数はこれに限定されない。 The embodiment above showed, as a simple example, a case in which each cluster consists of three cameras, but the number of cameras constituting a cluster is not limited to this.

また、クラスタ間での物体位置の対応付けは、当該クラスタの視野のオーバーラップ部分での各クラスタの物体位置の間で行うのが好適であるが、オーバーラップ部分を有さないクラスタ間であっても、互いの視野が或る程度近接していれば、例えば、移動元のクラスタでの追跡結果から移動先での或る時刻ｔの物体位置を推定し、これと当該時刻ｔにて移動先で実際に検知される物体位置とを対応付けたり、近接した視野のクラスタの一方にて時刻ｔに検知された物体位置と他方にて時刻ｔに近い時刻ｔ’に検知された物体位置とを対応付けたりする近似的な対応付けを行うことも可能である。 Object positions are preferably associated between clusters using the object positions of each cluster in the portion where the clusters' fields of view overlap. Even between clusters that have no overlapping portion, however, approximate association is possible if their fields of view are reasonably close to each other. For example, the object position at the destination at a certain time t can be estimated from the tracking result in the source cluster and associated with the object position actually detected at the destination at time t; or the object position detected at time t in one of the clusters with neighboring fields of view can be associated with the object position detected in the other at a time t' close to t.
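Estimating the destination position at time t from the source cluster's tracking result could, under a constant-velocity assumption, look like the following Python sketch. The constant-velocity model, the track layout (time-ordered `(time, position)` samples), and the name `predict_position` are assumptions; the specification does not state how the estimate is computed.

```python
from typing import List, Tuple

Point = Tuple[float, ...]
Sample = Tuple[float, Point]  # (timestamp, position)


def predict_position(track: List[Sample], t: float) -> Point:
    """Extrapolate a track to time t from its last two samples.

    Assumes constant velocity between the last two samples and afterwards
    (a hypothetical motion model chosen for simplicity).
    """
    if len(track) < 2:
        raise ValueError("need at least two samples to estimate velocity")
    (t0, p0), (t1, p1) = track[-2], track[-1]
    if t1 == t0:
        raise ValueError("last two samples share the same timestamp")
    scale = (t - t1) / (t1 - t0)
    return tuple(b + (b - a) * scale for a, b in zip(p0, p1))
```

The predicted point can then be compared with the positions actually detected in the destination cluster at time t, for instance with a distance threshold, to decide whether they belong to the same object.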

１画像処理部、２クラスタ処理部、３統合処理部、４入力装置、５出力装置、２０キャリブレーション実行部、２１３次元追跡部、２２，３４記憶部、３０座標変換部、３１変換行列生成部、３２最適化処理部、３３座標統合処理部、３４０クラスタ構成情報、３４１変換行列、３４２誤差行列、Ｃ１〜Ｃ３クラスタ、ｅ１〜ｅ６カメラ。 1 image processing unit, 2 cluster processing unit, 3 integration processing unit, 4 input device, 5 output device, 20 calibration execution unit, 21 three-dimensional tracking unit, 22, 34 storage units, 30 coordinate conversion unit, 31 transformation matrix generation unit, 32 optimization processing unit, 33 coordinate integration processing unit, 340 cluster configuration information, 341 transformation matrix, 342 error matrix, C1 to C3 clusters, e1 to e6 cameras.

Claims

視野内に共通視野を有した一群のカメラであるクラスタを複数含み、当該複数のクラスタが隣り合うクラスタにて共通カメラを共有して互いに連鎖し、かつ少なくとも１つの前記クラスタには前記共通カメラ以外のカメラが存在するカメラシステムであって、
前記クラスタごとに設定したクラスタ座標系、並びに当該クラスタ座標系における前記各カメラの位置及び姿勢を記憶するクラスタ校正情報記憶部と、
前記各クラスタ座標系における前記カメラの位置及び姿勢の配置関係を維持したまま、前記共通カメラを共有する複数の前記クラスタそれぞれの前記クラスタ座標系における当該共通カメラの位置及び姿勢の間に所定範囲の誤差を許容することにより当該クラスタ座標系の相互の配置関係を調整した、全クラスタに亘る統合座標系を記憶する統合座標系記憶部と、
前記クラスタごとに、前記カメラが撮像した物体の画像を解析して当該クラスタの前記クラスタ座標系における当該物体の物体位置を検出する物体位置検出部と、
前記物体位置検出部により検出された前記クラスタ座標系での前記物体位置を前記統合座標系に変換して出力する物体位置統合部と、
を備えたことを特徴とするカメラシステム。
A camera system including a plurality of clusters, each cluster being a group of cameras having a common field of view within their fields of view, the plurality of clusters being chained together with adjacent clusters sharing a common camera, and at least one of the clusters including a camera other than the common camera, the camera system comprising:
a cluster calibration information storage unit that stores a cluster coordinate system set for each cluster, and the position and orientation of each camera in that cluster coordinate system;
an integrated coordinate system storage unit that stores an integrated coordinate system spanning all clusters, in which the mutual arrangement of the cluster coordinate systems has been adjusted by allowing an error within a predetermined range between the positions and orientations of the common camera in the respective cluster coordinate systems of the plurality of clusters sharing that common camera, while the arrangement of the camera positions and orientations within each cluster coordinate system is maintained;
an object position detection unit that, for each cluster, analyzes an image of an object captured by the cameras and detects the object position of the object in the cluster coordinate system of that cluster; and
an object position integration unit that converts the object position in the cluster coordinate system detected by the object position detection unit into the integrated coordinate system and outputs it.

請求項１に記載のカメラシステムにおいて、
前記複数のクラスタはループ状に連鎖し、
前記統合座標系は、ループ上の全ての前記共通カメラについて積算した前記誤差を最小化するように定められていること、
を特徴とするカメラシステム。
The camera system according to claim 1, wherein
the plurality of clusters are chained in a loop, and
the integrated coordinate system is defined so as to minimize the error accumulated over all the common cameras on the loop.

請求項１又は請求項２に記載のカメラシステムにおいて、
前記物体位置統合部は、前記クラスタ座標系における前記共通カメラの前記位置及び姿勢を前記隣り合うクラスタにて合致させて、当該隣り合うクラスタのそれぞれから検出された前記物体位置を照合し同一物体の対応付けを行うこと、を特徴とするカメラシステム。
The camera system according to claim 1 or claim 2, wherein
the object position integration unit brings the position and orientation of the common camera in the cluster coordinate systems into agreement between the adjacent clusters, collates the object positions detected from each of the adjacent clusters, and associates identical objects.

請求項３に記載のカメラシステムにおいて、
前記物体位置統合部は、前記同一物体として対応付けされた前記物体位置のうち、前記隣り合うクラスタそれぞれから同一時刻に検出された複数の物体位置を前記統合座標系における当該物体位置の内分点に置き換えて出力すること、を特徴とするカメラシステム。
The camera system according to claim 3, wherein
the object position integration unit replaces, among the object positions associated as the same object, the plurality of object positions detected at the same time from the respective adjacent clusters with an internal division point of those object positions in the integrated coordinate system, and outputs the result.