JP3560326B2

JP3560326B2 - Object tracking method and object tracking device

Info

Publication number: JP3560326B2
Application number: JP2000180186A
Authority: JP
Inventors: Wataru Ito; Hirotada Ueda
Original assignee: Hitachi Kokusai Electric Inc
Current assignee: Hitachi Kokusai Electric Inc
Priority date: 1999-06-15
Filing date: 2000-06-15
Publication date: 2004-09-02
Anticipated expiration: 2020-06-15
Also published as: JP2001060263A

Description

【０００１】
【発明の属する技術分野】
本発明は、撮像装置を用いた監視装置に係り、特に撮像視野内に侵入した物体を、撮像装置から入力する映像信号の中から自動的に検出し、検出した物体の動きを自動的に追跡するようにした物体追跡方法と、検出した物体の動きに応じて視野方向（撮像中心方向）を調節するようにした物体追跡装置に関する。
【０００２】
【従来の技術】
カメラ等の撮像装置を用いた映像監視装置は、従来から広く用いられている。しかし、このような映像監視装置を用いた監視システムにおいて、その監視視野内に入り込んでくる人間や自動車などの侵入物体の検出及び追跡を、監視員がモニタに表示される画像を見ながら行なう有人監視ではなく、カメラ等の画像入力手段から入力される画像から侵入物体を自動的に検出し、その動きを自動的に追跡するようにし、所定の報知や警報処置が得られるようにしたシステムが要求されるようになってきている。
【０００３】
このようなシステムを実現するためには、まず、差分法などによって視野内の侵入物体を検出する。差分法とは、テレビジョンカメラ（以下、ＴＶカメラと称する）等の撮像装置により得られた入力画像と、予め作成した基準背景画像、即ち、検出すべき物体の写っていない画像とを比較し、画素毎に輝度値の差分を求め、その差分値の大きい領域を物体として検出するものである。このようにして検出された侵入物体の位置に相当する入力画像の部分画像をテンプレートとして登録し、逐次入力される画像の中でテンプレート画像と一致度が最大となる位置を検出する。この方法は、テンプレートマッチングと呼ばれ広く知られ、例えば、１９８５年に総研出版より出版された田村秀行氏監修による『コンピュータ画像処理入門』と題する書籍のＰ１４９〜Ｐ１５３で解説されている。
通常、テンプレートマッチングを用いて対象物体を追跡する場合、対象物体の姿勢の変化に追従するため、マッチング処理によって検出された対象物体の位置の画像を新たにテンプレートとして逐次更新する。これらの処理を図４〜図７によって説明する。
【０００４】
図４は差分法を用いた侵入物体検出処理の一例を表すフローチャート、図５はテンプレートマッチングを用いた侵入物体追跡の一例を表すフローチャート、図６は、図４と図５で表される侵入物体検出処理から初期のテンプレート画像登録までの流れを画像の例によって説明するための図である。また図７は、図５で表される侵入物体追跡処理の流れを画像の例を用いて説明するための図であり、一定の時間間隔で入力された画像が初期に与えられたテンプレート画像をもとにどのように実行されていくか（初期のテンプレートがどのように変化していくか）を説明する図である。
【０００５】
図６で、６０１は入力画像、６０９は入力画像６０１中の人型の物体、６０２は基準背景画像、６０６は差分処理部、６０３は差分処理部６０６において差分処理された後の差分画像、６１０は人型の物体６０９に相当する差分画像６０３中の人型の差分画像、６０７は二値化処理部、６０４は二値化処理部６０７によって二値化処理された差分画像６０３の二値化画像、６１１は人型の差分画像６１０に相当する二値化画像６０４中の人型物体（人型の二値化画像）、６１２は人型の二値化画像６１１の外接矩形、６０８は画像抽出部、６０５は入力画像６０１から外接矩形６１２の囲む領域をテンプレート画像として切出すことを説明する画像、６１３は入力画像６０１から切出した初期テンプレート画像である。
【０００６】
図４と図６において、まず、ＴＶカメラから例えば３２０×２４０画素の入力画像６０１を入力する（画像入力ステップ４０１）。次に、差分処理部６０６において、入力画像６０１と、予め作成した基準背景画像６０２との画素毎の差分を計算し、差分画像６０３を取得する。この時、入力画像６０１中の人型の物体６０９は差分画像６０３中に、人型の差分画像６１０として現れる（差分処理ステップ４０２）。そして、二値化処理部６０７において、差分画像６０３の各画素に対しての差分値が所定のしきい値以下の画素の値を“０”、しきい値以上の画素の値を“２５５”（１画素を８ビットとして）に置換えて、二値化画像６０４を得る。この時、入力画像６０１に撮像された人型の物体６０９は、二値化画像６０４中の人型物体６１１として検出され、人型物体６１１の外接矩形６１２が生成される（二値化処理ステップ４０３）。
次に物体存在判定ステップ４０４では、画像抽出部６０８において、二値化画像６０４中で画素値が“２５５”となった画素のかたまりを検出し、画素値が“２５５”となる画素のかたまりが存在する場合は物体検出処理を終了し、存在したかたまりの外接矩形に相当する入力画像に部分画像を初期テンプレート画像６１３として後述の画像メモリ３０５（図３）に登録する。また、画素値が“２５５”となる画素のかたまりが存在しない場合は画像入力ステップ４０１へ分岐する。
【０００７】
物体追跡処理の流れを図５に従って説明する。図５の物体検出処理ステップ１０１と初期テンプレート登録ステップ１０２とにおいて、図４と図６で説明したように物体検出処理と初期テンプレート画像の登録がなされた後の処理について図７を用いて説明する。
図４と図６で説明した物体検出処理が終了した後は、図５に示すフローチャートに従って物体追跡処理がなされる。
【０００８】
図７で、７０１ａは時刻ｔ０−１において更新された物体のテンプレート画像、７０１は時刻ｔ０−１での入力画像におけるテンプレート画像７０１ａの位置を示す図、７０２は時刻ｔ０での入力画像、７０２ａは時刻ｔ０においてテンプレートマッチング処理によって検出された物体の位置（テンプレート画像）、７０２ｂは１フレーム前（ｔ０−１）でのテンプレート画像の位置、７０２ｃはテンプレートマッチング処理（例えば、図５のフローチャート）で探索する探索範囲、７０２ｄは時刻ｔ０−１からｔ０まで人型物体が移動した方向と軌跡を表す移動矢印（例えば、テンプレート画像７０１ａの中心位置からテンプレート画像７０２ａの中心位置へ向かう矢印）、７０２ｅはテンプレートマッチング処理で人型物体を検出した位置、７０３ａは時刻ｔ０で更新された物体のテンプレート画像、７０３は時刻ｔ０での入力画像におけるテンプレート画像７０３ａの位置を示す図、７０４は時刻ｔ０＋１での入力画像、７０４ａは時刻ｔ０＋１においてテンプレートマッチング処理によって検出された物体の位置（テンプレート画像）、７０４ｂは１フレーム前（ｔ０）でのテンプレート画像の位置、７０４ｃはテンプレートマッチング処理で探索する探索範囲、７０４ｄは時刻ｔ０−１からｔ０＋１まで人型物体が移動した方向と軌跡を表す移動矢印（例えば、テンプレート画像７０１ａの中心位置からテンプレート画像７０３ａの中心位置を経由して７０４ａの中心位置へ向かう矢印）、７０５ａは時刻ｔ０＋１で更新された物体のテンプレート画像、７０５は時刻ｔ０＋１での入力画像におけるテンプレート画像７０５ａの位置を示す図、７０６は時刻ｔ０＋２での入力画像、７０６ａは時刻ｔ０＋２においてテンプレートマッチング処理によって検出された物体の位置（テンプレート画像）、７０６ｂは１フレーム前（ｔ０＋１）でのテンプレート画像の位置、７０６ｃはテンプレートマッチング処理で探索する探索範囲、７０６ｄは時刻ｔ０−１からｔ０＋２まで人型物体が移動した方向と軌跡を表す移動矢印（例えば、テンプレート画像７０１ａの中心位置からテンプレート画像７０３ａの中心位置、テンプレート画像７０５ａの中心位置を経由して７０６ａの中心位置へ向かう矢印）、７０７ａは時刻ｔ０＋２で更新された物体のテンプレート画像、７０７は時刻ｔ０＋２での入力画像におけるテンプレート画像７０７ａの位置を示す図、７０８は時刻ｔ０＋３での入力画像、７０８ａは時刻ｔ０＋３においてテンプレートマッチング処理によって検出された物体の位置（テンプレート画像）、７０８ｂは１フレーム前（ｔ０＋２）でのテンプレート画像の位置、７０８ｃはテンプレートマッチング処理で探索する探索範囲、７０８ｄは時刻ｔ０−１からｔ０＋３まで人型物体が移動した方向と軌跡を表す移動矢印（例えば、テンプレート画像７０１ａの中心位置からテンプレート画像７０３ａの中心位置、テンプレート画像７０５ａの中心位置、テンプレート画像７０７ａの中心位置を経由して７０８ａの中心位置へ向かう矢印）である。
【０００９】
即ち、図５と図７において、物体追跡処理が開始され、二値化画像６０４中に物体が存在すると判定され、物体検出処理ステップ１０１を終了する（図４）。そして、二値化画像６０４中の人型の二値化画像のかたまりの外接矩形に相当する入力画像６０１の部分画像を、初期テンプレート画像６１３（図７のテンプレート画像７０１ａ）として画像メモリ３０５（図３）に登録する（初期テンプレート登録ステップ１０２）。続いて、逐次入力される入力画像中の探索範囲７０２ｃ内でテンプレート画像７０１ａと一致度ｒ（Δｘ，Δｙ）が最大となる部分画像７０２ａを検出する（テンプレートマッチッングステップ１０３）。
即ち、テンプレートマッチッングステップ１０３では、最大一致度と、その最大一致度が求められた位置とを得る。
【００１０】
この一致度ｒ（Δｘ，Δｙ）を算出する方法として、例えば以下の式（１）で得られる正規化相関と呼ばれる指標を用いることができる。
【数１】

【００１１】
ここで、入力画像７０２に対してテンプレートマッチングを行なった場合、ｆ（ｘ，ｙ）は入力画像７０２、ｆ_ｔ（ｘ，ｙ）はテンプレート画像７０１ａ、（ｘ_０，ｙ_０）は登録したテンプレート画像７０１ａの左上の座標（画像は左上を原点としている）、Ｄ_ｔはテンプレートの探索範囲７０２ｃを表し、探索範囲７０２ｃ内にテンプレート画像７０１ａとまったく同じ画素値を持つ画像が存在した場合には、一致度ｒ（Δｘ，Δｙ）は“１．０”となる。テンプレートマッチングステップ１０３では、この式（１）で表される指標を（Δｘ，Δｙ）∈Ｄで表される探索範囲７０２ｃに対して計算し、その中で一致度ｒ（Δｘ，Δｙ）が最大となる位置（外接矩形）７０２ａを検出する。この探索範囲７０２ｃは、対象物体の見かけの移動量によって決定される。例えば、速度４０ｋｍ／ｈで移動する物体を、５０ｍ離れたＴＶカメラ（素子サイズ６．５ｍｍ×４．８ｍｍのＣＣＤ、焦点距離２５ｍｍのレンズ、入力画像サイズ３２０×２４０画素（ｐｉｘ）、処理間隔０．１ｓｅｃ／ｆｒａｍｅ）で監視する場合、水平方向の物体の見かけの移動量は２７．４ｐｉｘ／ｆｒａｍｅ、垂直方向は２７．８ｐｉｘ／ｆｒａｍｅとなり、Ｄを−３０ｐｉｘ＜Δｘ＜３０ｐｉｘ，−３０ｐｉｘ＜Δｙ＜３０ｐｉｘ程度に設定すればよい。
尚、一致度の算出方法は上述の正規化相関の指標に限られるものではない。例えば、入力画像とテンプレート画像間で各画素毎の差をとって、その絶対値の累積値の逆数を一致度としてもよい。
【００１２】
次に、最大一致度判定ステップ１０４では、テンプレートマッチングステップ１０３において、テンプレート画像７０１ａと一致度が最大となる入力画像７０２の位置に、物体が移動した（外接矩形７０２ｂから外接矩形７０２ａに移動した）と判断したあと、次に、最大一致度が所定値以下に低下した場合（例えば“０．５”未満）、入力画像中に対象物体がいなくなったものとして、物体検出処理ステップ１０１へ分岐し、最大一致度が所定値以上であった場合（例えば“０．５”以上）は、テンプレート更新ステップ１０６へ分岐する。
テンプレート更新ステップ１０６では、入力画像中の探索範囲７０２ｃ内でテンプレート画像７０１ａと一致度ｒ（△ｘ，△ｙ）が最大となる部分画像７０２ａを使ってテンプレート画像７０１ａをテンプレート画像７０３ａに更新する。ここで、テンプレート画像を更新する理由は、対象物体の姿勢が変化（例えば、対象物体の人が手を上げたり、腰を曲げたり、足を上げたりして画像が変化）し、テンプレート画像を更新しないと一致度が低下してしまい、追跡結果の信頼性が低下するためである。従って、検出された物体の位置の部分画像７０２ｅを新たなテンプレート画像７０３ａとして更新し、対象物体が姿勢を変えた場合でも安定な追跡を行なうようにしている。
【００１３】
上述の実施例では、差分法によって検出した侵入物体について作成したテンプレート画像は、検出した画素のかたまりの外接矩形を取り込み、この外接矩形に囲まれた部分画像をテンプレート画像として切出した。
しかし、切出すテンプレート画像のサイズの決定方法はこの方法に限らない。例えば、外接矩形に一定の定数（例えば、０．８、または１．１等）を乗算してもよい。
更に、撮像素子としてＣＣＤ（ＣｈａｒｇｅＣｏｕｐｌｅｄＤｅｖｉｃｅ）を使用した場合には、ＣＣＤのサイズ、レンズの焦点距離、ＣＣＤから検出する物体の距離によって、対象とみなす物体の大きさを予め算出できるので、算出した大きさをテンプレート画像サイズとすることもできる。
【００１４】
次に、カメラ雲台制御ステップ１０７に移る。
図１１は入力画像とテンプレートマッチングで検出された対象物体の位置との関係を説明するための図である。図１１に言及して、カメラ雲台制御ステップ１０７について説明する。
カメラ雲台制御ステップ１０７では、テンプレートマッチングによって検出された対象物体の位置と入力画像中央との変位、即ち、カメラの光軸（カメラの視野方向）に対して対象物体が存在する方向に基づいてカメラ雲台３０２のパン・チルトモータを制御する。
つまり、テンプレートマッチングによって検出された対象物体の中心位置（ｘ_０＋△ｘ＋ｄｘ／２，ｙ_０＋△ｙ＋ｄｙ／２）（（ｄｘ，ｄｙ）はテンプレート画像の大きさを表す）と入力画像の中心位置（１６０，１２０）（画像サイズを３２０×２４０とする）とを比較し、検出された対象物体の中心位置が入力画像の中心位置に対して左側に位置する場合は、カメラ雲台のパンモータをカメラの光軸方向が左に移動するように制御し、右側に位置する場合はカメラ雲台のパンモータをカメラの光軸方向が右に移動するように制御する。また、検出された対象物体の中心位置が入力画像の中心位置に対して上側に位置する場合は、カメラ雲台のチルトモータをカメラの光軸方向が上に移動するように制御し、下側に位置する場合は、カメラ雲台のチルトモータをカメラの光軸方向が下に移動するように制御する。
尚、パンモータ、チルトモータは同時に制御可能で、例えば、検出された対象物体の中心位置が入力画像の中心位置に対して左上側に位置する場合は、カメラ雲台のパンモータをカメラの光軸方向が左に移動するように制御し、かつ、チルトモータをカメラの光軸方向が上に移動するように同時に制御する。このようにすることで、カメラの光軸上に対する対象物体を捉えるようにカメラ雲台を制御することが可能となる。
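The pan/tilt decision of camera pan-tilt head control step 107 can be sketched as follows. This is a minimal illustration and not the patent's implementation: the function name, the string commands, and the "hold" result for an object exactly on the image center are assumptions of the sketch, and the template size is written (tw, th) here to avoid confusion with the offsets (dx, dy).

```python
def pan_tilt_command(x0, y0, dx, dy, tw, th, image_size=(320, 240)):
    """Return (pan, tilt) directions that move the optical axis toward the object.

    (x0 + dx, y0 + dy) is the upper-left corner of the matched region and
    (tw, th) is the template size; the image origin is the upper-left corner.
    """
    # Center of the detected object, as in the text: corner plus half the size.
    cx = x0 + dx + tw / 2.0
    cy = y0 + dy + th / 2.0
    mid_x, mid_y = image_size[0] / 2.0, image_size[1] / 2.0
    # Pan and tilt are decided independently, so both can be driven at once.
    pan = "left" if cx < mid_x else "right" if cx > mid_x else "hold"
    tilt = "up" if cy < mid_y else "down" if cy > mid_y else "hold"
    return pan, tilt
```

For an object whose center lies up and to the left of (160, 120), the sketch returns `("left", "up")`, which corresponds to driving both motors simultaneously.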
次に、警報・モニタ表示ステップ１０８では、例えば対象物体が所定の警報を出す範囲に存在する場合に警報を鳴らしたり、監視モニタに対象物体の画像を表示したりする。
【００１５】
警報・モニタ表示ステップ１０８が終わると、画像入力ステップ４０１に戻り、新しい入力画像を得て、この現時刻での入力画像に対して再びテンプレートマッチッング処理を行う。即ち、時刻ｔ０での入力画像７０２により更新したテンプレート画像７０３ａと、時刻ｔ０＋１での入力画像７０４とによりテンプレートマッチング処理を行う。この時、探索範囲７０４ｃは時刻ｔ０で更新されたテンプレート画像７０４ｂを中心とした位置に移動しており、新しい探索範囲で検索が行われる。そして、最大の一致度を持った物体が検出され、その検出された物体の位置７０４ａを元に新しいテンプレート画像７０５ａが生成される。
以上のように、対象物体が存在する間は、ステップ４０１，ステップ１０３，ステップ１０４，ステップ１０６，ステップ１０７，ステップ１０８の処理を繰返し、新しいテンプレート画像７０６ａ、７０８ａ，‥‥‥へと次々にテンプレート画像を更新しながら、対象物体を追跡し続ける。
【００１６】
前述のテンプレートマッチングを用いた侵入物体の追跡法では、対象物体の向きが変化（例えば、対象物体の人が右を向いたり、後ろを向いたり）すると、対象物体とマッチング位置とのズレが大きくなり、正確かつ安定した追跡ができないという問題がある。
即ち、テンプレートマッチングは、テンプレート内の高いコントラストの模様部分が一致するようにマッチングされるという性質がある。例えば、対象物体が車輌である場合において、はじめは正面を向いていて、車輌のほとんどすべてがマッチング対象となっていた場合（図８の入力画像８０２）と、その後進行方向（向き）が変わり横向きになってしまった車輌の前面部分だけがマッチング対象となり、車輌全体がマッチング対象となっていた時に比べて、マッチング中心が車両の中心から車輌の前部に移動するため、検出位置ズレが生ずる。
【００１７】
これを図８を用いて説明する。図８は、侵入物体追跡処理の流れを画像の例を用いて説明するために、撮像視野内での曲線を描く車道内を通過する車輌を侵入物体とした場合の図である。８０１ａ，８０３ａ，８０５ａ，８０７ａはそれぞれ時刻ｔ１−１，時刻ｔ１，時刻ｔ１＋１，時刻ｔ１＋２のテンプレート画像、８０１，８０３，８０５，８０７はそれぞれテンプレート画像８０１ａ，８０３ａ，８０５ａ，８０７ａの更新時の位置を示す画像、８０２，８０４，８０６，８０８はそれぞれ時刻ｔ１，時刻ｔ１＋１，時刻ｔ１＋２，時刻ｔ１＋３の入力画像、８０２ａ，８０４ａ，８０６ａ，８０９ａは時刻ｔ１，時刻ｔ１＋１，時刻ｔ１＋２，時刻ｔ１＋３においてそれぞれテンプレートマッチング処理によって検出された物体の位置、８０２ｂ，８０４ｂ，８０６ｂ，８０８ｂはそれぞれ１フレーム前でのテンプレート画像（時刻ｔ１−１，時刻ｔ１，時刻ｔ１＋１，時刻ｔ１＋２でのそれぞれのテンプレート画像）の位置である。
図８において、時刻ｔ１−１で登録されたテンプレート画像８０１ａは、移動車輌のフロント部がほぼ正面を向いている画像である。時刻ｔ１では、このテンプレート画像８０１ａを用いてテンプレートマッチングを行ない、対象物体の移動した位置を検出すると共に、テンプレート画像８０１ａをテンプレート画像８０３ａに更新する。続いて、時刻ｔ１＋１ではテンプレート画像８０５ａに更新され、更に時刻ｔ１＋２ではテンプレート画像８０７ａに更新され、この処理を時刻ｔ１＋３まで行なうと、追跡開始時刻ｔ１で車輌のライトなどがあるフロント部分をテンプレートマッチングしていたものが、時刻ｔ１＋３では、車輌の左側にずれてマッチングされてしまう。
【００１８】
この現象は、テンプレートマッチングが対象とする入力画像とテンプレート画像中のコントラストが高い画像部分の位置のズレを小さくする様にマッチングが行われるように働くためで、この例では車輌のライト部分がそれにあたる。そのため、図８のように、対象物体が向かって右から左に向きを変えるような場合には左側にずれ、向って左側から右側に向きを変えるような場合には右側にずれる。
更に、時刻ｔ１では、テンプレート画像８０１ａ中には車輌の画像だけが入っていたが、対象物体が向きを変えてテンプレート位置がずれたことによって、テンプレート画像８０７ａ中に対象物体以外の背景部分の画像が入り込んでしまう。このテンプレート画像８０７ａのような対象物体以外の画像を多く含んだテンプレート画像を用いて追跡を続けた場合には、対象物体とマッチングできずに、テンプレートに入り込んだ背景部分とマッチングしてしまう。従って、テンプレートマッチングを用いた物体追跡法は、対象物体の向きが変化するような場合には、対象物体の模様が見かけ上移動し、それに引張られてテンプレートの位置がずれるため、対象としている物体を追跡している保証ができず、安定な追跡を行うことができない。
【００１９】
【発明が解決しようとする課題】
前述の従来技術には、対象物体の向きの変化が大きい場合には安定な追跡を行うことができない欠点があった。
本発明の目的は、上記のような欠点を除去し、対象物体の向きの変化が大きい場合にも、正確に物体を検出および追跡することができる、信頼性の高く安定に動作する物体追跡方法及び装置を提供することにある。
【００２０】
【課題を解決するための手段】
上記の目的を達成するために、本発明の物体追跡方法は、撮像装置によって逐次取得する入力画像から撮像視野内の物体を検出し、検出した物体を追跡する物体追跡方法において、現時刻に取得した入力画像と、テンプレート画像とのテンプレートマッチングを行い、現時刻に取得した入力画像の中からテンプレート画像と一致度が最大となる部分画像の位置を検出するマッチングステップと、マッチングステップによって検出した位置の部分画像を包含し、かつ検出した位置の部分画像より所定サイズ大きい拡張部分画像の領域についてエッジ検出を行い、エッジ検出によって検出した部分画像の位置を物体の現時刻の検出位置に補正するマッチング位置補正ステップと、マッチング位置補正ステップによって補正された位置の部分画像を新たなテンプレートとして更新するテンプレート更新ステップとを設け、撮像視野内の物体を追跡するものである。
また、本発明の物体追跡方法において、位置補正ステップは、現時刻の入力画像の拡張部分画像の領域内で、エッジの密度が最大になる部分画像の位置を検出したものである。
【００２１】
更に本発明の物体追跡方法において、位置補正ステップは、拡張部分画像の領域内に含まれるエッジ成分を抽出し、ｘ軸及びｙ軸上にそれぞれ、ｙ軸方向及びｘ軸方向のエッジ成分量を累積表示し、ｘ軸及びｙ軸上の累積エッジ成分量から最大エッジ密度範囲を検出したものである。
また更に本発明の物体追跡方法において、テンプレート画像のサイズは、物体の撮像視野内での見かけの移動量に基づき決定することができる。
【００２２】
本発明の物体追跡方法はまた、入力画像から差分法によって物体を検出し、検出した物体の少なくとも一部を含む入力画像の所定サイズの部分画像をテンプレート画像として登録する初期テンプレート登録ステップを備え、前記差分法によって検出した物体を追跡対象物体として、追跡を行なう。
更にまた、別の方法として、位置補正ステップにおいて、取得した最大一致度が所定の値未満であれば、差分法によって、現時刻の入力画像から物体を検出し、検出した物体を追跡対象物体として追跡する。
【００２３】
また更に、本発明の物体追跡方法においては、位置補正ステップによって検出された位置に基づいて、撮像装置の視野方向を変えるための制御信号を発生するカメラ雲台制御ステップを有し、その制御信号によって撮像装置の視野方向を、検出された位置に常に向け、検出した物体を追跡するものである。
【００２４】
本発明の物体追跡装置は、監視対象範囲を逐次撮像する撮像装置と、撮像装置が取得した映像信号を逐次画像信号に変換する画像入力インターフェースと、画像入力インターフェースによって変換された画像信号を処理する画像処理手段と、登録されたテンプレート画像を格納する記憶装置とを備え、前記画像処理手段は、撮像装置から現時刻に入力した画像信号を記憶装置にあらかじめ登録されたテンプレート画像によってテンプレートマッチングを行ない、現時刻に入力した画像信号の中から、テンプレート画像と最大の一致度を持つ部分画像の位置を検出し、検出した位置の部分画像を包含する、現時刻に入力した画像信号内の部分画像より所定サイズ大きい拡張部分画像の領域について、エッジの密度が最大となる部分画像の位置を検出し、エッジの密度が最大となる部分画像の位置を、物体の現時刻における検出位置とし、現時刻の検出位置の部分画像を新たなテンプレートマッチング位置と更新することによって撮像装置の撮像視野内に侵入した物体を追跡するものである。
【００２５】
また本発明の物体追跡装置は、撮像装置の視野方向を変えるための雲台と、画像処理手段によって撮像装置の視野方向を変えるために雲台を制御するための制御信号を供給する雲台制御インターフェースとを更に備え、画像処理手段が、物体の現時刻における検出位置に基づいて、物体の方向を検出し、得られた方向から雲台制御インターフェースを介して、撮像装置の視野方向を調節し、撮像装置の撮像視野内に侵入した物体を追跡するものである。
【００２６】
【発明の実施の形態】
本発明の物体追跡方法では、従来からの対象物体の模様が見かけ上移動し、それに引張られてテンプレートの位置がずれる問題を解決するため、対象物体は背景部分に比べてエッジ成分が多いという特徴を利用して、追跡処理の過程で更新するテンプレート画像の位置を入力画像のエッジ画像の密度に基いて補正する。即ち本発明は、差分法によって物体を検出し、検出物体の画像をテンプレートとして保持し、検出位置の周辺でエッジ画像の密度が最大となる位置に検出位置を補正しながら対象物体を追跡することで、特に対象物体の向きの変化が起こった場合でも安定に追跡を行うことができる。
【００２７】
図３に、本発明の各実施例に共通する物体追跡装置のハードウエア構成の一例を示す。３０１はＴＶカメラ、３０３は画像入力Ｉ／Ｆ、３１３はデータバス、３０５は画像メモリ、３０６はワークメモリ、３０７はＣＰＵ、３０８はプログラムメモリ、３０２はカメラ雲台、３０４は雲台制御Ｉ／Ｆ、３０９は出力Ｉ／Ｆ、３１０は画像出力Ｉ／Ｆ、３１１は警告灯、３１２は監視モニタである。ＴＶカメラ３０１は画像入力Ｉ／Ｆ３０３に接続され、カメラ雲台３０２は雲台制御Ｉ／Ｆ３０４に接続され、警告灯３１１は出力Ｉ／Ｆ３０９に接続され、監視モニタ３１２は画像出力Ｉ／Ｆ３１０に接続されている。画像入力Ｉ／Ｆ３０３、雲台制御Ｉ／Ｆ３０４、画像メモリ３０５、ワークメモリ３０６、ＣＰＵ３０７、プログラムメモリ３０８、出力Ｉ／Ｆ３０９及び画像出力Ｉ／Ｆ３１０は、データバス３１３に接続されている。また、ＴＶカメラ３０１はカメラ雲台３０２に取付けられている。
【００２８】
図３において、ＴＶカメラ３０１は監視対象（視野範囲）を撮像する。撮像された映像信号は、画像入力Ｉ／Ｆ３０３からデータバス３１３を介して画像メモリ３０５に蓄積される。ＣＰＵ３０７はプログラムメモリ３０８に保存されているプログラムに従って、ワークメモリ３０６内で画像メモリ３０５に蓄積された画像の解析を行なう。ＣＰＵ３０７は、処理結果に応じてデータバス３１３から雲台制御Ｉ／Ｆ３０４を介してカメラ雲台３０２を制御してＴＶカメラ３０１の撮像視野を変えたり、出力Ｉ／Ｆ３０９を介して警告灯３１１を点灯し、画像出力Ｉ／Ｆ３１０を介して監視モニタ３１２に、例えば侵入物体検出結果画像を表示する。尚、画像メモリ３０５は、登録されたテンプレート画像を保存しておくためのテンプレート画像保持装置をも備えている。
【００２９】
以降に説明するフローチャートは、すべて上記図３で説明した物体追跡監視装置のハードウエア構成を使って説明する。
本発明の第１の実施例を図１によって説明する。図１は本発明の一実施例の処理プロセスを説明するフローチャートである。図１は図５で示したテンプレートマッチング法の処理プロセスにテンプレート位置補正ステップ１０５を追加したものである。ステップ１０１、１０２、４０１、１０３、１０４、１０６、１０７、１０８については、図４と図５によって説明したものと同じであるので説明を省略する。
【００３０】
さて、最大一致度判定ステップ１０４において、最大一致度が所定値以上であった場合、テンプレート位置補正ステップ１０５へ進む。テンプレート位置補正ステップ１０５での処理の内容を、図８において時刻ｔ１＋１に得られた入力画像８０４と図９を用いて説明する。図９は侵入物体追跡処理の流れを画像の例を用いて説明するために、撮像視野内での曲線を描く車道内を通過する車輌を侵入物体とした場合の図で、図８の入力画像８０４について処理を行った場合の一例である。９０１は入力画像で図８の入力画像８０４と同じ画像が入力したもの、９０２は入力画像９０１から微分フィルタ（図示しない）を使って抽出したエッジ画像、９０３ａは探索領域、９０３ｂは水平方向（ｘ軸）への投影像、９０３ｃは垂直方向（ｙ軸）への投影像、９０３は説明のために領域９０３ａで切出したエッジ画像に投影像９０３ｂと９０３ｃとを重ねて表示した図、８０４ａは図８で既に示したテンプレートマッチングで得られた検出位置を表す範囲、９０４はｘ軸への投影像９０３ｂを表すグラフ、９０４ａはテンプレートマッチングによって得られた検出位置を表す範囲、９０４ｂは累積投影値が最大となる範囲、９０５はｙ軸への投影像９０３ｃを表す図、９０５ａはテンプレートマッチングによって得られた検出位置を表す範囲、９０５ｂは累積投影位置が最大となる範囲である。
【００３１】
図９において、テンプレート位置補正ステップ１０５では、入力画像９０１に対してエッジ抽出処理を施し、エッジ画像９０２を得る。このエッジ抽出処理は、例えば、入力画像９０１に対してＳｏｂｅｌ，Ｒｏｂｅｒｔｓ等の微分フィルタを掛け、得られた画像を二値化する（エッジ部分を“２５５”、それ以外を“０”とする）ことによって行われる。この例については、例えば、１９８５年に総研出版より出版された田村秀行氏監修による『コンピュータ画像処理入門』と題する書籍のＰ１１８〜１２５に解説されている。
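The edge extraction described above (a differential filter followed by binarization) might look like the following sketch. It assumes a Sobel filter, approximates the gradient magnitude by |gx| + |gy|, and uses an arbitrary threshold of 100; none of these specifics are given in the source, and border pixels are simply left at 0.

```python
def sobel_edges(image, threshold=100):
    """Return a binary edge image (255 = edge, 0 = otherwise) of a 2-D luminance list."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column (Sobel weights 1, 2, 1).
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            # Binarize: edge pixels become 255, everything else stays 0.
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 255
    return edges
```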
【００３２】
次にエッジ画像９０２をテンプレートマッチングステップ１０３によって得られた検出位置８０４ａの範囲（図９０３の実線枠部分：左上の座標（ｘ_０，ｙ_０）、大きさ（ｄｘ，ｄｙ））から上下左右を所定画素ｄ分（ｄは対象物体の向きの変化に伴うマッチング位置のズレの許容値、例えばｄ＝５ｐｉｘ）拡げた探索領域９０３ａ（図９０３の点線枠部分：左上の座標（ｘ_０−ｄ，ｙ_０−ｄ）、大きさ（ｄｘ＋２ｄ，ｄｙ＋２ｄ））で切出し、ｘ軸方向に対するエッジ画像の投影像９０３ｂと、ｙ軸方向に対するエッジ画像の投影像９０３ｃを求める。従って、探索領域９０３ａは、上記検出位置８０４ａの範囲を包含する拡張部分画像である。
【００３３】
グラフ９０４において、横軸は水平（ｘ軸）方向、縦軸はｘ軸方向の各画素（ｐｉｘ）毎のエッジ画像の投影像９０３ｂの値ｈｘ（ｘ）であり、グラフ９０５において、横軸は垂直（ｙ軸）方向、縦軸はｙ軸方向の各画素（ｐｉｘ）毎のエッジ画像の投影像９０３ｃの値ｈｙ（ｙ）である。
ｘ軸方向の投影像９０３ｂのｘ＝ｘ_０における投影値ｈｘ（ｘ_０）は、探索領域９０３ａで切出したエッジ画像に対し、（ｘ，ｙ）を、ｘ＝ｘ_０において、ｙ_０−ｄ＜ｙ＜ｙ_０＋ｄｙ＋ｄと変化させ、画素値が“２５５”となる画素数を計数して得る。また、ｙ軸方向の投影像９０３ｃのｙ＝ｙ_０における投影値ｈｙ（ｙ_０）は、探索領域９０３ａで切出したエッジ画像に対し、（ｘ，ｙ）を、ｙ＝ｙ_０において、ｘ_０−ｄ＜ｘ＜ｘ_０＋ｄｘ＋ｄと変化させ画素値が“２５５”となる画素数を計数して得る。次に、図９０４はｘ軸への投影像９０３ｂを表すグラフであり、範囲９０４ａはテンプレートマッチングによって得られた検出位置を表す。また、範囲９０４ｂはその範囲の累積投影値が最大となる範囲、すなわちエッジの密度が最大となる範囲（ｘ_１＜ｘ＜ｘ_１＋ｄｘ）を示しており、この位置は、次の式（２）によって得られる。
【００３４】
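【数２】
$$x_1 = \underset{x_0 - d \,<\, x_1 \,<\, x_0 + d}{\arg\max} \left\{ \sum_{x_1 \,<\, x \,<\, x_1 + dx} h_x(x) \right\} \tag{2}$$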
【００３５】
この式(2) は、x1 をx0-d＜x1＜x0+d と変化させた場合に、x1＜x＜x1+dx において hx(x) の累積値が最も大きくなる x1 を求めることを表している。また同様にｙ軸に対する投影像に対してもエッジの累積値が最大となる範囲（ y1＜y＜y1+dy ）を得る。従って、テンプレートマッチングステップ103によって検出された対象物体の位置（左上の座標（ x0 ，y0 ））は、テンプレート位置補正ステップ 105 によって補正された位置（左上の座標（ x1 ，y1 ））に修正される。
なお、本実施例では、式 (2) で表される通り、 x1 を x0 − d ＜ x1 ＜ x0 + d と変化させた場合に、 x1 ＜ x ＜ x1 + dx において hx(x) の累積値が最も大きくなる x1 を求めているが、 x0 − d ＜ x1 ＜ x0 + d と変化させる過程で式 (2) の中カッコ内の値が所定のしきい値を超えた場合に式 (2) の計算を中止し、その時の x1 をテンプレートの補正位置としても良い。この場合、所定のしきい値とは、例えば累積値の最大値 255 ×（ dy + 2d ）（ｙ軸に対しては 255 ×（ dx + 2d ））の 30 ％の値を設定し、求められる位置はエッジの最大累積値の 30% 以上のエッジを含む部分となり、式 (2) の計算量を減らすことができる。
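The position correction of template position correction step 105 can be sketched as a sliding-window search over the two edge projections. This is an illustrative sketch under stated assumptions, not the patent's implementation: the function names are hypothetical, the window start runs over inclusive endpoints (the text uses strict inequalities), the search region is clipped at the image border, and ties are resolved in favor of the first maximum. The optional `stop_at` argument mirrors the early-termination variant, in which the scan stops once the accumulated value exceeds a threshold.

```python
def best_window_start(hist, window, stop_at=None):
    """Return the start index whose `window` consecutive bins have the largest sum.

    If stop_at is given, the scan stops at the first window whose sum exceeds it.
    """
    best_start, best_sum = 0, -1
    for start in range(len(hist) - window + 1):
        total = sum(hist[start:start + window])
        if stop_at is not None and total > stop_at:
            return start  # early termination: good enough, skip the rest
        if total > best_sum:
            best_start, best_sum = start, total
    return best_start


def correct_position(edges, x0, y0, w, h, d=5):
    """Shift the matched box (x0, y0, w, h) by up to d pixels toward the densest edges."""
    height, width = len(edges), len(edges[0])
    # Search region: the matched box widened by d pixels on every side.
    left, top = max(x0 - d, 0), max(y0 - d, 0)
    right, bottom = min(x0 + w + d, width), min(y0 + h + d, height)
    # Projections: number of edge pixels in each column (hx) and each row (hy).
    hx = [sum(1 for y in range(top, bottom) if edges[y][x] == 255)
          for x in range(left, right)]
    hy = [sum(1 for x in range(left, right) if edges[y][x] == 255)
          for y in range(top, bottom)]
    # The corrected corner is where a w-wide (h-high) window collects the most edges.
    x1 = left + best_window_start(hx, w)
    y1 = top + best_window_start(hy, h)
    return x1, y1
```

Working on the two one-dimensional projections keeps the search cheap: each axis needs only about 2d window sums instead of a two-dimensional scan of the widened region.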
【００３６】
上記本発明の一実施例の効果について、図１０を用いて説明する。図１０は、侵入物体追跡処理の流れを画像の例を用いて説明するために、撮像視野内での曲線を描く車道内を通過する車輌を侵入物体とした場合の図で、図８と同じ条件設定であり、図８で示したマッチング処理の後に位置ズレ補正処理を追加したものである。１００１ａ，１００３ａ，１００５ａ，１００７ａはそれぞれ時刻ｔ１−１，時刻ｔ１，時刻ｔ１＋１，時刻ｔ１＋２のテンプレート画像、１００１，１００３，１００５，１００７はそれぞれテンプレート画像１００１ａ，１００３ａ，１００５ａ，１００７ａの更新時の位置を示す画像、１００２，１００４，１００６，１００８はそれぞれ時刻ｔ１，時刻ｔ１＋１，時刻ｔ１＋２，時刻ｔ１＋３の入力画像、１００２ａ，１００４ａ，１００６ａ，１００９ａは時刻ｔ１，時刻ｔ１＋１，時刻ｔ１＋２，時刻ｔ１＋３においてそれぞれテンプレートマッチング処理によって検出された物体の位置、１００２ｂ，１００４ｂ，１００６ｂ，１００８ｂはそれぞれ１フレーム前でのテンプレート画像（時刻ｔ１−１，時刻ｔ１，時刻ｔ１＋１，時刻ｔ１＋２でのそれぞれのテンプレート画像）の位置である。
【００３７】
図８で示された方法では、テンプレートマッチングが対象とする入力画像と、テンプレート画像の中のコントラストが高い画像部分（図８の例では、車両のフロント部分）の位置のズレを小さくするようにテンプレートマッチングが行われるため、対象物体の向きの変化が起る場面では、テンプレート画像中の車両フロント部分の位置は変らないが、追跡処理を繰返す度に対象物体以外の画像（背景画像）の画素が含まれる割合が大きくなってしまう。
【００３８】
それに対し、本発明の実施例を適用した図１０の場合には、テンプレートマッチング処理後にテンプレート画像の位置をエッジが多く含まれる画素部分、即ち、対象物体の画素部分に逐次位置補正するため、図８の方法によるテンプレート画像に比べて、対象物体以外の画素分が含まれる割合を小さくすることができる。従って、図８と図１０の同時刻のテンプレート位置、例えば、８０９ａと１００９ａとを比較すると、位置８０９ａの場合には、テンプレート画像中に含まれる背景部分の画素の割合が半分以上となっているが、位置１００９ａの場合には、テンプレート画像のほとんどの部分が対象物体の画素となっている。
【００３９】
上述のテンプレート位置補正ステップ１０５の次に、テンプレート更新ステップ１０６の処理がなされ、位置補正された対象物体の位置を新しいテンプレート画像として更新する。以降、図５と同様な処理がなされる。
【００４０】
上記のように本発明の実施例によれば、テンプレートマッチングステップ１０３によって検出された位置を対象物体に存在するエッジを検出し、そのエッジの密度が最大となる位置に検出位置を補正するため、対象物体が向きを変えたとしてもテンプレートの位置が対象物体からずれることがなく、正確に対象物体を追跡することができる。
【００４１】
本発明の第２の実施例を図２を用いて説明する。図２は本発明の処理プロセスの一実施例を表すフローチャートである。図２は図１で表される第１の実施例を表すフローチャートの最大一致度判定ステップ１０４の代わりに分岐ステップ２０１と最大一致度判定ステップ１０４´を置き、テンプレート更新ステップ１０６の代わりに複数テンプレート保存ステップ２０２を置いて構成したものである。
【００４２】
図２において処理が開始されると、既に説明した物体検出処理１０１から初期テンプレート登録ステップ１０２によって、時刻ｔ０−１における入力画像から取得した画像を時刻ｔ０−１のテンプレート画像として登録した後、画像入力ステップ４０１に進む。画像入力ステップ４０１で、時刻ｔ０における入力画像を取得する。
次に、テンプレートマッチング処理ステップ１０３では、保存された時刻ｔ０−１のテンプレート画像と、時刻ｔ０における入力画像とテンプレートマッチング処理がなされる。そして、分岐ステップ２０１（後述する）を通り、最大一致度判定ステップ１０４´に進む。
最大一致度判定ステップ１０４´では、最大一致度が所定値以上であった場合にはテンプレート位置補正ステップ１０５に進み、最大一致度が所定値未満であった場合には物体検出処理ステップ１０１に戻る。
【００４３】
テンプレート位置補正ステップ１０５では、最大一致度判定ステップ１０４´において抽出した位置を時刻ｔ０での検出位置として補正する。そして、次の複数テンプレート保存ステップ２０２では、位置補正した時刻ｔ０の検出位置をもとに時刻ｔ０のテンプレートが新たに保存される。この時、既に初期テンプレート登録ステップ１０２において登録された時刻ｔ０−１のテンプレ−ト画像はそのまま保存される。次に、カメラ雲台制御ステップ１０７に進み、カメラの視野（光軸方向）を時刻ｔ０における検出位置に基づき対象物体の方向に向ける。
次に、警報・モニタ表示ステップ１０８に処理を移し、例えば警報を鳴らしたり、例えば監視モニタに対象物体の画像を表示したりする。
警報・モニタ表示ステップ１０８が終了すると、画像入力ステップ４０１に戻り、新しい入力画像を取得し、再びテンプレートマッチッング処理を行う。
【００４４】
再びテンプレートマッチッングステップ１０３に戻った時、保存されているテンプレートは時刻ｔ０−２のテンプレートと時刻ｔ０−１のテンプレートの２つである（時刻は“１”進んでいるため“−１”が加えられる）。ここで、テンプレートマッチッングステップ１０３では時刻ｔ０の入力画像と、先ず時刻ｔ０−１のテンプレートとのテンプレートマッチング処理がなされ、分岐ステップ２０１に進む。
分岐ステップ２０１では、保存されているすべてのテンプレート画像すべてについてテンプレートマッチング処理がなされているかどうかを調べる。今、時刻ｔ０−１のテンプレートとのテンプレートマッチング処理がなされたが、まだ時刻ｔ０−２のテンプレートが残っている。従って、この時はステップ１０３に戻り、時刻ｔ０−２のテンプレートと時刻ｔ０の入力画像とのテンプレートマッチング処理を行う。このようにして、次々と残っているテンプレートとテンプレートマッチング処理を行い、すべてのテンプレートについてテンプレートマッチング処理が終了すれば、分岐ステップ２０１から処理ステップを最大一致度判定ステップ１０４´に進む。
【００４５】
最大一致度判定ステップ１０４´では、テンプレートマッチング処理によって、複数のテンプレート画像それぞれについてに得られた最大一致度の中から一番大きな値を選ぶ。そして、その一番大きな値の最大一致度が所定値（例えば、０．５）以上であった場合にはテンプレート位置補正ステップ１０５に進み、その一番大きな値の最大一致度が所定値未満であった場合には、入力画像中に対象物体が存在しなくなったものとして、物体検出処理ステップ１０１に戻る。
【００４６】
テンプレート位置補正ステップ１０５では、最大一致度判定ステップ１０４´において一番値の大きい最大一値度を得たテンプレート画像について入力画像のエッジ処理を行い、得られたエッジ画像から対象物体の位置を補正する。
そして、次の複数テンプレート保存ステップ２０２では、位置補正した時刻ｔ０の検出位置をもとに時刻ｔ０のテンプレートが新たに保存される。この時、既に初期テンプレート登録ステップ１０２において登録された時刻ｔ０−１のテンプレ−ト画像はそのまま保存される。
この複数テンプレート保存ステップ２０２で保存するテンプレート画像の数は、あらかじめ所定数（例えば、“３”）を定めておき、所定数を超える時は、一番古い時刻に取得したテンプレートを削除する。次にカメラ雲台制御ステップ１０７に進み、カメラの視野方向（光軸方向）を制御する。
そして次に、警報・モニタ表示ステップ１０８に進み、例えば警報を鳴らしたり、例えば監視モニタに対象物体の画像を表示したりする。
警報・モニタ表示ステップ１０８が終わると、画像入力ステップ４０１に戻り、新しい入力画像を得、再びテンプレートマッチッング処理を続ける。
上述の場合には、一番古い時刻に取得したテンプレートを削除したが、テンプレートマッチングステップ１０３において算出した中で、最小の一致度が得られたテンプレートを削除することでもよい。
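The bookkeeping of multiple-template storing step 202 and the selection in maximum coincidence determination step 104´ can be sketched as below. The class and function names are hypothetical, and the capacity of 3 follows the example value in the text; when the store is full, the sketch drops either the oldest template or, if matching scores are supplied, the template with the lowest score, which are the two deletion policies the text mentions.

```python
class TemplateStore:
    """Keeps at most `capacity` templates taken at different times."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self.templates = []  # oldest first

    def add(self, template, scores=None):
        """Store a new template; if full, drop the oldest or the worst-scoring one.

        scores, when given, lists the latest matching score of each stored template.
        """
        if len(self.templates) >= self.capacity:
            if scores is None:
                index = 0  # oldest template
            else:
                index = min(range(len(self.templates)), key=lambda i: scores[i])
            del self.templates[index]
        self.templates.append(template)


def best_match(results):
    """results: list of (r_max, dx, dy), one per template; return the best one."""
    return max(results, key=lambda item: item[0])
```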
【００４７】
この第２の実施例によれば、テンプレートマッチングステップ１０３によって検出された位置をもとに、対象物体に存在するエッジを検出し、そのエッジの密度が最大となる位置に検出位置を補正し、異なる時刻に得られた所定フレーム数分のテンプレート画像を独立にマッチングさせるため、対象物体が向きを変えたり、対象物体の前を別の物体が横切ったとしても、過去の複数のテンプレート画像を対象として、最大の一致度を持つ領域をテンプレートマッチング位置として補正するため、テンプレートの位置が対象物体からずれることがなく、また別の物体を追跡することなく、対象物体を正確に追跡することができる。
【００４８】
【発明の効果】
以上のように本発明によれば、向きが変化する対象物体を安定に追跡することができ、撮像装置を用いた監視装置の適用範囲を大きく広げることができる。
【図面の簡単な説明】
【図１】本発明の一実施例の処理動作を説明するためのフローチャート。
【図２】本発明の一実施例の処理動作を説明するためのフローチャート。
【図３】本発明が適用された監視装置の一実施例を示すブロック構成図。
【図４】従来の差分法による物体検出処理の一例を示すフローチャート。
【図５】従来のテンプレートマッチング法による物体追跡処理の一例を示すフローチャート。
【図６】従来の差分法による物体検出処理の動作を説明する図。
【図７】従来のテンプレートマッチング法による物体追跡処理の動作を説明する図。
【図８】従来のテンプレートマッチング法による物体追跡処理の問題点を説明する図。
【図９】本発明の物体追跡方法の一実施例を説明する図。
【図１０】本発明の物体追跡方法の一実施例を説明する図。
【図１１】画像とテンプレートマッチングで検出された対象物体の位置との関係を説明するための図。
【符号の説明】
３０１：ＴＶカメラ、３０２：カメラ雲台、３０３：画像入力Ｉ／Ｆ、３０４：雲台制御Ｉ／Ｆ、３０５：画像メモリ、３０６：ワークメモリ、３０７：ＣＰＵ、３０８：プログラムメモリ、３０９：出力Ｉ／Ｆ、３１０：画像出力Ｉ／Ｆ、３１１：警告灯、３１２：監視モニタ、３１３：データバス、６０１：入力画像、６０２：基準背景画像、６０３：差分処理された後の差分画像、６０４：二値化画像、６０５：画像、６０６：差分処理部、６０７：二値化処理部、６０８：画像抽出部、６０９：人型の物体、６１０：人型の差分画像、６１１：人型の二値化画像、６１２：外接矩形、６１３：初期テンプレート画像、７０１，７０３，７０５，７０７：画像、７０１ａ，７０３ａ，７０５ａ，７０７ａ：テンプレート画像、７０２，７０４，７０６，７０８：入力画像、７０２ａ，７０４ａ，７０６ａ，７０８ａ：テンプレートマッチング処理によって検出された物体の位置、７０２ｂ，７０４ｂ，７０６ｂ，７０８ｂ：１フレーム前でのテンプレート画像の位置、７０２ｃ，７０４ｃ，７０６ｃ，７０８ｃ：探索範囲、７０２ｄ，７０４ｄ，７０６ｄ，７０８ｄ：移動矢印、７０２ｅ：二値化画像、８０１，８０３，８０５，８０７：画像、８０１ａ，８０３ａ，８０５ａ，８０７ａ：テンプレート画像、８０２，８０４，８０６，８０８：入力画像、８０２ａ，８０４ａ，８０６ａ，８０９ａ：テンプレートマッチング処理によって検出された物体の位置、８０２ｂ，８０４ｂ，８０６ｂ，８０８ｂ：１フレーム前でのテンプレート画像の位置、９０１：入力画像、９０２：エッジ画像、９０３ａ：領域、９０３ｂ，９０３ｃ：投影像、９０３：説明のために領域９０３ａで切出したエッジ画像に投影像９０３ｂと９０３ｃとを重ねて表示した図、９０４：ｘ軸への投影像９０３ｂを表す図、９０４ａ：テンプレートマッチングによって得られた検出位置を表す範囲、９０４ｂ：累積投影位置が最大となる範囲、９０５：ｙ軸への投影像９０３ｃを表す図、９０５ａ：テンプレートマッチングによって得られた検出位置を表す範囲、９０５ｂ：累積投影位置が最大となる範囲、１００１，１００３，１００５，１００７：画像、１００１ａ，１００３ａ，１００５ａ，１００７ａ：テンプレート画像、１００２，１００４，１００６，１００８：入力画像、１００２ａ，１００４ａ，１００６ａ，１００９ａ：テンプレートマッチング処理によって検出された物体の位置、１００２ｂ，１００４ｂ，１００６ｂ，１００８ｂ：１フレーム前でのテンプレート画像の位置。

[0001]
[Technical field of the invention]
The present invention relates to a monitoring apparatus that uses an imaging device. In particular, it relates to an object tracking method that automatically detects an object entering the imaging field of view from the video signal supplied by the imaging device and automatically tracks the movement of the detected object, and to an object tracking apparatus that adjusts the viewing direction (the direction of the imaging center) according to the movement of the detected object.
[0002]
[Prior art]
Video monitoring apparatuses using an imaging device such as a camera have long been in wide use. In surveillance systems built on them, however, there is a growing demand to replace manned monitoring, in which an operator detects and tracks intruding objects such as people or automobiles entering the monitored field of view while watching the images on a monitor, with a system that automatically detects an intruding object from the images supplied by an image input means such as a camera, automatically tracks its movement, and issues a predetermined notification or alarm.
[0003]
To realize such a system, an intruding object in the field of view is first detected by, for example, the difference method. The difference method compares an input image obtained by an imaging device such as a television camera (hereinafter, TV camera) with a reference background image prepared in advance, that is, an image in which the object to be detected does not appear, computes the difference in luminance for each pixel, and detects regions with large difference values as objects. The partial image of the input image at the position of the detected intruding object is registered as a template, and the position with the highest degree of coincidence with the template image is then detected in each sequentially input image. This method is widely known as template matching and is described, for example, on pp. 149 to 153 of "Introduction to Computer Image Processing", supervised by Hideyuki Tamura and published by Soken Shuppan in 1985.
When a target object is tracked by template matching, the image at the position detected by the matching process is normally adopted as the new template at each step, so that the template follows changes in the posture of the target object. These processes are described with reference to FIGS. 4 to 7.
[0004]
FIG. 4 is a flowchart showing an example of intruding-object detection processing using the difference method, FIG. 5 is a flowchart showing an example of intruding-object tracking using template matching, and FIG. 6 is a diagram that uses example images to explain the flow from the intruding-object detection processing of FIGS. 4 and 5 up to registration of the initial template image. FIG. 7 is a diagram that uses example images to explain the flow of the intruding-object tracking processing of FIG. 5, showing how processing proceeds on images input at fixed time intervals starting from the initially given template image (that is, how the initial template changes).
[0005]
In FIG. 6, 601 is an input image, 609 is a human-shaped object in the input image 601, 602 is a reference background image, 606 is a difference processing unit, 603 is the difference image produced by the difference processing unit 606, 610 is the human-shaped difference image in the difference image 603 corresponding to the human-shaped object 609, 607 is a binarization processing unit, 604 is the binarized image obtained by binarizing the difference image 603 in the binarization processing unit 607, 611 is the human-shaped object (human-shaped binarized image) in the binarized image 604 corresponding to the human-shaped difference image 610, 612 is the circumscribed rectangle of the human-shaped binarized image 611, 608 is an image extraction unit, 605 is an image illustrating that the region enclosed by the circumscribed rectangle 612 is cut out of the input image 601 as a template image, and 613 is the initial template image cut out of the input image 601.
[0006]
In FIGS. 4 and 6, an input image 601 of, for example, 320 × 240 pixels is first input from the TV camera (image input step 401). Next, the difference processing unit 606 computes the pixel-by-pixel difference between the input image 601 and the previously created reference background image 602 to obtain the difference image 603. The human-shaped object 609 in the input image 601 then appears in the difference image 603 as the human-shaped difference image 610 (difference processing step 402). The binarization processing unit 607 then replaces each pixel of the difference image 603 whose difference value is below a predetermined threshold with “0” and each pixel whose difference value is at or above the threshold with “255” (one pixel being 8 bits), giving the binarized image 604. The human-shaped object 609 captured in the input image 601 is thereby detected as the human-shaped object 611 in the binarized image 604, and the circumscribed rectangle 612 of the human-shaped object 611 is generated (binarization processing step 403).
Next, in object presence determination step 404, the image extraction unit 608 looks for a cluster of pixels whose value is “255” in the binarized image 604. If such a cluster exists, the object detection processing ends, and the partial image of the input image corresponding to the circumscribed rectangle of the cluster is registered as the initial template image 613 in the image memory 305 (FIG. 3), described later. If no such cluster exists, processing branches back to image input step 401.
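Steps 402 to 404 can be illustrated with the following minimal sketch. It is not the patent's implementation: the function name and the threshold value of 20 are assumptions, images are plain nested lists of 8-bit luminance values, and all above-threshold pixels are treated as a single cluster rather than being separated by connected-component labeling.

```python
def detect_object(frame, background, threshold=20):
    """Return the bounding rectangle (x, y, w, h) of changed pixels, or None.

    frame and background are equally sized 2-D lists of 8-bit luminance values.
    """
    height, width = len(frame), len(frame[0])
    # Difference processing (step 402) followed by binarization (step 403).
    binary = [
        [255 if abs(frame[y][x] - background[y][x]) >= threshold else 0
         for x in range(width)]
        for y in range(height)
    ]
    # Object presence determination (step 404): collect the "255" pixels.
    xs = [x for y in range(height) for x in range(width) if binary[y][x] == 255]
    ys = [y for y in range(height) for x in range(width) if binary[y][x] == 255]
    if not xs:
        return None  # no object: the caller would fetch the next frame
    x0, y0 = min(xs), min(ys)
    # Circumscribed rectangle of the detected pixels.
    return (x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1)
```

The returned rectangle is what would be cut out of the input image and registered as the initial template.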
[0007]
The flow of the object tracking processing is described with reference to FIG. 5. The processing that follows object detection and registration of the initial template image, performed in object detection processing step 101 and initial template registration step 102 of FIG. 5 as described with FIGS. 4 and 6, is described with reference to FIG. 7.
After the object detection processing described with FIGS. 4 and 6 ends, the object tracking processing is performed according to the flowchart of FIG. 5.
[0008]
In FIG. 7, 701a is the template image of the object updated at time t0−1, and 701 is a view showing the position of the template image 701a in the input image at time t0−1. 702 is the input image at time t0; 702a is the position of the object (template image) detected by the template matching process at time t0; 702b is the position of the template image one frame earlier (t0−1); 702c is the search range searched by the template matching process (for example, the flowchart of FIG. 5); 702d is a movement arrow indicating the direction and trajectory along which the human-shaped object moved from time t0−1 to t0 (for example, an arrow from the center of the template image 701a to the center of the template image 702a); and 702e is the position at which the template matching process detected the human-shaped object. 703a is the template image of the object updated at time t0, and 703 is a view showing the position of the template image 703a in the input image at time t0. 704 is the input image at time t0+1; 704a is the position of the object (template image) detected by the template matching process at time t0+1; 704b is the position of the template image one frame earlier (t0); 704c is the search range searched by the template matching process; and 704d is a movement arrow indicating the direction and trajectory along which the human-shaped object moved from time t0−1 to t0+1 (for example, an arrow from the center of the template image 701a through the center of the template image 703a to the center of 704a). 705a is the template image of the object updated at time t0+1, and 705 is a view showing the position of the template image 705a in the input image at time t0+1. 706 is the input image at time t0+2; 706a is the position of the object (template image) detected by the template matching process at time t0+2; 706b is the position of the template image one frame earlier (t0+1); 706c is the search range searched by the template matching process; and 706d is a movement arrow indicating the direction and trajectory along which the human-shaped object moved from time t0−1 to t0+2 (for example, an arrow from the center of the template image 701a through the centers of the template images 703a and 705a to the center of 706a). 707a is the template image of the object updated at time t0+2, and 707 is a view showing the position of the template image 707a in the input image at time t0+2. 708 is the input image at time t0+3; 708a is the position of the object (template image) detected by the template matching process at time t0+3; 708b is the position of the template image one frame earlier (t0+2); 708c is the search range searched by the template matching process; and 708d is a movement arrow indicating the direction and trajectory along which the human-shaped object moved from time t0−1 to t0+3 (for example, an arrow from the center of the template image 701a through the centers of the template images 703a, 705a, and 707a to the center of 708a).
[0009]
That is, in FIGS. 5 and 7, the object tracking processing starts, an object is determined to exist in the binarized image 604, and object detection processing step 101 ends (FIG. 4). The partial image of the input image 601 corresponding to the circumscribed rectangle of the human-shaped cluster in the binarized image 604 is then registered in the image memory 305 (FIG. 3) as the initial template image 613 (template image 701a in FIG. 7) (initial template registration step 102). Subsequently, within the search range 702c of each sequentially input image, the partial image 702a whose degree of coincidence r(Δx, Δy) with the template image 701a is largest is detected (template matching step 103).
In other words, template matching step 103 yields the maximum degree of coincidence and the position at which it was obtained.
[0010]
As a method of calculating the degree of coincidence r (Δx, Δy), for example, an index called a normalized correlation obtained by the following equation (1) can be used.
(Equation 1)

[0011]
Here, when template matching is performed on the input image 702, f(x, y) denotes the input image 702, f_t(x, y) denotes the template image 701a, (x_0, y_0) denotes the upper-left coordinates of the registered template image 701a (the image origin is the upper-left corner), and D_t denotes the template search range 702c. If the search range 702c contains an image with exactly the same pixel values as the template image 701a, the degree of coincidence r(Δx, Δy) is “1.0”. In template matching step 103, the index given by equation (1) is computed over the search range 702c, expressed as (Δx, Δy) ∈ D, and the position (circumscribed rectangle) 702a at which r(Δx, Δy) is largest is detected. The search range 702c is determined by the apparent amount of movement of the target object. For example, when an object moving at 40 km/h is monitored by a TV camera 50 m away (CCD with an element size of 6.5 mm × 4.8 mm, lens with a focal length of 25 mm, input image size of 320 × 240 pixels (pix), processing interval of 0.1 sec/frame), the apparent movement of the object is 27.4 pix/frame horizontally and 27.8 pix/frame vertically, so D may be set to about −30 pix < Δx < 30 pix, −30 pix < Δy < 30 pix.
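The apparent movement figures quoted above can be checked with a short calculation. The sketch assumes a simple pinhole camera model and an object moving parallel to the image plane along the axis being considered, and it takes the processing interval as 0.1 s per frame, which is the reading that reproduces the quoted 27.4 and 27.8 pix/frame. The function and parameter names are illustrative only.

```python
def apparent_motion(speed_kmh, distance_m, focal_mm, sensor_mm, image_pix, interval_s):
    """Apparent movement in pixels per frame along one image axis."""
    speed_m_per_s = speed_kmh * 1000.0 / 3600.0
    # Extent of the scene covered by the sensor at the given distance (pinhole model).
    field_m = distance_m * sensor_mm / focal_mm
    pix_per_m = image_pix / field_m
    return speed_m_per_s * interval_s * pix_per_m


horizontal = apparent_motion(40, 50, 25, 6.5, 320, 0.1)  # about 27.4 pix/frame
vertical = apparent_motion(40, 50, 25, 4.8, 240, 0.1)    # about 27.8 pix/frame
```

Both values fall just under 30 pixels, which is why a search range of roughly ±30 pix in each direction is sufficient for this configuration.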
The method of calculating the degree of coincidence is not limited to the normalized correlation index described above. For example, the pixel-by-pixel difference between the input image and the template image may be taken, and the reciprocal of the accumulated absolute differences may be used as the degree of coincidence.
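Template matching step 103 can be sketched as an exhaustive search over the offsets (Δx, Δy). Because equation (1) itself is not reproduced in this text, the sketch uses one common form of normalized correlation, the sum of products divided by the square root of the product of the two sums of squares; this may differ in detail from the patent's equation, although it likewise yields 1.0 for an exact match. The function names, the default search half-width of 30, and the skipping of windows that leave the image are assumptions of the sketch.

```python
import math


def normalized_correlation(image, template, left, top):
    """Correlation between the template and the image patch whose upper-left corner is (left, top)."""
    num = sum_ff = sum_tt = 0.0
    for j, row in enumerate(template):
        for i, t in enumerate(row):
            f = image[top + j][left + i]
            num += f * t
            sum_ff += f * f
            sum_tt += t * t
    denom = math.sqrt(sum_ff * sum_tt)
    return num / denom if denom > 0 else 0.0


def match_template(image, template, x0, y0, search=30):
    """Search offsets |dx|, |dy| < search around (x0, y0); return (r_max, dx, dy)."""
    th, tw = len(template), len(template[0])
    height, width = len(image), len(image[0])
    best = (-1.0, 0, 0)
    for dy in range(-search + 1, search):
        for dx in range(-search + 1, search):
            left, top = x0 + dx, y0 + dy
            if left < 0 or top < 0 or left + tw > width or top + th > height:
                continue  # window would leave the image
            r = normalized_correlation(image, template, left, top)
            if r > best[0]:
                best = (r, dx, dy)
    return best
```

A sum-of-absolute-differences score, as mentioned in the text, could be substituted for `normalized_correlation` without changing the search loop.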
[0012]
Next, in the maximum matching degree determination step 104, the object is taken to have moved to the position in input image 702 where template matching step 103 found the maximum degree of coincidence with template image 701a (that is, from circumscribed rectangle 702b to circumscribed rectangle 702a). If the maximum matching degree is less than a predetermined value (for example, "0.5"), it is determined that the target object has disappeared from the input image, and the process branches to the object detection processing step 101. If the maximum matching degree is equal to or greater than the predetermined value, the process branches to the template update step 106.
In the template update step 106, template image 701a is updated to template image 703a using partial image 702a, which has the maximum degree of coincidence r(Δx, Δy) within search range 702c of the input image. The template image is updated because the appearance of the target object changes with its posture (for example, when a person raises a hand, bends at the waist, or raises a leg); without updating, the degree of coincidence decreases and the tracking result becomes less reliable. Therefore, partial image 702e at the detected position of the object becomes the new template image 703a, so that stable tracking is possible even when the target object changes its posture.
[0013]
In the above description, the template image for an intruding object detected by the difference method is created by obtaining the circumscribed rectangle of the cluster of detected pixels and cutting out the partial image enclosed by that rectangle.
However, the method for determining the size of the template image to be cut out is not limited to this method. For example, the circumscribed rectangle may be multiplied by a certain constant (for example, 0.8 or 1.1).
Further, when a CCD (Charge Coupled Device) is used as an image pickup device, the size of an object regarded as a target can be calculated in advance based on the size of the CCD, the focal length of the lens, and the distance of the object detected from the CCD. The determined size can be used as the template image size.
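As a rough illustration of sizing the template from the optics, the sketch below converts an assumed physical object size into pixels using a pinhole camera model. The camera values are those of the earlier example in the text; the object dimensions (0.5 m wide, 1.7 m tall) and the function name are hypothetical.

```python
def object_size_pix(object_m, distance_m, focal_mm, sensor_mm, image_pix):
    """Size in pixels of an object of physical size object_m at distance_m,
    under a pinhole-camera model (an assumption)."""
    field_m = sensor_mm / focal_mm * distance_m  # scene extent covered by the sensor
    return object_m * image_pix / field_m

# Hypothetical target: a person about 0.5 m wide and 1.7 m tall at 50 m.
template_width = object_size_pix(0.5, 50, 25, 6.5, 320)
template_height = object_size_pix(1.7, 50, 25, 4.8, 240)
print(round(template_width), round(template_height))
```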
[0014]
Next, the process proceeds to a camera head control step 107.
FIG. 11 is a diagram for explaining the relationship between the input image and the position of the target object detected by template matching. The camera head control step 107 will be described with reference to FIG. 11.
In the camera head control step 107, the pan and tilt motors of the camera platform 302 are controlled according to the displacement between the position of the target object detected by template matching and the center of the input image, that is, according to the direction in which the target object lies relative to the optical axis of the camera (the viewing direction of the camera).
That is, the center position of the detected target object (x₀ + Δx + dx/2, y₀ + Δy + dy/2), where (dx, dy) is the size of the template image, is compared with the center position (160, 120) of the input image (for an image size of 320 × 240). When the center of the detected target object is to the left of the center of the input image, the pan motor of the camera platform is controlled so that the optical axis of the camera moves to the left; when it is to the right, the pan motor is controlled so that the optical axis moves to the right. When the center of the detected target object is above the center of the input image, the tilt motor of the camera platform is controlled so that the optical axis of the camera moves upward; when it is below, the tilt motor is controlled so that the optical axis moves downward.
Note that the pan motor and the tilt motor can be controlled simultaneously. For example, when the center of the detected target object is to the upper left of the center of the input image, the pan motor of the camera platform is controlled so that the optical axis of the camera moves to the left, and the tilt motor is simultaneously controlled so that the optical axis moves upward. This makes it possible to control the camera platform so that the target object is captured on the optical axis of the camera.
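The direction decision described above can be sketched as a small function. This is only an illustration: the function name, the string commands, and the optional `dead_zone` parameter (a tolerance around the image center, default 0) are assumptions and are not part of the specification. The image origin is at the upper left, so a smaller y means "above".

```python
def pan_tilt_command(x0, y0, shift_x, shift_y, dx, dy,
                     image_size=(320, 240), dead_zone=0):
    """Return (pan, tilt) directions that move the optical axis toward the
    center of the detected target object."""
    # Center of the detected object: (x0 + Δx + dx/2, y0 + Δy + dy/2).
    cx = x0 + shift_x + dx / 2
    cy = y0 + shift_y + dy / 2
    icx, icy = image_size[0] / 2, image_size[1] / 2
    if cx < icx - dead_zone:
        pan = "left"
    elif cx > icx + dead_zone:
        pan = "right"
    else:
        pan = "hold"
    if cy < icy - dead_zone:
        tilt = "up"
    elif cy > icy + dead_zone:
        tilt = "down"
    else:
        tilt = "hold"
    return pan, tilt

# Object centered at (115, 62): upper left of the image center (160, 120).
command = pan_tilt_command(100, 50, 5, -3, 20, 30)
print(command)
```

Returning both directions at once reflects the note that the pan and tilt motors can be driven simultaneously.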
Next, in the alarm / monitor display step 108, for example, when the target object is within a predetermined alarm range, an alarm is sounded or an image of the target object is displayed on the monitoring monitor.
[0015]
When the alarm / monitor display step 108 is completed, the process returns to the image input step 401 to obtain a new input image and perform the template matching process again on the input image at the current time. That is, template matching processing is performed using the template image 703a updated by the input image 702 at time t0 and the input image 704 at time t0 + 1. At this time, the search range 704c has moved to a position centered on the template image 704b updated at time t0, and a search is performed in a new search range. Then, the object having the highest degree of coincidence is detected, and a new template image 705a is generated based on the position 704a of the detected object.
As described above, while the target object exists, the processing of steps 401, 103, 104, 106, 107, and 108 is repeated, and the target object is tracked while the template image is updated to new template images 706a, 708a, and so on.
[0016]
In the tracking method of an intruding object using the template matching described above, when the direction of the target object changes (for example, when the target object person turns right or backward), the deviation between the target object and the matching position becomes large. Therefore, there is a problem that accurate and stable tracking cannot be performed.
That is, template matching has the property that it aligns the high-contrast pattern portions of the template. For example, when the target object is a vehicle that initially faces the front, almost the whole vehicle is subject to matching (input image 802 in FIG. 8). When the traveling direction then changes and the vehicle turns sideways, only the front part of the vehicle is subject to matching, and the matching center moves from the center of the vehicle to its front part compared with the case where the whole vehicle was matched.
[0017]
This will be described with reference to FIG. 8. FIG. 8 is a diagram illustrating a case where a vehicle traveling along a curved road in the imaging field of view is the intruding object, in order to explain the flow of the intruding object tracking process using example images.
Reference numerals 801a, 803a, 805a, and 807a denote the template images at time t1-1, time t1, time t1+1, and time t1+2, respectively, and 801, 803, 805, and 807 denote the updated positions of template images 801a, 803a, 805a, and 807a, respectively. 802, 804, 806, and 808 are the input images at time t1, time t1+1, time t1+2, and time t1+3, respectively. 802a, 804a, 806a, and 809a are the positions of the object detected by template matching at time t1, time t1+1, time t1+2, and time t1+3, respectively. 802b, 804b, 806b, and 808b are the positions of the template images one frame earlier (the template images at time t1-1, time t1, time t1+1, and time t1+2, respectively).
In FIG. 8, template image 801a registered at time t1-1 is an image in which the front of the moving vehicle faces almost directly forward. At time t1, template matching is performed using template image 801a to detect the position to which the target object has moved, and template image 801a is updated to template image 803a. Subsequently, the template is updated to template image 805a at time t1+1 and to template image 807a at time t1+2. When this processing is continued until time t1+3, the template, which was matched to the front part of the vehicle with its lights and the like at tracking start time t1, is matched at a position shifted toward the left side of the vehicle at time t1+3.
[0018]
This phenomenon occurs because matching works to reduce the positional deviation between the input image and the high-contrast image portion of the template image; in this example, that portion is the lights of the vehicle. Therefore, as shown in FIG. 8, when the target object turns from right to left, the template shifts to the left, and when it turns from left to right, the template shifts to the right.
Further, at time t1 template image 801a contains only the image of the vehicle, but as the target object changes direction and the template position shifts, the image of the background other than the target object enters template image 807a. If tracking is continued with a template image that contains a large amount of image content other than the target object, such as template image 807a, the template no longer matches the target object and instead matches the background portion contained in the template. Therefore, in the object tracking method using template matching, when the direction of the target object changes, the pattern of the target object apparently moves and the position of the template shifts as it is pulled by that pattern. As a result, it cannot be guaranteed that the target object is being tracked, and stable tracking cannot be performed.
[0019]
[Problems to be solved by the invention]
The above-described conventional technique has a disadvantage that stable tracking cannot be performed when the direction of the target object changes greatly.
An object of the present invention is to eliminate the above drawback and to provide a highly reliable, stably operating object tracking method and device capable of accurately detecting and tracking an object even when the direction of the target object changes greatly.
[0020]
[Means for Solving the Problems]
To achieve the above object, the object tracking method of the present invention is a method of detecting an object in the imaging field of view from input images sequentially acquired by an imaging device and tracking the detected object, and comprises: a matching step of performing template matching between the input image acquired at the current time and a template image, and detecting, in the input image acquired at the current time, the position of the partial image having the maximum degree of coincidence with the template image; a matching position correction step of performing edge detection on an extended partial image region that includes the partial image at the position detected by the matching step and is larger than that partial image by a predetermined size, and correcting the position of the partial image found by the edge detection to be the detected position of the object at the current time; and a template update step of updating the template image with the partial image at the position corrected by the matching position correction step as a new template. The object in the imaging field of view is thereby tracked.
In the object tracking method according to the present invention, the position correction step detects a position of a partial image having a maximum edge density in an extended partial image region of the input image at the current time.
[0021]
Further, in the object tracking method of the present invention, the position correction step extracts the edge components contained in the region of the extended partial image, accumulates the edge component amounts along the y-axis direction and along the x-axis direction onto the x-axis and the y-axis, respectively, and detects the range of maximum edge density from the accumulated edge component amounts on the x-axis and the y-axis.
Further, in the object tracking method of the present invention, the size of the template image can be determined based on the apparent movement amount of the object within the field of view of the image.
[0022]
The object tracking method of the present invention also includes an initial template registration step of detecting an object from the input image by a difference method, and registering a partial image of a predetermined size of the input image including at least a part of the detected object as a template image, Tracking is performed using the object detected by the difference method as a tracking target object.
Further, as another method, if the acquired maximum matching degree is less than a predetermined value in the position correction step, an object is detected from the input image at the current time by the difference method, and the detected object is tracked as the tracking target object.
[0023]
Still further, the object tracking method of the present invention includes a camera pan head control step of generating a control signal for changing the viewing direction of the imaging device based on the position detected by the position correction step, and by means of the control signal the viewing direction of the imaging device is always directed toward the detected position while the detected object is tracked.
[0024]
The object tracking device of the present invention comprises an imaging device that sequentially images the area to be monitored, an image input interface that sequentially converts the video signal acquired by the imaging device into image signals, image processing means for processing the image signals converted by the image input interface, and a storage device for storing registered template images. The image processing means performs template matching on the image signal input from the imaging device at the current time using a template image registered in the storage device in advance, and detects, in the image signal input at the current time, the position of the partial image having the highest degree of coincidence with the template image. Within an extended partial image region of the image signal input at the current time, which includes the partial image at the detected position and is larger than it by a predetermined size, the image processing means detects the position of the partial image having the maximum edge density, sets that position as the detected position of the object at the current time, and updates the template with the partial image at the detected position at the current time as the new template image, thereby tracking an object that has entered the imaging field of view of the imaging device.
[0025]
The object tracking device of the present invention further comprises a pan head for changing the viewing direction of the imaging device, and a pan head control interface for supplying the control signal with which the image processing means controls the pan head in order to change the viewing direction of the imaging device. The image processing means detects the direction of the object from the detected position of the object at the current time and, based on the obtained direction, adjusts the viewing direction of the imaging device via the pan head control interface, thereby tracking an object that has entered the imaging field of view of the imaging device.
[0026]
[Best Mode for Carrying Out the Invention]
To solve the conventional problem that the pattern of the target object apparently moves and the position of the template shifts as it is pulled by that pattern, the object tracking method of the present invention uses the property that the target object has more edge components than the background portion, and corrects the position of the template image updated in the tracking process based on the density of the edge image of the input image. That is, the present invention detects an object by the difference method, holds an image of the detected object as a template, and tracks the target object while correcting the detected position to the position where the density of the edge image is maximum in the vicinity of the detected position. Tracking can therefore be performed stably even when the direction of the target object changes.
[0027]
FIG. 3 shows an example of the hardware configuration of an object tracking device common to the embodiments of the present invention. 301 is a TV camera, 303 is an image input I/F, 313 is a data bus, 305 is an image memory, 306 is a work memory, 307 is a CPU, 308 is a program memory, 302 is a camera platform, 304 is a camera platform control I/F, 309 is an output I/F, 310 is an image output I/F, 311 is a warning light, and 312 is a monitoring monitor. The TV camera 301 is connected to the image input I/F 303, the camera platform 302 is connected to the camera platform control I/F 304, the warning light 311 is connected to the output I/F 309, and the monitoring monitor 312 is connected to the image output I/F 310. The image input I/F 303, the camera platform control I/F 304, the image memory 305, the work memory 306, the CPU 307, the program memory 308, the output I/F 309, and the image output I/F 310 are connected to the data bus 313. The TV camera 301 is mounted on the camera platform 302.
[0028]
In FIG. 3, the TV camera 301 captures an image of the monitoring target (viewing range). The captured video signal is stored in the image memory 305 from the image input I/F 303 via the data bus 313. The CPU 307 analyzes the image stored in the image memory 305, using the work memory 306, according to the program stored in the program memory 308. According to the processing result, the CPU 307 controls the camera platform 302 from the data bus 313 via the camera platform control I/F 304 to change the field of view of the TV camera 301, turns on the warning light 311 via the output I/F 309, and displays, for example, an intruding object detection result image on the monitoring monitor 312 via the image output I/F 310. Note that the image memory 305 also includes a template image holding device for storing the registered template images.
[0029]
The flowcharts described below are all explained using the hardware configuration of the object tracking and monitoring device shown in FIG. 3.
A first embodiment of the present invention will be described with reference to FIG. 1. FIG. 1 is a flowchart illustrating the processing of one embodiment of the present invention. FIG. 1 adds a template position correction step 105 to the processing of the template matching method shown in FIG. 5. Steps 101, 102, 401, 103, 104, 106, 107, and 108 are the same as those already described.
[0030]
If the maximum matching degree is equal to or greater than the predetermined value in the maximum matching degree determination step 104, the process proceeds to the template position correction step 105. The processing in the template position correction step 105 will be described using input image 804 obtained at time t1+1 in FIG. 8, and FIG. 9. FIG. 9 is a diagram illustrating a case where a vehicle traveling along a curved road in the imaging field of view is the intruding object, in order to explain the flow of the intruding object tracking process using example images, and shows the processing applied to input image 804. Reference numeral 901 denotes an input image identical to input image 804 in FIG. 8; 902 is the edge image extracted from input image 901 using a differential filter (not shown); 903a is a search area; 903b is the projected image in the horizontal (x-axis) direction; 903c is the projected image in the vertical (y-axis) direction; 903 is a diagram in which projected images 903b and 903c are superimposed, for explanation, on the edge image cut out in area 903a; 804a is the range representing the detected position obtained by template matching, already shown in FIG. 8; 904 is a graph showing projected image 903b on the x-axis; 904a is the range showing the detected position obtained by template matching; 904b is the range in which the cumulative projected value is maximum; 905 is a graph showing projected image 903c on the y-axis; 905a is the range showing the detected position obtained by template matching; and 905b is the range in which the cumulative projected value is maximum.
[0031]
In FIG. 9, in the template position correction step 105, edge extraction is performed on input image 901 to obtain edge image 902. The edge extraction is performed, for example, by applying a differential filter such as Sobel or Roberts to input image 901 and binarizing the resulting image (edge portions are set to "255" and all other portions to "0"). This processing is described, for example, on pages 118 to 125 of the book "Introduction to Computer Image Processing", supervised by Hideyuki Tamura and published by Soken Shuppan in 1985.
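A minimal sketch of this edge extraction, using a Sobel filter followed by binarization, is shown below. It uses only NumPy; the threshold value, the use of the gradient magnitude, and the replicated-border handling are assumptions chosen for illustration and are not specified in the text.

```python
import numpy as np

def sobel_edge_image(gray, threshold=100.0):
    """Apply 3x3 Sobel filters and binarize: edge pixels become 255, others 0."""
    g = gray.astype(np.float64)
    p = np.pad(g, 1, mode="edge")  # replicate border pixels (assumption)
    # Horizontal gradient: right column minus left column, weights 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical gradient: bottom row minus top row, weights 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    magnitude = np.hypot(gx, gy)
    return np.where(magnitude >= threshold, 255, 0).astype(np.uint8)

# Synthetic image with a vertical step between columns 3 and 4.
gray = np.zeros((8, 8), dtype=np.uint8)
gray[:, 4:] = 200
edges = sobel_edge_image(gray)
```

In this synthetic example the edge pixels appear in the two columns on either side of the step, and nowhere else.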
[0032]
Next, search area 903a is cut out of edge image 902. Search area 903a (dotted-line frame in 903 of FIG. 9: upper left coordinates (x₀−d, y₀−d), size (dx+2d, dy+2d)) is obtained by expanding the range of detection position 804a found by template matching step 103 (solid-line frame in 903: upper left coordinates (x₀, y₀), size (dx, dy)) by a predetermined number of pixels d on the top, bottom, left, and right (d is the allowable deviation of the matching position caused by a change in the direction of the target object, for example d = 5 pix). Projected image 903b of the edge image onto the x-axis and projected image 903c of the edge image onto the y-axis are then obtained. Search area 903a is therefore an extended partial image that includes the range of detection position 804a.
[0033]
In graph 904, the horizontal axis is the horizontal (x-axis) direction, and the vertical axis is the value hx(x) of projected image 903b of the edge image for each pixel (pix) in the x-axis direction. In graph 905, the horizontal axis is the vertical (y-axis) direction, and the vertical axis is the value hy(y) of projected image 903c of the edge image for each pixel (pix) in the y-axis direction.
The projection value hx(x₀) of projected image 903b at x = x₀ is obtained by counting, in the edge image cut out in search area 903a, the pixels (x, y) with x = x₀ and y₀−d < y < y₀+dy+d whose pixel value is "255". Likewise, the projection value hy(y₀) of projected image 903c at y = y₀ is obtained by counting, in the edge image cut out in search area 903a, the pixels (x, y) with y = y₀ and x₀−d < x < x₀+dx+d whose pixel value is "255". Graph 904 shows projected image 903b on the x-axis, and range 904a shows the detected position obtained by template matching. Range 904b is the range in which the cumulative projection value is maximum, that is, the range x₁ < x < x₁+dx, and this position is obtained by the following equation (2).
[0034]
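(Equation 2)

$$x_1 = \underset{x_0 - d \,<\, x_1 \,<\, x_0 + d}{\arg\max} \left\{ \sum_{x_1 \,<\, x \,<\, x_1 + dx} h_x(x) \right\}$$
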
[0035]
Equation (2) states that, as x₁ is varied over x₀−d < x₁ < x₀+d, the x₁ that maximizes the accumulated value of hx(x) over x₁ < x < x₁+dx is obtained. Similarly, the range y₁ < y < y₁+dy in which the accumulated edge value is maximum is obtained for the projected image on the y-axis. The position of the target object detected by template matching step 103 (upper left coordinates (x₀, y₀)) is thus corrected by the template position correction step 105 to the corrected position (upper left coordinates (x₁, y₁)).
In the present embodiment, as expressed by equation (2), x₁ is varied over x₀−d < x₁ < x₀+d to find the x₁ for which the accumulated value of hx(x) over x₁ < x < x₁+dx is largest. Alternatively, while x₁ is being varied over x₀−d < x₁ < x₀+d, the calculation of equation (2) may be stopped as soon as the value in braces in equation (2) exceeds a predetermined threshold, and the x₁ at that point may be used as the corrected position of the template. In this case, the predetermined threshold is set to, for example, 30% of the maximum accumulated value 255 × (dy + 2d) (255 × (dx + 2d) for the y-axis); the position obtained is then a part containing at least 30% of the maximum accumulated edge value, and the amount of calculation for equation (2) can be reduced.
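The position correction described above can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure of the specification: the window start is allowed to take every position that fits inside the search area (closed bounds), the search area is clipped to the image, and `stop_threshold` is a hypothetical parameter expressed in the same units as the accumulated projection values (edge-pixel counts here). All function and parameter names are illustrative.

```python
import numpy as np

def best_window_start(h, width, stop_threshold=None):
    """Start index s maximizing sum(h[s:s+width]). If stop_threshold is given,
    return the first s whose window sum exceeds it (early termination)."""
    best_start, best_sum = 0, -1
    for s in range(0, len(h) - width + 1):
        total = int(h[s:s + width].sum())
        if total > best_sum:
            best_start, best_sum = s, total
        if stop_threshold is not None and total > stop_threshold:
            return s
    return best_start

def correct_position(edges, x0, y0, dx, dy, d=5, stop_threshold=None):
    """Correct the matched position (x0, y0) of a dx-by-dy template to the
    position of maximum edge density inside the area expanded by d pixels."""
    top, left = max(y0 - d, 0), max(x0 - d, 0)   # clip to the image (assumption)
    area = edges[top:y0 + dy + d, left:x0 + dx + d] == 255
    hx = area.sum(axis=0)   # edge pixels per column (projection onto the x-axis)
    hy = area.sum(axis=1)   # edge pixels per row (projection onto the y-axis)
    x1 = left + best_window_start(hx, dx, stop_threshold)
    y1 = top + best_window_start(hy, dy, stop_threshold)
    return x1, y1

# Synthetic edge image: a dense 12 x 10 block of edge pixels at x=25, y=12,
# while template matching is assumed to have reported (22, 14).
edges = np.zeros((40, 60), dtype=np.uint8)
edges[12:22, 25:37] = 255
corrected = correct_position(edges, x0=22, y0=14, dx=12, dy=10, d=5)
early = correct_position(edges, x0=22, y0=14, dx=12, dy=10, d=5, stop_threshold=36)
print(corrected, early)
```

In this synthetic case the full search recovers the true position of the edge block, while the early-terminating variant stops at the first window that exceeds the threshold and so returns a coarser position in exchange for fewer window evaluations.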
[0036]
The effect of the embodiment of the present invention will be described with reference to FIG. 10. FIG. 10 is a diagram illustrating a case where a vehicle traveling along a curved road in the imaging field of view is the intruding object, in order to explain the flow of the intruding object tracking process using example images. It shows the same conditions as FIG. 8 with the positional deviation correction process added after the matching process. 1001a, 1003a, 1005a, and 1007a denote the template images at time t1-1, time t1, time t1+1, and time t1+2, respectively, and 1001, 1003, 1005, and 1007 denote the updated positions of template images 1001a, 1003a, 1005a, and 1007a, respectively. 1002, 1004, 1006, and 1008 are the input images at time t1, time t1+1, time t1+2, and time t1+3, respectively. 1002a, 1004a, 1006a, and 1009a are the positions of the object detected by template matching at time t1, time t1+1, time t1+2, and time t1+3, respectively. 1002b, 1004b, 1006b, and 1008b are the positions of the template images one frame earlier (the template images at time t1-1, time t1, time t1+1, and time t1+2, respectively).
[0037]
In the method shown in FIG. 8, template matching is performed so as to reduce the displacement between the input image and the high-contrast image portion of the template image (the front part of the vehicle in the example of FIG. 8). Consequently, in a scene where the direction of the target object changes, the position of the front part of the vehicle within the template image does not change, but the proportion of pixels belonging to the image other than the target object (the background image) increases each time the tracking process is repeated.
[0038]
On the other hand, in FIG. 10, to which the embodiment of the present invention is applied, the position of the template image is corrected after each template matching process to the pixel region containing many edges, that is, the pixel region of the target object. The proportion of pixels other than the target object is therefore smaller than in the template image obtained by the method of FIG. 8. Comparing the template positions at the same time in FIGS. 8 and 10, for example 809a and 1009a, more than half of the template image at position 809a consists of background pixels, whereas at position 1009a most of the template image consists of pixels of the target object.
[0039]
Following the template position correction step 105 described above, the template update step 106 is performed, and the partial image at the corrected position of the target object becomes the new template image. Thereafter, the same processing as in FIG. 5 is performed.
[0040]
As described above, according to the embodiment of the present invention, edges present in the target object are detected based on the position detected by template matching step 103, and the detected position is corrected to the position where the edge density is maximum. Therefore, even if the target object changes direction, the position of the template does not drift off the target object, and the target object can be tracked accurately.
[0041]
A second embodiment of the present invention will be described with reference to FIG. 2. FIG. 2 is a flowchart showing one embodiment of the processing of the present invention. FIG. 2 is the flowchart of the first embodiment shown in FIG. 1, modified so that a branching step 201 and a maximum matching degree determination step 104' replace the maximum matching degree determination step 104, and a multiple template storage step 202 replaces the template update step 106.
[0042]
When the process is started in FIG. 2, the object detection processing step 101 and the initial template registration step 102 described above register the image acquired from the input image at time t0-1 as the template image for time t0-1, and the process proceeds to the image input step 401. In the image input step 401, the input image at time t0 is obtained.
Next, in the template matching step 103, template matching is performed between the stored template image for time t0-1 and the input image at time t0. The process then proceeds to the maximum matching degree determination step 104' through the branching step 201 (described later).
In the maximum matching degree determination step 104', the process proceeds to the template position correction step 105 when the maximum matching degree is equal to or greater than the predetermined value, and returns to the object detection processing step 101 when the maximum matching degree is less than the predetermined value.
[0043]
In the template position correction step 105, the position extracted in the maximum matching degree determination step 104 'is corrected as a detection position at time t0. Then, in the next multiple template storage step 202, the template at time t0 is newly stored based on the position-corrected detection position at time t0. At this time, the template image at time t0-1 already registered in the initial template registration step 102 is stored as it is. Next, the process proceeds to the camera head control step 107, in which the visual field (the optical axis direction) of the camera is directed toward the target object based on the detection position at the time t0.
Next, the process moves to the alarm / monitor display step 108, where, for example, an alarm is sounded or an image of the target object is displayed on the monitoring monitor.
When the alarm / monitor display step 108 is completed, the process returns to the image input step 401 to acquire a new input image and perform the template matching process again.
[0044]
When the process returns to the template matching step 103 again, the stored templates are the template for time t0-2 and the template for time t0-1 (because the time has advanced by "1", "-1" has been added to each). Here, in the template matching step 103, template matching between the input image at time t0 and the template for time t0-1 is performed first, and the process proceeds to the branching step 201.
In the branching step 201, it is checked whether or not the template matching processing has been performed for all of the stored template images. Now, template matching processing with the template at time t0-1 has been performed, but the template at time t0-2 still remains. Therefore, at this time, the process returns to step 103 to perform the template matching process between the template at time t0-2 and the input image at time t0. In this way, the template matching processing is performed on the remaining templates one after another, and when the template matching processing is completed for all the templates, the process proceeds from the branching step 201 to the maximum matching degree determination step 104 ′.
[0045]
In the maximum matching degree determination step 104', the largest value is selected from the maximum matching degrees obtained for each of the plurality of template images by template matching. If this largest maximum matching degree is equal to or greater than a predetermined value (for example, 0.5), the process proceeds to the template position correction step 105. If it is less than the predetermined value, it is determined that the target object is no longer present in the input image, and the process returns to the object detection processing step 101.
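The selection performed in this step can be sketched as follows. The tuple layout of `results`, the function name, and the use of `None` to signal "return to object detection" are assumptions for illustration only.

```python
def select_best_match(results, threshold=0.5):
    """results: list of (template_label, max_matching_degree, position) tuples,
    one per stored template. Returns the best entry, or None when even the
    best maximum matching degree is below the threshold."""
    if not results:
        return None
    best = max(results, key=lambda r: r[1])
    return best if best[1] >= threshold else None

# Hypothetical matching results for two stored templates.
results = [("t0-2", 0.42, (118, 64)), ("t0-1", 0.81, (121, 66))]
best = select_best_match(results)
print(best)
```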
[0046]
In the template position correction step 105, edge processing of the input image is performed for the template image that obtained the largest maximum matching degree in the maximum matching degree determination step 104', and the position of the target object is corrected from the resulting edge image.
Then, in the next multiple template storage step 202, the template at time t0 is newly stored based on the position-corrected detection position at time t0. At this time, the template image at time t0-1 already registered in the initial template registration step 102 is stored as it is.
A predetermined number (for example, “3”) is determined in advance as the number of template images to be stored in the multiple template storage step 202, and when the number exceeds the predetermined number, the template acquired at the oldest time is deleted. Next, the process proceeds to the camera head control step 107, where the visual field direction (optical axis direction) of the camera is controlled.
Then, the process proceeds to the alarm/monitor display step 108, where, for example, an alarm is sounded or an image of the target object is displayed on the monitor.
When the alarm/monitor display step 108 is completed, the process returns to the image input step 401, a new input image is acquired, and the template matching process continues.
In the above case, the template acquired at the oldest time is deleted. However, the template having the lowest matching degree obtained in the template matching step 103 may be deleted.
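The storage policy of the multiple template storage step 202 can be illustrated with a small container. This is a sketch under assumptions: the class name `TemplateStore`, its method signature, and the `degrees` dictionary are hypothetical, and only the limit of 3 and the two deletion policies (oldest, or lowest matching degree) come from the text.

```python
class TemplateStore:
    """Keeps at most `max_templates` templates, each tagged with its acquisition time."""

    def __init__(self, max_templates=3):
        self.max_templates = max_templates
        self.entries = []  # list of (time, template), oldest first

    def add(self, time, template, degrees=None):
        """Store a new template, evicting one stored template if the store is full.

        `degrees` optionally maps acquisition time -> matching degree from the
        latest matching pass. When given, the stored template with the lowest
        degree is evicted; otherwise the oldest one is evicted.
        """
        if len(self.entries) >= self.max_templates:
            if degrees:
                victim = min(
                    range(len(self.entries)),
                    key=lambda i: degrees.get(self.entries[i][0], 0.0),
                )
            else:
                victim = 0  # entries are kept oldest first
            del self.entries[victim]
        self.entries.append((time, template))

    def times(self):
        return [t for t, _ in self.entries]
```

Evicting before appending gives the same result as storing first and deleting when the count exceeds the limit, while ensuring that the newly acquired template is never the one removed.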
[0047]
According to the second embodiment, an edge existing in the target object is detected based on the position detected in the template matching step 103, and the detected position is corrected to the position where the edge density is maximum. In addition, a predetermined number of template images obtained at different times are matched independently. Therefore, even if the target object changes direction or another object crosses in front of it, the region with the highest degree of matching among the multiple past template images is taken as the template matching position and corrected, so the target object can be tracked accurately without the template position drifting off the target object and without another object being tracked instead.
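The correction to the position of maximum edge density can be sketched with axis projections of an edge image. This is an illustrative sketch only: the function names, the simple difference-based edge operator, the `edge_threshold` parameter, and the `margin` parameter (standing in for the predetermined size by which the region is extended) are assumptions, not details taken from the patent.

```python
import numpy as np

def edge_image(gray, edge_threshold):
    """Binary edge map from horizontal and vertical differences (assumed operator)."""
    g = gray.astype(np.float64)
    gx = np.zeros_like(g)
    gy = np.zeros_like(g)
    gx[:, 1:] = np.abs(np.diff(g, axis=1))
    gy[1:, :] = np.abs(np.diff(g, axis=0))
    return ((gx + gy) > edge_threshold).astype(np.int64)

def densest_window(projection, width):
    """Start index of the `width`-long window with the largest sum."""
    sums = np.convolve(projection, np.ones(width, dtype=np.int64), mode="valid")
    return int(np.argmax(sums))

def correct_position(edges, x, y, w, h, margin):
    """Move a w-by-h match at (x, y) to the densest edge window nearby.

    The search region is the matched rectangle extended by `margin` pixels
    on every side and clipped to the image.
    """
    img_h, img_w = edges.shape
    x0, y0 = max(x - margin, 0), max(y - margin, 0)
    x1, y1 = min(x + w + margin, img_w), min(y + h + margin, img_h)
    region = edges[y0:y1, x0:x1]
    proj_x = region.sum(axis=0)  # edge amount accumulated along y, per x
    proj_y = region.sum(axis=1)  # edge amount accumulated along x, per y
    return x0 + densest_window(proj_x, w), y0 + densest_window(proj_y, h)
```

The x and y offsets are chosen independently from the two projections, which keeps the search cost linear in the region size rather than quadratic.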
[0048]
[Effects of the Invention]
As described above, according to the present invention, an object can be tracked stably even when the orientation of the target object changes, and the applicable range of a monitoring device using an imaging device can be greatly expanded.
[Brief description of the drawings]
FIG. 1 is a flowchart illustrating a processing operation according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a processing operation according to an embodiment of the present invention.
FIG. 3 is a block diagram showing an embodiment of a monitoring device to which the present invention is applied.
FIG. 4 is a flowchart illustrating an example of an object detection process using a conventional difference method.
FIG. 5 is a flowchart showing an example of an object tracking process according to a conventional template matching method.
FIG. 6 is a view for explaining an operation of an object detection process by a conventional difference method.
FIG. 7 is a diagram for explaining the operation of an object tracking process using a conventional template matching method.
FIG. 8 is a view for explaining a problem of an object tracking process using a conventional template matching method.
FIG. 9 is a view for explaining an embodiment of the object tracking method of the present invention.
FIG. 10 is a view for explaining an embodiment of the object tracking method of the present invention.
FIG. 11 is a view for explaining a relationship between an image and a position of a target object detected by template matching.
[Explanation of symbols]
301: TV camera, 302: camera head, 303: image input I/F, 304: head control I/F, 305: image memory, 306: work memory, 307: CPU, 308: program memory, 309: output I/F, 310: image output I/F, 311: warning light, 312: monitoring monitor, 313: data bus, 601: input image, 602: reference background image, 603: difference image after difference processing, 604: binarized image, 605: image, 606: difference processing unit, 607: binarization processing unit, 608: image extraction unit, 609: human-shaped object, 610: human-shaped difference image, 611: human-shaped binarized image, 612: circumscribed rectangle, 613: initial template image, 701, 703, 705, 707: image, 701a, 703a, 705a, 707a: template image, 702, 704, 706, 708: input image, 702a, 704a, 706a, 708a: position of the object detected by template matching, 702b, 704b, 706b, 708b: template image one frame before, 702c, 704c, 706c, 708c: search range, 702d, 704d, 706d, 708d: movement arrow, 702e: binarized image, 801, 803, 805, 807: image, 801a, 803a, 805a, 807a: template image, 802, 804, 806, 808: input image, 802a, 804a, 806a, 809a: position of the object detected by template matching, 802b, 804b, 806b, 808b: template image one frame before, 901: input image, 902: edge image, 903a: area, 903b, 903c: projection image, 903: explanatory diagram in which the projection images 903b and 903c are superimposed on the edge image cut out in area 903a,
904: diagram showing the projection image 903b on the x-axis, 904a: range representing the detection position obtained by template matching, 904b: range in which the cumulative projection is maximum, 905: diagram showing the projection image 903c on the y-axis, 905a: range representing the detection position obtained by template matching, 905b: range in which the cumulative projection is maximum, 1001, 1003, 1005, 1007: image, 1001a, 1003a, 1005a, 1007a: template image, 1002, 1004, 1006, 1008: input image, 1002a, 1004a, 1006a, 1009a: position of the object detected by template matching, 1002b, 1004b, 1006b, 1008b: position of the template image one frame before.

Claims

1. An object tracking method for detecting an object in an imaging field of view from input images sequentially acquired by an imaging device and tracking the detected object, the method comprising:
a matching step of performing template matching between the input image acquired at the current time and a template image, and detecting, in the input image acquired at the current time, the position of the partial image having the maximum degree of matching with the template image;
a matching position correction step of detecting, within the region of an extended partial image that contains the partial image at the position detected by the matching step and is larger than that partial image by a predetermined size, the position of the partial image at which the edge density is maximum, and correcting the detection position of the object at the current time to the position of the detected partial image; and
a template update step of updating the template with the partial image at the position corrected by the matching position correction step as a new template,
whereby the object in the imaging field of view is tracked.

2. An object tracking method for detecting an object in an imaging field of view from input images sequentially acquired by an imaging device and tracking the detected object, the method comprising:
a matching step of performing template matching between the input image acquired at the current time and a template image, and detecting, in the input image acquired at the current time, the position of the partial image having the maximum degree of matching with the template image;
a matching position correction step of detecting, within the region of an extended partial image that contains the partial image at the position detected by the matching step and is larger than that partial image by a predetermined size, the position of a partial image at which the edge density is equal to or greater than a predetermined threshold, and correcting the detection position of the object at the current time to the position of the detected partial image; and
a template update step of updating the template with the partial image at the position corrected by the matching position correction step as a new template,
whereby the object in the imaging field of view is tracked.

3. The object tracking method according to claim 1 or claim 2, wherein the position correction step extracts the edge components contained in the region of the extended partial image, cumulatively displays the edge component amounts in the y-axis direction and the x-axis direction on the x-axis and the y-axis, respectively, and detects the maximum edge density range from the cumulative edge component amounts on the x-axis and the y-axis.

4. The object tracking method according to any one of claims 1 to 3, wherein the predetermined size is determined based on the apparent amount of movement of the object within the imaging field of view.

5. The object tracking method according to any one of claims 1 to 4, further comprising
an initial template registration step of detecting an object from the input image by a difference method and registering, as the template image, a partial image of a predetermined size of the input image that includes at least a part of the detected object,
wherein tracking is performed with the object detected by the difference method as the tracking target object.

6. The object tracking method according to claim 5,
wherein, in the position correction step, if the acquired maximum matching degree is less than a predetermined value, an object is detected from the input image at the current time by the difference method, and the detected object is tracked as the tracking target object.

7. The object tracking method according to any one of claims 1 to 6, further comprising
a camera head control step of generating, based on the position detected by the position correction step, a control signal for changing the visual field direction of the imaging device,
wherein the detected object is tracked by always directing the visual field direction of the imaging device toward the detected position by means of the control signal.

8. In an object tracking method for detecting an object in an imaging field of view and tracking the detected object, an object tracking device comprising:
an imaging device that sequentially images a monitoring target range;
an image input interface that sequentially converts the video signal acquired by the imaging device into an image signal;
image processing means for processing the image signal converted by the image input interface; and
a storage device that stores an image registered as a template image,
wherein the image processing means performs template matching on the image signal input from the imaging device at the current time, using a template image registered in advance in the storage device,
detects, in the image signal input at the current time, the position of the partial image having the maximum degree of matching with the template image,
detects, within the region of an extended partial image that contains the partial image at the detected position and is larger than that partial image by a predetermined size in the image signal input at the current time, the position of the partial image at which the edge density is maximum,
takes the position of the partial image at which the edge density is maximum as the detection position of the object at the current time, and
tracks an object that has entered the imaging field of view of the imaging device by updating the partial image at the detection position at the current time as a new template matching position.

9. The object tracking device according to claim 8, further comprising:
a camera head for changing the visual field direction of the imaging device; and
a camera head control interface that supplies a control signal with which the image processing means controls the camera head in order to change the visual field direction of the imaging device,
wherein the image processing means detects the direction of the object based on the detection position of the object at the current time, adjusts the visual field direction of the imaging device from the obtained direction via the camera head control interface, and tracks the object that has entered the imaging field of view of the imaging device.