JP3781835B2 - Video image segmentation device

Info

Publication number: JP3781835B2
Application number: JP26477196A
Authority: JP
Inventors: 慎一境田; 金子　　豊; 善明鹿喰; 豊田中
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 1996-10-04
Filing date: 1996-10-04
Publication date: 2006-05-31
Anticipated expiration: 2016-10-04
Also published as: JPH10111942A

Description

【０００１】
【発明の属する技術分野】
本発明は、Ｋ平均アルゴリズムで代表される画像の特徴量を利用してクラスタリング領域分割を行う動画像領域分割装置に関する。
【０００２】
［発明の概要］
本発明は、動画像を領域に分割する装置において、画像内の画素の有する動き情報、色情報、位置情報を利用して動画像を分割し、画面上に発生する粒状領域、および小領域は隣接の大きな領域に統合し、初期状態の影響で発生する過分割を解消するために複数の異なる初期状態で各別に分割を行った領域情報を統合することにより、高精度な領域分割をＫ平均アルゴリズム自体の繰り返し処理、反復処理をすることなく可能にしたものである。
【０００３】
【従来の技術】
従来、静止画像を対象とした領域分割装置としては、画面上で直接分割を行う手法と、画像の特徴量によるクラスタリング手法とが知られている。
【０００４】
画面上で直接分割を行う手法は、ある一つの画素に隣接している他の画素を、その輝度や色が近いものを統合していく手法であり、代表的な手法に領域成長法がある（S.W.Zucker,"Region growing:childhood and adolescence,"Comput.Graphics and Image Processing,5,pp.382-399,(1976)) 。この領域成長法では、分割された各領域は完全に連結した閉領域になっているが、画素を統合するための判断基準にしきい値を与えるため、得られる領域形状がこのしきい値によって異なるものになる。
【０００５】
一方、画像の特徴量によるクラスタリング手法は、画素の輝度や色だけでなく、画面上の位置などの特徴量が類似した画素を同一の領域とする手法である。このうち複数の特徴量を一元的に扱う手法として、Ｋ平均アルゴリズムと呼ばれる手法がある（S.Z.Selim and M.A.Ismail,"K-MEANS-Type algorithms",IEEE Trans.PAMI-6,1,pp.81-87,(1984)）。Ｋ平均アルゴリズムの処理は、ある評価関数を最小にする最適化問題であり、繰り返し処理を含むものの、単純な計算で領域分割を実現できる。
【０００６】
しかし、Ｋ平均アルゴリズムの処理には、初期に与えておく領域の形状が分割の結果として得られる領域の境界位置に影響したり、多数の小さな粒状の領域が発生するといった問題点が一般に指摘されている。
【０００７】
上述した初期領域への依存性を解消するための方法として、多点探索法である遺伝的アルゴリズムを導入する方法（堀田、村井、宮原、”クラスタリングと遺伝的アルゴリズムを用いたカラー画像の領域分割”、信学技報、PRU94-7,(1994)）や、初期に与える領域の個数を変化させて分割した結果を１９種類用意し、これらを統合する手法（中谷、大崎、阿部、”複数の領域分割結果に基づく対象物境界線検出”信学論D-II,J76,4,pp.914-916,(1993)）がある。
【０００８】
しかし、いずれの手法もＫ平均アルゴリズム処理を基本としており、Ｋ平均アルゴリズム処理自体の繰り返し処理を行うためにその計算量が膨大になる。例えば、前者の手法ではＫ平均アルゴリズム処理を最悪で約１０００回、後者の手法では１９回行うことが必要となる。
【０００９】
また、Ｋ平均アルゴリズム処理において発生する小さな粒状領域を除去する手法としては、弛緩法を利用する方法がある（”画素解析ハンドブック”、pp.677-679、東京大学出版会）。この方法では画像の領域の隣接関係などの事前知識を利用し、小さな粒状領域を統合すべき隣接領域を推測して処理する必要があり、例えば衛星から得られる地形の画面のような特殊な画像には適用できるが、一般の自然画像への適用は難しい。
【００１０】
以上の各手法は、全て静止画像を対象とした領域分割であるが、静止画像ではなく動画像を対象とした領域分割も画像の特徴量を利用する手法により行うことができる。動画像を対象とする際には動き情報を特徴量の一つとして利用するが、動き情報は画像の内容に関係なく定められたブロック単位で推定されることが多い。その場合、ブロック内に複数の動きが存在したり、ノイズが存在するときには、得られる動き情報が不正確になるという問題点がある（G.Adiv,"Determining three-dimensional motion and structure from optical flow generated by several moving objects",IEEE Trans.PAMI-7,4,pp.384-401(1985)）。
【００１１】
クラスタリングを利用して動領域を抽出する手法（許、原島、”動き、色、位置のクラスタリングに基づく動画像の領域分割と領域追跡”、信学技報、IE94-152,(1995) ）では、得られる動領域と別途用意した静止画像の領域分割結果とを統合することにより領域を修正する処理を行っている。そのためＫ平均アルゴリズムなどのクラスタリング手法の特長である、複数の特徴量を一元的に扱う処理が行えない。動き情報も特徴量の一つとして一元的に扱うためには動き情報が正確であればよいが、上述したように動き推定では必ずしも正確な動き情報を得ることはできない。
【００１２】
【発明が解決しようとする課題】
前述したように、従来からＫ平均アルゴリズムによる領域分割での問題点である初期領域依存性と小さな粒状領域発生を解消する種々の提案がなされている。
【００１３】
しかし、いずれの従来技術も処理が複雑になったり、他の情報を事前に用意する必要があるなどの欠点を有する。また、動画像を対象とした場合には、動き情報の不確かさから、Ｋ平均アルゴリズムの特徴量の一つとして動き情報をそのままでは利用できないという問題点がある。実用的で一般の自然画像への適用可能な動画像を対象とした領域分割手法の開発が望まれている。
【００１４】
本発明は上記の事情に鑑みてなされたものであり、その目的は、動き情報の不確かさを解消しつつも色情報や位置情報と共に動き情報を特徴量として利用でき、また、Ｋ平均アルゴリズムによるクラスタリング処理で生じる粒状小領域の除去や初期依存性も、Ｋ平均アルゴリズム自体の繰り返し処理、反復処理をすることなく解決できる動画像領域分割装置を提供することにある。
【００１５】
【課題を解決するための手段】
上記の目的を達成するために請求項１の発明は、動画像のうちの時間的に連続する画像間での各画素の移動量として定義される動き情報と、この動き情報の実際の値に対する推定された値の確からしさの指標として定義される信頼度とを入力し、一定の判定基準を満たす信頼度の高い動き情報には大きな係数を乗じ、判定基準に満たない信頼度の低い動き情報には小さな係数を乗じる重み付けを行う動き情報重み付け回路と、動画像内の画素の有する色情報、位置情報、前記動き情報重み付け回路からの重み付けされた動き情報、および初期領域を入力すると共に、前記色情報、位置情報、および重み付け動き情報を複数の特徴量として一元的に扱い、評価関数による最適探索を利用して、対象となる動画像を領域に分割して領域画像信号を生成するクラスタリング回路と、このクラスタリング回路によって生成された領域画像信号から得られた領域輪郭と、前記初期領域とは異なる初期領域形状を利用した領域画像信号から得られた領域輪郭とを入力し、これらの領域輪郭の共通部分を抽出して過分割部分を排除した領域統合画像を生成する統合回路と、を具備することを特徴とするものである。
【００１６】
上記の構成によれば、動画像を対象とした場合の、不確かさを含む動き情報を特徴量の一つとして利用するために、得られた動き情報の信頼度を判定基準と比較しており、この判定基準を満たさない動き情報に対してはこれを満たす動き情報に比べて小さな係数を乗ずる。このように信頼度に基づいて重み付けをすることによって特徴量として利用する際に、信頼度の高い動き情報に対する影響を減らしている。また、初期値依存性に対しては初期領域の形状を変化させた場合の結果との統合処理を行うことにより解決している。
【００１７】
請求項２の発明は、請求項１記載の動画像領域分割装置において、前記クラスタリング回路と前記統合回路間に、前記クラスタリング回路によって生成された領域画像信号を入力して、この領域画像信号中に含まれる孤立小領域を除去した領域画像信号を生成するフィルタ回路が介挿されていることを特徴とするものである。
【００１８】
上記の構成によれば、フィルタリング処理を実行することにより、クラスタリング回路によって生成された領域画像信号中に含まれる数画素程度の孤立した小粒状領域を除去することができる。
【００１９】
請求項３の発明は、請求項１または２記載の動画像領域分割装置において、前記クラスタリング回路または前記フィルタ回路と前記統合回路間に、前記クラスタリング回路によって生成された領域画像信号、または前記フィルタ回路により孤立小領域が除去された後の領域画像信号を入力して、これら領域画像信号における予め定められたしきい値以下の画素数の領域を、その領域内の画素を囲む近傍８画素の範囲内で最も多い画素数を有する隣接領域、またはその領域内の画素を囲む近傍８画素の範囲内で最も画素数が多い隣接領域が複数あるときは、それぞれの隣接領域全体を比較して最も大きな隣接領域に属させて小領域を統合した領域画像信号を生成する小領域統合回路が介挿されていることを特徴とするものである。
【００２０】
上記の構成によれば、クラスタリング回路によって生成された領域画像信号中の小領域を隣接する大領域へ統合することができる。
【００２１】
【発明の実施の形態】
図１は本発明に係る動画像領域分割装置の実施の形態の構成を示している。
【００２２】
この動画像領域分割装置は、初期領域生成回路１、動き情報重み付け回路２、クラスタリング回路３、フィルタ回路４、番号付与回路５、小領域統合回路６、及び輪郭抽出回路７を備えてなる第１領域分割修正回路８−１と、第１領域分割修正回路８−１と同一回路からなる第２領域分割修正回路８−２と、輪郭重畳回路９と、統合回路１０とから構成されている。
【００２３】
初期領域生成回路１は、入力された動画像信号Ｓ１の画面全体を予め設定された小領域に分割して初期領域を生成し、その初期領域信号Ｓ２をクラスタリング回路３に出力する。
【００２４】
動き情報重み付け回路２は、予め求められている動き情報と信頼度情報とを入力すると共に、この信頼度情報が一定の判定基準に満たない場合、判定基準を満たす信頼度の高い情報に比較して小さな値になるように信頼度に応じた重み係数を動き情報に乗ぜる処理を実行し、その重み付け動き情報Ｓ３をクラスタリング回路３に出力する。
【００２５】
クラスタリング回路３は、動画像信号Ｓ１と、初期領域生成回路１からの初期領域信号Ｓ２と、動き情報重み付け回路２からの重み付け動き情報Ｓ３とを入力すると共に、クラスタリング処理を実行し、画面が領域分割された最終的な領域画像信号Ｓ４を生成してフィルタ回路４に出力する。
【００２６】
フィルタ回路４は、多数決フィルタで構成され、クラスタリング回路３から供給された領域画像信号Ｓ４中に含まれる小さな粒状領域を除去し、粒状領域が除去された領域画像信号Ｓ５を番号付与回路５に出力する。
【００２７】
番号付与回路５は、粒状領域が除去された領域画像信号Ｓ５を入力すると共に、得られた同一領域に含まれる画面上で連結している画素に同じ識別番号を付与し、識別番号が付与された領域画像信号Ｓ６を小領域統合回路６に出力する。
【００２８】
小領域統合回路６は、識別番号が付与された領域画像信号Ｓ６を入力すると共に、予め定められたあるしきい値以下の画素数の小さな領域を別領域へ統合する処理を実行し、得られた領域画像信号Ｓ７を輪郭抽出回路７に出力する。
【００２９】
輪郭抽出回路７は、領域画像信号Ｓ７を入力すると共に、この領域画像信号Ｓ７中の各領域の境界である輪郭線を抽出し、その輪郭画像信号Ｓ８を輪郭重畳回路９と統合回路１０に出力する。
【００３０】
輪郭重畳回路９は、輪郭抽出回路７からの輪郭画像信号Ｓ８と、第２領域分割修正回路８−２で得られた輪郭画像信号Ｓ９とを入力すると共に、２つの輪郭画像信号Ｓ８，Ｓ９の論理積を演算して輪郭画像信号Ｓ１０を生成し、統合回路１０に出力する。
【００３１】
統合回路１０は、輪郭抽出回路７からの輪郭画像信号Ｓ８と、輪郭重畳回路９からの輪郭画像信号Ｓ１０とを入力すると共に、輪郭画像信号Ｓ８と輪郭画像信号Ｓ１０とから領域統合処理を実行して最終的な領域分割画像を生成して出力する。
【００３２】
次にこの実施の形態の動作を説明する。
【００３３】
《初期領域生成回路１における処理》
動画像信号Ｓ１が初期領域生成回路１に供給されると、先ず、図２に示すように画面全体がＭ×Ｎの小領域に分割される。なお、Ｍ，Ｎは４から６程度の整数値であり、図２では、Ｍ＝６、Ｎ＝４からなる小領域を示している。これが、第１領域分割修正回路８−１が設定する初期領域となる。同様に、第２領域分割修正回路８−２でも画面全体が、第１領域分割修正回路８−１が設定する初期領域とは別のＭ×Ｎの小領域に分割される。
【００３４】
《動き情報重み付け回路２における処理》
一方、動き情報と信頼度情報は、周知の手段によって予め求められており、これらの動き情報と信頼度情報は、動き情報重み付け回路２に供給される。ここで、動き推定による動き情報とは、動画像のうちの時間的に連続する画像間での各画素の移動量である。また信頼度情報とは、動き情報の実際の値に対する推定された値の確からしさの指標をいう。例えば、金子、鹿喰、田中、”輝度勾配ベクトル分布による動き推定の精度向上”、信学技報、IE96-10,(1996)に表された動きベクトル（ｕ，ｖ）が動き推定による動き情報に該当し、信頼度ｓ_s，ｓ_lが信頼度情報に該当する。
【００３５】
以下に、上記信頼度ｓ_s，ｓ_lの導出過程を説明する。
【００３６】
【外１】

時刻ｔにおける動画像面上の点（ｘ，ｙ）の輝度をｇ（ｘ，ｙ）、時空間輝度勾配ベクトルを、
【数１】

移動ベクトルを、ｍ＝（ｕ，ｖ，１）とすると、内積を用いて（２）式の勾配法の基本式が得られる。
【００３７】
【数２】
ｇ・^tｍ＝０ …（２）
ここで、動きベクトル（ｕ，ｖ）を３次元へ拡張したベクトルｍを移動ベクトルと呼ぶ。
【００３８】
拘束条件として、「ある画像断片内の全ての画素（画素数ｎ）が同じ動き量である」ことを用いると、画像断片内の全画素に対する誤差Σ（ｇ_i・^tｍ）²（ここで、ｉ＝０〜ｎ）を最小にすることで移動ベクトルｍを推定することができる。さらに、移動ベクトルｍの代わりに正規化した単位移動ベクトルｍ_uを用いると、推定する動きベクトルは（３）式を最小とするｍ_uから求めることができる。
【００３９】
【数３】

ここで、Ｇは画像断片内の輝度勾配ベクトルの共分散行列であり、次の（４）式で表せる。
【００４０】
【数４】

行列Ｇの固有値をλ₁≧λ₂≧λ₃、対応する固有ベクトルをｕ₁，ｕ₂，ｕ₃とすると、（３）式が最小となるのは、行列Ｇの最小固有値λ₃に対する固有ベクトルｕ₃をｍ_uとしたときであることから、移動ベクトルｍは行列Ｇの第３固有ベクトルｕ₃から推定できる。
【００４１】
次に、（２）式のベクトル表現について考えると、移動ベクトルｍを推定することは、図３に示すように、推定する移動ベクトルｍと画像断片内の複数の輝度勾配ベクトルｇ_iとの内積の和ができるだけ小さくなるようにすることである。例えば、輝度勾配ベクトルｇ_iが平面状に分布していた場合、移動ベクトルｍの方向を、その平面の法線ベクトルの方向とすることにより、（３）式の内積の自乗和は０になる。このような場合、推定された動きベクトルの正確さは高くなる。これに対し輝度勾配ベクトルｇ_iの分布が平面から外れて分布した場合、（３）式の内積の自乗和は０とはならず、推定される移動ベクトルｍの正確さは低下する。また、輝度勾配ベクトルｇ_iの分布が同一方向にのみ分布していた場合、平面が一意に定まらないため移動ベクトルｍの特定は不可能となる。
【００４２】
以上のように、輝度勾配ベクトルの分布と移動ベクトルの正確さとは密接な関係がある。図４に示すように、推定された移動ベクトルが精度良く求まるのは、図４（ａ）に示すように、輝度勾配ベクトルが平面上に分布するときである。図４（ｂ）に示すような輝度勾配ベクトルが平面から離れた球状に分布したり、図４（ｃ）に示すように、直線状に分布した場合、推定する移動ベクトルの精度は低いと考えられる。
【００４３】
図４の各輝度勾配ベクトルの３次元分布を、その第１固有値λ₁、第２固有値λ₂、第３固有値λ₃で表現すると、以下の（５）式のように特徴付けられる。
【数５】

以下、（ａ）を平面状分布、（ｂ）を球状分布、（ｃ）を直線状分布と呼ぶ。
【００４４】
第１固有値λ₁、第２固有値λ₂、第３固有値λ₃の分布を直交座標系で考えると、図５に示すように、λ₁≧λ₂≧λ₃の関係から、図５の三角錐の内部に分布することになる。三角錐の各辺は、平面状分布、球状分布、直線状分布の極限値を表すことになる。ここで、各辺を平面軸、球軸、直線軸と呼ぶ。従って、画像断片内の輝度勾配ベクトルの共分散行列Ｇの固有値が、図５の平面軸に近い位置にあるほど推定された移動ベクトルの信頼度が高いと考えられる。また、図５から理解できるように、信頼度が直線軸方向に低くなる場合と、球軸方向に低くなる場合とが考えられる。
【００４５】
上記（５）式の（ａ）と（ｂ）との関係、及び（ａ）と（ｃ）との関係から、次式を定義する。
【００４６】
【数６】

この（６）式に基づき以下の（７）式を信頼度とした。
【００４７】
【数７】

（７）式の信頼度ｓ_lの定義で１／ａ_lに第３固有値λ₃を乗じているのは、直線状分布に近い状態でも第３固有値λ₃が小さい場合は信頼度が高いと見なせるので、第３固有値λ₃による重み付けを行うためである。（７）式のｓ_sを球の信頼度、ｓ_lを直線の信頼度と呼ぶ。ｓ_s、ｓ_lの値は小さい程、信頼度が高い。
【００４８】
再び図１に戻り説明すると、動き情報重み付け回路２では、上述した動き情報に対し信頼度情報に応じた重み付けを実行する。具体的に説明すると、信頼度情報が一定の判定基準に満たない場合、判定基準を満たす信頼度の高い情報に比較して小さな値になるように信頼度に応じた重み係数が動き情報に対して乗ぜられる。例えば、判定基準を満たす信頼度の高い動き情報に対しては“１”以上、判定基準に満たない動き情報に対しては“１”未満の係数を乗ずる。こうして動き情報に信頼度が付された情報は、重み付け動き情報Ｓ３としてクラスタリング回路３に供給される。
【００４９】
《クラスタリング回路３における処理》
クラスタリング回路３では、動画像信号Ｓ１と、初期領域生成回路１からの初期領域信号Ｓ２と、動き情報重み付け回路２からの重み付け動き情報Ｓ３とを入力すると共に、次のような計算によりクラスタリング処理を実行する。
【００５０】
先ず、初期領域の各領域毎の全特徴量の平均値の計算を行う。ここでの特徴量とは画素の３つの色情報（３原色Ｒ，Ｇ，Ｂ）、及び２つの位置情報（座標Ｘ，Ｙ）と、画面水平方向、及び画面垂直方向の各重み付けされた動き情報（Ｕ，Ｖ）の合計７つの情報である。
【００５１】
次に、動画像信号Ｓ１の全画面の各画素毎にその画素の特徴量が全体的に見てどの領域の特徴量の平均値と最も近いかを次の評価関数式により決定する。
【００５２】
【数８】

但し、Ｒｍ：赤色情報Ｒの各領域での平均値
Ｇｍ：緑色情報Ｇの各領域での平均値
Ｂｍ：青色情報Ｂの各領域での平均値
Ｘｍ：位置情報Ｘの各領域での平均値
Ｙｍ：位置情報Ｙの各領域での平均値
Ｕｍ：動き情報Ｕの各領域での平均値
Ｖｍ：動き情報Ｖの各領域での平均値
Ｋｌ，Ｋｐ，Ｋｍ：係数値。係数値Ｋｌ，Ｋｐ，Ｋｍは、全ての特徴量が同じくらいの範囲、例えば０〜１になるように乗ぜられるものであり、予め設定されている。
【００５３】
次に、各画素の特徴量について全ての領域に対して評価関数を求める。
【００５４】
求められた評価関数が最も小さくなる領域を最適領域と考え、対象としている画素をその最適領域に含め、領域を新たに構成する。
【００５５】
全ての画素について上記の評価関数による最小値探索を行い、領域の再構成を行った後、新しく再構成された領域毎に各特徴量の平均値Ｒｍ，Ｇｍ，Ｂｍ，Ｘｍ，Ｙｍ，Ｕｍ，Ｖｍを計算し直す。
【００５６】
再び全画素について上記の操作を行う。各領域の平均値Ｒｍ，Ｇｍ，Ｂｍ，Ｘｍ，Ｙｍ，Ｕｍ，Ｖｍが変化しなくなるまで全ての処理を繰り返す。
【００５７】
このクラスタリング回路３内では上述のような処理が行われ、画面が領域分割された最終的な領域画像信号Ｓ４が生成される。
【００５８】
《フィルタ回路４における処理》
クラスタリング回路３で生成された領域画像信号Ｓ４はフィルタ回路４に供給され、このフィルタ回路４で領域画像信号Ｓ４中に含まれる小さな粒状領域が除去される。ここで用いられるフィルタは多数決フィルタであり、その大きさは５画素×５画素もあれば十分である。図６に多数決フィルタの処理例を示す。この例では、５画素×５画素サイズの窓のフィルタが使用されている。また、窓の中心の画素はフィルタ内の最も画素数の多い領域１に統合される。最多領域が中心画素に隣接せず、周辺部に存在する場合であっても中心画素は最多領域１に統合される。この場合、中心画素は依然として粒状領域を形成するが、この粒状領域は後述する小領域統合回路６により結果的に除去されることになる。フィルタの中心画素がもともと最多領域１の一部となる場合、中心画素の属する領域は変化しない。こうして領域画像がこのフィルタ回路４により修正される。
【００５９】
《番号付与回路５における処理》
フィルタ回路４により、大多数の小さな粒状領域が除去された領域画像信号Ｓ５は番号付与回路５に供給される。番号付与回路５では、得られた同一領域に含まれる画面上で連結している画素には同じ識別番号を付与する。番号の付与の順番は任意であり、また、領域が異なることが認識されればよいので、各領域に対して付与される番号は、連続番号である必要もない。この番号付与回路５により、フィルタ回路４によって修正された領域は、同一の特徴量を有する領域であってもそれが連結していない限り、それぞれ別個に識別できる。
【００６０】
《小領域統合回路６における処理》
番号付与回路５により識別番号が付与された領域画像信号Ｓ６は、小領域統合回路６に供給される。小領域統合回路６では、予め定められたあるしきい値以下の画素数の小さな領域を別領域へ統合する。しきい値は、例えば全画面の画素数の１％程度に設定する。統合は、統合対象の領域の最も外側から始め、別の領域に接している画素をその別領域に属させることによりなされる。外側から順に多領域へ属させるので、必ず全ての画素は他の領域へ属することになる。具体的な統合は、対象画素を中心にした３画素×３画素の範囲内にある対象画素を取り囲む近傍８画素の中で最も多い画素数を有する隣接領域に対象画素を統合することにより行う。外側の画素が複数の領域と接している場合には接している画素数が最も多い領域に属することになる。隣接領域の画素数が同数のときは、隣接領域全体を比較して最大の領域に統合する。図７に小領域統合回路６の処理例を示す。図７（ａ）の例では、中心の対象画素は領域１に属する。図７（ｂ）の例では、領域１、領域２共に近傍範囲では各３画素なので、中心の対象画素は領域１、領域２のうち、領域全体の面積の大きい方へ属する。仮に領域全体の面積が等しい場合は、隣接位置に応じて予め定めておいた優先順位（例えば、左上から時計回りにつけた優先順位）に従い属する領域を決定すればよい。ここまでの処理で、クラスタリングで生じる小さな領域が全て大きな領域へ統合された領域分割画像が得られる。
【００６１】
《輪郭抽出回路７における処理》
小領域統合回路６により得られた領域画像信号Ｓ７は輪郭抽出回路７へ供給される。輪郭抽出回路７では領域画像信号Ｓ７の中の各領域の境界である輪郭線を抽出した輪郭画像信号Ｓ８が生成される。
【００６２】
《輪郭重畳回路９における処理》
輪郭抽出回路７により得られた輪郭画像信号Ｓ８は輪郭重畳回路９に供給される。また、この輪郭重畳回路９には、第２領域分割修正回路８−２で得られた輪郭画像信号Ｓ９も供給される。そして、この輪郭重畳回路９では、供給された２つの輪郭画像信号Ｓ８，Ｓ９の論理積をとった輪郭画像信号Ｓ１０が作成される。この実施の形態では、２つの輪郭画像信号Ｓ８，Ｓ９を用いているがこの輪郭画像信号は３つ以上でも可能である。すなわち、３つ以上の領域分割修正回路を利用することができる。
【００６３】
《統合回路１０における処理》
輪郭重畳回路９で生成された輪郭画像信号Ｓ１０と、輪郭抽出回路７で生成された輪郭画像信号Ｓ８は統合回路１０に供給される。統合回路１０では、まず輪郭画像信号Ｓ８の画像中の１つの領域の輪郭線だけを抽出する。さらにこの輪郭線のうち隣接している１つの別領域との境界線だけを選択する。これと、該当する輪郭画像信号Ｓ１０の輪郭線の長さを比較し、この比がある判断基準以上であれば対象としている選択した境界線はそのまま残す。逆に、判断基準以下であれば、隣接している領域と同一の領域であるとして境界線を除去して領域統合する。判断基準は、領域分割の目的によって、分割領域数を多くしたい場合は小さく、分割領域数を少なくしたい場合には大きく設定する。例えば、比を６０％とするとあまり多くの領域が残らない結果が得られる。この操作を対象としている領域に隣接する全領域の境界線に対して行い、さらに基準とした輪郭画像信号Ｓ８の中の全領域の境界線に対して行う。
【００６４】
このように、この実施の形態によれば、小領域の発生と初期領域依存性の問題を解消するために、Ｋ平均アルゴリズムによるクラスタリング処理後に、３段階の領域修正を施している。すなわち、▲１▼数画素程度の小粒状領域に対してはフィルタリング処理を、▲２▼残りの小領域に対処するために領域への識別番号付けと隣接する大領域への統合処理を、▲３▼初期値依存性に対しては初期領域の形状を変化させた場合の結果との統合処理を行っている。これにより、繁雑な処理や膨大な計算量を必要とすることなく小領域発生と初期値依存問題を解消することができる。
【００６５】
また、前述した動画像を対象とした場合の、不確かさを含む動き情報を特徴量の一つとして利用するために本発明では、得られた動き情報の信頼度を判定基準と比較している。判定基準を満たさない動き情報に対してはこれを満たす動き情報に比べて小さな係数を乗ずる。このような重み付けをすることによって特徴量として利用する際に、信頼度の高い動き情報への影響を減らしている。これにより、色、位置、動きの情報を特徴量として一元的に扱うクラスタリングを行うことができる。
【００６６】
【発明の効果】
以上説明したように請求項１の発明によれば、不確かさを含む動き情報を特徴量の一つとして利用するために、得られた動き情報の信頼度を判定基準と比較しているので、不確かさを解消しつつも動き情報を色情報や位置情報といった画像の特徴量と共に利用できる。また、初期領域の形状を変化させた場合の結果との統合処理を行うことにより初期値に依存することのない領域分割画像を得ることができる。
【００６７】
請求項２の発明によれば、フィルタリング処理を実行することにより、クラスタリング回路によって生成された領域画像信号中に含まれる数画素程度の孤立した小粒状領域を、Ｋ平均アルゴリズム自体の繰り返し処理、反復処理をすることなく除去することができる。
【００６８】
請求項３の発明によれば、クラスタリング回路によって生成された領域画像信号中の小領域を、Ｋ平均アルゴリズム自体の繰り返し処理、反復処理をすることなく隣接する大領域へ統合することができる。
【図面の簡単な説明】
【図１】本発明に係る動画像領域分割装置の実施の一形態における構成を示すブロック図である。
【図２】初期領域生成回路により生成される初期領域の一例を示す説明図である。
【図３】信頼度情報の導出過程における動きベクトルと輝度勾配ベクトルとの関係を示す説明図である。
【図４】信頼度情報の導出過程における輝度勾配ベクトル分布を示す説明図である。
【図５】信頼度情報の導出過程における輝度勾配ベクトルの固有値を示す説明図である。
【図６】フィルタ回路を構成する多数決フィルタの処理例を示す説明図である。
【図７】小領域統合回路における処理例を示す説明図である。
【符号の説明】
１初期領域生成回路
２動き情報重み付け回路
３クラスタリング回路
４フィルタ回路
５番号付与回路
６小領域統合回路
７輪郭抽出回路
８−１第１領域分割修正回路
８−２第２領域分割修正回路
９輪郭重畳回路
１０統合回路
Ｓ１動画像信号
Ｓ２初期領域信号
Ｓ３重み付け動き情報
Ｓ４領域画像信号
Ｓ５粒状領域が除去された領域画像信号
Ｓ６識別番号が付与された領域画像信号
Ｓ７統合された領域画像信号
Ｓ８輪郭画像信号
Ｓ９第２領域分割修正回路からの輪郭画像信号
Ｓ１０論理積演算後の輪郭画像信号

[0001]
[Technical field of the invention]
The present invention relates to a moving image region segmentation device that performs region segmentation by clustering on image feature quantities, as typified by the K-means algorithm.
[0002]
[Summary of Invention]
The present invention is a device that segments a moving image into regions. It segments the moving image using the motion information, color information, and position information of the pixels in the image; merges the granular regions and small regions that arise on the screen into larger adjacent regions; and, to eliminate the over-segmentation caused by the initial state, integrates the region information obtained by segmenting separately from several different initial states. This enables high-precision region segmentation without repeating or iterating the K-means algorithm itself.
[0003]
[Prior art]
Conventional region segmentation devices for still images are known to use one of two approaches: segmentation performed directly on the screen, or clustering based on image feature quantities.
[0004]
The direct approach merges pixels adjacent to a given pixel when their luminance or color is close to it; a representative method is region growing (S. W. Zucker, "Region growing: childhood and adolescence," Comput. Graphics and Image Processing, 5, pp. 382-399, (1976)). With region growing, each resulting region is a fully connected closed region, but because a threshold is used as the criterion for merging pixels, the shapes of the resulting regions vary with that threshold.
[0005]
The clustering approach, by contrast, assigns to the same region those pixels whose feature quantities are similar, including not only luminance and color but also features such as position on the screen. One such method, which handles multiple feature quantities in a unified manner, is the K-means algorithm (S. Z. Selim and M. A. Ismail, "K-MEANS-Type algorithms", IEEE Trans. PAMI-6, 1, pp. 81-87, (1984)). K-means processing is an optimization problem that minimizes an evaluation function; although it involves iteration, it achieves region segmentation with simple calculations.
[0006]
However, two problems with K-means processing are commonly pointed out: the shape of the initially supplied regions affects the boundary positions of the regions obtained by the segmentation, and a large number of small granular regions are generated.
[0007]
Methods for eliminating this dependence on the initial regions include introducing a genetic algorithm, which is a multi-point search method (Horita, Murai, Miyahara, "Color image segmentation using clustering and genetic algorithm", IEICE Technical Report, PRU94-7, (1994)), and preparing 19 segmentation results obtained by varying the number of initial regions and then integrating them (Nakatani, Osaki, Abe, "Detecting object boundaries based on multiple region segmentation results", IEICE Trans. D-II, J76, 4, pp. 914-916, (1993)).
[0008]
However, both methods are built on K-means processing and repeat the K-means processing itself, so their computational cost becomes enormous. For example, the former requires about 1000 runs of K-means processing in the worst case, and the latter requires 19 runs.
[0009]
As a technique for removing the small granular regions produced by K-means processing, there is a method based on relaxation ("Pixel Analysis Handbook", pp. 677-679, University of Tokyo Press). This method must use prior knowledge, such as the adjacency relationships between image regions, to infer which adjacent region each small granular region should be merged into. It can therefore be applied to special images, such as terrain images obtained from satellites, but is difficult to apply to general natural images.
[0010]
All of the above methods target still images, but moving images can also be segmented by methods that use image feature quantities. For moving images, motion information is used as one of the feature quantities, but motion information is often estimated in units of blocks that are fixed regardless of image content. In that case, the resulting motion information becomes inaccurate when a block contains multiple motions or noise (G. Adiv, "Determining three-dimensional motion and structure from optical flow generated by several moving objects", IEEE Trans. PAMI-7, 4, pp. 384-401, (1985)).
[0011]
In a method that extracts moving regions by clustering (Kun, Harashima, "Region segmentation and region tracking of moving images based on clustering of motion, color, and position", IEICE Technical Report, IE94-152, (1995)), the regions are corrected by integrating the obtained moving regions with a separately prepared still-image segmentation result. As a result, the method cannot handle multiple feature quantities in a unified manner, which is the strength of clustering methods such as the K-means algorithm. Motion information could be handled as just another feature quantity if it were accurate, but as noted above, motion estimation does not always yield accurate motion information.
[0012]
[Problems to be solved by the invention]
As described above, various proposals have been made to eliminate the two problems of K-means region segmentation: dependence on the initial regions and the generation of small granular regions.
[0013]
However, every one of these conventional techniques has drawbacks, such as complicated processing or the need to prepare other information in advance. Furthermore, for moving images, the uncertainty of motion information means that it cannot be used as is as one of the K-means feature quantities. A practical region segmentation method for moving images that is applicable to general natural images is therefore desired.
[0014]
The present invention has been made in view of the above circumstances. Its object is to provide a moving image region segmentation device that can use motion information as a feature quantity together with color information and position information while resolving the uncertainty of the motion information, and that can also remove the small granular regions produced by K-means clustering and resolve the dependence on initial conditions, without repeating or iterating the K-means algorithm itself.
[0015]
[Means for Solving the Problems]
To achieve the above object, the invention of claim 1 comprises: a motion information weighting circuit that receives motion information, defined as the displacement of each pixel between temporally consecutive images of a moving image, and a reliability, defined as an index of how likely the estimated value of that motion information is to match its actual value, and that applies a weighting in which motion information of high reliability that satisfies a given criterion is multiplied by a large coefficient and motion information of low reliability that does not satisfy the criterion is multiplied by a small coefficient; a clustering circuit that receives the color information and position information of the pixels in the moving image, the weighted motion information from the motion information weighting circuit, and initial regions, that treats the color information, position information, and weighted motion information in a unified manner as multiple feature quantities, and that uses an optimal search by an evaluation function to divide the target moving image into regions and generate a region image signal; and an integration circuit that receives the region contours obtained from the region image signal generated by the clustering circuit and the region contours obtained from a region image signal produced with an initial region shape different from said initial regions, extracts the portions common to these region contours, and generates an integrated region image from which over-segmented portions have been eliminated.
[0016]
With this configuration, in order to use uncertain motion information as one of the feature quantities for moving images, the reliability of the obtained motion information is compared with a criterion, and motion information that does not satisfy the criterion is multiplied by a smaller coefficient than motion information that does. Weighting by reliability in this way reduces the adverse influence on the highly reliable motion information when the motion information is used as a feature quantity. The dependence on initial values is resolved by integrating the result with a result obtained from a differently shaped set of initial regions.
[0017]
The invention of claim 2 is the moving image region segmentation device of claim 1, in which a filter circuit is inserted between the clustering circuit and the integration circuit; the filter circuit receives the region image signal generated by the clustering circuit and generates a region image signal from which the isolated small regions contained in it have been removed.
[0018]
With this configuration, the filtering process removes the isolated small granular regions of only a few pixels contained in the region image signal generated by the clustering circuit.
[0019]
The invention of claim 3 is the moving image region segmentation device of claim 1 or 2, in which a small region integration circuit is inserted between the clustering circuit or the filter circuit and the integration circuit. The small region integration circuit receives the region image signal generated by the clustering circuit, or the region image signal from which the filter circuit has removed isolated small regions, and generates a region image signal in which each region whose pixel count is at or below a predetermined threshold is merged into the adjacent region that has the most pixels among the 8 neighboring pixels surrounding a pixel of that region or, when several adjacent regions tie for the most pixels among those 8 neighbors, into the largest of those adjacent regions when they are compared as whole regions.
[0020]
With this configuration, small regions in the region image signal generated by the clustering circuit can be merged into adjacent large regions.
[0021]
[Embodiments of the invention]
FIG. 1 shows the configuration of an embodiment of the moving image region segmentation device according to the present invention.
[0022]
This moving image region segmentation device consists of a first region segmentation correction circuit 8-1, which comprises an initial region generation circuit 1, a motion information weighting circuit 2, a clustering circuit 3, a filter circuit 4, a numbering circuit 5, a small region integration circuit 6, and a contour extraction circuit 7; a second region segmentation correction circuit 8-2, which is built from the same circuits as the first region segmentation correction circuit 8-1; a contour superimposing circuit 9; and an integration circuit 10.
[0023]
The initial region generation circuit 1 divides the entire screen of the input moving image signal S1 into preset small regions, generates an initial region, and outputs the initial region signal S2 to the clustering circuit 3.
[0024]
The motion information weighting circuit 2 receives motion information and reliability information that have been obtained in advance. It multiplies the motion information by a weighting coefficient that depends on the reliability, such that motion information whose reliability does not satisfy a given criterion receives a smaller value than highly reliable information that satisfies the criterion, and outputs the weighted motion information S3 to the clustering circuit 3.
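The weighting step can be sketched in Python as follows. This is a minimal illustration, not the circuit itself: the function name, the thresholds `max_s_s` and `max_s_l`, and the coefficient values are hypothetical. The only points taken from the description are that smaller values of s_s and s_l mean higher reliability, that reliable motion information is multiplied by a coefficient of 1 or more, and that unreliable motion information is multiplied by a coefficient below 1.

```python
def weight_motion(u, v, s_s, s_l, max_s_s=0.5, max_s_l=0.5,
                  high_coeff=1.0, low_coeff=0.25):
    """Scale a motion vector (u, v) according to its reliability.

    Assumption: the criterion is met when both reliability values are at or
    below their thresholds (smaller value = more reliable). The threshold
    and coefficient values are illustrative only.
    """
    reliable = s_s <= max_s_s and s_l <= max_s_l
    k = high_coeff if reliable else low_coeff
    return (k * u, k * v)
```

A motion vector that fails the criterion is shrunk toward zero, so it contributes less to the distance computed during clustering.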
[0025]
The clustering circuit 3 receives the moving image signal S1, the initial region signal S2 from the initial region generation circuit 1, and the weighted motion information S3 from the motion information weighting circuit 2, executes the clustering process, generates the final region image signal S4 in which the screen has been divided into regions, and outputs it to the filter circuit 4.
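A minimal Python sketch of such a clustering loop is given below. The seven features per pixel (R, G, B, X, Y, U, V) and the coefficients Kl, Kp, Km come from the description; the exact form of the evaluation function is not reproduced in this text, so a weighted sum of squared differences is assumed here. The function names are hypothetical, and the loop stops when the labels stop changing, which implies that the region means have stopped changing as well.

```python
def evaluate(pixel, mean, k_l=1.0, k_p=1.0, k_m=1.0):
    """Assumed evaluation function: weighted squared distance to a region mean."""
    r, g, b, x, y, u, v = pixel
    rm, gm, bm, xm, ym, um, vm = mean
    color = (r - rm) ** 2 + (g - gm) ** 2 + (b - bm) ** 2
    position = (x - xm) ** 2 + (y - ym) ** 2
    motion = (u - um) ** 2 + (v - vm) ** 2
    return k_l * color + k_p * position + k_m * motion


def region_means(pixels, labels, n_regions):
    """Mean of every feature per region; None for a region with no pixels."""
    sums = [[0.0] * 7 for _ in range(n_regions)]
    counts = [0] * n_regions
    for p, lab in zip(pixels, labels):
        counts[lab] += 1
        for i, val in enumerate(p):
            sums[lab][i] += val
    return [[s / c for s in row] if c else None
            for row, c in zip(sums, counts)]


def cluster(pixels, labels, n_regions, max_iter=100, **coeffs):
    """Reassign each pixel to the region with the smallest evaluation value,
    recompute the means, and repeat until nothing changes."""
    labels = list(labels)
    for _ in range(max_iter):
        means = region_means(pixels, labels, n_regions)
        candidates = [r for r in range(n_regions) if means[r] is not None]
        new_labels = [
            min(candidates, key=lambda r: evaluate(p, means[r], **coeffs))
            for p in pixels
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels
```

Here `pixels` is a flat list of 7-tuples and `labels` holds the initial region index of each pixel, as would be derived from the initial region signal S2.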
[0026]
The filter circuit 4 is a majority filter. It removes the small granular regions contained in the region image signal S4 supplied from the clustering circuit 3 and outputs the resulting region image signal S5 to the numbering circuit 5.
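A majority filter of this kind might look like the following Python sketch. The 5 × 5 window size follows the description of the embodiment; clipping the window at the image border and breaking ties in favor of the smallest label are assumptions made here, and the function name is hypothetical.

```python
from collections import Counter


def majority_filter(labels, size=5):
    """Replace each pixel's label with the most frequent label in its window.

    Assumptions: the window is clipped at the image border, the center keeps
    its label when that label is already among the most frequent, and any
    remaining tie goes to the smallest label.
    """
    h, w = len(labels), len(labels[0])
    half = size // 2
    out = [row[:] for row in labels]
    for y in range(h):
        for x in range(w):
            counts = Counter(
                labels[yy][xx]
                for yy in range(max(0, y - half), min(h, y + half + 1))
                for xx in range(max(0, x - half), min(w, x + half + 1))
            )
            best = max(counts.values())
            if counts[labels[y][x]] < best:
                out[y][x] = min(lab for lab, c in counts.items() if c == best)
    return out
```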
[0027]
The numbering circuit 5 receives the region image signal S5 from which granular regions have been removed, assigns the same identification number to pixels that belong to the same region and are connected on the screen, and outputs the numbered region image signal S6 to the small region integration circuit 6.
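The numbering step amounts to connected-component labeling, sketched below in Python. The description does not state which connectivity is used, so 4-connectivity is an assumption here; the function name and the choice of consecutive integers as identification numbers are also illustrative, since the description only requires that distinct regions receive distinct numbers.

```python
from collections import deque


def number_regions(labels):
    """Give each 4-connected group of equal labels its own identification number."""
    h, w = len(labels), len(labels[0])
    ids = [[-1] * w for _ in range(h)]
    next_id = 0
    for y in range(h):
        for x in range(w):
            if ids[y][x] != -1:
                continue  # already numbered as part of an earlier region
            ids[y][x] = next_id
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w and ids[ny][nx] == -1
                            and labels[ny][nx] == labels[cy][cx]):
                        ids[ny][nx] = next_id
                        queue.append((ny, nx))
            next_id += 1
    return ids
```

Two patches that share a cluster label but are not connected on the screen receive different numbers, which is what allows them to be handled separately in the later stages.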
[0028]
The small region integration circuit 6 receives the numbered region image signal S6, merges each small region whose pixel count is at or below a predetermined threshold into another region, and outputs the resulting region image signal S7 to the contour extraction circuit 7.
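The per-pixel decision used when merging a small region can be sketched in Python as follows. It covers only the choice of destination region for one target pixel, based on its 8 neighbors and, on a tie, on the total size of the candidate regions, as set out in claim 3. The outside-in traversal of the small region and the fixed priority order for regions of equal size are omitted, and the function and parameter names are hypothetical.

```python
from collections import Counter


def choose_region(ids, sizes, y, x):
    """Pick the region that the pixel at (y, x) should be merged into.

    ids   : 2-D list of region identification numbers
    sizes : dict mapping each region number to its total pixel count
    Only neighbors belonging to a different region are counted.
    """
    h, w = len(ids), len(ids[0])
    own = ids[y][x]
    counts = Counter(
        ids[ny][nx]
        for ny in range(max(0, y - 1), min(h, y + 2))
        for nx in range(max(0, x - 1), min(w, x + 2))
        if (ny, nx) != (y, x) and ids[ny][nx] != own
    )
    if not counts:
        return own  # no other region touches this pixel
    # Most neighbors first; on a tie, the region with the larger total size.
    return max(counts, key=lambda r: (counts[r], sizes[r]))
```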
[0029]
The contour extraction circuit 7 receives the region image signal S7, extracts the contour lines that form the boundaries between its regions, and outputs the contour image signal S8 to the contour superimposing circuit 9 and the integration circuit 10.
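One simple way to obtain such a contour image is sketched below in Python. The description does not define which pixels count as contour pixels, so marking a pixel when its right or lower neighbor belongs to a different region is an assumption, as is the function name.

```python
def extract_contours(ids):
    """Return a boolean image that is True where a region boundary passes.

    Assumption: a pixel is a contour pixel when the pixel to its right or
    the pixel below it carries a different region number.
    """
    h, w = len(ids), len(ids[0])
    return [
        [
            (x + 1 < w and ids[y][x + 1] != ids[y][x])
            or (y + 1 < h and ids[y + 1][x] != ids[y][x])
            for x in range(w)
        ]
        for y in range(h)
    ]
```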
[0030]
The contour superimposing circuit 9 receives the contour image signal S8 from the contour extraction circuit 7 and the contour image signal S9 obtained by the second region segmentation correction circuit 8-2, computes the logical AND of the two contour image signals S8 and S9 to generate the contour image signal S10, and outputs it to the integration circuit 10.
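The logical AND of the contour images can be sketched in Python as follows, assuming each contour image is a 2-D list of booleans of identical size. The function name is hypothetical; it accepts any number of images because the description notes that three or more contour image signals may be used.

```python
def superimpose_contours(*contour_images):
    """Pixel-wise logical AND of two or more boolean contour images."""
    first = contour_images[0]
    return [
        [all(img[y][x] for img in contour_images) for x in range(len(first[0]))]
        for y in range(len(first))
    ]
```

Only contour pixels that appear in every segmentation survive, so boundaries that exist merely because of one particular choice of initial regions drop out.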
[0031]
The integration circuit 10 receives the contour image signal S8 from the contour extraction circuit 7 and the contour image signal S10 from the contour superimposing circuit 9, performs region integration using S8 and S10, and generates and outputs the final region-segmented image.
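The decision that the integration circuit makes for one boundary segment can be sketched in Python as below. According to the description of the embodiment, the length of a boundary between two adjacent regions in S8 is compared with the length of the corresponding contour in S10, and the boundary is kept when the ratio reaches a criterion (60% is given as an example). Representing a boundary as a list of pixel coordinates, and keeping the boundary when the ratio exactly equals the criterion, are assumptions; the function name is hypothetical.

```python
def keep_boundary(boundary_pixels, s10, criterion=0.6):
    """Decide whether a boundary between two adjacent regions in S8 survives.

    boundary_pixels : list of (y, x) contour pixels of S8 on that boundary
    s10             : boolean contour image after the logical AND
    Returns False when the two regions should be merged.
    """
    if not boundary_pixels:
        return False
    common = sum(1 for (y, x) in boundary_pixels if s10[y][x])
    return common / len(boundary_pixels) >= criterion
```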
[0032]
Next, the operation of this embodiment will be described.
[0033]
<< Processing in Initial Region Generation Circuit 1 >>
When the moving image signal S1 is supplied to the initial region generation circuit 1, the entire screen is first divided into M × N small regions, as shown in FIG. 2. M and N are integers of about 4 to 6; FIG. 2 shows the small regions for M = 6 and N = 4. These are the initial regions set by the first region segmentation correction circuit 8-1. Likewise, the second region segmentation correction circuit 8-2 divides the entire screen into M × N small regions that differ from the initial regions set by the first region segmentation correction circuit 8-1.
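A grid of initial regions of this kind could be generated as in the Python sketch below. Treating M as the number of divisions across the screen and N as the number down the screen is an assumption, as are the function name and the row-major numbering of the regions. The description does not say how the second circuit's grid differs; calling the function with other values of `m` and `n` is one possible way.

```python
def initial_regions(width, height, m=6, n=4):
    """Label every pixel with the index of the M x N grid cell it falls in.

    Cells are numbered row by row, starting at 0 in the top-left corner.
    """
    return [
        [(y * n // height) * m + (x * m // width) for x in range(width)]
        for y in range(height)
    ]
```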
[0034]
<< Processing in Motion Information Weighting Circuit 2 >>
Meanwhile, the motion information and reliability information have been obtained in advance by known means and are supplied to the motion information weighting circuit 2. Here, the motion information obtained by motion estimation is the displacement of each pixel between temporally consecutive images of the moving image, and the reliability information is an index of how likely the estimated value of the motion information is to match its actual value. For example, the motion vector (u, v) given in Kaneko, Shishikui, Tanaka, "Improvement of accuracy of motion estimation by luminance gradient vector distribution", IEICE Technical Report, IE96-10, (1996) corresponds to the motion information obtained by motion estimation, and the reliabilities s_s and s_l correspond to the reliability information.
[0035]
The derivation of the reliabilities s_s and s_l is described below.
[0036]
[Outside 1]

The luminance of the point (x, y) on the moving image plane at time t is g (x, y), and the spatio-temporal luminance gradient vector is
[Expression 1]

If the movement vector is m = (u, v, 1), the basic formula of the gradient method of formula (2) is obtained using the inner product.
[0037]
[Expression 2]
$$\mathbf{g} \cdot {}^{t}\mathbf{m} = 0 \qquad (2)$$
Here, a vector m obtained by extending the motion vector (u, v) to three dimensions is referred to as a movement vector.
[0038]
As a constraint, assume that all pixels in an image fragment (n pixels) have the same motion. The movement vector m can then be estimated by minimizing the error Σ(g_i · ^t m)² over all pixels in the fragment (here i = 0 to n). Furthermore, if the normalized unit movement vector m_u is used in place of the movement vector m, the motion vector to be estimated can be obtained from the m_u that minimizes equation (3).
[0039]
[Equation 3]

Here, G is a covariance matrix of luminance gradient vectors in the image fragment, and can be expressed by the following equation (4).
[0040]
[Expression 4]

Let the eigenvalues of the matrix G be λ₁ ≧ λ₂ ≧ λ₃ and the corresponding eigenvectors be u₁, u₂, u₃. Equation (3) is minimized when m_u is the eigenvector u₃ belonging to the smallest eigenvalue λ₃ of G, so the movement vector m can be estimated from the third eigenvector u₃ of G.
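The equations labeled (1), (3), and (4) are not reproduced in this text. Under the definitions in paragraphs [0036] to [0040], one consistent reading is sketched below; it is offered as an assumption, not as the original formulas. In particular, the original (4) is called a covariance matrix and may include a normalization or mean subtraction that this sketch leaves out. The last line, which recovers (u, v) by rescaling u₃ so that its third component equals 1, follows from m = (u, v, 1) and assumes c ≠ 0.

```latex
\begin{align}
\mathbf{g} &= \left( \frac{\partial g}{\partial x},\;
                     \frac{\partial g}{\partial y},\;
                     \frac{\partial g}{\partial t} \right)
  && \text{assumed form of (1)} \\
\sum_{i} \left( \mathbf{g}_i \cdot {}^{t}\mathbf{m}_u \right)^{2}
  &= \mathbf{m}_u \, G \, {}^{t}\mathbf{m}_u
  && \text{assumed form of (3)} \\
G &= \sum_{i} {}^{t}\mathbf{g}_i \, \mathbf{g}_i
  && \text{assumed form of (4)} \\
\mathbf{m}_u = \mathbf{u}_3 = (a, b, c)
  &\;\Rightarrow\; (u, v) = \left( \frac{a}{c},\; \frac{b}{c} \right)
  && \text{rescaling to } \mathbf{m} = (u, v, 1)
\end{align}
```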
[0041]
Next, consider equation (2) in vector terms. As shown in FIG. 3, estimating the movement vector m means making the sum of the inner products between the estimated movement vector m and the multiple luminance gradient vectors g_i in the image fragment as small as possible. For example, if the luminance gradient vectors g_i are distributed on a plane, taking the direction of m to be the normal of that plane makes the sum of squared inner products in equation (3) zero, and the estimated motion vector is highly accurate. If, on the other hand, the g_i are distributed away from a plane, the sum of squared inner products in equation (3) is not zero and the accuracy of the estimated movement vector m falls. And if the g_i all lie along a single direction, the plane is not uniquely determined, so the movement vector m cannot be identified.
[0042]
As described above, the distribution of the luminance gradient vectors is closely related to the accuracy of the movement vector. In FIG. 4, the estimated movement vector is obtained accurately when the luminance gradient vectors are distributed on a plane, as in FIG. 4(a). When they are distributed spherically, away from a plane, as in FIG. 4(b), or along a line, as in FIG. 4(c), the accuracy of the estimated movement vector is considered to be low.
[0043]
The three-dimensional distribution of the luminance gradient vectors in each case of FIG. 4 is expressed by the following equation (5) in terms of the first eigenvalue λ₁, the second eigenvalue λ₂, and the third eigenvalue λ₃.
[Equation 5]

Hereinafter, (a) is called a planar distribution, (b) is called a spherical distribution, and (c) is called a linear distribution.
[0044]
If the first eigenvalue λ₁, the second eigenvalue λ₂, and the third eigenvalue λ₃ are plotted in a Cartesian coordinate system, as shown in FIG. 5, then because λ₁ ≧ λ₂ ≧ λ₃ the points are distributed within the triangular pyramid of FIG. 5. The edges of the triangular pyramid represent the limiting cases of the planar distribution, the spherical distribution, and the linear distribution. Here, these edges are called the plane axis, the spherical axis, and the linear axis, respectively. The reliability of the estimated movement vector is therefore considered to be higher the closer the eigenvalues of the covariance matrix G of the luminance gradient vectors in the image fragment are to the plane axis in FIG. 5. Further, as can be understood from FIG. 5, the reliability can decrease in two ways: toward the linear axis and toward the spherical axis.
[0045]
From the relationship between (a) and (b) in equation (5) and the relationship between (a) and (c), the following equation is defined.
[0046]
[Formula 6]

Based on this equation (6), the following equation (7) was defined as the reliability.
[0047]
[Expression 7]

In the definition of the reliability s_l in equation (7), 1/a_l is multiplied by the third eigenvalue λ₃. This weighting by λ₃ is applied because, even in a state close to a linear distribution, the reliability can be regarded as high if the third eigenvalue λ₃ is small. In equation (7), s_s is called the spherical reliability and s_l is called the linear reliability. The smaller the values of s_s and s_l, the higher the reliability.
[0048]
Returning to FIG. 1, the motion information weighting circuit 2 weights the motion information described above according to the reliability information. More specifically, when the reliability information does not satisfy a certain criterion, the motion information is multiplied by a weighting coefficient corresponding to the reliability so that it becomes small compared with highly reliable information that satisfies the criterion. For example, highly reliable motion information that satisfies the determination criterion is multiplied by a coefficient of “1” or more, and motion information that does not satisfy the determination criterion is multiplied by a coefficient less than “1”. The motion information to which the reliability has been applied in this way is supplied to the clustering circuit 3 as weighted motion information S3.
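The weighting rule above can be illustrated with a minimal Python sketch. The function name `weight_motion`, the single `threshold` applied to both reliability values, and the coefficient values are hypothetical; the text only states that reliable motion receives a coefficient of 1 or more and unreliable motion a coefficient below 1.

```python
def weight_motion(u, v, s_s, s_l, threshold=0.5, high_coeff=1.0, low_coeff=0.25):
    """Scale one motion vector (u, v) by a coefficient chosen from its reliability.

    s_s and s_l are the spherical and linear reliability values; smaller values
    mean higher reliability. The threshold test and both coefficients are
    illustrative assumptions, not values taken from the specification.
    """
    reliable = s_s <= threshold and s_l <= threshold
    k = high_coeff if reliable else low_coeff
    return u * k, v * k
```

A motion vector with small s_s and s_l passes through unchanged, while an unreliable one is shrunk so that it contributes less to the later clustering distance.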
[0049]
<< Processing in Clustering Circuit 3 >>
The clustering circuit 3 receives the moving image signal S1, the initial region signal S2 from the initial region generation circuit 1, and the weighted motion information S3 from the motion information weighting circuit 2, and executes clustering processing by the following calculation.
[0050]
First, the average value of every feature amount is calculated for each region of the initial region. The feature amounts here are seven pieces of information in total for each pixel: three items of color information (the three primary colors R, G, B), two items of position information (coordinates X, Y), and the weighted motion in the horizontal and vertical directions of the screen (U, V).
[0051]
Next, for each pixel of the entire screen of the moving image signal S1, the following evaluation function is used to determine which region's average feature amounts are closest to that pixel overall.
[0052]
[Equation 8]

Here,
Rm: average value of the red information R in each region
Gm: average value of the green information G in each region
Bm: average value of the blue information B in each region
Xm: average value of the position information X in each region
Ym: average value of the position information Y in each region
Um: average value of the motion information U in each region
Vm: average value of the motion information V in each region
Kl, Kp, Km: coefficient values, set in advance, by which the feature amounts are multiplied so that all of them fall in the same range, for example 0 to 1.
[0053]
Next, an evaluation function is obtained for all regions with respect to the feature amount of each pixel.
[0054]
The region where the obtained evaluation function is the smallest is considered as the optimum region, and the target pixel is included in the optimum region, and a region is newly constructed.
[0055]
After the minimum value search by the above evaluation function has been performed for all pixels and the regions have been reconstructed, the average values Rm, Gm, Bm, Xm, Ym, Um, and Vm of each region are recalculated.
[0056]
The above operation is then performed again for all pixels. The whole process is repeated until the average values Rm, Gm, Bm, Xm, Ym, Um, and Vm of each region no longer change.
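The loop described in the preceding paragraphs can be sketched in Python as follows. Since equation (8) is not reproduced in this text, the `distance` function below assumes a weighted sum of squared differences with Kl applied to the color terms, Kp to the position terms, and Km to the motion terms; that form, the function names, and the `max_iter` guard are assumptions for illustration only.

```python
def distance(p, mean, kl=1.0, kp=1.0, km=1.0):
    """Assumed evaluation function over the features (R, G, B, X, Y, U, V)."""
    d = [(a - b) ** 2 for a, b in zip(p, mean)]
    return kl * sum(d[0:3]) + kp * sum(d[3:5]) + km * sum(d[5:7])


def region_means(pixels, labels, k):
    """Average each of the seven features over the pixels of every region."""
    means = []
    for r in range(k):
        members = [p for p, l in zip(pixels, labels) if l == r]
        if not members:          # empty region: skipped during the search
            means.append(None)
            continue
        means.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return means


def cluster(pixels, initial_labels, k, max_iter=100):
    """Reassign pixels to the nearest region until the region means stop changing."""
    labels = list(initial_labels)
    means = region_means(pixels, labels, k)
    for _ in range(max_iter):
        labels = [min((r for r in range(k) if means[r] is not None),
                      key=lambda r: distance(p, means[r]))
                  for p in pixels]
        new_means = region_means(pixels, labels, k)
        if new_means == means:   # averages unchanged: converged
            break
        means = new_means
    return labels, means
```

Here `pixels` is a flat list of seven-element tuples and `initial_labels` gives the region index of each pixel in the initial region; both representations are choices made for this sketch.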
[0057]
In the clustering circuit 3, the processing as described above is performed, and a final region image signal S4 in which the screen is divided into regions is generated.
[0058]
<< Processing in Filter Circuit 4 >>
The area image signal S4 generated by the clustering circuit 3 is supplied to the filter circuit 4, and the filter circuit 4 removes small granular areas included in the area image signal S4. The filter used here is a majority filter, and the size of 5 × 5 pixels is sufficient. FIG. 6 shows an example of the majority filter processing. In this example, a window filter having a size of 5 pixels × 5 pixels is used. Further, the pixel at the center of the window is integrated into the region 1 having the largest number of pixels in the filter. Even when the most frequent region is not adjacent to the central pixel and exists in the peripheral portion, the central pixel is integrated into the most frequent region 1. In this case, the center pixel still forms a granular area, but this granular area is eventually removed by the small area integration circuit 6 described later. When the central pixel of the filter originally becomes a part of the most frequent region 1, the region to which the central pixel belongs does not change. Thus, the area image is corrected by the filter circuit 4.
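A majority filter of this kind can be sketched in Python as below. The region image is represented as a list of rows of region labels; clipping the window at the image border and the tie-breaking rule are assumptions of this sketch, since the text does not specify them.

```python
from collections import Counter


def majority_filter(labels, size=5):
    """Replace each pixel's label by the most frequent label in its window."""
    h, w = len(labels), len(labels[0])
    half = size // 2
    out = [row[:] for row in labels]      # results go to a copy of the input
    for y in range(h):
        for x in range(w):
            # Assumption: the window is clipped at the image border.
            window = [labels[j][i]
                      for j in range(max(0, y - half), min(h, y + half + 1))
                      for i in range(max(0, x - half), min(w, x + half + 1))]
            counts = Counter(window)
            best = max(counts.values())
            # Assumption: on a tie the pixel keeps its label if it is among
            # the winners; otherwise the smallest winning label is used.
            if counts[labels[y][x]] != best:
                out[y][x] = min(l for l, c in counts.items() if c == best)
    return out
```

A single stray label in the middle of a uniform 5 × 5 neighbourhood is replaced by the surrounding label, which is the granular-area removal described above.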
[0059]
<< Processing in Numbering Circuit 5 >>
The area image signal S5, from which most of the small granular areas have been removed by the filter circuit 4, is supplied to the numbering circuit 5. The numbering circuit 5 assigns the same identification number to pixels that belong to the same region and are connected on the screen. The order in which numbers are assigned is arbitrary; it is only necessary that different areas can be told apart, so the numbers need not be sequential. With the numbering circuit 5, the regions corrected by the filter circuit 4 can be identified separately whenever they are not connected, even if they have the same feature amounts.
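The numbering step amounts to connected-component labeling, which can be sketched in Python with a breadth-first flood fill. The use of 4-connectivity and the function name `number_regions` are assumptions; the text only requires that connected pixels of the same region share a number.

```python
from collections import deque


def number_regions(labels):
    """Give every 4-connected group of equal labels its own identification number."""
    h, w = len(labels), len(labels[0])
    ids = [[None] * w for _ in range(h)]
    next_id = 0
    for y in range(h):
        for x in range(w):
            if ids[y][x] is not None:
                continue                      # already numbered
            ids[y][x] = next_id
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w and ids[ny][nx] is None
                            and labels[ny][nx] == labels[cy][cx]):
                        ids[ny][nx] = next_id
                        queue.append((ny, nx))
            next_id += 1
    return ids
```

Two disconnected patches that carry the same cluster label receive different identification numbers, which is what allows the later small-area integration to treat them separately.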
[0060]
<< Processing in Small Area Integration Circuit 6 >>
The area image signal S6, to which identification numbers have been assigned by the numbering circuit 5, is supplied to the small area integration circuit 6. The small area integration circuit 6 integrates each area whose number of pixels is equal to or smaller than a predetermined threshold value into another area. The threshold value is set to, for example, about 1% of the number of pixels on the entire screen. Integration starts from the outermost pixels of the area to be integrated: pixels in contact with another area are made to belong to that other area. Because pixels are reassigned in order from the outside, every pixel eventually belongs to another area. Specifically, within the range of 3 pixels × 3 pixels centered on the target pixel, the target pixel is integrated into the adjacent area that accounts for the largest number of the eight neighboring pixels surrounding it. When an outer pixel is in contact with a plurality of areas, it belongs to the area with which the largest number of its neighboring pixels are in contact. When these numbers are equal, the adjacent areas are compared as a whole and the pixel is integrated into the largest one. FIG. 7 shows a processing example of the small area integration circuit 6. In the example of FIG. 7A, the central target pixel belongs to the region 1. In the example of FIG. 7B, since the region 1 and the region 2 each occupy 3 pixels in the neighborhood, the central target pixel belongs to whichever of the region 1 and the region 2 has the larger area as a whole. If the areas of the entire regions are also equal, the region is determined according to a priority order fixed in advance by adjacent position (for example, a priority order assigned clockwise from the upper left).
Through the processing so far, a region-divided image in which all the small regions generated by clustering are integrated into a large region is obtained.
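The per-pixel decision used during small area integration can be sketched in Python as follows. The function `choose_region`, the `area` dictionary of total pixel counts, and the handling of interior pixels are hypothetical; the final positional priority rule for regions of equal total area is not implemented here.

```python
from collections import Counter


def choose_region(ids, y, x, area):
    """Pick the region that absorbs pixel (y, x) of a small region.

    ids is a 2-D list of region identification numbers and area maps each
    number to its total pixel count. Neighbours in the pixel's own region
    are ignored when counting.
    """
    h, w = len(ids), len(ids[0])
    own = ids[y][x]
    votes = Counter()
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            if (dy or dx) and 0 <= ny < h and 0 <= nx < w and ids[ny][nx] != own:
                votes[ids[ny][nx]] += 1
    if not votes:
        return own   # interior pixel: revisit once the outer pixels are merged
    top = max(votes.values())
    tied = [r for r, c in votes.items() if c == top]
    # Tie on neighbour count: prefer the region with the larger total area.
    return max(tied, key=lambda r: area[r])
```

Applying this function repeatedly to the outermost pixels of a small region, and updating `ids` after each pass, peels the region away from the outside inward as the text describes.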
[0061]
<< Processing in Outline Extraction Circuit 7 >>
The region image signal S7 obtained by the small region integration circuit 6 is supplied to the contour extraction circuit 7. The contour extraction circuit 7 generates a contour image signal S8 obtained by extracting a contour line that is a boundary of each region in the region image signal S7.
[0062]
<< Processing in Outline Superimposing Circuit 9 >>
The contour image signal S8 obtained by the contour extraction circuit 7 is supplied to the contour superimposing circuit 9. The contour superimposing circuit 9 is also supplied with a contour image signal S9 obtained by the second region division correcting circuit 8-2. In the contour superimposing circuit 9, a contour image signal S10 obtained by taking the logical product of the two supplied contour image signals S8 and S9 is created. In this embodiment, two contour image signals S8 and S9 are used, but three or more contour image signals may be used. That is, three or more area division correction circuits can be used.
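Contour extraction and the logical product of two contour images can be sketched in Python as below. Marking a pixel as contour when its right or lower neighbour has a different region number is one simple convention chosen for this sketch; the specification does not state which boundary definition the contour extraction circuit uses.

```python
def extract_contour(ids):
    """Mark a pixel as contour when its right or lower neighbour has another label."""
    h, w = len(ids), len(ids[0])
    return [[(x + 1 < w and ids[y][x + 1] != ids[y][x]) or
             (y + 1 < h and ids[y + 1][x] != ids[y][x])
             for x in range(w)] for y in range(h)]


def superimpose(contour_a, contour_b):
    """Logical product (AND) of two contour images of equal size."""
    return [[a and b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(contour_a, contour_b)]
```

Only boundary pixels present in both segmentations survive the AND, so boundaries that appear under just one initial region shape are suppressed.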
[0063]
<< Processing in Integration Circuit 10 >>
The contour image signal S10 generated by the contour superimposing circuit 9 and the contour image signal S8 generated by the contour extraction circuit 7 are supplied to the integration circuit 10. The integration circuit 10 first extracts only the contour line of one region in the image of the contour image signal S8, and then selects from that contour line only the boundary line with one adjacent region. The length of this boundary line is compared with the length of the corresponding contour line in the contour image signal S10. If the ratio is equal to or greater than a certain determination criterion, the selected boundary line is left as it is. If it is less than the determination criterion, the boundary line is removed and the two regions are integrated, on the assumption that the region is the same as the adjacent region. Depending on the purpose of the area division, the determination criterion is set small when a larger number of divided areas is desired, and set large when a smaller number is desired. For example, when the ratio is set to 60%, a result in which not many regions remain is obtained. This operation is performed on the boundary lines with all regions adjacent to the target region, and then on the boundary lines of all regions in the reference contour image signal S8.
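The keep-or-merge decision for one boundary line can be sketched in Python as follows. Representing a boundary as a set of pixel coordinates and measuring length by pixel count are assumptions of this sketch; the 60% default is the example value mentioned in the text.

```python
def keep_boundary(boundary_pixels, common_contour, criterion=0.6):
    """Decide whether the boundary between two adjacent regions survives.

    boundary_pixels: set of (y, x) positions of that boundary in the reference
    contour image (S8). common_contour: set of (y, x) positions that are set in
    the superimposed contour image (S10). Length is measured in pixels.
    """
    if not boundary_pixels:
        return False
    ratio = len(boundary_pixels & common_contour) / len(boundary_pixels)
    return ratio >= criterion
```

When the function returns False, the two regions on either side of the boundary would be merged; raising `criterion` merges more aggressively and leaves fewer regions.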
[0064]
As described above, according to this embodiment, three stages of region correction are performed after the clustering process by the K-means algorithm in order to eliminate the problems of small-area generation and dependency on the initial region. That is, (1) filtering is applied to small granular areas of about several pixels, (2) identification numbers are assigned to the remaining small areas, which are then integrated into adjacent large areas, and (3) for the dependency on the initial value, the result is integrated with the result obtained when the shape of the initial region is changed. As a result, the occurrence of small areas and the initial value dependency problem can be eliminated without complicated processing or a huge amount of calculation.
[0065]
Further, in the present invention, in order to use motion information that includes uncertainty as one of the feature amounts when a moving image is targeted, the reliability of the obtained motion information is compared with a determination criterion. Motion information that does not satisfy the criterion is multiplied by a smaller coefficient than motion information that satisfies it. When the motion information is used as a feature amount, this weighting reduces the influence of unreliable motion information relative to highly reliable motion information. This makes it possible to perform clustering that handles color, position, and motion information in a unified manner as feature amounts.
[0066]
[Effect of the invention]
As described above, according to the invention of claim 1, the reliability of the obtained motion information is compared with a determination criterion so that motion information including uncertainty can be used as one of the feature amounts. Motion information can therefore be used together with image features such as color information and position information while its uncertainty is eliminated. Further, by performing integration processing with the result obtained when the shape of the initial region is changed, a region-divided image that does not depend on the initial value can be obtained.
[0067]
According to the invention of claim 2, the filtering process makes it possible to remove isolated small granular areas of about several pixels contained in the area image signal generated by the clustering circuit, without repeated or iterative execution of the K-means algorithm itself.
[0068]
According to the invention of claim 3, the small regions in the region image signal generated by the clustering circuit can be integrated into adjacent large regions without repeated or iterative execution of the K-means algorithm itself.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of an embodiment of a moving image region dividing device according to the present invention.
FIG. 2 is an explanatory diagram illustrating an example of an initial region generated by an initial region generation circuit.
FIG. 3 is an explanatory diagram showing a relationship between a motion vector and a luminance gradient vector in the process of deriving reliability information.
FIG. 4 is an explanatory diagram showing a luminance gradient vector distribution in the process of deriving reliability information.
FIG. 5 is an explanatory diagram showing eigenvalues of a luminance gradient vector in the process of deriving reliability information.
FIG. 6 is an explanatory diagram showing an example of processing of a majority filter constituting a filter circuit.
FIG. 7 is an explanatory diagram showing a processing example in the small region integration circuit.
[Explanation of symbols]
1 Initial region generation circuit
2 Motion information weighting circuit
3 Clustering circuit
4 Filter circuit
5 Numbering circuit
6 Small area integration circuit
7 Contour extraction circuit
8-1 First area division correction circuit
8-2 Second area division correction circuit
9 Contour superimposing circuit
10 Integration circuit
S1 Moving image signal
S2 Initial area signal
S3 Weighted motion information
S4 Area image signal
S5 Area image signal from which granular areas have been removed
S6 Area image signal assigned identification number
S7 Integrated region image signal
S8 Contour image signal
S9 Contour image signal from the second region division correction circuit
S10 Contour image signal after AND operation

Claims

A moving image region dividing device comprising:
a motion information weighting circuit that receives motion information, defined as the amount of movement of each pixel between temporally consecutive images of a moving image, and a reliability, defined as an index of the likelihood of the estimated value relative to the actual value of the motion information, and that performs weighting by multiplying highly reliable motion information that satisfies a certain determination criterion by a large coefficient and multiplying low-reliability motion information that does not satisfy the determination criterion by a small coefficient;
a clustering circuit that receives color information and position information of the pixels in the moving image, the weighted motion information from the motion information weighting circuit, and an initial region, treats the color information, the position information, and the weighted motion information in a unified manner as a plurality of feature amounts, and divides the target moving image into regions using an optimal search by an evaluation function to generate a region image signal; and
an integration circuit that receives a region contour obtained from the region image signal generated by the clustering circuit and a region contour obtained from a region image signal that uses an initial region shape different from said initial region, and extracts the common portion of these region contours to generate a region-integrated image from which over-divided portions are excluded.

The moving image region dividing device according to claim 1,
wherein a filter circuit is interposed between the clustering circuit and the integration circuit, the filter circuit receiving the region image signal generated by the clustering circuit and generating a region image signal from which isolated small regions contained in that region image signal have been removed.

The moving image region dividing device according to claim 1 or 2,
wherein a small region integration circuit is interposed between the clustering circuit or the filter circuit and the integration circuit, the small region integration circuit receiving the region image signal generated by the clustering circuit or the region image signal from which isolated small regions have been removed by the filter circuit, and generating a region image signal in which small regions are integrated by assigning each region whose number of pixels is equal to or less than a predetermined threshold value to the adjacent region having the largest number of pixels within the eight neighboring pixels surrounding a pixel of that region, or, when a plurality of adjacent regions have the largest number of pixels within the eight neighboring pixels surrounding a pixel of that region, to the largest adjacent region as determined by comparing those adjacent regions as a whole.