JP4264803B2 - Image processing apparatus and method, learning apparatus and method, recording medium, and program

Info

Publication number: JP4264803B2
Application number: JP2002327268A
Authority: JP
Inventors: 哲二郎近藤; 靖立平; 淳一石橋; 成司和田; 泰広周藤
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2002-11-11
Filing date: 2002-11-11
Publication date: 2009-05-20
Anticipated expiration: 2022-11-11
Also published as: JP2004165837A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像処理装置および方法、学習装置および方法、記録媒体、並びにプログラムに関し、特に、より正確に動きベクトルを検出できるようにした画像処理装置および方法、学習装置および方法、記録媒体、並びにプログラムに関する。
【０００２】
【従来の技術】
画像の動きを示す動きベクトルを求め、この動きベクトルに基づいて効率よく動画像を圧縮する技術がある。
【０００３】
この動画像圧縮技術における上述の動きベクトルを求める手法としては、いくつか提案されているが、代表的な手法としてブロックマッチングアルゴリズムと呼ばれる手法がある。
【０００４】
図１は、ブロックマッチングアルゴリズムを採用した従来の画像処理装置の動き検出部１の構成例を示している。
【０００５】
動き検出部１のフレームメモリ１１は、例えば、時刻ｔ１において、入力端子Ｔｉｎから画像信号が入力されると、１フレーム分の情報を格納する。更に、フレームメモリ１１は、次のタイミングとなる時刻ｔ２において、入力端子Ｔｉｎから次のフレームの画像信号が入力されると、時刻ｔ１において、格納した１フレーム分の画像情報をフレームメモリ１２に出力した後、新たに入力された１フレーム分の画像情報を格納する。
【０００６】
また、フレームメモリ１２は、時刻ｔ２のタイミングで、フレームメモリ１１から入力されてくる時刻ｔ１のタイミングで入力端子Ｔｉｎから入力されてきた１フレーム分の画像情報を格納する。
【０００７】
すなわち、フレームメモリ１１が、上述の時刻ｔ２のタイミングで入力される（今現在の）１フレーム分の画像情報を格納するとき、フレームメモリ１２は、時刻ｔ１のタイミングで入力された（１タイミング過去の）１フレーム分の画像情報を格納していることになる。なお、以下において、フレームメモリ１１に格納される画像情報をカレントフレームＦｃ、フレームメモリ１２に格納される画像情報を参照フレームＦｒと称するものとする。
【０００８】
動きベクトル検出部１３は、フレームメモリ１１，１２に格納されているカレントフレームＦｃと参照フレームＦｒをそれぞれから読出し、このカレントフレームＦｃと参照フレームＦｒに基づいて、ブロックマッチングアルゴリズムにより動きベクトルを検出し、出力端子Toutから出力する。
【０００９】
ここで、ブロックマッチングアルゴリズムについて説明する。例えば、図２で示すように、カレントフレームＦｃ内の注目画素Ｐ（ｉ，ｊ）に対応する動きベクトルを求める場合、まず、カレントフレームＦｃ上に注目画素Ｐ（ｉ，ｊ）を中心としたＬ（画素数）×Ｌ（画素数）からなる基準ブロックＢｂ（ｉ，ｊ）、参照フレームＦｒ上に、注目画素Ｐ（ｉ，ｊ）の位置に対応するサーチエリアＳＲ、そして、そのサーチエリアＳＲ内に、Ｌ（画素数）×Ｌ（画素数）の画素からなる参照ブロックＢｒｎ（ｉ，ｊ）がそれぞれ設定される。
【００１０】
次に、この基準ブロックＢｂ（ｉ，ｊ）と、参照ブロックＢｒｎ（ｉ，ｊ）の各画素間の差分の絶対値の和を求める処理が、参照ブロックＢｒｎをサーチエリアＳＲ内の全域で水平方向、または、垂直方向に１画素分ずつ移動させながら、図２中のＢｒ１からＢｒｍ（参照ブロックＢｒｎが、サーチエリアＳＲ内にｍ個設定できるものとする）まで繰り返される。
【００１１】
このようにして求められた基準ブロックＢｂ（ｉ，ｊ）と、参照ブロックＢｒｎ（ｉ，ｊ）の各画素間の差分絶対値和のうち、差分絶対値和が最小となる参照ブロックＢｒｎを求めることにより、基準ブロックＢｂ（ｉ，ｊ）に最も近い（類似している）参照ブロックＢｒｎ（ｉ，ｊ）を構成するＬ×Ｌ個の画素の中心となる参照画素Ｐｎ（ｉ，ｊ）が求められる。
【００１２】
そして、このカレントフレームＦｃ上の注目画素Ｐ（ｉ，ｊ）に対応する参照フレームＦｒ上の画素Ｐ'（ｉ，ｊ）を始点とし、参照画素Ｐｎ（ｉ，ｊ）を終点とするベクトルが、注目画素Ｐ（ｉ，ｊ）の動きベクトル（Ｖｘ，Ｖｙ）として出力される。ここで、例えば、Ｐ（ｉ，ｊ）＝（ａ，ｂ）、および、Ｐｎ（ｉ，ｊ）＝（ｃ，ｄ）である場合、（Ｖｘ，Ｖｙ）は、（Ｖｘ，Ｖｙ）＝（ｃ−ａ，ｄ−ｂ）となる。
【００１３】
すなわち、注目画素Ｐ（ｉ，ｊ）に対応する参照フレームＦｒ上の参照画素Ｐ'（ｉ，ｊ）を始点とし、基準ブロックＢｂ（ｉ，ｊ）に最も近い（類似している）参照ブロックＢｒｎ（ｉ，ｊ）を構成するＬ×Ｌ個の画素の中心となる参照画素Ｐｎ（ｉ，ｊ）を終点とするベクトルが動きベクトルとして求められる。
【００１４】
次に、図３のフローチャートを参照して、図１の動き検出部１の動き検出処理について説明する。
【００１５】
ステップＳ１において、動きベクトル検出部１３は、フレームメモリ１１に格納されているカレントフレームＦｃ上の注目画素Ｐ（ｉ，ｊ）の画素位置に応じて、サーチエリアＳＲを設定する。
【００１６】
ステップＳ２において、動きベクトル検出部１３は、上述のように、基準ブロックＢｂ（ｉ，ｊ）と参照ブロックＢｒｎ（ｉ，ｊ）の画素間の差分絶対値和の最小値を設定する変数minを、画素の階調数に基準ブロックＢｂ（ｉ，ｊ）を構成する画素数を乗じた値に設定することにより初期化する。すなわち、例えば、１画素が８ビットのデータであった場合、１画素の階調数は、２の８乗となるため２５６階調（２５６色）となる。また、基準ブロックＢｂ（ｉ，ｊ）がＬ画素×Ｌ画素＝３画素×３画素から構成される場合、その画素数は、９個となる。結果として、変数minは、２３０４（＝２５６（階調数）×９（画素数））に初期化される。
【００１７】
ステップＳ３において、動きベクトル検出部１３は、参照ブロックＢｒｎをカウントするカウンタ変数ｎを１に初期化する。
【００１８】
ステップＳ４において、動きベクトル検出部１３は、基準ブロックＢｂと参照ブロックＢｒｎの画素間の差分絶対値和を代入するために用いる変数sumを０に初期化する。
【００１９】
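In step S5, the motion vector detection unit 13 computes the sum of absolute differences (= sum) between the pixels of the base block Bb(i, j) and those of the reference block Brn(i, j). That is, when each pixel of the base block Bb(i, j) is denoted P_Bb(i, j) and each pixel of the reference block Brn(i, j) is denoted P_Brn(i, j), the motion vector detection unit 13 evaluates equation (1) below to obtain the sum of absolute differences between the pixels of the base block Bb(i, j) and the reference block Brn(i, j). In equation (1), i and j index the L × L pixels within each block.
【００２０】
【数１】
$$\mathrm{sum} = \sum_{i=1}^{L}\sum_{j=1}^{L}\left|P\_Bb(i,j) - P\_Brn(i,j)\right| \quad \cdots (1)$$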
【００２１】
ステップＳ６において、動きベクトル検出部１３は、変数minが変数sumよりも大きいか否かを判定し、例えば、変数minが変数sumよりも大きいと判定する場合、ステップＳ７において、変数minを変数sumに更新し、その時点でのカウンタｎの値を動きベクトル番号として登録する。すなわち、今求めた差分絶対値和を示す変数sumが、最小値を示す変数minよりも小さいと言うことは、これまで演算したどの参照ブロックよりも、今演算している参照ブロックＢｒｎ（ｉ，ｊ）が基準ブロックＢｂ（ｉ，ｊ）により類似したものであるとみなすことができるので、動きベクトルを求める際の候補とするため、その時点でのカウンタｎが動きベクトル番号として登録される。また、ステップＳ６において、変数minが変数sumよりも大きくないと判定された場合、ステップＳ７の処理がスキップされる。
【００２２】
ステップＳ８において、動きベクトル検出部１３は、カウンタ変数ｎがサーチエリアＳＲの参照ブロックＢｒｎの総数ｍであるか否か、すなわち、今の参照ブロックＢｒｎがＢｒｎ＝Ｂｒｍであるか否かを判定し、例えば、総数ｍではないと判定した場合、ステップＳ９において、カウンタ変数ｎを１インクリメントし、その処理は、ステップＳ４に戻る。
【００２３】
ステップＳ８において、カウンタ変数ｎがサーチエリア内の参照ブロックＢｒｎの総数ｍである、すなわち、今の参照ブロックＢｒｎがＢｒｎ＝Ｂｒｍであると判定された場合、ステップＳ１０において、動きベクトル検出部１３は、登録されている動きベクトル番号に基づいて動きベクトルを出力する。すなわち、ステップＳ４乃至Ｓ９が繰り返されることにより、差分絶対値和が最小となる参照ブロックＢｒｎに対応するカウンタ変数ｎが動きベクトル番号として登録されることになるので、動きベクトル検出部１３は、この動きベクトル番号に対応する参照ブロックＢｒｎのＬ×Ｌ個の画素のうち、その中心となる参照画素Ｐｎ（ｉ，ｊ）を求め、カレントフレームＦｃ上の注目画素Ｐ（ｉ，ｊ）に対応する参照フレームＦｒ上の画素Ｐ'（ｉ，ｊ）を始点とし、参照画素Ｐｎ（ｉ，ｊ）を終点とするベクトルを、注目画素Ｐ（ｉ，ｊ）の動きベクトル（Ｖｘ，Ｖｙ）として求めて出力する。
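The flow of steps S1 to S10 can be sketched as follows. This is a minimal illustration, not the implementation of the motion detection unit 1: the function names, the list-of-lists frame layout, the square search area given by `search`, and the skipping of reference blocks that cross the frame edge are all assumptions made for the example.

```python
def sad(cur, ref, cy, cx, ry, rx, half):
    """Sum of absolute differences between the L x L blocks centred at
    (cy, cx) in cur and (ry, rx) in ref, where L = 2 * half + 1."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def block_match(cur, ref, y, x, half=1, search=2, levels=256):
    """Return (Vx, Vy) for the pixel of interest at row y, column x."""
    block_pixels = (2 * half + 1) ** 2
    best = levels * block_pixels            # step S2: e.g. 256 * 9 = 2304
    best_pos = (y, x)
    height, width = len(ref), len(ref[0])
    for ry in range(y - search, y + search + 1):        # steps S3, S8, S9
        for rx in range(x - search, x + search + 1):
            # Assumption: reference blocks that leave the frame are skipped.
            if (ry - half < 0 or rx - half < 0
                    or ry + half >= height or rx + half >= width):
                continue
            s = sad(cur, ref, y, x, ry, rx, half)       # steps S4, S5
            if best > s:                                # steps S6, S7
                best, best_pos = s, (ry, rx)
    # Step S10: vector from P'(i, j) to the centre of the best reference block.
    return best_pos[1] - x, best_pos[0] - y
```

Every pixel of interest triggers one SAD evaluation per candidate block, which is the cost that the problem statement later in the description refers to.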
【００２４】
以上説明したようなブロックマッチング法により、動きベクトルを検出する場合において、動きベクトル検出対象ブロック、および、参照フレームの小ブロックごとに定常成分および過渡成分を抽出し、これらの定常成分および過渡成分の差分をそれぞれ検出して、絶対値差分を累算した値を加重平均して評価値を形成し、この評価値に基づいて動きベクトルを検出することにより、演算量を低減することができ、また、誤検出のおそれを防止することができるようにする技術がある（例えば、特許文献１参照）。また、１ビットＡＤＲＣによって各画素値を１ビットのコード値に符号化し、そのコード値を使用してマッチング演算を行うことにより、動きベクトルの検出に係る回路構成、演算時間等を簡素化することができる技術がある（例えば、特許文献２参照）。
【００２５】
【特許文献１】
特開平０７−０８７４９４号公報
【００２６】
【特許文献２】
特開２０００−２７８６９１号公報
【００２７】
【発明が解決しようとする課題】
しかしながら、上述したブロックマッチングアルゴリズムは、式（１）の演算量が非常に膨大なものとなるため、MPEG（Moving Picture Experts Group）等の画像圧縮処理においては、大半の時間がこの処理に費やされてしまうという課題があった。
【００２８】
また、カレントフレームＦｃ、または、参照フレームＦｒの動きベクトルの始点、または、終点付近でノイズが含まれた場合、ブロックマッチングでは基準ブロックに類似する参照ブロックを検出することができず、正確な動きベクトルを検出することができないという課題があった。
【００２９】
本発明はこのような状況に鑑みてなされたものであり、正確に動きベクトルを生成することができるようにするものである。
【００３０】
【課題を解決するための手段】
本発明の画像処理装置は、第１の学習フレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードに対応付けて、所定の画素群のうち、第１の学習フレームと第１の学習フレームより時間的に前のフレームである第２の学習フレームとで検出される画素値のレベル変動が最も少ない画素位置の情報を記憶する画素位置テーブルと、入力される第１のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第１の画素位置を画素位置テーブルから読み出し、画素位置テーブルから読み出されたレベル変動が最も少ない第１の画素位置の画素値を示す８ビットのデータのうち、上位ｍビットを用いてコード化されたコードＡと、第１のフレームの注目画素を含む所定の画素群の量子化コードからなる、第１のフレームの注目画素の特徴量コードＡを抽出する第１の特徴量抽出手段と、第１のフレームより時間的に前のフレームである第２のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第２の画素位置を画素位置テーブルから読み出し、画素位置テーブルから読み出されたレベル変動が最も少ない第２の画素位置の画素値を示す８ビットのデータのうち、上位ｍ−１ビットを用いて、コードＡの境界部分が示す画素値を含む所定の画素値範囲でオーバーラップするようにそれぞれコード化されたコードＢまたはコードＣと、第２のフレームの注目画素を含む所定の画素群の量子化コードからなる、第２のフレームの注目画素の特徴量コードＢおよび特徴量コードＣをそれぞれ抽出する第２の特徴量抽出手段と、第２の特徴量抽出手段により抽出された特徴量コードＢおよび特徴量コードＣごとに、対応する画素位置の情報を記憶するデータベースと、データベースにより記憶されている画素位置の情報のうち、第１の特徴量抽出手段により抽出された第１のフレームの注目画素の特徴量コードＡと値が一致する特徴量コードＢおよび特徴量コードＣの画素位置の情報を検索する検索手段とを備える。
【００３１】
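The second feature extraction means can encode code B, using the upper m−1 bits of the 8-bit data representing the pixel value at the second pixel position, by adding 2^(m−1) to that 8-bit data, dividing the result by 2^(m+1), and multiplying the quotient by 2. It can encode code C, likewise using the upper m−1 bits of that 8-bit data, by subtracting 2^(m−1) from the 8-bit data, dividing the result by 2^(m+1), multiplying the quotient by 2, and then adding 1.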
【００３２】
第２の特徴量抽出手段は、第２のフレームの注目画素を含む所定の画素群の量子化コードのうち、量子化における閾値近傍の画素値に対応する所定のコードのみをビット反転し、所定のコードのみがビット反転されている量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第３の画素位置を画素位置テーブルから読み出し、画素位置テーブルから読み出されたレベル変動が最も少ない第３の画素位置の画素値を示す８ビットのデータのうち、上位ｍ−１ビットを用いて、コードＡの境界部分が示す画素値を含む所定の画素値範囲でオーバーラップするようにそれぞれコード化されたコードＢまたはコードＣと、所定のコードのみがビット反転されている量子化コードからなる特徴量コードＢおよび特徴量コードＣもそれぞれ抽出することができる。
【００３３】
データベースは、複数備えられ、フレーム毎に交互に情報が記憶される。
【００３４】
検索手段により検索された画素位置の情報のうち、第１のフレーム中の注目画素との距離が最小となる画素位置を検出し、検出された画素位置と、注目画素の画素位置の情報とを基に、動きベクトルを生成する動きベクトル生成手段を更に備えさせるようにすることができる。
【００３５】
本発明の画像処理方法は、動きベクトルを検出する画像処理装置が、第１の学習フレームの注目画素を含む所定の画素群のＡＤＲＣの各画素値をｎビットの量子化コードに対応付けて、所定の画素群のうち、第１の学習フレームと第１の学習フレームより時間的に前のフレームである第２の学習フレームとで検出される画素値のレベル変動が最も少ない画素位置の情報を記憶する画素位置テーブルから、入力される第１のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第１の画素位置を読み出し、画素位置テーブルから読みだされたレベル変動が最も少ない第１の画素位置の画素値を示す８ビットのデータのうち、上位ｍビットを用いてコード化されたコードＡと、第１のフレームの注目画素を含む所定の画素群の量子化コードからなる、第１のフレームの注目画素の特徴量コードＡを抽出し、第１のフレームより時間的に前のフレームである第２のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第２の画素位置を画素位置テーブルから読み出し、画素位置テーブルから読みだされたレベル変動が最も少ない第２の画素位置の画素値を示す８ビットのデータのうち、上位ｍ−１ビットを用いて、コードＡの境界部分が示す画素値を含む所定の画素値範囲でオーバーラップするようにそれぞれコード化されたコードＢまたはコードＣと、第２のフレームの注目画素を含む所定の画素群の量子化コードからなる、第２のフレームの注目画素の特徴量コードＢおよび特徴量コードＣをそれぞれ抽出し、抽出された特徴量コードＢおよび特徴量コードＣごとに、対応する画素位置の情報をデータベースに記憶する処理を制御し、データベースに記憶されている画素位置の情報のうち、抽出された第１のフレームの注目画素の特徴量コードＡと値が一致する特徴量コードＢおよび特徴量コードＣの画素位置の情報を検索するステップを含む。
【００３６】
本発明の第１の記録媒体に記録されているプログラムは、第１の学習フレームの注目画素を含む所定の画素群のＡＤＲＣの各画素値をｎビットの量子化コードに対応付けて、所定の画素群のうち、第１の学習フレームと第１の学習フレームより時間的に前のフレームである第２の学習フレームとで検出される画素値のレベル変動が最も少ない画素位置の情報を記憶する画素位置テーブルから、入力される第１のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第１の画素位置を読み出し、画素位置テーブルから読みだされたレベル変動が最も少ない第１の画素位置の画素値を示す８ビットのデータのうち、上位ｍビットを用いてコード化されたコードＡと、第１のフレームの注目画素を含む所定の画素群の量子化コードからなる、第１のフレームの注目画素の特徴量コードＡを抽出し、第１のフレームより時間的に前のフレームである第２のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第２の画素位置を画素位置テーブルから読み出し、画素位置テーブルから読みだされたレベル変動が最も少ない第２の画素位置の画素値を示す８ビットのデータのうち、上位ｍ−１ビットを用いて、コードＡの境界部分が示す画素値を含む所定の画素値範囲でオーバーラップするようにそれぞれコード化されたコードＢまたはコードＣと、第２のフレームの注目画素を含む所定の画素群の量子化コードからなる、第２のフレームの注目画素の特徴量コードＢおよび特徴量コードＣをそれぞれ抽出し、抽出された特徴量コードＢおよび特徴量コードＣごとに、対応する画素位置の情報をデータベースに記憶する処理を制御し、データベースに記憶されている画素位置の情報のうち、抽出された第１のフレームの注目画素の特徴量コードＡと値が一致する特徴量コードＢおよび特徴量コードＣの画素位置の情報を検索するステップを含む処理をコンピュータに実行させる。
【００３７】
本発明の第１のプログラムは、第１の学習フレームの注目画素を含む所定の画素群のＡＤＲＣの各画素値をｎビットの量子化コードに対応付けて、所定の画素群のうち、第１の学習フレームと第１の学習フレームより時間的に前のフレームである第２の学習フレームとで検出される画素値のレベル変動が最も少ない画素位置の情報を記憶する画素位置テーブルから、入力される第１のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第１の画素位置を読み出し、画素位置テーブルから読みだされたレベル変動が最も少ない第１の画素位置の画素値を示す８ビットのデータのうち、上位ｍビットを用いてコード化されたコードＡと、第１のフレームの注目画素を含む所定の画素群の量子化コードからなる、第１のフレームの注目画素の特徴量コードＡを抽出し、第１のフレームより時間的に前のフレームである第２のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第２の画素位置を画素位置テーブルから読み出し、画素位置テーブルから読みだされたレベル変動が最も少ない第２の画素位置の画素値を示す８ビットのデータのうち、上位ｍ−１ビットを用いて、コードＡの境界部分が示す画素値を含む所定の画素値範囲でオーバーラップするようにそれぞれコード化されたコードＢまたはコードＣと、第２のフレームの注目画素を含む所定の画素群の量子化コードからなる、第２のフレームの注目画素の特徴量コードＢおよび特徴量コードＣをそれぞれ抽出し、抽出された特徴量コードＢおよび特徴量コードＣごとに、対応する画素位置の情報をデータベースに記憶する処理を制御し、データベースに記憶されている画素位置の情報のうち、抽出された第１のフレームの注目画素の特徴量コードＡと値が一致する特徴量コードＢおよび特徴量コードＣの画素位置の情報を検索するステップを含む処理をコンピュータに実行させる。
【００３８】
本発明の学習装置は、入力される第１のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第１の画素位置の画素値を示す８ビットのデータのうち、上位ｍビットを用いてコード化されたコードＡ、および第１のフレームの注目画素を含む所定の画素群の量子化コードからなる、第１のフレームの注目画素の特徴量コードＡと、特徴量コードＡに対応するものが検索される特徴量であって、第１のフレームより時間的に前のフレームである第２のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第２の画素位置の画素値を示す８ビットのデータのうち、上位ｍ−１ビットを用いて、コードＡの境界部分が示す画素値を含む所定の画素値範囲でオーバーラップするようにそれぞれコード化されたコードＢまたはコードＣと、第２のフレームの注目画素を含む所定の画素群の量子化コードからなる、第２のフレームの注目画素の特徴量である特徴量コードＢおよび特徴量コードＣとを抽出するのに用いられる、第１の学習フレームの注目画素を含む所定の画素群における画素位置の情報を学習する学習装置において、第１の学習フレームにおいて、注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して、ｎビットの量子化コードを生成する量子化コード生成手段と、第１の学習フレームと、第１の学習フレームより時間的に前のフレームである第２の学習フレームとにおいて対応する画素位置の画素のレベルの変動値を検出する検出手段と、量子化コード生成手段により生成された量子化コードに対応付けて、検出手段により検出された画素群のそれぞれの画素位置における画素のレベルの変動値を蓄積する蓄積手段と、蓄積手段により蓄積された情報を基に、量子化コードと、画素値のレベル変動が最も少ない画素位置を対応付けた画素位置情報を生成する画素位置情報生成手段とを備える。
【００３９】
本発明の学習方法は、入力される第１のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第１の画素位置の画素値を示す８ビットのデータのうち、上位ｍビットを用いてコード化されたコードＡ、および第１のフレームの注目画素を含む所定の画素群の量子化コードからなる、第１のフレームの注目画素の特徴量コードＡと、特徴量コードＡに対応するものが検索される特徴量であって、第１のフレームより時間的に前のフレームである第２のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第２の画素位置の画素値を示す８ビットのデータのうち、上位ｍ−１ビットを用いて、コードＡの境界部分が示す画素値を含む所定の画素値範囲でオーバーラップするようにそれぞれコード化されたコードＢまたはコードＣと、第２のフレームの注目画素を含む所定の画素群の量子化コードからなる、第２のフレームの注目画素の特徴量である特徴量コードＢおよび特徴量コードＣとを抽出するのに用いられる、第１の学習フレームの注目画素を含む所定の画素群における画素位置の情報を学習する学習装置が、第１の学習フレームにおいて、注目画素を含む所定の画素群のＡＤＲＣに基づくｎビットの量子化コードを生成し、第１の学習フレームと、第１の学習フレームより時間的に前のフレームである第２の学習フレームとにおいて対応する画素位置の画素のレベルの変動値を検出し、生成された量子化コードに対応付けて、検出された画素群のそれぞれの画素位置における画素のレベルの変動値を蓄積し、蓄積された情報を基に、量子化コードと、画素値のレベル変動が最も少ない画素位置を対応付けた画素位置情報を生成するステップを含む。
【００４０】
本発明の第２の記録媒体に記録されているプログラムは、入力される第１のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第１の画素位置の画素値を示す８ビットのデータのうち、上位ｍビットを用いてコード化されたコードＡ、および第１のフレームの注目画素を含む所定の画素群の量子化コードからなる、第１のフレームの注目画素の特徴量コードＡと、特徴量コードＡに対応するものが検索される特徴量であって、第１のフレームより時間的に前のフレームである第２のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第２の画素位置の画素値を示す８ビットのデータのうち、上位ｍ−１ビットを用いて、コードＡの境界部分が示す画素値を含む所定の画素値範囲でオーバーラップするようにそれぞれコード化されたコードＢまたはコードＣと、第２のフレームの注目画素を含む所定の画素群の量子化コードからなる、第２のフレームの注目画素の特徴量である特徴量コードＢおよび特徴量コードＣとを抽出するのに用いられる、第１の学習フレームの注目画素を含む所定の画素群における画素位置の情報を学習する学習装置に、第１の学習フレームにおいて、注目画素を含む所定の画素群のＡＤＲＣに基づくｎビットの量子化コードを生成し、第１の学習フレームと、第１の学習フレームより時間的に前のフレームである第２の学習フレームとにおいて対応する画素位置の画素のレベルの変動値を検出し、生成された量子化コードに対応付けて、検出された画素群のそれぞれの画素位置における画素のレベルの変動値を蓄積し、蓄積された情報を基に、量子化コードと、画素値のレベル変動が最も少ない画素位置を対応付けた画素位置情報を生成するステップを含む処理を実行させる。
【００４１】
本発明の第２のプログラムは、入力される第１のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第１の画素位置の画素値を示す８ビットのデータのうち、上位ｍビットを用いてコード化されたコードＡ、および第１のフレームの注目画素を含む所定の画素群の量子化コードからなる、第１のフレームの注目画素の特徴量コードＡと、特徴量コードＡに対応するものが検索される特徴量であって、第１のフレームより時間的に前のフレームである第２のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第２の画素位置の画素値を示す８ビットのデータのうち、上位ｍ−１ビットを用いて、コードＡの境界部分が示す画素値を含む所定の画素値範囲でオーバーラップするようにそれぞれコード化されたコードＢまたはコードＣと、第２のフレームの注目画素を含む所定の画素群の量子化コードからなる、第２のフレームの注目画素の特徴量である特徴量コードＢおよび特徴量コードＣとを抽出するのに用いられる、第１の学習フレームの注目画素を含む所定の画素群における画素位置の情報を学習する学習装置に、第１の学習フレームにおいて、注目画素を含む所定の画素群のＡＤＲＣに基づくｎビットの量子化コードを生成し、第１の学習フレームと、第１の学習フレームより時間的に前のフレームである第２の学習フレームとにおいて対応する画素位置の画素のレベルの変動値を検出し、生成された量子化コードに対応付けて、検出された画素群のそれぞれの画素位置における画素のレベルの変動値を蓄積し、蓄積された情報を基に、量子化コードと、画素値のレベル変動が最も少ない画素位置を対応付けた画素位置情報を生成するステップを含む処理を実行させる。
【００４２】
本発明の画像処理装置および方法、並びに第１のプログラムにおいては、第１の学習フレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードに対応付けて、所定の画素群のうち、第１の学習フレームと第１の学習フレームより時間的に前のフレームである第２の学習フレームとで検出される画素値のレベル変動が最も少ない画素位置の情報を記憶する画素位置テーブルから、入力される第１のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第１の画素位置が読み出され、画素位置テーブルから読み出されたレベル変動が最も少ない第１の画素位置の画素値を示す８ビットのデータのうち、上位ｍビットを用いてコード化されたコードＡと、第１のフレームの注目画素を含む所定の画素群の量子化コードからなる、第１のフレームの注目画素の特徴量コードＡが抽出される。また、第１のフレームより時間的に前のフレームである第２のフレームの注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して生成されたｎビットの量子化コードと値が一致する第１の学習フレームの量子化コードに対応付けられたレベル変動が最も少ない第２の画素位置が画素位置テーブルから読み出され、画素位置テーブルから読み出されたレベル変動が最も少ない第２の画素位置の画素値を示す８ビットのデータのうち、上位ｍ−１ビットを用いて、コードＡの境界部分が示す画素値を含む所定の画素値範囲でオーバーラップするようにそれぞれコード化されたコードＢまたはコードＣと、第２のフレームの注目画素を含む所定の画素群の量子化コードからなる、第２のフレームの注目画素の特徴量コードＢおよび特徴量コードＣがそれぞれ抽出される。そして、抽出された特徴量コードＢおよび特徴量コードＣごとに、対応する画素位置の情報がデータベースに記憶され、データベースに記憶されている画素位置の情報のうち、第１のフレームの注目画素の特徴量コードＡと値が一致する特徴量コードＢおよび特徴量コードＣの画素位置の情報が検索される。
【００４３】
本発明の学習装置および方法、並びに第２のプログラムにおいては、第１の学習フレームにおいて、注目画素を含む所定の画素群の各画素値をＡＤＲＣで量子化して、ｎビットの量子化コードが生成され、第１の学習フレームと、第１の学習フレームより時間的に前のフレームである第２の学習フレームとにおいて対応する画素位置の画素のレベルの変動値が検出され、量子化コードに対応付けて、画素群のそれぞれの画素位置における画素のレベルの変動値が蓄積され、蓄積された情報を基に、量子化コードと、画素値のレベル変動が最も少ない画素位置を対応付けた画素位置情報が生成される。
【００４４】
【発明の実施の形態】
以下、図を参照して、本発明の実施の形態について説明する。
【００４５】
図４は、本発明を適用した画像処理装置の動き検出部５１の構成を示すブロック図である。
【００４６】
動き検出部５１は、フレームメモリ６１および６２、特徴量抽出部６３および６４、データベース制御部６５、データベース検索部６６、および、動きベクトル決定部６７で構成されている。
【００４７】
第１のフレームメモリであるフレームメモリ６１は、入力端子Ｔｉｎから入力された画像信号の１画面の情報を格納し、フレームメモリ６２および特徴量抽出部６３に供給する。入力端子Ｔｉｎから入力される画像信号は、各画素の画素値が、所定のビット数に量子化された情報であり、例えば、ＰＣＭ符号化などにより、符号化されていてもかまわない。ここでは、画素値が、８ビットに量子化されているものとして説明する。
【００４８】
特徴量抽出部６３は、フレームメモリから供給された画面情報、すなわちカレントフレームＦｃの情報を基に、注目画素の特徴量を抽出する。特徴量には、例えば、注目画素の画素値、注目画素近傍の所定の画素の画素値、注目画素を中心とした所定の画素範囲のブロックの複数の画素値またはこれらの画素値の平均値、注目画素を中心とした所定の画素範囲のブロックの画素値を基に算出されるＡＤＲＣ（Adaptive Dynamic Range Coding）のコード、ブロック内の画素値分布のダイナミックレンジ、あるいは、ブロック内の画素値分布の最小値などがある。
【００４９】
ここでは、特徴量抽出部６３は、注目画素近傍の所定の画素の画素値、および、ＡＤＲＣコードを、特徴量として抽出するものとする。
【００５０】
図５は、特徴量抽出部６３の構成を示すブロック図である。
【００５１】
クラスタップ抽出部９１は、入力された画像情報のうち、特徴量の抽出に必要な注目画素に対応する周辺画素（クラスタップ）の情報（画素値）を抽出し、ＤＲ（ダイナミックレンジ）演算部９２、および、ＡＤＲＣ（Adaptive Dynamic Range Coding）コード生成部９３に出力する。また、クラスタップ抽出部９１は、クラスタップのパターンを記憶したタップテーブル９１ａを有しており、タップテーブル９１ａに記憶されているクラスタップのパターンに基づいて、抽出するクラスタップのパターンを決定する。
【００５２】
すなわち、タップテーブル９１ａは、クラスタップのパターンとして、図６で示すように、注目画素（図中黒丸で表示されている画素）を中心とした３画素×３画素からなるブロックの最外周位置に配置される８画素（図中斜線で塗りつぶされている画素）のパターン、または、注目画素を含む３画素×３画素からなるブロックの９画素のパターン、図７で示すように、注目画素（図中黒丸で表示されている画素）を中心とした５画素×５画素からなるブロックの最外周位置に配置される１６画素（図中斜線で塗りつぶされている画素）のパターン、または、注目画素を含む５画素×５画素からなるブロックの２５画素のパターン、図８で示すように、注目画素（図中黒丸で表示されている画素）を中心とした７画素×７画素からなるブロックの最外周位置に配置される２４画素（図中斜線で塗りつぶされている画素）のパターン、更に、図示しないが、注目画素（図中黒丸で表示されている画素）を中心としたｎ画素×ｎ画素からなるブロックの最外周位置に配置される４（ｎ−１）画素パターンなどを記憶している。
【００５３】
また、クラスタップのパターンは、図９で示すように、注目画素の上下左右の１画素からなる４画素であってもよいし、図１０で示すように、注目画素の上下左右の２画素からなる８画素であってもよいし、図１１で示すように、注目画素の上下左右の３画素からなる１２画素であってもよいし、更には、図示しないが、注目画素の上下左右のｍ画素からなる４ｍ画素であってもよい。更に、クラスタップは、図示した以外の構成でもよく、例えば、注目画素との位置関係が非対称となる配置のものであってもよい。
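The tap counts quoted for FIGS. 6 to 11 follow from simple offset enumeration. The sketch below is only an illustration; `ring_taps` and `cross_taps` are hypothetical helper names, and offsets are expressed as (dy, dx) relative to the pixel of interest.

```python
def ring_taps(n):
    """Offsets on the outermost ring of an n x n block (n odd): 4(n - 1) taps."""
    h = n // 2
    return [(dy, dx)
            for dy in range(-h, h + 1)
            for dx in range(-h, h + 1)
            if max(abs(dy), abs(dx)) == h]


def cross_taps(m):
    """Offsets of the m pixels above, below, left and right: 4m taps."""
    taps = []
    for k in range(1, m + 1):
        taps += [(-k, 0), (k, 0), (0, -k), (0, k)]
    return taps
```

For example, `len(ring_taps(7))` gives the 24 taps of FIG. 8 and `len(cross_taps(3))` gives the 12 taps of FIG. 11.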
【００５４】
ここでは、クラスタップ抽出部９１は、クラスタップのパターンとして、図６で示すように、注目画素を含む３画素×３画素からなるブロックの９画素を抽出するものとする。
【００５５】
ＤＲ（ダイナミックレンジ）演算部９２は、クラスタップ抽出部９１より入力されるクラスタップの情報（画素値）からダイナミックレンジを求め、ダイナミックレンジをＡＤＲＣコード生成部９３に出力するとともに、ダイナミックレンジを求める際に得られる最小値の情報をＡＤＲＣコード生成部９３に出力する。すなわち、クラスタップのパターンが、例えば、図９で示すものであった場合、各クラスタップＣ１乃至Ｃ４の情報（画素値レベル）が、（Ｃ１，Ｃ２，Ｃ３，Ｃ４）＝（６０，９０，５１，１００）であるとき、その関係は、図１２で示すようになる。このような場合、ダイナミックレンジは、画素値レベルの最小値と最大値の差として定義され、その値は、以下の式（２）で定義される。
【００５６】
ＤＲ＝Max−Min＋１・・・（２）
【００５７】
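Here, Max is the maximum pixel value level among the class taps, and Min is the minimum pixel value level among the class taps. The 1 is added in order to define classes (for example, when classes denoted 0 and 1 are set, their difference is 1 but there are two classes, so 1 is added to the difference). Accordingly, in the case of FIG. 12, the pixel value level 100 of class tap C4 is the maximum and the pixel value level 51 of class tap C3 is the minimum, so DR is 50 (= 100 − 51 + 1).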
【００５８】
このように、ＤＲ演算部９２は、ダイナミックレンジを演算するにあたり、クラスタップの画素値レベルのうちの最小値と最大値を検出することになるので、その最小値（または最大値）をＡＤＲＣコード生成部９３に出力する。
【００５９】
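The ADRC code generation unit 93 generates and outputs a quantization code consisting of an ADRC code, based on the pixel value level of each class tap, from the dynamic range value and the minimum value Min supplied by the DR calculation unit 92. More specifically, the ADRC code is obtained by substituting the pixel value level of each class tap into equation (3) below.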
【００６０】
Ｑ＝Round（（Ｌ−Min＋０．５）×（２＾ｎ）／ＤＲ）・・・（３）
【００６１】
ここで、Roundは切捨てを、Ｌは画素値レベルを、ｎは割当てビット数を、（２＾ｎ）は２のｎ乗を、それぞれ示している。
【００６２】
従って、例えば、割当てビット数ｎが１であった場合、各クラスタップの画素値レベルは、以下の式（４）で示される閾値ｔｈ以上であれば１であり、閾値ｔｈより小さければ０とされる。
【００６３】
ｔｈ＝ＤＲ／２−０．５＋Min・・・（４）
【００６４】
結果として、割当てビット数ｎが１である場合、図１２で示すようなクラスタップが得られたとき、閾値ｔｈは、７５．５（＝５０／２−０．５＋５１）となるので、ＡＤＲＣコードは、ＡＤＲＣ（Ｃ１，Ｃ２，Ｃ３，Ｃ４）＝０１０１となる。
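Equations (2) to (4) can be checked numerically with the tap values given for FIG. 12. The sketch below assumes `math.floor` for the truncating Round of equation (3); the function names are chosen for illustration only.

```python
import math


def dynamic_range(taps):
    """Equation (2): DR = Max - Min + 1."""
    return max(taps) - min(taps) + 1


def adrc_code(taps, n=1):
    """Equation (3): Q = Round((L - Min + 0.5) * 2^n / DR), Round = truncation."""
    lo = min(taps)
    dr = dynamic_range(taps)
    return [math.floor((level - lo + 0.5) * (2 ** n) / dr) for level in taps]


def adrc_threshold(taps):
    """Equation (4): th = DR / 2 - 0.5 + Min (the 1-bit case)."""
    return dynamic_range(taps) / 2 - 0.5 + min(taps)


taps = [60, 90, 51, 100]                      # (C1, C2, C3, C4) from FIG. 12
print(dynamic_range(taps))                    # 50
print(adrc_threshold(taps))                   # 75.5
print("".join(map(str, adrc_code(taps))))     # 0101
```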
【００６５】
すなわち、クラスタップ抽出部９１が、クラスタップのパターンとして、図６で示した注目画素を中心とした３画素×３画素を抽出した場合、ＡＤＲＣコード生成部９３においては、９ビットのＡＤＲＣコードが生成される。
【００６６】
ＡＤＲＣコード生成部９３は、生成したＡＤＲＣコードを、特徴量生成部９６に供給する。
【００６７】
画素コード生成部９４は、画素位置テーブル９５に記憶されている画素位置テーブルを参照して、ノイズなどによるレベル変動が少ない画素位置の画素値を基に、画素コードを生成する。
【００６８】
参照フレームＦｒとカレントフレームFｃにおいて対応する画素は、本来、同一の特徴を有しているはずであるが、例えば、ノイズなどの影響により、その特徴が多少変化することがある。参照フレームとカレントフレームにおいて対応するべき画素の画素値の差分レベルと、クラスタップ中の画素位置との関係は、ＡＤＲＣコードのパターンによって、規則性を有する。換言すれば、ＡＤＲＣコードに対応して、画素値の差分レベルが小さい（あるいは、大きい）画素のタップ位置（画素位置）が決まっている。
【００６９】
例えば、クラスタップのパターンとして、図６で示した注目画素を中心とした３画素×３画素を用い、複数の（例えば、２５パターン程度の）動画の参照フレームおよびカレントフレームにおいて、対応するべき画素の画素値の差分レベルを調べた結果、対応するべき画素の画素値の差分レベルが、図１３に示されるように、第１のＡＤＲＣコードパターンにおいては、実線で示されるような規則性を有し、第２のＡＤＲＣコードパターンにおいては、点線で示されるような規則性を有するものとする。このとき、第１のＡＤＲＣコードパターンにおいては、タップ位置１で示される画素位置において、対応するべき画素の画素値の差分レベルの変動が少ないといえる。また、第２のＡＤＲＣコードパターンにおいては、タップ位置３で示される画素位置において、対応するべき画素の画素値の差分レベルの変動が少ないといえる。
【００７０】
従って、ＡＤＲＣコードごとに最も差分レベルの少ない画素位置を予め学習して、記憶しておき、特徴量抽出時に、ＡＤＲＣコードを参照して、クラスタップ中の最も画素値の差分レベルの変動が少ない画素位置の画素値を基に、特徴量を抽出するようにすることにより、動きベクトル検出において、ノイズによるレベル変動の影響を低減することが可能となる。
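The learning idea described above can be sketched as an accumulate-then-argmin pass. This is a rough illustration under strong assumptions: the input is taken to be pairs of already corresponding tap sets from two learning frames, the level variation is measured as an absolute difference, and all names (`adrc_1bit`, `learn_position_table`, `tap_pairs`) are hypothetical.

```python
def adrc_1bit(taps):
    """1-bit ADRC code of a tap set, using the threshold of equation (4)."""
    lo, hi = min(taps), max(taps)
    th = (hi - lo + 1) / 2 - 0.5 + lo
    return tuple(1 if v >= th else 0 for v in taps)


def learn_position_table(tap_pairs):
    """Map each ADRC code to the tap index with the least accumulated variation.

    tap_pairs: iterable of (taps_first, taps_second), where the two tuples hold
    the pixel values of corresponding taps in the two learning frames.
    """
    totals = {}
    for first, second in tap_pairs:
        code = adrc_1bit(first)
        acc = totals.setdefault(code, [0] * len(first))
        for i, (a, b) in enumerate(zip(first, second)):
            acc[i] += abs(a - b)          # accumulate level variation per tap
    return {code: acc.index(min(acc)) for code, acc in totals.items()}
```

The resulting dictionary plays the role of the pixel position table: at extraction time, the ADRC code of a tap set selects which tap supplies the pixel value used for the pixel code.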
【００７１】
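When the class tap extraction unit 91 extracts, as the class tap pattern, the 3 × 3 pixels centered on the pixel of interest shown in FIG. 6, the pixel code generation unit 94 receives those 3 × 3 pixels together with the ADRC code generated by the ADRC code generation unit 93. As shown in FIG. 14, the pixel position table 95 stores, in association with each ADRC code, the pixel position found by learning to have the lowest difference level. The pixel code generation unit 94 refers to the pixel position table 95, selects the pixel position expected to have the lowest difference level for the given ADRC code, and calculates a pixel code (code A) from the pixel value at that position.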
【００７２】
画素コード生成部９４は、選択された画素の画素値として、８ビットのデータを取得し、そのうち、上位４ビットを用いて、コード化を実行する。すなわち、８ビットで示される画素値０乃至２⁸−１が、２⁴種類のコードで示される。具体的には、特徴量抽出部６３において、次の式（５）により、コードＡ（ｃｏｄｅＡ）が算出される。
【００７３】
ｃｏｄｅＡ＝［ｐ（ｘ、ｙ）／１６］・・・（５）
【００７４】
ただし、ｐ（ｘ、ｙ）は、注目画素の画素値であり、［］の括弧内の計算値は、小数点以下を切り捨てるものとする。画素コード生成部９４は、算出されたコードＡを、対応する画素の位置（例えば、座標情報）とともに、特徴量生成部９６に供給する。
【００７５】
特徴量生成部９６は、画素コード生成部９４から供給された４ビットの画素コードであるコードＡ、および、ＡＤＲＣコード生成部９３から供給された９ビットのＡＤＲＣコードを用いて、１３ビットの特徴量コードである特徴量コードＡを生成し、対応する画素の位置（例えば、座標情報）とともに、データベース検索部６６に供給する。
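A 13-bit feature code can be assembled from the 4-bit code A of equation (5) and the 9-bit ADRC code as sketched below. The bit order (pixel code in the upper 4 bits, ADRC code in the lower 9 bits) is an assumption made for illustration; the description only states that the two codes together form a 13-bit feature code.

```python
def code_a(pixel_value):
    """Equation (5): codeA = [p / 16], the upper 4 bits of an 8-bit value."""
    return pixel_value // 16


def feature_code(pixel_code, adrc_bits):
    """Concatenate a 4-bit pixel code and a 9-bit ADRC code into 13 bits.

    Assumption: the pixel code occupies the upper bits.
    """
    adrc = 0
    for bit in adrc_bits:
        adrc = (adrc << 1) | bit
    return (pixel_code << 9) | adrc
```

With this layout the feature code ranges from 0 to 8191, so it can be used directly as an index into a table with 2^13 entries.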
【００７６】
なお、画素位置テーブル９５に記憶されている画素位置テーブルの生成については、後述する。
【００７７】
図４に戻り、本発明を適用した画像処理装置の動き検出部５１について説明する。
【００７８】
第２のフレームメモリであるフレームメモリ６２は、フレームメモリ６１に格納されていた、以前（たとえば１画面前）の画面情報の入力を受け、参照フレームＦｒの情報として格納し、順次、特徴量抽出部６４に供給する。
【００７９】
参照フレームＦｒとカレントフレームFｃにおいて対応する画素は、本来、同一の特徴を有しているはずであるが、例えば、ノイズなどの影響により、その特徴が多少変化することがある。参照フレームＦｒとカレントフレームFｃにおいて、いずれも、式（５）を用いて説明したコードＡの演算結果を利用して特徴量を算出するようになされている場合、例えば、参照フレームＦｒにおいて、画素値が１５であった画素が、カレントフレームFｃにおいて、画素値１６となってしまったとき、コードＡの算出結果は、参照フレームＦｒではコードＡ＝０、カレントフレームFｃではコードＡ＝１となる。このような場合、同一の特徴量算出結果を得ることができない。また、同様にして、画素値が、ＡＤＲＣコード算出時の閾値ｔｈに近い値である場合、ノイズなどの影響により、ビット反転を起こしてしまうことがある。このような場合、動きベクトルを正しく検出することができなくなってしまう。
【００８０】
そこで、特徴量抽出部６４は、フレームメモリ６２から供給された参照フレームＦｒの情報を基に、特徴量抽出部６３とは異なる規準を用いて、注目画素の特徴量を抽出する。
【００８１】
図１５は、特徴量抽出部６４の構成を示すブロック図である。
【００８２】
なお、図５における場合と対応する部分には同一の符号を付してあり、その説明は適宜省略する。
【００８３】
すなわち、図１５の特徴量抽出部６４は、画素コード生成部９４に代わって画素コード生成部１０２が設けられ、ＡＤＲＣコード生成部９３により生成されたＡＤＲＣコードを変換するコード変換部１０１が更に備えられている以外は、図５の特徴量抽出部６３と基本的に同様の構成を有するものである。
【００８４】
参照フレームＦｒとカレントフレームＦｃの対応する画素の特徴量は、同一であるはずであるが、例えば、ノイズなどの影響により、特徴量が変化してしまうことが考えられる。特徴量として、ＡＤＲＣコードが用いられる場合、例えば、画素値が閾値ｔｈに近い値を有しているいずれかのビットが、特徴量の変化によって反転してしまう恐れがある。
【００８５】
コード変換部１０１は、供給されたＡＤＲＣコードのうち、ビット反転が起こる可能性が高い所定の画素位置に対応するコードをビット反転したコードを生成し、入力されたＡＤＲＣコード（ビット反転前のコード）とともに、特徴量生成部９６に供給する。
【００８６】
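The pixel code generation unit 102 then generates pixel codes, using the ADRC code generated by the ADRC code generation unit 93 and the ADRC code converted by the code conversion unit 101, by a method different from that of the pixel code generation unit 94 in FIG. 5. The pixel code generation unit 102 refers to the pixel position table 95, selects the pixel position at which the difference level between corresponding pixels of the current frame and the reference frame is expected to be lowest for the ADRC code or the converted ADRC code, and calculates pixel codes (code B and code C) from the pixel value at that position by a method different from the one used to obtain code A.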
【００８７】
画素コード生成部１０２は、画素位置テーブル９５を参照して選択された画素位置の画素の画素値として、８ビットのデータを取得し、そのうち、上位３ビットを用いて、コード化を実行する。すなわち、８ビットで示される画素値０乃至２⁸−１が、２³種類のコードで示されるのであるが、ここでは、２種類のコードを算出する。従って、算出されるコードの種類の総数は、２³×２種類となり、特徴量抽出部６３において算出されるコードの種類と同数となる。具体的には、次の式（６）および式（７）により、コードＢ（ｃｏｄｅＢ）およびコードＣ（ｃｏｄｅＣ）が算出される。
【００８８】
ｃｏｄｅＢ＝［｛ｐ（ｘ、ｙ）＋８｝／３２］×２・・・（６）
ｃｏｄｅＣ＝［｛ｐ（ｘ、ｙ）−８｝／３２］×２＋１・・・（７）
【００８９】
ここでも、［］の括弧内の計算値は、小数点以下切り捨てるものとする。特徴量抽出部６３において算出されるコードＡと、特徴量抽出部６４において算出されるコードＢおよびコードＣとの関係を図１６に示す。
【００９０】
コードＡが、上位４ビットの値を用いてコード化されているのに対し、コードＢおよびコードＣは、上位３ビットの値を用いてコード化されているため、同一コードに含まれる画素範囲は、コードＡにおいては１６ステップであり、コードＢおよびコードＣにおいては、コードＡにおける場合の２倍の３２ステップとなる。そして、コードＢのコードの境界は、コードＡの境界と比較して、マイナス側に３ビットシフトされ、コードＣのコードの境界は、コードＡの境界と比較して、プラス側に３ビットシフトされている。
【００９１】
ノイズなどの影響により、参照フレームＦｒとカレントフレームFｃとにおいて画素の特徴量が変化した場合、最も誤検知を起こしやすいのは、コードの境界部分である。従って、この部分を誤検知しないように、参照フレームＦｒのコードＢおよびコードＣが、それぞれ、コードＡの境界部分を含んで、同じ画素値範囲でオーバーラップするように設定される。
【００９２】
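As shown in FIG. 16, code B and code C generated by equations (6) and (7) each cover the pixel values at a boundary between code A = X (X being 0 to 14) and code A = X + 1. At the pixel values on either side of such a boundary, one of code B and code C equals X and the other equals X + 1: when X is even, code B = X and code C = X + 1; when X is odd, code B = X + 1 and code C = X.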
【００９３】
具体的には、例えば、注目画素の画素値が６３である場合、カレントフレームＦｃにおいて、コードＡ＝３となり、参照フレームＦｒにおいて、コードＢ＝４、かつ、コードＣ＝３となる。また、注目画素の画素値が６４である場合、カレントフレームＦｃにおいて、コードＡ＝４となり、参照フレームＦｒにおいて、コードＢ＝４、かつ、コードＣ＝３となる。
【００９４】
従って、参照フレームＦｒとカレントフレームFｃとにおいて画素の特徴量が６３と６４で変化してしまった場合においても、コードＢかコードＣのいずれかにおいて、コードＡと同一のコードを得ることが可能となる。上述した式（６）および式（７）によってコードＢおよびコードＣが生成された場合、コードＢまたはコードＣは、コードＡの対応する境界を挟んで、３ビット（画素値で８ステップ）ずつずれているので、参照フレームＦｒとカレントフレームFｃとにおいての画素の特徴量の変動が３ビット（画素値で８ステップ）以内であれば、コードＢかコードＣのいずれかにおいて、正しい検出結果を得ることが可能となる。
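The overlap between code A and codes B and C can be verified exhaustively for 8-bit pixel values. The sketch below implements equations (5) to (7) directly; truncation toward zero for the bracket in equation (7) when p − 8 is negative is an assumption, as is the helper name `matches`.

```python
def code_a(p):
    return p // 16                        # equation (5)


def code_b(p):
    return ((p + 8) // 32) * 2            # equation (6)


def code_c(p):
    # Equation (7); int() truncates toward zero, which matters only for p < 8.
    return int((p - 8) / 32) * 2 + 1


def matches(p_current, p_reference):
    """True if code A of the current pixel equals code B or C of the reference."""
    return code_a(p_current) in (code_b(p_reference), code_c(p_reference))


print(code_a(63), code_b(63), code_c(63))   # 3 4 3
print(code_a(64), code_b(64), code_c(64))   # 4 4 3
print(matches(64, 63), matches(63, 64))     # True True
```

Under these definitions, any pair of values that differ by at most 8 levels still matches through either code B or code C. For reference values of 248 and above, equation (6) evaluates to 16, which no code A can equal, so those pixels match through code C only.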
【００９５】
ここでは、コードＡを検出するために用いる情報のビット数を上位４ビットとし、コードＢおよびコードＣを検出するために用いる情報のビット数を上位３ビットとして説明したが、特徴量としてのコードの検出方法はこの限りではない。例えば、コードＡを検出するために用いる情報のビット数を上位５ビットとし、コードＢおよびコードＣを検出するために用いる情報のビット数を上位４ビットとしてもよい。
【００９６】
画素コード生成部１０２は、算出されたコードＢおよびコードＣを、特徴量生成部９６に供給する。特徴量生成部９６は、画素コード生成部１０２から供給された４ビットの画素コードであるコードＢ、および、コード変換部１０１から供給された９ビットのＡＤＲＣコードを用いて、１３ビットの特徴量コードである特徴量コードＢを生成し、また、画素コード生成部１０２から供給された４ビットの画素コードであるコードＣ、および、コード変換部１０１から供給された９ビットのＡＤＲＣコードを用いて、１３ビットの特徴量コードである特徴量コードＣを生成して、それぞれ、データベース制御部６５に供給する。
【００９７】
再び、図４に戻り、本発明を適用した画像処理装置の動き検出部５１について説明する。
【００９８】
データベース制御部６５は、特徴量抽出部６４から供給された、参照フレームＦｒの特徴量情報を基に、特徴量をアドレスとして、画素の位置情報をデータベース７１に格納することにより、参照フレーム情報を生成する。データベース制御部６５は、内部に、処理済の画素数をカウントするためのカウンタを有している。
【００９９】
データベース７１に格納される参照フレーム情報の構成を図１７に示す。
【０１００】
データベース７１は、特徴量アドレス０乃至ａと、フラグアドレス０乃至ｂによって示される（ａ＋１）×（ｂ＋１）個のセルにより構成されている。
【０１０１】
データベース制御部６５は、参照フレームＦｒの画素の特徴量である特徴量コードＢ、および、特徴量コードＣを、特徴量アドレスに対応つけて、その画素の位置情報を、データベース７１の、特徴量アドレスに対応するフラグアドレス１乃至ｂに順次格納する。そして、フラグアドレス０には、現在、その特徴量アドレスに格納されている位置情報の数が、順次、インクリメントされて格納される。具体的には、特徴量アドレス１に、１つの位置情報が格納されている場合、セル（１，０）には、格納されている位置情報の数として、１が格納される。そして、次の注目画素の特徴量が、特徴量アドレス１に対応するものであった場合、セル（１，０）に格納されている値は、インクリメントされて２となり、注目画素の位置情報が、セル（１，２）に格納される。
【０１０２】
そして、参照フレームＦｒの注目画素１つに付き、特徴量コードＢおよび特徴量コードＣに対応する２つの画素位置情報が、特徴量アドレスに対応するフラグアドレス１乃至ｂのいずれかに格納される。１フレーム分の格納処理が終了したとき、データベース７１のセル（０，１）からセル（ａ，ｂ）には、１フレームの画素数の２倍の画素位置情報が格納される。
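The layout of FIG. 17 can be modelled as one row per feature address, with the count at flag address 0 and the positions at flag addresses 1 to K. The class below is a simplified stand-in for the database 71, not its actual structure; the class and method names are hypothetical, and rows grow dynamically instead of having a fixed width b.

```python
class ReferenceDatabase:
    def __init__(self, num_feature_addresses):
        # rows[f][0] is the count (flag address 0); positions follow at 1..K.
        self.rows = [[0] for _ in range(num_feature_addresses)]

    def _store(self, feature_address, position):
        row = self.rows[feature_address]
        row[0] += 1               # increment the count at flag address 0
        row.append(position)      # lands at flag address equal to the new count

    def register_pixel(self, feature_b, feature_c, position):
        """Store one reference pixel under both of its feature codes."""
        self._store(feature_b, position)
        self._store(feature_c, position)

    def candidates(self, feature_address):
        """Positions stored at flag addresses 1..K of the given feature address."""
        row = self.rows[feature_address]
        return row[1:row[0] + 1]
```

Because every pixel is registered twice, the sum of all counts after one frame equals twice the number of pixels in the frame.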
【０１０３】
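The database search unit 66 performs matching between feature code A, which is the feature information of the current frame Fc supplied from the feature extraction unit 63, and the reference frame information stored in the database 71 of the database control unit 65. That is, the database search unit 66 receives from the feature extraction unit 63 feature code A, the feature of the pixel of interest in the current frame Fc, refers to the reference frame information in the database 71, detects the pixel position information of the multiple reference pixel candidates recorded at the feature address that matches the feature of the pixel of interest in the current frame Fc, and supplies it to the motion vector determination unit 67 together with the pixel position information of the pixel of interest in the current frame Fc.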
【０１０４】
動きベクトル決定部６７は、カレントフレームＦｃの注目画素の画素位置情報と、複数の参照画素候補の画素位置情報との距離を演算し、例えば、算出された距離が最小である参照画素候補を、カレントフレームＦｃの注目画素に対応する参照画素であると決定し、その位置情報に基づいて、差分座標を検出して出力する。出力された差分座標は、注目画素の動きベクトル（Ｖｘ，Ｖｙ）である。
【０１０５】
ここでは、距離演算結果が最小のものを選択することにより、複数の参照画素候補のうち、カレントフレームＦｃの注目画素に対応する参照画素を決定するものとして説明したが、参照画素の決定方法は、それ以外の方法であってもかまわないことは言うまでもない。
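The minimum-distance selection described for the motion vector determination unit 67 can be sketched as follows. Squared Euclidean distance and first-candidate tie-breaking are assumptions; the description does not fix the distance measure. Positions are (x, y) tuples, and `motion_vector` is a hypothetical name.

```python
def motion_vector(current_pos, candidates):
    """Return (Vx, Vy) to the candidate nearest to current_pos, or None."""
    if not candidates:
        return None                       # no reference pixel shares the feature
    cx, cy = current_pos
    bx, by = min(candidates,
                 key=lambda q: (q[0] - cx) ** 2 + (q[1] - cy) ** 2)
    return bx - cx, by - cy               # difference coordinates
```

For instance, `motion_vector((5, 5), [(0, 0), (6, 7), (9, 9)])` selects (6, 7) and yields the vector (1, 2).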
【０１０６】
図１８を参照して、動き検出部５１が実行する動き検出処理１について説明する。
【０１０７】
ステップＳ３１において、フレームメモリ６１は、フレームの入力を受ける。
【０１０８】
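In step S32, the frame memory 61 supplies the frame input in step S31 to the frame memory 62, which is the second frame memory. Before the n-th frame is supplied from the frame memory 61 to the frame memory 62, for example, the frame memory 62 supplies the (n−1)-th frame it has been storing to the feature extraction unit 64.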
【０１０９】
ステップＳ３３において、図１９を用いて後述する参照フレーム情報生成処理が、特徴量抽出部６４において実行される。
【０１１０】
ステップＳ３４において、図２７を用いて後述するカレントフレーム情報生成処理が、特徴量抽出部６３において実行される。
【０１１１】
ステップＳ３５において、図３０を用いて後述するマッチング処理が実行されて、処理が終了される。
【０１１２】
次に、図１９のフローチャートを参照して、図１８のステップＳ３３において実行される、参照フレーム情報生成処理について説明する。
【０１１３】
ステップＳ５１において、データベース制御部６５は、データベース７１に登録されている参照フレーム情報を初期化する。すなわち、データベース制御部６５は、全ての特徴量アドレスに対応するフラグアドレス０のセルに０を書き込み、フラグアドレス１乃至ｂに格納されている位置情報を削除する。
【０１１４】
ステップＳ５２において、データベース制御部６５は、１フレームメモリ内の画素をカウントするカウンタのカウンタ変数ｎを０に初期化する。
【０１１５】
ステップＳ５３において、特徴量抽出部６４は、図２０を用いて後述する特徴量算出処理を実行する。
【０１１６】
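In step S54, the database control unit 65 receives the features of the class taps of the reference frame Fr extracted by the feature extraction unit 64, and reads from the database 71 the value K recorded at flag address 0 of the feature addresses corresponding to feature code B and feature code C. As described above, two features, feature code B and feature code C, are determined for each pixel of interest, so the values K for two feature addresses are read.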
【０１１７】
ステップＳ５５において、データベース制御部６５は、ステップＳ５４において読み出した値Ｋを、それぞれ、Ｋ＝Ｋ＋１とし、データベース７１の対応する特徴量アドレスの、フラグアドレス０に書き込む。
【０１１８】
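In step S56, the database control unit 65 writes the position information of the pixel of interest in the reference frame Fr, supplied from the feature extraction unit 64, into the cells of the two corresponding feature addresses in the database 71 at flag address K + 1, where K is the value read in step S54 (that is, at the flag address equal to the count written in step S55).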
【０１１９】
ステップＳ５７において、データベース制御部６５は、カウンタ変数ｎをインクリメントして、ｎ＝ｎ＋１とする。
【０１２０】
ステップＳ５８において、データベース制御部６５は、カウンタ変数ｎ＝１フレームの画素数であるか否かを判断する。ステップＳ５８において、カウンタ変数ｎ＝１フレームの画素数ではないと判断された場合、処理は、ステップＳ５３に戻り、それ以降の処理が繰り返される。ステップＳ５８において、カウンタ変数ｎ＝１フレームの画素数であると判断された場合、処理は、図１８のステップＳ３４に戻る。
【０１２１】
このような処理により、図１７を用いて説明したデータベース７１のフラグアドレス＝０のセルであるセル（０，０）乃至セル（ａ，０）に、対応する特徴量を有する参照フレームＦｒの画素位置の候補の数が格納され、データベース７１のフラグアドレス＝１乃至ｂのセルであるセル（０，１）乃至セル（ａ，ｂ）に、対応する特徴量を有する参照フレームＦｒの画素位置が格納される。換言すれば、参照フレームＦｒにおいて、同一の特徴量を有する可能性の高い画素の画素位置が、特徴量ごとにアドレシングされて、データベース７１に格納される。
【０１２２】
次に、図２０のフローチャートを参照して、図１９のステップＳ５３において実行される特徴量算出処理について説明する。
【０１２３】
ステップＳ７１において、図２１を用いて後述するＡＤＲＣコード生成処理１が実行される。
【０１２４】
ステップＳ７２において、図２６を用いて後述する画素コード生成処理１が実行される。
【０１２５】
ステップＳ７３において、特徴量抽出部６４の特徴量生成部９６は、画素コード生成部１０２から供給された４ビットの画素コードであるコードＢ、および、コード変換部１０１から供給された９ビットのＡＤＲＣコードを用いて、１３ビットの特徴量コードである特徴量コードＢを生成し、また、画素コード生成部１０２から供給された４ビットの画素コードであるコードＣ、および、コード変換部１０１から供給された９ビットのＡＤＲＣコードを用いて、１３ビットの特徴量コードである特徴量コードＣを生成して、処理は、図１９のステップＳ５４に戻る。
【０１２６】
次に、図２１のフローチャートを参照して、図２０のステップＳ７１で実行されるＡＤＲＣコード生成処理１について説明する。
【０１２７】
ステップＳ９１において、クラスタップ抽出部９１は、注目画素を中心とする所定サイズのクラスコード用タップを設定し、クラスコード用タップに含まれる複数の画素の画素値を取得する。以下においては、図２２に示すように、クラスコード用タップのサイズを３×３画素とし、左上の画素を先頭に右下の画素までの画素値をそれぞれＰ１乃至Ｐ９として説明を継続する。
【０１２８】
ＤＲ演算部９２は、ステップＳ９２において、画素値Ｐ１乃至Ｐ９の最大値Ｐ_MAXと最小値Ｐ_MINを判定し、ステップＳ９３において、画素値Ｐ１乃至Ｐ９のダイナミックレンジＤＲ（＝｜最大値Ｐ_MAX−最小値Ｐ_MIN＋１｜）を算出し、最小値Ｐ_MINとともに、ＡＤＲＣコード生成部９３に供給する。
【０１２９】
ステップＳ９４において、ＡＤＲＣコード生成部９３は、ＤＲ演算部９２から供給された画素値Ｐ１乃至Ｐ９の最小値Ｐ_MIN、および、ダイナミックレンジＤＲより、上述した式（４）を用いて、閾値ｔｈを決定する。閾値ｔｈは、コード変換部１０１に供給される。
【０１３０】
ステップＳ９５において、コード変換部１０１は、画素値Ｐ１乃至Ｐ９のうち、マスク化する画素を決定する。ここで、マスク化とは、ノイズなどによる画素値のゆれが発生した場合においても、正しく動きベクトルが検出できるように、ビット反転を行う処理のことである。マスク化する画素は、例えば、閾値Ｔｈ近傍の画素値を有する所定の個数（例えば、２個）の画素としたり、閾値Ｔｈから所定の範囲以内の画素値を有する画素として決定するようにしても良い。
【０１３１】
ステップＳ９６において、ＡＤＲＣコード生成部９３は、画素値Ｐ１乃至Ｐ９の９画素を、それぞれ閾値Ｔｈと比較し、閾値Ｔｈよりも大きい場合には１に量子化し、閾値Ｔｈよりも小さい場合には０に量子化して、番号順に並べた９ビットを注目画素のＡＤＲＣコードとして生成し、コード変換部１０１に供給する。
【０１３２】
In step S97, the code conversion unit 101 bit-inverts, within the supplied 9-bit code, the bits of the pixels to be masked, and the processing returns to step S72 in FIG. 20.
【０１３３】
例えば、閾値Ｔｈ近傍の２つの画素をマスク化する場合、クラスコード用タップに含まれる９画素の画素値Ｐ１乃至Ｐ９が、図２３に示すような状態であるとき、ステップＳ９６において、９ビットのＡＤＲＣコード１０１００１１１１が生成される。そして、ステップＳ９７において、閾値Ｔｈに最も近い画素値Ｐ６と画素値Ｐ８に対応するＡＤＲＣコードのそれぞれが単独で、あるいは、両方がビット反転されるので、９ビットのＡＤＲＣコード１０１０００１１１、１０１００１１０１、および、１０１０００１０１が生成される。
【０１３４】
また、例えば、クラスコード用タップに含まれる９画素の画素値Ｐ１乃至Ｐ９が図２４に示すような状態であるとき、ステップＳ９６において、９ビットのＡＤＲＣコード１０１００１１０１が生成される。そして、ステップＳ９７において、閾値Ｔｈに最も近い画素値Ｐ５と画素値Ｐ６に対応するＡＤＲＣコードのそれぞれが単独で、あるいは、両方のコードがビット反転されるので、９ビットのＡＤＲＣコード１０１０１１１０１、１０１０００１０１、および、１０１０１０１０１が生成される。
【０１３５】
また、閾値Ｔｈを中心とする所定の範囲（±Δ）に含まれる全ての画素値を有する画素に対して、マスク化を行う場合、例えば、クラスコード用タップに含まれる９画素の画素値Ｐ１乃至Ｐ９が図２５に示すような状態であるとき、ステップＳ９６において、１０１００１１１１が生成される。そして、ステップＳ９７において、閾値Ｔｈを中心とする所定の範囲（±Δ）に含まれる画素値Ｐ２と画素値Ｐ６に対応するＡＤＲＣコードのそれぞれが単独で、あるいは、両方のコードがビット反転されるので、９ビットのＡＤＲＣコード１１１００１１１１、１０１０００１１１、および、１１１０００１１１が生成される。
【０１３６】
このように、ノイズなどの影響により、レベルが変動した場合に、ＡＤＲＣコードがビット反転する可能性の高い画素のＡＤＲＣコードに対して、０に量子化した場合のＡＤＲＣコードと、１に量子化した場合のＡＤＲＣコードを生成するようにしたことにより、ノイズなどの影響により、マッチングが正しく取れないようなことを抑止することができる。従って、クラスコードのロバスト性を向上させることができる。
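The masked variants listed in the examples for FIGS. 23 to 25 can be reproduced by flipping every non-empty subset of the masked bit positions. The sketch below takes the ADRC code as a string and 1-based positions (P1 to P9); `masked_variants` is a hypothetical name, and how the masked positions are chosen is left outside the example.

```python
from itertools import combinations


def masked_variants(adrc_code, mask_positions):
    """Return the codes obtained by inverting each non-empty subset of masked bits."""
    variants = []
    for size in range(1, len(mask_positions) + 1):
        for subset in combinations(mask_positions, size):
            bits = list(adrc_code)
            for pos in subset:
                # Positions are 1-based: position 6 refers to pixel value P6.
                bits[pos - 1] = "1" if bits[pos - 1] == "0" else "0"
            variants.append("".join(bits))
    return variants


print(masked_variants("101001111", [6, 8]))
# ['101000111', '101001101', '101000101']
```

Together with the unmodified code, two masked pixels therefore produce four ADRC codes per pixel of interest.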
【０１３７】
なお、クラスコード用タップを構成する画素の数、およびクラスコードのビット数は、上述した例に限るものではなく、任意である。
【０１３８】
更に、マスク化する画素の決定方法は、上述した限りではなく、他の方法で、マスク化する画素を決めることができるようにしても良いことは言うまでもない。
【０１３９】
また、マスク化を実行することなく、検出されたＡＤＲＣコードのみを用いて、マッチング処理を行うようにしてもよい。
【０１４０】
次に、図２６のフローチャートを参照して、図２０のステップＳ７２において実行される画素コード生成処理１について説明する。
【０１４１】
ステップＳ１１１において、画素コード生成部１０２は、ＡＤＲＣコード生成部９３により生成されたＡＤＲＣコード、または、コード変換部１０１により変換された（所定のコードがビット反転された)ＡＤＲＣコードを取得する。
【０１４２】
ステップＳ１１２において、画素コード生成部１０２は、画素位置テーブル９５に記憶されている画素位置テーブルを参照して、ＡＤＲＣコードに対応する、差分レベルが小さい画素位置を選択する。
【０１４３】
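For example, when the pixel position table described with reference to FIG. 14 is stored in the pixel position table 95 and the ADRC code acquired in step S111 has the code sequence indicated by code 3, tap P4 in FIG. 22 is selected as the pixel position used for pixel code generation.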
【０１４４】
ステップＳ１１３において、画素コード生成部１０２は、選択された画素位置に対応する画素の画素値の上位３ビットを抽出する。
【０１４５】
ステップＳ１１４において、画素コード生成部１０２は、コードＢおよびコードＣを算出して、処理は、図２０のステップＳ７３に戻る。
【０１４６】
ステップＳ１１３およびステップＳ１１４の処理は、具体的には、上述した、式（６）および式（７）を用いて、コードＢおよびコードＣを算出する処理と等価である。
【０１４７】
このようにして、ＡＤＲＣコードに対応した差分レベルが小さい画素位置の画素を基に、参照フレームの画素コードを、図２９を用いて後述するカレントフレームの画素コードの算出方法とは異なる方法で算出することができる。
【０１４８】
次に、図２７のフローチャートを参照して、図１８のステップＳ３４において実行されるカレントフレーム情報生成処理１について説明する。
【０１４９】
ステップＳ１３１において、データベース検索部６６は、１フレームメモリ内の画素をカウントするカウンタのカウンタ変数ｍを０に初期化する。
【０１５０】
ステップＳ１３２において、図２８を用いて後述するＡＤＲＣコード生成処理２が実行される。
【０１５１】
ステップＳ１３３において、図２９を用いて後述する画素コード生成処理２が実行される。
【０１５２】
ステップＳ１３４において、特徴量抽出部６３の特徴量生成部９６は、画素コード生成部９４から供給された４ビットの画素コードであるコードＡ、および、ＡＤＲＣコード生成部９３から供給された９ビットのＡＤＲＣコードを用いて、１３ビットの特徴量コードである特徴量コードＡを生成して、データベース検索部６６に供給する。
【０１５３】
ステップＳ１３５において、データベース検索部６６は、カウンタ変数ｍをインクリメントして、ｍ＝ｍ＋１とする。
【０１５４】
ステップＳ１３６において、データベース検索部６６は、カウンタ変数ｍ＝１フレームの画素数であるか否かを判断する。ステップＳ１３６において、カウンタ変数ｍ＝１フレームの画素数ではないと判断された場合、処理は、ステップＳ１３２に戻り、それ以降の処理が繰り返される。ステップＳ１３６において、カウンタ変数ｍ＝１フレームの画素数であると判断された場合、処理は、図１８のステップＳ３５に戻る。
【０１５５】
このような処理により、カレントフレーム情報が算出される。
【０１５６】
次に、図２８のフローチャートを参照して、図２７のステップＳ１３２において実行されるＡＤＲＣコード生成処理２について説明する。
【０１５７】
ステップＳ１５１乃至ステップＳ１５４において、図２１を用いて説明したステップＳ９１乃至ステップＳ９４と同様の処理が、特徴量抽出部６３において実行される。すなわち、特徴量抽出部６３のクラスタップ抽出部９１において設定された所定サイズのクラスコード用タップに含まれる複数の画素の画素値が取得され、ＤＲ演算部９２において、画素値Ｐ１乃至Ｐ９の最大値Ｐ_MAXと最小値Ｐ_MINが判定されて、画素値Ｐ１乃至Ｐ９のダイナミックレンジＤＲが算出され、最小値Ｐ_MINとともに、ＡＤＲＣコード生成部９３に供給される。そして、ＡＤＲＣコード生成部９３において、上述した式（４）を用いて閾値Ｔｈが決定される。
【０１５８】
ステップＳ１５５において、特徴量抽出部６３のＡＤＲＣコード生成部９３は、画素値Ｐ１乃至Ｐ９の９画素を、それぞれ閾値Ｔｈと比較し、閾値Ｔｈよりも大きい場合には１に量子化し、閾値Ｔｈよりも小さい場合には０に量子化して、番号順に並べた９ビットを注目画素のＡＤＲＣコードとして生成し、特徴量生成部９６に供給して、処理は、図２７のステップＳ１３３に戻る。
【０１５９】
このような処理により、カレントフレームにおいて、ＡＤＲＣコードが算出される。
【０１６０】
次に、図２９のフローチャートを参照して、図２７のステップＳ１３３において実行される画素コード生成処理２について説明する。
【０１６１】
ステップＳ１７１において、画素コード生成部９４は、ＡＤＲＣコード生成部９３により生成されたＡＤＲＣコードを取得する。
【０１６２】
ステップＳ１７２において、画素コード生成部９４は、画素位置テーブル９５に記憶されている画素位置テーブルを参照して、ＡＤＲＣコードに対応する画素位置を選択する。
【０１６３】
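In step S173, the pixel code generation unit 94 extracts the upper 4 bits of the pixel value of the pixel at the selected pixel position in the input current frame Fc.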
【０１６４】
ステップＳ１７４において、画素コード生成部９４は、コードＡを算出して、処理は、図２７のステップＳ１３４に戻る。
【０１６５】
ステップＳ１７３およびステップＳ１７４の処理は、具体的には、上述した、式（５）を用いて、コードＡを算出する処理と等価である。
【０１６６】
この処理により、画素コードであるコードＡが算出される。
【０１６７】
次に、図３０のフローチャートを参照して、図１８のステップＳ３５において実行されるマッチング処理について説明する。
【０１６８】
ステップＳ１９１において、データベース検索部６６は、特徴量抽出部６３から、カレントフレームFｃの注目画素の特徴量コードＡの入力を受ける。
【０１６９】
ステップＳ１９２において、データベース検索部６６は、データベース７１に記録されている参照フレームＦｒの特徴量（特徴量アドレス）のうち、カレントフレームFｃの特徴量コードＡと等しいものを検出する。
【０１７０】
ステップＳ１９３において、データベース検索部６６は、検出された特徴量のアドレスのフラグアドレス１乃至フラグアドレスＫのセルに記載されている画素位置情報を読み込んで、動きベクトル決定部６７に供給する。
【０１７１】
ステップＳ１９４において、動きベクトル決定部６７は、読み込まれた画素位置の中で、カレントフレームFｃの注目画素に最も近い画素位置を検出する。
【０１７２】
ステップＳ１９５において、動きベクトル決定部６７は、注目画素の画素位置と検出された画素位置を基に、動きベクトルを算出し、処理が終了される。
【０１７３】
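The description here assumes that the candidate closest to the target pixel of the current frame Fc is the corresponding pixel of the reference frame Fr, but the motion vector may be detected by other methods.
【０１７４】
The above describes the motion detection process of FIG. 18 executed by the motion detection unit 51 described with reference to FIG. 4. Steps S31 through S35 of FIG. 18 are executed partly in parallel. As an example, the timing at which the (n−1)-th input frame (frame n−1), the n-th input frame (frame n), and the (n+1)-th input frame (frame n+1) are processed is described with reference to FIG. 31.
【０１７５】
For example, when frame n is input to the frame memory 61, the feature amount extraction unit 64 calculates feature amount code B and feature amount code C with frame n−1 as the reference frame.
【０１７６】
Next, while the feature amount extraction unit 63 is calculating feature amount code A with frame n as the current frame, frame n is supplied to the frame memory 62, which is the second frame memory, and feature amount code B and feature amount code C of frame n−1 calculated by the feature amount extraction unit 64 are registered in the database 71 by the database control unit 65.
【０１７７】
After that, the database search unit 66 matches feature amount code A of frame n (the current frame), calculated by the feature amount extraction unit 63, against feature amount code B and feature amount code C of frame n−1 (the reference frame), registered in the database 71, and detects the pixel positions whose feature amounts correspond to feature amount code A of the current frame. At this time, the feature amount extraction unit 64 calculates feature amount code B and feature amount code C with frame n as the reference frame, and frame n+1, the next current frame, is input to the frame memory 61.
【０１７８】
Next, while the feature amount extraction unit 63 is calculating feature amount code A of frame n+1 (the current frame), frame n+1 is supplied to the frame memory 62, which is the second frame memory, and feature amount code B and feature amount code C of frame n (the reference frame) calculated by the feature amount extraction unit 64 are registered in the database 71 by the database control unit 65.
【０１７９】
Then, the database search unit 66 matches feature amount code A of frame n+1 (the current frame), calculated by the feature amount extraction unit 63, against feature amount code B and feature amount code C of frame n (the reference frame), registered in the database 71, and detects the pixel positions whose feature amounts correspond to feature amount code A of the current frame. The same processing is repeated thereafter.
【０１８０】
In this way, steps S31 through S35 are executed partly in parallel.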
【０１８１】
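Furthermore, by providing two databases, the reference frame information generation process (registration into a database) and the matching process (which uses the information registered in a database) can be performed in parallel, which speeds up processing.
【０１８２】
FIG. 32 is a block diagram showing the configuration of a motion detection unit 121 provided with two databases. Parts corresponding to those in FIG. 4 are given the same reference numerals, and their description is omitted as appropriate.
【０１８３】
That is, the motion detection unit 121 of FIG. 32 has basically the same configuration as the motion detection unit 51 described with reference to FIG. 4, except that, in place of the database control unit 65, it has a database control unit 131 that includes two databases, database 71-1 and database 71-2, and a database selection processing unit 141, and that the frame memory 62 is omitted.
【０１８４】
FIG. 33 is a block diagram showing the database control unit of FIG. 32 in more detail. The database selection processing unit 141 controls the input and output of information so that the feature amounts of the current frame Fc supplied from the feature amount extraction unit 64 are registered in one of database 71-1 and database 71-2 while the information on the reference frame Fr already registered in the other is read by the database search unit 66.
【０１８５】
That is, the database control unit 131 has two databases, database 71-1 and database 71-2. For example, while the feature amounts of the current frame Fc are being registered in database 71-2, the information on the reference frame Fr registered in database 71-1 by the preceding processing is read by the database search unit 66. Then, after the matching of the reference frame Fr registered in database 71-1 against the current frame Fc and the registration of the feature amounts of the current frame Fc in database 71-2 have both finished, the next frame becomes the current frame Fc, its feature amounts are registered in database 71-1, and the information registered in database 71-2 is read by the database search unit 66 as the information on the reference frame Fr.
【０１８６】
That is, reading from database 71-1 and writing to database 71-2 are performed in parallel, and reading from database 71-2 and writing to database 71-1 are performed in parallel.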
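The alternating use of the two databases can be sketched as a two-bank ("ping-pong") store. This is a single-threaded illustration of the role swap only, not of the actual parallel hardware: the class name `PingPongDatabase` and its methods are hypothetical, and each bank is modeled as a dict from feature code to a list of pixel positions.

```python
class PingPongDatabase:
    """Two banks: one is written for the newest frame, the other is read as Fr."""

    def __init__(self):
        self._banks = [{}, {}]  # stand-ins for database 71-1 and database 71-2
        self._write = 0         # index of the bank currently being written

    def register(self, feature, position):
        """Register a position of the frame being written under its feature code."""
        self._banks[self._write].setdefault(feature, []).append(position)

    def lookup(self, feature):
        """Read candidate positions from the bank holding the previous frame."""
        return self._banks[1 - self._write].get(feature, [])

    def next_frame(self):
        """Swap roles: the bank just written becomes readable, the other is reused."""
        self._write = 1 - self._write
        self._banks[self._write].clear()


db = PingPongDatabase()
db.register("code-x", (1, 1))   # features of frame n-1 go into the first bank
db.next_frame()
db.register("code-y", (2, 2))   # features of frame n go into the second bank
print(db.lookup("code-x"))      # prints [(1, 1)]
```

Since reads and writes never touch the same bank within a frame period, the two operations can proceed at the same time without one waiting for the other.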
【０１８７】
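Motion detection process 2, executed by the motion detection unit 121 of FIG. 32, is described with reference to the flowchart of FIG. 34.
【０１８８】
In step S211, the frame memory 61 receives a frame. The frame memory 61 supplies the input frame to the feature amount extraction unit 63 and the feature amount extraction unit 64.
【０１８９】
In step S212, the reference frame information generation process described with reference to FIG. 19 is executed in the feature amount extraction unit 64.
【０１９０】
In step S213, the current frame information generation process described with reference to FIG. 27 is executed in the feature amount extraction unit 63.
【０１９１】
In step S214, the matching process described with reference to FIG. 30 is executed, and the process ends.
【０１９２】
Steps S211 through S214 are executed partly in parallel. As an example, the timing at which the (n−1)-th input frame (frame n−1), the n-th input frame (frame n), and the (n+1)-th input frame (frame n+1) are processed is described with reference to FIG. 35.
【０１９３】
For example, when frame n is input to the frame memory 61, the feature amount extraction unit 64 calculates feature amount code B and feature amount code C of frame n−1 and supplies them to the database control unit 131.
【０１９４】
Next, while the feature amount extraction unit 63 is calculating feature amount code A of frame n, the feature amount extraction unit 64 calculates feature amount code B and feature amount code C of frame n. In parallel with this, feature amount code B and feature amount code C of frame n−1 calculated by the feature amount extraction unit 64 are registered by the database control unit 131 in one of the databases 71 (hereinafter referred to as the first database). At this time, frame n+1 is input to the frame memory 61.
【０１９５】
After that, the database search unit 66 matches feature amount code A of frame n, calculated by the feature amount extraction unit 63, against feature amount code B and feature amount code C of frame n−1, registered in the first database, and obtains the information on the pixel positions whose feature amounts correspond to feature amount code A. At this time, feature amount code B and feature amount code C of frame n calculated by the feature amount extraction unit 64 are registered by the database control unit 131 in the other database 71 (hereinafter referred to as the second database).
【０１９６】
That is, reading from the first database and writing to the second database are performed in parallel. In addition, also in parallel, the feature amount extraction unit 63 calculates feature amount code A of frame n+1, and the feature amount extraction unit 64 calculates feature amount code B and feature amount code C of frame n+1.
【０１９７】
Then, the database search unit 66 matches feature amount code A of frame n+1, calculated by the feature amount extraction unit 63, against feature amount code B and feature amount code C of frame n, registered in the second database, and obtains the information on the pixel positions whose feature amounts correspond to feature amount code A. At this time, feature amount code B and feature amount code C of frame n+1 calculated by the feature amount extraction unit 64 are registered in the first database by the database control unit 131, and the same processing is repeated thereafter.
【０１９８】
With this configuration, processing can be made faster than in the case described with reference to FIG. 31.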
【０１９９】
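Next, the generation of the pixel position table stored in the pixel position table 95 of the feature amount extraction unit 63 and the feature amount extraction unit 64 is described.
【０２００】
FIG. 36 is a block diagram showing the configuration of a pixel position table generation device 161, which generates the pixel position table. Parts corresponding to those in FIG. 5 are given the same reference numerals, and their description is omitted as appropriate.
【０２０１】
The class tap extraction unit 91, the DR calculation unit 92, and the ADRC code generation unit 93 generate the ADRC code of image information A and supply it to the data storage unit 172. Here, the class tap extraction unit 91 is assumed to extract a class tap of nine pixels, 3 pixels × 3 pixels, as described with reference to FIG. 22.
【０２０２】
The difference level extraction unit 171 receives the class tap of image information A and image information B, which is the image of the frame preceding image information A. The pixel positions that correspond before and after the motion in image information A and image information B are known. The difference level extraction unit 171 detects the change in level (difference level) of the corresponding pixels before and after the motion for each of the nine pixel positions of the class tap and supplies it to the data storage unit 172.
【０２０３】
The data storage unit 172 accumulates the difference levels for each pixel position, supplied from the difference level extraction unit 171, separately for each ADRC code. As described with reference to FIG. 13, the relationship between pixel position and difference level shows regularity depending on the ADRC code. The data storage unit 172 accumulates difference level information corresponding to, for example, about 25 patterns or more of image information A and image information B.
【０２０４】
The pixel position table generation unit 173 detects, from the data accumulated in the data storage unit 172, the pixel position with the smallest difference level for each ADRC code and generates the pixel position table described with reference to FIG. 14.
【０２０５】
Next, the pixel position table generation process executed by the pixel position table generation device 161 is described with reference to the flowchart of FIG. 37.
【０２０６】
In step S231, ADRC code generation process 2 described with reference to FIG. 28 is executed.
【０２０７】
In step S232, the difference level extraction unit 171 extracts the difference level of the pixel values at the pixel positions that correspond before and after the motion between image information A and image information B, the image one frame before image information A, and supplies it to the data storage unit 172.
【０２０８】
In step S233, the data storage unit 172 accumulates the difference level information supplied from the difference level extraction unit 171 in association with the ADRC code supplied from the ADRC code generation unit 93. Here, difference level information corresponding to, for example, about 25 patterns of image information A and image information B is assumed to be accumulated.
【０２０９】
In step S234, the pixel position table generation unit 173 refers to the difference level information accumulated in the data storage unit 172 and detects, for each ADRC code, the pixel position with the smallest difference level.
【０２１０】
In step S235, the pixel position table generation unit 173 generates the pixel position table described with reference to FIG. 14, and the process ends.
【０２１１】
Through this processing, the pixel position table described with reference to FIG. 14 is generated. When a motion vector is detected, this table is referenced and feature amounts are extracted from the pixel values at pixel positions with low difference levels, so the influence of level fluctuations caused by noise and the like on motion vector detection can be reduced.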
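The table generation in steps S231 through S235 can be sketched as an accumulate-then-argmin pass. This is a sketch under assumptions: each training sample is given as `(ADRC code, taps of image A, motion-aligned taps of image B)`, the difference level is the absolute pixel difference, and accumulation is a plain sum per tap position, since the exact statistic is not specified here. The name `build_position_table` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple

Sample = Tuple[int, Sequence[int], Sequence[int]]


def build_position_table(samples: Iterable[Sample]) -> Dict[int, int]:
    """Map each ADRC code to the tap index whose level changed least between frames."""
    totals: Dict[int, List[int]] = defaultdict(lambda: [0] * 9)
    for adrc_code, taps_a, taps_b in samples:
        for pos in range(9):
            # S232/S233: accumulate the difference level per ADRC code and position
            totals[adrc_code][pos] += abs(taps_a[pos] - taps_b[pos])
    # S234/S235: for every ADRC code keep the position with the smallest total
    return {
        code: min(range(9), key=levels.__getitem__)
        for code, levels in totals.items()
    }


samples = [
    (5, [10] * 9, [12, 11, 13, 15, 14, 10, 19, 18, 17]),
    (5, [50] * 9, [50, 50, 52, 53, 54, 58, 56, 57, 59]),
]
print(build_position_table(samples))  # prints {5: 1}
```

In the example, tap index 5 is stable in the first sample but not in the second, so accumulating over several samples selects index 1, which is stable in both.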
【０２１２】
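The series of processes described above can also be executed by software. The software is installed from a recording medium onto a computer in which the programs constituting the software are built into dedicated hardware, or onto, for example, a general-purpose personal computer that can execute various functions when various programs are installed.
【０２１３】
As shown in FIG. 38, this recording medium is constituted by package media that are distributed separately from the computer to provide the program to users and on which the program is recorded, such as a magnetic disk 241 (including a flexible disk), an optical disk 242 (including a CD-ROM (Compact Disk-Read Only Memory) and a DVD (Digital Versatile Disk)), a magneto-optical disk 243 (including an MD (Mini-Disk) (trademark)), or a semiconductor memory 244.
【０２１４】
The personal computer 201 is described with reference to FIG. 38.
【０２１５】
The CPU (Central Processing Unit) 221 receives, via the input/output interface 222 and the internal bus 223, signals corresponding to various commands input by the user with the input unit 224 and, via the network interface 230, control signals transmitted by other personal computers, and executes various processes based on the input signals. The ROM (Read Only Memory) 225 stores the basically fixed data among the programs and calculation parameters used by the CPU 221. The RAM (Random Access Memory) 226 stores the programs used in execution by the CPU 221 and the parameters that change as appropriate during that execution. The CPU 221, the ROM 225, and the RAM 226 are interconnected by the internal bus 223.
【０２１６】
The internal bus 223 is also connected to the input/output interface 222. The input unit 224 consists of, for example, a keyboard, a touch pad, a jog dial, or a mouse, and is operated when the user inputs various commands to the CPU 221. The display unit 227 consists of, for example, a CRT (Cathode Ray Tube) or a liquid crystal display, and displays various information as text or images.
【０２１７】
The HDD (hard disk drive) 228 drives hard disks and records on them, or reproduces from them, programs executed by the CPU 221 and other information. The magnetic disk 241, the optical disk 242, the magneto-optical disk 243, and the semiconductor memory 244 are loaded into the drive 229 as needed to exchange data.
【０２１８】
The network interface 230 is connected to other personal computers and to various devices other than personal computers, either by wire using a predetermined cable or wirelessly, and exchanges information with those devices or accesses web servers via the Internet to exchange information.
【０２１９】
The input unit 224 through the network interface 230 are connected to the CPU 221 via the input/output interface 222 and the internal bus 223.
【０２２０】
In this specification, the steps describing the program recorded on the recording medium include not only processes performed in time series in the described order but also processes executed in parallel or individually without necessarily being processed in time series.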
【０２２１】
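[Effects of the invention]
Thus, according to the present invention, the pixel of the reference frame corresponding to the target pixel of the current frame can be detected. In particular, the pixel can be detected correctly even if its feature amount changes between the reference frame and the current frame, so a correct motion vector can be obtained.
【０２２２】
According to another aspect of the present invention, the pixel position with the smallest difference level can be detected for each ADRC code, and a table indicating the pixel positions of the pixels used for image processing can be generated.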
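[Brief description of the drawings]
[FIG. 1] A block diagram showing the configuration of a conventional motion detection unit.
[FIG. 2] A diagram explaining a motion vector detection method.
[FIG. 3] A flowchart explaining the motion detection process performed by the motion detection unit of FIG. 1.
[FIG. 4] A block diagram showing the configuration of a motion detection unit to which the present invention is applied.
[FIG. 5] A block diagram showing the configuration of the feature amount extraction unit 63 of FIG. 4.
[FIG. 6] A diagram explaining class taps.
[FIG. 7] A diagram explaining class taps.
[FIG. 8] A diagram explaining class taps.
[FIG. 9] A diagram explaining class taps.
[FIG. 10] A diagram explaining class taps.
[FIG. 11] A diagram explaining class taps.
[FIG. 12] A diagram explaining the ADRC code.
[FIG. 13] A diagram explaining the relationship between tap position and difference level for each ADRC code.
[FIG. 14] A diagram explaining the pixel position table.
[FIG. 15] A block diagram showing the configuration of the feature amount extraction unit 64 of FIG. 4.
[FIG. 16] A diagram explaining code A, code B, and code C, which are calculated as pixel codes.
[FIG. 17] A diagram explaining the structure of the database of FIG. 4.
[FIG. 18] A flowchart explaining motion detection process 1 performed by the motion detection unit of FIG. 4.
[FIG. 19] A flowchart explaining the reference frame information generation process.
[FIG. 20] A flowchart explaining the feature amount calculation process.
[FIG. 21] A flowchart explaining ADRC code generation process 1.
[FIG. 22] A diagram showing an example of class code taps.
[FIG. 23] A diagram explaining the pixel positions to be masked.
[FIG. 24] A diagram explaining the pixel positions to be masked.
[FIG. 25] A diagram explaining the pixel positions to be masked.
[FIG. 26] A flowchart explaining pixel code generation process 1.
[FIG. 27] A flowchart explaining the current frame information generation process.
[FIG. 28] A flowchart explaining ADRC code generation process 2.
[FIG. 29] A flowchart explaining pixel code generation process 2.
[FIG. 30] A flowchart explaining the matching process.
[FIG. 31] A diagram explaining the timing of the motion detection process performed by the motion detection unit of FIG. 4.
[FIG. 32] A block diagram showing another configuration of a motion detection unit to which the present invention is applied.
[FIG. 33] A block diagram showing the configuration of the database control unit of FIG. 32.
[FIG. 34] A flowchart explaining motion detection process 2 performed by the motion detection unit of FIG. 32.
[FIG. 35] A diagram explaining the timing of the motion detection process performed by the motion detection unit of FIG. 32.
[FIG. 36] A block diagram showing a configuration example of the pixel position table generation device.
[FIG. 37] A flowchart explaining the pixel position table generation process.
[FIG. 38] A block diagram showing a configuration example of a personal computer.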
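[Explanation of symbols]
51 motion detection unit, 61, 62 frame memory, 63, 64 feature amount extraction unit, 65 database control unit, 66 database search unit, 67 motion vector determination unit, 71 database, 91 class tap extraction unit, 92 DR calculation unit, 93 ADRC code generation unit, 94 pixel code generation unit, 95 pixel position table, 96 feature amount generation unit, 101 code conversion unit, 102 pixel code generation unit, 121 motion detection unit, 131 database control unit, 141 database selection processing unit, 161 pixel position table generation device, 171 difference level extraction unit, 172 data storage unit, 173 pixel position table generation unit, 201 personal computer
[0001]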
BACKGROUND OF THE INVENTION
The present invention relates to an image processing apparatus and method, a learning apparatus and method, a recording medium, and a program, and in particular to an image processing apparatus and method, a learning apparatus and method, a recording medium, and a program that can detect a motion vector more accurately.
[0002]
[Prior art]
There is a technique for obtaining a motion vector indicating the motion of an image and efficiently compressing the moving image based on the motion vector.
[0003]
Several methods for obtaining the above-described motion vector in this moving image compression technique have been proposed. As a representative method, there is a method called a block matching algorithm.
[0004]
FIG. 1 shows a configuration example of a motion detection unit 1 of a conventional image processing apparatus employing a block matching algorithm.
[0005]
For example, when an image signal is input from the input terminal Tin at time t1, the frame memory 11 of the motion detection unit 1 stores one frame of information. When the image signal of the next frame is input from the input terminal Tin at the next timing, time t2, the frame memory 11 outputs the frame of image information stored at time t1 to the frame memory 12 and then stores the newly input frame of image information.
[0006]
At time t2, the frame memory 12 stores the frame of image information that was input from the input terminal Tin at time t1 and is now supplied from the frame memory 11.
[0007]
That is, when the frame memory 11 stores the (current) frame of image information input at time t2, the frame memory 12 holds the frame of image information input at time t1 (one timing earlier). Hereinafter, the image information stored in the frame memory 11 is referred to as the current frame Fc, and the image information stored in the frame memory 12 is referred to as the reference frame Fr.
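The hand-off between the two frame memories amounts to a one-frame delay line, which can be sketched as follows. The class name `FrameDelay` is hypothetical, and a frame can be any object in this sketch.

```python
class FrameDelay:
    """Models frame memory 11 (current frame Fc) and frame memory 12 (reference frame Fr)."""

    def __init__(self):
        self.current = None    # contents of frame memory 11
        self.reference = None  # contents of frame memory 12

    def push(self, frame):
        """Store a newly input frame; the previous frame moves to the reference slot."""
        self.reference = self.current
        self.current = frame
        return self.current, self.reference


delay = FrameDelay()
delay.push("frame at t1")
print(delay.push("frame at t2"))  # prints ('frame at t2', 'frame at t1')
```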
[0008]
The motion vector detection unit 13 reads the current frame Fc and the reference frame Fr stored in the frame memories 11 and 12, respectively, detects a motion vector by the block matching algorithm based on the current frame Fc and the reference frame Fr, and outputs it from the output terminal Tout.
[0009]
Here, the block matching algorithm is described. For example, as shown in FIG. 2, to obtain the motion vector corresponding to a target pixel P(i, j) in the current frame Fc, the following are first set: on the current frame Fc, a base block Bb(i, j) of L × L pixels centered on the target pixel P(i, j); on the reference frame Fr, a search area SR corresponding to the position of the target pixel P(i, j); and, within the search area SR, a reference block Brn(i, j) of L × L pixels.
[0010]
Next, the sum of the absolute values of the differences between the pixels of the base block Bb(i, j) and the reference block Brn(i, j) is calculated. This calculation is repeated from Br1 to Brm in FIG. 2 (m being the number of reference blocks Brn that can be set in the search area SR) while the reference block Brn is moved one pixel at a time, horizontally or vertically, over the entire search area SR.
[0011]
Of the sums of absolute differences obtained in this way between the pixels of the base block Bb(i, j) and the reference block Brn(i, j), the reference block Brn with the smallest sum is found. This yields the reference pixel Pn(i, j) at the center of the L × L pixels constituting the reference block Brn(i, j) closest (most similar) to the base block Bb(i, j).
[0012]
Then, the vector whose start point is the pixel P′(i, j) on the reference frame Fr corresponding to the target pixel P(i, j) on the current frame Fc and whose end point is the reference pixel Pn(i, j) is output as the motion vector (Vx, Vy) of the target pixel P(i, j). For example, when P(i, j) = (a, b) and Pn(i, j) = (c, d), (Vx, Vy) = (c − a, d − b).
[0013]
That is, the motion vector is the vector that starts at the pixel P′(i, j) on the reference frame Fr corresponding to the target pixel P(i, j) and ends at the reference pixel Pn(i, j), the center of the L × L pixels constituting the reference block Brn(i, j) closest (most similar) to the base block Bb(i, j).
[0014]
Next, the motion detection processing of the motion detection unit 1 in FIG. 1 will be described with reference to the flowchart in FIG.
[0015]
In step S1, the motion vector detection unit 13 sets the search area SR according to the pixel position of the target pixel P(i, j) on the current frame Fc stored in the frame memory 11.
[0016]
In step S2, the motion vector detection unit 13 initializes the variable min, which holds the minimum sum of absolute differences between the pixels of the base block Bb(i, j) and the reference block Brn(i, j), to the number of gradations of a pixel multiplied by the number of pixels constituting the base block Bb(i, j). For example, when one pixel is 8-bit data, the number of gradations of a pixel is 2^8, that is, 256 gradations (256 colors). When the base block Bb(i, j) consists of L × L = 3 × 3 pixels, the number of pixels is nine. As a result, the variable min is initialized to 2304 (= 256 (number of gradations) × 9 (number of pixels)).
[0017]
In step S3, the motion vector detection unit 13 initializes a counter variable n for counting the reference block Brn to 1.
[0018]
In step S4, the motion vector detecting unit 13 initializes a variable sum used for substituting the sum of absolute differences between the pixels of the base block Bb and the reference block Brn to 0.
[0019]
In step S5, the motion vector detection unit 13 obtains the sum of absolute differences (= sum) between the pixels of the base block Bb(i, j) and the reference block Brn(i, j). That is, when each pixel of the base block Bb(i, j) is denoted P_Bb(i, j) and each pixel of the reference block Brn(i, j) is denoted P_Brn(i, j), the motion vector detection unit 13 performs the operation expressed by the following equation (1) to obtain the sum of absolute differences between the pixels of the base block Bb(i, j) and the reference block Brn(i, j).
[0020]
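[Expression 1]
$$\mathrm{sum} = \sum_{i=1}^{L} \sum_{j=1}^{L} \left| P_{Bb}(i, j) - P_{Brn}(i, j) \right| \qquad (1)$$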
[0021]
In step S6, the motion vector detection unit 13 determines whether the variable min is larger than the variable sum. If it determines that min is larger than sum, then in step S7 it updates min to sum and registers the value of the counter n at that time as the motion vector number. That is, a sum of absolute differences smaller than the current minimum min means that the reference block Brn(i, j) now being evaluated is more similar to the base block Bb(i, j) than any reference block evaluated so far, so the counter n at that time is registered as the motion vector number to make that block the candidate for the motion vector. If it is determined in step S6 that min is not larger than sum, step S7 is skipped.
[0022]
In step S8, the motion vector detecting unit 13 determines whether or not the counter variable n is the total number m of the reference blocks Brn in the search area SR, that is, whether or not the current reference block Brn is Brn = Brm. For example, when it is determined that the total number is not m, the counter variable n is incremented by 1 in step S9, and the process returns to step S4.
[0023]
In step S8, when it is determined that the counter variable n equals the total number m of reference blocks Brn in the search area, that is, that the current reference block Brn is Brm, the motion vector detection unit 13 outputs, in step S10, the motion vector based on the registered motion vector number. That is, by repeating steps S4 through S9, the counter variable n corresponding to the reference block Brn with the smallest sum of absolute differences is registered as the motion vector number. The reference pixel Pn(i, j) at the center of the L × L pixels of the reference block Brn corresponding to this motion vector number is then obtained, and the vector that starts at the pixel P′(i, j) on the reference frame Fr corresponding to the target pixel P(i, j) on the current frame Fc and ends at the reference pixel Pn(i, j) is output as the motion vector (Vx, Vy) of the target pixel P(i, j).
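Steps S1 through S10 can be sketched as an exhaustive sum-of-absolute-differences search. This is a minimal sketch under assumptions: frames are lists of rows of 8-bit values, `i` is the column and `j` the row of the target pixel, the search area is a square of ±`search` pixels (the parameter name and default are hypothetical), the base block is assumed to lie fully inside the frame, and reference blocks that would leave the frame are skipped.

```python
from typing import List, Tuple

Frame = List[List[int]]


def block_matching(
    fc: Frame, fr: Frame, i: int, j: int, L: int = 3, search: int = 2
) -> Tuple[int, int]:
    """Return (Vx, Vy) for the target pixel at column i, row j of the current frame."""
    half = L // 2
    height, width = len(fr), len(fr[0])
    best_sum = 256 * L * L  # S2: number of gradations times pixels per block
    best = (0, 0)
    for dy in range(-search, search + 1):        # S3, S8, S9: visit each block Brn
        for dx in range(-search, search + 1):
            cx, cy = i + dx, j + dy              # center of the reference block
            if not (half <= cx < width - half and half <= cy < height - half):
                continue                         # block would leave the frame
            total = 0                            # S4: reset the running sum
            for v in range(-half, half + 1):     # S5: sum of absolute differences
                for u in range(-half, half + 1):
                    total += abs(fc[j + v][i + u] - fr[cy + v][cx + u])
            if best_sum > total:                 # S6, S7: keep the smallest sum
                best_sum, best = total, (dx, dy)
    return best                                  # S10: offset from P' to Pn


fr = [[x * x + 15 * y for x in range(10)] for y in range(10)]
fc = [[fr[(y + 1) % 10][(x + 2) % 10] for x in range(10)] for y in range(10)]
print(block_matching(fc, fr, 5, 5))  # prints (2, 1)
```

For a 3 × 3 block and a ±2 search this already evaluates 25 blocks of 9 differences for a single pixel, which illustrates why the computation grows quickly with block and search size.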
[0024]
For detecting a motion vector by the block matching method described above, there is a technique in which a stationary component and a transient component are extracted for each motion vector detection target block and for each small block of the reference frame, the differences between the respective components are detected, and the accumulated absolute differences are weighted and averaged to form an evaluation value. Detecting the motion vector based on this evaluation value reduces the amount of computation and prevents erroneous detection (see, for example, Patent Document 1). There is also a technique in which each pixel value is encoded into a 1-bit code value by 1-bit ADRC and the matching operation is performed using the code values, which simplifies the circuit configuration, the computation time, and the like involved in motion vector detection (see, for example, Patent Document 2).
[0025]
[Patent Document 1]
Japanese Patent Application Laid-Open No. 07-087494
[0026]
[Patent Document 2]
JP 2000-278691 A
[0027]
[Problems to be solved by the invention]
However, because the block matching algorithm described above requires an enormous amount of computation for equation (1), there is a problem in that most of the time in image compression processing such as MPEG (Moving Picture Experts Group) is spent on this processing.
[0028]
In addition, when noise is present near the start point or end point of the motion vector in the current frame Fc or the reference frame Fr, block matching cannot detect a reference block similar to the base block, and there is a problem in that an accurate motion vector cannot be detected.
[0029]
The present invention has been made in view of such circumstances, and makes it possible to accurately generate a motion vector.
[0030]
[Means for Solving the Problems]
  The image processing apparatus according to the present invention uses a pixel position table that stores, in association with an n-bit quantization code generated by quantizing with ADRC each pixel value of a predetermined pixel group including the target pixel of a first learning frame, information on the pixel position within the predetermined pixel group at which the level fluctuation of the pixel value detected between the first learning frame and a second learning frame temporally preceding the first learning frame is smallest. The apparatus comprises the following.
  First feature amount extraction means reads from the pixel position table the first pixel position with the smallest level fluctuation associated with the quantization code of the first learning frame whose value matches the n-bit quantization code generated by quantizing with ADRC each pixel value of a predetermined pixel group including the target pixel of an input first frame, and extracts feature amount code A of the target pixel of the first frame, which consists of code A, coded using the upper m bits of the 8-bit data indicating the pixel value at the first pixel position, and the quantization code of the predetermined pixel group including the target pixel of the first frame.
  Second feature amount extraction means reads from the pixel position table the second pixel position with the smallest level fluctuation associated with the quantization code of the first learning frame whose value matches the n-bit quantization code generated by quantizing with ADRC each pixel value of a predetermined pixel group including the target pixel of a second frame, which is a frame temporally preceding the first frame, and extracts feature amount code B and feature amount code C of the target pixel of the second frame, each consisting of code B or code C, which are coded using the upper m−1 bits of the 8-bit data indicating the pixel value at the second pixel position so as to overlap each other within a predetermined pixel value range including the pixel value indicated by a boundary of code A, and the quantization code of the predetermined pixel group including the target pixel of the second frame.
  A database stores the information on the corresponding pixel position for each feature amount code B and feature amount code C extracted by the second feature amount extraction means.
  Search means searches the pixel position information stored in the database for the pixel position information of the feature amount code B and feature amount code C whose values match feature amount code A of the target pixel of the first frame extracted by the first feature amount extraction means.
[0031]
  The second feature amount extraction means can code code B using the upper m−1 bits of the 8-bit data indicating the pixel value at the second pixel position by adding 2^(m−1) to that 8-bit data, dividing the result by 2^(m+1), and multiplying the quotient by 2, and can code code C using the upper m−1 bits of that 8-bit data by subtracting 2^(m−1) from it, dividing the result by 2^(m+1), multiplying the quotient by 2, and adding 1.
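One reading of the coding of code B and code C is sketched below for the m = 4 case, where code A is the upper 4 bits of an 8-bit value. The formulas are an interpretation of the paragraph above, integer division is assumed to round down (floor), and the function names are hypothetical. Under this reading, code B covers the even values of code A and code C the odd ones, each with a margin of 2^(m−1) = 8 levels on both sides, so a reference pixel still matches a current pixel whose level has shifted slightly across a code A boundary.

```python
def code_a(value: int, m: int = 4) -> int:
    """Upper m bits of an 8-bit pixel value."""
    return value >> (8 - m)


def code_b(value: int, m: int = 4) -> int:
    """Even-numbered code: add 2^(m-1), divide by 2^(m+1) (floor), multiply by 2."""
    return ((value + 2 ** (m - 1)) // 2 ** (m + 1)) * 2


def code_c(value: int, m: int = 4) -> int:
    """Odd-numbered code: subtract 2^(m-1), divide by 2^(m+1) (floor), double, add 1."""
    return ((value - 2 ** (m - 1)) // 2 ** (m + 1)) * 2 + 1


# A reference pixel of level 20 is registered under codes 0 and 1, so a current
# pixel of level 14 (code A = 0) or level 26 (code A = 1) both find it.
print(code_b(20), code_c(20), code_a(14), code_a(26))  # prints 0 1 0 1
```

With floor division, `code_c` returns −1 for levels below 8 and `code_b` returns 16 for levels of 248 and above; neither value equals any code A, so those entries simply never match.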
[0032]
  The second feature amount extraction means can also bit-invert only predetermined codes, namely those corresponding to pixel values near the quantization threshold, in the quantization code of the predetermined pixel group including the target pixel of the second frame; read from the pixel position table the third pixel position with the smallest level fluctuation associated with the quantization code of the first learning frame whose value matches the quantization code in which only the predetermined codes are bit-inverted; and extract feature amount code B and feature amount code C, each consisting of code B or code C, which are coded using the upper m−1 bits of the 8-bit data indicating the pixel value at the third pixel position so as to overlap each other within a predetermined pixel value range including the pixel value indicated by a boundary of code A, and the quantization code in which only the predetermined codes are bit-inverted.
[0033]
  Two databases can be provided, with information stored in them alternately for each frame.
[0034]
Motion vector generation means can further be provided that detects, among the pixel position information found by the search means, the pixel position at the minimum distance from the target pixel of the first frame and generates a motion vector based on the detected pixel position and the pixel position of the target pixel.
[0035]
  In the image processing method of the present invention, an image processing apparatus that detects a motion vector associates each pixel value of ADRC of a predetermined pixel group including a target pixel of the first learning frame with an n-bit quantization code, Among the predetermined pixel group, information on the pixel position where the level fluctuation of the pixel value detected in the first learning frame and the second learning frame that is temporally prior to the first learning frame is the smallest. The first bit whose value matches the n-bit quantization code generated by quantizing each pixel value of a predetermined pixel group including the target pixel of the input first frame from the pixel position table to be stored by ADRC. Minimal level fluctuation associated with the quantization code of the learning frameFirstReads the pixel position and has the least level fluctuation read from the pixel position tableFirstOf the 8-bit data indicating the pixel value at the pixel position, a first code consisting of a code A coded using the upper m bits and a quantization code of a predetermined pixel group including the target pixel of the first frame, The feature amount code A of the target pixel of the second frame is extracted, and each pixel value of a predetermined pixel group including the target pixel of the second frame, which is a frame temporally prior to the first frame, is quantized by ADRC. 
The level fluctuation associated with the quantized code of the first learning frame whose value matches the generated n-bit quantized code is the least.SecondReads the pixel position from the pixel position table and has the least level fluctuation read from the pixel position tableSecondOf the 8-bit data indicating the pixel value at the pixel position, the upper m-1 bits are used., So as to overlap each other within a predetermined pixel value range including the pixel value indicated by the boundary part of code AA feature amount code B and a feature amount code C of the pixel of interest in the second frame, each of which includes the coded code B or code C and a quantization code of a predetermined pixel group including the pixel of interest in the second frame, respectively. For each extracted feature quantity code B and feature quantity code C, the process of storing the corresponding pixel position information in the database is controlled, and the extracted pixel position information stored in the database is extracted. A step of retrieving pixel position information of a feature code B and a feature code C whose values match the feature code A of the target pixel of the first frame.
[0036]
  The program recorded on the first recording medium of the present invention associates each ADRC pixel value of a predetermined pixel group including the target pixel of the first learning frame with an n-bit quantization code, In the pixel group, information on the pixel position where the level fluctuation of the pixel value detected in the first learning frame and the second learning frame that is temporally prior to the first learning frame is the smallest is stored. A first learning frame whose value matches the n-bit quantization code generated by ADRC quantizing each pixel value of a predetermined pixel group including the target pixel of the input first frame from the pixel position table Minimal level fluctuations associated with other quantization codesFirstReads the pixel position and has the least level fluctuation read from the pixel position tableFirstOf the 8-bit data indicating the pixel value at the pixel position, a first code consisting of a code A coded using the upper m bits and a quantization code of a predetermined pixel group including the target pixel of the first frame, The feature amount code A of the target pixel of the second frame is extracted, and each pixel value of a predetermined pixel group including the target pixel of the second frame, which is a frame temporally prior to the first frame, is quantized by ADRC. 
The level fluctuation associated with the quantized code of the first learning frame whose value matches the generated n-bit quantized code is the least.SecondReads the pixel position from the pixel position table and has the least level fluctuation read from the pixel position tableSecondOf the 8-bit data indicating the pixel value at the pixel position, the upper m-1 bits are used., So as to overlap each other within a predetermined pixel value range including the pixel value indicated by the boundary part of code AA feature amount code B and a feature amount code C of the pixel of interest in the second frame, each of which includes the coded code B or code C and a quantization code of a predetermined pixel group including the pixel of interest in the second frame, respectively. For each extracted feature quantity code B and feature quantity code C, the process of storing the corresponding pixel position information in the database is controlled, and the extracted pixel position information stored in the database is extracted. The computer is caused to execute a process including a step of searching for pixel position information of the feature amount code B and the feature amount code C whose values match the feature amount code A of the target pixel of the first frame.
[0037]
  The first program of the present invention associates each pixel value of ADRC of a predetermined pixel group including the target pixel of the first learning frame with an n-bit quantization code, and outputs the first of the predetermined pixel group. Input from a pixel position table that stores information on pixel positions with the least level fluctuation of pixel values detected in the second learning frame that is temporally prior to the first learning frame and the second learning frame Each pixel value of a predetermined pixel group including the target pixel of the first frame is associated with the n-bit quantization code generated by quantizing with ADRC and the quantization value of the first learning frame whose value matches Minimized level fluctuationFirstReads the pixel position and has the least level fluctuation read from the pixel position tableFirstOf the 8-bit data indicating the pixel value at the pixel position, a first code consisting of a code A coded using the upper m bits and a quantization code of a predetermined pixel group including the target pixel of the first frame, The feature amount code A of the target pixel of the second frame is extracted, and each pixel value of a predetermined pixel group including the target pixel of the second frame, which is a frame temporally prior to the first frame, is quantized by ADRC. 
The level fluctuation associated with the quantized code of the first learning frame whose value matches the generated n-bit quantized code is the least.SecondReads the pixel position from the pixel position table and has the least level fluctuation read from the pixel position tableSecondOf the 8-bit data indicating the pixel value at the pixel position, the upper m-1 bits are used., So as to overlap each other within a predetermined pixel value range including the pixel value indicated by the boundary part of code AA feature amount code B and a feature amount code C of the pixel of interest in the second frame, each of which includes the coded code B or code C and a quantization code of a predetermined pixel group including the pixel of interest in the second frame, respectively. For each extracted feature quantity code B and feature quantity code C, the process of storing the corresponding pixel position information in the database is controlled, and the extracted pixel position information stored in the database is extracted. The computer is caused to execute a process including a step of searching for pixel position information of the feature amount code B and the feature amount code C whose values match the feature amount code A of the target pixel of the first frame.
[0038]
  The learning device of the present invention learns pixel position information, within a predetermined pixel group including the target pixel of a first learning frame, that is used to extract a feature amount code A of the target pixel of a first frame and a feature amount code B and a feature amount code C of the target pixel of a second frame. The feature amount code A consists of a code A and the quantization code of a predetermined pixel group including the target pixel of the input first frame. The code A is encoded using the upper m bits of the 8-bit data indicating the pixel value at a first pixel position, which is the pixel position with the smallest level fluctuation associated with the quantization code of the first learning frame whose value matches the n-bit quantization code generated by quantizing each pixel value of that pixel group by ADRC. The feature amount codes B and C are searched for in correspondence with the feature amount code A and belong to the second frame, which is a frame temporally prior to the first frame. Each consists of a code B or a code C and the quantization code of a predetermined pixel group including the target pixel of the second frame. The codes B and C are encoded using the upper m−1 bits of the 8-bit data indicating the pixel value at a second pixel position, obtained in the same way as the pixel position with the smallest level fluctuation, so that they overlap each other within a predetermined pixel value range including the pixel value at a boundary of code A.
  The learning device comprises: quantization code generating means for quantizing each pixel value of a predetermined pixel group including the target pixel in the first learning frame by ADRC to generate an n-bit quantization code; detecting means for detecting a fluctuation value of the level of the pixels at corresponding pixel positions in the first learning frame and in a second learning frame that is temporally prior to the first learning frame; storage means for accumulating, in association with the quantization code generated by the quantization code generating means, the fluctuation value of the pixel level at each pixel position of the pixel group detected by the detecting means; and pixel position information generating means for generating, based on the information accumulated by the storage means, pixel position information that associates the quantization code with the pixel position having the smallest pixel value level fluctuation.
[0039]
  The learning method of the present invention is a method for a learning device that learns pixel position information, within a predetermined pixel group including the target pixel of a first learning frame, that is used to extract a feature amount code A of the target pixel of a first frame and a feature amount code B and a feature amount code C of the target pixel of a second frame. The feature amount code A consists of a code A and the quantization code of a predetermined pixel group including the target pixel of the input first frame. The code A is encoded using the upper m bits of the 8-bit data indicating the pixel value at a first pixel position, which is the pixel position with the smallest level fluctuation associated with the quantization code of the first learning frame whose value matches the n-bit quantization code generated by quantizing each pixel value of that pixel group by ADRC. The feature amount codes B and C are searched for in correspondence with the feature amount code A and belong to the second frame, which is a frame temporally prior to the first frame. Each consists of a code B or a code C and the quantization code of a predetermined pixel group including the target pixel of the second frame. The codes B and C are encoded using the upper m−1 bits of the 8-bit data indicating the pixel value at a second pixel position, obtained in the same way as the pixel position with the smallest level fluctuation, so that they overlap each other within a predetermined pixel value range including the pixel value at a boundary of code A.
  The learning method includes the steps of: quantizing each pixel value of a predetermined pixel group including the target pixel in the first learning frame by ADRC to generate an n-bit quantization code; detecting a fluctuation value of the level of the pixels at corresponding pixel positions in the first learning frame and in a second learning frame that is temporally prior to the first learning frame; accumulating, in association with the generated quantization code, the detected fluctuation value of the pixel level at each pixel position of the pixel group; and generating, based on the accumulated information, pixel position information that associates the quantization code with the pixel position having the smallest pixel value level fluctuation.
[0040]
  The program recorded on the second recording medium of the present invention is a program for a learning device that learns pixel position information, within a predetermined pixel group including the target pixel of a first learning frame, that is used to extract a feature amount code A of the target pixel of a first frame and a feature amount code B and a feature amount code C of the target pixel of a second frame. The feature amount code A consists of a code A and the quantization code of a predetermined pixel group including the target pixel of the input first frame. The code A is encoded using the upper m bits of the 8-bit data indicating the pixel value at a first pixel position, which is the pixel position with the smallest level fluctuation associated with the quantization code of the first learning frame whose value matches the n-bit quantization code generated by quantizing each pixel value of that pixel group by ADRC. The feature amount codes B and C are searched for in correspondence with the feature amount code A and belong to the second frame, which is a frame temporally prior to the first frame. Each consists of a code B or a code C and the quantization code of a predetermined pixel group including the target pixel of the second frame. The codes B and C are encoded using the upper m−1 bits of the 8-bit data indicating the pixel value at a second pixel position, obtained in the same way as the pixel position with the smallest level fluctuation, so that they overlap each other within a predetermined pixel value range including the pixel value at a boundary of code A.
  The program causes processing to be executed that includes the steps of: quantizing each pixel value of a predetermined pixel group including the target pixel in the first learning frame by ADRC to generate an n-bit quantization code; detecting a fluctuation value of the level of the pixels at corresponding pixel positions in the first learning frame and in a second learning frame that is temporally prior to the first learning frame; accumulating, in association with the generated quantization code, the detected fluctuation value of the pixel level at each pixel position of the pixel group; and generating, based on the accumulated information, pixel position information that associates the quantization code with the pixel position having the smallest pixel value level fluctuation.
[0041]
  The second program of the present invention is a program for a learning device that learns pixel position information, within a predetermined pixel group including the target pixel of a first learning frame, that is used to extract a feature amount code A of the target pixel of a first frame and a feature amount code B and a feature amount code C of the target pixel of a second frame. The feature amount code A consists of a code A and the quantization code of a predetermined pixel group including the target pixel of the input first frame. The code A is encoded using the upper m bits of the 8-bit data indicating the pixel value at a first pixel position, which is the pixel position with the smallest level fluctuation associated with the quantization code of the first learning frame whose value matches the n-bit quantization code generated by quantizing each pixel value of that pixel group by ADRC. The feature amount codes B and C are searched for in correspondence with the feature amount code A and belong to the second frame, which is a frame temporally prior to the first frame. Each consists of a code B or a code C and the quantization code of a predetermined pixel group including the target pixel of the second frame. The codes B and C are encoded using the upper m−1 bits of the 8-bit data indicating the pixel value at a second pixel position, obtained in the same way as the pixel position with the smallest level fluctuation, so that they overlap each other within a predetermined pixel value range including the pixel value at a boundary of code A.
  The program causes a computer to execute processing that includes the steps of: quantizing each pixel value of a predetermined pixel group including the target pixel in the first learning frame by ADRC to generate an n-bit quantization code; detecting a fluctuation value of the level of the pixels at corresponding pixel positions in the first learning frame and in a second learning frame that is temporally prior to the first learning frame; accumulating, in association with the generated quantization code, the detected fluctuation value of the pixel level at each pixel position of the pixel group; and generating, based on the accumulated information, pixel position information that associates the quantization code with the pixel position having the smallest pixel value level fluctuation.
[0042]
  In the image processing apparatus and method and the first program of the present invention, a pixel position table is used. The table stores, in association with an n-bit quantization code generated by quantizing each pixel value of a predetermined pixel group including the target pixel of a first learning frame by ADRC, information on the pixel position in that pixel group at which the level fluctuation of the pixel value detected between the first learning frame and a second learning frame that is temporally prior to the first learning frame is smallest. From this table, a first pixel position is read: the pixel position with the smallest level fluctuation associated with the quantization code of the first learning frame whose value matches the n-bit quantization code generated by quantizing each pixel value of a predetermined pixel group including the target pixel of the input first frame by ADRC. The feature amount code A of the target pixel of the first frame is then extracted. It consists of a code A, encoded using the upper m bits of the 8-bit data indicating the pixel value at the first pixel position read from the pixel position table, and the quantization code of the predetermined pixel group including the target pixel of the first frame.
  Likewise, a second pixel position is read from the pixel position table: the pixel position with the smallest level fluctuation associated with the quantization code of the first learning frame whose value matches the n-bit quantization code generated by quantizing, by ADRC, each pixel value of a predetermined pixel group including the target pixel of the second frame, which is a frame temporally prior to the first frame. The feature amount code B and the feature amount code C of the target pixel of the second frame are extracted. Each consists of a code B or a code C, encoded using the upper m−1 bits of the 8-bit data indicating the pixel value at the second pixel position so that the two codes overlap each other within a predetermined pixel value range including the pixel value at a boundary of code A, and the quantization code of the predetermined pixel group including the target pixel of the second frame. Then, for each extracted feature amount code B and feature amount code C, the corresponding pixel position information is stored in the database, and the pixel position information stored in the database is searched for that of the feature amount code B and the feature amount code C whose values match the feature amount code A of the target pixel of the first frame.
[0043]
  In the learning apparatus and method and the second program of the present invention, each pixel value of a predetermined pixel group including the target pixel in the first learning frame is quantized by ADRC to generate an n-bit quantization code. A fluctuation value of the level of the pixels at corresponding pixel positions in the first learning frame and in the second learning frame, which is temporally prior to the first learning frame, is detected. The fluctuation value of the pixel level at each pixel position of the pixel group is accumulated in association with the quantization code, and, based on the accumulated information, pixel position information is generated that associates the quantization code with the pixel position having the smallest pixel value level fluctuation.
[0044]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0045]
FIG. 4 is a block diagram showing a configuration of the motion detection unit 51 of the image processing apparatus to which the present invention is applied.
[0046]
The motion detection unit 51 includes frame memories 61 and 62, feature amount extraction units 63 and 64, a database control unit 65, a database search unit 66, and a motion vector determination unit 67.
[0047]
A frame memory 61, which is a first frame memory, stores information for one screen of an image signal input from the input terminal Tin, and supplies the information to the frame memory 62 and the feature amount extraction unit 63. The image signal input from the input terminal Tin is information obtained by quantizing the pixel value of each pixel into a predetermined number of bits, and may be encoded by, for example, PCM encoding. Here, description will be made assuming that the pixel value is quantized to 8 bits.
[0048]
The feature amount extraction unit 63 extracts the feature amount of the target pixel based on the screen information supplied from the frame memory 61, that is, the information of the current frame Fc. Examples of the feature amount include the pixel value of the target pixel, the pixel value of a predetermined pixel near the target pixel, a plurality of pixel values of a block in a predetermined pixel range centered on the target pixel or the average of these pixel values, an ADRC (Adaptive Dynamic Range Coding) code calculated from the pixel values of a block in a predetermined pixel range centered on the target pixel, the dynamic range of the pixel value distribution in the block, and the minimum value of the pixel value distribution in the block.
[0049]
Here, it is assumed that the feature amount extraction unit 63 extracts a pixel value of a predetermined pixel near the target pixel and an ADRC code as a feature amount.
[0050]
FIG. 5 is a block diagram illustrating a configuration of the feature amount extraction unit 63.
[0051]
The class tap extraction unit 91 extracts, from the input image information, the information (pixel values) of the peripheral pixels (class taps) corresponding to the target pixel that are needed for feature amount extraction, and outputs it to the DR (dynamic range) calculation unit 92 and the ADRC (Adaptive Dynamic Range Coding) code generation unit 93. The class tap extraction unit 91 includes a tap table 91a that stores class tap patterns, and determines the class taps to extract based on a class tap pattern stored in the tap table 91a.
[0052]
That is, the tap table 91a stores class tap patterns such as the following. As shown in FIG. 6, a pattern of 8 pixels (the pixels filled with diagonal lines in the figure) arranged at the outermost periphery of a block of 3 pixels × 3 pixels centered on the target pixel (the pixel shown as a black circle in the figure), or a pattern of the 9 pixels of the 3 pixels × 3 pixels block including the target pixel. As shown in FIG. 7, a pattern of 16 pixels (filled with diagonal lines) arranged at the outermost periphery of a block of 5 pixels × 5 pixels centered on the target pixel (shown as a black circle), or a pattern of the 25 pixels of the 5 pixels × 5 pixels block including the target pixel. As shown in FIG. 8, a pattern of 24 pixels (filled with diagonal lines) arranged at the outermost periphery of a block of 7 pixels × 7 pixels centered on the target pixel (shown as a black circle). Further, although not shown, a pattern of 4 (n−1) pixels arranged at the outermost periphery of a block of n pixels × n pixels centered on the target pixel.
[0053]
Further, the class tap pattern may be 4 pixels consisting of one pixel each above, below, left, and right of the target pixel as shown in FIG. 9, 8 pixels consisting of two pixels each above, below, left, and right of the target pixel as shown in FIG. 10, or 12 pixels consisting of three pixels each above, below, left, and right of the target pixel as shown in FIG. 11. More generally, it may be 4m pixels consisting of m pixels each above, below, left, and right of the target pixel. Furthermore, the class tap may have a configuration other than those illustrated, for example, an arrangement that is asymmetrical with respect to the target pixel.
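The tap patterns described above can be expressed as lists of offsets from the target pixel. The following is only an illustrative sketch: the function names and the (dx, dy) offset convention are assumptions and do not appear in the specification.

```python
def perimeter_taps(n):
    """Offsets on the outermost periphery of an n x n block (n odd, n >= 3)
    centered on the target pixel: 4 * (n - 1) taps."""
    r = n // 2
    return [(dx, dy)
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)
            if max(abs(dx), abs(dy)) == r]


def block_taps(n):
    """Offsets of every pixel of the n x n block, target pixel included."""
    r = n // 2
    return [(dx, dy) for dy in range(-r, r + 1) for dx in range(-r, r + 1)]


def cross_taps(m):
    """m pixels each above, below, left, and right of the target pixel: 4m taps."""
    taps = []
    for k in range(1, m + 1):
        taps += [(0, -k), (0, k), (-k, 0), (k, 0)]
    return taps
```

With these definitions, `perimeter_taps(3)`, `perimeter_taps(5)`, and `perimeter_taps(7)` give the 8-, 16-, and 24-pixel patterns, `block_taps(3)` gives the 9-pixel pattern used in the following description, and `cross_taps(1)` gives the 4-pixel pattern.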
[0054]
Here, as shown in FIG. 6, the class tap extraction unit 91 extracts nine pixels of a block made up of 3 pixels × 3 pixels including the target pixel as a class tap pattern.
[0055]
The DR (dynamic range) calculation unit 92 obtains the dynamic range from the class tap information (pixel values) input from the class tap extraction unit 91 and outputs it to the ADRC code generation unit 93, together with the minimum value information obtained in the process. For example, if the class tap pattern is the one shown in FIG. 9 and the information (pixel value levels) of the class taps C1 to C4 is (C1, C2, C3, C4) = (60, 90, 51, 100), the relationship is as shown in FIG. 12. In this case, the dynamic range is defined from the difference between the maximum value and the minimum value of the pixel value level, by the following equation (2).
[0056]
DR = Max−Min + 1 (2)
[0057]
Here, Max is the maximum pixel value level of the class taps, and Min is the minimum pixel value level of the class taps. The 1 is added so that DR counts classes: for example, when classes 0 and 1 are set, their difference is 1 but there are two classes, so 1 is added to the difference. In the case of FIG. 12, with the values given above, the pixel value level 100 of the class tap C4 is the maximum value and the pixel value level 51 of the class tap C3 is the minimum value, so DR is 50 (= 100 − 51 + 1).
[0058]
As described above, the DR calculation unit 92 detects the minimum value and the maximum value of the pixel value levels of the class taps when calculating the dynamic range, so it also outputs the minimum value (or the maximum value) to the ADRC code generation unit 93.
[0059]
The ADRC code generation unit 93 generates and outputs a quantization code consisting of an ADRC code, from the pixel value levels of the class taps and from the dynamic range and the minimum value Min input from the DR calculation unit 92. More specifically, the ADRC code is obtained by substituting each pixel value level of the class taps into the following equation (3).
[0060]
Q = Round ((L−Min + 0.5) × (2 ^ n) / DR) (3)
[0061]
Here, Round denotes truncation (rounding down), L is the pixel value level, n is the number of assigned bits, and (2 ^ n) is 2 to the nth power.
[0062]
Therefore, for example, when the number of assigned bits n is 1, each class tap is coded as 1 if its pixel value level is greater than or equal to the threshold th given by the following equation (4), and as 0 if it is smaller than the threshold th.
[0063]
th = DR / 2−0.5 + Min (4)
[0064]
As a result, when the number of assigned bits n is 1 and the class taps shown in FIG. 12 are obtained, the threshold th is 75.5 (= 50 / 2 − 0.5 + 51), so the ADRC code is ADRC (C1, C2, C3, C4) = 0101.
[0065]
That is, when the class tap extraction unit 91 extracts the 3 pixels × 3 pixels centered on the target pixel shown in FIG. 6 as the class tap pattern, the ADRC code generation unit 93 generates a 9-bit ADRC code.
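The calculation of equations (2) and (3) can be checked with a short sketch. The function name `adrc_code` and the packing order (the first tap becomes the most significant bit) are assumptions made for illustration; the specification only gives the per-tap formula.

```python
import math


def adrc_code(taps, n_bits=1):
    """Quantize class tap levels following equations (2) and (3).

    Returns (DR, per-tap quantized values, packed code). The packing order
    (first tap in the most significant position) is an assumption.
    """
    lo, hi = min(taps), max(taps)
    dr = hi - lo + 1                                  # equation (2)
    levels = [math.floor((v - lo + 0.5) * (2 ** n_bits) / dr)
              for v in taps]                          # equation (3), truncated
    code = 0
    for q in levels:
        code = (code << n_bits) | q
    return dr, levels, code


# Worked example from the text: (C1, C2, C3, C4) = (60, 90, 51, 100)
dr, levels, code = adrc_code([60, 90, 51, 100])
threshold = dr / 2 - 0.5 + 51                         # equation (4) with Min = 51
```

For the example values this gives DR = 50, a threshold of 75.5, and the per-tap bits 0, 1, 0, 1, which agrees with the ADRC code 0101 stated above.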
[0066]
The ADRC code generation unit 93 supplies the generated ADRC code to the feature amount generation unit 96.
[0067]
The pixel code generation unit 94 refers to the table stored in the pixel position table 95, and generates a pixel code based on the pixel value at a pixel position where the level fluctuation due to noise or the like is small.
[0068]
The corresponding pixels in the reference frame Fr and the current frame Fc should originally have the same characteristics, but the characteristics may slightly change due to the influence of noise or the like, for example. The relationship between the difference level of the pixel value of the pixel that should correspond in the reference frame and the current frame and the pixel position in the class tap has regularity depending on the pattern of the ADRC code. In other words, a tap position (pixel position) of a pixel having a small (or large) pixel value difference level is determined in accordance with the ADRC code.
[0069]
For example, suppose that the 3 × 3 pixels centered on the target pixel shown in FIG. 6 are used as the class tap pattern, and that the difference level of the pixel values of corresponding pixels is examined over a plurality of reference frames and current frames (for example, about 25 patterns). As shown in FIG. 13, the difference level of the pixel values of corresponding pixels then has the regularity indicated by the solid line for a first ADRC code pattern, and the regularity indicated by the dotted line for a second ADRC code pattern. In the first ADRC code pattern, the difference level of the pixel values of corresponding pixels fluctuates little at the pixel position indicated by tap position 1. In the second ADRC code pattern, it fluctuates little at the pixel position indicated by tap position 3.
[0070]
Accordingly, the pixel position with the smallest difference level is learned and stored in advance for each ADRC code. When a feature amount is extracted, the ADRC code is used to look up the pixel position in the class tap with the smallest pixel value difference level, and the feature amount is extracted from the pixel value at that pixel position. This reduces the influence of level fluctuation due to noise on motion vector detection.
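As a rough illustration of this idea, the sketch below accumulates, for each ADRC code, the absolute level difference at every tap position and keeps the position with the smallest total. It is a simplification under stated assumptions: the function name, the sample format, the use of summed absolute differences as the fluctuation measure, and 0-based tap indices are all hypothetical choices, not the learning procedure of the specification, which is described later.

```python
def learn_pixel_position_table(samples):
    """Build a mapping {adrc_code: tap index with the smallest fluctuation}.

    samples: iterable of (adrc_code, current_taps, reference_taps), where the
    two tap lists hold the levels of corresponding pixels in the two learning
    frames. Fluctuation is measured here as summed absolute difference.
    """
    totals = {}  # adrc_code -> accumulated |difference| per tap position
    for code, cur_taps, ref_taps in samples:
        acc = totals.setdefault(code, [0] * len(cur_taps))
        for i, (c, r) in enumerate(zip(cur_taps, ref_taps)):
            acc[i] += abs(c - r)
    # Ties are resolved in favor of the lowest tap index.
    return {code: min(range(len(acc)), key=acc.__getitem__)
            for code, acc in totals.items()}


samples = [
    (0b0101, [60, 90, 51, 100], [62, 90, 55, 97]),
    (0b0101, [61, 91, 50, 99], [60, 93, 54, 99]),
]
table = learn_pixel_position_table(samples)
```

In this toy data the accumulated differences for code 0101 are 3, 2, 8, and 3, so the table selects tap index 1 for that code.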
[0071]
When the class tap extraction unit 91 extracts the 3 pixels × 3 pixels centered on the target pixel shown in FIG. 6 as the class tap pattern, the pixel code generation unit 94 is supplied with those 3 pixels × 3 pixels and with the ADRC code generated by the ADRC code generation unit 93. As shown in FIG. 14, the pixel position table 95 stores, in association with each ADRC code, the pixel position found by learning to have the lowest difference level. The pixel code generation unit 94 refers to the pixel position table 95, selects the pixel position that corresponds to the ADRC code and is expected to have the lowest difference level, and calculates a pixel code (code A) from the pixel value at that pixel position.
[0072]
The pixel code generation unit 94 acquires 8-bit data as the pixel value of the selected pixel, and executes encoding using the upper 4 bits. That is, the pixel values 0 to 2^8 − 1 represented by 8 bits are expressed by 2^4 kinds of codes. Specifically, in the feature amount extraction unit 63, a code A (codeA) is calculated by the following equation (5).
[0073]
codeA = [p (x, y) / 16] (5)
[0074]
Here, p (x, y) is the pixel value of the pixel of interest, and the value inside [ ] is rounded down. The pixel code generation unit 94 supplies the calculated code A to the feature amount generation unit 96 together with the corresponding pixel position (for example, coordinate information).
[0075]
The feature amount generation unit 96 uses the code A, which is a 4-bit pixel code supplied from the pixel code generation unit 94, and the 9-bit ADRC code supplied from the ADRC code generation unit 93 to generate a feature amount code A, which is a 13-bit feature amount code, and supplies it to the database search unit 66 together with the corresponding pixel position (for example, coordinate information).
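The construction of the 13-bit feature amount code A can be sketched as follows. The bit layout (code A in the upper 4 bits, the ADRC code in the lower 9 bits), the function name, the dictionary form of the pixel position table, and the 0-based tap index are assumptions for illustration; the text only states that the two codes together form a 13-bit code. The sketch applies equation (5) to the pixel at the position selected from the table.

```python
def feature_code_a(taps, adrc, position_table, adrc_bits=9):
    """Combine code A and the ADRC code into one feature amount code.

    taps: 8-bit levels of the class taps (9 values for a 3 x 3 block)
    adrc: ADRC code of those taps
    position_table: {adrc code: tap index with the smallest fluctuation}
    Returns (feature code, selected tap index).
    """
    pos = position_table[adrc]
    code_a = taps[pos] // 16                 # equation (5): upper 4 bits
    return (code_a << adrc_bits) | adrc, pos


taps = [10, 200, 30, 40, 123, 60, 70, 80, 90]
adrc = 0b010010000
feature, pos = feature_code_a(taps, adrc, {adrc: 4})
```

Here the selected pixel value is 123, so code A is 7 and the resulting feature code is 7 × 512 + 144 = 3728, which fits in 13 bits.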
[0076]
The generation of the pixel position table stored in the pixel position table 95 will be described later.
[0077]
Returning to FIG. 4, the description of the motion detection unit 51 of the image processing apparatus to which the present invention is applied continues.
[0078]
The frame memory 62, which is the second frame memory, receives the earlier screen information (for example, one screen earlier) stored in the frame memory 61, stores it as the information of the reference frame Fr, and sequentially supplies it to the feature amount extraction unit 64.
[0079]
The corresponding pixels in the reference frame Fr and the current frame Fc should originally have the same characteristics, but the characteristics may change slightly due to the influence of noise or the like. Suppose that the feature amount is calculated in both the reference frame Fr and the current frame Fc using the code A of equation (5). If, for example, a pixel whose value is 15 in the reference frame Fr has the value 16 in the current frame Fc, the calculation result is code A = 0 in the reference frame Fr and code A = 1 in the current frame Fc, and the same feature amount is not obtained. Similarly, when a pixel value is close to the threshold th used in calculating the ADRC code, bit inversion may occur due to the influence of noise or the like. In such cases, the motion vector cannot be detected correctly.
[0080]
Therefore, the feature amount extraction unit 64 extracts the feature amount of the target pixel using a criterion different from that of the feature amount extraction unit 63 based on the information of the reference frame Fr supplied from the frame memory 62.
[0081]
FIG. 15 is a block diagram illustrating a configuration of the feature amount extraction unit 64.
[0082]
Parts corresponding to those in FIG. 5 are denoted by the same reference numerals, and their description is omitted as appropriate.
[0083]
The feature amount extraction unit 64 of FIG. 15 includes a pixel code generation unit 102 instead of the pixel code generation unit 94, and further includes a code conversion unit 101 that converts the ADRC code generated by the ADRC code generation unit 93. Except for these points, its configuration is basically the same as that of the feature amount extraction unit 63 of FIG. 5.
[0084]
The feature amounts of the corresponding pixels of the reference frame Fr and the current frame Fc should be the same, but the feature amount may change due to the influence of noise or the like. When an ADRC code is used as the feature amount, for example, a bit corresponding to a pixel whose value is close to the threshold th may be inverted by such a change.
[0085]
The code conversion unit 101 generates a code in which, among the bits of the supplied ADRC code, the bit corresponding to a predetermined pixel position with a high possibility of bit inversion is inverted, and supplies it, together with the input ADRC code (the code before bit inversion), to the pixel code generation unit 102 and the feature amount generation unit 96.
[0086]
The pixel code generation unit 102 generates pixel codes using the ADRC code generated by the ADRC code generation unit 93 and the ADRC code converted by the code conversion unit 101, by a method different from that of the pixel code generation unit 94 in FIG. 5. The pixel code generation unit 102 refers to the pixel position table 95 and selects the pixel position that corresponds to the ADRC code or the converted ADRC code and is expected to have the lowest difference level between the pixel values of corresponding pixels in the current frame and the reference frame. From the pixel value at that pixel position, it calculates pixel codes (code B and code C) by a method different from the method used to obtain the code A.
[0087]
The pixel code generation unit 102 acquires 8-bit data as the pixel value of the pixel at the pixel position selected with reference to the pixel position table 95, and executes encoding using the upper 3 bits. That is, the pixel values 0 to 2^8 − 1 represented by 8 bits are expressed by 2^3 kinds of codes, and two such codes are calculated. The total number of code types calculated is therefore 2^3 × 2, which is the same as the number of code types calculated by the feature amount extraction unit 63. Specifically, code B (codeB) and code C (codeC) are calculated by the following equations (6) and (7).
[0088]
codeB = [{p (x, y) +8} / 32] × 2 (6)
codeC = [{p (x, y) −8} / 32] × 2 + 1 (7)
[0089]
Again, the value inside [ ] is rounded down. FIG. 16 shows the relationship between the code A calculated by the feature amount extraction unit 63 and the codes B and C calculated by the feature amount extraction unit 64.
[0090]
Code A is encoded using the upper 4 bits, whereas code B and code C are encoded using the upper 3 bits. The pixel value range covered by a single code is therefore 16 levels for code A and 32 levels, twice as many, for code B and code C. Compared with the boundaries of code A, the code boundaries of code B are shifted to the minus side by 3 bits (8 levels in pixel value), and the code boundaries of code C are shifted to the plus side by 3 bits (8 levels in pixel value).
[0091]
When the pixel feature amount changes between the reference frame Fr and the current frame Fc due to the influence of noise or the like, it is the boundary portion of the code that is most likely to cause a false detection. Therefore, the code B and the code C of the reference frame Fr are set so as to overlap in the same pixel value range, including the boundary portion of the code A, so that this portion is not erroneously detected.
[0092]
As shown in FIG. 16, the code B and the code C generated by the above equations (6) and (7) each span the pixel values at a boundary between code A = X (X is 0 to 14) and code A = X + 1. At the pixel values on either side of such a boundary, one of code B and code C equals X and the other equals X + 1.
[0093]
Specifically, for example, when the pixel value of the target pixel is 63, code A = 3 in the current frame Fc, code B = 4, and code C = 3 in the reference frame Fr. When the pixel value of the target pixel is 64, code A = 4 in the current frame Fc, code B = 4, and code C = 3 in the reference frame Fr.
[0094]
Therefore, even when the pixel value changes between 63 and 64 between the reference frame Fr and the current frame Fc, the same code as the code A is obtained with either the code B or the code C. When the code B and the code C are generated by the above equations (6) and (7), each code B or code C extends 3 bits (8 levels in pixel value) beyond the corresponding boundaries of the code A. Therefore, if the variation in the feature amount of the pixel between the reference frame Fr and the current frame Fc is within 3 bits (8 levels in pixel value), the correct detection result is obtained with either the code B or the code C.
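The tolerance described here can be checked numerically from equations (5) to (7). The following sketch uses hypothetical function names and Python floor division for the rounding down inside [ ].

```python
def code_a(p):
    return p // 16                      # equation (5)


def code_b(p):
    return ((p + 8) // 32) * 2          # equation (6)


def code_c(p):
    return ((p - 8) // 32) * 2 + 1      # equation (7)


def matches(p_cur, p_ref):
    """True if code A of the current-frame pixel equals code B or code C
    of the reference-frame pixel."""
    return code_a(p_cur) in (code_b(p_ref), code_c(p_ref))


# Exhaustive check over all 8-bit pairs that differ by at most 8 levels.
all_within_8_match = all(
    matches(cur, ref)
    for cur in range(256)
    for ref in range(max(0, cur - 8), min(255, cur + 8) + 1)
)
```

For pixel values 63 and 64 this reproduces the codes given in the text (code A = 3 and 4, code B = 4, code C = 3), and every pair of 8-bit values differing by at most 8 levels matches. A difference of 9 levels can fail, for example a current value of 32 against a reference value of 23. At the extremes this sketch yields values outside 0 to 15 (code C = −1 for p below 8, code B = 16 for p of 248 or more); how the apparatus handles those values is not stated in this passage.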
[0095]
Here, the number of bits of information used to detect code A has been described as the upper 4 bits, and the number of bits of information used to detect code B and code C has been described as the upper 3 bits. The detection method is not limited to this. For example, the number of bits of information used to detect code A may be the upper 5 bits, and the number of bits of information used to detect code B and code C may be the upper 4 bits.
[0096]
The pixel code generation unit 102 supplies the calculated code B and code C to the feature amount generation unit 96. The feature amount generation unit 96 generates a feature amount code B, which is a 13-bit feature amount code, from the code B, a 4-bit pixel code supplied from the pixel code generation unit 102, and the 9-bit ADRC code supplied from the code conversion unit 101. It likewise generates a feature amount code C, which is a 13-bit feature amount code, from the code C, a 4-bit pixel code supplied from the pixel code generation unit 102, and the 9-bit ADRC code supplied from the code conversion unit 101, and supplies both feature amount codes to the database control unit 65.
[0097]
Returning to FIG. 4, the description of the motion detection unit 51 of the image processing apparatus to which the present invention is applied will be continued.
[0098]
Based on the feature amount information of the reference frame Fr supplied from the feature amount extraction unit 64, the database control unit 65 generates reference frame information by storing pixel position information in the database 71 using the feature amount as an address. The database control unit 65 has a counter for counting the number of processed pixels.
[0099]
The configuration of the reference frame information stored in the database 71 is shown in FIG.
[0100]
The database 71 includes (a + 1) × (b + 1) cells indicated by feature amount addresses 0 to a and flag addresses 0 to b.
[0101]
The database control unit 65 associates feature amount code B and feature amount code C, which are the feature amounts of a pixel of the reference frame Fr, with feature amount addresses, and sequentially stores the position information of the pixel in flag addresses 1 to b of the corresponding feature amount addresses of the database 71. Flag address 0 holds the number of pieces of position information currently stored at that feature amount address, and is incremented each time position information is stored. Specifically, when one piece of position information is stored at feature amount address 1, 1 is stored in the cell (1, 0) as the number of pieces of position information stored. When the feature amount of the next pixel of interest corresponds to feature amount address 1, the value stored in the cell (1, 0) is incremented to 2, and the position information of the pixel of interest is stored in the cell (1, 2).
[0102]
Thus, for each target pixel of the reference frame Fr, the pixel position information is stored twice, once for feature amount code B and once for feature amount code C, each in one of flag addresses 1 to b of the corresponding feature amount address. When the storage process for one frame is completed, the cells (0, 1) to (a, b) of the database 71 hold twice as many pieces of pixel position information as there are pixels in one frame.
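The cell layout described above can be sketched as a small table in which column 0 of each row holds the count and columns 1 to b hold positions. The class and method names are hypothetical, and dropping a position when a row is already full is an assumption, since the description does not say what happens in that case.

```python
from typing import List, Optional, Tuple

Position = Tuple[int, int]


class ReferenceDatabase:
    """(a + 1) x (b + 1) cells: rows are feature amount addresses 0..a,
    columns are flag addresses 0..b. Flag address 0 holds the count."""

    def __init__(self, num_feature_addresses: int, max_candidates: int) -> None:
        self.max_candidates = max_candidates
        self.cells: List[list] = [
            [0] + [None] * max_candidates for _ in range(num_feature_addresses)
        ]

    def clear(self) -> None:
        """Write 0 to flag address 0 and delete positions at flag addresses 1..b."""
        for row in self.cells:
            row[0] = 0
            for flag in range(1, self.max_candidates + 1):
                row[flag] = None

    def register(self, feature_address: int, position: Position) -> bool:
        """Increment the count and store the position at the next flag address."""
        row = self.cells[feature_address]
        count = row[0]
        if count >= self.max_candidates:
            return False  # assumption: positions beyond flag address b are dropped
        row[0] = count + 1
        row[count + 1] = position
        return True

    def candidates(self, feature_address: int) -> List[Optional[Position]]:
        """Return the positions stored at flag addresses 1..count."""
        row = self.cells[feature_address]
        return row[1 : 1 + row[0]]


if __name__ == "__main__":
    db = ReferenceDatabase(num_feature_addresses=1 << 13, max_candidates=8)
    db.register(1, (10, 20))   # cell (1, 0) becomes 1, cell (1, 1) holds the position
    db.register(1, (30, 40))   # cell (1, 0) becomes 2, cell (1, 2) holds the position
    print(db.cells[1][:3])
```

For one reference pixel, `register` would be called twice, once with the address given by feature amount code B and once with the address given by feature amount code C.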
[0103]
The database search unit 66 performs a matching process between feature amount code A, which is the feature amount information of the current frame Fc supplied from the feature amount extraction unit 63, and the reference frame information stored in the database 71 of the database control unit 65. That is, the database search unit 66 receives feature amount code A, which is the feature amount of the target pixel of the current frame Fc, from the feature amount extraction unit 63, refers to the reference frame information in the database 71, detects the pixel position information of the plurality of reference pixel candidates recorded at the feature amount address that matches the feature amount of the target pixel of the current frame Fc, and supplies it to the motion vector determination unit 67 together with the pixel position information of the target pixel of the current frame Fc.
[0104]
The motion vector determination unit 67 calculates the distance between the pixel position of the target pixel of the current frame Fc and the pixel position of each of the plurality of reference pixel candidates. For example, it determines the reference pixel candidate with the smallest calculated distance to be the reference pixel corresponding to the target pixel of the current frame Fc, and detects and outputs the difference coordinates based on the position information of the two pixels. The output difference coordinates are the motion vector (Vx, Vy) of the target pixel.
[0105]
Here, the reference pixel corresponding to the target pixel of the current frame Fc has been described as being determined from among the plurality of reference pixel candidates by selecting the one with the smallest calculated distance, but it goes without saying that other methods may be used.
[0106]
The motion detection process 1 performed by the motion detection unit 51 will be described with reference to FIG. 18.
[0107]
In step S31, the frame memory 61 receives a frame input.
[0108]
In step S32, the frame memory 61 supplies the frame input in step S31 to the frame memory 62, which is the second frame memory. For example, the frame memory 62 supplies the stored (n−1)th frame to the feature amount extraction unit 64 before the nth frame is supplied from the frame memory 61 to the frame memory 62.
[0109]
In step S33, a reference frame information generation process is executed, which will be described later with reference to FIG.
[0110]
In step S34, the feature amount extraction unit 63 executes current frame information generation processing, which will be described later with reference to FIG.
[0111]
In step S35, a matching process, which will be described later with reference to FIG. 30, is executed, and the process ends.
[0112]
Next, reference frame information generation processing executed in step S33 in FIG. 18 will be described with reference to the flowchart in FIG.
[0113]
In step S51, the database control unit 65 initializes the reference frame information registered in the database 71. That is, the database control unit 65 writes 0 in the cells of flag address 0 of all the feature amount addresses, and deletes the position information stored in flag addresses 1 to b.
[0114]
In step S52, the database control unit 65 initializes a counter variable n of a counter that counts pixels in one frame memory to zero.
[0115]
In step S53, the feature amount extraction unit 64 executes a feature amount calculation process to be described later with reference to FIG.
[0116]
In step S54, the database control unit 65 receives the feature amounts of the class tap of the reference frame Fr extracted by the feature amount extraction unit 64, and reads from the database 71 the value K recorded at flag address 0 of the feature amount addresses corresponding to feature amount code B and feature amount code C. As described above, since two feature amounts, feature amount code B and feature amount code C, are determined for one target pixel, the values K of the two corresponding feature amount addresses are read.
[0117]
In step S55, the database control unit 65 increments each value K read in step S54 to K = K + 1, and writes it back to flag address 0 of the corresponding feature amount address in the database 71.
[0118]
In step S56, the database control unit 65 writes the position information of the target pixel of the reference frame Fr supplied from the feature amount extraction unit 64 into the cells indicated by flag address K + 1 of the two corresponding feature amount addresses in the database 71.
[0119]
In step S57, the database control unit 65 increments the counter variable n to n = n + 1.
[0120]
In step S58, the database control unit 65 determines whether or not the counter variable n is the number of pixels of one frame. If it is determined in step S58 that the counter variable n is not the number of pixels in one frame, the process returns to step S53, and the subsequent processes are repeated. If it is determined in step S58 that the counter variable n is the number of pixels in one frame, the process returns to step S34 in FIG.
[0121]
Through such processing, the cells (0, 0) to (a, 0), which are the cells of flag address 0 in the database 71 described above, store the number of candidate pixel positions of the reference frame Fr having the corresponding feature amount, and the cells (0, 1) to (a, b), which are the cells of flag addresses 1 to b in the database 71, store the pixel positions of the reference frame Fr having the corresponding feature amount. In other words, the pixel positions of pixels of the reference frame Fr that are highly likely to have the same feature amount are stored in the database 71, addressed by feature amount.
[0122]
Next, the feature quantity calculation process executed in step S53 in FIG. 19 will be described with reference to the flowchart in FIG.
[0123]
In step S71, ADRC code generation processing 1 described later with reference to FIG. 21 is executed.
[0124]
In step S72, a pixel code generation process 1 described later with reference to FIG. 26 is executed.
[0125]
In step S73, the feature amount generation unit 96 of the feature amount extraction unit 64 generates feature amount code B, which is a 13-bit feature amount code, from code B, which is a 4-bit pixel code supplied from the pixel code generation unit 102, and the 9-bit ADRC code supplied from the code conversion unit 101. It likewise generates feature amount code C, which is a 13-bit feature amount code, from code C, which is a 4-bit pixel code supplied from the pixel code generation unit 102, and the 9-bit ADRC code supplied from the code conversion unit 101, and the process returns to step S54 in FIG.
[0126]
Next, the ADRC code generation process 1 executed in step S71 of FIG. 20 will be described with reference to the flowchart of FIG.
[0127]
In step S91, the class tap extraction unit 91 sets a class code tap having a predetermined size centered on the pixel of interest, and acquires pixel values of a plurality of pixels included in the class code tap. In the following description, as shown in FIG. 22, the size of the class code tap is 3 × 3 pixels, and the description is continued assuming that the pixel values from the upper left pixel to the lower right pixel are P1 to P9, respectively.
[0128]
In step S92, the DR calculation unit 92 determines the maximum value P_MAX and the minimum value P_MIN of the pixel values P1 to P9. In step S93, it calculates the dynamic range DR (= |maximum value P_MAX − minimum value P_MIN + 1|) of the pixel values P1 to P9 and supplies it, together with the minimum value P_MIN, to the ADRC code generation unit 93.
[0129]
In step S94, the ADRC code generation unit 93 determines the threshold Th from the minimum value P_MIN of the pixel values P1 to P9 and the dynamic range DR supplied from the DR calculation unit 92, using the above-described equation (4). The threshold Th is supplied to the code conversion unit 101.
[0130]
In step S95, the code conversion unit 101 determines the pixels to be masked among the pixel values P1 to P9. Here, masking is a process of performing bit inversion so that a motion vector can be detected correctly even when pixel values fluctuate due to noise or the like. The pixels to be masked may be, for example, a predetermined number (for example, two) of pixels whose pixel values are nearest the threshold Th, or pixels whose pixel values fall within a predetermined range of the threshold Th.
[0131]
In step S96, the ADRC code generation unit 93 compares each of the nine pixel values P1 to P9 with the threshold Th, quantizes a pixel to 1 if its value is greater than the threshold Th and to 0 if it is smaller, generates the resulting 9 bits arranged in numerical order as the ADRC code of the pixel of interest, and supplies it to the code conversion unit 101.
[0132]
In step S97, the code conversion unit 101 inverts the bits corresponding to the pixels to be masked in the supplied 9-bit code, and the process returns to step S72 in FIG.
[0133]
For example, suppose that two pixels in the vicinity of the threshold Th are masked and that the pixel values P1 to P9 of the nine pixels included in the class code tap are in the state shown in FIG. In this case, the 9-bit ADRC code 101001111 is generated in step S96. In step S97, the bits corresponding to the pixel value P6 and the pixel value P8, which are closest to the threshold Th, are inverted individually or together, so that the 9-bit ADRC codes 101000111, 101001101, and 101000101 are generated.
[0134]
For example, when the pixel values P1 to P9 of the nine pixels included in the class code tap are in the state shown in FIG. 24, the 9-bit ADRC code 101001101 is generated in step S96. In step S97, the bits corresponding to the pixel value P5 and the pixel value P6, which are closest to the threshold Th, are inverted individually or together, so that the 9-bit ADRC codes 101011101, 101000101, and 101010101 are generated.
[0135]
Further, when masking is performed on all pixels whose pixel values fall within a predetermined range (± Δ) centered on the threshold Th, and, for example, the pixel values P1 to P9 of the nine pixels included in the class code tap are in the state shown in FIG. 25, the 9-bit ADRC code 101001111 is generated in step S96. In step S97, the bits corresponding to the pixel value P2 and the pixel value P6, which fall within the predetermined range (± Δ) centered on the threshold Th, are inverted individually or together, so that the 9-bit ADRC codes 111001111, 101000111, and 111000111 are generated.
[0136]
In this way, for a pixel whose ADRC code bit is highly likely to be inverted when its level changes due to the influence of noise or the like, both the ADRC code obtained when the pixel is quantized to 0 and the ADRC code obtained when it is quantized to 1 are generated, which prevents matching from failing due to the influence of noise or the like. Therefore, the robustness of the class code can be improved.
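The masking of step S97 can be sketched as follows. The sketch takes a 9-bit ADRC code as a string whose leftmost character corresponds to P1, which matches the bit positions in the examples above, and returns the codes obtained by inverting the masked bits individually and together. The function name is hypothetical, and how the taps to be masked are chosen is left to the caller.

```python
from itertools import combinations
from typing import List, Sequence


def masked_variants(adrc_code: str, masked_taps: Sequence[int]) -> List[str]:
    """Return the codes obtained by inverting every non-empty subset of the
    masked taps. Taps are numbered 1..9, with tap 1 (P1) as the leftmost bit."""
    variants: List[str] = []
    for size in range(1, len(masked_taps) + 1):
        for subset in combinations(masked_taps, size):
            bits = list(adrc_code)
            for tap in subset:
                bits[tap - 1] = "0" if bits[tap - 1] == "1" else "1"
            variants.append("".join(bits))
    return variants


if __name__ == "__main__":
    # Masking P6 and P8 of the code 101001111, as in the first example.
    print(masked_variants("101001111", [6, 8]))
    # Masking P2 and P6 of the same code, as in the range-based example.
    print(masked_variants("101001111", [2, 6]))
```

With two masked taps, three variants are produced in addition to the original code, so a reference pixel can be registered under up to four ADRC codes.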
[0137]
The number of pixels constituting the class code tap and the number of bits of the class code are not limited to the above-described example, and are arbitrary.
[0138]
Furthermore, the method for determining the pixel to be masked is not limited to the above, and it goes without saying that the pixel to be masked may be determined by another method.
[0139]
Further, the matching process may be performed using only the detected ADRC code without performing masking.
[0140]
Next, the pixel code generation process 1 executed in step S72 of FIG. 20 will be described with reference to the flowchart of FIG.
[0141]
In step S111, the pixel code generation unit 102 acquires the ADRC code generated by the ADRC code generation unit 93 or the ADRC code converted by the code conversion unit 101 (in which predetermined bits have been inverted).
[0142]
In step S112, the pixel code generation unit 102 refers to the pixel position table stored in the pixel position table 95, and selects a pixel position corresponding to the ADRC code and having a small difference level.
[0143]
For example, when the pixel position table described with reference to FIG. 14 is stored in the pixel position table 95 and the ADRC code acquired in step S111 has the code arrangement indicated by code 3, the tap P4 in FIG. is selected as the pixel position of the pixel used for pixel code generation.
[0144]
In step S113, the pixel code generation unit 102 extracts the upper 3 bits of the pixel value of the pixel corresponding to the selected pixel position.
[0145]
In step S114, the pixel code generation unit 102 calculates code B and code C, and the process returns to step S73 in FIG.
[0146]
Specifically, the processing in step S113 and step S114 is equivalent to the processing for calculating code B and code C using the above-described equations (6) and (7).
[0147]
In this way, the pixel code of the reference frame can be calculated based on the pixel at the pixel position having a small difference level corresponding to the ADRC code, by a method different from the method of calculating the pixel code of the current frame, which will be described later with reference to FIG. 29.
[0148]
Next, the current frame information generation process 1 executed in step S34 in FIG. 18 will be described with reference to the flowchart in FIG.
[0149]
In step S131, the database search unit 66 initializes a counter variable m of a counter that counts pixels in one frame memory to zero.
[0150]
In step S132, ADRC code generation processing 2 described later with reference to FIG. 28 is executed.
[0151]
In step S133, pixel code generation processing 2 described later with reference to FIG. 29 is executed.
[0152]
In step S134, the feature amount generation unit 96 of the feature amount extraction unit 63 generates feature amount code A, which is a 13-bit feature amount code, from code A, which is a 4-bit pixel code supplied from the pixel code generation unit 94, and the 9-bit ADRC code supplied from the ADRC code generation unit 93, and supplies it to the database search unit 66.
[0153]
In step S135, the database search unit 66 increments the counter variable m to m = m + 1.
[0154]
In step S136, the database search unit 66 determines whether or not the counter variable m is equal to the number of pixels in one frame. If it is determined in step S136 that the counter variable m is not the number of pixels in one frame, the process returns to step S132, and the subsequent processes are repeated. If it is determined in step S136 that the counter variable m is the number of pixels in one frame, the process returns to step S35 in FIG.
[0155]
Through such processing, current frame information is calculated.
[0156]
Next, the ADRC code generation process 2 executed in step S132 of FIG. 27 will be described with reference to the flowchart of FIG.
[0157]
In steps S151 through S154, the feature amount extraction unit 63 executes the same processing as in steps S91 through S94 described with reference to FIG. That is, the class tap extraction unit 91 of the feature amount extraction unit 63 acquires the pixel values of the plurality of pixels included in a class code tap of a predetermined size, and the DR calculation unit 92 determines the maximum value P_MAX and the minimum value P_MIN of the pixel values P1 to P9, calculates the dynamic range DR of the pixel values P1 to P9, and supplies it together with the minimum value P_MIN to the ADRC code generation unit 93. Then, the ADRC code generation unit 93 determines the threshold Th using the above-described equation (4).
[0158]
In step S155, the ADRC code generation unit 93 of the feature amount extraction unit 63 compares each of the nine pixel values P1 to P9 with the threshold Th, quantizes a pixel to 1 if its value is greater than the threshold Th and to 0 if it is smaller, generates the resulting 9 bits arranged in numerical order as the ADRC code of the pixel of interest, and supplies it to the feature amount generation unit 96. The process then returns to step S133 in FIG.
[0159]
By such processing, the ADRC code is calculated in the current frame.
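The 1-bit quantization described above can be sketched as follows. Equation (4) is defined earlier in the description and is not reproduced here, so the threshold is passed in as a parameter. The midpoint threshold used in the usage example, the treatment of a pixel value exactly equal to the threshold as 0, the sample pixel values, and the function names are all assumptions made for illustration.

```python
from typing import Sequence


def dynamic_range(pixels: Sequence[int]) -> int:
    """DR = |P_MAX - P_MIN + 1| over the class code tap."""
    return abs(max(pixels) - min(pixels) + 1)


def adrc_code(pixels: Sequence[int], threshold: float) -> str:
    """Quantize each pixel to 1 if it is greater than the threshold, else 0,
    and arrange the bits in tap order P1..P9.
    Assumption: a value equal to the threshold is quantized to 0."""
    return "".join("1" if p > threshold else "0" for p in pixels)


if __name__ == "__main__":
    # Made-up pixel values P1..P9 for a 3 x 3 class code tap.
    tap = [200, 50, 180, 60, 70, 135, 190, 140, 210]
    dr = dynamic_range(tap)
    # Hypothetical stand-in for equation (4): the midpoint of the range.
    th = min(tap) + dr / 2
    print(dr, th, adrc_code(tap, th))
```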
[0160]
Next, the pixel code generation process 2 executed in step S133 in FIG. 27 will be described with reference to the flowchart in FIG.
[0161]
In step S171, the pixel code generation unit 94 acquires the ADRC code generated by the ADRC code generation unit 93.
[0162]
In step S172, the pixel code generation unit 94 refers to the pixel position table stored in the pixel position table 95 and selects a pixel position corresponding to the ADRC code.
[0163]
In step S173, the pixel code generation unit 94 extracts the upper 4 bits of the pixel value of the target pixel of the input current frame Fc.
[0164]
In step S174, the pixel code generation unit 94 calculates code A, and the process returns to step S134 of FIG.
[0165]
Specifically, the processing in step S173 and step S174 is equivalent to the processing for calculating code A using the above-described equation (5).
[0166]
By this processing, a code A that is a pixel code is calculated.
[0167]
Next, the matching process executed in step S35 of FIG. 18 will be described with reference to the flowchart of FIG.
[0168]
In step S191, the database search unit 66 receives the feature amount code A of the target pixel of the current frame Fc from the feature amount extraction unit 63.
[0169]
In step S192, the database search unit 66 detects a feature amount (feature amount address) of the reference frame Fr recorded in the database 71 that is equal to the feature amount code A of the current frame Fc.
[0170]
In step S193, the database search unit 66 reads the pixel position information described in the cells of the flag address 1 to the flag address K of the detected feature amount address, and supplies the pixel position information to the motion vector determination unit 67.
[0171]
In step S194, the motion vector determination unit 67 detects a pixel position closest to the target pixel of the current frame Fc from among the read pixel positions.
[0172]
In step S195, the motion vector determination unit 67 calculates a motion vector based on the pixel position of the target pixel and the detected pixel position, and the process ends.
[0173]
Here, the candidate closest to the pixel of interest of the current frame Fc has been described as being the corresponding pixel of the reference frame Fr, but other methods may be used to detect the motion vector.
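Steps S191 to S195 can be sketched as a lookup followed by a nearest-candidate search. In this sketch a dictionary that maps a feature amount address to a list of pixel positions stands in for the database 71, squared Euclidean distance is used as the distance, and the motion vector is taken as the reference position minus the target position. These three choices and the function name are assumptions; the description fixes none of them.

```python
from typing import Dict, List, Optional, Tuple

Position = Tuple[int, int]


def detect_motion_vector(
    target_pos: Position,
    feature_code_a: int,
    reference_db: Dict[int, List[Position]],
) -> Optional[Tuple[int, int]]:
    """Look up the reference pixel candidates registered under the feature
    amount address equal to feature amount code A, choose the candidate
    nearest to the target pixel, and return the difference coordinates
    (Vx, Vy). Returns None when no candidate is registered."""
    candidates = reference_db.get(feature_code_a, [])
    if not candidates:
        return None
    tx, ty = target_pos
    nearest = min(candidates, key=lambda c: (c[0] - tx) ** 2 + (c[1] - ty) ** 2)
    # Assumed sign convention: reference position minus target position.
    return (nearest[0] - tx, nearest[1] - ty)


if __name__ == "__main__":
    db = {5: [(10, 10), (3, 4), (20, 1)]}
    print(detect_motion_vector((4, 4), 5, db))
    print(detect_motion_vector((4, 4), 6, db))
```

Because the lookup is keyed by feature amount, only pixels that share the feature amount are compared, rather than every block position in a search area.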
[0174]
The above is the description of the motion detection process of FIG. 18 performed by the motion detection unit 51 described with reference to FIG. 4. The processes in steps S31 to S35 of FIG. 18 are partially executed in parallel. An example of the timing at which the (n−1)th input frame (frame n−1), the nth input frame (frame n), and the (n+1)th input frame (frame n+1) are processed will be described with reference to FIG.
[0175]
For example, when frame n is input to the frame memory 61, the feature amount extraction unit 64 calculates feature amount code B and feature amount code C using frame n−1 as the reference frame.
[0176]
Next, when the feature amount extraction unit 63 calculates feature amount code A using frame n as the current frame, frame n is supplied to the frame memory 62, which is the second frame memory, and feature amount code B and feature amount code C of frame n−1 calculated by the feature amount extraction unit 64 are registered in the database 71 by the processing of the database control unit 65.
[0177]
Thereafter, the database search unit 66 matches feature amount code A of frame n, that is, the current frame, calculated by the feature amount extraction unit 63 against feature amount code B and feature amount code C of frame n−1, that is, the reference frame, registered in the database 71, and detects the pixel position having the feature amount corresponding to feature amount code A of the current frame. At this time, the feature amount extraction unit 64 calculates feature amount code B and feature amount code C using frame n as the reference frame, and frame n+1, which is the next current frame, is input to the frame memory 61.
[0178]
Next, when the feature amount extraction unit 63 calculates feature amount code A of frame n+1, which is the current frame, frame n+1 is supplied to the frame memory 62, which is the second frame memory, and feature amount code B and feature amount code C of frame n, that is, the reference frame, calculated by the feature amount extraction unit 64 are registered in the database 71 by the processing of the database control unit 65.
[0179]
Then, the database search unit 66 matches feature amount code A of frame n+1, that is, the current frame, calculated by the feature amount extraction unit 63 against feature amount code B and feature amount code C of frame n, that is, the reference frame, registered in the database 71, and detects the pixel position having the feature amount corresponding to feature amount code A of the current frame. Thereafter, the process is repeated in the same manner.
[0180]
In this way, the processes in steps S31 to S35 are partially executed in parallel.
[0181]
In addition, by providing two databases, the reference frame information generation process, which is the registration process for the database, and the matching process, which is executed using the information registered in the database, can be performed in parallel, thereby speeding up the processing.
[0182]
FIG. 32 is a block diagram illustrating a configuration of the motion detection unit 121 provided with two databases. Note that portions corresponding to those in FIG. 4 are denoted by the same reference numerals, and description thereof will be omitted as appropriate.
[0183]
That is, the motion detection unit 121 in FIG. 32 has basically the same configuration as the motion detection unit 51 described with reference to FIG. 4, except that, instead of the database control unit 65, it includes a database control unit 131 that has two databases, a database 71-1 and a database 71-2, and a database selection processing unit 141, and that the frame memory 62 is omitted.
[0184]
FIG. 33 is a block diagram showing a more detailed configuration of the database control unit of FIG. The database selection processing unit 141 controls the input and output of information so that the feature amount of the current frame Fc supplied from the feature amount extraction unit 64 is registered in one of the database 71-1 and the database 71-2 while the information of the reference frame Fr already registered in the other is read by the database search unit 66.
[0185]
That is, the database control unit 131 includes two databases, a database 71-1 and a database 71-2. For example, while the feature amount of the current frame Fc is being registered in the database 71-2, the information of the reference frame Fr registered in the database 71-1 by the preceding processing is read by the database search unit 66. Then, after the matching process between the reference frame Fr registered in the database 71-1 and the current frame Fc and the process of registering the feature amount of the current frame Fc in the database 71-2 are completed, the next frame becomes the current frame, the feature amount of the new current frame Fc is registered in the database 71-1, and the information registered in the database 71-2 is read by the database search unit 66 as the information of the reference frame Fr.
[0186]
That is, reading of information from the database 71-1 and writing of information to the database 71-2 are performed in parallel, and reading of information from the database 71-2 and writing of information to the database 71-1 are performed in parallel.
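The alternation between the two databases can be sketched as a pair of buffers whose roles are swapped once per frame. The sketch below is single-threaded and only shows the role swap, not actual parallel execution. The class name, the use of dictionaries as databases, and clearing the new write side on each swap are assumptions.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]
Database = Dict[int, List[Position]]


class DatabaseSelector:
    """Two databases: one is written with the frame being registered while
    the other, filled during the previous frame, is read for matching."""

    def __init__(self) -> None:
        self.databases: List[Database] = [{}, {}]
        self.write_index = 0

    @property
    def write_db(self) -> Database:
        return self.databases[self.write_index]

    @property
    def read_db(self) -> Database:
        return self.databases[1 - self.write_index]

    def swap(self) -> None:
        """Exchange the roles at a frame boundary and clear the new write
        side (assumed to correspond to the initialization of step S51)."""
        self.write_index = 1 - self.write_index
        self.write_db.clear()


if __name__ == "__main__":
    selector = DatabaseSelector()
    selector.write_db.setdefault(42, []).append((1, 2))  # register one frame
    selector.swap()                                      # next frame arrives
    print(selector.read_db)   # previously registered frame, now the reference
    print(selector.write_db)  # empty, ready for the next registration
```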
[0187]
The motion detection process 2 performed by the motion detection unit 121 of FIG. 32 will be described with reference to the flowchart of FIG. 34.
[0188]
In step S211, the frame memory 61 receives a frame input. The frame memory 61 supplies the input frame to the feature amount extraction unit 63 and the feature amount extraction unit 64.
[0189]
In step S212, the reference frame information generation process is executed in the same manner as described with reference to FIG.
[0190]
In step S213, the feature amount extraction unit 63 executes the current frame information generation processing described with reference to FIG.
[0191]
In step S214, the matching process described with reference to FIG. 30 is executed, and the process ends.
[0192]
The processes in steps S211 to S214 are partially executed in parallel. An example of the timing at which the (n−1)th input frame (frame n−1), the nth input frame (frame n), and the (n+1)th input frame (frame n+1) are processed will be described with reference to FIG.
[0193]
For example, when frame n is input to the frame memory 61, the feature amount extraction unit 64 calculates feature amount code B and feature amount code C of frame n−1 and supplies them to the database control unit 131.
[0194]
Next, when the feature amount extraction unit 63 calculates feature amount code A of frame n, the feature amount extraction unit 64 calculates feature amount code B and feature amount code C of frame n. In parallel with this process, feature amount code B and feature amount code C of frame n−1 calculated by the feature amount extraction unit 64 are registered in one of the databases 71 (hereinafter referred to as the first database) by the processing of the database control unit 131. At this time, frame n+1 is input to the frame memory 61.
[0195]
Thereafter, the database search unit 66 matches feature amount code A of frame n calculated by the feature amount extraction unit 63 against feature amount code B and feature amount code C of frame n−1 registered in the first database, and obtains information on the pixel position having the feature amount corresponding to feature amount code A. At this time, feature amount code B and feature amount code C of frame n calculated by the feature amount extraction unit 64 are registered in the other database 71 (hereinafter referred to as the second database) by the processing of the database control unit 131.
[0196]
That is, reading of information from the first database and writing of information to the second database are performed in parallel. Also in parallel, the feature amount extraction unit 63 calculates feature amount code A of frame n+1, and the feature amount extraction unit 64 calculates feature amount code B and feature amount code C of frame n+1.
[0197]
Then, the database search unit 66 matches feature amount code A of frame n+1 calculated by the feature amount extraction unit 63 against feature amount code B and feature amount code C of frame n registered in the second database, and obtains information on the pixel position having the feature amount corresponding to feature amount code A. At this time, feature amount code B and feature amount code C of frame n+1 calculated by the feature amount extraction unit 64 are registered in the first database by the processing of the database control unit 131, and the processing is repeated in the same manner thereafter.
[0198]
By adopting such a configuration, it is possible to speed up the processing compared to the case described with reference to FIG.
[0199]
Next, generation of the pixel position table stored in the pixel position table 95 of the feature amount extraction unit 63 and the feature amount extraction unit 64 will be described.
[0200]
FIG. 36 is a block diagram illustrating a configuration of a pixel position table generation device 161 that generates a pixel position table. Note that portions corresponding to those in FIG. 5 are denoted by the same reference numerals, and description thereof will be omitted as appropriate.
[0201]
The ADRC code of the image information A is generated by the processing of the class tap extraction unit 91, the DR calculation unit 92, and the ADRC code generation unit 93, and is supplied to the data storage unit 172. Here, the class tap extraction unit 91 is assumed to extract a class tap of nine pixels, 3 pixels × 3 pixels, as described with reference to FIG. 22.
[0202]
The difference level extraction unit 171 receives the class tap of the image information A and image information B, which is the image of the frame preceding the image information A. For the image information A and the image information B, the corresponding pixel positions before and after the motion are known. The difference level extraction unit 171 detects, for each of the nine pixel positions in the class tap, the fluctuation (difference level) of the pixel level at the corresponding positions before and after the motion, and supplies it to the data storage unit 172.
[0203]
The data storage unit 172 accumulates, for each ADRC code, the difference level for each pixel position supplied from the difference level extraction unit 171. As described with reference to FIG. 13, the relationship between the pixel position and the difference level shows a regularity that depends on the ADRC code. The data storage unit 172 stores, for example, difference level information corresponding to about 25 or more patterns of image information A and image information B.
[0204]
The pixel position table generation unit 173 detects the pixel position having the smallest difference level in each ADRC code from the data stored in the data storage unit 172, and generates the pixel position table described with reference to FIG. .
[0205]
Next, pixel position table generation processing executed by the pixel position table generation device 161 will be described with reference to the flowchart of FIG.
[0206]
In step S231, the ADRC code generation process 2 described with reference to FIG. 28 is executed.
[0207]
In step S232, the difference level extraction unit 171 extracts the difference level of the pixel value at each corresponding pixel position before and after the movement between the image information A and the image information B, which is the image one frame before the image information A, and supplies it to the data storage unit 172.
[0208]
In step S233, the data storage unit 172 stores the difference level information supplied from the difference level extraction unit 171 in correspondence with the ADRC code supplied from the ADRC code generation unit 93. Here, for example, it is assumed that difference level information corresponding to about 25 patterns of image information A and image information B is accumulated.
[0209]
In step S234, the pixel position table generation unit 173 refers to the difference level information stored in the data storage unit 172, and detects the pixel position having the smallest difference level for each ADRC code.
[0210]
In step S235, the pixel position table generation unit 173 generates the pixel position table described with reference to FIG. 14, and the process ends.
[0211]
By such processing, the pixel position table described with reference to FIG. 14 is generated. At the time of motion vector detection, the pixel position table is referred to, and feature amounts are extracted based on the pixel value at the pixel position having a low difference level. Therefore, the influence of level fluctuation due to noise or the like on motion vector detection can be reduced.
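The table generation in steps S231 to S235 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation of the device 161: it assumes 1-bit ADRC with the threshold at the midpoint of the dynamic range, assumes each training sample is already given as a pair of aligned 3 × 3 taps (nine values in raster order) from image information A and B, and uses the hypothetical names `adrc_code` and `build_pixel_position_table`.

```python
from collections import defaultdict

TAP_SIZE = 9  # 3 x 3 class tap, positions 0..8 in raster order (assumed ordering)


def adrc_code(tap):
    """Assumed 1-bit ADRC: one bit per tap pixel, set when the pixel lies in
    the upper half of the tap's dynamic range (DR = max - min)."""
    lo, hi = min(tap), max(tap)
    dr = hi - lo
    code = 0
    for value in tap:
        bit = 1 if dr > 0 and (value - lo) * 2 >= dr else 0
        code = (code << 1) | bit
    return code


def build_pixel_position_table(tap_pairs):
    """Accumulate, per ADRC code, the absolute difference level at each tap
    position, then keep the position with the smallest accumulated level."""
    sums = defaultdict(lambda: [0] * TAP_SIZE)
    for tap_a, tap_b in tap_pairs:
        code = adrc_code(tap_a)  # the code is taken from image information A
        for pos in range(TAP_SIZE):
            sums[code][pos] += abs(tap_a[pos] - tap_b[pos])
    # Ties are resolved in favor of the lowest position index (an assumption).
    return {
        code: min(range(TAP_SIZE), key=levels.__getitem__)
        for code, levels in sums.items()
    }
```

Because every position under a given ADRC code receives the same number of samples, comparing accumulated sums is equivalent to comparing mean difference levels.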
[0212]
The series of processes described above can also be executed by software. In that case, the program constituting the software is installed from a recording medium onto a computer built into dedicated hardware, or onto, for example, a general-purpose personal computer that can execute various functions by installing various programs.
[0213]
As shown in FIG. 38, this recording medium is constituted by package media that are distributed separately from the computer in order to provide the program to the user and on which the program is recorded, such as a magnetic disk 241 (including a flexible disk), an optical disk 242 (including a CD-ROM (Compact Disk-Read Only Memory) and a DVD (Digital Versatile Disk)), a magneto-optical disk 243 (including an MD (Mini-Disk) (trademark)), or a semiconductor memory 244.
[0214]
The personal computer 201 will be described with reference to FIG.
[0215]
A CPU (Central Processing Unit) 221 receives, via the input/output interface 222 and the internal bus 223, signals corresponding to various commands input by the user using the input unit 224 and control signals transmitted from other personal computers via the network interface 230, and executes various processes based on the input signals. A ROM (Read Only Memory) 225 stores basically fixed data among the programs and calculation parameters used by the CPU 221. A RAM (Random Access Memory) 226 stores programs used in the execution of the CPU 221 and parameters that change as appropriate during the execution. The CPU 221, ROM 225, and RAM 226 are connected to each other via the internal bus 223.
[0216]
The internal bus 223 is also connected to the input/output interface 222. The input unit 224 includes, for example, a keyboard, a touch pad, a jog dial, or a mouse, and is operated when the user inputs various commands to the CPU 221. The display unit 227 includes, for example, a CRT (Cathode Ray Tube) or a liquid crystal display device, and displays various types of information as text or images.
[0217]
An HDD (hard disk drive) 228 drives a hard disk and records or reproduces a program executed by the CPU 221 and information. A magnetic disk 241, an optical disk 242, a magneto-optical disk 243, and a semiconductor memory 244 are mounted on the drive 229 as necessary to exchange data.
[0218]
The network interface 230 is connected, by wire using a predetermined cable or wirelessly, to other personal computers or to various devices other than personal computers, and exchanges information with those devices, or accesses a server via the Internet and exchanges information with it.
[0219]
The units from the input unit 224 through the network interface 230 are connected to the CPU 221 via the input/output interface 222 and the internal bus 223.
[0220]
Further, in the present specification, the steps describing the program recorded on the recording medium include not only processing performed in chronological order according to the described order, but also processing that is executed in parallel or individually and not necessarily in chronological order.
[0221]
[Effects of the Invention]
Thus, according to the present invention, the pixel of the reference frame corresponding to the target pixel of the current frame can be detected. In particular, even if the feature amount of the pixel changes between the reference frame and the current frame, it can be detected correctly, so that a correct motion vector can be obtained.
[0222]
According to another aspect of the present invention, it is possible to detect a pixel position having the smallest difference level for each ADRC code and generate a table indicating the pixel positions of pixels used for image processing.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration of a conventional motion detection unit.
FIG. 2 is a diagram for explaining a motion vector detection method.
FIG. 3 is a flowchart for explaining motion detection processing by the motion detection unit in FIG. 1.
FIG. 4 is a block diagram illustrating a configuration of a motion detection unit to which the present invention is applied.
FIG. 5 is a block diagram illustrating a configuration of the feature amount extraction unit 63 in FIG. 4.
FIG. 6 is a diagram illustrating class taps.
FIG. 7 is a diagram illustrating class taps.
FIG. 8 is a diagram illustrating class taps.
FIG. 9 is a diagram illustrating class taps.
FIG. 10 is a diagram illustrating class taps.
FIG. 11 is a diagram illustrating class taps.
FIG. 12 is a diagram illustrating an ADRC code.
FIG. 13 is a diagram illustrating a relationship between a tap position and a difference level for each ADRC code.
FIG. 14 is a diagram illustrating a pixel position table.
FIG. 15 is a block diagram illustrating a configuration of the feature amount extraction unit 64 in FIG. 4.
FIG. 16 is a diagram illustrating code A, code B, and code C calculated as pixel codes.
FIG. 17 is a diagram for explaining the structure of the database in FIG. 4.
FIG. 18 is a flowchart for explaining motion detection processing 1 by the motion detection unit in FIG. 4.
FIG. 19 is a flowchart illustrating reference frame information generation processing.
FIG. 20 is a flowchart illustrating feature amount calculation processing.
FIG. 21 is a flowchart for explaining ADRC code generation processing 1.
FIG. 22 is a diagram illustrating an example of a class code tap.
FIG. 23 is a diagram illustrating pixel positions to be masked.
FIG. 24 is a diagram illustrating pixel positions to be masked.
FIG. 25 is a diagram illustrating pixel positions to be masked.
FIG. 26 is a flowchart for explaining pixel code generation processing 1.
FIG. 27 is a flowchart for explaining current frame information generation processing.
FIG. 28 is a flowchart for explaining ADRC code generation processing 2.
FIG. 29 is a flowchart illustrating pixel code generation processing 2.
FIG. 30 is a flowchart illustrating matching processing.
FIG. 31 is a diagram illustrating the timing of motion detection processing by the motion detection unit in FIG. 4.
FIG. 32 is a block diagram illustrating another configuration of a motion detection unit to which the present invention is applied.
FIG. 33 is a block diagram illustrating a configuration of the database control unit in FIG. 32.
FIG. 34 is a flowchart for explaining motion detection processing 2 by the motion detection unit in FIG. 32.
FIG. 35 is a diagram for explaining the timing of motion detection processing by the motion detection unit in FIG. 32.
FIG. 36 is a block diagram illustrating a configuration example of a pixel position table generation device.
FIG. 37 is a flowchart for explaining pixel position table generation processing.
FIG. 38 is a block diagram illustrating a configuration example of a personal computer.
[Explanation of symbols]
51 motion detection unit, 61, 62 frame memory, 63, 64 feature amount extraction unit, 65 database control unit, 66 database search unit, 67 motion vector determination unit, 71 database, 91 class tap extraction unit, 92 DR calculation unit, 93 ADRC code generation unit, 94 pixel code generation unit, 95 pixel position table, 96 feature amount generation unit, 101 code conversion unit, 102 pixel code generation unit, 121 motion detection unit, 131 database control unit, 141 database selection processing unit, 161 pixel position table generation device, 171 difference level extraction unit, 172 data storage unit, 173 pixel position table generation unit, 201 personal computer

Claims

An image processing apparatus comprising:
a pixel position table that stores, in association with an n-bit quantization code generated by quantizing, by ADRC, each pixel value of a predetermined pixel group including a target pixel of a first learning frame, information on the pixel position, within the predetermined pixel group, at which the level fluctuation of the pixel value detected between the first learning frame and a second learning frame temporally preceding the first learning frame is smallest;
first feature amount extraction means for reading, from the pixel position table, a first pixel position with the smallest level fluctuation that is associated with the quantization code of the first learning frame whose value matches an n-bit quantization code generated by quantizing, by ADRC, each pixel value of the predetermined pixel group including a target pixel of an input first frame, and for extracting a feature amount code A of the target pixel of the first frame, the feature amount code A consisting of a code A, encoded using the upper m bits of the 8-bit data indicating the pixel value at the first pixel position read from the pixel position table, and the quantization code of the predetermined pixel group including the target pixel of the first frame;
second feature amount extraction means for reading, from the pixel position table, a second pixel position with the smallest level fluctuation that is associated with the quantization code of the first learning frame whose value matches an n-bit quantization code generated by quantizing, by ADRC, each pixel value of the predetermined pixel group including a target pixel of a second frame temporally preceding the first frame, and for extracting a feature amount code B and a feature amount code C of the target pixel of the second frame, each consisting of a code B or a code C, encoded using the upper m-1 bits of the 8-bit data indicating the pixel value at the second pixel position read from the pixel position table so as to overlap over a predetermined pixel value range including the pixel value indicated by a boundary portion of the code A, and the quantization code of the predetermined pixel group including the target pixel of the second frame;
a database that stores information on the corresponding pixel position for each of the feature amount code B and the feature amount code C extracted by the second feature amount extraction means; and
search means for searching the pixel position information stored in the database for the pixel position information of the feature amount code B and the feature amount code C whose values match the feature amount code A of the target pixel of the first frame extracted by the first feature amount extraction means.

The image processing apparatus according to claim 1, wherein the second feature amount extraction means
encodes the code B using the upper m-1 bits of the 8-bit data indicating the pixel value at the second pixel position by adding 2^(m-1) to the 8-bit data, dividing the result by 2^(m+1), and multiplying the quotient by 2, and
encodes the code C using the upper m-1 bits of the 8-bit data indicating the pixel value at the second pixel position by subtracting 2^(m-1) from the 8-bit data, dividing the result by 2^(m+1), multiplying the quotient by 2, and then adding 1.
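For illustration only, the arithmetic recited for code B and code C can be sketched as below. The sketch fixes m = 4, an assumed value chosen because, for 8-bit data, it is the case in which taking the upper m bits for code A gives the same bin width (2^m) that the recited offsets of 2^(m-1) and divisor of 2^(m+1) presuppose; the claim itself does not fix m. Integer floor division is likewise an assumption, and the function names are hypothetical.

```python
M = 4           # assumed value of m (not fixed by the claim)
HALF = 2 ** (M - 1)   # 2^(m-1): the offset added or subtracted
WIDE = 2 ** (M + 1)   # 2^(m+1): the divisor, twice the width of a code A bin


def code_a(value):
    """Code A: the upper m bits of an 8-bit pixel value."""
    return value >> (8 - M)


def code_b(value):
    """Code B: add 2^(m-1), divide by 2^(m+1), multiply by 2 (always even)."""
    return 2 * ((value + HALF) // WIDE)


def code_c(value):
    """Code C: subtract 2^(m-1), divide by 2^(m+1), multiply by 2, add 1
    (always odd)."""
    return 2 * ((value - HALF) // WIDE) + 1
```

Under these assumptions a reference pixel of value 100 is registered under code B = 6 and code C = 5, so a current-frame pixel whose code A is either 5 or 6 finds it. More generally, whenever the two pixel values differ by at most 2^(m-1) = 8, code A of one equals code B or code C of the other, which is the overlap across the code A boundaries that the claim describes.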

The image processing apparatus according to claim 1, wherein the second feature amount extraction means
bit-inverts, in the quantization code of the predetermined pixel group including the target pixel of the second frame, only a predetermined code corresponding to a pixel value near a quantization threshold, and reads, from the pixel position table, a third pixel position with the smallest level fluctuation that is associated with the quantization code of the first learning frame whose value matches the quantization code in which only the predetermined code is bit-inverted, and
also extracts a feature amount code B and a feature amount code C, each consisting of a code B or a code C, encoded using the upper m-1 bits of the 8-bit data indicating the pixel value at the third pixel position read from the pixel position table so as to overlap over a predetermined pixel value range including the pixel value indicated by a boundary portion of the code A, and the quantization code in which only the predetermined code is bit-inverted.

The image processing apparatus according to claim 1, wherein a plurality of the databases are provided, and information is stored in them alternately, frame by frame.

The image processing apparatus according to claim 1, further comprising motion vector generation means for detecting, among the pixel position information found by the search means, the pixel position whose distance from the target pixel of the first frame is smallest, and for generating a motion vector based on the detected pixel position and the pixel position information of the target pixel.
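For illustration only, the database, the search, and the minimum-distance selection recited in the apparatus claims might be sketched as follows. The sketch treats feature amount codes as opaque hashable keys computed elsewhere, uses squared Euclidean distance as the distance measure, and defines the vector as the candidate position minus the target position; all three choices, and the names `build_database` and `motion_vector`, are assumptions rather than details taken from the claims.

```python
from collections import defaultdict


def build_database(reference_features):
    """Register each reference-frame pixel position under both of its
    feature amount codes (B and C).

    reference_features: iterable of ((x, y), feature_code_b, feature_code_c).
    """
    db = defaultdict(list)
    for pos, feat_b, feat_c in reference_features:
        db[feat_b].append(pos)
        db[feat_c].append(pos)
    return db


def motion_vector(db, target_pos, feat_a):
    """Look up the positions whose stored code equals feature amount code A
    and pick the one closest to the target pixel.

    Returns (dx, dy) as candidate minus target, or None when nothing matches.
    """
    candidates = db.get(feat_a, [])
    if not candidates:
        return None
    tx, ty = target_pos
    bx, by = min(
        candidates,
        key=lambda p: (p[0] - tx) ** 2 + (p[1] - ty) ** 2,
    )
    return (bx - tx, by - ty)
```

A hash-style lookup keyed by the feature amount code replaces an exhaustive scan of a search area with a single lookup followed by a comparison among the matching positions only.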

An image processing method in which an image processing apparatus that detects a motion vector performs steps of:
reading, from a pixel position table that stores, in association with the n-bit ADRC quantization code of the pixel values of a predetermined pixel group including a target pixel of a first learning frame, information on the pixel position, within the predetermined pixel group, at which the level fluctuation of the pixel value detected between the first learning frame and a second learning frame temporally preceding the first learning frame is smallest, a first pixel position with the smallest level fluctuation that is associated with the quantization code of the first learning frame whose value matches an n-bit quantization code generated by quantizing, by ADRC, each pixel value of the predetermined pixel group including a target pixel of an input first frame;
extracting a feature amount code A of the target pixel of the first frame, consisting of a code A, encoded using the upper m bits of the 8-bit data indicating the pixel value at the first pixel position read from the pixel position table, and the quantization code of the predetermined pixel group including the target pixel of the first frame;
reading, from the pixel position table, a second pixel position with the smallest level fluctuation that is associated with the quantization code of the first learning frame whose value matches an n-bit quantization code generated by quantizing, by ADRC, each pixel value of the predetermined pixel group including a target pixel of a second frame temporally preceding the first frame;
extracting a feature amount code B and a feature amount code C of the target pixel of the second frame, each consisting of a code B or a code C, encoded using the upper m-1 bits of the 8-bit data indicating the pixel value at the second pixel position read from the pixel position table so as to overlap over a predetermined pixel value range including the pixel value indicated by a boundary portion of the code A, and the quantization code of the predetermined pixel group including the target pixel of the second frame;
controlling processing for storing, in a database, information on the corresponding pixel position for each of the extracted feature amount code B and feature amount code C; and
searching the pixel position information stored in the database for the pixel position information of the feature amount code B and the feature amount code C whose values match the extracted feature amount code A of the target pixel of the first frame.

A recording medium on which is recorded a program for causing a computer to execute processing comprising steps of:
reading, from a pixel position table that stores, in association with the n-bit ADRC quantization code of the pixel values of a predetermined pixel group including a target pixel of a first learning frame, information on the pixel position, within the predetermined pixel group, at which the level fluctuation of the pixel value detected between the first learning frame and a second learning frame temporally preceding the first learning frame is smallest, a first pixel position with the smallest level fluctuation that is associated with the quantization code of the first learning frame whose value matches an n-bit quantization code generated by quantizing, by ADRC, each pixel value of the predetermined pixel group including a target pixel of an input first frame;
extracting a feature amount code A of the target pixel of the first frame, consisting of a code A, encoded using the upper m bits of the 8-bit data indicating the pixel value at the first pixel position read from the pixel position table, and the quantization code of the predetermined pixel group including the target pixel of the first frame;
reading, from the pixel position table, a second pixel position with the smallest level fluctuation that is associated with the quantization code of the first learning frame whose value matches an n-bit quantization code generated by quantizing, by ADRC, each pixel value of the predetermined pixel group including a target pixel of a second frame temporally preceding the first frame;
extracting a feature amount code B and a feature amount code C of the target pixel of the second frame, each consisting of a code B or a code C, encoded using the upper m-1 bits of the 8-bit data indicating the pixel value at the second pixel position read from the pixel position table so as to overlap over a predetermined pixel value range including the pixel value indicated by a boundary portion of the code A, and the quantization code of the predetermined pixel group including the target pixel of the second frame;
controlling processing for storing, in a database, information on the corresponding pixel position for each of the extracted feature amount code B and feature amount code C; and
searching the pixel position information stored in the database for the pixel position information of the feature amount code B and the feature amount code C whose values match the extracted feature amount code A of the target pixel of the first frame.

A program for causing a computer to execute processing comprising steps of:
reading, from a pixel position table that stores, in association with the n-bit ADRC quantization code of the pixel values of a predetermined pixel group including a target pixel of a first learning frame, information on the pixel position, within the predetermined pixel group, at which the level fluctuation of the pixel value detected between the first learning frame and a second learning frame temporally preceding the first learning frame is smallest, a first pixel position with the smallest level fluctuation that is associated with the quantization code of the first learning frame whose value matches an n-bit quantization code generated by quantizing, by ADRC, each pixel value of the predetermined pixel group including a target pixel of an input first frame;
extracting a feature amount code A of the target pixel of the first frame, consisting of a code A, encoded using the upper m bits of the 8-bit data indicating the pixel value at the first pixel position read from the pixel position table, and the quantization code of the predetermined pixel group including the target pixel of the first frame;
reading, from the pixel position table, a second pixel position with the smallest level fluctuation that is associated with the quantization code of the first learning frame whose value matches an n-bit quantization code generated by quantizing, by ADRC, each pixel value of the predetermined pixel group including a target pixel of a second frame temporally preceding the first frame;
extracting a feature amount code B and a feature amount code C of the target pixel of the second frame, each consisting of a code B or a code C, encoded using the upper m-1 bits of the 8-bit data indicating the pixel value at the second pixel position read from the pixel position table so as to overlap over a predetermined pixel value range including the pixel value indicated by a boundary portion of the code A, and the quantization code of the predetermined pixel group including the target pixel of the second frame;
controlling processing for storing, in a database, information on the corresponding pixel position for each of the extracted feature amount code B and feature amount code C; and
searching the pixel position information stored in the database for the pixel position information of the feature amount code B and the feature amount code C whose values match the extracted feature amount code A of the target pixel of the first frame.

A learning apparatus that learns information on a pixel position within a predetermined pixel group including a target pixel of a first learning frame, the information being used to extract:
a feature amount code A of a target pixel of an input first frame, consisting of a code A, encoded using the upper m bits of the 8-bit data indicating the pixel value at a first pixel position with the smallest level fluctuation that is associated with the quantization code of the first learning frame whose value matches an n-bit quantization code generated by quantizing, by ADRC, each pixel value of a predetermined pixel group including the target pixel of the first frame, and the quantization code of the predetermined pixel group including the target pixel of the first frame; and
a feature amount code B and a feature amount code C, which are feature amounts of a target pixel of a second frame temporally preceding the first frame and among which one corresponding to the feature amount code A is searched for, each consisting of a code B or a code C, encoded using the upper m-1 bits of the 8-bit data indicating the pixel value at a second pixel position with the smallest level fluctuation that is associated with the quantization code of the first learning frame whose value matches an n-bit quantization code generated by quantizing, by ADRC, each pixel value of a predetermined pixel group including the target pixel of the second frame, so as to overlap over a predetermined pixel value range including the pixel value indicated by a boundary portion of the code A, and the quantization code of the predetermined pixel group including the target pixel of the second frame,
the learning apparatus comprising:
quantization code generation means for generating an n-bit quantization code by quantizing, by ADRC, each pixel value of a predetermined pixel group including the target pixel in the first learning frame;
detection means for detecting a variation value of the level of the pixel at each corresponding pixel position between the first learning frame and a second learning frame temporally preceding the first learning frame;
accumulation means for accumulating, in association with the quantization code generated by the quantization code generation means, the variation value of the pixel level detected by the detection means at each pixel position of the pixel group; and
pixel position information generation means for generating, based on the information accumulated by the accumulation means, pixel position information that associates the quantization code with the pixel position at which the level fluctuation of the pixel value is smallest.

A learning method performed by a learning apparatus that learns pixel position information for a predetermined pixel group including a pixel of interest of a first learning frame, the pixel position information being used to extract:
a feature amount code A of a pixel of interest of an input first frame, the feature amount code A consisting of a code A and an n-bit quantization code, wherein the n-bit quantization code is generated by quantizing, by ADRC, each pixel value of a predetermined pixel group including the pixel of interest of the first frame, and the code A is encoded using the upper m bits of 8-bit data indicating the pixel value at a first pixel position, the first pixel position being the pixel position with the smallest level variation that is associated with the quantization code of the first learning frame whose value matches that n-bit quantization code; and
a feature amount code B and a feature amount code C, which are feature amounts of a pixel of interest of a second frame temporally prior to the first frame and are the feature amounts searched for a match with the feature amount code A, each consisting of a code B or a code C and an n-bit quantization code, wherein the n-bit quantization code is generated by quantizing, by ADRC, each pixel value of a predetermined pixel group including the pixel of interest of the second frame, and the code B and the code C are each encoded, using the upper m-1 bits of 8-bit data indicating the pixel value at a second pixel position, so as to overlap in a predetermined pixel value range including the pixel value indicated by a boundary of the code A, the second pixel position being the pixel position with the smallest level variation that is associated with the quantization code of the first learning frame whose value matches that n-bit quantization code;
the learning method comprising the steps of:
generating, in the first learning frame, an n-bit quantization code based on ADRC of a predetermined pixel group including the pixel of interest;
detecting a variation value of the level of the pixel at each corresponding pixel position in the first learning frame and a second learning frame that is temporally prior to the first learning frame;
accumulating, in association with the generated quantization code, the detected variation value of the level of the pixel at each pixel position of the pixel group; and
generating, based on the accumulated information, pixel position information in which the quantization code is associated with the pixel position having the smallest level variation of the pixel value.
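
The four learning steps recited above (generate an ADRC quantization code, detect the level variation, accumulate it per code, select the pixel position with the smallest variation) might be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed in the specification: the function names adrc_code and learn_pixel_positions are hypothetical, the level variation is taken as an absolute difference, the accumulation is a plain sum, and the pixel groups of the two learning frames are assumed to be already aligned position by position.

```python
def adrc_code(pixels, bits=1):
    """Quantize each pixel of a group to `bits` bits with ADRC and pack the
    results into one integer code (assumed packing: first pixel is the MSB)."""
    lo, hi = min(pixels), max(pixels)
    dynamic_range = hi - lo + 1          # dynamic range of the group
    levels = 1 << bits                   # number of requantization levels
    code = 0
    for p in pixels:
        q = (p - lo) * levels // dynamic_range   # always in 0 .. levels-1
        code = (code << bits) | q
    return code


def learn_pixel_positions(group_pairs, bits=1):
    """group_pairs: iterable of (group_in_first_frame, group_in_second_frame),
    two equally long lists of pixel levels at corresponding positions.
    Returns {quantization code: index of the position with least variation}."""
    accumulated = {}                     # code -> summed variation per position
    for first, second in group_pairs:
        code = adrc_code(first, bits)    # step 1: quantization code
        sums = accumulated.setdefault(code, [0] * len(first))
        for k, (a, b) in enumerate(zip(first, second)):
            sums[k] += abs(a - b)        # steps 2 and 3: detect and accumulate
    # step 4: for each code, pick the position whose summed variation is smallest
    return {code: min(range(len(sums)), key=sums.__getitem__)
            for code, sums in accumulated.items()}


# Two 3-pixel groups that share the same 1-bit ADRC code (binary 010).
pairs = [([10, 200, 10], [12, 200, 40]),
         ([20, 220, 30], [25, 221, 90])]
table = learn_pixel_positions(pairs)
```

In this toy data both groups map to the code 2, and the middle position accumulates the smallest variation, so the resulting table associates code 2 with position index 1.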

A recording medium on which is recorded a program for causing a learning apparatus to execute processing, the learning apparatus learning pixel position information for a predetermined pixel group including a pixel of interest of a first learning frame, the pixel position information being used to extract:
a feature amount code A of a pixel of interest of an input first frame, the feature amount code A consisting of a code A and an n-bit quantization code, wherein the n-bit quantization code is generated by quantizing, by ADRC, each pixel value of a predetermined pixel group including the pixel of interest of the first frame, and the code A is encoded using the upper m bits of 8-bit data indicating the pixel value at a first pixel position, the first pixel position being the pixel position with the smallest level variation that is associated with the quantization code of the first learning frame whose value matches that n-bit quantization code; and
a feature amount code B and a feature amount code C, which are feature amounts of a pixel of interest of a second frame temporally prior to the first frame and are the feature amounts searched for a match with the feature amount code A, each consisting of a code B or a code C and an n-bit quantization code, wherein the n-bit quantization code is generated by quantizing, by ADRC, each pixel value of a predetermined pixel group including the pixel of interest of the second frame, and the code B and the code C are each encoded, using the upper m-1 bits of 8-bit data indicating the pixel value at a second pixel position, so as to overlap in a predetermined pixel value range including the pixel value indicated by a boundary of the code A, the second pixel position being the pixel position with the smallest level variation that is associated with the quantization code of the first learning frame whose value matches that n-bit quantization code;
the processing comprising the steps of:
generating, in the first learning frame, an n-bit quantization code based on ADRC of a predetermined pixel group including the pixel of interest;
detecting a variation value of the level of the pixel at each corresponding pixel position in the first learning frame and a second learning frame that is temporally prior to the first learning frame;
accumulating, in association with the generated quantization code, the detected variation value of the level of the pixel at each pixel position of the pixel group; and
generating, based on the accumulated information, pixel position information in which the quantization code is associated with the pixel position having the smallest level variation of the pixel value.

A program for causing a learning apparatus to execute processing, the learning apparatus learning pixel position information for a predetermined pixel group including a pixel of interest of a first learning frame, the pixel position information being used to extract:
a feature amount code A of a pixel of interest of an input first frame, the feature amount code A consisting of a code A and an n-bit quantization code, wherein the n-bit quantization code is generated by quantizing, by ADRC, each pixel value of a predetermined pixel group including the pixel of interest of the first frame, and the code A is encoded using the upper m bits of 8-bit data indicating the pixel value at a first pixel position, the first pixel position being the pixel position with the smallest level variation that is associated with the quantization code of the first learning frame whose value matches that n-bit quantization code; and
a feature amount code B and a feature amount code C, which are feature amounts of a pixel of interest of a second frame temporally prior to the first frame and are the feature amounts searched for a match with the feature amount code A, each consisting of a code B or a code C and an n-bit quantization code, wherein the n-bit quantization code is generated by quantizing, by ADRC, each pixel value of a predetermined pixel group including the pixel of interest of the second frame, and the code B and the code C are each encoded, using the upper m-1 bits of 8-bit data indicating the pixel value at a second pixel position, so as to overlap in a predetermined pixel value range including the pixel value indicated by a boundary of the code A, the second pixel position being the pixel position with the smallest level variation that is associated with the quantization code of the first learning frame whose value matches that n-bit quantization code;
the processing comprising the steps of:
generating, in the first learning frame, an n-bit quantization code based on ADRC of a predetermined pixel group including the pixel of interest;
detecting a variation value of the level of the pixel at each corresponding pixel position in the first learning frame and a second learning frame that is temporally prior to the first learning frame;
accumulating, in association with the generated quantization code, the detected variation value of the level of the pixel at each pixel position of the pixel group; and
generating, based on the accumulated information, pixel position information in which the quantization code is associated with the pixel position having the smallest level variation of the pixel value.