JP4563982B2 - Motion estimation method, apparatus, program thereof, and recording medium thereof

Info

Publication number: JP4563982B2
Application number: JP2006295407A
Authority: JP
Inventors: 幸浩坂東; 誠之高村; 一人上倉; 由幸八島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2006-10-31
Filing date: 2006-10-31
Publication date: 2010-10-20
Anticipated expiration: 2026-10-31
Also published as: JP2008113292A

Description

The present invention relates to a high-efficiency image signal encoding method. In particular, it relates to a motion estimation method, an apparatus, a program therefor, and a recording medium therefor that speed up displacement estimation with integer-pixel accuracy in video coding with inter-frame prediction based on motion compensation using a plurality of reference frames.

In recent years, expectations have grown for ultra-high-quality video, typified by immersive large-screen sports footage and digital cinema, and research on improving video quality is being actively pursued. Realizing ultra-high-quality video requires four elements: spatial resolution, pixel value depth, color reproducibility, and temporal resolution. The first three are being studied in applications such as digital cinema and in the Natural Vision project.

However, improving temporal resolution, that is, raising the video frame rate, which is indispensable for expressing the natural motion of a subject, has not been studied sufficiently. Spillmann et al. report the physiological finding that the upper limit on the number of pulses emitted by ganglion cells, the output cells of the retina, is about 300 to 400 per second. From this it is inferred that the human visual system can perceive differences in light emission as short as about 1/150 to 1/200 second, which means that the detection limit of perceivable frame rate is 150 to 200 frames/second. The current video frame rates of 30 and 60 frames/second were determined from the flicker detection limit and are not sufficient to express natural motion.

At the same time, ultra-high-quality video causes an explosive increase in the amount of data, so an efficient encoding method is needed. Inter-frame prediction by motion compensation is effective for reducing the redundancy of video data along the time axis. Reducing the prediction error with motion compensation requires accurate estimation of the inter-frame displacement by motion estimation. When the displacement between adjacent frames is estimated accurately for high-frame-rate video, many displacements of less than one pixel are likely to occur, so motion estimation with fractional-pixel accuracy becomes necessary. For example, for high-frame-rate video at 1000 frames/second, displacements on the order of 1/1000 pixel may occur.

In fractional-pixel motion estimation, an interpolation filter is used to obtain pixel values at fractional pixel positions in the reference frame. However, it has been reported that prediction performance generally saturates at 1/4- to 1/8-pixel interpolation accuracy because of the constraint imposed by the low-pass characteristic of the interpolation filter. Fractional-pixel motion estimation with an interpolation filter is therefore limited in its ability to handle the minute displacements contained in high-frame-rate video.

There is also a method that uses a plurality of frames as reference frames during motion compensation (multiple reference frames). Because high-frame-rate video is densely sampled along the time axis, there is a high probability that some frame can be matched to the current frame at integer pixel positions. In other words, a frame for which the displacement between the current frame and the reference frame can be expressed with integer-pixel accuracy is likely to exist.

If the displacement can be expressed with integer-pixel accuracy, interpolation filtering becomes unnecessary and the prediction error is more likely to stay low. Motion compensation with integer-pixel accuracy using multiple reference frames is therefore effective for high-frame-rate video. However, if the displacement is found by exhaustive search, the number of candidate reference frames grows, and both the computational cost and the memory consumption increase. For example, for a pixel moving at 1 pixel/sec in video with a frame rate of 1000 frames/sec, displacement estimation by motion compensation must be performed with 1000 frames as reference frames. A means of narrowing down the reference frames is therefore needed.
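The arithmetic behind the 1000-frame example can be sketched as follows. This is only an illustration under a constant-velocity assumption; the function name `frames_per_integer_shift` and its parameters are hypothetical and do not appear in the patent.

```python
from fractions import Fraction


def frames_per_integer_shift(speed_px_per_sec, frame_rate, shift_px=1):
    """Number of frames a point moving at constant speed needs to travel
    shift_px whole pixels (assumption: purely translational motion)."""
    return Fraction(shift_px) * Fraction(frame_rate) / Fraction(speed_px_per_sec)


# The example from the text: 1 pixel/sec at 1000 frames/sec.
print(frames_per_integer_shift(1, 1000))  # 1000
```

A faster object needs far fewer frames for the same one-pixel displacement, which is why the position of the useful reference frame depends on the content and has to be estimated.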

The following study has been made on speeding up motion search over multiple reference frames. In Non-Patent Document 1 below, a motion search is performed on the adjacent frame, and the search range in the other reference frames is narrowed down from that result under the assumption of constant-velocity motion. The conventional technique of Non-Patent Document 1 aims to reduce the computation of fractional-pixel motion search and does not consider narrowing down the reference frames for integer-pixel motion search. Moreover, it presupposes extrapolation from the motion search result obtained in the adjacent frame. Consequently, when reference frames for integer-pixel motion search are to be selected without being restricted to adjacent frames, as in the case described above, no reduction in computation can be expected.
Non-Patent Document 1: Shohei Matsuo, Isao Nagayoshi, Go Hanamura, Hideyoshi Tominaga (松尾翔平，永吉功，花村剛，富永英義), "Examination of Efficient Motion Search Using Multiple Reference Frames", IEICE Technical Report, 2005-AVM-50, Oct. 2005

The present invention has been made in view of these circumstances. Its object is to establish a fast motion estimation method for performing integer-pixel-accuracy displacement estimation with multiple reference frames in the encoding of high-frame-rate video signals.

To solve the above problem, the present invention uses the following methods. Although examples using luminance as the pixel value are described, pixel values other than luminance may be used.

In the first invention, when motion compensation using one or more reference frames (multiple-reference-frame motion compensation) is used in video coding with inter-frame prediction, a cost function is formed as a weighted sum of the spatial luminance gradient between pixels of the current frame separated by a given spatial displacement and the temporal luminance gradient between adjacent frames, with the temporal displacement required for that spatial displacement as the weight. The temporal displacement that minimizes this cost function is obtained, and motion estimation is performed using the frame corresponding to that temporal displacement as the reference frame. Alternatively, motion estimation is performed using that frame together with a plurality of frames before and after it as reference frames.

In the second invention, when motion compensation using one or more reference frames (multiple-reference-frame motion compensation) is used in video coding with inter-frame prediction, the temporal displacement required for a given spatial displacement is estimated for each pixel in a local region from the ratio of the luminance gradient between pixels of the current frame separated by that spatial displacement to the luminance gradient between adjacent frames. A histogram of the estimated temporal displacements within the local region is generated, the temporal displacement with the highest frequency in the histogram is obtained, and motion estimation is performed using the frame corresponding to that temporal displacement as the reference frame. Alternatively, motion estimation is performed using that frame together with a plurality of frames before and after it as reference frames. Details are as follows.

[First invention]
For simplicity, a one-dimensional signal is used as an example below. Let f(x, t) be the pixel value at position x in the t-th frame, and let Δ_t be the frame interval. Assuming a translational motion model, and letting b be the number of frames required for a displacement of one pixel (the temporal displacement), the following relation holds.

f(x, t) = f(x − 1, t − bΔ_t)   (1)
The right-hand side of the above equation can be approximated by a first-order Taylor expansion as follows.

Furthermore, the second term on the right-hand side of the above equation is approximated by a finite difference.

Using the above relations, the following equation is obtained.

Treating both sides of the above equation as equal and solving for b gives the following equation.

The above equation gives, for each pixel, the temporal displacement required for a displacement of one pixel.

Generalizing the displacement, let b_a be the number of frames (temporal displacement) required for a displacement of a pixels. Reasoning as before gives the following equation, where a is an integer.

The above equation gives, for each pixel, the temporal displacement required for a displacement of a pixels. The b of equation (5) is the case a = 1 in equation (6), that is, b_1.
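The equations referred to in the derivation above were images and are not reproduced in this text. Starting from equation (1) and using a backward difference for the time derivative, they can be reconstructed as below. This is a reconstruction from the surrounding definitions, not the original typesetting: the numbers (5) and (6) are confirmed by the text, while (2) to (4) are assumed from the order of the derivation.

```latex
\begin{align}
f(x-1,\,t-b\Delta_t) &\approx f(x-1,t) - b\,\Delta_t\,\frac{\partial f(x-1,t)}{\partial t} \tag{2}\\
\Delta_t\,\frac{\partial f(x-1,t)}{\partial t} &\approx f(x-1,t) - f(x-1,\,t-\Delta_t) \tag{3}\\
f(x,t) &\approx f(x-1,t) + b\,\bigl\{ f(x-1,\,t-\Delta_t) - f(x-1,t) \bigr\} \tag{4}\\
b &= \frac{f(x,t) - f(x-1,t)}{f(x-1,\,t-\Delta_t) - f(x-1,t)} \tag{5}\\
b_a &= \frac{f(x,t) - f(x-a,t)}{f(x-a,\,t-\Delta_t) - f(x-a,t)} \tag{6}
\end{align}
```

As a quick check, for the ramp f(x, t) = 4x − t/Δ_t, which satisfies equation (1) with b = 4, equation (5) gives b = 4 / 1 = 4.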

From the viewpoint of coding efficiency, however, it is desirable to assign a common temporal displacement to a plurality of pixels. Assuming that the temporal displacement is the same at every point in a local region, the b_a for which the above relation fits best in the least-squares sense within that region is estimated. That is, the temporal displacement that minimizes the following expression over the local region R is obtained.

Here, argmin_b returns the b that minimizes the function that follows it. The cost function E(b_a) is given by the following equation.

In equation (8), let
e_s(x, t, a) = f(x − a, t) − f(x, t), and
e_t(x, t, a) = f(x − a, t − Δ_t) − f(x − a, t).
Then E(b_a) can be expanded as follows. Here e_s(x, t, a) is called the spatial luminance gradient and e_t(x, t, a) the temporal luminance gradient.

Therefore, the value of b that minimizes the above equation is given by the following equation.
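Equations (7) to (10) of this passage were also images and are missing from this text. Using the definitions of e_s and e_t given above, and requiring that relation (6) hold in the least-squares sense over R, one consistent reconstruction is the following. The numbers (7) and (8) are confirmed by the text; (9) and (10) and the exact form of each line are assumptions.

```latex
\begin{align}
b_{\mathrm{opt}} &= \operatorname*{arg\,min}_{b_a} E(b_a) \tag{7}\\
E(b_a) &= \sum_{x \in R} \bigl\{ e_s(x,t,a) + b_a\, e_t(x,t,a) \bigr\}^2 \tag{8}\\
       &= \sum_{x \in R} e_s^2 + 2\,b_a \sum_{x \in R} e_s\,e_t + b_a^2 \sum_{x \in R} e_t^2 \tag{9}\\
b_{\mathrm{opt}} &= -\,\frac{\sum_{x \in R} e_s(x,t,a)\, e_t(x,t,a)}{\sum_{x \in R} e_t(x,t,a)^2} \tag{10}
\end{align}
```

Equation (10) follows from setting dE/db_a = 0 in (9). The leading minus sign comes from the order of subtraction in the definition of e_s; if the spatial difference is taken as f(x, t) − f(x − a, t) instead, the sign disappears and the estimate is a plain ratio of the two sums.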

Motion estimation for f(x, t) is performed with integer-pixel accuracy, using the frame f(x, t − b_opt Δ_t) as the reference frame, based on the frame interval b_opt given by equation (7). As described in detail for the second invention, a plurality of frames before and after the frame f(x, t − b_opt Δ_t) may additionally be used as reference frames.
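A minimal Python sketch of the least-squares estimate of the first invention is given below for a one-dimensional signal. It assumes the e_s and e_t definitions stated above and the resulting closed form b = −Σ e_s e_t / Σ e_t²; the function names, the list-based frame representation, and the handling of a zero denominator are illustrative choices, not part of the patent.

```python
def spatial_gradient(cur, x, a):
    """e_s(x, t, a) = f(x - a, t) - f(x, t)."""
    return cur[x - a] - cur[x]


def temporal_gradient(cur, prev, x, a):
    """e_t(x, t, a) = f(x - a, t - dt) - f(x - a, t)."""
    return prev[x - a] - cur[x - a]


def estimate_b_opt(cur, prev, region, a=1):
    """Least-squares temporal displacement (in frames) for a spatial shift of a pixels.

    cur and prev are the current and the adjacent (previous) frame as lists.
    region must only contain positions x with x - a >= 0.
    Returns None when the temporal gradient vanishes everywhere (assumption:
    no estimate is possible in that case).
    """
    sum_tt = 0.0  # accumulates e_t^2
    sum_st = 0.0  # accumulates e_s * e_t
    for x in region:
        es = spatial_gradient(cur, x, a)
        et = temporal_gradient(cur, prev, x, a)
        sum_tt += et * et
        sum_st += es * et
    if sum_tt == 0:
        return None
    return -sum_st / sum_tt


# Ramp f(x, t) = 4x - t at t = 10: the pattern moves one pixel every 4 frames.
cur = [4 * x - 10 for x in range(16)]
prev = [4 * x - 9 for x in range(16)]
print(estimate_b_opt(cur, prev, range(1, 16)))        # 4.0
print(estimate_b_opt(cur, prev, range(2, 16), a=2))   # 8.0
```

The returned value is real-valued; it would be rounded to a whole number of frames before being used as a frame offset.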

[Second invention]
Using equation (6), the temporal displacement is obtained for every pixel in the local region R. Let b(x, t, a) denote the resulting temporal displacement at f(x, t) (x ∈ R). The value of b(x, t, a) (x ∈ R) that occurs most frequently is taken as b_opt, and the (t − b_opt)-th frame is used as the reference frame.
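The histogram-based selection of the second invention can be sketched as follows for a one-dimensional signal. The per-pixel estimate is taken as −e_s / e_t with the e_s and e_t definitions of the first invention, rounded to an integer; skipping pixels whose temporal gradient is zero and the function names are assumptions made for this illustration.

```python
from collections import Counter


def per_pixel_time_shift(cur, prev, x, a=1):
    """Integer temporal displacement b(x, t, a) for one pixel, or None if undefined."""
    es = cur[x - a] - cur[x]       # spatial luminance gradient e_s
    et = prev[x - a] - cur[x - a]  # temporal luminance gradient e_t
    if et == 0:
        return None                # assumption: skip pixels with no temporal change
    return round(-es / et)         # Python's round() rounds halves to the even integer


def most_frequent_time_shift(cur, prev, region, a=1):
    """Return (b_opt, histogram), where b_opt is the mode of b(x, t, a) over region."""
    hist = Counter()
    for x in region:
        b = per_pixel_time_shift(cur, prev, x, a)
        if b is not None:
            hist[b] += 1
    if not hist:
        return None, hist
    b_opt, _count = hist.most_common(1)[0]
    return b_opt, hist


# Ramp moving one pixel every 4 frames, with one corrupted sample in the adjacent frame.
cur = [4 * x - 10 for x in range(16)]
prev = [4 * x - 9 for x in range(16)]
prev[5] += 3
b_opt, hist = most_frequent_time_shift(cur, prev, range(1, 16))
print(b_opt)  # 4
```

Taking the mode rather than a sum makes the result insensitive to a few outlying pixels, at the cost of ignoring how strongly each pixel supports its estimate.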

Furthermore, a plurality of frames (2M − 1 frames) before and after the (t − b_opt)-th frame may be used as reference frames. In this case, the τ-th frames (t − b_opt − M ≤ τ < t − b_opt + M) are the reference frames searched in motion estimation. Here M is an integer whose value is given externally. If no means of supplying M externally is available, the value is generated internally. For example, letting r be the probability of the most frequent value of b(x, t, a) (x ∈ R), M is given by the following equation.

Here, F is the frame rate of the sequence and N is the number of pixels in the region R.

According to the present invention, when motion estimation that does not restrict the reference frame to adjacent frames is applied to high-frame-rate video, the reference frames to be searched in integer-pixel motion estimation can be determined quickly, which in turn makes it possible to estimate the displacement quickly. Estimating the displacement by integer-pixel motion estimation avoids the loss of high-frequency components in the reference signal that accompanies interpolation filtering, so the prediction residual in motion-compensated inter-frame prediction can be reduced.

In the following, the pixel value f(x, t) of the frame subject to motion estimation is called the pixel value of the current frame, f(x − a, t) is called the displaced-position pixel value of the current frame, and f(x − a, t − Δ_t) is called the displaced-position pixel value of the adjacent frame.

[Flow of motion estimation processing (Embodiment 1)]
The flow of the motion estimation processing of Embodiment 1 is described with reference to FIG. 1.

Step S101: The spatial displacement a is read and stored. In this embodiment, a is given externally as an encoding parameter. If it is fixed in advance, for example a = 1, this reading step is unnecessary.

Step S102: The current frame is read into a frame buffer. Let t be the frame number of the current frame.

Step S103: The frame adjacent to the current frame is read into a frame buffer.

Step S104: Registers A1 and A2 are initialized to zero.

Step S105: Based on the spatial displacement a, the pixel value f(x, t) of the pixel being processed in the current frame and the displaced-position pixel value f(x − a, t) of the current frame are read from the frame buffer, their difference is computed, and the result is written to register Es.

Step S106: Based on the spatial displacement a, the displaced-position pixel value f(x − a, t) of the current frame and the displaced-position pixel value f(x − a, t − Δ_t) of the adjacent frame are read from the respective frame buffers, their difference is computed, and the result is written to register Et.

Step S107: The value of register Et is read and squared, the square is added to the value of register A1, and the sum is written back to register A1.

Step S108: The values of registers Es and Et are read and multiplied, the product is added to the value of register A2, and the sum is written back to register A2.

Steps S109, S110: It is determined whether all pixels in the local region R have been processed. If so, processing proceeds to step S111. Otherwise, the next pixel in the local region R becomes the pixel being processed, and steps S105 to S108 are repeated.

Step S111: The values of registers A1 and A2 are read, the value of A2 is divided by the value of A1, and the result is written to register b.

Step S112: The value of register b and the value M, which holds the number of reference frames, are read, and the 2M + 1 frames from the (t − b − M)-th frame to the (t − b + M)-th frame are written to the frame buffer as the reference frames to be searched in motion estimation. Here t is the frame number of the current frame.

Step S113: The pixel values of the current frame in the frame buffer written in step S102 and the pixel values of the reference frames in the frame buffer written in step S112 are read as input, motion estimation is performed, and the estimated displacement is written to a register. The specific motion estimation technique is given externally. One example is motion estimation based on block matching.
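Steps S112 and S113 can be sketched as follows for one-dimensional frames. The patent leaves the motion estimation technique open and names block matching only as an example, so the sum of absolute differences (SAD) used here, the clipping of the window to frames earlier than t, and all function and parameter names are assumptions made for illustration.

```python
def reference_frame_indices(t, b, m, first_frame=0):
    """Frames t-b-M .. t-b+M (2M+1 frames), clipped to [first_frame, t-1].

    The clipping is an assumption: only frames preceding the current one
    and actually present in the sequence are kept.
    """
    lo = max(first_frame, t - b - m)
    hi = min(t - 1, t - b + m)
    return list(range(lo, hi + 1))


def sad(block, ref, start):
    """Sum of absolute differences between block and ref[start:start+len(block)]."""
    return sum(abs(p - ref[start + i]) for i, p in enumerate(block))


def block_match(frames, t, block_start, block_len, ref_indices, search_range):
    """Integer-pixel block matching over the selected reference frames.

    Returns (cost, reference frame index, displacement) of the best match,
    or None if no candidate position lies inside a reference frame.
    """
    block = frames[t][block_start:block_start + block_len]
    best = None
    for tau in ref_indices:
        ref = frames[tau]
        for d in range(-search_range, search_range + 1):
            start = block_start + d
            if start < 0 or start + block_len > len(ref):
                continue  # candidate block would fall outside the frame
            cost = sad(block, ref, start)
            if best is None or cost < best[0]:
                best = (cost, tau, d)
    return best


# Pattern g(u) = u^2 moving one pixel every 4 frames: f(x, t) = (x - t/4)^2.
frames = [[(x - t / 4) ** 2 for x in range(16)] for t in range(9)]
refs = reference_frame_indices(t=8, b=4, m=1)          # [3, 4, 5]
print(block_match(frames, 8, 5, 4, refs, search_range=2))
```

In this example only frame 4 contains an exact integer-pixel match for the block, which is the situation the reference frame selection is meant to find without searching every earlier frame.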

[Motion estimation apparatus (Embodiment 1)]
FIG. 2 is a configuration diagram of the motion estimation apparatus of Embodiment 1, which is described below with reference to that figure.

Current frame reading unit 101: Reads the current frame and stores it in the current frame storage unit 102.

Spatial displacement reading unit 103: Reads the spatial displacement a and stores it in the spatial displacement storage unit 104.

Adjacent frame reading unit 105: Reads the frame adjacent to the current frame and stores it in the adjacent frame storage unit 106.

Reference frame count reading unit 107: Reads the number of reference frames M and stores it in the reference frame count storage unit 108.

Spatial luminance gradient calculation unit 109: Based on the spatial displacement a in the spatial displacement storage unit 104, reads the pixel value f(x, t) of the pixel being processed and the displaced-position pixel value f(x − a, t) of the current frame from the current frame storage unit 102, computes the difference between the two pixel values, and writes the result to the spatial luminance gradient storage unit 110.

Temporal luminance gradient calculation unit 111: Based on the spatial displacement a in the spatial displacement storage unit 104, reads the displaced-position pixel value f(x − a, t) of the current frame from the current frame storage unit 102 and the displaced-position pixel value f(x − a, t − Δ_t) of the adjacent frame from the adjacent frame storage unit 106, computes the difference between the two values, and writes the result to the temporal luminance gradient storage unit 112.

Accumulation processing unit 113: Reads the values of the spatial luminance gradient storage unit 110 and the temporal luminance gradient storage unit 112 as input, computes their product, and writes the result to the accumulated value storage unit 114.

Squaring unit 115: Reads the value of the temporal luminance gradient storage unit 112 as input, squares it, and writes the result to the squared value storage unit 116.

Final pixel determination unit 117: Controls the repetition of the above processing over all pixels in the local region.

Division unit 118: Reads the values of the squared value storage unit 116 and the accumulated value storage unit 114 as input, divides the latter by the former, rounds the quotient to an integer, and writes the result to the division value storage unit 119.

Reference frame reading unit 120: Reads the values of the division value storage unit 119 and the reference frame count storage unit 108 as input. Letting these values be b and M respectively, it writes the 2M + 1 frames from the (t − b − M)-th frame to the (t − b + M)-th frame to the reference frame storage unit 121 as the reference frames to be searched in motion estimation. Here t is the frame number of the current frame.

Motion estimation unit 122: Reads the pixel values of the current frame in the current frame storage unit 102 and the pixel values of the reference frames in the reference frame storage unit 121 as input, performs motion estimation, and writes the estimated displacement to the displacement storage unit 123. The specific motion estimation technique is given externally. One example is motion estimation based on block matching.

[Flow of motion estimation processing (Embodiment 2)]
The flow of the motion estimation processing of Embodiment 2 is described with reference to FIG. 3.

Steps S201 to S203: Same as steps S101 to S103 of Embodiment 1.

Step S204: Every element of the histogram H is initialized to zero.

Steps S205, S206: Same as steps S105 and S106 of Embodiment 1.

Step S207: The values of registers Es and Et are read, the division Es/Et is performed, the quotient is rounded to an integer, and the result is written to register b.

Step S208: The value of register b and the histogram H are read as input, 1 is added to the frequency of the b-th element of H, and the incremented frequency overwrites the frequency of the b-th element of H.

Steps S209, S210: It is determined whether all pixels in the local region R have been processed. If so, processing proceeds to step S211. Otherwise, the next pixel in the local region R becomes the pixel being processed, and steps S205 to S208 are repeated.

Step S211: The histogram H is read as input, the value of the element with the highest frequency is obtained, and that value is written to register b.

Steps S212, S213: Same as steps S112 and S113 of Embodiment 1.

[Motion estimation apparatus (Embodiment 2)]
FIG. 4 is a configuration diagram of the motion estimation apparatus of Embodiment 2, which is described below with reference to that figure.

当該フレーム読込み部２０１：当該フレームを読み込み，当該フレーム記憶部２０２に格納する。 Frame reading unit 201: Reads the frame and stores it in the frame storage unit 202.

空間変移量読込み部２０３：空間変移量ａを読み込み，空間変移量記憶部２０４に格納する。 Spatial shift reading unit 203: Reads the spatial shift a and stores it in the spatial shift storage unit 204.

隣接フレーム読込み部２０５：当該フレームを読み込み，隣接フレーム記憶部２０６に格納する。 Adjacent frame reading unit 205: Reads the frame and stores it in the adjacent frame storage unit 206.

参照フレーム数読込み部２０７：参照フレーム数Ｍを読み込み，参照フレーム数記憶部２０８に格納する。 Reference frame number reading unit 207: Reads the reference frame number M and stores it in the reference frame number storage unit 208.

空間輝度勾配算出部２０９：空間変移量記憶部２０４内の空間変移量ａをもとに，当該フレーム記憶部２０２から当該フレームの処理対象画素値ｆ（ｘ，ｔ）と，当該フレームの変移位置画素値ｆ（ｘ−ａ，ｔ）とを読み出し，両画素値の差分を求める処理を行い，求めた結果を空間輝度勾配記憶部２１０に書き出す。 Spatial luminance gradient calculation unit 209: based on the spatial shift amount a in the spatial shift amount storage unit 204, the processing target pixel value f (x, t) of the frame from the frame storage unit 202 and the shift position of the frame The pixel value f (x−a, t) is read out, a process for obtaining the difference between the two pixel values is performed, and the obtained result is written in the spatial luminance gradient storage unit 210.

時間輝度勾配算出部２１１：空間変移量記憶部２０４内の空間変移量ａをもとに，当該フレーム記憶部２０２から当該フレームの変移位置画素値ｆ（ｘ−ａ，ｔ）を読み出し，隣接フレーム記憶部２０６から隣接フレームの変移位置画素値ｆ（ｘ−ａ，ｔ−Δ_t）を読み出し，両値の差分を求める処理を行い，求めた結果を時間輝度勾配記憶部２１２に書き出す。 Temporal luminance gradient calculating unit 211: Based on the spatial shift amount a in the spatial shift amount storage unit 204, the transition position pixel value f (x−a, t) of the frame is read from the frame storage unit 202, and the adjacent frame The transition position pixel value f (x−a, t− _Δt ) of the adjacent frame is read from the storage unit 206, a process for calculating the difference between the two values is performed, and the calculated result is written in the temporal luminance gradient storage unit 212.

除算処理部２１３：空間輝度勾配記憶部２１０の値，および時間輝度勾配記憶部２１２の値を入力として読み込み，前者を後者で除する演算を行い，整数値に丸める処理を行い，結果を除算値記憶部２１４に書き出す。 Division processing unit 213: Reads the value of the spatial luminance gradient storage unit 210 and the value of the temporal luminance gradient storage unit 212 as input, performs an operation of dividing the former by the latter, performs processing to round to an integer value, and divides the result into division values Write to the storage unit 214.

ヒストグラム更新部２１５：除算値記憶部２１４の値，およびヒストグラムを入力として読み込み，除算値記憶部２１４の値をｂとすると，ヒストグラムの第ｂ要素の頻度に１を加算し，加算後の頻度をヒストグラムの第ｂ要素の頻度として，ヒストグラム記憶部２１６に書き出す。 Histogram update unit 215: Reads the value of the division value storage unit 214 and the histogram as input, and if the value of the division value storage unit 214 is b, 1 is added to the frequency of the b-th element of the histogram, and the frequency after the addition is The frequency is written in the histogram storage unit 216 as the frequency of the b-th element of the histogram.

最終画素判定処理部２１７：以上の処理を局所領域内の全ての画素について繰り返す制御を行う。 Final pixel determination processing unit 217: Control is performed to repeat the above processing for all pixels in the local region.

Maximum frequency element detection unit 218: Reads the histogram in the histogram storage unit 216 as input, finds the value of the element with the maximum frequency in the histogram, and writes this value to the maximum frequency element storage unit 219.
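The histogram update and the maximum frequency search performed by units 215 to 218 might look like the sketch below. The function name `most_frequent_shift`, the skipping of `None` entries, and the tie-breaking behaviour (first value encountered wins) are assumptions; the text does not say how ties or missing estimates are handled.

```python
from collections import Counter


def most_frequent_shift(per_pixel_shifts):
    """Return the temporal shift that occurs most often in a local region.

    per_pixel_shifts : iterable of integer estimates b, one per pixel;
                       None marks a pixel without an estimate (assumption).
    """
    histogram = Counter()
    for b in per_pixel_shifts:
        if b is None:
            continue
        histogram[b] += 1          # add 1 to the frequency of element b
    if not histogram:
        return None                # no usable pixel in the region
    # Element with the maximum frequency
    return histogram.most_common(1)[0][0]
```

Calling `most_frequent_shift([4, 3, 4, None, 5, 4, 3])` counts 4 three times and 3 twice, so the selected temporal shift is 4.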

Reference frame reading unit 220: Reads the value of the maximum frequency element storage unit 219 and the value of the reference frame number storage unit 208 as input. Letting these values be b and M, respectively, it writes the 2M+1 frames from the (t−b−M)-th frame to the (t−b+M)-th frame to the reference frame storage unit 221 as the reference frames to be searched in motion estimation. Here t is the frame number of the current frame.
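The frame window chosen by unit 220 is simple index arithmetic, shown here as a small sketch. The function name `reference_frame_indices` is hypothetical, and the sketch deliberately does no clipping to frames that actually exist, since the text does not describe any.

```python
def reference_frame_indices(t, b, m):
    """Frame numbers (t - b - m) .. (t - b + m), i.e. 2m + 1 frames.

    t : frame number of the current frame
    b : most frequent temporal shift from the histogram
    m : value read from the reference frame number storage unit
    """
    centre = t - b
    return list(range(centre - m, centre + m + 1))
```

With t = 10, b = 4, and M = 1, the search is restricted to frames 5, 6, and 7 rather than every stored frame.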

Motion estimation processing unit 222: Reads the pixel values of the current frame from the current frame storage unit 202 and the pixel values of the reference frames from the reference frame storage unit 221 as input, performs motion estimation, and writes the estimated shift amount to the shift amount storage unit 223. The specific motion estimation method is assumed to be supplied externally; motion estimation based on block matching is one example.
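Because the motion estimation method itself is supplied externally, the following is only one possible stand-in: a one-dimensional block matching search that minimises the sum of absolute differences (SAD) over the selected reference frames. The function name `block_matching_1d`, the SAD criterion, the search range parameter, and the displacement sign convention (matched position = start + d) are all assumptions.

```python
def block_matching_1d(cur, refs, start, size, search):
    """Find the reference frame and displacement with the smallest SAD.

    cur    : pixel values of the current frame
    refs   : dict mapping frame number -> pixel values of that reference frame
    start  : first pixel of the block in the current frame
    size   : block length in pixels
    search : maximum displacement examined in each direction
    Returns (frame_number, displacement, sad), or None if no candidate fits.
    """
    block = cur[start:start + size]
    best = None
    for frame_number, ref in refs.items():
        for d in range(-search, search + 1):
            pos = start + d
            if pos < 0 or pos + size > len(ref):
                continue  # candidate block falls outside the frame
            candidate = ref[pos:pos + size]
            sad = sum(abs(p - q) for p, q in zip(block, candidate))
            if best is None or sad < best[2]:
                best = (frame_number, d, sad)
    return best
```

Restricting `refs` to the 2M+1 frames selected beforehand is what reduces the search cost; the matching loop itself is unchanged.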

For ease of explanation, a one-dimensional signal has been used as the example, but it goes without saying that the technique described in this embodiment can readily be applied to two-dimensional image signals.

The motion estimation processing described above can be implemented by a computer and a software program. The program can be provided recorded on a computer-readable recording medium or provided over a network.

A flowchart showing the flow of the motion estimation processing of Embodiment 1.

A block diagram of the motion estimation apparatus of Embodiment 1.

A flowchart showing the flow of the motion estimation processing of Embodiment 2.

A block diagram of the motion estimation apparatus of Embodiment 2.

Explanation of symbols

101, 201 Current frame reading unit
102, 202 Current frame storage unit
103, 203 Spatial shift amount reading unit
104, 204 Spatial shift amount storage unit
105, 205 Adjacent frame reading unit
106, 206 Adjacent frame storage unit
107, 207 Reference frame number reading unit
108, 208 Reference frame number storage unit
109, 209 Spatial luminance gradient calculation unit
110, 210 Spatial luminance gradient storage unit
111, 211 Temporal luminance gradient calculation unit
112, 212 Temporal luminance gradient storage unit
113 Integration processing unit
114 Integration value storage unit
115 Square operation processing unit
116 Square value storage unit
117, 217 Final pixel determination processing unit
118, 213 Division processing unit
119, 214 Division value storage unit
120, 220 Reference frame reading unit
121, 221 Reference frame storage unit
122, 222 Motion estimation processing unit
123, 223 Shift amount storage unit
215 Histogram update unit
216 Histogram storage unit
218 Maximum frequency element detection unit
219 Maximum frequency element storage unit

Claims

A motion estimation method for video coding with inter-frame prediction that uses motion compensation with one or more reference frames, the method comprising:
a spatial pixel value gradient calculation step of calculating a spatial pixel value gradient between a processing target pixel in a processing target frame and a pixel separated from the processing target pixel by a spatial shift amount of a predetermined integer number of pixels;
a temporal pixel value gradient calculation step of calculating a temporal pixel value gradient between a shifted-position pixel separated from the processing target pixel in the processing target frame by the spatial shift amount and the shifted-position pixel at the position in an adjacent frame corresponding to that shifted-position pixel;
a temporal shift amount calculation step of calculating a temporal shift amount that minimizes a cost function consisting of a weighted sum of the spatial pixel value gradient and the temporal pixel value gradient, in which the temporal shift amount required for the spatial shift amount is the weight value; and
a motion estimation step of estimating motion for the motion compensation using, as reference frames, the frame corresponding to the calculated temporal shift amount, or that frame together with a plurality of frames before and after it.

A motion estimation method for video coding with inter-frame prediction that uses motion compensation with one or more reference frames, the method comprising:
a spatial pixel value gradient calculation step of calculating, for each pixel in a local region of a processing target frame, a spatial pixel value gradient between a processing target pixel in the processing target frame and a pixel separated from the processing target pixel by a spatial shift amount of a predetermined integer number of pixels;
a temporal pixel value gradient calculation step of calculating, for each pixel in the local region of the processing target frame, a temporal pixel value gradient between a shifted-position pixel separated from the processing target pixel in the processing target frame by the spatial shift amount and the shifted-position pixel at the position in an adjacent frame corresponding to that shifted-position pixel;
a temporal shift amount estimation step of calculating a ratio between the spatial pixel value gradient and the temporal pixel value gradient and estimating a temporal shift amount for each pixel in the local region;
a temporal shift amount calculation step of generating, from the temporal shift amounts estimated for the individual pixels, a histogram of temporal shift amounts over all pixels in the local region and finding the temporal shift amount with the maximum frequency in the histogram; and
a motion estimation step of estimating motion for the motion compensation using, as reference frames, the frame corresponding to the calculated temporal shift amount, or that frame together with a plurality of frames before and after it.

A motion estimation apparatus for video coding with inter-frame prediction that uses motion compensation with one or more reference frames, the apparatus comprising:
spatial pixel value gradient calculation means for calculating a spatial pixel value gradient between a processing target pixel in a processing target frame and a pixel separated from the processing target pixel by a spatial shift amount of a predetermined integer number of pixels;
temporal pixel value gradient calculation means for calculating a temporal pixel value gradient between a shifted-position pixel separated from the processing target pixel in the processing target frame by the spatial shift amount and the shifted-position pixel at the position in an adjacent frame corresponding to that shifted-position pixel;
temporal shift amount calculation means for calculating a temporal shift amount that minimizes a cost function consisting of a weighted sum of the spatial pixel value gradient and the temporal pixel value gradient, in which the temporal shift amount required for the spatial shift amount is the weight value; and
motion estimation means for estimating motion for the motion compensation using, as reference frames, the frame corresponding to the calculated temporal shift amount, or that frame together with a plurality of frames before and after it.

A motion estimation apparatus for video coding with inter-frame prediction that uses motion compensation with one or more reference frames, the apparatus comprising:
spatial pixel value gradient calculation means for calculating, for each pixel in a local region of a processing target frame, a spatial pixel value gradient between a processing target pixel in the processing target frame and a pixel separated from the processing target pixel by a spatial shift amount of a predetermined integer number of pixels;
temporal pixel value gradient calculation means for calculating, for each pixel in the local region of the processing target frame, a temporal pixel value gradient between a shifted-position pixel separated from the processing target pixel in the processing target frame by the spatial shift amount and the shifted-position pixel at the position in an adjacent frame corresponding to that shifted-position pixel;
temporal shift amount estimation means for calculating a ratio between the spatial pixel value gradient and the temporal pixel value gradient and estimating a temporal shift amount for each pixel in the local region;
temporal shift amount calculation means for generating, from the temporal shift amounts estimated for the individual pixels, a histogram of temporal shift amounts over all pixels in the local region and finding the temporal shift amount with the maximum frequency in the histogram; and
motion estimation means for estimating motion for the motion compensation using, as reference frames, the frame corresponding to the calculated temporal shift amount, or that frame together with a plurality of frames before and after it.

A motion estimation program for causing a computer to execute the motion estimation method according to claim 1 or claim 2.

A computer-readable recording medium on which is recorded a motion estimation program for causing a computer to execute the motion estimation method according to claim 1 or claim 2.