JP2011243008A - Motion estimation device and program

Info

Publication number: JP2011243008A
Application number: JP2010114729A
Authority: JP
Inventors: Yasutaka Matsuo; 康孝松尾; Yoshiaki Shishikui; 善明鹿喰; Toshie Misu; 俊枝三須; Shinichi Sakaida; 慎一境田
Original assignee: Nippon Hoso Kyokai NHK
Current assignee: Japan Broadcasting Corp
Priority date: 2010-05-18
Filing date: 2010-05-18
Publication date: 2011-12-01
Anticipated expiration: 2030-05-18
Also published as: JP5520687B2

Abstract

PROBLEM TO BE SOLVED: To provide a motion estimation device and program for estimating the motion of a moving image. SOLUTION: The motion estimation device includes: a time direction frequency decomposition unit 11 that calculates the power of each frequency band in the time direction for a frame image sequence; a spatial decomposition rank determination unit 12 that determines a spatial decomposition rank value based on the power value of each frequency band in the time direction; a spatial direction frequency decomposition unit 13 that performs octave decomposition of the motion estimation target frame based on the spatial decomposition rank value and calculates the power value of each frequency band in the horizontal, vertical, and diagonal directions; an image analysis unit 14 that determines a motion detection start rank from the power values of each frequency band in each direction, calculated for each rank corresponding to the spatial decomposition rank value, and determines, for each rank, a block size and a motion search range according to those power values; and a motion estimation unit 15 that performs hierarchical motion estimation based on the block size and motion search range determined for each rank of the motion detection start rank.

Description

The present invention relates to a motion estimation device and program that hierarchically analyze the amount of motion in a moving image using its spectral power in the time direction and/or the spatial direction, and detect motion vectors of the moving image with high accuracy.

In recent years, imaging devices and display devices have become increasingly high-definition, and techniques for increasing the resolution of moving images, known as super-resolution, have been studied (see, for example, Patent Document 1). Ultra-high-definition video such as Super Hi-Vision (SHV), the so-called 8K system, and high-definition video such as digital cinema, the so-called 4K system, have 4 to 16 times the resolution of conventional Hi-Vision (HV) video.

However, the higher the definition of the screen of the display device that shows a moving image, the larger the amount of motion blur per pixel in a moving region when the scene is shot at the same angle of view.

For example, as shown in FIG. 15(a), a Hi-Vision (HV) screen has 1920 horizontal pixels × 1080 vertical lines at 60 frames/second, and as shown in FIG. 15(b), a Super Hi-Vision (SHV) screen has 7680 horizontal pixels × 4320 vertical lines at 60 frames/second. When a moving image captured with the same field of view (FOV) as one for an HV screen is viewed on an SHV screen, both the horizontal and vertical resolutions are four times higher, so the motion of a moving subject, measured in pixels, is four times larger, and the amount of motion blur per pixel in the moving region is also four times larger. The visual blur is especially noticeable in fast-motion scenes, such as sports scenes, in which the entire screen changes greatly.
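The fourfold figure follows directly from the ratio of the pixel counts. The short sketch below only restates that arithmetic; the function names and the sample motion of (8, 2) pixels/frame are illustrative assumptions, not values from the publication.

```python
# Pixel dimensions of the two formats as stated in the text.
HV = (1920, 1080)
SHV = (7680, 4320)


def motion_scale(src, dst):
    """Return the (horizontal, vertical) pixel-count ratios dst/src."""
    return dst[0] / src[0], dst[1] / src[1]


def scaled_motion(motion_px_per_frame, src=HV, dst=SHV):
    """Express a motion given in src pixels/frame in dst pixels/frame,
    assuming both formats cover the same field of view."""
    sx, sy = motion_scale(src, dst)
    mx, my = motion_px_per_frame
    return mx * sx, my * sy


print(motion_scale(HV, SHV))   # ratio in each direction
print(scaled_motion((8, 2)))   # hypothetical HV motion re-expressed for SHV
```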

A technique is also known in which image data of several different resolutions are set hierarchically for one screen of a moving image, an evaluation value for the motion amount is obtained for the image data of one resolution, an evaluation value is likewise obtained for image data of a different resolution, and the final motion amount is determined from the value obtained by adding the evaluation values (see, for example, Patent Document 2).

Another known technique applies a multi-rank wavelet transform in the spatial direction to one screen of a moving image and extracts contour information that lowers the priority of ranks containing many high-frequency regions. The screen is also divided into blocks, the blocks indicated by the contour information are given the same priority such that the priority becomes higher as the activity (a local property of the image) in those blocks becomes smaller, and the amount of encoded data is controlled by truncating blocks in order, starting from the lowest priority (see, for example, Patent Document 3).

Patent Document 1: JP 2009-105490 A
Patent Document 2: Japanese Patent No. 3334271
Patent Document 3: Japanese Patent No. 4195978

As described above, the SHV video format has 7680 horizontal pixels, 4320 vertical lines, and 60 frames/second. Compared with the HV video format, its horizontal and vertical sampling frequencies are higher relative to its temporal sampling frequency.

Therefore, when moving images captured at the same angle of view are compared, a moving region on an SHV screen exhibits a larger motion amount (the number of pixels moved per frame: pixels/frame) than on an HV screen. In the moving region, the correlation between frames is expected to fall, raising the power in the temporal high-frequency region, while the blur is expected to grow, lowering the power in the spatial high-frequency region. Processing based on these assumptions is effective for estimating the motion amount between frames in encoding and super-resolution processing. The motion amount of HDTV-standard moving images is generally known to be about several pixels/frame to several tens of pixels/frame.

In general, motion estimation with a small block size (for example, 2 × 2 pixels) suits pictures with many spatial high-frequency components. For pictures with few spatial high-frequency components, a small block size is likely to produce erroneous motion estimates, so motion estimation with a large block size (for example, 16 × 16 pixels) is effective. Furthermore, motion estimation with a large block size requires a large motion search range, because blur caused by large motion must also be taken into account.

However, even if the technique of Patent Document 2 is applied to estimating the motion amount between frames in such encoding or super-resolution processing, it does not build its hierarchy from the power in the spatio-temporal frequency domain but processes a predetermined number of layers. The processing load is therefore large, and a motion amount that reflects the influence of blur in the time direction cannot be detected.

Likewise, if the technique of Patent Document 3 is applied to estimating the motion amount between frames in such encoding or super-resolution processing, it can determine block priorities for selecting the encoded data according to the ranks that contain many high-frequency regions. However, because it does not build its hierarchy from the power in the spatio-temporal frequency domain, it cannot improve the accuracy of the motion vectors.

Thus, a large motion amount increases the motion blur in the moving region and lowers the share of power in the spatial high-frequency region. A large moving-region area increases the area that varies over time and raises the share of power in the temporal high-frequency region.

In general, a large block size and a large search range suit motion estimation for images with a large motion amount and large motion blur. Motion estimation for SHV screens calls for a method that handles a wide range of motion amounts, from large to small.

An object of the present invention is therefore to provide a motion estimation device and program that hierarchically analyze the amount of motion in a moving image using its spectral power in the time direction and/or the spatial direction, and obtain motion estimation information for the moving image with high accuracy.

As described above, the motion amount and the power in the spatio-temporal high-frequency regions are often correlated. It is therefore effective to analyze the power of the temporal and spatial spectra of the moving image, estimate an appropriate block size for motion estimation, and detect motion vectors by arranging the estimated block sizes in a spatial hierarchy.

Considering the spatio-temporal spectrum of a moving image, the power in the spatial high-frequency region of a moving region decreases as the area undergoing motion increases. That is, in a moving image where a large-area moving object has a large motion amount, the power in the spatial high-frequency region tends to be small while the power in the temporal high-frequency region tends to be large. This is because the variation in the time direction becomes large when an object occupying a large area of the screen moves a great deal.

Therefore, to estimate the area of the moving region and the motion amount in a moving image, the motion estimation device and program of the present invention analyze the spectral power of the moving image in the time direction and/or the spatial direction. They exploit two characteristics: the temporal frequency varies greatly in moving regions of the image, and the image is strongly motion-blurred in the direction in which the motion amount is large. By analyzing the power of the spatio-temporal frequency bands, they estimate the approximate motion amount and motion direction of the image. From this estimate, the block size and the horizontal and vertical extents of the motion search range are determined for each band according to the power of each frequency band of the image (which also determines the number of ranks for hierarchical motion estimation). Hierarchical motion estimation is then performed using the block size and motion search range of each layer. Because the block size and the horizontal and vertical extents of the search range differ from layer to layer (equivalent to performing motion estimation with blocks and search ranges of various sizes), motion can be estimated more accurately than with conventional methods. Furthermore, by taking diagonal as well as horizontal and vertical motion directions into account when determining the block size and search range, motion estimation that matches the amount and direction of motion blur can be performed with even higher accuracy.

That is, the motion estimation device of the present invention performs motion estimation of a moving image and comprises: a time direction frequency decomposition unit that calculates the power of each frequency band in the time direction for a frame image sequence of a plurality of frames; a spatial decomposition rank determination unit that compares the power value of each frequency band in the time direction with a predetermined first threshold and determines the rank whose power exceeds the first threshold as the spatial decomposition rank value of the spatial frequency; a spatial direction frequency decomposition unit that performs octave decomposition on the motion estimation target frame based on the spatial decomposition rank value and calculates, for each rank corresponding to the spatial decomposition rank value, the power value of each frequency band in the horizontal, vertical, and diagonal directions; an image analysis unit that compares those power values with a predetermined second threshold, determines the rank at which all of the horizontal, vertical, and diagonal power values fall below the second threshold as the motion detection start rank, and determines, for each layer of the motion detection start rank, a block size and a motion search range according to the horizontal, vertical, and diagonal power values; and a motion estimation unit that performs hierarchical motion estimation on the motion estimation target frame based on the block size and motion search range determined for each rank of the motion detection start rank.
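The second-threshold test performed by the image analysis unit can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say which rank is chosen when several satisfy the condition, so the sketch takes the lowest one, and it falls back to the deepest rank when none does. The function name, the dictionary layout, and the fallback are all hypothetical.

```python
def motion_detection_start_rank(band_powers, th_s):
    """band_powers[i] holds the H, V, D power values of rank i + 1.

    Returns the lowest rank at which all three directional powers are
    below the second threshold th_s (assumed tie-break rule). If no rank
    qualifies, the deepest rank is returned (assumed fallback).
    """
    for rank, bands in enumerate(band_powers, start=1):
        if all(bands[k] < th_s for k in ("H", "V", "D")):
            return rank
    return len(band_powers)


powers = [
    {"H": 0.50, "V": 0.02, "D": 0.01},  # rank 1: horizontal band above threshold
    {"H": 0.05, "V": 0.01, "D": 0.00},  # rank 2: every band below threshold
]
print(motion_detection_start_rank(powers, th_s=0.1))
```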

In the motion estimation device of the present invention, the image analysis unit includes means for calculating the ratio of power in each of the horizontal, vertical, and diagonal frequency bands at each layer of the motion detection start rank, calculating from it a multiplier relative to the reference motion detection block size and motion search range for each layer, and multiplying the reference block size and motion search range by that multiplier to determine the block size and motion search range according to the horizontal, vertical, and diagonal power values.

In the motion estimation device of the present invention, the motion estimation unit includes means for determining the final motion vector, when hierarchical motion estimation is performed and no block exists at the same position between layers, by a weighted average according to the area ratio of the blocks between the layers.
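One way to read the area-ratio weighted average is as an overlap-weighted mean of the motion vectors of the blocks in another layer that cover the target block. The sketch below illustrates that reading only; the rectangle representation, the function names, and the zero-vector fallback are assumptions and are not taken from the publication.

```python
def overlap_area(a, b):
    """Overlap area of two rectangles given as (x, y, width, height)."""
    w = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    h = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)


def area_weighted_vector(target, blocks):
    """Average the vectors of `blocks`, weighted by their overlap with `target`.

    blocks is a list of (rectangle, (vx, vy)) pairs from another layer.
    Returns (0.0, 0.0) when nothing overlaps (assumed fallback).
    """
    weighted = [(overlap_area(target, rect), vec) for rect, vec in blocks]
    total = sum(w for w, _ in weighted)
    if total == 0:
        return (0.0, 0.0)
    vx = sum(w * v[0] for w, v in weighted) / total
    vy = sum(w * v[1] for w, v in weighted) / total
    return (vx, vy)


# A 4x4 target block straddling two 4x4 blocks of another layer.
other_layer = [((0, 0, 4, 4), (4, 0)), ((4, 0, 4, 4), (0, 4))]
print(area_weighted_vector((3, 0, 4, 4), other_layer))
```

In the usage above the target overlaps the first block by a quarter of its area and the second by three quarters, so the second block's vector dominates the result.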

Further, the present invention is configured as a program that causes a computer, configured as a motion estimation device that performs motion estimation of a moving image, to execute: a step of calculating the power of each frequency band in the time direction for a frame image sequence of a plurality of frames; a step of comparing the power value of each frequency band in the time direction with a predetermined first threshold and determining the rank whose power exceeds the first threshold as the spatial decomposition rank value of the spatial frequency; a step of performing octave decomposition on the motion estimation target frame based on the spatial decomposition rank value and calculating, for each rank corresponding to the spatial decomposition rank value, the power value of each frequency band in the horizontal, vertical, and diagonal directions; a step of comparing those power values with a predetermined second threshold, determining the rank at which all of the horizontal, vertical, and diagonal power values fall below the second threshold as the motion detection start rank, and determining, for each layer of the motion detection start rank, a block size and a motion search range according to the horizontal, vertical, and diagonal power values; and a step of performing hierarchical motion estimation on the motion estimation target frame based on the block size and motion search range determined for each rank of the motion detection start rank.

According to the present invention, motion estimation for a moving image can not only start from an appropriately estimated block size, but also estimates, for each band, the approximate motion amount, motion direction (horizontal, vertical, or diagonal), and moving region of the image from the spectral power in the spatio-temporal directions, and determines the block size and motion search range accordingly. Performing motion estimation hierarchically with a block size and search range that differ by band reduces the computational cost of noise-robust, high-accuracy motion estimation and enables highly accurate motion estimation that matches the amount and direction of motion blur.

FIG. 1 is a schematic diagram of a motion estimation device according to one embodiment of the present invention.
FIG. 2 is a schematic diagram of the time direction frequency decomposition unit in the motion estimation device of the embodiment.
FIG. 3 is a schematic diagram of the spatial decomposition rank determination unit in the motion estimation device of the embodiment.
FIG. 4 is a schematic diagram of the spatial direction frequency decomposition unit in the motion estimation device of the embodiment.
FIG. 5 is a schematic diagram of the image analysis unit in the motion estimation device of the embodiment.
FIG. 6 is a schematic diagram of the motion estimation unit in the motion estimation device of the embodiment.
FIG. 7 is a flowchart showing the operation of the motion estimation device of the embodiment.
FIG. 8 is a diagram showing a frame image sequence in the motion estimation device of the embodiment.
FIG. 9 is a diagram explaining the operation of the time direction frequency decomposition unit in the motion estimation device of the embodiment.
FIGS. 10(a) and 10(b) are diagrams explaining the two-dimensional second-rank discrete wavelet decomposition used in the motion estimation device of the embodiment.
FIGS. 11(a) to 11(d) are diagrams explaining the operation of the motion detection unit in the motion estimation device of the embodiment.
FIGS. 12(a) to 12(d) are diagrams explaining the block sizes used by the motion detection unit in the motion estimation device of the embodiment.
FIG. 13 is a diagram explaining the block matching method at fractional pixel positions by quadratic function approximation used in the motion estimation device of the embodiment.
FIG. 14 is a diagram explaining how the motion detection unit in the motion estimation device of the embodiment calculates a motion vector as a weighted average by area ratio.
FIGS. 15(a) and 15(b) are diagrams showing how the amount of motion blur per pixel in a moving region changes with the video format.

A motion estimation device according to one embodiment of the present invention is described below.

For the motion estimation device 1 of this embodiment, the case where octave decomposition by wavelet transform is used for frequency analysis in the time and spatial directions is described. Other orthogonal transforms or the FFT (fast Fourier transform) can be used instead of the wavelet transform. However, considering that the frequency analysis in the time and spatial directions repeats the same processing as the image resolution is progressively reduced, octave decomposition by wavelet transform is particularly advantageous for processing efficiency.

[Device configuration]
FIG. 1 shows a motion estimation device 1 according to one embodiment of the present invention. The motion estimation device 1 of this embodiment includes a time direction frequency decomposition unit 11, a spatial decomposition rank determination unit 12, a spatial direction frequency decomposition unit 13, an image analysis unit 14, and a motion estimation unit 15. The image data that each component needs for its processing can be stored in, and read from, a storage unit (not shown) included in the motion estimation device 1 as appropriate.

The time direction frequency decomposition unit 11 receives a frame image sequence F(t_0), ..., F(t_C), ..., F(t_R), ..., F(t_m) of a plurality of frames at times t = t_0 ... t_m, which includes the base frame F(t_C) for which motion vectors are detected and the reference frame F(t_R) used for motion search. For all pixels of the base frame F(t_C), it decomposes the plurality of frames in the time direction into frequency bands up to a predetermined maximum rank, calculates the power of each time-direction frequency band over all pixels, and normalizes it by the power value over all pixels of the plurality of frames. It sends the resulting power value P_T(n_t) of each time-direction frequency band of rank n_t to the spatial decomposition rank determination unit 12.

For example, as shown in FIG. 2, the time direction frequency decomposition unit 11 can consist of a time-direction one-dimensional Nmax-rank discrete wavelet decomposition processing unit 111 and a time-direction per-band power calculation unit 112. Unit 111 receives the frame image sequence F(t_0), ..., F(t_C), ..., F(t_R), ..., F(t_m) at times t = t_0 ... t_m, which includes the base frame F(t_C) and the reference frame F(t_R), and performs a discrete wavelet decomposition of a predetermined Nmax ranks (for example, four ranks) in the time direction for all pixels of the base frame F(t_C). Unit 112 calculates the power of each time-direction frequency band over all pixels of the base frame F(t_C), normalizes it by the power value over all pixels of the plurality of frames, and sends the resulting power value P_T(n_t) of each time-direction frequency band of rank n_t to the spatial decomposition rank determination unit 12.
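The per-pixel temporal decomposition and band-power calculation can be sketched as below. The publication does not name the wavelet, so an orthonormal Haar wavelet is assumed here; normalizing by the total signal energy of all frames is one reading of the normalization step, and treating rank 1 as the finest detail band follows the usual wavelet convention rather than anything stated in the text. All function names are hypothetical.

```python
import math


def haar_step(signal):
    """One orthonormal Haar analysis step: returns (approximation, detail)."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def temporal_band_power(frames, n_max):
    """Normalized detail-band power per rank, summed over all pixels.

    frames: list of equally sized 2-D lists; its length should be a
    multiple of 2**n_max. Index 0 of the result is rank 1 (finest band).
    """
    height, width = len(frames[0]), len(frames[0][0])
    power = [0.0] * n_max
    total = 0.0
    for y in range(height):
        for x in range(width):
            series = [float(f[y][x]) for f in frames]
            total += sum(v * v for v in series)
            approx = series
            for rank in range(n_max):
                approx, detail = haar_step(approx)
                power[rank] += sum(d * d for d in detail)
    if total == 0.0:
        return power
    return [p / total for p in power]


# A single pixel that flickers every frame puts its detail power in rank 1.
flicker = [[[0]], [[1]], [[0]], [[1]]]
print(temporal_band_power(flicker, n_max=2))
```

A static sequence yields zero power in every detail band, which matches the intuition that only moving regions contribute temporal high-frequency power.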

The spatial decomposition rank determination unit 12 compares the power value P_T(n_t) of each time-direction frequency band of rank n_t, calculated by the time direction frequency decomposition unit 11, with a predetermined threshold Th_t. The threshold is used to judge that the larger the share of power in the temporal high-frequency region, the larger the moving-region area and the motion amount. The unit determines the rank whose power exceeds Th_t (the power shares are partitioned by rank) as the spatial-frequency decomposition rank Ns (hereinafter also called the "spatial decomposition rank value") corresponding to the share of power in the temporal high-frequency region, and sends the determined Ns to the spatial direction frequency decomposition unit 13. The larger the rank n_t of the time-direction band division, the larger the moving-region area and the motion amount can be judged to be.

For example, as shown in FIG. 3, the spatial decomposition rank determination unit 12 can consist of a comparison unit 121 and a spatial decomposition rank value determination unit 122. The comparison unit 121 receives and compares the power value P_T(n_t) of each time-direction frequency band of rank n_t, calculated by the time direction frequency decomposition unit 11, and the threshold Th_t. The spatial decomposition rank value determination unit 122 takes the rank whose power exceeds Th_t, as found by the comparison unit 121, as the spatial decomposition rank value Ns corresponding to the share of power in the temporal high-frequency region, and sends Ns to the spatial direction frequency decomposition unit 13.
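The comparison against Th_t can be sketched in a few lines. The text does not state which rank is chosen when several exceed the threshold; the sketch assumes the highest such rank, and it assumes a default of 1 when none does. The function name and both rules are hypothetical.

```python
def spatial_decomposition_rank(p_t, th_t, default=1):
    """Pick Ns from the normalized temporal band powers.

    p_t[i] is the power P_T of rank i + 1. The highest rank whose power
    exceeds th_t is returned (assumed rule); `default` is returned when
    no rank exceeds the threshold (assumed fallback).
    """
    above = [rank for rank, p in enumerate(p_t, start=1) if p > th_t]
    return max(above) if above else default


print(spatial_decomposition_rank([0.40, 0.20, 0.05, 0.01], th_t=0.1))
```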

The spatial direction frequency decomposition unit 13 receives the base frame F(t_C) and the reference frame F(t_R) and, based on the spatial decomposition rank value Ns determined by the spatial decomposition rank determination unit 12, performs a spatial Ns-rank discrete wavelet decomposition on all pixels of each of F(t_C) and F(t_R). For each of the two frames, it calculates the power values P_H(n_s), P_V(n_s), and P_D(n_s), averaged and normalized for each of the horizontal, vertical, and diagonal frequency bands of rank n_s corresponding to Ns, and sends them to the image analysis unit 14.

尚、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の双方に対して空間Ｎｓ階離散ウェーブレット分解を実行することは、元画像に対して可変のブロックサイズ及び探索範囲の大きさとする階層型動き推定を行うことができる点で有利であり、特に、動き推定装置を階層的に行うための分解能の決定のためには、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）のうちの空間方向の低周波領域のパワーの割合が大きいほうを選定するのが好適となる。以下の説明では、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の双方について空間Ｎｓ階離散ウェーブレット分解を行う例を説明する。 Note that performing the Ns-level spatial discrete wavelet decomposition on both the base frame F(t_C) and the reference frame F(t_R) is advantageous because it allows hierarchical motion estimation with a block size and search range that vary relative to the original image. In particular, to determine the resolution at which motion estimation is performed hierarchically, it is preferable to select whichever of F(t_C) and F(t_R) has the larger proportion of power in the spatial low-frequency region. The following description covers an example in which the Ns-level spatial discrete wavelet decomposition is performed on both F(t_C) and F(t_R).

例えば、図４に示すように、空間方向周波数分解部１３は、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）を入力し、空間分解階数決定部１２によって決定した空間分解階数値Ｎｓに基づいて、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の各々の全画素に対して空間Ｎｓ階離散ウェーブレット分解を行う空間方向２次元Ｎｓ階離散ウェーブレット分解処理部１３１と、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の各々における空間分解階数値Ｎｓに対応するｎ_ｓ階の水平、垂直、斜め方向の周波数帯域毎に平均して正規化したパワー値Ｐ_Ｈ（ｎ_ｓ），Ｐ_Ｖ（ｎ_ｓ），Ｐ_Ｄ（ｎ_ｓ）を算出して画像解析部１４に送出する水平・垂直・斜め方向周波数帯域別パワー算出部１３２から構成することができる。 For example, as shown in FIG. 4, the spatial direction frequency decomposition unit 13 can be composed of two parts. The spatial two-dimensional Ns-level discrete wavelet decomposition processing unit 131 receives the base frame F(t_C) and the reference frame F(t_R) and, based on the spatial decomposition rank value Ns determined by the spatial decomposition rank determination unit 12, performs an Ns-level spatial discrete wavelet decomposition on all pixels of each frame. The horizontal/vertical/diagonal per-band power calculation unit 132 calculates, for each frame, the power values P_H(n_s), P_V(n_s), and P_D(n_s), averaged and normalized per frequency band in the horizontal, vertical, and diagonal directions at each rank n_s corresponding to Ns, and sends them to the image analysis unit 14.
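The per-band power calculation of units 131 and 132 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a Haar wavelet (the text does not name the wavelet), images given as lists of lists whose sides are divisible by 2^Ns, and a mean-of-squares definition of "averaged and normalized" power. The function names and the labeling of the detail bands as horizontal, vertical, and diagonal are also assumptions.

```python
def haar_step_2d(img):
    """One level of a 2-D Haar decomposition (assumed wavelet).

    Returns (LL, H, V, D), each half the size of img.
    H responds to variation between columns, V to variation between rows.
    """
    ll, h, v, d = [], [], [], []
    for r in range(0, len(img), 2):
        ll_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            e, f = img[r + 1][c], img[r + 1][c + 1]
            ll_row.append((a + b + e + f) / 4.0)
            h_row.append((a - b + e - f) / 4.0)
            v_row.append((a + b - e - f) / 4.0)
            d_row.append((a - b - e + f) / 4.0)
        ll.append(ll_row)
        h.append(h_row)
        v.append(v_row)
        d.append(d_row)
    return ll, h, v, d


def mean_power(band):
    """Mean of squared coefficients, used here as the normalized band power."""
    count = sum(len(row) for row in band)
    return sum(x * x for row in band for x in row) / count


def subband_powers(img, ns):
    """Return ({n_s: (P_H, P_V, P_D)}, LL^Ns) for an Ns-level decomposition."""
    powers = {}
    ll = img
    for level in range(1, ns + 1):
        ll, h, v, d = haar_step_2d(ll)
        powers[level] = (mean_power(h), mean_power(v), mean_power(d))
    return powers, ll
```

For an image of vertical stripes, all detail power lands in the band labeled H at the first level, which matches the intuition that the stripes vary only between columns.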

画像解析部１４は、空間方向周波数分解部１３によって算出したｎ_ｓ階の水平、垂直、斜め方向の周波数帯域毎のパワー値Ｐ_Ｈ（ｎ_ｓ），Ｐ_Ｖ（ｎ_ｓ），Ｐ_Ｄ（ｎ_ｓ）と、空間方向の低周波領域のパワーの割合が大きいほど（空間方向の高周波領域のパワーの割合が小さいほど）動領域面積が大きく、且つ動き量が大きいと判断するための予め定めた閾値Ｔｈ_ｓとを入力して比較し、水平、垂直、斜め方向の周波数帯域毎のパワー値Ｐ_Ｈ（ｎ_ｓ），Ｐ_Ｖ（ｎ_ｓ），Ｐ_Ｄ（ｎ_ｓ）の全てが閾値Ｔｈ_ｓを下回るパワーとなる階数ｎ_ｓ（尚、パワーの割合は階数によって区分される）を、動き検出を開始する階数（以下、「動き検出開始階数」と称する）として決定し、動き検出開始階数ｎ_ｓにおける各ｎ_ｓ階における水平、垂直、斜め方向の周波数帯域毎のパワーの割合Ｐ_ＨＲ（ｎ_ｓ），Ｐ_ＶＲ（ｎ_ｓ），Ｐ_ＤＲ（ｎ_ｓ）を算出して、動き検出開始階数ｎ_ｓにおける動き検出のブロックサイズ及び動き探索範囲の倍数α，βを算出し、倍数α，βによって動き検出開始階数ｎ_ｓにおける動き検出のブロックサイズ及び動き探索範囲を決定し、それぞれブロックサイズ情報及び動き探索範囲情報として動き検出部１５に送出する。動き検出開始階数ｎ_ｓの階層が小さいほど、動領域面積が大きく、且つ動き量が大きいと判断することができる。 The image analysis unit 14 receives the power values P_H(n_s), P_V(n_s), and P_D(n_s) for each frequency band in the horizontal, vertical, and diagonal directions at rank n_s, calculated by the spatial direction frequency decomposition unit 13, and compares them with a predetermined threshold Th_s. The threshold Th_s encodes the judgment that the larger the proportion of power in the spatial low-frequency region (that is, the smaller the proportion in the spatial high-frequency region), the larger the moving-region area and the amount of motion. The rank n_s at which all of P_H(n_s), P_V(n_s), and P_D(n_s) fall below Th_s (the power proportions being partitioned by rank) is determined as the rank at which motion detection starts (hereinafter, the "motion detection start rank"). The unit then calculates the power ratios P_HR(n_s), P_VR(n_s), and P_DR(n_s) for each frequency band in the horizontal, vertical, and diagonal directions at each rank n_s associated with the motion detection start rank n_s, calculates the multiples α and β for the motion detection block size and motion search range at the start rank n_s, determines the block size and motion search range at that rank from α and β, and sends them to the motion detection unit 15 as block size information and motion search range information, respectively. The smaller the layer at the motion detection start rank n_s, the larger the moving-region area and the amount of motion can be judged to be.

例えば、図５に示すように、画像解析部１４は、空間方向周波数分解部１３によって算出したｎ_ｓ階の水平、垂直、斜め方向の周波数帯域毎のパワー値Ｐ_Ｈ（ｎ_ｓ），Ｐ_Ｖ（ｎ_ｓ），Ｐ_Ｄ（ｎ_ｓ）と、予め定めた閾値Ｔｈ_ｓとを入力して比較する比較部１４１と、比較部１４１の比較によって水平、垂直、斜め方向の周波数帯域毎のパワー値Ｐ_Ｈ（ｎ_ｓ），Ｐ_Ｖ（ｎ_ｓ），Ｐ_Ｄ（ｎ_ｓ）の全てが閾値Ｔｈ_ｓを下回るパワーとなる動き検出開始階数ｎ_ｓを決定し、動き検出開始階数ｎ_ｓにおける各ｎ_ｓ階における水平、垂直、斜め方向の周波数帯域毎のパワーの割合Ｐ_ＨＲ（ｎ_ｓ），Ｐ_ＶＲ（ｎ_ｓ），Ｐ_ＤＲ（ｎ_ｓ）を算出する水平・垂直・斜め方向パワー割合算出部１４２と、動き検出開始階数ｎ_ｓにおける動き検出のブロックサイズ及び動き探索範囲の倍数α，βを算出する水平・垂直方向倍数算出部１４３と、倍数α，βによって動き検出開始階数ｎ_ｓにおける動き検出のブロックサイズ及び動き探索範囲を決定し、それぞれブロックサイズ情報及び動き探索範囲情報として動き検出部１５に送出するブロックサイズ・動き探索範囲決定部１４４として構成することができる。 For example, as shown in FIG. 5, the image analysis unit 14 can be composed of four parts. The comparison unit 141 receives the power values P_H(n_s), P_V(n_s), and P_D(n_s) for each frequency band in the horizontal, vertical, and diagonal directions at rank n_s, calculated by the spatial direction frequency decomposition unit 13, and compares them with the predetermined threshold Th_s. The horizontal/vertical/diagonal power ratio calculation unit 142 determines, from that comparison, the motion detection start rank n_s at which all of P_H(n_s), P_V(n_s), and P_D(n_s) fall below Th_s, and calculates the power ratios P_HR(n_s), P_VR(n_s), and P_DR(n_s) for each frequency band in the horizontal, vertical, and diagonal directions at each rank n_s associated with the start rank. The horizontal/vertical multiple calculation unit 143 calculates the multiples α and β for the motion detection block size and motion search range at the start rank n_s. The block size/motion search range determination unit 144 determines the block size and motion search range at the start rank n_s from α and β and sends them to the motion detection unit 15 as block size information and motion search range information, respectively.

動き検出開始階数ｎ_ｓにおける水平、垂直、斜め方向の周波数帯域毎のパワーの割合Ｐ_ＨＲ（ｎ_ｓ），Ｐ_ＶＲ（ｎ_ｓ），Ｐ_ＤＲ（ｎ_ｓ）は、以下の式から得られる。 The power ratios P _HR (n _s ), P _VR (n _s ), and P _DR (n _s ) for each frequency band in the horizontal, vertical, and diagonal directions in the motion detection start rank n _s are obtained from the following equations.

水平方向パワーの割合：
Ｐ_ＨＲ（ｎ_ｓ）＝Ｐ_Ｈ（ｎ_ｓ）／（Ｐ_Ｈ（ｎ_ｓ）＋Ｐ_Ｖ（ｎ_ｓ）＋Ｐ_Ｄ（ｎ_ｓ））
垂直方向パワーの割合：
Ｐ_ＶＲ（ｎ_ｓ）＝Ｐ_Ｖ（ｎ_ｓ）／（Ｐ_Ｈ（ｎ_ｓ）＋Ｐ_Ｖ（ｎ_ｓ）＋Ｐ_Ｄ（ｎ_ｓ））
斜め方向パワーの割合：
Ｐ_ＤＲ（ｎ_ｓ）＝Ｐ_Ｄ（ｎ_ｓ）／（Ｐ_Ｈ（ｎ_ｓ）＋Ｐ_Ｖ（ｎ_ｓ）＋Ｐ_Ｄ（ｎ_ｓ））
Horizontal power ratio:
P_HR(n_s) = P_H(n_s) / (P_H(n_s) + P_V(n_s) + P_D(n_s))
Vertical power ratio:
P_VR(n_s) = P_V(n_s) / (P_H(n_s) + P_V(n_s) + P_D(n_s))
Diagonal power ratio:
P_DR(n_s) = P_D(n_s) / (P_H(n_s) + P_V(n_s) + P_D(n_s))
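The three ratios above share one denominator, so they can be computed together. The sketch below assumes the band powers are already available as plain numbers; the function name and the all-zero fallback for a band with no detail power are illustrative assumptions, not part of the source.

```python
def direction_power_ratios(p_h, p_v, p_d):
    """Return (P_HR, P_VR, P_DR) for one rank n_s.

    Each ratio is the directional power divided by P_H + P_V + P_D.
    """
    total = p_h + p_v + p_d
    if total == 0:
        # Assumption: with no detail power at all, report zero ratios
        # instead of dividing by zero.
        return (0.0, 0.0, 0.0)
    return (p_h / total, p_v / total, p_d / total)
```

By construction the three ratios sum to 1 whenever the total power is non-zero.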

動き検出開始階数ｎ_ｓにおける動き検出のブロックサイズ及び動き探索範囲の倍数α，βは、以下の式から得られる。尚、ａは適宜設定可能な係数であるが、等分配するにはａ＝１／２とすることができる。 The motion detection block size and the motion search range multiples α and β in the motion detection start rank n _s are obtained from the following equations. Note that a is a coefficient that can be set as appropriate, but for equal distribution, a = 1/2.

水平方向の倍数：
α＝ａ×（Ｐ_ＶＲ（ｎ_ｓ）＋Ｐ_ＤＲ（ｎ_ｓ））
／（Ｐ_ＨＲ（ｎ_ｓ）＋Ｐ_ＶＲ（ｎ_ｓ）＋Ｐ_ＤＲ（ｎ_ｓ））
垂直方向の倍数：
β＝（１−ａ）×（Ｐ_ＨＲ（ｎ_ｓ）＋Ｐ_ＤＲ（ｎ_ｓ））
／（Ｐ_ＨＲ（ｎ_ｓ）＋Ｐ_ＶＲ（ｎ_ｓ）＋Ｐ_ＤＲ（ｎ_ｓ））
Horizontal multiple:
α = a × (P_VR(n_s) + P_DR(n_s)) / (P_HR(n_s) + P_VR(n_s) + P_DR(n_s))
Vertical multiple:
β = (1 − a) × (P_HR(n_s) + P_DR(n_s)) / (P_HR(n_s) + P_VR(n_s) + P_DR(n_s))

従って、倍数α，βは、斜め方向パワーの割合Ｐ_ＤＲ（ｎ_ｓ）を考慮して決定されるので、後述する高確度な動き推定に寄与することになる。 Therefore, the multiples α and β are determined in consideration of the oblique power ratio P _DR (n _s ), which contributes to highly accurate motion estimation described later.

動き検出開始階数ｎ_ｓにおける動き検出のブロックサイズ情報及び動き探索範囲情報は、以下の式から得られる。動き検出開始階数ｎ_ｓにおける予め定めた基準のブロックサイズをＢ_Ｘ（ｎ_ｓ）画素、垂直Ｂ_Ｙ（ｎ_ｓ）ラインとし、動き検出開始階数ｎ_ｓにおける予め定めた基準の動き探索範囲を水平Ｓ_Ｘ（ｎ_ｓ）画素、垂直ＳＳ_Ｙ（ｎ_ｓ）ラインとする。 The block size information and motion search range information for motion detection at the motion detection start rank n_s are obtained from the following expressions. Let the predetermined reference block size at the start rank n_s be B_X(n_s) pixels horizontally by B_Y(n_s) lines vertically, and let the predetermined reference motion search range at the start rank n_s be S_X(n_s) pixels horizontally by S_Y(n_s) lines vertically.

ブロックサイズ情報：水平αＢ_Ｘ（ｎ_ｓ）画素、垂直βＢ_Ｙ（ｎ_ｓ）ライン
動き探索範囲情報：水平αＳ_Ｘ（ｎ_ｓ）画素、垂直βＳ_Ｙ（ｎ_ｓ）ライン
Block size information: horizontal αB_X(n_s) pixels, vertical βB_Y(n_s) lines
Motion search range information: horizontal αS_X(n_s) pixels, vertical βS_Y(n_s) lines
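As a worked illustration of the multiples α and β and the scaled sizes, the following sketch applies the expressions for α, β, the block size, and the search range directly. The function names are hypothetical, and the results are left as floating-point values because the text does not say how they are rounded to whole pixels or lines.

```python
def multiples(p_hr, p_vr, p_dr, a=0.5):
    """Return (alpha, beta) from the directional power ratios.

    a is the freely chosen coefficient; a = 1/2 distributes equally.
    """
    total = p_hr + p_vr + p_dr
    alpha = a * (p_vr + p_dr) / total
    beta = (1.0 - a) * (p_hr + p_dr) / total
    return alpha, beta


def scaled_sizes(alpha, beta, b_x, b_y, s_x, s_y):
    """Scale the reference block size and search range by alpha and beta."""
    return {
        "block": (alpha * b_x, beta * b_y),    # horizontal pixels, vertical lines
        "search": (alpha * s_x, beta * s_y),   # horizontal pixels, vertical lines
    }
```

With ratios (0.5, 0.25, 0.25) and a = 1/2, α = 0.25 and β = 0.375, so a 16 × 16 reference block becomes 4 × 6 and an 8 × 8 reference search range becomes 2 × 3. A band dominated by horizontal power therefore shrinks the horizontal extent more than the vertical one.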

動き推定部１５は、動き検出開始階数ｎ_ｓに応じた空間方向に低周波領域の基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の画像を得るために、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の空間方向２次元Ｎｓ階離散ウェーブレット分解したデータに対して、動き検出開始階数ｎ_ｓに応じた空間ｎ_ｓ階ウェーブレットの再構成を行い、画像解析部１４から得られるブロックサイズ情報及び動き探索範囲情報に従う大きさで動き推定を実行し、続いて空間ｎ_ｓ−１階ウェーブレットの再構成を行い、当該ブロックサイズ情報及び動き探索範囲情報に従う大きさで動き推定を再度実行し、最上位の階層（即ち、元の画像レベル）にて当該ブロックサイズ情報及び動き探索範囲情報に従う大きさで動き推定を行うまで階数をデクリメントして繰り返す。この動き推定部１５の動作は、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）を入力し、動き検出開始階数ｎ_ｓに基づいて、基準フレームＦ（ｔ_Ｃ）に対して順次ブロックサイズ及び探索範囲の大きさを縮小しながら動き推定を行うことと類似した処理となる。ただし、空間ｎ_ｓ階ウェーブレット分解及び再構成を経て順次繰り返すことによる動き推定部１５によれば、階層に応じて順次水平、垂直、斜め方向の周波数帯域毎のパワー値Ｐ_Ｈ（ｎ_ｓ），Ｐ_Ｖ（ｎ_ｓ），Ｐ_Ｄ（ｎ_ｓ）に基づいて決定したブロックサイズ情報及び動き探索範囲情報に従って動き推定を行うため、画像シーンに応じた動き検出開始階数ｎ_ｓに応じた動き推定を高確度に行うことができ、高精度化が期待できる。 To obtain images of the base frame F(t_C) and the reference frame F(t_R) restricted to the spatial low-frequency region corresponding to the motion detection start rank n_s, the motion estimation unit 15 performs an n_s-level spatial wavelet reconstruction, matched to the start rank n_s, on the two-dimensional Ns-level spatial discrete wavelet decomposition data of F(t_C) and F(t_R), and runs motion estimation with the sizes given by the block size information and motion search range information obtained from the image analysis unit 14. It then performs the (n_s − 1)-level spatial wavelet reconstruction and runs motion estimation again with the sizes given by that information, decrementing the rank and repeating until motion estimation has been performed at the top layer (that is, the original image level). This operation resembles receiving F(t_C) and F(t_R) and, starting from the motion detection start rank n_s, performing motion estimation on the base frame F(t_C) while successively reducing the block size and search range. However, because the motion estimation unit 15 iterates through n_s-level spatial wavelet decomposition and reconstruction, motion estimation at each layer follows block size information and motion search range information determined from the power values P_H(n_s), P_V(n_s), and P_D(n_s) of the horizontal, vertical, and diagonal frequency bands. Motion estimation matched to the start rank n_s for the image scene can therefore be performed with high reliability, and higher accuracy can be expected.

例えば、図６に示すように、動き推定部１５は、動き検出開始階数ｎ_ｓに応じた空間方向に低周波領域の基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の画像を得るために動き検出開始階数ｎ_ｓに応じた空間ｎ_ｓ階ウェーブレットの再構成を行い、画像解析部１４から得られるブロックサイズ情報及び動き探索範囲情報に従う大きさで小数画素精度のブロックマッチングによる動き推定を行う階層型動き推定部１５１と、この動き推定の処理を最上位の階数に対応する元の画像レベルとなるまで階数をデクリメントして繰り返すために、空間方向に１階上位のウェーブレット再構成を実行した画像を階層型動き推定部１５１に送出する空間１階ウェーブレット再構成部１５２から構成することができる。従って、階層型動き推定部１５１は、空間１階ウェーブレット再構成部１５２から得られる基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の再構成画像を用いて、動き推定処理を階層的に繰り返し、最終的な動き推定情報（例えば、動きベクトル）を決定して出力することができる。 For example, as illustrated in FIG. 6, the motion estimation unit 15 obtains images of the reference frame F (t _C ) and the reference frame F (t _R ) in the low frequency region in the spatial direction according to the motion detection start rank n _s. reconfigures space n _s floor wavelet corresponding to the movement detection start rank n _s for motion estimation using block matching of sub-pixel accuracy in size according the block size information and the motion estimation range information obtained from the image analysis section 14 In order to repeat the motion estimation process by decrementing the rank until the original image level corresponding to the highest rank is reached, the first-level wavelet reconstruction is performed in the spatial direction. A spatial first-floor wavelet reconstruction unit 152 that sends the executed image to the hierarchical motion estimation unit 151 can be configured. Therefore, the hierarchical motion estimation unit 151 performs hierarchical motion estimation processing using the reconstructed images of the base frame F (t _C ) and the reference frame F (t _R ) obtained from the spatial first-order wavelet reconstruction unit 152. The final motion estimation information (for example, motion vector) can be determined and output repeatedly.

以下、本発明による一実施例の動き推定装置１の動作について更に詳細に説明する。 Hereinafter, the operation of the motion estimation apparatus 1 according to an embodiment of the present invention will be described in more detail.

[装置動作]
図７は、本発明による一実施例の動き推定装置の動作を示す動作フローである。 [Device operation]
FIG. 7 is an operation flow showing the operation of the motion estimation apparatus of one embodiment according to the present invention.

ステップＳ１にて、動き推定装置１は、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）を含む、時刻ｔ＝ｔ_０・・・ｔ_ｍにおけるフレーム画像列Ｆ（ｔ_０），・・・，Ｆ（ｔ_Ｃ），・・・，Ｆ（ｔ_Ｒ），・・・，Ｆ（ｔ_ｍ）を入力して、動き推定装置１が備える記憶部（図示せず）に適宜読み出し可能に格納する。 In step S1, the motion estimation device 1 receives the frame image sequence F(t_0), ..., F(t_C), ..., F(t_R), ..., F(t_m) at times t = t_0, ..., t_m, which includes the base frame F(t_C) and the reference frame F(t_R), and stores it in a storage unit (not shown) of the motion estimation device 1 so that it can be read out as needed.

ステップＳ２にて、時間方向周波数分解部１１により、フレーム画像列Ｆ（ｔ_０），・・・，Ｆ（ｔ_Ｃ），・・・，Ｆ（ｔ_Ｒ），・・・，Ｆ（ｔ_ｍ）を入力し、基準フレームＦ（ｔ_Ｃ）における全画素について時間方向に予め規定した最大階数の周波領域に分解した後、全画素における時間方向の周波数帯域毎のパワーを算出する。 In step S2, the time direction frequency decomposition unit 11 receives the frame image sequence F(t_0), ..., F(t_C), ..., F(t_R), ..., F(t_m), decomposes all pixels of the base frame F(t_C) in the time direction into frequency regions up to a predefined maximum rank, and then calculates the power of each temporal frequency band over all pixels.

例えば、図８に示すように、フレーム画像列Ｆ（ｔ_０），・・・，Ｆ（ｔ_Ｃ），・・・，Ｆ（ｔ_Ｒ），・・・，Ｆ（ｔ_ｍ）における或る画素Ｒ（ｋ，ｌ）について、時間方向に予め規定した最大階数（Ｎｍａｘ）の周波領域に分解した後、全画素における時間方向周波数帯域毎のパワーを算出することができる。例えば、図９に示すように、１６フレームのフレーム画像列Ｆ（ｔ）を時間方向にＮｍａｘ階に分解するとすれば、Ｎｍａｘ＝１では、低周波領域Ｌ^１及び高周波領域Ｈ^１として分割することができ（図９（ａ）参照）、Ｎｍａｘ＝２では、低周波領域Ｌ^２及び高周波領域Ｈ^１，Ｈ^２として分割することができ（図９（ｂ）参照）、Ｎｍａｘ＝３では、低周波領域Ｌ^３及び高周波領域Ｈ^１，Ｈ^２，Ｈ^３として分割することができ（図９（ｃ）参照）、Ｎｍａｘ＝４では、低周波領域Ｌ^４及び高周波領域Ｈ^１，Ｈ^２，Ｈ^３，Ｈ^４として分割することができる（図９（ｄ）参照）。 For example, as shown in FIG. 8, a given pixel R(k, l) in the frame image sequence F(t_0), ..., F(t_C), ..., F(t_R), ..., F(t_m) is decomposed in the time direction into frequency regions up to a predefined maximum rank (Nmax), after which the power of each temporal frequency band over all pixels can be calculated. For example, as shown in FIG. 9, when a 16-frame image sequence F(t) is decomposed into Nmax ranks in the time direction, Nmax = 1 yields the low-frequency region L^1 and the high-frequency region H^1 (see FIG. 9(a)); Nmax = 2 yields the low-frequency region L^2 and the high-frequency regions H^1 and H^2 (see FIG. 9(b)); Nmax = 3 yields the low-frequency region L^3 and the high-frequency regions H^1, H^2, and H^3 (see FIG. 9(c)); and Nmax = 4 yields the low-frequency region L^4 and the high-frequency regions H^1, H^2, H^3, and H^4 (see FIG. 9(d)).
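The octave split of FIG. 9 can be illustrated for the time series of a single pixel. The sketch below is an assumption-laden example rather than the source's method: it uses a Haar average/difference pair as the temporal filter, a series length divisible by 2^Nmax (16 frames for Nmax = 4), and mean-of-squares power. The function name and the band labels "H1" to "L4" are illustrative.

```python
def temporal_band_powers(samples, n_max):
    """Decompose one pixel's time series into L^n_max and H^1..H^n_max.

    Returns a dict mapping band labels such as "H1" or "L4" to the mean
    squared coefficient of that band.
    """
    low = list(samples)
    powers = {}
    for level in range(1, n_max + 1):
        pairs = range(0, len(low) - 1, 2)
        avg = [(low[i] + low[i + 1]) / 2.0 for i in pairs]
        diff = [(low[i] - low[i + 1]) / 2.0 for i in pairs]
        powers["H%d" % level] = sum(x * x for x in diff) / len(diff)
        low = avg  # only the low band is split again (octave decomposition)
    powers["L%d" % n_max] = sum(x * x for x in low) / len(low)
    return powers
```

A pixel that flickers every frame puts all of its power into H1, while a static pixel puts all of it into the final low band, which is the distinction the later threshold comparison relies on.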

ステップＳ３にて、空間分解階数決定部１２により、算出した時間方向の周波数帯域毎のパワーＰ_Ｔ（ｎ_ｔ）と、時間方向の高周波領域のパワーの割合が大きいほど動領域面積が大きく、且つ動き量が大きくなると判断するための予め定めた閾値Ｔｈ_ｔとを比較し、この閾値Ｔｈ_ｔを上回るパワーとなる階数を、時間方向の高周波領域のパワーの割合に応じた空間周波数の分解階数（空間分解階数値）Ｎｓとして決定する。 In step S3, the spatial decomposition rank determination unit 12 compares the calculated power P_T(n_t) of each temporal frequency band with a predetermined threshold Th_t, which encodes the judgment that the larger the proportion of power in the temporal high-frequency region, the larger the moving-region area and the amount of motion. The rank whose power exceeds Th_t is determined as the spatial-frequency decomposition rank (spatial decomposition rank value) Ns, which corresponds to the proportion of power in the temporal high-frequency region.

パワーＰ_Ｔ（ｎ_ｔ）と閾値Ｔｈ_ｔとの比較として構成する代わりに、例えば、表１に示すように、時間方向の高周波領域のパワーの割合と空間周波数の分解階数Ｎｓとの間で規定されるテーブルを予め保持しておき、パワーＰ_Ｔ（ｎ_ｔ）から空間周波数の分解階数Ｎｓを求めるように構成することもできる。 Instead of configuring as a comparison between the power P _T (n _t ) and the threshold Th _t , for example, as shown in Table 1, it is defined between the power ratio in the high-frequency region in the time direction and the decomposition rank Ns of the spatial frequency. It is also possible to store the table in advance and obtain the spatial frequency decomposition rank Ns from the power P _T (n _t ).
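The threshold variant of step S3 can be sketched as below. The text does not say which rank is chosen when several ranks exceed Th_t, nor what happens when none does, so this example makes two explicit assumptions: it takes the largest rank whose power exceeds the threshold, and it falls back to a default rank of 1 otherwise. The function name and the dictionary input are also hypothetical.

```python
def spatial_decomposition_rank(p_t, th_t, default=1):
    """Pick Ns from the temporal band powers P_T(n_t).

    p_t maps each temporal rank n_t to its power. Assumption: the largest
    rank whose power exceeds th_t is used; `default` is returned when no
    rank exceeds the threshold.
    """
    above = [rank for rank, power in p_t.items() if power > th_t]
    return max(above) if above else default
```

The table-based alternative described above would replace the comparison with a lookup from the high-frequency power proportion to Ns; Table 1 is not reproduced in this text, so no values are assumed for it here.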

ステップＳ４にて、空間方向周波数分解部１３により、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）を入力し、空間分解階数決定部１２によって決定した空間周波数の分解階数Ｎｓに基づいて、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）に対して空間Ｎｓ階離散ウェーブレット分解を実行し、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の各々における空間分解階数値Ｎｓに対応するｎ_ｓ階の水平、垂直、斜め方向の周波数帯域毎のパワー値Ｐ_Ｈ（ｎ_ｓ），Ｐ_Ｖ（ｎ_ｓ），Ｐ_Ｄ（ｎ_ｓ）を算出する。尚、動き検出開始階数ｎ_ｓの決定のために、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）における空間周波数帯域毎のパワーの割合の大きいほうを選定することや、基準フレームＦ（ｔ_Ｃ）又は参照フレームＦ（ｔ_Ｒ）における空間周波数帯域毎のパワーの割合のみを算出してもよい。 In step S4, the spatial direction frequency decomposition unit 13 receives the base frame F(t_C) and the reference frame F(t_R), performs an Ns-level spatial discrete wavelet decomposition on both frames based on the spatial-frequency decomposition rank Ns determined by the spatial decomposition rank determination unit 12, and calculates, for each frame, the power values P_H(n_s), P_V(n_s), and P_D(n_s) for each frequency band in the horizontal, vertical, and diagonal directions at each rank n_s corresponding to Ns. To determine the motion detection start rank n_s, whichever of F(t_C) and F(t_R) has the larger per-band power proportion may be selected, or the per-band power proportion may be calculated for only one of F(t_C) and F(t_R).

例えば、図１０（ａ）に示すように、基準フレームＦ（ｔ_Ｃ）の全画素に対して空間方向に２次元２階離散ウェーブレット分解を実行して、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の各々における空間分解階数値Ｎｓに対応するｎ_ｓ階の水平、垂直、斜め方向の周波数帯域毎のパワー値Ｐ_Ｈ（ｎ_ｓ），Ｐ_Ｖ（ｎ_ｓ），Ｐ_Ｄ（ｎ_ｓ）を算出することができる。また、図１０（ｂ）に示すように、基準フレームＦ（ｔ_Ｃ）の空間方向の低周波領域（例えば、ＬＬ^２）のみを抽出して基準フレームＦ（ｔ_Ｃ）の低周波領域のみの画像を再構成することができる。 For example, as shown in FIG. 10(a), a two-dimensional second-level discrete wavelet decomposition is performed in the spatial direction on all pixels of the base frame F(t_C), and the power values P_H(n_s), P_V(n_s), and P_D(n_s) for each frequency band in the horizontal, vertical, and diagonal directions at each rank n_s corresponding to Ns can be calculated for each of the base frame F(t_C) and the reference frame F(t_R). In addition, as shown in FIG. 10(b), only the spatial low-frequency region (for example, LL^2) of the base frame F(t_C) can be extracted to reconstruct an image containing only the low-frequency region of F(t_C).

続いて、画像解析部１４により、空間分解階数値Ｎｓに対応するｎ_ｓ階の水平、垂直、斜め方向の周波数帯域毎のパワー値Ｐ_Ｈ（ｎ_ｓ），Ｐ_Ｖ（ｎ_ｓ），Ｐ_Ｄ（ｎ_ｓ）と、空間方向の低周波領域のパワーの割合が大きいほど（空間方向の高周波領域のパワーの割合が小さいほど）動領域面積が大きく、且つ動き量が大きいと判断するための予め定めた閾値Ｔｈ_ｓとを入力して比較し、この比較によって得られる閾値Ｔｈ_ｓを下回るパワーとなる階数ｎ_ｓ（尚、パワーの割合は階数によって区分される）を、動き検出を開始する階数（以下、「動き検出開始階数」と称する）として決定し（ステップＳ４）、動き検出開始階数ｎ_ｓにおける各ｎ_ｓ階における水平、垂直、斜め方向の周波数帯域毎のパワーの割合Ｐ_ＨＲ（ｎ_ｓ），Ｐ_ＶＲ（ｎ_ｓ），Ｐ_ＤＲ（ｎ_ｓ）を算出して（ステップＳ５）、動き検出開始階数ｎ_ｓにおける動き検出のブロックサイズ及び動き探索範囲の倍数α，βを算出し、倍数α，βによって動き検出開始階数ｎ_ｓにおける動き検出のブロックサイズ及び動き探索範囲を決定する（ステップＳ６）。ただし、ｎ_ｓ≦Ｎｓである。 Subsequently, the image analysis unit 14 receives the power values P_H(n_s), P_V(n_s), and P_D(n_s) for each frequency band in the horizontal, vertical, and diagonal directions at rank n_s corresponding to the spatial decomposition rank value Ns, and compares them with a predetermined threshold Th_s. The threshold Th_s encodes the judgment that the larger the proportion of power in the spatial low-frequency region (that is, the smaller the proportion in the spatial high-frequency region), the larger the moving-region area and the amount of motion. The rank n_s whose power falls below Th_s in this comparison (the power proportions being partitioned by rank) is determined as the rank at which motion detection starts (hereinafter, the "motion detection start rank") (step S4). The power ratios P_HR(n_s), P_VR(n_s), and P_DR(n_s) for each frequency band in the horizontal, vertical, and diagonal directions at each rank n_s associated with the start rank are then calculated (step S5), the multiples α and β for the motion detection block size and motion search range at the start rank n_s are calculated, and the block size and motion search range at the start rank n_s are determined from α and β (step S6). Here, n_s ≤ Ns.

ｎ_ｓ階の水平、垂直、斜め方向の周波数帯域毎のパワー値Ｐ_Ｈ（ｎ_ｓ），Ｐ_Ｖ（ｎ_ｓ），Ｐ_Ｄ（ｎ_ｓ）と閾値Ｔｈ_ｓとの比較で動き検出開始階数ｎ_ｓを求めるように構成する代わりに、例えば、表２に示すように、ｎ_ｓ階の水平、垂直、斜め方向の周波数帯域毎のパワー値Ｐ_Ｈ（ｎ_ｓ），Ｐ_Ｖ（ｎ_ｓ），Ｐ_Ｄ（ｎ_ｓ）における低周波領域のパワーの割合と動き検出開始階数ｎ_ｓとの間で規定されるテーブルを予め保持しておき、低周波領域のパワーの割合を算出して動き検出開始階数ｎ_ｓを求めるように構成することもできる。尚、動き検出開始階数ｎ_ｓが大きくなるにつれて、元の画像が低解像度化することを意味しており、元の画像に対して相対的にブロックサイズ及び動き探索範囲の大きさが増大することを意味している。例えば、空間分解能１／１６，１／８，１／４，１／２とすれば、それぞれ（ブロックサイズ，動き探索範囲の大きさ）は、（１６×１６，水平・垂直１６画素），（８×８，水平・垂直８画素），（４×４，水平・垂直４画素），（２×２，水平・垂直２画素）などである。ここで、例えば、空間分解能１／１６は、元の画像における水平標本化周波数Ｈｓ及び垂直標本化周波数Ｖｓにおいて、１６画素を１画素として標本化する低解像度化を意味する。 Instead of determining the motion detection start rank n_s by comparing the power values P_H(n_s), P_V(n_s), and P_D(n_s) for each frequency band in the horizontal, vertical, and diagonal directions at rank n_s with the threshold Th_s, a table such as Table 2 can be held in advance. The table defines the relationship between the proportion of low-frequency power in P_H(n_s), P_V(n_s), and P_D(n_s) and the motion detection start rank n_s, so that n_s is obtained by calculating the proportion of low-frequency power. A larger motion detection start rank n_s means a lower-resolution version of the original image, and therefore a block size and motion search range that are larger relative to the original image. For example, for spatial resolutions of 1/16, 1/8, 1/4, and 1/2, the (block size, motion search range) pairs are (16 × 16, 16 pixels horizontally and vertically), (8 × 8, 8 pixels horizontally and vertically), (4 × 4, 4 pixels horizontally and vertically), and (2 × 2, 2 pixels horizontally and vertically), respectively. Here, a spatial resolution of 1/16, for example, means a resolution reduction in which 16 pixels are sampled as one pixel with respect to the horizontal sampling frequency Hs and the vertical sampling frequency Vs of the original image.

つまり、図１１に示すように、空間方向の低周波領域のパワーの割合によって、動き検出開始階数ｎ_ｓを関連付けることができる。例えば、Ｎｓ＝４のとき、低周波領域（ＬＬ^４）及び高周波領域（ＬＬ^４以外）のそれぞれのパワーを算出して、全体における低周波領域（ＬＬ^４）の割合が、９９．５％以上であれば、動き検出開始階数ｎ_ｓ＝４として４階層の低周波領域のみの画像を再構成することができる（図１１（ｄ）参照）。また、全体における低周波領域（ＬＬ^４）の割合が、９８．０％以上９９．５％未満であれば、動き検出開始階数ｎ_ｓ＝３として３階層の低周波領域（この場合、ＬＬ^３）のみの画像を再構成することができる（図１１（ｃ）参照）。同様に、全体における低周波領域（ＬＬ^４）の割合が、９５．０％以上９８．０％未満であれば、動き検出開始階数ｎ_ｓ＝２として２階層の低周波領域（この場合、ＬＬ^２）のみの画像を再構成することができ（図１１（ｂ）参照）、全体における低周波領域（ＬＬ^４）の割合が、９５．０％未満であれば、動き検出開始階数ｎ_ｓ＝１として１階層の低周波領域（この場合、ＬＬ^１）のみの画像を再構成することができる。 That is, as shown in FIG. 11, the motion detection start rank n _s can be associated with the ratio of the power in the low frequency region in the spatial direction. For example, when Ns = 4, the respective powers of the low frequency region (LL ⁴ ) and the high frequency region (other than LL ⁴ ) are calculated, and the ratio of the low frequency region (LL ⁴ ) in the whole is 99.5% or more. If so, it is possible to reconstruct an image of only the four-layer low-frequency region with the motion detection start rank n _s = 4 (see FIG. 11D). Further, if the ratio of the low frequency region (LL ⁴ ) in the whole is 98.0% or more and less than 99.5%, the motion detection start rank n _s = 3 is set as three layers of low frequency regions (in this case, LL ³ ) Image can be reconstructed (see FIG. 11C). Similarly, if the ratio of the low frequency region (LL ⁴ ) in the whole is 95.0% or more and less than 98.0%, the motion detection start rank n _s = 2 is set as two layers of low frequency regions (in this case, LL ² ) only images can be reconstructed (see FIG. 11B), and if the ratio of the low frequency region (LL ⁴ ) in the whole is less than 95.0%, the motion detection start rank n _s = 1, it is possible to reconstruct an image of only one layer of a low frequency region (in this case, LL ¹ ).
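The Ns = 4 example above amounts to a small lookup from the LL^4 power proportion to the start rank. The sketch below encodes exactly the percentages quoted in the text (99.5, 98.0, 95.0); the function name and the use of a percentage argument are illustrative assumptions, and other values of Ns would need their own table.

```python
def motion_detection_start_rank(ll_ratio_percent):
    """Map the proportion of LL^4 power (in percent) to the start rank n_s.

    Thresholds follow the Ns = 4 example in the text.
    """
    if ll_ratio_percent >= 99.5:
        return 4
    if ll_ratio_percent >= 98.0:
        return 3
    if ll_ratio_percent >= 95.0:
        return 2
    return 1
```

A scene whose power is almost entirely in LL^4 therefore starts estimation at the coarsest layer, while a detailed scene starts closer to the original resolution.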

ステップＳ７にて、動き推定部１５により、動き検出開始階数ｎ_ｓに基づいた基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の低周波画像に対して、基準フレームＦ（ｔ_Ｃ）の低周波画像を所定のブロックサイズに分割し、分割した各ブロックについて、所定の動き探索範囲の大きさで、小数画素精度のブロックマッチングによる動き推定を行う。 In step S7, the motion estimation unit 15 causes the base frame F (t _C ) to be applied to the low-frequency images of the base frame F (t _C ) and the reference frame F (t _R ) based on the motion detection start rank n _s. Are divided into predetermined block sizes, and motion estimation is performed for each of the divided blocks by block matching with decimal pixel accuracy within a predetermined motion search range.
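The block matching of step S7 can be sketched at integer-pixel precision as follows; the fractional-pixel refinement is a separate step and is not included here. This is a minimal exhaustive search using the sum of squared differences (SSD), with images given as lists of lists. The function names, the argument order, and the choice to skip candidates that fall outside the reference image are assumptions made for illustration.

```python
def block_ssd(cur, ref, bx, by, bw, bh, dx, dy):
    """SSD between the block at (bx, by) in cur and its shifted copy in ref."""
    total = 0
    for y in range(bh):
        for x in range(bw):
            diff = cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx]
            total += diff * diff
    return total


def match_block(cur, ref, bx, by, bw, bh, sx, sy):
    """Search displacements within +/-sx, +/-sy and return (dx, dy, ssd).

    Candidates that would leave the reference image are skipped
    (an assumption; the text does not describe border handling).
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-sy, sy + 1):
        for dx in range(-sx, sx + 1):
            inside_x = 0 <= bx + dx and bx + dx + bw <= width
            inside_y = 0 <= by + dy and by + dy + bh <= height
            if not (inside_x and inside_y):
                continue
            cost = block_ssd(cur, ref, bx, by, bw, bh, dx, dy)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]
```

In the device described here, bw, bh, sx, and sy would come from the block size information and motion search range information determined by the image analysis unit 14.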

ステップＳ８にて、動き推定部１５により、基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）に対してブロックサイズ及び動き探索範囲の大きさを縮小しながら動き推定装置を繰り返す効果を得るために、算出していた基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の空間ｎ_ｓ階離散ウェーブレット分解データに対して動き検出開始階数ｎ_ｓよりも上位の階数の画像となるように空間方向に１階上位のウェーブレット再構成を実行する。 In step S8, to obtain the effect of repeating motion estimation on the base frame F(t_C) and the reference frame F(t_R) while reducing the block size and motion search range, the motion estimation unit 15 performs a wavelet reconstruction one rank higher in the spatial direction on the previously calculated n_s-level spatial discrete wavelet decomposition data of F(t_C) and F(t_R), producing images of a rank above the motion detection start rank n_s.

動き推定部１５は、空間１階ウェーブレット再構成部１５２から得られる基準フレームＦ（ｔ_Ｃ）及び参照フレームＦ（ｔ_Ｒ）の再構成画像を用いて、最上位の階層（即ち、元の画像レベルにおける動き推定装置）となるまで順次階数をデクリメントして動き推定装置の処理を繰り返し（ステップＳ９）、最終的な動きベクトルを決定して動き推定情報を出力することができる（ステップＳ１０）。 Using the reconstructed images of the base frame F(t_C) and the reference frame F(t_R) obtained from the spatial one-level wavelet reconstruction unit 152, the motion estimation unit 15 decrements the rank step by step and repeats the motion estimation process until the top layer is reached (that is, motion estimation at the original image level) (step S9), then determines the final motion vectors and outputs the motion estimation information (step S10).

例えば、図１２（ａ）〜（ｄ）に示すように、動き検出開始階数ｎ_ｓが大きいほどブロックサイズが大きくなる様子を示しており、ブロックサイズが大きいほど参照フレームＦ（ｔ_Ｒ）における動き探索範囲の大きさも大きくなる。 For example, as shown in FIGS. 12A to 12D, the block size increases as the motion detection start rank n _s increases, and the motion in the reference frame F (t _R ) increases as the block size increases. The size of the search range is also increased.

つまり、基準フレームＦ（ｔ_Ｃ）のｎ_ｓ階低周波画像を水平及び垂直のブロックサイズ（Ｂｘ^ｎｓ，Ｂｙ^ｎｓ）の或るブロックＢ^ｎｓ（上添え字は、階級を示す）に分割し、参照フレームＦ（ｔ_Ｒ）の±Ｓｘ^ｎｓ，±Ｓｙ^ｎｓの範囲（例えば、±２ブロック）で探索し、各ブロックの動きベクトルｖ^ｎｓを算出する。次に、基準フレームＦ（ｔ_Ｃ）のｎ_ｓ−１階低周波画像を、ブロックサイズ（Ｂｘ^ｎｓ，Ｂｙ^ｎｓ）で分割し、参照フレームＦ（ｔ_Ｒ）のｎ_ｓ階低周波画像上の同じ位置から２×ｖ^ｎｓだけずらした場所を中心位置とする水平及び垂直画素数としてそれぞれ±Ｓｘ^ｎｓ，±Ｓｙ^ｎｓの範囲で探索し、得られた動きベクトルに２×ｖ^ｎｓをベクトル加算して、ｎ_ｓ−１階低周波画像における動きベクトルｖ^ｎｓ−１を算出する。 In other words, the n_s-th rank low-frequency image of the base frame F(t_C) is divided into blocks B^ns of horizontal and vertical size (Bx^ns, By^ns) (the superscript denotes the rank), the reference frame F(t_R) is searched over a range of ±Sx^ns, ±Sy^ns (for example, ±2 blocks), and the motion vector v^ns of each block is calculated. Next, the (n_s − 1)-th rank low-frequency image of the base frame F(t_C) is divided with block size (Bx^ns, By^ns), and a search is performed over ±Sx^ns pixels horizontally and ±Sy^ns pixels vertically, centered on the location displaced by 2 × v^ns from the same position on the n_s-th rank low-frequency image of the reference frame F(t_R). The vector 2 × v^ns is then added to the resulting motion vector to obtain the motion vector v^(ns−1) in the (n_s − 1)-th rank low-frequency image.
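The coarse-to-fine update described here, v^(ns−1) = 2 × v^ns plus the vector found by the local search, can be sketched as follows. The refinement vectors are passed in directly so the example stays self-contained; in the device they would come from the block matching at each layer. The function names are hypothetical.

```python
def propagate_vector(v_coarse, refinement):
    """Return v at rank n_s - 1 from v at rank n_s and the local search result.

    The coarse vector is doubled because each reconstruction step doubles
    the image resolution in both directions.
    """
    return (2 * v_coarse[0] + refinement[0], 2 * v_coarse[1] + refinement[1])


def refine_through_levels(v_start, refinements):
    """Apply propagate_vector once per layer, from coarse to fine."""
    v = v_start
    for refinement in refinements:
        v = propagate_vector(v, refinement)
    return v
```

Because the coarse vector only centers the finer search, each layer needs to search just a small window, which is where the saving over a full-resolution search comes from.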

尚、動き推定装置は、２次関数近似による小数画素位置のブロックマッチング法を用いて行うのは、最上位の階数（即ち、１階）でのみ行うのが好適であり、式（１）で与えられる。 Note that motion estimation using the block matching method at fractional pixel positions by quadratic function approximation is preferably performed only at the top rank (that is, the first rank), and the fractional position is given by equation (1).

尚、探索位置における画素位置をxとしたとき、SSD(x)は、画素位置におけるＳＳＤ値（誤差二乗和）を表し、より具体的には、SSD(0)は中心位置におけるＳＳＤ値、SSD(−1)は中心位置から−Ｓｘ（Ｓｙ）画素の位置におけるＳＳＤ値、SSD(1)は中心位置から＋Ｓｘ（Ｓｙ）画素の位置におけるＳＳＤ値を表す。式（１）から、水平又は垂直方向の小数画素精度の画素位置（小数画素位置）をそれぞれ算出することができる。例えば、図１３に示すように、式（１）から２次関数近似して、小数画素位置として例えば−０．３３を得ることができる。 When the pixel position at the search position is x, SSD (x) represents the SSD value (sum of squares of error) at the pixel position. More specifically, SSD (0) is the SSD value at the center position, SSD (−1) represents the SSD value at the position of −Sx (Sy) pixel from the center position, and SSD (1) represents the SSD value at the position of + Sx (Sy) pixel from the center position. From equation (1), pixel positions (decimal pixel positions) with decimal pixel precision in the horizontal or vertical direction can be calculated respectively. For example, as shown in FIG. 13, a quadratic function is approximated from the equation (1) to obtain, for example, −0.33 as the decimal pixel position.
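Equation (1) itself is not reproduced in this text, so the sketch below substitutes the standard three-point parabola fit, which is an assumption about its form: a quadratic is passed through SSD(−1), SSD(0), and SSD(1), and the position of its minimum is returned. The function name and the fallback to 0.0 when the three points do not form an upward-opening parabola are also assumptions.

```python
def subpixel_offset(ssd_minus, ssd_center, ssd_plus):
    """Return the fractional offset of the SSD minimum, in units of one step.

    Fits f(x) = a*x^2 + b*x + c through x = -1, 0, 1 and returns -b / (2a).
    """
    curvature = ssd_minus - 2.0 * ssd_center + ssd_plus  # equals 2a
    if curvature <= 0:
        # Assumption: no well-defined minimum, keep the integer position.
        return 0.0
    return (ssd_minus - ssd_plus) / (2.0 * curvature)
```

The result is expressed relative to the spacing between the three samples, so if SSD(±1) are taken ±Sx (or ±Sy) pixels from the center as described above, the offset would be multiplied by that spacing.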

また、ｎ_ｓ階におけるブロックサイズは固定でないため、ｎ_ｓ階からみてｎ_ｓ−１階のブロックが階層間で同位置及び同サイズとならない場合も考えられる。このような場合には、図１４に示すように、階層間のブロックの面積割合に応じた加重平均をとるのが好適である。例えば、ｎ_ｓ階における面積Ａ１〜Ａ４の４ブロックに対して、ｎ_ｓ−１階のブロックがまたがる場合に（面積Ａ１＋Ａ２’＋Ａ３’＋Ａ４’）、ｎ_ｓ階における面積Ａ１〜Ａ４の４ブロックにおけるそれぞれ動きベクトルがｖ１〜ｖ４とすると、ｎ_ｓ−１階の動きベクトルは、（Ａ１・２ｖ１＋Ａ２’・２ｖ２＋Ａ３’・２ｖ３＋Ａ４’・２ｖ４）／（Ａ１＋Ａ２＋Ａ３＋Ａ４）として求めることができる。このようにして、最上位の階数（即ち、１階）まで動き推定装置を繰り返すことにより、高精度化を図ることができる。 Also, because the block size at rank n_s is not fixed, a block at rank n_s − 1 may not have the same position and size across layers when viewed from rank n_s. In such cases, as shown in FIG. 14, it is preferable to take a weighted average according to the area proportions of the blocks between layers. For example, when a block at rank n_s − 1 straddles four blocks of areas A1 to A4 at rank n_s (covering area A1 + A2' + A3' + A4'), and the motion vectors of those four blocks are v1 to v4, the motion vector at rank n_s − 1 can be obtained as (A1·2v1 + A2'·2v2 + A3'·2v3 + A4'·2v4) / (A1 + A2 + A3 + A4). Repeating motion estimation in this way up to the top rank (that is, the first rank) improves accuracy.
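The area-weighted average can be sketched as below. One detail is an assumption: the text weights the vectors by A1, A2', A3', A4' but writes the denominator as A1 + A2 + A3 + A4, whereas this example normalizes by the sum of the weights actually passed in, so that identical input vectors yield exactly twice that vector. The function name and argument layout are hypothetical.

```python
def weighted_parent_vector(areas, vectors):
    """Combine coarse-layer vectors into one initial vector for the finer layer.

    areas   -- overlap area of the finer block with each coarse block
    vectors -- (vx, vy) motion vector of each coarse block
    Each coarse vector is doubled before averaging, as in the text.
    """
    total = float(sum(areas))  # assumption: normalize by the weights used
    vx = sum(a * 2 * v[0] for a, v in zip(areas, vectors)) / total
    vy = sum(a * 2 * v[1] for a, v in zip(areas, vectors)) / total
    return vx, vy
```

The coarse block that overlaps the finer block most therefore dominates the starting vector for the next search.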

As described above, the motion estimation device of this embodiment can not only start motion estimation of a moving image with an appropriately estimated block size, but can also estimate, for each band, the approximate amount of motion, the motion direction (horizontal, vertical, or diagonal), and the moving region of the image from the spectral power in the spatio-temporal directions, and determine the block size and the size of the motion search range accordingly. Performing hierarchical motion estimation in which the block size and the size of the motion search range differ from band to band reduces the computational cost of noise-robust, high-accuracy motion estimation, and enables highly reliable motion estimation that reflects the amount and direction of motion blur. For example, it becomes possible to estimate with high accuracy the motion of an object that moves by a large amount on an SHV screen.
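The block size and motion search range are the two parameters of a block-matching search. The sketch below assumes plain full-search SSD matching at integer positions on a single level; the way the source derives these sizes from spectral power is not modeled. The function names and argument order are hypothetical.

```python
def ssd(ref, cur, bx, by, dx, dy, bw, bh):
    """Sum of squared differences between a block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(by, by + bh):
        for x in range(bx, bx + bw):
            diff = cur[y][x] - ref[y + dy][x + dx]
            total += diff * diff
    return total


def block_match(ref, cur, bx, by, bw, bh, rx, ry):
    """Full search over displacements within +/-rx and +/-ry pixels.

    Returns (dx, dy, cost) for the lowest SSD, or None if no candidate
    block lies inside the reference frame.
    """
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-ry, ry + 1):
        for dx in range(-rx, rx + 1):
            # Skip candidates whose block would leave the reference frame.
            if bx + dx < 0 or bx + dx + bw > w:
                continue
            if by + dy < 0 or by + dy + bh > h:
                continue
            cost = ssd(ref, cur, bx, by, dx, dy, bw, bh)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

The cost of this search grows with the product of block area and search area, which is why choosing both per level and per direction, rather than using one large fixed range, reduces computation.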

Further, applying the motion vectors obtained accurately and efficiently in this way to the encoding processing of an existing encoding device, or to super-resolution processing, can be expected to improve quality further.

In the embodiment described above, the wavelet transform is used to calculate the power of each frequency band. However, the power of each frequency band can also be calculated using a known orthogonal transform such as the discrete cosine transform, a filter bank, or the like. The wavelet transform has the advantage of capturing frequency changes in the image with high accuracy, while the discrete cosine transform simplifies the device configuration when the existing system already uses it.
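As an illustration of computing per-band power with a wavelet transform, the sketch below performs one level of a two-dimensional Haar decomposition and returns the mean power of each sub-band. The choice of the Haar wavelet, the function name `haar_band_power`, and the sub-band labels are assumptions made for this example; this passage does not fix a particular wavelet.

```python
def haar_band_power(img):
    """Mean power of the four sub-bands of a one-level 2D Haar transform.

    img: list of equal-length rows with even height and width.
    Labels: 'along_x' responds to variation along x, 'along_y' to
    variation along y, 'diagonal' to variation in both directions.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    power = {"approx": 0.0, "along_x": 0.0, "along_y": 0.0, "diagonal": 0.0}
    for y in range(0, h, 2):
        for x in range(0, w, 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            # Orthonormal Haar coefficients of the 2x2 neighborhood.
            power["approx"] += ((a + b + c + d) / 2) ** 2
            power["along_x"] += ((a - b + c - d) / 2) ** 2
            power["along_y"] += ((a + b - c - d) / 2) ** 2
            power["diagonal"] += ((a - b - c + d) / 2) ** 2
    blocks = (h // 2) * (w // 2)
    return {band: total / blocks for band, total in power.items()}


# Vertical stripes vary only along x, so only that detail band has power.
stripes = [[0, 1, 0, 1] for _ in range(4)]
print(haar_band_power(stripes))
```

An octave decomposition over several levels would repeat the same step on the approximation band of the previous level.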

Furthermore, as one aspect of the present invention, the motion estimation device of the present invention can be configured as a computer. A program for causing the computer to implement each component of the motion estimation device described above is stored in a storage unit provided inside or outside the computer. Such a storage unit can be realized by an external storage device such as an external hard disk, or by an internal storage device such as ROM or RAM. The control unit provided in the computer can be realized under the control of a central processing unit (CPU) or the like. That is, the CPU can read from the storage unit, as appropriate, a program describing the processing for realizing the function of each component, and thereby realize the function of each component on the computer. The function of each component may also be realized by a part of the hardware.

The program describing this processing can be distributed by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM. It can also be distributed by storing the program in a storage unit of a server on a network, for example, and transferring it from the server to another computer via the network.

A computer that executes such a program can, for example, temporarily store in its own storage unit a program recorded on a portable recording medium or a program transferred from a server. As another embodiment of the program, the computer may read the program directly from a portable recording medium and execute processing according to the program. Furthermore, each time a program is transferred from the server to the computer, the computer may sequentially execute processing according to the received program.

While embodiments of the present invention have been described in detail with specific examples, it will be apparent to those skilled in the art that various modifications and changes are possible without departing from the scope of the claims of the present invention.

According to the present invention, the accuracy of the motion estimation used in super-resolution processing and video coding of moving images, or in other image processing, can be improved. Moving image processing has become increasingly high-definition in recent years, and the motion estimation device according to the present invention is useful in any moving image processing application that requires high-accuracy motion estimation.

Description of symbols
1 Motion estimation device
11 Time-direction frequency decomposition unit
12 Spatial decomposition level determination unit
13 Spatial-direction frequency decomposition unit
14 Image analysis unit
15 Motion estimation unit
121 Comparison unit
122 Spatial decomposition level value determination unit
131 Spatial-direction two-dimensional Ns-level discrete wavelet decomposition processing unit
132 Horizontal/vertical/diagonal frequency band power calculation unit
141 Comparison unit
142 Horizontal/vertical/diagonal power ratio calculation unit
143 Horizontal/vertical multiple calculation unit
144 Block size/motion search range determination unit
151 Hierarchical motion estimation unit
152 Spatial one-level wavelet reconstruction unit

Claims

A motion estimation device that performs motion estimation of a moving image, comprising:
a calculation time-direction frequency decomposition unit that calculates the power of each frequency band in the time direction for a frame image sequence of a plurality of frames;
a spatial decomposition level determination unit that compares the power value of each frequency band in the time direction with a predetermined first threshold, and determines the level at which the power exceeds the first threshold as a spatial decomposition level value of the spatial frequency;
a spatial-direction frequency decomposition unit that performs octave decomposition on a motion estimation target frame based on the spatial decomposition level value, and calculates, for each level corresponding to the spatial decomposition level value, a power value for each frequency band in the horizontal, vertical, and diagonal directions;
an image analysis unit that compares the power values for each frequency band in the horizontal, vertical, and diagonal directions, calculated for each level corresponding to the spatial decomposition level value, with a predetermined second threshold, determines as a motion detection start level the level at which all of those power values fall below the second threshold, and determines, for each layer of the motion detection start level, a block size and a motion search range according to the power values for each frequency band in the horizontal, vertical, and diagonal directions; and
a motion estimation unit that performs hierarchical motion estimation on the motion estimation target frame based on the block size and the motion search range determined for each level of the motion detection start level.

The motion estimation device according to claim 1, wherein the image analysis unit has means for calculating the ratio of power for each frequency band in the horizontal, vertical, and diagonal directions of each layer in the motion detection start level, calculating a multiple relative to a reference motion detection block size and motion search range for each layer in the motion detection start level, and multiplying the reference motion detection block size and motion search range by that multiple to determine the block size and the motion search range according to the power values for each frequency band in the horizontal, vertical, and diagonal directions.

The motion estimation device according to claim 1 or 2, wherein the motion estimation unit has means for determining the final motion vector by taking, when hierarchical motion estimation is performed and no block exists at the same position between layers, a weighted average according to the area ratio of the blocks between layers.

A program for causing a computer configured as a motion estimation device that performs motion estimation of a moving image to execute the steps of:
calculating the power of each frequency band in the time direction for a frame image sequence of a plurality of frames;
comparing the power value of each frequency band in the time direction with a predetermined first threshold, and determining the level at which the power exceeds the first threshold as a spatial decomposition level value of the spatial frequency;
performing octave decomposition on a motion estimation target frame based on the spatial decomposition level value, and calculating, for each level corresponding to the spatial decomposition level value, a power value for each frequency band in the horizontal, vertical, and diagonal directions;
comparing the power values for each frequency band in the horizontal, vertical, and diagonal directions, calculated for each level corresponding to the spatial decomposition level value, with a predetermined second threshold, determining as a motion detection start level the level at which all of those power values fall below the second threshold, and determining, for each layer of the motion detection start level, a block size and a motion search range according to the power values for each frequency band in the horizontal, vertical, and diagonal directions; and
performing hierarchical motion estimation on the motion estimation target frame based on the block size and the motion search range determined for each level of the motion detection start level.