JP2018059767A

JP2018059767A - Image processing device, image processing method and program

Info

Publication number: JP2018059767A
Application number: JP2016196433A
Authority: JP
Inventors: 希名板倉; Kina Itakura
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-10-04
Filing date: 2016-10-04
Publication date: 2018-04-12

Abstract

PROBLEM TO BE SOLVED: To improve the accuracy of parallax estimation that uses block matching. SOLUTION: An image processing device of the present invention comprises: derivation means which, with one of a plurality of images as a base image and another image different from the one image as a reference image, searches the reference image by block matching for the pixel position corresponding to each pixel position of the base image, and derives a parallax with respect to the pixel position detected by the block matching; generation means for generating a parallax map on the basis of the parallax derived for each pixel position of the base image; and update means for expanding the block used for block matching along the search direction of the block matching, causing the derivation means to re-execute block matching for each pixel position at which the reliability of the derived parallax is at or below a fixed level, and updating the parallax map with the parallax with respect to the pixel position detected by the re-executed block matching. SELECTED DRAWING: Figure 3

Description

The present invention relates to a technique for estimating the parallax between images using image data representing a plurality of images captured from a plurality of mutually different viewpoints.

There is a technique that estimates the distance to a subject using a plurality of images obtained by imaging the same subject from a plurality of mutually different viewpoints, and applies the estimated distance information to image processing. A representative method for estimating distance information from a plurality of images corresponding to different viewpoints uses the parallax that arises between the images. Here, parallax is the shift, between the images corresponding to the respective viewpoints, in the image position of the same subject area. The magnitude of the parallax depends on the distance to the subject. Therefore, the distance to the subject can be estimated from the magnitude of the parallax between the images, the distance between the viewpoints, and the like. The parallax is obtained by detecting, between the images corresponding to the respective viewpoints, the areas that correspond to the same subject area, using a method such as block matching.
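The dependence of parallax on distance can be illustrated with the standard relation for two parallel, rectified pinhole cameras, Z = f * B / d. This is a minimal sketch under that assumption; the text above does not fix a camera model at this point, and the function and parameter names are hypothetical.

```python
def depth_from_disparity(disparity_px: float, focal_length_px: float, baseline: float) -> float:
    """Distance to the subject for parallel, rectified pinhole cameras (assumed model).

    disparity_px    : parallax between the two images, in pixels
    focal_length_px : focal length expressed in pixels
    baseline        : distance between the two viewpoints (result has the same unit)
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline / disparity_px
```

A larger parallax therefore corresponds to a nearer subject, and the same parallax measured with a wider baseline corresponds to a more distant one.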

As a technique for improving the accuracy of parallax estimation, a method using hierarchical block matching, in which block matching is performed repeatedly while changing the block size, is known. Patent Document 1 describes a method that improves the accuracy of parallax estimation by enlarging the block used for block matching and executing block matching again when the estimated parallax is judged to contain an error.

Patent Document 1: JP 2009-293971 A

In block matching, if one block contains a plurality of subjects at different distances from the camera, the corresponding image areas may not be detected properly between the images. For example, when the subject to be matched is textureless, an area corresponding to an adjacent subject may be detected by mistake. Therefore, the parallax may not be estimated correctly near the boundary of a subject.

In the method described in Patent Document 1, enlarging the block makes it more likely that the block contains a plurality of subjects, so the accuracy of parallax estimation may decrease. Accordingly, an object of the present invention is to improve the accuracy of parallax estimation in parallax estimation using block matching.

An image processing apparatus according to the present invention is an image processing apparatus that generates a parallax map indicating the parallax between a plurality of images captured from mutually different viewpoints, the apparatus comprising: derivation means for, with one of the plurality of images as a base image and another image different from the one image as a reference image, searching the reference image by block matching for the pixel position corresponding to each pixel position of the base image, and deriving a parallax value with respect to the pixel position detected by the block matching; generation means for generating the parallax map on the basis of the parallax value derived for each pixel position of the base image; and update means for enlarging the block used for block matching along the search direction of the block matching, causing the derivation means to execute block matching again for each pixel position at which the reliability of the parallax derived by the derivation means is at or below a certain level, and updating the parallax map with the parallax with respect to the pixel position detected by the repeated block matching.

According to the present invention, the accuracy of parallax estimation can be improved in parallax estimation using block matching.

A block diagram showing an example of the configuration of the image processing apparatus of the first embodiment.
A diagram for explaining an epipolar line.
A diagram for explaining the parallax estimation processing in the first embodiment.
A block diagram showing an example of the functional configuration of the image processing apparatus in the first embodiment.
A flowchart showing the flow of the parallax estimation processing in the first embodiment.
A diagram for explaining the method of calculating an epipolar line.
A diagram for explaining the enlargement direction of a block.
A conceptual diagram for explaining the effect of the first embodiment.
A block diagram showing an example of the functional configuration of the image processing apparatus in the second embodiment.
A flowchart showing the flow of the parallax estimation processing in the second embodiment.
A diagram for explaining the block correction processing in the third embodiment.
A block diagram showing the functional configuration of the image processing apparatus in the third embodiment.
A flowchart showing the flow of the parallax estimation processing in the third embodiment.
A diagram for explaining the processing of the area correction unit.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. The following embodiments do not limit the present invention, and not all combinations of the features described in the embodiments are essential to the solution provided by the present invention. Identical components are described using the same reference numerals.

<Example 1>
FIG. 1 is a block diagram illustrating an example of the configuration of the image processing apparatus according to the first embodiment. The image processing apparatus 100 according to the first embodiment includes a CPU 101, a RAM 102, a ROM 103, a secondary storage device 104, an input interface 105, and an output interface 106. The components of the image processing apparatus 100 are connected to one another by a system bus 107. The image processing apparatus 100 is connected to an external storage device 108 via the input interface 105. The image processing apparatus 100 is also connected to the external storage device 108 and a display device 109 via the output interface 106.

The CPU 101 is a processor that executes a program stored in the ROM 103 using the RAM 102 as a work memory, and comprehensively controls each component of the image processing apparatus 100 via the system bus 107. The various processes described later are executed in this way.

The secondary storage device 104 is a storage device that stores various data handled by the image processing apparatus 100, and is, for example, a hard disk drive (HDD). The CPU 101 writes data to the secondary storage device 104 and reads data stored in the secondary storage device 104 via the system bus 107. Besides an HDD, various storage devices such as an optical disk drive or a flash memory can be used as the secondary storage device 104.

The input interface 105 is a serial bus interface such as USB or IEEE 1394. The image processing apparatus 100 receives data, commands, and the like from an external device via the input interface 105. The image processing apparatus 100 also acquires data from the external storage device 108 (for example, a storage medium such as a hard disk, a memory card, a CF card, an SD card, or a USB memory) via the input interface 105. Input devices such as a mouse or buttons (not shown) can also be connected to the input interface 105.

Like the input interface 105, the output interface 106 is a serial bus interface such as USB or IEEE 1394. A video output terminal such as DVI or HDMI (registered trademark) can also be used as the output interface 106. The image processing apparatus 100 outputs data and the like to an external device via the output interface 106. The image processing apparatus 100 also outputs processed image data and the like to the display device 109 (any of various image display devices such as a liquid crystal display) via the output interface 106, whereby an image is displayed on the display device 109. The image processing apparatus 100 has components other than those described above, but they are not the focus of the present invention, so their description is omitted.

Here, the principle of the block matching used in this embodiment is described. In block matching, blocks of the same size are compared between two images while the image region (block) being compared is changed. The pair of blocks for which the evaluation value representing the degree of matching between the regions is minimum (or maximum) is detected as corresponding blocks. The difference between the image positions of the detected blocks is then derived as the parallax corresponding to that block.
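The principle described above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: it assumes the sum of absolute differences (SAD) as the evaluation value (smaller is better), a square block, a search along the same image row, and that the corresponding pixel in the reference image lies at x - d. All function and parameter names are hypothetical.

```python
def sad(base, ref, bx, by, rx, ry, half):
    """Sum of absolute differences between two (2*half+1)-square blocks
    centred at (bx, by) in `base` and (rx, ry) in `ref`."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(base[by + dy][bx + dx] - ref[ry + dy][rx + dx])
    return total


def match_pixel(base, ref, x, y, half, max_disp):
    """Return (disparity, cost) minimising SAD for the pixel (x, y) of `base`.

    Candidates are the positions (x - d, y) in `ref` for d in 0..max_disp;
    candidates whose block would leave the image are skipped.
    """
    width = len(ref[0])
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        rx = x - d
        if rx - half < 0 or rx + half >= width:
            continue
        cost = sad(base, ref, x, y, rx, y, half)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost
```

Images are plain lists of rows here to keep the sketch free of dependencies. In a textureless region many candidates yield nearly the same cost, which is the ambiguity that the following paragraphs address.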

The accuracy of block matching depends on the size and shape of the block used for matching. The appropriate block size and shape therefore vary with the characteristics of the image. For example, when block matching is performed with a small block, many blocks with almost no difference in evaluation value are detected in a textureless region, which raises the likelihood of detecting a wrong block as the corresponding block. Conversely, when a large block is used to cope with textureless regions, it becomes more likely that the block contains a subject (adjacent subject) that adjoins the subject to be matched (subject of interest). When the subject of interest and the adjacent subject are at different distances from the camera, the two subjects have different parallaxes. A block containing a plurality of subjects at different distances thus contains a mixture of regions with different parallaxes. In particular, when the subject of interest is textureless, the corresponding block cannot be determined appropriately between the images, and the accuracy of block matching decreases.

This embodiment therefore adopts hierarchical block matching, in which matching is performed with each of a plurality of blocks of different sizes. Specifically, the first block matching (block matching of the lowest layer) is performed using the smallest block. The block is then enlarged, and the next block matching (block matching of the next layer) is performed. In this way, block matching is repeated in this embodiment while the block size is increased. For a region in which parallax estimation did not succeed in a given layer, the parallax value is supplemented using the result of parallax estimation in a higher layer. Hereinafter, the value of the parallax is referred to as the parallax value. The parallax value may also be referred to simply as the parallax.

In this embodiment, highly accurate parallax estimation is achieved by enlarging the block only in the direction parallel to the epipolar line. The epipolar line is described below with reference to FIG. 2, after which the enlargement of the block is described concretely with reference to the conceptual diagram of FIG. 3.

FIG. 2 is a diagram for explaining epipolar lines. A point 401 represents a point on the subject. A point 402 represents the optical center of a left-viewpoint camera that images the subject from the left side (the left side in FIG. 2). A point 405 represents the optical center of a right-viewpoint camera that images the subject from the right side (the right side in FIG. 2). Hereinafter, the points 402 and 405 may be referred to as cameras 402 and 405, respectively. As shown in FIG. 2, suppose that the point 401 in three-dimensional space is projected onto the image (image plane 403) of the camera 402. The point 404, at which the straight line connecting the point 401 and the camera 402 intersects the image plane 403, is the projection of the point 401 onto the image plane 403. Similarly, for the camera 405 at the other viewpoint, the point 407, at which the straight line connecting the point 401 and the camera 405 intersects the image plane 406, is the projection of the point 401 onto the image plane 406.

The three points 401, 402, and 405 define a plane 408 in space, and the points 404 and 407 lie on the plane 408. The straight line 409 formed by the intersection of the plane 408 and the image plane 403 is the epipolar line on the image plane 403. Similarly, the straight line 410 formed by the intersection of the plane 408 and the image plane 406 is the epipolar line on the image plane 406. The point in space corresponding to the image point 404 observed by the camera 402 lies somewhere on the straight line 411. Projecting the straight line 411 onto the image plane 406 yields the epipolar line 410. In other words, the epipolar line 410 is the set of points on the image plane 406 that can correspond to the point 404 on the image plane 403. Similarly, the epipolar line 409 on the image plane 403 is the set of points on the image plane 403 that can correspond to the image point 407 observed by the camera 405. An epipolar line thus expresses the correspondence between points on the images. Therefore, when searching for the point on the image plane 406 that corresponds to the image point 404, for example, it is sufficient to search along the epipolar line 410 rather than over the entire image plane 406. Likewise, when searching for the point on the image plane 403 that corresponds to the image point 407, it is sufficient to search along the epipolar line 409 rather than over the entire image plane 403.

FIG. 3 is a diagram for explaining the parallax estimation processing in the first embodiment. FIG. 3(a) shows a textureless subject 305 and a subject having texture (hereinafter, textured subject) 306 being imaged by two cameras (cameras 301 and 303). The cameras 301 and 303 are arranged on the same plane, separated by a predetermined distance, with their optical axes parallel to each other. The textureless subject 305 and the textured subject 306 are located at different distances from the plane containing the cameras 301 and 303. In FIG. 3(a), an image 302 is the left-viewpoint image captured by the camera 301, and an image 304 is the right-viewpoint image captured by the camera 303.

Both the textureless subject 305 and the textured subject 306 appear in the images 302 and 304. Suppose that the pixel on the image 304 corresponding to the pixel 307 at the position (u1, v1) on the image 302 is to be found. The pixel 307 belongs to the region corresponding to the textureless subject 305. Consider the case where a block 308 centered on the pixel 307 contains both a region corresponding to the textureless subject 305 and a region corresponding to the textured subject 306. If block matching is performed in this state, a block 310 corresponding to the parallax of the textured subject 306 is erroneously detected instead of the block 309 corresponding to the correct parallax (that is, the parallax of the textureless subject 305). The error arises because the features of the textured subject 306 contained in the block 310 cause the similarity with the block 310 to be judged higher than the similarity with the block 309. This phenomenon occurs when the boundary between the textureless subject and another subject within the block is parallel to the scanning direction of the block in the image 304 (that is, parallel to the epipolar line on the image 304 corresponding to the pixel 307). As a result, the accuracy of the estimated parallax in the textureless region decreases. To suppress this phenomenon, this embodiment prevents the block from containing a boundary line parallel to the epipolar line. Specifically, as with the block 311 shown in FIG. 3(b), the initial block is a single pixel, and when the block is enlarged, it is enlarged in the direction parallel to the epipolar line. In this way, this embodiment keeps boundary lines parallel to the epipolar line out of the block and improves the accuracy of the estimated parallax.
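The block shape described above, a single pixel that grows only along the epipolar direction, can be sketched as a list of pixel offsets. This is an illustrative sketch only: the amount of growth per layer (one pixel on each side) and the names `block_offsets`, `direction`, and `level` are assumptions, not taken from the text.

```python
def block_offsets(direction, level):
    """Pixel offsets of a block grown `level` steps on each side along `direction`.

    direction : integer step (dx, dy) parallel to the epipolar line, e.g. (1, 0)
    level     : 0 gives the single-pixel initial block; each further level
                adds one pixel at both ends (assumed growth rule)
    """
    dx, dy = direction
    return [(k * dx, k * dy) for k in range(-level, level + 1)]
```

For a horizontal epipolar line, `block_offsets((1, 0), 2)` gives a block one pixel high and five pixels wide. Because the block has no extent perpendicular to the epipolar line, a subject boundary parallel to that line cannot fall inside it.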

The above processing makes it possible to obtain an accurate and reliable parallax map (image data that stores, as each pixel value, the parallax value corresponding to that pixel position) regardless of the presence or shape of textureless regions.

Next, the processing performed by the image processing apparatus 100 of the first embodiment is described concretely with reference to FIGS. 4 and 5. FIG. 4 is a block diagram illustrating an example of the functional configuration of the image processing apparatus 100. FIG. 5 is a flowchart showing the flow of the parallax estimation processing in the first embodiment.

Each function shown in FIG. 4 operates when the CPU 101 loads the program stored in the ROM 103 into the RAM 102 and executes it. In this embodiment, the functions comprise an image data acquisition unit 201, an imaging parameter acquisition unit 202, an epipolar line calculation unit 203, a region setting unit 204, a parallax estimation unit 205, a determination unit 206, and a parallax determination unit 207. The series of processes shown in FIGS. 5(a) and 5(b) is executed by these functions. Not all of the processes described below need to be executed by the CPU 101. The image processing apparatus 100 may be configured so that some or all of the processes are performed by one or more processing circuits other than the CPU 101. The flow of the processing shown in FIGS. 5(a) and 5(b) is described below.

In step S501, the image data acquisition unit 201 acquires the image data to be processed via the input interface 105 or from the secondary storage device 104. Here, the image data acquisition unit 201 acquires image data captured using a multi-viewpoint imaging device such as a multi-lens camera. That is, the image data acquisition unit 201 acquires multi-viewpoint image data representing a plurality of images obtained by simultaneously imaging the same subject from a plurality of different viewpoints. The image data acquisition unit 201 determines the base image and the reference image to be used for parallax estimation. The base image is the image that serves as the basis for parallax estimation. In block matching, the pixel corresponding to the pixel to be processed in the base image (hereinafter, pixel of interest) is searched for in an image captured from a viewpoint different from that of the base image (hereinafter, reference image). Consequently, the parallax map obtained as the result of parallax estimation corresponds to the viewpoint of the base image. The base image and the reference image may be designated by the user via the input interface 105. Hereinafter, the camera that captured the base image is referred to as the base camera, and the camera that captured the reference image is referred to as the reference camera. The image data acquisition unit 201 outputs the acquired image data to the parallax estimation unit 205.

The following description takes as an example the case where two sets of captured image data, acquired by two cameras arranged on the same plane, are input to the image data acquisition unit 201. Here, two cameras being arranged on the same plane means that the principal point of the other camera lies on the plane that is perpendicular to the optical axis of one camera and passes through the principal point of that camera. It is also assumed that the axes representing the position and orientation of the cameras are parallel between the cameras (the optical axes of the cameras are parallel to each other). The number of images input to the image data acquisition unit 201 may be three or more. This embodiment can likewise be applied when the optical axes of the cameras are not parallel. The specific processing method is described later.

In step S502, the parallax estimation unit 205 initializes the flag map by setting all of its pixels to 1. The flag map is data in which, for each pixel, 1 is set if the pixel is to be processed and 0 is set if it is not. Initializing the flag map in this way makes every pixel in the image a target of parallax estimation. The flag map may be held by the parallax estimation unit 205 or stored in the secondary storage device 104.
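The flag map of step S502 can be pictured as a per-pixel array of 0/1 values. The following is a minimal sketch using nested lists; the function names are hypothetical and the text does not prescribe a data layout.

```python
def init_flag_map(width, height):
    """Create a flag map with every pixel set to 1 (to be processed)."""
    return [[1] * width for _ in range(height)]


def pixels_to_process(flag_map):
    """List the (x, y) positions whose flag is 1."""
    return [
        (x, y)
        for y, row in enumerate(flag_map)
        for x, flag in enumerate(row)
        if flag == 1
    ]
```

Immediately after initialization, `pixels_to_process` returns every pixel of the image. Setting a flag to 0 removes that pixel from later passes.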

In step S503, the imaging parameter acquisition unit 202 acquires the imaging parameters of the cameras that captured the images in the image data acquired by the image data acquisition unit 201. In this embodiment, the imaging parameter acquisition unit 202 reads a file describing the imaging parameter values via the input interface 105 or from the secondary storage device 104. The file describes, as imaging parameters, external parameters representing the position and orientation and internal parameters representing the focal length, principal point position, and distortion, all obtained by measuring the cameras in advance. The value of each imaging parameter may instead be estimated from the image data by a known method such as SfM (Structure from Motion). The imaging parameter acquisition unit 202 outputs the imaging parameters to the epipolar line calculation unit 203.

In step S504, the epipolar line calculation unit 203 calculates the epipolar line on the reference image that corresponds to the pixel of interest on the base image. As described above, the base camera and the reference camera are arranged on the same plane with their optical axes parallel to each other. The epipolar line on the reference image is therefore parallel to the direction vector between the base camera and the reference camera. The epipolar line on the reference image is calculated using this direction vector, as described concretely with reference to FIG. 6.

FIG. 6 is a diagram for explaining the method of calculating an epipolar line. A point 605 represents a point on the subject. A point 601 represents the optical center of the camera that images the subject from the left viewpoint. A point 602 represents the optical center of the camera that images the subject from the right viewpoint. Hereinafter, the point 601 is referred to as the base camera 601, and the point 602 is referred to as the reference camera 602. In FIG. 6, the Z axis of the world coordinates and the optical axis of the base camera 601 are parallel to each other. From the coordinates (Xb, Yb, Zb) of the base camera 601 in three-dimensional space and the coordinates (Xr, Yr, Zr) of the reference camera 602 in three-dimensional space, the direction vector 603 is obtained as (Xb − Xr, Yb − Yr, Zb − Zr). Since the principal points of the cameras lie on the same plane, the direction vector 603 can be represented by the two-dimensional vector (Xb − Xr, Yb − Yr), with the Z component omitted. The epipolar line can then be expressed by Equation 1 using the coordinates u and v on the reference image, where (u₀, v₀) are the coordinates of the pixel of interest 604 on the base image, expressed in the coordinate system of the reference image.
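Equation 1 itself is not reproduced in this text. One form consistent with the definitions above, namely a line through (u₀, v₀) parallel to the direction vector (Xb − Xr, Yb − Yr), would be the following. It is a reconstruction, not the published equation, and it assumes that the image u and v axes are aligned with the world X and Y axes.

```latex
% Reconstructed form of the epipolar line (assumption, not the published Equation 1)
(Y_b - Y_r)\,(u - u_0) \;-\; (X_b - X_r)\,(v - v_0) \;=\; 0
```

When Xb ≠ Xr this is equivalent to v = v₀ + ((Yb − Yr)/(Xb − Xr))(u − u₀). The implicit form is used here because it also covers a vertical camera arrangement.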

Furthermore, because the optical axes of the two cameras are parallel, the base image and the reference image share the same image coordinate axes, and the epipolar line obtained by the above equation is identical to the epipolar line on the base image. The epipolar line can be calculated in the same manner even when the Z axis of the world coordinate system and the optical axis of the base camera 601 are not parallel. The epipolar line calculation unit 203 outputs the calculated epipolar line to the region setting unit 204 and the parallax estimation unit 205.

In step S505, the region setting unit 204 sets the enlargement direction of the block based on the epipolar line calculation result acquired from the epipolar line calculation unit 203. The region setting unit 204 also determines the initial block for the hierarchical block matching.

The enlargement direction of the block is selected from predetermined candidate directions. FIG. 7 is a diagram for explaining the enlargement direction of the block. In this embodiment, the candidates are the four directions shown in FIG. 7: the directions parallel to each of the image axes and the intermediate directions between the axes. The region setting unit 204 calculates the angle between the epipolar line calculated by the epipolar line calculation unit 203 and each of the four directions. It then selects the direction with the smallest angle as the enlargement direction of the block and holds information indicating that direction (hereinafter, enlargement direction information). The candidates are not limited to these; directions obtained by dividing the angle between the image axes into a plurality of parts may also be used as candidates.
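The selection of the enlargement direction can be sketched as follows. This is a minimal illustration, not the implementation of the region setting unit 204: the function names, the representation of the candidates as angles (0, 45, 90, and 135 degrees from the horizontal image axis), and the tie-breaking behavior are all assumptions.

```python
import math

# Assumed candidate directions, in degrees from the horizontal image axis:
# the two image axes and the two intermediate (diagonal) directions.
CANDIDATE_ANGLES_DEG = (0.0, 45.0, 90.0, 135.0)


def angle_between_lines(a_deg, b_deg):
    """Smallest angle between two undirected lines, in the range [0, 90]."""
    diff = abs(a_deg - b_deg) % 180.0
    return min(diff, 180.0 - diff)


def select_enlargement_direction(direction_vector):
    """Return the candidate angle that is closest to the epipolar line.

    direction_vector is the 2D vector (Xb - Xr, Yb - Yr). Whether 45 degrees
    points up-right or down-right depends on the image axis convention,
    which is not fixed here. Ties resolve to the first candidate (assumption).
    """
    dx, dy = direction_vector
    if dx == 0 and dy == 0:
        raise ValueError("direction vector must be non-zero")
    # A line has no orientation, so fold the angle into [0, 180).
    epipolar_deg = math.degrees(math.atan2(dy, dx)) % 180.0
    return min(CANDIDATE_ANGLES_DEG,
               key=lambda c: angle_between_lines(c, epipolar_deg))
```

For a horizontally arranged camera pair, the direction vector is close to (1, 0) and the sketch selects the 0-degree candidate.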

In this embodiment, the single target pixel is set as the initial block, as described above. The initial block need not be one pixel; any block, such as a square or rectangular block centered on the target pixel, may be set. Setting the initial block to one pixel, however, reliably prevents a boundary line parallel to the epipolar line from being included in the block. The region setting unit 204 outputs information indicating the size and shape of the determined initial block to the parallax estimation unit 205 as information on the block used at the current level (hereinafter, block information).

In step S506, the parallax estimation unit 205 determines the target pixel in the base image for which parallax is to be estimated. In this embodiment, the top-left pixel of the base image is selected first. Thereafter, each time the parallax for the target pixel has been estimated, a pixel not yet selected is chosen as the new target pixel. Specifically, the pixels are selected in raster order up to the bottom-right pixel. The selection order is not limited to this, and the target pixels may be selected in any order.

In step S507, the parallax estimation unit 205 estimates the parallax at the target pixel of the image data acquired from the image data acquisition unit 201. The parallax is estimated by block matching, based on the epipolar line acquired from the epipolar line calculation unit 203 and the block information acquired from the region setting unit 204. The specific processing is as follows.

First, the parallax estimation unit 205 determines the pixel in the reference image to be compared with the target pixel (hereinafter, the reference pixel). In this embodiment, pixels in image regions where a pixel corresponding to the target pixel is unlikely to exist are excluded from the reference pixel candidates in advance. Specifically, the parallax estimation unit 205 selects, as the first reference pixel, the pixel in the reference image whose coordinates are the same as the coordinates (u0, v0) of the target pixel in the base image. Thereafter, each time the evaluation value of a reference pixel has been calculated, the parallax estimation unit 205 selects another pixel on the epipolar line as the new reference pixel. The search direction corresponds to the direction in which the camera that captured the base image is seen from the camera that captured the reference image. The search direction is therefore equal to the direction vector 603 between the base camera and the reference camera calculated by the epipolar line calculation unit 203. Selecting only pixels on the epipolar line as reference pixels shortens the time required for block matching. The method of selecting reference pixels is not limited to this; all pixels may be treated as reference pixels, and they may be selected in any order. For example, the parallax estimation unit 205 may select the top-left pixel of the reference image as the first reference pixel and then select each pixel in raster order up to the bottom-right pixel.

Next, the parallax estimation unit 205 compares the target pixel with the reference pixel and calculates the block matching evaluation value. In this embodiment, the mean squared error of the pixel values between the blocks being compared is used as the evaluation value. The block used for this calculation is the block indicated by the block information acquired from the region setting unit 204. The parallax estimation unit 205 calculates the mean squared error of the pixel values between the block centered on the target pixel and the block centered on the reference pixel. Let lu be the horizontal difference and lv the vertical difference between the pixel positions of the target pixel and the reference pixel. The evaluation value V(lu, lv, u0, v0) at the pixel position (u0, v0) of the target pixel is then given by the following equation. The smaller the evaluation value V, the higher the degree of matching is judged to be.
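A form of the mean squared error in Equation (2) that agrees with the symbols defined in the next paragraph is sketched below. It is a reconstruction under two assumptions: B is taken as the set of base-image pixel positions in the block centered on (u0, v0), and the reference pixel is taken to lie at the target pixel position shifted by (lu, lv).

```latex
V(l_u, l_v, u_0, v_0) \;=\; \frac{1}{|B|} \sum_{(u, v) \in B}
  \bigl\{ I_b(u, v) - I_r(u + l_u,\; v + l_v) \bigr\}^{2}
```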

Here, B denotes the set of pixels included in the block used for matching, and |B| denotes the number of pixels in B. I_b(u, v) is the pixel value at pixel position (u, v) of the base image, and I_r(u, v) is the pixel value at pixel position (u, v) of the reference image. The evaluation value is not limited to the mean squared error of pixel values given by Equation (2); any of various known values that indicate the degree of similarity between two blocks can be used. Using the above method, the parallax estimation unit 205 calculates the evaluation value for every candidate reference pixel while changing the reference pixel. It then estimates the difference in pixel position between the target pixel and the reference pixel with the smallest evaluation value V as the parallax for the target pixel.
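The search for the minimum evaluation value along the epipolar line can be illustrated as follows. This is a sketch under stated assumptions, not the implementation of the parallax estimation unit 205: images are plain nested lists of equal size indexed as image[v][u], the block is given as a list of (du, dv) offsets from the target pixel, candidate shifts are integer multiples of a step vector, and shifts whose block leaves either image are skipped. The function names and the max_steps parameter are hypothetical.

```python
def block_mse(base, ref, u0, v0, lu, lv, block):
    """Mean squared error between the block around (u0, v0) in the base image
    and the same block shifted by (lu, lv) in the reference image.

    Returns None when any pixel falls outside an image (assumed handling).
    """
    height, width = len(base), len(base[0])
    total = 0.0
    for du, dv in block:
        ub, vb = u0 + du, v0 + dv          # position in the base image
        ur, vr = ub + lu, vb + lv          # shifted position in the reference image
        inside = (0 <= ub < width and 0 <= vb < height
                  and 0 <= ur < width and 0 <= vr < height)
        if not inside:
            return None
        diff = base[vb][ub] - ref[vr][ur]
        total += diff * diff
    return total / len(block)


def estimate_disparity(base, ref, u0, v0, block, step, max_steps):
    """Evaluate shifts 0, step, 2*step, ... along the search direction and
    return (best_shift, best_value); the smallest value is the best match."""
    best_shift, best_value = None, None
    for k in range(max_steps + 1):
        lu, lv = k * step[0], k * step[1]
        value = block_mse(base, ref, u0, v0, lu, lv, block)
        if value is None:
            continue
        if best_value is None or value < best_value:
            best_shift, best_value = (lu, lv), value
    return best_shift, best_value
```

With a three-pixel horizontal block such as [(-1, 0), (0, 0), (1, 0)] and a step of (1, 0), the sketch searches to the right of the starting position (u0, v0) in the reference image.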

In step S508, the parallax estimation unit 205 determines whether processing has been completed for all pixels whose flag map value is 1. If it has (YES in step S508), the parallax estimation unit 205 outputs the parallax map generated as the result of the parallax estimation to the determination unit 206, and the process proceeds to step S509. If it has not (NO in step S508), the process returns to step S506.

In step S509, the determination unit 206 calculates, for each pixel, the degree of dispersion of the parallax map acquired from the parallax estimation unit 205. In a typical parallax map, the parallax value changes gradually everywhere except in near-far conflict regions (regions in which a single block contains multiple subjects at different distances). In this embodiment, the determination unit 206 therefore evaluates, for each pixel, the degree of dispersion of the parallax map obtained by block matching at each level. When the parallax variation around the target pixel is small, that is, when the degree of dispersion is small, the determination unit 206 judges the reliability to be high. Conversely, when the parallax variation is large, that is, when the degree of dispersion is large, it judges the reliability to be low. In this embodiment, the value calculated by the following equation is used as the degree of dispersion.
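A form of Equation (3) that matches the symbols defined in the next paragraph is the ordinary variance of the parallax values over the region B′. It is given here as an assumed reconstruction; the symbol for the local mean is introduced only for illustration.

```latex
\sigma^{2}(x_0, y_0) \;=\; \frac{1}{|B'|} \sum_{(x, y) \in B'}
  \bigl( d(x, y) - \bar{d} \bigr)^{2},
\qquad
\bar{d} \;=\; \frac{1}{|B'|} \sum_{(x, y) \in B'} d(x, y)
```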

Here, d(x, y) is the parallax map, and B′ is the region of the parallax map over which the variance is calculated. In this example, a 5 × 5 square region centered on the target pixel is used. The variance may be calculated by an equation other than Equation (3). Furthermore, an evaluation value other than the degree of dispersion (for example, smoothness) may be used, provided that it can evaluate the reliability of the parallax map.
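The per-pixel degree of dispersion over a 5 × 5 window can be sketched as follows. The function name is hypothetical, the parallax map is assumed to be a nested list indexed as disparity[y][x], and clipping the window at the image border is an assumption, since the text does not specify border handling.

```python
def local_variance(disparity, x0, y0, radius=2):
    """Variance of the parallax values in the (2*radius + 1)-square window
    centered on (x0, y0). The window is clipped at the image border."""
    height, width = len(disparity), len(disparity[0])
    values = [
        disparity[y][x]
        for y in range(max(0, y0 - radius), min(height, y0 + radius + 1))
        for x in range(max(0, x0 - radius), min(width, x0 + radius + 1))
    ]
    mean = sum(values) / len(values)
    return sum((value - mean) ** 2 for value in values) / len(values)
```

A window that lies entirely within one smooth surface yields a value near zero, while a window that contains scattered mismatches yields a large value.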

In step S510, the determination unit 206 executes processing that compares the calculated degree of dispersion with a threshold for each pixel (dispersion degree determination processing). The threshold used in step S510 is determined in advance by the determination unit 206. For example, the determination unit 206 determines the threshold based on the maximum parallax value of the image data to be processed.

The dispersion degree determination processing is now described. FIG. 5(b) shows the flow of this processing.

In step S521, the determination unit 206 selects a pixel to be judged.

In step S522, the determination unit 206 compares the degree of dispersion calculated for the selected pixel with the threshold. If the degree of dispersion is greater than the threshold (YES in step S522), the reliability of the parallax is judged to be low (at or below a certain level), and the process proceeds to step S524. If the degree of dispersion is less than or equal to the threshold (NO in step S522), the reliability is judged to be high (above the certain level), and the process proceeds to step S523. In this case, the determination unit 206 outputs the pixel position of the pixel judged to have high reliability to the parallax determination unit 207.

In step S523, the parallax determination unit 207 finalizes the parallax value at the pixel position acquired from the determination unit 206. Specifically, it fixes the parallax value at that pixel position in the parallax map acquired from the parallax estimation unit 205 at the currently set value. The parallax determination unit 207 also updates the value of the flag map at that pixel position to 0. In this way, when the parallax map around the target pixel is smooth, the parallax determination unit 207 judges that the parallax has been estimated correctly, sets the flag map to 0, and excludes the target pixel from subsequent processing. When updating the flag map, the parallax determination unit 207 does not update the values at pixel positions other than the one acquired from the determination unit 206.

In step S524, the determination unit 206 determines whether any pixel remains that has not yet been judged. If so (YES in step S524), the process returns to step S521. If not (NO in step S524), the determination unit 206 ends the dispersion degree determination processing.

Although the degree of dispersion is used as the reliability measure in this example, when smoothness is used instead, the determination unit 206 may judge the reliability to be low when the smoothness is less than the threshold and high when the smoothness is greater than or equal to the threshold.
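The flow of steps S521 to S524 can be sketched as a single pass over the flag map. The function name and the nested-list data layout are assumptions. Skipping pixels whose flag is already 0 is also an assumption; it does not change the resulting flag map, because a flag is never set back to 1 in this flow.

```python
def dispersion_determination(dispersion, flag_map, threshold):
    """Clear the flag of every pixel whose degree of dispersion is at or
    below the threshold, and return the positions confirmed in this pass.

    dispersion and flag_map are nested lists of the same shape, indexed [y][x].
    flag_map is updated in place: 1 means "still to be processed".
    """
    confirmed = []
    for y, row in enumerate(dispersion):
        for x, value in enumerate(row):
            if flag_map[y][x] != 1:
                continue                      # already finalized earlier
            if value <= threshold:            # high reliability (NO in S522)
                flag_map[y][x] = 0            # S523: exclude from later passes
                confirmed.append((x, y))
            # otherwise the flag stays 1 and the pixel is re-matched later
    return confirmed
```

Pixels left with a flag of 1 after this pass are the ones re-processed with the enlarged block at the next level.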

In step S511, the determination unit 206 determines whether the number of block enlargements has reached the maximum number of enlargements. The determination unit 206 acquires and holds the maximum number of enlargements in advance, either via the input interface 105 or from the secondary storage device 104. If the number of enlargements is greater than or equal to the maximum (YES in step S511), the process proceeds to step S513. If it is less than the maximum (NO in step S511), the process proceeds to step S512.

In step S512, the region setting unit 204 enlarges the block used at the current level, based on the enlargement direction that it holds. Specifically, the region setting unit 204 updates the block information based on the enlargement direction information, which determines the size and shape of the block used at the next level. Here, the region setting unit 204 enlarges the block used for block matching at the current level by the size step, in the enlargement direction only. The block to be used for block matching at the next level is thereby determined. The size step can be specified by the user via the input interface 105. Alternatively, information indicating the size step may be stored in the secondary storage device 104 in advance and read out by the region setting unit 204. The size step may also be determined by the region setting unit 204, for example according to the maximum number of block enlargements. The region setting unit 204 outputs the updated block information to the parallax estimation unit 205. The processing of step S506 is then executed again; that is, block matching is performed again with the enlarged block.
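One way to represent a block that grows only along the enlargement direction is sketched below. The representation is an assumption made for illustration: the block is a line of pixel offsets centered on the target pixel, described by a half-length that grows by the size step on both sides at each level. The names, the pixel step assigned to each candidate angle, and the symmetric growth are hypothetical and are not taken from the text.

```python
# Assumed pixel step (du, dv) for each candidate angle in degrees.
# The sense of the diagonals depends on the image axis convention.
DIRECTION_STEPS = {0.0: (1, 0), 45.0: (1, 1), 90.0: (0, 1), 135.0: (-1, 1)}


def enlarge_half_length(half_length, size_step):
    """Half-length of the block at the next level."""
    return half_length + size_step


def block_offsets(direction_deg, half_length):
    """Offsets of a one-pixel-wide block centered on the target pixel and
    extended along the enlargement direction. half_length = 0 gives the
    single-pixel initial block."""
    su, sv = DIRECTION_STEPS[direction_deg]
    return [(k * su, k * sv) for k in range(-half_length, half_length + 1)]
```

Starting from the single-pixel initial block and a size step of 1, the block along the 0-degree direction grows to three pixels, then five, and so on, while its extent perpendicular to the epipolar line stays at one pixel.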

In step S513, the parallax determination unit 207 sets the parallax value in the parallax map to 0 or to a predetermined value for each pixel whose flag map value is still 1. This is because highly accurate parallax estimation can be judged to be impossible for such pixels. The parallax determination unit 207 then updates the flag map value of those pixels to 0.

In step S514, the parallax determination unit 207 outputs the parallax map to a destination such as the secondary storage device 104, the external storage device 108, or the display device 109, and the processing ends.

The above is the processing performed by the image processing apparatus 100 of the first embodiment. Although this embodiment uses a block elongated in one direction and centered on the target pixel, the block shape is not limited to this; for example, a cross shape centered on the target pixel may be used. In addition, in this embodiment the parallax map is output once the number of block enlargements (that is, the number of block matching passes) reaches a predetermined number. However, if the reliability of every parallax value in the parallax map exceeds the certain level, the parallax map may be output at that point.

FIG. 8 is a conceptual diagram for explaining the effect of the first embodiment. In FIG. 8, image 801 is an image of a plate 805 having a random pattern and a plate 806 whose entire surface is a textureless region, captured by the left-viewpoint camera 802. Image 803 is an image of the same subjects captured by the right-viewpoint camera 804. If, for example, a parallax map is generated by block matching between image 801 and image 803 using block 810, an error 808 occurs in the textureless region of the image (parallax image) 807 represented by that parallax map. In contrast, the image processing apparatus 100 of the first embodiment executes hierarchical block matching while enlarging the block in the direction parallel to the epipolar line. This suppresses the inclusion in the block of a boundary line parallel to the epipolar line, so that an erroneous block is not detected even when the image contains a textureless region. Moreover, in this hierarchical block matching the image processing apparatus 100 derives the optimal parallax value for each pixel while evaluating the degree of dispersion of the parallax map, so the parallax can be estimated more accurately. The image processing apparatus 100 of the first embodiment can therefore generate a highly accurate parallax image 809.

<Embodiment 2>
In the first embodiment, the enlargement direction of the block used for matching is determined based on the epipolar line calculated from the imaging parameters. In this embodiment, the user determines the enlargement direction. When the imaging cameras lie on the same plane and their optical axes are parallel, the epipolar line is parallel to the direction vector between the base camera and the reference camera, as described in the first embodiment. In this embodiment, therefore, the user predicts the approximate direction of the epipolar line, and the predicted direction is used as the enlargement direction of the block. Unlike the first embodiment, which determines the enlargement direction using the epipolar line calculated from the positional relationship of the imaging cameras, this embodiment determines it from an approximate direction. Because the epipolar line is not calculated, the amount of processing can be reduced compared with the first embodiment.

The processing performed by the image processing apparatus 100 of this embodiment is described below. FIG. 9 is a block diagram showing an example of the functional configuration of the image processing apparatus 100 in the second embodiment. FIG. 10 is a flowchart showing the flow of the parallax estimation processing in the second embodiment. The functions shown in FIG. 9 operate when the CPU 101 loads the program stored in the ROM 103 into the RAM 102 and executes it, whereby the series of processes shown in FIG. 10 is executed. Not all of the processing described below needs to be executed by the CPU 101; the image processing apparatus 100 may be configured so that part or all of the processing is performed by one or more processing circuits other than the CPU 101. The functional configuration of this embodiment is the same as that of the first embodiment, except that an enlargement direction acquisition unit 901 is included in place of the imaging parameter acquisition unit 202 and the epipolar line calculation unit 203. The processing in steps S1001, S1002, and S1005 to S1013 shown in FIG. 10 is the same as that in steps S501, S502, and S506 to S514 of the first embodiment, and its description is omitted below.

In step S1003, the enlargement direction acquisition unit 901 acquires, via the input interface 105, the block enlargement direction specified by the user. The user specifies the enlargement direction as an angle relative to the horizontal axis of the image captured by the base camera. For example, 0 degrees is specified when the imaging cameras are arranged horizontally (that is, when the rotation angle about the optical axis is 0 degrees), and 45 degrees is specified when the imaging cameras are arranged diagonally at 45 degrees (that is, when the rotation angle about the optical axis is 45 degrees). The enlargement direction acquisition unit 901 outputs information indicating the acquired enlargement direction to the region setting unit 204.

In step S1004, the region setting unit 204 sets the enlargement direction of the block based on the information indicating the enlargement direction acquired from the enlargement direction acquisition unit 901. The region setting unit 204 also determines the initial block for the hierarchical block matching. The enlargement direction is selected from predetermined candidate directions. The specific method is the same as in the first embodiment, and its description is omitted.

The above is the processing performed by the image processing apparatus 100 of the second embodiment. As described above, in the second embodiment the user specifies the enlargement direction of the block, which reduces the amount of processing compared with the first embodiment.

<Embodiment 3>
The first embodiment described an example in which the enlargement direction of the block used for block matching is determined based only on epipolar line information. This embodiment describes an example in which edge information of the subject is used to decide whether to enlarge the block. In the first embodiment, block matching is performed using all pixels in the block, and the block is enlarged until the reliability of the estimated parallax value is judged to be high. In this embodiment, the block is corrected so that pixels unsuitable for block matching are excluded, and block matching is performed using the corrected block. In addition, enlargement of the block is restricted based on the edge information of the subject.

Consider the case in which block 1102 shown in FIG. 11 is used to estimate the parallax value of the target pixel 307 in the environment shown in FIG. 3. FIG. 11 is a diagram for explaining the block correction processing in the third embodiment. Block 1102 contains a large portion of a subject region (the region of subject 306) different from the subject region to which the target pixel 307 belongs (the region of subject 305). The features of subject 306 within block 1102 may therefore cause a block corresponding to subject 306 to be judged highly similar, so that the parallax is not estimated correctly. Conversely, when block 1103, which contains only subject 305 to which the target pixel 307 belongs, is used, the block consists entirely of a textureless region. Multiple regions may then be detected as similar regions, and again the parallax may not be estimated correctly. Thus, even when block matching is performed using a block parallel to the epipolar line, the accuracy of parallax estimation may decrease if the source block consists only of a textureless region or contains multiple subject regions.

In this embodiment, therefore, the edges contained in the block are extracted and the block is corrected so that it extends only slightly across an edge, as in block 1101. Because the block extends slightly across the edge, it never consists solely of a textureless region, which reduces the risk of detecting an erroneous block. Because it extends only slightly, the block does not include more than necessary of the subject region to which the target pixel 307 does not belong (the region of subject 306). Erroneous detection of a block corresponding to the parallax of subject 306 can therefore be suppressed. Furthermore, in this embodiment, when an edge extracted in the block is parallel to the epipolar line, the block is corrected so that it does not cross that edge, in order to suppress the phenomenon described above. This embodiment also performs hierarchical block matching as in the first embodiment, but at every level a direction in which the block has been corrected is not selected as an enlargement direction. Through this processing, this embodiment can obtain a parallax map that is more robust and more accurate than that of the first embodiment.

The processing performed by the image processing apparatus 100 of this embodiment is described below. FIG. 12 is a block diagram showing the functional configuration of the image processing apparatus 100 in the third embodiment. FIG. 13 is a flowchart showing the flow of the parallax estimation process in the third embodiment. The functions shown in FIG. 12 operate when the CPU 101 loads the program stored in the ROM 103 into the RAM 102 and executes it, and the series of processes shown in FIG. 13 is executed as a result. Not all of the processes described below need to be executed by the CPU 101; the image processing apparatus 100 may be configured so that some or all of the processes are performed by one or more processing circuits other than the CPU 101. The functional configuration of this embodiment is the same as that of the first embodiment, except that it additionally includes an edge extraction unit 1201 and a region correction unit 1202. The processes of steps S1301, S1303 to S1307, S1309 to S1312, S1315, and S1316 shown in FIG. 13 are the same as the processes of steps S501, S502 to S506, S507 to S510, S513, and S514 in the first embodiment, respectively, so their description is omitted below.

In step S1302, the edge extraction unit 1201 acquires image data from the image data acquisition unit 201 and extracts edges from the base image of the acquired image data. Various known edge extraction methods, such as a Sobel filter, can be used. In this embodiment, edge information in which pixels corresponding to an edge have the value 1 and non-edge pixels have the value 0 is output to the region correction unit 1202.
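
The edge information of step S1302 can be illustrated with a minimal sketch. It assumes a grayscale image held as a 2-D list, a Sobel gradient magnitude, and a fixed threshold; the function name `sobel_edge_map`, the `threshold` parameter, and the border handling are illustrative assumptions and are not specified in this document.

```python
def sobel_edge_map(image, threshold):
    """Binary edge map (1 = edge, 0 = non-edge) from a 2-D list of gray values."""
    h, w = len(image), len(image[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel

    def px(y, x):
        # Assumption: replicate border pixels so the map keeps the image size.
        return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = sum(kx[j][i] * px(y + j - 1, x + i - 1)
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * px(y + j - 1, x + i - 1)
                     for j in range(3) for i in range(3))
            # Assumption: a pixel is an edge when the gradient magnitude
            # exceeds a fixed threshold.
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[y][x] = 1
    return edges
```

With this kernel an ideal step edge produces a response two pixels wide, so a practical implementation might thin the result before using it; the document does not say whether such a step is applied.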

In step S1308, the region correction unit 1202 corrects the block acquired from the region setting unit 204 based on the edge information acquired from the edge extraction unit 1201. The processing of the region correction unit 1202 is described in detail with reference to FIG. 14.

FIG. 14 is a diagram for explaining the processing of the region correction unit 1202. In FIG. 14, the block 1402 is the block indicated by the block information that the region correction unit 1202 acquired from the region setting unit 204. The double-headed arrow 1401 indicates the enlargement direction of the block. The black pixel in the block 1402 is the target pixel, and the hatched pixels are pixels determined to be edges by the edge extraction unit 1201. First, the region correction unit 1202 searches from the target pixel in the negative direction of the arrow 1401 (leftward in FIG. 14) for pixels determined to be edges. It then searches in the same way in the positive direction of the enlargement direction 1401.

For example, when no pixel determined to be an edge exists in the negative direction of the enlargement direction 1401, the region correction unit 1202 decides not to correct the block in the negative direction. When a pixel determined to be an edge exists in isolation, as with the pixel 1404, the region correction unit 1202 corrects the block so that it straddles the edge by one pixel, as shown by the block 1403. The number of pixels by which the block straddles the edge is not limited to one; the block may be corrected to straddle the edge by several pixels. This number may also be acquired via the input interface 105 or from the secondary storage device 104. When pixels determined to be edges exist consecutively, as with the pixels 1405, the region correction unit 1202 determines that an edge parallel to the epipolar line exists and corrects the block so that it does not straddle the edge, as shown by the block 1403. The region correction unit 1202 holds information indicating whether the block has been corrected (hereinafter referred to as block correction information) separately for the positive and negative directions. The region correction unit 1202 also outputs information on the corrected block to the parallax estimation unit 205.
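
The correction rule described above can be sketched for one side of the target pixel as follows. This is an illustration under stated assumptions, not the implementation of the region correction unit 1202: the edge flags are taken as a 1-D list along the enlargement direction, "straddling by one pixel" is read as keeping the edge pixel plus `straddle` pixels beyond it (capped at the current block extent), and the name `correct_extent` and its parameters are hypothetical.

```python
def correct_extent(edge_line, center, extent, step, straddle=1):
    """Correct the block reach on one side of the target pixel.

    edge_line: 0/1 edge flags sampled along the enlargement direction.
    center:    index of the target pixel in edge_line.
    extent:    current reach of the block on this side, in pixels.
    step:      -1 for the negative direction, +1 for the positive direction.
    Returns (new_extent, corrected), where corrected plays the role of the
    block correction information for this direction.
    """
    n = len(edge_line)
    for d in range(1, extent + 1):
        i = center + d * step
        if not 0 <= i < n:
            break
        if edge_line[i]:
            nxt = i + step
            is_run = 0 <= nxt < n and edge_line[nxt] == 1
            if is_run:
                # Consecutive edge pixels: treated as an edge parallel to the
                # epipolar line, so the block stops just before the edge.
                return d - 1, True
            # Isolated edge pixel: keep the edge and `straddle` pixels beyond
            # it (assumption: never grow past the current extent).
            return min(d + straddle, extent), True
    # No edge found on this side: the block is left uncorrected.
    return extent, False
```

Calling the function once with `step=-1` and once with `step=+1` yields the per-direction block correction information that the later enlargement step consults.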

In step S1313, if the determination unit 206 determines that the block cannot be enlarged, or that the number of block enlargements is equal to or greater than the maximum number of block enlargements (YES in step S1313), the process proceeds to step S1315. Otherwise (NO in step S1313), the process proceeds to step S1314. The determination unit 206 determines that the block cannot be enlarged when the block correction information indicates that the block has been corrected in both the positive and negative directions.

In step S1314, the region setting unit 204 enlarges the block used at the current level of the hierarchy based on the block enlargement information it holds. To do so, the region setting unit 204 first identifies, from the block correction information held by the region correction unit 1202, the directions in which the block has not been corrected. It then enlarges the block by the size step width only in those directions. The block to be used at the next level is determined in this way. The region setting unit 204 outputs block information indicating the block for the next level to the region correction unit 1202.
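
The decision of step S1313 and the enlargement of step S1314 can be summarized in a short sketch. The block is represented here only by its reach on the negative and positive sides of the target pixel; the function names and parameters are hypothetical and chosen for illustration.

```python
def can_enlarge(corrected_neg, corrected_pos, enlarge_count, max_enlargements):
    """Step S1313 (sketch): True when the process should go on to step S1314."""
    if corrected_neg and corrected_pos:
        # Corrected in both directions: the block cannot be enlarged.
        return False
    # Stop once the enlargement count reaches the maximum.
    return enlarge_count < max_enlargements


def enlarge_block(extent_neg, extent_pos, corrected_neg, corrected_pos, step_width):
    """Step S1314 (sketch): grow the block only in uncorrected directions."""
    if not corrected_neg:
        extent_neg += step_width
    if not corrected_pos:
        extent_pos += step_width
    return extent_neg, extent_pos
```

For example, a block corrected only on its negative side keeps that side fixed while its positive side grows by the step width at each level of the hierarchy.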

The above is the processing performed by the image processing apparatus 100 of the third embodiment. In this embodiment, a block elongated in one direction and centered on the target pixel is used, but the block shape is not limited to this; for example, a cross shape centered on the target pixel may be used.

In the image processing apparatus 100 of the first embodiment, the estimated parallax may contain an error depending on the size and features of a subject that is contained in the block but differs from the subject of the target pixel. The accuracy of the estimated parallax therefore varies with the initial block that is set and with the size of the blocks enlarged during hierarchical block matching. In contrast, the image processing apparatus 100 of this embodiment uses the edge information within the block to correct the block so that unnecessary pixels are removed from it. This enables highly accurate parallax estimation regardless of the size and features of the different subject contained in the block. In other words, compared with the image processing apparatus 100 of the first embodiment, the image processing apparatus 100 of this embodiment enables estimation that is robust to the block settings. It also enables accurate parallax estimation near boundaries between subjects, where a subject different from that of the target pixel is likely to be contained in the block.

<Other Embodiments>
Embodiments of the present invention are not limited to the embodiments described above and can take various forms. In the above embodiments, parallax values are estimated on the assumption that the optical axes of the cameras that captured the images indicated by the multi-viewpoint image data are parallel to one another, but parallax values may also be estimated from images captured with non-parallel optical axes. In that case, image processing that converts the multi-viewpoint image data may be performed in advance, using the imaging parameters of the camera that captured each image, so that each image becomes the same as an image captured with the camera optical axes parallel. Alternatively, an epipolar line may be calculated for each pixel using a known method that also handles non-parallel optical axes, and the parallax value may then be estimated based on the epipolar line.

In the above embodiments, parallax values are estimated by comparing only the selected base image and reference image among the plurality of images indicated by the multi-viewpoint image data, but three or more images may be used to estimate the parallax values. In that case, the amount of movement of the subject image per unit vector indicating the positional relationship between the viewpoints (an amount of movement in three-dimensional space) is stored as the parallax value. The evaluation value used for matching in this case is expressed, using variance, by Equation (4).

In Equation (4), B denotes the set of pixels contained in the block used for matching, and |B| denotes the number of pixels contained in the region B. l is the parallax value described above, which indicates the amount of movement of the subject image per unit vector. (r_xk, r_yk) denotes the relative position vector of each viewpoint as seen from the base viewpoint, and n denotes the total number of viewpoints used for matching. The evaluation value calculated here is not limited to Equation (4); various known values can be used as long as they indicate the degree of similarity between two image regions.
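
Equation (4) itself is not reproduced in this text. The following sketch shows one variance-based evaluation value that is consistent with the symbol definitions above; it is an assumed form, not necessarily the equation as filed. For each pixel in B, every view k is sampled at a position shifted by l times (r_xk, r_yk), the variance of the n samples is taken, and the result is averaged over |B|. The sign of the shift, the nearest-pixel sampling, the border clamping, and the name `variance_cost` are all assumptions.

```python
def variance_cost(views, offsets, block, l):
    """Variance-based multi-view matching cost (assumed form, lower is better).

    views:   list of n images, each a 2-D list of gray values.
    offsets: list of n relative position vectors (r_xk, r_yk); the base
             viewpoint has offset (0, 0).
    block:   list of (x, y) pixel positions forming the set B.
    l:       candidate parallax value (movement per unit vector).
    """
    n = len(views)
    total = 0.0
    for x, y in block:
        samples = []
        for img, (rx, ry) in zip(views, offsets):
            # Assumption: view k sees the pixel shifted by l * (r_xk, r_yk),
            # sampled at the nearest pixel and clamped to the image border.
            xs = min(max(int(round(x + rx * l)), 0), len(img[0]) - 1)
            ys = min(max(int(round(y + ry * l)), 0), len(img) - 1)
            samples.append(img[ys][xs])
        mean = sum(samples) / n
        total += sum((s - mean) ** 2 for s in samples) / n
    return total / len(block)
```

Under these assumptions, the candidate l that minimizes the cost would be taken as the parallax value for the block, since at the correct shift all n views sample the same subject point and their variance is small.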

The present invention can also be realized by a process in which a program that implements one or more functions of the above embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

204 Region setting unit
205 Parallax estimation unit
206 Determination unit
207 Parallax determination unit

Claims

An image processing device that generates a parallax map indicating parallax among a plurality of images captured from mutually different viewpoints, the image processing device comprising:
derivation means for, with one of the plurality of images as a base image and another image different from the one image as a reference image, searching the reference image by block matching for a pixel position corresponding to each pixel position of the base image, and deriving a parallax value with respect to the pixel position detected by the block matching;
generation means for generating the parallax map based on the parallax value derived for each pixel position of the base image; and
update means for enlarging a block used for the block matching along a search direction of the block matching, causing the derivation means to execute the block matching again for a pixel position at which the reliability of the parallax derived by the derivation means is at or below a certain level, and updating the parallax map with the parallax with respect to the pixel position detected by the repeated block matching.

The image processing device according to claim 1, wherein the update means repeats the processing when the updated parallax map contains a parallax whose reliability is at or below the certain level.

The image processing device according to claim 2, wherein the update means ends the processing when the block matching has been executed a predetermined number of times or more, even if the updated parallax map contains a parallax whose reliability is at or below the certain level.

The image processing device according to any one of claims 1 to 3, wherein the update means
includes calculation means for calculating a degree of dispersion of the parallax derived by the derivation means, and
compares the degree of dispersion calculated by the calculation means with a predetermined threshold and determines that the reliability is at or below the certain level when the degree of dispersion is greater than the threshold.

The image processing device according to any one of claims 1 to 3, wherein the update means
includes calculation means for calculating a smoothness of the parallax derived by the derivation means, and
compares the smoothness calculated by the calculation means with a predetermined threshold and determines that the smoothness is at or below the certain level when the smoothness is smaller than the threshold.

The image processing device according to any one of claims 1 to 5, wherein the search direction of the block matching is a direction parallel to an epipolar line on the reference image that corresponds to a target pixel of the base image.

The image processing device according to claim 6, wherein, when a base camera that captures the base image and a reference camera that captures the reference image are arranged on the same plane with their optical axes parallel to each other,
the update means enlarges the block by regarding the direction indicated by a direction vector derived from the coordinate points of the base camera and the reference camera in three-dimensional space as the direction parallel to the epipolar line.

The image processing device according to claim 6, wherein, when a base camera that captures the base image and a reference camera that captures the reference image are arranged on the same plane with their optical axes parallel to each other,
the update means enlarges the block by regarding the direction indicated by an angle with respect to the horizontal axis of the base image as the direction parallel to the epipolar line.

The image processing device according to claim 8, further comprising input means for inputting information indicating the angle with respect to the horizontal axis of the base image,
wherein the update means acquires the angle with respect to the horizontal axis of the base image from the information input by the input means.

The image processing device according to any one of claims 6 to 9, further comprising correction means for correcting at least one of the size and the shape of a matching-source block so that, when the block matching is performed and the block is determined to contain an edge parallel to the epipolar line, the edge is not contained in the block.

The image processing device according to claim 10, wherein, when the block matching is performed and the matching-source block contains an edge that is not parallel to the epipolar line, the correction means corrects at least one of the size and the shape of the block so that the block straddles the edge by one pixel or several pixels.

An image processing method for generating a parallax map indicating parallax among a plurality of images captured from mutually different viewpoints, the image processing method comprising:
a derivation step of, with one of the plurality of images as a base image and another image different from the one image as a reference image, searching the reference image by block matching for a pixel position corresponding to each pixel position of the base image, and deriving a parallax with respect to the pixel position detected by the block matching;
a generation step of generating the parallax map based on the parallax derived for each pixel position of the base image; and
an update step of enlarging a block used for the block matching along a search direction of the block matching, causing the block matching to be executed again for a pixel position at which the reliability of the derived parallax is at or below a certain level, and updating the parallax map with the parallax with respect to the pixel position detected by the repeated block matching.

A program for causing a computer to function as the image processing device according to any one of claims 1 to 11.