JP6310417B2 - Image processing apparatus, image processing method, and image processing program

Info

Publication number: JP6310417B2
Application number: JP2015110509A
Authority: JP
Inventors: 信哉志水; 志織杉本; 広太竹内; 明小島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2015-05-29
Filing date: 2015-05-29
Publication date: 2018-04-11
Anticipated expiration: 2035-05-29
Also published as: JP2016225832A

Description

The present invention relates to an image processing apparatus, an image processing method, and an image processing program that generate a desired image by image processing.

Spatial resolution is a major factor in the quality of digital images and video. For this reason, research and development of high-definition video/image systems that can handle higher-resolution video has been carried out continuously. High-resolution video/images make it possible to render subjects and backgrounds sharply down to fine detail. On the other hand, properties that could not be perceived at low resolution, such as whether each subject is in focus, also become visible. In general, a video/image in which the subject being watched is out of focus is perceived as blurred, and its image quality is judged to be low. Accurate focus control is therefore considered very important when shooting high-resolution video/images.

In this specification, an image means a still image or a single frame of a moving image. A video has the same meaning as a moving image and is a set of a series of images.

However, focus control when shooting high-resolution video/images is known to be a very difficult task. When shooting low-resolution images/video, the focus state can be checked on a viewfinder or a small confirmation monitor while shooting, but when shooting high-resolution images/video, a small monitor cannot show the focus state in fine detail.

In general, a monitor capable of displaying high-resolution images/video is large, so one person cannot shoot and check the focus at the same time. For this reason, a staff member called a "focus man" is assigned in addition to the camera operator. This person checks the focus state on a large monitor at a separate location and either operates the focus or reports the observed focus state to the camera operator.

Imaging apparatuses that allow the focus to be adjusted after shooting, on the premise that image processing is performed after shooting, have also been developed. Such an apparatus is called a light field camera and has a configuration in which a microlens array is inserted between the main lens and the projection plane of a conventional camera (see, for example, Non-Patent Document 1). With this configuration, the light rays entering the camera can be recorded for each angle of incidence, and images/video focused at different distances can be generated from the recorded data. An image captured by a light field camera (hereinafter referred to as a light field image) expresses the intensity of light rays at each pixel position for each ray direction.

R. Ng, "Digital light field photography", Ph.D. dissertation, Stanford University, July 2006.

However, the method described in Non-Patent Document 1 must allocate image sensor elements not only to sampling spatially different light rays but also to sampling rays with different angles of incidence on the lens, so the spatial resolution that can be captured is reduced.

Note that the product of the spatial resolution and the angular resolution is approximately equal to the number of image sensor elements. Which of the two receives the higher resolution can therefore be controlled to some extent when the imaging apparatus is designed. However, because the angular resolution affects the range over which refocusing is possible after capture, a certain angular resolution is required. Light field cameras built and sold on the basis of the method of Non-Patent Document 1 have an angular resolution of roughly 10 × 10, so their spatial resolution is about 1/100 of that of an ordinary camera.

Using more image sensor elements can prevent the loss of spatial resolution. However, with such an approach the imaging apparatus becomes larger in order to gather a sufficient amount of light, and cost also rises because a higher-resolution image sensor is required. For example, for the light field camera described above, an image sensor with about 100 times the resolution is needed to keep the original spatial resolution.
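The trade-off described above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a hypothetical 20-megapixel sensor; the function names are illustrative and do not appear in the specification.

```python
def spatial_samples(sensor_pixels: int, angular_u: int, angular_v: int) -> int:
    """Approximate spatial sample count when every spatial sample
    spends angular_u * angular_v sensor pixels on ray directions."""
    return sensor_pixels // (angular_u * angular_v)


def sensor_pixels_needed(target_spatial: int, angular_u: int, angular_v: int) -> int:
    """Sensor pixels needed to keep target_spatial spatial samples
    at the given angular resolution."""
    return target_spatial * angular_u * angular_v


sensor = 20_000_000  # hypothetical sensor size, not taken from the source
print(spatial_samples(sensor, 10, 10))       # 1/100 of the sensor pixels
print(sensor_pixels_needed(sensor, 10, 10))  # 100 times the sensor pixels
```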

The present invention has been made in view of these circumstances, and an object thereof is to provide an image processing apparatus, an image processing method, and an image processing program capable of generating an image that enables refocusing of a high-resolution image while maintaining the spatial resolution of the high-resolution image or video.

One aspect of the present invention is an image processing method that uses a processing target image to be converted into a light field and a reference light field image, which expresses the intensity of light rays in the same scene as the processing target image for each ray direction at a lower spatial resolution than the processing target image, to generate a light field image that expresses the intensity of light rays at each pixel position of the processing target image for each ray direction, the method including: a region dividing step of dividing the processing target image into a plurality of processing regions; a processing target image focus estimation step of estimating focus information of the processing target image for each processing region using the processing target image and the reference light field image; and a high-resolution light field image generation step of generating, for each processing region, a high-resolution light field image, which is a light field image for the processing target image, using the processing target image, the reference light field image, and the estimated focus information.

One aspect of the present invention is the image processing method described above, further including a virtual focus image generation step of generating, from the reference light field image, a plurality of virtual focus images having different focus information, wherein the processing target image focus estimation step estimates the focus information of the processing target image using the virtual focus images and the processing target image.

One aspect of the present invention is the image processing method described above, wherein the virtual focus image generation step generates a viewpoint-synthesized light field image in which the differences in shooting position and orientation between the reference light field image and the processing target image are compensated, and generates the virtual focus images using the viewpoint-synthesized light field image.

One aspect of the present invention is the image processing method described above, wherein the virtual focus image generation step generates a high-resolution viewpoint-synthesized light field image by matching the spatial resolution of the viewpoint-synthesized light field image to that of the processing target image, and generates the virtual focus images using the high-resolution viewpoint-synthesized light field image.

One aspect of the present invention is the image processing method described above, wherein the virtual focus image generation step generates, from the viewpoint-synthesized light field image, a plurality of low-resolution virtual focus images that have the same spatial resolution as the viewpoint-synthesized light field image and different focus information, and generates the virtual focus images by upsampling the low-resolution virtual focus images to the same spatial resolution as the processing target image.
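The selection implied by the virtual focus images can be illustrated as follows. This is only a sketch under stated assumptions: the comparison measure (sum of squared differences) and the function name `estimate_focus` are hypothetical choices made for illustration, and the aspects above do not prescribe how the virtual focus images and the processing target image are compared.

```python
import numpy as np


def estimate_focus(target_region, virtual_focus_regions, focus_params):
    """Pick the focus parameter whose virtual focus image is closest
    to the target region (sum of squared differences, an assumption)."""
    target = np.asarray(target_region, dtype=float)
    errors = [
        float(np.sum((np.asarray(v, dtype=float) - target) ** 2))
        for v in virtual_focus_regions
    ]
    best = int(np.argmin(errors))
    return focus_params[best], errors[best]
```

Each entry of `virtual_focus_regions` is assumed to be the same processing region cut out of a virtual focus image that already has the spatial resolution of the processing target image.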

One aspect of the present invention is the image processing method described above, wherein the focus information includes information on the method used to generate an ordinary image from a light field image.

One aspect of the present invention is an image processing apparatus that uses a processing target image to be converted into a light field and a reference light field image, which expresses the intensity of light rays in the same scene as the processing target image for each ray direction at a lower spatial resolution than the processing target image, to generate a light field image that expresses the intensity of light rays at each pixel position of the processing target image for each ray direction, the apparatus including: region dividing means for dividing the processing target image into a plurality of processing regions; processing target image focus estimation means for estimating focus information of the processing target image for each processing region using the processing target image and the reference light field image; and high-resolution light field image generation means for generating, for each processing region, a high-resolution light field image, which is a light field image for the processing target image, using the processing target image, the reference light field image, and the estimated focus information.

One aspect of the present invention is an image processing program for causing a computer to execute the image processing method described above.

According to the present invention, by estimating the light ray information of a high-resolution image or video using a light field image of the same scene, a light field image that allows the high-resolution image to be refocused can be generated.

FIG. 1 is a block diagram showing the configuration of an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart showing the operation of the image processing apparatus 100 shown in FIG. 1.
FIG. 3 is a flowchart showing the processing operation for estimating the focus information of the processing target image from the reference light field image in step S102 shown in FIG. 2.
FIG. 4 is a block diagram showing the detailed configuration of the high-resolution light field image generation unit 104 shown in FIG. 1.
FIG. 5 is a flowchart showing the operation of the high-resolution light field image generation unit 104 shown in FIG. 4.
FIG. 6 is a flowchart showing a modified example of the operation of the high-resolution light field image generation unit 104 shown in FIG. 4.
FIG. 7 is a block diagram showing the hardware configuration when the image processing apparatus 100 is configured by a computer and a software program.

Hereinafter, an image processing apparatus according to an embodiment of the present invention will be described with reference to the drawings. The processing of a single image is described here, but a video (moving image) can be processed by repeating the same processing on a plurality of consecutive images. The processing of this technique need not be applied to every frame of the video; it may be applied to some frames while other processing is applied to the remaining frames.

FIG. 1 is a block diagram showing the configuration of the image processing apparatus according to this embodiment. The image processing apparatus 100 is configured by a computer device and, as shown in FIG. 1, includes a processing target image input unit 101, a reference light field image input unit 102, a focus information estimation unit 103, and a high-resolution light field image generation unit 104.

The processing target image input unit 101 inputs a high-resolution image that is to be converted into a light field. Hereinafter, this image is referred to as the processing target image. The reference light field image input unit 102 inputs a light field image of the same scene as the processing target image with a lower spatial resolution than the processing target image. Hereinafter, this low-resolution light field image is referred to as the reference light field image.

Any kind of light field image may be input. For example, it may be a light field image obtained, as in Non-Patent Document 1, by capturing the optical image of a subject formed by a main lens through a plurality of microlenses, or it may be a light field image obtained by another method. Here, it is assumed that a light field image as described in Non-Patent Document 1 is input.

The focus information estimation unit 103 receives the processing target image and the reference light field image, and estimates the focus of the processing target image using the reference light field image. The high-resolution light field image generation unit 104 estimates and generates a light field version of the processing target image in accordance with the processing target image, the reference light field image, and the estimated focus. Hereinafter, the generated light field image is referred to as the high-resolution light field image.

Next, the operation of the image processing apparatus 100 shown in FIG. 1 will be described with reference to FIG. 2. FIG. 2 is a flowchart showing the operation of the image processing apparatus 100 shown in FIG. 1. First, the various kinds of information (the processing target image and the reference light field image) are input and held internally (step S101). Specifically, the processing target image input unit 101 inputs the processing target image, the reference light field image input unit 102 inputs the reference light field image, and each unit holds the input image internally.

When the input of the processing target image and the reference light field image is complete, the focus information estimation unit 103 divides the processing target image into regions of a predetermined size and, for each region, estimates the focus information of the processing target image using the reference light field image (step S102). This processing is described in detail later. The divided regions may overlap one another. However, every pixel of the processing target image must be included in at least one region. Hereinafter, such a region is referred to as a processing region.
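One way to divide the image into possibly overlapping processing regions while guaranteeing that every pixel falls into at least one region is sketched below. The square region shape, the `stride` parameter, and the function name are assumptions made for illustration; the embodiment only requires regions of a predetermined size that together cover the image.

```python
def split_into_regions(height, width, size, stride):
    """Return (top, left, bottom, right) boxes that cover a height x width image.

    Regions overlap when stride < size. An extra region flush with the
    border is added when the regular grid would leave pixels uncovered.
    """
    if not 0 < stride <= size:
        raise ValueError("stride must be in (0, size] so that no pixel is skipped")

    def starts(length):
        if length <= size:
            return [0]
        s = list(range(0, length - size + 1, stride))
        if s[-1] != length - size:
            s.append(length - size)
        return s

    return [
        (t, l, min(t + size, height), min(l + size, width))
        for t in starts(height)
        for l in starts(width)
    ]
```

Setting `stride` equal to `size` gives non-overlapping tiles wherever the image dimensions are multiples of `size`; smaller strides give overlapping regions.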

The focus information estimated in step S102 represents focus-related parameters, such as the focal plane and the depth of field, that are needed to generate the processing target image in that processing region from the reference light field image. Parameters other than focus-related ones may also be included, as long as they are needed to generate the image from the reference light field image. The type and number of parameters depend on the method used to generate an ordinary image from a light field image. The method itself may be treated as one of the parameters, so that a plurality of methods can be switched from one processing region to another.

Any method may be assumed in step S102 for generating an ordinary image from a light field image. For example, the Fourier slice method (described in Reference 1: R. Ng, "Fourier slice photography," ACM SIGGRAPH 2005 Pap. - SIGGRAPH '05, p. 735, 2005) or the shift-and-add method (described in Reference 2: R. Ng, M. Levoy, G. Duval, M. Horowitz, and P. Hanrahan, "Light Field Photography with a Hand-held Plenoptic Camera," Stanford Tech Rep. CTSR, pp. 1-11, 2005) may be used. In the shift-and-add method, for example, the reference shift amount, the sub-aperture images to be used, the filter applied to the sub-aperture images, and the type and strength of the post-filter constitute the focus information.

When the estimation of the focus information is complete, the high-resolution light field image generation unit 104 generates the high-resolution light field image by estimating, for each processing region, the light field image for the processing target image using the processing target image, the reference light field image, and the estimated focus information (step S103). The generated high-resolution light field image is the output of the image processing apparatus 100. Any technique may be used for this processing as long as it uses the processing target image, the reference light field image, and the estimated focus information.

For example, the high-resolution light field image may be generated while taking into account consistency with the processing target image and the reference light field image. In that case, the high-resolution light field image $LF_{\mathrm{high}}$ may be generated according to equations (1) and (2).

$$LF_{\mathrm{high}} = \arg\min_{LF} E(LF) \tag{1}$$

$$E(LF) = \alpha \lVert \mathrm{Down}(LF) - LF_{\mathrm{low}} \rVert + \beta \lVert \mathrm{Conv}(LF) - I_{\mathrm{high}} \rVert + \lambda R(LF) \tag{2}$$

Here, $LF_{\mathrm{high}}$, $LF_{\mathrm{low}}$, and $I_{\mathrm{high}}$ denote the high-resolution light field image, the reference light field image, and the processing target image, respectively. Down denotes downsampling of a light field image: it downsamples the given light field image and returns a light field image under the same conditions as the reference light field image. Conv denotes reconstruction of an ordinary image from a light field image: it returns an image under the same conditions as the processing target image, reconstructed from the given light field image in accordance with the estimated focus information.

$\alpha$, $\beta$, and $\lambda$ are parameters that adjust the weight of each term. $\lVert A \rVert$ denotes the norm of $A$ (typically the $\ell_2$ norm, although the $\ell_0$ or $\ell_1$ norm may also be used). $R$ returns the result of evaluating how plausible the given light field image is as a light field image. Any criterion may be used for this evaluation. Here, a smaller value is taken to mean a more plausible light field image, but an $R$ that returns larger values for more plausible light field images may also be used, in which case $\lambda$ is a negative number.
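The structure of the evaluation function in equation (2) can be sketched in code. This is an illustration under several assumptions that are not part of the embodiment: the light field is stored as a 4-D array `lf[u, v, y, x]` (angular indices first), Down is spatial block averaging, Conv is a plain average of all sub-aperture images (a placeholder that ignores the estimated focus information), the $\ell_2$ norm is used, and the regularizer R is passed in as a callable.

```python
import numpy as np


def down(lf, factor):
    """Spatially downsample lf[u, v, y, x] by block averaging.
    Height and width are assumed to be multiples of factor."""
    u, v, h, w = lf.shape
    blocks = lf.reshape(u, v, h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(3, 5))


def conv(lf):
    """Placeholder for Conv: average all sub-aperture images."""
    return lf.mean(axis=(0, 1))


def energy(lf, lf_low, i_high, factor, regularizer,
           alpha=1.0, beta=1.0, lam=0.1):
    """E(LF) = alpha*||Down(LF) - LF_low|| + beta*||Conv(LF) - I_high|| + lam*R(LF)."""
    data_lf = np.linalg.norm((down(lf, factor) - lf_low).ravel())
    data_img = np.linalg.norm((conv(lf) - i_high).ravel())
    return alpha * data_lf + beta * data_img + lam * regularizer(lf)
```

A candidate that reproduces both observations exactly makes the two data terms vanish, leaving only the weighted regularizer.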

As $R$, for example, the sparsity of the light field image may be used, as shown in equations (3) and (4). That is, the norm of the coefficient vector $\chi$ obtained when the light field image is expressed with an overcomplete dictionary $D$ may be used. Here, $\lVert \chi \rVert_n$ denotes the $\ell_n$ norm of $\chi$. The $\ell_0$ norm, the $\ell_1$ norm, the $\ell_{1/2}$ norm, and the like are commonly used, but any norm may be used.

$$R(LF_{\mathrm{high}}) = \lVert \chi \rVert_n \tag{3}$$

$$LF_{\mathrm{high}} = D\chi \tag{4}$$

The overcomplete dictionary $D$ may be generated by any method. For example, it may be generated from the reference light field image or from a separate group of light field images. As a specific generation method, for example, the method described in Reference 3 (J. Mairal, F. Bach, J. Ponce, and G. Sapiro, "Online Dictionary Learning for Sparse Coding", International Conference on Machine Learning, 2009) may be used. When the dictionary is generated from a separate group of light field images, a dictionary generated in advance may be input and used.

When the overcomplete dictionary $D$ is used, the same dictionary may be used for all processing regions, or a different dictionary may be used for each processing region. For example, different dictionaries may be used according to the estimated focus information. By using a dictionary generated for the given focus information, the restored light field image can at least be made to have the same focus information as the estimation target. In addition, because the light field can then be represented with characteristic bases that depend on the focus information, the dictionary becomes smaller, which reduces the computation required for the minimization problem of equation (1) and allows the light field image to be restored quickly.

As another example, the image-likeness of a refocused image or an all-in-focus image generated from the light field image may be used. Measures of image-likeness include the TV (Total Variation) norm. A single refocused image may be used, or a plurality of refocused images may be generated and the mean, sum, or the like of their image-likeness values may be used.
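The TV norm mentioned above can be computed as follows. This is a minimal sketch using the anisotropic form (sum of absolute differences between neighbouring pixels); the specification does not say which TV variant is intended, and the function names are illustrative.

```python
import numpy as np


def total_variation(image):
    """Anisotropic TV norm of a 2-D image."""
    image = np.asarray(image, dtype=float)
    dy = np.abs(np.diff(image, axis=0)).sum()  # vertical neighbour differences
    dx = np.abs(np.diff(image, axis=1)).sum()  # horizontal neighbour differences
    return float(dx + dy)


def mean_total_variation(images):
    """Mean TV norm over several refocused images."""
    return float(np.mean([total_variation(img) for img in images]))
```

Smaller values indicate smoother, more image-like content, which matches the convention that a smaller R means a more plausible light field image.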

As yet another example, the plausibility of the group of sub-aperture images that can be generated from the light field image may be used. A sub-aperture image is an image that can be generated by collecting the pixels of the same angular component in the light field image and arranging them according to their sampling positions. That is, as many sub-aperture images as the angular resolution can be generated from one light field image. The spatial resolution of a sub-aperture image is the same as the spatial resolution of the light field image. The image-likeness of a sub-aperture image can be used as the plausibility of that sub-aperture image.
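The rearrangement into sub-aperture images can be sketched for an idealized lenslet layout. The sketch assumes a raw image in which every microlens covers an aligned block of `n_u` × `n_v` pixels, so that the pixel at offset `(u, v)` inside each block belongs to angular component `(u, v)`. Real plenoptic raw data generally needs calibration before this holds, and the function name is hypothetical.

```python
import numpy as np


def sub_aperture_images(raw, n_u, n_v):
    """Rearrange raw[y * n_u + u, x * n_v + v] into out[u, v, y, x]."""
    h, w = raw.shape
    if h % n_u or w % n_v:
        raise ValueError("raw size must be a multiple of the microlens block size")
    blocks = raw.reshape(h // n_u, n_u, w // n_v, n_v)  # axes: y, u, x, v
    return blocks.transpose(1, 3, 0, 2)                 # axes: u, v, y, x
```

The result holds `n_u * n_v` sub-aperture images, one per angular component, each with one pixel per microlens.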

Depending on the format of the light field image, a sub-aperture image is theoretically approximated by a pinhole camera image, so the amount of blur in a sub-aperture image may be used as the plausibility of each sub-aperture image. A measure that takes both the image-likeness described above and the amount of blur into account may also be used. To evaluate the sub-aperture image group as a whole, the mean, sum, variance, or the like of the evaluation values of the individual sub-aperture images may be used.

As yet another example, the depth-map-likeness of a depth map that can be estimated from the reference light field image may be used. Depth-map-likeness here can be, for example, an evaluation of whether the depth map has the piecewise-smooth property that depth maps generally have. Specific examples include the TV norm of the depth map and the norm obtained when the depth map is sparsely represented with an overcomplete dictionary for depth maps. Any technique can be used to estimate the depth map from the reference light field image.

For example, the group of sub-aperture images that can be generated from the light field image may be regarded as a multi-view image, and the depth may be estimated by depth estimation such as stereo matching. As another method, a group of images focused at different distances may be generated from the light field image, and the depth may be estimated by examining the degree of focus in each. As yet another method, an EPI (Epipolar Plane Image) may be constructed from the light field image, and the depth may be estimated by estimating the slopes of the straight lines on the EPI.
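The second method, estimating depth from the degree of focus, can be sketched as follows. The focus measure used here (squared 4-neighbour Laplacian, evaluated per pixel) is an assumption chosen for brevity, and the result is an index into the focal stack rather than a metric depth. Practical implementations usually aggregate the measure over a window.

```python
import numpy as np


def focus_measure(image):
    """Squared discrete Laplacian; border pixels are left at zero."""
    img = np.asarray(image, dtype=float)
    lap = np.zeros_like(img)
    lap[1:-1, 1:-1] = (
        img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
        - 4.0 * img[1:-1, 1:-1]
    )
    return lap ** 2


def depth_index_from_focal_stack(stack):
    """For each pixel, return the index of the sharpest image in the stack."""
    measures = np.stack([focus_measure(img) for img in stack])
    return measures.argmax(axis=0)
```

Mapping the returned index to a distance requires knowing the focus distance of each image in the stack, which is outside this sketch.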

As yet another example, a combination of the methods described above may be used. For example, a weighted sum of the evaluation based on sparsity in an overcomplete dictionary and the evaluation based on the plausibility of the sub-aperture images may be used.

For Down, an appropriate technique must be used according to the format of the light field image, the downsampling ratio, and so on. For example, when the light field image is captured, as described in Non-Patent Document 1, by imaging the optical image of a subject formed by a main lens through a plurality of microlenses, Down may be defined as the process of computing, for the group of microlenses located in the region corresponding to one microlens after downsampling, the average of the images under those microlenses. When the light field image to be downsampled and the reference light field image were acquired from different positions or orientations, Down may include processing that accounts for those differences in position and orientation.

In Conv, an image is generated from a given light field image so that its focus information, such as the focal plane and the depth of field, equals the focus information estimated in step S102. The process of generating an image from a light field image must use a method suited to the format of the light field image. For example, when the light field image is obtained by using a plurality of microlenses to capture the optical image of the subject formed by the main lens, as described in Non-Patent Document 1, the image may be generated by processing in the Fourier transform domain using the Fourier slice method (described in Reference 4: R. Ng, "Fourier slice photography," ACM SIGGRAPH 2005 Pap. - SIGGRAPH '05, p. 735, 2005).

Alternatively, the image may be generated by the shift-and-add method (described in Reference 2), in which the sub-aperture images obtained from the light field image are shifted according to their angular components and then averaged. If the focus information estimated in step S102 includes information on the type of method used to generate an image from the light field image, the image is generated using the method specified by the focus information.
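A minimal sketch of the shift-and-add idea is given below. It assumes, for illustration only, that the sub-aperture images are supplied as a dictionary keyed by angular offset `(u, v)`, that the shift is an integer number of pixels per unit of angular offset (`slope`), and that pixels shifted past the border are clamped to the edge. The function name and these conventions are hypothetical and are not taken from Reference 2.

```python
def refocus_shift_add(sub_apertures, slope):
    """Shift each sub-aperture image by slope * (u, v) and average the results.

    sub_apertures: {(u, v): image}, where image is a list of rows of numbers.
    slope: integer pixel shift per unit angular offset (selects the focal plane).
    """
    views = list(sub_apertures.items())
    height = len(views[0][1])
    width = len(views[0][1][0])
    acc = [[0.0] * width for _ in range(height)]
    for (u, v), image in views:
        dx, dy = slope * u, slope * v
        for y in range(height):
            sy = min(max(y + dy, 0), height - 1)   # clamp at the border
            for x in range(width):
                sx = min(max(x + dx, 0), width - 1)
                acc[y][x] += image[sy][sx]
    n = len(views)
    return [[value / n for value in row] for row in acc]
```

With a slope that matches the disparity of a scene point, the shifted copies of that point coincide and it appears sharp; with any other slope the copies stay apart and the point appears blurred.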

Any method may be used to obtain the high-resolution light field image that solves the minimization problem expressed by equation (1) above. For example, the evaluation value may be computed for every candidate high-resolution light field image, and the candidate giving the minimum value may be selected. Alternatively, Matching Pursuit (MP), Orthogonal Matching Pursuit (OMP), an interior point method, the Block Coordinate Relaxation (BCR) method, the Alternating Direction Method of Multipliers (ADMM), or the like may be used.

When processing regions overlap on the processing target image, a plurality of light fields are obtained for a single pixel of the processing target image. In that case, any one of the light fields may be selected, or the final light field may be generated by computing the mean, median, or mode for each ray. Weighting may be applied when computing the mean or median. The position of the pixel in question within the processing region may be used as the criterion for selecting one light field or for weighting. For example, a light field may be given higher priority the closer the pixel lies to the center of its processing region.
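The weighted-mean option can be sketched as follows. The weight `1 / (1 + distance)` is an illustrative assumption chosen only so that estimates from regions whose center is closer to the pixel count more; the description does not prescribe a particular weight, and the name `merge_rays` is hypothetical.

```python
def merge_rays(estimates):
    """Blend several light field estimates for one pixel into a single one.

    estimates: list of (rays, distance), where rays is the list of ray values
    proposed by one processing region and distance is how far the pixel lies
    from that region's center. Closer regions receive larger weights.
    """
    weights = [1.0 / (1.0 + distance) for _, distance in estimates]
    total = sum(weights)
    n_rays = len(estimates[0][0])
    return [
        sum(w * rays[i] for w, (rays, _) in zip(weights, estimates)) / total
        for i in range(n_rays)
    ]
```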

When the processing regions are processed in a predetermined order such as raster scan order, the high-resolution light field for the region being processed may be generated so as to increase its consistency with the high-resolution light field images already generated for previously processed regions. That is, in an overlapping area, the light field image for the processing region may be estimated so that its degree of agreement with the previously generated high-resolution light field images is high.

Next, the detailed processing of step S102 shown in FIG. 2 is described with reference to FIG. 3. FIG. 3 is a flowchart showing the processing in step S102 of FIG. 2 for estimating the focus information of the processing target image from the reference light field image. First, the focus information estimation unit 103 generates a viewpoint-synthesized light field image in which the effect of the difference between the position and orientation at which the processing target image was acquired and those of the reference light field image is compensated (step S1201).

When the reference light field image and the processing target image were acquired at the same position and in the same orientation, the viewpoint-synthesized light field image equals the reference light field image. Any processing may be used to generate it. For example, the distance from the camera to the subject may be obtained at each pixel of the sub-aperture images of the reference light field image, each sub-aperture image may be converted using a technique called DIBR (Depth Image Based Rendering), and the viewpoint-synthesized light field image may be generated as the light field image whose sub-aperture images are the converted sub-aperture images.

The distance from the camera to the subject at each pixel of a sub-aperture image may be obtained by any method. For example, it can be obtained by regarding the sub-aperture images as a multi-view image and applying a stereo matching method. Alternatively, it may be obtained by estimating the slopes of straight lines in EPI images, for example by the method described in Reference 5: Wanner, S.; Goldluecke, B., "Globally consistent depth labeling of 4D light fields," Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pp. 41-48, 16-21 June 2012.
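To illustrate the kind of conversion DIBR performs, the sketch below warps a single image row to a horizontally displaced viewpoint. It assumes a simple parallel pinhole camera model in which disparity equals `focal * baseline / depth`, rounds disparities to whole pixels, resolves collisions with a depth buffer, and leaves disoccluded pixels as `None`. These modeling choices and all names are assumptions for illustration; a practical implementation would also handle vertical displacement, rotation, and hole filling.

```python
def dibr_warp_row(row, depth, focal, baseline):
    """Forward-warp one image row to a viewpoint shifted by `baseline`.

    row:   pixel values of the source row.
    depth: camera-to-subject distance for each pixel of the row.
    Returns the warped row; positions that receive no pixel stay None.
    """
    width = len(row)
    warped = [None] * width
    nearest = [float("inf")] * width          # depth buffer
    for x in range(width):
        disparity = focal * baseline / depth[x]
        tx = x + round(disparity)
        # Keep the pixel closest to the camera when several land on tx.
        if 0 <= tx < width and depth[x] < nearest[tx]:
            nearest[tx] = depth[x]
            warped[tx] = row[x]
    return warped
```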

Next, the focus information estimation unit 103 generates, from the viewpoint-synthesized light field image, a plurality of virtual focus images that have the same spatial resolution as the processing target image and differ in focal plane and depth of field (step S1202). Any method suited to the form of the viewpoint-synthesized light field image may be used to generate the virtual focus images. For example, when the viewpoint-synthesized light field image is a light field image obtained by using a plurality of microlenses to capture the optical image of the subject formed by the main lens, as described in Non-Patent Document 1, the shift-and-add method or the Fourier slice method described above may be used. When the method of generating the virtual focus image is included in the focus information, the virtual focus images may be generated using a plurality of different methods.

Furthermore, because the virtual focus images must have the spatial resolution of the processing target image, which the viewpoint-synthesized light field image derived from the reference light field image does not have, processing to match the spatial resolution is also required. The spatial resolution may be converted at the same time as the normal image is generated from the light field image, the light field image may be upsampled beforehand to match the spatial resolution, or the normal image may be upsampled after it is generated. When the light field image is upsampled beforehand, a light field image with the same spatial resolution as the processing target image may be generated by upsampling each sub-aperture image contained in the viewpoint-synthesized light field image.
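The option of upsampling each sub-aperture image beforehand can be sketched as follows. Nearest-neighbor interpolation with an integer factor is used only because it is the simplest filter; the description leaves the filter open, and the function names and the dictionary layout keyed by angular offset are assumptions.

```python
def upsample_nearest(image, factor):
    """Enlarge an image (list of rows) by an integer factor, nearest neighbor."""
    height, width = len(image), len(image[0])
    return [[image[y // factor][x // factor] for x in range(width * factor)]
            for y in range(height * factor)]


def upsample_light_field(sub_apertures, factor):
    """Upsample every sub-aperture image of a light field independently.

    sub_apertures: {(u, v): image}. The angular sampling is left unchanged;
    only the spatial resolution of each view is increased.
    """
    return {angle: upsample_nearest(image, factor)
            for angle, image in sub_apertures.items()}
```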

When the method of generating the virtual focus image is included in the focus information, the choice of upsampling method may also be added to the focus information, and the virtual focus images may be generated using a plurality of different methods. In addition, a plurality of filters for upsampling may be defined, and the type of filter used may be included as part of the upsampling method.

Once the group of virtual focus images has been generated, the focus information estimation unit 103 determines, for each processing region, the virtual focus image that best matches the processing target image (step S1203). For a given processing region, the degree of match with a virtual focus image may be computed using only the pixels inside that region, or may also include the pixels within a fixed distance around the region. Further, the degree of match may be computed with a measure based solely on the error between the processing target image and the virtual focus image, or with a measure that also takes into account the similarity of the focus information in other processing regions that overlap or adjoin the region. In the latter case, the virtual focus image for each processing region is determined so that the similarity of focus information with adjacent processing regions is high and the error against the processing target image at each pixel in the region is small.

Any measure may be used for the similarity of focus information. A typical example is a measure that increases as the focal plane and depth of field used to generate the virtual focus image matching a given region become closer to the focal plane and depth of field used to generate the virtual focus image matching an adjacent region.
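The simplest variant of step S1203, which uses only the error inside each processing region, can be sketched as follows. The sum of squared differences stands in for the unspecified error measure, regions are assumed to be axis-aligned rectangles `(y0, x0, y1, x1)` with exclusive upper bounds, and the labels identifying each virtual focus image are hypothetical. The term that rewards similar focus information in neighboring regions is omitted.

```python
def region_error(target, candidate, region):
    """Sum of squared differences between two images inside one region."""
    y0, x0, y1, x1 = region
    return sum((candidate[y][x] - target[y][x]) ** 2
               for y in range(y0, y1) for x in range(x0, x1))


def best_focus_per_region(target, virtual_images, regions):
    """Pick, for each region, the label of the best-matching virtual focus image.

    virtual_images: {label: image}, one image per focus setting.
    Returns a list of labels in the same order as `regions`.
    """
    return [
        min(virtual_images,
            key=lambda label: region_error(target, virtual_images[label], region))
        for region in regions
    ]
```

The returned labels play the role of the per-region focus information: each label identifies the focal plane and depth of field (and, if included, the generation method) that reproduces the processing target image best in that region.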

As a result of step S1203, focus information for generating the image of each processing region from a light field image is obtained for every processing region of the processing target image. Given a light field image for the processing target image, this information serves as the per-region parameters used to generate, from that light field image, an image with the same focus as the processing target image. In other words, the focus information obtained here differs from the focal plane and depth of field of a general camera model, in which a single focal plane and depth of field are defined for the whole image. It is information for generating an ordinary camera image from a light field image, and is a kind of focus information that has not existed before.

Next, the detailed configuration of the high-resolution light field image generation unit 104 shown in FIG. 1 is described with reference to FIG. 4. FIG. 4 is a block diagram showing that configuration. Using conversion from a light field image to a normal image and downsampling of a light field image, the high-resolution light field image generation unit 104 generates a high-resolution light field image based on consistency with the processing target image, the reference light field image, and the estimated focus information of the processing target image. As shown in FIG. 4, the unit includes a positional relationship setting unit 1041, a high-resolution light field image candidate generation unit 1042, a normal imaging unit 1043, a light field image downsampling unit 1044, a high-resolution light field image candidate correction unit 1045, and a switch 1046.

The positional relationship setting unit 1041 sets the positional relationship between the camera for the processing target image and the camera for the reference light field image. The high-resolution light field image candidate generation unit 1042 generates a light field image that is a candidate for the high-resolution light field. The normal imaging unit 1043 generates an estimated image for the processing target image from the high-resolution light field image candidate. The light field image downsampling unit 1044 lowers the spatial resolution of the high-resolution light field image candidate by downsampling the light field image and applying a conversion based on the positional relationship, and thereby generates an estimated image for the reference light field image. The high-resolution light field image candidate correction unit 1045 corrects the high-resolution light field image candidate using the processing target image and its estimated image together with the reference light field image and its estimated image.

Next, the operation of the high-resolution light field image generation unit 104 shown in FIG. 4 is described with reference to FIG. 5. FIG. 5 is a flowchart showing that operation. First, the positional relationship setting unit 1041 sets the positional relationship between the processing target image and the reference light field image (step S201). Any information may be set as long as it identifies the positional relationship between the processing target image and the reference light field image. For example, camera parameters such as those described in Reference 6 (Olivier Faugeras, "Three-Dimensional Computer Vision," MIT Press; BCTC/UFF-006.37 F259 1993, ISBN:0-262-06158-9) may be set.

The information indicating the positional relationship may be set in any manner. For example, separately supplied positional relationship information may be set. In particular, when it is known that the processing target image and the reference light field image were acquired at the same position, for example by using a half mirror, the positional relationship may be set to indicate the same position. If the positions are known always to coincide, this step may be omitted, and the subsequent processing that depends on the positional relationship need not be performed.

Alternatively, refocused images, all-in-focus images, or elemental images may be generated from the reference light field image, corresponding-point information between those images and the processing target image may be obtained, and the positional relationship may be derived from that information. As a method of obtaining the positional relationship from corresponding-point information between images, Structure from Motion (SfM) may be used, for example.

When the same positional relationship information is also used outside the high-resolution light field image generation unit 104, for example when estimating the focus information of the processing target image, information estimated outside the unit may be input and set. In that case, the positional relationship setting unit 1041 need not reside inside the high-resolution light field image generation unit 104.

Once the positional relationship has been set, the high-resolution light field image candidate generation unit 1042 sets a high-resolution light field image candidate (step S202). The candidate may be set in any manner. For example, a light field image whose pixel values are all 0 may be set, or an arbitrary light field image may be set. Alternatively, a light field image generated by enlarging the reference light field image for each angular component, using a filter or the like, may be set. In that case, the same enlargement process may be used for all angular components, or different enlargement processes may be used.

As yet another method, a light field image generated by giving the processing target image angular-component information based on an arbitrary model may be set. The angular-component information may be given by treating all angular components as identical, or may be generated for each pixel by reducing the image surrounding that pixel. When it is generated by reducing the surrounding image, the reduction may be performed so that the mean of the generated angular-component pixel values equals the original pixel value.
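The simplest of these initializations, in which every angular component of a pixel equals the pixel value, can be sketched as follows. The nested-list layout `lf[y][x]` (a list of angular samples per pixel) and the function name are assumptions made for illustration.

```python
def init_candidate_from_image(image, n_angular):
    """Build an initial light field candidate from a normal image.

    Every pixel value is replicated across n_angular angular samples, so the
    mean over the angular samples of each pixel equals the original pixel.
    """
    return [[[pixel] * n_angular for pixel in row] for row in image]
```

Because all angular samples are equal, averaging them per pixel returns the processing target image unchanged, which satisfies the mean-preservation property mentioned above in a trivial way.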

Instead of generating the high-resolution light field image candidate directly, a candidate for the coefficient vector χ with respect to the dictionary D described above may be set, and the high-resolution light field image candidate may be generated using the dictionary D. As the candidate for the coefficient vector χ, a zero vector may be used, or a vector generated by obtaining the coefficient vector for the reference light field image and upsampling it may be used.

Once the high-resolution light field image candidate has been set, the switch 1046 is operated, and the normal imaging unit 1043 generates an image corresponding to the processing target image from the high-resolution light field image candidate in accordance with the estimated focus information of the processing target image (step S203). The light field image downsampling unit 1044 then generates an image corresponding to the reference light field image from the high-resolution light field image candidate (step S204). These processes are the same as the processes by Conv and Down described above, respectively. Steps S203 and S204 may be performed in either order.

Next, the high-resolution light field image candidate correction unit 1045 uses the obtained images to check whether the high-resolution light field image candidate satisfies the termination condition of the update process (step S205). Any termination condition may be used. For example, the condition may be whether the evaluation value of the high-resolution light field image candidate LF given by E(LF) in equation (2) is smaller than a predetermined threshold, whether the high-resolution light field image has been updated a predetermined number of times, or whether either one or both of these hold.

If the termination condition is satisfied, the high-resolution light field image candidate correction unit 1045 outputs the high-resolution light field image candidate as the high-resolution light field and ends the processing.

If the termination condition is not satisfied, the high-resolution light field image candidate correction unit 1045 updates the high-resolution light field image candidate (step S206). The switch 1046 is then operated, and the updated candidate is input again to the normal imaging unit 1043, the light field image downsampling unit 1044, and the high-resolution light field image candidate correction unit 1045. Any method may be used to update the high-resolution light field image candidate. For example, the update may be performed by setting a randomly generated, arbitrary light field image as the new candidate.

The images for the processing target image and for the reference light field image that were generated from the high-resolution light field image candidate in steps S203 and S204 may be used in the update process. For example, in methods such as OMP mentioned above, the error between the images generated from the high-resolution light field image candidate and the processing target image or the reference light field image is computed, and the high-resolution light field image candidate is updated based on that error.
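The control flow of steps S203 to S206 can be sketched as a generic loop. In this illustration the conversion, downsampling, and update operations are passed in as functions (`conv`, `down`, `update`), images are flattened to lists of numbers, and a sum of squared errors stands in for E(LF), whose exact form is given by equation (2) and is not reproduced here. All names and the error measure are assumptions.

```python
def squared_error(a, b):
    """Sum of squared differences between two flattened images."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def refine(candidate, target, reference, conv, down, update,
           threshold, max_updates):
    """Iteratively correct a light field candidate.

    Stops when the evaluation value drops below `threshold` or after
    `max_updates` updates. Returns (candidate, number_of_updates).
    """
    updates = 0
    while True:
        est_target = conv(candidate)        # step S203: image for the target
        est_reference = down(candidate)     # step S204: image for the reference
        energy = (squared_error(est_target, target)
                  + squared_error(est_reference, reference))
        if energy < threshold or updates >= max_updates:   # step S205
            return candidate, updates
        candidate = update(candidate, est_target, est_reference)  # step S206
        updates += 1
```

Passing the estimated images to `update` mirrors the error-driven update described above; an update rule that ignores them, such as drawing a random candidate, fits the same interface.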

In the description above, the images for the processing target image and the reference light field image are generated from the high-resolution light field image candidate before the termination condition is checked. If those images are not used for checking the termination condition and are used only in the update of the high-resolution light field image candidate, they may be generated only when the termination condition is not satisfied, as shown in FIG. 6. FIG. 6 is a flowchart showing a modification of the operation of the high-resolution light field image generation unit 104 shown in FIG. 4. In FIG. 6, processes identical to those shown in FIG. 5 are given the same reference signs.

The description above covers the processing of one frame; video (a moving image) can be processed by repeating it for a plurality of frames. The description above also covers the configuration and processing operations of the image processing apparatus; the image processing method of the present invention can be realized by processing operations corresponding to the operations of the respective units of the apparatus.

In this way, rather than directly acquiring a light field image with high spatial resolution, angular-component information for a normal image with high spatial resolution is generated using a light field image of the same scene whose spatial resolution is limited. A light field image can thus be generated without sacrificing spatial resolution.

FIG. 7 is a block diagram showing the hardware configuration when the image processing apparatus 100 described above is implemented by a computer and a software program. The system shown in FIG. 7 has the following components connected by a bus: a CPU 50 that executes the program; a memory 51, such as a RAM, that stores the program and data accessed by the CPU 50; a processing target image input unit 52 that inputs the image signal to be processed from a camera or the like (this may be a storage unit, such as a disk device, that stores video signals); a reference light field image input unit 53 that inputs the image signal of the reference light field image from a light field camera or the like (this may be a storage unit, such as a disk device, that stores light fields); a program storage device 54 that stores an image processing program 541, a software program that causes the CPU 50 to execute the image processing; and a high-resolution light field image output unit 55 that outputs the high-resolution light field image generated when the CPU 50 executes the image processing program 541 loaded into the memory 51 (this may be a storage unit, such as a disk device, that stores high-resolution light field images).

As described above, by using a processing target image and a light field image whose spatial resolution is lower than that of the processing target image to generate a light-field version of the processing target image, it is possible to obtain a light field image with higher resolution than a light field image of the same scene captured with a light field camera.

All or part of the image processing apparatus 100 in the embodiment described above may be implemented by a computer. In that case, a program for realizing its functions may be recorded on a computer-readable recording medium, and the program recorded on the medium may be read into a computer system and executed. The "computer system" here includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The "computer-readable recording medium" may further include a medium that holds the program dynamically for a short time, such as a communication line used when the program is transmitted via a network such as the Internet or a communication line such as a telephone line, and a medium that holds the program for a certain period, such as volatile memory inside the computer system that serves as the server or client in that case. The program may realize only part of the functions described above, may realize those functions in combination with a program already recorded in the computer system, or may be realized using hardware such as a PLD (Programmable Logic Device) or an FPGA (Field Programmable Gate Array).

Embodiments of the present invention have been described above with reference to the drawings, but these embodiments are merely examples of the present invention, and it is clear that the present invention is not limited to them. Accordingly, components may be added, omitted, replaced, or otherwise changed without departing from the technical idea and scope of the present invention.

The present invention is applicable to uses in which it is essential to acquire a light field image or light field video that has angular resolution for an image or video without impairing the spatial resolution of that image or video.

DESCRIPTION OF SYMBOLS
101: processing target image input unit
102: reference light field image input unit
103: focus information estimation unit
104: high-resolution light field image generation unit
1041: positional relationship setting unit
1042: high-resolution light field image candidate generation unit
1043: normal image conversion unit
1044: light field image downsampling unit
1045: high-resolution light field image candidate correction unit
1046: switch
50: CPU
51: memory
52: processing target image input unit (storage unit)
53: reference light field image input unit (storage unit)
54: program storage device
541: image processing program
55: high-resolution light field image output unit (storage unit)

Claims

An image processing method for generating a light field image that expresses, for each traveling direction of light rays, the intensity of the light rays at the position of each pixel of a processing target image to be converted into a light field, by using the processing target image and a reference light field image that expresses, for each traveling direction of light rays, the intensity of the light rays in the same scene as the processing target image at a spatial resolution lower than that of the processing target image, the method comprising:
a region dividing step of dividing the processing target image into a plurality of processing regions;
a processing target image focus estimation step of estimating focus information of the processing target image for each of the processing regions by using the processing target image and the reference light field image; and
a high-resolution light field image generation step of generating, for each of the processing regions, a high-resolution light field image, which is a light field image for the processing target image, by using the processing target image, the reference light field image, and the estimated focus information.
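As an illustration of the region dividing step only, the following is a minimal sketch that assumes the processing regions are non-overlapping rectangular blocks. The claim does not fix the shape of a region, so the block layout, the function name `divide_into_regions`, and the `block` parameter are all hypothetical.

```python
from typing import List, Tuple

# (top, left, height, width) of one processing region; a hypothetical layout.
Region = Tuple[int, int, int, int]


def divide_into_regions(height: int, width: int, block: int) -> List[Region]:
    """Split a height x width image into non-overlapping blocks.

    Blocks on the bottom and right edges are truncated so that the
    regions cover every pixel exactly once.
    """
    if height <= 0 or width <= 0 or block <= 0:
        raise ValueError("height, width and block must be positive")
    regions: List[Region] = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            regions.append(
                (top, left, min(block, height - top), min(block, width - left))
            )
    return regions
```

Focus estimation and high-resolution light field generation would then run once per returned region.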

The image processing method according to claim 1, further comprising a virtual focus image generation step of generating, from the reference light field image, a plurality of virtual focus images having different focus information,
wherein, in the processing target image focus estimation step, the focus information of the processing target image is estimated by using the virtual focus images and the processing target image.
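One way to picture this estimation is to pick, for each processing region, the virtual focus image that best matches the processing target image inside that region. The sketch below assumes a sum-of-squared-differences criterion, a scalar focus parameter, and rectangular regions; none of these is stated in the claim, and the names `region_ssd` and `estimate_focus` are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Image = Sequence[Sequence[float]]       # grayscale image as rows of pixels
Region = Tuple[int, int, int, int]      # (top, left, height, width), assumed rectangular


def region_ssd(a: Image, b: Image, region: Region) -> float:
    """Sum of squared differences between two images inside one region."""
    top, left, height, width = region
    total = 0.0
    for y in range(top, top + height):
        for x in range(left, left + width):
            diff = a[y][x] - b[y][x]
            total += diff * diff
    return total


def estimate_focus(
    target: Image, virtual_focus_images: Dict[float, Image], region: Region
) -> float:
    """Return the focus parameter whose virtual focus image is closest
    to the target image within the region (assumed SSD criterion)."""
    if not virtual_focus_images:
        raise ValueError("at least one virtual focus image is required")
    return min(
        virtual_focus_images,
        key=lambda focus: region_ssd(target, virtual_focus_images[focus], region),
    )
```

Because the comparison is restricted to one region at a time, different regions of the same processing target image can end up with different focus estimates.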

The image processing method according to claim 2, wherein, in the virtual focus image generation step, a viewpoint-synthesized light field image is generated that compensates for the difference in shooting position and orientation between the reference light field image and the processing target image, and the virtual focus images are generated by using the viewpoint-synthesized light field image.

The image processing method according to claim 3, wherein, in the virtual focus image generation step, a high-resolution viewpoint-synthesized light field image is generated by matching the spatial resolution of the viewpoint-synthesized light field image to that of the processing target image, and the virtual focus images are generated by using the high-resolution viewpoint-synthesized light field image.

The image processing method according to claim 3, wherein, in the virtual focus image generation step, a plurality of low-resolution virtual focus images having the same spatial resolution as the viewpoint-synthesized light field image and having different focus information are generated from the viewpoint-synthesized light field image, and the virtual focus images are generated by upsampling the low-resolution virtual focus images so that they have the same spatial resolution as the processing target image.
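The order of operations in this variant (refocus at low resolution first, upsample afterwards) can be sketched as follows. This is only an illustration under stated assumptions: the light field is stored as a dictionary of sub-aperture views, refocusing is done by integer shift-and-add, and upsampling is nearest-neighbor. The claim names none of these techniques, and `refocus`, `upsample_nearest`, and `virtual_focus_stack` are hypothetical names.

```python
from typing import Dict, Iterable, List, Tuple

Image = List[List[float]]
# (u, v) view offset -> sub-aperture image; an assumed light field layout.
LightField = Dict[Tuple[int, int], Image]


def refocus(light_field: LightField, shift: int) -> Image:
    """Shift-and-add refocusing with integer shifts (edges are clamped).

    Each view is displaced by shift * (u, v) and the views are averaged;
    `shift` plays the role of the focus information in this sketch.
    """
    if not light_field:
        raise ValueError("light field must contain at least one view")
    views = list(light_field.items())
    height, width = len(views[0][1]), len(views[0][1][0])
    out = [[0.0] * width for _ in range(height)]
    for (u, v), img in views:
        for y in range(height):
            sy = min(max(y + shift * v, 0), height - 1)
            for x in range(width):
                sx = min(max(x + shift * u, 0), width - 1)
                out[y][x] += img[sy][sx]
    return [[value / len(views) for value in row] for row in out]


def upsample_nearest(img: Image, factor: int) -> Image:
    """Nearest-neighbor upsampling by an integer factor."""
    return [
        [img[y // factor][x // factor] for x in range(len(img[0]) * factor)]
        for y in range(len(img) * factor)
    ]


def virtual_focus_stack(
    light_field: LightField, shifts: Iterable[int], factor: int
) -> Dict[int, Image]:
    """Low-resolution refocus per shift, then upsample to the target size."""
    return {s: upsample_nearest(refocus(light_field, s), factor) for s in shifts}
```

Refocusing before upsampling keeps the per-focus cost proportional to the low-resolution pixel count, whereas the variant in the preceding claim upsamples the light field itself before refocusing.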

The image processing method according to any one of claims 1 to 5, wherein the focus information includes information on a method for generating a normal image from a light field image.

An image processing apparatus that generates a light field image expressing, for each traveling direction of light rays, the intensity of the light rays at the position of each pixel of a processing target image to be converted into a light field, by using the processing target image and a reference light field image that expresses, for each traveling direction of light rays, the intensity of the light rays in the same scene as the processing target image at a spatial resolution lower than that of the processing target image, the apparatus comprising:
region dividing means for dividing the processing target image into a plurality of processing regions;
processing target image focus estimation means for estimating focus information of the processing target image for each of the processing regions by using the processing target image and the reference light field image; and
high-resolution light field image generation means for generating, for each of the processing regions, a high-resolution light field image, which is a light field image for the processing target image, by using the processing target image, the reference light field image, and the estimated focus information.

An image processing program for causing a computer to execute the image processing method according to any one of claims 1 to 6.