JP6974978B2 - Image processing apparatus, image processing method, and program

Info

Publication number: JP6974978B2
Application number: JP2017156642A
Authority: JP
Inventors: 達朗小泉; 宗浩吉村
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-08-31
Filing date: 2017-08-14
Publication date: 2021-12-01
Anticipated expiration: 2037-08-14
Also published as: JP2018042237A

Description

The present invention relates to an image processing apparatus, an image processing method, and a program.

There is a known technique for reconstructing, from images obtained by capturing a subject with a plurality of image pickup devices, the image that would be obtained if the subject were observed from an arbitrary virtual viewpoint. Patent Document 1 discloses the following method. First, a three-dimensional model of the subject is created by using the captured images of the subject captured by a plurality of cameras and the position information of the cameras. Next, a texture image for each position on the three-dimensional model is generated by blending the texture images appearing in the plurality of captured images. Finally, by texture-mapping the blended texture image onto the three-dimensional model, an image from a virtual viewpoint at which no camera is placed can be reconstructed.

Patent Document 1 also describes methods of selecting the captured images used to generate the blended texture image, so that the reconstructed image is close to what would be seen from the virtual viewpoint. For example, Patent Document 1 proposes selecting captured images of the subject captured by cameras close to the virtual viewpoint. As another method, Patent Document 1 proposes selecting captured images of the subject captured by cameras whose line-of-sight direction is close to that of the virtual viewpoint. Patent Document 1 further describes increasing the mixing ratio of a captured image of the subject captured by a camera that is closer to the virtual viewpoint, or whose line-of-sight direction is closer to that of the virtual viewpoint.

Patent Document 1: Japanese Patent No. 5011224

Due to the influence of lighting and the like, the color of the same part of the subject may differ between the captured images obtained by the respective image pickup devices. For this reason, unnatural color changes may be seen, particularly between regions in which different captured images are blended.

An object of the present invention is to reduce, in a reconstructed image from a virtual viewpoint, the sense of unnaturalness caused by differences in color between regions.

In order to achieve the object of the present invention, for example, the image processing apparatus of the present invention has the following configuration. That is, the apparatus comprises:
an acquisition means for acquiring virtual viewpoint information including information on the position of a virtual viewpoint and information on the line-of-sight direction from the virtual viewpoint;
a determination means for determining color information of a pixel of interest of a virtual viewpoint image corresponding to the virtual viewpoint specified by the virtual viewpoint information acquired by the acquisition means, based on a plurality of captured images acquired by image capture with a plurality of image pickup devices and on weights corresponding to the distance, in each of the plurality of captured images, from an edge of the image to the pixel corresponding to the pixel of interest; and
a generation means for generating the virtual viewpoint image based on the color information determined by the determination means,
wherein the weight is
a constant weight when the distance from the edge of the captured image to the pixel corresponding to the pixel of interest exceeds a predetermined threshold, and
a weight smaller than the constant weight when the distance from the edge of the captured image to the pixel corresponding to the pixel of interest is equal to or less than the predetermined threshold.
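The weighting rule stated above can be illustrated with a minimal Python sketch. The text only requires a constant weight beyond the threshold and a smaller weight at or below it; the linear ramp used here, the function names `edge_distance` and `position_weight`, and the default `max_weight` of 1.0 are assumptions made for illustration, not the method of the specification.

```python
def edge_distance(u: float, v: float, width: int, height: int) -> float:
    """Distance in pixels from (u, v) to the nearest image edge (negative if outside)."""
    return min(u, v, (width - 1) - u, (height - 1) - v)


def position_weight(u: float, v: float, width: int, height: int,
                    threshold: float, max_weight: float = 1.0) -> float:
    """Weight that is constant far from the edge and smaller near the edge.

    Assumption: a linear ramp is used near the edge. Dividing by
    (threshold + 1) keeps the weight strictly below max_weight even when
    the distance equals the threshold, as the text requires.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    d = edge_distance(u, v, width, height)
    if d < 0:
        return 0.0            # the pixel lies outside the captured image
    if d > threshold:
        return max_weight     # constant weight beyond the threshold
    return max_weight * d / (threshold + 1.0)


if __name__ == "__main__":
    for pixel in [(960, 540), (100, 540), (50, 540), (0, 540)]:
        print(pixel, position_weight(*pixel, 1920, 1080, threshold=100))
```

Tapering the weight toward the image edge means a camera's contribution fades out gradually rather than stopping abruptly where its field of view ends, which is one plausible way to avoid visible color steps between blended regions.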

According to the present invention, the sense of unnaturalness caused by differences in color between regions can be reduced in a reconstructed image from a virtual viewpoint.

A diagram showing a hardware configuration example of the image processing apparatus according to Embodiment 1.
A diagram showing an arrangement example of the image processing system according to Embodiment 1.
A diagram showing a functional configuration example of the image processing apparatus according to Embodiment 1.
A diagram showing a functional configuration example of the image processing apparatus according to Embodiment 1.
A diagram conceptually explaining the processing according to Embodiment 1.
A flowchart of the processing according to Embodiment 1.
A diagram explaining the method of calculating the position weight and the direction weight according to Embodiment 1.
A diagram explaining the pixel value calculation method according to an implementation example of Embodiment 1.
A diagram showing an example of the background model used in Embodiment 2.
A diagram conceptually explaining the processing according to Embodiment 3.
A diagram showing a functional configuration example of the image processing apparatus according to Embodiment 3.
A diagram showing a functional configuration example of the image processing apparatus according to Embodiment 3.
A flowchart of the processing according to Embodiment 3.
A diagram explaining the weight calculation method according to Embodiment 3.
A diagram explaining the weight calculation method according to Embodiment 3.
A diagram explaining the weight calculation method according to Embodiment 3.
A diagram showing a functional configuration example of the image processing apparatus according to Embodiment 4.
A flowchart of the processing according to Embodiment 4.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. However, the scope of the present invention is not limited to the following embodiments.

[Embodiment 1]
Hereinafter, the image processing apparatus according to Embodiment 1 of the present invention will be described. The image processing apparatus according to the present embodiment may be, for example, a computer including a processor and a memory. FIG. 1 shows a hardware configuration example of the image processing apparatus 100 according to the present embodiment. The CPU 101 controls the entire image processing apparatus 100. The RAM 102 is a random access memory that temporarily stores programs, data, and the like. The ROM 103 is a read-only memory that stores programs, parameters, and the like. The secondary storage device 104 is a storage device capable of storing programs, data, and the like for a long period of time, and may be, for example, a hard disk or a memory card.

The input interface 105 is an interface that connects the image processing apparatus 100 to an input device. An input device is a device that inputs data to the image processing apparatus 100, and its type is not particularly limited. For example, the input interface 105 can receive data from an image pickup device 108 that captures an image of a subject, or from an external storage device 109, and the image processing apparatus 100 can perform processing using the received data. The output interface 106 is an interface that connects the image processing apparatus 100 to an output device. An output device is a device that receives data from the image processing apparatus 100, and its type is not particularly limited. For example, the output interface 106 can output data from the image processing apparatus 100 to the external storage device 109 or the display device 110.

The operation of each unit described below, for example the units shown in FIG. 3, can be realized as follows. A program corresponding to the operation of each unit, stored in a computer-readable storage medium such as the ROM 103, the secondary storage device 104, or the external storage device 109, is loaded into the RAM 102. The CPU 101 then operates according to this program, thereby realizing the operation of each unit described below. However, the operation of all or some of the units described below may be realized by dedicated hardware such as an ASIC.

The image processing apparatus 100 according to the present embodiment acquires captured images from a plurality of image pickup devices 108 that capture images of a subject, and performs processing to generate a reconstructed image from a virtual viewpoint. In the present specification, a reconstructed image is a virtual viewpoint image (object image) of a subject (object) generated based on a virtual viewpoint, and corresponds to the captured image of the subject that would be obtained if a virtual camera were placed at the virtual viewpoint. A reconstructed image is also called a free viewpoint image. The plurality of image pickup devices 108 can be arranged, for example, so as to surround the subject. FIG. 2 shows an example of such an arrangement of the image pickup devices 108. FIG. 2 shows image pickup devices 108 arranged in a gymnasium, together with the world coordinate system 201. As shown in FIG. 2(A), the image pickup devices 108 are arranged so as to look down on the floor from above, and as shown in FIG. 2(B), the image pickup devices 108 are arranged along the sides of the gymnasium. These image pickup devices 108 can synchronously capture, from various directions, the floor and sides of the gymnasium and subjects such as people active in the gymnasium. In this way, the plurality of image pickup devices 108 can generate captured images of the subject from various directions at the same time. However, the arrangement of the image pickup devices 108 shown in FIG. 2 is only an example, and other arrangements can also be adopted.

In FIG. 1, the image processing apparatus 100 is connected to a plurality of image pickup devices 108, forming an image processing system that includes the image processing apparatus 100 and the plurality of image pickup devices 108. With such a configuration, a reconstructed image from a virtual viewpoint can be generated in real time. However, it is not essential that the image pickup devices 108 be connected to the image processing apparatus 100. For example, the image processing apparatus 100 may acquire the captured images from the image pickup devices 108 via a storage medium. The image pickup devices 108 may capture moving images. In this case, the image processing apparatus 100 can perform the following processing using frame images captured at substantially the same time by the plurality of image pickup devices 108.

FIG. 3 shows the functional configuration of the image processing apparatus 100 according to the present embodiment. As shown in FIG. 3, the image processing apparatus 100 includes an input viewpoint information acquisition unit 310, an output viewpoint information acquisition unit 320, a distance map acquisition unit 330, an image acquisition unit 340, a rendering unit 350, and an image output unit 360.

The input viewpoint information acquisition unit 310 and the output viewpoint information acquisition unit 320 acquire position and orientation information of the virtual viewpoint and of the plurality of image pickup devices. In the present embodiment, the input viewpoint information acquisition unit 310 acquires information about the input viewpoints (hereinafter referred to as input viewpoint information). In the present embodiment, an input viewpoint is the viewpoint of an image pickup device 108, and the input viewpoint information is information about each of the plurality of image pickup devices 108. The input viewpoint information includes position and orientation information of the image pickup device 108 within a predetermined coordinate system, for example, position information of the image pickup device 108 and orientation information indicating its optical axis direction. The input viewpoint information may also include angle-of-view information of the image pickup device 108, such as the focal length or the principal point position. Using this information, each pixel of a captured image can be associated with a direction from the image pickup device 108 toward the subject. Therefore, for a specific part of the subject, the corresponding pixel on the captured image can be identified and its color information can be acquired. Further, the input viewpoint information can include distortion parameters indicating the distortion of the image captured by the image pickup device 108, as well as shooting parameters such as the F-number, shutter speed, and white balance.

Further, in the present embodiment, the output viewpoint information acquisition unit 320 acquires information about the output viewpoint (hereinafter referred to as output viewpoint information). In the present embodiment, the output viewpoint is the virtual viewpoint of the reconstructed image generated by the image processing apparatus 100, and the output viewpoint information is information about the virtual viewpoint. Like the input viewpoint information, the output viewpoint information includes position and orientation information of the virtual viewpoint within a predetermined coordinate system, for example, position information of the virtual viewpoint and orientation information indicating its optical axis direction. The output viewpoint information may also include angle-of-view information from the virtual viewpoint, resolution information of the reconstructed image, and the like. Further, the output viewpoint information can include distortion parameters, shooting parameters, and the like, and these can be used to perform image processing on the obtained reconstructed image.

Instead of acquiring the position and orientation information of the virtual viewpoint and of the image pickup devices 108, the input viewpoint information acquisition unit 310 and the output viewpoint information acquisition unit 320 may acquire information indicating the relative position and orientation relationship between the image pickup devices and the virtual viewpoint.

The distance map acquisition unit 330 acquires position information of the subject in space. This position information indicates the relative positional relationship between the virtual viewpoint and the subject. In the present embodiment, the distance map acquisition unit 330 acquires a distance map (depth map) from the virtual viewpoint to the subject. Methods of generating a distance map based on captured images of a subject obtained by a plurality of image pickup devices 108 are known, and any such method can be adopted. For example, a three-dimensional model of the subject can be generated by using the visual hull (volume intersection) method or the stereo matching method described in Patent Document 1. Then, based on the relationship between the virtual viewpoint and the three-dimensional model of the subject, the distance from the virtual viewpoint to the corresponding subject is obtained for each pixel of the reconstructed image from the virtual viewpoint, and a distance map can thus be generated. The method of generating the distance map is not limited to methods based on captured images of the subject; a three-dimensional model of the subject may be generated using some kind of tracker or the like, and the distance map may be generated based on this three-dimensional model. Alternatively, the distance from the virtual viewpoint to the corresponding subject may be measured in advance with a range sensor or the like to obtain the distance map.

The image acquisition unit 340 acquires the captured images of the subject captured by each of the plurality of image pickup devices 108.

The rendering unit 350 determines, from the plurality of captured images, the color information of the subject existing in each direction from the virtual viewpoint. For example, the rendering unit 350 can generate a reconstructed image from the virtual viewpoint by referring to the position information of the subject (the distance map acquired by the distance map acquisition unit 330), the input viewpoint information, and the output viewpoint information (the position and orientation information of the image pickup devices 108 and of the virtual viewpoint). In doing so, the rendering unit 350 acquires, from each of the captured images of the subject acquired by the image acquisition unit 340, the color information of the subject existing in a direction of interest from the virtual viewpoint. Then, as described later, the rendering unit 350 determines the color information of the subject existing in the direction of interest by combining the acquired color information using weights that depend on the orientation of each image pickup device and on the position of the subject within the field of view of that image pickup device. The rendering unit 350 generates the reconstructed image by determining the color information of the subject in this way for each direction of interest corresponding to each pixel of the reconstructed image.

The image output unit 360 outputs the reconstructed image generated by the rendering unit 350. For example, the image output unit 360 can output the reconstructed image to the display device 110 and cause the display device 110 to display it.

Next, the rendering unit 350 will be described in more detail. First, an outline of the rendering process will be given. The processing performed by the rendering unit 350 corresponds to identifying, based on the distance map, the position of the subject existing in the direction of interest, and extracting the color information of this subject from the captured images. In other words, for a pixel of interest in the reconstructed image, the rendering unit 350 identifies the position of the subject appearing in the pixel of interest based on the distance map, and extracts the color information of that subject from the captured images. More specifically, the rendering unit 350 can identify the pixel on a captured image that corresponds to the subject existing in the direction of interest, based on the distance from the virtual viewpoint to that subject and on the position and orientation relationship between the virtual viewpoint and the image pickup device. The rendering unit 350 can then acquire the color information of the identified pixel as the color information of the subject existing in the direction of interest.

This process can be performed, for example, as follows. In the following description, the coordinates of the pixel of interest in the reconstructed image are denoted (u_0, v_0). The position of the subject appearing in the pixel of interest can be expressed in the camera coordinate system of the output viewpoint according to equation (1).

In equation (1), (x_0, y_0, z_0) represents the camera coordinates of the subject. d_0(u_0, v_0) represents the distance, given by the distance map, from the output viewpoint to the subject appearing in the pixel of interest. f_0 represents the focal length of the output viewpoint, and c_x0 and c_y0 represent the principal point position of the output viewpoint.
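The body of equation (1) is not reproduced in this text. As a hedged reconstruction from the symbol definitions alone, assuming a standard pinhole camera model and assuming that d_0 is measured along the optical axis (so that z_0 = d_0), the relation would plausibly take the following form. This is a sketch under those assumptions and may differ from the published equation.

```latex
\begin{equation}
\begin{pmatrix} x_0 \\ y_0 \\ z_0 \end{pmatrix}
= \frac{d_0(u_0, v_0)}{f_0}
\begin{pmatrix} u_0 - c_{x0} \\ v_0 - c_{y0} \\ f_0 \end{pmatrix}
\tag{1}
\end{equation}
```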

Next, for the subject appearing in the pixel of interest, the camera coordinates at the output viewpoint can be converted into world coordinates according to equation (2).

In equation (2), (X_0, Y_0, Z_0) represents the world coordinates of the subject. R_0 represents the optical axis direction of the output viewpoint. (X_output, Y_output, Z_output) represents the camera position of the output viewpoint.
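The body of equation (2) is likewise not reproduced in this text. A plausible form, assuming that R_0 is a 3x3 rotation matrix mapping world axes to the camera axes of the output viewpoint (an assumption; the text only says that R_0 represents the optical axis direction), is sketched below. With that convention the camera-to-world conversion uses the inverse rotation followed by a translation by the camera position.

```latex
\begin{equation}
\begin{pmatrix} X_0 \\ Y_0 \\ Z_0 \end{pmatrix}
= R_0^{-1}
\begin{pmatrix} x_0 \\ y_0 \\ z_0 \end{pmatrix}
+
\begin{pmatrix} X_{\mathrm{output}} \\ Y_{\mathrm{output}} \\ Z_{\mathrm{output}} \end{pmatrix}
\tag{2}
\end{equation}
```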

Next, the coordinates, on the captured image from an input viewpoint, at which the subject located at the world coordinates (X_0, Y_0, Z_0) appears can be calculated according to equation (4).

In equation (3), R_i represents the optical axis direction of input viewpoint i (input viewpoint i is the i-th of the plurality of input viewpoints). (X_cam,i, Y_cam,i, Z_cam,i) represents the camera position of input viewpoint i. f_i represents the focal length of input viewpoint i, and c_xi and c_yi represent the principal point position of input viewpoint i. Further, t represents a constant. Solving equation (3) for (u_i, v_i) yields equation (4).
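The bodies of equations (3) and (4) are not reproduced in this text. Under the same assumptions as a standard pinhole model, with R_i taken to be a 3x3 world-to-camera rotation matrix for input viewpoint i (an assumption) and (x_i, y_i, z_i) introduced here as an auxiliary symbol for the subject's coordinates in the camera frame of input viewpoint i, a consistent pair of relations would be the following. Equation (3) states that the subject lies on the ray through pixel (u_i, v_i), at a scale t along that ray; equation (4) is its solution for t and (u_i, v_i).

```latex
\begin{equation}
\begin{pmatrix} X_0 \\ Y_0 \\ Z_0 \end{pmatrix}
= t \, R_i^{-1}
\begin{pmatrix} u_i - c_{xi} \\ v_i - c_{yi} \\ f_i \end{pmatrix}
+
\begin{pmatrix} X_{\mathrm{cam},i} \\ Y_{\mathrm{cam},i} \\ Z_{\mathrm{cam},i} \end{pmatrix}
\tag{3}
\end{equation}

\begin{equation}
\begin{aligned}
\begin{pmatrix} x_i \\ y_i \\ z_i \end{pmatrix}
&= R_i \left[
\begin{pmatrix} X_0 \\ Y_0 \\ Z_0 \end{pmatrix}
-
\begin{pmatrix} X_{\mathrm{cam},i} \\ Y_{\mathrm{cam},i} \\ Z_{\mathrm{cam},i} \end{pmatrix}
\right], \\
t &= \frac{z_i}{f_i}, \qquad
u_i = c_{xi} + \frac{x_i}{t}, \qquad
v_i = c_{yi} + \frac{y_i}{t}
\end{aligned}
\tag{4}
\end{equation}
```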

According to equation (4), t can be obtained first, and (u_i, v_i) can then be obtained using the resulting t. In this way, the coordinates (u_0, v_0) of the pixel of interest in the reconstructed image can be converted into the coordinates (u_i, v_i) of a pixel in a captured image. The pixel of interest (u_0, v_0) in the reconstructed image and the pixel (u_i, v_i) in the captured image are likely to correspond to the same subject. Therefore, the pixel value (color information) of the pixel (u_i, v_i) in the captured image can be used as the pixel value (color information) of the pixel of interest (u_0, v_0) in the reconstructed image.
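The coordinate conversion described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of the specification: it assumes a pinhole camera model, treats the distance-map value as depth along the optical axis, and assumes each rotation is a 3x3 world-to-camera matrix. The function names `output_pixel_to_world` and `world_to_input_pixel` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]  # 3x3 rotation, assumed to map world axes to camera axes


def mat_vec(m: Mat3, v: Sequence[float]) -> Vec3:
    """Multiply a 3x3 matrix by a 3-vector."""
    return (
        m[0][0] * v[0] + m[0][1] * v[1] + m[0][2] * v[2],
        m[1][0] * v[0] + m[1][1] * v[1] + m[1][2] * v[2],
        m[2][0] * v[0] + m[2][1] * v[1] + m[2][2] * v[2],
    )


def transpose(m: Mat3) -> list:
    """Transpose a 3x3 matrix (for a rotation matrix this is its inverse)."""
    return [[m[c][r] for c in range(3)] for r in range(3)]


def output_pixel_to_world(u0: float, v0: float, d0: float, f0: float,
                          cx0: float, cy0: float, r0: Mat3,
                          pos_out: Sequence[float]) -> Vec3:
    """Back-project pixel (u0, v0) of the reconstructed image to world coordinates."""
    # Step corresponding to equation (1): pixel and depth to camera coordinates.
    scale = d0 / f0
    cam = (scale * (u0 - cx0), scale * (v0 - cy0), d0)
    # Step corresponding to equation (2): camera coordinates to world coordinates.
    w = mat_vec(transpose(r0), cam)
    return (w[0] + pos_out[0], w[1] + pos_out[1], w[2] + pos_out[2])


def world_to_input_pixel(world: Sequence[float], fi: float, cxi: float, cyi: float,
                         ri: Mat3, pos_in: Sequence[float]) -> Optional[Tuple[float, float]]:
    """Project a world point into input viewpoint i; None if it is behind the camera."""
    rel = (world[0] - pos_in[0], world[1] - pos_in[1], world[2] - pos_in[2])
    x, y, z = mat_vec(ri, rel)
    if z <= 0.0:
        return None
    # Step corresponding to equation (4): obtain t first, then (u_i, v_i).
    t = z / fi
    return (cxi + x / t, cyi + y / t)


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    world = output_pixel_to_world(1060.0, 540.0, 5.0, 1000.0, 960.0, 540.0,
                                  identity, (0.0, 0.0, 0.0))
    print(world)
    print(world_to_input_pixel(world, 1000.0, 960.0, 540.0, identity, (1.0, 0.0, 0.0)))
```

A caller would still need to check that the returned (u_i, v_i) falls inside the captured image before sampling its color.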

However, because of differences in line-of-sight direction, the pixel of interest (u_0, v_0) in the reconstructed image and the pixel (u_i, v_i) in the captured image do not necessarily correspond to the same subject. Further, due to the influence of the light source direction and the like, the colors may differ between captured images even when the pixels do correspond to the same subject. For this reason, in the present embodiment, the rendering unit 350 identifies, from the plurality of captured images, the pixels (u_i, v_i) (i = 1 to N, where N is the number of image pickup devices 108) corresponding to the pixel of interest (u_0, v_0), and combines the pixel values of the identified pixels by weighted synthesis. Here, a captured image in which the subject corresponding to the pixel of interest does not appear, for example because the subject is outside the imaging range, can be excluded from the synthesis. The pixel value obtained by this weighted synthesis is used as the pixel value of the pixel of interest (u_0, v_0). In this way, the color information of the subject in the reconstructed image can be determined using the color information of the subject in images captured by one or more image pickup devices. In one embodiment, the color information of the subject in the reconstructed image is determined using the color information of the subject in images captured by two or more image pickup devices.
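The weighted synthesis described above can be sketched as a normalized weighted average. This is a minimal illustration under assumptions: normalizing by the sum of the weights is one common choice that the text does not confirm, a camera that does not see the subject is represented here by `None`, and the function name `blend_pixel` is hypothetical.

```python
from typing import Optional, Sequence, Tuple

Color = Tuple[float, float, float]


def blend_pixel(colors: Sequence[Optional[Color]],
                weights: Sequence[float]) -> Optional[Color]:
    """Weighted average of per-camera colors for one pixel of interest.

    colors[i] is the color sampled at (u_i, v_i) in captured image i, or
    None when the subject does not appear in that image. Such images,
    and images with a non-positive weight, are excluded from the blend.
    """
    if len(colors) != len(weights):
        raise ValueError("colors and weights must have the same length")
    total = 0.0
    acc = [0.0, 0.0, 0.0]
    for color, weight in zip(colors, weights):
        if color is None or weight <= 0.0:
            continue
        total += weight
        for k in range(3):
            acc[k] += weight * color[k]
    if total == 0.0:
        return None  # no captured image contributes to this pixel
    return (acc[0] / total, acc[1] / total, acc[2] / total)


if __name__ == "__main__":
    # The second camera does not see the subject, so its weight is ignored.
    print(blend_pixel([(100.0, 0.0, 0.0), None, (200.0, 0.0, 0.0)], [1.0, 5.0, 3.0]))
```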

The detailed configuration of the rendering unit 350 will be described below with reference to FIG. 4. The rendering unit 350 includes a distortion correction unit 410, a direction weight calculation unit 420, a position weight calculation unit 430, and a pixel value calculation unit 440.

The distortion correction unit 410 performs distortion correction processing on the captured images acquired by the image acquisition unit 340, and sends the corrected captured images to the pixel value calculation unit 440. For example, the distortion correction unit 410 can perform the distortion correction processing on the captured images acquired by the image acquisition unit 340 by referring to the distortion parameters of the respective image pickup devices 108 acquired by the input viewpoint information acquisition unit 310. The distortion correction unit 410 may also send the corrected captured images to the direction weight calculation unit 420 and the position weight calculation unit 430. Generating the reconstructed image from captured images corrected for distortion in this way makes it possible to generate a reconstructed image that looks less unnatural. However, it is not essential that the rendering unit 350 include the distortion correction unit 410.

方向重み算出部４２０は、撮像装置の向きに応じて、撮像画像のそれぞれに対して重みを設定する。本実施形態において、方向重み算出部４２０は、入力視点の向きと出力視点の向きとの関係に応じて、撮像画像のそれぞれに対して重みを設定する。入力視点から被写体への方向が、出力視点から被写体への方向により近いほど、撮像画像に写る被写体像は仮想視点からの被写体像により近いと考えられる。したがって、入力視点から被写体への方向が、出力視点から被写体への方向により近いほど、大きな重みを与えることができる。より具体的には、入力視点から被写体への方向ベクトル（大きさは任意）と、出力視点から被写体への方向ベクトル（大きさは任意）と、がなす角度が小さいほど、大きな重みを与えることができる。 The direction weight calculation unit 420 sets a weight for each of the captured images according to the orientation of the image pickup device. In the present embodiment, the direction weight calculation unit 420 sets a weight for each of the captured images according to the relationship between the orientation of the input viewpoint and the orientation of the output viewpoint. The closer the direction from the input viewpoint to the subject is to the direction from the output viewpoint to the subject, the closer the subject image in the captured image is considered to be to the subject image seen from the virtual viewpoint. Therefore, the closer the direction from the input viewpoint to the subject is to the direction from the output viewpoint to the subject, the larger the weight that can be given. More specifically, the smaller the angle formed by the direction vector from the input viewpoint to the subject (of arbitrary magnitude) and the direction vector from the output viewpoint to the subject (of arbitrary magnitude), the larger the weight that can be given.

方向重み算出部４２０は、１つの撮像画像中のそれぞれの画素に対して異なる重みを設定することができる。この場合、方向重み算出部４２０は、撮像装置の向きとして、着目方向に存在する被写体への撮像装置からの方向を用いて、重みを設定できる。例えば、上記の例であれば、再構成画像中の着目画素（ｕ_０，ｖ_０）についての出力視点からの方向と、撮像画像中の対応する画素（ｕ_ｉ，ｖ_ｉ）についての入力視点からの方向と、に応じて画素（ｕ_ｉ，ｖ_ｉ）に重みを設定することができる。一方で、計算を簡単にするために、方向重み算出部４２０は、撮像装置（入力視点）の向きとして、撮像装置（入力視点）の光軸方向を用いて、重みを設定することもできる。このように、１つの撮像画像中のそれぞれの画素に関して同じ重みを設定してもよい。また、計算を簡単にするために、仮想視点（出力視点）の向きとしては、着目方向を用いてもよいし、出力視点の光軸方向を用いることもできる。すなわち、仮想視点の光軸方向又は着目方向と、撮像装置の向きとの間の角度に応じて、方向重みを設定することができる。 The direction weight calculation unit 420 can set a different weight for each pixel in one captured image. In this case, the direction weight calculation unit 420 can set the weight by using, as the orientation of the image pickup device, the direction from the image pickup device to the subject existing in the direction of interest. For example, in the above example, a weight can be set for the pixel (u_i, v_i) according to the direction from the output viewpoint for the pixel of interest (u_0, v_0) in the reconstructed image and the direction from the input viewpoint for the corresponding pixel (u_i, v_i) in the captured image. On the other hand, to simplify the calculation, the direction weight calculation unit 420 can also set the weight by using the optical axis direction of the image pickup device (input viewpoint) as the orientation of the image pickup device (input viewpoint). In this way, the same weight may be set for every pixel in one captured image. Further, to simplify the calculation, either the direction of interest or the optical axis direction of the output viewpoint may be used as the orientation of the virtual viewpoint (output viewpoint). That is, the direction weight can be set according to the angle between the optical axis direction or the direction of interest of the virtual viewpoint and the orientation of the image pickup device.
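As a rough illustration of the per-pixel variant, the angle between the two direction vectors can be computed from the viewpoint positions and the subject position. The names `angle_between` and `view_angle`, and the representation of positions as 3-element sequences in world coordinates, are assumptions made for this sketch.

```python
import math
from typing import Sequence


def angle_between(a: Sequence[float], b: Sequence[float]) -> float:
    """Angle in radians between two non-zero vectors of arbitrary magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)


def view_angle(input_pos: Sequence[float],
               output_pos: Sequence[float],
               subject_pos: Sequence[float]) -> float:
    """Angle between the input-viewpoint-to-subject and output-viewpoint-to-subject directions."""
    d_in = [s - p for s, p in zip(subject_pos, input_pos)]
    d_out = [s - p for s, p in zip(subject_pos, output_pos)]
    return angle_between(d_in, d_out)


theta = view_angle((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 0.0))
```

For the simplified variant, the optical axis vectors of the two viewpoints could be passed to `angle_between` directly, which yields one angle per captured image instead of one per pixel.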

方向重み算出部４２０は、画素値算出部４４０による処理に必要になった際に方向重みを算出してもよいし、画素値算出部４４０による処理の前に撮像画像中の各画素について方向重みを予め算出しておいてもよい。後者の場合、方向重み算出部４２０は、上述のように算出された重みを撮像画像の画素のそれぞれに対応づける。そして、画素値算出部４４０は、後述するように着目画素に写る被写体の画素値を撮像画像から抽出する際に、撮像画像上における着目画素に対応する画素に対応づけられた方向重みを、撮像画像の重みとして使用する。 The direction weight calculation unit 420 may calculate the direction weight when it is necessary for the processing by the pixel value calculation unit 440, or the direction weight for each pixel in the captured image before the processing by the pixel value calculation unit 440. May be calculated in advance. In the latter case, the direction weight calculation unit 420 associates the weights calculated as described above with each of the pixels of the captured image. Then, when the pixel value calculation unit 440 extracts the pixel value of the subject reflected in the pixel of interest from the captured image as described later, the pixel value calculation unit 440 captures the directional weight associated with the pixel corresponding to the pixel of interest on the captured image. Used as an image weight.

位置重み算出部４３０は、撮像装置の視野内における、着目方向に位置する被写体の位置に応じて、撮像画像のそれぞれに対して重みを設定する。まず、この構成の意義について説明する。１つの入力視点からの撮像範囲は限られるため、再構成画像に写る全ての被写体が１つの入力視点からの撮像画像に写っていることは少ない。したがって、再構成画像には、１つの入力視点からの撮像画像に写っておりこの撮像画像の色情報が反映される領域と、１つの入力視点からの撮像画像に写っていないためこの撮像画像の色情報が反映されない領域とが含まれることが多い。一方、特に出力視点と向きが近い入力視点からの撮像画像には、方向重み算出部４２０により大きい重みが与えられる。 The position weight calculation unit 430 sets a weight for each of the captured images according to the position, within the field of view of the image pickup device, of the subject located in the direction of interest. First, the significance of this configuration will be described. Since the imaging range from one input viewpoint is limited, it is rare for all the subjects appearing in the reconstructed image to appear in the captured image from a single input viewpoint. Therefore, the reconstructed image often includes both a region that appears in the captured image from one input viewpoint, so that the color information of that captured image is reflected, and a region that does not appear in that captured image, so that its color information is not reflected. Meanwhile, the direction weight calculation unit 420 gives a large weight to a captured image from an input viewpoint whose orientation is particularly close to that of the output viewpoint.

例えば、図５（Ａ）に示される再構成画像５００には、入力視点Ａからの視野内に含まれる領域５１０と、入力視点Ｂからの視野内に含まれる領域５２０と、入力視点Ａ及び入力視点Ｂからの双方の視野内に含まれる領域５３０が示されている。言い換えれば、領域５１０は入力視点Ａからの撮像画像Ａにのみ写っている領域であり、領域５２０は入力視点Ｂからの撮像画像Ｂにのみ写っている領域であり、領域５３０は撮像画像Ａ及び撮像画像Ｂの双方に写っている領域である。図５（Ｂ）は、図５（Ａ）の線分Ｘ−Ｘ’に沿った撮像画像Ａ及び撮像画像Ｂのそれぞれの重みの例を示す。この例において、入力視点Ｂと比較して、入力視点Ａの向きは出力視点の向きに近いため、撮像画像Ａにはより大きい重みが与えられている。図５（Ｂ）から明らかなように、撮像画像Ａの寄与が大きい領域５３０と、撮像画像Ａが寄与しない領域５２０との間では、撮像画像Ａの寄与が大きく異なっている。このため、領域５２０と領域５３０との境界Ｙ付近において色が急に変化する可能性がある。 For example, the reconstructed image 500 shown in FIG. 5(A) includes a region 510 within the field of view from the input viewpoint A, a region 520 within the field of view from the input viewpoint B, and a region 530 within the fields of view from both the input viewpoint A and the input viewpoint B. In other words, the region 510 appears only in the captured image A from the input viewpoint A, the region 520 appears only in the captured image B from the input viewpoint B, and the region 530 appears in both the captured image A and the captured image B. FIG. 5(B) shows an example of the respective weights of the captured image A and the captured image B along the line segment X-X' of FIG. 5(A). In this example, the orientation of the input viewpoint A is closer to the orientation of the output viewpoint than that of the input viewpoint B, so the captured image A is given a larger weight. As is clear from FIG. 5(B), the contribution of the captured image A differs greatly between the region 530, where the captured image A contributes strongly, and the region 520, where the captured image A does not contribute. Therefore, the color may change abruptly near the boundary Y between the region 520 and the region 530.

本実施形態において、位置重み算出部４３０は、着目方向に存在する被写体が撮像装置の視野の周辺部に存在する場合、撮像装置の視野の中心部に存在する場合よりも小さい重みを設定する。すなわち、位置重み算出部４３０は、被写体が入力視点からの視野の周辺部に相当する場合に、被写体が視野の中心部に相当する場合よりも小さい重みを設定する。結果として、入力視点からの視野内において被写体が周辺部分に位置する場合に、撮像画像の方向重みを小さくする効果が得られる。図５（Ｃ）の例では、上記の領域５３０のうち、境界Ｙ付近の画素については、撮像画像Ａに対してはより小さい重みが設定され、撮像画像Ｂに対してはより大きい重みが設定されている。このような構成によれば、上記の領域５３０のうち、境界Ｙ付近において撮像画像Ａの寄与が小さくなるため、領域５２０と領域５３０との境界Ｙ付近における色の変化を小さくすることができる。 In the present embodiment, the position weight calculation unit 430 sets a smaller weight when the subject existing in the direction of interest is in the peripheral portion of the field of view of the image pickup device than when it is in the central portion of the field of view. That is, the position weight calculation unit 430 sets a smaller weight when the subject falls in the peripheral portion of the field of view from the input viewpoint than when it falls in the central portion. As a result, when the subject is located in the peripheral portion of the field of view from the input viewpoint, the effect of reducing the direction weight of the captured image is obtained. In the example of FIG. 5(C), for the pixels near the boundary Y in the region 530, a smaller weight is set for the captured image A and a larger weight is set for the captured image B. With such a configuration, the contribution of the captured image A becomes small near the boundary Y within the region 530, so the color change near the boundary Y between the region 520 and the region 530 can be reduced.

位置重み算出部４３０による具体的な重みの設定方法としては、様々な方法が挙げられる。一実施形態において、撮像装置の視野内における被写体の位置は、着目方向に存在する被写体の、撮像画像中における位置である。そして、位置重み算出部４３０は、被写体が写っている撮像画像中の座標に従って、中心部分よりも周辺部分の方が小さくなるように重みを設定することができる。上記の例であれは、再構成画像中の着目画素（ｕ_０，ｖ_０）に対応する撮像画像中の画素（ｕ_ｉ，ｖ_ｉ）が周辺部分に位置する場合に、この撮像画像の重みを小さくすることができる。 Various methods can be used by the position weight calculation unit 430 to set the weight. In one embodiment, the position of the subject in the field of view of the image pickup device is the position, in the captured image, of the subject existing in the direction of interest. The position weight calculation unit 430 can then set the weight, according to the coordinates in the captured image at which the subject appears, so that it is smaller in the peripheral portion than in the central portion. In the above example, when the pixel (u_i, v_i) in the captured image corresponding to the pixel of interest (u_0, v_0) in the reconstructed image is located in the peripheral portion, the weight of this captured image can be made smaller.

別の方法として、位置重み算出部４３０は、再構成画像中において入力視点から見える領域を判定することができる。そして、位置重み算出部４３０は、判定された領域中の被写体が写っている位置が周辺部分に近いほど重みが小さくなるように、この入力視点からの撮像画像に重みを設定することができる。例えば、再構成画像中の着目画素（ｕ_０，ｖ_０）が判定された領域の中心部分にある場合よりも周辺部分にある場合の方が小さくなるように、重みを設定することができる。 Alternatively, the position weight calculation unit 430 can determine the region visible from the input viewpoint in the reconstructed image. Then, the position weight calculation unit 430 can set the weight on the captured image from this input viewpoint so that the weight becomes smaller as the position in the determined area where the subject is captured is closer to the peripheral portion. For example, the weight can be set so that the pixel of interest (u ₀ , v ₀ ) in the reconstructed image is smaller in the peripheral portion than in the central portion of the determined region.

なお、入力視点からの視野内における位置に応じた重みの設定方法は、上記の方法には限られない。例えば、より品質の高い撮像が行える視線方向にある被写体を撮像して得られた撮像画像の重みを大きくすることができる。また、再構成画像中において撮像画像の色情報が反映される領域を判定し、判定された領域中の被写体が写っている位置が周辺部分に近いほど重みが小さくなるように、この撮像画像に重みを設定することもできる。このような構成は、例えば色情報を重み付け合成する撮像画像の数を制限する場合に有効である。この場合、再構成画像中において入力視点から見える領域と撮像画像の色情報が反映される領域とが一致せず、入力視点から見えるにもかかわらず撮像画像の色情報が反映されない領域が存在する可能性がある。例えば、視点の向きに基づいて２つの撮像画像を選択して合成する場合、互いに隣接する領域の一方においては撮像画像Ａと撮像画像Ｂとの色情報が用いられ、他方においては撮像画像Ａと撮像画像Ｃとの色情報が用いられるかもしれない。この結果、これらの領域の境界において色が急激に変化する可能性がある。一方、このような構成を用いることにより、この境界周辺において撮像画像Ｂ及び撮像画像Ｃの重みが小さくなり、色の急激な変化を抑えることができる。 The method of setting the weight according to the position in the field of view from the input viewpoint is not limited to the above methods. For example, the weight of a captured image obtained by imaging a subject located in a line-of-sight direction in which higher-quality imaging is possible can be increased. It is also possible to determine the region of the reconstructed image in which the color information of a captured image is reflected, and to set the weight of this captured image so that the weight becomes smaller as the position at which the subject appears in the determined region is closer to the peripheral portion. Such a configuration is effective, for example, when the number of captured images whose color information is weighted and combined is limited. In this case, the region of the reconstructed image that is visible from the input viewpoint and the region in which the color information of the captured image is reflected do not coincide, and there may be a region in which the color information of the captured image is not reflected even though it is visible from the input viewpoint. For example, when two captured images are selected and combined based on the orientation of the viewpoint, the color information of the captured image A and the captured image B may be used in one of two adjacent regions, while the color information of the captured image A and the captured image C is used in the other. As a result, the color may change abruptly at the boundary between these regions. With the above configuration, by contrast, the weights of the captured image B and the captured image C become small around this boundary, and an abrupt change in color can be suppressed.

位置重み算出部４３０は、画素値算出部４４０による処理に必要になった際に位置重みを算出してもよいし、画素値算出部４４０による処理の前に撮像画像中の各画素について位置重みを予め算出しておいてもよい。後者の場合、位置重み算出部４３０は、上述のように算出された重みを撮像画像の画素のそれぞれに対応づける。そして、画素値算出部４４０は、後述するように着目画素に写る被写体の画素値を撮像画像から抽出する際に、撮像画像上における着目画素に対応する画素に対応づけられた位置重みを、撮像画像の重みとして使用する。 The position weight calculation unit 430 may calculate the position weight when it becomes necessary for the processing by the pixel value calculation unit 440, or may calculate the position weight for each pixel in the captured image in advance, before the processing by the pixel value calculation unit 440. In the latter case, the position weight calculation unit 430 associates the weight calculated as described above with each pixel of the captured image. Then, when extracting from the captured image the pixel value of the subject appearing at the pixel of interest, as described later, the pixel value calculation unit 440 uses the position weight associated with the pixel on the captured image that corresponds to the pixel of interest as the weight of the captured image.

画素値算出部４４０は、再構成画像中の着目画素について、着目画素に写る被写体の位置を距離マップに基づいて特定し、着目画素に写る被写体の画素値を撮像画像から抽出する。この処理は、上記の式（１）〜（４）に従って行うことができる。そして、画素値算出部４４０は、それぞれの撮像画像から抽出した画素値を、方向重み算出部４２０及び位置重み算出部４３０が算出した重みを用いて重み付け合成する。こうして、再構成画像中のそれぞれの着目画素について画素値（色情報）が決定される。すなわち、以上の処理により、画素値算出部４４０は再構成画像を生成する。 The pixel value calculation unit 440 specifies the position of the subject reflected in the pixel of interest for the pixel of interest in the reconstructed image based on the distance map, and extracts the pixel value of the subject reflected in the pixel of interest from the captured image. This process can be performed according to the above equations (1) to (4). Then, the pixel value calculation unit 440 weights and synthesizes the pixel values extracted from each captured image using the weights calculated by the direction weight calculation unit 420 and the position weight calculation unit 430. In this way, the pixel value (color information) is determined for each pixel of interest in the reconstructed image. That is, by the above processing, the pixel value calculation unit 440 generates a reconstructed image.

最後に、本実施形態に係る画像処理装置１００が行う画像処理方法について、図６（Ａ）を参照して説明する。ステップＳ６１０において、入力視点情報取得部３１０は、上述のように入力視点情報を取得する。ステップＳ６２０において、画像取得部３４０は、上述のように撮像画像を取得する。ステップＳ６３０において、出力視点情報取得部３２０は、上述のように出力視点情報を取得する。ステップＳ６４０において、距離マップ取得部３３０は、上述のように距離マップを取得する。ステップＳ６５０において、レンダリング部３５０は、上述のように再構成画像を生成する。ステップＳ６６０において、画像出力部３６０は、上述のように再構成画像を出力する。 Finally, the image processing method performed by the image processing apparatus 100 according to the present embodiment will be described with reference to FIG. 6(A). In step S610, the input viewpoint information acquisition unit 310 acquires the input viewpoint information as described above. In step S620, the image acquisition unit 340 acquires the captured images as described above. In step S630, the output viewpoint information acquisition unit 320 acquires the output viewpoint information as described above. In step S640, the distance map acquisition unit 330 acquires the distance map as described above. In step S650, the rendering unit 350 generates the reconstructed image as described above. In step S660, the image output unit 360 outputs the reconstructed image as described above.

次に、レンダリング部３５０が行う処理について、図６（Ｂ）を参照して説明する。ステップＳ６５１において、歪曲補正部４１０は、上述のように撮像画像に対して歪曲補正処理を行う。ステップＳ６５２〜ステップＳ６５７は、再構成画像のそれぞれの画素を処理対象として繰り返される。これらのステップにおいて、処理対象の画素を着目画素と呼ぶ。ステップＳ６５３において、方向重み算出部４２０は、着目画素に関して、画素値を決定する際に参照するそれぞれの撮像画像に対して、上述のように視線の向きに基づいて重みを設定する。ステップＳ６５４において、位置重み算出部４３０は、着目画素に関して、画素値を決定する際に参照するそれぞれの撮像画像に対して、上述のように被写体の位置に基づいて重みを設定する。ステップＳ６５５において、画素値算出部４４０は、上述のように、それぞれの撮像画像から抽出した画素値を重み付け合成することにより、着目画素の画素値を決定する。 Next, the processing performed by the rendering unit 350 will be described with reference to FIG. 6B. In step S651, the distortion correction unit 410 performs distortion correction processing on the captured image as described above. Steps S652 to S657 are repeated with each pixel of the reconstructed image as a processing target. In these steps, the pixel to be processed is called the pixel of interest. In step S653, the direction weight calculation unit 420 sets weights for the pixel of interest for each captured image referred to when determining the pixel value, based on the direction of the line of sight as described above. In step S654, the position weight calculation unit 430 sets weights for the pixel of interest for each captured image referred to when determining the pixel value, based on the position of the subject as described above. In step S655, the pixel value calculation unit 440 determines the pixel value of the pixel of interest by weighting and synthesizing the pixel values extracted from the respective captured images as described above.

以上の構成によれば、仮想視点からの再構成画像において、異なる撮像画像がブレンドされている領域の境界部における急激な色の変化を抑制し、違和感を低減することができる。 According to the above configuration, in the reconstructed image from the virtual viewpoint, it is possible to suppress a sudden change in color at the boundary portion of the region where different captured images are blended, and to reduce the sense of discomfort.

（実装例） (Implementation example)
実施形態１ではそれぞれの撮像画像に対して方向重み及び位置重みを設定し、それぞれの撮像画像からの画素値を重み付け合成する場合に説明した。しかしながら、本発明はこのような実施形態に限られず、処理精度及び処理負荷を考慮して様々な実装を採用することができる。例えば、２枚の撮像画像からの画素値を重み付け合成することで、２枚の撮像画像に基づく再構成画像を生成することができる。さらに、この再構成画像と、別の撮像画像又は別の撮像画像に基づく再構成画像と、に基づいて、より多くの撮像画像に基づく再構成画像を生成することができる。また、このように複数段階に分けて撮像画像を合成する場合、それぞれの段階で異なる合成方法を採用することができる。以下では、このような実装例の１つについて説明するとともに、位置重み及び方向重みの具体的な算出方法の１つについて説明する。 Embodiment 1 described the case where a direction weight and a position weight are set for each captured image and the pixel values from the respective captured images are weighted and combined. However, the present invention is not limited to such an embodiment, and various implementations can be adopted in consideration of processing accuracy and processing load. For example, by weighting and combining the pixel values from two captured images, a reconstructed image based on the two captured images can be generated. Further, based on this reconstructed image and another captured image, or a reconstructed image based on another captured image, a reconstructed image based on more captured images can be generated. When the captured images are combined in multiple stages in this way, a different combining method can be adopted at each stage. In the following, one such implementation example is described, together with one specific method of calculating the position weight and the direction weight.

ステップＳ６５０において、画素値算出部４４０は、まず向きが出力視点に最も近い２つの入力視点のペアを選択する。ここでは、１番目の入力視点と２番目の入力視点が選択されたものとする。画素値算出部４４０は、着目画素に対応する撮像画像中の画素の画素値を、式（１）〜（４）に従って抽出する。そして、画素値算出部４４０は、１番目の入力視点からの撮像画像から抽出した画素値Ｉ_１と、２番目の入力視点からの撮像画像から抽出した画素値Ｉ_２とを、式（５）に従って重み付け合成し、画素値Ｉ_１２を算出する。 In step S650, the pixel value calculation unit 440 first selects the pair of two input viewpoints whose orientations are closest to that of the output viewpoint. Here, it is assumed that the first input viewpoint and the second input viewpoint are selected. The pixel value calculation unit 440 extracts the pixel values of the pixels in the captured images corresponding to the pixel of interest according to equations (1) to (4). The pixel value calculation unit 440 then weights and combines the pixel value I_1 extracted from the captured image from the first input viewpoint and the pixel value I_2 extracted from the captured image from the second input viewpoint according to equation (5), and calculates the pixel value I_12.

$$I_{nm} = \frac{\left(\min(w_n, w_m)\, w'_n + \left(1 - \min(w_n, w_m)\right) w_n\right) I_n + \left(\min(w_n, w_m)\, w'_m + \left(1 - \min(w_n, w_m)\right) w_m\right) I_m}{W} \tag{5}$$
式（５）において、ｗ_ｎ，ｗ_ｍはｎ，ｍ番目の入力視点についての位置重みを表す。ｗ’_ｎ，ｗ’_ｍはｎ，ｍ番目の入力視点についての方向重みを表す。Ｗは重みの和を表す。 In equation (5), w_n and w_m represent the position weights for the n-th and m-th input viewpoints, w'_n and w'_m represent the direction weights for the n-th and m-th input viewpoints, and W represents the sum of the weights.
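A minimal sketch of equation (5) follows. The function name `blend_pair` and its argument names are hypothetical, and W is taken here to be the sum of the two coefficients multiplying I_n and I_m, which is one reading of "the sum of the weights".

```python
from typing import Optional


def blend_pair(i_n: float, i_m: float,
               w_n: float, w_m: float,
               wd_n: float, wd_m: float) -> Optional[float]:
    """Combine two pixel values following equation (5).

    w_n, w_m   : position weights of the two input viewpoints
    wd_n, wd_m : direction weights w'_n, w'_m of the two input viewpoints
    """
    k = min(w_n, w_m)
    # Where both position weights are 1 (k == 1) the direction weights decide;
    # as either image approaches its edge (k -> 0) the position weights take over.
    c_n = k * wd_n + (1.0 - k) * w_n
    c_m = k * wd_m + (1.0 - k) * w_m
    total = c_n + c_m  # W, assumed to be the sum of the two coefficients
    if total == 0.0:
        return None  # neither image contributes
    return (c_n * i_n + c_m * i_m) / total


center = blend_pair(100.0, 200.0, 1.0, 1.0, 0.75, 0.25)
edge_of_m = blend_pair(100.0, 200.0, 1.0, 0.0, 0.75, 0.25)
```

In the first call both pixels lie well inside their images, so the result follows the direction weights; in the second call the pixel lies on the edge of image m, so only image n contributes.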

位置重みｗ_ｎは、着目画素に対応する撮像画像中の画素の、撮像画像中における位置に従って、位置重み算出部４３０が求めている。一例として、位置重み算出部４３０は、撮像画像上で、着目方向に存在する被写体の撮像画像の端部からの距離が所定の閾値を超えている場合、一定の重みを設定することができる。また、位置重み算出部４３０は、撮像画像上で、着目方向に存在する被写体の撮像画像の端部からの距離が所定の閾値以下である場合、一定の重みより小さい重みを設定することができる。このような位置重みの設定は、式（６）に従って行うことができる。位置重みｗ_ｍも同様に求めることができる。 The position weight w_n is obtained by the position weight calculation unit 430 according to the position, in the captured image, of the pixel corresponding to the pixel of interest. As an example, the position weight calculation unit 430 can set a constant weight when, on the captured image, the distance of the subject existing in the direction of interest from the edge of the captured image exceeds a predetermined threshold. When that distance is equal to or less than the predetermined threshold, the position weight calculation unit 430 can set a weight smaller than the constant weight. The position weight can be set in this way according to equation (6). The position weight w_m can be obtained in the same manner.
$$w_n = \frac{\min(d_0, d_1, d_2, d_3, d')}{d'} \tag{6}$$
式（６）において、ｄ_０〜ｄ_３は、図７（Ａ）に示すように、着目画素に対応する撮像画像７１０中の画素７２０の、外縁までの距離を示す。ｄ’は端部ブレンド幅を示し、外縁までの距離がｄ’を下回ると位置重みが小さくなる。 In equation (6), d_0 to d_3 denote the distances from the pixel 720, which is the pixel in the captured image 710 corresponding to the pixel of interest, to the outer edges of the captured image, as shown in FIG. 7(A). d' denotes the edge blend width; when the distance to an outer edge falls below d', the position weight becomes smaller.
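Equation (6) can be sketched as below. The function name `position_weight`, the zero-based pixel coordinates, and the choice of measuring distances to the first and last pixel in each row and column are assumptions made for illustration.

```python
def position_weight(x: float, y: float,
                    width: int, height: int,
                    blend_width: float) -> float:
    """Position weight of equation (6) for a pixel at (x, y) in a width x height image.

    blend_width corresponds to d', the edge blend width (must be positive).
    """
    d0 = x                 # distance to the left edge
    d1 = width - 1 - x     # distance to the right edge
    d2 = y                 # distance to the top edge
    d3 = height - 1 - y    # distance to the bottom edge
    # The weight is 1 at least d' away from every edge and falls
    # linearly to 0 on the edge itself.
    return min(d0, d1, d2, d3, blend_width) / blend_width


w_center = position_weight(50, 50, 101, 101, 10)
w_near_edge = position_weight(5, 50, 101, 101, 10)
w_on_edge = position_weight(0, 50, 101, 101, 10)
```

With a blend width of 10 pixels, the three calls give the weight in the image centre, halfway into the blend band, and on the left edge, respectively.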

方向重みｗ’_ｎ，ｗ’_ｍは、入力視点から着目画素に写っている被写体への方向と、出力視点からの着目画素に対応する方向とに従って、方向重み算出部４２０が式（７）を用いて求めている。なお、入力視点から着目画素に写っている被写体への方向は、入力視点のカメラ位置及び被写体の世界座標等を用いて容易に算出することができる。 The direction weights w'_n and w'_m are obtained by the direction weight calculation unit 420 using equation (7), according to the direction from the input viewpoint to the subject appearing at the pixel of interest and the direction from the output viewpoint corresponding to the pixel of interest. The direction from the input viewpoint to the subject appearing at the pixel of interest can easily be calculated from, for example, the camera position of the input viewpoint and the world coordinates of the subject.
$$w'_n = \frac{\theta_m}{\theta_n + \theta_m}, \qquad w'_m = \frac{\theta_n}{\theta_n + \theta_m} \tag{7}$$
式（７）において、図７（Ｂ）に示すように、θ_ｎは、１つの入力視点から着目画素に写っている被写体７５０への方向７７０と、出力視点からの着目画素に対応する方向７６０と、がなす角を示す。また、θ_ｍは、別の入力視点から着目画素に写っている被写体７５０への方向７８０と、出力視点からの着目画素に対応する方向７６０と、がなす角を示す。 In equation (7), as shown in FIG. 7(B), θ_n denotes the angle formed by the direction 770 from one input viewpoint to the subject 750 appearing at the pixel of interest and the direction 760 from the output viewpoint corresponding to the pixel of interest. Similarly, θ_m denotes the angle formed by the direction 780 from another input viewpoint to the subject 750 appearing at the pixel of interest and the direction 760 from the output viewpoint corresponding to the pixel of interest.
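Equation (7) can be sketched as follows. The function name `direction_weights` is hypothetical, and the handling of the degenerate case in which both angles are zero (equal weights) is an assumption added here; the description does not address it.

```python
from typing import Tuple


def direction_weights(theta_n: float, theta_m: float) -> Tuple[float, float]:
    """Direction weights (w'_n, w'_m) of equation (7) from the two angles in radians."""
    total = theta_n + theta_m
    if total == 0.0:
        # Both input directions coincide with the output direction.
        # Equal weights are an assumption, not part of equation (7).
        return 0.5, 0.5
    # Each weight is proportional to the *other* viewpoint's angle, so the
    # viewpoint with the smaller angle receives the larger weight.
    return theta_m / total, theta_n / total


wd_n, wd_m = direction_weights(0.1, 0.3)
```

The two weights always sum to 1, so within a pair the direction weights act as interpolation coefficients between the two captured images.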

また、画素値算出部４４０は、合成された画素値Ｉ_１２の重みｗ_１２を、式（８）を用いて算出する。 Further, the pixel value calculation unit 440 calculates the weight w_12 of the combined pixel value I_12 by using equation (8).
$$w_{nm} = \max(w_n, w_m) \tag{8}$$

画素値算出部４４０は、同様に、向きが出力視点に３番目及び４番目に近い２つの入力視点のペアを選択する。そして、式（５）を用いて合成された画素値Ｉ_３４を算出し、式（８）を用いて合成された画素値Ｉ_３４の重みｗ_３４を算出する。 Similarly, the pixel value calculation unit 440 selects the pair of two input viewpoints whose orientations are the third and fourth closest to that of the output viewpoint. It then calculates the combined pixel value I_34 using equation (5), and calculates the weight w_34 of the combined pixel value I_34 using equation (8).

さらに、画素値算出部４４０は、画素値Ｉ_１２と画素値Ｉ_３４とを合成する。選択された入力視点のペアに基づいて得られた画素値Ｉ_ｑ，Ｉ_ｒの合成は、式（９）に基づいて行われ、合成された画素値Ｉ_ｓが得られる。式（９）において、Ｉ_ｑ及びｗ_ｑはＩ_１に基づく合成画素値（例えば、画素値Ｉ_１２、及び画素値Ｉ_１２とＩ_３４とが合成された画素値等）及びその重みを表す。Ｉ_ｒ及びｗ_ｒは、Ｉ_１に基づかない合成画素値（例えば、画素値Ｉ_３４及びＩ_５６等）及びその重みを表す。 Further, the pixel value calculation unit 440 combines the pixel value I_12 and the pixel value I_34. The pixel values I_q and I_r obtained from the selected pairs of input viewpoints are combined according to equation (9) to obtain a combined pixel value I_s. In equation (9), I_q and w_q denote a combined pixel value based on I_1 (for example, the pixel value I_12, or the pixel value obtained by combining I_12 and I_34) and its weight. I_r and w_r denote a combined pixel value not based on I_1 (for example, the pixel value I_34 or I_56) and its weight.
$$I_s = w_q \cdot I_q + (1 - w_q) \cdot I_r \tag{9}$$
前述のとおり、合成画素値の重みは式（８）を用いて計算できる。言い換えれば、合成画素値Ｉ_ｓの重みｗ_ｓは、ｗ_ｓ＝ｍａｘ（ｗ_ｑ，ｗ_ｒ）と表すことができる。 As described above, the weight of a combined pixel value can be calculated using equation (8). In other words, the weight w_s of the combined pixel value I_s can be expressed as w_s = max(w_q, w_r).

画素値算出部４４０は、さらに、向きが出力視点に５番目及び６番目に近い２つの入力視点のペアを選択し、式（５）（８）を用いて合成画素値Ｉ_５６及びその重みｗ_５６を算出する。そして、画素値算出部４４０は、式（９）を用いて合成画素値Ｉ_５６をＩ_１〜Ｉ_４の合成画素値に合成する。このような処理を繰り返すことにより、着目画素の画素値が算出される。 The pixel value calculation unit 440 further selects the pair of two input viewpoints whose orientations are the fifth and sixth closest to that of the output viewpoint, and calculates the combined pixel value I_56 and its weight w_56 using equations (5) and (8). The pixel value calculation unit 440 then uses equation (9) to combine the combined pixel value I_56 with the combined pixel value of I_1 to I_4. By repeating this processing, the pixel value of the pixel of interest is calculated.
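The repeated application of equations (8) and (9) can be sketched as a simple fold over the per-pair results. The function name `merge_sequential` and the representation of each pair result as a `(value, weight)` tuple, ordered from the pair closest to the output viewpoint to the farthest, are assumptions made for this sketch.

```python
from typing import List, Tuple


def merge_sequential(pair_results: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Fold per-pair results (I_12, w_12), (I_34, w_34), ... into one pixel value.

    pair_results must be non-empty and ordered from the pair closest to the
    output viewpoint to the farthest.
    """
    value, weight = pair_results[0]
    for next_value, next_weight in pair_results[1:]:
        # Equation (9): the accumulated value I_q keeps a share w_q;
        # the new pair's value I_r fills the remainder.
        value = weight * value + (1.0 - weight) * next_value
        # Equation (8) applied to the combined value: w_s = max(w_q, w_r).
        weight = max(weight, next_weight)
    return value, weight


merged = merge_sequential([(100.0, 0.5), (200.0, 1.0), (300.0, 1.0)])
```

In this usage example the first pair has weight 0.5, so the second pair contributes half of the result; after that step the accumulated weight is 1, and the third pair no longer changes the value.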

この実装例に基づく画素値の合成方法について、図８（Ａ）〜（Ｃ）を参照して説明する。図８（Ａ）〜（Ｃ）は、再構成画像上における、それぞれの入力視点の視野範囲（すなわち、各撮像画像の投影範囲）を示す。本実装例においては、１つの着目画素の画素値を決定する際に、まず第１の入力視点ペアからの撮像画像に基づいて、第１の入力視点ペアの視野範囲８１０について画素値が決定される。次に、第２の入力視点ペアからの撮像画像に基づいて、第２の入力視点ペアの視野範囲８２０について画素値が決定され、第１の入力視点ペアの視野範囲８１０の画素値と合成される。第３の入力視点ペア及びさらなる入力視点ペアからの撮像画像に基づいて、さらなる視野範囲８３０について画素値が逐次決定され、既に得られた画素値と合成される。 A method of combining pixel values based on this implementation example will be described with reference to FIGS. 8(A) to 8(C). FIGS. 8(A) to 8(C) show, on the reconstructed image, the visual field range of each input viewpoint (that is, the projection range of each captured image). In this implementation example, when the pixel value of one pixel of interest is determined, a pixel value is first determined for the visual field range 810 of the first input viewpoint pair based on the captured images from the first input viewpoint pair. Next, a pixel value is determined for the visual field range 820 of the second input viewpoint pair based on the captured images from the second input viewpoint pair, and is combined with the pixel value for the visual field range 810 of the first input viewpoint pair. Based on the captured images from the third and further input viewpoint pairs, pixel values are successively determined for the further visual field range 830 and combined with the pixel values already obtained.

この方法によれば、出力視点と向きが近い２つの入力視点を選択して画素値を合成した後に、さらなる２つの入力視点に基づく合成画素値がさらに逐次的に合成される。式（５）に従う２つの画素値の合成は実施形態１と同様である。また、式（９）に従う合成画素値同士の合成も、実施形態１と同様の考えに基づいている。すなわち、合成画素値の重みｗ_ｎｍは２つの入力視点の向きが出力視点の向きに近いほど大きくなるし、被写体が２つの入力視点の視野範囲の周辺部に近いほど小さくなる。このように、合成処理においては、異なる２つの方法を組み合わせることが可能である。また、本発明に係る方法と、その他の方法とを組み合わせて用いることもできる。 According to this method, after two input viewpoints whose orientations are close to that of the output viewpoint are selected and their pixel values are combined, combined pixel values based on further pairs of input viewpoints are combined successively. The combination of two pixel values according to equation (5) is the same as in Embodiment 1. The combination of combined pixel values according to equation (9) is also based on the same idea as Embodiment 1. That is, the weight w_nm of a combined pixel value becomes larger as the orientations of the two input viewpoints are closer to the orientation of the output viewpoint, and becomes smaller as the subject is closer to the peripheral portion of the visual field range of the two input viewpoints. In this way, two different methods can be combined in the combining process. The method according to the present invention can also be used in combination with other methods.

とりわけ、この実装例において、画素値算出部４４０は、向きが仮想視点と近い２つの撮像装置を選択し、選択された撮像装置により撮像された撮像画像を用いて、着目方向に存在する被写体の色情報を決定する第１の処理をまず行う。この処理は、第１の入力視点ペアからの撮像画像に基づく、第１の入力視点ペアの視野範囲８１０における画素値の決定に相当する。ここで、本実施形態において、式（６）によれば入力視点の視野の中央部においては位置重みｗ_ｎが１となり、方向重みｗ_ｎ’は１未満である。したがって、視野範囲８１０の中央部分の重みｗ_ｎｍは１であり、周辺部分のみ重みｗ_ｎｍが１未満となる。 In particular, in this implementation example, the pixel value calculation unit 440 first performs a first process of selecting two image pickup devices whose orientations are close to that of the virtual viewpoint and determining the color information of the subject existing in the direction of interest by using the images captured by the selected image pickup devices. This process corresponds to determining the pixel value in the visual field range 810 of the first input viewpoint pair based on the captured images from the first input viewpoint pair. Here, in the present embodiment, according to equation (6), the position weight w_n is 1 in the central portion of the field of view of the input viewpoint, while the direction weight w'_n is less than 1. Therefore, the weight w_nm is 1 in the central portion of the visual field range 810 and is less than 1 only in the peripheral portion.

したがって、式（９）に従って第２の入力視点ペアの視野範囲８２０における画素値を合成する第２の処理を行うと、視野範囲８１０の中央部分の画素値は更新されず、視野範囲８１０の周辺部分の画素値のみが更新される。また、視野範囲８１０の外については、第２の入力視点ペアからの撮像画像に基づく画素値が用いられる。まとめると、第２の処理において、画素値算出部４４０は、着目方向に存在する被写体が２つの撮像装置（第１の入力視点ペア）の視野内の中央部に存在する場合、第１の処理により決定された被写体の色情報を更新しない。また、着目方向に存在する被写体が２つの撮像装置（第１の入力視点ペア）の視野内の周辺部に存在する場合、画素値算出部４４０は次の処理を行う。すなわち、画素値算出部４４０は、２つの撮像装置とは異なる撮像装置（第２の入力視点ペア）により撮像された撮像画像を用いて決定された着目方向に存在する被写体の色情報を、第１の処理により決定された被写体の色情報と合成する。そして、着目方向に存在する被写体が２つの撮像装置の視野外に存在する場合、画素値算出部４４０は次の処理を行う。すなわち、画素値算出部４４０は、２つの撮像装置（第１の入力視点ペア）とは異なる撮像装置（第２の入力視点ペア）により撮像された撮像画像を用いて、着目方向に存在する被写体の色情報を決定する。 Therefore, when the second process of compositing the pixel values in the visual field range 820 of the second input viewpoint pair is performed according to equation (9), the pixel values in the central portion of the visual field range 810 are not updated, and only the pixel values in the peripheral portion of the visual field range 810 are updated. Outside the visual field range 810, pixel values based on the captured images from the second input viewpoint pair are used. In summary, in the second process, when the subject existing in the direction of interest is in the central portion of the field of view of the two image pickup devices (the first input viewpoint pair), the pixel value calculation unit 440 does not update the color information of the subject determined by the first process. When the subject existing in the direction of interest is in the peripheral portion of the field of view of the two image pickup devices (the first input viewpoint pair), the pixel value calculation unit 440 combines the color information of the subject existing in the direction of interest, determined using captured images captured by image pickup devices different from the two image pickup devices (the second input viewpoint pair), with the color information of the subject determined by the first process. Finally, when the subject existing in the direction of interest is outside the field of view of the two image pickup devices, the pixel value calculation unit 440 determines the color information of the subject existing in the direction of interest using captured images captured by image pickup devices (the second input viewpoint pair) different from the two image pickup devices (the first input viewpoint pair).
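The three cases described above can be sketched as follows. Equation (9) is not reproduced in this section, so the linear blend used here is only an assumed form that is consistent with the described behaviour; the function name `composite_two_pairs` and its parameters are hypothetical.

```python
def composite_two_pairs(color_first, weight_first, color_second):
    """Combine the colour from the first input viewpoint pair with the second.

    weight_first is the weight w_nm of the first pair for the pixel of
    interest: 1 in the central portion of visual field range 810, between
    0 and 1 in its peripheral portion, and 0 outside it.
    The linear blend is an assumption, not the patent's equation (9).
    """
    if weight_first >= 1.0:
        return color_first      # centre: result of the first process is kept
    if weight_first <= 0.0:
        return color_second     # outside range 810: second pair only
    # Periphery: mix the first-process colour with the second pair's colour.
    return tuple(weight_first * a + (1.0 - weight_first) * b
                 for a, b in zip(color_first, color_second))
```

With such a blend, pixels in the centre of the first pair's field of view are untouched by the second process, which matches the behaviour summarised above.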

このような実装例においては、向きが仮想視点と近い撮像装置からの撮像画像が、再構成画像の生成において重視されるため、より違和感の少ない画像を合成することができる。なお、具体的な合成方法は特に限定されず、撮像画像のペアに基づく画素値を合成する代わりに、１つの撮像画像に基づく画素値を合成してもよい。 In such an implementation example, since the image captured from the image pickup device whose orientation is close to the virtual viewpoint is emphasized in the generation of the reconstructed image, it is possible to synthesize an image with less discomfort. The specific synthesizing method is not particularly limited, and instead of synthesizing the pixel values based on the pair of captured images, the pixel values based on one captured image may be synthesized.

［実施形態２］ [Embodiment 2]
実施形態１では、時間とともに位置及び形状が変化する人物のような被写体と、時間とともに位置がほとんど変化しない壁面のような被写体と、の双方について、仮想視点からの距離に基づいて仮想視点からの画像を再構成した。一方で、例えば壁面、床、及び天井のような背景は撮像装置１０８との位置姿勢関係が一定である。したがって、仮想視点の位置姿勢が決まれば、撮像画像に対するホモグラフィ変換を行うことで、背景の再構成画像のうち、この撮像画像に対応する領域の画像を生成することができる。そして、それぞれの撮像画像から得られた背景の画像をブレンディングすることにより、背景の再構成画像を生成することができる。別途、式（１）〜（４）を用いて、又は実施形態１と同様の方法で人物等の背景以外の画像を生成して背景に合成することで、人物等を含む再構成画像を生成することもできる。実施形態２では、このような処理について説明する。なお、以下の説明において、時間とともに位置又は形状が変化する被写体のことを動体と呼び、時間とともに位置及び形状が変化しない被写体のことを背景と呼ぶ。 In the first embodiment, the image from the virtual viewpoint was reconstructed based on the distance from the virtual viewpoint, both for a subject such as a person whose position and shape change with time and for a subject such as a wall surface whose position hardly changes with time. On the other hand, a background such as a wall surface, floor, or ceiling has a fixed position and orientation relationship with the image pickup apparatus 108. Therefore, once the position and orientation of the virtual viewpoint are determined, the image of the region of the reconstructed background image that corresponds to a captured image can be generated by applying a homography transformation to that captured image. A reconstructed background image can then be generated by blending the background images obtained from the respective captured images. Separately, an image of objects other than the background, such as a person, can be generated using equations (1) to (4) or by the same method as in the first embodiment and composited onto the background, which yields a reconstructed image that includes the person or the like. The second embodiment describes such processing. In the following description, a subject whose position or shape changes with time is referred to as a moving object, and a subject whose position and shape do not change with time is referred to as a background.

実施形態２に係る画像処理装置は、図３及び図４に示す画像処理装置１００と同様の構成を有し、以下では異なる点について主に説明する。また、実施形態２に係る処理は、ステップＳ６４０及びＳ６５０の処理が異なることを除き、実施形態１と同様に行うことができる。 The image processing apparatus according to the second embodiment has the same configuration as the image processing apparatus 100 shown in FIGS. 3 and 4, and the differences will be mainly described below. Further, the process according to the second embodiment can be performed in the same manner as the first embodiment except that the processes of steps S640 and S650 are different.

ステップＳ６４０において、距離マップ取得部３３０は、仮想視点から被写体までの距離マップに加えて、背景の位置を示す位置情報を取得する。本実施形態において、背景は、複数の面で構成される被写体のモデルにより表される。本実施形態で用いられる位置情報の例を図９に示す。背景は体育館の壁面、床、及び天井であり、背景の位置情報は４頂点ポリゴンモデル９１０で表される。もっとも、位置情報の種類は特に限定されない。なお、動体の再構成画像を生成しない場合、仮想視点から被写体までの距離マップを取得することは必要ではない。 In step S640, the distance map acquisition unit 330 acquires position information indicating the position of the background in addition to the distance map from the virtual viewpoint to the subject. In the present embodiment, the background is represented by a model of a subject composed of a plurality of surfaces. An example of the position information used in this embodiment is shown in FIG. The background is the wall surface, floor, and ceiling of the gymnasium, and the position information of the background is represented by the 4-vertex polygon model 910. However, the type of location information is not particularly limited. If the reconstructed image of the moving object is not generated, it is not necessary to acquire the distance map from the virtual viewpoint to the subject.

ステップＳ６４０において、距離マップ取得部３３０はさらに、再構成画像の各画素について、撮像画像中の対応する画素を判定する。ここで、対応する画素とは、背景の同じ位置が写っている画素のことを指す。入力視点、出力視点、及び背景モデルの位置姿勢関係が既知であるため、この処理は任意の方法を用いて行うことができる。例えば、背景モデルに含まれる背景平面を出力視点からの再構成画像及び入力視点からの撮像画像に射影することにより、再構成画像中の画素位置を撮像画像中の画素位置に変換するホモグラフィ行列を算出することができる。本実施形態の場合、背景モデルに含まれる背景平面は、４頂点ポリゴンの１つを意味する。このような処理をそれぞれの背景平面について繰り返すことにより、再構成画像中の各画素に対応する背景平面が決定される。また、背景平面ごとに、それぞれの撮像画像について、再構成画像中の着目画素に対応する画素位置を算出するためのホモグラフィ行列が得られる。このホモグラフィ行列を用いて、それぞれの撮像画像について、再構成画像中の着目画素に対応する画素位置を算出することができる。なお、着目画素がどの背景平面にも対応しない場合、及び着目画素が動体が写っている領域にある場合には、本実施形態では実施形態１と同様に着目画素の画素値が算出されるため、着目画素に対応する撮像画像中の画素を判定するこの処理は行わなくてもよい。 In step S640, the distance map acquisition unit 330 further determines, for each pixel of the reconstructed image, the corresponding pixel in the captured image. Here, the corresponding pixel refers to a pixel showing the same position on the background. Since the position and orientation relationship among the input viewpoints, the output viewpoint, and the background model is known, this process can be performed by any method. For example, by projecting a background plane included in the background model onto the reconstructed image from the output viewpoint and onto the captured image from an input viewpoint, a homography matrix that transforms pixel positions in the reconstructed image into pixel positions in the captured image can be calculated. In the present embodiment, a background plane included in the background model means one of the four-vertex polygons. By repeating this processing for each background plane, the background plane corresponding to each pixel in the reconstructed image is determined. In addition, for each background plane and for each captured image, a homography matrix for calculating the pixel position corresponding to the pixel of interest in the reconstructed image is obtained. Using this homography matrix, the pixel position corresponding to the pixel of interest in the reconstructed image can be calculated for each captured image. Note that, when the pixel of interest does not correspond to any background plane, or when the pixel of interest lies in a region showing a moving object, the pixel value of the pixel of interest is calculated in the present embodiment in the same manner as in the first embodiment, so this process of determining the pixel in the captured image corresponding to the pixel of interest need not be performed.
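As an illustration of the homography step, the following minimal sketch computes a 3x3 homography from the four vertices of one background polygon as projected into the reconstructed image (`src`) and into a captured image (`dst`). It assumes the projected vertex positions are already available; the function names, the direct linear solve, and the normalization h33 = 1 are choices made for this sketch and are not taken from the patent.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_quad(src, dst):
    """Homography (h33 fixed to 1) mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(h, point):
    """Map a pixel position in the reconstructed image to the captured image."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

One such matrix would be computed per background plane and per captured image, and `apply_homography` would then be evaluated for each pixel of interest that falls on that plane.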

以下、ステップＳ６５０における処理について説明する。ステップＳ６５１は実施形態１と同様に行うことができる。ステップＳ６５２〜Ｓ６５６は、実施形態１と同様、再構成画像の各画素について行われ、着目画素の画素値が算出される。以下では、再構成画像のうち、背景が写っている画素の画素値を算出する処理について説明する。再構成画像のうち、動体が写っている画素又は対応する背景平面が存在しない画素の画素値は、例えば実施形態１と同様の方法により算出することができる。また、再構成画像のうち、背景が写っている領域と動体が写っている領域との識別は、従来知られている方法により行うことができる。例えば、動体が存在していない場合の距離マップと、動体が存在する場合の距離マップとを比較し、画素値の差分が閾値以上である画素を動体が写っている領域と判定することができる。 Hereinafter, the processing in step S650 will be described. Step S651 can be performed in the same manner as in the first embodiment. Steps S652 to S656 are performed for each pixel of the reconstructed image as in the first embodiment, and the pixel value of the pixel of interest is calculated. The following describes the process of calculating the pixel values of pixels in the reconstructed image that show the background. The pixel values of pixels that show a moving object, or for which no corresponding background plane exists, can be calculated by, for example, the same method as in the first embodiment. The region showing the background and the region showing a moving object in the reconstructed image can be distinguished by a conventionally known method. For example, a distance map obtained when no moving object is present can be compared with a distance map obtained when a moving object is present, and pixels whose pixel value difference is equal to or larger than a threshold can be determined to belong to the region showing the moving object.
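The distance map comparison mentioned as an example can be sketched as below. The function name `moving_object_mask`, the nested-list map representation, and the use of an absolute difference are assumptions made for illustration.

```python
def moving_object_mask(empty_distance_map, current_distance_map, threshold):
    """Mark pixels whose distance differs from the empty-scene map by >= threshold.

    Both maps are lists of rows of distances from the virtual viewpoint.
    True means the pixel is treated as showing a moving object.
    """
    return [
        [abs(current - empty) >= threshold
         for empty, current in zip(empty_row, current_row)]
        for empty_row, current_row in zip(empty_distance_map, current_distance_map)
    ]
```

Pixels marked `False` would be handled by the background path described here, and pixels marked `True` by the method of the first embodiment.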

ここまでの処理で、着目画素に対応する撮像画像中の画素は既知である。したがって、実施形態１と同様に、ステップＳ６５３で方向重み算出部４２０はそれぞれの撮像画像について方向重みを算出することができ、ステップＳ６５４で位置重み算出部４３０はそれぞれの撮像画像について位置重みを算出することができる。そして、ステップＳ６５５で画素値算出部４４０は着目画素の画素値を決定することができる。例えば、画素値算出部４４０は、仮想視点からの画像への背景モデル面の射影像と、撮像画像への背景モデル面の射影像と、の間の座標変換を用いて、着目方向に存在する被写体に対応する撮像画像上の画素を特定することができる。そして、画素値算出部４４０は、特定された画素の色情報を着目方向に存在する被写体の色情報として取得することができる。最後に、画素値算出部４４０は、それぞれの撮像画像から抽出した色情報を方向重み及び位置重みを用いて重み付け合成することにより、着目方向に存在する被写体の色情報を決定することができる。 By the processing up to this point, the pixels in the captured images that correspond to the pixel of interest are known. Therefore, as in the first embodiment, the direction weight calculation unit 420 can calculate a direction weight for each captured image in step S653, and the position weight calculation unit 430 can calculate a position weight for each captured image in step S654. Then, in step S655, the pixel value calculation unit 440 can determine the pixel value of the pixel of interest. For example, the pixel value calculation unit 440 can identify the pixel on a captured image that corresponds to the subject existing in the direction of interest by using the coordinate transformation between the projection of a background model surface onto the image from the virtual viewpoint and the projection of the same background model surface onto the captured image. The pixel value calculation unit 440 can then acquire the color information of the identified pixel as the color information of the subject existing in the direction of interest. Finally, the pixel value calculation unit 440 can determine the color information of the subject existing in the direction of interest by weighting and combining the color information extracted from the respective captured images using the direction weights and the position weights.
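The final weighted combination can be sketched as follows. How the direction weight and the position weight are merged is defined in the first embodiment and is not reproduced here, so the per-image product and the normalization by the total weight are assumptions; `blend_colors` is a hypothetical name.

```python
def blend_colors(colors, direction_weights, position_weights):
    """Weighted average of per-image colours for one pixel of interest.

    colors: list of (r, g, b) tuples, one per captured image.
    The per-image weight is assumed to be direction weight * position weight.
    Returns None when every weight is zero.
    """
    weights = [d * p for d, p in zip(direction_weights, position_weights)]
    total = sum(weights)
    if total == 0:
        return None
    return tuple(
        sum(w * color[channel] for w, color in zip(weights, colors)) / total
        for channel in range(3)
    )
```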

本実施形態によれば、背景画像については、再構成画像中の画素に対応する撮像画像中の画素を特定するために式（１）〜（４）の計算を行わなくてもよいため、処理を高速化することができる。 According to the present embodiment, for the background image, the calculations of equations (1) to (4) need not be performed to identify the pixels in the captured image that correspond to the pixels in the reconstructed image, so the processing can be sped up.

前記背景の位置を示す位置情報には、透過方向を示す情報を設定することができる。この場合、仮想視点からの視線が、背景の透過方向に従って背景を横切るならば、背景を描画しないことができる。また、仮想視点からの視線が、背景の反射方向に従って背景を横切るならば、背景を描画することができる。このような処理により、例えば、仮想視点が床の上方にある場合に床を描画し、仮想視点が床の下方にある場合には床を描画しないように、制御を行うことができる。このような処理によれば、より自由な仮想視点からの再構成画像を生成することが可能となる。 Information indicating a transmission direction can be set in the position information indicating the position of the background. In this case, if the line of sight from the virtual viewpoint crosses the background along the transmission direction of the background, the background can be left undrawn. Conversely, if the line of sight from the virtual viewpoint crosses the background along the reflection direction of the background, the background can be drawn. Such processing makes it possible, for example, to draw the floor when the virtual viewpoint is above the floor and not to draw the floor when the virtual viewpoint is below the floor. This allows reconstructed images to be generated from a less restricted range of virtual viewpoints.

具体的な一例として、背景の位置を示すポリゴンモデルの各面に、反射面か透過面かを示す情報を与えることができる。ここで、１つのポリゴンについては、両面に独立に反射面か透過面かを示す情報を与えることができる。仮想視点がポリゴンの透過面側に存在する場合、このポリゴンは存在しないものとして扱いながら、再構成画像の各画素について、撮像画像中の対応する画素を判定することができる。例えば、仮想視点がポリゴンの透過面側に存在する場合、このポリゴンについてはホモグラフィ行列の算出及びそのための射影処理を行わなくてよい。このような例においては、床を示すポリゴンのうち、表面（内面）に反射面を示す情報を設定し、裏面（外面）に透過面を示す情報を設定することにより、上記のような制御が可能となる。なお、ポリゴンの反射面側に、仮想視点と撮像装置との双方が存在する場合に、このポリゴンが存在するものとして扱うこともできる。 As a specific example, information indicating whether a surface is reflective or transmissive can be given to each surface of the polygon model indicating the position of the background. For a single polygon, this information can be given to its two sides independently. When the virtual viewpoint is on the transmissive side of a polygon, the corresponding pixel in the captured image can be determined for each pixel of the reconstructed image while treating the polygon as if it did not exist. For example, when the virtual viewpoint is on the transmissive side of a polygon, the homography matrix and the projection needed to compute it can be skipped for that polygon. In this example, the control described above becomes possible by marking the front (inner) surface of the polygon representing the floor as reflective and its back (outer) surface as transmissive. It is also possible to treat a polygon as present only when both the virtual viewpoint and the image pickup device are on the reflective side of the polygon.
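The reflective/transmissive test can be sketched with a signed-distance check against the polygon's plane. The representation below (one vertex, a normal pointing out of the front surface, and a label per side) and the name `polygon_is_drawn` are assumptions made for this sketch.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def polygon_is_drawn(viewpoint, vertex, front_normal, front_face, back_face):
    """Return True if the polygon is treated as present for this viewpoint.

    vertex is any vertex of the polygon, front_normal points out of its
    front (inner) surface, and front_face / back_face are either
    "reflective" or "transmissive".
    """
    offset = [p - v for p, v in zip(viewpoint, vertex)]
    # A positive signed distance means the viewpoint faces the front surface.
    facing = front_face if dot(front_normal, offset) > 0 else back_face
    return facing == "reflective"
```

For a floor at height zero whose front surface faces upward and is marked reflective, a viewpoint above the floor yields `True` and a viewpoint below it yields `False`, so the homography computation for that polygon can be skipped in the latter case.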

［実施形態３］ [Embodiment 3]
実施形態１では、実装例として、２つの入力視点ペアからの撮像画像に基づいて出力視点からの画像を得て、各画像を合成することにより、再構成画像を生成する例について説明した。実施形態３では、３つ以上の入力視点からの撮像画像に基づいて再構成画像を生成する例について説明する。 The first embodiment described, as an implementation example, generating a reconstructed image by obtaining images from the output viewpoint based on captured images from two input viewpoint pairs and compositing those images. The third embodiment describes an example of generating a reconstructed image based on captured images from three or more input viewpoints.

本実施形態においては、以下のようにして再構成画像上の着目画素の画素値が決定される。まず、入力視点からの撮像画像において、着目画素の方向（着目方向）に存在する被写体が撮像画像に写っているような入力視点が選択される。次に、既に説明したように、入力視点の撮像方向に応じた方向重み（又は角度重み）が算出され、入力視点からの視野内におけるオブジェクトの位置に応じた位置重みが算出される。そして、方向重みと位置重みとを合成することで、入力視点の向きと入力視点の視野内におけるオブジェクトの位置とに応じた重みが算出される。このようにして、方向重み（又は角度重み）と位置重みとの双方を考慮して、それぞれの入力視点に重みが設定される。そして、入力視点からの撮像画像から既に説明したように得られた着目画素の画素値を、入力視点に設定された重みに応じて重み付け合成することにより、着目画素の画素値が決定される。 In the present embodiment, the pixel value of the pixel of interest on the reconstructed image is determined as follows. First, in the captured image from the input viewpoint, the input viewpoint in which the subject existing in the direction of the pixel of interest (direction of interest) is reflected in the captured image is selected. Next, as described above, the direction weight (or angle weight) according to the imaging direction of the input viewpoint is calculated, and the position weight according to the position of the object in the field of view from the input viewpoint is calculated. Then, by synthesizing the direction weight and the position weight, the weight corresponding to the direction of the input viewpoint and the position of the object in the field of view of the input viewpoint is calculated. In this way, weights are set for each input viewpoint in consideration of both the directional weight (or the angular weight) and the position weight. Then, the pixel value of the pixel of interest is determined by weighting and synthesizing the pixel value of the pixel of interest obtained from the captured image from the input viewpoint according to the weight set in the input viewpoint.

以下、本実施形態における再構成画像生成方法の具体例について、図１０に示す入力視点が３視点の場合を例として説明する。図１０（Ａ）は、再構成画像１０００上での入力視点１〜３の可視範囲１００１〜１００３を示す。本実施形態においては、撮像画像の端部からの距離に基づいて位置重みが設定される。従って、再構成画像１０００上での位置重みは、可視範囲の端部から内側に向かって増大し、十分に内側の領域では一定値となる。例えば、図１０（Ｃ）は、線分１００４に沿った各位置における位置重みを示す。領域１００９〜１０１１は、それぞれ入力視点１〜３の位置重みに対応する。 Hereinafter, a specific example of the reconstructed image generation method in the present embodiment will be described by taking the case where the input viewpoints shown in FIG. 10 are three viewpoints as an example. FIG. 10A shows the visible ranges 1001 to 1003 of the input viewpoints 1 to 3 on the reconstructed image 1000. In this embodiment, the position weight is set based on the distance from the end of the captured image. Therefore, the position weight on the reconstructed image 1000 increases inward from the end of the visible range and becomes a constant value in a sufficiently inner region. For example, FIG. 10C shows the position weights at each position along the line segment 1004. The regions 1009 to 1011 correspond to the position weights of the input viewpoints 1 to 3, respectively.
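A position weight with the shape described here (rising inward from the edge and constant in the interior) can be sketched as a clamped ramp. The linear profile and the `margin` parameter are assumptions; the document's own weight formula is not reproduced in this section.

```python
def position_weight(x, y, width, height, margin):
    """Weight that ramps from 0 at the image border to 1 at `margin` pixels inside.

    (x, y) is the pixel position in a captured image of size width x height.
    The linear ramp is an illustrative choice, not the patent's formula.
    """
    distance_to_edge = min(x, y, width - 1 - x, height - 1 - y)
    if distance_to_edge < 0:
        return 0.0  # the position lies outside the captured image
    return min(1.0, distance_to_edge / margin)
```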

本実施形態では、再構成画像１０００を、着目方向に存在する被写体が撮像画像に写っているような入力視点の組み合わせに応じて分割し、それぞれの領域について角度重みを設定する。すなわち、角度重みによる重み付けは、着目画素に対応する被写体が見えている視点の組ごとに行われる。図１０（Ｂ）は、このような分割の例を示す。 In the present embodiment, the reconstructed image 1000 is divided according to a combination of input viewpoints such that a subject existing in the direction of interest is captured in the captured image, and an angle weight is set for each region. That is, weighting by angle weighting is performed for each set of viewpoints in which the subject corresponding to the pixel of interest is visible. FIG. 10B shows an example of such a division.

入力視点１及び２から見えている領域１００５に対しては、入力視点１，２に基づく角度重みを設定する。領域１００５における角度重みの設定方法を、概念図１００６を参照して説明する。概念図１００６においては、円が光線方向を表しており、矢印が出力視点からの光線、白丸が被写体が見えている入力視点からの光線を表している。領域１００５については、概念図１００６に示される角度ｄ_１（入力視点１からの光線と出力視点からの光線とがなす角度）と、角度ｄ_２（入力視点２からの光線と出力視点からの光線とがなす角度）に基づき角度重みが設定される。 For the region 1005, which is visible from input viewpoints 1 and 2, angle weights based on input viewpoints 1 and 2 are set. The method of setting the angle weights in the region 1005 is described with reference to the conceptual diagram 1006. In the conceptual diagram 1006, the circle represents ray directions, the arrow represents the ray from the output viewpoint, and the white circles represent the rays from the input viewpoints from which the subject is visible. For the region 1005, the angle weights are set based on the angle d_1 (the angle between the ray from input viewpoint 1 and the ray from the output viewpoint) and the angle d_2 (the angle between the ray from input viewpoint 2 and the ray from the output viewpoint) shown in the conceptual diagram 1006.

また、概念図１００８に示されるように、領域１００７については、角度ｄ_１と、角度ｄ_２と、に基づき角度重みが設定される。このように、被写体が見えている入力視点が３つ以上である場合、出力視点と距離又は光線が近い２つの入力視点についての角度ｄ_１及びｄ_２に基づいて、角度重みが設定される。 Likewise, as shown in the conceptual diagram 1008, the angle weights for the region 1007 are set based on the angles d_1 and d_2. In this way, when the subject is visible from three or more input viewpoints, the angle weights are set based on the angles d_1 and d_2 for the two input viewpoints whose distance or ray is close to that of the output viewpoint.
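The selection of the two nearest input viewpoints and the use of d_1 and d_2 can be sketched as follows. The weighting rule used here (each viewpoint weighted by the other viewpoint's angle, so that a smaller angle gives a larger weight) is an assumption for illustration, since the angle weight formula itself is defined elsewhere in the document; `angle_weights` is a hypothetical name.

```python
import math


def angle_between(a, b):
    """Angle in radians between two 3D ray directions."""
    cos_angle = sum(x * y for x, y in zip(a, b)) / (
        math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def angle_weights(output_ray, input_rays):
    """Angle weights for the two visible input viewpoints nearest in angle.

    input_rays maps viewpoint id -> ray direction, for viewpoints from
    which the subject is visible. Assumed rule: w_1 = d_2 / (d_1 + d_2).
    """
    angles = sorted((angle_between(output_ray, ray), vid)
                    for vid, ray in input_rays.items())
    if not angles:
        return {}
    if len(angles) == 1:
        return {angles[0][1]: 1.0}
    (d1, v1), (d2, v2) = angles[:2]
    if d1 + d2 == 0:
        return {v1: 0.5, v2: 0.5}
    return {v1: d2 / (d1 + d2), v2: d1 / (d1 + d2)}
```

With three visible viewpoints at 10, 30, and 60 degrees from the output ray, only the first two receive a weight under this rule.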

入力視点の組み合わせのそれぞれについて角度重みを設定した後は、位置重みの設定及び再統合が行われる。図１０（ｄ）は、線分１００４に沿った、入力視点の組み合わせのそれぞれについての位置重みを表した図である。領域１０１２は、入力視点１のみからなる組の成分についての重みであり、領域１０１３は入力視点１及び２からなる組についての成分の重みであり、領域１０１４は入力視点１と２と３とからなる組についての成分の重みである。各組の位置重みの設定方法は特に限定されない。例えば、組に含まれるそれぞれの入力視点について実施形態１と同様に設定される位置重みのうち、最小の位置重みに基づいて、組についての位置重みを設定することができる。具体例としては、入力視点数の多い組から順に、被写体が見えている入力視点についての位置重みのうち最小のものに入力視点数を乗じ、得られた値を位置重みとして割り当てていく方法が挙げられる。 After the angle weights have been set for each combination of input viewpoints, the position weights are set and the results are reintegrated. FIG. 10D shows the position weights for each combination of input viewpoints along the line segment 1004. The region 1012 is the weight of the component for the set consisting only of input viewpoint 1, the region 1013 is the weight of the component for the set consisting of input viewpoints 1 and 2, and the region 1014 is the weight of the component for the set consisting of input viewpoints 1, 2, and 3. The method of setting the position weight of each set is not particularly limited. For example, the position weight for a set can be set based on the smallest of the position weights that are set, in the same manner as in the first embodiment, for the input viewpoints included in the set. As a specific example, proceeding in order from the set with the largest number of input viewpoints, the smallest of the position weights of the input viewpoints from which the subject is visible is multiplied by the number of input viewpoints, and the resulting value is assigned as the position weight of the set.

実施形態３におけるレンダリング部３５０の構成例を図１１に示す。歪曲補正部４１０及び位置重み算出部４３０の構成及び処理は実施形態１と同様である。可視判定部１１０１は、オブジェクトが複数の撮像装置のそれぞれから見えているか否かを判定する。例えば、可視判定部１１０１は、再構成画像上の着目画素に対応する被写体が、各入力視点からの撮像画像上で見えているかを判定し、判定結果を可視視点情報として生成する。 FIG. 11 shows a configuration example of the rendering unit 350 in the third embodiment. The configuration and processing of the distortion correction unit 410 and the position weight calculation unit 430 are the same as those in the first embodiment. The visibility determination unit 1101 determines whether or not the object is visible from each of the plurality of image pickup devices. For example, the visibility determination unit 1101 determines whether the subject corresponding to the pixel of interest on the reconstructed image is visible on the captured image from each input viewpoint, and generates the determination result as the visible viewpoint information.

視点選択部１１０２は、オブジェクトが見えている撮像装置群から選択された１以上の撮像装置の組を示す視点組み合わせ情報（選択情報）を生成する。例えば、視点選択部１１０２は、被写体が見えている入力視点の組から、１以上の入力視点の組を選択することにより視点組み合わせ情報を生成する。合成位置重み算出部１１０３は、視点組み合わせ情報に基づいて、各入力視点についての位置重みから合成位置重みを算出する。また、合成位置重み算出部１１０３は、重みを使い切った入力視点を無効化し、重みを使い切った入力視点の情報を視点選択部１１０２に通知することにより視点選択部１１０２による選択の対象から外す。方向重み算出部１１０４は、視点組み合わせ情報に基づいて角度重みを決定する。重み算出部１１０５は、合成位置重みと角度重みとに基づいて合成重みを算出する。画素値算出部１１０６は、合成重みと歪曲補正画像とに基づいて画素値を算出する。 The viewpoint selection unit 1102 generates viewpoint combination information (selection information) indicating a set of one or more image pickup devices selected from the image pickup device group in which the object is visible. For example, the viewpoint selection unit 1102 generates viewpoint combination information by selecting one or more sets of input viewpoints from the set of input viewpoints in which the subject is visible. The composite position weight calculation unit 1103 calculates the composite position weight from the position weights for each input viewpoint based on the viewpoint combination information. Further, the composite position weight calculation unit 1103 invalidates the input viewpoint that has used up the weight, and notifies the viewpoint selection unit 1102 of the information of the input viewpoint that has used up the weight, thereby excluding it from the selection target by the viewpoint selection unit 1102. The direction weight calculation unit 1104 determines the angle weight based on the viewpoint combination information. The weight calculation unit 1105 calculates the composite weight based on the composite position weight and the angle weight. The pixel value calculation unit 1106 calculates the pixel value based on the composite weight and the distortion-corrected image.

図１２は、視点選択部１１０２及び合成位置重み算出部１１０３の詳細な構成例を示す。視点組み合わせ生成部１２０１は、有効視点バッファ１２０２に格納されている有効視点情報に基づいて入力視点の組み合わせを選択することにより、視点組み合わせ情報を生成する。ここで、視点組み合わせ生成部１２０１は、選択する時点で有効となっている入力視点を選択する。有効視点バッファ１２０２は有効視点情報を保持するバッファであり、有効視点情報は各入力視点について有効か無効かを示す。有効視点情報は、再構成画像上の着目画素が変わるたびに、可視視点情報に基づいて、被写体が見えている入力視点が有効となるように初期化される。 FIG. 12 shows a detailed configuration example of the viewpoint selection unit 1102 and the combined position weight calculation unit 1103. The viewpoint combination generation unit 1201 generates viewpoint combination information by selecting a combination of input viewpoints based on the effective viewpoint information stored in the effective viewpoint buffer 1202. Here, the viewpoint combination generation unit 1201 selects an input viewpoint that is valid at the time of selection. The effective viewpoint buffer 1202 is a buffer that holds the effective viewpoint information, and the effective viewpoint information indicates whether it is valid or invalid for each input viewpoint. The effective viewpoint information is initialized so that the input viewpoint in which the subject is visible becomes effective based on the visible viewpoint information each time the pixel of interest on the reconstructed image changes.

位置重み和算出部１２０４は、各入力視点についての位置重みの総和を算出する。なお、不可視の入力視点、すなわち再構成画像上の着目画素に対応する被写体が見えていない入力視点についての位置重みは０として算出が行われる。位置重みバッファ１２０５は、各入力視点についての位置重みを保持するバッファであり、再構成画像上の着目画素が変わる度に、各入力視点について位置重み算出部４３０が得た位置重みを用いて初期化される。具体的には、初期化の際には、位置重みバッファ１２０５には、各入力視点についての位置重みを位置重みの総和で割った値が格納される。 The position weight sum calculation unit 1204 calculates the sum of the position weights of the input viewpoints. In this calculation, the position weight of an invisible input viewpoint, that is, an input viewpoint from which the subject corresponding to the pixel of interest on the reconstructed image is not visible, is treated as 0. The position weight buffer 1205 holds the position weight for each input viewpoint and is initialized, each time the pixel of interest on the reconstructed image changes, using the position weights obtained by the position weight calculation unit 430 for each input viewpoint. Specifically, at initialization, the position weight buffer 1205 stores, for each input viewpoint, the position weight divided by the sum of the position weights.

最小位置重み算出部１２０６は、視点組み合わせ情報に基づいて、選択された入力視点から、位置重みバッファ１２０５に格納されている位置重みが最小である入力視点を選択する。選択された入力視点の情報は有効視点更新部１２０３に通知され、有効視点更新部１２０３は、選択された入力視点が無効視点となるように有効視点バッファ１２０２を更新する。また、重み更新部１２０７は、位置重みバッファ１２０５が保持する位置重みについて、視点組み合わせ生成部１２０１が選択した入力視点についての位置重みから、最小位置重み算出部１２０６が選択した入力視点についての位置重みを減算する更新処理を行う。重み決定部１２０８は、最小位置重み算出部１２０６が選択した入力視点についての位置重みに、視点組み合わせ生成部１２０１が選択した入力視点の数を乗じることで、合成位置重みを算出する。 Based on the viewpoint combination information, the minimum position weight calculation unit 1206 selects, from among the selected input viewpoints, the input viewpoint whose position weight stored in the position weight buffer 1205 is the smallest. Information on the selected input viewpoint is notified to the effective viewpoint update unit 1203, which updates the effective viewpoint buffer 1202 so that the selected input viewpoint becomes an invalid viewpoint. The weight update unit 1207 updates the position weights held in the position weight buffer 1205 by subtracting the position weight of the input viewpoint selected by the minimum position weight calculation unit 1206 from the position weight of each input viewpoint selected by the viewpoint combination generation unit 1201. The weight determination unit 1208 calculates the combined position weight by multiplying the position weight of the input viewpoint selected by the minimum position weight calculation unit 1206 by the number of input viewpoints selected by the viewpoint combination generation unit 1201.

実施形態３におけるレンダリング部３５０による処理の流れを図１３に示す。ステップＳ６５１、ステップＳ６５２、ステップＳ６５６は実施形態１と同様に行われる。ステップＳ１３０１において可視判定部１１０１は上述のように可視視点情報を生成する。ステップＳ１３０２において可視判定部１１０１は上述のように可視視点情報に基づいて有効視点バッファ１２０２の初期化を行う。ステップＳ１３０３において位置重み算出部４３０は上述のように各入力視点についての位置重みを算出する。ステップＳ１３０４において位置重み和算出部１２０４は上述のように各入力視点についての位置重みの総和を算出し、位置重みバッファ１２０５の初期化を行う。 FIG. 13 shows the flow of processing by the rendering unit 350 in the third embodiment. Step S651, step S652, and step S656 are performed in the same manner as in the first embodiment. In step S1301, the visibility determination unit 1101 generates visible viewpoint information as described above. In step S1302, the visibility determination unit 1101 initializes the effective viewpoint buffer 1202 based on the visible viewpoint information as described above. In step S1303, the position weight calculation unit 430 calculates the position weight for each input viewpoint as described above. In step S1304, the position weight sum calculation unit 1204 calculates the total position weights for each input viewpoint as described above, and initializes the position weight buffer 1205.

ステップＳ１３０５において視点組み合わせ生成部１２０１は、上述のように有効視点情報に基づいて入力視点の組み合わせを選択し、視点組み合わせ情報を生成する。例えば、視点組み合わせ生成部１２０１は、有効な入力視点をすべて選択する。ステップＳ１３０６において最小位置重み算出部１２０６は、視点組み合わせ情報と位置重みバッファ１２０５に格納された位置重みとに基づいて、位置重みが最小である入力視点を上述のように選択する。ステップＳ１３０７において有効視点更新部１２０３は、ステップＳ１３０６で選択された入力視点が無効化されるように、上述のように有効視点バッファ１２０２を更新する。ステップＳ１３０８において重み更新部１２０７は、上述のように位置重みバッファ１２０５を更新する。具体的には、重み更新部１２０７は、ステップＳ１３０５で選択された各入力視点について、現在の位置重みからステップＳ１３０６で選択された入力視点についての位置重みを減算する。ステップＳ１３０９において重み決定部１２０８は、ステップＳ１３０６で選択された入力視点についての位置重みに、ステップＳ１３０５で選択された入力視点の数を乗じることで、上述のように合成位置重みを算出する。こうして算出された合成位置重みは、ステップＳ１３０５で選択された入力視点の組み合わせについての合成位置重みとして用いられる。 In step S1305, the viewpoint combination generation unit 1201 selects a combination of input viewpoints based on the effective viewpoint information as described above, and generates the viewpoint combination information. For example, the viewpoint combination generation unit 1201 selects all valid input viewpoints. In step S1306, the minimum position weight calculation unit 1206 selects the input viewpoint having the smallest position weight as described above, based on the viewpoint combination information and the position weights stored in the position weight buffer 1205. In step S1307, the effective viewpoint update unit 1203 updates the effective viewpoint buffer 1202 as described above so that the input viewpoint selected in step S1306 is invalidated. In step S1308, the weight update unit 1207 updates the position weight buffer 1205 as described above. Specifically, for each input viewpoint selected in step S1305, the weight update unit 1207 subtracts the position weight of the input viewpoint selected in step S1306 from the current position weight. In step S1309, the weight determination unit 1208 calculates the combined position weight as described above by multiplying the position weight of the input viewpoint selected in step S1306 by the number of input viewpoints selected in step S1305. The combined position weight calculated in this way is used as the combined position weight for the combination of input viewpoints selected in step S1305.

ステップＳ１３１０において、方向重み算出部１１０４は、上述のように視点組み合わせ情報に基づいて方向重みを決定する。この方向重みは、ステップＳ１３０５で選択された入力視点の組み合わせについて、各入力視点の重みを示す方向重みとして用いられる。ステップＳ１３１１において重み算出部１１０５は、ステップＳ１３１０で各入力視点について決定された方向重みに、ステップＳ１３０９で算出された合成位置重みを乗じることにより、各入力視点についての重みの更新量を算出する。 In step S1310, the direction weight calculation unit 1104 determines the direction weight based on the viewpoint combination information as described above. This directional weight is used as a directional weight indicating the weight of each input viewpoint for the combination of the input viewpoints selected in step S1305. In step S1311, the weight calculation unit 1105 calculates the update amount of the weight for each input viewpoint by multiplying the directional weight determined for each input viewpoint in step S1310 by the combined position weight calculated in step S1309.

ステップＳ１３１２において重み算出部１１０５は、ステップＳ１３１１で算出された更新量を、これまでの各入力視点についての累積重みに加算することで、各入力視点についての累積重みを更新する。なお、各入力視点についての累積重みは、再構成画像上の着目画素が変わる度に０に初期化されている。そして、これまでの各入力視点についての累積重みは、現在の視点組み合わせ情報とは異なる視点組み合わせ情報に基づいて算出された重みの更新量を累積することにより得られている。ステップＳ１３１３において、視点組み合わせ生成部１２０１は、有効視点が残っているか否かを判定する。有効視点が残っていなければ処理はステップＳ６５６に進み、残っていれば処理はステップＳ１３０５に戻り、別の有効視点の組み合わせについて処理が繰り返される。 In step S1312, the weight calculation unit 1105 updates the cumulative weight for each input viewpoint by adding the update amount calculated in step S1311 to the cumulative weight for each input viewpoint so far. The cumulative weight for each input viewpoint is initialized to 0 each time the pixel of interest on the reconstructed image changes. The cumulative weights for each input viewpoint so far are obtained by accumulating the update amount of the weight calculated based on the viewpoint combination information different from the current viewpoint combination information. In step S1313, the viewpoint combination generation unit 1201 determines whether or not an effective viewpoint remains. If no effective viewpoint remains, the process proceeds to step S656, and if it remains, the process returns to step S1305, and the process is repeated for another combination of effective viewpoints.

ステップＳ１３１４において画素値算出部１１０６は、各入力視点についての歪曲補正画像と、各入力視点についての累積重みと、に基づいて着目画素の画素値を決定する。具体的には、着目画素に対応する歪曲補正画像上の画素の画素値を、累積重みを用いて重み付け合成することにより、着目画素の画素値を決定することができる。ここで、各入力視点についての累積重みは、位置重みと方向重みとの双方を考慮した重みである。 In step S1314, the pixel value calculation unit 1106 determines the pixel value of the pixel of interest based on the distortion correction image for each input viewpoint and the cumulative weight for each input viewpoint. Specifically, the pixel value of the pixel of interest can be determined by weighting and synthesizing the pixel values of the pixels on the distortion-corrected image corresponding to the pixel of interest using cumulative weights. Here, the cumulative weight for each input viewpoint is a weight considering both the position weight and the direction weight.

図１４（図１４−１〜図１４−３）は、入力視点が５つある場合における、本実施形態における重み算出処理の流れを例示する。図１４（Ａ）は、各入力視点についての位置重みが、４回の更新を経てどのように変化するかを表している。また、図１４（Ｂ）は、有効視点が４回の更新を経てどのように変化するかを表している。図１４（Ｂ）では、１が有効、０が無効を表している。入力視点５からは着目画素に対応する被写体は可視でないため、初期状態において入力視点は無効であり、累積重みは０となる。図１４（Ｃ）は、各更新により得られる最小重み視点（位置重みが最小となる入力視点）、最小位置重み（最小重み視点の位置重み）、有効視点数、及び合成位置重みを表している。 FIG. 14 (FIGS. 14-1 to 14-3) illustrates the flow of the weight calculation process in the present embodiment when there are five input viewpoints. FIG. 14(A) shows how the position weight for each input viewpoint changes over four updates. FIG. 14(B) shows how the effective viewpoints change over four updates. In FIG. 14(B), 1 represents valid and 0 represents invalid. Since the subject corresponding to the pixel of interest is not visible from input viewpoint 5, input viewpoint 5 is invalid in the initial state and its cumulative weight is 0. FIG. 14(C) shows, for each update, the minimum weight viewpoint (the input viewpoint with the smallest position weight), the minimum position weight (the position weight of the minimum weight viewpoint), the number of effective viewpoints, and the combined position weight.

初期値においては、有効視点１，２，３，４の中では、入力視点２が最小重み視点であり、その位置重みは０．１である。したがって１回目の更新では入力視点１，２，３，４についての位置重みから最小位置重みの０．１が引かれ、視点２が無効化される。また、有効視点数は４であるので、合成位置重みは０．４となる。１回更新後においては、有効視点１，３，４の中では、入力視点３が最小重み視点であり、その位置重みは０．１である。したがって２回目の更新では、視点１，３，４についての位置重みから最小位置重みの０．１が引かれ、視点３が無効化される。また、有効視点数は３であるので、合成位置重みは０．３となる。２回更新後においては、有効視点１，４の中では、入力視点１が最小重み視点であり、その位置重みは０．１である。したがって３回目の更新では、視点１，４についての位置重みから最小位置重みの０．１が引かれ、入力視点１が無効化される。また有効視点数は２であるので、合成位置重みは０．２となる。３回更新後においては、入力視点４が唯一有効であるから、入力視点４が最小重み視点であり、その位置重みは０．１である。したがって４回目の更新では、入力視点４が無効化され、繰り返しが終了する。また、有効視点数は１であるので、合成位置重みは０．１となる。 In the initial state, among effective viewpoints 1, 2, 3, and 4, input viewpoint 2 is the minimum weight viewpoint, and its position weight is 0.1. Therefore, in the first update, the minimum position weight of 0.1 is subtracted from the position weights of input viewpoints 1, 2, 3, and 4, and viewpoint 2 is invalidated. Since the number of effective viewpoints is 4, the combined position weight is 0.4. After the first update, among effective viewpoints 1, 3, and 4, input viewpoint 3 is the minimum weight viewpoint, and its position weight is 0.1. Therefore, in the second update, the minimum position weight of 0.1 is subtracted from the position weights of viewpoints 1, 3, and 4, and viewpoint 3 is invalidated. Since the number of effective viewpoints is 3, the combined position weight is 0.3. After the second update, among effective viewpoints 1 and 4, input viewpoint 1 is the minimum weight viewpoint, and its position weight is 0.1. Therefore, in the third update, the minimum position weight of 0.1 is subtracted from the position weights of viewpoints 1 and 4, and input viewpoint 1 is invalidated. Since the number of effective viewpoints is 2, the combined position weight is 0.2. After the third update, input viewpoint 4 is the only effective viewpoint, so it is the minimum weight viewpoint, and its position weight is 0.1. Therefore, in the fourth update, input viewpoint 4 is invalidated and the iteration ends. Since the number of effective viewpoints is 1, the combined position weight is 0.1.
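The update sequence walked through above can be sketched as a short loop. This is only an illustrative sketch, not the embodiment's implementation: the function name `peel_position_weights` is hypothetical, the initial position weights 0.3, 0.1, 0.2, and 0.4 for input viewpoints 1 to 4 are inferred from the walkthrough (every minimum position weight is 0.1) rather than read from FIG. 14(A), and the position weight of the invisible viewpoint 5 is a placeholder. `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def peel_position_weights(position_weights, visible):
    """Repeatedly take the smallest remaining position weight.

    Returns one (combination, min_weight_viewpoint, combined_position_weight)
    tuple per update, where the combined position weight is the minimum
    position weight multiplied by the number of effective viewpoints.
    """
    # Only viewpoints that see the subject start out as effective viewpoints.
    remaining = {v: Fraction(w) for v, w in position_weights.items() if visible[v]}
    steps = []
    while remaining:
        # Ties are broken by viewpoint number (an assumption of this sketch).
        min_view = min(remaining, key=lambda v: (remaining[v], v))
        min_weight = remaining[min_view]
        combination = tuple(sorted(remaining))
        steps.append((combination, min_view, min_weight * len(remaining)))
        # Subtract the minimum from every effective viewpoint, then
        # invalidate the minimum weight viewpoint.
        for v in remaining:
            remaining[v] -= min_weight
        del remaining[min_view]
    return steps


# Inferred initial weights for viewpoints 1 to 4; viewpoint 5 is a placeholder.
weights = {1: "0.3", 2: "0.1", 3: "0.2", 4: "0.4", 5: "0"}
visible = {1: True, 2: True, 3: True, 4: True, 5: False}
steps = peel_position_weights(weights, visible)
for combination, min_view, combined in steps:
    print(combination, min_view, float(combined))
```

With these assumed inputs the loop runs four times, and the combined position weights of all updates sum to 1, so no further normalization is needed at this stage.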

図１４（Ｄ）は、一例における、出力視点からの光線と、各入力視点からの光線とがなす角を表す。ここでは、出力視点からの光線は、入力視点２と入力視点３の光線の間に位置している。方向重みは、出力視点と出力視点から左回り方向に最近傍の入力視点とのなす角、及び出力視点と出力視点から右回り方向に最近傍の入力視点とのなす角に基づいて、式（７）にしたがって算出できる。ここで、出力視点からの光線としては、出力視点から、再構成画像上の着目画素に対応する被写体への光線を算出して用いることができる。また、入力視点からの光線としては、入力視点から、再構成画像上の着目画素に対応する被写体への光線を算出して用いることができる。なお、入力視点からの光線は、実施形態１で説明した入力視点から着目画素に写っている被写体への方向に、出力視点からの光線は、実施形態１で説明した出力視点からの着目画素に対応する方向に、それぞれ対応する。したがって、入力視点及び出力視点からの光線としては、実施形態１と同様のものを用いることができる。例えば、画素毎に光線を算出する代わりに、各視点の光軸ベクトル、又は各視点位置から基準点へのベクトルを、光線として用いることができる。さらに、光線がなす角は、光線を基準面に射影して得られる２次元ベクトルがなす角を用いてもよいし、３次元空間中において光線がなす角を用いてもよい。 FIG. 14(D) shows, for one example, the angles formed by the ray from the output viewpoint and the ray from each input viewpoint. Here, the ray from the output viewpoint lies between the rays of input viewpoint 2 and input viewpoint 3. The directional weight can be calculated according to equation (7), based on the angle between the output viewpoint and the nearest input viewpoint in the counterclockwise direction from the output viewpoint, and the angle between the output viewpoint and the nearest input viewpoint in the clockwise direction from the output viewpoint. As the ray from the output viewpoint, the ray from the output viewpoint to the subject corresponding to the pixel of interest on the reconstructed image can be calculated and used. Likewise, as the ray from an input viewpoint, the ray from that input viewpoint to the subject corresponding to the pixel of interest on the reconstructed image can be calculated and used. The ray from an input viewpoint corresponds to the direction, described in the first embodiment, from the input viewpoint to the subject appearing at the pixel of interest, and the ray from the output viewpoint corresponds to the direction, described in the first embodiment, from the output viewpoint toward the pixel of interest. Therefore, the same rays as in the first embodiment can be used as the rays from the input viewpoints and the output viewpoint. For example, instead of calculating a ray for each pixel, the optical axis vector of each viewpoint, or the vector from each viewpoint position to a reference point, can be used as the ray. Further, the angle formed by the rays may be the angle formed by the two-dimensional vectors obtained by projecting the rays onto the reference plane, or the angle formed by the rays in three-dimensional space.

図１４（Ｅ）は、図１４（Ｄ）に示す光線がなす角に基づいて算出された、方向重みの例を示す。ステップＳ１３１０の通り、「１回目更新」の欄は、入力視点１〜４の組み合わせに対する各入力視点の方向重みを表す。図１４（Ｅ）の例では、入力視点１〜４の組み合わせに対応する画素値成分は、入力視点２からの撮像画像と、入力視点３からの撮像画像とに基づいて設定される。本実施形態では、２以上の入力視点の組み合わせに対応する画素値成分は、２つの入力視点からの撮像画像から得られた画素値の組み合わせで表されるため、上述のように２つの入力視点について重みが設定されている。同様に、「２回目更新」「３回目更新」及び「４回目更新」の欄は、入力視点１，３，４の組み合わせ、入力視点１，４の組み合わせ、及び入力視点４からなる組み合わせについての方向重みを表す。 FIG. 14(E) shows an example of directional weights calculated based on the angles formed by the rays shown in FIG. 14(D). As described for step S1310, the "first update" column represents the directional weight of each input viewpoint for the combination of input viewpoints 1 to 4. In the example of FIG. 14(E), the pixel value component corresponding to the combination of input viewpoints 1 to 4 is set based on the captured image from input viewpoint 2 and the captured image from input viewpoint 3. In the present embodiment, the pixel value component corresponding to a combination of two or more input viewpoints is represented by a combination of pixel values obtained from the captured images of two input viewpoints, which is why weights are set for two input viewpoints as described above. Similarly, the "second update", "third update", and "fourth update" columns represent the directional weights for the combination of input viewpoints 1, 3, and 4, the combination of input viewpoints 1 and 4, and the combination consisting of input viewpoint 4 alone, respectively.

図１４（Ｆ）は、図１４（Ｃ）に示す合成位置重みに、図１４（Ｅ）に示す方向重みを乗じて得られる値を示し、これは重みの更新量として用いられる。図１４の例において、着目画素の画素値成分は、以下の画素値成分の組み合わせで表される。すなわち、入力視点１〜４の組み合わせに対応する画素値成分と、入力視点１，３，４の組み合わせに対応する画素値成分と、入力視点１，４の組み合わせに対応する画素値成分と、入力視点４からなる組み合わせに対応する画素値成分と、の組み合わせである。図１４（Ｆ）に示される重みの更新量は、入力視点の組み合わせの１つに対応する画素値成分を算出する際に用いられる、各入力視点の重みに相当する。そして、重みの更新量を更新毎に加算して累積していくと、図１４（Ｇ）に示す最終的な入力視点毎の重みが得られる。この重みで、各入力視点からの撮像画像において着目画素に対応する画素の画素値を重み付き平均することにより、着目画素の画素値が得られる。各入力視点からの撮像画像における着目画素に対応する画素の画素値の算出は、実施形態１，２と同様に行うことができる。 FIG. 14(F) shows the values obtained by multiplying the combined position weights shown in FIG. 14(C) by the directional weights shown in FIG. 14(E); these values are used as the weight update amounts. In the example of FIG. 14, the pixel value of the pixel of interest is represented by a combination of the following pixel value components: the component corresponding to the combination of input viewpoints 1 to 4, the component corresponding to the combination of input viewpoints 1, 3, and 4, the component corresponding to the combination of input viewpoints 1 and 4, and the component corresponding to the combination consisting of input viewpoint 4 alone. The weight update amounts shown in FIG. 14(F) correspond to the weights of the individual input viewpoints used when calculating the pixel value component for one of these combinations. Adding up the weight update amounts over all updates yields the final weight for each input viewpoint shown in FIG. 14(G). The pixel value of the pixel of interest is obtained by taking the weighted average, with these weights, of the pixel values of the pixels corresponding to the pixel of interest in the captured images from the input viewpoints. The pixel value of the pixel corresponding to the pixel of interest in the captured image from each input viewpoint can be calculated in the same manner as in the first and second embodiments.
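The accumulation described above can be illustrated as follows. This is a sketch under stated assumptions: equation (7) is not reproduced in this passage, so `direction_weights` substitutes a simple linear interpolation between the nearest input viewpoint on each side of the output ray, and the signed angles in `angles` are hypothetical values chosen only so that the output ray lies between input viewpoints 2 and 3. The combined position weights are the values 0.4, 0.3, 0.2, and 0.1 stated in the text. All function and variable names are illustrative.

```python
from fractions import Fraction


def direction_weights(combination, angles):
    """Assumed linear weights for the nearest viewpoint on each side.

    angles[v] is the signed angle (integer degrees) of viewpoint v's ray
    relative to the output ray: negative on one side, positive on the other.
    """
    left = [v for v in combination if angles[v] < 0]
    right = [v for v in combination if angles[v] >= 0]
    if not left or not right:
        # Only one side is populated: the nearest viewpoint takes all weight.
        nearest = min(combination, key=lambda v: abs(angles[v]))
        return {nearest: Fraction(1)}
    l = max(left, key=lambda v: angles[v])   # closest ray on the negative side
    r = min(right, key=lambda v: angles[v])  # closest ray on the positive side
    a_l, a_r = -angles[l], angles[r]
    # The closer ray receives the larger share.
    return {l: Fraction(a_r, a_l + a_r), r: Fraction(a_l, a_l + a_r)}


def accumulate_weights(steps, angles, viewpoints):
    """Sum (combined position weight x directional weight) over all updates."""
    total = {v: Fraction(0) for v in viewpoints}
    for combination, combined in steps:
        for v, w in direction_weights(combination, angles).items():
            total[v] += combined * w
    return total


def blend_pixel(total, pixel_values):
    """Weighted average of the per-viewpoint pixel values."""
    norm = sum(total.values())
    return sum(total[v] * pixel_values[v] for v in total) / norm


# Combinations and combined position weights as stated in the text.
steps = [((1, 2, 3, 4), Fraction(4, 10)), ((1, 3, 4), Fraction(3, 10)),
         ((1, 4), Fraction(2, 10)), ((4,), Fraction(1, 10))]
angles = {1: -50, 2: -10, 3: 20, 4: 60}  # hypothetical angles in degrees
total = accumulate_weights(steps, angles, viewpoints=[1, 2, 3, 4, 5])
pixel = blend_pixel(total, {1: 100, 2: 100, 3: 100, 4: 100, 5: 0})
```

Because the directional weights of each combination sum to 1 and the combined position weights sum to 1, the accumulated weights also sum to 1, and input viewpoint 5, which belongs to no combination, keeps a cumulative weight of 0.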

ここで示した合成位置重みの算出方法は一例であり、合成位置重みは他の方法で算出することもできる。例えば、入力視点ごとに算出した位置重みを用いる代わりに、入力視点の組み合わせに応じて分割した領域のそれぞれについて、端部のブレンド幅を適宜設定し、この領域の端部からの距離に応じて設定された位置重みを合成位置重みとして用いることができる。また、有効視点を全て選択する代わりに、出力視点光線と入力視点光線とのなす角が小さい２つの入力視点を選択し、選択された入力視点を含むような入力視点の組み合わせ全てに対して重みを減じる処理を繰り返すこともできる。また、方向重みの算出方法も一例であり、方向重みは他の方法で算出することもできる。例えば、３以上の入力視点について方向重みを設定してもよいし、非線形な重みを設定することもできる。また、最近傍の入力視点について算出した重みと、被写体が可視である視点の間で平均的に設定された重みと、を基準面と光線とがなす角に基づきブレンドすることもできる。このような方法によれば、光線方向が基準面に対して垂直となる位置の近傍で急激に重みが変化する現象を抑制することもできる。 The method for calculating the combined position weight shown here is an example, and the combined position weight can be calculated by another method. For example, instead of using the position weight calculated for each input viewpoint, the blend width of the end portion is appropriately set for each of the regions divided according to the combination of the input viewpoints, and the distance from the end portion of this region is set. The set position weight can be used as the composite position weight. Also, instead of selecting all the effective viewpoints, two input viewpoints with a small angle between the output viewpoint ray and the input viewpoint ray are selected, and weights are given to all combinations of input viewpoints including the selected input viewpoint. It is also possible to repeat the process of reducing. Further, the method of calculating the directional weight is also an example, and the directional weight can be calculated by another method. For example, directional weights may be set for three or more input viewpoints, or non-linear weights may be set. It is also possible to blend the weight calculated for the nearest input viewpoint and the weight set on average between the viewpoints where the subject is visible, based on the angle formed by the reference plane and the light beam. 
According to such a method, it is possible to suppress the phenomenon that the weight changes suddenly in the vicinity of the position where the light ray direction is perpendicular to the reference plane.

［実施形態４］ [Embodiment 4]
実施形態３までは、概略円環状に配置された入力視点からの撮像画像を用いる例を示した。実施形態４では、主たる撮像画像の他に、背景撮像画像又は環境情報などを用いて再構成画像を生成する例を示す。 The embodiments up to the third use captured images from input viewpoints arranged in a substantially annular shape. The fourth embodiment shows an example in which a reconstructed image is generated using, in addition to the main captured images, background captured images, environment information, or the like.

実施形態４に係るレンダリング部３５０の構成例を図１５に示す。本実施形態における主レンダリング部１５０１及び背景レンダリング部１５０２は、実施形態１〜３におけるレンダリング部３５０と同様の構成及び処理を行い、それぞれ主レンダリング画像及び背景レンダリング画像を生成する。例えば、主レンダリング部１５０１は動体である被写体の画像を主レンダリング画像として生成することができ、背景レンダリング部１５０２は背景である被写体の画像を背景レンダリング画像として生成することができる。このような処理は、例えば実施形態２の方法を応用することにより行うことができる。本実施形態において、主レンダリング部１５０１及び背景レンダリング部１５０２は、それぞれ概略円環状に配置されている、異なるグループの撮像装置により得られた撮像画像を用いてレンダリングを行う。すなわち、主レンダリング部１５０１は、主撮像部グループにより得られた主撮像画像群に基づいて色情報を決定する。また、背景レンダリング部１５０２は、背景撮像部グループにより得られた背景撮像画像群に基づいて色情報を決定する。 FIG. 15 shows a configuration example of the rendering unit 350 according to the fourth embodiment. The main rendering unit 1501 and the background rendering unit 1502 in the present embodiment perform the same configuration and processing as the rendering unit 350 in the first to third embodiments, and generate a main rendering image and a background rendering image, respectively. For example, the main rendering unit 1501 can generate an image of a moving subject as a main rendering image, and the background rendering unit 1502 can generate an image of a subject as a background as a background rendering image. Such processing can be performed, for example, by applying the method of the second embodiment. In the present embodiment, the main rendering unit 1501 and the background rendering unit 1502 perform rendering using captured images obtained by different groups of image pickup devices, which are arranged in a substantially annular shape, respectively. That is, the main rendering unit 1501 determines the color information based on the main captured image group obtained by the main imaging unit group. Further, the background rendering unit 1502 determines the color information based on the background captured image group obtained by the background imaging unit group.

また、主レンダリング部１５０１及び背景レンダリング部１５０２は、レンダリング画像の他に、レンダリング画像の各画素についての位置重みを示す主重みマップ及び背景重みマップを生成する。この重みマップは、レンダリング画像の各画素に対して、各画素に対応する全入力視点の位置重みのうちの最大の位置重みを格納している。ブレンディング部１５０３は、主レンダリング画像と背景レンダリング画像とを重みマップに基づきブレンド（合成）することにより、オブジェクトの色情報を決定し、こうしてブレンド画像を出力する。ブレンド方法の一例としては、主レンダリング画像には位置重みを乗じ、背景レンダリング画像には（１−位置重み）を乗じて、加算平均する方法が挙げられる。ここで用いられる位置重みとしては、正規化された、主重みマップに示される位置重みを用いることができる。また、重み算出部１５０４は、主重みマップと背景重みマップとから新たな重みマップを生成する。重み算出部は、主重みマップに示される重みと背景重みマップに示される重みとの最大値を、新たな重みマップに示される重みとして算出することができる。 Further, the main rendering unit 1501 and the background rendering unit 1502 generate a main weight map and a background weight map showing the position weights for each pixel of the rendered image in addition to the rendered image. This weight map stores the maximum position weight among the position weights of all the input viewpoints corresponding to each pixel for each pixel of the rendered image. The blending unit 1503 determines the color information of the object by blending (combining) the main rendered image and the background rendered image based on the weight map, and outputs the blended image in this way. As an example of the blending method, there is a method of multiplying the main rendered image by a position weight and multiplying the background rendered image by (1-position weight) and averaging them. As the position weight used here, the normalized position weight shown in the main weight map can be used. Further, the weight calculation unit 1504 generates a new weight map from the main weight map and the background weight map. The weight calculation unit can calculate the maximum value of the weight shown in the main weight map and the weight shown in the background weight map as the weight shown in the new weight map.

環境レンダリング部１５０５は、環境レンダリング画像を生成する。環境レンダリング部１５０５は、例えば、各光線方向に対して画素値が定義された環境マップ又は単色を示す色情報のような環境情報を用いてレンダリングを行うことにより、環境レンダリング画像を生成することができる。ブレンディング部１５０６は、ブレンディング部１５０３が生成したブレンド画像と、環境レンダリング画像とをブレンディングすることにより、最終的な出力画像を生成することができる。ここで、ブレンディング部１５０６は、重み算出部１５０４が生成した重みマップに基づいて、ブレンディング部１５０３と同様にブレンディングを行うことができる。 The environment rendering unit 1505 generates an environment-rendered image. For example, the environment rendering unit 1505 can generate the environment-rendered image by rendering with environment information such as an environment map in which a pixel value is defined for each ray direction, or color information indicating a single color. The blending unit 1506 can generate the final output image by blending the blended image generated by the blending unit 1503 with the environment-rendered image. Here, the blending unit 1506 can perform blending in the same manner as the blending unit 1503, based on the weight map generated by the weight calculation unit 1504.
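The two blending stages and the weight-map merge can be sketched as below. This is a minimal illustration that assumes single-channel images stored as nested lists and weight maps already normalized to the range 0 to 1; the function names are hypothetical and only mirror the roles of units 1503, 1504, and 1506.

```python
def blend(front, back, weight):
    """Per-pixel blend: weight * front + (1 - weight) * back."""
    return [[w * f + (1.0 - w) * b for f, b, w in zip(f_row, b_row, w_row)]
            for f_row, b_row, w_row in zip(front, back, weight)]


def merge_weight_maps(main_weight, background_weight):
    """Per-pixel maximum of the main and background weight maps."""
    return [[max(m, b) for m, b in zip(m_row, b_row)]
            for m_row, b_row in zip(main_weight, background_weight)]


def compose(main_img, main_weight, bg_img, bg_weight, env_img):
    # Role of blending unit 1503: main over background, using the main weight map.
    blended = blend(main_img, bg_img, main_weight)
    # Role of weight calculation unit 1504: new weight map from both maps.
    merged = merge_weight_maps(main_weight, bg_weight)
    # Role of blending unit 1506: blended image over the environment-rendered image.
    return blend(blended, env_img, merged)


# One-row example: the first pixel is fully covered by the main image,
# the second only half covered by the background image.
result = compose(
    main_img=[[100.0, 100.0]], main_weight=[[1.0, 0.0]],
    bg_img=[[50.0, 50.0]], bg_weight=[[1.0, 0.5]],
    env_img=[[0.0, 0.0]],
)
```

In this example the second pixel has no main-image support, so it takes the background value and is then mixed half and half with the environment value.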

実施形態４におけるレンダリング部３５０が行う処理の流れを図１６に示す。ステップＳ１６０１において、主レンダリング部１５０１は主撮像画像に基づき主レンダリング画像を生成する。ステップＳ１６０２において、背景レンダリング部１５０２は背景撮像画像に基づいて背景レンダリング画像を生成する。ステップＳ１６０３において、環境レンダリング部１５０５は環境情報に基づいて環境レンダリング画像を生成する。ステップＳ１６０４においてブレンディング部１５０３は主レンダリング画像と背景レンダリング画像とをブレンディングする。ステップＳ１６０５において重み算出部１５０４は主重みマップと背景重みマップとを合成する。ステップＳ１６０６では、ステップＳ１６０５で得られた重みマップに基づいて、ブレンディング部１５０６が、ステップＳ１６０４で生成した画像と環境レンダリング画像とをブレンディングする。 FIG. 16 shows the flow of processing performed by the rendering unit 350 in the fourth embodiment. In step S1601, the main rendering unit 1501 generates a main rendered image based on the main captured images. In step S1602, the background rendering unit 1502 generates a background rendered image based on the background captured images. In step S1603, the environment rendering unit 1505 generates an environment-rendered image based on the environment information. In step S1604, the blending unit 1503 blends the main rendered image and the background rendered image. In step S1605, the weight calculation unit 1504 combines the main weight map and the background weight map. In step S1606, the blending unit 1506 blends the image generated in step S1604 with the environment-rendered image, based on the weight map obtained in step S1605.

本実施形態では、主撮像画像、背景撮像画像、及び環境情報を用いて画像を生成する例を示したが、用いる画像群はこれより多くても少なくても構わない。また、ここに示したブレンディング方法は一例に過ぎず、他の方法を用いることもできる。例えば、基準面と光線のなす角、又は出力視点の位置に基づき、画像のブレンディング比率を変えることもできる。 In the present embodiment, an example of generating an image using a main captured image, a background captured image, and environmental information is shown, but the number of image groups used may be larger or smaller than this. Further, the blending method shown here is only an example, and other methods can be used. For example, the blending ratio of the image can be changed based on the angle between the reference plane and the light beam or the position of the output viewpoint.

（その他の実施例） (Other Embodiments)
本発明は、上述の実施形態の１以上の機能を実現するプログラムを、ネットワーク又は記憶媒体を介してシステム又は装置に供給し、そのシステム又は装置のコンピュータにおける１つ以上のプロセッサーがプログラムを読出し実行する処理でも実現可能である。また、１以上の機能を実現する回路（例えば、ＡＳＩＣ）によっても実現可能である。 The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

３１０：入力視点情報取得部、３２０：出力視点情報取得部、３３０：距離マップ取得部、３４０：画像取得部、３５０：レンダリング部、３６０：画像出力部、４１０：歪曲補正部、４２０：方向重み算出部、４３０：位置重み算出部 310: Input viewpoint information acquisition unit, 320: Output viewpoint information acquisition unit, 330: Distance map acquisition unit, 340: Image acquisition unit, 350: Rendering unit, 360: Image output unit, 410: Distortion correction unit, 420: Direction weight Calculation unit, 430: Position weight calculation unit

Claims

仮想視点の位置の情報と前記仮想視点からの視線方向の情報を含む仮想視点情報を取得する取得手段と、
前記取得手段により取得された仮想視点情報により特定される仮想視点に対応する仮想視点画像の着目画素の色情報を、複数の撮像装置で撮像されることにより取得された複数の撮像画像と、前記複数の撮像画像それぞれにおける端部から前記着目画素に対応する画素までの距離に応じた重みとに基づいて決定する決定手段と、
前記決定手段により決定された色情報に基づいて、前記仮想視点画像を生成する生成手段と、
を有し、
前記重みは、
前記撮像画像における端部から前記着目画素に対応する画素までの距離が所定の閾値を超えている場合、一定の重みであり、
前記撮像画像における端部から前記着目画素に対応する画素までの距離が所定の閾値以下である場合、前記一定の重みより小さい重みである
ことを特徴とする画像処理装置。 An image processing apparatus comprising:
an acquisition means for acquiring virtual viewpoint information including information on the position of a virtual viewpoint and information on the line-of-sight direction from the virtual viewpoint;
a determination means for determining color information of a pixel of interest of a virtual viewpoint image corresponding to the virtual viewpoint identified by the virtual viewpoint information acquired by the acquisition means, based on a plurality of captured images acquired by imaging with a plurality of imaging devices, and on a weight according to the distance from the end portion of each of the plurality of captured images to the pixel corresponding to the pixel of interest; and
a generation means for generating the virtual viewpoint image based on the color information determined by the determination means,
wherein the weight is
a constant weight when the distance from the end portion of the captured image to the pixel corresponding to the pixel of interest exceeds a predetermined threshold value, and
a weight smaller than the constant weight when the distance from the end portion of the captured image to the pixel corresponding to the pixel of interest is equal to or less than the predetermined threshold value.

前記決定手段は、前記仮想視点画像の前記着目画素の色情報を、前記複数の撮像画像それぞれにおける前記着目画素に対応する画素の色情報と、前記着目画素に対応する画素に対して前記距離に応じて設定された重みと、に基づいて決定することを特徴とする請求項１に記載の画像処理装置。 The determining means transfers the color information of the pixel of interest of the virtual viewpoint image to the color information of the pixel corresponding to the pixel of interest in each of the plurality of captured images and the distance to the pixel corresponding to the pixel of interest. The image processing apparatus according to claim 1, wherein the image processing apparatus is determined based on the weights set accordingly.

前記重みは、前記撮像画像における端部から前記着目画素に対応する画素までの距離が第１の距離である場合の方が、前記撮像画像における端部から前記着目画素に対応する画素までの距離が前記第１の距離よりも大きい第２の距離である場合よりも小さいことを特徴とする、請求項１又は２に記載の画像処理装置。 The image processing apparatus according to claim 1 or 2, wherein the weight is smaller when the distance from the end portion of the captured image to the pixel corresponding to the pixel of interest is a first distance than when that distance is a second distance larger than the first distance.

前記重みは、前記撮像画像における端部から前記着目画素に対応する画素までの距離が所定の閾値を超えている場合は、前記撮像画像における端部から前記着目画素に対応する画素までの距離が所定の閾値以下である場合よりも大きいことを特徴とする、請求項１乃至３の何れか１項に記載の画像処理装置。 The image processing apparatus according to any one of claims 1 to 3, wherein the weight is larger when the distance from the end portion of the captured image to the pixel corresponding to the pixel of interest exceeds a predetermined threshold value than when that distance is equal to or less than the predetermined threshold value.

前記重みは、さらに、前記仮想視点からの視線方向と、前記複数の撮像装置の視線方向とに基づくことを特徴とする請求項１乃至４の何れか１項に記載の画像処理装置。 The image processing apparatus according to any one of claims 1 to 4, wherein the weight is further based on the line-of-sight direction from the virtual viewpoint and the line-of-sight directions of the plurality of imaging devices.

前記重みは、さらに、前記仮想視点の位置と、前記複数の撮像装置の位置とに基づくことを特徴とする請求項１乃至５の何れか１項に記載の画像処理装置。 The image processing apparatus according to any one of claims 1 to 5, wherein the weight is further based on the position of the virtual viewpoint and the positions of the plurality of imaging devices.

前記取得手段は、前記複数の撮像装置の位置の情報と、前記複数の撮像装置の視線方向の情報とを含む撮像装置情報をさらに取得し、
前記決定手段は、前記取得手段により取得された前記仮想視点情報と前記撮像装置情報とに基づいて、仮想視点画像の色情報を決定するために用いる２以上の撮像装置を特定し、特定された前記２以上の撮像装置で撮像されることにより取得された２以上の撮像画像と、前記２以上の撮像画像それぞれにおける端部から前記着目画素に対応する画素までの距離に応じた重みとに基づいて、前記仮想視点画像の色情報を決定することを特徴とする請求項１乃至６の何れか１項に記載の画像処理装置。 The acquisition means further acquires image pickup device information including information on the positions of the plurality of image pickup devices and information on the line-of-sight direction of the plurality of image pickup devices.
The determination means identifies, based on the virtual viewpoint information and the imaging device information acquired by the acquisition means, two or more imaging devices to be used for determining the color information of the virtual viewpoint image, and determines the color information of the virtual viewpoint image based on two or more captured images acquired by imaging with the identified two or more imaging devices, and on a weight according to the distance from the end portion of each of the two or more captured images to the pixel corresponding to the pixel of interest. The image processing apparatus according to any one of claims 1 to 6.

前記生成手段は、さらにオブジェクトの３次元モデルに基づいて、前記仮想視点画像を生成することを特徴とする請求項１乃至７の何れか１項に記載の画像処理装置。 The image processing apparatus according to any one of claims 1 to 7, wherein the generation means generates the virtual viewpoint image further based on a three-dimensional model of an object.

前記決定手段は、前記仮想視点画像におけるオブジェクトの色情報を決定することを特徴とする請求項１乃至８の何れか１項に記載の画像処理装置。 The image processing apparatus according to any one of claims 1 to 8, wherein the determination means determines color information of an object in the virtual viewpoint image.

前記オブジェクトは、動体であることを特徴とする請求項８又は９に記載の画像処理装置。 The image processing apparatus according to claim 8 or 9, wherein the object is a moving object.

前記オブジェクトは、時間とともに位置及び形状が変化しない物体であることを特徴とする請求項８又は９に記載の画像処理装置。 The image processing apparatus according to claim 8 or 9, wherein the object is an object whose position and shape do not change with time.

画像処理装置が行う画像処理方法であって、
仮想視点の位置の情報と前記仮想視点からの視線方向の情報を含む仮想視点情報を取得する工程と、
取得された前記仮想視点情報により特定される仮想視点に対応する仮想視点画像の着目画素の色情報を、複数の撮像装置で撮像されることにより取得された複数の撮像画像と、前記複数の撮像画像それぞれにおける端部から前記着目画素に対応する画素までの距離に応じた重みとに基づいて決定する工程と、
決定された前記色情報に基づいて、前記仮想視点画像を生成する工程と、
を有し、
前記重みは、
前記撮像画像における端部から前記着目画素に対応する画素までの距離が所定の閾値を超えている場合、一定の重みであり、
前記撮像画像における端部から前記着目画素に対応する画素までの距離が所定の閾値以下である場合、前記一定の重みより小さい重みである
ことを特徴とする画像処理方法。 An image processing method performed by an image processing apparatus, the method comprising:
a step of acquiring virtual viewpoint information including information on the position of a virtual viewpoint and information on the line-of-sight direction from the virtual viewpoint;
a step of determining color information of a pixel of interest of a virtual viewpoint image corresponding to the virtual viewpoint identified by the acquired virtual viewpoint information, based on a plurality of captured images acquired by imaging with a plurality of imaging devices, and on a weight according to the distance from the end portion of each of the plurality of captured images to the pixel corresponding to the pixel of interest; and
a step of generating the virtual viewpoint image based on the determined color information,
wherein the weight is
a constant weight when the distance from the end portion of the captured image to the pixel corresponding to the pixel of interest exceeds a predetermined threshold value, and
a weight smaller than the constant weight when the distance from the end portion of the captured image to the pixel corresponding to the pixel of interest is equal to or less than the predetermined threshold value.

コンピュータを、請求項１乃至１１の何れか１項に記載の画像処理装置として機能させるためのプログラム。 A program for causing a computer to function as the image processing device according to any one of claims 1 to 11.