JP7393931B2

JP7393931B2 - Image encoding device and its program, and image decoding device and its program

Info

Publication number: JP7393931B2
Application number: JP2019223446A
Authority: JP
Inventors: 美和片山; 真宏河北
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2019-12-11
Filing date: 2019-12-11
Publication date: 2023-12-07
Anticipated expiration: 2039-12-11
Also published as: JP2021093641A

Description

The present invention relates to an image encoding device that encodes an image and a program therefor, and to an image decoding device that decodes the encoded image and a program therefor.

Currently, for the coded transmission of multi-view images, methods are being studied in which the multi-view images are transmitted together with distance images (depth maps) indicating depth information from the shooting position (viewpoint position). International standards define coding schemes such as 3D-AVC and 3D-HEVC, which encode multi-view images and distance images and generate images of arbitrary viewpoints on the decoding side (see Non-Patent Documents 1 and 2).
In recent years, to improve the coding efficiency of integral 3D video, a method has been proposed in which an elemental image group rich in high-frequency components is converted into multi-view images, which are then encoded (Non-Patent Document 3). The elemental image group and the multi-view images are mutually convertible, and the distance images corresponding to the individual viewpoint images of the multi-view images are likewise mutually convertible with the distance images corresponding to the individual elemental images of the elemental image group (see Patent Document 1).

Patent Document 1: Japanese Patent Application Publication No. 2016-158213

Non-Patent Document 1: Shimizu, "Trends in international standardization of 3D video coding using depth maps," Information Processing Society of Japan Research Report, Vol. 2013-AVM-82, No. 11, pp. 1-6 (2013).
Non-Patent Document 2: Seno, Yamamoto, Oi, Kurita, "Standardization activities for MPEG multi-view video coding," National Institute of Information and Communications Technology Quarterly Report, Vol. 56, Nos. 1/2, pp. 79-90 (2010).
Non-Patent Document 3: Hara, Arai, Kawakita, Mishina, "Element image size conversion method for improving integral 3D encoding efficiency," Information Processing Society of Japan Research Report, Vol. 2016-AVM-93, No. 2, pp. 1-4 (2016).

Conventional devices for encoding images accompanied by distance images do not take measures such as changing encoding parameters according to distance information; they encode the image without distinguishing the image region of the subject from the image region of the background.
Consequently, after transmission, the subject region and the background region are conventionally reproduced with equal image quality. There is, however, a demand to reproduce the subject region at higher definition than the background region.

In addition, in a three-dimensional display device using the integral method (IP stereoscopic display device), the display resolution varies with the distance from the display. For example, as shown by the spatial frequency characteristics in FIG. 15, in an IP stereoscopic display device the spatial frequency (viewing spatial frequency β) [cycle/rad = cpr] is highest (maximum spatial frequency β_n) when the depth position z of the stereoscopic image is near the lens array, and falls as the image moves forward or backward away from the lens array (see the reference below). Note that in FIG. 15 the viewing distance L from the lens array to the observer M indicates the distance from the lens array to the nearest position at which a stereoscopic image is displayed.
In such an IP stereoscopic display device, the stereoscopic image is displayed at high resolution near the lens array; away from the lens array the resolution drops and the stereoscopic image becomes blurred.
(Reference) H. Hoshino, F. Okuno, H. Isono and I. Yuyama, "Analysis of resolution limitation of integral photography," Optical Society of America A, Vol. 15, No. 8, pp. 2059-2065 (1998).
Therefore, if a conventional coding scheme is applied as is to the photographed images (elemental image group), any degradation caused by coding compression affects the entire reproduced image, and the degradation of the stereoscopic image near the lens array, where the resolution is high, becomes conspicuous.
Thus, when encoding multi-view images or an elemental image group for displaying a stereoscopic image, it is desirable to be able to change the code amount for a desired distance.

The present invention has been made in view of these problems and demands, and its object is to provide an image encoding device and program, and an image decoding device and program, that can encode and decode an image while varying the code amount for each depth section.

To solve the above problem, an image encoding device according to the present invention is an image encoding device that encodes a photographed image of a subject and a distance image indicating depth information of the subject, and comprises depth-based photographed image generation means, depth-based photographed image encoding means, distance image encoding means, and stream combining means.

In this configuration, the image encoding device uses the depth-based photographed image generation means to extract from the photographed image, for each depth section delimited by preset thresholds, the image corresponding to that depth section as identified by the distance image. The depth-based photographed image generation means thereby generates a plurality of depth-based photographed images, one for each depth section.
The image encoding device then uses the depth-based photographed image encoding means to encode the plurality of depth-based photographed images on the basis of a parameter that controls the code amount, such as a quantization parameter, preset for each depth section, and generates a plurality of depth-based photographed image encoded streams. This makes it possible to control the code amount of the photographed image for each depth section.
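The effect of assigning a different quantization parameter to each depth section can be illustrated with a toy uniform quantizer. The following is only a sketch and not an actual codec: the step-size relation (the step doubles for every increase of 6 in QP) is the approximation commonly cited for H.264/HEVC, and the QP values, section labels, and sample values are hypothetical.

```python
def qstep(qp):
    """Approximate quantization step for a given QP (doubles every 6 QP)."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(values, qp):
    """Toy uniform quantization followed by reconstruction."""
    step = qstep(qp)
    return [round(v / step) * step for v in values]


def max_error(values, qp):
    """Largest absolute reconstruction error for the given QP."""
    return max(abs(v - q) for v, q in zip(values, quantize(values, qp)))


# Hypothetical per-section QPs: a section to be reproduced finely gets a small QP.
section_qp = {"fine section": 22, "coarse section": 40}
samples = [3.0, 17.0, 42.0, 90.0, 200.0]  # hypothetical sample values

for name, qp in section_qp.items():
    print(name, "QP", qp, "step", qstep(qp), "max error", max_error(samples, qp))
```

A smaller QP gives a finer step and therefore a smaller reconstruction error at the cost of a larger code amount, which is the trade-off that the per-section parameter controls.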

The image encoding device also uses the distance image encoding means to encode the distance image and generate a distance image encoded stream.
The image encoding device then uses the stream combining means to combine the plurality of depth-based photographed image encoded streams with the distance image encoded stream. A single encoded stream is thereby generated.
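The description does not specify the format in which the streams are combined. As one possible illustration only, the sketch below concatenates byte streams with a count and a length prefix for each stream so that they can later be separated again. The function names `combine_streams` and `separate_streams` and the 4-byte big-endian prefixes are assumptions made for this example.

```python
import struct


def combine_streams(streams):
    """Concatenate byte streams into one, prefixing the count and each length."""
    out = bytearray(struct.pack(">I", len(streams)))
    for stream in streams:
        out += struct.pack(">I", len(stream))
        out += stream
    return bytes(out)


def separate_streams(blob):
    """Inverse of combine_streams: recover the individual byte streams."""
    (count,) = struct.unpack_from(">I", blob, 0)
    offset = 4
    streams = []
    for _ in range(count):
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        streams.append(blob[offset:offset + length])
        offset += length
    return streams


# Hypothetical payloads standing in for the encoded streams.
depth_streams = [b"section-1", b"section-2", b"section-3"]
distance_stream = b"distance"
combined = combine_streams(depth_streams + [distance_stream])
```

With this layout, a decoder-side separation step is simply the inverse operation, `separate_streams(combined)`.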

Also to solve the above problem, an image encoding device according to the present invention is an image encoding device that encodes a photographed image of a subject and a distance image indicating depth information of the subject, and comprises depth-based photographed image generation means, depth-based photographed image encoding means, depth-based distance image generation means, depth-based distance image encoding means, and stream combining means.

The image encoding device uses the depth-based distance image generation means to extract from the distance image, for each depth section delimited by the same thresholds as for the photographed image, the image corresponding to that depth section. The depth-based distance image generation means thereby generates a plurality of depth-based distance images, one for each depth section.
The image encoding device then uses the depth-based distance image encoding means to encode the plurality of depth-based distance images on the basis of a parameter that controls the code amount, preset for each depth section, and generates a plurality of depth-based distance image encoded streams. This makes it possible to control the code amount of the distance image for each depth section.
The image encoding device then uses the stream combining means to combine the plurality of depth-based photographed image encoded streams with the plurality of depth-based distance image encoded streams. A single encoded stream is thereby generated.
Note that the image encoding device can be operated by a program that causes a computer to function as each of the means described above.

Also to solve the above problem, an image decoding device according to the present invention is an image decoding device that decodes an encoded stream obtained by combining a plurality of depth-based photographed image encoded streams, in which a photographed image of a subject is encoded separately for each depth section, with a distance image encoded stream, in which a distance image indicating depth information of the subject is encoded, and comprises stream separation means, depth-based photographed image decoding means, photographed image composition means, and distance image decoding means.

In this configuration, the image decoding device uses the stream separation means to separate the encoded stream into the plurality of depth-based photographed image encoded streams and the distance image encoded stream.
The image decoding device then uses the depth-based photographed image decoding means to decode the plurality of depth-based photographed image encoded streams. The depth-based photographed image decoding means thereby generates a plurality of depth-based photographed images.
The image decoding device then uses the photographed image composition means to composite the plurality of depth-based photographed images and generate the photographed image.
Furthermore, the image decoding device uses the distance image decoding means to decode the distance image encoded stream and generate the distance image.
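The composition step can be sketched as follows, under the assumption that each decoded depth-based photographed image is zero outside its own depth region and that the regions do not overlap, so that pixel-wise addition restores the full image. The function name `composite` and the small grayscale arrays are hypothetical; an actual decoder would also have to handle coding errors near region boundaries.

```python
def composite(depth_images):
    """Pixel-wise sum of depth-based images whose regions do not overlap."""
    height = len(depth_images[0])
    width = len(depth_images[0][0])
    return [
        [sum(image[y][x] for image in depth_images) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical decoded depth-based images (2 x 3 pixels, grayscale).
decoded = [
    [[0, 0, 30], [40, 0, 0]],
    [[10, 20, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 50, 60]],
]
restored = composite(decoded)
print(restored)  # [[10, 20, 30], [40, 50, 60]]
```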

Also to solve the above problem, an image decoding device according to the present invention is an image decoding device that decodes an encoded stream obtained by combining a plurality of depth-based photographed image encoded streams, in which a photographed image of a subject is encoded separately for each depth section, with a plurality of depth-based distance image encoded streams, in which a distance image indicating depth information of the subject is divided into the same depth sections as the photographed image and encoded, and comprises stream separation means, depth-based photographed image decoding means, photographed image composition means, depth-based distance image decoding means, and distance image composition means.

In this configuration, the image decoding device uses the stream separation means to separate the encoded stream into the plurality of depth-based photographed image encoded streams and the plurality of depth-based distance image encoded streams.
The image decoding device then uses the depth-based photographed image decoding means to decode the plurality of depth-based photographed image encoded streams. The depth-based photographed image decoding means thereby generates a plurality of depth-based photographed images.
The image decoding device then uses the photographed image composition means to composite the plurality of depth-based photographed images and generate the photographed image.
The image decoding device also uses the depth-based distance image decoding means to decode the plurality of depth-based distance image encoded streams. The depth-based distance image decoding means thereby generates a plurality of depth-based distance images.
The image decoding device then uses the distance image composition means to composite the plurality of depth-based distance images and generate the distance image.
Note that the image decoding device can be operated by a program that causes a computer to function as each of the means described above.

The present invention provides the following advantageous effects.
According to the present invention, an image can be encoded and decoded with the code amount controlled for each depth section. This makes it possible to allocate a larger code amount to a depth at which higher resolution is desired than to other depths, and thereby to raise the resolution of the image in the desired depth section.

FIG. 1 is a block diagram showing the configuration of an image encoding device according to a first embodiment of the present invention.
FIG. 2 is an explanatory diagram for explaining a method of generating a multi-view image.
FIG. 3 is an explanatory diagram for explaining distance images, in which (a) shows a photographed image and (b) shows the distance image corresponding to (a).
FIG. 4 is an explanatory diagram for explaining the thresholds of the depth sections.
FIG. 5 (a) to (c) are diagrams showing the mask data for each depth section.
FIG. 6 (a) to (c) are diagrams showing the depth-based photographed images for each depth section.
FIG. 7 is an explanatory diagram for explaining an example in which the thresholds of the depth sections are set at equal intervals.
FIG. 8 is an explanatory diagram for explaining an example in which the thresholds of the depth sections are set at unequal intervals.
FIG. 9 is a schematic diagram showing an overview of an IP display device of the integral method.
FIG. 10 is a flowchart showing the operation of the image encoding device according to the first embodiment of the present invention.
FIG. 11 is a block diagram showing the configuration of an image decoding device according to the first embodiment of the present invention.
FIG. 12 is a flowchart showing the operation of the image decoding device according to the first embodiment of the present invention.
FIG. 13 is a block diagram showing the configuration of an image encoding device according to a second embodiment of the present invention.
FIG. 14 is a block diagram showing the configuration of an image decoding device according to the second embodiment of the present invention.
FIG. 15 is an explanatory diagram for explaining the spatial frequency characteristics of the integral method.

Embodiments of the present invention will be described below with reference to the drawings.
<First embodiment>
[Configuration of image encoding device]
The configuration of an image encoding device 1 according to a first embodiment of the present invention will be described with reference to FIG. 1.

The image encoding device 1 encodes a photographed image of a subject and a distance image (depth map) indicating depth information of the subject. The image encoding device 1 encodes the photographed image with a code amount that varies according to the depth identified by the distance image.
The images input to the image encoding device 1 (photographed image and distance image) are described first, followed by the configuration of the image encoding device 1.

A photographed image is an individual viewpoint image of a multi-view image obtained by photographing a subject. The photographed image need not be a live-action image; it may be, for example, a CG image obtained by virtually photographing the subject with a virtual camera. The image encoding device 1 receives the multi-view image one viewpoint image at a time.

A distance image is an image in which, for each pixel, the distance from the viewpoint position to the subject in the subject space is expressed as a pixel value assigned within a predetermined range from a minimum depth value to a maximum depth value. The distance image corresponds to an individual viewpoint image of the multi-view image.

Examples of a photographed image and a distance image are described here with reference to FIGS. 2 and 3. As shown in FIG. 2, assume that subjects O (here, O_1, O_2, O_3) located at different depth positions in the subject space are photographed by a multi-view camera C in which a plurality of cameras C_1, C_2, ..., C_9 (nine in this example) are arranged two-dimensionally.
As shown in FIG. 3(a), the photographed image G is an image taken by one camera (for example, C_1) of the multi-view camera C. The images taken by the individual cameras C_1, C_2, ..., C_9 are the same as FIG. 3(a) except for the viewpoint position.
As shown in FIG. 3(b), the distance image D is obtained by matching the images taken by adjacent cameras and assigning, to each pixel position of the photographed image G, a pixel value corresponding to the parallax, which depends on the distance of the corresponding pixels. Here, the distance image D is displayed whiter the closer a point is to the viewpoint position and blacker the farther it is.

A known technique, such as that disclosed in Japanese Patent Application Laid-Open No. 2019-184308, can be used to generate a distance image from images taken by a plurality of cameras in this way. Although the distance image is described here as being generated from camera images, it may instead be acquired by a distance image sensor or the like that measures distance from the round-trip time of a projected laser.
Returning to FIG. 1, the configuration of the image encoding device 1 will now be described.

As shown in FIG. 1, the image encoding device 1 comprises threshold setting means 10, depth-based photographed image generation means 11, code amount control information setting means 12, depth-based photographed image encoding means 13, distance image encoding means 14, and stream combining means 15.

The threshold setting means 10 sets, in the depth-based photographed image generation means 11, thresholds for hierarchically dividing the depth of the distance image. A threshold is a depth value indicating the boundary between depth sections into which the distance image is divided by depth.
The threshold setting means 10 takes the predetermined viewpoint position as the minimum depth value and the farthest position to be displayed as the maximum depth value, and sets depth values that form the boundaries of the sections in between.
For N depth sections, the threshold setting means 10 receives (N-1) thresholds zp_1, zp_2, ..., zp_{N-1} from outside. Here, the number of depth sections N is "5", and the thresholds zp_1, ..., zp_4 are the values that divide the range between the minimum depth value and the maximum depth value into five equal parts.
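The equal division described above can be sketched as follows. This is only an illustrative calculation: the function name `equal_thresholds` and the depth range of 0 to 255 are assumptions made for the example, and the thresholds are ordered from the far side to the near side so that the first threshold is the one adjacent to the maximum depth value.

```python
def equal_thresholds(z_near, z_far, n_sections):
    """Return the (N-1) interior thresholds, ordered from far to near."""
    if n_sections < 2:
        raise ValueError("at least two depth sections are required")
    step = (z_far - z_near) / n_sections
    # The first threshold is one step below the maximum depth value.
    return [z_far - k * step for k in range(1, n_sections)]


# Hypothetical depth range; N = 5 as in the description.
thresholds = equal_thresholds(z_near=0.0, z_far=255.0, n_sections=5)
print(thresholds)  # [204.0, 153.0, 102.0, 51.0]
```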

The depth sections need not be equally spaced; they may be set unevenly according to the sections for which the code amount is to be varied. Although the number of depth sections is described here as "5", any number of two or more suffices.
As a result, as shown in FIG. 4, the distance image can be divided into section L_1 (section length zlen_1) from the maximum depth value zfar to threshold zp_1, section L_2 (section length zlen_2) from threshold zp_1 to threshold zp_2, section L_3 (section length zlen_3) from threshold zp_2 to threshold zp_3, section L_4 (section length zlen_4) from threshold zp_3 to threshold zp_4, and section L_5 (section length zlen_5) from threshold zp_4 to the minimum depth value znear.
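The relationship between the thresholds and the section lengths can be written out as a short sketch that also covers unequal spacing. The function name `depth_sections` and the numeric values are hypothetical; the only property taken from the description is that the sections run from the maximum depth value down to the minimum depth value.

```python
def depth_sections(z_near, z_far, thresholds):
    """Return (upper, lower, length) for each section, from far to near."""
    bounds = [z_far, *thresholds, z_near]
    if any(a <= b for a, b in zip(bounds, bounds[1:])):
        raise ValueError("boundaries must strictly decrease from z_far to z_near")
    return [(upper, lower, upper - lower) for upper, lower in zip(bounds, bounds[1:])]


# Hypothetical unequal thresholds: narrow sections at both ends of the range.
sections = depth_sections(0.0, 100.0, [90.0, 60.0, 40.0, 10.0])
for index, (upper, lower, zlen) in enumerate(sections, start=1):
    print(f"L_{index}: {lower} to {upper}, zlen_{index} = {zlen}")
```

Four thresholds yield five sections, and the section lengths add up to the full depth range.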

When the photographed images (multi-view images) to be encoded are used as an elemental image group of the integral method, for example by the technique described in Patent Document 1, the thresholds are preferably set so as to delimit the images at depths near the lens array. This allows the code amount to differ at least between images near the lens array and other images.
A specific example of threshold setting for the case where the multi-view images are used as an elemental image group of the integral method will be described later.
The threshold setting means 10 outputs the input thresholds (zp_1, ..., zp_4) to the depth-based photographed image generation means 11.

The depth-based photographed image generation means 11 extracts from the photographed image, for each depth section delimited by the preset thresholds, the image corresponding to that depth section as identified by the distance image, and generates a plurality of depth-based photographed images.
Here, the depth-based photographed image generation means 11 comprises region division means 110 and region image generation means 111.

The region division means 110 generates, for each depth section specified by the thresholds set by the threshold setting means 10, region information that divides the input distance image into regions.
For each depth section, the region division means 110 takes the set of pixels of the distance image having depth values within that section as the region information indicating the region corresponding to the depth section.
Here, the region division means 110 generates the region information for each depth section as mask data. Specifically, for each depth section, it generates mask data in which pixels whose distance-image pixel values fall within the depth section are set to "1" and all other pixels are set to "0". A threshold separating two sections is included in exactly one of the two sections.

For example, suppose the distance image for the subjects O_1, O_2, O_3 shown in FIG. 2 is the distance image D shown in FIG. 3(b), and, as shown in FIG. 4, the subjects O_1, O_2, O_3 lie in the depth sections L_2, L_3, L_4, respectively.
In this case, for section L_1, the region division means 110 generates mask data (not shown) in which pixels of the distance image D with values of at least threshold zp_1 and at most the maximum depth value zfar are set to "1" and all other pixels are set to "0".
For section L_2, the region division means 110 generates the mask data M_1 shown in FIG. 5(a), in which pixels of the distance image D with values of at least threshold zp_2 and less than threshold zp_1 are set to "1" and all other pixels are set to "0".
For section L_3, the region division means 110 generates the mask data M_2 shown in FIG. 5(b), in which pixels of the distance image D with values of at least threshold zp_3 and less than threshold zp_2 are set to "1" and all other pixels are set to "0".

For section L_4, the region division means 110 generates the mask data M_3 shown in FIG. 5(c), in which pixels of the distance image D with values of at least threshold zp_4 and less than threshold zp_3 are set to "1" and all other pixels are set to "0".
For section L_5, the region division means 110 generates mask data (not shown) in which pixels of the distance image D with values of at least the minimum depth value znear and less than threshold zp_4 are set to "1" and all other pixels are set to "0".
In this way, the region division means 110 can divide the distance image into regions for each depth section specified by the thresholds.
The region division means 110 outputs the generated region information (mask data) to the region image generation means 111.
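The mask generation for all sections can be summarized in a short sketch. It assumes, for illustration only, that the distance image is held as a nested list of depth values in which a larger value means a greater distance; the function name `depth_masks` and the sample values are hypothetical. The boundary handling follows the description: the farthest section includes the maximum depth value, and each threshold belongs to the section on its near side.

```python
def depth_masks(distance_image, z_near, z_far, thresholds):
    """Return one 0/1 mask per depth section, ordered from far to near."""
    bounds = [z_far, *thresholds, z_near]
    masks = []
    for index, (upper, lower) in enumerate(zip(bounds, bounds[1:])):
        mask = []
        for row in distance_image:
            mask_row = []
            for z in row:
                if index == 0:
                    inside = lower <= z <= upper  # farthest section includes z_far
                else:
                    inside = lower <= z < upper   # upper threshold is excluded
                mask_row.append(1 if inside else 0)
            mask.append(mask_row)
        masks.append(mask)
    return masks


# Hypothetical 2 x 3 distance image with depth values between 0 and 100.
distance_image = [[100, 80, 79], [60, 20, 0]]
masks = depth_masks(distance_image, z_near=0, z_far=100, thresholds=[80, 60, 40, 20])
print(masks[0])  # [[1, 1, 0], [0, 0, 0]]
print(masks[1])  # [[0, 0, 1], [1, 0, 0]]
```

Because every threshold belongs to exactly one section, each pixel is set to "1" in exactly one mask.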

The region image generation means 111 generates, for each depth section, a depth-based photographed image from the photographed image on the basis of the region information generated by the region division means 110.
Here, the region image generation means 111 consists of a plurality of region image generation means 111_1, 111_2, ..., 111_5, one for each depth section.

The region image generation means 111_1 generates, as a depth-based photographed image, the image of the region of section L_1 of the photographed image as divided by the region division means 110.
The region image generation means 111_2 generates, as a depth-based photographed image, the image of the region of section L_2 of the photographed image as divided by the region division means 110.
Similarly, the region image generation means 111_3, 111_4, 111_5 generate, as depth-based photographed images, the images of the regions of sections L_3, L_4, L_5 of the photographed image, respectively, as divided by the region division means 110.

領域画像生成手段１１１_１，１１１_２，…，１１１_５は、撮影画像から、所定の奥行区間の領域に対応する奥行別撮影画像を生成する点で同じ処理を行う。
具体的には、領域画像生成手段１１１は、それぞれ、撮影画像に、領域区分手段１１０から出力される領域情報であるマスクデータを奥行別に乗算することで、奥行区間ごとの奥行別撮影画像を生成する。 The area image generation means 111 ₁ , 111 ₂ , . . . , 111 ₅ perform the same processing in that they generate, from the captured images, depth-based captured images corresponding to regions in a predetermined depth section.
Specifically, each area image generation means 111 multiplies the photographed image by the mask data for its depth section, which is the area information output from the area segmentation means 110, thereby generating the depth-based photographed image for each depth section.
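
As a minimal sketch of this multiplication, the following assumes a single-channel image and a 0/1 mask, both held as lists of rows; the name `depth_layer` is hypothetical.

```python
def depth_layer(image, mask):
    """Multiply an image by a 0/1 mask, pixel by pixel.

    Pixels outside the depth section become 0; pixels inside keep their value.
    """
    return [[pixel * m for pixel, m in zip(image_row, mask_row)]
            for image_row, mask_row in zip(image, mask)]
```

Calling `depth_layer` once per mask yields one depth-based photographed image per depth section.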

例えば、図３に示す撮影画像Ｇおよび距離画像Ｄにおいて、図４に示すように被写体Ｏ_１，Ｏ_２，Ｏ_３が、区間Ｌ_２，Ｌ_３，Ｌ_４のそれぞれの奥行区間に存在していたとする。
この場合、領域画像生成手段１１１は、区間Ｌ_２に対応する図６（ａ）に示す奥行別撮影画像Ｇ_１、区間Ｌ_３に対応する図６（ｂ）に示す奥行別撮影画像Ｇ_２、区間Ｌ_３に対応する図６（ｃ）に示す奥行別撮影画像Ｇ_３を生成する。なお、区間Ｌ_１および区間Ｌ_５の奥行別撮影画像については図示を省略している。
奥行別撮影画像生成手段１１は、生成した奥行区間ごとの奥行別撮影画像を奥行別撮影画像符号化手段１３に出力する。 For example, suppose that, in the photographed image G and the distance image D shown in FIG. 3, the subjects O₁, O₂, and O₃ exist in the depth sections L₂, L₃, and L₄, respectively, as shown in FIG. 4.
In this case, the area image generation means 111 generates the depth-based photographed image G₁ shown in FIG. 6(a) corresponding to section L₂, the depth-based photographed image G₂ shown in FIG. 6(b) corresponding to section L₃, and the depth-based photographed image G₃ shown in FIG. 6(c) corresponding to section L₄. Note that the depth-based photographed images for sections L₁ and L₅ are not illustrated.
The depth-based photographed image generating means 11 outputs the generated depth-based photographed images for each depth section to the depth-based photographed image encoding means 13.

符号量制御情報設定手段１２は、奥行別撮影画像の符号量を奥行区間ごとに制御するパラメータを奥行別撮影画像符号化手段１３に設定するものである。
符号量を制御するパラメータは、奥行別撮影画像を符号化する際の符号量を制御する量子化ステップを特定する情報である。例えば、パラメータとして、ＨＥＶＣ等の符号化方式で用いられる量子化パラメータ（ＱＰ：Quantization parameter）を用いることができる。この量子化パラメータの値を大きく設定すると、量子化ステップが大きくなり、符号量が少なくなる。一方、量子化パラメータの値を小さく設定すると、量子化ステップが小さくなり、符号量が多くなる。
ここでは、パラメータとして、量子化パラメータを用いるが、量子化パラメータと対応する量子化ステップを用いてもよい。 The code amount control information setting means 12 sets parameters for controlling the code amount of the depth-based photographed image for each depth section in the depth-based photographed image encoding means 13.
The parameter that controls the code amount is information that specifies a quantization step that controls the code amount when encoding the depth-based captured image. For example, a quantization parameter (QP) used in an encoding method such as HEVC can be used as the parameter. If the value of this quantization parameter is set large, the quantization step becomes large and the amount of code decreases. On the other hand, if the value of the quantization parameter is set to a small value, the quantization step becomes small and the amount of code increases.
Here, a quantization parameter is used as a parameter, but a quantization step corresponding to the quantization parameter may also be used.
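
For orientation, the relation between the quantization parameter and the quantization step can be sketched as follows. This uses the commonly cited H.264/HEVC approximation in which the step is 1 at QP 4 and doubles for every increase of 6 in QP; the function name is hypothetical, and the patent does not prescribe this formula.

```python
def quantization_step(qp):
    """Approximate quantization step for a given QP (H.264/HEVC-style).

    The step is 1.0 at QP 4 and doubles every time QP increases by 6.
    """
    return 2.0 ** ((qp - 4) / 6.0)
```

A depth section given a smaller QP therefore uses a finer step and receives more code than a section given a larger QP.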

この符号量制御情報設定手段１２は、奥行区間ごとに、それぞれ個別の量子化パラメータＱＰを外部から入力する。この区間数は、閾値設定手段１０で設定する閾値で区分される区間数Ｎと同じである。すなわち、符号量制御情報設定手段１２は、奥行区間ごとの量子化パラメータ（ＱＰ_１，ＱＰ_２，…，ＱＰ_Ｎ－１）を外部から入力する。ここでは、奥行区間の区間数Ｎを“５”とし、量子化パラメータＱＰ_１，…，ＱＰ_４とする。
この量子化パラメータによって、撮影画像のどの奥行区間に対して、符号量を多く割り当てるかを制御することが可能になる。
符号量制御情報設定手段１２は、入力された奥行区間ごとの量子化パラメータ（ＱＰ_１，…，ＱＰ_４）を、奥行別撮影画像符号化手段１３に出力する。 This code amount control information setting means 12 receives from the outside an individual quantization parameter QP for each depth section. The number of sections is the same as the number N of sections divided by the thresholds set by the threshold setting means 10. That is, the code amount control information setting means 12 receives the quantization parameters (QP₁, QP₂, …, QP_{N-1}) for each depth section from the outside. Here, the number N of depth sections is "5", and the quantization parameters are QP₁, …, QP₄.
These quantization parameters make it possible to control which depth sections of the photographed image are allocated a larger amount of code.
The code amount control information setting means 12 outputs the input quantization parameters (QP₁, …, QP₄) for each depth section to the depth-based photographed image encoding means 13.

奥行別撮影画像符号化手段１３は、奥行別撮影画像生成手段１１で生成された奥行区間ごとの撮影画像（奥行別撮影画像）を符号化するものである。奥行別撮影画像符号化手段１３は、奥行区間ごとに複数の符号化手段１３０を備える。 The depth-based photographed image encoding means 13 encodes the photographed images for each depth section (depth-based photographed images) generated by the depth-based photographed image generation means 11. The depth-specific photographed image encoding means 13 includes a plurality of encoding means 130 for each depth section.

符号化手段１３０は、奥行別撮影画像を、符号量制御情報設定手段１２で設定された符号量を制御するパラメータ（量子化パラメータ）に基づいて符号化するものである。この符号化手段１３０は、符号量を制御可能な符号化方式であれば、どのような符号化方式を用いてもよい。例えば、符号化手段１３０は、ＨＥＶＣ等の符号化方式を用いる。
ここでは、符号化手段１３０を、奥行区間の区間数に応じた複数の符号化手段１３０_１，１３０_２，…，１３０_５で構成している。 The encoding unit 130 encodes the depth-based captured image based on the parameter (quantization parameter) for controlling the code amount set by the code amount control information setting unit 12. This encoding means 130 may use any encoding method as long as it can control the amount of code. For example, the encoding means 130 uses an encoding method such as HEVC.
Here, the encoding means 130 is constituted by a plurality of encoding means 130₁, 130₂, …, 130₅ according to the number of depth sections.

符号化手段１３０_１は、奥行別撮影画像生成手段１１の領域画像生成手段１１１_１で生成された奥行別撮影画像を符号化するものである。
符号化手段１３０_２は、奥行別撮影画像生成手段１１の領域画像生成手段１１１_２で生成された奥行別撮影画像を符号化するものである。
同様に、符号化手段１３０_３，１３０_４，１３０_５は、それぞれ奥行別撮影画像生成手段１１の領域画像生成手段１１１_３，１１１_４，１１１_５で生成された奥行別撮影画像を符号化するものである。 The encoding means 130₁ encodes the depth-based photographed image generated by the area image generation means 111₁ of the depth-based photographed image generation means 11.
The encoding means 130₂ encodes the depth-based photographed image generated by the area image generation means 111₂ of the depth-based photographed image generation means 11.
Similarly, the encoding means 130₃, 130₄, and 130₅ encode the depth-based photographed images generated by the area image generation means 111₃, 111₄, and 111₅, respectively, of the depth-based photographed image generation means 11.

符号化手段１３０（１３０_１，…，１３０_５）は、奥行別撮影画像を符号化する点で同じ処理を行う。
具体的には、符号化手段１３０は、奥行別撮影画像を所定の大きさのブロックごとに直交変換し、周波数成分に変換する。この直交変換は、例えば、離散コサイン変換（ＤＣＴ）である。そして、符号化手段１３０は、周波数成分である変換係数を、符号量制御情報設定手段１２で設定された量子化パラメータで特定される量子化ステップで量子化する。そして、符号化手段１３０は、量子化した変換係数（量子化係数）を可変長符号化して、ストリームデータ（奥行別撮影画像符号化ストリーム）を生成する。なお、符号化手段１３０は、奥行別撮影画像符号化ストリームに、付帯情報として量子化パラメータを付加しておく。
符号化手段１３０（１３０_１，…，１３０_５）は、それぞれ奥行区間ごとに生成した奥行別撮影画像符号化ストリームをストリーム結合手段１５に出力する。 The encoding means 130 (130 ₁ , . . . , 130 ₅ ) performs the same processing in that it encodes the captured images by depth.
Specifically, the encoding unit 130 orthogonally transforms the depth-based captured image for each block of a predetermined size, and converts the image into frequency components. This orthogonal transform is, for example, a discrete cosine transform (DCT). Then, the encoding means 130 quantizes the transform coefficient, which is a frequency component, at a quantization step specified by the quantization parameter set by the code amount control information setting means 12. Then, the encoding unit 130 performs variable length encoding on the quantized transform coefficients (quantization coefficients) to generate stream data (depth-specific photographed image encoded stream). Note that the encoding unit 130 adds a quantization parameter as supplementary information to the encoded stream of photographed images classified by depth.
The encoding means 130 (130 ₁ , . . . , 130 ₅ ) outputs the depth-specific photographed image encoded streams generated for each depth section to the stream combining means 15 .
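
The transform and quantization steps can be illustrated with a simplified sketch. It assumes a one-dimensional block and an orthonormal DCT-II, whereas a real codec such as HEVC uses two-dimensional integer transforms followed by entropy coding; the function names are hypothetical.

```python
import math

def dct_1d(block):
    """Orthonormal DCT-II of a 1-D block of samples."""
    n = len(block)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                                  for i, x in enumerate(block)))
    return coeffs

def quantize(coeffs, step):
    """Divide each transform coefficient by the step and round to an integer level."""
    return [int(round(c / step)) for c in coeffs]
```

A larger step maps more coefficients to zero, which is how a larger quantization parameter reduces the amount of code for a depth section.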

距離画像符号化手段１４は、距離画像を符号化するものである。この距離画像符号化手段１４は、どのような符号化方式を用いてもよいが、例えば、符号化手段１３０と同じ、ＨＥＶＣ等の符号化方式を用いることができる。
距離画像符号化手段１４は、距離画像を符号化したストリームデータ（距離画像符号化ストリーム）をストリーム結合手段１５に出力する。 The distance image encoding means 14 encodes a distance image. The distance image encoding means 14 may use any encoding method, and for example, the same encoding method as the encoding means 130, such as HEVC, can be used.
The distance image encoding means 14 outputs stream data (distance image encoded stream) obtained by encoding the distance image to the stream combining means 15 .

ストリーム結合手段１５は、奥行別撮影画像符号化手段１３で生成された奥行区間ごとの複数の奥行別撮影画像符号化ストリームと、距離画像符号化手段１４で生成された距離画像符号化ストリームとを１つのストリームデータとして結合するものである。
例えば、ストリーム結合手段１５は、予め定めた符号化手段１３０_１，…，１３０_５の順、または、符号化を完了した順に、生成した奥行別画像符号化ストリームを連結する。また、ストリーム結合手段１５は、連結した奥行別画像符号化ストリームの前または後に距離画像符号化ストリームを連結する。 The stream combining means 15 combines the plurality of depth-based photographed image encoded streams for the respective depth sections generated by the depth-based photographed image encoding means 13 and the distance image encoded stream generated by the distance image encoding means 14 into a single piece of stream data.
For example, the stream combining means 15 connects the generated depth-specific image encoded streams in the predetermined order of the encoding means 130 ₁ , . . . , 130 ₅ or in the order in which encoding is completed. Furthermore, the stream combining means 15 connects the distance image encoded stream before or after the connected depth-specific image encoded stream.

さらに、ストリーム結合手段１５は、ストリームの構成（例えば、ストリームの配置順、奥行区間の区間数）、撮影画像および距離画像の大きさ（水平画素数、垂直画素数）、奥行別撮影画像符号化ストリームおよび距離画像符号化ストリームの各データ長等のストリームの構成情報をヘッダ情報として生成する。
そして、ストリーム結合手段１５は、奥行別画像符号化ストリームおよび距離画像符号化ストリームを連結したストリームに、ヘッダ情報を付加して、符号化ストリームを生成する。なお、ストリーム結合手段１５は、多視点画像の視点画像である撮影画像を、予め定めた視点数の符号化ストリーム分だけ連続させて出力する。また、多視点画像が動画像である場合、ストリーム結合手段１５は、さらに、符号化ストリームを連続させて出力する。 Furthermore, the stream combining means 15 generates, as header information, stream configuration information such as the structure of the stream (for example, the arrangement order of the streams and the number of depth sections), the sizes of the photographed image and the distance image (the numbers of horizontal and vertical pixels), and the data lengths of the depth-based photographed image encoded streams and the distance image encoded stream.
Then, the stream combining means 15 adds the header information to the stream in which the depth-based image encoded streams and the distance image encoded stream are concatenated, and generates an encoded stream. Note that the stream combining means 15 outputs the encoded streams of the photographed images, which are the viewpoint images of the multi-view image, consecutively for a predetermined number of viewpoints. Further, when the multi-view image is a moving image, the stream combining means 15 additionally outputs the encoded streams consecutively.
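
One possible header layout can be sketched as follows. The byte layout (a one-byte section count, two 16-bit image dimensions, then one 32-bit length per stream) and the name `combine_streams` are assumptions for illustration; the patent does not define a concrete format.

```python
import struct

def combine_streams(depth_streams, distance_stream, width, height):
    """Concatenate the per-section streams and the distance stream behind a header.

    Hypothetical header: section count (uint8), width and height (uint16 each),
    then the byte length of every stream (uint32 each), all big-endian.
    """
    # The distance image stream is placed after the depth-based streams.
    streams = list(depth_streams) + [distance_stream]
    header = struct.pack(">BHH", len(depth_streams), width, height)
    header += b"".join(struct.pack(">I", len(s)) for s in streams)
    return header + b"".join(streams)
```

Storing each data length in the header lets a decoder split the payload without parsing the individual streams.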

以上説明したように画像符号化装置１を構成することで、画像符号化装置１は、奥行区間ごとに、撮影画像の符号量を制御することができる。
これによって、画像符号化装置１は、高画質化したい奥行区間の符号量を増やすことで、その奥行きで再生される立体像の解像度を高めることができる。
また、画像符号化装置１は、撮影画像を圧縮符号化する場合でも、区間ごとに符号量を増減させることで、所望の奥行区間の解像度の劣化を抑えることができる。
なお、画像符号化装置１は、コンピュータを、前記した各手段として機能させるためのプログラム（画像符号化プログラム）で動作させることができる。 By configuring the image encoding device 1 as described above, the image encoding device 1 can control the code amount of the photographed image for each depth section.
Thereby, the image encoding device 1 can increase the resolution of the stereoscopic image reproduced at that depth by increasing the amount of code for the depth section whose image quality is desired to be improved.
Furthermore, even when compressing and encoding a photographed image, the image encoding device 1 can suppress deterioration of the resolution of a desired depth section by increasing or decreasing the amount of code for each section.
Note that the image encoding device 1 can be operated by a program (image encoding program) that causes a computer to function as each of the means described above.

〔インテグラル方式の閾値設定について〕
次に、画像符号化装置１が符号化対象とする撮影画像（多視点画像）を、インテグラル方式の要素画像群として使用する場合において、閾値設定手段１０が設定する閾値の設定例について説明する。 [About threshold setting for integral method]
Next, an example of the thresholds set by the threshold setting means 10 will be described for the case where the photographed images (multi-view images) to be encoded by the image encoding device 1 are used as an elemental image group of the integral method.

（等間隔の閾値設定）
まず、図７を参照して、閾値を等間隔に設定する例について説明する。
図７は、横軸をレンズアレイの位置を基準位置“０”（ＬＰ）とする距離、縦軸を空間周波数（ｃｙｃｌｅ／ｒａｄ＝ｃｐｒ）とする空間周波数特性を示す。
ここで、奥行最小置をｚｎｅａｒ、最も奥の奥行最大値をｚｆａｒとする。ｚｎｅａｒは、基準位置ＬＰから立体像を表示する最も手前（観察者Ｍの位置）までの視距離Ｌに対応する位置である。なお、ｚｆａｒは有限とし、無限遠の像はｚｆａｒの位置に存在するものとする。 (Evenly spaced threshold setting)
First, with reference to FIG. 7, an example in which threshold values are set at equal intervals will be described.
FIG. 7 shows spatial frequency characteristics, with the horizontal axis representing the distance from the position of the lens array as the reference position "0" (LP), and the vertical axis representing the spatial frequency (cycle/rad=cpr).
Here, the minimum depth is assumed to be znear, and the maximum depth at the deepest position is assumed to be zfar. znear is a position corresponding to the viewing distance L from the reference position LP to the closest position where the stereoscopic image is displayed (the position of the observer M). It is assumed that zfar is finite and that an image at infinity exists at the position of zfar.

また、奥行区間の区間数をＮ（Ｎは３以上の整数）とする。この奥行区間は、基準位置ＬＰを含む区間、その区間の手前側および奥側にそれぞれ１区間とする構成が最小区間構成となる。なお、基準位置ＬＰ付近を区間として設定可能であれば、区間数Ｎは、奇数であっても偶数であっても構わない。
また、閾値をｚｐ_ｎ（ｎ＝１，…，Ｎ－１）とし、ｚｆａｒに最も近い閾値をｚｐ_１とする。なお、図７では、区間数Ｎ＝５とした例を示している。
ここで、等間隔に閾値を設定する場合、各奥行区間の区間長ｚｌｅｎ_ｎ（ｎ＝１，…，Ｎ）は、以下の式（１）で示す長さとなる。また、閾値ｚｐ_ｎ（ｎ＝１，…，Ｎ－１）は、以下の式（２）で求めることができる。 Further, the number of depth sections is set to N (N is an integer of 3 or more). This depth section has a minimum section configuration in which there is a section including the reference position LP, and one section on the near side and one section on the back side of the section. Note that as long as the vicinity of the reference position LP can be set as a section, the number of sections N may be an odd number or an even number.
Further, the thresholds are denoted zp_n (n = 1, …, N−1), and the threshold closest to zfar is zp₁. Note that FIG. 7 shows an example in which the number of sections N = 5.
Here, when the thresholds are set at equal intervals, the section length zlen_n (n = 1, …, N) of each depth section is the length given by equation (1) below. The thresholds zp_n (n = 1, …, N−1) can be obtained by equation (2) below.
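
As a sketch of equal-interval thresholds, the following assumes that the range from znear to zfar is divided into N sections of equal length and that the thresholds are counted from the zfar side, as stated above. The function name is hypothetical, and the code follows from those two statements rather than from the equation images.

```python
def equal_thresholds(znear, zfar, n_sections):
    """Return [zp1, ..., zp(N-1)] for N equal-length depth sections.

    zp1 is the threshold closest to zfar; each section has length
    (zfar - znear) / N.
    """
    zlen = (zfar - znear) / n_sections
    return [zfar - n * zlen for n in range(1, n_sections)]
```

For example, `equal_thresholds(0.0, 50.0, 5)` returns `[40.0, 30.0, 20.0, 10.0]`.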

図７の例では、区間［ｚｐ_２，ｚｐ_３］に多くの符号量を割り当てることで、レンズアレイ近傍の立体像を再生する際に、高解像度の立体像を表示することが可能になる。 In the example of FIG. 7, by allocating a large amount of code to the section [zp ₂ , zp ₃ ], it becomes possible to display a high-resolution stereoscopic image when reproducing a stereoscopic image near the lens array.

（非等間隔の閾値設定）
次に、図８を参照して、閾値を非等間隔に設定する例について説明する。なお、閾値を非等間隔にする場合、レンズアレイ付近ほど奥行区間を狭くする方が、レンズアレイ付近の画像により多くの符号量を割り当てることが可能になる。
そこで、ここでは、レンズアレイ付近の奥行区間を狭く、レンズアレイから離れた奥行区間を広くすることとする。
図８は、図７と同様、空間周波数特性を示し、奥行最小置をｚｎｅａｒ、奥行最大値をｚｆａｒとする。
ここで、奥行区間の区間数をＮ（Ｎは３以上の整数）とする。また、レンズアレイよりも奥の区間数をＭ（ＭはＮ／２の整数部分）とする。この場合、レンズアレイよりも手前の区間数は（Ｎ－（Ｍ＋１））となる。なお、図８では、区間数Ｎ＝５とした例を示している。
このとき、レンズアレイの位置を含む奥行区間を空間周波数が最大となる奥行区間とする。観察者Ｍが視認する空間周波数（観視空間周波数β）は、以下の式（３）で表さられる。 (Non-uniform threshold setting)
Next, with reference to FIG. 8, an example in which thresholds are set at non-uniform intervals will be described. Note that when the threshold values are set at non-uniform intervals, it is possible to allocate a larger amount of code to images near the lens array by making the depth section narrower closer to the lens array.
Therefore, here, the depth section near the lens array is narrowed, and the depth section away from the lens array is widened.
Similar to FIG. 7, FIG. 8 shows spatial frequency characteristics, with the minimum depth being znear and the maximum depth being zfar.
Here, the number of depth sections is N (N is an integer of 3 or more). Further, the number of sections behind the lens array is P (P is the integer part of N/2). In this case, the number of sections in front of the lens array is (N−(P+1)). Note that FIG. 8 shows an example in which the number of sections N = 5.
At this time, the depth section including the position of the lens array is defined as the depth section where the spatial frequency becomes maximum. The spatial frequency (viewing spatial frequency β) visually recognized by the observer M is expressed by the following equation (3).

ここで、図９を参照して、式（３）を説明する。図９は、インテグラル方式における立体像を表示するＩＰ表示装置の概略図である。
ｆは、レンズアレイＬ_Ａ（要素レンズＬ_Ｅ）の焦点距離である。ｐは、要素画像ｅを表示するディスプレイの画素ピッチ（画素間隔）である。Ｌは、レンズアレイＬ_Ａから観察者Ｍまでの視距離である。ｚは、レンズアレイＬ_Ａからの立体像Ｔまでの距離（観察者Ｍ方向を正）である。
ここで、立体像Ｔの画素ピッチ（画素間隔）Δは、以下の式（４）で表される。 Here, equation (3) will be explained with reference to FIG. FIG. 9 is a schematic diagram of an IP display device that displays a stereoscopic image using the integral method.
f is the focal length of the lens array L_A (element lenses L_E). p is the pixel pitch (pixel interval) of the display that displays the elemental images e. L is the viewing distance from the lens array L_A to the observer M. z is the distance from the lens array L_A to the stereoscopic image T (positive in the direction of the observer M).
Here, the pixel pitch (pixel interval) Δ of the stereoscopic image T is expressed by the following equation (4).

また、観視空間周波数βは、観察者が見ることのできる単位角度あたりの最大の縞の数であって、以下の式（５）で表される。 Furthermore, the viewing spatial frequency β is the maximum number of fringes per unit angle that can be seen by the observer, and is expressed by the following equation (5).

この式（５）に、式（４）を代入することで、前記式（３）となる。
なお、観察者Ｍが視距離Ｌの位置で視認する立体像Ｔの最大空間周波数β_ｎは、レンズアレイＬ_Ａの要素レンズＬ_Ｅのレンズピッチｐ_Ｌによって、ナイキスト周波数に制限され、以下の式（６）となる。 By substituting equation (4) into equation (5), equation (3) is obtained.
Note that the maximum spatial frequency β_n of the stereoscopic image T viewed by the observer M at the viewing distance L is limited to the Nyquist frequency by the lens pitch p_L of the element lenses L_E of the lens array L_A, and is given by equation (6) below.
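
For orientation, relations of this kind are commonly written in the integral-imaging literature in the form below. This is a sketch under that assumption, using the symbols f, p, L, z, Δ, and p_L defined above; it is not a transcription of the patent's own formulas.

```latex
\begin{align*}
\Delta  &= \frac{p\,|z|}{f}            \tag{4} \\
\beta   &= \frac{L - z}{2\,\Delta}     \tag{5} \\
\beta   &= \frac{f\,(L - z)}{2\,p\,|z|} \tag{3} \\
\beta_n &= \frac{L}{2\,p_L}            \tag{6}
\end{align*}
```

In this form, substituting (4) into (5) gives (3), which matches the derivation order described in the text.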

この場合、図８に示すように、最大空間周波数β_ｎとなる区間［ｆａｒ_１，ｎｅａｒ_１］は、前記式（３）の観視空間周波数βに、前記式（６）の最大空間周波数β_ｎを代入することで求められるｚにより、区間[－ｚ，＋ｚ]として求めることができる。
ここでは、レンズアレイ付近の奥行区間を、区間［ｆａｒ_１，ｎｅａｒ_１］の前後に予め定めたマージンを設けた区間とする。
例えば、マージンを設けた奥行区間［ｆａｒ_２，ｎｅａｒ_２］を、以下の式（７）により求められる区間とする。 In this case, as shown in FIG. 8, the section [far₁, near₁] in which the maximum spatial frequency β_n is attained can be obtained as the section [−z, +z], using the z obtained by substituting the maximum spatial frequency β_n of equation (6) for the viewing spatial frequency β of equation (3).
Here, the depth section near the lens array is the section [far₁, near₁] with a predetermined margin added before and after it.
For example, the depth section [far₂, near₂] with the margin is the section obtained by equation (7) below.
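
The step of finding z from the two frequencies can be sketched numerically. The code assumes the forms β = f(L − z)/(2p|z|) for the viewing spatial frequency and β_n = L/(2p_L) for the Nyquist limit, which are common in the integral-imaging literature but are assumptions here; it solves only for the positive z, and all function names are hypothetical. The margin of equation (7) is not modeled.

```python
def viewing_frequency(z, f, p, L):
    """Assumed viewing spatial frequency: beta = f (L - z) / (2 p |z|)."""
    return f * (L - z) / (2.0 * p * abs(z))

def nyquist_frequency(p_lens, L):
    """Assumed Nyquist limit set by the lens pitch: beta_n = L / (2 p_L)."""
    return L / (2.0 * p_lens)

def nyquist_depth(f, p, p_lens, L):
    """Positive z at which viewing_frequency equals nyquist_frequency.

    Obtained by solving f (L - z) / (2 p z) = L / (2 p_L) for z > 0.
    """
    return f * p_lens * L / (p * L + f * p_lens)
```

Under these assumptions, the returned z is the half-width of the section around the lens array in which the lens pitch, not the display pixel pitch, limits the resolution.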

なお、αは“０”以上の任意の定数であって、例えば、“０．２”等である。この値は、奥行区間の区間数を少なくする場合は大きな値とし、奥行区間の区間数を多くする場合は小さな値とすればよい。
ここで、奥行区間の区間数Ｎ＝５の場合、閾値ｚｐ_２＝ｆａｒ_２、閾値ｚｐ_３＝ｎｅａｒ_２とする。
レンズアレイよりも奥のＰ個の奥行区間、および、レンズアレイよりも手前の（Ｎ－（Ｐ＋１））個の奥行区間については、各奥行区間に割り当てる符号量に応じて閾値を定めればよい。
例えば、レンズアレイよりも奥のＰ個の奥行区間について、各奥行区間に割り当てる符号量の逆比を、奥側から順にＤ_１：Ｄ_２：…：Ｄ_Ｐとする。
この場合、各奥行区間の区間長ｚｌｅｎ_ｎ（ｎ＝１，２，…，Ｐ）は、以下の式（８）で示す長さとなる。また、閾値ｚｐ_ｎ（ｍ＝１，２，…，Ｐ－１）は、以下の式（９）で求めることができる。 Note that α is an arbitrary constant greater than or equal to “0”, and is, for example, “0.2”. This value may be set to a large value when decreasing the number of depth sections, and may be set to a small value when increasing the number of depth sections.
Here, when the number of depth sections N=5, the threshold value zp ₂ =far ₂ and the threshold value zp ₃ =near ₂ are set.
For the P depth sections behind the lens array and the (N−(P+1)) depth sections in front of the lens array, the thresholds may be determined according to the amount of code allocated to each depth section.
For example, for the P depth sections behind the lens array, let the inverse ratio of the amounts of code allocated to the depth sections be D₁ : D₂ : … : D_P, in order from the far side.
In this case, the section length zlen_n (n = 1, 2, …, P) of each depth section is the length given by equation (8) below. The thresholds zp_n (n = 1, 2, …, P−1) can be obtained by equation (9) below.
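
One plausible reading of this ratio-based division is sketched below: the range behind the lens array, from zfar down to far₂, is split so that each section length is proportional to its inverse-ratio value D_n. The proportional split and the name `ratio_thresholds` are assumptions for illustration and are not taken from the equation images.

```python
def ratio_thresholds(z_start, z_end, ratios):
    """Split the range from z_start down to z_end into sections with lengths in `ratios`.

    Returns (lengths, thresholds). The thresholds are the interior boundaries,
    listed in order starting from the z_start side.
    """
    span = z_start - z_end
    total = sum(ratios)
    lengths = [span * d / total for d in ratios]
    thresholds = []
    z = z_start
    for length in lengths[:-1]:
        z -= length
        thresholds.append(z)
    return lengths, thresholds
```

Because D_n is an inverse ratio of the code amount, a section that should receive more code gets a smaller D_n and therefore a narrower depth range.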

同様に、レンズアレイよりも手前の（Ｎ－（Ｐ＋１））個の奥行区間についても、各奥行区間に割り当てる符号量に応じて定めればよい。
例えば、レンズアレイよりも手前の（Ｎ－（Ｐ＋１））個の奥行区間について、各奥行区間に割り当てる符号量の逆比を、奥側から順にＤ_Ｐ＋２：Ｄ_Ｐ＋３：…：Ｄ_Ｎとする。
この場合、各奥行区間の区間長ｚｌｅｎ_ｎ（ｎ＝Ｐ＋２，Ｐ＋３，…，Ｎ）は、以下の式（１０）で示す長さとなる。また、閾値ｚｐ_ｎ（ｎ＝Ｐ＋２，Ｐ＋３，…，Ｎ－１）は、以下の式（１１）で求めることができる。 Similarly, the (N-(P+1)) depth sections in front of the lens array may be determined according to the amount of code allocated to each depth section.
For example, for the (N−(P+1)) depth sections in front of the lens array, let the inverse ratio of the amounts of code allocated to the depth sections be D_{P+2} : D_{P+3} : … : D_N, in order from the far side.
In this case, the section length zlen_n (n = P+2, P+3, …, N) of each depth section is the length given by equation (10) below. The thresholds zp_n (n = P+2, P+3, …, N−1) can be obtained by equation (11) below.

以上説明したように、奥行区間の閾値をレンズアレイの位置を基準として設定することで、レンズアレイ付近の画像に多くの符号量を割り当てることが可能になり、立体像の解像度を高めることができる。 As explained above, by setting the thresholds of the depth sections with reference to the position of the lens array, a large amount of code can be allocated to images near the lens array, and the resolution of the stereoscopic image can be increased.

〔画像符号化装置の動作〕
次に、図１０を参照（構成については、適宜図１参照）して、本発明の第１実施形態に係る画像符号化装置１の動作について説明する。
ステップＳ１において、閾値設定手段１０は、撮影画像の奥行区間の閾値を設定する。ここでは、閾値設定手段１０は、予め定めた視点位置を奥行最小値、表示する最も遠方の位置を奥行最大値とし、その間の奥行区間の境界である奥行値を外部から入力する。
ステップＳ２において、符号量制御情報設定手段１２は、ステップＳ１で設定した閾値で特定される奥行区間ごとに、符号量を制御するパラメータ（量子化パラメータ）を設定する。ここでは、符号量制御情報設定手段１２は、パラメータとして、区間ごとの量子化パラメータを外部から入力する。
なお、ステップＳ１およびＳ２の順番は入れ替えても構わない。 [Operation of image encoding device]
Next, the operation of the image encoding device 1 according to the first embodiment of the present invention will be described with reference to FIG. 10 (for the configuration, refer to FIG. 1 as appropriate).
In step S1, the threshold value setting means 10 sets a threshold value for the depth section of the photographed image. Here, the threshold value setting means 10 sets a predetermined viewpoint position as the minimum depth value, sets the farthest position to be displayed as the maximum depth value, and inputs from the outside a depth value that is the boundary of the depth section between them.
In step S2, the code amount control information setting means 12 sets a parameter (quantization parameter) for controlling the code amount for each depth section specified by the threshold value set in step S1. Here, the code amount control information setting means 12 inputs a quantization parameter for each section as a parameter from the outside.
Note that the order of steps S1 and S2 may be changed.

ステップＳ３において、奥行別撮影画像生成手段１１は、領域区分手段１１０によって、ステップＳ１で設定された閾値で特定される奥行区間ごとに、距離画像の領域を区分した領域情報を生成する。ここでは、領域区分手段１１０は、奥行区間ごとに、距離画像の対応する奥行値を有する画素の集合を、奥行区間に対応する領域を示す領域情報（マスクデータ）として生成する。
ステップＳ４において、奥行別撮影画像生成手段１１は、領域画像生成手段１１１によって、奥行区間ごとに、ステップＳ３で生成された領域情報に対応する撮影画像を抽出して、奥行別撮影画像を生成する。ここでは、奥行別撮影画像生成手段１１は、複数の領域画像生成手段１１１_１，…，１１１_５によって、撮影画像に、奥行区間ごとのマスクデータを乗算することで、奥行区間ごとの奥行別撮影画像を生成する。 In step S3, the depth-based photographed image generating means 11 generates area information by dividing the area of the distance image by the area dividing means 110 for each depth section specified by the threshold value set in step S1. Here, the area dividing means 110 generates, for each depth section, a set of pixels having corresponding depth values of the distance image as area information (mask data) indicating the area corresponding to the depth section.
In step S4, the depth-based photographed image generation means 11 uses the area image generation means 111 to extract, for each depth section, the part of the photographed image corresponding to the area information generated in step S3, and generates a depth-based photographed image. Here, the depth-based photographed image generation means 11 uses the plurality of area image generation means 111₁, …, 111₅ to multiply the photographed image by the mask data for each depth section, thereby generating the depth-based photographed image for each depth section.

ステップＳ５において、奥行別撮影画像符号化手段１３は、ステップＳ２で設定されたパラメータ（量子化パラメータ）に基づいて、ステップＳ４で生成された奥行区間ごとの奥行別撮影画像を符号化する。
ここでは、奥行別撮影画像符号化手段１３は、複数の符号化手段１３０_１，…，１３０_５によって、奥行別撮影画像を直交変換し、量子化パラメータで量子化した後、可変長符号化して、奥行別撮影画像符号化ストリームを生成する。
ステップＳ６において、距離画像符号化手段１４は、距離画像を符号化して、距離画像符号化ストリームを生成する。 In step S5, the depth-based photographed image encoding unit 13 encodes the depth-based photographed image for each depth section generated in step S4, based on the parameter (quantization parameter) set in step S2.
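Here, the depth-based photographed image encoding means 13 uses the plurality of encoding means 130₁, …, 130₅ to orthogonally transform the depth-based photographed images, quantize them with the quantization parameters, and then variable-length encode them to generate the depth-based photographed image encoded streams.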
In step S6, the distance image encoding means 14 encodes the distance image to generate an encoded distance image stream.

ステップＳ７において、ストリーム結合手段１５は、ステップＳ５で生成された奥行区間ごとの奥行別撮影画像符号化ストリームと、ステップＳ６で生成された距離画像符号化ストリームとを結合して、符号化ストリームを生成する。
ここで、次画像（撮影画像、距離画像）がさらに入力される場合（ステップＳ８でＹｅｓ）、画像符号化装置１は、ステップＳ３に戻って動作を続ける。
一方、入力が終了した場合（ステップＳ８でＮｏ）、画像符号化装置１は、動作を終了する。
以上の動作によって、画像符号化装置１は、距離画像で特定される奥行区間ごとに符号量を変えて撮影画像を符号化することができる。 In step S7, the stream combining means 15 combines the encoded captured image stream by depth for each depth section generated in step S5 and the encoded distance image stream generated in step S6 to create an encoded stream. generate.
Here, if the next image (photographed image, distance image) is further input (Yes in step S8), the image encoding device 1 returns to step S3 and continues the operation.
On the other hand, if the input is completed (No in step S8), the image encoding device 1 ends the operation.
By the above-described operation, the image encoding device 1 can encode a photographed image by changing the amount of code for each depth section specified by the distance image.

〔画像復号装置の構成〕
次に、図１１を参照して、本発明の第１実施形態に係る画像復号装置２の構成について説明する。
画像復号装置２は、画像符号化装置１（図１）で生成された符号化ストリームを復号するものである。
図１１に示すように、画像復号装置２は、ストリーム分離手段２０と、奥行別撮影画像復号手段２１と、撮影画像合成手段２２と、距離画像復号手段２３と、を備える。 [Configuration of image decoding device]
Next, with reference to FIG. 11, the configuration of the image decoding device 2 according to the first embodiment of the present invention will be described.
The image decoding device 2 decodes the encoded stream generated by the image encoding device 1 (FIG. 1).
As shown in FIG. 11, the image decoding device 2 includes a stream separating means 20, a depth-based photographed image decoding means 21, a photographed image combining means 22, and a distance image decoding means 23.

ストリーム分離手段２０は、入力された符号化ストリームを、複数の奥行別撮影画像符号化ストリームと、距離画像符号化ストリームとに分離するものである。
このストリーム分離手段２０は、符号化ストリームのヘッダ情報を参照して、符号化ストリームから、複数の奥行別撮影画像符号化ストリームと、距離画像符号化ストリームとを分離して抽出する。
ストリーム分離手段２０は、分離した奥行別撮影画像符号化ストリームを、奥行別撮影画像復号手段２１に出力する。
また、ストリーム分離手段２０は、分離した距離画像符号化ストリームを、距離画像復号手段２３に出力する。 The stream separating means 20 separates the input encoded stream into a plurality of depth-specific photographed image encoded streams and a distance image encoded stream.
The stream separation means 20 refers to the header information of the encoded stream and separates and extracts a plurality of depth-specific photographed image encoded streams and a distance image encoded stream from the encoded stream.
The stream separating means 20 outputs the separated depth-based photographed image encoded stream to the depth-based photographed image decoding means 21 .
Furthermore, the stream separating means 20 outputs the separated distance image encoded stream to the distance image decoding means 23.
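
Header-driven separation can be sketched as follows. The header layout (a one-byte section count, two 16-bit image dimensions, then one 32-bit length per stream, big-endian, with the distance image stream last) is a hypothetical format chosen for illustration; the patent does not define one.

```python
import struct

def separate_streams(encoded):
    """Split an encoded stream into depth-based streams and the distance stream.

    Assumes a hypothetical header: section count (uint8), width and height
    (uint16 each), then one uint32 length per stream, distance stream last.
    """
    n_sections, width, height = struct.unpack_from(">BHH", encoded, 0)
    offset = 5
    lengths = struct.unpack_from(">%dI" % (n_sections + 1), encoded, offset)
    offset += 4 * (n_sections + 1)
    streams = []
    for length in lengths:
        # Cut each stream out of the payload using its recorded length.
        streams.append(encoded[offset:offset + length])
        offset += length
    return streams[:-1], streams[-1], (width, height)
```

The depth-based streams would then go to the per-section decoders and the last stream to the distance image decoder.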

奥行別撮影画像復号手段２１は、ストリーム分離手段２０で分離された複数の奥行別撮影画像符号化ストリームを復号するものである。奥行別撮影画像復号手段２１は、予め定めた奥行区間の区間数の復号手段２１０を備える。 The depth-based photographed image decoding means 21 decodes the plurality of depth-based photographed image encoded streams separated by the stream separation means 20. The depth-specific photographed image decoding means 21 includes decoding means 210 for a predetermined number of depth sections.

復号手段２１０は、個々の奥行別撮影画像符号化ストリームを復号し、奥行別撮影画像を生成するものである。この復号手段２１０は、図１で説明した符号化手段１３０の符号化方式に対応した復号処理を行う。
ここでは、奥行別撮影画像復号手段２１を、奥行区間ごとに、区間数に応じた複数の復号手段２１０_１，２１０_２，…，２１０_５で構成している。 The decoding means 210 decodes each encoded stream of captured images classified by depth to generate captured images classified by depth. This decoding means 210 performs a decoding process corresponding to the encoding method of the encoding means 130 explained in FIG.
Here, the depth-specific photographed image decoding means 21 is constituted by a plurality of decoding means 210 ₁ , 210 ₂ , . . . , 210 ₅ according to the number of sections for each depth section.

復号手段２１０_１，…，２１０_５は、奥行別撮影画像符号化ストリームから奥行別撮影画像を復号する点で同じ処理を行う。
具体的には、復号手段２１０は、奥行別復号画像符号化ストリームを可変長復号し、量子化係数を生成する。そして、復号手段２１０は、付帯情報として奥行別撮影画像符号化ストリームに付帯されている量子化パラメータで特定される量子化ステップのサイズを量子化係数に乗算し、逆直交変換（逆ＤＣＴ）することで、奥行別撮影画像を復号する。
復号手段２１０（２１０_１，…，２１０_５）は、それぞれ奥行区間ごとに生成した奥行別撮影画像を撮影画像合成手段２２に出力する。 The decoding means 210 ₁ , . . . , 210 ₅ perform the same processing in that they decode the depth-specific captured images from the depth-specific captured image encoded stream.
Specifically, the decoding means 210 performs variable-length decoding on the depth-based photographed image encoded stream to generate quantized coefficients. The decoding means 210 then multiplies the quantized coefficients by the size of the quantization step specified by the quantization parameter attached to the depth-based photographed image encoded stream as supplementary information, and performs an inverse orthogonal transform (inverse DCT), thereby decoding the depth-based photographed image.
The decoding means 210 (210 ₁ , . . . , 210 ₅ ) outputs the depth-specific photographed images generated for each depth section to the photographed image composition means 22 .
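
The inverse quantization and inverse transform can be illustrated with a simplified sketch. It assumes a one-dimensional block and the inverse of an orthonormal DCT-II, whereas a real decoder works on two-dimensional blocks after entropy decoding; the function names are hypothetical.

```python
import math

def dequantize(levels, step):
    """Multiply each quantized level by the quantization step."""
    return [q * step for q in levels]

def idct_1d(coeffs):
    """Inverse of the orthonormal DCT-II for a 1-D block of coefficients."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        value = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            value += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        samples.append(value)
    return samples
```

The reconstruction is exact only up to the rounding applied at quantization, so a smaller step gives a closer match to the original samples.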

撮影画像合成手段２２は、奥行別撮影画像復号手段２１で復号された複数の奥行別撮影画像を合成するものである。
この撮影画像合成手段２２は、複数の奥行別撮影画像を加算することで撮影画像を生成する。 The photographed image combining means 22 is for combining the plurality of photographed images classified by depth decoded by the photographed image decoding means 21 classified by depth.
This photographed image synthesis means 22 generates a photographed image by adding a plurality of photographed images classified by depth.
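
A minimal sketch of this addition, assuming single-channel images of equal size held as lists of rows; the name `compose_layers` is hypothetical.

```python
def compose_layers(layers):
    """Add depth-based images pixel by pixel to rebuild one photographed image."""
    height = len(layers[0])
    width = len(layers[0][0])
    return [[sum(layer[y][x] for layer in layers) for x in range(width)]
            for y in range(height)]
```

Because each pixel is non-zero in at most one depth-based image, the sum restores the pixel from the section it belongs to.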

距離画像復号手段２３は、ストリーム分離手段２０で分離された距離画像符号化ストリームを復号するものである。この距離画像復号手段２３は、図１で説明した距離画像符号化手段１４の符号化方式に対応した復号処理を行う。 The distance image decoding means 23 decodes the distance image encoded stream separated by the stream separation means 20. This distance image decoding means 23 performs a decoding process corresponding to the encoding method of the distance image encoding means 14 explained with reference to FIG.

以上説明したように画像復号装置２を構成することで、画像復号装置２は、画像符号化装置１で撮影画像を奥行区間ごとに異なる量子化パラメータで符号化された符号化ストリームを復号することができる。
これによって、画像復号装置２は、符号量を多く割り当てられた奥行きで再生される立体像の解像度を高めることができる。 By configuring the image decoding device 2 as described above, the image decoding device 2 can decode the encoded stream in which the image encoding device 1 has encoded the photographed image with a different quantization parameter for each depth section.
Thereby, the image decoding device 2 can increase the resolution of the stereoscopic image reproduced at the depth to which a large amount of code is allocated.

[Operation of image decoding device]
Next, the operation of the image decoding device 2 according to the first embodiment of the present invention will be described with reference to FIG. 12 (see FIG. 11 as appropriate for the configuration).
In step S10, the stream separation means 20 separates the encoded stream into a plurality of depth-specific photographed image encoded streams and a distance image encoded stream.
In step S11, the depth-specific photographed image decoding means 21 decodes the plurality of depth-specific photographed image encoded streams separated in step S10 and generates a depth-specific photographed image for each depth section. Here, the depth-specific photographed image decoding means 21 decodes the individual streams with the plurality of decoding means 210₁, …, 210₅.
In step S12, the photographed image synthesis means 22 combines the plurality of depth-specific photographed images generated in step S11. This reproduces the photographed image from before encoding.

In step S13, the distance image decoding means 23 decodes the distance image encoded stream separated in step S10. This reproduces the distance image from before encoding.
If a further encoded stream is input (No in step S14), the image decoding device 2 returns to step S10 and continues operating.
If, on the other hand, input of the encoded stream has ended (Yes in step S14), the image decoding device 2 ends its operation.
Through the above operation, the image decoding device 2 can decode a photographed image that was encoded with a different code amount for each depth section.
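The control flow of steps S10 to S14 can be outlined as below. This is only a sketch of the ordering of the steps: the function `decode_all` and its callable parameters are hypothetical placeholders for the respective means, and the toy stand-ins at the bottom do no real decoding.

```python
def decode_all(encoded_streams, separate, decode_photo, compose, decode_distance):
    """Run steps S10 to S13 for each input stream; the loop ending is step S14."""
    results = []
    for stream in encoded_streams:                         # S14: more input?
        photo_streams, distance_stream = separate(stream)  # S10: separate
        layers = [decode_photo(s) for s in photo_streams]  # S11: decode per depth
        photo = compose(layers)                            # S12: combine layers
        distance = decode_distance(distance_stream)        # S13: decode distance
        results.append((photo, distance))
    return results


# Toy stand-ins: a "stream" is a pair (per-depth values, distance value).
frames = decode_all(
    encoded_streams=[([1, 2, 3], 7), ([4, 5, 6], 9)],
    separate=lambda s: (s[0], s[1]),
    decode_photo=lambda s: s,
    compose=sum,
    decode_distance=lambda s: s,
)
```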

<Second embodiment>
[Configuration of image encoding device]
The configuration of an image encoding device 1B according to a second embodiment of the present invention will be described with reference to FIG. 13.
The image encoding device 1B encodes a photographed image of a subject and a distance image indicating depth information of the subject. It differs from the image encoding device 1 described with reference to FIG. 1 in that the image encoding device 1B also encodes the distance image with a code amount that varies according to depth.

As shown in FIG. 13, the image encoding device 1B includes a threshold setting means 10, a depth-specific photographed image generation means 11, a code amount control information setting means 12, a depth-specific photographed image encoding means 13, a stream combining means 15B, a depth-specific distance image generation means 16, and a depth-specific distance image encoding means 17.
The threshold setting means 10, the depth-specific photographed image generation means 11, the code amount control information setting means 12, and the depth-specific photographed image encoding means 13 have the same configuration as in the image encoding device 1 described with reference to FIG. 1, so they are given the same reference numerals and their description is omitted.

The stream combining means 15B combines, into a single piece of stream data, the depth-specific photographed image encoded streams generated for each depth section by the depth-specific photographed image encoding means 13 and the depth-specific distance image encoded streams generated for each depth section by the depth-specific distance image encoding means 17, which is described later. The stream combining means 15B differs from the stream combining means 15 (FIG. 1) only in that the distance image encoded stream to be combined is replaced by a plurality of depth-specific distance image encoded streams; the basic processing is the same as that of the stream combining means 15.

The depth-specific distance image generation means 16 extracts, for each depth section delimited by the preset thresholds, the image corresponding to that depth section from the distance image, and thereby generates a plurality of depth-specific distance images.
For each depth section, the depth-specific distance image generation means 16 generates the depth-specific distance image by keeping unchanged the pixel values of the distance image that fall within the depth section and setting all other pixel values to "0".
Alternatively, the depth-specific distance image generation means 16 may generate the depth-specific distance images by multiplying the distance image by the area information (mask data) that the area segmentation means 110 generates for each depth section.
The depth-specific distance image generation means 16 outputs the generated depth-specific distance image for each depth section to the depth-specific distance image encoding means 17.
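The extraction described above might look like the following sketch, in which a distance image is a 2D list of pixel values and four thresholds delimit five depth sections. Whether a value equal to a threshold belongs to the nearer or the farther section is not stated here, so the inclusive lower bound used below is an assumption, as are the function names and sample values.

```python
def section_index(value, thresholds):
    """Index of the depth section that a distance value falls into.

    Assumption: a value equal to a threshold belongs to the upper section.
    """
    return sum(value >= t for t in thresholds)


def split_by_depth(distance, thresholds):
    """Return one image per depth section; pixels outside the section are 0."""
    sections = len(thresholds) + 1
    return [
        [[v if section_index(v, thresholds) == k else 0 for v in row]
         for row in distance]
        for k in range(sections)
    ]


distance = [[10, 120], [200, 60]]
layers = split_by_depth(distance, thresholds=[50, 100, 150, 200])
```

Because every pixel keeps its value in exactly one layer and is 0 in all others, adding the layers back together restores the original distance image.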

The depth-specific distance image encoding means 17 encodes the distance image for each depth section generated by the depth-specific distance image generation means 16. The depth-specific distance image encoding means 17 includes a plurality of encoding means 170, one for each depth section.
The encoding means 170 encodes a depth-specific distance image based on the parameter (quantization parameter) that controls the code amount for each depth section, as set by the code amount control information setting means 12. The encoding means 170 differs from the encoding means 130 only in the encoding target (a depth-specific distance image rather than a depth-specific photographed image); the basic processing is the same.
The encoding means 170 (170₁, …, 170₅) each output the depth-specific distance image encoded stream generated for their depth section to the stream combining means 15B.

By configuring the image encoding device 1B as described above, the image encoding device 1B can control the code amounts of the photographed image and the distance image for each depth section.
Note that the image encoding device 1B can be realized by running, on a computer, a program (image encoding program) that causes the computer to function as each of the means described above.
The operation of the image encoding device 1B differs from the operation of the image encoding device 1 described with reference to FIG. 10 only in that the distance image is encoded for each depth section, so its description is omitted.

[Configuration of image decoding device]
Next, the configuration of an image decoding device 2B according to the second embodiment of the present invention will be described with reference to FIG. 14.
The image decoding device 2B decodes the encoded stream generated by the image encoding device 1B (FIG. 13).

As shown in FIG. 14, the image decoding device 2B includes a stream separation means 20B, a depth-specific photographed image decoding means 21, a photographed image synthesis means 22, a depth-specific distance image decoding means 24, and a distance image synthesis means 25.
The depth-specific photographed image decoding means 21 and the photographed image synthesis means 22 have the same configuration as in the image decoding device 2 described with reference to FIG. 11, so they are given the same reference numerals and their description is omitted.

The stream separation means 20B separates the input encoded stream into a plurality of depth-specific photographed image encoded streams and a plurality of depth-specific distance image encoded streams.
The stream separation means 20B refers to the header information of the encoded stream and separates and extracts the plurality of depth-specific photographed image encoded streams and the plurality of depth-specific distance image encoded streams from it.
The stream separation means 20B outputs the separated depth-specific photographed image encoded streams to the depth-specific photographed image decoding means 21.
The stream separation means 20B also outputs the separated depth-specific distance image encoded streams to the depth-specific distance image decoding means 24.
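The layout of the header information is not specified here, so the following sketch invents a minimal one purely for illustration: each substream is preceded by a 6-byte header holding a kind flag, a depth section index, and the payload length. The names `HEADER`, `PHOTO`, `DISTANCE`, `combine`, and `separate` are all hypothetical.

```python
import struct

# Hypothetical header: kind (1 byte), depth section (1 byte), payload length (4 bytes).
HEADER = struct.Struct(">BBI")
PHOTO, DISTANCE = 0, 1


def combine(substreams):
    """Concatenate (kind, section, payload) substreams into one byte stream."""
    return b"".join(HEADER.pack(k, s, len(p)) + p for k, s, p in substreams)


def separate(stream):
    """Split a combined stream into per-section photo and distance payloads."""
    photo, distance = {}, {}
    offset = 0
    while offset < len(stream):
        kind, section, length = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        payload = stream[offset:offset + length]
        offset += length
        (photo if kind == PHOTO else distance)[section] = payload
    return photo, distance


stream = combine([(PHOTO, 0, b"p0"), (PHOTO, 1, b"p1"), (DISTANCE, 0, b"d0")])
photo, distance = separate(stream)
```

Keying the separated payloads by depth section index lets each one be routed to the decoding means for that section.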

The depth-specific distance image decoding means 24 decodes the plurality of depth-specific distance image encoded streams separated by the stream separation means 20B. The depth-specific distance image decoding means 24 includes as many decoding means 240 (240₁, …, 240₅) as the predetermined number of depth sections.

The decoding means 240 decodes the individual depth-specific distance image encoded stream for each depth section and generates a depth-specific distance image. The decoding means 240 performs a decoding process corresponding to the encoding method of the encoding means 170 described with reference to FIG. 13.
The decoding means 240 (240₁, …, 240₅) each output the depth-specific distance image generated for their depth section to the distance image synthesis means 25.

The distance image synthesis means 25 combines the plurality of depth-specific distance images decoded by the depth-specific distance image decoding means 24.
The distance image synthesis means 25 generates the distance image by adding the plurality of depth-specific distance images together.
By configuring the image decoding device 2B as described above, the image decoding device 2B can decode an encoded stream in which the image encoding device 1B has encoded the photographed image and the distance image with a different quantization parameter for each depth section.

[Modified example]
The configurations and operations of the image encoding devices 1 and 1B and the image decoding devices 2 and 2B according to the embodiments of the present invention have been described above, but the present invention is not limited to these embodiments.

(Modification 1)
In the embodiments above, the threshold setting means 10 of the image encoding devices 1 and 1B sets the thresholds as specified from outside.
However, these thresholds may instead be set in advance in, for example, the internal memory of the image encoding devices 1 and 1B. In that case, the threshold setting means 10 may be omitted from the configuration of the image encoding devices 1 and 1B, and the area segmentation means 110 may segment the area of the distance image using the preset thresholds.

(Modification 2)
In the embodiments above, the depth-specific photographed image generation means 11 of the image encoding devices 1 and 1B operates a plurality of area image generation means 111 (111₁, 111₂, …, 111₅) in parallel. Likewise, the depth-specific photographed image encoding means 13 operates a plurality of encoding means 130 (130₁, 130₂, …, 130₅) in parallel.
The depth-specific distance image encoding means 17 of the image encoding device 1B also operates a plurality of encoding means 170 (170₁, 170₂, …, 170₅) in parallel.
However, the area image generation means 111, the encoding means 130, and the encoding means 170 need not operate in parallel; each may instead be a single unit that processes the depth sections one after another.

Similarly, in the embodiments above, the depth-specific photographed image decoding means 21 of the image decoding devices 2 and 2B operates a plurality of decoding means 210 (210₁, 210₂, …, 210₅) in parallel.
The depth-specific distance image decoding means 24 of the image decoding device 2B also operates a plurality of decoding means 240 (240₁, 240₂, …, 240₅) in parallel.
However, the decoding means 210 and the decoding means 240 need not operate in parallel; each may instead be a single unit that processes the depth sections one after another.
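The two arrangements can be contrasted with a small sketch. Here `decode_section` is a hypothetical stand-in for one decoding means (it merely reverses the bytes), and a thread pool is just one possible way to run the sections in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def decode_section(stream):
    """Stand-in for one decoding means; here it only reverses the bytes."""
    return stream[::-1]


def decode_parallel(streams):
    """One worker per depth section, as in the embodiments."""
    with ThreadPoolExecutor(max_workers=len(streams)) as pool:
        return list(pool.map(decode_section, streams))


def decode_sequential(streams):
    """A single decoder reused for each depth section in turn."""
    return [decode_section(s) for s in streams]


streams = [b"abc", b"de", b"f", b"gh", b"ijk"]
assert decode_parallel(streams) == decode_sequential(streams)
```

Both variants return the results in depth-section order, so the synthesis stage that follows does not need to know which arrangement was used.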

1, 1B Image encoding device
10 Threshold setting means
11 Depth-specific photographed image generation means
110 Area segmentation means
111 Area image generation means
12 Code amount control information setting means
13 Depth-specific photographed image encoding means
14 Distance image encoding means
15, 15B Stream combining means
16 Depth-specific distance image generation means
17 Depth-specific distance image encoding means
170 Encoding means
2, 2B Image decoding device
20, 20B Stream separation means
21 Depth-specific photographed image decoding means
210 Decoding means
22 Photographed image synthesis means
23 Distance image decoding means
24 Depth-specific distance image decoding means
240 Decoding means
25 Distance image synthesis means

Claims

An image encoding device that encodes a photographed image of a subject and a distance image indicating depth information of the subject, the image encoding device comprising:
depth-specific photographed image generation means for extracting from the photographed image, for each depth section delimited by preset thresholds, the image corresponding to the depth section identified by the distance image, and generating a plurality of depth-specific photographed images;
depth-specific photographed image encoding means for encoding the plurality of depth-specific photographed images based on a parameter that controls the code amount and is preset for each depth section, and generating a plurality of depth-specific photographed image encoded streams;
distance image encoding means for encoding the distance image and generating a distance image encoded stream; and
stream combining means for generating an encoded stream in which the plurality of depth-specific photographed image encoded streams and the distance image encoded stream are combined.

An image encoding device that encodes a photographed image of a subject and a distance image indicating depth information of the subject, the image encoding device comprising:
depth-specific photographed image generation means for extracting from the photographed image, for each depth section delimited by preset thresholds, the image corresponding to the depth section identified by the distance image, and generating a plurality of depth-specific photographed images;
depth-specific photographed image encoding means for encoding the plurality of depth-specific photographed images based on a parameter that controls the code amount and is preset for each depth section, and generating a plurality of depth-specific photographed image encoded streams;
depth-specific distance image generation means for extracting from the distance image, for each depth section delimited by the thresholds, the image corresponding to the depth section, and generating a plurality of depth-specific distance images;
depth-specific distance image encoding means for encoding the plurality of depth-specific distance images based on the parameter preset for each depth section, and generating a plurality of depth-specific distance image encoded streams; and
stream combining means for generating an encoded stream in which the plurality of depth-specific photographed image encoded streams and the plurality of depth-specific distance image encoded streams are combined.

The image encoding device according to claim 1 or claim 2, wherein the photographed image is one of the individual viewpoint images constituting a multi-view image used to generate an integral-method elemental image group, and
the parameter is set to a value that allocates a larger code amount to the image of the depth section containing the position of the integral-method lens array than to the images of the other depth sections.

The image encoding device according to any one of claims 1 to 3, wherein the depth-specific photographed image generation means comprises:
area segmentation means for generating, for each depth section, mask data that segments the area of the distance image; and
area image generation means for generating the depth-specific photographed image for each depth section by multiplying the photographed image by the mask data.

An image encoding program for causing a computer to function as the image encoding device according to any one of claims 1 to 4.

An image decoding device that decodes an encoded stream in which a plurality of depth-specific photographed image encoded streams, obtained by encoding a photographed image of a subject divided by depth, are combined with a distance image encoded stream, obtained by encoding a distance image indicating depth information of the subject, the image decoding device comprising:
stream separation means for separating the encoded stream into the plurality of depth-specific photographed image encoded streams and the distance image encoded stream;
depth-specific photographed image decoding means for decoding the plurality of depth-specific photographed image encoded streams and generating a plurality of depth-specific photographed images;
photographed image synthesis means for combining the plurality of depth-specific photographed images and generating the photographed image; and
distance image decoding means for decoding the distance image encoded stream and generating the distance image.

An image decoding device that decodes an encoded stream in which a plurality of depth-specific photographed image encoded streams, obtained by encoding a photographed image of a subject divided by depth, are combined with a plurality of depth-specific distance image encoded streams, obtained by dividing a distance image indicating depth information of the subject into the same depth sections as the photographed image and encoding it, the image decoding device comprising:
stream separation means for separating the encoded stream into the plurality of depth-specific photographed image encoded streams and the plurality of depth-specific distance image encoded streams;
depth-specific photographed image decoding means for decoding the plurality of depth-specific photographed image encoded streams and generating a plurality of depth-specific photographed images;
photographed image synthesis means for combining the plurality of depth-specific photographed images and generating the photographed image;
depth-specific distance image decoding means for decoding the plurality of depth-specific distance image encoded streams and generating a plurality of depth-specific distance images; and
distance image synthesis means for combining the plurality of depth-specific distance images and generating the distance image.

An image decoding program for causing a computer to function as the image decoding device according to claim 6 or claim 7.