WO2013077304A1 - Image coding device, image decoding device, and methods and programs thereof


Info

Publication number
WO2013077304A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
block
unit
encoding
prediction
Prior art date
Application number
PCT/JP2012/080019
Other languages
French (fr)
Japanese (ja)
Inventor
大津 誠 (Makoto Otsu)
内海 端 (Tadashi Uchiumi)
山本 貴也 (Takaya Yamamoto)
Original Assignee
シャープ株式会社 (Sharp Kabushiki Kaisha)
Priority date
Filing date
Publication date
Application filed by シャープ株式会社 (Sharp Kabushiki Kaisha)
Publication of WO2013077304A1 publication Critical patent/WO2013077304A1/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/593: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H04N 19/597: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • luminance compensation is performed on the reference block specified by the ID indicating the reference image and by the disparity vector, using Expression (1). The residual component is then added to the luminance-compensated image to reproduce the original image. Color compensation can be realized by a method similar to luminance compensation; the offset parameter in color compensation is written C_CC and the scale parameter S_CC.
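Expression (1) itself is not reproduced in this extract; as a reading aid, here is a minimal sketch of the scale-and-offset compensation model the surrounding text describes. All names, and the clipping to an 8-bit range, are illustrative assumptions rather than the patent's definitions.

```python
import numpy as np

def compensate(ref_block: np.ndarray, scale: float, offset: float) -> np.ndarray:
    """Scale-and-offset compensation in the style of Expression (1).

    The same linear form serves for luminance (scale S, offset C) and for
    color (S_CC, C_CC); the 0..255 clipping range is an assumption.
    """
    return np.clip(scale * ref_block.astype(np.float64) + offset, 0, 255)

# Decoder side: add the transmitted residual to the compensated block to
# reproduce (an approximation of) the original image block.
# reconstructed = compensate(ref_block, S, C) + residual
```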
  • MPEG-3DV is an MPEG ad hoc group developing a coding system for multi-view video and the corresponding viewpoint depth video.
  • the depth image is information representing the distance from the camera to the subject, and as a generation method, for example, there is a method of obtaining from a device that measures the distance installed in the vicinity of the camera.
  • a depth image can be generated by analyzing an image taken from a multi-viewpoint camera.
  • the decoder 907 receives the output result of the encoder 906 that is transmitted via the network N or directly, decodes it, and outputs a decoded image and a decoded depth image.
  • the display unit 908 receives the decoded image and the decoded depth image from the decoder 907 as input, and displays the decoded image, or displays the decoded image after performing processing using the depth image.
  • the present invention has been made in view of the above-described circumstances, and an object of the present invention is to enable inter-image characteristic difference compensation processing to be carried out with high accuracy at all times, without requiring any additional information beyond that necessary for parallax compensation prediction in the parallax compensation prediction method.
  • another object of the present invention is to perform inter-image characteristic difference compensation processing with high accuracy at all times in the coding system for multi-view video and corresponding viewpoint depth video being developed by MPEG-3DV, and thereby to perform parallax compensation prediction with high efficiency.
  • the first technical means of the present invention is an image coding device that corrects a difference in characteristics between viewpoint images and performs parallax-compensated encoding when encoding viewpoint images taken from at least two viewpoints, comprising: a corresponding block extraction unit that extracts a reference block to be referred to when encoding an encoding target block; and a correction processing unit that corrects the difference in characteristics between the viewpoint images based on the similarity between blocks around the encoding target block and blocks around the reference block.
  • the correction processing unit is characterized by performing correction while excluding peripheral blocks in which a subject different from the subject in the encoding target block is captured.
  • the correction processing unit is characterized by selecting the one peripheral block most likely to show the same subject as that shown in the encoding target block, and performing the correction based on it.
  • the sixth technical means is characterized in that, in the fifth technical means, the depth information is information based on a representative value of a block obtained by dividing a depth image.
  • the correction processing unit is characterized by executing correction while excluding peripheral blocks in which a subject different from the subject in the decoding target block is shown.
  • the correction processing section is characterized by selecting the one peripheral block most likely to show the same subject as that shown in the decoding target block, and executing the correction based on it.
  • the eleventh technical means is characterized in that, in the ninth or tenth technical means, depth information corresponding to each viewpoint image is used for determination of the subject.
  • the twelfth technical means is characterized in that, in the eleventh technical means, the depth information is information based on a representative value of a block obtained by dividing a depth image.
  • the fifteenth technical means causes a computer to execute an image encoding process that performs parallax compensation while correcting a characteristic difference between viewpoint images when encoding viewpoint images taken from at least two viewpoints.
  • An image encoding program comprising a step of extracting a reference block to be referred to when encoding an encoding target block, and a step of correcting the characteristic difference between the viewpoint images based on the similarity between blocks around the encoding target block and blocks around the reference block.
  • FIG. 3 is a block diagram illustrating a configuration example of a motion / disparity compensation unit in the image encoding unit of FIG. 2.
  • FIG. 4 is a diagram for conceptually explaining the block positions extracted by the corresponding block extraction unit in the motion/disparity compensation unit of FIG. 3, i.e., a conceptual diagram showing the block positions handled in the inter-image characteristic difference compensation process. FIG. 5 is a conceptual diagram of the representative depth value determination process. FIG. 6 is a flowchart for explaining the image coding process performed by the image coding apparatus of FIG. 1.
  • the viewpoint image is first input to the image encoding unit 101.
  • the reference viewpoint encoding processing unit 102 compresses and encodes the viewpoint image of the reference viewpoint using the intra-view prediction encoding method.
  • in intra-view prediction encoding, intra-screen prediction or motion compensation is performed within the same viewpoint, and the image data is compression-encoded based only on image data within that viewpoint.
  • the reverse processing, that is, internal decoding, is also performed so that the restored image signal can be referred to when encoding a viewpoint image of a non-reference viewpoint (described later).
  • the non-reference viewpoint encoding processing unit 103 compresses and encodes the viewpoint image of the non-reference viewpoint using an inter-view prediction encoding method.
  • in the inter-view prediction encoding method, parallax compensation is performed using an image of a viewpoint different from the encoding target image, and the image data is compression-encoded.
  • the non-reference viewpoint encoding processing unit 103 can also select an intra-view prediction encoding method that uses only image data within the viewpoint based on the overall encoding efficiency.
  • the encoded data encoded by the reference viewpoint encoding processing unit 102 and the encoded data encoded by the non-reference viewpoint encoding processing unit 103 are output from the image encoding unit 101.
  • the code construction unit 104 receives the encoded data from the image encoding unit 101, concatenates and rearranges it, and outputs the resulting encoded stream to the outside of the image encoding device 100 (for example, to an image decoding device 700 described later).
  • FIG. 2 is a block diagram illustrating a configuration example of the image encoding unit 101.
  • the image coding unit 101 includes an image input unit 201, a subtraction unit 202, an orthogonal transformation unit 203, a quantization unit 204, an entropy coding unit 205, an inverse quantization unit 206, an inverse orthogonal transformation unit 207, an addition unit 208, a prediction method control unit 209, an image selection unit 210, a deblocking filter unit 211, a frame memory 212, a motion/disparity compensation unit 213, a motion/disparity vector detection unit 214, and an intra prediction unit 215.
  • although in FIG. 1 the reference viewpoint encoding process and the non-reference viewpoint encoding process are explicitly divided into the reference viewpoint encoding processing unit 102 and the non-reference viewpoint encoding processing unit 103, in practice many processes are common to both, so a mode in which the reference viewpoint encoding process and the non-reference viewpoint encoding process are integrated will be described below.
  • the processing performed in the inter-screen prediction unit 218 that refers to an image of the same viewpoint as the encoding target viewpoint (motion compensation) and the processing that refers to an image of a different viewpoint (parallax compensation) differ only in the image referred to during encoding, and can be made common by using ID information (reference viewpoint number, reference frame number) indicating the reference image.
  • the method of encoding the residual component between the image predicted by each of the prediction units 217 and 218 and the input viewpoint image is common to both the reference viewpoint and the non-reference viewpoint; details will be described later.
  • the image input unit 201 outputs the divided image block signal to the subtraction unit 202, the intra prediction unit 215 in the intra-screen prediction unit 217, and the motion / disparity vector detection unit 214 in the inter-screen prediction unit 218.
  • the intra-screen prediction unit 217 is a processing unit that performs encoding using only information in the same screen that has been processed before the encoding target block, and details of the processing will be described later.
  • the inter-screen prediction unit 218 is a processing unit that performs encoding using information of viewpoint images of the same viewpoint processed in the past, or of viewpoint images of different viewpoints, other than the encoding target image itself; details of the processing will be described later.
  • the image input unit 201 repeatedly outputs blocks, sequentially changing the block position, until all blocks in the image frame have been processed and all input images have been processed.
  • the block size when the image input unit 201 divides the image signal is not limited to the 16 ⁇ 16 size described above, and may be 8 ⁇ 8, 4 ⁇ 4, or the like.
  • the number of vertical and horizontal pixels need not be equal, and 16×8, 8×16, 8×4, 4×8, and the like are also possible. Examples of these sizes are described in H.264.
  • the block size is not limited to the above size, and can correspond to an arbitrary block size that will be employed in a future encoding scheme.
  • the orthogonal transform unit 203 performs orthogonal transform on the difference image block signal input from the subtracting unit 202 to generate signals indicating the strengths of various frequency characteristics.
  • the orthogonal transform unit 203 applies, for example, a DCT (Discrete Cosine Transform) to the difference image block signal to produce a frequency-domain signal (for example, DCT coefficients). As long as a frequency-domain signal can be generated, the transform is not limited to the DCT; another method such as the FFT (Fast Fourier Transform) may also be used.
  • the orthogonal transform unit 203 outputs the coefficient value included in the generated frequency domain signal to the quantization unit 204.
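As an illustration of the transform step, here is a minimal 2-D DCT round trip on one difference block, using SciPy; this is a generic sketch, not the patent's implementation, and any orthogonal transform with an exact inverse would serve equally.

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block: np.ndarray) -> np.ndarray:
    # Separable 2-D DCT-II with orthonormal scaling: the frequency-domain signal.
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(coeffs: np.ndarray) -> np.ndarray:
    # Inverse 2-D DCT, i.e. the reverse process used by the inverse transform unit.
    return idct(idct(coeffs, axis=0, norm="ortho"), axis=1, norm="ortho")

diff_block = np.random.randint(-32, 32, (16, 16)).astype(np.float64)
coeffs = dct2(diff_block)                      # DCT coefficients
assert np.allclose(idct2(coeffs), diff_block)  # lossless before quantization
```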
  • the inverse quantization unit 206 decodes the difference image code input from the quantization unit 204 by performing the process inverse to the quantization performed by the quantization unit 204 (inverse quantization) using the quantization coefficient, generates a decoded frequency-domain signal, and outputs it to the inverse orthogonal transform unit 207.
  • the inverse orthogonal transform unit 207 generates a decoded differential image block signal that is a spatial domain signal by performing a process reverse to the orthogonal transform unit 203, for example, inverse DCT transform, on the input decoded frequency domain signal.
  • as long as the inverse orthogonal transform unit 207 can generate a spatial-domain signal from the decoded frequency-domain signal, it is not limited to the inverse DCT; another method such as the IFFT (Inverse Fast Fourier Transform) may be used.
  • the addition unit 208 receives the predicted image block signal from the image selection unit 210 and also receives the decoded difference image block signal from the inverse orthogonal transform unit 207.
  • the adder 208 adds the decoded differential image block signal to the predicted image block signal, and generates a reference image block signal obtained by encoding / decoding the input image (internal decoding).
  • the reference image block signal is output to the intra-screen prediction unit 217 and the inter-screen prediction unit 218.
  • the intra-screen prediction unit 217 receives the reference image block signal from the adder 208 and the image block signal of the encoding target image from the image input unit 201, performs intra-screen prediction in a predetermined direction, and outputs the resulting intra-screen prediction image block signal to the prediction method control unit 209 and the image selection unit 210.
  • the intra-screen prediction unit 217 generates information indicating the direction of prediction necessary for generating the intra-screen prediction image block signal as intra-screen prediction coding information, and outputs the information to the prediction method control unit 209.
  • the intra-screen prediction is performed according to a conventional intra-screen prediction method (for example, H.264 Reference Software JM ver. 13.2 Encoder, http://iphome.hhi.de/suehring/tml/, 2008). Note that these processes of the intra-screen prediction unit 217 are executed by the intra prediction unit 215, as in the configuration described above.
  • the inter-screen prediction unit 218 receives the reference image block signal from the addition unit 208 and the image block signal of the encoding target image from the image input unit 201, and outputs the inter-screen prediction image block signal generated by inter-screen prediction to the prediction method control unit 209 and the image selection unit 210.
  • the inter-screen prediction unit 218 generates, as inter-screen prediction encoding information, the information necessary for generating an inter-screen prediction image block signal (reference image information consisting of a reference viewpoint image number and a reference frame number, and the difference vector between the motion/disparity vector and the prediction vector), and outputs the generated inter-screen prediction encoding information to the prediction scheme control unit 209. Details of the inter-screen prediction unit 218 will be described later.
  • the prediction scheme control unit 209 determines the prediction method for each block based on the picture type and coding efficiency of the input image, the intra-screen prediction image block signal and intra-screen prediction encoding information input from the intra-screen prediction unit 217, and the inter-screen prediction image block signal and inter-screen prediction encoding information input from the inter-screen prediction unit 218, and outputs information on the prediction method to the image selection unit 210.
  • the picture type of the input image is information identifying which images the encoding target image may refer to for prediction, and includes the I picture, P picture, B picture, and so on. Like the quantization coefficient, it is determined by a parameter given from the outside, and the same method as in conventional MVC can be used.
  • the prediction method control unit 209 monitors the picture type of the input image; if the encoding target image is an I picture, which can refer only to information within its own screen, the intra-screen prediction method is selected deterministically. In the case of a P picture, which can refer to already-encoded past frames or to images of different viewpoints, or of a B picture, which can additionally refer to already-encoded future frames (future in display order but processed in the past) and to images of different viewpoints, the prediction method control unit 209 calculates the Lagrangian cost from the number of bits generated by the encoding performed by the entropy encoding unit 205 and the residual between the original image and the prediction computed by the subtraction unit 202, using a conventional method (e.g., H.264 Reference Software JM ver. 13.2 Encoder, http://iphome.hhi.de/suehring/tml/, 2008), and thereby determines whether the intra-screen or the inter-screen prediction method is used.
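The Lagrangian mode decision can be sketched as follows; the form J = D + λR is standard in rate-distortion optimization, but the numbers and the choice of λ here are illustrative, not the JM reference values.

```python
def lagrangian_cost(distortion: float, bits: int, lam: float) -> float:
    # J = D + lambda * R: trade distortion against the bits the mode costs.
    return distortion + lam * bits

def choose_mode(candidates, lam):
    """candidates: iterable of (mode_name, distortion, bits) tuples.
    Returns the candidate with the minimum Lagrangian cost."""
    return min(candidates, key=lambda c: lagrangian_cost(c[1], c[2], lam))

# Hypothetical numbers: inter prediction wins despite costing more bits.
best_mode = choose_mode([("intra", 1200.0, 300), ("inter", 700.0, 420)], lam=1.0)
```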
  • the prediction method control unit 209 adds information that can specify the selected prediction method to the coding information corresponding to that method (the intra-screen prediction coding information or the inter-screen prediction coding information), and outputs it to the entropy encoding unit 205 as prediction encoding information.
  • the image selection unit 210 selects, according to the prediction method information input from the prediction method control unit 209, either the intra-screen prediction image block signal input from the intra-screen prediction unit 217 or the inter-screen prediction image block signal input from the inter-screen prediction unit 218, and outputs the selected predicted image block signal to the subtraction unit 202 and the addition unit 208.
  • when the prediction method input from the prediction method control unit 209 is intra-screen prediction, the image selection unit 210 selects and outputs the intra-screen prediction image block signal input from the intra-screen prediction unit 217; when the prediction method is inter-screen prediction, the inter-screen prediction image block signal input from the inter-screen prediction unit 218 is selected and output.
  • the entropy encoding unit 205 packs together the differential image code input from the quantization unit 204, the quantization coefficient used by the quantization unit 204, and the prediction encoding information input from the prediction scheme control unit 209, and encodes them using, for example, variable-length coding (entropy coding) to generate encoded data in which the amount of information is further compressed.
  • the entropy encoding unit 205 outputs the generated encoded data to the code configuration unit 104, and the code configuration unit 104 then outputs the encoded stream to the outside of the image encoding device 100 (for example, to the image decoding device 700).
  • the deblocking filter unit 211 receives the reference image block signal from the adder unit 208 and performs filter processing that reduces the block distortion occurring during image coding (for example, as in H.264 Reference Software JM ver. 13.2 Encoder, http://iphome.hhi.de/suehring/tml/, 2008).
  • the deblocking filter unit 211 outputs the processing result (correction block signal) to the frame memory 212.
  • the frame memory 212 receives the correction block signal from the deblocking filter unit 211, and holds the correction block signal as a part of the image together with information that can identify the viewpoint number and the frame number.
  • the frame memory 212 manages the picture type or the image order of the input image by a memory management unit (not shown), and stores or discards the image according to the instruction. For image management, a conventional MVC image management method can also be used.
  • the motion/disparity vector detection unit 214 searches the image group stored in the frame memory 212, by the block matching described later, for a block similar to the image block signal input from the image input unit 201, and generates vector information indicating the found block together with a viewpoint number and a frame number.
  • the vector information is called a motion vector when the referenced image is of the same viewpoint as the encoding target image, and a disparity vector when the referenced image is of a different viewpoint.
  • the motion/disparity vector detection unit 214 calculates an index value between the divided block and each candidate region, and searches for the region where the calculated index value is minimum. Any index indicating the correlation or similarity between the image signals may be used; the motion/disparity vector detection unit 214 uses, for example, the sum of absolute differences (SAD) between the luminance values of the pixels in the divided block and the luminance values in a region of the reference image.
  • the SAD between a block (for example, of N×N pixels) divided from the input viewpoint image signal and a block of the reference image signal is expressed by the following equation (2):

$$\mathrm{SAD}(p,q)=\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}\bigl|I_{\mathrm{in}}(i_0+i,\,j_0+j)-I_{\mathrm{ref}}(i_0+i+p,\,j_0+j+q)\bigr| \qquad (2)$$

where $I_{\mathrm{in}}(i_0+i,\,j_0+j)$ is the luminance value at coordinates $(i_0+i,\,j_0+j)$ of the input image, $(i_0, j_0)$ are the coordinates of the upper-left corner of the divided block, $I_{\mathrm{ref}}(i_0+i+p,\,j_0+j+q)$ is the luminance value at coordinates $(i_0+i+p,\,j_0+j+q)$ of the reference image, and $(p, q)$ is the shift relative to the upper-left corner of the divided block (the motion vector candidate).
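A direct transcription of equation (2) with an exhaustive full search, as a sketch; the search range and the assumption that all accesses stay inside the images are illustrative.

```python
import numpy as np

def sad(input_img, ref_img, i0, j0, p, q, n=16):
    """Equation (2): SAD between the n x n block at (i0, j0) of the input
    image and the block displaced by (p, q) in the reference image."""
    blk_in = input_img[i0:i0 + n, j0:j0 + n].astype(np.int64)
    blk_ref = ref_img[i0 + p:i0 + p + n, j0 + q:j0 + q + n].astype(np.int64)
    return int(np.abs(blk_in - blk_ref).sum())

def full_search(input_img, ref_img, i0, j0, search=8, n=16):
    # Exhaustive block matching: return the (p, q) minimizing the SAD.
    best_cost, best_pq = None, None
    for p in range(-search, search + 1):
        for q in range(-search, search + 1):
            cost = sad(input_img, ref_img, i0, j0, p, q, n)
            if best_cost is None or cost < best_cost:
                best_cost, best_pq = cost, (p, q)
    return best_pq
```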
  • when the reference vector input from the motion/disparity vector detection unit 214 is a disparity vector, the motion/disparity compensation unit 213 simultaneously receives from the frame memory 212 the already-processed image block signals around the encoding target block and the corresponding image block signals around the reference image block (the positions of these image blocks are described later). The motion/disparity compensation unit 213 calculates an index value indicating the correlation between the corresponding peripheral blocks of the encoding target image and the reference image, and when it determines that the image blocks show the same subject, it performs the inter-image characteristic difference compensation (details described later). The motion/disparity compensation unit 213 outputs the processed image block to the prediction method control unit 209 and the image selection unit 210 as an inter-screen prediction image block signal.
  • when the reference vector is a motion vector, the motion/disparity compensation unit 213 does not perform the peripheral-block input and inter-image characteristic difference compensation processing described above; the reference image block extracted from the frame memory 212 based on the motion vector, reference image number, and frame number is output as it is to the prediction method control unit 209 and the image selection unit 210 as the inter-screen prediction image block signal.
  • the motion/disparity compensation unit 213 further calculates a difference vector by subtracting, from the motion/disparity vector obtained by block matching, the prediction vector generated based on the motion/disparity vectors adopted by the already-encoded blocks adjacent to the encoding target block.
  • the prediction vector is obtained by taking the median of the horizontal and vertical components of the vectors of the block above the encoding target block, the block adjacent to its upper right, and the block adjacent to its left; as the method of calculating the prediction vector, the method adopted in MVC can be used.
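The MVC-style median predictor can be sketched as follows; representing vectors as (horizontal, vertical) pairs is an assumption for illustration.

```python
import numpy as np

def prediction_vector(vec_above, vec_upper_right, vec_left):
    """Component-wise median of the vectors of the block above, the block
    to the upper right, and the block to the left of the target block."""
    xs = [vec_above[0], vec_upper_right[0], vec_left[0]]
    ys = [vec_above[1], vec_upper_right[1], vec_left[1]]
    return (int(np.median(xs)), int(np.median(ys)))

# The difference vector actually encoded is then:
# dv = (mv[0] - pv[0], mv[1] - pv[1])
```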
  • the motion/disparity compensation unit 213 outputs the difference vector and the reference image information (reference viewpoint number, reference frame number) to the prediction scheme control unit 209 as inter-frame coding information. Note that at least the reference viewpoint image number and the reference frame number must be the same for the region most similar to the input image block detected by block matching and for the region indicated by the prediction vector.
  • FIG. 3 is a block diagram illustrating a configuration example inside the motion / disparity compensation unit 213.
  • the motion / disparity compensation unit 213 includes a corresponding block extraction unit 301, a correction coefficient calculation unit 302, and a correction processing unit 303.
  • the motion / disparity vector, reference viewpoint number, and reference frame number input from the motion / disparity vector detection unit 214 are input to the corresponding block extraction unit 301.
  • the corresponding block extraction unit 301 extracts a corresponding image block (reference image block) from the frame memory 212 based on the input motion / disparity vector, reference viewpoint number, and reference frame number.
  • FIG. 4 is a diagram for conceptually explaining the block positions extracted by the corresponding block extraction unit 301.
  • an encoding target image 401 indicates an encoding target viewpoint image
  • a reference image 402 indicates an image referred to by the encoding target image 401.
  • the reference image 402 is an image in which the entire image has already been encoded and decoded and stored in the frame memory 212, and is a viewpoint image with a different viewpoint from the encoding target image 401.
  • encoding / decoding has been completed up to the block immediately before the encoding target block position.
  • the block on which the encoding process is being performed is the encoding target block 403, and the vector 407 extending therefrom is a disparity vector input from the motion / disparity vector detection unit 214.
  • a specific block (reference block 405) of the reference image is specified by the disparity vector, the reference image number, and the reference frame number.
  • the encoding target block peripheral image block 404 is a block above the encoding target block 403 (adjacent block A), an upper right block (adjacent block B), and a left block (adjacent block C).
  • the reference block peripheral image block 406 is a block above the reference block 405 (adjacent block A ′), an upper right block (adjacent block B ′), and a left block (adjacent block C ′).
  • the relative position, as seen from the encoding target block 403, of each block 404 around the encoding target block corresponds to the relative position, as seen from the reference block 405, of the corresponding block 406 around the reference block 405 extracted by the corresponding block extraction unit 301 as the reference source of the encoding target block 403.
  • this correspondence relationship may be based uniformly on the disparity vector of the encoding target block 403 for each of the pairs of adjacent blocks A and A′, B and B′, and C and C′.
  • alternatively, the correspondence relationship may be based on the disparity vectors of the encoding target block 403 and of its adjacent blocks A, B, and C individually; for example, the relationship between the relative position of adjacent block A with respect to the encoding target block 403 and the relative position of adjacent block A′ with respect to the reference block 405 may differ from the relationship between the relative position of adjacent block B with respect to the encoding target block 403 and the relative position of adjacent block B′ with respect to the reference block 405.
  • when the reference vector input from the motion/disparity vector detection unit 214 is a disparity vector, the corresponding block extraction unit 301 extracts the image signals of the peripheral blocks (encoding target block peripheral image blocks) 404 of the encoding target block 403 and the image signals of the peripheral blocks (reference block peripheral image blocks) 406 of the reference block 405, and outputs them to the correction coefficient calculation unit 302.
  • an index value is calculated between the corresponding adjacent blocks, and the determination is performed based on the obtained index value. As in the block matching described above, the sum of absolute differences of luminance values, as shown in equation (3), can be used as the index value:

$$\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}\bigl|I_{\mathrm{target}}(i_{\mathrm{block}}+i,\,j_{\mathrm{block}}+j)-I_{\mathrm{ref}}(i_{\mathrm{block}}+i,\,j_{\mathrm{block}}+j)\bigr| \qquad (3)$$

where $I_{\mathrm{target}}$ and $I_{\mathrm{ref}}$ denote the luminance values of the encoding target image and of the reference image, respectively, and $(i_{\mathrm{block}}, j_{\mathrm{block}})$ are the coordinates of the upper-left corner of adjacent block A, B, or C in the encoding target image and of the corresponding adjacent block in the reference image. This index value is calculated for each adjacent block of the encoding target block and the corresponding adjacent block of the reference image, and similarity determination is executed.
  • the value “20” can be used as the threshold value.
  • when the index value expressed by equation (3) is smaller than this threshold, the similarity of the individual block pair is determined to be high. When the similarity has been calculated for each of the blocks A, B, and C and is determined to be high for all of them, the calculation of a correction coefficient and the correction processing described later are performed.
  • the threshold value used here must be transmitted to the image decoding device (if a fixed threshold is used for the entire image, it suffices to transmit it once at the start of processing) so that the same similarity determination result is obtained at the time of decoding.
  • the threshold may be common to the entire image, or a method of switching it adaptively in consideration of the encoding efficiency of the image may be employed; in the adaptive switching method, all the threshold information must be transmitted to the image decoding apparatus.
  • in the example above, correction processing is performed when the similarity is determined to be high for all of the blocks A, B, and C, but the present invention is not limited to this. For example, correction processing may be performed when the similarity is high for any one or two blocks (or for one or two blocks at predetermined positions), and the number of blocks used in the determination (or the positions of those blocks) may be transmitted to the image decoding apparatus together with the threshold value.
  • although the sum of absolute differences of luminance values is used here as the index for determining similarity, any index that can determine the similarity of corresponding blocks may be used.
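A sketch of the similarity gate, assuming the equation (3) index; whether the example threshold of 20 applies to the raw block SAD or to a normalized (per-pixel) index is not stated in this extract, so treat the comparison as illustrative.

```python
import numpy as np

def block_sad(target_blk: np.ndarray, ref_blk: np.ndarray) -> int:
    # Equation (3)-style index: sum of absolute luminance differences.
    return int(np.abs(target_blk.astype(np.int64) - ref_blk.astype(np.int64)).sum())

def all_neighbours_similar(target_nbrs: dict, ref_nbrs: dict, threshold=20) -> bool:
    """target_nbrs / ref_nbrs map 'A', 'B', 'C' to the adjacent blocks of the
    encoding target block and of the reference block. Correction is enabled
    only if every corresponding pair is judged similar."""
    return all(block_sad(target_nbrs[k], ref_nbrs[k]) < threshold
               for k in ("A", "B", "C"))
```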
  • when the similarity determination described above indicates that correction processing is to be performed, correction using Expression (1) employed in the conventional technique could be executed; in this embodiment, however, the correction coefficient calculation unit 302 calculates correction coefficients as shown in equation (4). As the correction coefficients, an offset of the average luminance value (offset_Y) and offsets of the average chrominance values (offset_U, offset_V) are introduced:

$$\mathrm{offset}_Y=\overline{Y}_{\mathrm{target}}-\overline{Y}_{\mathrm{ref}},\qquad \mathrm{offset}_U=\overline{U}_{\mathrm{target}}-\overline{U}_{\mathrm{ref}},\qquad \mathrm{offset}_V=\overline{V}_{\mathrm{target}}-\overline{V}_{\mathrm{ref}} \qquad (4)$$

where the overlines denote averages taken over the peripheral blocks of the encoding target block and over the corresponding peripheral blocks of the reference block, respectively. The luminance I used in equation (3) and Y used in equation (4) are the same.
  • the correction coefficient calculation unit 302 outputs the offset value calculated in this way to the correction processing unit 303.
  • the correction processing unit 303 receives the reference image block signal from the corresponding block extraction unit 301 and performs correction processing based on the correction coefficient input from the correction coefficient calculation unit 302.
  • when it is determined in the similarity determination that correction is not to be performed, the correction processing unit 303 does not perform the correction process, and outputs the reference block signal input from the corresponding block extraction unit 301 as it is.
  • the correction process is performed as shown in the following equation (5):

$$Y'(x,y)=Y(x,y)+\mathrm{offset}_Y,\qquad U'(x,y)=U(x,y)+\mathrm{offset}_U,\qquad V'(x,y)=V(x,y)+\mathrm{offset}_V \qquad (5)$$

where Y(x, y), U(x, y), and V(x, y) are the pixel values of the reference image block input from the corresponding block extraction unit 301.
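Equations (4) and (5) together amount to mean-matching offsets applied per component; a sketch under that reading (representing blocks as (Y, U, V) triples of arrays, and clipping to 0..255, are assumptions).

```python
import numpy as np

def correction_offsets(target_nbrs, ref_nbrs):
    """Equation (4): one offset per component, as the difference of the mean
    values over the peripheral blocks of the target and of the reference.
    target_nbrs / ref_nbrs: lists of corresponding (Y, U, V) array triples."""
    offsets = []
    for comp in range(3):  # 0: Y, 1: U, 2: V
        t_mean = np.mean([blk[comp].mean() for blk in target_nbrs])
        r_mean = np.mean([blk[comp].mean() for blk in ref_nbrs])
        offsets.append(t_mean - r_mean)
    return offsets  # (offset_Y, offset_U, offset_V)

def apply_correction(ref_block_yuv, offsets):
    # Equation (5): add the per-component offset to every pixel of the
    # reference block; the clipping range is an assumption.
    return [np.clip(plane.astype(np.float64) + off, 0, 255)
            for plane, off in zip(ref_block_yuv, offsets)]
```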
  • the motion/disparity compensation unit 213 outputs the image block signal output from the correction processing unit 303 to the prediction method control unit 209 and the image selection unit 210 as the inter-screen prediction image block signal, and at the same time outputs the difference vector calculated from the reference vector input from the motion/disparity vector detection unit 214, together with the reference image information, to the prediction method control unit 209.
  • in the image encoding device 100, a characteristic difference between viewpoint images (that is, a characteristic difference between images captured from different viewpoints) is corrected and parallax compensation is performed.
  • the motion / parallax compensation unit 213 performs this parallax compensation processing.
  • the parallax compensation itself is a process that takes into account the redundancy between the viewpoints of different viewpoint images.
  • the correction processing unit 303 of the motion/disparity compensation unit 213 corrects the difference in image characteristics between the viewpoint images (between viewpoints) based on the similarity between the blocks around the encoding target block and the blocks around the reference block indicated by the disparity vector.
  • as another correction coefficient calculation method, there is a method of using depth information corresponding to the image and excluding, from the correction coefficient calculation, blocks presumed to contain a subject different from that of the encoding target block. In other words, the correction processing unit 303 performs the correction while excluding peripheral blocks in which a subject different from the subject in the encoding target block is captured.
  • the depth information is image information indicating the distance to the subject shown in the image.
  • the depth information to be used is depth information about the encoding target block and its surrounding blocks, and the reference block and its surrounding blocks.
  • the depth image handled by the above-mentioned MPEG-3DV can be used in this example.
  • a depth image corresponding to the encoding target image is input from the outside of the image encoding apparatus 100, and representative depth values are calculated for the positions of the encoding target block and of its peripheral image blocks. Based on the calculated values, any peripheral block whose representative depth value differs from that of the encoding target block by more than a predetermined value is excluded from the correction coefficient calculation. Specifically, processing is performed using the following equation (6), in which FLG(·) is a flag that determines whether the peripheral block at a given position is excluded from the correction coefficient calculation and is given as follows:

$$\mathrm{FLG}(BK)=\begin{cases}1, & |D_{\mathrm{cur}}-D(BK)|\le TH_D\\ 0, & \text{otherwise}\end{cases}$$

where BK indicates the position of the adjacent peripheral block, $D_{\mathrm{cur}}$ is the representative depth value of the encoding target block, $D(BK)$ is the representative depth value of the peripheral block BK, and $TH_D$ is the determination threshold (the predetermined value); only peripheral blocks with FLG(BK) = 1 contribute to the correction coefficients of equation (4).
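The exclusion flag is straightforward to express in code; a sketch following the FLG(·) definition above (the dictionary layout of neighbours is an assumption).

```python
def flg(d_cur: float, d_bk: float, th_d: float) -> int:
    """Flag of equation (6): 1 keeps peripheral block BK in the correction
    coefficient calculation, 0 excludes it as showing a different subject."""
    return 1 if abs(d_cur - d_bk) <= th_d else 0

def included_blocks(d_cur: float, d_neighbours: dict, th_d: float) -> list:
    # d_neighbours maps 'A', 'B', 'C' to the representative depths D(BK).
    return [bk for bk, d in d_neighbours.items() if flg(d_cur, d, th_d)]
```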
  • the representative depth value for each divided block can be calculated, for example, by the following method: a frequency distribution (histogram) of the depth values in the block is created, and the depth value with the highest appearance frequency is extracted and determined as the representative value.
  • FIG. 5 shows a conceptual diagram of the representative depth value determination process.
  • suppose the depth image 502 illustrated in FIG. 5B is given as the depth image corresponding to the viewpoint image 501 illustrated in FIG. 5A. The depth image is represented as a monochrome image with luminance only. In the histogram for the block 503, the depth value 505 having the highest appearance frequency is determined as the representative depth value of the block 503.
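The histogram-mode rule is simple to state in code; a sketch assuming integer-valued depth samples.

```python
import numpy as np

def representative_depth(depth_block: np.ndarray) -> int:
    """Representative value of a depth block, as in FIG. 5: build the
    frequency distribution (histogram) of the depth values and return the
    value with the highest appearance frequency (the mode)."""
    values, counts = np.unique(depth_block, return_counts=True)
    return int(values[np.argmax(counts)])
```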
  • in addition to the histogram-based method described above, the representative depth value may be determined by one of the following methods: (a) the median of the in-block depth values; (b) an average weighted by the appearance frequency of the in-block depth values; (c) the value closest to the camera among the in-block depth values (the maximum in-block depth value); (d) the value farthest from the camera among the in-block depth values (the minimum in-block depth value); or (e) the depth value at the center position of the block. Any of these may be extracted and determined as the representative value.
  • as a criterion for selecting among these methods, for example, the most efficient method may be fixed in advance as a method common to encoding and decoding, or parallax prediction using the depth representative value obtained by each method may be tried and the most efficient method selected.
  • it is not necessary to prepare the most-frequent depth value 505 described above together with all of (a) to (e); it suffices to prepare at least two of them to select between.
  • the block size for dividing the depth image may be matched to the divided block size of the image.
  • the size is not limited to 16 ⁇ 16 size, and may be 8 ⁇ 8, 4 ⁇ 4, or the like.
  • the number of vertical and horizontal pixels may not be the same, and may be 16 ⁇ 8, 8 ⁇ 16, 8 ⁇ 4, 4 ⁇ 8, or the like.
  • a method of selecting an optimum size according to the size of a subject included in a depth image or a corresponding viewpoint image, a required compression rate, or the like is also possible.
  • although the example above uses, as the depth information, information based on representative values of the blocks into which the depth image is divided, the depth information is not limited to this; for example, if a slight reduction in accuracy is acceptable, coarser depth information prepared for each slice can also be used.
  • although the example above uses the depth information corresponding to each viewpoint image (encoding target image) for determining the subject, other information may be used instead, for example information obtained by color determination or edge detection.
  • alternatively, the amount of parallax can be obtained from a plurality of images of different viewpoints, and the determination can be made using the difference in the amount of parallax for each region. Since depth information can be calculated from parallax, this determination method is almost equivalent to using depth information, but it has the advantage that separate depth information is not required.
  • since only one piece of depth information (or other information) is prepared (or acquired) for each block, it is never determined that a plurality of subjects appear within a single peripheral block of the encoding target block; that is, a peripheral block is never judged to contain both the same subject as the encoding target block and a different subject. Therefore, only peripheral blocks judged as a whole to show a different subject can be excluded.
  • the select(·) function outputs the value of the argument corresponding to the adjacent block whose representative depth value is closest to the representative depth value of the encoding target block, as shown in the following equation (9). For example, the value input as the first argument is output when adjacent block A is closest, the second argument when adjacent block B is closest, and the third argument when adjacent block C is closest.
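Equation (9) itself is not reproduced in this extract; the following sketch implements the select(·) behaviour as described, with the A/B/C ordering of arguments as an assumption.

```python
def select_nearest(d_cur, d_neighbours, candidates):
    """select(.) per the description of equation (9): return the candidate
    value belonging to the adjacent block (A, B, or C, in that order) whose
    representative depth is closest to that of the encoding target block."""
    idx = min(range(len(d_neighbours)), key=lambda k: abs(d_cur - d_neighbours[k]))
    return candidates[idx]

# Example: pick the offset of the neighbour most likely to show the same subject.
# offset = select_nearest(D_cur, (D_A, D_B, D_C), (offset_A, offset_B, offset_C))
```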
  • FIG. 6 is a flowchart showing the image encoding process performed by the image encoding apparatus 100; the process is described below with reference to this figure.
  • in step S101, the image encoding device 100 inputs a viewpoint image (reference viewpoint, non-reference viewpoint) from the outside. Thereafter, the process proceeds to step S102.
  • in step S102, the image encoding unit 101 encodes the viewpoint image input from the outside.
  • the image encoding unit 101 outputs the encoded data to the code configuration unit 104. Thereafter, the process proceeds to step S103.
  • in step S103, if the input of viewpoint images from the outside is complete, the image encoding device 100 proceeds to step S104; if not, it returns to step S101 and repeats the process.
  • in step S104, the code configuration unit 104 receives the encoded data of the images of the plurality of viewpoints from the image encoding unit 101, concatenates and rearranges them, and outputs the result as an encoded stream to the outside of the image encoding device 100.
  • the image encoding unit 101 repeats the processing of steps S202 to S210 for each image block in the frame (step S202). The process then proceeds to steps S203 and S204.
  • in step S203, the intra-screen prediction unit 217 receives the image block signal of the viewpoint image from the image input unit 201 and the decoded (internally decoded) reference image block signal from the addition unit 208, and performs intra-screen prediction.
  • the intra-screen prediction unit 217 outputs the generated intra-screen prediction image block signal to the prediction method control unit 209 and the image selection unit 210, and outputs the intra-screen prediction encoding information to the prediction method control unit 209.
  • a reset image block is an image block in which all pixel values are 0.
  • when the processing of the intra-screen prediction unit is completed, the process proceeds to step S205.
  • in step S204, the inter-screen prediction unit 218 receives the image block signal of the viewpoint image from the image input unit 201 and the decoded (internally decoded) reference image block signal from the addition unit 208, and performs inter-screen prediction.
  • the inter-screen prediction unit 218 outputs the generated inter-screen prediction image block signal to the prediction method control unit 209 and the image selection unit 210, and outputs the inter-screen prediction encoding information to the prediction method control unit 209.
  • a reset image block signal is an image block signal in which all pixel values are 0.
  • when the processing of the inter-screen prediction unit 218 is completed, the process proceeds to step S205.
  • in step S205, the prediction method control unit 209 receives the intra-screen prediction image block signal and intra-screen prediction encoding information from the intra-screen prediction unit 217, and the inter-screen prediction image block signal and inter-screen prediction encoding information from the inter-screen prediction unit 218, and selects the prediction mode with the better encoding efficiency based on the aforementioned Lagrangian cost.
  • the prediction method control unit 209 outputs information on the selected prediction mode to the image selection unit 210.
  • the prediction scheme control unit 209 adds information for identifying the selected prediction mode to the prediction coding information corresponding to the selected prediction mode, and outputs the information to the entropy coding unit 205.
  • the image selection unit 210 performs an intra-screen prediction image block signal input from the intra-screen prediction unit 217 or an inter-screen prediction image input from the inter-screen prediction unit 218 according to the prediction mode information input from the prediction method control unit 209. A block signal is selected and output to the subtraction unit 202 and the addition unit 208. Thereafter, the process proceeds to step S206.
  • in step S206, the subtraction unit 202 subtracts the predicted image block signal input from the image selection unit 210 from the image block signal input from the image input unit 201 to generate a difference image block signal.
  • the subtraction unit 202 outputs the difference image block signal to the orthogonal transformation unit 203. Thereafter, the process proceeds to step S207.
  • the orthogonal transformation unit 203 receives the difference image block signal from the subtraction unit 202 and performs the above-described orthogonal transformation.
  • the orthogonal transform unit 203 outputs the signal after the orthogonal transform to the quantization unit 204.
  • the quantization unit 204 performs the above-described quantization processing on the signal input from the orthogonal transform unit 203 to generate a difference image code.
  • the quantization unit 204 outputs the difference image code and the quantization coefficient to the entropy coding unit 205 and the inverse quantization unit 206.
  • the entropy encoding unit 205 packs the differential image code input from the quantization unit 204, the quantization coefficient, and the prediction encoding information input from the prediction scheme control unit 209, and performs variable length encoding. (Entropy coding) is performed to generate encoded data in which the amount of information is further compressed. The entropy encoding unit 205 outputs the encoded data to the external code configuration unit 104. Thereafter, the process proceeds to step S208.
  • in step S208, the inverse quantization unit 206 receives the difference image code and the quantization coefficient from the quantization unit 204, and performs the process inverse to the quantization performed by the quantization unit 204.
  • the inverse quantization unit 206 outputs the generated signal to the inverse orthogonal transform unit 207.
  • the inverse orthogonal transform unit 207 receives the inversely quantized signal from the inverse quantization unit 206 and performs the inverse of the orthogonal transform process performed by the orthogonal transform unit 203, generating a difference image (decoded difference image block signal).
  • the inverse orthogonal transform unit 207 outputs the decoded difference image block signal to the addition unit 208. Thereafter, the process proceeds to step S209.
  • in step S209, the addition unit 208 adds the predicted image block signal input from the image selection unit 210 to the decoded differential image block signal input from the inverse orthogonal transform unit 207, thereby decoding the input image (generating the reference image block signal).
  • the adding unit 208 outputs the reference image block signal to the intra-screen prediction unit 217 and the inter-screen prediction unit 218. Thereafter, the process proceeds to step S210.
  • in step S210, if the image encoding unit 101 has not completed the processing of steps S202 to S210 for all blocks in the frame and all viewpoint images, the block to be processed is changed and the process returns to step S202; when all processing is complete, the process ends.
  • the processing flow of the intra-screen prediction performed in step S203 described above may be the same as the processing steps of conventional H.264 or MVC intra-screen prediction.
  • in step S301, the deblocking filter unit 211 receives the reference image block signal from the adder unit 208 outside the inter-screen prediction unit 218 and performs the FIR filter processing described above.
  • the deblocking filter unit 211 outputs the corrected block signal after the filtering process to the frame memory 212. Thereafter, the process proceeds to step S302.
  • in step S302, the frame memory 212 receives the correction block signal from the deblocking filter unit 211 and holds it as a part of the image together with information that can identify the viewpoint number and frame number. Thereafter, the process proceeds to step S303.
  • in step S303, upon receiving the image block signal from the image input unit 201, the motion/disparity vector detection unit 214 searches the reference image group stored in the frame memory 212 for a block similar to the image block (block matching) and generates vector information (a motion vector or disparity vector) representing the found block. The motion/disparity vector detection unit 214 outputs the information necessary for encoding, including the detected vector information (reference viewpoint image number, reference frame number), to the motion/disparity compensation unit 213. Thereafter, the process proceeds to step S304.
  • in step S304, the motion/disparity compensation unit 213 receives the information necessary for encoding from the motion/disparity vector detection unit 214 and extracts the corresponding prediction block from the frame memory 212.
  • the motion / disparity compensation unit 213 performs the inter-image characteristic difference compensation process when the correlation between the input peripheral image block signals is high.
  • the motion / disparity compensation unit 213 outputs the finally generated predicted image block signal to the prediction method control unit 209 and the image selection unit 210 as an inter-screen predicted image block signal.
  • when the correlation is not high (or when the vector is a motion vector), the motion/disparity compensation unit 213 outputs the reference image block signal as it is, without performing the inter-image characteristic difference compensation processing, to the prediction method control unit 209 and the image selection unit 210 as the inter-screen prediction image block signal.
  • the motion / disparity compensation unit 213 calculates a difference vector between the prediction vector generated based on the vector information of the adjacent block of the encoding target block and the motion / disparity vector input from the motion / disparity vector detection unit 214.
  • the motion / disparity compensation unit 213 outputs the calculated difference vector and information necessary for prediction (reference viewpoint image number and reference frame number) to the prediction method control unit 209. Thereafter, the inter-screen prediction is terminated.
  • as described above, in the disparity compensation prediction method, the image encoding device 100 corrects the characteristic difference between viewpoint images based on the similarity between at least one peripheral block of the encoding target block and the corresponding peripheral block of the reference block indicated by the disparity vector of the encoding target block.
  • accordingly, no additional information (side information for inter-image characteristic difference compensation) beyond that necessary for parallax compensation prediction is required, and of course none needs to be transmitted to the decoding side, yet inter-image characteristic difference compensation processing can be performed during parallax compensation prediction.
  • an encoding method is thus provided that enables accurate estimation of the information necessary for the inter-image characteristic difference compensation process even when different subjects appear in the encoding target block and its periphery; the encoding efficiency can therefore be improved dramatically.
  • FIG. 9 is a block diagram illustrating a configuration example of an image decoding apparatus according to an embodiment of the present invention.
  • An image decoding device 700 illustrated in FIG. 9 includes a code separation unit 701 and an image decoding unit 702. Note that the blocks and arrows indicated by dotted lines described in the image decoding unit 702 are used for conceptually explaining the operation of the image decoding unit 702.
  • when receiving the transmitted encoded stream, the image decoding apparatus 700 passes the encoded stream to the code separation unit 701.
  • the code separation unit 701 separates the reference viewpoint image encoded data and the non-reference viewpoint image encoded data.
  • the code separation unit 701 outputs the separated reference viewpoint image encoded data and non-reference viewpoint image encoded data to the image decoding unit 702.
  • the reference viewpoint decoding processing unit 703 decodes the encoded data that has been compression-encoded by a method according to intra-view prediction encoding, and restores the viewpoint image of the reference viewpoint.
  • FIG. 10 is a block diagram illustrating a configuration example of the image decoding unit 702.
  • the image decoding unit 702 includes an encoded data input unit 813, an entropy decoding unit 801, an inverse quantization unit 802, an inverse orthogonal transform unit 803, an addition unit 804, a prediction scheme control unit 805, an image selection unit 806, a deblocking filter unit 807, a frame memory 808, a motion/disparity compensation unit 809, an intra prediction unit 810, and an image output unit 812.
  • the intra-screen prediction unit 816 and the inter-screen prediction unit 815 are illustrated by dotted lines, the intra-screen prediction unit 816 includes an intra prediction unit 810, and the inter-screen prediction unit 815 includes the deblocking filter unit 807, It is assumed that a frame memory 808 and a motion / disparity compensation unit 809 are included.
  • although in FIG. 9 the reference viewpoint decoding process and the non-reference viewpoint decoding process are explicitly divided into the reference viewpoint decoding processing unit 703 and the non-reference viewpoint decoding processing unit 704, many processes are in practice common to both, so a mode in which the reference viewpoint decoding process and the non-reference viewpoint decoding process are integrated is described below.
  • the intra-view prediction decoding method performed by the reference viewpoint decoding processing unit 703 described above is a combination of part of the processing performed by the intra-screen prediction unit 816 of FIG. 10 and the processing of the inter-screen prediction unit 815 that refers to images of the same viewpoint (motion compensation).
  • the inter-view prediction decoding method performed by the non-reference viewpoint decoding processing unit 704 is a combination of the processing performed by the intra-screen prediction unit 816, the processing of the inter-screen prediction unit 815 that refers to images of the same viewpoint (motion compensation), and the processing that refers to images of different viewpoints (parallax compensation).
  • the processing performed by the inter-screen prediction unit 815 that refers to an image of the same viewpoint as the processing target (motion compensation) and the processing that refers to an image of a different viewpoint (parallax compensation) differ only in the image referred to, and can be made common by using ID information (reference viewpoint number, reference frame number) indicating the reference image.
  • the processing that restores the image by adding the residual component obtained by decoding the encoded image data to the image predicted by each of the prediction units 815 and 816 is common to both the reference viewpoint and the non-reference viewpoint. Details will be described later.
  • The encoded data input unit 813 divides the image encoded data input from the code separation unit 701 into processing block units (for example, 16 pixels × 16 pixels) and outputs them to the entropy decoding unit 801.
  • The encoded data input unit 813 changes the target block in sequence and repeats this output until all blocks in the frame have been processed and the input encoded data is exhausted, as sketched below.
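As a rough illustration of this loop (the 16×16 block size matches the example above, but the class and method names are assumptions, not part of the specification):

```python
BLOCK = 16  # processing block size from the example above

def feed_blocks(frame_w, frame_h, coded_frame, entropy_decoder):
    # Walk the frame in raster order, handing one block of encoded data at a
    # time to the entropy decoder; next_block() is a hypothetical accessor.
    for by in range(0, frame_h, BLOCK):
        for bx in range(0, frame_w, BLOCK):
            entropy_decoder.decode(coded_frame.next_block(), (bx, by))
```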
  • The entropy decoding unit 801 processes the encoded data input from the encoded data input unit 813 with the reverse of the encoding method used by the entropy encoding unit 205 in FIG. 2 (for example, variable-length decoding corresponding to variable-length encoding), and extracts a differential image code, quantization coefficients, and predictive coding information.
  • The entropy decoding unit 801 outputs the difference image code and the quantization coefficients to the inverse quantization unit 802, and outputs the predictive coding information to the prediction scheme control unit 805.
  • The inverse quantization unit 802 dequantizes the difference image code input from the entropy decoding unit 801 using the extracted quantization coefficients to generate a decoded frequency-domain signal, and outputs it to the inverse orthogonal transform unit 803.
  • The inverse orthogonal transform unit 803 generates a decoded differential image block signal, which is a spatial-domain signal, by applying, for example, an inverse DCT to the input decoded frequency-domain signal.
  • As long as the inverse orthogonal transform unit 803 can generate a spatial-domain signal from the decoded frequency-domain signal, it is not limited to the inverse DCT; other methods, such as an IFFT (Inverse Fast Fourier Transform), may be used. A sketch of these two steps follows.
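A minimal sketch of inverse quantization followed by a 2-D inverse DCT, using SciPy (the flat scalar quantization step is an assumption for illustration; the actual scheme mirrors the encoder of FIG. 2):

```python
import numpy as np
from scipy.fftpack import idct

def dequantize_and_inverse_transform(quantized_block, q_step):
    # Inverse quantization: an assumed flat scalar step; real codecs use
    # per-frequency quantization matrices.
    decoded_freq = quantized_block.astype(np.float64) * q_step
    # 2-D inverse DCT applied separably along columns and then rows.
    return idct(idct(decoded_freq, axis=0, norm="ortho"), axis=1, norm="ortho")
```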
  • The prediction scheme control unit 805 extracts, from the predictive coding information input from the entropy decoding unit 801, the block-level prediction scheme employed by the image coding apparatus 100 shown in FIGS. 1 and 2.
  • The prediction scheme is either intra-screen prediction or inter-screen prediction.
  • The prediction scheme control unit 805 outputs information on the extracted prediction scheme to the image selection unit 806.
  • The prediction scheme control unit 805 also extracts the coding information from the predictive coding information input from the entropy decoding unit 801 and outputs it to the processing unit corresponding to the extracted prediction scheme.
  • When the prediction scheme is intra-screen prediction, the prediction scheme control unit 805 outputs the coding information to the intra-screen prediction unit 816 as intra-screen predictive coding information.
  • When the prediction scheme is inter-screen prediction, the prediction scheme control unit 805 outputs the coding information to the inter-screen prediction unit 815 as inter-screen predictive coding information.
  • Based on the prediction scheme input from the prediction scheme control unit 805, the image selection unit 806 selects either the intra-screen prediction image block signal input from the intra-screen prediction unit 816 or the inter-screen prediction image block signal input from the inter-screen prediction unit 815.
  • When the prediction scheme is intra-screen prediction, the intra-screen prediction image block signal is selected; when it is inter-screen prediction, the inter-screen prediction image block signal is selected.
  • The image selection unit 806 outputs the selected prediction image block signal to the addition unit 804.
  • The addition unit 804 adds the prediction image block signal input from the image selection unit 806 to the decoded difference image block signal input from the inverse orthogonal transform unit 803 to generate a decoded image block signal, as sketched below.
  • The addition unit 804 outputs the generated decoded image block signal to the intra-screen prediction unit 816, the inter-screen prediction unit 815, and the image output unit 812.
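A one-function sketch of the addition step (the 8-bit clipping range is an assumption):

```python
import numpy as np

def reconstruct_block(pred_block, decoded_diff_block):
    # Decoded image block = prediction + decoded residual, clipped to pixel range.
    block = pred_block.astype(np.float64) + decoded_diff_block
    return np.clip(np.rint(block), 0, 255).astype(np.uint8)
```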
  • The image output unit 812 receives the decoded image block signal from the addition unit 804 and temporarily holds it, as part of the image, in a frame memory (not shown).
  • After rearranging the frames into display order, the image output unit 812 outputs the images to the outside of the image decoding apparatus 700 once all viewpoint images are ready.
  • The intra prediction unit 810 in the intra-screen prediction unit 816 receives the decoded image block signal from the addition unit 804 and the intra-screen predictive coding information from the prediction scheme control unit 805.
  • Based on the intra-screen predictive coding information, the intra prediction unit 810 reproduces the intra-screen prediction performed at the time of encoding. The intra-screen prediction can be performed according to the conventional method described above.
  • The intra prediction unit 810 outputs the generated prediction image to the image selection unit 806 as the intra-screen prediction image block signal.
  • The deblocking filter unit 807 performs the same processing as the FIR filter of the deblocking filter unit 211 on the decoded image block signal input from the addition unit 804, and outputs the processing result (corrected block signal) to the frame memory 808.
  • The frame memory 808 receives the corrected block signal from the deblocking filter unit 807 and holds it as part of the image, together with information identifying the viewpoint number and frame number, as in the toy sketch below.
  • The frame memory 808 manages the picture type and picture order of the input images through a memory management unit (not shown), and stores or discards images according to its instructions. A conventional MVC image management method can also be used.
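Conceptually, the frame memory is addressable by viewpoint and frame; a toy sketch (the dictionary keying and the block placement are assumptions for illustration):

```python
import numpy as np

class FrameMemory:
    """Toy frame store addressed by (viewpoint number, frame number)."""

    def __init__(self):
        self._frames = {}  # (view_id, frame_id) -> 2-D pixel array

    def store_block(self, view_id, frame_id, block, x, y, frame_shape):
        key = (view_id, frame_id)
        if key not in self._frames:
            self._frames[key] = np.zeros(frame_shape, dtype=np.uint8)
        h, w = block.shape
        self._frames[key][y:y + h, x:x + w] = block

    def fetch(self, view_id, frame_id):
        return self._frames[(view_id, frame_id)]
```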
  • The motion/disparity compensation unit 809 receives the inter-screen predictive coding information from the prediction scheme control unit 805 and extracts from it the reference image information (reference viewpoint image number and reference frame number) and the difference vector (the difference between the motion/disparity vector and the prediction vector).
  • The motion/disparity compensation unit 809 generates a prediction vector by the same method as the prediction vector generation performed by the motion/disparity compensation unit 213 described above.
  • The motion/disparity compensation unit 809 reproduces the motion/disparity vector by adding the difference vector to the calculated prediction vector, as sketched below.
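A sketch of this reconstruction; the component-wise median of neighboring vectors is shown only as a typical H.264-style predictor, since the actual method follows the encoder-side unit 213 and is not specified here:

```python
import numpy as np

def reconstruct_vector(neighbor_vectors, difference_vector):
    # Prediction vector: assumed median of already-decoded neighbors' vectors.
    pred = np.median(np.asarray(neighbor_vectors, dtype=np.float64), axis=0)
    # Motion/disparity vector = prediction vector + signalled difference vector.
    return pred + np.asarray(difference_vector, dtype=np.float64)
```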
  • Based on the reference image information and the motion/disparity vector, the motion/disparity compensation unit 809 extracts the target image block signal (prediction image block signal) from the images stored in the frame memory 808.
  • At the same time, the motion/disparity compensation unit 809 extracts the image block signals around the decoding target block (decoding-target-block surrounding image block signals) and the image block signals around the prediction image block (prediction-image-block surrounding image block signals), and determines their similarity under the same conditions as at encoding, using the similarity determination method used at the time of encoding described above.
  • According to the determination result, the same inter-image characteristic difference correction processing as at the time of encoding is performed; when the blocks are determined not to be similar, the reference image block signal is output as is, without the inter-image characteristic difference compensation processing. A minimal sketch of this decision and correction follows.
  • The motion/disparity compensation unit 809 outputs the image block signal after inter-image characteristic difference correction, or the reference image block signal, to the image selection unit 806 as the inter-screen prediction image block signal.
  • As described above, the image decoding apparatus 700 corrects the characteristic differences between the viewpoint images and performs disparity compensation.
  • The motion/disparity compensation unit 809 performs this disparity compensation processing in the same manner as the motion/disparity compensation unit 213 described above. That is, the motion/disparity compensation unit 809 includes a corresponding block extraction unit that extracts the reference block to be referred to when decoding the decoding target block, and a correction processing unit that corrects the characteristic differences between the viewpoint images based on the similarity between the blocks around the decoding target block and the blocks around the reference block indicated by the disparity vector.
  • Preferably, the relative positions of the blocks around the decoding target block, as seen from the decoding target block, correspond to the relative positions of the blocks around the reference block, as seen from the reference block extracted by the corresponding block extraction unit as the reference source of the decoding target block.
  • Preferably, the correction processing unit performs the correction while excluding peripheral blocks in which a subject different from the subject in the decoding target block appears, or by selecting the one peripheral block most likely to contain the same subject as the subject in the decoding target block.
  • Depth information corresponding to each viewpoint image (decoding target image) can be used for the subject determination.
  • The depth information used is that of the decoding target block and its peripheral blocks, and of the reference block and its peripheral blocks.
  • The depth information is preferably information based on representative values of blocks obtained by dividing the depth image, as in the sketch below.
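For instance (a sketch; the median as the representative value and the fixed threshold are assumptions, since the text asks only for representative values of blocks obtained by dividing the depth image):

```python
import numpy as np

def representative_depth(depth_block):
    # One plausible representative value for a depth block: its median.
    return float(np.median(depth_block))

def same_subject(depth_block_a, depth_block_b, depth_threshold=8.0):
    # Treat two blocks as showing the same subject when their representative
    # depths are close; the threshold value is an assumption.
    return abs(representative_depth(depth_block_a)
               - representative_depth(depth_block_b)) < depth_threshold
```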
  • FIG. 11 is a flowchart showing the image decoding process performed by the image decoding apparatus 700. The process will be described with reference to FIG. 11.
  • In step S501, the image decoding apparatus 700 receives an encoded stream from the outside (for example, the image encoding apparatus 100), and the code separation unit 701 separates and extracts the image encoded data of each viewpoint. Thereafter, the process proceeds to step S502.
  • In step S502, the image decoding unit 702 decodes the encoded image data of the viewpoint image separated and extracted in step S501 and outputs the result to the outside of the image decoding unit 702. Thereafter, the process proceeds to step S503.
  • In step S503, if the image decoding apparatus 700 determines that the processing of all viewpoint images is complete, it rearranges the viewpoint images along the time axis, orders them in the viewpoint direction, and outputs the images to the outside of the image decoding apparatus 700. If it determines that the processing of all viewpoint and time images is not complete, it returns to step S502 and continues, as sketched below.
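The overall flow of FIG. 11 can be sketched as follows (all function and field names here are hypothetical):

```python
def decode_stream(encoded_stream, separate, decode_view):
    per_view_data = separate(encoded_stream)            # S501: split per-view data
    decoded = [decode_view(d) for d in per_view_data]   # S502: decode each view
    # S503: rearrange along the time axis and order by viewpoint for output.
    return sorted(decoded, key=lambda f: (f["frame_no"], f["view_no"]))
```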
  • In step S603, the entropy decoding unit 801 entropy-decodes the encoded image data input from the encoded data input unit 813 and generates a differential image code, quantization coefficients, and predictive coding information.
  • The entropy decoding unit 801 outputs the difference image code and the quantization coefficients to the inverse quantization unit 802, and outputs the predictive coding information to the prediction scheme control unit 805.
  • The prediction scheme control unit 805 receives the predictive coding information from the entropy decoding unit 801 and extracts the information on the prediction scheme and the coding information corresponding to that scheme.
  • When the prediction scheme is intra-screen prediction, the coding information is output to the intra-screen prediction unit 816 as intra-screen predictive coding information.
  • When the prediction scheme is inter-screen prediction, the coding information is output to the inter-screen prediction unit 815 as inter-screen predictive coding information. The process then proceeds to steps S604 and S605.
  • In step S604, the intra prediction unit 810 in the intra-screen prediction unit 816 performs intra-screen prediction processing based on the intra-screen predictive coding information input from the prediction scheme control unit 805 and the decoded image block signal input from the addition unit 804.
  • The intra prediction unit 810 outputs the generated intra-screen prediction image block signal to the image selection unit 806.
  • Where needed, a reset image block signal (an image block signal in which all pixel values are 0) is used.
  • The process then proceeds to step S606.
  • In step S605, the inter-screen prediction unit 815 performs inter-screen prediction based on the inter-screen predictive coding information input from the prediction scheme control unit 805 and the decoded image block signal input from the addition unit 804.
  • The inter-screen prediction unit 815 outputs the generated inter-screen prediction image block signal to the image selection unit 806.
  • The inter-screen prediction processing is described later.
  • Where needed, a reset image block signal (an image block signal in which all pixel values are 0) is used.
  • The process then proceeds to step S606.
  • In step S606, the image selection unit 806 selects, based on the information on the prediction scheme input from the prediction scheme control unit 805, either the intra-screen prediction image block signal input from the intra-screen prediction unit 816 or the inter-screen prediction image block signal input from the inter-screen prediction unit 815, and outputs it to the addition unit 804. Thereafter, the process proceeds to step S607.
  • In step S607, the inverse quantization unit 802 uses the difference image code and the quantization coefficients input from the entropy decoding unit 801 to perform the inverse of the quantization performed by the quantization unit 204 of the image encoding unit 101 in FIG. 2.
  • The inverse quantization unit 802 outputs the generated decoded frequency-domain signal to the inverse orthogonal transform unit 803.
  • The inverse orthogonal transform unit 803 receives the dequantized decoded frequency-domain signal from the inverse quantization unit 802, performs the inverse of the orthogonal transform processing performed by the orthogonal transform unit 203 of the image encoding unit 101 in FIG. 2, and decodes the differential image (decoded differential image block signal).
  • The inverse orthogonal transform unit 803 outputs the decoded differential image block signal to the addition unit 804.
  • The addition unit 804 adds the prediction image block signal input from the image selection unit 806 to the decoded difference image block signal input from the inverse orthogonal transform unit 803, thereby generating a decoded image block signal.
  • The addition unit 804 outputs the generated decoded image block signal to the image output unit 812, the intra-screen prediction unit 816, and the inter-screen prediction unit 815. Thereafter, the process proceeds to step S608.
  • In step S608, the image output unit 812 places the decoded image block signal input from the addition unit 804 at the corresponding position in the image to build the output image. If the processing of steps S602 to S608 is not complete for all blocks in the frame, the processing target block is changed and the process returns to step S602.
  • The image output unit 812 rearranges the images into display order, collects the viewpoint images of the same frame, and outputs them to the outside of the image decoding apparatus 700.
  • In step S701, the deblocking filter unit 807 receives the decoded image block signal from the addition unit 804 outside the inter-screen prediction unit 815 and performs the same FIR filter processing as performed at the time of encoding.
  • The deblocking filter unit 807 outputs the corrected block signal to the frame memory 808. Thereafter, the process proceeds to step S702.
  • In step S702, the frame memory 808 receives the corrected block signal from the deblocking filter unit 807 and holds it as part of the image, together with information identifying the viewpoint number and frame number. Thereafter, the process proceeds to step S703.
  • In step S703, the motion/disparity compensation unit 809 receives the inter-screen predictive coding information from the prediction scheme control unit 805 and extracts from it the reference image information (reference viewpoint image number and reference frame number) and the difference vector (the difference between the motion/disparity vector and the prediction vector).
  • The motion/disparity compensation unit 809 generates a prediction vector by the same method as the prediction vector generation performed by the motion/disparity compensation unit 213 described above.
  • The motion/disparity compensation unit 809 adds the difference vector to the calculated prediction vector to reproduce the motion/disparity vector.
  • Based on the reference image information and the motion/disparity vector, the motion/disparity compensation unit 809 extracts the target image block signal (prediction image block signal) from the images stored in the frame memory 808.
  • When the reproduced vector is a disparity vector, the motion/disparity compensation unit 809 extracts the surrounding image block signals of the decoding target block and of the reference image block, performs the inter-image characteristic difference compensation processing described above, and outputs the result to the image selection unit 806 as the inter-screen prediction image block signal.
  • When the reproduced vector is a motion vector, the motion/disparity compensation unit 809 outputs the prediction image block signal as is to the image selection unit 806 as the inter-screen prediction image block signal, without performing the inter-image characteristic difference compensation processing described above. The inter-screen prediction processing then ends.
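A toy dispatch mirroring this branch, reusing the compensation function from the earlier sketch (the string tag for the vector kind is an assumption):

```python
def inter_prediction_block(vector_kind, pred_block, ref_surround, tgt_surround):
    # Disparity vectors trigger the characteristic-difference compensation from
    # the earlier sketch; motion vectors pass the predicted block through as is.
    if vector_kind == "disparity":
        return compensate_characteristic_difference(pred_block, ref_surround,
                                                    tgt_surround)
    return pred_block
```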
  • As described above, the image decoding apparatus 700 can perform disparity-compensated prediction with inter-image characteristic difference compensation without receiving additional information for explicit inter-image characteristic difference compensation. That is, according to the present embodiment, encoded data whose coding efficiency has been improved, as by the image encoding apparatus 100 of FIG. 1, can be decoded.
  • (Embodiment 3) Part of the image encoding device 100 and the image decoding device 700 in the above embodiments, for example, the code construction unit 104; the subtraction unit 202, orthogonal transform unit 203, quantization unit 204, entropy encoding unit 205, inverse quantization unit 206, inverse orthogonal transform unit 207, addition unit 208, prediction scheme control unit 209, image selection unit 210, deblocking filter unit 211, motion/disparity compensation unit 213, motion/disparity vector detection unit 214, and intra prediction unit 215 in the image encoding unit 101; the code separation unit 701; and the entropy decoding unit 801, inverse quantization unit 802, inverse orthogonal transform unit 803, addition unit 804, prediction scheme control unit 805, image selection unit 806, deblocking filter unit 807, motion/disparity compensation unit 809, and intra prediction unit 810 in the image decoding unit 702, may be realized by a computer.
  • In that case, a program for realizing this control function may be recorded on a computer-readable recording medium, and realized by causing a computer system to read and execute the program recorded on that medium.
  • The “computer system” here is a computer system built into the image encoding apparatus 100 or the image decoding apparatus 700, and includes an OS and hardware such as peripheral devices.
  • The “computer-readable recording medium” refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and storage devices such as hard disks built into computer systems.
  • The “computer-readable recording medium” may also include media that dynamically hold a program for a short time, such as communication lines used when a program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and media that hold a program for a certain period of time, such as the volatile memory inside a computer system serving as a server or client in that case.
  • The program may be one that realizes part of the functions described above, or one that realizes them in combination with a program already recorded in the computer system.
  • Distribution of this program is not limited to portable recording media and networks; it may also be distributed via broadcast waves.
  • The image encoding program is a program for causing a computer to execute image encoding processing that corrects characteristic differences between viewpoint images and performs disparity compensation when encoding viewpoint images captured from at least two viewpoints.
  • The image encoding processing includes a step of extracting a reference block to be referred to when encoding an encoding target block, and a step of correcting the characteristic differences between the viewpoint images based on the similarity between the blocks around the encoding target block and the blocks around the reference block.
  • Other application examples are as described for the image encoding device.
  • The image decoding program is a program for causing a computer to execute image decoding processing that corrects characteristic differences between viewpoint images and performs disparity compensation when decoding an encoded stream of viewpoint images captured from at least two viewpoints.
  • The image decoding processing includes a step of extracting a reference block to be referred to when decoding a decoding target block, and a step of correcting the characteristic differences between the viewpoint images based on the similarity between the blocks around the decoding target block and the blocks around the reference block.
  • Other application examples are as described for the image decoding apparatus.
  • This image decoding program can be implemented as part of multi-viewpoint image playback software.
  • Part or all of the image encoding device 100 and the image decoding device 700 in the above embodiments may be realized as an integrated circuit such as an LSI (Large Scale Integration) or an IC (Integrated Circuit) chipset.
  • Each functional block of the image encoding device 100 and the image decoding device 700 may be implemented as an individual processor, or some or all of them may be integrated into a single processor.
  • The method of circuit integration is not limited to LSI; it may be realized by a dedicated circuit or a general-purpose processor.
  • If integrated-circuit technology that replaces LSI emerges from advances in semiconductor technology, an integrated circuit based on that technology may also be used.
  • As described as the processing of the steps of the image encoding program and the image decoding program, the present invention can also take the form of an image encoding method and an image decoding method.
  • The image encoding method is a method of correcting characteristic differences between viewpoint images and performing disparity compensation when encoding viewpoint images captured from at least two viewpoints, and includes a step of extracting a reference block to be referred to when encoding an encoding target block, and a step of correcting the characteristic differences between the viewpoint images based on the similarity between the blocks around the encoding target block and the blocks around the reference block.
  • Other application examples are as described for the image encoding device.
  • The image decoding method is a method of correcting characteristic differences between viewpoint images and performing disparity compensation when decoding an encoded stream of viewpoint images captured from at least two viewpoints, and includes a step of extracting a reference block to be referred to when decoding a decoding target block, and a step of correcting the characteristic differences between the viewpoint images based on the similarity between the blocks around the decoding target block and the blocks around the reference block.
  • Other application examples are as described for the image decoding apparatus.
  • DESCRIPTION OF SYMBOLS: 100: image encoding device; 101: image encoding unit; 102: reference viewpoint encoding processing unit; 103: non-reference viewpoint encoding processing unit; 104: code construction unit; 201: image input unit; 202: subtraction unit; 203: orthogonal transform unit; 204: quantization unit; 205: entropy encoding unit; 206: inverse quantization unit; 207: inverse orthogonal transform unit; 208: addition unit; 209: prediction scheme control unit; 210: image selection unit; 211: deblocking filter unit; 212: frame memory; 213: motion/disparity compensation unit; 214: motion/disparity vector detection unit; 215: intra prediction unit; 217: intra-screen prediction unit; 218: inter-screen prediction unit; 301: corresponding block extraction unit; 302: correction coefficient calculation unit; 303: correction processing unit; 401: encoding target image; 402: reference image; 403: encoding target block; 404: encoding-target-block peripheral image block; 405: reference block; 406: reference-block peripheral image block; 407: disparity vector; 700: image decoding device; 701: code separation unit; 702: image decoding unit; 703: reference viewpoint decoding processing unit; 704: non-reference viewpoint decoding processing unit; 801: entropy decoding unit; 802: inverse quantization unit; 803: inverse orthogonal transform unit; 804: addition unit; 805: prediction scheme control unit; 806: image selection unit; 807: deblocking filter unit; 808: frame memory; 809: motion/disparity compensation unit; 810: intra prediction unit; 812: image output unit; 813: encoded data input unit; 815: inter-screen prediction unit; 816: intra-screen prediction unit; 901: subject; 902: camera; 903: sensor; 906: encoder; 907: decoder; 908: display unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Abstract

An objective of the present invention is, in a disparity compensation prediction scheme, to allow carrying out a consistently high-precision inter-image characteristic difference compensation process without requiring additional information aside from information required for the disparity compensation prediction. When coding each viewpoint image which is photographed from at least two or more viewpoints, this image coding device corrects characteristic differences between each viewpoint image and carries out disparity compensation. This image coding device comprises: a corresponding block extraction unit which extracts a reference block (405) which is referred to when coding a block to be coded (403); and a correction processing unit which, on the basis of the similarity between peripheral blocks (404) of the block to be coded (403) and peripheral blocks (406) of the reference block (405), corrects characteristic differences between each viewpoint image.

Description

Image encoding apparatus, image decoding apparatus, and methods and programs thereof
The present invention relates to an image encoding device that performs disparity-compensated predictive encoding on images captured from a plurality of viewpoints, an image decoding device that decodes the encoded data by disparity-compensated predictive decoding, and methods and programs thereof.
MVC (Multiview Video Coding), an extension of MPEG (Moving Picture Experts Group)-4 AVC/H.264, is a scheme for encoding moving images of the same subject or background captured by a plurality of cameras (multi-view video). As in conventional single-view video coding schemes (MPEG-2, MPEG-4, MPEG-4 AVC/H.264, and the like), this scheme reduces the code amount by combining motion-compensated inter-frame predictive coding, which exploits the temporal correlation of video, with disparity-compensated predictive coding, introduced by the extension, which exploits the correlation across viewpoints.
In disparity-compensated predictive coding, because images are captured by multiple cameras with individual differences, if the cameras are not sufficiently calibrated, the brightness and color of the same subject can differ between the videos, lowering the prediction accuracy of disparity compensation and, as a result, reducing coding efficiency. Such mismatches in image characteristics between cameras can arise not only from insufficient calibration but also when light reflected from the subject's surface enters each camera differently.
To avoid a drop in coding efficiency even when image characteristics differ between cameras, a disparity-compensated predictive coding scheme that introduces illumination compensation (IC) and color compensation (CC) has been proposed (Non-Patent Document 1). Hereinafter, the combination of illumination compensation and color compensation, or either one of them, is referred to as inter-image characteristic difference compensation for convenience.
In Non-Patent Document 1, luminance compensation is performed on a block referred to at the time of encoding (reference block) using an offset parameter C_IC and a scale parameter S_IC. The luminance-compensated reference block is then expressed by the following equation (1), where the subscript R indicates a block in the reference image, μ is the luminance average of the reference block, and ω follows a normal distribution with mean 0.
[Equation (1): the luminance-compensated reference block, defined in terms of the scale parameter S_IC, the offset parameter C_IC, the reference-block luminance average μ, and the zero-mean term ω.]
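The equation image itself does not survive extraction; one plausible form consistent with the surrounding definitions (this reconstruction is an assumption, not the published formula) is:

```latex
\hat{B}_R(x, y) = S_{IC}\,\bigl(B_R(x, y) - \mu_R\bigr) + \mu_R + C_{IC} + \omega \tag{1}
```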
The specific encoding method is as follows. First, the reference block with the highest correlation to the encoding target block is extracted from a reference image at a viewpoint different from that of the encoding target image. The offset parameter and scale parameter are derived from the extracted reference block and the encoding target block. Then, the vector indicating the relative position of the reference block (disparity vector), the ID indicating the reference image (reference viewpoint number, reference frame number), the above parameters (offset parameter and scale parameter), and the residual component between the luminance-compensated reference block and the encoding target block are encoded and transmitted.
At the time of decoding, luminance compensation is performed using equation (1) on the reference block specified by the ID indicating the reference image and the disparity vector. The residual component is then added to the luminance-compensated image to reproduce the original image. Color compensation can be realized by a method similar to luminance compensation; hereinafter, the offset parameter in color compensation is denoted C_CC and the scale parameter S_CC.
However, performing the above inter-image characteristic difference compensation requires additional information (offset and scale parameters for luminance and color compensation) beyond the information needed for disparity compensation, so it does not necessarily improve coding efficiency. Therefore, an optimization method that introduces the following syntax has been proposed.
The coding syntax introduces, at the slice level, flags (ic_flag, cc_flag) indicating whether luminance or color compensation is enabled for the current slice, and, for each macroblock, flags (ic_enable, cc_enable) indicating whether luminance or color compensation is enabled. Using these flags, the inter-image characteristic difference compensation process can be canceled under conditions where no improvement in coding efficiency is expected.
As described above, to realize inter-image characteristic difference compensation, the information for compensation including the syntax (offset parameters, scale parameters) must be encoded in addition to the information required for ordinary disparity-compensated coding (disparity vector, ID indicating the reference image, residual component).
To further increase efficiency, Patent Documents 1 and 2 propose methods in which part of the information used for inter-image characteristic difference compensation need not be transmitted explicitly. These methods provide a mode in which the additional information (syntax and parameters for inter-image characteristic difference compensation) is transmitted (mode 1) and a mode in which the additional information is not explicitly transmitted (mode 2), switching between them as appropriate. The mode that transmits additional information transmits all the additional information necessary for inter-image characteristic difference compensation, as described so far, and performs the compensation process based on that information (the basic processing is the same as the method of Non-Patent Document 1). In the mode that does not explicitly transmit part of the information, the offset amount (and scale amount) of the processing target block is calculated from the offset amounts (and scale amounts) used in neighboring blocks (spatially, temporally, or across viewpoints).
Japanese Unexamined Patent Application Publication (Translation of PCT Application) No. 2010-507334; Japanese Unexamined Patent Application Publication (Translation of PCT Application) No. 2010-507336
In the method of Non-Patent Document 1, as pointed out in Patent Document 1, the parameters for performing inter-image characteristic difference compensation (offset and scale parameters) and the macroblock-level and lower syntax (ic_enable, cc_enable) must additionally be encoded and transmitted, so coding efficiency can decrease depending on conditions.
On the other hand, in the methods of Patent Documents 1 and 2, the parameters for the encoding target block can be inferred only when information usable for inter-image characteristic difference compensation is available around the encoding target block (mode 2). In that case the additional information required by the method of Non-Patent Document 1 need not be sent; however, even with such a scheme, when no usable information exists around the encoding target block (mode 1), a method similar to the inter-image characteristic difference compensation of Non-Patent Document 1 must be used, and coding efficiency decreases. For example, as shown in FIG. 14, this occurs when the encoding target block is to be encoded in the disparity-compensated prediction mode but the neighboring blocks have been encoded with a prediction scheme other than disparity-compensated prediction (motion-compensated inter-frame predictive coding in the example of FIG. 14).
Furthermore, in the methods of Patent Documents 1 and 2, at boundaries where different subjects overlap between neighboring blocks and the encoding target block, the information of the peripheral blocks is used erroneously, reducing the accuracy of inter-image characteristic difference compensation. For example, as shown in the schematic diagram of FIG. 15, when the subject O_H (a person's head) appears in the encoding target block and neighboring block C while the subject O_B (the background) appears in neighboring blocks A and B, the inter-image characteristic difference compensation prediction for the encoding target block ends up using the characteristic difference adopted for neighboring blocks A and B. In this case, compensation prediction is performed using erroneous information, so coding efficiency is likely to deteriorate.
In addition, MPEG-3DV, an MPEG ad hoc group, has drafted a new standard for transmitting depth images together with video captured by cameras. A depth image is information representing the distance from the camera to the subject; it can be generated, for example, by obtaining it from a distance-measuring device installed near the camera, or by analyzing images captured by multi-viewpoint cameras.
FIG. 16 shows an overall view of a system under the new MPEG-3DV standard. The new standard supports two or more viewpoints, but FIG. 16 is described for the two-viewpoint case. In this system, a subject 901 is photographed by cameras 902 and 904 and the images are output; at the same time, depth images (depth maps) are generated and output using sensors 903 and 905, installed near the respective cameras, that measure the distance to the subject 901. The encoder 906 receives the images and depth images as input, encodes them using motion-compensated inter-frame predictive coding and disparity-compensated prediction, and outputs the result. The decoder 907 receives the output of the encoder 906, transmitted via the network N or directly, decodes it, and outputs decoded images and decoded depth images. The display unit 908 receives the decoded images and decoded depth images from the decoder 907 and displays the decoded images, either directly or after processing them using the depth images.
As described above, in MPEG-3DV, a scheme for encoding multi-view video and the depth video of the corresponding viewpoints is being drafted, and here too it is desired to always perform accurate inter-image characteristic difference compensation and to carry out disparity-compensated prediction with high efficiency.
The present invention has been made in view of the above circumstances, and an object thereof is, in a disparity-compensated prediction scheme, to enable consistently accurate inter-image characteristic difference compensation without requiring additional information beyond that necessary for disparity-compensated prediction.
Another object of the present invention is, in the multi-view video and corresponding-viewpoint depth video coding scheme being drafted in MPEG-3DV, to always perform accurate inter-image characteristic difference compensation and to carry out disparity-compensated prediction with high efficiency.
To solve the above problems, a first technical means of the present invention is an image encoding device that corrects characteristic differences between viewpoint images and performs disparity compensation when encoding viewpoint images captured from at least two viewpoints, the device comprising: a corresponding block extraction unit that extracts a reference block to be referred to when encoding an encoding target block; and a correction processing unit that corrects the characteristic differences between the viewpoint images based on the similarity between blocks around the encoding target block and blocks around the reference block.
A second technical means is the first technical means, wherein the relative positions of the blocks around the encoding target block, as seen from the encoding target block, correspond to the relative positions of the blocks around the reference block, as seen from the reference block extracted by the corresponding block extraction unit as the reference source of the encoding target block.
A third technical means is the first or second technical means, wherein the correction processing unit performs the correction while excluding peripheral blocks in which a subject different from the subject in the encoding target block appears.
A fourth technical means is the first or second technical means, wherein the correction processing unit performs the correction by selecting the one peripheral block most likely to contain the same subject as the subject in the encoding target block.
A fifth technical means is the third or fourth technical means, wherein depth information corresponding to each viewpoint image is used to determine the subject.
A sixth technical means is the fifth technical means, wherein the depth information is information based on representative values of blocks obtained by dividing a depth image.
A seventh technical means is an image decoding device that corrects characteristic differences between viewpoint images and performs disparity compensation when decoding an encoded stream of viewpoint images captured from at least two viewpoints, the device comprising: a corresponding block extraction unit that extracts a reference block to be referred to when decoding a decoding target block; and a correction processing unit that corrects the characteristic differences between the viewpoint images based on the similarity between blocks around the decoding target block and blocks around the reference block.
An eighth technical means is the seventh technical means, wherein the relative positions of the blocks around the decoding target block, as seen from the decoding target block, correspond to the relative positions of the blocks around the reference block, as seen from the reference block extracted by the corresponding block extraction unit as the reference source of the decoding target block.
A ninth technical means is the seventh or eighth technical means, wherein the correction processing unit performs the correction while excluding peripheral blocks in which a subject different from the subject in the decoding target block appears.
A tenth technical means is the seventh or eighth technical means, wherein the correction processing unit performs the correction by selecting the one peripheral block most likely to contain the same subject as the subject in the decoding target block.
An eleventh technical means is the ninth or tenth technical means, wherein depth information corresponding to each viewpoint image is used to determine the subject.
A twelfth technical means is the eleventh technical means, wherein the depth information is information based on representative values of blocks obtained by dividing a depth image.
A thirteenth technical means is an image encoding method for correcting characteristic differences between viewpoint images and performing disparity compensation when encoding viewpoint images captured from at least two viewpoints, the method comprising: a step of extracting a reference block to be referred to when encoding an encoding target block; and a step of correcting the characteristic differences between the viewpoint images based on the similarity between blocks around the encoding target block and blocks around the reference block.
A fourteenth technical means is an image decoding method for correcting characteristic differences between viewpoint images and performing disparity compensation when decoding an encoded stream of viewpoint images captured from at least two viewpoints, the method comprising: a step of extracting a reference block to be referred to when decoding a decoding target block; and a step of correcting the characteristic differences between the viewpoint images based on the similarity between blocks around the decoding target block and blocks around the reference block.
A fifteenth technical means is an image encoding program for causing a computer to execute image encoding processing that corrects characteristic differences between viewpoint images and performs disparity compensation when encoding viewpoint images captured from at least two viewpoints, wherein the image encoding processing includes a step of extracting a reference block to be referred to when encoding an encoding target block, and a step of correcting the characteristic differences between the viewpoint images based on the similarity between blocks around the encoding target block and blocks around the reference block.
A sixteenth technical means is an image decoding program for causing a computer to execute image decoding processing that corrects characteristic differences between viewpoint images and performs disparity compensation when decoding an encoded stream of viewpoint images captured from at least two viewpoints, wherein the image decoding processing includes a step of extracting a reference block to be referred to when decoding a decoding target block, and a step of correcting the characteristic differences between the viewpoint images based on the similarity between blocks around the decoding target block and blocks around the reference block.
According to the present invention, in a disparity-compensated prediction scheme, by using the inter-image characteristic differences between at least one peripheral block of the reference block pointed to by the disparity vector of the encoding target block and the corresponding peripheral blocks of the encoding target block, accurate inter-image characteristic difference compensation can always be carried out without requiring additional information beyond that necessary for disparity-compensated prediction; as a result, disparity-compensated prediction can be performed with high efficiency.
Further, according to one aspect of the present invention, in the multi-view video and corresponding-viewpoint depth video coding scheme being drafted in MPEG-3DV, different subjects are distinguished based on depth differences in the depth images, and execution of the inter-image characteristic difference compensation process is controlled based on the determination result; this makes consistently accurate inter-image characteristic difference compensation possible and, as a result, enables highly efficient disparity-compensated prediction.
FIG. 1 is a block diagram showing a configuration example of an image encoding device of the present invention.
FIG. 2 is a block diagram showing a configuration example of the image encoding unit in the image encoding device of FIG. 1.
FIG. 3 is a block diagram showing a configuration example of the motion/disparity compensation unit in the image encoding unit of FIG. 2.
FIG. 4 is a conceptual diagram of the block positions handled in the inter-image characteristic difference compensation process, for explaining the block positions extracted by the corresponding block extraction unit in the motion/disparity compensation unit of FIG. 3.
FIG. 5 is a conceptual diagram of the process of determining representative depth values.
FIG. 6 is a flowchart for explaining the image encoding process executed by the image encoding device of FIG. 1.
FIG. 7 is a flowchart for explaining the image encoding process executed by the image encoding unit of FIG. 2.
FIG. 8 is a flowchart for explaining the inter-screen prediction process executed by the inter-screen prediction unit in the image encoding unit of FIG. 2.
FIG. 9 is a block diagram showing a configuration example of an image decoding device of the present invention.
FIG. 10 is a block diagram showing a configuration example of the image decoding unit in the image decoding device of FIG. 9.
FIG. 11 is a flowchart for explaining the image decoding process executed by the image decoding device of FIG. 9.
FIG. 12 is a flowchart for explaining the image decoding process executed by the image decoding unit of FIG. 10.
FIG. 13 is a flowchart for explaining the inter-screen prediction process executed by the inter-screen prediction unit in the image decoding unit of FIG. 10.
FIG. 14 is a conceptual diagram showing an example of a problem with conventional inter-image characteristic difference compensation.
FIG. 15 is another conceptual diagram showing an example of a problem with conventional inter-image characteristic difference compensation.
FIG. 16 is an overall view of a system under the new MPEG-3DV standard.
Hereinafter, embodiments of the present invention will be described with reference to the drawings. In the drawings, parts having the same function are denoted by the same reference numerals, and repeated description thereof is omitted.
(Embodiment 1)
<Configuration of Image Encoding Device>
FIG. 1 is a block diagram illustrating a configuration example of an image encoding device according to an embodiment of the present invention.
An image encoding device 100 illustrated in FIG. 1 includes an image encoding unit 101 and a code configuration unit 104. It should be noted that the blocks and arrows indicated by dotted lines described inside the image encoding unit 101 are used for conceptually explaining the operation of the image encoding unit 101.
Hereinafter, the functions and operation of the image encoding device 100 will be described.
The viewpoint images input to the image encoding device 100 are a reference viewpoint image and non-reference viewpoint images. While the reference viewpoint image is limited to an image from a single viewpoint, a plurality of non-reference viewpoint images from a plurality of viewpoints may be input. The reference viewpoint image is the image of the viewpoint that serves as the reference among the plurality of viewpoint images.
A viewpoint image is first input to the image encoding unit 101. The reference viewpoint encoding processing unit 102 compresses and encodes the viewpoint image of the reference viewpoint using an intra-view prediction encoding method. In intra-view prediction encoding, intra-screen prediction or motion compensation within the same viewpoint is performed, and the image data is compressed and encoded based only on image data within that viewpoint. At the same time, the reverse process, that is, internal decoding, is performed to restore the image signal for reference when encoding the viewpoint images of the non-reference viewpoints described later. Based on the restored reference viewpoint image, the non-reference viewpoint encoding processing unit 103 compresses and encodes the viewpoint images of the non-reference viewpoints using an inter-view prediction encoding method. In the inter-view prediction encoding method, disparity compensation is performed using an image of a viewpoint different from the encoding target image, and the image data is compressed and encoded. Note that the non-reference viewpoint encoding processing unit 103 can also select the intra-view prediction encoding method, which uses only image data within the viewpoint, based on the overall encoding efficiency. The encoded data produced by the reference viewpoint encoding processing unit 102 and the encoded data produced by the non-reference viewpoint encoding processing unit 103 are output from the image encoding unit 101.
The code construction unit 104 receives the encoded data from the image encoding unit 101, concatenates and rearranges it, and outputs it as an encoded stream to the outside of the image encoding device 100 (for example, to an image decoding device 700 described later).
Next, the details of the image encoding unit 101 will be described with reference to FIG. 2. FIG. 2 is a block diagram showing a configuration example of the image encoding unit 101.
The image encoding unit 101 includes an image input unit 201, a subtraction unit 202, an orthogonal transform unit 203, a quantization unit 204, an entropy encoding unit 205, an inverse quantization unit 206, an inverse orthogonal transform unit 207, an addition unit 208, a prediction method control unit 209, an image selection unit 210, a deblocking filter unit 211, a frame memory 212, a motion/disparity compensation unit 213, a motion/disparity vector detection unit 214, and an intra prediction unit 215. For convenience of explanation, an intra-screen prediction unit 217 and an inter-screen prediction unit 218 are drawn with dotted lines: the intra-screen prediction unit 217 includes the intra prediction unit 215, and the inter-screen prediction unit 218 includes the deblocking filter unit 211, the frame memory 212, the motion/disparity compensation unit 213, and the motion/disparity vector detection unit 214.
When the operation of the image encoding unit 101 was described with reference to FIG. 1, the encoding process for the reference viewpoint and the encoding process for the other, non-reference viewpoints were explicitly separated into the reference viewpoint encoding processing unit 102 and the non-reference viewpoint encoding processing unit 103. In practice, however, the two share many processes, so a form in which the reference viewpoint encoding process and the non-reference viewpoint encoding process are integrated is described below.
Specifically, the intra-view prediction encoding method performed by the reference viewpoint encoding processing unit 102 described above is a combination of the processing performed by the intra-screen prediction unit 217 in FIG. 2 and the part of the processing performed by the inter-screen prediction unit 218 that references images of the same viewpoint (motion compensation). The inter-view prediction encoding method performed by the non-reference viewpoint encoding processing unit 103 is a combination of the processing performed by the intra-screen prediction unit 217, the processing performed by the inter-screen prediction unit 218 that references images of the same viewpoint (motion compensation), and the processing that references images of different viewpoints (disparity compensation). Furthermore, the processing in the inter-screen prediction unit 218 that references an image of the same viewpoint as the encoding target (motion compensation) and the processing that references an image of a different viewpoint (disparity compensation) differ only in the image referenced during encoding, so the two can share a common implementation by using ID information (a reference viewpoint number and a reference frame number) that identifies the reference image. Likewise, the method of encoding the residual component between the image predicted by the prediction units 217 and 218 and the input viewpoint image is common to the reference viewpoint and the non-reference viewpoints. Details will be described later.
The image input unit 201 divides an image signal representing the viewpoint image to be encoded (a reference viewpoint image or a non-reference viewpoint image), input from outside the image encoding unit 101, into blocks of a predetermined size (for example, 16 pixels vertically × 16 pixels horizontally).
The image input unit 201 outputs the divided image block signal to the subtraction unit 202, to the intra prediction unit 215 in the intra-screen prediction unit 217, and to the motion/disparity vector detection unit 214 in the inter-screen prediction unit 218. The intra-screen prediction unit 217 is a processing unit that performs encoding using only information within the same screen that was processed before the encoding target block; its processing is described later. The inter-screen prediction unit 218, on the other hand, is a processing unit that performs encoding using information from previously processed viewpoint images of the same viewpoint, or from viewpoint images of different viewpoints, that differ from the encoding target image; its processing is also described later.
The image input unit 201 outputs blocks repeatedly, changing the block position sequentially, until all blocks in the image frame are completed and all input images have been processed. The block size used when the image input unit 201 divides the image signal is not limited to the 16 × 16 size described above; sizes such as 8 × 8 or 4 × 4 may also be used. The numbers of vertical and horizontal pixels need not be equal; sizes such as 16 × 8, 8 × 16, 8 × 4, or 4 × 8 are also possible. These sizes are examples of the coding block sizes used in conventional schemes such as H.264 and MVC. Following the encoding procedure described later, encoding is carried out for every block size, and the most efficient one is finally selected. The block size is not limited to the sizes above, and any block size adopted by a future coding scheme can be supported.
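For illustration only, the raster-order block division performed by the image input unit 201 might be sketched as follows in Python; the function name, the NumPy array layout, and the assumption that the image dimensions are multiples of the block size are illustrative assumptions, not part of this specification.

```python
import numpy as np

def split_into_blocks(image, block_h=16, block_w=16):
    """Yield ((y, x), block) pairs over a 2-D luminance image in raster order.

    Sketch of the block division of the image input unit 201; the image
    height and width are assumed to be multiples of the block size.
    """
    h, w = image.shape
    for y in range(0, h, block_h):
        for x in range(0, w, block_w):
            yield (y, x), image[y:y + block_h, x:x + block_w]
```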
The subtraction unit 202 subtracts the predicted image block signal input from the image selection unit 210 from the image block signal input from the image input unit 201 to generate a difference image block signal. The subtraction unit 202 outputs the generated difference image block signal to the orthogonal transform unit 203.
The orthogonal transform unit 203 orthogonally transforms the difference image block signal input from the subtraction unit 202 to generate a signal indicating the strength of various frequency components. When orthogonally transforming the difference image block signal, the orthogonal transform unit 203 applies, for example, a DCT (Discrete Cosine Transform) to the signal to generate a frequency domain signal (DCT coefficients, in the case of a DCT). The orthogonal transform unit 203 may use methods other than the DCT (for example, the FFT (Fast Fourier Transform)) as long as a frequency domain signal can be generated from the difference image block signal. The orthogonal transform unit 203 outputs the coefficient values contained in the generated frequency domain signal to the quantization unit 204.
The quantization unit 204 quantizes the coefficient values indicating the frequency component strengths input from the orthogonal transform unit 203 with a predetermined quantization coefficient, and outputs the generated quantized signal (difference image block code) to the entropy encoding unit 205 and the inverse quantization unit 206. The quantization coefficient is a parameter, given externally, for determining the code amount, and is also referenced by the inverse quantization unit 206 and the entropy encoding unit 205.
The inverse quantization unit 206 applies to the difference image code input from the quantization unit 204 the process inverse to the quantization performed by the quantization unit 204 (inverse quantization), using the same quantization coefficient, generates a decoded frequency domain signal, and outputs it to the inverse orthogonal transform unit 207.
The inverse orthogonal transform unit 207 applies to the input decoded frequency domain signal the process inverse to that of the orthogonal transform unit 203, for example an inverse DCT, to generate a decoded difference image block signal, which is a spatial domain signal. The inverse orthogonal transform unit 207 may use methods other than the inverse DCT (for example, the IFFT (Inverse Fast Fourier Transform)) as long as a spatial domain signal can be generated from the decoded frequency domain signal. The inverse orthogonal transform unit 207 then outputs the generated decoded difference image block signal to the addition unit 208.
The addition unit 208 receives the predicted image block signal from the image selection unit 210 and the decoded difference image block signal from the inverse orthogonal transform unit 207. The addition unit 208 adds the decoded difference image block signal to the predicted image block signal to generate a reference image block signal, that is, the input image as encoded and then decoded (internal decoding). This reference image block signal is output to the intra-screen prediction unit 217 and the inter-screen prediction unit 218.
The intra-screen prediction unit 217 receives the reference image block signal from the addition unit 208 and the image block signal of the encoding target image from the image input unit 201, and outputs an intra-screen prediction image block signal, predicted within the screen along a predetermined direction, to the prediction method control unit 209 and the image selection unit 210. At the same time, the intra-screen prediction unit 217 generates information indicating the prediction direction needed to generate the intra-screen prediction image block signal as intra-screen prediction encoding information, and outputs it to the prediction method control unit 209. Intra-screen prediction is performed according to the intra-screen prediction scheme of a conventional method (for example, H.264 Reference Software JM ver. 13.2 Encoder, http://iphome.hhi.de/suehring/tml/, 2008). As indicated in the configuration described above, these processes in the intra-screen prediction unit 217 are executed by the intra prediction unit 215.
The inter-screen prediction unit 218 receives the reference image block signal from the addition unit 208 and the image block signal of the encoding target image from the image input unit 201, and outputs an inter-screen prediction image block signal generated by inter-screen prediction to the prediction method control unit 209 and the image selection unit 210. At the same time, the inter-screen prediction unit 218 generates, as inter-screen prediction encoding information, the information needed to generate the inter-screen prediction image block signal (reference image information including the reference viewpoint image number and the reference frame number, and the difference vector between the motion/disparity vector and the prediction vector), and outputs the generated inter-screen prediction encoding information to the prediction method control unit 209. Details of the inter-screen prediction unit 218 will be described later.
Subsequently, the prediction method control unit 209 determines a prediction method for each block based on the picture type and the encoding efficiency of the input image, using the intra-screen prediction image block signal and its intra-screen prediction encoding information input from the intra-screen prediction unit 217, and the inter-screen prediction image block signal and its inter-screen encoding information input from the inter-screen prediction unit 218, and outputs information on the chosen prediction method to the image selection unit 210. Here, the picture type of the input image is information for identifying which images the encoding target image may reference for prediction; picture types include the I picture, the P picture, and the B picture. Like the quantization coefficient, the picture type is determined by an externally given parameter, and the same method as in conventional MVC can be used.
The prediction method control unit 209 monitors the picture type of the input image, and when the input encoding target image is an I picture, which can reference only information within the screen, it deterministically selects the intra-screen prediction method. For a P picture, which can reference encoded past frames or images of different viewpoints, or a B picture, which can reference encoded past frames, encoded future frames (frames that are in the future in display order but were processed in the past), and images of different viewpoints, the prediction method control unit 209 calculates a Lagrangian cost from the number of bits generated by the encoding performed in the entropy encoding unit 205 and the residual against the original image in the subtraction unit 202, using, for example, a conventional technique (for example, H.264 Reference Software JM ver. 13.2 Encoder, http://iphome.hhi.de/suehring/tml/, 2008), and decides between the intra-screen prediction method and the inter-screen prediction method.
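The Lagrangian cost referred to here might be sketched as follows; the function name and the lambda parameter are illustrative assumptions, with the concrete weighting left to the conventional technique cited above.

```python
def lagrangian_cost(distortion, rate_bits, lam):
    """Rate-distortion cost J = D + lambda * R used when choosing between
    the intra-screen and inter-screen prediction methods for a block."""
    return distortion + lam * rate_bits
```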
At the same time, the prediction method control unit 209 appends information identifying the prediction method to the encoding information corresponding to the prediction method selected by the above procedure, chosen from the intra-screen prediction encoding information or the inter-screen prediction encoding information, and outputs the result to the entropy encoding unit 205 as prediction encoding information.
The image selection unit 210 selects, according to the prediction method information input from the prediction method control unit 209, either the intra-screen prediction image block signal input from the intra-screen prediction unit 217 or the inter-screen prediction image block signal input from the inter-screen prediction unit 218, and outputs the predicted image block signal to the subtraction unit 202 and the addition unit 208. When the prediction method input from the prediction method control unit 209 is intra-screen prediction, the image selection unit 210 selects and outputs the intra-screen prediction image block signal input from the intra-screen prediction unit 217; when the prediction method input from the prediction method control unit 209 is inter-screen prediction, it selects and outputs the inter-screen prediction image block signal input from the inter-screen prediction unit 218.
The entropy encoding unit 205 packs the difference image code input from the quantization unit 204, the quantization coefficient used by the quantization unit 204, and the prediction encoding information input from the prediction method control unit 209, encodes them using, for example, variable-length coding (entropy coding), and generates encoded data whose amount of information is further compressed. The entropy encoding unit 205 outputs the generated encoded data to the code construction unit 104, which then outputs it as an encoded stream to the outside of the image encoding device 100 (for example, to the image decoding device 700).
Next, the details of the inter-screen prediction unit 218 will be described.
The deblocking filter unit 211 receives the reference image block signal from the addition unit 208 and applies the FIR filter processing used in conventional techniques (for example, H.264 Reference Software JM ver. 13.2 Encoder, http://iphome.hhi.de/suehring/tml/, 2008) to reduce the block distortion that arises when an image is encoded. The deblocking filter unit 211 outputs the processing result (a corrected block signal) to the frame memory 212.
The frame memory 212 receives the corrected block signal from the deblocking filter unit 211 and holds it as part of an image, together with information identifying the viewpoint number and the frame number. In the frame memory 212, the picture type and the order of the input images are managed by a memory management unit (not shown), and images are stored or discarded according to its instructions. For image management, the image management method of conventional MVC can also be used.
The motion/disparity vector detection unit 214 searches the group of images stored in the frame memory 212, using the block matching described below, for a block similar to the image block signal input from the image input unit 201, and generates vector information pointing to the found block, together with a viewpoint number and a frame number. Here, the vector information is called a motion vector when the referenced image is of the same viewpoint as the encoding target image, and a disparity vector when the referenced image is of a different viewpoint from the encoding target image. When performing block matching, the motion/disparity vector detection unit 214 calculates, for each candidate region, an index value between the region and the divided block, and finds the region that minimizes the calculated index value. Any index value may be used as long as it indicates the correlation or similarity between image signals. The motion/disparity vector detection unit 214 uses, for example, the sum of absolute differences (SAD) between the luminance values of the pixels in the divided block and the luminance values in a region of the reference image. The SAD between a block divided from the input viewpoint image signal (for example, of size N × N pixels) and a block of the reference image signal is expressed by the following Expression (2).
$$\mathrm{SAD}(p,q)=\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}\left|\,I_{\mathrm{in}}(i_{0}+i,\;j_{0}+j)-I_{\mathrm{ref}}(i_{0}+i+p,\;j_{0}+j+q)\,\right| \tag{2}$$
In Expression (2), I_in(i0+i, j0+j) is the luminance value of the input image at coordinates (i0+i, j0+j), and (i0, j0) are the pixel coordinates of the upper-left corner of the divided block. I_ref(i0+i+p, j0+j+q) is the luminance value of the reference image at coordinates (i0+i+p, j0+j+q), and (p, q) is the shift amount (motion vector) relative to the upper-left coordinates of the divided block. That is, in block matching, the motion/disparity vector detection unit 214 calculates SAD(p, q) for each (p, q) and finds the (p, q) that minimizes SAD(p, q). (p, q) represents the vector (motion/disparity vector) from the divided block of the input viewpoint image to the position of the reference region.
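A minimal sketch of the block matching of Expression (2) follows, assuming 2-D NumPy luminance arrays; the function names and the search-range parameter are illustrative assumptions, not part of this specification.

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences between two same-sized luminance blocks."""
    return int(np.abs(block_a.astype(np.int64) - block_b.astype(np.int64)).sum())

def block_matching(target, reference, i0, j0, n=16, search=32):
    """Find the shift (p, q) minimizing SAD(p, q) of Expression (2) for the
    n x n target block whose upper-left corner is at (i0, j0)."""
    block = target[i0:i0 + n, j0:j0 + n]
    best_pq, best_sad = (0, 0), None
    h, w = reference.shape
    for p in range(-search, search + 1):
        for q in range(-search, search + 1):
            y, x = i0 + p, j0 + q
            if 0 <= y and y + n <= h and 0 <= x and x + n <= w:
                s = sad(block, reference[y:y + n, x:x + n])
                if best_sad is None or s < best_sad:
                    best_sad, best_pq = s, (p, q)
    return best_pq, best_sad
```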
The motion/disparity compensation unit 213 extracts a reference image block from the frame memory 212 based on the motion vector or disparity vector, the viewpoint number, and the frame number input from the motion/disparity vector detection unit 214.
When the reference vector input from the motion/disparity vector detection unit 214 is a disparity vector, the motion/disparity compensation unit 213 also reads from the frame memory 212 the image block signals of the already-processed blocks around the encoding target block and the corresponding image block signals around the reference image block (the positions of these image blocks are described later). The motion/disparity compensation unit 213 calculates an index value indicating the correlation between corresponding neighboring blocks of the encoding target image and the reference image, and, when it determines that the image blocks show the same subject, performs inter-image characteristic difference compensation (details are described later). The motion/disparity compensation unit 213 outputs the processed image block to the prediction method control unit 209 and the image selection unit 210 as the inter-screen prediction image block signal.
When the reference vector input from the motion/disparity vector detection unit 214 is a motion vector, the motion/disparity compensation unit 213 performs neither the reading of neighboring blocks nor the inter-image characteristic difference compensation described above; instead, it outputs the reference image block extracted from the frame memory 212 based on the motion vector, the reference image number, and the frame number, as-is, to the prediction method control unit 209 and the image selection unit 210 as the inter-screen prediction image block signal.
The motion/disparity compensation unit 213 further subtracts, from the motion/disparity vector calculated by the block matching above, a prediction vector generated from the motion/disparity vectors adopted by the encoded blocks adjacent to the encoding target block, and calculates a difference vector. The prediction vector is generated by taking, for the horizontal and vertical components, the median of the vectors of the block adjacent above the encoding target block, the block adjacent to its upper right, and the block adjacent to its left. The prediction vector calculation method adopted in MVC can be used.
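A sketch of the median prediction vector and of the difference vector described above, under the assumption that each vector is an (x, y) pair of integers (the function names are illustrative):

```python
def predict_vector(vec_a, vec_b, vec_c):
    """Component-wise median of the vectors of the adjacent blocks
    above (A), upper-right (B), and left (C), as in MVC."""
    xs = sorted(v[0] for v in (vec_a, vec_b, vec_c))
    ys = sorted(v[1] for v in (vec_a, vec_b, vec_c))
    return (xs[1], ys[1])

def difference_vector(detected, predicted):
    """Difference vector carried in the inter-screen encoding information:
    detected motion/disparity vector minus the prediction vector."""
    return (detected[0] - predicted[0], detected[1] - predicted[1])
```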
The motion/disparity compensation unit 213 outputs the difference vector and the reference image information (reference viewpoint number, reference frame number) to the prediction method control unit 209 as inter-screen encoding information. Note that the region most similar to the input image block detected by block matching and the region pointed to by the prediction vector must at least have matching reference viewpoint image numbers and reference frame numbers.
Next, the method of inter-image characteristic difference compensation, which is a feature of the present invention, will be described with reference to FIGS. 3 and 4. FIG. 3 is a block diagram showing a configuration example of the inside of the motion/disparity compensation unit 213.
The motion/disparity compensation unit 213 includes a corresponding block extraction unit 301, a correction coefficient calculation unit 302, and a correction processing unit 303.
The motion/disparity vector, reference viewpoint number, and reference frame number input from the motion/disparity vector detection unit 214 are fed to the corresponding block extraction unit 301. The corresponding block extraction unit 301 extracts the corresponding image block (reference image block) from the frame memory 212 based on the input motion/disparity vector, reference viewpoint number, and reference frame number.
At the same time, when the input reference vector is a disparity vector, the corresponding block extraction unit 301 extracts the image signals of the blocks around the encoding target block whose encoding is already complete and which are stored in the frame memory 212 (hereinafter called the encoding target block neighboring image blocks), and the image signals of the positionally corresponding blocks around the reference image block (hereinafter called the reference block neighboring image blocks). In this way, the corresponding block extraction unit 301 extracts the reference block that is referenced when the encoding target block is encoded. Note that a corresponding block extraction unit according to one feature of the present invention may also extract motion vectors, like the illustrated corresponding block extraction unit 301, but it at least performs extraction based on disparity vectors.
FIG. 4 is a diagram for conceptually explaining the block positions extracted by the corresponding block extraction unit 301. In FIG. 4, an encoding target image 401 is the viewpoint image to be encoded, and a reference image 402 is the image referenced by the encoding target image 401. The reference image 402 is an image whose entirety was already encoded, decoded, and stored in the frame memory 212, and it is a viewpoint image of a viewpoint different from that of the encoding target image 401. For the encoding target image 401, encoding and decoding are complete up to the block immediately preceding the position of the encoding target block.
The block currently being encoded is the encoding target block 403, and the vector 407 extending from it is the disparity vector input from the motion/disparity vector detection unit 214. This disparity vector, together with the reference image number and the reference frame number, identifies a specific block of the reference image (the reference block 405).
The encoding target block neighboring image blocks 404 mentioned above are the block above the encoding target block 403 (adjacent block A), the block to its upper right (adjacent block B), and the block to its left (adjacent block C). Similarly, the reference block neighboring image blocks 406 are the block above the reference block 405 (adjacent block A′), the block to its upper right (adjacent block B′), and the block to its left (adjacent block C′). Thus, the relative positions, seen from the encoding target block 403, of the blocks 404 around it correspond to the relative positions, seen from the reference block 405 extracted by the corresponding block extraction unit 301 as the reference source of the encoding target block 403, of the blocks 406 around it.
This correspondence may be uniform across the pairs of adjacent blocks A and A′, B and B′, and C and C′, based on the disparity vector of the encoding target block 403, as follows from the fact that the corresponding block extraction unit 301 extracts the reference block 405. Alternatively, based on the disparity vectors of the encoding target block 403 and its adjacent blocks A, B, and C, the correspondence between, for example, the position of adjacent block A relative to the encoding target block 403 and the position of adjacent block A′ relative to the reference block 405 may differ from the correspondence between the position of adjacent block B relative to the encoding target block 403 and the position of adjacent block B′ relative to the reference block 405.
The positions of the neighboring image blocks are not limited to these; any block encoded and decoded before the encoding target block can be used without problem, and the number of such blocks is not particularly limited either. FIG. 4 shows, as a preferable example for accurate characteristic difference correction, a case where the relative positions of the adjacent blocks A, B, and C seen from the encoding target block 403 correspond to the relative positions of the adjacent blocks A′, B′, and C′ seen from the reference block 405, but the positions of the blocks adjacent to the reference block can also be determined without such a correspondence.
Returning to FIG. 3, when the reference vector input from the motion/disparity vector detection unit 214 is a disparity vector, the corresponding block extraction unit 301 extracts the image signals of the neighboring blocks (encoding target block neighboring image blocks) 404 of the encoding target block 403 and the image signals of the neighboring blocks (reference block neighboring image blocks) 406 of the reference block 405, and outputs them to the correction coefficient calculation unit 302.
The correction coefficient calculation unit 302 receives the image signals of the encoding target block neighboring image blocks and of the reference block neighboring image blocks from the corresponding block extraction unit 301, and judges the similarity between the two sets of neighboring blocks (details are described later). When the correction coefficient calculation unit 302 determines that the similarity between the encoding target block neighboring image blocks and the reference block neighboring image blocks is high, it calculates correction coefficients for inter-image characteristic difference compensation and outputs them to the correction processing unit 303. When it determines that the similarity is low, the correction coefficient calculation unit 302 calculates correction coefficients that leave the block uncorrected (or a flag for skipping the processing) and outputs them to the correction processing unit 303.
[Similarity judgment method]
For the similarity judgment, an index value is calculated between the corresponding divided adjacent blocks, and the judgment is made based on the obtained index values. As in the block matching described above, the sum of absolute differences of luminance values, as shown in Expression (3), can be used as the index value.
$$\mathrm{SAD}(\mathit{block})=\sum_{i}\sum_{j}\left|\,I_{\mathrm{target}}(i_{\mathit{block}}+i,\;j_{\mathit{block}}+j)-I_{\mathrm{ref}}(i_{\mathit{block}}+i,\;j_{\mathit{block}}+j)\,\right|,\qquad \mathit{block}\in\{A,B,C\} \tag{3}$$
Here, I_target and I_ref denote the luminance values of the encoding target image and of the reference image, respectively. block = {A, B, C} denotes the block positions of the adjacent blocks A, B, and C shown in FIG. 4. At each block position, the coordinates of the upper-left corner of the adjacent block A, B, or C in the encoding target image and in the reference image, appearing on the right-hand side, are written (i_block, j_block). For the positions of the adjacent blocks in the reference image, the positions are specified with A, B, and C replaced by A′, B′, and C′, respectively.
The above index value is calculated for each adjacent block of the encoding target block and the corresponding adjacent block of the reference image, and the similarity judgment is performed. As the threshold, the value 20, for example, can be used. When the index value of Expression (3) is smaller than this threshold, the similarity of the individual block pair is judged to be high. The similarity is calculated for each of the blocks A, B, and C, and when the similarity is judged to be high in all of them, the correction coefficient calculation and correction processing described later are performed. At the same time, the threshold used here must be transmitted to the image decoding device (when a fixed threshold is used for the entire image, it suffices to transmit the threshold once at the start of processing) so that the same similarity judgment result is obtained at decoding.
The threshold may be common across the entire image, or a method of switching it adaptively in view of the encoding efficiency of the image may be adopted. With the adaptive switching method, all of that information must be transmitted to the image decoding device. In the example above, the correction processing is performed when the similarity is judged to be high in all of the blocks A, B, and C, but this is not a restriction. For example, the correction processing may be performed when the similarity is high in any one or two of the blocks (or in one or two blocks at predetermined positions), and information indicating the number of blocks to be judged (or the positions of the blocks to be judged) may be transmitted to the image decoding device together with the threshold. Also, although the sum of absolute differences of luminance values is used here as the index for judging similarity, any scheme that can judge the similarity of the corresponding blocks presents no problem.
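A sketch of the similarity judgment of Expression (3) under these conventions (per-pair SAD over decoded luminance blocks, with all of the pairs A/A′, B/B′, C/C′ required to pass; the function name and the list-of-arrays interface are illustrative assumptions):

```python
import numpy as np

def neighbors_similar(target_neighbors, reference_neighbors, threshold=20):
    """Expression (3): compare each encoding-target neighboring block with
    its counterpart (A vs A', B vs B', C vs C'); similarity is judged high
    only when every per-pair SAD falls below the threshold."""
    for blk_t, blk_r in zip(target_neighbors, reference_neighbors):
        s = int(np.abs(blk_t.astype(np.int64) - blk_r.astype(np.int64)).sum())
        if s >= threshold:
            return False
    return True
```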
[Correction coefficient calculation]
When the similarity judgment above determines that correction processing is to be performed, correction using Expression (1), as adopted in the conventional technique described earlier, may be executed; in the present embodiment, however, the correction coefficient calculation unit 302 first calculates correction coefficients as in Expression (4). In Expression (4), an offset of the average luminance value (offset_Y) and offsets of the average chrominance components (offset_U, offset_V) are introduced as the correction coefficients. The luminance I used in Expression (3) above and the Y used in Expression (4) are the same.
$$\begin{aligned}
\mathrm{offset}_{Y} &= \overline{Y}_{\mathrm{target}}-\overline{Y}_{\mathrm{ref}}\\
\mathrm{offset}_{U} &= \overline{U}_{\mathrm{target}}-\overline{U}_{\mathrm{ref}}\\
\mathrm{offset}_{V} &= \overline{V}_{\mathrm{target}}-\overline{V}_{\mathrm{ref}}
\end{aligned} \tag{4}$$

where the overlined quantities denote the average pixel values over the encoding target block neighboring image blocks (A, B, C) and over the reference block neighboring image blocks (A′, B′, C′), respectively.
When the similarity judgment determines that correction processing is not to be performed, the correction coefficient calculation unit 302 sets offset_Y = 0, offset_U = 0, and offset_V = 0. The correction coefficient calculation unit 302 outputs the offset values calculated in this way to the correction processing unit 303.
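A sketch of the offset calculation of Expression (4), assuming each neighboring block is given as a (Y, U, V) tuple of NumPy arrays (this interface is an illustrative assumption):

```python
import numpy as np

def compute_offsets(target_neighbors, reference_neighbors, similar):
    """Expression (4): differences between the mean Y, U, V values of the
    encoding target block neighboring image blocks and of the reference
    block neighboring image blocks; zero offsets when similarity is low."""
    if not similar:
        return (0.0, 0.0, 0.0)
    offsets = []
    for c in range(3):  # 0: Y, 1: U, 2: V
        t_mean = np.mean([blk[c].astype(np.float64).mean() for blk in target_neighbors])
        r_mean = np.mean([blk[c].astype(np.float64).mean() for blk in reference_neighbors])
        offsets.append(t_mean - r_mean)
    return tuple(offsets)
```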
[Correction processing]
The correction processing unit 303 receives the reference image block signal from the corresponding block extraction unit 301 and performs correction processing based on the correction coefficients input from the correction coefficient calculation unit 302. When the correction coefficients input from the correction coefficient calculation unit 302 are coefficients that apply no correction (or a flag for skipping the processing), the correction processing unit 303 outputs the reference block signal input from the corresponding block extraction unit 301 as-is, without performing correction processing. The correction processing is performed as shown in the following Expression (5). In Expression (5), Y(x, y), U(x, y), and V(x, y) are the pixel values of the reference image block input from the corresponding block extraction unit 301.
$$\begin{aligned}
Y'(x,y) &= Y(x,y)+\mathrm{offset}_{Y}\\
U'(x,y) &= U(x,y)+\mathrm{offset}_{U}\\
V'(x,y) &= V(x,y)+\mathrm{offset}_{V}
\end{aligned} \tag{5}$$
Finally, the motion/disparity compensation unit 213 outputs the image block signal produced by the correction processing unit 303 to the prediction method control unit 209 and the image selection unit 210 as the inter-screen prediction image block signal, and at the same time also outputs to the prediction method control unit 209 the difference vector calculated from the reference vector input from the motion/disparity vector detection unit 214, together with the reference image information (reference viewpoint image number, reference frame number).
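The correction of Expression (5) itself is a per-pixel addition, as in this sketch (same illustrative (Y, U, V) tuple interface as above):

```python
def apply_correction(ref_block_yuv, offsets):
    """Expression (5): add offset_Y, offset_U, offset_V to every pixel of
    the reference image block to obtain the corrected prediction block."""
    return tuple(plane + off for plane, off in zip(ref_block_yuv, offsets))
```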
In this way, when encoding viewpoint images captured from at least two viewpoints, the image encoding device 100 corrects the characteristic differences between the viewpoint images (that is, the characteristic differences between images captured from different viewpoints) and performs disparity compensation. In the example of FIG. 2, the motion/disparity compensation unit 213 performs this disparity compensation processing. Disparity compensation itself is processing that exploits the redundancy between the viewpoints of different viewpoint images. The correction processing unit 303 of the motion/disparity compensation unit 213 then corrects the characteristic differences between the viewpoint images (images across viewpoints) based on the similarity between the blocks around the encoding target block and the blocks around the reference block pointed to by the disparity vector. Through such correction (inter-image characteristic difference compensation), the characteristic differences between parallax images caused by individual differences or calibration differences among the cameras, or by differences in how light reflected from the subject surface enters the lenses, are corrected with high accuracy, and disparity compensation is performed in the corrected state, so highly efficient disparity compensation can be carried out. Furthermore, since the characteristic difference correction processing uses only the signals of blocks whose encoding and decoding are already complete, there is no need to transmit any additional information to the decoding side beyond the information required for disparity compensation prediction, and the transmission load can be reduced.
[Another example of the correction coefficient calculation method]
As another example of the correction coefficient calculation method, there is a method that uses depth information corresponding to the image to exclude from the correction coefficient calculation those blocks presumed to show a subject different from that of the encoding target block. In this example, the correction processing unit 303 performs the correction while excluding neighboring blocks that show a subject different from the subject shown in the encoding target block. By using this example, blocks that would act as noise in the correction coefficient calculation can be excluded, and the accuracy of the correction coefficients can be further improved. In other words, in the inter-image characteristic difference compensation processing described above, by simultaneously judging the difference between the subjects shown in the encoding target block and in the neighboring blocks and controlling the inter-image characteristic difference compensation processing, the disparity compensation prediction accuracy can be further improved and the encoding efficiency increased.
Depth information is image information indicating the distance to the subject shown in the image. The depth information used here is that of the encoding target block and its neighboring blocks, and of the reference block and its neighboring blocks.
As the depth information, the depth images handled by the MPEG-3DV effort mentioned above can be used in this example. A depth image corresponding to the encoding target image is input from outside the image encoding device 100, and representative depth values are calculated for the positions of the encoding target block and of the encoding target block neighboring image blocks. Based on the calculated depth values, a neighboring block is excluded from the correction coefficient calculation when the difference between the representative depth value of the encoding target block and the representative depth value of that neighboring block at each position is larger than a predetermined value. Specifically, the processing is performed using the following Expression (6).
$$\mathrm{offset}_{Y}=\frac{\displaystyle\sum_{BK}\mathrm{FLG}(BK)\,\overline{Y}_{\mathrm{target}}(BK)}{\displaystyle\sum_{BK}\mathrm{FLG}(BK)}-\frac{\displaystyle\sum_{BK}\mathrm{FLG}(BK)\,\overline{Y}_{\mathrm{ref}}(BK)}{\displaystyle\sum_{BK}\mathrm{FLG}(BK)},\qquad BK\in\{A,B,C\} \tag{6}$$

with the corresponding expressions for offset_U and offset_V; that is, Expression (4) with the excluded neighboring blocks removed from the averages.
Here, FLG(·) is a flag that determines whether a given block position is excluded from the correction coefficient calculation, and it is given as follows.
$$\mathrm{FLG}(BK)=\begin{cases}1 & \text{if }\left|D_{\mathrm{cur}}-D(BK)\right|\le TH_{D}\\[2pt]0 & \text{otherwise}\end{cases}$$
Each symbol is explained as follows. BK denotes the position of an adjacent neighboring block. D_cur is the representative depth value of the encoding target block, and D(BK) is the representative depth value of the neighboring block BK of the encoding target block. TH_D is the judgment threshold (the predetermined value above).
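A sketch of the flag FLG(BK) and of the offset calculation of Expression (6) restricted to the retained neighbors, under the same illustrative interfaces as the earlier sketches (note that the weighted-mean form follows the reconstruction of Expression (6) given above):

```python
import numpy as np

def depth_flags(d_cur, neighbor_depths, th_d):
    """FLG(BK): keep (1) a neighboring block whose representative depth is
    within TH_D of that of the encoding target block, else exclude (0)."""
    return [1 if abs(d_cur - d_bk) <= th_d else 0 for d_bk in neighbor_depths]

def compute_offsets_with_depth(target_neighbors, reference_neighbors, flags):
    """Expression (6) as reconstructed above: offsets computed only over the
    neighbors whose flag is 1; no correction if every neighbor is excluded."""
    kept = [i for i, f in enumerate(flags) if f]
    if not kept:
        return (0.0, 0.0, 0.0)
    offsets = []
    for c in range(3):  # 0: Y, 1: U, 2: V
        t = np.mean([target_neighbors[i][c].astype(np.float64).mean() for i in kept])
        r = np.mean([reference_neighbors[i][c].astype(np.float64).mean() for i in kept])
        offsets.append(t - r)
    return tuple(offsets)
```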
The representative depth value of each divided block can be calculated, for example, by the following method: create a frequency distribution (histogram) of the depth values within the block, extract the depth value with the highest frequency of occurrence, and take it as the representative value.
FIG. 5 shows a conceptual diagram of the representative depth value determination process. Suppose that the depth image 502 illustrated in FIG. 5(B) is given as the depth image corresponding to the viewpoint image 501 illustrated in FIG. 5(A). A depth image is represented as a monochrome image with luminance only. A region of higher luminance (= larger depth value) is closer to the camera, and a region of lower luminance (= smaller depth value) is farther from the camera. When the depth values in a divided block 503 of the depth image 502 follow a frequency distribution such as the frequency distribution 504 illustrated in FIG. 5(C), the depth value 505 with the highest frequency of occurrence is determined to be the representative depth value of the block 503.
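A sketch of the histogram-based representative depth value (the mode of the depth values within the block, as in FIG. 5; the function name is illustrative):

```python
import numpy as np

def representative_depth(depth_block):
    """Representative depth value of a block: the most frequently occurring
    depth value (the peak of the block's depth histogram)."""
    values, counts = np.unique(depth_block, return_counts=True)
    return int(values[np.argmax(counts)])
```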
When determining the representative depth value, it may also be determined by the following methods, besides the histogram-based method described above; a sketch of these alternatives follows this paragraph. For example, any of the following may be extracted and used as the representative value: (a) the median of the depth values within the block, (b) an average of the depth values within the block weighted by their frequency of occurrence, (c) the depth value within the block closest to the camera (the maximum depth value in the block), (d) the depth value within the block farthest from the camera (the minimum depth value in the block), or (e) the depth value at the center position of the block. Criteria for choosing among the methods include, for example, fixing the most efficient one as a scheme common to encoding and decoding, or adaptively selecting the method that yields the smallest prediction error when disparity prediction is performed using the representative depth value obtained by each method. In the latter case, the selected method must be added to the encoded stream and given to the image decoding device. As options in the latter case, it is not necessary to prepare all of the most frequent depth value 505 and (a) through (e) described above; it suffices to select from at least two of them.
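The alternatives (a) through (e) might be sketched as follows, continuing the assumptions of the previous sketch (the `method` keyword values are illustrative labels, not part of this specification):

```python
import numpy as np

def representative_depth_by(depth_block, method):
    """Alternative representative depth values (a)-(e) described above."""
    d = depth_block.ravel()
    if method == "median":   # (a) median of the in-block depth values
        return float(np.median(d))
    if method == "mean":     # (b) average weighted by appearance frequency
        return float(d.mean())
    if method == "max":      # (c) nearest to the camera (maximum value)
        return int(d.max())
    if method == "min":      # (d) farthest from the camera (minimum value)
        return int(d.min())
    if method == "center":   # (e) depth value at the block center
        h, w = depth_block.shape
        return int(depth_block[h // 2, w // 2])
    raise ValueError(f"unknown method: {method}")
```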
 The block size used when dividing the depth image may be matched to the block size used to divide the image. The size is not limited to 16×16; sizes such as 8×8 or 4×4 may also be used. The numbers of vertical and horizontal pixels need not be equal; sizes such as 16×8, 8×16, 8×4, or 4×8 are also possible. One approach is to match these sizes to the block size of the encoding target block adopted by the image encoding unit 101; alternatively, an optimum size can be selected according to, for example, the size of the subject contained in the depth image or the corresponding viewpoint image, or the required compression rate.
 Although the example given here adopts, as the depth information, information based on the representative value of each block into which the depth image is divided, the depth information is not limited to this; for example, if a slight loss of accuracy is acceptable, depth information prepared for each slice can also be used. Also, although the example uses the depth information corresponding to each viewpoint image (the encoding target image) to judge the subject, other information can be used instead, such as information obtained by color judgment or edge detection. As another example, the disparity amount can be obtained from a plurality of different viewpoint images, and the judgment can be made using the difference in the disparity amount between regions. Since depth information is in the first place information that can be calculated from disparity, this judgment method is nearly equivalent to using depth information, but it has the advantage that separate depth information is not needed. Furthermore, by preparing (or acquiring) only one piece of depth information (or other information) per block and executing the processing as illustrated, a single peripheral block of the encoding target block is never judged to contain more than one subject; that is, a peripheral block is never judged to contain both the same subject as that of the encoding target block and a different subject, so exactly the peripheral blocks showing a different subject can be excluded.
 As another example of judging the difference between the subjects shown in the encoding target block and its peripheral blocks and controlling the inter-image characteristic difference compensation processing, the correction processing unit 303 may select the single peripheral block most likely to show the same subject as the encoding target block and perform the correction using it. Here again, an example using the depth information corresponding to each viewpoint image for the subject judgment is given, but other information can also be used; likewise, information based on the representative value of each block of the divided depth image is adopted as the depth information, but the depth information is not limited to this.
 As the other example mentioned above, another method of the correction coefficient calculation processing (formula (6)) using depth information will now be described. In this method, the adjacent peripheral block whose representative depth value is closest to the representative depth value of the encoding target block is used. The method is shown in the following formula (8).
 [Formula (8), rendered as an image (JPOXMLDOC01-appb-M000008) in the original publication]
 Here, the select(·) function outputs the value of the corresponding argument according to which adjacent block position has the representative depth value closest to that of the encoding target block, as given by formula (9) below. For example, it outputs the value given as the first argument when adjacent block A is the closest, the second argument when adjacent block B is the closest, and the third argument when adjacent block C is the closest.
 [Formula (9), rendered as an image (JPOXMLDOC01-appb-M000009) in the original publication]
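 Since formulas (8) and (9) are published only as images, the following LaTeX sketch gives one plausible reading consistent with the surrounding description: the offset and scale parameters (written here generically as O and S) computed from the closest-depth neighbor are selected. The exact notation of the original is an assumption.

\[
(O, S) = \mathrm{select}\bigl(k^{*};\,(O_A, S_A),\,(O_B, S_B),\,(O_C, S_C)\bigr) \tag{8'}
\]
\[
k^{*} = \operatorname*{arg\,min}_{BK \in \{A, B, C\}} \bigl|\, D_{cur} - D(BK) \,\bigr| \tag{9'}
\]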
<Flowchart of Image Encoding Device 100>
 Next, the image encoding processing performed by the image encoding device 100 according to the present embodiment will be described. FIG. 6 is a flowchart showing the image encoding processing performed by the image encoding device 100. The description also refers to FIG. 1.
 First, in step S101, the image encoding device 100 receives viewpoint images (reference viewpoint and non-reference viewpoints) from the outside. The process then proceeds to step S102.
 In step S102, the image encoding unit 101 encodes the viewpoint image input from the outside and outputs the encoded data to the code configuration unit 104. The process then proceeds to step S103.
 In step S103, if the input of viewpoint images from the outside has finished, the image encoding device 100 proceeds to step S104; otherwise it returns to step S101 and repeats the processing.
 In step S104, the code configuration unit 104 receives the encoded data of the images of the plurality of viewpoints from the image encoding unit 101, concatenates and rearranges the encoded data, and outputs the result as an encoded stream to the outside of the image encoding device 100.
 The encoding of the viewpoint images performed in step S102 above will now be described in more detail with reference to FIGS. 7 and 2.
 In step S201, the image encoding unit 101 receives a viewpoint image from the outside. The process then proceeds to step S202.
 In step S202, the image input unit 201 divides the input image signal, which is the viewpoint image input from outside the image encoding unit 101, into blocks of a predetermined size (for example, 16 pixels vertically × 16 pixels horizontally) and outputs them to the subtraction unit 202, the intra-screen prediction unit 217, and the inter-screen prediction unit 218.
 The image encoding unit 101 repeats the processing of steps S202 to S210 for each image block in the frame. The process then proceeds to steps S203 and S204.
 In step S203, the intra-screen prediction unit 217 receives the image block signal of the viewpoint image from the image input unit 201 and the decoded (internally decoded) reference image block signal from the addition unit 208, and performs intra-screen prediction. The intra-screen prediction unit 217 outputs the generated intra-screen prediction image block signal to the prediction method control unit 209 and the image selection unit 210, and the intra-screen prediction encoding information to the prediction method control unit 209. In the first iteration, if the processing of the addition unit 208 has not yet completed, a reset image block (an image block in which all pixel values are 0) is input. When the processing of the intra-screen prediction unit completes, the process proceeds to step S205.
 In step S204, the inter-screen prediction unit 218 receives the image block signal of the viewpoint image from the image input unit 201 and the decoded (internally decoded) reference image block signal from the addition unit 208, and performs inter-screen prediction. The inter-screen prediction unit 218 outputs the generated inter-screen prediction image block signal to the prediction method control unit 209 and the image selection unit 210, and the inter-screen prediction encoding information to the prediction method control unit 209. In the first iteration, if the processing of the addition unit 208 has not yet completed, a reset image block (an image block signal in which all pixel values are 0) is input. When the processing of the inter-screen prediction unit 218 completes, the process proceeds to step S205.
 In step S205, the prediction method control unit 209 receives the intra-screen prediction image block signal and the intra-screen prediction encoding information from the intra-screen prediction unit 217, and the inter-screen prediction image block signal and the inter-screen prediction encoding information from the inter-screen prediction unit 218, and selects the prediction mode with the better encoding efficiency based on the Lagrangian cost described above. The prediction method control unit 209 outputs information on the selected prediction mode to the image selection unit 210. It also appends information identifying the selected prediction mode to the prediction encoding information corresponding to that mode, and outputs the result to the entropy encoding unit 205.
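 A hedged sketch of this mode selection: with the common Lagrangian form J = D + λR (distortion D, rate R; the patent does not restate its exact cost definition), the mode with the smaller J is taken.

def select_prediction_mode(candidates, lam):
    """Pick the candidate with the smallest Lagrangian cost J = D + lam * R.

    candidates -- iterable of (mode_name, distortion, rate_bits) tuples
    lam        -- Lagrange multiplier trading rate against distortion
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]

# Example with assumed numbers: intra (D=1200, R=96) vs. inter (D=900, R=160).
# With lam = 4.0, J_intra = 1584 and J_inter = 1540, so "inter" is selected.
# select_prediction_mode([("intra", 1200, 96), ("inter", 900, 160)], 4.0)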
 The image selection unit 210 selects, according to the prediction mode information input from the prediction method control unit 209, either the intra-screen prediction image block signal input from the intra-screen prediction unit 217 or the inter-screen prediction image block signal input from the inter-screen prediction unit 218, and outputs it to the subtraction unit 202 and the addition unit 208. The process then proceeds to step S206.
 In step S206, the subtraction unit 202 subtracts the prediction image block signal input from the image selection unit 210 from the image block signal input from the image input unit 201 to generate a difference image block signal, which it outputs to the orthogonal transform unit 203. The process then proceeds to step S207.
 In step S207, the orthogonal transform unit 203 receives the difference image block signal from the subtraction unit 202 and performs the orthogonal transform described above, outputting the transformed signal to the quantization unit 204. The quantization unit 204 applies the quantization processing described above to the signal input from the orthogonal transform unit 203 to generate a difference image code, and outputs the difference image code and the quantization coefficient to the entropy encoding unit 205 and the inverse quantization unit 206.
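 A round-trip sketch of this residual path (steps S207 and S208), assuming a type-II 2-D DCT and a single uniform quantization step; the device's actual transform and quantizer are only referred to above as already described, so this pairing is an assumption.

import numpy as np
from scipy.fft import dctn, idctn

def transform_and_quantize(diff_block, qstep):
    """Forward path of the sketch: 2-D DCT, then uniform quantization."""
    coeffs = dctn(np.asarray(diff_block, dtype=float), norm="ortho")
    return np.round(coeffs / qstep).astype(int)

def dequantize_and_inverse(levels, qstep):
    """Inverse path (inverse quantization, then inverse DCT)."""
    return idctn(levels.astype(float) * qstep, norm="ortho")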
 The entropy encoding unit 205 packs the difference image code and quantization coefficient input from the quantization unit 204 together with the prediction encoding information input from the prediction method control unit 209, and performs variable-length coding (entropy coding) to generate encoded data in which the amount of information is further compressed. The entropy encoding unit 205 outputs the encoded data to the external code configuration unit 104. The process then proceeds to step S208.
 In step S208, the inverse quantization unit 206 receives the difference image code and quantization coefficient from the quantization unit 204 and performs the inverse of the quantization performed by the quantization unit 204, outputting the generated signal to the inverse orthogonal transform unit 207. The inverse orthogonal transform unit 207 receives the inversely quantized signal from the inverse quantization unit 206 and performs the inverse of the orthogonal transform processing performed by the orthogonal transform unit 203, thereby decoding the difference image (decoded difference image block signal). The inverse orthogonal transform unit 207 outputs the decoded difference image block signal to the addition unit 208. The process then proceeds to step S209.
 In step S209, the addition unit 208 adds the prediction image block signal input from the image selection unit 210 to the decoded difference image block signal input from the inverse orthogonal transform unit 207, thereby decoding the input image (reference image block signal). The addition unit 208 outputs the reference image block signal to the intra-screen prediction unit 217 and the inter-screen prediction unit 218. The process then proceeds to step S210.
 In step S210, if the processing of steps S202 to S210 has not been completed for all blocks in the frame and for all viewpoint images, the image encoding unit 101 changes the block to be processed and returns to step S202.
 When all the processing has been completed, the procedure ends.
 The processing flow of the intra-screen prediction performed in step S203 above may be the same as the intra-screen prediction processing steps of the conventional H.264 or MVC schemes.
 The processing flow of the inter-screen prediction performed in step S204 above will be explained with reference to FIGS. 8 and 2.
 First, in step S301, the deblocking filter unit 211 receives the reference image block signal from the addition unit 208 outside the inter-screen prediction unit 218 and performs the FIR filter processing described above. The deblocking filter unit 211 outputs the filtered correction block signal to the frame memory 212. The process then proceeds to step S302.
 In step S302, the frame memory 212 receives the correction block signal from the deblocking filter unit 211 and holds it as part of an image, together with information identifying the viewpoint number and the frame number. The process then proceeds to step S303.
 In step S303, upon receiving the image block signal from the image input unit 201, the motion/disparity vector detection unit 214 searches the group of reference images stored in the frame memory 212 for a block similar to that image block (block matching) and generates vector information (a motion vector or disparity vector) representing the found block. The motion/disparity vector detection unit 214 outputs the information necessary for encoding, including the detected vector information (reference viewpoint image number and reference frame number), to the motion/disparity compensation unit 213. The process then proceeds to step S304.
 In step S304, the motion/disparity compensation unit 213 receives the information necessary for encoding from the motion/disparity vector detection unit 214 and extracts the corresponding prediction block from the frame memory 212.
 At the same time, when the reference viewpoint image number input from the motion/disparity vector detection unit 214 differs from that of the encoding target image, the motion/disparity compensation unit 213 extracts from the frame memory 212 the plurality of already-encoded peripheral image block signals and the corresponding plurality of peripheral image block signals in the reference image indicated by the disparity vector. When the correlation between the input peripheral image block signals is high, the motion/disparity compensation unit 213 performs the inter-image characteristic difference compensation processing described above. The motion/disparity compensation unit 213 outputs the finally generated prediction image block signal to the prediction method control unit 209 and the image selection unit 210 as the inter-screen prediction image block signal.
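 A minimal sketch of a linear characteristic-difference compensation of this kind, assuming a scale-and-offset model whose parameters are estimated from the two sets of peripheral blocks; a least-squares fit stands in for the patent's coefficient calculation (e.g. formula (6)), which is not restated here.

import numpy as np

def compensate_reference(ref_block, cur_neighbors, ref_neighbors):
    """Scale-and-offset compensation of the reference block.

    cur_neighbors / ref_neighbors -- pixels of the peripheral blocks around
    the encoding target block and around the referenced block, respectively.
    Fits cur ~ S * ref + O on the peripheral pixels, then applies (S, O)
    to the reference block itself.
    """
    x = np.asarray(ref_neighbors, dtype=float).ravel()
    y = np.asarray(cur_neighbors, dtype=float).ravel()
    s, o = np.polyfit(x, y, 1)  # slope S and offset O of the linear model
    return s * np.asarray(ref_block, dtype=float) + o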
 When the reference vector input from the motion/disparity vector detection unit 214 is a motion vector, the motion/disparity compensation unit 213 outputs the reference image block signal as-is to the prediction method control unit 209 and the image selection unit 210 as the inter-screen prediction image block signal, without performing the inter-image characteristic difference compensation processing.
 At the same time, the motion/disparity compensation unit 213 calculates the difference vector between the prediction vector generated from the vector information of the blocks adjacent to the encoding target block and the motion/disparity vector input from the motion/disparity vector detection unit 214, and outputs the calculated difference vector and the information necessary for prediction (reference viewpoint image number and reference frame number) to the prediction method control unit 209. The inter-screen prediction then ends.
 As described above, according to the present embodiment, the image encoding device 100 performs disparity compensation prediction in the disparity compensation prediction scheme by correcting the signal of the reference block using the inter-image characteristic difference between at least one peripheral block of the reference block indicated by the disparity vector of the encoding target block and the corresponding peripheral block of the encoding target block; in other words, by performing disparity compensation prediction with luminance compensation and/or color compensation that reduces the difference in characteristics between images. It therefore requires no additional information for inter-image characteristic difference compensation, and of course none needs to be transmitted to the decoding side, while still performing inter-image characteristic difference compensation during disparity compensation prediction. Moreover, since an encoding method is provided that makes it possible to accurately estimate the information necessary for the inter-image characteristic difference compensation processing even when different subjects appear in the encoding target block and its periphery, the encoding efficiency can be dramatically improved.
(Embodiment 2)
<Configuration of Image Decoding Device>
 FIG. 9 is a block diagram showing a configuration example of an image decoding device according to an embodiment of the present invention.
 The image decoding device 700 illustrated in FIG. 9 comprises a code separation unit 701 and an image decoding unit 702. The blocks and arrows drawn with dotted lines inside the image decoding unit 702 are used to conceptually explain the operation of the image decoding unit 702.
 The functions and operation of the image decoding device 700 are described below.
 Upon receiving the transmitted encoded stream, the image decoding device 700 passes it to the code separation unit 701. The code separation unit 701 separates the encoded stream into reference viewpoint image encoded data and non-reference viewpoint image encoded data, and outputs them to the image decoding unit 702. The inside of the image decoding unit 702 is explained here in terms of conceptual processing. The reference viewpoint decoding processing unit 703 decodes the encoded data that was compression-encoded by the scheme following intra-view prediction encoding, and restores the viewpoint image of the reference viewpoint. The restored viewpoint image is used for display (or output) as-is, and is also used for decoding the viewpoint images of the non-reference viewpoints described later. The non-reference viewpoint decoding processing unit 704 decodes the encoded data that was compression-encoded by the scheme following inter-view prediction encoding, based on the restored reference viewpoint image, and restores the viewpoint images of the non-reference viewpoints. Finally, the reference viewpoint image and the non-reference viewpoint images are used as-is as display images (or output images).
 Next, the details of the image decoding unit 702 will be described with reference to FIG. 10. FIG. 10 is a block diagram showing a configuration example of the image decoding unit 702.
 The image decoding unit 702 comprises an encoded data input unit 813, an entropy decoding unit 801, an inverse quantization unit 802, an inverse orthogonal transform unit 803, an addition unit 804, a prediction method control unit 805, an image selection unit 806, a deblocking filter unit 807, a frame memory 808, a motion/disparity compensation unit 809, an intra prediction unit 810, and an image output unit 812. For purposes of explanation, the intra-screen prediction unit 816 and the inter-screen prediction unit 815 are drawn with dotted lines; the intra-screen prediction unit 816 includes the intra prediction unit 810, and the inter-screen prediction unit 815 includes the deblocking filter unit 807, the frame memory 808, and the motion/disparity compensation unit 809.
 When the operation of the image decoding unit 702 was described with reference to FIG. 9, the decoding processing of the reference viewpoint and that of the other, non-reference viewpoints were explicitly separated into the reference viewpoint decoding processing unit 703 and the non-reference viewpoint decoding processing unit 704; in practice, however, the two share much processing, so the following describes a form in which the reference viewpoint decoding processing and the non-reference viewpoint decoding processing are integrated.
 Specifically, the intra-view prediction decoding scheme performed as the reference viewpoint decoding processing unit 703 described above combines the processing performed by the intra-screen prediction unit 816 of FIG. 10 with the part of the processing performed by the inter-screen prediction unit 815 that references images of the same viewpoint (motion compensation). The inter-view prediction decoding scheme performed as the non-reference viewpoint decoding processing unit 704 combines the processing performed by the intra-screen prediction unit 816 with the processing performed by the inter-screen prediction unit 815 that references images of the same viewpoint (motion compensation) and the processing that references images of a different viewpoint (disparity compensation). Furthermore, in the inter-screen prediction unit 815, the processing that references images of the same viewpoint as the processing target (motion compensation) and the processing that references a different viewpoint (disparity compensation) differ only in the image referenced at decoding time, so the processing can be shared by using ID information indicating the reference image (reference viewpoint number and reference frame number). Likewise, the processing that restores an image by adding the residual component obtained by decoding the encoded image data to the image predicted by each of the prediction units 815 and 816 is common to both the reference viewpoint and the non-reference viewpoints. Details will be described later.
 The encoded data input unit 813 divides the encoded image data input from the code separation unit 701 into processing block units (for example, 16 pixels × 16 pixels) and outputs them to the entropy decoding unit 801. The encoded data input unit 813 repeats the output while sequentially changing the block position until all the blocks in the frame are completed and the input encoded data is exhausted.
 The entropy decoding unit 801 performs entropy decoding on the encoded data input from the encoded data input unit 813, that is, the reverse processing (for example, variable-length decoding) of the encoding method (for example, variable-length coding) performed by the entropy encoding unit 205 of FIG. 2, and extracts the difference image code, the quantization coefficient, and the prediction encoding information. The entropy decoding unit 801 outputs the difference image code and the quantization coefficient to the inverse quantization unit 802, and the prediction encoding information to the prediction method control unit 805.
 The inverse quantization unit 802 inversely quantizes the difference image code input from the entropy decoding unit 801 using the extracted quantization coefficient to generate a decoded frequency domain signal, and outputs it to the inverse orthogonal transform unit 803.
 The inverse orthogonal transform unit 803 applies, for example, an inverse DCT to the input decoded frequency domain signal to generate a decoded difference image block signal, which is a spatial domain signal. The inverse orthogonal transform unit 803 is not limited to the inverse DCT; any other method (for example, the IFFT (Inverse Fast Fourier Transform)) may be used, provided a spatial domain signal can be generated from the decoded frequency domain signal. The inverse orthogonal transform unit 803 outputs the generated decoded difference image block signal to the addition unit 804.
 The prediction method control unit 805 extracts, from the prediction encoding information input from the entropy decoding unit 801, the block-unit prediction method adopted by the image encoding device 100 shown in FIGS. 1 and 2. The prediction method is either intra-screen prediction or inter-screen prediction. The prediction method control unit 805 outputs information on the extracted prediction method to the image selection unit 806. The prediction method control unit 805 also takes the encoding information out of the prediction encoding information input from the entropy decoding unit 801 and outputs it to the processing unit corresponding to the extracted prediction method: when the prediction method is intra-screen prediction, it outputs the encoding information to the intra-screen prediction unit 816 as intra-screen prediction encoding information; when the prediction method is inter-screen prediction, it outputs the encoding information to the inter-screen prediction unit 815 as inter-screen prediction encoding information.
 The image selection unit 806 selects, according to the prediction method input from the prediction method control unit 805, either the intra-screen prediction image block signal input from the intra-screen prediction unit 816 or the inter-screen prediction image block signal input from the inter-screen prediction unit 815: the former when the prediction method is intra-screen prediction, the latter when it is inter-screen prediction. The image selection unit 806 outputs the selected prediction image block signal to the addition unit 804.
 The addition unit 804 adds the prediction image block signal input from the image selection unit 806 to the decoded difference image block signal input from the inverse orthogonal transform unit 803 to generate a decoded image block signal. The addition unit 804 outputs the decoded image block signal to the intra-screen prediction unit 816, the inter-screen prediction unit 815, and the image output unit 812.
 The image output unit 812 receives the decoded image block signal from the addition unit 804 and temporarily holds it as part of an image in a frame memory (not shown). After rearranging the frames into display order, the image output unit 812 outputs them to the outside of the image decoding device 700 when all the viewpoint images are ready.
 Next, the intra-screen prediction unit 816 and the inter-screen prediction unit 815 will be described.
 First, the intra-screen prediction unit 816 is described.
 The intra prediction unit 810 in the intra-screen prediction unit 816 receives the decoded image block signal from the addition unit 804 and the intra-screen prediction encoding information from the prediction method control unit 805. Based on the intra-screen prediction encoding information, the intra prediction unit 810 reproduces the intra-screen prediction performed at encoding time. The intra-screen prediction can be performed according to the conventional scheme described above. The intra prediction unit 810 outputs the generated prediction image to the image selection unit 806 as the intra-screen prediction image block signal.
 Next, the details of the inter-screen prediction unit 815 will be described.
 The deblocking filter unit 807 applies, to the decoded image block signal input from the addition unit 804, the same processing as the FIR filter performed by the deblocking filter unit 211, and outputs the processing result (correction block signal) to the frame memory 808.
 The frame memory 808 receives the correction block signal from the deblocking filter unit 807 and holds it as part of an image, together with information identifying the viewpoint number and the frame number. In the frame memory 808, the picture type and the order of the input images are managed by a memory management unit (not shown), and images are stored or discarded according to its instructions. For image management, the conventional MVC image management method can also be used.
 The motion/disparity compensation unit 809 receives the inter-screen prediction encoding information from the prediction method control unit 805 and extracts from it the reference image information (reference viewpoint image number and reference frame number) and the difference vector (the difference between the motion/disparity vector and the prediction vector). The motion/disparity compensation unit 809 generates a prediction vector by the same method as the prediction vector generation method performed by the motion/disparity compensation unit 213 described above, and adds the difference vector to the calculated prediction vector to reproduce the motion/disparity vector. Based on the reference image information and the motion/disparity vector, the motion/disparity compensation unit 809 extracts the target image block signal (prediction image block signal) from the images stored in the frame memory 808.
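 The vector reproduction in this step reduces to a component-wise addition; a minimal sketch follows (the (x, y) integer-pair layout is an assumption).

def reconstruct_vector(pred_vec, diff_vec):
    """Reproduce the motion/disparity vector at the decoder: the prediction
    vector derived from already-decoded neighbors (by the same rule the
    encoder used) plus the difference vector parsed from the stream."""
    return (pred_vec[0] + diff_vec[0], pred_vec[1] + diff_vec[1])

# Example: prediction vector (3, -1) + transmitted difference (1, 2)
# reproduces the motion/disparity vector (4, 1).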
 When the reference vector is a disparity vector, the motion/disparity compensation unit 809 simultaneously extracts the image block signals around the decoding target block (decoding target block peripheral image block signals) and the image block signals around the prediction image block (prediction image block peripheral image block signals), and judges their similarity under the same conditions as at encoding time, using the similarity judgment method used at encoding time described above. Furthermore, according to the judgment result, it performs the same inter-image characteristic difference correction processing as at encoding time.
 When the reference vector is a motion vector, the reference image block signal is output as-is, without performing the inter-image characteristic difference compensation processing.
 The motion/disparity compensation unit 809 outputs the image block signal after the inter-image characteristic difference correction, or the reference image block signal, to the image selection unit 806 as the inter-screen prediction image block signal.
 As described above, when decoding the encoded streams of the viewpoint images captured from at least two viewpoints, the image decoding device 700 performs disparity compensation while correcting the characteristic differences between the viewpoint images. In the example of FIG. 10, the motion/disparity compensation unit 809 performs this disparity compensation processing in the same manner as the motion/disparity compensation unit 213 described above. That is, the motion/disparity compensation unit 809 comprises a corresponding block extraction unit that extracts the reference block referred to when decoding the decoding target block, and a correction processing unit that corrects the characteristic differences between the viewpoint images based on the similarity between the blocks around the decoding target block and the blocks around the reference block indicated by the disparity vector.
 As described above, the similarity judgment under the same conditions as at encoding time, and the same inter-image characteristic difference correction processing as at encoding time according to its result, are performed. In other words, the other application examples described for the motion/disparity compensation unit 213 can likewise be applied to the motion/disparity compensation unit 809.
 More specifically, in the image decoding device 700, the relative positions of the blocks around the decoding target block, as seen from the decoding target block, preferably correspond to the relative positions of the blocks around the reference block, as seen from the reference block extracted by the corresponding block extraction unit as the reference source of the decoding target block. Also, the correction processing unit preferably performs the correction while excluding peripheral blocks showing a subject different from the subject shown in the decoding target block, or selects the single peripheral block most likely to show the same subject as the decoding target block and performs the correction using it. Furthermore, the depth information corresponding to each viewpoint image (decoding target image) can be used for the subject judgment; the depth information used is that of the decoding target block and its peripheral blocks, and of the reference block and its peripheral blocks. The depth information is preferably information based on the representative values of the blocks into which the depth image is divided.
<Flowchart of Image Decoding Device>
 Next, the image decoding processing performed by the image decoding device 700 according to the present embodiment will be described. FIG. 11 is a flowchart showing the image decoding processing performed by the image decoding device 700. The description also refers to FIG. 9.
 In step S501, the image decoding device 700 receives an encoded stream from the outside (for example, from the image encoding device 100), and the code separation unit 701 separates and extracts the encoded image data of each viewpoint. The process then proceeds to step S502.
 In step S502, the image decoding unit 702 decodes the encoded image data of the viewpoint images separated and extracted in step S501 and outputs the result to the outside of the image decoding unit 702. The process then proceeds to step S503.
 In step S503, if the image decoding device 700 determines that all the viewpoint images have been processed, it rearranges the viewpoint images along the time axis, puts the viewpoint directions in order, and outputs the result to the outside of the image decoding device 700. If it determines that the images of all viewpoints and times have not yet been processed, it returns to step S502 and continues the processing.
 Next, the decoding of the viewpoint images performed in step S502 will be described with reference to FIGS. 12 and 10.
 In step S601, the image decoding unit 702 receives encoded image data from the outside. The process then proceeds to step S602.
 In step S602, the encoded data input unit 813 divides the encoded data input from outside the image decoding unit 702 into processing blocks of a predetermined size (for example, 16 pixels vertically × 16 pixels horizontally) and outputs them to the entropy decoding unit 801. The image decoding unit 702 repeats the processing of steps S602 to S608 for each image block in the frame.
 In step S603, the entropy decoding unit 801 performs entropy decoding on the encoded image data input from the encoded data input unit 813 to generate the difference image code, the quantization coefficient, and the prediction encoding information. The entropy decoding unit 801 outputs the difference image code and the quantization coefficient to the inverse quantization unit 802, and the prediction encoding information to the prediction method control unit 805. The prediction method control unit 805 receives the prediction encoding information from the entropy decoding unit 801 and takes out the information on the prediction method and the encoding information corresponding to that method. When the prediction method is intra-screen prediction, it outputs the encoding information to the intra-screen prediction unit 816 as intra-screen prediction encoding information; when the prediction method is inter-screen prediction, it outputs the encoding information to the inter-screen prediction unit 815 as inter-screen prediction encoding information. The process then proceeds to steps S604 and S605.
 In step S604, the intra prediction unit 810 in the intra-screen prediction unit 816 receives the intra-screen prediction encoding information input from the prediction method control unit 805 and the decoded image block signal input from the addition unit 804, and performs the intra-screen prediction processing. The intra prediction unit 810 outputs the generated intra-screen prediction image block signal to the image selection unit 806. In the first iteration, if the processing of the addition unit 804 has not yet completed, a reset image block signal (an image block signal in which all pixel values are 0) is input. The process then proceeds to step S606.
 In step S605, the inter-screen prediction unit 815 performs inter-screen prediction based on the inter-screen prediction encoding information input from the prediction method control unit 805 and the decoded image block signal input from the addition unit 804. The inter-screen prediction unit 815 outputs the generated inter-screen prediction image block signal to the image selection unit 806. The inter-screen prediction processing is described later. In the first iteration, if the processing of the addition unit 804 has not yet completed, a reset image block signal (an image block signal in which all pixel values are 0) is input. The process then proceeds to step S606.
 In step S606, the image selection unit 806 selects, based on the information on the prediction method input from the prediction method control unit 805, either the intra-screen prediction image block signal input from the intra-screen prediction unit 816 or the inter-screen prediction image block signal input from the inter-screen prediction unit 815, and outputs it to the addition unit 804. The process then proceeds to step S607.
 In step S607, the inverse quantization unit 802 performs, using the difference image code and quantization coefficient input from the entropy decoding unit 801, the inverse of the quantization performed by the quantization unit 204 of the image encoding unit 101 in FIG. 2, and outputs the generated decoded frequency domain signal to the inverse orthogonal transform unit 803. The inverse orthogonal transform unit 803 receives the inversely quantized decoded frequency domain signal from the inverse quantization unit 802 and performs the inverse of the orthogonal transform processing performed by the orthogonal transform unit 203 of the image encoding unit 101 in FIG. 2, thereby decoding the difference image (decoded difference image block signal). The inverse orthogonal transform unit 803 outputs the decoded difference image block signal to the addition unit 804. The addition unit 804 adds the prediction image block signal input from the image selection unit 806 to the decoded difference image block signal input from the inverse orthogonal transform unit 803 to generate a decoded image block signal, and outputs the decoded image block signal to the image output unit 812, the intra-screen prediction unit 816, and the inter-screen prediction unit 815. The process then proceeds to step S608.
 In step S608, the image output unit 812 places the decoded image block signal input from the addition unit 804 at the corresponding position in the image to generate an output image. If the processing of steps S602 to S608 has not been completed for all blocks in the frame, the block to be processed is changed and the process returns to step S602.
 The image output unit 812 rearranges the images into display order, assembles the viewpoint images of the same frame, and outputs them to the outside of the image decoding device 700.
 The processing flow of the inter-screen prediction unit 815 will be described with reference to FIGS. 13 and 10.
 In step S701, the deblocking filter unit 807 receives the decoded image block signal from the addition unit 804 outside the inter-screen prediction unit 815 and performs the FIR filter processing performed at encoding time. The deblocking filter unit 807 outputs the filtered correction block signal to the frame memory 808. The process then proceeds to step S702.
 In step S702, the frame memory 808 receives the correction block signal from the deblocking filter unit 807 and holds it as part of an image, together with information identifying the viewpoint number and the frame number. The process then proceeds to step S703.
 In step S703, the motion/disparity compensation unit 809 receives the inter-screen prediction encoding information from the prediction method control unit 805 and extracts from it the reference image information (reference viewpoint image number and reference frame number) and the difference vector (the difference between the motion/disparity vector and the prediction vector). The motion/disparity compensation unit 809 generates a prediction vector by the same method as the prediction vector generation method performed by the motion/disparity compensation unit 213 described above, and adds the difference vector to the calculated prediction vector to generate the motion/disparity vector. Based on the reference image information and the motion/disparity vector, the motion/disparity compensation unit 809 extracts the target image block signal (prediction image block signal) from the images stored in the frame memory 808. At the same time, when the reference vector is a disparity vector, the motion/disparity compensation unit 809 extracts the peripheral image block signals of the decoding target block and of the reference image block, performs the inter-image characteristic difference compensation processing described above, and outputs the result to the image selection unit 806 as the inter-screen prediction image block signal.
When the reconstructed vector is a motion vector, the motion/disparity compensation unit 809 outputs the predicted image block signal as-is to the image selection unit 806 as the inter-screen predicted image block signal, without performing the inter-image characteristic difference compensation processing described above.
Thereafter, the inter-screen prediction process ends.
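In outline, the vector reconstruction of step S703 is a single addition, and the choice between the motion path and the disparity path follows from the reference image information alone; nothing beyond what is already transmitted for disparity compensation is needed. A minimal sketch with illustrative names (the compensation step itself is sketched under Embodiment 3 below):

```cpp
struct Vec2 { int x, y; };

// Step S703 in outline: the decoder derives the prediction vector
// exactly as the encoder did, then adds the transmitted difference
// vector to recover the motion/disparity vector.
Vec2 reconstructMotionDisparityVector(Vec2 predictionVector,
                                      Vec2 differenceVector) {
    return Vec2{predictionVector.x + differenceVector.x,
                predictionVector.y + differenceVector.y};
}

// The recovered vector is treated as a disparity vector (triggering the
// inter-image characteristic difference compensation) when the
// reference image information points to another view; a reference in
// the same view at another time instant is a motion reference.
bool isDisparityReference(int currentView, int referenceView) {
    return currentView != referenceView;
}
```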
As described above, according to the present embodiment, the image decoding apparatus 700 can perform disparity compensation prediction with inter-image characteristic difference compensation without receiving any additional information for explicit inter-image characteristic difference compensation. That is, according to the present embodiment, it is possible to decode data that has been encoded with improved coding efficiency, as in the image encoding apparatus 100 of FIG. 1.
(Embodiment 3)
<Software, Method>
Part of the image encoding device 100 and the image decoding device 700 in the above-described embodiments may be realized by a computer. Examples include the code configuration unit 104; the subtraction unit 202, orthogonal transform unit 203, quantization unit 204, entropy encoding unit 205, inverse quantization unit 206, inverse orthogonal transform unit 207, addition unit 208, prediction scheme control unit 209, image selection unit 210, deblocking filter unit 211, motion/disparity compensation unit 213, motion/disparity vector detection unit 214, and intra prediction unit 215 in the image encoding unit 101; the code separation unit 701; and the entropy decoding unit 801, inverse quantization unit 802, inverse orthogonal transform unit 803, addition unit 804, prediction scheme control unit 805, image selection unit 806, deblocking filter unit 807, motion/disparity compensation unit 809, and intra prediction unit 810 in the image decoding unit 702.
In that case, a program for realizing this control function (an image encoding program and/or an image decoding program) may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into a computer system and executed. The "computer system" here is a computer system built into the image encoding apparatus 100 or the image decoding apparatus 700, and includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or to a storage device such as a hard disk built into a computer system. Furthermore, the "computer-readable recording medium" may also include a medium that dynamically holds a program for a short time, such as a communication line used when a program is transmitted via a network such as the Internet or via a communication line such as a telephone line, and a medium that holds a program for a certain period of time, such as a volatile memory inside a computer system serving as a server or a client in that case. The above program may realize only a part of the functions described above, or may realize the functions described above in combination with a program already recorded in the computer system. The program may be distributed not only via a portable recording medium or a network, but also via a broadcast wave.
The image encoding program causes a computer to execute an image encoding process that corrects characteristic differences between viewpoint images and performs disparity compensation when encoding viewpoint images captured from at least two viewpoints. Here, the image encoding process includes a step of extracting a reference block to be referred to when encoding an encoding target block, and a step of correcting the characteristic difference between the viewpoint images based on the similarity between blocks surrounding the encoding target block and blocks surrounding the reference block. Other application examples are as described for the image encoding device.
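As a concrete illustration of these two steps, the following sketch computes a correction from the pixels surrounding the target block and the reference block, and applies it to the reference block. The simple mean-offset model, the function names, and the 8-bit sample assumption are illustrative; they stand in for whichever correction the correction coefficient calculation unit 302 actually derives.

```cpp
#include <cstdint>
#include <vector>

// Mean of the decoded pixels surrounding a block (e.g. the rows and
// columns adjacent to its top and left edges, which are already
// reconstructed on both the encoder and the decoder side).
static double surroundingMean(const std::vector<uint8_t>& neighborPixels) {
    double sum = 0.0;
    for (uint8_t p : neighborPixels) sum += p;
    return neighborPixels.empty() ? 0.0 : sum / neighborPixels.size();
}

// Correct the reference block so that its surroundings match those of
// the target block: corrected = ref + offset, where
// offset = mean(target surroundings) - mean(reference surroundings).
// Assumption: a mean-offset model stands in for the actual correction.
std::vector<uint8_t> compensateCharacteristicDifference(
    const std::vector<uint8_t>& referenceBlock,
    const std::vector<uint8_t>& targetSurroundings,
    const std::vector<uint8_t>& referenceSurroundings) {
    const double offset = surroundingMean(targetSurroundings) -
                          surroundingMean(referenceSurroundings);
    std::vector<uint8_t> corrected(referenceBlock.size());
    for (size_t i = 0; i < referenceBlock.size(); ++i) {
        double v = referenceBlock[i] + offset;
        if (v < 0.0)   v = 0.0;    // clip to the 8-bit sample range
        if (v > 255.0) v = 255.0;
        corrected[i] = static_cast<uint8_t>(v + 0.5);
    }
    return corrected;
}
```

Because the surrounding pixels used here are already reconstructed identically on the encoder and decoder sides, both compute the same correction, which is why no additional correction parameter has to be signaled.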
The image decoding program causes a computer to execute an image decoding process that corrects characteristic differences between viewpoint images and performs disparity compensation when decoding an encoded stream of viewpoint images captured from at least two viewpoints. Here, the image decoding process includes a step of extracting a reference block to be referred to when decoding a decoding target block, and a step of correcting the characteristic difference between the viewpoint images based on the similarity between blocks surrounding the decoding target block and blocks surrounding the reference block. Other application examples are as described for the image decoding device. This image decoding program can be implemented as part of multi-viewpoint image playback software.
Part or all of the image encoding device 100 and the image decoding device 700 in the above-described embodiments may be realized as an integrated circuit such as an LSI (Large Scale Integration) or as an IC (Integrated Circuit) chip set. Each functional block of the image encoding device 100 and the image decoding device 700 may be implemented as an individual processor, or some or all of them may be integrated into a single processor. The method of circuit integration is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor. If integrated circuit technology that replaces LSI emerges with the advancement of semiconductor technology, an integrated circuit based on that technology may be used.
The present invention can also take the form of an image encoding method and an image decoding method, as illustrated by the flow of control in the image encoding device and the image decoding device, and as described as the processing of each step of the image encoding program and the image decoding program.
The image encoding method is a method of performing disparity compensation by correcting characteristic differences between viewpoint images when encoding viewpoint images captured from at least two viewpoints. The method includes a step of extracting a reference block to be referred to when encoding an encoding target block, and a step of correcting the characteristic difference between the viewpoint images based on the similarity between blocks surrounding the encoding target block and blocks surrounding the reference block. Other application examples are as described for the image encoding device.
The image decoding method is a method of performing disparity compensation by correcting characteristic differences between viewpoint images when decoding an encoded stream of viewpoint images captured from at least two viewpoints. The method includes a step of extracting a reference block to be referred to when decoding a decoding target block, and a step of correcting the characteristic difference between the viewpoint images based on the similarity between blocks surrounding the decoding target block and blocks surrounding the reference block. Other application examples are as described for the image decoding device.
DESCRIPTION OF REFERENCE NUMERALS: 100... image encoding device, 101... image encoding unit, 102... reference viewpoint encoding processing unit, 103... non-reference viewpoint encoding processing unit, 104... code configuration unit, 201... image input unit, 202... subtraction unit, 203... orthogonal transform unit, 204... quantization unit, 205... entropy encoding unit, 206... inverse quantization unit, 207... inverse orthogonal transform unit, 208... addition unit, 209... prediction scheme control unit, 210... image selection unit, 211... deblocking filter unit, 212... frame memory, 213... motion/disparity compensation unit, 214... disparity vector detection unit, 215... intra prediction unit, 217... intra-screen prediction unit, 218... inter-screen prediction unit, 301... corresponding block extraction unit, 302... correction coefficient calculation unit, 303... correction processing unit, 401... encoding target image, 402... reference image, 403... encoding target block, 404... encoding target block surrounding image block, 405... reference block, 406... reference block surrounding image block, 407... disparity vector, 700... image decoding device, 701... code separation unit, 702... image decoding unit, 703... reference viewpoint decoding processing unit, 704... non-reference viewpoint decoding processing unit, 801... entropy decoding unit, 802... inverse quantization unit, 803... inverse orthogonal transform unit, 804... addition unit, 805... prediction scheme control unit, 806... image selection unit, 807... deblocking filter unit, 808... frame memory, 809... motion/disparity compensation unit, 810... intra prediction unit, 812... image output unit, 813... encoded data input unit, 815... inter-screen prediction unit, 816... intra-screen prediction unit, 901... subject, 902... camera, 903... sensor, 906... encoder, 907... decoder, 908... display unit

Claims (5)

  1.  An image decoding device that performs disparity compensation when decoding an encoded stream of viewpoint images captured from at least two viewpoints, the device comprising:
     a corresponding block extraction unit that extracts a reference block to be referred to when decoding a decoding target block; and
     a correction processing unit that corrects the pixel values of the reference block using the pixel values of blocks surrounding the decoding target block and blocks surrounding the reference block.
  2.  The image decoding device according to claim 1, wherein the correction processing unit performs the correction using a plurality of decoded blocks adjacent to the decoding target block as the blocks surrounding the decoding target block, and a plurality of decoded blocks adjacent to the reference block as the blocks surrounding the reference block.
  3.  The image decoding device according to claim 1 or 2, wherein the correction processing unit calculates a correction coefficient based on the difference between the pixel values of the blocks surrounding the decoding target block and the pixel values of the blocks surrounding the reference block.
  4.  The image decoding device according to claim 1 or 2, wherein the correction processing unit corrects a characteristic difference between viewpoint images.
  5.  An image encoding device that performs disparity compensation when encoding viewpoint images captured from at least two viewpoints, the device comprising:
     a corresponding block extraction unit that extracts a reference block to be referred to when encoding an encoding target block; and
     a correction processing unit that corrects the pixel values of the reference block using the pixel values of blocks surrounding the encoding target block and blocks surrounding the reference block.
PCT/JP2012/080019 2011-11-21 2012-11-20 Image coding device, image decoding device, and methods and programs thereof WO2013077304A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2011253617A JP2013110555A (en) 2011-11-21 2011-11-21 Image encoder, image decoder, and method and program thereof
JP2011-253617 2011-11-21

Publications (1)

Publication Number Publication Date
WO2013077304A1 true WO2013077304A1 (en) 2013-05-30

Family

ID=48469748

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2012/080019 WO2013077304A1 (en) 2011-11-21 2012-11-20 Image coding device, image decoding device, and methods and programs thereof

Country Status (2)

Country Link
JP (1) JP2013110555A (en)
WO (1) WO2013077304A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10321142B2 (en) 2013-07-15 2019-06-11 Samsung Electronics Co., Ltd. Method and apparatus for video encoding for adaptive illumination compensation, method and apparatus for video decoding for adaptive illumination compensation

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009005280A (en) * 2007-06-25 2009-01-08 Nippon Telegr & Teleph Corp <Ntt> Image encoding method, image decoding method, image encoding apparatus, image decoding apparatus, image encoding program, image decoding program, and computer readable recording medium
JP2010507336A (en) * 2006-10-18 2010-03-04 トムソン ライセンシング Method and apparatus for local brightness and color compensation without explicit signaling
WO2010095471A1 (en) * 2009-02-23 2010-08-26 日本電信電話株式会社 Multi-view image coding method, multi-view image decoding method, multi-view image coding device, multi-view image decoding device, multi-view image coding program, and multi-view image decoding program
JP2012080242A (en) * 2010-09-30 2012-04-19 Sharp Corp Prediction vector generation method, image encoding method, image decoding method, prediction vector generation device, image encoding device, image decoding device, prediction vector generation program, image encoding program, and image decoding program

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010507336A (en) * 2006-10-18 2010-03-04 トムソン ライセンシング Method and apparatus for local brightness and color compensation without explicit signaling
JP2010507334A (en) * 2006-10-18 2010-03-04 トムソン ライセンシング Method and apparatus for local brightness and color compensation without explicit signaling
JP2009005280A (en) * 2007-06-25 2009-01-08 Nippon Telegr & Teleph Corp <Ntt> Image encoding method, image decoding method, image encoding apparatus, image decoding apparatus, image encoding program, image decoding program, and computer readable recording medium
WO2010095471A1 (en) * 2009-02-23 2010-08-26 日本電信電話株式会社 Multi-view image coding method, multi-view image decoding method, multi-view image coding device, multi-view image decoding device, multi-view image coding program, and multi-view image decoding program
JP2012080242A (en) * 2010-09-30 2012-04-19 Sharp Corp Prediction vector generation method, image encoding method, image decoding method, prediction vector generation device, image encoding device, image decoding device, prediction vector generation program, image encoding program, and image decoding program

Also Published As

Publication number Publication date
JP2013110555A (en) 2013-06-06

Similar Documents

Publication Publication Date Title
JP7491985B2 (en) Multi-view signal codec
US9363535B2 (en) Coding motion depth maps with depth range variation
CN111819853B (en) Image block encoding device and image block encoding method
KR101773693B1 (en) Disparity vector derivation in 3d video coding for skip and direct modes
JP5575908B2 (en) Depth map generation technique for converting 2D video data to 3D video data
EP3100454B1 (en) Method for low-latency illumination compensation process
JP6039178B2 (en) Image encoding apparatus, image decoding apparatus, method and program thereof
KR20190016058A (en) Methods of decoding using skip mode and apparatuses for using the same
JP6307152B2 (en) Image encoding apparatus and method, image decoding apparatus and method, and program thereof
KR20210158843A (en) Method of encoding/decoding motion vector for multi-view video and apparatus thereof
KR101631183B1 (en) MULTIVIEW IMAGE ENCODING METHOD, MULTIVIEW IMAGE DECODING METHOD, MULTIVIEW IMAGE ENCODING DEVICE, MULTIVIEW IMAGE DECODING DEVICE, AND PROGRAMS OF SAME
JPWO2014168082A1 (en) Image encoding method, image decoding method, image encoding device, image decoding device, image encoding program, and image decoding program
EP4038579A1 (en) Method and apparatus for inter-picture prediction with virtual reference picture for video coding
RU2571511C2 (en) Encoding of motion depth maps with depth range variation
KR20140124919A (en) A method for adaptive illuminance compensation based on object and an apparatus using it
US20160255370A1 (en) Moving image encoding method, moving image decoding method, moving image encoding apparatus, moving image decoding apparatus, moving image encoding program, and moving image decoding program
WO2012161318A1 (en) Image encoding device, image decoding device, image encoding method, image decoding method and program
JP5706291B2 (en) Video encoding method, video decoding method, video encoding device, video decoding device, and programs thereof
WO2013077304A1 (en) Image coding device, image decoding device, and methods and programs thereof
JP6232117B2 (en) Image encoding method, image decoding method, and recording medium
JP7504999B2 (en) Method and apparatus for inter-picture prediction using virtual reference pictures for video coding
WO2023107773A1 (en) Method and apparatus for scene detection based encoding
WO2023239391A1 (en) Adjacent spatial motion vector predictor candidates improvement
JP2013179554A (en) Image encoding device, image decoding device, image encoding method, image decoding method, and program
KR20140124045A (en) A method for adaptive illuminance compensation based on object and an apparatus using it

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12851953

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12851953

Country of ref document: EP

Kind code of ref document: A1