KR20150136017A

KR20150136017A - A method and an apparatus for processing a multi-view video signal

Info

Publication number: KR20150136017A
Application number: KR1020150071482A
Authority: KR
Inventors: 이배근; 김주영
Original assignee: 주식회사 케이티
Priority date: 2014-05-26
Filing date: 2015-05-22
Publication date: 2015-12-04
Also published as: WO2015182927A1

Abstract

In the multi-view video signal processing method according to the present invention, when the current depth block is coded in the intra-single mode, a single depth index related to the current depth block is obtained from the bit stream, Mode, and restores the current depth block considering whether the selected candidate sample is available for intra prediction of the current depth block.

Description

TECHNICAL FIELD [0001] The present invention relates to a multi-view video signal processing method and apparatus,

The present invention relates to a method and apparatus for coding a video signal.

Recently, the demand for high resolution and high quality images such as high definition (HD) image and ultra high definition (UHD) image is increasing in various applications. As the image data has high resolution and high quality, the amount of data increases relative to the existing image data. Therefore, when the image data is transmitted using a medium such as a wired / wireless broadband line or stored using an existing storage medium, The storage cost is increased. High-efficiency image compression techniques can be utilized to solve such problems as image data becomes high-resolution and high-quality.

An inter picture prediction technique for predicting a pixel value included in a current picture from a previous or a subsequent picture of a current picture by an image compression technique, an intra picture prediction technique for predicting a pixel value included in a current picture using pixel information in the current picture, There are various techniques such as an entropy encoding technique in which a short code is assigned to a value having a high appearance frequency and a long code is assigned to a value having a low appearance frequency. Image data can be effectively compressed and transmitted or stored using such an image compression technique.

On the other hand, demand for high-resolution images is increasing, and demand for stereoscopic image content as a new image service is also increasing. Video compression techniques are being discussed to effectively provide high resolution and ultra-high resolution stereoscopic content.

It is another object of the present invention to provide a method and apparatus for performing inter-view prediction using a disparity vector in encoding / decoding a multi-view video signal.

An object of the present invention is to provide a method and apparatus for deriving a variation vector of a texture block using depth data of a depth block in encoding / decoding a multi-view video signal.

The present invention aims to provide a method and apparatus for deriving a variation vector from a neighboring block of a current texture block in encoding / decoding a multi-view video signal.

An object of the present invention is to provide a method and apparatus for coding a depth image according to a segment-based depth coding scheme in encoding / decoding a multi-view video signal.

SUMMARY OF THE INVENTION It is an object of the present invention to provide a method and apparatus for encoding and decoding a multi-view video signal, which obtains an absolute value of an offset through entropy decoding based on context-based adaptive binary arithmetic coding.

An object of the present invention is to provide a method and apparatus for intra-prediction of a depth image in an intra-single mode in encoding / decoding a multi-view video signal.

A method and apparatus for decoding a multi-layer video signal according to the present invention includes: acquiring a single depth index related to the current depth block from a bitstream when a current depth block is coded in an intra-single mode; And selects one of a plurality of candidate samples available for decoding the current depth block into the intra-single mode, and restores the current depth block using the selected candidate samples.

The plurality of candidate samples may include at least one of a depth sample of a neighboring block adjacent to the current depth block or a depth value derived from a variation vector corresponding to the current depth block, And a control unit.

In the method and apparatus for decoding a multi-layer video signal according to the present invention, the neighboring block includes at least one of a left neighboring block or an uppermost neighboring block adjacent to the current depth block, and a depth sample of the neighboring block is transmitted to the neighboring block Is a sample of a pre-defined position among a plurality of depth samples belonging to the sample.

In the method and apparatus for decoding a multi-layer video signal according to the present invention, a depth sample of the left neighboring block is a depth sample located in the middle of depth samples adjacent to the left side of the current depth block, And a depth sample located in the middle of the depth samples adjacent to the current depth block.

The method and apparatus for decoding a multi-layer video signal according to the present invention may further comprise constructing a candidate list using the plurality of candidate samples, wherein the plurality of candidate samples are arranged in the candidate list in a pre- .

A method and an apparatus for encoding a multi-layer video signal according to the present invention are characterized by encoding a single-depth index related to the current depth block when the current depth block is encoded in an intra-single mode, Selecting one of a plurality of candidate samples usable for encoding the block in the intra single mode and restoring the current depth block using the selected candidate sample.

In the method and apparatus for encoding a multi-layer video signal according to the present invention, the neighboring block includes at least one of a left neighboring block or an uppermost neighboring block adjacent to the current depth block, and a depth sample of the neighboring block is transmitted to the neighboring block Is a sample of a pre-defined position among a plurality of depth samples belonging to the sample.

The depth sample of the left neighboring block is a depth sample located in the middle of the depth samples adjacent to the left side of the current depth block, and the depth sample of the upper neighboring block is And a depth sample located in the middle of the depth samples adjacent to the current depth block.

The method and apparatus for encoding a multi-layer video signal according to the present invention may further comprise constructing a candidate list using the plurality of candidate samples, wherein the plurality of candidate samples are arranged in the candidate list in a pre- .

According to the present invention, it is possible to efficiently perform inter-view prediction using a disparity vector.

According to the present invention, the variation vector of the current texture block can be effectively derived from the depth data of the current depth block or the variation vector of the neighboring texture block.

According to the present invention, the residual depth value of the depth image can be efficiently coded according to the segment-based depth coding technique.

According to the present invention, the absolute value of the offset can be effectively decoded through entropy decoding based on context-based adaptive binary arithmetic coding.

According to the present invention, the encoding efficiency of intraprediction of a depth image can be improved by encoding in an intra-single mode.

FIG. 1 is a schematic block diagram of a video decoder according to an embodiment to which the present invention is applied.
FIG. 2 illustrates a method of performing an inter-view prediction based on a disparity vector according to an embodiment to which the present invention is applied.
FIG. 3 illustrates a method of deriving a variation vector of a current texture block using depth data of a depth image according to an embodiment of the present invention. Referring to FIG.
FIG. 4 illustrates a candidate spatial / temporal neighbor block of a current texture block according to an embodiment of the present invention. Referring to FIG.
FIG. 5 illustrates a method of decoding a depth image according to a segment-based depth coding technique according to an embodiment of the present invention. Referring to FIG.
FIG. 6 illustrates a method of obtaining an absolute value of an offset through entropy decoding based on context-based adaptive binary arithmetic coding according to an embodiment of the present invention. Referring to FIG.
FIGS. 7 to 9 illustrate a method of binarizing an absolute value of an offset according to the maximum number of beans cMax, according to an embodiment to which the present invention is applied.
FIG. 10 illustrates a method of decoding a current depth block based on an intra single mode according to an embodiment of the present invention. Referring to FIG.
11 shows an example of a plurality of candidate samples usable in an intra-single mode according to an embodiment to which the present invention is applied.
FIG. 12 illustrates a method of encoding a single-depth index based on a plurality of candidate lists according to an embodiment of the present invention. Referring to FIG.

The technique of compression-encoding or decoding multi-view video signal data considers spatial redundancy, temporal redundancy, and redundancy existing between viewpoints. Also, in case of a multi-view image, a multi-view texture image captured at two or more viewpoints can be coded to realize a three-dimensional image. Further, the depth data corresponding to the multi-viewpoint texture image may be further encoded as needed. In coding the depth data, it is needless to say that compression coding can be performed in consideration of spatial redundancy, temporal redundancy, or inter-view redundancy. The depth data expresses the distance information between the camera and the corresponding pixel. In the present specification, the depth data can be flexibly interpreted as depth-related information such as a depth value, a depth information, a depth image, a depth picture, a depth sequence and a depth bit stream . In the present specification, coding may include both concepts of encoding and decoding, and may be flexibly interpreted according to the technical idea and technical scope of the present invention.

FIG. 1 is a schematic block diagram of a video decoder according to an embodiment to which the present invention is applied.

1, the video decoder includes a NAL parsing unit 100, an entropy decoding unit 200, an inverse quantization / inverse transformation unit 300, an intra prediction unit 400, an in-loop filter unit 500, A buffer unit 600, and an inter-prediction unit 700.

The NAL parsing unit 100 may receive the bitstream including the texture data again. In addition, when depth data is required for coding texture data, a bitstream including encoded depth data may be further received. The texture data and the depth data input at this time may be transmitted in one bit stream or in a separate bit stream. The NAL parsing unit 100 may perform parsing in units of NALs to decode the input bitstream. If the input bitstream is multi-point related data (e.g., 3-Dimensional Video), the input bitstream may further include camera parameters. The camera parameters may include intrinsic camera parameters and extrinsic camera parameters and the unique camera parameters may include focal length, aspect ratio, principal point, etc., and the camera parameter of the note may include position information of the camera in the world coordinate system, and the like.

The entropy decoding unit 200 can extract quantized transform coefficients through entropy decoding, coding information for predicting a texture picture, and the like.

The inverse quantization / inverse transformation unit 300 can obtain a transform coefficient by applying a quantization parameter to the quantized transform coefficient, and invert the transform coefficient to decode the texture data or the depth data. Here, the decoded texture data or depth data may mean residual data according to prediction processing. In addition, the quantization parameter for the depth block can be set in consideration of the complexity of the texture data. For example, when a texture block corresponding to a depth block is a region with a high complexity, a low quantization parameter may be set, and in the case of a low complexity region, a high quantization parameter may be set. The complexity of the texture block can be determined based on the difference value between adjacent pixels in the reconstructed texture picture as shown in Equation (1).

In Equation (1), E represents the complexity of the texture data, C represents the restored texture data, and N may represent the number of pixels in the texture data region for which the complexity is to be calculated. Referring to Equation 1, the complexity of the texture data corresponds to the difference value between the texture data corresponding to the (x, y) position and the texture data corresponding to the position (x-1, y) Can be calculated using the difference value between the texture data and the texture data corresponding to the position (x + 1, y). In addition, the complexity can be calculated for the texture picture and the texture block, respectively, and the quantization parameter can be derived using Equation (2) below.

Referring to Equation (2), the quantization parameter for the depth block can be determined based on the ratio of the complexity of the texture picture and the complexity of the texture block. [alpha] and [beta] may be a variable integer that is derived from the decoder, or it may be a predetermined integer within the decoder.

The intra prediction unit 400 may perform intra prediction using the restored texture data in the current texture picture. In-depth prediction can also be performed on the depth map in the same manner as the texture picture. For example, the coding information used for intra-picture prediction of the texture picture can be used equally in the tep-picture. Here, the coding information used for intra-picture prediction may include intra-prediction mode and partition information of intra-prediction.

The in-loop filter unit 500 may apply an in-loop filter to each coded block to reduce block distortion. The filter can smooth the edges of the block to improve the picture quality of the decoded picture. The filtered texture or depth pictures may be output or stored in the decoded picture buffer unit 600 for use as a reference picture. On the other hand, when the texture data and the depth data are coded using the same in-loop filter, the coding efficiency may deteriorate because the characteristics of the texture data and the characteristics of the depth data are different from each other. Thus, a separate in-loop filter for depth data may be defined. Hereinafter, an area-based adaptive loop filter and a trilateral loop filter will be described as an in-loop filtering method capable of efficiently coding depth data.

In the case of an area-based adaptive loop filter, it may be determined whether to apply an area-based adaptive loop filter based on the variance of the depth block. Here, the variation amount of the depth block can be defined as a difference between the maximum pixel value and the minimum pixel value in the depth block. It is possible to decide whether to apply the filter by comparing the variation amount of the depth block with the predetermined threshold value. For example, if the variation of the depth block is greater than or equal to the predetermined threshold value, it means that the difference between the maximum pixel value and the minimum pixel value in the depth block is large, so that it can be determined to apply the area-based adaptive loop filter . Conversely, when the depth variation is smaller than the predetermined threshold value, it can be determined that the area-based adaptive loop filter is not applied. When the filter is applied according to the comparison result, the pixel value of the filtered depth block may be derived by applying a predetermined weight to the neighboring pixel value. Here, the predetermined weight may be determined based on the positional difference between the currently filtered pixel and the neighboring pixel and / or the difference value between the currently filtered pixel value and the neighboring pixel value. In addition, the neighboring pixel value may mean any one of the pixel values included in the depth block excluding the pixel value currently filtered.

The Tracheal Loop Filter according to the present invention is similar to the region-based adaptive loop filter, but differs in that it additionally considers the texture data. Specifically, the trilateral loop filter compares the following three conditions and extracts depth data of neighboring pixels satisfying the following three conditions.

Condition 1.

Condition 2.

Condition 3.

Condition 1 represents the positional difference between the current pixel p and the neighboring pixel q in the depth block as a predetermined parameter

And condition 2 is to compare the difference between the depth data of the current pixel p and the depth data of the neighboring pixel q to a predetermined parameter

And condition 3 is to compare the difference between the texture data of the current pixel p and the texture data of the neighboring pixel q to a predetermined parameter

.

It is possible to extract neighboring pixels satisfying the above three conditions and to filter the current pixel p with an intermediate value or an average value of these depth data.

A decoded picture buffer unit (600) performs a function of storing or opening a previously coded texture picture or a depth picture to perform inter-picture prediction. At this time, frame_num and picture order count (POC) of each picture can be used to store or open the picture in the decoding picture buffer unit 600. Further, among the previously coded pictures in the depth coding, since there are depth pictures that are different from the current depth picture, in order to utilize such pictures as a reference picture, time identification information for identifying the time of the depth picture may be used have. The decoded picture buffer unit 600 can manage reference pictures using a memory management control operation method and a sliding window method in order to more flexibly perform inter-picture prediction. This is to uniformly manage the memories of the reference picture and the non-reference picture into one memory and efficiently manage them with a small memory. In the depth coding, the depth pictures may be marked with a separate mark in order to distinguish them from the texture pictures in the decoding picture buffer unit, and information for identifying each depth picture in the marking process may be used.

The inter prediction unit 700 may perform motion compensation of the current block using the reference picture and the motion information stored in the decoding picture buffer unit 600. [ In the present specification, the motion information can be understood as a broad concept including motion vectors and reference index information. In addition, the inter prediction unit 700 may perform temporal inter prediction to perform motion compensation. Temporal inter prediction may refer to inter prediction using motion information of a current texture block and a reference picture located at the same time and at a different time in the current texture block. Also, in the case of a multi-view image captured by a plurality of cameras, temporal inter prediction as well as inter-view prediction may be performed. The motion information used for the inter-view prediction may include a disparity vector or an inter-view motion vector. A method of performing inter-view prediction using the disparity vector will be described with reference to FIG.

FIG. 2 illustrates a method of performing an inter-view prediction based on a disparity vector according to an embodiment to which the present invention is applied.

Referring to FIG. 2, a disparity vector of a current texture block may be derived (S200).

For example, the disparity vector may be derived from the depth image corresponding to the current texture block, which will be described in detail with reference to FIG.

It may also be derived from a neighboring block that is spatially adjacent to the current texture block or may be derived from a temporal neighbor block located at a different time zone from the current texture block. A method of deriving a variation vector from a spatial / temporal neighboring block of a current texture block will be described with reference to FIG.

Referring to FIG. 2, inter-view prediction of the current texture block may be performed using the transition vector derived in operation S200 (S210).

For example, the texture data of the current texture block can be predicted or restored using the texture data of the reference block specified by the variation vector. Here, the reference block may belong to the time point used for the inter-view prediction of the current texture block, that is, the reference point. The reference block may belong to a reference picture located in the same time zone as the current texture block.

In addition, a reference block belonging to a reference time point may be specified using the variation vector, and a temporal motion vector of the current texture block may be derived using the temporal motion vector of the specified reference block. Here, the temporal motion vector means a motion vector used for temporal inter prediction, and can be distinguished from a transition vector used for inter-view prediction.

FIG. 3 illustrates a method of deriving a variation vector of a current texture block using depth data of a depth image according to an embodiment of the present invention. Referring to FIG.

Referring to FIG. 3, position information of a depth block (hereinafter referred to as a current depth block) in a depth picture corresponding to a current texture block may be obtained based on position information of a current texture block (S300).

The position of the current depth block can be determined in consideration of the spatial resolution between the depth picture and the current picture.

For example, if the depth picture and the current picture are coded with the same spatial resolution, the current depth block may be determined to be the same block as the current texture block of the current picture. On the other hand, the current picture and the depth picture may be coded with different spatial resolutions. Since the depth information indicating the distance information between the camera and the object has the characteristic that the coding efficiency may not decrease significantly even if the spatial resolution is lowered. Thus, if the spatial resolution of the depth picture is coded lower than the current picture, the decoder may involve an upsampling process on the depth picture before acquiring the position information of the current depth block. In addition, when the aspect ratio between the upsampled depth picture and the current picture does not exactly coincide, offset information may be additionally considered in acquiring the position information of the current depth block in the upsampled depth picture. Here, the offset information may include at least one of top offset information, left offset information, right offset information, and bottom offset information. The top offset information may indicate a positional difference between at least one pixel located at the top of the upsampled depth picture and at least one pixel located at the top of the current picture. Left, right, and bottom offset information may also be defined in the same manner, respectively.

Referring to FIG. 3, the depth data corresponding to the position information of the current depth block may be obtained (S310).

If there are a plurality of pixels in the current depth block, the depth data corresponding to the corner pixels of the current depth block may be used. Alternatively, depth data corresponding to the center pixel of the current depth block may be used. Alternatively, one of a maximum value, a minimum value, and a mode value may be selectively used among a plurality of depth data corresponding to a plurality of pixels, or an average value of a plurality of depth data may be used.

Referring to FIG. 3, the variation vector of the current texture block may be derived using the depth data obtained in operation S310 (S320).

For example, the transition vector of the current texture block may be derived as: < EMI ID = 3.0 >

Referring to Equation (3), v represents depth data, a represents a scaling factor, and f represents an offset used to derive a variation vector. The scaling factor a and the offset f may be signaled in a video parameter set or a slice header, or may be a pre-set value in a decoder. n is a variable indicating the value of the bit shift, which can be variably determined according to the accuracy of the variation vector.

FIG. 4 illustrates a candidate spatial / temporal neighbor block of a current texture block according to an embodiment of the present invention. Referring to FIG.

Referring to FIG. 4A, the spatial neighboring block includes a left neighbor block A1, an upper neighbor block B1, a lower left neighbor block A0, an upper right neighbor block B0, And block B2.

Referring to FIG. 4B, the temporally neighboring block may denote a block in the same position as the current texture block. Specifically, the temporal neighboring block is a block belonging to a picture located at a time zone different from the current texture block, and includes a block (BR) corresponding to the lower right pixel of the current texture block, a block (CT) corresponding to the center pixel of the current texture block or And a block TL corresponding to the upper left pixel of the current texture block.

The displacement vector of the current texture block may be derived from a disparity-compensated prediction block (hereinafter referred to as a DCP block) among the spatially / temporally neighboring blocks. Here, the DCP block may be a block encoded through inter-view texture prediction using a transition vector. In other words, the DCP block can perform the inter-view prediction using the texture data of the reference block specified by the disparity vector. In this case, the transition vector of the current texture block can be predicted or restored by using the transition vector used for the inter-view texture prediction of the DCP block.

Alternatively, the disparity vector of the current texture block may be derived from a disparity vector based motion compensation prediction block (hereinafter referred to as a DV-MCP block) based on the disparity vector among the spatially neighboring blocks. Here, the DV-MCP block may mean a block coded through inter-view motion prediction using a disparity vector. In other words, the DV-MCP block can perform temporal inter prediction using the temporal motion vector of the reference block specified by the disparity vector. In this case, the disparity vector of the current texture block may be predicted or reconstructed using the disparity vector used for obtaining the temporal motion vector of the reference block of the DV-MCP block.

The current texture block may be searched for whether a spatial / temporal neighbor block corresponds to a DCP block according to a predefined priority, and a variation vector may be derived from the first searched DCP block. As an example of the predefined priority, a search can be performed with priority of a spatial neighbor block -> temporal neighbor block, and among the spatial neighbor blocks, a priority order of A1-> B1-> B0-> A0-> B2 It is possible to search whether it corresponds to a DCP block. However, it is to be understood that this is merely one embodiment of the priority order and can be determined differently within a range that is obvious to a person skilled in the art.

If any of the spatially / temporally neighboring blocks does not correspond to the DCP block, the spatial neighboring block may be additionally searched for whether it corresponds to the DV-MCP block, and the variation vector may be derived from the first searched DV-MCP block .

FIG. 5 illustrates a method of decoding a depth image according to a segment-based depth coding technique according to an embodiment of the present invention. Referring to FIG.

In the present invention, the segment-based depth coding technique is a technique in which the residual depth values regarding a depth image (e.g., a coding block or a prediction block) are not separately encoded, and one residual depth value (Hereinafter referred to as " depth value "). One depth image may be composed of at least one segment. If the depth image is composed of two or more segments, the representative residual depth value can be signaled for each segment. Here, the representative residual depth value may be derived by averaging the difference between the original depth value and the predicted depth value. For example, the difference value between the original depth value and the prediction depth value can be obtained for each pixel of the depth image, and the average value of the obtained difference values can be defined as the representative residual depth value. Alternatively, a difference value between the average value of the original depth values of the depth image and the average value of the prediction depth values may be defined as a representative residual depth value.

Using the above-described segment-based depth coding scheme, the bit rate for the residual depth value can be reduced compared to the case of encoding the residual depth value for each pixel.

Referring to FIG. 5, an offset absolute value (depth_dc_abs) and offset sign information (depth_dc_sign_flag) can be obtained from a bitstream (S500).

Here, the offset absolute value and the offset code information are the syntax used to derive the offset value DcOffset. That is, the offset value DcOffset can be coded into the offset absolute value and the offset code information.

Specifically, the absolute value of the offset means the absolute value of the offset value (DcOffset), and the offset code information can indicate the sign of the offset value (DcOffset). The absolute value of the offset can be obtained through entropy decoding based on context-based adaptive binary arithmetic coding, which will be described with reference to FIGS. 6 to 9. FIG.

Referring to FIG. 5, an offset value DcOffset may be derived using the offset absolute value and offset sign information obtained in operation S500 (S510).

Here, the offset value DcOffset may be used as the representative residual depth value. Alternatively, when the representative residual depth value is coded using a depth look-up table, the offset value DcOffset may be defined as an index mapped to the representative residual depth value instead of the representative residual depth value itself have. The depth lookup table defines a mapping relationship between a depth value of a video image and an index assigned to the depth value. In this manner, when the depth lookup table is used, the encoding efficiency can be improved by encoding only the index assigned to the depth value without encoding the depth value itself.

Therefore, the method of deriving the representative residual depth value from the offset value DcOffset depending on whether the depth lookup table is used in the process of encoding the representative residual depth value will be different.

Referring to FIG. 5, it can be checked whether a depth lookup table is used (S520).

Specifically, whether a depth lookup table is used can be confirmed by using a depth lookup table flag dlt_flag. The depth lookup table flag is a syntax that is encoded to indicate whether a depth lookup table is used in encoding or decoding a residual depth value. The depth lookup table flag may be encoded for each layer or view including the corresponding video image. Alternatively, the depth lookup table flag may be encoded for each sequence or slice including the corresponding video image.

Referring to FIG. 5, if it is determined that the depth lookup table is used, the representative residual depth value may be derived using the offset value DcOffset derived in step S510 and the depth lookup table in operation S530.

Specifically, a representative residual depth value corresponding to the offset value (DcOffset) can be derived using a depth lookup table. For example, the representative residual depth value can be derived as shown in the following equation (4).

In Equation (4), DltIdxToVal [] denotes a function for converting an index into a depth value using a depth lookup table, and DltValToIdx [] denotes a function for converting a depth value into an index using a depth lookup table.

First, the predictive depth value dcPred of the depth image can be converted into a corresponding first index DltValToIdx [dcPred] using a depth lookup table. For example, a depth value that minimizes the difference between the depth value defined in the depth lookup table and the predicted depth value (dcPred) or the predicted depth value (dcPred) is selected, and the selected depth value The assigned index can be determined as the first index. Here, the prediction depth value dcPred may be derived as an average value of samples located at the corners of the restored depth image. In this case, the samples located at the corners may include upper left corner samples, upper right corner samples, lower left corner samples, and lower right corner samples in the depth image. The second index is obtained by adding the transformed first index DltValToIdx [dcPred] and the offset value DcOffset, and the second index is transformed to a corresponding depth value (hereinafter referred to as a restoration depth value) by using the depth lookup table ). &Lt; / RTI > A value obtained by subtracting the prediction depth value dcPred from the restoration depth value may be determined as a representative remaining depth value dcVal.

Referring to FIG. 5, if it is determined that the depth lookup table is not used, the representative residual depth value may be derived using the offset value DcOffset derived in step S510 (S540). For example, the derived offset value (DcOffset) can be set as the representative residual depth value.

In operation S550, the depth image may be restored using the representative residual depth value derived in operation S530 or S540.

FIG. 6 illustrates a method of obtaining an absolute value of an offset through entropy decoding based on context-based adaptive binary arithmetic coding according to an embodiment of the present invention. Referring to FIG.

Referring to FIG. 6, a bin string may be generated through a normal coding or a bypass coding process on a bitstream encoded by context-based adaptive binary arithmetic coding (S600).

Here, the regular coding is adaptive binary arithmetic coding for predicting the probability of a bin using context modeling, and the bypass coding can mean coding for outputting the binarized bin string as a bitstream. Context modeling means probability modeling for each bin, and the probability can be updated according to the value of the currently encoded bin. When encoded through the regular coding, an empty string can be generated based on context modeling of the absolute value of the offset, that is, probability of occurrence of each bit.

The absolute value of the offset may be obtained through inverse-binarization of the bin string generated in operation S600 (S610).

Here, the de-binarization may refer to an inverse process of the binarization process on the absolute value of the offset performed in the encoder. Binarization method, and the like unary binary encoding (unary binarization), a cutting-type unary binary encoding (Truncated unary binarization), unary / zero-th order index Gollum coupled binary coding (Truncated unary / zero ^th order exponential golomb binarization) may be used .

The binarization of the absolute value of the offset may be performed by a combination of a prefix bin string and a suffix bin string. Here, the preamble bin string and the suffix bin string can be represented by different binarization methods. For example, the preamble bin string may use a truncated unary binary encoding, and the suffix bin string may use the zeroth exponent golem binarization encoding. Hereinafter, the process of binarizing the absolute value of the offset according to the maximum number cMax of beans constituting the preamble bin string will be described with reference to FIGS.

FIGS. 7 to 9 illustrate a method of binarizing an absolute value of an offset according to the maximum number of beans cMax, according to an embodiment to which the present invention is applied.

FIG. 7 shows a binarization method when the maximum number of bins cMax is set to 3. Referring to FIG. 7, the absolute value of the offset is represented by a combination of a preamble bin string and a suffix bin string. The preamble bin string and the suffix bin string are binarized by a cutting type unary binary encoding and a zero order exponent Golomb binary encoding, respectively do.

If the maximum number of bins (cMax) is set to 3 and the offset absolute value is 3, the preamble bin string can be represented by 111 and the suffix bin string can be represented by 0. If the absolute value of the offset is greater than 3, the preamble bin string is fixed at 111, and the suffix bin string can be represented by binarizing the difference value between the absolute value of the offset and the maximum number of beans according to the zero- have.

For example, assume that an empty string of 111101 is generated through context modeling of the absolute value of the offset. At this time, the generated empty string 111101 can be divided into a preamble empty string and a suffix empty string based on the maximum number (cMax) of bins. Here, since the maximum number cMax of bins is set to 3, the preamble bin string will be 111 and the suffix bin string will be 101.

Meanwhile, if inverse binarization is performed on the preamble bin string 111 that has been binarized according to the cutting type unary binary encoding, 3 is obtained, and inverse binarization is performed on the binarized suffix bin string 101 according to the zero- Performing binarization can yield 2. The acquired 3 and 2 can be added to obtain 5 as an offset absolute value.

FIG. 8 shows a binarization method when the maximum number of beans cMax is set to 5. Referring to FIG. 8, the absolute value of the offset is represented by a combination of a preamble bin string and a suffix bin string. The preamble bin string and the suffix bin string are binarized by a cutting type unary binary encoding and a zero order exponent Golomb binary encoding, respectively do.

If the maximum number of bins (cMax) is set to 5 and the offset absolute value is 5, the preamble bin string may be represented by 11111 and the suffix bin string may be represented by 0. If the absolute value of the offset is greater than 5, the preamble bin string is fixed to 11111, and the suffix bin string can be represented by binarizing the difference value between the absolute value of the offset and the maximum number of beans according to the zero- have.

For example, assume that an empty string of 11111100 is created through context modeling of the absolute value of the offset. At this time, the generated empty string 11111100 can be divided into a preamble empty string and a suffix empty string based on the maximum number of bins (cMax). Here, since the maximum number of bins cMax is set to 5, the preamble bin string will be 11111 and the suffix bin string will be 100.

On the other hand, by performing inverse-binarization on the preamble bin string 11111 that is binarized according to the cutting type unary binary encoding, 5 is obtained, and the inverse binarization is performed on the binarized suffix bin string 100 according to the zero- You can obtain 1 by performing binarization. The obtained 5 and 1 can be added to obtain 6 as an offset absolute value.

FIG. 9 shows a binarization method when the maximum number of bins cMax is set to 7. FIG. Referring to FIG. 9, the absolute value of the offset is represented by a combination of a preamble bin string and a suffix bin string. The preamble bin string and the suffix bin string are binarized by a cutting type unary binary encoding and a zero order exponent Golomb binary encoding, respectively do.

For example, if the maximum number of bins (cMax) is set to 7 and the offset absolute value is 7, the preamble bin string may be represented by 1111111 and the suffix bin string may be represented by 0. If the absolute value of the offset is greater than 7, the preamble bin string is fixed to 1111111, and the suffix bin string can be represented by binarizing the difference value between the absolute value of the offset and the maximum number of beans according to the zero- have.

For example, assume that an empty string of 11111111100 has been generated through context modeling of the absolute value of the offset. At this time, the generated empty string 11111111100 can be divided into a preamble empty string and a suffix empty string based on the maximum number of beans (cMax). Here, since the maximum number cMax of bins is set to 7, the preamble bin string will be 1111111 and the suffix bin string will be 100.

On the other hand, by performing inverse-binarization on the preamble bin string 11111 binarized according to the cutting type unary binary coding, 7 is obtained, and inverse-binarization is performed on the binarized suffix bin string 100 according to the 0 < th & You can obtain 1 by performing binarization. The obtained 7 and 1 can be added to obtain 8 as an offset absolute value.

FIG. 10 illustrates a method of decoding a current depth block based on an intra single mode according to an embodiment of the present invention. Referring to FIG.

The depth image may have the same or similar value as the texture image. In this case, the restoration / prediction depth value of the current depth block can be obtained by using one depth sample, and this is called an intra single mode. Here, a depth value derived from a peripheral depth sample or a disparity vector reconstructed by one depth sample may be used, and separate index information may be signaled to specify one depth sample.

Referring to FIG. 10, a single depth index related to the current depth block may be obtained from the bitstream (S1000).

If the current depth block is coded in the intra-single mode, the single-depth index can be signaled. The single-depth index is information for specifying one depth sample used for deriving the restoration / prediction depth value of the current depth block. For example, the single-depth index indicates whether the restoration / prediction depth value of the current depth block is derived using the reconstructed depth sample of the upper neighboring block adjacent to the current depth block, or the restored depth of the left neighboring block adjacent to the current depth block. Whether it is derived using a sample, and the like.

Referring to FIG. 10, one of a plurality of candidate samples may be selected based on the single-depth index obtained in S1000 (S1010).

Here, the plurality of candidate samples may be a depth sample that can be used to encode / decode the current depth block in the intra-single mode. The plurality of candidate samples may include a depth sample of a neighboring block adjacent to the current depth block. A depth sample of a neighboring block used as a candidate sample can be selectively used as a depth sample of a pre-defined one of the depth samples belonging to a neighboring block, which will be described in detail with reference to FIG.

Also, the current depth block may be encoded in an intra-single mode using a depth value derived from a disparity vector related to a current texture block or a current depth block. In this case, a plurality of candidate samples may be encoded as a variation depth candidate , disparity derived depth candidate (DDD)). The depth value corresponding to the DDD can be derived by the following equation (5).

In Equation (5), w represents the distance between the viewpoints, and DVx represents the x component of the disparity vectors DVx and DVy.

Referring to FIG. 10, a restoration / prediction depth value of a current depth block may be obtained in consideration of whether a candidate sample selected in step S1010 is available for intraprediction of a current depth block (step S1020).

As an example of a case where a specific candidate sample is not available for intra prediction, when a specific candidate sample exists outside the depth picture, the candidate sample exists in a slice and / or a tile different from the current depth block, And may not be decoded due to reasons such as independent or parallel processing with the depth block.

If the selected candidate sample is available for intra prediction, the restoration / prediction depth value of all samples belonging to the current depth block can be derived as the value of the selected candidate sample.

On the other hand, if the selected candidate sample is not available for intra prediction, a restoration / prediction depth value of all samples belonging to the current depth block can be derived using a default offset. Here, as an example of a default offset value, a default value predefined in the decoding apparatus may be used, and a predefined default value may be set based on the bit depth value (BitDepth) of the coded texture picture or depth picture have. Alternatively, a value obtained by adding or subtracting a predetermined value from the value of another available candidate sample may be used as a default offset value, and another available candidate sample may be a sample closest to the current depth block in the scan order.

It should be appreciated that the current depth block may be finally restored using the restoration / prediction depth value obtained in step S1020 and the representative residual depth value derived in FIG. In the present embodiment, a method of restoring the current depth block based on the intra-single mode has been described, but the present invention is not limited thereto. For example, in the case of the current texture block, the restoration / prediction texture value can be derived using only one texture sample of surrounding texture samples according to the intra single mode described above.

11 shows an example of a plurality of candidate samples usable in an intra-single mode according to an embodiment to which the present invention is applied.

The current depth block encoded in the intra single mode can use a depth sample of a predefined location among the depth samples of a spatially adjacent neighbor block.

As shown in FIG. 11A, a plurality of candidate samples include at least one of a depth sample of an upper neighboring block, a depth sample of a left neighboring block, or a depth sample of a left-upper neighboring block based on a current depth block .

Specifically, the depth sample of the left neighboring block used as the candidate sample is at least one of the depth sample A1 located at the uppermost-end or the depth sample A0 located at the lowermost one of the depth samples adjacent to the left side of the current depth block One can be included. Also, the depth sample of the upper neighbor block used as the candidate sample is at least one of the depth sample B1 located at the leftmost side or the depth sample B0 located at the rightmost side among the depth samples adjacent to the upper end of the current depth block . &Lt; / RTI >

Referring to FIG. 11B, a plurality of candidate samples may include a depth sample of the upper neighboring block, a depth sample of the left neighboring block, a depth sample of the upper left neighboring block, a depth sample of the upper left neighboring block, Sample or a depth sample of a left-bottom neighboring block.

Specifically, the depth sample of the left neighbor block used as the candidate sample is a depth sample A1 located at the lowermost one of the depth samples adjacent to the left side of the current depth block, or a depth sample A1 located at the lowermost end of the depth sample A1 And a depth sample (A0) located in the depth direction. The depth sample of the upper neighbor block used as the candidate sample is located at the right side of the depth sample B1 located at the rightmost side or the depth sample B1 located at the rightmost side among the depth samples adjacent to the upper end of the current depth block And a depth sample B0 located thereon.

11, a depth sample located in the middle of the depth samples adjacent to the left side of the current depth block may be used as a candidate sample, and a depth sample adjacent to the top of the current depth block may be used as a candidate sample. An intermediate depth sample may be used as a candidate sample.

The plurality of candidate samples described above may constitute a candidate list in a pre-set order. For example, a plurality of candidate samples may be arranged in the candidate list in the order A0- > B0- > B1- > A1- > B2. Alternatively, the variation depth candidate (DDD) may be arranged with priority over spatially adjacent candidate samples. For example, they can be arranged in the candidate list in the order DDD->A0->B0->B1->A1-> B2. However, the present invention is not limited to this, and it is needless to say that the candidate list can be configured with various priorities within a range that can be easily derived by an ordinary technician.

FIG. 12 illustrates a method of encoding a single-depth index based on a plurality of candidate lists according to an embodiment of the present invention. Referring to FIG.

A first candidate list may be generated based on some candidate samples among the plurality of candidate samples described above and a second candidate list may be generated based on the remaining candidate samples. In the present embodiment, it is assumed that the first candidate list (DDD, A0, B0) is used to generate the second candidate list using the second candidate sample group (B1, A1, B2) do. Needless to say, the number and / or type of the candidate samples belonging to each candidate sample group can be variously determined within a range that can be easily derived by a typical descriptor.

If at least one of DDD, A0, and Bo belonging to the first candidate sample group is available, the first candidate list can be generated based on the available candidate samples. If there is no available candidate among DDD, A0, and B0, a second candidate list may be additionally generated using the second candidate sample group. In this case as well, the second candidate list can be generated on the basis of the available candidate samples among B1, A2, and B2 belonging to the second candidate sample group.

As shown in FIG. 12, when the final candidate sample used to encode the current depth block in the intra-single mode is included in the first candidate list, the encoder can encode the first single-depth index related to the first candidate list have. That is, when DDD is used as the final candidate sample, a value of 0 is encoded with a first single-depth index. When A0 is used, a value of 1 is encoded with a first single-depth index.

If the final candidate sample used to encode the current depth block into the intra-single mode is included in the second candidate list, the encoder can also encode the second single-depth index related to the second candidate list. In this case, any one of the values 0, 1, and 2 can be coded to specify any one of B1, A1, and B2.

In addition, the encoder may encode the value of the first single-depth index into a predetermined constant value to indicate that the current depth block is encoded in the intra-single mode with reference to the second candidate list. Here, the predetermined constant value may be determined to be the same value as the maximum number of candidate samples included in the first candidate sample group. Therefore, as shown in FIG. 12, since the maximum number of candidate samples included in the first candidate sample group is three, the value of the first single-depth index can be encoded as three. The second single-depth index may be limitedly encoded only when the value of the first single-depth index is coded with a predetermined constant value.

Alternatively, a flag may be encoded that specifies whether the current depth block refers to the first candidate list or the second candidate list. The first single-depth index or the second single-depth index may be selectively signaled according to the value of the corresponding flag.

Claims

Wherein when the current depth block is coded in an intra single mode, obtaining a single depth index related to the current depth block from a bitstream, wherein the intra single mode uses a depth sample, , &Lt; / RTI >
Selecting one of a plurality of candidate samples available for decoding the current depth block into an intra-single mode based on the single-depth index; And
And reconstructing the current depth block considering whether the selected candidate sample is available for intra prediction of the current depth block.

The apparatus of claim 1, wherein the plurality of candidate samples include at least one of a depth sample of a neighboring block adjacent to the current depth block or a depth value derived from a variation vector corresponding to the current depth block,
Wherein the neighboring block includes at least one of a left neighboring block or an upper neighboring block adjacent to the current depth block and a depth sample of the neighboring block is a sample of a predefined location among a plurality of depth samples belonging to the neighboring block &Lt; / RTI >

3. The method of claim 2, wherein the depth sample of the left neighboring block is a depth sample located in the middle of the depth samples adjacent to the left side of the current depth block, and the depth sample of the upper neighboring block is a depth sample Wherein the depth sample is a depth sample located in the middle of the depth video signal.

The method according to claim 1,
And restoring the current depth block to a value of the selected candidate sample if the selected candidate sample is available for intra prediction,
And if the selected candidate sample is not available for intra prediction, restoring the current depth block using a default offset.

An entropy decoding unit for obtaining a single depth index related to the current depth block from a bitstream when the current depth block is coded in an intra single mode; Here, the intra-single mode is a mode for predicting samples belonging to the current depth block using one depth sample,
Selecting one of a plurality of candidate samples usable for decoding the current depth block into the intra single mode based on the single depth index and determining whether the selected candidate sample is available for intraprediction of the current depth block And a depth image restoring unit for restoring the current depth block in consideration of the current depth block.

The apparatus of claim 5, wherein the plurality of candidate samples include at least one of a depth sample of a neighboring block adjacent to the current depth block or a depth value derived from a variation vector corresponding to the current depth block,
Wherein the neighboring block includes at least one of a left neighboring block or an upper neighboring block adjacent to the current depth block and a depth sample of the neighboring block is a sample of a predefined location among a plurality of depth samples belonging to the neighboring block And the second video signal is decoded.

The method of claim 6, wherein the depth sample of the left neighboring block is a depth sample located in the middle of the depth samples adjacent to the left side of the current depth block, and the depth sample of the upper neighboring block is a depth sample Is a depth sample located in the middle of the depth video signal.

The apparatus of claim 7, wherein the depth image restoration unit comprises:
And restoring the current depth block to a value of the selected candidate sample if the selected candidate sample is available for intra prediction,
And restores the current depth block using a default offset if the selected candidate sample is not available for intra prediction.

Encoding a single depth index related to the current depth block when the current depth block is coded in an intra single mode, wherein the intra single mode predicts samples belonging to the current depth block using one depth sample Mode,
Selecting one of a plurality of candidate samples usable for encoding the current depth block into an intra-single mode based on the single-depth index; And
And reconstructing the current depth block considering whether the selected candidate sample is available for intra prediction of the current depth block.

The apparatus of claim 9, wherein the plurality of candidate samples include at least one of a depth sample of a neighboring block adjacent to the current depth block or a depth value derived from a variation vector corresponding to the current depth block,
Wherein the neighboring block includes at least one of a left neighboring block or an upper neighboring block adjacent to the current depth block and a depth sample of the neighboring block is a sample of a predefined location among a plurality of depth samples belonging to the neighboring block Wherein the video signal is encoded by the encoding means.

11. The method of claim 10, wherein the depth sample of the left neighboring block is a depth sample located in the middle of the depth samples adjacent to the left side of the current depth block, and the depth sample of the upper neighboring block is a depth sample Is a depth sample located in the middle of the video signal.

12. The method of claim 11,
And restoring the current depth block to a value of the selected candidate sample if the selected candidate sample is available for intra prediction,
Wherein if the selected candidate sample is not available for intra prediction, the current depth block is restored using a default offset.

An entropy encoding unit for encoding a single depth index related to the current depth block when the current depth block is coded in an intra single mode; Here, the intra-single mode is a mode for predicting samples belonging to the current depth block using one depth sample,
A depth image restoration unit that selects one of a plurality of candidate samples usable for decoding the current depth block into the intra single mode based on the single depth index and restores the current depth block using the selected candidate sample, And a second video signal encoding unit for encoding the second video signal.

14. The apparatus of claim 13, wherein the plurality of candidate samples include at least one of a depth sample of a neighboring block adjacent to the current depth block or a depth value derived from a variation vector corresponding to the current depth block,
Wherein the neighboring block includes at least one of a left neighboring block or an upper neighboring block adjacent to the current depth block and a depth sample of the neighboring block is a sample of a predefined location among a plurality of depth samples belonging to the neighboring block Point video signal.

15. The method of claim 14, wherein the depth sample of the left neighboring block is a depth sample located in the middle of the depth samples adjacent to the left side of the current depth block, and the depth sample of the upper neighboring block is a depth sample Is a depth sample located in the middle of the depth video signal.