Summary of the invention
The scalable extension of H.264/AVC (SVC) also provides other scalability types, for example spatial scalability. In spatial scalability, the BL and the EL differ in pixel count. Therefore, the problem arises of how bit-depth scalability can be combined with other scalability types, in particular spatial scalability. The invention provides a solution to this problem.
Claim 1 discloses an encoding method that allows bit-depth scalability to be combined with other scalability types. Claim 6 discloses a corresponding decoding method.
An apparatus that utilizes the encoding method is disclosed in claim 10, and an apparatus that utilizes the decoding method is disclosed in claim 11.
According to the invention, look-up table (LUT) based inverse tone mapping is employed in inter-layer prediction in order to improve coding efficiency. The LUT-based inverse tone mapping technique is applied to those EL image units whose collocated BL image units are intra-coded. In general, an image unit is a macroblock (MB), block, slice, image or group of images. For example, on the slice level, a LUT is created at the encoder based on a reconstructed BL I-slice and the collocated original EL slice. In particular, the LUTs can be inserted into the bit-stream in a hierarchical manner. For example, in an AVC-compatible bit-stream, one LUT is generated as a "base" LUT based on the whole sequence; based on individual frames, lower-level LUTs can additionally be generated; further, slice-level LUTs can be transmitted in the bit-stream if desired. To reduce the overhead introduced by the LUTs, at each LUT level only its difference from the immediately higher LUT level is encoded. The whole proposal can be implemented within the SVC structure, and supports compatibility with the other scalability types, namely temporal, spatial and SNR scalability.
In one embodiment, the BL information is up-sampled in two logical steps: one step is texture up-sampling and the other is bit-depth up-sampling. Texture up-sampling is a process that increases the number of pixels, while bit-depth up-sampling is a process that increases the number of values that each pixel can have. The value corresponds to the (color) intensity of the pixel. The up-sampled BL image unit is used to predict the collocated EL image unit. The encoder generates a residual from the EL video data, and this residual may be further encoded (usually entropy-coded) and transmitted. The BL information to be up-sampled can have any granularity, e.g. a single pixel, a pixel block, a MB, a slice, a whole image or a group of images. It is also possible to perform the two logical up-sampling steps in a single step. The BL information is up-sampled at the encoder side and in the same manner at the decoder side, wherein the up-sampling refers to spatial and bit-depth characteristics.
Further, the combined spatial and bit-depth up-sampling can generally be performed for intra-coded as well as for inter-coded images. However, the hierarchical LUTs according to the invention are only defined and used if the collocated BL is intra-coded.
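The two logical up-sampling steps described above can be illustrated with a minimal sketch on a toy 2x2 8-bit block. The helper names, the pixel-replication filter and the plain left-shift LUT are illustrative assumptions only, not the filters or tables actually mandated by SVC:

```python
def texture_upsample(block):
    """Spatial up-sampling by simple pixel replication (factor 2 per dimension)."""
    out = []
    for row in block:
        wide = [p for p in row for _ in (0, 1)]   # repeat each pixel horizontally
        out.append(wide)
        out.append(list(wide))                    # repeat each row vertically
    return out

def bit_depth_upsample(block, lut):
    """Inverse tone mapping: map each M-bit level to an N-bit level via the LUT."""
    return [[lut[p] for p in row] for row in block]

# Toy LUT from 8 bit (256 levels) to 10 bit: here a plain left-shift;
# a real LUT is created from reconstructed BL and original EL data.
lut = [v << 2 for v in range(256)]

bl_rec = [[10, 20],
          [30, 40]]
prediction = bit_depth_upsample(texture_upsample(bl_rec), lut)
# 'prediction' is 4x4 pixels, each with a 10-bit value range.
```

The order of the two steps here (texture first, then bit-depth) matches the preferred order described below for the encoder of Fig. 2.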
In particular, the invention discloses a method for encoding video data having a base layer and an enhancement layer, wherein base layer pixels have a lower bit-depth and a lower spatial resolution than enhancement layer pixels, the method comprising the steps of:
encoding base layer data on a first granularity level, the first granularity level being e.g. GOP level, multi-image level or slice level, wherein the base layer data are intra-coded,
reconstructing the encoded base layer data,
generating for the intra-coded base layer data (being a first predicted version of enhancement layer data) a first tone mapping table, the table defining an individual mapping between the reconstructed base layer data and the corresponding original enhancement layer data,
generating for a fragment of the intra-coded base layer data a different, second tone mapping table, the table defining an individual mapping between said fragment of the reconstructed base layer data and the corresponding fragment of the original enhancement layer data,
generating a difference table, the table representing the difference between the first and second tone mapping tables (i.e. the deviation of the second from the first tone mapping table),
bit-depth up-sampling the base layer data based on said first and second tone mapping tables, wherein for said fragment of the base layer data only the second tone mapping table is used, and wherein a second predicted version of the corresponding enhancement layer data is obtained that has a higher bit-depth resolution than the first predicted version of the enhancement layer data,
generating an enhancement layer residual, being the difference between the original enhancement layer data and the second predicted version of the corresponding enhancement layer data, and
encoding the enhancement layer residual, the first tone mapping table and said difference table, wherein the encoded first tone mapping table is associated with the encoded base layer or enhancement layer data, and the difference table is associated with said fragment of the encoded base layer or enhancement layer data.
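The encoding steps listed above can be sketched in a few lines. All names are hypothetical, and the LUT training rule (mapping each BL level to the mean of the co-located original EL values, with a left-shift fallback for unseen levels) is an assumption chosen only for illustration:

```python
def train_lut(bl_pixels, el_pixels, levels=256):
    """Map each BL level to the mean of the co-located original EL values;
    levels that do not occur fall back to a plain left-shift (an assumption)."""
    sums, counts = [0] * levels, [0] * levels
    for b, e in zip(bl_pixels, el_pixels):
        sums[b] += e
        counts[b] += 1
    return [sums[i] // counts[i] if counts[i] else i << 2 for i in range(levels)]

# Reconstructed BL and original EL samples of a whole picture ...
bl_pic = [10, 10, 20, 20, 30, 30]
el_pic = [42, 44, 81, 83, 120, 122]
# ... and of one fragment (e.g. a slice) of it.
bl_slc, el_slc = bl_pic[4:], el_pic[4:]

lut1 = train_lut(bl_pic, el_pic)             # first tone mapping table
lut2 = train_lut(bl_slc, el_slc)             # second table, for the fragment
delta = [b - a for a, b in zip(lut1, lut2)]  # difference table to be encoded

# Bit-depth up-sampling of the fragment uses only the second table:
pred_slc = [lut2[p] for p in bl_slc]
residual = [e - p for e, p in zip(el_slc, pred_slc)]
```

Only `lut1`, `delta` and `residual` need to be transmitted; the decoder rebuilds `lut2` from the first two.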
In one embodiment, the reconstructed base layer data are spatially, temporally or SNR up-sampled before the bit-depth up-sampling, wherein a first predicted version of the corresponding enhancement layer data is obtained that has a higher spatial, temporal or SNR resolution than the base layer data. The first mapping table then defines the tone mapping between the up-sampled, reconstructed BL data and the corresponding original EL data, and the second tone mapping table defines the mapping between a fragment of the up-sampled, reconstructed BL data and the corresponding fragment of the original EL data. Further, in this embodiment the first predicted version of the EL data, to which the bit-depth up-sampling refers, is different from the BL data, since this first predicted version of the EL data has been up-sampled.
According to one aspect of the invention, a method for decoding video data is disclosed, the method comprising the steps of:
extracting from encoded EL data or encoded BL data first and second tone mapping data relating to intra-coded EL data,
reconstructing a first tone mapping table from the extracted tone mapping data,
reconstructing a second tone mapping table from the extracted tone mapping data and the reconstructed first tone mapping table, wherein the utilized extracted tone mapping data represent the difference between said first and second tone mapping tables,
determining a first coding unit to which the first tone mapping table relates and a second coding unit to which the second tone mapping table relates, wherein the second coding unit is a fraction of said first coding unit,
performing inverse quantization and inverse transformation on the received BL data and EL data, wherein the inversely quantized and inversely transformed EL data comprise a residual,
reconstructing the intra-coded BL data,
up-sampling the reconstructed BL data, wherein the value depth per pixel is increased, and wherein for pixels within said second coding unit the second tone mapping table is used and for the remaining pixels of the first coding unit the first tone mapping table is used, and wherein predicted EL data are obtained, and
reconstructing from the predicted EL data and the inversely quantized and inversely transformed EL data reconstructed EL video data.
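The table handling of the decoding steps above can be sketched as follows; the tables and pixel lists are toy values, and the left-shift first table is an illustrative assumption:

```python
# Rebuild the second tone mapping table from the first table plus the
# transmitted difference data, then up-sample each pixel with the table
# of its coding unit.
lut1 = [v << 2 for v in range(256)]          # first table (8 -> 10 bit, toy)
delta = [0] * 256
delta[50] = 3                                # extracted tone mapping data

lut2 = [a + d for a, d in zip(lut1, delta)]  # reconstructed second table

unit1_pixels = [50, 60]      # remaining pixels of the first coding unit
unit2_pixels = [50, 70]      # pixels of the second (sub-)coding unit
predicted = ([lut1[p] for p in unit1_pixels] +
             [lut2[p] for p in unit2_pixels])
```

Note that the same BL level (50) maps to different EL levels depending on which coding unit it belongs to.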
The principle employed can be understood as a LUT-based general rule with an exception: in general, a first LUT is valid for a specified range, e.g. a slice, except for a specified sub-range within said range, e.g. a MB within the slice. Within the specified sub-range, a second LUT is valid. In principle, the second tone mapping table overwrites the first tone mapping table within the specified sub-range. This principle can be extended to some or all of the available coding levels.
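The rule-with-exception principle described above amounts to a tiny selector; this sketch (hypothetical names, MB indices standing in for the sub-range) is only an illustration:

```python
def pick_lut(mb_index, exception_mbs, lut1, lut2):
    """Within the specified sub-range (a set of MB indices) the second
    LUT overwrites the first one; everywhere else the first LUT applies."""
    return lut2 if mb_index in exception_mbs else lut1

# MBs 3 and 4 form the exception sub-range of this slice:
table = pick_lut(3, {3, 4}, "LUT1", "LUT2")   # selects "LUT2"
```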
According to a further aspect of the invention, a signal is disclosed that comprises base layer video data and enhancement layer video data, the base layer having a lower color bit-depth than the enhancement layer, wherein the base layer data comprise intra-coded video data, and wherein said signal further comprises first tone mapping data relating to a first level (e.g. an image) of the intra-coded video data, and further comprises second tone mapping data relating to a defined fraction (e.g. a particular slice or MB) within said first level of the video data. The first tone mapping data represent a first table that is used for the bit-depth up-sampling of base layer pixels of said first level outside said fraction, and the second tone mapping data represent the difference between a second table and said first table, wherein said second table is used for the bit-depth up-sampling of the pixels of said fraction. Herein, the term "fraction" generally refers to an image unit, e.g. a MB, image, GOP or image sequence.
According to further aspects, corresponding apparatuses are disclosed.
In one embodiment of the invention, an apparatus for encoding or decoding video data is proposed, the encoding or decoding apparatus further comprising means for performing spatial (residual or texture) up-sampling and means for performing color bit-depth up-sampling, wherein the means for spatial up-sampling increases the number of values within the BL information, and the means for color bit-depth up-sampling increases the color range of the values, and wherein spatially and color bit-depth up-sampled BL data are obtained.
Various embodiments of the presented coding scheme are compatible with H.264/AVC and all kinds of scalability defined in the scalable extension of H.264/AVC (SVC).
Advantageous embodiments of the invention are disclosed in the dependent claims, the following description and the figures.
Embodiment
As shown in Fig. 1, two videos are used as input to the video encoder: an N-bit original video and an M-bit video (M < N, usually M = 8). The M-bit video can be decomposed from the N-bit original video, or provided in another way. The scalable scheme reduces the redundancy between the two layers by using pictures of the BL. The two video streams, one with 8-bit color and the other with N-bit color (N > 8), are input to the encoder, and the output of the encoder is a scalable bit-stream. It is also possible to input only a single N-bit color data stream, from which an M-bit (M < N) color data stream for the BL is generated internally. The M-bit video is encoded as the BL using the included H.264/AVC encoder. The BL information can be used to improve the coding efficiency of the EL; this is called inter-layer prediction herein. Each image, i.e. a group of MBs, has two access units, one for the BL and one for the EL. The coded bit-streams are multiplexed to form the scalable bit-stream. The BL encoder comprises e.g. an H.264/AVC encoder, and the reconstruction is used to predict the N-bit color video that will be used for the EL encoding.
As shown in Fig. 1, the scalable bit-stream exemplarily comprises an AVC-compliant BL bit-stream that can be decoded by a BL decoder (a conventional AVC decoder). The decoder side then performs the same prediction as the encoder (after evaluating the respective indications) to obtain the predicted N-bit video. Using this N-bit predicted video, the EL decoder then produces the final N-bit video for a high-quality display HQ.
The term "color bit-depth" as used herein denotes the bit-depth, i.e. the number of bits per value. This usually corresponds to color intensity, but may also refer to gray values in the luminance channel Y.
In one embodiment, the invention is based on the current structure of SVC spatial, temporal and quality scalability, and is enhanced by bit-depth scalability for enhanced color bit-depth. Hence, this embodiment is fully compatible with the current SVC standard. However, it will be easy for the skilled person to adapt it to other standards. The key of bit-depth scalability is the bit-depth inter-layer prediction. By using inter-layer prediction, the difference between the N-bit and M-bit video is encoded as the EL.
The invention uses a LUT-based inverse tone mapping technique for the inter-layer prediction of bit-depth scalable coding, which improves the coding efficiency. The LUT is created at the encoder based on the relation between a reconstructed BL coding unit (GOP, image, slice or MB) and the collocated original EL coding unit.
Usually, one LUT is created for each luma/chroma channel: Y, Cb and Cr. In practice, two or all three of these different channels can share the same LUT. If two or more different LUTs are applied to the same coding level, they can additionally be differentially encoded, e.g. LUT_Y, LUT_Cb-Y, LUT_Cr-Y. Then, in the inter-layer prediction process, the LUT that was created at the encoder is used to de-correlate the redundancy between the BL and the EL. The LUTs are inserted into the bit-stream and can be recovered at the decoder end. The decoder uses the same LUTs in the inter-layer prediction, so that the EL can be reconstructed with high quality.
The BL and EL data to which a tone mapping LUT refers can be on any level, e.g. image sequence, image, slice, macroblock (MB) or block (in descending order). To de-correlate the LUTs of the different levels, for each level (except the highest) only its difference from the immediately higher level is encoded. Such a difference look-up table is called a "delta LUT". For example, one LUT is generated for the highest level, such as the GOP (group of pictures) level. Another LUT can be generated for a sub-group level of e.g. four images. Then a difference table can be generated that represents the difference between this sub-group LUT and the group/GOP LUT. Another LUT can be generated for a single image; then a corresponding delta LUT is generated that represents the difference between the sub-group LUT and this image LUT. In the same manner, further LUTs can be generated on the slice level and the MB level. For each of these levels, a delta LUT relative to its immediately higher-level LUT is generated. This is shown in Fig. 5. However, LUTs need not be generated for every level; e.g. the image level can be skipped. A slice-level delta LUT then refers back to the next higher level, e.g. the GOP-level LUT. It may also happen that more than one LUT and delta LUT are generated for the same level. For example, a first LUT/delta LUT refers to a first image in a GOP (or sub-group), and a second LUT/delta LUT refers to another, second image in the same GOP (or sub-group). Both delta LUTs then refer back to the same GOP (or sub-group) LUT.
To further reduce the overhead of the LUTs in the bit-stream, differential encoding is used in one embodiment to encode the lower-level LUTs and/or delta LUTs. The mathematical expression of the encoding and decoding process of the LUTs is as follows.
Assume that NB and NE denote the bit-depths of the base layer (BL) and the enhancement layer (EL), respectively. For an individual channel, the LUT by which the EL signal is predicted from the BL signal is denoted as LUT = {V(0), V(1), ..., V(2^NB - 1)}, where the BL has levels from 0 to 2^NB - 1 and the EL has levels from 0 to 2^NE - 1. Thus, according to the LUT, level i in the BL is mapped to level V(i) in the EL in the inter-layer bit-depth prediction process.

At the encoder, the highest-level LUT is encoded by differences of consecutive values. Only the following values are entropy-encoded:

V(0), V(1) - V(0), V(2) - V(1), ..., V(2^NB - 1) - V(2^NB - 2)    (1)

The total number of entries is 2^NB. For the lower-level LUTs, we first calculate for each level i a delta LUT according to

ΔLUT_i = LUT_i - LUT_{i-1} ≡ {V_i(0) - V_{i-1}(0), V_i(1) - V_{i-1}(1), ..., V_i(2^NB - 1) - V_{i-1}(2^NB - 1)}    (2)

The delta LUTs can also be encoded using the method of equation (1). Moreover, since many of the values V_i(k) - V_{i-1}(k) are zero, Huffman-type run-length coding can be advantageous.
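Equations (1) and (2) can be sketched directly in code. The function names are hypothetical, and the toy left-shift LUT only stands in for a trained table:

```python
def encode_lut(lut):
    """Equation (1): transmit V(0) and the differences of consecutive entries."""
    return [lut[0]] + [lut[i] - lut[i - 1] for i in range(1, len(lut))]

def decode_lut(diffs):
    """Inverse of equation (1): integrate the differences back into the LUT."""
    lut = [diffs[0]]
    for d in diffs[1:]:
        lut.append(lut[-1] + d)
    return lut

NB = 8
lut = [v << 2 for v in range(1 << NB)]   # 2**NB entries, toy 8->10 bit table
encoded = encode_lut(lut)                # the values that get entropy-coded

# Equation (2): a delta LUT is the entry-wise difference between a LUT and
# the LUT of the next higher level (all zero here, for identical tables).
delta_lut = [a - b for a, b in zip(lut, lut)]
```

For a smooth table the encoded differences are small and repetitive, which is why run-length coding of the delta LUTs can pay off.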
According to one aspect of the invention, the LUT-based inverse tone mapping technique is used only when the BL data are intra-coded. This has the following advantage: the technique is suitable for the single-loop decoding of intra-coded images and fragments, as used e.g. in the current SVC standard, and it is compatible with the other scalability types that are also supported in the current SVC standard.
Fig. 2 shows an encoder that extends the intra-texture inter-layer prediction, as used with the spatial scalability of the current SVC standard, to bit-depth scalability. The bit-depth up-sampling block BDUp, the look-up table (LUT) generation block LUTGEN and the LUT entropy coding block EC_LUT together represent the extension towards bit-depth scalability, while the other blocks are also used for spatial scalability. These blocks BDUp, LUTGEN, EC_LUT and their connections are the difference between a conventional SVC intra encoder and an intra encoder according to the invention.

It is to be noted, however, that bit-depth up-sampling does not necessarily require spatial (texture), temporal or SNR up-sampling. Yet it is an advantage of the invention that the different types of scalability can be combined.

In Fig. 2, a base layer MB is input to the encoder with M bits, and an N-bit enhancement layer MB is input to the EL encoder (N > M). In the current SVC standard, texture up-sampling was designed for spatial intra-texture inter-layer prediction. In Fig. 2, the input of the texture up-sampling TUp is the reconstructed BL macroblock BL_rec, and the output is the spatially (texture) predicted version Pre_t{BL_rec} of the EL macroblock. Bit-depth scalability is achieved by a bit-depth up-sampling step BDUp that (in this example) directly follows the texture up-sampling TUp. In practice, it is usually advantageous to first apply the texture up-sampling as spatial inter-layer prediction and then apply the bit-depth up-sampling BDUp as bit-depth inter-layer prediction; however, the reverse order of the prediction steps is possible. With the texture up-sampling TUp and the bit-depth up-sampling BDUp, a predicted version Pre_c{Pre_t{BL_rec}} of the N-bit EL macroblock is obtained. For each MB, one of the at least two defined LUTs is used. The LUTs are generated in the LUT generation block LUTGEN, based on the characteristics of the reconstructed BL and the original EL image data. The bit-depth up-sampling block BDUp uses the LUTs and also outputs them to the encoder, since the LUTs are necessary for decoding and must therefore be sent to the decoder. As mentioned above, the LUTs are encoded in the LUT entropy coding unit EC_LUT.
A difference generator D_EL obtains the residual EL'_res between the original N-bit EL macroblock EL_org and its predicted version Pre_c{Pre_t{BL_rec}}. In one embodiment of the invention, this residual is further transformed T, quantized Q and entropy-encoded EC_EL to form the EL bit-stream, as in SVC. In mathematical expression, the residual of intra color bit-depth up-sampling is

EL'_res = EL_org - Pre_c{Pre_t{BL_rec}}    (3)

where Pre_t{} denotes the texture up-sampling operator.
Different variants of the encoding process are possible and can be controlled by control parameters. Fig. 2 shows an exemplary flag base_mode_flag, which decides whether the EL residual is predicted based on reconstructed EL information or based on up-sampled BL information.
In the following, an exemplary embodiment of the proposed solution is shown, for implementing hierarchical-LUT-based inverse tone mapping in SVC bit-depth scalability. In detail, some new syntax elements are added to the sequence parameter set in the scalable extension, as exemplarily shown in rows 25-41 of Table 1. The following semantics are used:
inv_tone_map_flag equal to 1 specifies that inverse tone mapping may be invoked in the inter-layer prediction process. inv_tone_map_flag equal to 0 specifies that inverse tone mapping is not invoked in the inter-layer prediction process (default).
level_lookup_table_luma_minus8 plus 8 specifies the number of levels of the look-up table for the Y channel.
offset_val_lookup_table_luma[i] specifies the value s[i] to which level i in the look-up table of the Y channel is mapped, as follows:
If i is not equal to 0, then s[i] is equal to s[i-1] plus offset_val_lookup_table_luma[i], where s[i-1] is the value to which level i-1 is mapped in the Y channel.
If i is equal to 0, then s[i] is equal to offset_val_lookup_table_luma[i].
chroma_inv_tone_map_flag equal to 1 specifies that inverse tone mapping may be invoked in the inter-layer prediction process of the Cb and Cr channels.
level_lookup_table_chroma_minus8 plus 8 specifies the number of levels of the LUT for the Cb and Cr channels.
offset_val_lookup_table_cb[i] specifies the value s[i] to which level i in the look-up table of the Cb channel is mapped, as follows:
If i is not equal to 0, then s[i] is equal to s[i-1] plus offset_val_lookup_table_cb[i], where s[i-1] is the value to which level i-1 is mapped in the Cb channel. If i is equal to 0, then s[i] is equal to offset_val_lookup_table_cb[i].
cr_inv_tone_map_flag equal to 0 specifies that the LUT of the Cb channel is reused in the inter-layer prediction of the Cr channel. cr_inv_tone_map_flag equal to 1 specifies that a look-up table different from the LUT of the Cb channel is used in the inter-layer prediction of the Cr channel.
offset_val_lookup_table_cr[i] specifies the value s[i] to which level i in the LUT of the Cr channel is mapped, as follows:
If i is not equal to 0, then s[i] is equal to s[i-1] plus offset_val_lookup_table_cr[i], where s[i-1] is the value to which level i-1 is mapped in the Cr channel. If i is equal to 0, then s[i] is equal to offset_val_lookup_table_cr[i].
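The mapping rule defined above is a plain prefix sum over the offset values; this small sketch (hypothetical function name, toy offsets for a 4-level table) illustrates it:

```python
def build_mapping(offsets):
    """Reconstruct s[i] from offset_val_lookup_table_* values: s[0] is the
    first offset, every further s[i] is s[i-1] plus the i-th offset."""
    s, acc = [], 0
    for i, off in enumerate(offsets):
        acc = off if i == 0 else acc + off
        s.append(acc)
    return s

offsets = [12, 4, 4, 5]      # toy offset values for 4 luma levels
s = build_mapping(offsets)   # the resulting mapped values
```

Transmitting offsets instead of absolute values keeps the entropy-coded se(v) numbers small for smooth tables.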
#  |                                                                  |   |
1  | seq_parameter_set_svc_extension(){                               | C | Descriptor
2  |   interlayer_deblocking_filter_control_present_flag              | 0 | u(1)
3  |   extended_spatial_scalability                                   | 0 | u(2)
4  |   if(chroma_format_idc==1||chroma_format_idc==2)                 |   |
5  |     chroma_phase_x_plus1                                         | 0 | u(1)
6  |   if(chroma_format_idc==1)                                       |   |
7  |     chroma_phase_y_plus1                                         | 0 | u(2)
8  |   if(extended_spatial_scalability==1){                           |   |
9  |     if(chroma_format_idc>0){                                     |   |
10 |       base_chroma_phase_x_plus1                                  | 0 | u(1)
11 |       base_chroma_phase_y_plus1                                  | 0 | u(2)
12 |     }                                                            |   |
13 |     scaled_base_left_offset                                      | 0 | se(v)
14 |     scaled_base_top_offset                                       | 0 | se(v)
15 |     scaled_base_right_offset                                     | 0 | se(v)
16 |     scaled_base_bottom_offset                                    | 0 | se(v)
17 |   }                                                              |   |
18 |   if(extended_spatial_scalability==0){                           |   |
19 |     avc_rewrite_flag                                             | 0 | u(1)
20 |     if(avc_rewrite_flag){                                        |   |
21 |       avc_adaptive_rewrite_flag                                  | 0 | u(1)
22 |     }                                                            |   |
23 |   }                                                              |   |
24 |   avc_header_rewrite_flag                                        | 0 | u(1)
25 |   inv_tone_map_flag                                              | 1 | u(1)
26 |   if(inv_tone_map_flag){                                         |   |
27 |     level_lookup_table_luma_minus8                               | 1 | u(v)
28 |     for(i=0;i<(1<<(8+level_lookup_table_luma_minus8));i++){      |   |
29 |       offset_val_lookup_table_luma[i]                            |   | se(v)
30 |     }                                                            |   |
31 |     chroma_inv_tone_map_flag                                     | 1 | u(1)
32 |     if(chroma_inv_tone_map_flag){                                |   |
33 |       level_lookup_table_chroma_minus8                           | 1 | u(v)
34 |       for(i=0;i<(1<<(8+level_lookup_table_chroma_minus8));i++){  |   |
35 |         offset_val_lookup_table_cb[i]                            | 1 | se(v)
36 |       }                                                          |   |
37 |       cr_inv_tone_map_flag                                       | 1 | u(1)
38 |       if(cr_inv_tone_map_flag){                                  |   |
39 |         for(i=0;i<(1<<(8+level_lookup_table_chroma_minus8));i++){|   |
40 |           offset_val_lookup_table_cr[i]                          | 1 | se(v)
41 |         }                                                        |   |
42 |       }}}}                                                       |   |
Table 1: exemplary implementation in the sequence parameter set in the scalable extension
Table 2 shows a picture parameter set modified according to one embodiment of the invention. The additions of the invention are contained in rows 49-68 of Table 2.
#  | pic_parameter_set_rbsp(){                                                  | C | Descriptor
1  |   pic_parameter_set_id                                                     | 1 | ue(v)
2  |   seq_parameter_set_id                                                     | 1 | ue(v)
3  |   entropy_coding_mode_flag                                                 | 1 | u(1)
4  |   pic_order_present_flag                                                   | 1 | u(1)
5  |   num_slice_groups_minus1                                                  | 1 | ue(v)
6  |   if(num_slice_groups_minus1>0){                                           |   |
7  |     slice_group_map_type                                                   | 1 | ue(v)
8  |     if(slice_group_map_type==0)                                            |   |
9  |       for(iGroup=0;iGroup<=num_slice_groups_minus1;iGroup++)               |   |
10 |         run_length_minus1[iGroup]                                          | 1 | ue(v)
11 |     else if(slice_group_map_type==2)                                       |   |
12 |       for(iGroup=0;iGroup<num_slice_groups_minus1;iGroup++){               |   |
13 |         top_left[iGroup]                                                   | 1 | ue(v)
14 |         bottom_right[iGroup]                                               | 1 | ue(v)
15 |       }                                                                    |   |
16 |     else if(slice_group_map_type==3||slice_group_map_type==4||slice_group_map_type==5){ | |
17 |       slice_group_change_direction_flag                                    | 1 | u(1)
18 |       slice_group_change_rate_minus1                                       | 1 | ue(v)
19 |     }else if(slice_group_map_type==6){                                     |   |
20 |       pic_size_in_map_units_minus1                                         | 1 | ue(v)
21 |       for(i=0;i<=pic_size_in_map_units_minus1;i++)                         |   |
22 |         slice_group_id[i]                                                  | 1 | u(v)
23 |     }                                                                      |   |
24 |   }                                                                        |   |
25 |   num_ref_idx_l0_active_minus1                                             | 1 | ue(v)
26 |   num_ref_idx_l1_active_minus1                                             | 1 | ue(v)
27 |   weighted_pred_flag                                                       | 1 | u(1)
28 |   weighted_bipred_idc                                                      | 1 | u(2)
29 |   pic_init_qp_minus26 /*relative to 26*/                                   | 1 | se(v)
30 |   pic_init_qs_minus26 /*relative to 26*/                                   | 1 | se(v)
31 |   chroma_qp_index_offset                                                   | 1 | se(v)
32 |   deblocking_filter_control_present_flag                                   | 1 | u(1)
33 |   constrained_intra_pred_flag                                              | 1 | u(1)
34 |   redundant_pic_cnt_present_flag                                           | 1 | u(1)
35 |   if(more_rbsp_data()){                                                    |   |
36 |     transform_8x8_mode_flag                                                | 1 | u(1)
37 |     pic_scaling_matrix_present_flag                                        | 1 | u(1)
38 |     if(pic_scaling_matrix_present_flag)                                    |   |
39 |       for(i=0;i<6+((chroma_format_idc!=3)?2:6)*transform_8x8_mode_flag;i++){ | |
40 |         pic_scaling_list_present_flag[i]                                   | 1 | u(1)
41 |         if(pic_scaling_list_present_flag[i])                               |   |
42 |           if(i<6)                                                          |   |
43 |             scaling_list(ScalingList4x4[i],16,UseDefaultScalingMatrix4x4Flag[i]) | 1 |
44 |           else                                                             |   |
45 |             scaling_list(ScalingList8x8[i-6],64,UseDefaultScalingMatrix8x8Flag[i-6]) | 1 |
46 |       }                                                                    |   |
47 |     second_chroma_qp_index_offset                                          | 1 | se(v)
48 |   }                                                                        |   |
49 |   inv_tone_map_delta_flag                                                  | 1 | u(1)
50 |   if(inv_tone_map_delta_flag){                                             |   |
51 |     level_lookup_table_luma_minus8                                         | 1 | u(v)
52 |     for(i=0;i<(1<<(8+level_lookup_table_luma_minus8));i++){                |   |
53 |       offset_val_lookup_table_luma_delta[i]                                |   | se(v)
54 |     }                                                                      |   |
55 |     chroma_inv_tone_map_delta_flag                                         | 1 | u(1)
56 |     if(chroma_inv_tone_map_delta_flag){                                    |   |
57 |       level_lookup_table_chroma_minus8                                     | 1 | u(v)
58 |       for(i=0;i<(1<<(8+level_lookup_table_chroma_minus8));i++){            |   |
59 |         offset_val_lookup_table_cb_delta[i]                                | 1 | se(v)
60 |       }                                                                    |   |
61 |       cr_inv_tone_map_delta_flag                                           | 1 | u(1)
62 |       if(cr_inv_tone_map_delta_flag){                                      |   |
63 |         for(i=0;i<(1<<(8+level_lookup_table_chroma_minus8));i++){          |   |
64 |           offset_val_lookup_table_cr_delta[i]                              | 1 | se(v)
65 |         }                                                                  |   |
66 |       }                                                                    |   |
67 |     }                                                                      |   |
68 |   }                                                                        |   |
69 |   rbsp_trailing_bits()                                                     | 1 |
70 | }                                                                          |   |

Table 2: exemplary implementation in the picture parameter set
inv_tone_map_delta_flag equal to 1 specifies that delta values, to be added to the inverse tone mapping specified in the sequence parameter set (SPS), are present for the inter-layer prediction.
level_lookup_table_luma_minus8 plus 8 specifies the number of levels of the look-up table for the Y channel.
offset_val_lookup_table_luma_delta[i] specifies the delta value s[i] to be added to the value specified in the SPS, where level i in the look-up table of the Y channel is mapped to this delta value s[i] as follows:
If i is not equal to 0, then s[i] is equal to s[i-1] plus offset_val_lookup_table_luma_delta[i]. Otherwise, s[i] is equal to offset_val_lookup_table_luma_delta[i].
chroma_inv_tone_map_delta_flag equal to 1 specifies that delta values, to be added to the inverse tone mapping specified in the SPS, are present for the inter-layer prediction of the Cb and Cr channels.
level_lookup_table_chroma_minus8 plus 8 specifies the number of levels of the LUT for the Cb and Cr channels.
offset_val_lookup_table_cb_delta[i] specifies the delta value s[i] to be added to the value specified in the SPS, where level i in the look-up table of the Cb channel is mapped to this delta value s[i] as follows:
If i is not equal to 0, then s[i] is equal to s[i-1] plus offset_val_lookup_table_cb_delta[i]. Otherwise, s[i] is equal to offset_val_lookup_table_cb_delta[i].
cr_inv_tone_map_delta_flag equal to 0 specifies that the delta values of the Cb channel are reused for the Cr channel. cr_inv_tone_map_delta_flag equal to 1 specifies that delta values different from those of the Cb channel are used.
offset_val_lookup_table_cr_delta[i] specifies the delta value s[i] to be added to the value specified in the SPS, where level i in the look-up table of the Cr channel is mapped to this delta value s[i] as follows:
If i is not equal to 0, then s[i] is equal to s[i-1] plus offset_val_lookup_table_cr_delta[i]. Otherwise, s[i] is equal to offset_val_lookup_table_cr_delta[i].
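The *_delta semantics above can be read as: integrate the delta offsets with the same prefix-sum rule as the SPS offsets, then add them to the SPS-specified mapping. This is an illustrative interpretation with toy numbers and a hypothetical helper name:

```python
def integrate(offsets):
    """Prefix-sum rule shared by the SPS offsets and the delta offsets."""
    s, acc = [], 0
    for i, off in enumerate(offsets):
        acc = off if i == 0 else acc + off
        s.append(acc)
    return s

sps_map = integrate([12, 4, 4, 5])    # mapping specified in the SPS
deltas  = integrate([1, 0, -1, 0])    # increments from this PPS/slice header
refined = [v + d for v, d in zip(sps_map, deltas)]
```

A slice can thus refine only a few table entries while inheriting the rest of the SPS mapping unchanged.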
Table 3 shows an exemplarily extended slice header in the scalable extension. The additions of the invention are contained in rows 46-67.
#   | slice_header_in_scalable_extension(){                                     | C | Descriptor
1   |   first_mb_in_slice                                                       | 2 | ue(v)
2   |   slice_type                                                              | 2 | ue(v)
3   |   pic_parameter_set_id                                                    | 2 | ue(v)
4   |   frame_num                                                               | 2 | u(v)
5   |   if(!frame_mbs_only_flag){                                               |   |
6   |     field_pic_flag                                                        | 2 | u(1)
7   |     if(field_pic_flag)                                                    |   |
8   |       bottom_field_flag                                                   | 2 | u(1)
9   |   }                                                                       |   |
10  |   if(nal_unit_type==21)                                                   |   |
11  |     idr_pic_id                                                            | 2 | ue(v)
12  |   if(pic_order_cnt_type==0){                                              |   |
13  |     pic_order_cnt_lsb                                                     | 2 | u(v)
14  |     if(pic_order_present_flag && !field_pic_flag)                         |   |
15  |       delta_pic_order_cnt_bottom                                          | 2 | se(v)
16  |   }                                                                       |   |
17  |   if(pic_order_cnt_type==1 && !delta_pic_order_always_zero_flag){         |   |
18  |     delta_pic_order_cnt[0]                                                | 2 | se(v)
19  |     if(pic_order_present_flag && !field_pic_flag)                         |   |
20  |       delta_pic_order_cnt[1]                                              | 2 | se(v)
21  |   }                                                                       |   |
22  |   if(redundant_pic_cnt_present_flag)                                      |   |
23  |     redundant_pic_cnt                                                     | 2 | ue(v)
24  |   if(slice_type==EB)                                                      |   |
25  |     direct_spatial_mv_pred_flag                                           | 2 | u(1)
26  |   if(quality_id==0){                                                      |   |
27  |     if(slice_type==EP||slice_type==EB){                                   |   |
28  |       num_ref_idx_active_override_flag                                    | 2 | u(1)
29  |       if(num_ref_idx_active_override_flag){                               |   |
30  |         num_ref_idx_l0_active_minus1                                      | 2 | ue(v)
31  |         if(slice_type==EB)                                                |   |
32  |           num_ref_idx_l1_active_minus1                                    | 2 | ue(v)
33  |       }                                                                   |   |
34  |     }                                                                     |   |
35  |     ref_pic_list_reordering()                                             | 2 |
36  |     if(!layer_base_flag){                                                 |   |
37  |       base_id                                                             | 2 | ue(v)
38  |       adaptive_prediction_flag                                            | 2 | u(1)
39  |       if(!adaptive_prediction_flag){                                      |   |
40  |         default_base_mode_flag                                            | 2 | u(1)
41  |         if(!default_base_mode_flag){                                      |   |
42  |           adaptive_motion_prediction_flag                                 | 2 | u(1)
43  |           if(!adaptive_motion_prediction_flag)                            |   |
44  |             default_motion_prediction_flag                                | 2 | u(1)
45  |         }                                                                 |   |
46  |         inv_tone_map_delta_flag                                           | 1 | u(1)
47  |         if(inv_tone_map_delta_flag){                                      |   |
48  |           level_lookup_table_luma_minus8                                  | 1 | u(v)
49  |           for(i=0;i<(1<<(8+level_lookup_table_luma_minus8));i++){         |   |
50  |             offset_val_lookup_table_luma_delta[i]                         | 1 | se(v)
51  |           }                                                               |   |
52  |           chroma_inv_tone_map_delta_flag                                  | 1 | u(1)
53  |           if(chroma_inv_tone_map_delta_flag){                             |   |
54  |             level_lookup_table_chroma_minus8                              | 1 | u(v)
55  |             for(i=0;i<(1<<(8+level_lookup_table_chroma_minus8));          |   |
56  |                 i++){                                                     |   |
57  |               offset_val_lookup_table_cb_delta[i]                         | 1 | se(v)
58  |             }                                                             |   |
59  |             cr_inv_tone_map_delta_flag                                    | 1 | u(1)
60  |             if(cr_inv_tone_map_delta_flag){                               |   |
61  |               for(i=0;i<(1<<(8+level_lookup_table_chroma_minus8));        |   |
62  |                   i++){                                                   |   |
63  |                 offset_val_lookup_table_cr_delta[i]                       | 1 | se(v)
64  |               }                                                           |   |
65  |             }                                                             |   |
66  |           }                                                               |   |
67  |         }                                                                 |   |
68  |       }                                                                   |   |
69  |       adaptive_residual_prediction_flag                                   | 2 | u(1)
70  |     }                                                                     |   |
71  |     if((weighted_pred_flag && slice_type==EP)||(weighted_bipred_idc==1 && slice_type==EB)){ | |
72  |       if(adaptive_prediction_flag)                                        |   |
73  |         base_pred_weight_table_flag                                       | 2 | u(1)
74  |       if(layer_base_flag||base_pred_weight_table_flag==0)                 |   |
75  |         pred_weight_table()                                               |   |
76  |     }                                                                     |   |
77  |     if(nal_ref_idc!=0){                                                   |   |
78  |       dec_ref_pic_marking()                                               | 2 |
79  |       if(use_base_prediction_flag && nal_unit_type!=21)                   |   |
80  |         dec_ref_pic_marking_base()                                        |   |
81  |     }                                                                     |   |
82  |   }                                                                       |   |
83  |   if(entropy_coding_mode_flag && slice_type!=EI)                          |   |
84  |     cabac_init_idc                                                        | 2 | ue(v)
85  |   slice_qp_delta                                                          | 2 | se(v)
86  |   if(deblocking_filter_control_present_flag){                             |   |
87  |     disable_deblocking_filter_idc                                         | 2 | ue(v)
88  |     if(disable_deblocking_filter_idc!=1){                                 |   |
89  |       slice_alpha_c0_offset_div2                                          | 2 | se(v)
90  |       slice_beta_offset_div2                                              | 2 | se(v)
91  |     }                                                                     |   |
92  |   }                                                                       |   |
93  |   if(interlayer_deblocking_filter_control_present_flag){                  |   |
94  |     disable_interlayer_deblocking_filter_idc                              | 2 | ue(v)
95  |     if(disable_interlayer_deblocking_filter_idc!=1){                      |   |
96  |       interlayer_slice_alpha_c0_offset_div2                               | 2 | se(v)
97  |       interlayer_slice_beta_offset_div2                                   | 2 | se(v)
98  |     }                                                                     |   |
99  |   }                                                                       |   |
100 |   constrained_intra_upsampling_flag                                       | 2 | u(1)
101 |   if(quality_id==0)                                                       |   |
102 |     if(num_slice_groups_minus1>0 && slice_group_map_type>=3 && slice_group_map_type<=5) | |
103 |
slice_group_change_cycle |
2 |
u(v) |
104 |
if(quality_id==0 && extended_spatial_scalability>0){ |
|
|
105 |
if(chroma_format_idc>0){ |
|
|
106 |
base_chroma_phase_x_plus1 |
2 |
u(2) |
107 |
base_chroma_phase_y_plus1 |
2 |
u(2) |
108 |
} |
|
|
109 |
if(extended_spatial_scalability==2){ |
|
|
110 |
scaled_base_left_offset |
2 |
se(v) |
111 |
scaled_base_top_offset |
2 |
se(v) |
112 |
scaled_base_right_offset |
2 |
se(v) |
113 |
scaled_base_bottom_offset |
2 |
se(v) |
114 |
} |
|
|
115 |
} |
|
|
116 |
if(use_base_prediction_flag) |
|
|
117 |
store_base_rep_flag |
2 |
u(1) |
118 |
if(quality_id==0){ |
|
|
119 |
if(BaseFrameMbsOnlyFlag && !frame_mbs_only_flag && !field_pic_flag) |
|
|
120 |
base_frame_and_bottom_field_coincided_flag |
2 |
u(1) |
121 |
else if(frame_mbs_only_flag && !BaseFrameMbsOnlyFlag && !BaseFie ldPicFlag) |
|
|
122 |
base_bottom_field_coincided_flag |
2 |
u(1) |
123 |
} |
|
|
124 |
SpatialScalabilityType=spatial_scalability_type ()/* [Ed.: should be moved to semanteme and delete grammatic function] */ |
|
|
125 |
} |
|
|
Table 3: the exemplary picture bar head in the scalable expansion
In one embodiment, the BL picture bar based on behind original EL picture bar and the up-sampling, after the reconstruct produces a LUT.The MB of the correspondence position of the BL picture bar behind the one or more MB and up-sampling based on original EL picture bar, after the reconstruct produces the 2nd LUT.As mentioned above, be LUT/ increment LUT with these two LUT differential codings.Therefore, can use a LUT at the decoder place, (except that those MB that the 2nd LUT is consulted) are mapped to EL picture bar with the BL picture bar behind the up-sampling, after the reconstruct, and can use the 2nd LUT at the decoder place, shine upon those MB that it is consulted.The method of this generation LUT has following advantage: can optimize decoding, this is because LUT has defined at the obtainable picture bar in decoder place (the BL picture bar behind the up-sampling, after the reconstruct) and had mapping between the highest EL picture bar (being original EL picture bar) that obtains quality.Use the advantage of hierarchical luts to be: LUT collects best-fit in actual video data, and this is owing to all be isomorphism usually as the most of of bar, and in the picture bar some are than possible the difference in zonule.Advantageously, these zones are defined difference LUT respectively.Therefore, this method best-fit is in the needs of decoder and the reconstruct of first water.
In one embodiment, LUT is encoded with the EL data and send.At the encoder place, use these LUT, by the BL data prediction EL data after the reconstruct, and residual error carried out intraframe coding and send.Then, at the decoder place, LUT is applied to BL data after the reconstruct, and adds residual error.The result is the decoded EL image with high color bit-depth.
Advantageously, with added, support to insert in the head, for example for picture bar level interpolation slice_header_in_scalable_extension based on the syntactic element of the contrary tone mapping of LUT.
In fact, different units (image, as bar, MB) can have Different L UT.Adding new syntactic element in the head of level separately makes and adopts contrary tone mapping to have flexibility.For example, in the situation of object-based picture bar segmentation, different has different characteristics as bar, and in the middle of different picture bars, and the relation between the EL picture bar of BL picture bar and correspondence position can be very different.Therefore, it can be useful creating different LUT for difference as bar.On the other hand, to tie up on the sequence of a plurality of images can be constant to the pass between the EL picture bar of characteristic and BL picture bar and correspondence position.In this case, can produce higher LUT for higher level (for example sequence or GOP level), be one, interior zone (for example as bar, MB group, MB) the more rudimentary LUT of generation of some or all these images.In one embodiment, the specific region that defines in this more rudimentary LUT and each image is associated.In another embodiment, separately zone association in each image of single more rudimentary LUT and sequence can be got up.In one embodiment, MB has related increment LUT, and the next MB in the sequence has the indication that identical increment LUT is applied as once more last MB.Can on the code level except that MB, use identical principle.
Fig. 3 shows and utilizes exemplary decoder inter-layer prediction, that be used for the BL image after the intraframe coding.BL and EL information BL after receiving the coding that has the back LUT that encodes according to the present invention
Enc, EL
EncAfterwards, for example in multiplex packet bit stream, isolate BL, EL and LUT information, BL information, EL information and LUT are carried out the entropy decoding.In this example, LUT is included in the EL information.Then, with inverse quantization Q
-1With inverse transformation T
-1Be applied to video data, in LUT decoding unit LUTdec to hierarchical luts---LUT
1, LUT
2Decode.The LUT decoding unit reconstructs more senior LUT, increment LUT and final more rudimentary LUT, and provides two or more decoded look-up tables for bit depth prediction unit B DUp.Reconstruct according to the highest LUT behind 1 pair of coding of equation can be used (V
EncBe the value behind the coding):
V(0)=V
enc(0),
V(1)=V(0)-V
enc(1),
V(2)=V(1)-V
enc(2),
…,
V(2
NB-1)=V(2
NB-2)-V
enc(2
NB-1) (4)
Reconstruct according to the more rudimentary LUT behind 2 pairs of codings of equation can be used:
LUT
i-1≡LUT
i-ΔLUT
i={V
i(0)-dV
i(0),V
i(1)-dV
i(1),…,V
i(2
NB-1)-dV
i(2
NB-1)}
(5)
Wherein, common most of dV
i(k) be zero.
For BL, to treatment of picture after the intraframe coding with identical for traditional SVC: the usage space infra-frame prediction comes reconstructed image, promptly based on the last reconfiguration information of identical image.After deblocking, can be with the BL signal BL that obtains
RecBe presented on the standard SVC display of 8 bit color depth.Can also use this signal to produce the predicted version Pre of the EL image of correspondence position
c{ Pre
t{ BL
Rec}: for this reason, this signal is carried out texture up-sampling TUp, wherein obtain the texture prediction version Pre of EL image
t{ BL
Rec, then, the look-up table behind the coding that use is extracted carries out bit-depth up-sampling BDUp to this texture prediction version.Then, use texture and bit-depth up-sampling, BL image Pre after the reconstruct
c{ Pre
t{ BL
Rec, upgrade A
2, ELEL residual error EL ' after improved, inverse quantization and the inverse transformation
ResThereby, obtain after deblocking, to can be used as EL video EL
RecExport to the signal of HQ display.
Certainly, the decoder of operating under the EL pattern also can produce BL video BL in inside
Rec, this is to predict owing to needing it to be used for EL, but the BL video must not obtain in decoder output place.In one embodiment, decoder has two outputs, and one is used for BL video BL
Rec, one is used for EL video EL
Rec, and in another embodiment, decoder only has the EL of being used for video EL
RecOutput.
EL MB for correspondence position BL MB is wherein carried out interframe encode does not have following constraint: must use and the identical inter-layer prediction based on LUT of situation that correspondence position BL MB is carried out intraframe coding.For example, correspondence position BL MB is being carried out under the situation of interframe encode, linear scale can be served as the method for bit-depth up-sampling.
As mentioned above, for the intra encoder of Fig. 2, this decoder also can work in and encode under the corresponding different mode.Therefore, from bit stream, extract sign separately and sign estimated, for example determine whether to use the indication base_mode_flag of inter-layer prediction.If do not use, then use deblock, spatial prediction and to the renewal A of spatial prediction image
1, EL, come reconstruct EL image traditionally.
In one embodiment of the invention, proposed a kind of equipment to coding video data with basic unit and enhancement layer, wherein basic unit's pixel has littler color bit-depth than enhancement layer pixels, and described equipment comprises:
Be used for the code device T, the Q that on first particle size fraction, base layer data are encoded, wherein base layer data carried out intraframe coding,
The device T that is used for the base layer data after reconstruct is encoded
-1, Q
-1,
Be used for producing the first tone mapping table LUT at the base layer data after the intraframe coding
GOPDevice, the table LUT
GOPDefined the base layer data Pre after original enhancement layer data and the corresponding reconstruct
t{ BL
RecBetween tone mapping,
Be used for producing the second different tone mapping table LUT at the base layer data fragment after the intraframe coding (for example MB)
MBDevice, the table LUT
MBDefined original enhancement layer data EL
OrgDescribed fragment and reconstruct after base layer data Pre
t{ BL
RecHomologous segment between tone mapping,
Be used to produce difference table dLUT
MBDevice, the table dLUT
MBRepresent first and second tone mapping table LUT
GOP, LUT
MBBetween difference,
Be used for based on the described first and second tone mapping tables, base layer data after the reconstruct is carried out the device BDUp of bit-depth up-sampling, wherein, only use the second tone mapping table, and obtain the predicted version Pre of corresponding enhancement data for the described fragment of the basic unit after the reconstruct
c{ Pre
t{ BL
Rec, it has higher bit-depth resolution than base layer data,
Be used to produce enhancement layer residual EL '
ResDevice, enhancement layer residual EL '
ResBe the corresponding predicted version Pre of original enhancement layer data with enhancement data
c{ Pre
t{ BL
RecBetween difference, and
Be used for enhancement layer residual, the first tone mapping table LUT
GOPWith described difference table dLUT
MBCarry out apparatus for encoding, wherein the first tone mapping table behind the coding is associated with basic unit or enhancement data behind the coding, and difference table with encode after base layer data or the described fragment of enhancement data be associated.
In one embodiment, this encoding device further comprises: device TUp was used for before described bit-depth up-sampling the base layer data BL after the reconstruct
RecCarry out up-sampling, wherein obtained the first predicted version Pre of corresponding enhancement data
t{ BL
Rec, it has higher space, time or SNR resolution and is used to described bit-depth up-sampling step than base layer data.
In one embodiment of the invention, proposed a kind of equipment that video data with basic unit and enhancement layer is decoded, described equipment comprises:
Be used for from the coding after enhancement data EL
EncOr base layer data BL
ENCExtract the device of the first and second tone mapping (enum) datas relevant with enhancement data after the intraframe coding,
Be used for from the tone mapping (enum) data reconstruct first tone mapping table LUT that extracts
GOPDevice,
Be used for from the tone mapping (enum) data of extraction and the first tone mapping table reconstruct, the second tone mapping table LUT after the described reconstruct
MBDevice, wherein, the tone mapping (enum) data of the extraction that is utilized is represented the difference dLUT between described first and second tone mapping table
MB,
The device of second coding unit that is used for determining first coding unit relevant and is correlated with the second tone mapping table with the first tone mapping table, wherein, second coding unit is the sub-fraction of described first coding unit,
Be used for the base layer data that receives and enhancement data are carried out the device T of inverse quantization and inverse transformation
-1, Q
-1, wherein, the enhancement data after inverse quantization and the inverse transformation comprises residual error EL '
Res,
The device A that is used for the base layer data after the reconstruct intraframe coding
1, BL, PR
I, DBL
I,
Be used for the base layer data BL after the reconstruct
RecCarry out the device BDUp of up-sampling, wherein increased the numerical value degree of depth of every pixel, and for the pixel in described second coding unit, use the second tone mapping table, and, use the first tone mapping table, and obtain the enhancement data Pre of prediction for the residual pixel of first coding unit
c{ Pre
t{ BL
Rec, and
Be used for from the enhancement data Pre of prediction
c{ Pre
t{ BL
RecAnd inverse quantization and inverse transformation after enhancement data reconstruct the device A of the enhancement layer video data of reconstruct
2, EL
Exemplarily, in one embodiment, proposed a kind of equipment that video data with basic unit and enhancement layer is decoded, described equipment comprises:
Be used for enhancement data or the base layer data device that extracts the first and second tone mapping (enum) datas behind the coding, one or more heads of the enhancement data of the first and second tone mapping (enum) datas after from intraframe coding,
Be used for from the device of the tone mapping (enum) data reconstruct first tone mapping table that extracts,
Be used for from the tone mapping (enum) data of extraction and the device of the first tone mapping table reconstruct, the second tone mapping table after the described reconstruct, wherein, the tone mapping (enum) data of the extraction that is utilized is represented the difference between described first and second tone mapping table,
Be used for the base layer data that receives and enhancement data are carried out the device of inverse quantization and inverse transformation, wherein, the enhancement data after inverse quantization and the inverse transformation comprises residual error,
The device that is used for the base layer data after the reconstruct intraframe coding,
Be used for the base layer data after the reconstruct is carried out the device of up-sampling, wherein increase pixel count and increased the value degree of depth of every pixel, wherein for the first intra-coding data unit, use the first tone mapping table, and for the second intra-coding data unit that is included in first data cell, use the second tone mapping table, and obtain the enhancement data of prediction, and
Be used for reconstructing the device of the EL video information of reconstruct from the EL data of prediction and the EL information after inverse quantization and the inverse transformation.
It is noted that term " tone mapping " describes identical process with " contrary tone mapping " from different viewpoints.Therefore, use them with the free burial ground for the destitute herein.For example in JVT, use term " contrary tone mapping ", describe by of the prediction of low bit-depth (being BL) to the higher bit degree of depth (being EL).Yet, the term that herein uses can be interpreted as and get rid of the practicality of the present invention JVT.Identical situation is also applicable to other standards.
In addition, the part after the not all intraframe coding of BL image all needs to use the contrary tone mapping based on LUT.Can pass through some distortion measurement technology, determine whether to use contrary tone mapping based on LUT.If determine to use contrary tone mapping techniques, then for example, will select the INTRA_BL pattern based on LUT; If determine not use, then can use shared AVC instrument, come current EL MB coding.
Because number of colors possible among BL and the EL is different, so each BL color can be mapped to different EL colors.Usually, these different EL colors are closely similar, therefore in colour code or colour gamut " adjacent ".
Fig. 4 shows at GOP, as an exemplary collection of the classification look-up table of the tone on bar and MB level mapping.GOP comprises a plurality of image I that have similar characteristic about higher bit degree of depth color
1, I
2..., I
nFor example, use particular color more continually than its " adjacent " color.Exemplarily, at least one in the image, for example I
2, comprise a plurality of picture bar SL
1, SL
2, SL
3, at these as one of bar SL
2EL in, more often do not use this special neighbourhood color than another second adjacent color.In addition, at one of picture bar SL
3In, include one or more macro blocks, wherein also be more often not use this special neighbourhood color than described second (or another the 3rd) adjacent color.First tone mapping look-up table LUT that is sent
GOPDefined the general mapping on the GOP level between BL and the EL.In addition, second tone mapping look-up table LUT
SLDefine the different mappings of the described color on the picture bar level, only consulted picture bar SL separately
2With this particular B L color.With second tone mapping look-up table LUT
SLDifferential coding is " increment LUT " dLUT
SL, then with dLUT
SLSend.Two tables are all consulted with them, and their zone separately (being GOP and picture bar) is associated, for example by indication or by inserting head separately.In addition, produce another three color scheme mapping look-up table LUT
MBAnd use it for as the one or more macro block MB in one of bar
1, MB
2, MB
3, MB
4To this three color scheme mapping look-up table LUT
MBAlso carrying out differential coding, (is LUT in this example with respect to five-star table promptly
GOP) come differential coding.Then, with increment LUT dLUT
MBBe associated with its MB separately that consults or a plurality of MB, and with dLUT
MBSend.
Fig. 5 shows at GOP, as another exemplary collection of the classification tone mapping look-up table of the tone on bar and MB level mapping.It and Fig. 4 are similar, except to more rudimentary tone mapping look-up table LUT
MB, be right after higher one-level with respect to it and (be LUT in this example
SL) encode.Because the characteristic of natural video frequency, this coding can to consult highest LUT more suitable than returning in Fig. 4.In addition, the MB that consulted of MB level tone mapping LUT is positioned at the picture bar SL that has independent related tone mapping LUT
2In.Picture bar level table LUT
SLOnly be SL
2Domination GOP level table LUT
GOP, MB level table is MB
2Arrange GOP level table LUT simultaneously
GOPWith picture bar level table LUT
SLIn addition, can be for example MB
3Produce another MB level LUT.In one embodiment, MB level look-up table can be consulted more than a macro block, for example can consult MB
1And MB
2
Usually, in the inapplicable zone of more rudimentary tone mapping table, ignore this table (for example, for the MB among Fig. 5
1, ignore LUT
MB).In addition, can implicitly produce more rudimentary tone mapping table, for example by carrying out the mapping step in two sub-steps: at first in more rudimentary LUT, inquire about specific input value,, then use this output valve if determine that more rudimentary LUT has defined output valve for this specific input value.Yet, if more rudimentary LUT is not this specific input value definition output valve, for example:, in more senior LUT, inquire about input value because more rudimentary LUT is partial L UT.If have above, then begin to search for continuously two or more more senior LUT continuously, up to there being one to provide output valve for input value from being right after higher one-level more than two levels.
The advantage that being used for of being presented expands to spatial scalability the classification look-up table method of bit-depth scalability is: the data volume that send is very low, and this is because look-up table is suitable for the content of image individually and is compressed.Therefore, the control data (being the LUT data) and the amount of actual video data are minimized.In addition, do not need new predictive mode to realize expansion to the color bit-depth scalability.
Additional advantages of the present invention are: to the complete compatibility of other type scalability, robustness and to the extensibility of advanced technology.Especially, the present invention is still keeping the single loop decode structures, so that when will only be applied to image after basic unit's intraframe coding or image section based on the contrary tone mapping of LUT, has improved code efficiency.
Because the BL data after the use reconstruct are used for the generation of up-sampling and look-up table, the prediction of encoder one side adapts to the prediction of decoder one side better, so that residual error is more suitable for and can obtain better prediction and reconstruction result in decoder one side, this also is an advantage.
The present invention can be used to ges forschung device, scalable decoding device and scalable signal, especially can be used to vision signal or have the layer of different quality and other type signal of high interlayer redundancy.
Should be appreciated that and intactly described the present invention for example, and in the case without departing from the scope of the present invention, can make modification details of the present invention.Can be independently or with suitable combination arbitrarily furnish an explanation book and appropriate location and claim and disclosed each feature of accompanying drawing.Appropriate location in hardware, software or both combinations is realized these features.The reference number that occurs in the claim is only as example, and the system of being limited in scope to claim does not influence.