EP2092752A2 - Procédé et système d'interpolation adaptative pour un codage et un décodage vidéo prédictif à compensation de mouvement - Google Patents

Procédé et système d'interpolation adaptative pour un codage et un décodage vidéo prédictif à compensation de mouvement

Info

Publication number
EP2092752A2
EP2092752A2 (application EP07859334A)
Authority
EP
European Patent Office
Prior art keywords
filter
samples
pixel
filters
horizontal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP07859334A
Other languages
German (de)
English (en)
Inventor
Ronggang Wang
Zhen-Nadine Ren
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orange SA
Original Assignee
France Telecom SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by France Telecom SA filed Critical France Telecom SA
Publication of EP2092752A2 publication Critical patent/EP2092752A2/fr

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/577Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117Filters, e.g. for pre-processing or post-processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/523Motion estimation or motion compensation with sub-pixel accuracy
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/53Multi-resolution motion estimation; Hierarchical motion estimation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/533Motion estimation using multistep search, e.g. 2D-log search or one-at-a-time search [OTS]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop

Definitions

  • the invention relates to video coding and decoding technology, and particularly to an adaptive interpolation method and system for improving motion compensated predictive video coding and decoding.
  • a typical video encoding system is based on the motion compensated prediction technique with motion vectors of a fractional pixel resolution.
  • the motion vectors can be in half-pixel resolution (or precision).
  • the resolution of the motion vectors can be higher, i.e., in 1/4-pixel resolution.
  • AVC Advanced Video Coding
  • AIF adaptive interpolation filtering
  • US Patent Application No. 2004/0076333 entitled “Adaptive Interpolation Filter System for Motion Compensated Predictive Video Coding”
  • the adaptive interpolation filter utilizes a heuristic search technique to increase the efficiency of coding.
  • the main disadvantage of the heuristic search method is that it fails to converge to "optimum" or "near optimum" solutions unless it begins from a "good" initial starting point.
  • the global minimum filter may never be found if the initial starting point is poorly chosen.
  • One way to counteract this problem is to execute multiple search passes. However, multiple-pass searching will significantly increase the computing load, which is not suitable for real-time coding applications.
  • "Wiener Interpolation Filter for H.264/AVC" disclosed a two-dimensional (2D) non-separable interpolation filtering technique composed of five groups of filters, calculated independently for each frame by minimizing prediction errors.
  • the problem of this method is that there is no relationship among the five groups of filters, so a large number of bits is required to transmit the five groups of filter coefficients for each frame. Therefore, the two-dimensional (2D) non-separable interpolation filtering technique imposes a heavy computing complexity in both the filter training and the interpolation operation.
  • the present invention provides an adaptive interpolation method and system for motion compensated predictive video codec, which is capable of minimizing the difference between the raw picture and the predicted picture.
  • the present invention also provides a decoding method and system corresponding to the interpolation method and system.
  • the training procedure to find an optimal interpolation filter can be performed by a one-pass fast algorithm, making it feasible for real-time coding applications.
  • the present invention provides an adaptive interpolation method for motion compensated predictive video codec, which comprises: providing a set of filters for a current frame; interpolating a reference frame having a certain precision according to the set of filters; calculating motion vectors to generate a prediction frame of the current frame in view of the interpolated reference frame; constructing a first interpolation filter F1 of the set of filters in view of a first part of all the sub-pixel positions according to a fixed linear relationship among samples of the first part; training the first filter F1 by performing Least Square Estimation on the sub-pixel positions of the first part; constructing a second filter F2 in view of a second part of all the sub-pixel positions according to a fixed linear relationship among samples of the second part; training the second filter F2 of the set of filters by performing Least Square Estimation on the sub-pixel positions of the second part under the constraint of F1; and re-training the first filter F1 on the sub-pixel positions of the first part under the constraint of F2.
  • the first filter F1 is employed to interpolate samples at horizontal half-pixel positions or vertical half-pixel positions.
  • the second filter F2 is employed to interpolate samples at horizontal and vertical half-pixel positions.
  • the samples at other sub-pixel positions are interpolated under a fixed linear relationship between the samples at half-pixel or integer-pixel positions and the samples at sub-pixel positions of higher precision.
  • said step for employing the filter F2 to interpolate samples at horizontal and vertical half-pixel positions further comprises: filtering the up-left NxN integer samples of horizontal and vertical half-pixel sample positions using the filter F2 to obtain a first middle result; filtering the up-right NxN integer samples of horizontal and vertical half-pixel sample positions using the filter F2 to obtain a second middle result; filtering the down-left NxN integer samples of horizontal and vertical half-pixel sample positions using the filter F2 to obtain a third middle result; filtering the down-right NxN integer samples of horizontal and vertical half-pixel sample positions by using the filter F2 to obtain a fourth middle result; and interpolating samples at horizontal and vertical half-pixel sample positions by averaging the first, the second, the third, and the fourth obtained results, wherein N is an integer.
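The four-neighbourhood averaging described above can be sketched as follows. This is an illustrative Python sketch: the function names, the offset convention chosen for the four NxN neighbourhoods, and the plain-list representation of the reference frame are assumptions, not definitions from the patent.

```python
def filter_nxn(block, f2):
    """Apply the NxN filter F2 to one NxN block of integer samples."""
    n = len(f2)
    return sum(f2[i][j] * block[i][j] for i in range(n) for j in range(n))

def interp_center_half_pel(ref, x, y, f2):
    """Interpolate the horizontal-and-vertical half-pixel sample lying
    between integer positions (x, y) and (x+1, y+1): filter the up-left,
    up-right, down-left and down-right NxN integer neighbourhoods with F2
    and average the four middle results, as in the step above.
    ref: 2D list of integer samples; f2: NxN coefficient matrix."""
    n = len(f2)
    def block(x0, y0):
        return [[ref[y0 + i][x0 + j] for j in range(n)] for i in range(n)]
    # offsets below are one plausible convention, chosen for illustration
    r1 = filter_nxn(block(x - n + 1, y - n + 1), f2)  # up-left
    r2 = filter_nxn(block(x + 1,     y - n + 1), f2)  # up-right
    r3 = filter_nxn(block(x - n + 1, y + 1),     f2)  # down-left
    r4 = filter_nxn(block(x + 1,     y + 1),     f2)  # down-right
    return (r1 + r2 + r3 + r4) / 4.0
```

With a normalized F2 (coefficients summing to 1), a constant reference frame interpolates to the same constant, which is a quick sanity check on the averaging.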
  • the present invention further provides a video encoder, which comprises a summer, a motion compensation module, a motion estimation module, an encoding module, a feedback decoding module and an adaptive interpolation system, wherein said adaptive interpolation system further comprises: a device configured to provide a set of filters for a current frame; a device configured to interpolate a reference frame having a certain precision according to the set of filters; a device configured to calculate motion vectors of the current frame in view of the interpolated reference frame; a device configured to train at least one filter of the set by performing Least Square Estimation using the calculated motion vectors according to the equation below:

        (e)^2 = Σ_x Σ_y [ S(x, y) − Σ_{i=0..N-1} Σ_{j=0..M-1} h_{i,j} · P(x + mvx + i, y + mvy + j) ]^2
  • e represents the difference between the current frame and a prediction of the current frame
  • S represents the current frame
  • P represents the reference frame
  • x, y represent the x and y coordinates, respectively
  • NxM is the size of the filter
  • (mvx, mvy) represents the motion vectors
  • h represents the float filter coefficients
  • i, j represent the coordinates of filter coefficients
  • a device configured to obtain a desirable prediction of the current frame by using the optimum filter set.
  • the present invention provides a decoding method for motion compensated predictive video codec, which comprises: receiving an encoded set of filters, motion vectors and prediction error, in which said filters include a first filter F1 and a second filter F2; decoding the received set of filters, motion vectors and prediction error by using predictive coding and the Exponential-Golomb (Exp-Golomb) method; determining samples to be interpolated according to the decoded motion vectors; interpolating a reference frame using the decoded set of filters, which further includes: applying the filter F1 to interpolate a first plurality of samples among said determined samples, where the first plurality of samples are at horizontal or vertical half-pixel sample positions; and applying the filter F2 to interpolate a second plurality of samples among said determined samples, where the second plurality of samples are at horizontal and vertical half-pixel sample positions; and reconstructing the current frame using the interpolated reference frame, the decoded motion vectors and the decoded prediction error.
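For reference, the Exp-Golomb entropy code mentioned above decodes as follows. This is a minimal sketch, assuming the bitstream is given as a Python string of '0'/'1' characters; the function names are illustrative and not from the patent.

```python
def decode_ue(bits, pos=0):
    """Decode one unsigned Exp-Golomb codeword ue(v).
    Format: n leading zeros, a separating '1', then n info bits;
    value = 2**n - 1 + info. Returns (value, next_position)."""
    n = 0
    while bits[pos] == '0':
        n += 1
        pos += 1
    pos += 1  # skip the separating '1'
    info = int(bits[pos:pos + n], 2) if n else 0
    return (1 << n) - 1 + info, pos + n

def decode_se(bits, pos=0):
    """Signed Exp-Golomb se(v): codeNum k maps to 0, 1, -1, 2, -2, ..."""
    k, pos = decode_ue(bits, pos)
    val = (k + 1) // 2 if k % 2 else -(k // 2)
    return val, pos
```

Signed codes of this kind suit filter-coefficient differences, which cluster around zero, so short codewords carry the common values.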
  • said step for applying the filter F2 to interpolate said second plurality of samples further comprises: filtering the up-left NxN integer samples of horizontal and vertical half-pixel sample positions using the filter F2 to obtain a first middle result; filtering the up-right NxN integer samples of horizontal and vertical half-pixel sample positions using the filter F2 to obtain a second middle result; filtering the down-left NxN integer samples of horizontal and vertical half-pixel sample positions using the filter F2 to obtain a third middle result; filtering the down-right NxN integer samples of horizontal and vertical half-pixel sample positions by using the filter F2 to obtain a fourth middle result; and interpolating samples at horizontal and vertical half-pixel sample positions by averaging the first, the second, the third, and the fourth obtained results, wherein N is an integer.
  • said interpolating the reference frame using the decoded set of filters further includes: applying a fixed filter to interpolate samples at other sub-pixel sample positions under a fixed linear relationship between the samples at half-pixel or integer-pixel positions and the samples at sub-pixel position with higher precision after the samples at half-pixel positions are interpolated using the filter F1 or filter F2.
  • the present invention further provides a video decoder, which comprises a decoding module configured to receive and decode an encoded set of filters, motion vectors and prediction error; a motion compensation module configured to interpolate the reference frame using the decoded set of filters including a first filter F1 and a second filter F2, which further comprises: means for determining samples to be interpolated according to the decoded motion vectors; means for applying the filter F1 to interpolate a first plurality of samples among said determined samples if the first plurality of samples are at horizontal or vertical half-pixel sample positions; and means for applying the filter F2 to interpolate a second plurality of samples among said determined samples if the second plurality of samples are at horizontal and vertical half-pixel sample positions; and a reconstruction module configured for reconstructing the current frame using the interpolated reference frame, the decoded motion vectors and the decoded prediction error.
  • a decoding module configured to receive and decode an encoded set of filters, motion vectors and prediction error
  • a motion compensation module configured to interpolate
  • said means for applying the filter F2 to interpolate said second samples further comprises: means for filtering the up-left NxN integer samples of horizontal and vertical half-pixel sample positions using the filter F2 to obtain a first middle result; means for filtering the up-right NxN integer samples of horizontal and vertical half-pixel sample positions using the filter F2 to obtain a second middle result; means for filtering the down-left NxN integer samples of horizontal and vertical half-pixel sample positions using the filter F2 to obtain a third middle result; means for filtering the down-right NxN integer samples of horizontal and vertical half-pixel sample positions by using the filter F2 to obtain a fourth middle result; and means for interpolating samples at horizontal and vertical half-pixel sample positions by averaging the first, the second, the third, and the fourth obtained results, wherein N is an integer.
  • said motion compensation module further comprises: means for applying a fixed filter to interpolate samples at other sub-pixel sample positions under a fixed linear relationship between the samples at half-pixel or integer-pixel positions and the samples at sub-pixel position with higher precision after the samples at half-pixel positions are interpolated using the filter F1 or filter F2.
  • FIG. 1 is a block diagram of a video codec having an adaptive interpolation system
  • Fig. 2 is a flow chart illustrating the process of the video encoding with adaptive interpolation filtering
  • FIG. 3 is a flow chart illustrating the first embodiment of the process of training adaptive interpolation filters;
  • Fig. 4 is a flow chart illustrating the second embodiment of the process of training adaptive interpolation filters;
  • Fig. 5 shows a sub-pixel interpolation scheme of H.264/AVC by incorporating the interpolation method according to the present invention, wherein those shaded blocks with upper-case letters represent integer samples and unshaded blocks with lower-case letters represent fractional sample positions for quarter sample Luma interpolation;
  • Fig. 6 is a photo showing a subjective comparison of reconstructed video quality with and without the adaptive interpolation system in H.264/AVC;
  • Fig. 7 is a flow chart illustrating a decoding method according to the present invention.
  • Fig. 8 is a block diagram of a decoder for implementing the decoding method according to Fig. 7.
  • FIG. 1 is a block diagram showing a video codec 170 with an adaptive interpolation system 110, which is capable of improving the video compression efficiency by utilizing an adaptive filter set in the process of motion compensated prediction.
  • the video codec 170 comprises an encoder 171 and a decoder 172.
  • the encoder 171 comprises a summer 120, a motion compensation module 115, a motion estimation module 105, an encoding module 125, a feedback decoding module 130, and an adaptive interpolation system 110.
  • the decoder 172 comprises a decoding module 135, a motion compensation module 140, and a reconstruction module 145.
  • a current frame s(t), namely, a raw image signal to be coded, is input into the encoder 171, namely, to the summer 120, the adaptive interpolation system 110 and the motion estimation module 105.
  • the current frame s(t) may be predicted by a motion compensated prediction technique based on a reference frame s'(t-1) which was obtained by reconstructing a previous encoded frame in the feedback decoding module 130.
  • an interpolated frame is transmitted from the adaptive interpolation filter system 110 into the Motion Estimation module 105.
  • the interpolated frame is obtained by interpolating the reference frame s'(t-1) according to a default filter set of the adaptive interpolation system 110.
  • the default filter set may be a fixed filter set or an adaptive filter set trained on the immediately preceding frame.
  • a filter set in the present invention comprises a set of filters, each of which is designed for its specific sub-pixel resolution positions.
  • two kinds of filters may be required: the first one for interpolating horizontal 1/2 sub-pixel positions and the vertical 1/2 sub-pixel positions of the reference frame, and the second one for interpolating the 1/4 sub-pixel positions of the reference frame.
  • the interpolation filter system 110 is also capable of determining the pattern of the filter set, such as the relationship among the filters.
  • the motion estimation module 105 partitions the input current frame s(t) into multiple blocks and assigns a motion vector MV to each of the blocks in view of the interpolated frame. It is apparent that the motion vectors relating to the interpolated frame and the current frame may have a fractional pixel resolution.
  • the motion vectors MV for all the blocks in the current frame s(t) are provided to the adaptive interpolation system 110, the motion compensation module 115, and the encoding module 125.
  • the motion compensation module 115 utilizes the received motion vectors as well as the interpolation filter set from the adaptive interpolation filter system 110 to generate the prediction frame s_pre(t).
  • the adaptive interpolation filter system 110 receives the current frame s(t) from the input of the encoder 171, the reference frame s'(t-1) from the feedback decoding module 130, and motion vectors from the motion estimation module 105, and adaptively optimizes a filter set by utilizing the above received information until an optimum filter set is obtained.
  • the principle of the adaptive interpolation filter system 110 as well as an optimization process employed therein will be described in detail later.
  • the motion compensation module 115 utilizes the optimum filter set derived from the adaptive interpolation filter system 110 to improve the prediction s_pre(t) of the current frame s(t).
  • the prediction s_pre(t) of the current frame s(t) is transmitted to the summer 120 and subtracted from the current frame s(t).
  • the difference between the input current frame s(t) and the prediction s_pre(t) is encoded by the encoding module 125.
  • the encoded difference, together with the encoded motion vectors of the current frame, is sent to the decoding module 135.
  • the optimum filter set obtained by the adaptive interpolation system 110 is also transmitted to motion compensation module 140.
  • the decoding module 135 decodes the encoded difference and the encoded MV, and transmits the decoded signals to the motion compensation module 140.
  • the motion compensation module 140 is used for determining the samples to be interpolated according to the decoded MV and for interpolating the reference frame so as to recover the motion compensated prediction frame based on the decoded difference and motion vectors by using the optimum filter set from the adaptive interpolation system 110.
  • the reconstruction module 145 receives the decoded difference from the decoding module 135 and the motion compensated prediction frame from the motion compensation module 140 so as to reconstruct the required video signal s'(t) as the sum of the decoded difference and the decoded prediction.
  • the adaptive interpolation filter system 110 is able to adaptively optimize a filter set according to the current frame s(t), the previously reconstructed reference frame s'(t-1) and motion vectors having a fractional pixel resolution to obtain an optimum filter set.
  • the optimization process carried out by the adaptive interpolation filter system 110 is described with reference to Figs. 2, 3 and 4.
  • Fig. 2 shows an encoding process of a current frame carried out by encoder 171.
  • the frame to be processed is an inter-frame.
  • the inter-frame refers to a frame in a video codec which is expressed as the change from one or more other frames.
  • the "inter" part of the term refers to the use of inter-frame prediction.
  • step 200 is carried out to determine whether the current frame to be coded is the first inter-frame.
  • a default filter set is selected in step 210 and a reference frame of the first inter-frame is interpolated in step 215 by the default filter set.
  • the default filter set may be a fixed filter set preset in the system 110.
  • an adaptive filter set is selected in step 205. This adaptive filter set may be the optimum filter set obtained by the training process of the immediately preceding inter-frame.
  • a reference frame will be interpolated in step 215 by the selected adaptive filter set.
  • each block of the current frame is searched against the corresponding block of the reference frame at fractional pixel resolution, so that motion vectors representative of the least distortion between the current frame and its prediction frame are obtained.
  • the motion estimation is implemented based on a default filter set selected in step 210 or an adaptive filter set selected in step 205.
  • the default filter set or the adaptive filter set (hereafter "designated filter set”) will be optimized to derive an optimum filter set for the current frame so as to improve the motion estimation and thereby enhance the coding efficiency.
  • the objective of the optimization is to minimize the prediction error between the current frame and the prediction frame by a Least Square Estimation.
  • the prediction error is represented by (e)^2 using the following formula (formula 1-1):

        (e)^2 = Σ_x Σ_y ( S(x, y) − S_pre(x, y) )^2
  • S represents the current frame to be coded
  • S_pre represents the prediction frame from the motion compensation module 115
  • x and y represent x and y coordinates, respectively, of a pixel of the current frame.
  • if, in step 230, the optimized filter set satisfies a stopping condition, the optimized filter set is identified as the optimum interpolation filter set for the current frame.
  • the motion compensation prediction for the current frame will be executed in step 235.
  • the current frame is encoded using the motion predictive estimation with the optimum filter set of the invention in step 240.
  • the procedure returns to step 205 at which the obtained optimized filter set is selected to be the current adaptive filter set. Then, steps from 205 to 230 will be repeated to iteratively optimize the filter set until the stopping condition is satisfied.
  • the stopping condition may be a preset number of iteration cycles, a set of desirable coefficients of the filter set, or a desirable prediction error. The stopping condition should be determined by a trade-off between the distortion of an image and the complexity of processing it.
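The iterate-until-stopping-condition control flow of steps 205-230 might be sketched as below. All names here are hypothetical: `train_step` stands in for one motion estimation plus Least Square Estimation pass, and the convergence test on the prediction error is one of the stopping conditions named above.

```python
def optimize_filter_set(train_step, initial_filters, max_iters=10, tol=1e-6):
    """Repeat the train/evaluate loop: re-train the filter set on the
    current motion vectors until a stopping condition holds (iteration
    cap, or convergence of the prediction error between passes).
    `train_step(filters)` is assumed to return (new_filters, error)."""
    filters, prev_err = initial_filters, float('inf')
    for _ in range(max_iters):
        filters, err = train_step(filters)
        if abs(prev_err - err) < tol:  # prediction error converged
            break
        prev_err = err
    return filters
```

A coefficient-convergence test could be substituted for the error test without changing the loop's structure.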
  • the present invention aims to minimize the prediction error by optimizing the filter set using the Least Square Estimation.
  • the detailed optimization procedure will be described hereinafter by referring to Fig. 3.
  • Fig. 3 is a flowchart illustrating the adaptive optimizing step
  • the coefficients of all the filters of the filter set can be simultaneously trained to minimize the prediction error by using Least Square Estimation.
  • the parameter values comprise, for example, the sub-pixel resolution, which determines the number of filters needed for the filter set, and the filter taps, which determine the size of each filter in the set.
  • the filtering pattern includes filtering patterns in respect of each sub-pixel position as well as the relationship among the filters.
  • in step 310, coefficients of the filter set (i.e. coefficients of each filter with a specific sub-pixel resolution) are adaptively trained to minimize the square error (e)^2 in formula 1-1.
  • the prediction frame S_pre in formula 1-1 can be calculated using the following formula (formula 1-2):

        S_pre(x, y) = Σ_{i=0..N-1} Σ_{j=0..M-1} h_{i,j} · P(x + mvx + i, y + mvy + j)
  • NxM is the size of a filter
  • P represents the reference frame
  • (mvx, mvy) represents the motion vectors of a current sub-pixel at the position (x, y)
  • h represents the filter coefficients for the current sub-pixel position
  • the filter size is decided by the filter taps, which were determined in step 200 as shown in Fig. 2.
  • the square error (e)^2 can be obtained by using the following formula (formula 1-3):

        (e)^2 = Σ_x Σ_y [ S(x, y) − Σ_{i=0..N-1} Σ_{j=0..M-1} h_{i,j} · P(x + mvx + i, y + mvy + j) ]^2

    wherein e represents the difference between the current frame and a prediction of the current frame; NxM is the size of a filter; S represents the current frame; P represents the reference frame; x and y represent the x and y coordinates, respectively; (mvx, mvy) represents the motion vectors; h represents the float filter coefficients; and i, j represent the coordinates of filter coefficients.
  • the training of the filter set in step 310 is to calculate optimum filter coefficients h minimizing the square error (e)^2.
  • Such a training step can be achieved by using Least Square Estimation.
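As an illustration of such a Least Square Estimation step, the quadratic error in h can be minimized in closed form by solving the normal equations (A^T A) h = A^T b. The sketch below does this in pure Python for a 1-D filter; the data layout and function name are assumptions for illustration, not the patent's notation.

```python
def train_filter_lse(rows, targets, taps):
    """Least Square Estimation of filter coefficients h minimizing
    sum_k (targets[k] - sum_j h[j] * rows[k][j])**2.
    rows: one length-`taps` vector of reference samples per training
    position; targets: the co-located original samples."""
    # accumulate the normal-equation system A^T A and A^T b
    ata = [[0.0] * taps for _ in range(taps)]
    atb = [0.0] * taps
    for row, t in zip(rows, targets):
        for i in range(taps):
            atb[i] += row[i] * t
            for j in range(taps):
                ata[i][j] += row[i] * row[j]
    # Gauss-Jordan elimination with partial pivoting
    m = [ata[i] + [atb[i]] for i in range(taps)]
    for col in range(taps):
        piv = max(range(col, taps), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(taps):
            if r != col and m[col][col]:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][taps] / m[i][i] for i in range(taps)]
```

Because the error is quadratic in h, this single solve reaches the minimum directly, which is what makes a one-pass training algorithm possible.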
  • the coefficients h of the present invention are float coefficients, which differ from the quantized coefficients used in US patent application No. 2004/0076333 as stated in the background. In order to minimize the prediction error, in US 2004/0076333 the quantized coefficients of the filter are searched using the heuristic search method, whereas in the present invention the float coefficients of the filter set are derived using the Least Square Estimation method. Therefore, the filter set obtained using the present invention is a globally optimum interpolation filter set.
  • step 315 is carried out to map the float filter coefficients to quantized filter coefficients according to the required precision of the present embodiment. It is understood that this mapping step is employed to facilitate the training of the interpolation filter set.
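The float-to-quantized mapping of step 315 amounts to scaling by a power of two and rounding; a minimal sketch follows. The precision of 1/256 is illustrative only: the patent requires just "the required precision", and the function names are assumptions.

```python
def quantize_coeffs(float_coeffs, precision_bits=8):
    """Map float filter coefficients to integer (quantized) coefficients
    at a precision of 1 / 2**precision_bits, as in the mapping step."""
    scale = 1 << precision_bits
    return [round(c * scale) for c in float_coeffs]

def dequantize_coeffs(q_coeffs, precision_bits=8):
    """Recover the approximate float coefficients from the integers."""
    scale = 1 << precision_bits
    return [q / scale for q in q_coeffs]
```

Integer coefficients of this kind are what the encoder can transmit and what keeps the interpolation arithmetic in fixed point.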
  • the filter set with quantization coefficients is the trained filter set in the current iteration.
  • the procedure will go to step 230 of Fig. 2 to determine whether the trained filter set of the current iteration satisfies a stopping condition. If it is "yes", the trained filter of this iteration is the desired optimized interpolation filter, namely, optimum filter set.
  • the objective of the optimization is to minimize the square error as mentioned above according to Figs. 2 and 3. It is impossible to directly apply the Least Square Estimation to the error e due to the unknown motion vectors (mvx, mvy) and coefficients h in formula 1-3.
  • the above embodiment addresses this issue as follows: (1) setting a default filter set or an adaptive interpolation filter set H'; (2) finding motion vectors which can optimize the objective by motion estimation; (3) performing the Least Square Estimation on the interpolation filter set H under the constraints of the just-obtained motion vectors; the filter set H then replaces the filter set H' in step 1, and the interpolation filter set is further optimized by iteratively performing steps 1-3 until the coefficients of the filter set H converge.
  • the present invention proposes a second embodiment which is capable of reducing the bit rate of the filter coefficients of a filter set H and the computing complexity of filtering the whole set S of sub-pixel positions.
  • a filter set H in step 225 as shown in Fig. 2 can also be optimized.
  • Step 400 is to construct a filter F1 according to a predetermined filtering pattern and assumed relationship among the first sub-set of sub-pixel positions.
  • F1 is used to compute middle results which will be further used to interpolate those samples at other related sub-pixel positions higher than half-pixel precision.
  • the relationship between the samples at sub-pixel position with higher precision than half-pixel precision and the samples at horizontal half-pixel position or vertical half-pixel position with half-pixel precision should be defined by fixed linear functions, such as a linear averaging function.
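A fixed linear averaging function of this kind, with the usual rounding offset, could look like the sketch below. This is an assumption modeled on the familiar H.264 quarter-sample rule (e.g. a = (G + b + 1) >> 1), not a definition taken from the patent.

```python
def quarter_pel(a, b):
    """Sample at a finer-than-half-pel position as the fixed linear
    average of its two neighbouring integer/half-pixel samples,
    with a rounding offset before the halving shift."""
    return (a + b + 1) >> 1
```

Because this relationship is fixed and known to both encoder and decoder, only the half-pel filters F1 and F2 need to be transmitted.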
  • the set of all samples at related sub-pixel positions in this step is called S1.
  • in step 405, F1 is optimized by Least Square Estimation to minimize the prediction error of S1 between the current frame and the prediction frame.
  • the difference between this embodiment and the above-mentioned first embodiment is that the prediction frame of the first embodiment is obtained based on the whole filter set including all filters in the set, and all the filters are trained simultaneously for minimizing the prediction error, while in the second embodiment the prediction frame in step 405 is obtained based on the filter F1 only and therefore the training procedure herein is only for the filter F1.
  • Step 410 is to construct another filter F2 according to the predetermined filter pattern and assumed relationship among another sub-set of sub-pixel positions.
  • in step 415, F2 is optimized by Least Square Estimation under the constraints of S2 and the optimized F1 obtained in step 405.
  • the optimizing procedure for F2 is similar to that for F1 as described in step 405, so it is omitted here.
  • F1 is further optimized by Least Square Estimation under the constraints of S1 and the optimized F2 in step 420.
  • the procedure goes to step 425 for determining whether the optimizing procedure satisfies a stop condition.
  • the stopping condition may be a preset number of iteration cycles, convergence of the coefficients of the filter set, or the prediction error between the current frame and the prediction frame falling within a desirable range.
  • the filter set H is thus reduced to two interpolation filters F1 and F2, which, together with said fixed linear relationships among the sub-pixel positions, can be used to interpolate the samples at all sub-pixel positions.
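The alternating training of F1 and F2 with a stop condition can be sketched abstractly. In this illustration the two "filters" are reduced to single coefficients and the per-step refits are closed-form minimizers of one shared quadratic objective; these are assumptions chosen to keep the example tiny, while the patent's real updates are least-squares fits of full tap vectors.

```python
# Coordinate-descent analogue of "optimize F1 with F2 fixed, then F2 under the
# new F1, repeat until the coefficients are convergent".
# Shared toy objective: J(f1, f2) = (f1 + f2 - 1)^2 + (f1 - 2*f2)^2.

def refit_f1(f2):
    # Closed-form minimizer of J over f1 with f2 held fixed.
    return (1.0 + f2) / 2.0

def refit_f2(f1):
    # Closed-form minimizer of J over f2 with f1 held fixed.
    return (1.0 + f1) / 5.0

def train(f1=0.0, f2=0.0, tol=1e-12, max_cycles=100):
    for _ in range(max_cycles):           # stop condition 1: iteration cap
        new_f1 = refit_f1(f2)             # step: optimize F1 with F2 fixed
        new_f2 = refit_f2(new_f1)         # step: optimize F2 under the new F1
        converged = abs(new_f1 - f1) < tol and abs(new_f2 - f2) < tol
        f1, f2 = new_f1, new_f2
        if converged:                     # stop condition 2: coefficients convergent
            break
    return f1, f2

f1, f2 = train()
```

For this objective the alternating updates converge linearly to the joint minimizer (2/3, 1/3), illustrating why iterating until the coefficient changes fall below a tolerance is a sound stopping rule.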
  • Luma samples 'A' to 'U' at full-sample locations are derived by the following rules.
  • the Luma prediction values at half sample positions shall be derived by applying a fixed 6-tap filter with tap values (1, -5, 20, 20, -5, 1).
  • the Luma prediction values at quarter sample positions shall be derived by averaging samples at full and half sample positions.
  • the sample at the half sample position labeled "b" is derived by first calculating an intermediate value b1, obtained by applying the fixed 6-tap filter to the nearest integer position samples E, F, G, H, I and J in the horizontal direction.
  • E, F, G, H, I and J represent the six full samples in the horizontal direction, and A, C, G, M, R and T represent the six full samples in the vertical direction. When the fixed filter is applied to the half samples b and h, each tap is applied to the corresponding full sample in the respective direction.
  • ">> n" means shifting (b1 + 16) or (h1 + 16) rightwards by n bits (where n is an integer), and "Clip1" is a clipping operation which constrains the filtered results b and h to the range 0 to 255.
  • here n equals 5, i.e., the intermediate value b1 or h1 is divided by 2^5 = 32 (because it has been scaled up by 32 by the filter (1, -5, 20, 20, -5, 1) in the above process).
  • the samples at quarter sample positions labeled as a, c, d, n, f, i, k, and q shall be derived by averaging with upward rounding of the two nearest samples at integer and half sample positions.
  • the samples at quarter sample positions labeled as e, g, p, and r shall be derived by averaging with upward rounding of the two nearest samples at half sample positions in the diagonal direction.
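The fixed H.264/AVC rules quoted above can be sketched directly. The sample values below are illustrative, and Clip1 is the standard clip of the filtered result to the 8-bit range [0, 255].

```python
TAPS = (1, -5, 20, 20, -5, 1)   # fixed 6-tap half-sample filter

def clip1(v):
    """Constrain a filtered result to the 8-bit luma range [0, 255]."""
    return max(0, min(255, v))

def half_sample(six_full_samples):
    """Intermediate value b1 via the 6-tap filter, then (b1 + 16) >> 5 with clipping."""
    b1 = sum(t * s for t, s in zip(TAPS, six_full_samples))
    return clip1((b1 + 16) >> 5)

def quarter_sample(p, q):
    """Average of the two nearest samples with upward rounding."""
    return (p + q + 1) >> 1

b = half_sample([100, 100, 100, 100, 100, 100])   # a flat area stays at 100
a = quarter_sample(100, b)                        # quarter sample between G and b
```

Note the rounding offset 16 corresponds to half of the scale factor 32 (the filter taps sum to 32), so the shift implements division by 32 with rounding.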
  • the motion vector precision is set to be 1/4 pixel and the largest reference area of one sub-pixel position is set as 6x6, which may be done in step 300 of Fig. 3.
  • the filtering pattern is also determined in step 305 of Fig. 3.
  • an asymmetrical 6-tap filter F1(x0, x1, x2, x3, x4, x5) is used to interpolate samples like "b" and "h".
  • the filtering operation is the same as that of H.264/AVC.
  • the filter F1 can be optimized by Least Square Estimation.
  • F2 filters the up-left 3x3 integer samples of sample "j" (A0, A1, A, C0, C1, C, E, F and G) and gets a middle result G1.
  • F2 filters the up-right 3x3 integer samples of sample "j" (B0, B1, B, D0, D1, D, J, I and H) and gets a middle result H1.
  • F2 further filters the down-left 3x3 integer samples of sample "j" (T0, T1, T, R0, R1, R, K, L and M) and gets a middle result M1, and filters the down-right 3x3 integer samples of sample "j" (U0, U1, U, S0, S1, S, Q, P and N) and gets a middle result N1. The interpolated sample "j" is then computed by averaging G1, H1, M1 and N1.
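The corner-filtering scheme for sample "j" can be sketched as follows: the same 9-tap filter is applied to the 3x3 integer neighbourhood at each of the four corners, and the four middle results are averaged. The uniform tap values and flat sample data below are assumptions for illustration only; a trained F2 would have arbitrary coefficients.

```python
def corner_result(y_taps, samples3x3):
    """One middle result (G1, H1, M1 or N1): 9-tap dot product over a 3x3 block."""
    return sum(y * s for y, s in zip(y_taps, samples3x3))

def interpolate_j(y_taps, ul, ur, dl, dr):
    """Filter each 3x3 corner neighbourhood of 'j', then average the four results."""
    g1 = corner_result(y_taps, ul)  # up-left corner
    h1 = corner_result(y_taps, ur)  # up-right corner
    m1 = corner_result(y_taps, dl)  # down-left corner
    n1 = corner_result(y_taps, dr)  # down-right corner
    return (g1 + h1 + m1 + n1) / 4.0

uniform = [1.0 / 9.0] * 9           # toy averaging filter (assumed taps)
flat = [100.0] * 9                  # flat 3x3 neighbourhood of value 100
j = interpolate_j(uniform, flat, flat, flat, flat)
```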
  • the filter F2 can be optimized by the Least Square Estimation method under the constraints of F1 and the samples at the sub-pixel positions of "j", "f", "i", "k" and "q". F1 and F2 are optimized iteratively until the coefficients of F1 and F2 are both convergent.
  • coefficients relating to F1 and F2 are fixed or adaptively searched by Downhill simplex search or heuristic search method.
  • filters F1 and F2 are trained by Least Square Estimation method, and they are optimized iteratively.
  • LDL^T: Lower Triangular matrix x Diagonal matrix x Lower Triangular matrix Transpose.
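Least Square Estimation leads to symmetric normal equations (A^T A) h = A^T y, which the LDL^T decomposition named above solves without square roots. A small sketch follows; the 2x2 system is an illustrative assumption, whereas a real filter fit would produce a 6x6 or 9x9 system of the same form.

```python
def ldlt_solve(a, b):
    """Solve A x = b for symmetric positive-definite A via A = L D L^T,
    where L is unit lower-triangular and D is diagonal (no square roots)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = a[j][j] - sum(L[j][k] * L[j][k] * d[k] for k in range(j))
        L[j][j] = 1.0
        for i in range(j + 1, n):
            L[i][j] = (a[i][j] - sum(L[i][k] * L[j][k] * d[k] for k in range(j))) / d[j]
    z = []                                  # forward substitution: L z = b
    for i in range(n):
        z.append(b[i] - sum(L[i][k] * z[k] for k in range(i)))
    y = [z[i] / d[i] for i in range(n)]     # diagonal solve: D y = z
    x = [0.0] * n                           # back substitution: L^T x = y
    for i in range(n - 1, -1, -1):
        x[i] = y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))
    return x

# Symmetric positive-definite system with known solution x = (1, 2):
x = ldlt_solve([[4.0, 2.0], [2.0, 3.0]], [8.0, 8.0])
```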
  • the coefficients obtained according to the present invention are floating-point coefficients.
  • the quantized F1 and F2 are encoded by, e.g., a known method called "predictive coding and Exponential-Golomb coding". The encoded filters F1 and F2 will be transmitted to the encoder 125 as part of the raw frame.
  • Fig. 6 shows an experimental result by employing the interpolation method according to the present invention.
  • the time cost of the method is much less than that of the known methods given in the background.
  • one-pass training still obtains an improvement comparable to that of their multiple-pass training.
  • step 700 is implemented for receiving encoded information including the encoded set of filters, motion vectors and prediction error from an encoder like the encoder 171.
  • the set of filters includes a first filter F1 and a second filter F2, but is not limited to this proposal.
  • in step 705, the received filters F1 and F2, motion vectors and prediction error are entropy decoded and recovered from the bitstream according to the known technique named "predictive coding and Exponential-Golomb".
  • step 710 is implemented to determine the samples to be interpolated according to the decoded motion vectors.
  • a reference frame is interpolated using the decoded set of filters by applying the filter F1 to interpolate a first plurality of samples among said determined samples, wherein the first plurality of samples are at horizontal or vertical half-pixel sample positions in step 715; and by applying the filter F2 to interpolate a second plurality of samples among said determined samples, wherein the second plurality of samples are at horizontal and vertical half-pixel sample positions in step 720.
  • step 725 the current frame is reconstructed using the interpolated reference frame, the decoded motion vectors and the decoded prediction error.
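The decoding flow of steps 700 to 725 can be condensed into a 1-D sketch. This is a hedged illustration under strong assumptions: entropy decoding is omitted, the filter set is reduced to a single 2-tap F1, and the motion vector is a plain integer shift.

```python
def reconstruct(ref, mv, f1, residual):
    """Interpolate the prediction from the reference, then add the decoded residual."""
    # steps 710-720: determine samples via the motion vector and filter them
    pred = [f1[0] * ref[i + mv] + f1[1] * ref[i + mv + 1]
            for i in range(len(residual))]
    # step 725: reconstruct the current frame = prediction + prediction error
    return [p + r for p, r in zip(pred, residual)]

ref = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]          # decoded reference samples
cur = reconstruct(ref, mv=1, f1=(0.5, 0.5), residual=[1.0, 1.0, 1.0, 1.0])
```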
  • Luma samples 'A' to 'U' at full-sample locations are derived by the following rules.
  • the Luma prediction values at horizontal or vertical half sample positions S1 (e.g., the sample position at b) shall be derived by applying filter F1 with tap values (x0, x1, x2, x3, x4, x5).
  • the Luma prediction values at horizontal and vertical half sample positions S2 (e.g., the sample position at h) shall be derived by applying filter F2 with tap values (y0, y1, y2, y3, y4, y5, y6, y7, y8) and an averaging filter.
  • the Luma prediction values at quarter sample positions shall be derived by averaging samples at full and half sample positions.
  • the sample at the half sample position labeled "b" is derived by first calculating an intermediate value b1, obtained by applying the adaptive filter F1 to the nearest integer position samples E, F, G, H, I and J in the horizontal direction.
  • the sample at the half sample position labeled "h" is derived by first calculating an intermediate value h1, obtained by applying the adaptive filter F1 to the nearest integer position samples A, C, G, M, R and T in the vertical direction, namely:
  • b1 = (x0 * E + x1 * F + x2 * G + x3 * H + x4 * I + x5 * J)
  • h1 = (x0 * A + x1 * C + x2 * G + x3 * M + x4 * R + x5 * T)
  • E, F, G, H, I and J represent the six full samples in the horizontal direction.
  • A, C, G, M, R and T represent the six full samples in the vertical direction. When the filter F1 is applied to the half samples b and h, each tap is applied to the corresponding full sample in the respective direction.
  • here n equals 7, i.e., the intermediate value b1 or h1 is divided by 2^7 = 128 (because it has been scaled up by 128 by the filter F1(x0, x1, x2, x3, x4, x5) in the above process).
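The integer form implied above can be sketched as follows: the adaptive F1 coefficients are pre-scaled by 128 and the filtered sum is shifted right by n = 7. The +64 rounding offset mirrors the +16 offset used with the fixed filter and is an assumption, as is the clip to [0, 255]; the toy coefficients are likewise illustrative.

```python
def apply_f1_scaled(taps128, samples):
    """6-tap adaptive filter whose integer coefficients are pre-scaled by 128."""
    acc = sum(t * s for t, s in zip(taps128, samples))  # result carries a 128x scale
    return max(0, min(255, (acc + 64) >> 7))            # round, divide by 2**7, clip

# Toy coefficients summing to 128 (here a plain two-tap average of the two
# centre samples, standing in for a trained F1):
b = apply_f1_scaled((0, 0, 64, 64, 0, 0), (10, 20, 30, 40, 50, 60))
```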
  • the sample at the horizontal and vertical half sample position labeled "j" is derived by applying F2 with tap values (y0, y1, y2, y3, y4, y5, y6, y7, y8) respectively to the 3x3 integer samples at each corner of "j".
  • F2 filters the up-left 3x3 integer samples of sample "j" (A0, A1, A, C0, C1, C, E, F and G) and gets a middle result G1.
  • F2 filters the up-right 3x3 integer samples of sample "j" (B0, B1, B, D0, D1, D, J, I and H) and gets a middle result H1.
  • F2 further filters the down-left 3x3 integer samples of sample "j" (T0, T1, T, R0, R1, R, K, L and M) and gets a middle result M1
  • F2 also filters the down-right 3x3 integer samples of sample "j" (U0, U1, U, S0, S1, S, Q, P and N) and gets a middle result N1.
  • G1 = (y0 * A0 + y1 * A1 + y2 * A + y3 * C0 + y4 * C1 + y5 * C + y6 * E + y7 * F + y8 * G),
  • H1 = (y0 * B0 + y1 * B1 + y2 * B + y3 * D0 + y4 * D1 + y5 * D + y6 * J + y7 * I + y8 * H),
  • M1 = (y0 * T0 + y1 * T1 + y2 * T + y3 * R0 + y4 * R1 + y5 * R + y6 * K + y7 * L + y8 * M),
  • N1 = (y0 * U0 + y1 * U1 + y2 * U + y3 * S0 + y4 * S1 + y5 * S + y6 * Q + y7 * P + y8 * N).
  • here n equals 9, i.e., the value of j is divided by 2^9 = 512 (because it has been scaled up by 512 in the above process).
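The final scaling for "j" can be sketched as follows: each middle result carries the 128x scale of the integer F2 coefficients, so the sum of the four corner results carries a 512x scale and is shifted right by n = 9. The +256 rounding offset and the clip to [0, 255] are assumptions mirroring the fixed-filter case.

```python
def finalize_j(g1, h1, m1, n1):
    """Sum the four corner results (total scale 512), round, shift by 9, clip."""
    return max(0, min(255, (g1 + h1 + m1 + n1 + 256) >> 9))

# On a flat area of value 100, each corner result is 100 * 128 = 12800:
j = finalize_j(12800, 12800, 12800, 12800)
```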
  • the samples at quarter sample positions labeled as "a, c, d, n, f, i, k, and q" shall be derived by averaging with upward rounding of the two nearest samples at integer and half sample positions.
  • the samples at quarter sample positions labeled as "e, g, p, and r" shall be derived by averaging with upward rounding of the two nearest samples at half sample positions in the diagonal direction.
  • video decoder 172 comprises a decoding module 135 configured to receive and decode an encoded set of filters, motion vectors and a prediction error; a motion compensation module 140 configured to interpolate the reference frame using the decoded set of filters including a first filter F1 and a second filter F2, which further comprises a sub-module 805 for determining samples to be interpolated according to the decoded motion vectors, a sub-module 810 for applying the filter F1 to interpolate a first plurality of samples among said determined samples where the first plurality of samples are at horizontal or vertical half-pixel sample positions, and a sub-module 815 for applying the filter F2 to interpolate a second plurality of samples among said determined samples where the second plurality of samples are at horizontal and vertical half-pixel sample positions; and

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to an adaptive interpolation method and system for a motion-compensated predictive video codec, and to a decoding method and system corresponding to the interpolation method and system. The interpolation method comprises providing a set of filters comprising F1 and F2 for a current frame; interpolating a reference frame according to the filters; computing motion vectors to generate a prediction frame; constructing and adaptively training F1 for a first subset of sub-pixel positions; constructing and adaptively training F2 for a second subset of sub-pixel positions under the constraint of F1; retraining F1 under the constraint of F2; and updating the filters with the trained filters F1 and F2 to further optimize them. With the invention, the difference between the current frame and its prediction frame can be minimized by a fast one-pass algorithm, making real-time encoding applications possible.
EP07859334A 2006-12-01 2007-11-30 Adaptive interpolation method and system for motion-compensated predictive video coding and decoding Withdrawn EP2092752A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2006003239 2006-12-01
PCT/IB2007/004305 WO2008068623A2 (fr) Adaptive interpolation method and system for motion-compensated predictive video coding and decoding

Publications (1)

Publication Number Publication Date
EP2092752A2 true EP2092752A2 (fr) 2009-08-26

Family

ID=39492687

Family Applications (1)

Application Number Title Priority Date Filing Date
EP07859334A Withdrawn EP2092752A2 (fr) Adaptive interpolation method and system for motion-compensated predictive video coding and decoding

Country Status (3)

Country Link
EP (1) EP2092752A2 (fr)
CN (1) CN101632306B (fr)
WO (1) WO2008068623A2 (fr)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2136565A1 (fr) * 2008-06-19 2009-12-23 Thomson Licensing Method for determining a filter for interpolating one or more pixels of a frame, method for coding or reconstructing a frame, and method for transmitting a frame
RU2530327C2 (ru) * 2008-07-29 2014-10-10 Франс Телеком Method for updating an encoder by means of filter interpolation
US9078007B2 (en) 2008-10-03 2015-07-07 Qualcomm Incorporated Digital video coding with interpolation filters and offsets
JP2011050001A (ja) * 2009-08-28 2011-03-10 Sony Corp Image processing apparatus and method
US9219921B2 (en) 2010-04-12 2015-12-22 Qualcomm Incorporated Mixed tap filters
CN101984669A (zh) * 2010-12-10 2011-03-09 河海大学 An iterative method for a frame-level adaptive Wiener interpolation filter
JP6715467B2 (ja) * 2015-07-01 2020-07-01 パナソニックIpマネジメント株式会社 Encoding method, decoding method, encoding device, decoding device, and encoding-decoding device
CN113196777B (zh) * 2018-12-17 2024-04-19 北京字节跳动网络技术有限公司 Reference pixel padding for motion compensation
CN112131529B (zh) * 2020-09-22 2023-10-13 南京大学 An accelerated verification method for pairs-trading cointegration relationships based on the E-G two-step method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19730305A1 (de) * 1997-07-15 1999-01-21 Bosch Gmbh Robert Method for generating an improved image signal in the motion estimation of image sequences, in particular a prediction signal for moving images with motion-compensating prediction
US7110459B2 (en) * 2002-04-10 2006-09-19 Microsoft Corporation Approximate bicubic filter
US20040076333A1 (en) * 2002-10-22 2004-04-22 Huipin Zhang Adaptive interpolation filter system for motion compensated predictive video coding
CN1216495C (zh) * 2003-09-27 2005-08-24 浙江大学 Method and apparatus for sub-pixel interpolation of video images
EP1578137A2 (fr) * 2004-03-17 2005-09-21 Matsushita Electric Industrial Co., Ltd. Video coding apparatus with a multi-stage interpolation method
EP1617672A1 (fr) * 2004-07-13 2006-01-18 Matsushita Electric Industrial Co., Ltd. Motion estimator/compensator including a 16-bit 1/8-pel interpolation filter

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2008068623A3 *

Also Published As

Publication number Publication date
CN101632306A (zh) 2010-01-20
WO2008068623A3 (fr) 2009-07-30
CN101632306B (zh) 2014-03-19
WO2008068623A2 (fr) 2008-06-12

Similar Documents

Publication Publication Date Title
TWI735172B (zh) Mutually exclusive settings for multiple tools
CN102396230B (zh) Image processing device and method
EP2092752A2 (fr) Adaptive interpolation method and system for motion-compensated predictive video coding and decoding
CN104041041B (zh) Motion vector scaling for non-uniform motion vector grids
CN102804779A (zh) Image processing device and method
TW202315408A (zh) Block-based prediction techniques
WO2013002144A1 (fr) Video image coding method and device, video image decoding method and device, and associated program
JP2021502031A (ja) Interpolation filter for an inter-prediction apparatus and method for video coding
US8170110B2 (en) Method and apparatus for zoom motion estimation
WO2008148272A1 (fr) Method and apparatus for sub-pixel motion-compensated video coding
JP7375224B2 (ja) Encoding and decoding method, apparatus, and device therefor
WO2013002150A1 (fr) Video image coding method and device, video image decoding method and device, and associated program
KR20140010174A (ko) Moving picture encoding device, moving picture decoding device, moving picture encoding method, moving picture decoding method, moving picture encoding program, and moving picture decoding program
JP2024069438A (ja) Coding using intra prediction
CN113994692A (zh) Method and apparatus for prediction refinement using optical flow
CN103069803B (zh) Video encoding method, video decoding method, video encoding device, and video decoding device
JP2023528609A (ja) Encoding and decoding method, apparatus, and device therefor
WO2022022278A1 (fr) Inter-frame prediction method, encoder, decoder, and computer storage medium
WO2022061680A1 (fr) Inter-frame prediction method, encoder, decoder, and computer storage medium
KR102435316B1 (ko) Image processing apparatus and method
WO2022037344A1 (fr) Inter-frame prediction method, encoder, decoder, and computer storage medium
WO2022077495A1 (fr) Inter-frame prediction method, encoder, decoders, and computer storage medium
Rusanovskyy et al. Video coding with pixel-aligned directional adaptive interpolation filters
WO2011142221A1 (fr) Encoding device and decoding device
TW202209893A (zh) Inter prediction method, encoder, decoder, and computer storage medium

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20090630

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR

R17D Deferred search report published (corrected)

Effective date: 20090730

DAX Request for extension of the european patent (deleted)
RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: ORANGE

17Q First examination report despatched

Effective date: 20160523

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20161203