WO2020043710A1

WO2020043710A1 - Filtering of image data

Info

Publication number: WO2020043710A1
Application number: PCT/EP2019/072814
Authority: WO
Inventors: Jacob STRÖM; Per Wennersten; Du LIU; Jack ENHORN
Original assignee: Telefonaktiebolaget Lm Ericsson (Publ)
Priority date: 2018-08-27
Filing date: 2019-08-27
Publication date: 2020-03-05

Abstract

A method for determining a filtered pixel value, where the method includes obtaining a first pixel value associated with a first pixel of a block of pixels; determining a first weight value (w1) based on a first value (V1), a second value (k), and a third value (m), wherein V1 is equal to, or calculated using, the absolute value of the difference between the first pixel value and a second pixel value associated with a second pixel of the block of pixels; and determining the filtered pixel value using the first pixel value and the first weight value (w1).

Description

FILTERING OF IMAGE DATA

TECHNICAL FIELD

[001] Disclosed are embodiments related to filtering of image data.

BACKGROUND

[002] Bilateral filtering of image data directly after forming a reconstructed image block can be beneficial for video compression. As described by Wennersten et al. (reference [1]), it is possible to reach a bit rate reduction of 0.5% with maintained visual quality for a complexity increase of 3% (encode) and 0% (decode) for random access. However, bilateral filtering involves a division, which can be expensive for hardware implementations. Therefore, Wennersten et al. implemented the bilateral filtering using a multiplication and a look-up table (LUT) of 576 bytes. Later, a division-table-free variant of the bilateral filter from [1] was proposed in reference [2].

[003] The filter weighting factors in a bilateral filter depend on the image data, so they need to be calculated on-the-fly or obtained from a LUT. In the implementation described in reference [1], 2202 bytes were needed for the LUT. Another 576 bytes were needed for the division table, yielding 2778 bytes in total for the solution described in [1]. The implementation proposed in reference [2] used a LUT of about 33000 values.

[004] As described by Ström et al. (reference [3]), it is possible to reduce the data stored in the LUT to 816 bytes by reusing rows in the LUT.

[005] Typically, each row of the LUT is associated with a quantization parameter (qp). An example row of the LUT for qp = 38 (denoted LUT38) may contain the following array of values:

LUT38 = [255, 255, 255, 254, 254, 253, 252, 251, 250, 249, 248, 246, 245, 243, 241,
239, 237, 235, 233, 230, 228, 225, 222, 219, 217, 214, 211, 207, 204, 201,
198, 194, 191, 187, 184, 180, 177, 173, 169, 166, 162, 158, 155, 151, 147,
144, 140, 136, 133, 129, 126, 122, 118, 115, 112, 108, 105, 102, 98, 95,
92, 89, 86, 83, 80, 77, 74, 71, 69, 66, 64, 61, 59, 56, 54,
52, 50, 47, 45, 43, 42, 40, 38, 36, 35, 33, 31, 30, 28, 27,
26, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 13, 12,
11, 11, 10, 9, 9, 8, 8, 7, 7, 6, 6, 6, 5, 5, 5,
4, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

[006] The above array is 150 items long, which means that, for each instantiation of the filter, 150 bytes must be stored for this qp value. Bilateral filtering is performed when the qp is 18 or larger, and if the maximum qp is 51 this means that 51-17 = 34 such arrays must be stored, which is a considerable amount of data.

SUMMARY

[007] Even with a division-free implementation, the bilateral filter is costly to implement for some forms of implementations, such as full custom ASIC implementations. In order to gain parallelism in such applications, a filter must typically be instantiated several times. This means that even a small look-up table of 2202 bytes, or even 816 bytes, may be costly in terms of silicon area if it is instantiated, say, seven times. It is therefore of interest to further reduce the complexity of the filter in terms of LUT size.

[008] Accordingly, embodiments described herein eliminate the need for the LUT completely by using a function (e.g., a linear or piecewise linear function) that approximates the LUT. This removes the need for storing the LUT, thereby simplifying the implementation. For instance, in one embodiment, the LUT is replaced with a clipped linear function of the form y = max(0, k × x + m), where k and m are predetermined values. In another embodiment, the LUT is replaced with a function of the form y = k × min(x, xc) + m, where xc is a predetermined value.

[009] Hence, instead of obtaining the weighting factor for an input value of x using the LUT, i.e.:

w = LUT38[x], (Eq. 1)

it is instead possible to calculate the weighting factor (w) for the value of x directly using:

w = max(0, k_38 × x + m_38), (Eq. 2)

where k_38 and m_38 are predetermined constants.

[0010] For the case of qp = 38, this means that instead of storing 150 values, only two values (k_38 and m_38) need to be stored. If 51 is the maximum qp value allowed, this means that only (51-17)*2 = 68 values need to be stored, instead of 816 bytes as described in reference [3]. Furthermore, in some embodiments, the m-value is always the same (for instance 255), which means that only the k-value needs to be stored for each qp. Thus only (51-17)*1 + 1 = 35 values need to be stored, a substantial reduction. If one byte is used to store every k- and m-value, only 35 bytes need to be stored. That is a reduction of (816-35)/816 = 96%. If the maximum allowed qp value is instead 63, only (63-17)*1 + 1 = 47 bytes need to be stored. This reduction will save significant silicon surface area in a full custom ASIC implementation, thus significantly saving costs.
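The storage counts in the preceding paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch only; the function name `stored_values` and its parameters are illustrative assumptions and not part of the disclosure.

```python
def stored_values(max_qp, min_qp=18, shared_m=True):
    """Count the constants needed when each qp row is replaced by a line.

    One row exists per qp from min_qp to max_qp (filtering starts at qp 18).
    With a shared m, one k per row plus a single m is stored;
    otherwise one (k, m) pair is stored per row.
    """
    rows = max_qp - min_qp + 1
    return rows + 1 if shared_m else 2 * rows


pairs_51 = stored_values(51, shared_m=False)   # (51-17)*2
shared_51 = stored_values(51)                  # (51-17)*1 + 1
shared_63 = stored_values(63)                  # (63-17)*1 + 1
reduction = (816 - shared_51) / 816            # relative to the 816-byte LUT of [3]

print(pairs_51, shared_51, shared_63, f"{reduction:.1%}")
```

With one byte per value, the shared-m variant for a maximum qp of 51 needs 35 bytes, which is roughly a 96% reduction relative to 816 bytes.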

[0011] Thus, in one aspect there is provided a method for determining a filtered sample value, such as a filtered intensity value or a filtered chroma value. As used herein, the term “pixel value” is used to mean a sample value, such as an intensity value or chroma value. The method includes obtaining a first pixel value associated with a first pixel of a block of pixels. The method also includes determining a first weight value, w1, based on a first value, V1, a second value, k, and a third value, m, wherein V1 is equal to, or calculated using, the absolute value of the difference between the first pixel value and a second pixel value associated with a second pixel of the block of pixels. The method further includes determining the filtered pixel value using the first pixel value and the first weight value, w1.

[0012] In another aspect there is provided a method for determining a filtered pixel value, I_F. This method includes obtaining a pixel value, I_c, associated with a first pixel of a block of pixels. The method also includes determining a first intermediate value, W1_m, based on a first value, m, and ΔI_A, where ΔI_A is equal to a difference between a second pixel value associated with a second pixel of the block of pixels and the pixel value for the first pixel of the block of pixels, and determining a second intermediate value, W1_k, based on a second value, k, and a third value, V1. The method further includes summing W1_m and W1_k, thereby producing a sum, W1, where W1 = W1_m + W1_k. The method also includes determining the filtered pixel value, I_F, using the pixel value for the first pixel, I_c, and W1, wherein determining W1_m comprises calculating (m × ΔI_A); determining W1_k comprises calculating (k × V1 × ΔI_A); and determining I_F using W1 and I_c comprises determining whether W1 is greater than 0.

[0013] Another advantage of the embodiments disclosed herein is that they allow the use of highly efficient single instruction, multiple data (SIMD) implementations on CPUs. SIMD operations allow the execution of several operations simultaneously on a modern CPU. As an example, if a normal machine code instruction can add two numbers to each other, a SIMD operation can add eight numbers to eight other numbers in parallel. This can improve performance considerably. There are SIMD operations for performing table look-ups. However, they need the entire LUT to be stored in a single SIMD register. Such registers are typically 128 bits in size. If 8-bit values are used, this means the largest number of items that such an operation can handle is 128/8 = 16 items. The LUT38 array described above has 150 items, and would therefore be too big to implement using a single SIMD operation on current hardware. In contrast, it is easy to execute the arithmetic operations used in, for example, Equation 2, using SIMD instructions.

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] The accompanying drawings, which are incorporated herein and form part of the specification, illustrate various embodiments.

[0015] FIG. 1 illustrates a system having an encoder and a decoder according to one embodiment.

[0016] FIG. 2 is a plot showing that a linear function can estimate the values of an array of a LUT.

[0017] FIG. 3 is a flow chart illustrating a process according to one embodiment.

[0018] FIG. 4 is a flow chart illustrating a process according to one embodiment.

[0019] FIG. 5 is a flow chart illustrating a process according to one embodiment.

[0020] FIG. 6 is a block diagram of an apparatus according to one embodiment.

DETAILED DESCRIPTION

[0021] FIG. 1 illustrates a system 100 having an encoder 102 for encoding image data and a decoder 104 for decoding image data, which may be communicatively connected to encoder 102 via a network 110 (e.g., the Internet or other network). The decoder 104 includes a filter 112 for filtering (e.g., bilateral filtering) of image data, such as, for example, intensity values.

[0022] Throughout this description we will use filtering of pixel intensity values as an example. This traditionally refers to the Y in YCbCr. It should be noted, however, that this filtering can also be used for other pixel values, such as the chroma values Cb and Cr, or any other components from other color spaces such as IC_TC_P, Lab, Y'u'v', etc. That is, in the following we describe the filtering of intensity values of pixels, of which the Y in YCbCr is an example. Another term often used by persons skilled in the art is samples. A sample can be an intensity value of a pixel, or it can be a chroma value such as Cb or Cr. In the following we will use the terms “pixel value”, “intensity” and “sample” interchangeably. We will use the filtering of intensity values as the example here, but it should be understood that it can likewise be applied to other sample values such as Cb and Cr, C_T and C_P, etc.

[0023] As described in [3], the filter 112 filters a pixel using the following Equation 3:

$$I_F = I_c + d(\sigma_d) \times \left[\rho_{qp}(\Delta I_A)\,\Delta I_A + \rho_{qp}(\Delta I_B)\,\Delta I_B + \rho_{qp}(\Delta I_L)\,\Delta I_L + \rho_{qp}(\Delta I_R)\,\Delta I_R\right], \quad \text{(Eq. 3)}$$

where I_F is the filtered pixel intensity and I_c is the center pixel intensity, i.e., the pixel intensity before filtering. The value ΔI_A is the difference between the intensity of the pixel above the center pixel (denoted I_A) and the center pixel intensity (denoted I_c), i.e., ΔI_A = I_A − I_c. Analogously, I_B, I_L and I_R are the intensities of the pixels immediately below, to the left of and to the right of the center pixel, respectively, and ΔI_B = I_B − I_c, ΔI_L = I_L − I_c, and ΔI_R = I_R − I_c. Furthermore,

$$d(\sigma_d) = \frac{[\ldots]}{256}\, e^{-\frac{1}{2\sigma_d^2}} \quad \text{(Eq. 4)}$$

is a one-dimensional look-up table of four values (since there are four possible values of σ_d) and

$$\rho_{qp}(\Delta I) = \frac{255}{256}\, e^{-\frac{|\Delta I|^2}{8(qp-17)^2}} = \frac{1}{256}\,\mathrm{LUT}(qp, \Delta I) \quad \text{(Eq. 5)}$$

is a two-dimensional look-up table of values, consisting of several LUT rows, one for each qp, as described above. As an example, the LUT row for qp = 38 is denoted LUT(38, ΔI), and is equal to the array LUT38 described above. Note that the values in LUT38 start at 255 rather than 255/256 since an 8-bit fixed-point representation is used.
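As an illustration of Equation 5, the sketch below generates one LUT row in 8-bit fixed point. It assumes the Gaussian form (255/256) × exp(−ΔI² / (8 × (qp − 17)²)) scaled by 256 and rounded to the nearest integer; the rounding rule and the function name `lut_row` are assumptions, and the table in the reference may have been generated differently.

```python
import math


def lut_row(qp, length=150):
    """Return an 8-bit row of range weights for one qp (assumed rounding)."""
    denom = 8 * (qp - 17) ** 2
    return [round(255 * math.exp(-(x * x) / denom)) for x in range(length)]


row38 = lut_row(38)
print(row38[:8])                      # first entries of the qp = 38 row
print(row38[34], row38[100], row38[149])
```

Under these assumptions the spot values at ΔI = 0, 34, 100 and 149 agree with the corresponding entries of LUT38 listed earlier (255, 184, 15 and 0).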

[0024] In FIG. 2, the values of LUT(38, ΔI) are plotted as a function of ΔI using crosses. As can be seen in FIG. 2, this curve can be approximated by two line segments: a first, downward-sloping line segment 202 and a second, horizontal line segment 204.

[0025] The first line segment 202 starts on the vertical axis at coordinate (ΔI = 0, ρ_38 = m_38), where m_38 is the intercept of the line, and ends when it cuts the x-axis at coordinate (ΔI = 100, ρ_38 = 0). The second line segment 204 starts at this point and follows the x-axis.

[0026] This piecewise linear curve, which we denote λ_qp(ΔI), can be calculated as

$$\lambda_{qp}(\Delta I) = \max(0,\; k_{qp} \times |\Delta I| + m_{qp}). \quad \text{(Eq. 6)}$$

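[0027] Thus, in one embodiment, filter 112 applies the following Equation 7 to determine I_F, where λ_qp is the piecewise linear approximation of Equation 6:

$$I_F = I_c + d(\sigma_d) \times \left[\lambda_{qp}(\Delta I_A)\,\Delta I_A + \lambda_{qp}(\Delta I_B)\,\Delta I_B + \lambda_{qp}(\Delta I_L)\,\Delta I_L + \lambda_{qp}(\Delta I_R)\,\Delta I_R\right]. \quad \text{(Eq. 7)}$$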

[0028] The values to use for k_qp and m_qp must be known both by the encoder 102 and the decoder 104. It is possible to transmit these values from the encoder to the decoder, or alternatively, to use fixed values that are known a priori to both the encoder and the decoder. If fixed values are used, it can be a good idea to select, for each qp, the values that reduce the difference between the approximation λ_qp(ΔI) and the function it is trying to approximate, i.e., ρ_qp(ΔI).

[0029] In one embodiment, the value of m_qp is equal to

$$\rho_{qp}(0) = \frac{255}{256}\, e^{-\frac{0}{8(qp-17)^2}} = \frac{255}{256}.$$

This means that it does not depend on qp: m_qp = m = 255/256. The values of k_qp for the different qps from 18 to 51 may be equal to:

k_qp = [-0.1992, -0.0996, -0.0711, -0.0524, -0.0415, -0.0343, -0.0302, -0.0262, -0.0232, -0.0208, -0.0192, -0.0175, -0.0161, -0.0149, -0.0140, -0.0131, -0.0123, -0.0116, -0.0111, -0.0105, -0.0100, -0.0095, -0.0091, -0.0087, -0.0084, -0.0080, -0.0078, -0.0075, -0.0072, -0.0070, -0.0068, -0.0066, -0.0063, -0.0061]

[0030] These k_qp values have been found by fixing m_qp to 255/256 and then finding the k_qp that minimizes the sum of the squared error between ρ_qp(ΔI) and λ_qp(ΔI).

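[0031] Note that in Equation 7, the value λ_qp is calculated using the delta value ΔI. However, as described in [2] and [3], for inter blocks it may be beneficial to instead calculate the value using an averaged version of the delta value, which we may call NL. Thus, in one embodiment, filter 112 applies the following formula to determine I_F:

$$I_F = I_c + d(\sigma_d) \times \left[\lambda_{qp}(NL_A)\,\Delta I_A + \lambda_{qp}(NL_B)\,\Delta I_B + \lambda_{qp}(NL_L)\,\Delta I_L + \lambda_{qp}(NL_R)\,\Delta I_R\right]. \quad \text{(Eq. 8)}$$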

[0032] How the values NL_A, NL_B, NL_L and NL_R can be calculated is described in [3]. For example, as described in [3], assuming the center pixel (i.e., the pixel to be filtered) is pixel a3,3, then:

NL_A = ( (abs(a12-a22) + abs(a13-a23) + abs(a14-a24) +
          abs(a22-a32) + abs(a23-a33) + abs(a24-a34) +
          abs(a32-a42) + abs(a33-a43) + abs(a34-a44)) * 114 ) >> 10;

NL_B = ( (abs(a52-a42) + abs(a53-a43) + abs(a54-a44) +
          abs(a42-a32) + abs(a43-a33) + abs(a44-a34) +
          abs(a32-a22) + abs(a33-a23) + abs(a34-a24)) * 114 ) >> 10;

NL_L = ( (abs(a21-a22) + abs(a31-a32) + abs(a41-a42) +
          abs(a22-a23) + abs(a32-a33) + abs(a42-a43) +
          abs(a23-a24) + abs(a33-a34) + abs(a43-a44)) * 114 ) >> 10; and

NL_R = ( (abs(a23-a22) + abs(a24-a23) + abs(a25-a24) +
          abs(a33-a32) + abs(a34-a33) + abs(a35-a34) +
          abs(a43-a42) + abs(a44-a43) + abs(a45-a44)) * 114 ) >> 10,

where the samples (intensity values) are arranged according to the table below:

a11  a12  a13  a14  a15
a21  a22  a23  a24  a25
a31  a32  a33  a34  a35
a41  a42  a43  a44  a45
a51  a52  a53  a54  a55

[0033] As shown above, each NL value is calculated using the absolute value of the difference between a first pixel value associated with a first pixel of the block of pixels (e.g., pixel a3,3) and a second pixel value associated with a second pixel of the block of pixels (e.g., pixel a2,3).
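The averaging described above can be sketched as follows. This is an illustrative sketch, not the reference implementation: it assumes each NL value sums the nine same-direction absolute differences in the 3×3 neighbourhood of the corresponding delta and scales the sum by 114/1024 (approximately 1/9). The function name `nl_values` and the 0-based indexing (so that `a[2][2]` is pixel a3,3) are assumptions.

```python
def nl_values(a):
    """Compute NL_A, NL_B, NL_L, NL_R for the centre of a 5x5 block.

    a is a list of five rows of five integer samples; a[2][2] is the centre.
    """
    def vertical(row_pairs):
        # Differences between vertically adjacent samples in columns 1..3.
        total = sum(abs(a[r1][c] - a[r2][c])
                    for r1, r2 in row_pairs for c in (1, 2, 3))
        return (total * 114) >> 10

    def horizontal(col_pairs):
        # Differences between horizontally adjacent samples in rows 1..3.
        total = sum(abs(a[r][c1] - a[r][c2])
                    for c1, c2 in col_pairs for r in (1, 2, 3))
        return (total * 114) >> 10

    return {
        "A": vertical([(0, 1), (1, 2), (2, 3)]),
        "B": vertical([(4, 3), (3, 2), (2, 1)]),
        "L": horizontal([(0, 1), (1, 2), (2, 3)]),
        "R": horizontal([(2, 1), (3, 2), (4, 3)]),
    }


# A vertical ramp: every row is constant, rows differ by 10.
ramp = [[10 * r] * 5 for r in range(5)]
print(nl_values(ramp))
```

For the vertical ramp, the nine vertical differences are each 10, so NL_A and NL_B evaluate to (90 × 114) >> 10 = 10, while NL_L and NL_R are 0.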

[0034] However, for intra blocks, Equation 7 is used. Note that in this case λ_qp is always multiplied by the delta value ΔI. As an example, the first term in Equation 7 is

$$\lambda_{qp}(\Delta I_A)\,\Delta I_A, \quad \text{(Eq. 9)}$$

but this is equal to

$$\max(0,\; k_{qp} \times |\Delta I_A| + m_{qp})\,\Delta I_A. \quad \text{(Eq. 10)}$$

[0035] Assume that ΔI_A is positive and that k_qp × ΔI_A + m_qp > 0. Then this becomes

$$(k_{qp}\,\Delta I_A + m_{qp})\,\Delta I_A = k_{qp}\,\Delta I_A^2 + m_{qp}\,\Delta I_A. \quad \text{(Eq. 11)}$$

[0036] Since this is the value that is going to be added in Equation 7, it may make sense to minimize the difference between this value and the first term of Equation 3, ρ_qp(ΔI_A) × ΔI_A, that it is trying to approximate. This may lead to slightly different values of k_qp than the ones stated above.

[0037] In yet another embodiment of the present invention, the value of k_qp can be transmitted to the decoder. As an example, a video frame containing graphics (such as subtitles or rendered 3D graphics) typically benefits from strong filtering. In that case, a k_qp can be signaled that gives the line 202 a less steep slope. This means that larger values of ΔI will contribute to the filtering, giving a stronger filtering effect. Alternatively, if very smooth content is compressed, a k_qp that results in a steeper slope can be signaled, which will result in less filtering.

[0038] In both these cases, the decoder 104 will obtain the signaled k_qp values and filter accordingly.

[0039] In another embodiment, the linear function is not calculated according to Equation 6. Instead, one keeps track of the value of |ΔI| at which the line crosses the x-axis. We can call this value xc_qp. As we have seen above, for qp = 38, this crossing happens at xc_qp = |ΔI| = 100. Note that when |ΔI| = xc_qp, the resulting value is zero (this is the definition of crossing the x-axis). Hence k_qp × xc_qp + m_qp = 0. We can therefore instead write Equation 6 as:

$$\lambda_{qp}(\Delta I) = k_{qp} \times \min(|\Delta I|,\; xc_{qp}) + m_{qp}. \quad \text{(Eq. 12)}$$

[0040] For some implementations, Equation 12 may be easier to implement than Equation 6. However, note that both equations will give essentially the same result (given limited arithmetic resolution it may not be exactly the same). This is due to the fact that if |ΔI| > xc_qp then

$$k_{qp} \times \min(|\Delta I|,\; xc_{qp}) + m_{qp} = k_{qp} \times xc_{qp} + m_{qp} = 0.$$

[0041] Values for xc_qp for qp = 18 to qp = 51 (34 values, one per qp) may be selected as:

xc_qp = [5, 10, 14, 19, 24, 29, 33, 38, 43, 48, 52, 57, 62, 67, 71, 76, 81, 86, 90, 95, 100, 105, 109, 114, 119, 124, 128, 133, 138, 143, 147, 152, 157, 162]
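The relationship between the slope form and the crossing-point form can be illustrated with a short sketch. It uses m = 255/256 and a few of the k_qp values listed earlier; the function names are illustrative assumptions. Because the listed k_qp values are rounded to four decimals, −m/k reproduces the listed xc_qp values only approximately for some qps.

```python
M = 255 / 256
# Subset of the listed k_qp values (qp -> slope).
K_QP = {18: -0.1992, 19: -0.0996, 20: -0.0711, 38: -0.0100}


def crossing(k, m=M):
    """x-axis crossing of the line k*x + m, i.e. xc = -m/k."""
    return -m / k


def weight_clip(delta, k, m=M):
    """Clipped form: max(0, k*|delta| + m)."""
    return max(0.0, k * abs(delta) + m)


def weight_min(delta, k, m=M):
    """Crossing-point form: k*min(|delta|, xc) + m."""
    return k * min(abs(delta), crossing(k, m)) + m


for qp, k in K_QP.items():
    print(qp, round(crossing(k)))
```

Both forms agree up to floating-point resolution, which mirrors the remark that limited arithmetic resolution may make the results differ slightly.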

[0042] Preliminary results show that the efficiency of the filter goes down only slightly, from a savings of 0.81% to a savings of 0.75%, for a random access configuration.

[0043] Because a subsequent block may predict from the right-most column of filtered pixels in the current block, it may be desirable to be able to filter this column of pixels as soon as possible. In detail, one may want to reduce the latency defined as the time from where the unfiltered column of pixels is available until the time when one or all of the pixels in the column of pixels are filtered. (It should be noted that the same argument may apply mutatis mutandis for a row of pixels or any subset of the pixels in the block.)

[0044] Reducing latency can be achieved by reducing the number of dependent operations, especially costly operations such as multiplications. As an example, say a value a should be multiplied by another value b such that output = a * b, but b is in turn the result of a multiplication of c and d. In such a case, we first have to wait for the first multiplication b = c * d before we can start the second multiplication output = a * b. If a multiplication takes 3 cycles, we must in this case wait 6 cycles for the output. Thus it is desirable to reduce the number of dependent consecutive operations in the calculation, especially heavy operations such as multiplications that may have a latency of several clock cycles.

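[0045] Therefore, in one embodiment, we propose moving the multiplication by d(σ_d) into the calculation that replaces the LUT, and combining it with the constants k_qp and m_qp. In detail, Equation 8 calculates the filtered pixel as

$$I_F = I_c + d(\sigma_d) \times \left[\lambda_{qp}(NL_A)\,\Delta I_A + \lambda_{qp}(NL_B)\,\Delta I_B + \lambda_{qp}(NL_L)\,\Delta I_L + \lambda_{qp}(NL_R)\,\Delta I_R\right].$$

[0046] For clarity, we will only show the math for the first two terms of the factor in brackets, even though the same will apply mutatis mutandis to the last two terms as well:

$$I_F = I_c + d(\sigma_d) \times \left[\lambda_{qp}(NL_A)\,\Delta I_A + \lambda_{qp}(NL_B)\,\Delta I_B + \cdots\right].$$

[0047] Using Equation 6, we can now rewrite this as

$$I_F = I_c + d(\sigma_d) \times \left[\max(0,\; k_{qp} \times |NL_A| + m_{qp}) \times \Delta I_A + \max(0,\; k_{qp} \times |NL_B| + m_{qp}) \times \Delta I_B + \cdots\right]. \quad \text{(Eq. 14)}$$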

[0048] Note that in order to calculate I_F using Equation 14, we need to calculate three dependent consecutive multiplications. First, we need to calculate k_qp × |NL_A|. Then, after having added m_qp and taken the max value, we multiply by ΔI_A. After combining with the other terms, we can finally multiply by d(σ_d). If every multiplication takes 3 cycles of latency, the multiplications alone will give 9 cycles of latency. (Note that the second term involving ΔI_B does not depend on the first term and can therefore be calculated in parallel.)

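[0049] Note here that both d(σ_d) and k_qp are constants that do not change over the course of a block. Hence we can move the value d(σ_d) inside the brackets:

$$I_F = I_c + \left[\max(0,\; d(\sigma_d) \times k_{qp} \times |NL_A| + d(\sigma_d) \times m_{qp}) \times \Delta I_A + \max(0,\; d(\sigma_d) \times k_{qp} \times |NL_B| + d(\sigma_d) \times m_{qp}) \times \Delta I_B + \cdots\right]. \quad \text{(Eq. 15)}$$

[0050] It is allowed to move d(σ_d) inside the max() function since it is always positive. We can now create a new constant:

$$k_{qp}(\sigma_d) = d(\sigma_d) \times k_{qp}. \quad \text{(Eq. 16)}$$

[0051] This constant can be calculated once per block, or may be stored in precalculated form. As described in reference [3], there are only 18 possible values of d(σ_d), and only 34 or 46 possible values of k_qp, meaning that a total of at most 18 × 46 = 828 values need to be stored. In some embodiments, the number of different d(σ_d) values can be further reduced, reducing the storage needed for k_qp(σ_d). Likewise we use

$$m_{qp}(\sigma_d) = d(\sigma_d) \times m_{qp}, \quad \text{(Eq. 17)}$$

and these can also be precalculated and stored. Since one embodiment is to have m = 255 for all qps, m_qp(σ_d) may not depend on qp and thus only 18 different values need to be stored. In this case we can write it using the shorthand m(σ_d). Using Equations 16 and 17 it is possible to write Equation 15 as

$$I_F = I_c + \left[\max(0,\; k_{qp}(\sigma_d) \times |NL_A| + m(\sigma_d)) \times \Delta I_A + \max(0,\; k_{qp}(\sigma_d) \times |NL_B| + m(\sigma_d)) \times \Delta I_B + \cdots\right]. \quad \text{(Eq. 18)}$$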

[0052] As we can see in this formula, the number of dependent multiplications has now been reduced. Assuming that k_qp(σ_d) (also denoted k for short) and m(σ_d) (also denoted m for short) are precalculated and do not contribute to the latency, we only have two consecutive dependent multiplications. First, we multiply k_qp(σ_d) by |NL_A|, and after adding m(σ_d) and taking the max, we multiply by ΔI_A. No more multiplications are needed, and hence we have reduced the number of dependent multiplications from three to two. If we assume three cycles of latency per multiplication, this means there will be six cycles of latency from the multiplications instead of nine, a substantial reduction.
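The folding of d(σ_d) into the constants can be checked numerically. The sketch below compares the three-multiplication form with the folded two-multiplication form using exact rational arithmetic; the concrete values of k_qp, m_qp and d, and the function names, are illustrative assumptions only.

```python
from fractions import Fraction


def filter_unfolded(ic, deltas, nls, k_qp, m_qp, d):
    """ic + d * sum(max(0, k_qp*|NL| + m_qp) * delta): d is applied last."""
    acc = sum(max(0, k_qp * abs(nl) + m_qp) * delta
              for delta, nl in zip(deltas, nls))
    return ic + d * acc


def filter_folded(ic, deltas, nls, k, m):
    """ic + sum(max(0, k*|NL| + m) * delta) with k = d*k_qp and m = d*m_qp."""
    return ic + sum(max(0, k * abs(nl) + m) * delta
                    for delta, nl in zip(deltas, nls))


k_qp = Fraction(-1, 100)      # hypothetical slope
m_qp = Fraction(255, 256)
d = Fraction(1, 8)            # hypothetical positive spatial weight

deltas = [4, -2, 6, 0]        # delta values for A, B, L, R
nls = [3, 5, 120, 1]          # averaged deltas; 120 lies beyond the crossing point

a = filter_unfolded(100, deltas, nls, k_qp, m_qp, d)
b = filter_folded(100, deltas, nls, d * k_qp, d * m_qp)
print(a == b)
```

The two results are identical because d is positive, so multiplying by d commutes with the max(0, ·) operation. In fixed-point arithmetic the results may differ slightly due to rounding of the precalculated constants.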

[0053] In fact, latency can be further reduced by rearranging Equation 18 yet more. In some embodiments, it may take more clock cycles to calculate |NL_A| than to calculate ΔI_A. This is because NL_A can depend on nine intensity values whereas ΔI_A only depends on two: ΔI_A = I_A − I_c. Therefore, it may be better to start by multiplying k_qp(σ_d) (which is readily available since it is precalculated) by ΔI_A instead of multiplying it by |NL_A|. This is indeed possible. If ΔI_A is positive, one can move it inside the max operation:

$$I_F = I_c + \left[\max(0,\; k \times |NL_A| \times \Delta I_A + m \times \Delta I_A) + \cdots\right],$$

where we have used the shorthand notation k = k_qp(σ_d) and m = m(σ_d). Because the order of multiplications does not matter, it is now possible to instead calculate

$$I_F = I_c + \left[\max(0,\; k \times \Delta I_A \times |NL_A| + m \times \Delta I_A) + \cdots\right],$$

and the multiplication k × ΔI_A can be calculated in parallel with the computation of |NL_A|. This comes at the cost of an extra multiplication, m × ΔI_A, but if latency is more important than silicon surface area this may be a desirable trade-off.

[0054] It should be noted that if ΔI_A is negative, it is possible to set it positive, carry out the calculations and then negate the result of the max operation afterwards. Hence a negative ΔI_A is not an obstacle to this way of rearranging multiplications in the decoder or encoder. That is, Equation 19 can be written as:

$$I_F = I_c + W_A + W_B + W_L + W_R,$$

where

$$W_A = a \times \max(0,\; k \times |\Delta I_A| \times |NL_A| + m \times |\Delta I_A|),$$

$$W_B = b \times \max(0,\; k \times |\Delta I_B| \times |NL_B| + m \times |\Delta I_B|),$$

$$W_L = c \times \max(0,\; k \times |\Delta I_L| \times |NL_L| + m \times |\Delta I_L|),$$

$$W_R = d \times \max(0,\; k \times |\Delta I_R| \times |NL_R| + m \times |\Delta I_R|),$$

a = −1 if ΔI_A is negative, otherwise a = 1,

b = −1 if ΔI_B is negative, otherwise b = 1,

c = −1 if ΔI_L is negative, otherwise c = 1, and

d = −1 if ΔI_R is negative, otherwise d = 1.
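The sign handling described above can be sketched as follows. The example compares one term computed directly with the same term computed by taking the magnitude of the delta, multiplying early, and restoring the sign afterwards. Exact rational arithmetic is used so that the comparison is not affected by rounding; the constants and function names are illustrative assumptions.

```python
from fractions import Fraction


def term_direct(delta, nl, k, m):
    """max(0, k*|NL| + m) * delta: the delta multiplication comes last."""
    return max(0, k * abs(nl) + m) * delta


def term_rearranged(delta, nl, k, m):
    """sign * max(0, (k*|delta|)*|NL| + m*|delta|).

    k*|delta| and m*|delta| do not depend on NL, so they can be
    computed while |NL| is still being evaluated.
    """
    sign = -1 if delta < 0 else 1
    magnitude = abs(delta)
    k_delta = k * magnitude
    m_delta = m * magnitude
    return sign * max(0, k_delta * abs(nl) + m_delta)


k = Fraction(-1, 800)         # hypothetical folded slope
m = Fraction(255, 2048)       # hypothetical folded intercept

for delta in (-7, -1, 0, 3, 12):
    for nl in (0, 5, 60, 200):
        assert term_direct(delta, nl, k, m) == term_rearranged(delta, nl, k, m)
print("all terms match")
```

The equality holds because |ΔI| is non-negative and can therefore be moved inside the max operation, while the sign is applied outside it.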

[0055] One design option is how many bits should be used for k. A large number of bits will give high accuracy, but at the same time it will make the multiplications expensive, and pre-storing k will take up space. However, it should be noted that high accuracy is mostly needed when |NL_A| is big. This is because when |NL_A| is small, a possible quantization error will be multiplied by a small value of |NL_A|. To see this, note that k_quant = k + e, and after multiplication by |NL_A| we will get |NL_A|(k + e), so the error, i.e., the deviation from the correct value |NL_A|k, will be equal to |NL_A|e.

[0056] One can therefore store the value of k as k = −k_m × 2^p. Here k_m can be of reduced resolution, for instance 4 bits. For large magnitudes of k, such as k = −60, we will have a limited fractional resolution. In the example of 4 bits, we must use p = 2 to be able to represent k = −15 × 2² = −60. In this case we have no fractional bits at all, and the quantization error e = 0.5. However, since the magnitude of k is so large, the negative slope must also be steep. Hence the function will reach zero long before |NL_A| becomes big enough to cause a noticeable error.

[0057] Conversely, if we have a small magnitude of k, such as k = −1.12, we will have great fractional resolution since we can use a small p: k = −9 × 2^−3 = −1.125. This greater precision is also needed since such a small magnitude is associated with a shallow slope k. That shallow slope will hit the x-axis at a much higher |NL_A| value, and if the precision were not high the error would be large.

[0058] In practice this means that we can use a much lower number of bits for k_m than if we represented k directly. As an example, perhaps 4 or 8 bits need to be used for k_m, whereas perhaps up to 22 bits would be needed if k were represented directly. This would mean that the multiplication k × |NL_A| (or alternatively, k × ΔI_A) would be calculated as k_m × |NL_A| followed by a bitshift of p. A bitshift is a far easier operation than a multiplication. Also, the multiplication would then be 8 bits times 8 bits (assuming |NL_A| < 255) instead of a complex 22 bits times 8 bits.
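The mantissa-and-shift representation can be sketched as follows. The search strategy for p (the smallest exponent for which the rounded mantissa still fits in the available bits) and the function names are assumptions made for illustration; an actual implementation might choose p differently and would replace the power-of-two scaling with a bitshift.

```python
def quantize_slope(k, mantissa_bits=4):
    """Represent a negative slope k as -(k_m * 2**p) with a small integer k_m.

    Returns (k_m, p), where p is the smallest exponent for which the
    rounded mantissa still fits in mantissa_bits bits.
    """
    if k >= 0:
        raise ValueError("this sketch assumes a negative slope")
    max_m = (1 << mantissa_bits) - 1
    magnitude = -k
    p = 0
    while round(magnitude / 2.0 ** p) > max_m:          # mantissa too large
        p += 1
    while round(magnitude / 2.0 ** (p - 1)) <= max_m:   # room for more precision
        p -= 1
    return round(magnitude / 2.0 ** p), p


def dequantize(k_m, p):
    """Value actually represented by the pair (k_m, p)."""
    return -k_m * 2.0 ** p


def scaled_product(k_m, p, nl):
    """k * |NL| as a small integer multiplication followed by scaling by 2**p."""
    return -(k_m * abs(nl)) * 2.0 ** p


print(quantize_slope(-60))     # large magnitude: positive shift, no fractional bits
print(quantize_slope(-1.12))   # small magnitude: negative shift, fine resolution
```

With a 4-bit mantissa, k = −60 is stored as k_m = 15 with p = 2, and k = −1.12 is stored as k_m = 9 with p = −3, i.e., as −1.125, matching the examples in the text.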

[0059] FIG. 3 is a flow chart illustrating a process 300 according to some embodiments. Process 300 may begin in step s302.

[0060] In step s302, decoder 104 obtains a pixel value (e.g., an intensity value, a chroma value, etc.) for a first pixel (e.g., a center pixel).

[0061] In step s304, decoder 104 determines a first weight value (w1) based on the values k, m and V1, where k and m are predetermined values, and i) V1 is equal to abs(ΔI1), which is the absolute value of the difference between the pixel value for the first pixel (e.g., the center pixel) and a second pixel value associated with a second pixel (e.g., the pixel above the center pixel) (i.e., ΔI1 may be one of ΔI_A, ΔI_B, ΔI_L, and ΔI_R) or ii) V1 is equal to abs(NL1), where NL1 is an averaged version of the delta value ΔI1 (e.g., NL1 may be one of NL_A, NL_B, NL_L, and NL_R). As used herein, a “weight value” is a value that is used in determining a filtered pixel value. In some embodiments, the weight value is a factor, but in other embodiments the weight value is not a factor.

[0062] In step s306, decoder 104 determines a filtered pixel value (I_F) using the pixel value for the first pixel (I_c) and the first weight value (w1).

[0063] In some embodiments, determining w1 comprises calculating w1 = max(0, k × V1 + m).

[0064] In some embodiments, determining w1 comprises calculating w1 = k × min(V1, xc) + m, where xc = −m/k.

[0065] In some embodiments, determining w1 comprises determining whether V1 is greater than or equal to xc and setting w1 equal to 0 as a result of determining that V1 (e.g., |ΔI1|) is greater than or equal to xc.

[0066] In some embodiments, decoder 104 determines I_F using I_c, w1, w2, w3, and w4, wherein, for i = 1, 2, 3 and 4, wi = 0 if Vi is greater than or equal to xc, otherwise wi = k × Vi + m, where V1 = abs(ΔI_A) or abs(NL_A); V2 = abs(ΔI_B) or abs(NL_B); V3 = abs(ΔI_L) or abs(NL_L); and V4 = abs(ΔI_R) or abs(NL_R). In such an embodiment, decoder 104 may determine I_F by calculating:

$$I_F = I_c + d(\sigma_d) \times \left[w1 \times \Delta I_A + w2 \times \Delta I_B + w3 \times \Delta I_L + w4 \times \Delta I_R\right].$$
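Process 300 can be sketched end to end for a single pixel. The sketch below uses the delta values themselves as V1 to V4 and follows the clipping behaviour of the linear function, i.e., a weight becomes zero once V reaches the crossing point xc = −m/k. The constants and function names are illustrative assumptions, not values from the disclosure.

```python
def weight(v, k, m):
    """Branch form of the weight: 0 once v reaches xc = -m/k, else k*v + m."""
    xc = -m / k
    return 0.0 if v >= xc else k * v + m


def filter_pixel(ic, ia, ib, il, ir, k, m, d):
    """Filter the centre sample ic using its above/below/left/right neighbours."""
    deltas = [ia - ic, ib - ic, il - ic, ir - ic]
    weights = [weight(abs(delta), k, m) for delta in deltas]
    return ic + d * sum(w * delta for w, delta in zip(weights, deltas))


K, M, D = -0.01, 1.0, 0.25     # hypothetical slope, intercept and spatial weight

# One slightly brighter neighbour pulls the centre value upwards.
print(filter_pixel(100, 110, 100, 100, 100, K, M, D))

# A neighbour far beyond the crossing point (delta = 200) gets weight 0
# and therefore does not influence the result.
print(filter_pixel(100, 110, 100, 100, 300, K, M, D))
```

This illustrates the edge-preserving property of the filter: neighbours whose values differ strongly from the center pixel contribute nothing to the filtered value.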

[0067] FIG. 4 is a flow chart illustrating a process 400 according to some embodiments. Process 400 may begin in step s402.

[0068] In step s402, a pixel value is obtained (e.g., an intensity value, a chroma value, etc.) for a first pixel (e.g., a center pixel).

[0069] In step s404, the following values are obtained: a first value (k_qp), a second value (m_qp), and a third value (d(σ_d)).

[0070] In step s406, the value k is calculated as k = d(σ_d) × k_qp.

[0071] In step s408, the value m is calculated as m = d(σ_d) × m_qp.

[0072] In step s410, a first weight value (w1) is determined, wherein w1 is determined based on the values k, m and V1, where i) V1 is equal to abs(ΔI1), which is the absolute value of the difference between the pixel value for the first pixel (e.g., the center pixel) and a second pixel value associated with a second pixel in the block of pixels (e.g., the pixel above the center pixel) (i.e., ΔI1 may be one of ΔI_A, ΔI_B, ΔI_L, and ΔI_R) or ii) V1 is equal to abs(NL1), which is an averaged version of abs(ΔI1).

[0073] In step s412, the filtered pixel value (I_F) is determined using the pixel value for the first pixel (I_c) and the first weight value (w1).

[0074] In some embodiments, determining w1 comprises calculating an intermediate value l = k × V1 + m. In such an embodiment, determining w1 may further comprise calculating w1 = max(0, l).

[0075] In some embodiments, determining w1 comprises calculating w1 = k × min(V1, xc) + m, where xc = −m/k.

[0076] In some embodiments, determining w1 comprises determining whether V1 is greater than or equal to xc and setting w1 equal to 0 as a result of determining that V1 (e.g., |ΔI1|) is greater than or equal to xc.

[0077] In some embodiments, decoder 104 determines I_F using I_c, w1, w2, w3, and w4, wherein, for i = 1, 2, 3 and 4, wi = 0 if Vi is greater than or equal to xc, otherwise wi = k × Vi + m, where V1 = abs(ΔI_A) or abs(NL_A); V2 = abs(ΔI_B) or abs(NL_B); V3 = abs(ΔI_L) or abs(NL_L); and V4 = abs(ΔI_R) or abs(NL_R). In such an embodiment, decoder 104 may determine I_F by calculating:

$$I_F = I_c + d(\sigma_d) \times \left[w1 \times \Delta I_A + w2 \times \Delta I_B + w3 \times \Delta I_L + w4 \times \Delta I_R\right].$$

[0078] FIG. 5 is a flow chart illustrating a process 500 according to some embodiments. Process 500 may begin in step s502.

[0079] In step s502, a pixel value is obtained (e.g., an intensity value, a chroma value, etc.) for a first pixel (e.g., a center pixel).

[0080] In step s504, the following values are obtained: a first value (k_qp), a second value (m_qp), and a third value (d(σ_d)).

[0081] In step s506, the value k is calculated as k = d(σ_d) × k_qp.

[0082] In step s508, the value m is calculated as m = d(σ_d) × m_qp.

[0083] In step s510, a first intermediate value W1_m is calculated as W1_m = m × ΔI1, wherein ΔI1 is the value of a difference between the pixel value for the first pixel (e.g., the center pixel) and a second pixel value associated with a second pixel in the block of pixels (e.g., the pixel above the center pixel; e.g., ΔI1 may be one of ΔI_A, ΔI_B, ΔI_L, and ΔI_R).

[0084] In step s512, a second intermediate value W1_k is calculated as W1_k = k × ΔI1 × V1.

[0085] In step s514, the weight value W1 is calculated as W1 = W1_m + W1_k.

[0086] In step s516, the filtered pixel value (I_F) is determined using the pixel value for the first pixel (I_c) and W1, wherein determining I_F using W1 and I_c comprises determining whether W1 is greater than 0.

[0087] In some embodiments, process 500 further includes the steps of:

calculating W2_m = m × ΔI2, wherein ΔI2 is a difference between the pixel value for the first pixel and a third pixel value associated with a third pixel in the block of pixels;

calculating W2_k = k × ΔI2 × V2;

after calculating W2_m and W2_k, calculating a second weight value W2 = W2_m + W2_k;

calculating W3_m = m × ΔI3, wherein ΔI3 is a difference between the pixel value for the first pixel and a fourth pixel value associated with a fourth pixel in the block of pixels;

calculating W3_k = k × ΔI3 × V3;

after calculating W3_m and W3_k, calculating W3 = W3_m + W3_k;

calculating W4_m = m × ΔI4, wherein ΔI4 is a difference between the first pixel value and a fifth pixel value associated with a fifth pixel in the block of pixels;

calculating W4_k = k × ΔI4 × V4;

after calculating W4_m and W4_k, calculating W4 = W4_m + W4_k; and

determining I_F using I_c and the weight values W1, W2, W3 and W4, wherein determining I_F using I_c, W1, W2, W3 and W4 comprises calculating: I_F = I_c + max(0, W1) + max(0, W2) + max(0, W3) + max(0, W4).

[0088] FIG. 6 is a block diagram of an apparatus 600, according to some embodiments, for performing methods disclosed herein. That is, for example, apparatus 600 can be used to implement decoder 104. As shown in FIG. 6, apparatus 600 may comprise: processing circuitry (PC) 602, which may include one or more processors (P) 655 (e.g., a general purpose microprocessor and/or one or more other processors, such as an application specific integrated circuit (ASIC), field-programmable gate arrays (FPGAs), and the like), which processors may be co-located or distributed in different locations; circuitry 603 (e.g., radio transceiver circuitry comprising an Rx 605 and a Tx 606) coupled to network 110; and a local storage unit (a.k.a., “data storage system”) 608, which may include one or more non-volatile storage devices and/or one or more volatile storage devices. In embodiments where PC 602 includes a programmable processor, a computer program product (CPP) 641 may be provided. CPP 641 includes a computer readable medium (CRM) 642 storing a computer program (CP) 643 comprising computer readable instructions (CRI) 644. CRM 642 may be a non-transitory computer readable medium, such as magnetic media (e.g., a hard disk), optical media, memory devices (e.g., random access memory, flash memory), and the like. In some embodiments, the CRI 644 of computer program 643 is configured such that when executed by PC 602, the CRI causes apparatus 600 to perform steps described herein (e.g., steps described herein with reference to the flow charts). In other embodiments, apparatus 600 may be configured to perform steps described herein without the need for code. That is, for example, PC 602 may consist merely of one or more ASICs. Hence, the features of the embodiments described herein may be implemented in hardware and/or software.

[0089] Embodiments:

[0090] A1. A method for determining a filtered pixel value (I_F), the method comprising: obtaining a pixel value (e.g., an intensity value, a chroma value, etc.) for a first pixel (e.g., a center pixel); determining a first weight value (w1) based on the values k, m and V1, where k and m are predetermined values, and i) V1 is equal to abs(ΔI1), which is the absolute value of the difference between the pixel value for the first pixel (e.g., the center pixel) and a second pixel value associated with a second pixel (e.g., the pixel above the center pixel) (i.e., ΔI1 may be one of ΔI_A, ΔI_B, ΔI_L, and ΔI_R) or ii) V1 is equal to abs(NL1), which is an average version of abs(ΔI1); and determining the filtered pixel value (I_F) using the pixel value for the first pixel (I_C) and the first weight value (w1).

[0091] A2. The method of embodiment A1, wherein determining w1 comprises calculating w1 = max(0, k × V1 + m).

[0092] A3. The method of embodiment A1, wherein determining w1 comprises calculating w1 = k × min(V1, x_c) + m, where x_c = -m/k.

[0093] A4. The method of embodiment A1, wherein determining w1 comprises:

determining whether V1 is less than or equal to x_c, where x_c = -m/k; and setting w1 equal to 0 as a result of determining that V1 (e.g., |ΔI1|) is greater than or equal to x_c, otherwise setting w1 equal to k × V1 + m.

[0094] A5. The method of any one of embodiments A1-A4, wherein determining I_F comprises determining I_F using: I_C, w1, w2, w3, and w4, wherein, for i = 1, 2, 3 and 4, wi = 0 if Vi is greater than or equal to x_c, otherwise wi = k × Vi + m, where V1 = abs(ΔI_A) or abs(NL_A); V2 = abs(ΔI_B) or abs(NL_B); V3 = abs(ΔI_L) or abs(NL_L); and V4 = abs(ΔI_R) or abs(NL_R).

[0095] A6. The method of embodiment A5, wherein determining I_F comprises calculating: I_F = I_C + d(σ_d) × [w1 × ΔI_A + w2 × ΔI_B + w3 × ΔI_L + w4 × ΔI_R].
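Embodiments A2 to A4 describe what appears to be the same clipped linear weight computed in three ways, and embodiment A6 combines four such weights. The Python sketch below illustrates this under the assumption that k is negative and m is positive, so that x_c = -m/k is positive and k × V + m falls to zero at V = x_c. Integer k and m, the function names, and the sign convention ΔI = neighbour value minus center value are likewise assumptions made for the example.

```python
from fractions import Fraction


def weight_max(v, k, m):
    """Clipped form: w = max(0, k*V + m)."""
    return max(0, k * v + m)


def weight_min(v, k, m):
    """Clamped-input form: w = k*min(V, x_c) + m with x_c = -m/k."""
    x_c = Fraction(-m, k)  # exact crossing point; assumes integer k and m
    return k * min(v, x_c) + m


def weight_compare(v, k, m):
    """Comparison form: w = 0 once V reaches x_c, otherwise k*V + m."""
    x_c = Fraction(-m, k)
    return 0 if v >= x_c else k * v + m


def filtered_pixel(i_c, neighbours, d_sigma, k, m):
    """I_F = I_C + d(sigma_d) * sum of w_i * delta_i over the neighbours."""
    total = 0
    for i_n in neighbours:
        delta = i_n - i_c                       # assumed sign convention
        total += weight_max(abs(delta), k, m) * delta
    return i_c + d_sigma * total
```

With, for instance, the hypothetical values k = -1 and m = 8, the three weight functions agree for every non-negative integer V, which is why a hardware or software implementation could pick whichever form is cheapest.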

[0096] B1. A method for determining a filtered pixel value (I_F), the method comprising: obtaining a pixel value (e.g., an intensity value, a chroma value, etc.) for a first pixel (e.g., a center pixel) of a block of pixels; obtaining a first value (k_qp); obtaining a second value (m_qp); obtaining a third value (d(σ_d)); calculating k = d(σ_d) × k_qp; calculating m = d(σ_d) × m_qp;

determining a first weight value (w1) based on the values k, m and V1, where i) V1 is equal to abs(ΔI1), which is the absolute value of the difference between the pixel value for the first pixel (e.g., the center pixel) and a second pixel value associated with a second pixel in the block of pixels (e.g., the pixel above the center pixel) (i.e., ΔI1 may be one of ΔI_A, ΔI_B, ΔI_L, and ΔI_R) or ii) V1 is equal to abs(NL1), which is an average version of abs(ΔI1); and determining the filtered pixel value (I_F) using the pixel value for the first pixel (I_C) and the first weight value (w1).

[0097] B2. The method of embodiment B1, wherein determining w1 comprises calculating I = k × V1 + m.

[0098] B3. The method of embodiment B2, wherein determining w1 further comprises calculating w1 = max(0, I).

[0099] B4. The method of embodiment B1, wherein determining w1 comprises calculating w1 = k × min(V1, x_c) + m, where x_c = -m/k.

[00100] B5. The method of embodiment B1, wherein determining w1 comprises:

[00101] B6. The method of any one of embodiments B1-B4, wherein determining I_F comprises determining I_F using: I_C, w1, w2, w3, and w4, wherein, for i = 1, 2, 3 and 4, wi = 0 if Vi is greater than or equal to x_c, otherwise wi = k × Vi + m, where V1 = abs(ΔI_A) or abs(NL_A); V2 = abs(ΔI_B) or abs(NL_B); V3 = abs(ΔI_L) or abs(NL_L); and V4 = abs(ΔI_R) or abs(NL_R).

[00102] B6. The method of embodiment B5, wherein determining I_F comprises determining (I_C + d(σ_d) × [w1 × ΔI_A + w2 × ΔI_B + w3 × ΔI_L + w4 × ΔI_R]).

[00103] C1. A method for determining a filtered pixel value (I_F), the method comprising: obtaining a pixel value (e.g., an intensity value, a chroma value, etc.) for a first pixel (e.g., a center pixel) of a block of pixels; obtaining a first value (k_qp); obtaining a second value (m_qp); obtaining a third value (d(σ_d)); calculating k = d(σ_d) × k_qp; calculating m = d(σ_d) × m_qp;

calculating W1_m = m × abs(ΔI1), wherein abs(ΔI1) is the absolute value of the difference between the pixel value for the first pixel (e.g., the center pixel) and a second pixel value associated with a second pixel in the block of pixels (e.g., the pixel above the center pixel; e.g., ΔI1 may be one of ΔI_A, ΔI_B, ΔI_L, and ΔI_R); calculating W1_k = k × ΔI1 × V1; after calculating W1_m and W1_k, calculating W1 = W1_m + W1_k; and determining the filtered pixel value (I_F) using the pixel value for the first pixel (I_C) and W1, wherein determining I_F using W1 and I_C comprises determining whether W1 is greater than 0.

[00104] C2. The method of embodiment C1, further comprising: calculating W2_m = m × ΔI2, wherein ΔI2 is a difference between the pixel value for the first pixel and a third pixel value associated with a third pixel in the block of pixels; calculating W2_k = k × ΔI2 × V2; after calculating W2_m and W2_k, calculating W2 = W2_m + W2_k; calculating W3_m = m × ΔI3, wherein ΔI3 is a difference between the pixel value for the first pixel and a fourth pixel value associated with a fourth pixel in the block of pixels; calculating W3_k = k × ΔI3 × V3; after calculating W3_m and W3_k, calculating W3 = W3_m + W3_k; calculating W4_m = m × ΔI4, wherein ΔI4 is a difference between the pixel value for the first pixel and a fifth pixel value associated with a fifth pixel in the block of pixels; calculating W4_k = k × ΔI4 × V4; after calculating W4_m and W4_k, calculating W4 = W4_m + W4_k; and determining I_F using I_C, W1, W2, W3 and W4, wherein determining I_F using I_C, W1, W2, W3 and W4 comprises calculating: I_F = I_C + max(0, W1) + max(0, W2) + max(0, W3) + max(0, W4).

[00105] D. An apparatus (600) configured to perform the method of any one of the above embodiments.

[00106] While various embodiments are described herein, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of this disclosure should not be limited by any of the above-described exemplary embodiments. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the disclosure unless otherwise indicated herein or otherwise clearly contradicted by context.

[00107] Additionally, while the processes described above and illustrated in the drawings are shown as a sequence of steps, this was done solely for the sake of illustration. Accordingly, it is contemplated that some steps may be added, some steps may be omitted, the order of the steps may be re-arranged, and some steps may be performed in parallel.

[00108] References

[00109] [1] P. Wennersten, J. Ström, Y. Wang, K. Andersson, R. Sjöberg, J. Enhorn, “Bilateral Filtering for Video Coding”, IEEE Visual Communications and Image Processing (VCIP), December 2017. [Paper downloadable from: www.jacobstrom.com/publications/Wennersten_et_al_VCIP2017.pdf].

[00110] [2] Y. Chen, W.-J. Chien, H.-C. Chuang, M. Coban, J. Dong, H. E. Egilmez, N. Hu, M. Karczewicz, A. Ramasubramonian, D. Rusanovskyy, A. Said, V. Seregin, G. Van Der Auwera, K. Zhang, L. Zhang, P. Bordes, Y. Chen, C. Chevance, E. Francois, F. Galpin, F. Hiron, P. de Lagrange, F. Le Leannec, K. Naser, T. Poirier, F. Racape, G. Rath, A. Robert, F. Urban, T. Viellard [Downloadable as JVET-J0021.docx in the zip file phenix.int-evry.fr/jvet/doc_end_user/documents/10_San%20Diego/wg11/JVET-J0021-v5.zip].

[00111] [3] J. Ström, P. Wennersten, J. Enhorn, D. Liu, K. Andersson, R. Sjöberg, “CE2 related: Reduced complexity bilateral filter”, Input document to JVET, document number JVET-K0274-v4. [Downloadable as JVET-K0274_v4_clean.docx in the zip file phenix.int-evry.fr/jvet/doc_end_user/documents/11_Ljubljana/wg11/JVET-K0274-v5.zip].

Claims

1. A method (300, 400, 500) for determining a filtered pixel value, the method comprising:

obtaining (s302, s402, s502) a first pixel value associated with a first pixel of a block of pixels;

determining (s304, s410, s514) a first weight value, w1, based on a first value, V1, a second value, k, and a third value, m, wherein V1 is equal to, or calculated using, the absolute value of the difference between the first pixel value and a second pixel value associated with a second pixel of the block of pixels; and

determining (s306, s412, s516) the filtered pixel value using the first pixel value and the first weight value, w1.

2. The method of claim 1, wherein determining w1 comprises determining a value I, wherein I = (k × V1) + m.

3. The method of claim 2, wherein determining w1 further comprises calculating w1 = max(0, I).

4. The method of claim 1, wherein determining w1 comprises calculating w1 = k × min(V1, x_c) + m, where x_c = -m/k.

5. The method of claim 1, wherein determining w1 comprises:

determining whether V1 is less than or equal to x_c, where x_c = -m/k; and

setting w1 equal to 0 as a result of determining that V1 is greater than or equal to x_c, otherwise setting w1 equal to k × V1 + m.

6. The method of any one of claims 1-5, wherein

determining the filtered pixel value comprises determining the filtered pixel value using: I_C, w1, w2, w3, and w4,

I_C is the first pixel value,

for i = 1, 2, 3 and 4, wi = 0 if Vi is greater than or equal to x_c, otherwise wi = k × Vi + m,

V2 is equal to or calculated using the absolute value of the difference between the first pixel value and a third pixel value associated with a third pixel of the block of pixels,

V3 is equal to or calculated using the absolute value of the difference between the first pixel value and a fourth pixel value associated with a fourth pixel of the block of pixels, and

V4 is equal to or calculated using the absolute value of the difference between the first pixel value and a fifth pixel value associated with a fifth pixel of the block of pixels.

7. The method of claim 6, wherein

the filtered pixel value is equal to: I_C + d(σ_d) × [w1 × ΔI_A + w2 × ΔI_B + w3 × ΔI_L + w4 × ΔI_R], and

d(σ_d) is a predetermined constant for the block of pixels.

8. The method of any one of claims 1-6, wherein

determining the filtered pixel value using the first pixel value and w1 comprises determining x and calculating I_C + x,

I_C is the pixel value for the first pixel,

determining x comprises calculating w1 × ΔI_A, and

ΔI_A is equal to the difference between the second pixel value and the first pixel value.

9. The method of claim 8, wherein

x is equal to: w1 × ΔI_A + w2 × ΔI_B + w3 × ΔI_L + w4 × ΔI_R,

w2 is a determined second weight value,

w3 is a determined third weight value,

w4 is a determined fourth weight value,

ΔI_B is equal to the difference between a third pixel value associated with a third pixel of the block of pixels and the first pixel value,

ΔI_L is equal to the difference between a fourth pixel value associated with a fourth pixel of the block of pixels and the first pixel value, and

ΔI_R is equal to the difference between a fifth pixel value associated with a fifth pixel of the block of pixels and the first pixel value.

10. The method of claim 9, wherein determining w1 comprises calculating: k × V1 + m.

11. The method of claim 10, wherein determining w1 comprises determining w1 to be zero if k × V1 + m is equal to or less than zero, otherwise determining w1 to be k × V1 + m.

12. The method of claim 1, wherein

determining the filtered pixel value using the first pixel value and w1 comprises calculating I_C + w1 + w2 + w3 + w4,

I_C is the pixel value for the first pixel,

w2 is a determined second weight value,

w3 is a determined third weight value, and

w4 is a determined fourth weight value.

13. The method of claim 1 or 12, wherein determining w1 comprises calculating: (k × V1 × ΔI_A) + (m × ΔI_A), where ΔI_A is equal to the difference between the pixel value for the second pixel of the block of pixels and the first pixel value.

14. The method of claim 13, wherein determining w1 comprises determining w1 to be zero if ((k × V1 × ΔI_A) + (m × ΔI_A)) is equal to or less than zero, otherwise determining w1 to be ((k × V1 × ΔI_A) + (m × ΔI_A)).

15. A method (500) for determining a filtered pixel value, the method comprising:

obtaining (s302, s402, s502) a pixel value, I_C, associated with a first pixel of a block of pixels;

determining (s510) a first intermediate value, W1_m, based on a first value, m, and ΔI_A, where ΔI_A is equal to a difference between a second pixel value associated with a second pixel of the block of pixels and the first pixel value;

determining (s512) a second intermediate value, W1_k, based on a second value, k, and a third value, V1;

summing (s514) W1_m and W1_k, thereby producing a sum, W1, where W1 = W1_m + W1_k;

determining (s516) the filtered pixel value using the pixel value for the first pixel, I_C, and W1, wherein

determining W1_m comprises calculating (m × ΔI_A),

determining W1_k comprises calculating (k × V1 × ΔI_A), and

determining the filtered pixel value using W1 and I_C comprises determining whether W1 is greater than 0.

16. The method of claim 15, wherein V1 = abs(ΔI_A).

17. The method of claim 15 or 16, wherein

the filtered pixel value is equal to: I_C + max(0, W1) + max(0, W2) + max(0, W3) + max(0, W4),

W2 = (k × V2 × ΔI_B) + (m × ΔI_B),

W3 = (k × V3 × ΔI_L) + (m × ΔI_L), and

W4 = (k × V4 × ΔI_R) + (m × ΔI_R).

18. A computer program (643) comprising instructions (644) which, when executed by processing circuitry (602) of an apparatus (600), cause the apparatus (600) to perform the method of any one of the above claims.

19. A carrier containing the computer program of claim 18, wherein the carrier is one of an electronic signal, an optical signal, a radio signal, and a computer readable storage medium (642).

20. An apparatus (600) for determining a filtered pixel value, the apparatus being configured to:

obtain (s302, s402, s502) a first pixel value associated with a first pixel of a block of pixels;

determine (s304, s410, s514) a first weight value, w1, based on a first value, V1, a second value, k, and a third value, m, wherein V1 is equal to, or calculated using, the absolute value of the difference between the first pixel value and a second pixel value associated with a second pixel of the block of pixels; and

determine (s306, s412, s516) the filtered pixel value using the first pixel value and the first weight value, w1.

21. The apparatus of claim 20, wherein the apparatus is configured to determine w1 by performing a process that includes determining a value I, wherein I = (k × V1) + m.

22. The apparatus of claim 21, wherein the process further comprises calculating w1 = max(0, I).

23. The apparatus of claim 20, wherein the apparatus is configured to determine w1 by performing a process that comprises calculating w1 = k × min(V1, x_c) + m, where x_c = -m/k.

24. The apparatus of claim 20, wherein the apparatus is configured to determine w1 by: determining whether V1 is less than or equal to x_c, where x_c = -m/k; and

25. An apparatus (600) for determining a filtered pixel value, the apparatus being configured to:

obtain (s302, s402, s502) a first pixel value, I_C, associated with a first pixel of a block of pixels;

determine (s510) a first intermediate value, W1_m, based on a first value, m, and ΔI_A, where ΔI_A is equal to a difference between a second pixel value associated with a second pixel of the block of pixels and the first pixel value;

determine (s512) a second intermediate value, W1_k, based on a second value, k, and a third value, V1;

sum (s514) W1_m and W1_k, thereby producing a sum, W1, where W1 = W1_m + W1_k;

determine (s516) the filtered pixel value using the pixel value for the first pixel, I_C, and W1, wherein

the apparatus is configured to determine W1_m by calculating (m × ΔI_A);

the apparatus is configured to determine W1_k by calculating (k × V1 × ΔI_A);

the apparatus is configured to determine the filtered pixel value using W1 and I_C by performing a process that comprises determining whether W1 is greater than 0.

26. The apparatus of claim 25, wherein V1 = abs(ΔI_A).

27. The apparatus of claim 25 or 26, wherein

W2 = (k × V2 × ΔI_B) + (m × ΔI_B),

W3 = (k × V3 × ΔI_L) + (m × ΔI_L), and

W4 = (k × V4 × ΔI_R) + (m × ΔI_R).