CN113473133A - Method and device for predicting by using geometric partition GEO mode

Info

Publication number
CN113473133A
Authority
CN
China
Prior art keywords
prediction
partition
prediction block
distance
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110349749.7A
Other languages
Chinese (zh)
Inventor
修晓宇
马宗全
陈漪纹
朱弘正
陈伟
王祥林
于冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Publication of CN113473133A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: using adaptive coding
    • H04N 19/134: characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/156: Availability of hardware or computational resources, e.g. encoding based on power-saving criteria
    • H04N 19/169: characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/182: the unit being a pixel
    • H04N 19/50: using predictive coding
    • H04N 19/503: involving temporal prediction
    • H04N 19/51: Motion estimation or motion compensation
    • H04N 19/513: Processing of motion vectors
    • H04N 19/517: Processing of motion vectors by encoding
    • H04N 19/52: Processing of motion vectors by encoding by predictive encoding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present disclosure relates to a method and an apparatus for prediction using a geometric partition GEO mode, the method comprising: when the current coding unit is predicted by using a GEO mode, determining a weighting matrix according to a predefined angle and an offset distance of a final partition mode in the GEO mode, wherein the precision of the slope of the predefined angle is 1/m pixel precision, and m is a positive integer less than or equal to 4; and performing weighted mixing on a first prediction block of the first non-rectangular sub-partition and a second prediction block of the second non-rectangular sub-partition according to the weighting matrix to obtain a prediction block corresponding to the current coding unit, wherein the first non-rectangular sub-partition and the second non-rectangular sub-partition are partitioned from the current coding unit according to the final partitioning mode.

Description

Method and device for predicting by using geometric partition GEO mode
Technical Field
The present disclosure relates to the field of video technologies, and in particular, to a method and an apparatus for performing prediction by using a geometric partitioning GEO mode.
Background
As the demand for video codecs that efficiently encode or decode high-resolution or high-quality video content increases, various video coding techniques are used to encode or decode video data. For example, video coding may be performed according to one or more video coding standards, including High Efficiency Video Coding (H.265/HEVC), Advanced Video Coding (H.264/AVC), Moving Picture Experts Group (MPEG) coding, Versatile Video Coding (VVC), and so on. A primary goal of video coding techniques is to compress video data into a format that uses a lower bit rate while avoiding or minimizing degradation of video quality. Although HEVC provides a bit-rate saving of about 50% at equivalent perceptual quality compared to the previous-generation standard H.264/MPEG AVC, there is evidence that coding efficiency higher than that of HEVC can be achieved with additional coding tools. Hence the Versatile Video Coding (VVC) standard was developed.
In general, video encoding utilizes prediction methods (e.g., inter prediction, intra prediction, skip mode, etc.) to remove redundancy present in a video image or sequence. To further improve coding efficiency, VVC adopts a geometric partition prediction (GEO) mode, which forms a bi-prediction block by blending two inter-prediction blocks: the prediction block of the GEO mode is generated by sample-weighted averaging of the two inter-prediction blocks. Although the GEO mode improves coding efficiency to some extent, there is still room for improvement in its computational logic and storage efficiency.
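The sample-weighted averaging mentioned above can be sketched as follows. This is a minimal illustration assuming 3-bit per-sample weights in [0, 8] that sum to 8 (consistent with the wValue range used later in this disclosure); the function name and rounding convention are illustrative assumptions, not the literal VVC implementation.

```python
# Hypothetical sketch of GEO sample-weighted blending: each output sample is a
# weighted average of two inter-prediction blocks, with per-sample weights
# w and (8 - w) that sum to 8.

def blend_geo(pred0, pred1, weights):
    """Blend two prediction blocks sample by sample.

    pred0, pred1: 2-D lists of prediction samples of the same size.
    weights: 2-D list of weights in [0, 8] applied to pred0.
    """
    h, w = len(pred0), len(pred0[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            wv = weights[y][x]
            # rounding offset 4 is half of the weight sum 8
            out[y][x] = (pred0[y][x] * wv + pred1[y][x] * (8 - wv) + 4) >> 3
    return out
```

A weight of 8 copies the first prediction block, a weight of 0 copies the second, and intermediate values blend smoothly across the partition boundary.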
Disclosure of Invention
The present disclosure provides a method and apparatus for prediction using the geometric partition GEO mode, to at least address the problems of the related art described above; however, the present disclosure is not required to solve any particular one of those problems.
According to a first aspect of the embodiments of the present disclosure, there is provided a method for performing prediction by using a geometric partition GEO mode, including: when the current coding unit is predicted by using a GEO mode, determining a weighting matrix according to a predefined angle and an offset distance of a final partition mode in the GEO mode, wherein the precision of the slope of the predefined angle is 1/m pixel precision, and m is a positive integer less than or equal to 4; and performing weighted mixing on a first prediction block of the first non-rectangular sub-partition and a second prediction block of the second non-rectangular sub-partition according to the weighting matrix to obtain a prediction block corresponding to the current coding unit, wherein the first non-rectangular sub-partition and the second non-rectangular sub-partition are partitioned from the current coding unit according to the final partitioning mode.
Optionally, the step of determining the weighting matrix may comprise: determining a weight applied to one prediction sample in a first prediction block with 1/m pixel precision according to the offset distance, the predefined angle and a sample position of the one prediction sample in the first prediction block; the weight applied to the prediction samples at the same position as the one prediction sample in the second prediction block with 1/m pixel accuracy is obtained by a difference between a fixed value and the weight of the one prediction sample in the first prediction block.
Alternatively, the step of determining, with 1/m pixel precision, the weight applied to the one prediction sample in the first prediction block may comprise: determining a weight index weightIdx for deriving the weight to be applied to the one prediction sample in the first prediction block from the offset distance, the predefined angle and the sample position of the one prediction sample; and deriving a weight value applied to the one prediction sample in the first prediction block according to the weight index weightIdx.
Alternatively, the step of determining the weight index weightIdx for deriving the weight applied to the one prediction sample in the first prediction block may comprise: determining an x-direction offset and a y-direction offset from the predefined angle and the offset distance; retrieving a geometric partition distance array at 1/m pixel precision based on the angle index of the predefined angle to obtain the geometric partition distance in the x direction and the geometric partition distance in the y direction, respectively; and determining the weight index weightIdx according to the x-direction offset and the y-direction offset, the geometric partition distances in the x direction and the y direction, and the sample position of the prediction sample.
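The steps above can be sketched as follows. The multiply-accumulate combination below follows the general shape of VVC-style GEO weight-index derivations and is an illustrative assumption, not the literal equation of this disclosure; offset_x/offset_y and dis_x/dis_y stand for the x/y-direction offsets and the retrieved geometric partition distances named in the text.

```python
# Hypothetical sketch of the weightIdx derivation: combine the sample position,
# the x/y-direction offsets, and the geometric partition distances retrieved
# for the predefined angle into a signed distance-like index.

def weight_index(x, y, offset_x, offset_y, dis_x, dis_y):
    """weightIdx for the prediction sample at position (x, y)."""
    return ((((x + offset_x) << 1) + 1) * dis_x
            + (((y + offset_y) << 1) + 1) * dis_y)
```

The sign and magnitude of the returned index then indicate on which side of the partition boundary the sample lies and how far from it, which the following steps convert into a blending weight.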
Alternatively, the step of deriving the weight value applied to the one prediction sample in the first prediction block according to the weight index weightIdx may comprise: determining, at 1/m pixel precision, the distance weightIdxL from the one prediction sample to the partition boundary corresponding to the final partition mode according to the weight index weightIdx; and determining the weight wValue applied to the one prediction sample in the first prediction block with 1/m pixel precision using the distance weightIdxL according to the following equation:
wValue = Clip3(0, 8, (weightIdxL + 2^(m-3)) >> (m-2))
Alternatively, the distance weightIdxL may be determined according to the following equation:
[Equation shown as an image in the original (Figure BDA0003002032770000031): weightIdxL expressed piecewise in terms of the weight index weightIdx, depending on the angle index angleIdx.]
wherein the angleIdx is an angle index for representing the predefined angle.
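The clipping equation for wValue above can be written out directly; Clip3(lo, hi, v) is the standard clamping operator of video coding specifications. The default m = 4 below is an assumed setting for illustration (the disclosure allows a positive integer m <= 4, where the exponents are defined).

```python
def clip3(lo, hi, v):
    """Clip3 operator: clamp v into the inclusive range [lo, hi]."""
    return max(lo, min(hi, v))

def weight_value(weight_idx_l, m=4):
    """wValue = Clip3(0, 8, (weightIdxL + 2^(m-3)) >> (m-2)).

    weight_idx_l is the distance weightIdxL at 1/m pixel precision;
    m = 4 is an assumed setting for illustration."""
    return clip3(0, 8, (weight_idx_l + (1 << (m - 3))) >> (m - 2))
```

Samples far inside one sub-partition saturate at 0 or 8, while samples near the partition boundary receive intermediate weights, producing the blended transition region.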
Optionally, the method may further comprise: for each sub-block of a predefined size in the current coding unit, storing at least one of a first motion vector for obtaining the first prediction block and a second motion vector for obtaining the second prediction block, wherein the storing comprises: comparing the distance from the center of the current sub-block to the partition boundary corresponding to the final partition mode with a predetermined threshold, wherein the predetermined threshold is equal to 2^m; and selecting, from the first and second motion vectors, a motion vector to be stored for the current sub-block according to a result of the comparison.
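The storage rule above can be sketched as follows. The sign convention of the centre distance and the decision to store both motion vectors near the boundary are illustrative assumptions; the disclosure only specifies the comparison against the threshold 2^m.

```python
# Hypothetical sketch of per-sub-block motion vector storage in GEO mode: the
# signed distance from the sub-block centre to the partition boundary is
# compared with the predetermined threshold 2**m. The sign convention and the
# "store both" case are assumptions for illustration.

def select_stored_mv(center_dist, m, mv1, mv2):
    threshold = 1 << m  # predetermined threshold, equal to 2^m
    if center_dist > threshold:
        return (mv2,)       # well inside the second sub-partition (assumed)
    if center_dist < -threshold:
        return (mv1,)       # well inside the first sub-partition (assumed)
    return (mv1, mv2)       # near the partition boundary: keep both (assumed)
```

Tying the threshold to the same exponent m as the weight derivation keeps the stored motion field consistent with the blending region at 1/m pixel precision.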
According to a second aspect of the embodiments of the present disclosure, there is provided an apparatus for performing prediction by using a geometric partition GEO mode, including: a computing unit configured to: when the current coding unit is predicted by using a GEO mode, determining a weighting matrix according to a predefined angle and an offset distance of a final partition mode in the GEO mode, wherein the precision of the slope of the predefined angle is 1/m pixel precision, and m is a positive integer less than or equal to 4; and a generation unit configured to: and performing weighted mixing on a first prediction block of the first non-rectangular sub-partition and a second prediction block of the second non-rectangular sub-partition according to the weighting matrix to obtain a prediction block corresponding to the current coding unit, wherein the first non-rectangular sub-partition and the second non-rectangular sub-partition are partitioned from the current coding unit according to the final partitioning mode.
Alternatively, the calculation unit may be configured to determine the weighting matrices applied to the first prediction block and the second prediction block according to: determining a weight applied to one prediction sample in a first prediction block with 1/m pixel precision according to the offset distance, the predefined angle and a sample position of the one prediction sample in the first prediction block; the weight applied to the prediction samples at the same position as the one prediction sample in the second prediction block with 1/m pixel accuracy is obtained by a difference between a fixed value and the weight of the one prediction sample in the first prediction block.
Alternatively, the calculation unit may be configured to determine, with 1/m pixel precision, the weight applied to the one prediction sample in the first prediction block by: determining a weight index weightIdx for deriving the weight to be applied to the one prediction sample in the first prediction block from the offset distance, the predefined angle and the sample position of the one prediction sample; and deriving a weight value applied to the one prediction sample in the first prediction block according to the weight index weightIdx.
Alternatively, the calculation unit may be configured to determine the weight index weightIdx for deriving the weight to be applied to the one prediction sample in the first prediction block by: determining an x-direction offset and a y-direction offset from the predefined angle and the offset distance; retrieving a geometric partition distance array at 1/m pixel precision based on the angle index of the predefined angle to obtain the geometric partition distance in the x direction and the geometric partition distance in the y direction, respectively; and determining the weight index weightIdx according to the x-direction offset and the y-direction offset, the geometric partition distances in the x direction and the y direction, and the sample position of the prediction sample.
Optionally, the computing unit may be configured to derive the weight value applied to the one prediction sample in the first prediction block by: determining, at 1/m pixel precision, the distance weightIdxL from the one prediction sample to the partition boundary corresponding to the final partition mode according to the weight index weightIdx; and determining the weight wValue applied to the one prediction sample in the first prediction block with 1/m pixel precision using the distance weightIdxL according to the following equation: wValue = Clip3(0, 8, (weightIdxL + 2^(m-3)) >> (m-2)).
Alternatively, the distance weightIdxL may be determined according to the following equation:
[Equation shown as an image in the original (Figure BDA0003002032770000041): weightIdxL expressed piecewise in terms of the weight index weightIdx, depending on the angle index angleIdx.]
wherein the angleIdx is an angle index for representing the predefined angle.
Optionally, the apparatus may further comprise: a motion information storage unit configured to: for each sub-block of a predefined size in the current coding unit, store at least one of a first motion vector for obtaining the first prediction block and a second motion vector for obtaining the second prediction block, wherein the motion information storage unit performs the storing by: comparing the distance from the center of the current sub-block to the partition boundary corresponding to the final partition mode with a predetermined threshold, wherein the predetermined threshold is equal to 2^m; and selecting, from the first and second motion vectors, a motion vector to be stored for the current sub-block according to a result of the comparison.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: at least one processor; at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform a method of predicting with geometrically partitioned GEO patterns according to the present disclosure.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform a method of prediction with geometry partition GEO mode according to the present disclosure.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product, instructions in which are executable by a processor of a computer device to perform a method of predicting with geometric partition GEO-mode according to the present disclosure.
The technical solutions provided by the embodiments of the present disclosure bring at least the following beneficial effects: with the method and apparatus for prediction using the geometric partition GEO mode, the number of bits required by the computation logic and the memory during prediction in the GEO mode can be reduced, the complexity of encoding and decoding is lowered, and power consumption is thereby reduced.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
Fig. 1 and Fig. 2 illustrate implementation scenarios of the method and apparatus for prediction using the GEO mode according to the present disclosure.
Fig. 3 is a diagram illustrating division of pictures into CTUs according to an exemplary embodiment of the present disclosure.
Fig. 4 is a diagram illustrating a QTBT structure according to an exemplary embodiment of the present disclosure.
Fig. 5 is a diagram illustrating four partition types in a multi-type tree structure according to an exemplary embodiment of the present disclosure.
Fig. 6 is a diagram illustrating derivation of motion vector candidates from spatially neighboring blocks adjacent to a current coding unit according to an exemplary embodiment of the present disclosure.
Fig. 7 is a diagram illustrating scaling for temporal motion vector candidates according to an exemplary embodiment of the present disclosure.
Fig. 8 is a diagram illustrating determining candidate positions for obtaining temporal motion vector candidates from temporally neighboring blocks (i.e., co-located blocks) adjacent to a current CU according to an exemplary embodiment of the present disclosure.
Fig. 9 is a diagram illustrating unidirectional motion vector selection for GEO mode according to an exemplary embodiment of the present disclosure.
Fig. 10 is a diagram illustrating possible angles of GEO mode according to an exemplary embodiment of the present disclosure.
Fig. 11 is a diagram illustrating two example zoning modes in GEO mode according to an exemplary embodiment of the present disclosure.
Fig. 12 is a flowchart illustrating a method of predicting using a GEO mode according to an exemplary embodiment of the present disclosure.
Fig. 13 is a block diagram illustrating an apparatus 1300 for prediction using GEO mode according to an exemplary embodiment of the present disclosure.
Fig. 14 is a block diagram of an electronic device 1400 according to an example embodiment of the present disclosure.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The embodiments described in the following examples do not represent all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Herein, the expression "at least one of the items" encompasses three parallel cases: "any one of the items", "a combination of any plurality of the items", and "all of the items". For example, "including at least one of A and B" covers the following three parallel cases: (1) including A; (2) including B; (3) including A and B. Likewise, "performing at least one of step one and step two" covers the following three parallel cases: (1) performing step one; (2) performing step two; (3) performing step one and step two.
In the existing GEO mode, when a current coding unit is divided into two non-rectangular sub-partitions using a predefined angle and an offset distance index derived from a mode index of the GEO mode, the slope corresponding to the predefined angle refers to a lookup table at 1/8 pixel precision, as shown in Table 1 below, where idx is an index indicating a partition manner in the GEO mode and disLut[ ] is the geometric partition distance array. However, as can be seen from Table 1, the minimum non-zero slope magnitude in the table does not actually require 1/8 pixel precision, which results in wasted bits in the computation logic and the memory.
[Table 1]

idx          0     2     3     4    5    6    8    10   11   12   13   14
disLut[idx]  8     8     8     4    4    2    0    -2   -4   -4   -8   -8
Slope        0    -1/4  -1/2  -1   -2   -4   -∞    4    2    1    1/2  1/4

idx          16    18    19    20   21   22   24   26   27   28   29   30
disLut[idx]  -8    -8    -8    -4   -4   -2   0    2    4    4    8    8
Slope        0    -1/4  -1/2  -1   -2   -4   -∞    4    2    1    1/2  1/4
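The observation that Table 1 over-specifies precision can be checked mechanically: every finite slope in the table is a multiple of 1/4, so a lookup at 1/8 pixel precision is finer than the table needs. The check below is illustrative only; the function name is an assumption.

```python
import math
from fractions import Fraction

# Finite slopes from Table 1 (the infinite slope, at disLut[idx] == 0, omitted).
slopes = [Fraction(0), Fraction(-1, 4), Fraction(-1, 2), Fraction(-1),
          Fraction(-2), Fraction(-4), Fraction(4), Fraction(2),
          Fraction(1), Fraction(1, 2), Fraction(1, 4)]

def finest_denominator(values):
    """Least common denominator: the coarsest uniform 1/d grid holding all values."""
    d = 1
    for v in values:
        d = d * v.denominator // math.gcd(d, v.denominator)
    return d
```

Since finest_denominator(slopes) evaluates to 4, every slope already lies on the quarter-pixel grid, which is the motivation for the 1/m pixel precision (m <= 4) used in this disclosure.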
To address the above problems, the present disclosure provides an improved method for performing prediction using the GEO mode, which can reduce computational complexity and storage complexity without affecting video quality, so that video encoding and decoding can be completed more quickly whether decoding is performed in hardware or in software. Hereinafter, a method and apparatus for prediction using the GEO mode according to exemplary embodiments of the present disclosure will be described in detail with reference to Figs. 1 to 14.
Fig. 1 and Fig. 2 illustrate implementation scenarios of the method and apparatus for prediction using the GEO mode according to the present disclosure.
Fig. 1 is a schematic diagram illustrating a video encoding process of a block-based hybrid video encoding framework.
Referring to Fig. 1, when input video data is encoded, a coding block of the current frame may be predicted in one of two main ways. In intra prediction mode (also called "spatial prediction"), a prediction block is obtained using pixel values from samples (called reference samples) of already-encoded neighboring blocks in the same picture/slice, to remove spatial redundancy in the video signal. In inter prediction mode (motion estimation and motion compensation, also called "temporal prediction"), a prediction block is obtained using reconstructed pixels of already-encoded pictures, to remove temporal redundancy in the video signal. The temporal prediction signal of a coding unit, i.e., the prediction block obtained by inter prediction, is typically conveyed by one or more motion vectors (MVs), where an MV indicates the motion offset and direction between the current coding unit and the prediction block found by inter prediction. Furthermore, if multiple reference frames are supported, the encoder additionally needs to send a reference frame index identifying which of the multiple reference frames the prediction block comes from.
After inter prediction and/or intra prediction, an intra/inter mode decision module in the encoder selects the best prediction mode, e.g., based on a rate-distortion optimization method. Then, the encoder subtracts the current coding unit from the prediction block to obtain residual data, after which the residual data is decorrelated by transform and quantization to obtain quantized residual coefficients. In addition, the encoder inverse quantizes and inverse transforms the quantized residual coefficients to form a reconstructed residual, and then adds the reconstructed residual to the prediction block to obtain a reconstructed current coding unit. Thereafter, the encoder may apply a loop filter, such as a deblocking filter, a Sample Adaptive Offset (SAO) filter, and an Adaptive Loop Filter (ALF), to the reconstructed current coding unit before the reconstructed current coding unit is stored in a reference picture buffer and used for encoding of a subsequent coding unit.
Finally, the quantized residual coefficients and the prediction related information (e.g., prediction mode information, motion information, etc.) determined by the intra/inter mode decision module are all further compressed and packed into a bitstream by an entropy coding module. In the above description, two prediction methods of intra prediction and inter prediction are mentioned, but in addition to the intra prediction mode and the inter prediction mode, there are a merge prediction mode (including a GEO mode) and a skip mode. For the merge prediction mode, only the merge index and the residual data are transmitted, and for the skip mode, only the merge index is transmitted and transmission of the residual data is omitted. Therefore, the method and apparatus for prediction using geometric partition GEO mode according to the present disclosure may be applied to an encoder or an encoding device, and accordingly, may also be applied to a decoder or a decoding device.
Fig. 2 is a schematic diagram illustrating a video decoding process of a block-based hybrid video coding framework.
In the video image decoding process, first, a decoder obtains residual data to be decoded and prediction-related information (e.g., prediction mode information, motion information, etc.) by entropy-decoding a received bitstream. The prediction related information is sent to the intra prediction module (in case of intra prediction)/motion compensation module (in case of inter prediction). In the intra prediction mode, the decoder may perform intra prediction on the current coding unit according to the parsed prediction related information to obtain a prediction block, and in the inter prediction mode, the decoder may perform inter prediction on the current coding unit according to the parsed prediction related information to obtain a prediction block.
Further, the decoder inverse-quantizes and inverse-transforms the residual data to generate a reconstructed residual block, and then adds the reconstructed residual block to the prediction block obtained in the intra prediction mode/inter prediction mode to obtain a reconstructed block in a spatial domain. In addition, the decoder also performs loop filtering, such as a deblocking filter, a Sample Adaptive Offset (SAO) filter, and an Adaptive Loop Filter (ALF), etc., on the reconstructed block before storing the obtained reconstructed block in the spatial domain to the picture buffer. The reconstructed image in the picture buffer is sent to a display for display while the reconstructed image is also used as a reference frame for subsequent decoded pictures.
Fig. 3 is a diagram illustrating division of pictures into CTUs according to an exemplary embodiment of the present disclosure.
In a typical video encoding process, a video sequence includes a series of sequentially arranged frames or pictures. In the case of monochrome content, each frame may comprise a single sample array, such as an array of luminance samples, whereas in the case of non-monochrome content, each frame may comprise three two-dimensional sample arrays, such as one two-dimensional array of luminance samples and two two-dimensional arrays of chrominance samples.
As shown in fig. 3, the encoder and decoder may divide each frame into a series of Coding Tree Units (CTUs), where a CTU is the largest logical coding unit. The width and height of the CTUs may be signaled by the encoder to the decoder, e.g., the CTUs may be one of 256 × 256, 128 × 128, 64 × 64, etc. It should be noted, however, that the size of the CTU is not particularly limited in this application. Furthermore, each CTU may consist of one luma Coded Tree Block (CTB) for frames of monochrome content and one luma CTB and two corresponding chroma CTBs for frames of non-monochrome content. Each CTU may be partitioned into multiple CUs, which may have a square shape or a rectangular shape, according to a QTBT partition structure, which eliminates the concept of multiple partition types, i.e., eliminates the separation of CU, Prediction Unit (PU), and Transform Unit (TU) concepts and provides greater flexibility for CU partition shapes.
Fig. 4 is a diagram illustrating a QTBT partition structure according to an exemplary embodiment of the present disclosure.
As shown in Fig. 4, the left diagram shows an example of partitioning using the QTBT partition structure, and the right diagram shows the corresponding tree structure, where solid lines indicate partitioning according to a quadtree partition structure and dashed lines indicate partitioning according to a binary tree partition structure. The CTU is first partitioned according to the quadtree partition structure, and the quadtree leaf nodes are then further partitioned according to a multi-type tree partition structure, e.g., a binary tree or ternary tree partition structure. In Fig. 4, the quadtree leaf nodes are further partitioned according to a binary tree partition structure, in which there are two partition types: symmetric horizontal partitioning and symmetric vertical partitioning (horizontal binary partition (SPLIT_BT_HOR) and vertical binary partition (SPLIT_BT_VER) in Fig. 5). Although the quadtree leaf nodes in Fig. 4 are further partitioned according to the binary tree partition structure, they may also be partitioned according to other partition structures, for example the vertical ternary partition (SPLIT_TT_VER) and horizontal ternary partition (SPLIT_TT_HOR) shown in Fig. 5. Further, at each partition node (non-leaf node) of a binary or ternary tree partition structure, the encoder signals a flag to indicate which partition type is used. For quadtree partitioning, there is no need to indicate the partition type, since the quadtree partition structure always partitions a block in both the horizontal and vertical directions to generate four equally sized sub-blocks. Specifically, in Fig. 4, CTUs are first partitioned by the quadtree partition structure, and quadtree leaf nodes are further partitioned by a binary tree or ternary tree partition structure. The leaf nodes of the binary tree and the ternary tree are called CUs, and the CUs in this case are used for the prediction and transform processes without further partitioning.
Furthermore, for the QTBT partitioning scheme, the VVC also defines the following parameters:
MinQTSize: allowed minimum quadtree leaf node size;
MaxBTSize: the allowed maximum binary tree root node size;
MaxBTDepth: maximum allowed binary tree depth;
MinBTSize: the minimum allowed binary tree leaf node size.
For example, in one example of the QTBT partition structure, in the 4:2:0 color format, the CTU is set to include one luma CTB of size 128 × 128 and two corresponding chroma CTBs of size 64 × 64, MinQTSize indicating the minimum allowed quadtree leaf node size is set to 16 × 16, MaxBTSize indicating the maximum allowed binary tree root node size is set to 64 × 64, MinBTSize indicating the minimum allowed binary tree leaf node size is set to 4 × 4, and MaxBTDepth indicating the maximum allowed binary tree depth is set to 4. In this case, the quadtree partition structure is first applied to the CTU to generate quadtree leaf nodes, which may range in size from 16 × 16 (i.e., MinQTSize) to 128 × 128 (i.e., the CTU size). If the size of a quadtree leaf node is 128 × 128, the leaf node will not be further partitioned according to the binary tree partition structure, because its size exceeds MaxBTSize (i.e., 64 × 64). Otherwise, the quadtree leaf node may be further partitioned according to the binary tree partition structure. In that case, the quadtree leaf node is also the root node of the binary tree partition structure, with binary tree depth 0. When the binary tree depth reaches MaxBTDepth (i.e., 4), no further partitioning occurs. When the width of a binary tree node is equal to MinBTSize (i.e., 4), no further horizontal partitioning is considered; similarly, when the height of a binary tree node is equal to MinBTSize, no further vertical partitioning is considered. Thereafter, the leaf nodes of the binary tree may be further processed through the prediction and transform processes without any further partitioning.
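For ease of understanding, the split-eligibility rules above can be sketched as follows, using the example parameter values (MinQTSize = 16, MaxBTSize = 64, MaxBTDepth = 4, MinBTSize = 4). This is an illustrative helper, not part of any normative decoder; the split-direction naming follows the text above (a width at MinBTSize stops further horizontal partitioning).

```python
# Illustrative check of the QTBT binary-tree constraints, with the example values
# MinQTSize=16, MaxBTSize=64, MaxBTDepth=4, MinBTSize=4 from the text above.
MIN_QT_SIZE, MAX_BT_SIZE, MAX_BT_DEPTH, MIN_BT_SIZE = 16, 64, 4, 4

def allowed_bt_splits(width, height, bt_depth):
    if max(width, height) > MAX_BT_SIZE:   # node exceeds MaxBTSize: no binary split
        return []
    if bt_depth >= MAX_BT_DEPTH:           # MaxBTDepth reached: stop partitioning
        return []
    splits = []
    if width > MIN_BT_SIZE:                # width == MinBTSize: no horizontal split
        splits.append("SPLIT_BT_HOR")
    if height > MIN_BT_SIZE:               # height == MinBTSize: no vertical split
        splits.append("SPLIT_BT_VER")
    return splits
```

For example, a 128 × 128 quadtree leaf exceeds MaxBTSize and admits no binary split, while a 4 × 8 node can only be split in one direction.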
Furthermore, the QTBT partitioning scheme supports the ability for luma and chroma to have separate QTBT partition structures. Currently, for P and B slices, the luma CTB and the chroma CTBs in one CTU share the same QTBT partition structure, whereas for I slices, the luma CTB may be divided into luma CUs by one QTBT partition structure and the chroma CTBs may be divided into chroma CUs by another QTBT partition structure. This means that a CU in an I slice consists of a coded block of the luma component or coded blocks of the two chroma components, while a CU in a P or B slice consists of coded blocks of all three color components.
For each CU obtained by partitioning based on the above structure, prediction of the block content may be performed over the entire CU or on sub-blocks, as described below. Such a unit of prediction may be referred to as a prediction unit (i.e., PU) or a sub-partition. In the case of intra prediction (i.e., intra-frame prediction), the size of a PU is typically equal to the size of the CU; in other words, prediction is performed on the entire CU. For inter prediction, the size of a PU may be equal to or smaller than the size of the CU; in other words, in some cases, a CU may be divided into multiple PUs for prediction. Examples of PU sizes smaller than the CU size include the affine prediction mode, the ATMVP prediction mode, the GEO mode, and the like.
In affine prediction mode, a CU may be split into multiple 4 × 4 PUs for prediction, a motion vector may be derived for each 4 × 4PU, and motion compensation may be performed on the 4 × 4 PUs accordingly. In ATMVP prediction mode, a CU may be split into one or more 8 × 8 PUs for prediction, a motion vector may be derived for each 8 × 8PU, and motion compensation may be performed on the 8 × 8 PUs accordingly. In GEO mode, a CU may be divided into two non-rectangular PUs (or non-rectangular sub-partitions), motion vectors may be derived for each PU, and motion compensation may be performed on each non-rectangular PU accordingly, as described in detail below.
The process of predicting a CU using GEO mode mainly includes three parts: a first part: constructing a unidirectional prediction candidate list to derive motion vectors of two non-rectangular sub-partitions partitioned from the current CU; a second part: calculating weighting matrixes (namely sampling point weighting matrixes) for the two non-rectangular sub-partitions, and performing weighted mixing on the two non-rectangular sub-partitions according to the weighting matrixes to obtain a prediction block corresponding to the current CU; and a third part: storing the motion vectors of the two non-rectangular sub-partitions for predicting a subsequent CU according to a GEO mode.
In the first part, a uni-directional prediction candidate list for the current CU is first constructed, wherein the uni-directional prediction candidate list may be derived from a regular merge mode motion vector candidate list, wherein the regular merge mode motion vector candidate list may be constructed from a spatial motion vector candidate, a temporal motion vector candidate, a historical motion vector candidate, an average motion vector candidate, and/or a zero motion vector of the current coding unit.
First, spatial motion vector candidates are selected from the motion vectors of spatially neighboring blocks of the current CU. As shown in fig. 6, a maximum of four motion vectors may be selected from the motion vectors of the neighboring blocks A0, A1, B0, B1, and B2 shown in fig. 6, in the order A1 -> B1 -> B0 -> A0 -> (B2), where the neighboring block at position B2 is considered only when any one of the neighboring blocks at positions A1, B1, B0, A0 is unavailable or intra coded.
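The selection order above can be sketched as follows. This is a hypothetical illustration (dictionary keyed by position, with None standing for an unavailable or intra-coded neighbor), not a normative candidate-derivation process.

```python
# Hypothetical sketch of the spatial-candidate selection order described above:
# up to four motion vectors taken in the order A1 -> B1 -> B0 -> A0, with B2
# consulted only when one of those four is unavailable or intra coded (None here).
def select_spatial_candidates(mv_at):
    chosen = [mv_at[p] for p in ("A1", "B1", "B0", "A0") if mv_at.get(p) is not None]
    if len(chosen) < 4 and mv_at.get("B2") is not None:
        chosen.append(mv_at["B2"])
    return chosen[:4]
```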
Then, when obtaining the temporal motion vector candidates, the scaled motion vector may be derived based on the co-located block belonging to the picture with the smallest POC difference value from the current picture within a given reference picture list. Wherein a flag indicating a reference picture list used to derive the co-located block is explicitly signaled in a slice header. A scaled motion vector for the temporal motion vector candidate of the current CU (i.e. curr _ PU) may be obtained as indicated by the dashed line shown in fig. 7, where the scaled motion vector is scaled from the motion vector of the co-located block col _ PU using POC distances tb and td, where tb represents the POC difference between the reference picture curr _ ref of the current picture curr _ pic and the current picture curr _ pic, and td is defined as the POC difference between the reference picture col _ ref of the co-located picture col _ pic and the co-located picture col _ pic. Further, the reference picture index of the co-located block used to obtain the temporal motion vector candidate is set to zero. Also, for B slices, two motion vectors (where one motion vector is used for reference picture list 0 and the other motion vector is used for reference picture list 1) may be obtained, and the combination of the two motion vectors may be used to obtain bi-prediction candidates.
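The scaling described above amounts to stretching the co-located motion vector by the ratio of the two POC distances. A minimal floating-point sketch is shown below; actual codecs use a fixed-point equivalent, so this form is illustrative only.

```python
# Illustrative floating-point form of the temporal motion-vector scaling above:
# the co-located block's motion vector is scaled by the ratio of the POC
# distances tb (current picture to its reference) and td (co-located picture
# to its reference). Real implementations use fixed-point arithmetic.
def scale_temporal_mv(mv_col, tb, td):
    return (mv_col[0] * tb / td, mv_col[1] * tb / td)
```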
Further, as shown in fig. 8, in obtaining the temporal motion vector candidate, the position of the co-located block may be selected from two candidate positions C3 and H. If the co-located block at candidate position H is not available, is intra coded, or is outside the current CTU, the co-located block at candidate position C3 is used to obtain the temporal motion vector candidate. Otherwise, the temporal motion vector candidate is obtained from the co-located block at candidate position H.
Thereafter, after obtaining the spatial motion vector candidate and the temporal motion vector candidate, the historical motion vector candidate may be added to the regular merge mode motion vector candidate list. The historical motion vector candidates represent motion vectors from previously encoded CUs, which are kept in a separate motion vector list and managed based on certain rules.
If the regular merge mode motion vector candidate list is still not full after the historical motion vector candidates are added, average motion vector candidates are added to the list. An average motion vector candidate is obtained by averaging motion vector candidates already present in the regular merge mode motion vector candidate list: for example, two motion vector candidates may be taken from the list each time according to a predetermined rule, and their average motion vector is then computed and added to the list.
If the regular merge mode motion vector candidate list is not yet filled after adding the average motion vector to the regular merge mode motion vector candidate list, a zero motion vector is added to the regular merge mode motion vector candidate list to be filled.
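The two padding steps above (pairwise averages, then zero motion vectors) can be sketched as follows. The pairing rule used here (consecutive existing candidates) is an assumption for illustration; the text only states that a predetermined rule is used.

```python
# Hypothetical sketch of padding a merge candidate list: first pairwise averages
# of candidates already in the list (assumed pairing: consecutive entries),
# then zero motion vectors until the list is full.
def fill_merge_list(candidates, list_size):
    out = list(candidates)[:list_size]
    i = 0
    while len(out) < list_size and i + 1 < len(candidates):
        a, b = candidates[i], candidates[i + 1]       # assumed pairing order
        out.append(((a[0] + b[0]) / 2, (a[1] + b[1]) / 2))
        i += 1
    while len(out) < list_size:                        # zero-motion-vector fill
        out.append((0, 0))
    return out
```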
After the regular merge mode motion vector candidate list is obtained, the uni-directional prediction candidate list for the current CU may be obtained from it. Since the regular merge mode motion vector candidate list is a bi-directional list, in constructing the uni-directional prediction candidate list for the current CU, one motion vector needs to be selected from the two motion vectors of each candidate in the regular merge mode motion vector candidate list. Therefore, as shown in fig. 9, assuming that n denotes the index of the uni-directional prediction motion vector in the uni-directional prediction candidate list and X denotes the parity of n, the LX motion vector of the n-th merge candidate in the regular merge mode motion vector candidate list is used as the n-th uni-directional prediction motion vector in the GEO mode. These motion vectors are marked with an "x" in fig. 9. In case there is no corresponding LX motion vector for candidate n, the L(1-X) motion vector of the same candidate n is used instead as the uni-directional prediction motion vector for the GEO mode.
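The parity rule above can be sketched as follows (a hypothetical illustration, with each merge candidate represented as a dictionary of its L0/L1 motion vectors and None for an absent one):

```python
# Sketch of the parity rule: candidate n contributes its L(n % 2) motion vector,
# falling back to the other list when that one is absent.
def build_uni_pred_list(merge_list):
    uni = []
    for n, cand in enumerate(merge_list):
        lx, other = ("L1", "L0") if n % 2 else ("L0", "L1")
        uni.append(cand[lx] if cand[lx] is not None else cand[other])
    return uni
```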
After the uni-directional prediction candidate list is obtained, motion vectors of two non-rectangular sub-partitions partitioned from the current coding unit are derived using the uni-directional prediction candidate list.
In particular, in GEO mode, the current CU may be divided into two non-rectangular sub-partitions according to a partition boundary, where the partition boundary may be represented by a predefined angle φ and an offset distance ρ. The predefined angle φ represents the clockwise angle from the x-axis to the normal vector of the partition boundary, and the offset distance ρ represents the shift from the center point of the CU to the partition boundary. Fig. 10 shows the 20 possible predefined angles for the GEO mode. Each partition manner may be represented by a predefined angle φ and an offset distance ρ, where the slope tan(φ) of the predefined angle φ may be 0, ±1/4, ±1/2, ±1, ±2, ±4, or ±∞ (see Table 4 below), and the offset distance ρ may be 0, 1/8, 1/4, or 1/2 of the height or width of the current CU: if the aspect ratio of the current CU is greater than 1 (i.e., the current CU is a vertical rectangle), the offset distance ρ may be 0, 1/8, 1/4, or 1/2 of the height, and if the aspect ratio of the current CU is less than or equal to 1 (i.e., the current CU is a square or a horizontal rectangle), the offset distance ρ may be 0, 1/8, 1/4, or 1/2 of the width. For example, in the example shown in fig. 11, (a) of fig. 11 shows a partition manner with an aspect ratio of 1/2 and a slope of 1/2 in the GEO mode, and (b) of fig. 11 shows the partition manner obtained by shifting the partition boundary of (a) of fig. 11 to the right by 1/8 of the width.
At the encoder side, in determining which of the 64 partition manners of the GEO mode is the optimal partition manner (i.e., the final partition manner), two motion vectors may be selected from the obtained uni-directional prediction candidate list and used as the motion vectors of the first non-rectangular sub-partition and the second non-rectangular sub-partition partitioned from the current CU according to each of the 64 partition manners. The first and second non-rectangular sub-partitions are then motion compensated according to the two selected motion vectors to obtain a first prediction block for the first non-rectangular sub-partition and a second prediction block for the second non-rectangular sub-partition, and the corresponding rate-distortion cost is calculated. The optimal partition manner is selected according to the calculated rate-distortion costs, and the two motion vectors corresponding to the optimal partition manner are obtained at the same time. In addition, the encoder may transmit information related to the optimal partition manner and information on the corresponding two motion vectors (e.g., flag information indicating the positions of the two motion vectors in the uni-directional prediction candidate list) to the decoder. As shown in Table 2, the GEO mode has 64 partition manners, which may be indicated by the partition mode index merge_gpm_partition_idx. Each partition mode index merge_gpm_partition_idx corresponds to an angle index angleIdx and a distance index distanceIdx; thus, the 64 partition manners are configured by combinations of 20 angle indexes angleIdx and 4 distance indexes distanceIdx, where the distance indexes distanceIdx 0, 1, 2, and 3 represent offset distances ρ of 0, 1/8, 1/4, and 1/2, respectively.
The encoder may send the partition mode index merge_gpm_partition_idx to the decoder. Thus, at the decoder side, the optimal partition manner may be determined from Table 2 according to the partition mode index merge_gpm_partition_idx received from the encoder, and the motion vectors of the first and second non-rectangular sub-partitions partitioned from the current CU may be derived using the uni-directional prediction candidate list according to the received information on the corresponding two motion vectors.
[ Table 2]
merge_gpm_partition_idx 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
angleIdx 0 0 2 2 2 2 3 3 3 3 4 4 4 4 5 5
distanceIdx 1 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1
merge_gpm_partition_idx 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
angleIdx 5 5 8 8 11 11 11 11 12 12 12 12 13 13 13 13
distanceIdx 2 3 1 3 0 1 2 3 0 1 2 3 0 1 2 3
merge_gpm_partition_idx 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
angleIdx 14 14 14 14 16 16 18 18 18 19 19 19 20 20 20 21
distanceIdx 0 1 2 3 1 3 1 2 3 1 2 3 1 2 3 1
merge_gpm_partition_idx 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
angleIdx 21 21 24 24 27 27 27 28 28 28 29 29 29 30 30 30
distanceIdx 2 3 1 3 1 2 3 1 2 3 1 2 3 1 2 3
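As an illustration of how Table 2 is used, a few of its rows may be transcribed into a lookup. The sketch below lists only 8 of the 64 entries, and the names are hypothetical:

```python
# A few rows of Table 2, transcribed for illustration: merge_gpm_partition_idx
# maps to the pair (angleIdx, distanceIdx). The full table has 64 entries.
GPM_PARTITION_TABLE = {
    0: (0, 1), 1: (0, 3), 2: (2, 0), 3: (2, 1),
    10: (4, 0), 24: (12, 0), 36: (16, 1), 63: (30, 3),
}

def gpm_angle_distance(merge_gpm_partition_idx):
    # returns (angleIdx, distanceIdx)
    return GPM_PARTITION_TABLE[merge_gpm_partition_idx]
```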
The first part of the process of predicting a CU using GEO mode is described above. The following describes the improvement of the second and third parts of the process of predicting a CU using GEO mode.
Fig. 12 is a flowchart illustrating a method of predicting using a GEO mode according to an exemplary embodiment of the present disclosure.
As shown in fig. 12, in step S1201, when the current coding unit is predicted using the GEO mode, a weighting matrix is determined according to a predefined angle and an offset distance of the final partition manner in the GEO mode, wherein the slope of the predefined angle is represented with 1/m pixel precision, m being a positive integer less than or equal to 4, for example, 1, 2, or 4.
The step of determining the weighting matrix may comprise: the weight applied to one prediction sample in the first prediction block with 1/m pixel precision is determined according to the offset distance and the predefined angle of the final partition mode and the sample position of the one prediction sample in the first prediction block of the first non-rectangular sub-partition.
In particular, the step of determining a weight applied to a first prediction sample in a first prediction block with 1/m pixel precision may comprise: determining a weight index weightIdx for deriving a weight to be applied to the one prediction sample in the first prediction block according to the offset distance and the predefined angle of the final partition mode and the sample position of the one prediction sample; deriving a weight value applied to the one prediction sample in the first prediction block according to the weight index weightIdx.
The process of determining the weight index weightIdx will be described in detail below. First, the step of determining the weight index weightIdx used to derive the weight applied to the one prediction sample in the first prediction block may include: determining an x-direction offset and a y-direction offset from the predefined angle and the offset distance.
Specifically, when it is determined that shiftHor is 0 according to the following equation (1), the values of the x-direction offset amount offsetX and the y-direction offset amount offsetY may be determined according to the following equations (2) and (3):
shiftHor=(angleIdx%16==8||(angleIdx%16!=0&&hwRatio>=1))?0:1 (1)
offsetX=(-nW)>>1 (2)
offsetY=((-nH)>>1)+(angleIdx<16?(distanceIdx*nH)>>3:-((distanceIdx*nH)>>3)) (3)
where angleIdx is an angle index representing the predefined angle, hwRatio represents the aspect ratio of the current CU, and distanceIdx is a distance index representing the offset distance of the final partition manner; distanceIdx may be obtained by looking up Table 2 according to the final partition manner determined in the first part above.
When shiftHor is determined to be 1 according to the above equation (1), the values of offset x and offset y may be determined according to the following equations (4) and (5):
offsetX=((-nW)>>1)+(angleIdx<16?(distanceIdx*nW)>>3:-((distanceIdx*nW)>>3)) (4)
offsetY=(-nH)>>1 (5)
where nW and nH represent the width and height, respectively, of the luminance signal of the current CU, and may be determined according to equations (6) and (7) below, respectively:
nW=(cIdx==0)?nCbW:nCbW*SubWidthC (6)
nH=(cIdx==0)?nCbH:nCbH*SubHeightC (7)
where cIdx is an index representing the color component: cIdx = 0 represents the luma component, and cIdx = 1 and 2 represent the chroma components Cb and Cr, respectively. nCbW and nCbH denote the width and height, respectively, of the current color component of the current CU, and SubWidthC and SubHeightC denote the chroma subsampling factors in the horizontal and vertical directions, respectively, where Table 3 below shows the values of SubWidthC and SubHeightC derived from the chroma format.
[ Table 3]

chroma format   SubWidthC   SubHeightC
monochrome          1           1
4:2:0               2           2
4:2:2               2           1
4:4:4               1           1
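Equations (6) and (7) together with Table 3 can be sketched as follows. The chroma-format values used below are the standard subsampling factors; the function name is illustrative.

```python
# Equations (6)-(7): width/height of the current CU on the luma sample grid for
# the current color component, with SubWidthC/SubHeightC per Table 3.
SUBSAMPLING = {"monochrome": (1, 1), "4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}

def luma_dims(nCbW, nCbH, cIdx, chroma_format):
    sub_w, sub_h = SUBSAMPLING[chroma_format]
    nW = nCbW if cIdx == 0 else nCbW * sub_w   # equation (6)
    nH = nCbH if cIdx == 0 else nCbH * sub_h   # equation (7)
    return nW, nH
```

For a 4:2:0 chroma block of size 16 × 8, for example, the luma-grid dimensions are 32 × 16.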
Further, the step of determining the weight index weightIdx used to derive the weight applied to the one prediction sample in the first prediction block may further include: retrieving the geometric partition distance array at 1/m pixel precision based on the angle index angleIdx of the predefined angle to obtain the x-direction geometric partition distance and the y-direction geometric partition distance, respectively. Specifically, the x-direction geometric partition distance and the y-direction geometric partition distance may be obtained as disLut[angleIdx] and disLut[(angleIdx + 8) % 32], respectively, where disLut[·] represents the geometric partition distance array at 1/m pixel precision. For example, Table 4 below shows an example of the relationship between the geometric partition distance array disLut[·] and the slope Slope of the predefined angle at 1/4 pixel precision, where idx is the index representing the 20 predefined angles of the GEO mode shown in fig. 10.
[ Table 4]
idx 0 2 3 4 5 6 8 10 11 12 13 14
disLut[idx] 4 4 4 2 2 1 0 -1 -2 -2 -4 -4
Slope 0 -1/4 -1/2 -1 -2 -4 -∞ 4 2 1 1/2 1/4
idx 16 18 19 20 21 22 24 26 27 28 29 30
disLut[idx] 4 4 4 2 2 1 0 -1 -2 -2 -4 -4
Slope 0 -1/4 -1/2 -1 -2 -4 -∞ 4 2 1 1/2 1/4
Further, the step of determining a weight index weightIdx for deriving a weight to be applied to the one prediction sample in the first prediction block may further include: and determining the weight index weightIdx according to the offset in the x direction and the offset in the y direction, the geometric partitioning distance in the x direction and the geometric partitioning distance in the y direction, and the sampling point position of the predicted sampling point. Specifically, the weight index weightIdx is calculated according to the following equation (8):
weightIdx=(((xL+offsetX)<<1)+1)*disLut[angleIdx]+(((yL+offsetY)<<1)+1)*disLut[(angleIdx+8)%32] (8)
Wherein xL and yL denote x-direction coordinates and y-direction coordinates of a sample position of the one prediction sample in the first prediction block, respectively, wherein xL and yL can be determined by the following equations (9) and (10):
xL=(cIdx==0)?x:x*SubWidthC (9)
yL=(cIdx==0)?y:y*SubHeightC (10)
where x and y represent the x-coordinate and the y-coordinate, respectively, of the luma sample of the current color component of the current CU, where x = 0..nCbW-1 and y = 0..nCbH-1.
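For ease of understanding, equations (1) through (5) and (8) may be sketched together as follows, using the 1/4-pixel disLut values of Table 4. The interpretation hwRatio = nH/nW is an assumption (the text only calls it the aspect ratio), and the names are illustrative, not normative.

```python
# Illustrative computation of weightIdx per equations (1)-(5) and (8), with the
# 1/4-pixel disLut of Table 4. hwRatio is assumed to mean nH/nW.
DIS_LUT = {0: 4, 2: 4, 3: 4, 4: 2, 5: 2, 6: 1, 8: 0, 10: -1, 11: -2, 12: -2,
           13: -4, 14: -4, 16: 4, 18: 4, 19: 4, 20: 2, 21: 2, 22: 1, 24: 0,
           26: -1, 27: -2, 28: -2, 29: -4, 30: -4}

def weight_index(xL, yL, nW, nH, angleIdx, distanceIdx):
    hw_ratio = nH / nW
    # equation (1)
    shift_hor = 0 if (angleIdx % 16 == 8 or (angleIdx % 16 != 0 and hw_ratio >= 1)) else 1
    if shift_hor == 0:
        # equations (2)-(3)
        offsetX = (-nW) >> 1
        offsetY = ((-nH) >> 1) + ((distanceIdx * nH) >> 3 if angleIdx < 16
                                  else -((distanceIdx * nH) >> 3))
    else:
        # equations (4)-(5)
        offsetX = ((-nW) >> 1) + ((distanceIdx * nW) >> 3 if angleIdx < 16
                                  else -((distanceIdx * nW) >> 3))
        offsetY = (-nH) >> 1
    # equation (8)
    return ((((xL + offsetX) << 1) + 1) * DIS_LUT[angleIdx]
            + (((yL + offsetY) << 1) + 1) * DIS_LUT[(angleIdx + 8) % 32])
```

The sign and magnitude of weightIdx indicate, at doubled sample resolution, on which side of the partition boundary the sample lies and how far from it.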
Deriving the weight value applied to the one prediction sample in the first prediction block from the weight index weightIdx may include: determining, according to the weight index weightIdx, the distance weightIdxL from the one prediction sample to the partition boundary corresponding to the final partition manner at 1/m pixel precision.
Specifically, the distance weightIdxL may be calculated according to the following equation (11):
weightIdxL = partFlip ? 2^m + weightIdx : 2^m - weightIdx (11)
where partFlip = (angleIdx >= 13 && angleIdx <= 27) ? 0 : 1.
furthermore, the step of deriving a weight value applied to the one prediction sample in the first prediction block according to the weight index weightIdx may further include: determining a weight wValue applied to the one prediction sample in the first prediction block with 1/m pixel precision using the distance weightIdxL according to equation (12) below:
wValue = Clip3(0, 8, (weightIdxL + 2^(m-3)) >> (m-2)) (12)
where Clip3() is a clipping function that limits the value of (weightIdxL + 2^(m-3)) >> (m-2) to the range from 0 to 8. Specifically, if the value of (weightIdxL + 2^(m-3)) >> (m-2) is less than 0, the output of the function is 0; if the value is greater than 8, the output of the function is 8; and if 0 <= (weightIdxL + 2^(m-3)) >> (m-2) <= 8, the output of the function is (weightIdxL + 2^(m-3)) >> (m-2) itself.
In addition, the step of determining the weighting matrix may further include: the weight applied to the prediction samples at the same position as the one prediction sample in the second prediction block with 1/m pixel accuracy is obtained by a difference between a fixed value and the weight of the one prediction sample in the first prediction block. Specifically, a fixed value of 8 may be subtracted from the weight of the one prediction sample in the first prediction block to obtain the weight applied to the prediction sample at the same position as the one prediction sample in the second prediction block.
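The derivation of the pair of sample weights can be sketched as follows for m = 4 (1/4 pixel precision). The form of equation (11) used below (weightIdxL = 2^m ± weightIdx, with partFlip derived from angleIdx) is an assumption chosen to be consistent with the clipping range of equation (12); the names are illustrative.

```python
# Sketch of equations (11)-(12) for m = 4: distance weightIdxL from the boundary,
# then the clipped weight wValue for the first prediction block and its
# complement (8 - wValue) for the second. Equation (11)'s form is assumed.
def sample_weights(weightIdx, angleIdx, m=4):
    part_flip = 0 if 13 <= angleIdx <= 27 else 1
    weightIdxL = (1 << m) + weightIdx if part_flip else (1 << m) - weightIdx
    w = (weightIdxL + (1 << (m - 3))) >> (m - 2)      # equation (12), before Clip3
    wValue = min(8, max(0, w))                        # Clip3(0, 8, ...)
    return wValue, 8 - wValue   # weights for the first and second prediction block
```

Samples far on one side of the boundary receive weight 8 (the other block contributes 0), samples far on the other side receive 0, and samples near the boundary receive intermediate weights.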
In step S1202, a first prediction block of a first non-rectangular sub-partition and a second prediction block of a second non-rectangular sub-partition are weighted-mixed according to the weighting matrix to obtain a prediction block corresponding to the current coding unit, wherein the first non-rectangular sub-partition and the second non-rectangular sub-partition are partitioned from the current coding unit according to the final partitioning manner.
Specifically, the sample value pbSamples[x][y] for each pixel (x, y) in the prediction block corresponding to the current CU may be calculated according to the following equation (13):
pbSamples[x][y] = Clip3(0, (1 << BitDepth) - 1, (predSamplesLA[x][y]*wValue + predSamplesLB[x][y]*(8 - wValue) + offset1) >> shift1) (13)
where predSamplesLA[x][y] and predSamplesLB[x][y] denote the predictors at coordinates (x, y) in the first prediction block and the second prediction block, respectively, BitDepth denotes the bit depth of the current picture, shift1 = Max(5, 17 - BitDepth), and offset1 = 1 << (shift1 - 1).
Through the above process, each predictor in the prediction block of the current CU may be determined.
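A per-sample sketch of the weighted blending is given below. The rounding offset offset1 = 1 << (shift1 - 1) is an assumption (the original text garbled this definition), and the predictors are taken at the usual intermediate (higher-than-output) precision.

```python
# Weighted blending of one sample per equation (13). predA/predB are
# intermediate-precision predictors; offset1 = 1 << (shift1 - 1) is assumed.
def blend_sample(predA, predB, wValue, bit_depth=10):
    shift1 = max(5, 17 - bit_depth)
    offset1 = 1 << (shift1 - 1)
    val = (predA * wValue + predB * (8 - wValue) + offset1) >> shift1
    return min((1 << bit_depth) - 1, max(0, val))     # Clip3(0, 2**BitDepth - 1, ...)
```

With equal intermediate predictors (e.g., 8192 at 10-bit output depth), any weight yields the corresponding output sample 512, confirming that the weights sum to 8.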
Further, the method may further include: storing, for each sub-block of a predefined size in the current coding unit, at least one of a first motion vector used to obtain the first prediction block and a second motion vector used to obtain the second prediction block. Specifically, the storing step includes: comparing the distance abs(motionIdx) from the center of the current sub-block to the partition boundary corresponding to the final partition manner with a predetermined threshold, wherein the predetermined threshold is equal to 2^m; and selecting, according to the result of the comparison, the motion vector to be stored for the current sub-block from among the first motion vector and the second motion vector.
Specifically, first, motionIdx is determined according to the following equation (14):
motionIdx = (((4*xSbIdx + offsetX) << 1) + 5) * disLut[angleIdx] + (((4*ySbIdx + offsetY) << 1) + 5) * disLut[(angleIdx + 8) % 32] (14)
where xSbIdx denotes the index of the sub-block of the predetermined size in the horizontal direction of the current CU and ySbIdx denotes the index of the sub-block of the predetermined size in the vertical direction of the current CU, where the predetermined size may be 4 × 4, xSbIdx = 0 … numSbX-1, ySbIdx = 0 … numSbY-1, and numSbX and numSbY denote the number of sub-blocks of the predetermined size in the horizontal direction and in the vertical direction of the current CU, respectively.
Then, the motion information storage type of the current sub-block is determined according to the comparison result between the calculated distance abs (motionidx) and the predetermined threshold. Specifically, the motion information storage type of the current subblock may be determined as one of three motion information storage types according to the following equations (15) and (16):
sType = abs(motionIdx) < 2^m ? 2 : (motionIdx <= 0 ? (1 - partIdx) : partIdx) (15)
partIdx = (angleIdx >= 13 && angleIdx <= 27) ? 1 : 0 (16)
wherein sType denotes the motion information storage type of the current sub-block, and in equation (15), the predetermined threshold is equal to 2^m. Further, sType = 2 indicates that the current sub-block is located within the width range of the partition boundary, while sType = 0 or 1 indicates that the current sub-block is not located within the width range of the partition boundary; in other words, the current sub-block is located in the area of the first non-rectangular sub-partition outside the width range of the partition boundary, or in the area of the second non-rectangular sub-partition outside the width range of the partition boundary. After determining the motion information storage type sType of the current sub-block, the motion vector to be stored for the current sub-block may be selected from among the first motion vector and the second motion vector based on sType, as follows:
When sType is 0, the following operations are performed:
predFlagL0=(predListFlagA==0)?1:0
predFlagL1=(predListFlagA==0)?0:1
refIdxL0=(predListFlagA==0)?refIdxA:-1
refIdxL1=(predListFlagA==0)?-1:refIdxA
mvL0[0]=(predListFlagA==0)?mvA[0]:0
mvL0[1]=(predListFlagA==0)?mvA[1]:0
mvL1[0]=(predListFlagA==0)?0:mvA[0]
mvL1[1]=(predListFlagA==0)?0:mvA[1]
where predListFlagA indicates the prediction direction of partition A: if predListFlagA is 0, partition A is predicted from the L0 direction, and if predListFlagA is 1, partition A is predicted from the L1 direction. predFlagL0 and predFlagL1 indicate the prediction list L0 utilization flag and the prediction list L1 utilization flag, respectively. refIdxL0 and refIdxL1 denote the reference picture indexes in the L0 direction and the L1 direction, respectively. refIdxA denotes the reference picture index of partition A. mvL0 and mvL1 denote the L0-direction motion vector and the L1-direction motion vector, respectively. mvA denotes the motion vector of partition A.
When sType is 1, or when sType is 2 and predListFlagA + predListFlagB is not equal to 1, the following operations may be performed:
predFlagL0=(predListFlagB==0)?1:0
predFlagL1=(predListFlagB==0)?0:1
refIdxL0=(predListFlagB==0)?refIdxB:-1
refIdxL1=(predListFlagB==0)?-1:refIdxB
mvL0[0]=(predListFlagB==0)?mvB[0]:0
mvL0[1]=(predListFlagB==0)?mvB[1]:0
mvL1[0]=(predListFlagB==0)?0:mvB[0]
mvL1[1]=(predListFlagB==0)?0:mvB[1]
where predListFlagB indicates the prediction direction of partition B: if predListFlagB is 0, partition B is predicted from the L0 direction, and if predListFlagB is 1, partition B is predicted from the L1 direction. refIdxB denotes the reference picture index of partition B. mvB denotes the motion vector of partition B.
When sType is 2 and predListFlagA + predListFlagB is equal to 1, the following operations may be performed:
predFlagL0=1
predFlagL1=1
refIdxL0=(predListFlagA==0)?refIdxA:refIdxB
refIdxL1=(predListFlagA==0)?refIdxB:refIdxA
mvL0[0]=(predListFlagA==0)?mvA[0]:mvB[0]
mvL0[1]=(predListFlagA==0)?mvA[1]:mvB[1]
mvL1[0]=(predListFlagA==0)?mvB[0]:mvA[0]
mvL1[1]=(predListFlagA==0)?mvB[1]:mvA[1]
then, for each pixel (x ∈ {0,1,2,3}, y ∈ {0,1,2,3}) in each 4 × 4 block, the following operations are performed to store the motion vector:
mvL0[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=mvL0
mvL1[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=mvL1
mvDmvrL0[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=mvL0
mvDmvrL1[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=mvL1
refIdxL0[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=refIdxL0
refIdxL1[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=refIdxL1
predFlagL0[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=predFlagL0
predFlagL1[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=predFlagL1
bcwIdx[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=0
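The storage decision above, combining equations (15) and (16) with the three listed cases, can be sketched as follows. This is an illustrative simplification (tuples instead of per-pixel arrays, hypothetical function names), not a normative storage process.

```python
# Sketch of the motion-vector storage decision: sType per equations (15)-(16)
# (threshold 2**m), then selection of the stored motion information per the
# three cases listed above.
def storage_type(motionIdx, angleIdx, m=4):
    partIdx = 1 if 13 <= angleIdx <= 27 else 0        # equation (16)
    if abs(motionIdx) < (1 << m):                     # within the boundary width
        return 2
    return (1 - partIdx) if motionIdx <= 0 else partIdx

def stored_motion(sType, predListFlagA, predListFlagB, mvA, mvB, refIdxA, refIdxB):
    """Returns (predFlagL0, predFlagL1, refIdxL0, refIdxL1, mvL0, mvL1)."""
    zero = (0, 0)
    if sType == 0:                                    # keep partition A's motion
        flag, mv, ref = predListFlagA, mvA, refIdxA
    elif sType == 1 or predListFlagA + predListFlagB != 1:
        flag, mv, ref = predListFlagB, mvB, refIdxB   # keep partition B's motion
    else:                          # sType == 2, different lists: store both (bi)
        if predListFlagA == 0:
            return 1, 1, refIdxA, refIdxB, mvA, mvB
        return 1, 1, refIdxB, refIdxA, mvB, mvA
    if flag == 0:                                     # uni-prediction from L0
        return 1, 0, ref, -1, mv, zero
    return 0, 1, -1, ref, zero, mv
```

Sub-blocks near the boundary (sType = 2) store bi-directional motion when the two partitions predict from different lists; otherwise a single uni-directional motion vector is kept.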
According to the prediction scheme using the GEO mode disclosed in the present disclosure, by representing the slope of the predefined angle of the GEO mode with a 1/m pixel precision that is coarser than 1/8 pixel precision, the number of bits required by the calculation logic and the memory during prediction using the GEO mode can be reduced, thereby reducing the complexity of encoding and decoding and, in turn, the power consumption.
Fig. 13 is a block diagram illustrating an apparatus 1300 for prediction using GEO mode according to an exemplary embodiment of the present disclosure.
Referring to fig. 13, an apparatus 1300 for prediction using a GEO mode according to an exemplary embodiment of the present disclosure may include a calculation unit 1301 and a generation unit 1302.
When the current coding unit is predicted using the GEO mode, the calculation unit 1301 may determine the weighting matrix according to a predefined angle and an offset distance of the final partition manner in the GEO mode, wherein the slope of the predefined angle is represented with 1/m pixel precision, m being a positive integer less than or equal to 4, for example, 1, 2, or 4.
The calculation unit 1301 may be configured to determine the weighting matrices applied to the first prediction block and the second prediction block according to the following operations: determining a weight to be applied to one prediction sample in the first prediction block with 1/m pixel precision according to the offset distance, the predefined angle, and a sample position of the one prediction sample in the first prediction block.
In particular, the calculation unit 1301 may be configured to determine the weight applied to the first prediction samples in the first prediction block with 1/m pixel precision according to the following: determining a weight index weightIdx for deriving a weight to be applied to the one prediction sample in the first prediction block according to the offset distance and the predefined angle of the final partition mode and the sample position of the one prediction sample; deriving a weight value applied to the one prediction sample in the first prediction block according to the weight index weightIdx.
First, the calculation unit 1301 may be configured to determine the weight index weightIdx used to derive the weight applied to the one prediction sample in the first prediction block by: determining an x-direction offset and a y-direction offset from the predefined angle and the offset distance; retrieving a geometric partition distance array at 1/m pixel precision based on the angle index of the predefined angle to obtain the geometric partition distance in the x direction and the geometric partition distance in the y direction, respectively; and determining the weight index weightIdx according to the x-direction offset and the y-direction offset, the geometric partition distances in the x and y directions, and the sample position of the prediction sample. Since this has been described in detail with reference to equations (1) to (10) in describing fig. 12 above, it is not repeated here.
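As a rough illustration of the weight-index derivation just described, the sketch below mirrors a VVC-style computation. The function name, the look-up table `DIST_LUT` and its contents, and the index arithmetic are all illustrative assumptions, not values taken from this disclosure.

```python
# Illustrative geometric-partition distance table. The values below are
# placeholders shaped like a VVC-style distance LUT and are NOT the
# actual table defined by this disclosure.
DIST_LUT = [8, 8, 8, 4, 4, 2, 1, 0, -1, -2, -4, -4, -8, -8, -8, -8,
            -8, -8, -8, -4, -4, -2, -1, 0, 1, 2, 4, 4, 8, 8, 8, 8]

def weight_index(x, y, offset_x, offset_y, angle_idx):
    """Hypothetical weightIdx for the sample at (x, y): combine the x/y
    offsets with the x- and y-direction geometric partition distances
    retrieved from the table by the angle index."""
    dist_x = DIST_LUT[(angle_idx + 8) % 32]  # geometric partition distance, x direction
    dist_y = DIST_LUT[angle_idx % 32]        # geometric partition distance, y direction
    return ((((x + offset_x) << 1) + 1) * dist_x
            + (((y + offset_y) << 1) + 1) * dist_y)
```

The result is a signed, fixed-point distance of the sample from the partition boundary, which the next step maps to a blending weight.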
Then, the calculation unit 1301 may be configured to derive the weight value applied to the one prediction sample in the first prediction block as follows: determining, at 1/m pixel precision, the distance weightIdxL from the one prediction sample to the partition boundary corresponding to the final partition manner according to the weight index weightIdx; and using the distance weightIdxL to determine the weight wValue applied to the one prediction sample in the first prediction block according to equation (12) above, where the distance weightIdxL may be determined according to equation (11). Since this has been described in detail with reference to equations (11) and (12) in describing fig. 12 above, it is not repeated here.
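Equation (12), as reconstructed in claim 5 below, maps the signed distance weightIdxL to a blending weight in [0, 8]. A minimal sketch follows, assuming m = 4 so that the shift amount m − 2 and the rounding offset 2^(m−3) are both well defined; for smaller m the formula would need a different guard.

```python
def clip3(lo, hi, v):
    """VVC-style Clip3: clamp v to the inclusive range [lo, hi]."""
    return max(lo, min(hi, v))

def weight_value(weight_idx_l, m=4):
    """wValue = Clip3(0, 8, (weightIdxL + 2^(m-3)) >> (m-2)).
    The rounding offset 2^(m-3) is half of the divisor 2^(m-2),
    giving round-to-nearest behavior. Assumes m >= 3."""
    return clip3(0, 8, (weight_idx_l + (1 << (m - 3))) >> (m - 2))
```

Samples far on one side of the boundary saturate at 8, samples far on the other side at 0, and samples near the boundary get intermediate weights.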
Furthermore, the calculation unit 1301 may be further configured to determine the weighting matrix applied to the first prediction block and the second prediction block as follows: the weight applied, with 1/m pixel precision, to the prediction sample at the same position as the one prediction sample in the second prediction block is obtained as the difference between a fixed value and the weight of the one prediction sample in the first prediction block. Specifically, the weight of the one prediction sample in the first prediction block may be subtracted from the fixed value 8 to obtain the weight applied to the prediction sample at the same position in the second prediction block.
The generating unit 1302 may be configured to perform a weighted mixing of a first prediction block of a first non-rectangular sub-partition and a second prediction block of a second non-rectangular sub-partition according to the weighting matrix to obtain a prediction block corresponding to the current coding unit, wherein the first non-rectangular sub-partition and the second non-rectangular sub-partition are partitioned from the current coding unit according to the final partitioning manner. Since this has been described in detail above with reference to equation (13) in describing fig. 12, it is not repeated here.
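A sketch of the weighted mixing performed by the generating unit, assuming the common fixed-point form (w·p1 + (8 − w)·p2 + 4) >> 3 for weights in [0, 8]; the exact rounding and shift of equation (13) in the disclosure may differ.

```python
def blend_blocks(pred1, pred2, weights):
    """Per-sample weighted mix of two equally sized prediction blocks
    using a weight matrix with entries in [0, 8]; the +4 and >> 3
    implement round-to-nearest division by 8 in fixed point."""
    return [[(w * p1 + (8 - w) * p2 + 4) >> 3
             for p1, p2, w in zip(row1, row2, wrow)]
            for row1, row2, wrow in zip(pred1, pred2, weights)]
```

With weight 8 the output equals the first prediction block, with weight 0 the second, and with weight 4 the rounded average, which is the smooth transition across the partition boundary that GEO blending is designed to produce.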
The apparatus 1300 may further include a motion information storage unit (not shown), which may be configured to: store, for each sub-block of a predefined size in the current coding unit, at least one of a first motion vector used to obtain the first prediction block and a second motion vector used to obtain the second prediction block. The motion information storage unit may perform the storing operation by: comparing a distance from the center of the current sub-block to the partition boundary corresponding to the final partition manner with a predetermined threshold, where the predetermined threshold is equal to 2^m; and selecting, from the first and second motion vectors, the motion vector to be stored for the current sub-block according to a result of the comparison. Since this has been described in detail above with reference to fig. 12, it is not repeated here.
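The storage rule compares a signed sub-block-center distance against the threshold 2^m. Which vector(s) are kept on each side of the threshold is an assumption in this sketch (the disclosure defers the details to fig. 12):

```python
def motion_to_store(center_dist, m, mv1, mv2):
    """Pick the motion information stored for a sub-block by comparing
    the signed distance from the sub-block center to the partition
    boundary against the threshold 2^m (computed as 1 << m)."""
    threshold = 1 << m
    if center_dist > threshold:      # well inside the first partition
        return (mv1,)
    if center_dist < -threshold:     # well inside the second partition
        return (mv2,)
    return (mv1, mv2)                # near the boundary: keep both
```

Sub-blocks deep inside one partition store only that partition's motion vector, while sub-blocks straddling the boundary keep both, which later coding tools can read back as bi-directional motion.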
According to the above scheme for prediction using the GEO mode, the number of bits in the calculation logic and in memory can be reduced when predicting with the GEO mode, the encoding and decoding complexity is reduced, and the power consumption is further reduced.
Fig. 14 is a block diagram of an electronic device 1400 according to an example embodiment of the present disclosure.
Referring to fig. 14, the electronic device 1400 comprises at least one memory 1401 and at least one processor 1402, the at least one memory 1401 having stored therein a set of computer-executable instructions that, when executed by the at least one processor 1402, perform a method of predicting with GEO mode according to an exemplary embodiment of the present disclosure.
By way of example, the electronic device 1400 may be a PC, a tablet device, a personal digital assistant, a smartphone, or another device capable of executing the set of instructions described above. The electronic device 1400 need not be a single electronic device; it can be any collection of devices or circuits that can execute the above instructions (or instruction sets) individually or in combination. The electronic device 1400 may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces with local or remote devices (e.g., via wireless transmission).
In the electronic device 1400, the processor 1402 may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special-purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
The processor 1402 may execute instructions or code stored in the memory 1401, wherein the memory 1401 may also store data. The instructions and data may also be transmitted or received over a network via a network interface device, which may employ any known transmission protocol.
The memory 1401 may be integrated with the processor 1402, e.g. by arranging a RAM or flash memory within an integrated circuit microprocessor or the like. In addition, memory 1401 may comprise a stand-alone device, such as an external disk drive, storage array, or any other storage device usable by a database system. The memory 1401 and the processor 1402 may be operatively coupled or may communicate with each other, e.g. via I/O ports, network connections, etc., such that the processor 1402 can read files stored in the memory.
In addition, the electronic device 1400 may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the electronic device 1400 may be connected to each other via a bus and/or a network.
According to an example embodiment of the present disclosure, there may also be provided a computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform a method of predicting with GEO mode according to the present disclosure. Examples of the computer-readable storage medium herein include: read-only memory (ROM), programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disc storage, a hard disk drive (HDD), a solid-state drive (SSD), card-type memory (such as a multimedia card, a Secure Digital (SD) card, or an eXtreme Digital (XD) card), magnetic tape, a floppy disk, a magneto-optical data storage device, an optical data storage device, a hard disk, a solid-state disk, and any other device configured to store and provide a computer program and any associated data, data files, and data structures to a processor or computer in a non-transitory manner such that the processor or computer can execute the computer program.
The computer program in the computer-readable storage medium described above can be run in an environment deployed in a computer apparatus, such as a client, a host, a proxy device, a server, and the like, and further, in one example, the computer program and any associated data, data files, and data structures are distributed across a networked computer system such that the computer program and any associated data, data files, and data structures are stored, accessed, and executed in a distributed fashion by one or more processors or computers.
According to an exemplary embodiment of the present disclosure, a computer program product may also be provided, in which instructions are executable by a processor of a computer device to perform a method of predicting with GEO mode according to an exemplary embodiment of the present disclosure.
According to the method and apparatus for prediction using the GEO mode described above, by reducing the pixel precision used to represent the GEO slope, the number of bits in the calculation logic and in memory during prediction using the GEO mode can be reduced without affecting coding quality, the encoding and decoding complexity is reduced, and the power consumption is further reduced.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (16)

1. A method for prediction by using a geometric partitioning GEO mode is characterized by comprising the following steps:
when the current coding unit is predicted by using a GEO mode, determining a weighting matrix according to a predefined angle and an offset distance of a final partition mode in the GEO mode, wherein the precision of the slope of the predefined angle is 1/m pixel precision, and m is a positive integer less than or equal to 4;
and performing weighted mixing on a first prediction block of the first non-rectangular sub-partition and a second prediction block of the second non-rectangular sub-partition according to the weighting matrix to obtain a prediction block corresponding to the current coding unit, wherein the first non-rectangular sub-partition and the second non-rectangular sub-partition are partitioned from the current coding unit according to the final partitioning mode.
2. The method of claim 1, wherein the step of determining a weighting matrix comprises:
determining a weight applied to one prediction sample in a first prediction block with 1/m pixel precision according to the offset distance, the predefined angle and a sample position of the one prediction sample in the first prediction block;
the weight applied to the prediction samples at the same position as the one prediction sample in the second prediction block with 1/m pixel accuracy is obtained by a difference between a fixed value and the weight of the one prediction sample in the first prediction block.
3. The method of claim 2, wherein the step of determining the weight applied to one prediction sample in the first prediction block with 1/m pixel precision comprises:
determining a weight index weightIdx for deriving a weight to be applied to the one prediction sample in a first prediction block from the offset distance, the predefined angle and the sample position of the one prediction sample;
deriving a weight value applied to the one prediction sample in the first prediction block according to the weight index weightIdx.
4. The method of claim 3, wherein the step of determining a weight index weightIdx for deriving the weight to be applied to the one prediction sample in the first prediction block comprises:
determining an x-direction offset and a y-direction offset from the predefined angle and the offset distance;
retrieving a geometric partition distance array at 1/m pixel precision based on the angle index of the predefined angle to obtain the geometric partition distance in the x direction and the geometric partition distance in the y direction, respectively;
and determining the weight index weightIdx according to the x-direction offset and the y-direction offset, the geometric partition distances in the x and y directions, and the sample position of the prediction sample.
5. The method of claim 3, wherein deriving the weight value applied to the one prediction sample in the first prediction block from the weight index weightIdx comprises:
determining, at 1/m pixel precision, the distance weightIdxL from the one prediction sample to the partition boundary corresponding to the final partition mode according to the weight index weightIdx;
determining a weight wValue applied to the one prediction sample in the first prediction block with 1/m pixel precision using the distance weightIdxL according to the following equation:
wValue = Clip3(0, 8, (weightIdxL + 2^(m-2)) >> (m-2)).
6. the method of claim 5, wherein the distance weightIdxL is determined according to the equation:
Figure FDA0003002032760000021
wherein the angleIdx is an angle index for representing the predefined angle.
7. The method of claim 1, further comprising:
storing, for each sub-block of a predefined size in a current coding unit, at least one of a first motion vector used to obtain a first prediction block and a second motion vector used to obtain a second prediction block,
wherein the step of storing comprises:
comparing the distance from the center of the current sub-block to the partition boundary corresponding to the final partition mode with a predetermined threshold, wherein the predetermined threshold is equal to 2^m; and
selecting a motion vector to be stored for the current sub-block of the first and second motion vectors according to a result of the comparison.
8. An apparatus for prediction using geometric partition GEO mode, comprising:
a computing unit configured to: when the current coding unit is predicted by using a GEO mode, determining a weighting matrix according to a predefined angle and an offset distance of a final partition mode in the GEO mode, wherein the precision of the slope of the predefined angle is 1/m pixel precision, and m is a positive integer less than or equal to 4; and
a generation unit configured to: and performing weighted mixing on a first prediction block of the first non-rectangular sub-partition and a second prediction block of the second non-rectangular sub-partition according to the weighting matrix to obtain a prediction block corresponding to the current coding unit, wherein the first non-rectangular sub-partition and the second non-rectangular sub-partition are partitioned from the current coding unit according to the final partitioning mode.
9. The apparatus of claim 8, wherein the calculation unit is configured to determine the weighting matrix applied to the first prediction block and the second prediction block according to:
Determining a weight applied to one prediction sample in a first prediction block with 1/m pixel precision according to the offset distance, the predefined angle and a sample position of the one prediction sample in the first prediction block;
the weight applied to the prediction samples at the same position as the one prediction sample in the second prediction block with 1/m pixel accuracy is obtained by a difference between a fixed value and the weight of the one prediction sample in the first prediction block.
10. The apparatus of claim 9, wherein the calculation unit is configured to determine the weight applied to one prediction sample in the first prediction block with 1/m pixel precision according to:
determining a weight index weightIdx for deriving a weight to be applied to the one prediction sample in the first prediction block from the offset distance, the predefined angle and the sample position of the one prediction sample;
deriving a weight value applied to the one prediction sample in the first prediction block according to the weight index weightIdx.
11. The apparatus of claim 10, wherein the calculation unit is configured to determine the weight index weightIdx used to derive the weight applied to the one prediction sample in the first prediction block by:
Determining an x-direction offset and a y-direction offset from the predefined angle and the offset distance;
retrieving a geometric partition distance array at 1/m pixel precision based on the angle index of the predefined angle to obtain the geometric partition distance in the x direction and the geometric partition distance in the y direction, respectively;
and determining the weight index weightIdx according to the x-direction offset and the y-direction offset, the geometric partition distances in the x and y directions, and the sample position of the prediction sample.
12. The apparatus of claim 10, wherein the computing unit is configured to derive the weight value applied to the one prediction sample in the first prediction block according to:
determining, at 1/m pixel precision, the distance weightIdxL from the one prediction sample to the partition boundary corresponding to the final partition mode according to the weight index weightIdx;
determining a weight wValue applied to the one prediction sample in the first prediction block with 1/m pixel precision using the distance weightIdxL according to the following equation:
wValue = Clip3(0, 8, (weightIdxL + 2^(m-2)) >> (m-2)).
13. the apparatus of claim 12, wherein the distance weightIdxL is determined according to the following equation:
Figure FDA0003002032760000041
Wherein the angleIdx is an angle index for representing the predefined angle.
14. The apparatus of claim 8, further comprising:
a motion information storage unit configured to: storing, for each sub-block of a predefined size in a current coding unit, at least one of a first motion vector used to obtain a first prediction block and a second motion vector used to obtain a second prediction block,
wherein the motion information storage unit performs the stored operation by:
comparing the distance from the center of the current sub-block to the partition boundary corresponding to the final partition mode with a predetermined threshold, wherein the predetermined threshold is equal to 2m(ii) a And
selecting a motion vector to be stored for the current sub-block of the first and second motion vectors according to a result of the comparison.
15. An electronic device, comprising:
at least one processor;
at least one memory storing computer-executable instructions,
wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the method of predicting with geometrically partitioned GEO modes of any of claims 1 to 7.
16. A computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform the method of predicting with geometry partition GEO mode of any of claims 1 to 7.
CN202110349749.7A 2020-03-31 2021-03-31 Method and device for predicting by using geometric partition GEO mode Pending CN113473133A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063003228P 2020-03-31 2020-03-31
US63/003,228 2020-03-31

Publications (1)

Publication Number Publication Date
CN113473133A true CN113473133A (en) 2021-10-01

Family

ID=77868445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110349749.7A Pending CN113473133A (en) 2020-03-31 2021-03-31 Method and device for predicting by using geometric partition GEO mode

Country Status (1)

Country Link
CN (1) CN113473133A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114885164A (en) * 2022-07-12 2022-08-09 深圳比特微电子科技有限公司 Method and device for determining intra-frame prediction mode, electronic equipment and storage medium
WO2023207646A1 (en) * 2022-04-25 2023-11-02 Mediatek Inc. Method and apparatus for blending prediction in video coding system
WO2024055155A1 (en) * 2022-09-13 2024-03-21 Oppo广东移动通信有限公司 Coding method and apparatus, decoding method and apparatus, and coder, decoder and storage medium


Similar Documents

Publication Publication Date Title
US11368675B2 (en) Method and device for encoding and decoding intra-frame prediction
WO2020200298A1 (en) Interaction between core transform and secondary transform
AU2016253621B2 (en) Method and apparatus for encoding image and method and apparatus for decoding image
CN112740685B (en) Image encoding/decoding method and apparatus, and recording medium storing bit stream
EP2755389B1 (en) Inter prediction method and apparatus therefor
US9100649B2 (en) Method and apparatus for processing a video signal
US11070831B2 (en) Method and device for processing video signal
KR20240066144A (en) Method and apparatus for encoding and decoding using selective information sharing over channels
CN111886861A (en) Image decoding method and apparatus according to block division structure in image coding system
KR20160106022A (en) Apparatus for encoding a moving picture
CN115002457A (en) Image encoding and decoding method and image decoding apparatus
EP3764643B1 (en) Image processing method based on inter prediction mode, and device therefor
CN113473133A (en) Method and device for predicting by using geometric partition GEO mode
US20240187623A1 (en) Video Coding Using Intra Sub-Partition Coding Mode
CN114793281A (en) Method and apparatus for cross component prediction
US11558608B2 (en) On split prediction
US20240155134A1 (en) Method and apparatus for video coding using improved cross-component linear model prediction
CN112567749B (en) Method and apparatus for processing video signal using affine motion prediction
US20220295045A1 (en) Video signal processing method and device
US20220295046A1 (en) Method and device for processing video signal
KR20230144426A (en) Image encoding/decoding method and apparatus
KR20240059507A (en) Method and apparatus for video encoding to imporve throughput and recording medium for storing bitstream
WO2023055968A1 (en) Methods and devices for decoder-side intra mode derivation
US20200267414A1 (en) Method and device for filtering image in image coding system
KR20240066274A (en) Sign prediction for block-based video coding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination