WO2020159990A1 - Methods and apparatus on intra prediction for screen content coding - Google Patents

Methods and apparatus on intra prediction for screen content coding

Info

Publication number: WO2020159990A1
Authority: WO; WIPO (PCT)
Prior art keywords: intra; flag; block; category; blocks
Prior art date: 2019-01-28

Application number

PCT/US2020/015411

Other languages

English (en)

French (fr)

Inventor

Xiaoyu XIU

Yi-Wen Chen

Xianglin Wang

Original Assignee

Beijing Dajia Internet Information Technology Co., Ltd.

Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

2019-01-28

Filing date

2020-01-28

Publication date

2020-08-06

2020-01-28 Application filed by Beijing Dajia Internet Information Technology Co., Ltd.

2020-01-28 Priority to CN202080022164.8A (CN113615202A)

2020-08-06 Publication of WO2020159990A1

Classifications

- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/117—Filters, e.g. for pre-processing or post-processing
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/593—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
- H04N19/86—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness

Definitions

The present disclosure relates generally to coding (e.g., encoding and decoding) video data. More specifically, this disclosure relates to a method, a computing device, and a non-transitory computer readable storage medium for selectively enabling and/or disabling intra smoothing operations for video coding.
Video coding is performed according to one or more video coding standards.
Video coding standards include versatile video coding (VVC), joint exploration test model coding (JEM), high-efficiency video coding (H.265/HEVC), advanced video coding (H.264/AVC), moving picture experts group coding (MPEG), or the like.
Video coding generally utilizes prediction methods (e.g., inter-prediction, intra-prediction, or the like) that take advantage of redundancy present in video images or sequences.
An important goal of video coding techniques is to compress video data into a form that uses a lower bit rate, while avoiding or minimizing degradations to video quality.
The first version of the HEVC standard was finalized in October 2013, and it offers approximately 50% bit-rate saving at equivalent perceptual quality compared to the prior generation video coding standard H.264/MPEG AVC.
Although the HEVC standard provides significant coding improvements over its predecessor, there is evidence that superior coding efficiency can be achieved with additional coding tools over HEVC.
Both VCEG and MPEG started the exploration work of new coding technologies for future video coding standardization.
The Joint Video Exploration Team (JVET) was formed in October 2015 by ITU-T VCEG and ISO/IEC MPEG to begin significant study of advanced technologies that could enable substantial enhancement of coding efficiency.
JVET developed reference software called the joint exploration model (JEM) by integrating several additional coding tools on top of the HEVC test model (HM).
The reference software for VVC is called the VVC test model (VTM).
A method for coding video data, performed at a computing device having one or more processors and memory storing a plurality of programs to be executed by the one or more processors, includes selectively disabling intra smoothing operations.
A hash value of each block of a plurality of non-overlapped blocks in a tile group is calculated by an encoder.
The plurality of non-overlapped blocks in the tile group are classified into two categories: a first category and a second category.
The first category includes a set of blocks of the plurality of non-overlapped blocks whose hash values are within a first set of hash values.
The second category includes all remaining blocks of the plurality of non-overlapped blocks that are in the tile group but not of the first category. It is then determined, for a first block in the second category, whether there is a second block in the second category representing the same hash value as the first block. Based on determining whether there is the second block in the second category representing the same hash value as the first block, at least one intra smoothing operation for intra prediction of the plurality of non-overlapped blocks in the tile group is disabled.
A computing device includes one or more processors, a non-transitory storage coupled to the one or more processors, and a plurality of programs stored in the non-transitory storage.
The computing device selectively disables intra smoothing operations.
A hash value of each block of a plurality of non-overlapped blocks in a tile group is calculated by an encoder.
The plurality of non-overlapped blocks within the tile group are classified into two categories: a first category and a second category.
The first category includes one or more blocks of the plurality of non-overlapped blocks whose hash values are within a first set of hash values.
The second category includes all remaining blocks of the plurality of non-overlapped blocks that are in the tile group but not of the first category. It is then determined, for a first block in the second category, whether there is a second block in the second category representing the same hash value as the first block. Based on determining whether there is the second block in the second category representing the same hash value as the first block, at least one intra smoothing operation for intra prediction of the plurality of non-overlapped blocks in the tile group is disabled.
A non-transitory computer readable storage medium stores a plurality of programs for execution by a computing device having one or more processors.
The plurality of programs, when executed by the one or more processors, cause the computing device to selectively disable intra smoothing operations.
A hash value of each block of a plurality of non-overlapped blocks in a tile group is calculated by an encoder.
The plurality of non-overlapped blocks within the tile group are classified into two categories: a first category and a second category.
The first category includes a set of blocks of the plurality of non-overlapped blocks whose hash values are within a first set of hash values.
The second category includes all remaining blocks of the plurality of non-overlapped blocks that are in the tile group but not of the first category. It is then determined, for a first block in the second category, whether there is a second block in the second category representing the same hash value as the first block. Based on determining whether there is the second block in the second category representing the same hash value as the first block, at least one intra smoothing operation for intra prediction of the plurality of non-overlapped blocks in the tile group is disabled.
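The hash-based decision summarized above may be sketched as follows. This is only an illustrative sketch under stated assumptions: the text does not name a hash function, so MD5 from the Python standard library is used as a stand-in; blocks are assumed to hold 8-bit sample values; the names block_hash, should_disable_intra_smoothing and first_set are hypothetical; and the rule "disable when any two second-category blocks share a hash value" is one possible reading of the decision, not necessarily the criterion the disclosure applies.

```python
import hashlib
from collections import Counter


def block_hash(block):
    """Hash one block given as a list of rows of 8-bit sample values (assumed)."""
    data = bytes(v for row in block for v in row)
    # MD5 is a stand-in; the disclosure does not specify the hash function.
    return hashlib.md5(data).hexdigest()


def should_disable_intra_smoothing(blocks, first_set):
    """Return True if two second-category blocks share the same hash value.

    blocks    : non-overlapped blocks of one tile group
    first_set : hypothetical "first set of hash values"; blocks whose hash
                falls in this set form the first category and are ignored
    """
    hashes = [block_hash(b) for b in blocks]
    # Second category: every block that is not in the first category.
    second_category = [h for h in hashes if h not in first_set]
    counts = Counter(second_category)
    # Assumed decision rule: a repeated hash indicates repeated content.
    return any(c >= 2 for c in counts.values())
```

In this sketch the caller chooses first_set; a caller might, for example, place the hash values of blocks it wants to exclude from matching into that set.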
FIG. 1 is a block diagram setting forth an illustrative block-based video encoder which may be used in conjunction with many video coding standards including VVC;
FIG. 2 is a block diagram setting forth an illustrative block-based video decoder which may be used in conjunction with many video coding standards including VVC;
FIGS. 3A–3E show example splitting types, namely, quaternary partitioning (FIG. 3A), horizontal binary partitioning (FIG. 3B), vertical binary partitioning (FIG. 3C), horizontal ternary partitioning (FIG. 3D), and vertical ternary partitioning (FIG. 3E), according to some examples;
FIG. 4 illustrates intra modes in VVC;
FIG. 5 illustrates multiple reference lines for the intra prediction in VVC;
FIGS. 6A–6C illustrate locations of the reference samples that are used in VVC to derive the predicted samples of one intra block;
FIG. 7 illustrates locations of neighboring reconstructed samples that are used for the position-dependent intra prediction combination (PDPC) of one coding block;
FIG. 8 illustrates a comparison between pictures that are obtained from screen content and camera-captured video;
FIG. 9 illustrates an example of disabling intra fractional sample interpolation by clipping one fractional reference sample position to a left integer sample position;
FIG. 10A illustrates an example of disabling intra fractional sample interpolation by clipping one fractional reference sample position to its left neighboring integer reference sample when its left neighboring integer reference sample is the nearest neighbor;
FIG. 10B illustrates an example of disabling intra fractional sample interpolation by clipping one fractional reference sample position to its right neighboring integer reference sample when its right neighboring integer reference sample is the nearest neighbor;
FIG. 11 is a flowchart of a method for coding video data according to an example.

DETAILED DESCRIPTION
Although the terms “first,” “second,” “third,” etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one category of information from another. For example, without departing from the scope of the present disclosure, first information may be termed as second information; and similarly, second information may also be termed as first information.
As used herein, the term “if” may be understood to mean “when” or “upon” or “in response to,” depending on the context.
FIG. 1 is a block diagram setting forth an illustrative block-based video encoder 100 which may be used in conjunction with many video coding standards including VVC.
The input video signal is processed block by block, where the blocks are called coding units (CUs).
In VTM-1.0, a CU can be up to 128x128 pixels.
One coding tree unit (CTU) is split into CUs based on a quad/binary/ternary tree to adapt to varying local characteristics.
Each CU is always used as the basic unit for both prediction and transform without further partitions.
In the multi-type tree structure, one CTU is first partitioned by a quad-tree structure. Then, each quad-tree leaf node can be further partitioned by a binary and ternary tree structure.
A video frame is partitioned into a plurality of blocks for processing. For each given video block, a prediction is formed based on either an inter prediction approach or an intra prediction approach.
In inter prediction, one or more predictors are formed through motion estimation and motion compensation, based on pixels from previously reconstructed frames.
In intra prediction, predictors are formed based on reconstructed pixels in a current frame. Through mode decision, a best predictor may be chosen to predict a current block.
A prediction residual, representing the difference between a current video block and its predictor, is sent to a transform circuitry 102.
The term “circuitry” as used herein includes hardware and software to operate the hardware.
Transform coefficients are then sent from the transform circuitry 102 to a quantization circuitry 104 for entropy reduction.
Quantized coefficients are then fed to an entropy coding circuitry 106 to generate a compressed video bitstream.
Prediction-related information 110 from an inter prediction circuitry and/or an intra prediction circuitry 112, such as block partition info, motion vectors, reference picture index, and intra prediction mode, is also fed through the entropy coding circuitry 106 and saved into a compressed video bitstream 114.
In the encoder 100, decoder-related circuitries are also needed in order to reconstruct pixels for the purpose of prediction.
A prediction residual is reconstructed through an inverse quantization circuitry 116 and an inverse transform circuitry 118.
This reconstructed prediction residual is combined with a block predictor 120 to generate un-filtered reconstructed pixels for a current block.
Intra prediction (also referred to as “spatial prediction”) and/or inter prediction (also referred to as “temporal prediction” or “motion compensated prediction”) may be performed.
Intra prediction uses pixels from the samples of already coded neighboring blocks (which are called reference samples) in the same video picture or slice to predict the current video block.
Intra prediction reduces spatial redundancy inherent in the video signal.
Inter prediction uses reconstructed pixels from the already coded video pictures to predict the current video block.
Inter prediction reduces temporal redundancy inherent in the video signal.
An inter prediction signal for a given CU is usually signaled by one or more motion vectors (MVs) which indicate an amount and a direction of motion between the current CU and its temporal reference. Also, if multiple reference pictures are supported, one reference picture index is additionally sent, which is used to identify from which reference picture in the reference picture store the temporal prediction signal comes.
The intra/inter mode decision circuitry 126 in the encoder 100 chooses the best prediction mode, for example based on the rate-distortion optimization method.
The prediction block is then subtracted from the current video block, and the prediction residual is de-correlated using a transform and then quantized.
The quantized residual coefficients are inverse quantized and inverse transformed to form the reconstructed residual, which is then added back to the prediction block to form the reconstructed signal of the CU. Further in-loop filtering, such as a deblocking filter, sample adaptive offset (SAO) and adaptive in-loop filter (ALF), may be applied on the reconstructed CU before the reconstructed CU is put in the reference picture store and used to code future video blocks.
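The subtract, quantize and reconstruct loop described above can be illustrated with a deliberately simplified sketch. The assumptions are significant: the transform and inverse transform stages are omitted entirely, quantization is a plain uniform scalar quantizer with a hypothetical step size qstep, in-loop filtering is not modeled, and the function name encode_and_reconstruct is illustrative only.

```python
def encode_and_reconstruct(block, predictor, qstep):
    """Toy residual coding loop on flat lists of samples.

    Simplifications (assumptions, not the codec's actual pipeline):
    no transform, uniform scalar quantization, no in-loop filtering.
    """
    # Prediction residual: current block minus its predictor.
    residual = [c - p for c, p in zip(block, predictor)]
    # Quantization: the lossy step; Python's round() is used for brevity.
    levels = [round(r / qstep) for r in residual]
    # Decoder-side path inside the encoder: inverse quantization ...
    recon_residual = [lv * qstep for lv in levels]
    # ... and adding the reconstructed residual back to the predictor.
    reconstructed = [p + r for p, r in zip(predictor, recon_residual)]
    return levels, reconstructed
```

The point of the sketch is that the encoder forms its reference from the reconstructed samples rather than from the original block, so the encoder and decoder predict from identical data.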
The coding mode (inter or intra), prediction mode information, motion information, and quantized residual coefficients are all sent to the entropy coding circuitry 106 to be further compressed and packed to form the bit-stream.
FIG.2 is a block diagram setting forth an illustrative block-based video decoder 200 which may be used in conjunction with many video coding standards including VVC.
The decoder 200 is similar to the reconstruction-related section residing in the encoder 100 of FIG. 1.
In the decoder 200, an incoming video bitstream 201 is first decoded through an entropy decoding circuitry 202 to derive quantized coefficient levels and prediction-related information.
The quantized coefficient levels are then processed through an inverse quantization circuitry 204 and an inverse transform circuitry 206 to obtain a reconstructed prediction residual.
The coding mode and prediction information are sent to either the spatial prediction circuitry (if intra coded) or the temporal prediction circuitry (if inter coded) to form the prediction block.
The residual transform coefficients are sent to the inverse quantization circuitry 204 and the inverse transform circuitry 206 to reconstruct the residual block.
The prediction block and the residual block are then added together.
The reconstructed block may further go through in-loop filtering before it is stored in the reference picture store.
The reconstructed video in the reference picture store is then sent out to drive a display device, as well as used to predict future video blocks.
A block predictor mechanism, which may be implemented in an intra/inter mode selection circuitry 208, includes an intra prediction circuitry 210 configured to perform an intra-prediction process and/or a motion compensation circuitry 212 configured to perform a motion compensation process based on decoded prediction information.
A set of unfiltered reconstructed pixels is obtained by summing the reconstructed prediction residual from the inverse transform circuitry 206 and a predictive output generated by the block predictor mechanism, using a summer 214. In situations where an in-loop filter 216 is turned on, a filtering operation is performed on these reconstructed pixels to derive the final reconstructed video for output.
FIGS. 3A–3E show five example splitting types, namely, quaternary partitioning (FIG. 3A), horizontal binary partitioning (FIG. 3B), vertical binary partitioning (FIG. 3C), horizontal ternary partitioning (FIG. 3D), and vertical ternary partitioning (FIG. 3E).
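The five splitting types can be sketched as a function that maps one block to its sub-blocks. The sketch rests on assumptions not stated in the text: a "horizontal" split is taken to mean a split along a horizontal line (producing stacked sub-blocks), ternary splits are assumed to use a 1/4, 1/2, 1/4 ratio, block dimensions are assumed divisible by 4, and the function name and split labels are hypothetical.

```python
def split_block(x, y, w, h, split):
    """Return sub-blocks as (x, y, width, height) tuples for one split type."""
    if split == "quad":
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if split == "hor_binary":
        # Split along a horizontal line: top and bottom halves (assumed).
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if split == "ver_binary":
        # Split along a vertical line: left and right halves (assumed).
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if split == "hor_ternary":
        q = h // 4  # assumed 1/4, 1/2, 1/4 ratio
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    if split == "ver_ternary":
        q = w // 4  # assumed 1/4, 1/2, 1/4 ratio
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    raise ValueError("unknown split type: " + str(split))
```

Applying the function recursively, starting with a quad split of the CTU and then allowing binary or ternary splits on the leaves, mirrors the multi-type tree structure described earlier.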
FIG.4 illustrates intra modes in VVC.
VVC uses a set of previously decoded samples neighboring one current CU (i.e., above or left) to predict the samples of the CU.
The number of angular intra modes is extended from 33 in HEVC to 93 in VVC.
In addition to the angular directions, VVC applies the same planar mode (which assumes a gradually changing surface with horizontal and vertical slopes derived from the boundaries) and DC mode (which assumes a flat surface).
All the intra modes (i.e., planar, DC and angular directions) in VVC utilize a set of neighboring reconstructed samples above and to the left of the predicted block as the reference for intra prediction.
FIG. 5 illustrates multiple reference lines for the intra prediction in VVC. Different from HEVC, where only the nearest row or column (i.e., line 0 in FIG. 5) of reconstructed samples is used as reference, multiple reference lines are introduced in VVC, where two additional rows or columns (i.e., line 1 and line 3 in FIG. 5) are used for the intra prediction.
The index of the selected reference row or column is signaled from the encoder 100 to the decoder 200.
When a non-nearest row or column is selected, planar and DC modes are excluded from the set of intra modes that can be used to predict the current block.
FIGS. 6A–6C illustrate locations of the reference samples that are used in VVC to derive the predicted samples of one intra block.
As shown in FIGS. 6A–6C, because the quad/binary/ternary tree partition structure is applied, rectangular coding blocks exist alongside square coding blocks for the intra prediction of VVC. Due to the unequal width and height of one given block, various sets of angular directions are selected for different block shapes, which is also called wide-angle intra prediction. In particular, for both square and rectangular coding blocks, besides planar and DC modes, 65 out of 93 angular directions are also supported for each block shape, as shown in Table 1 below.
Such design not only efficiently captures the directional structures that are typically present in video (by adaptively selecting angular directions based on block shapes) but also ensures that a total of 67 intra modes (i.e., planar, DC and 65 angular directions) are enabled for each coding block. This can achieve a good efficiency of signaling intra modes while providing a consistent design across different block sizes.
Table 1 shows selected angular directions for the intra prediction of different block shapes in VVC.
VVC applies a 3-tap smoothing filter [1, 2, 1]/4 to the reference samples used for the intra prediction.
The filtering operation is applied to each reference sample using its two neighboring reference samples on the left and right, except for the bottom-left and top-right reference samples, where the filtering operation is skipped.
The decision on whether to apply the smoothing filter to intra reference samples depends on the coding block size and the directionality of the angular intra mode. More specifically, for coding blocks with fewer than 64 samples, the reference smoothing filter is always disabled for all angular directions. For coding blocks with at least 64 samples, the smoothing filter is applied to three diagonal angular directions (i.e., modes 2, 34 and 66) but disabled for all other angular directions. For DC mode, the smoothing filter is always disabled. For planar mode, the smoothing filter is applied to the intra reference samples when the coding block contains more than 32 samples.
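The reference smoothing rule and the [1, 2, 1]/4 filter described above may be sketched as follows. The assumptions are: planar and DC are numbered 0 and 1, the reference samples are arranged in a single list running from the bottom-left sample to the top-right sample, and a rounding offset of 2 is added before the division by 4. The function names are illustrative only.

```python
PLANAR, DC = 0, 1            # assumed mode numbering
DIAGONAL_MODES = {2, 34, 66}  # the three diagonal angular directions


def use_reference_smoothing(mode, width, height):
    """Decide whether the smoothing filter is applied, per the rules above."""
    num_samples = width * height
    if mode == DC:
        return False
    if mode == PLANAR:
        return num_samples > 32
    # Angular modes: only diagonal directions in blocks of at least 64 samples.
    return num_samples >= 64 and mode in DIAGONAL_MODES


def smooth_reference(ref):
    """Apply the 3-tap [1, 2, 1]/4 filter; the two end samples are skipped."""
    out = list(ref)
    for i in range(1, len(ref) - 1):
        # +2 is an assumed rounding offset before the shift by 2 (divide by 4).
        out[i] = (ref[i - 1] + 2 * ref[i] + ref[i + 1] + 2) >> 2
    return out
```

For a step edge such as [0, 0, 100, 100], the filter yields [0, 25, 75, 100]: the sharp transition is spread over two samples, which is the behavior that can hurt prediction of screen content.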
An interpolation process is applied to obtain intra prediction samples when the target sample is predicted from one reference sample at a fractional position.
Either a DCT-IF filter (the same filter that is applied for motion compensation of chroma components) or a Gaussian filter is applied to generate the luma samples at fractional positions.
Table 2 shows the interpolation filters used for the intra prediction of fractional luma samples.
$w_y$ is the weighting parameter between two integer reference samples $R_{i,0}$ and $R_{i+1,0}$, which are calculated as:
The intra prediction samples are generated from either a non-filtered or a filtered set of neighboring reference samples, which may introduce discontinuities along the block boundaries between the current coding block and its neighbors.
Boundary filtering is applied in HEVC by combining the first row or column of prediction samples of DC, horizontal (i.e., mode 18) and vertical (i.e., mode 50) prediction modes with the unfiltered reference samples, utilizing a 2-tap filter (for DC mode) or a gradient-based smoothing filter (for horizontal and vertical prediction modes).
The position-dependent intra prediction combination (PDPC) tool in VVC extends the above idea by employing a weighted combination of intra prediction samples with unfiltered reference samples.
The PDPC is enabled for the following intra modes without signaling: planar, DC, horizontal (i.e., mode 18), vertical (i.e., mode 50), angular directions close to the bottom-left diagonal directions (i.e., mode 2, 3, 4,..., 10) and angular directions close to the top-right diagonal directions (i.e., mode 58, 59, 60,..., 66).
$$\mathrm{pred}(x,y) = \big(wL \cdot R_{-1,y} + wT \cdot R_{x,-1} - wTL \cdot R_{-1,-1} + (64 - wL - wT + wTL) \cdot \mathrm{pred}(x,y) + 32\big) \gg 6 \qquad (5)$$
FIG.7 illustrates locations of neighboring reconstructed samples that are used for the position-dependent intra prediction combination (PDPC) of one coding block. It shows the locations of the reference samples that are used to combine with the current prediction sample during the PDPC process.
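The PDPC combination of Eq. (5) can be written as a small integer function. In this sketch the weights wL, wT and wTL are passed in by the caller rather than derived, because their derivation depends on the prediction mode and sample position; the function and parameter names are illustrative assumptions.

```python
def pdpc_sample(pred, r_left, r_top, r_topleft, wL, wT, wTL):
    """Combine one intra prediction sample with unfiltered reference samples.

    pred      : intra prediction sample pred(x, y) before PDPC
    r_left    : reference sample R(-1, y)
    r_top     : reference sample R(x, -1)
    r_topleft : reference sample R(-1, -1)
    wL, wT, wTL : weights, supplied by the caller (not derived here)
    """
    # Weights sum to 64, so +32 rounds and >> 6 normalizes.
    return (wL * r_left + wT * r_top - wTL * r_topleft
            + (64 - wL - wT + wTL) * pred + 32) >> 6
```

With all three weights set to zero, the function returns the prediction sample unchanged, which corresponds to PDPC having no effect at that position.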
The weights wL, wT and wTL in Eq. (5) are adaptively selected depending on the prediction mode and sample position, as described below, where the current coding block is assumed to be of size W × H:
FIG.8 shows a comparison between pictures that are obtained from screen content and camera-captured video.
The screen content is shown on the left and the camera-captured content is shown on the right.
Screen content coding (SCC) is becoming increasingly important in various video applications, such as desktop sharing, video conferencing, and remote education.
Compared with camera-captured content, screen content shows very different characteristics.
The video signals captured by cameras usually show smooth boundaries across different objects.
In contrast, screen content presents sharp edges.
The intra smoothing operations are effective in improving the intra coding efficiency of camera-captured content because of its gradual transition edges.
Screen content, however, has many sharp edges, so those smoothing operations can potentially make the prediction inaccurate and therefore compromise the overall coding performance.
The smoothing of reference samples could remove the useful high-frequency information existing in the reconstructed neighboring samples of one block. Due to the sharp edges in screen content, the interpolated samples are less accurate than the reference samples at an integer position (i.e., the reference sample without interpolation) for predicting the samples of the current block.
The weighted combination of the intra prediction samples and the unfiltered reference samples can potentially remove the useful high-frequency information in the neighboring reconstructed region of the current block, therefore reducing the intra prediction efficiency.
Signaling methods are proposed in this disclosure to control the enabling and/or disabling of the smoothing operations, e.g., intra reference sample smoothing, fractional intra sample interpolation and PDPC, in the existing intra prediction design of VVC.
Table 3 illustrates the modified syntax table of the sequence parameter set (SPS), where three syntax elements, i.e., sps_intra_reference_smooth_enabled_flag, sps_intra_frac_sample_interp_enabled_flag and sps_pdpc_enabled_flag, are signaled to indicate whether intra reference smoothing, intra fractional sample interpolation or PDPC is applied in the intra prediction process of all coding blocks in the tile groups that refer to the SPS.
By setting one of those three flags (e.g., sps_intra_frac_sample_interp_enabled_flag) to zero, an encoder can indicate to a decoder that, at the sequence level, the corresponding intra smoothing operation (e.g., intra fractional sample interpolation) is not applied.
Three further flags, i.e., intra_reference_smooth_override_enabled_flag, intra_frac_sample_interp_override_enabled_flag, and pdpc_override_enabled_flag, are also signaled in the SPS to indicate whether the enabling and/or disabling decisions of intra reference smoothing, intra fractional sample interpolation and PDPC may be changed at the tile group level.
When one of those three flags (e.g., intra_frac_sample_interp_override_enabled_flag) is set equal to 1, it indicates that another control flag to enable and/or disable the corresponding intra smoothing operation (e.g., intra fractional sample interpolation) will be signaled in the tile group header; otherwise (i.e., the flag is equal to 0), it indicates that separate control to enable and/or disable the corresponding intra smoothing operation at the tile group level is disallowed, i.e., the enabling and/or disabling of that intra smoothing operation is controlled by the corresponding SPS-level flag (e.g., sps_intra_frac_sample_interp_enabled_flag).
Table 3 shows the modified SPS syntax table as follows:
sps_intra_reference_smooth_enabled_flag specifies whether intra reference smoothing is applied for intra prediction. sps_intra_reference_smooth_enabled_flag equal to 0 specifies that intra reference smoothing is not used in the CVS. Otherwise, sps_intra_reference_smooth_enabled_flag equal to 1 specifies that intra reference smoothing is used in the CVS.
intra_reference_smooth_override_enabled_flag equal to 1 specifies the presence of intra_reference_smooth_enabled_flag in the tile group header of the pictures referring to the SPS.
intra_reference_smooth_override_enabled_flag equal to 0 specifies the absence of intra_reference_smooth_enabled_flag in the tile group header of the pictures referring to the SPS.
sps_intra_frac_sample_interp_enabled_flag specifies whether intra fractional sample interpolation is applied for intra prediction. sps_intra_frac_sample_interp_enabled_flag equal to 0 specifies that intra fractional sample interpolation is not used in the CVS. Otherwise, sps_intra_frac_sample_interp_enabled_flag equal to 1 specifies that intra fractional sample interpolation is used in the CVS.
intra_frac_sample_interp_override_enabled_flag equal to 1 specifies the presence of intra_frac_sample_interp_enabled_flag in the tile group header of the pictures referring to the SPS.
intra_frac_sample_interp_override_enabled_flag equal to 0 specifies the absence of intra_frac_sample_interp_enabled_flag in the tile group header of the pictures referring to the SPS.
sps_pdpc_enabled_flag specifies whether a position-dependent intra prediction combination is applied for intra prediction. sps_pdpc_enabled_flag equal to 0 specifies that a position-dependent intra prediction combination is not used in the CVS. Otherwise, sps_pdpc_enabled_flag equal to 1 specifies that a position-dependent intra prediction combination is used in the CVS.
pdpc_override_enabled_flag equal to 1 specifies the presence of pdpc_enabled_flag in the tile group header of the pictures referring to the SPS.
pdpc_override_enabled_flag equal to 0 specifies the absence of pdpc_enabled_flag in the tile group header of the pictures referring to the SPS.
Table 4 shows the modified syntax table of the tile group header.
When the tile group level adaptation of enabling and/or disabling one intra smoothing process is allowed by setting the corresponding SPS control flag (i.e., sps_intra_reference_smooth_enabled_flag, sps_intra_frac_sample_interp_enabled_flag, or sps_pdpc_enabled_flag) to 1, another flag, i.e., intra_reference_smooth_enabled_flag, intra_frac_sample_interp_enabled_flag, or pdpc_enabled_flag, is further signaled in the tile group header to indicate whether the corresponding intra smoothing process (i.e., intra reference smoothing, intra fractional sample interpolation, or PDPC) is enabled or disabled for the coding blocks that belong to the tile group.
the corresponding intra smoothing process i.e., intra reference smoothing,
intra_reference_smooth_enabled_flag specifies whether intra reference smoothing is applied for intra prediction.
intra_reference_smooth_enabled_flag equal to 0 specifies that intra reference smoothing is not used for the intra prediction of all coding blocks in the current tile group. Otherwise (intra_reference_smooth_enabled_flag is equal to 1), intra reference smoothing is used for the intra prediction of all coding blocks in the current tile group. When not present, its value is inferred to be the value of
intra_frac_sample_interp_enabled_flag specifies whether intra fractional sample interpolation is applied for intra prediction.
intra_frac_sample_interp_enabled_flag equal to 0 specifies that intra fractional sample interpolation is not used for the intra prediction of all coding blocks in the current tile group. Otherwise (intra_frac_sample_interp_enabled_flag is equal to 1), intra fractional sample interpolation is used for the intra prediction of all coding blocks in the current tile group. When not present, its value is inferred to be the value of sps_intra_frac_sample_interp_enabled_flag.
pdpc_enabled_flag specifies whether PDPC is applied for intra prediction.
pdpc_enabled_flag equal to 0 specifies that PDPC is not used for the intra prediction of all coding blocks in the current tile group. Otherwise (pdpc_enabled_flag is equal to 1), PDPC is used for the intra prediction of all coding blocks in the current tile group. When not present, its value is inferred to be the value of sps_pdpc_enabled_flag.
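The presence and inference rules above can be sketched for a single intra smoothing tool as follows. This is a minimal illustration, not the normative parsing process: the `SpsFlags` container and the `read_bit` callable are hypothetical names, and the sketch assumes the tile group flag is gated only by the override flag, since the syntax tables themselves are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpsFlags:
    """Hypothetical container for one tool's SPS-level flags."""
    enabled: bool           # e.g. sps_pdpc_enabled_flag
    override_enabled: bool  # e.g. pdpc_override_enabled_flag


def tile_group_flag(sps: SpsFlags, read_bit: Callable[[], int]) -> bool:
    """Return the effective tile-group-level flag (e.g. pdpc_enabled_flag).

    If the override flag is 1, the flag is present in the tile group header
    and is read from the bitstream; otherwise it is absent and inferred to
    be equal to the SPS-level flag.
    """
    if sps.override_enabled:
        return bool(read_bit())  # flag is signaled in the tile group header
    return sps.enabled           # flag is absent: inherit the SPS value


# Override disabled: the tile group inherits the SPS value without reading a bit.
inherited = tile_group_flag(SpsFlags(enabled=True, override_enabled=False), lambda: 0)
# Override enabled: the tile group header turns the tool off for this tile group.
overridden = tile_group_flag(SpsFlags(enabled=True, override_enabled=True), lambda: 0)
```

The same function would apply unchanged to intra reference smoothing and intra fractional sample interpolation, each with its own pair of SPS flags.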
Three sets of syntax elements in Table 3 and Table 4 are proposed in the SPS and the tile group header to separately control the enabling and disabling of intra reference smoothing, intra fractional sample interpolation and PDPC.
Alternatively, a single shared syntax element is used in the SPS and the tile group header to control the enabling and/or disabling of all three intra smoothing operations.
For screen content coding, intra sample interpolation should not be applied to intra reference samples when the target reference sample is located at a fractional sample position.
The positions of the referred intra reference samples that are located at fractional sample positions should be clipped to the reference samples at integer sample positions before intra prediction, for which different methods can be applied.
The integer reference sample that is to the left of the actual fractional sample position (for angular intra modes that refer to the left neighboring samples of the current block) or below the actual fractional position (for angular intra modes that refer to the above neighboring samples of the current block) is used to generate the predicted sample.
the value of one intra predicted sample is calculated by the proposed method as
FIG. 9 illustrates the generation of one intra prediction sample when the above method is applied. As shown in FIG. 9, intra fractional sample interpolation is disabled by clipping one fractional reference sample position to its left integer sample position.
FIG. 11 is a flowchart of a method 1100 for coding video data according to an example.
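The clipping shown in FIG. 9 can be illustrated with a short sketch for an angular mode that predicts from the above reference row. The sketch rests on assumptions not stated in the text: positions are kept in 1/32-sample units, `ref_above[i]` is the reference sample directly above column i, `angle` is a non-negative per-row displacement in 1/32 units, and the function name is hypothetical.

```python
def predict_block_no_interp(ref_above, width, height, angle):
    """Angular intra prediction without fractional sample interpolation.

    Each projected reference position is clipped to the integer sample on
    its left instead of being interpolated from its two neighbors.
    """
    pred = []
    for y in range(height):
        offset = (y + 1) * angle         # row displacement in 1/32-sample units
        row = []
        for x in range(width):
            pos = (x << 5) + offset      # fractional position along the reference row
            row.append(ref_above[pos >> 5])  # floor: left integer sample
        pred.append(row)
    return pred


ref = [10, 20, 30, 40, 50, 60, 70, 80]
block = predict_block_no_interp(ref, width=2, height=2, angle=16)
# block == [[10, 20], [20, 30]]
```

With `angle=16`, the first row projects half a sample to the right, so every position in that row is fractional and is replaced by its left neighbor. No new sample values are created, which preserves the sharp edges typical of screen content.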
The method may adaptively or selectively decide whether to enable or disable intra smoothing processes (e.g., intra reference smoothing, intra fractional sample interpolation, and PDPC) based on an analysis of the video content to be coded, e.g., camera-captured video content or screen video content.
the method is described using a tile group as the basic unit to adaptively switch the enabling and disabling of the intra smoothing processes.
The method described herein may be applicable to other coding levels, e.g., sequence level, picture/slice level, or even region level (e.g., each region may contain a certain number of CTUs).
An encoder such as encoder 100 calculates a hash value (e.g., a 32-bit CRC) of each non-overlapped block of a plurality of non-overlapped blocks in a tile group. Meanwhile, for each hash value, the encoder counts the number of blocks (i.e., the usage) associated with the hash value.
the size of the blocks is 4x4.
The method is not limited to using a block size of 4x4. Other block sizes may also be used, including square blocks and/or rectangular blocks of different sizes in some examples.
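The hashing and usage counting can be sketched as follows. This is an illustration under stated assumptions rather than the encoder's actual implementation: the samples are 8-bit values held in a list of rows, `zlib.crc32` stands in for the 32-bit CRC, partial blocks at the right and bottom edges are ignored, and the function names are hypothetical.

```python
import zlib
from collections import Counter


def block_hashes(samples, block=4):
    """Return the CRC-32 of every non-overlapped block x block region.

    `samples` is a list of rows of 8-bit sample values (0..255).
    """
    height, width = len(samples), len(samples[0])
    hashes = []
    for y in range(0, height - block + 1, block):
        for x in range(0, width - block + 1, block):
            data = bytes(
                samples[y + dy][x + dx]
                for dy in range(block)
                for dx in range(block)
            )
            hashes.append(zlib.crc32(data))
    return hashes


def hash_usage(hashes):
    """Count how many blocks are associated with each hash value."""
    return Counter(hashes)


# A 4x8 area made of two identical 4x4 blocks yields one hash used twice.
area = [[1, 2, 3, 4, 1, 2, 3, 4] for _ in range(4)]
hashes = block_hashes(area)
usage = hash_usage(hashes)
```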
all non-overlapped blocks in the tile group are classified into two categories, such as a first category and a second category.
the first category may comprise one or more blocks of the plurality of non-overlapped blocks having a hash value covered by a first set of hash values, such as the most-used hash values.
The first set of most-used hash values may be the first N most-used hash values.
the second category may include all remaining blocks of the plurality of non-overlapped blocks that do not belong to the first category.
At step 1105, for each block in the second category, it is determined whether there is another block in the same category that has the same hash value. In some examples, for a first block in the second category, it is determined whether there is a second block in the second category representing the same hash value as the first block.
At step 1107, based on the determination at step 1105, at least one intra smoothing operation for intra prediction of the blocks in the tile group is disabled. In some examples, based on determining that there is a second block in the second category representing the same hash value as the first block, at least one intra smoothing operation for intra prediction of the plurality of non-overlapped blocks is disabled.
When there is at least one matching second block, the first block is regarded as a screen-content block; otherwise, if there is no matching second block, the first block is regarded as a non-screen-content block.
The intra smoothing operations (e.g., intra reference smoothing, intra fractional sample interpolation and PDPC) will be disabled for the intra prediction of the coding blocks in the tile group. Otherwise, the intra smoothing operations will be enabled for the coding blocks in the tile group.
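The classification and the tile-group decision can be sketched as follows, taking a list of per-block hash values as input. The text does not state how the per-block results are combined into one decision, so the `threshold` parameter, the ratio it is compared against, and the behavior when the second category is empty are placeholder assumptions; the function name and the default `top_n` are also hypothetical.

```python
from collections import Counter


def disable_intra_smoothing(hashes, top_n=1, threshold=0.5):
    """Decide whether to disable intra smoothing for one tile group.

    First category: blocks whose hash is among the top_n most-used values.
    Second category: all remaining blocks. A second-category block counts
    as a screen-content block when another block shares its hash value.
    """
    usage = Counter(hashes)
    most_used = {h for h, _ in usage.most_common(top_n)}
    second = [h for h in hashes if h not in most_used]
    if not second:
        return False  # placeholder choice: nothing left to analyze
    # Blocks with equal hashes always land in the same category, so a usage
    # count above 1 means a matching second-category block exists.
    screen_blocks = sum(1 for h in second if usage[h] > 1)
    # Placeholder rule: disable when screen-content blocks dominate.
    return screen_blocks / len(second) > threshold


repetitive = [1, 1, 1, 2, 2, 3, 3, 4]  # many repeated patterns
natural = [1, 1, 2, 3, 4, 5]           # almost every block is unique
```

Here `disable_intra_smoothing(repetitive)` returns True because four of the five second-category blocks have a match, while `disable_intra_smoothing(natural)` returns False because none do.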
a method for coding video data comprising calculating, by an encoder, a hash value of each block of a plurality of non-overlapped blocks in a tile group and classifying the plurality of non-overlapped blocks into at least two categories comprising a first category and a second category.
the first category comprises one or more blocks of the plurality of non-overlapped blocks representing one or more hash values covered by a first set of hash values
the second category comprises all remaining blocks of the plurality of non-overlapped blocks. For a first block in the second category, it is determined whether there is a second block in the second category representing the same hash value as the first block. And based on determining whether there is the second block in the second category representing the same hash value as the first block, at least one intra smoothing operation for intra prediction of the plurality of non-overlapped blocks is disabled.
When there is at least one second block, the method further includes determining that the first block is a screen-content block and, when there is no second block representing the same hash value as the first block, determining that the first block is a non-screen-content block.
disabling at least one intra smoothing operation for intra prediction of the plurality of non-overlapped blocks in the tile group comprises: determining whether a first enabling control flag is set to 1, wherein the first enabling control flag is sent in a sequence parameter set corresponding to the tile group, and determining, when it is determined that the first enabling control flag is set to 1, a second enabling control flag for the plurality of non-overlapped blocks within the tile group, wherein the second enabling control flag is set in a header of the tile group.
the enabling control flag may be one of the control flags provided in Table 3 and Table 4.
the control flag may be the corresponding SPS control flag, such as sps_intra_reference_smooth_enabled_flag,
the enabling flag may be another flag, i.e., intra_reference_smooth_enabled_flag,
The at least one intra smoothing operation comprises one of the following operations: intra reference smoothing, intra fractional sample interpolation, and/or position-dependent intra prediction combination (PDPC).
the first enabling control flag may further comprise a first enabling control sub-flag for controlling an intra reference smoothing operation, a second enabling control sub-flag for controlling an intra fractional sample interpolation operation, and a third enabling control sub-flag for controlling a PDPC operation.
the second enabling flag may further comprise a first enabling control sub-flag for enabling or disabling an intra reference smoothing operation, a second enabling control sub-flag for enabling or disabling an intra fractional sample interpolation operation, and a third enabling control sub-flag for enabling or disabling a PDPC operation.
The control override flag may comprise a first control override flag for an intra reference smoothing operation, a second control override flag for an intra fractional sample interpolation operation, and a third control override flag for a PDPC operation.
disabling at least one intra smoothing operation for intra prediction of the plurality of non-overlapped blocks within the tile group comprises: determining whether a first enabling control flag is set to 1, wherein the first enabling control flag is sent in a picture parameter set corresponding to the tile group, and determining, when it is determined that the first enabling control flag is set to 1, a second enabling control flag for the plurality of non-overlapped blocks within the tile group, wherein the second enabling control flag is sent in a header of the tile group.
the first enabling control flag may further comprise a first enabling control sub-flag for controlling an intra reference smoothing operation, a second enabling control sub-flag for controlling an intra fractional sample interpolation operation, and a third enabling control sub-flag for controlling a PDPC operation.
the second enabling control flag may further comprise a first enabling control sub-flag for enabling or disabling an intra reference smoothing operation, a second enabling control sub-flag for enabling or disabling an intra fractional sample interpolation operation, and a third enabling control sub-flag for enabling or disabling a PDPC operation.
When it is determined to disable the intra fractional sample interpolation, the method further comprises clipping an intra fractional reference sample position to an integer reference sample position that is to the left of or above the actual fractional sample position, or clipping the intra fractional reference sample position to the nearest integer reference sample position.
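The two clipping options differ only in how the fractional position is rounded. The sketch below assumes positions are expressed in 1/32-sample units (5 fractional bits) and that the integer index grows to the right along the above reference row or downward along the left reference column. Both function names are hypothetical, and rounding an exact half position upward is an arbitrary choice made here.

```python
def clip_to_left_or_above(pos, precision_bits=5):
    """Clip a fractional position to the integer sample to its left or above it."""
    return pos >> precision_bits


def clip_to_nearest(pos, precision_bits=5):
    """Clip a fractional position to the nearest integer sample (halves round up)."""
    half = 1 << (precision_bits - 1)
    return (pos + half) >> precision_bits


# Position 40/32 = 1.25 clips to integer sample 1 under both options.
# Position 60/32 = 1.875 clips to sample 1 (left or above) or sample 2 (nearest).
quarter = (clip_to_left_or_above(40), clip_to_nearest(40))
near_next = (clip_to_left_or_above(60), clip_to_nearest(60))
```

Clipping to the left or above sample needs only a shift, whereas nearest-sample clipping adds one rounding offset but keeps the selected sample closer to the projected position.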
a computing device comprises one or more processors, a non-transitory storage coupled to the one or more processors, and a plurality of programs stored in the non-transitory storage that, when executed by the one or more processors, cause the computing device to perform acts comprising: for each block of a plurality of non-overlapped blocks in a tile group, calculating, by an encoder, a hash value of the block; classifying the plurality of non-overlapped blocks in the tile group into two categories comprising a first category and a second category, wherein the first category comprises one or more blocks of hash values covered by a first set of hash values, and the second category comprises all remaining blocks of the plurality of non-overlapped blocks in the tile group.
At least one intra smoothing operation for intra prediction of the plurality of non-overlapped blocks in the tile group may be disabled.
a non-transitory computer readable storage medium storing a plurality of programs for execution by a computing device having one or more processors, wherein the plurality of programs, when executed by the one or more processors, cause the computing device to perform acts comprising: calculating, by an encoder, a hash value of each block of a plurality of non-overlapped blocks in a tile group; classifying the plurality of non-overlapped blocks into at least two categories comprising a first category and a second category, wherein the first category comprises one or more blocks of the plurality of non-overlapped blocks representing one or more hash values covered by a first set of hash values, and the second category comprises all remaining blocks of the plurality of non-overlapped blocks; for a first block in the second category, determining whether there is a second block in the second category representing the same hash value as the first block; and disabling, based on the determining whether there is the second block in the second category representing the same hash value as the first block
disabling, based on determining whether there is the second block in the second category, at least one intra smoothing operation for intra prediction of the plurality of non-overlapped blocks may comprise determining whether a first enabling control flag is set to 1, wherein the first enabling control flag is sent in a sequence parameter set corresponding to the tile group, and determining, when it is determined that the first enabling control flag is set to 1, a second enabling control flag for the plurality of non-overlapped blocks, wherein the second enabling control flag is sent in a header of the tile.
Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol.
computer-readable media generally may correspond to (1) tangible computer-readable storage media which is non-transitory or (2) a
Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the implementations described in the present application.
A computer program product may include a computer-readable medium.
the above methods may be implemented using an apparatus that includes one or more circuitries, which include application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, micro-controllers, microprocessors, or other electronic components.
The apparatus may use the circuitries in combination with other hardware or software components to perform the above-described methods.
Each module, sub-module, unit, or sub-unit disclosed above may be implemented at least partially using the one or more circuitries.

Landscapes

Engineering & Computer Science (AREA)
Multimedia (AREA)
Signal Processing (AREA)
Compression Or Coding Systems Of Tv Signals (AREA)

PCT/US2020/015411 2019-01-28 2020-01-28 Methods and apparatus on intra prediction for screen content coding WO2020159990A1 (en)

Priority Applications (1)

Application Number	Priority Date	Filing Date	Title
CN202080022164.8A CN113615202A (zh)	2019-01-28	2020-01-28	用于屏幕内容编解码的帧内预测的方法和装置

Applications Claiming Priority (2)

Application Number	Priority Date	Filing Date	Title
US201962797756P	2019-01-28	2019-01-28
US62/797,756		2019-01-28

Publications (1)

Publication Number	Publication Date
WO2020159990A1 (en)	2020-08-06

Family

ID=71841908

Family Applications (1)

Application Number	Title	Priority Date	Filing Date
PCT/US2020/015411 WO2020159990A1 (en)	2019-01-28	2020-01-28	Methods and apparatus on intra prediction for screen content coding

Country Status (2)

Country	Link
CN (1)	CN113615202A (zh)
WO (1)	WO2020159990A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number	Priority date	Publication date	Assignee	Title
WO2015131325A1 (en) *	2014-03-04	2015-09-11	Microsoft Technology Licensing, Llc	Hash table construction and availability checking for hash-based block matching
WO2015139165A1 (en) *	2014-03-17	2015-09-24	Microsoft Technology Licensing, Llc	Encoder-side decisions for screen content encoding

2020
- 2020-01-28 CN CN202080022164.8A patent/CN113615202A/zh active Pending
- 2020-01-28 WO PCT/US2020/015411 patent/WO2020159990A1/en active Application Filing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BIN LI ET AL.: "An HEVC-based Screen Content Coding Scheme", MICROSOFT RESEARCH LAB, 31 January 2015 (2015-01-31), pages 1 - 13, XP055245403 *
XIAOZHONG XU ET AL.: "Description of Core Experiment 8: Screen Content Coding Tools", JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11, JVET-L1028-V3, 12TH MEETING, MACAO, CN, 12 October 2018 (2018-10-12), pages 1 - 14 *
YAWEI XU ET AL.: "Pattern-matching scheme for HEVC screen content coding", 5TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCES AND AUTOMATION ENGINEERING (ICCSAE 2015), 29 February 2016 (2016-02-29), pages 916 - 921, XP055724238 *

Also Published As

Publication number	Publication date
CN113615202A (zh)	2021-11-05

Legal Events

Date	Code	Title	Description
2020-09-16	121	Ep: the epo has been informed by wipo that ep was designated in this application	Ref document number: 20749678; Country of ref document: EP; Kind code of ref document: A1
2021-07-29	NENP	Non-entry into the national phase	Ref country code: DE
2022-02-09	122	Ep: pct application non-entry in european phase