WO2018076827A1 - Rate estimation method for intra-frame coding in video coding - Google Patents

Rate estimation method for intra-frame coding in video coding

Info

Publication number
WO2018076827A1
WO2018076827A1 PCT/CN2017/094015 CN2017094015W WO2018076827A1 WO 2018076827 A1 WO2018076827 A1 WO 2018076827A1 CN 2017094015 W CN2017094015 W CN 2017094015W WO 2018076827 A1 WO2018076827 A1 WO 2018076827A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
coding
coded bits
block
prediction block
Prior art date
Application number
PCT/CN2017/094015
Other languages
English (en)
French (fr)
Inventor
王荣刚
曹洪彬
王振宇
高文
Original Assignee
北京大学深圳研究生院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学深圳研究生院
Priority to US16/334,931 (US10917646B2)
Publication of WO2018076827A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/11 Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H04N19/126 Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/14 Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/149 Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91 Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria

Definitions

  • The invention belongs to the technical field of video coding and relates to intra-frame (I-frame) predictive coding, in particular to a rate estimation method for intra-frame (I-frame) coding in video coding, which can be used to select the best intra prediction mode quickly and efficiently, reducing the time required for intra-frame coding.
  • Video coding technology is the technology of compressing video.
  • Everyday video data consists of two parts: information and redundant data.
  • Video coding aims to remove the redundant part of video data to reduce the burden of storing and transmitting it.
  • Current mainstream video coding platforms adopt a block-based hybrid coding framework. Image blocks go through prediction, transform, quantization, entropy coding, etc. to eliminate as much of the statistical redundancy in the video data as possible.
  • Rate-distortion optimization (RDO) is a method that minimizes the number of coded bits under a given video quality, or minimizes coding distortion under a given bit-rate constraint.
  • The rate-distortion process evaluates each prediction mode through a rate-distortion optimization function, mainly by examining the distortion and the number of coded bits of each mode. At equal distortion, the prediction mode with fewer coded bits is selected in the RDO process. The RDO process therefore needs the distortion and the number of coded bits of every prediction mode.
  • Obtaining the number of coded bits of a prediction mode requires entropy coding the transformed and quantized result of the prediction block, which is a very time-consuming process.
  • The present invention provides a rate estimation method for intra-frame (I-frame) coding in video coding. It models the information of the prediction block and uses information entropy theory to estimate the number of coded bits of the prediction block under the corresponding model, thereby skipping entropy coding. The method can be used to select the best intra prediction mode quickly and efficiently, reducing the time required for intra-frame coding. It saves considerable coding time (37.7% of RDO module time) with little loss of video quality (0.64% BD-rate loss).
  • The rate-distortion optimization (RDO) process first selects the optimal prediction mode for each block partition, and then selects the best block partition from among the optimal modes of the various partitions.
  • The key innovation of the present invention is a new model that combines a generalized Gaussian distribution with a uniform distribution. The new model fits the residual distribution in video coding more closely, and because the tail of the model is a uniform distribution, the computational complexity of updating the lookup table during rate estimation is greatly reduced.
  • The method adds a step that corrects the estimated number of coded bits, which amounts to a correction of the decision made with the mixed model, so that the corrected result is closer to the result of true entropy coding.
  • A rate estimation method for intra-frame coding in video coding models the residual information of the prediction block and, under the corresponding model, estimates the number of coded bits of the prediction block according to information entropy theory, so that entropy coding can be skipped in the RDO (rate-distortion optimization) process and coding time is effectively reduced.
  • The rate estimation method mainly comprises: collecting prediction block distribution statistics and modeling, estimating the number of coded bits of a prediction mode from the model, and correcting the estimated number of coded bits. The specific steps are as follows:
  • Step 12): Using the model obtained in step 11), calculate the probability that the quantized result at each position of the prediction block equals a specific value (that is, for a given position in a given prediction block, the probability that its quantized value is x), and then derive the information entropy from that probability.
  • The calculated information entropy is used as the estimated number of coded bits for that quantized value at that position of the prediction block.
  • The estimated numbers of coded bits are stored in a data table, so that the RDO process of the next frame only needs to query the table to obtain the corresponding number of coded bits.
  • Using the result of step 2), we can skip entropy coding and select the best prediction mode for each block size (4×4 to 64×64). (The RDO process already does this; the invention uses step 2) to replace the computationally expensive entropy coding in the RDO process.)
  • The RDO process then selects, from the optimal modes of the different block partitions, one best mode as the final partition result.
  • In the present invention, once the optimal mode for each block size has been found, a simplified entropy coding is performed on these optimal modes, and its result is used as the estimated number of coded bits of each optimal mode when choosing the optimal partition among the different block sizes.
  • The invention provides a rate estimation method for intra-frame (I-frame) coding in video coding. It models the information of the prediction block and uses information entropy theory under the corresponding model to estimate the number of coded bits of the prediction block, thereby skipping entropy coding. The method can be used to select the best intra prediction mode quickly and efficiently, reducing the time required for intra-frame coding.
  • The invention has the following advantages:
  • The mixed model of generalized Gaussian and uniform distributions matches the real residual distribution fairly well and reliably identifies the optimal prediction mode within a given block size.
  • The optimal mode of each block size is entropy coded again (simplified entropy coding); this is less complex than true entropy coding but sufficient to decide the best block partition.
  • Video quality loss is low (0.64% BD-rate loss).
  • FIG. 1 is a flow chart of the rate estimation method provided by the present invention.
  • FIG. 2 shows the probability density functions of the conventional generalized Gaussian distribution model and of the mixed model proposed by the present invention.
  • The dashed line represents the conventional generalized Gaussian distribution model.
  • The solid line represents the mixed model of generalized Gaussian and uniform distributions.
  • The intersections of the two vertical lines with the horizontal axis are the boundary points between the generalized Gaussian distribution and the uniform distribution in the mixed model.
  • FIG. 3 is a schematic diagram of calculating the probability that a position in the prediction block is quantized to a specific value in the present invention.
  • The curve is the residual distribution for one position in the prediction block; the horizontal axis is the value at that position before quantization, and the vertical axis is the probability density of that pre-quantization value. In the quantization formula, f is the quantization offset, Q_step is the quantization step size, int(x) is the rounding function, and sgn(x) is the sign function; for a given quantized result, the formula determines the interval in which the corresponding pre-quantization value may lie.
  • For a quantized result of zero, the pre-quantization value may lie in (-(1-f)·Q_step, (1-f)·Q_step); this interval is the quantization dead zone, and the area of the white region in the middle of FIG. 3 is the probability that the position is quantized to zero.
  • The invention provides a fast rate estimation method for the rate-distortion optimization (RDO) module in I-frame coding. It estimates the coding rate of each prediction mode in the RDO process and thereby avoids the large time complexity of true entropy coding, saving considerable coding time (37.7% of RDO module time) with little loss of video quality (0.64% BD-rate loss).
  • The present invention is applicable to rate estimation of I-frames in video coding.
  • The invention models the residual information of the prediction block and, under the corresponding model, estimates the number of coded bits of the prediction block according to information entropy theory, so that entropy coding can be skipped in the RDO process and coding time is effectively reduced.
  • FIG. 1 is a block diagram of the rate estimation method provided by the present invention, which mainly comprises three parts: collecting prediction block distribution statistics and modeling, estimating the number of coded bits of a prediction mode from the model, and correcting the estimated number of coded bits. Specifically, it includes the following steps:
  • In principle the model is updated once per frame, i.e. the statistics of the current frame are used for prediction in the next frame. However, if a certain number of samples has not been reached by the end of a frame, the model is not updated, and the collected samples are carried over and updated together with the samples of the next frame.
  • For each block size (4×4 to 64×64), the above method is used to decide the optimal prediction mode of the current block size.
  • Once the optimal mode for each block size has been found, a simplified entropy coding is performed on these optimal modes, and its result is used as the estimated number of coded bits of each optimal mode, in order to choose the optimal partition among the different block sizes.
  • f_uv(x) denotes the probability density distribution at each position in the prediction block (u, v are the coordinates of the position in the prediction block), σ_uv denotes the standard deviation at that position, η_uv controls the shape of the probability density function, and Γ(·) denotes the gamma function.
  • The present invention combines a generalized Gaussian distribution model with a uniform distribution model; the mixed model expression f′_uv(x) is given by Equation 3:
  • θ_uv is an adjustment factor that ensures that the probability density function integrates to 1 over the whole interval.
  • b_uv is the boundary between the generalized Gaussian distribution and the uniform distribution.
  • m_uv characterizes the maximum value that x can take after quantization. This gives the expression of the mixed model of generalized Gaussian and uniform distributions, shown in FIG. 2.
  • FIG. 2 shows the probability density functions of the conventional generalized Gaussian distribution model and of the mixed model proposed by the present invention.
  • FIG. 3 is a schematic diagram of calculating the probability that a position in the prediction block is quantized to a specific value according to the present invention. As shown in FIG. 3, for the generalized Gaussian part of the model, the calculation is given by Equation 4:
  • For the uniformly distributed part of the model, the calculation is given by Equation 5:
  • The formula for calculating the probability has relatively high computational complexity, so the results are stored in a lookup table in which the RDO module can find them directly, which effectively improves the time efficiency of the RDO module.
  • The above method can quickly estimate the optimal prediction mode within a single block size, but the estimated number of coded bits does not work well when deciding between different block sizes, so the estimated rate needs to be corrected.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

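The present invention discloses a fast rate estimation method for the rate-distortion optimization module in I-frame coding in the field of video coding. It models the residual information of the prediction block and, under the corresponding model, estimates the number of coded bits of the prediction block according to information entropy theory, so that entropy coding can be skipped in the RDO process. The method comprises: collecting prediction block distribution statistics and modeling to obtain a mixed model, estimating the number of coded bits of a prediction mode from the model, and correcting the estimated number of coded bits. It estimates the coding rate of each prediction mode in the RDO process and thereby avoids the large time complexity of true entropy coding, effectively reducing coding time with little loss of video quality. The invention is applicable to rate estimation of I-frames in video coding.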

Description

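Rate estimation method for intra-frame coding in video coding
Technical Field
The present invention belongs to the technical field of video coding and relates to intra-frame (I-frame) predictive coding, and in particular to a rate estimation method for intra-frame (I-frame) coding in video coding. The method can be used to select the best intra prediction mode quickly and efficiently, reducing the time required for intra-frame coding.
Background
Video coding technology is the technology of compressing video. Everyday video data consists of two parts, information and redundant data; video coding aims to remove the redundant part so as to reduce the burden of storing and transmitting video data. Current mainstream video coding platforms all adopt a block-based hybrid coding framework, in which image blocks go through prediction, transform, quantization, entropy coding and related steps to eliminate as much of the statistical redundancy in the video data as possible.
Rate-distortion optimization (RDO) is a method that minimizes the number of coded bits under a given video quality, or minimizes coding distortion under a given bit-rate constraint. The rate-distortion process uses a rate-distortion optimization function to evaluate each prediction mode, mainly by examining the mode's distortion and number of coded bits. At equal distortion, the prediction mode with fewer coded bits is selected in the RDO process. The RDO process therefore needs the distortion and the number of coded bits of every prediction mode. However, obtaining the number of coded bits of a prediction mode requires entropy coding the transformed and quantized result of the prediction block, which is a very time-consuming process.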
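The mode decision described above can be sketched as follows. This is a minimal illustration, not the encoder's actual code: it assumes the usual Lagrangian form J = D + λ·R for the rate-distortion optimization function (the text does not spell the function out), and the `Candidate` type, the mode labels and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    mode: str          # hypothetical prediction-mode label
    distortion: float  # distortion of this mode, e.g. a sum of squared errors
    bits: float        # number of coded bits (measured or estimated)


def rd_cost(c: Candidate, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return c.distortion + lam * c.bits


def best_mode(candidates, lam):
    """Return the candidate with the smallest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


candidates = [
    Candidate("DC", distortion=100.0, bits=40.0),
    Candidate("planar", distortion=100.0, bits=32.0),
    Candidate("angular_10", distortion=90.0, bits=60.0),
]
chosen = best_mode(candidates, lam=1.0)
```

With equal distortion, the candidate with fewer bits wins, which matches the behaviour stated in the text; the `bits` field is exactly the quantity that is expensive to obtain through entropy coding.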
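Summary of the Invention
To overcome the shortcomings of the prior art described above, the present invention provides a rate estimation method for intra-frame (I-frame) coding in video coding. It models the information of the prediction block and uses information entropy theory under the corresponding model to estimate the number of coded bits of the prediction block, thereby skipping entropy coding. The method can be used to select the best intra prediction mode quickly and efficiently, reducing the time required for intra-frame coding. It saves considerable coding time (37.7% of RDO module time) with little loss of video quality (0.64% BD-rate loss).
The rate-distortion optimization (RDO) process first selects the optimal prediction mode under each block partition, and then selects the best block partition from among the optimal modes of the various partitions. The key innovation of the present invention is a new model that combines a generalized Gaussian distribution with a uniform distribution. The new model fits the residual distribution in video coding more closely, and because the tail of the model is a uniform distribution, the computational complexity of updating the lookup table during rate estimation is greatly reduced. The method also adds a step that corrects the estimated number of coded bits, which amounts to a correction of the decision made with the mixed model, so that the corrected result is closer to the result of true entropy coding.
The technical solution provided by the present invention is:
A rate estimation method for intra-frame coding in video coding models the residual information of the prediction block and, under the corresponding model, estimates the number of coded bits of the prediction block according to information entropy theory, so that entropy coding can be skipped in the RDO (rate-distortion optimization) process and coding time is effectively reduced. The rate estimation method mainly comprises: a process of collecting prediction block distribution statistics and modeling, a process of estimating the number of coded bits of a prediction mode from the model, and a process of correcting the estimated number of coded bits. The specific steps are as follows:
1) Collect prediction block distribution statistics and build the model, as follows:
11) Taking one frame as the unit, collect the residual distribution at each position for the different prediction block sizes (the RDO process partitions prediction blocks into different sizes, from 4×4 to 64×64), and fit the residual at each position in the prediction block with a mixture of a generalized Gaussian distribution model and a uniform distribution model; the resulting probability density function serves as the mixed model describing the probability distribution at each position in the prediction block. (Note: in principle the mixed model is updated once per frame, i.e. the statistics of the current frame are used for prediction in the next frame. However, if a certain number of samples has not been reached by the end of a frame, the model is not updated, and the collected samples are carried over and updated together with the samples of the next frame.)
12) Using the model obtained in step 11), calculate the probability that the quantized result at each position of the prediction block equals a specific value (that is, for a given position in a given prediction block, the probability that its quantized value is x), and then derive the information entropy from that probability. The calculated information entropy is used as the estimated number of coded bits for that quantized value at that position of the prediction block. The estimated numbers of coded bits are stored in a data table, so that the RDO process of the next frame only needs to query the table to obtain the corresponding number of coded bits.
2) Estimate, from the model, the number of coded bits of each prediction mode in the RDO process:
The modeling process above gives us the estimated number of coded bits for every quantized value at every position of the prediction block. In the RDO process, after the prediction, transform and quantization steps, we look up the estimated number of bits for each position in the corresponding data table according to that position's quantized result, and add up the numbers of coded bits of all positions of the prediction block to obtain the estimated number of coded bits of the prediction block, thereby skipping entropy coding.
3) Correct the estimated number of coded bits:
Using the result of step 2), we can skip entropy coding and select the best prediction mode for each block size (4×4 to 64×64). (The RDO process already does this; our invention uses step 2) to replace the computationally expensive entropy coding in the RDO process.) The RDO process then selects, from the optimal modes of the different block partitions, one best mode as the final partition result. In the present invention, once we have the optimal mode for each block size, we perform a simplified entropy coding on these optimal modes and use its result as the estimated number of coded bits of each optimal mode when choosing the optimal partition among the different block sizes.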
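Compared with the prior art, the beneficial effects of the present invention are:
The present invention provides a rate estimation method for intra-frame (I-frame) coding in video coding. It models the information of the prediction block and uses information entropy theory under the corresponding model to estimate the number of coded bits of the prediction block, thereby skipping entropy coding. The method can be used to select the best intra prediction mode quickly and efficiently, reducing the time required for intra-frame coding. The invention has the following advantages:
(1) The mixed model of generalized Gaussian and uniform distributions matches the real residual distribution fairly well and reliably identifies the optimal prediction mode within a given block size.
(2) The optimal mode of each block size is entropy coded again (simplified entropy coding); this is less complex than true entropy coding but sufficient to decide the best block partition. Video quality loss is low (0.64% BD-rate loss).
(3) The statistical model is updated once per frame, and the RDO process only needs simple table lookups, so time complexity is low. Considerable coding time is saved (37.7% of RDO module time).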
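Brief Description of the Drawings
FIG. 1 is a flow chart of the rate estimation method provided by the present invention.
FIG. 2 shows the probability density functions of the conventional generalized Gaussian distribution model and of the mixed model proposed by the present invention;
here, the dashed line represents the conventional generalized Gaussian distribution model; the solid line represents the mixed model of generalized Gaussian and uniform distributions; the intersections of the two vertical lines with the horizontal axis are the boundary points between the generalized Gaussian distribution and the uniform distribution in the mixed model.
FIG. 3 is a schematic diagram of calculating the probability that a position in the prediction block is quantized to a specific value in the present invention;
here, the curve is the residual distribution for one position in the prediction block; the horizontal axis is the value at that position before quantization, and the vertical axis is the probability density of that pre-quantization value; for this position, when the quantized result is
Figure PCTCN2017094015-appb-000001
the quantization formula
Figure PCTCN2017094015-appb-000002
(where f is the quantization offset, Q_step is the quantization step size, int(x) is the rounding function, and sgn(x) is the sign function) shows that the corresponding pre-quantization value may lie in the interval
Figure PCTCN2017094015-appb-000003
Figure PCTCN2017094015-appb-000004
Integrating the curve over this interval therefore gives the probability that the value at this position is quantized to
Figure PCTCN2017094015-appb-000005
which is the area of the black shaded region in FIG. 3; when the quantized result is
Figure PCTCN2017094015-appb-000006
the pre-quantization value may lie in (-(1-f)·Q_step, (1-f)·Q_step), which is the quantization dead zone; the area of the white region in the middle of FIG. 3 is the probability that the quantized result at this position is
Figure PCTCN2017094015-appb-000007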
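The mapping between a quantized level and its pre-quantization interval can be sketched as below. The quantization formula itself appears only as an image on this page, so the sketch assumes the common dead-zone form sgn(x)·int(|x|/Q_step + f), which is consistent with the named symbols and with the dead zone (-(1-f)·Q_step, (1-f)·Q_step) stated above. The function names are hypothetical.

```python
def quantize(x: float, q_step: float, f: float) -> int:
    """Assumed dead-zone quantizer: sgn(x) * int(|x| / q_step + f)."""
    level = int(abs(x) / q_step + f)
    return level if x >= 0 else -level


def prequant_interval(level: int, q_step: float, f: float):
    """Return (lo, hi), the range of pre-quantization values mapped to `level`.

    For level 0 this is the dead zone; for other levels the interval
    has width q_step and lies on the same side of zero as the level.
    """
    if level == 0:
        bound = (1.0 - f) * q_step
        return (-bound, bound)
    k = abs(level)
    lo, hi = (k - f) * q_step, (k + 1 - f) * q_step
    return (lo, hi) if level > 0 else (-hi, -lo)


dead_zone = prequant_interval(0, q_step=10.0, f=0.25)
level_one = prequant_interval(1, q_step=10.0, f=0.25)
```

Integrating the fitted residual density over such an interval gives the shaded areas described for FIG. 3.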
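Detailed Description
The invention is further described below through embodiments with reference to the drawings, without limiting the scope of the invention in any way.
The present invention provides a fast rate estimation method for the rate-distortion optimization (RDO) module in I-frame coding in the field of video coding. It estimates the coding rate of each prediction mode in the RDO process and thereby avoids the large time complexity of true entropy coding, saving considerable coding time (37.7% of RDO module time) with little loss of video quality (0.64% BD-rate loss). The invention is applicable to rate estimation of I-frames in video coding.
The invention models the residual information of the prediction block and, under the corresponding model, estimates the number of coded bits of the prediction block according to information entropy theory, so that entropy coding can be skipped in the RDO process and coding time is effectively reduced. FIG. 1 is a flow chart of the rate estimation method provided by the present invention, which mainly comprises three parts: collecting prediction block distribution statistics and modeling, estimating the number of coded bits of a prediction mode from the model, and correcting the estimated number of coded bits. Specifically, it includes the following steps:
1) Collect prediction block distribution statistics and build the model, as follows:
11) Taking one frame as the unit, collect the residual distribution at each position for the different prediction block sizes (4×4 to 64×64), and fit the residual at each position in the prediction block with a mixture of a generalized Gaussian distribution model and a uniform distribution model to obtain the probability distribution at each position in the prediction block.
Here, in principle the model is updated once per frame, i.e. the statistics of the current frame are used for prediction in the next frame. However, if a certain number of samples has not been reached by the end of a frame, the model is not updated, and the collected samples are carried over and updated together with the samples of the next frame.
12) Calculate the probability that the quantized result at each position of the prediction block equals a specific value, and then use the information entropy formula (f(P) = -log(P)) to derive the information entropy from that probability. The calculated information entropy is used as the estimated number of coded bits for that quantized value at that position of the prediction block. The estimated numbers of coded bits are stored in a data table, so that the RDO process of the next frame only needs to query the table to obtain the corresponding number of coded bits.
2) Estimate the number of coded bits of a prediction mode from the model
The modeling process above gives us the estimated number of coded bits for every quantized value at every position of the prediction block. In the RDO process, after the prediction, transform and quantization steps, we look up the estimated number of bits for each position in the corresponding data table according to that position's quantized result, and add up the results to obtain the estimated number of coded bits of the prediction block, thereby skipping entropy coding.
3) Correct the estimated number of coded bits
For each block size (4×4 to 64×64), we use the above method to decide the optimal prediction mode of the current block size. Once we have the optimal mode for each block size, we perform a simplified entropy coding on these optimal modes and use its result as the estimated number of coded bits of each optimal mode when choosing the optimal partition among the different block sizes.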
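The per-frame update rule in step 11) can be sketched as below: statistics are refitted at the end of a frame only when enough samples have been collected, and otherwise the samples are carried over to the next frame. This is an illustrative sketch only. The class name and the `min_samples` threshold are hypothetical (the text says only "a certain number of samples"), and the refit here estimates just a zero-mean standard deviation as a stand-in for fitting the full mixed model.

```python
class PositionStats:
    """Accumulates residual samples for one (block size, u, v) position."""

    def __init__(self, min_samples: int):
        self.min_samples = min_samples   # hypothetical update threshold
        self.pending = []                # samples not yet used in a fit
        self.sigma = None                # last fitted standard deviation

    def add(self, residual: float) -> None:
        """Record one residual sample observed in the current frame."""
        self.pending.append(residual)

    def end_of_frame(self) -> bool:
        """Refit if enough samples were collected; otherwise carry them over."""
        if len(self.pending) < self.min_samples:
            return False                 # keep the old model and the samples
        mean_sq = sum(r * r for r in self.pending) / len(self.pending)
        self.sigma = mean_sq ** 0.5      # zero-mean standard deviation
        self.pending.clear()
        return True


stats = PositionStats(min_samples=4)
for r in (3.0, -3.0):
    stats.add(r)
updated_first = stats.end_of_frame()     # too few samples: no update
for r in (3.0, -3.0):
    stats.add(r)
updated_second = stats.end_of_frame()    # four samples: model refitted
```

In a full implementation the refit would also regenerate the bit lookup table that the next frame's RDO process queries.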
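In the process of collecting prediction block distribution statistics and modeling, specifically, we take one frame as the unit and collect the residual distribution at each position for the different prediction block sizes (each position has its own independent residual distribution model), and fit the residual at each position in the prediction block with a mixture of a generalized Gaussian distribution model and a uniform distribution model to obtain the probability distribution at each position in the prediction block. The generalized Gaussian distribution function is expressed as follows:
Figure PCTCN2017094015-appb-000008
where
Figure PCTCN2017094015-appb-000009
In these expressions, f_uv(x) denotes the probability density distribution at each position in the prediction block (u, v are the coordinates of the position in the prediction block), σ_uv denotes the standard deviation at that position, η_uv controls the shape of the probability density function (it is estimated with the expression
Figure PCTCN2017094015-appb-000010
Figure PCTCN2017094015-appb-000011
), and Γ(·) denotes the gamma function. The present invention combines the generalized Gaussian distribution model with the uniform distribution model; the mixed model expression f′_uv(x) is given by Equation 3:
Figure PCTCN2017094015-appb-000012
where θ_uv is an adjustment factor that ensures that the probability density function integrates to 1 over the whole interval, b_uv is the boundary between the generalized Gaussian distribution and the uniform distribution, and m_uv characterizes the maximum value that x can take after quantization. This gives the expression of the mixed model of generalized Gaussian and uniform distributions, shown in FIG. 2. FIG. 2 shows the probability density functions of the conventional generalized Gaussian distribution model and of the mixed model proposed by the present invention.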
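A mixed density of this kind can be sketched as below. The equations themselves appear only as images on this page, so everything in the sketch is an assumption: the generalized Gaussian is written in its standard zero-mean form with standard deviation `sigma` and shape `eta`, the tail is taken to be flat at the height the generalized Gaussian has at the boundary `b`, and the support is cut off at a hypothetical `x_max`. The factor `theta` is then chosen so that the density integrates to 1, as the text requires of θ_uv.

```python
import math


def ggd_pdf(x, sigma, eta):
    """Zero-mean generalized Gaussian density (standard textbook form)."""
    alpha = math.sqrt(math.gamma(3.0 / eta) / math.gamma(1.0 / eta))
    coef = eta * alpha / (2.0 * sigma * math.gamma(1.0 / eta))
    return coef * math.exp(-((alpha * abs(x) / sigma) ** eta))


def integrate(func, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi]; `steps` must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0


def make_mixed_pdf(sigma, eta, b, x_max):
    """GGD core on |x| <= b plus a flat tail on b < |x| <= x_max (assumed)."""
    core = integrate(lambda t: ggd_pdf(t, sigma, eta), -b, b)
    tail = 2.0 * (x_max - b) * ggd_pdf(b, sigma, eta)
    theta = 1.0 / (core + tail)          # normalizes the total area to 1

    def pdf(x):
        ax = abs(x)
        if ax <= b:
            return theta * ggd_pdf(ax, sigma, eta)
        if ax <= x_max:
            return theta * ggd_pdf(b, sigma, eta)
        return 0.0

    return pdf


mixed = make_mixed_pdf(sigma=4.0, eta=1.0, b=8.0, x_max=20.0)
area = integrate(mixed, -20.0, 20.0, steps=4000)
```

Because the tail is constant, the probability of any quantization interval lying entirely in the tail is just width times height, which is the property the text credits for the cheaper lookup-table update.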
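In the process of estimating the number of coded bits of a prediction mode from the model, specifically, we first use the probability density function of the mixed model to consider the case in which the quantized result equals a particular value
Figure PCTCN2017094015-appb-000013
and compute its probability
Figure PCTCN2017094015-appb-000014
FIG. 3 is a schematic diagram of calculating the probability that a position in the prediction block is quantized to a specific value in the present invention. As shown in FIG. 3, for the generalized Gaussian part of the model, the calculation is given by Equation 4:
Figure PCTCN2017094015-appb-000015
where f denotes the quantization offset and Q_step denotes the quantization step size. For the case
Figure PCTCN2017094015-appb-000016
we take
Figure PCTCN2017094015-appb-000017
as the approximate result of the calculation, while the probability in the case
Figure PCTCN2017094015-appb-000018
need not be computed: as explained below, we ignore the coded bits incurred when the quantized result is 0.
For the uniformly distributed part of the model, the calculation is given by Equation 5:
Figure PCTCN2017094015-appb-000019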
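The two probability calculations can be sketched as below. Equations 4 and 5 appear only as images here, so this is a sketch under stated assumptions: the probability of a nonzero level is the integral of the density over that level's pre-quantization interval (using the assumed dead-zone interval [(k-f)·Q_step, (k+1-f)·Q_step)), and when the interval lies entirely in the flat tail the integral collapses to Q_step times the density at the boundary b. The function names and the toy density are hypothetical.

```python
def interval_probability(pdf, lo, hi, steps=200):
    """Midpoint-rule integral of `pdf` over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(pdf(lo + (i + 0.5) * h) for i in range(steps))


def level_probability(pdf, level, q_step, f, b):
    """Probability that a coefficient is quantized to a nonzero `level`."""
    k = abs(level)
    lo, hi = (k - f) * q_step, (k + 1 - f) * q_step
    if lo >= b:
        # Interval lies entirely in the flat tail: area = width * height.
        return q_step * pdf(b)
    return interval_probability(pdf, lo, hi)


def flat(x):
    """Toy density for illustration: uniform on [-50, 50]."""
    return 0.01 if abs(x) <= 50.0 else 0.0


p_core = level_probability(flat, 1, q_step=10.0, f=0.25, b=100.0)
p_tail = level_probability(flat, 1, q_step=10.0, f=0.25, b=5.0)
```

The tail branch needs no numerical integration at all, which is where the saving in the table update comes from under these assumptions.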
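Once we have the probability, we can use information entropy theory to estimate the number of coded bits via Equation 6:
Figure PCTCN2017094015-appb-000020
In this way, Equation 7 gives the number of coded bits of each prediction block:
r_B = Σ_uv r_uv   (Equation 7)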
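Note that the formula for computing the probability has relatively high computational complexity. We can store the computed results in a lookup table, so that the computation is done only once, when the probability model is updated, and the results are cached; the RDO module can then simply look the results up, which effectively improves the time efficiency of the RDO module.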
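The caching scheme can be sketched as below. It is a minimal illustration under assumptions: bits are taken as -log2(P) (the text gives -log(P) without a base; base 2 is assumed because the result is a bit count), levels equal to 0 contribute no bits as stated above, and the dictionary layout and function names are hypothetical.

```python
import math


def build_bit_table(level_probs):
    """Map each nonzero quantized level to its estimated bits, -log2(P)."""
    table = {}
    for level, p in level_probs.items():
        if level == 0:
            continue                 # bits for zero levels are ignored
        table[level] = -math.log2(p)
    return table


def estimate_block_bits(levels, tables):
    """Sum per-position lookups: r_B = sum over (u, v) of r_uv."""
    total = 0.0
    for (u, v), level in levels.items():
        if level != 0:
            total += tables[(u, v)][level]
    return total


# Built once per model update (hypothetical probabilities) ...
probs = {0: 0.5, 1: 0.125, -1: 0.125, 2: 0.0625, -2: 0.0625}
tables = {(0, 0): build_bit_table(probs), (0, 1): build_bit_table(probs)}

# ... then queried for every candidate mode during RDO.
levels = {(0, 0): 1, (0, 1): -2}
block_bits = estimate_block_bits(levels, tables)
```

The expensive probability evaluation happens only in `build_bit_table`; the per-mode work in the RDO loop reduces to dictionary lookups and additions.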
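The above approach can quickly estimate the optimal prediction mode within a single block size, but the estimated number of coded bits does not work well when deciding between different block sizes, so the estimated rate needs to be corrected.
In the process of correcting the estimated rate, specifically, once we have the optimal mode for a block size, we can perform a simplified entropy coding on that mode and use the result as the basis for the RDO decision between different block sizes. The idea of the simplified entropy coding is simple: we run the full entropy coding process only up to the binarization step, and use the number of bits produced by binarization as the final estimated number of coded bits.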
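Counting binarization bits can be sketched as below. The text does not say which binarization is used, and real encoders binarize each syntax element with its own scheme, so this sketch substitutes a 0th-order Exp-Golomb code for the magnitude plus one sign bin as a stand-in; both function names are hypothetical.

```python
def exp_golomb_bits(value: int) -> int:
    """Length in bits of the 0th-order Exp-Golomb code of value >= 0."""
    return 2 * (value + 1).bit_length() - 1


def binarized_bits(levels) -> int:
    """Bins needed for a list of quantized levels (magnitude plus sign)."""
    total = 0
    for level in levels:
        total += exp_golomb_bits(abs(level))
        if level != 0:
            total += 1               # one sign bin for nonzero levels
    return total


bins = binarized_bits([0, 1, -3])
```

Stopping at binarization skips the context modelling and arithmetic coding stages, which is why this count is cheaper than a true entropy coding pass while still reflecting the structure of the coded symbols.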
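This completes all steps of rate estimation in the RDO process.
Note that the embodiments are disclosed to aid further understanding of the invention, but those skilled in the art will understand that various substitutions and modifications are possible without departing from the spirit and scope of the invention and the appended claims. The invention should therefore not be limited to what the embodiments disclose; the scope of protection claimed is defined by the claims.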

Claims (5)

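  1. A rate estimation method for intra-frame coding in video coding, which models the residual information of a prediction block and uses the resulting model to estimate the number of coded bits of the prediction block, so that entropy coding is skipped in the rate-distortion optimization process and coding time is effectively reduced; the rate estimation method mainly comprises: a process of collecting prediction block distribution statistics and modeling, a process of estimating the number of coded bits of a prediction mode from the model, and a process of correcting the estimated number of coded bits; the specific steps are as follows:
    1) collecting distribution statistics of the prediction block and modeling
    11) partitioning prediction blocks into different sizes and, taking one frame as the unit, collecting the residual distribution at each position of the prediction blocks of different sizes; fitting the residual at each position in the prediction block with a mixture of a generalized Gaussian distribution model and a uniform distribution model, the resulting probability density function serving as the mixed model describing the probability distribution at each position in the prediction block;
    12) calculating, for each position of the prediction block, the quantized value and the probability that the quantized value equals a specific value, then deriving the information entropy from the probability, and using the information entropy as the estimated number of coded bits corresponding to the quantized value at the respective position of the prediction block; and storing the estimated numbers of coded bits in a data table;
    2) estimating, from the model, the number of coded bits of each prediction mode in the rate-distortion optimization process: in the rate-distortion optimization process, after the prediction, transform and quantization steps, looking up the estimated number of coded bits of each position in the data table of step 12) according to the quantized value at that position, and adding up the estimated numbers of coded bits of all positions of the prediction block to obtain the estimated number of coded bits of the prediction block, thereby skipping entropy coding;
    3) correcting the estimated number of coded bits: for prediction blocks of different sizes, performing a simplified entropy coding on the optimal prediction mode of the current prediction block, and using the result of the simplified entropy coding as the estimated number of coded bits of the optimal prediction mode when choosing the optimal block partition among the different block sizes;
    thereby completing rate estimation in the rate-distortion optimization process.
  2. The rate estimation method of claim 1, wherein the different sizes of the prediction blocks for which statistics are collected in step 11) range from 4x4 to 64x64.
  3. The rate estimation method of claim 1, wherein step 11) builds the model by combining the generalized Gaussian distribution model with the uniform distribution model, the mixed model being:
    Figure PCTCN2017094015-appb-100001
    where θ_uv is an adjustment factor that ensures that the probability density function integrates to 1 over the whole interval; b_uv is the boundary between the generalized Gaussian distribution and the uniform distribution; m_uv characterizes the maximum value that x can take after quantization; f_uv(x) is the generalized Gaussian distribution function, given by Equation 1:
    Figure PCTCN2017094015-appb-100002
    where
    Figure PCTCN2017094015-appb-100003
    In these expressions, f_uv(x) denotes the probability density distribution at each position in the prediction block; u, v are the coordinates of the position in the prediction block; σ_uv denotes the standard deviation at that position; η_uv controls the shape of the probability density function and can be estimated with the expression
    Figure PCTCN2017094015-appb-100004
    Figure PCTCN2017094015-appb-100005
    and Γ(·) denotes the gamma function.
  4. The rate estimation method of claim 1, wherein step 2) estimates the number of coded bits of a prediction mode from the mixed model and specifically comprises the following steps:
    21) first calculating, from the probability density function, the probability that the quantized result equals a particular value; specifically, for the generalized Gaussian part of the model, the calculation is given by Equation 4:
    Figure PCTCN2017094015-appb-100006
    where f denotes the quantization offset and Q_step denotes the quantization step size; for the case
    Figure PCTCN2017094015-appb-100007
    taking
    Figure PCTCN2017094015-appb-100008
    as the approximate result of the calculation, while the probability in the case
    Figure PCTCN2017094015-appb-100009
    need not be computed, the coded bits incurred when the quantized result is 0 being ignored;
    for the uniformly distributed part of the model, the calculation is given by Equation 5:
    Figure PCTCN2017094015-appb-100010
    22) after obtaining the probability, estimating the number of coded bits r_uv via Equation 6
    Figure PCTCN2017094015-appb-100011
    23) obtaining the number of coded bits r_B of each prediction block via Equation 7
    r_B = Σ_uv r_uv  (Equation 7)
    where Q_step denotes the quantization step size; f′_uv(b_uv) is the value of the mixed model at b_uv; b_uv is the boundary between the generalized Gaussian distribution and the uniform distribution.
  5. The rate estimation method of claim 1, wherein the simplified entropy coding of step 3) specifically consists of running the full entropy coding process up to the binarization step and using the number of bits produced by binarization as the entropy coding result.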
PCT/CN2017/094015 2016-10-26 2017-07-24 Rate estimation method for intra-frame coding in video coding WO2018076827A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/334,931 US10917646B2 (en) 2016-10-26 2017-07-24 Intra code-rate predicting with rate distortion optimization method in video coding

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610948219.3 2016-10-26
CN201610948219.3A CN106454360B (zh) 2016-10-26 2016-10-26 Rate estimation method for intra-frame coding in video coding

Publications (1)

Publication Number Publication Date
WO2018076827A1 (zh) 2018-05-03

Family

ID=58178463

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/094015 WO2018076827A1 (zh) 2016-10-26 2017-07-24 Rate estimation method for intra-frame coding in video coding

Country Status (3)

Country Link
US (1) US10917646B2 (zh)
CN (1) CN106454360B (zh)
WO (1) WO2018076827A1 (zh)

Also Published As

Publication number Publication date
CN106454360B (zh) 2019-05-07
US10917646B2 (en) 2021-02-09
CN106454360A (zh) 2017-02-22
US20190289298A1 (en) 2019-09-19

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 17864603; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: PCT application non-entry in European phase. Ref document number: 17864603; Country of ref document: EP; Kind code of ref document: A1