CN114071162A - Image encoding method, image decoding method and related device - Google Patents

Image encoding method, image decoding method and related device

Info

Publication number
CN114071162A
CN114071162A (application CN202010748924.5A)
Authority
CN
China
Prior art keywords
intra
prediction
block
indication information
prediction filtering
Prior art date
Legal status
Withdrawn
Application number
CN202010748924.5A
Other languages
Chinese (zh)
Inventor
谢志煌
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202010748924.5A, published as CN114071162A
Priority to TW110123866A, published as TW202209878A
Priority to PCT/CN2021/109173, published as WO2022022622A1
Priority to CN202180060486.6A, published as CN116250240A
Publication of CN114071162A


Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 — using predictive coding
    • H04N19/593 — using predictive coding involving spatial prediction techniques
    • H04N19/10 — using adaptive coding
    • H04N19/169 — using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 — the unit being an image region, e.g. an object
    • H04N19/176 — the region being a block, e.g. a macroblock
    • H04N19/182 — the unit being a pixel


Abstract

The embodiments of the present application disclose an image encoding method, an image decoding method, and a related apparatus. The image encoding method includes the following steps: partitioning the image and determining intra-prediction filtering indication information for the current coding block; if it is determined from the intra-prediction filtering indication information that the current coding block needs to use the first intra-prediction filtering mode, setting the first usage flag bit of the first intra-prediction filtering mode of the current coding block to "allowed"; transmitting the intra-prediction filtering indication information, the first intra-prediction filtering mode, and the first usage flag bit in the code stream; and superimposing the prediction block of the current coding block on the residual block obtained after inverse transform and inverse quantization to obtain a reconstructed block. The embodiments offer a choice of smoothing or local blurring for intra prediction: for image-texture regions that do not need much sharpening, the technique makes the predicted pixels smoother and brings the prediction block closer to the original image, ultimately improving coding efficiency.

Description

Image encoding method, image decoding method and related device
Technical Field
The present application relates to the field of electronic device technologies, and in particular, to an image encoding method, an image decoding method, and a related apparatus.
Background
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, Personal Digital Assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, video conferencing devices, video streaming devices, and so forth.
Digital video devices implement video compression techniques, such as those described in the standards defined by Moving Picture Experts Group (MPEG)-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and ITU-T H.265 High Efficiency Video Coding (HEVC), and their extensions, to transmit and receive digital video information more efficiently. By implementing these video codec techniques, video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently.
With the proliferation of internet video, ever-higher video compression ratios are demanded, even as digital video compression technology continues to evolve.
Disclosure of Invention
The embodiments of the present application provide an image encoding method, an image decoding method, and a related apparatus, which offer a choice of smoothing or local blurring for intra prediction: for image-texture regions that do not need much sharpening, the technique makes the predicted pixels smoother and brings the prediction block closer to the original image, ultimately improving coding efficiency.
In a first aspect, an embodiment of the present application provides an image encoding method, including:
dividing the image, and determining intra-prediction filtering indication information of a current coding block, wherein the intra-prediction filtering indication information comprises first indication information and second indication information, the first indication information is used for indicating whether a first intra-prediction filtering mode is allowed to be used, the second indication information is used for indicating whether a second intra-prediction filtering mode is allowed to be used, and the first intra-prediction filtering mode is an intra-prediction filtering IPF mode;
if it is determined from the intra-prediction filtering indication information that the current coding block needs to use the first intra-prediction filtering mode, setting a first usage flag bit of the first intra-prediction filtering mode of the current coding block to "allowed";
transmitting the intra-frame prediction filtering indication information, the first intra-frame prediction filtering mode and the first use identification bit via a code stream;
and superimposing the prediction block of the current coding block on the residual block obtained after inverse transform and inverse quantization to obtain a reconstructed block, which serves as a prediction reference block for the next coding block.
Compared with the prior art, this scheme offers a choice of smoothing or local blurring for intra prediction: for image-texture regions that do not need much sharpening, the technique makes the predicted pixels smoother and brings the prediction block closer to the original image, ultimately improving coding efficiency.
In a second aspect, an embodiment of the present application provides an image decoding method, including:
parsing a code stream, and determining intra-prediction filtering indication information and a first usage flag bit of a current decoding block, wherein the intra-prediction filtering indication information comprises first indication information and second indication information, the first indication information is used for indicating whether a first intra-prediction filtering mode is allowed to be used, the second indication information is used for indicating whether a second intra-prediction filtering mode is allowed to be used, the first intra-prediction filtering mode is an intra prediction filtering (IPF) mode, and the first usage flag bit is the usage flag bit of the first intra-prediction filtering mode;
and determining, according to the intra-prediction filtering indication information and the first usage flag bit, that the prediction block of the decoding block is obtained using the first intra-prediction filtering mode.
Compared with the prior art, this scheme offers a choice of smoothing or local blurring for intra prediction: for image-texture regions that do not need much sharpening, the technique makes the predicted pixels smoother and brings the prediction block closer to the original image, ultimately improving coding efficiency.
In a third aspect, an embodiment of the present application provides an image encoding apparatus, including:
a dividing unit, configured to divide an image, and determine intra prediction filtering indication information of a current encoding block, where the intra prediction filtering indication information includes first indication information and second indication information, the first indication information is used to indicate whether a first intra prediction filtering mode is allowed to be used, the second indication information is used to indicate whether a second intra prediction filtering mode is allowed to be used, and the first intra prediction filtering mode is an intra prediction filtering IPF mode;
a determining unit, configured to set a first usage flag of the first intra-prediction filtering mode of the current coding block to be allowed to be used if it is determined that the current coding block needs to use the first intra-prediction filtering mode according to the intra-prediction filtering indication information;
a transmission unit, configured to transmit the intra-prediction filtering indication information, the first intra-prediction filtering mode, and the first usage flag via a code stream;
and a superimposing unit, configured to superimpose the prediction block of the current coding block on the residual block obtained after inverse transform and inverse quantization to obtain a reconstructed block, which serves as a prediction reference block for the next coding block.
In a fourth aspect, an embodiment of the present application provides an image decoding apparatus, including:
a parsing unit, configured to determine intra-prediction filtering indication information and a first usage flag bit of a current decoding block, where the intra-prediction filtering indication information includes first indication information and second indication information, the first indication information is used to indicate whether a first intra-prediction filtering mode is allowed to be used, the second indication information is used to indicate whether a second intra-prediction filtering mode is allowed to be used, the first intra-prediction filtering mode is an intra prediction filtering (IPF) mode, and the first usage flag bit is the usage flag bit of the first intra-prediction filtering mode;
a determining unit, configured to determine, according to the intra-prediction filtering indication information and the first usage flag bit, that the prediction block of the decoding block is obtained using the first intra-prediction filtering mode.
In a fifth aspect, an embodiment of the present application provides an encoder, including: a processor and a memory coupled to the processor; the processor is configured to perform the method of the first aspect.
In a sixth aspect, an embodiment of the present application provides a decoder, including: a processor and a memory coupled to the processor; the processor is configured to perform the method of the second aspect.
In a seventh aspect, an embodiment of the present application provides a terminal, including: one or more processors, a memory, and a communication interface, the memory and the communication interface being coupled to the one or more processors. The terminal communicates with other devices through the communication interface; the memory stores computer program code comprising instructions which, when executed by the one or more processors, perform the method of the first or second aspect.
In an eighth aspect, the present invention provides a computer-readable storage medium, having stored therein instructions, which, when executed on a computer, cause the computer to perform the method of the first or second aspect.
In a ninth aspect, embodiments of the present application provide a computer program product comprising instructions that, when executed on a computer, cause the computer to perform the method of the first or second aspect.
Drawings
To illustrate the embodiments of the present invention or the prior-art solutions more clearly, the drawings used in their description are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic block diagram of a coding tree unit in an embodiment of the present application;
FIG. 2 is a schematic block diagram of a CTU and a coding block CU in an embodiment of the present application;
FIG. 3 is a schematic block diagram of a color format in an embodiment of the present application;
FIG. 4 is a schematic diagram of an IPF in an embodiment of the present application;
FIG. 5 is a diagram illustrating intra prediction filtering according to an embodiment of the present application;
FIG. 6 is a schematic block diagram of a video coding system in an embodiment of the present application;
FIG. 7 is a schematic block diagram of a video encoder in an embodiment of the present application;
FIG. 8 is a schematic block diagram of a video decoder in an embodiment of the present application;
FIG. 9 is a flowchart illustrating an image encoding method according to an embodiment of the present application;
FIG. 10 is a flowchart illustrating an image decoding method according to an embodiment of the present application;
FIG. 11A is a schematic diagram of a filling of a prediction block in the embodiment of the present application;
FIG. 11B is another schematic illustration of the filling of a prediction block in the embodiment of the present application;
FIG. 12 is a block diagram of a functional unit of an image encoding apparatus according to an embodiment of the present application;
FIG. 13 is a block diagram showing another functional unit of the image encoding apparatus according to the embodiment of the present application;
FIG. 14 is a block diagram of a functional unit of an image decoding apparatus according to an embodiment of the present application;
fig. 15 is a block diagram of another functional unit of the image decoding apparatus in the embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It will be understood that, as used herein, the terms "first," "second," and the like may be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another. For example, a first client may be referred to as a second client, and similarly, a second client may be referred to as a first client, without departing from the scope of the present invention. Both the first client and the second client are clients, but they are not the same client.
First, terms used in the embodiments of the present application will be described.
For the partition of images, in order to more flexibly represent Video contents, a Coding Tree Unit (CTU), a Coding Unit (CU), a Prediction Unit (PU), and a Transform Unit (TU) are defined in the High Efficiency Video Coding (HEVC) technology. The CTU, CU, PU, and TU are all image blocks.
Coding tree unit (CTU): an image is composed of a plurality of CTUs. A CTU generally corresponds to a square image area containing the luminance and chrominance pixels of that area (or only luminance pixels, or only chrominance pixels). The CTU also contains syntax elements indicating how the CTU is divided into at least one coding unit (CU) and how each coding block is decoded to obtain the reconstructed picture. As shown in fig. 1, the picture 10 is composed of a plurality of CTUs (including CTU A, CTU B, CTU C, and so on). The coding information corresponding to a CTU includes the luminance and/or chrominance values of the pixels in its square image area, and may also contain syntax elements indicating how to divide the CTU into at least one CU and how to decode each CU to obtain the reconstructed picture. The image area corresponding to one CTU may contain 64 × 64, 128 × 128, or 256 × 256 pixels. In one example, a CTU of 64 × 64 pixels comprises a rectangular pixel lattice of 64 columns of 64 pixels each, where each pixel comprises a luminance component and/or a chrominance component. A CTU may also correspond to a rectangular image region or a region of another shape, including one whose horizontal pixel count differs from its vertical pixel count, for example 64 × 128 pixels.
Coding block (CU): as shown in fig. 2, a CTU may be further divided into coding blocks (CUs). A CU generally corresponds to an A × B rectangular region in the image and includes A × B luma pixels and/or the corresponding chroma pixels, where A is the width of the rectangle and B is its height; A and B may be the same or different and generally take values that are integer powers of 2, such as 128, 64, 32, 16, 8, or 4. Here, width refers to the length along the X axis (horizontal direction) of the two-dimensional rectangular coordinate system XoY shown in fig. 1, and height refers to the length along the Y axis (vertical direction). The reconstructed image of a CU may be obtained by adding a predicted image to a residual image: the predicted image is generated by intra prediction or inter prediction and may consist of one or more prediction blocks (PBs), while the residual image is generated by inverse quantization and inverse transform of transform coefficients and may consist of one or more transform blocks (TBs). Specifically, a CU contains coding information, including the prediction mode, transform coefficients, and the like; decoding processing such as the corresponding prediction, inverse quantization, and inverse transform is performed on the CU according to this coding information to generate its reconstructed image. The relationship between the coding tree unit and the coding block is shown in fig. 2.
The prediction unit (PU) is the basic unit of intra and inter prediction. The motion information of an image block includes the inter prediction direction, the reference frame, the motion vector, and so on. An image block undergoing encoding is called the current coding block (CCB), and one undergoing decoding is called the current decoding block (CDB); for example, while an image block is being predicted, the current coding block or current decoding block is a prediction block, and while it is being residual-processed, it is a transform block. The picture containing the current coding block or current decoding block is called the current frame. Within the current frame, image blocks to the left of or above the current block (in the coordinate system of fig. 1, "left" is the negative X direction and "above" the positive Y direction) may lie inside the current frame and have already completed encoding/decoding, yielding reconstructed images; these are called reconstructed blocks, and information such as their coding mode and reconstructed pixels is available. A frame whose encoding/decoding completed before that of the current frame is called a reconstructed frame. When the current frame is a uni-directionally predicted frame (P frame) or a bi-directionally predicted frame (B frame), it has one or two reference frame lists, denoted L0 and L1 respectively, each containing at least one reconstructed frame, called a reference frame of the current frame. Reference frames provide reference pixels for inter prediction of the current frame.
The transform unit (TU) processes the residual between the original image block and the predicted image block.
A pixel (also called a picture element) refers to a pixel point in an image, such as a pixel in a coding block, a pixel in a luminance-component pixel block (a luma pixel), or a pixel in a chrominance-component pixel block (a chroma pixel).
A sample (also referred to as a sample value or pixel value) is the value of a pixel: in the luma component domain, the pixel value is the luminance (i.e., the gray-scale value); in the chroma component domain, it is the chrominance value (i.e., color and saturation). Depending on the processing stage, a pixel's sample may specifically be an original sample, a predicted sample, or a reconstructed sample.
Description of directions: the horizontal direction runs along the X axis of the two-dimensional rectangular coordinate system XoY shown in fig. 1; the vertical direction runs along the negative Y axis of the same coordinate system.
Intra prediction generates a predicted image of the current block from its spatially neighboring pixels; each intra prediction mode corresponds to one method of generating the predicted image. The partitioning of an intra prediction unit includes a 2N × 2N mode and an N × N mode: in the 2N × 2N mode, the image block is not divided; the N × N mode divides the image block into four equal-sized sub-blocks.
Typically, digital video compression techniques operate on video sequences coded in the YCbCr (also called YUV) color space, with a color format of 4:2:0, 4:2:2, or 4:4:4. Y denotes luma (brightness, i.e., the gray-scale value), Cb denotes the blue chrominance component, and Cr denotes the red chrominance component; U and V denote chroma, describing color and saturation. In terms of color format, 4:2:0 means that every 4 pixels carry 4 luma components and 2 chroma components (YYYYCbCr); 4:2:2 means 4 luma components and 4 chroma components per 4 pixels (YYYYCbCrCbCr); and 4:4:4 is full sampling (YYYYCbCrCbCrCbCrCbCr). Fig. 3 shows the component distributions for the different color formats, where the circles are the Y component and the triangles are the UV components.
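As a small illustration of the chroma subsampling described above (our own sketch, not part of the patent; the helper name is hypothetical), the per-plane chroma dimensions implied by each color format can be computed as follows:

```python
def chroma_plane_size(width, height, fmt):
    """Return (chroma_width, chroma_height) of each chroma plane for a
    luma plane of width x height, for YCbCr formats 4:2:0, 4:2:2, 4:4:4."""
    if fmt == "4:2:0":  # chroma subsampled 2x both horizontally and vertically
        return width // 2, height // 2
    if fmt == "4:2:2":  # chroma subsampled 2x horizontally only
        return width // 2, height
    if fmt == "4:4:4":  # no subsampling: full-resolution chroma
        return width, height
    raise ValueError("unknown color format: " + fmt)

# A 64 x 64 luma block in 4:2:0 carries two 32 x 32 chroma planes.
print(chroma_plane_size(64, 64, "4:2:0"))  # (32, 32)
```

This matches the component counts in the text: 4:2:0 keeps one quarter as many chroma samples as luma, 4:2:2 keeps half, and 4:4:4 keeps all of them.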
The intra prediction part of digital video coding and decoding predicts the current coding unit block mainly from the image information of neighboring blocks in the current frame: the residual between the prediction block and the original image block is computed, and this residual information is transmitted to the decoding end after transform, quantization, and similar processing. After receiving and parsing the code stream, the decoding end recovers the residual information through inverse transform, inverse quantization, and related steps, and superimposes it on the predicted image block obtained by its own prediction to produce the reconstructed image block. In this process, intra prediction usually predicts the current coding block with the various angular and non-angular modes to obtain candidate prediction blocks, selects the optimal prediction mode of the current coding unit according to rate-distortion information computed from the prediction block and the original block, and then transmits that mode to the decoding end in the code stream. The decoding end parses the prediction mode, predicts the predicted image of the current decoding block, and superimposes the residual pixels carried in the code stream to obtain the reconstructed image.
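The final superposition step of that decode loop can be sketched as follows. This is a minimal illustration with hypothetical names, not the patent's implementation: it treats the residual as already recovered (i.e., after inverse transform and inverse quantization) and only shows the add-and-clip:

```python
def reconstruct_block(pred, residual, bit_depth=8):
    """Superimpose the residual (after inverse transform/quantization)
    on the prediction block and clip each sample to the valid range."""
    max_val = (1 << bit_depth) - 1  # 255 for 8-bit video
    return [[max(0, min(max_val, p + r)) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, residual)]

pred = [[100, 102], [98, 101]]
residual = [[3, -2], [200, -5]]   # the large residual demonstrates clipping
print(reconstruct_block(pred, residual))  # [[103, 100], [255, 96]]
```

The clip to [0, 2^bit_depth − 1] keeps reconstructed samples within the representable range after superposition.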
Through successive generations of digital video coding standards, the non-angular modes have remained relatively stable, comprising an average mode and a planar mode, while the angular modes have grown with each standard. Taking the international H-series standards as an example, H.264/AVC has only 8 angular prediction modes and 1 non-angular prediction mode; H.265/HEVC extends this to 33 angular and 2 non-angular prediction modes; and the latest versatile video coding standard, H.266/VVC, adopts 67 prediction modes, retaining 2 non-angular modes and expanding the angular modes from the 33 of H.265 to 65. Naturally, as the angular modes increase, intra prediction becomes more accurate, better meeting the demand for high-definition and ultra-high-definition video. Like the international standards, the Chinese digital audio/video coding standard AVS3 has also continued to expand its angular and non-angular modes; however, ultra-high-definition video places still higher demands on intra prediction, and coding efficiency cannot be improved merely by adding angular prediction modes and widening their angular range. AVS3 therefore adopts an Intra Prediction Filter (IPF) technique. Current intra angular prediction does not use all of the reference pixels, so the correlation between some pixels and the current coding unit is easily ignored; by point-to-point filtering, IPF improves pixel prediction precision and effectively strengthens spatial correlation, thereby improving intra prediction accuracy.
The IPF technique can be illustrated with the AVS3 prediction-mode direction from top-right to bottom-left, as shown in fig. 4, where URB denotes the boundary pixels of the left neighboring block adjacent to the current coding unit, MRB denotes the boundary pixels of the upper neighboring block adjacent to the current coding unit, and "filter direction" denotes the filtering direction. In this prediction direction, the predicted value of the current coding unit is generated mainly from the reference pixels in the MRB row of the upper neighboring block; that is, the predicted pixels of the current coding unit do not refer to the reconstructed pixels of the left neighboring block. The current coding unit is, however, spatially adjacent to the reconstructed block on its left, and referring only to the upper MRB pixels while ignoring the left URB pixels easily loses spatial correlation, resulting in a poor prediction effect.
The IPF technology is applied to all prediction modes of intra-frame prediction, and is a filtering method for improving intra-frame prediction precision. The IPF technology is mainly realized by the following processes:
a) judging the current prediction mode of the coding unit by the IPF technology, and dividing the current prediction mode into a horizontal angle prediction mode, a vertical angle prediction mode and a non-angle prediction mode;
b) according to different types of prediction modes, the IPF technology adopts different filters to filter input pixels;
c) according to different distances from the current pixel to the reference pixel, the IPF technology adopts different filter coefficients to filter the input pixel;
the input pixel of the IPF technique is a predicted pixel obtained in each prediction mode, and the output pixel is a final predicted pixel after IPF.
The IPF technique has an enable flag ipf_enable_flag, a binary variable: the value '1' indicates that intra prediction filtering may be used; the value '0' indicates that intra prediction filtering shall not be used. The IPF technique also has a usage flag ipf_flag, a binary variable: the value '1' indicates that intra prediction filtering shall be used; the value '0' indicates that intra prediction filtering shall not be used. If ipf_flag is not present in the code stream, it defaults to 0.
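As a minimal illustration of the flag semantics above — assuming a toy bitstream represented as a list of bits and a hypothetical `read_ipf_flag` helper that is not part of any reference decoder — the rule that ipf_flag defaults to 0 when absent can be sketched as:

```python
def read_ipf_flag(ipf_enable_flag: int, bitstream: list) -> int:
    """Return the ipf_flag value for the current coding unit.

    ipf_flag is only present in the code stream when intra prediction
    filtering is allowed at the sequence level (ipf_enable_flag == 1);
    when it is absent, it defaults to 0 (do not use IPF).
    """
    if ipf_enable_flag == 1 and bitstream:
        return bitstream.pop(0)  # consume the next bit from the stream
    return 0  # flag absent from the code stream: default 0
```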
Syntax element IPF _ flag, as follows:
(The syntax table for ipf_flag is provided as an image in the original filing and is not reproduced here.)
The IPF technique classifies prediction modes 0, 1 and 2 as non-angular prediction modes, and filters the prediction pixels using the first three-tap filter;
prediction modes 3 to 18 and 34 to 50 are classified as vertical-class angular prediction modes, and the prediction pixels are filtered using the first two-tap filter;
prediction modes 19 to 32 and 51 to 65 are classified as horizontal-class angular prediction modes, and the prediction pixels are filtered using the second two-tap filter.
The first three-tap filter applicable to the IPF technique has the following filtering formula:
P′(x,y)=f(x)·P(-1,y)+f(y)·P(x,-1)+(1-f(x)-f(y))·P(x,y)
the first two-tap filter applicable to the IPF technique has the following filtering formula:
P′(x,y)=f(x)·P(-1,y)+(1-f(x))·P(x,y)
the second two-tap filter suitable for the IPF technique has the following filtering formula:
P′(x,y)=f(y)·P(x,-1)+(1-f(y))·P(x,y)
In the above equations, P′(x, y) is the final prediction value of the pixel at position (x, y) of the current chroma prediction block; f(x) and f(y) are, respectively, the horizontal filter coefficient applied to the reconstructed pixels of the left neighboring block and the vertical filter coefficient applied to the reconstructed pixels of the upper neighboring block; P(−1, y) and P(x, −1) are, respectively, the reconstructed pixel to the left of row y and the reconstructed pixel above column x; and P(x, y) is the original prediction pixel value in the current chroma component prediction block. The values of x and y do not exceed the width and height of the current coding unit block.
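The mode classification and the three filtering formulas above can be sketched in Python as follows. This is an illustrative sketch only: the coefficient arrays `fx` and `fy` stand in for the f(x) and f(y) values of Table 1, which are not reproduced in this text, and modes outside the listed ranges are treated as horizontal-class here for simplicity.

```python
def classify_mode(mode: int) -> str:
    # Mode classes as described above: 0-2 non-angular; 3-18 and 34-50
    # vertical-class; 19-32 and 51-65 horizontal-class.
    if mode in (0, 1, 2):
        return "non-angular"
    if 3 <= mode <= 18 or 34 <= mode <= 50:
        return "vertical"
    return "horizontal"  # simplification: everything else treated as horizontal-class

def ipf_filter(pred, left, top, fx, fy, mode):
    """Apply the three IPF filtering formulas to a prediction block.

    pred   : 2-D list, original prediction pixels P(x, y)
    left   : list, reconstructed left-neighbour column P(-1, y)
    top    : list, reconstructed upper-neighbour row P(x, -1)
    fx, fy : per-position filter coefficients f(x), f(y)
    """
    h, w = len(pred), len(pred[0])
    out = [[0.0] * w for _ in range(h)]
    cls = classify_mode(mode)
    for y in range(h):
        for x in range(w):
            if cls == "non-angular":   # first three-tap filter
                out[y][x] = (fx[x] * left[y] + fy[y] * top[x]
                             + (1 - fx[x] - fy[y]) * pred[y][x])
            elif cls == "vertical":    # first two-tap filter
                out[y][x] = fx[x] * left[y] + (1 - fx[x]) * pred[y][x]
            else:                      # second two-tap filter
                out[y][x] = fy[y] * top[x] + (1 - fy[y]) * pred[y][x]
    return out
```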
The values of the horizontal and vertical filter coefficients depend on the size of the current coding unit block and on the distance from the prediction pixel in the current prediction block to the left and upper reconstructed pixels; the coefficients are organized into different coefficient groups according to the block size.
Table 1 gives the filter coefficients for the IPF technique.
Table 1 intra chroma prediction filter coefficients
(The coefficient table is provided as an image in the original filing and is not reproduced here.)
Fig. 5 is a schematic diagram illustrating the three filtering cases of intra prediction filtering: filtering the prediction values in the current coding unit with reference to only the upper reference pixels; filtering the prediction values in the current coding unit with reference to only the left reference pixels; and filtering the prediction values in the current coding unit block with reference to both the upper and left reference pixels.
FIG. 6 is a block diagram of a video coding system 1 of one example described in an embodiment of the present application. As used herein, the term "video coder" refers generically to both video encoders and video decoders. In this application, the term "video coding" or "coding" may refer generically to video encoding or video decoding. The video encoder 100 and the video decoder 200 of the video coding system 1 are used to implement the image encoding and decoding methods proposed in this application.
As shown in fig. 6, video coding system 1 includes a source device 10 and a destination device 20. Source device 10 generates encoded video data. Accordingly, source device 10 may be referred to as a video encoding device. Destination device 20 may decode the encoded video data generated by source device 10. Accordingly, the destination device 20 may be referred to as a video decoding device. Various implementations of source device 10, destination device 20, or both may include one or more processors and memory coupled to the one or more processors. The memory can include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures that can be accessed by a computer, as described herein.
Source device 10 and destination device 20 may comprise a variety of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, or the like.
Destination device 20 may receive encoded video data from source device 10 via link 30. Link 30 may comprise one or more media or devices capable of moving encoded video data from source device 10 to destination device 20. In one example, link 30 may comprise one or more communication media that enable source device 10 to transmit encoded video data directly to destination device 20 in real-time. In this example, source device 10 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to destination device 20. The one or more communication media may include wireless and/or wired communication media such as a Radio Frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (e.g., the internet). The one or more communication media may include a router, switch, base station, or other apparatus that facilitates communication from source device 10 to destination device 20. In another example, encoded data may be output from output interface 140 to storage device 40.
The image codec techniques of this application may be applied to video codecs to support a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, streaming video transmissions (e.g., via the internet), encoding for video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications. In some examples, video coding system 1 may be used to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
The video coding system 1 illustrated in fig. 6 is merely an example, and the techniques of this application may be applied to video coding settings (e.g., video encoding or video decoding) that do not necessarily include any data communication between an encoding device and a decoding device. In other examples, the data is retrieved from local storage, streamed over a network, and so forth. A video encoding device may encode and store data to a memory, and/or a video decoding device may retrieve and decode data from a memory. In many examples, the encoding and decoding are performed by devices that do not communicate with each other, but merely encode data to and/or retrieve data from memory and decode data.
In the example of fig. 6, source device 10 includes video source 120, video encoder 100, and output interface 140. In some examples, output interface 140 may include a modulator/demodulator (modem) and/or a transmitter. Video source 120 may comprise a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources of video data.
Video encoder 100 may encode video data from video source 120. In some examples, source device 10 transmits the encoded video data directly to destination device 20 via output interface 140. In other examples, encoded video data may also be stored onto storage device 40 for later access by destination device 20 for decoding and/or playback.
In the example of fig. 6, destination device 20 includes input interface 240, video decoder 200, and display device 220. In some examples, input interface 240 includes a receiver and/or a modem. Input interface 240 may receive encoded video data via link 30 and/or from storage device 40. The display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. In general, display device 220 displays decoded video data. The display device 220 may include a variety of display devices, such as a Liquid Crystal Display (LCD), a plasma display, an Organic Light Emitting Diode (OLED) display, or other types of display devices.
Although not shown in fig. 6, in some aspects, video encoder 100 and video decoder 200 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer units or other hardware and software to handle encoding of both audio and video in a common data stream or separate data streams.
Video encoder 100 and video decoder 200 may each be implemented as any of a variety of circuits such as: one or more microprocessors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), discrete logic, hardware, or any combinations thereof. If the present application is implemented in part in software, a device may store instructions for the software in a suitable non-volatile computer-readable storage medium and may execute the instructions in hardware using one or more processors to implement the techniques of the present application. Any of the foregoing, including hardware, software, a combination of hardware and software, etc., may be considered one or more processors. Each of video encoder 100 and video decoder 200 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (codec) in a respective device.
Fig. 7 is an exemplary block diagram of a video encoder 100 described in embodiments of the present application. The video encoder 100 is used to output the video to the post-processing entity 41. Post-processing entity 41 represents an example of a video entity, such as a media-aware network element (MANE) or a splicing/editing device, that may process the encoded video data from video encoder 100. In some cases, post-processing entity 41 may be an instance of a network entity. In some video encoding systems, post-processing entity 41 and video encoder 100 may be parts of separate devices, while in other cases, the functionality described with respect to post-processing entity 41 may be performed by the same device that includes video encoder 100. In some examples, post-processing entity 41 is an example of storage device 40 of fig. 6.
In the example of fig. 7, the video encoder 100 includes a prediction processing unit 108, a filter unit 106, a memory 107, a summer 112, a transformer 101, a quantizer 102, and an entropy encoder 103. The prediction processing unit 108 includes an inter predictor 110 and an intra predictor 109. For image block reconstruction, the video encoder 100 further includes an inverse quantizer 104, an inverse transformer 105, and a summer 111. Filter unit 106 represents one or more loop filters, such as deblocking filters, Adaptive Loop Filters (ALF), and Sample Adaptive Offset (SAO) filters. Although filter unit 106 is shown in fig. 7 as an in-loop filter, in other implementations, filter unit 106 may be implemented as a post-loop filter. In one example, the video encoder 100 may further include a video data memory, a partitioning unit (not shown).
Video encoder 100 receives video data and stores the video data in a video data memory. The partitioning unit partitions the video data into image blocks and these image blocks may be further partitioned into smaller blocks, e.g. image block partitions based on a quadtree structure or a binary tree structure. Prediction processing unit 108 may select one of a plurality of possible coding modes for the current image block, such as one of a plurality of intra coding modes or one of a plurality of inter coding modes. Prediction processing unit 108 may provide the resulting intra, inter coded block to summer 112 to generate a residual block and to summer 111 to reconstruct the encoded block used as the reference picture. An intra predictor 109 within prediction processing unit 108 may perform intra-predictive encoding of a current block of video relative to one or more neighboring encoded blocks of the current block to be encoded in the same frame or slice to remove spatial redundancy. Inter predictor 110 within prediction processing unit 108 may perform inter-predictive encoding of the current block relative to one or more prediction blocks in one or more reference pictures to remove temporal redundancy. The prediction processing unit 108 provides information indicating the selected intra or inter prediction mode of the current image block to the entropy encoder 103 so that the entropy encoder 103 encodes the information indicating the selected inter prediction mode.
After prediction processing unit 108 generates a prediction block for the current image block via inter/intra prediction, video encoder 100 forms a residual image block by subtracting the prediction block from the current image block to be encoded. Summer 112 represents one or more components that perform this subtraction operation. The residual video data in the residual block may be included in one or more TUs and applied to transformer 101. The transformer 101 transforms the residual video data into residual transform coefficients using a transform such as a Discrete Cosine Transform (DCT) or a conceptually similar transform. Transformer 101 may convert residual video data from a pixel value domain to a transform domain, e.g., the frequency domain.
The transformer 101 may send the resulting transform coefficients to the quantizer 102. Quantizer 102 quantizes the transform coefficients to further reduce the bit rate. In some examples, quantizer 102 may then perform a scan of a matrix that includes quantized transform coefficients. Alternatively, the entropy encoder 103 may perform a scan.
After quantization, the entropy encoder 103 entropy encodes the quantized transform coefficients. For example, the entropy encoder 103 may perform Context Adaptive Variable Length Coding (CAVLC), Context Adaptive Binary Arithmetic Coding (CABAC), syntax-based context adaptive binary arithmetic coding (SBAC), Probability Interval Partition Entropy (PIPE) coding, or another entropy encoding method or technique. After entropy encoding by the entropy encoder 103, the encoded codestream may be transmitted to the video decoder 200, or archived for later transmission or retrieved by the video decoder 200. The entropy encoder 103 may also entropy encode syntax elements of the current image block to be encoded.
Inverse quantizer 104 and inverse transformer 105 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain, e.g., for later use as a reference block for a reference image. The summer 111 adds the reconstructed residual block to the prediction block produced by the inter predictor 110 or the intra predictor 109 to produce a reconstructed image block. The filter unit 106 may be adapted to reconstruct the image block to reduce distortions, such as block artifacts. This reconstructed image block is then stored in memory 107 as a reference block, which may be used by inter predictor 110 as a reference block to inter predict a block in a subsequent video frame or image.
The video encoder 100 divides the input video into a number of coding tree units, each of which is in turn divided into a number of rectangular or square coding blocks. When the current coding block is coded in an intra prediction mode, the encoder traverses multiple prediction modes for the luma component of the current coding block and selects the best mode according to rate-distortion cost, and likewise traverses multiple prediction modes for the chroma component and selects the best mode according to rate-distortion cost. The residual between the original video block and the prediction block is then computed: one path of the residual forms the output code stream through transform, quantization, entropy coding and so on, while the other path forms reconstructed samples through inverse transform, inverse quantization, loop filtering and so on, serving as reference information for subsequent video compression.
The present IPF technique is implemented in the video encoder 100 as follows.
The input digital video information is divided into a plurality of coding tree units at a coding end, each coding tree unit is divided into a plurality of rectangular or square coding units, and each coding unit carries out intra-frame prediction process to calculate a prediction block.
In the current coding unit,
① if the IPF enable flag is '1', all of the following steps are performed;
② if the IPF enable flag is '0', only steps a1), b1), f1) and g1) are performed.
a1) The intra-frame prediction firstly traverses all prediction modes, calculates prediction pixels under each intra-frame prediction mode, and calculates the rate distortion cost according to the original pixels;
b1) selecting the optimal prediction mode of the current coding unit according to the principle of minimum rate distortion cost of all prediction modes, and recording the optimal prediction mode information and the rate distortion cost information corresponding to the optimal prediction mode information;
c1) traversing all intra-frame prediction modes again, starting an IPF technology in the process, firstly calculating prediction pixels under each intra-frame prediction mode to obtain a prediction block of the current coding unit;
d1) IPF is applied to the prediction block of the current coding unit: a filter is selected according to the current prediction mode and a filter coefficient group is selected according to the size of the current coding unit; the specific correspondence can be found in Table 1;
e1) calculating rate distortion cost information of each prediction mode according to the final prediction pixel obtained by the IPF technology and the original pixel, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
f1) if the IPF enable flag is '0', the prediction mode index recorded in b1) is transmitted to the decoding end in the code stream;
if the IPF enable flag is '1', the minimum cost recorded in b1) is compared with the minimum cost recorded in e1):
if the rate-distortion cost in b1) is lower, the prediction mode index recorded in b1) is coded as the best prediction mode of the current coding unit and transmitted to the decoding end in the code stream, and the IPF usage flag of the current coding unit is set to false, indicating that the IPF technique is not used; this flag is likewise transmitted to the decoding end in the code stream;
if the rate-distortion cost in e1) is lower, the prediction mode index recorded in e1) is coded as the best prediction mode of the current coding unit and transmitted to the decoding end in the code stream, and the IPF usage flag of the current coding unit is set to true, indicating that the IPF technique is used; this flag is likewise transmitted to the decoding end in the code stream.
g1) Finally, the prediction values are superposed with the residual information recovered after transform, quantization and the corresponding inverse operations to obtain the reconstructed block of the current coding unit, which serves as reference information for subsequent coding units.
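The two-pass rate-distortion decision in steps a1) through f1) can be sketched as follows. This is not the AVS3 reference encoder: `rd_cost` is a hypothetical helper returning the rate-distortion cost of coding `block` in mode `m`, with or without IPF applied.

```python
def choose_mode(block, modes, ipf_enabled, rd_cost):
    """Two-pass mode decision: plain intra vs. intra with IPF.

    Returns (best_mode, use_ipf) — the mode index and IPF usage flag
    that would be signalled in the code stream.
    """
    # Pass 1: best mode without IPF (steps a1, b1).
    best_plain = min(modes, key=lambda m: rd_cost(block, m, use_ipf=False))
    cost_plain = rd_cost(block, best_plain, use_ipf=False)
    if not ipf_enabled:
        return best_plain, False  # only the plain mode index is signalled

    # Pass 2: best mode with IPF applied to the prediction block (c1-e1).
    best_ipf = min(modes, key=lambda m: rd_cost(block, m, use_ipf=True))
    cost_ipf = rd_cost(block, best_ipf, use_ipf=True)

    # Step f1: keep whichever pass gave the lower cost; the mode index
    # and the IPF usage flag are both transmitted in the code stream.
    if cost_plain <= cost_ipf:
        return best_plain, False
    return best_ipf, True
```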
The intra predictor 109 may also provide information indicating the selected intra prediction mode of the current encoding block to the entropy encoder 103 so that the entropy encoder 103 encodes the information indicating the selected intra prediction mode.
Fig. 8 is an exemplary block diagram of a video decoder 200 described in embodiments of the present application. In the example of fig. 8, the video decoder 200 includes an entropy decoder 203, a prediction processing unit 208, an inverse quantizer 204, an inverse transformer 205, a summer 211, a filter unit 206, and a memory 207. The prediction processing unit 208 may include an inter predictor 210 and an intra predictor 209. In some examples, video decoder 200 may perform a decoding process that is substantially reciprocal to the encoding process described with respect to video encoder 100 from fig. 7.
In the decoding process, video decoder 200 receives an encoded video bitstream representing an image block and associated syntax elements of an encoded video slice from video encoder 100. Video decoder 200 may receive video data from network entity 42 and, optionally, may store the video data in a video data store (not shown). The video data memory may store video data, such as an encoded video bitstream, to be decoded by components of video decoder 200. The video data stored in the video data memory may be obtained, for example, from storage device 40, from a local video source such as a camera, via wired or wireless network communication of video data, or by accessing a physical data storage medium. The video data memory may serve as a coded picture buffer (CPB) for storing encoded video data from the encoded video bitstream.
Network entity 42 may be, for example, a server, a MANE, a video editor/splicer, or other such device for implementing one or more of the techniques described above. Network entity 42 may or may not include a video encoder, such as video encoder 100. Network entity 42 may implement portions of the techniques described in this application before network entity 42 sends the encoded video bitstream to video decoder 200. In some video decoding systems, network entity 42 and video decoder 200 may be part of separate devices, while in other cases, the functionality described with respect to network entity 42 may be performed by the same device that includes video decoder 200.
The entropy decoder 203 of the video decoder 200 entropy decodes the code stream to generate quantized coefficients and some syntax elements. The entropy decoder 203 forwards the syntax elements to the prediction processing unit 208. Video decoder 200 may receive syntax elements at the video slice level and/or the picture block level. When a video slice is decoded as an intra-decoded (I) slice, intra predictor 209 of prediction processing unit 208 generates a prediction block for an image block of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or picture. When a video slice is decoded as an inter-decoded (i.e., B or P) slice, the inter predictor 210 of the prediction processing unit 208 may determine an inter prediction mode for decoding a current image block of the current video slice based on syntax elements received from the entropy decoder 203, decode the current image block (e.g., perform inter prediction) based on the determined inter prediction mode.
The inverse quantizer 204 inversely quantizes, i.e., dequantizes, the quantized transform coefficients provided in the codestream and decoded by the entropy decoder 203. The inverse quantization process may include: the quantization parameter calculated by the video encoder 100 for each image block in the video slice is used to determine the degree of quantization that should be applied and likewise the degree of inverse quantization that should be applied. Inverse transformer 205 applies an inverse transform, such as an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process, to the transform coefficients in order to generate a block of residues in the pixel domain.
After the inter predictor 210 generates a prediction block for the current image block or a sub-block of the current image block, the video decoder 200 obtains a reconstructed block, i.e., a decoded image block, by summing the residual block from the inverse transformer 205 with the corresponding prediction block generated by the inter predictor 210. Summer 211 represents the component that performs this summation operation. A loop filter (in or after the decoding loop) may also be used to smooth pixel transitions or otherwise improve video quality, if desired. Filter unit 206 may represent one or more loop filters, such as deblocking filters, Adaptive Loop Filters (ALF), and Sample Adaptive Offset (SAO) filters. Although the filter unit 206 is shown in fig. 8 as an in-loop filter, in other implementations, the filter unit 206 may be implemented as a post-loop filter.
The image decoding method specifically performed by the video decoder 200 includes obtaining the prediction mode index of the current coding block after the input code stream is parsed, inverse transformed, and inverse quantized. If the prediction mode of the chroma component of the current coding block is the enhanced two-step cross-component prediction mode, reconstructed samples are selected, according to an index value, from only the upper or only the left neighboring pixels of the current coding block to compute a linear model; a reference prediction block of the chroma component is computed from the linear model and down-sampled; and prediction correction based on the correlation of boundary-adjacent pixels in the orthogonal direction is applied to the down-sampled prediction block to obtain the final chroma prediction block. One path of the result then serves as reference information for subsequent video decoding, while the other path is post-filtered and output as the video signal.
The IPF technique is currently implemented in the video decoder 200 as follows.
The decoding end obtains and parses the code stream to derive the digital video sequence information, including the IPF enable flag of the current video sequence, the indication that the coding mode of the current decoding unit is an intra prediction coding mode, and the IPF usage flag of the current decoding unit.
In the current decoding unit,
① if the IPF enable flag is '1', all of the following steps are performed;
② if the IPF enable flag is '0', only steps a2), b2) and e2) are performed:
a2) acquiring code stream information, analyzing residual error information of a current decoding unit, and obtaining time domain residual error information through inverse transformation and inverse quantization processes;
b2) analyzing the code stream and acquiring a prediction mode index of the current decoding unit, and calculating to obtain a prediction block of the current decoding unit according to the adjacent reconstruction block and the prediction mode index;
c2) analyzing and acquiring the use identification bit of the IPF, and if the use identification bit of the IPF is '0', not performing additional operation on the current prediction block; if the usage flag of the IPF is '1', executing d 2);
d2) selecting a corresponding filter according to the prediction mode classification information of the current decoding unit, selecting a corresponding filter coefficient group according to the size of the current decoding unit, and filtering each pixel in a prediction block to obtain a final prediction block;
e2) superposing the restored residual error information on the prediction block to obtain a reconstructed block of the current decoding unit, and outputting the reconstructed block after post-processing;
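A minimal sketch of the decoder-side steps a2) through e2), with `predict` and `apply_ipf` as hypothetical stand-ins for the prediction and filtering processes, and the code stream pre-parsed into a dictionary (a simplification; a real decoder parses these fields bit by bit):

```python
def decode_unit(stream, ipf_enable_flag, predict, apply_ipf):
    """Reconstruct one decoding unit from pre-parsed stream fields."""
    residual = stream["residual"]          # a2) residual after inverse transform/quant
    pred = predict(stream["mode_index"])   # b2) prediction from neighbouring blocks
    # c2)/d2) the usage flag is only acted on when IPF is enabled;
    # an absent ipf_flag defaults to 0 (no filtering).
    if ipf_enable_flag == 1 and stream.get("ipf_flag", 0) == 1:
        pred = apply_ipf(pred)             # d2) filter each pixel of the prediction block
    # e2) reconstruction: prediction plus restored residual
    return [p + r for p, r in zip(pred, residual)]
```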
it should be understood that other structural variations of the video decoder 200 may be used to decode the encoded video stream. For example, the video decoder 200 may generate an output video stream without processing by the filter unit 206; alternatively, for some image blocks or image frames, the entropy decoder 203 of the video decoder 200 does not decode quantized coefficients and accordingly does not need to be processed by the inverse quantizer 204 and the inverse transformer 205.
In intra prediction, the existing IPF technique effectively improves coding efficiency, greatly strengthens the spatial correlation of intra prediction, and addresses the problem that the intra prediction process uses only a single row or column of reference pixels while ignoring the influence of other pixels on the prediction value. However, when the intra prediction process calls for smoothing, neither the IPF technique nor the current intra prediction modes handle it well: pixel-by-pixel filtering based on reference pixels improves the correlation between the prediction block and the reference block, but cannot smooth the interior of the prediction block.
A prediction block computed from a single prediction mode usually predicts well in images with clear texture, shrinking the residual and thereby improving coding efficiency. In image blocks with blurred texture, however, an overly sharp prediction can enlarge the residual, degrading the prediction effect and reducing coding efficiency.
In view of the above problems, an embodiment of the present application provides a smoothing-based IPF technique for image blocks that need smoothing, which directly filters the prediction block obtained according to the intra prediction mode.
Fig. 9 is a flowchart illustrating an image encoding method in an embodiment of the present application, where the image encoding method can be applied to the source device 10 in the video coding system 1 shown in fig. 6 or the video encoder 100 shown in fig. 7. The flow shown in fig. 9 is described taking the video encoder 100 shown in fig. 7 as the executing entity. As shown in fig. 9, the image encoding method provided in the embodiment of the present application includes:
step 110, dividing the image and determining intra prediction filtering indication information of the current coding block, where the intra prediction filtering indication information includes first indication information and second indication information, the first indication information indicates whether a first intra prediction filtering mode is allowed to be used, the second indication information indicates whether a second intra prediction filtering mode is allowed to be used, and the first intra prediction filtering mode is the intra prediction filtering (IPF) mode;
step 120, if it is determined according to the intra prediction filtering indication information that the current coding block needs to use the first intra prediction filtering mode, setting the first use flag bit of the first intra prediction filtering mode of the current coding block to indicate use;
step 130, transmitting the intra prediction filtering indication information, the first intra prediction filtering mode and the first use flag bit in the code stream;
step 140, superposing the prediction block of the current coding block with the residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block, which serves as a prediction reference block for the next coding block.
In technical scheme 1, the intra prediction part at the encoding end is implemented as follows:
the encoder acquires encoding information, including the intra prediction filtering (IPF) allowed flag bit and the intra prediction smoothing (hereinafter abbreviated as IPS) allowed flag bit of this scheme; after the image information is acquired, the image is divided into a plurality of CTUs, which are further divided into a plurality of CUs, and each CU performs intra prediction independently;
in the intra prediction process:
① if the IPF allowed flag bit and the IPS allowed flag bit are both '1', all of the following steps are executed;
② if the IPF allowed flag bit is '1' and the IPS allowed flag bit is '0', only a3), b3), c3), d3), e3), i3), j3) are executed;
③ if the IPF allowed flag bit is '0' and the IPS allowed flag bit is '1' and the current CU area is greater than or equal to 64 and less than 4096, only a3), b3), f3), g3), h3), i3), j3) are executed;
④ if the IPF allowed flag bit and the IPS allowed flag bit are both '0', only a3), b3), i3), j3) are executed:
a3) the current coding unit traverses all intra prediction modes, calculates the prediction block under each prediction mode, and calculates the rate-distortion cost of the current prediction mode against the original pixel block;
b3) the optimal prediction mode of the current coding unit is selected by the principle of minimum rate-distortion cost over all prediction modes, and the optimal prediction mode information and its rate-distortion cost are recorded;
c3) a second traversal of all intra prediction modes is performed with the IPF (Intra Prediction Filter) technique enabled; the prediction pixels under each intra prediction mode are first calculated to obtain the prediction block of the current coding unit;
d3) IPF filtering is performed on the prediction block of the current coding unit: a filter is selected according to the current prediction mode and a filter coefficient group is selected according to the size of the current coding unit; the specific correspondence can be found in Table 1;
e3) calculating rate distortion cost information of each prediction mode according to the final prediction pixel obtained by the IPF technology and the original pixel, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
f3) a third traversal of all intra prediction modes is performed with the IPS (intra prediction smoothing) technique enabled; the prediction pixels under each intra prediction mode are first calculated to obtain the prediction block of the current coding unit;
g3) performing IPS twice on a prediction block of a current coding unit to obtain a final prediction block;
h3) calculating rate distortion cost information of each prediction mode according to the final prediction pixel obtained by the IPS technology and the original pixel, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
i3) if the IPF allowed flag bit is '0' and the IPS allowed flag bit is '0', the prediction mode index recorded in b3) is transmitted to the decoding end through the code stream;
if the IPF allowed flag bit is '1' and the IPS allowed flag bit is '0', the minimum cost value recorded in b3) is compared with the minimum cost value recorded in e3):
if the rate-distortion cost in b3) is lower, the prediction mode index recorded in b3) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, and the IPF use flag bit of the current coding unit is set to '0', indicating that the IPF technique is not used, and is also transmitted to the decoding end through the code stream;
if the rate-distortion cost in e3) is lower, the prediction mode index recorded in e3) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, and the IPF use flag bit of the current coding unit is set to '1', indicating that the IPF technique is used, and is also transmitted to the decoding end through the code stream;
if the IPF allowed flag bit is '0' and the IPS allowed flag bit is '1', the minimum cost value recorded in b3) is compared with the minimum cost value recorded in h3):
if the rate-distortion cost in b3) is lower, the prediction mode index recorded in b3) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, and the IPS use flag bit of the current coding unit is set to '0', indicating that the IPS technique is not used, and is also transmitted to the decoding end through the code stream;
if the rate-distortion cost in h3) is lower, the prediction mode index recorded in h3) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, and the IPS use flag bit of the current coding unit is set to '1', indicating that the IPS technique is used, and is also transmitted to the decoding end through the code stream;
if the IPF allowed flag bit is '1' and the IPS allowed flag bit is '1', the minimum cost values recorded in b3), e3) and h3) are compared:
if the rate-distortion cost in b3) is lowest, the prediction mode index recorded in b3) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, and the IPS use flag bit and the IPF use flag bit of the current coding unit are set to '0', indicating that neither technique is used, and are also transmitted to the decoding end through the code stream;
if the rate-distortion cost in e3) is lowest, the prediction mode index recorded in e3) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, the IPF use flag bit of the current coding unit is set to '1' and the IPS use flag bit is not transmitted, indicating that the IPF technique is used and the IPS technique is not, and the flag is also transmitted to the decoding end through the code stream;
if the rate-distortion cost in h3) is lowest, the prediction mode index recorded in h3) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, and the IPF use flag bit of the current coding unit is set to '0' and the IPS use flag bit to '1', indicating that the IPS technique is used instead of the IPF technique; these flags are also transmitted to the decoding end through the code stream.
j3) the prediction block is then superposed with the residual obtained after inverse transformation and inverse quantization to obtain a reconstructed coding unit block, which serves as a prediction reference block for the next coding unit.
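The mode decision in steps a3)-j3) can be sketched as a single rate-distortion comparison over the three candidate configurations (no filtering, IPF, IPS). This is a minimal illustration, not the reference encoder: `predict`, `ipf_filter`, `ips_filter` and `rd_cost` are hypothetical helpers standing in for the real prediction, filtering and cost routines.

```python
# Minimal sketch of the encoder-side decision of steps a3)-j3), under the
# assumption that predict(mode) returns a 2-D prediction block, the filter
# helpers apply the respective techniques, and rd_cost returns the
# rate-distortion cost of a prediction against the original block.

def choose_intra_mode(block, modes, ipf_allowed, ips_allowed,
                      predict, ipf_filter, ips_filter, rd_cost):
    """Return (best_mode, use_ipf, use_ips) by minimum rate-distortion cost."""
    candidates = []  # (cost, mode, use_ipf, use_ips)

    # a3)/b3): plain intra prediction, no filtering.
    for m in modes:
        candidates.append((rd_cost(block, predict(m)), m, 0, 0))

    # c3)-e3): second traversal with the IPF technique enabled.
    if ipf_allowed:
        for m in modes:
            candidates.append((rd_cost(block, ipf_filter(predict(m), m)), m, 1, 0))

    # f3)-h3): third traversal with IPS enabled (scheme 1 filters twice);
    # only applied for CU areas in [64, 4096).
    area = len(block) * len(block[0])
    if ips_allowed and 64 <= area < 4096:
        for m in modes:
            candidates.append(
                (rd_cost(block, ips_filter(ips_filter(predict(m)))), m, 0, 1))

    # i3): keep the candidate with the minimum rate-distortion cost.
    _, mode, use_ipf, use_ips = min(candidates, key=lambda c: c[0])
    return mode, use_ipf, use_ips
```

The returned pair of use flags maps directly onto the signalling rules of step i3): when `use_ipf` is 1 the IPS flag is simply never written.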
In technical scheme 2, the intra prediction part at the encoding end is implemented as follows:
the encoder acquires encoding information, including the intra prediction filtering (IPF) allowed flag bit and the intra prediction smoothing (IPS) allowed flag bit of this scheme; after the image information is acquired, the image is divided into a plurality of CTUs, which are further divided into a plurality of CUs, and each CU performs intra prediction independently;
in the intra prediction process:
① if the IPF allowed flag bit and the IPS allowed flag bit are both '1', all of the following steps are executed;
② if the IPF allowed flag bit is '1' and the IPS allowed flag bit is '0', only a4), b4), c4), d4), e4), i4), j4) are executed;
③ if the IPF allowed flag bit is '0' and the IPS allowed flag bit is '1' and the current CU area is greater than or equal to 64 and less than 4096, only a4), b4), f4), g4), h4), i4), j4) are executed;
④ if the IPF allowed flag bit and the IPS allowed flag bit are both '0', only a4), b4), i4), j4) are executed:
a4) the current coding unit traverses all intra prediction modes, calculates the prediction block under each prediction mode, and calculates the rate-distortion cost of the current prediction mode against the original pixel block;
b4) the optimal prediction mode of the current coding unit is selected by the principle of minimum rate-distortion cost over all prediction modes, and the optimal prediction mode information and its rate-distortion cost are recorded;
c4) a second traversal of all intra prediction modes is performed with the IPF (Intra Prediction Filter) technique enabled; the prediction pixels under each intra prediction mode are first calculated to obtain the prediction block of the current coding unit;
d4) IPF filtering is performed on the prediction block of the current coding unit: a filter is selected according to the current prediction mode and a filter coefficient group is selected according to the size of the current coding unit; the specific correspondence can be found in Table 1;
e4) calculating rate distortion cost information of each prediction mode according to the final prediction pixel obtained by the IPF technology and the original pixel, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
f4) a third traversal of all intra prediction modes is performed with the IPS (intra prediction smoothing) technique enabled; the prediction pixels under each intra prediction mode are first calculated to obtain the prediction block of the current coding unit;
g4) IPS is performed once on the prediction block of the current coding unit to obtain the final prediction block;
h4) calculating rate distortion cost information of each prediction mode according to the final prediction pixel obtained by the IPS technology and the original pixel, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
i4) if the IPF allowed flag bit is '0' and the IPS allowed flag bit is '0', the prediction mode index recorded in b4) is transmitted to the decoding end through the code stream;
if the IPF allowed flag bit is '1' and the IPS allowed flag bit is '0', the minimum cost value recorded in b4) is compared with the minimum cost value recorded in e4):
if the rate-distortion cost in b4) is lower, the prediction mode index recorded in b4) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, and the IPF use flag bit of the current coding unit is set to '0', indicating that the IPF technique is not used, and is also transmitted to the decoding end through the code stream;
if the rate-distortion cost in e4) is lower, the prediction mode index recorded in e4) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, and the IPF use flag bit of the current coding unit is set to '1', indicating that the IPF technique is used, and is also transmitted to the decoding end through the code stream;
if the IPF allowed flag bit is '0' and the IPS allowed flag bit is '1', the minimum cost value recorded in b4) is compared with the minimum cost value recorded in h4):
if the rate-distortion cost in b4) is lower, the prediction mode index recorded in b4) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, and the IPS use flag bit of the current coding unit is set to '0', indicating that the IPS technique is not used, and is also transmitted to the decoding end through the code stream;
if the rate-distortion cost in h4) is lower, the prediction mode index recorded in h4) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, and the IPS use flag bit of the current coding unit is set to '1', indicating that the IPS technique is used, and is also transmitted to the decoding end through the code stream;
if the IPF allowed flag bit is '1' and the IPS allowed flag bit is '1', the minimum cost values recorded in b4), e4) and h4) are compared:
if the rate-distortion cost in b4) is lowest, the prediction mode index recorded in b4) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, and the IPS use flag bit and the IPF use flag bit of the current coding unit are set to '0', indicating that neither technique is used, and are also transmitted to the decoding end through the code stream;
if the rate-distortion cost in e4) is lowest, the prediction mode index recorded in e4) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, the IPF use flag bit of the current coding unit is set to '1' and the IPS use flag bit is not transmitted, indicating that the IPF technique is used and the IPS technique is not, and the flag is also transmitted to the decoding end through the code stream;
if the rate-distortion cost in h4) is lowest, the prediction mode index recorded in h4) is transmitted to the decoding end through the code stream as the optimal prediction mode of the current coding unit, and the IPF use flag bit of the current coding unit is set to '0' and the IPS use flag bit to '1', indicating that the IPS technique is used instead of the IPF technique; these flags are also transmitted to the decoding end through the code stream.
j4) the prediction block is then superposed with the residual obtained after inverse transformation and inverse quantization to obtain a reconstructed coding unit block, which serves as a prediction reference block for the next coding unit.
Corresponding to the image encoding method described in fig. 9, fig. 10 is a flowchart illustrating an image decoding method in an embodiment of the present application, which can be applied to the destination device 20 in the video decoding system 1 shown in fig. 6 or the video decoder 200 shown in fig. 8. The flow shown in fig. 10 is described taking the video decoder 200 shown in fig. 8 as the execution subject. As shown in fig. 10, the image decoding method provided in the embodiment of the present application includes:
step 210, parsing the code stream and determining the intra prediction filtering indication information and the first use flag bit of the current decoded block, where the intra prediction filtering indication information includes first indication information and second indication information, the first indication information indicates whether a first intra prediction filtering mode is allowed to be used, the second indication information indicates whether a second intra prediction filtering mode is allowed to be used, the first intra prediction filtering mode is the intra prediction filtering (IPF) mode, and the first use flag bit is the use flag bit of the first intra prediction filtering mode;
step 220, determining, according to the intra prediction filtering indication information and the first use flag bit, to use the first intra prediction filtering mode to obtain the prediction block of the decoded block.
In technical scheme 1, the specific flow of intra prediction at the decoding end is as follows:
the decoder obtains the code stream, parses it to obtain the IPF allowed flag bit and the IPS allowed flag bit of the current video sequence, and performs inverse transformation and inverse quantization on the residual information obtained from the code stream.
In the intra prediction decoding process:
① if the IPF allowed flag bit and the IPS allowed flag bit are both '1', all of the following steps are executed;
② if the IPF allowed flag bit is '1' and the IPS allowed flag bit is '0', only steps a5), b5), c5), d5) and g5) are executed;
③ if the IPF allowed flag bit is '0' and the IPS allowed flag bit is '1' and the current CU area is greater than or equal to 64 and less than 4096, only steps a5), b5), e5), f5) and g5) are executed;
④ if the IPF allowed flag bit and the IPS allowed flag bit are both '0', only steps a5), b5) and g5) are executed;
a5) acquiring a code stream, decoding to obtain residual information, and performing inverse transformation, inverse quantization and other processes to obtain time domain residual information;
b5) analyzing the code stream to obtain a prediction mode of a current decoding unit, and calculating according to the prediction mode of the current decoding unit and an adjacent reconstruction block to obtain a prediction block;
c5) the IPF use flag bit is parsed;
if the IPF use flag bit is '0', no additional operation is performed on the current prediction block;
if the IPF use flag bit is '1', d5) is executed;
d5) a corresponding filter is selected according to the prediction mode classification information of the current decoding unit and a corresponding filter coefficient group according to the size of the current decoding unit, and each pixel in the prediction block is then filtered to obtain the filtered prediction block;
e5) the IPF use flag bit is checked;
if the IPF use flag bit is '1', the remainder of this step and step f5) are skipped;
if the IPF use flag bit is '0', the IPS use flag bit is parsed.
If the IPS use flag bit is '0', no additional operation is performed on the current prediction block;
if the IPS use flag bit is '1', f5) is executed;
f5) the input prediction block is filtered twice with the IPS filter to obtain the filtered prediction block of the current decoding unit;
g5) superposing the restored residual error information on the prediction block to obtain a reconstructed block of the current decoding unit, and outputting the reconstructed block after post-processing;
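The flag parsing implied by steps c5)-f5) can be sketched as follows. The key point is the conditional signalling rule: the IPS use flag bit is present in the code stream only when the IPF use flag bit is '0'. Here `read_flag` is a hypothetical bitstream reader returning the next flag bit.

```python
# Sketch of the decoder-side flag parsing of steps c5)-f5): the IPS use
# flag is parsed only when IPS is allowed and the IPF use flag is '0'.

def decode_filter_flags(ipf_allowed, ips_allowed, read_flag):
    """Return (use_ipf, use_ips) for the current decoding unit."""
    use_ipf = use_ips = 0
    if ipf_allowed:
        use_ipf = read_flag()          # c5)/e5): parse the IPF use flag bit
    if ips_allowed and not use_ipf:
        use_ips = read_flag()          # IPS flag is absent when IPF is used
    return use_ipf, use_ips
```

When both techniques are allowed and IPF is used, only one flag bit is consumed from the stream, matching the encoder-side rule of not transmitting the IPS flag in that case.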
In technical scheme 2, the specific flow of intra prediction at the decoding end is as follows:
the decoder obtains the code stream, parses it to obtain the IPF allowed flag bit and the IPS allowed flag bit of the current video sequence, and performs inverse transformation and inverse quantization on the residual information obtained from the code stream.
In the intra prediction decoding process:
① if the IPF allowed flag bit and the IPS allowed flag bit are both '1', all of the following steps are executed;
② if the IPF allowed flag bit is '1' and the IPS allowed flag bit is '0', only steps a6), b6), c6), d6) and g6) are executed;
③ if the IPF allowed flag bit is '0' and the IPS allowed flag bit is '1' and the current CU area is greater than or equal to 64 and less than 4096, only steps a6), b6), e6), f6) and g6) are executed;
④ if the IPF allowed flag bit and the IPS allowed flag bit are both '0', only steps a6), b6) and g6) are executed;
a6) acquiring a code stream, decoding to obtain residual information, and performing inverse transformation, inverse quantization and other processes to obtain time domain residual information;
b6) analyzing the code stream to obtain a prediction mode of a current decoding unit, and calculating according to the prediction mode of the current decoding unit and an adjacent reconstruction block to obtain a prediction block;
c6) the IPF use flag bit is parsed;
if the IPF use flag bit is '0', no additional operation is performed on the current prediction block;
if the IPF use flag bit is '1', d6) is executed;
d6) a corresponding filter is selected according to the prediction mode classification information of the current decoding unit and a corresponding filter coefficient group according to the size of the current decoding unit, and each pixel in the prediction block is then filtered to obtain the filtered prediction block;
e6) the IPF use flag bit is checked;
if the IPF use flag bit is '1', the remainder of this step and step f6) are skipped;
if the IPF use flag bit is '0', the IPS use flag bit is parsed.
If the IPS use flag bit is '0', no additional operation is performed on the current prediction block;
if the IPS use flag bit is '1', f6) is executed;
f6) the input prediction block is filtered once with the IPS filter to obtain the filtered prediction block of the current decoding unit;
g6) superposing the restored residual error information on the prediction block to obtain a reconstructed block of the current decoding unit, and outputting the reconstructed block after post-processing;
this technical scheme is applied to the intra prediction part of the encoding and decoding framework. When the current coding unit or decoding unit is filtered with the IPS technique, the current block first needs to be padded, as follows:
a7) if the reference pixels to the left of and above the current prediction block are available, that is, reconstructed pixels exist on the left and upper sides, the left column and the upper row are filled with those reconstructed pixels;
b7) if the reference pixels to the left of or above the current prediction block are unavailable, that is, there are no reconstructed pixels on the left or upper side, the side without reconstructed pixels is filled by replicating the row or column of the current prediction block closest to that side;
c7) the adjacent column to the right of the current prediction block is filled with the rightmost column of prediction values of the current prediction block;
d7) the adjacent row below the current prediction block is filled with the bottom row of prediction values of the current prediction block;
e7) the upper-right corner pixel outside the current prediction block is filled with the rightmost filled pixel of the upper side, the lower-right corner pixel is filled with the rightmost filled pixel of the lower side, and the lower-left corner pixel is filled with the bottommost filled pixel of the left side.
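The padding steps above can be sketched for a one-pixel border (as a 3x3 kernel requires). This is an illustrative sketch, not the normative process: `left` and `top` stand for the reconstructed neighbour pixels, passed as `None` when unavailable, in which case the nearest column or row of the block itself is replicated as in the fallback step; each corner repeats the nearest filled edge pixel.

```python
# Sketch of prediction-block padding: fill a one-pixel border around `pred`
# (a list of rows) using reconstructed neighbours when available, otherwise
# replicating the nearest block column/row; corners repeat edge pixels.

def pad_prediction_block(pred, left=None, top=None):
    h, w = len(pred), len(pred[0])
    left = left if left is not None else [row[0] for row in pred]   # fallback
    top = top if top is not None else list(pred[0])                 # fallback
    right = [row[-1] for row in pred]       # rightmost prediction column
    bottom = list(pred[-1])                 # bottom prediction row
    padded = [[top[0]] + top + [top[-1]]]           # top row with corners
    for y in range(h):
        padded.append([left[y]] + list(pred[y]) + [right[y]])
    padded.append([bottom[0]] + bottom + [bottom[-1]])  # bottom row, corners
    return padded
```

For a W x H block this yields a (W+2) x (H+2) array on which each original pixel has a full 3x3 neighbourhood.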
Fig. 11A shows a filling diagram of the prediction block, where pred.pixel denotes the pixels of the prediction block and recon.pixel denotes the filled pixels.
Fig. 11B shows another filling diagram of the prediction block, where pred.pixel denotes the pixels of the prediction block and recon.pixel denotes the filled pixels.
The IPS technique uses a simplified Gaussian convolution kernel to filter the prediction block; the filter has 9 taps and 3 different filter coefficients, as follows:

    [ c1  c2  c1 ]
    [ c2  c3  c2 ]
    [ c1  c2  c1 ]

where c1, c2 and c3 are the filter coefficients.
Filtering each prediction pixel in the prediction block, the filtering formula is as follows:
P′(x,y) = c1·P(x-1,y-1) + c2·P(x,y-1) + c1·P(x+1,y-1)
        + c2·P(x-1,y) + c3·P(x,y) + c2·P(x+1,y)
        + c1·P(x-1,y+1) + c2·P(x,y+1) + c1·P(x+1,y+1)
In the above equation, P′(x,y) is the final prediction value at position (x,y) of the current coding unit, and c1, c2 and c3 are the approximate Gaussian convolution kernel coefficients: c1 is 0.075, c2 is 0.124 and c3 is 0.204. P(x,y), P(x-1,y-1) and so on are the prediction values at positions (x,y) and (x-1,y-1), where x and y range within the width and height of the current coding unit block.
The convolution kernel coefficients adopted by the IPS technique can be approximated by integers whose sum is a power of 2, which avoids both floating-point calculation and division, greatly reducing the computational complexity, as follows:

    [ 5   8  5 ]
    [ 8  12  8 ]
    [ 5   8  5 ]

the sum of the filter coefficients is 64, i.e. the calculated prediction value needs to be shifted right by 6 bits.
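The integer filtering can be sketched directly from the stated coefficients (5, 8, 12; the nine taps sum to 64 = 2^6), with the division replaced by a right shift. This is an illustrative sketch: `padded` is assumed to be the (h+2) x (w+2) array produced by the padding steps described earlier.

```python
# Sketch of the integer 9-tap IPS filter: 3x3 kernel, coefficient sum 64,
# right shift by 6 bits instead of division.

K9 = [[5,  8, 5],
      [8, 12, 8],
      [5,  8, 5]]  # coefficients sum to 64 = 2**6

def ips_filter_9tap(padded):
    h, w = len(padded) - 2, len(padded[0]) - 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = sum(K9[j][i] * padded[y + j][x + i]
                      for j in range(3) for i in range(3))
            out[y][x] = acc >> 6  # divide by 64 via shift
    return out
```

Because the coefficients sum to exactly 64, a flat region passes through unchanged, which is the expected behaviour of a normalized smoothing kernel.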
The IPS technique in the above technical scheme 2 uses a simplified Gaussian convolution kernel to filter the prediction block; the filter has 25 taps and 6 different filter coefficients, as follows:

    [ c1  c2  c3  c2  c1 ]
    [ c2  c4  c5  c4  c2 ]
    [ c3  c5  c6  c5  c3 ]
    [ c2  c4  c5  c4  c2 ]
    [ c1  c2  c3  c2  c1 ]
filtering each prediction pixel in the prediction block, the filtering formula is as follows:
P′(x,y) = c1·P(x-2,y-2) + c2·P(x-1,y-2) + c3·P(x,y-2) + c2·P(x+1,y-2) + c1·P(x+2,y-2)
        + c2·P(x-2,y-1) + c4·P(x-1,y-1) + c5·P(x,y-1) + c4·P(x+1,y-1) + c2·P(x+2,y-1)
        + c3·P(x-2,y) + c5·P(x-1,y) + c6·P(x,y) + c5·P(x+1,y) + c3·P(x+2,y)
        + c2·P(x-2,y+1) + c4·P(x-1,y+1) + c5·P(x,y+1) + c4·P(x+1,y+1) + c2·P(x+2,y+1)
        + c1·P(x-2,y+2) + c2·P(x-1,y+2) + c3·P(x,y+2) + c2·P(x+1,y+2) + c1·P(x+2,y+2)
In the above equation, P′(x,y) is the final prediction value at position (x,y) of the current coding unit, and c1 to c6 are the approximate Gaussian convolution kernel coefficients: c1 is 0.0030, c2 is 0.0133, c3 is 0.0219, c4 is 0.0596, c5 is 0.0983 and c6 is 0.1621. P(x,y), P(x-1,y-1) and so on are the prediction values at positions (x,y) and (x-1,y-1), where x and y range within the width and height of the current coding unit block.
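Applying the 25-tap floating-point kernel can be sketched as a plain 5x5 convolution. This is an illustrative sketch: the block is assumed to already carry a two-pixel padded border, and the six coefficient values are placed symmetrically as in the kernel above.

```python
# Sketch of the 25-tap floating-point IPS filter (scheme 2): a 5x5 kernel
# built from the six stated coefficients, applied to a block with a
# two-pixel border.

c1, c2, c3, c4, c5, c6 = 0.0030, 0.0133, 0.0219, 0.0596, 0.0983, 0.1621
K25 = [[c1, c2, c3, c2, c1],
       [c2, c4, c5, c4, c2],
       [c3, c5, c6, c5, c3],
       [c2, c4, c5, c4, c2],
       [c1, c2, c3, c2, c1]]

def ips_filter_25tap(padded):
    h, w = len(padded) - 4, len(padded[0]) - 4
    return [[sum(K25[j][i] * padded[y + j][x + i]
                 for j in range(5) for i in range(5))
             for x in range(w)]
            for y in range(h)]
```

Note that the stated coefficients sum to roughly 0.9997 rather than exactly 1, so a flat block is very slightly attenuated; the integer approximation below removes this rounding issue.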
The convolution kernel coefficients adopted by the IPS technique can likewise be approximated by integers whose sum is a power of 2, avoiding floating-point calculation and division and greatly reducing the computational complexity, as follows:

    [5x5 integer coefficient matrix shown as a figure in the source]

the sum of the filter coefficients is 1024, i.e. the calculated prediction value needs to be shifted right by 10 bits.
The IPS technique in the above technical scheme 2 may also use a simplified Gaussian convolution kernel to filter the prediction block; the filter has 13 taps and 4 different filter coefficients arranged in a diamond shape, as follows:

    [  0   0  c1   0   0 ]
    [  0  c2  c3  c2   0 ]
    [ c1  c3  c4  c3  c1 ]
    [  0  c2  c3  c2   0 ]
    [  0   0  c1   0   0 ]
filtering each prediction pixel in the prediction block, the filtering formula is as follows:
P′(x,y) = c1·P(x,y-2)
        + c2·P(x-1,y-1) + c3·P(x,y-1) + c2·P(x+1,y-1)
        + c1·P(x-2,y) + c3·P(x-1,y) + c4·P(x,y) + c3·P(x+1,y) + c1·P(x+2,y)
        + c2·P(x-1,y+1) + c3·P(x,y+1) + c2·P(x+1,y+1)
        + c1·P(x,y+2)
In the above equation, P′(x,y) is the final prediction value at position (x,y) of the current coding unit, and c1, c2, c3 and c4 are the approximate Gaussian convolution kernel coefficients: c1 is 13, c2 is 18, c3 is 25 and c4 is 32. P(x,y), P(x-1,y-1) and so on are the prediction values at positions (x,y) and (x-1,y-1), where x and y range within the width and height of the current coding unit block.
The sum of the filter coefficients is 256, i.e. the calculated prediction value needs to be shifted right by 8 bits.
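The 13-tap diamond filtering can be sketched with the stated integer coefficients (13, 18, 25, 32; the thirteen taps sum to 256 = 2^8) and a right shift in place of the division. This is an illustrative sketch; `padded` is assumed to carry a two-pixel border.

```python
# Sketch of the 13-tap diamond IPS filter: (dx, dy, coefficient) taps,
# coefficient sum 256, right shift by 8 bits instead of division.

K13 = [(0, -2, 13),
       (-1, -1, 18), (0, -1, 25), (1, -1, 18),
       (-2, 0, 13), (-1, 0, 25), (0, 0, 32), (1, 0, 25), (2, 0, 13),
       (-1, 1, 18), (0, 1, 25), (1, 1, 18),
       (0, 2, 13)]

def ips_filter_13tap(padded):
    h, w = len(padded) - 4, len(padded[0]) - 4
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = sum(c * padded[y + 2 + dy][x + 2 + dx]
                      for dx, dy, c in K13)
            out[y][x] = acc >> 8  # divide by 256 via shift
    return out
```

Compared with the full 25-tap kernel, the diamond support halves the number of multiplications per pixel while keeping the coefficient sum an exact power of 2.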
This technical scheme applies to intra prediction encoding and decoding; it provides a choice of smoothing or local blurring operations for intra prediction. For parts of the image whose texture should not be too sharp, the technique makes the prediction pixels smoother and the prediction block closer to the original image, ultimately improving coding efficiency.
Technical scheme 1 was tested on HPM7.0, the official AVS simulation platform, with smoothing filtering applied to the intra prediction block; the test results under the All Intra condition and the Random Access condition are shown in Tables 2 and 3.
TABLE 2 All Intra test results
Class Y U V
4K -0.61% -0.67% -0.84%
1080P -0.45% -0.78% -0.48%
720P -0.22% -0.08% -0.66%
Average performance -0.42% -0.51% -0.66%
TABLE 3 Random Access test results
Class Y U V
4K -0.25% -0.37% -0.57%
1080P -0.22% -0.41% -0.64%
720P -0.22% -0.01% -0.73%
Average performance -0.23% -0.26% -0.65%
As can be seen from tables 2 and 3, the present solution has good performance improvement under both test conditions.
Under the AI test condition, the luminance component shows a 0.42% BDBR saving, and the U and V components show BDBR savings of 0.51% and 0.66% respectively; the performance gain is significant and the coding efficiency of the encoder is effectively improved.
Looking at each resolution, the scheme yields a larger coding performance improvement on 4K video, which benefits the development of future ultra-high-definition video by saving more bit rate and bandwidth for ultra-high-resolution video.
According to the scheme, in the intra-frame prediction process, smooth filtering is carried out on the prediction block obtained by calculating the intra-frame prediction mode, the intra-frame prediction precision is improved, and the coding efficiency is effectively improved, specifically as follows:
1. propose to smooth the filtering to the prediction block of the intra-frame coding;
2. the IPS technique and the IPF are provided together: when the allowed flag bits of both techniques are '1', the IPF and the IPS cannot both be used in the same coding unit;
3. when the encoder determines to use the IPF technology for the current coding unit, the identification bit of the IPS is not transmitted, and the decoder does not need to analyze the use identification bit of the IPS;
when the encoder determines that the IPF technology is not used for the current coding unit, the identification bit of the IPS needs to be transmitted, and the decoder needs to analyze the use identification bit of the IPS;
4. the IPS convolution kernel is proposed as a simplified 9-tap Gaussian filtering convolution kernel;
5. approximate integer values are proposed for the floating-point convolution kernel: the filter coefficients are rounded to avoid floating-point calculation, their sum is a power of 2, and division is replaced by a shift operation, saving computational resources and reducing complexity;
6. providing coefficients of a 9-tap integer Gaussian convolution kernel filter, wherein a first filter coefficient is 5, a second filter coefficient is 8, a third filter coefficient is 12, and a predicted value after filtering needs to be shifted to the right by 6 bits;
7. 25-tap and 13-tap filters are proposed.
The present application can be further extended in the following directions.
Extension 1: the 9-tap Gaussian convolution kernel in the technical scheme is replaced by a filtering convolution kernel with more taps, to achieve a stronger smoothing effect.
Extension 2: the luma component and the chroma component in the technical scheme each use an independent flag bit to indicate whether IPS is used.
Extension 3: the application range in the technical scheme is restricted, and the IPS technique is not used for units with a smaller prediction block area, reducing transmitted flag bits and computational complexity.
Extension 4: the application range in the technical scheme is restricted by screening the prediction mode of the current coding unit; if the prediction mode is the mean (DC) mode, the IPS technique is not used, reducing transmitted flag bits and computational complexity.
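The flag-signaling rule of point 3 above, combined with the restrictions of Extensions 3 and 4, can be sketched as decoder-side parsing logic. The function name, the `read_bit` callback, and the area threshold are illustrative assumptions, not part of the scheme's syntax:

```python
# Hypothetical sketch: the IPS use flag is parsed only when IPS is
# allowed, IPF is not used for this unit, the block is large enough
# (Extension 3), and the mode is not the mean/DC mode (Extension 4).
def parse_ips_flag(read_bit, ipf_allowed, ips_allowed,
                   block_area, is_mean_mode, min_area=64):
    """Return (ipf_used, ips_used) for the current coding unit."""
    ipf_used = bool(read_bit()) if ipf_allowed else False
    if ipf_used or not ips_allowed:
        return ipf_used, False  # IPS flag is not present in the stream
    if block_area < min_area or is_mean_mode:
        return ipf_used, False  # flag not transmitted under Extensions 3-4
    return ipf_used, bool(read_bit())
```

Because the IPS flag is conditionally absent, the encoder must apply exactly the same conditions when deciding whether to write it, or the bitstreams desynchronize.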
The embodiment of the present application provides an image encoding device, which may be a video encoder. Specifically, the image encoding device is configured to perform the steps performed by the video encoder in the above encoding method. The image encoding device provided by the embodiment of the present application may comprise modules corresponding to the corresponding steps.
In this embodiment, the image encoding apparatus may be divided into functional modules according to the above method; for example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in hardware or as a software functional module. The division of modules in this embodiment is illustrative and is only a logical functional division; other divisions are possible in actual implementation.
Fig. 12 is a schematic diagram showing a possible configuration of the image encoding apparatus according to the above embodiment, in a case where each functional module is divided in correspondence with each function. As shown in fig. 12, the image encoding device 12 includes a dividing unit 120, a determining unit 121, a transmitting unit 122, and a superimposing unit 123.
A dividing unit 120, configured to divide an image, and determine intra prediction filtering indication information of a current coding block, where the intra prediction filtering indication information includes first indication information and second indication information, the first indication information is used to indicate whether a first intra prediction filtering mode is allowed to be used, the second indication information is used to indicate whether a second intra prediction filtering mode is allowed to be used, and the first intra prediction filtering mode is an intra prediction filtering IPF mode;
a determining unit 121, configured to set a first usage flag of the first intra-prediction filtering mode of the current coding block to be allowed to be used if it is determined that the current coding block needs to use the first intra-prediction filtering mode according to the intra-prediction filtering indication information;
a transmission unit 122, configured to transmit the intra-prediction filtering indication information, the first intra-prediction filtering mode, and the first usage flag via a code stream;
a superimposing unit 123, configured to superimpose the prediction block of the current coding block with the residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block, which serves as a prediction reference block for the next coding block.
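The superimposing step of the encoding method (prediction block plus the residual block obtained after inverse transformation and inverse quantization, as in the final step of the claimed method) can be sketched as follows; the clamping to an 8-bit sample range is an assumption:

```python
# Hypothetical sketch of reconstruction: reconstructed sample =
# predicted sample + residual sample, clamped to the sample range.
def reconstruct(pred, resid, bit_depth=8):
    """Superimpose prediction and residual blocks into a reconstructed block."""
    hi = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), hi) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, resid)]
```

The reconstructed block, rather than the original block, is what later coding blocks reference for prediction, so encoder and decoder stay in sync.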
For details of each step in the above method embodiment, refer to the functional description of the corresponding functional module; details are not repeated here. Of course, the image encoding apparatus provided in the embodiments of the present application includes, but is not limited to, the above modules; for example, the image encoding apparatus may further include a storage unit, which may be used to store program code and data of the image encoding apparatus.
In the case of using an integrated unit, a schematic structural diagram of an image encoding device provided in an embodiment of the present application is shown in fig. 13. In fig. 13, the image encoding device 13 includes: a processing module 130 and a communication module 131. The processing module 130 is used for controlling and managing actions of the image encoding apparatus, for example, performing steps performed by the dividing unit 120, the determining unit 121, the transmitting unit 122, the superimposing unit 123, and/or other processes for performing the techniques described herein. The communication module 131 is used to support interaction between the image encoding apparatus and other devices. As shown in fig. 13, the image encoding apparatus may further include a storage module 132, and the storage module 132 is used for storing program codes and data of the image encoding apparatus, for example, contents stored in the storage unit.
The processing module 130 may be a processor or a controller, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination for realizing computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 131 may be a transceiver, an RF circuit, a communication interface, or the like. The storage module 132 may be a memory.
For details of each scenario in the method embodiment, refer to the functional description of the corresponding functional module; details are not repeated here. The image encoding apparatus may perform the above image encoding method; it may specifically be a video image encoding apparatus or other equipment with a video encoding function.
The application also provides a video encoder, which comprises a nonvolatile storage medium and a central processing unit, wherein the nonvolatile storage medium stores an executable program, and the central processing unit is connected with the nonvolatile storage medium and executes the executable program to realize the image encoding method of the embodiment of the application.
The embodiment of the present application provides an image decoding device, which may be a video decoder. Specifically, the image decoding apparatus is configured to perform the steps performed by the video decoder in the above decoding method. The image decoding device provided by the embodiment of the present application may comprise modules corresponding to the corresponding steps.
In this embodiment, the image decoding apparatus may be divided into functional modules according to the above method; for example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in hardware or as a software functional module. The division of modules in this embodiment is illustrative and is only a logical functional division; other divisions are possible in actual implementation.
Fig. 14 shows a schematic diagram of a possible structure of the image decoding apparatus according to the above-described embodiment, in a case where each functional module is divided in correspondence with each function. As shown in fig. 14, image decoding apparatus 14 includes parsing section 140 and determining section 141.
A parsing unit, configured to determine intra prediction filtering indication information and a first usage flag bit of a currently decoded block, where the intra prediction filtering indication information includes first indication information and second indication information, the first indication information is used to indicate whether a first intra prediction filtering mode is allowed to be used, the second indication information is used to indicate whether a second intra prediction filtering mode is allowed to be used, the first intra prediction filtering mode is an intra prediction filtering IPF mode, and the first usage flag bit is a usage flag bit of the first intra prediction filtering mode;
a determining unit, configured to determine, according to the intra prediction filtering indication information and the first usage flag, that a prediction block of the decoded block is obtained using the first intra prediction filtering mode.
For details of each step in the above method embodiment, refer to the functional description of the corresponding functional module; details are not repeated here. Of course, the image decoding apparatus provided in the embodiments of the present application includes, but is not limited to, the above modules; for example, the image decoding apparatus may further include a storage unit, which may be used to store program code and data of the image decoding apparatus.
In the case of using an integrated unit, a schematic structural diagram of an image decoding apparatus provided in an embodiment of the present application is shown in fig. 15. In fig. 15, the image decoding device 15 includes: a processing module 150 and a communication module 151. The processing module 150 is used to control and manage actions of the image decoding apparatus, such as performing steps performed by the parsing unit 140, the determining unit 141, and/or other processes for performing the techniques described herein. The communication module 151 is used to support interaction between the image decoding apparatus and other devices. As shown in fig. 15, the image decoding apparatus may further include a storage module 152, and the storage module 152 is used for storing program codes and data of the image decoding apparatus, for example, contents stored in the storage unit.
The processing module 150 may be a processor or a controller, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination for realizing computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 151 may be a transceiver, an RF circuit, a communication interface, or the like. The storage module 152 may be a memory.
For details of each scenario in the method embodiment, refer to the functional description of the corresponding functional module; details are not repeated here. The image decoding apparatus may perform the above image decoding method; it may specifically be a video image decoding apparatus or other equipment with a video decoding function.
The application also provides a video decoder, which comprises a nonvolatile storage medium and a central processing unit, wherein the nonvolatile storage medium stores an executable program, and the central processing unit is connected with the nonvolatile storage medium and executes the executable program to realize the image decoding method of the embodiment of the application.
The present application further provides a terminal, including: one or more processors, a memory, and a communication interface. The memory and the communication interface are coupled to the one or more processors; the memory is used for storing computer program code comprising instructions which, when executed by the one or more processors, cause the terminal to perform the image encoding and/or image decoding methods of the embodiments of the present application. The terminal may be a video display device, a smart phone, a portable computer, or other equipment capable of processing or playing video.
Another embodiment of the present application also provides a computer-readable storage medium including one or more program codes, where the one or more programs include instructions, and when a processor in a decoding apparatus executes the program codes, the decoding apparatus executes an image encoding method and an image decoding method of the embodiments of the present application.
In another embodiment of the present application, there is also provided a computer program product comprising computer executable instructions stored in a computer readable storage medium; the at least one processor of the decoding device may read the computer executable instructions from the computer readable storage medium, and the execution of the computer executable instructions by the at least one processor causes the terminal to implement the image encoding method and the image decoding method of the embodiments of the present application.
In the above embodiments, all or part of the implementation may be realized by software, hardware, firmware, or any combination thereof. When implemented using a software program, the implementation may take, in whole or in part, the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the present application are produced, in whole or in part.
The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical functional division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another device, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, that is, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product, where the software product is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. An image encoding method, comprising:
dividing the image, and determining intra-prediction filtering indication information of a current coding block, wherein the intra-prediction filtering indication information comprises first indication information and second indication information, the first indication information is used for indicating whether a first intra-prediction filtering mode is allowed to be used, the second indication information is used for indicating whether a second intra-prediction filtering mode is allowed to be used, and the first intra-prediction filtering mode is an intra-prediction filtering IPF mode;
if the current coding block needs to use the first intra-frame prediction filtering mode according to the intra-frame prediction filtering indication information, setting a first use identification bit of the first intra-frame prediction filtering mode of the current coding block as a permission use;
transmitting the intra-frame prediction filtering indication information, the first intra-frame prediction filtering mode and the first use identification bit via a code stream;
and superposing the prediction block of the current coding block and the residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block which is used as a prediction reference block of the next coding block.
2. An image decoding method, comprising:
parsing a code stream, and determining intra-frame prediction filtering indication information and a first use identification bit of a current decoding block, wherein the intra-frame prediction filtering indication information comprises first indication information and second indication information, the first indication information is used for indicating whether a first intra-frame prediction filtering mode is allowed to be used, the second indication information is used for indicating whether a second intra-frame prediction filtering mode is allowed to be used, the first intra-frame prediction filtering mode is an intra-frame prediction filtering IPF mode, and the first use identification bit is the use identification bit of the first intra-frame prediction filtering mode;
and determining to obtain the prediction block of the decoding block by using the first intra-frame prediction filtering mode according to the intra-frame prediction filtering indication information and the first use identification bit.
3. An image encoding device characterized by comprising:
a dividing unit, configured to divide an image, and determine intra prediction filtering indication information of a current encoding block, where the intra prediction filtering indication information includes first indication information and second indication information, the first indication information is used to indicate whether a first intra prediction filtering mode is allowed to be used, the second indication information is used to indicate whether a second intra prediction filtering mode is allowed to be used, and the first intra prediction filtering mode is an intra prediction filtering IPF mode;
a determining unit, configured to set a first usage flag of the first intra-prediction filtering mode of the current coding block to be allowed to be used if it is determined that the current coding block needs to use the first intra-prediction filtering mode according to the intra-prediction filtering indication information;
a transmission unit, configured to transmit the intra-prediction filtering indication information, the first intra-prediction filtering mode, and the first usage flag via a code stream;
and the superposition unit is used for superposing the prediction block of the current coding block with the residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block, which serves as a prediction reference block of the next coding block.
4. An image decoding apparatus, comprising:
a parsing unit, configured to determine intra prediction filtering indication information and a first usage flag bit of a currently decoded block, where the intra prediction filtering indication information includes first indication information and second indication information, the first indication information is used to indicate whether a first intra prediction filtering mode is allowed to be used, the second indication information is used to indicate whether a second intra prediction filtering mode is allowed to be used, the first intra prediction filtering mode is an intra prediction filtering IPF mode, and the first usage flag bit is a usage flag bit of the first intra prediction filtering mode;
a determining unit, configured to determine, according to the intra prediction filtering indication information and the first usage flag, that a prediction block of the decoded block is obtained using the first intra prediction filtering mode.
5. An encoder comprising a non-volatile storage medium and a central processor, wherein the non-volatile storage medium stores an executable program, wherein the central processor is coupled to the non-volatile storage medium, and wherein the encoder performs the image encoding method of claim 1 when the executable program is executed by the central processor.
6. A decoder comprising a non-volatile storage medium and a central processor, wherein the non-volatile storage medium stores an executable program, wherein the central processor is coupled to the non-volatile storage medium, and wherein the decoder performs the image decoding method of claim 2 when the executable program is executed by the central processor.
7. A terminal, characterized in that the terminal comprises: one or more processors, memory, and a communication interface; the memory, the communication interface and the one or more processors; the terminal communicating with other devices via the communication interface, the memory for storing computer program code, the computer program code comprising instructions,
the instructions, when executed by the one or more processors, cause the terminal to perform the method of claim 1 or 2.
8. A computer program product comprising instructions for causing a terminal to perform the method according to claim 1 or 2, when the computer program product is run on the terminal.
9. A computer-readable storage medium comprising instructions that, when executed on a terminal, cause the terminal to perform the method of claim 1 or 2.
CN202010748924.5A 2020-07-29 2020-07-29 Image encoding method, image decoding method and related device Withdrawn CN114071162A (en)

Publications (1)

Publication Number Publication Date
CN114071162A true CN114071162A (en) 2022-02-18


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114567775A (en) * 2022-04-29 2022-05-31 中国科学技术大学 Image dividing method and device



Also Published As

Publication number Publication date
WO2022022622A1 (en) 2022-02-03
CN116250240A (en) 2023-06-09
TW202209878A (en) 2022-03-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220218