CN117413515A - Encoding/decoding method, encoder, decoder, and computer storage medium - Google Patents


Info

Publication number
CN117413515A
CN117413515A (application CN202180098944.5A)
Authority
CN
China
Prior art keywords
block
value
determining
current block
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180098944.5A
Other languages
Chinese (zh)
Inventor
唐桐 (Tang Tong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Publication of CN117413515A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103: Selection of coding mode or of prediction mode
    • H04N19/11: Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: the unit being an image region, e.g. an object
    • H04N19/176: the region being a block, e.g. a macroblock
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593: involving spatial prediction techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiment of the application discloses an encoding/decoding method, an encoder, a decoder and a computer storage medium. The method includes: determining a block type of a current block; determining, based on the block type, the value of bypass identification information corresponding to the current block, where the bypass identification information is used to identify whether the current block is coded with matrix weighted intra prediction (MIP); determining a target block division mode based on the value of the bypass identification information; and encoding the current block according to the target block division mode. In this way, whether MIP is used for prediction coding can be decided from the block type of the current block, so that the MIP coding process can be skipped for screen-content blocks that have few distinct luma sample values spread over a wide range. This significantly reduces coding complexity, shortens coding time, and thereby improves coding and decoding efficiency.

Description

Encoding/decoding method, encoder, decoder, and computer storage medium
Technical Field
The embodiment of the application relates to the technical field of video encoding and decoding, in particular to an encoding and decoding method, an encoder, a decoder and a computer storage medium.
Background
As people's requirements on video display quality increase, new video application forms such as high-definition and ultra-high-definition video have emerged. H.265/high efficiency video coding (High Efficiency Video Coding, HEVC) can no longer meet the rapidly evolving needs of video applications, so the joint video experts team (Joint Video Experts Team, JVET) proposed a new-generation video coding standard, H.266/versatile video coding (Versatile Video Coding, VVC), whose reference software is the VVC Test Model (VTM).
Among the current VVC block partitioning techniques, the matrix weighted intra prediction (Matrix Weighted Intra Prediction, MIP) technique can predict textured image blocks more accurately, but its complex prediction model makes coding far more complex than conventional intra prediction. When coding screen content that contains, for example, flat regions, this can create substantial unnecessary overhead, waste computational resources, and increase coding time.
Disclosure of Invention
The embodiment of the application provides a coding and decoding method, a coder, a decoder and a computer storage medium, which can reduce coding complexity and further improve coding and decoding efficiency.
The technical scheme of the embodiment of the application can be realized as follows:
in a first aspect, an embodiment of the present application proposes an encoding method, applied to an encoder, the method including:
determining a block type of a current block;
determining the value of bypass identification information corresponding to the current block based on the block type, wherein the bypass identification information is used for identifying whether the current block adopts matrix weighted intra-frame prediction MIP coding;
determining a target block division mode based on the value of the bypass identification information;
and encoding the current block according to the target block dividing mode.
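The four steps of the first aspect can be sketched as follows. Every function name below is a hypothetical placeholder for illustration; it is not the actual encoder API, and the candidate-filtering logic is only one possible realization of "determining a target block division mode based on the value of the bypass identification information".

```python
# Hypothetical sketch of the four-step encoding flow of the first aspect.
# All names are illustrative; a real encoder's interfaces differ.

def derive_bypass_flag(block_type):
    # Step 2: bypass flag is 1 (skip MIP) for screen-content blocks.
    return 1 if block_type == "screen" else 0

def choose_split_mode(candidates, bypass):
    # Step 3: when the bypass flag is set, MIP candidates are simply
    # not evaluated; here we just filter them out of the list.
    allowed = [c for c in candidates if not (bypass and c.startswith("mip"))]
    return allowed[0]

def encode_block(block_type, candidates):
    bypass = derive_bypass_flag(block_type)        # steps 1-2
    split = choose_split_mode(candidates, bypass)  # step 3
    return {"bypass": bypass, "split": split}      # step 4 would entropy-code these
```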
In a second aspect, an embodiment of the present application proposes a code stream, where the code stream includes code bits of one or more of a value of bypass identification information, a target block partition manner, a target prediction mode of a node sub-block, and a residual value.
In a third aspect, an embodiment of the present application proposes a decoding method, applied to a decoder, the method including:
analyzing the code stream and determining a target block division mode of the current block;
analyzing a code stream based on the target block division mode, and determining a predicted value of the current block;
analyzing a code stream based on the target block division mode, and determining a residual error value of the current block;
And determining a reconstruction value of the current block based on the predicted value and the residual value.
In a fourth aspect, embodiments of the present application provide an encoder, including a first determining unit and an encoding unit; wherein,
the first determining unit is configured to determine a block type of a current block; determine, based on the block type, the value of bypass identification information corresponding to the current block, where the bypass identification information is used to identify whether the current block is coded with matrix weighted intra prediction MIP; and determine a target block division mode based on the value of the bypass identification information;
The encoding unit is configured to encode the current block according to the target block division mode.
In a fifth aspect, embodiments of the present application provide an encoder including a first memory and a first processor; wherein,
the first memory is used for storing a computer program capable of running on the first processor;
the first processor is configured to perform the encoding method described in the first aspect when the computer program is run.
In a sixth aspect, an embodiment of the present application proposes a decoder, including a parsing unit and a second determining unit; wherein,
The parsing unit is configured to parse the code stream and determine a target block division mode of the current block; analyzing a code stream based on the target block division mode, and determining a predicted value of the current block; analyzing a code stream based on the target block division mode, and determining a residual error value of the current block;
the second determining unit is configured to determine a reconstruction value of the current block based on the prediction value and the residual value.
In a seventh aspect, embodiments of the present application provide a decoder, the decoder including a second memory and a second processor; wherein,
the second memory is used for storing a computer program capable of running on the second processor;
the second processor is configured to perform the method as described above when the computer program is run.
In an eighth aspect, embodiments of the present application provide a computer storage medium, where the computer storage medium stores a computer program, where the computer program when executed implements a method as set forth in any one of the above.
In a ninth aspect, an embodiment of the present application proposes a codec system, where the codec system is composed of an encoder and a decoder.
The embodiments of the application provide an encoding/decoding method, an encoder, a decoder and a computer storage medium. On the encoder side, the block type of a current block is determined; the value of bypass identification information corresponding to the current block is determined based on the block type, where the bypass identification information is used to identify whether the current block is coded with matrix weighted intra prediction MIP; a target block division mode is determined based on the value of the bypass identification information; and the current block is encoded according to the target block division mode. On the decoder side, the code stream is parsed to determine the target block division mode of the current block; the code stream is parsed based on the target block division mode to determine the predicted value of the current block; the code stream is parsed based on the target block division mode to determine the residual value of the current block; and a reconstruction value of the current block is determined based on the predicted value and the residual value. In this way, the block type of the current block can be determined from its color parameters, and whether MIP is used for prediction coding is decided from the block type, so that the MIP coding process can be skipped for screen-content blocks that have few distinct luma sample values spread over a wide range. This significantly reduces coding complexity, shortens coding time, and thereby improves coding and decoding efficiency.
Drawings
Fig. 1 is a schematic flow chart of a MIP method according to the related art;
FIG. 2A is a schematic block diagram of an encoder according to an embodiment of the present application;
FIG. 2B is a schematic block diagram of a decoder according to an embodiment of the present application;
fig. 3 is a schematic flow chart of an encoding method according to an embodiment of the present application;
fig. 4 is a schematic flow chart of an example of determining MIP bypass identification information according to an embodiment of the present application;
fig. 5 is a schematic flow chart of a decoding method according to an embodiment of the present application;
fig. 6 is a schematic diagram of the composition structure of an encoder according to an embodiment of the present application;
fig. 7 is a schematic diagram of a specific hardware structure of an encoder according to an embodiment of the present application;
fig. 8 is a schematic diagram of a composition structure of a decoder according to an embodiment of the present application;
fig. 9 is a schematic diagram of a specific hardware structure of a decoder according to an embodiment of the present application.
Detailed Description
For a more complete understanding of the features and technical content of the embodiments of the present application, reference should be made to the following detailed description of the embodiments of the present application, taken in conjunction with the accompanying drawings, which are for purposes of illustration only and not intended to limit the embodiments of the present application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict. It should also be noted that the term "first/second/third" in reference to the embodiments of the present application is used merely to distinguish similar objects and does not represent a specific ordering for the objects, it being understood that the "first/second/third" may be interchanged with a specific order or sequence, if allowed, to enable the embodiments of the present application described herein to be implemented in an order other than that illustrated or described herein.
Before describing embodiments of the present application in further detail, the terms and terminology involved in the embodiments of the present application will be described, where the terms and terminology involved in the embodiments of the present application are suitable for the following explanation:
joint video expert group (Joint Video Experts Team, JVET)
H.265/high efficiency video coding (High Efficiency Video Coding, HEVC)
H.266/multifunctional video coding (Versatile Video Coding, VVC)
Reference software Test platform of VVC (VVC Test Model, VTM)
Matrix weighted intra prediction (Matrix Weighted Intra Prediction, MIP)
Coding Unit (Coding Unit, CU)
Coding Tree Unit (Coding Tree Unit, CTU)
The embodiment of the application relates to an intra prediction technique based on linear affine transformation, which may also be called the MIP technique; fig. 1 shows a specific flow diagram of the MIP technique. When operations such as prediction, inverse transform and inverse quantization, and loop filtering are performed on the luma component of the current block, the current block may also be called a luma block; when these operations are performed on a chroma component of the current block, it may also be called a chroma block. The following takes a luma block as an example. To predict a luma block of width W and height H, MIP takes as input the H reconstructed pixels located to the left of the luma block and the W reconstructed pixels located above it, and then obtains the final predicted pixels through averaging, matrix-vector multiplication and linear interpolation. The method is roughly divided into three steps:
1. Averaging, which normalizes the reference pixels. Depending on the size and shape of the luma block, 4 or 8 samples are obtained by averaging: 4 samples for a 4×4 luma block and 8 samples for luma blocks of all other shapes. It should be noted that the acquisition of the reference pixels in MIP mode is the same as in the conventional intra prediction modes and is not repeated here. The long reference boundary pixels bdry_top and bdry_left are reduced, by averaging over neighborhoods, to the short reference boundary pixels bdry_top_red and bdry_left_red, which lowers the amount of computation and the model-parameter storage needed during prediction; bdry_top_red and bdry_left_red are then concatenated into a final vector bdry_red of length 4 or 8.
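As a concrete illustration of this averaging step, the following sketch downsamples the two boundaries and concatenates them into bdry_red. Plain integer averaging is used here; the rounding offset of the actual VVC specification is omitted for brevity.

```python
def downsample_boundary(boundary, out_len):
    # Average consecutive groups of samples down to out_len values;
    # len(boundary) is assumed to be a multiple of out_len.
    group = len(boundary) // out_len
    return [sum(boundary[i * group:(i + 1) * group]) // group
            for i in range(out_len)]

def reduced_boundary(bdry_top, bdry_left):
    # 4x4 blocks reduce each boundary to 2 samples, all other shapes
    # to 4; the reduced parts are concatenated into bdry_red.
    n = 2 if len(bdry_top) == 4 and len(bdry_left) == 4 else 4
    return downsample_boundary(bdry_top, n) + downsample_boundary(bdry_left, n)
```

A 4×4 block thus yields a bdry_red of length 4, and all other shapes a vector of length 8, matching the text above.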
2. Matrix-vector multiplication. The reduced boundary vector is multiplied by a matrix and an offset b is added to obtain the reduced predicted values pred_red, as shown in formula (1):
pred_red = A × bdry_red + b (1)
where A and b are both pre-trained and stored in the sets S0, S1 and S2. S0 is used for 4×4 blocks and contains mipMatrix4x4 (corresponding to A) and mipBias4x4 (corresponding to b); S1 is used for 4×8, 8×4 and 8×8 blocks and contains mipMatrix8x8 (corresponding to A) and mipBias8x8 (corresponding to b); S2 is used for all other blocks and contains mipMatrix16x16 (corresponding to A) and mipBias16x16 (corresponding to b).
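Formula (1) can be written out directly. The sketch below uses plain Python lists for A, bdry_red and b; the real implementation works on fixed-point weights with shift operations, which are omitted here.

```python
def mip_matrix_multiply(A, bdry_red, b):
    # pred_red = A x bdry_red + b, computed row by row: each reduced
    # prediction sample is a weighted sum of the reduced boundary
    # plus its offset term.
    return [sum(w * x for w, x in zip(row, bdry_red)) + bias
            for row, bias in zip(A, b)]
```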
3. Bilinear interpolation, which produces the final predicted pixels. That is, pred_red is rearranged into a prediction matrix whose size is generally smaller than that of the luma block, so interpolation in the horizontal and/or vertical direction is needed to obtain the final predicted pixel values, which are the result of predicting the luma block with MIP. If interpolation is required in both the horizontal and the vertical direction, it is performed in a fixed order: horizontal interpolation first, then vertical interpolation.
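The interpolation of step 3 can be illustrated in one dimension. This is a minimal sketch assuming an integer upsampling factor; the reference-boundary handling of the actual standard is ignored.

```python
def linear_upsample(samples, factor):
    # Linearly interpolate factor-1 new values between each pair of
    # neighbouring samples (integer arithmetic, truncating division).
    out = []
    for i in range(len(samples) - 1):
        a, c = samples[i], samples[i + 1]
        out.extend(a + (c - a) * k // factor for k in range(factor))
    out.append(samples[-1])
    return out
```

In MIP the same operation would be applied first along every row (horizontal) and then along every column (vertical) of the reduced prediction, matching the fixed order described above.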
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Referring to fig. 2A, a schematic block diagram of an encoder according to an embodiment of the present application is shown. As shown in fig. 2A, the encoder 10 includes a transform and quantization unit 101, an intra estimation unit 102, an intra prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, a coding unit 109, and a decoded image buffer unit 110, etc., wherein the filtering unit 108 can implement DBF filtering/SAO filtering/ALF filtering, and the coding unit 109 can implement header information coding and context-based adaptive binary arithmetic coding (Context-based Adaptive Binary Arithmetic Coding, CABAC). For an input original video signal, a video coding block can be obtained through division of a Coding Tree Unit (CTU), and then residual pixel information obtained after intra-frame or inter-frame prediction is transformed by the transform and quantization unit 101, including transforming the residual information from the pixel domain to a transform domain, and quantizing the obtained transform coefficients to further reduce the bit rate; the intra estimation unit 102 and the intra prediction unit 103 are used for intra prediction of the video coding block; in particular, the intra estimation unit 102 and the intra prediction unit 103 are used to determine an intra prediction mode to be used to encode the video coding block; the motion compensation unit 104 and the motion estimation unit 105 are used to perform inter-prediction encoding of the received video coding block relative to one or more blocks in one or more reference frames to provide temporal prediction information; the motion estimation performed by the motion estimation unit 105 is a process of generating a motion vector that can estimate the motion of the video coding block, and then the motion compensation unit 104 performs motion
compensation based on the motion vector determined by the motion estimation unit 105; after determining the intra prediction mode, the intra prediction unit 103 is further configured to provide the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 also transmits the calculated determined motion vector data to the encoding unit 109; in addition, the inverse transform and inverse quantization unit 106 is used for reconstructing the video coding block, reconstructing a residual block in the pixel domain, removing blocking artifacts by the filter control analysis unit 107 and the filtering unit 108, and adding the reconstructed residual block to a predictive block in the frame of the decoded image buffer unit 110 to generate a reconstructed video coding block; the encoding unit 109 is configured to encode various encoding parameters and quantized transform coefficients, and in a CABAC-based encoding algorithm, context content may be based on neighboring encoding blocks, and may be configured to encode information indicating the determined intra prediction mode, and output a code stream of the video signal; and the decoded picture buffer unit 110 is for storing reconstructed video coding blocks for prediction reference. As video image encoding proceeds, new reconstructed video encoding blocks are generated, and the reconstructed video encoding blocks are stored in the decoded image buffer unit 110.
Referring to fig. 2B, a schematic block diagram of a decoder according to an embodiment of the present application is shown. As shown in fig. 2B, the decoder 20 includes a decoding unit 201, an inverse transform and inverse quantization unit 202, an intra prediction unit 203, a motion compensation unit 204, a filtering unit 205, a decoded image buffer unit 206, and the like, wherein the decoding unit 201 can implement header information decoding and CABAC decoding, and the filtering unit 205 can implement DBF filtering/SAO filtering/ALF filtering. After the input video signal is subjected to the encoding process of fig. 2A, a code stream of the video signal is output; the code stream is input into the video decoding system 20, and first passes through the decoding unit 201 to obtain decoded transform coefficients; processing by the inverse transform and inverse quantization unit 202 for the transform coefficients to generate a residual block in the pixel domain; the intra prediction unit 203 may be used to generate prediction data for a current video decoded block based on the determined intra prediction mode and data from a previously decoded block of a current frame or picture; the motion compensation unit 204 is a unit that determines prediction information for a video decoding block by parsing the motion vector and other associated syntax elements and uses the prediction information to generate a predictive block of the video decoding block being decoded; forming a decoded video block by summing the residual block from the inverse transform and inverse quantization unit 202 with a corresponding predictive block generated by the intra prediction unit 203 or the motion compensation unit 204; the decoded video signal is passed through a filtering unit 205 to remove blocking artifacts, which may improve video quality; the decoded video blocks are then stored in a decoded image buffer unit 206, and the decoded image buffer unit 206 stores reference images for 
subsequent intra prediction or motion compensation, and is also used for output of video signals, i.e. the restored original video signals.
It should be noted that, the encoding and decoding method in the embodiment of the present application may be applied to a video encoding and decoding chip, and the encoding performance may be significantly improved by using the MIP mode. Here, it may be applied to an intra/inter prediction section (represented by a black bold frame, specifically including the intra estimation unit 102, the intra prediction unit 103) as shown in fig. 2A, or to an intra/inter prediction section (represented by a black bold frame, specifically including the intra prediction unit 203) as shown in fig. 2B. That is, the codec method in the embodiment of the present application may be applied to both a video encoding system (simply referred to as "encoder") and a video decoding system (simply referred to as "decoder"), and may even be applied to both the video encoding system and the video decoding system at the same time, but is not limited in any way herein.
It should be further noted that, when the embodiment of the present application is applied to the encoder 10, the "current block" specifically refers to a block to be currently encoded (may also be simply referred to as an "encoding block") in a video image; when the present embodiment is applied to the decoder 20, the "current block" specifically refers to a block (may also be simply referred to as a "decoded block") to be currently decoded in a video image.
In a specific example, an implementation of an encoder using MIP is as follows. First, the input image is divided into a plurality of non-overlapping CTU blocks. Then each CTU is processed in turn in raster-scan order and divided into a plurality of CUs according to a plurality of block division modes, where the block division modes of a CTU may include quadtree partitioning, vertical binary-tree partitioning, horizontal binary-tree partitioning, vertical ternary-tree partitioning, horizontal ternary-tree partitioning and other block division modes. Determining an optimal block division mode from the plurality of block division modes comprises the following steps for the i-th block division mode Split[i]: (1) perform intra prediction with the linear-interpolation-based method and select the optimal intra prediction mode bestRegIntraMode[i] and its prediction cost bestRegIntraCost[i]; (2) when the block size of the current CU satisfies the corresponding block-size constraint, perform intra prediction in MIP mode and select the optimal MIP prediction mode bestMipIntraMode[i] and its prediction cost bestMipIntraCost[i]; (3) compare bestRegIntraCost[i] with bestMipIntraCost[i] and select the optimal intra prediction mode bestIntraMode[i] corresponding to Split[i] and its prediction cost bestIntraCost[i]; (4) predict with other methods, such as inter prediction, and select the optimal prediction mode bestOtherMode[i] and its prediction cost bestOtherCost[i]; (5) compare bestIntraCost[i] with bestOtherCost[i] and select the best prediction mode bestMode[i] of Split[i] and its prediction cost bestCost[i].
Steps (1)-(5) are performed in turn for the plurality of block division modes, and the block division mode Split_opt with the minimum prediction cost for the current CTU and the corresponding prediction mode bestMode_opt are selected. Finally, prediction is performed according to the optimal block division mode to obtain the residual blocks, which are transformed, quantized and entropy-coded; prediction information such as the block division mode is also coded, and the code stream is output and awaits transmission.
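Steps (1)-(5) amount to a nested cost minimisation. The sketch below mirrors that selection logic with hypothetical cost values; a MIP cost of None models the case where MIP prediction is skipped for the current block.

```python
def select_best_mode(reg_cost, mip_cost, other_cost):
    # Steps (3) and (5): pick the cheaper of conventional-intra vs MIP,
    # then the cheaper of best-intra vs other (e.g. inter) prediction.
    intra_cost = reg_cost if mip_cost is None else min(reg_cost, mip_cost)
    return min(intra_cost, other_cost)

def select_best_split(costs_per_split):
    # Outer loop over Split[i]: each entry is a (reg, mip, other) cost
    # triple; keep the split index with the minimum overall cost.
    best = min(range(len(costs_per_split)),
               key=lambda i: select_best_mode(*costs_per_split[i]))
    return best, select_best_mode(*costs_per_split[best])
```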
In another specific example, an implementation of a decoder using MIP is as follows. First, entropy decoding, inverse quantization and inverse transform are performed on the input code stream to obtain the residual blocks. Then the image is reconstructed from the residual blocks; the reconstruction process mainly comprises the following 3 steps: (1) determine the partition tree of the current CTU according to the block division mode; (2) process each CU of the partition tree in turn in raster-scan order and compute the predicted value Pred using the prediction mode bestMode_opt of each CU; (3) superimpose the residual value and the predicted value of the current CU to obtain the reconstructed CU. Finally, the reconstructed image is sent through a deblocking filter (Deblocking Filter, DBF), a sample adaptive offset (Sample Adaptive Offset, SAO) filter and an adaptive loop filter (Adaptive Loop Filter, ALF), and the filtered image is sent to a buffer to await video playback.
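Step (3) of this reconstruction is a per-sample addition with clipping; a minimal sketch, assuming 8-bit samples:

```python
def reconstruct_block(residual, prediction, max_val=255):
    # Reconstructed sample = clip(residual + prediction) for each
    # position of the current CU (2-D lists of equal shape).
    return [[max(0, min(max_val, r + p)) for r, p in zip(r_row, p_row)]
            for r_row, p_row in zip(residual, prediction)]
```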
Compared with conventional intra prediction, the greatest advantage of the MIP technique is that it predicts textured image blocks more accurately; its prediction model is obtained by training on massive data and is more complex and more accurate than the linear-interpolation model used by conventional intra prediction. For flat, simple image blocks, however, conventional intra prediction already predicts accurately, the gain from MIP is small, and coding complexity is increased to a certain extent. Moreover, much screen-content video is rich in flat areas, such as large solid-color background regions, for which conventional intra prediction can directly accomplish the prediction task accurately. Screen-content video also generally contains a large number of repeated patterns and a limited number of colors; these two characteristics are coded with the intra block copy (Intra Block Copy, IBC) technique and the palette (Palette, PLT) technique respectively, which yields the highest coding efficiency.
Therefore, when encoding screen-content video, MIP computation is likely to generate substantial unnecessary overhead, waste computing resources, and increase encoding time.
The embodiment of the application provides an encoding method. On the encoder side, the block type of a current block is determined based on the color parameters of the current block; the value of bypass identification information corresponding to the current block is determined based on the block type, where the bypass identification information is used to identify whether matrix weighted intra prediction MIP is used to encode the current block; a target block division mode is determined based on the value of the bypass identification information; and the current block is encoded according to the target block division mode.
In this way, the block type of the current block can be determined based on its color parameters, and whether MIP is adopted for prediction coding is decided based on the block type, so that the MIP coding process can be skipped for screen content blocks with a small number of luminance sample values and a wide luminance sampling range, significantly reducing coding complexity, shortening coding time, and further improving coding and decoding efficiency.
In an embodiment of the present application, referring to fig. 3, a schematic flow chart of an encoding method provided in an embodiment of the present application is shown. As shown in fig. 3, the method may include:
S101, determining the block type of the current block.
It should be noted that the encoding method of the embodiment of the present application is applied to an encoder. Here, a video image may be divided into a plurality of image blocks, each image block to be encoded may be referred to as a coding block, and the current block specifically refers to the coding block currently to be encoded, which may be a CTU, a CU, or the like; the embodiments of the present application are not limited in this respect.
In this embodiment of the present application, a video image may be divided according to a preset size to obtain N coding blocks, where N is an integer greater than zero and the N coding blocks do not overlap one another. The preset size refers to a preset block size value. Here, the preset size may be any one of 8, 16, 32, 64, etc., or any one of 8×8, 16×16, 32×32, 64×64, etc.; the embodiment of the present application is not particularly limited. In a specific example, the preset size may be 8×8, in which case a video image is divided into N 8×8 coding blocks.
It should be further noted that the embodiment of the present application mainly provides a MIP adaptive bypass technique based on content analysis, which analyzes the image content: when most areas of the image are screen content with a small number of luminance sample values and a wide luminance sampling range, MIP predictive coding is skipped, since such screen content can obtain better coding efficiency with IBC, PLT, and conventional intra-frame prediction; otherwise, predictive coding based on MIP is used. Thus, coding complexity can be reduced while coding performance is guaranteed.
In this embodiment of the present application, whether to use MIP predictive coding may be determined for one frame of video image, or for the CUs of one or more regions in one frame of video image; the specific choice may be made according to the actual situation, and the embodiment of the present application is not specifically limited.
In the embodiment of the application, the color parameters of the current block in the video image are acquired, and the block type of the current block is determined based on these color parameters, where the color parameters include the number of luminance sample values and the luminance sampling range. Specifically, the process of determining the color parameters of the current block in the video image includes: for the current block of the video image, acquiring the maximum luminance component sample value, the minimum luminance component sample value, and the number of luminance sample values of the current block; and determining the luminance sampling range of the current block from the maximum and minimum luminance component sample values of the current block.
It should be noted that the number of luminance sample values of the current block is determined by counting the number of distinct luminance component sample values appearing in the current block.
It should be further noted that the current block may be a luma block or a chroma block. That is, in the present application, the luminance component sample values of a luma block may be obtained and the block type of the luma block determined based on them; or the luminance component sample values of a chroma block may be obtained and the block type of the chroma block determined based on them.
Illustratively, for an 8-bit pixel sampling bit depth, the maximum luminance component sample value of the current block is 2^8 - 1, i.e., 255, and the minimum luminance component sample value is 0. The number of distinct luminance sample values of the current block then lies in the range from 1 to 256.
In the embodiment of the application, the difference between the maximum luminance component sample value and the minimum luminance component sample value of the current block is calculated, and this difference is taken as the luminance sampling range of the current block. Specifically, as shown in formula (2),
R = Y_max - Y_min    (2)

where R is the luminance sampling range, and Y_max and Y_min are the maximum and minimum luminance component sample values of the current block, respectively.
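As an illustrative, non-limiting sketch (the function name and block contents are hypothetical, not part of the embodiment), the two statistics used here, the number of luminance sample values N and the sampling range R of formula (2), may be computed as follows:

```python
def luma_stats(block):
    """block: 2-D list of luminance component sample values.
    Returns (N, R): the number of distinct luminance sample values N and
    the luminance sampling range R = Y_max - Y_min as in formula (2)."""
    values = {y for row in block for y in row}  # distinct sample values
    return len(values), max(values) - min(values)

# A screen-content-like 4x4 block: only two, widely separated, values.
screen = [[0, 0, 255, 255]] * 4
# A natural-content-like block: many values within a narrow band.
natural = [[100 + 4 * i + j for j in range(4)] for i in range(4)]
print(luma_stats(screen))   # (2, 255)
print(luma_stats(natural))  # (16, 15)
```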
In the embodiment of the application, after the number of luminance sample values and the luminance sampling range of the current block in the video image are determined, the content parameter of the current block is determined from them; the content parameter is then compared with a first preset threshold. If the content parameter of the current block is smaller than the first preset threshold, the block type of the current block is determined to be a first type block; if the content parameter of the current block is greater than or equal to the first preset threshold, the block type of the current block is determined to be a second type block. Here, a first type block is a natural content block whose number of luminance sample values is not smaller than a preset number threshold and whose luminance sampling range is not larger than a preset range threshold; a second type block is a screen content block whose number of luminance sample values is smaller than the preset number threshold and whose luminance sampling range is larger than the preset range threshold. That is, a first type block has more luminance sample values and a narrower sampling range, while a second type block has fewer luminance sample values and a wider sampling range.
Illustratively, the content parameter of the current block is determined from the number of luminance sample values of the current block and the luminance sampling range of the current block as shown in formula (3),

where N is the number of luminance sample values and D is the content parameter. As shown in formula (3), if the number of luminance sample values of the current block is 1, the content parameter of the current block is 0; otherwise, the content parameter of the current block is proportional to the luminance sampling range of the current block and inversely proportional to the number of luminance sample values of the current block. That is, the wider the luminance sampling range and the smaller the number of luminance sample values of the current block, the larger its content parameter; the narrower the luminance sampling range and the larger the number of luminance sample values, the smaller its content parameter. The content parameter is thus used to jointly characterize the luminance sampling range and the number of luminance sample values of the current block: it is larger when the current block has few luminance sample values spanning a wide range, and smaller when the current block has many luminance sample values within a narrow range.
It should be noted that, the first preset threshold may be set according to an actual situation, and the embodiment of the application is not specifically limited, and in an actual application, the value of the first preset threshold may be 3, 4, 5, or the like.
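The exact expression of formula (3) is not reproduced in this text. The sketch below therefore uses an assumed piecewise form, D = 0 when N = 1 and D = R / N otherwise, which matches the stated proportionality but is only one plausible realization; the threshold T2 = 4 is one of the example first-preset-threshold values mentioned above:

```python
def content_parameter(n, r):
    """Content parameter D per the description of formula (3): D = 0 when
    N == 1; otherwise D grows with the sampling range R and shrinks with
    the number of sample values N. D = R / N is an assumed form."""
    return 0.0 if n == 1 else r / n

def block_type(n, r, t2=4.0):
    """Classify a block: 'natural' (first type) if D < T2, otherwise
    'screen' (second type). T2 is the first preset threshold."""
    return 'natural' if content_parameter(n, r) < t2 else 'screen'

print(block_type(2, 255))   # 'screen': few values over a wide range (D = 127.5)
print(block_type(200, 60))  # 'natural': many values in a narrow range (D = 0.3)
```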
Further, the block types of at least one coding block in the video image are determined in turn, where the at least one coding block may be one or more coding blocks corresponding to a slice, one or more coding blocks corresponding to a tile, one or more coding blocks corresponding to the CUs of a preset region, or the coding blocks corresponding to one frame of video image. The specific choice may be made according to the actual situation, and the embodiment of the present application is not specifically limited.
It should be noted that the video image may be divided into N slices, where each slice includes a plurality of CTUs, and the type of a slice can be determined from the block types of all the coding blocks in the slice. Alternatively, the video image may be divided into N rectangular areas in the horizontal or vertical direction, called tiles, each including an integer number of CTUs, and the type of a tile is determined from the block types of all the coding blocks in the tile. Alternatively, the type of the CUs of one or more areas is determined from the block types of all the coding blocks in those CUs; or the type of one frame of video image is determined from the block types of all the coding blocks in that frame. The specific choice may be made according to the actual situation, and the embodiment of the present application is not specifically limited.
Further, after the block types corresponding to at least one coding block in the video image are acquired in turn, multi-scale block type fusion may be performed on the at least one coding block based on the corresponding block types, and the screen content block ratio is then calculated based on the fused block types.
In practical application, fusion can be performed on one frame of video image and the screen content block ratio calculated. Specifically, after the block type of each 8×8 block is obtained, four neighboring 8×8 blocks are combined in turn into a 16×16 block, and the 4 block types within each 16×16 block are examined: if at least 3 of the 4 block types are the same, the whole 16×16 block is fused into a single block whose type is that same block type; if the 4 block types are split 2-2, no block type fusion is performed on the 16×16 block. This is executed in turn until the block type fusion of all 16×16 blocks is completed, after which block type fusion of 32×32 blocks and then of 64×64 blocks is executed in the same way. When the fusion is completed, the proportion of screen content blocks among all blocks in the frame is calculated from the fused block types, yielding the screen content block ratio.
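One fusion step of the multi-scale procedure above (four co-located child types merge when at least 3 of the 4 agree; a 2-2 split is left unfused) may be sketched as follows; the function name and type labels are hypothetical:

```python
from collections import Counter

def fuse_quad(child_types):
    """One multi-scale fusion step: four co-located child block types are
    merged into a single parent type when at least 3 of the 4 agree;
    a 2-2 split leaves the children unfused (returns None)."""
    assert len(child_types) == 4
    label, count = Counter(child_types).most_common(1)[0]
    return label if count >= 3 else None

print(fuse_quad(['screen', 'screen', 'screen', 'natural']))   # 'screen'
print(fuse_quad(['screen', 'screen', 'natural', 'natural']))  # None
```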
S102, determining the value of the bypass identification information corresponding to the current block based on the block type, where the bypass identification information is used to identify whether matrix weighted intra prediction (MIP) coding is adopted for the current block.
It should be noted that, in the embodiment of the present application, bypass identification information is newly added to characterize whether MIP coding is adopted for the current block. In the embodiment of the present application, whether MIP is adopted for intra prediction of the current block is determined according to the specific block type, and the specific value of the bypass identification information is determined accordingly; MIP predictive coding can then be executed or skipped based on that value. In this way, the MIP coding process can be skipped for screen content blocks with a small number of luminance sample values and a wide luminance sampling range, significantly reducing coding complexity.
In the embodiment of the application, the block types corresponding to at least one coding block in the video image may be determined in turn. If the number of second type blocks among the block types corresponding to the at least one coding block is greater than a second preset threshold, the value of the bypass identification information is determined to be the first value; if the number of second type blocks is not greater than the second preset threshold, the value of the bypass identification information is determined to be the second value. The at least one coding block may include the current block.
The first value and the second value are different, and the first value and the second value may be in the form of parameters or in the form of numbers. In general, the bypass identification information of the video image may be a flag (flag), which is not limited in any way.
It should also be noted that, if the bypass identification information is a flag, in a specific example the first value may be set to 1 and the second value to 0; in another specific example, the first value may be set to true and the second value to false; in yet another specific example, the first value may be set to 0 and the second value to 1; alternatively, the first value may be set to false and the second value to true. The first value and the second value in the embodiment of the present application are not limited in any way.
In practical application, the value of the bypass identification information can be determined based on the block types of the first 10 image blocks in a frame of video image. Specifically, if the block types of at least 8 of the first 10 image blocks are screen content blocks, the value of the bypass identification information is the first value, indicating that MIP coding is skipped for the image blocks of the video image; if the block types of at least 8 of the first 10 image blocks are not screen content blocks, the value of the bypass identification information is the second value, indicating that the image blocks of the video image are coded with MIP.
Further, if multi-scale block type fusion is performed on the at least one coding block and the screen content block ratio is calculated based on the fused block types, the screen content block ratio is compared with a third preset threshold: if the screen content block ratio is not smaller than the third preset threshold, the value of the bypass identification information is determined to be the first value; if the screen content block ratio is smaller than the third preset threshold, the value of the bypass identification information is determined to be the second value.
It should be noted that the specific value of the third preset threshold may be selected according to the actual situation, and the embodiment of the application is not specifically limited; in practical application, the third preset threshold may be set to 0.1, 0.2, or 0.3.
In the embodiment of the application, if the value of the bypass identification information is the first value, MIP coding of the current block is skipped; that is, the MIP coding process is skipped for screen content blocks with a small number of luminance sample values and a wide luminance sampling range, significantly reducing coding complexity. If the value of the bypass identification information is the second value, MIP is adopted to encode the current block.
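A picture-level sketch of this decision, returning 1 to enable MIP and 0 to bypass it (following the pic_mip_enable_flag convention used elsewhere in this text), with a hypothetical threshold value:

```python
def mip_bypass_flag(screen_ratio, t1=0.2):
    """Decide pic_mip_enable_flag for a picture from its fused screen
    content block ratio. t1 plays the role of the third preset threshold
    (example values 0.1 / 0.2 / 0.3). Returns 1 (use MIP) when the ratio
    is below t1, and 0 (skip MIP) when it is not smaller than t1."""
    return 1 if screen_ratio < t1 else 0

print(mip_bypass_flag(0.05))  # 1: mostly natural content, run MIP
print(mip_bypass_flag(0.8))   # 0: mostly screen content, bypass MIP
```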
It should be noted that the level of the bypass identification information may be one of the following: picture level, i.e., one piece of bypass identification information is calculated for one frame of video image; slice level, i.e., one piece of bypass identification information is calculated for a slice in a frame of video image, as the MIP predictive coding constraint of that slice; or tile level, i.e., one piece of bypass identification information is calculated for a tile in a frame of video image, as the MIP predictive coding constraint of that tile. The specific choice is made according to the actual situation, and the embodiment of the application is not specifically limited.
Based on the above description, in an alternative embodiment, the bypass identification information may be determined for one frame of video image, as shown in fig. 4:
1. Input a video image sequence.
2. Calculate the screen content block ratio r of the i-th frame of the sequence, where i is a natural number greater than or equal to 0. Specifically, step 2 includes:
2.1. Divide the image into a plurality of 8×8 image blocks.
2.2. Count the number of luminance sample values N and the luminance sampling range R of each unit block, and calculate the content parameter D of each unit block from N and R.
2.3. Compare D with the first preset threshold T2.
2.4. If D is smaller than T2, the image block is determined to be a natural content block.
2.5. If D is not smaller than T2, the image block is determined to be a screen content block.
2.6. Perform block type fusion on the i-th frame of video image, successively using unit blocks of 16×16, 32×32, and 64×64.
2.7. Calculate the screen content area ratio r from the i-th frame of video image after block type fusion.
3. Compare r with the threshold T1.
4. If r is smaller than T1, encode the i-th frame of video image with MIP, and set the value of the bypass identification information pic_mip_enable_flag = 1.
5. If r is not smaller than T1, skip MIP coding of the i-th frame of video image, and set the value of the bypass identification information pic_mip_enable_flag = 0.
6. Increase i by 1, and determine from i whether the whole video sequence has been encoded.
7. If it is determined that the video sequence has been fully encoded, finish encoding.
8. If it is determined that the video sequence has not been fully encoded, return to step 2.
It should be noted that, for 2.4 and 2.5, there is also an alternative comparison result: the image block is determined to be a natural content block if D is not smaller than T2, and a screen content block if D is smaller than T2.
S103, determining a target block division mode based on the value of the bypass identification information.
After the value of the bypass identification information corresponding to the current block is determined based on the block type, the target block division mode is determined based on that value. Specifically, the video image is divided into a plurality of non-overlapping CTUs, and each CTU is processed in turn in raster scan order: the CTU is divided into a plurality of CUs according to a plurality of candidate block division modes, the prediction mode and prediction cost corresponding to each block division mode are determined, and the target block division mode is selected from the candidates. The value of the bypass identification information is applied when determining the prediction modes and prediction costs corresponding to the candidate block division modes.
In the embodiment of the application, firstly, intra prediction is performed with the linear interpolation prediction method to obtain a first prediction mode and a first prediction cost. If it is determined based on the value of the bypass identification information that MIP is adopted for intra prediction, i.e., pic_mip_enable_flag = 1, intra prediction is also performed with MIP to obtain a second prediction mode and a second prediction cost, and, based on the comparison of the first and second prediction costs, the intra prediction cost and its corresponding intra prediction mode are selected from them. If it is determined based on the value of the bypass identification information that MIP is skipped for intra prediction, i.e., pic_mip_enable_flag = 0, the first prediction cost is determined as the intra prediction cost and the first prediction mode as the intra prediction mode. Inter prediction is then performed with an inter prediction method to obtain an inter prediction mode and an inter prediction cost. Finally, based on the comparison of the intra prediction cost and the inter prediction cost, the prediction modes and prediction costs corresponding to the candidate block division modes are determined from them.
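The intra mode selection step above may be sketched as follows; the mode names and cost values are hypothetical, with costs standing in for rate-distortion costs:

```python
def choose_intra(reg_mode, reg_cost, mip_result, mip_enabled):
    """Select the intra mode for one block-division candidate.
    Conventional (linear-interpolation) intra prediction is always tried;
    MIP is tried only when mip_enabled (pic_mip_enable_flag == 1), and the
    lower-cost candidate wins. mip_result is (mode, cost) or None."""
    if mip_enabled and mip_result is not None:
        mip_mode, mip_cost = mip_result
        if mip_cost < reg_cost:
            return mip_mode, mip_cost
    return reg_mode, reg_cost

# MIP bypassed: its candidate is never evaluated, conventional intra wins.
print(choose_intra('angular_18', 120.0, ('mip_5', 90.0), mip_enabled=False))
# MIP enabled and cheaper: it wins.
print(choose_intra('angular_18', 120.0, ('mip_5', 90.0), mip_enabled=True))
```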
S104, coding the current block according to the target block division mode.
In this embodiment of the present application, after the target block division mode is determined based on the value of the bypass identification information, in order to enable the decoder to obtain the target block division mode and its corresponding prediction mode, the encoder needs to encode the target block division mode and the corresponding prediction mode, and write the coded bits into the code stream to be transmitted to the decoder. Further, prediction can be performed according to the target block division mode to obtain the residual value of the current block; the residual is then transformed, quantized, and entropy coded, and the coded bits are written into the code stream.
Specifically, a current block is divided according to a target block division mode to obtain a division tree of the current block, wherein the division tree comprises one or more node sub-blocks obtained by dividing the current block; then, according to a preset node subblock processing sequence, determining a target prediction mode of each node subblock in sequence; determining a predicted value of the node subblock according to the target prediction mode; determining a residual error value of the node sub-block according to the original value and the predicted value of the node sub-block; and finally, coding the target prediction mode and the residual value of the node subblock, and writing the coding bit into the code stream.
In addition, the embodiment of the application may further provide a code stream, where the code stream includes one or more coding bits of the bypass identification information, the target block division mode, the target prediction mode of the node sub-block, and the residual value.
In an alternative embodiment, the value of the bypass identification information may also be encoded and the coded bits written into the code stream.
It should also be noted that, in a specific example, an encoder implementation of the content-analysis-based MIP adaptive bypass technique is as follows. First, the input image is divided into a plurality of non-overlapping CTUs. Then each CTU is processed in turn in raster scan order: the CTU is divided into a plurality of CUs according to a plurality of block division modes, and the optimal block division mode is determined from among them by the following steps. For the i-th block division mode Split[i]: (1) perform intra prediction with the linear-interpolation-based method, and select the optimal conventional intra prediction mode bestRegIntraMode[i] and its prediction cost bestRegIntraCost[i]; (2) when the block size of the current CU satisfies the corresponding block size constraint and the value of the bypass identification information is the second value, i.e., MIP is adopted to encode the current CU, perform intra prediction with MIP, and select the optimal MIP prediction mode bestMipIntraMode[i] and its prediction cost bestMipIntraCost[i]; (3) compare bestRegIntraCost[i] with bestMipIntraCost[i], and select the optimal intra prediction mode bestIntraMode[i] and prediction cost bestIntraCost[i] corresponding to Split[i]; (4) perform prediction with other methods, such as inter prediction, and select the optimal prediction mode bestOtherMode[i] and prediction cost bestOtherCost[i]; (5) compare bestIntraCost[i] with bestOtherCost[i], and select the best prediction mode bestMode[i] and prediction cost bestCost[i] of Split[i].
Steps (1)-(5) are executed in turn for the plurality of block division modes, and the block division mode Split_opt with the minimum prediction cost for the current CTU and the corresponding prediction mode bestMode_opt are selected. Finally, prediction is performed according to the optimal block division mode to obtain the residual block; the residual block is transformed, quantized, and entropy coded, prediction information such as the block division mode is encoded, and the code stream is output for transmission.
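The final minimum-cost selection over the candidate block division modes may be sketched as follows, with hypothetical split identifiers and costs:

```python
def best_split(candidates):
    """Pick the block division mode with minimum prediction cost for a CTU.
    candidates: list of (split_id, best_mode, best_cost) tuples, one per
    tried block division mode, as produced by steps (1)-(5) above."""
    return min(candidates, key=lambda c: c[2])

cands = [('quad', 'mode_a', 310.0),
         ('none', 'mode_b', 280.0),
         ('bt_h', 'mode_c', 295.5)]
print(best_split(cands))  # ('none', 'mode_b', 280.0)
```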
Based on the above embodiments, after the technique proposed in the embodiments of the present application was implemented on the VVC reference software VTM11.0, it was tested in the all-intra coding configuration on the screen content test sequences recommended by JVET, where the test sequences include three classes: TGM (text and graphics with motion), M (mixed content), and A (animation). For all sequences coded with MIP bypass (7 TGM sequences and 1 M sequence), the average BD-rate changes on the Y, Cb, and Cr components were -0.182%, -0.568%, and -0.034%, respectively, while the average change in encoding time was -4.3%. The data show that the technique can save a certain amount of coding time without reducing coding performance.
As can be seen from the above embodiments, the technique proposed in the embodiments of the present application can reduce coding complexity without reducing coding performance. Specifically, for typical screen content sequences (i.e., with a small number of luminance sample values and a wide luminance sampling range), the technique directly bypasses MIP predictive coding; since such sequences already obtain better coding efficiency with IBC, PLT, and conventional intra prediction, no coding performance is lost even though MIP is bypassed, and the coding time is reduced by 4.3%. In summary, the present technique can significantly reduce coding complexity while maintaining coding performance substantially comparable to the prior art.
The embodiment of the application provides an encoding method applied to an encoder: determining the color parameters of a current block in a video image; obtaining the block type of the current block based on the color parameters of the current block; determining the value of the bypass identification information corresponding to the current block based on the block type, where the bypass identification information is used to identify whether matrix weighted intra prediction (MIP) is adopted to encode the current block; determining the target block division mode based on the value of the bypass identification information; and encoding the current block according to the target block division mode. In this way, the block type of the current block can be determined based on its color parameters, and whether MIP is adopted for prediction coding is decided based on the block type, so that the MIP coding process can be skipped for screen content blocks with a small number of luminance sample values and a wide luminance sampling range, significantly reducing coding complexity, shortening coding time, and improving coding and decoding efficiency.
In yet another embodiment of the present application, reference is made to fig. 5, which shows a schematic flow chart of a decoding method provided in an embodiment of the present application. As shown in fig. 5, the method may include:
S201, analyzing the code stream and determining the target block division mode of the current block.
It should be noted that the decoding method in the embodiment of the present application is applied to a decoder. Here, for a video image, the video image may be divided into a plurality of image blocks, wherein each image block to be decoded may be referred to as a decoding block, and the current block herein refers specifically to a decoding block currently to be decoded; after decoding is completed, video playback can be awaited.
In the embodiment of the application, when encoding the current block, the encoding end determines the target block division mode of the current block and writes it into the code stream. Therefore, at the decoding end, the target block division mode can be obtained by parsing the code stream, and is used for determining the predicted value and the residual value of the current block in the decoding process.
Here, determining whether a video image is decoded using MIP may be represented by a bypass flag. Specifically, in some embodiments, the method may further comprise:
analyzing the code stream to obtain the value of the bypass identification information;
if the value of the bypass identification information is the first value, skipping MIP decoding the current block; or if the value of the bypass identification information is the second value, adopting MIP to decode the current block.
The first value and the second value are different, and the first value and the second value may be in the form of parameters or in the form of numbers. In general, the bypass identification information may be a flag (flag), which is not limited in any way.
In addition, taking the MIP bypass identification information being a flag as an example, in a specific example the first value may be set to 0 and the second value to 1; in another specific example, the first value may be set to false and the second value to true; in yet another specific example, the first value may be set to 1 and the second value to 0; alternatively, the first value may be set to true and the second value to false. The first value and the second value in the embodiment of the present application are not limited in any way.
In this way, assuming that the first value is 0 and the second value is 1, after the decoder parses the code stream to obtain the value of the bypass identification information: if the value of the MIP bypass identification information is 0, the decoder determines that the encoder side skipped the MIP predictive coding process (saving coding time and significantly reducing coding complexity), and the decoder side accordingly skips MIP decoding of the current block. Otherwise, if the value of the MIP bypass identification information is 1, it is determined that the encoder side performed predictive coding with MIP, and the decoder side also adopts MIP to decode the current block.
S202, analyzing the code stream based on the target block division mode, and determining the predicted value of the current block.
In the embodiment of the present application, a partition tree of the current block is determined according to the target block division manner, where the partition tree includes one or more node sub-blocks obtained by dividing the current block. The code stream of each node sub-block of the partition tree is then parsed sequentially according to a preset node sub-block processing order, a target prediction mode of each node sub-block is determined, and a prediction value of each node sub-block is determined based on that prediction mode.
S203, analyzing the code stream based on the target block division mode, and determining the residual error value of the current block.
It should be noted that, after the target block division manner is obtained by decoding, in some embodiments, parsing the code stream based on the target block division manner and determining the residual value of the current block may include:
and sequentially analyzing the code stream of each node sub-block of the partition tree according to a preset node sub-block processing sequence, and determining the residual error value of each node sub-block.
Here, the preset node sub-block processing order may be a preset scan order. That is, the embodiment of the present application may sequentially parse the code stream of each node sub-block of the partition tree according to the preset scanning order, so as to determine the residual value of each node sub-block.
S204, determining a reconstruction value of the current block based on the predicted value and the residual value.
In a specific example, the determining the reconstruction value of the current block based on the prediction value and the residual value may include: and carrying out addition calculation on the predicted value and the residual value, and determining the reconstruction value of the current block.
It should be noted that, after the block division parameters are obtained by decoding, the predicted value of the current block can be obtained by parsing the code stream, and the residual value of the current block can likewise be obtained by parsing the code stream; the reconstructed value of the current block can then be determined by adding the predicted value and the residual value.
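The addition in S204 can be illustrated per sample as below. The clipping to the valid sample range is an added assumption for illustration, not something stated in this excerpt.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Add predicted and residual samples elementwise to obtain the
    reconstructed block. Clipping to [0, 2**bit_depth - 1] is an
    assumption added here to keep samples in a valid range."""
    max_v = (1 << bit_depth) - 1
    return [min(max(p + r, 0), max_v) for p, r in zip(pred, resid)]
```

For example, with 8-bit samples, `reconstruct_block([100, 200], [10, 100])` yields `[110, 255]` (the second sample is clipped).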
It should also be noted that, in a specific example, a decoder implementing the content-analysis-based MIP adaptive bypass technique operates as follows. First, entropy decoding, inverse quantization, and inverse transformation are performed on the input code stream to obtain residual values. Then, the image is reconstructed from the residual blocks; the reconstruction process mainly comprises the following three steps: (1) determining the partition tree of the current CTU according to the target block division manner; (2) processing each CU of the partition tree sequentially in raster scan order and calculating a prediction value using the prediction mode of each CU; (3) superposing the residual value and the predicted value of the current CU to obtain the reconstructed CU. Finally, the reconstructed image is sent to deblocking filter (Deblocking Filter, DBF) / sample adaptive offset (Sample Adaptive Offset, SAO) / adaptive loop filter (Adaptive Loop Filter, ALF) filtering, and the filtered image is sent to a buffer to await video playback.
This embodiment provides a decoding method applied to a decoder: parsing the code stream to determine the target block division manner of the current block; parsing the code stream based on the target block division manner to determine the predicted value of the current block; parsing the code stream based on the target block division manner to determine the residual value of the current block; and determining the reconstructed value of the current block based on the predicted value and the residual value. In this way, the block type of the current block can be determined based on the color parameter of the current block, and whether MIP is used for prediction can be decided based on that block type, so that the MIP coding process can be skipped for screen content blocks with a small number of luminance sample values and a wide luminance sampling range, significantly reducing coding complexity, reducing coding time, and further improving coding and decoding efficiency.
In still another embodiment of the present application, based on the same inventive concept as the previous embodiment, reference is made to fig. 6, which shows a schematic diagram of the composition structure of an encoder 1 provided in an embodiment of the present application. As shown in fig. 6, the encoder 1 may include: a first determination unit 10 and an encoding unit 11; wherein,
the first determining unit 10 is configured to determine a block type of the current block; determining the value of bypass identification information corresponding to the current block based on the block type, wherein the bypass identification information is used for identifying whether matrix weighted intra-frame prediction MIP is adopted to encode the current block; and determining a target block division mode based on the value of the bypass identification information.
The encoding unit 11 is configured to encode the current block according to the target block division manner.
In some embodiments, the first determining unit 10 is configured to determine a color parameter of a current block in the video image, and determine a block type of the current block based on the color parameter of the current block.
In some embodiments, the color parameters include a number of luminance sample values and a luminance sample range; the encoder 1 further comprises:
the first determining unit 10 is further configured to determine a maximum luminance component sampling value, a minimum luminance component sampling value, and the number of luminance sampling values of the current block; and determining the brightness sampling range of the current block according to the maximum brightness component sampling value and the minimum brightness component sampling value of the current block.
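The two color parameters described for this unit can be computed directly from a block's luma samples; a minimal sketch with a hypothetical helper name:

```python
def luma_color_parameters(luma_samples):
    """Return (number of distinct luma sample values, luma sampling range),
    where the range is the maximum luma component sample value minus the
    minimum, as described above. `luma_samples` is a flat, non-empty list."""
    num_values = len(set(luma_samples))
    sample_range = max(luma_samples) - min(luma_samples)
    return num_values, sample_range
```

For example, a block containing only the values 16 and 235 has two distinct sample values and a sampling range of 219.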
In some embodiments, the first determining unit 10 is further configured to determine a content parameter of the current block according to the number of luminance sample values of the current block and the luminance sample range of the current block; if the content parameter of the current block is smaller than a first preset threshold value, determining that the block type of the current block is a first type block; and if the content parameter of the current block is greater than or equal to the first preset threshold value, determining that the block type of the current block is a second type block.
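A sketch of this classification follows. Note that the excerpt does not give the exact formula for the content parameter, so the ratio `num_values / (sample_range + 1)` used here is only an assumed example combining the two stated inputs (few distinct values over a wide range gives a small parameter, hence a first-type block).

```python
def classify_block(num_values, sample_range, first_threshold):
    """Classify a block from its content parameter. The formula
    num_values / (sample_range + 1) is an assumption for illustration;
    the excerpt only states that the parameter is derived from the
    number of luma sample values and the luma sampling range."""
    content = num_values / (sample_range + 1)
    return "first-type" if content < first_threshold else "second-type"
```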
In some embodiments, the first determining unit 10 is further configured to determine a block type corresponding to at least one encoded block included in the video image; if the number of the second type blocks in the block types corresponding to the at least one coding block is larger than a second preset threshold value, determining the value of bypass identification information corresponding to the current block as a first value; and if the number of the second type blocks in the block types corresponding to the at least one coding block is smaller than or equal to the second preset threshold value, determining the value of the bypass identification information corresponding to the current block as a second value.
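This count-based decision can be sketched as follows; the 0/1 value mapping follows the earlier flag example, and the type labels are illustrative assumptions.

```python
FIRST_VALUE, SECOND_VALUE = 0, 1  # example mapping from the text

def bypass_value_by_count(coded_block_types, second_threshold):
    """Set the bypass identification value from the number of second-type
    blocks among the already-coded blocks, as described above."""
    count = sum(1 for t in coded_block_types if t == "second-type")
    return FIRST_VALUE if count > second_threshold else SECOND_VALUE
```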
In some embodiments, the encoder further comprises: a fusion unit and a calculation unit;
the first determining unit 10 is further configured to determine a block type corresponding to at least one coding block in the video image;
the fusion unit is configured to fuse the at least one coding block based on the block type corresponding to the at least one coding block to obtain a fused coding block;
the calculating unit is configured to calculate a screen content block proportion based on the block type of the fused coding block;
the first determining unit 10 is further configured to determine that the value of the bypass identification information is the first value if the screen content block proportion is greater than or equal to a third preset threshold value; and determine that the value of the bypass identification information is the second value if the screen content block proportion is smaller than the third preset threshold value.
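A sketch of the proportion-based decision follows; the fusion step itself is omitted here, and the value mapping and type labels are illustrative assumptions.

```python
FIRST_VALUE, SECOND_VALUE = 0, 1  # example mapping from the text

def bypass_value_by_ratio(fused_block_types, threshold):
    """Set the bypass identification value from the proportion of
    screen-content blocks among the fused coding blocks. The fusion
    of coding blocks is assumed to have happened already; the
    'screen-content' label is illustrative."""
    screen = sum(1 for t in fused_block_types if t == "screen-content")
    ratio = screen / len(fused_block_types)
    return FIRST_VALUE if ratio >= threshold else SECOND_VALUE
```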
In some embodiments, if the value of the bypass identification information is the first value, MIP encoding of the current block is skipped; and if the value of the bypass identification information is the second value, the current block is encoded using MIP.
In some embodiments, the level of bypass identification information includes at least one of: picture level, slice level, coding tree unit CTU level, coding unit CU level.
In some embodiments, the first determining unit 10 is further configured to determine a prediction mode and a prediction cost corresponding to the multiple division modes based on the value of the bypass identification information; determining the target block division mode with the minimum prediction cost and a target prediction mode of each node subblock from the plurality of block division modes; each node sub-block is obtained by dividing the current block based on the target block dividing mode.
In some embodiments, the first determining unit 10 is further configured to: perform intra prediction using a linear interpolation prediction method to obtain a first prediction mode and a first prediction cost; if it is determined, based on the value of the bypass identification information, that MIP is to be used for intra prediction, perform intra prediction using MIP to obtain a second prediction mode and a second prediction cost, and determine, from the first prediction cost and the second prediction cost based on a comparison of the two, the intra prediction cost and the corresponding intra prediction mode; if it is determined, based on the value of the bypass identification information, that MIP is to be skipped for intra prediction, determine the first prediction cost as the intra prediction cost and the first prediction mode as the intra prediction mode; perform inter prediction using an inter prediction method to obtain an inter prediction mode and an inter prediction cost; and determine, from the intra prediction cost and the inter prediction cost based on a comparison of the two, the prediction modes and prediction costs corresponding to the plurality of block division manners.
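The chain of cost comparisons described for this unit can be sketched as below; the cost metric, the tie-breaking direction, and the argument names are assumptions for illustration.

```python
def select_prediction(cost_interp, mode_interp, cost_inter, mode_inter,
                      use_mip=False, cost_mip=None, mode_mip=None):
    """Pick the minimum-cost prediction following the comparison order
    described above: linear-interpolation intra vs. (optional) MIP intra
    first, then the intra winner vs. inter prediction. Ties keep the
    earlier candidate (an assumption, not stated in the text)."""
    intra_cost, intra_mode = cost_interp, mode_interp
    if use_mip and cost_mip is not None and cost_mip < intra_cost:
        intra_cost, intra_mode = cost_mip, mode_mip
    if cost_inter < intra_cost:
        return cost_inter, mode_inter
    return intra_cost, intra_mode
```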
In some embodiments, the first determining unit 10 is further configured to divide the current block according to the target block division manner to obtain a division tree of the current block, where the division tree includes one or more node sub-blocks obtained by dividing the current block; determining a target prediction mode of each node sub-block; determining a predicted value of the node subblock according to the target prediction mode; determining a residual error value of the node sub-block according to the original value and the predicted value of the node sub-block;
the encoding unit 11 is further configured to encode the target prediction mode and the residual value of the node sub-block, and write encoded bits into the code stream.
In some embodiments, the encoding unit 11 is further configured to encode the target block division mode and write encoded bits into a code stream.
It will be appreciated that in the embodiments of the present application, a "unit" may be a part of a circuit, a part of a processor, a part of a program or software, etc., and may of course be a module, or may be non-modular. Furthermore, the components in the present embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional modules.
The integrated units, if implemented in the form of software functional modules and not sold or used as separate products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present embodiment may be embodied essentially, or in part, in the form of a software product, which is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the method described in the present embodiment. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, or other media capable of storing program code.
Accordingly, an embodiment of the present application provides a computer storage medium, applied to the encoder 1, storing a computer program which, when executed by a first processor, implements the method of any of the previous embodiments.
Based on the above composition of the encoder 1 and the computer storage medium, reference is made to fig. 7, which shows a schematic diagram of a specific hardware structure of the encoder provided in the embodiment of the present application. As shown in fig. 7, the encoder may include: a first communication interface 12, a first memory 13, and a first processor 14, the various components being coupled together by a first bus system 15. It will be appreciated that the first bus system 15 is used to enable connection and communication between these components. In addition to the data bus, the first bus system 15 includes a power bus, a control bus, and a status signal bus; however, for clarity of illustration, the various buses are labeled as the first bus system 15 in fig. 7. Wherein,
a first communication interface 12, configured to receive and transmit signals during a process of transmitting and receiving information with other external network elements;
a first memory 13 for storing a computer program capable of running on the first processor 14;
a first processor 14 for, when running the computer program, executing:
determining a block type of a current block;
determining the value of bypass identification information corresponding to the current block based on the block type, wherein the bypass identification information is used for identifying whether matrix weighted intra-frame prediction MIP is adopted to encode the current block;
Determining a target block division mode based on the value of the bypass identification information;
and encoding the current block according to the target block dividing mode.
It is understood that the first memory 13 in the embodiments of the present application may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memories. The nonvolatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The first memory 13 of the systems and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
The first processor 14 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in software form in the first processor 14. The first processor 14 may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field Programmable Gate Array, FPGA), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like. The steps of the methods disclosed in connection with the embodiments of the present application may be embodied as being performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the first memory 13, and the first processor 14 reads the information in the first memory 13 and completes the steps of the above method in combination with its hardware.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing units may be implemented within one or more application specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processing, DSP), digital signal processing devices (DSP devices, DSPD), programmable logic devices (Programmable Logic Device, PLD), field programmable gate arrays (Field-Programmable Gate Array, FPGA), general purpose processors, controllers, microcontrollers, microprocessors, other electronic units configured to perform the functions described herein, or a combination thereof. For a software implementation, the techniques described herein may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
Optionally, as another embodiment, the first processor 14 is further configured to perform the method of any of the preceding embodiments when the computer program is run.
The present embodiment provides an encoder, which may include a first determination unit and an encoding unit. In this way, the block type of the current block can be determined based on the color parameter of the current block, and whether MIP is used for prediction can be decided based on that block type, so that the MIP coding process can be skipped for screen content blocks with a small number of luminance sample values and a wide luminance sampling range, significantly reducing coding complexity, reducing coding time, and further improving coding and decoding efficiency.
In still another embodiment of the present application, based on the same inventive concept as the previous embodiment, referring to fig. 8, a schematic diagram of the composition structure of a decoder 2 provided in the embodiment of the present application is shown. As shown in fig. 8, the decoder 2 may include: an analysis unit 20 and a second determination unit 21; wherein,
the parsing unit 20 is configured to parse the code stream and determine a target block division manner of the current block; parse the code stream based on the target block division manner and determine a predicted value of the current block; and parse the code stream based on the target block division manner and determine a residual value of the current block;
the second determining unit 21 is configured to determine a reconstruction value of the current block based on the prediction value and the residual value.
In one embodiment, the parsing unit 20 is further configured to parse the code stream to obtain the value of the bypass identifier information; if the value of the bypass identification information is a first value, skipping MIP to decode the current block; or if the value of the bypass identification information is the second value, adopting MIP to decode the current block.
In one embodiment, the second determining unit 21 is further configured to determine a partition tree of the current block according to the target block partition manner, where the partition tree includes one or more node sub-blocks obtained by dividing the current block.
In one embodiment, the parsing unit 20 is further configured to parse a code stream of each node sub-block of the partition tree, and determine a target prediction mode of each node sub-block;
the second determining unit 21 is further configured to determine a prediction value of each node sub-block according to the target prediction mode.
In one embodiment, the second determining unit 21 is further configured to sequentially parse the code stream of each node sub-block of the partition tree according to a preset processing sequence of the node sub-blocks, and determine a residual value of each node sub-block.
It will be appreciated that in this embodiment, the "unit" may be a part of a circuit, a part of a processor, a part of a program or software, etc., and may of course be a module, or may be non-modular. Furthermore, the components in the present embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional modules.
The integrated units, if implemented in the form of software functional modules and not sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the present embodiment provides a computer storage medium, applied to the decoder 2, which stores a computer program that, when executed by a second processor, implements the method of any of the preceding embodiments.
Based on the above composition of the decoder 2 and the computer storage medium, reference is made to fig. 9, which shows a schematic diagram of a specific hardware structure of the decoder 2 provided in the embodiment of the present application. As shown in fig. 9, the decoder 2 may include: a second communication interface 22, a second memory 23, and a second processor 24, the individual components being coupled together by a second bus system 25. It will be appreciated that the second bus system 25 is used to enable connection and communication between these components. In addition to the data bus, the second bus system 25 includes a power bus, a control bus, and a status signal bus; however, for clarity of illustration, the various buses are labeled as the second bus system 25 in fig. 9. Wherein,
The second communication interface 22 is configured to receive and send signals during the process of receiving and sending information with other external network elements;
a second memory 23 for storing a computer program capable of running on the second processor 24;
a second processor 24 for, when running the computer program, performing:
analyzing the code stream and determining a target block division mode of the current block;
analyzing a code stream based on the target block division mode, and determining a predicted value of the current block;
analyzing a code stream based on the target block division mode, and determining a residual error value of the current block;
and determining a reconstruction value of the current block based on the predicted value and the residual value.
Optionally, as another embodiment, the second processor 24 is further configured to perform the method of any of the preceding embodiments when the computer program is run.
It will be appreciated that the second memory 23 is similar in hardware function to the first memory 13, and the second processor 24 is similar in hardware function to the first processor 14; details are not repeated here.
The present embodiment provides a decoder, which may include a parsing unit and a second determining unit. In this way, the block type of the current block can be determined based on the color parameter of the current block, and whether MIP is used for prediction can be decided based on that block type, so that the MIP coding process can be skipped for screen content blocks with a small number of luminance sample values and a wide luminance sampling range, significantly reducing coding complexity, reducing coding time, and further improving coding and decoding efficiency.
In still another embodiment of the present application, based on the same inventive concept as the previous embodiment, the embodiment of the present application further provides a codec system composed of an encoder as shown in fig. 7 and a decoder as shown in fig. 9.
It should be noted that, in this application, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing embodiment numbers of the present application are merely for description and do not represent the advantages or disadvantages of the embodiments.
The methods disclosed in the several method embodiments provided in the present application may be arbitrarily combined without collision to obtain a new method embodiment.
The features disclosed in the several product embodiments provided in the present application may be combined arbitrarily without conflict to obtain new product embodiments.
The features disclosed in the several method or apparatus embodiments provided in the present application may be arbitrarily combined without conflict to obtain new method embodiments or apparatus embodiments.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Industrial applicability
The embodiments of the application provide an encoding/decoding method, an encoder, a decoder, and a computer storage medium. At the encoder side: determining a color parameter of a current block in a video image; determining a block type of the current block based on the color parameter of the current block; determining a value of bypass identification information corresponding to the current block based on the block type, where the bypass identification information is used to identify whether matrix weighted intra prediction (MIP) is used to encode the current block; determining a target block division manner based on the value of the bypass identification information; and encoding the current block according to the target block division manner. At the decoder side: parsing the code stream to determine the target block division manner of the current block, the prediction mode of the current block, and the residual value of the current block; determining the predicted value of the current block based on the target block division manner and the prediction mode; and determining the reconstructed value of the current block based on the predicted value and the residual value. In this way, the block type of the current block can be determined based on the color parameter of the current block, and whether MIP is used for prediction can be decided based on that block type, so that the MIP coding process can be skipped for screen content blocks with a small number of luminance sample values and a wide luminance sampling range, significantly reducing coding complexity, reducing coding time, and further improving coding and decoding efficiency.

Claims (24)

  1. An encoding method applied to an encoder, the method comprising:
    determining a block type of a current block;
    determining the value of bypass identification information corresponding to the current block based on the block type, wherein the bypass identification information is used for identifying whether the current block adopts matrix weighted intra-frame prediction MIP coding;
    determining a target block division mode based on the value of the bypass identification information;
    and encoding the current block according to the target block dividing mode.
  2. The method of claim 1, wherein the determining the block type of the current block comprises:
    determining a color parameter of a current block in a video image;
    determining a block type of the current block based on the color parameter of the current block.
  3. The method of claim 2, wherein the color parameters include a number of luminance sample values and a luminance sample range;
    the determining the color parameter of the current block in the video image comprises the following steps:
    determining the maximum brightness component sampling value, the minimum brightness component sampling value and the brightness sampling value quantity of the current block;
    and determining the brightness sampling range of the current block according to the maximum brightness component sampling value and the minimum brightness component sampling value of the current block.
  4. The method of claim 3, wherein the determining the block type of the current block based on the color parameter of the current block comprises:
    determining content parameters of the current block according to the number of the brightness sampling values of the current block and the brightness sampling range of the current block;
    if the content parameter of the current block is smaller than a first preset threshold value, determining that the block type of the current block is a first type block;
    and if the content parameter of the current block is greater than or equal to the first preset threshold value, determining that the block type of the current block is a second type block.
  5. The method according to claim 1 or 4, wherein the determining, based on the block type, the value of the bypass identification information corresponding to the current block comprises:
    determining a block type corresponding to at least one coding block contained in the video image;
    if the number of the second type blocks in the block types corresponding to the at least one coding block is larger than a second preset threshold value, determining the value of bypass identification information corresponding to the current block as a first value;
    and if the number of the second type blocks in the block types corresponding to the at least one coding block is smaller than or equal to the second preset threshold value, determining the value of the bypass identification information corresponding to the current block as a second value.
  6. The method according to claim 1 or 4, wherein the determining, based on the block type, a value of bypass identification information corresponding to the current block comprises:
    determining a block type corresponding to each of at least one coding block contained in the video image;
    merging the at least one coding block based on the block type corresponding to the at least one coding block to obtain merged coding blocks;
    calculating a screen content block ratio based on the block types of the merged coding blocks;
    if the screen content block ratio is greater than or equal to a third preset threshold, determining the value of the bypass identification information to be a first value;
    and if the screen content block ratio is less than the third preset threshold, determining the value of the bypass identification information to be a second value.
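Claim 6's ratio-based alternative can be sketched as follows. The claim does not detail how coding blocks are merged, so the input here is assumed to already be the merged result as (block_type, area) pairs, and only the screen-content-ratio test is shown; the area weighting and the 1/2 flag encoding are likewise assumptions.

```python
def bypass_flag_by_ratio(merged_blocks, ratio_threshold=0.5):
    """Claim 6 sketch. `merged_blocks` is an assumed representation:
    a list of (block_type, area_in_samples) pairs after the merging
    step. The screen content block ratio is computed as the area share
    of second-type (screen content) blocks."""
    total_area = sum(area for _, area in merged_blocks)
    screen_area = sum(area for t, area in merged_blocks if t == "second")
    ratio = screen_area / total_area
    return 1 if ratio >= ratio_threshold else 2  # first/second value encoding assumed
```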
  7. The method according to claim 5 or 6, wherein the method further comprises:
    if the value of the bypass identification information is the first value, skipping MIP when encoding the current block;
    and if the value of the bypass identification information is the second value, encoding the current block using MIP.
  8. The method of claim 7, wherein the level of the bypass identification information comprises one of: picture level, slice level, coding tree unit (CTU) level, and coding unit (CU) level.
  9. The method of claim 1, wherein the determining the target block partitioning mode based on the value of the bypass identification information comprises:
    determining, based on the value of the bypass identification information, a prediction mode and a prediction cost corresponding to each of a plurality of block partitioning modes;
    and determining, from the plurality of block partitioning modes, the target block partitioning mode with the minimum prediction cost and a target prediction mode of each node sub-block, wherein each node sub-block is obtained by partitioning the current block based on the target block partitioning mode.
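The minimum-cost selection of claim 9 amounts to comparing the summed sub-block prediction costs of each candidate partitioning. The dictionary layout and mode names below are illustrative, not taken from the patent.

```python
def select_partition(candidates):
    """Claim 9 sketch: among candidate block partitioning modes, pick
    the one whose total prediction cost over its node sub-blocks is
    minimal, returning that mode with the per-sub-block prediction
    modes. `candidates` maps a partitioning-mode name to a list of
    (sub_block_prediction_mode, cost) pairs (names are hypothetical)."""
    best = min(candidates, key=lambda m: sum(cost for _, cost in candidates[m]))
    return best, [mode for mode, _ in candidates[best]]
```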
  10. The method of claim 9, wherein the determining, based on the value of the bypass identification information, a prediction mode and a prediction cost corresponding to each of the plurality of block partitioning modes comprises:
    performing intra prediction using a linear interpolation prediction method to obtain a first prediction mode and a first prediction cost;
    if it is determined, based on the value of the bypass identification information, that MIP is to be used for intra prediction, performing intra prediction using MIP to obtain a second prediction mode and a second prediction cost;
    determining, from the first prediction cost and the second prediction cost, an intra prediction cost and an intra prediction mode corresponding to the intra prediction cost, based on a comparison result of the first prediction cost and the second prediction cost;
    if it is determined, based on the value of the bypass identification information, that MIP is to be skipped for intra prediction, determining the first prediction cost as the intra prediction cost and the first prediction mode as the intra prediction mode;
    performing inter prediction using an inter prediction method to obtain an inter prediction mode and an inter prediction cost;
    and determining, based on a comparison result of the intra prediction cost and the inter prediction cost, the prediction mode and the prediction cost corresponding to each of the plurality of block partitioning modes.
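The mode competition of claim 10 — MIP entering the intra comparison only when the bypass flag permits, and the surviving intra cost then competing with inter — can be sketched as below; the flag values and mode labels are assumed for illustration.

```python
def choose_prediction(bypass_flag, lip_cost, mip_cost, inter_cost):
    """Claim 10 sketch: MIP competes with linear interpolation
    prediction (LIP) only when the bypass flag allows it (value 2
    assumed to mean 'use MIP'); the winning intra cost then competes
    with the inter prediction cost."""
    intra_mode, intra_cost = "LIP", lip_cost
    if bypass_flag == 2 and mip_cost < lip_cost:  # MIP tried and cheaper
        intra_mode, intra_cost = "MIP", mip_cost
    if inter_cost < intra_cost:                   # intra vs. inter comparison
        return "inter", inter_cost
    return intra_mode, intra_cost
```

When the flag skips MIP, a cheap MIP cost is never consulted, which is the bitrate/complexity trade the bypass flag exists to make.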
  11. The method of claim 1, wherein the encoding the current block according to the target block partitioning mode comprises:
    partitioning the current block according to the target block partitioning mode to obtain a partition tree of the current block, wherein the partition tree comprises one or more node sub-blocks obtained by partitioning the current block;
    determining a target prediction mode of each node sub-block;
    determining a predicted value of the node sub-block according to the target prediction mode;
    determining a residual value of the node sub-block according to the original value and the predicted value of the node sub-block;
    and encoding the target prediction mode and the residual value of the node sub-block, and writing the coded bits into a bitstream.
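The residual computation of claim 11 is the usual element-wise difference between original and predicted samples; transform, quantization, and entropy coding of that residual are outside the claim's wording and are omitted in this sketch.

```python
def encode_residual(original, predicted):
    """Claim 11 sketch: the residual value of a node sub-block is the
    element-wise difference between its original and predicted sample
    values; these residuals (after any transform/quantization, omitted
    here) are what get written into the bitstream."""
    return [o - p for o, p in zip(original, predicted)]
```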
  12. The method of claim 1, wherein the method further comprises:
    encoding the target block partitioning mode, and writing the coded bits into a bitstream.
  13. A bitstream, comprising coded bits of one or more of: a value of bypass identification information, a target block partitioning mode, a target prediction mode of a node sub-block, and a residual value.
  14. A decoding method, applied to a decoder, the method comprising:
    parsing a bitstream, and determining a target block partitioning mode of a current block;
    parsing the bitstream based on the target block partitioning mode, and determining a predicted value of the current block;
    parsing the bitstream based on the target block partitioning mode, and determining a residual value of the current block;
    and determining a reconstructed value of the current block based on the predicted value and the residual value.
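The reconstruction step of claim 14 mirrors the encoder-side residual computation: the decoder adds the parsed residual back onto the prediction.

```python
def reconstruct(predicted, residual):
    """Claim 14 sketch: the reconstructed value is the predicted value
    plus the parsed residual value, the inverse of the encoder's
    original-minus-prediction subtraction."""
    return [p + r for p, r in zip(predicted, residual)]
```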
  15. The method of claim 14, wherein the method further comprises:
    parsing the bitstream to obtain a value of bypass identification information;
    if the value of the bypass identification information is a first value, skipping MIP when decoding the current block; or,
    if the value of the bypass identification information is a second value, decoding the current block using MIP.
  16. The method according to claim 14 or 15, wherein the method further comprises:
    determining a partition tree of the current block according to the target block partitioning mode, wherein the partition tree comprises one or more node sub-blocks obtained by partitioning the current block.
  17. The method of claim 16, wherein the parsing the bitstream based on the target block partitioning mode and determining the predicted value of the current block comprises:
    parsing the bitstream for each node sub-block of the partition tree, and determining a target prediction mode of each node sub-block;
    and determining the predicted value of each node sub-block according to the target prediction mode.
  18. The method of claim 16, wherein the parsing the bitstream based on the target block partitioning mode and determining the residual value of the current block comprises:
    parsing the bitstream for each node sub-block of the partition tree, and determining the residual value of each node sub-block.
  19. An encoder, comprising a first determining unit and an encoding unit, wherein:
    the first determining unit is configured to determine a block type of a current block; determine, based on the block type, a value of bypass identification information corresponding to the current block, wherein the bypass identification information is used to indicate whether the current block is coded using matrix-weighted intra prediction (MIP); and determine a target block partitioning mode based on the value of the bypass identification information;
    and the encoding unit is configured to encode the current block according to the target block partitioning mode.
  20. An encoder, comprising a first memory and a first processor, wherein:
    the first memory is configured to store a computer program capable of running on the first processor;
    and the first processor is configured to perform the method of any one of claims 1 to 12 when the computer program is run.
  21. A decoder, comprising a parsing unit and a second determining unit, wherein:
    the parsing unit is configured to parse a bitstream and determine a target block partitioning mode of a current block; parse the bitstream based on the target block partitioning mode and determine a predicted value of the current block; and parse the bitstream based on the target block partitioning mode and determine a residual value of the current block;
    and the second determining unit is configured to determine a reconstructed value of the current block based on the predicted value and the residual value.
  22. A decoder, comprising a second memory and a second processor, wherein:
    the second memory is configured to store a computer program capable of running on the second processor;
    and the second processor is configured to perform the method of any one of claims 14 to 18 when the computer program is run.
  23. A computer storage medium, wherein the computer storage medium stores a computer program which, when executed, implements the method of any one of claims 1 to 12, or the method of any one of claims 14 to 18.
  24. A codec system, wherein the codec system consists of an encoder as claimed in claim 20 and a decoder as claimed in claim 21.
CN202180098944.5A 2021-06-24 2021-06-24 Encoding/decoding method, encoder, decoder, and computer storage medium Pending CN117413515A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/102213 WO2022266971A1 (en) 2021-06-24 2021-06-24 Encoding method, decoding method, encoder, decoder and computer storage medium

Publications (1)

Publication Number Publication Date
CN117413515A true CN117413515A (en) 2024-01-16

Family

ID=84544013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180098944.5A Pending CN117413515A (en) 2021-06-24 2021-06-24 Encoding/decoding method, encoder, decoder, and computer storage medium

Country Status (2)

Country Link
CN (1) CN117413515A (en)
WO (1) WO2022266971A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101742992B1 (en) * 2009-05-12 2017-06-02 엘지전자 주식회사 A method and an apparatus for processing a video signal
CN110234008B (en) * 2019-03-11 2020-06-16 杭州海康威视数字技术股份有限公司 Encoding method, decoding method and device
CN113382255B (en) * 2019-06-21 2022-05-20 杭州海康威视数字技术股份有限公司 Encoding and decoding method, device and storage medium
CN111541896B (en) * 2020-04-27 2022-03-29 中南大学 VVC-based intra-frame prediction mode optimization method and system
CN112689146B (en) * 2020-12-18 2022-07-22 重庆邮电大学 Heuristic learning-based VVC intra-frame prediction rapid mode selection method

Also Published As

Publication number Publication date
WO2022266971A1 (en) 2022-12-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination