CN110730343B - Method, system and storage medium for dividing multifunctional video coding frames - Google Patents


Info

Publication number
CN110730343B
CN110730343B (application number CN201910894063.9A)
Authority
CN
China
Prior art keywords
current coding
coding unit
edge
horizontal
feature
Prior art date
Legal status
Active
Application number
CN201910894063.9A
Other languages
Chinese (zh)
Other versions
CN110730343A (en)
Inventor
梁凡
唐娜
曹健
Current Assignee
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN201910894063.9A
Publication of CN110730343A
Application granted
Publication of CN110730343B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 Tree coding, e.g. quad-tree coding
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a method, a system and a storage medium for dividing multifunctional video coding frames. The method comprises the following steps: determining that the coding frame of the current coding tree unit is an intra-frame coding frame, or that a first ratio is less than or equal to a first threshold value, and executing the next step; extracting and calculating the vertical edge texture feature and the horizontal edge texture feature of the current coding unit; and performing the division decision of the current coding unit according to the vertical edge texture feature and the horizontal edge texture feature of the current coding unit. According to the invention, when inter-frame prediction coding is carried out, certain division modes are skipped by using the relative size of the first ratio, whether the coding frame is an intra-frame coding frame, and the edge texture features of the current coding unit, so that the coding efficiency is improved. At least 3 division modes are selected for recursive division according to the vertical edge texture feature and the horizontal edge texture feature of the current coding unit, so the BD-rate is smaller and the coding performance is better. The invention can be widely applied to the field of video coding.

Description

Method, system and storage medium for dividing multifunctional video coding frames
Technical Field
The invention relates to the field of video coding, in particular to a method, a system and a storage medium for dividing multifunctional video coding frames.
Background
ITU-T VCEG and ISO/IEC MPEG established the Joint Video Exploration Team (JVET) working group, primarily to investigate the potential requirements of future video coding standards. In the development work of JVET, a new-generation video coding standard, Versatile Video Coding (VVC), was formed. Many new coding techniques have been investigated in VVC, such as the quadtree with nested multi-type tree (QTMT), Position-Dependent Prediction Combination (PDPC), and affine motion compensated prediction (AMCP). These techniques can improve coding performance well, but they also greatly increase coding time. Therefore, there is a need to find an efficient algorithm to achieve a better tradeoff between coding efficiency and coding time.
In VVC, each Coding Tree Unit (CTU) is partitioned into Coding Units (CUs) to accommodate various local texture features. QTMT supports more flexible CU partition shapes: a CU can be square or rectangular. First, the CTU is divided using a quadtree. The leaf nodes of the quadtree may then be further partitioned using a binary, ternary, or quadtree structure, where the binary and ternary structures are collectively referred to as the multi-type tree structure. The binary tree structure includes vertical and horizontal binary divisions, while the ternary tree structure includes vertical and horizontal ternary divisions. Fig. 1 is a schematic diagram of a CTU divided into CUs. In VVC, the QTMT structure uses a Rate-Distortion Optimization (RDO) method to determine the optimal partition of all blocks. First, the current CU is treated as a leaf node without any partitioning; various modes are then tried for prediction and transformation, and the RD (rate-distortion) cost of the best mode is selected and stored. Second, the CU is divided into two, three, or four sub-blocks according to a division mode, and the RDO process is performed recursively to determine the best partitioning of these sub-blocks. Finally, the option with the lowest RD cost is selected as the best partitioning. However, the entire recursive process of this partitioning approach is very time-consuming.
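The recursive RDO search described above can be sketched as follows. This is a simplified illustration, not VTM code: `rd_cost_no_split` and `split_into` are hypothetical stand-ins for the encoder's RD evaluation and block-splitting routines.

```python
NO_SPLIT = "none"
MODES = ["quad", "horz_bin", "vert_bin", "horz_tern", "vert_tern"]

def best_partition(block, rd_cost_no_split, split_into, depth=0, max_depth=2):
    """Return (best_mode, best_cost) for one block, recursing into sub-blocks."""
    # Step 1: treat the block as a leaf and store the RD cost of not splitting.
    best_mode, best_cost = NO_SPLIT, rd_cost_no_split(block)
    if depth >= max_depth:
        return best_mode, best_cost
    # Step 2: try each division mode and recurse into the resulting sub-blocks.
    for mode in MODES:
        sub_blocks = split_into(block, mode)  # 2, 3 or 4 sub-blocks, or [] if not allowed
        if not sub_blocks:
            continue
        cost = sum(best_partition(b, rd_cost_no_split, split_into,
                                  depth + 1, max_depth)[1] for b in sub_blocks)
        # Step 3: keep whichever option has the lowest RD cost.
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost
```

The exhaustive nesting of these calls is exactly why the text calls the full search very time-consuming: every mode at every depth triggers a fresh recursion.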
Inter-frame prediction can effectively remove temporal redundancy in video by exploiting the correlation of the video temporal domain and using the pixels of a neighbouring coded image (such as the coded image of the frame preceding the current frame) to predict the pixels of the current image. Therefore, when the current coding unit is significantly similar (i.e., relatively stationary) to the co-located block of the reference picture, the prediction error can be small even if all further partitioning operations are skipped. This case reduces the overhead of partitioning information; at the same time, the rate-distortion cost, which considers both the overhead and the prediction error, increases little, while the coding time decreases significantly. In the current VTM reference software of VVC, an inter-frame prediction encoding scheme is adopted. When a current coding unit is not significantly similar to the co-located block of the reference image (i.e., it is relatively moving) and needs to be further divided, the selection of the optimal division mode is treated as a multi-class classification problem: features are selected from three types of information, namely global texture, local texture and image content information, and a decision tree is used as the classifier, so that the division process can be significantly accelerated.
In this approach, the decision tree classifier directly selects one division mode from the horizontal binary, vertical binary, horizontal ternary, vertical ternary and quadtree division modes, and the other modes are skipped. Although this accelerates encoding and saves encoding time, selecting only one division mode for coding unit partitioning increases the BD-rate of encoding (which indicates the change of the bit rate at the same video quality) and reduces the peak signal-to-noise ratio (PSNR) of encoding, resulting in poor coding performance.
Disclosure of Invention
To solve the above technical problem, embodiments of the present invention aim to provide a method, a system and a storage medium for multi-functional video coding inter-frame division that improve coding efficiency while ensuring coding performance.
The technical scheme adopted by the first aspect of the embodiment of the invention is as follows:
a method for multi-functional video coding inter-frame partitioning, comprising the steps of:
determining that the coding frame of the current coding tree unit is an intra-frame coding frame or a first ratio is smaller than or equal to a first threshold value, executing the next step, wherein the first ratio is the ratio of the number of pixels which are equal to 0 in a difference image obtained by the current coding unit through calculation by adopting a three-frame difference method to the total number of pixels of the current coding unit, and the current coding unit is any one of 4 coding units obtained by performing quad-tree division on the current coding tree unit;
extracting and calculating the vertical edge texture features and the horizontal edge texture features of the current coding unit;
performing a partition decision of the current coding unit according to the vertical edge texture feature and the horizontal edge texture feature of the current coding unit, wherein the performing the partition decision of the current coding unit comprises: when the vertical edge texture feature of the current coding unit is determined to be larger than the horizontal edge texture feature, skipping the horizontal binary division mode and the horizontal ternary division mode; and when the vertical edge texture feature of the current coding unit is determined to be smaller than the horizontal edge texture feature, skipping the vertical binary division mode and the vertical ternary division mode.
Specifically, the first threshold value may be set in advance.
Further, the next step is executed when it is determined that the coding frame of the current coding tree unit is an intra-frame coding frame or the first ratio is less than or equal to a first threshold, and specifically includes:
selecting a coding tree unit from an input image as a current coding tree unit;
performing quadtree division on a current coding tree unit to obtain 4 coding units;
selecting any one coding unit from the 4 coding units as a current coding unit;
judging whether the coding frame where the current coding unit is located is an intra-frame coding frame, if so, executing the step of extracting and calculating the vertical edge texture features and the horizontal edge texture features of the current coding unit; otherwise, executing the next step;
calculating a difference image of the current coding unit by adopting a three-frame difference method;
calculating the ratio of the number of pixels which are equal to 0 in the difference image of the current coding unit to the total number of pixels of the current coding unit as a first ratio;
and when the first ratio is determined to be less than or equal to the first threshold value, the step of extracting and calculating the vertical edge texture features and the horizontal edge texture features of the current coding unit is performed.
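The first-ratio gate in the steps above can be sketched as follows. The exact binarisation used by the three-frame difference method is not spelled out in this section, so the pixelwise AND of two thresholded frame differences used here is an assumption.

```python
def three_frame_diff_ratio(prev, cur, nxt, threshold=1):
    """Return the fraction of zero pixels in the binarised difference image of a CU.

    prev, cur, nxt: the co-located CU region in three consecutive frames,
    as lists of rows of luma samples (assumed layout).
    """
    zeros = total = 0
    for row_p, row_c, row_n in zip(prev, cur, nxt):
        for p, c, n in zip(row_p, row_c, row_n):
            d1 = 1 if abs(c - p) >= threshold else 0  # difference with previous frame
            d2 = 1 if abs(n - c) >= threshold else 0  # difference with next frame
            if (d1 & d2) == 0:                        # pixel is 0 in the AND image
                zeros += 1
            total += 1
    return zeros / total
```

A perfectly static CU yields a ratio of 1.0, which exceeds any first threshold below 1, so further division can be stopped early; a fully moving CU yields 0.0 and falls through to the edge-feature analysis.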
Specifically, according to the intra-frame prediction coding requirement of VVC, the coding tree unit CTU must first be divided by one quadtree division into four coding units CU of 64 × 64 size, and a CU of 64 × 64 or smaller size may then be divided into sub-CUs by the QTMT tree division types. The 4 coding units of the current coding tree unit can thus be obtained by quadtree-splitting the current coding tree unit.
Further, the step of extracting and calculating the vertical edge texture feature and the horizontal edge texture feature of the current coding unit specifically includes:
extracting the edge characteristics of the current coding unit by adopting a block-based edge extraction algorithm;
calculating the vertical edge texture characteristics of the current coding unit according to the edge characteristics of the current coding unit;
calculating the horizontal edge texture characteristics of the current coding unit according to the edge characteristics of the current coding unit;
and calculating the texture feature density of the current coding unit according to the edge feature of the current coding unit.
Further, the block-based edge extraction algorithm adopts a Canny algorithm, and the step of calculating the vertical edge texture feature of the current coding unit according to the edge feature of the current coding unit specifically includes:
determining a texture value of any point in an edge image obtained by extracting a current coding unit through a Canny algorithm, and further obtaining a vertical edge feature component of each column in the edge image;
finding out the maximum value of the vertical edge feature component and the minimum value of the vertical edge feature component from the obtained vertical edge feature components of each column;
and calculating the vertical edge texture feature of the current coding unit according to the maximum value of the vertical edge feature component and the minimum value of the vertical edge feature component.
Specifically, the calculation formula of the vertical edge texture feature of the current coding unit is as follows:
ve_i = Σ_{y=0}^{h-1} Canny(i, y),  VE = max_{0≤i<w} ve_i − min_{0≤j<w} ve_j
wherein (x, y) is the coordinate of a point in the edge image obtained by extracting the current coding unit through the Canny algorithm, Canny(x, y) is the Canny feature value at coordinate (x, y), ve_i and ve_j are respectively the vertical edge feature components of the i-th column and the j-th column in the edge image, w and h are respectively the width and the height of the current coding unit, and VE is the vertical edge texture feature of the current coding unit.
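A small sketch of the vertical-edge feature described above, reading VE as the spread between the largest and smallest column sums (an assumed reading, since the original formula image is unavailable). The edge image is taken as a list of rows of Canny values (0 or 255).

```python
def vertical_edge_feature(edge):
    """VE = max(ve_i) - min(ve_j), where ve_i sums the Canny values of column i."""
    h, w = len(edge), len(edge[0])
    ve = [sum(edge[y][i] for y in range(h)) for i in range(w)]
    return max(ve) - min(ve)
```

A block with one strong vertical edge (a single all-255 column) produces a large VE, while a uniform block produces VE = 0.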
Further, the block-based edge extraction algorithm adopts a Canny algorithm, and the step of calculating the horizontal edge texture feature of the current coding unit according to the edge feature of the current coding unit specifically includes:
determining a texture value of any point in an edge image obtained by extracting a current coding unit through a Canny algorithm, and further obtaining a horizontal edge feature component of each line in the edge image;
finding out the maximum value and the minimum value of the horizontal edge feature component from the obtained horizontal edge feature components of each line;
and calculating the horizontal edge texture feature of the current coding unit according to the maximum value of the horizontal edge feature component and the minimum value of the horizontal edge feature component.
Specifically, the calculation formula of the horizontal edge texture feature of the current coding unit is as follows:
he_j = Σ_{x=0}^{w-1} Canny(x, j),  HE = max_{0≤i<h} he_i − min_{0≤j<h} he_j
wherein (x, y) is the coordinate of a point in the edge image obtained by extracting the current coding unit through the Canny algorithm, Canny(x, y) is the Canny feature value at coordinate (x, y), he_i and he_j are respectively the horizontal edge feature components of the i-th row and the j-th row in the edge image, w and h are respectively the width and the height of the current coding unit, and HE is the horizontal edge texture feature of the current coding unit.
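Mirroring the vertical case, a sketch of the horizontal-edge feature where rows take the place of columns (the max-minus-min form is the same assumed reading).

```python
def horizontal_edge_feature(edge):
    """HE = max(he_i) - min(he_j), where he_j sums the Canny values of row j."""
    he = [sum(row) for row in edge]
    return max(he) - min(he)
```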
Further, the block-based edge extraction algorithm adopts a Canny algorithm, and the step of calculating the texture feature density of the current coding unit according to the edge feature of the current coding unit specifically includes:
calculating the sum of Canny characteristic values of all points in the edge image obtained after the current coding unit is extracted by a Canny algorithm;
determining the width and height of the current coding unit;
and calculating the texture feature density of the current coding unit according to the sum of the calculated Canny feature values and the determined width and height.
Specifically, the texture feature density of the current coding unit is calculated by the following formula:
Density = (Σ_{x=0}^{w-1} Σ_{y=0}^{h-1} Canny(x, y)) / (w × h)
wherein (x, y) is the coordinate of a point in the edge image obtained after the current coding unit is extracted by the Canny algorithm, Canny(x, y) is the Canny feature value at coordinate (x, y), w and h are respectively the width and height of the current coding unit, and Density is the texture feature density of the current coding unit.
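The density formula above is a plain average of the Canny values over the block area, as this sketch shows:

```python
def texture_density(edge):
    """Density = sum of all Canny values divided by the block area w * h."""
    h, w = len(edge), len(edge[0])
    return sum(sum(row) for row in edge) / (w * h)
```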
Further, the step of performing a partition decision of the current coding unit according to the vertical edge texture feature and the horizontal edge texture feature of the current coding unit specifically includes:
taking a current coding unit as a current coding block;
calculating the rate distortion cost of the current coding block;
when the texture feature density of the current coding block is determined to be larger than or equal to a second threshold value, executing the next step;
when the first quotient is determined to be larger than a third threshold value, skipping the horizontal binary division mode and the horizontal ternary division mode, and selecting the vertical binary division mode, the vertical ternary division mode and the quadtree division mode as candidate division modes of the current coding block, wherein the first quotient is equal to the vertical edge texture feature value of the current coding block divided by the horizontal edge texture feature value;
when the second quotient is determined to be larger than the third threshold value, skipping the vertical binary division mode and the vertical ternary division mode, and selecting the horizontal binary division mode, the horizontal ternary division mode and the quadtree division mode as candidate division modes of the current coding block, wherein the second quotient is equal to the horizontal edge texture feature value of the current coding block divided by the vertical edge texture feature value;
when the first quotient and the second quotient are both determined to be smaller than or equal to the third threshold value, selecting the horizontal binary division mode, the vertical ternary division mode, the horizontal ternary division mode and the quadtree division mode as candidate division modes of the current coding block;
sequentially performing partition attempts on the current coding block according to the selected candidate partition mode to obtain the partition mode of the current coding block;
dividing the current coding block into a plurality of sub-blocks according to the obtained dividing mode;
and selecting any one of the sub-blocks as the current coding block, and returning to the step of calculating the rate-distortion cost of the current coding block.
Specifically, the current coding block may be a coding unit CU, a sub-CU into which the coding unit CU is further divided, or the like. Both the second threshold and the third threshold may be preset.
The partition attempts are performed in sequence according to the selected candidate partition modes. Taking the candidate set consisting of the horizontal binary division mode, the vertical ternary division mode, the horizontal ternary division mode and the quadtree division mode as an example, the current coding block sequentially tries these four division modes, and the option with the lowest rate-distortion cost is selected from these four division modes and the non-division option. If non-division is selected, the recursion for this block ends; otherwise, the current coding block is divided into a plurality of sub-blocks according to the division mode obtained by the division attempts, any one of the sub-blocks is selected as the current coding block, and the procedure returns to the step of calculating the rate-distortion cost of the current coding block. The other candidate division mode sets are handled similarly and are not described in detail here.
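The candidate-mode selection in the steps above can be sketched as follows. The threshold defaults are illustrative placeholders, not values from the patent, and the behaviour when the density falls below the second threshold is not specified in this section, so `None` is returned in that case.

```python
def candidate_modes(ve, he, density, t2=1.0, t3=1.5):
    """Return the candidate division modes for a coding block, or None."""
    if density < t2:
        return None  # below the second threshold: not covered by these steps
    if he > 0 and ve / he > t3:
        # first quotient large: skip horizontal binary and horizontal ternary
        return ["vert_bin", "vert_tern", "quad"]
    if ve > 0 and he / ve > t3:
        # second quotient large: skip vertical binary and vertical ternary
        return ["horz_bin", "horz_tern", "quad"]
    # both quotients small: the four-mode set listed in the last case above
    return ["horz_bin", "vert_tern", "horz_tern", "quad"]
```

Note that every branch keeps at least three candidate modes, which is what distinguishes this scheme from the single-mode decision-tree approach criticised in the background section.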
The second aspect of the embodiment of the present invention adopts the following technical solutions:
a multi-functional video coding inter-frame partitioning system, comprising:
the determining unit is used for determining that the coding frame of the current coding tree unit is an intra-frame coding frame or that a first ratio is smaller than or equal to a first threshold value, and then handing processing over to the feature extraction and calculation module, wherein the first ratio is the ratio of the number of pixels equal to 0 in the difference image calculated for the current coding unit by the three-frame difference method to the total number of pixels of the current coding unit, and the current coding unit is any one of the 4 coding units obtained by performing quadtree division on the current coding tree unit;
the characteristic extraction and calculation module is used for extracting and calculating the vertical edge texture characteristic and the horizontal edge texture characteristic of the current coding unit;
a partition decision module, configured to perform a partition decision of the current coding unit according to the vertical edge texture feature and the horizontal edge texture feature of the current coding unit, where the performing the partition decision of the current coding unit includes: when the vertical edge texture feature of the current coding unit is determined to be larger than the horizontal edge texture feature, skipping the horizontal binary division mode and the horizontal ternary division mode; and when the vertical edge texture feature of the current coding unit is determined to be smaller than the horizontal edge texture feature, skipping the vertical binary division mode and the vertical ternary division mode.
The third aspect of the embodiment of the present invention adopts the following technical solutions:
a multi-functional video coding inter-frame partitioning system, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method for multi-functional video coding inter-frame partitioning.
The fourth aspect of the embodiment of the present invention adopts the technical solution that:
a storage medium having stored therein instructions executable by a processor, the storage medium comprising: the processor-executable instructions, when executed by a processor, are for implementing the multi-functional video coding inter-frame division method.
One or more of the above-described embodiments of the present invention have the following advantages: when the coding frame of the current coding tree unit is determined to be an intra-frame coding frame or the first ratio is smaller than or equal to the first threshold value, the vertical edge texture feature and the horizontal edge texture feature of the current coding unit are extracted and calculated, and the division decision of the current coding unit is then made according to the obtained vertical and horizontal edge texture features. During inter-frame prediction coding, some division modes are skipped by using the relative size of the first ratio, whether the coding frame is an intra-frame coding frame, and the edge texture features of the current coding unit, so that the coding time is reduced and the coding efficiency is improved. When the current coding unit is not obviously similar to the co-located block of the reference image and needs to be further divided, the horizontal binary and horizontal ternary division modes, or the vertical binary and vertical ternary division modes, are skipped according to the vertical and horizontal edge texture features of the current coding unit, and at least 3 division modes are selected for recursive division, so that the coding performance is ensured.
Drawings
Fig. 1 is a schematic diagram of a conventional CTU partitioning structure;
FIG. 2 is a general flowchart of the multi-functional video coding interframe partition method according to the present invention;
FIG. 3 is a schematic of non-maximum suppression;
FIG. 4 is a CTU partitioning result diagram;
FIG. 5 is an edge map of FIG. 4 extracted by the Canny algorithm;
fig. 6 is a flowchart of an inter-frame fast algorithm based on the Canny algorithm and the three-frame difference method according to an embodiment of the present invention.
Detailed Description
The invention will be further described and illustrated with reference to the drawings and the embodiments in the description. The step numbers in the embodiments of the present invention are set for convenience of illustration only; the order between the steps is not limited in any way, and the execution order of each step in the embodiments can be adaptively adjusted according to the understanding of those skilled in the art.
The multifunctional video coding inter-frame division scheme first adopts the three-frame difference method to determine whether the current block is a static block or a non-static block; if the current block is a static block, further division operations can be stopped in advance. If the current block is not a static block, an edge algorithm is used to extract the edge texture features of the coding unit CU, whether the CU is further divided is determined according to these features, and certain division modes are skipped, so as to reduce the coding time while ensuring the coding performance.
Referring to fig. 2, the basic process of the multi-functional video coding inter-frame partition scheme of the present invention is as follows. First, it is determined that the coding frame of the current coding tree unit is an intra-frame coding frame or that the first ratio is less than or equal to the first threshold. Then, the vertical edge texture feature and the horizontal edge texture feature of the current coding unit are extracted and calculated by an edge extraction algorithm (such as the Sobel, Roberts, Prewitt or Canny edge detection algorithm). Finally, the division decision of the current coding unit is made according to the vertical edge texture feature and the horizontal edge texture feature: if the vertical edge texture feature of the current coding unit is larger than the horizontal edge texture feature, the horizontal edge texture is not obvious, and the horizontal binary division mode and the horizontal ternary division mode are skipped; if the vertical edge texture feature of the current coding unit is smaller than the horizontal edge texture feature, the vertical edge texture is not obvious, and the vertical binary division mode and the vertical ternary division mode are skipped.
The following describes the related theories and specific implementation processes related to the multi-functional video coding interframe partition scheme of the invention in detail:
Canny edge detection algorithm
The invention relates to an algorithm for rapidly dividing and judging CU blocks by utilizing image edge characteristics. Commonly used edge detection algorithms are the Sobel, Roberts, Prewitt and Canny edge detection algorithms. Among these edge detection algorithms, Canny algorithm has superior performance because it has a low error rate when extracting edge features of an image, and points detected and marked as edges can be as close as possible to real edges. The present embodiment utilizes the Canny edge detection algorithm to extract edge features of CU blocks.
The Canny edge detection algorithm mainly comprises the following steps:
(1) the image noise is smoothed by convolving the CTU selected from the input image with a gaussian filter function H (x, y). The gaussian filter function is shown in equation (1):
H(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))  (1)
(2) calculating the partial derivatives G_x and G_y of the Gaussian-filtered image gray level along the x and y directions by using the Sobel operator, and then calculating the corresponding gradient magnitude image M(x, y) and angle image θ(x, y), as shown in equations (2) and (3):
M(x, y) = (G_x² + G_y²)^(1/2)  (2)
θ(x, y) = arctan(G_y / G_x)  (3)
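Equations (2) and (3) evaluated at a single interior pixel, using the standard 3 × 3 Sobel kernels (the kernel coefficients are the usual ones and are assumed here):

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # horizontal derivative kernel
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # vertical derivative kernel

def gradient(img, x, y):
    """Return (M(x, y), theta(x, y) in degrees) at an interior pixel of img."""
    gx = sum(SOBEL_X[j][i] * img[y - 1 + j][x - 1 + i]
             for j in range(3) for i in range(3))
    gy = sum(SOBEL_Y[j][i] * img[y - 1 + j][x - 1 + i]
             for j in range(3) for i in range(3))
    return math.hypot(gx, gy), math.degrees(math.atan2(gy, gx))
```

A vertical step edge gives a large gx, zero gy, and therefore an angle of 0°, i.e. a gradient pointing horizontally across the edge.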
(3) the gradient amplitude image M (x, y) is processed by applying non-maximum suppression, and the main processing steps comprise:
1) the gradient direction θ of the current point is obtained from the angle image θ (x, y).
2) When θ lies along one of the four directions horizontal, +45°, vertical and −45°: if M(x, y) of the current point is smaller than at least one of its two neighbouring pixels along θ, the non-maximum-suppression result of the point is set to 0; otherwise it is set to M(x, y). When θ does not lie along one of these four directions, the two neighbours of the point along the gradient direction fall at sub-pixel positions, and the points on either side of them must be interpolated to obtain their gradient values. As shown in fig. 3, M(x, y) denotes the centre point and the black line with an arrow denotes the gradient direction. If |G_x| > |G_y|, then weight = |G_y| / |G_x|; similarly, when |G_y| > |G_x|, weight = |G_x| / |G_y|. For example, in the case of fig. 3, the interpolation is:
m1 = weight · M(x−1, y+1) + (1 − weight) · M(x−1, y)   (4)
m2 = weight · M(x+1, y−1) + (1 − weight) · M(x+1, y)   (5)
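A minimal sketch of the interpolation of equations (4) and (5) and the resulting suppression test, for the sub-pixel case |Gx| > |Gy| illustrated in fig. 3. Indexing M by (row, column) and the function names are assumptions of this example.

```python
import numpy as np

def interpolated_neighbours(M, x, y, gx, gy):
    """Interpolate the two neighbours m1, m2 of point (x, y) along the
    gradient direction, per equations (4)/(5); valid when |Gx| > |Gy|."""
    weight = abs(gy) / abs(gx)                                    # |Gy|/|Gx|
    m1 = weight * M[x - 1, y + 1] + (1 - weight) * M[x - 1, y]    # eq. (4)
    m2 = weight * M[x + 1, y - 1] + (1 - weight) * M[x + 1, y]    # eq. (5)
    return m1, m2

def nms_value(M, x, y, gx, gy):
    """Non-maximum suppression at (x, y): keep M(x, y) only if it is not
    smaller than either interpolated neighbour, otherwise output 0."""
    m1, m2 = interpolated_neighbours(M, x, y, gx, gy)
    return M[x, y] if M[x, y] >= m1 and M[x, y] >= m2 else 0.0
```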
(4) Detect and connect edges using dual-threshold processing and connectivity analysis. In the non-maximum-suppressed image, points whose values are greater than the high threshold (the upper bound) are edges, and points whose values are less than the low threshold (the lower bound) are not. A point lying between the two thresholds is an edge if one of its neighbouring pixels exceeds the high threshold, and is not an edge otherwise. Finally, non-edge points are set to 0 in the output edge map and edge points to 255.
Step (4) of the conventional Canny algorithm computes the high and low thresholds from the gradient-magnitude histogram of the entire image, which is time-consuming. This embodiment therefore selects the high and low thresholds directly by hand (i.e. both are preset values). Because a pixel is accepted as a valid edge only when its gradient magnitude exceeds the high threshold, the high threshold should not be too large, or edges would be missed. The low and high thresholds may thus be set to 4 and 45, respectively. Furthermore, in VVC the coding tree unit (CTU) is the initial block structure before division into coding units (CUs), so this embodiment applies a block-based Canny algorithm rather than the traditional frame-based one, which better reflects the texture characteristics of each block.
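The fixed-threshold variant of step (4) can be sketched as below, with the suggested thresholds low = 4 and high = 45 as defaults. Note that this sketch links mid-band pixels through full transitive 8-neighbour connectivity, a common strengthening of the one-step neighbour rule described above; the function name is an assumption of this example.

```python
import numpy as np
from collections import deque

def hysteresis(M, low=4, high=45):
    """Dual-threshold edge linking: pixels above `high` are seed edges;
    pixels in (low, high] are kept only if connected (8-neighbourhood)
    to a seed. Output: 255 for edge points, 0 otherwise."""
    strong = M > high
    weak = (M > low) & ~strong
    out = np.zeros_like(M, dtype=np.uint8)
    out[strong] = 255
    q = deque(zip(*np.nonzero(strong)))           # BFS from seed edges
    while q:
        y, x = q.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < M.shape[0] and 0 <= nx < M.shape[1]
                        and weak[ny, nx] and out[ny, nx] == 0):
                    out[ny, nx] = 255             # promote connected weak pixel
                    q.append((ny, nx))
    return out
```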
After processing the input image with the modified Canny algorithm, an edge image is obtained, as shown in figs. 4 and 5, where the red rectangle marks one CTU of the input image. Each point of the edge image has a texture value, which by itself does not distinguish vertical from horizontal structure; however, each column has a vertical texture feature value and each row a horizontal texture feature value. Figs. 4 and 5 show that blocks with pronounced vertical (horizontal) texture are mostly partitioned vertically (horizontally). Edge texture information can therefore be fully exploited when selecting the partition mode, saving encoding time.
(II) inter-frame fast division algorithm based on improved Canny algorithm and three-frame difference method
In video coding, each frame is a still image. During compression, various algorithms are used to reduce the data size; the IPB scheme is the most common. An I frame, also called an intra-coded frame, is an independent frame that carries all of its own information and can be coded without reference to other pictures; the first frame of a video sequence is always an I frame. P frames, also called predicted frames, are coded pictures that reduce the amount of transmitted data by removing most of the temporal redundancy with respect to previously coded frames of the picture sequence. B frames, also called bidirectional predicted frames, additionally exploit the temporal redundancy with respect to both earlier and later coded frames of the source picture sequence.
Inter-frame prediction exploits the temporal correlation of video: the pixels of the current picture are predicted from the pixels of adjacent coded pictures, which effectively removes temporal redundancy. Consequently, when the current coding block is very similar to the co-located block of the reference picture, the prediction error can remain small even if all further partitioning operations are skipped. In that case the overhead of partition signalling is reduced; the rate-distortion cost, which accounts for both the overhead and the prediction error, increases very little, while the coding time decreases significantly.
The three-frame difference method is an effective moving-object detection algorithm that marks the motion of an object by computing differences between images of a video sequence. It mainly comprises the following steps:
1) Select three successive frames Ii−1(x, y), Ii(x, y) and Ii+1(x, y) from the video sequence; the difference images are then defined as:
D1(x, y) = |Ii(x, y) − Ii−1(x, y)|,   D2(x, y) = |Ii+1(x, y) − Ii(x, y)|   (6)
2) Select a suitable threshold T and binarize the difference images of step 1), defined as:
Bk(x, y) = 1 if Dk(x, y) > T, and Bk(x, y) = 0 otherwise, for k = 1, 2   (7)
3) Perform a logical AND operation (i.e. take the intersection) of the two binarized images of step 2):
D(x, y) = B1(x, y) AND B2(x, y)   (8)
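The three steps above can be sketched as follows; the binarization threshold T = 15 is an illustrative value chosen for this example, not specified by the patent.

```python
import numpy as np

def three_frame_difference(prev, cur, nxt, T=15):
    """Three-frame difference of equations (6)-(8): absolute differences
    against the two reference frames, binarization with threshold T, then
    a logical AND. Moving pixels come out as 1, stationary pixels as 0."""
    d1 = np.abs(cur.astype(int) - prev.astype(int))   # eq. (6), first pair
    d2 = np.abs(nxt.astype(int) - cur.astype(int))    # eq. (6), second pair
    b1 = (d1 > T).astype(np.uint8)                    # eq. (7)
    b2 = (d2 > T).astype(np.uint8)
    return b1 & b2                                    # eq. (8)
```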
In VVC, the reference frames of the current frame are not necessarily the temporally adjacent previous and next frames. Therefore, when the three-frame difference method is used here, the three images involved are not temporally consecutive: the difference image D(x, y) is computed from the current frame and its two reference frames. To save encoding time, for the current CU block this embodiment first computes the ratio r of the number of pixels equal to 0 in D(x, y) to the total number of pixels of the CU block, as follows:
r = (number of pixels in D(x, y) equal to 0) / (w × h)   (9)
where the count in the numerator runs over pixels D(i, j) whose value equals 0 in the difference image D(x, y) of the current CU block, and w and h are the width and height of the current CU block, respectively.
When r is greater than a first threshold (e.g. 95%, or another value close to 100%), the current CU block is judged to be a stationary block, and further partitioning can be terminated early.
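Equation (9) and the stationary-block test can be sketched as below; the 95% first threshold follows the example in the text, and the function name is an assumption of this sketch.

```python
import numpy as np

def is_stationary(D, first_threshold=0.95):
    """Ratio r of equation (9): fraction of zero-valued pixels in the
    difference image D of the current CU. The CU is treated as stationary
    (partitioning terminated early) when r exceeds the first threshold."""
    w_times_h = D.size                               # w * h of the CU block
    r = np.count_nonzero(D == 0) / w_times_h
    return r, r > first_threshold
```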
If partitioning of the current CU block is not terminated early (i.e. r is less than or equal to the first threshold), a further fast decision is made using the improved Canny algorithm of section (I) to save inter-coding time. In addition, when the frame containing the current CU block is an I frame, it is coded without reference to other frames, so the three-frame difference method cannot be applied to compute r; in that case the improved Canny algorithm of section (I) is likewise used for the further fast decision.
In the inter-frame coding method of this embodiment, when r is less than or equal to the first threshold or the coded frame is an I frame, the edge texture features of the CU are extracted with the Canny edge detection algorithm, following the conclusion of section (I), so that some partition modes can be skipped: if the vertical-edge (VE) features of the current CU block are more pronounced than its horizontal-edge (HE) features, the horizontal binary and horizontal ternary partition modes can be skipped; conversely, if the horizontal edges are more pronounced than the vertical edges, the vertical binary and vertical ternary partitions can be skipped.
After processing the current CU selected from the input image with the block-based Canny algorithm, a binary edge map is obtained, as shown in fig. 5. The canny feature value at (x, y) is 1 if the pixel of the edge map at coordinates (x, y) is non-zero, and 0 otherwise. Let the CU block to be partitioned have width w and height h; the VE and HE of each CU block are then given by equations (10) and (11):
VE = max_x(Σ_y canny(x, y)) − min_x(Σ_y canny(x, y))   (10)
HE = max_y(Σ_x canny(x, y)) − min_y(Σ_x canny(x, y))   (11)
Meanwhile, the texture density (Density) of the CU block is calculated according to equation (12):
Density = (Σ_x Σ_y canny(x, y)) / (w × h)   (12)
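A sketch of equations (10)-(12). Following the description in claims 4-6, VE (resp. HE) is taken here as the spread, maximum minus minimum, of the per-column (per-row) sums of the binary canny feature, and Density as its mean; the published formulas themselves are image placeholders in this text, so this reading is an assumption.

```python
import numpy as np

def texture_features(edge_map):
    """Compute VE, HE and Density of a CU block from its Canny edge map.
    The canny feature is 1 where the edge map is non-zero, 0 elsewhere."""
    canny = (edge_map != 0).astype(int)
    col_sums = canny.sum(axis=0)            # vertical edge feature per column
    row_sums = canny.sum(axis=1)            # horizontal edge feature per row
    ve = col_sums.max() - col_sums.min()    # eq. (10): spread across columns
    he = row_sums.max() - row_sums.min()    # eq. (11): spread across rows
    density = canny.sum() / canny.size      # eq. (12): edge pixels / (w * h)
    return ve, he, density
```

A block containing one solid vertical edge column yields a large VE, zero HE, and a Density equal to the fraction of edge pixels.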
Before the further fast decision on the current CU block is made, the inter-frame coding of this embodiment selects two suitable thresholds, a second threshold TH_1 and a third threshold TH_2; for example, TH_1 and TH_2 may be set to 0.01 and 1.5, respectively, with the first threshold set to 95%. The flowchart of the inter-frame fast algorithm based on the Canny algorithm and the three-frame difference method of this embodiment is shown in fig. 6 and comprises the following steps:
S1: select a coding tree unit (CTU) from the input image as the current CTU (CTUs are selected according to the CTU size; in general, the input image is coded as a number of CTUs);
S2: divide the current CTU into 4 coding units (CUs) by quadtree partitioning, and select any one of the 4 CUs as the current coding unit (CU);
S3: enter the QTMT_RDO recursive mode and compute the QTMT_RDO rate-distortion cost of the current coding block (the initial cost is the cost of leaving the current CU undivided; the current coding block is the current CU or, once subsequent partition attempts determine that the current CU needs further division, a coding sub-block (sub-CU) obtained by that division);
S4: judge whether the frame containing the current coding block is an I frame; if so, go to step S7, otherwise go to step S5;
S5: compute the r value of the current coding block using the three-frame difference method;
If the current coding block is the current CU, its r value is computed according to equation (9); if it is a coding sub-block, r is computed analogously.
S6: judge whether the computed r value is greater than 95%; if so, do not divide the current coding block, end the recursive division and return to the previous layer (during recursive division, the cost of a partition at one layer is obtained by summing the rate-distortion costs of all sub-layers that the partition produces); otherwise go to step S7;
S7: extract the edge features of the current coding block using the block-based Canny algorithm;
S8: compute VE, HE and Density of the current coding block;
If the current coding block is the current CU, VE, HE and Density are computed according to equations (10), (11) and (12), respectively; if it is a coding sub-block, they are computed analogously.
S9: judge whether the computed Density is smaller than TH_1; if so, do not divide the current coding block, end the recursive division and return to the previous layer; otherwise go to step S10;
S10: judge whether VE/HE is greater than TH_2; if so, sequentially attempt vertical binary, vertical ternary and quadtree recursive partitioning on the current coding block, divide the current coding block into several sub-blocks according to the partition mode obtained by these attempts, select one sub-block as the new current coding block and return to step S3 (the remaining sub-blocks are processed in the same way, iterating one by one until all sub-blocks have been processed); otherwise go to step S11;
S11: judge whether HE/VE is greater than TH_2; if so, sequentially attempt horizontal binary, horizontal ternary and quadtree recursive partitioning on the current coding block, divide it into several sub-blocks according to the partition mode obtained, select one sub-block as the new current coding block and return to step S3 (the remaining sub-blocks are processed in the same way, one by one, until all have been processed); otherwise, sequentially attempt horizontal binary, vertical binary, horizontal ternary, vertical ternary and quadtree recursive partitioning on the current coding block, divide it into several sub-blocks according to the partition mode obtained, select one sub-block as the new current coding block and return to step S3 (likewise processing the remaining sub-blocks one by one until all have been processed).
In addition, if the partition attempts of all candidate partition modes of the current coding block fail (i.e. no further division is needed), the rate-distortion cost of the current coding block is already minimal; the current partition is then the best one, no sub-coding blocks need further division, and the partitioning of the CTU can end.
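The decision logic of steps S9-S11 can be sketched as follows, with the candidate sets following claim 7. The mode names are illustrative strings, and the multiplication form ve > TH_2 · he (equivalent to VE/HE > TH_2 for positive HE) is used to avoid division by zero, a corner case the patent does not spell out.

```python
def candidate_partitions(ve, he, density, th1=0.01, th2=1.5):
    """Fast partition decision: terminate when Density < TH_1 (S9);
    otherwise skip the partition family orthogonal to the dominant edge
    direction (S10/S11). TH_1=0.01 and TH_2=1.5 follow the example values
    in the text."""
    if density < th1:
        return []                                  # S9: no further division
    if ve > th2 * he:                              # S10: vertical edges dominate
        return ["vertical_binary", "vertical_ternary", "quadtree"]
    if he > th2 * ve:                              # S11: horizontal edges dominate
        return ["horizontal_binary", "horizontal_ternary", "quadtree"]
    return ["horizontal_binary", "vertical_binary",  # neither dominates: try all
            "horizontal_ternary", "vertical_ternary", "quadtree"]
```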
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A method for multi-functional video coding interframe partition, characterized by: the method comprises the following steps:
executing the next step when determining that a first ratio of a current coding unit of the current coding tree unit is smaller than or equal to a first threshold value, wherein the first ratio is the ratio of the number of pixels which are equal to 0 in a difference image obtained by the current coding unit through a three-frame difference method to the total number of pixels of the current coding unit, and the current coding unit is any one of 4 coding units obtained by dividing the current coding tree unit through a quadtree;
extracting and calculating the vertical edge texture features and the horizontal edge texture features of the current coding unit;
performing a partition decision of the current coding unit according to the vertical edge texture feature and the horizontal edge texture feature of the current coding unit, wherein the performing the partition decision of the current coding unit comprises: when the vertical edge texture feature of the current coding unit is determined to be larger than the horizontal edge texture feature, skipping the horizontal binary division mode and the horizontal ternary division mode; and when the vertical edge texture feature of the current coding unit is determined to be smaller than the horizontal edge texture feature, skipping the vertical binary division mode and the vertical ternary division mode.
2. The multi-functional video coding inter-frame division method of claim 1, wherein: when the first ratio of the current coding unit of the current coding tree unit is determined to be less than or equal to the first threshold, executing the next step, which specifically includes: selecting a coding tree unit from an input image as a current coding tree unit;
performing quadtree division on a current coding tree unit to obtain 4 coding units;
selecting any one coding unit from the 4 coding units as a current coding unit;
judging whether the coding frame where the current coding unit is located is an intra-frame coding frame, if so, executing the step of extracting and calculating the vertical edge texture features and the horizontal edge texture features of the current coding unit; otherwise, executing the next step;
calculating a difference image of the current coding unit by adopting a three-frame difference method;
calculating the ratio of the number of pixels which are equal to 0 in the difference image of the current coding unit to the total number of pixels of the current coding unit as a first ratio;
and when the first ratio is determined to be less than or equal to the first threshold value, the step of extracting and calculating the vertical edge texture features and the horizontal edge texture features of the current coding unit is performed.
3. The multi-functional video coding inter-frame division method of claim 1, wherein: the step of extracting and calculating the vertical edge texture feature and the horizontal edge texture feature of the current coding unit specifically includes:
extracting the edge characteristics of the current coding unit by adopting a block-based edge extraction algorithm;
calculating the vertical edge texture characteristics of the current coding unit according to the edge characteristics of the current coding unit;
calculating the horizontal edge texture characteristics of the current coding unit according to the edge characteristics of the current coding unit;
and calculating the texture feature density of the current coding unit according to the edge feature of the current coding unit.
4. The multi-functional video coding inter-frame division method of claim 3, wherein: the block-based edge extraction algorithm adopts a Canny algorithm, and the step of calculating the vertical edge texture feature of the current coding unit according to the edge feature of the current coding unit specifically comprises the following steps:
determining a texture value of any point in an edge image obtained by extracting a current coding unit through a Canny algorithm, and further obtaining a vertical edge feature component of each column in the edge image;
finding out the maximum value of the vertical edge feature component and the minimum value of the vertical edge feature component from the obtained vertical edge feature components of each column;
and calculating the vertical edge texture feature of the current coding unit according to the maximum value of the vertical edge feature component and the minimum value of the vertical edge feature component.
5. The multi-functional video coding inter-frame division method of claim 3, wherein: the block-based edge extraction algorithm adopts a Canny algorithm, and the step of calculating the horizontal edge texture feature of the current coding unit according to the edge feature of the current coding unit specifically comprises the following steps:
determining a texture value of any point in an edge image obtained by extracting a current coding unit through a Canny algorithm, and further obtaining a horizontal edge feature component of each line in the edge image;
finding out the maximum value and the minimum value of the horizontal edge feature component from the obtained horizontal edge feature components of each line;
and calculating the horizontal edge texture feature of the current coding unit according to the maximum value of the horizontal edge feature component and the minimum value of the horizontal edge feature component.
6. The multi-functional video coding inter-frame division method of claim 3, wherein: the block-based edge extraction algorithm adopts a Canny algorithm, and the step of calculating the texture feature density of the current coding unit according to the edge feature of the current coding unit specifically comprises the following steps:
calculating the sum of Canny characteristic values of all points in the edge image obtained after the current coding unit is extracted by a Canny algorithm;
determining the width and height of the current coding unit;
and calculating the texture feature density of the current coding unit according to the sum of the calculated Canny feature values and the determined width and height.
7. The multi-functional video coding inter-frame division method of claim 3, wherein: the step of performing a partition decision of the current coding unit according to the vertical edge texture feature and the horizontal edge texture feature of the current coding unit specifically includes:
taking a current coding unit as a current coding block;
calculating the rate distortion cost of the current coding block;
when the texture feature density of the current coding block is determined to be larger than or equal to a second threshold value, executing the next step;
when the first quotient is determined to be larger than a third threshold, skipping the horizontal binary division mode and the horizontal ternary division mode, and selecting the vertical binary division mode, the vertical ternary division mode and the quadtree division mode as candidate division modes of the current coding block, wherein the first quotient is equal to the vertical edge texture feature value of the current coding block divided by the horizontal edge texture feature value;
when the second quotient is determined to be larger than the third threshold, skipping the vertical binary division mode and the vertical ternary division mode, and selecting the horizontal binary division mode, the horizontal ternary division mode and the quadtree division mode as candidate division modes of the current coding block, wherein the second quotient is equal to the horizontal edge texture feature value of the current coding block divided by the vertical edge texture feature value;
when the first quotient and the second quotient are both determined to be smaller than or equal to the third threshold, selecting the horizontal binary division mode, the vertical binary division mode, the horizontal ternary division mode, the vertical ternary division mode and the quadtree division mode as candidate division modes of the current coding block;
sequentially performing partition attempts on the current coding block according to the selected candidate partition mode to obtain the partition mode of the current coding block;
dividing the current coding block into a plurality of sub-blocks according to the obtained dividing mode;
and selecting any one of the sub-blocks as the current coding block, and returning to the step of calculating the rate-distortion cost of the current coding block.
8. A multi-functional video coding interframe partition system, characterized by: the method comprises the following steps:
a determining unit, configured to determine that a first ratio of the current coding unit of the current coding tree unit is smaller than or equal to a first threshold and to pass the current coding unit to the feature extraction and calculation module for processing, wherein the first ratio is the ratio of the number of pixels equal to 0 in a difference image of the current coding unit computed with the three-frame difference method to the total number of pixels of the current coding unit, and the current coding unit is any one of the 4 coding units obtained by quadtree division of the current coding tree unit;
the characteristic extraction and calculation module is used for extracting and calculating the vertical edge texture characteristic and the horizontal edge texture characteristic of the current coding unit;
a partition decision module, configured to perform a partition decision of the current coding unit according to the vertical edge texture feature and the horizontal edge texture feature of the current coding unit, wherein the performing the partition decision of the current coding unit comprises: when the vertical edge texture feature of the current coding unit is determined to be larger than the horizontal edge texture feature, skipping the horizontal binary division mode and the horizontal ternary division mode; and when the vertical edge texture feature of the current coding unit is determined to be smaller than the horizontal edge texture feature, skipping the vertical binary division mode and the vertical ternary division mode.
9. A multi-functional video coding interframe partition system, characterized by: the method comprises the following steps:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the multi-functional video coding interframe partition method as recited in any one of claims 1-7.
10. A storage medium having stored therein processor-executable instructions, characterized in that: the processor-executable instructions, when executed by a processor, implement the multi-functional video coding interframe partition method as recited in any one of claims 1-7.
CN201910894063.9A 2019-09-20 2019-09-20 Method, system and storage medium for dividing multifunctional video coding frames Active CN110730343B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910894063.9A CN110730343B (en) 2019-09-20 2019-09-20 Method, system and storage medium for dividing multifunctional video coding frames

Publications (2)

Publication Number Publication Date
CN110730343A CN110730343A (en) 2020-01-24
CN110730343B true CN110730343B (en) 2021-12-07

Family

ID=69219317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910894063.9A Active CN110730343B (en) 2019-09-20 2019-09-20 Method, system and storage medium for dividing multifunctional video coding frames

Country Status (1)

Country Link
CN (1) CN110730343B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111372079B (en) * 2020-03-11 2021-01-22 南华大学 VVC inter-frame CU deep rapid dividing method
CN111770340B (en) * 2020-07-23 2022-03-15 腾讯科技(深圳)有限公司 Video encoding method, device, equipment and storage medium
CN111669593B (en) * 2020-07-27 2022-01-28 北京奇艺世纪科技有限公司 Video encoding method, video encoding device, electronic equipment and storage medium
CN112104868B (en) * 2020-11-05 2021-02-05 电子科技大学 Quick decision-making method for VVC intra-frame coding unit division
CN113286144B (en) * 2021-03-24 2022-07-29 中山大学 Method, device and medium for rapid decision making of CU (Unit of computer) division based on Gabor
CN113225552B (en) * 2021-05-12 2022-04-29 天津大学 Intelligent rapid interframe coding method
CN115802044B (en) * 2023-02-06 2023-08-18 深流微智能科技(深圳)有限公司 Determination device, apparatus and storage medium for coding block division mode
CN117241042B (en) * 2023-08-31 2024-05-14 湖南大学 Fractal image compression method and system for classifying image blocks by DCT

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103096090A (en) * 2013-02-20 2013-05-08 广州柯维新数码科技有限公司 Method of dividing code blocks in video compression
CN105120292A (en) * 2015-09-09 2015-12-02 厦门大学 Video coding intra-frame prediction method based on image texture features
CN105847838A (en) * 2016-05-13 2016-08-10 南京信息工程大学 HEVC intra-frame prediction method
CN106454342A (en) * 2016-09-07 2017-02-22 中山大学 Interframe mode fast selecting method and system of video compressed coding
CN109068142A (en) * 2018-09-06 2018-12-21 北方工业大学 360 degree of video intra-frame prediction high-speed decisions based on textural characteristics

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2016262259B2 (en) * 2015-05-12 2019-01-17 Samsung Electronics Co., Ltd. Image decoding method for performing intra prediction and device thereof, and image encoding method for performing intra prediction and device thereof


Also Published As

Publication number Publication date
CN110730343A (en) 2020-01-24


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant