CN110049338B - HEVC (high efficiency video coding) rapid inter-frame coding method based on multi-level classification - Google Patents

HEVC (high efficiency video coding) rapid inter-frame coding method based on multi-level classification

Info

Publication number
CN110049338B
CN110049338B (application CN201910344082.4A)
Authority
CN
China
Prior art keywords
depth
mode
classification tree
classification
flag
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910344082.4A
Other languages
Chinese (zh)
Other versions
CN110049338A
Inventor
陆宇
黄旭东
黄晓峰
周洋
殷海兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Lizhuan Technology Transfer Center Co ltd
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN201910344082.4A priority Critical patent/CN110049338B/en
Publication of CN110049338A publication Critical patent/CN110049338A/en
Application granted granted Critical
Publication of CN110049338B publication Critical patent/CN110049338B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H04N19/103: Selection of coding mode or of prediction mode
    • H04N19/119: Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/14: Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/147: Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/30: Coding using hierarchical techniques, e.g. scalability
    • H04N19/503: Predictive coding involving temporal prediction
    • H04N19/96: Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses an HEVC fast inter-frame coding method based on multi-level classification. The method realizes fast HEVC inter-frame coding by using a CU (Coding Unit)-level classification tree, a PU (Prediction Unit)-level classification tree and a TU (Transform Unit)-level classification tree, and comprises a fast CU partitioning method based on a CU depth classification tree, a fast PU selection method based on an inter-frame mode classification tree, and a TU partitioning method based on a TU depth classification tree. The method exploits the spatio-temporal similarity of the CU depth, the PU mode and the TU depth in the HEVC coding process to reduce the complexity of CU partitioning and simplify the selection of the inter-frame prediction mode, and additionally uses the transform-coefficient characteristics of the TU to reduce the complexity of TU partitioning.

Description

HEVC (high efficiency video coding) rapid inter-frame coding method based on multi-level classification
Technical Field
The invention belongs to the technical field of video coding, and particularly relates to a low-complexity video fast interframe coding method.
Background
In recent years, with the emergence of large amounts of high-definition and even ultra-high-definition video, existing storage technology and network bandwidth fall far short of what is required for effective storage and transmission of such video, so the technical innovation and development of video compression coding has never stopped. To meet the transmission and storage requirements of high-definition and ultra-high-definition video, the Video Coding Experts Group (VCEG) of the ITU-T and the Moving Picture Experts Group (MPEG) of ISO/IEC jointly developed a new-generation standard through the Joint Collaborative Team on Video Coding: High Efficiency Video Coding (HEVC), also called H.265, which was completed in 2013. HEVC is the successor of H.264; its main goals are to handle the high resolutions and parallel processing that the traditional H.264/MPEG-4 AVC standard cannot, and to save 30-50% of the bit rate while keeping the same subjective video quality. To encode high-definition and ultra-high-definition video such as 1080P, 2K, 4K and even 8K and to reduce its transmission bit rate, the HEVC encoder adopts a large number of new technologies such as a flexible quadtree structure, angular intra prediction, asymmetric inter-frame partitioning, advanced motion vector prediction and multi-frame motion-compensated prediction.
In the encoding process, HEVC employs a flexible block partition structure comprising Coding Units (CU), Prediction Units (PU) and Transform Units (TU). The CU sizes are 64 × 64, 32 × 32, 16 × 16 and 8 × 8, corresponding to partition depths 0, 1, 2 and 3, respectively. A 64 × 64 CU is defined as a Coding Tree Unit (CTU); with the quadtree partition structure, each CTU can be recursively split into four equal-sized CUs, and each CU can in turn be split into four equal-sized CUs down to the smallest CU (8 × 8). HEVC traverses all CUs from depth 0 to depth 3 during encoding. Corresponding to the CU partition, a TU can be partitioned into four sizes, namely 32 × 32, 16 × 16, 8 × 8 and 4 × 4; fig. 1 shows the partitioning of CUs and TUs within one CTU. Each CU has corresponding PUs and TUs, and the PU is the basic unit of intra and inter prediction. For a 2N × 2N CU there are eight inter prediction unit (PU) partition modes: 2N×2N (Merge/Skip), 2N×N, N×2N, N×N, 2N×nU, 2N×nD, nL×2N and nR×2N, where 2N×N, N×2N and N×N are symmetric partitions (SMP) and 2N×nU, 2N×nD, nL×2N and nR×2N are asymmetric partitions (AMP), as shown in fig. 2. To obtain the best inter prediction mode, all PU modes need to be traversed for each CU. This block partitioning and mode selection achieves high compression performance but introduces high coding complexity and is not suitable for real-time coding. Therefore, how to reduce the coding complexity and increase the coding speed while maintaining the original coding quality and compression efficiency is one of the hot topics in current HEVC research.
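For reference, the exhaustive search over the CU quadtree that a standard encoder performs, and that the method below accelerates, can be sketched as the following recursion. This is a simplified illustration under assumed names such as rdCostOfCu; it is not the HM implementation.

    #include <algorithm>

    // Rate-distortion cost of coding one CU without further splitting, i.e. after
    // trying all PU modes (Merge/Skip, 2Nx2N, SMP, AMP) and the TU tree.
    // A stub here; in a real encoder this is the expensive part.
    double rdCostOfCu(int /*x*/, int /*y*/, int /*size*/, int /*depth*/) { return 1.0; }

    // Exhaustive quadtree search over CU depths 0..3 (64x64 down to 8x8).
    double compressCu(int x, int y, int size, int depth)
    {
        double bestCost = rdCostOfCu(x, y, size, depth);
        if (depth < 3) {                          // the smallest CU is 8x8
            double splitCost = 0.0;
            int half = size / 2;
            for (int i = 0; i < 4; i++) {         // recurse into the four sub-CUs
                splitCost += compressCu(x + (i % 2) * half, y + (i / 2) * half,
                                        half, depth + 1);
            }
            bestCost = std::min(bestCost, splitCost);
        }
        return bestCost;
    }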
Disclosure of Invention
The invention aims to provide a fast HEVC inter-frame coding method based on multi-level classification that addresses the high complexity of conventional HEVC video coding, so that the coding quality is preserved while the coding complexity is reduced.
The fast inter-frame coding method mainly uses a CU-level classification tree, a PU-level classification tree and a TU-level classification tree to realize fast HEVC inter-frame coding, and mainly comprises a fast CU partitioning method based on a CU depth classification tree, a fast PU selection method based on an inter-frame mode classification tree and a TU partitioning method based on a TU depth classification tree.
In the fast CU partitioning method based on the CU depth classification tree, the correlation features between the current CU and the collocated CU of the previous frame are first calculated, and a temporal CU depth classification tree is used to divide CUs into temporally similar CUs and non-temporally-similar CUs. The temporal CU depth classification tree and a spatial CU depth classification tree are then applied to realize fast CU partitioning.
The fast PU selection method based on the inter-frame mode classification tree uses the PU mode correlation between the current CU and its spatially adjacent CUs, and uses a PU inter-frame mode classification tree to realize fast PU selection.
The TU partitioning method based on the TU depth classification tree first calculates features of the TU transform coefficients and uses a TU depth classification tree to realize fast TU partitioning.
The method utilizes the spatial-temporal similarity of the CU depth, the PU mode and the TU depth in the HEVC coding process, reduces the complexity of CU division, simplifies the selection process of the inter-frame prediction mode, and simultaneously utilizes the transformation coefficient characteristics of the TU to reduce the complexity of TU division.
The technical solution adopted by the invention to solve the above problem is as follows.
(I) Fast CU partitioning method based on the CU depth classification tree
Step (I) of generating a time domain CU characteristic sample set
First, standard HEVC coding is performed, and temporal CU features are extracted when the optimal CU depth is 0 or 1. A flag Flag_sim is set and initialized to 0 as the flag bit indicating whether the CU is a temporally similar CU. If the optimal depth of the CU is 0 or 1 and is the same as the optimal depth of the collocated CU in the previous frame, Flag_sim is set to 0 and the temporal CU features are calculated for this case; if the optimal depth of the CU differs from the depth of the collocated CU in the previous frame, Flag_sim is set to 1 and the temporal CU features for this case are calculated; both kinds of samples are output to the same sample set. From the samples with Flag_sim equal to 0 and those with Flag_sim equal to 1, the same number of samples is taken at random to form the final temporal CU feature sample set, i.e. the numbers of samples with Flag_sim equal to 0 and Flag_sim equal to 1 in the sample set are equal. The temporal features adopted by the invention are as follows:
characteristic F1: the normalized absolute difference of the distortion of the current CU and the previous frame at the same position CU (as shown in fig. 3) is calculated as follows:
F1 is the normalized value of |D_cur - D_col|, where D_cur denotes the distortion value of the current CU and D_col denotes the distortion value of the collocated CU in the previous frame.
Characteristic F2: the normalized absolute difference value of the pixel deviation average absolute difference sum of the current CU and the CU at the same position of the previous frame is calculated according to the following formula:
SADA = Σ_(i=1..W, j=1..H) |Pixel(i,j) - Avg|, and F2 is the normalized value of |SADA_cur - SADA_col|, where Avg denotes the average pixel value of the CU, SADA denotes the deviation-from-average absolute difference sum of the CU, Pixel(i,j) denotes the pixel Y-component value at CU position (i,j), and W and H denote the width and height of the CU, respectively.
Characteristic F3: the absolute difference of the rate distortion cost and the distortion ratio of the current CU and the previous frame at the same position CU is calculated as follows:
F3 = |RDcost_cur/D_cur - RDcost_col/D_col|, where D_cur and RDcost_cur denote the distortion value and rate-distortion cost of the current CU, and D_col and RDcost_col denote the distortion value and rate-distortion cost of the collocated CU in the previous frame.
Characteristic F4: the absolute difference value of the Sobel gradient absolute sum of the current CU and the previous frame co-located CU is calculated as follows, as shown in fig. 4:
SUM_G = Σ_(i,j) G_(i,j) and F4 = |SUM_G_cur - SUM_G_col|, where I_(i,j) denotes the 3 × 3 pixel patch of the CU whose centre pixel is at position (i,j), G_(i,j) denotes the sum of the absolute horizontal and vertical Sobel gradients of I_(i,j), and SUM_G denotes the absolute gradient sum over all pixel patches.
Step (II) of generating a time domain CU depth classification tree
First, the CUs are classified into two categories according to the similarity: a time-domain similar CU and a non-time-domain similar CU. The similarity between the time-domain similar CU and the previous frame of the same-position CU is higher, namely the optimal depth is the same; and the similarity of the non-time-domain similar CU and the CU at the same position of the previous frame is lower, namely the optimal depth is different. The specific classification tree generation method is as follows:
The temporal CU feature sample set generated in step (I) is processed as follows. For the temporal CU feature sample sets at depths 0 and 1, half of the samples are randomly selected to form a training set and the other half form a verification set. Based on these two sets, a classification and regression tree (CART) is used to generate an initial temporal CU depth classification tree; the maximum depth of the classification tree is set to no more than 10, the verification set is used to evaluate the leaf-node accuracy of the obtained classification tree and to prune it, and node splitting stops once the total number of training samples in a leaf node is less than or equal to 1% of the verification set size. The tree is then pruned node by node: a leaf node in which the proportion of samples labelled 0 to the total number of node samples is greater than or equal to 85% is marked as a temporally similar node and its label is set to 0; the remaining leaf nodes are marked as non-temporally-similar nodes and their labels are set to 1. If the labels of both leaf nodes under the same parent node are 1, those leaf nodes are pruned.
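The same labelling and pruning rule is reused later for the spatial CU, PU inter-frame mode and TU depth classification trees. As a minimal sketch of that rule only (the TreeNode layout and the recursive traversal are assumptions for illustration, not the actual training code):

    // Illustrative sketch of the 85% leaf-labelling rule and the sibling pruning
    // described above. Node layout and field names are assumptions.
    struct TreeNode {
        TreeNode* left  = nullptr;   // child for "feature < threshold"
        TreeNode* right = nullptr;   // child for "feature >= threshold"
        int nSamples0   = 0;         // training samples labelled 0 in this node
        int nSamples1   = 0;         // training samples labelled 1 in this node
        int label       = -1;        // leaf label after this pass (0 or 1)
    };

    bool isLeaf(const TreeNode* n) { return n->left == nullptr && n->right == nullptr; }

    void labelAndPrune(TreeNode* n)
    {
        if (n == nullptr) return;
        if (isLeaf(n)) {
            double total = n->nSamples0 + n->nSamples1;
            // >= 85% of the samples labelled 0  ->  "similar / stop" leaf (label 0)
            n->label = (total > 0 && n->nSamples0 / total >= 0.85) ? 0 : 1;
            return;
        }
        labelAndPrune(n->left);
        labelAndPrune(n->right);
        // If both child leaves carry label 1, the split adds nothing: prune them.
        if (isLeaf(n->left) && isLeaf(n->right) &&
            n->left->label == 1 && n->right->label == 1) {
            n->nSamples0 = n->left->nSamples0 + n->right->nSamples0;
            n->nSamples1 = n->left->nSamples1 + n->right->nSamples1;
            delete n->left;  n->left  = nullptr;
            delete n->right; n->right = nullptr;
            n->label = 1;
        }
    }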
Step (III) of dividing the CU into two types, temporally similar CU and non-temporally-similar CU, by using the time domain CU depth classification tree
The temporal CU depth classification tree generated in step (II), with its per-node classification variables and thresholds and the corresponding leaf nodes, is realized in HM inter-frame coding with an if-else implementation; the concrete realization is described in the Detailed Description below.
First, the Merge mode of the PU is executed. After the Merge mode, if the current depth of the current CU equals the optimal depth of the collocated CU in the previous frame, the temporal similarity features of step (I) are calculated for the current CU and the collocated CU and passed through the temporal CU depth classification tree of the corresponding depth; the final temporal similarity classification is given by the label of the leaf node that is reached. A CU whose leaf-node label is 0 is judged to be a temporally similar CU, and a CU whose leaf-node label is 1 is judged to be a non-temporally-similar CU.
For a temporally similar CU, the partition of the current CU is taken directly from the collocated CU of the previous frame: the depth of the collocated CU is assigned to the current CU as its optimal depth, and further splitting of the CU stops.
For a non-temporally-similar CU, a classification tree that decides whether to terminate the CU partition early is used to decide whether CU splitting continues; the features used are described in step (IV).
Step (IV) of generating a space domain CU characteristic sample set for the non-time domain similar CU
First, standard HEVC coding is performed and spatial CU features are extracted at CU depths 0, 1 and 2. A flag Flag_ns is set and initialized to 0 as the flag bit indicating whether the CU partition terminates early: Flag_ns equal to 0 means splitting stops, and Flag_ns equal to 1 means splitting continues.
When the optimal depth of the CU is 0, extracting spatial CU features under the depth of 0 and marking the spatial CU features as 0 to stop CU division; if the optimal depth of the CU is not 0, extracting spatial CU features under the depth of 0 and marking the spatial CU features as 1 to indicate that the CU division is continued. When the optimal depth of the CU is 1, extracting the characteristics of the airspace CU under the depth of 1, and marking the characteristics as 0 to stop CU division; if the optimal depth of the CU is not 1, extracting spatial CU features under the depth 1 and marking as 1 to indicate that the CU division is continued. When the optimal depth of the CU is 2, extracting the spatial CU characteristics under the depth 2 and marking the spatial CU characteristics as 0 to stop CU division; if the optimal depth of the CU is not 2, extracting the spatial CU features under the depth 2 and marking the spatial CU features as 1 to indicate that the CU division is continued.
Separate spatial CU feature sample sets are generated for the different depths. For the spatial CU feature sample set at each depth, the same number of samples with Flag_ns equal to 0 and with Flag_ns equal to 1 is taken at random and combined into the final spatial CU feature sample set, so that the numbers of samples with Flag_ns equal to 0 and Flag_ns equal to 1 in the set are equal. A spatial CU feature sample set is thus obtained for each depth. The features used are as follows:
characteristic F5: CBF of current CU, noted as F5.
Characteristic F6: the pixel of the residual block deviates from the average sum of absolute differences, and the calculation formula is as follows:
F6 = Σ_(i=1..W, j=1..H) |Pixel_res(i,j) - Avg_res|, where Avg_res denotes the average residual value of the CU residual block, Pixel_res(i,j) denotes the residual value at position (i,j) of the residual block, and W and H denote the width and height of the residual block, respectively.
Characteristic F7: the residual sub-block pixels deviate from the mean of the sum of the average absolute differences, denoted as F3. The residual sub-blocks are obtained by dividing the residual block into four residual small blocks according to 2 × 2, the height and width of each residual sub-block are equal and are half of the residual block, and the calculation formula is as follows:
F7 = (1/4) Σ_(k=1..4) SADA_res_k with SADA_res_k = Σ_(i=1..W, j=1..H) |Pixel_res_k(i,j) - Avg_res_k|, where Avg_res_k denotes the average residual value of the k-th residual sub-block, SADA_res_k denotes the deviation-from-average absolute difference sum of the k-th residual sub-block, Pixel_res_k(i,j) denotes the residual value at position (i,j) of the sub-block, and W and H denote the width and height of the residual sub-block, respectively.
Characteristic F8: the mean of the deviation of the pixels of the residual sub-block from the mean sum of absolute differences and the absolute difference of the deviation of the pixels of the residual block from the mean sum of absolute differences represent the relationship between the parent CU residual and the child CU residual.
F8=|F6-F7| (7)
Characteristic F9: the maximum Sobel gradient of the residual block is calculated as follows:
F9 = max_(i,j) G_res_(i,j), where I_res_(i,j) denotes the 3 × 3 residual patch whose centre residual is at position (i,j) and G_res_(i,j) denotes the sum of the absolute horizontal and vertical Sobel gradients of I_res_(i,j).
Characteristic F10: the pixel deviation from the average sum of absolute differences of the current CU is calculated as follows:
F10 = Σ_(i=1..W, j=1..H) |Pixel(i,j) - Avg|, where Avg denotes the average pixel value of the CU, Pixel(i,j) denotes the pixel Y-component value at CU position (i,j), and W and H denote the width and height of the CU, respectively.
Step (V) of generating a spatial domain CU depth classification tree
First, CUs are divided into two categories according to whether CU splitting continues: class I and class II. If the classification tree result is class I, CU partitioning is terminated early; if the classification tree result is class II, CU splitting continues. The specific classification tree generation method is as follows:
similar to the method for generating the classification tree in the step (ii), firstly, the final spatial domain CU feature sample sets at each depth are processed, half of the samples are randomly selected to form a training set, and the other half of the samples form a verification set, based on the two sets, the classification and regression tree (CART) is used to generate the initial spatial domain CU depth classification tree, then the maximum depth of the classification tree is set to be not more than 10, the verification set is used to evaluate the leaf node accuracy of the obtained classification tree and prune, and once the total number of training samples in the leaf node is less than or equal to 1% of the verification set size, the node segmentation is stopped. Then, node-by-node pruning is carried out on the tree structure, leaf nodes with the sample number marked as 0 accounting for 85% or more of the total node sample number are marked as division stopping nodes, and the labels are set to be 0; and marking the rest leaf nodes as continuous division nodes, setting the labels to be 1, and if the labels of the two leaf nodes under the same father node are all 1, cutting the leaf nodes. The final spatial domain CU depth classification tree is obtained at depths 0,1 and 2, respectively.
Step (VI), method for realizing rapid CU division by using time domain CU depth classification tree and space domain CU depth classification tree
For the different CU depths, the spatial CU depth classification tree generated in step (V), with its per-node classification variables and thresholds and the corresponding leaf nodes, is realized in HM inter-frame coding with an if-else implementation; the concrete realization is described in the Detailed Description below.
After passing through the temporal CU depth classification tree, if the classification result of the CU is a temporally similar CU, the depth of the collocated CU in the previous frame is assigned to the current CU as its optimal depth and further splitting of the CU stops; if the classification result is a non-temporally-similar CU, the spatial CU depth classification tree is used for further classification. At depths 0, 1 and 2, the corresponding spatial CU features of step (IV) are extracted and passed through the spatial CU depth classification tree of the corresponding CU depth, and the final spatial CU classification result is given by the label of the leaf node that is reached: a CU whose leaf-node label is 0 is judged to be a class I CU, and a CU whose leaf-node label is 1 is judged to be a class II CU. For a class I CU, the CU partition is terminated early; for a class II CU, the fast PU selection method based on the inter-frame mode classification tree is executed next.
(II) fast PU selection method based on inter-frame mode classification tree
Step (I) of generating a feature sample set of a PU inter mode
After the 2N×2N and Merge PU mode decisions are performed, the features of the PU inter mode are extracted at CU depths 0, 1, 2 and 3, and a flag Flag_mode is set and initialized to 0 as the flag bit for PU mode selection. For CU depths 0, 1 and 2, Flag_mode equal to 0 indicates that all remaining PU mode selections are skipped, Flag_mode equal to 1 indicates that all SMP modes are performed and all AMP modes are skipped, and Flag_mode equal to 2 indicates that all PU modes are performed. For CU depth 3, since AMP mode selection is not performed, only two categories are used: Flag_mode equal to 0 indicates skipping all remaining PU mode selections, and Flag_mode equal to 2 indicates performing all PU modes.
When all PU mode selections are finished, the features are extracted and the best PU mode is determined: if the best PU mode is 2N×2N or Merge, the flag bit is marked 0 and the extracted features are output as a sample; if the best PU mode is an SMP mode, the flag bit is marked 1 and the features are output as a sample; and if the best PU mode is an AMP mode, the flag bit is marked 2 and the features are output as a sample.
The different CU depths generate respective sets of feature samples for the PU inter mode. For the feature sample set of each CU depth, the same number of samples are randomly obtained from the samples with Flag _ mode 0,1 and 2, and are combined to form the final feature sample set of the PU inter mode, that is, the number of samples with Flag _ mode of 0, flag _ mode of 1, and Flag _ mode of 2 in the sample set is equal. And finally obtaining a feature sample set of the PU inter mode under each CU depth.
The characteristics used are as follows:
characteristic F11: the ratio absolute difference value of the rate distortion cost and the distortion is calculated according to the following formula:
F11 is computed from the ratio of the rate-distortion cost and the distortion of the current CU, where D and RDcost denote the distortion value and rate-distortion cost of the current CU.
Characteristic F12: the pixel deviation average absolute difference sum of the current CU is calculated as follows:
F12 = Σ_(i=1..W, j=1..H) |Pixel(i,j) - Avg|, where Avg denotes the average pixel value of the CU, Pixel(i,j) denotes the pixel Y-component value at CU position (i,j), and W and H denote the width and height of the CU, respectively.
Characteristic F13: the average of the optimal CU depths of the surrounding blocks (upper CU, left CU, upper left CU, as in fig. 11) is calculated as follows:
F13 = (Dep_above + Dep_left + Dep_aboveleft)/3, where Dep_above, Dep_left and Dep_aboveleft denote the optimal CU depths of the upper, left and upper-left CUs, respectively.
Characteristic F14: the average of the best PU modes of the surrounding blocks (top CU, left CU, top left CU, fig. 11) is calculated as follows:
F14 = (PUmode_above + PUmode_left + PUmode_aboveleft)/3. The values of the PU modes are set as follows: 2N×2N is 0, 2N×N is 1, N×2N is 2, N×N is 3, 2N×nU is 4, 2N×nD is 5, nL×2N is 6 and nR×2N is 7. PUmode_above, PUmode_left and PUmode_aboveleft denote the best PU modes of the upper, left and upper-left CUs, respectively.
Characteristic F15: after execution of the optimal PU mode following the Merge, skip,2N × 2N mode, F15 is 0 if the optimal mode is Skip, F15 is 1 if the optimal mode is Merge, and F15 is 2 if the optimal mode is 2N × 2N.
Step (II) of generating PU inter-frame mode classification tree
First, according to whether PU mode selection continues, PUs are divided into three categories: class I, class II and class III. If the PU inter-frame mode classification tree result is class I, all remaining PU mode selections are skipped; if the result is class II, all SMP modes are executed and all AMP modes are skipped; if the result is class III, all PU modes are executed. The specific method for generating the PU inter-frame mode classification tree is as follows:
Similar to the classification tree generation method in step (II) of the CU part, the final feature sample sets of the PU inter mode at each CU depth are first processed: half of the samples are randomly selected to form a training set and the other half form a verification set. Based on these two sets, a classification and regression tree (CART) is used to generate an initial PU inter-frame mode classification tree; the maximum depth of the classification tree is set to no more than 10, the verification set is used to evaluate the leaf-node accuracy of the obtained classification tree and to prune it, and node splitting stops once the total number of training samples in a leaf node is less than or equal to 1% of the verification set size. The tree is then pruned leaf node by leaf node. At CU depths 0, 1 and 2, a leaf node in which the proportion of samples labelled 0 to the total number of node samples is at least 85% is marked as skipping all remaining PU modes and its label is set to 0; for the remaining leaf nodes, the proportions of samples labelled 0 and labelled 1 are added, and if the combined proportion is greater than or equal to 85% the node is marked as executing all SMP modes and skipping all AMP modes, with its label set to 1; the rest have their labels set to 2, i.e. all PU modes are executed.
For CU depth 3, since AMP mode selection is not performed by default, a leaf node in which the proportion of samples labelled 0 to the total number of node samples is greater than or equal to 85% is marked as skipping all remaining PU modes and its label is set to 0; the rest have their labels set to 2, i.e. all PU modes are executed.
If the labels of both leaf nodes under the same parent node are 2, they are likewise pruned. The final PU inter-frame mode classification trees are obtained at CU depths 0, 1, 2 and 3, respectively.
Step (III) of realizing rapid PU selection by using PU inter-frame mode classification tree
For the different CU depths, the PU inter-frame mode classification tree generated in step (II), with its per-node classification variables and thresholds and the corresponding leaf nodes, is realized in HM inter-frame coding with an if-else implementation; the concrete realization is described in the Detailed Description below.
Respectively extracting the characteristics of the PU inter mode in the step (I) in the CU depth 0,1,2, passing the characteristics through a PU inter mode classification tree with the corresponding depth, obtaining the classification result of the final PU mode according to the finally obtained leaf node label, and determining the PU with the leaf node label of 0 as an I-type PU; for a CU with a leaf node label of 1, the CU is determined as a class II PU; a CU with a leaf node label of 2 is designated as a type iii PU. For class I PUs, skip all remaining PU modes; for class II PUs, all SMP modes are executed, and all AMP modes are skipped; for class III PUs, all PU modes are executed.
At depth 3, the features of the PU inter mode in step (I) are extracted and passed through the PU inter-frame mode classification tree of depth 3, and the final PU mode classification result is given by the label of the leaf node that is reached. Because AMP is not evaluated at depth 3, there is no class II: a PU whose leaf-node label is 0 is judged to be a class I PU, and a PU whose leaf-node label is 2 is judged to be a class III PU. For a class I PU, all remaining PU modes are skipped; for a class III PU, all PU modes are executed.
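A minimal sketch of this three-way gating of the PU mode search is given below; the enum, function names and stubs are assumptions for illustration, not the actual HM code.

    // Illustrative gating of the PU mode search according to the classification result.
    void checkSmpModes() { /* wrapper around the encoder's 2NxN / Nx2N (and NxN) checks (stub) */ }
    void checkAmpModes() { /* wrapper around the encoder's 2NxnU / 2NxnD / nLx2N / nRx2N checks (stub) */ }

    enum PuClass { CLASS_I = 0, CLASS_II = 1, CLASS_III = 2 };

    void searchPuModes(int cuDepth, PuClass cls)
    {
        // Merge/Skip and 2Nx2N have already been evaluated before classification.
        if (cls == CLASS_I)
            return;                              // class I: skip all remaining PU modes
        checkSmpModes();                         // class II and class III: run the SMP modes
        if (cls == CLASS_III && cuDepth < 3)
            checkAmpModes();                     // class III only, and AMP is never run at depth 3
    }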
(III) Fast TU partitioning method based on the TU depth classification tree
Step (I) of generating TU characteristic sample set
For each PU mode, TU partitioning must be executed. TU features are extracted at CU depths 0, 1, 2 and 3, and a flag Flag_TU is set and initialized to 0 as the flag bit for early TU termination: Flag_TU equal to 0 means that TU partitioning is terminated early, and Flag_TU equal to 1 means that TU depth partitioning continues.
When the rate-distortion cost of the parent TU is compared with that of the child TUs, the category of the TU is judged and the corresponding features are extracted. If the sum of the rate-distortion costs of the child TUs is less than the rate-distortion cost of the parent TU, the TU needs to be divided further and Flag_TU is marked 1; if the sum of the rate-distortion costs of the child TUs is not less than that of the current parent TU, the TU does not need to be divided further and Flag_TU is marked 0.
Different CU depths generate respective TU feature sample sets. For a TU feature sample set of each CU depth, randomly acquiring samples with Flag _ TU of 0 and 1 in the same number respectively, and combining to form a final TU feature sample set, wherein the number of the samples with Flag _ TU of 0 and Flag _ TU of 1 in the sample set is equal. And obtaining a TU characteristic sample set under each CU depth. The characteristics used are as follows:
the lateral gradient of the residual coefficients is first calculated using the Roberts operator, as shown in fig. 5:
Figure BDA0002041704790000111
wherein, I i,j Indicating that the residual coefficient matrix is divided into 2 × 2 size matrices, with the coordinate position of the upper left residual coefficient being (i, j).
Characteristic F16: and (5) calculating by using the formula (14) to obtain a gradient matrix, finding out coordinate positions (i, j) of all non-zero values in each row, and recording the maximum value of i in the (i, j). I.e. all points g (i, j) of the traversal gradient matrix, i, j representing the coordinate positions of the matrix gradients, the largest value of i is selected.
i=max{i},s.t.g(i,j)≠0 (15)
Characteristic F17: and (5) calculating by using an equation (14) to obtain a gradient matrix, finding out coordinate positions (i, j) of all non-zero values in each column, and recording the maximum j value in the (i, j). I.e. all points g (i, j) of the traversal gradient matrix, i, j representing the coordinate positions of the matrix gradient, the largest j value is selected.
j=max{j},s.t.g(i,j)≠0 (16)
Characteristic F18: the number of all non-zero gradients in the resulting gradient matrix is calculated using equation (14). F18 is initialized to 0, all points g (i, j) of the traversal gradient matrix (i, j) represent the positions of the matrix gradients, and F18 is incremented when g (i, j) is not 0.
Characteristic F19: the matrix energy of the residual coefficient matrix, i.e. the sum of squares of each matrix element, reflects the degree of distribution of the residual coefficient matrix, and the calculation formula is as follows:
F19 = Σ_(i=1..W, j=1..H) Coef(i,j)^2, where Coef(i,j) denotes the coefficient value of the residual coefficient matrix at position (i,j) and W and H denote the width and height of the residual coefficient matrix, respectively.
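As a hedged sketch of features F16-F19: the residual-coefficient array layout is an assumption, and since the exact form of formula (14) is not reproduced here, the Roberts response per 2 × 2 block, taken below as the sum of the two diagonal absolute differences, is also an assumption.

    #include <cstdlib>
    #include <vector>

    struct TuFeatures { int F16, F17, F18; double F19; };

    TuFeatures tuFeatures(const std::vector<int>& coef, int W, int H)
    {
        auto c = [&](int i, int j) { return coef[j * W + i]; };
        TuFeatures f{0, 0, 0, 0.0};
        for (int j = 0; j < H - 1; j++)
            for (int i = 0; i < W - 1; i++) {
                // Roberts response of the 2x2 block whose top-left coefficient is (i, j)
                // (assumed form of formula (14)).
                int g = std::abs(c(i, j) - c(i + 1, j + 1))
                      + std::abs(c(i + 1, j) - c(i, j + 1));
                if (g != 0) {
                    if (i > f.F16) f.F16 = i;   // F16: largest i-coordinate with a non-zero gradient
                    if (j > f.F17) f.F17 = j;   // F17: largest j-coordinate with a non-zero gradient
                    f.F18++;                    // F18: number of non-zero gradients
                }
            }
        // F19: energy of the residual coefficient matrix (sum of squared coefficients)
        for (int j = 0; j < H; j++)
            for (int i = 0; i < W; i++) f.F19 += double(c(i, j)) * c(i, j);
        return f;
    }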
Step (II) of generating a TU depth classification tree
First, TUs are divided into two categories according to whether TU splitting continues: class I and class II. If the classification tree result is class I, TU partitioning is terminated early; if the classification tree result is class II, TU splitting continues. The specific classification tree generation method is as follows:
similar to the method for generating the classification tree in step (ii) of the CU module, first, a TU feature sample set at each CU depth is processed, half of the samples are randomly selected to form a training set, and the other half of the samples form a verification set, based on the two sets, an initial TU depth classification tree is generated using a classification and regression tree (CART), then the maximum depth of the classification tree is set to be not more than 10, and the verification set is used to evaluate the leaf node accuracy of the obtained classification tree and prune, and once the total number of training samples in the leaf node is less than or equal to 1% of the verification set size, the node segmentation is stopped. Then, node-by-node pruning is carried out on the tree structure, leaf nodes with the sample number marked as 0 accounting for 85% or more of the total node sample number are marked as division stopping nodes, and the labels are set to be 0; and marking the rest leaf nodes as continuous division nodes, setting the labels to be 1, and if the labels of the two leaf nodes under the same father node are all 1, cutting the leaf nodes. The final TU depth classification tree is obtained on CU depths 0,1,2 and 3, respectively.
Step (III) of realizing fast TU partitioning by using the TU depth classification tree
For the different CU depths, the TU depth classification tree generated in step (II), with its per-node classification variables and thresholds and the corresponding leaf nodes, is realized in HM inter-frame coding with an if-else implementation; the concrete realization is described in the Detailed Description below.
Extracting TU characteristics in the step (I) in CU depths 0,1,2 and 3, passing the TU characteristics through a TU depth classification tree corresponding to the CU depths, obtaining a final early termination classification result according to a finally obtained leaf node label, and determining TUs with leaf node labels of 0 as class I TUs; for TUs with leaf node label 1, it is designated as a class ii TU. For the class I TU, the division of the TU is terminated in advance; and aiming at the TU of the type II, continuously executing the division of the TU.
The invention has the following beneficial effects:
The basic principle of the invention is to exploit the correlation of the depths and prediction modes of the CUs that are temporally and spatially adjacent to the current CU, together with other features related to the CU depth, the PU mode and the TU depth. The corresponding features are used to generate classification trees that classify the similarity of a CU and decide whether its partition continues, that quickly select the PU mode, and that quickly partition the TU. A classification tree algorithm for the CU, the PU and the TU is thereby provided, which reduces the complexity of CU and TU partitioning and simplifies the selection of the PU inter-frame mode; compared with standard HEVC coding, more than 50% of the coding time is saved.
Drawings
Fig. 1 is a schematic diagram of CU partitioning for HEVC;
fig. 2 is a schematic diagram of a PU inter mode of HEVC;
FIG. 3 is a schematic diagram of the detailed location of temporally adjacent CUs;
FIG. 4 is a schematic diagram of the transverse and longitudinal gradients of the Sobel operator;
FIG. 5 is a schematic diagram of the transverse gradient of the Roberts operator;
FIG. 6 is a schematic diagram of a temporal CU depth classification tree with a depth of 0;
FIG. 7 is a diagram of a temporal CU depth classification tree with a depth of 1;
FIG. 8 is a schematic diagram of a spatial CU depth classification tree with depth 0;
FIG. 9 is a schematic diagram of a spatial CU depth classification tree with depth 1;
FIG. 10 is a schematic diagram of a spatial CU depth classification tree with depth 2;
FIG. 11 is a schematic diagram of the specific locations of spatially adjacent PUs of a PU;
FIG. 12 is a diagram of a PU inter mode classification tree with a CU depth of 0;
FIG. 13 is a diagram of a PU inter mode classification tree with a CU depth of 1;
FIG. 14 is a diagram of a PU inter mode classification tree with a CU depth of 2;
FIG. 15 is a diagram of a PU inter mode classification tree with a CU depth of 3;
FIG. 16 is a diagram of a TU depth classification tree with a CU depth of 0;
FIG. 17 is a diagram of a TU depth classification tree with a CU depth of 1;
FIG. 18 is a diagram of a TU depth classification tree with a CU depth of 2;
FIG. 19 is a diagram of a TU depth classification tree with a CU depth of 3;
FIG. 20 is an overall method flow diagram of the present invention;
FIG. 21 is a flow chart of a fast CU partition method of the present invention;
FIG. 22 is a flowchart of a fast PU selection method of the present invention;
fig. 23 is a flowchart of a fast TU partitioning method of the present invention.
Detailed Description
The invention is further described below with reference to the drawings and the examples.
As shown in fig. 16, an HEVC fast inter-frame coding method based on multi-level classification uses the HM16.9 version of the HEVC reference software with its inter-frame coding configuration file encoder_lowdelay_P_main.cfg. A CU block with depth 0 is taken as an example here.
The fast inter-frame coding method mainly uses a CU-level classification tree, a PU-level classification tree and a TU-level classification tree to realize fast HEVC inter-frame coding, and mainly comprises a fast CU partitioning method based on a CU depth classification tree, a fast PU selection method based on an inter-frame mode classification tree and a TU partitioning method based on a TU depth classification tree.
The fast CU partitioning method based on the CU depth classification tree comprises the following specific steps:
step (I) of generating a time domain CU characteristic sample set
The features of performing temporal CU depth selection are as follows:
characteristic F1: the normalized absolute difference of the distortion of the current CU and the previous frame at the same position CU (as shown in fig. 3) is calculated as follows:
F1 is the normalized value of |D_cur - D_col|, where D_cur denotes the distortion value of the current CU and D_col denotes the distortion value of the collocated CU in the previous frame. The value of D is obtained with the getTotalDistortion() function provided in the HM software.
Characteristic F2: the normalized absolute difference value of the pixel deviation average absolute difference sum of the current CU and the CU at the same position of the previous frame is calculated according to the following formula:
SADA = Σ_(i=1..W, j=1..H) |Pixel(i,j) - Avg|, and F2 is the normalized value of |SADA_cur - SADA_col|, where Avg denotes the average pixel value of the CU, SADA denotes the deviation-from-average absolute difference sum of the CU, Pixel(i,j) denotes the pixel Y-component value at CU position (i,j), and W and H denote the width and height of the CU, respectively. The pixel Y-component values of the CU are obtained with the m_ppcOrigYuv[0]->getAddr(COMPONENT_Y) function in the HM software.
Characteristic F3: the absolute difference of the rate distortion cost and the distortion ratio of the current CU and the previous frame at the same position CU is calculated as follows:
F3 = |RDcost_cur/D_cur - RDcost_col/D_col|, where D_cur and RDcost_cur denote the distortion value and rate-distortion cost of the current CU, and D_col and RDcost_col denote the distortion value and rate-distortion cost of the collocated CU in the previous frame. The value of RDcost is obtained with the getTotalCost() function provided in the HM software.
Characteristic F4: the absolute difference value of the Sobel gradient absolute sum of the current CU and the previous frame co-located CU is calculated as follows, as shown in fig. 4:
SUM_G = Σ_(i,j) G_(i,j) and F4 = |SUM_G_cur - SUM_G_col|, where I_(i,j) denotes the 3 × 3 pixel patch of the CU whose centre pixel is at position (i,j), G_(i,j) denotes the sum of the absolute horizontal and vertical Sobel gradients of I_(i,j), and SUM_G denotes the absolute gradient sum over all pixel patches.
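As a hedged sketch of how F1-F4 can be computed: the flat Y-plane input per CU and the normalization used in F1 and F2 are assumptions for illustration; in HM the distortion, RD cost and Y samples come from getTotalDistortion(), getTotalCost() and m_ppcOrigYuv[0]->getAddr(COMPONENT_Y) as noted above.

    #include <algorithm>
    #include <cmath>
    #include <cstdlib>
    #include <vector>

    // Sum of absolute deviations of the Y samples from their mean (SADA).
    double sada(const std::vector<int>& y, int W, int H)
    {
        double avg = 0.0;
        for (int v : y) avg += v;
        avg /= (W * H);
        double s = 0.0;
        for (int v : y) s += std::fabs(v - avg);
        return s;
    }

    // Sum over the CU of the absolute horizontal + vertical Sobel responses.
    double sobelSum(const std::vector<int>& y, int W, int H)
    {
        auto p = [&](int i, int j) { return y[j * W + i]; };
        double sum = 0.0;
        for (int j = 1; j < H - 1; j++)
            for (int i = 1; i < W - 1; i++) {
                int gx = (p(i+1,j-1) + 2*p(i+1,j) + p(i+1,j+1))
                       - (p(i-1,j-1) + 2*p(i-1,j) + p(i-1,j+1));
                int gy = (p(i-1,j+1) + 2*p(i,j+1) + p(i+1,j+1))
                       - (p(i-1,j-1) + 2*p(i,j-1) + p(i+1,j-1));
                sum += std::abs(gx) + std::abs(gy);
            }
        return sum;
    }

    struct TemporalFeatures { double F1, F2, F3, F4; };

    // dCur/dCol: distortion, rdCur/rdCol: RD cost of the current and collocated CU.
    TemporalFeatures temporalFeatures(const std::vector<int>& curY,
                                      const std::vector<int>& colY,
                                      int W, int H,
                                      double dCur, double rdCur,
                                      double dCol, double rdCol)
    {
        TemporalFeatures f;
        f.F1 = std::fabs(dCur - dCol) / std::max(dCur, dCol);   // normalization assumed
        double sCur = sada(curY, W, H), sCol = sada(colY, W, H);
        f.F2 = std::fabs(sCur - sCol) / std::max(sCur, sCol);   // normalization assumed
        f.F3 = std::fabs(rdCur / dCur - rdCol / dCol);
        f.F4 = std::fabs(sobelSum(curY, W, H) - sobelSum(colY, W, H));
        return f;
    }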
Step (II) of generating a time domain CU depth classification tree
First, the CUs are classified into two categories according to the similarity: a time-domain similar CU and a non-time-domain similar CU. The similarity between the time-domain similar CU and the previous frame of the same-position CU is higher, namely the optimal depth is the same; and the similarity of the non-time-domain similar CU and the CU at the same position of the previous frame is lower, namely the optimal depth is different. The specific classification tree generation method is as follows:
The temporal CU feature sample set generated in step (I) is processed as follows. For the temporal CU feature sample sets at depths 0 and 1, half of the samples are randomly selected to form a training set and the other half form a verification set. Based on these two sets, a classification and regression tree (CART) is used to generate an initial temporal CU depth classification tree; the maximum depth of the classification tree is set to no more than 10, the verification set is used to evaluate the leaf-node accuracy of the obtained classification tree and to prune it, and node splitting stops once the total number of training samples in a leaf node is less than or equal to 1% of the verification set size. The tree is then pruned node by node: a leaf node in which the proportion of samples labelled 0 to the total number of node samples is greater than or equal to 85% is marked as a temporally similar node and its label is set to 0; the remaining leaf nodes are marked as non-temporally-similar nodes and their labels are set to 1. If the labels of both leaf nodes under the same parent node are 1, those leaf nodes are pruned.
Step (III) of dividing the CU into two types, temporally similar CU and non-temporally-similar CU, by using the time domain CU depth classification tree
First, the Merge mode of the PU is executed. After the Merge mode, if the current depth of the current CU equals the optimal depth of the collocated CU in the previous frame, the temporal similarity features of step (I) are calculated for the current CU and the collocated CU and passed through the temporal CU depth classification tree of the corresponding depth; here the generated temporal CU depth classification tree of depth 0 (fig. 6) is used to set the flag bit flag_sim. The final similarity classification is given by the resulting flag_sim label: a CU with flag_sim equal to 0 is judged to be a temporally similar CU, and a CU with flag_sim equal to 1 is judged to be a non-temporally-similar CU.
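A rough shape of this if-else transcription is sketched below; the split features, nesting and threshold constants are placeholders for illustration only, not the values actually learned for the tree of fig. 6.

    // Placeholder transcription pattern of a trained temporal CU depth classification
    // tree (depth 0) as nested if-else, as described in the text.
    int classifyTemporalDepth0(double F1, double F2, double F3, double F4)
    {
        int flag_sim;                              // 0: temporally similar CU, 1: not
        if (F1 < 0.10) {                           // placeholder threshold
            if (F3 < 0.05)                         // placeholder threshold
                flag_sim = 0;                      // leaf labelled "similar"
            else
                flag_sim = (F2 < 0.20) ? 0 : 1;    // placeholder threshold
        } else {
            flag_sim = (F4 < 1500.0) ? 0 : 1;      // placeholder threshold
        }
        return flag_sim;
    }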
For a temporally similar CU, the partition of the current CU is taken directly from the collocated CU of the previous frame: the depth of the collocated CU is assigned to the current CU as its optimal depth, and further splitting of the CU stops.
For a non-temporally-similar CU, a classification tree that decides whether to terminate the CU partition early is used to decide whether CU splitting continues; the features used are described in step (IV).
Step (IV) of generating a space domain CU characteristic sample set for the non-time domain similar CU
The features of performing spatial CU depth selection are as follows:
characteristic F5: CBF of current CU, noted as F5. The value of the CBF is obtained using the getCbf () function in the HM software.
Characteristic F6: the pixels of the residual block deviate from the mean sum of absolute differences, which is calculated as follows:
F6 = Σ_(i=1..W, j=1..H) |Pixel_res(i,j) - Avg_res|, where Avg_res denotes the average residual value of the CU residual block, Pixel_res(i,j) denotes the residual value at position (i,j) of the residual block, and W and H denote the width and height of the residual block, respectively.
Characteristic F7: the residual sub-block pixels deviate from the mean of the sum of the average absolute differences, denoted as F3. The residual sub-blocks are obtained by dividing the residual block into four residual small blocks according to a division method of 2 × 2, the height and width of each residual sub-block are equal and are half of the residual block, and the calculation formula is as follows:
F7 = (1/4) Σ_(k=1..4) SADA_res_k with SADA_res_k = Σ_(i=1..W, j=1..H) |Pixel_res_k(i,j) - Avg_res_k|, where Avg_res_k denotes the average residual value of the k-th residual sub-block, SADA_res_k denotes the deviation-from-average absolute difference sum of the k-th residual sub-block, Pixel_res_k(i,j) denotes the residual value at position (i,j) of the sub-block, and W and H denote the width and height of the residual sub-block, respectively.
Characteristic F8: the mean of the deviation of the pixels of the residual sub-block from the mean sum of absolute differences and the absolute difference of the deviation of the pixels of the residual block from the mean sum of absolute differences represent the relationship between the parent CU residual and the child CU residual.
F8=|F6-F7| (7)
Characteristic F9: the maximum Sobel gradient of the residual block is calculated as follows:
F9 = max_(i,j) G_res_(i,j), where I_res_(i,j) denotes the 3 × 3 residual patch whose centre residual is at position (i,j) and G_res_(i,j) denotes the sum of the absolute horizontal and vertical Sobel gradients of I_res_(i,j).
Characteristic F10: the pixel deviation from the average sum of absolute differences of the current CU is calculated as follows:
F10 = Σ_(i=1..W, j=1..H) |Pixel(i,j) - Avg|, where Avg denotes the average pixel value of the CU, Pixel(i,j) denotes the pixel Y-component value at CU position (i,j), and W and H denote the width and height of the CU, respectively.
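As a hedged sketch of how the residual-based features can be computed (the flat residual-array layout and all identifiers are assumptions for illustration):

    #include <algorithm>
    #include <cmath>
    #include <cstdlib>
    #include <vector>

    // SADA of a rectangular sub-region of the residual block.
    static double sadaRes(const std::vector<int>& r, int x0, int y0, int w, int h, int stride)
    {
        double avg = 0.0;
        for (int j = 0; j < h; j++)
            for (int i = 0; i < w; i++) avg += r[(y0 + j) * stride + (x0 + i)];
        avg /= (w * h);
        double s = 0.0;
        for (int j = 0; j < h; j++)
            for (int i = 0; i < w; i++) s += std::fabs(r[(y0 + j) * stride + (x0 + i)] - avg);
        return s;
    }

    struct SpatialFeatures { double F6, F7, F8, F9; };

    SpatialFeatures spatialFeatures(const std::vector<int>& res, int W, int H)
    {
        SpatialFeatures f;
        f.F6 = sadaRes(res, 0, 0, W, H, W);                    // SADA of the whole residual block
        double sum = 0.0;                                      // F7: mean SADA of the four 2x2 sub-blocks
        for (int k = 0; k < 4; k++)
            sum += sadaRes(res, (k % 2) * (W / 2), (k / 2) * (H / 2), W / 2, H / 2, W);
        f.F7 = sum / 4.0;
        f.F8 = std::fabs(f.F6 - f.F7);                         // |F6 - F7|
        auto p = [&](int i, int j) { return res[j * W + i]; };
        double maxG = 0.0;                                     // F9: maximum Sobel response
        for (int j = 1; j < H - 1; j++)
            for (int i = 1; i < W - 1; i++) {
                int gx = (p(i+1,j-1) + 2*p(i+1,j) + p(i+1,j+1))
                       - (p(i-1,j-1) + 2*p(i-1,j) + p(i-1,j+1));
                int gy = (p(i-1,j+1) + 2*p(i,j+1) + p(i+1,j+1))
                       - (p(i-1,j-1) + 2*p(i,j-1) + p(i+1,j-1));
                maxG = std::max(maxG, double(std::abs(gx) + std::abs(gy)));
            }
        f.F9 = maxG;
        return f;
    }
    // F5 (the CBF) is read directly from the CU, e.g. via getCbf() in HM, and F10 is
    // the same SADA computation applied to the original CU samples instead of the residual.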
Step (V) of generating a spatial domain CU depth classification tree
First, according to whether the CU continues to be divided into two categories: class I and class ii. If the classification tree result is I type, finishing CU division in advance; and if the classification tree result is II, continuing the CU classification. The specific classification tree generation method is as follows:
similar to the method for generating the classification tree in the step (ii), firstly, the final spatial domain CU feature sample sets at each depth are processed, half of the samples are randomly selected to form a training set, and the other half of the samples form a verification set, based on the two sets, the classification and regression tree (CART) is used to generate the initial spatial domain CU depth classification tree, then the maximum depth of the classification tree is set to be not more than 10, the verification set is used to evaluate the leaf node accuracy of the obtained classification tree and prune, and once the total number of training samples in the leaf node is less than or equal to 1% of the verification set size, the node segmentation is stopped. Then, node-by-node pruning is carried out on the tree structure, leaf nodes with the sample number marked as 0 accounting for 85% or more of the total node sample number are marked as division stopping nodes, and the labels are set to be 0; and marking the rest leaf nodes as continuous division nodes, setting the labels to be 1, and if the labels of the two leaf nodes under the same father node are all 1, cutting the leaf nodes. And obtaining final spatial domain CU depth classification trees on the depths 0,1 and 2 respectively.
Step (VI), method for realizing rapid CU division by using time domain CU depth classification tree and space domain CU depth classification tree
For the CU which is subjected to the time domain CU depth classification tree and the classification result of the CU is the time domain similar CU, assigning the depth of the previous frame of the CU at the same position to the current CU as the optimal depth of the current CU, and simultaneously stopping continuous division of the CU; and if the classification result of the CU is a non-time domain similar CU, continuously performing classification selection by using the spatial domain CU deep classification tree. Extracting the depth features of the spatial domain CU in the step (IV), enabling the depth features to pass through a spatial domain CU depth classification tree corresponding to the depth of the CU, setting a flag bit flag _ ns by using the generated spatial domain CU depth classification tree with the depth of 0 according to the previous setting, and if the flag _ ns is 0, classifying the CU into a type I, and if the flag _ ns is 1, classifying the CU into a type II: function (depth 0 airspace CU depth classification tree)
[if-else decision function image not reproduced]
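Since the depth-0 decision function is only available as an image in the source, the sketch below shows the general if-else form such a learned tree could take; every threshold and the feature ordering are invented placeholders, not values from the patent.

```python
# Hypothetical illustration of how the learned depth-0 spatial-domain CU
# depth classification tree could be flattened into if-else tests. The
# feature names follow F5..F10 above, but every threshold is a placeholder.
def classify_cu_depth0(f5_cbf, f6_res_sada, f9_max_sobel, f10_cu_sada):
    if f5_cbf == 0:
        return 0                      # flag_ns = 0: class I, stop dividing
    if f6_res_sada < 900.0:           # placeholder threshold
        if f9_max_sobel < 350.0:      # placeholder threshold
            return 0
        return 1                      # flag_ns = 1: class II, keep dividing
    if f10_cu_sada < 1200.0:          # placeholder threshold
        return 0
    return 1
```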
A final spatial domain CU classification result is obtained from the resulting flag_ns label: a CU with flag_ns of 0 is determined to be a class I CU, and a CU with flag_ns of 1 is determined to be a class II CU. For a class I CU, partitioning of the CU is stopped in advance; for a class II CU, the fast PU selection method based on the inter-frame mode classification tree is executed next.
(II) fast PU selection method based on inter-frame mode classification tree
Step (I) of generating a feature sample set of a PU inter mode
The features required for the PU inter-mode classification are as follows:
characteristic F11: the ratio absolute difference value of the rate distortion cost and the distortion is calculated according to the following formula:
[equation image not reproduced]
where D and RDcost represent the distortion value and rate-distortion cost of the current CU.
Characteristic F12: the pixel deviation average absolute difference sum of the current CU is calculated as follows:
[equation image not reproduced]
where Avg represents the average pixel value of the CU, Pixel(i, j) represents the pixel Y-component value at CU location (i, j), and W and H represent the width and height of the CU, respectively.
Characteristic F13: the average of the optimal CU depths of the surrounding blocks (upper CU, left CU, upper left CU, as in fig. 11) is calculated as follows:
[equation image not reproduced]
where Dep_above represents the optimal CU depth of the upper CU, Dep_left represents the optimal CU depth of the left CU, and Dep_aboveleft represents the optimal CU depth of the upper-left CU.
Characteristic F14: the average of the best PU modes for the surrounding blocks (top CU, left CU, top left CU, fig. 11) is calculated as follows:
[equation image not reproduced]
The values assigned to the optimal PU modes are: 2N×2N is 0, 2N×N is 1, N×2N is 2, N×N is 3, 2N×nU is 4, 2N×nD is 5, nL×2N is 6, and nR×2N is 7. PUmode_above represents the best PU mode of the upper CU, PUmode_left represents the best PU mode of the left CU, and PUmode_aboveleft represents the best PU mode of the upper-left CU.
Characteristic F15: the optimal PU mode after the Merge, Skip and 2N×2N modes have been executed; F15 is 0 if the optimal mode is Skip, 1 if the optimal mode is Merge, and 2 if the optimal mode is 2N×2N.
Step (II) of generating PU inter-frame mode classification tree
Firstly, PUs are divided into three categories according to whether PU mode selection continues: class I, class II and class III. If the PU inter-frame mode classification tree result is class I, selection of all remaining PU modes is skipped; if the classification tree result is class II, all SMP modes are executed and all AMP modes are skipped; if the classification tree result is class III, all PU modes are executed. The specific method for generating the PU inter-frame mode classification tree is as follows:
Similar to the classification tree generation method in step (II) of the CU part, the feature sample sets of the final PU inter mode at each CU depth are processed first: half of the samples are randomly selected to form a training set and the other half form a verification set. Based on these two sets, a classification and regression tree (CART) is used to generate the initial PU inter-frame mode classification tree, with the maximum tree depth limited to 10; the verification set is used to evaluate the leaf-node accuracy of the resulting tree and to prune it, and node splitting stops once the total number of training samples in a leaf node is less than or equal to 1% of the verification set size. Then the tree is pruned leaf node by leaf node. At CU depths 0, 1 and 2, leaf nodes in which samples labeled 0 account for 85% or more of the node's samples are marked as skipping all PU modes and labeled 0; for the remaining leaf nodes, the node accuracies of the samples labeled 0 and the samples labeled 1 are added, and if the combined accuracy is 85% or more, the node is marked as executing all SMP modes and skipping all AMP modes and is labeled 1; the rest are labeled 2, i.e., all PU mode partitioning is performed.
For CU depth 3, because AMP mode selection is not performed by default, leaf nodes in which samples labeled 0 account for 85% or more of the node's samples are marked as skipping all PU modes and labeled 0; the rest are labeled 2, i.e., all PU mode partitioning is performed.
If the labels of two leaf nodes under the same parent node are both 2, they are also pruned. The final PU inter mode classification tree is obtained at CU depths 0,1,2, and 3, respectively.
Step (III) of realizing rapid PU selection by using PU inter-frame mode classification tree
Firstly, the PU inter-mode features of step (I) are extracted and passed through the PU inter-frame mode classification tree of the corresponding depth. Following the earlier convention, the generated depth-0 PU mode classification tree is used to set a flag bit flag_mode; if flag_mode is 0, the PU is classified as class I, if flag_mode is 1, the PU is classified as class II, and if flag_mode is 2, the PU is classified as class III:
function (PU inter mode classification tree of CU depth 0)
[if-else decision function image not reproduced]
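As above, the depth-0 PU inter-frame mode decision function is only available as an image; the sketch below illustrates the three-way if-else form it could take, with invented placeholder thresholds.

```python
# Hypothetical illustration of the depth-0 PU inter-frame mode classification
# tree as if-else tests. Feature names follow F11..F15 above; all thresholds
# are invented placeholders, not values learned in the patent.
def classify_pu_mode_depth0(f11_cost_ratio, f13_avg_depth, f14_avg_mode, f15_best_mode):
    if f15_best_mode == 0 and f11_cost_ratio < 0.15:   # placeholder threshold
        return 0                  # flag_mode = 0: class I, skip remaining PU modes
    if f13_avg_depth < 1.5 and f14_avg_mode < 1.0:     # placeholder thresholds
        return 1                  # flag_mode = 1: class II, SMP only, skip AMP
    return 2                      # flag_mode = 2: class III, test all PU modes
```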
A final PU mode classification result is obtained from the resulting flag_mode label: a PU with flag_mode of 0 is determined to be a class I PU, a PU with flag_mode of 1 is determined to be a class II PU, and a PU with flag_mode of 2 is determined to be a class III PU. For class I PUs, all remaining PU modes are skipped; for class II PUs, all SMP modes are executed and all AMP modes are skipped; for class III PUs, all PU modes are executed.
(III) quick TU partitioning method based on TU deep classification tree
Step (I) of generating TU characteristic sample set
Firstly, extracting features required by TU classification:
The horizontal gradient of the residual coefficients is first calculated using the Roberts operator, as shown in fig. 5:
[equation (14) image not reproduced]
where I_{i,j} denotes a 2×2 sub-matrix of the residual coefficient matrix whose upper-left residual coefficient has coordinate position (i, j).
Characteristic F16: the gradient matrix is obtained by calculation with formula (14); the coordinate positions (i, j) of all non-zero values in each row are found, and the maximum value of i among them is recorded. That is, all points g(i, j) of the gradient matrix are traversed, where (i, j) denotes the coordinate position in the gradient matrix, and the largest value of i is selected.
i=max{i},s.t.g(i,j)≠0 (15)
Characteristic F17: the gradient matrix is obtained by calculation with formula (14); the coordinate positions (i, j) of all non-zero values in each column are found, and the maximum value of j among them is recorded. That is, all points g(i, j) of the gradient matrix are traversed, where (i, j) denotes the coordinate position in the gradient matrix, and the largest value of j is selected.
j=max{j},s.t.g(i,j)≠0 (16)
Characteristic F18: the number of all non-zero gradients in the gradient matrix obtained with formula (14). F18 is initialized to 0; all points g(i, j) of the gradient matrix are traversed, where (i, j) denotes the position in the gradient matrix, and F18 is incremented by one whenever g(i, j) is not 0.
Characteristic F19: the matrix energy of the residual coefficient matrix, i.e. the sum of squares of each matrix element, reflects the degree of distribution of the residual coefficient matrix, and the calculation formula is as follows:
[equation image not reproduced]
coef (i, j) represents the coefficient value of the residual coefficient matrix at position (i, j), and W and H represent the width and height of the residual coefficient matrix, respectively.
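A hedged sketch of features F16–F19 follows; because equation (14) is only available as an image, the exact Roberts kernel assumed for g(i, j) below is an assumption made for illustration.

```python
import numpy as np

# Hypothetical sketch of TU features F16-F19. The Roberts-operator equation
# (14) is only available as an image above, so the kernel used for g(i, j)
# here (coef[i, j] - coef[i+1, j+1] on each 2x2 sub-matrix) is an assumption.
def tu_features(coef: np.ndarray):
    h, w = coef.shape
    # g(i, j): Roberts-style gradient of the 2x2 sub-matrix with top-left (i, j)
    g = coef[:h - 1, :w - 1] - coef[1:, 1:]
    nz_i, nz_j = np.nonzero(g)
    f16 = int(nz_i.max()) if nz_i.size else 0            # largest row index with a non-zero gradient
    f17 = int(nz_j.max()) if nz_j.size else 0            # largest column index with a non-zero gradient
    f18 = int(nz_i.size)                                 # number of non-zero gradients
    f19 = float(np.sum(coef.astype(np.float64) ** 2))    # residual-coefficient matrix energy
    return f16, f17, f18, f19

print(tu_features(np.random.randint(-3, 4, (8, 8))))
```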
Step (II) of generating TU deep classification tree
Firstly, TUs are divided into two categories according to whether TU division continues: class I and class II. If the classification tree result is class I, TU division is terminated in advance; if the classification tree result is class II, TU division continues. The specific classification tree generation method is as follows:
Similar to the classification tree generation method in step (II) of the CU module, the TU feature sample set at each CU depth is processed first: half of the samples are randomly selected to form a training set and the other half form a verification set. Based on these two sets, a classification and regression tree (CART) is used to generate the initial TU depth classification tree, with the maximum tree depth limited to 10; the verification set is used to evaluate the leaf-node accuracy of the resulting tree and to prune it, and node splitting stops once the total number of training samples in a leaf node is less than or equal to 1% of the verification set size. Then the tree is pruned node by node: leaf nodes in which samples labeled 0 account for 85% or more of the node's samples are marked as stop-division nodes and labeled 0; the remaining leaf nodes are marked as continue-division nodes and labeled 1, and if both leaf nodes under the same parent node are labeled 1, they are pruned. Final TU depth classification trees are obtained at CU depths 0, 1, 2 and 3, respectively.
Step (III) of realizing rapid TU division by using TU deep classification tree
The TU features of step (I) are extracted and passed through the TU depth classification tree corresponding to the CU depth. Following the earlier convention, the generated depth-0 TU depth classification tree is used to set a flag bit flag_tu; if flag_tu is 0, the TU is classified as class I, and if flag_tu is 1, the TU is classified as class II:
function (TU depth classification tree of CU depth 0)
[if-else decision function image not reproduced]
A final early-termination classification result is obtained from the resulting flag_tu label: a TU with flag_tu of 0 is determined to be a class I TU, and a TU with flag_tu of 1 is determined to be a class II TU. For a class I TU, division of the TU is terminated in advance; for a class II TU, division of the TU continues. At this point, all of the optimization modules at depth 0 have been run.
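To summarize how the classifier outputs drive the encoder shortcuts described above, the following illustrative helper maps the four flags to the corresponding skip/execute decisions; it is a summary sketch under the assumptions stated above, not the HM integration itself.

```python
# Hypothetical summary of the fast decisions implied by the classifier
# outputs: temporal CU similarity, flag_ns, flag_mode and flag_tu.
def cu_fast_decisions(temporal_similar: bool, flag_ns: int, flag_mode: int, flag_tu: int):
    decisions = {}
    if temporal_similar:
        decisions["cu"] = "reuse co-located depth, stop splitting"
        return decisions
    decisions["cu"] = "stop splitting" if flag_ns == 0 else "keep splitting"
    decisions["pu"] = {0: "skip remaining PU modes",
                       1: "SMP only, skip AMP",
                       2: "all PU modes"}[flag_mode]
    decisions["tu"] = "stop TU splitting" if flag_tu == 0 else "continue TU splitting"
    return decisions

print(cu_fast_decisions(False, 1, 1, 0))
```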

Claims (8)

1. The HEVC fast inter-frame coding method based on multi-level classification is characterized by comprising a fast CU dividing method based on a CU depth classification tree, a fast PU selecting method based on an inter-frame mode classification tree and a fast TU dividing method based on a TU depth classification tree; the rapid CU dividing method based on the CU deep classification tree comprises the steps of firstly calculating similar features of a current frame CU and a previous frame CU at the same position to generate a time domain CU feature sample set, and then training the sample set by using a decision tree to generate a time domain CU deep classification tree; dividing the CU into two types of time domain similar CU and non-time domain similar CU through a time domain CU deep classification tree; for non-time domain similar CUs, continuously extracting a space domain CU characteristic sample set, and then training the sample set by using a decision tree to generate a space domain CU deep classification tree; a time domain CU depth classification tree and a space domain CU depth classification tree are used for realizing a rapid CU division method; the fast PU selection method based on the inter-frame mode classification tree firstly extracts a characteristic sample set of a PU inter-frame mode, then trains the sample set by using a decision tree, and generates the PU inter-frame mode classification tree to realize the fast PU selection method; the TU partitioning method based on the TU deep classification tree firstly extracts a TU characteristic sample set, then trains the sample set by using a decision tree, and generates the TU deep classification tree to realize the rapid TU partitioning method.
2. The multi-level classification-based HEVC fast inter-frame coding method as claimed in claim 1, wherein the inter-frame CU features extracted in the fast CU partition method based on the CU depth classification tree are as follows:
characteristic F1: the calculation formula of the normalized absolute difference of the distortion of the current CU and the distortion of the previous frame at the same position CU is as follows:
[equation image not reproduced]
where D_cur represents the distortion value of the current CU and D_col represents the distortion value of the co-located CU in the previous frame;
characteristic F2: the normalized absolute difference value of the pixel deviation average absolute difference sum of the current CU and the CU at the same position of the previous frame is calculated according to the following formula:
[equation image not reproduced]
where Avg represents the average pixel value of the CU, SADA represents the pixel deviation average absolute difference sum of the CU, Pixel(i, j) represents the pixel Y-component value at CU location (i, j), and W and H represent the width and height of the CU, respectively;
characteristic F3: the absolute difference between the rate distortion cost and the distortion ratio of the current CU and the CU in the same position as the previous frame is calculated as follows:
[equation image not reproduced]
where D_cur and RDcost_cur represent the distortion value and rate-distortion cost of the current CU, and D_col and RDcost_col represent the distortion value and rate-distortion cost of the co-located CU in the previous frame;
characteristic F4: the absolute difference value of the Sobel gradient absolute sum of the current CU and the previous frame co-located CU is calculated as follows:
[equation image not reproduced]
where I_{i,j} denotes the 3×3 pixel sub-block of the CU whose center pixel position is (i, j), G_{i,j} denotes the absolute sum of the horizontal and vertical Sobel gradients of I_{i,j}, and SUM_G denotes the sum of the gradient absolute sums over all pixel sub-blocks.
3. The multi-level classification-based HEVC fast inter-frame coding method of claim 2, wherein the features of intra-frame CUs extracted from non-inter-frame similar CUs in the CU depth classification tree-based fast CU partition method are as follows:
characteristic F5: the CBF of the current CU is marked as F5;
characteristic F6: the pixels of the residual block deviate from the mean sum of absolute differences, which is calculated as follows:
[equation image not reproduced]
where Avg_res represents the average residual value of the CU residual block, Pixel_res(i, j) represents the residual value at residual block position (i, j), and W and H represent the width and height of the residual block, respectively;
characteristic F7: the mean of the pixel deviation average absolute difference sums of the residual sub-blocks, denoted F7; the residual sub-blocks are obtained by dividing the residual block into four small residual blocks in a 2×2 pattern, each residual sub-block having equal height and width that are half those of the residual block, and the calculation formula is as follows:
[equation image not reproduced]
where Avg_res_k represents the average residual value of the k-th residual sub-block, SADA_res_k represents the pixel deviation average absolute difference sum of the k-th residual sub-block, Pixel_res(i, j) represents the residual value at residual sub-block position (i, j), and W and H represent the width and height of the residual sub-block, respectively;
characteristic F8: the absolute difference value of the mean value of the pixel deviation average absolute difference sum of the residual error sub-blocks and the pixel deviation average absolute difference sum of the residual error blocks represents the relationship between the parent CU residual and the child CU residual;
F8=|F6-F7| (7)
characteristic F9: the maximum Sobel gradient of the residual block is calculated as follows:
[equation image not reproduced]
where I_res_{i,j} denotes the 3×3 residual sub-block of the residual block whose center residual position is (i, j), and G_res_{i,j} denotes the absolute sum of the horizontal and vertical Sobel gradients of I_res_{i,j};
characteristic F10: the pixel deviation from the average sum of absolute differences of the current CU is calculated as follows:
[equation image not reproduced]
where Avg represents the average pixel value of the CU, Pixel(i, j) represents the pixel Y-component value at CU location (i, j), and W and H represent the CU width and height, respectively.
4. The HEVC fast inter-frame coding method based on multi-level classification as claimed in claim 3, wherein the PU inter-frame mode features extracted in the fast PU selection method based on inter-frame mode classification tree are as follows:
characteristic F11: the ratio absolute difference of the rate distortion cost and the distortion is calculated as follows:
[equation image not reproduced]
wherein D and RDcost represent the distortion value and rate-distortion cost of the current CU;
characteristic F12: the pixel deviation average absolute difference sum of the current CU is calculated as follows:
[equation image not reproduced]
where Avg represents the average pixel value of the CU, Pixel(i, j) represents the pixel Y-component value at CU location (i, j), and W and H represent the CU width and height, respectively;
characteristic F13: the average value of the optimal CU depths of the surrounding blocks is calculated as follows:
[equation image not reproduced]
where the surrounding blocks comprise the upper CU, the left CU and the upper-left CU; Dep_above represents the optimal CU depth of the upper CU, Dep_left represents the optimal CU depth of the left CU, and Dep_aboveleft represents the optimal CU depth of the upper-left CU;
characteristic F14: the average value of the best PU mode of the surrounding blocks is calculated as follows:
[equation image not reproduced]
The values assigned to the optimal PU modes are: 2N×2N is 0, 2N×N is 1, N×2N is 2, N×N is 3, 2N×nU is 4, 2N×nD is 5, nL×2N is 6, and nR×2N is 7; PUmode_above represents the best PU mode of the upper CU, PUmode_left represents the best PU mode of the left CU, and PUmode_aboveleft represents the best PU mode of the upper-left CU;
characteristic F15: the optimal PU mode after the Merge, Skip and 2N×2N modes have been executed; F15 is 0 if the optimal mode is Skip, 1 if the optimal mode is Merge, and 2 if the optimal mode is 2N×2N.
5. The multi-level classification-based HEVC fast inter-frame coding method of claim 4, wherein the TU features extracted in the fast TU partitioning method based on the TU depth classification tree are as follows:
the transverse gradient of the residual coefficients is first calculated with the Roberts operator:
[equation image not reproduced]
where I_{i,j} denotes a 2×2 sub-matrix of the residual coefficient matrix whose upper-left residual coefficient has coordinate position (i, j);
characteristic F16: calculating by using a formula (14) to obtain a gradient matrix, finding out coordinate positions (i, j) of all nonzero values in each row, and recording the maximum value i in the (i, j); traversing all points g (i, j) of the gradient matrix, wherein (i, j) represents the coordinate position of the gradient of the matrix, and selecting the maximum value of i;
i=max{i},s.t.g(i,j)≠0 (15)
characteristic F17: calculating by using a formula (14) to obtain a gradient matrix, finding out coordinate positions (i, j) of all nonzero values in each column, and recording the maximum j value in the (i, j); traversing all points g (i, j) of the gradient matrix, wherein (i, j) represents the coordinate position of the matrix gradient, and selecting the maximum j value;
j=max{j},s.t.g(i,j)≠0 (16)
characteristic F18: calculating the number of all nonzero gradients in the obtained gradient matrix by using the formula (14); f18 is initialized to 0, all points g (i, j) of the traversing gradient matrix represent the positions of matrix gradients, and when g (i, j) is not 0, F18 is increased by one;
characteristic F19: the matrix energy of the residual coefficient matrix, namely the sum of squares of each matrix element, reflects the distribution degree of the residual coefficient matrix, and the calculation formula is as follows:
[equation image not reproduced]
coef (i, j) represents the coefficient value of the residual coefficient matrix at position (i, j), and W and H represent the width and height of the residual coefficient matrix, respectively.
6. The multi-level classification-based HEVC fast inter-frame coding method as claimed in claim 5, wherein the fast CU partitioning method based on CU depth classification trees comprises the steps of:
step (I) of generating a time domain CU characteristic sample set
Firstly, adopting standard HEVC coding, extracting time domain CU characteristics when the optimal depth of a CU is divided into 0 and 1, setting a Flag _ sim and initializing to 0 as a Flag bit of whether the CU is a time domain similar CU; if the optimal depth of the CU is 0 or 1, and the optimal depth of the current CU is the same as the optimal depth of the CU at the same position of the previous frame, assigning the Flag _ sim to be 0 and calculating the time domain CU characteristics under the condition; if the optimal depth of the CU is different from the depth of the CU at the same position of the previous frame, the Flag _ sim is assigned to be 1, and the time domain CU characteristics under the condition are calculated and output to the same sample set together; for a sample set with Flag _ sim of 0 and Flag _ sim of 1, randomly taking out the same number of samples to form a final time domain CU characteristic sample set, wherein the number of samples with Flag _ sim of 0 and Flag _ sim of 1 in the sample set is equal;
step (II) of generating a time domain CU deep classification tree
First, the CUs are classified into two categories according to the similarity: time-domain similar CUs and non-time-domain similar CUs; the similarity between the time-domain similar CU and the previous frame of the same-position CU is higher, namely the optimal depth is the same; the similarity between the non-time-domain similar CU and the CU in the same position of the previous frame is lower, namely the optimal depth is different; the specific classification tree generation method is as follows:
processing a time domain CU characteristic sample set generated in the step (I), randomly selecting half samples to form a training set and the other half samples to form a verification set for the time domain CU characteristic sample set with the depths of 0 and 1, based on the two sets, generating an initial time domain CU depth classification tree by using a classification and regression tree (CART), setting the maximum depth of the classification tree to be not more than 10, using the verification set to evaluate the leaf node accuracy of the obtained classification tree and pruning, and stopping node segmentation once the total number of training samples in the leaf node is less than or equal to 1% of the size of the verification set; then, pruning the tree structure one by one node, and when the sample number marked as 0 accounts for the proportion of the total node sample number, marking leaf nodes which are more than or equal to 85 percent as time domain similar nodes, and setting labels as 0; the other leaf nodes are marked as non-time domain similar nodes, the labels are set to be 1, and if the labels of the two leaf nodes under the same father node are both 1, the leaf nodes are cut off;
step (III) of dividing the CU into two types of time-domain similar CU and non-time-domain similar CU by using time-domain CU deep classification tree
Using the time domain CU depth classification tree generated in the step (II), dividing the time domain CU depth classification tree according to the classification variable and the threshold value on each node and the corresponding leaf node, and realizing the classification tree in the inter-frame coding of the HM by using an if-else programming method;
firstly, executing a Merge mode of the PU, after the Merge mode, if the current depth of the current CU is the same as the optimal depth of the previous frame of the same-position CU, calculating and obtaining the corresponding time domain similarity characteristic in claim 2 of the current CU and the previous frame of the same-position CU, enabling the time domain similarity characteristic to pass through a time domain CU depth classification tree with the corresponding depth, obtaining a final time domain similarity classification result according to a finally obtained leaf node label, and regarding the CU with the leaf node label of 0 as a time domain similar CU; for a CU with a leaf node label of 1, determining the CU as a non-time domain similar CU;
aiming at the time domain similar CU, optimizing the division of the current CU by adopting the depth of the previous frame of CU at the same position, assigning the depth of the previous frame of CU at the same position to the current CU as the optimal depth of the current CU, and simultaneously stopping the continuous division of the CU;
for the non-time domain similar CUs, classifying whether to continue the CU division by adopting a classification tree in which the CUs terminate the division in advance, and turning the specific characteristics to the step (IV);
step (IV) of generating a space domain CU characteristic sample set for the non-time domain similar CU
Firstly, adopting standard HEVC coding, extracting the characteristics of a spatial domain CU at the optimal depths of 0,1 and 2 of the CU, setting a Flag _ ns and initializing the Flag _ ns to be 0, wherein the Flag _ ns is used as a Flag bit for judging whether the CU terminates in advance, the Flag _ ns stops dividing when being 0, and the Flag _ ns continues dividing when being 1;
when the optimal depth of the CU is 0, extracting spatial CU features under the depth of 0 and marking the spatial CU features as 0 to stop CU division; if the optimal depth of the CU is not 0, extracting the spatial CU characteristics under the depth of 0, and marking the spatial CU characteristics as 1 to represent continuous CU division; when the optimal depth of the CU is 1, extracting spatial CU features under the depth of 1 and marking the spatial CU features as 0 to stop CU division; if the optimal depth of the CU is not 1, extracting spatial CU features under the depth 1, marking the spatial CU features as 1, and indicating continuous CU division; when the optimal depth of the CU is 2, extracting spatial CU features under the depth of 2 and marking the spatial CU features as 0 to stop CU division; if the optimal depth of the CU is not 2, extracting spatial CU features under the depth of 2, and marking the spatial CU features as 1 to represent continuous CU division;
generating respective space domain CU characteristic sample sets at different depths, randomly acquiring samples with Flag _ ns of 0 and Flag _ ns of 1 from the samples with Flag _ ns of each depth to form the same number of samples, and combining the samples to form a final space domain CU characteristic sample set, wherein the number of the samples with Flag _ ns of 0 and Flag _ ns of 1 in the sample set is equal; finally obtaining a spatial domain CU characteristic sample set under each different depth;
step (V) of generating a spatial domain CU deep classification tree
First, according to whether the CU continues to be divided into two categories: class I and class II; if the classification tree result is type I, finishing CU division in advance; if the classification tree result is type II, continuing CU classification; the specific classification tree generation method is as follows:
similar to the method for generating the classification tree in the step (II), firstly, processing a final spatial domain CU characteristic sample set under each depth, randomly selecting half samples to form a training set, forming a verification set by the other half samples, generating an initial spatial domain CU depth classification tree by adopting a classification and regression tree (CART) based on the two sets, then setting the maximum depth of the classification tree to be not more than 10, adopting the verification set to evaluate the leaf node accuracy of the obtained classification tree and pruning, and stopping dividing nodes once the total number of the training samples in the leaf node is less than or equal to 1% of the verification set; then, node-by-node pruning is carried out on the tree structure, leaf nodes with the sample number marked as 0 accounting for 85% or more of the total node sample number are marked as division stopping nodes, and the labels are set to be 0; marking other leaf nodes as continuous division nodes, setting the labels as 1, and if the labels of two leaf nodes under the same father node are all 1, cutting the leaf nodes; respectively obtaining final spatial domain CU depth classification trees on the depths 0,1 and 2;
step (VI), method for realizing rapid CU division by using time domain CU depth classification tree and space domain CU depth classification tree
For different CU depths, the spatial-domain CU depth classification tree generated in the step (IV) is used, the classification is divided according to a classification variable and a threshold value on each node and a corresponding leaf node, the classification tree is realized in the inter-frame coding of the HM by using an if-else programming method, and a specific implementation method refers to a specific implementation method below;
after the time domain CU deep classification tree is carried out, if the classification result of the CU is a time domain similar CU, the depth of the CU in the same position of the previous frame is assigned to the current CU to serve as the optimal depth of the current CU, and meanwhile, the continuous division of the CU is stopped; if the classification result obtained by the CU is a non-time domain similar CU, continuously performing classification selection by using a spatial domain CU deep classification tree; extracting the corresponding spatial domain CU features in claim 3 from a depth 0,1,2, passing the features through a spatial domain CU depth classification tree corresponding to the CU depth, obtaining a final spatial domain CU classification result according to the finally obtained leaf node label, and determining a CU with a leaf node label of 0 as a class I CU; for a CU with a leaf node label of 1, the CU is determined as a II-type CU; for the class I CU, the partition of the CU is terminated in advance; for a class ii CU, the fast PU selection method based on the inter mode classification tree continues.
7. The multi-level classification-based HEVC fast inter-frame coding method of claim 6, wherein the fast PU selection method based on the inter-mode classification tree comprises the steps of:
step (1) of generating a feature sample set of a PU inter mode
After the PU mode decisions for 2N×2N and Merge are executed, the features of the PU inter modes are extracted at CU depths 0, 1, 2 and 3 respectively, and a Flag_mode is set and initialized to 0 as the flag bit for PU mode selection; for CU depths 0, 1 and 2, a Flag_mode of 0 indicates that the selection of all remaining PU modes is skipped, a Flag_mode of 1 indicates that all SMP modes are executed and all AMP modes are skipped, and a Flag_mode of 2 indicates that all PU modes are executed; for CU depth 3, since AMP mode selection is not performed, only two classes are used: a Flag_mode of 0 indicates that all remaining PU mode selections are skipped, and a Flag_mode of 2 indicates that all PU modes are performed;
when all PU mode selections are finished, extracting features and judging the best PU mode, if the best PU mode is 2 Nx 2N or Merge, marking a flag bit as 0 and the obtained feature output sample, if the best PU mode is an SMP mode, marking the flag bit as 1 and the obtained feature output sample, and if the best PU mode is an AMP mode, marking the flag bit as 2 and the obtained feature output sample;
generating feature sample sets of respective PU inter modes at different CU depths; for the feature sample set of each CU depth, randomly obtaining the same number of samples from the samples with Flag _ mode 0,1 and 2, and combining the samples to form a final feature sample set of the PU inter mode, where the number of samples with Flag _ mode of 0, flag _ mode of 1, and Flag _ mode of 2 in the sample set is equal; finally, obtaining a feature sample set of a PU inter-frame mode under each CU depth;
step (2) generating PU inter-frame mode classification tree
Firstly, according to whether the PU continues to execute PU mode selection, PUs are divided into three categories: class I, class II and class III; if the PU inter-frame mode classification tree result is class I, the selection of all remaining PU modes is skipped; if the classification tree result is class II, all SMP modes are executed and all AMP modes are skipped; if the classification tree result is class III, all PU modes are executed; the specific method for generating the PU inter-frame mode classification tree is as follows:
similar to the method for generating the classification tree in the step (2) of the CU part, firstly processing feature sample sets of a final PU inter mode at each CU depth, randomly selecting half of samples to form a training set, and the other half of samples to form a verification set, based on the two sets, generating an initial PU inter mode classification tree by using classification and regression trees, then setting the maximum depth of the classification tree to be not more than 10, and using the verification set for evaluating the leaf node accuracy of the obtained classification tree and pruning, and stopping dividing the nodes once the total number of training samples in the leaf nodes is less than or equal to 1% of the size of the verification set; then, pruning the tree structure one by one leaf node, and marking leaf nodes with the sample number marked as 0 accounting for 85% of the total node sample number as skipping all PU modes on CU depths of 0,1 and 2, and setting the labels as 0; adding the node accuracy rate of the number of samples marked as 0 and the number of samples marked as 1 to the rest leaf nodes, marking to execute all SMP modes if the node accuracy rate after the addition is greater than or equal to 85%, skipping all AMP modes, and setting a label as 1; setting the labels to be 2 in the rest, namely executing all PU mode division;
for CU depth 3, because AMP mode selection is not performed by default, the leaf node that has a sample number marked 0 in proportion to the total node sample number and is greater than or equal to 85% is marked as skipping all PU modes, and set to a flag of 0; setting the labels to be 2 in the rest, namely executing all PU mode division;
if the labels of two leaf nodes under the same father node are both 2, the labels are cut off; respectively obtaining final PU inter-frame mode classification trees on the depths 0,1,2 and 3 of the CU;
step (3) method for realizing rapid PU selection by using PU inter-frame mode classification tree
For different CU depths, the PU inter-frame mode classification tree generated in the step (2) is used, the classification is divided according to the classification variable and the threshold value on each node and the corresponding leaf node, the classification tree is realized in the inter-frame coding of the HM by using an if-else programming method, and the specific implementation method refers to the specific implementation method below;
extracting the features of the corresponding PU inter mode in claim 4 in CU depth 0,1,2 respectively, passing the features through the PU inter mode classification tree of the corresponding depth, obtaining the classification result of the final PU mode according to the finally obtained leaf node label, and determining the PU with the leaf node label of 0 as an I-type PU; for a CU with a leaf node label of 1, the CU is determined as a class II PU; for a CU with a leaf node label of 2, the CU is determined as a type III PU; for class I PUs, skip all remaining PU modes; for class II PUs, all SMP modes are executed, and all AMP modes are skipped; executing all PU modes aiming at the class III PU;
in depth 3, extracting the features of the PU inter mode in claim 4, passing the features through the PU mode selection classification tree of depth 3, and obtaining the classification result of the final PU mode according to the label of the finally obtained leaf node, because the calculation of AMP is not executed in depth 3, there are only two classes, and the PU with the label of leaf node being 0 is defined as I-class PU; for a CU with a leaf node label of 2, the CU is determined as a type III PU; skipping over all remaining PU modes for class I PUs; all PU modes are performed for class III PUs.
8. The multi-level classification-based HEVC fast inter-frame coding method of claim 7, wherein the fast TU partitioning method based on TU depth classification tree comprises the steps of:
step (1) generating TU characteristic sample set
For each PU mode, dividing TU is required to be executed, TU characteristics are respectively extracted at the depths of 0,1,2 and 3 of a CU, a Flag _ TU is set and initialized to be 0 and used as a Flag bit for the early termination of the TU, and the Flag _ TU is 0 and represents that the TU division is terminated in advance; if the Flag _ TU is 1, the TU depth division is continuously performed;
when the rate distortion costs of the parent TU and the child TUs are compared, the category of the TU is judged and the corresponding division features are extracted; if the sum of the rate distortion costs of the child TUs is less than the rate distortion cost of the parent TU, the child TUs need to be divided further, and Flag_TU is marked as 1; if the sum of the rate distortion costs of the child TUs is greater than or equal to the rate distortion cost of the current parent TU, the child TUs do not need to be divided further, and Flag_TU is marked as 0;
generating respective TU feature sample sets by different CU depths; for a TU characteristic sample set of each CU depth, randomly acquiring samples with Flag _ TU of 0 and 1 in the same number respectively, and combining to form a final TU characteristic sample set, wherein the number of the samples with Flag _ TU of 0 and Flag _ TU of 1 in the sample set is equal; obtaining a TU characteristic sample set under each CU depth;
step (2) generating TU deep classification tree
Firstly, two categories are continuously divided according to TU or not: class I and class II; if the classification tree result is type I, finishing TU division in advance; if the classification tree result is type II, TU division is continued; the specific classification tree generation method is as follows:
similar to the method for generating the classification tree in the step (2) of the CU module, firstly processing a TU feature sample set under each CU depth, randomly selecting half of samples to form a training set, and forming a verification set for the other half of samples, based on the two sets, generating an initial TU depth classification tree by using classification and regression trees, then setting the maximum depth of the classification tree to be not more than 10, using the verification set for evaluating the leaf node accuracy of the obtained classification tree and pruning, and stopping dividing the nodes once the total number of training samples in the leaf node is less than or equal to 1% of the verification set size; then, node-by-node pruning is carried out on the tree structure, leaf nodes with the sample number marked as 0 accounting for 85% or more of the total node sample number are marked as division stopping nodes, and the labels are set to be 0; marking other leaf nodes as continuous division nodes, setting the labels as 1, and if the labels of two leaf nodes under the same father node are all 1, cutting the leaf nodes; respectively obtaining final TU depth classification trees on CU depths 0,1,2 and 3;
step (3) realizing rapid TU division by using TU deep classification tree
For different CU depths, the TU depth classification tree generated in the step (2) is used, the classification is divided according to the classification variable and the threshold value on each node and the corresponding leaf node, the classification tree is realized in the inter-frame coding of the HM by using an if-else programming method, and the specific implementation method refers to the specific implementation method below;
extracting TU characteristics in claim 5 from CU depths 0,1,2 and 3, passing the TU characteristics through a TU depth classification tree corresponding to the CU depths, obtaining a final early termination classification result according to a finally obtained leaf node label, and determining TUs with leaf node labels of 0 as class I TUs; for TUs with leaf node labels of 1, determining the TUs as class II TUs; for the class I TU, the division of the TU is terminated in advance; and aiming at the TU of the class II, continuously dividing the TU.
CN201910344082.4A 2019-04-26 2019-04-26 HEVC (high efficiency video coding) rapid inter-frame coding method based on multi-level classification Active CN110049338B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910344082.4A CN110049338B (en) 2019-04-26 2019-04-26 HEVC (high efficiency video coding) rapid inter-frame coding method based on multi-level classification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910344082.4A CN110049338B (en) 2019-04-26 2019-04-26 HEVC (high efficiency video coding) rapid inter-frame coding method based on multi-level classification

Publications (2)

Publication Number Publication Date
CN110049338A CN110049338A (en) 2019-07-23
CN110049338B true CN110049338B (en) 2023-04-18

Family

ID=67279621

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910344082.4A Active CN110049338B (en) 2019-04-26 2019-04-26 HEVC (high efficiency video coding) rapid inter-frame coding method based on multi-level classification

Country Status (1)

Country Link
CN (1) CN110049338B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110730350B (en) * 2019-09-25 2021-08-24 杭州电子科技大学 SHVC (scalable high-speed coding) quick coding method combining coding depth estimation and Bayesian judgment
CN110913232B (en) * 2019-11-29 2021-09-14 北京数码视讯软件技术发展有限公司 Selection method and device of TU division mode and readable storage medium
CN111818332A (en) * 2020-06-09 2020-10-23 复旦大学 Fast algorithm for intra-frame prediction partition judgment suitable for VVC standard
CN111950587B (en) * 2020-07-02 2024-04-16 北京大学深圳研究生院 Intra-frame coding block dividing processing method and hardware device
CN112437310B (en) * 2020-12-18 2022-07-08 重庆邮电大学 VVC intra-frame coding rapid CU partition decision method based on random forest

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103873861B (en) * 2014-02-24 2017-01-25 西南交通大学 Coding mode selection method for HEVC (high efficiency video coding)
CN104796693B (en) * 2015-04-01 2017-08-25 南京邮电大学 A kind of quick CU depth of HEVC divides coding method
US10645385B2 (en) * 2017-06-28 2020-05-05 Mediatek Inc. Method and apparatus for performing fixed-size slice encoding with slice boundary prediction
CN107864380B (en) * 2017-12-14 2020-08-11 杭州电子科技大学 3D-HEVC fast intra-frame prediction decision method based on DCT

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
吴进福.面向HEVC的快速编码优化算法研究.中国博士学位论文全文数据库.2019, 全文. *

Also Published As

Publication number Publication date
CN110049338A (en) 2019-07-23

Similar Documents

Publication Publication Date Title
CN110049338B (en) HEVC (high efficiency video coding) rapid inter-frame coding method based on multi-level classification
CN103873861B (en) Coding mode selection method for HEVC (high efficiency video coding)
DE102016125117B4 (en) Motion vector coding with dynamic reference motion vectors
Zhang et al. Fast coding unit depth decision algorithm for interframe coding in HEVC
CN104023233B (en) Fast inter-frame prediction method of HEVC (High Efficiency Video Coding)
CN104396247B (en) Method and apparatus for the adaptive loop filter based on LCU of video coding
CN104378643B (en) A kind of 3D video depths image method for choosing frame inner forecast mode and system
CN111355956B (en) Deep learning-based rate distortion optimization rapid decision system and method in HEVC intra-frame coding
CN103957415B (en) CU dividing methods and device based on screen content video
CN103517069A (en) HEVC intra-frame prediction quick mode selection method based on texture analysis
CN106162167A (en) Efficient video coding method based on study
CN105120290B (en) A kind of deep video fast encoding method
CN106464855A (en) Method and device for providing depth based block partitioning in high efficiency video coding
CN106131554A (en) The HEVC point self-adapted compensation method of quick sample product based on major side direction
CN103533355B (en) A kind of HEVC fast encoding method
CN103118262B (en) Rate distortion optimization method and device, and video coding method and system
CN104702958A (en) HEVC intraframe coding method and system based on spatial correlation
CN109729351B (en) HEVC (high efficiency video coding) rapid mode selection method under low complexity configuration
CN104853191A (en) HEVC fast coding method
CN101969561A (en) Intra-frame mode selection method and device and encoder
CN101588487B (en) Video intraframe predictive coding method
CN105657420A (en) HEVC-oriented fast intra-frame prediction mode decision method and device
CN116489386A (en) VVC inter-frame rapid coding method based on reference block
CN103533349A (en) Support vector machine-based fast inter-frame prediction macro block mode selection method for B frame
CN109151467B (en) Screen content coding inter-frame mode rapid selection method based on image block activity

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20231222

Address after: 509 Kangrui Times Square, Keyuan Business Building, 39 Huarong Road, Gaofeng Community, Dalang Street, Longhua District, Shenzhen, Guangdong Province, 518000

Patentee after: Shenzhen lizhuan Technology Transfer Center Co.,Ltd.

Address before: 310018 No. 2 street, Xiasha Higher Education Zone, Hangzhou, Zhejiang

Patentee before: HANGZHOU DIANZI University