WO2018219020A1 - Method and apparatus for encoding and decoding a video image - Google Patents

Method and apparatus for encoding and decoding a video image

Info

Publication number
WO2018219020A1
Authority
WO
WIPO (PCT)
Prior art keywords
pixels
node
width
coding tree
upper left
Prior art date
Application number
PCT/CN2018/079658
Other languages
English (en)
French (fr)
Inventor
赵寅
杨海涛
高山
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2018219020A1
Priority to US16/689,550 (US10911788B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 Tree coding, e.g. quad-tree coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel

Definitions

  • the present application relates to the field of video image technologies, and in particular, to a codec method and apparatus for video images.
  • the H.265 video coding standard divides a frame of image into coding tree units (CTUs) that do not overlap each other, and uses a quad-tree (QT)-based CTU partitioning method in which the CTU serves as the root node of a quadtree.
  • starting from the root node, the CTU is recursively divided into a number of leaf nodes according to the quadtree division.
  • a node corresponds to an image region. If the node is not divided, it is called a leaf node, and its corresponding image region forms a coding unit (CU). If the node is divided further, its image region is divided into four areas of the same size (each with half the width and half the height of the divided area); each area corresponds to a node, and it must then be determined whether each of these nodes is divided further.
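The recursive quadtree division described above can be sketched as follows. This is a minimal illustration, not the standard's decision process: the `should_split` callback, the `min_size` parameter, and the toy split rule are all assumptions introduced here.

```python
from typing import Callable, List, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def quadtree_leaves(region: Region,
                    should_split: Callable[[Region], bool],
                    min_size: int = 8) -> List[Region]:
    """Recursively divide a square region into four equal quadrants.

    A node that is not divided is a leaf node; its region forms one CU.
    `should_split` and `min_size` are hypothetical stand-ins for the
    decision that a real codec reads from the bitstream or searches for.
    """
    x, y, w, h = region
    if w <= min_size or not should_split(region):
        return [region]  # leaf node -> coding unit
    half_w, half_h = w // 2, h // 2
    leaves: List[Region] = []
    for dy in (0, half_h):
        for dx in (0, half_w):
            child = (x + dx, y + dy, half_w, half_h)
            leaves += quadtree_leaves(child, should_split, min_size)
    return leaves


# Toy rule: keep dividing only the node anchored at the CTU origin
# while it is wider than 16 pixels.
cus = quadtree_leaves((0, 0, 64, 64),
                      lambda r: r[0] == 0 and r[1] == 0 and r[2] > 16)
```

With this toy rule the 64x64 CTU ends up as three 32x32 CUs and four 16x16 CUs, and the CU areas sum to the CTU area because the quadrants never overlap.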
  • JVET: Joint Exploration Team on Future Video Coding
  • JEM: Joint Exploration Model, the JVET reference software
  • Binary tree partitioning and quadtree partitioning can be cascaded, referred to as QTBT partitioning.
  • for example, in QTBT partitioning a CTU is first divided according to QT, and the QT leaf nodes are allowed to continue with BT partitioning. Through BT, this method can generate rectangular CUs in addition to square ones. However, because one BT division turns a node into two half-size nodes, dividing into smaller CUs leads to too many division levels; moreover, QT and BT are cascaded, i.e., QT leaf nodes can be divided by BT, but BT leaf nodes can no longer use QT partitioning.
  • the embodiment of the present application provides a method and a device for encoding and decoding video images to improve compression efficiency of codecs.
  • a first aspect provides a decoding method for a video image, where a coding tree node is used to represent a rectangular image region to be decoded in the video image, a lower-level node is used to represent a partial rectangular image region within the rectangular image region to be decoded, and the image regions indicated by different lower-level nodes do not overlap each other.
  • when the coding tree node is a coding unit of the video image, the coding tree node does not include lower-level nodes; the method includes:
  • the candidate division mode set includes a first division mode indicating that the coding tree node is a coding unit of the video image, a second division mode in which the coding tree node is composed of two equal-sized lower-level nodes with an aspect ratio of 2, a third division mode in which the coding tree node is composed of two equal-sized lower-level nodes with an aspect ratio of 0.5, and a fourth division mode in which the coding tree node is composed of four equal-sized lower-level nodes with an aspect ratio of 1.
  • when the aspect ratio of the lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the candidate division mode set of the lower-level node is the same as that of the coding tree node;
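The four basic division modes can be sketched as below. This is an illustrative sketch under stated assumptions: "aspect ratio" is taken to mean width divided by height, the function names are hypothetical, and the fallback for nodes that are not square or not wider than the threshold (only the first mode) is an assumption, since the text here does not specify it.

```python
from typing import List, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def candidate_modes(region: Region, threshold: int) -> List[int]:
    """Return the candidate division modes (1..4) for a node."""
    _, _, w, h = region
    if w == h and w > threshold:
        return [1, 2, 3, 4]
    return [1]  # assumption: not stated in the text for other nodes


def split(region: Region, mode: int) -> List[Region]:
    """Return the lower-level nodes produced by one of the four basic modes."""
    x, y, w, h = region
    hw, hh = w // 2, h // 2
    if mode == 1:   # the node itself is a coding unit, no lower-level nodes
        return []
    if mode == 2:   # two equal nodes, width:height = 2 (stacked vertically)
        return [(x, y, w, hh), (x, y + hh, w, hh)]
    if mode == 3:   # two equal nodes, width:height = 0.5 (side by side)
        return [(x, y, hw, h), (x + hw, y, hw, h)]
    if mode == 4:   # four equal nodes, width:height = 1
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    raise ValueError(f"unknown division mode: {mode}")
```

For a square node, the children of the fourth mode are again square, so if they are wider than the threshold they receive the same candidate set as their parent, which is what makes the structure recursive.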
  • the beneficial effect is that the above decoding scheme decodes the coding tree node by using a hybrid coding partition tree structure based on multiple division modes, and this decoding scheme allows more CU shapes than the quadtree partitioning method;
  • compared with the binary tree division method, it reduces the number of division levels; compared with the quadtree-cascaded-binary-tree division method, the division mode information is simplified and more division modes are allowed, so that higher compression efficiency can be achieved.
  • the partitioning mode is used to determine at least one of a number, a size, and a distribution of lower-level nodes constituting the coding tree node.
  • the coding tree node has a width of 4M pixels, where M is a positive integer; with the upper left corner of the coding tree node as the origin, the rightward direction as the horizontal positive direction, and the downward direction as the vertical positive direction, the candidate division mode set further includes:
  • a fifth division mode, in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; and a lower-level node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels; or,
  • a sixth division mode, in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; and a lower-level node with upper left corner (0, 2M), width 4M pixels, and height 2M pixels; or
  • a seventh division mode, in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 4M pixels, and height 2M pixels; a lower-level node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; and a lower-level node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels; or
  • a division mode in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; and a lower-level node with upper left corner (2M, 0), width 2M pixels, and height 4M pixels
  • a ninth division mode, in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 2M pixels, and height 4M pixels; a lower-level node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; and a lower-level node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels; or
  • a tenth division mode, in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (0, 2M), width 4M pixels, and height M pixels; and
  • a lower-level node with upper left corner (0, 3M), width 4M pixels, and height M pixels; or
  • a division mode in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 4M pixels, and height M pixels; a lower-level node with upper left corner (0, M), width 4M pixels, and height M pixels; a lower-level node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels.
  • a division mode in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (2M, 0), width M pixels, and height 4M pixels
  • a lower-level node with width M pixels and height 4M pixels; a lower-level node with upper left corner (M, 0), width M pixels, and height 4M pixels; a lower-level node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels;
  • and a lower-level node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels, forming a thirteenth division mode; or
  • a fourteenth division mode, in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 4M pixels, and height M pixels; a lower-level node with upper left corner (0, M), width 4M pixels, and height M pixels; a lower-level node with upper left corner (0, 2M), width 4M pixels, and height M pixels; and a lower-level node with upper left corner (0, 3M), width 4M pixels, and height M pixels; or
  • a fifteenth division mode, composed of: a lower-level node with width M pixels and height 4M pixels; a lower-level node with upper left corner (M, 0), width M pixels, and height 4M pixels; a lower-level node with upper left corner (2M, 0), width M pixels, and height 4M pixels; and a lower-level node with upper left corner (3M, 0), width M pixels, and height 4M pixels; or
  • a division mode in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 4M pixels, and height M pixels; a lower-level node with upper left corner (0, M), width 4M pixels, and height 2M pixels; and a lower-level node with upper left corner (0, 3M), width 4M pixels, and height M pixels.
  • a division mode composed of: a lower-level node with width M pixels and height 4M pixels; a lower-level node with upper left corner (M, 0), width 2M pixels, and height 4M pixels; and a lower-level node with upper left corner (3M, 0), width M pixels, and height 4M pixels.
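Because the lower-level nodes of a division mode must not overlap, each mode can be checked mechanically. The sketch below encodes four of the listed modes (seventh, ninth, tenth, fourteenth) as rectangles in units of M and verifies that each one tiles the node exactly. It assumes the node is square, i.e., 4M pixels high as well as wide; the names `MODES`, `tiles_node`, and `to_pixels` are hypothetical.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), in units of M

# Lower-level nodes of four of the listed division modes, in units of M.
MODES: Dict[str, List[Rect]] = {
    "seventh": [(0, 0, 4, 2), (0, 2, 2, 2), (2, 2, 2, 2)],
    "ninth": [(0, 0, 2, 4), (2, 0, 2, 2), (2, 2, 2, 2)],
    "tenth": [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 4, 1), (0, 3, 4, 1)],
    "fourteenth": [(0, 0, 4, 1), (0, 1, 4, 1), (0, 2, 4, 1), (0, 3, 4, 1)],
}


def tiles_node(rects: List[Rect], size: int = 4) -> bool:
    """True if the rectangles cover a size x size node with no overlap."""
    covered = set()
    for x, y, w, h in rects:
        for cx in range(x, x + w):
            for cy in range(y, y + h):
                inside = 0 <= cx < size and 0 <= cy < size
                if not inside or (cx, cy) in covered:
                    return False  # out of bounds or overlapping
                covered.add((cx, cy))
    return len(covered) == size * size


def to_pixels(rects: List[Rect], m: int) -> List[Rect]:
    """Scale unit rectangles to pixel coordinates for a given M."""
    return [(x * m, y * m, w * m, h * m) for x, y, w, h in rects]
```

For example, with M = 8 the seventh mode divides a 32x32 node into one 32x16 node on top and two 16x16 nodes below.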
  • the beneficial effect is that, because the candidate division mode set may include the plurality of division modes described above, the coding tree node can be divided according to any of them, with the advantage that the number of division levels is small and many CU shapes are available.
  • the partitioning mode is used to determine a subordinate node constituting the coding tree node, and further includes:
  • the division mode is used to determine a decoding order of a plurality of lower nodes constituting the coding tree node
  • the fourth division mode includes: a first sub-mode in which the four equal-sized lower-level nodes with an aspect ratio of 1 are decoded in clockwise order, and a second sub-mode in which the four equal-sized lower-level nodes with an aspect ratio of 1 are decoded in counterclockwise order, wherein the candidate division mode set includes at least the first sub-mode of the fourth division mode.
  • the beneficial effect is that the decoding order of the plurality of lower nodes is determined by the division mode, and the decoding efficiency is improved.
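A division mode that also fixes the decoding order of its lower-level nodes can be illustrated with the two orderings of the fourth division mode. This is only a sketch: the text does not say which node is decoded first, so starting at the upper left node is an assumption, and `ORDERS` and `quad_decoding_order` are hypothetical names.

```python
from typing import Dict, List, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height) in pixels

# Assumption: both orders start from the upper left (TL) lower-level node.
ORDERS: Dict[str, Tuple[str, ...]] = {
    "clockwise": ("TL", "TR", "BR", "BL"),
    "counterclockwise": ("TL", "BL", "BR", "TR"),
}


def quad_decoding_order(x: int, y: int, size: int, sub_mode: str) -> List[Region]:
    """Return the four equal square lower-level nodes in decoding order."""
    half = size // 2
    corners = {
        "TL": (x, y, half, half),
        "TR": (x + half, y, half, half),
        "BR": (x + half, y + half, half, half),
        "BL": (x, y + half, half, half),
    }
    return [corners[name] for name in ORDERS[sub_mode]]
```

Both sub-modes produce the same four regions; only the order in which they are visited differs.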
  • the partitioning mode information is represented using a first syntax element, the first syntax element being used to indicate an identification of the obtained partitioning mode in the candidate partitioning mode set.
  • the coding tree node needs to divide the lower-level nodes according to the division mode indicated by the first syntax element. Specifically, the coding tree node may be divided into two or three or four lower-level nodes.
  • the division mode information is represented by a second syntax element and a third syntax element, where the second syntax element indicates whether the obtained division mode is the first division mode; when the second syntax element indicates that the obtained division mode is not the first division mode, the third syntax element indicates the identifier of the obtained division mode among the division modes in the candidate division mode set other than the first division mode.
  • the coding tree node needs to divide the lower-level nodes according to the division mode indicated by the third syntax element. Specifically, the coding tree node may be divided into two or three or four lower-level nodes.
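The two signalling variants above can be sketched as a mapping from already-parsed syntax element values to a division mode. This is a hedged illustration: the identifiers are assumed to be zero-based indices, the value 1 of the second syntax element is assumed to mean "first division mode", and the binarization and entropy coding of the elements are not modelled. All names are hypothetical.

```python
from typing import Optional, Sequence

# Hypothetical candidate set; the first entry is the first division mode.
CANDIDATES: Sequence[str] = ("first", "second", "third", "fourth")


def mode_from_one_element(first_element: int,
                          candidates: Sequence[str] = CANDIDATES) -> str:
    """Variant 1: a single syntax element indexes the whole candidate set."""
    return candidates[first_element]


def mode_from_two_elements(second_element: int,
                           third_element: Optional[int],
                           candidates: Sequence[str] = CANDIDATES) -> str:
    """Variant 2: a flag for the first mode, then an index into the rest."""
    if second_element == 1:      # assumption: 1 means "not divided further"
        return candidates[0]
    if third_element is None:
        raise ValueError("third syntax element is required when the flag is 0")
    remaining = candidates[1:]   # candidate set without the first mode
    return remaining[third_element]
```

The second variant lets a decoder learn with one flag that a node is a coding unit, and only reads a mode index when the node is actually divided.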
  • the parsing the code stream to obtain the coding information of the coding tree node includes:
  • when the obtained division mode is not the first division mode, parsing the code stream to obtain coding information of a lower-level node of the coding tree node, where, when the aspect ratio of the lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the coding information of the lower-level node includes the division mode information of the lower-level node;
  • the reconstructing the pixel value of the coding tree node according to the division mode information and the coding information of the coding tree node includes:
  • before parsing the division mode information of the coding tree node in the code stream, the method further includes:
  • the beneficial effect is that the division mode included in the candidate division mode set is determined by the indication information, so that a plurality of possible candidate division mode sets can be flexibly set, and the flexibility is better.
  • before parsing the division mode information of the coding tree node in the code stream, the method further includes:
  • the coding tree node is located within an image range of the video image.
  • a method for encoding a video image including:
  • Rate distortion cost is a sum of rate distortion costs of all coding units obtained according to a corresponding division mode
  • the candidate division mode set includes a first division mode indicating that the coding tree node is a basic coding unit of the video image, a second division mode in which the coding tree node is composed of two equal-sized lower-level nodes with an aspect ratio of 2, a third division mode in which the coding tree node is composed of two equal-sized lower-level nodes with an aspect ratio of 0.5, and a fourth division mode in which the coding tree node is composed of four equal-sized lower-level nodes
  • the coding tree node is allowed to be divided according to the candidate division mode set, which has the advantages of a small number of division levels and many available CU shapes.
  • the candidate partition mode set may also be set.
  • when the candidate division mode set allows more division modes, the encoder can try more division modes, and the compression performance is better; when the candidate division mode set allows fewer division modes,
  • the encoder tries fewer division modes, and the computational complexity is lower.
  • each coding unit constituting the coding tree node is determined based on the target division mode, including:
  • the preset thresholds are obtained to obtain the respective coding units constituting the coding tree node.
  • the beneficial effect is that the lower-level nodes of the coding tree node are allowed to be divided according to the candidate division mode set, which has the advantages of a small number of division levels and many available CU shapes.
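The encoder-side selection described above, where the rate-distortion cost of a division mode is the sum of the costs of all coding units it yields and the mode with the minimum cost is chosen, can be sketched recursively. The sketch is limited to the four basic modes and makes several assumptions: `cu_cost` is a hypothetical stand-in for a real rate-distortion measurement, and nodes that are not square or not wider than the threshold are simply treated as coding units.

```python
from typing import Callable, List, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def split(region: Region, mode: int) -> List[Region]:
    """Lower-level nodes for basic modes 2, 3, and 4."""
    x, y, w, h = region
    hw, hh = w // 2, h // 2
    if mode == 2:
        return [(x, y, w, hh), (x, y + hh, w, hh)]
    if mode == 3:
        return [(x, y, hw, h), (x + hw, y, hw, h)]
    return [(x, y, hw, hh), (x + hw, y, hw, hh),
            (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]


def best_partition(region: Region,
                   cu_cost: Callable[[Region], float],
                   threshold: int) -> Tuple[float, List[Region]]:
    """Return (minimum cost, coding units) for a coding tree node."""
    _, _, w, h = region
    best_cost, best_cus = cu_cost(region), [region]  # first mode: node is a CU
    if w != h or w <= threshold:
        return best_cost, best_cus  # assumption: such nodes are not divided
    for mode in (2, 3, 4):
        total, cus = 0.0, []
        for child in split(region, mode):
            child_cost, child_cus = best_partition(child, cu_cost, threshold)
            total += child_cost   # cost of a mode = sum over its coding units
            cus += child_cus
        if total < best_cost:
            best_cost, best_cus = total, cus
    return best_cost, best_cus


# Toy cost: a CU straddling the vertical line x = 32 costs its area,
# any other CU costs a fixed overhead of 10.
toy_cost = lambda r: float(r[2] * r[3]) if r[0] < 32 < r[0] + r[2] else 10.0
cost, units = best_partition((0, 0, 64, 64), toy_cost, threshold=8)
```

In this toy setup the search picks the third division mode at the root (two 32x64 nodes side by side), since it avoids any CU crossing the line x = 32 while using the fewest coding units.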
  • a third aspect provides a decoding apparatus for a video image, where a coding tree node is used to represent a rectangular image area to be decoded in the video image, a lower-level node is used to represent a partial rectangular image area within the rectangular image area to be decoded, and the image areas indicated by different lower-level nodes do not overlap each other.
  • when the coding tree node is a coding unit of the video image, the coding tree node does not include lower-level nodes; the apparatus includes:
  • a parsing unit configured to parse the partitioning mode information of the coding tree node in the code stream; and obtain, according to the partitioning mode information, a partitioning mode of the coding tree node from a candidate partitioning mode set of the coding tree node, where
  • the coding tree node has an aspect ratio of 1 and a width greater than a preset threshold, and the candidate division mode set includes a first division mode indicating that the coding tree node is a coding unit of the video image,
  • a second division mode in which the coding tree node is composed of two equal-sized lower-level nodes with an aspect ratio of 2, a third division mode in which the coding tree node is composed of two equal-sized lower-level nodes with an aspect ratio of 0.5, and a fourth division mode in which the coding tree node is composed of four equal-sized lower-level nodes; when the aspect ratio of the lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the candidate division mode set of the lower-level node is the same as that of the coding tree node; and parse the code stream to obtain coding information of the coding tree node;
  • a decoding unit configured to reconstruct a pixel value of the coding tree node according to the division mode information and the coding information of the coding tree node.
  • the partitioning mode is used to determine at least one of a number, a size, and a distribution of lower-level nodes constituting the coding tree node.
  • the coding tree node has a width of 4M pixels, where M is a positive integer; with the upper left corner of the coding tree node as the origin, the rightward direction as the horizontal positive direction, and the downward direction as the vertical positive direction, the candidate division mode set further includes:
  • a fifth division mode, in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; and a lower-level node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels; or,
  • a sixth division mode, in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; and a lower-level node with upper left corner (0, 2M), width 4M pixels, and height 2M pixels; or
  • a seventh division mode, in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 4M pixels, and height 2M pixels; a lower-level node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; and a lower-level node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels; or
  • a division mode in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; and a lower-level node with upper left corner (2M, 0), width 2M pixels, and height 4M pixels
  • a ninth division mode, in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 2M pixels, and height 4M pixels; a lower-level node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; and a lower-level node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels; or
  • a tenth division mode, in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (0, 2M), width 4M pixels, and height M pixels; and
  • a lower-level node with upper left corner (0, 3M), width 4M pixels, and height M pixels; or
  • a division mode in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 4M pixels, and height M pixels; a lower-level node with upper left corner (0, M), width 4M pixels, and height M pixels; a lower-level node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels.
  • a division mode in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; a lower-level node with upper left corner (2M, 0), width M pixels, and height 4M pixels
  • a lower-level node with width M pixels and height 4M pixels; a lower-level node with upper left corner (M, 0), width M pixels, and height 4M pixels; a lower-level node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels;
  • and a lower-level node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels, forming a thirteenth division mode; or
  • a fourteenth division mode, in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 4M pixels, and height M pixels; a lower-level node with upper left corner (0, M), width 4M pixels, and height M pixels; a lower-level node with upper left corner (0, 2M), width 4M pixels, and height M pixels; and a lower-level node with upper left corner (0, 3M), width 4M pixels, and height M pixels; or
  • a fifteenth division mode, composed of: a lower-level node with width M pixels and height 4M pixels; a lower-level node with upper left corner (M, 0), width M pixels, and height 4M pixels; a lower-level node with upper left corner (2M, 0), width M pixels, and height 4M pixels; and a lower-level node with upper left corner (3M, 0), width M pixels, and height 4M pixels; or
  • a division mode in which the coding tree node is composed of: a lower-level node with upper left corner (0, 0), width 4M pixels, and height M pixels; a lower-level node with upper left corner (0, M), width 4M pixels, and height 2M pixels; and a lower-level node with upper left corner (0, 3M), width 4M pixels, and height M pixels.
  • a division mode composed of: a lower-level node with width M pixels and height 4M pixels; a lower-level node with upper left corner (M, 0), width 2M pixels, and height 4M pixels; and a lower-level node with upper left corner (3M, 0), width M pixels, and height 4M pixels.
  • the dividing mode is used to determine a lower node that constitutes the coding tree node, and further includes:
  • the division mode is used to determine a decoding order of a plurality of lower nodes constituting the coding tree node
  • the fourth division mode includes: a first sub-mode in which the four equal-sized lower-level nodes with an aspect ratio of 1 are decoded in clockwise order, and a second sub-mode in which the four equal-sized lower-level nodes with an aspect ratio of 1 are decoded in counterclockwise order, wherein the candidate division mode set includes at least the first sub-mode of the fourth division mode.
  • the split mode information is represented by a first syntax element, and the first syntax element is used to indicate an identifier of the obtained split mode in the candidate split mode set.
  • the division mode information is represented by a second syntax element and a third syntax element, where the second syntax element indicates whether the obtained division mode is the first division mode; when the second syntax element indicates that the obtained division mode is not the first division mode, the third syntax element indicates the identifier of the obtained division mode among the division modes in the candidate division mode set other than the first division mode.
  • when parsing the code stream to obtain coding information of the coding tree node, the parsing unit is specifically configured to:
  • when the obtained division mode is not the first division mode, parse the code stream to obtain coding information of a lower-level node of the coding tree node, where, when the aspect ratio of the lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the coding information of the lower-level node includes the division mode information of the lower-level node;
  • when reconstructing the pixel value of the coding tree node according to the division mode information and the coding information of the coding tree node, the decoding unit is specifically configured to:
  • before parsing the division mode information of the coding tree node in the code stream, the parsing unit is further configured to:
  • before parsing the division mode information of the coding tree node in the code stream, the parsing unit is further configured to:
  • the coding tree node is located within an image range of the video image.
  • a fourth aspect provides an apparatus for encoding a video image, including:
  • a first coding unit, configured to encode the coding tree node in the video image according to at least one division mode in the candidate division mode set of the coding tree node, to obtain a rate-distortion cost corresponding to each of the at least one division mode, the rate-distortion cost being the sum of the rate-distortion costs of all coding units obtained according to the corresponding division mode, wherein the coding tree node has an aspect ratio of 1 and a width greater than a preset threshold, and the candidate division mode set includes a first division mode indicating that the coding tree node is a basic coding unit of the video image, a second division mode in which the coding tree node is composed of two equal-sized lower-level nodes with an aspect ratio of 2, a third division mode in which the coding tree node is composed of two equal-sized lower-level nodes with an aspect ratio of 0.5, and
  • a fourth division mode in which the coding tree node is composed of four equal-sized lower-level nodes with an aspect ratio of 1, when the lower-level node has an aspect ratio
  • a determining unit configured to determine a partitioning mode that minimizes a rate-distortion cost as a target partitioning mode of the coding tree node; and determine, according to the target partitioning mode, each coding unit that constitutes the coding tree node;
  • a second coding unit configured to encode the respective coding units to obtain a code stream and a reconstructed image corresponding to the coding tree node.
  • when determining each coding unit constituting the coding tree node based on the target division mode, the determining unit is specifically configured to:
  • the preset thresholds are obtained to obtain the respective coding units constituting the coding tree node.
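  • The encoder-side selection described for the first coding unit and the determining unit (evaluate each candidate division mode, sum the rate-distortion costs of the resulting coding units, keep the mode with the smallest sum) can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `choose_division_mode`, the mode labels, and the `toy_cost` function are hypothetical, and a real encoder would compute the cost by actually encoding each coding unit.

```python
from typing import Callable, Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) of one coding unit


def choose_division_mode(
    candidates: Dict[str, List[Rect]],
    cu_cost: Callable[[Rect], float],
) -> Tuple[str, float]:
    """Return the candidate mode whose coding units have the smallest summed cost."""
    if not candidates:
        raise ValueError("candidate division mode set is empty")
    best_mode, best_cost = "", float("inf")
    for mode, units in candidates.items():
        # The cost of a mode is the sum of the costs of all its coding units.
        total = sum(cu_cost(unit) for unit in units)
        if total < best_cost:
            best_mode, best_cost = mode, total
    return best_mode, best_cost


def toy_cost(unit: Rect) -> float:
    """Hypothetical stand-in for a rate-distortion cost: area plus a fixed overhead."""
    _, _, width, height = unit
    return width * height + 10.0


candidates = {
    "no_split": [(0, 0, 8, 8)],
    "quad": [(0, 0, 4, 4), (4, 0, 4, 4), (0, 4, 4, 4), (4, 4, 4, 4)],
}
best_mode, best_cost = choose_division_mode(candidates, toy_cost)
```

  • With this toy cost the fixed per-unit overhead penalizes splitting, so the unsplit mode is selected; with a real rate-distortion cost the outcome depends on the image content.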
  • a decoding device comprising a processor and a memory, wherein the memory stores a computer-readable program, and the processor implements the decoding method of the first aspect by running the program in the memory.
  • an encoding device comprising a processor and a memory, wherein the memory stores a computer-readable program, and the processor implements the encoding method of the second aspect by running the program in the memory.
  • a computer storage medium for storing the computer software instructions used in the first aspect or the second aspect, including a program designed to perform the above aspects.
  • FIGS. 1A and 1B are schematic block diagrams of a video codec device or an electronic device
  • FIG. 2 is a schematic block diagram of a video codec system
  • FIG. 3 is a schematic diagram of node division of a QTBT division manner
  • FIG. 4 is a flowchart of a method for decoding a video image according to an embodiment of the present application
  • FIG. 5 is a schematic diagram of a division mode in the embodiment of the present application.
  • FIG. 6 is a schematic diagram of a decoding process for at least one CTU in a video image
  • FIG. 7 is a flowchart of a method for encoding a video image according to an embodiment of the present application.
  • FIG. 8 is a structural diagram of a video image decoding apparatus according to an embodiment of the present application.
  • FIG. 9 is a structural diagram of a decoder of a video image in an embodiment of the present application.
  • FIG. 10 is a structural diagram of an apparatus for encoding a video image according to an embodiment of the present application.
  • FIG. 11 is a structural diagram of an encoder of a video image in an embodiment of the present application.
  • CTU (coding tree unit): An image consists of multiple CTUs.
  • a CTU usually corresponds to a square image area and contains the luma pixels and chroma pixels in that area (or may contain only luma pixels, or only chroma pixels)
  • the CTU also includes syntax elements that indicate how to divide the CTU into at least one CU and how to decode each coding unit to obtain a reconstructed image.
  • CU (coding unit): The most basic unit for encoding; it is not split further.
  • a CU usually corresponds to an A×B rectangular area, including A×B luminance pixels and the corresponding chrominance pixels, where A is the width of the rectangle and B is its height; A and B can be the same or different, and their values are usually integer powers of 2, such as 256, 128, 64, 32, 16, 8, and 4.
  • decoding a CU yields the reconstructed image of an A×B rectangular area.
  • the decoding process usually includes prediction, inverse quantization, inverse transform, and similar steps, which generate a predicted image and a residual; the predicted image and the residual are superimposed to obtain the reconstructed image.
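  • The final superposition step of CU decoding can be illustrated with a short sketch. It assumes the predicted image and the residual are already available as equally sized 2-D lists of integers; the function name `reconstruct` and the clipping to the valid sample range for a given bit depth are assumptions added for the illustration and are not stated in the text above.

```python
from typing import List

Block = List[List[int]]


def reconstruct(predicted: Block, residual: Block, bit_depth: int = 8) -> Block:
    """Add the residual to the predicted image sample by sample.

    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(predicted, residual)
    ]


predicted = [[100, 250], [3, 128]]
residual = [[5, 10], [-7, 0]]
reconstructed = reconstruct(predicted, residual)
```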
  • the size of the CTU is, for example, 64 x 64, 128 x 128, or 256 x 256.
  • a CTU is divided into a set of CUs that do not overlap each other. This group of CUs covers the entire CTU; a group of CUs includes one or more CUs.
  • One CU includes N rows and N columns of luminance pixels; or N rows and N columns of chrominance pixels; or N rows and N columns of luminance pixels and N/2 rows and N/2 columns of chrominance pixels (such as the YUV420 format); or N rows and N columns of luminance pixels and N rows and N columns of chrominance pixels (such as the YUV444 format); or N rows and N columns of RGB pixels (such as the RGB format).
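  • The pixel counts listed for an N×N CU can be tabulated with a small helper. This is only a sketch: the function name `cu_sample_counts` and the format strings are hypothetical, and the chroma count is reported per chroma component, which is an interpretation of the row and column counts given above.

```python
from typing import Dict


def cu_sample_counts(n: int, pixel_format: str) -> Dict[str, int]:
    """Return the number of samples in an N x N CU for a few pixel formats."""
    if pixel_format == "YUV420":
        # N x N luminance pixels, N/2 x N/2 chrominance pixels per component.
        return {"luma": n * n, "chroma": (n // 2) * (n // 2)}
    if pixel_format == "YUV444":
        # Chrominance has the same resolution as luminance.
        return {"luma": n * n, "chroma": n * n}
    if pixel_format == "RGB":
        return {"rgb": n * n}
    raise ValueError(f"unsupported pixel format: {pixel_format}")


counts_420 = cu_sample_counts(16, "YUV420")
counts_444 = cu_sample_counts(16, "YUV444")
```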
  • Quadtree: A tree structure in which one node can be divided into four child nodes.
  • the H.265 video coding standard adopts a quadtree-based CTU division method.
  • the CTU is used as the root node, and each node corresponds to a square area; a node may be left undivided (in which case its corresponding area is a CU), or it may be divided into four lower-level nodes, that is, the square area is divided into four square areas of the same size (each with half the length and half the width of the area before division), and each area corresponds to one node.
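  • A single quadtree division step, as described above, can be sketched as follows. The function name `quad_split` and the order in which the four sub-areas are returned (left to right, then top to bottom) are assumptions of this example.

```python
from typing import List, Tuple

Square = Tuple[int, int, int]  # (x, y, size) of a square area


def quad_split(x: int, y: int, size: int) -> List[Square]:
    """Divide a square area into four equal squares with half the side length."""
    half = size // 2
    return [
        (x, y, half),                # upper left
        (x + half, y, half),         # upper right
        (x, y + half, half),         # lower left
        (x + half, y + half, half),  # lower right
    ]


children = quad_split(0, 0, 64)
grandchildren = quad_split(*children[3])
```

  • Applying the function recursively to selected children reproduces the H.265-style quadtree, in which every leaf is a square CU.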
  • Binary tree: A tree structure in which a node can be divided into two child nodes.
  • a node on a binary tree structure may be left undivided, or may be divided into two lower-level nodes.
  • Video decoding: The process of restoring a video code stream to reconstructed images according to specific syntax rules and processing methods.
  • Video encoding: The process of compressing a sequence of images into a code stream.
  • JEM (joint exploration model): A new codec reference software developed by the JVET organization after the H.265 standard.
  • FIG. 1A is a schematic block diagram of a video codec device or electronic device 50 that may incorporate a codec in accordance with an embodiment of the present application.
  • FIG. 1B is a schematic diagram of a device for video encoding in accordance with an embodiment of the present application. The units in FIGS. 1A and 1B are explained below.
  • the electronic device 50 can be, for example, a mobile terminal or user equipment of a wireless communication system. It should be understood that embodiments of the present application may be implemented in any electronic device or apparatus that may require encoding and decoding, or encoding, or decoding of a video image.
  • Device 50 can include a housing for incorporating and protecting the device.
  • Device 50 may also include display 32 in the form of a liquid crystal display.
  • the display may be any suitable display technology suitable for displaying images or video.
  • Device 50 may also include a keypad 34.
  • any suitable data or user interface mechanism can be utilized.
  • the user interface can be implemented as a virtual keyboard or data entry system as part of a touch sensitive display.
  • the device may include a microphone 36 or any suitable audio input, which may be a digital or analog signal input.
  • the device 50 may also include an audio output device, which in the embodiment of the present application may be any of the following: an earphone 38, a speaker, or an analog audio or digital audio output connection.
  • Device 50 may also include a battery 40; in other embodiments of the present application, the device may be powered by any suitable mobile energy device, such as a solar cell, a fuel cell, or a clockwork generator.
  • the device may also include an infrared port 42 for short-range line of sight communication with other devices.
  • device 50 may also include any suitable short range communication solution, such as a Bluetooth wireless connection or a USB/Firewire wired connection.
  • Device 50 may include a controller 56 or processor for controlling device 50.
  • Controller 56 can be coupled to memory 58, which in the embodiments of the present application can store data in the form of image data and audio data, and/or can also store instructions for execution on controller 56.
  • Controller 56 may also be coupled to codec circuitry 54 suitable for encoding and decoding audio and/or video data, or for assisting in encoding and decoding carried out by controller 56.
  • the device 50 may also include a card reader 48 and a smart card 46 for providing user information and for providing authentication information for authenticating and authorizing users on the network.
  • Apparatus 50 may also include a radio interface circuit 52 coupled to the controller and adapted to generate, for example, a wireless communication signal for communicating with a cellular communication network, a wireless communication system, or a wireless local area network. Apparatus 50 may also include an antenna 44 coupled to radio interface circuitry 52 for transmitting radio frequency signals generated at radio interface circuitry 52 to other apparatus(s) and for receiving radio frequency signals from other apparatus(s).
  • device 50 includes a camera capable of recording or detecting a single frame, and codec 54 or controller receives these single frames and processes them.
  • the device may receive video image data to be processed from another device prior to transmission and/or storage.
  • device 50 may receive images for encoding/decoding via a wireless or wired connection.
  • video codec system 10 includes source device 12 and destination device 14.
  • Source device 12 produces encoded video data.
  • source device 12 may be referred to as a video encoding apparatus or a video encoding device.
  • Destination device 14 may decode the encoded video data produced by source device 12.
  • destination device 14 may be referred to as a video decoding apparatus or a video decoding device.
  • Source device 12 and destination device 14 may be examples of video codec apparatuses or video codec devices.
  • Source device 12 and destination device 14 may include a wide range of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, smart phones, televisions, cameras, display devices, digital media players, video game consoles, on-board computers, or the like.
  • Channel 16 may include one or more media and/or devices capable of moving encoded video data from source device 12 to destination device 14.
  • channel 16 may include one or more communication media that enable source device 12 to transmit encoded video data directly to destination device 14 in real time.
  • source device 12 may modulate the encoded video data in accordance with a communication standard (eg, a wireless communication protocol) and may transmit the modulated video data to destination device 14.
  • the one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
  • the one or more communication media may form part of a packet-based network (e.g., a local area network, a wide area network, or a global network such as the Internet).
  • the one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from source device 12 to destination device 14.
  • channel 16 can include a storage medium that stores encoded video data generated by source device 12.
  • destination device 14 can access the storage medium via disk access or card access.
  • the storage medium may include a variety of locally accessible data storage media, such as Blu-ray Disc, DVD, CD-ROM, flash memory, or other suitable digital storage medium for storing encoded video data.
  • channel 16 can include a file server or another intermediate storage device that stores encoded video data generated by source device 12.
  • destination device 14 may access the encoded video data stored at a file server or other intermediate storage device via streaming or download.
  • the file server may be a server type capable of storing encoded video data and transmitting the encoded video data to the destination device 14.
  • example file servers include a web server (e.g., for a website), a file transfer protocol (FTP) server, a network attached storage (NAS) device, and a local disk drive.
  • Destination device 14 can access the encoded video data via a standard data connection (e.g., an internet connection).
  • Example types of data connection include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., DSL, cable modem, etc.), or a combination of both, suitable for accessing encoded video data stored on a file server.
  • the transmission of the encoded video data from the file server may be streaming, downloading, or a combination of both.
  • the technology of the present application is not limited to a wireless application scenario.
  • the technology can be applied to video coding and decoding in support of various multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), encoding of video data for storage on a data storage medium, decoding of video data stored on a data storage medium, or other applications.
  • video codec system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
  • source device 12 includes video source 18, video encoder 20, and output interface 22.
  • output interface 22 can include a modulator/demodulator (modem) and/or a transmitter.
  • Video source 18 may include a video capture device (eg, a video camera), a video archive containing previously captured video data, a video input interface to receive video data from a video content provider, and/or a computer for generating video data.
  • Video encoder 20 may encode video data from video source 18.
  • source device 12 transmits the encoded video data directly to destination device 14 via output interface 22.
  • the encoded video data may also be stored on a storage medium or file server for later access by the destination device 14 for decoding and/or playback.
  • destination device 14 includes an input interface 28, a video decoder 30, and a display 32.
  • input interface 28 includes a receiver and/or a modem.
  • Input interface 28 can receive the encoded video data via channel 16.
  • Display 32 may be integral with destination device 14 or may be external to destination device 14. In general, display 32 displays the decoded video data.
  • Display 32 can include a variety of display devices such as liquid crystal displays (LCDs), plasma displays, organic light emitting diode (OLED) displays, or other types of display devices.
  • Video encoder 20 and video decoder 30 may operate in accordance with a video compression standard (e.g., the High Efficiency Video Coding (HEVC) H.265 standard) and may conform to the HEVC Test Model (HM).
  • a textual description of the H.265 standard, ITU-T H.265 (V3) (04/2015), was published on April 29, 2015, and is available for download from http://handle.itu.int/11.1002/1000/12455; the entire contents of the document are incorporated herein by reference.
  • video encoder 20 and video decoder 30 may operate in accordance with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its scalable video codec (SVC) and multiview video codec (MVC) extensions.
  • FIG. 2 is merely an example and the techniques of the present application are applicable to video codec applications (eg, single-sided video encoding or video decoding) that do not necessarily include any data communication between the encoding device and the decoding device.
  • data may be retrieved from local memory, streamed over a network, or handled in a similar manner.
  • the encoding device may encode the data and store the data to a memory, and/or the decoding device may retrieve the data from the memory and decode the data.
  • encoding and decoding may be performed by a plurality of devices that do not communicate with each other, but only encode data to the memory and/or retrieve data from the memory and decode it.
  • Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable circuits, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the techniques are implemented partially or wholly in software, the device may store the software instructions in a suitable non-transitory computer-readable storage medium and may execute the instructions in hardware using one or more processors to perform the techniques of the present application. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) can be considered as one or more processors. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in another device.
  • This application may generally refer to video encoder 20 "signaling" certain information to another device (e.g., video decoder 30).
  • the term “signaling” may generally refer to the communication of syntax elements and/or other data that conveys the encoded video data. This communication can occur in real time or near real time. Alternatively, this communication may occur over a span of time, for example when syntax elements are stored in a computer-readable storage medium in the encoded code stream at the time of encoding; the decoding device can then retrieve the syntax elements at any time after they have been stored in the medium.
  • the QT structure in the H.265 standard can only produce square CUs of different sizes, and cannot adapt the CU well to textures of various shapes.
  • the QTBT structure in JEM cascades QT with BT, which is referred to as the QTBT partitioning method. Specifically, the CTU is first divided according to QT, and the leaf nodes of the QT are allowed to be further divided using BT, as shown in FIG. 3.
  • in the right diagram of FIG. 3, each end point represents a node; a node connected to four solid lines indicates quadtree division, and a node connected to two broken lines indicates binary tree division; a to m are 13 leaf nodes, each corresponding to one CU; on a binary tree node, 1 indicates vertical division and 0 indicates horizontal division; according to the right diagram, one CTU is divided into the 13 CUs a to m, as shown in the left diagram of FIG. 3.
  • each CU has a QT level (quadtree depth, QT depth) and a BT level (binary tree depth, BT depth).
  • the QT level is the QT level of the QT leaf node to which the CU belongs, and the BT level is the BT level of the BT leaf node to which the CU belongs.
  • for example, in FIG. 3, the QT level of a and b is 1 and their BT level is 2; the QT level of c, d, and e is 1 and their BT level is 1; the QT level of f, k, and l is 2 and their BT level is 1; the QT level of i and j is 2 and their BT level is 0; the QT level of g and h is 2 and their BT level is 2; the QT level of m is 1 and its BT level is 0. If the CTU is divided into only one CU, the QT level of this CU is 0 and its BT level is 0.
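  • The relationship between the two levels and the CU size can be checked numerically. The sketch below assumes that each QT division halves both the width and the height and that each BT division halves the node area, and it uses a hypothetical CTU size of 128; the helper name `cu_area` is likewise an assumption. Under these assumptions, the levels listed for the 13 CUs of FIG. 3 add up to exactly one CTU.

```python
def cu_area(ctu_size: int, qt_level: int, bt_level: int) -> int:
    """Area in pixels of a CU with the given QT and BT levels.

    Each QT level divides the area by 4; each BT level divides it by 2.
    """
    return (ctu_size * ctu_size) >> (2 * qt_level + bt_level)


# (QT level, BT level) of the leaf nodes a to m, as listed for FIG. 3.
FIG3_LEVELS = {
    "a": (1, 2), "b": (1, 2),
    "c": (1, 1), "d": (1, 1), "e": (1, 1),
    "f": (2, 1), "k": (2, 1), "l": (2, 1),
    "i": (2, 0), "j": (2, 0),
    "g": (2, 2), "h": (2, 2),
    "m": (1, 0),
}

CTU_SIZE = 128  # hypothetical CTU size used for the check
total_area = sum(cu_area(CTU_SIZE, qt, bt) for qt, bt in FIG3_LEVELS.values())
```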
  • the above QT-cascaded-with-BT method can generate rectangular CUs other than squares through BT, but because each BT division turns one node into two half-sized nodes, dividing into smaller CUs requires too many division levels; moreover, QT and BT are cascaded, that is, the leaf nodes of QT can be divided by BT, but the leaf nodes of BT can no longer be divided by QT.
  • the embodiments of the present application provide a method and apparatus for encoding and decoding video images to improve the compression efficiency of the encoding and decoding.
  • the method and the device are based on the same inventive concept. Since the method and the device solve the problem on similar principles, their implementations can be referred to each other, and repeated descriptions are omitted.
  • FIG. 4 is a schematic flowchart of a decoding method of a video image provided by an embodiment of the present application, where the process may be specifically implemented by hardware, software programming, or a combination of hardware and software.
  • the decoder may be configured to perform the process shown in FIG. 4, and the function module in the decoder for performing the decoding scheme of the video image provided by the embodiment of the present application may be specifically implemented by hardware, software programming, or a combination of hardware and software.
  • the hardware may include one or more signal processing and/or application specific integrated circuits.
  • the process specifically includes the following steps:
  • the coding tree node in the embodiment of the present application is used to represent a rectangular image region to be decoded in the video image
  • the lower node is used to represent a partial rectangular image region in the rectangular image region to be decoded.
  • the image regions represented by the different lower-level nodes do not overlap each other.
  • when the coding tree node is a coding unit of the video image, because the coding unit is the most basic unit for coding and cannot be split further, the coding tree node does not contain lower nodes.
  • Step 40: Parse the partition mode information of the coding tree node from the code stream.
  • the division mode information of the coding tree node is used to indicate one of the candidate division mode sets of the coding tree node.
  • before the partition mode information of the coding tree node is parsed from the code stream, the method further includes: determining that the coding tree node is located within the image range of the video image.
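  • The check performed before parsing can be sketched as a simple bounds test. The interpretation that the node must lie entirely inside the picture, and the function name `node_within_image`, are assumptions of this example; the text above only states that the node is determined to be located within the image range.

```python
def node_within_image(
    x: int, y: int, width: int, height: int, image_width: int, image_height: int
) -> bool:
    """Return True if the rectangular node lies entirely inside the image."""
    return (
        x >= 0
        and y >= 0
        and x + width <= image_width
        and y + height <= image_height
    )


# A 64x64 node in a 128x96 picture: the first fits, the second crosses the bottom edge.
inside = node_within_image(64, 0, 64, 64, 128, 96)
crossing = node_within_image(64, 64, 64, 64, 128, 96)
```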
  • the syntax representation manner of the division mode information of the coding tree node includes the following two types:
  • in the first type, the division mode information is represented by a first syntax element, which indicates the identifier, in the candidate division mode set, of the obtained division mode.
  • the coding tree node is divided into lower-level nodes according to the division mode indicated by the first syntax element; specifically, the coding tree node may be divided into two, three, or four lower-level nodes.
  • in the second type, the division mode information is represented by a second syntax element and a third syntax element, where the second syntax element indicates whether the obtained division mode is the first division mode; when the second syntax element indicates that the obtained division mode is not the first division mode, the third syntax element indicates the identifier of the obtained division mode among the division modes of the candidate division mode set other than the first division mode.
  • the coding tree node is divided into lower-level nodes according to the division mode indicated by the third syntax element; specifically, the coding tree node may be divided into two, three, or four lower-level nodes.
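  • The two syntax representations can be contrasted in a short sketch. It assumes the syntax element values have already been entropy-decoded, that the first entry of the candidate list is the first division mode, that a true second syntax element means "the mode is the first division mode", and that the third syntax element is an index into the remaining modes. All function names, the polarity of the flag, and the labels S1 to S4 are hypothetical.

```python
from typing import Callable, Sequence


def mode_from_first_element(candidate_set: Sequence[str], first_element: int) -> str:
    """Type 1: a single syntax element identifies the mode in the candidate set."""
    return candidate_set[first_element]


def mode_from_second_and_third(
    candidate_set: Sequence[str],
    second_element: bool,
    read_third_element: Callable[[], int],
) -> str:
    """Type 2: a flag selects the first mode; otherwise an index selects another mode."""
    if second_element:
        # The third syntax element is not read when the node is not divided.
        return candidate_set[0]
    other_modes = candidate_set[1:]
    return other_modes[read_third_element()]


CANDIDATES = ["S1", "S2", "S3", "S4"]
type1_mode = mode_from_first_element(CANDIDATES, 2)
type2_unsplit = mode_from_second_and_third(CANDIDATES, True, lambda: 0)
type2_split = mode_from_second_and_third(CANDIDATES, False, lambda: 2)
```

  • One possible motivation for the second representation is that undivided nodes are signalled with a single flag, so the longer identifier is only read for nodes that are actually divided.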
  • Step 41: Obtain, according to the division mode information, the division mode of the coding tree node from the candidate division mode set of the coding tree node, where the division mode is used to determine the lower nodes that constitute the coding tree node.
  • the coding tree node has an aspect ratio of 1 and a width greater than a preset threshold, and the candidate division mode set includes a first division mode indicating that the coding tree node is a coding unit of the video image, a second division mode determining that the coding tree node is composed of two equal-sized lower nodes having an aspect ratio of 2, a third division mode determining that the coding tree node is composed of two equal-sized lower nodes having an aspect ratio of 0.5, and a fourth division mode determining that the coding tree node is composed of four equal-sized lower nodes having an aspect ratio of 1; when the aspect ratio of the lower node is 1 and the width of the lower node is greater than the preset threshold, the candidate division mode set of the lower node is the same as that of the coding tree node.
  • in other cases, the candidate division mode set of the lower node and that of the coding tree node may be the same or different.
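  • The condition on the lower node can be written as a small predicate. This is a sketch under the reading that a lower node uses the same candidate division mode set as the coding tree node when its aspect ratio (width divided by height) is 1 and its width exceeds the preset threshold; the function name and the threshold value in the example are hypothetical.

```python
def shares_parent_candidate_set(width: int, height: int, preset_threshold: int) -> bool:
    """True if the lower node is square and wider than the preset threshold."""
    return width == height and width > preset_threshold


THRESHOLD = 8  # hypothetical preset threshold, in pixels
square_large = shares_parent_candidate_set(16, 16, THRESHOLD)
wide_node = shares_parent_candidate_set(32, 16, THRESHOLD)
square_small = shares_parent_candidate_set(8, 8, THRESHOLD)
```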
  • that the division mode is used to determine the lower nodes constituting the coding tree node includes: the division mode is used to determine at least one of the number, size, and distribution of the lower nodes constituting the coding tree node.
  • the dividing mode is further used to determine a decoding order of a plurality of subordinate nodes constituting the coding tree node.
  • FIG. 5 is a schematic diagram of a division mode for a coding tree node, and reference numerals 1, 2, 3, 4, etc. in FIG. 5 indicate decoding orders of lower-level nodes.
  • the coding tree node has a width of 4M pixels, where M is a positive integer; with the upper left corner of the coding tree node as the origin, rightward as the positive horizontal direction, and downward as the positive vertical direction, the division modes include:
  • a first division mode: determining that the coding tree node comprises the lower node with (0, 0) as the upper left corner point, a width of 4M pixels, and a height of 4M pixels; for details, refer to S1 in FIG. 5.
  • a second division mode: determining that the coding tree node comprises a lower node with (0, 0) as the upper left corner point, a width of 4M pixels, and a height of 2M pixels, and a lower node with (0, 2M) as the upper left corner point, a width of 4M pixels, and a height of 2M pixels; for details, refer to S2 in FIG. 5.
  • a third division mode: determining that the coding tree node comprises a lower node with (0, 0) as the upper left corner point, a width of 2M pixels, and a height of 4M pixels, and a lower node with (2M, 0) as the upper left corner point, a width of 2M pixels, and a height of 4M pixels; for details, refer to S3 in FIG. 5.
  • a fourth division mode determining that the coding tree node includes (0, 0) as an upper left corner point, a width of 2 times M pixels, and a height of 2 times the M pixels of the lower node, to (0, 2M) is the upper left corner, M pixels that are twice as wide, and the lower nodes of M pixels that are twice as high, with (2M, 0) as the upper left corner and twice as wide as M pixels.
  • a fifth division mode determining that the coding tree node includes (0, 0) as an upper left corner, a width of 2 times M pixels, and a height of 2 times the M pixels of the lower node, to (2M, 0) is the upper left corner point, M pixels having a width of 2 times, and the lower node of M pixels having a height of 2 times, with (0, 2M) being the upper left corner and the width being twice the M pixels.
  • a sixth division mode determining that the coding tree node includes (0, 0) as an upper left corner, a width of 2 times M pixels, and a height of 2 times the M pixels of the lower node, to (2M, 0) is the upper left corner, M pixels having a width of 2 times, the lower nodes of M pixels having a height of 2 times, and M pixels having a width of 4 times with (0, 2M) being the upper left corner,
  • the lower node of the M pixels having a height of 2 times for details, refer to S6 in FIG.
  • a seventh division mode determining that the coding tree node includes (0, 0) as an upper left corner point, a width of 4 times M pixels, and a height of 2 times the M pixels of the lower node, to (0, 2M) is the upper left corner point, M pixels having a width of 2 times, the lower node of M pixels having a height of 2 times, and M pixels having a width of 2 times with (2M, 2M) as the upper left corner,
  • the lower node of the M pixels having a height of 2 times for details, refer to S7 in FIG.
  • the eighth division mode determining that the coding tree node includes (0, 0) as an upper left corner, a width of 2 times M pixels, and a height of 2 times the M pixels of the lower node, to (0, 2M) is the upper left corner, M pixels having a width of 2 times, the lower nodes of M pixels having a height of 2 times, and M pixels having a width of 2 times with (0, 2M) being the upper left corner,
  • a ninth division mode determining that the coding tree node includes (0, 0) as an upper left corner, a width of 2 times M pixels, and a height of 4 times the M pixels of the lower node, to (2M, 0) is the upper left corner, M pixels having a width of 2 times, the lower nodes of M pixels having a height of 2 times, and M pixels having a width of 2 times with (2M, 2M) being the upper left corner,
  • the lower node of the M pixels having a height of 2 times for details, refer to S9 in FIG.
  • a tenth division mode determining that the coding tree node includes (0, 0) as an upper left corner, a width of 2 times M pixels, and a height of 2 times the M pixels of the lower node, to (2M, 0) is the upper left corner point, the M pixels having a width of 2 times, and the lower node of the M pixels having a height of 2 times, with (0, 2M) being the upper left corner and the width being 4 times the M pixels.
  • the lower node having a height of M pixels and the M pixels having a width of 4 times (0, 3M) and a height of 4 pixels are higher than the lower node of M pixels; for details, refer to S10 in FIG. Shown.
  • the eleventh division mode determining that the coding tree node includes M pixels with (0, 0) as an upper left corner, a width of 4 times, and a lower node of M pixels, with (0, M) For the upper left corner, M pixels with a width of 4 times and the lower nodes with a height of M pixels, with (0, 2M) as the upper left corner and 2 times wider M pixels, the height is 2 times.
  • the lower node of the M pixels and the lower node of the M pixels with the width of 2 times and the M pixels of 2 times the height of the upper left corner point; (2M, 2M); S11 is shown.
  • the twelfth division mode determining that the coding tree node includes (0, 0) as an upper left corner point, a width of 2 times M pixels, and a height of 2 times the M pixels of the lower node, (0 2M) is the upper left corner point, the width is 2 times the M pixels, and the lower node of the M pixel is 2 times higher, with (2M, 0) as the upper left corner, the width is M pixels, and the height is 4D M pixels of the lower node and (3M, 0) are the upper left corner, the width is M pixels, and the height is 4 times the M pixels of the lower node; for details, refer to FIG. 5 S12 is shown.
  • the thirteenth division mode determining that the coding tree node comprises the lower node of (M, 0) being the upper left corner, the width being M pixels, and the height being 4 times the M pixels, to (M, 0)
  • the lower node is M pixels wide and 4 times higher than the M pixels, with (2M, 0) being the upper left corner, the width being 2 times the M pixels, and the height being 2 times
  • the fourteenth division mode determining that the coding tree node includes M pixels with (0, 0) as an upper left corner and a width of 4 times, and the lower node with a height of M pixels, (0, M) For the upper left corner, M pixels with a width of 4 times, and the lower nodes with a height of M pixels, with (0, 2M) as the upper left corner, 4 times the width of M pixels, and a height of M pixels.
  • the lower-level node and the lower-left corner point (0, 3M) are four times larger than the M-th pixel, and the lower-order node is higher than M pixels; for details, refer to S14 in FIG.
  • the fifteenth division mode: the coding tree node consists of four lower nodes, each with a width of M pixels and a height of 4M pixels, whose upper left corners are (0, 0), (M, 0), (2M, 0), and (3M, 0), respectively (the corner coordinates follow from the stated node size, since four nodes of width M and height 4M can only tile the coding tree node side by side).
  • the sixteenth division mode: the coding tree node consists of a lower node with upper left corner (0, 0), width 4M pixels, and height M pixels; a lower node with upper left corner (0, M), width 4M pixels, and height 2M pixels; and a lower node with upper left corner (0, 3M), width 4M pixels, and height M pixels; for details, refer to S16 in FIG.
  • the seventeenth division mode: the coding tree node consists of a lower node with upper left corner (0, 0), width M pixels, and height 4M pixels; a lower node with upper left corner (M, 0), width 2M pixels, and height 4M pixels; and a lower node with upper left corner (3M, 0), width M pixels, and height 4M pixels; for details, refer to S17 in FIG. 5.
  • the candidate division mode set includes the first, second, third, and fourth division modes and, optionally, further includes at least one of the fifth, sixth, seventh, eighth, ninth, tenth, eleventh, twelfth, thirteenth, fourteenth, fifteenth, sixteenth, and seventeenth division modes.
  • the above division modes may be classified into the following six categories:
  • the first category: the node is divided into two rectangular child nodes with an aspect ratio of 2; this category includes two division modes, the second and the third.
  • the second category: the node is divided into four square child nodes; this category includes the fourth and fifth division modes.
  • the third category: the node is divided into two square child nodes and one rectangular child node with an aspect ratio of 2; this category includes four division modes, the sixth, seventh, eighth, and ninth.
  • the fourth category: the node is divided into two square child nodes and two rectangular child nodes with an aspect ratio of 4; this category includes four division modes, the tenth, eleventh, twelfth, and thirteenth.
  • the fifth category: the node is divided into four rectangular child nodes with an aspect ratio of 4; this category includes two division modes, the fourteenth and the fifteenth.
  • the sixth category: the node is divided into one rectangular child node with an aspect ratio of 2 and two rectangular child nodes with an aspect ratio of 4; this category includes two division modes, the sixteenth and the seventeenth.
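The six categories can be checked mechanically. The following is a minimal sketch, assuming the sub-node geometry reconstructed from the mode descriptions in this document; rectangles are given as (x, y, width, height) in units of M for a node of 4M by 4M pixels. The names `MODES`, `tiles_node`, and `shape_signature` are illustrative and not part of the application, and s4 and s5 are assumed to share one geometry (the sketch does not model decoding order).

```python
from collections import Counter

# Sub-node rectangles (x, y, w, h) in units of M for a 4M x 4M node.
# Reconstructed from the textual mode descriptions; s4 and s5 are assumed
# to have the same geometry and to differ only in decoding order.
QUAD = [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 2, 2), (2, 2, 2, 2)]
MODES = {
    "s2": [(0, 0, 4, 2), (0, 2, 4, 2)],
    "s3": [(0, 0, 2, 4), (2, 0, 2, 4)],
    "s4": QUAD,
    "s5": QUAD,
    "s6": [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 4, 2)],
    "s7": [(0, 0, 4, 2), (0, 2, 2, 2), (2, 2, 2, 2)],
    "s8": [(0, 0, 2, 2), (0, 2, 2, 2), (2, 0, 2, 4)],
    "s9": [(0, 0, 2, 4), (2, 0, 2, 2), (2, 2, 2, 2)],
    "s10": [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 4, 1), (0, 3, 4, 1)],
    "s11": [(0, 0, 4, 1), (0, 1, 4, 1), (0, 2, 2, 2), (2, 2, 2, 2)],
    "s12": [(0, 0, 2, 2), (0, 2, 2, 2), (2, 0, 1, 4), (3, 0, 1, 4)],
    "s13": [(0, 0, 1, 4), (1, 0, 1, 4), (2, 0, 2, 2), (2, 2, 2, 2)],
    "s14": [(0, y, 4, 1) for y in range(4)],
    "s15": [(x, 0, 1, 4) for x in range(4)],
    "s16": [(0, 0, 4, 1), (0, 1, 4, 2), (0, 3, 4, 1)],
    "s17": [(0, 0, 1, 4), (1, 0, 2, 4), (3, 0, 1, 4)],
}


def tiles_node(rects):
    """Return True if the rectangles cover the 4 x 4 grid exactly once."""
    cells = Counter(
        (x + i, y + j) for x, y, w, h in rects for i in range(w) for j in range(h)
    )
    grid = {(i, j) for i in range(4) for j in range(4)}
    return set(cells) == grid and all(n == 1 for n in cells.values())


def shape_signature(rects):
    """Count sub-nodes by long-side to short-side ratio, e.g. {1: 2, 2: 1}."""
    return dict(Counter(max(w, h) // min(w, h) for _, _, w, h in rects))
```

Under these assumptions every mode tiles the node without overlap, and `shape_signature` reproduces the category definitions: two squares plus one 2:1 rectangle for s6 to s9, two squares plus two 4:1 rectangles for s10 to s13, and so on.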
  • when the coding tree node continues to be divided, the set of partitioning modes it is allowed to use includes at least s2-s4, and may further include one or more of the groups s5, s6-s9, s10-s13, s14-s15, and s16-s17.
  • the optional combination of the candidate partition mode set includes 12 different combinations, specifically:
  • Step 42 Parse the code stream to obtain coding information of the coding tree node.
  • Step 43 Reconstruct the pixel value of the coding tree node according to the division mode information and the coding information of the coding tree node.
  • the parsing the code stream to obtain the coding information of the coding tree node includes the following two possible implementation manners:
  • when the obtained split mode is not the first split mode, the code stream is parsed to obtain coding information of the lower nodes of the coding tree node, where, when the aspect ratio of a lower node is 1 and the width of the lower node is greater than the preset threshold, the coding information of the lower node includes the division mode information of the lower node;
  • the reconstructing the pixel value of the coding tree node according to the division mode information and the coding information of the coding tree node includes:
  • when the obtained split mode is the first split mode, the code stream is parsed to obtain coding information of the coding tree node, where, when the aspect ratio of the coding node is 1 and the width of the coding node is equal to the preset threshold, or when the aspect ratio of the coding node is not 1, the coding information of the coding node does not include division mode information of the coding node;
  • the reconstructing the pixel value of the coding tree node according to the division mode information and the coding information of the coding tree node includes:
  • when the obtained split mode is the first split mode, the pixel value of the coding tree node is reconstructed according to the coding information of the coding node.
  • the coding tree node can be divided into a plurality of CUs. Further, the pixel values of the respective CUs are reconstructed according to the coding information of the respective CUs, thereby reconstructing the pixel values of the coding tree nodes.
  • the decoding of the CU includes entropy decoding, inverse quantization, inverse transform, prediction, loop filtering, and the like, and the processes mainly include:
  • if the CU has transform coefficients, inverse quantization and inverse transform processing are performed on the transform coefficients according to the quantization parameter and the transform mode to obtain the reconstruction residual of the CU. If the CU has no transform coefficients, the reconstruction residual of the CU is 0, that is, the reconstructed residual value of each pixel in the CU is 0.
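The role of the residual in reconstruction can be illustrated with a short sketch. It assumes the conventional hybrid-codec relation in which a reconstructed sample is the prediction plus the reconstruction residual, clipped to the sample range; the text above does not spell out this step, and the function name `reconstruct_cu` and the `bit_depth` parameter are illustrative. Inverse quantization, inverse transform, and loop filtering are not modelled.

```python
def reconstruct_cu(prediction, residual=None, bit_depth=8):
    """Add the reconstruction residual to the prediction and clip each sample.

    residual=None models a CU without transform coefficients, for which the
    residual of every pixel is 0.
    """
    max_val = (1 << bit_depth) - 1
    if residual is None:
        residual = [[0] * len(row) for row in prediction]
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```

With no residual the prediction passes through unchanged, which matches the all-zero residual case described above.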
  • FIG. 6 shows a schematic diagram of a decoding process for at least one CTU in a video image, the specific steps are as follows:
  • Step 60 Set the CTU to the root node of the coding tree, and then perform step 61.
  • the coding tree level of the CTU is set to zero.
  • Step 61 If the node is square and the width of the node is greater than the threshold TX, step 62 is performed; otherwise, step 65 is performed.
  • Step 62 Parse the partition mode information of the node, and then perform step 63.
  • Step 63 If the division mode of the node is the non-division mode, that is, the first division mode, step 65 is performed (the node then corresponds to one CU); otherwise, step 64 is performed.
  • if a node is non-square, or is square with a width equal to the threshold TX, it is determined that the node does not need to be divided; the corresponding division mode is the non-division mode, and the node then corresponds to one CU.
  • a node is square when the width of the image area corresponding to the node is equal to its height. The threshold TX can be set equal to the side length of the minimum CU, which can be parsed from the SPS in the code stream; its value is, for example, 4, 8, or 16.
  • the coding tree level of a child node is the coding tree level of its parent node plus one.
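The recursion implied by these steps can be sketched as follows. This is an illustration under stated assumptions, not the decoder of the application: `read_mode` is a hypothetical callback standing in for parsing the partition mode information from the code stream, `split` is a hypothetical callback returning the child rectangles of a mode, and only s2, s3, and s4 are implemented in the helper.

```python
def parse_coding_tree(x, y, w, h, tx, read_mode, split, level=0, out=None):
    """Collect (x, y, w, h, level) for every CU below the given node.

    read_mode(x, y, w, h) is a hypothetical stand-in for parsing the partition
    mode information; split(mode, x, y, w, h) returns the child rectangles.
    """
    if out is None:
        out = []
    # Only a square node wider than TX carries partition mode information;
    # every other node is implicitly in the non-division mode s1.
    mode = read_mode(x, y, w, h) if (w == h and w > tx) else "s1"
    if mode == "s1":
        out.append((x, y, w, h, level))  # the node corresponds to one CU
        return out
    for cx, cy, cw, ch in split(mode, x, y, w, h):
        # A child node sits one coding tree level below its parent.
        parse_coding_tree(cx, cy, cw, ch, tx, read_mode, split, level + 1, out)
    return out


def split_s2_s4(mode, x, y, w, h):
    """Child rectangles for s2, s3 and s4 only (enough for this sketch)."""
    hw, hh = w // 2, h // 2
    if mode == "s2":
        return [(x, y, w, hh), (x, y + hh, w, hh)]
    if mode == "s3":
        return [(x, y, hw, h), (x + hw, y, hw, h)]
    if mode == "s4":
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    raise ValueError(f"mode {mode} is not covered by this sketch")
```

For example, feeding the mode sequence s4, s1, s2, s1, s3 to a 16 by 16 root with TX equal to 4 yields six CUs that together cover the root; the non-square children of s2 and s3 become CUs without any mode being read for them.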
  • syntax representation and parsing manner of the partition mode information of the coding tree node may be one of the following two types:
  • for example, when the candidate partitioning mode set of the node is combination three above (that is, the nine partitioning modes s2-s4, s6-s9, and s16-s17), a one-to-one mapping between split_mode_idx and the partitioning modes is established, for example as shown in Table 1 or Table 2.
  • Step 65 This node corresponds to a CU, and parses the coding information of the CU.
  • the coding information of the CU is contained in a syntax structure, such as the coding_unit() syntax structure in H.265.
  • the coding information of the CU includes, for example, a prediction mode, a transform coefficient, and the like.
  • starting from the CTU root node, the CTU can thus be resolved into a set of leaf nodes, and the coding information of the CU corresponding to each leaf node is obtained.
  • a candidate partition mode set can apply to a slice segment, a slice, an image, or a sequence.
  • the candidate partition mode set can be predefined or signalled in the code stream. For example, the candidate partition mode set of I frames is predefined as combination nine and the candidate partition mode set of non-I frames is predefined as combination eight; or the candidate partition mode set of all types of frames is predefined as combination seven; or the slice header or slice segment header indicates the candidate partition mode set that the corresponding slice or slice segment is allowed to use; or the sequence parameter set (SPS) indicates the candidate partition mode sets that I frames, P frames, and B frames are respectively allowed to use.
  • for example, the candidate partition mode set always includes s2-s4, and the SPS includes an N-bit syntax element A in which each bit indicates whether the candidate partition mode set includes certain partition modes among s5-s17.
  • in one example, the syntax element A consists of 4 bits: the first bit indicates whether the candidate partition mode set further includes s6-s9, the second bit indicates whether it further includes s16-s17, the third bit indicates whether it further includes s10-s13, and the fourth bit indicates whether it further includes s5, s14, and s15.
  • in another example, the syntax element A consists of 3 bits: the first bit indicates whether the candidate partition mode set further includes s6-s9, the second bit indicates whether it further includes s10-s13, and the third bit indicates whether it further includes s14 and s15.
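The bit-to-group mapping of syntax element A can be illustrated with a small sketch. It follows the 4-bit and 3-bit layouts described above; the names `GROUPS_4BIT`, `GROUPS_3BIT`, and `candidate_set_from_flags` are illustrative, and the non-division mode s1 is included in the result on the assumption that it always belongs to the candidate set.

```python
# Bit i of syntax element A enables group i (layouts taken from the text).
GROUPS_4BIT = [
    ("s6", "s7", "s8", "s9"),
    ("s16", "s17"),
    ("s10", "s11", "s12", "s13"),
    ("s5", "s14", "s15"),
]
GROUPS_3BIT = [
    ("s6", "s7", "s8", "s9"),
    ("s10", "s11", "s12", "s13"),
    ("s14", "s15"),
]


def candidate_set_from_flags(bits, groups=GROUPS_4BIT):
    """Return the candidate partition modes implied by the bits of element A.

    s1-s4 are always present (s1 is an assumption of this sketch; the text
    only states that s2-s4 must be included).
    """
    if len(bits) != len(groups):
        raise ValueError("one bit per optional group is required")
    modes = ["s1", "s2", "s3", "s4"]
    for bit, group in zip(bits, groups):
        if bit:
            modes.extend(group)
    return sorted(modes, key=lambda m: int(m[1:]))
```

With the 4-bit layout, the bits 1, 1, 0, 0 give the modes s1-s4, s6-s9, s16, and s17.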
  • coding_tree() is a coding tree syntax structure, and describes a specific division manner in which a coding tree node is divided according to a partition tree based on a multi-partition mode.
  • x0 and y0 respectively represent the horizontal offset and vertical offset (in units of 1 pixel) of the upper left corner of the node (i.e., the upper left corner of the area corresponding to the node) relative to the upper left corner of the CTU (i.e., the upper left corner of the area corresponding to the CTU); cuWidth and cuHeight represent the width and height of the node, respectively, in units of 1 pixel; minCUSize represents the minimum CU side length (for example, 4); "..." indicates omitted syntax elements, syntax structures, variables, or calculation processing; ae(v) indicates that CABAC decoding is used; condA stands for "whether the region corresponding to the node is inside the image" (if so, condA is true; otherwise condA is false); for example, condA can be x0+cuWidth <= picWidth && y0+cuHeight <= picHeight, where picWidth and picHeight represent the width and height of the image (in units of 1 pixel).
  • for a CTU that lies entirely within the image, condA of its nodes is always true (that is, for these CTUs, the check of condA can be removed when parsing split_flag). For a CTU of which one part falls within the image and another part falls outside it, condA of its internal coding tree nodes may be false: when part of the area corresponding to a node is inside the image and part is outside, condA is false. Such a node is then divided by s4 by default, and neither the flag bit nor the division mode number needs to be transmitted.
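The boundary handling can be sketched as below. The sketch assumes that the node position and the image size are expressed in the same coordinate system and names the vertical offset y0; `cond_a` and `inferred_mode` are illustrative names. The fallback to no division for a boundary node that is not wider than the minimum CU side length is taken from the default rule for an absent split_flag.

```python
def cond_a(x0, y0, cu_width, cu_height, pic_width, pic_height):
    """True if the area of the node lies entirely inside the image."""
    return x0 + cu_width <= pic_width and y0 + cu_height <= pic_height


def inferred_mode(x0, y0, cu_width, cu_height, pic_width, pic_height, min_cu_size):
    """Mode implied without any signalling, or None if the usual rules apply.

    When condA is false nothing is transmitted: a node wider than the minimum
    CU side length is divided by s4, otherwise it is not divided (s1).
    """
    if cond_a(x0, y0, cu_width, cu_height, pic_width, pic_height):
        return None
    return "s4" if cu_width > min_cu_size else "s1"
```

For a 100 by 100 image and 64 by 64 nodes, the node at (0, 0) is inside the image, while the node at (64, 0) crosses the right edge and is therefore divided by s4 without any flag being read.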
  • for a CTU, parsing its coding tree syntax structure coding_tree() yields its division, as follows:
  • if the width of the node is equal to its height and the side length of the node is greater than the preset minimum CU side length, a split flag split_flag is parsed from the code stream; otherwise, split_flag does not appear in the code stream.
  • when split_flag does not appear in the code stream: if the width of the node is greater than the minimum CU side length, its value defaults to 1 and the node division mode defaults to s4; otherwise, its value defaults to 0 and the node is not divided by default.
  • when split_flag indicates that the node is divided, the syntax element split_mode_idx indicating the division mode is parsed, and the node is divided according to the division mode indicated by split_mode_idx.
  • for example, split_mode_idx equal to 0 to 6 indicates that the node division mode is s2, s3, s4, s6, s7, s8, and s9, respectively.
  • if the division mode combination is combination three above (i.e., the optional division modes are s1-s4, s6-s9, and s16-s17), split_mode_idx equal to 0 to 8 indicates that the node division mode is s2, s3, s4, s6, s7, s8, s9, s16, and s17, respectively.
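The two index mappings given above can be written as lookup tables. The following sketch uses the stated orderings; the table and function names are illustrative, and split_flag equal to 0 is mapped to the non-division mode s1.

```python
# Index-to-mode tables in the order given in the text.
IDX_TO_MODE_SEVEN = ["s2", "s3", "s4", "s6", "s7", "s8", "s9"]
IDX_TO_MODE_COMBINATION_THREE = IDX_TO_MODE_SEVEN + ["s16", "s17"]


def division_mode(split_flag, split_mode_idx=None,
                  table=IDX_TO_MODE_COMBINATION_THREE):
    """Decoder side: map (split_flag, split_mode_idx) to a division mode."""
    if split_flag == 0:
        return "s1"  # non-division mode; split_mode_idx is not present
    return table[split_mode_idx]


def to_syntax(mode, table=IDX_TO_MODE_COMBINATION_THREE):
    """Encoder side: inverse mapping from a division mode to syntax values."""
    if mode == "s1":
        return (0, None)
    return (1, table.index(mode))
```

Because the mapping is one-to-one, `division_mode(*to_syntax(mode))` returns the original mode for every mode in the table.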
  • if split_flag[x0][y0] is not 0, the number of nodes generated by the partitioning mode of the node, numNodeSplit, is first obtained from split_mode_idx; getNumNodeSplit() represents this processing, for example looking up in a table the number of nodes generated by the partitioning mode that corresponds to split_mode_idx.
  • then, for each node generated by the division, the position (indicated by x1, y1) and shape (indicated by cuWidth1, cuHeight1) of the node are calculated from the position of its parent node (indicated by x0, y0), the shape of its parent node (indicated by cuWidth, cuHeight), and the division mode (indicated by split_mode_idx[x0][y0]); getNodeInfo() represents this processing. Coding tree parsing then continues for the nodes generated by the division.
  • if the node terminates the partitioning (i.e., split_flag[x0][y0] is equal to 0), the node enters the parsing process of the CU coding information (for example, parsing the prediction mode, residual, and the like of the CU); coding_unit() in Table 4 represents the syntax structure of the CU coding information.
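The processing represented by getNumNodeSplit() and getNodeInfo() can be sketched with a table of child rectangles expressed in quarters of the parent size. This is only an illustration: the helpers below are keyed by mode name rather than by split_mode_idx, cover four modes as examples, and their names and signatures are assumptions rather than the syntax of the application.

```python
# Child rectangles (x, y, w, h) in quarters of the parent width and height.
QUARTER_RECTS = {
    "s2": [(0, 0, 4, 2), (0, 2, 4, 2)],
    "s4": [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 2, 2), (2, 2, 2, 2)],
    "s6": [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 4, 2)],
    "s16": [(0, 0, 4, 1), (0, 1, 4, 2), (0, 3, 4, 1)],
}


def get_num_node_split(mode):
    """Number of nodes generated by a division mode (table lookup)."""
    return len(QUARTER_RECTS[mode])


def get_node_info(x0, y0, cu_width, cu_height, mode, i):
    """Position (x1, y1) and shape (cuWidth1, cuHeight1) of the i-th child."""
    qx, qy, qw, qh = QUARTER_RECTS[mode][i]
    unit_w, unit_h = cu_width // 4, cu_height // 4
    return (x0 + qx * unit_w, y0 + qy * unit_h, qw * unit_w, qh * unit_h)
```

A parser would call `get_num_node_split` once per divided node and then `get_node_info` for each child index before recursing into the child.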
  • FIG. 7 is a schematic flowchart diagram of a coding method of a video image provided by an embodiment of the present application, where the process may be specifically implemented by hardware, software programming, or a combination of hardware and software.
  • the encoder can be configured to perform the process as shown in FIG. 7.
  • the function module in the encoder for performing the encoding scheme of the video image provided by the embodiment of the present application can be specifically implemented by hardware, software programming, and a combination of hardware and software.
  • the hardware may include one or more signal processing and/or application specific integrated circuits.
  • the process specifically includes the following processes:
  • Step 70 Perform coding according to at least one division mode of the set of candidate division modes of the coding tree node, and obtain a rate distortion cost corresponding to each of the at least one division mode.
  • the rate distortion cost of a division mode is the sum of the rate distortion costs of all coding units obtained according to that division mode, where the aspect ratio of the coding tree node is 1 and the width of the coding tree node is greater than a preset threshold.
  • the candidate partition mode set includes a first partition mode indicating that the coding tree node is a basic coding unit of the video image, a second partition mode determining that the coding tree node is composed of two equal-sized lower nodes having an aspect ratio of 2, a third partition mode determining that the coding tree node is composed of two equal-sized lower nodes having an aspect ratio of 0.5, and a fourth partition mode determining that the coding tree node is composed of four equal-sized lower nodes having an aspect ratio of 1. When the aspect ratio of a lower node is 1 and the width of the lower node is greater than the preset threshold, the candidate partition mode sets of the lower node and of the coding tree node are the same.
  • the candidate partition mode set of the lower node and the coding tree node may be the same or different.
  • Step 71 Determine a partitioning mode that minimizes the rate distortion cost as a target partitioning mode of the coding tree node.
  • Step 72 Determine respective coding units constituting the coding tree node according to the target division mode, and code the respective coding units to obtain a code stream corresponding to the coding tree node and a reconstructed image.
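Steps 70 to 72 amount to a recursive search for the division with the smallest summed cost. The sketch below restricts the candidate set to s1-s4 and takes the per-CU cost from a hypothetical callback `cu_cost(x, y, w, h)`; how that cost is computed (for example as distortion plus a weighted rate) is outside the sketch and is not specified here.

```python
def best_partition(x, y, w, h, tx, cu_cost):
    """Return (cost, list of CUs) with the minimum summed cost over s1-s4.

    cu_cost(x, y, w, h) is a hypothetical callback giving the rate distortion
    cost of coding the rectangle as a single CU.
    """
    best = (cu_cost(x, y, w, h), [(x, y, w, h)])  # s1: the node is one CU
    if not (w == h and w > tx):
        return best  # non-square or minimum-size nodes are not divided
    hw, hh = w // 2, h // 2
    candidates = {
        "s2": [(x, y, w, hh), (x, y + hh, w, hh)],
        "s3": [(x, y, hw, h), (x + hw, y, hw, h)],
        "s4": [(x, y, hw, hh), (x + hw, y, hw, hh),
               (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)],
    }
    for children in candidates.values():
        total, cus = 0.0, []
        for child in children:
            # The cost of a mode is the sum over all CUs it finally produces.
            cost, sub_cus = best_partition(*child, tx, cu_cost)
            total += cost
            cus += sub_cus
        if total < best[0]:
            best = (total, cus)
    return best
```

With a toy cost that penalises any CU straddling the vertical line x = 8, a 16 by 16 node is divided by s3 into two 8 by 16 CUs, since that mode has the smallest total cost. An exhaustive search like this grows quickly with the size of the candidate set, which is the complexity trade-off the application attributes to larger sets.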
  • determining, according to the target division mode, each coding unit that constitutes the coding tree node includes:
  • the preset thresholds are obtained to obtain the respective coding units constituting the coding tree node.
  • for split_flag: if the target division mode of the CU is the non-division mode, split_flag is equal to 0; otherwise, split_flag is equal to 1.
  • split_mode_idx is set to the value that corresponds to the target division mode.
  • the coding of the CU includes processing such as prediction, transform, quantization, entropy coding, etc., and the main processing includes the following steps:
  • Entropy coding information such as a prediction mode and a transform coefficient of the CU to generate a code stream of the CU.
  • the code stream of the CTU is composed of the code streams of the respective CUs.
  • the CTU is allowed to be divided according to a plurality of division modes, which has the advantages of a small number of division levels and a large number of CU shapes.
  • the candidate partition mode set is also configurable. When the candidate partition mode set allows more partition modes, the encoder can try more partition modes and the compression performance is better; when the candidate partition mode set allows fewer partition modes, the encoder tries fewer partition modes and the computational complexity is lower.
  • the codec scheme in the present application encodes and decodes a CTU using a hybrid coding partition tree structure based on multiple partition modes. Compared with a quadtree partitioning scheme, it allows more CU shapes and reduces the number of partitioning levels. Compared with the quadtree plus binary tree (QTBT) partitioning scheme, it retains the main CU shapes with aspect ratios of 1, 2, and 4, simplifies the partitioning mode information, and allows more partitioning methods, so that at the same coding complexity it can achieve higher compression efficiency than the QTBT partitioning method.
  • the embodiment of the present application provides a decoding apparatus 800 for a video image, where a coding tree node is used to represent a rectangular image area to be decoded in the video image, and a lower node is used to represent a partial rectangular image region in the rectangular image region to be decoded, and image regions represented by different lower nodes do not overlap each other.
  • the coding tree node is a coding unit of the video image, the coding tree node does not include
  • the device 800 includes a parsing unit 801 and a decoding unit 802, where:
  • a parsing unit 801, configured to: parse the partitioning mode information of the coding tree node in the code stream; obtain, according to the partitioning mode information, the partitioning mode of the coding tree node from the candidate partitioning mode set of the coding tree node, where the aspect ratio of the coding tree node is 1 and the width of the coding tree node is greater than a preset threshold, and the candidate partitioning mode set includes a first partitioning mode indicating that the coding tree node is a coding unit of the video image, a second partitioning mode determining that the coding tree node is composed of two equal-sized lower nodes having an aspect ratio of 2, a third partitioning mode determining that the coding tree node is composed of two equal-sized lower nodes having an aspect ratio of 0.5, and a fourth partitioning mode determining that the coding tree node is composed of four equal-sized lower nodes having an aspect ratio of 1; when the aspect ratio of a lower node is 1 and the width of the lower node is greater than the preset threshold, the candidate partitioning mode sets of the lower node and of the coding tree node are the same; and parse the code stream to obtain coding information of the coding tree node;
  • the decoding unit 802 is configured to reconstruct a pixel value of the coding tree node according to the division mode information and the coding information of the coding tree node.
  • the dividing mode is used to determine at least one of a number, a size, and a distribution of the lower nodes constituting the coding tree node.
  • the width of the coding tree node is 4 times M pixels, where M is a positive integer; the upper left corner of the coding tree node is taken as the origin, the rightward direction is the horizontal positive direction, and the downward direction is the vertical positive direction.
  • the candidate partition mode set further includes:
  • a fifth division mode, in which the coding tree node is composed of a lower node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; a lower node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; and a lower node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels; or
  • a sixth division mode, in which the coding tree node is composed of a lower node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; and a lower node with upper left corner (0, 2M), width 4M pixels, and height 2M pixels; or
  • a seventh division mode, in which the coding tree node is composed of a lower node with upper left corner (0, 0), width 4M pixels, and height 2M pixels; a lower node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; and a lower node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels; or
  • an eighth division mode, in which the coding tree node is composed of a lower node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; and a lower node with upper left corner (2M, 0), width 2M pixels, and height 4M pixels; or
  • a ninth division mode, in which the coding tree node is composed of a lower node with upper left corner (0, 0), width 2M pixels, and height 4M pixels; a lower node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; and a lower node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels; or
  • a tenth division mode, in which the coding tree node is composed of a lower node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; a lower node with upper left corner (0, 2M), width 4M pixels, and height M pixels; and a lower node with upper left corner (0, 3M), width 4M pixels, and height M pixels; or
  • an eleventh division mode, in which the coding tree node is composed of a lower node with upper left corner (0, 0), width 4M pixels, and height M pixels; a lower node with upper left corner (0, M), width 4M pixels, and height M pixels; a lower node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; and a lower node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels; or
  • a twelfth division mode, in which the coding tree node is composed of a lower node with upper left corner (0, 0), width 2M pixels, and height 2M pixels; a lower node with upper left corner (0, 2M), width 2M pixels, and height 2M pixels; a lower node with upper left corner (2M, 0), width M pixels, and height 4M pixels; and a lower node with upper left corner (3M, 0), width M pixels, and height 4M pixels; or
  • a thirteenth division mode, in which the coding tree node is composed of a lower node with upper left corner (0, 0), width M pixels, and height 4M pixels; a lower node with upper left corner (M, 0), width M pixels, and height 4M pixels; a lower node with upper left corner (2M, 0), width 2M pixels, and height 2M pixels; and a lower node with upper left corner (2M, 2M), width 2M pixels, and height 2M pixels; or
  • a fourteenth division mode, in which the coding tree node is composed of a lower node with upper left corner (0, 0), width 4M pixels, and height M pixels; a lower node with upper left corner (0, M), width 4M pixels, and height M pixels; a lower node with upper left corner (0, 2M), width 4M pixels, and height M pixels; and a lower node with upper left corner (0, 3M), width 4M pixels, and height M pixels; or
  • a fifteenth division mode, in which the coding tree node is composed of four lower nodes, each with a width of M pixels and a height of 4M pixels, whose upper left corners are (0, 0), (M, 0), (2M, 0), and (3M, 0), respectively (the corner coordinates follow from the stated node size, since four nodes of width M and height 4M can only tile the coding tree node side by side); or
  • a sixteenth division mode, in which the coding tree node is composed of a lower node with upper left corner (0, 0), width 4M pixels, and height M pixels; a lower node with upper left corner (0, M), width 4M pixels, and height 2M pixels; and a lower node with upper left corner (0, 3M), width 4M pixels, and height M pixels; or
  • a seventeenth division mode, in which the coding tree node is composed of a lower node with upper left corner (0, 0), width M pixels, and height 4M pixels; a lower node with upper left corner (M, 0), width 2M pixels, and height 4M pixels; and a lower node with upper left corner (3M, 0), width M pixels, and height 4M pixels.
  • the dividing mode is used to determine a lower node that constitutes the coding tree node, and further includes:
  • the division mode is used to determine a decoding order of a plurality of lower nodes constituting the coding tree node
  • the fourth division mode includes: a first sub-mode, in which the four equal-sized lower nodes having an aspect ratio of 1 are decoded in clockwise order, and a second sub-mode, in which the four equal-sized lower nodes having an aspect ratio of 1 are decoded in counterclockwise order, where the candidate mode set includes at least the first sub-mode of the fourth division mode.
  • the split mode information is represented by a first syntax element, and the first syntax element is used to indicate an identifier of the obtained split mode in the candidate split mode set.
  • the split mode information is represented by a second syntax element and a third syntax element, where the second syntax element is used to determine whether the obtained split mode is the first split mode, and, when the second syntax element determines that the obtained split mode is not the first split mode, the third syntax element is used to indicate the identifier of the obtained split mode in the candidate split mode set excluding the first split mode.
  • the parsing unit 801 when parsing the code stream to obtain the coding information of the coding tree node, is specifically configured to:
  • when the obtained split mode is not the first split mode, parse the code stream to obtain coding information of the lower nodes of the coding tree node, where, when the aspect ratio of a lower node is 1 and the width of the lower node is greater than the preset threshold, the coding information of the lower node includes the division mode information of the lower node;
  • when reconstructing the pixel value of the coding tree node according to the division mode information and the coding information of the coding tree node, the decoding unit 802 is specifically configured to:
  • the parsing unit 801 is further configured to:
  • the parsing unit 801 is further configured to:
  • the coding tree node is located within an image range of the video image.
  • the embodiment of the present application further provides a decoder 900.
  • the decoder 900 includes a processor 901 and a memory 902.
  • the program code for executing the solution of the present invention is stored in the memory 902 and is used to instruct the processor 901 to execute the decoding method of the video image shown in FIG.
  • the application can also be used to design and program the processor to solidify the code corresponding to the method shown in FIG. 4 into the chip, so that the chip can execute the method shown in FIG. 4 during operation.
  • the embodiment of the present application provides a video image encoding apparatus 1000.
  • the apparatus 1000 includes a first encoding unit 1001, a determining unit 1002, and a second encoding unit 1003, where:
  • the first coding unit 1001 is configured to perform coding, for the coding tree node in the video image, according to at least one division mode in the candidate division mode set of the coding tree node, to obtain a rate distortion cost corresponding to each of the at least one division mode, the rate distortion cost being the sum of the rate distortion costs of all coding units obtained according to the corresponding division mode, where the aspect ratio of the coding tree node is 1 and the width of the coding tree node is greater than a preset threshold, and the candidate division mode set includes a first division mode indicating that the coding tree node is a basic coding unit of the video image, a second division mode determining that the coding tree node is composed of two equal-sized lower nodes having an aspect ratio of 2, a third division mode determining that the coding tree node is composed of two equal-sized lower nodes having an aspect ratio of 0.5, and a fourth division mode determining that the coding tree node is composed of four equal-sized lower nodes having an aspect ratio of 1; when the aspect ratio of a lower node is 1 and the width of the lower node is greater than the preset threshold, the candidate division mode sets of the lower node and of the coding tree node are the same;
  • a determining unit 1002 configured to determine a partitioning mode that minimizes a rate-distortion cost as a target partitioning mode of the coding tree node; and determine, according to the target partitioning mode, each coding unit that constitutes the coding tree node;
  • the second coding unit 1003 is configured to perform coding on the respective coding units to obtain a code stream and a reconstructed image corresponding to the coding tree node.
  • the determining unit 1002 is specifically configured to: when determining, according to the target division mode, each coding unit that constitutes the coding tree node:
  • the preset thresholds are obtained to obtain the respective coding units constituting the coding tree node.
  • The division of the apparatuses 1000 and 800 into units is only a division by logical function; in an actual implementation the units may be fully or partially integrated into one physical entity, or may be physically separate.
  • Each of the above units may be a separately arranged processing element, may be integrated into a chip of the encoding device, or may be stored as program code in a storage element of the encoder and invoked by a processing element of the encoding device to perform the function of that unit.
  • The units may be integrated together or implemented independently.
  • The processing element described herein may be an integrated circuit chip with signal processing capability.
  • In implementation, each step of the above method, or each of the above units, may be completed by an integrated logic circuit of hardware in the processing element or by instructions in the form of software.
  • The processing element may be a general-purpose processor, such as a central processing unit (CPU), or one or more integrated circuits configured to implement the above method, for example one or more application-specific integrated circuits (ASICs), one or more digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs).
  • An embodiment of the present application further provides an encoder 1100.
  • The encoder 1100 includes a processor 1101 and a memory 1102.
  • The program code for executing the solution of the present application is stored in the memory 1102.
  • Alternatively, the code corresponding to the method shown in FIG. 7 may be fixed into a chip by design-programming the processor, so that the chip performs the method shown in FIG. 7 when it runs.
  • The processor in the encoder 1100 and the decoder 900 of the embodiments of the present application may be a CPU, a DSP, an ASIC, or one or more integrated circuits for controlling execution of the program of the present application.
  • The one or more memories included in the computer system may be read-only memory (ROM) or other types of static storage devices that can store static information and instructions, random access memory (RAM) or other types of dynamic storage devices that can store information and instructions, or disk storage. These memories are connected to the processor via a bus, or may be connected to the processor via a dedicated connection line.
  • embodiments of the present application can be provided as a method, system, or computer program product. Therefore, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, embodiments of the present application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • Embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block of the flowcharts and/or block diagrams can be implemented by computer program instructions.
  • These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor produce an apparatus for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • The computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction apparatus that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

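This application discloses a method and an apparatus for encoding and decoding video images, to improve the compression efficiency of encoding and decoding. The method includes: parsing partition mode information of a coding tree node in a bitstream; obtaining, according to the partition mode information, the partition mode of the coding tree node from the candidate partition mode set of the coding tree node; parsing the bitstream to obtain coding information of the coding tree node; and reconstructing the pixel values of the coding tree node according to its partition mode information and coding information. A coding tree node can thus be split according to the partition modes in the candidate partition mode set, which reduces the number of partition levels and improves compression efficiency.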

Description

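Method and apparatus for encoding and decoding video images
This application claims priority to Chinese Patent Application No. 201710392690.3, filed with the Chinese Patent Office on May 27, 2017 and entitled "Method and apparatus for encoding and decoding video images", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of video image technologies, and in particular to a method and an apparatus for encoding and decoding video images.
Background
The H.265 video coding standard divides a picture into non-overlapping coding tree units (CTUs) and uses a quad-tree (QT) based CTU partitioning method: the CTU is the root node of the quad-tree and is recursively split, in quad-tree fashion, into a number of leaf nodes. A node corresponds to an image region. If a node is not split, it is a leaf node and its image region forms a coding unit (CU). If a node is split further, its image region is divided into four regions of equal size (each with half the width and half the height of the region being split), each region corresponds to one node, and whether each of these nodes is split further must be determined separately.
Because the QT structure in H.265 can only produce square CUs of different sizes, CUs cannot adapt well to textures of various shapes.
The Joint Exploration Model (JEM), the reference software of the Joint Exploration Team on Future Video Coding (JVET), adds a binary tree (BT) based partitioning scheme, in which a node can be further split into two nodes in binary-tree fashion.
Binary-tree and quad-tree partitioning can be cascaded, which is referred to as QTBT partitioning: for example, a CTU is first split by QT, and QT leaf nodes may be split further by BT. Although BT can produce rectangular CUs other than squares, each BT split turns a node into two nodes of half the size, so reaching small CUs requires too many partition levels. Moreover, QT and BT are cascaded: QT leaf nodes are split by BT, and BT leaf nodes cannot be split by QT again.
Summary
Embodiments of this application provide a method and an apparatus for encoding and decoding video images, to improve the compression efficiency of encoding and decoding.
The specific technical solutions provided in the embodiments of this application are as follows: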
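According to a first aspect, a method for decoding a video image is provided. A coding tree node represents a rectangular image region to be decoded in the video image, a lower-level node represents part of that rectangular image region, the image regions represented by different lower-level nodes do not overlap, and when the coding tree node is a coding unit of the video image, the coding tree node contains no lower-level nodes. The method includes:
parsing partition mode information of the coding tree node in a bitstream;
obtaining, according to the partition mode information, the partition mode of the coding tree node from a candidate partition mode set of the coding tree node, where the aspect ratio (width to height) of the coding tree node is 1 and its width is greater than a preset threshold, and the candidate partition mode set includes: a first partition mode, indicating that the coding tree node is a coding unit of the video image; a second partition mode, in which the coding tree node consists of two equal-sized lower-level nodes with an aspect ratio of 2; a third partition mode, in which the coding tree node consists of two equal-sized lower-level nodes with an aspect ratio of 0.5; and a fourth partition mode, in which the coding tree node consists of four equal-sized lower-level nodes with an aspect ratio of 1; and when a lower-level node has an aspect ratio of 1 and a width greater than the preset threshold, the lower-level node has the same candidate partition mode set as the coding tree node;
parsing the bitstream to obtain coding information of the coding tree node; and
reconstructing the pixel values of the coding tree node according to the partition mode information and the coding information of the coding tree node.
The beneficial effect is that this decoding scheme decodes coding tree nodes using a hybrid partition tree structure based on multiple partition modes. Compared with quad-tree partitioning it allows more CU shapes; compared with binary-tree partitioning it reduces the number of partition levels; and compared with quad-tree cascaded with binary-tree partitioning it simplifies the partition mode information, allows more ways of splitting, and can achieve higher compression efficiency.
With reference to the first aspect, in a possible design, the partition mode is used to determine at least one of the number, size, and distribution of the lower-level nodes constituting the coding tree node.
The beneficial effect is that the lower-level nodes constituting the coding tree node are determined by the partition mode, so that more CU shapes are obtained.
With reference to the first aspect, in a possible design, the width of the coding tree node is 4M pixels, where M is a positive integer; with the upper-left corner of the coding tree node as the origin, rightward as the positive horizontal direction and downward as the positive vertical direction, the candidate partition mode set further includes: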
a fifth partition mode, in which the coding tree node consists of four 2M × 2M lower-level nodes (width × height, in pixels) with upper-left corners at (0,0), (2M,0), (0,2M) and (2M,2M); or
a sixth partition mode, in which the coding tree node consists of a 2M × 2M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 2M lower-level node at (2M,0), and a 4M × 2M lower-level node at (0,2M); or
a seventh partition mode, in which the coding tree node consists of a 4M × 2M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 2M lower-level node at (0,2M), and a 2M × 2M lower-level node at (2M,2M); or
an eighth partition mode, in which the coding tree node consists of a 2M × 2M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 2M lower-level node at (0,2M), and a 2M × 4M lower-level node at (2M,0); or
a ninth partition mode, in which the coding tree node consists of a 2M × 4M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 2M lower-level node at (2M,0), and a 2M × 2M lower-level node at (2M,2M); or
a tenth partition mode, in which the coding tree node consists of a 2M × 2M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 2M lower-level node at (2M,0), a 4M × M lower-level node at (0,2M), and a 4M × M lower-level node at (0,3M); or
an eleventh partition mode, in which the coding tree node consists of a 4M × M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 4M × M lower-level node at (0,M), a 2M × 2M lower-level node at (0,2M), and a 2M × 2M lower-level node at (2M,2M); or
a twelfth partition mode, in which the coding tree node consists of a 2M × 2M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 2M lower-level node at (0,2M), an M × 4M lower-level node at (2M,0), and an M × 4M lower-level node at (3M,0); or
a thirteenth partition mode, in which the coding tree node consists of an M × 4M lower-level node (width × height, in pixels) with upper-left corner (0,0), an M × 4M lower-level node at (M,0), a 2M × 2M lower-level node at (2M,0), and a 2M × 2M lower-level node at (2M,2M); or
a fourteenth partition mode, in which the coding tree node consists of four 4M × M lower-level nodes (width × height, in pixels) with upper-left corners at (0,0), (0,M), (0,2M) and (0,3M); or
a fifteenth partition mode, in which the coding tree node consists of four M × 4M lower-level nodes (width × height, in pixels) with upper-left corners at (0,0), (M,0), (2M,0) and (3M,0); or
a sixteenth partition mode, in which the coding tree node consists of a 4M × M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 4M × 2M lower-level node at (0,M), and a 4M × M lower-level node at (0,3M); or
a seventeenth partition mode, in which the coding tree node consists of an M × 4M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 4M lower-level node at (M,0), and an M × 4M lower-level node at (3M,0).
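The beneficial effect is that, because the partition modes may include the multiple modes above, the coding tree node can be split according to any of them, which gives few partition levels and many CU shapes.
With reference to the first aspect, in a possible design, the partition mode being used to determine the lower-level nodes constituting the coding tree node further includes:
the partition mode is used to determine the decoding order of the multiple lower-level nodes constituting the coding tree node;
correspondingly, the fourth partition mode includes a first sub-mode, in which the four equal-sized lower-level nodes with an aspect ratio of 1 are decoded in clockwise order, and a second sub-mode, in which they are decoded in counterclockwise order, where the candidate mode set includes at least the first sub-mode of the fourth partition mode.
The beneficial effect is that the decoding order of the multiple lower-level nodes is determined by the partition mode, which improves decoding efficiency.
With reference to the first aspect, in a possible design, the partition mode information is represented by a first syntax element, which indicates the identifier of the obtained partition mode in the candidate partition mode set.
In this approach the coding tree node is split into lower-level nodes according to the partition mode indicated by the first syntax element; specifically, it may be left unsplit, or split into 2, 3 or 4 lower-level nodes.
With reference to the first aspect, in a possible design, the partition mode information is represented by a second syntax element and a third syntax element. The second syntax element indicates whether the obtained partition mode is the first partition mode; when the second syntax element indicates that it is not, the third syntax element indicates the identifier of the obtained partition mode in the candidate partition mode set excluding the first partition mode.
In this approach the coding tree node is split into lower-level nodes according to the partition mode indicated by the third syntax element; specifically, it may be split into 2, 3 or 4 lower-level nodes.
With reference to the first aspect, in a possible design, parsing the bitstream to obtain the coding information of the coding tree node includes:
when the obtained partition mode is not the first partition mode, parsing the bitstream to obtain coding information of the lower-level nodes of the coding tree node, where, when a lower-level node has an aspect ratio of 1 and a width greater than the preset threshold, its coding information includes the partition mode information of that lower-level node;
correspondingly, reconstructing the pixel values of the coding tree node according to its partition mode information and coding information includes:
when the obtained partition mode is not the first partition mode, reconstructing the pixel values of the lower-level nodes according to their coding information.
The beneficial effect is that, because the coding tree node and its lower-level nodes can be split according to multiple partition modes, there are few partition levels and many CU shapes.
With reference to the first aspect, in a possible design, before parsing the partition mode information of the coding tree node in the bitstream, the method further includes:
parsing indication information of the candidate partition mode set in the bitstream, the indication information indicating which partition modes the candidate partition mode set contains.
The beneficial effect is that the partition modes contained in the candidate partition mode set are determined by the indication information, so that multiple possible candidate partition mode sets can be configured flexibly.
With reference to the first aspect, in a possible design, before parsing the partition mode information of the coding tree node in the bitstream, the method further includes:
determining that the coding tree node lies within the picture boundaries of the video image.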
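According to a second aspect, a method for encoding a video image is provided, including:
for a coding tree node in the video image, encoding according to at least one partition mode in the configured candidate partition mode set of the coding tree node, to obtain a rate-distortion cost for each of the at least one partition mode, the rate-distortion cost being the sum of the rate-distortion costs of all coding units obtained under the corresponding partition mode, where the aspect ratio of the coding tree node is 1 and its width is greater than a preset threshold, and the candidate partition mode set includes: a first partition mode, indicating that the coding tree node is a basic coding unit of the video image; a second partition mode, in which the coding tree node consists of two equal-sized lower-level nodes with an aspect ratio of 2; a third partition mode, in which the coding tree node consists of two equal-sized lower-level nodes with an aspect ratio of 0.5; and a fourth partition mode, in which the coding tree node consists of four equal-sized lower-level nodes with an aspect ratio of 1; and when a lower-level node has an aspect ratio of 1 and a width greater than the preset threshold, the lower-level node has the same candidate partition mode set as the coding tree node;
determining the partition mode with the minimum rate-distortion cost as the target partition mode of the coding tree node; and
determining, based on the target partition mode, the coding units constituting the coding tree node, and encoding the coding units to obtain the bitstream and the reconstructed image corresponding to the coding tree node.
In this encoding scheme the coding tree node can be split according to the configured candidate partition mode set, which gives few partition levels and many CU shapes. In addition, the candidate partition mode set can be configured: when it allows more partition modes, the encoder may try more modes and compression performance is better; when it allows fewer partition modes, the encoder tries fewer modes and computational complexity is lower.
With reference to the second aspect, in a possible design, determining, based on the target partition mode, the coding units constituting the coding tree node includes:
determining, according to the target partition mode of the coding tree node, the N child nodes constituting the coding tree node;
when the N child nodes include a square node and the width of the image region corresponding to the square node is greater than the preset threshold, encoding according to at least one partition mode in the candidate partition mode set, to obtain a rate-distortion cost for each of the at least one partition mode;
determining the partition mode with the minimum rate-distortion cost as the target partition mode of the square node; and
determining, according to the target partition mode of the square node, the next-level square nodes constituting the square node, until no next-level square node exists or the width of the image region corresponding to the determined next-level square node equals the preset threshold, to obtain the coding units constituting the coding tree node.
The beneficial effect is that the child nodes of the coding tree node can be split according to the configured candidate partition mode set, which gives few partition levels and many CU shapes.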
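According to a third aspect, an apparatus for decoding a video image is provided. A coding tree node represents a rectangular image region to be decoded in the video image, a lower-level node represents part of that rectangular image region, the image regions represented by different lower-level nodes do not overlap, and when the coding tree node is a coding unit of the video image, the coding tree node contains no lower-level nodes. The apparatus includes:
a parsing unit, configured to: parse partition mode information of the coding tree node in a bitstream; obtain, according to the partition mode information, the partition mode of the coding tree node from a candidate partition mode set of the coding tree node, where the aspect ratio of the coding tree node is 1 and its width is greater than a preset threshold, and the candidate partition mode set includes a first partition mode, indicating that the coding tree node is a coding unit of the video image, a second partition mode, in which the coding tree node consists of two equal-sized lower-level nodes with an aspect ratio of 2, a third partition mode, in which the coding tree node consists of two equal-sized lower-level nodes with an aspect ratio of 0.5, and a fourth partition mode, in which the coding tree node consists of four equal-sized lower-level nodes with an aspect ratio of 1, and when a lower-level node has an aspect ratio of 1 and a width greater than the preset threshold, the lower-level node has the same candidate partition mode set as the coding tree node; and parse the bitstream to obtain coding information of the coding tree node; and
a decoding unit, configured to reconstruct the pixel values of the coding tree node according to the partition mode information and the coding information of the coding tree node.
With reference to the third aspect, in a possible design, the partition mode is used to determine at least one of the number, size, and distribution of the lower-level nodes constituting the coding tree node.
With reference to the third aspect, in a possible design, the width of the coding tree node is 4M pixels, where M is a positive integer; with the upper-left corner of the coding tree node as the origin, rightward as the positive horizontal direction and downward as the positive vertical direction, the candidate partition mode set further includes: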
a fifth partition mode, in which the coding tree node consists of four 2M × 2M lower-level nodes (width × height, in pixels) with upper-left corners at (0,0), (2M,0), (0,2M) and (2M,2M); or
a sixth partition mode, in which the coding tree node consists of a 2M × 2M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 2M lower-level node at (2M,0), and a 4M × 2M lower-level node at (0,2M); or
a seventh partition mode, in which the coding tree node consists of a 4M × 2M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 2M lower-level node at (0,2M), and a 2M × 2M lower-level node at (2M,2M); or
an eighth partition mode, in which the coding tree node consists of a 2M × 2M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 2M lower-level node at (0,2M), and a 2M × 4M lower-level node at (2M,0); or
a ninth partition mode, in which the coding tree node consists of a 2M × 4M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 2M lower-level node at (2M,0), and a 2M × 2M lower-level node at (2M,2M); or
a tenth partition mode, in which the coding tree node consists of a 2M × 2M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 2M lower-level node at (2M,0), a 4M × M lower-level node at (0,2M), and a 4M × M lower-level node at (0,3M); or
an eleventh partition mode, in which the coding tree node consists of a 4M × M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 4M × M lower-level node at (0,M), a 2M × 2M lower-level node at (0,2M), and a 2M × 2M lower-level node at (2M,2M); or
a twelfth partition mode, in which the coding tree node consists of a 2M × 2M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 2M lower-level node at (0,2M), an M × 4M lower-level node at (2M,0), and an M × 4M lower-level node at (3M,0); or
a thirteenth partition mode, in which the coding tree node consists of an M × 4M lower-level node (width × height, in pixels) with upper-left corner (0,0), an M × 4M lower-level node at (M,0), a 2M × 2M lower-level node at (2M,0), and a 2M × 2M lower-level node at (2M,2M); or
a fourteenth partition mode, in which the coding tree node consists of four 4M × M lower-level nodes (width × height, in pixels) with upper-left corners at (0,0), (0,M), (0,2M) and (0,3M); or
a fifteenth partition mode, in which the coding tree node consists of four M × 4M lower-level nodes (width × height, in pixels) with upper-left corners at (0,0), (M,0), (2M,0) and (3M,0); or
a sixteenth partition mode, in which the coding tree node consists of a 4M × M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 4M × 2M lower-level node at (0,M), and a 4M × M lower-level node at (0,3M); or
a seventeenth partition mode, in which the coding tree node consists of an M × 4M lower-level node (width × height, in pixels) with upper-left corner (0,0), a 2M × 4M lower-level node at (M,0), and an M × 4M lower-level node at (3M,0).
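With reference to the third aspect, in a possible design, the partition mode being used to determine the lower-level nodes constituting the coding tree node further includes:
the partition mode is used to determine the decoding order of the multiple lower-level nodes constituting the coding tree node;
correspondingly, the fourth partition mode includes a first sub-mode, in which the four equal-sized lower-level nodes with an aspect ratio of 1 are decoded in clockwise order, and a second sub-mode, in which they are decoded in counterclockwise order, where the candidate mode set includes at least the first sub-mode of the fourth partition mode.
With reference to the third aspect, in a possible design, the partition mode information is represented by a first syntax element, which indicates the identifier of the obtained partition mode in the candidate partition mode set.
With reference to the third aspect, in a possible design, the partition mode information is represented by a second syntax element and a third syntax element. The second syntax element indicates whether the obtained partition mode is the first partition mode; when the second syntax element indicates that it is not, the third syntax element indicates the identifier of the obtained partition mode in the candidate partition mode set excluding the first partition mode.
With reference to the third aspect, in a possible design, when parsing the bitstream to obtain the coding information of the coding tree node, the parsing unit is specifically configured to:
when the obtained partition mode is not the first partition mode, parse the bitstream to obtain coding information of the lower-level nodes of the coding tree node, where, when a lower-level node has an aspect ratio of 1 and a width greater than the preset threshold, its coding information includes the partition mode information of that lower-level node;
correspondingly, when reconstructing the pixel values of the coding tree node according to its partition mode information and coding information, the decoding unit is specifically configured to:
when the obtained partition mode is not the first partition mode, reconstruct the pixel values of the lower-level nodes according to their coding information.
With reference to the third aspect, in a possible design, before parsing the partition mode information of the coding tree node in the bitstream, the parsing unit is further configured to:
parse indication information of the candidate partition mode set in the bitstream, the indication information indicating which partition modes the candidate partition mode set contains.
With reference to the third aspect, in a possible design, before parsing the partition mode information of the coding tree node in the bitstream, the parsing unit is further configured to:
determine that the coding tree node lies within the picture boundaries of the video image.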
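According to a fourth aspect, an apparatus for encoding a video image is provided, including:
a first encoding unit, configured to encode a coding tree node in the video image according to at least one partition mode in the configured candidate partition mode set of the coding tree node, to obtain a rate-distortion cost for each of the at least one partition mode, the rate-distortion cost being the sum of the rate-distortion costs of all coding units obtained under the corresponding partition mode, where the aspect ratio of the coding tree node is 1 and its width is greater than a preset threshold, and the candidate partition mode set includes a first partition mode, indicating that the coding tree node is a basic coding unit of the video image, a second partition mode, in which the coding tree node consists of two equal-sized lower-level nodes with an aspect ratio of 2, a third partition mode, in which the coding tree node consists of two equal-sized lower-level nodes with an aspect ratio of 0.5, and a fourth partition mode, in which the coding tree node consists of four equal-sized lower-level nodes with an aspect ratio of 1, and when a lower-level node has an aspect ratio of 1 and a width greater than the preset threshold, the lower-level node has the same candidate partition mode set as the coding tree node;
a determining unit, configured to determine the partition mode with the minimum rate-distortion cost as the target partition mode of the coding tree node, and to determine, based on the target partition mode, the coding units constituting the coding tree node; and
a second encoding unit, configured to encode the coding units to obtain the bitstream and the reconstructed image corresponding to the coding tree node.
With reference to the fourth aspect, in a possible design, when determining, based on the target partition mode, the coding units constituting the coding tree node, the determining unit is specifically configured to:
determine, according to the target partition mode of the coding tree node, the N child nodes constituting the coding tree node;
when the N child nodes include a square node and the width of the image region corresponding to the square node is greater than the preset threshold, encode according to at least one partition mode in the candidate partition mode set, to obtain a rate-distortion cost for each of the at least one partition mode;
determine the partition mode with the minimum rate-distortion cost as the target partition mode of the square node; and
determine, according to the target partition mode of the square node, the next-level square nodes constituting the square node, until no next-level square node exists or the width of the image region corresponding to the determined next-level square node equals the preset threshold, to obtain the coding units constituting the coding tree node.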
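According to a fifth aspect, a decoding device is provided, including a processor and a memory, where the memory stores a computer-readable program and the processor implements the decoding method of the first aspect by running the program in the memory.
According to a sixth aspect, an encoding device is provided, including a processor and a memory, where the memory stores a computer-readable program and the processor implements the encoding method of the second aspect by running the program in the memory.
According to a seventh aspect, a computer storage medium is provided, configured to store the computer software instructions for the first aspect or the second aspect, including a program designed to perform the above aspects.
It should be understood that the technical solutions of the third to seventh aspects of the embodiments of this application are consistent with those of the first and second aspects, and that the beneficial effects of each aspect and its corresponding implementable designs are similar; details are not repeated.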
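Brief Description of the Drawings
FIG. 1A and FIG. 1B are schematic block diagrams of a video codec apparatus or electronic device;
FIG. 2 is a schematic block diagram of a video codec system;
FIG. 3 is a schematic diagram of node partitioning in the QTBT scheme;
FIG. 4 is a flowchart of a method for decoding a video image according to an embodiment of this application;
FIG. 5 is a schematic diagram of partition modes according to an embodiment of this application;
FIG. 6 is a schematic diagram of the decoding process for at least one CTU in a video image;
FIG. 7 is a flowchart of a method for encoding a video image according to an embodiment of this application;
FIG. 8 is a structural diagram of an apparatus for decoding a video image according to an embodiment of this application;
FIG. 9 is a structural diagram of a decoder for a video image according to an embodiment of this application;
FIG. 10 is a structural diagram of an apparatus for encoding a video image according to an embodiment of this application;
FIG. 11 is a structural diagram of an encoder for a video image according to an embodiment of this application.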
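Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings.
Some terms used in this application are explained first, to aid understanding by a person skilled in the art.
CTU: a picture consists of multiple CTUs. A CTU usually corresponds to a square image region and contains the luma pixels and chroma pixels of that region (or only luma pixels, or only chroma pixels). A CTU also contains syntax elements that indicate how the CTU is split into at least one CU and how each coding unit is decoded to obtain the reconstructed image.
CU: the most basic unit for coding, which cannot be split further. A CU usually corresponds to an A×B rectangular region containing A×B luma pixels and the corresponding chroma pixels, where A is the width and B is the height of the rectangle. A and B may be equal or different and are usually integer powers of 2, for example 256, 128, 64, 32, 16, 8 or 4. Decoding a CU yields the reconstructed image of an A×B rectangular region; decoding usually includes prediction, inverse quantization and inverse transform, which produce a predicted image and a residual that are added to obtain the reconstructed image.
The size of a CTU is, for example, 64×64, 128×128 or 256×256. A CTU is split into a group of non-overlapping CUs that together cover the whole CTU; a group includes one or more CUs. A CU contains N rows by N columns of luma pixels; or N rows by N columns of chroma pixels; or N rows by N columns of luma pixels and N/2 rows by N/2 columns of chroma pixels (for example, the YUV420 format); or N rows by N columns of luma pixels and N rows by N columns of chroma pixels (for example, the YUV444 format); or N rows by N columns of RGB pixels (for example, the RGB format).
Quad-tree: a tree structure in which a node can be split into four child nodes. The H.265 video coding standard uses quad-tree based CTU partitioning with the CTU as the root node, each node corresponding to a square region. A node may be left unsplit (its region is then a CU) or split into four next-level nodes, that is, the square region is divided into four square regions of equal size (each with half the width and half the height of the region before splitting), each corresponding to one node.
Binary tree: a tree structure in which a node can be split into two child nodes. In existing coding methods that use a binary tree, a node of the binary tree may be left unsplit or split into two next-level nodes. There are two ways of splitting into two nodes: 1) horizontal split, which divides the region of the node into an upper and a lower region of equal size, each corresponding to one node; 2) vertical split, which divides the region of the node into a left and a right region of equal size, each corresponding to one node.
Video decoding: the process of restoring a video bitstream to reconstructed images according to specific syntax rules and processing methods.
Video encoding: the process of compressing a sequence of images into a bitstream.
JEM: the new codec reference software developed by JVET after the H.265 standard.
In addition, "multiple" in this application means two or more.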
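FIG. 1A is a schematic block diagram of a video codec apparatus or electronic device 50, which may incorporate a codec according to an embodiment of this application. FIG. 1B is a schematic diagram of an apparatus for video coding according to an embodiment of this application. The units in FIG. 1A and FIG. 1B are described below.
The electronic device 50 may be, for example, a mobile terminal or user equipment of a wireless communication system. It should be understood that embodiments of this application may be implemented in any electronic device or apparatus that may need to encode and decode, or encode, or decode video images.
The apparatus 50 may include a housing for incorporating and protecting the device. The apparatus 50 may further include a display 32 in the form of a liquid crystal display. In other embodiments of this application, the display may be any display technology suitable for displaying images or video. The apparatus 50 may further include a keypad 34. In other embodiments, any suitable data or user interface mechanism may be used; for example, the user interface may be implemented as a virtual keyboard or a data entry system forming part of a touch-sensitive display. The apparatus may include a microphone 36 or any suitable audio input, which may be a digital or analog signal input. The apparatus 50 may further include an audio output device, which in embodiments of this application may be any one of an earpiece 38, a speaker, or an analog or digital audio output connection. The apparatus 50 may also include a battery 40; in other embodiments, the device may be powered by any suitable mobile energy device, such as a solar cell, a fuel cell or a clockwork generator. The apparatus may further include an infrared port 42 for short-range line-of-sight communication with other devices. In other embodiments, the apparatus 50 may further include any suitable short-range communication solution, such as a Bluetooth wireless connection or a USB/FireWire wired connection.
The apparatus 50 may include a controller 56 or processor for controlling the apparatus 50. The controller 56 may be connected to a memory 58, which in embodiments of this application may store data in the form of images and audio data, and/or may store instructions to be executed on the controller 56. The controller 56 may further be connected to a codec circuit 54 suitable for encoding and decoding audio and/or video data, or for assisting encoding and decoding performed by the controller 56.
The apparatus 50 may further include a card reader 48 and a smart card 46 for providing user information and suitable for providing authentication information used to authenticate and authorize the user on a network.
The apparatus 50 may further include a radio interface circuit 52, which is connected to the controller and is suitable for generating wireless communication signals, for example for communication with a cellular communication network, a wireless communication system or a wireless local area network. The apparatus 50 may further include an antenna 44, which is connected to the radio interface circuit 52 to send radio frequency signals generated by the radio interface circuit 52 to one or more other apparatuses and to receive radio frequency signals from one or more other apparatuses.
In some embodiments of this application, the apparatus 50 includes a camera capable of recording or detecting individual frames, which the codec 54 or the controller receives and processes. In some embodiments, the apparatus may receive video image data to be processed from another device before transmission and/or storage. In some embodiments, the apparatus 50 may receive images for encoding/decoding over a wireless or wired connection.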
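FIG. 2 is a schematic block diagram of another video codec system 10 according to an embodiment of this application. As shown in FIG. 2, the video codec system 10 includes a source apparatus 12 and a destination apparatus 14. The source apparatus 12 generates encoded video data and may therefore be called a video encoding apparatus or video encoding device. The destination apparatus 14 can decode the encoded video data generated by the source apparatus 12 and may therefore be called a video decoding apparatus or video decoding device. The source apparatus 12 and the destination apparatus 14 may be examples of video codec apparatuses or devices. They may include a wide range of apparatuses, including desktop computers, mobile computing devices, notebook (for example, laptop) computers, tablet computers, set-top boxes, handsets such as smartphones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, and the like.
The destination apparatus 14 may receive the encoded video data from the source apparatus 12 via a channel 16. The channel 16 may include one or more media and/or apparatuses capable of moving the encoded video data from the source apparatus 12 to the destination apparatus 14. In one example, the channel 16 may include one or more communication media that enable the source apparatus 12 to transmit the encoded video data directly to the destination apparatus 14 in real time. In this example, the source apparatus 12 may modulate the encoded video data according to a communication standard (for example, a wireless communication protocol) and transmit the modulated video data to the destination apparatus 14. The one or more communication media may include wireless and/or wired communication media, such as the radio frequency (RF) spectrum or one or more physical transmission lines. They may form part of a packet-based network (for example, a local area network, a wide area network, or a global network such as the Internet), and may include routers, switches, base stations, or other devices that facilitate communication from the source apparatus 12 to the destination apparatus 14.
In another example, the channel 16 may include a storage medium that stores the encoded video data generated by the source apparatus 12. In this example, the destination apparatus 14 may access the storage medium via disk access or card access. The storage medium may include a variety of locally accessed data storage media, such as Blu-ray discs, DVDs, CD-ROMs, flash memory, or other suitable digital storage media for storing encoded video data.
In another example, the channel 16 may include a file server or another intermediate storage apparatus that stores the encoded video data generated by the source apparatus 12. In this example, the destination apparatus 14 may access the encoded video data stored at the file server or other intermediate storage apparatus via streaming or download. The file server may be a type of server capable of storing encoded video data and transmitting it to the destination apparatus 14. Example file servers include web servers (for example, for websites), File Transfer Protocol (FTP) servers, network attached storage (NAS) apparatuses, and local disk drives.
The destination apparatus 14 may access the encoded video data via a standard data connection (for example, an Internet connection). Example types of data connection include wireless channels (for example, Wi-Fi connections), wired connections (for example, DSL or cable modems), or combinations of both, suitable for accessing encoded video data stored on a file server. The transmission of encoded video data from the file server may be streaming, download, or a combination of both.
The techniques of this application are not limited to wireless scenarios. For example, they may be applied to video coding in support of a variety of multimedia applications such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, streaming video transmissions (for example, via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications. In some examples, the video codec system 10 may be configured to support one-way or two-way video transmission for applications such as video streaming, video playback, video broadcasting and/or video telephony.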
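In the example of FIG. 2, the source apparatus 12 includes a video source 18, a video encoder 20 and an output interface 22. In some examples, the output interface 22 may include a modulator/demodulator (modem) and/or a transmitter. The video source 18 may include a video capture apparatus (for example, a video camera), a video archive containing previously captured video data, a video input interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of these sources.
The video encoder 20 may encode video data from the video source 18. In some examples, the source apparatus 12 transmits the encoded video data directly to the destination apparatus 14 via the output interface 22. The encoded video data may also be stored on a storage medium or file server for later access by the destination apparatus 14 for decoding and/or playback.
In the example of FIG. 2, the destination apparatus 14 includes an input interface 28, a video decoder 30 and a display 32. In some examples, the input interface 28 includes a receiver and/or a modem. The input interface 28 may receive the encoded video data via the channel 16. The display 32 may be integrated with the destination apparatus 14 or may be external to it. In general, the display 32 displays the decoded video data. The display 32 may include a variety of display apparatuses, such as a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or another type of display apparatus.
The video encoder 20 and the video decoder 30 may operate according to a video compression standard (for example, the High Efficiency Video Coding H.265 standard) and may conform to the HEVC Test Model (HM). The text of the H.265 standard, ITU-T H.265 (V3) (04/2015), was published on April 29, 2015 and can be downloaded from http://handle.itu.int/11.1002/1000/12455; the entire content of that document is incorporated herein by reference.
Alternatively, the video encoder 20 and the video decoder 30 may operate according to other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including the Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions. It should be understood that the techniques of this application are not limited to any particular coding standard or technology.
Furthermore, FIG. 2 is only an example, and the techniques of this application may be applied to video coding applications that do not necessarily include any data communication between an encoding apparatus and a decoding apparatus (for example, one-sided video encoding or video decoding). In other examples, data is retrieved from local memory, streamed over a network, or handled in a similar manner. An encoding apparatus may encode data and store it in memory, and/or a decoding apparatus may retrieve data from memory and decode it. In many examples, encoding and decoding are performed by multiple apparatuses that do not communicate with one another but only encode data to memory and/or retrieve data from memory and decode it.
The video encoder 20 and the video decoder 30 may each be implemented as any of a variety of suitable circuits, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the techniques are implemented partly or wholly in software, an apparatus may store the software instructions in a suitable non-transitory computer-readable storage medium and may execute the instructions in hardware using one or more processors to perform the techniques of this application. Any of the foregoing (including hardware, software, a combination of hardware and software, and so on) may be regarded as one or more processors. Each of the video encoder 20 and the video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in another apparatus.
This application may generally refer to the video encoder 20 "signaling" certain information to another apparatus (for example, the video decoder 30). The term "signaling" may generally refer to the communication of syntax elements and/or other data representing encoded video data. This communication may occur in real time or near real time. Alternatively, it may occur over a span of time, for example when syntax elements are stored in encoded binary form on a computer-readable storage medium at encoding time; the syntax elements can then be retrieved by the decoding apparatus at any time after being stored on that medium.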
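The QT structure in the H.265 standard can only produce square CUs of different sizes, so CUs cannot adapt well to textures of various shapes. The QTBT structure in JEM cascades QT with BT, referred to as QTBT partitioning: for example, a CTU is first split by QT, and QT leaf nodes may be split further by BT, as shown in FIG. 3. In the right part of FIG. 3, each endpoint represents a node; four solid lines leaving a node denote a quad-tree split and two dashed lines denote a binary-tree split; a to m are 13 leaf nodes, each corresponding to one CU; on a binary-tree node, 1 denotes a vertical split and 0 a horizontal split. A CTU split as in the right part becomes the 13 CUs a to m shown in the left part of FIG. 3. In QTBT partitioning, each CU has a QT depth (quad-tree depth) and a BT depth (binary-tree depth): the QT depth is the QT depth of the QT leaf node to which the CU belongs, and the BT depth is the BT depth of the BT leaf node to which the CU belongs. For example, in FIG. 3, a and b have QT depth 1 and BT depth 2; c, d and e have QT depth 1 and BT depth 1; f, k and l have QT depth 2 and BT depth 1; i and j have QT depth 2 and BT depth 0; g and h have QT depth 2 and BT depth 2; and m has QT depth 1 and BT depth 0. If a CTU is split into only one CU, that CU has QT depth 0 and BT depth 0.
Although the QT-cascaded-with-BT scheme can produce rectangular CUs other than squares through BT, each BT split turns a node into two nodes of half the size, so reaching small CUs requires too many partition levels. Moreover, QT and BT are cascaded: QT leaf nodes are split by BT, and BT leaf nodes cannot be split by QT again.
In view of these problems in video image coding, embodiments of this application provide a method and an apparatus for encoding and decoding video images, to improve the compression efficiency of encoding and decoding. The method and the apparatus are based on the same inventive concept; because they solve the problem on similar principles, the implementations of the apparatus and the method may refer to each other, and repeated content is not described again.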
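FIG. 4 is a schematic flowchart of the method for decoding a video image provided in an embodiment of this application. The procedure may be implemented in hardware, in software, or in a combination of both.
A decoder may be configured to perform the procedure shown in FIG. 4. The functional modules of the decoder that perform the decoding scheme provided in the embodiments of this application may be implemented in hardware, in software, or in a combination of both; the hardware may include one or more signal processing circuits and/or application-specific integrated circuits.
As shown in FIG. 4, the procedure includes the following processing:
It should be noted that, in the embodiments of this application, a coding tree node represents a rectangular image region to be decoded in the video image, a lower-level node represents part of that rectangular image region, and the image regions represented by different lower-level nodes do not overlap. When the coding tree node is a coding unit of the video image, it contains no lower-level nodes, because a coding unit is the most basic unit for coding and cannot be split further.
Step 40: parse the partition mode information of the coding tree node in the bitstream.
It should be noted that the partition mode information of the coding tree node indicates one partition mode in the candidate partition mode set of the coding tree node.
Optionally, before the partition mode information of the coding tree node is parsed from the bitstream, the method further includes:
parsing indication information of the candidate partition mode set in the bitstream, the indication information indicating which partition modes the candidate partition mode set contains.
Further, before the partition mode information of the coding tree node is parsed from the bitstream, it must also be determined that the coding tree node lies within the picture boundaries of the video image.
Specifically, the partition mode information of a coding tree node can be expressed in syntax in the following two ways:
In one possible implementation, the partition mode information is represented by a first syntax element, which indicates the identifier of the obtained partition mode in the candidate partition mode set.
In this approach the coding tree node is split into lower-level nodes according to the partition mode indicated by the first syntax element; specifically, it may be left unsplit, or split into 2, 3 or 4 lower-level nodes.
In another possible implementation, the partition mode information is represented by a second syntax element and a third syntax element. The second syntax element indicates whether the obtained partition mode is the first partition mode; when the second syntax element indicates that it is not, the third syntax element indicates the identifier of the obtained partition mode in the candidate partition mode set excluding the first partition mode.
In this approach the coding tree node is split into lower-level nodes according to the partition mode indicated by the third syntax element; specifically, it may be split into 2, 3 or 4 lower-level nodes.
Step 41: obtain, according to the partition mode information, the partition mode of the coding tree node from the candidate partition mode set of the coding tree node. The partition mode is used to determine the lower-level nodes constituting the coding tree node. The aspect ratio of the coding tree node is 1 and its width is greater than a preset threshold, and the candidate partition mode set includes: a first partition mode, indicating that the coding tree node is a coding unit of the video image; a second partition mode, in which the coding tree node consists of two equal-sized lower-level nodes with an aspect ratio of 2; a third partition mode, in which the coding tree node consists of two equal-sized lower-level nodes with an aspect ratio of 0.5; and a fourth partition mode, in which the coding tree node consists of four equal-sized lower-level nodes with an aspect ratio of 1. When a lower-level node has an aspect ratio of 1 and a width greater than the preset threshold, the lower-level node has the same candidate partition mode set as the coding tree node.
In another possible implementation, when a lower-level node has an aspect ratio of 1 and a width greater than the preset threshold, the candidate partition mode sets of the lower-level node and the coding tree node may be the same or different.
The partition mode being used to determine the lower-level nodes constituting the coding tree node includes: the partition mode is used to determine at least one of the number, size, and distribution of those lower-level nodes. Optionally, the partition mode is also used to determine the decoding order of the multiple lower-level nodes. FIG. 5 is a schematic diagram of the partition modes of a coding tree node; the labels 1, 2, 3, 4 and so on in FIG. 5 indicate the decoding order of the lower-level nodes.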
The width of the coding tree node is 4M pixels, where M is a positive integer. With the upper-left corner of the coding tree node as the origin, rightward as the positive horizontal direction and downward as the positive vertical direction, and with sizes written as width × height in pixels, the partition modes are:
First partition mode: the coding tree node consists of a single 4M × 4M lower-level node with upper-left corner (0,0); see S1 in FIG. 5.
Second partition mode: the coding tree node consists of a 4M × 2M lower-level node at (0,0) and a 4M × 2M lower-level node at (0,2M); see S2 in FIG. 5.
Third partition mode: the coding tree node consists of a 2M × 4M lower-level node at (0,0) and a 2M × 4M lower-level node at (2M,0); see S3 in FIG. 5.
Fourth partition mode: the coding tree node consists of four 2M × 2M lower-level nodes at (0,0), (0,2M), (2M,0) and (2M,2M); see S4 in FIG. 5.
Fifth partition mode: the coding tree node consists of four 2M × 2M lower-level nodes at (0,0), (2M,0), (0,2M) and (2M,2M); see S5 in FIG. 5.
Sixth partition mode: the coding tree node consists of a 2M × 2M lower-level node at (0,0), a 2M × 2M lower-level node at (2M,0), and a 4M × 2M lower-level node at (0,2M); see S6 in FIG. 5.
Seventh partition mode: the coding tree node consists of a 4M × 2M lower-level node at (0,0), a 2M × 2M lower-level node at (0,2M), and a 2M × 2M lower-level node at (2M,2M); see S7 in FIG. 5.
Eighth partition mode: the coding tree node consists of a 2M × 2M lower-level node at (0,0), a 2M × 2M lower-level node at (0,2M), and a 2M × 4M lower-level node at (2M,0); see S8 in FIG. 5.
Ninth partition mode: the coding tree node consists of a 2M × 4M lower-level node at (0,0), a 2M × 2M lower-level node at (2M,0), and a 2M × 2M lower-level node at (2M,2M); see S9 in FIG. 5.
Tenth partition mode: the coding tree node consists of a 2M × 2M lower-level node at (0,0), a 2M × 2M lower-level node at (2M,0), a 4M × M lower-level node at (0,2M), and a 4M × M lower-level node at (0,3M); see S10 in FIG. 5.
Eleventh partition mode: the coding tree node consists of a 4M × M lower-level node at (0,0), a 4M × M lower-level node at (0,M), a 2M × 2M lower-level node at (0,2M), and a 2M × 2M lower-level node at (2M,2M); see S11 in FIG. 5.
Twelfth partition mode: the coding tree node consists of a 2M × 2M lower-level node at (0,0), a 2M × 2M lower-level node at (0,2M), an M × 4M lower-level node at (2M,0), and an M × 4M lower-level node at (3M,0); see S12 in FIG. 5.
Thirteenth partition mode: the coding tree node consists of an M × 4M lower-level node at (0,0), an M × 4M lower-level node at (M,0), a 2M × 2M lower-level node at (2M,0), and a 2M × 2M lower-level node at (2M,2M); see S13 in FIG. 5.
Fourteenth partition mode: the coding tree node consists of four 4M × M lower-level nodes at (0,0), (0,M), (0,2M) and (0,3M); see S14 in FIG. 5.
Fifteenth partition mode: the coding tree node consists of four M × 4M lower-level nodes at (0,0), (M,0), (2M,0) and (3M,0); see S15 in FIG. 5.
Sixteenth partition mode: the coding tree node consists of a 4M × M lower-level node at (0,0), a 4M × 2M lower-level node at (0,M), and a 4M × M lower-level node at (0,3M); see S16 in FIG. 5.
Seventeenth partition mode: the coding tree node consists of an M × 4M lower-level node at (0,0), a 2M × 4M lower-level node at (M,0), and an M × 4M lower-level node at (3M,0); see S17 in FIG. 5.
The candidate partition mode set includes the first, second, third and fourth partition modes and, optionally, at least one of the fifth to seventeenth partition modes.
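The seventeen partition modes can be written down as a small lookup table. The sketch below is illustrative only: the names `PARTITION_MODES`, `tiles_square` and `lower_level_nodes` are hypothetical, rectangles are given as (x, y, width, height) in units of M pixels, and for s8 and s15 it assumes the corner coordinates that make the lower-level nodes tile the 4M × 4M node without overlap.

```python
# Hypothetical table of the 17 partition modes; units are M pixels, order is decoding order.
PARTITION_MODES = {
    "s1": [(0, 0, 4, 4)],
    "s2": [(0, 0, 4, 2), (0, 2, 4, 2)],
    "s3": [(0, 0, 2, 4), (2, 0, 2, 4)],
    "s4": [(0, 0, 2, 2), (0, 2, 2, 2), (2, 0, 2, 2), (2, 2, 2, 2)],
    "s5": [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 2, 2), (2, 2, 2, 2)],
    "s6": [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 4, 2)],
    "s7": [(0, 0, 4, 2), (0, 2, 2, 2), (2, 2, 2, 2)],
    "s8": [(0, 0, 2, 2), (0, 2, 2, 2), (2, 0, 2, 4)],
    "s9": [(0, 0, 2, 4), (2, 0, 2, 2), (2, 2, 2, 2)],
    "s10": [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 4, 1), (0, 3, 4, 1)],
    "s11": [(0, 0, 4, 1), (0, 1, 4, 1), (0, 2, 2, 2), (2, 2, 2, 2)],
    "s12": [(0, 0, 2, 2), (0, 2, 2, 2), (2, 0, 1, 4), (3, 0, 1, 4)],
    "s13": [(0, 0, 1, 4), (1, 0, 1, 4), (2, 0, 2, 2), (2, 2, 2, 2)],
    "s14": [(0, 0, 4, 1), (0, 1, 4, 1), (0, 2, 4, 1), (0, 3, 4, 1)],
    "s15": [(0, 0, 1, 4), (1, 0, 1, 4), (2, 0, 1, 4), (3, 0, 1, 4)],
    "s16": [(0, 0, 4, 1), (0, 1, 4, 2), (0, 3, 4, 1)],
    "s17": [(0, 0, 1, 4), (1, 0, 2, 4), (3, 0, 1, 4)],
}


def tiles_square(rects, side=4):
    """Return True if the rectangles cover the side x side square exactly once."""
    covered = [[0] * side for _ in range(side)]
    for x, y, w, h in rects:
        for yy in range(y, y + h):
            for xx in range(x, x + w):
                if not (0 <= xx < side and 0 <= yy < side):
                    return False  # rectangle sticks out of the parent node
                covered[yy][xx] += 1
    return all(count == 1 for row in covered for count in row)


def lower_level_nodes(mode, x0, y0, m):
    """Scale a mode to pixels for a node whose upper-left corner is (x0, y0) and width is 4*m."""
    return [(x0 + x * m, y0 + y * m, w * m, h * m)
            for x, y, w, h in PARTITION_MODES[mode]]
```

For example, `lower_level_nodes("s6", 16, 32, 4)` gives the three lower-level nodes of a 16 × 16 node at (16, 32) under the sixth partition mode.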
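Based on their characteristics, the above partition modes can optionally be classified into the following six categories:
Category 1: the node is split into 2 rectangular child nodes with a long-side to short-side ratio of 2; this covers the second and third partition modes.
Category 2: the node is split into 4 square child nodes; this covers the fourth and fifth partition modes.
Category 3: the node is split into 2 square child nodes and 1 rectangular child node with a ratio of 2; this covers the sixth, seventh, eighth and ninth partition modes.
Category 4: the node is split into 2 square child nodes and 2 rectangular child nodes with a ratio of 4; this covers the tenth, eleventh, twelfth and thirteenth partition modes.
Category 5: the node is split into 4 rectangular child nodes with a ratio of 4; this covers the fourteenth and fifteenth partition modes.
Category 6: the node is split into 1 rectangular child node with a ratio of 2 and 2 rectangular child nodes with a ratio of 4; this covers the sixteenth and seventeenth partition modes.
If a coding tree node is split further, the group of partition modes it may use includes at least s2-s4 and may also include at least one of the five groups s5, s6-s9, s10-s13, s14-s15 and s16-s17. The candidate partition mode set can be chosen from 12 different combinations, specifically:
Combination 1: s2-s4
Combination 2: s2-s4, s6-s9
Combination 3: s2-s4, s6-s9, s16-s17
Combination 4: s2-s4, s5, s6-s9, s16-s17
Combination 5: s2-s4, s6-s9, s14-s15, s16-s17
Combination 6: s2-s4, s5, s6-s9, s14-s15, s16-s17
Combination 7: s2-s4, s6-s9, s10-s13, s14-s15
Combination 8: s2-s4, s6-s9, s10-s13, s14-s15, s16-s17
Combination 9: s2-s4, s5, s6-s9, s10-s13, s14-s15, s16-s17
Combination 10: s2-s4, s5, s6-s9
Combination 11: s2-s4, s5, s6-s9, s10-s13, s16-s17
Combination 12: s2-s4, s6-s9, s10-s13, s16-s17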
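Step 42: parse the bitstream to obtain the coding information of the coding tree node.
Step 43: reconstruct the pixel values of the coding tree node according to its partition mode information and coding information.
Specifically, parsing the bitstream to obtain the coding information of the coding tree node includes the following two possible implementations:
In one possible implementation, when the obtained partition mode is not the first partition mode, the bitstream is parsed to obtain the coding information of the lower-level nodes of the coding tree node, where, when a lower-level node has an aspect ratio of 1 and a width greater than the preset threshold, its coding information includes the partition mode information of that lower-level node;
correspondingly, reconstructing the pixel values of the coding tree node according to its partition mode information and coding information includes:
when the obtained partition mode is not the first partition mode, reconstructing the pixel values of the lower-level nodes according to their coding information.
In another possible implementation, when the obtained partition mode is the first partition mode, the bitstream is parsed to obtain the coding information of the coding tree node, where, when the node has an aspect ratio of 1 and a width equal to the preset threshold, or has an aspect ratio other than 1, its coding information does not include partition mode information of the node;
correspondingly, reconstructing the pixel values of the coding tree node according to its partition mode information and coding information includes:
when the obtained partition mode is the first partition mode, reconstructing the pixel values of the coding tree node according to the coding information of the node.
With the above method a coding tree node can be split into multiple CUs; the pixel values of each CU are then reconstructed from its coding information, thereby reconstructing the pixel values of the coding tree node.
Specifically, decoding a CU includes entropy decoding, inverse quantization, inverse transform, prediction, loop filtering and other processing. The process mainly includes:
1. Obtain coding information of the CU, such as the prediction mode, quantization parameter, transform coefficients and transform mode, through entropy decoding;
2. Select intra prediction or inter prediction according to the prediction mode, to obtain the predicted pixels of the CU;
3. If the CU has transform coefficients, perform inverse quantization and inverse transform on them according to the quantization parameter and transform mode, to obtain the reconstructed residual of the CU. If the CU has no transform coefficients, the reconstructed residual of the CU is 0, that is, the reconstructed residual of every pixel in the CU is 0.
4. Add the predicted pixels and the reconstructed residual and then apply loop filtering, to obtain the reconstructed pixel values of the CU.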
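Steps 3 and 4 can be illustrated with a minimal sketch that adds the predicted pixels to the reconstructed residual. The function name `reconstruct_cu` is hypothetical, entropy decoding, inverse transform and loop filtering are left out, and clipping the sum to the sample bit depth is an assumption of this sketch rather than something stated above.

```python
def reconstruct_cu(pred, residual=None, bit_depth=8):
    """Add prediction and reconstructed residual sample by sample, then clip."""
    max_val = (1 << bit_depth) - 1
    rows, cols = len(pred), len(pred[0])
    if residual is None:
        # No transform coefficients: every residual sample is 0.
        residual = [[0] * cols for _ in range(rows)]
    return [
        [min(max(pred[r][c] + residual[r][c], 0), max_val) for c in range(cols)]
        for r in range(rows)
    ]
```

A real decoder would apply loop filtering to the result before it is output or used as a reference.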
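Embodiment 1
FIG. 6 is a schematic diagram of the decoding process for at least one CTU in a video image. The specific steps are as follows:
Step 60: set the CTU as the root node of the coding tree, then go to step 61.
At this point the coding tree depth of the CTU is set to 0.
Step 61: if the node is square and its width is greater than the threshold TX, go to step 62; otherwise go to step 65.
Step 62: parse the partition mode information of the node, then go to step 63.
Step 63: if the partition mode of the node is the no-split mode (the first partition mode above), go to step 65; otherwise go to step 64.
If a node is non-square, or is square with a width equal to the threshold TX, the node does not need to be split; its partition mode is the no-split mode and the node corresponds to one CU.
A node is square when the width of its image region equals the height. The threshold TX may be set equal to the side length of the smallest CU; it can be parsed from the SPS in the bitstream, and its value is, for example, 4, 8 or 16.
Step 64: obtain the partition mode of the node from its candidate partition mode set and split the node into N child nodes, where N = 2, 3 or 4. For each child node, in the processing order indicated by the partition mode information, perform step 61 in turn to determine its further partitioning.
The coding tree depth of a child node is the coding tree depth of its parent node plus 1.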
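More specifically, the partition mode information of a coding tree node may be expressed and parsed in one of the following two ways:
Method 1: first parse a split flag, for example called split_flag. If split_flag is 0, the node is not split further and is confirmed as a leaf node. If split_flag is 1, the node is to be split; a partition mode index, for example called split_mode_idx, is then parsed to obtain the partition mode of the node. For example, if the partition mode combination of the node is combination 3 above (the 9 partition modes s2-s4, s6-s9 and s16-s17), a one-to-one mapping between split_mode_idx and the partition modes is established, for example as shown in Table 1 or Table 2.
Table 1
split_mode_idx    0    1    2    3    4    5    6    7     8
Partition mode    s2   s3   s4   s6   s7   s8   s9   s16   s17
Table 2
split_mode_idx    0    1    2    3    4    5    6    7     8
Partition mode    s4   s3   s2   s8   s9   s6   s7   s17   s16
Method 2: parse a syntax element split_mode whose value is an integer greater than or equal to 0. If split_mode = 0, the node does not need to be split; if split_mode > 0, the corresponding partition mode is obtained from split_mode. The binarization of split_mode for context-based adaptive binary arithmetic coding (CABAC) is shown in Table 3; looking up the parsed bin string in Table 3 gives the value of split_mode and the corresponding partition mode. In this example the partition mode combination of the node is combination 12 above, that is, s2-s4, s6-s13 and s16-s17, which together with the no-split mode s1 gives the 14 modes in Table 3.
Table 3
split_mode    Partition mode    Bin string
0             s1 (no split)     1
1             s4                01
2             s2                0011
3             s3                0001
4             s16               00101
5             s17               00001
6             s6                0010010
7             s7                0010000
8             s10               0010011
9             s11               0010001
10            s8                0000010
11            s9                0000000
12            s12               0000011
13            s13               0000001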
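The bin strings of Table 3 form a prefix code, so a decoder can read bins one at a time until the accumulated string matches a table entry. The sketch below assumes the bins have already been produced by the arithmetic decoder; `SPLIT_MODE_BINS` and `read_split_mode` are hypothetical names, not syntax from the application.

```python
# Bin string -> (split_mode value, partition mode), copied from Table 3.
SPLIT_MODE_BINS = {
    "1": (0, "s1"), "01": (1, "s4"), "0011": (2, "s2"), "0001": (3, "s3"),
    "00101": (4, "s16"), "00001": (5, "s17"),
    "0010010": (6, "s6"), "0010000": (7, "s7"),
    "0010011": (8, "s10"), "0010001": (9, "s11"),
    "0000010": (10, "s8"), "0000000": (11, "s9"),
    "0000011": (12, "s12"), "0000001": (13, "s13"),
}


def read_split_mode(bins):
    """Consume bins from an iterator until they match one Table 3 bin string."""
    prefix = ""
    for b in bins:
        prefix += str(b)
        if prefix in SPLIT_MODE_BINS:
            return SPLIT_MODE_BINS[prefix]
        if len(prefix) >= 7:  # longest bin string in Table 3
            break
    raise ValueError("bin string does not match Table 3: " + prefix)
```

Because `bins` is an iterator, consecutive calls continue where the previous call stopped, for example `it = iter("001001101")` followed by two calls to `read_split_mode(it)`.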
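Step 65: the node corresponds to one CU; parse the coding information of the CU.
This is prior art. The coding information of a CU is contained in a syntax structure, for example the coding_unit() syntax structure in H.265, and includes, for example, the prediction mode and transform coefficients.
By performing steps 61 to 65 recursively, the CTU (root node) is resolved into a set of leaf nodes, and the coding information of the CU corresponding to each leaf node is parsed.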
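The recursion of FIG. 6 can be sketched as follows. This is a simplified illustration: it assumes that a node in no-split mode goes to step 65 and any other mode goes to step 64, it supports only s2, s3 and s4 in its hypothetical `split` helper, and `read_mode` stands in for parsing the partition mode information from the bitstream.

```python
def split(mode, x, y, w, h):
    """Hypothetical helper: child rectangles of a square node for modes s2, s3, s4."""
    q = w // 4  # one quarter of the node width (M pixels)
    table = {
        "s2": [(0, 0, 4, 2), (0, 2, 4, 2)],
        "s3": [(0, 0, 2, 4), (2, 0, 2, 4)],
        "s4": [(0, 0, 2, 2), (0, 2, 2, 2), (2, 0, 2, 2), (2, 2, 2, 2)],
    }
    return [(x + a * q, y + b * q, c * q, d * q) for a, b, c, d in table[mode]]


def parse_coding_tree(x, y, w, h, tx, read_mode, leaves=None):
    """Return the list of CUs (x, y, width, height) of a node, in decoding order."""
    if leaves is None:
        leaves = []
    if w == h and w > tx:                     # step 61: square and wider than TX
        mode = read_mode(x, y, w, h)          # step 62: parse partition mode info
        if mode != "s1":                      # step 63: not the no-split mode
            for cx, cy, cw, ch in split(mode, x, y, w, h):   # step 64
                parse_coding_tree(cx, cy, cw, ch, tx, read_mode, leaves)
            return leaves
    leaves.append((x, y, w, h))               # step 65: the node is a CU
    return leaves
```

With a 16 × 16 CTU, TX = 4 and the parsed modes s4, s2, s1, s1, s1, the sketch yields two 8 × 4 CUs followed by three 8 × 8 CUs.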
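It is worth noting that one candidate partition mode set may be used for a slice segment, slice, picture or sequence, and that the candidate partition mode set may be predefined or signaled in the bitstream. For example, the candidate partition mode set may be predefined as combination 9 for I frames and combination 8 for non-I frames; or as combination 7 for all frame types; or the set allowed for a slice or slice segment may be signaled in the slice header or slice segment header; or the sets allowed for I frames, P frames and B frames of a sequence may be indicated in the sequence parameter set (SPS).
One option is to use combination 12 uniformly for all CTUs.
Another option is that the candidate partition mode set must contain s2-s4, and the SPS contains an N-bit syntax element A in which each bit indicates whether the candidate partition mode set contains some of the partition modes s5-s17. For example, syntax element A consists of 4 bits: bit 1 indicates whether the set also contains s6-s9, bit 2 whether it also contains s16-s17, bit 3 whether it also contains s10-s13, and bit 4 whether it also contains s5, s14 and s15. As another example, syntax element A consists of 3 bits: bit 1 indicates whether the set also contains s6-s9, bit 2 whether it also contains s10-s13, and bit 3 whether it also contains s14 and s15. Parsing syntax element A in the SPS gives the exact composition of the candidate partition mode set. The advantage of this option is that an encoder with limited computational complexity tries only some of the partition modes in its rate-distortion optimization decision; syntax element A can then be set accordingly, so that the decoder knows the candidate partition mode set actually used and the bit overhead of the partition mode information is reduced.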
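The 4-bit example of syntax element A can be sketched as a simple bit-to-group mapping. The names `OPTIONAL_GROUPS_4BIT` and `candidate_set_from_bits` are hypothetical, and treating the first list element as bit 1 is an assumption about bit order; the no-split mode s1 is included because it belongs to every candidate set.

```python
# Groups of optional partition modes controlled by bits 1..4 of syntax element A.
OPTIONAL_GROUPS_4BIT = [
    ["s6", "s7", "s8", "s9"],        # bit 1
    ["s16", "s17"],                  # bit 2
    ["s10", "s11", "s12", "s13"],    # bit 3
    ["s5", "s14", "s15"],            # bit 4
]


def candidate_set_from_bits(bits, groups=OPTIONAL_GROUPS_4BIT):
    """Build the candidate partition mode set from the bits of syntax element A."""
    if len(bits) != len(groups):
        raise ValueError("expected %d bits" % len(groups))
    modes = ["s1", "s2", "s3", "s4"]  # always present
    for bit, group in zip(bits, groups):
        if bit:
            modes.extend(group)
    return modes
```

For instance, the bits 1, 1, 0, 0 select s6-s9 and s16-s17 in addition to s2-s4, which corresponds to combination 3.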
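The syntax table of the syntax structure used for the partition mode information in Embodiment 1 is shown in Table 4. In Table 4, coding_tree() is the coding tree syntax structure, which describes how a coding tree node is split according to the partition tree based on multiple partition modes. Here x0 and y0 denote the horizontal and vertical offsets of the upper-left corner of the node (that is, of the region corresponding to the node) relative to the upper-left corner of the CTU (that is, of the region corresponding to the CTU), in units of 1 pixel; cuWidth and cuHeight denote the width and height of the node, in units of 1 pixel; minCUSize denotes the side length of the smallest CU (for example, 4); "…" denotes omitted syntax elements, syntax structures, variables or computations; ae(v) denotes decoding with CABAC; and condA stands for "whether the region corresponding to the node lies inside the picture" (condA is true if so and false otherwise). For example, condA may be x0+cuWidth≤picWidth && y0+cuHeight≤picHeight, where picWidth and picHeight denote the width and height of the picture, in units of 1 pixel.
Most CTUs in a picture lie completely inside the picture, so the nodes inside them also lie inside the picture and condA is always true for those nodes (that is, for these CTUs the condA check can be dropped when parsing split_flag). A small number of CTUs lie partly inside and partly outside the picture, so condA may be false for coding tree nodes inside them. When part of the region corresponding to a node is inside the picture and part is outside, condA is false; the node is then split with s4 by default, and neither the flag nor the partition mode index needs to be transmitted.
For a coding tree node, parsing its coding tree partition syntax structure coding_tree() gives its partitioning, as follows:
If the node is inside the picture, its width equals its height, and its side length is greater than the preset minimum CU side length, a split flag split_flag is parsed from the bitstream. Otherwise, split_flag is not present in the bitstream; in that case, if the width of the node is greater than the minimum CU side length, the value of split_flag defaults to 1 and the partition mode of the node defaults to s4; otherwise, the value defaults to 0 and the node is not split by default.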
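The presence and inference rule for split_flag can be sketched for square nodes as follows. The function name `split_flag_and_mode` and the `read_flag` callback are hypothetical, and the sketch deliberately covers only square nodes; it is not the syntax table itself.

```python
def split_flag_and_mode(in_picture, size, min_cu_size, read_flag):
    """Return (split_flag, inferred_mode) for a square node with side `size`.

    inferred_mode is None when the node is split and the partition mode
    still has to be parsed from the bitstream (for example split_mode_idx).
    """
    if in_picture and size > min_cu_size:
        flag = read_flag()            # split_flag is present in the bitstream
        return flag, (None if flag else "s1")
    if size > min_cu_size:
        return 1, "s4"                # node crosses the picture boundary: inferred s4 split
    return 0, "s1"                    # smallest CU size reached: inferred no split
```

The second branch corresponds to nodes for which condA is false, where neither the flag nor the partition mode index is transmitted.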
如果节点的split_flag等于0,则节点不再划分,此节点对应于一个CU,进入CU编码信息coding_unit()的解析;否则(split_flag等于1),节点将被继续划分,此时从码流中解析一个指示划分模式的语法元素split_mode_idx,根据split_mode_idx指示的划分模式对节点进行划分。例如,划分模式组合为上述组合二(即可选划分模式为s1-s4,s6-s9),则split_mode_idx等于0到6分别指示节点划分模式为s2,s3,s4,s6,s7,s8,s9;又例如划分模式组合为上述组合三(即可选划分模式为s1-s4,s6-s9,s16-s17),则split_mode_idx等于0到8分别指示节点划分模式为s2,s3,s4,s6,s7,s8,s9,s16,s17。
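If the node is split further (that is, split_flag[x0][y0] is not 0), the following processing is performed:
First, the number of nodes numNodeSplit produced by the partition mode of the node is obtained from split_mode_idx; getNumNodeSplit() denotes the corresponding processing, for example a table lookup that maps split_mode_idx to the number of nodes produced by the corresponding partition mode.
Then, for each node produced by the split in turn, the position (indicated by x1, y1) and shape (indicated by cuWidth1, cuHeight1) of the node are computed from the position (indicated by x0, y0), shape (indicated by cuWidth, cuHeight), and partition mode (indicated by split_mode_idx[x0][y0]) of its parent node; getNodeInfo() denotes the corresponding processing. Parsing of the coding tree then continues for the nodes produced by the split.
If the node stops splitting (that is, split_flag[x0][y0] is equal to 0), parsing of the CU coding information begins for this node (for example, parsing the prediction mode, residual, and other information of the CU); coding_unit() in Table 4 denotes the syntax structure of the CU coding information.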
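A table-driven form of getNumNodeSplit() and getNodeInfo() might look like the sketch below, restricted to the three basic split modes (two nodes of aspect ratio 2, two nodes of aspect ratio 0.5, and four square nodes). Keying the table by the mode numbers 2 to 4, the z-order of the four square children, and the Python function names are all assumptions made for illustration.

```python
# Hypothetical lookup-table sketch of getNumNodeSplit() / getNodeInfo().
# Child layout is given in units of half the parent size: (dx, dy, w, h).
HALF_UNIT_LAYOUT = {
    2: [(0, 0, 2, 1), (0, 1, 2, 1)],   # two nodes with aspect ratio 2
    3: [(0, 0, 1, 2), (1, 0, 1, 2)],   # two nodes with aspect ratio 0.5
    4: [(0, 0, 1, 1), (1, 0, 1, 1),    # four square nodes, z-order assumed
        (0, 1, 1, 1), (1, 1, 1, 1)],
}


def get_num_node_split(mode):
    """Number of child nodes produced by the given partition mode."""
    return len(HALF_UNIT_LAYOUT[mode])


def get_node_info(x0, y0, cu_width, cu_height, mode, i):
    """Return (x1, y1, cuWidth1, cuHeight1) of the i-th child node."""
    dx, dy, w, h = HALF_UNIT_LAYOUT[mode][i]
    half_w, half_h = cu_width // 2, cu_height // 2
    return x0 + dx * half_w, y0 + dy * half_h, w * half_w, h * half_h
```

A recursive parser would loop `i` from 0 to `get_num_node_split(mode) - 1` and call itself on each child rectangle.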
Table 4
Figure PCTCN2018079658-appb-000001
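FIG. 7 is a schematic flowchart of the video image encoding method provided in an embodiment of this application. The procedure may be implemented by hardware, by software programming, or by a combination of software and hardware.
An encoder may be configured to perform the procedure shown in FIG. 7. The functional modules in the encoder that carry out the video image encoding solution provided in the embodiments of this application may be implemented by hardware, by software programming, or by a combination of software and hardware; the hardware may include one or more signal processing circuits and/or application-specific integrated circuits.
As shown in FIG. 7, the procedure includes the following processing: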
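Step 70: Perform encoding according to at least one partition mode in the configured candidate partition mode set of the coding tree node, to obtain the rate-distortion cost corresponding to each of the at least one partition mode.
The rate-distortion cost is the sum of the rate-distortion costs of all coding units obtained according to the corresponding partition mode, where the aspect ratio of the coding tree node is 1 and the width of the coding tree node is greater than a preset threshold. The candidate partition mode set includes a first partition mode indicating that the coding tree node is a basic coding unit of the video image, a second partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 2, a third partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 0.5, and a fourth partition mode determining that the coding tree node consists of four equal-sized lower-level nodes each with an aspect ratio of 1. When the aspect ratio of a lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the lower-level node and the coding tree node have the same candidate partition mode set.
In another possible implementation, the candidate partition mode sets of the lower-level node and the coding tree node may be the same or different.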
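Step 71: Determine the partition mode with the smallest rate-distortion cost as the target partition mode of the coding tree node.
Step 72: Determine, based on the target partition mode, the coding units constituting the coding tree node, and encode the coding units to obtain the bitstream and reconstructed image corresponding to the coding tree node.
Specifically, determining, based on the target partition mode, the coding units constituting the coding tree node includes:
determining, according to the target partition mode of the coding tree node, the N child nodes constituting the coding tree node;
when the N child nodes include a square node and the width of the image region corresponding to the square node is greater than the preset threshold, performing encoding according to at least one partition mode in the candidate partition mode set, to obtain the rate-distortion cost corresponding to each of the at least one partition mode;
determining the partition mode with the smallest rate-distortion cost as the target partition mode of the square node; and
determining, according to the target partition mode of the square node, the next-level square nodes constituting the square node, until no next-level square node exists or the width of the image region corresponding to the determined next-level square node is equal to the preset threshold, to obtain the coding units constituting the coding tree node.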
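The recursive mode decision above can be illustrated with a toy rate-distortion search. The sketch is deliberately simplified: it considers only the first four partition modes, `leaf_cost` is a caller-supplied stand-in for the real cost of encoding a node as one CU, and `toy_cost` is an invented cost function used purely to exercise the recursion.

```python
# Hypothetical sketch of the recursive rate-distortion mode decision.

def children(w, h, mode):
    """Child sizes for the second to fourth partition modes of a square node."""
    if mode == 2:
        return [(w, h // 2)] * 2       # two nodes with aspect ratio 2
    if mode == 3:
        return [(w // 2, h)] * 2       # two nodes with aspect ratio 0.5
    if mode == 4:
        return [(w // 2, h // 2)] * 4  # four square nodes
    raise ValueError(f"unsupported mode {mode}")


def best_partition(w, h, threshold, leaf_cost, modes=(2, 3, 4)):
    """Return (target_mode, cost); mode 1 means the node is a single CU."""
    best_mode, best_cost = 1, leaf_cost(w, h)
    if w != h or w <= threshold:
        return best_mode, best_cost    # only large square nodes may split
    for mode in modes:
        # Cost of a split mode = sum of the best costs of its child nodes.
        cost = sum(best_partition(cw, ch, threshold, leaf_cost, modes)[1]
                   for cw, ch in children(w, h, mode))
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


def toy_cost(w, h):
    """Invented cost that penalises large and elongated CUs."""
    return w * h * max(w, h) + 10
```

A real encoder would obtain each leaf cost by actually predicting, transforming, and entropy coding the CU, and would usually prune the search rather than evaluate every mode exhaustively.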
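The way a coding tree node is partitioned into CUs can be represented by the combination of the syntax elements split_flag and split_mode_idx, as shown in Table 4: if the target partition mode is the no-split mode, split_flag is equal to 0; otherwise, split_flag is equal to 1 and the value of split_mode_idx corresponds to the target partition mode.
Encoding a CU includes prediction, transform, quantization, entropy coding, and other processing. The main steps are as follows:
1) According to the prediction mode, select intra prediction or inter prediction to obtain the predicted pixels of the CU.
2) Transform and quantize the residual between the original pixels and the predicted pixels of the CU to obtain the transform coefficients; apply inverse quantization and inverse transform to the transform coefficients to obtain the reconstructed residual.
3) Add the predicted pixels of the CU to the reconstructed residual and then apply loop filtering to obtain the reconstructed pixels of the CU.
4) Entropy-code the prediction mode, transform coefficients, and other information of the CU to produce the bitstream of the CU. The bitstream of the CTU consists of the bitstreams of its CUs.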
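The residual, quantization, and reconstruction part of these steps can be shown with a heavily reduced numeric sketch. It is not a real codec path: the transform is replaced by the identity, loop filtering and entropy coding are omitted, the quantizer is a plain uniform scalar quantizer, and the function name and flat-list pixel layout are assumptions.

```python
# Hypothetical sketch of the residual / quantization / reconstruction loop.
# Transform = identity, no loop filter, no entropy coding (all simplifications).

def encode_cu(original, predicted, qstep):
    """Return (levels, reconstructed) for flat lists of pixel values."""
    residual = [o - p for o, p in zip(original, predicted)]
    levels = [round(r / qstep) for r in residual]      # quantization
    recon_residual = [lv * qstep for lv in levels]     # inverse quantization
    reconstructed = [p + r for p, r in zip(predicted, recon_residual)]
    return levels, reconstructed
```

The encoder keeps `reconstructed` as the reference for later prediction, so that encoder and decoder work from the same pixels even though quantization is lossy.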
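In the foregoing encoding solution, a CTU is allowed to be partitioned according to multiple partition modes, which has the advantages of fewer partition levels and more CU shapes. In addition, the candidate partition mode set can be configured: when the set allows many partition modes, the encoder may try more partition modes and compression performance is better; when the set allows few partition modes, the encoder tries fewer partition modes and computational complexity is lower.
In summary, the encoding and decoding solution in this application uses a hybrid coding partition tree structure based on multiple partition modes to encode and decode CTUs. Compared with quadtree partitioning, this solution allows more CU shapes; compared with binary tree partitioning, it reduces the number of partition levels; compared with quadtree plus binary tree (QTBT) partitioning, it retains the CUs with aspect ratios of 1, 2, and 4, which contribute most to coding efficiency, simplifies the partition mode information, and allows more partitioning options. At the same encoding complexity, it can achieve higher compression efficiency than QTBT partitioning.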
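Based on the foregoing embodiments, as shown in FIG. 8, an embodiment of this application provides a video image decoding apparatus 800. A coding tree node represents a rectangular image region to be decoded in the video image, a lower-level node represents a partial rectangular image region within the rectangular image region to be decoded, the image regions represented by different lower-level nodes do not overlap, and when the coding tree node is a coding unit of the video image, the coding tree node contains no lower-level node. The apparatus 800 includes a parsing unit 801 and a decoding unit 802, where:
the parsing unit 801 is configured to parse the partition mode information of the coding tree node in the bitstream; obtain, according to the partition mode information, the partition mode of the coding tree node from the candidate partition mode set of the coding tree node, where the aspect ratio of the coding tree node is 1 and the width of the coding tree node is greater than a preset threshold, the candidate partition mode set includes a first partition mode indicating that the coding tree node is a coding unit of the video image, a second partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 2, a third partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 0.5, and a fourth partition mode determining that the coding tree node consists of four equal-sized lower-level nodes each with an aspect ratio of 1, and when the aspect ratio of a lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the lower-level node and the coding tree node have the same candidate partition mode set; and parse the bitstream to obtain the coding information of the coding tree node; and
the decoding unit 802 is configured to reconstruct the pixel values of the coding tree node according to the partition mode information and the coding information of the coding tree node.
Optionally, the partition mode is used to determine at least one of the number, size, and distribution of the lower-level nodes constituting the coding tree node.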
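Optionally, the width of the coding tree node is 4M pixels, where M is a positive integer; with the upper left corner of the coding tree node as the origin, rightward as the positive horizontal direction, and downward as the positive vertical direction, the candidate partition mode set further includes:
a fifth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
a sixth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), and a lower-level node of width 4M pixels and height 2M pixels with upper left corner (0,2M); or
a seventh partition mode, which determines that the coding tree node consists of a lower-level node of width 4M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
an eighth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), and a lower-level node of width 2M pixels and height 4M pixels with upper left corner (2M,0); or
a ninth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 4M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
a tenth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), a lower-level node of width 4M pixels and height M pixels with upper left corner (0,2M), and a lower-level node of width 4M pixels and height M pixels with upper left corner (0,3M); or
an eleventh partition mode, which determines that the coding tree node consists of a lower-level node of width 4M pixels and height M pixels with upper left corner (0,0), a lower-level node of width 4M pixels and height M pixels with upper left corner (0,M), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
a twelfth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), a lower-level node of width M pixels and height 4M pixels with upper left corner (2M,0), and a lower-level node of width M pixels and height 4M pixels with upper left corner (3M,0); or
a thirteenth partition mode, which determines that the coding tree node consists of a lower-level node of width M pixels and height 4M pixels with upper left corner (0,0), a lower-level node of width M pixels and height 4M pixels with upper left corner (M,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
a fourteenth partition mode, which determines that the coding tree node consists of a lower-level node of width 4M pixels and height M pixels with upper left corner (0,0), a lower-level node of width 4M pixels and height M pixels with upper left corner (0,M), a lower-level node of width 4M pixels and height M pixels with upper left corner (0,2M), and a lower-level node of width 4M pixels and height M pixels with upper left corner (0,3M); or
a fifteenth partition mode, which determines that the coding tree node consists of a lower-level node of width M pixels and height 4M pixels with upper left corner (0,0), a lower-level node of width M pixels and height 4M pixels with upper left corner (M,0), a lower-level node of width M pixels and height 4M pixels with upper left corner (2M,0), and a lower-level node of width M pixels and height 4M pixels with upper left corner (3M,0); or
a sixteenth partition mode, which determines that the coding tree node consists of a lower-level node of width 4M pixels and height M pixels with upper left corner (0,0), a lower-level node of width 4M pixels and height 2M pixels with upper left corner (0,M), and a lower-level node of width 4M pixels and height M pixels with upper left corner (0,3M); or
a seventeenth partition mode, which determines that the coding tree node consists of a lower-level node of width M pixels and height 4M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 4M pixels with upper left corner (M,0), and a lower-level node of width M pixels and height 4M pixels with upper left corner (3M,0).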
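The geometry of the fifth to seventeenth partition modes can be tabulated and checked mechanically. The sketch below lists each lower-level node as (x, y, width, height) in units of M and verifies that every mode tiles the 4M by 4M node exactly once, as the non-overlap requirement implies. Two entries rest on an assumption: the third node of the eighth mode is placed at (2M,0) and the nodes of the fifteenth mode at (M,0), (2M,0), and (3M,0), since these are the only placements of the stated sizes that fit inside the node without overlap. The table and function names are hypothetical.

```python
# Hypothetical table of the fifth to seventeenth partition modes.
# Each rectangle is (x, y, w, h) in units of M inside a 4M x 4M node.
MODES = {
    5:  [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 2, 2), (2, 2, 2, 2)],
    6:  [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 4, 2)],
    7:  [(0, 0, 4, 2), (0, 2, 2, 2), (2, 2, 2, 2)],
    8:  [(0, 0, 2, 2), (0, 2, 2, 2), (2, 0, 2, 4)],   # third corner assumed
    9:  [(0, 0, 2, 4), (2, 0, 2, 2), (2, 2, 2, 2)],
    10: [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 4, 1), (0, 3, 4, 1)],
    11: [(0, 0, 4, 1), (0, 1, 4, 1), (0, 2, 2, 2), (2, 2, 2, 2)],
    12: [(0, 0, 2, 2), (0, 2, 2, 2), (2, 0, 1, 4), (3, 0, 1, 4)],
    13: [(0, 0, 1, 4), (1, 0, 1, 4), (2, 0, 2, 2), (2, 2, 2, 2)],
    14: [(0, 0, 4, 1), (0, 1, 4, 1), (0, 2, 4, 1), (0, 3, 4, 1)],
    15: [(0, 0, 1, 4), (1, 0, 1, 4), (2, 0, 1, 4), (3, 0, 1, 4)],  # assumed
    16: [(0, 0, 4, 1), (0, 1, 4, 2), (0, 3, 4, 1)],
    17: [(0, 0, 1, 4), (1, 0, 2, 4), (3, 0, 1, 4)],
}


def tiles_node(rects, size=4):
    """True if the rectangles cover the size x size node with no overlap."""
    covered = set()
    for x, y, w, h in rects:
        for cx in range(x, x + w):
            for cy in range(y, y + h):
                if cx >= size or cy >= size or (cx, cy) in covered:
                    return False       # outside the node, or overlapping
                covered.add((cx, cy))
    return len(covered) == size * size
```

Multiplying each tuple by M and adding the parent's upper left corner gives the pixel position and size of each lower-level node.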
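Optionally, that the partition mode is used to determine the lower-level nodes constituting the coding tree node further includes:
the partition mode is used to determine the decoding order of the multiple lower-level nodes constituting the coding tree node;
correspondingly, the fourth partition mode includes: a first sub-mode of the fourth partition mode, in which the four equal-sized lower-level nodes with an aspect ratio of 1 are decoded in clockwise order, and a second sub-mode of the fourth partition mode, in which the four equal-sized lower-level nodes with an aspect ratio of 1 are decoded in counterclockwise order, where the candidate partition mode set includes at least the first sub-mode of the fourth partition mode.
Optionally, the partition mode information is represented by a first syntax element, and the first syntax element indicates the identifier of the obtained partition mode within the candidate partition mode set.
Optionally, the partition mode information is represented by a second syntax element and a third syntax element. The second syntax element indicates whether the obtained partition mode is the first partition mode; when the second syntax element determines that the obtained partition mode is not the first partition mode, the third syntax element indicates the identifier of the obtained partition mode within the candidate partition mode set excluding the first partition mode.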
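Optionally, when parsing the bitstream to obtain the coding information of the coding tree node, the parsing unit 801 is specifically configured to:
when the obtained partition mode is not the first partition mode, parse the bitstream to obtain the coding information of the lower-level nodes of the coding tree node, where, when the aspect ratio of a lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the coding information of the lower-level node includes the partition mode information of the lower-level node;
correspondingly, when reconstructing the pixel values of the coding tree node according to the partition mode information and the coding information of the coding tree node, the decoding unit 802 is specifically configured to:
when the obtained partition mode is not the first partition mode, reconstruct the pixel values of the lower-level nodes according to the coding information of the lower-level nodes.
Optionally, before parsing the partition mode information of the coding tree node in the bitstream, the parsing unit 801 is further configured to:
parse indication information of the candidate partition mode set in the bitstream, where the indication information indicates the partition modes included in the candidate partition mode set.
Optionally, before parsing the partition mode information of the coding tree node in the bitstream, the parsing unit 801 is further configured to:
determine that the coding tree node is located within the image range of the video image.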
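It should be noted that, for the function implementation and interaction of the units of the apparatus 800 in this embodiment of this application, reference may be made to the descriptions of the related method embodiments; details are not repeated here.
Based on the same inventive concept, an embodiment of this application further provides a decoder 900. As shown in FIG. 9, the decoder 900 includes a processor 901 and a memory 902. Program code for executing the solution of the present invention is stored in the memory 902 and is used to instruct the processor 901 to perform the video image decoding method shown in FIG. 4.
In this application, the code corresponding to the method shown in FIG. 4 may also be burned into a chip by designing and programming the processor, so that the chip can perform the method shown in FIG. 4 when running.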
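Based on the foregoing embodiments, as shown in FIG. 10, an embodiment of this application provides a video image encoding apparatus 1000. The apparatus 1000 includes a first encoding unit 1001, a determining unit 1002, and a second encoding unit 1003, where:
the first encoding unit 1001 is configured to, for a coding tree node in the video image, perform encoding according to at least one partition mode in the configured candidate partition mode set of the coding tree node, to obtain the rate-distortion cost corresponding to each of the at least one partition mode, where the rate-distortion cost is the sum of the rate-distortion costs of all coding units obtained according to the corresponding partition mode, the aspect ratio of the coding tree node is 1 and the width of the coding tree node is greater than a preset threshold, the candidate partition mode set includes a first partition mode indicating that the coding tree node is a basic coding unit of the video image, a second partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 2, a third partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 0.5, and a fourth partition mode determining that the coding tree node consists of four equal-sized lower-level nodes each with an aspect ratio of 1, and when the aspect ratio of a lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the lower-level node and the coding tree node have the same candidate partition mode set;
the determining unit 1002 is configured to determine the partition mode with the smallest rate-distortion cost as the target partition mode of the coding tree node, and determine, based on the target partition mode, the coding units constituting the coding tree node; and
the second encoding unit 1003 is configured to encode the coding units to obtain the bitstream and reconstructed image corresponding to the coding tree node.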
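Optionally, when determining, based on the target partition mode, the coding units constituting the coding tree node, the determining unit 1002 is specifically configured to:
determine, according to the target partition mode of the coding tree node, the N child nodes constituting the coding tree node;
when the N child nodes include a square node and the width of the image region corresponding to the square node is greater than the preset threshold, perform encoding according to at least one partition mode in the candidate partition mode set, to obtain the rate-distortion cost corresponding to each of the at least one partition mode;
determine the partition mode with the smallest rate-distortion cost as the target partition mode of the square node; and
determine, according to the target partition mode of the square node, the next-level square nodes constituting the square node, until no next-level square node exists or the width of the image region corresponding to the determined next-level square node is equal to the preset threshold, to obtain the coding units constituting the coding tree node.
For the function implementation and interaction of the units of the apparatus 1000 in this embodiment of this application, reference may be made to the descriptions of the related method embodiments; details are not repeated here.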
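It should be understood that the division of the units in the apparatuses 1000 and 800 above is merely a division of logical functions; in an actual implementation the units may be fully or partly integrated into one physical entity, or may be physically separate. For example, each of the units may be a separately disposed processing element, or may be integrated into a chip of the encoding device; alternatively, the units may be stored in a storage element of the encoder in the form of program code, and a processing element of the encoding device invokes the code and performs the functions of the units. In addition, the units may be integrated together or implemented independently. The processing element described here may be an integrated circuit chip with signal processing capability. In an implementation process, the steps of the foregoing methods or the units above may be completed by an integrated logic circuit of hardware in the processor element or by instructions in the form of software. The processing element may be a general-purpose processor, for example a central processing unit (CPU), or may be one or more integrated circuits configured to implement the foregoing methods, for example one or more application-specific integrated circuits (ASIC), one or more digital signal processors (DSP), or one or more field-programmable gate arrays (FPGA).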
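Based on the same inventive concept, an embodiment of this application further provides an encoder 1100. As shown in FIG. 11, the encoder 1100 includes a processor 1101 and a memory 1102. Program code for executing the solution of the present invention is stored in the memory 1102 and is used to instruct the processor 1101 to perform the video image encoding method shown in FIG. 7.
In this application, the code corresponding to the method shown in FIG. 7 may also be burned into a chip by designing and programming the processor, so that the chip can perform the method shown in FIG. 7 when running.
It can be understood that the processor involved in the encoder 1100 and the decoder 900 in the embodiments of this application may be a CPU, a DSP, an ASIC, or one or more integrated circuits used to control execution of the programs of the solution of the present invention. The one or more memories included in the computer system may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, or a magnetic disk memory. These memories are connected to the processor by a bus, or may be connected to the processor by dedicated connection lines.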
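A person skilled in the art should understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Therefore, the embodiments of this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments of this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk memory, CD-ROM, and optical memory) that contain computer-usable program code.
The embodiments of this application are described with reference to flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to the embodiments of this application. It should be understood that computer program instructions can implement each procedure and/or block in the flowcharts and/or block diagrams, and combinations of procedures and/or blocks in the flowcharts and/or block diagrams. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more procedures of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus, and the instruction apparatus implements the functions specified in one or more procedures of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more procedures of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, a person skilled in the art can make various changes and variations to the embodiments of this application without departing from the spirit and scope of this application. If these modifications and variations of the embodiments of this application fall within the scope of the claims of this application and their equivalent technologies, this application is also intended to cover them.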

Claims (22)

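  1. A video image decoding method, wherein a coding tree node represents a rectangular image region to be decoded in the video image, a lower-level node represents a partial rectangular image region within the rectangular image region to be decoded, the image regions represented by different lower-level nodes do not overlap, and when the coding tree node is a coding unit of the video image, the coding tree node contains no lower-level node, the method comprising:
    parsing partition mode information of the coding tree node in a bitstream;
    obtaining, according to the partition mode information, the partition mode of the coding tree node from a candidate partition mode set of the coding tree node, wherein the aspect ratio of the coding tree node is 1 and the width of the coding tree node is greater than a preset threshold, the candidate partition mode set comprises a first partition mode indicating that the coding tree node is a coding unit of the video image, a second partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 2, a third partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 0.5, and a fourth partition mode determining that the coding tree node consists of four equal-sized lower-level nodes each with an aspect ratio of 1, and when the aspect ratio of a lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the lower-level node and the coding tree node have the same candidate partition mode set;
    parsing the bitstream to obtain coding information of the coding tree node; and
    reconstructing pixel values of the coding tree node according to the partition mode information and the coding information of the coding tree node.
  2. The method according to claim 1, wherein the partition mode is used to determine at least one of the number, size, and distribution of the lower-level nodes constituting the coding tree node.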
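  3. The method according to claim 1 or 2, wherein the width of the coding tree node is 4M pixels, M being a positive integer, and, with the upper left corner of the coding tree node as the origin, rightward as the positive horizontal direction, and downward as the positive vertical direction, the candidate partition mode set further comprises:
    a fifth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
    a sixth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), and a lower-level node of width 4M pixels and height 2M pixels with upper left corner (0,2M); or
    a seventh partition mode, which determines that the coding tree node consists of a lower-level node of width 4M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
    an eighth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), and a lower-level node of width 2M pixels and height 4M pixels with upper left corner (2M,0); or
    a ninth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 4M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
    a tenth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), a lower-level node of width 4M pixels and height M pixels with upper left corner (0,2M), and a lower-level node of width 4M pixels and height M pixels with upper left corner (0,3M); or
    an eleventh partition mode, which determines that the coding tree node consists of a lower-level node of width 4M pixels and height M pixels with upper left corner (0,0), a lower-level node of width 4M pixels and height M pixels with upper left corner (0,M), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
    a twelfth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), a lower-level node of width M pixels and height 4M pixels with upper left corner (2M,0), and a lower-level node of width M pixels and height 4M pixels with upper left corner (3M,0); or
    a thirteenth partition mode, which determines that the coding tree node consists of a lower-level node of width M pixels and height 4M pixels with upper left corner (0,0), a lower-level node of width M pixels and height 4M pixels with upper left corner (M,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
    a fourteenth partition mode, which determines that the coding tree node consists of a lower-level node of width 4M pixels and height M pixels with upper left corner (0,0), a lower-level node of width 4M pixels and height M pixels with upper left corner (0,M), a lower-level node of width 4M pixels and height M pixels with upper left corner (0,2M), and a lower-level node of width 4M pixels and height M pixels with upper left corner (0,3M); or
    a fifteenth partition mode, which determines that the coding tree node consists of a lower-level node of width M pixels and height 4M pixels with upper left corner (0,0), a lower-level node of width M pixels and height 4M pixels with upper left corner (M,0), a lower-level node of width M pixels and height 4M pixels with upper left corner (2M,0), and a lower-level node of width M pixels and height 4M pixels with upper left corner (3M,0); or
    a sixteenth partition mode, which determines that the coding tree node consists of a lower-level node of width 4M pixels and height M pixels with upper left corner (0,0), a lower-level node of width 4M pixels and height 2M pixels with upper left corner (0,M), and a lower-level node of width 4M pixels and height M pixels with upper left corner (0,3M); or
    a seventeenth partition mode, which determines that the coding tree node consists of a lower-level node of width M pixels and height 4M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 4M pixels with upper left corner (M,0), and a lower-level node of width M pixels and height 4M pixels with upper left corner (3M,0).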
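  4. The method according to any one of claims 1 to 3, wherein that the partition mode is used to determine the lower-level nodes constituting the coding tree node further comprises:
    the partition mode is used to determine the decoding order of the multiple lower-level nodes constituting the coding tree node;
    correspondingly, the fourth partition mode comprises: a first sub-mode of the fourth partition mode, in which the four equal-sized lower-level nodes with an aspect ratio of 1 are decoded in clockwise order, and a second sub-mode of the fourth partition mode, in which the four equal-sized lower-level nodes with an aspect ratio of 1 are decoded in counterclockwise order, wherein the candidate partition mode set comprises at least the first sub-mode of the fourth partition mode.
  5. The method according to any one of claims 1 to 4, wherein the partition mode information is represented by a first syntax element, and the first syntax element indicates the identifier of the obtained partition mode within the candidate partition mode set.
  6. The method according to any one of claims 1 to 4, wherein the partition mode information is represented by a second syntax element and a third syntax element, the second syntax element indicates whether the obtained partition mode is the first partition mode, and when the second syntax element determines that the obtained partition mode is not the first partition mode, the third syntax element indicates the identifier of the obtained partition mode within the candidate partition mode set excluding the first partition mode.
  7. The method according to any one of claims 1 to 6, wherein the parsing the bitstream to obtain coding information of the coding tree node comprises:
    when the obtained partition mode is not the first partition mode, parsing the bitstream to obtain coding information of the lower-level nodes of the coding tree node, wherein, when the aspect ratio of a lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the coding information of the lower-level node comprises partition mode information of the lower-level node;
    correspondingly, the reconstructing pixel values of the coding tree node according to the partition mode information and the coding information of the coding tree node comprises:
    when the obtained partition mode is not the first partition mode, reconstructing pixel values of the lower-level nodes according to the coding information of the lower-level nodes.
  8. The method according to any one of claims 1 to 7, wherein before the parsing partition mode information of the coding tree node in a bitstream, the method further comprises:
    parsing indication information of the candidate partition mode set in the bitstream, wherein the indication information indicates the partition modes comprised in the candidate partition mode set.
  9. The method according to any one of claims 1 to 8, wherein before the parsing partition mode information of the coding tree node in a bitstream, the method further comprises:
    determining that the coding tree node is located within the image range of the video image.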
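  10. A video image encoding method, comprising:
    for a coding tree node in the video image, performing encoding according to at least one partition mode in a configured candidate partition mode set of the coding tree node, to obtain the rate-distortion cost corresponding to each of the at least one partition mode, wherein the rate-distortion cost is the sum of the rate-distortion costs of all coding units obtained according to the corresponding partition mode, the aspect ratio of the coding tree node is 1 and the width of the coding tree node is greater than a preset threshold, the candidate partition mode set comprises a first partition mode indicating that the coding tree node is a basic coding unit of the video image, a second partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 2, a third partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 0.5, and a fourth partition mode determining that the coding tree node consists of four equal-sized lower-level nodes each with an aspect ratio of 1, and when the aspect ratio of a lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the lower-level node and the coding tree node have the same candidate partition mode set;
    determining the partition mode with the smallest rate-distortion cost as the target partition mode of the coding tree node; and
    determining, based on the target partition mode, the coding units constituting the coding tree node, and encoding the coding units to obtain the bitstream and reconstructed image corresponding to the coding tree node.
  11. The method according to claim 10, wherein the determining, based on the target partition mode, the coding units constituting the coding tree node comprises:
    determining, according to the target partition mode of the coding tree node, the N child nodes constituting the coding tree node;
    when the N child nodes comprise a square node and the width of the image region corresponding to the square node is greater than the preset threshold, performing encoding according to at least one partition mode in the candidate partition mode set, to obtain the rate-distortion cost corresponding to each of the at least one partition mode;
    determining the partition mode with the smallest rate-distortion cost as the target partition mode of the square node; and
    determining, according to the target partition mode of the square node, the next-level square nodes constituting the square node, until no next-level square node exists or the width of the image region corresponding to the determined next-level square node is equal to the preset threshold, to obtain the coding units constituting the coding tree node.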
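  12. A video image decoding apparatus, wherein a coding tree node represents a rectangular image region to be decoded in the video image, a lower-level node represents a partial rectangular image region within the rectangular image region to be decoded, the image regions represented by different lower-level nodes do not overlap, and when the coding tree node is a coding unit of the video image, the coding tree node contains no lower-level node, the apparatus comprising:
    a parsing unit, configured to parse partition mode information of the coding tree node in a bitstream; obtain, according to the partition mode information, the partition mode of the coding tree node from a candidate partition mode set of the coding tree node, wherein the aspect ratio of the coding tree node is 1 and the width of the coding tree node is greater than a preset threshold, the candidate partition mode set comprises a first partition mode indicating that the coding tree node is a coding unit of the video image, a second partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 2, a third partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 0.5, and a fourth partition mode determining that the coding tree node consists of four equal-sized lower-level nodes each with an aspect ratio of 1, and when the aspect ratio of a lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the lower-level node and the coding tree node have the same candidate partition mode set; and parse the bitstream to obtain coding information of the coding tree node; and
    a decoding unit, configured to reconstruct pixel values of the coding tree node according to the partition mode information and the coding information of the coding tree node.
  13. The apparatus according to claim 12, wherein the partition mode is used to determine at least one of the number, size, and distribution of the lower-level nodes constituting the coding tree node.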
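  14. The apparatus according to claim 12 or 13, wherein the width of the coding tree node is 4M pixels, M being a positive integer, and, with the upper left corner of the coding tree node as the origin, rightward as the positive horizontal direction, and downward as the positive vertical direction, the candidate partition mode set further comprises:
    a fifth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
    a sixth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), and a lower-level node of width 4M pixels and height 2M pixels with upper left corner (0,2M); or
    a seventh partition mode, which determines that the coding tree node consists of a lower-level node of width 4M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
    an eighth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), and a lower-level node of width 2M pixels and height 4M pixels with upper left corner (2M,0); or
    a ninth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 4M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
    a tenth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), a lower-level node of width 4M pixels and height M pixels with upper left corner (0,2M), and a lower-level node of width 4M pixels and height M pixels with upper left corner (0,3M); or
    an eleventh partition mode, which determines that the coding tree node consists of a lower-level node of width 4M pixels and height M pixels with upper left corner (0,0), a lower-level node of width 4M pixels and height M pixels with upper left corner (0,M), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
    a twelfth partition mode, which determines that the coding tree node consists of a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (0,2M), a lower-level node of width M pixels and height 4M pixels with upper left corner (2M,0), and a lower-level node of width M pixels and height 4M pixels with upper left corner (3M,0); or
    a thirteenth partition mode, which determines that the coding tree node consists of a lower-level node of width M pixels and height 4M pixels with upper left corner (0,0), a lower-level node of width M pixels and height 4M pixels with upper left corner (M,0), a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,0), and a lower-level node of width 2M pixels and height 2M pixels with upper left corner (2M,2M); or
    a fourteenth partition mode, which determines that the coding tree node consists of a lower-level node of width 4M pixels and height M pixels with upper left corner (0,0), a lower-level node of width 4M pixels and height M pixels with upper left corner (0,M), a lower-level node of width 4M pixels and height M pixels with upper left corner (0,2M), and a lower-level node of width 4M pixels and height M pixels with upper left corner (0,3M); or
    a fifteenth partition mode, which determines that the coding tree node consists of a lower-level node of width M pixels and height 4M pixels with upper left corner (0,0), a lower-level node of width M pixels and height 4M pixels with upper left corner (M,0), a lower-level node of width M pixels and height 4M pixels with upper left corner (2M,0), and a lower-level node of width M pixels and height 4M pixels with upper left corner (3M,0); or
    a sixteenth partition mode, which determines that the coding tree node consists of a lower-level node of width 4M pixels and height M pixels with upper left corner (0,0), a lower-level node of width 4M pixels and height 2M pixels with upper left corner (0,M), and a lower-level node of width 4M pixels and height M pixels with upper left corner (0,3M); or
    a seventeenth partition mode, which determines that the coding tree node consists of a lower-level node of width M pixels and height 4M pixels with upper left corner (0,0), a lower-level node of width 2M pixels and height 4M pixels with upper left corner (M,0), and a lower-level node of width M pixels and height 4M pixels with upper left corner (3M,0).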
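  15. The apparatus according to any one of claims 12 to 14, wherein that the partition mode is used to determine the lower-level nodes constituting the coding tree node further comprises:
    the partition mode is used to determine the decoding order of the multiple lower-level nodes constituting the coding tree node;
    correspondingly, the fourth partition mode comprises: a first sub-mode of the fourth partition mode, in which the four equal-sized lower-level nodes with an aspect ratio of 1 are decoded in clockwise order, and a second sub-mode of the fourth partition mode, in which the four equal-sized lower-level nodes with an aspect ratio of 1 are decoded in counterclockwise order, wherein the candidate partition mode set comprises at least the first sub-mode of the fourth partition mode.
  16. The apparatus according to any one of claims 12 to 15, wherein the partition mode information is represented by a first syntax element, and the first syntax element indicates the identifier of the obtained partition mode within the candidate partition mode set.
  17. The apparatus according to any one of claims 12 to 15, wherein the partition mode information is represented by a second syntax element and a third syntax element, the second syntax element indicates whether the obtained partition mode is the first partition mode, and when the second syntax element determines that the obtained partition mode is not the first partition mode, the third syntax element indicates the identifier of the obtained partition mode within the candidate partition mode set excluding the first partition mode.
  18. The apparatus according to any one of claims 12 to 17, wherein, when parsing the bitstream to obtain the coding information of the coding tree node, the parsing unit is specifically configured to:
    when the obtained partition mode is not the first partition mode, parse the bitstream to obtain coding information of the lower-level nodes of the coding tree node, wherein, when the aspect ratio of a lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the coding information of the lower-level node comprises partition mode information of the lower-level node;
    correspondingly, when reconstructing the pixel values of the coding tree node according to the partition mode information and the coding information of the coding tree node, the decoding unit is specifically configured to:
    when the obtained partition mode is not the first partition mode, reconstruct pixel values of the lower-level nodes according to the coding information of the lower-level nodes.
  19. The apparatus according to any one of claims 12 to 18, wherein, before parsing the partition mode information of the coding tree node in the bitstream, the parsing unit is further configured to:
    parse indication information of the candidate partition mode set in the bitstream, wherein the indication information indicates the partition modes comprised in the candidate partition mode set.
  20. The apparatus according to any one of claims 12 to 19, wherein, before parsing the partition mode information of the coding tree node in the bitstream, the parsing unit is further configured to:
    determine that the coding tree node is located within the image range of the video image.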
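  21. A video image encoding apparatus, comprising:
    a first encoding unit, configured to, for a coding tree node in the video image, perform encoding according to at least one partition mode in a configured candidate partition mode set of the coding tree node, to obtain the rate-distortion cost corresponding to each of the at least one partition mode, wherein the rate-distortion cost is the sum of the rate-distortion costs of all coding units obtained according to the corresponding partition mode, the aspect ratio of the coding tree node is 1 and the width of the coding tree node is greater than a preset threshold, the candidate partition mode set comprises a first partition mode indicating that the coding tree node is a basic coding unit of the video image, a second partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 2, a third partition mode determining that the coding tree node consists of two equal-sized lower-level nodes each with an aspect ratio of 0.5, and a fourth partition mode determining that the coding tree node consists of four equal-sized lower-level nodes each with an aspect ratio of 1, and when the aspect ratio of a lower-level node is 1 and the width of the lower-level node is greater than the preset threshold, the lower-level node and the coding tree node have the same candidate partition mode set;
    a determining unit, configured to determine the partition mode with the smallest rate-distortion cost as the target partition mode of the coding tree node, and determine, based on the target partition mode, the coding units constituting the coding tree node; and
    a second encoding unit, configured to encode the coding units to obtain the bitstream and reconstructed image corresponding to the coding tree node.
  22. The apparatus according to claim 21, wherein, when determining, based on the target partition mode, the coding units constituting the coding tree node, the determining unit is specifically configured to:
    determine, according to the target partition mode of the coding tree node, the N child nodes constituting the coding tree node;
    when the N child nodes comprise a square node and the width of the image region corresponding to the square node is greater than the preset threshold, perform encoding according to at least one partition mode in the candidate partition mode set, to obtain the rate-distortion cost corresponding to each of the at least one partition mode;
    determine the partition mode with the smallest rate-distortion cost as the target partition mode of the square node; and
    determine, according to the target partition mode of the square node, the next-level square nodes constituting the square node, until no next-level square node exists or the width of the image region corresponding to the determined next-level square node is equal to the preset threshold, to obtain the coding units constituting the coding tree node.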
PCT/CN2018/079658 2017-05-27 2018-03-20 Video image coding and decoding method and apparatus WO2018219020A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/689,550 US10911788B2 (en) 2017-05-27 2019-11-20 Video image coding and decoding method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710392690.3 2017-05-27
CN201710392690.3A CN108965894B (zh) 2017-05-27 2017-05-27 Video image coding and decoding method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/689,550 Continuation US10911788B2 (en) 2017-05-27 2019-11-20 Video image coding and decoding method and apparatus

Publications (1)

Publication Number Publication Date
WO2018219020A1 true WO2018219020A1 (zh) 2018-12-06

Family

ID=64454467

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/079658 WO2018219020A1 (zh) 2017-05-27 2018-03-20 Video image coding and decoding method and apparatus

Country Status (4)

Country Link
US (1) US10911788B2 (zh)
CN (1) CN108965894B (zh)
TW (1) TWI675589B (zh)
WO (1) WO2018219020A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120106622A1 (en) * 2010-11-03 2012-05-03 Mediatek Inc. Method and Apparatus of Slice Grouping for High Efficiency Video Coding
CN103442235A (zh) * 2013-09-06 2013-12-11 深圳市融创天下科技股份有限公司 一种图像处理方法以及装置
CN103780910A (zh) * 2014-01-21 2014-05-07 华为技术有限公司 视频编码中的块分割方式和最佳预测模式确定方法及相关装置
WO2016148438A2 (ko) * 2015-03-13 2016-09-22 엘지전자 주식회사 비디오 신호의 처리 방법 및 이를 위한 장치

Also Published As

Publication number Publication date
US10911788B2 (en) 2021-02-02
TW201909645A (zh) 2019-03-01
CN108965894B (zh) 2021-12-21
TWI675589B (zh) 2019-10-21
US20200092587A1 (en) 2020-03-19
CN108965894A (zh) 2018-12-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18809566

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18809566

Country of ref document: EP

Kind code of ref document: A1