WO2022042538A1 - Block-based point cloud geometric inter-frame prediction method and decoding method - Google Patents

Block-based point cloud geometric inter-frame prediction method and decoding method

Info

Publication number
WO2022042538A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
point cloud
block
node
prediction
Prior art date
Application number
PCT/CN2021/114282
Other languages
French (fr)
Chinese (zh)
Inventor
李革
金佳民
赵文博
张琦
王静
Original Assignee
北京大学深圳研究生院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学深圳研究生院
Publication of WO2022042538A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 Tree coding, e.g. quad-tree coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/56 Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search

Definitions

  • The invention belongs to the technical field of point cloud data processing, and in particular relates to a block-based point cloud geometric inter-frame prediction method and a decoding method.
  • Inter-frame prediction of point clouds has always been a technical difficulty in the field of point cloud encoding and decoding. Since point clouds do not have a fixed spatial structure, it is difficult to intuitively obtain a regular block structure like traditional video encoding and decoding. Moreover, the sparsity and flexibility of the spatial distribution of point clouds also bring great difficulties to motion estimation and motion compensation based on block division.
  • The existing point cloud inter-frame prediction frameworks mainly include the following:
  • Prediction method based on occupancy code XOR operation: this method performs an XOR operation on the occupancy codes of the current frame and the predicted frame to obtain the residual of the geometric information of the two frames, and then encodes the residual. However, this method cannot capture the temporal motion information of point cloud objects; when the motion of an object is large, the resulting occupancy code residual is also large.
  • Mapping-based prediction method: this method first maps the 3D point cloud onto 2D planes at multiple angles to obtain multiple 2D images, and then compresses them with existing video coding tools. Its performance depends on the quality of the mapping method. Such mapping-based compression methods mainly target 3D human-body surface scan datasets; for sparse datasets such as lidar scans, the benefit of inter-frame prediction is limited.
  • Inter-frame prediction method based on block division: this method first divides the point cloud directly into blocks, then searches the reference frame for the matching block with the smallest error as the prediction block, and uses the prediction block to improve the coding efficiency of the current block.
  • There are currently two ways to use the prediction block. The first is to directly replace the current block with the prediction block, calculate the prediction residual, and finally encode the resulting residual coefficients. However, due to the sparsity of the spatial distribution of point clouds, the residual coefficients of the prediction block are often still relatively large. The second is to use the geometric occupancy information of the prediction block as context to improve the entropy coding efficiency of the current block.
  • The exploration platform EM13 proposed by the MPEG point cloud group is such a prediction method, but it only supports octree division of cube-shaped bounding boxes. As a result, the bounding box range is too large, which introduces too many empty blocks, and the method suffers from defects such as excessive motion estimation complexity.
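  • As a minimal illustration of the occupancy-code XOR approach listed above, the following sketch XORs per-node 8-bit occupancy codes. It assumes (hypothetically) that both frames expose the same nodes in the same order; the function names are illustrative and not taken from any codec.

```python
def xor_residual(current_codes, predicted_codes):
    """XOR each 8-bit occupancy code of the current frame with its prediction."""
    if len(current_codes) != len(predicted_codes):
        raise ValueError("both frames must supply the same number of nodes")
    return [c ^ p for c, p in zip(current_codes, predicted_codes)]


def reconstruct(residual, predicted_codes):
    """Invert the XOR: the decoder recovers the current codes from the residual."""
    return [r ^ p for r, p in zip(residual, predicted_codes)]


current = [0b10110000, 0b00001111]
# Static scene: the prediction is exact, so the residual is all zeros.
static = xor_residual(current, [0b10110000, 0b00001111])
# Moving object: occupied children shift, so most residual bits are set.
moved = xor_residual(current, [0b01100001, 0b11110000])
```

  • A zero residual is cheap to entropy-code, while the second case shows how motion inflates the residual, which is the weakness the text attributes to this approach.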
  • The present invention provides a block-based point cloud geometric inter-frame prediction method. By combining point cloud quadtree and binary tree division, it supports block division of non-cube bounding boxes, which greatly reduces the waste of code words caused by empty blocks, and the computational time complexity of motion estimation is greatly reduced. In addition, the present invention simplifies the fusion of the inter-frame context model, which significantly improves the entropy coding efficiency without increasing the number of contexts.
  • One of the objectives of the present invention is to disclose a block-based point cloud geometry inter-frame predictive coding method.
  • the second purpose of the present invention is to disclose a block-based point cloud geometric inter-frame prediction decoding method.
  • The first objective of the present invention is achieved by the following scheme: a block-based point cloud geometric inter-frame predictive coding method, characterized in that it includes the following steps:
  • S110: Set the same bounding box for the frame to be coded and the reference frame, as the root node of tree division;
  • S120: Perform tree division on the frame to be coded and the reference frame respectively according to the bounding box, and obtain the current node in the frame to be coded and the corresponding prediction block in the reference frame;
  • S130: For each child node of the current node, encode the occupancy information of that child node according to the occupancy information of the child node at the corresponding position in the prediction block, to obtain a code stream.
  • the same bounding box is set for the frame to be encoded and the reference frame as described in S110, which specifically includes the first method or the second method.
  • the tree division in S120 includes octree, quadtree, and binary tree division.
  • The ways of obtaining the current node in the frame to be coded and the corresponding prediction block in the reference frame include, but are not limited to, the following mode 1 or mode 2:
  • Mode 1: set the prediction block size; if the shortest side length of a block obtained by tree division of the frame to be coded is equal to or less than the prediction block size, tree division is considered complete and the current node is obtained; in the reference frame, the block at the position corresponding to the current node is used directly as the prediction block.
  • Mode 2: set the size PTU size of the initial prediction block PTU; if the shortest side length of a block obtained by tree division of the frame to be coded is equal to or smaller than the PTU size, the initial prediction block PTU is obtained; whether to divide the PTU further is decided according to the coding cost, and flag information is recorded: the split flags indicate whether each node of the PTU is further divided, and the occupied flags indicate the occupancy information of the divided sub-nodes
  • Calculating the best-matching prediction block and the corresponding motion vector MV for each of the PUs specifically includes: determining a search window W; obtaining a local point cloud in the reference frame according to the search window W; and, within the local point cloud, using the simplified ICP algorithm to obtain, for the prediction basic unit PU (that is, the current node), the matching block with the smallest error and the corresponding motion vector MV. The matching block with the smallest error and its motion vector MV are used as the prediction block and the motion vector MV of the current node, and the motion vector MV is encoded.
  • Determining the search window W includes: the size of the search window W is set according to the distribution characteristics of the data set and the code rate point; if the point cloud distribution is relatively scattered, a larger window range is set; if it is more compact, a smaller window range is set.
  • the simplified ICP algorithm is used to obtain the matching block with the smallest error and the corresponding motion vector MV, including:
  • the simplified ICP algorithm only considers translation transformation, and the calculation method of the error is the Lagrangian cost, and the calculation formula is as follows:
  • B is the current node of the frame to be encoded
  • W is the search window
  • Q is the nearest neighbor point set obtained by the current node B after the motion vector MV is translated
  • Est(MV) is the estimated codeword size required to encode the motion vector MV
  • Dist is the block matching loss function, which measures the matching deviation.
  • the Dist is a block matching loss function, including: the calculation formula of the block matching loss function Dist is as follows:
  • w is the point in the local point cloud corresponding to the search window W in the reference point cloud
  • q is the point in the nearest neighbor point set Q found by the current node B after the motion vector MV is translated.
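  • The formulas themselves did not survive in this text, so the following is only a sketch of a translation-only search that minimises a Lagrangian cost of the general shape Dist + λ·Est(MV). The squared nearest-neighbour distance used for `dist`, the placeholder rate estimate, the multiplier `lam`, and the exhaustive integer search (standing in for the simplified ICP iteration) are all assumptions made for illustration.

```python
from itertools import product


def dist(block, window_points, mv):
    # Assumed loss: sum of squared distances from each translated block point
    # to its nearest neighbour among the window points (brute force).
    total = 0
    for x, y, z in block:
        tx, ty, tz = x + mv[0], y + mv[1], z + mv[2]
        total += min((tx - wx) ** 2 + (ty - wy) ** 2 + (tz - wz) ** 2
                     for wx, wy, wz in window_points)
    return total


def placeholder_est(mv):
    # Placeholder rate estimate (assumption): grows with the MV magnitude.
    return sum(abs(c) for c in mv)


def search_mv(block, window_points, radius, lam, est=placeholder_est):
    """Try every integer translation within +/- radius and keep the cheapest."""
    best_cost, best_mv = None, None
    for mv in product(range(-radius, radius + 1), repeat=3):
        cost = dist(block, window_points, mv) + lam * est(mv)
        if best_cost is None or cost < best_cost:
            best_cost, best_mv = cost, mv
    return best_mv, best_cost


block = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
window_points = [(2, -1, 0), (3, -1, 0), (2, 0, 0)]  # block shifted by (2, -1, 0)
mv, cost = search_mv(block, window_points, radius=2, lam=0.1)
```

  • In this toy case the search recovers the shift (2, -1, 0), because that translation makes the distortion term zero and only the small rate term remains.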
  • For each child node of the current node, encoding the occupancy information of the child node according to the occupancy information of the child node at the corresponding position in the prediction block includes method 1 or method 2:
  • Method 1: for each child node, on the basis of the original intra-frame context, an additional inter-frame context bit is added to indicate whether the corresponding position in the prediction block is occupied, and the occupancy information of the child nodes of the current node is encoded.
  • Method 2: if the child node at the corresponding position in the prediction block is occupied, the occupancy codes of N child nodes in the corresponding intra-frame context mode are set to 1 as the context information, and the occupancy information of the child nodes of the current node is encoded; if the child node at the corresponding position is not occupied, the occupancy information of the child nodes of the current node is encoded using the intra-frame context information.
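  • The two context-fusion options can be sketched as follows. The sketch assumes (hypothetically) that the intra-frame context is packed into an integer whose low N bits hold the occupancy of N previously coded child nodes; the actual context layout is not given in this text.

```python
def context_method1(intra_ctx, pred_occupied):
    """Method 1: append one inter-frame bit, doubling the number of contexts."""
    return (intra_ctx << 1) | int(pred_occupied)


def context_method2(intra_ctx, pred_occupied, n_bits):
    """Method 2: if the predicted child is occupied, force the N low bits
    (assumed to be the N child-occupancy bits) to 1; otherwise keep the
    intra-frame context unchanged. The context count does not grow."""
    if pred_occupied:
        return intra_ctx | ((1 << n_bits) - 1)
    return intra_ctx


ctx1_hit = context_method1(0b101, True)
ctx1_miss = context_method1(0b101, False)
ctx2_hit = context_method2(0b0100000, True, 3)
ctx2_miss = context_method2(0b0100000, False, 3)
```

  • Method 2 reuses an existing context index rather than creating new ones, which matches the stated goal of not increasing the number of contexts.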
  • A block-based point cloud geometric inter-frame prediction decoding method, characterized in that it includes the following steps:
  • S210: Obtain the same bounding box of the frame to be decoded and the reference frame, as the root node of tree division;
  • S220: Perform tree division on the frame to be decoded and the reference frame respectively according to the bounding box, and obtain, according to the tree division and/or the point cloud code stream, the current node in the frame to be decoded and the corresponding prediction block in the reference frame;
  • S230: Decode the point cloud code stream to obtain the occupancy information of each child node of the node to be decoded, and obtain a point cloud.
  • Obtaining the same bounding box of the frame to be decoded and the reference frame described in S210 specifically includes method 1 or method 2:
  • Method 1: obtain the minimum bounding box of all frames of the entire point cloud sequence, as the bounding box of the frame to be decoded and the reference frame;
  • Method 2: obtain the minimum bounding box of the frame to be decoded and the reference frame, as the bounding box of the frame to be decoded and the reference frame.
  • the tree division in S220 includes octree, quadtree, and binary tree division.
  • The ways of obtaining the current node in the frame to be decoded and the corresponding prediction block in the reference frame described in S220 include, but are not limited to, the following mode 1 or mode 2:
  • Mode 1: according to the prediction block size, if the shortest side length of a block obtained by tree division of the frame to be decoded is equal to or less than the prediction block size, tree division is considered complete and the current node is obtained; in the reference frame, the block at the position corresponding to the current node is used directly as the prediction block.
  • Mode 2: according to the size PTU size of the initial prediction block PTU, if the shortest side length of a block obtained by tree division of the frame to be decoded is equal to or smaller than the PTU size, the initial prediction block PTU is obtained; the flag information split flags and occupied flags are obtained by decoding, the PTU is further divided according to the flag information, and the prediction basic units PU obtained after all division is completed are used as current nodes; the motion vector MV
  • Decoding the motion vector MV of each PU and calculating the corresponding prediction block specifically includes: decoding the code stream to obtain the motion vector MV; translating the current node by the motion vector MV and calculating the nearest neighbor point set in the reference point cloud, thereby obtaining the prediction block corresponding to the current node in the reference point cloud.
  • Decoding the occupancy information of each child node of the current node according to the occupancy information of the child node at the corresponding position in the prediction block includes method 1 or method 2:
  • Method 1: for each child node, on the basis of the original intra-frame context, an additional inter-frame context bit is added to indicate whether the corresponding position in the prediction block is occupied, and the occupancy information of the child nodes of the current node is decoded.
  • Method 2: if the child node at the corresponding position in the prediction block is occupied, the occupancy codes of N child nodes in the corresponding intra-frame context mode are set to 1 as the context information, and the occupancy information of the child nodes of the current node is decoded; if the child node at the corresponding position is not occupied, the occupancy information of the child nodes of the current node is decoded using the intra-frame context information.
  • The block-based point cloud geometry inter-frame prediction method of the present invention mainly has the following advantages:
  • Non-cubic point cloud bounding boxes are supported, which further reduces the proportion of space taken up by empty blocks and reduces codeword waste.
  • The MV is not limited to a fixed search direction, so the optimal matching block can be found more accurately while a complex motion vector search process is avoided.
  • FIG. 1 is a flow chart of the block-based point cloud geometry inter-frame predictive coding method proposed by the present invention.
  • FIG. 2 is a flow chart of the block-based point cloud geometry inter-frame prediction decoding method proposed by the present invention.
  • FIG. 3 is a flowchart of an embodiment of a block-based point cloud geometry inter-frame predictive coding method proposed by the present invention.
  • FIG. 4 is a flowchart of an embodiment of a block-based point cloud geometry inter-frame prediction decoding method proposed by the present invention.
  • FIG. 5 is a performance diagram of an embodiment of the block-based point cloud geometry inter-frame predictive coding method proposed by the present invention.
  • FIG. 6 is a performance diagram of the block-based point cloud geometric inter-frame prediction decoding method proposed in the present invention under the condition of geometric lossy compression.
  • FIG. 7 is a performance diagram of the block-based point cloud geometric inter-frame prediction decoding method proposed in the present invention under the condition of geometric lossless compression.
  • The block-division-based point cloud geometry inter-frame prediction method of the present invention uses block motion estimation to capture the temporal motion information of the point cloud and improves the compression performance of point cloud geometry information. For the compression of 3D point cloud geometry, the space occupancy information of the previously coded frame is used to predict the geometric occupancy information of the current block to be coded and serves as context, which improves the entropy coding efficiency of the occupancy code and thus the compression performance of the point cloud geometric information.
  • the method includes the following steps:
  • Embodiment 1: A block-based point cloud geometry inter-frame predictive coding method
  • FIG. 1 is a schematic flowchart of a block-based point cloud geometry inter-frame predictive coding method provided by the present invention. As shown in FIG. 1, the block-based point cloud geometry inter-frame predictive coding method provided by the present invention includes the following steps:
  • S120: Perform tree division on the frame to be coded and the reference frame respectively according to the bounding box, to obtain a current node in the frame to be coded and a corresponding prediction block in the reference frame;
  • the same bounding box is set for the frame to be encoded and the reference frame as described in S110, which specifically includes the first method or the second method.
  • the tree division in S120 includes octree, quadtree, and binary tree division.
  • The ways of obtaining the current node in the frame to be coded and the corresponding prediction block in the reference frame include, but are not limited to, the following mode 1 or mode 2:
  • Mode 1: set the prediction block size; if the shortest side length of a block obtained by tree division of the frame to be coded is equal to or less than the prediction block size, tree division is considered complete and the current node is obtained; in the reference frame, the block at the position corresponding to the current node is used directly as the prediction block.
  • Mode 2: set the size PTU size of the initial prediction block PTU; if the shortest side length of a block obtained by tree division of the frame to be coded is equal to or smaller than the PTU size, the initial prediction block PTU is obtained; whether to divide the PTU further is decided according to the coding cost, and flag information is recorded: the split flags indicate whether each node of the PTU is further divided, and the occupied flags indicate the occupancy information of the divided sub-nodes
  • Calculating the best-matching prediction block and the corresponding motion vector MV for each of the PUs specifically includes: determining a search window W; obtaining a local point cloud in the reference frame according to the search window W; and, within the local point cloud, using the simplified ICP algorithm to obtain, for the prediction basic unit PU (that is, the current node), the matching block with the smallest error and the corresponding motion vector MV. The matching block with the smallest error and its motion vector MV are used as the prediction block and the motion vector MV of the current node, and the motion vector MV is encoded.
  • Determining the search window W includes: the size of the search window W is set according to the distribution characteristics of the data set and the code rate point; if the point cloud distribution is relatively scattered, a larger window range is set; if it is more compact, a smaller window range is set.
  • the simplified ICP algorithm is used to obtain the matching block with the smallest error and the corresponding motion vector MV, including:
  • the simplified ICP algorithm only considers translation transformation, and the calculation method of the error is the Lagrangian cost, and the calculation formula is as follows:
  • B is the current node of the frame to be encoded
  • W is the search window
  • Q is the nearest neighbor point set obtained by the current node B after the motion vector MV is translated
  • Est(MV) is the estimated codeword size required to encode the motion vector MV
  • Dist is the block matching loss function, which measures the matching deviation.
  • the Dist is a block matching loss function, including: the calculation formula of the block matching loss function Dist is as follows:
  • w is the point in the local point cloud corresponding to the search window W in the reference point cloud
  • q is the point in the nearest neighbor point set Q found by the current node B after the motion vector MV is translated.
  • For each child node of the current node, encoding the occupancy information of the child node according to the occupancy information of the child node at the corresponding position in the prediction block includes method 1 or method 2:
  • Method 1: for each child node, on the basis of the original intra-frame context, an additional inter-frame context bit is added to indicate whether the corresponding position in the prediction block is occupied, and the occupancy information of the child nodes of the current node is encoded.
  • Method 2: if the child node at the corresponding position in the prediction block is occupied, the occupancy codes of N child nodes in the corresponding intra-frame context mode are set to 1 as the context information, and the occupancy information of the child nodes of the current node is encoded; if the child node at the corresponding position is not occupied, the occupancy information of the child nodes of the current node is encoded using the intra-frame context information.
  • Embodiment 2: A block-based point cloud geometry inter-frame prediction decoding method
  • FIG. 2 shows a schematic flowchart of the block-based point cloud geometry inter-frame prediction decoding method provided by the present invention.
  • The block-based point cloud geometry inter-frame prediction decoding method provided by the present invention includes the following steps:
  • S220: Perform tree division on the frame to be decoded and the reference frame respectively according to the bounding box, and obtain the current node in the frame to be decoded and the corresponding prediction block in the reference frame;
  • S230: Decode the point cloud code stream to obtain the occupancy information of each child node of the node to be decoded, and obtain a point cloud.
  • Obtaining the same bounding box of the frame to be decoded and the reference frame described in S210 specifically includes method 1 or method 2:
  • Method 1: obtain the minimum bounding box of all frames of the entire point cloud sequence, as the bounding box of the frame to be decoded and the reference frame;
  • Method 2: obtain the minimum bounding box of the frame to be decoded and the reference frame, as the bounding box of the frame to be decoded and the reference frame.
  • the tree division in S220 includes octree, quadtree, and binary tree division.
  • The ways of obtaining the current node in the frame to be decoded and the corresponding prediction block in the reference frame described in S220 include, but are not limited to, the following mode 1 or mode 2:
  • Mode 1: according to the prediction block size, if the shortest side length of a block obtained by tree division of the frame to be decoded is equal to or less than the prediction block size, tree division is considered complete and the current node is obtained; in the reference frame, the block at the position corresponding to the current node is used directly as the prediction block.
  • Mode 2: according to the size PTU size of the initial prediction block PTU, if the shortest side length of a block obtained by tree division of the frame to be decoded is equal to or smaller than the PTU size, the initial prediction block PTU is obtained; the flag information split flags and occupied flags are obtained by decoding, the PTU is further divided according to the flag information, and the prediction basic units PU obtained after all division is completed are used as current nodes; the motion vector MV
  • Decoding the motion vector MV of each PU and calculating the corresponding prediction block specifically includes: decoding the code stream to obtain the motion vector MV; translating the current node by the motion vector MV and calculating the nearest neighbor point set in the reference point cloud, thereby obtaining the prediction block corresponding to the current node in the reference point cloud.
  • Decoding the point cloud code stream to obtain the occupancy information of each child node of the node to be decoded includes method 1 or method 2:
  • Method 1: for each child node, on the basis of the original intra-frame context, an additional inter-frame context bit is added to indicate whether the corresponding position in the prediction block is occupied, and the occupancy information of the child nodes of the current node is decoded.
  • Method 2: if the child node at the corresponding position in the prediction block is occupied, the occupancy codes of N child nodes in the corresponding intra-frame context mode are set to 1 as the context information, and the occupancy information of the child nodes of the current node is decoded; if the child node at the corresponding position in the prediction block is not occupied, the occupancy information of the child nodes of the current node is decoded using the intra-frame context information.
  • Embodiment 3: A block-based point cloud geometry inter-frame predictive coding method
  • FIG. 3 is a flowchart of an embodiment of the block-based point cloud geometry inter-frame predictive coding method proposed by the present invention.
  • S310: Set the same bounding box for the frame to be encoded and the reference frame as the root node of the tree division.
  • Input the geometric information of the frame to be encoded and of the already encoded frame that serves as the reference frame.
  • Tree division is performed on the frame to be encoded and the reference frame synchronously.
  • The calculation methods for the bounding box of the reference frame and the frame to be encoded include, but are not limited to, the following two schemes. The first method examines all frames of the entire sequence and computes the largest bounding box size that can contain all frames, to be used for the entire current sequence. The second method computes the bounding box sizes of only the reference frame and the current frame, and takes the largest bounding box size as the root node of tree division when encoding the current frame.
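  • Both bounding-box schemes reduce to taking the union of per-frame axis-aligned boxes; they differ only in which frames are included. The sketch below assumes frames are given as lists of (x, y, z) tuples, and the function names are illustrative.

```python
def frame_bbox(points):
    """Axis-aligned bounding box of one frame as (min_corner, max_corner)."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def shared_bbox(frames):
    """Smallest axis-aligned box that contains every frame in `frames`."""
    boxes = [frame_bbox(f) for f in frames]
    lo = tuple(min(b[0][i] for b in boxes) for i in range(3))
    hi = tuple(max(b[1][i] for b in boxes) for i in range(3))
    return lo, hi


sequence = [
    [(0, 0, 0), (4, 2, 1)],    # reference frame
    [(1, -1, 0), (3, 5, 1)],   # current frame
    [(0, 0, 0), (9, 1, 1)],    # a later frame
]
whole_sequence_box = shared_bbox(sequence)   # first method: all frames
pair_box = shared_bbox(sequence[:2])         # second method: reference + current
```

  • The pair-wise box is never larger than the whole-sequence box, so the second method tends to leave fewer empty blocks at the cost of recomputing the root for each frame pair.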
  • S320: Perform tree division on the frame to be coded and the reference frame respectively according to the bounding box, to obtain a current node in the frame to be coded and a corresponding prediction block in the reference frame, respectively.
  • The space of the obtained point cloud bounding box is partitioned by tree division. Depending on the shape of the bounding box, tree division modes such as octree, quadtree and binary tree can be used to decompose the space of the point cloud into several coding blocks.
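  • The text does not state how the division mode is chosen from the bounding box shape. One simple rule, assumed here purely for illustration, is to halve only the longest axes at each level, so that a non-cubic box is split by binary tree or quadtree division until it becomes cubic, after which octree division applies. Sizes are given as log2 side lengths.

```python
def split_axes(log2_size):
    """Axes to halve at this level: those whose log2 size equals the maximum."""
    longest = max(log2_size)
    if longest == 0:
        return ()  # unit-sized node: nothing left to split
    return tuple(a for a in range(3) if log2_size[a] == longest)


def partition_name(axes):
    return {3: "octree", 2: "quadtree", 1: "binary tree", 0: "leaf"}[len(axes)]


def partition_schedule(log2_size):
    """List the division mode used at each level until every side has length 1."""
    size = list(log2_size)
    schedule = []
    while True:
        axes = split_axes(size)
        if not axes:
            break
        schedule.append(partition_name(axes))
        for a in axes:
            size[a] -= 1
    return schedule


flat_box = partition_schedule((3, 2, 1))   # an 8 x 4 x 2 bounding box
cube_box = partition_schedule((2, 2, 2))   # a 4 x 4 x 4 bounding box
```

  • Under this assumed rule the 8 x 4 x 2 box needs only three levels, whereas padding it to an 8 x 8 x 8 cube would add nodes that can only be empty.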
  • PTU: Prediction Tree Unit.
  • A PTU can be further divided into multiple PUs (Prediction Units) through octree division; whether to divide further is decided according to the coding cost. After the division, the PU tree structure is obtained.
  • The PU is the basic unit of prediction, and each PU calculates a motion vector MV, which is used to find the best-matching prediction block in the reference frame.
  • Denote the PTU size as PTU_size. First perform tree division on the current point cloud; if the minimum side length of a node is equal to PTU_size, that node is regarded as a PTU.
  • A PTU can be divided by octree division into a PU tree containing multiple PUs.
  • The PU tree contains two kinds of flag information. The split flags indicate whether each layer of the PU tree is further divided; if division continues, the occupied flags indicate the occupancy information of the resulting child nodes.
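  • A PU tree carrying these two kinds of flags can be serialised depth-first, as in the sketch below. The in-memory representation (a leaf PU is `None`, a split node is a dict from child index 0..7 to subtree), the per-node split flag, and the 8-bit occupancy mask are assumptions for illustration; the text does not fix a syntax.

```python
def write_pu_tree(node, split_flags, occupied_flags):
    """Depth-first serialisation of a PU tree into the two flag lists."""
    if node is None:                 # leaf PU: not divided further
        split_flags.append(0)
        return
    split_flags.append(1)            # this node is divided by octree division
    occupied_flags.append(sum(1 << i for i in node))  # which children exist
    for i in sorted(node):
        write_pu_tree(node[i], split_flags, occupied_flags)


def read_pu_tree(split_flags, occupied_flags):
    """Rebuild the PU tree from the flag lists (decoder side)."""
    split_iter, occ_iter = iter(split_flags), iter(occupied_flags)

    def read():
        if next(split_iter) == 0:
            return None
        occ = next(occ_iter)
        return {i: read() for i in range(8) if occ >> i & 1}

    return read()


tree = {0: None, 5: {1: None, 7: None}}   # PTU with one nested split
split_flags, occupied_flags = [], []
write_pu_tree(tree, split_flags, occupied_flags)
decoded = read_pu_tree(split_flags, occupied_flags)
```

  • Because the decoder replays the same traversal, it recovers the identical set of PUs and can attach one decoded MV to each leaf.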
  • Each PU computes a motion vector MV. The resulting PU tree and MVs are encoded so that the decoder can generate the same prediction block and drive the arithmetic entropy coder in the same way as the encoder.
  • The ICP algorithm finds the matching block closest to the current coded node, that is, the one with the smallest Lagrangian cost, and obtains the corresponding MV.
  • The local point set mapped by the MV with the smallest cost is the optimal prediction block. If there is no corresponding prediction block, the node at the corresponding position in the reference frame is used directly as the prediction block.
  • The window size can be set according to the distribution characteristics of the data set and the code rate point. If the point cloud distribution is relatively scattered, a larger window range can be set; if the point cloud is relatively compact, a smaller window size can be set.
  • B is the current block to be encoded
  • W is the search window
  • Q is the nearest neighbor point set found by B after MV translation
  • Est() estimates the code word required for encoding MV, and is obtained by calculating the code word required for exponential Golomb encoding
  • Dist is the block matching loss function
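  • The rate term Est() can be sketched as the length of an exponential-Golomb codeword per MV component. The text only says "exponential Golomb"; the order-0 code and the signed-to-unsigned mapping used here (positive v maps to 2v-1, non-positive v maps to -2v) are assumptions.

```python
def signed_to_unsigned(v):
    """Map a signed integer to a non-negative index (assumed mapping)."""
    return 2 * v - 1 if v > 0 else -2 * v


def exp_golomb_codeword(u):
    """Order-0 exp-Golomb codeword of a non-negative integer, as a bit string."""
    binary = bin(u + 1)[2:]
    return "0" * (len(binary) - 1) + binary


def est_mv_bits(mv):
    """Estimated number of bits needed to code all components of an MV."""
    return sum(len(exp_golomb_codeword(signed_to_unsigned(c))) for c in mv)


zero_mv_bits = est_mv_bits((0, 0, 0))
shifted_mv_bits = est_mv_bits((2, -1, 0))
```

  • Small motion vectors cost few bits under this estimate, so the Lagrangian cost favours them unless a larger MV reduces the matching loss enough to pay for itself.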
  • searching for a matching block involves searching for the nearest neighbors of each point in the window in the block to be coded.
  • the present invention accelerates the searching process by establishing a KD tree.
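  • The KD-tree acceleration can be illustrated with a small pure-Python tree over the window points. This is a generic textbook construction, not the patent's implementation; a production codec would more likely use an optimised library.

```python
def build_kdtree(points, depth=0):
    """Recursively build a 3D KD-tree; each node is (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % 3
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return (pts[mid], axis,
            build_kdtree(pts[:mid], depth + 1),
            build_kdtree(pts[mid + 1:], depth + 1))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Return the stored point closest to `query` (None for an empty tree)."""
    if node is None:
        return best
    point, axis, left, right = node
    if best is None or sq_dist(query, point) < sq_dist(query, best):
        best = point
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if diff ** 2 < sq_dist(query, best):
        best = nearest(far, query, best)
    return best


window_pts = [((x * 7) % 11, (x * 5) % 13, (x * 3) % 7) for x in range(40)]
kd_tree = build_kdtree(window_pts)
hit = nearest(kd_tree, (5, 6, 3))
```

  • Each query prunes subtrees whose splitting plane lies farther away than the current best match, which avoids scanning every window point for every block point.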
  • the occupancy information of the prediction block obtained in S323 is used as the context to help encode the occupancy information of the current node.
  • This scheme is based on the point cloud intra-frame context entropy coding tool in , which adds inter-frame context information to help further improve the entropy coding efficiency.
  • The second method: if the corresponding predicted sub-node is occupied, this is treated as a strong prediction whose confidence equals that of the first 7 intra-frame neighbor sub-nodes, and the occupancy codes of the first seven sub-nodes, in order, in the corresponding intra-frame context mode [1] are all set to 1; if the corresponding predicted sub-node is not occupied, the original intra-frame context information is retained.
  • The third method: if the corresponding predicted sub-node is occupied, this is treated as a strong prediction whose confidence equals that of the first three intra-frame neighbor sub-nodes, and the occupancy codes of the first three sub-nodes in Morton order in the corresponding intra-frame context mode are all set to 1; if the corresponding predicted sub-node is not occupied, the original intra-frame context information is retained.
  • the occupancy code information is encoded using the context information obtained in S331 to obtain the code stream of the point cloud.
  • Embodiment 4 A block-based point cloud geometry inter-frame prediction decoding method
  • the same bounding box for the frame to be decoded and the reference frame is obtained by decoding the code stream or from external input, and tree division is performed synchronously on the frame to be decoded and the reference frame.
  • the calculation methods of the bounding boxes of the reference frame and the frame to be decoded include but are not limited to the following two methods: Method 1: Obtain the smallest bounding box of all frames of the entire point cloud sequence as the bounding box of the frame to be decoded and the reference frame; Method 2: Obtain the minimum bounding box of the frame to be decoded and the reference frame, as the bounding box of the frame to be decoded and the reference frame.
  • S420 Perform tree division on the frame to be decoded and the reference frame respectively according to the bounding box, and obtain the current node in the frame to be decoded and the corresponding prediction block in the reference frame.
  • the space of the obtained point cloud bounding box is divided into trees. According to the different shapes of the bounding box, the point cloud can be decomposed spatially by using tree division modes such as octree, quadtree and binary tree to obtain several decoding blocks.
  • the division of the prediction unit (the current block and the prediction block) includes, but is not limited to, the following two ways:
  • Method 1 Obtain the size of the prediction block by decoding the code stream or from external input. If the shortest side length of a block obtained by the tree division of the frame to be decoded is equal to or less than the size of the prediction block, the tree division is considered complete and the current node is obtained; among the blocks obtained by the synchronous division of the reference frame, the block at the position corresponding to the current node is directly used as the prediction block;
  • Method 2 Obtain the size PTU size of the initial prediction block PTU by decoding the code stream or from external input. If the shortest side length of a block obtained by the tree division of the frame to be decoded is equal to or less than the PTU size, the initial prediction block PTU is obtained; the flag information split flags and occupied flags is obtained by decoding, the PTU is further divided according to the flag information, and the prediction basic units PU obtained after all division is completed are used as the current nodes.
  • S430 Decode the point cloud code stream to obtain the occupancy information of each child node of the node to be decoded, and obtain a point cloud.
  • the prediction block obtained in S423 is synchronized with the current node to be decoded to perform octree division, and the occupancy information of each child node of the current node is decoded according to the occupancy information of the corresponding position child node in the prediction block to obtain a point cloud. It includes the following two methods:
  • Method 1 For each child node, based on the original intra-frame context, an additional inter-frame context bit is added to indicate whether the corresponding position in the prediction block is occupied, and the occupancy information of the child node of the current node is decoded;
  • Method 2 If the child node at the corresponding position in the prediction block is occupied, set the occupancy codes of the N child nodes in the corresponding intra-frame context mode to 1 as the context information, and decode the occupancy information of the child node of the current node; if the child node at the corresponding position in the prediction block is not occupied, decode the occupancy information of the child node of the current node using the intra-frame context information.
  • Embodiment 5 A block-based point cloud geometry inter-frame predictive coding method
  • the method of the present invention is used to perform lossless compression of point cloud geometric information, and the specific implementation steps are:
  • S110 Set the same bounding box for the frame to be encoded and the reference frame as the root node of the tree division.
  • the bounding box can be determined by two sets of information: the three-dimensional coordinates of the starting point, and the side lengths (length, width, and height). Taking this sequence as an example, the corresponding bounding box information is: the xyz coordinate values of the starting point are (-115100, -115025, -44140), and the side lengths are (230239, 230316, 48208). For all frames in the sequence, the tree partition uses this bounding box as the root node.
  • S120 Perform tree division on the to-be-coded frame and the reference frame respectively according to the bounding box, to obtain a current node in the to-be-coded frame and a corresponding prediction block in the reference frame, respectively.
  • first frame as a reference frame and encoding the second frame as an example
  • full intra-frame encoding is adopted.
  • input the reconstructed first frame as the reference frame and use the frame to be encoded as the benchmark to perform a synchronous tree division operation on the point cloud space of the two frames, and then two identical tree structures can be obtained.
  • a tree node corresponds to a space block area.
  • a space block can be further divided into eight, four or two sub-block spaces, and occupancy codes of different lengths are used to identify whether the corresponding sub-blocks contain points; each layer only further divides the nodes that contain points.
  • the PTU_size+2*512, that is, the current PTU range is expanded by a length of 512 in the three coordinate directions.
  • the PTU can continue to be divided into a PU tree structure and corresponding multiple PUs through the octree division.
  • the PTU can also be regarded as a PU with a larger size.
  • the PU tree contains two kinds of flag information: the split flags indicate whether the nodes of each layer of the PU tree are further divided; if division continues, the occupied flags indicate the occupancy information of the divided child nodes. For each PU, first calculate the encoding cost when it is not further divided. The cost calculation formula is as follows:
  • B is the current block to be coded
  • W is the search window
  • MV is the motion vector corresponding to the closest matching block obtained by the iterative nearest point algorithm for the current node within the search window
  • Q is the nearest neighbor point set found by B after MV translation
  • Est() estimates the number of codeword bits required to encode the MV, obtained by calculating the codeword length of its exponential Golomb code
  • Dist is the block matching loss function, and the formula is as follows:
  • each PU can obtain an MV and a corresponding prediction block, and then perform motion compensation, that is, use the prediction block to replace the node at the corresponding position in the reference frame, and encode the corresponding motion vector and PU tree structure information, so that the decoding end can also generate the same prediction block.
  • for a PU to be coded, after the prediction block is obtained through step 3 or step 4, further tree division is performed on the prediction block and the PU, and the corresponding predicted occupancy code is obtained, which is used as the context to help encode the occupancy information of the current PU.
  • the child nodes divided by the PU there are two kinds of inter-frame prediction context situations: 1) the child node at the corresponding position in the prediction block is occupied, and 2) the child node at the corresponding position in the prediction block is not occupied.
  • the first way is to add an additional context bit to each child node on top of the original intra-frame context, which indicates whether the corresponding position in the prediction block is occupied.
  • the second method is that if the corresponding prediction sub-node is occupied, it is considered a strong prediction with the same prediction confidence as the first 7 neighbor sub-nodes in the frame, and the occupancy codes of the seven sub-nodes are all set to 1; if the corresponding predicted sub-node is not occupied, the original intra-frame context information is retained.
  • the third method is that if the corresponding prediction sub-node is occupied, it is considered a strong prediction with the same prediction confidence as the first three neighbor sub-nodes in the frame, and the occupancy codes of the first three sub-nodes in Morton order in the corresponding intra-frame context mode are all set to 1; if the corresponding prediction sub-node is not occupied, the original intra-frame context information is retained.
  • FIG. 6 shows the performance results of implementing the method of the present invention under the condition of geometric lossy compression based on the AVS point cloud compression platform PCRMv3.0, and stable performance gains are obtained on different data sets.
  • FIG. 7 is a performance result of implementing the method of the present invention under the condition of geometric lossless compression based on the AVS point cloud compression platform PCRMv3.0.
  • the block-based point cloud geometric inter-frame prediction method of the present invention can be used to realize the geometric compression of the point cloud.
  • the method first calculates the bounding box information of the point cloud, and then divides the octree or quadtree and binary tree. When the size of the divided sub-nodes reaches the set prediction unit requirements, the search error in the reference frame for the current node is small.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Disclosed are a block-based point cloud geometric inter-frame prediction encoding method and geometric inter-frame prediction decoding method. The encoding method comprises: S110, setting the same bounding box for a frame to be encoded and a reference frame, and taking the bounding box as a root node for tree division; S120, according to the bounding box, respectively performing tree division on the frame to be encoded and the reference frame, and respectively obtaining the current node in the frame to be encoded and a corresponding prediction block in the reference frame; and S130, for each sub-node of the current node, encoding occupation information of each sub-node of the current node according to occupation information of a sub-node at a corresponding position in the prediction block, so as to obtain a code stream. The decoding method comprises: S210, obtaining the same bounding box of a frame to be decoded and a reference frame, and taking the bounding box as a root node for tree division; S220, according to the bounding box, respectively performing tree division on the frame to be decoded and the reference frame, and according to tree division and/or a point cloud code stream, obtaining the current node in the frame to be decoded and a corresponding prediction block in the reference frame; and S230, performing decoding according to the point cloud code stream to obtain occupation information of each sub-node of the node to be decoded, so as to obtain a point cloud.

Description

A block-based point cloud geometric inter-frame prediction method and decoding method
Technical Field
The invention belongs to the technical field of point cloud data processing, and in particular relates to a block-based point cloud geometric inter-frame prediction method and decoding method.
Background
Inter-frame prediction has long been a technical difficulty in point cloud encoding and decoding. Because a point cloud has no fixed spatial structure, it is hard to obtain a regular block structure as directly as in traditional video coding. The sparsity and flexibility of the spatial distribution of point clouds also make block-based motion estimation and motion compensation considerably harder.
Existing point cloud inter-frame prediction frameworks mainly include the following:
Prediction based on XOR of occupancy codes: this method XORs the occupancy codes of the current frame and the predicted frame to obtain the geometric residual between the two frames, and then encodes the residual. However, it cannot capture the temporal motion of point cloud objects; when an object moves by a large amount, the resulting occupancy code residual is also large.
Mapping-based prediction: this method first maps the three-dimensional point cloud onto two-dimensional planes at multiple angles to obtain several two-dimensional images, and then compresses them with existing video coding tools. It depends heavily on the quality of the mapping, and such mapping-based compression mainly targets surface-scanned data sets of three-dimensional people and objects; for sparse data sets such as lidar-scanned maps, the inter-frame prediction gain is limited.
Inter-frame prediction based on block division: this method first divides the point cloud directly into blocks, then searches the reference frame for the matching block with the smallest error as the prediction block, and uses the prediction block to improve the coding efficiency of the current block. There are currently two ways to use the prediction block. The first replaces the current block with the prediction block directly, calculates the prediction residual, and encodes the resulting residual coefficients. However, because of the sparsity of the spatial distribution of point clouds, the residual coefficients of the prediction block are often still large, and if the residual is left uncoded to save codewords, lossless compression cannot be achieved. The second uses the geometric occupancy information of the prediction block as context to improve the entropy coding efficiency of the current block. The exploration platform EM13 proposed by the MPEG point cloud group uses this kind of prediction, but it only supports octree division of cubic bounding boxes, so the bounding box is set too large and too many empty blocks are introduced; it also suffers from excessively high motion estimation complexity.
Summary of the Invention
In order to overcome the above shortcomings of the prior art and further improve the compression performance of point cloud attributes while taking computational complexity into account, the present invention provides a block-based point cloud geometric inter-frame prediction method. It combines point cloud quadtree and binary tree division to support block division of non-cubic bounding boxes, which greatly reduces the codewords wasted on empty blocks. It also combines a KD tree with the iterative closest point algorithm for block matching, which substantially reduces the computational time complexity of motion estimation while preserving the compression ratio. In addition, the invention simplifies the inter-frame context model: by fusing it with the intra-frame context model, entropy coding efficiency is significantly improved without increasing the number of contexts.
The first objective of the present invention is to disclose a block-based point cloud geometric inter-frame prediction encoding method.
The second objective of the present invention is to disclose a block-based point cloud geometric inter-frame prediction decoding method.
The first objective of the present invention is achieved by the following scheme: a block-based point cloud geometric inter-frame prediction encoding method, characterized by comprising the following steps. S110: set the same bounding box for the frame to be encoded and the reference frame, as the root node of the tree division. S120: perform tree division on the frame to be encoded and the reference frame respectively according to the bounding box, obtaining the current node in the frame to be encoded and the corresponding prediction block in the reference frame. S130: for each child node of the current node, encode the occupancy information of that child node according to the occupancy information of the child node at the corresponding position in the prediction block, to obtain a code stream.
Preferably, setting the same bounding box for the frame to be encoded and the reference frame in S110 specifically includes Method 1 or Method 2. Method 1: calculate the smallest bounding box that contains all frames of the entire point cloud sequence, as the bounding box of the frame to be encoded and the reference frame. Method 2: calculate the smallest bounding box of the frame to be encoded and the reference frame, as the bounding box of the frame to be encoded and the reference frame.
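Both bounding box methods reduce to the same computation over a different set of frames. A minimal Python sketch follows; the function name `bounding_box`, the integer point representation, and the `+1` in the side length are illustrative assumptions, not taken from the source.

```python
from typing import Iterable, Sequence, Tuple

Point = Tuple[int, int, int]
Box = Tuple[Point, Point]  # (starting point, side lengths)


def bounding_box(frames: Iterable[Sequence[Point]]) -> Box:
    """Smallest axis-aligned box containing every point of every given frame."""
    pts = [p for frame in frames for p in frame]
    if not pts:
        raise ValueError("at least one point is required")
    lo = tuple(min(p[a] for p in pts) for a in range(3))
    hi = tuple(max(p[a] for p in pts) for a in range(3))
    # Assumption: coordinates are integer voxel positions, so a side that
    # spans lo..hi inclusive has length hi - lo + 1.
    size = tuple(hi[a] - lo[a] + 1 for a in range(3))
    return lo, size
```

Method 1 would pass every frame of the sequence; Method 2 would pass only the frame to be encoded and its reference frame, for example `bounding_box([current_frame, reference_frame])`.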
Preferably, the tree division in S120 includes octree, quadtree, and binary tree division.
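The source does not state how the division mode is chosen for a given bounding box shape. The sketch below uses one plausible rule as an assumption: halve only the axes that are longer than the shortest side (binary tree or quadtree), and halve all three once the block is cubic (octree). Division stops when the shortest side is no larger than the prediction block size, which is the stopping condition the text describes. The names `split_axes` and `division_modes` are hypothetical.

```python
from typing import List, Tuple

MODE_NAMES = {3: "octree", 2: "quadtree", 1: "binary tree"}


def split_axes(size: Tuple[int, int, int]) -> List[int]:
    """Axes to halve at this level (assumed rule, see lead-in)."""
    shortest = min(size)
    longer = [a for a in range(3) if size[a] > shortest]
    return longer if longer else [0, 1, 2]


def division_modes(size: Tuple[int, int, int], block_size: int):
    """Return the sequence of division modes and the final block size."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    cur = list(size)
    modes = []
    # Stop once the shortest side is equal to or less than the block size.
    while min(cur) > block_size:
        axes = split_axes((cur[0], cur[1], cur[2]))
        modes.append(MODE_NAMES[len(axes)])
        for a in axes:
            cur[a] = (cur[a] + 1) // 2  # halve, rounding up
    return modes, tuple(cur)
```

Under this rule a flat box such as `(16, 16, 4)` is first divided along its two long axes only, so no depth is spent splitting the short axis into empty blocks.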
Preferably, in S120, the ways of obtaining the current node in the frame to be encoded and the corresponding prediction block in the reference frame include, but are not limited to, Mode 1 or Mode 2. Mode 1: set the size of the prediction block; if the shortest side length of a block obtained by the tree division of the frame to be encoded is equal to or less than the size of the prediction block, the tree division is considered complete and the current node is obtained; among the blocks obtained by the synchronous division of the reference frame, the block at the position corresponding to the current node is used directly as the prediction block. Mode 2: set the size PTU size of the initial prediction block PTU; if the shortest side length of a block obtained by the tree division of the frame to be encoded is equal to or less than the PTU size, the initial prediction block PTU is obtained; decide according to the coding cost whether to divide the PTU further, and record the flag information, where the split flags indicate whether each node of the PTU is further divided and the occupied flags indicate the occupancy information of the divided child nodes; the prediction basic units PU obtained after all division is completed are used as the current nodes; for each current node, calculate the best matching prediction block and the corresponding motion vector MV, and encode the flag information and the motion vector MV.
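The cost-driven PTU division in Mode 2 can be sketched as a recursion that records split flags and occupied flags. This is a greedy illustration under stated assumptions: the block is cubic and divided by octree, the `cost` callable is a hypothetical stand-in for the coding cost, and a node is split when the summed unsplit cost of its occupied children is lower than its own. The source does not spell out the exact comparison.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[int, int, int]
Block = Tuple[Point, int]  # (origin, side length) of a cubic block


def octree_children(block: Block, points: Sequence[Point]):
    """Group points into occupied octants: (index, child block, child points)."""
    (ox, oy, oz), side = block
    half = side // 2
    buckets = {}
    for p in points:
        idx = ((p[0] - ox >= half) << 2) | ((p[1] - oy >= half) << 1) | (p[2] - oz >= half)
        buckets.setdefault(idx, []).append(p)
    children = []
    for idx in sorted(buckets):
        origin = (ox + half * ((idx >> 2) & 1),
                  oy + half * ((idx >> 1) & 1),
                  oz + half * (idx & 1))
        children.append((idx, (origin, half), buckets[idx]))
    return children


def build_pu_tree(block: Block, points: Sequence[Point],
                  cost: Callable[[Block, Sequence[Point]], float], min_side: int,
                  split_flags: List[int], occupied_flags: List[int],
                  pus: List[Block]) -> None:
    """Append flags in pre-order and collect the leaf PUs."""
    children = octree_children(block, points) if block[1] > min_side else []
    split = bool(children) and sum(cost(b, pts) for _, b, pts in children) < cost(block, points)
    split_flags.append(int(split))
    if not split:
        pus.append(block)  # this node becomes a prediction unit
        return
    # One occupancy byte per split node: bit i is set if octant i holds points.
    occupied_flags.append(sum(1 << idx for idx, _, _ in children))
    for _, b, pts in children:
        build_pu_tree(b, pts, cost, min_side, split_flags, occupied_flags, pus)
```

In a real encoder `cost` would be the Lagrangian cost of the best motion vector found for the block; any callable with the same shape works for experimenting with the recursion.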
Preferably, calculating the best matching prediction block and the corresponding motion vector MV for each PU in Mode 2 specifically includes: determining a search window W; obtaining a local point cloud in the reference frame according to the search window W; and, within the local point cloud, for the prediction basic unit PU (that is, the current node), using a simplified ICP algorithm to obtain the matching block with the smallest error and the corresponding motion vector MV, which are taken as the prediction block and the motion vector MV of the current node; the motion vector MV is then encoded.
Preferably, determining the search window W includes: the size of the search window W is set according to the distribution characteristics of the data set and the code rate point; if the point cloud distribution is relatively discrete, a larger window range is set; if the point cloud is relatively dense, a smaller window range is set.
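As a rough illustration of obtaining the local point cloud, the sketch below expands the current block by a margin in each coordinate direction and keeps the reference points that fall inside. The function names and the half-open interval convention are assumptions; the margin is a tunable parameter (the embodiment elsewhere in this document uses PTU_size+2*512, that is, a margin of 512 on each side).

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int, int]
Window = Tuple[Point, Point]  # (inclusive lower corner, exclusive upper corner)


def search_window(origin: Point, size: Point, margin: int) -> Window:
    """Expand a block by `margin` on both sides of every axis."""
    lo = tuple(o - margin for o in origin)
    hi = tuple(o + s + margin for o, s in zip(origin, size))
    return lo, hi


def local_point_cloud(ref_points: Sequence[Point], window: Window) -> List[Point]:
    """Reference-frame points that lie inside the search window."""
    lo, hi = window
    return [p for p in ref_points
            if all(lo[a] <= p[a] < hi[a] for a in range(3))]
```

A sparse lidar sequence would call `search_window` with a larger margin than a dense surface scan, in line with the guidance above.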
Preferably, obtaining the matching block with the smallest error and the corresponding motion vector MV using the simplified ICP algorithm includes:
The simplified ICP algorithm considers only translation, and the error is measured as a Lagrangian cost, calculated as follows:
Cost(MV) = Dist(Q(W, MV), B) + λ·Est(MV)
where B is the current node of the frame to be encoded, W is the search window, Q is the nearest neighbor point set obtained after translating the current node B by the motion vector MV, Est(MV) is the estimated codeword size required to encode the motion vector MV, and Dist is the block matching loss function, which estimates the codeword size required to encode the matching deviation.
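To make the rate term concrete, here is a small sketch of Est(MV) and the Lagrangian cost. The document says elsewhere that Est is computed from the exponential Golomb codeword; the order-0 code and the signed-to-unsigned mapping used below are assumptions, as are the function names.

```python
from typing import Tuple


def exp_golomb_bits(k: int) -> int:
    """Bit length of the order-0 exponential Golomb code of an unsigned integer."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return 2 * (k + 1).bit_length() - 1


def signed_to_unsigned(v: int) -> int:
    """Assumed mapping of signed MV components: 0, 1, -1, 2, -2 -> 0, 1, 2, 3, 4."""
    return 2 * v - 1 if v > 0 else -2 * v


def est_mv_bits(mv: Tuple[int, int, int]) -> int:
    """Est(MV): total bits to code the three MV components."""
    return sum(exp_golomb_bits(signed_to_unsigned(c)) for c in mv)


def lagrangian_cost(dist_value: float, mv: Tuple[int, int, int], lam: float) -> float:
    """Cost(MV) = Dist + lambda * Est(MV), with Dist supplied by the caller."""
    return dist_value + lam * est_mv_bits(mv)
```

A larger `lam` penalises long motion vectors more heavily, so the search favours small displacements unless they reduce the matching loss noticeably.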
Preferably, the block matching loss function Dist is calculated as follows:
Figure PCTCN2021114282-appb-000001
where w is a point in the local point cloud of the reference point cloud corresponding to the search window W, and q is a point in the nearest neighbor point set Q found after translating the current node B by the motion vector MV.
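A translation-only ICP can be sketched as follows. Because the Dist formula survives here only as an image reference, the loss below (sum of squared distances from each translated block point to its nearest window point) is an assumed stand-in, and nearest neighbors are found by brute force rather than with the KD tree the invention uses for speed. All names are illustrative.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int, int]


def sq_dist(a: Point, b: Point) -> int:
    return sum((a[i] - b[i]) ** 2 for i in range(3))


def nearest(p: Point, cloud: Sequence[Point]) -> Point:
    """Brute-force nearest neighbor; a KD tree would replace this in practice."""
    return min(cloud, key=lambda w: sq_dist(p, w))


def translate(points: Sequence[Point], mv: Point) -> List[Point]:
    return [tuple(p[i] + mv[i] for i in range(3)) for p in points]


def dist(block: Sequence[Point], window: Sequence[Point], mv: Point) -> int:
    """Assumed matching loss: summed squared nearest-neighbor distances."""
    return sum(sq_dist(p, nearest(p, window)) for p in translate(block, mv))


def translation_icp(block: Sequence[Point], window: Sequence[Point],
                    max_iters: int = 10) -> Point:
    """Iteratively refine an integer MV using only translation updates."""
    mv = (0, 0, 0)
    for _ in range(max_iters):
        moved = translate(block, mv)
        matches = [nearest(p, window) for p in moved]
        # The least-squares translation is the mean residual; it is rounded
        # so that the MV stays an integer vector that can be entropy coded.
        step = tuple(round(sum(q[i] - p[i] for p, q in zip(moved, matches)) / len(moved))
                     for i in range(3))
        if step == (0, 0, 0):
            break
        mv = tuple(mv[i] + step[i] for i in range(3))
    return mv
```

The resulting MV and its `dist` value would then be combined with the rate estimate to form the Lagrangian cost of the candidate.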
Preferably, encoding the occupancy information of each child node of the current node according to the occupancy information of the child node at the corresponding position in the prediction block in S130 includes Method 1 or Method 2. Method 1: for each child node, on top of the original intra-frame context, add one inter-frame context bit that indicates whether the corresponding position in the prediction block is occupied, and encode the occupancy information of the child node of the current node. Method 2: if the child node at the corresponding position in the prediction block is occupied, set the occupancy codes of the N child nodes in the corresponding intra-frame context mode to 1 as the context information, and encode the occupancy information of the child node of the current node; if the child node at the corresponding position in the prediction block is not occupied, use the intra-frame context information to encode the occupancy information of the child node of the current node.
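The two context constructions can be compared in a few lines. The sketch assumes the intra-frame context is available as a list of 0/1 occupancy bits of already-coded neighbor child nodes and that "the N child nodes" are the first N entries of that list; both the layout and the function names are illustrative assumptions.

```python
from typing import List, Sequence


def context_method1(intra_bits: Sequence[int], pred_occupied: bool) -> List[int]:
    """Method 1: append one inter-frame bit to the intra-frame context."""
    return list(intra_bits) + [1 if pred_occupied else 0]


def context_method2(intra_bits: Sequence[int], pred_occupied: bool, n: int) -> List[int]:
    """Method 2: on an occupied prediction, force the first n intra bits to 1."""
    bits = list(intra_bits)
    if pred_occupied:
        for i in range(min(n, len(bits))):
            bits[i] = 1
    return bits  # unchanged intra context when the prediction is unoccupied


def context_index(bits: Sequence[int]) -> int:
    """Pack context bits into an integer index for the entropy coder."""
    idx = 0
    for b in bits:
        idx = (idx << 1) | b
    return idx
```

Method 1 doubles the number of contexts because of the extra bit, while Method 2 reuses the existing intra-frame context indices, which matches the stated aim of improving entropy coding without increasing the number of contexts.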
The second objective of the present invention is achieved by the following scheme: a block-based point cloud geometric inter-frame prediction decoding method, characterized by comprising the following steps. S210: obtain the same bounding box for the frame to be decoded and the reference frame, as the root node of the tree division. S220: perform tree division on the frame to be decoded and the reference frame respectively according to the bounding box, and obtain the current node in the frame to be decoded and the corresponding prediction block in the reference frame according to the tree division and/or the point cloud code stream. S230: decode the point cloud code stream to obtain the occupancy information of each child node of the node to be decoded, and obtain the point cloud.
Preferably, obtaining the same bounding box for the frame to be decoded and the reference frame in S210 specifically includes Method 1 or Method 2. Method 1: obtain the smallest bounding box of all frames of the entire point cloud sequence, as the bounding box of the frame to be decoded and the reference frame. Method 2: obtain the smallest bounding box of the frame to be decoded and the reference frame, as the bounding box of the frame to be decoded and the reference frame.
Preferably, the tree division in S220 includes octree, quadtree, and binary tree division.
Preferably, the ways of obtaining the current node in the frame to be decoded and the corresponding prediction block in the reference frame in S220 include, but are not limited to, Mode 1 or Mode 2. Mode 1: according to the size of the prediction block, if the shortest side length of a block obtained by the tree division of the frame to be decoded is equal to or less than the size of the prediction block, the tree division is considered complete and the current node is obtained; among the blocks obtained by the synchronous division of the reference frame, the block at the position corresponding to the current node is used directly as the prediction block. Mode 2: according to the size PTU size of the initial prediction block PTU, if the shortest side length of a block obtained by the tree division of the frame to be decoded is equal to or less than the PTU size, the initial prediction block PTU is obtained; the flag information split flags and occupied flags is obtained by decoding, the PTU is further divided according to the flag information, and the prediction basic units PU obtained after all division is completed are used as the current nodes; the motion vector MV of each current node is obtained by decoding, and the corresponding prediction block is calculated.
Preferably, decoding the motion vector MV of each PU and calculating the corresponding prediction block in Mode 2 specifically includes: decoding the code stream to obtain the motion vector MV; and translating the current node by the motion vector MV and calculating the nearest neighbor point set in the reference point cloud, to obtain the prediction block corresponding to the current node in the reference point cloud.
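On the decoder side the points of the current node are not yet known, so one simple reading of this step, offered here as an assumption, is to translate the node's cubic volume by the decoded MV, collect the reference points inside it, and shift them back into the node's coordinates. The predicted child occupancy then follows directly. Function names and the sign convention of the MV are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int, int]


def prediction_block(ref_points: Sequence[Point], origin: Point,
                     side: int, mv: Point) -> List[Point]:
    """Reference points inside the MV-translated node, in node coordinates."""
    lo = tuple(origin[a] + mv[a] for a in range(3))
    block = []
    for p in ref_points:
        if all(lo[a] <= p[a] < lo[a] + side for a in range(3)):
            block.append(tuple(p[a] - mv[a] for a in range(3)))
    return block


def child_occupancy(points: Sequence[Point], origin: Point, side: int) -> int:
    """8-bit occupancy code: bit i is set if octant i contains a point."""
    half = side // 2
    occ = 0
    for p in points:
        idx = (((p[0] - origin[0] >= half) << 2)
               | ((p[1] - origin[1] >= half) << 1)
               | (p[2] - origin[2] >= half))
        occ |= 1 << idx
    return occ
```

The bits of the resulting occupancy code are what the context methods consume as "the corresponding position in the prediction block is occupied".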
Preferably, decoding the occupancy information of each child node of the current node according to the occupancy information of the child node at the corresponding position in the prediction block in S230 includes Method 1 or Method 2. Method 1: for each child node, on top of the original intra-frame context, add one inter-frame context bit that indicates whether the corresponding position in the prediction block is occupied, and decode the occupancy information of the child node of the current node. Method 2: if the child node at the corresponding position in the prediction block is occupied, set the occupancy codes of the N child nodes in the corresponding intra-frame context mode to 1 as the context information, and decode the occupancy information of the child node of the current node; if the child node at the corresponding position in the prediction block is not occupied, use the intra-frame context information to decode the occupancy information of the child node of the current node.
By adopting the above technical solutions, the block-based point cloud geometric inter-frame prediction method of the present invention has the following main advantages over the prior art:
(1) For inter-frame prediction, non-cubic point cloud bounding boxes are supported, which further reduces the share of space taken by empty blocks and reduces codeword waste.
(2) ICP is used for motion estimation, so the MV is not restricted to fixed search directions; the optimal matching block can be found more accurately, and a complex motion vector search process is avoided.
(3) The added inter-frame context is simple and places no extra burden on hardware implementation.
Description of Drawings
In order to illustrate the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the specific embodiments or the prior art are briefly introduced below.
FIG. 1 is a flow chart of the block-based point cloud geometric inter-frame prediction encoding method proposed by the present invention.
FIG. 2 is a flow chart of the block-based point cloud geometric inter-frame prediction decoding method proposed by the present invention.
FIG. 3 is a flow chart of an embodiment of the block-based point cloud geometric inter-frame prediction encoding method proposed by the present invention.
FIG. 4 is a flow chart of an embodiment of the block-based point cloud geometric inter-frame prediction decoding method proposed by the present invention.
FIG. 5 is a performance diagram of an embodiment of the block-based point cloud geometric inter-frame prediction encoding method proposed by the present invention.
FIG. 6 is a performance diagram of the block-based point cloud geometric inter-frame prediction decoding method proposed by the present invention under geometric lossy compression.
FIG. 7 is a performance diagram of the block-based point cloud geometric inter-frame prediction decoding method proposed by the present invention under geometric lossless compression.
Detailed description
The present invention is further described below through embodiments and with reference to the drawings, without limiting the scope of the invention in any way.
The block-partition-based point cloud geometry inter-frame prediction method of the present invention uses block motion estimation to capture the temporal motion of the point cloud and to improve geometry compression performance. For the compression of three-dimensional point cloud geometry, it uses the spatial occupancy information of previously encoded frames to predict the geometric occupancy information of the current block to be encoded, and uses that prediction as context to improve the entropy coding efficiency of the occupancy codes, thereby improving geometry compression performance. The method comprises the following steps:
Embodiment 1: a block-based point cloud geometry inter-frame prediction encoding method
FIG. 1 is a schematic flowchart of the block-based point cloud geometry inter-frame prediction encoding method provided by the present invention. As shown in FIG. 1, the method comprises the following steps:
S110: Set the same bounding box for the frame to be encoded and the reference frame, as the root node of the tree partition;
S120: Perform tree partitioning on the frame to be encoded and on the reference frame according to the bounding box, obtaining the current node in the frame to be encoded and the corresponding prediction block in the reference frame;
S130: For each child node of the current node, encode the occupancy information of that child node according to the occupancy information of the co-located child node in the prediction block, obtaining the bitstream.
Preferably, setting the same bounding box for the frame to be encoded and the reference frame in S110 comprises Method 1 or Method 2. Method 1: compute the minimum bounding box that contains all frames of the entire point cloud sequence, and use it as the bounding box of the frame to be encoded and of the reference frame. Method 2: compute the minimum bounding box of the frame to be encoded and the reference frame, and use it as the bounding box of the frame to be encoded and of the reference frame.
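The two bounding-box options can be illustrated with a minimal sketch. The function names and the tuple-based point representation are hypothetical, and the convention that a side length counts integer positions (max minus min plus 1) is an assumption; the source does not state how side lengths are measured.

```python
from typing import Iterable, Sequence, Tuple

Point = Tuple[int, int, int]
Box = Tuple[Point, Point]  # (origin, side lengths)


def bounding_box(frames: Iterable[Sequence[Point]]) -> Box:
    """Smallest axis-aligned box containing every point of every given frame."""
    points = [p for frame in frames for p in frame]
    if not points:
        raise ValueError("at least one point is required")
    lo = tuple(min(p[a] for p in points) for a in range(3))
    hi = tuple(max(p[a] for p in points) for a in range(3))
    # Assumed convention: a side length counts integer positions.
    size = tuple(hi[a] - lo[a] + 1 for a in range(3))
    return lo, size


def sequence_box(sequence: Sequence[Sequence[Point]]) -> Box:
    """Method 1: one box shared by all frames of the sequence."""
    return bounding_box(sequence)


def pair_box(current: Sequence[Point], reference: Sequence[Point]) -> Box:
    """Method 2: one box shared by the current frame and its reference."""
    return bounding_box([current, reference])
```

Either way, both frames are partitioned from the same root, so a node in one tree has a co-located node in the other.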
Preferably, the tree partition in S120 includes octree, quadtree, and binary-tree partitioning.
Preferably, in S120, the ways of obtaining the current node in the frame to be encoded and the corresponding prediction block in the reference frame include, but are not limited to, Mode 1 or Mode 2. Mode 1: set the size of the prediction block; when the shortest side of a block obtained by tree partitioning of the frame to be encoded is equal to or smaller than the prediction block size, the tree partition is considered complete and the current node is obtained; among the blocks obtained by the synchronized partition of the reference frame, the block co-located with the current node is used directly as the prediction block. Mode 2: set the size PTU size of the initial prediction block PTU; when the shortest side of a block obtained by tree partitioning of the frame to be encoded is equal to or smaller than PTU size, the initial prediction block PTU is obtained; decide according to the coding cost whether to partition the PTU further, and record flag information, where the split flags indicate whether each node of the PTU is partitioned further and the occupied flags indicate the occupancy of the resulting child nodes; the prediction units (PUs) obtained once partitioning is complete are used as the current nodes; for each current node, compute the best-matching prediction block and the corresponding motion vector MV, and encode the flag information and the motion vector MV.
Preferably, computing the best-matching prediction block and the corresponding motion vector MV for each PU in Mode 2 comprises: determining a search window W; obtaining a local point cloud in the reference frame according to the search window W; and, within the local point cloud, for the prediction unit PU (that is, the current node), using a simplified ICP algorithm to obtain the matching block with the smallest error and the corresponding motion vector MV, which are used as the prediction block and motion vector MV of the current node; the motion vector MV is then encoded.
Preferably, determining the search window W comprises: setting the size of the search window W according to the distribution characteristics of the data set and the bit-rate point; if the point cloud is sparsely distributed, a larger window is set; if the point cloud is dense, a smaller window is set.
Preferably, using the simplified ICP algorithm to obtain the matching block with the smallest error and the corresponding motion vector MV comprises:
The simplified ICP algorithm considers only translation, and the error is measured as a Lagrangian cost, calculated as follows:
Cost(MV) = Dist(Q(W, MV), B) + λ Est(MV)
where B is the current node of the frame to be encoded, W is the search window, Q is the set of nearest neighbors obtained after the current node B is translated by the motion vector MV, Est(MV) estimates the number of bits needed to encode the motion vector MV, and Dist is a block matching loss function that estimates the number of bits needed to encode the matching deviation.
Preferably, the block matching loss function Dist is calculated as follows:
Figure PCTCN2021114282-appb-000002
where w is a point of the local point cloud that the search window W selects in the reference point cloud, and q is a point of the nearest-neighbor set Q found after the current node B is translated by the motion vector MV.
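A translation-only ICP loop of the kind described above can be sketched as follows. This is a generic illustration rather than the patented procedure: the function names, the iteration count, the convergence threshold, and the brute-force nearest-neighbor search are all assumptions.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def nearest(p: Vec, cloud: Sequence[Vec]) -> Vec:
    """Brute-force nearest neighbor of p in cloud (Euclidean distance)."""
    return min(cloud, key=lambda w: math.dist(p, w))


def translation_icp(block: Sequence[Vec], window: Sequence[Vec],
                    iterations: int = 10) -> Vec:
    """Estimate the translation that aligns `block` with points in `window`.

    Each iteration matches every translated block point to its nearest
    window point and moves the translation by the mean residual.
    Rotation is never estimated.
    """
    t = (0.0, 0.0, 0.0)
    for _ in range(iterations):
        moved: List[Vec] = [tuple(b[a] + t[a] for a in range(3)) for b in block]
        matches = [nearest(m, window) for m in moved]
        step = tuple(
            sum(w[a] - m[a] for w, m in zip(matches, moved)) / len(block)
            for a in range(3)
        )
        t = tuple(t[a] + step[a] for a in range(3))
        if max(abs(s) for s in step) < 1e-9:  # converged
            break
    return t
```

The resulting translation would then be rounded to the precision used for the motion vector; that rounding step is likewise an assumption here.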
Preferably, in S130, encoding the occupancy information of each child node of the current node according to the occupancy information of the co-located child node in the prediction block comprises Method 1 or Method 2. Method 1: for each child node, one additional inter-frame context bit is added on top of the original intra-frame context to indicate whether the corresponding position in the prediction block is occupied, and the occupancy information of the child node of the current node is encoded. Method 2: if the co-located child node in the prediction block is occupied, the occupancy codes of N child nodes in the corresponding intra-frame context pattern are all set to 1 and used as the context information for encoding the occupancy information of the child node of the current node; if the co-located child node in the prediction block is unoccupied, the intra-frame context information is used to encode the occupancy information of the child node of the current node.
Embodiment 2: a block-based point cloud geometry inter-frame prediction decoding method
FIG. 2 is a schematic flowchart of the block-based point cloud geometry inter-frame prediction decoding method provided by the present invention. As shown in FIG. 2, the method comprises the following steps:
S210: Obtain the bounding box shared by the frame to be decoded and the reference frame, as the root node of the tree partition;
S220: Perform tree partitioning on the frame to be decoded and on the reference frame according to the bounding box, and obtain the current node in the frame to be decoded and the corresponding prediction block in the reference frame according to the tree partition and/or the point cloud bitstream;
S230: Decode the point cloud bitstream to obtain the occupancy information of each child node of the node to be decoded, obtaining the point cloud.
Preferably, obtaining the bounding box shared by the frame to be decoded and the reference frame in S210 comprises Method 1 or Method 2. Method 1: obtain the minimum bounding box of all frames of the entire point cloud sequence, and use it as the bounding box of the frame to be decoded and of the reference frame. Method 2: obtain the minimum bounding box of the frame to be decoded and the reference frame, and use it as the bounding box of the frame to be decoded and of the reference frame.
Preferably, the tree partition in S220 includes octree, quadtree, and binary-tree partitioning.
Preferably, in S220, the ways of obtaining the current node in the frame to be decoded and the corresponding prediction block in the reference frame include, but are not limited to, Mode 1 or Mode 2. Mode 1: according to the size of the prediction block, when the shortest side of a block obtained by tree partitioning of the frame to be decoded is equal to or smaller than the prediction block size, the tree partition is considered complete and the current node is obtained; among the blocks obtained by the synchronized partition of the reference frame, the block co-located with the current node is used directly as the prediction block. Mode 2: according to the size PTU size of the initial prediction block PTU, when the shortest side of a block obtained by tree partitioning of the frame to be decoded is equal to or smaller than PTU size, the initial prediction block PTU is obtained; the flag information (split flags and occupied flags) is decoded and the PTU is partitioned further according to it; the prediction units (PUs) obtained once partitioning is complete are used as the current nodes; the motion vector MV of each current node is decoded and the corresponding prediction block is computed.
Preferably, decoding the motion vector MV of each PU and computing the corresponding prediction block in Mode 2 comprises: decoding the bitstream to obtain the motion vector MV; translating the current node by the motion vector MV; and computing the nearest-neighbor point set in the reference point cloud to obtain the prediction block that corresponds to the current node in the reference point cloud.
Preferably, in S230, decoding the point cloud bitstream to obtain the occupancy information of each child node of the node to be decoded comprises Method 1 or Method 2. Method 1: for each child node, one additional inter-frame context bit is added on top of the original intra-frame context to indicate whether the corresponding position in the prediction block is occupied, and the occupancy information of the child node of the current node is decoded. Method 2: if the co-located child node in the prediction block is occupied, the occupancy codes of N child nodes in the corresponding intra-frame context pattern are all set to 1 and used as the context information for decoding the occupancy information of the child node of the current node; if the co-located child node in the prediction block is unoccupied, the intra-frame context information is used to decode the occupancy information of the child node of the current node.
Embodiment 3: a block-based point cloud geometry inter-frame prediction encoding method
FIG. 3 is a flowchart of one embodiment of the block-based point cloud geometry inter-frame prediction encoding method proposed by the present invention.
S310: Set the same bounding box for the frame to be encoded and the reference frame, as the root node of the tree partition.
S311: Point cloud input:
Input the geometry information of the frame to be encoded and an already-encoded frame that serves as the reference frame.
S312: Bounding box calculation:
To ensure a one-to-one correspondence between the nodes of the frame to be encoded and those of the reference frame, the same bounding box must be set for both frames as the root node of the tree partition. Once the bounding box is determined, the frame to be encoded and the reference frame are tree-partitioned synchronously. The bounding box of the reference frame and the frame to be encoded can be calculated in ways that include, but are not limited to, the following two. The first collects statistics over all frames of the entire sequence and computes the largest bounding-box size needed to contain all frames, which serves as the single bounding box for the whole sequence. The second computes the bounding-box size over the reference frame and the current frame only, and takes the largest bounding-box size as the root node of the tree partition when encoding the current frame.
S320: Perform tree partitioning on the frame to be encoded and on the reference frame according to the bounding box, obtaining the current node in the frame to be encoded and the corresponding prediction block in the reference frame.
S321: Point cloud tree partitioning:
The space of the resulting point cloud bounding box is tree-partitioned. Depending on the shape of the bounding box, octree, quadtree, and binary-tree partition modes can be combined to decompose the point cloud spatially into a number of coding blocks.
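The source does not say how the partition mode is chosen from the box shape. The sketch below uses one purely hypothetical rule for illustration: split every axis whose side is longer than half of the longest side, so that elongated boxes become more cube-like before full octree splits begin. The function names and the ceiling-based child size are assumptions.

```python
from typing import List, Tuple

Size = Tuple[int, int, int]

MODE_NAMES = {0: "leaf", 1: "binary tree", 2: "quadtree", 3: "octree"}


def split_axes(size: Size) -> List[int]:
    """Hypothetical rule: split axes longer than half of the longest side."""
    longest = max(size)
    return [a for a in range(3) if 2 * size[a] > longest and size[a] > 1]


def partition_mode(size: Size) -> str:
    """Name of the partition mode implied by the number of split axes."""
    return MODE_NAMES[len(split_axes(size))]


def child_size(size: Size) -> Size:
    """Size of each child block: split axes are halved (rounded up)."""
    axes = split_axes(size)
    return tuple((s + 1) // 2 if a in axes else s for a, s in enumerate(size))
```

Applied to a box of side lengths (230239, 230316, 48208), this rule produces two quadtree levels along x and y before the first octree level.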
S322: Prediction unit partitioning:
Borrowing the notion of a prediction block from video coding, we define the PTU (Prediction Tree Unit) as the initial prediction block. A PTU can be partitioned further by octree into multiple PUs (Prediction Units); whether to partition further is decided according to the coding cost, and the partitioning yields a PU tree structure. The PU is the basic unit of prediction: one motion vector MV is computed for each PU and is used to find the best-matching prediction block in the reference frame.
In S322, the PTU size is defined as PTU_size. The current point cloud is first tree-partitioned; if the shortest side of a node equals PTU_size, that node is treated as a PTU. A PTU can be octree-partitioned into a PU tree and multiple PUs. The PU tree carries two kinds of flag information: the split flags indicate whether the nodes at each level of the PU tree are partitioned further and, where partitioning continues, the occupied flags indicate the occupancy of the resulting child nodes. One motion vector MV is computed for each PU. The resulting PU tree and MVs are encoded so that the decoder can generate the same prediction block and drive the arithmetic entropy coder in the same way as the encoder.
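One possible way to serialize a PU tree into split flags and occupied flags is sketched below. The nested-dict tree representation, the depth-first order, and the 8-bit occupancy mask per split node are assumptions made for illustration; the source names the two flag types but not their layout in the bitstream.

```python
from typing import Dict, List, Optional

# A PU tree node is None (a leaf PU) or a dict mapping the index (0..7)
# of each occupied child to that child's subtree.
PUTree = Optional[Dict[int, "PUTree"]]


def write_pu_tree(node: PUTree, split_flags: List[int],
                  occupied_flags: List[int]) -> None:
    """Depth-first serialization: one split flag per node, one mask per split."""
    if node is None:
        split_flags.append(0)
        return
    split_flags.append(1)
    mask = 0
    for index in node:
        mask |= 1 << index
    occupied_flags.append(mask)
    for index in sorted(node):
        write_pu_tree(node[index], split_flags, occupied_flags)


def read_pu_tree(split_flags: List[int], occupied_flags: List[int]) -> PUTree:
    """Inverse of write_pu_tree, as a decoder would rebuild the PU tree."""
    split_it, occupied_it = iter(split_flags), iter(occupied_flags)

    def read() -> PUTree:
        if next(split_it) == 0:
            return None
        mask = next(occupied_it)
        return {i: read() for i in range(8) if (mask >> i) & 1}

    return read()
```

Because both sides walk the tree in the same order, the decoder recovers exactly the PUs for which motion vectors were transmitted.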
S323: Motion estimation and motion compensation:
For a PU, we define a search window W in the reference frame. Within the search window, the ICP algorithm finds the matching block that most closely resembles the current node to be encoded, that is, the one that minimizes the Lagrangian cost, and yields the corresponding MV. The local point set mapped by the MV with the lowest final cost is the optimal prediction block; if no such prediction block is found, the node at the corresponding position in the reference frame is used directly as the prediction block.
In S323, the window size can be set according to the distribution characteristics of the data set and the bit-rate point: if the point cloud is sparsely distributed, a larger window is set; if the point cloud is dense, a smaller window can be set.
In S323, the ICP algorithm is simplified accordingly: only translation is considered, not rotation, and the translation vector obtained from ICP is the motion vector. The Lagrangian cost is calculated as follows:
Cost(MV) = Dist(Q(W, MV), B) + λ Est(MV)
where B is the current block to be encoded, W is the search window, Q is the set of nearest neighbors found after B is translated by MV, Est() estimates the number of bits needed to encode the MV and is obtained by computing the number of bits required by exponential-Golomb coding, and Dist is the block matching loss function, given by:
Figure PCTCN2021114282-appb-000003
In addition, finding the matching block requires, for each point of the block to be encoded, a search for its nearest neighbor inside the window; the present invention builds a KD tree to speed up this search.
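The cost evaluation can be sketched as follows. Because the Dist formula survives only as an image placeholder in this text, the sketch assumes a sum of squared nearest-neighbor distances, which may differ from the patented definition. The signed exponential-Golomb mapping follows the common order-0 convention used in video coding, and the brute-force search stands in for the KD tree; all names are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[int, int, int]


def exp_golomb_bits(v: int) -> int:
    """Bit length of the order-0 signed exponential-Golomb code of v."""
    u = 2 * v - 1 if v > 0 else -2 * v  # signed-to-unsigned mapping
    return 2 * (u + 1).bit_length() - 1


def est(mv: Point) -> int:
    """Estimated number of bits needed to code the three MV components."""
    return sum(exp_golomb_bits(c) for c in mv)


def dist(block: Sequence[Point], window: Sequence[Point], mv: Point) -> int:
    """ASSUMED loss: sum of squared distances to the nearest window point."""
    total = 0
    for b in block:
        moved = tuple(b[a] + mv[a] for a in range(3))
        total += min(sum((moved[a] - w[a]) ** 2 for a in range(3))
                     for w in window)
    return total


def cost(block: Sequence[Point], window: Sequence[Point], mv: Point,
         lam: float) -> float:
    """Lagrangian cost Cost(MV) = Dist + lambda * Est(MV)."""
    return dist(block, window, mv) + lam * est(mv)
```

Comparing `cost` across candidate MVs, including the zero vector, picks the prediction; a KD tree over the window points would replace the inner `min` without changing the result.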
S330: For each child node of the current node, encode the occupancy information of that child node according to the occupancy information of the co-located child node in the prediction block, obtaining the bitstream.
S331: Context entropy coding based on the inter-frame prediction block:
The occupancy information of the prediction block obtained in S323 is used as context to help encode the occupancy information of the current node. This scheme builds on an existing point cloud intra-frame context entropy coding tool and adds inter-frame context information to it, which further improves entropy coding efficiency.
In S331, the prediction block obtained in S323 and the current node to be encoded are octree-partitioned synchronously, yielding the occupancy codes of the prediction block and of the current block. For each child node there are two cases for its inter-frame prediction context: (Figure PCTCN2021114282-appb-000004) the co-located child node in the prediction block is occupied, and (Figure PCTCN2021114282-appb-000005) the co-located child node in the prediction block is not occupied. One of the following three approaches is chosen to implement the inter-frame context. In the first approach, one additional context bit is added for each child node on top of the original intra-frame context to indicate whether the corresponding position in the prediction block is occupied. In the second approach, if the corresponding predicted child node is occupied, this is treated as a strong prediction whose confidence equals that of the first 7 intra-frame neighbor child nodes all being occupied, and the occupancy codes of the first 7 child nodes in Morton order in the corresponding intra-frame context pattern [1] are all set to 1; if the corresponding predicted child node is not occupied, the original intra-frame context information is kept. In the third approach, if the corresponding predicted child node is occupied, this is treated as a strong prediction whose confidence equals that of the first 3 intra-frame neighbor child nodes all being occupied, and the occupancy codes of the first 3 child nodes in Morton order in the corresponding intra-frame context pattern are all set to 1; if the corresponding predicted child node is not occupied, the original intra-frame context information is kept.
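The three approaches can be sketched with the intra-frame context held as an integer bit pattern. That representation is an assumption: bit i is taken to be the occupancy of the i-th neighbor child node in Morton order, and the function names are hypothetical.

```python
def context_extra_bit(intra_ctx: int, predicted_occupied: bool) -> int:
    """Approach 1: append one inter-frame bit to the intra-frame context."""
    return (intra_ctx << 1) | int(predicted_occupied)


def context_forced_neighbors(intra_ctx: int, predicted_occupied: bool,
                             n: int) -> int:
    """Approaches 2 and 3: on an occupied prediction, mark the first n
    neighbor bits (Morton order, assumed to be the low bits) as occupied;
    otherwise keep the intra-frame context unchanged."""
    if predicted_occupied:
        return intra_ctx | ((1 << n) - 1)
    return intra_ctx
```

With `n = 7` this corresponds to the second approach and with `n = 3` to the third. Unlike the first approach, the forced-neighbor variants reuse existing context indices, so the number of context models does not grow.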
S332: Obtain the bitstream:
The occupancy codes are encoded using the context information obtained in S331, producing the bitstream of the point cloud.
Embodiment 4: a block-based point cloud geometry inter-frame prediction decoding method
S410: Obtain the bounding box shared by the frame to be decoded and the reference frame, as the root node of the tree partition.
S411: Input the bitstream and the reference frame:
Input the point cloud bitstream to be decoded and an already-decoded frame that serves as the reference frame.
S412: Obtain the bounding box:
The bounding box shared by the frame to be decoded and the reference frame is obtained by decoding the bitstream or from external input, and the frame to be decoded and the reference frame are tree-partitioned synchronously. The bounding box of the reference frame and the frame to be decoded can be obtained in ways that include, but are not limited to, the following two. Method 1: obtain the minimum bounding box of all frames of the entire point cloud sequence, and use it as the bounding box of the frame to be decoded and of the reference frame. Method 2: obtain the minimum bounding box of the frame to be decoded and the reference frame, and use it as the bounding box of the frame to be decoded and of the reference frame.
S420: Perform tree partitioning on the frame to be decoded and on the reference frame according to the bounding box, and obtain the current node in the frame to be decoded and the corresponding prediction block in the reference frame according to the tree partition and/or the point cloud bitstream.
S421: Point cloud tree partitioning:
The space of the resulting point cloud bounding box is tree-partitioned. Depending on the shape of the bounding box, octree, quadtree, and binary-tree partition modes can be combined to decompose the point cloud spatially into a number of decoding blocks.
S422: Prediction unit partitioning:
The prediction units (the current block and the prediction block) can be partitioned in ways that include, but are not limited to, the following two:
Mode 1: obtain the prediction block size by decoding the bitstream or from input; when the shortest side of a block obtained by tree partitioning of the frame to be decoded is equal to or smaller than the prediction block size, the tree partition is considered complete and the current node is obtained; among the blocks obtained by the synchronized partition of the reference frame, the block co-located with the current node is used directly as the prediction block;
Mode 2: obtain the size PTU size of the initial prediction block PTU by decoding the bitstream or from input; when the shortest side of a block obtained by tree partitioning of the frame to be decoded is equal to or smaller than PTU size, the initial prediction block PTU is obtained; the flag information (split flags and occupied flags) is decoded and the PTU is partitioned further according to it; the prediction units (PUs) obtained once partitioning is complete are used as the current nodes.
S423: Obtain the prediction block by motion compensation:
Decode the bitstream to obtain the motion vector MV; translate the current node by the motion vector MV and compute the nearest-neighbor point set in the reference point cloud, obtaining the prediction block that corresponds to the current node in the reference point cloud.
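Decoder-side motion compensation might look like the sketch below. It rests on an interpretation rather than on the source text: since the decoder does not yet know the points of the current node, the node is treated as a box, the box is translated by the decoded MV, and the reference points falling inside it are shifted back into the current node's coordinates. The function name and the half-open box convention are assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int, int]


def predict_block(origin: Point, size: Point, mv: Point,
                  reference: Sequence[Point]) -> List[Point]:
    """Collect reference points inside the MV-translated node box and map
    them back into the coordinates of the current node."""
    lo = tuple(origin[a] + mv[a] for a in range(3))
    hi = tuple(lo[a] + size[a] for a in range(3))  # exclusive upper corner
    return [
        tuple(p[a] - mv[a] for a in range(3))
        for p in reference
        if all(lo[a] <= p[a] < hi[a] for a in range(3))
    ]
```

The returned points occupy the same region as the current node, so the two can be partitioned in step when the occupancy contexts are derived.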
S430: Decode the point cloud bitstream to obtain the occupancy information of each child node of the node to be decoded, obtaining the point cloud.
S431: Context entropy decoding based on the inter-frame prediction block:
The prediction block obtained in S423 and the current node to be decoded are octree-partitioned synchronously, and the occupancy information of each child node of the current node is decoded according to the occupancy information of the co-located child node in the prediction block, obtaining the point cloud. Two methods are available:
Method 1: for each child node, one additional inter-frame context bit is added on top of the original intra-frame context to indicate whether the corresponding position in the prediction block is occupied, and the occupancy information of the child node of the current node is decoded;
Method 2: if the co-located child node in the prediction block is occupied, the occupancy codes of N child nodes in the corresponding intra-frame context pattern are all set to 1 and used as the context information for decoding the occupancy information of the child node of the current node; if the co-located child node in the prediction block is unoccupied, the intra-frame context information is used to decode the occupancy information of the child node of the current node.
S432: Obtain the reconstructed point cloud:
The geometric occupancy information of the point cloud is decoded using the context information from S431, yielding the reconstructed point cloud.
实施例五:一种基于块的点云几何帧间预测编码方法Embodiment 5: A block-based point cloud geometry inter-frame predictive coding method
以下针对AVS点云压缩工作组中的官方点云数据集Ford_01_AVS_1mm序列,采用本发明方法进行点云几何信息无损压缩,具体实施步骤为:For the Ford_01_AVS_1mm sequence of the official point cloud data set in the AVS point cloud compression working group, the method of the present invention is used to perform lossless compression of point cloud geometric information, and the specific implementation steps are:
S110: Set the same bounding box for the frame to be encoded and the reference frame, as the root node of the tree division.
1. Compute the bounding box that serves as the root node of the spatial octree division:
The first bounding-box calculation method is used here as an example. The entire sequence is traversed first to find the smallest bounding box that contains all frames. A bounding box is determined by two pieces of information: the three-dimensional coordinates of its origin and its side lengths along the three axes. For this sequence the bounding box has origin (x, y, z) = (-115100, -115025, -44140) and side lengths (230239, 230316, 48208). For every frame in the sequence, the tree division uses this bounding box as the root node.
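The sequence-level bounding box of step 1 can be sketched as follows. This is a minimal illustration rather than the reference implementation: the function name `sequence_bounding_box` is hypothetical, points are assumed to be integer (x, y, z) tuples, and the side length is taken as max minus min on each axis, a convention the text does not state.

```python
from typing import Iterable, Sequence, Tuple

Point = Tuple[int, int, int]


def sequence_bounding_box(frames: Iterable[Sequence[Point]]) -> Tuple[Point, Point]:
    """Return (origin, side_lengths) of the smallest axis-aligned box
    that contains every point of every frame in the sequence."""
    lo = None
    hi = None
    for frame in frames:
        for p in frame:
            if lo is None:
                lo, hi = list(p), list(p)
                continue
            for axis in range(3):
                lo[axis] = min(lo[axis], p[axis])
                hi[axis] = max(hi[axis], p[axis])
    if lo is None:
        raise ValueError("sequence contains no points")
    # Assumption: side length = max - min on each axis.
    sides = tuple(hi[a] - lo[a] for a in range(3))
    return tuple(lo), sides


frames = [[(0, 0, 0), (2, 3, 1)], [(-1, 5, 0)]]
origin, sides = sequence_bounding_box(frames)
```

Because the box is computed once for the whole sequence, the frame to be encoded and its reference frame share the same root node without any per-frame signalling of the box.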
S120: Perform tree division on the frame to be encoded and on the reference frame according to the bounding box, obtaining the current node in the frame to be encoded and the corresponding prediction block in the reference frame.
2. Take the case where the first frame is the reference frame and the second frame is being encoded. The first frame has no previously encoded frame, so it is coded entirely in intra mode. When the second frame is encoded, the reconstructed first frame is input as the reference frame, and the point cloud spaces of the two frames are divided by the tree in synchronization, with the frame to be encoded driving the division, so that two identical tree structures are obtained. Each tree node corresponds to a spatial block. Depending on whether the octree, quadtree or binary tree division mode is used, a spatial block is further divided into eight, four or two sub-blocks, and an occupancy code of the corresponding length indicates which sub-blocks contain points. At each level, only nodes that contain points are divided further.
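The synchronized division in step 2 can be illustrated for the octree case. The sketch below is an assumption-laden toy, not the codec's implementation: the helper `split_octree_node` is hypothetical, the node is assumed cubic with an even side length, and the child index ordering (x bit, then y bit, then z bit) is chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int, int]


def split_octree_node(points: Sequence[Point], origin: Point, size: int):
    """Split a cubic node into 8 children.

    Returns (occupancy_code, children), where bit i of the 8-bit code is 1
    if child i contains at least one point.
    Assumption: child index = (x_high << 2) | (y_high << 1) | z_high.
    """
    half = size // 2
    children: List[List[Point]] = [[] for _ in range(8)]
    for x, y, z in points:
        idx = (
            (int(x - origin[0] >= half) << 2)
            | (int(y - origin[1] >= half) << 1)
            | int(z - origin[2] >= half)
        )
        children[idx].append((x, y, z))
    code = 0
    for idx, child in enumerate(children):
        if child:
            code |= 1 << idx
    return code, children


# The same node geometry is applied to both frames, so child i of the
# current node and child i of the reference node cover the same space.
node_origin, node_size = (0, 0, 0), 4
current_code, _ = split_octree_node([(0, 0, 0), (3, 3, 3)], node_origin, node_size)
reference_code, _ = split_octree_node([(1, 0, 0), (0, 3, 0)], node_origin, node_size)
```

Here `reference_code` plays the role of the predicted occupancy code: its bits line up position by position with the bits of `current_code` that the entropy coder has to code.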
3. PTU_size is set to 4096. For a child node produced by the tree division, if the shortest side of the current node to be encoded is greater than 4096, the node at the corresponding position in the reference frame is used directly as the prediction block. If the shortest side of the current node equals 4096, the node is treated as a prediction tree unit (PTU), the starting unit for prediction, and motion estimation is used to find the prediction block with the smallest error in the reference frame.
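The rule in step 3 amounts to a threshold test on the shortest side of the node. A minimal sketch follows; the function name and the returned mode labels are hypothetical, and nodes at or below PTU_size are grouped together in the way the "equal to or less than" wording of the claims suggests.

```python
PTU_SIZE = 4096  # value used in this embodiment


def prediction_mode(node_side_lengths, ptu_size=PTU_SIZE):
    """Choose how the prediction block of a node is obtained.

    "colocated": node is still larger than a PTU, so the reference node
    at the same position is used directly as the prediction block.
    "motion_search": node has reached PTU size, so motion estimation is
    run in the reference frame.
    """
    if min(node_side_lengths) > ptu_size:
        return "colocated"
    return "motion_search"


mode_large = prediction_mode((8192, 8192, 8192))
mode_ptu = prediction_mode((8192, 4096, 8192))
```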
4. The search window size is set to PTU_size + 2*512, that is, the extent of the current PTU is extended by 512 in both directions along each of the three coordinate axes. A PTU can be divided further by the octree into a PU tree and the corresponding prediction units (PUs); the PTU itself can be regarded as a large PU. The PU tree carries two kinds of flags: the split flags indicate whether a node at each level of the PU tree is divided further, and, if it is, the occupied flags indicate the occupancy of the resulting child nodes. For each PU, the coding cost without further division is computed first, using the following formula:
Cost(MV) = Dist(Q(W, MV), B) + λ·Est(MV)
where B is the current block to be encoded; W is the search window; MV is the motion vector of the best matching block found for the current node within the search window by the iterative closest point (ICP) algorithm; Q is the set of nearest-neighbor points found after B is translated by MV; Est() estimates the number of bits needed to code MV, computed as the length of its Exponential-Golomb code; and Dist is the block-matching loss function, given by:
Figure PCTCN2021114282-appb-000006
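The Lagrangian cost above can be sketched numerically. Two parts of this sketch are assumptions and should not be read as the patent's definitions: the Dist formula is available only as a figure, so a sum of squared nearest-neighbor distances is used as a stand-in, and the signed-to-unsigned mapping before the order-0 Exponential-Golomb length is the one commonly used in video coding. All function names are hypothetical.

```python
def exp_golomb_bits(value: int) -> int:
    """Bit length of the order-0 Exp-Golomb code of a signed integer.

    Assumption: positive v maps to 2v - 1, non-positive v maps to -2v.
    """
    mapped = 2 * value - 1 if value > 0 else -2 * value
    return 2 * (mapped + 1).bit_length() - 1


def est_mv_bits(mv) -> int:
    """Est(MV): estimated bits for the three motion-vector components."""
    return sum(exp_golomb_bits(c) for c in mv)


def nearest_neighbour_dist(block, window, mv) -> int:
    """Stand-in for Dist: sum of squared distances from each translated
    point of the block to its nearest point in the (non-empty) window."""
    total = 0
    for bx, by, bz in block:
        sx, sy, sz = bx + mv[0], by + mv[1], bz + mv[2]
        total += min(
            (sx - wx) ** 2 + (sy - wy) ** 2 + (sz - wz) ** 2
            for wx, wy, wz in window
        )
    return total


def cost(block, window, mv, lam: float) -> float:
    """Cost(MV) = Dist + lambda * Est(MV)."""
    return nearest_neighbour_dist(block, window, mv) + lam * est_mv_bits(mv)


block = [(0, 0, 0), (1, 0, 0)]
window = [(2, 0, 0), (3, 0, 0)]
cost_shifted = cost(block, window, (2, 0, 0), 1.0)
cost_zero = cost(block, window, (0, 0, 0), 1.0)
```

In this toy case the shifted vector matches the window exactly, so its lower distortion outweighs the extra bits spent on the motion vector.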
Next, the coding cost of dividing the current PU into 8 sub-PUs is computed. The total coding cost is the sum of the costs of the individual PUs:
Figure PCTCN2021114282-appb-000007
Whether the current PU is divided further is decided by comparing Cost(MV) with Cost(PU tree). The maximum depth of the PU tree is set to 2 levels. After motion estimation, each PU has an MV and a corresponding prediction block. Motion compensation is then performed: the prediction block replaces the node at the corresponding position in the reference frame, and the motion vector and the PU tree structure information are encoded so that the decoder can generate the same prediction block.
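The split decision described above reduces to a comparison between the cost of the undivided PU and the accumulated cost of its sub-PUs, bounded by the maximum PU tree depth. The sketch below assumes that depth 0 denotes the PTU itself and that the accumulated cost is a plain sum; the exact Cost(PU tree) formula is given only as a figure, so any additional terms it may contain are not modelled. The function name is hypothetical.

```python
MAX_PU_DEPTH = 2  # maximum PU tree depth used in this embodiment


def should_split(cost_unsplit, child_costs, depth, max_depth=MAX_PU_DEPTH):
    """Return True if dividing the PU further is cheaper than keeping it whole.

    cost_unsplit: Cost(MV) of the undivided PU.
    child_costs:  costs of the sub-PUs obtained by dividing it.
    depth:        depth of the PU in the PU tree (assumption: PTU is depth 0).
    """
    if depth >= max_depth:
        return False  # depth limit reached, no further division
    return sum(child_costs) < cost_unsplit


split_cheaper = should_split(100.0, [10.0] * 8, depth=0)
split_dearer = should_split(100.0, [20.0] * 8, depth=0)
split_too_deep = should_split(100.0, [1.0] * 8, depth=2)
```

The outcome of each decision corresponds to one split flag in the PU tree that is written to the bitstream.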
S130: For each child node of the current node, encode its occupancy information according to the occupancy information of the child node at the corresponding position in the prediction block, to obtain the bitstream.
5. For a PU to be encoded, once the prediction block has been obtained by step 3 or step 4, the prediction block and the PU are both divided further by the tree. This yields the predicted occupancy code, which serves as context for coding the occupancy information of the current PU. For each child node produced by dividing the PU there are two inter-frame prediction cases: 1) the child node at the corresponding position in the prediction block is occupied, or 2) it is unoccupied. One of the following three methods is selected to realize the inter-frame context. In the first method, one context bit is added for each child node on top of the original intra-frame context to indicate whether the corresponding position in the prediction block is occupied. In the second method, if the corresponding predicted child node is occupied, the prediction is treated as strong, with the same confidence as if the first 7 intra-frame neighboring child nodes were all occupied, and the occupancy codes of the first 7 child nodes in Morton order in the corresponding intra-frame context pattern are all set to 1; if the corresponding predicted child node is unoccupied, the original intra-frame context information is kept. The third method is the same as the second except that the first 3 neighboring child nodes are used instead of 7: if the corresponding predicted child node is occupied, the occupancy codes of the first 3 child nodes in Morton order in the corresponding intra-frame context pattern are all set to 1; otherwise the original intra-frame context information is kept.
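The three context-construction methods of step 5 can be contrasted with a small sketch. The representation is an assumption: the intra-frame context is modelled as a list of 0/1 occupancy bits of neighboring child nodes in Morton order, which is a simplification of whatever context pattern the codec actually uses. All function names are hypothetical.

```python
def context_with_inter_bit(intra_bits, predicted_occupied):
    """First method: append one inter-frame bit to the intra-frame context."""
    return list(intra_bits) + [1 if predicted_occupied else 0]


def context_with_forced_bits(intra_bits, predicted_occupied, n):
    """Second method (n = 7) and third method (n = 3): if the predicted
    child is occupied, force the first n intra-frame bits to 1; otherwise
    keep the intra-frame context unchanged."""
    if not predicted_occupied:
        return list(intra_bits)
    forced = list(intra_bits)
    for i in range(min(n, len(forced))):
        forced[i] = 1
    return forced


def context_index(bits):
    """Pack context bits into an integer that selects a probability model."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return index


intra = [0, 1, 0, 0, 0, 0, 0]
ctx_first = context_with_inter_bit(intra, True)
ctx_second = context_with_forced_bits(intra, True, 7)
ctx_third = context_with_forced_bits(intra, True, 3)
ctx_unoccupied = context_with_forced_bits(intra, False, 7)
```

The first method enlarges the context, doubling the number of probability models, whereas the second and third methods reuse the existing intra-frame models and only change which one is selected.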
6. Steps 2 to 5 are repeated to obtain the bitstream of the spatial occupancy codes of the current frame, together with the motion vector and PU tree bitstreams, which completes the encoding of the point cloud geometry.
To verify the effect of the geometric inter-frame prediction compression method of the present invention, lossy geometry coding with a quantization step of 512 was performed on the first 600 frames of the above Ford_01_AVS_1mm sequence. The first frame uses intra-frame prediction only; all subsequent frames use combined inter-frame and intra-frame prediction, with the second method used for the inter-frame context. The compression performance compared with the existing method is shown in Figure 5.
As Figure 5 shows, under the same test conditions, adding the block-division-based inter-frame prediction tool of the present invention yields a stable performance gain on the tested multi-frame point cloud sequence. By exploiting the temporal correlation of the point cloud to reduce redundancy in the coded information, the compression performance of the point cloud geometry is improved at every rate point.
Figure 6 shows the performance of the method of the present invention, implemented on the AVS point cloud compression platform PCRMv3.0, under lossy geometry compression; stable performance gains are obtained on different data sets.
Figure 7 shows the performance of the method of the present invention, implemented on the AVS point cloud compression platform PCRMv3.0, under lossless geometry compression.
The block-based point cloud geometry inter-frame prediction method of the present invention can be used for geometry compression of point clouds. The method first computes the bounding box of the point cloud and then performs octree, quadtree or binary tree division. When the size of a child node reaches the configured prediction unit size, a corresponding prediction block with small error is searched for in the reference frame for the current node. Finally, the geometric occupancy information of the prediction block is used to improve the entropy coding efficiency of the occupancy information of the current node.
It should be noted that the embodiments are published to aid further understanding of the present invention. Those skilled in the art will understand that various substitutions and modifications are possible without departing from the spirit and scope of the present invention and the appended claims. The present invention should therefore not be limited to what is disclosed in the embodiments; the scope of protection is defined by the claims.

Claims (15)

  1. A block-based point cloud geometry inter-frame prediction encoding method, characterized by comprising the following steps:
    S110: setting the same bounding box for the frame to be encoded and the reference frame, as the root node of the tree division;
    S120: performing tree division on the frame to be encoded and on the reference frame according to the bounding box, to obtain the current node in the frame to be encoded and the corresponding prediction block in the reference frame;
    S130: for each child node of the current node, encoding the occupancy information of the child node according to the occupancy information of the child node at the corresponding position in the prediction block, to obtain a bitstream.
  2. The point cloud geometry inter-frame prediction encoding method according to claim 1, characterized in that setting the same bounding box for the frame to be encoded and the reference frame in S110 specifically comprises method 1 or method 2:
    Method 1: computing the smallest bounding box that contains all frames of the entire point cloud sequence, as the bounding box of the frame to be encoded and the reference frame;
    Method 2: computing the smallest bounding box of the frame to be encoded and the reference frame, as the bounding box of the frame to be encoded and the reference frame.
  3. The point cloud geometry inter-frame prediction encoding method according to claim 1, characterized in that the tree division in S120 comprises octree, quadtree and binary tree division modes.
  4. The point cloud geometry inter-frame prediction encoding method according to claim 1, characterized in that, in S120, the ways of obtaining the current node in the frame to be encoded and the corresponding prediction block in the reference frame include but are not limited to the following two modes:
    Mode 1: setting the size of the prediction block; if the shortest side of a block obtained by the tree division of the frame to be encoded is equal to or less than the size of the prediction block, the tree division is considered complete and the current node is obtained; among the blocks obtained by the synchronized division of the reference frame, the block at the position corresponding to the current node is used directly as the prediction block;
    or, Mode 2: setting the size PTU size of the starting prediction block PTU; if the shortest side of a block obtained by the tree division of the frame to be encoded is equal to or less than the PTU size, the starting prediction block PTU is obtained; deciding according to the coding cost whether to divide the PTU further, and recording flag information, wherein the split flags indicate whether each node of the PTU is divided further and the occupied flags indicate the occupancy information of the resulting child nodes; the basic prediction units PU obtained once all division is complete are used as the current nodes; computing the best matching prediction block and the corresponding motion vector MV for each current node, and encoding the flag information and the motion vector MV.
  5. The point cloud geometry inter-frame prediction encoding method according to claim 4, characterized in that computing the best matching prediction block and the corresponding motion vector MV for each PU in Mode 2 specifically comprises:
    determining a search window W;
    obtaining a local point cloud in the reference frame according to the search window W;
    in the local point cloud, for the basic prediction unit PU, that is, the current node, obtaining the matching block with the smallest error and the corresponding motion vector MV by a simplified ICP algorithm, using the matching block with the smallest error and the corresponding motion vector MV as the prediction block and motion vector MV of the current node, and encoding the motion vector MV.
  6. The point cloud geometry inter-frame prediction encoding method according to claim 5, characterized in that determining the search window W comprises:
    setting the size of the search window W according to the distribution characteristics of the data set and the rate point: if the point cloud is sparsely distributed, a larger window is set; if the point cloud is dense, a smaller window is set.
  7. The point cloud geometry inter-frame prediction encoding method according to claim 5, characterized in that obtaining the matching block with the smallest error and the corresponding motion vector MV by the simplified ICP algorithm comprises:
    the simplified ICP algorithm considers only translation, and the error is computed as a Lagrangian cost according to the following formula:
    Cost(MV) = Dist(Q(W, MV), B) + λ·Est(MV)
    where B is the current node of the frame to be encoded, W is the search window, Q is the set of nearest-neighbor points obtained after the current node B is translated by the motion vector MV, Est(MV) is the estimated codeword size required to code the motion vector MV, and Dist is the block-matching loss function, which estimates the codeword size required to code the matching deviation.
  8. The point cloud geometry inter-frame prediction encoding method according to claim 7, characterized in that, Dist being the block-matching loss function:
    the block-matching loss function Dist is computed by the following formula:
    Figure PCTCN2021114282-appb-100001
    where w is a point in the local point cloud that corresponds to the search window W in the reference point cloud, and q is a point in the nearest-neighbor point set Q found after the current node B is translated by the motion vector MV.
  9. The point cloud geometry inter-frame prediction encoding method according to claim 1, characterized in that encoding the occupancy information of each child node of the current node according to the occupancy information of the child node at the corresponding position in the prediction block in S130 comprises method 1 or method 2:
    Method 1: for each child node, adding one inter-frame context bit on top of the original intra-frame context to indicate whether the corresponding position in the prediction block is occupied, and encoding the occupancy information of the child nodes of the current node;
    Method 2: if the child node at the corresponding position in the prediction block is occupied, setting the occupancy codes of N child nodes in the corresponding intra-frame context pattern to 1 as context information and encoding the occupancy information of the child nodes of the current node; if the child node at the corresponding position in the prediction block is unoccupied, using the intra-frame context information to encode the occupancy information of the child nodes of the current node.
  10. A block-based point cloud geometry inter-frame prediction decoding method, characterized by comprising the following steps:
    S210: obtaining the same bounding box for the frame to be decoded and the reference frame, as the root node of the tree division;
    S220: performing tree division on the frame to be decoded and on the reference frame according to the bounding box, and obtaining the current node in the frame to be decoded and the corresponding prediction block in the reference frame according to the tree division and/or the point cloud bitstream;
    S230: decoding the occupancy information of each child node of the node to be decoded from the point cloud bitstream, to obtain the point cloud.
  11. The point cloud geometry inter-frame prediction decoding method according to claim 10, characterized in that obtaining the same bounding box for the frame to be decoded and the reference frame in S210 specifically comprises method 1 or method 2:
    Method 1: obtaining the smallest bounding box of all frames of the entire point cloud sequence, as the bounding box of the frame to be decoded and the reference frame;
    Method 2: obtaining the smallest bounding box of the frame to be decoded and the reference frame, as the bounding box of the frame to be decoded and the reference frame.
  12. The point cloud geometry inter-frame prediction decoding method according to claim 10, characterized in that the tree division in S220 comprises octree, quadtree and binary tree division modes.
  13. The point cloud geometry inter-frame prediction decoding method according to claim 10, characterized in that the ways of obtaining the current node in the frame to be decoded and the corresponding prediction block in the reference frame in S220 include but are not limited to the following two modes:
    Mode 1: according to the size of the prediction block, if the shortest side of a block obtained by the tree division of the frame to be decoded is equal to or less than the size of the prediction block, the tree division is considered complete and the current node is obtained; among the blocks obtained by the synchronized division of the reference frame, the block at the position corresponding to the current node is used directly as the prediction block;
    or, Mode 2: according to the size PTU size of the starting prediction block PTU, if the shortest side of a block obtained by the tree division of the frame to be decoded is equal to or less than the PTU size, the starting prediction block PTU is obtained; decoding the flag information split flags and occupied flags, and dividing the PTU further according to the flag information; the basic prediction units PU obtained once all division is complete are used as the current nodes; decoding the motion vector MV of each current node and computing the corresponding prediction block.
  14. The point cloud geometry inter-frame prediction decoding method according to claim 13, characterized in that decoding the motion vector MV of each PU and computing the corresponding prediction block in Mode 2 specifically comprises:
    decoding the bitstream to obtain the motion vector MV;
    translating the current node by the motion vector MV and computing the set of nearest-neighbor points in the reference point cloud, to obtain the prediction block that corresponds to the current node in the reference point cloud.
  15. The point cloud geometry inter-frame prediction decoding method according to claim 10, characterized in that decoding the occupancy information of each child node of the node to be decoded from the point cloud bitstream in S230 comprises method 1 or method 2:
    Method 1: for each child node, adding one inter-frame context bit on top of the original intra-frame context to indicate whether the corresponding position in the prediction block is occupied, and decoding the occupancy information of the child nodes of the current node;
    Method 2: if the child node at the corresponding position in the prediction block is occupied, setting the occupancy codes of N child nodes in the corresponding intra-frame context pattern to 1 as context information and decoding the occupancy information of the child nodes of the current node; if the child node at the corresponding position in the prediction block is unoccupied, using the intra-frame context information to decode the occupancy information of the child nodes of the current node.
PCT/CN2021/114282 2020-08-24 2021-08-24 Block-based point cloud geometric inter-frame prediction method and decoding method WO2022042538A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010857573.1A CN114095735A (en) 2020-08-24 2020-08-24 Point cloud geometric inter-frame prediction method based on block motion estimation and motion compensation
CN202010857573.1 2020-08-24

Publications (1)

Publication Number Publication Date
WO2022042538A1 true WO2022042538A1 (en) 2022-03-03

Family

ID=80295502

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/114282 WO2022042538A1 (en) 2020-08-24 2021-08-24 Block-based point cloud geometric inter-frame prediction method and decoding method

Country Status (2)

Country Link
CN (1) CN114095735A (en)
WO (1) WO2022042538A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115474047A (en) * 2022-09-13 2022-12-13 福州大学 LiDAR point cloud encoding method and decoding method based on enhanced map correlation
WO2023202538A1 (en) * 2022-04-17 2023-10-26 Beijing Bytedance Network Technology Co., Ltd. Method, apparatus, and medium for point cloud coding
WO2024145912A1 (en) * 2023-01-06 2024-07-11 Oppo广东移动通信有限公司 Point cloud coding method and apparatus, point cloud decoding method and apparatus, device, and storage medium
WO2024145933A1 (en) * 2023-01-06 2024-07-11 Oppo广东移动通信有限公司 Point cloud coding method and apparatus, point cloud decoding method and apparatus, and devices and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024145935A1 (en) * 2023-01-06 2024-07-11 Oppo广东移动通信有限公司 Point cloud encoding method and apparatus, point cloud decoding method and apparatus, device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109196559A (en) * 2016-05-28 2019-01-11 微软技术许可有限责任公司 The motion compensation of dynamic voxelization point cloud is compressed
WO2019076503A1 (en) * 2017-10-17 2019-04-25 Nokia Technologies Oy An apparatus, a method and a computer program for coding volumetric video
US20200020132A1 (en) * 2018-07-11 2020-01-16 Samsung Electronics Co., Ltd. Visual quality of video based point cloud compression using one or more additional patches
CN111386551A (en) * 2017-10-19 2020-07-07 交互数字Vc控股公司 Method and device for predictive coding and decoding of point clouds
WO2020141260A1 (en) * 2019-01-02 2020-07-09 Nokia Technologies Oy An apparatus, a method and a computer program for video coding and decoding
CN111465964A (en) * 2017-10-19 2020-07-28 交互数字Vc控股公司 Method and apparatus for encoding/decoding geometry of point cloud representing 3D object

Also Published As

Publication number Publication date
CN114095735A (en) 2022-02-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21860373

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21860373

Country of ref document: EP

Kind code of ref document: A1