WO2023051783A1 - Encoding method, decoding method, apparatus, device and readable storage medium - Google Patents

Encoding method, decoding method, apparatus, device and readable storage medium

Info

Publication number
WO2023051783A1
WO2023051783A1 PCT/CN2022/123245 CN2022123245W WO2023051783A1 WO 2023051783 A1 WO2023051783 A1 WO 2023051783A1 CN 2022123245 W CN2022123245 W CN 2022123245W WO 2023051783 A1 WO2023051783 A1 WO 2023051783A1
Authority
WO
WIPO (PCT)
Prior art keywords
point
point cloud
target
matrix
sub
Prior art date
Application number
PCT/CN2022/123245
Other languages
English (en)
French (fr)
Inventor
冯亚楠
李琳
周冰
徐嵩
邢刚
马思伟
王苫社
徐逸群
胡玮
Original Assignee
咪咕文化科技有限公司
***通信集团有限公司
北京大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 咪咕文化科技有限公司, ***通信集团有限公司, 北京大学 filed Critical 咪咕文化科技有限公司
Publication of WO2023051783A1 publication Critical patent/WO2023051783A1/zh

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/513Processing of motion vectors
    • H04N19/517Processing of motion vectors by encoding
    • H04N19/52Processing of motion vectors by encoding by predictive encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • The present application relates to the technical field of image processing, and in particular, but not limited to, an encoding method, a decoding method, an apparatus, a device and a readable storage medium.
  • Point cloud data consists of a large number of three-dimensional disordered points, each point includes position information (X, Y, Z) and several attribute information (color, normal vector, etc.).
  • In order to facilitate the storage and transmission of point cloud data, point cloud compression technology has gradually become a focus of attention.
  • The prior art provides a scheme that selectively encodes one or more 3D point cloud blocks using inter-coding (e.g., motion compensation) techniques of previously encoded/decoded frames.
  • Embodiments of the present application provide an encoding method, a decoding method, an apparatus, a device, and a readable storage medium, so as to improve processing performance.
  • An embodiment of the present application provides an encoding method applied to an encoding device, including:
  • the corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
  • the embodiment of the present application also provides a decoding method, which is applied to a decoding device, and the method includes:
  • wherein the encoded code stream is obtained by the encoding device using the generalized Laplacian matrix to perform inter-frame prediction and graph Fourier residual transform on the sub-point clouds.
  • the embodiment of the present application also provides an encoding device, including:
  • the first acquisition module is configured to cluster the point cloud data to be processed in the current frame to obtain multiple sub-point clouds;
  • the first generation module is configured to, for any target sub-point cloud in the plurality of sub-point clouds, generate a generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between the target point in the target sub-point cloud and the corresponding point of the target point;
  • the first transformation module is configured to use the generalized Laplacian matrix to perform inter-frame prediction and graph Fourier residual transform on the target sub-point cloud;
  • the first encoding module is configured to respectively quantize and encode the transformed sub-point clouds to obtain encoded code streams
  • the corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
  • the embodiment of the present application also provides a decoding device, including:
  • the second obtaining module is configured to obtain the encoded code stream
  • the second transformation module is configured to perform an inverse graph Fourier transform based on Euclidean distance weights on the encoded code stream to obtain a transformation result;
  • the first decoding module is configured to obtain a decoded code stream based on the transformation result
  • wherein the encoded code stream is obtained by the encoding device using the generalized Laplacian matrix to perform inter-frame prediction and graph Fourier residual transform on the sub-point clouds.
  • The embodiment of the present application also provides an electronic device, including: a memory, a processor, and a program stored in the memory and executable on the processor, where the program, when executed by the processor, implements the steps of the encoding method or the decoding method described above.
  • the embodiment of the present application further provides a readable storage medium, where a program is stored on the readable storage medium, and when the program is executed by a processor, the steps in the encoding method or decoding method as described above are implemented.
  • In the embodiments of the present application, the point cloud data to be processed in the current frame is clustered to obtain multiple sub-point clouds. For any target sub-point cloud, a generalized Laplacian matrix is generated according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between the target point in the target sub-point cloud and the corresponding point of the target point. Using the generalized Laplacian matrix, inter-frame prediction and graph Fourier residual transform are performed on the multiple sub-point clouds, so that an encoded code stream is obtained based on the transformation result.
  • In this way, global correlation features can be used to more fully express the correlation between points, so that redundancy between point cloud data can be removed as much as possible and coding performance can be improved.
  • Since the performance of the encoding end is improved and the data to be decoded is correspondingly optimized, the decoding efficiency and performance at the decoding end can be improved accordingly.
  • Fig. 1 is a flow chart of the encoding method provided by the embodiment of the present application.
  • Fig. 2 and Fig. 3 are schematic diagrams comparing the effect of the method of the embodiment of the present application with that of prior art methods;
  • Fig. 4 is a flow chart of the decoding method provided by the embodiment of the present application.
  • FIG. 5 is a structural diagram of an encoding device provided in an embodiment of the present application.
  • FIG. 6 is a structural diagram of a decoding device provided by an embodiment of the present application.
  • FIG. 1 is a flowchart of an encoding method provided by an embodiment of the present application, which is applied to an encoding device. As shown in Figure 1, the following steps are included:
  • Step 101 Cluster the point cloud data to be processed in the current frame to obtain multiple sub-point clouds.
  • In some embodiments, a 3D grid with a preset size is constructed, and the point cloud data to be processed is placed in the constructed 3D grid to obtain the coordinates of each point; each 3D grid cell containing points is used as a point cloud voxel, so as to obtain multiple point cloud voxels.
  • the coordinates and attribute information of each point cloud voxel can also be obtained.
  • the attribute information includes intensity, color and so on.
  • The coordinates of a point cloud voxel can be the coordinates of the center of the points in the point cloud voxel; the color information of a point cloud voxel can be the average of the color information of the points in the point cloud voxel.
  • the point cloud data to be processed can also be voxelized by means of an octree to obtain multiple point cloud voxels.
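  • The grid-based voxelization described above can be sketched as follows (an illustrative sketch, not the patent's implementation; the voxel size and the use of per-voxel position averaging are assumptions consistent with the text):

```python
import numpy as np

def voxelize(points, voxel_size):
    """Group raw points into cells of a fixed-size 3D grid and return
    one representative point per occupied voxel: the mean position of
    the points falling inside it (attributes could be averaged the
    same way, as the text suggests)."""
    # Integer grid coordinates of each point.
    keys = np.floor(points / voxel_size).astype(np.int64)
    # Group points that share the same grid cell.
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    n_voxels = inverse.max() + 1
    centers = np.zeros((n_voxels, 3))
    counts = np.zeros(n_voxels)
    np.add.at(centers, inverse, points)   # sum positions per voxel
    np.add.at(counts, inverse, 1)         # count points per voxel
    return centers / counts[:, None]

pts = np.array([[0.1, 0.1, 0.1], [0.2, 0.1, 0.1], [1.5, 1.5, 1.5]])
vox = voxelize(pts, voxel_size=1.0)  # two occupied cells remain
```

An octree-based voxelization, also mentioned in the text, would replace the flat grid keys with a hierarchical subdivision but produce the same kind of per-voxel representatives.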
  • In some embodiments, the point cloud data is clustered in a way that divides the space uniformly. For example, a K-means clustering method can be used. That is, the point cloud data to be processed is divided into multiple sub-point clouds based on position information, so that the space is divided evenly.
  • Each sub-point cloud can be encoded independently.
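  • The K-means clustering step can be sketched as follows (a minimal NumPy sketch; the farthest-point initialization and iteration count are assumptions, not details from the patent):

```python
import numpy as np

def kmeans_subclouds(points, k, iters=20):
    """Partition a point cloud into k sub-point clouds by K-means on
    position, one option the text mentions for uniform spatial division."""
    # Farthest-point initialization keeps the initial centers spread out.
    centers = [points[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        # Assign each point to its nearest center (Euclidean distance).
        d = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return [points[labels == c] for c in range(k)]

pts = np.vstack([np.zeros((5, 3)), np.full((5, 3), 10.0)])
subclouds = kmeans_subclouds(pts, k=2)  # two well-separated sub-point clouds
```

Each returned sub-point cloud can then be encoded independently, as the text states.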
  • Step 102 For any target sub-point cloud in the plurality of sub-point clouds, generate a generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between the target point in the target sub-point cloud and the corresponding point of the target point.
  • the corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
  • any sub-point cloud in the plurality of sub-point clouds can be used as the target sub-point cloud.
  • each target sub-point cloud is treated in the same way.
  • the target sub-point cloud may include multiple points, and every two points constitute a point pair in this embodiment.
  • the Euclidean distance between two points in each point pair is calculated. For example, for the i-th point and the j-th point in the target sub-point cloud, the Euclidean distance between the i-th point and the j-th point is calculated.
  • For point i(x1, x2, …, xn) and point j(y1, y2, …, yn), the Euclidean distance d(i, j) can be calculated by the following formula (1):

    d(i, j) = sqrt((x1 − y1)² + (x2 − y2)² + … + (xn − yn)²)    (1)

  • where 1 ≤ i ≤ M, 1 ≤ j ≤ M; i, j and M are integers, and M is the total number of points included in the target sub-point cloud.
  • In some embodiments, weights are calculated according to the following formula (2), and the weights form the weight matrix W:

    W_ij = e^(−distance(i, j)² / σ²)    (2)

  • where W_ij represents the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud; distance(i, j) represents the Euclidean distance between the i-th point and the j-th point; and σ is a constant not equal to 0, representing the tuning parameter.
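  • The weight-matrix construction can be sketched as follows (formula (2) is partially legible in this copy; the Gaussian-kernel form W_ij = exp(−d(i, j)²/σ²) used here is the standard choice for graph transforms and is an assumption consistent with the role of the tuning parameter σ):

```python
import numpy as np

def weight_matrix(points, sigma=1.0):
    """Weight matrix from pairwise Euclidean distances, assuming a
    Gaussian kernel: W_ij = exp(-d(i, j)**2 / sigma**2)."""
    d = np.linalg.norm(points[:, None] - points[None], axis=2)
    W = np.exp(-(d ** 2) / sigma ** 2)
    np.fill_diagonal(W, 0.0)  # no self-loops on the graph
    return W

pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
W = weight_matrix(pts, sigma=1.0)  # symmetric, zero diagonal
```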
  • In some embodiments, the difference between the degree matrix and the weight matrix is used as the Laplacian matrix:

    L = D − W

  • where D represents the degree matrix and W represents the weight matrix. The diagonal elements of the degree matrix are d_i = Σ_j W_ij, and all other elements are 0, where d_i represents the i-th diagonal element of the degree matrix and W_ij represents the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud.
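  • The degree matrix and Laplacian construction above amount to two lines of matrix code (a sketch with an arbitrary example weight matrix):

```python
import numpy as np

# Degree matrix D: diagonal d_i = sum_j W_ij; Laplacian L = D - W.
W = np.array([[0.0, 0.5, 0.2],
              [0.5, 0.0, 0.1],
              [0.2, 0.1, 0.0]])
D = np.diag(W.sum(axis=1))
L = D - W
# Each row of a combinatorial Laplacian sums to zero, and L is
# symmetric whenever W is.
```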
  • the diagonal matrix is generated according to the Euclidean distance between the target point in the target sub-point cloud and the corresponding point of the target point.
  • In some embodiments, the reference point cloud of the target sub-point cloud may first be determined in the reference frame. For example, motion estimation is performed in the reference frame and a matching reference point cloud is found, where there is a one-to-one correspondence between points in the target sub-point cloud and points in the reference point cloud. For example, using an iterative closest point algorithm, the reference point cloud of the target sub-point cloud can be determined in the reference frame based on Euclidean distance. Then, the diagonal matrix D_w is generated based on the Euclidean distance between each point in the target sub-point cloud and its corresponding point in the reference point cloud.
  • The i-th diagonal element of the diagonal matrix is the reciprocal of the Euclidean distance between the i-th point and point p, and all other elements are 0, where point p is the corresponding point of the i-th point in the reference point cloud.
  • the sum of the diagonal matrix and the Laplacian matrix is used as the generalized Laplacian matrix.
  • Lg = L + D_w, where Lg represents the generalized Laplacian matrix, L represents the Laplacian matrix, and D_w represents the diagonal matrix.
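  • Putting the pieces together, the generalized Laplacian can be sketched as follows (the point coordinates and the unit-σ Gaussian weights for the intra-frame graph are assumptions for illustration; the 1:1 point correspondence is taken as given):

```python
import numpy as np

# Current-frame points and their matched reference-frame points (1:1).
cur = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
ref = np.array([[0.5, 0.0, 0.0], [1.0, 0.5, 0.0], [0.0, 1.0, 0.25]])

# D_w: diagonal matrix whose i-th entry is the reciprocal of the
# Euclidean distance between point i and its corresponding point p.
dist = np.linalg.norm(cur - ref, axis=1)
D_w = np.diag(1.0 / dist)

# Intra-frame Laplacian L = D - W (Gaussian weights, sigma = 1).
W = np.exp(-np.linalg.norm(cur[:, None] - cur[None], axis=2) ** 2)
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W

# Generalized Laplacian of the text: Lg = L + D_w.
Lg = L + D_w
```

Closer inter-frame matches give larger diagonal entries in D_w, i.e. the reference frame is trusted more where correspondences are tight.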
  • Step 103 Using the generalized Laplacian matrix, perform inter-frame prediction and graph Fourier residual transform on the target sub-point cloud.
  • inter-frame prediction and graph Fourier residual transform can be understood as inter-frame prediction and graph Fourier residual transform based on Euclidean distance weights, which may include the following:
  • an inter-frame prediction method is used to predict the attribute value of the current frame by using the reference frame.
  • the attributes may include color, intensity, normal vector and so on.
  • the target attribute can be any attribute.
  • In some embodiments, the attribute prediction value of the target attribute of the current frame from the reference frame is obtained by the following formula (3):

    x̂_t = Lg⁻¹ · D_w · x_(t−1)    (3)

  • where x̂_t represents the attribute prediction value of the target attribute of the current frame from the reference frame, x_(t−1) represents the attribute value of the target attribute of the reference frame, Lg represents the generalized Laplacian matrix, and D_w represents the diagonal matrix.
  • In some embodiments, the difference between the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame can be used as the residual, as shown in formula (4):

    r_t = x_t − x̂_t    (4)

  • where r_t represents the residual of the target attribute of the current frame, x̂_t represents the attribute prediction value of the target attribute of the current frame from the reference frame, and x_t represents the attribute value of the target attribute of the current frame.
  • In this way, the residual is obtained through inter-frame prediction, so that only the differences between the two frames need to be coded. Since the parts shared by the two frames require no additional processing, calculating the residual saves bit rate.
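  • The prediction-and-residual step can be sketched as follows. This is not the patent's exact formula: formula (3) is only partially legible in this copy, and the prediction form x̂_t = Lg⁻¹ · D_w · x_(t−1) is an assumption drawn from the predictive generalized graph Fourier transform literature. The sanity check below uses the fact that with no intra-frame edges (L = 0, so Lg = D_w) the prediction should reduce to copying the reference attributes:

```python
import numpy as np

def predict_and_residual(Lg, D_w, x_ref, x_cur):
    """Assumed inter-frame attribute prediction x_hat = Lg^{-1} D_w x_ref
    and residual r = x_cur - x_hat (formula (4) of the text)."""
    x_hat = np.linalg.solve(Lg, D_w @ x_ref)  # attribute prediction
    return x_hat, x_cur - x_hat               # residual

# With no intra-frame edges, Lg == D_w and the prediction copies x_ref.
D_w = np.diag([2.0, 4.0])
Lg = D_w.copy()
x_ref = np.array([10.0, 20.0])
x_cur = np.array([11.0, 19.0])
x_hat, r = predict_and_residual(Lg, D_w, x_ref, x_cur)
```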
  • In some embodiments, a transformation matrix is obtained from the generalized Laplacian matrix, and the residual of the target attribute of the current frame is then transformed using the transformation matrix. The transformation matrix Φ is obtained by the eigendecomposition

    Lg = Φ · Λ · Φᵀ

  • and the transformation result is

    α = Φᵀ · r_t

  • where Lg represents the generalized Laplacian matrix, Φ represents the transformation matrix, α represents the transformation result, and r_t represents the residual of the target attribute of the current frame.
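  • The graph Fourier residual transform above can be sketched as follows (the example matrix is arbitrary; since Lg is symmetric, its eigenvector matrix Φ is orthonormal, so the inverse transform is simply Φ · α):

```python
import numpy as np

# Graph Fourier transform of the residual: the transform matrix is the
# eigenvector matrix Phi of the symmetric generalized Laplacian Lg,
# and the transform coefficients are alpha = Phi.T @ r.
Lg = np.array([[2.5, -0.5],
               [-0.5, 3.5]])
r = np.array([1.0, -2.0])

eigvals, Phi = np.linalg.eigh(Lg)   # Lg = Phi @ diag(eigvals) @ Phi.T
alpha = Phi.T @ r                   # forward transform
r_back = Phi @ alpha                # inverse transform recovers r
```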
  • the processing method for other sub-point clouds is the same as that for the target sub-point cloud.
  • Step 104 Quantize and code the transformed sub-point clouds respectively to obtain coded code streams.
  • In some embodiments, the color can be decomposed into three 3×1 vectors (for example, under the YUV (Luminance, Chrominance-Blue, Chrominance-Red) or RGB (Red, Green, Blue) color space model).
  • For the Y component, the attribute value of the current frame is predicted according to the process in S1031, a residual is generated according to S1032, and the residual is then transformed using S1033. The transformed Y component is uniformly quantized and arithmetically encoded to obtain a code stream. Each other component can be processed in the same way as the Y component.
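  • The uniform quantization step can be sketched as follows (the arithmetic coding stage is omitted; the step size and coefficient values are arbitrary examples). The integer symbols are what an arithmetic coder would then encode, and dequantization bounds the per-coefficient error by half the step size:

```python
import numpy as np

def uniform_quantize(alpha, step):
    """Uniform quantization of transform coefficients to integer symbols."""
    return np.round(alpha / step).astype(np.int64)

def dequantize(q, step):
    """Decoder-side reconstruction of the coefficients."""
    return q * float(step)

alpha = np.array([3.2, -1.7, 0.04])
q = uniform_quantize(alpha, step=0.5)   # integer symbols for the coder
alpha_hat = dequantize(q, step=0.5)     # reconstructed coefficients
```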
  • In this way, global correlation features can be used to more fully express the correlation between points, so that redundancy between point cloud data can be removed as much as possible and coding performance can be improved.
  • Figure 2 shows a performance comparison between the method of the embodiment of the present application, the Region Adaptive Hierarchical Transform (RAHT) method, and the main-direction weighted graph Fourier transform (NWCFT) method.
  • The comparison is of measured rate-distortion performance; benchmark data sets 1 to 9 are the point cloud sequences Longdress, Loot, Redandblack, Soldier, Andrew, David, Phil, Ricardo and Sarah, respectively.
  • A data comparison with the RAHT method was also performed, as shown in Figure 3.
  • FIG. 4 is a flowchart of a decoding method provided by an embodiment of the present application, which is applied to a decoding device. As shown in Figure 4, the following steps are included:
  • Step 401 Acquire an encoded code stream.
  • wherein the encoded code stream is obtained by the encoding device using the generalized Laplacian matrix to perform inter-frame prediction and graph Fourier residual transform on the sub-point clouds.
  • Step 402 Perform an inverse graph Fourier transform based on Euclidean distance weights on the encoded code stream to obtain a transformation result.
  • Step 403 Obtain a decoded code stream based on the transformation result.
  • In this way, global correlation features can be used to more fully express the correlation between points, so that redundancy between point cloud data can be removed as much as possible and coding performance can be improved. Since the performance of the encoding end is improved and the data to be decoded is correspondingly optimized, the decoding efficiency and performance can be improved accordingly.
  • FIG. 5 is a structural diagram of an encoding device provided by an embodiment of the present application. Since the problem-solving principle of the encoding device is similar to the encoding method in the embodiment of the present application, reference may be made to the implementation of the method for the implementation of the encoding device.
  • the encoding device 500 includes:
  • the first acquisition module 501 is configured to cluster the point cloud data to be processed in the current frame to obtain multiple sub-point clouds;
  • the first generating module 502 is configured to, for any target sub-point cloud in the plurality of sub-point clouds, generate a generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between the target point in the target sub-point cloud and the corresponding point of the target point;
  • the first transformation module 503 is configured to use the generalized Laplacian matrix to perform inter-frame prediction and graph Fourier residual transform on the target sub-point cloud;
  • the first encoding module 504 is configured to respectively quantize and encode the transformed multiple sub-point clouds to obtain an encoded code stream; wherein the corresponding point is located in the reference point cloud of the target sub-point cloud, and the reference point cloud is located in the reference frame of the current frame.
  • In some embodiments, the first acquisition module 501 includes: a first processing submodule configured to voxelize the point cloud data to be processed to obtain point cloud voxels; and a first acquisition submodule configured to cluster the voxelized point cloud data to obtain multiple sub-point clouds.
  • In some embodiments, the first generation module 502 includes: a second acquisition submodule configured to obtain a weight matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud; a third acquisition submodule configured to obtain a Laplacian matrix according to the degree matrix and the weight matrix; a first generation submodule configured to generate a diagonal matrix; and a second generation submodule configured to obtain the generalized Laplacian matrix according to the diagonal matrix and the Laplacian matrix.
  • In some embodiments, the second acquisition submodule includes: a first calculation unit configured to calculate the Euclidean distance between the i-th point and the j-th point in the target sub-point cloud; and a first acquisition unit configured to calculate the weights according to the formula W_ij = e^(−distance(i, j)² / σ²) and form the weight matrix from the weights, where W_ij represents the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud; distance(i, j) represents the Euclidean distance between the i-th point and the j-th point; σ is a constant not equal to 0, representing an adjustment parameter; 1 ≤ i ≤ M, 1 ≤ j ≤ M; i, j and M are integers; and M is the total number of points included in the target sub-point cloud.
  • In some embodiments, the first generating submodule includes: a first determining unit configured to determine a reference point cloud of the target sub-point cloud in the reference frame; and a first generating unit configured to generate the diagonal matrix based on the Euclidean distance between each point in the target sub-point cloud and its corresponding point in the reference point cloud, where the i-th diagonal element of the diagonal matrix is the reciprocal of the Euclidean distance between the i-th point and point p, and point p is the corresponding point of the i-th point in the reference point cloud.
  • In some embodiments, the first determination unit is configured to determine the reference point cloud of the target sub-point cloud in the reference frame by using an iterative closest point algorithm.
  • In some embodiments, the first transformation module 503 includes: a fourth acquisition submodule configured to acquire the attribute prediction value of the target attribute of the current frame from the reference frame; a third generation submodule configured to generate the residual of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame; and a first transformation submodule configured to transform the residual of the target attribute of the current frame based on the generalized Laplacian matrix.
  • In some embodiments, the fourth obtaining submodule is configured to obtain the attribute prediction value of the target attribute of the current frame from the reference frame according to the formula x̂_t = Lg⁻¹ · D_w · x_(t−1), where x̂_t represents the attribute prediction value of the target attribute of the current frame from the reference frame, x_(t−1) represents the attribute value of the target attribute of the reference frame, Lg represents the generalized Laplacian matrix, and D_w represents the diagonal matrix.
  • the third generating submodule is configured to use the difference between the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame by the reference frame as the residual.
  • In some embodiments, the first transformation submodule includes: a second acquisition unit configured to obtain a transformation matrix using the generalized Laplacian matrix; and a first transformation unit configured to transform the residual of the target attribute of the current frame using the transformation matrix.
  • In some embodiments, the second acquisition unit is configured to obtain the transformation matrix Φ by solving the eigendecomposition Lg = Φ · Λ · Φᵀ, where Lg represents the generalized Laplacian matrix and Φ represents the transformation matrix.
  • In some embodiments, the first transformation unit is configured to obtain the transformation result using the formula α = Φᵀ · r_t, where α represents the transformation result, Φ represents the transformation matrix, and r_t represents the residual of the target attribute of the current frame.
  • the encoding device 500 provided in the embodiment of the present application can implement the corresponding embodiment of the above-mentioned encoding method, and its implementation principle and technical effect are similar, so this embodiment will not be described here again.
  • FIG. 6 is a structural diagram of a decoding device provided by an embodiment of the present application. Since the principle by which the decoding device solves the problem is similar to that of the decoding method in the embodiment of the present application, the implementation of the decoding device can refer to the implementation of the method; repeated details are omitted.
  • the decoding device 600 includes:
  • the second obtaining module 601 is configured to obtain an encoded code stream
  • the second transformation module 602 is configured to perform an inverse graph Fourier transform based on Euclidean distance weights on the encoded code stream to obtain a transformation result;
  • the first decoding module 603 is configured to obtain a decoded code stream based on the transformation result
  • wherein the encoded code stream is obtained by the encoding device using the generalized Laplacian matrix to perform inter-frame prediction and graph Fourier residual transform on the sub-point clouds.
  • In some embodiments, the second transformation module includes: a second processing submodule configured to dequantize the encoded code stream; and a second transformation submodule configured to perform an inverse graph Fourier transform based on Euclidean distance weights on the dequantized code stream to obtain the transformation result.
  • In some embodiments, the second transformation submodule is configured to perform the inverse graph Fourier transform based on Euclidean distance weights on the dequantized code stream using the formula r̂_t = Φ · (Δ · α̂_t), where r̂_t represents the inverse-transformed residual value, Φ represents the transformation matrix, α̂_t represents the quantized residual value of the target attribute of the current frame, and Δ represents the inverse quantization coefficient.
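  • The decoder-side inverse transform can be sketched as follows, mirroring the encoder's eigendecomposition (the formula in this copy is partially garbled; dequantizing the coefficients and multiplying by Φ is the natural inverse of the forward transform α = Φᵀ · r, and the example matrix and symbols are arbitrary):

```python
import numpy as np

# Decoder side: dequantize the decoded coefficients, then apply the
# inverse graph Fourier transform r_hat = Phi @ (delta * q), where
# delta is the inverse quantization (dequantization) coefficient.
Lg = np.array([[2.5, -0.5],
               [-0.5, 3.5]])
_, Phi = np.linalg.eigh(Lg)         # same transform matrix as encoder

delta = 0.5                         # quantization step
q = np.array([4, -2])               # decoded integer coefficients
r_hat = Phi @ (delta * q)           # reconstructed residual
```

Because Φ is orthonormal, projecting r_hat back with Φᵀ recovers exactly the dequantized coefficients, which is the round-trip property the codec relies on.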
  • the decoding device 600 provided in the embodiment of the present application can execute the corresponding embodiment of the above-mentioned decoding method, and its implementation principle and technical effect are similar, so this embodiment will not be described here again.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • the integrated unit is implemented in the form of a software function unit and sold or used as an independent product, it can be stored in a processor-readable storage medium.
  • The technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.
  • The embodiment of the present application also provides an electronic device, including: a memory, a processor, and a program stored in the memory and executable on the processor, where the program, when executed by the processor, implements the steps of the encoding method or the decoding method described above.
  • the embodiment of the present application also provides a readable storage medium, on which a program is stored, and when the program is executed by a processor, each process of the above encoding or decoding method embodiment can be achieved, and the same technical effect can be achieved. To avoid repetition, it is not described here.
  • The readable storage medium can be any available medium or data storage device that can be accessed by the processor, including but not limited to magnetic storage (such as floppy disks, hard disks, magnetic tape, magneto-optical discs (MO discs), etc.), optical storage (such as compact discs (CD), digital versatile discs (DVD), Blu-ray Discs (BD), high-definition versatile discs (HVD), etc.), and semiconductor memory (such as ROM, erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), non-volatile memory (NVM), solid state disks (SSD), etc.).
  • The methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application is essentially or the part that contributes to the prior art can be embodied in the form of a software product, and the computer software product is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk, etc.) ) includes several instructions to make a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) execute the method described in each embodiment of the present application.
  • The present application discloses an encoding method, a decoding method, an apparatus, a device, and a readable storage medium, relating to the technical field of image processing, so as to improve processing performance.
  • The method includes: clustering point cloud data to be processed in a current frame to obtain multiple sub-point clouds; for any target sub-point cloud among the multiple sub-point clouds, generating a generalized Laplacian matrix according to Euclidean distances between multiple point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points; performing inter-frame prediction and graph Fourier residual transform on the target sub-point cloud using the generalized Laplacian matrix; and quantizing and encoding each transformed sub-point cloud to obtain an encoded bitstream; where a corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)

Abstract

The present application discloses an encoding method, a decoding method, an apparatus, a device, and a readable storage medium, relating to the technical field of image processing, so as to improve processing performance. The method includes: clustering point cloud data to be processed in a current frame to obtain multiple sub-point clouds; for any target sub-point cloud among the multiple sub-point clouds, generating a generalized Laplacian matrix according to Euclidean distances between multiple point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points; performing inter-frame prediction and graph Fourier residual transform on the target sub-point cloud using the generalized Laplacian matrix; and quantizing and encoding each transformed sub-point cloud to obtain an encoded bitstream; where a corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.

Description

Encoding method, decoding method, apparatus, device, and readable storage medium
Cross-reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202111160289.X, filed on September 30, 2021, with applicants MIGU Culture Technology Co., Ltd., China Mobile Communications Group Co., Ltd., and Peking University, and entitled "Encoding method, decoding method, apparatus, device, and readable storage medium"; the entire contents of that Chinese patent application are incorporated herein by reference.
Technical Field
This application relates to the technical field of image processing, and relates to, but is not limited to, an encoding method, a decoding method, an apparatus, a device, and a readable storage medium.
Background
With the development of computer hardware and algorithms, three-dimensional point cloud data has become increasingly easy to acquire, and the volume of point cloud data keeps growing. Point cloud data consists of a large number of unordered three-dimensional points, each of which includes position information (X, Y, Z) and several items of attribute information (color, normal vector, etc.).
To facilitate the storage and transmission of point cloud data, point cloud compression has gradually become a focus of attention. The prior art provides a scheme that selectively encodes one or more 3D point cloud blocks using inter-frame coding (e.g., motion compensation) techniques based on previously encoded/decoded frames. However, the encoding and other processing performance of this scheme is poor.
Summary
Embodiments of the present application provide an encoding method, a decoding method, an apparatus, a device, and a readable storage medium, so as to improve processing performance.
An embodiment of the present application provides an encoding method, applied to an encoding device, including:
clustering point cloud data to be processed in a current frame to obtain multiple sub-point clouds;
for any target sub-point cloud among the multiple sub-point clouds, generating a generalized Laplacian matrix according to Euclidean distances between multiple point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points;
performing inter-frame prediction and graph Fourier residual transform on the target sub-point cloud using the generalized Laplacian matrix;
quantizing and encoding each transformed sub-point cloud to obtain an encoded bitstream;
where a corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
An embodiment of the present application further provides a decoding method, applied to a decoding device, the method including:
obtaining an encoded bitstream;
performing an inverse graph Fourier transform based on Euclidean distance weights on the encoded bitstream to obtain a transform result;
obtaining a decoded bitstream based on the transform result;
where the encoded bitstream is obtained by an encoding device by encoding the result of performing inter-frame prediction and graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
An embodiment of the present application further provides an encoding apparatus, including:
a first obtaining module, configured to cluster point cloud data to be processed in a current frame to obtain multiple sub-point clouds;
a first generating module, configured to, for any target sub-point cloud among the multiple sub-point clouds, generate a generalized Laplacian matrix according to Euclidean distances between multiple point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points;
a first transform module, configured to perform inter-frame prediction and graph Fourier residual transform on the target sub-point cloud using the generalized Laplacian matrix;
a first encoding module, configured to quantize and encode each transformed sub-point cloud to obtain an encoded bitstream;
where a corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
An embodiment of the present application further provides a decoding apparatus, including:
a second obtaining module, configured to obtain an encoded bitstream;
a second transform module, configured to perform an inverse graph Fourier transform based on Euclidean distance weights on the encoded bitstream to obtain a transform result;
a first decoding module, configured to obtain a decoded bitstream based on the transform result;
where the encoded bitstream is obtained by an encoding device by encoding the result of performing inter-frame prediction and graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
An embodiment of the present application further provides an electronic device, including a memory, a processor, and a program stored in the memory and executable on the processor, where the processor, when executing the program, implements the steps of the encoding method or the decoding method described above.
An embodiment of the present application further provides a readable storage medium storing a program, where the program, when executed by a processor, implements the steps of the encoding method or the decoding method described above.
In embodiments of the present application, point cloud data to be processed in a current frame is clustered to obtain multiple sub-point clouds; for any target sub-point cloud, a generalized Laplacian matrix is generated according to Euclidean distances between multiple point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points; and inter-frame prediction and graph Fourier residual transform are performed on the multiple sub-point clouds using the generalized Laplacian matrix, so that an encoded bitstream is obtained based on the transform results. Since the generalized Laplacian matrix is generated from point-to-point Euclidean distances, embodiments of the present application can exploit global correlation features to express the correlation between points more fully, removing similarities between point cloud data as much as possible and thereby improving encoding performance. Meanwhile, since the performance of the encoding side is improved, the data to be decoded is correspondingly optimized, so decoding efficiency and performance at the decoding side can be improved accordingly.
Brief Description of the Drawings
FIG. 1 is a flowchart of an encoding method provided by an embodiment of the present application;
FIG. 2 and FIG. 3 are schematic diagrams comparing the effects of the method of an embodiment of the present application and prior-art methods;
FIG. 4 is a flowchart of a decoding method provided by an embodiment of the present application;
FIG. 5 is a structural diagram of an encoding apparatus provided by an embodiment of the present application;
FIG. 6 is a structural diagram of a decoding apparatus provided by an embodiment of the present application.
Detailed Description
In embodiments of the present application, the term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate three cases: A alone, both A and B, and B alone. The character "/" generally indicates an "or" relationship between the associated objects.
In embodiments of the present application, the term "multiple" means two or more; other quantifiers are to be understood similarly.
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
Referring to FIG. 1, FIG. 1 is a flowchart of an encoding method provided by an embodiment of the present application, applied to an encoding device. As shown in FIG. 1, the method includes the following steps.
Step 101: cluster point cloud data to be processed in a current frame to obtain multiple sub-point clouds.
In this step, the point cloud data to be processed is voxelized to obtain point cloud voxels, and the voxelized point cloud data is then clustered to obtain multiple sub-point clouds.
Exemplarily, a three-dimensional grid of a preset size is constructed, the point cloud data to be processed is placed into the constructed grid to obtain the coordinates of each point, and each grid cell containing points is taken as a point cloud voxel, yielding multiple point cloud voxels. In addition, the coordinates and attribute information of each point cloud voxel can be obtained, where the attribute information includes intensity, color, and so on. In embodiments of the present application, the coordinates of a point cloud voxel may be the coordinates of the center of the points in the voxel, and the color information of a point cloud voxel may be the average of the color information of the points in the voxel. In practical applications, the point cloud data to be processed may also be voxelized in other ways, such as using an octree, to obtain multiple point cloud voxels.
The point cloud data is clustered using a spatially uniform partitioning method; for example, K-means clustering may be used.
In embodiments of the present application, the point cloud data to be processed is partitioned into multiple sub-point clouds based on position information, with uniform spatial partitioning. Each sub-point cloud can be encoded independently.
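As a rough illustration of the voxelization and clustering steps above, the following Python sketch (not part of the patent; the grid size, the cluster count k, and the plain K-means loop are illustrative assumptions) snaps points to a uniform grid and then splits the cloud into sub-point clouds:

```python
import numpy as np

def voxelize(points, voxel_size=1.0):
    """Snap points to a uniform 3D grid and merge all points sharing a cell,
    returning one representative point (the centroid) per occupied voxel."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    n = inverse.max() + 1
    counts = np.bincount(inverse, minlength=n).astype(float)
    centroids = np.stack(
        [np.bincount(inverse, weights=points[:, d], minlength=n) / counts
         for d in range(points.shape[1])], axis=1)
    return centroids

def kmeans_cluster(points, k=2, iters=20, seed=0):
    """Plain K-means on coordinates: each label identifies one sub-point cloud."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(iters):
        labels = np.argmin(((points[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels
```

Each resulting label group can then be encoded independently, as the paragraph above notes.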
Step 102: for any target sub-point cloud among the multiple sub-point clouds, generate a generalized Laplacian matrix according to Euclidean distances between multiple point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points.
Here, a corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
Any of the multiple sub-point clouds can serve as the target sub-point cloud; in practice, every target sub-point cloud is processed in the same way.
Exemplarily, this step may include the following.
S1021: obtain a weight matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud.
The target sub-point cloud may include multiple points, and every two points form one point pair in this embodiment. In embodiments of the present application, the Euclidean distance between the two points of each point pair is computed. For example, for the i-th point and the j-th point in the target sub-point cloud, the Euclidean distance between them is computed. Exemplarily, the Euclidean distance d(i, j) between a point i = (x_1, x_2, ..., x_n) and a point j = (y_1, y_2, ..., y_n) may be computed in practice according to formula (1):
d(i, j) = sqrt( Σ_{k=1}^{n} (x_k − y_k)² )    (1)
where 1 ≤ i ≤ M, 1 ≤ j ≤ M; i, j, M are integers; and M is the total number of points included in the target sub-point cloud.
Exemplarily, weights are computed according to formula (2), and the weight matrix W is formed from these weights:
W_ij = e^(−distance² / σ²)    (2)
where W_ij denotes the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud; distance denotes the Euclidean distance between the i-th point and the j-th point; and σ is a non-zero constant denoting an adjustment parameter.
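Formulas (1) and (2) above can be sketched as follows; the Gaussian form of the weight and the choice of σ are illustrative assumptions of this sketch, not a definitive implementation:

```python
import numpy as np

def weight_matrix(points, sigma=1.0):
    """Pairwise Euclidean distances (formula (1)) turned into Gaussian edge
    weights (formula (2)); sigma is the adjustment parameter."""
    d = np.sqrt(((points[:, None] - points[None]) ** 2).sum(-1))
    W = np.exp(-(d ** 2) / sigma ** 2)
    np.fill_diagonal(W, 0.0)  # a point has no edge to itself
    return W
```

Closer point pairs thus receive weights near 1 and distant pairs weights near 0, which is what lets the graph capture point-to-point correlation.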
S1022: obtain a Laplacian matrix according to a degree matrix and the weight matrix.
In this step, the difference between the degree matrix and the weight matrix is taken as the Laplacian matrix.
Exemplarily: L = D − W, where L denotes the Laplacian matrix, D denotes the degree matrix, and W denotes the weight matrix.
The diagonal elements of the degree matrix are d_i = Σ_j W_ij, and all other elements are 0, where d_i denotes the i-th diagonal element of the degree matrix and W_ij denotes the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud.
S1023: generate a diagonal matrix.
The diagonal matrix is generated according to the Euclidean distances between target points in the target sub-point cloud and the corresponding points of the target points.
Exemplarily, the reference point cloud of the target sub-point cloud may first be determined in the reference frame, for example by performing motion estimation in the reference frame to find a matching reference point cloud; target sub-point clouds and reference point clouds are in one-to-one correspondence. For example, using the iterative closest point algorithm, the reference point cloud of the target sub-point cloud can be determined in the reference frame based on Euclidean distance. Then, the diagonal matrix D_w is generated based on the Euclidean distance between each point in the target sub-point cloud and that point's corresponding point in the reference point cloud. The value at the i-th diagonal position of the diagonal matrix is the reciprocal of the Euclidean distance between the i-th point and point p, and all other elements are 0, where point p is the corresponding point of the i-th point in the reference point cloud.
S1024: obtain the generalized Laplacian matrix according to the diagonal matrix and the Laplacian matrix.
In this step, the sum of the diagonal matrix and the Laplacian matrix is taken as the generalized Laplacian matrix.
Exemplarily, Lg = L + D_w, where Lg denotes the generalized Laplacian matrix, L denotes the Laplacian matrix, and D_w denotes the diagonal matrix.
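The construction of L, D_w, and Lg in S1022 to S1024 might look like the following sketch; the nearest-neighbor search below stands in for the ICP correspondence, which is an assumption made for simplicity:

```python
import numpy as np

def generalized_laplacian(points, ref_points, W):
    """Build L = D - W (S1022), the inter-frame diagonal matrix D_w (S1023),
    and Lg = L + D_w (S1024). Correspondences are approximated here by the
    nearest reference point instead of a full ICP alignment."""
    D = np.diag(W.sum(axis=1))                                    # degree matrix
    L = D - W                                                     # Laplacian matrix
    d = np.sqrt(((points[:, None] - ref_points[None]) ** 2).sum(-1))
    nearest = d.min(axis=1)                                       # distance to corresponding point
    Dw = np.diag(1.0 / np.maximum(nearest, 1e-9))                 # reciprocal distances on the diagonal
    return L + Dw, L, Dw
```

The small floor 1e-9 guards against division by zero when a point coincides with its corresponding point; the patent text does not specify this detail.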
Step 103: perform inter-frame prediction and graph Fourier residual transform on the target sub-point cloud using the generalized Laplacian matrix.
In this step, the inter-frame prediction and graph Fourier residual transform can be understood as inter-frame prediction and graph Fourier residual transform based on Euclidean distance weights, and may include the following.
S1031: obtain an attribute prediction value, from the reference frame, of a target attribute of the current frame.
In embodiments of the present application, an inter-frame prediction method is used: the reference frame is used to predict the attribute values of the current frame. The attributes may include color, intensity, normal vector, and so on; the target attribute may be any one of them.
Exemplarily, in this step, the attribute prediction value of the target attribute of the current frame from the reference frame is obtained according to formula (3):
x̂_t = Lg⁻¹ · D_w · x_{t−1}    (3)
where x̂_t denotes the attribute prediction value of the target attribute of the current frame from the reference frame, x_{t−1} denotes the attribute value of the target attribute of the reference frame, and Lg denotes the generalized Laplacian matrix.
S1032: generate a residual of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame.
Exemplarily, the difference between the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame may be taken as the residual, as in formula (4):
δ = x_t − x̂_t    (4)
where δ denotes the residual of the target attribute of the current frame, x̂_t denotes the attribute prediction value of the target attribute of the current frame from the reference frame, and x_t denotes the attribute value of the target attribute of the current frame.
Obtaining the residual through inter-frame prediction captures as much as possible of the difference between the two frames. Since the parts shared by the two frames need no additional processing, computing the residual saves bitrate.
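A minimal numeric sketch of formulas (3) and (4) follows; note that the exact predictor form Lg⁻¹·D_w·x_{t−1} is an assumption (a GMRF-style conditional mean consistent with Lg = L + D_w), since the original formula images are not reproduced here:

```python
import numpy as np

def predict_and_residual(Lg, Dw, x_ref, x_cur):
    """Predict the current-frame attribute from the reference frame and form
    the residual of formula (4). x_hat = Lg^{-1} Dw x_ref is the assumed
    form of the predictor in formula (3)."""
    x_hat = np.linalg.solve(Lg, Dw @ x_ref)  # solve Lg x_hat = Dw x_ref
    return x_hat, x_cur - x_hat              # (prediction, residual delta)
```

Using `solve` rather than explicitly inverting Lg is a standard numerical choice; the result is the same prediction.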
S1033: transform the residual of the target attribute of the current frame based on the generalized Laplacian matrix.
In this step, a transform matrix is obtained from the generalized Laplacian matrix, and the residual of the target attribute of the current frame is then transformed using the transform matrix.
Exemplarily, the transform matrix is obtained by solving the eigendecomposition in formula (5):
Lg = Φ Λ Φᵀ    (5)
where Lg denotes the generalized Laplacian matrix, Φᵀ denotes the transform matrix, and Λ is the diagonal matrix of eigenvalues.
The transform result is obtained using formula (6):
θ = Φᵀ δ    (6)
where θ denotes the transform result, Φᵀ denotes the transform matrix, and δ denotes the residual of the target attribute of the current frame.
In embodiments of the present application, the concept of the generalized graph Fourier transform is introduced on the basis of the conventional graph Fourier transform to predict and residual-transform the inter-frame attributes of point cloud data, thereby removing redundancy between data and improving coding efficiency.
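The eigendecomposition of formula (5) and the forward transform of formula (6) can be sketched with NumPy's symmetric eigensolver (a sketch for illustration; the patent does not prescribe a particular solver):

```python
import numpy as np

def gft(Lg, delta):
    """Solve Lg = Phi @ diag(lam) @ Phi.T (formula (5)) and transform the
    residual with the transform matrix Phi.T (formula (6))."""
    lam, Phi = np.linalg.eigh(Lg)  # eigh returns orthonormal Phi for symmetric Lg
    theta = Phi.T @ delta          # graph Fourier coefficients of the residual
    return theta, Phi, lam
```

Because Lg is symmetric, Φ is orthonormal, so the transform is perfectly invertible by Φ; this is the property the decoder relies on.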
The other sub-point clouds are processed in the same way as the target sub-point cloud.
Step 104: quantize and encode each transformed sub-point cloud to obtain an encoded bitstream.
In this step, uniform quantization and arithmetic coding are applied to the transformed sub-point clouds to obtain the encoded bitstream.
Taking color as the target attribute as an example, the color can be decomposed into three 3×1 vectors (for example, in the YUV (Luminance, Chrominance-Blue, Chrominance-Red) or RGB (Red, Green, Blue) color space). Taking the Y component as an example, the attribute values of the current frame are predicted according to the procedure in S1031, the residual is generated according to S1032, and the residual is then transformed using S1033. The transformed Y component is uniformly quantized and arithmetically coded to obtain the bitstream. Each component can be processed in the same way as the Y component.
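The uniform quantization applied per component in step 104 might be sketched as follows (the step size is an illustrative assumption, and the arithmetic coding stage is omitted):

```python
import numpy as np

def quantize(theta, step):
    """Uniform scalar quantization of the graph Fourier coefficients."""
    return np.round(theta / step).astype(np.int64)

def dequantize(q, step):
    """Inverse of uniform quantization; step plays the role of the
    dequantization coefficient used at the decoder."""
    return q.astype(np.float64) * step
```

The quantized integer symbols are what an arithmetic coder would then entropy-code into the bitstream.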
In embodiments of the present application, since the generalized Laplacian matrix is generated from point-to-point Euclidean distances, global correlation features can be exploited to express the correlation between points more fully, removing similarities between point cloud data as much as possible and thereby improving encoding performance.
In practical applications, tests were performed on real point cloud sequences. First, tests were run on 16 frames of dynamic point clouds. FIG. 2 shows a comparison between the method of an embodiment of the present application, the Region Adaptive Hierarchical Transform (RAHT), and the main-direction-weighted graph Fourier transform (NWCFT). As shown in FIG. 2, the rate-distortion performance determined under method one (RAHT), method two (NWCFT), and method three (the encoding method provided by the embodiment of the present application) is compared on nine point cloud datasets, namely benchmark datasets 1 to 9; benchmark datasets 1 to 9 may respectively be the point clouds Longdress, Loot, Redandblack, Soldier, Andrew, David, Phil, Ricardo, and Sarah. To quantify the gain, a further data comparison against the RAHT method was made in the experiments, as shown in FIG. 3.
As can be seen from FIG. 2 and FIG. 3, the embodiments of the present application introduce the concept of the generalized graph Fourier transform on the basis of the conventional graph Fourier transform to predict and residual-transform inter-frame point cloud attributes, further removing redundancy between data and improving coding efficiency. The experimental results show that the method of the embodiments improves both subjective and objective performance and can be applied in practical point cloud compression, transmission, and storage systems.
Referring to FIG. 4, FIG. 4 is a flowchart of a decoding method provided by an embodiment of the present application, applied to a decoding device. As shown in FIG. 4, the method includes the following steps.
Step 401: obtain an encoded bitstream.
The encoded bitstream is obtained by an encoding device by encoding the result of performing inter-frame prediction and graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
Step 402: perform an inverse graph Fourier transform based on Euclidean distance weights on the encoded bitstream to obtain a transform result.
At the decoding side, after entropy decoding the encoded bitstream, the encoded bitstream is dequantized; the inverse graph Fourier transform based on Euclidean distance weights is then applied to the dequantized bitstream to obtain the transform result.
Exemplarily, the inverse graph Fourier transform based on Euclidean distance weights may be applied to the dequantized bitstream using formula (7):
δ̂ = Φ (ε · θ̂)    (7)
where δ̂ denotes the inverse-transformed residual value, Φᵀ denotes the transform matrix (Φ, its transpose, is its inverse since Φ is orthonormal), θ̂ denotes the quantized residual value of the target attribute of the current frame, and ε denotes the dequantization coefficient.
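Dequantization followed by the inverse transform of formula (7) can be sketched as follows, assuming the orthonormal eigenvector matrix Φ from the encoder-side eigendecomposition is available at the decoder:

```python
import numpy as np

def inverse_gft(theta_q, Phi, eps):
    """Scale the quantized coefficients by the dequantization coefficient eps,
    then apply the inverse graph Fourier transform. Since Phi is orthonormal,
    Phi is the inverse of the transform matrix Phi.T."""
    return Phi @ (eps * theta_q)
```

Round-tripping a residual through the forward transform and this inverse (with eps = 1) recovers it exactly, which is the lossless core around which only the quantization error remains.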
Step 403: obtain a decoded bitstream based on the transform result.
In embodiments of the present application, since the generalized Laplacian matrix is generated from point-to-point Euclidean distances, global correlation features can be exploited to express the correlation between points more fully, removing similarities between point cloud data as much as possible and thereby improving encoding performance. Since the performance of the encoding side is improved, the data to be decoded is correspondingly optimized, so decoding efficiency and performance at the decoding side can be improved accordingly.
An embodiment of the present application further provides an encoding apparatus. Referring to FIG. 5, FIG. 5 is a structural diagram of an encoding apparatus provided by an embodiment of the present application. Since the principle by which the encoding apparatus solves the problem is similar to that of the encoding method in the embodiments of the present application, reference may be made to the implementation of the method for the implementation of the apparatus.
As shown in FIG. 5, the encoding apparatus 500 includes:
a first obtaining module 501, configured to cluster point cloud data to be processed in a current frame to obtain multiple sub-point clouds;
a first generating module 502, configured to, for any target sub-point cloud among the multiple sub-point clouds, generate a generalized Laplacian matrix according to Euclidean distances between multiple point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points;
a first transform module 503, configured to perform inter-frame prediction and graph Fourier residual transform on the target sub-point cloud using the generalized Laplacian matrix;
a first encoding module 504, configured to quantize and encode each transformed sub-point cloud to obtain an encoded bitstream; where a corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
In some embodiments, the first obtaining module 501 includes: a first processing sub-module, configured to voxelize the point cloud data to be processed to obtain point cloud voxels; and a first obtaining sub-module, configured to cluster the voxelized point cloud data to obtain the multiple sub-point clouds.
In some embodiments, the first generating module 502 includes: a second obtaining sub-module, configured to obtain a weight matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud; a third obtaining sub-module, configured to obtain a Laplacian matrix according to a degree matrix and the weight matrix; a first generating sub-module, configured to generate a diagonal matrix; and a second generating sub-module, configured to obtain the generalized Laplacian matrix according to the diagonal matrix and the Laplacian matrix.
In some embodiments, the third obtaining sub-module is configured to take the difference between the degree matrix and the weight matrix as the Laplacian matrix; the second generating sub-module is configured to take the sum of the diagonal matrix and the Laplacian matrix as the generalized Laplacian matrix; the diagonal elements of the degree matrix are d_i = Σ_j W_ij, where d_i denotes the i-th diagonal element of the degree matrix, W_ij denotes the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud, 1 ≤ i ≤ M, 1 ≤ j ≤ M, i, j, M are integers, and M is the total number of points included in the target sub-point cloud; and the diagonal matrix is generated according to the Euclidean distances between target points in the target sub-point cloud and the corresponding points of the target points.
In some embodiments, the second obtaining sub-module includes: a first computing unit, configured to compute, for the i-th point and the j-th point in the target sub-point cloud, the Euclidean distance between the i-th point and the j-th point; and a first obtaining unit, configured to compute weights according to the following formula and form the weight matrix from the weights:
W_ij = e^(−distance² / σ²)
where W_ij denotes the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud; distance denotes the Euclidean distance between the i-th point and the j-th point; σ is a non-zero constant denoting an adjustment parameter; 1 ≤ i ≤ M, 1 ≤ j ≤ M; i, j, M are integers; and M is the total number of points included in the target sub-point cloud.
In some embodiments, the first generating sub-module includes: a first determining unit, configured to determine the reference point cloud of the target sub-point cloud in the reference frame; and a first generating unit, configured to generate the diagonal matrix based on the Euclidean distance between each point in the target sub-point cloud and that point's corresponding point in the reference point cloud; where the value at the i-th diagonal position of the diagonal matrix is the reciprocal of the Euclidean distance between the i-th point and point p, point p being the corresponding point of the i-th point in the reference point cloud.
In some embodiments, the first determining unit is configured to determine the reference point cloud of the target sub-point cloud in the reference frame using the iterative closest point algorithm.
In some embodiments, the first transform module 503 includes: a fourth obtaining sub-module, configured to obtain an attribute prediction value, from the reference frame, of a target attribute of the current frame; a third generating sub-module, configured to generate a residual of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame; and a first transform sub-module, configured to transform the residual of the target attribute of the current frame based on the generalized Laplacian matrix.
In some embodiments, the fourth obtaining sub-module is configured to obtain the attribute prediction value of the target attribute of the current frame from the reference frame according to the following formula:
x̂_t = Lg⁻¹ · D_w · x_{t−1}
where x̂_t denotes the attribute prediction value of the target attribute of the current frame from the reference frame, x_{t−1} denotes the attribute value of the target attribute of the reference frame, and Lg denotes the generalized Laplacian matrix.
In some embodiments, the third generating sub-module is configured to take, as the residual, the difference between the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame.
In some embodiments, the first transform sub-module includes: a second obtaining unit, configured to obtain a transform matrix from the generalized Laplacian matrix; and a first transform unit, configured to transform the residual of the target attribute of the current frame using the transform matrix.
In some embodiments, the second obtaining unit is configured to obtain the transform matrix by solving the following formula:
Lg = Φ Λ Φᵀ
where Lg denotes the generalized Laplacian matrix and Φᵀ denotes the transform matrix.
In some embodiments, the first transform unit is configured to obtain the transform result using the following formula:
θ = Φᵀ δ
where θ denotes the transform result, Φᵀ denotes the transform matrix, and δ denotes the residual of the target attribute of the current frame.
The encoding apparatus 500 provided by this embodiment of the present application can perform the embodiments corresponding to the encoding method described above; its implementation principles and technical effects are similar and are not described again here.
An embodiment of the present application further provides a decoding apparatus. Referring to FIG. 6, FIG. 6 is a structural diagram of a decoding apparatus provided by an embodiment of the present application. Since the principle by which the decoding apparatus solves the problem is similar to that of the decoding method in the embodiments of the present application, reference may be made to the implementation of the method for the implementation of the apparatus; repeated details are not described again.
As shown in FIG. 6, the decoding apparatus 600 includes:
a second obtaining module 601, configured to obtain an encoded bitstream;
a second transform module 602, configured to perform an inverse graph Fourier transform based on Euclidean distance weights on the encoded bitstream to obtain a transform result;
a first decoding module 603, configured to obtain a decoded bitstream based on the transform result;
where the encoded bitstream is obtained by an encoding device by encoding the result of performing inter-frame prediction and graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
In some embodiments, the second transform module includes: a second processing sub-module, configured to dequantize the encoded bitstream; and a second transform sub-module, configured to perform the inverse graph Fourier transform based on Euclidean distance weights on the dequantized encoded bitstream to obtain the transform result.
In some embodiments, the second transform sub-module is configured to perform the inverse graph Fourier transform based on Euclidean distance weights on the dequantized encoded bitstream using the following formula:
δ̂ = Φ (ε · θ̂)
where δ̂ denotes the inverse-transformed residual value, Φᵀ denotes the transform matrix (Φ being its orthonormal inverse), θ̂ denotes the quantized residual value of the target attribute of the current frame, and ε denotes the dequantization coefficient.
The decoding apparatus 600 provided by this embodiment of the present application can perform the embodiments corresponding to the decoding method described above; its implementation principles and technical effects are similar and are not described again here.
It should be noted that the division into units in the embodiments of the present application is schematic and is merely a division by logical function; other divisions are possible in actual implementations. In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, may exist physically separately, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a processor-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
An embodiment of the present application further provides an electronic device, including a memory, a processor, and a program stored in the memory and executable on the processor, where the processor, when executing the program, implements the steps of the encoding method or the decoding method described above.
An embodiment of the present application further provides a readable storage medium on which a program is stored; when the program is executed by a processor, each process of the above encoding or decoding method embodiments can be achieved with the same technical effect. To avoid repetition, details are not repeated here. The readable storage medium may be any available medium or data storage device accessible to the processor, including but not limited to magnetic storage (e.g., floppy disk, hard disk, magnetic tape, magneto-optical (MO) disc), optical storage (e.g., compact disc (CD), digital versatile disc (DVD), Blu-ray Disc (BD), High-Definition Versatile Disc (HVD)), and semiconductor memory, for example: ROM, erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), non-volatile memory (NVM), solid state disk (SSD), etc.
It should be noted that, in this document, the terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. In the absence of further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.
Through the description of the above implementations, a person skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the specific implementations described above, which are merely illustrative rather than restrictive. Under the inspiration of the present application, a person of ordinary skill in the art may make many further forms without departing from the spirit of the present application and the scope protected by the claims, all of which fall within the protection of the present application.
Industrial Applicability
The present application discloses an encoding method, a decoding method, an apparatus, a device, and a readable storage medium, relating to the technical field of image processing, so as to improve processing performance. The method includes: clustering point cloud data to be processed in a current frame to obtain multiple sub-point clouds; for any target sub-point cloud among the multiple sub-point clouds, generating a generalized Laplacian matrix according to Euclidean distances between multiple point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points; performing inter-frame prediction and graph Fourier residual transform on the target sub-point cloud using the generalized Laplacian matrix; and quantizing and encoding each transformed sub-point cloud to obtain an encoded bitstream; where a corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.

Claims (20)

  1. An encoding method, applied to an encoding device, comprising:
    clustering point cloud data to be processed in a current frame to obtain multiple sub-point clouds;
    for any target sub-point cloud among the multiple sub-point clouds, generating a generalized Laplacian matrix according to Euclidean distances between multiple point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points;
    performing inter-frame prediction and graph Fourier residual transform on the target sub-point cloud using the generalized Laplacian matrix; and
    quantizing and encoding each transformed sub-point cloud to obtain an encoded bitstream;
    wherein a corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
  2. The method according to claim 1, wherein the clustering point cloud data to be processed in the current frame to obtain multiple sub-point clouds comprises:
    voxelizing the point cloud data to be processed to obtain point cloud voxels; and
    clustering the voxelized point cloud data to obtain the multiple sub-point clouds.
  3. The method according to claim 1, wherein the generating the generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distances between the target points in the target sub-point cloud and the corresponding points of the target points comprises:
    obtaining a weight matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud;
    obtaining a Laplacian matrix according to a degree matrix and the weight matrix;
    generating a diagonal matrix; and
    obtaining the generalized Laplacian matrix according to the diagonal matrix and the Laplacian matrix.
  4. The method according to claim 3, wherein the obtaining the Laplacian matrix according to the degree matrix and the weight matrix comprises:
    taking the difference between the degree matrix and the weight matrix as the Laplacian matrix;
    the obtaining the generalized Laplacian matrix according to the diagonal matrix and the Laplacian matrix comprises:
    taking the sum of the diagonal matrix and the Laplacian matrix as the generalized Laplacian matrix;
    wherein the diagonal elements of the degree matrix are d_i = Σ_j W_ij, d_i denotes the i-th diagonal element of the degree matrix, W_ij denotes the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud, 1 ≤ i ≤ M, 1 ≤ j ≤ M, i, j, M are integers, and M is the total number of points included in the target sub-point cloud; and
    the diagonal matrix is generated according to the Euclidean distances between the target points in the target sub-point cloud and the corresponding points of the target points.
  5. The method according to claim 3, wherein the obtaining the weight matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud comprises:
    computing, for the i-th point and the j-th point in the target sub-point cloud, the Euclidean distance between the i-th point and the j-th point; and
    computing weights according to the following formula and forming the weight matrix from the weights:
    W_ij = e^(−distance² / σ²)
    wherein W_ij denotes the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud; distance denotes the Euclidean distance between the i-th point and the j-th point; σ is a non-zero constant denoting an adjustment parameter; 1 ≤ i ≤ M, 1 ≤ j ≤ M; i, j, M are integers; and M is the total number of points included in the target sub-point cloud.
  6. The method according to claim 3, wherein the generating the diagonal matrix comprises:
    determining the reference point cloud of the target sub-point cloud in the reference frame; and
    generating the diagonal matrix based on the Euclidean distance between each point in the target sub-point cloud and that point's corresponding point in the reference point cloud;
    wherein the value at the i-th diagonal position of the diagonal matrix is the reciprocal of the Euclidean distance between the i-th point and point p, point p being the corresponding point of the i-th point in the reference point cloud.
  7. The method according to claim 6, wherein the determining the reference point cloud of the target sub-point cloud in the reference frame comprises:
    determining the reference point cloud of the target sub-point cloud in the reference frame using an iterative closest point algorithm.
  8. The method according to claim 1, wherein the performing inter-frame prediction and graph Fourier residual transform on the target sub-point cloud using the generalized Laplacian matrix comprises:
    obtaining an attribute prediction value, from the reference frame, of a target attribute of the current frame;
    generating a residual of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame; and
    transforming the residual of the target attribute of the current frame based on the generalized Laplacian matrix.
  9. The method according to claim 8, wherein the obtaining the attribute prediction value, from the reference frame, of the target attribute of the current frame comprises:
    obtaining the attribute prediction value of the target attribute of the current frame from the reference frame according to the following formula:
    x̂_t = Lg⁻¹ · D_w · x_{t−1}
    wherein x̂_t denotes the attribute prediction value of the target attribute of the current frame from the reference frame, x_{t−1} denotes the attribute value of the target attribute of the reference frame, and Lg denotes the generalized Laplacian matrix.
  10. The method according to claim 8, wherein the generating the residual of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame comprises:
    taking, as the residual, the difference between the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame.
  11. The method according to claim 8, wherein the transforming the residual of the target attribute of the current frame based on the generalized Laplacian matrix comprises:
    obtaining a transform matrix from the generalized Laplacian matrix; and
    transforming the residual of the target attribute of the current frame using the transform matrix.
  12. The method according to claim 11, wherein the obtaining the transform matrix from the generalized Laplacian matrix comprises:
    obtaining the transform matrix by solving the following formula:
    Lg = Φ Λ Φᵀ
    wherein Lg denotes the generalized Laplacian matrix and Φᵀ denotes the transform matrix.
  13. The method according to claim 11, wherein the transforming the residual of the target attribute of the current frame using the transform matrix comprises:
    obtaining the transform result using the following formula:
    θ = Φᵀ δ
    wherein θ denotes the transform result, Φᵀ denotes the transform matrix, and δ denotes the residual of the target attribute of the current frame.
  14. A decoding method, applied to a decoding device, the method comprising:
    obtaining an encoded bitstream;
    performing an inverse graph Fourier transform based on Euclidean distance weights on the encoded bitstream to obtain a transform result; and
    obtaining a decoded bitstream based on the transform result;
    wherein the encoded bitstream is obtained by an encoding device by encoding the result of performing inter-frame prediction and graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
  15. The method according to claim 14, wherein the performing the inverse graph Fourier transform based on Euclidean distance weights on the encoded bitstream to obtain the transform result comprises:
    dequantizing the encoded bitstream; and
    performing the inverse graph Fourier transform based on Euclidean distance weights on the dequantized encoded bitstream to obtain the transform result.
  16. The method according to claim 15, wherein the performing the inverse graph Fourier transform based on Euclidean distance weights on the dequantized encoded bitstream to obtain the transform result comprises:
    performing the inverse graph Fourier transform based on Euclidean distance weights on the dequantized encoded bitstream using the following formula:
    δ̂ = Φ (ε · θ̂)
    wherein δ̂ denotes the inverse-transformed residual value, Φᵀ denotes the transform matrix (Φ being its orthonormal inverse), θ̂ denotes the quantized residual value of the target attribute of the current frame, and ε denotes the dequantization coefficient.
  17. An encoding apparatus, comprising:
    a first obtaining module, configured to cluster point cloud data to be processed in a current frame to obtain multiple sub-point clouds;
    a first generating module, configured to, for any target sub-point cloud among the multiple sub-point clouds, generate a generalized Laplacian matrix according to Euclidean distances between multiple point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points;
    a first transform module, configured to perform inter-frame prediction and graph Fourier residual transform on the target sub-point cloud using the generalized Laplacian matrix; and
    a first encoding module, configured to quantize and encode each transformed sub-point cloud to obtain an encoded bitstream;
    wherein a corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
  18. A decoding apparatus, comprising:
    a second obtaining module, configured to obtain an encoded bitstream;
    a second transform module, configured to perform an inverse graph Fourier transform based on Euclidean distance weights on the encoded bitstream to obtain a transform result; and
    a first decoding module, configured to obtain a decoded bitstream based on the transform result;
    wherein the encoded bitstream is obtained by an encoding device by encoding the result of performing inter-frame prediction and graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
  19. An electronic device, comprising: a memory, a processor, and a program stored in the memory and executable on the processor;
    the processor being configured to read the program in the memory to implement the steps of the encoding method according to any one of claims 1 to 13, or to implement the steps of the decoding method according to any one of claims 14 to 16.
  20. A readable storage medium for storing a program, wherein the program, when executed by a processor, implements the steps of the encoding method according to any one of claims 1 to 13, or implements the steps of the decoding method according to any one of claims 14 to 16.
PCT/CN2022/123245 2021-09-30 2022-09-30 Encoding method, decoding method, apparatus, device, and readable storage medium WO2023051783A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111160289.X 2021-09-30
CN202111160289.XA CN113766229B (zh) 2021-09-30 2021-09-30 Encoding method, decoding method, apparatus, device, and readable storage medium

Publications (1)

Publication Number Publication Date
WO2023051783A1 true WO2023051783A1 (zh) 2023-04-06

Family

ID=78798550

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/123245 WO2023051783A1 (zh) 2022-09-30 2021-09-30 Encoding method, decoding method, apparatus, device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN113766229B (zh)
WO (1) WO2023051783A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113766229B (zh) * 2021-09-30 2023-04-28 MIGU Culture Technology Co., Ltd. Encoding method, decoding method, apparatus, device, and readable storage medium
WO2023173238A1 (zh) * 2022-03-12 2023-09-21 Oppo广东移动通信有限公司 Encoding and decoding method, bitstream, encoder, decoder, and storage medium
WO2023245981A1 (zh) * 2022-06-20 2023-12-28 北京大学深圳研究生院 Point cloud compression method and apparatus, electronic device, and storage medium
CN114785998A (zh) * 2022-06-20 2022-07-22 北京大学深圳研究生院 Point cloud compression method and apparatus, electronic device, and storage medium
WO2024077911A1 (en) * 2022-10-13 2024-04-18 Beijing Bytedance Network Technology Co., Ltd. Method, apparatus, and medium for point cloud coding
CN116797625B (zh) * 2023-07-20 2024-04-19 无锡埃姆维工业控制设备有限公司 Monocular three-dimensional workpiece pose estimation method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171761A (zh) * 2017-12-13 2018-06-15 北京大学 Point cloud intra-frame encoding method and apparatus based on graph Fourier transform
CN110418135A (zh) * 2019-08-05 2019-11-05 北京大学深圳研究生院 Point cloud intra-frame prediction method and device based on neighbor-based weight optimization
CN110572655A (zh) * 2019-09-30 2019-12-13 北京大学深圳研究生院 Method and device for point cloud attribute encoding and decoding based on neighbor-weight parameter selection and transfer
WO2020197086A1 (ko) * 2019-03-25 2020-10-01 엘지전자 주식회사 Point cloud data transmission apparatus, point cloud data transmission method, point cloud data reception apparatus, and point cloud data reception method
CN112385238A (zh) * 2019-07-10 2021-02-19 深圳市大疆创新科技有限公司 Data encoding and data decoding methods, devices, and storage medium
CN113766229A (zh) * 2021-09-30 2021-12-07 咪咕文化科技有限公司 Encoding method, decoding method, apparatus, device, and readable storage medium


Also Published As

Publication number Publication date
CN113766229A (zh) 2021-12-07
CN113766229B (zh) 2023-04-28

Similar Documents

Publication Publication Date Title
WO2023051783A1 (zh) 一种编码方法、解码方法、装置、设备及可读存储介质
Liu et al. Random walk graph Laplacian-based smoothness prior for soft decoding of JPEG images
US10853447B2 (en) Bezier volume representation of point cloud attributes
US20170026665A1 (en) Method and device for compressing local feature descriptor, and storage medium
US20180061428A1 (en) Variable length coding of indices and bit scheduling in a pyramid vector quantizer
CN111727445A (zh) 局部熵编码的数据压缩
CN114245896A (zh) 向量查询方法、装置、电子设备及存储介质
US20240202982A1 (en) 3d point cloud encoding and decoding method, compression method and device based on graph dictionary learning
Zhang et al. Transformer and upsampling-based point cloud compression
CN104392207A (zh) 一种用于数字图像内容识别的特征编码方法
CN107231556B (zh) 一种图像云储存设备
Fred et al. Bat optimization based vector quantization algorithm for medical image compression
Wang et al. Fast sparse fractal image compression
Km et al. Secure image transformation using remote sensing encryption algorithm
Wang et al. Fractal image encoding with flexible classification sets
Hajizadeh et al. Predictive compression of animated 3D models by optimized weighted blending of key‐frames
Kim et al. A fractal vector quantizer for image coding
Xu et al. Conditional perceptual quality preserving image compression
KR20100083554A (ko) 이산여현변환/역이산여현변환 방법 및 장치
Guo et al. Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach
Zhang et al. Mutual information-based context template modeling for bitplane coding in remote sensing image compression
Bayazit et al. 3-D mesh geometry compression with set partitioning in the spectral domain
Hu et al. A highly efficient method for improving the performance of GLA-based algorithms
US20240005562A1 (en) Point cloud encoding method and apparatus, electronic device, medium and program product
JP4871246B2 (ja) ベクトル量子化方法,装置およびそれらのプログラムとそれを記録したコンピュータ読み取り可能な記録媒体

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22875164

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE