CN114972695B

CN114972695B - Point cloud generation method and device, electronic equipment and storage medium

Info

Publication number: CN114972695B
Application number: CN202210555391.8A
Authority: CN
Inventors: 李革; 陈婧怡; 李宏; 高伟
Original assignee: Peking University Shenzhen Graduate School
Current assignee: Peking University Shenzhen Graduate School
Priority date: 2022-05-20
Filing date: 2022-05-20
Publication date: 2024-03-15
Anticipated expiration: 2042-05-20
Also published as: CN114972695A

Abstract

The application provides a point cloud generation method, a device, electronic equipment and a storage medium, wherein the method comprises the following steps: S1, acquiring point cloud data of a target class; S2, for each piece of point cloud data, processing the point cloud data with an encoder to obtain a mean vector and a variance vector corresponding to the point cloud data; S3, obtaining a hidden code vector corresponding to the point cloud data based on the mean vector, the variance vector and a Gaussian distribution vector; S4, inputting a first hidden code matrix, obtained by combining the hidden code vectors, into the forward process of a point cloud normalized stream to obtain a first matrix; S5, inputting each piece of point cloud data and the first hidden code matrix into a reversible decoder, and performing the forward process of a target flow to obtain a second matrix; S6, calculating a first loss value based on the first hidden code matrix and the first matrix, and calculating a second loss value based on a first Gaussian distribution matrix and the second matrix. The method and the device enable the point cloud generated by the trained reversible point cloud decoder to have richer details.

Description

Point cloud generation method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of point cloud generation, and in particular, to a method and apparatus for generating a point cloud, an electronic device, and a storage medium.

Background

Point cloud generation aims to generate point clouds from a specific distribution. The task is challenging because point clouds are unordered.

Common point cloud generation methods include generative adversarial networks, autoregressive models, variational autoencoders and normalizing flows (referred to herein as normalized streams). Each has inherent drawbacks: the training of generative adversarial networks is unstable, autoregressive models must generate points sequentially, point clouds generated by variational autoencoders are blurry, and normalized streams take a long time to train. Some studies have therefore combined a variational autoencoder with a normalized stream to improve the quality of the generated point clouds. However, these approaches rarely consider network design. Specifically, the structure widely adopted for the point cloud encoder of the variational autoencoder has difficulty capturing local information, so the hidden code lacks a high-frequency representation, which is detrimental to point cloud generation. Directly substituting a complex point cloud encoder, on the other hand, would undermine the robustness of generation and incur unacceptable computational cost.

Disclosure of Invention

In view of the foregoing, an object of the present application is to provide a method, an apparatus, an electronic device, and a storage medium for generating a point cloud, which enable the point cloud generated by a trained reversible point cloud decoder to have richer details.

In a first aspect, an embodiment of the present application provides a method for generating a point cloud, where the method includes:

S101, acquiring point cloud data of each of m objects of a target class, wherein m is an integer greater than 1;

S102, for each piece of most recently acquired point cloud data, performing a target processing process on the point cloud data using a point cloud encoder to obtain a mean vector corresponding to the point cloud data and a variance vector corresponding to the point cloud data, wherein the target processing process comprises the following steps: performing first convolution processing on the point cloud data to obtain an initial feature matrix of the point cloud data; obtaining, based on the initial feature matrix, a first spatial feature matrix representing the position information, in a feature space, of each point in the point cloud data and each of its first neighbor points, and a second spatial feature matrix representing the position information, in a Cartesian space, of each point in the point cloud data and each of its second neighbor points, wherein, for each point in the point cloud data, the first neighbor points of the point are the k points in the point cloud data closest to the point in the feature space, the second neighbor points of the point are the k points in the point cloud data closest to the point in the Cartesian space, and k is an integer greater than 1; performing first pooling processing on a comprehensive spatial feature matrix to obtain a first pooled feature matrix, and performing second pooling processing on the comprehensive spatial feature matrix to obtain a second pooled feature matrix, wherein the comprehensive spatial feature matrix is obtained by adding the first spatial feature matrix and the second spatial feature matrix, or by combining the first spatial feature matrix and the second spatial feature matrix; converting the first pooled feature matrix into the mean vector through a first fully connected layer, and converting the second pooled feature matrix into the variance vector through a second fully connected layer;

S103, based on the mean value vector, the variance vector and a Gaussian distribution vector formed by a first target sampling point, obtaining a hidden code vector corresponding to the point cloud data, wherein the first target sampling point is obtained by randomly sampling noise of which the probability density function accords with standard Gaussian distribution;

S104, inputting a first hidden code matrix, obtained by combining the hidden code vectors corresponding to each piece of most recently acquired point cloud data, into a forward process of a first point cloud normalized stream to obtain a first target matrix with the same dimensions as the first hidden code matrix;

S105, inputting each piece of most recently acquired point cloud data and the first hidden code matrix into a reversible point cloud decoder, and performing a forward process of a target flow to obtain a second target matrix, wherein the target flow is a second point cloud normalized stream or a Markov chain diffusion;

S106, calculating a first loss value based on the first hidden code matrix and the first target matrix, and calculating a second loss value based on a first Gaussian distribution matrix and the second target matrix, wherein the first Gaussian distribution matrix has the same dimensions as the second target matrix and is formed by first target three-dimensional sampling points, the first target three-dimensional sampling points being obtained by randomly sampling three-dimensional noise whose probability density function conforms to a standard Gaussian distribution;

S107, if the latest first loss value is greater than a first preset loss value and/or the latest second loss value is greater than a second preset loss value, optimizing at least one of the first point cloud normalized stream, the reversible point cloud decoder and the point cloud encoder based on a gradient descent method, and repeating the steps S101-S106 until the latest first loss value is less than or equal to the first preset loss value and the latest second loss value is less than or equal to the second preset loss value, so as to take the current first point cloud normalized stream as a first point cloud normalized stream after training is completed, and take the current reversible point cloud decoder as a reversible point cloud decoder after training is completed;

S108, inputting a second Gaussian distribution matrix, which has the same dimensions as the first target matrix and is formed by second target sampling points, into the reverse process of the trained first point cloud normalized stream to obtain a second hidden code matrix with the same dimensions as the first target matrix, wherein the second target sampling points are obtained by randomly sampling noise whose probability density function conforms to a standard Gaussian distribution;

S109, inputting the second hidden code matrix and a third Gaussian distribution matrix, which has the same dimensions as the first Gaussian distribution matrix and is formed by second target three-dimensional sampling points, into the trained reversible point cloud decoder, and performing the reverse process of the target flow to generate a new point cloud of the target class, wherein the second target three-dimensional sampling points are obtained by randomly sampling three-dimensional noise whose probability density function conforms to a standard Gaussian distribution.

In one possible implementation manner, performing first pooling processing on the comprehensive spatial feature matrix to obtain a first pooled feature matrix includes:

rearranging the numerical values in each row of the comprehensive spatial feature matrix in ascending order, or rearranging the numerical values in each row of the comprehensive spatial feature matrix in descending order, to obtain a first candidate feature matrix;

performing second convolution processing on the first candidate feature matrix to obtain a second candidate feature matrix;

calculating, based on a softmax function, a first selection probability of the numerical value at each position in the second candidate feature matrix, and replacing the numerical value at that position in the second candidate feature matrix with the first selection probability to obtain a first weight matrix;

multiplying the transposed matrix of the first weight matrix with the first candidate feature matrix to obtain the first pooled feature matrix;

and performing second pooling processing on the comprehensive spatial feature matrix to obtain a second pooled feature matrix includes:

rearranging the numerical values in each row of the comprehensive spatial feature matrix in ascending order, or rearranging the numerical values in each row of the comprehensive spatial feature matrix in descending order, to obtain a third candidate feature matrix;

performing third convolution processing on the third candidate feature matrix to obtain a fourth candidate feature matrix;

calculating, based on a softmax function, a second selection probability of the numerical value at each position in the fourth candidate feature matrix, and replacing the numerical value at that position in the fourth candidate feature matrix with the second selection probability to obtain a second weight matrix;

multiplying the transposed matrix of the second weight matrix with the third candidate feature matrix to obtain the second pooled feature matrix.
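By way of non-limiting illustration, the sort, convolution, softmax and multiplication steps above may be sketched as follows. The sketch assumes a layout with one row per feature channel and one column per point, and stands in for the convolution with a hypothetical kernel-size-1 weight matrix `conv_w`; the function names are illustrative and not taken from the application. The text does not fix the storage layout, so the transpose is placed here such that the product contracts over the point axis.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def sorted_softmax_pool(features, conv_w):
    """features: (C, N) comprehensive spatial feature matrix (assumed layout).
    conv_w: (d, C) hypothetical kernel-size-1 convolution weights."""
    # Step 1: sort the values in each row in ascending order -> candidate matrix.
    candidate = np.sort(features, axis=1)          # (C, N)
    # Step 2: convolution stand-in -> second candidate matrix (scores).
    scores = conv_w @ candidate                    # (d, N)
    # Step 3: softmax replaces each score with a selection probability.
    weights = softmax(scores, axis=1)              # each row sums to 1
    # Step 4: weight matrix times candidate matrix, contracting over points.
    pooled = weights @ candidate.T                 # (d, C)
    return pooled, weights

rng = np.random.default_rng(0)
features = rng.standard_normal((4, 16))   # C = 4 channels, N = 16 points
conv_w = rng.standard_normal((2, 4))      # d = 2 output channels
pooled, weights = sorted_softmax_pool(features, conv_w)

# Shuffling the points (columns) leaves the result unchanged, because the
# row-wise sort removes any dependence on the order of the points.
shuffled = features[:, rng.permutation(16)]
pooled_shuffled, _ = sorted_softmax_pool(shuffled, conv_w)
```

Under these assumptions the pooled result does not depend on the order in which the points are stored, which is the property a pooling step over an unordered point cloud needs.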

In a possible implementation manner, the first point cloud normalized stream includes n reversible residual coupling blocks, where n is an integer greater than 1; each reversible residual coupling block includes a first target layer, a second target layer and a third target layer, each of which is a convolution layer or a fully connected layer; inputting the first hidden code matrix, obtained by combining the hidden code vectors corresponding to each piece of most recently acquired point cloud data, into the forward process of the first point cloud normalized stream to obtain the first target matrix with the same dimensions as the first hidden code matrix includes:

splitting the first hidden code matrix into a first sub-hidden code matrix and a second sub-hidden code matrix with the same dimensions, taking the first sub-hidden code matrix as the first input matrix of the first reversible residual coupling block, and taking the second sub-hidden code matrix as the second input matrix of the first reversible residual coupling block;

calculating to obtain a first output matrix of an ith reversible residual coupling block and a second output matrix of the ith reversible residual coupling block through the following formula, and taking the first output matrix of the ith reversible residual coupling block as a first input matrix of an (i+1) th reversible residual coupling block and taking the second output matrix of the ith reversible residual coupling block as a second input matrix of the (i+1) th reversible residual coupling block when i is smaller than n, wherein the initial value of i is 1;

wherein (the formula and its symbols are not reproduced in this text) the quantities defined are, in order: the first output matrix of the ith reversible residual coupling block; the second output matrix of the ith reversible residual coupling block; a first reference matrix, obtained after passing through the first target layer in the ith reversible residual coupling block; a second reference matrix, obtained after passing through the second target layer in the ith reversible residual coupling block; a third reference matrix, obtained after passing through the third target layer in the ith reversible residual coupling block; the second input matrix of the ith reversible residual coupling block; and the first input matrix of the ith reversible residual coupling block;

incrementing i by 1 and returning to the step of calculating, through the formula, the first output matrix of the ith reversible residual coupling block and the second output matrix of the ith reversible residual coupling block, and, when i is smaller than n, taking the first output matrix of the ith reversible residual coupling block as the first input matrix of the (i+1)th reversible residual coupling block and taking the second output matrix of the ith reversible residual coupling block as the second input matrix of the (i+1)th reversible residual coupling block, until i is equal to n;

and combining the first output matrix of the nth reversible residual coupling block and the second output matrix of the nth reversible residual coupling block to obtain the first target matrix.
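By way of non-limiting illustration, the split, chained coupling blocks and final recombination described above may be sketched as follows. Because the coupling formula itself is not reproduced in this text, the per-block arithmetic in `coupling_forward` is an assumed form, chosen only because it uses three layers in roles compatible with the definitions listed above; `make_layer`, the tanh activation and the column-wise split are likewise assumptions.

```python
import numpy as np

def make_layer(rng, dim):
    # Hypothetical "target layer": a small fully connected layer with tanh.
    w = rng.standard_normal((dim, dim)) * 0.1
    b = np.zeros(dim)
    return lambda x: np.tanh(x @ w + b)

def coupling_forward(x1, x2, f1, f2, f3):
    # ASSUMED coupling form (not taken from the source):
    # f1(x2), f2(x2) play the role of the first and second reference matrices,
    # f3(y1) plays the role of the third reference matrix.
    y1 = x1 * np.exp(f1(x2)) + f2(x2)
    y2 = x2 + f3(y1)
    return y1, y2

def flow_forward(hidden, blocks):
    # Split the hidden code matrix into two halves of equal dimension.
    x1, x2 = np.split(hidden, 2, axis=1)
    # Outputs of block i become the inputs of block i+1, for i = 1..n.
    for f1, f2, f3 in blocks:
        x1, x2 = coupling_forward(x1, x2, f1, f2, f3)
    # Combine the two outputs of the nth block into the first target matrix.
    return np.concatenate([x1, x2], axis=1)

rng = np.random.default_rng(0)
m, latent_dim, n = 8, 6, 3   # 8 point clouds, 6-dimensional hidden codes
blocks = [tuple(make_layer(rng, latent_dim // 2) for _ in range(3))
          for _ in range(n)]
first_hidden = rng.standard_normal((m, latent_dim))
first_target = flow_forward(first_hidden, blocks)
```

In this sketch the first target matrix has the same dimensions as the first hidden code matrix, as the step requires.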

In one possible implementation manner, inputting a second gaussian distribution matrix composed of second target sampling points with the same dimension as the first target matrix into the reverse process of the trained first point cloud normalized flow to obtain a second hidden code matrix with the same dimension as the first target matrix, including:

Splitting the second Gaussian distribution matrix into a first sub-Gaussian distribution matrix and a second sub-Gaussian distribution matrix which have the same dimension, taking the first sub-Gaussian distribution matrix as a third output matrix of an nth reversible residual coupling block, and taking the second sub-Gaussian distribution matrix as a fourth output matrix of the nth reversible residual coupling block;

calculating to obtain a third input matrix of the j-th reversible residual coupling block and a fourth input matrix of the j-th reversible residual coupling block through the following formula, and taking the third input matrix of the j-th reversible residual coupling block as a third output matrix of the j-1-th reversible residual coupling block and taking the fourth input matrix of the j-th reversible residual coupling block as a fourth output matrix of the j-1-th reversible residual coupling block when j is greater than 1, wherein the initial value of j is n;

wherein (the formula and its symbols are not reproduced in this text) the quantities defined are, in order: the third output matrix of the jth reversible residual coupling block; a fourth reference matrix, obtained after passing through the third target layer in the jth reversible residual coupling block; the fourth output matrix of the jth reversible residual coupling block; the fourth input matrix of the jth reversible residual coupling block; a fifth reference matrix, obtained after passing through the second target layer in the jth reversible residual coupling block; a sixth reference matrix, obtained after passing through the first target layer in the jth reversible residual coupling block; and the third input matrix of the jth reversible residual coupling block;

decrementing j by 1 and returning to the step of calculating, through the formula, the third input matrix of the jth reversible residual coupling block and the fourth input matrix of the jth reversible residual coupling block, and, when j is greater than 1, taking the third input matrix of the jth reversible residual coupling block as the third output matrix of the (j-1)th reversible residual coupling block and taking the fourth input matrix of the jth reversible residual coupling block as the fourth output matrix of the (j-1)th reversible residual coupling block, until j is equal to 1;

and combining the third input matrix of the first reversible residual coupling block and the fourth input matrix of the first reversible residual coupling block to obtain the second hidden code matrix.
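By way of non-limiting illustration, the reverse process may be sketched as running the coupling blocks from j = n down to 1, undoing each block in turn. The coupling arithmetic below is an assumed form, since the formula is not reproduced in this text; it was chosen so that the third target layer is applied to the third output matrix and the second and first target layers are applied to the fourth input matrix, matching the order of the definitions above. All function names are illustrative.

```python
import numpy as np

def make_layer(rng, dim):
    # Hypothetical "target layer": a fully connected layer with tanh.
    w = rng.standard_normal((dim, dim)) * 0.1
    return lambda x: np.tanh(x @ w)

def block_forward(x1, x2, f1, f2, f3):
    # ASSUMED forward coupling form (not taken from the source).
    y1 = x1 * np.exp(f1(x2)) + f2(x2)
    y2 = x2 + f3(y1)
    return y1, y2

def block_inverse(y1, y2, f1, f2, f3):
    # Exact algebraic inverse of block_forward.
    x2 = y2 - f3(y1)                        # third target layer applied to y1
    x1 = (y1 - f2(x2)) * np.exp(-f1(x2))    # second and first layers applied to x2
    return x1, x2

def flow_forward(hidden, blocks):
    a, b = np.split(hidden, 2, axis=1)
    for fs in blocks:
        a, b = block_forward(a, b, *fs)
    return np.concatenate([a, b], axis=1)

def flow_inverse(gaussian, blocks):
    # Split into two sub-matrices, then visit blocks n, n-1, ..., 1.
    a, b = np.split(gaussian, 2, axis=1)
    for fs in reversed(blocks):
        a, b = block_inverse(a, b, *fs)
    return np.concatenate([a, b], axis=1)

rng = np.random.default_rng(1)
blocks = [tuple(make_layer(rng, 3) for _ in range(3)) for _ in range(4)]
second_gaussian = rng.standard_normal((8, 6))          # standard Gaussian samples
second_hidden = flow_inverse(second_gaussian, blocks)  # second hidden code matrix
roundtrip = flow_forward(second_hidden, blocks)        # should recover the input
```

Under the assumed coupling form, applying the forward process to the recovered hidden code matrix reproduces the Gaussian matrix up to floating-point error, which is what makes the blocks reversible.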

In a second aspect, an embodiment of the present application further provides a point cloud generating device, where the device includes:

the acquisition module is used for acquiring point cloud data of each of m objects of a target class, wherein m is an integer greater than 1;

the first processing module is configured to perform, for each piece of most recently acquired point cloud data, a target processing process on the point cloud data using a point cloud encoder to obtain a mean vector corresponding to the point cloud data and a variance vector corresponding to the point cloud data, where the target processing process includes: performing first convolution processing on the point cloud data to obtain an initial feature matrix of the point cloud data; obtaining, based on the initial feature matrix, a first spatial feature matrix representing the position information, in a feature space, of each point in the point cloud data and each of its first neighbor points, and a second spatial feature matrix representing the position information, in a Cartesian space, of each point in the point cloud data and each of its second neighbor points, wherein, for each point in the point cloud data, the first neighbor points of the point are the k points in the point cloud data closest to the point in the feature space, the second neighbor points of the point are the k points in the point cloud data closest to the point in the Cartesian space, and k is an integer greater than 1; performing first pooling processing on a comprehensive spatial feature matrix to obtain a first pooled feature matrix, and performing second pooling processing on the comprehensive spatial feature matrix to obtain a second pooled feature matrix, wherein the comprehensive spatial feature matrix is obtained by adding the first spatial feature matrix and the second spatial feature matrix, or by combining the first spatial feature matrix and the second spatial feature matrix; converting the first pooled feature matrix into the mean vector through a first fully connected layer, and converting the second pooled feature matrix into the variance vector through a second fully connected layer;

The second processing module is used for obtaining a hidden code vector corresponding to the point cloud data based on the mean value vector, the variance vector and a Gaussian distribution vector formed by a first target sampling point, wherein the first target sampling point is obtained by randomly sampling noise of which the probability density function accords with standard Gaussian distribution;

the third processing module is used for inputting a first hidden code matrix, obtained by combining the hidden code vectors corresponding to each piece of most recently acquired point cloud data, into a forward process of a first point cloud normalized stream to obtain a first target matrix with the same dimensions as the first hidden code matrix;

the fourth processing module is used for inputting each point cloud data acquired recently and the first hidden code matrix into a reversible point cloud decoder, and carrying out a forward process of a target flow to obtain a second target matrix, wherein the target flow is a second point cloud normalized flow or a diffusion of a Markov chain;

the computing module is used for calculating a first loss value based on the first hidden code matrix and the first target matrix, and calculating a second loss value based on a first Gaussian distribution matrix and the second target matrix, wherein the first Gaussian distribution matrix has the same dimensions as the second target matrix and is formed by first target three-dimensional sampling points, the first target three-dimensional sampling points being obtained by randomly sampling three-dimensional noise whose probability density function conforms to a standard Gaussian distribution;

The optimizing module is used for optimizing at least one of the first point cloud normalized stream, the reversible point cloud decoder and the point cloud encoder based on a gradient descent method if the latest first loss value is larger than a first preset loss value and/or the latest second loss value is larger than a second preset loss value, and re-delivering the first point cloud normalized stream, the reversible point cloud decoder and the point cloud encoder to the acquiring module for processing until the latest first loss value is smaller than or equal to the first preset loss value and the latest second loss value is smaller than or equal to the second preset loss value, so that the current first point cloud normalized stream is used as a first point cloud normalized stream after training is completed, and the current reversible point cloud decoder is used as a reversible point cloud decoder after training is completed;

a fifth processing module, configured to input a second gaussian distribution matrix, which is identical to the first target matrix in dimension and is formed by second target sampling points, into a reverse process of the trained first point cloud normalized stream, to obtain a second hidden code matrix, which is identical to the first target matrix in dimension, where the second target sampling points are obtained by randomly sampling noise whose probability density function meets a standard gaussian distribution;

the point cloud generating module is used for inputting the second hidden code matrix and a third Gaussian distribution matrix, which has the same dimensions as the first Gaussian distribution matrix and is formed by second target three-dimensional sampling points, into the trained reversible point cloud decoder, and performing the reverse process of the target flow to generate a new point cloud of the target class, wherein the second target three-dimensional sampling points are obtained by randomly sampling three-dimensional noise whose probability density function conforms to a standard Gaussian distribution.

In a possible implementation manner, the first point cloud normalized stream includes n reversible residual coupling blocks, where n is an integer greater than 1; each reversible residual coupling block includes a first target layer, a second target layer and a third target layer, each of which is a convolution layer or a fully connected layer; the third processing module is specifically configured to:

In a possible implementation manner, the fifth processing module is specifically configured to:

wherein (the formula and its symbols are not reproduced in this text) the quantities defined are, in order: the third output matrix of the jth reversible residual coupling block; a fourth reference matrix, obtained after passing through the third target layer in the jth reversible residual coupling block; the fourth output matrix of the jth reversible residual coupling block; the fourth input matrix of the jth reversible residual coupling block; a fifth reference matrix, obtained after passing through the second target layer in the jth reversible residual coupling block; a sixth reference matrix, obtained after passing through the first target layer in the jth reversible residual coupling block; and the third input matrix of the jth reversible residual coupling block;

In a third aspect, an embodiment of the present application further provides an electronic device, including: a processor, a storage medium, and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor in communication with the storage medium via the bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of the point cloud generation method of any of the first aspects.

In a fourth aspect, embodiments of the present application further provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the point cloud generation method according to any of the first aspects.

According to the point cloud generation method, the device, the electronic equipment and the storage medium, the point cloud generated by the reversible point cloud decoder after training is completed can have richer details.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered limiting the scope, and that other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 shows a flowchart of a point cloud generation method provided in an embodiment of the present application;

Fig. 2 shows a comparison schematic of generated point clouds provided in an embodiment of the present application;

Fig. 3 shows a schematic structural diagram of a point cloud generating device according to an embodiment of the present application;

Fig. 4 shows a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it should be understood that the accompanying drawings in the present application are only for the purpose of illustration and description, and are not intended to limit the protection scope of the present application. In addition, it should be understood that the schematic drawings are not drawn to scale. A flowchart, as used in this application, illustrates operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be implemented out of order and that steps without logical context may be performed in reverse order or concurrently. Moreover, one or more other operations may be added to the flow diagrams and one or more operations may be removed from the flow diagrams as directed by those skilled in the art.

In addition, the described embodiments are only some, but not all, of the embodiments of the present application. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.

It should be noted that the term "comprising" will be used in the embodiments of the present application to indicate the presence of the features stated hereinafter, but not to exclude the addition of other features.

In order to facilitate understanding of the present embodiment, a method, an apparatus, an electronic device, and a storage medium for generating a point cloud provided in the embodiments of the present application are described in detail.

Referring to Fig. 1, a flowchart of a point cloud generation method according to an embodiment of the present application is shown, where the method includes:

S101, acquiring point cloud data of each of m objects of a target class, wherein m is an integer greater than 1.

Illustratively, the target class may be an airplane class, a table class, a chair class, or the like. When the target class is the airplane class, the m objects of the target class may be m identical or different airplanes.

For example, m may be 8, in which case each subsequent training iteration uses a group of 8 pieces of point cloud data.

S102, for each piece of most recently acquired point cloud data, performing a target processing process on the point cloud data using a point cloud encoder to obtain a mean vector corresponding to the point cloud data and a variance vector corresponding to the point cloud data, wherein the target processing process comprises the following steps: performing first convolution processing on the point cloud data to obtain an initial feature matrix of the point cloud data; obtaining, based on the initial feature matrix, a first spatial feature matrix representing the position information, in a feature space, of each point in the point cloud data and each of its first neighbor points, and a second spatial feature matrix representing the position information, in a Cartesian space, of each point in the point cloud data and each of its second neighbor points, wherein, for each point in the point cloud data, the first neighbor points of the point are the k points in the point cloud data closest to the point in the feature space, the second neighbor points of the point are the k points in the point cloud data closest to the point in the Cartesian space, and k is an integer greater than 1; performing first pooling processing on a comprehensive spatial feature matrix to obtain a first pooled feature matrix, and performing second pooling processing on the comprehensive spatial feature matrix to obtain a second pooled feature matrix, wherein the comprehensive spatial feature matrix is obtained by adding the first spatial feature matrix and the second spatial feature matrix, or by combining the first spatial feature matrix and the second spatial feature matrix; the first pooled feature matrix is converted into the mean vector by a first fully connected layer, and the second pooled feature matrix is converted into the variance vector by a second fully connected layer.

The first improvement of the present application over the prior art lies here: the prior art does not include the step of generating the first spatial feature matrix and the second spatial feature matrix from the initial feature matrix, but pools the initial feature matrix directly. With the approach of the present application, the finally generated point cloud can have richer details.

Preferably, k may be 10; when k is 10, a better balance between generation quality and computing resources can be achieved for the finally generated point cloud.
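The neighbor search of step S102 can be illustrated with a minimal sketch. It assumes Euclidean distance and assumes that a point is not counted as its own neighbor; neither detail is stated in the text, and the function name `k_nearest_neighbors` is hypothetical. The same routine could be applied to feature vectors instead of coordinates to obtain neighbors in the feature space.

```python
import math

def k_nearest_neighbors(points, k):
    """Return, for each point, the indices of its k closest other points.

    Assumptions (not from the source): Euclidean distance, and a point is
    not counted as its own neighbor.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    neighbors = []
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with its index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors

# Toy point cloud with four points in Cartesian space.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (5.0, 5.0, 5.0)]
knn = k_nearest_neighbors(cloud, k=2)
```

This brute-force search costs time quadratic in the number of points, which is one reason the choice of k and of the search method affects computing resources.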

Illustratively, the pooling process may be a maximum pooling process or an average pooling process, or the like.
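As a rough illustration of the two conventional pooling options named above, the following sketch pools a small feature matrix over its points. It assumes that each row is one point and each column is one feature channel, which the text does not state; the matrix values are made up.

```python
def max_pool(features):
    """Column-wise maximum over all points (rows)."""
    return [max(col) for col in zip(*features)]

def average_pool(features):
    """Column-wise mean over all points (rows)."""
    return [sum(col) / len(col) for col in zip(*features)]

# Hypothetical integrated spatial feature matrix: 3 points, 2 channels.
integrated = [[1.0, 8.0], [4.0, 2.0], [7.0, 5.0]]
pooled_max = max_pool(integrated)
pooled_avg = average_pool(integrated)
```

Both operations reduce a matrix with one row per point to a single vector, so the result does not depend on the order of the points.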

S103, based on the mean vector, the variance vector and a Gaussian distribution vector formed by first target sampling points, obtaining a hidden code vector corresponding to the point cloud data, wherein the first target sampling points are obtained by randomly sampling noise whose probability density function follows a standard Gaussian distribution.

This step is the process of re-parameterization.
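A minimal sketch of re-parameterization follows. It assumes that the variance vector holds variances (not log-variances or standard deviations), so the standard deviation is its square root; the text does not say which convention is used, and the function name `reparameterize` is hypothetical.

```python
import math
import random

def reparameterize(mean, variance, rng=random):
    """Return z = mean + sqrt(variance) * eps with eps ~ N(0, 1) per component.

    Assumption (not from the source): `variance` holds plain variances.
    """
    if len(mean) != len(variance):
        raise ValueError("mean and variance must have the same length")
    # Gaussian distribution vector formed by standard-normal samples.
    eps = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + math.sqrt(v) * e for m, v, e in zip(mean, variance, eps)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
z = reparameterize([0.0, 1.0, -1.0], [1.0, 0.25, 0.0], rng)
```

Because the randomness enters only through the sampled noise, the hidden code vector remains a differentiable function of the mean and variance, which is what allows gradient descent to update the encoder.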

S104, inputting a first hidden code matrix, obtained by merging the hidden code vectors corresponding to each piece of the most recently acquired point cloud data, into the forward process of a first point cloud normalized stream to obtain a first target matrix with the same dimension as the first hidden code matrix.

For example, if there are 3 hidden code vectors, namely [1 2 3], [4 5 6] and [7 8 9], then the first hidden code matrix can be $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$.

S105, inputting each piece of the most recently acquired point cloud data and the first hidden code matrix into a reversible point cloud decoder, and performing a forward process of a target flow to obtain a second target matrix, wherein the target flow is a second point cloud normalized stream or a Markov chain diffusion.

S106, calculating a first loss value based on the first hidden code matrix and the first target matrix, and calculating a second loss value based on a first Gaussian distribution matrix and the second target matrix, wherein the first Gaussian distribution matrix has the same dimension as the second target matrix and is formed by first target three-dimensional sampling points, and the first target three-dimensional sampling points are obtained by randomly sampling three-dimensional noise whose probability density function follows a standard Gaussian distribution.

For example, the first loss value and the second loss value may be calculated based on a mean square error or the like.
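A mean-square-error loss between two matrices of equal dimension could be computed as in the sketch below. The text only says the losses may be "based on" a mean square error, so pairing the two matrices element by element in this way is an assumption, and the values are made up.

```python
def mean_squared_error(a, b):
    """Mean of squared element-wise differences between two equal-shape matrices."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count

# Hypothetical matrices standing in for the first hidden code matrix
# and the first target matrix.
first_hidden_code = [[1.0, 2.0], [3.0, 4.0]]
first_target = [[1.0, 1.0], [3.0, 2.0]]
loss = mean_squared_error(first_hidden_code, first_target)
```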

S107, if the latest first loss value is greater than a first preset loss value and/or the latest second loss value is greater than a second preset loss value, optimizing at least one of the first point cloud normalized stream, the reversible point cloud decoder and the point cloud encoder based on a gradient descent method, and repeating steps S101-S106 until the latest first loss value is less than or equal to the first preset loss value and the latest second loss value is less than or equal to the second preset loss value, whereupon the current first point cloud normalized stream is taken as the trained first point cloud normalized stream and the current reversible point cloud decoder is taken as the trained reversible point cloud decoder.

That is, training is performed according to steps S101-S107 on one group of m pieces of point cloud data at a time, and at least one of the point cloud encoder, the first point cloud normalized stream and the reversible point cloud decoder is optimized based on a gradient descent method until the loss values corresponding to the latest group of point cloud data (namely, the latest loss values) are less than or equal to the preset loss values, so as to obtain the trained point cloud encoder and the trained reversible point cloud decoder.
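The stopping rule of steps S101-S107 can be sketched as a simple loop. Here `compute_losses` and `update` are hypothetical callables standing in for steps S101-S106 and for one gradient-descent update, respectively, and `max_rounds` is a safety limit added for this sketch that does not appear in the text.

```python
def train_until_converged(compute_losses, update,
                          first_preset, second_preset, max_rounds=1000):
    """Repeat rounds until both losses are at or below their preset values.

    compute_losses: hypothetical stand-in for steps S101-S106; returns
                    (first_loss, second_loss) for the latest group.
    update:         hypothetical stand-in for one gradient-descent update.
    max_rounds:     safety limit added for this sketch only.
    """
    for round_index in range(1, max_rounds + 1):
        first_loss, second_loss = compute_losses()
        if first_loss <= first_preset and second_loss <= second_preset:
            return round_index, first_loss, second_loss
        update()
    raise RuntimeError("losses did not reach the preset values")

# Toy stand-ins: each update halves both losses.
state = {"first": 8.0, "second": 4.0}

def toy_losses():
    return state["first"], state["second"]

def toy_update():
    state["first"] *= 0.5
    state["second"] *= 0.5

rounds, final_first, final_second = train_until_converged(
    toy_losses, toy_update, first_preset=1.0, second_preset=1.0)
```

Training continues while either loss exceeds its preset value, which matches the "and/or" condition of step S107: both losses must be at or below their thresholds before the loop stops.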

S108, inputting a second Gaussian distribution matrix which is the same as the first target matrix in dimension and is formed by second target sampling points into the reverse process of the trained first point cloud normalized flow to obtain a second hidden code matrix which is the same as the first target matrix in dimension, wherein the second target sampling points are obtained by randomly sampling noise of which the probability density function accords with standard Gaussian distribution.

Steps S108 to S109 constitute the test (use) process after training. The training process (i.e. steps S101 to S107) uses the forward process of the first point cloud normalized stream and the forward process of the target flow, whereas the test process (i.e. steps S108 and S109) uses the reverse process of the trained first point cloud normalized stream and the reverse process of the target flow (performed by the trained reversible point cloud decoder).

Referring to fig. 2, a comparative schematic diagram of generated point clouds is provided for an embodiment of the present application, showing point clouds generated in the manner of the present application and point clouds generated by the prior art (diffionpm). It can be clearly seen that the point clouds generated in the manner of the present application are far better than those generated by the prior art (diffionpm), both overall and in detail.

Exemplarily, if the first spatial feature matrix is [matrix not reproduced], then the first candidate feature matrix is [matrix not reproduced] or [matrix not reproduced].

Assume that the second candidate feature matrix is [2 8]; then, correspondingly, the first weight matrix is [matrix not reproduced].

Here lies the second improvement of the present application over the prior art: compared with the maximum pooling and average pooling of the prior art, a new adaptive weighted pooling method is provided.

The second pooling process is the same as the first pooling process, and will not be described herein.

Here lies the third improvement of the present application over the prior art: specifically, the present application changes the coupling manner (i.e. modifies the specific formula) of the coupling layer (i.e. the reversible residual coupling block) in the first point cloud normalized stream.

Illustratively, experiments show that the coupling effect of the coupling layer (i.e. the reversible residual coupling block) is excellent when n is 14.

The test process uses the reverse process of the first point cloud normalized stream, which corresponds to the forward process of the first point cloud normalized stream in the preceding steps.
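How a coupling block can be run forward during training and in reverse during testing can be illustrated with a generic additive coupling block. This is a standard textbook construction substituted here for illustration; it is not the modified coupling formula of the present application, and the functions `f` and `g` are arbitrary stand-ins for learned layers.

```python
import math

def f(v):
    """Arbitrary stand-in for a learned layer; it need not be invertible."""
    return [math.tanh(x) for x in v]

def g(v):
    """Second arbitrary stand-in for a learned layer."""
    return [0.5 * x * x for x in v]

def coupling_forward(x1, x2):
    """Forward process: y1 = x1 + f(x2), then y2 = x2 + g(y1)."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2

def coupling_inverse(y1, y2):
    """Reverse process: undo the two additions in the opposite order."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2

x1, x2 = [0.3, -1.2], [2.0, 0.7]
y1, y2 = coupling_forward(x1, x2)
r1, r2 = coupling_inverse(y1, y2)  # recovers x1, x2 up to rounding
```

The output has the same dimension as the input, and the inverse is exact even though `f` and `g` are not themselves invertible. A stack of n such blocks is inverted by applying the block inverses in the opposite order.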

According to the point cloud generation method, the point cloud generated by the reversible point cloud decoder after training is completed can have richer details.

Based on the same inventive concept, the embodiment of the present application further provides a point cloud generating device corresponding to the point cloud generating method in the embodiment, and since the principle of solving the problem of the device in the embodiment of the present application is similar to that of the point cloud generating method in the embodiment of the present application, the implementation of the device may refer to the implementation of the method, and the repetition is omitted.

Referring to fig. 3, a schematic structural diagram of a point cloud generating device according to an embodiment of the present application is shown, where the device includes:

an obtaining module 301, configured to obtain point cloud data of each of m objects of a target class, where m is an integer greater than 1;

the first processing module 302 is configured to perform, for each piece of the most recently acquired point cloud data, a target processing process on the point cloud data by using a point cloud encoder, to obtain a mean vector and a variance vector corresponding to the point cloud data, where the target processing process includes: performing first convolution processing on the point cloud data to obtain an initial feature matrix of the point cloud data; based on the initial feature matrix, obtaining a first spatial feature matrix representing the position information, in a feature space, of each point in the point cloud data and each of its first neighbor points, and obtaining a second spatial feature matrix representing the position information, in a Cartesian space, of each point in the point cloud data and each of its second neighbor points, where, for each point in the point cloud data, the first neighbor points of the point are the k points in the point cloud data closest to the point in the feature space, the second neighbor points of the point are the k points in the point cloud data closest to the point in the Cartesian space, and k is an integer greater than 1; performing first pooling processing on an integrated spatial feature matrix to obtain a first pooled feature matrix, and performing second pooling processing on the integrated spatial feature matrix to obtain a second pooled feature matrix, where the integrated spatial feature matrix is obtained by adding the first spatial feature matrix and the second spatial feature matrix, or by combining the first spatial feature matrix and the second spatial feature matrix; converting the first pooled feature matrix into the mean vector through a first fully connected layer, and converting the second pooled feature matrix into the variance vector through a second fully connected layer;

the second processing module 303 is configured to obtain a hidden code vector corresponding to the point cloud data based on the mean vector, the variance vector, and a Gaussian distribution vector formed by first target sampling points, where the first target sampling points are obtained by randomly sampling noise whose probability density function follows a standard Gaussian distribution;

the third processing module 304 is configured to input a first hidden code matrix, obtained by merging the hidden code vectors corresponding to each piece of the most recently acquired point cloud data, into a forward process of a first point cloud normalized stream, so as to obtain a first target matrix with the same dimension as the first hidden code matrix;

a fourth processing module 305, configured to input each piece of the most recently acquired point cloud data and the first hidden code matrix into a reversible point cloud decoder, and perform a forward process of a target flow to obtain a second target matrix, where the target flow is a second point cloud normalized stream or a Markov chain diffusion;

the calculating module 306 is configured to calculate a first loss value based on the first hidden code matrix and the first target matrix, and calculate a second loss value based on a first Gaussian distribution matrix and the second target matrix, where the first Gaussian distribution matrix has the same dimension as the second target matrix and is formed by first target three-dimensional sampling points, and the first target three-dimensional sampling points are obtained by randomly sampling three-dimensional noise whose probability density function follows a standard Gaussian distribution;

an optimizing module 307, configured to, if the latest first loss value is greater than a first preset loss value and/or the latest second loss value is greater than a second preset loss value, optimize at least one of the first point cloud normalized stream, the reversible point cloud decoder, and the point cloud encoder based on a gradient descent method, and have the processing performed again starting from the obtaining module 301, until the latest first loss value is less than or equal to the first preset loss value and the latest second loss value is less than or equal to the second preset loss value, so as to take the current first point cloud normalized stream as the trained first point cloud normalized stream and the current reversible point cloud decoder as the trained reversible point cloud decoder;

a fifth processing module 308, configured to input a second Gaussian distribution matrix, which has the same dimension as the first target matrix and is formed by second target sampling points, into a reverse process of the trained first point cloud normalized stream, to obtain a second hidden code matrix with the same dimension as the first target matrix, where the second target sampling points are obtained by randomly sampling noise whose probability density function follows a standard Gaussian distribution;

the point cloud generating module 309 is configured to input a third Gaussian distribution matrix, which has the same dimension as the first Gaussian distribution matrix and is formed by second target three-dimensional sampling points, together with the second hidden code matrix, into the trained reversible point cloud decoder, perform a reverse process of the target flow, and generate a new point cloud of the target class, where the second target three-dimensional sampling points are obtained by randomly sampling three-dimensional noise whose probability density function follows a standard Gaussian distribution.

In a possible implementation manner, the first point cloud normalized stream includes n reversible residual coupling blocks, where n is an integer greater than 1; each reversible residual coupling block includes a first target layer, a second target layer and a third target layer, each of which is a convolution layer or a fully connected layer; the third processing module 304 is specifically configured to:


wherein the symbols of the formula (not reproduced here) denote, in the order in which they are defined: the first output matrix of the i-th reversible residual coupling block; the second output matrix of the i-th reversible residual coupling block; a first reference matrix, obtained after the corresponding matrix passes through the first target layer in the i-th reversible residual coupling block; a second reference matrix, obtained after the corresponding matrix passes through the second target layer in the i-th reversible residual coupling block; a third reference matrix, obtained after the corresponding matrix passes through the third target layer in the i-th reversible residual coupling block; the second input matrix of the i-th reversible residual coupling block; and the first input matrix of the i-th reversible residual coupling block;

In a possible implementation manner, the fifth processing module 308 is specifically configured to:

According to the point cloud generating device, the point cloud generated by the reversible point cloud decoder after training is completed can be provided with richer details.

Referring to fig. 4, an electronic device 400 provided in an embodiment of the present application includes: a processor 401, a memory 402 and a bus, said memory 402 storing machine-readable instructions executable by said processor 401, said processor 401 communicating with said memory 402 via the bus when the electronic device is running, said processor 401 executing said machine-readable instructions to perform the steps of the method of point cloud generation as described above.

Specifically, the above-mentioned memory 402 and the processor 401 can be general-purpose memories and processors, and are not limited herein, and the above-mentioned method of generating a point cloud can be performed when the processor 401 runs a computer program stored in the memory 402.

Corresponding to the above method for generating the point cloud, the embodiment of the application further provides a computer readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the steps of the above method for generating the point cloud are executed.

It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working procedures of the above-described system and apparatus may refer to the corresponding procedures in the method embodiments, and are not described in detail in this application. In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. The above-described apparatus embodiments are merely illustrative; the division of the modules is merely a division by logical function, and other divisions are possible in actual implementation. For example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interfaces, devices or modules, and may be in electrical, mechanical, or other forms.

The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk, etc.

The foregoing is merely a specific embodiment of the present application, but the protection scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes or substitutions are covered in the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method of generating a point cloud, the method comprising:

2. The method of generating a point cloud according to claim 1, wherein performing a first pooling process on the integrated spatial feature matrix to obtain a first pooled feature matrix includes:

3. The method according to claim 1, wherein the first point cloud normalized stream includes n reversible residual coupling blocks, where n is an integer greater than 1; each reversible residual coupling block includes a first target layer, a second target layer, and a third target layer, each of which is a convolution layer or a fully connected layer; and inputting a first hidden code matrix, obtained by merging the hidden code vectors corresponding to each piece of the most recently acquired point cloud data, into a forward process of the first point cloud normalized stream to obtain a first target matrix with the same dimension as the first hidden code matrix comprises:

wherein the symbols of the formula (not reproduced here) denote, in the order in which they are defined: the first output matrix of the i-th reversible residual coupling block; the second output matrix of the i-th reversible residual coupling block; a first reference matrix, obtained after the corresponding matrix passes through the first target layer in the i-th reversible residual coupling block; a second reference matrix, obtained after the corresponding matrix passes through the second target layer in the i-th reversible residual coupling block; a third reference matrix, obtained after the corresponding matrix passes through the third target layer in the i-th reversible residual coupling block; the second input matrix of the i-th reversible residual coupling block; and the first input matrix of the i-th reversible residual coupling block;

4. The method of generating a point cloud according to claim 3, wherein inputting a second gaussian distribution matrix composed of second target sampling points with the same dimension as the first target matrix into a reverse process of the trained first point cloud normalized stream to obtain a second hidden code matrix with the same dimension as the first target matrix, comprises:

wherein the symbols of the formula (not reproduced here) denote, in the order in which they are defined: the third output matrix of the j-th reversible residual coupling block; a fourth reference matrix, obtained after the corresponding matrix passes through the third target layer in the j-th reversible residual coupling block; the fourth output matrix of the j-th reversible residual coupling block; the fourth input matrix of the j-th reversible residual coupling block; a fifth reference matrix, obtained after the corresponding matrix passes through the second target layer in the j-th reversible residual coupling block; a sixth reference matrix, obtained after the corresponding matrix passes through the first target layer in the j-th reversible residual coupling block; and the third input matrix of the j-th reversible residual coupling block;

5. A point cloud generating apparatus, the apparatus comprising:

6. The point cloud generating apparatus of claim 5, wherein performing a first pooling process on the integrated spatial feature matrix to obtain a first pooled feature matrix comprises:

7. The point cloud generating apparatus of claim 5, wherein the first point cloud normalized stream includes n reversible residual coupling blocks, where n is an integer greater than 1; each reversible residual coupling block includes a first target layer, a second target layer, and a third target layer, each of which is a convolution layer or a fully connected layer; and the third processing module is specifically configured to:

8. The point cloud generating apparatus of claim 7, wherein the fifth processing module is specifically configured to:

9. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over the bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of the point cloud generation method of any of claims 1 to 4.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of the point cloud generation method according to any of claims 1 to 4.