CN115456900A - Improved Transformer-based Qin tomb warrior fragment denoising method - Google Patents
Improved Transformer-based Qin tomb warrior fragment denoising method
- Publication number
- CN115456900A CN115456900A CN202211133859.0A CN202211133859A CN115456900A CN 115456900 A CN115456900 A CN 115456900A CN 202211133859 A CN202211133859 A CN 202211133859A CN 115456900 A CN115456900 A CN 115456900A
- Authority
- CN
- China
- Prior art keywords
- point
- point cloud
- sampling
- manifold
- points
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
Abstract
The invention discloses an improved Transformer-based Qin tomb warrior fragment denoising method, comprising the following steps: 1. preprocess the point cloud samples of the Qin warrior data; 2. import the preprocessed point cloud samples as a training set into an input embedding module and map them to a high-dimensional space; 3. import the high-dimensional point cloud into the adaptive down-sampling module of the Transformer encoder, where farthest point sampling (FPS) obtains relatively uniform points as the initial sampling points and adaptive sampling (AS) automatically learns an offset for each sampling point and updates its position, reducing the data volume while preserving the structural attributes of the original point cloud model; 4. import the down-sampled result into the Transformer encoder module and enhance the point cloud features through the relative attention (RA) module for effective feature extraction; 5. based on the output of the Transformer decoder, select points closer to the clean point cloud with an adaptive sampling method to reconstruct the three-dimensional surface; 6. iteratively train on the imported data until the loss value is small and stable, obtaining the denoised clean point cloud with better robustness to high noise.
Description
Technical Field
The invention belongs to the technical field of cultural relic protection, and particularly relates to an improved Transformer-based method for denoising fragments of Qin warriors.
Background
In the field of cultural relic excavation and protection, the initial digital acquisition of relic fragments is affected by many factors, such as the measuring equipment, the external environment and the surface characteristics of the measured object, so the initial point cloud data obtained by scanning often contains a large number of noise points. The more noise points there are, the greater their influence on point cloud quality, directly affecting the accuracy and efficiency of later tasks such as feature extraction, registration, surface reconstruction and visualization. Denoising the acquired initial digitized point cloud data is therefore an important research topic in this field.
Among traditional denoising methods, the surface-fitting approach first fits a surface to the three-dimensional scanned point cloud of an object, then computes the distance between each point and the fitted surface, and finally deletes gross errors or abnormal values according to some criterion to denoise the point cloud data. It is a simple and effective estimation method, but its accuracy is limited, with large calculation errors for complex or noisy models. The moving robust principal component analysis method based on sparse representation theory estimates point positions by local averaging, preserves sharp features through weighted minimization, and removes noise by updating point positions with weights that measure the similarity between normal vectors in local neighborhoods. When the noise level is high, however, its performance tends to degrade due to over-smoothing or over-sharpening.
In recent years, artificial intelligence methods represented by deep learning have achieved a series of important breakthroughs and received unprecedented attention. PointNet pioneered applying a deep learning model directly to point clouds for feature learning; to ensure permutation invariance it applies a normalized rotation matrix to the point cloud, which makes the points overly independent, and to achieve order independence the network extracts a global feature from all point cloud data with a global pooling operation, ignoring the geometric relevance between points and losing part of the local feature information. Improved networks based on PointNet, such as PointNet++, Neural Projection, PointCleanNet and TotalDenoising, take the local nature of points into account to improve model performance. These methods can infer the displacement of noise from the underlying surface and reconstruct points, but the points are not constrained to explicitly restore the surface, possibly resulting in sub-optimal denoising results.
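The order independence obtained from global pooling can be illustrated with a minimal Python sketch; the learned per-point MLP is replaced here by a hypothetical fixed function, purely to show that a symmetric aggregation is invariant to the ordering of the input points:

```python
def point_mlp(p):
    # hypothetical stand-in for the learned per-point feature extractor
    x, y, z = p
    return [x + y, y * z, x - z]

def global_max_pool(points):
    # symmetric aggregation: element-wise max over all per-point features,
    # so the result is invariant to the ordering of the input points
    feats = [point_mlp(p) for p in points]
    return [max(f[d] for f in feats) for d in range(len(feats[0]))]

cloud = [(0.0, 1.0, 2.0), (3.0, -1.0, 0.5), (-2.0, 0.0, 1.0)]
g1 = global_max_pool(cloud)
g2 = global_max_pool(list(reversed(cloud)))  # same cloud, permuted order
```

Shuffling the input points leaves the pooled feature unchanged, which is exactly the property the pooling operation guarantees; it is also why purely global pooling discards the local geometric relations criticized above.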
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an improved Transformer-based method for denoising fragments of Qin warriors, which learns the potential manifold of the noisy point cloud and captures its inherent structure to recover the surface and reconstruct the manifold, with better robustness to high noise.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method for denoising Qin warriors fragments based on an improved Transformer comprises the following steps:
step 4, importing the down-sampled result into the improved Transformer encoder module, and enhancing the point cloud features through the relative attention RA module of the point cloud for effective feature extraction;
step 5, based on the output of the improved Transformer decoder, reconstructing the manifold of each point, sampling each point's manifold structure proportionally, and selecting points closer to the clean point cloud with an adaptive sampling method to reconstruct the three-dimensional surface;
step 6, continuously repeating the iterative training of steps 3 to 5 on the imported data with the improved Transformer encoder-decoder framework until the loss value of the loss function is small and tends to be stable, obtaining the denoised clean point cloud.
Further, the data enhancement of step 1 includes rotating, translating and scaling the data.
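The rotation, translation and scaling of the data enhancement can be sketched as follows; rotating about the z-axis only is an illustrative simplification, not the patent's exact augmentation pipeline:

```python
import math

def augment(points, angle=0.0, translation=(0.0, 0.0, 0.0), scale=1.0):
    """Rotate each point about the z-axis, then scale, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    tx, ty, tz = translation
    out = []
    for x, y, z in points:
        rx, ry, rz = c * x - s * y, s * x + c * y, z  # z-axis rotation
        out.append((scale * rx + tx, scale * ry + ty, scale * rz + tz))
    return out

cloud = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0)]
rotated = augment(cloud, angle=math.pi / 2)  # 90 degrees about z
```

With the default arguments the transform is the identity, and any rotation or rigid translation preserves pairwise distances, which is why these augmentations enlarge the training set without distorting the fragment geometry.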
Further, the adaptive neighborhood sampling algorithm AS of step 3 includes the following steps:
step 3.1, let P_s be a set of N_s points sampled from the N input points, with N_s < N; x_i ∈ P_s is a sampling point and f_i ∈ F_s is its feature; the neighbors of x_i are grouped by a k-NN query and their features are updated with a standard self-attention mechanism;
step 3.2, the k neighbors x_{i,1}, …, x_{i,k} of the sampling point x_i, with corresponding features f_{i,1}, …, f_{i,k}, are aggregated as:

f_i* = A( R(x_i, x_{i,j}), γ(x_{i,j}) ), j = 1, …, k    (1)

where A is the aggregation function, R describes the high-level relation between the sampling point x_i and the neighbor point x_{i,j}, and γ changes the feature dimension of each neighbor point; to reduce the amount of calculation, let γ(x_{i,j}) = W_γ f_{i,j}. The relation function R is expressed as:

R(x_i, x_{i,j}) = (W_q f_i)ᵀ (W_k f_{i,j}) / √D'    (2)

where D' is the output channel of the Conv layer;
step 3.3, for each sampling point x_i, use MLP + Softmax to obtain the normalized weights W_p and W_f over the coordinates and feature channels of each point in the group, expressed as:

W_p = Softmax(mlp(x_{i,j}))    (3)

W_f = Softmax(mlp(f_{i,j}))    (4)

in formulas (3) and (4), j ∈ [1, k];
step 3.4, the adaptive update of the sampling point x_i and its feature f_i is realized by weighted summation:

x_i* = Σ_{j=1}^{k} W_p,j · x_{i,j}    (5)

f_i* = Σ_{j=1}^{k} W_f,j · f_{i,j}    (6)

that is, (x_i*, f_i*) is the updated point information;
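The normalize-then-weighted-sum update of steps 3.3 and 3.4 can be sketched as follows; the learned MLP that scores each neighbor is replaced by a hypothetical distance-based score, since only the softmax-weighting mechanics are being illustrated:

```python
import math

def softmax(scores):
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    total = sum(e)
    return [v / total for v in e]

def adaptive_update(neighbors, score_fn):
    # W_p = Softmax(mlp(x_ij)): normalized weights over the k neighbors
    w = softmax([score_fn(n) for n in neighbors])
    # x_i* = sum_j W_p[j] * x_ij: weighted-sum update of the sampling point
    return tuple(sum(w[j] * neighbors[j][d] for j in range(len(neighbors)))
                 for d in range(3))

center = (0.0, 0.0, 0.0)
neighbors = [(0.1, 0.0, 0.0), (0.0, 0.2, 0.0), (2.0, 2.0, 2.0)]
# hypothetical score: prefer neighbors close to the current sampling point
score = lambda n: -sum((a - b) ** 2 for a, b in zip(n, center))
updated = adaptive_update(neighbors, score)
```

Because the weights are a softmax, the updated point is a convex combination of its neighbors and stays within their bounding box; a distant outlier neighbor receives an exponentially small weight, which is how the AS module moves sampling points toward the underlying surface.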
further, the relative attention RA module of the point cloud in step 4 is used to calculate the relative attention feature between the self-attention SA module feature and the input feature, and is expressed as:
F_ra = F_in − F_sa    (7)
where F_ra is the relative attention feature, F_in is the input feature, and F_sa is the self-attention SA module feature;
finally, the relative attention feature F_ra is passed through the network and added to the input feature F_in to form the final output feature F_out of the whole relative attention RA module of the point cloud, expressed as:
F_out = RA(F_in) = relu(bn(mlp(F_ra))) + F_in    (8)
further, the self-attention SA module is characterized by:
wherein (Q, K, V) = F in .(W q ,W k ,W v );W q 、W k And W v To share the learned weight matrix, Q, K, and V are the Query, key, and Value matrices, respectively, generated by linear transformation of the input features,one dimension for the query and key vectors.
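The data flow F_in → F_sa → F_ra → F_out can be sketched numerically; the tiny identity matrices standing in for the learned W_q, W_k, W_v, and the plain ReLU standing in for the mlp/bn stage, are illustrative assumptions:

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def softmax_rows(M):
    out = []
    for row in M:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out

def self_attention(F_in, Wq, Wk, Wv):
    # F_sa = Softmax(Q K^T / sqrt(d_k)) V with Q = F_in Wq, etc.
    Q, K, V = matmul(F_in, Wq), matmul(F_in, Wk), matmul(F_in, Wv)
    d_k = len(Q[0])
    scores = [[sum(q * k for q, k in zip(qr, kr)) / math.sqrt(d_k)
               for kr in K] for qr in Q]
    attn = softmax_rows(scores)
    return matmul(attn, V), attn

def relative_attention(F_in, Wq, Wk, Wv):
    F_sa, attn = self_attention(F_in, Wq, Wk, Wv)
    # F_ra = F_in - F_sa: the Laplacian-like difference of formula (7)
    F_ra = [[a - b for a, b in zip(ri, rs)] for ri, rs in zip(F_in, F_sa)]
    # F_out = relu(F_ra) + F_in (the mlp/bn stage is omitted in this sketch)
    F_out = [[max(v, 0.0) + a for v, a in zip(rr, ri)]
             for rr, ri in zip(F_ra, F_in)]
    return F_out, attn

F_in = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three points, 2-D features
I2 = [[1.0, 0.0], [0.0, 1.0]]  # identity stand-ins for Wq, Wk, Wv
F_out, attn = relative_attention(F_in, I2, I2, I2)
```

Each attention row is a probability distribution over the input points, and the residual addition of F_in keeps the module permutation-equivariant, matching the order independence claimed for the point data stream.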
Further, the relative attention feature of formula (7) acts as a discrete Laplacian operator; in a graph G with N nodes, the N-dimensional vector f is expressed as:

f = (f_1, f_2, …, f_N)    (10)

where f_i is the value of the function f at node i;
perturbing node i, which may change into any node j adjacent to it, the gain caused by changing node j into node i is expressed as:

Δf_i = Σ_{j∈N_i} (f_i − f_j)    (11)

when edge E_ij has a weight w_ij, then:

Δf_i = Σ_{j∈N_i} w_ij (f_i − f_j)    (12)

when w_ij = 0, indicating that node i and node j are not adjacent, the sum can run over all nodes:

Δf_i = Σ_{j=1}^{N} w_ij (f_i − f_j)    (13)

finally:

Δf_i = d_i f_i − w_{i:} f    (14)

where d_i = Σ_j w_ij is the degree of vertex i; w_{i:} = (w_{i1}, …, w_{iN}) is an N-dimensional row vector; f is an N-dimensional column vector; w_{i:} f denotes the inner product of the two vectors. Accumulating the gain over all N nodes gives:

Δf = (Δf_1, …, Δf_N)ᵀ = Df − Wf = (D − W) f    (15)

where w_{i:} represents the weights corresponding to the i-th point, W represents the weights for all points, and D − W is the Laplacian matrix L.
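The Laplacian identity derived above, Δf = (D − W) f, can be checked numerically on a small weighted graph: the per-node gain Σ_j w_ij (f_i − f_j) must match the matrix product row by row. A pure Python sketch:

```python
# symmetric weight matrix of a 3-node graph (w_ij = 0 means not adjacent)
W = [[0.0, 2.0, 0.0],
     [2.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
f = [3.0, 1.0, -2.0]
N = len(f)

# per-node gain computed directly: sum_j w_ij * (f_i - f_j)
gain = [sum(W[i][j] * (f[i] - f[j]) for j in range(N)) for i in range(N)]

# the same gain via the Laplacian: (D - W) f, with degree d_i = sum_j w_ij
deg = [sum(W[i]) for i in range(N)]
L = [[(deg[i] if i == j else 0.0) - W[i][j] for j in range(N)]
     for i in range(N)]
Lf = [sum(L[i][j] * f[j] for j in range(N)) for i in range(N)]
```

The two vectors are identical, confirming that the relative-attention difference F_in − F_sa behaves like a discrete Laplacian over the point features.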
Further, in step (5), the decoder first converts the embedded features of each sampling point and its neighborhood into a local surface centered at that point to infer the potential manifold, and then repeatedly samples the inferred manifold to generate the noise-reduced point set, i.e. to reconstruct a clean point cloud.
Further, the step (5) specifically includes the following steps:
step 5.1, first define the 2D manifold M, embedded in 3D space and parameterized by the feature vector y, as:
M(u, v; y): [−1, 1] × [−1, 1] → R³    (16)
where (u, v) is a point in the 2D rectangular region [−1, 1]²;
by equation (16), the 2D rectangle is mapped by the function approximator MLP to a patch manifold of arbitrary shape parameterized by y, expressed as:
M_i(u, v; y_i) = MLP_M([u, v, y_i])    (17)
where M_i(u, v; y_i) represents a parameterized patch manifold;
step 5.2, the patch manifold M_i corresponding to the point p_i in the adaptively down-sampled set is defined as:
M_i(u, v; y_i) = p_i + M(u, v; y_i)    (18)
equation (18) moves the constructed manifold M(u, v; y_i) to a local surface centered at p_i; the patch manifolds corresponding to all points in the set together characterize the potential surface of the point cloud;
step 5.3, the adaptive down-sampling of steps 5.1 and 5.2 halves the number of input points, i.e. M = N/2; each patch manifold M_i(u, v; y_i) is then resampled, collecting two points on each patch manifold to obtain the denoised point cloud, expressed as:

P̂ = ∪_{i=1}^{M} { M_i(u_1, v_1; y_i), M_i(u_2, v_2; y_i) }, (u_1, v_1), (u_2, v_2) ∈ [−1, 1]²    (19)
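The patch-manifold decoding of formulas (16)-(18), with two samples per patch, can be sketched as follows; the hypothetical smooth map standing in for the learned MLP_M is an assumption made only to show the flow:

```python
import random

def patch_manifold(u, v, y):
    # hypothetical smooth map standing in for MLP_M([u, v, y]):
    # maps a point of the 2D square to a small 3D offset
    return (0.1 * u * y, 0.1 * v * y, 0.05 * (u * u + v * v) * y)

def reconstruct(points, codes, samples_per_patch=2, seed=0):
    rng = random.Random(seed)
    cloud = []
    for p, y in zip(points, codes):
        for _ in range(samples_per_patch):
            u, v = rng.uniform(-1, 1), rng.uniform(-1, 1)
            off = patch_manifold(u, v, y)
            # M_i(u, v; y_i) = p_i + M(u, v; y_i): local surface at p_i
            cloud.append(tuple(pc + oc for pc, oc in zip(p, off)))
    return cloud

sampled = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]  # M = N/2 retained points
codes = [0.5, 0.8]                            # per-point latent codes y_i
denoised = reconstruct(sampled, codes)        # 2 * M = N points
```

Sampling two (u, v) locations per patch restores the original point count, and every generated point lies on its local patch rather than at an arbitrary displaced position, which is the surface-aware property the decoder is designed for.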
further, the loss function in step (6) comprises the loss function L_as and the loss function L_us;
the loss function L_as quantifies the distance between the adaptively down-sampled set P̂_s and the ground-truth point cloud P_gt; since P̂_s and P_gt contain different numbers of points, the chamfer distance CD is selected as L_as, expressed as:

L_as = CD(P̂_s, P_gt) = (1/|P̂_s|) Σ_{x∈P̂_s} min_{y∈P_gt} ‖x − y‖₂² + (1/|P_gt|) Σ_{y∈P_gt} min_{x∈P̂_s} ‖x − y‖₂²    (20)

the loss function L_us quantifies the distance between the final reconstructed point cloud P̂ and the ground truth P_gt, using the Earth Mover's Distance EMD as L_us, expressed as:

L_us = EMD(P̂, P_gt) = min_{φ: P̂ → P_gt} (1/|P̂|) Σ_{x∈P̂} ‖x − φ(x)‖₂    (21)

where φ is a bijection;
finally, the network is trained end-to-end with supervision, and the total loss function to be minimized is expressed as:

L_denoise = λ·L_as + (1 − λ)·L_us    (22)
in the formula, λ is an empirical value of 0.01.
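Both loss terms can be sketched directly in Python; the brute-force permutation search standing in for the EMD solver is feasible only for tiny clouds and is an illustrative assumption:

```python
import itertools
import math

def chamfer(P, Q):
    # CD: mean nearest-neighbour squared distance, in both directions
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    fwd = sum(min(d2(p, q) for q in Q) for p in P) / len(P)
    bwd = sum(min(d2(p, q) for p in P) for q in Q) / len(Q)
    return fwd + bwd

def emd(P, Q):
    # exact EMD for equal-size clouds: best bijection phi: P -> Q
    assert len(P) == len(Q)
    best = math.inf
    for perm in itertools.permutations(Q):
        cost = sum(math.dist(p, q) for p, q in zip(P, perm)) / len(P)
        best = min(best, cost)
    return best

def total_loss(Ps, P, Pgt, lam=0.01):
    # weighted combination of the down-sampling and reconstruction terms
    return lam * chamfer(Ps, Pgt) + (1 - lam) * emd(P, Pgt)

Pgt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
loss_clean = total_loss(Pgt, Pgt, Pgt)  # identical clouds give zero loss
```

The chamfer distance tolerates clouds of different sizes, which is why it fits the down-sampled set, while EMD's bijection requirement matches the equal-size reconstructed cloud.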
Compared with the prior art, the invention has the following technical effects:
based on the high performance of a Transformer in natural language processing and the sequence independence of all operations, the method is improved and is very suitable for point cloud feature learning, the problem that a relative attention RA module in the improved Transformer is very sensitive to abnormal points in a widely used FPS (remote data system) which is a farthest point sampling method, so that point cloud data in the real world are very unstable when being processed, and the problem that the sampling points from the FPS are required to be subsets of original point clouds, so that the inference of the original data and original geometric information is increased, basic point cloud information can be extracted in a self-adaptive mode, a more valuable basis is provided for smooth development of subsequent work, and meanwhile, an attention mechanism and global pooling operation are used in the feature extraction process, so that not only global information can be extracted, but also the integrity of local detailed information can be kept. 
Specifically, rich high-dimensional features of the Qin terracotta warrior point cloud samples are obtained with the improved Transformer structure, and the potential manifold of the noisy point cloud is learned from the sampling points: the sampling points obtained by FPS are adapted by the adaptive down-sampling module so that they lie closer to the surface where the points belong; each sampling point and its embedded neighborhood features are then converted into a local surface to infer the surface manifold, and a clean point cloud capturing the intrinsic structure is reconstructed by sampling on each surface manifold, unaffected by abnormal values. This recovers the surface, reconstructs the manifold and realizes denoising, with good robustness under both synthetic and real noise, effectively advancing the computer-aided virtual restoration of the Qin warriors.
Drawings
FIG. 1: the point cloud denoising network structure of the invention based on the improved Transformer;
FIG. 2: qualitative analysis comparison graphs of different denoising methods are obtained;
FIG. 3: the invention is a schematic diagram of an adaptive down-sampling principle;
FIG. 4: in the structure diagram of the relative attention module RA of the invention, a self-attention module structure SA is arranged in a dashed line frame;
FIG. 5: the invention discloses a schematic diagram of patch manifold reconstruction and resampling.
Detailed Description
The present invention will be explained in further detail with reference to examples.
As shown in fig. 1, a method for denoising fragments of Qin tomb warriors based on an improved Transformer includes the following steps:
the specific process is AS shown in fig. 3, firstly, a farthest point sampling algorithm FPS is used to obtain relatively uniform points AS original sampling points, and then, an adaptive neighborhood sampling algorithm AS is used to automatically learn the offset of each sampling point and update the position information of the sampling point;
the adaptive neighborhood sampling algorithm AS specifically comprises the following steps:
step 3.1, let P_s be a set of N_s points sampled from the N input points, with N_s < N; x_i ∈ P_s is a sampling point and f_i ∈ F_s is its feature, of dimension D_1; the neighbors of x_i are grouped by a k-NN query and their features are updated with a standard self-attention mechanism;
step 3.2, the k neighbors x_{i,1}, …, x_{i,k} of the sampling point x_i, with corresponding features f_{i,1}, …, f_{i,k}, are aggregated as:

f_i* = A( R(x_i, x_{i,j}), γ(x_{i,j}) ), j = 1, …, k    (1)

where A is the aggregation function, R describes the high-level relation between the sampling point x_i and the neighbor point x_{i,j}, and γ changes the feature dimension of each neighbor point; to reduce the amount of calculation, let γ(x_{i,j}) = W_γ f_{i,j}. The relation function R is expressed as:

R(x_i, x_{i,j}) = (W_q f_i)ᵀ (W_k f_{i,j}) / √D'    (2)

where D' is the output channel of the Conv layer;
step 3.3, for each sampling point x_i, use MLP + Softmax to obtain the normalized weights W_p and W_f over the coordinates and feature channels of each point in the group, expressed as:

W_p = Softmax(mlp(x_{i,j}))    (3)

W_f = Softmax(mlp(f_{i,j}))    (4)

in formulas (3) and (4), j ∈ [1, k];
step 3.4, the adaptive update of the sampling point x_i and its feature f_i is realized by weighted summation:

x_i* = Σ_{j=1}^{k} W_p,j · x_{i,j}    (5)

f_i* = Σ_{j=1}^{k} W_f,j · f_{i,j}    (6)

that is, (x_i*, f_i*) is the updated point information;
the adaptive down-sampling operation can obtain points closer to the potential surface, namely points with less noise disturbance, and is helpful for reducing the potential space for reconstructing the potential manifold when reconstructing the manifold;
step 4, importing the down-sampled result into the improved Transformer encoder module, and enhancing the point cloud features through the relative attention RA module of the point cloud for effective feature extraction;
attention is to screen a small amount of important information from a large amount of information, focus on the important information, ignore most of the unimportant information, focus on its corresponding Value the larger the weight is, self-Attention SA in the original transformer is a mechanism for focusing on other words of an input sentence when encoding each word, an architecture of an SA layer is described in a dashed box in fig. 4, when switching to a point data stream, according to terms, Q, K, and V are Query, key, and Value matrices generated by linear transformation of input features, first, a weighting coefficient is calculated according to Query and Key, and then, weighted summation is performed on Value according to the weighting coefficient, for the weighting coefficient, the most common method at present includes: evaluating the vector dot product of the two, evaluating the similarity of the vectors Cosine of the two or introducing an additional neural network, wherein the embodiment uses the method of the vector dot product to calculate, and in order to prevent the calculation result from being overlarge, the calculation result is calculated by dividing the vector dot product by a scaleWhereinFor a dimension of query and key vectors, normalizing the result into probability distribution by utilizing Softmax, and multiplying the probability distribution by a matrix Value to obtain a representation of weight summation, namely, the self-attention SA module is characterized by:
in the formula,(Q,K,V)=F in .(W q ,W k ,W v );W q 、W k And W v A learned weight matrix is shared.
As the calculation process of formula (9) shows, the whole self-attention process is permutation-invariant, which suits the disorder and irregularity of point clouds; but the absolute coordinates of the same point cloud after a rigid transformation differ greatly from those before it. To describe the intrinsic characteristics of the point cloud, this embodiment introduces the relative attention feature of the point cloud, inspired by replacing the adjacency matrix A with the Laplacian matrix L = D − A in graph convolutional networks, where D is the diagonal matrix whose element D_ii is the degree of the i-th node. The self-attention SA module in the original Transformer is replaced with a relative attention RA module to enhance the feature representation of the point cloud; as shown in fig. 4, the RA module computes the relative attention feature between the self-attention SA feature and the input feature, expressed as:
F_ra = F_in − F_sa    (7)
where F_ra is the relative attention feature, F_in is the input feature, and F_sa is the self-attention SA module feature;
finally, the relative attention feature is passed through the network and added to the input feature to form the final output feature F_out of the whole RA module, expressed as:
F_out = RA(F_in) = relu(bn(mlp(F_ra))) + F_in    (8)
F_in − F_sa is similar to the discrete Laplacian operator; in a graph G with N nodes, the N-dimensional vector f is represented as:

f = (f_1, f_2, …, f_N)    (10)

where f_i is the value of the function f at node i;
node i is perturbed and may change into any node j adjacent to it; since the Laplacian operator measures the gain of a point under small perturbations in all degrees of freedom, on a graph the gain caused by changing any node j into node i is expressed as:

Δf_i = Σ_{j∈N_i} (f_i − f_j)    (11)

when edge E_ij has a weight w_ij, then:

Δf_i = Σ_{j∈N_i} w_ij (f_i − f_j)    (12)

when w_ij = 0, indicating that node i and node j are not adjacent, the sum can run over all nodes:

Δf_i = Σ_{j=1}^{N} w_ij (f_i − f_j)    (13)

finally:

Δf_i = d_i f_i − w_{i:} f    (14)

where d_i = Σ_j w_ij is the degree of vertex i; w_{i:} = (w_{i1}, …, w_{iN}) is an N-dimensional row vector; f is an N-dimensional column vector; w_{i:} f denotes the inner product of the two vectors. Accumulating the gain over all N nodes gives:

Δf = (Δf_1, …, Δf_N)ᵀ = Df − Wf = (D − W) f    (15)

where w_{i:} represents the weights corresponding to the i-th point, W represents the weights for all points, and D − W is the Laplacian matrix L.
The i-th row of the Laplacian matrix actually reflects the accumulated gain at the i-th node when it is perturbed by all other nodes. Intuitively, the graph Laplacian reflects applying a potential at node i that flows smoothly to the other nodes, playing a supervising and guiding role in the iterative optimization of the model. Relative attention increases the attention weighting and reduces the effect of noise, which helps downstream tasks.
Step 5, taking the output of an improved transform decoder as a basis, reconstructing the manifold of each point, sampling the manifold structure corresponding to each point in proportion, and selecting the point closer to the clean point cloud by using a self-adaptive sampling method to reconstruct the three-dimensional surface;
After the decoder obtains the high-dimensional feature representation of the point cloud, it can handle the denoising task. Previous denoising work mostly relies on the idea of displacing points from potential surfaces, but those points are not constrained to restore the surface, which may give a suboptimal denoising effect; a point cloud generally represents some potential surface, or 2D manifold, of a set of sampling points. To make the denoising effect robust, this embodiment learns the potential manifold of the noisy point cloud, captures the inherent structure of the original point cloud, and reconstructs and restores the surface, as shown in the decoder part (b) of fig. 1;
the decoder transforms the embedded features of each sample point and its neighborhood into a local surface centered on the point to infer a potential manifold, and then samples the inferred manifold multiple times to produce a set of noise reduction pointsI.e. a clean point cloud is reconstructedThe whole process is shown in fig. 5, and specifically comprises the following steps:
step 5.1, first define the 2D manifold M, embedded in 3D space and parameterized by some feature vector y, as:
M(u, v; y): [−1, 1] × [−1, 1] → R³    (16)
where (u, v) is a point in the 2D rectangular region [−1, 1]²;
equation (16) maps the 2D rectangle to an arbitrarily shaped patch manifold parameterized by y; the parameterized patch manifold M_i(u, v; y_i) is realized by an MLP, since the MLP is a universal function approximator expressive enough to approximate a manifold of arbitrary shape, expressed as:
M_i(u, v; y_i) = MLP_M([u, v, y_i])    (17)
where M_i(u, v; y_i) represents a parameterized patch manifold;
step 5.2, with the manifold M defined, the patch manifold M_i corresponding to the point p_i in the adaptively down-sampled set is defined as:
M_i(u, v; y_i) = p_i + M(u, v; y_i)    (18)
equation (18) moves the constructed manifold M(u, v; y_i) to a local surface centered at p_i; the patch manifolds corresponding to all points in the set together characterize the underlying surface of the point cloud;
step 5.3, the adaptive down-sampling of steps 5.1 and 5.2 halves the number of input points, i.e. M = N/2; each patch manifold M_i(u, v; y_i) is then resampled, sampling two points on each patch manifold to obtain the denoised point cloud, expressed as:

P̂ = ∪_{i=1}^{M} { M_i(u_1, v_1; y_i), M_i(u_2, v_2; y_i) }, (u_1, v_1), (u_2, v_2) ∈ [−1, 1]²    (19)
step 6, continuously repeating the iterative training of steps 3 to 5 on the imported data with the improved Transformer encoder-decoder framework until the loss value of the loss function is small and tends to be stable, obtaining the denoised clean point cloud.
To measure the reconstruction quality of the final point cloud, the Loss function Loss consists of two parts: 1) Loss function Las for quantization of adaptively downsampled setsThe distance between the point cloud Pgt and the ground route point cloud; 2) Loss function Lus quantization of the final reconstructed point cloudDistance from ground channel Pgt.
Due to the fact thatAnd Pgt contains a different number of dots, andtherefore, the chamfer distance CD is selected as L as Expressed as:
measuring the denoised point cloud by using Earth moving distance Earth's distance (EMD) as LusThe distance between the point cloud Pgt and the ground route point cloud is expressed as:
Finally, the network is trained end to end with supervision, minimizing the total loss function:
L_denoise = λ L_as + (1 − λ) L_us (22)
where λ is an empirical value, set to 0.01.
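The two-part loss above can be sketched as follows. The Chamfer distance follows the standard symmetric CD definition; the patent uses EMD for L_us, which requires an optimal-assignment solver, so this sketch substitutes CD there as a stand-in and only the λ-weighted combination of equation (22) is followed faithfully.

```python
import numpy as np

def chamfer_distance(P, Q):
    """Symmetric Chamfer distance (CD) between point sets P (m, 3) and Q (n, 3)."""
    d = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=-1)  # (m, n) pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def total_loss(P_ds, P_denoised, P_gt, lam=0.01):
    """Eq. (22): L_denoise = lam * L_as + (1 - lam) * L_us.
    CD stands in for the EMD term L_us here (EMD needs an assignment solver)."""
    L_as = chamfer_distance(P_ds, P_gt)
    L_us = chamfer_distance(P_denoised, P_gt)
    return lam * L_as + (1 - lam) * L_us

rng = np.random.default_rng(1)
gt = rng.standard_normal((100, 3))                   # stand-in ground-truth cloud
noisy = gt + 0.05 * rng.standard_normal((100, 3))    # noisy version of it
loss = total_loss(noisy[:50], noisy, gt)             # downsampled set: first 50 points
```

Note that CD tolerates point sets of different sizes, which is exactly why the patent chooses it for the downsampled set.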
The Terracotta Warriors dataset consists of models scanned and collected on site at the Terracotta Warriors Museum by researchers at the National and Local Joint Engineering Research Center for Cultural Heritage Digitization of Northwest University; more than 500 models have been accurately annotated. Fig. 2 is a qualitative comparison of the denoising effect of different methods on the head and hand datasets of the Terracotta Warriors; compared with the three other deep-learning-based methods NPD, TlDn and PCNet, the proposed denoising method is more robust to outliers and yields cleaner results.
In this embodiment, the Transformer-based point cloud denoising network serves as the feature extractor, giving stronger understanding of point cloud structure and semantics. As the noise level rises, the denoising effect of this method improves more markedly than that of the three other denoising methods; as shown in Table 1, it outperforms earlier deep learning methods and is more robust to high noise.
TABLE 1 comparison of CD (chamfer distance) for each denoising method at different noise ratios
Method | 0.25% | 0.5% | 1% | 2% | 3%
---|---|---|---|---|---
NPD | 0.24 | 0.62 | 1.28 | 2.32 | 3.27
PCNet | 0.18 | 0.46 | 0.97 | 1.42 | 2.91
TlDn | 0.34 | 0.78 | 1.15 | 2.26 | 3.12
TDNet-RA (ours) | 0.16 | 0.39 | 0.83 | 1.20 | 2.15
Claims (9)
1. A method for denoising Terracotta Warrior fragments based on an improved Transformer, characterized by comprising the following steps:
step 1, preprocessing the point cloud samples of the Terracotta Warriors data to perform data augmentation and annotation;
step 2, using all the preprocessed point cloud samples as the training set, importing them in batches into the input embedding module and mapping them to a high-dimensional space;
step 3, importing the high-dimensional point cloud into the adaptive downsampling module of the improved Transformer encoder: first obtaining relatively uniform points with the farthest point sampling algorithm (FPS) as the original sampling points, then automatically learning the offset of each sampling point with the adaptive neighborhood sampling algorithm (AS) and updating the sampling points' position information, thereby reducing the data volume while retaining the structural attributes of the original point cloud model;
step 4, importing the downsampled result into the improved Transformer encoder module and enhancing the point cloud features through the relative attention (RA) module so as to extract features effectively;
step 5, reconstructing the manifold of each point from the output of the improved Transformer decoder, sampling the manifold structure corresponding to each point proportionally, and selecting points closer to the clean point cloud to reconstruct the three-dimensional surface;
step 6, using the improved Transformer encoder-decoder framework, iteratively training steps 3 to 5 on the imported data until the loss value of the loss function becomes small and stable, obtaining the denoised clean point cloud.
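The farthest point sampling referred to in step 3 can be sketched as a greedy numpy loop. This is a generic FPS implementation, not the patent's exact one:

```python
import numpy as np

def farthest_point_sampling(points, n_samples, seed=0):
    """Greedy FPS: start from a random point, then repeatedly pick the point
    farthest from the set already chosen, yielding a relatively uniform subset."""
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    chosen = [rng.integers(n)]
    # dist[j] = distance from point j to the nearest already-chosen point
    dist = np.linalg.norm(points - points[chosen[0]], axis=1)
    for _ in range(n_samples - 1):
        idx = int(dist.argmax())           # farthest remaining point
        chosen.append(idx)
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return points[chosen]

pts = np.random.default_rng(2).standard_normal((1000, 3))
sampled = farthest_point_sampling(pts, 128)
```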
2. The improved Transformer-based Terracotta Warrior fragment denoising method according to claim 1, wherein the data augmentation of step 1 comprises rotating, translating and scaling the data.
3. The improved Transformer-based Terracotta Warrior fragment denoising method according to claim 1, wherein the adaptive neighborhood sampling algorithm AS of step 3 comprises the following steps:
step 3.1, let P_s be the set of N_s points sampled from the N input points, with N_s < N; x_i ∈ P_s is a sampling point and f_i ∈ F_s is its feature; the neighbors of sampling point x_i are grouped by a k-NN query, and the features are updated with a generic self-attention mechanism;
step 3.2, the k neighbors x_{i,1}, ..., x_{i,k} of sampling point x_i and their corresponding features f_{i,1}, ..., f_{i,k} are expressed as:
where A aggregates the characterization and R describes the high-level relation between sampling point x_i and neighbor point x_{i,j}; γ changes the feature dimension of each neighbor point, and to reduce computation γ(x_{i,j}) = W_γ f_{i,j}; the relation function R is expressed as:
where D′ is the output channel of the Conv;
step 3.3, for each sampling point x_i, the normalized weights W_p and W_f over the coordinates and feature channels of each point in the group are obtained with MLP + Softmax, expressed as:
W_p = Softmax(MLP(x_{i,j})) (3)
W_f = Softmax(MLP(f_{i,j})) (4)
where, in equations (3) and (4), j ∈ [1, k];
step 3.4, a weighted summation operation achieves the adaptive update of sampling point x_i and its feature f_i; the updated point information is expressed as:
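The MLP + Softmax weighting of steps 3.3 and 3.4 can be sketched as below. The two "MLPs" are reduced to single random projections purely for illustration, and the feature width is an assumed value; the point is the softmax-normalized weights followed by the weighted sums that update the sampling point and its feature.

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(3)
# Toy stand-ins for the two MLPs of Eqs. (3)-(4): one random projection each.
W_mlp_p = rng.standard_normal((3, 1))
W_mlp_f = rng.standard_normal((16, 1))

def adaptive_update(neighbors_xyz, neighbors_feat):
    """Eqs. (3)-(6): softmax weights over the k neighbors, then weighted sums
    give the updated sampling-point position and feature."""
    w_p = softmax(neighbors_xyz @ W_mlp_p, axis=0)   # (k, 1) position weights
    w_f = softmax(neighbors_feat @ W_mlp_f, axis=0)  # (k, 1) feature weights
    x_new = (w_p * neighbors_xyz).sum(axis=0)        # updated point position
    f_new = (w_f * neighbors_feat).sum(axis=0)       # updated point feature
    return x_new, f_new

nbr_x = rng.standard_normal((8, 3))    # k = 8 neighbors of one sampling point
nbr_f = rng.standard_normal((8, 16))   # their features (assumed width 16)
x_new, f_new = adaptive_update(nbr_x, nbr_f)
```

Because the weights are softmax-normalized, the updated point is a convex combination of its neighbors, so it stays inside their bounding box.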
4. The improved Transformer-based Terracotta Warrior fragment denoising method according to claim 1, wherein the relative attention (RA) module of step 4 computes the relative attention feature between the self-attention (SA) module feature and the input feature, expressed as:
F_ra = F_in − F_sa (7)
where F_ra is the relative attention feature, F_in is the input feature, and F_sa is the self-attention (SA) module feature;
finally, the relative attention feature F_ra and the network input feature F_in give the final output feature F_out of the relative attention RA module of the whole point cloud, expressed as:
F_out = RA(F_in) = ReLU(BN(MLP(F_ra))) + F_in. (8)
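Equations (7) and (8) can be sketched with plain numpy self-attention. BatchNorm is omitted and all weight matrices are random placeholders, so this only illustrates the data flow of the RA module, not trained behavior.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(F, Wq, Wk, Wv):
    """Plain scaled dot-product self-attention over the N points."""
    Q, K, V = F @ Wq, F @ Wk, F @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
    return A @ V

def relative_attention(F_in, Wq, Wk, Wv, W_mlp):
    """Eqs. (7)-(8): F_ra = F_in - SA(F_in), then MLP + ReLU + residual.
    BatchNorm is left out of this sketch for brevity."""
    F_sa = self_attention(F_in, Wq, Wk, Wv)
    F_ra = F_in - F_sa                              # relative attention feature (Eq. 7)
    return np.maximum(F_ra @ W_mlp, 0.0) + F_in     # Eq. 8 without BN

rng = np.random.default_rng(4)
N, D = 64, 32                                  # assumed point count and channel width
F_in = rng.standard_normal((N, D))
Ws = [rng.standard_normal((D, D)) * 0.1 for _ in range(4)]
F_out = relative_attention(F_in, *Ws)
```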
5. The improved Transformer-based Terracotta Warrior fragment denoising method according to claim 4, wherein the self-attention (SA) module feature is expressed as:
6. The improved Transformer-based Terracotta Warrior fragment denoising method according to claim 4, wherein the calculation of the relative attention feature in equation (7) is a discrete Laplacian; on a graph G with N nodes, an N-dimensional vector f is expressed as:
f = (f_1, f_2, ..., f_N) (10)
where f_i is the value of the function f at node i;
perturbing node i so that it changes to any node j adjacent to it, the gain caused by the change from node j to node i is expressed as:
when edge E_ij has weight w_ij, then:
when w_ij = 0, indicating that node i and node j are not adjacent, then:
finally, the following is obtained:
where d_i = Σ_j w_ij is the degree of vertex i; w_i: = (w_i1, ..., w_iN) is an N-dimensional row vector; f is an N-dimensional column vector; and w_i: f denotes the inner product of the two vectors; the gain accumulation over all N nodes is expressed as:
where w_i: represents the weight row corresponding to the i-th point, W represents the weights corresponding to all points, and D − W is the Laplacian matrix L.
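The conclusion of the derivation above, that the accumulated gain over all nodes equals (D − W)f = Lf, can be checked numerically on a random weighted graph:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 6
W = rng.uniform(size=(N, N))
W = (W + W.T) / 2            # symmetric edge weights w_ij
np.fill_diagonal(W, 0.0)     # no self-loops
f = rng.standard_normal(N)   # function value f_i at each node

# Per-node gain: sum_j w_ij * (f_i - f_j) = d_i * f_i - w_i: @ f
d = W.sum(axis=1)            # degree d_i of each vertex
gain = d * f - W @ f

# Laplacian form: L = D - W, so the gain vector is exactly L @ f
L = np.diag(d) - W
print(np.allclose(gain, L @ f))
```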
7. The improved Transformer-based Terracotta Warrior fragment denoising method of claim 1, wherein in step (5) the decoder transforms the embedded features of each sampling point and its neighborhood into a local surface centered on that point to infer the potential manifold, and then resamples the inferred manifold to generate a denoised point set and reconstruct a clean point cloud.
8. The improved Transformer-based Terracotta Warrior fragment denoising method according to claim 7, wherein the step (5) comprises the following steps:
step 5.1, first define the 2D manifold M embedded in 3D space, parameterized by the feature vector y, as:
M(u, v; y): [−1, 1] × [−1, 1] → R^3 (16)
where (u, v) is a point in the 2D rectangular region [−1, 1]^2;
equation (16) maps the 2D rectangle through the function approximator MLP to a patch manifold of arbitrary shape parameterized by y, expressed as:
M_i(u, v; y_i) = MLP_M([u, v, y_i]) (17)
where M_i(u, v; y_i) denotes the parameterized patch manifold;
step 5.2, the patch manifold M_i corresponding to each point p_i in the adaptively downsampled set is defined as:
M_i(u, v; y_i) = p_i + M(u, v; y_i) (18)
equation (18) translates the constructed manifold M(u, v; y_i) to a local surface centered at p_i; the patch manifolds corresponding to all points in the set together characterize the underlying surface of the point cloud;
step 5.3, since steps 5.1 and 5.2 halve the number of input points during adaptive downsampling, i.e. M = N/2, each patch manifold M_i(u, v; y_i) is resampled, taking two points on each patch manifold to obtain the denoised point cloud, expressed as:
9. The improved Transformer-based Terracotta Warrior fragment denoising method according to claim 1, wherein the loss function Loss of step (6) comprises the loss functions L_as and L_us;
the loss function L_as quantifies the distance between the adaptively downsampled set and the ground-truth point cloud P_gt; since the two sets contain different numbers of points, the Chamfer distance (CD) is selected as L_as, expressed as:
the loss function L_us quantifies the distance between the final reconstructed point cloud and the ground truth P_gt, using the Earth Mover's Distance (EMD) as L_us, expressed as:
finally, the network is trained end to end with supervision, minimizing the total loss function:
L_denoise = λ L_as + (1 − λ) L_us (22)
where λ is an empirical value, set to 0.01.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211133859.0A CN115456900A (en) | 2022-09-19 | 2022-09-19 | Improved transform-based Qinhong tomb warrior fragment denoising method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115456900A true CN115456900A (en) | 2022-12-09 |
Family
ID=84304288
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211133859.0A Pending CN115456900A (en) | 2022-09-19 | 2022-09-19 | Improved transform-based Qinhong tomb warrior fragment denoising method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115456900A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116012266A (en) * | 2023-03-29 | 2023-04-25 | 中国科学技术大学 | Image denoising method, system, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |