CN115456900A - Improved Transformer-based Qin tomb terracotta warrior fragment denoising method - Google Patents


Info

Publication number: CN115456900A
Application number: CN202211133859.0A
Authority: CN (China)
Prior art keywords: point, point cloud, sampling, manifold, points
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 徐雪丽, 耿国华, 王红珍, 王敬禹, 周明全, 曹欣
Current assignee: Northwest University; Yanan University (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: Northwest University; Yanan University
Application filed by Northwest University and Yanan University
Priority to CN202211133859.0A
Publication of CN115456900A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an improved Transformer-based method for denoising Qin tomb terracotta warrior fragments, which comprises the following steps: 1. preprocess the point cloud samples of the terracotta warrior data; 2. import the preprocessed point cloud samples as a training set into the input embedding module and map them to a high-dimensional space; 3. import the high-dimensional point cloud into the adaptive down-sampling module of the Transformer encoder, obtain relatively uniform points with farthest point sampling (FPS) as the original sampling points, and let the adaptive sampling (AS) module automatically learn the offset of each sampling point and update its position information, so as to reduce the data volume while keeping the structural attributes of the original point cloud model; 4. import the down-sampled result into the Transformer encoder module and enhance the point cloud features through the relative attention (RA) module so as to extract features effectively; 5. based on the output of the Transformer decoder, select points closer to the clean point cloud with an adaptive sampling method to reconstruct the three-dimensional surface; 6. iteratively train on the imported data until the loss value is small and stable, obtaining the denoised clean point cloud. The method has better robustness to high noise.

Description

Improved Transformer-based Qin tomb terracotta warrior fragment denoising method
Technical Field
The invention belongs to the technical field of cultural relic protection, and particularly relates to an improved Transformer-based method for denoising fragments of Qin tomb terracotta warriors.
Background
In the field of cultural relic excavation and protection, the initial digital acquisition of relic fragments is affected by many factors, such as the measuring equipment, the external environment and the surface characteristics of the measured object, so the initial point cloud model obtained by scanning often contains a large number of noise points. The more noise points there are, the greater the impact on point cloud quality, which directly affects the accuracy and efficiency of later tasks such as feature extraction, registration, surface reconstruction and visualization. Denoising the acquired initial digitized point cloud data is therefore an important research topic in this field.
Among traditional denoising methods, point cloud denoising based on surface fitting first fits a surface to the three-dimensional scan data of the object, then computes the distance between each point and the fitted surface, and finally deletes gross errors or outliers of the point cloud data according to a given criterion. This is a simple and effective estimation method, but its accuracy is limited, and large computational errors arise for complex models and models containing noise. The moving robust principal component analysis method based on sparse representation theory computes the estimated position of each point by local averaging, preserves sharp features through a weighted minimization, and updates point positions by weighting the similarity between normal vectors in local neighborhoods to eliminate noise. When the noise level is high, however, its performance tends to degrade due to over-smoothing or over-sharpening.
In recent years, artificial intelligence methods represented by deep learning have achieved a series of important breakthroughs and received unprecedented attention. PointNet pioneered feature learning directly on point clouds with a deep model; to guarantee permutation invariance it applies a learned transformation matrix to the point cloud, which makes the points overly independent, and to achieve order independence the network extracts a global feature from all point cloud data with a global pooling operation, which ignores the geometric correlation between points and loses part of the local feature information. Improved networks based on PointNet, such as PointNet++, Neural Projection, PointCleanNet and Total Denoising, take the local properties of points into account to improve model performance. These methods can infer the displacement of noisy points from the underlying surface and reconstruct points, but the reconstructed points are not constrained to restore an explicit surface, which may lead to sub-optimal denoising results.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide an improved Transformer-based method for denoising Qin tomb terracotta warrior fragments, which learns the potential manifold of the noisy point cloud and captures its intrinsic structure to recover the surface and reconstruct the manifold, with better robustness to high noise.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method for denoising Qin warriors fragments based on an improved Transformer comprises the following steps:
step 1, preprocessing a point cloud sample of Qin warrior data so as to realize data enhancement and labeling processing;
step 2, all the preprocessed point cloud samples are used as training sets, and are led into an input embedding module in batches and are mapped to a high-dimensional space;
step 3, importing the point cloud of the high-dimensional space into an adaptive down-sampling module in an improved transform encoder, firstly obtaining relatively uniform points by using a maximum-distance point sampling algorithm (FPS) AS original sampling points, and then automatically learning the offset of each sampling point by using an adaptive neighborhood sampling Algorithm (AS) and updating the position information of the sampling points, thereby reducing the data volume and retaining the structural attribute of an original point cloud model;
step 4, importing the down-sampled result into an improved Transformer encoder module, and enhancing the characteristics of the point cloud through a relative attention RA module of the point cloud so as to effectively extract the characteristics;
step 5, taking the output of an improved transform decoder as a basis, reconstructing the manifold of each point, sampling the manifold structure corresponding to each point in proportion, and selecting the point closer to the clean point cloud by using a self-adaptive sampling method to reconstruct the three-dimensional surface;
and 6, continuously performing iterative training of the steps 3 to 5 on the imported data by utilizing an improved Transformer encoder-decoder framework until the loss value of the loss function is small and tends to be stable, and obtaining the denoised clean point cloud.
Further, the data enhancement of step 1 includes rotating, translating and scaling the data.
Further, the adaptive neighborhood sampling algorithm AS of step 3 includes the following steps:
Step 3.1, let $P_s$ be a set of $N_s$ points sampled from the $N$ input points, with $N_s < N$; $x_i \in P_s$ is a sampling point and $f_i \in F_s$ is the feature of $x_i$. The neighbors of sampling point $x_i$ are queried by k-NN and grouped, and a general self-attention mechanism is used to update the features;
Step 3.2, the features $f_{i,1},\ldots,f_{i,k}$ corresponding to the $k$ neighbors $x_{i,1},\ldots,x_{i,k}$ of sampling point $x_i$ are updated as:
$\tilde{f}_i = \mathcal{A}\big(R(x_i, x_{i,j})\,\gamma(f_{i,j})\big),\ j \in [1,k]$ (1)
where $\mathcal{A}$ is used for aggregation, $R$ describes the high-level relation between the sampling point $x_i$ and the neighbor point $x_{i,j}$, and $\gamma$ changes the feature dimension of each neighbor point; to reduce the amount of computation, let $\gamma(x_{i,j}) = W_\gamma f_{i,j}$. The relation function $R$ is expressed as:
$R(x_i, x_{i,j}) = \mathrm{Softmax}\big((W_\phi f_i)^{T}(W_\psi f_{i,j})/\sqrt{D'}\big)$ (2)
where $D'$ is the output channel of the Conv;
Step 3.3, for each sampling point $x_i$, MLP + Softmax is used to obtain the normalized weights $W_p$ and $W_f$ of the coordinates and feature channels of each point in the group, expressed as:
$W_p = \mathrm{Softmax}(\mathrm{mlp}(x_{i,j}))$ (3)
$W_f = \mathrm{Softmax}(\mathrm{mlp}(f_{i,j}))$ (4)
in formulas (3) and (4), $j \in [1,k]$;
Step 3.4, the adaptive update of the sampling point $x_i$ and its feature $f_i$ is realized by a weighted summation, i.e. the updated point information is expressed as:
$x_i^{\ast} = \sum_{j=1}^{k} W_{p,j}\, x_{i,j}$ (5)
$f_i^{\ast} = \sum_{j=1}^{k} W_{f,j}\, f_{i,j}$ (6)
further, the relative attention RA module of the point cloud in step 4 is used to calculate the relative attention feature between the self-attention SA module feature and the input feature, and is expressed as:
F ra =F in -F sa (7)
in the formula, F ra For relative attention features, F in As input features, F sa Is a self-attention SA module feature;
finally, the relative attention feature F ra Inputting features F over a network in Relative attention RA Module Final output feature F as the entire Point cloud out Is represented as:
F out =RA(F in )=relu(bn(mlp(F ra )))+F in 。 (8)
further, the self-attention SA module is characterized by:
Figure BDA0003851057360000041
wherein (Q, K, V) = F in .(W q ,W k ,W v );W q 、W k And W v To share the learned weight matrix, Q, K, and V are the Query, key, and Value matrices, respectively, generated by linear transformation of the input features,
Figure BDA0003851057360000042
one dimension for the query and key vectors.
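The scaled dot-product form of Eq. (9) can be sketched in a few lines of NumPy. The shapes, weight initializations and function name below are illustrative, not taken from the patent, and a single attention head without batching is assumed:

```python
import numpy as np

def self_attention(F_in, W_q, W_k, W_v):
    """Scaled dot-product self-attention over point features (Eq. (9), sketched).

    F_in: (N, D) input point features; W_q, W_k, W_v: (D, d_k) shared
    projection matrices (hypothetical shapes)."""
    Q, K, V = F_in @ W_q, F_in @ W_k, F_in @ W_v
    d_k = Q.shape[-1]                                    # dimension of query/key vectors
    scores = Q @ K.T / np.sqrt(d_k)                      # (N, N) pairwise similarities
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)             # row-wise Softmax
    return attn @ V                                      # weighted sum of Values

rng = np.random.default_rng(0)
F_in = rng.normal(size=(8, 16))
W_q, W_k, W_v = (rng.normal(size=(16, 16)) * 0.1 for _ in range(3))
F_sa = self_attention(F_in, W_q, W_k, W_v)
```

Because the attention weights are computed pairwise from the features themselves, permuting the input rows simply permutes the output rows, which is the order independence the encoder relies on.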
Further, the calculation formula (7) of the relative attention feature acts as a discrete Laplacian operator. In a graph $G$ with $N$ nodes, the $N$-dimensional vector $f$ is expressed as:
$f = (f_1, f_2, \ldots, f_N)$ (10)
where $f_i$ is the value of the function $f$ at node $i$;
Perturbing node $i$ may change it into any node $j$ adjacent to it; the gain caused by changing any adjacent node $j$ to node $i$ is expressed as:
$\Delta f_i = \sum_{j \in N_i} (f_i - f_j)$ (11)
When edge $E_{ij}$ has a weight $w_{ij}$:
$\Delta f_i = \sum_{j \in N_i} w_{ij}(f_i - f_j)$ (12)
When $w_{ij} = 0$, node $i$ and node $j$ are not adjacent, so the sum can be extended to all nodes:
$\Delta f_i = \sum_{j=1}^{N} w_{ij}(f_i - f_j)$ (13)
Finally:
$\Delta f_i = d_i f_i - w_{i:} f$ (14)
where $d_i = \sum_j w_{ij}$ is the degree of vertex $i$; $w_{i:} = (w_{i1}, \ldots, w_{iN})$ is an $N$-dimensional row vector; $f$ is an $N$-dimensional column vector; and $w_{i:} f$ denotes the inner product of the two vectors. The gain accumulated over all $N$ nodes is expressed as:
$\Delta f = (\Delta f_1, \ldots, \Delta f_N)^{T} = Df - Wf = (D - W)f = Lf$ (15)
where $w_{i:}$ represents the weight corresponding to the $i$-th point, $W$ represents the weights corresponding to all points, and $D - W$ is the Laplacian matrix $L$.
Further, in step 5, the decoder first converts the embedded features of each sampling point and its neighborhood into a local curved surface centered on that point to infer the potential manifold, and then repeatedly samples the inferred manifold to generate the noise-reduced point set $\hat{P}$, reconstructing a clean point cloud.
Further, step 5 specifically includes the following steps:
Step 5.1, first define the 2D manifold $M$ embedded in 3D space, parameterized by the feature vector $y$, as:
$M(u,v;y): [-1,1]\times[-1,1] \rightarrow \mathbb{R}^{3}$ (16)
where $(u,v)$ is a point in the 2D rectangular region $[-1,1]^{2}$;
The 2D rectangle of formula (16) is mapped by the function approximator MLP to a patch manifold of arbitrary shape parameterized by $y$, expressed as:
$M_i(u,v;y_i) = \mathrm{MLP}_M([u,v,y_i])$ (17)
where $M_i(u,v;y_i)$ represents a parameterized patch manifold;
Step 5.2, the patch manifold $M_i$ corresponding to a point $p_i$ in the adaptively down-sampled set $\hat{P}_s$ is defined as:
$M_i(u,v;y_i) = p_i + M(u,v;y_i)$ (18)
Formula (18) moves the constructed manifold $M(u,v;y_i)$ to a local surface centered at $p_i$; the patch manifolds corresponding to all points in $\hat{P}_s$ can be expressed as $\{M_i\}_{i=1}^{M}$, i.e. they characterize the potential surface of the point cloud;
Step 5.3, the adaptive down-sampling of steps 5.1 and 5.2 halves the number of input points, i.e. $M = N/2$; each patch manifold $M_i(u,v;y_i)$ is resampled, and two points are collected on each patch manifold to obtain the denoised point cloud $\hat{P}$, expressed as:
$\hat{P} = \bigcup_{i=1}^{M} \{\, p_i + M(u_t, v_t; y_i) \mid t = 1, 2 \,\}$ (19)
further, the loss function loss in the step (6) includes a loss function Las and a loss function Lus;
the loss function Las is used for quantizing the adaptive down-sampling set
Figure BDA0003851057360000062
Distance from ground route point cloud Pgt due to
Figure BDA0003851057360000063
And Pgt contains a different number of dots, and
Figure BDA0003851057360000064
therefore, the chamfer distance CD is selected as L as Expressed as:
Figure BDA0003851057360000065
the loss function Lus is used to quantify the final reconstructed point cloud
Figure BDA0003851057360000066
The distance from the ground channel Pgt, using the earth movement distance EMD as Lus, is expressed as:
Figure BDA0003851057360000067
in the formula (I), the compound is shown in the specification,
Figure BDA0003851057360000068
phi is bijective;
finally, the network is trained end-to-end with supervision, and the minimum total loss function is expressed as:
L denoise =λL as +(1-λ)L us (22)
in the formula, λ is an empirical value of 0.01.
Compared with the prior art, the invention has the following technical effects:
based on the high performance of a Transformer in natural language processing and the sequence independence of all operations, the method is improved and is very suitable for point cloud feature learning, the problem that a relative attention RA module in the improved Transformer is very sensitive to abnormal points in a widely used FPS (remote data system) which is a farthest point sampling method, so that point cloud data in the real world are very unstable when being processed, and the problem that the sampling points from the FPS are required to be subsets of original point clouds, so that the inference of the original data and original geometric information is increased, basic point cloud information can be extracted in a self-adaptive mode, a more valuable basis is provided for smooth development of subsequent work, and meanwhile, an attention mechanism and global pooling operation are used in the feature extraction process, so that not only global information can be extracted, but also the integrity of local detailed information can be kept. 
Specifically, abundant high-dimensional characteristics of a point cloud sample of Qin terracotta warrior data are obtained by utilizing an improved Transformer structure, potential manifold of noise point cloud is learned from sampling points, namely, the sampling points obtained by an FPS are self-adapted through a self-adaptive down-sampling module, and are closer to the surface where the points are located; and then converting each sampling point and the embedded neighborhood characteristics thereof into a local surface to infer the surface manifold, reconstructing a clean point cloud capturing an internal structure by sampling on each surface manifold without being influenced by abnormal values, recovering the surface to reconstruct the manifold, realizing denoising, having good robustness under synthetic noise and real noise, and having good propulsion effect on the virtual recovery work of the computer-aided Qin warriors.
Drawings
FIG. 1: structure of the point cloud denoising network based on the improved Transformer;
FIG. 2: qualitative comparison of different denoising methods;
FIG. 3: schematic diagram of the adaptive down-sampling principle;
FIG. 4: structure of the relative attention module RA; the dashed box shows the structure of the self-attention module SA;
FIG. 5: schematic diagram of patch manifold reconstruction and resampling.
Detailed Description
The present invention will be explained in further detail with reference to examples.
As shown in fig. 1, a method for denoising Qin tomb terracotta warrior fragments based on an improved Transformer includes the following steps:
Step 1, preprocess the point cloud samples of the terracotta warrior data: realize data enhancement through rotation, translation and scaling, and label the data;
Step 2, use all the preprocessed point cloud samples as the training set, import them in batches into the input embedding module, and map them to a high-dimensional space;
Step 3, import the high-dimensional point cloud into the adaptive down-sampling module in the improved Transformer encoder, so as to reduce the data volume while keeping the structural attributes of the original point cloud model as much as possible;
The specific process is shown in fig. 3: first, the farthest point sampling algorithm FPS is used to obtain relatively uniform points as the original sampling points, and then the adaptive neighborhood sampling algorithm AS automatically learns the offset of each sampling point and updates its position information;
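The FPS step above can be sketched as a greedy loop that repeatedly picks the point farthest from the set already chosen. This is a generic FPS implementation under assumed array shapes, not code from the patent:

```python
import numpy as np

def farthest_point_sampling(points, n_samples):
    """Greedy farthest point sampling.

    points: (N, 3) point cloud; returns indices of n_samples points that
    cover the cloud relatively uniformly."""
    N = points.shape[0]
    chosen = [0]                                   # start from an arbitrary point
    dist = np.full(N, np.inf)                      # distance to nearest chosen point
    for _ in range(n_samples - 1):
        d = np.linalg.norm(points - points[chosen[-1]], axis=1)
        dist = np.minimum(dist, d)                 # update nearest-chosen distances
        chosen.append(int(np.argmax(dist)))        # pick the farthest remaining point
    return np.array(chosen)

rng = np.random.default_rng(1)
pts = rng.uniform(size=(1024, 3))
idx = farthest_point_sampling(pts, 256)
```

The distance of every already-chosen point to the chosen set is zero, so the argmax never re-selects a point for distinct inputs; the AS module then refines the positions of these seeds.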
The adaptive neighborhood sampling algorithm AS specifically comprises the following steps:
Step 3.1, let $P_s$ be a set of $N_s$ points sampled from the $N$ input points, with $N_s < N$; $x_i \in P_s$ is a sampling point and $f_i \in F_s$ is the feature of $x_i$, with corresponding feature dimension $D$. The neighbors of sampling point $x_i$ are queried by k-NN and grouped, and a general self-attention mechanism is used to update the features;
Step 3.2, the features $f_{i,1},\ldots,f_{i,k}$ corresponding to the $k$ neighbors $x_{i,1},\ldots,x_{i,k}$ of sampling point $x_i$ are updated as:
$\tilde{f}_i = \mathcal{A}\big(R(x_i, x_{i,j})\,\gamma(f_{i,j})\big),\ j \in [1,k]$ (1)
where $\mathcal{A}$ is used for aggregation, $R$ describes the high-level relation between the sampling point $x_i$ and the neighbor point $x_{i,j}$, and $\gamma$ changes the feature dimension of each neighbor point; to reduce the amount of computation, let $\gamma(x_{i,j}) = W_\gamma f_{i,j}$. The relation function $R$ is expressed as:
$R(x_i, x_{i,j}) = \mathrm{Softmax}\big((W_\phi f_i)^{T}(W_\psi f_{i,j})/\sqrt{D'}\big)$ (2)
where $D'$ is the output channel of the Conv;
Step 3.3, for each sampling point $x_i$, MLP + Softmax is used to obtain the normalized weights $W_p$ and $W_f$ of the coordinates and feature channels of each point in the group, expressed as:
$W_p = \mathrm{Softmax}(\mathrm{mlp}(x_{i,j}))$ (3)
$W_f = \mathrm{Softmax}(\mathrm{mlp}(f_{i,j}))$ (4)
in formulas (3) and (4), $j \in [1,k]$;
Step 3.4, the adaptive update of the sampling point $x_i$ and its feature $f_i$ is realized by a weighted summation, i.e. the updated point information is expressed as:
$x_i^{\ast} = \sum_{j=1}^{k} W_{p,j}\, x_{i,j}$ (5)
$f_i^{\ast} = \sum_{j=1}^{k} W_{f,j}\, f_{i,j}$ (6)
The adaptive down-sampling operation obtains points closer to the potential surface, i.e. points less disturbed by noise, which helps reduce the search space when reconstructing the potential manifold;
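The weighted-sum update of Eqs. (3)-(6) can be sketched for a single sampling point and its k-NN group. The two per-point score networks below are stand-ins (random linear maps) for the trained MLPs, and all names and shapes are illustrative:

```python
import numpy as np

def softmax(z, axis=0):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_update(x_nbrs, f_nbrs, mlp_p, mlp_f):
    """Adaptive update of one sampling point (Eqs. (3)-(6), sketched).

    x_nbrs: (k, 3) coordinates of the k-NN group of a sampling point;
    f_nbrs: (k, D) their features; mlp_p / mlp_f: placeholder score
    networks returning one score per neighbor."""
    W_p = softmax(mlp_p(x_nbrs), axis=0)          # (k, 1) normalized coordinate weights
    W_f = softmax(mlp_f(f_nbrs), axis=0)          # (k, 1) normalized feature weights
    x_new = (W_p * x_nbrs).sum(axis=0)            # Eq. (5): weighted-sum coordinate
    f_new = (W_f * f_nbrs).sum(axis=0)            # Eq. (6): weighted-sum feature
    return x_new, f_new

rng = np.random.default_rng(2)
k, D = 16, 32
x_nbrs, f_nbrs = rng.normal(size=(k, 3)), rng.normal(size=(k, D))
Wp, Wf = rng.normal(size=(3, 1)) * 0.1, rng.normal(size=(D, 1)) * 0.1
x_new, f_new = adaptive_update(x_nbrs, f_nbrs, lambda x: x @ Wp, lambda f: f @ Wf)
```

Because the Softmax weights are non-negative and sum to one over the group, the updated coordinate is a convex combination of the neighbors, i.e. it stays inside their bounding region, which is what pulls each sampling point toward the underlying surface.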
Step 4, import the down-sampled result into the improved Transformer encoder module, and enhance the point cloud features through its relative attention RA module so as to extract features effectively;
Attention screens a small amount of important information out of a large amount of information and focuses on it, ignoring most of the unimportant information; the larger the weight, the more the corresponding Value is attended to. Self-attention (SA) in the original Transformer is a mechanism that attends to the other words of an input sentence while encoding each word; the architecture of the SA layer is shown in the dashed box of fig. 4. When switching to point data, Q, K and V are, by convention, the Query, Key and Value matrices generated by linear transformation of the input features. A weighting coefficient is first calculated from Query and Key, and Value is then summed with these weights. The most common ways to obtain the weighting coefficient are the vector dot product of the two, their cosine similarity, or an additional neural network; this embodiment uses the vector dot product. To prevent the result from becoming too large, it is divided by the scale $\sqrt{d_k}$, where $d_k$ is the dimension of the query and key vectors; Softmax then normalizes the result into a probability distribution, which is multiplied by the Value matrix to obtain the weighted-sum representation, i.e. the self-attention SA module feature is:
$F_{sa} = \mathrm{Softmax}\big(QK^{T}/\sqrt{d_k}\big)V$ (9)
where $(Q,K,V) = F_{in}\cdot(W_q,W_k,W_v)$, and $W_q$, $W_k$ and $W_v$ are shared learnable weight matrices.
As can be seen from the calculation of formula (9), the whole self-attention process is permutation invariant, which makes it well suited to the disorder and irregularity of point clouds. However, the absolute coordinates of the same point cloud after a rigid transformation differ greatly from those before the transformation. To describe the intrinsic characteristics of the point cloud, this embodiment introduces the relative attention feature, inspired by the use of the Laplacian matrix $L = D - A$ in place of the adjacency matrix $A$ in graph convolution networks, where $D$ is a diagonal matrix and each diagonal element $D_{ii}$ represents the degree of the $i$-th node. The self-attention SA module in the original Transformer is replaced by a relative attention RA module to enhance the feature representation of the point cloud. As shown in fig. 4, the RA module calculates the relative attention feature between the self-attention SA feature and the input feature, expressed as:
$F_{ra} = F_{in} - F_{sa}$ (7)
where $F_{ra}$ is the relative attention feature, $F_{in}$ is the input feature, and $F_{sa}$ is the self-attention SA module feature;
Finally, the relative attention feature passes through the network and, together with the input feature, forms the final output feature $F_{out}$ of the whole RA module:
$F_{out} = \mathrm{RA}(F_{in}) = \mathrm{relu}(\mathrm{bn}(\mathrm{mlp}(F_{ra}))) + F_{in}$ (8)
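Eqs. (7)-(8) compose into a short residual block. The sketch below omits batch normalization, uses a single-layer MLP, and assumes all weight shapes for illustration; it is not the patent's trained module:

```python
import numpy as np

def relative_attention(F_in, W_q, W_k, W_v, W_mlp):
    """Relative attention block F_out = relu(mlp(F_in - F_sa)) + F_in
    (Eqs. (7)-(8), sketched without batch norm)."""
    Q, K, V = F_in @ W_q, F_in @ W_k, F_in @ W_v
    s = Q @ K.T / np.sqrt(K.shape[-1])
    a = np.exp(s - s.max(axis=-1, keepdims=True))
    a /= a.sum(axis=-1, keepdims=True)
    F_sa = a @ V                                  # self-attention feature (Eq. (9))
    F_ra = F_in - F_sa                            # relative attention feature (Eq. (7))
    return np.maximum(F_ra @ W_mlp, 0.0) + F_in   # relu(mlp(F_ra)) + residual (Eq. (8))

rng = np.random.default_rng(3)
F_in = rng.normal(size=(10, 8))
W_q, W_k, W_v, W_mlp = (rng.normal(size=(8, 8)) * 0.1 for _ in range(4))
F_out = relative_attention(F_in, W_q, W_k, W_v, W_mlp)
```

The residual connection keeps the input feature intact while the relu(mlp(·)) branch adds the Laplacian-like difference signal.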
$F_{in} - F_{sa}$ is similar to the discrete Laplacian operator. In a graph $G$ with $N$ nodes, the $N$-dimensional vector $f$ is represented as:
$f = (f_1, f_2, \ldots, f_N)$ (10)
where $f_i$ is the value of the function $f$ at node $i$;
Node $i$ may be perturbed into any node $j$ adjacent to it; since the Laplacian operator calculates the gain of a tiny disturbance of a point in all degrees of freedom, the gain caused by changing any adjacent node $j$ to node $i$ is, on a graph:
$\Delta f_i = \sum_{j \in N_i} (f_i - f_j)$ (11)
When edge $E_{ij}$ has a weight $w_{ij}$:
$\Delta f_i = \sum_{j \in N_i} w_{ij}(f_i - f_j)$ (12)
When $w_{ij} = 0$, node $i$ and node $j$ are not adjacent, so the sum can be extended to all nodes:
$\Delta f_i = \sum_{j=1}^{N} w_{ij}(f_i - f_j)$ (13)
Finally:
$\Delta f_i = d_i f_i - w_{i:} f$ (14)
where $d_i = \sum_j w_{ij}$ is the degree of vertex $i$; $w_{i:} = (w_{i1}, \ldots, w_{iN})$ is an $N$-dimensional row vector; $f$ is an $N$-dimensional column vector; and $w_{i:} f$ denotes the inner product of the two vectors. The gain accumulated over all $N$ nodes is expressed as:
$\Delta f = (\Delta f_1, \ldots, \Delta f_N)^{T} = Df - Wf = (D - W)f = Lf$ (15)
where $w_{i:}$ represents the weight corresponding to the $i$-th point, $W$ represents the weights corresponding to all points, and $D - W$ is the Laplacian matrix $L$.
The $i$-th row of the Laplacian matrix actually reflects the accumulated gain at the $i$-th node when it perturbs all other nodes. Intuitively, the graph Laplacian describes how a potential applied to node $i$ flows smoothly to the other nodes, which supervises and guides the iterative optimization of the model. Relative attention increases the attention weighting and reduces the effect of noise, which helps downstream tasks.
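The identity derived in Eqs. (10)-(15) is easy to verify numerically: for a symmetric non-negative weight matrix, the per-node gains $\sum_j w_{ij}(f_i - f_j)$ coincide with $(D - W)f$. The matrix size and random values below are illustrative only:

```python
import numpy as np

# Verify Eqs. (10)-(15): the per-node perturbation gains equal L f with L = D - W.
rng = np.random.default_rng(4)
N = 6
W = rng.uniform(size=(N, N))
W = (W + W.T) / 2                   # symmetric edge weights
np.fill_diagonal(W, 0.0)            # no self-loops
D = np.diag(W.sum(axis=1))          # degree matrix, d_i = sum_j w_ij
L = D - W                           # graph Laplacian
f = rng.normal(size=N)

# Gain at node i from perturbing toward every node j: sum_j w_ij (f_i - f_j)
gain = np.array([sum(W[i, j] * (f[i] - f[j]) for j in range(N)) for i in range(N)])
print(np.allclose(gain, L @ f))     # prints: True
```

The quadratic form $f^{T} L f = \frac{1}{2}\sum_{i,j} w_{ij}(f_i - f_j)^2$ is non-negative for non-negative weights, which is why the Laplacian acts as a smoothness measure.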
Step 5, taking the output of an improved transform decoder as a basis, reconstructing the manifold of each point, sampling the manifold structure corresponding to each point in proportion, and selecting the point closer to the clean point cloud by using a self-adaptive sampling method to reconstruct the three-dimensional surface;
after the decoder acquires the high-dimensional characteristic representation of the point cloud, the decoder can be used for processing a denoising task, the previous denoising task mostly depends on the idea that points are displaced from potential surfaces, but the points are not specified for restoring the surfaces, which may result in suboptimal denoising effect, the point cloud generally represents some potential surfaces or 2D manifolds of a set of sampling points, and in order to achieve the robustness of the denoising effect, the present embodiment can learn the potential manifolds of the noise point cloud, capture the inherent structures of the original point cloud, and reconstruct and restore the surfaces, and the process is as shown in (b) the decoder part in fig. 1;
the decoder transforms the embedded features of each sample point and its neighborhood into a local surface centered on the point to infer a potential manifold, and then samples the inferred manifold multiple times to produce a set of noise reduction points
Figure BDA0003851057360000111
I.e. a clean point cloud is reconstructed
Figure BDA0003851057360000112
The whole process is shown in fig. 5, and specifically comprises the following steps:
Step 5.1, first define the 2D manifold $M$ embedded in 3D space, parameterized by some feature vector $y$, as:
$M(u,v;y): [-1,1]\times[-1,1] \rightarrow \mathbb{R}^{3}$ (16)
where $(u,v)$ is a point in the 2D rectangular region $[-1,1]^{2}$;
Formula (16) maps the 2D rectangle to a patch manifold of arbitrary shape parameterized by $y$. The parameterized patch manifold $M_i(u,v;y_i)$ is realized by an MLP, because the MLP is a universal function approximator expressive enough to approximate a manifold of arbitrary shape, expressed as:
$M_i(u,v;y_i) = \mathrm{MLP}_M([u,v,y_i])$ (17)
where $M_i(u,v;y_i)$ represents a parameterized patch manifold;
Step 5.2, with the manifold $M$ defined, the patch manifold $M_i$ corresponding to a point $p_i$ in the adaptively down-sampled set $\hat{P}_s$ is defined as:
$M_i(u,v;y_i) = p_i + M(u,v;y_i)$ (18)
Formula (18) moves the constructed manifold $M(u,v;y_i)$ to a local surface centered at $p_i$; the patch manifolds corresponding to all points in $\hat{P}_s$ can be expressed as $\{M_i\}_{i=1}^{M}$, i.e. they characterize the underlying surface of the point cloud;
Step 5.3, the adaptive down-sampling of steps 5.1 and 5.2 halves the number of input points, i.e. $M = N/2$; each patch manifold $M_i(u,v;y_i)$ is resampled, and two points are sampled on each patch manifold to obtain the denoised point cloud $\hat{P}$, expressed as:
$\hat{P} = \bigcup_{i=1}^{M} \{\, p_i + M(u_t, v_t; y_i) \mid t = 1, 2 \,\}$ (19)
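The patch-manifold decoding and resampling of Eqs. (17)-(19) can be sketched with a small two-layer MLP standing in for the trained manifold decoder MLP_M. All shapes, weights and names here are illustrative assumptions:

```python
import numpy as np

def reconstruct_patches(p, y, W1, b1, W2, b2, samples_per_patch=2, rng=None):
    """Sample each patch manifold M_i(u, v; y_i) = p_i + MLP([u, v, y_i])
    (Eqs. (17)-(19), sketched).

    p: (M, 3) down-sampled points; y: (M, C) their feature vectors;
    W1/b1/W2/b2: stand-in weights for the two-layer manifold decoder."""
    rng = rng or np.random.default_rng()
    out = []
    for i in range(p.shape[0]):
        for _ in range(samples_per_patch):
            u, v = rng.uniform(-1.0, 1.0, size=2)       # (u, v) in [-1, 1]^2
            h = np.maximum(np.concatenate(([u, v], y[i])) @ W1 + b1, 0.0)
            out.append(p[i] + h @ W2 + b2)              # local patch around p_i
    return np.array(out)

rng = np.random.default_rng(5)
M, C, H = 64, 16, 32
p, y = rng.normal(size=(M, 3)), rng.normal(size=(M, C))
W1, b1 = rng.normal(size=(2 + C, H)) * 0.1, np.zeros(H)
W2, b2 = rng.normal(size=(H, 3)) * 0.1, np.zeros(3)
cloud = reconstruct_patches(p, y, W1, b1, W2, b2, rng=rng)
```

Sampling two (u, v) parameters per patch restores the original point count N = 2M, as step 5.3 requires.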
step 6, iteratively train steps 3 to 5 on the imported data with the improved Transformer encoder-decoder framework until the loss value of the loss function becomes small and stabilizes, obtaining the denoised clean point cloud.
To measure the reconstruction quality of the final point cloud, the loss function Loss consists of two parts: 1) the loss function L_as quantifies the distance between the adaptively downsampled set P̃ and the ground truth point cloud P_gt; 2) the loss function L_us quantifies the distance between the final reconstructed point cloud P̂ and the ground truth P_gt.
Since P̃ and P_gt contain different numbers of points (|P̃| = N/2 while |P_gt| = N), the chamfer distance CD is chosen as L_as, expressed as:
L_as = CD(P̃, P_gt) = (1/|P̃|) Σ_{x∈P̃} min_{y∈P_gt} ‖x − y‖₂ + (1/|P_gt|) Σ_{y∈P_gt} min_{x∈P̃} ‖x − y‖₂ (20)
The Earth Mover's Distance (EMD) is used as L_us to measure the distance between the denoised point cloud P̂ and the ground truth point cloud P_gt, expressed as:
L_us = EMD(P̂, P_gt) = min_{φ: P̂→P_gt} (1/|P̂|) Σ_{x∈P̂} ‖x − φ(x)‖₂ (21)
where φ: P̂ → P_gt is a bijection;
finally, the network is trained end-to-end with supervision, and the minimized total loss function is expressed as:
L_denoise = λL_as + (1 − λ)L_us (22)
where λ is empirically set to 0.01.
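A minimal sketch of the two loss terms and equation (22), assuming NumPy and SciPy are available; the Hungarian assignment in `linear_sum_assignment` supplies one concrete choice of the bijection φ for the EMD. This is an illustration of the loss definitions, not the patent's training code.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def chamfer_distance(a, b):
    """Eq. (20): symmetric chamfer distance between point sets a and b,
    which may contain different numbers of points."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def earth_mover_distance(a, b):
    """Eq. (21): EMD under the optimal bijection phi (Hungarian
    assignment); a and b must have the same number of points."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(d)
    return d[rows, cols].mean()

def denoise_loss(p_ds, p_hat, p_gt, lam=0.01):
    """Eq. (22): L_denoise = lam * L_as + (1 - lam) * L_us."""
    return lam * chamfer_distance(p_ds, p_gt) + \
        (1 - lam) * earth_mover_distance(p_hat, p_gt)

p_gt = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]])
print(denoise_loss(p_gt, p_gt, p_gt))  # 0.0 for a perfect reconstruction
```

Note that CD tolerates the size mismatch between the half-size downsampled set and the ground truth, while EMD requires equal cardinality, which is why CD is used for L_as and EMD for L_us.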
The Terracotta Warriors dataset consists of models scanned and collected on site at the warrior museum by researchers of the National and Local Joint Engineering Research Center for Cultural Heritage Digitization at Northwest University; more than 500 existing models have been accurately annotated. Fig. 2 qualitatively compares the denoising effect of different methods on the head and hand datasets of the terracotta warriors; compared with the three other deep-learning-based methods NPD, TlDn and PCNet, the proposed denoising method is more robust to outliers and produces cleaner results.
In this embodiment, the Transformer-based point cloud denoising network serves as the feature extractor and has a stronger grasp of point cloud structure and semantics. Compared with the three other denoising methods, the advantage of the proposed method grows as the noise level increases; as shown in Table 1, it outperforms previous deep learning methods and is more robust to high noise.
TABLE 1 Comparison of CD (chamfer distance) for each denoising method at different noise ratios

Method            0.25%   0.5%    1%      2%      3%
NPD               0.24    0.62    1.28    2.32    3.27
PCNet             0.18    0.46    0.97    1.42    2.91
TlDn              0.34    0.78    1.15    2.26    3.12
TDNet-RA (ours)   0.16    0.39    0.83    1.20    2.15

Claims (9)

1. A method for denoising Qin Terracotta Warrior fragments based on an improved Transformer, characterized by comprising the following steps:
step 1, preprocessing the Terracotta Warrior point cloud samples to perform data enhancement and annotation;
step 2, taking all preprocessed point cloud samples as the training set, importing them in batches into the input embedding module and mapping them to a high-dimensional space;
step 3, importing the high-dimensional point cloud into the adaptive down-sampling module in the improved Transformer encoder: first obtaining relatively uniform points as initial sampling points with the farthest point sampling algorithm (FPS), then automatically learning an offset for each sampling point with the adaptive neighborhood sampling algorithm (AS) and updating the position information of the sampling points, thereby reducing the data volume while retaining the structural attributes of the original point cloud model;
step 4, importing the down-sampled result into the improved Transformer encoder module and enhancing the point cloud features through the point cloud's relative attention (RA) module so as to extract features effectively;
step 5, reconstructing the manifold of each point based on the output of the improved Transformer decoder, sampling the manifold structure corresponding to each point in proportion, and selecting points closer to the clean point cloud to reconstruct the three-dimensional surface;
step 6, iteratively training steps 3 to 5 on the imported data with the improved Transformer encoder-decoder framework until the loss value of the loss function becomes small and stabilizes, obtaining the denoised clean point cloud.
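The FPS step named in step 3 admits a short greedy sketch: repeatedly pick the point farthest from everything already selected. The function below is an illustrative NumPy implementation under that standard definition, not the patent's code.

```python
import numpy as np

def farthest_point_sampling(points, n_samples, seed_idx=0):
    """Greedy FPS: at each round, select the point whose distance to the
    nearest already-selected point is largest, giving relatively uniform
    coverage of the cloud."""
    selected = [seed_idx]
    # distance from every point to its nearest selected point so far
    dist = np.linalg.norm(points - points[seed_idx], axis=1)
    for _ in range(n_samples - 1):
        nxt = int(dist.argmax())           # farthest remaining point
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(selected)

pts = np.array([[0.0, 0], [0.1, 0], [1, 0], [1, 1]])
print(farthest_point_sampling(pts, 3))  # [0 3 2]
```

Starting from point 0, FPS skips the nearby point 1 and reaches the far corners first, which is exactly the "relatively uniform points" behavior the claim relies on before the adaptive offsets are learned.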
2. The improved transform-based Qin warriors fragment denoising method according to claim 1, wherein the data enhancement of step 1 comprises rotating, translating and scaling the data.
3. The improved Transformer-based Qin Terracotta Warrior fragment denoising method according to claim 1, wherein the adaptive neighborhood sampling algorithm AS of step 3 comprises the following steps:
step 3.1, let P_s be a set of N_s points sampled from the N input points, with N_s < N; let x_i ∈ P_s be a sampling point and f_i ∈ F_s its feature; the neighbors of x_i are grouped by a k-NN query and the features are updated with a general self-attention mechanism;
step 3.2, the k neighbors x_{i,1},...,x_{i,k} of sampling point x_i, with corresponding features f_{i,1},...,f_{i,k}, are aggregated as:
f̃_i = A(R(x_i, x_{i,j}) · γ(f_{i,j}), ∀j ∈ [1,k]) (1)
where A aggregates the neighborhood features, R describes the high-level relation between the sampling point x_i and neighbor point x_{i,j}, and γ changes the feature dimension of each neighbor point; to reduce computation, γ(x_{i,j}) = W_γ f_{i,j}; the relation function R is expressed as:
R(x_i, x_{i,j}) = Softmax((W_θ f_i)ᵀ(W_φ f_{i,j})/√D′) (2)
where D′ is the output channel of the Conv layer;
step 3.3, for each sampling point x_i, the normalized weights W_p and W_f over the coordinates and feature channels of each point in the group are obtained with an MLP followed by Softmax, expressed as:
W_p = Softmax(MLP(x_{i,j})) (3)
W_f = Softmax(MLP(f_{i,j})) (4)
where j ∈ [1, k] in equations (3) and (4);
step 3.4, the adaptive update of sampling point x_i and its feature f_i, yielding x̂_i and f̂_i, is realized by weighted summation, i.e., the updated point information is expressed as:
x̂_i = Σ_{j=1}^{k} W_p · x_{i,j} (5)
f̂_i = Σ_{j=1}^{k} W_f · f_{i,j} (6)
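The weighted update of equations (3) to (6) can be sketched as follows; the single-layer linear maps `Wp` and `Wf` are illustrative stand-ins for the MLPs, and all weights are random, so this only demonstrates the mechanics of the softmax-weighted re-estimation.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_update(x_nbrs, f_nbrs, Wp, Wf):
    """Eqs. (3)-(6): per-neighbor scores are normalized with softmax over
    the k neighbors, then the sampling point and its feature are
    re-estimated as weighted sums of the neighborhood."""
    w_p = softmax(x_nbrs @ Wp, axis=0)   # Eq. (3): weights over neighbors
    w_f = softmax(f_nbrs @ Wf, axis=0)   # Eq. (4)
    x_new = (w_p * x_nbrs).sum(axis=0)   # Eq. (5): updated coordinates
    f_new = (w_f * f_nbrs).sum(axis=0)   # Eq. (6): updated feature
    return x_new, f_new

k, D = 4, 8                              # k neighbors, feature dimension D
x_nbrs = rng.normal(size=(k, 3))         # k-NN coordinates of one sampling point
f_nbrs = rng.normal(size=(k, D))         # and their features
x_new, f_new = adaptive_update(x_nbrs, f_nbrs,
                               rng.normal(size=(3, 3)), rng.normal(size=(D, D)))
print(x_new.shape, f_new.shape)          # (3,) (8,)
```

Because the softmax weights sum to one over the neighbors, the updated point is a convex combination of its neighborhood per channel, which is how the sampling point drifts toward the underlying surface.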
4. The improved Transformer-based Qin Terracotta Warrior fragment denoising method according to claim 1, wherein the relative attention RA module of the point cloud in step 4 computes the relative attention feature between the self-attention SA module feature and the input feature, expressed as:
F_ra = F_in − F_sa (7)
where F_ra is the relative attention feature, F_in the input feature, and F_sa the self-attention SA module feature;
finally, the relative attention feature F_ra and the network input feature F_in give the final output feature F_out of the point cloud's relative attention RA module, expressed as:
F_out = RA(F_in) = ReLU(BN(MLP(F_ra))) + F_in. (8)
5. The improved Transformer-based Terracotta Warrior fragment denoising method according to claim 4, wherein the self-attention SA module feature is expressed as:
F_sa = Softmax(QKᵀ/√d_k)V (9)
where (Q, K, V) = F_in · (W_q, W_k, W_v); W_q, W_k and W_v are shared learnable weight matrices; Q, K and V are the Query, Key and Value matrices generated by linear transformation of the input features; and d_k is the dimension of the query and key vectors.
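Claims 4 and 5 can be sketched in plain NumPy; the mlp/bn/relu stack of equation (8) is reduced here to a single linear map plus ReLU (batch normalization omitted), and all weight matrices are random illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(F_in, Wq, Wk, Wv):
    """Eq. (9): scaled dot-product self-attention over the N points."""
    Q, K, V = F_in @ Wq, F_in @ Wk, F_in @ Wv
    d_k = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def relative_attention(F_in, Wq, Wk, Wv, W_out):
    """Eqs. (7)-(8): F_ra = F_in - F_sa, then a linear map + ReLU
    (standing in for mlp/bn) with a residual connection back to F_in."""
    F_ra = F_in - self_attention(F_in, Wq, Wk, Wv)   # Eq. (7)
    return np.maximum(F_ra @ W_out, 0.0) + F_in      # Eq. (8)

N, D = 6, 16
F_in = rng.normal(size=(N, D))
Ws = [rng.normal(size=(D, D)) * 0.1 for _ in range(4)]
F_out = relative_attention(F_in, *Ws)
print(F_out.shape)  # (6, 16)
```

Subtracting the smoothed (attention-averaged) feature from the input, as in equation (7), is what gives the module its Laplacian-like, detail-emphasizing behavior that claim 6 elaborates.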
6. The improved Transformer-based Qin Terracotta Warrior fragment denoising method according to claim 4, wherein the relative attention feature of equation (7) is a discrete Laplacian; in a graph G with N nodes, an N-dimensional vector f is expressed as:
f = (f_1, f_2, ..., f_N) (10)
where f_i is the value of the function f at node i;
perturbing node i toward any adjacent node j, the gain caused by the change from node j to node i is expressed as:
Δf_{ij} = f_i − f_j (11)
when edge E_ij carries a weight w_ij:
Δf_{ij} = w_ij(f_i − f_j) (12)
when w_ij = 0, nodes i and j are not adjacent, so the corresponding term vanishes and the gain at node i can be summed over all nodes:
Δf_i = Σ_{j∈N} w_ij(f_i − f_j) (13)
finally:
Δf_i = Σ_{j∈N} w_ij f_i − Σ_{j∈N} w_ij f_j = d_i f_i − w_i: f (14)
where d_i = Σ_{j∈N} w_ij is the degree of vertex i; w_i: = (w_i1, ..., w_iN) is an N-dimensional row vector; f = (f_1, ..., f_N)ᵀ is an N-dimensional column vector; and w_i: f denotes the inner product of the two vectors; accumulating the gain over all N nodes gives:
Δf = (Δf_1, ..., Δf_N)ᵀ = diag(d_1, ..., d_N)f − Wf = (D − W)f = Lf (15)
where w_i: gives the weights associated with the ith point, W collects the weights of all points, and D − W is the Laplacian matrix L.
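The derivation in claim 6 can be checked numerically: accumulating the per-node gain of equation (14) over all nodes reproduces (D − W)f = Lf of equation (15). A small NumPy check on a 3-node example graph:

```python
import numpy as np

# Symmetric adjacency weights w_ij of a 3-node graph and a node function f.
W = np.array([[0.0, 1, 2],
              [1, 0, 0],
              [2, 0, 0]])
f = np.array([3.0, -1.0, 2.0])

# Eq. (14): gain at each node i is sum_j w_ij * (f_i - f_j).
gains = np.array([sum(W[i, j] * (f[i] - f[j]) for j in range(3))
                  for i in range(3)])

D = np.diag(W.sum(axis=1))   # degree matrix, d_i = sum_j w_ij
L = D - W                    # Laplacian matrix of Eq. (15)
print(gains, L @ f)          # both equal [ 6. -4. -2.]
```

The two vectors coincide, confirming that the elementwise "input minus neighborhood-weighted average" operation of equation (7) acts on the point features like a graph Laplacian.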
7. The improved Transformer-based Qin Terracotta Warrior fragment denoising method according to claim 1, wherein the decoder in step 5 transforms the embedded features of each sampling point and its neighborhood into a local surface centered on that point to infer the potential manifold, and then resamples the inferred manifold to generate a set of denoised points, i.e., to reconstruct a clean point cloud P̂.
8. The improved Transformer-based Terracotta Warrior fragment denoising method according to claim 7, wherein the step 5 comprises the following steps:
step 5.1, first define the 2D manifold M embedded in 3D space, parameterized by the feature vector y, as:
M(u,v;y): [-1,1]×[-1,1] → R³ (16)
where (u,v) is a point in the 2D rectangular region [-1,1]²;
equation (16) maps the 2D rectangle, via the function approximator MLP, to a patch manifold of arbitrary shape parameterized by y, expressed as:
M_i(u,v;y_i) = MLP_M([u,v,y_i]) (17)
where M_i(u,v;y_i) represents the parameterized patch manifold;
step 5.2, the patch manifold M_i corresponding to a point p_i in the adaptively downsampled set P̃ is defined as:
M_i(u,v;y_i) = p_i + M(u,v;y_i) (18)
equation (18) moves the constructed manifold M(u,v;y_i) to a local surface centered at p_i; the patch manifolds corresponding to all points in the set, {M_i | i = 1,...,M}, characterize the underlying surface of the point cloud;
step 5.3, steps 5.1 and 5.2 halve the number of input points during adaptive downsampling, i.e., M = N/2; each patch manifold M_i(u,v;y_i) is resampled, taking two points on each patch manifold, to obtain the denoised point cloud P̂, expressed as:
P̂ = ∪_{i=1}^{M} {M_i(u_j,v_j;y_i) | j = 1,2} (19)
9. The improved Transformer-based Terracotta Warrior fragment denoising method according to claim 1, wherein the loss function Loss in step 6 comprises a loss function L_as and a loss function L_us;
the loss function L_as quantifies the distance between the adaptively downsampled set P̃ and the ground truth point cloud P_gt; since P̃ and P_gt contain different numbers of points (|P̃| = N/2 while |P_gt| = N), the chamfer distance CD is chosen as L_as, expressed as:
L_as = CD(P̃, P_gt) = (1/|P̃|) Σ_{x∈P̃} min_{y∈P_gt} ‖x − y‖₂ + (1/|P_gt|) Σ_{y∈P_gt} min_{x∈P̃} ‖x − y‖₂ (20)
the loss function L_us quantifies the distance between the final reconstructed point cloud P̂ and the ground truth P_gt, using the Earth Mover's Distance (EMD) as L_us, expressed as:
L_us = EMD(P̂, P_gt) = min_{φ: P̂→P_gt} (1/|P̂|) Σ_{x∈P̂} ‖x − φ(x)‖₂ (21)
where φ: P̂ → P_gt is a bijection;
finally, the network is trained end-to-end with supervision, and the minimized total loss function is expressed as:
L_denoise = λL_as + (1 − λ)L_us (22)
where λ is empirically set to 0.01.
CN202211133859.0A 2022-09-19 2022-09-19 Improved transform-based Qinhong tomb warrior fragment denoising method Pending CN115456900A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211133859.0A CN115456900A (en) 2022-09-19 2022-09-19 Improved transform-based Qinhong tomb warrior fragment denoising method


Publications (1)

Publication Number Publication Date
CN115456900A true CN115456900A (en) 2022-12-09

Family

ID=84304288

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211133859.0A Pending CN115456900A (en) 2022-09-19 2022-09-19 Improved transform-based Qinhong tomb warrior fragment denoising method

Country Status (1)

Country Link
CN (1) CN115456900A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116012266A (en) * 2023-03-29 2023-04-25 中国科学技术大学 Image denoising method, system, equipment and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination