CN114494372A - Remote sensing image registration method based on unsupervised deep learning - Google Patents

Remote sensing image registration method based on unsupervised deep learning Download PDF

Info

Publication number
CN114494372A
CN114494372A (Application CN202210026370.7A)
Authority
CN
China
Prior art keywords
image
scale
model network
corrected
transformation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210026370.7A
Other languages
Chinese (zh)
Other versions
CN114494372B (en
Inventor
叶沅鑫
唐腾峰
朱柏
张家诚
喻智睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Jiaotong University
Original Assignee
Southwest Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Jiaotong University filed Critical Southwest Jiaotong University
Priority to CN202210026370.7A priority Critical patent/CN114494372B/en
Publication of CN114494372A publication Critical patent/CN114494372A/en
Application granted granted Critical
Publication of CN114494372B publication Critical patent/CN114494372B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10032 Satellite or aerial image; Remote sensing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a remote sensing image registration method based on unsupervised deep learning, which converts image registration into a regression optimization problem and can integrate feature extraction networks, image similarity measures and feature descriptors of various forms and parameters. Depth features of the images to be registered are extracted by model networks on multiple scales, geometric transformation parameters are obtained through parameter regression, and the images are geometrically corrected with these parameters, thereby realizing coarse-to-fine multi-scale step-by-step registration. The method needs no registration ground truth as training samples; by constructing loss functions based on similarity measures and feature descriptors between images, the loss functions on multiple scales are jointly trained, the parameters of each model network are updated through back propagation, the geometric transformation parameters are optimized, and high-precision, high-robustness multi-source remote sensing image registration is achieved.

Description

Remote sensing image registration method based on unsupervised deep learning
Technical Field
The invention belongs to the technical field of remote sensing, and particularly relates to a remote sensing image registration method based on unsupervised deep learning.
Background
With the rapid development of aerospace and remote sensing technologies, the means of acquiring remote sensing images are becoming more numerous and more varied. Because different sensors differ in hardware and imaging mechanism, a remote sensing image from a single data source cannot comprehensively reflect the characteristics of ground objects. In order to make full use of multi-source remote sensing data acquired by different types of sensors and to achieve data integration and information complementation, multi-source remote sensing images need to be registered.
Multi-source remote sensing image registration refers to the process of aligning and superimposing multi-sensor remote sensing images of the same area acquired at different times, from different viewing angles or with different sensors, so that corresponding points on the aligned images share the same geographic coordinates. In the prior art, methods for multi-source remote sensing image registration include traditional methods that do not use deep learning and deep-learning-based methods. Traditional methods are based on features or region templates and rely on manually designed features, which usually need to be redesigned when registering remote sensing images of different modalities from different sensors. Deep-learning-based methods extract deep features from the multi-source remote sensing images and generalize better than manual features. At the present stage, supervised deep-learning methods need large numbers of samples with ground-truth labels as training data, but the remote sensing field currently lacks such large-scale training data, and cost factors limit the practical application of these methods.
Disclosure of Invention
The invention aims to solve the problem that existing remote sensing image registration methods based on supervised deep learning require large numbers of training samples that are difficult to obtain, and provides a remote sensing image registration method based on unsupervised deep learning that can achieve accurate registration between remote sensing images without ground-truth training samples.
The technical scheme of the invention is as follows: a remote sensing image registration method based on unsupervised deep learning comprises the following steps:
S1, establishing a multi-source remote sensing image registration data set comprising two sets of image data, wherein the images of the two sets correspond to each other one by one, one set serves as the reference image data set and the other set serves as the image data set to be corrected.
S2, selecting a reference image f from the reference image data set, selecting the image m to be corrected that corresponds to the reference image f from the image data set to be corrected, and taking the reference image f and the image m to be corrected as the end-to-end input of one training sample.
S3, calculating the transformation parameters μ₁, μ₂, μ₃ on the model networks of the 3 scales, gradually correcting the image m to be corrected to generate the corrected images m₁, m₂, m₃, back-propagating the loss function of the model network of each scale, and taking the corrected image m₃ and the transformation parameter μ₃ as the end-to-end output of one training sample.
S4, initializing the model network parameters of the 3 scales respectively.
S5, jointly training the model networks of the 3 scales in an end-to-end manner, and optimizing the joint loss function over the 3 scales.
S6, searching, through a deep learning optimizer, for the direction in which the joint loss function value decreases fastest, back-propagating through the model networks along that direction, iteratively updating the model network parameters, saving the network model parameters when the joint loss function drops below a preset threshold and converges, and outputting the registered reference image f and corrected image m₃.
Further, step S3 includes the following substeps:
S3-1, inputting the reference image f and the image m to be corrected into the model network of the 1st scale to obtain the transformation parameter μ₁ of the 1st scale.
S3-2, performing geometric correction on the image m to be corrected with the transformation parameter μ₁ to generate the corrected image m₁.
S3-3, calculating the loss function of the model network of the 1st scale.
S3-4, inputting the reference image f and the corrected image m₁ into the model network of the 2nd scale to obtain the residual Δμ₁ of the transformation parameter, and combining it with the transformation parameter μ₁ to obtain the transformation parameter μ₂ of the 2nd scale.
S3-5, performing geometric correction on the corrected image m₁ with the transformation parameter μ₂ to generate the corrected image m₂.
S3-6, calculating the loss function of the model network of the 2nd scale.
S3-7, inputting the reference image f and the corrected image m₂ into the model network of the 3rd scale to obtain the residual Δμ₂ of the transformation parameter, and combining it with the transformation parameter μ₂ to obtain the transformation parameter μ₃ of the 3rd scale.
S3-8, performing geometric correction on the corrected image m₂ with the transformation parameter μ₃ to generate the corrected image m₃.
S3-9, calculating the loss function of the model network of the 3rd scale.
S3-10, taking the corrected image m₃ and the transformation parameter μ₃ as the end-to-end output of one training sample.
Further, step S3-1 includes the following substeps:
S3-1-1, down-sampling the reference image f and the image m to be corrected to 1/4 of their original size respectively, and stacking the two down-sampled images in the channel direction to generate a stacked image.
S3-1-2, inputting the stacked image into the feature extraction part of the model network of the 1st scale to generate depth features.
S3-1-3, passing the depth features through the parameter regression part of the model network of the 1st scale to obtain the transformation parameter μ₁ of the 1st scale.
Further, step S3-2 includes the following substeps:
S3-2-1, forming the geometric transformation matrix Tμ₁ from the transformation parameter μ₁.
S3-2-2, performing geometric transformation on the image m to be corrected through the geometric transformation matrix Tμ₁ to generate the corrected image m₁.
Further, step S3-4 includes the following substeps:
S3-4-1, down-sampling the reference image f and the corrected image m₁ to 1/2 of their original size respectively, and stacking the two down-sampled images in the channel direction to generate a stacked image.
S3-4-2, inputting the stacked image into the feature extraction part of the model network of the 2nd scale to generate depth features.
S3-4-3, passing the depth features through the parameter regression part of the model network of the 2nd scale to obtain the residual Δμ₁ of the transformation parameter.
S3-4-4, combining the residual Δμ₁ with the transformation parameter μ₁ to obtain the transformation parameter μ₂ of the 2nd scale.
Further, step S3-5 includes the following substeps:
S3-5-1, forming the geometric transformation matrix Tμ₂ from the transformation parameter μ₂.
S3-5-2, performing geometric transformation on the corrected image m₁ through the geometric transformation matrix Tμ₂ to generate the corrected image m₂.
Further, step S3-7 includes the following substeps:
S3-7-1, stacking the reference image f and the corrected image m₂ in the channel direction to generate a stacked image.
S3-7-2, inputting the stacked image into the feature extraction part of the model network of the 3rd scale to generate depth features.
S3-7-3, passing the depth features through the parameter regression part of the model network of the 3rd scale to obtain the residual Δμ₂ of the transformation parameter.
S3-7-4, combining the residual Δμ₂ with the transformation parameter μ₂ to obtain the transformation parameter μ₃ of the 3rd scale.
Further, step S3-8 includes the following substeps:
S3-8-1, forming the geometric transformation matrix Tμ₃ from the transformation parameter μ₃.
S3-8-2, performing geometric transformation on the corrected image m₂ through the geometric transformation matrix Tμ₃ to generate the corrected image m₃.
Further, the loss function Loss_sim(f, m, μ₁) of the model network of the 1st scale in step S3-3 is:
Loss_sim(f, m, μ₁) = -Sim(Tμ₁⁻¹(f), m)
the loss function Loss_sim(f, m₁, μ₂) of the model network of the 2nd scale in step S3-6 is:
Loss_sim(f, m₁, μ₂) = -Sim(Tμ₂⁻¹(f), m₁)
the loss function Loss_sim(f, m₂, μ₃) of the model network of the 3rd scale in step S3-9 is:
Loss_sim(f, m₂, μ₃) = -Sim(Tμ₃⁻¹(f), m₂)
and the joint loss function Loss in step S5 is:
Loss = λ₁ × Loss_sim(f, m, μ₁) + λ₂ × Loss_sim(f, m₁, μ₂) + λ₃ × Loss_sim(f, m₂, μ₃)
where Sim(·) denotes a similarity measure, Tμ⁻¹ denotes the inverse of the geometric transformation Tμ, and λ₁, λ₂, λ₃ are the weight factors of the loss functions of the model networks at each scale.
Further, step S4 includes the following substeps:
S4-1, training the model network of the 1st scale with the goal of minimizing the loss function Loss_sim(f, m, μ₁).
S4-2, fixing the parameters of the model network of the 1st scale, and training the model network of the 2nd scale with the goal of minimizing the loss function Loss_sim(f, m₁, μ₂).
S4-3, fixing the parameters of the model networks of the 1st and 2nd scales, and training the model network of the 3rd scale with the goal of minimizing the loss function Loss_sim(f, m₂, μ₃).
The invention has the following beneficial effects:
(1) The method converts image registration into a regression optimization problem, can integrate feature extraction networks, image similarity measures and feature descriptors of various forms and parameters, and realizes accurate multi-scale image registration through end-to-end mapping without supervised learning.
(2) The depth features of the images to be registered are extracted by model networks on multiple scales, geometric transformation parameters are obtained through parameter regression, and the images are geometrically corrected with these parameters, thereby realizing coarse-to-fine multi-scale step-by-step registration of the images.
(3) The method needs no registration ground truth as training samples; by constructing loss functions based on similarity measures and feature descriptors between images, the loss functions on multiple scales are jointly trained, the parameters of each model network are updated through back propagation, the geometric transformation parameters are optimized, and high-precision, high-robustness multi-source remote sensing image registration is achieved.
Drawings
Fig. 1 is a flowchart of a remote sensing image registration method based on unsupervised deep learning according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a reference image, an image to be corrected, and a corrected image according to an embodiment of the invention.
Fig. 3 is a schematic general framework diagram of a remote sensing image registration method according to an embodiment of the present invention.
Fig. 4 is a schematic structural diagram of a model network 1 according to an embodiment of the present invention.
Fig. 5 is a schematic diagram illustrating calculation of similarity measurement of multi-source remote sensing images according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It is to be understood that the embodiments shown and described in the drawings are merely exemplary and are intended to illustrate the principles and spirit of the invention, not to limit the scope of the invention.
The embodiment of the invention provides a remote sensing image registration method based on unsupervised deep learning which, as shown in FIG. 1, comprises the following steps S1 to S6:
S1, establishing a multi-source remote sensing image registration data set comprising two sets of image data, wherein the images of the two sets correspond to each other one by one, one set serves as the reference image data set and the other set serves as the image data set to be corrected.
In the embodiment of the present invention, each image to be corrected in the image data set to be corrected should be an image that has geometric distortion and whose ground-feature content overlaps that of the corresponding reference image over a certain range (greater than or equal to 70% in the embodiment of the present invention).
In an embodiment of the present invention, step S1 is further described by taking the registration of an optical image with a Synthetic Aperture Radar (SAR) image as an example. As shown in fig. 2, in the embodiment of the present invention, an image a with a fixed resolution is used as a reference image, an image b overlapping with a partial area of the image a and having geometric distortion is used as an image to be corrected, and after registration and correction by the registration method provided by the present invention, an image c aligned with the overlapping area of the image a pixel by pixel is obtained. The multi-source remote sensing image data set comprises a plurality of pairs of regional images similar to the image a and the image b. It should be understood that other embodiments of the present invention include, but are not limited to, registration of multi-source optical images, registration of optical images with infrared images, registration of optical images with LiDAR (Light Detection and Ranging) intensity and elevation images, and registration of optical images with grid maps, and it is within the scope of the present invention to employ the registration methods provided herein.
S2, selecting a reference image f from the reference image data set, selecting the image m to be corrected that corresponds to the reference image f from the image data set to be corrected, and taking the reference image f and the image m to be corrected as the end-to-end input of one training sample.
S3, calculating the transformation parameters μ₁, μ₂, μ₃ on the model networks of the 3 scales, gradually correcting the image m to be corrected to generate the corrected images m₁, m₂, m₃, back-propagating the loss function of the model network of each scale, and taking the corrected image m₃ and the transformation parameter μ₃ as the end-to-end output of one training sample.
The embodiment of the invention adopts a coarse-to-fine multi-scale matching strategy: the model networks on the 3 scales are jointly trained in an end-to-end framework to predict the transformation parameters and their residuals, thereby achieving accurate image registration. "End-to-end framework" means that, in the embodiment of the invention, the reference image f and the image m to be corrected are the input, the corrected image m₃ and the transformation parameter μ₃ are the output, and the two together constitute an end-to-end mapping relationship.
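The following sketch is only an illustration of this end-to-end, three-scale forward pass, not the patented implementation; the networks net1/net2/net3, and the warp() and compose() helpers, are hypothetical callables whose possible forms are sketched in later examples.

```python
# Illustrative sketch of the coarse-to-fine forward pass over the 3 scales.
import torch

def forward_three_scales(f, m, net1, net2, net3, warp, compose):
    """f, m: (B, 1, H, W) reference image / image to be corrected."""
    mu1 = net1(f, m)                 # scale 1: coarse transformation parameters
    m1 = warp(m, mu1)                # corrected image m1
    d_mu1 = net2(f, m1)              # scale 2: residual of the transformation parameters
    mu2 = compose(mu1, d_mu1)        # combine mu1 with its residual
    m2 = warp(m1, mu2)               # corrected image m2
    d_mu2 = net3(f, m2)              # scale 3: residual of the transformation parameters
    mu3 = compose(mu2, d_mu2)
    m3 = warp(m2, mu3)               # final corrected image m3
    return (mu1, mu2, mu3), (m1, m2, m3)
```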
As shown in FIG. 3, step S3 includes the following substeps S3-1 through S3-10:
S3-1, inputting the reference image f and the image m to be corrected into the model network of the 1st scale (the model network 1; the 1st scale is abbreviated as scale 1 in the embodiment of the invention) to obtain the transformation parameter μ₁ of the 1st scale.
Step S3-1 includes the following substeps S3-1-1 to S3-1-3:
s3-1-1, down-sampling the reference image f and the image m to be corrected to 1/4 of the original size, and stacking the two images generated after down-sampling in the channel direction to generate a stacked image.
In the embodiment of the present invention, the size of the reference image f is fixed, and if the size of the image m to be corrected is not consistent with that of the reference image f, the size of the image m to be corrected is adjusted to be consistent with that of the reference image f by adopting a zero padding or cropping method.
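A minimal sketch of step S3-1-1 together with this size adjustment is given below, assuming single-channel tensors of shape (B, 1, H, W); the function name and the bilinear interpolation mode are illustrative choices, not part of the description.

```python
# Illustrative sketch: match the size of m to f by zero padding / cropping,
# down-sample both to 1/4, then stack along the channel dimension.
import torch
import torch.nn.functional as F

def stack_downsampled(f, m, scale=0.25):
    dh, dw = f.shape[-2] - m.shape[-2], f.shape[-1] - m.shape[-1]
    m = F.pad(m, (0, max(dw, 0), 0, max(dh, 0)))            # zero-pad if m is smaller
    m = m[..., :f.shape[-2], :f.shape[-1]]                   # crop if m is larger
    f_s = F.interpolate(f, scale_factor=scale, mode="bilinear", align_corners=False)
    m_s = F.interpolate(m, scale_factor=scale, mode="bilinear", align_corners=False)
    return torch.cat([f_s, m_s], dim=1)                      # (B, 2, H*scale, W*scale)
```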
S3-1-2, inputting the superposed image into a feature extraction part of the model network of the 1 st scale to generate depth features.
In one embodiment of the invention, as shown in fig. 4, the feature extraction part of the model network 1 consists of k groups of interconnected convolution blocks and down-sampling layers; each convolution block includes a convolution layer, a local response normalization layer and a linear-unit activation function (ReLU) layer, and each down-sampling layer reduces the image resolution to 1/2. Experiments show that choosing k reasonably, so that the size of the feature map generated by the last convolution block falls within [4, 7], and setting the number of convolution-kernel channels of the convolution layers to 1/4 of the size of the (down-sampled) image to be corrected, is beneficial to generating a more accurate transformation parameter μ₁ in the subsequent steps. In the embodiment of the present invention, if the size of the reference image f and the image m to be corrected is 512 × 512, the 1/4 down-sampled image size is 128 × 128, the number of convolution-kernel channels of each convolution block is set to 32, and k is set to 5, so that the feature map generated from the stacked image by the 5 groups of convolution blocks and down-sampling layers has size 4.
In another embodiment of the present invention, the feature extraction part of the model network 1 includes, but is not limited to, a U-shaped structure network (U-Net), a full convolution neural network (FCN), and the like.
S3-1-3, passing the depth features through the parameter regression part of the model network of the 1st scale to obtain the transformation parameter μ₁ of the 1st scale.
As shown in fig. 4, in an embodiment of the present invention, the parameter regression part of the model network 1 consists of t fully connected layers arranged in parallel; the value of t can be chosen by weighing computation speed against the expected range of image scaling, and is not limited by the present invention. Experiments show that when the image scaling factors lie between 0.5 and 2, setting 4 parallel fully connected layers works well. The parallel fully connected layers are similar to the pyramid strategy used in conventional image registration, the difference being that the initial values of the output spatial transformation parameters differ in scale. Compared with using a single fully connected layer to output the parameters, computing with multiple parallel fully connected layers greatly accelerates the convergence of the loss function.
It should be understood that the invention does not limit the implementation of the feature extraction part and the parameter regression part of the model network 1; any Convolutional Neural Network (CNN) of various forms and parameters that takes the stacked image as input, extracts depth features in the channel direction and outputs geometric transformation parameters falls within the protection scope of the invention.
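One possible realization of such a model network is sketched below. The choice of average pooling for down-sampling, the averaging of the parallel heads and the identity-biased initialization are assumptions for illustration; the class name RegModelNet and its default values (ch=32, k=5, feat_size=4, matching the scale-1 example above) are hypothetical.

```python
# Illustrative sketch of a "model network": k convolution blocks
# (convolution + local response normalization + ReLU) each followed by 2x
# down-sampling, then t parallel fully connected regression heads.
import torch
import torch.nn as nn

class RegModelNet(nn.Module):
    def __init__(self, in_ch=2, ch=32, k=5, t=4, n_params=6, feat_size=4):
        super().__init__()
        blocks = []
        for i in range(k):
            blocks += [
                nn.Conv2d(in_ch if i == 0 else ch, ch, kernel_size=3, padding=1),
                nn.LocalResponseNorm(size=5),
                nn.ReLU(inplace=True),
                nn.AvgPool2d(2),                    # halves the resolution
            ]
        self.features = nn.Sequential(*blocks)
        self.heads = nn.ModuleList(
            [nn.Linear(ch * feat_size * feat_size, n_params) for _ in range(t)]
        )
        identity = torch.tensor([1., 0., 0., 0., 1., 0.])
        for head in self.heads:                     # start near the identity affine transform
            nn.init.zeros_(head.weight)
            if n_params == 6:
                head.bias.data.copy_(identity)

    def forward(self, stacked):
        x = self.features(stacked).flatten(1)
        # average the t parallel heads into one parameter vector (an assumption)
        return torch.stack([h(x) for h in self.heads], dim=0).mean(dim=0)
```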
S3-2, performing geometric correction on the image m to be corrected with the transformation parameter μ₁ to generate the corrected image m₁.
Step S3-2 includes the following substeps S3-2-1 to S3-2-2:
S3-2-1, forming the geometric transformation matrix Tμ₁ from the transformation parameter μ₁.
In one embodiment of the present invention, as shown in FIG. 4, step S3-1-3 outputs 6 geometric transformation parameters a₁, a₂, a₃, a₄, a₅, a₆, which form the two-dimensional affine matrix Tμ₁:
Tμ₁ = [ a₁ a₂ a₃ ; a₄ a₅ a₆ ; 0 0 1 ]
The 6 parameters of the affine transformation matrix represent translation, rotation, scaling and shear operations on the image pixel coordinates. Suppose the geometric transformation of the image comprises: a translation Dx in the x direction and a translation Dy in the y direction; a scaling factor Sx in the x direction and a scaling factor Sy in the y direction; a clockwise rotation by the angle θ; a shear angle φ in the x direction and a shear angle ω in the y direction. The 6 parameters of the two-dimensional affine matrix Tμ₁ are then obtained by composing, in an arbitrary order, the elementary matrices of these translation, scaling, rotation and shear operations.
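A small sketch of one such composition follows; the fixed ordering T·R·S·H chosen here is only an example of the arbitrary ordering mentioned above, and the function name is hypothetical.

```python
# Illustrative sketch: compose elementary translation, rotation, scaling and shear
# matrices into the 6 affine parameters a1..a6.
import numpy as np

def affine_from_components(dx, dy, sx, sy, theta, phi, omega):
    T = np.array([[1, 0, dx], [0, 1, dy], [0, 0, 1]], dtype=float)        # translation
    R = np.array([[np.cos(theta),  np.sin(theta), 0],                     # clockwise rotation
                  [-np.sin(theta), np.cos(theta), 0],
                  [0, 0, 1]], dtype=float)
    S = np.diag([sx, sy, 1.0])                                            # scaling
    H = np.array([[1, np.tan(phi), 0],                                    # shear
                  [np.tan(omega), 1, 0],
                  [0, 0, 1]], dtype=float)
    M = T @ R @ S @ H                                                     # one possible ordering
    a1, a2, a3 = M[0]
    a4, a5, a6 = M[1]
    return np.array([a1, a2, a3, a4, a5, a6])
```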
In other embodiments of the present invention, the parameter regression part of the model network 1 may output more or fewer geometric transformation parameters to form geometric transformation matrices other than the affine transformation, such as perspective or rigid transformations, which is not limited by the present invention.
S3-2-2, performing geometric transformation on the image m to be corrected through the geometric transformation matrix Tμ₁ to generate the corrected image m₁:
m₁ = Tμ₁(m)
Specifically, for each pixel with coordinates (X, Y) and gray value σ on the image m to be corrected, the coordinates (x, y) on the corrected image after the spatial transformation are calculated, and the corrected image m₁ is generated according to a chosen resampling and interpolation method. In the embodiment using the affine transformation:
[ x ; y ; 1 ] = Tμ₁ [ X ; Y ; 1 ], i.e. x = a₁X + a₂Y + a₃ and y = a₄X + a₅Y + a₆.
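A sketch of this warp with bilinear resampling is shown below. Note that torch's affine_grid expects, in normalized [-1, 1] coordinates, the mapping from output pixels back to input pixels, so the inverse of the forward affine matrix is supplied; working in normalized coordinates and the function name warp_affine are implementation assumptions, not part of the description.

```python
# Illustrative sketch of the geometric correction m1 = T_mu1(m).
import torch
import torch.nn.functional as F

def warp_affine(img, mu):
    """img: (B, C, H, W); mu: (B, 6) forward affine parameters a1..a6 (normalized coords)."""
    B = img.shape[0]
    bottom = torch.tensor([0., 0., 1.], device=mu.device, dtype=mu.dtype).view(1, 1, 3)
    A = torch.cat([mu.view(B, 2, 3), bottom.expand(B, 1, 3)], dim=1)
    theta = torch.inverse(A)[:, :2, :]               # output-to-input mapping for sampling
    grid = F.affine_grid(theta, img.shape, align_corners=False)
    return F.grid_sample(img, grid, mode="bilinear",  # bilinear interpolation of gray values
                         padding_mode="zeros", align_corners=False)
```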
S3-3, calculating the loss function Loss_sim(f, m, μ₁) of the model network of the 1st scale:
Loss_sim(f, m, μ₁) = -Sim(Tμ₁⁻¹(f), m)
where Tμ₁⁻¹ denotes the inverse geometric transformation of Tμ₁, defined by:
Tμ₁⁻¹(Tμ₁(m)) = m
Sim(·) denotes a similarity measure, i.e. Sim(A, B) denotes some measure of the similarity of images A and B. Common similarity measures include the Sum of Squared Differences (SSD), Normalized Cross-Correlation (NCC) and Phase Correlation, among others. For images A and B of size w × w:
SSD(A, B) = Σx Σy (A(x, y) - B(x, y))²
NCC(A, B) = Σx Σy (A(x, y) - Ā)(B(x, y) - B̄) / sqrt( Σx Σy (A(x, y) - Ā)² · Σx Σy (B(x, y) - B̄)² )
where the sums run over x, y = 1, …, w, and Ā and B̄ are the mean gray values of image A and image B, respectively.
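The two measures above translate directly into code; a minimal sketch for single-channel tensors of identical size follows (the eps guard against division by zero is an added assumption).

```python
# Illustrative sketch of the SSD and NCC similarity measures.
import torch

def ssd(a, b):
    return ((a - b) ** 2).sum()

def ncc(a, b, eps=1e-8):
    a0 = a - a.mean()
    b0 = b - b.mean()
    return (a0 * b0).sum() / torch.sqrt((a0 ** 2).sum() * (b0 ** 2).sum() + eps)
```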
Computing conventional similarity measures such as SSD or NCC is time-consuming. Since the correlation (or convolution) of two images in the spatial domain equals the product of their spectra in the frequency domain, phase correlation, which is faster to compute, is adopted. The specific steps are as follows:
Let image A and image B be related by a displacement (x₀, y₀) in the spatial domain, i.e. B(x, y) = A(x - x₀, y - y₀), and let their Fourier transforms be F_A(u, v) and F_B(u, v), respectively. In the frequency domain the following relationship holds:
F_B(u, v) = F_A(u, v) · exp(-i(u x₀ + v y₀))
The normalized cross-power spectrum of the two is expressed as:
F_A(u, v) F_B*(u, v) / |F_A(u, v) F_B*(u, v)| = exp(i(u x₀ + v y₀))
where the superscript * denotes complex conjugation.
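A sketch of a phase-correlation-based similarity is given below: the normalized cross-power spectrum is inverted back to the spatial domain, where a well-aligned pair concentrates its energy near zero displacement. Using the peak value of that surface as the scalar similarity score is an assumption; the description above only defines the cross-power spectrum itself.

```python
# Illustrative sketch of a phase-correlation similarity score.
import torch

def phase_correlation_peak(a, b, eps=1e-8):
    Fa = torch.fft.fft2(a)
    Fb = torch.fft.fft2(b)
    cross = Fa * torch.conj(Fb)
    r = torch.fft.ifft2(cross / (cross.abs() + eps)).real   # correlation surface
    return r.max()                                           # peak height as similarity
```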
In one embodiment of the present invention, image A and image B are multi-source optical remote sensing images of the same area acquired by the same sensor, and their gray values are used directly as the input for computing the similarity measure of image A and image B.
In another embodiment of the present invention, image A and image B are remote sensing images of the same area acquired by different types of sensors (such as optical, infrared, SAR, etc.). Instead of using gray values directly, local feature descriptors of image A and image B, such as Channel Features of Orientated Gradients (CFOG), the Histogram of Oriented Gradients (HOG), the Local Self-Similarity descriptor (LSS) and the Histogram of Orientated Phase Congruency (HOPC), are computed pixel by pixel as the input for computing the similarity measure. As shown in fig. 5, the SSD, NCC or phase correlation between the feature descriptor images of the two images is then used as the similarity measure.
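The sketch below is a heavily simplified stand-in for the pixel-wise descriptors named above (it is not CFOG, HOG, LSS or HOPC): a per-pixel orientated-gradient channel feature built from Sobel filters, whose channels can then be compared with the SSD, NCC or phase-correlation sketches shown earlier. All names and the soft-binning scheme are assumptions made for illustration.

```python
# Illustrative sketch: simple per-pixel orientated-gradient channel features.
import math
import torch
import torch.nn.functional as F

def oriented_gradient_channels(img, n_bins=8):
    """img: (B, 1, H, W) -> (B, n_bins, H, W) soft orientation-channel features."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3).to(img)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(img, kx, padding=1)
    gy = F.conv2d(img, ky, padding=1)
    mag = torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)
    ang = torch.atan2(gy, gx) % math.pi                       # orientation in [0, pi)
    bins = (torch.arange(n_bins, dtype=torch.float32) * math.pi / n_bins).to(img)
    # soft assignment of the gradient magnitude to orientation bins
    w = torch.cos(ang - bins.view(1, n_bins, 1, 1)).clamp(min=0) ** 2
    return mag * w
```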
Steps S3-1 to S3-3 generate the transformation parameters and the corrected image at scale 1 and calculate the corresponding loss function, and their implementation has been described above in detail. The subsequent steps (S3-4 to S3-9) repeat similar operations on the other scales, differing from the operations at scale 1 only in their parameters; their flow is therefore summarized briefly below without repeating the detailed explanation of the principle.
S3-4, inputting the reference image f and the corrected image m₁ into the model network of the 2nd scale (the model network 2; abbreviated as scale 2 in the embodiment of the invention) to obtain the residual Δμ₁ of the transformation parameter, and combining it with the transformation parameter μ₁ to obtain the transformation parameter μ₂ of the 2nd scale.
Step S3-4 includes the following substeps S3-4-1 to S3-4-4:
S3-4-1, down-sampling the reference image f and the corrected image m₁ to 1/2 of their original size respectively, and superimposing the two down-sampled images in the channel direction to generate a superposed image.
And S3-4-2, inputting the superposed image into a feature extraction part of the model network of the 2 nd scale to generate depth features.
In the embodiment of the present invention, the network structure of the model network 2 is similar to that of the model network 1; only the parameter settings differ. To further illustrate the feature extraction of step S3-4-2 with a specific embodiment: if the size of the reference image f and the corrected image m₁ is 512 × 512, the 1/2 down-sampled size is 256 × 256; the number of convolution-kernel channels of each convolution block is set to 64 and k is set to 6, so that the feature map generated from the stacked image by the 6 groups of convolution blocks and down-sampling layers has size 4.
S3-4-3, passing the depth features through the parameter regression part of the model network of the 2nd scale to obtain the residual Δμ₁ of the transformation parameter.
S3-4-4, combining the residual Δμ₁ with the transformation parameter μ₁ to obtain the transformation parameter μ₂ of the 2nd scale:
μ₂ = μ₁ * Δμ₁
where * denotes matrix multiplication.
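A minimal sketch of this combination step follows; lifting the 6-element parameter vectors to 3 × 3 homogeneous matrices before multiplying, and the left-to-right ordering, are bookkeeping assumptions consistent with "*" denoting matrix multiplication.

```python
# Illustrative sketch: combine a transformation parameter with its residual.
import torch

def compose(mu, d_mu):
    """mu, d_mu: (B, 6) affine parameters a1..a6 -> combined (B, 6) parameters."""
    B = mu.shape[0]
    bottom = torch.tensor([0., 0., 1.], device=mu.device, dtype=mu.dtype).view(1, 1, 3)
    A = torch.cat([mu.view(B, 2, 3), bottom.expand(B, 1, 3)], dim=1)
    D = torch.cat([d_mu.view(B, 2, 3), bottom.expand(B, 1, 3)], dim=1)
    return (A @ D)[:, :2, :].reshape(B, 6)       # mu2 = mu1 * d_mu1
```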
S3-5, performing geometric correction on the corrected image m₁ with the transformation parameter μ₂ to generate the corrected image m₂.
Step S3-5 includes the following substeps S3-5-1 to S3-5-2:
S3-5-1, forming the geometric transformation matrix Tμ₂ from the transformation parameter μ₂.
S3-5-2, performing geometric transformation on the corrected image m₁ through the geometric transformation matrix Tμ₂ to generate the corrected image m₂:
m₂ = Tμ₂(m₁)
S3-6, calculating the loss function Loss_sim(f, m₁, μ₂) of the model network of the 2nd scale:
Loss_sim(f, m₁, μ₂) = -Sim(Tμ₂⁻¹(f), m₁)
S3-7, inputting the reference image f and the corrected image m₂ into the model network of the 3rd scale (the model network 3; abbreviated as scale 3 in the embodiment of the invention) to obtain the residual Δμ₂ of the transformation parameter, and combining it with the transformation parameter μ₂ to obtain the transformation parameter μ₃ of the 3rd scale.
Step S3-7 includes the following substeps S3-7-1 to S3-7-4:
S3-7-1, superimposing the reference image f and the corrected image m₂ in the channel direction to generate a superposed image.
S3-7-2, inputting the superposed image into the feature extraction part of the model network of the 3rd scale to generate depth features.
In the embodiment of the present invention, the network structure of the model network 3 is similar to those of the model networks 1 and 2; only the parameter settings differ. To further illustrate the feature extraction of step S3-7-2 with a specific embodiment: if the size of the image f and the image m is 512 × 512, the number of convolution-kernel channels of each convolution block is set to 128 and k is set to 7, so that the feature map generated from the stacked image by the 7 groups of convolution blocks and down-sampling layers has size 4.
S3-7-3, passing the depth features through the parameter regression part of the model network of the 3rd scale to obtain the residual Δμ₂ of the transformation parameter.
S3-7-4, combining the residual Δμ₂ with the transformation parameter μ₂ to obtain the transformation parameter μ₃ of the 3rd scale:
μ₃ = μ₂ * Δμ₂
where * denotes matrix multiplication.
S3-8, performing geometric correction on the corrected image m₂ with the transformation parameter μ₃ to generate the corrected image m₃.
Step S3-8 includes the following substeps S3-8-1 to S3-8-2:
S3-8-1, forming the geometric transformation matrix Tμ₃ from the transformation parameter μ₃.
S3-8-2, performing geometric transformation on the corrected image m₂ through the geometric transformation matrix Tμ₃ to generate the corrected image m₃:
m₃ = Tμ₃(m₂)
S3-9, calculating the loss function Loss_sim(f, m₂, μ₃) of the model network of the 3rd scale:
Loss_sim(f, m₂, μ₃) = -Sim(Tμ₃⁻¹(f), m₂)
S3-10, taking the corrected image m₃ and the transformation parameter μ₃ as the end-to-end output of one training sample.
S4, initializing the model network parameters of the 3 scales respectively.
Step S4 includes the following substeps S4-1 to S4-3:
S4-1, training the model network of the 1st scale with the goal of minimizing the loss function Loss_sim(f, m, μ₁).
S4-2, fixing the parameters of the model network of the 1st scale, and training the model network of the 2nd scale with the goal of minimizing the loss function Loss_sim(f, m₁, μ₂).
S4-3, fixing the parameters of the model networks of the 1st and 2nd scales, and training the model network of the 3rd scale with the goal of minimizing the loss function Loss_sim(f, m₂, μ₃).
S5, jointly training the model networks of the 3 scales in an end-to-end manner, and optimizing the joint loss function over the 3 scales.
In the embodiment of the invention, before the joint training of the model networks of the 3 scales, the parameters of all the model networks need to be released (i.e., unfrozen).
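Expressed with requires_grad flags, the stage-wise initialization of step S4 and the release of all parameters before joint training can be sketched as follows; the helper names and the use of flags (rather than separate optimizers) are assumptions.

```python
# Illustrative sketch of freezing / releasing model network parameters.
def set_trainable(net, flag):
    for p in net.parameters():
        p.requires_grad = flag

# S4-1: train scale 1 alone; S4-2: freeze scale 1, train scale 2;
# S4-3: freeze scales 1 and 2, train scale 3.
# S5: release ("unfreeze") everything for joint training:
def prepare_joint_training(net1, net2, net3):
    for net in (net1, net2, net3):
        set_trainable(net, True)
```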
In the embodiment of the invention, the joint loss function Loss is:
Loss = λ₁ × Loss_sim(f, m, μ₁) + λ₂ × Loss_sim(f, m₁, μ₂) + λ₃ × Loss_sim(f, m₂, μ₃)
where λ₁, λ₂, λ₃ are the weight factors of the loss functions of the model networks at each scale; in the embodiment of the invention, λ₁, λ₂, λ₃ take the values 0.05, 0.05 and 0.9, respectively.
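In code, the weighted combination is a one-liner; the sketch below assumes the three per-scale losses have already been computed as described above, and the function name is illustrative.

```python
# Illustrative sketch of the weighted joint loss over the 3 scales.
def joint_loss(loss1, loss2, loss3, lambdas=(0.05, 0.05, 0.9)):
    l1, l2, l3 = lambdas
    return l1 * loss1 + l2 * loss2 + l3 * loss3
```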
S6, searching, through a deep learning optimizer, for the direction in which the joint loss function value decreases fastest, back-propagating through the model networks along that direction, and iteratively updating the model network parameters. When the joint loss function drops below a preset threshold and converges, all of the end-to-end-mapped model networks have globally optimal parameters, i.e. the reference image f and the corrected image m₃ have the best similarity; the network model parameters at this moment are saved, and the registered reference image f and corrected image m₃ are output.
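A training-loop sketch for this step is given below. The Adam optimizer, learning rate, iteration cap, stopping rule and file name are assumptions; the description only requires a deep-learning optimizer, back-propagation and convergence of the joint loss below a preset threshold. Here loss_fn(f, m) stands in for the full forward pass of FIG. 3 returning the joint loss.

```python
# Illustrative sketch of the joint training loop of step S6.
import itertools
import torch

def train(models, loader, loss_fn, lr=1e-4, threshold=1e-3, max_iters=100000):
    """models: the three scale networks; loader yields (f, m) pairs."""
    params = itertools.chain(*(net.parameters() for net in models))
    opt = torch.optim.Adam(params, lr=lr)
    for it, (f, m) in enumerate(itertools.cycle(loader)):
        opt.zero_grad()
        loss = loss_fn(f, m)
        loss.backward()                  # back-propagate through all three scales
        opt.step()
        if loss.item() < threshold or it >= max_iters:
            break
    torch.save([net.state_dict() for net in models], "registration_model.pt")
```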
Therefore, the method realizes the accurate registration of the multi-scale remote sensing image which is completely unsupervised to learn and mapped end to end.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Those skilled in the art, having the benefit of this disclosure, may effect numerous modifications thereto and changes may be made without departing from the scope of the invention in its aspects.

Claims (10)

1. A remote sensing image registration method based on unsupervised deep learning is characterized by comprising the following steps:
s1, establishing a multi-source remote sensing image registration data set comprising two groups of image data, wherein every two images of the two groups of image data correspond to each other one by one, one group of image data is used as a reference image data set, and the other group of image data is used as an image data set to be corrected;
s2, selecting a reference image f from the reference image dataset, selecting an image m to be corrected corresponding to the reference image f from the image dataset to be corrected, and taking the reference image f and the image m to be corrected as end-to-end input on a training sample;
s3, calculating the transformation parameters μ₁, μ₂, μ₃ on the model networks of the 3 scales, gradually correcting the image m to be corrected to generate corrected images m₁, m₂, m₃, back-propagating the loss function of the model network of each scale, and taking the corrected image m₃ and the transformation parameter μ₃ as the end-to-end output of one training sample;
s4, initializing model network parameters of 3 scales respectively;
s5, performing joint training on the model networks of 3 scales in an end-to-end mode, and optimizing joint loss functions on the 3 scales;
s6, searching, through a deep learning optimizer, for the direction in which the joint loss function value decreases fastest, back-propagating through the model networks along that direction, iteratively updating the model network parameters, saving the network model parameters when the joint loss function drops below a preset threshold and converges, and outputting the registered reference image f and corrected image m₃.
2. The remote sensing image registration method according to claim 1, wherein the step S3 includes the following substeps:
s3-1, inputting the reference image f and the image m to be corrected into the model network of the 1st scale to obtain the transformation parameter μ₁ of the 1st scale;
s3-2, performing geometric correction on the image m to be corrected with the transformation parameter μ₁ to generate the corrected image m₁;
s3-3, calculating a loss function of the model network of the 1st scale;
s3-4, inputting the reference image f and the corrected image m₁ into the model network of the 2nd scale to obtain the residual Δμ₁ of the transformation parameter, and combining it with the transformation parameter μ₁ to obtain the transformation parameter μ₂ of the 2nd scale;
s3-5, performing geometric correction on the corrected image m₁ with the transformation parameter μ₂ to generate the corrected image m₂;
s3-6, calculating a loss function of the model network of the 2nd scale;
s3-7, inputting the reference image f and the corrected image m₂ into the model network of the 3rd scale to obtain the residual Δμ₂ of the transformation parameter, and combining it with the transformation parameter μ₂ to obtain the transformation parameter μ₃ of the 3rd scale;
s3-8, performing geometric correction on the corrected image m₂ with the transformation parameter μ₃ to generate the corrected image m₃;
s3-9, calculating a loss function of the model network of the 3rd scale;
s3-10, taking the corrected image m₃ and the transformation parameter μ₃ as the end-to-end output of one training sample.
3. The remote sensing image registration method according to claim 2, wherein the step S3-1 comprises the following substeps:
s3-1-1, respectively down-sampling the reference image f and the image m to be corrected to 1/4 of the original size, and overlapping two images generated after down-sampling in the channel direction to generate an overlapped image;
s3-1-2, inputting the superposed image into a feature extraction part of a model network with the 1 st scale to generate depth features;
s3-1-3, passing the depth features through the parameter regression part of the model network of the 1st scale to obtain the transformation parameter μ₁ of the 1st scale.
4. The remote sensing image registration method according to claim 2, wherein the step S3-2 comprises the following substeps:
s3-2-1, forming the geometric transformation matrix Tμ₁ from the transformation parameter μ₁;
s3-2-2, performing geometric transformation on the image m to be corrected through the geometric transformation matrix Tμ₁ to generate the corrected image m₁.
5. The remote sensing image registration method according to claim 2, wherein the step S3-4 comprises the following substeps:
s3-4-1, down-sampling the reference image f and the corrected image m₁ to 1/2 of their original size respectively, and superimposing the two down-sampled images in the channel direction to generate a superposed image;
s3-4-2, inputting the superposed image into the feature extraction part of the model network of the 2nd scale to generate depth features;
s3-4-3, passing the depth features through the parameter regression part of the model network of the 2nd scale to obtain the residual Δμ₁ of the transformation parameter;
s3-4-4, combining the residual Δμ₁ with the transformation parameter μ₁ to obtain the transformation parameter μ₂ of the 2nd scale.
6. The remote sensing image registration method according to claim 2, wherein the step S3-5 comprises the following sub-steps:
s3-5-1, forming the geometric transformation matrix Tμ₂ from the transformation parameter μ₂;
s3-5-2, performing geometric transformation on the corrected image m₁ through the geometric transformation matrix Tμ₂ to generate the corrected image m₂.
7. The remote sensing image registration method according to claim 2, wherein the step S3-7 comprises the following substeps:
s3-7-1, superimposing the reference image f and the corrected image m₂ in the channel direction to generate a superposed image;
s3-7-2, inputting the superposed image into the feature extraction part of the model network of the 3rd scale to generate depth features;
s3-7-3, passing the depth features through the parameter regression part of the model network of the 3rd scale to obtain the residual Δμ₂ of the transformation parameter;
s3-7-4, combining the residual Δμ₂ with the transformation parameter μ₂ to obtain the transformation parameter μ₃ of the 3rd scale.
8. The remote sensing image registration method according to claim 2, wherein the step S3-8 comprises the following substeps:
s3-8-1, forming the geometric transformation matrix Tμ₃ from the transformation parameter μ₃;
s3-8-2, performing geometric transformation on the corrected image m₂ through the geometric transformation matrix Tμ₃ to generate the corrected image m₃.
9. The remote sensing image registration method according to claim 1, wherein the loss function Loss_sim(f, m, μ₁) of the model network of the 1st scale in step S3-3 is:
Loss_sim(f, m, μ₁) = -Sim(Tμ₁⁻¹(f), m)
the loss function Loss_sim(f, m₁, μ₂) of the model network of the 2nd scale in step S3-6 is:
Loss_sim(f, m₁, μ₂) = -Sim(Tμ₂⁻¹(f), m₁)
the loss function Loss_sim(f, m₂, μ₃) of the model network of the 3rd scale in step S3-9 is:
Loss_sim(f, m₂, μ₃) = -Sim(Tμ₃⁻¹(f), m₂)
and the joint loss function Loss in step S5 is:
Loss = λ₁ × Loss_sim(f, m, μ₁) + λ₂ × Loss_sim(f, m₁, μ₂) + λ₃ × Loss_sim(f, m₂, μ₃)
where Sim(·) denotes a similarity measure, Tμ⁻¹ denotes the inverse of the geometric transformation Tμ, and λ₁, λ₂, λ₃ are the weight factors of the loss functions of the model networks at each scale.
10. The remote sensing image registration method according to claim 9, wherein the step S4 includes the following substeps:
s4-1, training the model network of the 1st scale with the goal of minimizing the loss function Loss_sim(f, m, μ₁);
s4-2, fixing the parameters of the model network of the 1st scale, and training the model network of the 2nd scale with the goal of minimizing the loss function Loss_sim(f, m₁, μ₂);
s4-3, fixing the parameters of the model networks of the 1st and 2nd scales, and training the model network of the 3rd scale with the goal of minimizing the loss function Loss_sim(f, m₂, μ₃).
CN202210026370.7A 2022-01-11 2022-01-11 Remote sensing image registration method based on unsupervised deep learning Active CN114494372B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210026370.7A CN114494372B (en) 2022-01-11 2022-01-11 Remote sensing image registration method based on unsupervised deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210026370.7A CN114494372B (en) 2022-01-11 2022-01-11 Remote sensing image registration method based on unsupervised deep learning

Publications (2)

Publication Number Publication Date
CN114494372A true CN114494372A (en) 2022-05-13
CN114494372B CN114494372B (en) 2023-04-21

Family

ID=81509569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210026370.7A Active CN114494372B (en) 2022-01-11 2022-01-11 Remote sensing image registration method based on unsupervised deep learning

Country Status (1)

Country Link
CN (1) CN114494372B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114693755A (en) * 2022-05-31 2022-07-01 湖南大学 Non-rigid registration method and system for multimode image maximum moment and space consistency

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109345575A (en) * 2018-09-17 2019-02-15 中国科学院深圳先进技术研究院 A kind of method for registering images and device based on deep learning
CN109711444A (en) * 2018-12-18 2019-05-03 中国科学院遥感与数字地球研究所 A kind of new remote sensing image matching method based on deep learning
CN111414968A (en) * 2020-03-26 2020-07-14 西南交通大学 Multi-mode remote sensing image matching method based on convolutional neural network characteristic diagram
CN113901900A (en) * 2021-09-29 2022-01-07 西安电子科技大学 Unsupervised change detection method and system for homologous or heterologous remote sensing image

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109345575A (en) * 2018-09-17 2019-02-15 中国科学院深圳先进技术研究院 A kind of method for registering images and device based on deep learning
CN109711444A (en) * 2018-12-18 2019-05-03 中国科学院遥感与数字地球研究所 A kind of new remote sensing image matching method based on deep learning
CN111414968A (en) * 2020-03-26 2020-07-14 西南交通大学 Multi-mode remote sensing image matching method based on convolutional neural network characteristic diagram
CN113901900A (en) * 2021-09-29 2022-01-07 西安电子科技大学 Unsupervised change detection method and system for homologous or heterologous remote sensing image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YE YUANXIN et al.: "A FAST AND ROBUST MATCHING SYSTEM FOR MULTIMODAL REMOTE SENSING IMAGE REGISTRATION" *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114693755A (en) * 2022-05-31 2022-07-01 湖南大学 Non-rigid registration method and system for multimode image maximum moment and space consistency

Also Published As

Publication number Publication date
CN114494372B (en) 2023-04-21

Similar Documents

Publication Publication Date Title
CN111862126B (en) Non-cooperative target relative pose estimation method combining deep learning and geometric algorithm
CN109558862B (en) Crowd counting method and system based on attention thinning framework of space perception
US20240212374A1 (en) Lidar point cloud segmentation method, device, apparatus, and storage medium
Zhao et al. Extracting planar roof structures from very high resolution images using graph neural networks
WO2021138992A1 (en) Disparity estimation optimization method based on up-sampling and accurate rematching
CN109635714B (en) Correction method and device for document scanning image
CN114223019A (en) Feedback decoder for parameter efficient semantic image segmentation
CN113223066B (en) Multi-source remote sensing image matching method and device based on characteristic point fine tuning
CN112053441A (en) Full-automatic layout recovery method for indoor fisheye image
CN113554039A (en) Method and system for generating optical flow graph of dynamic image based on multi-attention machine system
CN113449612A (en) Three-dimensional target point cloud identification method based on sub-flow sparse convolution
CN117788296B (en) Infrared remote sensing image super-resolution reconstruction method based on heterogeneous combined depth network
CN114494372A (en) Remote sensing image registration method based on unsupervised deep learning
Sun et al. Image fusion for the novelty rotating synthetic aperture system based on vision transformer
CN117593187A (en) Remote sensing image super-resolution reconstruction method based on meta-learning and transducer
CN114998630B (en) Ground-to-air image registration method from coarse to fine
CN116664855A (en) Deep learning three-dimensional sparse reconstruction method and system suitable for planetary probe vehicle images
CN111696167A (en) Single image super-resolution reconstruction method guided by self-example learning
CN114693755B (en) Non-rigid registration method and system for multimode image maximum moment and space consistency
Li et al. Monocular 3-D Object Detection Based on Depth-Guided Local Convolution for Smart Payment in D2D Systems
CN114972451A (en) Rotation-invariant SuperGlue matching-based remote sensing image registration method
CN116524111B (en) On-orbit lightweight scene reconstruction method and system for supporting on-demand lightweight scene of astronaut
Zhu et al. Research on Recognition and Registration Method Based on Deformed Fiducial Markers
CN117689702A (en) Point cloud registration method and device based on geometric attention mechanism
CN115937704A (en) Remote sensing image road segmentation method based on topology perception neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant