CN115797163B - Target data cross-domain inversion augmentation method based on remote sensing image - Google Patents

Target data cross-domain inversion augmentation method based on remote sensing image

Info

Publication number
CN115797163B
Authority
CN
China
Prior art keywords
image
domain
representing
data
remote sensing
Prior art date
Legal status
Active
Application number
CN202310101406.8A
Other languages
Chinese (zh)
Other versions
CN115797163A (en)
Inventor
杨小冈
王思宇
申通
卢瑞涛
席建祥
朱正杰
陈璐
李云松
Current Assignee
Rocket Force University of Engineering of PLA
Original Assignee
Rocket Force University of Engineering of PLA
Priority date
Filing date
Publication date
Application filed by Rocket Force University of Engineering of PLA
Priority to CN202310101406.8A
Publication of CN115797163A
Publication of CN115797163B (application granted)
Legal status: Active

Landscapes

  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a target data cross-domain inversion augmentation method based on remote sensing images, comprising the following steps: step 1, multi-domain conversion of image data based on a cycle-consistent generative adversarial network (CycleGAN); step 2, multi-domain data augmentation based on contrastive learning; and step 3, image migration synthesis to obtain a multi-domain augmented dataset. Taking a generative adversarial network as the framework, the method introduces a CycleGAN-based multi-domain image conversion method and a contrastive-learning-based multi-domain data augmentation method, transfers visible-light remote sensing images into infrared images and SAR images, and uses the synthesized dataset as the matching reference images of an unmanned aircraft, thereby realizing navigation and positioning of the unmanned aircraft with multi-domain images from multi-source sensors. The method performs well and improves the accuracy of the positioning-matching algorithm.

Description

Target data cross-domain inversion augmentation method based on remote sensing image
Technical Field
The invention belongs to the technical field of image dataset preparation and in particular relates to a target data cross-domain inversion augmentation method based on remote sensing images.
Background
In recent years, unmanned patrol aircraft have developed rapidly and are gradually being applied in many fields such as military reconnaissance and strike, surveying and exploration, fire rescue, and power-line inspection. Realizing intelligent visual navigation and positioning with the multi-source image sensors these aircraft carry has become a current research hotspot.
With the progress of technology, the resolution of optical remote sensing images has continuously improved. Optical remote sensing images can be used to acquire information about distant targets and their surroundings, supporting tasks such as unmanned patrol aircraft navigation, positioning, reconnaissance, and strike.
As artificial intelligence technology matures, human society is heading into an era of high-speed, intelligent big data, and intelligent scene matching has become one of the important approaches to navigation and positioning. Deep-learning-based intelligent matching models are obtained by training on datasets and mining the discriminative information in the data, so the early stage of model training requires a large number of multi-domain heterogeneous images as data support, and the quality of the dataset directly determines the capability of the model. Most research focuses mainly on the artificial intelligence algorithm model itself but ignores the large amount of data that intelligent algorithms need as a driver to reach better performance.
Because image samples from other domains are scarce, navigation and positioning with multi-source imaging sensors is a difficult task; developing a target data cross-domain inversion augmentation method based on remote sensing images is therefore a practically significant and challenging undertaking.
Disclosure of Invention
In view of the deficiencies of the prior art, the invention aims to provide a target data cross-domain inversion augmentation method based on remote sensing images, addressing the technical problem that the positioning accuracy of the various imaging modalities used in unmanned patrol aircraft navigation and positioning tasks still needs to be improved.
In order to solve the technical problems, the invention adopts the following technical scheme:
a target data cross-domain inversion augmentation method based on remote sensing images comprises the following steps:
step 1, generating image data multi-domain conversion of a countermeasure network based on a loop:
step 101, generating an image of the countermeasure network based on the loop.
Step 102, generating image discrimination of the countermeasure network based on the loop generation.
Step 103, designing a total loss function between the generated image and the true value.
And 2, multi-domain data augmentation based on contrast learning.
Step 3, image migration synthesis is carried out to obtain a multi-domain augmentation data set:
step 301, a set of visible light remote sensing image/infrared image unpaired data sets is given for each
Figure SMS_1
And->
Figure SMS_2
Representation, a set of visible remote sensing image/SAR image unpaired datasets is respectively represented by +.>
Figure SMS_3
And->
Figure SMS_4
Representing, given a remote sensing image dataset of visible light to be converted +.>
Figure SMS_5
For use as a verification set;
step 302, training the two sets of data given in step 301 by the method of loop-based generation of image data multi-domain conversion of the countermeasure network in step 1, and converting the visible light remote sensing image into a corresponding infrared image dataset by model reasoning
Figure SMS_6
SAR image dataset +.>
Figure SMS_7
Step 303, training the two sets of data given in step 301 by the multi-domain data augmentation method based on contrast learning in step 2, and converting the visible light remote sensing image into a corresponding infrared image dataset by model reasoning
Figure SMS_8
SAR image dataset +.>
Figure SMS_9
Step 304, respectively fusing the data sets obtained in step 302 and step 303 to form a fused infrared image data set
Figure SMS_10
And fusing the SAR image dataset +.>
Figure SMS_11
Thereby forming a multi-domain augmented data set.
Step 4, similarity calculation and matching test:
and (3) taking the multi-domain augmentation data set obtained in the step (3) as a reference image, and calculating the similarity of the images through a PSNR algorithm and an LPIPS algorithm.
And (3) taking the multi-domain augmentation data set obtained in the step (3) as a reference graph, and carrying out a matching test through an ORB algorithm and a LoFTR algorithm.
Compared with the prior art, the invention has the following technical effects:
(I) Taking a generative adversarial network as the framework, the method introduces a CycleGAN-based multi-domain image conversion method and a contrastive-learning-based multi-domain data augmentation method, transfers visible-light remote sensing images into infrared images and SAR images, and uses the synthesized dataset as the matching reference images of an unmanned aircraft, realizing navigation and positioning with multi-domain images from multi-source sensors; the method performs well and improves the accuracy of the positioning-matching algorithm.
(II) Neither the CycleGAN-based model nor the contrastive-learning-based augmentation method requires a training dataset of paired images, which greatly reduces the difficulty of data preparation before training and improves image conversion efficiency.
(III) The method converts single-domain images into multiple domains, reducing the limitation of a single onboard sensor in visual navigation; navigation and positioning with multi-domain images from multi-source sensors effectively improves the positioning accuracy of the aircraft.
(IV) Extensive data generation and experimental comparison were performed with the method. Compared with traditional matching algorithms and existing intelligent matching algorithms, the multi-domain dataset generated by the method improves the probability of a correct image match, and its effectiveness is well verified.
(V) By augmenting the dataset, the method prevents the overfitting that frequently occurs in deep-learning training, improves the precision and generalization ability of the model, enriches the variety of heterogeneous datasets, and realizes visual navigation and positioning with multi-domain images.
Drawings
FIG. 1 is a schematic diagram of the cycle-consistent generative adversarial network architecture.
FIG. 2 is a framework diagram of the contrastive-learning generator.
FIGS. 3(a) and 3(b) are schematic diagrams of the visible-light remote sensing image/infrared image conversion effect.
FIGS. 4(a) and 4(b) are schematic diagrams of the visible-light remote sensing image/SAR image conversion effect.
FIG. 5 is a schematic diagram of the matching result between the original visible-light remote sensing image and the original infrared image.
FIG. 6 is a schematic diagram of the matching result between the converted infrared image and the original infrared image.
FIG. 7 is a schematic diagram of the matching result between the original visible-light remote sensing image and the original SAR image.
FIG. 8 is a schematic diagram of the matching result between the converted SAR image and the original SAR image.
The following examples illustrate the invention in further detail.
Detailed Description
All the devices and algorithms in the present invention are known in the art, unless otherwise specified.
In the present invention, "/" means "and"; for example, "visible light remote sensing image/SAR image" means a visible light remote sensing image and a SAR image.
SAR: Synthetic Aperture Radar.
The invention discloses a target data cross-domain inversion augmentation method based on remote sensing images, providing a data augmentation method that extends from a single domain to multiple domains to address deep-learning-based multi-source scene-matching navigation and positioning. By augmenting the dataset, the method prevents the overfitting that frequently occurs in deep-learning training and improves the precision and generalization ability of the model.
Considering the multiple imaging modalities involved in unmanned patrol aircraft navigation and positioning tasks, the method is designed to meet the navigation and positioning requirements, enriching the variety of heterogeneous datasets and realizing visual navigation and positioning with multi-domain images.
The following specific embodiments of the invention are given; it should be noted that the invention is not limited to these embodiments, and all equivalent changes made on the basis of the technical solutions of the present application fall within the protection scope of the invention.
Examples:
the embodiment provides a target data cross-domain inversion augmentation method based on remote sensing images, which comprises the following steps:
step 1, generating image data multi-domain conversion of a countermeasure network based on a loop:
the loop generation countermeasure network architecture is shown in fig. 1, and the loop generation countermeasure network includes three parts, namely feature extraction (i.e. encoding), image domain conversion and image reconstruction (i.e. decoding).
Step 101, image generation based on the cycle-consistent generative adversarial network:
This step aims to learn, from the given training samples, the mapping between the two domains A and B; it comprises two generative mappings, G_B : A → B and G_A : B → A. First, the generator G_B converts samples from the A domain to the B domain; then, the generator G_A converts samples from the B domain back to the A domain.
Step 10101, an initial convolution operation is performed on the original image; the image size is unchanged, but the number of feature channels is increased from 3 to 64.
Step 10102, the abstract features of the input image are extracted with two convolution layers, converting the dimensions of the input from 256×256×64 to 64×64×256.
Step 10103, a plurality of residual modules convert the extracted features from the A domain to the B domain.
Step 10104, two deconvolution layers finally decode the features, realizing the image conversion from the A domain to the B domain.
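As an illustration of steps 10101 to 10104, the following is a minimal PyTorch sketch of such a generator; the reflection padding, instance normalization, and residual-block count of 9 are common CycleGAN conventions assumed here, not specified by the patent.

```python
# Minimal sketch of the generator of steps 10101-10104 (assumptions noted above).
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.ReflectionPad2d(1), nn.Conv2d(channels, channels, 3),
            nn.InstanceNorm2d(channels), nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1), nn.Conv2d(channels, channels, 3),
            nn.InstanceNorm2d(channels),
        )

    def forward(self, x):
        return x + self.block(x)  # residual connection

class Generator(nn.Module):
    def __init__(self, n_residual: int = 9):
        super().__init__()
        layers = [
            # Step 10101: initial convolution, image size unchanged, 3 -> 64 channels.
            nn.ReflectionPad2d(3), nn.Conv2d(3, 64, 7),
            nn.InstanceNorm2d(64), nn.ReLU(inplace=True),
            # Step 10102: two stride-2 convolutions, 256x256x64 -> 64x64x256.
            nn.Conv2d(64, 128, 3, stride=2, padding=1),
            nn.InstanceNorm2d(128), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, stride=2, padding=1),
            nn.InstanceNorm2d(256), nn.ReLU(inplace=True),
        ]
        # Step 10103: residual modules perform the A-domain -> B-domain feature conversion.
        layers += [ResidualBlock(256) for _ in range(n_residual)]
        layers += [
            # Step 10104: two deconvolution (transposed convolution) layers decode the image.
            nn.ConvTranspose2d(256, 128, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(128), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(64), nn.ReLU(inplace=True),
            nn.ReflectionPad2d(3), nn.Conv2d(64, 3, 7), nn.Tanh(),
        ]
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)
```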
Step 102, image discrimination for the cycle-consistent generative adversarial network:
The discriminator is a classifier built from four convolution layers: the convolution layers raise the feature map of the input image from 3 to 512 channels, and the confidence that the image is real is then judged through an average pooling layer and a fully connected layer.
Step 103, designing the total loss function between the generated image and the ground truth:
Step 10301, the adversarial loss function L_GAN(G_B, D_B, A, B) is applied to the mapping function G_B : A → B and its corresponding discriminator D_B; the adversarial loss function L_GAN(G_A, D_A, B, A) is applied to the mapping function G_A : B → A and its corresponding discriminator D_A.
Wherein:
A denotes the A domain;
B denotes the B domain;
G_A denotes the generator corresponding to the A domain;
G_B denotes the generator corresponding to the B domain;
D_A denotes the discriminator corresponding to the A domain;
D_B denotes the discriminator corresponding to the B domain.
L_GAN(G_B, D_B, A, B) is expressed as:

L_GAN(G_B, D_B, A, B) = E_{b∼p_data(b)}[log D_B(b)] + E_{a∼p_data(a)}[log(1 − D_B(G_B(a)))]

wherein:
a denotes an image (a sample from the A domain);
b denotes a ground-truth sample (from the B domain);
E_{b∼p_data(b)}[·] denotes the expectation over b distributed as p_data(b);
E_{a∼p_data(a)}[·] denotes the expectation over a distributed as p_data(a);
E denotes the mathematical expectation;
∼ denotes "is distributed as";
p_data(·) denotes the probability density of the data.
In this step, G_B is used to generate images G_B(a) that resemble the B domain, and D_B is used to distinguish the converted samples G_B(a) from the real samples b. A similar adversarial loss L_GAN(G_A, D_A, B, A) is introduced for the mapping function G_A and its corresponding discriminator D_A.
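Written out as code, the adversarial objective of step 10301 for the direction G_B : A → B might look as follows; this is a sketch of the standard GAN objective, with the epsilon added for numerical stability being an implementation assumption.

```python
# Sketch of L_GAN(G_B, D_B, A, B); the discriminator maximizes this value
# while the generator minimizes its second term.
import torch

def gan_loss(disc_b, gen_b, real_a, real_b, eps: float = 1e-8):
    """E_b[log D_B(b)] + E_a[log(1 - D_B(G_B(a)))] over a batch."""
    d_real = disc_b(real_b)          # confidence on real B-domain samples
    d_fake = disc_b(gen_b(real_a))   # confidence on converted samples G_B(a)
    return (torch.log(d_real + eps) + torch.log(1.0 - d_fake + eps)).mean()
```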
Step 10302, for each image a from the A domain, a cycle-consistency loss function L_cyc(G_A, G_B) is used to process the image: the image translation cycle should be able to restore a to the original image, i.e. a → G_B(a) → G_A(G_B(a)) ≈ a. The cycle-consistency loss function L_cyc(G_A, G_B) is expressed as:

L_cyc(G_A, G_B) = E_{a∼p_data(a)}[‖G_A(G_B(a)) − a‖] + E_{b∼p_data(b)}[‖G_B(G_A(b)) − b‖]

wherein:
‖·‖ denotes the norm.
Step 10303, the total loss function L(G_B, G_A, D_A, D_B) between the generated image and the ground truth is expressed as:

L(G_B, G_A, D_A, D_B) = L_GAN(G_B, D_B, A, B) + L_GAN(G_A, D_A, B, A) + λ·L_cyc(G_A, G_B)

wherein:
λ denotes the parameter of the cycle-consistency loss function.
In the present embodiment, λ controls the relative importance of these two objectives.
Step 2, multi-domain data augmentation based on contrastive learning:
Images are generated with an encoder-decoder: the input domain of the generator is X ⊂ R^{H×W×C} and the output domain is Y ⊂ R^{H×W×C}, with unpaired datasets A = {a ∈ X} and B = {b ∈ Y} given, wherein:
X denotes the input domain;
Y denotes the output domain;
R denotes the set of real numbers;
H denotes the height of the image;
W denotes the width of the image;
C denotes the number of channels of the image;
A denotes the unpaired dataset corresponding to the input domain;
B denotes the unpaired dataset corresponding to the output domain;
a denotes data in dataset A;
b denotes data in dataset B.
The generator is divided into two parts, an encoder G_enc and a decoder G_dec, which together produce the output image b̂ = G(a) = G_dec(G_enc(a)). In this embodiment, the framework of the generator is shown in fig. 2. The encoder G_enc is used to obtain high-dimensional feature vectors, and multi-domain data augmentation is realized by iterative training with the total contrastive loss function.
The total contrastive loss function is:

L(G, D, M, A, B) = L_GAN(G, D, A, B) + λ₁·L_NCE(G, M, A) + λ₂·L_ext(G, M, B)

wherein:
L_GAN denotes the adversarial loss function;
L_NCE denotes the maximized mutual information (contrastive) loss function;
L_ext denotes the external loss function;
λ₁ denotes the parameter of the maximized mutual information loss;
λ₂ denotes the parameter of the external loss;
G denotes the generator;
D denotes the discriminator;
A denotes the unpaired dataset corresponding to the input domain;
B denotes the unpaired dataset corresponding to the output domain;
M denotes a multi-layer perceptron network.
In the present embodiment, when λ₁ = 1 and λ₂ = 1 and the two contrastive terms are trained jointly, the network can be regarded as a lightweight CycleGAN.
In the present embodiment, the adversarial loss function L_GAN, the maximized mutual information loss function L_NCE, and the external loss function L_ext are all computed with calculation methods commonly known in the art.
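As an illustration of the maximized mutual information term, the following PatchNCE-style sketch treats corresponding patch features of the input and output images, projected by the multi-layer perceptron M, as positive pairs and all other patches as negatives; the temperature value and the cross-entropy formulation follow the common contrastive-learning recipe and are assumptions rather than patent text.

```python
# Sketch of a PatchNCE-style contrastive loss (one feature layer, one image).
import torch
import torch.nn.functional as F

def patch_nce_loss(feat_src: torch.Tensor, feat_out: torch.Tensor, tau: float = 0.07):
    """feat_src, feat_out: (num_patches, dim) MLP-projected patch features of the
    input image and the generated image at corresponding spatial locations."""
    feat_src = F.normalize(feat_src, dim=1)
    feat_out = F.normalize(feat_out, dim=1)
    logits = feat_out @ feat_src.t() / tau                        # all patch-pair similarities
    targets = torch.arange(logits.size(0), device=logits.device)  # positives on the diagonal
    return F.cross_entropy(logits, targets)
```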
Step 3, image migration synthesis to obtain a multi-domain augmented dataset:
Step 301, a set of unpaired visible-light remote sensing/infrared image data and a set of unpaired visible-light remote sensing/SAR image data are given, together with a visible-light remote sensing image dataset to be converted, which serves as the verification set.
Step 302, the two sets of data given in step 301 are trained with the CycleGAN-based multi-domain image conversion method of step 1, and the visible-light remote sensing images are converted by model inference into a corresponding infrared image dataset and SAR image dataset.
Step 303, the two sets of data given in step 301 are trained with the contrastive-learning-based multi-domain data augmentation method of step 2, and the visible-light remote sensing images are converted by model inference into a corresponding infrared image dataset and SAR image dataset.
Step 304, the datasets obtained in steps 302 and 303 are fused, respectively, to form a fused infrared image dataset and a fused SAR image dataset, thereby constituting the multi-domain augmented dataset.
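A minimal sketch of the fusion of step 304 for one branch (infrared) follows; the directory layout and file naming are illustrative assumptions.

```python
# Sketch: merge the CycleGAN-branch and contrastive-branch outputs into one
# fused dataset directory; the paths are placeholders.
from pathlib import Path
import shutil

def fuse_datasets(dir_cyc: str, dir_con: str, dir_fused: str) -> int:
    out = Path(dir_fused)
    out.mkdir(parents=True, exist_ok=True)
    n = 0
    for tag, src in (("cyc", Path(dir_cyc)), ("con", Path(dir_con))):
        for img in sorted(src.glob("*.png")):
            # Prefix each file with its generating branch so names cannot collide.
            shutil.copy(img, out / f"{tag}_{img.name}")
            n += 1
    return n  # number of images in the fused dataset
```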
In this embodiment, the visible-light remote sensing image conversion effects are shown in figs. 3(a), 3(b), 4(a) and 4(b).
Step 4, similarity calculation and matching test:
and (3) taking the multi-domain augmentation data set obtained in the step (3) as a reference graph, and calculating the similarity of the images through a PSNR (peak-to-noise ratio) algorithm and an LPIPS (learning perceptual imagepatch similarity) algorithm.
In the step, the generation effect of the visible light remote sensing image/infrared image and the visible light remote sensing image/SAR image is evaluated through the similarity, wherein the larger the PSNR is and the smaller the LPIPS is, the higher the representative image similarity is.
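A sketch of the two similarity metrics follows; PSNR is computed directly, while LPIPS uses the third-party lpips package with the 'alex' backbone, which is an assumption (the patent does not name a backbone).

```python
# Sketch of the step-4 similarity metrics for float tensors in [0, 1], shape (1, 3, H, W).
import torch
import lpips

def psnr(ref: torch.Tensor, test: torch.Tensor, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; larger means more similar."""
    mse = torch.mean((ref - test) ** 2)
    return float(10.0 * torch.log10(max_val ** 2 / mse))

lpips_fn = lpips.LPIPS(net='alex')  # learned perceptual image patch similarity

def perceptual_distance(ref: torch.Tensor, test: torch.Tensor) -> float:
    """LPIPS distance; smaller means more similar. lpips expects inputs in [-1, 1]."""
    return float(lpips_fn(ref * 2 - 1, test * 2 - 1))
```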
In this example, the evaluation results are shown in tables 1 and 2.
TABLE 1 Comparison of visible-light remote sensing image/infrared image conversion effects
(table rendered as an image in the source; data not reproduced)
TABLE 2 Comparison of visible-light remote sensing image/SAR image conversion effects
(table rendered as an image in the source; data not reproduced)
The multi-domain augmented dataset obtained in step 3 is taken as the reference image, and matching tests are carried out with the ORB (Oriented FAST and Rotated BRIEF) algorithm and the LoFTR (detector-free local feature matching with transformers) algorithm.
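A sketch of the ORB matching test using OpenCV follows (LoFTR is a learned matcher with published reference implementations and is not reproduced here); the feature count and match cap are illustrative assumptions.

```python
# Sketch: ORB feature matching between a reference image and a query image.
import cv2

def orb_match(path_ref: str, path_query: str, max_matches: int = 50):
    ref = cv2.imread(path_ref, cv2.IMREAD_GRAYSCALE)
    query = cv2.imread(path_query, cv2.IMREAD_GRAYSCALE)
    orb = cv2.ORB_create(nfeatures=1000)
    _, d_ref = orb.detectAndCompute(ref, None)
    _, d_query = orb.detectAndCompute(query, None)
    # Hamming distance is the natural metric for ORB's binary descriptors.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d_query, d_ref), key=lambda m: m.distance)
    return matches[:max_matches]  # best matches by descriptor distance
```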
In this embodiment, the test results are shown in fig. 5, 6, 7 and 8.
Simulation example:
the effect of the invention is further illustrated by the following simulations:
1. Simulation conditions:
To verify the effectiveness of the invention, multi-domain augmentation was carried out on several groups of datasets, obtaining the corresponding infrared and SAR image results. Experimental environment: Ubuntu 18.04 operating system; notebook computer with a 2.9 GHz Intel Xeon E5-2667 processor.
2. Simulation experiment:
the invention is utilized to generate a large amount of data and compare experiments. The multi-domain data set generated by the method improves the accuracy of image matching by comparing the traditional matching algorithm with the existing intelligent matching algorithm, and has a good effect on navigation positioning of unmanned aerial vehicles.
Fig. 5 is an original visible remote sensing image/original infrared image matching result. Fig. 6 is a converted infrared image/original infrared matching result. Fig. 7 is an original visible light remote sensing image/original SAR matching result. Fig. 8 is a post-conversion SAR/original SAR matching result. From the above figures, it can be seen that the multi-domain data augmentation method solves the problems of mismatching, inaccurate navigation positioning and the like, realizes navigation positioning of multi-domain images in the multi-source sensor, and effectively improves the positioning accuracy of the aircraft.
Comparative example 1:
this comparative example shows a method of target data cross-domain inversion augmentation, the other steps of which are substantially identical to the examples, except for the first step. In this comparative example, specific:
step one, loss function adjustment:
the loss function of the algorithm is a binary cross entropy loss function, that is, the loss function combining the binary cross entropy and the Sigmoid activation function is used for training.
Comparative example 2:
this comparative example shows a method of target data cross-domain inversion augmentation, the other steps of which are substantially identical to the examples, except for the first step. In this comparative example, specific:
step one, loss function adjustment:
the loss function of the algorithm is the smoothl 1 loss function, i.e. training with a loss function that uses a square function around the 0 point so that it is smoother.
Comparing and analyzing the embodiment with comparative examples 1 and 2 shows that the method has faster network convergence and higher network stability. Figs. 3(a), 3(b), 4(a) and 4(b) show the conversion effects after model training, and the matching experiment results show that the loss function used by the invention yields better conversion effects.
Comparative example 3:
the comparative example provides a target data cross-domain inversion augmentation method, which adopts a styleGAN model to carry out cross-domain inversion on a remote sensing visible light remote sensing image.
Comparative example 4:
the comparative example provides a target data cross-domain inversion augmentation method, which adopts a Pix2Pix model to carry out cross-domain inversion on a remote sensing visible light remote sensing image.
Comparing and analyzing the embodiment with comparative examples 3 and 4 shows that the infrared and SAR images generated by the method have better modal consistency and preserve the details of the image content, coming closer to real infrared and SAR images, while comparative examples 3 and 4 show partial distortion.

Claims (2)

1. A target data cross-domain inversion augmentation method based on remote sensing images, characterized by comprising the following steps:
Step 1, multi-domain conversion of image data based on a cycle-consistent generative adversarial network (CycleGAN):
Step 101, image generation based on the cycle-consistent generative adversarial network;
step 101 includes the steps of:
step 10101, performing an initial convolution operation on the original image, wherein the image size is unchanged, but the feature map of the image is converted from 3 to 64;
step 10102, extracting abstract features of the input image by using two convolution layers, and converting the dimension of the final input image from 256×256×64 to 64×64×256;
step 10103, using a plurality of residual modules to extract features from the data
Figure QLYQS_1
Domain conversion to->
Figure QLYQS_2
A domain;
step 10104, finally decoding by two-layer deconvolution to realize image decoding
Figure QLYQS_3
Domain to->
Figure QLYQS_4
Domain conversion;
step 102, judging a generated image of a loop-generated countermeasure network;
step 102 includes the following steps:
the arbiter for generating image discrimination is a classifier based on four layers of convolution layers, the feature map of the input image is extracted from 3 dimensions to 512 dimensions by using the convolution layers, and then the confidence coefficient of the image is discriminated through the full connection layer and the average pooling layer;
step 103, designing a total loss function between the generated image and the true value;
step 103 includes the following steps:
step 10301, willBDomain-specific counterloss function
Figure QLYQS_5
Applied to the slaveADomain to domainBDomain-basedBMapping function of domain-specific generator>
Figure QLYQS_6
Corresponding discriminator->
Figure QLYQS_7
The method comprises the steps of carrying out a first treatment on the surface of the Will beADomain-specific counterloss function
Figure QLYQS_8
Applied to the slaveBDomain to domainADomain-basedAMapping function of domain-specific generator>
Figure QLYQS_9
Corresponding discriminatorD A
Wherein:
Arepresentation ofAA domain;
Brepresentation ofBA domain;
Figure QLYQS_10
representation ofAA generator corresponding to the domain;
Figure QLYQS_11
representation ofBA generator corresponding to the domain;
D A representation ofAA discriminator corresponding to the domain;
Figure QLYQS_12
representation ofBA discriminator corresponding to the domain;
step 10302, for each sheet from
Figure QLYQS_13
Image of Domain->
Figure QLYQS_14
By means of a cyclic consistency loss function>
Figure QLYQS_15
Image->
Figure QLYQS_16
Processing, image->
Figure QLYQS_17
The loop should satisfy the picture +.>
Figure QLYQS_18
Restoring to an original image;
step 10303, generating a total loss function between the image and the truth value
Figure QLYQS_19
Expressed as:
Figure QLYQS_20
wherein:
Figure QLYQS_21
a parameter representing a cyclic loss function; />
Step 2, multi-domain data augmentation based on contrastive learning;
Step 2 includes the following steps:
images are generated with an encoder-decoder; the input domain of the generator is X ⊂ R^{H×W×C} and the output domain is Y ⊂ R^{H×W×C}, with unpaired datasets A = {a ∈ X} and B = {b ∈ Y} given;
wherein:
X denotes the input domain;
Y denotes the output domain;
R denotes the set of real numbers;
H denotes the height of the image;
W denotes the width of the image;
C denotes the number of channels of the image;
A denotes the unpaired dataset corresponding to the input domain;
B denotes the unpaired dataset corresponding to the output domain;
a denotes data in dataset A;
b denotes data in dataset B;
the generator is divided into two parts, an encoder G_enc and a decoder G_dec, thereby generating the output image b̂ = G(a) = G_dec(G_enc(a)); the encoder G_enc is used to obtain high-dimensional feature vectors, and multi-domain data augmentation is realized by iterative training with the total contrastive loss function;
the total contrast loss function is:
Figure QLYQS_32
wherein:
Figure QLYQS_33
representing an fight loss function;
Figure QLYQS_34
representing a maximized mutual information loss function;
Figure QLYQS_35
representing an external loss function;
Figure QLYQS_36
representing a maximized mutual information loss parameter;
Figure QLYQS_37
representing an external loss parameter;
Ga representation generator;
Da representation discriminator;
Arepresenting unpaired datasets corresponding to input fields;
Brepresenting unpaired datasets corresponding to the output fields;
Mrepresenting a multi-layer perceptron network;
step 3, image migration synthesis is carried out to obtain a multi-domain augmentation data set:
step 301, a set of visible light remote sensing image/infrared image unpaired data sets is given for each
Figure QLYQS_38
And->
Figure QLYQS_39
Representation, a set of visible remote sensing image/SAR image unpaired datasets is respectively represented by +.>
Figure QLYQS_40
And->
Figure QLYQS_41
Representing, given a remote sensing image dataset of visible light to be converted +.>
Figure QLYQS_42
For use as a verification set; />
Step 302, training the two sets of data given in step 301 by the method of generating multi-domain conversion of image data of countermeasure network based on loop in step 1Conversion of visible light remote sensing images into corresponding infrared image data sets is achieved through model reasoning
Figure QLYQS_43
SAR image dataset +.>
Figure QLYQS_44
Step 303, training the two sets of data given in step 301 by the multi-domain data augmentation method based on contrast learning in step 2, and converting the visible light remote sensing image into a corresponding infrared image dataset by model reasoning
Figure QLYQS_45
SAR image dataset +.>
Figure QLYQS_46
Step 304, respectively fusing the data sets obtained in step 302 and step 303 to form a fused infrared image data set
Figure QLYQS_47
And fusing the SAR image dataset +.>
Figure QLYQS_48
Thereby forming a multi-domain augmented data set.
2. The method of claim 1, further comprising step 4, similarity calculation and matching test:
the multi-domain augmented dataset obtained in step 3 is taken as the reference image, and image similarity is calculated with the PSNR algorithm and the LPIPS algorithm;
the multi-domain augmented dataset obtained in step 3 is taken as the reference image, and matching tests are carried out with the ORB algorithm and the LoFTR algorithm.
CN202310101406.8A 2023-02-13 2023-02-13 Target data cross-domain inversion augmentation method based on remote sensing image Active CN115797163B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310101406.8A CN115797163B (en) 2023-02-13 2023-02-13 Target data cross-domain inversion augmentation method based on remote sensing image

Publications (2)

Publication Number Publication Date
CN115797163A CN115797163A (en) 2023-03-14
CN115797163B true CN115797163B (en) 2023-04-28

Family

ID=85430897

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310101406.8A Active CN115797163B (en) 2023-02-13 2023-02-13 Target data cross-domain inversion augmentation method based on remote sensing image

Country Status (1)

Country Link
CN (1) CN115797163B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113283444A (en) * 2021-03-30 2021-08-20 电子科技大学 Heterogeneous image migration method based on generation countermeasure network

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN112396110B (en) * 2020-11-20 2024-02-02 南京大学 Method for generating augmented image of countermeasure cascade network
CN113298056A (en) * 2021-07-27 2021-08-24 自然资源部国土卫星遥感应用中心 Multi-mode remote sensing image change detection method, model generation method and terminal equipment
CN115310515A (en) * 2022-07-06 2022-11-08 山东科技大学 Fault-labeled seismic data sample set amplification method based on generation countermeasure network


Also Published As

Publication number Publication date
CN115797163A (en) 2023-03-14

Similar Documents

Publication Publication Date Title
Wang et al. SFNet-N: An improved SFNet algorithm for semantic segmentation of low-light autonomous driving road scenes
Cong et al. RRNet: Relational reasoning network with parallel multiscale attention for salient object detection in optical remote sensing images
WO2021057186A1 (en) Neural network training method, data processing method, and related apparatuses
Jaus et al. Panoramic panoptic segmentation: Towards complete surrounding understanding via unsupervised contrastive learning
CN111127538A (en) Multi-view image three-dimensional reconstruction method based on convolution cyclic coding-decoding structure
Zhu et al. Shadow compensation for synthetic aperture radar target classification by dual parallel generative adversarial network
CN113838064B (en) Cloud removal method based on branch GAN using multi-temporal remote sensing data
CN113743544A (en) Cross-modal neural network construction method, pedestrian retrieval method and system
CN113096239B (en) Three-dimensional point cloud reconstruction method based on deep learning
CN112215296A (en) Infrared image identification method based on transfer learning and storage medium
CN114612660A (en) Three-dimensional modeling method based on multi-feature fusion point cloud segmentation
Hwang et al. Lidar depth completion using color-embedded information via knowledge distillation
CN114387195A (en) Infrared image and visible light image fusion method based on non-global pre-enhancement
CN116612468A (en) Three-dimensional target detection method based on multi-mode fusion and depth attention mechanism
CN112905828A (en) Image retriever, database and retrieval method combined with significant features
CN115272599A (en) Three-dimensional semantic map construction method oriented to city information model
CN114445816A (en) Pollen classification method based on two-dimensional image and three-dimensional point cloud
Zhao et al. Label freedom: Stable diffusion for remote sensing image semantic segmentation data generation
CN117788810A (en) Learning system for unsupervised semantic segmentation
CN115797163B (en) Target data cross-domain inversion augmentation method based on remote sensing image
CN116433904A (en) Cross-modal RGB-D semantic segmentation method based on shape perception and pixel convolution
CN116343016A (en) Multi-angle sonar image target classification method based on lightweight convolution network
CN115131245A (en) Point cloud completion method based on attention mechanism
Yin et al. M2F2-RCNN: Multi-functional faster RCNN based on multi-scale feature fusion for region search in remote sensing images
CN114529939A (en) Pedestrian identification method based on millimeter wave radar point cloud clustering and deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant