CN113793267B - Self-supervision single remote sensing image super-resolution method based on cross-dimension attention mechanism - Google Patents
- Publication number: CN113793267B
- Application number: CN202111102985.5A
- Authority: CN (China)
- Prior art keywords: image, convolution, cross, attention mechanism, resolution
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T3/4053: Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06T3/4046: Scaling of whole images or parts thereof using neural networks
- G06N3/045: Neural network architectures; combinations of networks
- G06N3/048: Neural network architectures; activation functions
- G06N3/088: Learning methods; non-supervised learning, e.g. competitive learning
Abstract
The invention discloses a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism, belonging to the technical field of pattern recognition. First, a cross-dimension attention mechanism guiding network is provided: a self-supervised super-resolution method that exploits the recurrence of information within a single image to avoid the dependence of model performance on a large-scale training data set. Second, a cross-dimension attention mechanism module is provided that models the interdependence between the channel and spatial features of the picture features, takes the interaction between the channel dimension and the spatial dimension into account, learns the feature weights of channel and space, and selectively captures more informative features, further improving the learning capacity of the static convolutional neural network.
Description
Technical Field
The invention relates to the technical field of pattern recognition, in particular to a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism.
Background
Image super-resolution refers to recovering a high-resolution image from a low-resolution image or image sequence. Image super-resolution technology is divided into super-resolution restoration and super-resolution reconstruction. At present, the main image super-resolution methods are: (1) interpolation-based image super-resolution methods; (2) reconstruction-based image super-resolution methods; (3) learning-based image super-resolution methods.
Over the past few years, deep-learning-based remote sensing image super-resolution methods have in some cases effectively overcome the physical resolution limitations of remote sensing imaging sensors; an indispensable factor in the success of such methods is a large number of specific data sets. However, when a remote sensing image is actually acquired, various complex factors cause the degradation mode of the real remote sensing image to differ greatly from that of the specific data set, so the performance of the trained model drops sharply in practical applications.
Disclosure of Invention
In order to solve the dependence of existing image super-resolution methods on a large-scale training data set, an embodiment of the invention provides a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism. A cross-dimension attention mechanism is provided to guide the network, and the recurrence of information within a single image is exploited to avoid the dependence of model performance on a large-scale training data set. The input image is downsampled to form a self-training pair, the self-similarity and the degradation process inside the image are learned during training, and image super-resolution reconstruction is then performed on the input low-resolution image. The technical scheme is as follows:
the invention provides a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism, which comprises the following steps:
downsampling an input image;
extracting convolution characteristics of the image;
calculating the weights of the channel features and the spatial features, wherein the weight matrix over the different channels of the image convolution feature is T_c:

T_c = Sigmoid(f_{1×1}(ReLU(f_{1×1}(Avg(F)))))

where F ∈ R^{C×H×W} is the image convolution feature, Avg is global average pooling, and f_{1×1} is a convolution operation with a 1×1 kernel;

the weight matrix over the different spatial positions of the image convolution feature is T_s:

T_s = Sigmoid(f_{1×1}(F))

where F ∈ R^{C×H×W} is the image convolution feature and f_{1×1} is a convolution operation with a 1×1 kernel;

calculating the channel-space feature weight T ∈ R^{C×H×W} from the channel weight and the spatial weight:

T = Sigmoid(f_{1×1}(T_c × T_s))

where × is the matrix multiplication fusing T_c ∈ R^{C×1×1} and T_s ∈ R^{1×H×W}, and f_{1×1} is a convolution operation with a 1×1 kernel;

computing the output image feature F′ of the cross-dimension attention mechanism:

F′ = T ⊙ F

where ⊙ is element-wise multiplication;

optimizing the training process using the least absolute deviation as the loss function, wherein the least absolute deviation L_1 is:

L_1(θ) = || CDAN_θ(LR↓_s) − LR ||_1

where θ is the parameter set of the cross-dimension attention mechanism network (CDAN), LR is the input low-resolution image, and LR↓_s is the image obtained by downsampling LR by a factor of s.
In the above self-supervision single remote sensing image super-resolution method based on the cross-dimension attention mechanism, optionally, the downsampling of the input image is specifically: using a low-resolution remote sensing image as input, a downsampling operation by a factor of s is performed to obtain a lower-resolution image corresponding to the input image, 1/s of the input size in each spatial dimension, so that a matching image pair for the input image is constructed.
In the above self-supervision single remote sensing image super-resolution method based on the cross-dimension attention mechanism, optionally, the extraction of the convolution features of the image is specifically: obtaining the image convolution features through a ReLU layer and a convolution layer.
The technical scheme provided by the embodiment of the invention has the beneficial effects that:
the embodiment of the invention provides a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism, and provides a novel image super-resolution convolutional neural network, namely a cross-dimension attention mechanism guiding network, which is a self-supervision super-resolution method, and the reproducibility of information in a single image is utilized to avoid the dependence of model performance on a large-scale training data set. Secondly, a cross-dimension attention mechanism module is also provided, through modeling the interdependence relationship between the channel and the space feature of the picture feature, the interaction between the channel dimension and the space dimension is considered, the feature weight of the channel and the space is obtained through learning, more information features are selectively captured, and the learning capacity of the static convolution neural network is further improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism provided by the embodiment of the invention;
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
The following will describe in detail a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism according to an embodiment of the present invention with reference to fig. 1.
Referring to fig. 1, a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism in an embodiment of the invention includes:
step 110: downsampling an input image;
in theory, the relationship between the low resolution image and the high resolution image is expressed as:
I LR =(I HR *k)↓ s +n
wherein I is LR Representing low resolution images, I HR Representing a high resolution image, representing a convolution operation, k representing a blur kernel, +. s Representing s times the downsampling and n representing noise, such as speckle noise, acoustic noise.
Since the actual remote sensing image is acquired, the high resolution image I HR Unknown, fuzzy kernel k and noise n are uncertain and there is no reliable pair-wise data set to train the network. Thus, using a low resolution remote sensing image as input, a downsampling operation is performed by a factor of s to obtain a lower resolution image corresponding to the input image, which is of the sizeA matching image pair for the input image is constructed. Use->A low resolution image of size is used as input to the training process.
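As a hedged illustration of the degradation model and of the self-training pair, the following NumPy sketch (the function names, the box blur kernel, and the noise level are assumptions of this sketch, not the patent's implementation) blurs, decimates, and perturbs a low-resolution input to form its lower-resolution counterpart:

```python
import numpy as np

def degrade(hr, kernel, s, noise_std=0.01, seed=0):
    """Toy version of I_LR = (I_HR * k)↓_s + n for a single-channel image.

    `kernel` is a small blur kernel; ↓_s is modelled by keeping every
    s-th pixel. All parameter values here are illustrative assumptions."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(hr, ((ph, ph), (pw, pw)), mode="edge")
    blurred = np.zeros_like(hr)
    for i in range(hr.shape[0]):          # I_HR * k: spatial convolution
        for j in range(hr.shape[1]):
            blurred[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    down = blurred[::s, ::s]              # ↓_s: decimate by factor s
    rng = np.random.default_rng(seed)
    return down + noise_std * rng.standard_normal(down.shape)  # + n

# Self-training pair: the input LR image plays the role of "HR";
# its s-times downsampled copy plays the role of "LR".
lr = np.random.default_rng(1).random((64, 64))
box = np.full((3, 3), 1.0 / 9.0)          # simple box blur kernel (assumed)
lower = degrade(lr, box, s=2)
print(lower.shape)  # (32, 32)
```

The pair (lower, lr) then serves as the (input, target) training pair, exactly in the spirit of the self-supervised scheme above.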
Step 120: extracting convolution characteristics of the image;
It should be noted that the image convolution features of the training input are extracted using a convolutional neural network. Specifically, the image convolution feature F ∈ R^{C×H×W} is obtained through a ReLU layer and a convolution layer, where C is the number of channels and H and W are the height and width of the image feature. The image convolution feature F ∈ R^{C×H×W} serves as the input of the cross-dimension attention mechanism module (CDAM).
Step 130: calculating the weights of the channel features and the spatial features, wherein the weight matrix over the different channels of the image convolution feature is T_c:

T_c = Sigmoid(f_{1×1}(ReLU(f_{1×1}(Avg(F)))))

where F ∈ R^{C×H×W} represents the image convolution feature, Avg represents global average pooling, f_{1×1} represents a convolution operation with a 1×1 kernel, and Sigmoid represents the activation function;

the weight matrix over the different spatial positions of the image convolution feature is T_s:

T_s = Sigmoid(f_{1×1}(F))

where F ∈ R^{C×H×W} represents the image convolution feature and f_{1×1} represents a convolution operation with a 1×1 kernel;
It should be noted that the convolution feature obtained in step 120 is used as the input of the cross-dimension attention mechanism module (CDAM). Specifically, the CDAM is divided into two branches. The first branch passes the image convolution feature F ∈ R^{C×H×W} through a global average pooling layer, so that the global spatial information of F is compressed into the channels of the picture convolution feature, and then obtains the weight matrix T_c ∈ R^{C×1×1} of the different channels through a convolution layer, a ReLU layer, and a Sigmoid activation function. The second branch passes the picture convolution feature F ∈ R^{C×H×W} through a convolution layer and a Sigmoid activation function to obtain the weight matrix T_s ∈ R^{1×H×W} carrying the different spatial information. The channel attention feature and the spatial attention feature are derived by the modeling process above.
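As a hedged illustration, the two branches described above can be sketched with NumPy; the layer weights below are random placeholders rather than trained parameters, and modelling the 1×1 convolutions as small matrices is an assumption of this sketch:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
F = rng.random((C, H, W))                 # image convolution feature

# Branch 1 (channel weights T_c): global average pooling, then two 1x1
# convolutions (modelled as C x C matrices with placeholder weights),
# with ReLU between them and Sigmoid after.
w1 = rng.standard_normal((C, C)) * 0.1
w2 = rng.standard_normal((C, C)) * 0.1
avg = F.mean(axis=(1, 2))                 # Avg(F): shape (C,)
Tc = sigmoid(w2 @ np.maximum(w1 @ avg, 0.0)).reshape(C, 1, 1)

# Branch 2 (spatial weights T_s): a 1x1 convolution collapsing the C
# channels to 1, then Sigmoid, giving one weight per spatial position.
w3 = rng.standard_normal(C) * 0.1
Ts = sigmoid(np.tensordot(w3, F, axes=(0, 0))).reshape(1, H, W)

print(Tc.shape, Ts.shape)  # (8, 1, 1) (1, 16, 16)
```

The shapes R^{C×1×1} and R^{1×H×W} match the weight matrices named in the text, and the Sigmoid keeps every weight in (0, 1).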
Step 140: calculating the channel-space feature weight T ∈ R^{C×H×W} from the channel weight and the spatial weight:

T = Sigmoid(f_{1×1}(T_c × T_s))

where f_{1×1} represents a convolution operation with a 1×1 kernel.

It should be noted that the weighted channel features and spatial features are fused by matrix multiplication, after which the cross-dimension channel-space feature weight T ∈ R^{C×H×W} is obtained through a convolution layer and a Sigmoid activation function.
Step 150: computing the output image feature F′ of the cross-dimension attention mechanism:

F′ = T ⊙ F

It should be noted that the channel-space feature weight T of the image is fused with the convolution feature F of the input network by element-wise multiplication to obtain the output image feature F′ of the cross-dimension attention mechanism module, which serves as the input of the subsequent network.
The whole cross-dimension attention mechanism module jointly learns the channel and spatial feature information of the input image features to obtain their interdependence, establishes a cross-dimension attention model of channel and spatial feature information, and effectively obtains the attention weights of the channel and spatial feature information over the whole image feature. The convolutional neural network exploits the recurrence of cross-scale information within the image and is not limited by patch-based methods.
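As a hedged illustration, the fusion of steps 140-150 can be sketched with NumPy; the 1×1 convolution is modelled as a C×C channel-mixing matrix with random placeholder weights, and the stand-in values for T_c and T_s are assumptions of this sketch:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
F = rng.random((C, H, W))                  # input convolution feature
Tc = rng.random((C, 1, 1))                 # channel weights (stand-in values)
Ts = rng.random((1, H, W))                 # spatial weights (stand-in values)

# Fuse the two weights: the (C,1,1) x (1,H,W) matrix multiplication
# broadcasts to a (C,H,W) outer product; a 1x1 convolution (placeholder
# weights) mixes channels, and Sigmoid yields T in R^{C x H x W}.
fused = Tc * Ts                            # shape (C, H, W)
w = rng.standard_normal((C, C)) * 0.1
T = sigmoid(np.einsum("oc,chw->ohw", w, fused))

F_out = T * F                              # element-wise: output feature F'
print(F_out.shape)  # (8, 16, 16)
```

Since every entry of T lies in (0, 1), the module acts as a learned per-channel, per-position gating of the input feature F.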
Step 160: optimizing the training process using the least absolute deviation as the loss function, wherein the least absolute deviation L_1 is:

L_1(θ) = || CDAN_θ(LR↓_s) − LR ||_1

where θ represents the parameter set of the cross-dimension attention mechanism network (CDAN), LR represents the input low-resolution image, and LR↓_s represents the image obtained by downsampling LR by a factor of s.

It should be noted that, in the training process, step 110 reduces the input image by the super-resolution factor s, and the reduced picture is then super-resolved through the cross-dimension attention mechanism network (CDAN) into a picture of the same size as LR. The CDAN contains six cross-dimension attention mechanism modules (CDAM) and as a whole uses the least absolute deviation L_1 as the loss function, so that the error between the true value I_LR and the predicted value I′_LR is minimized; when the learning rate reaches 10^-6, the entire training process ends.
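As a hedged illustration of the loss, the least absolute deviation can be sketched as follows (using the mean rather than the sum of absolute errors is an assumption; the patent only names the least absolute deviation):

```python
import numpy as np

def l1_loss(pred, target):
    """Least absolute deviation: mean |pred - target| over all pixels.
    The mean-vs-sum reduction is an assumption of this sketch."""
    return np.mean(np.abs(pred - target))

# The "true value" I_LR is the network input; the "predicted value"
# I'_LR is the network's super-resolved output (simulated here).
lr = np.random.default_rng(0).random((64, 64))
pred = lr + 0.05 * np.random.default_rng(1).standard_normal(lr.shape)
loss = l1_loss(pred, lr)
print(loss >= 0.0)  # True
```

During training this value is minimized over θ; it is zero exactly when the prediction reproduces the input image.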
This method does not require a large number of extra paired data sets to train the network: the downsampled version of the input image is regarded as LR, the picture obtained from it by super-resolution through the cross-dimension attention mechanism network is SR, and the input picture itself is regarded as HR, with the least absolute deviation L_1 used as the loss function of the network. On this basis, self-supervised super-resolution can be achieved. Particularly for remote sensing images with highly repetitive structure, through six cross-dimension attention mechanism modules (CDAM), the cross-dimension attention mechanism network (CDAN) learns the self-similarity and the degradation process inside the image through training, yielding a training model specific to the input image.
After training, the trained super-resolution model is tested: a low-resolution image I_LR is input and super-resolved to I_SR, whose size is s times that of I_LR.
The foregoing description of the preferred embodiments is not intended to limit the invention; any modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.
Claims (3)
1. A self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism is characterized by comprising the following steps:
downsampling an input image;
extracting convolution characteristics of the image;
calculating the weights of the channel features and the spatial features, wherein the weight matrix over the different channels of the image convolution feature is T_c:

T_c = Sigmoid(f_{1×1}(ReLU(f_{1×1}(Avg(F)))))

where F ∈ R^{C×H×W} is the image convolution feature, Avg is global average pooling, and f_{1×1} is a convolution operation with a 1×1 kernel;

the weight matrix over the different spatial positions of the image convolution feature is T_s:

T_s = Sigmoid(f_{1×1}(F))

where F ∈ R^{C×H×W} is the image convolution feature and f_{1×1} is a convolution operation with a 1×1 kernel;

calculating the channel-space feature weight T ∈ R^{C×H×W} from the channel weight and the spatial weight:

T = Sigmoid(f_{1×1}(T_c × T_s))

where × is the matrix multiplication fusing T_c ∈ R^{C×1×1} and T_s ∈ R^{1×H×W}, and f_{1×1} is a convolution operation with a 1×1 kernel;

computing the output image feature F′ of the cross-dimension attention mechanism:

F′ = T ⊙ F

where ⊙ is element-wise multiplication;

optimizing the training process using the least absolute deviation as the loss function, wherein the least absolute deviation L_1 is:

L_1(θ) = || CDAN_θ(LR↓_s) − LR ||_1

where θ is the parameter set of the cross-dimension attention mechanism network (CDAN), LR is the input low-resolution image, and LR↓_s is the image obtained by downsampling LR by a factor of s.
2. The image super-resolution method according to claim 1, wherein the downsampling of the input image is specifically: using a low-resolution remote sensing image as input, a downsampling operation by a factor of s is performed to obtain a lower-resolution image corresponding to the input image, 1/s of the input size in each spatial dimension, so that a matching image pair for the input image is constructed.
3. The image super-resolution method according to claim 1 or 2, wherein the extraction of the convolution features of the image is specifically: obtaining the image convolution features through a ReLU layer and a convolution layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111102985.5A CN113793267B (en) | 2021-09-18 | 2021-09-18 | Self-supervision single remote sensing image super-resolution method based on cross-dimension attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113793267A CN113793267A (en) | 2021-12-14 |
CN113793267B true CN113793267B (en) | 2023-08-25 |
Family
ID=79183960
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111102985.5A Active CN113793267B (en) | 2021-09-18 | 2021-09-18 | Self-supervision single remote sensing image super-resolution method based on cross-dimension attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113793267B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114549316A (en) * | 2022-02-18 | 2022-05-27 | 中国石油大学(华东) | Remote sensing single image super-resolution method based on channel self-attention multi-scale feature learning |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110992270A (en) * | 2019-12-19 | 2020-04-10 | 西南石油大学 | Multi-scale residual attention network image super-resolution reconstruction method based on attention |
CN112419155A (en) * | 2020-11-26 | 2021-02-26 | 武汉大学 | Super-resolution reconstruction method for fully-polarized synthetic aperture radar image |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018053340A1 (en) * | 2016-09-15 | 2018-03-22 | Twitter, Inc. | Super resolution using a generative adversarial network |
US11756160B2 (en) * | 2018-07-27 | 2023-09-12 | Washington University | ML-based methods for pseudo-CT and HR MR image estimation |
- 2021-09-18: CN application CN202111102985.5A, patent CN113793267B (en), status Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110992270A (en) * | 2019-12-19 | 2020-04-10 | 西南石油大学 | Multi-scale residual attention network image super-resolution reconstruction method based on attention |
CN112419155A (en) * | 2020-11-26 | 2021-02-26 | 武汉大学 | Super-resolution reconstruction method for fully-polarized synthetic aperture radar image |
Non-Patent Citations (1)
Title |
---|
Image super-resolution based on a channel-attention-mechanism convolutional neural network; Duanmu Chunjiang; Yao Songlin; Computer Era (04); full text *
Also Published As
Publication number | Publication date |
---|---|
CN113793267A (en) | 2021-12-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109389556B (en) | Multi-scale cavity convolutional neural network super-resolution reconstruction method and device | |
CN110136062B (en) | Super-resolution reconstruction method combining semantic segmentation | |
WO2018120329A1 (en) | Single-frame super-resolution reconstruction method and device based on sparse domain reconstruction | |
CN109087273B (en) | Image restoration method, storage medium and system based on enhanced neural network | |
CN109087258B (en) | Deep learning-based image rain removing method and device | |
CN111079532A (en) | Video content description method based on text self-encoder | |
CN113177882B (en) | Single-frame image super-resolution processing method based on diffusion model | |
CN110363068B (en) | High-resolution pedestrian image generation method based on multiscale circulation generation type countermeasure network | |
CN109636721B (en) | Video super-resolution method based on countermeasure learning and attention mechanism | |
CN111259904B (en) | Semantic image segmentation method and system based on deep learning and clustering | |
CN112699844B (en) | Image super-resolution method based on multi-scale residual hierarchy close-coupled network | |
CN112102163B (en) | Continuous multi-frame image super-resolution reconstruction method based on multi-scale motion compensation framework and recursive learning | |
CN111402138A (en) | Image super-resolution reconstruction method of supervised convolutional neural network based on multi-scale feature extraction fusion | |
CN114548265B (en) | Crop leaf disease image generation model training method, crop leaf disease identification method, electronic equipment and storage medium | |
CN111861886B (en) | Image super-resolution reconstruction method based on multi-scale feedback network | |
CN114170088A (en) | Relational reinforcement learning system and method based on graph structure data | |
CN112819705A (en) | Real image denoising method based on mesh structure and long-distance correlation | |
CN114972024A (en) | Image super-resolution reconstruction device and method based on graph representation learning | |
CN113793267B (en) | Self-supervision single remote sensing image super-resolution method based on cross-dimension attention mechanism | |
Xia et al. | Meta-learning-based degradation representation for blind super-resolution | |
CN115760670B (en) | Unsupervised hyperspectral fusion method and device based on network implicit priori | |
CN117408924A (en) | Low-light image enhancement method based on multiple semantic feature fusion network | |
Liao et al. | TransRef: Multi-scale reference embedding transformer for reference-guided image inpainting | |
AU2021104479A4 (en) | Text recognition method and system based on decoupled attention mechanism | |
CN113240589A (en) | Image defogging method and system based on multi-scale feature fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |