CN113793267B - Self-supervision single remote sensing image super-resolution method based on cross-dimension attention mechanism - Google Patents
- Publication number: CN113793267B
- Application number: CN202111102985.5A
- Authority: CN (China)
- Prior art keywords: image, convolution, cross, attention mechanism, resolution
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T3/4053: Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06T3/4046: Scaling of whole images or parts thereof using neural networks
- G06N3/045: Neural network architectures; combinations of networks
- G06N3/048: Neural network architectures; activation functions
- G06N3/088: Learning methods; non-supervised learning, e.g. competitive learning
Abstract
The invention discloses a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism, belonging to the technical field of pattern recognition. First, a cross-dimension attention mechanism guiding network is provided: a self-supervised super-resolution method that exploits the recurrence of information within a single image to avoid the dependence of model performance on a large-scale training data set. Second, a cross-dimension attention mechanism module is provided that models the interdependence between the channel and spatial features of the picture features, takes the interaction between the channel dimension and the spatial dimension into account, learns the feature weights of channel and space, and selectively captures more informative features, further improving the learning capacity of the static convolutional neural network.
Description
Technical Field
The invention relates to the technical field of pattern recognition, in particular to a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism.
Background
Image super-resolution refers to recovering a high-resolution image from a low-resolution image or image sequence. Image super-resolution technology is divided into super-resolution restoration and super-resolution reconstruction. At present, the main image super-resolution methods are: (1) interpolation-based image super-resolution methods; (2) reconstruction-based image super-resolution methods; (3) learning-based image super-resolution methods.
Over the past few years, deep-learning-based remote sensing image super-resolution methods have in some cases effectively overcome the physical resolution limitations of remote sensing imaging sensors; an indispensable factor in the success of such methods is a large number of specific data sets. However, when a remote sensing image is actually acquired, various complex factors cause the degradation mode of the real remote sensing image to differ greatly from that of the specific data set, so the performance of the trained model drops sharply in practical applications.
Disclosure of Invention
In order to solve the dependence of existing image super-resolution methods on a large-scale training data set, an embodiment of the invention provides a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism. A cross-dimension attention mechanism is provided to guide the network, and the recurrence of information within a single image is exploited to avoid the dependence of model performance on a large-scale training data set. The input image is downsampled to form a self-training pair, the self-similarity and the degradation process inside the image are learned during training, and image super-resolution reconstruction is then performed on the input low-resolution image. The technical scheme is as follows:
the invention provides a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism, which comprises the following steps:
downsampling an input image;
extracting convolution characteristics of the image;
calculating the weights of the channel features and the spatial features, wherein the weight matrix over the different channels of the image convolution feature is T_c:

T_c = Sigmoid(f_{1×1}(ReLU(f_{1×1}(Avg(F)))))

where F ∈ R^{C×H×W} is the image convolution feature, Avg is global average pooling, and f_{1×1} is a convolution operation with a 1×1 kernel;

the weight matrix over the different spatial positions of the image convolution feature is T_s:

T_s = Sigmoid(f_{1×1}(F))

where F ∈ R^{C×H×W} is the image convolution feature and f_{1×1} is a convolution operation with a 1×1 kernel;

calculating the channel-space feature weight T ∈ R^{C×H×W} from the channel weight and the spatial weight:

T = Sigmoid(f_{1×1}(T_c × T_s))

where × is the matrix multiplication fusing T_c ∈ R^{C×1×1} and T_s ∈ R^{1×H×W}, and f_{1×1} is a convolution operation with a 1×1 kernel;

computing the output image feature F′ of the cross-dimension attention mechanism:

F′ = T ⊙ F

where ⊙ is element-wise multiplication;

optimizing the training process using the least absolute deviation as the loss function, wherein the least absolute deviation L_1 is:

L_1(θ) = || CDAN_θ(LR↓_s) − LR ||_1

where θ is the parameter set of the cross-dimension attention mechanism network (CDAN), LR is the input low-resolution image, and LR↓_s is the image obtained by downsampling LR by a factor of s.
In the above self-supervision single remote sensing image super-resolution method based on the cross-dimension attention mechanism, optionally, the downsampling of the input image is specifically: using a low-resolution remote sensing image as input, a downsampling operation by a factor of s is performed to obtain a lower-resolution image corresponding to the input image, 1/s of the input size in each spatial dimension, so that a matching image pair for the input image is constructed.
In the above self-supervision single remote sensing image super-resolution method based on the cross-dimension attention mechanism, optionally, the extraction of the convolution features of the image is specifically: obtaining the image convolution features through a ReLU layer and a convolution layer.
The technical scheme provided by the embodiment of the invention has the beneficial effects that:
the embodiment of the invention provides a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism, and provides a novel image super-resolution convolutional neural network, namely a cross-dimension attention mechanism guiding network, which is a self-supervision super-resolution method, and the reproducibility of information in a single image is utilized to avoid the dependence of model performance on a large-scale training data set. Secondly, a cross-dimension attention mechanism module is also provided, through modeling the interdependence relationship between the channel and the space feature of the picture feature, the interaction between the channel dimension and the space dimension is considered, the feature weight of the channel and the space is obtained through learning, more information features are selectively captured, and the learning capacity of the static convolution neural network is further improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism provided by the embodiment of the invention;
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
The following will describe in detail a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism according to an embodiment of the present invention with reference to fig. 1.
Referring to fig. 1, a self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism in an embodiment of the invention includes:
step 110: downsampling an input image;
in theory, the relationship between the low resolution image and the high resolution image is expressed as:
I LR =(I HR *k)↓ s +n
wherein I is LR Representing low resolution images, I HR Representing a high resolution image, representing a convolution operation, k representing a blur kernel, +. s Representing s times the downsampling and n representing noise, such as speckle noise, acoustic noise.
Since the actual remote sensing image is acquired, the high resolution image I HR Unknown, fuzzy kernel k and noise n are uncertain and there is no reliable pair-wise data set to train the network. Thus, using a low resolution remote sensing image as input, a downsampling operation is performed by a factor of s to obtain a lower resolution image corresponding to the input image, which is of the sizeA matching image pair for the input image is constructed. Use->A low resolution image of size is used as input to the training process.
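As a hedged illustration of the degradation model and of the self-training pair, the following NumPy sketch (the function names, the box blur kernel, and the noise level are assumptions of this sketch, not the patent's implementation) blurs, decimates, and perturbs a low-resolution input to form its lower-resolution counterpart:

```python
import numpy as np

def degrade(hr, kernel, s, noise_std=0.01, seed=0):
    """Toy version of I_LR = (I_HR * k)↓_s + n for a single-channel image.

    `kernel` is a small blur kernel; ↓_s is modelled by keeping every
    s-th pixel. All parameter values here are illustrative assumptions."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(hr, ((ph, ph), (pw, pw)), mode="edge")
    blurred = np.zeros_like(hr)
    for i in range(hr.shape[0]):          # I_HR * k: spatial convolution
        for j in range(hr.shape[1]):
            blurred[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    down = blurred[::s, ::s]              # ↓_s: decimate by factor s
    rng = np.random.default_rng(seed)
    return down + noise_std * rng.standard_normal(down.shape)  # + n

# Self-training pair: the input LR image plays the role of "HR";
# its s-times downsampled copy plays the role of "LR".
lr = np.random.default_rng(1).random((64, 64))
box = np.full((3, 3), 1.0 / 9.0)          # simple box blur kernel (assumed)
lower = degrade(lr, box, s=2)
print(lower.shape)  # (32, 32)
```

The pair (lower, lr) then serves as the (input, target) training pair, exactly in the spirit of the self-supervised scheme above.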
Step 120: extracting convolution characteristics of the image;
It should be noted that the image convolution features of the training input are extracted using a convolutional neural network. Specifically, the image convolution feature F ∈ R^{C×H×W} is obtained through a ReLU layer and a convolution layer, where C is the number of channels and H and W are the height and width of the image feature. The image convolution feature F ∈ R^{C×H×W} serves as the input of the cross-dimension attention mechanism module (CDAM).
Step 130: calculating the weights of the channel features and the spatial features, wherein the weight matrix over the different channels of the image convolution feature is T_c:

T_c = Sigmoid(f_{1×1}(ReLU(f_{1×1}(Avg(F)))))

where F ∈ R^{C×H×W} represents the image convolution feature, Avg represents global average pooling, f_{1×1} represents a convolution operation with a 1×1 kernel, and Sigmoid represents the activation function;

the weight matrix over the different spatial positions of the image convolution feature is T_s:

T_s = Sigmoid(f_{1×1}(F))

where F ∈ R^{C×H×W} represents the image convolution feature and f_{1×1} represents a convolution operation with a 1×1 kernel;
It should be noted that the convolution feature obtained in step 120 is used as the input of the cross-dimension attention mechanism module (CDAM). Specifically, the CDAM is divided into two branches. The first branch passes the image convolution feature F ∈ R^{C×H×W} through a global average pooling layer, so that the global spatial information of F is compressed into the channels of the picture convolution feature, and then obtains the weight matrix T_c ∈ R^{C×1×1} of the different channels through a convolution layer, a ReLU layer, and a Sigmoid activation function. The second branch passes the picture convolution feature F ∈ R^{C×H×W} through a convolution layer and a Sigmoid activation function to obtain the weight matrix T_s ∈ R^{1×H×W} carrying the different spatial information. The channel attention feature and the spatial attention feature are derived by the modeling process above.
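As a hedged illustration, the two branches described above can be sketched with NumPy; the layer weights below are random placeholders rather than trained parameters, and modelling the 1×1 convolutions as small matrices is an assumption of this sketch:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
F = rng.random((C, H, W))                 # image convolution feature

# Branch 1 (channel weights T_c): global average pooling, then two 1x1
# convolutions (modelled as C x C matrices with placeholder weights),
# with ReLU between them and Sigmoid after.
w1 = rng.standard_normal((C, C)) * 0.1
w2 = rng.standard_normal((C, C)) * 0.1
avg = F.mean(axis=(1, 2))                 # Avg(F): shape (C,)
Tc = sigmoid(w2 @ np.maximum(w1 @ avg, 0.0)).reshape(C, 1, 1)

# Branch 2 (spatial weights T_s): a 1x1 convolution collapsing the C
# channels to 1, then Sigmoid, giving one weight per spatial position.
w3 = rng.standard_normal(C) * 0.1
Ts = sigmoid(np.tensordot(w3, F, axes=(0, 0))).reshape(1, H, W)

print(Tc.shape, Ts.shape)  # (8, 1, 1) (1, 16, 16)
```

The shapes R^{C×1×1} and R^{1×H×W} match the weight matrices named in the text, and the Sigmoid keeps every weight in (0, 1).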
Step 140: calculating the channel-space feature weight T ∈ R^{C×H×W} from the channel weight and the spatial weight:

T = Sigmoid(f_{1×1}(T_c × T_s))

where f_{1×1} represents a convolution operation with a 1×1 kernel.

It should be noted that the weighted channel features and spatial features are fused by matrix multiplication, after which the cross-dimension channel-space feature weight T ∈ R^{C×H×W} is obtained through a convolution layer and a Sigmoid activation function.
Step 150: computing the output image feature F′ of the cross-dimension attention mechanism:

F′ = T ⊙ F

It should be noted that the channel-space feature weight T of the image is fused with the convolution feature F of the input network by element-wise multiplication to obtain the output image feature F′ of the cross-dimension attention mechanism module, which serves as the input of the subsequent network.
The whole cross-dimension attention mechanism module jointly learns the channel and spatial feature information of the input image features to obtain their interdependence, establishes a cross-dimension attention model of channel and spatial feature information, and effectively obtains the attention weights of the channel and spatial feature information over the whole image feature. The convolutional neural network exploits the recurrence of cross-scale information within the image and is not limited by patch-based methods.
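As a hedged illustration, the fusion of steps 140-150 can be sketched with NumPy; the 1×1 convolution is modelled as a C×C channel-mixing matrix with random placeholder weights, and the stand-in values for T_c and T_s are assumptions of this sketch:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
F = rng.random((C, H, W))                  # input convolution feature
Tc = rng.random((C, 1, 1))                 # channel weights (stand-in values)
Ts = rng.random((1, H, W))                 # spatial weights (stand-in values)

# Fuse the two weights: the (C,1,1) x (1,H,W) matrix multiplication
# broadcasts to a (C,H,W) outer product; a 1x1 convolution (placeholder
# weights) mixes channels, and Sigmoid yields T in R^{C x H x W}.
fused = Tc * Ts                            # shape (C, H, W)
w = rng.standard_normal((C, C)) * 0.1
T = sigmoid(np.einsum("oc,chw->ohw", w, fused))

F_out = T * F                              # element-wise: output feature F'
print(F_out.shape)  # (8, 16, 16)
```

Since every entry of T lies in (0, 1), the module acts as a learned per-channel, per-position gating of the input feature F.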
Step 160: optimizing the training process using the least absolute deviation as the loss function, wherein the least absolute deviation L_1 is:

L_1(θ) = || CDAN_θ(LR↓_s) − LR ||_1

where θ represents the parameter set of the cross-dimension attention mechanism network (CDAN), LR represents the input low-resolution image, and LR↓_s represents the image obtained by downsampling LR by a factor of s.

It should be noted that, in the training process, step 110 reduces the input image by the super-resolution factor s, and the reduced picture is then super-resolved through the cross-dimension attention mechanism network (CDAN) into a picture of the same size as LR. The CDAN contains six cross-dimension attention mechanism modules (CDAM) and as a whole uses the least absolute deviation L_1 as the loss function, so that the error between the true value I_LR and the predicted value I′_LR is minimized; when the learning rate reaches 10^-6, the entire training process ends.
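As a hedged illustration of the loss, the least absolute deviation can be sketched as follows (using the mean rather than the sum of absolute errors is an assumption; the patent only names the least absolute deviation):

```python
import numpy as np

def l1_loss(pred, target):
    """Least absolute deviation: mean |pred - target| over all pixels.
    The mean-vs-sum reduction is an assumption of this sketch."""
    return np.mean(np.abs(pred - target))

# The "true value" I_LR is the network input; the "predicted value"
# I'_LR is the network's super-resolved output (simulated here).
lr = np.random.default_rng(0).random((64, 64))
pred = lr + 0.05 * np.random.default_rng(1).standard_normal(lr.shape)
loss = l1_loss(pred, lr)
print(loss >= 0.0)  # True
```

During training this value is minimized over θ; it is zero exactly when the prediction reproduces the input image.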
This method does not require a large number of extra paired data sets to train the network: the downsampled version of the input image is regarded as LR, the picture obtained from it by super-resolution through the cross-dimension attention mechanism network is SR, and the input picture itself is regarded as HR, with the least absolute deviation L_1 used as the loss function of the network. On this basis, self-supervised super-resolution can be achieved. Particularly for remote sensing images with highly repetitive structure, through six cross-dimension attention mechanism modules (CDAM), the cross-dimension attention mechanism network (CDAN) learns the self-similarity and the degradation process inside the image through training, yielding a training model specific to the input image.
After training, the trained super-resolution model is tested: a low-resolution image I_LR is input and super-resolved to I_SR, whose size is s times that of I_LR.
The foregoing description of the preferred embodiments is not intended to limit the invention; any modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.
Claims (3)
1. A self-supervision single remote sensing image super-resolution method based on a cross-dimension attention mechanism is characterized by comprising the following steps:
downsampling an input image;
extracting convolution characteristics of the image;
calculating the weights of the channel features and the spatial features, wherein the weight matrix over the different channels of the image convolution feature is T_c:

T_c = Sigmoid(f_{1×1}(ReLU(f_{1×1}(Avg(F)))))

where F ∈ R^{C×H×W} is the image convolution feature, Avg is global average pooling, and f_{1×1} is a convolution operation with a 1×1 kernel;

the weight matrix over the different spatial positions of the image convolution feature is T_s:

T_s = Sigmoid(f_{1×1}(F))

where F ∈ R^{C×H×W} is the image convolution feature and f_{1×1} is a convolution operation with a 1×1 kernel;

calculating the channel-space feature weight T ∈ R^{C×H×W} from the channel weight and the spatial weight:

T = Sigmoid(f_{1×1}(T_c × T_s))

where × is the matrix multiplication fusing T_c ∈ R^{C×1×1} and T_s ∈ R^{1×H×W}, and f_{1×1} is a convolution operation with a 1×1 kernel;

computing the output image feature F′ of the cross-dimension attention mechanism:

F′ = T ⊙ F

where ⊙ is element-wise multiplication;

optimizing the training process using the least absolute deviation as the loss function, wherein the least absolute deviation L_1 is:

L_1(θ) = || CDAN_θ(LR↓_s) − LR ||_1

where θ is the parameter set of the cross-dimension attention mechanism network (CDAN), LR is the input low-resolution image, and LR↓_s is the image obtained by downsampling LR by a factor of s.
2. The image super-resolution method according to claim 1, wherein the downsampling of the input image is specifically: using a low-resolution remote sensing image as input, a downsampling operation by a factor of s is performed to obtain a lower-resolution image corresponding to the input image, 1/s of the input size in each spatial dimension, so that a matching image pair for the input image is constructed.
3. The image super-resolution method according to claim 1 or 2, wherein the extraction of the convolution features of the image is specifically: obtaining the image convolution features through a ReLU layer and a convolution layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111102985.5A CN113793267B (en) | 2021-09-18 | 2021-09-18 | Self-supervision single remote sensing image super-resolution method based on cross-dimension attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113793267A CN113793267A (en) | 2021-12-14 |
CN113793267B true CN113793267B (en) | 2023-08-25 |
Family
ID=79183960
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111102985.5A Active CN113793267B (en) | 2021-09-18 | 2021-09-18 | Self-supervision single remote sensing image super-resolution method based on cross-dimension attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113793267B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114549316A (en) * | 2022-02-18 | 2022-05-27 | 中国石油大学(华东) | Remote sensing single image super-resolution method based on channel self-attention multi-scale feature learning |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110992270A (en) * | 2019-12-19 | 2020-04-10 | 西南石油大学 | Multi-scale residual attention network image super-resolution reconstruction method based on attention |
CN112419155A (en) * | 2020-11-26 | 2021-02-26 | 武汉大学 | Super-resolution reconstruction method for fully-polarized synthetic aperture radar image |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018053340A1 (en) * | 2016-09-15 | 2018-03-22 | Twitter, Inc. | Super resolution using a generative adversarial network |
US11756160B2 (en) * | 2018-07-27 | 2023-09-12 | Washington University | ML-based methods for pseudo-CT and HR MR image estimation |
- 2021-09-18: CN application CN202111102985.5A, patent CN113793267B (en), status Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110992270A (en) * | 2019-12-19 | 2020-04-10 | 西南石油大学 | Multi-scale residual attention network image super-resolution reconstruction method based on attention |
CN112419155A (en) * | 2020-11-26 | 2021-02-26 | 武汉大学 | Super-resolution reconstruction method for fully-polarized synthetic aperture radar image |
Non-Patent Citations (1)
Title |
---|
Image super-resolution based on a channel-attention-mechanism convolutional neural network; Duanmu Chunjiang; Yao Songlin; Computer Era (04); full text *
Also Published As
Publication number | Publication date |
---|---|
CN113793267A (en) | 2021-12-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109389556B (en) | Multi-scale cavity convolutional neural network super-resolution reconstruction method and device | |
CN110136062B (en) | Super-resolution reconstruction method combining semantic segmentation | |
WO2018120329A1 (en) | Single-frame super-resolution reconstruction method and device based on sparse domain reconstruction | |
CN109087273B (en) | Image restoration method, storage medium and system based on enhanced neural network | |
CN109087258B (en) | Deep learning-based image rain removing method and device | |
CN111079532A (en) | Video content description method based on text self-encoder | |
CN113177882B (en) | Single-frame image super-resolution processing method based on diffusion model | |
CN110363068B (en) | High-resolution pedestrian image generation method based on multiscale circulation generation type countermeasure network | |
CN109636721B (en) | Video super-resolution method based on countermeasure learning and attention mechanism | |
CN111259904B (en) | Semantic image segmentation method and system based on deep learning and clustering | |
CN112699844B (en) | Image super-resolution method based on multi-scale residual hierarchy close-coupled network | |
CN112102163B (en) | Continuous multi-frame image super-resolution reconstruction method based on multi-scale motion compensation framework and recursive learning | |
CN111402138A (en) | Image super-resolution reconstruction method of supervised convolutional neural network based on multi-scale feature extraction fusion | |
CN114548265B (en) | Crop leaf disease image generation model training method, crop leaf disease identification method, electronic equipment and storage medium | |
CN111861886B (en) | Image super-resolution reconstruction method based on multi-scale feedback network | |
CN114170088A (en) | Relational reinforcement learning system and method based on graph structure data | |
CN112819705A (en) | Real image denoising method based on mesh structure and long-distance correlation | |
CN114972024A (en) | Image super-resolution reconstruction device and method based on graph representation learning | |
CN113793267B (en) | Self-supervision single remote sensing image super-resolution method based on cross-dimension attention mechanism | |
Xia et al. | Meta-learning-based degradation representation for blind super-resolution | |
CN115760670B (en) | Unsupervised hyperspectral fusion method and device based on network implicit priori | |
CN117408924A (en) | Low-light image enhancement method based on multiple semantic feature fusion network | |
Liao et al. | TransRef: Multi-scale reference embedding transformer for reference-guided image inpainting | |
AU2021104479A4 (en) | Text recognition method and system based on decoupled attention mechanism | |
CN113240589A (en) | Image defogging method and system based on multi-scale feature fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |