CN112734643A - Lightweight image super-resolution reconstruction method based on cascade network - Google Patents

Lightweight image super-resolution reconstruction method based on cascade network

Info

Publication number
CN112734643A
CN112734643A CN202110052039.8A
Authority
CN
China
Prior art keywords
network
image
resolution
channels
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110052039.8A
Other languages
Chinese (zh)
Inventor
李浪
陶洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202110052039.8A priority Critical patent/CN112734643A/en
Publication of CN112734643A publication Critical patent/CN112734643A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention belongs to the technical field of image processing and relates to a lightweight image super-resolution reconstruction method based on a cascade network. The method mainly comprises the following steps. Step 1: acquire the high-resolution target images of a data set and perform downsampling preprocessing. Step 2: design a lightweight image super-resolution reconstruction network based on an attention mechanism and a cascade network. Step 3: design the training strategy and training parameters and optimize the model parameters. After training and optimization, the invention provides an image super-resolution reconstruction method with a small model size and high reconstruction quality.

Description

Lightweight image super-resolution reconstruction method based on cascade network
The technical field is as follows:
the invention belongs to the technical field of image processing, and mainly relates to a light-weight image super-resolution reconstruction method based on a cascade network.
Background art:
Generally, for an image, higher resolution means more and richer image detail. However, limited by factors such as the size and cost of imaging equipment, the images that can actually be captured often fail to meet practical requirements. Against this background, image super-resolution reconstruction, which reconstructs a high-resolution image from a low-resolution image, has attracted extensive attention in academia and is widely used in fields such as video surveillance and medicine. Therefore, researching super-resolution reconstruction algorithms that are widely applicable and deliver high reconstruction quality has important theoretical value and practical significance.
Most super-resolution reconstruction methods improve the reconstruction effect by continually increasing network depth. Although such algorithms often achieve better results, the networks are large and computationally expensive, which makes them difficult to deploy in practice. Standard convolution is generally used for feature extraction, and with the development of network models, techniques such as depthwise separable convolution and group convolution have achieved notable success in deep learning. Existing lightweight networks in the field of image super-resolution reconstruction mainly tackle the problem from a few angles: designing effective upsampling schemes and feature extraction modules, and reducing network depth. Although relatively good results have been obtained with relatively lightweight models, there is still room for further improvement. This invention designs a lightweight image super-resolution reconstruction method based on a cascade network: first, a lightweight multi-scale feature extraction module is designed; then, different features are weighted using an attention mechanism; finally, channel features of different levels are fused through a cascading mechanism. A super-resolution reconstruction network designed according to this scheme can meet real-time requirements while delivering quality superior to networks of comparable complexity.
Disclosure of Invention
The invention provides a lightweight image super-resolution reconstruction method based on a cascade network. Aiming at the problems that existing super-resolution reconstruction methods are large in size and difficult to apply in practice, it provides an unsupervised super-resolution reconstruction method and optimizes the reconstruction result through a pyramid generative adversarial network, so as to obtain a super-resolution reconstruction network that conforms to the image characteristics.
The technical scheme of the invention is as follows:
S1: image acquisition. The DIV2K data set is obtained and the PNG data are converted into the .npy data format.
S2: on the basis of S1, the .npy data are cropped into 128 × 128 high-resolution image patches HR, which are then downsampled to obtain the low-resolution images LR, forming high/low-resolution image pairs.
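As a concrete illustration of S1 and S2, the sketch below (Python with NumPy and Pillow; the bicubic kernel and the random-crop policy are assumptions, since the patent only specifies cropping to 128 × 128 and downsampling) builds one HR/LR training pair from a DIV2K PNG:

```python
import numpy as np
from PIL import Image

def make_pair(png_path, crop=128, r=4):
    """Convert a DIV2K PNG to .npy, take a 128x128 HR crop and downsample it by the scale factor r."""
    img = np.array(Image.open(png_path).convert("RGB"))
    np.save(png_path.replace(".png", ".npy"), img)                  # cache as .npy for faster reads
    y = np.random.randint(0, img.shape[0] - crop + 1)
    x = np.random.randint(0, img.shape[1] - crop + 1)
    hr = img[y:y + crop, x:x + crop]                                 # 128x128 high-resolution patch HR
    lr = Image.fromarray(hr).resize((crop // r, crop // r), Image.BICUBIC)
    return hr, np.array(lr)                                          # (HR, LR) image pair
```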
S3: on the basis of S2, a low-level feature extraction network built from convolution is used to extract features; the extracted features are denoted F_L:

F_L = Conv2d_3x3(LR, 3, 64)

where Conv2d_3x3 denotes a standard convolution with a 3 × 3 kernel, used to extract the shallow features of the network, and 3 and 64 denote the numbers of input and output channels, respectively.
S4: the feature F_L obtained in S3 is used as the input of the deep feature extraction network. Channel 1 performs feature extraction through the FE module shown in Fig. 1, and channel dimensionality reduction is finally performed using a standard convolution with kernel size 1:

F_FE1 = Conv_1x1(σ(GConv_4(DWConv_3x3(F_L, 3, 64), 64, 64)), 64, 32)

where DWConv_3x3 denotes a depthwise convolution with a 3 × 3 kernel, GConv_4 denotes a group convolution with 4 groups, σ denotes the ReLU activation function, and Conv_1x1 denotes a standard convolution with kernel size 1.
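A minimal PyTorch sketch of one FE pass plus the 1 × 1 reduction described above is given below; the kernel size of the group convolution is an assumption, since the patent only specifies the number of groups:

```python
import torch.nn as nn

class FEBlock(nn.Module):
    """Sketch of the FE module: depthwise 3x3 conv -> group conv (4 groups) -> ReLU -> 1x1 reduction."""
    def __init__(self, channels=64, out_channels=32, groups=4):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)  # DWConv3x3
        self.gc = nn.Conv2d(channels, channels, 1, groups=groups)               # GConv4 (1x1 kernel assumed)
        self.act = nn.ReLU(inplace=True)                                        # sigma
        self.reduce = nn.Conv2d(channels, out_channels, 1)                      # Conv1x1: 64 -> 32

    def forward(self, x):
        return self.reduce(self.act(self.gc(self.dw(x))))
```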
Channel 2 first performs feature extraction in the same way as channel 1, then performs channel shuffling, and then carries out a second round of feature extraction through an FE module. Channel dimensionality reduction is then performed with the same 1 × 1 convolution:

F_FE2 = Conv_1x1(σ(GConv_4(DWConv_3x3(Shuffle(σ(GConv_4(DWConv_3x3(F_L, 3, 64), 64, 64))), 3, 64), 64, 64)), 64, 32)

The outputs of channel 1 and channel 2 are then concatenated to form a 64-channel feature map and activated with ReLU, adding nonlinearity to the network. This is expressed as:

F_DF1 = σ(Concat(F_FE1, F_FE2))
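The channel shuffle and the combination of the two branches can be sketched as follows (illustrative only; the fe_* arguments stand for FE-style blocks such as the one above, with the first block of branch 2 keeping 64 output channels so that the shuffle and the second FE pass operate on the full width):

```python
import torch
import torch.nn.functional as F

def channel_shuffle(x, groups=4):
    """ShuffleNet-style channel rearrangement to mix information across convolution groups."""
    n, c, h, w = x.shape
    return x.view(n, groups, c // groups, h, w).transpose(1, 2).reshape(n, c, h, w)

def dual_branch(x, fe_1, fe_2a, fe_2b):
    """Branch 1: one FE pass (64 -> 32). Branch 2: FE (64 -> 64) -> shuffle -> FE (64 -> 32)."""
    f1 = fe_1(x)
    f2 = fe_2b(channel_shuffle(fe_2a(x)))
    return F.relu(torch.cat([f1, f2], dim=1))  # concatenate back to 64 channels and apply ReLU
```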
In the DGSA module, an enhanced channel attention network, as shown in Fig. 2, is introduced. First, the channel descriptor is computed. Let F ∈ R^(W×H×C), where W, H and C are the width, height and number of channels of the feature map; global average pooling produces a 1 × 1 × C feature map that is taken as the channel descriptor Z, and the descriptor of the c-th channel is expressed as:

z_c = (1 / (W × H)) Σ_{i=1..W} Σ_{j=1..H} F_c(i, j)

Next, the channel relationships are modelled. W_k is used to learn the channel attention, expressed as:

ω = σ(W_k Z)

where k denotes the number of adjacent channels involved in the computation, and W_k is expressed as:

W_k = [ω_1, ω_2, …, ω_C]^T

so that the attention weight of the i-th channel only involves its k neighbouring descriptors:

ω_i = σ( Σ_{j=1..k} w_i^j z_i^j ),  z_i^j ∈ Ω_i^k

where Ω_i^k denotes the set of k channels adjacent to z_i. Sharing the parameters across channels can further reduce the parameter count, so ω_i can be expressed as:

ω_i = σ( Σ_{j=1..k} w^j z_i^j ),  z_i^j ∈ Ω_i^k

Finally, information interaction between the channels is realized through a one-dimensional convolution with kernel size k, expressed as:

ω = σ(Conv1d_k(Z))

where k is adaptively selected according to the number of channels C:

C = φ(k) = 2^(2k - 1)
The learned channel attention is then applied to the input features by channel-wise weighting to obtain the final feature map:

F_out = ω * F
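A PyTorch sketch of this channel attention (following the ECA-style formulation above; the mapping from C to an odd kernel size k is an assumption consistent with C = 2^(2k - 1)):

```python
import math
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Global average pooling -> 1D conv of size k across channels -> sigmoid -> reweight the input."""
    def __init__(self, channels=64, gamma=2, b=1):
        super().__init__()
        k = int((math.log2(channels) + b) / gamma)  # from C = 2^(2k - 1): k = (log2(C) + 1) / 2
        k = k if k % 2 == 1 else k + 1              # keep the kernel size odd (assumption)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        z = self.pool(x)                                    # N x C x 1 x 1 channel descriptor Z
        w = self.conv(z.squeeze(-1).transpose(1, 2))        # 1D convolution over the channel dimension
        w = self.sigmoid(w).transpose(1, 2).unsqueeze(-1)   # N x C x 1 x 1 attention weights (omega)
        return x * w                                        # channel-wise reweighting of F
```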
the final DGSA module may be represented as shown in fig. 3. And simultaneously, cascading the features of different layers, and sending the features into a conventional convolution with a convolution kernel size of 1 to perform feature information interaction and channel dimension reduction.
Fcon=Conv1x1(Concat(F1,F2,...),inputchannel,64)
In the formula, inputchanel represents the sum of the channels of each input feature map.
S5: the feature F_con obtained after feature extraction in S4 is fed into the upsampling module shown in Fig. 4. The upsampling module first expands the number of channels:

F_up = Conv_3x3(F_con, 64, 64*r*r)

F_up_f = PS(F_up, 64*r*r, 64)

where r denotes the reconstruction (upscaling) factor and PS denotes the PixelShuffle operation. For 4× magnification, the above operation is performed twice with r = 2.
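A PixelShuffle-based sketch of this upsampling step (parameter names are illustrative):

```python
import torch.nn as nn

class UpsampleBlock(nn.Module):
    """3x3 conv expands 64 channels to 64*r*r, then PixelShuffle rearranges them into an r-times larger map."""
    def __init__(self, channels=64, r=2):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels * r * r, 3, padding=1)
        self.ps = nn.PixelShuffle(r)  # output has 64 channels again, spatial size scaled by r

    def forward(self, x):
        return self.ps(self.conv(x))

# For 4x reconstruction, two r = 2 blocks are stacked: nn.Sequential(UpsampleBlock(r=2), UpsampleBlock(r=2))
```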
S6: the L1 loss is used as the loss function for network training and the Adam optimizer as the network optimizer; the error between SR and HR is computed and back-propagated, the network is trained for 100 epochs, and the parameters are updated.
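A minimal training-loop sketch for S6 (the model, data loader and device names are illustrative assumptions, not part of the patent):

```python
import torch
import torch.nn as nn

def train(model, loader, epochs=100, lr=1e-3, device="cuda"):
    """Train the SR network with the L1 loss and the Adam optimizer, as described in S6."""
    model.to(device)
    criterion = nn.L1Loss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for lr_img, hr_img in loader:                  # loader yields (LR, HR) pairs
            sr = model(lr_img.to(device))
            loss = criterion(sr, hr_img.to(device))    # error between SR and HR
            optimizer.zero_grad()
            loss.backward()                            # back-propagation
            optimizer.step()                           # parameter update
```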
S7: after training is finished, the parameters of the model are fixed; the image to be reconstructed is used as the input of the network, the reconstruction scale is selected, and the output is the target super-resolution image.
Drawings
FIG. 1 FE module
FIG. 2 Attention mechanism network
FIG. 3 Network model
FIG. 4 Upsampling module
FIG. 5 Comparison of reconstruction results
The specific implementation process comprises the following steps:
In the lightweight image super-resolution reconstruction method based on a cascade network, whose network model is shown in Fig. 3, data preprocessing is performed first, then the super-resolution reconstruction network is constructed, then the training parameters and conditions are set to train the network and tune the network parameters, and finally super-resolution reconstruction is performed using the trained model.
The invention is further illustrated by the following example of an embodiment, which is intended only for a better understanding of the subject matter of the invention and is not intended to limit the scope of the invention. The method comprises the following specific steps:
step S1: the example adopts a Set5, a Set14, a BSD100 and an Urban100 four-image super-resolution reconstruction data Set, the Set5 comprises 5 pairs of high-low resolution image pairs, the Set14 extends to 14 sheets, and the BSD100 and the Urban100 respectively comprise 100 sheets of high-low resolution image pairs with various features. The example selects quadrupling as the reference standard. And carrying out downsampling on the low-resolution image to obtain an ultra-low-resolution image pair.
Step S2: the present embodiment takes the low resolution image directly as the input to the network.
Step S3: confirm the training environment and strategy. The processor of the experimental platform is an Intel Core i9-9900K, the graphics card is an RTX 2080Ti, training is performed in the PyTorch environment, and CUDA 10.0 and cuDNN 7.1 are used to accelerate model training.
The network is trained for 100 epochs at each scale; the learning rate of the generator and the discriminator is set to 0.001 and is reduced to 10% of its value every 30 epochs, and the scale of the feature map is kept unchanged during feature extraction.
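The decay schedule described above can be written, for example, with PyTorch's StepLR (a sketch; model and train_one_epoch are hypothetical placeholders):

```python
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)                        # initial learning rate 0.001
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)  # x0.1 every 30 epochs
for epoch in range(100):
    train_one_epoch(model, optimizer)  # hypothetical helper: one pass over the training pairs
    scheduler.step()
```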
Step S4: this example selects the low-resolution images in Set5, Set14 and BSD100 as the reconstruction objects. The low-resolution image to be reconstructed is taken as input and fed directly into the network for computation.
This example compares the proposed algorithm with existing unsupervised image super-resolution reconstruction algorithms on the 2×, 3× and 4× reconstruction metrics, mainly comparing peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). Taking 4× reconstruction as an example, the experimental results are shown in Fig. 5 and Table 1.
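For reference, the two metrics can be computed as follows (a sketch assuming 8-bit H × W × C NumPy arrays; the SSIM call uses scikit-image, whose exact signature depends on the installed version):

```python
import numpy as np
from skimage.metrics import structural_similarity

def psnr(sr, hr, max_val=255.0):
    """Peak signal-to-noise ratio between the reconstruction SR and the ground truth HR."""
    mse = np.mean((sr.astype(np.float64) - hr.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def ssim(sr, hr):
    """Structural similarity (channel_axis=-1 assumes color images, scikit-image >= 0.19)."""
    return structural_similarity(hr, sr, channel_axis=-1, data_range=255)
```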
As can be seen from the figure and table, the proposed method is superior to the comparison algorithms on both metrics, and the quality of the reconstructed images is improved to a certain extent.
TABLE 1 Comparison of reconstruction metrics

Claims (8)

1. A lightweight image super-resolution reconstruction method based on a cascade network, characterized by comprising the following steps:
S1: the DIV2K data set is obtained and the PNG data are converted into the .npy data format.
S2: on the basis of S1, the .npy data are cropped into 128 × 128 high-resolution image patches HR, which are then downsampled to obtain the low-resolution images LR, forming high/low-resolution image pairs.
S3: on the basis of S2, a low-level feature extraction network built from convolution is used to extract features; the extracted features are denoted F_L.
S4: on the basis of S3, a deep feature extraction module DGSA is designed, comprising a lightweight convolution part and an attention mechanism part; 4 DGSA modules are cascaded to form the deep feature extraction network. The low-level feature extracted in S3 is used as the input of the deep feature extraction network to extract the depth feature F_H.
S5: on the basis of S4, the extracted depth feature F_H is fed into the upsampling network, which enlarges the feature map to the target reconstruction size; the number of channels of the target image is then restored through a reconstruction module, and the target SR is output.
S6: the error between the HR obtained in S1 and the SR obtained in S5 is calculated using the L1 loss function, and the gradient is back-propagated to update the parameters of the feature extraction layers, the upsampling layer and the reconstruction layer.
S7: S2-S6 are iterated multiple times, continuously updating the network parameters; after the set amount of training is reached, the model is fixed, and the image to be reconstructed is used as the input of the model to obtain the target super-resolution image SR_T.
2. A lightweight image super-resolution reconstruction method based on a cascade network is characterized in that in step S1:
The originally obtained PNG image files are converted into .npy data, which accelerates file reading.
3. A lightweight image super-resolution reconstruction method based on a cascade network is characterized in that in step S2:
The resulting .npy files are cropped into patches of 128 × 128 resolution and downsampled according to the different reconstruction scales.
4. A lightweight image super-resolution reconstruction method based on a cascade network is characterized in that in step S3:
A low-level feature extraction network is designed, consisting of a convolution with a kernel size of 3, 3 input channels and 64 output channels; the size of the feature map is unchanged.
5. A lightweight image super-resolution reconstruction method based on a cascade network is characterized in that in step S4:
A deep feature extraction module DGSA (Depthwise-Group ShuffleNet with Attention) is designed. The DGSA consists of two channels: channel 1 directly outputs the features extracted by an FE module; channel 2 first performs a first round of feature extraction through an FE module, then performs feature channel shuffling, changing the channel positions to enhance information interaction between channels. The FE module consists of a depthwise convolution over the 64 channels and a group convolution with 4 groups, activated with the ReLU function. After the features of the two channels are obtained, a convolution with a 1 × 1 kernel performs a residual connection of the two channels' information and fuses the channel information, finally outputting the feature F_H1 = DGSA(F_L). The low-level features and the depth features are then fused: the channels are concatenated, fused with a standard 1 × 1 convolution, and F_H is finally output.
6. A lightweight image super-resolution reconstruction method based on a cascade network is characterized in that in step S5:
Channel expansion is performed by a convolution with a 3 × 3 kernel, and PixelShuffle is then applied to enlarge the feature map while keeping the number of channels at 64. If the magnification factor is 2 or 3, the enlargement is performed only once; if the magnification factor is 4, the above operation is performed twice, doubling the size each time.
7. A lightweight image super-resolution reconstruction method based on a cascade network is characterized in that in step S6:
The L1 loss is used as the loss function for network training and the Adam optimizer as the network optimizer; the error between SR and HR is computed and back-propagated, the network is trained for 100 epochs, and the parameters are updated.
8. A lightweight image super-resolution reconstruction method based on a cascade network is characterized in that in step S7:
After training is finished, the parameters of the model are fixed; the image to be reconstructed is used as the input of the network, and the output is the target super-resolution image.
CN202110052039.8A 2021-01-15 2021-01-15 Lightweight image super-resolution reconstruction method based on cascade network Pending CN112734643A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110052039.8A CN112734643A (en) 2021-01-15 2021-01-15 Lightweight image super-resolution reconstruction method based on cascade network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110052039.8A CN112734643A (en) 2021-01-15 2021-01-15 Lightweight image super-resolution reconstruction method based on cascade network

Publications (1)

Publication Number Publication Date
CN112734643A true CN112734643A (en) 2021-04-30

Family

ID=75593193

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110052039.8A Pending CN112734643A (en) 2021-01-15 2021-01-15 Lightweight image super-resolution reconstruction method based on cascade network

Country Status (1)

Country Link
CN (1) CN112734643A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966661A (en) * 2021-03-31 2021-06-15 东南大学 Construction method of face feature extraction network based on sparse feature reuse
CN113409191A (en) * 2021-06-02 2021-09-17 广东工业大学 Lightweight image super-resolution method and system based on attention feedback mechanism
CN114494022A (en) * 2022-03-31 2022-05-13 苏州浪潮智能科技有限公司 Model training method, super-resolution reconstruction method, device, equipment and medium
CN115601242A (en) * 2022-12-13 2023-01-13 电子科技大学(Cn) Lightweight image super-resolution reconstruction method suitable for hardware deployment

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110276721A (en) * 2019-04-28 2019-09-24 天津大学 Image super-resolution rebuilding method based on cascade residual error convolutional neural networks
US20200065940A1 (en) * 2018-08-23 2020-02-27 General Electric Company Patient-specific deep learning image denoising methods and systems
CN110909783A (en) * 2019-11-15 2020-03-24 重庆邮电大学 Blind domain image classification and reconstruction method based on enhanced reconstruction classification network
CN111192200A (en) * 2020-01-02 2020-05-22 南京邮电大学 Image super-resolution reconstruction method based on fusion attention mechanism residual error network
CN111489291A (en) * 2020-03-04 2020-08-04 浙江工业大学 Medical image super-resolution reconstruction method based on network cascade
CN111583109A (en) * 2020-04-23 2020-08-25 华南理工大学 Image super-resolution method based on generation countermeasure network
CN111754403A (en) * 2020-06-15 2020-10-09 南京邮电大学 Image super-resolution reconstruction method based on residual learning
CN111784582A (en) * 2020-07-08 2020-10-16 桂林电子科技大学 DEC-SE-based low-illumination image super-resolution reconstruction method
CN112203098A (en) * 2020-09-22 2021-01-08 广东启迪图卫科技股份有限公司 Mobile terminal image compression method based on edge feature fusion and super-resolution

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200065940A1 (en) * 2018-08-23 2020-02-27 General Electric Company Patient-specific deep learning image denoising methods and systems
CN110276721A (en) * 2019-04-28 2019-09-24 天津大学 Image super-resolution rebuilding method based on cascade residual error convolutional neural networks
CN110909783A (en) * 2019-11-15 2020-03-24 重庆邮电大学 Blind domain image classification and reconstruction method based on enhanced reconstruction classification network
CN111192200A (en) * 2020-01-02 2020-05-22 南京邮电大学 Image super-resolution reconstruction method based on fusion attention mechanism residual error network
CN111489291A (en) * 2020-03-04 2020-08-04 浙江工业大学 Medical image super-resolution reconstruction method based on network cascade
CN111583109A (en) * 2020-04-23 2020-08-25 华南理工大学 Image super-resolution method based on generation countermeasure network
CN111754403A (en) * 2020-06-15 2020-10-09 南京邮电大学 Image super-resolution reconstruction method based on residual learning
CN111784582A (en) * 2020-07-08 2020-10-16 桂林电子科技大学 DEC-SE-based low-illumination image super-resolution reconstruction method
CN112203098A (en) * 2020-09-22 2021-01-08 广东启迪图卫科技股份有限公司 Mobile terminal image compression method based on edge feature fusion and super-resolution

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIAO XIE et al.: "Training convolutional neural networks with cheap convolutions and online distillation", arXiv preprint *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966661A (en) * 2021-03-31 2021-06-15 东南大学 Construction method of face feature extraction network based on sparse feature reuse
CN113409191A (en) * 2021-06-02 2021-09-17 广东工业大学 Lightweight image super-resolution method and system based on attention feedback mechanism
CN113409191B (en) * 2021-06-02 2023-04-07 广东工业大学 Lightweight image super-resolution method and system based on attention feedback mechanism
CN114494022A (en) * 2022-03-31 2022-05-13 苏州浪潮智能科技有限公司 Model training method, super-resolution reconstruction method, device, equipment and medium
CN114494022B (en) * 2022-03-31 2022-07-29 苏州浪潮智能科技有限公司 Model training method, super-resolution reconstruction method, device, equipment and medium
WO2023184913A1 (en) * 2022-03-31 2023-10-05 苏州浪潮智能科技有限公司 Model training method and apparatus, super-resolution reconstruction method, device and medium
CN115601242A (en) * 2022-12-13 2023-01-13 电子科技大学(Cn) Lightweight image super-resolution reconstruction method suitable for hardware deployment
CN115601242B (en) * 2022-12-13 2023-04-18 电子科技大学 Lightweight image super-resolution reconstruction method suitable for hardware deployment

Similar Documents

Publication Publication Date Title
CN112734643A (en) Lightweight image super-resolution reconstruction method based on cascade network
CN110020989B (en) Depth image super-resolution reconstruction method based on deep learning
CN112750082B (en) Human face super-resolution method and system based on fusion attention mechanism
WO2021018163A1 (en) Neural network search method and apparatus
Wang et al. A review of image super-resolution approaches based on deep learning and applications in remote sensing
CN109064396A (en) A kind of single image super resolution ratio reconstruction method based on depth ingredient learning network
CN114972746B (en) Medical image segmentation method based on multi-resolution overlapping attention mechanism
CN110136067B (en) Real-time image generation method for super-resolution B-mode ultrasound image
Zheng et al. S-Net: a scalable convolutional neural network for JPEG compression artifact reduction
Gendy et al. Lightweight image super-resolution based on deep learning: State-of-the-art and future directions
CN113920043A (en) Double-current remote sensing image fusion method based on residual channel attention mechanism
Zhang et al. Deformable and residual convolutional network for image super-resolution
CN112365405A (en) Unsupervised super-resolution reconstruction method based on generation countermeasure network
CN116486074A (en) Medical image segmentation method based on local and global context information coding
CN114841859A (en) Single-image super-resolution reconstruction method based on lightweight neural network and Transformer
CN116168197A (en) Image segmentation method based on Transformer segmentation network and regularization training
Gendy et al. Balanced spatial feature distillation and pyramid attention network for lightweight image super-resolution
CN113627487B (en) Super-resolution reconstruction method based on deep attention mechanism
CN114359039A (en) Knowledge distillation-based image super-resolution method
CN110751271A (en) Image traceability feature characterization method based on deep neural network
CN116188272B (en) Two-stage depth network image super-resolution reconstruction method suitable for multiple fuzzy cores
CN113096015A (en) Image super-resolution reconstruction method based on progressive sensing and ultra-lightweight network
CN116597146A (en) Semantic segmentation method for laser radar sparse point cloud data
CN116128722A (en) Image super-resolution reconstruction method and system based on frequency domain-texture feature fusion
CN116485654A (en) Lightweight single-image super-resolution reconstruction method combining convolutional neural network and transducer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210430