CN113506215B

CN113506215B - Super-resolution image reconstruction method and device based on wide activation and electronic equipment

Info

Publication number: CN113506215B
Application number: CN202110691006.8A
Authority: CN
Inventors: 张艳红; 侯芸; 董元帅; 周晶; 钱振宇; 田佳磊; 仝鑫隆
Original assignee: Checsc Highway Maintenance And Test Technology Co ltd; China Highway Engineering Consultants Corp; CHECC Data Co Ltd
Current assignee: Checsc Highway Maintenance And Test Technology Co ltd; China Highway Engineering Consultants Corp; CHECC Data Co Ltd
Priority date: 2021-06-22
Filing date: 2021-06-22
Publication date: 2023-07-04
Anticipated expiration: 2041-06-22
Also published as: CN113506215A

Abstract

The invention provides a super-resolution image reconstruction method and device based on wide activation, electronic equipment and a storage medium. The super-resolution image reconstruction method based on wide activation comprises the following steps: acquiring a low-resolution LR image to be reconstructed; inputting a low-resolution LR image to be reconstructed into a trained adaptive resampling super-resolution image reconstruction model based on a wide activation residual structure to obtain a high-resolution HR image output by the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure; the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure is used for reconstructing the LR image into a corresponding HR image based at least on the interpolation algorithm and the wide activation residual structure.

Description

Super-resolution image reconstruction method and device based on wide activation and electronic equipment

Technical Field

The present invention relates to the field of computer vision and image processing, and in particular, to a super-resolution image reconstruction method and apparatus based on wide activation, an electronic device, and a storage medium.

Background

Spatial resolution is an important image quality indicator that characterizes how much scene information an image contains per unit area. In general, high-resolution images provide clearer and richer detail than low-resolution images, which helps in fully understanding the image content and in performing subsequent processing tasks. Super-Resolution (SR) reconstruction is a post-processing technique that improves the inherent resolution of images at the software level, and it is widely applied in fields such as medical imaging, intelligent monitoring, and virtual reality. The goal of image SR reconstruction is to recover a High-Resolution (HR) image of the same scene from one or more Low-Resolution (LR) images.

In the prior art, SR reconstruction methods fall into three general categories: interpolation methods, modeling/reconstruction methods, and machine learning methods, where the machine learning methods further include conventional shallow learning methods and deep learning methods.

Although deep learning methods have greatly promoted the development of image SR reconstruction in recent years, the idea of improving reconstruction performance by increasing the size of the neural network has hit a bottleneck. Meanwhile, most applications require the SR reconstruction method to run in a timely manner, and the computational cost and storage consumption of large-scale neural networks hinder the practical deployment of the algorithm to some extent. On the other hand, conventional interpolation algorithms generally assume that the image signal is a continuous, band-limited signal, and their reconstruction results are prone to blurring and artifacts.

Disclosure of Invention

The invention provides a super-resolution image reconstruction method, device, electronic equipment and storage medium based on wide activation. It addresses two problems of the prior art: conventional interpolation algorithms are prone to blurring and artifacts, and the very large computational cost and storage consumption of neural network methods hinder practical deployment. By combining the strong nonlinear expressive power of deep learning methods with the high execution speed of interpolation methods, the invention constructs an efficient SR reconstruction model with accurate reconstruction results, improves the balance between efficiency and quality in image SR reconstruction, and promotes the practical application and deployment of the related technology.

Specifically, the embodiment of the invention provides the following technical scheme:

in a first aspect, an embodiment of the present invention provides a super-resolution image reconstruction method based on wide activation, including:

acquiring a low-resolution LR image to be reconstructed;

inputting the low-resolution LR image to be reconstructed into a trained adaptive resampling super-resolution image reconstruction model based on a wide activation residual structure to obtain a high-resolution HR image output by the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure;

the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure is used for reconstructing the LR image into the corresponding HR image based on at least an interpolation algorithm and the wide activation residual structure.

Further, the super-resolution image reconstruction method based on wide activation further comprises the following steps:

the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure comprises an adaptive interpolation kernel estimation layer and an adaptive resampling layer;

the adaptive interpolation kernel estimation layer is used for generating an adaptive interpolation kernel for each spatial position of the HR image, to be used by the adaptive resampling layer for fine adjustment of the image;

the adaptive resampling layer is configured to adaptively apply the adaptive interpolation kernel to a corresponding location of the LR image after generating the adaptive interpolation kernel to reconstruct the HR image.

the method further comprises the steps of:

based on the training sample image, an adaptive resampling super-resolution image reconstruction model based on a wide activation residual structure is trained to optimize the model parameters.

the adaptive interpolation kernel estimation layer further comprises a shallow feature extraction layer, a plurality of stacked wide-activation residual modules, a deep feature extraction layer and a Pixel Shuffle upsampling layer,

the shallow feature extraction layer, the plurality of stacked wide activation residual modules and the deep feature extraction layer form the wide activation residual structure, and are used for executing nonlinear reasoning of the adaptive interpolation kernel estimation layer so as to output feature mapping of the LR image.

the adaptive resampling layer is configured to adaptively apply the adaptive interpolation kernel to a corresponding location of the LR image after generating the adaptive interpolation kernel to reconstruct the HR image, including:

scaling the input LR image to the desired size of the HR image based on the interpolation algorithm to obtain X_int, where X_int denotes a temporary image of the same size as the target HR image, obtained directly from the LR image by interpolation; and

for each channel of the input LR image, applying the weights of the adaptive interpolation kernel to the image block positions of X_int taken at interval s, where s is 2, 3 or 4.

wherein each of the plurality of stacked wide activation residual modules comprises two identical dilated convolution layers with a dilation rate r > 1, and a ReLU activation function is adopted between the two dilated convolution layers,

the number of channels of the output feature map is increased in the first layer of convolution and then input to the ReLU activation function, and the number of channels of the output feature map is reduced in the second layer of convolution.

the Pixel Shuffle upsampling layer employs a sub-pixel convolutional network and is used to upsample the generated interpolation kernels to the spatial size of the HR image.

In a second aspect, an embodiment of the present invention further provides a super-resolution image reconstruction apparatus based on wide activation, including:

an image acquisition unit for acquiring a low resolution LR image to be reconstructed;

the reconstruction unit is used for inputting the low-resolution LR image to be reconstructed into a trained adaptive resampling super-resolution image reconstruction model based on a wide activation residual structure, and obtaining a high-resolution HR image output by the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure; and

In a third aspect, an embodiment of the present invention further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the steps of the above-mentioned super-resolution image reconstruction method based on wide activation are implemented by the processor when the program is executed.

In a fourth aspect, embodiments of the present invention also provide a storage medium comprising a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the steps of the super-resolution image reconstruction method based on wide activation as described above.

According to the above technical scheme, the super-resolution image reconstruction method, device, electronic equipment and storage medium based on wide activation provided by the embodiments of the invention address two problems of the prior art: conventional interpolation algorithms are prone to blurring and artifacts, and the very large computational cost and storage consumption of neural network methods hinder practical deployment. By combining the strong nonlinear expressive power of deep learning methods with the high execution speed of interpolation methods, an efficient SR reconstruction model with accurate reconstruction results is constructed, the balance between efficiency and quality in image SR reconstruction is improved, and the practical application and deployment of the related technology are promoted.

Drawings

In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of a super-resolution image reconstruction method based on wide activation according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of an adaptive resampling super-resolution image reconstruction model based on a wide-activation residual structure according to an embodiment of the present invention;

FIG. 3 is a schematic diagram illustrating the internal structure of a Wide Activation Residual Block (WARB) according to an embodiment of the present invention;

FIG. 4 is a performance/parameter trade-off comparison chart for typical image SR models according to an embodiment of the invention, where the data are for SR×4 reconstruction on the Manga109 dataset;

FIG. 5 is a schematic structural diagram of a super-resolution image reconstruction device based on wide activation according to an embodiment of the present invention; and

FIG. 6 is a schematic diagram of an electronic device according to an embodiment of the invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The terms and phrases used herein have the ordinary meaning known to those of ordinary skill in the art, unless they are described and explained more fully herein. If a term or phrase used herein has a meaning inconsistent with its known meaning, the meaning expressed by the present invention prevails; if it is not defined in the present application, it has the meaning commonly understood by one of ordinary skill in the art.

Although deep learning methods in the prior art have greatly promoted the development of image SR reconstruction in recent years, the idea of improving reconstruction performance by increasing the size of the neural network has hit a bottleneck. Meanwhile, most applications require the SR reconstruction method to run in a timely manner, and the computational cost and storage consumption of large-scale neural networks hinder the practical deployment of the algorithm to some extent. On the other hand, conventional interpolation algorithms generally assume that the image signal is a continuous, band-limited signal, and their reconstruction results are prone to blurring and artifacts.

In view of this, in a first aspect, an embodiment of the present invention proposes a super-resolution image reconstruction method based on wide activation. The method overcomes two problems of the prior art: conventional interpolation algorithms are prone to blurring and artifacts, and the very large computational cost and storage consumption of neural network methods hinder practical deployment. It combines the strong nonlinear expressive power of deep learning methods with the high execution speed of interpolation methods to construct an efficient SR reconstruction model with accurate reconstruction results, improves the balance between efficiency and quality in image SR reconstruction, and promotes the practical application and deployment of the related technology.

The wide-activation-based super-resolution image reconstruction method of the present invention is described below with reference to fig. 1.

Fig. 1 is a flowchart of a super-resolution image reconstruction method based on wide activation according to an embodiment of the present invention.

In this embodiment, it should be noted that the super-resolution image reconstruction method based on wide activation may include the following steps:

S1: acquiring a low-resolution LR image to be reconstructed;

S2: inputting the low-resolution LR image to be reconstructed into a trained adaptive resampling super-resolution image reconstruction model based on a wide activation residual structure to obtain a high-resolution HR image output by the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure;

the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure is used for reconstructing the LR image into a corresponding HR image based at least on the interpolation algorithm and the wide activation residual structure.

An adaptive resampling super-resolution image reconstruction model based on a wide activation residual structure provided by an embodiment of the present invention is described below with reference to fig. 2.

Fig. 2 is a schematic structural diagram of an adaptive resampling super-resolution image reconstruction model based on a wide-activation residual structure according to an embodiment of the present invention.

In this embodiment, it should be noted that the super-resolution image reconstruction method based on wide activation may further include: the adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure comprises an adaptive interpolation kernel estimation layer and an adaptive resampling layer; the adaptive interpolation kernel estimation layer is used for generating an adaptive interpolation kernel for each spatial position of the HR image, to be used by the subsequent adaptive resampling layer for fine adjustment of the image; the adaptive resampling layer is used for adaptively applying the adaptive interpolation kernel to the corresponding position of the LR image, after the adaptive interpolation kernel is generated, to reconstruct the HR image.

Specifically, the whole model consists of two parts, the first part estimating an adaptive interpolation kernel and the second part applying the estimated interpolation kernel to adjust the up-sampled image.

In this embodiment, it should be noted that the super-resolution image reconstruction method based on wide activation may further include: the adaptive interpolation kernel estimation layer further comprises a shallow feature extraction layer, a plurality of stacked wide activation residual modules, a deep feature extraction layer and a Pixel Shuffle up-sampling layer, wherein the shallow feature extraction layer, the plurality of stacked wide activation residual modules and the deep feature extraction layer form a wide activation residual structure and are used for executing nonlinear reasoning of the adaptive interpolation kernel estimation layer so as to output feature mapping of an LR image.

Specifically, in the interpolation kernel estimation part, a content-based interpolation kernel is calculated for each position in the image using a data-driven method. For example, a Fully Convolutional Network (FCN) is used to compute the weight values of the interpolation kernel. It includes a shallow feature extraction layer (3×3 convolution), a set of stacked wide activation residual modules (Wide Activation Residual Block, WARB), a deep feature extraction layer (3×3 convolution), and a Pixel Shuffle upsampling layer.

In particular, the part of the network before the upsampling layer performs the nonlinear inference for the adaptive interpolation kernel estimation. Its output is a set of feature maps in the LR image space,

$$x_L \in \mathbb{R}^{h \times w \times (k^2 \cdot s^2)}$$

where h and w represent the height and width of the feature map, respectively, k is the spatial size of the interpolation kernel (assuming the interpolation kernel is square), and s is the upsampling factor. In the spatial dimensions, x_L has the same resolution as the LR image. For the estimated interpolation kernels to correspond to the HR image, they need to be upsampled (implemented by the Pixel Shuffle layer). The upsampled interpolation kernels are denoted

$$x_H \in \mathbb{R}^{sh \times sw \times k^2}$$

which is consistent with the HR image in its spatial dimensions. In this embodiment, each spatial position of x_H corresponds to a vector of length k × k, which can be reorganized into a rectangular k × k interpolation kernel. Based on this, the adaptive interpolation kernel estimation process generates an adaptive interpolation kernel for each spatial position of the HR image, for fine adjustment of the subsequent image.
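The shape bookkeeping of the Pixel Shuffle step can be illustrated with a minimal NumPy sketch. It assumes that the LR-space feature map carries k²·s² channels and that the channels are ordered as (row offset, column offset, kernel weight), as in a typical depth-to-space layout; the function name `pixel_shuffle` and the values k = 5, s = 2 are hypothetical and chosen only for illustration.

```python
import numpy as np

def pixel_shuffle(x_l, s):
    """Rearrange an (h, w, C*s*s) array into (s*h, s*w, C).

    Assumed channel layout: index = (dy * s + dx) * C + c, where (dy, dx)
    is the sub-pixel offset inside each s x s block of the HR grid.
    """
    h, w, c_in = x_l.shape
    assert c_in % (s * s) == 0, "channel count must be divisible by s*s"
    c = c_in // (s * s)
    x = x_l.reshape(h, w, s, s, c)      # (h, w, dy, dx, C)
    x = x.transpose(0, 2, 1, 3, 4)      # (h, dy, w, dx, C)
    return x.reshape(h * s, w * s, c)   # (s*h, s*w, C)

# Hypothetical sizes: 5x5 kernels, upsampling factor 2, a 4x6 LR feature map.
k, s, h, w = 5, 2, 4, 6
x_l = np.random.rand(h, w, k * k * s * s)
x_h = pixel_shuffle(x_l, s)                # one k*k vector per HR position
kernels = x_h.reshape(s * h, s * w, k, k)  # reorganized into k x k kernels
```

After the shuffle, every HR position holds its own k × k weight set, which is the form the adaptive resampling layer consumes.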

The internal structure of the Wide Activation Residual Block (WARB) provided by an embodiment of the present invention is described below in conjunction with FIG. 3.

Fig. 3 is a schematic diagram illustrating the internal structure of a Wide Activation Residual Block (WARB) according to an embodiment of the invention.

Specifically, in the adaptive interpolation kernel inference stage, the network comprises a shallow feature extraction layer, a group of stacked wide-activation residual modules (WARB), a deep feature extraction layer and an up-sampling layer. The shallow and deep feature extraction layers adopt 3×3 convolutions with dilation rate r=2; each WARB comprises two identical 3×3 convolution layers with a ReLU activation function between them; there are 4 WARB modules in total; and all convolutions are dense convolutions with a stride of 1. The number of output channels of the whole model is 32, and the wide activation ratio in the WARB is set to 4, i.e., the wide-activation output has 32 × 4 = 128 channels. The Pixel Shuffle layer is implemented with a sub-pixel convolutional network (ESPCNN). The model parameters are initialized using the Xavier method, and the mini-batch size is set to 16. In model training, the LR training image block size is fixed at 48×48, so the HR image block sizes at the corresponding scales are 96×96 (s=2), 144×144 (s=3), and 192×192 (s=4). The optimizer is Adam with β₁ = 0.9, β₂ = 0.999 and ε = 1e-8. The learning rate is initially 1e-4 and is halved every 100,000 iterations, for a total of 500,000 training iterations.
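The arithmetic behind these settings can be checked with a short sketch. The function names are hypothetical, and the staircase form of the schedule (the rate stays constant within each block of 100,000 iterations) is an assumption; the text only states that the rate is halved every 100,000 iterations.

```python
def learning_rate(step, base_lr=1e-4, decay_every=100_000):
    """Step-decay schedule: halve the rate after every `decay_every` steps."""
    return base_lr * 0.5 ** (step // decay_every)

def hr_patch_size(s, lr_patch=48):
    """HR patch edge length matching a fixed LR patch at upsampling factor s."""
    return lr_patch * s

def wide_channels(base_channels=32, expansion=4):
    """Channel count fed to the ReLU inside a wide activation block."""
    return base_channels * expansion

# Values stated in the text: 96/144/192 HR patches and 128 wide channels.
patch_sizes = {s: hr_patch_size(s) for s in (2, 3, 4)}
final_lr = learning_rate(499_999)  # rate used during the last 100,000 steps
```

With 500,000 iterations in total, the rate is halved four times, so the final block of training runs at 1e-4 / 16.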

In this embodiment, it should be noted that the super-resolution image reconstruction method based on wide activation may further include: the Pixel Shuffle upsampling layer employs a sub-pixel convolutional network and is used to upsample the generated interpolation kernels to the spatial size of the HR image.

In this embodiment, it should be noted that the super-resolution image reconstruction method based on wide activation may further include: the adaptive resampling layer is configured to adaptively apply the adaptive interpolation kernel to the corresponding location of the LR image, after the adaptive interpolation kernel is generated, to reconstruct the HR image, including: scaling the input LR image to the desired size of the HR image based on an interpolation algorithm to obtain X_int, where X_int denotes a temporary image of the same size as the target HR image, obtained directly from the LR image by interpolation; and, for each channel of the input LR image, applying the weights of the adaptive interpolation kernel to the image block positions of X_int taken at interval s, where s is 2, 3 or 4.

Specifically, after the interpolation kernel weight parameters are estimated, they are adaptively applied to the corresponding locations of the LR input image to reconstruct the HR image. Adjacent pixels of the HR image may be resampled from the same set of pixels of the LR image, but with different final intensity values, because each pixel in the HR image space has a different respective interpolation kernel.

More specifically, the LR input image is scaled to the desired size of the HR image by an interpolation algorithm (a common polynomial interpolation technique such as nearest-neighbor (NN), bilinear, or bicubic interpolation can be selected) to obtain X_int. At this point, X_int and the adaptive interpolation kernel x_H both have the same spatial dimensions as the HR image, which facilitates the subsequent adaptive resampling operation.
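As a minimal sketch of this scaling step, the simplest of the listed options, nearest-neighbor interpolation, can be written with NumPy as follows. The function name `upscale_nearest` is hypothetical, and bilinear or bicubic interpolation could be substituted without changing the output size.

```python
import numpy as np

def upscale_nearest(lr, s):
    """Nearest-neighbor upscaling of an (h, w, C) image by an integer factor s.

    Each LR pixel is repeated into an s x s block, giving X_int with the
    same spatial size (s*h, s*w) as the target HR image.
    """
    return np.repeat(np.repeat(lr, s, axis=0), s, axis=1)

lr_image = np.arange(2 * 3 * 1, dtype=float).reshape(2, 3, 1)
x_int = upscale_nearest(lr_image, s=2)  # shape (4, 6, 1)
```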

In this embodiment, each spatial position of x_H corresponds to a set of k × k interpolation kernel weights, which are applied directly to X_int by an adaptive resampling operation, as shown in the following formula (1):

where y is the resampled HR image. That is, for each channel of the input image, the k × k interpolation kernel weights are applied to image block positions in X_int taken at interval s, similar to the way a dilated convolution samples at intervals of its dilation rate r.
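Formula (1) is not reproduced in this text, so the following NumPy sketch shows only one plausible reading of the adaptive resampling operation. It assumes that k is odd, that the kernel taps are centered on the output pixel and spaced s pixels apart, that the same kernel is shared by all channels, and that the borders of X_int are handled by edge replication. The function name `adaptive_resample` is hypothetical.

```python
import numpy as np

def adaptive_resample(x_int, kernels, s):
    """Apply a per-pixel k x k kernel to X_int with tap spacing s.

    x_int:   (H, W, C) interpolated image, same size as the HR target.
    kernels: (H, W, k, k) one kernel per HR position (assumed layout).
    Returns y with y[i, j, c] = sum over (u, v) of
        kernels[i, j, u, v] * x_int[i + (u - k//2)*s, j + (v - k//2)*s, c].
    """
    H, W, _ = x_int.shape
    k = kernels.shape[2]
    pad = (k // 2) * s
    # Edge replication at the borders is an assumption of this sketch.
    xp = np.pad(x_int, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    y = np.zeros(x_int.shape, dtype=float)
    for u in range(k):
        for v in range(k):
            # In padded coordinates, tap (u, v) for output (i, j) sits at
            # (i + u*s, j + v*s), so one shifted view covers all pixels.
            shifted = xp[u * s:u * s + H, v * s:v * s + W, :]
            y += kernels[:, :, u, v, None] * shifted
    return y
```

Under these assumptions, a kernel whose only nonzero weight is a 1 at its center leaves X_int unchanged, so the estimated kernels can be seen as learned, position-dependent corrections to the plain interpolation result.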

In this embodiment, it should be noted that the super-resolution image reconstruction method based on wide activation may further include: each of the plurality of stacked wide-activation residual modules comprises two dilated convolution layers with the same dilation rate r > 1, and a ReLU activation function is adopted between the two dilated convolution layers; the number of channels of the output feature map is increased in the first convolution layer before it is input to the ReLU activation function, and the number of channels of the output feature map is reduced in the second convolution layer.

While deep learning models become more expressive as network depth increases, increasing network depth in low-level vision tasks also leads to underutilization of shallow features. The first, traditional way to address this is to introduce skip connections (Shortcut Connections) or concatenation operations, which deliver the shallow features directly to deeper locations in the network. The second common way to promote the use of shallow features is wide activation. Specifically, the number of channels of the output feature map is increased in the first convolution layer before it is input to the ReLU activation function, and the number of channels of the output feature map is reduced in the second convolution layer. In this way, the parameter size and computational cost of the whole module remain unchanged, but shallow features flow through the ReLU activation function more easily. The first approach facilitates network training, but the Shortcut Connection makes the network resemble an ensemble of multiple shallow networks. The second approach avoids this problem, but it is not conducive to efficient network training.

To strike an effective compromise between the two, the invention adopts the wide activation residual structure shown in fig. 3. In addition, to enlarge the receptive field of the model, a dilated convolution (Dilated Convolution) with dilation rate r > 1 can be used in place of the ordinary 3×3 convolution in fig. 2. Experimental results show that the proposed WARB module is effective in estimating the adaptive interpolation kernel and in improving the reconstruction performance of the model.
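The forward pass of one such block can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the structure of fig. 3 itself: the identity skip connection around the two convolutions, zero padding, the absence of bias terms, and the names `dilated_conv3x3` and `warb` are all assumptions; the channel counts (32 expanded to 128) and dilation rate r = 2 follow the values given in the text.

```python
import numpy as np

def dilated_conv3x3(x, w, r):
    """3x3 dilated convolution, stride 1, zero 'same' padding.

    x: (H, W, C_in) feature map; w: (3, 3, C_in, C_out) weights; r: dilation.
    """
    H, W, _ = x.shape
    xp = np.pad(x, ((r, r), (r, r), (0, 0)))
    out = np.zeros((H, W, w.shape[-1]))
    for u in range(3):
        for v in range(3):
            # Tap (u, v) reads the input at offset ((u - 1) * r, (v - 1) * r).
            patch = xp[u * r:u * r + H, v * r:v * r + W, :]
            out += patch @ w[u, v]
    return out

def warb(x, w_expand, w_reduce, r=2):
    """Wide activation residual block: expand, ReLU, reduce, add the input."""
    h = dilated_conv3x3(x, w_expand, r)  # e.g. 32 -> 128 channels
    h = np.maximum(h, 0.0)               # ReLU on the widened features
    h = dilated_conv3x3(h, w_reduce, r)  # e.g. 128 -> 32 channels
    return x + h                         # assumed identity skip connection

rng = np.random.default_rng(0)
x = rng.standard_normal((12, 12, 32))
w_expand = rng.standard_normal((3, 3, 32, 128)) * 0.05
w_reduce = rng.standard_normal((3, 3, 128, 32)) * 0.05
y = warb(x, w_expand, w_reduce, r=2)     # same shape as x
```

The ReLU sees 128 channels instead of 32, which is the sense in which the activation is "wide", while the block's input and output widths stay at 32.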

The proposed model is a typical end-to-end mapping from LR images to HR images. The model parameters are estimated by minimizing the reconstruction error between the HR image f(x) output by the model and the real HR image y, where f(·) represents the mapping function of the whole network. In the field of image restoration, the L₂ objective function is generally more popular because minimizing it directly maximizes the PSNR index. However, research in recent years has shown that the L₁ objective function is more beneficial to model convergence, so the L₁ objective function is adopted here to train the model. Given a training data set

$$D = \{(x_i, y_i)\}_{i=1}^{|D|}$$

where |D| represents the total number of samples, the L₁ objective function is shown in the following equation (2):

$$L_1(\theta) = \frac{1}{|D|} \sum_{i=1}^{|D|} \left\| f(x_i; \theta) - y_i \right\|_1 \tag{2}$$

where θ represents the set of parameters of the entire network. Notably, although the L₁ objective function is not differentiable at 0, the model is trained on batches (Batch) of data. The probability that the error between the model output and the HR image is exactly zero over a whole batch is very small, so this has little effect on actual training.
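A minimal NumPy sketch of an L₁ objective over one batch is given below. It assumes the loss averages the per-sample L₁ norm of the reconstruction error over the batch; many implementations average over pixels as well, which differs only by a constant factor. The function names are hypothetical, and the subgradient helper uses the common convention of returning 0 where the error is exactly zero.

```python
import numpy as np

def l1_loss(pred, target):
    """Batch mean of the per-sample L1 norm of (pred - target).

    pred, target: arrays of shape (batch, H, W, C).
    """
    diff = np.abs(pred - target)
    return diff.reshape(diff.shape[0], -1).sum(axis=1).mean()

def l1_subgradient(pred, target):
    """Subgradient of l1_loss with respect to pred.

    np.sign returns 0 at exactly zero error, which is one valid choice at
    the non-differentiable point.
    """
    return np.sign(pred - target) / pred.shape[0]

pred = np.ones((2, 2, 2, 1))
target = np.zeros((2, 2, 2, 1))
loss = l1_loss(pred, target)  # each sample has 4 pixels with error 1
```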

In this embodiment, it should be noted that the super-resolution image reconstruction method based on wide activation may further include: based on the training sample image, an adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure is trained to optimize model parameters.

In particular, the model provided by the embodiment of the invention is called the Wide Activation Residual Resampling Network (WARRN), i.e., an adaptive resampling super-resolution image reconstruction model based on a wide-activation residual structure. The training process of the whole model is as follows:

Input: training set D; model parameters θ;

Output: optimized model parameters θ;

Initialization: initialize the model parameters using the Xavier method; T = 5×10⁵, lr = 1e-4, bSize = 16;

While t < T:

    Randomly extract bSize small image blocks of size 48×48 from the LR images;

    Extract the respective HR image blocks from the corresponding locations of the HR images;

    Input the data of one batch to the model and calculate the model output;

    Calculate the model loss function according to formula (2), and execute the backpropagation algorithm to update the model parameters θ;

End
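The batch-construction steps of this procedure can be sketched as follows. The function name `sample_batch` and its arguments are hypothetical, and the sketch assumes that each HR image is exactly s times the size of its LR counterpart, so that an LR block at (top, left) corresponds to the HR block at (s·top, s·left).

```python
import numpy as np

def sample_batch(lr_images, hr_images, s, b_size=16, patch=48, rng=None):
    """Draw b_size aligned (LR, HR) patch pairs.

    lr_images, hr_images: lists of (h, w, C) and (s*h, s*w, C) arrays.
    Returns arrays of shape (b_size, patch, patch, C) and
    (b_size, s*patch, s*patch, C).
    """
    rng = np.random.default_rng() if rng is None else rng
    lr_batch, hr_batch = [], []
    for _ in range(b_size):
        idx = rng.integers(len(lr_images))           # pick a training image
        lr, hr = lr_images[idx], hr_images[idx]
        top = rng.integers(lr.shape[0] - patch + 1)  # random LR position
        left = rng.integers(lr.shape[1] - patch + 1)
        lr_batch.append(lr[top:top + patch, left:left + patch])
        # The matching HR block starts at the same position scaled by s.
        hr_batch.append(hr[top * s:(top + patch) * s,
                           left * s:(left + patch) * s])
    return np.stack(lr_batch), np.stack(hr_batch)
```

For s = 2, 3 and 4 the HR blocks come out as 96×96, 144×144 and 192×192, matching the fixed 48×48 LR block size.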

The WARRN model provided by the embodiment of the invention is implemented in TensorFlow 1.11.0 and is trained for a total of 500,000 iterations on one NVIDIA GeForce GTX 1080Ti GPU.

More specifically, embodiments of the present invention use the standard DIV2K training set to train the model. It contains 800 high-quality training images, and image degradation uses typical bicubic downsampling. During model training, the size of the LR image block is fixed at 48×48, and HR image blocks of the corresponding size are sampled at the corresponding positions of the HR image. The model performance is evaluated using five commonly used test datasets: Set5, Set14, B100, Urban100, and Manga109. These test datasets contain rich image content and cover most natural scenes in daily life, such as people, animals, buildings and natural landscapes, so they can effectively evaluate the generalization performance of the model on different types of images.

Fig. 4 compares the performance/parameter trade-off of several methods for SR×4 reconstruction on the Manga109 dataset, where the performance advantage of WARRN can be observed more clearly. The horizontal axis is the number of parameters (M) and the vertical axis is the peak signal-to-noise ratio PSNR (dB).

More specifically, the model performance evaluation of embodiments of the present invention employs two typical metrics: peak signal-to-noise ratio (PSNR) and the structural similarity measure (SSIM). PSNR is determined by the maximum possible pixel value of an image and the mean square error (MSE) between images. Given a reconstructed image y and a corresponding reference image x, PSNR is defined as shown in the following formula (3):

$$\mathrm{PSNR}(x, y) = 10 \log_{10} \frac{(2^c - 1)^2}{\mathrm{MSE}(x, y)} \tag{3}$$

where c represents the number of bits per pixel and 2^c − 1 is the pixel peak value. Natural image data generally use unsigned 8-bit integers to represent pixels, so c = 8, and PSNR values generally range from 20 to 40 dB. MSE is the mean square error between the predicted image and the reference image, calculated as shown in the following formula (4):

$$\mathrm{MSE}(x, y) = \frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} \left( x(i, j) - y(i, j) \right)^2 \tag{4}$$

where H and W are the image height and width, respectively. For a multi-channel color image, the calculation of formula (4) is applied to each channel.
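The standard PSNR and MSE definitions can be sketched in NumPy as follows. The function names are hypothetical. For a multi-channel image, this sketch averages the squared error over all channels, which equals the mean of the per-channel MSE values when the channels have the same size; returning infinity for identical images is a convention of this sketch.

```python
import numpy as np

def mse(x, y):
    """Mean square error between two images of identical shape."""
    x = np.asarray(x, dtype=np.float64)  # avoid uint8 wrap-around
    y = np.asarray(y, dtype=np.float64)
    return np.mean((x - y) ** 2)

def psnr(x, y, c=8):
    """Peak signal-to-noise ratio in dB for c-bit pixels."""
    err = mse(x, y)
    if err == 0:
        return float("inf")              # identical images
    peak = 2 ** c - 1                    # 255 for 8-bit data
    return 10.0 * np.log10(peak ** 2 / err)

reference = np.zeros((4, 4, 3), dtype=np.uint8)
reconstructed = np.ones((4, 4, 3), dtype=np.uint8)  # every pixel off by 1
value = psnr(reference, reconstructed)               # 20 * log10(255) dB
```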

In contrast, SSIM is better able to reflect differences in image structural details, whose computation is based primarily on image brightness, contrast, and structural similarity. Given the reconstructed image y and the corresponding reference image x, the definition of SSIM is as shown in the following equation (5):

SSIM(x，y)＝[L(x，y)] ^α ·[C(x，y)] ^β ·[S(x，y)] ^γ

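where L(·), C(·), and S(·) are the brightness comparison, contrast comparison, and structure comparison functions, respectively, and α, β, and γ are three control parameters, each greater than 0, that adjust the relative importance of the three functions. The three functions are defined by the following formulas (6) to (8), respectively:

$$L(x, y) = \frac{2 \mu_x \mu_y + C_1}{\mu_x^{2} + \mu_y^{2} + C_1} \tag{6}$$

$$C(x, y) = \frac{2 \sigma_x \sigma_y + C_2}{\sigma_x^{2} + \sigma_y^{2} + C_2} \tag{7}$$

$$S(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3} \tag{8}$$

where $\mu_x$ and $\mu_y$ are the pixel means of x and y, $\sigma_x$ and $\sigma_y$ are the pixel standard deviations, and $\sigma_{xy}$ is the covariance between the two images. $C_1$, $C_2$, and $C_3$ are three constants that prevent numerical errors when a denominator is 0 or close to 0. In practical applications, α = β = γ = 1 and $C_3 = C_2 / 2$ are generally used, so that equation (5) can be written as the following equation (9):

$$\mathrm{SSIM}(x, y) = \frac{(2 \mu_x \mu_y + C_1)(2 \sigma_{xy} + C_2)}{(\mu_x^{2} + \mu_y^{2} + C_1)(\sigma_x^{2} + \sigma_y^{2} + C_2)} \tag{9}$$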
It is apparent from the definitions that both evaluation indexes are symmetric, i.e., PSNR(x, y) = PSNR(y, x) and SSIM(x, y) = SSIM(y, x). For both indexes, a larger value is better; PSNR has no upper limit, whereas SSIM has a maximum of 1.0 and in practice generally lies in the range of 0 to 1.0.
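The simplified SSIM of equation (9) can be sketched as follows. This is a minimal illustration under stated assumptions: the statistics are computed once over the whole image rather than over local windows, and the constants are set as C₁ = (k1·peak)² and C₂ = (k2·peak)² with k1 = 0.01 and k2 = 0.03, which are commonly used values and are not specified in this embodiment. The function name `ssim_global` and the sample pixel values are likewise hypothetical.

```python
def ssim_global(x, y, bits=8, k1=0.01, k2=0.03):
    """Simplified SSIM (alpha = beta = gamma = 1, C3 = C2 / 2).

    x and y are single-channel images given as lists of rows of equal size.
    Assumption: means, variances and covariance are taken over the whole
    image; k1 and k2 are conventional defaults, not values from the text.
    """
    xs = [float(v) for row in x for v in row]
    ys = [float(v) for row in y for v in row]
    if len(xs) != len(ys) or not xs:
        raise ValueError("images must be non-empty and of equal size")
    n = len(xs)
    peak = 2 ** bits - 1
    c1 = (k1 * peak) ** 2
    c2 = (k2 * peak) ** 2

    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((v - mu_x) ** 2 for v in xs) / n  # sigma_x squared
    var_y = sum((v - mu_y) ** 2 for v in ys) / n  # sigma_y squared
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 2x2 example (values chosen only for illustration).
reference = [[0, 255], [128, 64]]
reconstructed = [[0, 250], [130, 64]]
print(ssim_global(reference, reconstructed))
print(ssim_global(reference, reconstructed) == ssim_global(reconstructed, reference))
```

Swapping the two arguments leaves every term of the numerator and denominator unchanged, which is the symmetry property noted above.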

Specifically, the visual effects of several typical image SR methods are compared, including bicubic interpolation (Bicubic), SRCNN, VDSR, DRRN, LapSRN, and MemNet. The results of these methods for 4× magnification of the Butterfly image of Set5 show that the reconstructions of the other methods are visibly blurred at edge positions, whereas the proposed WARRN model handles edges relatively well. The quantitative indexes of each method on Butterfly are displayed at the bottom of the corresponding locally magnified image; WARRN achieves the best quantitative evaluation, which indicates that its reconstruction accuracy is the highest.

However, the above comparison covers only one particular sample. The following table therefore gives quantitative evaluation results for several typical SR methods on the standard test dataset described above at three common SR reconstruction scales (SR×2, SR×3, and SR×4); the data shown there are more statistically meaningful.

TABLE 1

The quantitative comparison in Table 1 shows that the maximum value in each comparison cell belongs to WARRN; that is, the proposed WARRN exhibits a significant and stable performance advantage at all SR scales.

Based on the same inventive concept, in another aspect, an embodiment of the present invention provides a super-resolution image reconstruction apparatus based on wide activation.

The wide-activation-based super-resolution image reconstruction device provided by the present invention is described below with reference to Fig. 5. The device described below and the wide-activation-based super-resolution image reconstruction method described above may be referred to in correspondence with each other.

Fig. 5 is a schematic structural diagram of a super-resolution image reconstruction device based on wide activation according to an embodiment of the present invention.

In the present embodiment, the super-resolution image reconstruction apparatus 1 based on wide activation includes: an image acquisition unit 10 configured to acquire a low-resolution LR image to be reconstructed; and a reconstruction unit 20 configured to input the low-resolution LR image to be reconstructed into a trained adaptive resampling super-resolution image reconstruction model based on a wide activation residual structure, and to obtain a high-resolution HR image output by the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure. The adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure is used to reconstruct the LR image into a corresponding HR image based at least on the interpolation algorithm and the wide activation residual structure.

Since the wide-activation-based super-resolution image reconstruction device provided by the embodiment of the present invention can be used to execute the wide-activation-based super-resolution image reconstruction method described in the above embodiment, the working principle and the beneficial effects thereof are similar, so that details will not be described herein, and reference will be made to the description of the above embodiments.

In this embodiment, it should be noted that the modules in the apparatus of the embodiment of the present invention may be integrated into one body or deployed separately. The above units may be combined into one unit or further split into a plurality of subunits.
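The division of the apparatus into an image acquisition unit 10 and a reconstruction unit 20 can be sketched as below. This is only a structural illustration under stated assumptions: all class and function names are hypothetical, and `placeholder_model` is a nearest-neighbour upscaler standing in for the trained model, not the wide-activation model itself.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[float]]  # single-channel image as a list of rows


def placeholder_model(lr: Image, scale: int = 2) -> Image:
    """Stand-in for the trained model (assumption): nearest-neighbour upscaling."""
    return [
        [value for value in row for _ in range(scale)]
        for row in lr
        for _ in range(scale)
    ]


@dataclass
class ImageAcquisitionUnit:
    """Corresponds to unit 10: acquires the LR image to be reconstructed."""

    def acquire(self, source: Image) -> Image:
        if not source or any(len(row) != len(source[0]) for row in source):
            raise ValueError("LR image must be a non-empty rectangular grid")
        return source


@dataclass
class ReconstructionUnit:
    """Corresponds to unit 20: feeds the LR image to the model, returns the HR image."""

    model: Callable[[Image], Image]

    def reconstruct(self, lr: Image) -> Image:
        return self.model(lr)


@dataclass
class ReconstructionApparatus:
    """Corresponds to apparatus 1; the two units could also be merged or split."""

    acquisition: ImageAcquisitionUnit
    reconstruction: ReconstructionUnit

    def run(self, source: Image) -> Image:
        return self.reconstruction.reconstruct(self.acquisition.acquire(source))


apparatus = ReconstructionApparatus(
    ImageAcquisitionUnit(), ReconstructionUnit(placeholder_model)
)
print(apparatus.run([[1, 2], [3, 4]]))
```

Because the reconstruction unit only depends on a callable, a trained model can replace the placeholder without changing the other unit, which mirrors the statement that the units may be integrated or deployed separately.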

In yet another aspect, a further embodiment of the present invention provides an electronic device based on the same inventive concept.

In this embodiment, it should be noted that the electronic device may include a processor 610, a communication interface 620, a memory 630, and a communication bus 640, wherein the processor 610, the communication interface 620, and the memory 630 communicate with each other via the communication bus 640. The processor 610 may invoke logic instructions in the memory 630 to perform a wide-activation-based super-resolution image reconstruction method comprising: acquiring a low-resolution LR image to be reconstructed; inputting the low-resolution LR image to be reconstructed into a trained adaptive resampling super-resolution image reconstruction model based on a wide activation residual structure to obtain a high-resolution HR image output by the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure; the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure is used for reconstructing the LR image into a corresponding HR image based at least on the interpolation algorithm and the wide activation residual structure.

Further, the logic instructions in the memory 630 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art or in a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.

In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs a wide-activation-based super-resolution image reconstruction method, the method comprising: acquiring a low-resolution LR image to be reconstructed; inputting the low-resolution LR image to be reconstructed into a trained adaptive resampling super-resolution image reconstruction model based on a wide activation residual structure to obtain a high-resolution HR image output by the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure; the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure is used for reconstructing the LR image into a corresponding HR image based at least on the interpolation algorithm and the wide activation residual structure.

The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.

From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.

Moreover, in the present invention, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.

Furthermore, in the present invention, a description using the terms "embodiment," "this embodiment," "yet another embodiment," and the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. The particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of the different embodiments or examples, provided that they do not contradict each other.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A super-resolution image reconstruction method based on wide activation is characterized by comprising the following steps:

acquiring a low-resolution LR image to be reconstructed;

inputting the low-resolution LR image to be reconstructed into a trained adaptive resampling super-resolution image reconstruction model based on a wide activation residual structure to obtain a high-resolution HR image output by the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure, wherein the adaptive resampling super-resolution image reconstruction model based on the wide activation residual structure comprises an adaptive interpolation kernel estimation layer and an adaptive resampling layer,

the adaptive resampling layer is configured to adaptively apply the adaptive interpolation kernel to a corresponding location of the LR image after generating the adaptive interpolation kernel to reconstruct the HR image,

the adaptive interpolation kernel estimation layer further comprises a shallow feature extraction layer, a plurality of stacked wide-activation residual modules, a deep feature extraction layer, and a Pixel Shuffle upsampling layer, the Pixel Shuffle upsampling layer employing a sub-Pixel convolutional network and being configured to spatially size-correspond the generated interpolation kernel to the HR image,

wherein the shallow feature extraction layer, the plurality of stacked wide activation residual modules, and the deep feature extraction layer form the wide activation residual structure and are configured to perform nonlinear reasoning of the adaptive interpolation kernel estimation layer to output a feature map of the LR image,

the plurality of stacked wide activation residual modules comprise two layers of the same dilation convolutions with the dilation rate r >1, a ReLU activation function is adopted between the two layers of dilation convolutions,

increasing the number of channels of the output feature map in the first layer of convolution, inputting the number of channels into the ReLU activation function, and reducing the number of channels of the output feature map in the second layer of convolution;

2. The wide-activation-based super-resolution image reconstruction method according to claim 1, further comprising:

3. The wide-activation-based super-resolution image reconstruction method according to claim 1, wherein the adaptive resampling layer is configured to adaptively apply the adaptive interpolation kernel to a corresponding location of the LR image after generating the adaptive interpolation kernel to reconstruct the HR image comprises:

scaling the input LR image to a desired size of the HR image based on the interpolation algorithm to obtain an intermediate image X_int, wherein X_int represents a temporary image of the same size as the target HR image, obtained directly by interpolation from the LR image; and

4. A wide-activation-based super-resolution image reconstruction apparatus for performing the wide-activation-based super-resolution image reconstruction method as claimed in claim 1.

5. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the wide-activation-based super-resolution image reconstruction method as claimed in any one of claims 1-3.

6. A non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the wide-activation-based super-resolution image reconstruction method as claimed in any one of claims 1-3.