CN111461987B - Network construction method, image super-resolution reconstruction method and system - Google Patents


Info

Publication number
CN111461987B
Authority
CN
China
Prior art keywords
convolution layer
ffg
resolution
rcab
channels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010250271.8A
Other languages
Chinese (zh)
Other versions
CN111461987A (en
Inventor
Sun Xu (孙旭)
Dong Xiaoyu (董晓宇)
Gao Lianru (高连如)
Zhang Bing (张兵)
Current Assignee
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Priority date
Filing date
Publication date
Application filed by Aerospace Information Research Institute of CAS filed Critical Aerospace Information Research Institute of CAS
Priority claimed from application CN202010250271.8A
Publication of CN111461987A
Application granted
Publication of CN111461987B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00: Geometric image transformations in the plane of the image
    • G06T 3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053: Scaling of whole images or parts thereof, e.g. expanding or contracting, based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a network construction method and an image super-resolution reconstruction method and system, comprising the following steps: constructing an SFnet from a preset first convolution layer, an SFS, and an upsampling module, and training the SFnet on sample data to obtain a super-resolution network model; then inputting a first-resolution image of M rows, N columns, and A channels into the super-resolution network model for resolution enhancement, to obtain a second-resolution image of r×M rows, r×N columns, and A channels. In this scheme, while processing the first-resolution image, the SFS extracts and integrates multi-level information, exploiting the feature information of the first-resolution image at both the global and local levels, so that the super-resolution network model can raise the resolution of the first-resolution image while ensuring high fidelity, yielding a second-resolution image with both high fidelity and high resolution.

Description

Network construction method, image super-resolution reconstruction method and system
Technical Field
The invention relates to the technical field of image processing, in particular to a network construction method, an image super-resolution reconstruction method and a system.
Background
Super-resolution (SR) refers to recovering a high-resolution image from a low-resolution one. With the development of science and technology, super-resolution reconstruction is widely used in fields such as surveillance, medicine, and remote sensing. How to restore a low-resolution image to a high-resolution image is a problem that needs to be solved today.
Disclosure of Invention
In view of this, the embodiments of the present invention provide a network construction method, an image super-resolution reconstruction method and a system for recovering a low-resolution image into a high-resolution image.
In order to achieve the above object, the embodiment of the present invention provides the following technical solutions:
the first aspect of the embodiment of the invention discloses a network construction method, which comprises the following steps:
constructing a first convolution layer from C convolution kernels of size t×t×A, where C and A are positive integers and t is a positive odd number;
constructing a second-order feedforward structure (SFS) from G first-order feedforward groups (FFG), a first feature concatenation operation, and a second convolution layer, where each FFG is composed of B residual channel attention blocks (RCAB), a second feature concatenation operation, and a third convolution layer; each RCAB is composed of a fourth convolution layer, a linear rectification function (ReLU) layer, a fifth convolution layer, and a channel attention (CA) module; and G and B are positive integers;
connecting the output of the first convolution layer to the input of the SFS, and connecting the output of the SFS to the input of an upsampling module, to construct a second-order feedforward network (SFnet), where the upsampling module performs an r-fold upsampling operation, r being any real number greater than 1, and the convolution layer in the upsampling module is composed of A×r² convolution kernels of size t×t×C;
and training the SFnet on sample data to obtain a super-resolution network model.
Preferably, constructing the second-order feedforward structure SFS from the G first-order feedforward groups FFG, the first feature concatenation operation, and the second convolution layer includes:
connecting the G FFGs in sequence, and connecting the output of each FFG to an input of the first feature concatenation operation, where the input of the 1st FFG is also connected to an input of the first feature concatenation operation;
and connecting the output of the first feature concatenation operation to the input of a second convolution layer to construct the SFS, where the second convolution layer is composed of C convolution kernels of size t×t×C(G+1).
Preferably, constructing the FFG from the B residual channel attention blocks RCAB, the second feature concatenation operation, and the third convolution layer includes:
connecting the B RCABs in sequence, and connecting the output of each RCAB to an input of the second feature concatenation operation, where the input of the 1st RCAB is also connected to an input of the second feature concatenation operation;
and connecting the output of the second feature concatenation operation to the input of a third convolution layer to construct the FFG, where the third convolution layer is composed of C convolution kernels of size t×t×C(B+1).
Preferably, constructing the RCAB from the fourth convolution layer, the linear rectification function ReLU layer, the fifth convolution layer, and the channel attention CA module includes:
connecting the fourth convolution layer, the ReLU layer, the fifth convolution layer, and the CA module in sequence, and connecting the input of the fourth convolution layer to the output of the CA module through an element-wise summation unit, to construct the RCAB, where the fourth and fifth convolution layers are each composed of C convolution kernels of size t×t×C.
Preferably, if the convolution layer in the upsampling module is composed of C×r² convolution kernels of size t×t×C, then after connecting the output of the first convolution layer to the input of the SFS and the output of the SFS to the input of the upsampling module, the method further includes:
connecting the output of the upsampling module to the input of a sixth convolution layer to construct the SFnet, where the sixth convolution layer is composed of A convolution kernels of size t×t×C.
The second aspect of the embodiments of the invention discloses an image super-resolution reconstruction method, applied to a super-resolution network model constructed by the network construction method disclosed in the first aspect of the embodiments of the invention, the method including:
acquiring a first-resolution image of M rows, N columns, and A channels, where M, N, and A are positive integers;
and inputting the first-resolution image into the super-resolution network model for resolution enhancement, to obtain a second-resolution image of r×M rows, r×N columns, and A channels, where r is the resolution enhancement factor.
Preferably, the SFS is connected to the first convolution layer and the upsampling module, and inputting the first-resolution image into the super-resolution network model for resolution enhancement to obtain the second-resolution image of r×M rows, r×N columns, and A channels includes:
inputting the first-resolution image into the first convolution layer to obtain initial features of M rows, N columns, and C channels, where the first convolution layer is composed of C convolution kernels of size t×t×A and C is a positive integer;
inputting the initial features into the SFS to obtain SFS features of M rows, N columns, and C channels;
and inputting the SFS features into the upsampling module to obtain the second-resolution image of r×M rows, r×N columns, and A channels, where the convolution layer in the upsampling module is composed of A×r² convolution kernels of size t×t×C.
Preferably, the G FFGs are connected in sequence with the output of each connected to the first feature concatenation operation, the first convolution layer is connected to the first feature concatenation operation and to the 1st FFG, and the second convolution layer is connected to the first feature concatenation operation and to the upsampling module; inputting the initial features into the SFS to obtain the SFS features of M rows, N columns, and C channels includes:
inputting the initial features into the 1st FFG to obtain the FFG features of M rows, N columns, and C channels output by the 1st FFG, and inputting the initial features into the first feature concatenation operation;
inputting the FFG features output by the y-th FFG into the (y+1)-th FFG to obtain the FFG features output by the (y+1)-th FFG, and inputting the FFG features output by the y-th FFG into the first feature concatenation operation, where y is an integer greater than or equal to 1 and less than or equal to G; when y equals G, the FFG features output by the G-th FFG are input into the first feature concatenation operation;
integrating the initial features and all FFG features using the first feature concatenation operation to obtain FFG integrated features of M rows, N columns, and C(G+1) channels;
and inputting the FFG integrated features into the second convolution layer to obtain the SFS features of M rows, N columns, and C channels, where the second convolution layer is composed of C convolution kernels of size t×t×C(G+1).
Preferably, the B RCABs are connected in sequence with the output of each connected to the second feature concatenation operation, and the second feature concatenation operation is connected to the third convolution layer; inputting the initial features into the 1st FFG to obtain the FFG features of M rows, N columns, and C channels output by the 1st FFG includes:
inputting the initial features into the 1st RCAB to obtain the RCAB features of M rows, N columns, and C channels output by the 1st RCAB, and inputting the initial features into the second feature concatenation operation;
inputting the RCAB features output by the z-th RCAB into the (z+1)-th RCAB to obtain the RCAB features output by the (z+1)-th RCAB, and inputting the RCAB features output by the (z+1)-th RCAB into the second feature concatenation operation, where z is an integer greater than or equal to 1 and less than or equal to B; when z equals B, the RCAB features output by the B-th RCAB are input into the second feature concatenation operation;
integrating the initial features and all RCAB features using the second feature concatenation operation to obtain RCAB integrated features of M rows, N columns, and C(B+1) channels;
and inputting the RCAB integrated features into the third convolution layer to obtain the FFG features of M rows, N columns, and C channels output by the FFG, where the third convolution layer is composed of C convolution kernels of size t×t×C(B+1).
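As an illustration only, and not the patent's implementation, the dense-concatenation pattern of the FFG described above can be sketched in NumPy. The function name `ffg_forward`, the toy `block` stand-in for an RCAB, the 1×1 fusion convolution (t = 1, so the third convolution layer reduces to a matrix product), and all shapes are our assumptions:

```python
import numpy as np

def ffg_forward(X, block, w3, B):
    """FFG forward pass as described above: feed X through B blocks in sequence,
    concatenate X and every block output (C*(B+1) channels), then fuse back
    to C channels with the third convolution layer (t = 1 for brevity)."""
    feats = [X]
    h = X
    for _ in range(B):
        h = block(h)          # stands in for one RCAB
        feats.append(h)
    cat = np.concatenate(feats, axis=-1)           # M x N x C*(B+1)
    return np.tensordot(cat, w3, axes=([2], [1]))  # fusion: M x N x C

C, B = 4, 3
X = np.random.rand(5, 6, C)
block = lambda h: np.maximum(0, h - 0.5)           # toy stand-in block
w3 = np.random.rand(C, C * (B + 1))                # C kernels of size 1x1xC(B+1)
out = ffg_forward(X, block, w3, B)                 # 5 x 6 x 4 feature
```

The same skeleton, with the G FFGs in place of the B blocks and the second convolution layer as the fusion step, gives the SFS forward pass.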
Preferably, the fourth convolution layer, the ReLU layer, the fifth convolution layer, and the CA module are connected in sequence; inputting the initial features into the 1st RCAB to obtain the RCAB features of M rows, N columns, and C channels output by the 1st RCAB includes:
inputting the initial features into the fourth convolution layer to obtain first sub-features of M rows, N columns, and C channels;
processing the first sub-features with the ReLU layer to obtain second sub-features of M rows, N columns, and C channels;
inputting the second sub-features into the fifth convolution layer to obtain third sub-features of M rows, N columns, and C channels;
processing the third sub-features with the CA module to obtain fourth sub-features of M rows, N columns, and C channels;
performing element-wise summation of the fourth sub-features and the initial features to obtain the RCAB features of M rows, N columns, and C channels;
where the fourth and fifth convolution layers are each composed of C convolution kernels of size t×t×C.
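The five RCAB steps above can be sketched in NumPy as follows; this is an illustration only, with 1×1 convolutions (t = 1) so that each convolution layer reduces to a C×C channel-mixing matrix, and with a ReLU inside the CA module following the standard channel-attention design (both are our assumptions, not the patent's stated implementation):

```python
import numpy as np

def rcab_forward(X, w4, w5, w_down, w_up):
    """RCAB forward pass: conv, ReLU, conv, channel attention, skip connection."""
    F1 = np.tensordot(X, w4, axes=([2], [1]))      # fourth convolution layer
    F2 = np.maximum(0, F1)                         # ReLU layer
    F3 = np.tensordot(F2, w5, axes=([2], [1]))     # fifth convolution layer
    z = F3.mean(axis=(0, 1))                       # CA: global average pooling
    s = 1.0 / (1.0 + np.exp(-(w_up @ np.maximum(0, w_down @ z))))  # CA gate
    F4 = F3 * s                                    # CA: channel rescaling
    return F4 + X                                  # element-wise sum with input

C = 4
X = np.random.rand(3, 3, C)
w4 = np.random.rand(C, C)                          # 1x1 "fourth" conv weights
w5 = np.random.rand(C, C)                          # 1x1 "fifth" conv weights
w_down = np.random.rand(C // 2, C)                 # CA reduction (r' = 2)
w_up = np.random.rand(C, C // 2)                   # CA expansion
Y = rcab_forward(X, w4, w5, w_down, w_up)          # 3 x 3 x 4, same shape as X
```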
Preferably, if the convolution layer in the upsampling module is composed of C×r² convolution kernels of size t×t×C, the SFnet further includes a sixth convolution layer, and after inputting the initial features into the SFS to obtain the SFS features of M rows, N columns, and C channels, the method further includes:
inputting the SFS features into the upsampling module to obtain upsampling-module features of r×M rows, r×N columns, and C channels;
and inputting the upsampling-module features into the sixth convolution layer to obtain the second-resolution image of r×M rows, r×N columns, and A channels, where the sixth convolution layer is composed of A convolution kernels of size t×t×C.
A third aspect of the embodiments of the invention discloses a network construction system, including:
a first construction unit configured to construct a first convolution layer from C convolution kernels of size t×t×A, where C and A are positive integers and t is a positive odd number;
a second construction unit configured to construct a second-order feedforward structure SFS from G first-order feedforward groups FFG, a first feature concatenation operation, and a second convolution layer, where each FFG is composed of B residual channel attention blocks RCAB, a second feature concatenation operation, and a third convolution layer; each RCAB is composed of a fourth convolution layer, a linear rectification function ReLU layer, a fifth convolution layer, and a channel attention CA module; and G and B are positive integers;
a third construction unit configured to connect the output of the first convolution layer to the input of the SFS and the output of the SFS to the input of an upsampling module, to construct a second-order feedforward network SFnet, where the upsampling module performs an r-fold upsampling operation, r being any real number greater than 1, and the convolution layer in the upsampling module is composed of A×r² convolution kernels of size t×t×C;
and a training unit configured to train the SFnet on sample data to obtain a super-resolution network model.
The fourth aspect of the embodiments of the invention discloses an image super-resolution reconstruction system, applied to a super-resolution network model constructed by the network construction method disclosed in the first aspect of the embodiments of the invention, the system including:
an acquisition unit configured to acquire a first-resolution image of M rows, N columns, and A channels, where M, N, and A are positive integers;
and a processing unit configured to input the first-resolution image into the super-resolution network model for resolution enhancement, to obtain a second-resolution image of r×M rows, r×N columns, and A channels, where r is the resolution enhancement factor.
The network construction method and image super-resolution reconstruction method and system provided by the embodiments of the invention operate as follows: an SFnet is constructed from a preset first convolution layer, an SFS, and an upsampling module, and the SFnet is trained on sample data to obtain a super-resolution network model; a first-resolution image of M rows, N columns, and A channels is then input into the super-resolution network model for resolution enhancement, yielding a second-resolution image of r×M rows, r×N columns, and A channels. In this scheme, while processing the first-resolution image, the SFS extracts and integrates multi-level information and exploits the feature information of the first-resolution image at both the global and local levels, so that the super-resolution network model raises the resolution of the first-resolution image while ensuring high fidelity, producing a second-resolution image with both high fidelity and high resolution.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a convolution kernel according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a convolution layer according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a CA module according to an embodiment of the present invention;
FIG. 4 is a flowchart of a network construction method according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of the SFnet according to an embodiment of the present invention;
FIG. 6 is another schematic structural diagram of the SFnet according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an RCAB according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an FFG according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an SFS according to an embodiment of the present invention;
FIG. 10 is a flowchart of an image super-resolution reconstruction method according to an embodiment of the present invention;
FIG. 11 is a flowchart of obtaining a second-resolution image according to an embodiment of the present invention;
FIG. 12 is a flowchart of obtaining SFS features according to an embodiment of the present invention;
FIG. 13 is a flowchart of obtaining the 1st FFG features according to an embodiment of the present invention;
FIG. 14 is a block diagram of a network construction system according to an embodiment of the present invention;
FIG. 15 is a block diagram of an image super-resolution reconstruction system according to an embodiment of the present invention.
Detailed Description
The following describes the embodiments of the present invention clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort fall within the scope of the present invention.
In the present invention, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As described in the background, super-resolution reconstruction technology is widely applied in fields such as surveillance, medicine, and remote sensing, and how to restore a low-resolution image to a high-resolution image is a problem that needs to be solved today.
Therefore, the embodiments of the invention provide a network construction method and an image super-resolution reconstruction method and system, in which a first-resolution image of M rows, N columns, and A channels is input into a pre-trained super-resolution network model for resolution enhancement, obtaining a second-resolution image of r×M rows, r×N columns, and A channels, thereby restoring a low-resolution image to a high-resolution image.
It is understood that a low-resolution image refers to an image whose resolution is lower than a first preset resolution, and a high-resolution image refers to an image whose resolution is higher than a second preset resolution; the definitions of low-resolution and high-resolution images are not particularly limited here.
The embodiments of the invention involve various operations; to aid understanding, the operations involved are described below. The following 12 operations are given for illustration only.
The 1st operation content: the convolution kernel. One convolution kernel of size t×t×C is a three-dimensional array of t×t×C weights used for performing the convolution operation, where C is a positive integer and t is a positive odd number.
The structure of the convolution kernel is shown in FIG. 1; the cuboid in FIG. 1 represents a three-dimensional array of size t×t×C.
The 2nd operation content: the convolution layer (Conv). n convolution kernels of size t×t×C form one convolution layer. Combined with the 1st operation content, all convolution kernels in the convolution layer together contain n×t×t×C weights for performing the convolution operation; that is, the convolution layer is a four-dimensional array of n×t×t×C weights (denoted w).
The structure of the convolution layer is shown in FIG. 2; in FIG. 2 the convolution layer is unfolded into n convolution kernels of size t×t×C.
The 3rd operation content: the convolution function. The convolution function refers to performing a convolution operation on an image X of M rows, N columns, and C channels using one convolution layer w formed from n convolution kernels of size t×t×C. The convolution operation is shown in equation (1); the output Y is an image of M rows, N columns, and n channels.
Y = F_Conv(X, w)    (1)
It should be noted that M, N, and n are positive integers, M is the length of image X, N is its width, and a channel may also be called a band. That is, image X is not flat but a cube of size M×N×C.
For example, an image in RGB (red, green, blue) format has three bands: red, green, and blue; that is, the size of the RGB image is M rows, N columns, and 3 bands (channels).
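As an illustration only, and not the patent's implementation, the convolution function of equation (1) can be sketched in NumPy with a naive "same"-padded loop; the function name `conv_layer` and the example shapes are our assumptions:

```python
import numpy as np

def conv_layer(X, w):
    """Naive F_Conv of equation (1): X is M x N x C, w is n x t x t x C
    (n kernels of size t x t x C); the output is M x N x n ("same" padding)."""
    M, N, C = X.shape
    n, t = w.shape[0], w.shape[1]
    p = t // 2
    Xp = np.pad(X, ((p, p), (p, p), (0, 0)))   # zero-pad rows and columns
    Y = np.empty((M, N, n))
    for k in range(n):                         # one output channel per kernel
        for i in range(M):
            for j in range(N):
                Y[i, j, k] = np.sum(Xp[i:i + t, j:j + t, :] * w[k])
    return Y

# A 4 x 5 x 3 image convolved with 2 kernels of size 3 x 3 x 3 gives 4 x 5 x 2.
X = np.ones((4, 5, 3))
w = np.ones((2, 3, 3, 3))
Y = conv_layer(X, w)
```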
The 4th operation content: the feature concatenation operation. The feature concatenation operation refers to concatenating n images X_1, X_2, ..., X_n, each of M rows, N columns, and C channels; as shown in equation (2), the output image Y is an image of M rows, N columns, and n×C channels.
Y = F_Concatenate(X_1, X_2, ..., X_n)    (2)
For example, assuming image 1 (feature 1) and image 2 (feature 2) are each of M rows, N columns, and 64 channels, performing the feature concatenation operation on image 1 and image 2 yields an image Y of M rows, N columns, and 128 channels.
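A minimal NumPy illustration of the example above (the 8×8 spatial size is our choice):

```python
import numpy as np

# Feature concatenation of equation (2): two 8 x 8 x 64 features
# concatenated along the channel axis give one 8 x 8 x 128 feature.
feature1 = np.zeros((8, 8, 64))
feature2 = np.ones((8, 8, 64))
Y = np.concatenate([feature1, feature2], axis=-1)
```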
The 5th operation content: the linear rectification function (Rectified Linear Unit, ReLU). For a scalar x, ReLU is defined as in equation (3).
y = f_ReLU(x) = max(0, x)    (3)
For a vector x = [x_1, ..., x_c, ..., x_C], ReLU is defined as in equation (4), where y = [y_1, ..., y_c, ..., y_C] and y_c = f_ReLU(x_c) = max(0, x_c).
y = F_ReLU(x)    (4)
For an image X = [x_ijc]_(M×N×C) of M rows, N columns, and C channels, ReLU is defined as in equation (5), where Y = [y_ijc]_(M×N×C) and y_ijc = f_ReLU(x_ijc) = max(0, x_ijc).
Y = F_ReLU(X)    (5)
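In NumPy all three cases of equations (3) to (5) reduce to the same element-wise maximum (the example values are ours):

```python
import numpy as np

# ReLU of equations (3)-(5): max(0, x) applied element-wise, whether the
# input is a scalar, a vector, or an M x N x C image.
relu = lambda x: np.maximum(0, x)

x_scalar = -2.5
x_image = np.array([[[-1.0, 2.0], [3.0, -4.0]]])  # a 1 x 2 x 2 image
y_scalar = relu(x_scalar)                         # 0.0
y_image = relu(x_image)                           # negatives clipped to 0
```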
The 6th operation content: the sigmoid function, whose role is to map its input into the interval (0, 1).
For a scalar x, the sigmoid function is defined as in equation (6), where y ∈ (0, 1).
y = f_sigmoid(x) = 1 / (1 + e^(−x))    (6)
For a vector x = [x_1, ..., x_c, ..., x_C], the sigmoid function is defined as in equation (7), where y = [y_1, ..., y_c, ..., y_C], y_c = f_sigmoid(x_c), and y_c ∈ (0, 1).
y = F_sigmoid(x)    (7)
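A short NumPy illustration of equations (6) and (7) (the example inputs are ours):

```python
import numpy as np

# Sigmoid of equations (6)-(7): 1 / (1 + e^(-x)), mapping inputs into (0, 1).
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

y0 = sigmoid(0.0)                       # 0.5
yv = sigmoid(np.array([-10.0, 10.0]))   # close to 0 and close to 1
```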
The 7th operation content: element-wise summation. Element-wise summation refers to the following: for an image X = [x_ijc]_(M×N×C) of M rows, N columns, and C channels and an image Y = [y_ijc]_(M×N×C) of M rows, N columns, and C channels, the element-wise (pixel-wise) summation of the two is denoted X ⊕ Y; the result Z = [z_ijc]_(M×N×C) is still an image of M rows, N columns, and C channels, where z_ijc = x_ijc + y_ijc.
The 8 th operation content: multiplying element by element. Element-wise multiplication refers to: for a vector x= [ x ] of dimension C 1 ,...,x c ,...,x C ]Sum vector y= [ y ] 1 ,...,y c ,...,y C ]The element-by-element multiplication between the two is recorded asThe result of the operation is z= [ z 1 ,...,z c ,...,z C ]Wherein z is c =x c ×y c
For a vector x= [ x ] of dimension C 1 ,...,x c ,...,x C ]And an image y= [ Y ] of M rows and N columns of C channels ijc ] M×N×C The element-wise multiplication between the two is noted asWherein z is ijc =x c ×y ijc
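In NumPy both the 7th and 8th operations reduce to broadcasting; the vector-times-image case broadcasts the C-vector over the channel axis (the shapes and values below are ours):

```python
import numpy as np

X = np.full((2, 3, 4), 2.0)     # M x N x C image, all 2.0
Y = np.full((2, 3, 4), 3.0)     # M x N x C image, all 3.0
Z_sum = X + Y                   # element-wise summation: z_ijc = x_ijc + y_ijc

v = np.arange(4.0)              # C-dimensional vector [0, 1, 2, 3]
Z_mul = v * Y                   # vector-image product: z_ijc = x_c * y_ijc
```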
The 9th operation content: sub-pixel convolution. Sub-pixel convolution merges and pixel-reorders (pixel-shuffles) the different channels of an image, sacrificing the number of channels to enlarge both the length and width of the image. To raise the resolution of an image by a factor of r, for an image X of M rows, N columns, and C×r² channels, the sub-pixel convolution operates as in equation (8).
Y = F_PixelShuffle(X, r)    (8)
In equation (8), r is the resolution enhancement factor (also called the image magnification factor or upsampling rate), and Y, the result of the sub-pixel convolution, has size r×M rows, r×N columns, and C channels.
For example, for an image X of M rows, N columns, and 1×2² channels, applying the operation of equation (8) with r = 2 gives a result Y of size 2M rows, 2N columns, and 1 channel.
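A NumPy sketch of equation (8), for illustration only; the row-major channel ordering within each r×r block is our assumption (different frameworks order the sub-positions differently):

```python
import numpy as np

def pixel_shuffle(X, r):
    """F_PixelShuffle of equation (8): M x N x (C*r^2) -> r*M x r*N x C."""
    M, N, Cr2 = X.shape
    C = Cr2 // (r * r)
    X = X.reshape(M, N, r, r, C)       # split channels into r x r sub-positions
    X = X.transpose(0, 2, 1, 3, 4)     # interleave sub-rows and sub-columns
    return X.reshape(M * r, N * r, C)

# The example above: a 1 x 1 x 4 image with r = 2 becomes a 2 x 2 x 1 image.
X = np.arange(4.0).reshape(1, 1, 4)
Y = pixel_shuffle(X, 2)
```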
The 10th operation content: channel-wise global average pooling, denoted z = F_GP(X). For an image X = [x_ijc]_(M×N×C) of M rows, N columns, and C channels, global average pooling over each channel yields a C-dimensional channel-level statistics vector z = [z_1, ..., z_c, ..., z_C], where z_c is computed as in equation (9).
z_c = (1 / (M×N)) Σ_(i=1..M) Σ_(j=1..N) x_ijc    (9)
The 11th operation content: the channel attention (CA) module. For an image X = [x_ijc]_(M×N×C) of M rows, N columns, and C channels, the CA module is denoted X′ = F_CA(X, w_up, w_down) and is defined as in equation (10).
X′ = X ⊗ F_sigmoid(F_Conv(F_ReLU(F_Conv(F_GP(X), w_down)), w_up))    (10)
In equation (10), w_down is a convolution layer of C/r′ kernels of size 1×1×C, w_up is a convolution layer of C kernels of size 1×1×C/r′, and r′ is the vector-dimension transform factor in the CA module.
A schematic diagram of the CA module is shown in FIG. 3.
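For illustration only, the CA module can be sketched in NumPy; because the 1×1 convolutions act on the pooled C-vector, they reduce to matrix products here, and the ReLU between w_down and w_up follows the standard channel-attention design (our assumption, as are the shapes):

```python
import numpy as np

def channel_attention(X, w_down, w_up):
    """Sketch of F_CA in equation (10): pool, squeeze, excite, rescale."""
    z = X.mean(axis=(0, 1))                            # channel-wise GAP, eq. (9)
    gate = w_up @ np.maximum(0, w_down @ z)            # 1x1 down, ReLU, 1x1 up
    s = 1.0 / (1.0 + np.exp(-gate))                    # sigmoid: s_c in (0, 1)
    return X * s                                       # rescale each channel by s_c

C, r_prime = 8, 2
X = np.random.rand(4, 4, C)
w_down = np.random.rand(C // r_prime, C)               # C/r' kernels of size 1x1xC
w_up = np.random.rand(C, C // r_prime)                 # C kernels of size 1x1xC/r'
Xp = channel_attention(X, w_down, w_up)                # same shape as X
```

Since every gate value s_c lies in (0, 1), the module can only attenuate channels, never amplify them.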
The 12 th operation content: an upsampling module (upscalemodule). When the resolution of the image is increased by a factor r, the upsampling module is formed by n=c×r 2 Convolutional layer F consisting of convolutional kernels of size t×t×C Conv (X, w) convolving F with a subpixel PixelShuffle (X, r). That is, the up-sampling module may perform r-fold spatial size enlargement (also referred to as resolution enhancement and up-sampling) on the image (feature) of the input itself, denoted as F up (X,r)。
For image X of M rows and N columns of C channels, the upsampling operation of the upsampling module is as in equation (11).
Y = F_up(X, w, r) = F_PixelShuffle(F_Conv(X, w), r) (11)
The result Y in formula (11) has a size of r×M rows, r×N columns and C channels.
For example: the 2× up-sampling module F_up(X, 2) consists of a convolution layer F_Conv(X, w) composed of C×2^2 convolution kernels of size t×t×C, followed by the sub-pixel convolution F_PixelShuffle(X, 2). For an image (feature) X of M rows and N columns with C channels, the result Y calculated with formula (11) has size 2M×2N×C.
It should be noted that 4× up-sampling is achieved indirectly by two 2× up-sampling operations, i.e., Y = F_up(X, 4) = F_up(F_up(X, 2), 2). Other up-sampling multiples follow the foregoing in the same way and are not illustrated here.
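The composition F_up(F_up(X, 2), 2) can be sketched as follows. For brevity the t×t convolution is reduced to a 1×1 convolution (a matrix product), and the same random weight matrix is reused for both stages — both are simplifications for shape illustration only, not the patent's trained layers.

```python
import numpy as np

def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    m, n, cr2 = x.shape
    c = cr2 // (r * r)
    return x.reshape(m, n, c, r, r).transpose(0, 3, 1, 4, 2).reshape(m * r, n * r, c)

def upsample_2x(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """F_up(X, w, 2) per equation (11), with the convolution sketched as 1x1.

    w maps C channels to C*2^2 channels; pixel shuffle folds them back to C.
    """
    return pixel_shuffle(x @ w, 2)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 5, 4))        # M=3, N=5, C=4
w = rng.standard_normal((4, 16))          # C -> C*2^2 channels
y4 = upsample_2x(upsample_2x(x, w), w)    # two 2x steps give the indirect 4x
```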
Referring to fig. 4, a flowchart of a network construction method provided by an embodiment of the present invention is shown, where the network construction method includes:
step S401: construct the first convolution layer using C convolution kernels of size t×t×A.
C and A are positive integers, and t is a positive odd number.
As can be seen from the foregoing, a convolution layer is formed from a number of convolution kernels; in the specific implementation of step S401, the first convolution layer is constructed using C convolution kernels of size t×t×A.
Step S402: a second order feedforward structure (SFS) is constructed using G first order feedforward groups (first-order feedforward group, FFG), a first feature join operation, and a second convolution layer.
The FFG is composed of B residual channel attention blocks (Residual Channel Attention Block, RCAB), a second characteristic concatenation operation, and a third convolution layer. The RCAB is composed of a fourth convolution layer, a ReLU layer, a fifth convolution layer and a CA module. G and B are positive integers.
In the specific implementation process of step S402, the RCAB is constructed using the fourth convolution layer, the ReLU layer, the fifth convolution layer and the CA module, and the FFG is constructed from B RCABs, the second feature join operation and the third convolution layer.
That is, the RCAB and the FFG are constructed in advance, and then the SFS is constructed using G FFGs, the first feature join operation and the second convolution layer.
Step S403: an output of the first convolution layer is connected to an input of the SFS and an output of the SFS is connected to an input of the upsampling module to construct a second order feed forward network (SFnet).
It should be noted that, the up-sampling module is configured to perform an up-sampling operation r times, where r is any real number greater than 1.
In the specific implementation process of step S403, the output end of the first convolution layer is connected with the input end of the SFS, and the output end of the SFS is connected with the input end of the upsampling module, so as to construct SFnet.
That is, when an image is processed by the SFnet, an image of M rows, N columns and A channels is input into the first convolution layer, passes through the first convolution layer, the SFS and the up-sampling module in sequence, and the up-sampling module outputs the processed image of r×M rows, r×N columns and A channels.
When an image of M rows, N columns and A channels is input to the SFnet, it is necessary to ensure that the number of channels of the image output by the SFnet is unchanged, that is, the number of channels of the image output by the SFnet must also be A. In combination with the above 12 th operation content, the convolution layer in the up-sampling module is then composed of A×r^2 convolution kernels of size t×t×C. Fig. 5 shows a schematic diagram of the SFnet; it should be noted that the content shown in fig. 5 is only for illustration. The SFnet includes a first convolution layer 501, an SFS 502 and an up-sampling module 503.
Preferably, if the convolution layer in the up-sampling module is formed from C×r^2 convolution kernels of size t×t×C, the feature (image) output by the up-sampling module has C channels. From the foregoing, the image output by the SFnet must also have A channels, so a sixth convolution layer composed of A convolution kernels of size t×t×C may be connected after the up-sampling module.
Referring to fig. 6 in conjunction with fig. 5, another structural schematic diagram of the SFnet is shown, the SFnet further comprising: sixth convolution layer 504.
That is, SFnet is made up of a first convolution layer, SFS, up-sampling module, and sixth convolution layer. The output end of the first convolution layer is connected with the input end of the SFS, the output end of the SFS is connected with the input end of the up-sampling module, the output end of the up-sampling module is connected with the input end of the sixth convolution layer, and the channel number of the characteristics output by the up-sampling module is converted into A by the sixth convolution layer.
In fig. 6, the convolution layer in the up-sampling module is composed of C×r^2 convolution kernels of size t×t×C, and the sixth convolution layer is composed of A convolution kernels of size t×t×C. The output end of the up-sampling module is connected with the input end of the sixth convolution layer.
That is, an image of M rows, N columns and A channels is input into the first convolution layer, passes through the first convolution layer, the SFS, the up-sampling module and the sixth convolution layer in sequence, and the sixth convolution layer outputs the processed image of r×M rows, r×N columns and A channels.
Step S404: and training SFnet based on the sample data to obtain the super-resolution network model.
In the specific implementation process of step S404, after the SFnet is constructed, the SFnet is trained by using the sample data until the SFnet converges, so as to obtain the super-resolution network model.
That is, the images of the M rows and N columns of a channels are input into the super-resolution network model, so as to obtain the images of the r×m rows and r×n columns of a channels, and the structure of the super-resolution network model can be seen in fig. 5 and 6.
The details of training SFnet are as described below.
A high-resolution sample image set containing J high-resolution images is acquired, and the resolution of each high-resolution image in the set is reduced by a factor of r to obtain a low-resolution sample image set containing J low-resolution images.
The i-th low-resolution image in the low-resolution sample image set is input into the SFnet for resolution improvement to obtain a super-resolution reconstructed image. It will be appreciated that the i-th low-resolution image corresponds to the i-th high-resolution image in the high-resolution sample image set.
The model parameters in the SFnet are initialized using the Adam algorithm. Taking the minimum error between the super-resolution reconstructed image and its corresponding high-resolution image as the objective, the loss function is minimized during training to optimize the model parameters in the SFnet, yielding an optimized model parameter set, at which point the SFnet has converged. It should be noted that the elements of the optimized parameter set are the weights of all the convolution layers in the SFnet.
It will be appreciated that the minimum error between the super-resolution reconstructed image and its corresponding high-resolution image refers to the minimum value of the loss function.
It should be noted that the loss function may be set according to practical situations, for example, a first loss function shown in equation (12) and a second loss function shown in equation (13).
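Equations (12) and (13) themselves are not reproduced in this extract. Common choices for a first and second loss function in super-resolution training are the mean absolute error (L1) and the mean squared error (L2); the sketch below assumes those two, which may or may not match the patent's exact definitions.

```python
import numpy as np

def l1_loss(sr: np.ndarray, hr: np.ndarray) -> float:
    """Mean absolute error between a reconstruction and its ground truth."""
    return float(np.mean(np.abs(sr - hr)))

def l2_loss(sr: np.ndarray, hr: np.ndarray) -> float:
    """Mean squared error between a reconstruction and its ground truth."""
    return float(np.mean((sr - hr) ** 2))

# A reconstruction that is uniformly off by 1 has L1 loss 1 and L2 loss 1.
a = l1_loss(np.zeros((2, 2)), np.ones((2, 2)))
b = l2_loss(np.full((2, 2), 2.0), np.zeros((2, 2)))
```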
In the embodiment of the invention, the SFnet is constructed using the preset first convolution layer, the SFS and the up-sampling module, and the SFnet is trained on sample data to obtain the super-resolution network model. The resolution of an image of M rows, N columns and A channels is then improved by the super-resolution network model to obtain an image of r×M rows, r×N columns and A channels. The SFS extracts and integrates multi-level information and exploits the feature information of the image at both the global level and the local level, so that the super-resolution network model improves the resolution of the image on the premise of ensuring high fidelity, yielding an image with high fidelity and high resolution.
For the process of constructing an RCAB referred to in step S402 of fig. 4, fig. 7 shows a schematic structural diagram of the RCAB provided by an embodiment of the present invention, where the RCAB includes: a fourth convolution layer 701, a ReLU layer 702, a fifth convolution layer 703 and a CA module 704.
In fig. 7, the fourth convolution layer, the ReLU layer, the fifth convolution layer, and the CA module are sequentially connected to construct an RCAB.
That is, the output of the fourth convolution layer is connected to the input of the ReLU layer, the output of the ReLU layer is connected to the input of the fifth convolution layer, and the output of the fifth convolution layer is connected to the input of the CA module.
The input end of the fourth convolution layer is connected with the output end of the CA module through an element-by-element summation unit, that is, the image (feature) input into the fourth convolution layer and the image (feature) output by the CA module are subjected to element-by-element summation operation.
The fourth convolution layer and the fifth convolution layer are formed of C convolution kernels having a size of t×t×c.
The operation of the RCAB can be denoted F_RCAB(X, W_RCAB), which is defined as in equation (14).
In equation (14), X is the image (feature) input to the RCAB, and the set W_RCAB = (w_RCAB,1, w_RCAB,2, w_RCAB,up, w_RCAB,down) comprises the weights of the fourth convolution layer, the fifth convolution layer and the two 1×1 convolution layers of the CA module that make up the RCAB.
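As a hedged illustration of equation (14), the following NumPy sketch composes the pieces defined above: conv, ReLU, conv, channel attention, then an element-by-element residual sum with the input. Both t×t convolutions are reduced to 1×1 (matrix products), so this shows the data flow and shapes, not the patent's actual layers.

```python
import numpy as np

def rcab(x, w1, w2, w_down, w_up):
    """Sketch of F_RCAB: conv -> ReLU -> conv -> CA, plus a skip connection."""
    h = np.maximum(x @ w1, 0.0)               # fourth convolution layer + ReLU
    h = h @ w2                                # fifth convolution layer
    z = h.mean(axis=(0, 1))                   # CA: global average pooling
    s = np.maximum(z @ w_down, 0.0) @ w_up    # CA: 1x1 down- and up-projection
    s = 1.0 / (1.0 + np.exp(-s))              # CA: sigmoid gate
    return x + h * s                          # element-by-element residual sum

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 4, 8))            # M=4, N=4, C=8
out = rcab(x,
           rng.standard_normal((8, 8)),       # w_RCAB,1
           rng.standard_normal((8, 8)),       # w_RCAB,2
           rng.standard_normal((8, 2)),       # w_RCAB,down (r' = 4)
           rng.standard_normal((2, 8)))       # w_RCAB,up
```

The residual sum is what makes the block's output the same M×N×C size as its input, which the FFG relies on when chaining B of these blocks.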
For the process of constructing the FFG referred to in step S402 of fig. 4, fig. 8 shows a schematic structural diagram of the FFG provided by an embodiment of the present invention, where the FFG includes: B RCABs 801, a second feature join operation 802 and a third convolution layer 803.
It should be noted that RCAB-B in fig. 8 indicates the B-th RCAB.
The B RCABs are connected in sequence, that is, the output of the 1 st RCAB is connected to the input of the 2 nd RCAB, the output of the 2 nd RCAB is connected to the input of the 3 rd RCAB, and so on.
The output of each RCAB is connected to the input of the second feature connection operation, and it should be noted that the input of the 1 st RCAB is connected to the input of the second feature connection operation, that is, the image (feature) input to the 1 st RCAB is also input to the second feature connection operation.
The output end of the second feature join operation is connected with the input end of the third convolution layer to construct the FFG; the third convolution layer is composed of C convolution kernels of size t×t×C(B+1).
The operation of the FFG can be denoted F_FFG(X, W_FFG), with the specific definition given in formula (15).
In formula (15), X is the image (feature) input to the FFG, and the set W_FFG = (w_RCAB,1, w_RCAB,2, ..., w_RCAB,B, w_FFG) comprises the weights of the B RCABs in the FFG and of the third convolution layer.
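The feedforward pattern of formula (15) can be sketched generically: run the B blocks in sequence, concatenate the input together with every block's output along the channel axis, then fuse back to C channels. The blocks here are placeholder callables standing in for the RCABs, and the third convolution layer is again sketched as a 1×1 convolution.

```python
import numpy as np

def ffg(x, blocks, w_fuse):
    """Sketch of F_FFG: B blocks in sequence, dense concatenation, 1x1 fusion.

    x: (M, N, C); blocks: list of B callables; w_fuse: (C*(B+1), C).
    """
    feats, h = [x], x
    for block in blocks:                      # B RCABs run one after another
        h = block(h)
        feats.append(h)
    cat = np.concatenate(feats, axis=-1)      # second feature join: C*(B+1) channels
    return cat @ w_fuse                       # third convolution layer

rng = np.random.default_rng(2)
x = rng.standard_normal((4, 4, 8))            # C = 8
blocks = [lambda t: np.maximum(t, 0.0)] * 3   # B = 3 placeholder blocks
w_fuse = rng.standard_normal((8 * 4, 8))      # C*(B+1) -> C
y = ffg(x, blocks, w_fuse)
```

The SFS of formula (16) repeats exactly this pattern one level up, with G FFGs in place of the B RCABs.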
For the construction process of the SFS referred to in step S402 of fig. 4, fig. 9 shows a schematic structural diagram of the SFS provided by an embodiment of the present invention, where the SFS includes: G FFGs 901, a first feature join operation 902 and a second convolution layer 903.
Note that FFG-G in fig. 9 indicates the G-th FFG.
The G FFGs are connected in sequence, that is, the output terminal of the 1 st FFG is connected to the input terminal of the 2 nd FFG, the output terminal of the 2 nd FFG is connected to the input terminal of the 3 rd FFG, and so on until all G FFGs are connected in sequence.
The output end of each FFG is connected to the input end of the first characteristic connection operation, and it should be noted that the input end of the 1 st FFG is connected to the input end of the first characteristic connection operation. That is, the image (feature) input to the 1 st FFG is also input to the first feature connection operation.
The output end of the first feature join operation is connected with the input end of the second convolution layer to construct the SFS; the second convolution layer is composed of C convolution kernels of size t×t×C(G+1).
The operation of the SFS can be denoted F_SFS(X, W_SFS), with the specific definition given in formula (16).
In formula (16), X is the image (feature) input to the SFS, and the set W_SFS = (w_FFG,1, w_FFG,2, ..., w_FFG,G, w_SFS) comprises the weights of the G FFGs in the SFS and of the second convolution layer.
In connection with the structures shown in figs. 7-9, the operation of the SFnet can be denoted F_SFnet(X, W_SFnet, r). When the structure of the SFnet is as shown in fig. 5, that is, the SFnet includes the first convolution layer, the SFS and the up-sampling module, F_SFnet(X, W_SFnet, r) is defined as in formula (17).
F_SFnet(X, W_SFnet, r) = F_up(F_SFS(F_Conv(X, w_SFnet,1), W_SFS), w_up, r) (17)
In formula (17), X is the image input to the SFnet, and the set W_SFnet = (W_SFS, w_up, w_SFnet,1) comprises the convolution layers in the SFS, the convolution layer w_up in the up-sampling module and the first convolution layer w_SFnet,1.
When the structure of the SFnet is as shown in fig. 6, that is, the SFnet includes the first convolution layer, the SFS, the up-sampling module and the sixth convolution layer, F_SFnet(X, W_SFnet, r) is defined as in formula (18).
F_SFnet(X, W_SFnet, r) = F_Conv(F_up(F_SFS(F_Conv(X, w_SFnet,1), W_SFS), w_up, r), w_SFnet,2) (18)
In formula (18), the set W_SFnet = (W_SFS, w_up, w_SFnet,1, w_SFnet,2) comprises the convolution layers in the SFS, the convolution layer w_up in the up-sampling module, the first convolution layer w_SFnet,1 and the sixth convolution layer w_SFnet,2.
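The channel bookkeeping of formula (18) — A channels in, C channels through the body, A channels out at r× the spatial size — can be traced end to end. As before, every convolution is sketched as 1×1 with random weights, and the SFS is replaced by an identity placeholder; only the shapes are meaningful here.

```python
import numpy as np

def pixel_shuffle(x, r):
    m, n, cr2 = x.shape
    c = cr2 // (r * r)
    return x.reshape(m, n, c, r, r).transpose(0, 3, 1, 4, 2).reshape(m * r, n * r, c)

def sfnet(x, w_first, sfs, w_up, w_sixth, r):
    """Shape-level sketch of formula (18): A -> C -> C -> C (r x larger) -> A."""
    h = x @ w_first                       # first convolution layer: A -> C channels
    h = sfs(h)                            # SFS keeps the (M, N, C) shape
    h = pixel_shuffle(h @ w_up, r)        # up-sampling module: C -> C*r^2 -> shuffle
    return h @ w_sixth                    # sixth convolution layer: C -> A channels

rng = np.random.default_rng(3)
x = rng.standard_normal((6, 5, 3))                # M=6, N=5, A=3
y = sfnet(x,
          rng.standard_normal((3, 8)),            # w_SFnet,1: A=3 -> C=8
          lambda t: t,                            # identity stand-in for the SFS
          rng.standard_normal((8, 8 * 4)),        # up-sampling conv: C -> C*2^2
          rng.standard_normal((8, 3)),            # w_SFnet,2: C=8 -> A=3
          r=2)
```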
Note that, the operation contents in fig. 4 to 9 may refer to the relevant contents in the 12 kinds of operation contents, and will not be described herein.
In the embodiment of the invention, the FFG is constructed from the pre-constructed RCABs, and the SFS is constructed from the constructed FFGs. The SFnet is constructed and trained using the SFS to obtain the super-resolution network model. The SFS extracts and integrates multi-level information and exploits the feature information of the image at both the global level and the local level, so that the super-resolution network model improves the resolution of the image on the premise of ensuring high fidelity, yielding an image with high fidelity and high resolution.
Corresponding to the network construction method provided by the above-mentioned embodiment of the present invention, referring to fig. 10, an image super-resolution reconstruction method provided by the embodiment of the present invention is shown, where the image super-resolution reconstruction method is applied to a super-resolution network model constructed by the above-mentioned network construction method, and the image super-resolution reconstruction method includes:
Step S1001: a first resolution image of M rows and N columns of a channels is acquired.
Note that M, N and a are positive integers.
It will be appreciated that the first resolution image is an image that needs to be subjected to image processing, i.e. the first resolution image is a low resolution image that needs to be resolution-enhanced, for example: a first resolution image of M rows and N columns 3 channels in rgb format is acquired.
Step S1002: and inputting the first resolution image into a super-resolution network model to perform resolution improvement to obtain a second resolution image of r x M rows r x N columns A channels.
It should be noted that r is a multiple of resolution improvement, and the super-resolution network model is constructed according to the network construction method.
The super-resolution network model comprises a first convolution layer, an SFS and an up-sampling module, and the specific structure can be seen in FIG. 5. Alternatively, the super-resolution network model includes a first convolution layer, an SFS, an upsampling module, and a sixth convolution layer, and the specific structure can be seen in fig. 6.
From the above, after resolution enhancement is performed on the first resolution image of the M rows, N columns and a channels, the number of channels of the obtained second resolution image needs to be identical to the number of channels of the first resolution image.
That is, in the specific implementation step S1002, the first resolution image of the M rows, N columns, and a channels is input into the super resolution network model to perform resolution enhancement, so as to obtain the second resolution image of the r x M rows, r x N columns, and a channel, that is, the high resolution image with the channel number consistent with the low resolution image.
For example: and inputting the first resolution image of M rows and N columns of 3 channels in the rgb format into a super-resolution network model for resolution improvement to obtain a second resolution image of r x M rows and r x N columns of 3 channels.
In the embodiment of the invention, a first resolution image of M rows, N columns and A channels is input into the super-resolution network model for resolution improvement. The SFS in the super-resolution network model extracts and integrates multi-level information, exploiting the feature information of the first resolution image at the global level and the local level, so that the resolution of the first resolution image is improved by a factor of r on the premise of ensuring high fidelity, yielding the second resolution image of r×M rows, r×N columns and A channels.
For the process of obtaining the second resolution image referred to in step S1002 of fig. 10, fig. 11 shows a flowchart of obtaining the second resolution image provided by an embodiment of the present invention, including the following steps:
it should be noted that, the structure of the super-resolution network model is as shown in fig. 5, that is, the super-resolution network model includes a first convolution layer, an SFS and an upsampling module, where the SFS is connected to the first convolution layer and the upsampling module, respectively.
Step S1101: and inputting the first resolution image into a first convolution layer to obtain the initial characteristics of the M rows, the N columns and the C channels.
The first convolution layer is formed by C convolution kernels of size t×t×a, and C is a positive integer.
In the specific implementation process of step S1101, the first resolution image of the M row, N column and a channel is input into the first convolution layer by combining the 1 st to 3 rd operation contents, and the first convolution layer processes the first resolution image to obtain the initial characteristics of the M row, N column and C channel.
It will be appreciated that the image is referred to as a feature during the operation of the super-resolution network model, and the feature output by the super-resolution network model is referred to as an image.
Step S1102: inputting the initial features into SFS to obtain SFS features of M rows and N columns of C channels.
In the specific implementation process of step S1102, the initial features of the M rows and N columns of C channels output by the first convolution layer are input into the SFS, and the SFS features of the M rows and N columns of C channels are obtained by processing the initial features with the SFS.
Step S1103: and inputting the SFS characteristic into the up-sampling module to obtain a second resolution image of r.times.M rows r.times.N columns A channels.
In the structure of the super-resolution network model shown in fig. 5, the convolution layer in the up-sampling module is formed from A×r^2 convolution kernels of size t×t×C, so that after the first resolution image is input into the super-resolution network model, the number of channels of the output second resolution image matches the number of channels of the first resolution image.
In the specific implementation process of step S1103, according to the 12 th operation content, the SFS features of the M rows, N columns and C channels are input into the upsampling module, and the upsampling module is used to process the SFS features to obtain the second resolution image of the r x M rows, r x N columns and a channels, that is, the features of the r x M rows, r x N columns and a channels output by the upsampling module are the second resolution image.
Preferably, if the convolution layer in the up-sampling module is formed from C×r^2 convolution kernels of size t×t×C, the features output by the up-sampling module have r×M rows, r×N columns and C channels. To ensure that the number of channels of the second resolution image matches that of the first resolution image, a sixth convolution layer formed from A convolution kernels of size t×t×C is connected after the up-sampling module; the structure of the super-resolution network model is then as shown in fig. 6.
That is, the SFS feature is input to the upsampling module to obtain the upsampling module feature of the r x M row r x N column C channel. And inputting the up-sampling module characteristics of the r x M rows and the r x N columns of C channels into a sixth convolution layer to obtain a second resolution image of the r x M rows and the r x N columns of A channels.
In the embodiment of the invention, the SFS is utilized to extract and integrate multi-level information, exploiting the feature information of the first resolution image at the global level and the local level, so that the resolution of the first resolution image is improved by a factor of r on the premise of ensuring high fidelity, yielding the second resolution image of r×M rows, r×N columns and A channels.
For the process of obtaining the SFS feature referred to in step S1102 of fig. 11, fig. 12 shows a flowchart of obtaining the SFS feature provided by an embodiment of the present invention, including the following steps:
it should be noted that, the SFS includes G FFGs, a first feature connection operation and a second convolution layer, the G FFGs are sequentially connected and respective output ends are respectively connected with the first feature connection operation, the first convolution layer is respectively connected with the first feature connection operation and the 1 st FFG, and the second convolution layer is respectively connected with the first feature connection operation and the up-sampling module.
The specific structure of the SFS may be referred to as the content shown in fig. 9, and will not be described herein.
Step S1201: inputting the initial characteristic into the 1 st FFG to obtain FFG characteristics of M rows and N columns of C channels output by the 1 st FFG, and inputting the initial characteristic into a first characteristic connection operation.
In the specific implementation process of step S1201, the initial features of the M rows, N columns and C channels are obtained after the first resolution image of the M rows, N columns and a channels is input into the first convolution layer by combining the structure diagram of the SFnet shown in fig. 5 and the structure diagram of the SFS shown in fig. 9.
The first convolution layer inputs initial features of M rows and N columns of C channels into a 1 st FFG and a first feature connection operation respectively, and the 1 st FFG processes the initial features to obtain FFG features of the corresponding M rows and N columns of C channels.
Step S1202: and inputting the FFG characteristics output by the y-th FFG into the y+1th FFG to obtain the FFG characteristics output by the y+1th FFG, and inputting the FFG characteristics output by the y-th FFG into a first characteristic connection operation.
Y is an integer of 1 or more and G or less.
In the specific implementation process of step S1202, the FFG feature output by the 1 st FFG is input into the 2 nd FFG and the first feature connection operation, and the 2 nd FFG processes the FFG feature of the 1 st FFG to obtain the FFG feature corresponding to the 2 nd FFG. And inputting the FFG characteristics output by the 2 nd FFG into the 3 rd FFG and the first characteristic connection operation, and processing the FFG characteristics of the 2 nd FFG by the 3 rd FFG to obtain the corresponding FFG characteristics. And by analogy, inputting the FFG characteristics output by the y-th FFG into the y+1th FFG to obtain the FFG characteristics output by the y+1th FFG.
That is, the FFG feature received by each of the 2 nd and subsequent FFGs is the FFG feature output by the previous FFG, and the FFG feature output by each FFG is input to the first feature connection operation.
When y is equal to G, the FFG feature output by the G-th FFG is input only to the first feature connection operation. Each FFG feature has a size of M rows, N columns and C channels.
Step S1203: and integrating the initial characteristic and all FFG characteristics by using a first characteristic connection operation to obtain FFG integrated characteristics of M rows and N columns of C (G+1) channels.
In the specific implementation of step S1203, as can be seen from the contents of step S1201 and step S1202, the first feature connection operation receives the initial feature and the FFG feature output by each FFG, that is, (G+1) features in total.
According to the 4 th operation content, the first feature connection operation is utilized to integrate the initial feature and all FFG features, namely (G+1) features, so as to obtain FFG integrated features of M rows and N columns of C (G+1) channels.
Step S1204: and inputting the FFG integration characteristic into a second convolution layer to obtain the SFS characteristic of the M rows and N columns of C channels.
The second convolution layer is formed by C convolution kernels of size t×t×c (g+1). In the specific implementation process of step S1204, the FFG integration features of the M rows and N columns of C (g+1) channels are input into the second convolution layer, and the FFG integration features are processed by the second convolution layer to obtain SFS features of the M rows and N columns of C channels.
In the embodiment of the invention, the initial characteristics and FFG characteristics output by each FFG are transmitted to a first characteristic connection operation for combination to obtain SFS characteristics of M rows and N columns of C channels, and characteristic information of a first resolution image is utilized at a global level and a local level.
For the process of obtaining the 1 st FFG feature referred to in step S1201 of fig. 12, fig. 13 shows a flowchart of obtaining the 1 st FFG feature provided by an embodiment of the present invention, including the following steps:
the FFG includes: the structure of the FFG can be seen from the content shown in fig. 8, and the output ends of the B RCABs are connected with the second characteristic connection operation, the second characteristic connection operation and the third convolution layer, and the respective output ends of the B RCABs are connected with the second characteristic connection operation.
Step S1301: inputting the initial feature into the 1 st RCAB to obtain the RCAB feature of M rows and N columns of C channels output by the 1 st RCAB, and inputting the initial feature into a second feature connection operation.
In connection with the structure diagrams of SFnet and SFS shown in fig. 5 and 9, in the process of embodying step S1301, the first convolution layer inputs the initial feature of the M row N column C channel into the 1 st FFG in the SFS, that is, the initial feature into the 1 st RCAB in the 1 st FFG, and inputs the initial feature into the second feature join operation.
It will be appreciated that the RCAB includes a fourth convolution layer, a ReLU layer, a fifth convolution layer, and a CA module, which are sequentially connected, and the structure of the RCAB is shown in fig. 7.
After the initial feature is input into the 1 st RCAB, the fourth convolution layer processes the initial feature to obtain a first sub-feature of the M-row N-column C channel. And processing the first sub-feature by using the ReLU layer to obtain a second sub-feature of the M-row N-column C channel.
And inputting the second sub-feature into a fifth convolution layer, and processing the second sub-feature by the fifth convolution layer to obtain a third sub-feature of the M-row N-column C channel. And processing the third sub-feature by using the CA module to obtain a fourth sub-feature of the M rows and N columns of C channels. And carrying out element-by-element summation calculation on the fourth sub-feature and the initial feature to obtain RCAB features of M rows and N columns of C channels, namely obtaining the RCAB features of M rows and N columns of C channels after inputting the initial feature into the 1 st RCAB.
The fourth convolution layer and the fifth convolution layer are formed of C convolution kernels having a size of t×t×c.
Step S1302: inputting the RCAB characteristics of the z-th RCAB output into the z+1th RCAB to obtain the RCAB characteristics of the z+1th RCAB output, and inputting the RCAB characteristics of the z+1th RCAB output into a second characteristic connection operation.
z is an integer greater than or equal to 1 and less than or equal to B.
In the specific implementation process of step S1302, the RCAB feature output by the 1 st RCAB is input into the 2 nd RCAB and the second feature is connected to operate, and the 2 nd RCAB processes the RCAB feature output by the 1 st RCAB to obtain the corresponding RCAB feature. And inputting the RCAB characteristics of the 2 nd RCAB output into the 3 rd RCAB and second characteristic connection operation, and processing the RCAB characteristics of the 2 nd RCAB output by the 3 rd RCAB to obtain the corresponding RCAB characteristics. And by analogy, inputting the RCAB features of the z-th RCAB output into the z+1th RCAB to obtain the RCAB features of the z+1th RCAB output.
The RCAB features received by the 2 nd and subsequent RCABs are the RCAB features output by the previous RCAB, and the RCAB features output by each RCAB are input into the second feature join operation.
The number of RCAB is B, that is, when z is equal to B, the RCAB feature output by the B-th RCAB is input to the second feature concatenation operation. The process of obtaining each RCAB feature is described in step S1301, and each RCAB feature has a size of M rows and N columns of C channels.
Step S1303: and integrating the initial features and all RCAB features by using a second feature connection operation to obtain RCAB integrated features of the M-row N-column C (B+1) channels.
In the process of implementing step S1303 specifically, as can be seen from the contents in step S1301 and step S1302, the second feature connection operation includes the initial feature and the RCAB feature of each RCAB output, that is, includes (b+1) features.
According to the content of the fourth operation, the initial features and all RCAB features are integrated using the second feature connection operation to obtain the RCAB integrated features of M rows, N columns and C(B+1) channels.
Step S1304: input the RCAB integrated features into the third convolution layer to obtain the FFG features of M rows, N columns and C channels output by the 1st FFG.
It should be noted that, in the specific implementation of step S1304, the third convolution layer is formed of C convolution kernels of size t×t×C(B+1). The RCAB integrated features of M rows, N columns and C(B+1) channels are input into the third convolution layer, which processes them to obtain the FFG features of M rows, N columns and C channels, that is, the FFG features output by the 1st FFG.
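A shape-level sketch of steps S1303 and S1304, using NumPy and a 1×1 kernel (t = 1, an assumption made only to keep the example short): the (B+1) collected features are concatenated along the channel axis, and the fusion convolution maps C(B+1) channels back to C.

```python
import numpy as np

# Example sizes (assumptions, not values from the patent).
M, N, C, B = 8, 8, 16, 4

# (B+1) features of shape (M, N, C): the initial feature plus B RCAB outputs.
feats = [np.random.rand(M, N, C) for _ in range(B + 1)]

# Second feature connection operation: channel-wise concatenation.
integrated = np.concatenate(feats, axis=-1)          # (M, N, C*(B+1))
assert integrated.shape == (M, N, C * (B + 1))

# Third convolution layer with t = 1: a 1x1 conv is a per-pixel matrix
# multiply mapping C*(B+1) input channels to C output channels.
kernels = np.random.rand(C * (B + 1), C)
ffg_feature = integrated @ kernels                   # (M, N, C)
print(ffg_feature.shape)
```

For t > 1 the per-pixel multiply becomes a spatial convolution, but the channel arithmetic C(B+1) → C is the same.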
It will be appreciated that steps S1301 to S1304 compute the FFG features output by the 1st FFG. According to the contents of fig. 5 and fig. 9, the input end of the 1st FFG in the SFS is connected with the output end of the first convolution layer, that is, the input end of the 1st RCAB in the 1st FFG is connected with the output end of the first convolution layer.
Preferably, for the 2nd and each subsequent FFG, the feature input into the 1st RCAB of the FFG is the FFG feature output by the previous FFG. For example, the input of the 1st RCAB of the 2nd FFG is the FFG feature output by the 1st FFG, and the input of the 1st RCAB of the 3rd FFG is the FFG feature output by the 2nd FFG.
The process of obtaining the FFG features output by the 2nd and subsequent FFGs follows the contents shown in steps S1301 to S1304 above and is not repeated here.
In the embodiment of the invention, the initial features (or the FFG features output by the previous FFG) and the output of each RCAB are transmitted to the second feature connection operation to be combined, so that the FFG features of M rows, N columns and C channels are obtained and the feature information of the first resolution image is fully utilized.
Corresponding to the network construction method provided by the above embodiment of the present invention, referring to fig. 14, the embodiment of the present invention further provides a structural block diagram of a network construction system, where the network construction system includes: a first construction unit 1401, a second construction unit 1402, a third construction unit 1403, and a training unit 1404;
a first construction unit 1401 is configured to construct a first convolution layer using C convolution kernels of size t×t×A, where C and A are positive integers and t is a positive odd number.
A second construction unit 1402, configured to construct an SFS by using G FFGs, a first feature connection operation, and a second convolution layer, where the FFGs are composed of B RCABs, the second feature connection operation, and a third convolution layer, the RCABs are composed of a fourth convolution layer, a ReLU layer, a fifth convolution layer, and a CA module, and G and B are positive integers.
In a specific implementation, the second construction unit 1402, for constructing the SFS, is specifically configured to: sequentially connect the G FFGs and connect the output end of each FFG with the input end of the first feature connection operation, where the input end of the 1st FFG is connected with the input end of the first feature connection operation; and connect the output end of the first feature connection operation with the input end of the second convolution layer to construct the SFS, the second convolution layer being formed of C convolution kernels of size t×t×C(G+1).
In a specific implementation, the second construction unit 1402, for constructing the FFG, is specifically configured to: sequentially connect the B RCABs, connect the output end of each RCAB with the input end of the second feature connection operation, and connect the output end of the second feature connection operation with the input end of the third convolution layer to construct the FFG. The input end of the 1st RCAB is connected with the input end of the second feature connection operation, and the third convolution layer is formed of C convolution kernels of size t×t×C(B+1).
In a specific implementation, the second construction unit 1402, for constructing the RCAB, is specifically configured to: sequentially connect the fourth convolution layer, the ReLU layer, the fifth convolution layer and the CA module, and connect the input end of the fourth convolution layer with the output end of the CA module through the element-by-element summation unit to construct the RCAB, where the fourth convolution layer and the fifth convolution layer are formed of C convolution kernels of size t×t×C.
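The RCAB data flow just described (conv, ReLU, conv, channel attention, residual sum) can be sketched numerically as follows; the 1×1 convolutions and the pooling-plus-sigmoid form of the CA module are assumptions made for brevity, not the patented configuration:

```python
import numpy as np

# Minimal numeric sketch of one RCAB; example sizes are assumptions.
M, N, C = 8, 8, 16
rng = np.random.default_rng(0)
x = rng.standard_normal((M, N, C))        # input feature

w4 = rng.standard_normal((C, C)) * 0.1    # fourth convolution layer (1x1 stand-in)
w5 = rng.standard_normal((C, C)) * 0.1    # fifth convolution layer (1x1 stand-in)

h = x @ w4                                # first sub-feature
h = np.maximum(h, 0.0)                    # ReLU -> second sub-feature
h = h @ w5                                # third sub-feature

# CA module sketched as global average pooling + per-channel sigmoid gate,
# a common channel-attention form (an assumption of this sketch).
gate = 1.0 / (1.0 + np.exp(-h.mean(axis=(0, 1))))   # shape (C,)
h = h * gate                              # fourth sub-feature

out = x + h                               # element-by-element residual sum
assert out.shape == (M, N, C)
```

Every stage preserves the M×N×C shape, which is what lets the residual sum and the downstream feature concatenations line up.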
A third construction unit 1403 is configured to connect the output end of the first convolution layer with the input end of the SFS, and connect the output end of the SFS with the input end of an up-sampling module, where the up-sampling module is configured to perform an r-times up-sampling operation, r is any real number greater than 1, and the convolution layer in the up-sampling module is formed of (A×r) convolution kernels of size t×t×C.
Preferably, if the convolution layer in the up-sampling module is formed of (C×r) convolution kernels of size t×t×C, after connecting the output end of the SFS with the input end of the up-sampling module, the third construction unit 1403 is further configured to: connect the output end of the up-sampling module with the input end of a sixth convolution layer to construct the SFnet, where the sixth convolution layer is formed of A convolution kernels of size t×t×C.
A training unit 1404, configured to train the SFnet based on the sample data, to obtain a super-resolution network model.
In the embodiment of the invention, the SFnet is constructed using the preset first convolution layer, the SFS and the up-sampling module, and the SFnet is trained based on sample data to obtain the super-resolution network model. The resolution of an image of M rows, N columns and A channels is improved through the super-resolution network model to obtain an image of r×M rows, r×N columns and A channels. The SFS is used to extract and integrate multi-level information, and the feature information of the image is utilized at both the global level and the local level, so that the super-resolution network model improves the resolution of the image while ensuring high fidelity, yielding a high-fidelity, high-resolution image.
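As a shape-level illustration of the pipeline just summarized (a sketch with assumed example sizes, not part of the patented method): the first convolution layer and the SFS preserve the M×N spatial size, and only the up-sampling module enlarges it.

```python
# Shape bookkeeping for the SFnet pipeline described above; the letters
# match the text (M rows, N columns, A channels, scale factor r), while
# the example values below are assumptions for illustration.
def sfnet_output_shape(M, N, A, r):
    """Return the output shape for an M x N x A input at scale r.

    The first convolution layer and the SFS only change the channel
    count internally (A -> C -> C); the up-sampling module multiplies
    both spatial dimensions by r (integer r assumed in this sketch).
    """
    return (r * M, r * N, A)

print(sfnet_output_shape(32, 32, 3, 2))  # (64, 64, 3)
```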
Corresponding to the above method for reconstructing an image super-resolution provided by the embodiment of the present invention, referring to fig. 15, the embodiment of the present invention further provides a structural block diagram of an image super-resolution reconstruction system, where the image super-resolution reconstruction system is applied to a super-resolution network model constructed by the above disclosed network construction method, and the image super-resolution reconstruction system includes: an acquisition unit 1501 and a processing unit 1502;
An acquisition unit 1501, configured to acquire a first resolution image of M rows, N columns and A channels, where M, N and A are positive integers.
The processing unit 1502 is configured to input the first resolution image into the super-resolution network model for resolution enhancement, and obtain a second resolution image of r×m rows r×n columns a channels, where r is an enhancement multiple of the resolution.
In the embodiment of the invention, the first resolution image of M rows, N columns and A channels is input into the super-resolution network model for resolution improvement. The SFS in the super-resolution network model is used to extract and integrate multi-level information, and the feature information of the first resolution image is utilized at both the global level and the local level through the SFS, so that the resolution of the first resolution image is improved by a factor of r while ensuring high fidelity, yielding the second resolution image of r×M rows, r×N columns and A channels.
Preferably, the SFS is connected to the first convolution layer and the upsampling module, and, in conjunction with the content shown in fig. 15, the processing unit 1502 includes a first processing module, a second processing module, and a third processing module, where the execution principle of each module is as follows:
the first processing module is used for inputting the first resolution image into the first convolution layer to obtain the initial features of M rows, N columns and C channels, where the first convolution layer is formed of C convolution kernels of size t×t×A and C is a positive integer.
And the second processing module is used for inputting the initial characteristics into the SFS to obtain SFS characteristics of M rows and N columns of C channels.
And the third processing module is used for inputting the SFS features into the up-sampling module to obtain a second resolution image of r×M rows, r×N columns and A channels, where the convolution layer in the up-sampling module is formed of (A×r) convolution kernels of size t×t×C.
Preferably, if the convolution layer in the up-sampling module is formed of (C×r) convolution kernels of size t×t×C, the SFnet further includes a sixth convolution layer, and the third processing module is further configured to: input the SFS features into the up-sampling module to obtain the up-sampling module features of r×M rows, r×N columns and C channels, and input the up-sampling module features into the sixth convolution layer to obtain the second resolution image of r×M rows, r×N columns and A channels, where the sixth convolution layer is formed of A convolution kernels of size t×t×C.
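For a concrete picture of how a convolution layer followed by rearrangement can enlarge the spatial size by r, the sketch below implements sub-pixel (pixel-shuffle) upsampling, a common convolutional upsampler. The r·r channel grouping used here is an assumption of this sketch, not a statement of the kernel counts specified in the patent.

```python
import numpy as np

# Sub-pixel (pixel-shuffle) rearrangement for an M x N x (A*r*r) feature
# map in HWC layout: interleave the r*r channel groups into an
# (r*M) x (r*N) x A output without creating or destroying any values.
def pixel_shuffle(x, r):
    M, N, ch = x.shape
    A = ch // (r * r)
    x = x.reshape(M, N, r, r, A)
    x = x.transpose(0, 2, 1, 3, 4)       # (M, r, N, r, A): interleave rows/cols
    return x.reshape(M * r, N * r, A)

x = np.random.rand(8, 8, 3 * 2 * 2)      # M = N = 8, A = 3, r = 2
y = pixel_shuffle(x, 2)
assert y.shape == (16, 16, 3)
```

Because the operation is a pure permutation, the output contains exactly the same values as the input, just spatially rearranged.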
In the embodiment of the invention, the SFS is used to extract and integrate multi-level information, and the feature information of the first resolution image is utilized at both the global level and the local level by the SFS, so that the resolution of the first resolution image is improved by a factor of r while ensuring high fidelity, yielding the second resolution image of r×M rows, r×N columns and A channels.
Preferably, the G FFGs are sequentially connected and their respective output ends are each connected with the first feature connection operation, the first convolution layer is connected with the first feature connection operation and the 1st FFG respectively, and the second convolution layer is connected with the first feature connection operation and the up-sampling module respectively. In conjunction with the content in fig. 15, the second processing module includes the following sub-modules, whose execution principles are as follows:
The first processing sub-module is used for inputting the initial characteristic into the 1 st FFG to obtain the FFG characteristic of M rows and N columns of C channels output by the 1 st FFG, and inputting the initial characteristic into the first characteristic connection operation.
In the FFG, B RCAB are connected in turn and the respective outputs are connected to a second characteristic connection operation, respectively, which is connected to a third convolution layer.
In a specific implementation, the first processing sub-module is specifically configured to: input the initial features into the 1st RCAB to obtain the RCAB features of M rows, N columns and C channels output by the 1st RCAB, and input the initial features into the second feature connection operation; input the RCAB features output by the z-th RCAB into the (z+1)-th RCAB to obtain the RCAB features output by the (z+1)-th RCAB, and input the RCAB features output by the (z+1)-th RCAB into the second feature connection operation, where z is an integer greater than or equal to 1 and less than or equal to B, and when z equals B, the RCAB features output by the B-th RCAB are input into the second feature connection operation; integrate the initial features and all RCAB features using the second feature connection operation to obtain the RCAB integrated features of M rows, N columns and C(B+1) channels; and input the RCAB integrated features into the third convolution layer to obtain the FFG features of M rows, N columns and C channels output by the 1st FFG, where the third convolution layer is formed of C convolution kernels of size t×t×C(B+1).
The fourth convolution layer, the ReLU layer, the fifth convolution layer and the CA module in the RCAB are sequentially connected. In a specific implementation, the first processing sub-module is specifically configured to: input the initial features into the fourth convolution layer to obtain the first sub-features of M rows, N columns and C channels; process the first sub-features using the ReLU layer to obtain the second sub-features of M rows, N columns and C channels; input the second sub-features into the fifth convolution layer to obtain the third sub-features of M rows, N columns and C channels; process the third sub-features using the CA module to obtain the fourth sub-features of M rows, N columns and C channels; and perform element-by-element summation of the fourth sub-features and the initial features to obtain the RCAB features of M rows, N columns and C channels. The fourth convolution layer and the fifth convolution layer are formed of C convolution kernels of size t×t×C.
And the second processing sub-module is configured to input the FFG features output by the y-th FFG into the (y+1)-th FFG to obtain the FFG features output by the (y+1)-th FFG, and input the FFG features output by the y-th FFG into the first feature connection operation, where y is an integer greater than or equal to 1 and less than or equal to G; when y equals G, the FFG features output by the G-th FFG are input into the first feature connection operation.
And the third processing sub-module is configured to integrate the initial features and all FFG features using the first feature connection operation to obtain the FFG integrated features of M rows, N columns and C(G+1) channels.
And a fourth processing sub-module, configured to input the FFG integrated features into the second convolution layer to obtain the SFS features of M rows, N columns and C channels, where the second convolution layer is formed of C convolution kernels of size t×t×C(G+1).
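The SFS-level flow handled by these sub-modules can be sketched numerically as follows, with a trivial stand-in FFG and 1×1 fusion weights (both assumptions made for brevity, not the patented layers):

```python
import numpy as np

# End-to-end SFS sketch: G chained FFGs, a join of G+1 features, and a
# fusion convolution mapping C*(G+1) channels back to C. Example sizes
# are assumptions.
M, N, C, G = 8, 8, 16, 3
rng = np.random.default_rng(1)

def ffg(x):                  # placeholder for one first-order feedforward group
    return np.tanh(x)        # shape-preserving stand-in

initial = rng.standard_normal((M, N, C))
collected = [initial]        # the initial feature also enters the join
feat = initial
for _ in range(G):           # y = 1 .. G; each FFG output enters the join
    feat = ffg(feat)
    collected.append(feat)

joined = np.concatenate(collected, axis=-1)          # (M, N, C*(G+1))
w2 = rng.standard_normal((C * (G + 1), C)) * 0.05    # second conv as 1x1
sfs_feature = joined @ w2                            # (M, N, C)
assert sfs_feature.shape == (M, N, C)
```

The structure mirrors the FFG-internal flow one level up: G+1 joined features instead of B+1, and the second convolution layer instead of the third.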
In summary, the embodiments of the present invention provide a network construction method, an image super-resolution reconstruction method and a system, which construct the SFnet using the preset first convolution layer, the SFS and the up-sampling module, and train the SFnet based on sample data to obtain the super-resolution network model. The first resolution image of M rows, N columns and A channels is input into the super-resolution network model for resolution improvement; the SFS achieves multi-level information extraction and integration, and the feature information of the first resolution image is utilized at both the global level and the local level, so that the super-resolution network model improves the resolution of the first resolution image while ensuring high fidelity, yielding a high-fidelity, high-resolution second resolution image of r×M rows, r×N columns and A channels.
In this specification, each embodiment is described in a progressive manner; identical and similar parts of the embodiments refer to each other, and each embodiment mainly describes its differences from the others. In particular, for a system embodiment, since it is substantially similar to a method embodiment, the description is relatively simple, and reference may be made to the description of the method embodiment. The systems and system embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. The image super-resolution reconstruction method is characterized by being applied to the constructed super-resolution network model, and comprises the following steps of:
acquiring a first resolution image of M rows, N columns and A channels, wherein M, N and A are positive integers;
inputting the first resolution image into a super-resolution network model for resolution improvement to obtain a second resolution image of r x M rows r x N columns A channels, wherein r is the improvement multiple of the resolution;
the SFS is respectively connected with the first convolution layer and the up-sampling module, the first resolution image is input into the super-resolution network model to carry out resolution improvement, and a second resolution image of r x M rows r x N columns A channels is obtained, and the SFS comprises the following steps:
inputting the first resolution image into the first convolution layer to obtain initial characteristics of M rows and N columns of C channels, wherein the first convolution layer is composed of C convolution kernels with the size of t multiplied by A, and C is a positive integer;
inputting the initial characteristics into the SFS to obtain SFS characteristics of M rows and N columns of C channels;
and inputting the SFS characteristic into the up-sampling module to obtain a second resolution image of r×M rows, r×N columns and A channels, wherein the convolution layer in the up-sampling module is formed of (A×r) convolution kernels of size t×t×C.
2. The image super-resolution reconstruction method according to claim 1, wherein G FFGs are sequentially connected and respective output ends are respectively connected with a first feature connection operation, a first convolution layer is respectively connected with the first feature connection operation and a 1 st FFG, a second convolution layer is respectively connected with the first feature connection operation and the up-sampling module, and the inputting the initial feature into the SFS to obtain SFS features of M rows, N columns and C channels includes:
inputting the initial characteristics into the 1 st FFG to obtain FFG characteristics of M rows and N columns of C channels output by the 1 st FFG, and inputting the initial characteristics into the first characteristic connection operation;
inputting the FFG characteristics output by the y-th FFG into the (y+1)-th FFG to obtain the FFG characteristics output by the (y+1)-th FFG, and inputting the FFG characteristics output by the y-th FFG into the first characteristic connection operation, wherein y is an integer greater than or equal to 1 and less than or equal to G, and when y is equal to G, the FFG characteristics output by the G-th FFG are input into the first characteristic connection operation;
integrating the initial feature and all FFG features by utilizing the first feature connection operation to obtain FFG integrated features of M rows, N columns and C(G+1) channels;
And inputting the FFG integrated characteristic into the second convolution layer to obtain SFS characteristics of M rows, N columns and C channels, wherein the second convolution layer is formed of C convolution kernels of size t×t×C(G+1).
3. The image super-resolution reconstruction method according to claim 2, wherein B RCABs are sequentially connected and respective output ends are respectively connected with a second feature connection operation, the second feature connection operation is connected with a third convolution layer, the initial feature is input into the 1 st FFG, and the FFG feature of the M-row N-column C channel output by the 1 st FFG is obtained, including:
inputting the initial characteristics into the 1 st RCAB to obtain the RCAB characteristics of M rows and N columns of C channels output by the 1 st RCAB, and inputting the initial characteristics into the second characteristic connection operation;
inputting the RCAB features of the z-th RCAB output into the z+1th RCAB to obtain the RCAB features of the z+1th RCAB output, inputting the RCAB features of the z+1th RCAB output into the second feature connection operation, wherein z is an integer greater than or equal to 1 and less than or equal to B, and inputting the RCAB features of the B-th RCAB output into the second feature connection operation when z is equal to B;
Integrating the initial features and all RCAB features by using the second feature connection operation to obtain RCAB integrated features of M rows, N columns and C(B+1) channels;
inputting the RCAB integrated characteristic into the third convolution layer to obtain the FFG characteristic of M rows, N columns and C channels output by the 1st FFG, wherein the third convolution layer is formed of C convolution kernels of size t×t×C(B+1).
4. The method for reconstructing super-resolution image according to claim 3, wherein the fourth convolution layer, the ReLU layer, the fifth convolution layer, and the CA module are sequentially connected, and the inputting the initial feature into the 1 st RCAB, to obtain the RCAB feature of the M row, N column, and C channel output by the 1 st RCAB includes:
inputting the initial characteristics into the fourth convolution layer to obtain first sub-characteristics of M rows and N columns of C channels;
processing the first sub-feature by using the ReLU layer to obtain a second sub-feature of M rows and N columns of C channels;
inputting the second sub-feature into the fifth convolution layer to obtain a third sub-feature of M rows and N columns of C channels;
processing the third sub-feature by using the CA module to obtain a fourth sub-feature of M rows, N columns and C channels;
performing element-by-element summation calculation on the fourth sub-feature and the initial feature to obtain RCAB features of M rows and N columns of C channels;
The fourth convolution layer and the fifth convolution layer are formed of C convolution kernels of size t×t×C.
5. The method for reconstructing an image super-resolution according to claim 1, wherein if the convolution layer in the up-sampling module is formed of (C×r) convolution kernels of size t×t×C, the SFnet further includes a sixth convolution layer, and after inputting the initial feature into the SFS to obtain SFS features of M rows and N columns of C channels, the method further includes:
inputting the SFS characteristic into the up-sampling module to obtain up-sampling module characteristics of r×M rows, r×N columns and C channels;
and inputting the up-sampling module characteristics into the sixth convolution layer to obtain a second resolution image of r×M rows, r×N columns and A channels, wherein the sixth convolution layer is formed of A convolution kernels of size t×t×C.
6. The image super-resolution reconstruction method according to claim 1, wherein the process of constructing the super-resolution network model comprises:
constructing a first convolution layer by using C convolution kernels of size t×t×A, wherein C and A are positive integers, and t is a positive odd number;
constructing a second-order feedforward structure SFS by using G first-order feedforward groups FFG, a first characteristic connection operation and a second convolution layer, wherein the FFG is composed of B residual channel attention blocks RCAB, a second characteristic connection operation and a third convolution layer, the RCAB is composed of a fourth convolution layer, a linear rectification function ReLU layer, a fifth convolution layer and a channel attention CA module, and G and B are positive integers;
Connecting the output end of the first convolution layer with the input end of the SFS, and connecting the output end of the SFS with the input end of an up-sampling module to construct a second-order feedforward network SFnet, wherein the up-sampling module is used for executing an r-times up-sampling operation, r is any real number larger than 1, and the convolution layer in the up-sampling module is formed of (A×r) convolution kernels of size t×t×C;
training the SFnet based on sample data to obtain a super-resolution network model;
the construction of the second-order feedforward structure SFS by using G first-order feedforward groups FFG, a first characteristic connection operation and a second convolution layer comprises the following steps:
g FFGs are sequentially connected, and the output end of each FFG is connected with the input end of the first characteristic connection operation respectively, wherein the input end of the 1 st FFG is connected with the input end of the first connection operation;
and connecting the output end of the first connection operation with the input end of a second convolution layer to construct the SFS, wherein the second convolution layer is formed of C convolution kernels of size t×t×C(G+1).
7. The method of image super-resolution reconstruction according to claim 6, wherein the process of constructing the FFG according to the B residual channel attention blocks RCAB, the second feature join operation, and the third convolution layer, comprises:
B RCABs are sequentially connected, and the output end of each RCAB is respectively connected with the input end of the second characteristic connection operation, wherein the input end of the 1st RCAB is connected with the input end of the second characteristic connection operation;
and connecting the output end of the second characteristic connection operation with the input end of a third convolution layer to construct the FFG, wherein the third convolution layer is formed of C convolution kernels of size t×t×C(B+1).
8. The method of image super-resolution reconstruction according to claim 6, wherein constructing the RCAB from the fourth convolution layer, the linear rectification function ReLU layer, the fifth convolution layer, and the channel attention CA module comprises:
and sequentially connecting a fourth convolution layer, a ReLU layer, a fifth convolution layer and a CA module, wherein the input end of the fourth convolution layer is connected with the output end of the CA module through an element-by-element summation unit to construct the RCAB, and the fourth convolution layer and the fifth convolution layer are formed of C convolution kernels of size t×t×C.
9. The method of claim 6, wherein if the convolution layer in the upsampling module is formed of (c×r) convolution kernels having a size of t×t×c, connecting the output terminal of the first convolution layer to the input terminal of the SFS, and connecting the output terminal of the SFS to the input terminal of the upsampling module further comprises:
And connecting the output end of the up-sampling module with the input end of a sixth convolution layer to construct the SFnet, wherein the sixth convolution layer is formed of A convolution kernels of size t×t×C.
10. An image super-resolution reconstruction system, characterized by being applied to a constructed super-resolution network model, comprising:
an acquisition unit, configured to acquire a first resolution image of M rows and N columns of a channels, where M, N and a are positive integers;
the processing unit is used for inputting the first resolution image into a super-resolution network model to perform resolution improvement to obtain a second resolution image of r x M rows r x N columns A channels, wherein r is the improvement multiple of the resolution;
the SFS is respectively connected with the first convolution layer and the up-sampling module, and the processing unit comprises:
the first processing module is used for inputting the first resolution image into the first convolution layer to obtain initial characteristics of M rows and N columns of C channels, the first convolution layer is composed of C convolution kernels with the size of t multiplied by A, and C is a positive integer;
the second processing module is used for inputting the initial characteristics into the SFS to obtain SFS characteristics of M rows and N columns of C channels;
and the third processing module is used for inputting the SFS features into the up-sampling module to obtain a second resolution image of r×M rows, r×N columns and A channels, and the convolution layer in the up-sampling module is formed of (A×r) convolution kernels of size t×t×C.
CN202010250271.8A 2020-04-01 2020-04-01 Network construction method, image super-resolution reconstruction method and system Active CN111461987B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010250271.8A CN111461987B (en) 2020-04-01 2020-04-01 Network construction method, image super-resolution reconstruction method and system


Publications (2)

Publication Number Publication Date
CN111461987A CN111461987A (en) 2020-07-28
CN111461987B true CN111461987B (en) 2023-11-24

Family

ID=71685849

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010250271.8A Active CN111461987B (en) 2020-04-01 2020-04-01 Network construction method, image super-resolution reconstruction method and system

Country Status (1)

Country Link
CN (1) CN111461987B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB0611969D0 (en) * 2006-06-16 2006-07-26 The Robert Gordon University Video content prioritisation
CN109035146A (en) * 2018-08-09 2018-12-18 Fudan University Deep-learning-based super-resolution method for low-quality images
CN109919838A (en) * 2019-01-17 2019-06-21 South China University of Technology Ultrasound image super-resolution reconstruction method with enhanced contour sharpness based on an attention mechanism
CN110322402A (en) * 2019-04-30 2019-10-11 Wuhan University of Technology Medical image super-resolution reconstruction method based on a dense mixed attention network
WO2020015167A1 (en) * 2018-07-17 2020-01-23 Xi'an Jiaotong University Image super-resolution and non-uniform blur removal method based on fusion network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Improved image super-resolution algorithm based on residual neural networks; Wang Yining; Qin Pinle; Li Chuanpeng; Cui Yuhao; Journal of Computer Applications (01); 252-260 *

Also Published As

Publication number Publication date
CN111461987A (en) 2020-07-28

Similar Documents

Publication Publication Date Title
US10019642B1 (en) Image upsampling system, training method thereof and image upsampling method
EP3120322B1 (en) Method for processing input low-resolution (lr) image to output high-resolution (hr) image
CN112750082B (en) Human face super-resolution method and system based on fusion attention mechanism
CN111369440B (en) Model training and image super-resolution processing method, device, terminal and storage medium
CN102142137B (en) High-resolution dictionary based sparse representation image super-resolution reconstruction method
US10311547B2 (en) Image upscaling system, training method thereof, and image upscaling method
CN112700392A (en) Video super-resolution processing method, device and storage medium
CN111311490A (en) Video super-resolution reconstruction method based on multi-frame fusion optical flow
CN112750076B (en) Light field multi-view image super-resolution reconstruction method based on deep learning
JP7359395B2 (en) Image processing device and its image processing method, image processing system and its training method
CN111815516B (en) Super-resolution reconstruction method for weak supervision infrared remote sensing image
CN111127317B (en) Image super-resolution reconstruction method, device, storage medium and computer equipment
CN110288524B (en) Deep learning super-resolution method based on enhanced upsampling and discrimination fusion mechanism
US20230153946A1 (en) System and Method for Image Super-Resolution
CN113421187B (en) Super-resolution reconstruction method, system, storage medium and equipment
CN111951167A (en) Super-resolution image reconstruction method, super-resolution image reconstruction device, computer equipment and storage medium
CN114332625A (en) Remote sensing image colorizing and super-resolution method and system based on neural network
Guan et al. Srdgan: learning the noise prior for super resolution with dual generative adversarial networks
CN112907448A (en) Method, system, equipment and storage medium for super-resolution of any-ratio image
CN110782398B (en) Image processing method, generative countermeasure network system and electronic device
CN111223046B (en) Image super-resolution reconstruction method and device
CN111461987B (en) Network construction method, image super-resolution reconstruction method and system
CN115760670B (en) Unsupervised hyperspectral fusion method and device based on network implicit priori
US20240185570A1 (en) Undecimated image processing method and device
CN111429352B (en) Image super-resolution reconstruction method and device based on neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant