CN113902658B - RGB image-to-hyperspectral image reconstruction method based on dense multiscale network - Google Patents


Info

Publication number
CN113902658B
CN113902658B (application number CN202111020790.6A)
Authority
CN
China
Prior art keywords
multiplied
feature
image
network
feature map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111020790.6A
Other languages
Chinese (zh)
Other versions
CN113902658A (en)
Inventor
周慧鑫
李怡雨
宋江鲁奇
李宇燕
张嘉嘉
向培
滕翔
王瑛琨
李苗青
田成
王财顺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202111020790.6A priority Critical patent/CN113902658B/en
Publication of CN113902658A publication Critical patent/CN113902658A/en
Application granted granted Critical
Publication of CN113902658B publication Critical patent/CN113902658B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00: Image enhancement or restoration
    • G06T 5/50: Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00: 2D [Two Dimensional] image generation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10032: Satellite or aerial image; Remote sensing
    • G06T 2207/10036: Multispectral image; Hyperspectral image
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20212: Image combination
    • G06T 2207/20221: Image fusion; Image merging

Abstract

The invention discloses a method for reconstructing a hyperspectral image from an RGB image based on a dense multiscale network. An improved residual network model is constructed. Feature extraction is performed on an input image through a 1 × 1 convolutional layer to obtain a feature map. In the feature mapping part of the residual network model, two branches, namely a conversion layer and a main network, process the feature map separately. The conversion layer adds two groups of 3 × 3 convolutions on the basis of a shortcut connection to extract features from the feature map directly, yielding a first output feature map. A cross-channel fusion receptive field module is added after the last residual block of the main network and performs feature extraction on the feature map to yield a second output feature map. The first and second output feature maps are added to obtain the reconstructed hyperspectral image. The method realizes reconstruction from RGB images to hyperspectral images with a better reconstruction effect than traditional algorithms, and can effectively enlarge the receptive field of the network, thereby achieving a good reconstruction effect.

Description

Method for reconstructing RGB image to hyperspectral image based on dense multiscale network
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a method for reconstructing an RGB image to a hyperspectral image based on a dense multiscale network.
Background
A hyperspectral image contains both spatial and spectral information, so that not only can relevant operations be performed on the two-dimensional scene image, but materials can also be analyzed according to their spectral response curves; hyperspectral images therefore have high application value. To reduce the difficulty of acquiring hyperspectral images with high spatial resolution, reconstructing hyperspectral images from low-cost RGB images has become an important research topic in recent years.
Methods for reconstructing a hyperspectral image from an RGB image can be broadly divided into methods that rely on additional equipment and methods that use an algorithm directly. Equipment-based methods mainly involve modifying the RGB camera system, using multiple cameras, or presetting the imaging environment, so as to increase the spectral-dimension information obtained from the RGB camera and thereby reduce the reconstruction difficulty. However, the dependence on equipment and environment reduces the universality of such methods, and the complex optical path systems require professional knowledge to build and use, which increases the difficulty of use. Algorithm-based reconstruction methods mainly comprise traditional algorithms and deep-learning-based algorithms. A deep-learning-based hyperspectral image reconstruction network mainly consists of three parts: feature extraction, feature mapping, and spectral reconstruction.
In deep-learning algorithms for reconstructing hyperspectral images from RGB images, effective feature extraction is an important premise for accurate reconstruction, and the receptive field is one of the important factors influencing feature extraction. Therefore, studying deep network architectures that expand the receptive field and effectively extract context information is of great significance for RGB-to-hyperspectral reconstruction.
Disclosure of Invention
In view of this, the present invention provides a method for reconstructing an RGB image to a hyperspectral image based on a dense multiscale network.
To achieve the above purpose, the technical solution of the present invention is realized as follows:
An embodiment of the present invention provides a method for reconstructing an RGB image to a hyperspectral image based on a dense multiscale network, comprising the following steps:
constructing an improved residual network model;
performing feature extraction on an input image through a 1 × 1 convolutional layer to obtain a feature map;
in the feature mapping part of the residual network model, processing the feature map separately through two branches, a conversion layer and a main network;
in the conversion layer, adding two groups of 3 × 3 convolutions on the basis of a shortcut connection to extract features from the feature map directly, obtaining a first output feature map;
adding a cross-channel fusion receptive field module after the last residual block of the main network and performing feature extraction on the feature map to obtain a second output feature map;
and adding the first output feature map and the second output feature map to obtain the reconstructed hyperspectral image.
In this scheme, the constructed improved residual network model is based on a residual-block reconstruction algorithm, and a traditional convolutional layer is used as the conversion layer to extract basic features; the main network consists of a traditional convolutional layer, a 1 × 1 convolution, and several groups of residual blocks, where the 1 × 1 convolution adjusts the number of channels.
In the above scheme, the residual block comprises two branches, a main path and a shortcut connection; the main path generates a feature matrix through several convolutional layers, and this feature matrix is added to the shortcut feature matrix and then activated.
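As a rough illustration of the residual-block structure described above, the following is a minimal NumPy sketch; the use of 1 × 1 convolutions, two layers per block, and ReLU activation are illustrative assumptions, not the patent's exact configuration:

```python
import numpy as np

def conv1x1(x, w):
    """1 x 1 convolution over a (C, H, W) feature map: a per-pixel
    linear map across the channel dimension."""
    return np.einsum("oc,chw->ohw", w, x)

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Main path produces a feature matrix through two conv layers;
    it is added to the shortcut branch (identity) and then activated."""
    main = conv1x1(relu(conv1x1(x, w1)), w2)
    return relu(main + x)  # shortcut addition, then activation

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))      # (channels, height, width)
w1 = 0.1 * rng.standard_normal((8, 8))
w2 = 0.1 * rng.standard_normal((8, 8))
y = residual_block(x, w1, w2)
print(y.shape)  # (8, 16, 16): the block preserves the feature-map shape
```

Because the shortcut carries the input unchanged, the block can only refine the feature matrix, which is what makes stacking many such blocks stable.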
In the above scheme, a cross-channel fusion receptive field module is added after the last residual block of the main network; it specifically comprises:
(1) a pooling part, which down-samples the input feature map with 2 × 2, 3 × 3 and 6 × 6 pooling windows respectively;
(2) a preprocessing part, which adopts the self-normalizing activation function SELU; after self-normalization and channel-number conversion, the feature maps pooled with the 2 × 2 and 3 × 3 windows are down-sampled repeatedly until their size matches that of the 6 × 6 branch, and the two feature maps are spliced across layers with the processed feature map on the 6 × 6 branch;
(3) an up-sampling part, which performs up-sampling with pixel shuffle;
(4) a feature splicing part, which concatenates the four groups of obtained feature maps along the channel dimension.
In the foregoing solution, the self-normalizing activation function SELU is specifically:

selu(x) = λ · x, for x > 0
selu(x) = λ · α · (e^x − 1), for x ≤ 0

where λ ≈ 1.0507 and α ≈ 1.6733 are fixed constants.
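The SELU activation named above can be checked numerically; the sketch below uses the standard constants from the self-normalizing-network literature, which is an assumption about the exact values behind the patent's figure:

```python
import math

LAMBDA = 1.0507009873554805   # standard SELU scale constant
ALPHA = 1.6732632423543772    # standard SELU alpha constant

def selu(x: float) -> float:
    """Self-normalizing activation: scaled identity for positive inputs,
    scaled exponential for non-positive inputs."""
    if x > 0:
        return LAMBDA * x
    return LAMBDA * ALPHA * (math.exp(x) - 1.0)

print(selu(1.0))    # ≈ 1.0507, i.e. lambda * 1
print(selu(0.0))    # 0.0
print(selu(-10.0))  # ≈ -lambda * alpha, the saturation value
```

The bounded negative saturation is what lets SELU keep activations self-normalized without an explicit batch-normalization step.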
in the above scheme, the upsampling part performs upsampling processing by using pixel buffer, and specifically includes: the expression for performing the upsampling operation by using pixel shuffle is
I SR =f L (I LR )=PS(W L *f L-1 (I LR )+b L )
In which I SR Is a high resolution feature map, I LR Is a low resolution feature map, f L For non-linear operation on the L-th layer low resolution feature map, weight W L Is n l-1 ×r 2 C×k L ×k L ,b L Is an offset. PS is a periodic screening operation that combines and transforms spatial and channel information for upsampling purposes,it is calculated in the manner of
Figure BDA0003241873790000032
Wherein, T is a low-resolution image, r is an image expansion multiple, x and y are coordinates of pixel points on a high-resolution characteristic diagram, H multiplied by W multiplied by C multiplied by r 2 Is converted into a tensor of rH × rW × C.
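The periodic shuffling operation PS described above can be sketched in NumPy by applying the indexing rule directly; the H × W × C·r² array layout is an assumption matching the tensor description in the text:

```python
import numpy as np

def pixel_shuffle(t, r):
    """Periodic shuffling PS: rearrange an (H, W, C*r*r) tensor into
    (r*H, r*W, C) using PS(T)[x, y, c] =
    T[x // r, y // r, C*r*(y % r) + C*(x % r) + c]."""
    H, W, Cr2 = t.shape
    C = Cr2 // (r * r)
    out = np.empty((r * H, r * W, C), dtype=t.dtype)
    for x in range(r * H):
        for y in range(r * W):
            for c in range(C):
                out[x, y, c] = t[x // r, y // r, C * r * (y % r) + C * (x % r) + c]
    return out

t = np.arange(2 * 3 * 8, dtype=float).reshape(2, 3, 8)  # H=2, W=3, C*r^2=8 with r=2, so C=2
u = pixel_shuffle(t, 2)
print(u.shape)  # (4, 6, 2): H x W x C*r^2 becomes rH x rW x C
```

The operation is a pure rearrangement: every element of the low-resolution tensor appears exactly once in the high-resolution output, so no information is created or lost by the up-sampling itself.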
Compared with the prior art, the method realizes reconstruction from RGB images to hyperspectral images with a reconstruction effect superior to that of traditional algorithms. A cross-channel fusion receptive field module is constructed from a multiscale pyramid module, dense connection blocks, a pixel-reshuffling module, and cross-channel connections; this module improves the main-network part of the residual-block-based reconstruction algorithm and builds a new side connection, fusing multiscale information while deepening the network, which effectively enlarges the receptive field of the network and yields a good reconstruction effect.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and do not limit the invention. In the drawings:
FIG. 1 is a flowchart of an algorithm for reconstructing an RGB image to a hyperspectral image based on a dense multiscale network according to an embodiment of the present invention;
FIG. 2 is a network structure diagram of a method for reconstructing an RGB image to a hyperspectral image based on a dense multiscale network according to an embodiment of the present invention;
FIG. 3 is a cross-channel fusion receptive field module for a method for reconstructing an RGB image to a hyperspectral image based on a dense multiscale network according to an embodiment of the present invention;
FIG. 4 is an effect diagram of a hyperspectral image reconstructed from an RGB image by the method for reconstructing an RGB image to a hyperspectral image based on a dense multiscale network according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
The embodiment of the invention provides a method for reconstructing an RGB image to a hyperspectral image based on a dense multiscale network, which comprises the following steps:
Step 1: feature extraction is performed on the input image through a convolutional layer of size 1 × 1.
Step 2: in the feature mapping part, a conversion layer and a main network branch process the feature map separately; the conversion layer adds two groups of 3 × 3 convolutions on the basis of a shortcut connection to extract features from the input image directly.
Step 3: in the main network, the first traditional convolutional layer has size 5 × 5 and stride 1; the second layer has size 1 × 1 and stride 1.
Step 4: two residual blocks of size 3 × 3 are added, and a cross-channel fusion receptive field module is then added after the last residual block to improve the feature extraction capability.
specifically, the cross-channel fusion receptive field module is realized by the following steps:
step 401: the pooling portion downsamples the input feature map using pooling windows of 2 × 2, 3 × 3, and 6 × 6, respectively.
Step 402: the preprocessing part adopts a self-normalization activation function Selu to replace the original batch normalization operation, the characteristic graphs which adopt 2 multiplied by 2 and 3 multiplied by 3 to carry out pooling operation are subjected to self-normalization and channel quantity conversion, then continuous down-sampling is carried out until the size of the characteristic graphs is consistent with that of the characteristic graphs of 6 multiplied by 6 branches, the two characteristic graphs are spliced in a cross-layer way with the characteristic graphs on the processed 6 multiplied by 6 branches, and the output x of the l layer is output l Contact is made with all previous output layers.
Step 403: the up-sampling part performs up-sampling with pixel shuffle. Specifically, the up-sampling operation is expressed as

I_SR = f_L(I_LR) = PS(W_L ∗ f_{L−1}(I_LR) + b_L)

where I_SR is the high-resolution feature map, I_LR is the low-resolution feature map, f_L is the non-linear operation on the L-th layer low-resolution feature map, the weight W_L is of size n_{l−1} × r²C × k_L × k_L, and b_L is a bias. PS is a periodic shuffling operation that combines and rearranges spatial and channel information to achieve up-sampling; it is calculated as

PS(T)_{x,y,c} = T_{⌊x/r⌋, ⌊y/r⌋, C·r·mod(y,r) + C·mod(x,r) + c}

where T is the low-resolution image, r is the image magnification factor, and x and y are the coordinates of a pixel in the high-resolution feature map. The algorithm converts a tensor of size H × W × C·r² into a tensor of size rH × rW × C.
Step 404: the feature splicing part concatenates the four groups of obtained feature maps along the channel dimension.
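The four parts of the module can be sketched as a shape walkthrough; mean pooling stands in for the pooling layers, the SELU and channel-conversion steps are omitted, and the 36 × 36 × 36 input size is an illustrative assumption rather than the patent's setting:

```python
import numpy as np

def avg_pool(x, k):
    """k x k mean pooling on an (H, W, C) map; H and W are assumed divisible by k."""
    H, W, C = x.shape
    return x.reshape(H // k, k, W // k, k, C).mean(axis=(1, 3))

def pixel_shuffle(t, r):
    """Periodic shuffling: (H, W, C*r*r) -> (r*H, r*W, C)."""
    H, W, Cr2 = t.shape
    C = Cr2 // (r * r)
    return t.reshape(H, W, r, r, C).transpose(0, 2, 1, 3, 4).reshape(H * r, W * r, C)

rng = np.random.default_rng(0)
x = rng.standard_normal((36, 36, 36))              # assumed input feature map

# step 401: pooling with 2x2, 3x3 and 6x6 windows
b2, b3, b6 = avg_pool(x, 2), avg_pool(x, 3), avg_pool(x, 6)  # 18x18, 12x12, 6x6

# step 402: (SELU and channel conversion omitted) keep down-sampling the
# 2x2 and 3x3 branches until they match the 6x6 branch, then splice them
b2 = avg_pool(b2, 3)                               # 18x18 -> 6x6
b3 = avg_pool(b3, 2)                               # 12x12 -> 6x6
merged = np.concatenate([b2, b3, b6], axis=-1)     # 6 x 6 x 108

# step 403: pixel shuffle with r = 6 restores the spatial size
up = pixel_shuffle(merged, 6)                      # 36 x 36 x 3

# step 404: concatenate along the channel dimension with the input branch
out = np.concatenate([x, up], axis=-1)
print(out.shape)  # (36, 36, 39)
```

The walkthrough shows why the three pooling scales can be fused: once the 2 × 2 and 3 × 3 branches are reduced to the 6 × 6 resolution, all branches align spatially and differ only in channels, so splicing and a single pixel shuffle suffice to return to the input resolution.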
Step 5: a convolutional layer of size 1 × 1 is added, and the spectral reconstruction of the main network is completed by a convolutional layer of size 5 × 5 with stride 1.
Step 6: finally, the two groups of output feature maps of the conversion layer are added to the output feature map of the main network to fuse features at multiple scales, thereby obtaining the reconstructed hyperspectral image.
Network model and parameter setting:
1. When training the network, an L2-norm loss function is used, with the expression

S = (1/n) Σ_{i=1}^{n} (y_i − f(x_i))²

where S is the total error, y_i is the value of the i-th pixel in the target output feature map, and f(x_i) is the value of the i-th pixel in the actual output feature map.
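The loss above transcribes directly into code as a mean of squared pixel errors; the 1/n averaging is an assumption of this sketch, since the patent's original figure may use a plain sum:

```python
def l2_loss(target, actual):
    """Mean squared error between target and actual pixel values."""
    n = len(target)
    return sum((y - f) ** 2 for y, f in zip(target, actual)) / n

print(l2_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 0.0 when prediction matches target
print(l2_loss([1.0, 2.0], [0.0, 4.0]))            # (1 + 4) / 2 = 2.5
```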
2. The network parameter settings of the RGB-to-hyperspectral image reconstruction algorithm based on the dense multiscale network are shown in Table 1, where avgpool denotes average pooling, pixel shuffle denotes up-sampling by pixel recombination, and stride denotes the step size.
Table 1 Network parameter settings (the table is provided as an image in the original document)
3. Some hyperparameters in the experiments were set as follows: the initial learning rate is 0.005; during training, the learning rate is dynamically reduced every 5000 training steps with a decay coefficient of 0.93; and the Adam optimizer is used to optimize the algorithm.
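The schedule above can be sketched as a staircase exponential decay; reading "dynamically reduced every 5000 training steps" as one multiplication by the decay coefficient per 5000 steps is an assumption of this sketch:

```python
def learning_rate(step: int, base: float = 0.005, decay: float = 0.93, every: int = 5000) -> float:
    """Staircase exponential decay: multiply the base rate by `decay`
    once per `every` training steps."""
    return base * decay ** (step // every)

print(learning_rate(0))      # 0.005, the initial rate
print(learning_rate(5000))   # ≈ 0.00465, after one decay
print(learning_rate(15000))  # ≈ 0.005 * 0.93**3, after three decays
```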
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims (3)

1. A method for reconstructing an RGB image to a hyperspectral image based on a dense multiscale network, characterized by comprising the following steps:
Step 1: performing feature extraction on an input image through a convolutional layer of size 1 × 1;
Step 2: in the feature mapping part, processing the feature map separately through a conversion layer and a main network branch; the conversion layer adds two groups of 3 × 3 convolutions on the basis of a shortcut connection to extract features from the input image directly;
Step 3: in the main network, the first traditional convolutional layer has size 5 × 5 and stride 1; the second layer has size 1 × 1 and stride 1;
Step 4: adding two residual blocks of size 3 × 3, and then adding a cross-channel fusion receptive field module after the last residual block to improve the feature extraction capability;
the cross-channel fusion receptive field module specifically comprises: (1) a pooling part, which down-samples the input feature map with 2 × 2, 3 × 3 and 6 × 6 pooling windows respectively;
(2) a preprocessing part, which adopts the self-normalizing activation function SELU; after self-normalization and channel-number conversion, the feature maps pooled with the 2 × 2 and 3 × 3 windows are down-sampled repeatedly until their size matches that of the 6 × 6 branch, and the two feature maps are spliced across layers with the processed feature map on the 6 × 6 branch;
(3) an up-sampling part, which performs up-sampling with pixel shuffle;
(4) a feature splicing part, which concatenates the four groups of obtained feature maps along the channel dimension;
Step 5: adding a convolutional layer of size 1 × 1, and completing the spectral reconstruction of the main network with a convolutional layer of size 5 × 5 and stride 1;
Step 6: finally, adding the two groups of output feature maps of the conversion layer to the output feature map of the main network to fuse features at multiple scales, thereby obtaining the reconstructed hyperspectral image.
2. The method for reconstructing an RGB image to a hyperspectral image based on a dense multiscale network according to claim 1, characterized in that the self-normalizing activation function SELU is specifically:

selu(x) = λ · x, for x > 0
selu(x) = λ · α · (e^x − 1), for x ≤ 0

where λ ≈ 1.0507 and α ≈ 1.6733 are fixed constants.
3. The method for reconstructing an RGB image to a hyperspectral image based on a dense multiscale network according to claim 1, characterized in that the up-sampling part performs up-sampling with pixel shuffle, specifically: the expression for the up-sampling operation using pixel shuffle is

I_SR = f_L(I_LR) = PS(W_L ∗ f_{L−1}(I_LR) + b_L)

where I_SR is the high-resolution feature map, I_LR is the low-resolution feature map, f_L is the non-linear operation on the L-th layer low-resolution feature map, the weight W_L is of size n_{l−1} × r²C × k_L × k_L, and b_L is a bias; PS is a periodic shuffling operation that combines and rearranges spatial and channel information to achieve up-sampling, calculated as

PS(T)_{x,y,c} = T_{⌊x/r⌋, ⌊y/r⌋, C·r·mod(y,r) + C·mod(x,r) + c}

where T is the low-resolution image, r is the image magnification factor, and x and y are the coordinates of a pixel in the high-resolution feature map; the operation converts a tensor of size H × W × C·r² into a tensor of size rH × rW × C.
CN202111020790.6A 2021-09-01 2021-09-01 RGB image-to-hyperspectral image reconstruction method based on dense multiscale network Active CN113902658B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111020790.6A CN113902658B (en) 2021-09-01 2021-09-01 RGB image-to-hyperspectral image reconstruction method based on dense multiscale network


Publications (2)

Publication Number Publication Date
CN113902658A CN113902658A (en) 2022-01-07
CN113902658B true CN113902658B (en) 2023-02-10

Family

ID=79188294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111020790.6A Active CN113902658B (en) 2021-09-01 2021-09-01 RGB image-to-hyperspectral image reconstruction method based on dense multiscale network

Country Status (1)

Country Link
CN (1) CN113902658B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114580484B (en) * 2022-04-28 2022-08-12 西安电子科技大学 Small sample communication signal automatic modulation identification method based on incremental learning
CN114757831B (en) * 2022-06-13 2022-09-06 湖南大学 High-resolution video hyperspectral imaging method, device and medium based on intelligent space-spectrum fusion
CN117079105B (en) * 2023-08-04 2024-04-26 中国科学院空天信息创新研究院 Remote sensing image spatial spectrum fusion method and device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3382417A2 (en) * 2017-03-28 2018-10-03 Siemens Healthcare GmbH Magnetic resonance image reconstruction system and method
CN108734659A (en) * 2018-05-17 2018-11-02 华中科技大学 A kind of sub-pix convolved image super resolution ratio reconstruction method based on multiple dimensioned label
CN111402138A (en) * 2020-03-24 2020-07-10 天津城建大学 Image super-resolution reconstruction method of supervised convolutional neural network based on multi-scale feature extraction fusion
CN111598778A (en) * 2020-05-13 2020-08-28 云南电网有限责任公司电力科学研究院 Insulator image super-resolution reconstruction method
AU2021103300A4 (en) * 2021-06-11 2021-08-05 Nanjing University Of Aeronautics And Astronautics Unsupervised Monocular Depth Estimation Method Based On Multi- Scale Unification
CN113222822A (en) * 2021-06-02 2021-08-06 西安电子科技大学 Hyperspectral image super-resolution reconstruction method based on multi-scale transformation


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HSCNN+: Advanced CNN-Based Hyperspectral Recovery from RGB Images;Zhan Shi 等;《2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)》;20181216;全文 *
Reconstructing hyperspectral images from a single RGB image with adversarial networks; Liu Pengfei et al.; Infrared and Laser Engineering; 2020-07-31; Vol. 49, No. S1; full text *

Also Published As

Publication number Publication date
CN113902658A (en) 2022-01-07

Similar Documents

Publication Publication Date Title
CN113902658B (en) RGB image-to-hyperspectral image reconstruction method based on dense multiscale network
CN109118432B (en) Image super-resolution reconstruction method based on rapid cyclic convolution network
CN108765296B (en) Image super-resolution reconstruction method based on recursive residual attention network
CN108537733B (en) Super-resolution reconstruction method based on multi-path deep convolutional neural network
CN109087273B (en) Image restoration method, storage medium and system based on enhanced neural network
CN110232661B (en) Low-illumination color image enhancement method based on Retinex and convolutional neural network
CN112750082B (en) Human face super-resolution method and system based on fusion attention mechanism
CN110210608B (en) Low-illumination image enhancement method based on attention mechanism and multi-level feature fusion
CN110599401A (en) Remote sensing image super-resolution reconstruction method, processing device and readable storage medium
CN111598778B (en) Super-resolution reconstruction method for insulator image
CN109064396A (en) A kind of single image super resolution ratio reconstruction method based on depth ingredient learning network
CN112435191B (en) Low-illumination image enhancement method based on fusion of multiple neural network structures
CN109389667B (en) High-efficiency global illumination drawing method based on deep learning
CN111178499B (en) Medical image super-resolution method based on generation countermeasure network improvement
CN111986092B (en) Dual-network-based image super-resolution reconstruction method and system
CN113421187B (en) Super-resolution reconstruction method, system, storage medium and equipment
CN114331831A (en) Light-weight single-image super-resolution reconstruction method
CN112017116B (en) Image super-resolution reconstruction network based on asymmetric convolution and construction method thereof
CN115205147A (en) Multi-scale optimization low-illumination image enhancement method based on Transformer
CN111654621B (en) Dual-focus camera continuous digital zooming method based on convolutional neural network model
CN115100039B (en) Lightweight image super-resolution reconstruction method based on deep learning
CN111951164A (en) Image super-resolution reconstruction network structure and image reconstruction effect analysis method
Hui et al. Two-stage convolutional network for image super-resolution
CN114565539B (en) Image defogging method based on online knowledge distillation
Yang et al. Image super-resolution reconstruction based on improved Dirac residual network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant