CN114972803B

CN114972803B - Snapshot type spectrum imaging method and system based on joint optimization

Info

Publication number: CN114972803B
Application number: CN202210403141.2A
Authority: CN
Inventors: 付莹; 张涛; 梁致远
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2022-04-18
Filing date: 2022-04-18
Publication date: 2024-05-31
Anticipated expiration: 2042-04-18
Also published as: CN114972803A

Abstract

The invention relates to a snapshot spectral imaging method and system based on joint optimization, and belongs to the technical field of computational spectral imaging. The deep-learning-based hyperspectral image reconstruction method jointly optimizes the arrangement pattern in the spatial and spectral dimensions while reconstructing the hyperspectral image, so that the intrinsic features of the hyperspectral image can be extracted effectively. A deep unfolding network solves spatial demosaicing and spectral super-resolution together and makes explicit use of the model characteristics, which makes the network more flexible and more interpretable. A dedicated optimization layer jointly optimizes the arrangement pattern of the spatial multispectral filter array and the spectral response function, fully exploits the spatial-spectral correlation of the hyperspectral image, and improves the reconstruction accuracy of the hyperspectral image.

Description

Snapshot type spectrum imaging method and system based on joint optimization

Technical Field

The invention relates to a method for acquiring high-quality hyperspectral images, in particular to a snapshot spectral imaging method and system based on joint optimization, and belongs to the technical field of computational spectral imaging.

Background

Hyperspectral imaging is a technique for analyzing image data in a large number of narrow bands, combining the two major techniques of imaging and spectroscopy. In the data cubes acquired with this technique, i.e. hyperspectral images, the light striking each pixel is decomposed into many different spectral bands, rather than each pixel being assigned only three primary colors. Generally, the visible light seen by the human eye is roughly divided into three bands: long wavelengths are perceived as red, medium wavelengths as green, and short wavelengths as blue. Spectral imaging instead divides the spectrum into many more bands, so that a hyperspectral image contains the two-dimensional spatial information and the one-dimensional spectral information of the target scene. The rich spectral detail of a hyperspectral image reflects the illumination and material information of a scene, so the technology is widely applied in fields such as food safety, medical diagnosis, and aerospace.

Current spectral imaging systems can be divided into two categories according to the imaging means: scanning and snapshot. Scanning spectral imaging samples along the spectral and spatial dimensions and requires the scene to remain static until the data cube is fully acquired. Early scanning spectral imaging techniques either sampled one band at a time or spatially sampled one row of pixels across all bands at a time. Later, spatial-spectral line scanning devices were developed, which comprise a camera placed behind a slit spectrometer and a dispersive element, and acquire complete hyperspectral images by moving the scene, the camera, or the slit alone in a direction orthogonal to the acquired lines. However, such scanning spectral imaging systems need to scan the scene in space or in spectrum, so the devices are generally complex and have long imaging times, which limits their use in capturing dynamic scenes at high speed.

To overcome the shortcomings of scanning spectral imaging systems, a new class of spectral imaging system based on snapshot spectral imaging has appeared in recent years. A snapshot spectral imaging system is equipped with a multispectral filter array and a spectral response function, acquires image regions at different wavelengths, and can generate a mosaic image instantaneously; combined with a demosaicing algorithm at the back end, it greatly reduces the time cost of scanning imaging devices. Each pixel of the mosaic image obtained by the snapshot carries the information of only one band: the spectral information is integrated into a number of bands by the spectral response function, and the multispectral filter array records only one of these band values at each pixel position.

Because of the spatial sampling performed by the multispectral filter array, a spatial demosaicing algorithm is required to recover the missing spatial information. However, spatial demosaicing solves only the spatial undersampling problem, and only a limited number of bands can be recovered. To fully reconstruct a hyperspectral image with high spatial and spectral resolution, a spectral super-resolution algorithm is also needed to re-project the coarse hyperspectral image onto a set of finer bands. Spatial demosaicing and spectral super-resolution act as software decoders in computational cameras; however, they are traditionally handled independently and sequentially, which leads to error accumulation and sub-optimal reconstruction.

In addition, the mainstream spatial demosaicing methods proposed by Brauers and Aach all perform hyperspectral image reconstruction under a given multispectral filter array. Likewise, most studies of spectral super-resolution perform spectral reconstruction under a given spectral response function. Gharbi pointed out that the multispectral filter array and the spectral response function at the hardware end can strongly influence the quality of the recovered hyperspectral image in the spatial domain and the spectral domain, respectively.

To optimize the spatial multispectral filter array, some traditional spatial demosaicing methods design various arrangement patterns of the multispectral filter array under the two major criteria of spectral consistency and spatial uniformity proposed by Miao. To optimize the spectral response function, the traditional spectral super-resolution method of Arad uses evolutionary optimization to select the optimal spectral response. Recently, deep learning methods have jointly optimized the two processes of selecting a spectral response function from natural images and reconstructing the hyperspectral image. However, the arrangement pattern of the spatial multispectral filter array and the distribution pattern of the spectral response function are always optimized separately, in the spatial and spectral domains respectively, which may lead to sub-optimal solutions.

Disclosure of Invention

The invention addresses two shortcomings of the prior art: spatial demosaicing and spectral super-resolution are treated as two separate processes, and the arrangement patterns of the spatial multispectral filter array and the spectral response function are rarely studied. It provides a snapshot spectral imaging method and system based on joint optimization, which jointly optimizes the multispectral filter array and the spectral response function, performs spatial demosaicing and spectral super-resolution jointly, and reconstructs the hyperspectral image.

In order to achieve the above purpose, the present invention adopts the following technical scheme.

In the training stage, a hyperspectral image training dataset is first constructed. Several hyperspectral images are randomly selected from the training dataset and input into the joint optimization network model; the model mosaics each hyperspectral image to obtain a mosaic image, and then generates the reconstructed hyperspectral image through spatial demosaicing and spectral super-resolution. The reconstructed hyperspectral image is then compared with the real hyperspectral image, the loss function is calculated, and the parameters of the model are updated iteratively until a preset condition is met, at which point training stops.

In the use stage, hyperspectral images are generated by the joint optimization network using the model parameters obtained in the training stage, and are stored. If a real hyperspectral image corresponding to a reconstructed hyperspectral image exists, the two are compared and evaluated to judge the effect of the network model.

A snapshot type spectrum imaging method based on joint optimization comprises the following steps:

Step 1: Training stage. Construct a hyperspectral image training dataset and iteratively update the parameter dictionary of the algorithm model.

Specifically, step 1 includes the steps of:

Step 1.1: Preprocess the hyperspectral image dataset (including image enhancement and cropping) to generate the training dataset.

The specific method comprises the following steps:

Download a published hyperspectral image dataset in .mat format, keep the number of channels of each hyperspectral image unchanged, and crop in the spatial dimensions to obtain hyperspectral image blocks with a spatial resolution of 64 × 64. Each hyperspectral image block serves as one training sample, and all blocks together form the training data.
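The cropping described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `crop_patches`, the non-overlapping stride, the (H, W, C) axis order, and the 31-band synthetic cube are all assumptions introduced here. In practice the cube would be read from a .mat file, for example with `scipy.io.loadmat`.

```python
import numpy as np

def crop_patches(cube, patch=64, stride=64):
    """Cut an (H, W, C) hyperspectral cube into (patch, patch, C) blocks.

    All C spectral channels are kept; only the two spatial axes are cropped.
    Border regions that do not fill a whole block are dropped.
    """
    h, w, _ = cube.shape
    patches = [
        cube[i:i + patch, j:j + patch, :]
        for i in range(0, h - patch + 1, stride)
        for j in range(0, w - patch + 1, stride)
    ]
    return np.stack(patches)

# Synthetic stand-in for one cube loaded from a .mat file (assumed 31 bands).
cube = np.random.default_rng(0).random((200, 150, 31))
patches = crop_patches(cube)
print(patches.shape)  # (6, 64, 64, 31)
```

A 200 × 150 cube yields 3 × 2 non-overlapping blocks; a smaller stride would give overlapping samples and a larger training set.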

Step 1.2: Design the multispectral filter array and the spectral response function. The parameters of both are learnable, so that the multispectral filter array and the spectral response function at the hardware end are optimized jointly and the software and hardware constrain each other.

First, the multispectral response function is optimized. In order to introduce optimal spatial information into the hyperspectral imaging process, a learnable spectral response function curve is designed. Finally, both are updated together with the network iterations.

To optimize the spatial arrangement of the multispectral filter array, the invention implements the multispectral filter array as a trainable physical binary mask. The multispectral filter array F is regarded as a 0-1 vector; it produces severe spatial undersampling and is used to simulate the process by which a snapshot spectral imaging system acquires a mosaic image. The invention uses an array size of 4 × 4, and the corresponding sensor measurements are obtained by tiling the multispectral filter array until the whole training sample is covered.

When designing the multispectral filter array, physical constraints must also be considered to ensure that the array can be manufactured. For this purpose, a Softmax operation with a temperature coefficient can be applied to the multispectral filter array; under the Softmax function, the larger values of each pixel along the spectral dimension gradually dominate, so that the learned multispectral filter array becomes a 0-1 vector.

Because spatial arrangement learning optimizes the acquisition mode of the hyperspectral image only in the spatial dimension, the invention further introduces a spatial-spectral joint optimization layer to optimize the acquisition mode in the spatial and spectral dimensions simultaneously. Specifically, the best spectral response is selected from a set of candidate spectral response functions for each spatial position of the multispectral filter array, or the best spatial-spectral arrangement is designed directly by the network itself.

According to the linear relationship between the mosaic image and the hyperspectral image, each spatial position of the mosaic image Y can be interpreted as a weighted sum of the hyperspectral image along the spectral dimension, with different spatial positions having different weights. In spatial-spectral pattern optimization, the following two constraints apply: first, since each pixel value of the simulated mosaic image should be positive, all values in the learned arrangement pattern must be non-negative; second, the arrangement along the spectral dimension needs to be smooth to facilitate the physical implementation of the filter. The network is thereby forced to learn a non-negative and smooth spatial-spectral arrangement pattern.
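The weighted-sum interpretation above can be illustrated with a short NumPy sketch. It assumes a 4 × 4 pattern holding one non-negative spectral weighting per array position, tiled over the sample; the function name `simulate_mosaic`, the (H, W, C) layout, and the 31-band cube are hypothetical choices made for illustration only.

```python
import numpy as np

def simulate_mosaic(cube, pattern):
    """Project an (H, W, C) cube to a single-channel (H, W) mosaic.

    `pattern` has shape (p, p, C): one spectral weighting per position of the
    p x p filter array. It is tiled over the image, and each output pixel is
    the weighted sum of that pixel's spectrum.
    """
    h, w, _ = cube.shape
    p = pattern.shape[0]
    reps = (-(-h // p), -(-w // p), 1)          # ceiling division per axis
    tiled = np.tile(pattern, reps)[:h, :w, :]   # cover the whole sample
    return np.sum(tiled * cube, axis=2)

rng = np.random.default_rng(0)
cube = rng.random((64, 64, 31))
pattern = np.abs(rng.standard_normal((4, 4, 31)))  # non-negative weights
mosaic = simulate_mosaic(cube, pattern)
print(mosaic.shape)  # (64, 64)
```

Because the weights and the cube are both non-negative, every simulated mosaic pixel is non-negative, which is the first of the two constraints stated above.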

Step 1.3: The hyperspectral image reconstruction network model first mosaics the input hyperspectral image inside the model and then reconstructs it, finally producing a set of multispectral filter array masks, an arrangement pattern of the spectral response function, and a reconstructed hyperspectral image. The reconstructed hyperspectral image is then compared with the real hyperspectral image, the loss function is calculated, and the parameters of the model are updated accordingly.

The specific method comprises the following steps:

First, several hyperspectral images (for example, 16) are randomly selected from the training dataset and input into the joint optimization network model. The model mosaics each hyperspectral image to obtain a mosaic image, and then generates the reconstructed hyperspectral image through spatial demosaicing and spectral super-resolution.

The joint optimization network model comprises deep prior regularization and a convolutional neural network.

Since recovering the original hyperspectral image X from the mosaic image Y is an inverse problem, prior knowledge is required. The invention unfolds the optimization algorithm into a convolutional neural network. Rather than training a single convolutional neural network, a series of convolutional neural networks is used to map the measured values to the desired signal domain. This deep unfolding framework improves the flexibility and interpretability of the proposed model.

In deep prior regularization, the design mimics a prior-regularized optimization structure and unfolds the iterations into a deep network. Because the transformation matrix A is highly correlated with the multispectral filter array and the spectral response function, the relationship between the mosaic image Y and the underlying hyperspectral image X can be approximated linearly as $Y \approx AX$. However, hyperspectral image reconstruction is an inverse problem, so a feasible approach is to minimize a formulation of the problem that includes a regularization prior. If the regularization term is not differentiable, solving this formulation is extremely difficult. It is common practice to decouple the data fidelity term, which constrains the solution according to the degradation model, from the regularization term, which ensures that the reconstruction conforms to the hyperspectral image prior; a commonly used optimization method is the alternating direction method of multipliers (ADMM).

In the convolutional neural network, the transformation matrix A is learned at the same time as the parameters of the reconstruction network, so every iteration requires the participation of A, which makes the optimization process more difficult. To solve this problem, the invention replaces the transformation matrix A with a simple neural network consisting of convolutional layers and activation functions.

In conclusion, the invention formulates the joint design of the multispectral filter array, the arrangement of the spectral response function, spatial demosaicing, and spectral super-resolution as the training of an autoencoder. Given a set of training hyperspectral images, the hardware encoding process projects the corresponding input spatial and spectral information under the control of a trainable multispectral filter array and spectral response function; this projection generates a single-channel mosaic image Y, which is used as the input to the decoding (spatial and spectral reconstruction) step. The software decoder uses a deep unfolding network to perform spatial demosaicing and spectral super-resolution simultaneously and reconstructs hyperspectral images with high spatial and spectral resolution.

Step 1.4: Repeat step 1.3 until the set termination condition is met, then store the structure and the model parameters of the network.

In the training stage, hyperspectral images are repeatedly input into the model and the model parameters are adjusted dynamically according to the loss function until a preset condition is met (for example, the number of training epochs reaches a set value, or a chosen evaluation index exceeds a preset value); training then stops and the model parameters are stored.

Step 2: Use stage.

Using the model parameters obtained in the training stage of step 1, hyperspectral images are generated by the joint optimization network and stored. If a real hyperspectral image corresponding to a reconstructed hyperspectral image exists, the reconstructed and real hyperspectral images are compared and evaluated to judge the effect of the network model.

Furthermore, in order to implement the method effectively, the invention provides a snapshot spectral imaging system based on joint optimization, which comprises a hardware coding subsystem, a software decoding subsystem, and a reasoning subsystem.

The hardware coding subsystem is used for jointly optimizing the arrangement mode of the multispectral filter array and the spectral response function to obtain a mosaic image under the hardware coding subsystem.

The software decoding subsystem jointly optimizes the spatial demosaicing and spectral super-resolution algorithms and reconstructs the mosaic image from the hardware coding subsystem in the spatial and spectral dimensions, thereby completing the training of the joint optimization network model.

The reasoning subsystem performs inference with the trained joint optimization network model. It does not need to be retrained for each inference run; the same joint optimization network model is used every time.

The output end of the hardware coding subsystem is connected with the input end of the software decoding subsystem, and the output end of the software decoding subsystem is connected with the input end of the reasoning subsystem.

Advantageous effects

Compared with the prior art, the invention has the following advantages:

1. The deep-learning-based hyperspectral image reconstruction method jointly optimizes the arrangement pattern in the spatial and spectral dimensions while reconstructing the hyperspectral image, so that the intrinsic features of the hyperspectral image can be extracted effectively.

2. Through a deep unfolding network, the invention solves spatial demosaicing and spectral super-resolution simultaneously and makes explicit use of the model characteristics, so that the network is more flexible and more interpretable.

3. The invention designs an optimization layer that jointly optimizes the arrangement pattern of the spatial multispectral filter array and the spectral response function, fully exploits the spatial-spectral correlation of the hyperspectral image, and improves the reconstruction accuracy of the hyperspectral image.

Drawings

Fig. 1 is a flow chart of the method of the present invention.

Fig. 2 is a schematic overall framework of the method of the invention.

FIG. 3 is a schematic diagram of a spectral filter array of a hardware encoder in a core algorithm model of the method of the present invention.

FIG. 4 is a schematic representation of the spectral response function of a hardware encoder in the core algorithm model of the method of the present invention.

Fig. 5 is a schematic diagram of a software decoder in a core algorithm model of the method of the present invention.

Fig. 6 is a schematic diagram of the composition of the system of the present invention.

Detailed Description

For a better description of the objects and advantages of the invention, the method of the invention will be further described with reference to the accompanying drawings and examples.

Examples

The embodiment discloses a deep-learning-based hyperspectral image reconstruction method and system. The method comprises a training stage and a use stage; the system comprises a hardware encoding subsystem, a software decoding subsystem, and a reasoning subsystem. The method flowchart and the system composition diagram of this embodiment are shown in Fig. 1 and Fig. 6, respectively.

A hyperspectral image reconstruction method based on deep learning. In the training stage, a published hyperspectral image dataset in .mat format is first downloaded, the number of channels of each hyperspectral image is kept unchanged, and the images are cropped in the spatial dimensions to obtain hyperspectral image blocks with a spatial resolution of 64 × 64. Each hyperspectral image block serves as one training sample, and all blocks together form the training data. Once the hyperspectral image training dataset has been constructed, the parameter dictionary of the algorithm model is updated iteratively. In the test stage, hyperspectral images are generated by the joint optimization network using the model parameters obtained in the training stage, and are stored. Further, if a real hyperspectral image corresponding to a reconstructed hyperspectral image exists, the reconstructed and real hyperspectral images are compared and evaluated to judge the effect of the network model. The overall framework of this embodiment is shown in Fig. 2.

Specifically, step 1 includes the steps of:

Step 1.1: Download a published hyperspectral image dataset and preprocess it (image enhancement, cropping, and the like) to generate the training dataset.

The specific method comprises the following steps:

Step 1.2: Design the multispectral filter array and the spectral response function so that the parameters of both are learnable. The purpose is to optimize the multispectral filter array and the spectral response function at the hardware end jointly, so that the software and hardware constrain each other.

The specific method comprises the following steps:

First, a trainable physical binary mask, i.e. a multispectral filter array, is realized by programming, corresponding to the optimization of the spatial arrangement. Secondly, in order to introduce optimal spatial information into the hyperspectral imaging process, a learnable spectral response function curve is designed. Both are updated together with network iterations.

To optimize the spatial arrangement of the multispectral filter array, the invention implements the multispectral filter array as a trainable physical binary mask. The multispectral filter array can be regarded as a 0-1 vector; it therefore produces severe spatial undersampling and simulates the process by which a snapshot spectral imaging system acquires a mosaic image. The invention uses an array size of 4 × 4, and the corresponding sensor measurements are obtained by tiling the multispectral filter array until the whole training sample is covered. When designing the multispectral filter array, the physical constraints imposed on it must be considered so that the array can be manufactured. A Softmax operation with a temperature coefficient is therefore applied to the multispectral filter array F:

$$F' = \mathrm{Softmax}(\alpha_t F) \tag{1}$$

where $F'$ is updated after each iteration and $\alpha_t$ is the temperature parameter, which increases slowly over the training iterations.

Here, the standard Softmax function is applied to a multidimensional input tensor and rescales it so that the elements of the output tensor lie in the range [0, 1] and sum to 1. By adding a further hyperparameter τ, larger values become larger and smaller values become smaller under the Softmax function. The largest value of each pixel along the spectral dimension therefore gradually dominates, turning the learned multispectral filter array into a 0-1 vector. The spectral filter array of this embodiment is illustrated in Fig. 3.
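The effect of the temperature in equation 1 can be seen in a small numerical sketch. The helper `softmax_with_temperature`, the choice of 16 candidate bands, and the three sample temperatures are assumptions for illustration; in the method itself the temperature grows gradually during training rather than being set by hand.

```python
import numpy as np

def softmax_with_temperature(f, alpha):
    """Softmax along the last (spectral) axis of alpha * f."""
    z = alpha * f
    z = z - z.max(axis=-1, keepdims=True)   # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
F = rng.standard_normal((4, 4, 16))   # 4 x 4 array, 16 candidate bands (assumed)

for alpha in (1.0, 10.0, 1000.0):
    F_prime = softmax_with_temperature(F, alpha)
    # The smallest per-position peak weight moves towards 1 as alpha grows.
    print(alpha, F_prime.max(axis=-1).min())

soft = softmax_with_temperature(F, 1.0)
mask = softmax_with_temperature(F, 1000.0)
```

At a low temperature every band keeps a noticeable weight; at a high temperature each array position is dominated by its largest entry, which approximates the one-band-per-pixel binary mask that can actually be manufactured.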

Secondly, the spatial arrangement learning described above optimizes the acquisition mode of the hyperspectral image only in the spatial dimension. To optimize the acquisition mode in the spatial and spectral dimensions simultaneously, the invention further introduces a spatial-spectral joint optimization layer. Specifically, the best spectral response is selected from a set of candidate spectral response functions for each spatial position of the multispectral filter array, or the best spatial-spectral arrangement is designed directly by the network itself.

Each spatial position of the mosaic image Y can be interpreted as a weighted sum of the hyperspectral images along the spectral dimension, with different spatial positions having different weights, according to the linear relationship between the mosaic image and the hyperspectral image. The following two constraints apply in the optimization of spatial spectral patterns:

First, since each pixel value of the simulated mosaic image should be positive, all values in the learned arrangement pattern must be non-negative. Second, the arrangement along the spectral dimension needs to be smooth to facilitate the physical implementation of the filter. The network is thereby forced to learn a non-negative and smooth spatial-spectral arrangement pattern, expressed as:

where $\eta_1$ and $\eta_2$ are hyperparameters, $\nabla A$ denotes the gradient of the transformation matrix $A$ along the spectral dimension, and $\|\cdot\|_2^2$ denotes the square of the 2-norm. This yields the constraint loss for designing the spatial-spectral pattern, which is added to the total loss function of the network optimization. The spectral response function of this embodiment is illustrated in Fig. 4.
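As a rough illustration of how such a constraint loss could be evaluated, the sketch below combines a non-negativity penalty and a spectral smoothness penalty weighted by the two hyperparameters. The exact form of the non-negativity term is not recoverable from this text, so the squared negative part used here is an assumption, as are the function name `pattern_constraint_loss`, the default weights, and the finite-difference approximation of the spectral gradient.

```python
import numpy as np

def pattern_constraint_loss(A, eta1=1.0, eta2=1.0):
    """Penalty encouraging a non-negative, spectrally smooth pattern A.

    A has shape (p, p, C). The non-negativity term (squared negative part)
    is an assumed form; the smoothness term is the squared 2-norm of the
    finite-difference gradient along the spectral axis.
    """
    negative_part = np.minimum(A, 0.0)
    spectral_grad = np.diff(A, axis=-1)
    return eta1 * np.sum(negative_part ** 2) + eta2 * np.sum(spectral_grad ** 2)

flat = np.full((4, 4, 31), 0.5)                        # non-negative, constant
rough = np.random.default_rng(0).standard_normal((4, 4, 31))
print(pattern_constraint_loss(flat))        # 0.0
print(pattern_constraint_loss(rough) > 0)   # True
```

A constant non-negative pattern incurs no penalty, while a noisy pattern with negative entries is penalised on both terms, which is the behaviour the two constraints call for.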

Step 1.3: The hyperspectral image reconstruction network model mosaics the input hyperspectral image internally and then reconstructs it, finally producing a set of multispectral filter array masks at the hardware end, the arrangement pattern of the spectral response function, and a reconstructed hyperspectral image. The reconstructed hyperspectral image is compared with the real hyperspectral image, the loss function is calculated, and the parameters of the model are updated accordingly.

The specific method comprises the following steps:

Sixteen hyperspectral images are randomly selected from the training dataset and input into the network model. The model first mosaics each hyperspectral image to obtain a mosaic image, and then generates the reconstructed hyperspectral image at the software end through spatial demosaicing and spectral super-resolution.

The joint optimization network model comprises deep prior regularization and a convolutional neural network.

In deep prior regularization, the design mimics a prior-regularized optimization structure and unfolds the iterations into a deep network. Because the transformation matrix A is highly correlated with the multispectral filter array and the spectral response function, the relationship between the mosaic image Y and the underlying hyperspectral image X can be approximated linearly as $Y \approx AX$. However, hyperspectral image reconstruction is an inverse problem, so a feasible approach is to minimize equation 3 with a regularization prior:

where $\hat{X}$ denotes the reconstructed hyperspectral image, $\gamma$ is a balance parameter, and $R(\cdot)$ is a regularization function that imposes prior knowledge.

Equation 3 introduces an image prior while ensuring consistency between the degraded hyperspectral image and the restored hyperspectral image. However, solving equation 3 is extremely difficult if the regularization term is not differentiable. It is common practice to decouple the data fidelity term, which constrains the solution according to the degradation model, from the regularization term, which ensures that the reconstruction $\hat{X}$ conforms to the hyperspectral image prior. The most common optimization method is the alternating direction method of multipliers (ADMM); equation 3 is therefore solved as:

where $t$ and $t+1$ denote the current stage and the next stage, respectively; $X^{(t+1)}$ denotes the hyperspectral image after one optimization step at the current stage; $\rho$ is a balance parameter; $V$ is an auxiliary variable introduced to solve equation 3; and $U$ is the dual variable of $V$. $V^{(t)}$ and $V^{(t+1)}$ denote the values of the auxiliary variable at the current and next stage, and $U^{(t)}$ and $U^{(t+1)}$ denote the values of the dual variable at the current and next stage. The X-problem is quadratic and has a closed-form solution. The V-problem is a denoising problem; it is updated with a denoising network in each optimization stage, which brings $X^{(t+1)}$ closer to the desired signal domain. Thus, equation 4 is rewritten as:

where $I$ denotes the identity matrix, $A^T$ denotes the transpose of the transformation matrix $A$, and $T$ denotes the matrix transpose operation.
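For orientation, a standard ADMM formulation that is consistent with the symbols defined above can be written as follows. This is a sketch under assumptions rather than a reproduction of the original formulas: the factor $\tfrac{1}{2}$ on the data term, the scaled form of the dual variable, and the symbol $\mathcal{D}^{(t)}$ for the denoising network of stage $t$ are introduced here for illustration.

$$\hat{X} = \arg\min_{X} \tfrac{1}{2}\|Y - AX\|_2^2 + \gamma R(X) \tag{3}$$

$$\begin{aligned}
X^{(t+1)} &= \arg\min_{X} \tfrac{1}{2}\|Y - AX\|_2^2 + \tfrac{\rho}{2}\|X - V^{(t)} + U^{(t)}\|_2^2 \\
V^{(t+1)} &= \arg\min_{V} \gamma R(V) + \tfrac{\rho}{2}\|X^{(t+1)} - V + U^{(t)}\|_2^2 \\
U^{(t+1)} &= U^{(t)} + X^{(t+1)} - V^{(t+1)}
\end{aligned} \tag{4}$$

$$\begin{aligned}
X^{(t+1)} &= \left(A^T A + \rho I\right)^{-1}\left(A^T Y + \rho\,(V^{(t)} - U^{(t)})\right) \\
V^{(t+1)} &= \mathcal{D}^{(t)}\!\left(X^{(t+1)} + U^{(t)}\right) \\
U^{(t+1)} &= U^{(t)} + X^{(t+1)} - V^{(t+1)}
\end{aligned} \tag{5}$$

Setting the gradient of the X-subproblem to zero gives $(A^T A + \rho I)X = A^T Y + \rho\,(V^{(t)} - U^{(t)})$, which is the closed-form solution referred to in the text; the V-subproblem is a proximal (denoising) step applied to $X^{(t+1)} + U^{(t)}$.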

In the convolutional neural network, the transformation matrix A is learned at the same time as the parameters of the reconstruction network, so every iteration requires the participation of A, which makes the optimization process more difficult. To solve this problem, the invention replaces the transformation matrix A with a simple neural network consisting of convolutional layers and activation functions. The software decoder of this embodiment is illustrated in Fig. 5.
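The staged iteration described above (closed-form X-update, denoising V-update, dual update) can be sketched in NumPy. Several simplifications are assumptions made only to keep the example self-contained: A is a small dense matrix acting on a single spectrum rather than a learned convolutional module, the denoising network is replaced by a placeholder soft-thresholding function, and the names `unfolded_admm` and `soft_threshold` are hypothetical.

```python
import numpy as np

def unfolded_admm(Y, A, denoise, rho=1.0, stages=5):
    """Run a fixed number of ADMM stages for the model Y = A X.

    X-update: closed-form solution of the quadratic sub-problem.
    V-update: a denoiser stands in for the per-stage denoising network.
    U-update: running dual variable in scaled form.
    """
    n = A.shape[1]
    V = np.zeros(n)
    U = np.zeros(n)
    lhs = A.T @ A + rho * np.eye(n)
    for _ in range(stages):
        X = np.linalg.solve(lhs, A.T @ Y + rho * (V - U))
        V = denoise(X + U)
        U = U + X - V
    return X

def soft_threshold(Z, tau=0.01):
    """Placeholder denoiser (assumption): shrink small entries towards zero."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

rng = np.random.default_rng(0)
A = rng.random((8, 31))            # 8 measurements of a 31-band spectrum
X_true = np.zeros(31)
X_true[[3, 17]] = 1.0
Y = A @ X_true
X_hat = unfolded_admm(Y, A, soft_threshold, rho=0.1, stages=200)
print(np.linalg.norm(A @ X_hat - Y))   # data residual after 200 stages
```

In a deep unfolding network the number of stages is fixed and small, and each stage has its own learned denoiser, so the loop body becomes one block of the network rather than an iteration run to convergence.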

In general, the invention formulates the joint design of the multispectral filter array, the arrangement of the spectral response function, spatial demosaicing, and spectral super-resolution as the training of an autoencoder. Given a set of training hyperspectral images, the hardware encoding process projects the corresponding input spatial and spectral information under the control of a trainable multispectral filter array and spectral response function; this projection generates a single-channel mosaic image Y, which is used as the input to the decoding (spatial and spectral reconstruction) step. The software decoder uses a deep unfolding network to perform spatial demosaicing and spectral super-resolution simultaneously and reconstructs hyperspectral images with high spatial and spectral resolution.

During training, hyperspectral images are repeatedly input into the model and the model parameters are adjusted dynamically according to the loss function until a preset condition is met (for example, the number of training epochs reaches a set value, or a chosen evaluation index exceeds a preset value); training then stops and the model parameters are stored.

Step 2: Use stage.

Using the model parameters obtained in the training stage of step 1, hyperspectral images are generated by the joint optimization network and stored. Further, if a real hyperspectral image corresponding to a reconstructed hyperspectral image exists, the reconstructed and real hyperspectral images are compared and evaluated to judge the effect of the network model.

Specifically, PSNR (Peak Signal-to-Noise Ratio) is used to assess the quality of the reconstructed high-resolution image. It is calculated as follows:

$$\mathrm{MSE} = \frac{1}{HW}\sum_{i=1}^{H}\sum_{j=1}^{W}\left(x(i,j) - \hat{x}(i,j)\right)^2 \tag{6}$$

$$\mathrm{PSNR} = 10\log_{10}\frac{(2^n - 1)^2}{\mathrm{MSE}} \tag{7}$$

where MSE is the mean square error, H and W are the height and width of the image, and $x(i,j)$ and $\hat{x}(i,j)$ are the pixel values of the original image and the reconstructed image at pixel (i, j), respectively. Equation 6 calculates the mean square error between pixels; n in Equation 7 is the number of bits per pixel.
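As a worked illustration of the PSNR computation, the following minimal sketch assumes the usual definition: the MSE is the mean squared pixel difference over an H x W image, and the peak value is 2^n - 1 for n-bit pixels. The function names `mse` and `psnr` are chosen for illustration only.

```python
import math

def mse(original, reconstructed):
    """Mean square error between two H x W images given as nested lists."""
    height, width = len(original), len(original[0])
    total = 0.0
    for i in range(height):
        for j in range(width):
            diff = original[i][j] - reconstructed[i][j]
            total += diff * diff
    return total / (height * width)

def psnr(original, reconstructed, bits=8):
    """Peak signal-to-noise ratio in dB for pixels stored with `bits` bits."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf  # identical images: PSNR is unbounded
    peak = 2 ** bits - 1
    return 10.0 * math.log10(peak * peak / error)

# Toy 2 x 2 example with 8-bit pixels; two pixels differ by 5, so the MSE is 12.5.
original = [[0, 255], [255, 0]]
reconstructed = [[0, 250], [255, 5]]
error = mse(original, reconstructed)
quality = psnr(original, reconstructed, bits=8)
```

For a hyperspectral image the same computation can be applied per band and averaged, although the source does not specify how bands are aggregated.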

SSIM (Structural Similarity) measures the structural similarity between two hyperspectral images. The relationship between SSIM and more traditional quality metrics can be illustrated geometrically in a vector space of image components, which may be pixel intensities or extracted features such as linear transform coefficients. SSIM is calculated as follows:

$$\mathrm{SSIM}(X,\hat{X}) = \frac{(2\mu_x\mu_{\hat{x}} + C_1)(2\sigma_{x\hat{x}} + C_2)}{(\mu_x^2 + \mu_{\hat{x}}^2 + C_1)(\sigma_x^2 + \sigma_{\hat{x}}^2 + C_2)}$$

where $\mu_x$ and $\mu_{\hat{x}}$ are the means of the original image X and the reconstructed image $\hat{X}$, respectively; $\sigma_x^2$ and $\sigma_{\hat{x}}^2$ are the variances of the original hyperspectral image X and the reconstructed hyperspectral image $\hat{X}$, respectively; and $\sigma_{x\hat{x}}$ is the covariance of X and $\hat{X}$. $C_1$ and $C_2$ are constants that prevent errors caused by a zero denominator. The value of SSIM lies between 0 and 1; the larger the value, the smaller the distortion of the reconstructed image and the higher its fidelity.
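The SSIM computation can be sketched from the means, variances, and covariance named above. This simplified version uses global statistics over flattened pixel lists rather than local windows. The constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2, with L the peak pixel value, are a conventional choice and an assumption here, since the source does not state their values. The name `ssim_global` is hypothetical.

```python
def ssim_global(x, y, bits=8, k1=0.01, k2=0.03):
    """Single-window SSIM between two flattened images of equal length."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    peak = 2 ** bits - 1
    c1 = (k1 * peak) ** 2  # assumed constant, keeps the mean term stable
    c2 = (k2 * peak) ** 2  # assumed constant, keeps the variance term stable
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Toy example: an image compared with itself and with a slightly distorted copy.
reference = [10.0, 20.0, 30.0, 40.0]
distorted = [12.0, 18.0, 33.0, 39.0]
same_score = ssim_global(reference, reference)
distorted_score = ssim_global(reference, distorted)
```

An image compared with itself scores 1, and any distortion lowers the score, which matches the interpretation that larger values indicate higher fidelity.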

PSNR and SSIM are two conventional spatial indices, whereas SAM (Spectral Angle Metric) is spectral and measures the spectral similarity between the original image X and the reconstructed image $\hat{X}$. SAM is computed from the cosine of the angle between the test spectral vector and the reference spectral vector, as expressed by the following formula:

$$\mathrm{SAM} = \frac{1}{N}\sum_{i=1}^{N}\arccos\left(\frac{\sum_{j=1}^{M} x_{ij}\hat{x}_{ij}}{\sqrt{\sum_{j=1}^{M} x_{ij}^2}\,\sqrt{\sum_{j=1}^{M}\hat{x}_{ij}^2}}\right)$$

where M is the number of channels of the hyperspectral image, N is the total number of pixels, and $x_{ij}$ and $\hat{x}_{ij}$ are the pixel values of the original image and the reconstructed image at the i-th pixel and the j-th channel, respectively. The smaller the SAM value, the better the quality of the restored image.
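The spectral angle computation can be sketched as follows, assuming the common form in which the per-pixel angle is the arccosine of the normalized inner product of the two spectra, averaged over all N pixels. Treating a pixel with an all-zero spectrum as contributing a zero angle is an assumption made here to avoid division by zero. The name `sam` is chosen for illustration.

```python
import math

def sam(original, reconstructed):
    """Mean spectral angle in radians.

    Each image is a list of N pixels; each pixel is a list of M channel values.
    """
    total = 0.0
    for x, y in zip(original, reconstructed):
        dot = sum(a * b for a, b in zip(x, y))
        norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
        if norm == 0:
            continue  # assumed convention: an all-zero spectrum adds zero angle
        # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
        cosine = max(-1.0, min(1.0, dot / norm))
        total += math.acos(cosine)
    return total / len(original)

# Toy examples: orthogonal spectra and spectra that differ only by a scale factor.
orthogonal = sam([[1.0, 0.0]], [[0.0, 1.0]])
scaled = sam([[1.0, 2.0, 3.0]], [[2.0, 4.0, 6.0]])
```

Because the angle ignores overall scale, a reconstruction that is uniformly brighter or darker than the original still has a spectral angle near zero, which is why SAM complements the spatial indices.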

Based on the above method, this embodiment further provides a snapshot spectral imaging system based on joint optimization, which comprises a hardware encoding subsystem, a software decoding subsystem, and an inference subsystem.

The hardware encoding subsystem jointly optimizes the arrangement of the multispectral filter array and the spectral response function, and sends the resulting mosaic image to the software decoding subsystem.

The inference subsystem uses the trained joint optimization network model to image the actual scene, mosaic it, and perform inference, thereby obtaining a reconstructed image with high spatial and spectral resolution. The inference subsystem does not need to be retrained between inference runs; the same joint optimization network model is used each time.

The subsystems are connected as follows: the output of the hardware encoding subsystem is connected to the input of the software decoding subsystem, and the output of the software decoding subsystem is connected to the input of the inference subsystem.

In a snapshot spectral imaging setting, the method achieves better reconstruction quality on simulated data than the algorithms it was compared with.

Claims

1. A snapshot-type spectral imaging method based on joint optimization, comprising the following steps:

in the use stage, using the model parameters obtained in the training stage, generating hyperspectral images with the joint optimization network and storing them; if a real hyperspectral image corresponding to the reconstructed hyperspectral image exists, comparing and evaluating the two to judge the performance of the network model;

Wherein the training phase comprises the following steps:

step 1.1: preprocessing a hyperspectral image dataset to generate a training dataset;

Wherein each hyperspectral image block is used as a training sample, and all blocks are regarded as training data;

step 1.2: designing a multispectral filter array and a spectral response function, wherein the parameters of the multispectral filter array and the spectral response function are learnable;

first, optimizing the spatial arrangement of the multispectral filter array; next, designing a learnable spectral response function curve; finally, iterating the network over both sets of parameters together so as to update the relevant parameters;

optimizing the spatial arrangement of the multispectral filter array, and realizing the multispectral filter array as a trainable physical binary mask; the multispectral filter array F is regarded as a 0-1 vector and is used to simulate the process by which the snapshot spectral imaging system acquires mosaic images; obtaining the corresponding sensor measurements by tiling the multispectral filter array until the whole training sample is covered; applying a Softmax with a temperature coefficient to the multispectral filter array so that the learned multispectral filter array becomes a 0-1 vector;

optimizing the spectral response function by selecting an optimal spectral response from the set of candidate spectral response functions for each spatial location of the multispectral filter array;

wherein the Softmax operation with a temperature coefficient is applied to the multispectral filter array F as follows:

$$F' = \mathrm{Softmax}(\alpha_t F) \tag{1}$$

wherein $F'$ is updated after each iteration, and $\alpha_t$ is a temperature parameter that increases slowly over the training iterations; here, the standard Softmax function is applied to the multidimensional input tensor to rescale it so that the elements of the output tensor lie in [0, 1] and sum to 1; with the additional hyperparameter τ, larger values become larger and smaller values become smaller under the Softmax function;

forcing the network to learn a non-negative and smooth spatial spectral arrangement pattern, expressed as:

wherein η₁ and η₂ are hyperparameters; the gradient term denotes the gradient of the transformation matrix A along the spectral dimension; $\|\cdot\|_2^2$ denotes the squared 2-norm; and this loss is added to the total loss function for network optimization;

step 1.3: the hyperspectral image reconstruction network model first mosaics the input hyperspectral image within the model and then reconstructs it, generating a set of multispectral filter array masks, a spectral response function arrangement, and a reconstructed hyperspectral image; the reconstructed hyperspectral image is then compared with the real hyperspectral image, a loss function is calculated, and the parameters in the model are updated according to the loss function.

2. The method of claim 1, wherein in step 1.2, the optimization of the multispectral response function curve directly designs the optimal spatial spectrum arrangement through the network itself, specifically as follows:

according to the linear relation between the mosaic image and the hyperspectral image, in the optimization of the spatial spectrum mode, the following two constraints are applied:

first, since each pixel value of the simulated mosaic image is a positive number, all values in the learned arrangement pattern are non-negative;

Second, the spectral dimension arrangement needs to be smooth to facilitate the implementation of the filter.

3. A method of joint optimization-based snapshot spectral imaging according to claim 1, wherein step 1.3 includes the steps of:

first, randomly selecting a plurality of hyperspectral images from the training data set and inputting them into the joint optimization network model; the model mosaics each hyperspectral image to obtain a mosaic image and generates a reconstructed hyperspectral image through spatial demosaicing and spectral super-resolution;

the joint optimization network model comprises deep prior regularization and a convolutional neural network; in the convolutional neural network, a simple neural network composed of convolutional layers and activation functions is adopted in place of the transformation matrix;

given a set of training hyperspectral images, the hardware encoding process comprises projecting the corresponding input spatial and spectral information under the control of a trainable multispectral filter array and spectral response function, the projection generating a single-channel mosaic image that serves as the input to decoding; at the software decoder, a deep unfolding network is adopted to perform spatial demosaicing and spectral super-resolution simultaneously and to reconstruct a hyperspectral image with high spatial and spectral resolution.

4. The snapshot spectrum imaging method based on joint optimization as recited in claim 1, wherein in a use stage, if there is a real hyperspectral image corresponding to the reconstructed hyperspectral image, the two are compared and evaluated, and the effect of the network model is judged, the method comprises the following steps:

PSNR denotes the peak signal-to-noise ratio, which is used to assess the quality of the reconstructed high-resolution image, and is calculated as follows:

$$\mathrm{MSE} = \frac{1}{HW}\sum_{i=1}^{H}\sum_{j=1}^{W}\left(x(i,j) - \hat{x}(i,j)\right)^2 \tag{6}$$

$$\mathrm{PSNR} = 10\log_{10}\frac{(2^n - 1)^2}{\mathrm{MSE}} \tag{7}$$

wherein MSE is the mean square error, H and W are the height and width of the image, and $x(i,j)$ and $\hat{x}(i,j)$ are the pixel values of the original image and the reconstructed image at pixel (i, j), respectively; equation 6 calculates the mean square error between the pixels; n in equation 7 is the number of bits per pixel;

SSIM denotes structural similarity and is used to calculate the structural similarity between two hyperspectral images:

$$\mathrm{SSIM}(X,\hat{X}) = \frac{(2\mu_x\mu_{\hat{x}} + C_1)(2\sigma_{x\hat{x}} + C_2)}{(\mu_x^2 + \mu_{\hat{x}}^2 + C_1)(\sigma_x^2 + \sigma_{\hat{x}}^2 + C_2)}$$

wherein $\mu_x$ and $\mu_{\hat{x}}$ are the means of the original image X and the reconstructed image $\hat{X}$, respectively; $\sigma_x^2$ and $\sigma_{\hat{x}}^2$ are the variances of the original hyperspectral image X and the reconstructed hyperspectral image $\hat{X}$, respectively; $\sigma_{x\hat{x}}$ is the covariance of X and $\hat{X}$; $C_1$ and $C_2$ are constants that prevent errors caused by a zero denominator; the value of SSIM lies between 0 and 1, and the larger the value, the smaller the distortion of the reconstructed image and the higher its fidelity;

SAM is spectral and measures the spectral similarity between the original image X and the reconstructed image $\hat{X}$; SAM is computed from the cosine of the angle between the test spectral vector and the reference spectral vector, as expressed by the following formula:

$$\mathrm{SAM} = \frac{1}{N}\sum_{i=1}^{N}\arccos\left(\frac{\sum_{j=1}^{M} x_{ij}\hat{x}_{ij}}{\sqrt{\sum_{j=1}^{M} x_{ij}^2}\,\sqrt{\sum_{j=1}^{M}\hat{x}_{ij}^2}}\right)$$

wherein M is the number of channels of the hyperspectral image, and N is the total number of pixels of the hyperspectral image; $x_{ij}$ and $\hat{x}_{ij}$ are the pixel values of the original image and the reconstructed image at the i-th pixel and the j-th channel, respectively; the smaller the SAM value, the better the quality of the restored image.