CN117726916B - Implicit fusion method for enhancing image resolution fusion


Info

Publication number
CN117726916B
Authority
CN
China
Prior art keywords
fusion
image
feature
resolution
coordinate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410181120.XA
Other languages
Chinese (zh)
Other versions
CN117726916A (en)
Inventor
邓良剑
梁玉洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202410181120.XA
Publication of CN117726916A
Application granted
Publication of CN117726916B

Landscapes

  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an implicit fusion method for enhancing image resolution fusion, belonging to the field of remote sensing image fusion. The method adopts an encoder-decoder network structure designed in two stages. In the first stage, coordinate information, spatial information and spectral information are integrated by an implicit neural fusion function. In the second stage, a Fourier unit is introduced to extract the spectral information, and effective interaction between the spectral information and the fused information is realized through a dual-branch decoder. Compared with other neural network methods, the method achieves a continuous representation of the image, successfully maps spatial coordinates to pixel information, makes full use of multi-modal information in the fusion process, and obtains a superior fusion effect. Experiments show that the method achieves state-of-the-art performance on two different datasets and provides a general model for other multi-modal fusion tasks.

Description

Implicit fusion method for enhancing image resolution fusion
Technical Field
The invention belongs to the field of remote sensing image fusion, and in particular relates to an implicit fusion method for enhancing image resolution fusion.
Background
A hyperspectral image (HSI) carries a large number of spectral bands with rich spectral information, which helps to comprehensively characterize materials at different wavelengths, effectively describes the spectral distribution of each scene point, and provides inherent, discriminative spectral information about the scene. The collected dense spectral band data can be applied in many fields, including object recognition and segmentation, anomaly detection, medical image analysis and remote sensing, so such data is increasingly used in real life. However, due to the inherent mechanisms of today's physical imaging, the acquired hyperspectral images generally have low spatial resolution. Multispectral images (MSI) have higher spatial resolution, but this comes at the cost of reduced spectral resolution. Therefore, combining a multispectral image with a hyperspectral image, which promises to simultaneously obtain the high spectral resolution of the hyperspectral image and the high spatial resolution of the multispectral image, is attracting more and more attention.
Hyperspectral image fusion is an efficient and economical way to enhance the spatial resolution of a hyperspectral image (HSI) by combining it with a higher-spatial-resolution multispectral image (MSI) of the same scene. A multispectral image provides information in a limited number of bands, while a hyperspectral image collects spectral information in many more bands, so fusion can improve the richness of the information. Multispectral images typically have higher spatial resolution, whereas hyperspectral images provide more detailed spectral information; fusion can therefore achieve richer spectral detail at high resolution. In fields such as agriculture and environmental monitoring, the fusion task helps to identify ground objects more accurately, monitor vegetation and detect environmental changes. For the remote sensing field, the fusion task supports a more comprehensive understanding of surface characteristics and improves the application value of remote sensing data in resource management and environmental monitoring.
Although prior studies have achieved significant results on the hyperspectral fusion super-resolution task, some problems and drawbacks inevitably remain. Current research shows that designing a dual-branch fusion network for high-resolution multispectral images (HR-MSI) and low-resolution hyperspectral images (LR-HSI) is limited by the image features themselves. Most existing networks are built on generic CNN frameworks that lack interpretability for the hyperspectral image super-resolution fusion task and are mostly based on explicit image representations. It is therefore urgent to follow the rapid development of the technology closely in order to find new research breakthroughs and methods.
Disclosure of Invention
The invention aims to provide an efficient and accurate multispectral and hyperspectral image fusion super-resolution method. It proposes a hyperspectral image fusion super-resolution method based on an implicit fusion function, offers a more innovative solution to the image fusion task, and provides stronger and more reasonable technical support for applications in the field.
In order to achieve the above object, the technical scheme of the present invention is as follows: the hyperspectral image fusion super-resolution method based on the implicit fusion function comprises the following steps:
Step 1: acquiring and processing a dataset; acquiring an image dataset, preprocessing the images in the dataset to obtain a low-resolution hyperspectral image I_LR, a high-resolution multispectral image I_HR and a corresponding real pixel value image I_GT, and dividing them into a training dataset and a test dataset;
Step 2: coding and multi-mode information generation;
First, the low-resolution hyperspectral image I_LR is upsampled by bicubic interpolation to obtain the upsampled low-resolution hyperspectral image I_LR_up;
Then the upsampled low-resolution hyperspectral image I_LR_up and the high-resolution multispectral image I_HR are spliced together in the channel dimension to obtain the splice feature Cat(I_LR_up, I_HR);
The low-resolution hyperspectral image I_LR and the splice feature Cat(I_LR_up, I_HR) are respectively input into a spectral encoding function and a spatial encoding function, and the spectral modality feature S_pe and the spatial modality feature S_pa are computed;
The center position of each pixel is expressed as its coordinate point, and the coordinate map is scaled into a square grid of size 2×2 to obtain the normalized two-dimensional coordinate map X_HR of the high-resolution domain, from which the query coordinate x_q and the nearest coordinates x_q,i are obtained, yielding the coordinate modality information PE(δX);
Step 3: first-stage fusion by implicit neural image interpolation; the spectral modality feature S_pe, the spatial modality feature S_pa and the coordinate modality feature PE(δX) obtained in step 2 are spliced in the channel dimension, an implicit feature fusion function is used to obtain the interpolated pixel values v_q,i and similarity weights w_q,i of the pixels, and similarity feature integration is then performed to obtain the fusion feature map ε;
Step 4: decoding stage; the spectral response information F_spe obtained from the high-resolution multispectral image I_HR through a Fourier unit and the fusion feature map ε output by the encoder are simultaneously input into the dual-branch Decoder for fusion; the upper branch takes the fusion feature ε as input and the lower branch takes the spectral feature F_spe as input; the upper branch uses the ReLU activation function for nonlinearity and the lower branch uses the complex Gabor wavelet activation function, giving the output I_fusion of the decoding stage;
Step 5: obtaining the output; using a long skip connection, the output I_fusion obtained in step 4 and the upsampled low-resolution hyperspectral image I_LR_up are added element by element to obtain the fusion result I_F; the final fusion result is obtained after training and testing.
The beneficial effects of the invention are as follows: the proposed method is an efficient multispectral and hyperspectral image fusion method. An encoder-decoder network structure is designed to perform two-stage fusion: coordinate information, spatial information and spectral information are fused in the first stage by an implicit neural fusion function, a Fourier unit is designed to extract the spectral information, and a dual-branch decoder realizes the interaction between the spectral information and the fused information in the second stage. Fusing multi-modal information such as coordinates, spatial features and spectral data several times verifies the framework's effective use of frequency-domain information. Compared with other neural network methods, the method represents images continuously, realizes a continuous mapping from spatial coordinates to pixel information, and exploits multi-modal information in the fusion process to obtain a better fusion effect; the implicit image interpolation also brings a more reasonable interpretation to MHIF (multispectral and hyperspectral image fusion) tasks. Experiments prove that the method achieves state-of-the-art performance on two different datasets and provides a general example for other multi-modal fusion tasks.
Drawings
FIG. 1 is a main flow chart of an embodiment of the present invention.
Fig. 2 is a schematic diagram of the relative coordinates of step 2 and step 3 according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of the Fourier unit of step 4 in accordance with an embodiment of the invention.
Fig. 4 is a schematic diagram of a decoder according to step 4 of the embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the drawings and the specific examples. As shown in fig. 1, an implicit fusion method for enhancing image resolution fusion includes the following steps:
Step 1, acquiring and processing the image dataset: 20 images were randomly selected from the CAVE dataset for training, with the remaining 11 images of the dataset forming the test dataset. 20 images were selected from the Harvard dataset and a 1000×1000 region was cropped from the upper-left corner of each; 10 of these images were used for training and 10 for testing.
For the CAVE dataset, we cut 3920 overlapping patches from the 20 selected training images, each patch being 64×64×31 in size, and take them as the real pixel value images (ground-truth, hereinafter GT images). To generate a suitable low-resolution hyperspectral image (LR-HSI), each high-resolution hyperspectral patch is blurred with a 3×3 Gaussian kernel of standard deviation σ, and the blurred patch is then downsampled by a scale factor τ; in this embodiment, σ is 0.5 and τ is 4. Furthermore, we created the high-resolution multispectral image patches using the common spectral response function of the Nikon D700 camera and the high-resolution hyperspectral image (HR-HSI). Thus, we generate 3920 input pairs (LR-HSI, HR-MSI), with LR-HSIs of size 16×16×31 and HR-MSIs of size 64×64×3. The input pairs and corresponding GT images were then randomly split into training data (80%) and test data (20%).
The above procedure applies equally to the Harvard dataset. Applying it to the two datasets yields the low-resolution hyperspectral image I_LR, the high-resolution multispectral image I_HR and the corresponding GT image I_GT.
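To make the degradation pipeline concrete, the following is a minimal PyTorch sketch of the LR-HSI generation described above; the helper name make_lr_hsi and the strided subsampling are illustrative assumptions rather than the patent's own code:

```python
import torch
import torch.nn.functional as F

def make_lr_hsi(hr_hsi: torch.Tensor, sigma: float = 0.5, tau: int = 4) -> torch.Tensor:
    """hr_hsi: (B, S, H, W) ground-truth hyperspectral patch, e.g. (B, 31, 64, 64)."""
    bands = hr_hsi.shape[1]
    # Build the 3x3 Gaussian kernel with the given standard deviation.
    ax = torch.arange(3, dtype=torch.float32) - 1.0
    g1d = torch.exp(-ax ** 2 / (2 * sigma ** 2))
    k2d = torch.outer(g1d, g1d)
    k2d = (k2d / k2d.sum()).view(1, 1, 3, 3).repeat(bands, 1, 1, 1)
    # Depthwise (per-band) blur, then strided subsampling by the scale factor tau.
    blurred = F.conv2d(hr_hsi, k2d, padding=1, groups=bands)
    return blurred[..., ::tau, ::tau]  # (B, 31, 16, 16) for a 64x64 input
```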
Step 2, generating the spectral modality feature S_pe, the spatial modality feature S_pa and the coordinate modality feature PE(δX).
Step 2.1: First, the low-resolution hyperspectral image I_LR is upsampled by bicubic interpolation to obtain the upsampled low-resolution hyperspectral image I_LR_up; the process can be expressed as:
I_LR_up = f_BI(I_LR)
where f_BI(·) is the bicubic interpolation upsampling function; I_LR is the low-resolution hyperspectral image with spatial size h×w and S channels; I_LR_up is the bicubic-interpolated upsampled LR-HSI with spatial size H×W and S channels; and I_HR is the high-resolution multispectral image with spatial size H×W and s channels.
The upsampled low-resolution hyperspectral image I_LR_up and the high-resolution multispectral image I_HR are spliced together in the channel dimension to obtain the splice feature Cat(I_LR_up, I_HR), where Cat(·) denotes concatenation along the channel dimension. The low-resolution hyperspectral image I_LR and the splice feature Cat(I_LR_up, I_HR) are respectively input into the spectral encoding function and the spatial encoding function, and their spectral and spatial modality features are computed:
S_pe = E_φ(I_LR), S_pa = E_ψ(Cat(I_LR_up, I_HR))
where S_pe is the spectral modality feature with spatial size h×w and D channels; S_pa is the spatial modality feature with spatial size H×W and D channels; E_φ is the spectral encoding function and φ its learnable parameters; likewise, E_ψ is the spatial encoding function and ψ its learnable parameters; both are implemented with EDSR encoder networks.
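The encoding step can be sketched in PyTorch as follows, assuming f_BI is realized with bicubic interpolation; ConvEncoder is a simplified stand-in for the EDSR encoders E_φ and E_ψ, whose exact layer configuration the patent does not spell out:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvEncoder(nn.Module):
    """Placeholder for an EDSR-style encoder producing D-channel features."""
    def __init__(self, in_ch: int, feat_dim: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, feat_dim, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1))

    def forward(self, x):
        return self.body(x)

def encode(i_lr, i_hr, e_spe, e_spa):
    """i_lr: (B, S, h, w) LR-HSI; i_hr: (B, s, H, W) HR-MSI."""
    i_lr_up = F.interpolate(i_lr, size=i_hr.shape[-2:], mode='bicubic',
                            align_corners=False)        # f_BI(I_LR)
    cat = torch.cat([i_lr_up, i_hr], dim=1)             # Cat(I_LR_up, I_HR)
    s_pe = e_spe(i_lr)   # spectral modality feature S_pe, (B, D, h, w)
    s_pa = e_spa(cat)    # spatial modality feature S_pa, (B, D, H, W)
    return s_pe, s_pa, i_lr_up
```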
Step 2.2: In general, implicit neural representations use relative position coordinates δX. In the present invention, X_HR is the normalized two-dimensional coordinate map of the high-resolution domain, with spatial size H×W and 2 channels. It represents each pixel by its center position, and X_HR is scaled into a square grid of size [-1,1]×[-1,1], allowing the invention to use a continuous image representation that can share the same coordinates in the high-resolution (HR) and low-resolution (LR) domains. The query position coordinate is x_q ∈ X_HR, and the relative position coordinates are then expressed as:
δX = x_q − x_q,i
where x_q,i are the coordinates nearest to the query coordinate x_q; the four pixels closest to q in the HR domain are, in this invention, the four corners of the square grid. After obtaining the relative position coordinates δX, position encoding is performed to obtain the coordinate modality feature PE(δX):
PE(δX) = [sin(2^0·δX), cos(2^0·δX), ..., sin(2^(L−1)·δX), cos(2^(L−1)·δX)]
where PE is the position encoding function and L is a hyperparameter; in this embodiment, L is set to 10.
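The position encoding PE(δX) admits a short sketch; the tensor layout (a trailing dimension of 2 holding the relative coordinates) is an illustrative assumption:

```python
import torch

def positional_encoding(delta_x: torch.Tensor, L: int = 10) -> torch.Tensor:
    """delta_x: (..., 2) relative coordinates in [-1, 1]; returns (..., 4*L)."""
    freqs = 2.0 ** torch.arange(L, dtype=delta_x.dtype)   # 2^0 ... 2^(L-1)
    scaled = delta_x.unsqueeze(-1) * freqs                # (..., 2, L)
    enc = torch.cat([torch.sin(scaled), torch.cos(scaled)], dim=-1)
    return enc.flatten(-2)                                # concatenate all bands
```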
Step 3: calculating the interpolated pixel values and similarity weights, and performing the first-stage fusion using neural implicit image interpolation. The specific process is as follows:
Step 3.1: the spectral modality feature S_pe, the spatial modality feature S_pa and the coordinate modality feature PE(δX) calculated in step 2 are input into the implicit feature fusion function f_θ, which fuses the features and generates the similarity weights w_q,i. The process is expressed as:
W, V = f_θ(S_pe, S_pa, PE(δX))
where V is the interpolation value feature vector obtained by the function and W is the interpolation weight feature vector. In this embodiment, the implicit feature fusion function f_θ is a multilayer perceptron (MLP) with 2 fully connected layers, and θ is its learnable parameters, which integrate the learning of the interpolation weights and values. As shown in fig. 2, for a query pixel q with corresponding coordinate x_q ∈ X_HR and nearest position coordinates x_q,i, the implicit feature fusion function f_θ yields the interpolated pixel values v_q,i ∈ V corresponding to the neighboring coordinates and the learned similarity weights w_q,i ∈ W. Finally, the similarity weights are normalized with a softmax function:
ŵ_q,i = exp(w_q,i) / Σ_{j=1}^{4} exp(w_q,j)
Step 3.2: the neural implicit image interpolation is performed to achieve feature integration. The specific interpolation formula is:
ε_q = Σ_{i=1}^{4} ŵ_q,i · v_q,i
where ε_q is the fusion feature at the coordinate x_q position; for I_LR and I_HR, ε, i.e. the fusion feature map of I_LR and I_HR after image interpolation, has spatial size H×W and S channels.
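Steps 3.1 and 3.2 together can be sketched as below, assuming the features Cat(S_pe, S_pa, PE(δX)) of the four nearest neighbors of each query pixel have already been gathered into a (B, Q, 4, C_in) tensor; the module name and hidden width are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ImplicitFusion(nn.Module):
    def __init__(self, c_in: int, c_out: int, hidden: int = 256):
        super().__init__()
        # 2-layer MLP f_theta predicting one weight logit and a value vector
        # per neighbor, jointly learning interpolation weights and values.
        self.mlp = nn.Sequential(nn.Linear(c_in, hidden), nn.ReLU(inplace=True),
                                 nn.Linear(hidden, 1 + c_out))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        """feats: (B, Q, 4, C_in) = Cat(S_pe, S_pa, PE(dX)) per neighbor i."""
        out = self.mlp(feats)
        w, v = out[..., :1], out[..., 1:]   # weights W and values V
        w = torch.softmax(w, dim=2)         # normalize over the 4 neighbors
        return (w * v).sum(dim=2)           # eps_q = sum_i w_q,i * v_q,i
```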
Step 4. In the decoding stage, as shown in fig. 3, I_HR is first transformed into the Fourier domain by the Fourier transform, a convolution layer captures the spectral features in the frequency domain, and the inverse Fourier transform then maps the spectral features back to the spatial domain, giving the spectral feature F_spe; this process is called a Fourier unit and can be expressed as:
F_spe = FU(I_HR)
where FU(·) is the Fourier unit.
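A compact sketch of the Fourier unit FU(·) follows, assuming a 1×1 convolution applied to the stacked real and imaginary parts in the frequency domain (the patent does not specify the convolution's exact configuration):

```python
import torch
import torch.nn as nn

class FourierUnit(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(2 * channels, 2 * channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """x: (B, C, H, W) -> spectral feature F_spe of the same shape."""
        spec = torch.fft.rfft2(x, norm='ortho')        # to the Fourier domain
        z = torch.cat([spec.real, spec.imag], dim=1)   # (B, 2C, H, W//2+1)
        z = self.conv(z)                               # conv in the frequency domain
        real, imag = z.chunk(2, dim=1)
        spec = torch.complex(real, imag)
        return torch.fft.irfft2(spec, s=x.shape[-2:], norm='ortho')  # back to space
```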
After obtaining the spectral feature F_spe, it is input together with the fusion feature map ε into the dual-branch decoder to obtain the final decoder output feature:
I_fusion = D_η(ε, F_spe)
where D_η(·) is the dual-branch decoder and η its learnable parameters. The decoder consists of four 1×1 convolution layers with ReLU activation functions and Complex Gabor wavelet activation functions interleaved between the convolutions, forming a dual-branch interactive fusion decoder. The upper branch takes the fusion feature ε as input and the lower branch takes the spectral feature F_spe as input. The spectral feature and the fusion feature each first enter a 1×1 convolution layer; the two resulting feature maps are then matrix-multiplied to form the new spectral feature, and this feature is spliced in the channel dimension with the fusion feature input to the decoder to form the new fusion feature. After the new spectral and fusion features are obtained, the upper branch applies the ReLU activation function and the lower branch applies the complex Gabor wavelet activation function, and the two new features enter the next round of convolution layers. After this operation is performed 4 times, the two branch features are multiplied and passed through a final convolution layer to obtain the output feature I_fusion. The specific structure is shown in fig. 4. The ReLU activation function is denoted σ_ReLU:
σ_ReLU(x) = max(0, x)
The Complex Gabor wavelet activation function is σ_Gabor:
σ_Gabor(x) = exp(j·ω_0·x)·exp(−|s_0·x|^2)
where x is the feature input to the activation function, ω_0 controls the frequency of the wavelet, and s_0 controls the width of the wavelet.
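The complex Gabor wavelet activation can be sketched as a PyTorch module; the default values of ω_0 and s_0 are illustrative assumptions, since the embodiment does not state them:

```python
import torch
import torch.nn as nn

class ComplexGabor(nn.Module):
    def __init__(self, omega_0: float = 10.0, s_0: float = 10.0):
        super().__init__()
        self.omega_0, self.s_0 = omega_0, s_0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # sigma(x) = exp(j*omega_0*x) * exp(-|s_0*x|^2); complex-valued output.
        x = x if x.is_complex() else x.to(torch.cfloat)
        return torch.exp(1j * self.omega_0 * x - (self.s_0 * x).abs() ** 2)
```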
Step 5: the output feature I_fusion of the dual-branch decoder and the bicubic-interpolated upsampled image I_LR_up are added element by element to obtain the desired hyperspectral fusion image I_F:
I_F = I_fusion + I_LR_up
where I_fusion is the output of the dual-branch decoder and I_LR_up is the bicubic-interpolated upsampled image of I_LR.
In this embodiment, training and testing are implemented with the Adam optimizer under PyTorch.0.1 and Python 3.9.15, on a Linux system with an NVIDIA RTX 3090 GPU (24 GB). During training, 2000 epochs are trained in total; the learning rate is set to 0.0001 for the first 1000 epochs and to 0.00001 for the last 1000 epochs. Optimization is performed by minimizing the standard L1 loss (mean absolute difference):
ℓ_1 = (1/N) Σ_{i=1}^{N} |I_GT(x_i) − I_F(x_i)|
where N is the total number of sampled pixels, x_i is the coordinate of an arbitrarily sampled pixel, I_GT(x_i) is the real pixel value (ground-truth), and I_F(x_i) is the predicted pixel value. During testing, this embodiment queries the coordinates of all pixels in the target domain to recover the complete upsampled fusion image.
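The training schedule above corresponds to a loop of roughly the following shape; model and train_loader are assumptions standing in for the network and data pipeline defined elsewhere:

```python
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.L1Loss()  # standard L1 loss over sampled pixels

for epoch in range(2000):
    if epoch == 1000:                      # drop the learning rate to 0.00001
        for group in optimizer.param_groups:
            group['lr'] = 1e-5
    for i_lr, i_hr, i_gt in train_loader:  # LR-HSI, HR-MSI, ground truth
        pred = model(i_lr, i_hr)           # fusion result I_F
        loss = criterion(pred, i_gt)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```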
When the two datasets are tested with the optimal model obtained through training, four evaluation indexes are adopted: PSNR (peak signal-to-noise ratio), SAM (spectral angle mapper), ERGAS (relative dimensionless global error in synthesis) and SSIM (structural similarity). These metrics are used to evaluate model performance. We performed four-fold super-resolution fusion tests on the CAVE and Harvard datasets, with the following results: the CAVE test set had an average PSNR of 52.40 dB, an average SAM of 1.925, an average ERGAS of 0.996 and an average SSIM of 0.999; the Harvard test set had an average PSNR of 49.07 dB, an average SAM of 2.093, an average ERGAS of 1.792 and an average SSIM of 0.996.
Those of ordinary skill in the art will appreciate that the embodiments described herein are intended to aid the reader in understanding the practice of the invention and that the scope of the invention is not limited to such specific statements and embodiments. Those of ordinary skill in the art can make various other specific modifications and combinations from the teachings of the present disclosure without departing from the spirit thereof, and such modifications and combinations remain within the scope of the present disclosure.

Claims (6)

1. An implicit fusion method for enhancing image resolution fusion, which is characterized by comprising the following steps:
Step 1: acquiring and processing a dataset; acquiring an image dataset, preprocessing the images in the dataset to obtain a low-resolution hyperspectral image I_LR, a high-resolution multispectral image I_HR and a corresponding real pixel value image I_GT, and dividing them into a training dataset and a test dataset;
Step 2: coding and multi-mode information generation;
Firstly, performing bicubic interpolation on the low-resolution hyperspectral image I_LR to obtain the upsampled low-resolution hyperspectral image I_LR_up;
Then splicing the upsampled low-resolution hyperspectral image I_LR_up and the high-resolution multispectral image I_HR together in the channel dimension to obtain the splice feature Cat(I_LR_up, I_HR);
Respectively inputting the low-resolution hyperspectral image I_LR and the splice feature Cat(I_LR_up, I_HR) into a spectral encoding function and a spatial encoding function, and computing the spectral modality feature S_pe and the spatial modality feature S_pa;
Expressing the center position of each pixel as its coordinate point and scaling the coordinate map into a square grid of size 2×2 to obtain the normalized two-dimensional coordinate map X_HR of the high-resolution domain, thereby obtaining the relative positions of the query coordinate x_q and the nearest coordinates x_q,i and the coordinate modality information PE(δX); the specific process of generating the coordinate modality information PE(δX) is as follows:
the normalized two-dimensional coordinate map X_HR of the high-resolution domain has spatial size H×W and 2 channels; the method represents each pixel by its center position, scales the H×W coordinate map into a square grid of size [-1,1]×[-1,1], and adopts a continuous image representation so as to share the same coordinates in the high-resolution and low-resolution domains; the query position coordinate is x_q ∈ X_HR, and the relative position coordinates are expressed as:
δX = x_q − x_q,i
where x_q,i is the coordinate nearest to the query coordinate x_q; the four pixels closest to q in the HR domain are the four corners of the square grid; after obtaining the relative position coordinates δX, position encoding is performed to obtain the coordinate modality feature PE(δX):
PE(δX) = [sin(2^0·δX), cos(2^0·δX), ..., sin(2^(L−1)·δX), cos(2^(L−1)·δX)]
where PE is the position encoding function and L is a hyperparameter;
Step 3: first-stage fusion by implicit neural image interpolation; splicing the spectral modality feature S_pe, the spatial modality feature S_pa and the coordinate modality feature PE(δX) obtained in step 2 in the channel dimension, obtaining the interpolated pixel values v_q,i and similarity weights w_q,i of the pixels with an implicit feature fusion function, and then performing similarity feature integration to obtain the fusion feature map ε; the specific steps are as follows:
Step 3.1: the spectral modality feature S_pe, the spatial modality feature S_pa and the coordinate modality feature PE(δX) are input into the implicit feature fusion function f_θ, which fuses the features and generates the similarity weights w_q,i; the process is expressed as:
W, V = f_θ(S_pe, S_pa, PE(δX))
wherein the implicit feature fusion function f_θ is a multilayer perceptron with 2 fully connected layers, V is the interpolation value feature vector obtained by the function, W is the interpolation weight feature vector, and θ is the learnable parameters; for a query pixel q, whose corresponding coordinate x_q ∈ X_HR corresponds to the nearest position coordinates x_q,i, the implicit feature fusion function f_θ yields the interpolated pixel values v_q,i ∈ V corresponding to the nearest position coordinates and the learned similarity weights w_q,i ∈ W; finally the similarity weights are normalized with a softmax function:
ŵ_q,i = exp(w_q,i) / Σ_{j=1}^{4} exp(w_q,j)
Step 3.2: performing the neural implicit image interpolation, with the specific interpolation formula:
ε_q = Σ_{i=1}^{4} ŵ_q,i · v_q,i
where ε_q is the fusion feature at the coordinate x_q position; for I_LR and I_HR, ε is the fusion feature map of I_LR and I_HR after image interpolation, with spatial size H×W and S channels;
Step 4: decoding stage; the spectral response information F_spe obtained from the high-resolution multispectral image I_HR through a Fourier unit and the fusion feature map ε output by the encoder are simultaneously input into the dual-branch Decoder for fusion, obtaining the output I_fusion of the decoding stage; the dual-branch Decoder consists of four 1×1 convolution modules with a ReLU activation function and a Complex Gabor wavelet activation function inserted between the convolutions, forming a dual-branch interactive fusion decoder; the upper branch of the decoder takes the fusion feature ε as input and the lower branch takes the spectral feature F_spe as input; the spectral feature and the fusion feature each enter a 1×1 convolution layer, the two resulting feature maps are matrix-multiplied to form the new spectral feature, and this feature is spliced in the channel dimension with the fusion feature input to the decoder to form the new fusion feature; after the new spectral and fusion features are obtained, the upper branch applies the ReLU activation function and the lower branch applies the complex Gabor wavelet activation function, and the two new features enter the next round of convolution layers; after this operation is performed 4 times, the two branch features are multiplied and passed through a final convolution layer to obtain the output feature I_fusion, expressed as:
I_fusion = D_η(ε, F_spe)
where D_η(·) is the dual-branch decoder and η is the learnable parameters of the decoder;
wherein the ReLU activation function is denoted σ_ReLU:
σ_ReLU(x) = max(0, x)
and the Complex Gabor wavelet activation function is σ_Gabor:
σ_Gabor(x) = exp(j·ω_0·x)·exp(−|s_0·x|^2)
where x is the feature input to the activation function, ω_0 controls the frequency of the wavelet, and s_0 controls the width of the wavelet;
Step 5: obtaining the output; adding the output I_fusion obtained in step 4 and the upsampled low-resolution hyperspectral image I_LR_up element by element with a long skip connection to obtain the fusion result I_F, and obtaining the final fusion result after training and testing.
2. The method of claim 1, wherein the fusion result I_F is optimized during training by minimizing the standard L1 loss (mean absolute difference):
ℓ_1 = (1/N) Σ_{i=1}^{N} |I_GT(x_i) − I_F(x_i)|
where N is the total number of sampled pixels, x_i is the coordinate of any sampled pixel, I_GT(x_i) is the true pixel value, and I_F(x_i) is the predicted pixel value.
3. An implicit fusion method for image resolution fusion enhancement according to claim 2, wherein the steps of preprocessing the images in the dataset are as follows:
M overlapping patches are cut out from the training images in the dataset, each patch being 64×64×31 in size, and used as the real pixel value images I_GT;
the initial high-resolution hyperspectral image is blurred with a Gaussian kernel of standard deviation σ, and the blurred patch is then downsampled by a scale factor τ to obtain the low-resolution hyperspectral image I_LR;
the high-resolution multispectral image patches are created with the common spectral response function of the camera and the high-resolution hyperspectral image, giving the high-resolution multispectral image I_HR; the low-resolution hyperspectral image I_LR, the high-resolution multispectral image I_HR and the corresponding real pixel value images are then randomly split into training data and test data.
4. An implicit fusion method of image resolution fusion enhancement according to claim 3, wherein the process of generating the spectral modality feature S_pe and the spatial modality feature S_pa is specifically as follows:
the process of obtaining the upsampled low-resolution hyperspectral image I_LR_up from the low-resolution hyperspectral image I_LR through bicubic interpolation is expressed as:
I_LR_up = f_BI(I_LR)
where f_BI(·) is the bicubic interpolation upsampling function; I_LR is the low-resolution hyperspectral image with spatial size h×w and S channels; I_LR_up, obtained by bicubic interpolation upsampling of the low-resolution hyperspectral image, has spatial size H×W and S channels; and I_HR is the high-resolution multispectral image with spatial size H×W and s channels;
the upsampled low-resolution hyperspectral image I_LR_up and the high-resolution multispectral image I_HR are spliced together in the channel dimension to obtain the splice feature Cat(I_LR_up, I_HR), where Cat(·) denotes concatenation along the channel dimension; the low-resolution hyperspectral image I_LR and the splice feature Cat(I_LR_up, I_HR) are respectively input into the spectral encoding function and the spatial encoding function, and their spectral and spatial modality features are computed as:
S_pe = E_φ(I_LR), S_pa = E_ψ(Cat(I_LR_up, I_HR))
where S_pe is the spectral modality feature with spatial size h×w and D channels; S_pa is the spatial modality feature with spatial size H×W and D channels; E_φ is the spectral encoding function, φ is the learnable parameters of the spectral encoding function, E_ψ is the spatial encoding function, and ψ is the learnable parameters of the spatial encoding function.
5. The method according to claim 4, wherein the process of passing through a Fourier unit is specifically: I_HR is first transformed into the Fourier domain by the Fourier transform, a convolution layer captures the spectral features in the frequency domain, and the inverse Fourier transform then maps the spectral features back to the spatial domain, giving the spectral feature F_spe; this process is called a Fourier unit, denoted as:
F_spe = FU(I_HR)
where FU(·) is the Fourier unit.
6. The method of claim 5, wherein the spectral encoding function and the spatial encoding function are both implemented with EDSR encoder networks.
CN202410181120.XA 2024-02-18 2024-02-18 Implicit fusion method for enhancing image resolution fusion Active CN117726916B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410181120.XA CN117726916B (en) 2024-02-18 2024-02-18 Implicit fusion method for enhancing image resolution fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410181120.XA CN117726916B (en) 2024-02-18 2024-02-18 Implicit fusion method for enhancing image resolution fusion

Publications (2)

Publication Number Publication Date
CN117726916A (en) 2024-03-19
CN117726916B (en) 2024-04-19

Family

ID=90205693

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410181120.XA Active CN117726916B (en) 2024-02-18 2024-02-18 Implicit fusion method for enhancing image resolution fusion

Country Status (1)

Country Link
CN (1) CN117726916B (en)


Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11676284B2 (en) * 2019-03-22 2023-06-13 Nvidia Corporation Shape fusion for image analysis
US20220122250A1 (en) * 2020-10-19 2022-04-21 Northwestern University Brain feature prediction using geometric deep learning on graph representations of medical image data
US20230062811A1 (en) * 2021-08-24 2023-03-02 PAIGE.AI, Inc. Systems and methods for processing electronic images in forensic pathology
US20230154007A1 (en) * 2021-11-15 2023-05-18 Elekta Limited Few-shot semantic image segmentation using dynamic convolution
US20230360180A1 (en) * 2022-05-04 2023-11-09 Adobe Inc. Digital image inpainting utilizing a cascaded modulation inpainting neural network
US20230368339A1 (en) * 2022-05-13 2023-11-16 Adobe Inc. Object class inpainting in digital images utilizing class-specific inpainting neural networks

Patent Citations (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018160963A1 (en) * 2017-03-02 2018-09-07 Spectral Md, Inc. Machine learning systems and techniques for multispectral amputation site analysis
CN110533620A (en) * 2019-07-19 2019-12-03 西安电子科技大学 The EO-1 hyperion and panchromatic image fusion method of space characteristics are extracted based on AAE
WO2022051421A1 (en) * 2020-09-04 2022-03-10 Nvidia Corporation Processor and system for automatic fusion of matrix multiplication and reduction operations
CN112184553A (en) * 2020-09-25 2021-01-05 西北工业大学 Hyperspectral image super-resolution method based on depth prior information
CN112634137A (en) * 2020-12-28 2021-04-09 西安电子科技大学 Hyperspectral and full-color image fusion method based on AE extraction of multi-scale spatial spectrum features
CN113763299A (en) * 2021-08-26 2021-12-07 中国人民解放军军事科学院国防工程研究院工程防护研究所 Panchromatic and multispectral image fusion method and device and application thereof
CN114119444A (en) * 2021-11-29 2022-03-01 武汉大学 Multi-source remote sensing image fusion method based on deep neural network
WO2023140488A1 (en) * 2022-01-20 2023-07-27 Samsung Electronics Co., Ltd. Bandwidth extension and speech enhancement of audio
CN114742985A (en) * 2022-03-17 2022-07-12 苏州大学 Hyperspectral feature extraction method and device and storage medium
WO2023173884A1 (en) * 2022-03-17 2023-09-21 苏州大学 Hyper-spectral feature extraction method and apparatus, and storage medium
CN114782246A (en) * 2022-03-19 2022-07-22 南京理工大学 Hyperspectral super-resolution method using asymmetric attention and wavelet sub-band injection
CN114862731A (en) * 2022-03-29 2022-08-05 武汉大学 Multi-hyperspectral image fusion method based on low-rank prior and spatial-spectral information guidance
WO2023219963A1 (en) * 2022-05-08 2023-11-16 The Medical College Of Wisconsin, Inc. Deep learning-based enhancement of multispectral magnetic resonance imaging
CN115147321A (en) * 2022-07-14 2022-10-04 武汉大学 Multi-spectral image fusion method based on interpretable neural network
CN115512192A (en) * 2022-08-16 2022-12-23 南京审计大学 Multispectral and hyperspectral image fusion method based on cross-scale octave convolution network
CN115564692A (en) * 2022-09-07 2023-01-03 宁波大学 Panchromatic-multispectral-hyperspectral integrated fusion method considering width difference
CN115760814A (en) * 2022-11-25 2023-03-07 山东师范大学 Remote sensing image fusion method and system based on double-coupling deep neural network
CN116051982A (en) * 2022-12-20 2023-05-02 北京林业大学 Deep learning method for identifying pine wood nematode disease damage wood by fusing multi-source heterogeneous data
CN115861076A (en) * 2022-12-28 2023-03-28 江南大学 Unsupervised hyperspectral image super-resolution method based on matrix decomposition network
CN116205830A (en) * 2023-02-09 2023-06-02 武汉大学 Remote sensing image fusion method based on combination of supervised learning and unsupervised learning
CN116416375A (en) * 2023-02-17 2023-07-11 贵州大学 Three-dimensional reconstruction method and system based on deep learning
CN116433548A (en) * 2023-04-14 2023-07-14 西安电子科技大学 Hyperspectral and panchromatic image fusion method based on multistage information extraction
CN116468645A (en) * 2023-06-20 2023-07-21 吉林大学 Antagonistic hyperspectral multispectral remote sensing fusion method
CN116740340A (en) * 2023-07-11 2023-09-12 西安电子科技大学 Calculation spectrum imaging error correction method based on deep learning
CN116883799A (en) * 2023-07-11 2023-10-13 西安电子科技大学 Hyperspectral image depth space spectrum fusion method guided by component replacement model
CN117197008A (en) * 2023-08-15 2023-12-08 武汉大学 Remote sensing image fusion method and system based on fusion correction
CN117274759A (en) * 2023-09-21 2023-12-22 大连大学 Infrared and visible light image fusion system based on distillation-fusion-semantic joint driving
CN117557476A (en) * 2023-10-12 2024-02-13 杭州电子科技大学 Image reconstruction method and system based on FCTFT
CN117474781A (en) * 2023-11-06 2024-01-30 东方通信股份有限公司 High spectrum and multispectral image fusion method based on attention mechanism

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Implicit Neural Feature Fusion Function for Multispectral and Hyperspectral Image Fusion; Deng, ShangQi et al.; arXiv preprint arXiv:2307.07288; 2023-10-29; pp. 1-10 *
MIMO-SST: Multi-Input Multi-Output Spatial-Spectral Transformer for Hyperspectral and Multispectral Image Fusion; Fang, Jian et al.; IEEE Transactions on Geoscience and Remote Sensing; 2024-02-02; Vol. 62; pp. 1-20 *
Hyperspectral image super-resolution method based on a multi-layer wavelet deep aggregation network; Fang Jian et al.; Acta Electronica Sinica (电子学报); 2024-01-31; Vol. 52, No. 1; pp. 201-216 *

Also Published As

Publication number Publication date
CN117726916A (en) 2024-03-19

Similar Documents

Publication Publication Date Title
Fan et al. Balanced two-stage residual networks for image super-resolution
CN110660038B (en) Multispectral image and full-color image fusion method based on generation countermeasure network
Yue et al. Deep recursive super resolution network with Laplacian Pyramid for better agricultural pest surveillance and detection
CN109727207B (en) Hyperspectral image sharpening method based on spectrum prediction residual convolution neural network
CN109102469B (en) Remote sensing image panchromatic sharpening method based on convolutional neural network
CN109064396A (en) A kind of single image super resolution ratio reconstruction method based on depth ingredient learning network
Li et al. Deep learning methods in real-time image super-resolution: a survey
Mei et al. Hyperspectral image super-resolution via convolutional neural network
Liu et al. An efficient residual learning neural network for hyperspectral image superresolution
Wang et al. Dilated projection correction network based on autoencoder for hyperspectral image super-resolution
CN115272093A (en) Hyperspectral image unmixing method based on spatial structure information constraint
CN116343052A (en) Attention and multiscale-based dual-temporal remote sensing image change detection network
Mantripragada et al. The effects of spectral dimensionality reduction on hyperspectral pixel classification: A case study
Qin et al. Deep ResNet based remote sensing image super-resolution reconstruction in discrete wavelet domain
Wang et al. Image super-resolution using multi-granularity perception and pyramid attention networks
Deng et al. Multiple frame splicing and degradation learning for hyperspectral imagery super-resolution
Liu et al. Cross-resolution feature attention network for image super-resolution
Nasrollahi et al. Deep artifact-free residual network for single-image super-resolution
Lei et al. Convolution neural network with edge structure loss for spatiotemporal remote sensing image fusion
CN117726916B (en) Implicit fusion method for enhancing image resolution fusion
Ahmadian et al. Single image super-resolution with self-organization neural networks and image laplace gradient operator
Liang et al. Blind super-resolution of single remotely sensed hyperspectral image
Liu et al. Frequency separation-based multi-scale cascading residual block network for image super resolution
Zhang et al. Thermal image super-resolution via multi-path residual attention network
Guo et al. A novel lightweight multi-dimension feature fusion network for single-image super-resolution reconstruction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant