CN112734867A - Multispectral image compression method and system based on space spectrum feature separation and extraction - Google Patents

Multispectral image compression method and system based on space spectrum feature separation and extraction

Info

Publication number
CN112734867A
CN112734867A CN202011493869.6A CN202011493869A CN112734867A CN 112734867 A CN112734867 A CN 112734867A CN 202011493869 A CN202011493869 A CN 202011493869A CN 112734867 A CN112734867 A CN 112734867A
Authority
CN
China
Prior art keywords
network
module
multispectral image
spatial
inter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011493869.6A
Other languages
Chinese (zh)
Other versions
CN112734867B (en
Inventor
孔繁锵
胡可迪
李丹
赵瞬民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN202011493869.6A priority Critical patent/CN112734867B/en
Publication of CN112734867A publication Critical patent/CN112734867A/en
Application granted granted Critical
Publication of CN112734867B publication Critical patent/CN112734867B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/002Image coding using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression Of Band Width Or Redundancy In Fax (AREA)

Abstract

The invention discloses a multispectral image compression method and system based on spatial-spectral feature separation and extraction. The spectral features and spatial features of a multispectral image are extracted separately by an inter-spectral feature extraction module and a spatial feature extraction module, fused pixel by pixel, and then jointly downsampled to reduce the feature-map size; a compressed data code stream is then obtained through quantization and entropy coding. When the image is restored, the received code stream is entropy-decoded and inverse-quantized to obtain the feature data, which are fed into a reverse decoding network whose structure is symmetric to that of the forward coding network; the feature-map size is restored by upsampling, and the spatial and inter-spectral information of the image is recovered by the corresponding spatial and inter-spectral feature recovery networks to obtain the reconstructed multispectral image. The invention separates and extracts the spatial and spectral information through the spatial-spectral feature extraction modules, which preserves the integrity of the spatial-spectral features, and, combined with the compression network and rate-distortion optimization, achieves high-performance multi-code-rate compression of multispectral images.

Description

Multispectral image compression method and system based on space spectrum feature separation and extraction
Technical Field
The invention belongs to the technical field of image processing and deep learning, and particularly relates to a multispectral image compression method and system based on space spectral feature separation and extraction.
Background
By acquiring digital images over many contiguous narrow spectral bands, a remote sensor can produce a three-dimensional multispectral image containing rich spectral and spatial information, which is widely used in military reconnaissance, target monitoring, crop condition assessment, surface resource survey, environmental research, marine applications and other fields. However, with the rapid development of multispectral imaging technology, the spatial resolution of multispectral data keeps increasing and the data volume grows rapidly as a result, which hinders the transmission, storage and application of the images and the further development of related technologies. Research on efficient image compression methods therefore has great practical significance.
Image data can be compressed because various redundant components exist in the data. The redundancy of image data mainly falls into the following categories: spatial redundancy caused by the correlation between adjacent pixels, temporal redundancy caused by the correlation between adjacent frames in an image sequence, and spectral redundancy resulting from the correlation between different color channels or spectral bands. During the imaging of a multispectral image, the band interval is small, so there is strong correlation between the data of the individual bands, which is called inter-spectral correlation; each band image is equivalent to a two-dimensional static image, so it also has the spatial correlation of an ordinary image. Compressing a multispectral image means removing both kinds of redundancy. At present, however, the techniques for removing the spatial redundancy of hyperspectral images are quite mature, while the removal of inter-spectral redundancy is still at the research stage.
Traditional multispectral image compression algorithms fall mainly into three categories: (1) algorithms based on predictive coding; (2) algorithms based on vector quantization coding; (3) algorithms based on transform coding. All three have obvious disadvantages. Predictive-coding algorithms can achieve lossless compression but the compression ratio is low; the quality of the predictor design is the main factor affecting compression performance, and a good prediction algorithm can significantly reduce the entropy of the residual image. Vector-quantization algorithms have extremely high complexity: since the distortion is inversely proportional to the codebook size, the codebook capacity must be greatly increased to reduce distortion, and an over-large codebook greatly increases the amount of computation. Transform-coding algorithms suffer from blocking artifacts and edge Gibbs effects at high compression ratios, which seriously degrades compression performance. Traditional multispectral image compression methods only process the multispectral data in a simple way and cannot fully exploit the rich spatial-spectral features of multispectral images.
With the rapid development of deep learning, its application in image processing has become increasingly widespread, so combining deep learning with image compression is gradually becoming a major trend. Such approaches mainly exploit the many learnable parameters of deep learning, building different network structures from frameworks such as convolutional neural networks or recurrent neural networks to extract the features of the image data. The advantage of deep learning is that it can extract deep information from the image while retaining the essential characteristics of objects; applied to image compression, this can effectively remedy the incomplete feature extraction of traditional compression techniques. Convolutional neural networks are the most widely used today, but general convolutional networks usually ignore the inter-spectral information of multispectral images, causing a large amount of information loss.
Disclosure of Invention
The purpose of the invention is as follows: the invention provides a multispectral image compression method and system based on space-spectrum feature separation extraction, which can realize multi-code rate compression of multispectral images and effectively improve the compression performance of the multispectral images.
The technical scheme is as follows: the invention discloses a multispectral image compression method based on space spectral feature separation and extraction, which comprises the following steps of:
(1) constructing a multispectral image compression network, training the multispectral image compression network, and optimizing network parameters to obtain an optimal multispectral image compression network model; the multispectral image compression network comprises a forward encoder network, a quantization module, an entropy coding module, an entropy decoding module, an inverse quantization module and a reverse decoding network; the forward coding network comprises an inter-spectrum feature extraction module, a spatial feature extraction module and a down-sampling module; the reverse decoding network comprises an inter-spectrum feature recovery module, a spatial feature recovery module and an up-sampling module;
(2) the method comprises the steps of sending a multispectral image to be compressed into a multispectral image compression network, extracting spectral features and spatial features of the multispectral image respectively through an inter-spectral feature extraction module and a spatial feature extraction module, reducing the size of a feature image by using down sampling, removing data redundancy through a quantization module, carrying out lossless entropy coding on quantized intermediate feature data to obtain a compressed code stream for transmission and storage, and realizing multi-magnification compression of the multispectral image by training different compression network models;
(3) entropy decoding and inverse quantization are carried out on the received compressed code stream to obtain multispectral image space spectrum characteristic data, then the multispectral image space spectrum characteristic data is input into a reverse decoding network, the characteristic image size is restored through up-sampling, and corresponding characteristics are correspondingly restored through a space characteristic restoration module and an inter-spectrum characteristic restoration module in the decoding network to obtain a reconstructed multispectral image.
Further, a linear rectification function is used in the multispectral image compression network in the step (1), and the expression of the linear rectification function is as follows:
ReLU(x_i) = max(0, x_i)
where x_i is the data of the i-th channel. The function maps the input piecewise: when the input value is less than 0 it is mapped to 0, and when the input value is greater than 0 the original value is passed through. As can be seen from its derivative, the gradient is not lost during back-propagation.
Further, the forward encoder network in step (1) includes 8 inter-spectral feature extraction modules, 8 spatial feature extraction modules and 3 down-sampling modules; the downsampling module comprises 1 downsampling operation with the step size of 2 and the convolution kernel size of 4 x 4, and 1 convolution with the step size of 1 and the convolution kernel size of 3 x 3.
Further, the multispectral image compression network in step (1) further comprises a rate-distortion optimization module; the rate-distortion optimization module uses an importance (significance) map network to continuously approximate the code length in place of a general entropy calculation, and the distribution of the spectral-feature and spatial-feature data is continuously optimized through training to make it more compact. The loss function is expressed as:
L = L_d + λ·L_r
where L_d is the distortion, λ is a penalty weight used to explicitly control the code rate, and L_r is the mean of the significance map network output P(X); L_r is calculated as:

L_r = avg(P(X))
where P(x, y) is the value computed by the significance map network for the pixel at spatial coordinates (x, y) from the unquantized encoder output, x ∈ {0, 1, ..., H}, y ∈ {0, 1, ..., W}; ω denotes the weight parameters of the significance map network, and b, k and s_0 are parameters of the significance map network. The significance map network consists of a convolutional layer, a conventional residual unit and a Sigmoid activation function, and the Sigmoid function constrains the network output to the range [0, 1]; pixels of different importance are allocated different code lengths according to the network's output, and the mean value is finally used to represent L_r.
Further, the quantization module in step (2) removes data redundancy; the quantization function is approximated by the formula:
X_Q = Round[(2^Q − 1) × X_S]
where X_S is the intermediate feature data and Q is the quantization level. With this approximated quantization function, the data are rounded in the forward pass, while in the backward pass the quantization layer is skipped and the gradient is passed directly to the previous layer.
Further, there are two ways to implement the multi-rate compression of the multispectral image in step (2):
1) fix the penalty weight λ in the rate-distortion optimization, and obtain compression networks with different code rates by changing the number of neurons in the intermediate convolutional layers and retraining; the fewer the neurons, the lower the compression code rate;
2) fix the number of neurons in the intermediate convolutional layers, and obtain compression networks with different code rates by changing the penalty weight λ in the rate-distortion optimization and retraining; the larger λ is, the lower the compression code rate.
The invention also provides a multispectral image compression system based on spatial-spectral feature separation and extraction, which comprises a forward coding network, a quantization module, an entropy coding module, an entropy decoding module, an inverse quantization module and a reverse decoding network. The forward coding network comprises an inter-spectral feature extraction module, a spatial feature extraction module and a down-sampling module; the reverse decoding network comprises an inter-spectral feature recovery module, a spatial feature recovery module and an up-sampling module. The inter-spectral feature extraction module and the spatial feature extraction module respectively extract the inter-spectral and spatial features of the multispectral image, which are fused pixel by pixel to obtain a feature map; the down-sampling module reduces the feature-map size; the quantization module and the entropy coding module quantize and entropy-code the feature map to obtain a compressed code stream for transmission and storage; the entropy decoding module and the inverse quantization module entropy-decode and inverse-quantize the compressed code stream to obtain the feature data of the multispectral image; the up-sampling module restores the feature-map size; and the spatial feature recovery module and the inter-spectral feature recovery module recover the corresponding features of the feature map to obtain the reconstructed multispectral image.
Beneficial effects: compared with the prior art, the invention has the following advantages. 1. The proposed network model uses separate spatial-feature and inter-spectral-feature extraction modules to extract the corresponding features independently, which guarantees complete separation of the two kinds of features while preserving the integrity of the spatial-spectral features, without introducing new parameters; during training, the data can be fed directly into the network and the result obtained at the output, which removes intermediate independent learning steps and greatly improves learning efficiency. 2. The spatial-feature and inter-spectral-feature extraction modules are cascaded and connected in parallel to form the forward coding network and the reverse decoding network, which improves training efficiency; meanwhile, the feature extraction units contain short-circuit (skip) connections and follow the residual-unit design, which accelerates training and avoids problems such as gradient explosion and network degradation caused by deep networks. 3. Rate-distortion optimization is added to the loss function to control the distribution of the spectral-feature and spatial-feature data and make it more compact, so that the reconstructed image is closer to the original while the inter-spectral features of the image are preserved more completely.
Drawings
FIG. 1 is a schematic structural diagram of a multispectral image compression system based on spatial spectral feature separation extraction;
FIG. 2 is a schematic diagram of an inter-spectral feature extraction unit;
FIG. 3 is a schematic diagram of a spatial feature extraction unit;
FIG. 4 is a schematic diagram of a forward coding network architecture;
FIG. 5 is a schematic diagram of a reverse decoding network architecture;
FIG. 6 is a graph of the average PSNR of the test set at different code rates;
FIG. 7 shows PSNR curves of selected test images, where (a) is the PSNR curve of test image ah_chun, (b) that of ah_xia, (c) that of hunan_chun, and (d) that of tj_dong;
FIG. 8 is a restored image presentation diagram;
fig. 9 is a spectral angle graph.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The invention provides a multispectral image compression system based on spatial-spectral feature separation and extraction, which comprises a forward coding network, a quantization module, an entropy coding module, an entropy decoding module, an inverse quantization module and a reverse decoding network, the forward coding network being connected with the quantization module. The forward coding network comprises an inter-spectral feature extraction module, a spatial feature extraction module and a down-sampling module; the reverse decoding network comprises an inter-spectral feature recovery module, a spatial feature recovery module and an up-sampling module. The inter-spectral feature extraction module and the spatial feature extraction module respectively extract the inter-spectral and spatial features of the multispectral image, which are fused pixel by pixel to obtain a feature map; the down-sampling module reduces the feature-map size; the quantization module and the entropy coding module quantize and entropy-code the feature map to obtain a compressed code stream for transmission and storage; the entropy decoding module and the inverse quantization module entropy-decode and inverse-quantize the compressed code stream to obtain the feature data of the multispectral image; the up-sampling module restores the feature-map size; and the spatial feature recovery module and the inter-spectral feature recovery module recover the corresponding features of the feature map to obtain the reconstructed multispectral image.
The invention provides a multispectral image compression method based on space spectrum feature separation and extraction, which specifically comprises the following steps:
step S1: constructing a multispectral image compression network, training the multispectral image compression network, and optimizing network parameters to obtain an optimal multispectral image compression network model; the multispectral image compression network comprises a forward encoder network, a quantization module, an entropy encoding module, an entropy decoding module, an inverse quantization module and a reverse decoding network; the forward coding network comprises an inter-spectrum feature extraction module, a spatial feature extraction module and a down-sampling module; the reverse decoding code network comprises an inter-spectrum characteristic recovery module, a spatial characteristic recovery module and an up-sampling module.
The structures of the inter-spectral feature extraction unit and the spatial feature extraction unit are shown in fig. 2 and fig. 3, respectively, where Conv denotes convolution and ReLU denotes the linear rectification function; short-circuit (skip) connections are used to learn residuals and accelerate training. The inter-spectral feature extraction unit extracts features with a one-dimensional inter-spectral convolution whose kernel size is 1 × 1 × 3, so the computation is performed only along the spectral dimension and complete inter-spectral feature data are extracted; a nonlinear ReLU activation function is added to learn nonlinear information. The spatial feature extraction unit extracts features with a grouped convolution whose kernel size is 3 × 3; the number of groups is determined by the number of bands of the input multispectral image: a seven-band input gives seven groups and an eight-band input gives eight groups. Several inter-spectral (spatial) feature extraction units are stacked and cascaded to form the corresponding inter-spectral (spatial) feature extraction module; after feature extraction, the features are fused and jointly downsampled to form the forward coding network.
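As an illustration, the two kinds of units could be implemented in PyTorch roughly as follows (a minimal sketch, assuming a seven-band input whose band axis is placed as the depth dimension of a 3D tensor; the channel widths, the position of the ReLU relative to the skip connection, and all names are assumptions rather than details taken from the patent figures).

```python
import torch
import torch.nn as nn

class SpectralFeatureUnit(nn.Module):
    """Inter-spectral unit: 1x1x3 convolution acting only along the band axis,
    with a short-circuit (residual) connection."""
    def __init__(self):
        super().__init__()
        # Input laid out as (N, 1, bands, H, W); the (3, 1, 1) kernel mixes bands only.
        self.conv = nn.Conv3d(1, 1, kernel_size=(3, 1, 1), stride=1, padding=(1, 0, 0))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):                      # x: (N, 1, bands, H, W)
        return x + self.relu(self.conv(x))     # residual learning

class SpatialFeatureUnit(nn.Module):
    """Spatial unit: 3x3 grouped convolution, one group per spectral band,
    with a short-circuit (residual) connection."""
    def __init__(self, channels=56, groups=7):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, stride=1,
                              padding=1, groups=groups)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):                      # x: (N, channels, H, W)
        return x + self.relu(self.conv(x))
```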
The forward coding network is mainly formed by cascading convolutional layers, linear rectification units, the spatial-spectral feature extraction networks and downsampling layers; its function is to extract the complete spatial-spectral features of the multispectral image.
The quantization layer and entropy coding follow the forward coding network; their function is to quantize and entropy-code the spatial-spectral features extracted by the forward coding network. Quantization reduces redundant information in the data but is also the main source of distortion in image compression; entropy coding losslessly encodes the quantized data into a binary compressed code stream, further removing statistical redundancy. The quantization layer, the entropy coding and the forward coding network together form the compression model of the network.
To further optimize the performance of the multispectral compression network, i.e. to achieve as low a code rate as possible while preserving the quality of the restored image, a trade-off between code rate and image quality loss is needed. A rate-distortion optimization module is therefore added, which uses a significance map network to continuously approximate the code length in place of a general entropy calculation; through training, the distribution of the spectral-feature and spatial-feature data is continuously optimized to make it more compact. The loss function is expressed as:
L = L_d + λ·L_r
where L_d is the distortion loss, for which the mean-square error is used here; λ is a penalty weight that explicitly controls the code rate; and L_r is the mean of the significance map output, which replaces the general entropy calculation and reflects how concentrated the distribution of the intermediate feature data is.
L_d is calculated as:

L_d = (1 / (N × H × W × C)) × Σ_{x,y,z} ( I(x, y, z) − Î(x, y, z) )²

where H, W and C are the height, width and number of channels of the image respectively, N is the batch size, I(x, y, z) is the pixel value of the input image at spatial position (x, y, z), and Î(x, y, z) is the pixel value of the image restored by the compression network at spatial position (x, y, z).
L_r is calculated as:

L_r = avg(P(X))

where P(x, y) is the value computed by the significance map network for the pixel at spatial coordinates (x, y) from the unquantized encoder output, x ∈ {0, 1, ..., H}, y ∈ {0, 1, ..., W}; ω denotes the weight parameters of the significance map network, and b, k and s_0 are parameters of the significance map network. The significance map network consists of a convolutional layer, a conventional residual unit and a Sigmoid activation function; the Sigmoid function constrains the network output to the range [0, 1]. Pixels of different importance are allocated different code lengths according to the network's output, and the mean value is finally used to represent L_r. By introducing this measure of the entropy of the intermediate feature data into the loss function, the network continuously optimizes the distribution of the intermediate feature data during learning, making it more concentrated and further improving the compression performance of the network.
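A minimal PyTorch sketch of this loss is given below, assuming mean-square error for L_d and taking L_r as the mean of the significance map output; the exact layer configuration of the significance map network (shown here as one convolution, one residual unit and a Sigmoid head) and all channel widths are assumptions.

```python
import torch
import torch.nn as nn

class SignificanceMapNet(nn.Module):
    """Illustrative significance map network: a convolution, a residual unit
    and a Sigmoid head producing importance values P(X) in [0, 1]."""
    def __init__(self, channels=64):
        super().__init__()
        self.stem = nn.Conv2d(channels, channels, 3, padding=1)
        self.res = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))
        self.head = nn.Conv2d(channels, 1, 1)

    def forward(self, feat):                    # feat: unquantized encoder output
        h = self.stem(feat)
        h = h + self.res(h)                     # conventional residual unit
        return torch.sigmoid(self.head(h))      # P(X), one value per spatial position

class RateDistortionLoss(nn.Module):
    """L = L_d + lambda * L_r with L_d the mean-square error and L_r = avg(P(X))."""
    def __init__(self, lam=0.01):
        super().__init__()
        self.lam = lam
        self.mse = nn.MSELoss()

    def forward(self, original, reconstructed, importance_map):
        l_d = self.mse(reconstructed, original)  # distortion term
        l_r = importance_map.mean()              # code-length surrogate avg(P(X))
        return l_d + self.lam * l_r
```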
The inverse quantization, entropy decoding and reverse decoding network in the network structure correspond one-to-one to the quantization, entropy coding and forward coding network; the whole network has a symmetric structure, and together these parts form a complete compression network.
Step S2: and inputting training set data into the multispectral image compression network, and optimizing network parameters by training the network until an optimal multispectral image compression network model is obtained. The multi-spectral image compression method includes the steps of sending a multi-spectral image to be compressed into a multi-spectral image compression network, extracting spectral features and spatial features of the multi-spectral image respectively through an inter-spectral feature extraction module and a spatial feature extraction module, reducing the size of a feature image by using down-sampling, removing data redundancy through a quantization module, carrying out lossless entropy coding on quantized intermediate feature data to obtain compressed code streams for transmission and storage, and realizing multi-magnification compression of the multi-spectral image by training different compression network models.
Step S21: the whole compression network model is trained end to end. The preprocessed multispectral images are fed directly into the network model at the input to start training, and batch training is used, i.e. several images are read at a time, which improves training efficiency. The image data pass through the compression model to the decompression model; the data obtained at the output are compared with the original data to compute the distortion error, the network parameters are adjusted by minimizing the error function, and the updated parameters are back-propagated until they are optimal, at which point training ends.
Step S22: the forward coding network extracts feature data from the input multispectral image while retaining its spectral and spatial information, which helps to reconstruct a high-quality image. Its structure is shown in fig. 4, where Conv denotes a convolutional layer and the parameters in the parentheses after Conv denote the number of input channels, the number of output channels, the convolution kernel size, the stride and the padding respectively. The downsampling operation is implemented with a convolution whose stride is larger than 1, and the parameters in the parentheses after it denote the kernel size, stride and padding. ReLU denotes a linear rectification unit, and the inter-spectral and spatial feature extraction units have the structures shown in fig. 2 and fig. 3 respectively. The forward coding network proceeds as follows:
1) multispectral data of size H × W × C are fed into the inter-spectral feature extraction module and the spatial feature extraction module respectively. In the inter-spectral feature extraction module the convolutional layers have kernel size 1 × 1 × 3, stride (1, 1, 1) and padding (0, 0, 1) (the padding of 1 is applied along the inter-spectral dimension), and the input and output channel numbers are all 1. In the spatial feature extraction module the grouped convolution has kernel size 3 × 3, stride 1 and padding 1, and is divided into 7 or 8 groups according to the number of bands of the input image; for a seven-band input, the first convolutional layer has 7 input channels and 56 output channels, the intermediate units have 56 input and output channels, and the last convolutional layer has 56 input channels and 7 output channels. ReLU activation functions increase the nonlinearity between the layers of the neural network;
2) after the feature extraction modules, the two sets of features are fused by pixel-wise addition. A convolution with kernel size 3 × 3, stride 1 and padding 1 followed by a ReLU activation is applied first, then downsampling with a convolution of kernel size 4 × 4, stride 2 and padding 1, giving 64 feature maps of size H/2 × W/2; the same process is repeated twice more, giving 64 feature maps of size H/8 × W/8, so the spatial resolution of the image is reduced to 1/64;
3) finally, a convolutional layer with kernel size 3 × 3, stride 1 and padding 1 and (64, 36) input and output channels extracts 36 intermediate feature maps of size H/8 × W/8, and a Sigmoid function constrains the network output to values in [0, 1].
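Putting the pieces together, the forward coding pipeline described in 1)-3) could look roughly like this in PyTorch (a sketch: the two feature-extraction branches are passed in as modules assumed to return tensors of the same (N, bands, H, W) shape as the input, and the channel width of the intermediate 3 × 3 convolutions is an assumption).

```python
import torch
import torch.nn as nn

class ForwardEncoder(nn.Module):
    """Forward coding network sketch: parallel inter-spectral/spatial branches,
    pixel-wise fusion, three (3x3 conv + ReLU + 4x4 stride-2 conv) downsampling
    stages, and a final 3x3 convolution to 36 feature maps squashed by Sigmoid."""
    def __init__(self, bands=7, spectral_branch=None, spatial_branch=None):
        super().__init__()
        # Placeholder branches; in the patent each branch is a stack of 8 units.
        self.spectral = spectral_branch if spectral_branch is not None else nn.Identity()
        self.spatial = spatial_branch if spatial_branch is not None else nn.Identity()

        def stage(in_ch, out_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, stride=1, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, 4, stride=2, padding=1))  # halves H and W

        self.down = nn.Sequential(stage(bands, 64), stage(64, 64), stage(64, 64))
        self.head = nn.Conv2d(64, 36, 3, stride=1, padding=1)

    def forward(self, x):                               # x: (N, bands, H, W)
        fused = self.spectral(x) + self.spatial(x)      # pixel-wise feature fusion
        return torch.sigmoid(self.head(self.down(fused)))  # (N, 36, H/8, W/8)
```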
Step S23: the quantization layer and the entropy coding layer mainly remove data redundancy, and the data after quantization and coding form the compressed code stream. Because the derivative of the quantization function is discontinuous, it cannot be differentiated directly within the network and the gradient would vanish during back-propagation, so the quantization function must be approximated. The formula is:
X_Q = Round[(2^Q − 1) × X_S]
where X_S is the intermediate feature data obtained after the final convolutional layer and the Sigmoid function, and Q is the quantization level.
With the approximated quantization function, the data are rounded in the forward pass, while in the backward pass the quantization layer is skipped and the gradient is passed directly to the previous layer. The quantized intermediate feature data X_Q are then losslessly compressed with ZPAQ to generate a binary code stream. Entropy decoding restores the code stream to the quantized intermediate feature data X_Q, and the inverse-quantized data X_Q / (2^Q − 1) are fed into the reverse decoding network.
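The approximated quantizer can be written as a straight-through rounding operation, for example as below (a sketch; the quantization level Q is treated as a hyperparameter and X_S is assumed to lie in [0, 1] after the Sigmoid).

```python
import torch

class _RoundSTE(torch.autograd.Function):
    """Round in the forward pass; pass the gradient straight through
    (i.e. skip the quantization layer) in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output

def quantize(x_s, q_bits=8):
    """X_Q = Round[(2^Q - 1) * X_S], with X_S in [0, 1] after the Sigmoid."""
    return _RoundSTE.apply((2 ** q_bits - 1) * x_s)

def dequantize(x_q, q_bits=8):
    """Inverse quantization X_Q / (2^Q - 1) before the reverse decoding network."""
    return x_q / (2 ** q_bits - 1)
```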
Step S3: entropy decoding and inverse quantization are performed on the received compressed code stream to obtain the spatial-spectral feature data of the multispectral image, which are then input into the reverse decoding network; the feature-map size is restored by upsampling, and the corresponding features are recovered by the spatial feature recovery module and the inter-spectral feature recovery module in the decoding network to obtain the reconstructed multispectral image.
Step S31: the reverse decoding network reconstructs the intermediate feature maps into a multispectral image; its structure, shown in fig. 5, is symmetric to that of the forward coding network. Conv denotes a convolutional layer, and the parameters in the parentheses after Conv denote the number of input channels, the number of output channels, the convolution kernel size, the stride and the padding respectively. ReLU denotes a linear rectification unit, the inter-spectral and spatial feature recovery units have the structures shown in fig. 2 and fig. 3 respectively, and PixelShuffle denotes the upsampling function. The decoding process of the reverse decoding network is as follows:
1) the 36 feature maps of size H/8 × W/8 first pass through a convolutional layer with kernel size 3 × 3, stride 1 and padding 1, giving 64 feature maps of size H/8 × W/8;
2) a PixelShuffle layer produces 16 feature maps of size H/4 × W/4; a convolutional layer with kernel size 3 × 3, stride 1 and padding 1 followed by a ReLU activation then gives 64 feature maps of size H/4 × W/4. This process is repeated twice more, giving 64 feature maps of size H × W;
3) the data are then fed into the inter-spectral feature recovery module and the spatial feature recovery module respectively. In the inter-spectral feature recovery module the convolutional layers have kernel size 1 × 1 × 3, stride (1, 1, 1) and padding (0, 0, 1) (the padding of 1 corresponds to the spectral dimension), with 1 input and output channel. In the spatial feature recovery module the grouped convolution has kernel size 3 × 3, stride 1 and padding 1, and is divided into 7 or 8 groups according to the number of bands of the input image; for a seven-band input, the first convolutional layer has 7 input channels and 56 output channels, the intermediate units have 56 input and output channels, and the last convolutional layer has 56 input channels and 7 output channels. The whole process is symmetric to the forward coding network and is equivalent to performing deconvolution on the feature-map data;
4) finally, the data obtained by the inter-spectral and spatial feature recovery modules are added, and the restored image, with the same size as the original image, is reconstructed.
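The decoding side described in 1)-4) could be sketched as follows (assumptions: the recovery branches are passed in as modules operating on band-shaped tensors, and the point at which the 64 intermediate channels are reduced to the band count is placed before those branches purely for simplicity).

```python
import torch
import torch.nn as nn

class ReverseDecoder(nn.Module):
    """Reverse decoding network sketch: three PixelShuffle upsamplings restore
    the feature-map size; parallel inter-spectral and spatial recovery branches
    are then added to reconstruct the image."""
    def __init__(self, bands=7, spectral_branch=None, spatial_branch=None):
        super().__init__()
        self.stem = nn.Conv2d(36, 64, 3, stride=1, padding=1)

        def up():
            return nn.Sequential(
                nn.PixelShuffle(2),                      # 64 ch -> 16 ch, 2x size
                nn.Conv2d(16, 64, 3, stride=1, padding=1),
                nn.ReLU(inplace=True))

        self.up1, self.up2, self.up3 = up(), up(), up()
        self.to_bands = nn.Conv2d(64, bands, 3, stride=1, padding=1)
        self.spectral = spectral_branch if spectral_branch is not None else nn.Identity()
        self.spatial = spatial_branch if spatial_branch is not None else nn.Identity()

    def forward(self, z):                      # z: (N, 36, H/8, W/8)
        h = self.up3(self.up2(self.up1(self.stem(z))))
        h = self.to_bands(h)                   # (N, bands, H, W)
        return self.spectral(h) + self.spatial(h)   # reconstructed multispectral image
```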
Step S32: the goal of the multispectral image compression network is to make the restored image as close as possible to the input image and retain the original image information; this is achieved by learning the optimal network parameters that minimize the loss function, i.e. by minimizing over the parameters θ_1 and θ_2 the reconstruction error between x and Re(En(Se(x; θ_1) + Sa(x; θ_2))), where x denotes the input image, θ_1 and θ_2 denote the network parameters, Se() and Sa() denote the inter-spectral feature extraction network and the spatial feature extraction network respectively, En() denotes quantization coding, and Re() denotes the decoding reconstruction network. The specific procedure of network parameter optimization is as follows:
When network training starts, the network parameters are initialized randomly; the data are encoded and quantized by the forward coding network and enter the decoding network, and the decoder reconstructs an image from these data. The pixel values of the reconstructed image and of the original image are fed into the loss function to compute their error, and the parameters are updated by the back-propagation algorithm until the loss function reaches a minimum. Here the input of L_r is the unquantized output of the coding network; adding this information-entropy term to the loss function makes the distribution of the image data more compact and thus easier to compress.
The effect of the present invention will be further described with reference to simulation experiments.
The hardware test platform of the invention is: GPU (NVIDIA GeForce RTX 2080 Ti), memory (32 GB), hard disk (Samsung SSD SM871 2.5" 7 mm 256 GB solid-state drive).
The software platform is: Windows 7 64-bit operating system, PyTorch 1.2.0 and MATLAB.
The training and test sets of the multispectral image compression network used in the invention are derived from multispectral images of the Landsat 8 satellite and contain 7 multispectral bands. To prevent over-fitting, the training set is selected from multispectral images taken in different seasons and under multiple weather and terrain conditions, so that it contains rich and diverse features; the images are cut into blocks of size 128 × 128, about 80000 of which are used as the training set. The test set is selected by the same criteria and contains 17 images of size 512 × 512. The training and test sets share no images, i.e. the multispectral image data in the test set do not participate in training.
The Adam optimizer is used for network training. The initial learning rate is set to 0.0001 for fast convergence and to produce a pre-trained model; the learning rate is then reduced to 0.00001 and the required model is obtained by further training. For rate-distortion optimization, following an easy-to-hard learning strategy, the weight of L_r is set to 0 in the initial stage; after the network has sufficiently converged, the weight λ of L_r is gradually increased so that the distribution of the intermediate feature-map data becomes progressively more concentrated, which markedly improves compression performance.
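A simplified training loop matching this schedule might look as follows (a sketch: the compression model is assumed to return both the reconstruction and the unquantized encoder features, and the batch size, epoch counts, learning-rate switch point and final λ are illustrative values, not taken from the patent).

```python
import torch
from torch.utils.data import DataLoader

def train(model, sig_net, dataset, epochs=100, warmup_epochs=20,
          lam_final=0.01, device="cuda"):
    """End-to-end training: image -> encoder -> quantizer -> decoder; distortion
    plus a progressively weighted rate term is minimized with Adam."""
    model.to(device)
    sig_net.to(device)
    loader = DataLoader(dataset, batch_size=16, shuffle=True)
    params = list(model.parameters()) + list(sig_net.parameters())
    opt = torch.optim.Adam(params, lr=1e-4)
    mse = torch.nn.MSELoss()
    for epoch in range(epochs):
        lam = 0.0 if epoch < warmup_epochs else lam_final   # easy-to-hard schedule for L_r
        if epoch == epochs // 2:                            # drop LR for fine-tuning (switch point illustrative)
            for g in opt.param_groups:
                g["lr"] = 1e-5
        for batch in loader:
            x = batch.to(device)
            recon, enc_feat = model(x)                      # assumed to return both outputs
            loss = mse(recon, x) + lam * sig_net(enc_feat).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
```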
The invention is applied to compress multispectral images at different code rates; the compression ratio of a trained model can be changed by changing the penalty weight λ. The restoration of test images at 8 different code rates is simulated and compared with the existing JPEG2000 and 3D-SPIHT methods, and the spectral angle is introduced as a reference to check the loss of spectral information before and after compression.
FIG. 6 compares the average PSNR of the invention with JPEG2000 and 3D-SPIHT over the data set at different code rates; FIG. 7(a)(b)(c)(d) shows the PSNR measurements of the four multispectral images ah_chun, ah_xia, hunan_chun and tj_dong in the test set; FIG. 8(a)(b)(c)(d) compares the restored versions of the four multispectral images of FIG. 7 under the invention, JPEG2000 and 3D-SPIHT; FIG. 9 shows the average spectral angle curves of the invention versus JPEG2000 and 3D-SPIHT.
The peak signal-to-noise ratio (PSNR) is the most widely used objective evaluation index for traditional image compression algorithms and mainly measures the pixel change before and after compression. It is calculated as:
PSNR = 10 × log10( (2^n − 1)² / MSE )

MSE = (1 / (H × W × C)) × Σ_{x,y,z} ( I(x, y, z) − Î(x, y, z) )²

where n is the bit depth of the image, H, W and C are the height, width and number of channels of the image respectively, I(x, y, z) is the pixel value of the input image at spatial position (x, y, z), and Î(x, y, z) is the pixel value of the image restored by the compression network at that position. As can be seen from FIG. 6, the PSNR of the reconstructed image increases with the code rate, and at the same code rate the average PSNR between the reconstructed image and the original is higher for the invention than for JPEG2000 and 3D-SPIHT. Combining the comparison results at the 8 code rates, the PSNR of the invention is about 1 dB and 3 dB higher than JPEG2000 and 3D-SPIHT respectively; at code rates from 0.308 to 0.334, the advantage over JPEG2000 reaches 3.3 dB. The four PSNR curves in FIG. 7(a), (b), (c) and (d) show that the advantage is most significant at code rates in the range 0.3-0.38, and the PSNR of every test image at every code rate is higher than that of JPEG2000 and 3D-SPIHT.
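For reference, the PSNR defined above can be computed as follows (a NumPy sketch; the default bit depth of 8 is an assumption).

```python
import numpy as np

def psnr(original, restored, n_bits=8):
    """PSNR = 10 * log10((2^n - 1)^2 / MSE) for images of shape (H, W, C)."""
    diff = original.astype(np.float64) - restored.astype(np.float64)
    mse = np.mean(diff ** 2)                 # mean-square error over all pixels/bands
    peak = float(2 ** n_bits - 1) ** 2       # squared peak value for n-bit data
    return 10.0 * np.log10(peak / mse)
```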
FIG. 8 compares the images reconstructed by the invention with those of the JPEG2000 and 3D-SPIHT algorithms and shows the differences on the restored images more intuitively; the four comparison images show that the invention is superior to JPEG2000 and 3D-SPIHT in reconstructed image quality, and the contrast is most obvious in the restoration of texture details.
FIG. 9 shows the average spectral angle curves of the invention, JPEG2000 and 3D-SPIHT. The smaller the spectral angle, the more similar the spectral information, i.e. the closer the restored spectrum is to the original. The spectral angle curve obtained by the multispectral image compression network stays below those of JPEG2000 and 3D-SPIHT, and its average spectral angle at every code rate is smaller than those of JPEG2000 and 3D-SPIHT.
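The average spectral angle used here can be computed per pixel as the angle between the original and reconstructed spectral vectors, for example as below (a sketch; the angle is returned in radians, and the small epsilon guarding against zero-norm pixels is an implementation choice).

```python
import numpy as np

def mean_spectral_angle(original, restored, eps=1e-12):
    """Average angle (radians) between original and reconstructed spectra,
    computed per pixel over images of shape (H, W, C)."""
    o = original.reshape(-1, original.shape[-1]).astype(np.float64)
    r = restored.reshape(-1, restored.shape[-1]).astype(np.float64)
    cos = np.sum(o * r, axis=1) / (
        np.linalg.norm(o, axis=1) * np.linalg.norm(r, axis=1) + eps)
    return float(np.mean(np.arccos(np.clip(cos, -1.0, 1.0))))
```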
TABLE 1 average spectral Angle of test data sets at different code rates
Table 1 gives the average spectral angles of the test data set at different code rates. The overall experimental data show that the algorithm performs better than JPEG2000 and 3D-SPIHT on the spectral similarity curve and preserves the spectral information more completely.

Claims (7)

1. A multispectral image compression method based on space spectral feature separation and extraction is characterized by comprising the following steps:
(1) constructing a multispectral image compression network, training the multispectral image compression network, and optimizing network parameters to obtain an optimal multispectral image compression network model; the multispectral image compression network comprises a forward encoder network, a quantization module, an entropy coding module, an entropy decoding module, an inverse quantization module and a reverse decoding network; the forward coding network comprises an inter-spectrum feature extraction module, a spatial feature extraction module and a down-sampling module; the reverse decoding network comprises an inter-spectrum feature recovery module, a spatial feature recovery module and an up-sampling module;
(2) the method comprises the steps of sending a multispectral image to be compressed into a multispectral image compression network, extracting spectral features and spatial features of the multispectral image respectively through an inter-spectral feature extraction module and a spatial feature extraction module, reducing the size of a feature image by using down sampling, removing data redundancy through a quantization module, carrying out lossless entropy coding on quantized intermediate feature data to obtain a compressed code stream for transmission and storage, and realizing multi-magnification compression of the multispectral image by training different compression network models;
(3) entropy decoding and inverse quantization are carried out on the received compressed code stream to obtain multispectral image space spectrum characteristic data, then the multispectral image space spectrum characteristic data is input into a reverse decoding network, the characteristic image size is restored through up-sampling, and corresponding characteristics are correspondingly restored through a space characteristic restoration module and an inter-spectrum characteristic restoration module in the decoding network to obtain a reconstructed multispectral image.
2. The method for compressing multispectral image based on spatial spectral feature separation extraction as claimed in claim 1, wherein a linear rectification function is used in the multispectral image compression network in step (1), and the expression of the linear rectification function is as follows:
ReLU(x_i) = max(0, x_i)
where x_i is the data of the i-th channel. The function maps the input piecewise: when the input value is less than 0 it is mapped to 0, and when the input value is greater than 0 the original value is passed through. As can be seen from its derivative, the gradient is not lost during back-propagation.
3. The method according to claim 1, wherein the forward encoder network in step (1) comprises 8 inter-spectral feature extraction modules, 8 spatial feature extraction modules and 3 down-sampling modules; the downsampling module comprises 1 downsampling operation with the step size of 2 and the convolution kernel size of 4 x 4, and 1 convolution with the step size of 1 and the convolution kernel size of 3 x 3.
4. The method according to claim 1, wherein the multispectral image compression network in step (1) further comprises a rate-distortion optimization module; the rate-distortion optimization module uses an importance (significance) map network to continuously approximate the code length in place of a general entropy calculation, and the distribution of the spectral-feature and spatial-feature data is continuously optimized through training to make it more compact; the loss function is expressed as:
L = L_d + λ·L_r
where L_d is the distortion, λ is a penalty weight used to explicitly control the code rate, and L_r is the mean of the significance map network output P(X); L_r is calculated as:

L_r = avg(P(X))
where P(x, y) is the value computed by the significance map network for the pixel at spatial coordinates (x, y) from the unquantized encoder output, x ∈ {0, 1, ..., H}, y ∈ {0, 1, ..., W}; ω denotes the weight parameters of the significance map network, and b, k and s_0 are parameters of the significance map network; the significance map network consists of a convolutional layer, a conventional residual unit and a Sigmoid activation function, and the Sigmoid function constrains the network output to the range [0, 1]; pixels of different importance are allocated different code lengths according to the network's output, and the mean value is finally used to represent L_r.
5. The method for compressing multispectral image based on spatial spectral feature separation extraction as claimed in claim 1, wherein the quantization module in step (2) removes data redundancy and the quantization function is approximated by the formula:
X_Q = Round[(2^Q − 1) × X_S]
where X_S is the intermediate feature data and Q is the quantization level; with this approximated quantization function, the data are rounded in the forward pass, while in the backward pass the quantization layer is skipped and the gradient is passed directly to the previous layer.
6. The method for compressing multispectral image based on spatial spectral feature separation extraction as claimed in claim 1, wherein there are two ways for implementing the multi-power compression of multispectral image in step (2):
1) fix the penalty weight λ in the rate-distortion optimization, and obtain compression networks with different code rates by changing the number of neurons in the intermediate convolutional layers and retraining; the fewer the neurons, the lower the compression code rate;
2) fix the number of neurons in the intermediate convolutional layers, and obtain compression networks with different code rates by changing the penalty weight λ in the rate-distortion optimization and retraining; the larger λ is, the lower the compression code rate.
7. A multispectral image compression system based on spatial spectral feature separation extraction using the method as claimed in any one of claims 1 to 6, comprising a forward coding network, a quantization module, an entropy coding module, an entropy decoding module, an inverse quantization module and a reverse decoding network; the forward coding network comprises an inter-spectrum feature extraction module, a spatial feature extraction module and a down-sampling module; the reverse decoding network comprises an inter-spectrum feature recovery module, a spatial feature recovery module and an up-sampling module; the inter-spectrum feature extraction module and the spatial feature extraction module respectively extract the inter-spectrum features and the spatial features of the multispectral image, and the inter-spectrum features and the spatial features are fused pixel by pixel to obtain a feature map; the down-sampling module reduces the size of the feature map; the quantization module and the entropy coding module quantize and entropy code the feature map to obtain a compressed code stream for transmission and storage; the entropy decoding module and the inverse quantization module carry out entropy decoding and inverse quantization on the compressed code stream to obtain feature data of the multispectral image; the up-sampling module restores the feature map size; and the spatial feature recovery module and the inter-spectrum feature recovery module recover the corresponding features of the feature map to obtain a reconstructed multispectral image.
CN202011493869.6A 2020-12-17 2020-12-17 Multispectral image compression method and multispectral image compression system based on spatial spectrum feature separation and extraction Active CN112734867B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011493869.6A CN112734867B (en) 2020-12-17 2020-12-17 Multispectral image compression method and multispectral image compression system based on spatial spectrum feature separation and extraction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011493869.6A CN112734867B (en) 2020-12-17 2020-12-17 Multispectral image compression method and multispectral image compression system based on spatial spectrum feature separation and extraction

Publications (2)

Publication Number Publication Date
CN112734867A true CN112734867A (en) 2021-04-30
CN112734867B CN112734867B (en) 2023-10-27

Family

ID=75602675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011493869.6A Active CN112734867B (en) 2020-12-17 2020-12-17 Multispectral image compression method and multispectral image compression system based on spatial spectrum feature separation and extraction

Country Status (1)

Country Link
CN (1) CN112734867B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113393543A (en) * 2021-06-15 2021-09-14 武汉大学 Hyperspectral image compression method, device and equipment and readable storage medium
CN113554720A (en) * 2021-07-22 2021-10-26 南京航空航天大学 Multispectral image compression method and system based on multidirectional convolutional neural network
CN113706641A (en) * 2021-08-11 2021-11-26 武汉大学 Hyperspectral image compression method based on space and spectral content importance
CN115623207A (en) * 2022-12-14 2023-01-17 鹏城实验室 Data transmission method based on MIMO technology and related equipment
CN116112694A (en) * 2022-12-09 2023-05-12 无锡天宸嘉航科技有限公司 Video data coding method and system applied to model training
WO2023241188A1 (en) * 2022-06-13 2023-12-21 北华航天工业学院 Data compression method for quantitative remote sensing application of unmanned aerial vehicle

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190096049A1 (en) * 2017-09-27 2019-03-28 Korea Advanced Institute Of Science And Technology Method and Apparatus for Reconstructing Hyperspectral Image Using Artificial Intelligence
CN109741407A (en) * 2019-01-09 2019-05-10 北京理工大学 A kind of high quality reconstructing method of the spectrum imaging system based on convolutional neural networks
CN111754592A (en) * 2020-03-31 2020-10-09 南京航空航天大学 End-to-end multispectral remote sensing image compression method based on characteristic channel information

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190096049A1 (en) * 2017-09-27 2019-03-28 Korea Advanced Institute Of Science And Technology Method and Apparatus for Reconstructing Hyperspectral Image Using Artificial Intelligence
CN109741407A (en) * 2019-01-09 2019-05-10 北京理工大学 A kind of high quality reconstructing method of the spectrum imaging system based on convolutional neural networks
CN111754592A (en) * 2020-03-31 2020-10-09 南京航空航天大学 End-to-end multispectral remote sensing image compression method based on characteristic channel information

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113393543A (en) * 2021-06-15 2021-09-14 武汉大学 Hyperspectral image compression method, device and equipment and readable storage medium
CN113393543B (en) * 2021-06-15 2022-07-01 武汉大学 Hyperspectral image compression method, device and equipment and readable storage medium
CN113554720A (en) * 2021-07-22 2021-10-26 南京航空航天大学 Multispectral image compression method and system based on multidirectional convolutional neural network
CN113706641A (en) * 2021-08-11 2021-11-26 武汉大学 Hyperspectral image compression method based on space and spectral content importance
CN113706641B (en) * 2021-08-11 2023-08-15 武汉大学 Hyperspectral image compression method based on space and spectral content importance
WO2023241188A1 (en) * 2022-06-13 2023-12-21 北华航天工业学院 Data compression method for quantitative remote sensing application of unmanned aerial vehicle
CN116112694A (en) * 2022-12-09 2023-05-12 无锡天宸嘉航科技有限公司 Video data coding method and system applied to model training
CN116112694B (en) * 2022-12-09 2023-12-15 无锡天宸嘉航科技有限公司 Video data coding method and system applied to model training
CN115623207A (en) * 2022-12-14 2023-01-17 鹏城实验室 Data transmission method based on MIMO technology and related equipment
CN115623207B (en) * 2022-12-14 2023-03-10 鹏城实验室 Data transmission method based on MIMO technology and related equipment

Also Published As

Publication number Publication date
CN112734867B (en) 2023-10-27

Similar Documents

Publication Publication Date Title
CN112734867B (en) Multispectral image compression method and multispectral image compression system based on spatial spectrum feature separation and extraction
CN113554720A (en) Multispectral image compression method and system based on multidirectional convolutional neural network
CN112203093B (en) Signal processing method based on deep neural network
CN111754592A (en) End-to-end multispectral remote sensing image compression method based on characteristic channel information
CN110751597B (en) Video super-resolution method based on coding damage repair
CN111711817B (en) HEVC intra-frame coding compression performance optimization method combined with convolutional neural network
CN109903351B (en) Image compression method based on combination of convolutional neural network and traditional coding
CN112149652A (en) Space-spectrum joint depth convolution network method for lossy compression of hyperspectral image
CN113747163B (en) Image coding and decoding method and compression method based on context recombination modeling
CN114449276B (en) Super prior side information compensation image compression method based on learning
CN115955563A (en) Satellite-ground combined multispectral remote sensing image compression method and system
CN111340901A (en) Compression method of power transmission network picture in complex environment based on generating type countermeasure network
Hu et al. An adaptive two-layer light field compression scheme using GNN-based reconstruction
CN1405735A (en) Colour-picture damage-free compression method based on perceptron
CN113822954B (en) Deep learning image coding method for man-machine cooperative scene under resource constraint
CN113362239A (en) Deep learning image restoration method based on feature interaction
CN112637599A (en) Novel reconstruction method based on distributed compressed video sensing system
CN111080729A (en) Method and system for constructing training picture compression network based on Attention mechanism
CN109474825B (en) Pulse sequence compression method and system
CN111479286A (en) Data processing method for reducing communication flow of edge computing system
Tawfik et al. A generic real time autoencoder-based lossy image compression
CN114979711A (en) Audio/video or image layered compression method and device
CN114882133B (en) Image coding and decoding method, system, device and medium
CN107155111B (en) Video compression method and device
CN111031312B (en) Image compression method for realizing attention mechanism based on network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant