CN111683250A - Generation type remote sensing image compression method based on deep learning - Google Patents

Generation type remote sensing image compression method based on deep learning

Info

Publication number
CN111683250A
CN111683250A (application CN202010404524.2A)
Authority
CN
China
Prior art keywords
image
network
remote sensing
encoder
quantization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010404524.2A
Other languages
Chinese (zh)
Other versions
CN111683250B (en)
Inventor
种衍文
翟亮
潘少明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN202010404524.2A
Publication of CN111683250A
Application granted
Publication of CN111683250B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/124 Quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/146 Data rate or code amount at the encoder output
    • H04N 19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression Of Band Width Or Redundancy In Fax (AREA)

Abstract

The technical scheme of the invention provides a generative remote sensing image compression method based on deep learning. The invention is trained with the PyTorch deep learning framework and takes an autoencoder (AutoEncoder) combined with a generative adversarial network (GAN) as an example; the network model is divided into three parts: an encoder, a pre-quantization and quantization module, and a decoder (generator) with a discriminator. The framework is suitable for compressing homologous remote sensing images of any spectral dimension and for compressing and transmitting remote sensing images under low-bandwidth, low-bit-rate conditions; it has excellent image reconstruction capability, is optimized for the scale and running speed of the deep neural network, and facilitates deployment and popularization on Internet of Things devices.

Description

Generation type remote sensing image compression method based on deep learning
Technical Field
The invention belongs to the field of remote sensing image compression, and particularly relates to a method for compressing and decompressing a remote sensing image by using a deep learning framework.
Background
Compared with natural images, the spectral dimension of a remote sensing image contains richer information, and remote sensing images come in many types with large data volumes. By exploiting the differences in the spectral curves of different ground objects, remote sensing images are widely applied across many sectors of the national economy. As high-resolution remote sensing imaging technology becomes widespread, the marked increase in spectral and spatial resolution raises the challenge of effectively compressing the data volume for transmission and storage, an urgent problem in the application of remote sensing imagery.
Deep Learning, an emerging image processing methodology, accomplishes a specific task by learning features of a target from a large number of training samples. Deep learning has achieved great success in image processing fields such as image classification, object detection and person re-identification.
At present, existing deep learning techniques are mostly used to compress ordinary visible-light images, and deep-learning-based remote sensing image compression techniques remain scarce. Toderici et al. (Toderici G, Vincent D, Johnston N, et al. Full Resolution Image Compression with Recurrent Neural Networks [J]. arXiv preprint arXiv:1608.05148, 2016.) proposed a variable-rate image compression algorithm based on long short-term memory (LSTM) networks. The algorithm feeds a 32 × 32 image into the network, compresses the image by reducing its scale and adjusting the number of feature maps, and then restores the image information through a decoding network. Ballé et al. (Ballé J, Laparra V, Simoncelli E P. End-to-end Optimized Image Compression [J]. arXiv preprint arXiv:1611.01704, 2016.) used convolutional neural networks to compress images. Their network comprises an analysis transform, a quantization stage and a synthesis transform, built mainly from convolutional layers, image downsampling layers and GDN normalization layers. Li et al. (Li M, Zuo W, Gu S, et al. Learning Convolutional Networks for Content-weighted Image Compression [J]. arXiv preprint arXiv:1703.10553, 2017.) proposed an image compression technique weighted by image content: different image contents are coded at different bit rates, an importance-map concept is added to the traditional autoencoder structure, and bit-rate control over different image contents is achieved through the importance map. However, the methods proposed by these authors target the compression of visible-light images, not remote sensing images.
In addition, with improvements in related computing-power solutions such as supercomputing chips, the conditions for deploying deep learning models in on-satellite environments are increasingly mature, and overcoming the barriers that deep models face in scale and inference time is likewise an important issue.
In summary, current remote sensing image compression algorithms need a relatively universal compression scheme that addresses the huge differences in the number of spectral bands across remote sensing images, so as to adapt automatically to compression under different band counts. Meanwhile, to handle the rapid compression of massive remote sensing imagery, higher rate-distortion performance must be achieved. Furthermore, to meet the requirement of deploying remote sensing image compression on small Internet of Things facilities such as satellites, the proposed algorithm and model must satisfy the deployment platform's constraints of limited resource scale and short inference time.
Disclosure of Invention
In order to solve the above problems, the invention provides a deep-learning generative remote sensing image compression method that adopts the "autoencoder (Auto-Encoder) + generative adversarial network (GAN)" paradigm and, through the processing of three parts, namely an encoder, a quantizer, and a decoder (generator) with a discriminator, completes adaptive remote sensing image compression meeting the requirements of small-scale Internet of Things deployment environments.
The deep-learning generative remote sensing image compression method of the invention adopts the following technical scheme: the image tensor is compressed by an encoder network into a hidden representation tensor 1/128 the size of the original image; the hidden representation tensor is fed into a quantizer network, where pre-quantization and quantization yield a binary code stream; the quantized binary code stream is fed into the decoder (generator) to obtain a reconstructed image; the reconstructed image is fed into the discriminator network for discrimination; and through a finite game (training) between the generator (decoder) and the discriminator, a Nash equilibrium state (network convergence) is reached and rate-distortion optimization of the image is achieved.
The encoder network comprises a channel-adaptor module and downsampling block (downblock) modules. The channel-adaptor is a convolutional layer (kernel size 3, padding 1) that preserves the spatial dimensions: an image tensor (B, C, H, W) passes through the channel-adaptor with its spatial dimensions (H, W) unchanged while its channel number becomes 4 × max{8, C}, where B is the batch size, C is the number of image channels, H is the image height, W is the image width, and C in the formula is the concrete channel count. The encoder contains m (usually 3, 4 or 5) downblocks constructed on the basis of a dense network (densenet); each downblock is a combination of a dense module (d-denseblock) and a downsampling module (downsample). A d-denseblock is formed by 4 dense units (d-dense-units), and its output is the concatenation of the outputs of all its d-dense-units along the C dimension. A d-dense-unit consists, in order, of GDN (Generalized Divisive Normalization), LeakyReLU activation and a convolutional layer. Through the encoder, the original image achieves a compression ratio of n/(m × 2^10).
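As a minimal sketch of the shape bookkeeping described above (the function names are illustrative, and treating each downblock as halving the spatial dimensions once is an assumption based on the embodiment, which maps 64 × 64 inputs to 2 × 2 with 5 downblocks):

```python
def channel_adaptor_channels(c: int) -> int:
    # The channel-adaptor keeps (H, W) unchanged and maps C input
    # channels to 4 * max(8, C) output channels.
    return 4 * max(8, c)

def spatial_after_downblocks(h: int, w: int, m: int) -> tuple:
    # Assumption: each of the m downblocks halves H and W once.
    return h // 2 ** m, w // 2 ** m

# For a 3-band 64 x 64 image with m = 5 downblocks (as in the embodiment):
print(channel_adaptor_channels(3))          # 32
print(spatial_after_downblocks(64, 64, 5))  # (2, 2)
```

For a 16-band image the same rule would give 4 × max(8, 16) = 64 channels, which is how the adaptor absorbs inputs of any spectral dimension.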
The quantizer network comprises a pre-quantization module and a quantization processing module. In this scheme, a pre-quantization module based on learned discretization is introduced into the quantizer network: at the bottleneck layer, the code stream (B × C × H × W) is mapped into an embedding manifold space (C × (B × H × W)); a loss function constructed from the KL divergence is used, the learnable parameters being the class distributions over the B × H × W dimension for each of the C channels, thereby achieving structural clustering. The pre-quantization module is implemented by matrix-vector operations. The quantization processing module applies {-1, 1} binarization to the pre-quantized feature maps to obtain the code stream.
The decoder (generator) and the discriminator together form a generative adversarial network (GAN). The decoder (generator) consists of m (usually 3, 4 or 5) upsampling blocks (upblocks) constructed on a densenet basis; each upblock consists of a u-denseblock and an upsampling (pixel-shuffle) module; the u-denseblock consists of 4 u-dense-units, and its output is the concatenation of the outputs of all its u-dense-units along the C dimension; a u-dense-unit consists, in order, of IGDN (inverse GDN) normalization, LeakyReLU activation and a convolutional layer with output channel number m. The basic structure of the discriminator network is 4 stacked convolutional layers, and the feature distance at the last convolutional layer is taken as the distance metric.
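The up-sampling step can be illustrated with a plain-Python sketch of pixel-shuffle rearrangement (interpreting the document's "pixel-buffer" as the standard pixel-shuffle operation is an assumption; the function name is illustrative):

```python
def pixel_shuffle(x, r):
    # Rearrange a (C*r*r, H, W) nested-list tensor into (C, H*r, W*r):
    # channel c*r*r + dy*r + dx contributes the pixel at (y*r+dy, col*r+dx).
    cr2, h, w = len(x), len(x[0]), len(x[0][0])
    c = cr2 // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c)]
    for ch in range(cr2):
        base, rem = divmod(ch, r * r)
        dy, dx = divmod(rem, r)
        for y in range(h):
            for col in range(w):
                out[base][y * r + dy][col * r + dx] = x[ch][y][col]
    return out

# Four 1x1 feature maps become one 2x2 map:
print(pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2))  # [[[1, 2], [3, 4]]]
```

This trades channel depth for spatial resolution without interpolation, which is why the decoder can gradually move inter-band information back into the spatial dimensions.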
In the deep-learning generative remote sensing image compression method, the loss function L used for model training is:
L = (1 - MSSSIM) + MSE + 0.01 × PSNR + Pre_Q_diff + GAN_loss
where MSSSIM denotes the multi-scale structural similarity of the image, MSE is the mean square error, PSNR is the peak signal-to-noise ratio of the image signal, Pre_Q_diff is the loss of the pre-quantization module, and GAN_loss is the adversarial (GAN) loss.
(1) The scheme performs well on high-resolution remote sensing images and even on hyperspectral images whose number of bands approaches that of natural images. For an image tensor (C × H × W) with spectral dimension C = n, a 3 × 3 convolutional layer is applied before encoding for nonlinear processing, outputting a tensor of height H, width W and 32 channels. The encoder consists of m (usually 3, 4 or 5) downblocks constructed on a densenet basis; each downblock consists of a denseblock module and a downsample module; the denseblock consists of 4 dense-units, and its output is the concatenation of the outputs of all its dense-units along the C dimension; a dense-unit consists, in order, of GDN (Generalized Divisive Normalization), LeakyReLU activation and a convolutional layer with output channel number m. The corresponding decoder consists of m upblocks constructed on a densenet basis; each upblock consists of a denseblock module and an upsample module; the structure below the denseblock level is consistent with the encoder section.
(2) The invention designs a low-bit-rate compression scheme targeted at the huge data volume of remote sensing images. The scheme adopts the "autoencoder (Auto-Encoder) + generative adversarial network (GAN)" paradigm. Image features (feature maps) are extracted from image X by the encoder and mapped to a hidden space Z; after pre-quantization, quantization and entropy coding, the decoder (generator) reconstructs the image and tries to fool the discriminator, while the discriminator tries to identify the reconstructed image as fake. This discriminator-generator image compression framework is well suited to extremely low bit rates. In the scheme, the encoder compresses the original image by a ratio of n/(m × 2^10), after which pre-quantization, quantization and entropy coding yield the final code stream.
(3) The invention designs a compression scheme that fuses the inter-band correlation, spatial correlation and texture characteristics of remote sensing images. The encoder of the framework is designed on a densenet basis: while reducing dimensionality it re-integrates the spatial-spectral information sequence to extract feature maps, and uses a self-attention mechanism to decouple the context of the feature maps so as to suppress noise and eliminate redundant information. Meanwhile, a pre-quantization module based on learned discretization is introduced into the quantization module: at the bottleneck layer the code stream (B × C × H × W) is mapped into the embedding manifold space (C × (B × H × W)); a loss function constructed from the KL divergence learns the class distributions over the B × H × W dimension with parameter C, and structural clustering is achieved on the basis of the attention mechanism.
(4) Addressing the difficulty of deploying deep neural network models on small Internet of Things devices, the network model is optimized: the encoder-decoder is built from densenet units, and the cross-layer connections between encoder and decoder achieve a high degree of information fusion, greatly improving the utilization of model parameters. Compared with a commonly used residual network (resnet) unit structure of comparable performance, the model parameter scale is halved.
The invention therefore has the following advantages. It is suitable for compressing homologous remote sensing images of any spectral dimension: the network can directly process the remote sensing images in a homologous data set, achieving end-to-end remote sensing image compression without preprocessing for the spectral dimension of the images. The framework is well suited to remote sensing image compression and transmission under low-bandwidth, low-bit-rate conditions, and has excellent image reconstruction capability. Considering the constraints of small Internet of Things devices such as satellites, the framework is optimized for the scale and running speed of the deep neural network, facilitating deployment and popularization on Internet of Things devices.
Drawings
Fig. 1 is a schematic diagram of the "autoencoder (Auto-Encoder) + generative adversarial network (GAN)" paradigm network in an embodiment of the present invention.
Fig. 2 is a schematic diagram of an encoder-decoder-pre-quantization-discriminator module according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a dense-unit in the embodiment of the invention.
Fig. 4 is a schematic diagram of a denseblock structure in the embodiment of the present invention.
FIG. 5 is a schematic diagram of the densenet encoder according to an embodiment of the present invention.
FIG. 6 is a schematic diagram of the densenet decoder according to an embodiment of the present invention.
FIG. 7 is a schematic diagram of the pre-quantization principle in the embodiment of the present invention.
FIG. 8 shows the reconstruction of a high-resolution remote sensing image at a compression rate of 0.104 bpp in an embodiment of the present invention, where (a) and (c) are original images and (b) and (d) are the corresponding reconstructions.
Detailed Description
The following explains a specific compression flow with reference to examples and drawings.
The method takes 3 × 64 × 64 images as training images and 3 × 512 × 512 images as test images, and mainly comprises the following steps:
1. data set preparation and neural network hyper-parameters:
1.1 About 8000 high-resolution remote sensing images are randomly cropped into image blocks of size 64 × 64 × 3.
1.2 The cropped image blocks are converted into tensors of specification 8 × 64 × 64 × 3 with a batch size of 8 and fed into the network model for training; all data are iterated over 100 times. The loss function L used in training is:
L=(1-MSSSIM)+MSE+0.01×PSNR+Pre_Q_diff+GAN_loss
where MSSSIM denotes the multi-scale structural similarity of the image, MSE is the mean square error, and PSNR is the peak signal-to-noise ratio of the image signal (MSSSIM, MSE and PSNR serve as the loss of the encoder network); Pre_Q_diff is the loss of the pre-quantization module (the loss of the quantization processing module is small and negligible); GAN_loss is the adversarial loss (the decoder (generator) and the discriminator together form the generative adversarial network (GAN)).
2. And (3) encoding:
the original 8 × 3 × 64 × 64 image tensor enters an encoder network composed of 5 downblocks, and the downblocks are connected with each other through recursive skip layers, as shown in fig. 2; the down block consists of a d-denseblock and a down sampling module (down sample), as shown in figure 5; the d-denseblock consists of 4 d-dense-units, the d-dense-units are connected in a sequential recursive layer, and the d-denseblock splices and fuses the outputs of all the previous d-dense-units in the channel dimension through layer-skipping connection (coordination), as shown in the attached figure 4; consisting of GDN (GENERALIZEDNORMALIZATION TRANSFORMATION) normalization, LeakyRelu activation, and convolutional layer in this order, as shown in FIG. 3. The encoder gradually transfers the spatial information of the original image tensor to the dimensionality between spectrums in the down-sampling process, and the front and back context information is connected in series and integrated by utilizing the characteristics of a densenert network structure. Aiming at the characteristics of 'same-spectrum foreign matter, same-object different spectrum' of the remote sensing image, the design effectively and jointly refines the information of the empty spectrum, removes redundancy and realizes high-efficiency compression of data. The hidden token tensor of 8 × 24 × 2 × 2 obtained by the encoder processing realizes 1/128-magnification compression compared with the original image tensor of 8 × 3 × 64 × 64.
3. Pre-quantization and quantization:
3.1 Pre-quantization: the hidden representation tensor z_e(x) of scale 8 × 24 × 2 × 2 output by the encoder network is mapped into a hidden embedding space e ∈ R^{k×d} (k = 24, d = 8 × 2 × 2), where d is the dimension of the embedding vectors e_j ∈ R^d and k is the number of vectors (categories). z_e(x) follows the posterior class distribution q(z = k | x) and is one-hot encoded as:
q(z = k | x) = 1 if k = argmin_j ||z_e(x) - e_j||_2, and 0 otherwise.
Through network learning, z_e(x) is mapped to its nearest neighbour in the embedding space e, realizing a discretized clustering representation in the space R^{k×d} and yielding z_q(x), as shown in the following equation:
z_q(x) = e_k, where k = argmin_j ||z_e(x) - e_j||_2
where e_k denotes the corresponding vector in the embedding space e.
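The nearest-neighbour mapping z_q(x) = e_k can be sketched in plain Python (an illustrative sketch operating on one vector at a time; the actual module works on whole tensors via matrix-vector operations):

```python
def pre_quantize(z_e, embeddings):
    # z_q(x) = e_k, where k = argmin_j ||z_e(x) - e_j||_2.
    def sq_dist(a, b):
        # Squared Euclidean distance (same argmin as the L2 norm).
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    k = min(range(len(embeddings)), key=lambda j: sq_dist(z_e, embeddings[j]))
    return k, embeddings[k]

# A vector near e_1 snaps to e_1:
e = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
print(pre_quantize([0.9, 1.2], e))  # (1, [1.0, 1.0])
```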
3.2 quantification: output z of pre-quantizationq(x) And then, the quantization calculation is carried out, in order to reduce the storage space and the transmission bandwidth,
the above-described type data needs to be subjected to { -1,1} binarization processing.
4. And (3) decoding:
the code stream obtained by quantization is input into a decoder, and the 8 multiplied by 24 multiplied by 2 tensor enters a decoder network consisting of 5 upblocks, as shown in the attached figure 1; the upblock is composed of a u-denseblock and an up-sampling module (pixel-buffer), as shown in FIG. 6; the u-denseblock consists of 4 u-denseblocks, the u-denseblocks are connected with each other in a sequential recursive layer, and the u-denseblocks splice and fuse the outputs of all the u-denseblocks in the channel dimension through layer-skipping connection (concatenation), as shown in the attached figure 4; each u-dense-unit consists of batchnormal, GDN-activated, and convolutional layers, as shown in FIG. 3. The decoder gradually transfers the inter-spectrum information of the code stream to the spatial dimension in the up-sampling process to obtain a tensor (tensor) with the scale of 8 multiplied by 3 multiplied by 64, and the reconstruction of the image is realized.
5. The decoder (generator) and arbiter process:
the original image and the reconstructed image are input to a discriminator which attempts to verify the image generated by the pseudo generator (decoder), and the GAN _ loss is output as part of the overall loss function. And realizing image rate distortion optimization in the two continuous iterative games.
The network model of the invention is divided into three parts, namely the encoder, the pre-quantization and quantization module, and the decoder (generator) with the discriminator, so training of the model is likewise divided into three stages. In the inference stage, images (data) are fed in sequence into the converged model, achieving a compression rate of 0.104 bpp with a reconstructed-image MS-SSIM of 0.976 and a PSNR of 29.01.

Claims (4)

1. A generation type remote sensing image compression method based on deep learning is characterized by comprising the following steps:
in the network model training stage, a training image is input into the constructed network model for training until convergence, and the method specifically comprises the following steps:
step 1, after the image tensor is subjected to network compression processing of an encoder, a hidden representation tensor of 1/128 scale of an original image is obtained;
step 2, inputting the hidden representation tensor into a quantizer network, and obtaining a binary code stream through pre-quantization processing and quantization processing;
step 3, inputting the quantized binary code stream into a decoder to obtain a reconstructed image, inputting the reconstructed image into a discriminator for discrimination, reaching a Nash equilibrium state after a finite game between the decoder and the discriminator, and realizing rate-distortion optimization of the image, wherein the decoder and the discriminator together form a generative adversarial network GAN;
the encoder network comprises a channel-adaptor and downblocks; the channel-adaptor is a convolutional layer that preserves the spatial dimensions: an image tensor (B, C, H, W) passes through the channel-adaptor with its spatial dimensions (H, W) unchanged while its channel number becomes 4 × max{8, C}, wherein B is the batch size, C is the image channel count, H is the image height, W is the image width, and C in the formula is the concrete channel count; the encoder network comprises m downblock modules constructed on a densenet basis, each downblock consisting of a dense module d-denseblock and a downsampling module downsample; the d-denseblock consists of 4 dense units d-denseunit, and its output is the concatenation of all d-denseunit outputs along the C dimension; the d-denseunit consists, in order, of GDN normalization, LeakyReLU activation and a convolutional layer with output channel number m; through the encoder, the original image achieves a compression ratio of n/(m × 2^10);
the decoder and arbiter: the decoder consists of m uplink blocks constructed based on densenet; each upblock consists of a u-denseblock and an up-sampling module upsamplle; the u-densenblock consists of 4 u-densenites, the output of which is the sum of the splices in the C dimension of all of them; the u-densunit is composed of convolution layers with IGDN reverse normalization, LeakyRelu activation and output C being m in sequence; the basic structure of the discriminator network is 4 stack type connection convolution layers;
and a network model testing stage, namely inputting the image into the trained network model to obtain a compressed image.
2. The deep learning-based generative remote sensing image compression method as claimed in claim 1, wherein: the specific process of the quantizer network is as follows,
the quantizer network comprises a pre-quantization and quantization processing module, the pre-quantization processing module maps the code stream B C H W into the embedded popular space C (B H W) at the bottleneck layer bottleneck, a loss function constructed by KL divergence is used for learning the class distribution of the dimension B H W with the parameter C, and the clustering of the structure is realized on the basis of the attention mechanism; and the quantization processing module is used for carrying out { -1,1} binarization processing on the feature maps after pre-quantization to obtain a code stream.
3. The deep learning-based generative remote sensing image compression method as claimed in claim 1, wherein: the process of the pre-quantization module is as follows,
the hidden representation tensor z_e(x) output by the encoder network is mapped into the hidden embedding space e ∈ R^{k×d}, where d is the dimension of the embedding vectors e_j ∈ R^d and k is the number of categories; z_e(x) follows the posterior class distribution q(z = k | x) and is one-hot encoded as:
q(z = k | x) = 1 if k = argmin_j ||z_e(x) - e_j||_2, and 0 otherwise;
through network learning, z_e(x) is mapped to its nearest neighbour in the embedding space e, realizing a discretized merged representation in the space R^{k×d}, as shown in the following formula:
z_q(x) = e_k, where k = argmin_j ||z_e(x) - e_j||_2
where e_k denotes the corresponding vector in the embedding space e.
4. The deep learning-based generative remote sensing image compression method as claimed in claim 1, wherein: the loss function used for the training of the network model is as follows,
L = (1 - MSSSIM) + MSE + 0.01 × PSNR + Pre_Q_diff + GAN_loss
where MSSSIM denotes the multi-scale structural similarity of the image, MSE is the mean square error, and PSNR is the peak signal-to-noise ratio of the image signal, MSSSIM, MSE and PSNR serving as the loss of the encoder network; Pre_Q_diff is the loss of the pre-quantization module; GAN_loss is the adversarial loss; and the decoder and the discriminator together form the generative adversarial network GAN.
CN202010404524.2A 2020-05-13 2020-05-13 Generation type remote sensing image compression method based on deep learning Expired - Fee Related CN111683250B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010404524.2A CN111683250B (en) 2020-05-13 2020-05-13 Generation type remote sensing image compression method based on deep learning

Publications (2)

Publication Number Publication Date
CN111683250A true CN111683250A (en) 2020-09-18
CN111683250B CN111683250B (en) 2021-03-16

Family

ID=72433490

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010404524.2A Expired - Fee Related CN111683250B (en) 2020-05-13 2020-05-13 Generation type remote sensing image compression method based on deep learning

Country Status (1)

Country Link
CN (1) CN111683250B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112200275A (en) * 2020-12-09 2021-01-08 上海齐感电子信息科技有限公司 Artificial neural network quantification method and device
CN112929663A (en) * 2021-04-08 2021-06-08 中国科学技术大学 Knowledge distillation-based image compression quality enhancement method
CN113393543A (en) * 2021-06-15 2021-09-14 武汉大学 Hyperspectral image compression method, device and equipment and readable storage medium
CN113450421A (en) * 2021-07-16 2021-09-28 中国电子科技集团公司第二十八研究所 Unmanned aerial vehicle reconnaissance image compression and decompression method based on enhanced deep learning
CN113596471A (en) * 2021-07-26 2021-11-02 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN113706641A (en) * 2021-08-11 2021-11-26 武汉大学 Hyperspectral image compression method based on space and spectral content importance
CN113709455A (en) * 2021-09-27 2021-11-26 北京交通大学 Multilevel image compression method using Transformer
CN114463340A (en) * 2022-01-10 2022-05-10 武汉大学 Edge information guided agile remote sensing image semantic segmentation method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170214930A1 (en) * 2016-01-26 2017-07-27 Sandia Corporation Gpu-assisted lossless data compression
US20170347110A1 (en) * 2015-02-19 2017-11-30 Magic Pony Technology Limited Online Training of Hierarchical Algorithms
CN108495132A (en) * 2018-02-05 2018-09-04 西安电子科技大学 The big multiplying power compression method of remote sensing image based on lightweight depth convolutional network
CN110929080A (en) * 2019-11-26 2020-03-27 西安电子科技大学 Optical remote sensing image retrieval method based on attention and generation countermeasure network

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112200275A (en) * 2020-12-09 2021-01-08 上海齐感电子信息科技有限公司 Artificial neural network quantification method and device
CN112929663A (en) * 2021-04-08 2021-06-08 中国科学技术大学 Knowledge distillation-based image compression quality enhancement method
CN113393543B (en) * 2021-06-15 2022-07-01 武汉大学 Hyperspectral image compression method, device and equipment and readable storage medium
CN113393543A (en) * 2021-06-15 2021-09-14 武汉大学 Hyperspectral image compression method, device and equipment and readable storage medium
CN113450421A (en) * 2021-07-16 2021-09-28 中国电子科技集团公司第二十八研究所 Unmanned aerial vehicle reconnaissance image compression and decompression method based on enhanced deep learning
CN113596471A (en) * 2021-07-26 2021-11-02 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN113596471B (en) * 2021-07-26 2023-09-12 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN113706641B (en) * 2021-08-11 2023-08-15 武汉大学 Hyperspectral image compression method based on space and spectral content importance
CN113706641A (en) * 2021-08-11 2021-11-26 武汉大学 Hyperspectral image compression method based on space and spectral content importance
CN113709455A (en) * 2021-09-27 2021-11-26 北京交通大学 Multilevel image compression method using Transformer
CN113709455B (en) * 2021-09-27 2023-10-24 北京交通大学 Multi-level image compression method using transducer
CN114463340A (en) * 2022-01-10 2022-05-10 武汉大学 Edge information guided agile remote sensing image semantic segmentation method
CN114463340B (en) * 2022-01-10 2024-04-26 武汉大学 Agile remote sensing image semantic segmentation method guided by edge information

Also Published As

Publication number Publication date
CN111683250B (en) 2021-03-16

Similar Documents

Publication Publication Date Title
CN111683250B (en) Generation type remote sensing image compression method based on deep learning
CN110348487B (en) Hyperspectral image compression method and device based on deep learning
CN113962893B (en) Face image restoration method based on multiscale local self-attention generation countermeasure network
CN110517329B (en) Deep learning image compression method based on semantic analysis
CN108960333B (en) Hyperspectral image lossless compression method based on deep learning
Qian et al. Fast three-dimensional data compression of hyperspectral imagery using vector quantization with spectral-feature-based binary coding
CN113554720A (en) Multispectral image compression method and system based on multidirectional convolutional neural network
CN116469100A (en) Dual-band image semantic segmentation method based on Transformer
Wang et al. Sparse tensor-based point cloud attribute compression
CN111340901B (en) Compression method of power transmission network picture under complex environment based on generation type countermeasure network
CN113450421B (en) Unmanned aerial vehicle reconnaissance image compression and decompression method based on enhanced deep learning
CN115578280A (en) Construction method of double-branch remote sensing image defogging network
CN115564721A (en) Hyperspectral image change detection method based on local information enhancement
CN117475216A (en) Hyperspectral and laser radar data fusion classification method based on AGLT network
Mei et al. Learn a compression for objection detection-vae with a bridge
CN111898671B (en) Target identification method and system based on fusion of laser imager and color camera codes
CN113706641A (en) Hyperspectral image compression method based on space and spectral content importance
CN112862655A (en) JPEG image steganalysis method based on channel space attention mechanism
CN117541505A (en) Defogging method based on cross-layer attention feature interaction and multi-scale channel attention
Kong et al. End-to-end multispectral image compression framework based on adaptive multiscale feature extraction
CN112750175A (en) Image compression method and system based on octave convolution and semantic segmentation
CN117372686A (en) Semantic segmentation method and system for complex scene of remote sensing image
CN115171029B (en) Unmanned-driving-based method and system for segmenting instances in urban scene
CN115239563A (en) Point cloud attribute lossy compression device and method based on neural network
CN115147317A (en) Point cloud color quality enhancement method and system based on convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210316