WO2023241188A1 - Data compression method for quantitative remote sensing applications of unmanned aerial vehicles - Google Patents

Data compression method for quantitative remote sensing applications of unmanned aerial vehicles

Info

Publication number
WO2023241188A1 (application PCT/CN2023/087731)
Authority
WO
WIPO (PCT)
Prior art keywords
image
remote sensing
multispectral
dimensional convolution
compression method
Application number
PCT/CN2023/087731
Other languages
English (en)
French (fr)
Inventor
张文豪
金永涛
李国洪
顾行发
田晓敏
朱霞
朱孟栩
Original Assignee
北华航天工业学院
中国科学院空天信息创新研究院
Application filed by 北华航天工业学院, 中国科学院空天信息创新研究院
Priority to US18/226,038 (published as US20230403395A1)
Publication of WO2023241188A1

Classifications

    • H04N21/23418 — Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/234309 — Reformatting operations of video signals by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H04N21/234354 — Reformatting operations of video signals by altering signal-to-noise ratio parameters, e.g. requantization
    • H04N21/2662 — Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H04N21/44008 — Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/440218 — Reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N21/440254 — Reformatting operations of video signals for household redistribution, storage or real-time display by altering signal-to-noise parameters, e.g. requantization

    (All codes fall under H04 Electric communication technique → H04N Pictorial communication, e.g. television → H04N21/00 Selective content distribution, e.g. interactive television or video on demand.)

Definitions

  • The invention relates to the technical field of compression methods, and in particular to a data compression method for quantitative remote sensing applications of unmanned aerial vehicles (UAVs).
  • Existing compression methods for UAV remote sensing images comprise traditional image compression methods and deep-learning-based image compression methods.
  • Traditional image compression methods fall into three main categories: prediction-based, vector-quantization-based, and transformation-based image compression methods.
  • Prediction-based image compression exploits the correlation between adjacent pixels and bands of the image, predicting the current pixel value from the context information of neighbouring elements to achieve compression.
  • A commonly used prediction-based method is differential pulse code modulation, which selects prediction coefficients so as to minimize the residual values of the image.
  • Vector-quantization-based image compression converts several scalars of the image into a vector and integrates the vector space, thereby compressing the data.
  • Transformation-based image compression transforms the image from the spatial domain into a transform domain and performs compression coding within the transform domain.
  • Commonly used transforms include principal component analysis, the discrete cosine transform, the discrete wavelet transform, and the Karhunen-Loeve transform.
  • Prediction-based, vector-quantization-based, and transformation-based methods all compress the pixel values of UAV remote sensing images.
  • Their compression rate is low and distortion occurs to varying degrees, which seriously affects the quantitative remote sensing applications of UAV remote sensing images.
  • The present invention provides a data compression method for quantitative remote sensing applications of UAVs.
  • The present invention provides the following solutions:
  • A data compression method for UAV quantitative remote sensing applications, including:
  • preprocessing of the multispectral images collected by the UAV, which specifically includes:
  • S100.1: collect multispectral images of the target area;
  • S100.2: use the SIFT operator to extract feature points from the multispectral images and splice them into a multispectral remote sensing image based on the feature point information;
  • S100.3: perform radiometric calibration of the multispectral remote sensing image and convert its DN values into surface reflectance;
  • S100.4: clip the multispectral remote sensing image into 256×256-pixel multispectral images.
  • The encoder includes an autoencoder and a hyperparameter encoder.
  • The autoencoder applies three-dimensional convolution to the N×256×256 multispectral image to produce a 320×16×16 feature image;
  • the hyperparameter encoder applies two-dimensional convolution to the 320×16×16 feature image to produce a 320×4×4 feature image.
  • The autoencoder includes three-dimensional convolution layers and a GDN activation function; the three-dimensional convolution layers use a 5×5 three-dimensional kernel with a stride of 2, and the GDN activation function increases the nonlinear relationship between the three-dimensional convolution layers.
  • The hyperparameter encoder includes two-dimensional convolution layers and a LeakyReLU activation function; the two-dimensional convolution layers use a 5×5 two-dimensional kernel with a stride of 2, and the LeakyReLU activation function increases the nonlinear relationship between the two-dimensional convolution layers.
  • The decoder includes an auto-decoder and a hyperparameter decoder.
  • The auto-decoder and the autoencoder have mutually symmetric structures, and the hyperparameter decoder and the hyperparameter encoder have mutually symmetric structures.
  • The quantization and entropy coding of the deep feature information includes the following steps:
  • S300.1: convert the floating point data of the deep feature information into integer type;
  • S300.2: perform entropy estimation for the entropy coding using a double Gaussian model.
  • According to the specific embodiments provided, the present invention discloses the following technical effects:
  • The invention discloses a data compression method for quantitative remote sensing applications of unmanned aerial vehicles.
  • The images collected by the UAV are first preprocessed to obtain usable multispectral images.
  • The multispectral images are compressed by an encoder, which applies three-dimensional and then two-dimensional convolution to obtain deep feature information and achieve compression.
  • Quantizing and entropy coding the deep feature information further removes redundancy from the feature images.
  • Through end-to-end joint training, the image loss and code rate are adjusted to an optimal allocation, yielding an optimal compressed image.
  • Finally, the optimal compressed image is reconstructed by the decoder for subsequent applications.
  • Figure 1 is a schematic flow chart of a data compression method for UAV quantitative remote sensing applications provided by an embodiment of the present invention;
  • Figure 2 is a data compression model diagram of a data compression method for UAV quantitative remote sensing applications provided by an embodiment of the present invention;
  • Figure 3 is a data compression model diagram of a data compression method for UAV quantitative remote sensing applications provided by an embodiment of the present invention;
  • Figure 4 shows water body extraction results of a data compression method for UAV quantitative remote sensing applications provided by an embodiment of the present invention: Figure 4(a) is the extraction result of a slender water body from a compressed UAV remote sensing image, and Figure 4(b) is the extraction result of a blocky water body from a compressed UAV remote sensing image.
  • At present, compression methods for UAV remote sensing images comprise traditional image compression methods and deep-learning-based image compression algorithms.
  • Traditional image compression methods fall into three main categories: prediction-based, vector-quantization-based, and transformation-based image compression methods. These methods all compress the pixel values of UAV remote sensing images; their compression rate is low and distortion occurs to varying degrees. At high compression ratios the large data volume can even overflow computer memory, producing block effects, blur, artifacts, and other problems in the compressed images and seriously affecting the quantitative remote sensing applications of UAV remote sensing images.
  • Deep-learning-based image compression improves the compression ratio and reconstruction quality to a certain extent.
  • However, existing deep-learning-based methods do not consider UAV quantitative remote sensing application scenarios, and their data sources are relatively homogeneous, mostly RGB false-color data; no compression algorithm has been designed for quantitative remote sensing applications of UAV remote sensing images.
  • To solve the above problems, the present invention provides a data compression method for quantitative remote sensing applications of UAVs, as shown in Figure 1, including the following steps.
  • S100.1: collect multispectral images of the target area, using a UAV equipped with a multispectral camera.
  • S100.2: use the SIFT operator to extract feature point information from the multispectral images and, based on that information, splice them into a multispectral remote sensing image, thereby registering the UAV remote sensing images.
  • S100.3: perform radiometric calibration of the multispectral remote sensing image and convert its DN values into surface reflectance. The invariant target method is used: an ASD spectrometer measures the reflectance of fixed targets, and based on the relationship between the reflectance of the invariant targets at different time phases and the UAV remote sensing image, the multispectral remote sensing image is radiometrically calibrated and the DN values of the UAV image are converted into surface reflectance.
  • This method converts multispectral data collected by different sensors, as well as data with different quantization standards, into the same measure, which eliminates instrument errors caused by different sensors during compression.
  • S100.4: clip the radiometrically calibrated multispectral remote sensing image into 256×256-pixel multispectral images.
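As an illustration of steps S100.3-S100.4: the patent gives no explicit calibration formula, so the sketch below assumes a simple linear gain/offset fit between invariant-target DN values and ASD-measured reflectance, followed by clipping into 256×256 tiles. All values are hypothetical.

```python
import numpy as np

def fit_calibration(dn_targets, reflectance_targets):
    """Least-squares fit reflectance = gain * DN + offset over invariant targets."""
    gain, offset = np.polyfit(dn_targets, reflectance_targets, deg=1)
    return gain, offset

def calibrate(dn_image, gain, offset):
    """Convert a DN image to surface reflectance using the fitted coefficients."""
    return gain * dn_image.astype(np.float64) + offset

def tile_256(image):
    """Clip an (H, W, bands) image into non-overlapping 256x256 tiles (S100.4)."""
    h, w = image.shape[:2]
    return [image[r:r + 256, c:c + 256]
            for r in range(0, h - 255, 256)
            for c in range(0, w - 255, 256)]

if __name__ == "__main__":
    gain, offset = fit_calibration(np.array([100, 200, 400]),
                                   np.array([0.1, 0.2, 0.4]))
    dn = np.random.randint(0, 4096, size=(512, 768, 5))      # 5-band mosaic
    tiles = [calibrate(t, gain, offset) for t in tile_256(dn)]
    print(len(tiles), tiles[0].shape)  # 6 tiles of shape (256, 256, 5)
```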
  • The encoder includes an autoencoder and a hyperparameter encoder; the autoencoder applies three-dimensional convolution to the N×256×256 multispectral image to produce a 320×16×16 feature image, and the hyperparameter encoder applies two-dimensional convolution to the 320×16×16 feature image to produce a 320×4×4 feature image.
  • The autoencoder includes three-dimensional convolution layers and a GDN activation function;
  • the three-dimensional convolution layers use a 5×5 three-dimensional kernel with a stride of 2;
  • the GDN activation function increases the nonlinear relationship between the three-dimensional convolution layers.
  • The GDN activation function is given by formula (1):

    y_i = x_i / (β_i + Σ_j γ_ij · |x_j|^(α_ij))^(ε_i)    (1)

  • where θ = {α, β, γ, ε} are the corresponding parameters of the transformation.
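A minimal numpy sketch of a GDN-style transform consistent with the parameter set θ = {α, β, γ, ε}. The shared scalar exponents (α = 2, ε = 1/2, a common special case in the GDN literature) and all parameter values are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def gdn(x, beta, gamma, alpha=2.0, eps=0.5):
    """GDN sketch: x is (channels, N) activations, beta is (channels,),
    gamma is a (channels, channels) cross-channel weight matrix."""
    denom = (beta[:, None] + gamma @ np.abs(x) ** alpha) ** eps
    return x / denom

c, n = 4, 10
x = np.random.randn(c, n)
y = gdn(x, beta=np.ones(c), gamma=0.1 * np.ones((c, c)))
print(y.shape)  # (4, 10)
```

With beta = 1 and non-negative gamma the divisive denominator is at least 1, so the transform only shrinks activations, normalizing each channel by the energy of all channels.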
  • Working principle: the preprocessed UAV remote sensing image is cropped into n×256×256 images.
  • First, a three-dimensional convolution structure extracts the spectral information between the bands of the multispectral image.
  • The kernel size of the three-dimensional convolution layer is n×1×1; this small kernel extracts the spectral features of the multispectral image.
  • A convolution layer with kernel size 5, stride 2, and zero padding 2 then convolves the input image to obtain 192 feature maps of size 128×128, and the GDN activation function connects consecutive convolution layers.
  • The GDN activation function increases the nonlinear relationship between the layers of the convolutional neural network.
  • Using the small n×1×1 kernel to extract the spectral features of the multispectral image also avoids computer memory overflow caused by an excessive data volume.
  • The hyperparameter encoder includes two-dimensional convolution layers and a LeakyReLU activation function; the two-dimensional convolution layers use a 5×5 two-dimensional kernel with a stride of 2, and the LeakyReLU activation function increases the nonlinear relationship between the two-dimensional convolution layers.
  • The LeakyReLU activation function is given by formula (2):

    y_i = x_i,        if x_i ≥ 0
    y_i = x_i / a_i,  if x_i < 0    (2)

  • where a_i is a fixed parameter in the interval (1, +∞), x_i denotes the feature map input to the i-th layer, and y_i denotes the feature map output by the i-th layer.
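A minimal sketch of formula (2)'s LeakyReLU. The value a = 10 is an illustrative choice (any fixed value in (1, +∞) per the text; dividing by 10 is equivalent to the common "negative slope 0.1" formulation).

```python
import numpy as np

def leaky_relu(x, a=10.0):
    """Formula (2): pass positives through, divide negatives by a fixed a > 1."""
    return np.where(x >= 0, x, x / a)

x = np.array([-2.0, -0.5, 0.0, 3.0])
print(leaky_relu(x))  # negatives scaled to -0.2 and -0.05, non-negatives unchanged
```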
  • As shown in Figure 2, the first four convolution layers and the GDN activation functions connecting them constitute a basic autoencoder. Since the autoencoder's compression of the image data still leaves room for improvement, a hyperparameter encoder is designed and placed after the autoencoder.
  • The hyperparameter encoder takes the 320×16×16 feature image output by the autoencoder as its input: a convolution layer with kernel size 3, stride 1, and zero padding 1 processes the feature image into a new 320×16×16 feature image, and a convolution layer with kernel size 5, stride 2, and zero padding 2 then downsamples the new feature image, with the LeakyReLU activation function increasing the nonlinear relationship between the network's convolution layers. The result is a set of 320×4×4 feature vectors.
  • The hyperparameter encoder thus further reduces the data dimension and extracts the image's deep feature information. In Figure 2, Input denotes input, Output denotes output, Feature denotes feature, Conv denotes convolution, ReLU denotes the ReLU activation function, GDN denotes the GDN activation function, and LeakyReLU denotes the LeakyReLU activation function.
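The feature sizes stated above can be checked with the standard convolution output-size formula, out = ⌊(in + 2p − k)/s⌋ + 1. The sketch below is only an illustration of that arithmetic; the layer counts are inferred from the 256 → 16 → 4 spatial sizes in the text, not copied from the patent.

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of one convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1

# Autoencoder: four kernel-5, stride-2, padding-2 layers take 256 -> 16.
size = 256
for _ in range(4):
    size = conv_out(size, kernel=5, stride=2, pad=2)
print(size)  # 16

# Hyperparameter encoder: a 3x3 stride-1 pad-1 layer keeps 16,
# then two kernel-5, stride-2, padding-2 layers take 16 -> 4.
h = conv_out(16, kernel=3, stride=1, pad=1)
for _ in range(2):
    h = conv_out(h, kernel=5, stride=2, pad=2)
print(h)  # 4
```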
  • S300.1: convert the floating point data of the deep feature information into integer data.
  • The image feature data extracted by the autoencoder are floating point data.
  • Floating point data occupy a large amount of storage space, so the feature data must be quantized.
  • Quantization converts the floating-point data into integers; some information is lost in the process, which has a certain impact on the quality of the reconstructed image.
  • The quantization structure converts the floating point data of the feature image into integer data, as shown in formula (3):

    ŷ_i = round(y_i)    (3)

  • where y_i is the feature map output by the autoencoder and ŷ_i is the quantization result.
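A minimal sketch of the quantization step, assuming round-to-nearest (the standard choice in end-to-end learned compression; the patent's exact formula (3) is not reproduced on this page):

```python
import numpy as np

def quantize(y):
    """Round floating-point features to the nearest integer (formula (3) sketch)."""
    return np.rint(y).astype(np.int32)

y = np.array([0.4, 1.6, -2.3])
y_hat = quantize(y)
print(y_hat)        # rounds to 0, 2, -2
print(y_hat - y)    # quantization error: the information lost in this step
```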
  • The entropy coding in this part uses arithmetic coding, which removes redundancy from the feature images losslessly.
  • S300.2: perform entropy estimation for the entropy coding using a double Gaussian model.
  • The result of entropy coding requires accurate code rate estimation.
  • During entropy coding, a prior probability model of the latent features is used to estimate symbol probabilities, and information is introduced to estimate their distribution.
  • A Gaussian mixture model has a more powerful ability to approximate data distributions: by increasing the number of component Gaussians, any continuous probability distribution can be approximated.
  • A double Gaussian model is therefore used here for entropy estimation.
  • The distribution function of the double Gaussian model is shown in formula (4):

    p(ŷ) = Σ_i w_i · N(u_i, σ_i)    (4)

  • where w_i denotes the weights of the different Gaussian models and N(u_i, σ_i) denotes the distribution parameters of each Gaussian model, with ŷ the entropy coding result.
  • In this step, entropy coding is first performed on the integer data to obtain the entropy coding result; the double Gaussian model then performs entropy estimation on the entropy coding result to obtain the loss value and code rate of the image.
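A minimal sketch of how a two-component Gaussian mixture (formula (4)) can estimate the code rate of quantized symbols: the probability mass of each integer symbol is the mixture CDF evaluated over [q − 0.5, q + 0.5], and the estimated cost is −log₂ of that mass. The interval-mass construction and all parameter values are illustrative assumptions, not taken from the patent.

```python
import numpy as np
from math import erf, sqrt

def gauss_cdf(x, mu, sigma):
    """CDF of a single Gaussian component."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def symbol_bits(q, weights, mus, sigmas):
    """Estimated bits for integer symbol q under a double Gaussian model."""
    mass = sum(w * (gauss_cdf(q + 0.5, m, s) - gauss_cdf(q - 0.5, m, s))
               for w, m, s in zip(weights, mus, sigmas))
    return -np.log2(mass)

w, mu, sig = [0.5, 0.5], [0.0, 3.0], [1.0, 1.0]   # two equally weighted components
total_bits = sum(symbol_bits(q, w, mu, sig) for q in [0, 1, 3])
print(round(total_bits, 2))
```

Symbols near a mixture mode get high probability mass and thus a low estimated cost; improbable symbols cost many bits, which is exactly the quantity the rate term of the loss needs.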
  • Rate-distortion optimization is a joint optimization of image distortion and compression code rate.
  • The tuning of the code rate estimation and the image distortion directly affects the optimization effect of the entire end-to-end convolutional neural network image compression algorithm.
  • The loss function used in the rate-distortion optimization of the end-to-end convolutional neural network image compression algorithm is shown in formula (5):

    L = λ·D + R    (5)

  • where D represents the distortion, measured as the mean square error between the original image and the reconstructed image;
  • R represents the code rate;
  • and λ represents the balance coefficient between distortion and code rate.
  • The loss function is thus composed of the code rate of the end-to-end convolutional neural network image compression algorithm and the loss value between the original image and the reconstructed image.
  • The code rate estimation of the end-to-end convolutional neural network image compression algorithm is shown in formula (6) and formula (7).
  • During training, the image loss and bit rate allocation are continuously adjusted to reach a balance between them, which ensures both the reconstruction quality of the image and the image compression efficiency.
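As a rough illustration of the rate-distortion trade-off described above — assuming the common L = λ·D + R formulation with D the mean square error, since the patent's formula body is not reproduced on this page — a minimal sketch, with an illustrative λ:

```python
import numpy as np

def rd_loss(original, reconstructed, rate_bits, lam=0.01):
    """Rate-distortion loss sketch: L = lambda * D + R."""
    d = np.mean((original - reconstructed) ** 2)   # distortion D (MSE)
    return lam * d + rate_bits                     # weighted sum with the rate R

orig = np.full((4, 4), 10.0)
recon = orig + 1.0                   # every pixel off by 1 -> MSE = 1
print(rd_loss(orig, recon, rate_bits=2.5))  # 0.01 * 1 + 2.5 = 2.51
```

Raising λ penalizes distortion more heavily (better reconstruction, higher rate); lowering it favors a smaller bitstream, which is the balance the joint training adjusts.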
  • The image is reconstructed using an auto-decoder and a hyperparameter decoder.
  • The auto-decoder adopts a structure completely symmetric to the autoencoder.
  • The auto-decoder includes deconvolution layers, an IGDN activation function, and a LeakyReLU activation function.
  • The IGDN activation function is shown in formula (8):

    x_i = y_i · (β_i + Σ_j γ_ij · |y_j|^(α_ij))^(ε_i)    (8)

  • where θ = {α, β, γ, ε} are the corresponding parameters of the transformation.
  • The 320×4×4 feature vector produced by the encoder is input to the auto-decoder; a deconvolution layer with kernel size 5, stride 2, and zero padding 2 performs a deconvolution operation on the input to obtain 320 feature maps of size 8×8.
  • The IGDN and LeakyReLU activation functions connect consecutive deconvolution layers, increasing the nonlinear relationship between the layers of the compression network.
  • The LeakyReLU activation functions between the first three convolution layers and the connected convolution layers constitute the hyperparameter decoder, which is followed by the decoder.
  • The decoder adopts the structure corresponding to the encoder and restores the feature image to a feature vector of size n×256×256.
  • Finally, the GDAL library (Geospatial Data Abstraction Library) restores the n×256×256 feature vector into a reconstructed image with coordinate information.
  • In Figure 3, Feature denotes feature, Input denotes input, Output denotes output, ConvT denotes deconvolution (transposed convolution), LeakyReLU denotes the LeakyReLU activation function, and IGDN denotes the IGDN activation function.
  • The reconstructed 256×256 images with coordinate information are spliced and fused: several 256×256 images are spliced into one whole image.
  • The quantitative remote sensing application of the UAV remote sensing images is the identification of different ground object types, specifically via the leaf area index NDVI and the water body index NDWI.
  • The leaf area index NDVI is one of the important parameters reflecting crop growth and nutrition information.
  • It is computed as the difference between the near-infrared reflectance and the red reflectance divided by their sum, as shown in formula (9):

    NDVI = (NIR − R) / (NIR + R)    (9)

  • where NIR is the reflectance in the near-infrared band and R is the reflectance in the red band.
  • The water body index NDWI is one of the important parameters reflecting water body information.
  • It is computed as the difference between the green reflectance and the near-infrared reflectance divided by their sum, as shown in formula (10):

    NDWI = (G − NIR) / (G + NIR)    (10)

  • where NIR is the reflectance in the near-infrared band and G is the reflectance in the green band.
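The two indices follow directly from formulas (9) and (10), computed per pixel on the reflectance bands of the reconstructed image:

```python
import numpy as np

def ndvi(nir, red):
    """Formula (9): NDVI = (NIR - R) / (NIR + R)."""
    return (nir - red) / (nir + red)

def ndwi(green, nir):
    """Formula (10): NDWI = (G - NIR) / (G + NIR)."""
    return (green - nir) / (green + nir)

nir = np.array([[0.5, 0.4]])
red = np.array([[0.1, 0.2]])
green = np.array([[0.3, 0.1]])
print(ndvi(nir, red))    # dense vegetation gives values near 1
print(ndwi(green, nir))  # vegetation gives negative values; open water positive
```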

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Image Processing (AREA)

Abstract

The present application discloses a data compression method for quantitative remote sensing applications of unmanned aerial vehicles, relating to the technical field of compression methods. The method includes: preprocessing multispectral images collected by a UAV; applying, through an encoder, three-dimensional convolution and then two-dimensional convolution to the multispectral images to obtain deep feature information; quantizing and entropy coding the deep feature information; optimally allocating the image loss and code rate through end-to-end joint training to obtain an optimal compressed image; and reconstructing the optimal compressed image through a decoder. Convolving the multispectral images multiple times improves the image reconstruction quality and compression ratio; quantizing and entropy coding the convolved deep feature information removes redundancy from the feature images, further improving reconstruction quality and compression ratio; and end-to-end joint training adjusts the image loss and code rate to an optimal ratio, achieving a high compression ratio while improving compression quality and preventing block effects, blur, artifacts, and other problems.

Description

Data compression method for quantitative remote sensing applications of unmanned aerial vehicles
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on June 13, 2022, with application number 202210673676.1 and the invention title "Data compression method for quantitative remote sensing applications of unmanned aerial vehicles", the entire contents of which are incorporated into this application by reference.
Technical Field
The present invention relates to the technical field of compression methods, and in particular to a data compression method for quantitative remote sensing applications of unmanned aerial vehicles.
Background Art
At present, compression methods for UAV remote sensing images comprise traditional image compression methods and deep-learning-based image compression methods. Traditional image compression methods fall into three main categories: prediction-based, vector-quantization-based, and transformation-based image compression methods. Prediction-based image compression exploits the correlation between adjacent elements and bands of an image, predicting the current pixel value from the context information of neighbouring elements to achieve compression. The commonly used prediction-based method is differential pulse code modulation, which selects prediction coefficients so as to minimize the residual values of the image. Vector-quantization-based image compression converts several scalars of the image into a vector and integrates the vector space as a whole, thereby compressing the data; this method makes full use of image correlation and has high coding performance, but coding is difficult and extremely costly in computing resources. Transformation-based image compression transforms the image from the spatial domain into a transform domain and performs compression coding within the transform domain; commonly used transforms include principal component analysis, the discrete cosine transform, the discrete wavelet transform, and the Karhunen-Loeve transform.
Prediction-based, vector-quantization-based, and transformation-based image compression methods all compress the pixel values of UAV remote sensing images. Their compression rate is low and distortion occurs to varying degrees; at high compression ratios, the large data volume can even overflow computer memory, producing block effects, blur, artifacts, and other problems in the compressed image and seriously affecting the quantitative remote sensing applications of UAV remote sensing images.
Summary of the Invention
In view of the above defects or deficiencies in the prior art, the present invention provides a data compression method for quantitative remote sensing applications of UAVs.
To achieve the above purpose, the present invention provides the following solutions:
A data compression method for quantitative remote sensing applications of UAVs, including:
S100. preprocessing multispectral images collected by a UAV;
S200. applying, through an encoder, three-dimensional convolution and then two-dimensional convolution to the multispectral images to obtain deep feature information;
S300. quantizing and entropy coding the deep feature information;
S400. optimally allocating the image loss and code rate through end-to-end joint training to obtain an optimal compressed image;
S500. reconstructing the optimal compressed image through a decoder.
According to the technical solutions provided by the embodiments of the present application, the preprocessing of the multispectral images collected by the UAV specifically includes:
S100.1 collecting multispectral images of the target area;
S100.2 using the SIFT operator to extract feature points from the multispectral images and splicing them into a multispectral remote sensing image based on the feature point information;
S100.3 performing radiometric calibration of the multispectral remote sensing image and converting its DN values into surface reflectance;
S100.4 clipping the multispectral remote sensing image into 256×256-pixel multispectral images.
According to the technical solutions provided by the embodiments of the present application, the encoder includes an autoencoder and a hyperparameter encoder; the autoencoder applies three-dimensional convolution to the N×256×256 multispectral image to produce a 320×16×16 feature image;
the hyperparameter encoder applies two-dimensional convolution to the 320×16×16 feature image to produce a 320×4×4 feature image.
According to the technical solutions provided by the embodiments of the present application, the autoencoder includes three-dimensional convolution layers and a GDN activation function; the three-dimensional convolution layers use a 5×5 three-dimensional kernel with a stride of 2, and the GDN activation function increases the nonlinear relationship between the three-dimensional convolution layers.
According to the technical solutions provided by the embodiments of the present application, the hyperparameter encoder includes two-dimensional convolution layers and a LeakyReLU activation function; the two-dimensional convolution layers use a 5×5 two-dimensional kernel with a stride of 2, and the LeakyReLU activation function increases the nonlinear relationship between the two-dimensional convolution layers.
According to the technical solutions provided by the embodiments of the present application, the decoder includes an auto-decoder and a hyperparameter decoder; the auto-decoder and the autoencoder have mutually symmetric structures, and the hyperparameter decoder and the hyperparameter encoder have mutually symmetric structures.
According to the technical solutions provided by the embodiments of the present application, the quantization and entropy coding of the deep feature information includes the following steps:
S300.1 converting the floating point data of the deep feature information into integer type;
S300.2 performing entropy estimation for the entropy coding using a double Gaussian model.
According to the specific embodiments provided by the present invention, the invention discloses the following technical effects:
The invention discloses a data compression method for quantitative remote sensing applications of UAVs. The images collected by the UAV are first preprocessed to obtain usable multispectral images; the encoder applies three-dimensional and two-dimensional convolution to the multispectral images to obtain deep feature information and achieve compression; quantizing and entropy coding the deep feature information further removes redundancy from the feature images; through end-to-end joint training, the image loss and code rate are adjusted to an optimal allocation, yielding an optimal compressed image; finally, the decoder reconstructs the optimal compressed image for subsequent applications.
Convolving the multispectral images multiple times, with both three-dimensional and two-dimensional convolution, helps improve the image reconstruction quality and compression ratio; quantizing and entropy coding the convolved deep feature information further removes redundancy from the feature images, further improving reconstruction quality and compression ratio; and end-to-end joint training adjusts the image loss and code rate to an optimal ratio, achieving a high compression ratio while improving compression quality and preventing block effects, blur, artifacts, and other problems.
Brief Description of the Drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Figure 1 is a schematic flow chart of a data compression method for UAV quantitative remote sensing applications provided by an embodiment of the present invention;
Figure 2 is a data compression model diagram of a data compression method for UAV quantitative remote sensing applications provided by an embodiment of the present invention;
Figure 3 is a data compression model diagram of a data compression method for UAV quantitative remote sensing applications provided by an embodiment of the present invention;
Figure 4 shows water body extraction results of a data compression method for UAV quantitative remote sensing applications provided by an embodiment of the present invention; Figure 4(a) is the extraction result of a slender water body from a compressed UAV remote sensing image, and Figure 4(b) is the extraction result of a blocky water body from a compressed UAV remote sensing image.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the present invention.
To make the above objects, features, and advantages of the present invention more obvious and understandable, the present invention is further described in detail below with reference to the drawings and specific implementations.
实施例1
目前,无人机遥感图像的压缩方法有传统的图像压缩方法和基于深度学习的图像压缩算法。传统的图像压缩方法主要有三类:基于预测的图像压缩方法、基于矢量量化的图像压缩方法和基于变换的图像压缩方法,这些方法都是对无人机遥感图像的像素值进行压缩,压缩率较低并且会出现不同程度的失真,甚至在高压缩比时,由于数据量大,导致计算机内存溢出,从而导致压缩图像出现块效应、模糊、伪影等问题,严重影响无人机遥感图像的定量遥感应用。
基于深度学习的图像压缩方法,虽然在一定程度上提升了图像压缩比和重建质量,但是基于深度学习的图像压缩方法并未考虑到无人机定量遥感应用场景,且数据源较为单一,多为RGB类型的假彩色数据,没有针对无人机遥感图像定量遥感应用设计的压缩算法。
为解决上述问题,本发明提供了一种无人机定量遥感应用的数据压缩方法,如图1所示,包括以下步骤。
S100.对无人机采集的多光谱图像进行预处理,具体为:
S100.1采集目标地区的多光谱图像;其中,利用无人机搭载多光谱相机采集目标地区的多光谱图像。
S100.2利用SIFT算子提取多光谱图像中的特征点信息,并根据特征点信息,拼接成多光谱遥感图像,进而实现无人机遥感图像的配准。
S100.3 Perform radiometric calibration on the multispectral remote sensing image to convert its DN values into surface reflectance. Using the invariant-target method, the reflectance of fixed targets is measured with an ASD spectrometer, and the multispectral remote sensing image is radiometrically calibrated from the relationship between the reflectance of the invariant targets at different times and the UAV remote sensing images, converting the UAV image DN values into surface reflectance. This method brings multispectral data collected by different sensors, as well as data with different quantization standards, onto a single measurement scale, and removes the instrument error introduced by different sensors during compression.
S100.4 Crop the radiometrically calibrated multispectral remote sensing image into 256×256-pixel multispectral images.
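The invariant-target calibration above reduces, in its simplest form, to fitting a per-band linear relation between image DN values and field-measured reflectance. The numpy sketch below illustrates such an empirical-line fit; the function names and the purely linear gain/offset model are illustrative assumptions, since the patent does not spell out the exact regression used:

```python
import numpy as np

def fit_empirical_line(dn_targets, reflectance_targets):
    """Least-squares fit of reflectance = gain * DN + offset from
    invariant targets measured with a field spectrometer."""
    gain, offset = np.polyfit(dn_targets, reflectance_targets, 1)
    return gain, offset

def calibrate(dn_band, gain, offset):
    # Convert one band's DN values to surface reflectance.
    return gain * dn_band.astype(np.float64) + offset
```

With the gain and offset estimated per band, every cropped 256×256 tile can be converted to reflectance before being fed to the encoder.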
Further, a data compression model oriented to quantitative UAV remote sensing applications is designed; the compression model comprises the following steps S200-S400.
S200. Through the encoder, apply three-dimensional convolution and then two-dimensional convolution to the multispectral images to obtain deep feature information.
Further, the encoder comprises an autoencoder and a hyperprior encoder; the autoencoder applies three-dimensional convolution to turn an N×256×256 multispectral image into a 320×16×16 feature image, and the hyperprior encoder applies two-dimensional convolution to turn the 320×16×16 feature image into a 320×4×4 feature image.
The autoencoder comprises three-dimensional convolutional layers and GDN activation functions; the three-dimensional convolutional layers use 5×5 three-dimensional convolution kernels with a stride of 2, and the GDN activation functions add non-linearity between the three-dimensional convolutional layers. The GDN activation function is given by formula (1):

y_i = x_i / (β_i + Σ_j γ_ij·|x_j|^(α_ij))^(ε_i)    (1)

where θ = {α, β, γ, ε} are the corresponding parameters of the transform.
Working principle: the preprocessed UAV remote sensing image is cropped into images of size n×256×256. The three-dimensional convolution structure first extracts the spectral information between the bands of the multispectral image: a three-dimensional convolutional layer with an n×1×1 kernel, a small kernel, extracts the spectral features of the multispectral image, and a convolutional layer with kernel size 5, stride 2 and zero padding 2 then convolves the input image to produce 192 feature maps of size 128×128. A GDN activation function connects each pair of successive convolutional layers, adding non-linearity between the layers of the convolutional neural network. Using the small n×1×1 kernel to extract the spectral features avoids the computer memory overflow that an excessive data volume would otherwise cause.
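The channel-wise GDN normalization of formula (1) can be sketched in numpy as follows. Fixing the exponents at α = 2 and ε = 0.5, the most common GDN choice, is an illustrative assumption rather than the exact parameterization trained in the patent:

```python
import numpy as np

def gdn(x, beta, gamma, alpha=2.0, eps=0.5):
    """Generalized Divisive Normalization over the channel axis.

    x: array of shape (C, H, W); beta: (C,); gamma: (C, C).
    y_i = x_i / (beta_i + sum_j gamma_ij * |x_j|**alpha) ** eps
    """
    c = x.shape[0]
    # Pool |x|^alpha across channels with the learned mixing matrix gamma.
    pooled = np.tensordot(gamma, np.abs(x) ** alpha, axes=([1], [0]))  # (C, H, W)
    denom = (beta.reshape(c, 1, 1) + pooled) ** eps
    return x / denom
```

In a trained network, beta and gamma are learned per layer; here they are plain arrays so the normalization itself can be inspected.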
The hyperprior encoder comprises two-dimensional convolutional layers and LeakyReLU activation functions; the two-dimensional convolutional layers use 5×5 two-dimensional convolution kernels with a stride of 2, and the LeakyReLU activation functions add non-linearity between the two-dimensional convolutional layers. The LeakyReLU activation function is given by formula (2):

y_i = x_i,       x_i ≥ 0
y_i = x_i / a_i, x_i < 0    (2)

where a_i is a fixed parameter in the interval (1, +∞), x_i denotes the feature map input to layer i, and y_i denotes the feature map output by layer i.
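Formula (2) can be written directly in numpy; the value of a is illustrative, since the patent only constrains it to (1, +∞):

```python
import numpy as np

def leaky_relu(x, a=5.0):
    # y = x for x >= 0, y = x / a for x < 0, with a fixed a in (1, +inf).
    return np.where(x >= 0, x, x / a)
```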
Working principle: as shown in Fig. 2, the first four convolutional layers and the GDN activation functions connecting them form a basic autoencoder. Because the autoencoder's compression of the image data still leaves room for improvement, a hyperprior encoder is designed and placed after it. The hyperprior encoder takes the 320×16×16 feature image output by the autoencoder as its input, processes it with a convolutional layer with kernel size 3, stride 1 and zero padding 1 to obtain a new 320×16×16 feature image, and then downsamples the new feature image with a convolutional layer with kernel size 5, stride 2 and zero padding 2, using LeakyReLU activation functions to add non-linearity between the convolutional layers of the network, finally yielding a set of 320×4×4 feature vectors. The hyperprior encoder further reduces the data dimensionality and extracts the deep feature information of the image. In Fig. 2, Input denotes the input, Output the output, Feature a feature, Conv a convolution, ReLU the ReLU activation function, GDN the GDN activation function, and LeakyReLU the LeakyReLU activation function.
S300. Quantize and entropy-code the deep feature information, specifically:
S300.1 Convert the floating-point data of the deep feature information into integer data.
The image feature data extracted by the autoencoder are floating-point data, which occupy a large amount of storage space, so the feature data must be quantized. Quantization converts the floating-point data into integers; the process loses some information and therefore affects the quality of the reconstructed image to some degree. The quantization structure converts the floating-point data of the feature image into integer data, as shown in formula (3):

ŷ_i = round(y_i)    (3)

where y_i is the feature map output by the autoencoder and ŷ_i is the quantization result.
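Formula (3) is plain rounding. The sketch below also shows the additive-uniform-noise surrogate commonly used during training of learned codecs so the rounding stays differentiable; that training branch is an assumption for illustration, as the patent specifies only the rounding itself:

```python
import numpy as np

def quantize(y, training=False, rng=None):
    """Round features to integers; during training, additive uniform
    noise in [-0.5, 0.5) is a common differentiable surrogate."""
    if training:
        rng = rng or np.random.default_rng(0)
        return y + rng.uniform(-0.5, 0.5, size=y.shape)
    return np.round(y)
```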
After feature extraction by the autoencoder and quantization, redundancy in the image has still not been removed completely, so an efficient entropy coding stage is needed to remove the redundancy in the quantized feature image and further improve coding performance. The entropy coding used here is arithmetic coding, which removes the redundancy in the feature image losslessly.
S300.2 Perform entropy estimation for the entropy coding with a double-Gaussian model.
In an end-to-end image compression system, entropy coding requires an accurate bit-rate estimate. During entropy coding, a prior probability model of the latent features is used to estimate the symbol probabilities, and side information is introduced to estimate the distribution of ŷ. A Gaussian mixture model has a stronger ability to approximate data distributions: by increasing the number of Gaussians in the mixture, any continuous probability distribution can be approached. Here a double-Gaussian model is used for entropy estimation; its distribution function is given by formula (4):

p(ŷ) = Σ_{i=1}^{2} w_i·N(u_i, σ_i)    (4)

where w_i are the weights of the individual Gaussians, N(u_i, σ_i) are the distribution parameters of the Gaussians, and ŷ is the entropy coding result.
In this step, the integer data are first entropy-coded to obtain the entropy coding result, and the entropy coding result is then entropy-estimated with the double-Gaussian model to obtain the loss value and bit rate of the image.
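A double-Gaussian entropy model assigns each quantized symbol the probability mass of a two-component mixture over its unit quantization bin, from which the ideal code length in bits follows. The bin-integration form below is a common convention in learned compression, sketched here as an assumption rather than quoted from the patent:

```python
import math
import numpy as np

def gaussian_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def symbol_bits(y_hat, w, mu, sigma):
    """Estimated bits per quantized symbol under a 2-Gaussian mixture:
    p(y) = sum_i w_i * (CDF(y + 0.5) - CDF(y - 0.5)), bits = -log2 p(y)."""
    bits = []
    for y in np.asarray(y_hat, dtype=np.float64).ravel():
        p = sum(w[i] * (gaussian_cdf(y + 0.5, mu[i], sigma[i])
                        - gaussian_cdf(y - 0.5, mu[i], sigma[i]))
                for i in range(2))
        bits.append(-math.log2(max(p, 1e-12)))
    return np.array(bits)
```

Symbols near the mixture means are cheap to code; outliers cost more bits, which is exactly what the arithmetic coder exploits.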
S400. Through end-to-end joint training, allocate the image loss and the bit rate optimally with a loss function to obtain the optimally compressed image.
For end-to-end coding, rate-distortion optimization tunes the image distortion and the compression bit rate jointly; the quality of the bit-rate estimate and of the distortion tuning directly determines how well the whole end-to-end convolutional-neural-network image compression algorithm is optimized. To better optimize the compression performance of the image, the rate-distortion optimization of the end-to-end convolutional-neural-network image compression algorithm uses the loss function of formula (5):

L = R + λ·D    (5)

where D denotes the distortion, measured as the mean squared error between the original and reconstructed images; R denotes the bit rate; and λ is the coefficient balancing distortion against bit rate. The loss function is composed of the bit rate of the end-to-end convolutional-neural-network image compression algorithm and the loss value between the original and reconstructed images. The bit-rate estimate of the algorithm is given by formulas (6) and (7):

R_ŷ = E[−log₂ p(ŷ | ẑ)]    (6)
R_ẑ = E[−log₂ p(ẑ)]    (7)

where p(ŷ | ẑ) and p(ẑ) denote the estimated distributions of the quantized features ŷ and ẑ. During training of the end-to-end convolutional-neural-network image compression algorithm, the allocation between image loss and bit rate is adjusted continuously until the two are balanced, guaranteeing both the reconstruction quality and the compression efficiency of the image.
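The rate-distortion objective can be sketched numerically as follows; the form L = R + λ·D, the helper name rd_loss, and measuring R as bits per pixel are illustrative assumptions consistent with common end-to-end codecs:

```python
import numpy as np

def rd_loss(x, x_rec, bits_y, bits_z, lam=0.01):
    """Rate-distortion loss L = R + lambda * D, with D the MSE between
    original and reconstruction and R the total estimated bits per pixel
    of the two quantized feature sets."""
    d = float(np.mean((x - x_rec) ** 2))
    r = (bits_y + bits_z) / x.size
    return r + lam * d
```

Sweeping lam trades bit rate against reconstruction quality, producing the different operating points of the codec.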
S500. Reconstruct the optimally compressed image with the decoder.
Reconstruction uses an autodecoder and a hyperprior decoder. The autodecoder has a structure fully symmetric to the autoencoder and comprises deconvolution layers, IGDN activation functions and LeakyReLU activation functions. The IGDN activation function is given by formula (8):

x_i = y_i·(β_i + Σ_j γ_ij·|y_j|^(α_ij))^(ε_i)    (8)

where θ = {α, β, γ, ε} are the corresponding parameters of the transform.
Working principle: as shown in Fig. 3, the 320×4×4 feature vector produced by the encoder is fed into the decoder. A deconvolution layer with kernel size 5, stride 2 and zero padding 2 deconvolves the input to produce 320 feature maps of size 8×8; IGDN and LeakyReLU activation functions connect successive convolutional layers, adding non-linearity between the layers of the compression network. The first three convolutional layers and the LeakyReLU activation functions connecting them form the hyperprior decoder, which is followed by the autodecoder. The decoder has a structure corresponding to the encoder and restores the feature image to a feature vector of size n×256×256. The GDAL library (Geospatial Data Abstraction Library) then restores the n×256×256 feature vector into a reconstructed image with coordinate information. In Fig. 3, Feature denotes a feature, Input the input, Output the output, ConvT a deconvolution, LeakyReLU the LeakyReLU activation function, and IGDN the IGDN activation function.
The 256×256 reconstructed images with coordinate information are then stitched and fused, joining the individual 256×256 images back into a single whole image.
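The tiling and re-stitching can be sketched as below for a single band; georeferencing via GDAL is omitted and the helper names are illustrative:

```python
import numpy as np

def split_tiles(img, tile=256):
    """Split an (H, W) band whose sides are multiples of `tile`
    into a list of (row, col, tile_array) entries."""
    h, w = img.shape
    return [(r, c, img[r:r + tile, c:c + tile])
            for r in range(0, h, tile) for c in range(0, w, tile)]

def stitch_tiles(tiles, h, w, tile=256):
    """Reassemble tiles produced by split_tiles into one (h, w) band."""
    out = np.zeros((h, w), dtype=tiles[0][2].dtype)
    for r, c, t in tiles:
        out[r:r + tile, c:c + tile] = t
    return out
```

In practice each tile carries its geotransform, so the stitched mosaic keeps the coordinate information of the original scene.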
Embodiment 2
As shown in Fig. 4, the quantitative remote sensing application of the UAV remote sensing images is the recognition of different land cover types, specifically using the normalized difference vegetation index (NDVI) and the normalized difference water index (NDWI).
The vegetation index NDVI is one of the important parameters reflecting crop growth and nutrition information; it is computed as the difference between the near-infrared and red reflectance divided by their sum, as shown in formula (9):

NDVI = (NIR − R) / (NIR + R)    (9)

where NIR is the reflectance in the near-infrared band and R is the reflectance in the red band.
The water index NDWI is one of the important parameters reflecting water body information; it is computed as the difference between the green and near-infrared reflectance divided by their sum, as shown in formula (10):

NDWI = (G − NIR) / (G + NIR)    (10)

where NIR is the reflectance in the near-infrared band and G is the reflectance in the green band.
By computing the NDVI and NDWI of the UAV remote sensing images, quantitative remote sensing applications are performed on them, completing the recognition and classification of different land cover types.
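Formulas (9) and (10) translate directly into numpy on the calibrated reflectance bands; the small eps guarding division by zero is an added safeguard not present in the patent:

```python
import numpy as np

def ndvi(nir, red, eps=1e-12):
    """NDVI = (NIR - R) / (NIR + R), per pixel on reflectance arrays."""
    return (nir - red) / (nir + red + eps)

def ndwi(green, nir, eps=1e-12):
    """NDWI = (G - NIR) / (G + NIR), per pixel on reflectance arrays."""
    return (green - nir) / (green + nir + eps)
```

Thresholding NDWI then separates water pixels, which is how extraction results like those in Fig. 4 are obtained.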
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and for the parts that the embodiments share, reference may be made between them.
Specific examples have been used herein to explain the principles and implementations of the present invention; the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. Meanwhile, those of ordinary skill in the art will, following the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (7)

  1. A data compression method for quantitative remote sensing applications of an unmanned aerial vehicle, characterized by comprising:
    S100. preprocessing multispectral images collected by the UAV;
    S200. applying, through an encoder, three-dimensional convolution and then two-dimensional convolution to the multispectral images to obtain deep feature information;
    S300. quantizing and entropy-coding the deep feature information;
    S400. optimally allocating the image loss and the bit rate through end-to-end joint training to obtain an optimally compressed image;
    S500. reconstructing the optimally compressed image through a decoder.
  2. The data compression method for quantitative remote sensing applications of an unmanned aerial vehicle according to claim 1, characterized in that preprocessing the multispectral images collected by the UAV specifically comprises:
    S100.1 collecting multispectral images of a target area;
    S100.2 extracting feature points from the multispectral images with the SIFT operator and, based on the feature point information, stitching the images into a multispectral remote sensing image;
    S100.3 performing radiometric calibration on the multispectral remote sensing image to convert its DN values into surface reflectance;
    S100.4 cropping the multispectral remote sensing image into 256×256-pixel multispectral images.
  3. The data compression method for quantitative remote sensing applications of an unmanned aerial vehicle according to claim 1, characterized in that the encoder comprises an autoencoder and a hyperprior encoder, the autoencoder being configured to convolve an N×256×256 multispectral image three-dimensionally into a 320×16×16 feature image, and the hyperprior encoder being configured to convolve the 320×16×16 feature image two-dimensionally into a 320×4×4 feature image.
  4. The data compression method for quantitative remote sensing applications of an unmanned aerial vehicle according to claim 3, characterized in that the autoencoder comprises three-dimensional convolutional layers and GDN activation functions, the three-dimensional convolutional layers using 5×5 three-dimensional convolution kernels with a stride of 2, and the GDN activation functions being configured to add non-linearity between the three-dimensional convolutional layers.
  5. The data compression method for quantitative remote sensing applications of an unmanned aerial vehicle according to claim 4, characterized in that the hyperprior encoder comprises two-dimensional convolutional layers and LeakyReLU activation functions, the two-dimensional convolutional layers using 5×5 two-dimensional convolution kernels with a stride of 2, and the LeakyReLU activation functions being configured to add non-linearity between the two-dimensional convolutional layers.
  6. The data compression method for quantitative remote sensing applications of an unmanned aerial vehicle according to claim 5, characterized in that the decoder comprises an autodecoder and a hyperprior decoder, the autodecoder being structurally symmetric to the autoencoder, and the hyperprior decoder being structurally symmetric to the hyperprior encoder.
  7. The data compression method for quantitative remote sensing applications of an unmanned aerial vehicle according to claim 1, characterized in that quantizing and entropy-coding the deep feature information comprises the following steps:
    S300.1 converting the floating-point data of the deep feature information into integers;
    S300.2 performing entropy estimation for the entropy coding with a double-Gaussian model.
PCT/CN2023/087731 2022-06-13 2023-04-12 Data compression method for quantitative remote sensing applications of unmanned aerial vehicles WO2023241188A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/226,038 US20230403395A1 (en) 2022-06-13 2023-07-25 Data compression method for quantitative remote sensing with unmanned aerial vehicle

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210673676.1 2022-06-13
CN202210673676.1A CN115086715A (zh) 2022-06-13 Data compression method for quantitative remote sensing applications of unmanned aerial vehicles

Publications (1)

Publication Number Publication Date
WO2023241188A1 true WO2023241188A1 (zh) 2023-12-21

Family

ID=83251991

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/087731 WO2023241188A1 (zh) Data compression method for quantitative remote sensing applications of unmanned aerial vehicles

Country Status (2)

Country Link
CN (1) CN115086715A (zh)
WO (1) WO2023241188A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115086715A (zh) 2022-06-13 2022-09-20 北华航天工业学院 Data compression method for quantitative remote sensing applications of unmanned aerial vehicles

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190096049A1 (en) * 2017-09-27 2019-03-28 Korea Advanced Institute Of Science And Technology Method and Apparatus for Reconstructing Hyperspectral Image Using Artificial Intelligence
CN112734867A (zh) * 2020-12-17 2021-04-30 南京航空航天大学 Multispectral image compression method and system based on separate extraction of spatial-spectral features
WO2021164176A1 (zh) * 2020-02-20 2021-08-26 北京大学 Deep-learning-based end-to-end video compression method, system and storage medium
CN113554720A (zh) * 2021-07-22 2021-10-26 南京航空航天大学 Multispectral image compression method and system based on multi-directional convolutional neural networks
CN113628290A (zh) * 2021-07-28 2021-11-09 武汉大学 Band-adaptive hyperspectral image compression method based on a 3D convolutional autoencoder
CN114422784A (zh) * 2022-01-19 2022-04-29 北华航天工业学院 Convolutional-neural-network-based compression method for UAV multispectral remote sensing images
CN115086715A (zh) * 2022-06-13 2022-09-20 北华航天工业学院 Data compression method for quantitative remote sensing applications of unmanned aerial vehicles

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20200160565A1 (en) * 2018-11-19 2020-05-21 Zhan Ma Methods And Apparatuses For Learned Image Compression


Also Published As

Publication number Publication date
CN115086715A (zh) 2022-09-20


Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23822759

Country of ref document: EP

Kind code of ref document: A1