CN116416441A - Hyperspectral image feature extraction method based on multi-level variational automatic encoder - Google Patents
- Publication number
- CN116416441A (application CN202111627432.1A)
- Authority
- CN
- China
- Prior art keywords
- feature extraction
- hyperspectral image
- network
- hyperspectral
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06N3/08 — Computing arrangements based on biological models; neural networks; learning methods
- G06V10/52 — Scale-space analysis, e.g. wavelet analysis
- G06V10/58 — Extraction of image or video features relating to hyperspectral data
- G06V10/764 — Recognition using pattern recognition or machine learning, using classification
- G06V10/80 — Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level
- G06V10/82 — Recognition using neural networks
- Y02A40/10 — Adaptation technologies in agriculture
Abstract
The application provides a hyperspectral image feature extraction method based on a multi-level variational autoencoder for hyperspectral images. The method uses a variational autoencoder as its basic framework and, after training, adopts the finally corrected fusion feature as the spatial-spectral joint feature that is output. The method can better extract the important discriminative information in the data, improve the separability of pixels and the classification accuracy, reduce misclassification in subsequent classification tasks, and improve the robustness of the model to noise interference.
Description
Technical Field
The application relates to the technical field of spectral imaging, in particular to a hyperspectral image feature extraction method based on a multi-level variational autoencoder.
Background
In the remote sensing field, hyperspectral imaging techniques are widely used in a variety of studies. A hyperspectral image contains rich spatial and spectral features: the spatial features refer to the spatial position information of pixels at each wavelength, while the spectral features refer to the spectral curve formed by the spectral reflectances of a single pixel across all wavelengths. Feature extraction from a hyperspectral image yields low-dimensional embedded features containing rich discriminative information, reduces redundant information in the image, and can improve recognition accuracy in subsequent classification research. Early hyperspectral image feature extraction methods mainly extracted the spectral features of pixels without considering the positional information among pixels, so good results were difficult to obtain. With the growth of computing power and the deepening of deep learning research, methods that extract spatial-spectral features by training neural networks have been proposed in succession. These methods introduce the idea of multi-sensor data fusion: spatial and spectral features are extracted independently and then fused, which avoids information loss and improves algorithm performance.
From the viewpoint of information source, hyperspectral image feature extraction methods can be divided into methods based on spectral features and methods based on spatial-spectral features.
Feature extraction methods based on spectral features construct a feature extractor from the individual spectral curves in the hyperspectral image, neglecting the positional information of different pixels in the spatial dimension. Widely used early methods include Principal Component Analysis (PCA), Minimum Noise Fraction (MNF), Linear Discriminant Analysis (LDA), and the like. These methods generally consider the internal discriminative information of hyperspectral pixels to ensure separability. With the continued deepening of deep learning research, some deep network models have also been applied to hyperspectral image feature extraction, including the autoencoder (AE), the variational autoencoder (VAE), the long short-term memory network (LSTM), and so on. However, these methods do not consider the positional relations among different pixels, so the information in the image is described only at the spectral level, and the advantage of the 'unity of image and spectrum' of hyperspectral images — the consistency and synergy between spatial and spectral information — is not fully exploited. Currently, the mainstream approach in this research field is feature extraction based on spatial-spectral features.
A hyperspectral image is typical three-dimensional cube data that combines the spatial information of the target ground objects with the spectral information at each wavelength and reflects them jointly in the complete data, so the hyperspectral image has the characteristic of unity of image and spectrum, i.e. its spatial and spectral information are consistent. Meanwhile, owing to the changeable shooting environment of hyperspectral images and external interference, the phenomena of 'same object, different spectra' and 'different objects, same spectrum' still exist in the image, and they interfere with the results of hyperspectral analysis research. The spatial information of a hyperspectral image can be understood as the local spatial neighborhood of a single pixel in the spatial dimension; this definition assumes that each pixel is related to the pixels in its spatial neighborhood, so that by learning the distribution of the local spatial neighborhood, the positional information of the pixel among real ground objects can be grasped. The current common approach is to extract the local information of pixels with a convolutional neural network, extract the spectral features of pixels with fully connected layers, and finally combine the spatial and spectral features with a splicing layer. During spectral feature extraction, some studies also use a long short-term memory network to learn the continuous information within the spectral curve of a pixel.
However, this type of method has certain drawbacks: (1) these methods do not consider the continuous information in hyperspectral images from multiple aspects, but describe it only from the perspective of the spectral curve; (2) when the spectral and spatial features are extracted separately, the correlation and synergy of the two mappings are poor, and feature fusion is realized only by a splicing layer at the last step, so the distributions of the two features cannot be fully matched; (3) these methods do not consider the homology between spatial and spectral features: although the two features describe the information of the hyperspectral image at different levels, they describe the same hyperspectral pixel, so homology potentially exists.
In summary, the existing hyperspectral image feature extraction methods have certain defects:
(1) existing spatial-spectral feature extraction methods do not consider the continuous information of pixels at multiple levels; most describe the continuous information only from the angle of the spectral curve, weakening the diversity of the data;
(2) in existing spatial-spectral feature extraction methods, the cooperation between the spatial feature mapping and the spectral feature mapping is poor — the two mappings are essentially completely separate, and feature fusion is performed with a splicing layer only after feature extraction; since the data distributions of the spatial and spectral features differ greatly, simple concatenation can hardly achieve the desired purpose;
(3) existing spatial-spectral feature extraction methods do not consider the homology between spatial and spectral features, ignoring the fact that both are extracted from the same hyperspectral pixel; this further increases the difference between the two feature distributions, which is detrimental both to fused feature expression and to subsequent classification research.
Disclosure of Invention
To overcome the above drawbacks, the present application provides a hyperspectral image feature extraction method based on a multi-level variational autoencoder (multi-level VAE) for hyperspectral images. The method uses a variational autoencoder as its basic framework and, after training, adopts the finally corrected fusion feature as the spatial-spectral joint feature that is output.
In order to achieve the above purpose, the present application adopts the following technical scheme.
The hyperspectral image feature extraction method based on the multi-level variational autoencoder is characterized by comprising the following steps:
S1, selecting a hyperspectral image of size X×Y×B, where X and Y are the spatial dimensions of the hyperspectral image at each wavelength and B is the number of wavelengths;
S2, configuring neighborhood information for each hyperspectral pixel in the hyperspectral image, namely selecting the surrounding s×s neighborhood pixels as the neighborhood information of that pixel, where the neighborhood is a square region of side length s centred on the hyperspectral pixel and s is an odd number;
S3, transforming the neighborhood information with the deep network model to obtain a first sample of size 1×s²×B, which serves as the input Input^p of the spatial feature extraction module,
and taking a second sample derived from the 1×B hyperspectral pixel as the input Input^q of the spectral feature extraction module, the first and second samples being equal in number and in one-to-one correspondence;
s4, training a deep neural network,
S5, feature splicing and mean-feature calculation: for 1 < i < m, the input of the i-th layer of the spatial feature extraction module is the output of its (i-1)-th layer, Input_i^p = Output_{i-1}^p, while the input of the i-th layer of the spectral feature extraction module splices the (i-1)-th layer outputs of both modules, Input_i^q = Concat(Output_{i-1}^p, Output_{i-1}^q); the mean feature μ = Concat(Output_m^p, Output_m^q) is thereby obtained, with dimensions bs×s²×d;
S6, pooling operation: pooling the mean feature μ and the standard-deviation feature δ with an average pooling layer to obtain the pooled mean feature μ̄ and the pooled standard-deviation feature δ̄, each of dimension bs×d;
S7, obtaining a fused feature O via the feature fusion module and inputting it to the decoder module, where the decoder reconstructs the data from the fused feature O;
S8, network optimization: according to the formula F = Γ_R + Γ_KL + Γ_Homo, constructing the loss function used in training the network model,
where Σ(·) denotes summing the bracketed contents; Γ_R computes the similarity between the spectral-feature-extraction-module input and the decoder output using the Euclidean distance; Γ_KL is the loss function of the variational autoencoder (VAE), using the KL divergence to compute the similarity between the Gaussian distribution and the embedded-feature distribution; and Γ_Homo computes the similarity between the outputs of all spatial-feature-extraction-module layers and the outputs of the corresponding spectral-feature-extraction-module layers using the spectral angle distance.
Preferably, in step S1, the method further includes:
carrying out normalization preprocessing on the hyperspectral image, and setting the neighborhood size s, the number m of network layers in the spatial and spectral feature extraction modules, the number n of network layers in the decoder module, and the embedded feature dimension d, where d is an even number larger than 0.
Preferably, step S4 includes: randomly selecting mini-batches from the X×Y first samples of size 1×s²×B and the X×Y second samples of size 1×B, respectively, and inputting them into the deep neural network for training; the mini-batch pixel count of both the first and the second samples is bs, and the activation functions in the network are Tanh activation functions.
Preferably, the hyperspectral image feature extraction method based on the multi-level variational autoencoder further comprises:
normalizing all X×Y hyperspectral pixels so that their values lie between −1 and 1, using the normalization formula x′ = 2(x − x_min)/(x_max − x_min) − 1,
where x_min is the minimum and x_max the maximum of the pixel data.
Preferably, step S8 further includes: using the loss function with a step size of 10⁻³ to optimize the deep network model, and after the model has stabilized, taking the pooled mean feature μ̄ as the output.
Preferably, step S3 includes:
transforming the hyperspectral pixel x of size 1×B to obtain a second sample of size s²×(B/s²), which serves as the input Input^q of the spectral feature extraction module;
if B is not an integer multiple of s², ε wavelengths are removed so that B−ε is an integer multiple of s²; the superscript q refers to the relevant variables in the spectral feature extraction module.
Preferably, step S6 further includes:
obtaining the standard-deviation feature δ by using a long short-term memory network layer L with d nodes, where δ has size bs×s²×d.
Preferably, in step S7, the decoder module includes:
n fully connected network layers {d_1, d_2, …, d_n}, with inputs {ind_1, ind_2, …, ind_n} and outputs {outd_1, outd_2, …, outd_n}; the last layer has B nodes and the other layers have d nodes.
Preferably, step S7 further includes:
according to the formulaAnd obtaining a fused characteristic O, wherein gamma is a randomly generated noise matrix, accords with Gaussian distribution and has the size of bs multiplied by d.
Advantageous effects
Compared with the prior art, the multi-continuity feature integration method for hyperspectral images extracts both the spatial and the spectral features of the image, takes into account the two kinds of continuous information contained in a hyperspectral image, and achieves multi-angle description of the information through the designed deep network model. In addition, through the multi-level spatial-feature correction method, the spatial features of each stage are used in turn to correct the information of the spectral features of the corresponding stage, improving the synergy of the two features during extraction. The method for improving the homology of multi-level spatial-spectral features based on the spectral angle distance gradually increases the homology among the spatial-spectral features at each level, strengthens the association between spatial and spectral features in the feature-mapping stage, and alleviates the difficulty of improving classification accuracy caused by the large difference between the distributions of the two features.
Drawings
Figure 1 is a flow chart of the feature extraction method according to an embodiment of the present application,
fig. 2 is a schematic diagram of the overall deep network architecture according to an embodiment of the present application.
Detailed Description
The above-described aspects are further described below in conjunction with specific embodiments. It should be understood that these examples are illustrative of the present application and do not limit its scope. The implementation conditions employed in the examples may be adjusted according to specific circumstances, and unspecified implementation conditions are typically those of routine experiments.
The hyperspectral image feature extraction method based on the multi-level variational autoencoder uses long short-term memory network layers to extract continuous features of pixels at both the spatial and the spectral level, and uses a splicing layer to fuse the two kinds of continuous features, solving the problem that traditional feature extraction algorithms exploit only a single kind of continuous information. The method for improving the homology of multi-level spatial-spectral features based on the spectral angle distance uses the spectral angle distance to compute and increase the homology among the spatial-spectral features at each stage, alleviates the difficulty of improving subsequent classification accuracy caused by the large distribution difference between spatial and spectral features, and strengthens the association of the two features in the feature-mapping stage.
The hyperspectral image feature extraction method based on the multi-level variation automatic encoder provided by the application is described below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a feature extraction method, which includes:
S1, selecting/acquiring a hyperspectral image of size X×Y×B, where X and Y are the spatial dimensions of the hyperspectral image at each wavelength and B is the number of wavelengths; carrying out normalization preprocessing on the hyperspectral image; and setting the neighborhood size s, the number m of network layers in the spatial and spectral feature extraction modules, the number n of network layers in the decoder module, and the embedded feature dimension d, where d must be an even number larger than 0.
S2, setting neighborhood information for each hyperspectral pixel (X multiplied by Y hyperspectral pixels in total), and selecting a neighborhood pixel with the surrounding size of sxs as the neighborhood information of the pixel, wherein the neighborhood information size is sxsxB.
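The neighborhood selection in step S2 can be sketched as follows in Python (a minimal sketch, not part of the claims; the patent does not specify how image borders are handled, so reflect-padding here is an assumption):

```python
import numpy as np

def extract_neighborhood(image, row, col, s):
    """Return the s x s x B spatial neighborhood centred on pixel (row, col).

    `image` is an X x Y x B hyperspectral cube; border pixels are handled by
    reflect-padding, which is an assumption -- the patent text does not say.
    """
    assert s % 2 == 1, "neighborhood side length s must be odd"
    r = s // 2
    padded = np.pad(image, ((r, r), (r, r), (0, 0)), mode="reflect")
    return padded[row:row + s, col:col + s, :]

# toy cube: X = 4, Y = 4, B = 6 bands
cube = np.arange(4 * 4 * 6, dtype=float).reshape(4, 4, 6)
patch = extract_neighborhood(cube, 2, 2, 3)   # 3 x 3 x 6 neighborhood
```

The centre of the returned patch is the pixel itself, and the surrounding s×s−1 entries form its spatial context.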
S3, constructing the deep network model and transforming the neighborhood information of each pixel to obtain a first sample of size 1×s²×B, which serves as the input Input^p of the spatial feature extraction module. The hyperspectral pixel x of size 1×B is transformed into a second sample of size s²×(B/s²), which serves as the input Input^q of the spectral feature extraction module; if B is not an integer multiple of s², ε wavelengths are removed so that B−ε is an integer multiple of s². The superscripts p and q refer to the relevant variables in the spatial and spectral feature extraction modules, respectively. The first and second samples are equal in number and in one-to-one correspondence. The deep network model comprises a spatial feature extraction module, a spectral feature extraction module, a feature fusion module and a decoder module.
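The reshaping of a 1×B pixel into an s²×(B/s²) spectral sequence, with ε wavelengths removed when B is not a multiple of s², might look like this (a sketch; which ε bands to remove is not specified in the text, so dropping the tail is an assumption):

```python
import numpy as np

def reshape_spectral_input(pixel, s):
    """Reshape a length-B spectrum into an s^2 x (B/s^2) sequence.

    If B is not a multiple of s^2, eps trailing wavelengths are dropped so
    that B - eps is; the choice of which eps bands to remove is an assumption.
    """
    b = pixel.shape[0]
    eps = b % (s * s)
    trimmed = pixel[:b - eps] if eps else pixel
    return trimmed.reshape(s * s, -1)

seq = reshape_spectral_input(np.arange(200.0), 5)          # B=200, s=5 -> 25 x 8
seq_trimmed = reshape_spectral_input(np.arange(103.0), 5)  # eps=3    -> 25 x 4
```

Both the spatial sequence (s² neighborhood positions) and this spectral sequence then share the same sequence length s², which is what allows the layer-wise splicing of step S5.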
S4, training the deep neural network: mini-batches of samples are randomly selected from all X×Y hyperspectral pixels and input into the deep neural network; the mini-batch pixel count is bs, the activation function in the network is the Tanh activation function, and every network layer except the last layer of the decoder module is followed by a batch normalization layer (Batch Normalization Layer).
S5, performing the feature splicing operation.

Input^q is fed into the first layer L_1^q of the spectral feature extraction module, yielding the output Output_1^q; likewise, Input^p is fed into the first layer L_1^p of the spatial feature extraction module, yielding Output_1^p. Both outputs have size bs×s²×d. Output_1^p is taken as the input of the second layer L_2^p of the spatial feature extraction module, and the input of the second spectral layer is formed according to

Input_2^q = Concat(Output_1^p, Output_1^q),

where Concat(·) is a splicing operation that concatenates the two tensors along the third dimension, producing a bs×s²×2d tensor that serves as the input of the second layer L_2^q of the spectral feature extraction module.

In general, for 1 < i < m, the input of the i-th layer of the spatial feature extraction module is the output of its (i-1)-th layer:

Input_i^p = Output_{i-1}^p,

and the input of the i-th layer of the spectral feature extraction module splices the (i-1)-th layer outputs of both modules:

Input_i^q = Concat(Output_{i-1}^p, Output_{i-1}^q).

Finally, the mean feature is obtained as

μ = Concat(Output_m^p, Output_m^q),

with dimensions bs×s²×d.
S6, pooling operation: the mean feature μ and the standard-deviation feature δ are pooled with an average pooling layer (Average Pooling Layer) to obtain the pooled mean feature μ̄ and the pooled standard-deviation feature δ̄, each of dimension bs×d. The step further comprises: obtaining the standard-deviation feature δ by using a long short-term memory network layer L with d nodes, where δ has size bs×s²×d.
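The average pooling of step S6 collapses the s² neighborhood axis; it can be sketched with toy data (the shapes follow the patent, the values are random and purely illustrative):

```python
import numpy as np

# bs x s^2 x d mean feature mu and standard-deviation feature delta,
# average-pooled over the s^2 axis to give bs x d pooled features.
bs, s2, d = 4, 25, 40
rng = np.random.default_rng(0)
mu = rng.standard_normal((bs, s2, d))
delta = rng.standard_normal((bs, s2, d))

mu_bar = mu.mean(axis=1)        # pooled mean feature, bs x d
delta_bar = delta.mean(axis=1)  # pooled standard-deviation feature, bs x d
```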
S7, obtaining the fused feature O in the feature fusion module according to the formula O = μ̄ + γ⊙δ̄, where ⊙ denotes element-wise multiplication and γ is a randomly generated noise matrix following a Gaussian distribution, of size bs×d; the fused feature O is then taken as the input of the decoder module.
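The fusion of step S7 follows the standard VAE reparameterisation; below is a sketch under the assumption (the original formula is rendered as an image) that the product with the noise matrix γ is element-wise:

```python
import numpy as np

# Fused feature O = mu_bar + gamma * delta_bar, with gamma ~ N(0, 1) of size
# bs x d. Toy values stand in for the pooled features of step S6.
bs, d = 4, 40
rng = np.random.default_rng(1)
mu_bar = rng.standard_normal((bs, d))
delta_bar = np.abs(rng.standard_normal((bs, d)))  # std devs are non-negative
gamma = rng.standard_normal((bs, d))              # Gaussian noise matrix

O = mu_bar + gamma * delta_bar                    # fused feature, bs x d
```

Because the noise enters additively through a deterministic function of μ̄ and δ̄, gradients can flow back to both features during training, which is the point of the reparameterisation trick.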
S8, according to the formula F = Γ_R + Γ_KL + Γ_Homo, constructing the loss function used in training the network model, where Σ(·) denotes summing the bracketed contents. Γ_R computes the similarity between the spectral-feature-extraction-module input and the decoder output using the Euclidean distance; Γ_KL is the loss function of the variational autoencoder (VAE), using the KL divergence to compute the similarity between the Gaussian distribution and the embedded-feature distribution; Γ_Homo computes the similarity between the outputs of all spatial-feature-extraction-module layers and the outputs of the corresponding spectral-feature-extraction-module layers using the spectral angle distance. In this embodiment, the spatial feature extraction module comprises m long short-term memory network layers (Long Short-term Memory Layer) {L_1^p, …, L_m^p}, with inputs {Input_1^p, …, Input_m^p} and outputs {Output_1^p, …, Output_m^p}; the last layer has d/2 nodes and the other layers have d nodes. The spectral feature extraction module likewise comprises m long short-term memory network layers {L_1^q, …, L_m^q}, with inputs {Input_1^q, …, Input_m^q} and outputs {Output_1^q, …, Output_m^q}; the last layer has d/2 nodes and the other layers have d nodes. The decoder module consists of n fully connected network layers (Fully Connected Layer) {d_1, d_2, …, d_n}, with inputs {ind_1, …, ind_n} and outputs {outd_1, …, outd_n}; the last layer has B nodes and the other layers have d nodes. The decoder reconstructs the data from the fused feature O, completing the autoencoder structure and thereby guaranteeing the consistency of the sample information.
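Plausible forms of the three loss terms — Euclidean reconstruction distance, KL divergence against a standard Gaussian, and the spectral angle distance — are sketched below; the exact arguments and any weighting in the patent are rendered as images, so these concrete formulas are assumptions consistent with the surrounding description:

```python
import numpy as np

def euclidean_loss(x, x_hat):
    """Gamma_R sketch: mean squared Euclidean distance between input and reconstruction."""
    return float(np.mean(np.sum((x - x_hat) ** 2, axis=-1)))

def kl_loss(mu, delta):
    """Gamma_KL sketch: KL divergence of a diagonal Gaussian N(mu, delta^2) vs N(0, I)."""
    var = delta ** 2
    return float(0.5 * np.mean(np.sum(var + mu ** 2 - 1.0 - np.log(var), axis=-1)))

def spectral_angle(a, b):
    """Gamma_Homo building block: mean spectral angle between matched feature vectors."""
    cos = np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1))
    return float(np.mean(np.arccos(np.clip(cos, -1.0, 1.0))))

x = np.ones((2, 8)); x_hat = np.zeros((2, 8))
r = euclidean_loss(x, x_hat)                       # 8.0 per the toy data
k = kl_loss(np.zeros((2, 8)), np.ones((2, 8)))     # 0.0 exactly at the prior
a = spectral_angle(x, x)                           # ~0 for identical vectors
```

The spectral angle is scale-invariant, which is why it is a natural measure of homology between spatial and spectral features whose magnitudes may differ.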
In an embodiment, the method further comprises: using the above loss function with a step size of 10⁻³ to optimize the deep network model; after the model has stabilized, the pooled mean feature μ̄ is taken as the output, and all first and second samples are used as test samples to obtain the desired embedded features.
Preferably, in step S4, all X×Y hyperspectral pixels are divided into a training set and a test set in a certain proportion and normalized so that their values lie between −1 and 1, using the normalization formula x′ = 2(x − x_min)/(x_max − x_min) − 1, where x_min is the minimum and x_max the maximum of the pixel data. The training-set pixels are then randomly shuffled and packed, i.e. divided into sample packets of bs pixels each; each optimization iteration selects a single sample packet as input to the neural network, and the packet selected each time is different. The Tanh activation function is calculated as Tanh(x) = (e^x − e^(−x))/(e^x + e^(−x)).
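The normalization and Tanh formulas can be checked numerically (the min-max form mapping onto [−1, 1] is the standard reading of the formula above, which is partly image-rendered in the source):

```python
import numpy as np

def normalize(x):
    """Min-max normalise a pixel vector to [-1, 1]:
    x' = 2 * (x - x_min) / (x_max - x_min) - 1."""
    x_min, x_max = x.min(), x.max()
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

def tanh(x):
    """Tanh activation, (e^x - e^-x) / (e^x + e^-x)."""
    return np.tanh(x)

v = normalize(np.array([0.0, 5.0, 10.0]))   # -> [-1, 0, 1]
t = tanh(np.array([0.0]))                   # -> [0]
```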
the method described above is verified next in connection with the detailed description.
The hyperspectral image feature extraction method based on the multi-level variational autoencoder is used to extract spatial-spectral features of hyperspectral images for subsequent classification research. Taking the Indian Pines dataset as an example: the image size is 145×145×200, i.e. 21025 pixels in total, each pixel containing 200 spectral wavelengths, and the whole dataset contains 16 effective categories plus a background noise category. After removing the pixels belonging to the background noise category, 10366 effective pixels remain. The deep network structure is shown in Fig. 2:
input: the input hyperspectral image is an image of 145×145×200 in size.
Parameter setting: the neighborhood size is 5, the number of network layers in the spatial feature extraction module and the spectral feature extraction module is 3, the number of network layers in the decoder module is 3, and the embedded feature dimension is 40.
Neighborhood information selection: for each pixel, neighborhood information of size 5×5×200 is obtained, and the pixel together with its neighborhood information is input into the deep network for training.
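The neighborhood selection step can be sketched as follows. Edge-replication padding at the image border is an assumption, since the text does not specify how border pixels are handled; the function name is illustrative.

```python
import numpy as np

def neighborhood(image, row, col, s=5):
    # Extract the s x s x B square neighborhood centered on pixel (row, col).
    # Border pixels are handled by edge replication (an assumed choice).
    r = s // 2
    padded = np.pad(image, ((r, r), (r, r), (0, 0)), mode="edge")
    return padded[row:row + s, col:col + s, :]
```

For a 145×145×200 cube this yields a 5×5×200 block per pixel, which is then flattened into the 1×s²×B first sample fed to the spatial feature extraction module.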
Training the deep network
40% of the 10366 effective pixels are selected to train the deep network model; the samples are randomly shuffled and packed, with a mini-batch size of 512 pixels, and only one sample batch is used in each training iteration. After training, all 10366 pixels are input into the deep model for testing, yielding embedded features of size 10366×40, which are finally classified with an SVM classifier. 10% of the samples are randomly selected to train the SVM classifier and the remaining 90% are used for testing, yielding the classification result, which is evaluated with the overall classification accuracy and the average classification accuracy. The overall classification accuracy is the number of correctly classified samples divided by the total number of samples. The average classification accuracy first computes, for each class, the ratio of correctly classified samples to the number of samples of that class, and then averages these per-class ratios.
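The two evaluation metrics described above can be sketched as follows (a minimal NumPy illustration with hypothetical function names):

```python
import numpy as np

def overall_accuracy(y_true, y_pred):
    # Overall accuracy: correctly classified samples / total samples
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def average_accuracy(y_true, y_pred):
    # Average accuracy: per-class accuracy (correct in class / class size),
    # averaged over all classes present in the ground truth
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    accs = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(accs))
```

Average accuracy weights every class equally, so it is more sensitive than overall accuracy to errors in small classes, which is relevant for Indian Pines where class sizes are highly imbalanced.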
The classification results obtained with the hyperspectral image feature extraction method based on the multi-level variational autoencoder and with an ordinary variational autoencoder (the ordinary variational autoencoder comprises an encoder of 3 fully connected layers, a feature fusion module, and a decoder of 3 fully connected layers, with the same numbers of network-layer nodes and the same feature fusion module structure as the method of the present application) are shown in the table below.
| | Overall classification accuracy | Average classification accuracy |
|---|---|---|
| The method implemented by the application | 85.3% | 79.1% |
| With added random Gaussian noise | 81.4% | 72.3% |
| Ordinary variational autoencoder | 76.7% | 66.3% |
As can be seen from the table, the method improves the classification performance of the embedded features and produces fewer misclassified samples. In addition, repeating the experiment after adding a certain amount of random Gaussian noise to the original hyperspectral image yields an overall classification accuracy of 81.4% (versus 85.3% without added noise), showing that the method has strong resistance to noise interference. The method therefore effectively improves the separability and classification accuracy of the embedded features and strengthens the model's robustness to noise.
The foregoing embodiments are provided to illustrate the technical concept and features of the present application and are intended to enable those skilled in the art to understand the contents of the present application and implement the same according to the contents, and are not intended to limit the scope of the present application. All such equivalent changes and modifications as come within the spirit of the disclosure are desired to be protected.
Claims (10)
1. The hyperspectral image feature extraction method based on the multi-level variation automatic encoder is characterized by comprising the following steps of:
s1, selecting a hyperspectral image, the size of the hyperspectral image being X×Y×B, wherein X and Y are the spatial dimensions of the hyperspectral image at each wavelength and B is the number of wavelengths of the hyperspectral image,
s2, configuring neighborhood information for each hyperspectral pixel in the hyperspectral image, namely selecting surrounding neighborhood pixels of size s×s as the neighborhood information of the pixel, wherein the neighborhood information refers to a square area centered on the hyperspectral pixel with side length s, s being an odd number,
s3, constructing a deep network model, and transforming the neighborhood information based on the deep network model to obtain first samples of size 1×s²×B, which serve as the input Input_p of the spatial feature extraction module, and second samples of size 1×B obtained from the hyperspectral pixels, which serve as the input Input_q of the spectral feature extraction module, the first samples and the second samples being equal in number and in one-to-one correspondence,
s4, training a deep neural network,
s5, feature splicing and mean feature μ calculation, namely the input of the i-th layer of the spatial feature extraction module and the input of the i-th layer of the spectral feature extraction module are each obtained by splicing the outputs of the (i-1)-th layers of the two modules, according to the calculation formula inp_i = inq_i = [outp_(i-1), outq_(i-1)], wherein 1 < i ≤ m; splicing the outputs of the last layers according to the formula μ = [outp_m, outq_m] yields the mean feature μ, whose dimensions are bs×s²×d,
s6, pooling operation, namely pooling the mean feature μ and the standard deviation feature δ using an average pooling layer to obtain the pooled mean feature μ̂ and the pooled standard deviation feature δ̂, both of dimensions bs×d,
s7, obtaining a fused feature O based on the feature fusion module and inputting it to the decoder module, the decoder being used for data reconstruction of the fused feature O,
s8, network optimization, namely constructing the loss function for training the network model according to the formula Γ = Γ_R + Γ_KL + Γ_Homo, wherein Σ(·) denotes summation of the contents in brackets.
2. The hyperspectral image feature extraction method based on a multi-level variational automatic encoder as claimed in claim 1, wherein in step S1, further comprising:
carrying out normalization preprocessing on the hyperspectral image, and setting the neighborhood size s, the number m of network layers in the spatial feature extraction module and the spectral feature extraction module, the number n of network layers in the decoder module, and the embedded feature dimension d, wherein d is an even number greater than 0.
3. The hyperspectral image feature extraction method based on the multi-level variational automatic encoder as claimed in claim 1, wherein the step S4 comprises:
randomly selecting mini-batches of samples from the X×Y first samples of size 1×s²×B and the X×Y second samples of size 1×B, respectively, and inputting them into the deep neural network for training, wherein the mini-batch pixel number of both the first samples and the second samples is bs, and the activation functions in the network are Tanh activation functions.
4. The hyperspectral image feature extraction method based on a multi-level variational automatic encoder as claimed in claim 3, further comprising:
normalizing all X×Y hyperspectral pixels so that their values lie between -1 and 1, the normalization formula being:

x̃ = 2(x - x_min)/(x_max - x_min) - 1

wherein x_min denotes the minimum value in the pixel data and x_max the maximum value.
6. the hyperspectral image feature extraction method based on the multi-level variational automatic encoder as claimed in claim 1, wherein,
7. The hyperspectral image feature extraction method based on the multi-level variational automatic encoder as claimed in claim 1, wherein,
the step S3 includes:
the hyperspectral pixel x of size 1×B is transformed to obtain the second sample, which serves as the input Input_q of the spectral feature extraction module; if B is not an integer multiple of s², ε wavelengths are removed so that B-ε is an integer multiple of s², wherein q is used to denote the relevant variables in the spectral feature extraction module.
8. The hyperspectral image feature extraction method based on the multi-level variational automatic encoder as claimed in claim 1, wherein,
the step S6 further includes:
9. The hyperspectral image feature extraction method based on the multi-level variational automatic encoder as claimed in claim 1, wherein,
in step S7, the decoder module includes:
n fully connected network layers {d_1, d_2, ..., d_n}, with inputs {ind_1, ind_2, ..., ind_n} and outputs {outd_1, outd_2, ..., outd_n}, wherein the last layer has B nodes and the other network layers have d nodes.
10. The hyperspectral image feature extraction method based on a multi-level variational automatic encoder as claimed in claim 1, wherein step S7 further comprises:
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111627432.1A CN116416441A (en) | 2021-12-28 | 2021-12-28 | Hyperspectral image feature extraction method based on multi-level variational automatic encoder |
PCT/CN2022/142106 WO2023125456A1 (en) | 2021-12-28 | 2022-12-26 | Multi-level variational autoencoder-based hyperspectral image feature extraction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116416441A true CN116416441A (en) | 2023-07-11 |
Family
ID=86997859
Country Status (2)
Country | Link |
---|---|
CN (1) | CN116416441A (en) |
WO (1) | WO2023125456A1 (en) |
Also Published As
Publication number | Publication date |
---|---|
WO2023125456A1 (en) | 2023-07-06 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |