CN114898157A - Global learning device and method for hyperspectral image classification - Google Patents


Info

Publication number
CN114898157A
Authority
CN
China
Prior art keywords: layer, output, module, convolution, sampling
Legal status: Pending (an assumption by Google Patents, not a legal conclusion)
Application number
CN202210563560.2A
Other languages
Chinese (zh)
Inventor
党兰学
刘崇阳
侯彦娥
左宪禹
刘扬
田军锋
林英豪
周黎鸣
Current Assignee
Henan University
Original Assignee
Henan University
Priority date
Application filed by Henan University
Priority to CN202210563560.2A
Publication of CN114898157A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/30: Assessment of water resources

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a global learning device and method for hyperspectral image classification. The device includes an encoder and a decoder. The encoder sequentially comprises a spectral dimension adjustment layer, a first feature extraction layer and a second feature extraction layer, in image processing order. The first feature extraction layer comprises three MLBSA structural layers stacked together. Each MLBSA structural layer comprises three Shuffle Spectral Attention (SSA) modules, two MLB layers and a down-sampling layer. The input of the first SSA module passes through a Zero-padded convolution module and then undergoes an additive fusion operation with the output of the down-sampling layer; the result is used as the output of the MLBSA structural layer. The decoder sequentially comprises a first up-sampling layer, a Concat layer and an output layer, in image processing order. The output of the first MLBSA structural layer in the first feature extraction layer is used as the input of the first up-sampling layer; the output of the second feature extraction layer and the output of the first up-sampling layer are fused through the Concat layer, and the fusion result is processed by the output layer to complete the classification of the hyperspectral image.

Description

Global learning device and method for hyperspectral image classification
Technical Field
The invention relates to the technical field of hyperspectral image classification, in particular to a global learning device and method for hyperspectral image classification.
Background
Hyperspectral imaging technology can simultaneously capture two-dimensional geometric spatial information and one-dimensional continuous spectral information of a target object, so that a hyperspectral image combines imagery and spectra in a single data cube. The geometric spatial information reflects external characteristics such as the size and shape of the target object, while the spectral information reflects its internal physical structure and chemical composition. Hyperspectral remote sensing is therefore widely applied in fields such as rock and mineral detection, marine plant detection, water resource assessment, and land resource utilization.
How to construct a more accurate and effective classification method is a key problem in the application of hyperspectral remote sensing technology. Traditional classification algorithms, such as the Support Vector Machine (SVM), three-dimensional wavelet transform, and Gaussian mixture models, generally reduce the dimensionality of the original image through band selection and feature extraction, projecting the image into a lower-dimensional feature space. These methods often alter the band correlations of the original image, lose part of the spectral information, and cannot fully extract the abstract features in the hyperspectral image, thereby reducing classification accuracy.
In recent years, with the application and development of deep learning, algorithm models built on the Convolutional Neural Network (CNN) have been widely applied to image classification (LeCun Y, Bottou L. Gradient-based learning applied to document recognition [J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.), speech recognition, target detection, image semantic segmentation, and other fields, where CNNs show strong feature extraction capability. More and more researchers use CNNs in place of traditional methods for hyperspectral image classification. Current CNN-based classification models evolve toward deeper or wider complex architectures. Although good results are achieved to some extent, a deeper network means more parameters, which not only increases computational overhead and lowers classification speed, but also places higher demands on computer hardware.
Disclosure of Invention
In order to improve the classification accuracy and the classification speed of the hyperspectral images, the invention provides a global learning device and method for hyperspectral image classification.
In one aspect, the present invention provides a global learning apparatus for hyperspectral image classification, comprising: an encoder and a decoder; the encoder sequentially comprises a spectrum dimension adjusting layer, a first feature extraction layer and a second feature extraction layer according to an image processing sequence; the first feature extraction layer comprises three MLBSA structural layers stacked together; the MLBSA structural layer comprises three shuffling spectral attention SSA modules, two MLB layers and a down-sampling layer; wherein the SSA module and the MLB layer are stacked to cross each other; the downsampling layer is used as the last sublayer of the MLBSA structural layer; the input of the first SSA module passes through a Zero-padded convolution module and then is subjected to addition fusion operation with the output of the down-sampling layer, and the output is used as the output of the MLBSA structural layer; the MLB layer represents a modified linear bottleneck layer;
the decoder sequentially comprises a first up-sampling layer, a Concat layer and an output layer according to an image processing sequence; and the output of the first MLBSA structural layer in the first feature extraction layer is used as the input of the first up-sampling layer, the output of the second feature extraction layer and the output of the first up-sampling layer are fused through the Concat layer, and the fusion result is processed through the output layer to complete the classification of the hyperspectral images.
Further, the spectral dimension adjustment layer comprises three sublayers, namely an SSA module, a 1 × 1 convolution layer and an MLB layer from a shallow layer to a deep layer.
Further, the MLB layer comprises a first convolution module, a second convolution module and a third convolution module which are stacked together in sequence; the first convolution module and the second convolution module are sequentially composed of a convolution layer, a GN layer and a ReLU layer from a shallow layer to a deep layer; the third convolution module includes a convolution layer and a GN layer in this order.
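A minimal PyTorch sketch of the MLB layer as described above (two conv + GN + ReLU modules followed by a conv + GN projection with no activation), using the kernel/stride/padding values given later in Example 2. The channel widths and the group count of 16 are illustrative assumptions, not values from the patent. Note that the 1 × 1 convolution with padding 1 grows the spatial size by 2, which the final 3 × 3 convolution with padding 0 exactly removes, so the layer preserves spatial size overall.

```python
import torch
import torch.nn as nn

class MLB(nn.Module):
    """Modified Linear Bottleneck sketch: expand, filter, project.

    Kernel/stride/padding follow Example 2 of the patent; channel widths
    (in_ch, hidden_ch, out_ch) and groups=16 are illustrative assumptions.
    """
    def __init__(self, in_ch, hidden_ch, out_ch, groups=16):
        super().__init__()
        # First conv module: 1x1 conv (padding 1 per the text), GN, ReLU
        self.block1 = nn.Sequential(
            nn.Conv2d(in_ch, hidden_ch, kernel_size=1, stride=1, padding=1),
            nn.GroupNorm(groups, hidden_ch),
            nn.ReLU(inplace=True))
        # Second conv module: 3x3 conv, GN, ReLU (nonlinearity kept in high dim)
        self.block2 = nn.Sequential(
            nn.Conv2d(hidden_ch, hidden_ch, kernel_size=3, stride=1, padding=1),
            nn.GroupNorm(groups, hidden_ch),
            nn.ReLU(inplace=True))
        # Third conv module: 3x3 conv + GN, no activation (linear projection)
        self.block3 = nn.Sequential(
            nn.Conv2d(hidden_ch, out_ch, kernel_size=3, stride=1, padding=0),
            nn.GroupNorm(groups, out_ch))

    def forward(self, x):
        return self.block3(self.block2(self.block1(x)))
```

With these settings the spatial size is preserved: 16 → 18 (1 × 1 conv, padding 1) → 18 (3 × 3 conv, padding 1) → 16 (3 × 3 conv, padding 0).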
Further, the second feature extraction layer comprises an SSA module, a first branch extraction layer for extracting global information, a second branch extraction layer for extracting local information, a feature fusion layer and a second up-sampling layer; the output of the SSA module passes through the first branch extraction layer and the second branch extraction layer respectively; and the output of the two branch extraction layers is subjected to feature fusion through the feature fusion layer, and the output of the feature fusion layer after passing through the second up-sampling layer is taken as the output of the second feature extraction layer.
Further, the first branch extraction layer comprises two connected cross-attention CCA modules.
Further, the second branch extraction layer adopts an atrous spatial pyramid pooling (ASPP) structure.
Further, the output layer comprises three sublayers, and two fourth convolution modules and a 1 × 1 convolution layer are sequentially arranged from the shallow layer to the deep layer; the fourth convolution module is composed of a convolution layer, a GN layer and a ReLU layer from a shallow layer to a deep layer in sequence.
In another aspect, the present invention provides a hyperspectral image classification method based on the above apparatus, including:
dividing a data set into a training set, a verification set and a test set by adopting a universal global random layering UGSS sampling strategy; the data set is a set formed by all extracted ground object sample data after ground object samples are extracted from the hyperspectral image;
training the device according to any one of claims 1 to 7 by using a training set and a validation set to obtain a trained classification model;
and classifying the test set by using the classification model.
Further, the SSA module processes the input data by sequentially using formula (1), formula (2), and formula (3):

z_c = F_sq(u_c) = (1/(H × W)) Σ_{i=1}^{H} Σ_{j=1}^{W} u_c(i, j)  (1)

s_c = F_ex(z_c, W) = W_2(ReLU(GN(W_1(Shuffle(z_c)))))  (2)

x̃_c = F_scale(u_c, s_c) = s_c · u_c  (3)

wherein z_c represents the encoded value of all the pixel values of the c-th band, H and W represent the height and width of the hyperspectral image respectively, u_c(i, j) represents the pixel in the i-th row and j-th column of the c-th band, Shuffle represents a shuffle function that scrambles the spectral dimension to increase interactivity, W_1 and W_2 denote two fully connected layers, s_c represents the intermediate output of the c-th band after processing by the SSA module, and x̃_c represents the final output of the c-th band after processing by the SSA module.
The invention has the beneficial effects that:
(1) on the basis of the traditional linear bottleneck, a modified linear bottleneck (MLB) is designed according to the characteristics of hyperspectral images, and the MLBSA (Modified Linear Bottleneck and Spectral Attention) structural layer is built on top of it; stacking MLBSA structures for feature extraction allows the feature information of the hyperspectral image to be fully extracted;
(2) a second feature extraction layer with a double-branch structure at the end of the encoder fully extracts local and global spatial information; combined with the SSA (Shuffle Spectral Attention) layers used throughout the learning device, the spatial-spectral information of the hyperspectral image can be fully extracted;
(3) a shortcut connection is added behind the first MLBSA structural layer: its output is up-sampled and combined with the up-sampled output of the second feature extraction layer, thereby making full use of both low-level and high-level features;
(4) the classification method is different from a classification method using a data cube as input, a universal global random layering UGSS sampling strategy is adopted, a complete image is input into a classification model every time, global space-spectrum information can be fully extracted, and high classification speed is achieved while high precision is guaranteed.
Drawings
Fig. 1 is a schematic structural diagram of a global learning apparatus for hyperspectral image classification according to an embodiment of the present invention;
FIG. 2 is a structural diagram of an MLBSA structural layer provided in the embodiment of the present invention;
fig. 3 is a structural diagram of an SSA module provided in an embodiment of the present invention;
FIG. 4 is a diagram illustrating classification results of different models on an IP data set according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating classification results of different models on a PU data set according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating classification results of different models on an SA data set according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
With reference to fig. 1 and 2, the present invention provides a global learning apparatus for hyperspectral image classification, including: an encoder and a decoder;
the encoder sequentially comprises a spectral dimension adjustment layer, a first feature extraction layer and a second feature extraction layer, in image processing order; the first feature extraction layer comprises three MLBSA structural layers stacked together; each MLBSA structural layer comprises three Shuffle Spectral Attention (SSA) modules, two MLB layers and a down-sampling layer, wherein the SSA modules and the MLB layers are stacked alternately; the down-sampling layer is the last sublayer of the MLBSA structural layer; the input of the first SSA module passes through a Zero-padded convolution module and then undergoes an additive fusion operation with the output of the down-sampling layer to obtain the output of the MLBSA structural layer; the MLB layer denotes a modified linear bottleneck layer; the down-sampling layer extracts abstract features while saving computing resources. MLBSA: Modified Linear Bottleneck and Spectral Attention; SSA: Shuffle Spectral Attention; MLB: Modified Linear Bottleneck.
The decoder sequentially comprises a first up-sampling layer, a Concat layer and an output layer according to an image processing sequence; and the output of the first MLBSA structural layer in the first feature extraction layer is used as the input of the first up-sampling layer, the output of the second feature extraction layer and the output of the first up-sampling layer are fused through the Concat layer, and the fusion result is processed through the output layer to complete the classification of the hyperspectral images.
In the embodiment of the invention, the MLBSA structures are stacked for feature extraction, and a shortcut connection is added behind the first MLBSA structural layer: its output is up-sampled and combined with the output of the second feature extraction layer in the encoder, so that low-level and high-level features are fully utilized.
In addition, due to the existence of the down-sampling layer, the input and output dimensions of the MLBSA structural layer are different, so for the residual structure, a Zero-padded convolution module is used.
As an implementation, as shown in fig. 1, the spectral dimension adjustment layer includes three sublayers, which are the SSA module, the 1 × 1 convolution layer and the MLB layer in sequence from the shallow layer to the deep layer.
Specifically, the spectral dimension adjustment layer performs initial processing on the input hyperspectral image. In contrast to the conventional convolution from high dimension to low dimension, the spectral dimension adjustment layer in the embodiment of the invention uses a 1 × 1 convolution layer as an expansion operation, mapping the low-dimensional feature space into a high-dimensional space so that the tensor dimension is not reduced.
As an implementation, as shown in fig. 2, the MLB layer includes a first convolution module, a second convolution module, and a third convolution module stacked together in sequence; the first convolution module and the second convolution module are sequentially composed of a convolution layer, a GN layer and a ReLU layer from a shallow layer to a deep layer; the third convolution module includes a convolution layer and a GN layer in this order. GN: group Normalization, Group Normalization.
In the structural design of the MLB layer, a nonlinear ReLU activation is used when mapping data from the low-dimensional space to the high-dimensional space, instead of the linear activation of the conventional design. The reason is the high dimensionality of the hyperspectral image itself: after the low-dimensional tensor is mapped to a high-dimensional tensor, its dimensionality is high enough that little information is lost by the nonlinear activation, and extracting features in the high-dimensional tensor captures enough information to benefit classification. In the third convolution module, however, the tensor is low-dimensional, and a nonlinear activation there would lose more useful information; therefore no activation function is used in the third convolution module.
As an implementable manner, as shown in fig. 1, the second feature extraction layer includes an SSA module, a first branch extraction layer for extracting global information, a second branch extraction layer for extracting local information, a feature fusion layer, and a second upsampling layer; the output of the SSA module passes through the first branch extraction layer and the second branch extraction layer respectively; and the output of the two branch extraction layers is subjected to feature fusion through the feature fusion layer, and the output of the feature fusion layer after passing through the second upper sampling layer is used as the output of the second feature extraction layer.
In this embodiment of the present invention, the first branch extraction layer includes two connected CCA modules, and the second branch extraction layer adopts an ASPP structure. The CCA module is a simplification of the non-local module and can extract spatial information globally; ASPP extracts local spatial information from receptive fields of different sizes. CCA: Criss-Cross Attention; the module structure is described in "Z. Huang, X. Wang, L. Huang, C. Huang, Y. Wei, and W. Liu, 'CCNet: Criss-Cross Attention for Semantic Segmentation,' in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 603-612." ASPP: Atrous Spatial Pyramid Pooling; the structure is described in "L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, 'DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs,' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 4, pp. 834-848, 2017."
Local and global spatial information can be fully extracted by adopting a double-branch extraction layer in the second feature extraction layer; and the space-spectrum information can be fully extracted by matching with an SSA module used in the global learning device.
As an implementation, as shown in fig. 1, the output layer includes three sublayers, namely, two fourth convolution modules and a 1 × 1 convolution layer from the shallow layer to the deep layer; the fourth convolution module is composed of a convolution layer, a GN layer and a ReLU layer from a shallow layer to a deep layer in sequence.
Appropriate cross-channel interaction is very important for learning channel attention with high performance and efficiency. The two FC layers of the existing SEBlock are intended to capture nonlinear cross-channel interaction, but the dimensionality reduction inside SEBlock negatively affects its performance to some extent. The embodiment of the invention therefore designs the SSA module, which adds a Shuffle operation before global pooling, i.e. the spectral dimension of the HSI is shuffled before dimensionality reduction and then normalized with a GN layer. In this way, the SSA module enables rich information interaction between channels; combined with the first feature extraction layer, the model can continually learn effective spectral and spatial features, improving its performance.
Example 2
On the basis of the above embodiments, the embodiments of the present invention provide a global learning device for hyperspectral image classification, and provide parameter settings of each network layer or each module of the device.
In the embodiment of the present invention, in the MLB layer, the parameters of the convolutional layer of the first convolutional module are set as follows: the convolution kernel is 1, Stride is 1, Padding is 1; the parameters of the convolutional layer of the second convolutional module are set as: the convolution kernel is 3, Stride is 1, Padding is 1; the parameters of the convolutional layer of the third convolutional module are set as: the convolution kernel is 3, Stride is 1, Padding is 0.
In the MLBSA structural layer, the down-sampling layer is a convolution layer with convolution kernel 3, Stride of 2 and Padding of 1. Under these parameter settings, because Stride is 2 and Padding is 1, the spatial size is reduced while the spectral dimension is increased. In practical applications, a 2 × 2 average pooling may therefore be performed after the Zero-padded convolution module to ensure that the input and output dimensions of the MLBSA structural layer match, so that the two paths can undergo the additive fusion operation shown in fig. 2. Using Zero-padded convolution combined with an average-pooled skip connection guarantees a normal add operation without introducing additional parameters.
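A quick sanity check of these dimensions with the standard convolution output-size formula (an illustrative calculation, not from the patent) confirms that the stride-2 down-sampling convolution and the average-pooled shortcut produce matching spatial sizes, so the additive fusion is well-defined:

```python
def conv_out(size, kernel, stride, padding):
    """Standard convolution output-size formula."""
    return (size + 2 * padding - kernel) // stride + 1

# Down-sampling layer of the MLBSA block: 3x3 conv, Stride 2, Padding 1.
main_path = conv_out(64, kernel=3, stride=2, padding=1)   # 64 -> 32

# Shortcut path: the Zero-padded convolution keeps the spatial size, so a
# 2x2 average pooling (stride 2) halves it to match before the addition.
shortcut = 64 // 2                                        # 64 -> 32

assert main_path == shortcut
```

The example input size of 64 is arbitrary; any even spatial size gives the same agreement between the two paths.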
In the ASPP structure employed in the second branch extraction layer, the expansion factors of the hole convolution used in the hole space pyramid pooling are set to be rate 12, rate 24, and rate 36, respectively.
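The effect of these dilation rates can be illustrated with the standard formula for the effective spatial extent of a dilated (atrous) kernel, k + (k − 1)(rate − 1); the 3 × 3 kernel size is an assumption based on the ASPP design cited above:

```python
def dilated_extent(kernel, rate):
    """Effective spatial extent of a dilated (atrous) convolution kernel."""
    return kernel + (kernel - 1) * (rate - 1)

# Assuming 3x3 kernels, the three rates give progressively larger
# receptive fields over the feature map, capturing multi-scale context.
for rate in (12, 24, 36):
    print(rate, dilated_extent(3, rate))   # extents 25, 49 and 73
```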
The characteristic fusion layer sequentially comprises a Concat layer and a convolution module consisting of a 1 multiplied by 1 convolution layer, a GN layer and a ReLU layer; and the outputs of the two branch extraction layers after fusion by the Concat layer pass through the convolution module, and the output of the convolution module is the output of the feature fusion layer.
The first and second upsampling layers are set to 2 times upsampling and 8 times upsampling, respectively.
Before the output of the first up-sampling layer is fused with the output of the second feature extraction layer, the output of the first up-sampling layer passes through a convolution module consisting of a 1 × 1 convolution layer, a GN layer and a ReLU layer, and the output of the convolution module is fused with the output of the second feature extraction layer through the Concat layer.
In the output layer, the convolution kernels of the convolution layers in the two fourth convolution modules are both 3.
Example 3
The embodiment of the invention provides a hyperspectral image classification method, which adopts the global learning device for hyperspectral image classification in the embodiments and comprises the following steps:
s301: dividing a data set into a training set, a verification set and a test set by adopting the universal global random stratified (UGSS) sampling strategy; the data set is the set formed by all extracted ground object sample data after ground object samples are extracted from the hyperspectral image;
the UGSS sampling strategy takes the whole hyperspectral image as the input of a classification model, and the specific idea is as follows: a fixed number of training samples is set and if the total number of samples of a feature is insufficient to provide the fixed number of samples, proportional extraction is used. Meanwhile, during training, the extracted training samples are divided into a training set and a verification set again according to a set proportion. All the extracted samples are divided into groups with the specified parameter quantity according to the set parameters, data of the whole graph is input during input, but only the extracted samples are updated during updating, so that the effect of layered training is achieved, the model can be ensured to be converged during training, and the mode of extracting the samples enables a UGSS sampling strategy to be universally used for most training tasks.
For example, before training the network on a data set, samples of all ground features are extracted to form the training and verification sets, and the remainder serves as the test set. If 200 samples are to be extracted but some ground feature has only 150 samples, the invention extracts according to a set proportion: with a proportion of 0.8, 120 samples are extracted as the training and verification sets and the remainder serves as the test set. The extracted 120 samples are then divided into a training set and a verification set according to the set proportion, and the extracted training samples are evenly distributed into each group according to the set number of groups (a hyperparameter). It should be noted that although the training, verification and test sets are divided, all samples are still input at each training step; only the samples belonging to the training set participate in the gradient descent operation, and the remaining samples do not.
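The per-class extraction logic described above can be sketched in plain Python. This is a minimal sketch under the stated parameters (fixed count 200, extraction proportion 0.8, train/verification split 0.75/0.25); the function name and the dict-based input format are illustrative assumptions, not the patent's implementation.

```python
import random

def ugss_split(labels, fixed=200, ratio=0.8, train_frac=0.75, seed=0):
    """UGSS-style per-class split sketch (names and I/O format are illustrative).

    labels: dict mapping class id -> list of sample indices for that class.
    Returns (train, val, test) index lists.
    """
    rng = random.Random(seed)
    train, val, test = [], [], []
    for cls, idx in labels.items():
        idx = idx[:]
        rng.shuffle(idx)
        # Extract a fixed number of samples, or a proportion if too few.
        n_extract = fixed if len(idx) >= fixed else int(len(idx) * ratio)
        extracted, rest = idx[:n_extract], idx[n_extract:]
        # Re-split the extracted samples into training and verification sets.
        n_train = int(len(extracted) * train_frac)
        train += extracted[:n_train]
        val += extracted[n_train:]
        test += rest
    return train, val, test
```

For a class with 150 samples this extracts 120 (90 train, 30 verification) and leaves 30 for test, matching the worked example in the text; a class with 400 samples yields 200 extracted (150/50) and 200 for test.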
S302: training the global learning device by utilizing a training set and a verification set to obtain a trained classification model;
specifically, model parameters are trained by adopting a random gradient descent algorithm based on a training set, and a model with the minimum loss rate stored on a verification set is an optimal classification model.
S303: and classifying the test set by using the classification model.
As an implementation manner, the SSA module processes input data by sequentially using formula (1), formula (2), and formula (3); fig. 3 shows the structure of the SSA module.

z_c = F_sq(u_c) = (1/(H × W)) Σ_{i=1}^{H} Σ_{j=1}^{W} u_c(i, j)  (1)

s_c = F_ex(z_c, W) = W_2(ReLU(GN(W_1(Shuffle(z_c)))))  (2)

x̃_c = F_scale(u_c, s_c) = s_c · u_c  (3)

wherein z_c represents the encoded value of all the pixel values of the c-th band, H and W represent the height and width of the hyperspectral image respectively, u_c(i, j) represents the pixel in the i-th row and j-th column of the c-th band, Shuffle represents a shuffle function that scrambles the spectral dimension to increase interactivity, W_1 and W_2 denote two fully connected layers, s_c represents the intermediate output of the c-th band after processing by the SSA module, and x̃_c represents the final output of the c-th band after processing by the SSA module.
On the basis of the above embodiment, before step S301, the method further includes: performing a zero-mean standardization operation on the labeled samples of the input hyperspectral image.
Specifically, the zero-mean normalization operation can be expressed by equation (4):

x̃_{ij}^n = (x_{ij}^n − μ_n) / σ_n  (4)

wherein x_{ij}^n represents the pixel value in the i-th row and j-th column of the n-th band of the labeled sample, μ_n represents the mean of all pixels in the n-th band, σ_n represents the standard deviation of the pixel values of the n-th band, and W, H and N respectively represent the width, height and total number of bands of the hyperspectral image.
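Equation (4) is a per-band standardization and translates directly to NumPy; the (N, H, W) band-first layout is an assumption for illustration:

```python
import numpy as np

def zero_mean_normalize(x):
    """Per-band zero-mean, unit-variance normalization, equation (4).

    x: (N, H, W) hyperspectral cube, one (H, W) slice per band.
    """
    mu = x.mean(axis=(1, 2), keepdims=True)      # per-band mean
    sigma = x.std(axis=(1, 2), keepdims=True)    # per-band standard deviation
    return (x - mu) / sigma
```

After this operation every band has mean 0 and standard deviation 1, which puts all N bands on a common scale before training.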
In order to verify the effect of the device and the method, the invention also provides the following experimental data.
1. Experimental Environment
Hardware equipment: CPU is Intel (R) Xeon (R) E5-2682 [email protected], GPU is NVIDIA GeForce RTX 3060;
a software platform: the Python version is 3.7, the Cuda version is 11.0.194, and the model structure is built by using a deep learning framework with the Pythroch version 1.8.0.
2. Experimental data set
In order to measure the classification effect of the invention, three benchmark hyperspectral data sets are selected for experimental study: Indian Pines (IP), Pavia University (PU) and Salinas (SA). The details of the three data sets are shown in table 1.
TABLE 1 detailed information of the three data sets
3. Experimental setup
In order to make the size of the input image meet the down-sampling requirement, the spectral dimension of the input image is kept at a multiple of 16 by zero padding; the number of groups in group normalization is set to 16, ensuring that the number of channels after each down-sampling remains a multiple of 16.
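The spectral zero-padding step can be sketched as below; the function name and the (H, W, N) layout are illustrative assumptions:

```python
import numpy as np

def pad_spectral_to_multiple(cube, multiple=16):
    """Zero-pad the spectral axis of an (H, W, N) cube so N becomes a multiple of
    `multiple`, as required by the down-sampling / group-normalization setup."""
    n = cube.shape[-1]
    pad = (-n) % multiple                 # bands to append (0 if already aligned)
    if pad == 0:
        return cube
    zeros = np.zeros(cube.shape[:-1] + (pad,), dtype=cube.dtype)
    return np.concatenate([cube, zeros], axis=-1)
```

For example, a 200-band cube (as in the IP data set after band removal) would be padded to 208 bands.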
For all experiments, the global learning device is optimized using SGD with a poly learning-rate strategy: the learning rate is the initial rate of 0.001 multiplied by (1 − iter / max_iter)^power, where the power value is set to 0.9 and the momentum to 0.9. No data augmentation strategy is adopted. A fixed 200 samples per class are extracted from each data set; if the total number of samples of a certain land-cover class is insufficient, a proportion parameter of 0.8 is applied. Of the extracted samples, the training set accounts for a proportion of 0.75 and the validation set for 0.25. For the selected samples, the number of divided groups is set to 10. To evaluate the performance of the method of the invention, three common indicators are used: Overall Accuracy (OA), Average Accuracy (AA) and the Kappa coefficient (Kappa).
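The poly learning-rate schedule described above can be sketched in a few lines; the function name is illustrative:

```python
def poly_lr(base_lr, iteration, max_iter, power=0.9):
    """Poly schedule used with SGD: lr = base_lr * (1 - iter/max_iter) ** power."""
    return base_lr * (1 - iteration / max_iter) ** power

# with base_lr = 0.001 and power = 0.9 the rate decays smoothly toward 0
lrs = [poly_lr(0.001, it, 100) for it in range(0, 100, 25)]
```

In a PyTorch training loop this value would typically be written into each optimizer parameter group before the step.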
4. Comparison between different classification models
Several typical classification models from recent years, SVM-RBF (Document 1), 1D-CNN (Document 2), M3D-DCNN (Document 3), SSRN (Document 4), DBDA (Document 5), and the classification model of the invention (abbreviated as Proposed), are selected for detailed comparison. Tables 2, 3 and 4 show the final classification results of the different models on the IP, PU and SA data sets, respectively. Wherein:
Document 1: Kuo B C, Ho H H, Li C H, et al. A Kernel-Based Feature Selection Method for SVM With RBF Kernel for Hyperspectral Image Classification [J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2014, 7(1): 317-326;
Document 2: Wei H, Yangyu H, Li W, et al. Deep Convolutional Neural Networks for Hyperspectral Image Classification [J]. Journal of Sensors, 2015, 2015: 1-12;
Document 3: M. He, B. Li and H. Chen, "Multi-scale 3D deep convolutional neural network for hyperspectral image classification," 2017 IEEE International Conference on Image Processing (ICIP), Beijing, 2017, pp. 3904-3908, doi: 10.1109/ICIP.2017.8297014;
Document 4: Z. Zhong, J. Li, Z. Luo, and M. Chapman, "Spectral-spatial residual network for hyperspectral image classification: A 3-D deep learning framework," IEEE Transactions on Geoscience and Remote Sensing, vol. 56, no. 2, pp. 847-858, 2017;
Document 5: R. Li, S. Zheng, C. Duan, Y. Yang, and X. Wang, "Classification of hyperspectral image based on double-branch dual-attention mechanism network," Remote Sensing, vol. 12, no. 3, p. 582, 2020.
TABLE 2 comparison of classification accuracy of different models on IP datasets
Class of ground object SVM-RBF 1D-CNN M3D-DCNN SSRN DBDA Proposed
C1 58.77 92.22 96.67 100 88.1 100
C2 77.96 81.68 86.9 98.11 98.62 99.26
C3 71.68 81.44 90.59 98.02 98.34 97.79
C4 19.93 92.97 99.73 91.62 94.22 100
C5 89.84 94.59 98.34 99.08 99.09 100
C6 97.01 97.81 99.74 99.61 98.29 100
C7 75.88 98 100 96.67 83.83 100
C8 99.13 98.67 99.93 99.93 99.96 100
C9 51.78 100 100 100 93.33 100
C10 72.46 90.4 91.27 92.71 94.45 99.75
C11 90.22 72.36 78.29 99.33 99.45 99.34
C12 73.16 91.6 95.24 98.07 98.91 99.47
C13 72.62 100 100 96.67 93.81 100
C14 97.9 91.35 95.76 99.78 99.79 100
C15 57.92 85.16 98.44 97.03 98.59 100
C16 78.59 98.33 100 90.62 85.08 100
OA(%) 82.32±0.753 84.07±1.05 88.95±1.46 98.16±0.46 98.48±0.521 99.46±0.09
AA(%) 74.05±1.71 91.66±0.95 95.68±0.64 97.33±0.955 95.24±2.277 99.73±0.03
Kappa x 100 79.38±0.872 81.44±1.14 87.07±1.66 97.83±0.543 98.20±0.615 99.35±0.1
TABLE 3 comparison of classification accuracy of different models on PU data set
Class of ground object SVM-RBF 1D-CNN M3D-DCNN SSRN DBDA Proposed
C1 96.83 85.27 94.59 99.9 99.56 99.98
C2 97.41 88.08 96.67 99.11 99.92 99.97
C3 76.04 82.91 93.52 92.14 98.98 100
C4 87.68 96.49 98.38 99.47 97.27 99.63
C5 97.56 99.74 100 100 99.9 100
C6 76.75 91.5 97.73 94.2 96.27 100
C7 66.36 91.6 97.69 98.17 99.32 100
C8 85.76 85.15 94.18 97.4 97.1 99.96
C9 99.87 99.88 99.83 99.78 97.55 100
OA(%) 90.34±0.678 88.78±2.25 96.41±0.84 97.9±1.83 98.90±0.607 99.95±0.00
AA(%) 87.14±0.694 91.18±0.86 96.95±0.43 97.80±1.3 98.46±0.630 99.95±0.00
Kappa x 100 87.22±0.868 85.26±2.78 95.2±1.11 97.20±2.4 98.53±0.809 99.94±0.00
TABLE 4 comparison of classification accuracy of different models on SA data set
From the experimental results in Tables 2 to 4, it can be seen that the OA, AA and Kappa values of the classification model proposed by the invention are higher than those of the other classification models on all three data sets. For the SVM-RBF and 1D-CNN models, which classify using only spectral features, the accuracy is significantly lower than that of M3D-DCNN, SSRN, DBDA and the proposed model, which use spatial-spectral features, showing that spectral features alone cannot achieve high classification accuracy. In the IP data set, land-cover classes 2 and 11 contain similar information, and the accuracy of these two classes in M3D-DCNN is not ideal; for classes 1, 7 and 9, which have few samples, the accuracy of class 7 in SSRN is only 96.67%, and the accuracies of these classes in DBDA are 88.1%, 83.83% and 93.33% respectively, indicating that these models lack the ability to handle small-sample data. The proposed method achieves more than 99% accuracy on classes with similar bands or few samples, showing that the SSA module can accurately select bands and that the UGSS sampling strategy handles the small-sample problem well. Moreover, the proposed method has the smallest standard deviations among all classification methods, showing very good stability.
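The three indicators reported above (OA, AA and Kappa) can all be derived from a confusion matrix. A small sketch, with an illustrative function name:

```python
import numpy as np

def classification_scores(conf):
    """Compute OA, AA and Kappa from a square confusion matrix
    (rows = true classes, columns = predicted classes)."""
    conf = np.asarray(conf, dtype=np.float64)
    total = conf.sum()
    oa = np.trace(conf) / total                                   # overall accuracy
    aa = np.mean(np.diag(conf) / conf.sum(axis=1))                # mean per-class accuracy
    pe = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / total**2   # chance agreement
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa

# usage: a perfectly diagonal matrix scores 1.0 on all three indicators
oa, aa, kappa = classification_scores([[5, 0], [0, 5]])  # → (1.0, 1.0, 1.0)
```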
In addition, Figs. 4-6 show the classification results of the different classification models on the IP, PU and SA data sets, respectively, where (a) is a false-color image and (b) is the label image. As is clear from Figs. 4-6, classification using only spectral features, as in the SVM-RBF and 1D-CNN models, produces many noise points; methods based on spatial-spectral features, such as the M3D-DCNN, SSRN and DBDA models, overcome this disadvantage and achieve better results. Compared with the ground-truth image, the classification result of the proposed model is smoother.
TABLE 5 comparison of parameters, training time and testing time of different classification models on three datasets
As can be seen from the data in Table 5, on the IP data set the training time of the proposed classification model is far shorter than that of all comparison models, at only 223.7 seconds. On the PU data set, the proposed model has a clear advantage in training time over most methods; SVM-RBF is a simple model and the DBDA code uses an early-stopping strategy, so the training time of the proposed model is longer than theirs, but its accuracy is much higher than both and approaches saturation. On the SA data set, SVM-RBF and DBDA are again faster, but the training time of the proposed model is very close to these two. On all data sets, the proposed model achieves the highest classification accuracy. Its test time is the shortest on all three data sets, at only 0.06, 0.33 and 0.21 seconds respectively, showing that once trained the proposed model can be readily applied to related work.
In conclusion, the global learning device and classification method for hyperspectral image classification provided by the invention ensure high accuracy in classification work while achieving a faster classification speed. In addition, the classification model fully extracts the features of the hyperspectral image and still shows high classification accuracy on land-cover classes with small sample sizes.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. Global learning apparatus for hyperspectral image classification, comprising: an encoder and a decoder; the encoder sequentially comprises a spectrum dimension adjusting layer, a first feature extraction layer and a second feature extraction layer according to an image processing sequence; the first feature extraction layer comprises three MLBSA structural layers which are stacked together; the MLBSA structural layer comprises three shuffling spectral attention SSA modules, two MLB layers and a down-sampling layer; wherein the SSA module and the MLB layer are stacked to cross each other; the downsampling layer is used as the last sublayer of the MLBSA structural layer; the input of the first SSA module passes through a Zero-padded convolution module and then is subjected to addition fusion operation with the output of the down-sampling layer to obtain an output which is used as the output of the MLBSA structural layer; the MLB layer represents a modified linear bottleneck layer;
the decoder sequentially comprises a first up-sampling layer, a Concat layer and an output layer according to an image processing sequence; and the output of the first MLBSA structural layer in the first feature extraction layer is used as the input of the first up-sampling layer, the output of the second feature extraction layer and the output of the first up-sampling layer are fused through the Concat layer, and the fusion result is processed through the output layer to complete the classification of the hyperspectral images.
2. The global learning device for hyperspectral image classification according to claim 1, wherein the spectral dimension adjustment layer comprises three sublayers, namely an SSA module, a 1 x 1 convolutional layer and an MLB layer from a shallow layer to a deep layer.
3. The global learning apparatus for hyperspectral image classification according to claim 1, wherein the MLB layer comprises a first convolution module, a second convolution module and a third convolution module stacked in sequence; the first convolution module and the second convolution module are sequentially composed of a convolution layer, a GN layer and a ReLU layer from a shallow layer to a deep layer; the third convolution module includes a convolution layer and a GN layer in this order.
4. The global learning apparatus for hyperspectral image classification according to claim 1, wherein the second feature extraction layer comprises an SSA module, a first branch extraction layer for extracting global information, a second branch extraction layer for extracting local information, a feature fusion layer and a second up-sampling layer; the output of the SSA module passes through the first branch extraction layer and the second branch extraction layer respectively; and the output of the two branch extraction layers is subjected to feature fusion through the feature fusion layer, and the output of the feature fusion layer after passing through the second upper sampling layer is used as the output of the second feature extraction layer.
5. The global learning apparatus for hyperspectral image classification according to claim 4, wherein the first branch extraction layer comprises two connected cross-attention CCA modules.
6. The global learning apparatus for hyperspectral image classification according to claim 4, wherein the second branch extraction layer adopts a void space pyramid pooling ASPP structure.
7. The global learning device for hyperspectral image classification according to claim 1, wherein the output layer comprises three sublayers, namely two fourth convolution modules and a 1 x 1 convolution layer from a shallow layer to a deep layer; the fourth convolution module is composed of a convolution layer, a GN layer and a ReLU layer from a shallow layer to a deep layer in sequence.
8. The hyperspectral image classification method based on the device of any one of claims 1 to 7 is characterized by comprising the following steps:
dividing a data set into a training set, a verification set and a test set by adopting a universal global random layering UGSS sampling strategy; the data set is a set formed by all extracted surface feature sample data after surface feature sample extraction is carried out on the hyperspectral image;
training the device according to any one of claims 1 to 7 by using a training set and a validation set to obtain a trained classification model;
and classifying the test set by using the classification model.
9. The hyperspectral image classification method according to claim 8, wherein the SSA module processes input data sequentially using formula (1), formula (2) and formula (3):
z_c = F_sq(u_c) = (1 / (H × W)) Σ_{i=1}^{H} Σ_{j=1}^{W} u_c(i, j)    (1)
s_c = F_ex(z_c, W) = W_2(ReLU(GN(W_1(Shuffle(z_c)))))    (2)
ũ_c = F_scale(u_c, s_c) = s_c · u_c    (3)
wherein z_c represents the encoded value of all the pixel values of the c-th band, H and W respectively represent the height and width of the hyperspectral image, u_c(i, j) represents the pixel in the i-th row and j-th column of the c-th band, Shuffle represents a shuffle function that scrambles the spectral dimension to increase interactivity, W_1 and W_2 denote two fully connected layers, s_c represents the intermediate output of the c-th band after being processed by the SSA module, and ũ_c represents the final output of the c-th band after being processed by the SSA module.
CN202210563560.2A 2022-05-23 2022-05-23 Global learning device and method for hyperspectral image classification Pending CN114898157A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210563560.2A CN114898157A (en) 2022-05-23 2022-05-23 Global learning device and method for hyperspectral image classification

Publications (1)

Publication Number Publication Date
CN114898157A true CN114898157A (en) 2022-08-12

Family

ID=82723154



Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116704328A (en) * 2023-04-24 2023-09-05 中国科学院空天信息创新研究院 Ground object classification method, device, electronic equipment and storage medium
CN116563649A (en) * 2023-07-10 2023-08-08 西南交通大学 Tensor mapping network-based hyperspectral image lightweight classification method and device
CN116563649B (en) * 2023-07-10 2023-09-08 西南交通大学 Tensor mapping network-based hyperspectral image lightweight classification method and device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination