CN113470036B - Hyperspectral image unsupervised waveband selection method and system based on knowledge distillation - Google Patents

Hyperspectral image unsupervised waveband selection method and system based on knowledge distillation Download PDF

Info

Publication number
CN113470036B
CN113470036B
Authority
CN
China
Prior art keywords
image
hyperspectral image
layer
module
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111023434.XA
Other languages
Chinese (zh)
Other versions
CN113470036A (en)
Inventor
李树涛
胡耀宸
卢婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University filed Critical Hunan University
Priority to CN202111023434.XA priority Critical patent/CN113470036B/en
Publication of CN113470036A publication Critical patent/CN113470036A/en
Application granted granted Critical
Publication of CN113470036B publication Critical patent/CN113470036B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10032 Satellite or aerial image; Remote sensing
    • G06T2207/10036 Multispectral image; Hyperspectral image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a knowledge distillation-based unsupervised band selection method and system for hyperspectral images. The method comprises: dividing a hyperspectral image into image blocks; training a teacher network to extract spatial-spectral features from the image blocks of the hyperspectral image; training a student network to estimate the band weight vector corresponding to each image block, wherein the student network has a simpler structure than the teacher network and models the global nonlinear relationship among bands through a channel attention module; and calculating the importance weight W of each band from the band weights of all image blocks, sorting the importance weights W, and taking the top L bands as the obtained optimal band subset. The invention introduces the idea of knowledge distillation in deep neural networks into band selection: the teacher network guides the training of the structurally simpler student network, so that errors are easier to back-propagate, the importance weights of the bands are learned, and an optimal selection of a small number of representative bands is achieved.

Description

Hyperspectral image unsupervised waveband selection method and system based on knowledge distillation
Technical Field
The invention relates to hyperspectral image processing technology, in particular to a knowledge distillation-based unsupervised band selection method and system for hyperspectral images.
Background
Hyperspectral imaging is a novel imaging technology that integrates image and spectrum: hyperspectral sensors (i.e. imaging spectrometers) carried on different space platforms image a target area in dozens to hundreds of continuous, finely divided spectral bands across the ultraviolet, visible, near-infrared and mid-infrared ranges, simultaneously acquiring rich spatial and spectral information. A hyperspectral remote sensing image has a three-dimensional data structure; its spectrum is characterized by continuous spectral channels and a large number of bands, and the spectral resolution can reach the nanometer level. This greatly enhances remote sensing earth observation and the ability to discriminate different ground objects, so hyperspectral remote sensing images are widely applied in fields such as mineral exploration, land-cover classification and target detection.
Due to complex factors such as the imaging equipment, the atmospheric environment and the transmission medium, hyperspectral images inevitably contain noise that affects subsequent image processing and interpretation. In addition, hyperspectral images have many bands, a high spectral dimension and a large data volume, which challenges practical applications. First, the strong correlation among the bands of a hyperspectral image leads to redundant information in the band images. Second, because of the high dimensionality and massive size of hyperspectral images, the computational load during transmission, storage and processing is heavy. Finally, high-dimensional hyperspectral image classification suffers from the curse of dimensionality, the well-known Hughes phenomenon: as the number of bands keeps increasing, classification performance first improves and then declines. Therefore, how to reduce the data dimensionality, remove redundant and noisy information from hyperspectral images, and improve the effectiveness and efficiency of data utilization while retaining the important spatial-spectral information has become an important problem to be solved urgently.
Band selection methods can be divided into two categories, supervised and unsupervised, according to whether label information is required. Supervised band selection methods generally take the land-cover class information of the image as label information and select bands that improve class separability to form a band subset, whereas unsupervised band selection methods design algorithms that select bands based on the characteristics of the data themselves. In practical applications, however, obtaining a large amount of accurately labeled sample information from hyperspectral remote sensing images is time-consuming and labor-intensive, so unsupervised algorithms have been widely studied. For example, Chang et al. in "Constrained band selection for hyperspectral imagery, IEEE Transactions on Geoscience and Remote Sensing, 2006, 44(6): 1575-1585" propose a linearly constrained minimum variance method based on band dependence to select important bands. Zhu et al. in "Unsupervised hyperspectral band selection by dominant set extraction, IEEE Transactions on Geoscience and Remote Sensing, 2015, 54(1): 227-" propose a band selection method based on dominant set extraction. Wei et al. in "Scalable one-pass self-representation learning for hyperspectral band selection, IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(7): 4360-4374" propose a hyperspectral image band selection algorithm based on self-representation learning. Wang et al. in "Hyperspectral band selection via optimal neighborhood reconstruction, IEEE Transactions on Geoscience and Remote Sensing, 2020, 58(12): 8465-" propose an optimal neighborhood reconstruction algorithm to solve the band selection problem. The drawback of these methods is that the global nonlinear relationship between the spectral bands of the image is not fully exploited.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the problems in the prior art, the invention provides a knowledge distillation-based unsupervised band selection method and system for hyperspectral images. The aim is to establish a global nonlinear mapping model over the spectral bands (the student network) by combining the spatial and spectral characteristics of the hyperspectral image, to guide the student network through a teacher network to select an important band subset with low correlation and a large amount of information, and thereby to improve subsequent land-cover classification performance.
In order to solve the technical problems, the invention adopts the technical scheme that:
a knowledge distillation-based hyperspectral image unsupervised waveband selection method comprises the following steps:
1) dividing the hyperspectral image into image blocks;
2) training a teacher network to extract spatial-spectral features from the image blocks of the hyperspectral image;
3) training a student network for estimating the band weight vector corresponding to each image block, based on the spatial-spectral features extracted by the teacher network from the image blocks of the hyperspectral image samples, wherein the student network has a simpler structure than the teacher network and models the global nonlinear relationship between bands through a channel attention module;
4) calculating the importance weight W of each band from the band weights of all image blocks, sorting the resulting importance weights W, and taking the top L bands as the obtained optimal band subset.
Optionally, the teacher network in step 2) is a three-dimensional convolutional auto-encoder network, and the teacher network extracts spatial-spectral features according to:
$$h_i^k = f\left(f\left(x_i \omega^{(k-1)} + b^{(k-1)}\right)\omega^{(k)} + b^{(k)}\right)$$
In the above formula, h_i^k denotes the spatial-spectral feature output by the k-th layer of the teacher network for the i-th image block x_i of the hyperspectral image, f is the linear rectification activation function, ω^(k-1) and b^(k-1) are the convolution kernel parameters and bias of the (k-1)-th layer of the teacher network, and ω^(k) and b^(k) are the convolution kernel parameters and bias of the k-th layer of the teacher network.
Optionally, the three-dimensional convolutional auto-encoder network comprises a first encoding layer, a second encoding layer, a pooling layer, a first decoding layer and a second decoding layer connected in sequence. The first encoding layer and the second encoding layer each comprise a convolution module, a batch normalization module and a nonlinear activation function module; the first decoding layer and the second decoding layer each comprise a deconvolution module, a batch normalization module and a nonlinear activation function module. An input image block of the hyperspectral image is mapped to a new feature space by the first and second encoding layers, the spatial-spectral feature of the image block is obtained through the pooling layer, and the reconstructed image block is obtained from the spatial-spectral feature through the first and second decoding layers.
Optionally, the student network in step 3) comprises a channel attention module, a weighting module and a nonlinear mapping module connected in sequence. The channel attention module is used to obtain the band weight vector w_i of the i-th image block x_i of the hyperspectral image by modeling the global nonlinear relationship between bands; the weighting module is used to weight the i-th image block x_i of the hyperspectral image with the corresponding band weight vector w_i to obtain the weighted image block; and the nonlinear mapping module is used to map the weighted image block to a new feature space and extract features.
Optionally, the channel attention module obtains the band weight vector w_i of the i-th image block x_i of the hyperspectral image by modeling the global nonlinear relationship between bands as follows: a global average pooling operation avgpool and a global max pooling operation maxpool are applied to the i-th image block x_i along the spatial axes, giving the mean and the maximum of each channel feature, i.e. two vectors of size 1×D, where D is the number of bands; each 1×D vector is forwarded to a shared multi-layer perceptron MLP, the outputs of the shared multi-layer perceptron MLP are summed, and the sum is passed through a Sigmoid activation function to obtain the band weight vector w_i of the i-th image block x_i of the hyperspectral image.
Optionally, the nonlinear mapping module comprises a first feature extraction layer, a second feature extraction layer and a pooling layer connected in sequence, the first feature extraction layer and the second feature extraction layer each comprise a convolution module, a batch normalization module and a nonlinear activation function module connected in sequence, and the nonlinear mapping module extracts features according to:
$$y_i^j = f\left(f\left(x_i \omega^{(j-1)} + b^{(j-1)}\right)\omega^{(j)} + b^{(j)}\right)$$
In the above formula, y_i^j denotes the feature output by the j-th layer of the nonlinear mapping module for the i-th image block x_i of the hyperspectral image, f is the linear rectification activation function, ω^(j-1) and b^(j-1) are the convolution kernel parameters and bias of the (j-1)-th layer of the nonlinear mapping module, and ω^(j) and b^(j) are the convolution kernel parameters and bias of the j-th layer of the nonlinear mapping module.
Optionally, the loss function used in step 3) when training the student network for estimating the band weight vector corresponding to each image block is:
$$Loss_s = \frac{1}{MN}\sum_{i=1}^{M\times N}\left(\left\|h_i^3 - y_i^3\right\|_2^2 + \lambda\left\|w_i\right\|_1\right)$$
In the above formula, Loss_s denotes the loss function, M×N is the spatial size of the hyperspectral image, h_i^3 is the spatial-spectral feature output by the teacher network, y_i^3 is the feature output by the student network, λ is the regularization coefficient, and w_i is the band weight vector of the i-th image block x_i of the hyperspectral image.
Optionally, the band importance weight W of each band in step 4) is calculated according to:
$$W = \frac{1}{MN}\sum_{i=1}^{M\times N} w_i$$
In the above formula, M×N is the spatial size of the hyperspectral image and w_i is the band weight vector of the i-th image block x_i of the hyperspectral image.
In addition, the invention also provides a knowledge distillation-based hyperspectral image unsupervised waveband selection system which comprises a microprocessor and a memory which are connected with each other, wherein the microprocessor is programmed or configured to execute the steps of the knowledge distillation-based hyperspectral image unsupervised waveband selection method.
Furthermore, the invention also provides a computer readable storage medium having stored therein a computer program programmed or configured to execute the knowledge distillation based hyperspectral image unsupervised waveband selection method.
Compared with the prior art, the method has the following advantages:
1. The invention introduces the idea of knowledge distillation in deep neural networks into band selection: the teacher network, which has a more complex network structure, learns an effective feature representation of the hyperspectral image; by having the teacher network guide the training of the structurally simpler student network, errors are easier to back-propagate, so that the importance weights of the bands are learned and an optimal selection of a small number of representative bands is achieved.
2. The student network of the invention models the global nonlinear relationship between bands through the channel attention module, obtains importance weights that reflect the interrelations of the bands, and uses them to select bands.
Drawings
FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a training process of a teacher network and a student network in the embodiment of the invention.
Fig. 3 is a schematic structural diagram of a channel attention module according to an embodiment of the present invention.
FIG. 4 shows the overall accuracy comparison curves of the method of this embodiment and the existing methods as the number of selected bands L increases from 5 to 30.
FIG. 5 shows the Kappa coefficient comparison curves of the method of this embodiment and the existing methods as the number of selected bands L increases from 5 to 30.
FIG. 6 shows the average accuracy comparison curves of the method of this embodiment and the existing methods as the number of selected bands L increases from 5 to 30.
Detailed Description
As shown in FIG. 1, the knowledge distillation-based unsupervised band selection method for hyperspectral images of this embodiment includes:
1) dividing the hyperspectral image into image blocks;
2) training a teacher network to extract spatial-spectral features from the image blocks of the hyperspectral image;
3) training a student network for estimating the band weight vector corresponding to each image block, based on the spatial-spectral features extracted by the teacher network from the image blocks of the hyperspectral image samples, wherein the student network has a simpler structure than the teacher network and models the global nonlinear relationship between bands through a channel attention module;
4) calculating the importance weight W of each band from the band weights of all image blocks, sorting the resulting importance weights W, and taking the top L bands as the obtained optimal band subset.
Referring to FIG. 2, the teacher network and the student network in this embodiment form a Teacher-Student Network; that is, this embodiment uses a teacher-student network framework to implement band selection based on the idea of knowledge distillation. For convenience of description, the method of this embodiment is referred to as the Teacher-Student network based Band Selection method (TSBS).
In this embodiment, the hyperspectral image is denoted X = {x_1, x_2, …, x_{M×N}} ∈ R^{M×N×D}, where M×N is its spatial size and D is the number of bands, and x_i denotes an image block of size n×n. As an optional implementation, n is set to 5 in this embodiment, i.e. the hyperspectral image is divided into 5×5 image blocks centered on each pixel. The purpose of band selection is to select a band subset S ∈ R^{M×N×L} that retains the critical information, where L is much smaller than D.
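As a concrete illustration of step 1), the following NumPy sketch extracts 5×5 blocks centered on every pixel. The function name and the reflect padding used at the image border are assumptions for illustration only; they are not specified in the patent.

```python
# A minimal sketch of step 1): splitting a hyperspectral cube into n x n image
# blocks centred on each pixel (n = 5 in this embodiment).
import numpy as np

def extract_blocks(hsi, n=5):
    """hsi: array of shape (M, N, D). Returns blocks of shape (M*N, n, n, D)."""
    M, N, D = hsi.shape
    r = n // 2
    padded = np.pad(hsi, ((r, r), (r, r), (0, 0)), mode="reflect")
    blocks = np.empty((M * N, n, n, D), dtype=hsi.dtype)
    idx = 0
    for i in range(M):
        for j in range(N):
            blocks[idx] = padded[i:i + n, j:j + n, :]
            idx += 1
    return blocks

# Example: a small random cube with 103 bands, as in the denoised PaviaU image.
blocks = extract_blocks(np.random.rand(20, 20, 103).astype(np.float32))
print(blocks.shape)  # (400, 5, 5, 103)
```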
As an optional implementation, the teacher network in step 2) of this embodiment is a three-dimensional convolutional auto-encoder network, which can make maximal use of the spatial and spectral information of the hyperspectral image to extract effective features.
The teacher network extracts spatial-spectral features according to:
$$h_i^k = f\left(f\left(x_i \omega^{(k-1)} + b^{(k-1)}\right)\omega^{(k)} + b^{(k)}\right)$$
In the above formula, h_i^k denotes the spatial-spectral feature output by the k-th layer of the teacher network for the i-th image block x_i of the hyperspectral image, f is the linear rectification activation function (ReLU for short), ω^(k-1) and b^(k-1) are the convolution kernel parameters and bias of the (k-1)-th layer of the teacher network, and ω^(k) and b^(k) are the convolution kernel parameters and bias of the k-th layer of the teacher network. As an optional implementation, k = {1, 2, …, 5} in this embodiment.
Referring to FIG. 2, in this embodiment the three-dimensional convolutional auto-encoder network comprises a first encoding layer, a second encoding layer, a pooling layer, a first decoding layer and a second decoding layer connected in sequence. The first encoding layer and the second encoding layer each comprise a Convolution module, a Batch Normalization module and a nonlinear activation function (ReLU) module; the first decoding layer and the second decoding layer each comprise a Deconvolution module, a Batch Normalization module and a nonlinear activation function (ReLU) module. An image block of the input hyperspectral image is mapped to a new feature space by the first and second encoding layers, the spatial-spectral feature of the image block is obtained through the pooling layer, and the reconstructed image block is obtained from the spatial-spectral feature through the first and second decoding layers. In this embodiment the spatial-spectral feature is output by the pooling layer and k takes the value 3, so the spatial-spectral feature output by the teacher network is denoted h_i^3. The three-dimensional convolutional auto-encoder network consists of two parts: an encoder that maps the input to features, and a decoder that maps the features back to a reconstruction of the original input; in general, by minimizing the difference between the input and its reconstruction, features that retain the valid information can be obtained. In addition, the three-dimensional convolutional auto-encoder network automatically learns the spatial-spectral features of the i-th input image block x_i along the spatial dimension and along the spectral dimension, without manual design.
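To make this structure concrete, the following is a minimal PyTorch sketch of such a three-dimensional convolutional auto-encoder, treating the band axis of the 5×5×D block as the 3D depth. The channel counts (16, 32), kernel sizes and pooling choice are illustrative assumptions; the patent does not fix them.

```python
# Sketch of the teacher network: two encoding layers (Conv3d + BatchNorm + ReLU),
# a pooling layer yielding the spatial-spectral feature h_i^3, and two decoding
# layers (ConvTranspose3d + BatchNorm + ReLU) reconstructing the block.
import torch
import torch.nn as nn

class TeacherAutoencoder3D(nn.Module):
    def __init__(self, n=5, c1=16, c2=32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv3d(1, c1, 3, padding=1),
                                  nn.BatchNorm3d(c1), nn.ReLU(inplace=True))
        self.enc2 = nn.Sequential(nn.Conv3d(c1, c2, 3, padding=1),
                                  nn.BatchNorm3d(c2), nn.ReLU(inplace=True))
        # Pool over the n x n spatial window; the band (depth) axis is kept.
        self.pool = nn.AvgPool3d(kernel_size=(1, n, n))
        self.dec1 = nn.Sequential(nn.ConvTranspose3d(c2, c1, (1, n, n)),
                                  nn.BatchNorm3d(c1), nn.ReLU(inplace=True))
        self.dec2 = nn.Sequential(nn.ConvTranspose3d(c1, 1, 3, padding=1),
                                  nn.BatchNorm3d(1), nn.ReLU(inplace=True))

    def forward(self, x):                        # x: (B, 1, D, n, n)
        h = self.pool(self.enc2(self.enc1(x)))   # spatial-spectral feature h_i^3
        x_hat = self.dec2(self.dec1(h))          # reconstructed image block
        return h, x_hat

teacher = TeacherAutoencoder3D()
h, x_hat = teacher(torch.rand(4, 1, 103, 5, 5))
print(h.shape, x_hat.shape)  # (4, 32, 103, 1, 1) and (4, 1, 103, 5, 5)
```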
In this embodiment, the mean square error is used as the loss function when training the teacher network in step 2); its expression is:
$$Loss_t = \frac{1}{MN}\sum_{i=1}^{M\times N}\left\|x_i - \hat{x}_i\right\|_2^2$$
In the above formula, Loss_t denotes the loss function (reconstruction error), M×N is the spatial size of the hyperspectral image, x_i denotes the i-th image block of the hyperspectral image, which has a size of n×n, and x̂_i denotes the image block reconstructed by the teacher network from the i-th image block of the hyperspectral image. In this embodiment, the teacher network performs feature extraction on the hyperspectral image along the spatial dimension and along the spectral dimension, and the three-dimensional convolutional auto-encoder network is trained iteratively by minimizing the reconstruction error of the original hyperspectral image (the loss function shown above), yielding the effective spatial-spectral features h_i^3.
Referring to FIG. 2, the student network in step 3) of this embodiment comprises a channel attention module, a weighting module and a nonlinear mapping module connected in sequence. The channel attention module is used to obtain the band weight vector w_i of the i-th image block x_i of the hyperspectral image by modeling the global nonlinear relationship between bands; the weighting module is used to weight the i-th image block x_i of the hyperspectral image with the corresponding band weight vector w_i to obtain the weighted image block; and the nonlinear mapping module is used to map the weighted image block to a new feature space and extract features. The student network obtains the band importance weight vector by modeling the nonlinear relationship between bands with the channel attention module; in this process, bands containing richer effective information obtain higher weights. The input image block is then multiplied by the corresponding band weights, and the nonlinear mapping module maps the re-weighted image block to a new feature space.
As shown in FIG. 3, in this embodiment the channel attention module obtains the band weight vector w_i of the i-th image block x_i of the hyperspectral image by modeling the global nonlinear relationship between bands as follows: a global average pooling operation avgpool and a global max pooling operation maxpool are applied to the i-th image block x_i along the spatial axes, giving the mean and the maximum of each channel feature, i.e. two vectors of size 1×D, where D is the number of bands; each 1×D vector is forwarded to a shared multi-layer perceptron MLP, the outputs of the shared multi-layer perceptron MLP are summed, and the sum is passed through a Sigmoid activation function to obtain the band weight vector w_i of the i-th image block x_i of the hyperspectral image. Expressed as a formula:
$$w_i = \mathrm{Sigmoid}\left(MLP(avgpool(x_i)) + MLP(maxpool(x_i))\right)$$
The weighting module weights the i-th image block x_i of the hyperspectral image with the corresponding band weight vector w_i to obtain the weighted image block, which can be expressed as:
$$\tilde{x}_i = w_i \odot x_i$$
In the above formula, x̃_i denotes the weighted image block and ⊙ denotes element-wise (band-wise) multiplication.
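The following PyTorch sketch illustrates this channel attention step together with the subsequent band-wise weighting. The hidden-layer reduction ratio r of the shared MLP is an illustrative assumption; the patent does not state it.

```python
# Sketch of the channel attention and weighting modules in FIG. 3: spatial
# average and max pooling, a shared MLP, summation and a Sigmoid give the band
# weight vector w_i, which then re-weights the bands of the image block.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, num_bands, r=8):
        super().__init__()
        self.mlp = nn.Sequential(                 # shared multi-layer perceptron MLP
            nn.Linear(num_bands, num_bands // r),
            nn.ReLU(inplace=True),
            nn.Linear(num_bands // r, num_bands))

    def forward(self, x):                         # x: (B, D, n, n), bands as channels
        avg = self.mlp(x.mean(dim=(2, 3)))        # avgpool along the spatial axes
        mx = self.mlp(x.amax(dim=(2, 3)))         # maxpool along the spatial axes
        return torch.sigmoid(avg + mx)            # band weight vector w_i, shape (B, D)

attn = ChannelAttention(num_bands=103)
x = torch.rand(4, 103, 5, 5)                      # a batch of 5x5 image blocks
w = attn(x)
x_weighted = x * w[:, :, None, None]              # weighting module: w_i ⊙ x_i
print(w.shape, x_weighted.shape)                  # (4, 103) and (4, 103, 5, 5)
```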
Referring to FIG. 2, the nonlinear mapping module in this embodiment comprises a first feature extraction layer, a second feature extraction layer and a pooling layer connected in sequence. The first feature extraction layer and the second feature extraction layer each comprise a Convolution module, a Batch Normalization module and a nonlinear activation function (ReLU) module connected in sequence, and the nonlinear mapping module extracts features according to:
$$y_i^j = f\left(f\left(x_i \omega^{(j-1)} + b^{(j-1)}\right)\omega^{(j)} + b^{(j)}\right)$$
In the above formula, y_i^j denotes the feature output by the j-th layer of the nonlinear mapping module for the i-th image block x_i of the hyperspectral image, f is the linear rectification activation function, ω^(j-1) and b^(j-1) are the convolution kernel parameters and bias of the (j-1)-th layer of the nonlinear mapping module, and ω^(j) and b^(j) are the convolution kernel parameters and bias of the j-th layer of the nonlinear mapping module, where j < k; in this example j = {1, 2}.
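A minimal sketch of such a nonlinear mapping module is given below, treating the D bands of the weighted block as input channels. The layer widths and the adaptive average pooling are assumptions; in practice the output feature must have the same size as the teacher feature h_i^3 used in the distillation loss.

```python
# Sketch of the nonlinear mapping module: two feature extraction layers
# (Conv2d + BatchNorm + ReLU) followed by a pooling layer, applied to the
# band-weighted image block.
import torch
import torch.nn as nn

class NonlinearMapping(nn.Module):
    def __init__(self, num_bands, feat_dim=32):
        super().__init__()
        self.layer1 = nn.Sequential(nn.Conv2d(num_bands, 64, 3, padding=1),
                                    nn.BatchNorm2d(64), nn.ReLU(inplace=True))
        self.layer2 = nn.Sequential(nn.Conv2d(64, feat_dim, 3, padding=1),
                                    nn.BatchNorm2d(feat_dim), nn.ReLU(inplace=True))
        self.pool = nn.AdaptiveAvgPool2d(1)       # pooling layer

    def forward(self, x_weighted):                # x_weighted: (B, D, n, n) = w_i ⊙ x_i
        y = self.pool(self.layer2(self.layer1(x_weighted)))
        return y.flatten(1)                       # student feature y_i

mapper = NonlinearMapping(num_bands=103)
x_weighted = torch.rand(4, 103, 5, 5)             # already re-weighted image blocks
print(mapper(x_weighted).shape)                   # (4, 32)
```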
Referring to FIG. 2, in step 3) of this embodiment the loss function used when training the student network for estimating the band weight vector corresponding to each image block is:
$$Loss_s = \frac{1}{MN}\sum_{i=1}^{M\times N}\left(\left\|h_i^3 - y_i^3\right\|_2^2 + \lambda\left\|w_i\right\|_1\right)$$
In the above formula, Loss_s denotes the loss function, M×N is the spatial size of the hyperspectral image, h_i^3 is the spatial-spectral feature output by the teacher network, y_i^3 is the feature output by the student network, λ is the regularization coefficient, and w_i is the band weight vector of the i-th image block x_i of the hyperspectral image. In this embodiment, this loss function measures the reconstruction error of the effective features; the student network is trained by minimizing it, and the band weight vectors are updated. The first term on the right-hand side is the reconstruction error between the spatial-spectral feature of the teacher network and the feature output by the student network, and the second term is a sparsity constraint on the band weight vector w_i of the i-th image block x_i of the hyperspectral image. As an optional implementation, the regularization coefficient λ is set to 0.01 in this embodiment. A large value of the loss function means that the selected bands cannot be well mapped to the effective features of the teacher network. The channel attention module is iteratively optimized by minimizing the loss value during training, and the optimal band weight vectors are finally obtained.
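The following sketch computes a loss of this form, assuming a squared-error reconstruction term and an L1 sparsity term as suggested by the description; the exact norms are assumptions, since the text only names the two terms.

```python
# Sketch of the student loss: reconstruction error between the teacher feature
# h_i^3 and the student feature y_i^3, plus a lambda-weighted sparsity term on
# the band weight vector w_i (lambda = 0.01 in this embodiment).
import torch

def student_loss(h_teacher, y_student, w, lam=0.01):
    """h_teacher, y_student: (B, F) features; w: (B, D) band weight vectors."""
    recon = torch.mean(torch.sum((h_teacher - y_student) ** 2, dim=1))
    sparsity = torch.mean(torch.sum(w.abs(), dim=1))
    return recon + lam * sparsity

loss = student_loss(torch.randn(8, 32), torch.randn(8, 32), torch.rand(8, 103))
print(loss.item())
```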
In this embodiment, the band importance weight W of each band in step 4) is calculated according to:
$$W = \frac{1}{MN}\sum_{i=1}^{M\times N} w_i$$
In the above formula, M×N is the spatial size of the hyperspectral image and w_i is the band weight vector of the i-th image block x_i of the hyperspectral image. The band importance weights W are sorted in descending order; the larger the importance weight W of a band, the more effective information the band contains. The top L bands are selected to form the band subset, i.e. the optimal band selection result S ∈ R^{M×N×L}.
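A sketch of this averaging, sorting and top-L selection follows; the function name is illustrative.

```python
# Sketch of step 4): average the per-block band weight vectors w_i into the
# band importance weight W, sort in descending order, and keep the top-L bands.
import numpy as np

def select_bands(block_weights, L):
    """block_weights: array of shape (M*N, D) holding w_i for every image block."""
    W = block_weights.mean(axis=0)          # W = (1 / (M*N)) * sum_i w_i
    order = np.argsort(W)[::-1]             # descending band importance
    return np.sort(order[:L]), W            # indices of the L selected bands

selected, W = select_bands(np.random.rand(400, 103), L=20)
print(selected)
```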
To verify the effectiveness of the method of this embodiment (TSBS), its performance was evaluated on the public PaviaU data set and compared with other existing methods. The PaviaU data set covers the area of the University of Pavia, Italy, acquired by the ROSIS sensor in 2003; its spatial size is 610×340 and it contains 115 bands, 12 of which were removed because of noise, so the denoised 103-band image is used in this embodiment. The image contains 9 classes of ground objects, and the number of labeled samples is 42766. For the teacher network, the feature extraction network was trained using 10% of the data set. For the student network, the whole data set was used for training and testing. Optionally, in this embodiment the Adam optimizer is used to optimize the network parameters, the batch size is set to 64, the learning rate is set to 1e-5, and the number of training epochs is 100.
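A hypothetical training loop with these settings is sketched below; the tiny linear auto-encoder and the random tensors are placeholders standing in for the actual teacher/student networks and the PaviaU image blocks.

```python
# Sketch of the training configuration in this embodiment: Adam optimizer,
# batch size 64, learning rate 1e-5 and 100 training epochs.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

model = nn.Sequential(nn.Flatten(),               # placeholder network
                      nn.Linear(103 * 5 * 5, 256),
                      nn.ReLU(inplace=True),
                      nn.Linear(256, 103 * 5 * 5))
blocks = torch.rand(1024, 103, 5, 5)              # placeholder image blocks
loader = DataLoader(TensorDataset(blocks), batch_size=64, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
criterion = nn.MSELoss()                          # reconstruction error

for epoch in range(100):                          # 100 training epochs
    for (x,) in loader:
        loss = criterion(model(x).view_as(x), x)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```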
In this embodiment, a support vector machine is selected as the classifier (SVM classifier for short) to evaluate the performance of the various methods. The parameters C and γ of the SVM classifier are determined by cross validation, the kernel function is a radial basis function kernel, and 50 samples of each class are randomly selected to train the SVM classifier. Three objective indices are used to evaluate the classification accuracy, namely the overall accuracy (OA), the Kappa coefficient (Kappa) and the average accuracy (AA). The number of selected bands ranges from 5 to 30 with a step of 5. For a fair comparison, the three evaluation indices are averaged over 10 classification runs. The algorithm proposed in this embodiment is compared with several unsupervised band selection methods, including the linearly constrained minimum variance band selection method (LCMV), the dominant set extraction based band selection method (DESBS), the scalable self-representation learning band selection method (SOP-SRL) and the deep learning based feature selection method (TSFS). The specific test results are shown in Tables 1 to 3 and FIGS. 4 to 6, where Tables 1 to 3 compare the classification results of the method of the invention with those of the existing methods, and FIGS. 4, 5 and 6 plot the corresponding comparison curves.
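The following scikit-learn sketch mirrors this evaluation protocol on synthetic data standing in for the labeled PaviaU pixels; the concrete values of C and γ are placeholders, since the embodiment determines them by cross validation.

```python
# Sketch of the evaluation protocol: train an RBF-kernel SVM on 50 randomly
# chosen samples per class over the selected bands, then report overall
# accuracy (OA), Kappa coefficient and average accuracy (AA).
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix

rng = np.random.default_rng(0)
X = rng.random((2000, 20))                 # pixels restricted to the L selected bands
y = rng.integers(0, 9, size=2000)          # 9 land-cover classes as in PaviaU

train_idx = np.concatenate([rng.choice(np.where(y == c)[0], 50, replace=False)
                            for c in np.unique(y)])   # 50 training samples per class
test_idx = np.setdiff1d(np.arange(len(y)), train_idx)

clf = SVC(kernel="rbf", C=100, gamma="scale")          # C, gamma: placeholder values
clf.fit(X[train_idx], y[train_idx])
pred = clf.predict(X[test_idx])

oa = accuracy_score(y[test_idx], pred)
kappa = cohen_kappa_score(y[test_idx], pred)
cm = confusion_matrix(y[test_idx], pred)
aa = np.mean(np.diag(cm) / cm.sum(axis=1))             # mean of per-class accuracies
print(f"OA={oa:.4f}  Kappa={kappa:.4f}  AA={aa:.4f}")
```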
Table 1: comparison of the overall accuracy (OA) of the method of this embodiment and the existing methods as the number of selected bands L increases from 5 to 30.
Table 2: comparison of the Kappa coefficients of the method of this embodiment and the existing methods as the number of selected bands L increases from 5 to 30.
Table 3: comparison of the average accuracy (AA) of the method of this embodiment and the existing methods as the number of selected bands L increases from 5 to 30.
As can be seen from Tables 1 to 3 and FIGS. 4 to 6, as the number of selected bands L increases from 5 to 30, the classification accuracy of all band selection methods improves. The proposed method is superior to LCMV, DESBS and TSFS on all three objective indices. Only when L is set to 20 does SOP-SRL achieve slightly higher OA and Kappa coefficients; the method of this embodiment performs better in most cases, i.e. for L = 5, 15, 25 and 30. It is particularly notable from the numerical comparison in FIG. 4 that the proposed method has an obvious advantage when L is small. For example, when 5 bands are selected, the proposed method TSBS achieves an OA that is 24.25%, 18.2%, 9.72% and 5.9% higher than LCMV, TSFS, SOP-SRL and DESBS, respectively. This is consistent with the purpose of band selection: selecting as few bands as possible while retaining the key information of the original data and removing redundant information.
In summary, this embodiment utilizes the idea of knowledge distillation: a teacher network with a more complex structure and better performance is introduced to learn a more effective image feature representation, and it guides the training of a student network with a simpler structure and lower complexity, so that errors are easier to back-propagate, thereby achieving the optimal selection of a small number of representative bands. The band selection method proposed in this embodiment mainly comprises three stages. First, the teacher network extracts the spatial-spectral features of the hyperspectral image through a three-dimensional auto-encoder to obtain an effective feature representation of the image. Second, the student network is trained under the guidance of the teacher network; specifically, it combines the channel attention module and the nonlinear mapping module and learns the importance weights of the bands by minimizing the reconstruction error of the effective features. Finally, the band weights are sorted to obtain a small number of optimal selected bands. The band selection performance is verified by the hyperspectral image classification results, which show that the method can effectively select a representative subset of characteristic bands.
In addition, this embodiment also provides a knowledge distillation-based unsupervised band selection system for hyperspectral images, which comprises a microprocessor and a memory connected with each other, wherein the microprocessor is programmed or configured to perform the steps of the aforementioned knowledge distillation-based unsupervised band selection method for hyperspectral images.
Furthermore, this embodiment also provides a computer-readable storage medium in which a computer program is stored, the computer program being programmed or configured to perform the aforementioned knowledge distillation-based unsupervised band selection method for hyperspectral images.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.

Claims (8)

1. A knowledge distillation-based hyperspectral image unsupervised waveband selection method is characterized by comprising the following steps:
1) dividing the hyperspectral image into image blocks;
2) training a teacher network to extract spatial-spectral features from the image blocks of the hyperspectral image;
3) training a student network for estimating the band weight vector corresponding to each image block, based on the spatial-spectral features extracted by the teacher network from the image blocks of the hyperspectral image samples, wherein the student network has a simpler structure than the teacher network and models the global nonlinear relationship between bands through a channel attention module;
4) calculating the importance weight W of each band from the band weights of all image blocks, sorting the resulting importance weights W, and taking the top L bands as the obtained optimal band subset;
the student network in step 3) comprises a channel attention module, a weighting module and a nonlinear mapping module connected in sequence, wherein the channel attention module is used to obtain the band weight vector w_i of the i-th image block x_i of the hyperspectral image by modeling the global nonlinear relationship between bands; the weighting module is used to weight the i-th image block x_i of the hyperspectral image with the corresponding band weight vector w_i to obtain the weighted image block; and the nonlinear mapping module is used to map the weighted image block to a new feature space and extract features; the channel attention module obtains the band weight vector w_i of the i-th image block x_i of the hyperspectral image by modeling the global nonlinear relationship between bands as follows: a global average pooling operation avgpool and a global max pooling operation maxpool are applied to the i-th image block x_i along the spatial axes, giving the mean and the maximum of each channel feature, i.e. two vectors of size 1×D, where D is the number of bands; each 1×D vector is forwarded to a shared multi-layer perceptron MLP, the outputs of the shared multi-layer perceptron MLP are summed, and the sum is passed through a Sigmoid activation function to obtain the band weight vector w_i of the i-th image block x_i of the hyperspectral image.
2. The knowledge distillation-based hyperspectral image unsupervised waveband selection method according to claim 1, wherein the teacher network in step 2) is a three-dimensional convolutional auto-encoder network, and the teacher network extracts spatial-spectral features according to:
$$h_i^k = f\left(f\left(x_i \omega^{(k-1)} + b^{(k-1)}\right)\omega^{(k)} + b^{(k)}\right)$$
In the above formula, h_i^k denotes the spatial-spectral feature output by the k-th layer of the teacher network for the i-th image block x_i of the hyperspectral image, f is the linear rectification activation function, ω^(k-1) and b^(k-1) are the convolution kernel parameters and bias of the (k-1)-th layer of the teacher network, and ω^(k) and b^(k) are the convolution kernel parameters and bias of the k-th layer of the teacher network.
3. The knowledge distillation-based hyperspectral image unsupervised waveband selection method according to claim 2, wherein the three-dimensional convolutional auto-encoder network comprises a first encoding layer, a second encoding layer, a pooling layer, a first decoding layer and a second decoding layer connected in sequence, wherein the first encoding layer and the second encoding layer each comprise a convolution module, a batch normalization module and a nonlinear activation function module, the first decoding layer and the second decoding layer each comprise a deconvolution module, a batch normalization module and a nonlinear activation function module, an image block of the input hyperspectral image is mapped to a new feature space through the first encoding layer and the second encoding layer, the spatial-spectral feature of the image block is obtained through the pooling layer, and the reconstructed image block is obtained from the spatial-spectral feature through the first decoding layer and the second decoding layer.
4. The knowledge distillation-based hyperspectral image unsupervised waveband selection method according to claim 1, wherein the nonlinear mapping module comprises a first feature extraction layer, a second feature extraction layer and a pooling layer connected in sequence, the first feature extraction layer and the second feature extraction layer each comprise a convolution module, a batch normalization module and a nonlinear activation function module connected in sequence, and the nonlinear mapping module extracts features according to:
$$y_i^j = f\left(f\left(x_i \omega^{(j-1)} + b^{(j-1)}\right)\omega^{(j)} + b^{(j)}\right)$$
In the above formula, y_i^j denotes the feature output by the j-th layer of the nonlinear mapping module for the i-th image block x_i of the hyperspectral image, f is the linear rectification activation function, ω^(j-1) and b^(j-1) are the convolution kernel parameters and bias of the (j-1)-th layer of the nonlinear mapping module, and ω^(j) and b^(j) are the convolution kernel parameters and bias of the j-th layer of the nonlinear mapping module.
5. The knowledge distillation-based hyperspectral image unsupervised waveband selection method according to claim 4, wherein the function expression of the loss function adopted in training the student network for estimating the waveband weight vector corresponding to each image block in the step 3) is as follows:
$$Loss_s = \frac{1}{MN}\sum_{i=1}^{M\times N}\left(\left\|h_i^3 - y_i^3\right\|_2^2 + \lambda\left\|w_i\right\|_1\right)$$
In the above formula, Loss_s denotes the loss function, M×N is the spatial size of the hyperspectral image, h_i^3 is the spatial-spectral feature output by the teacher network, y_i^3 is the feature output by the student network, λ is the regularization coefficient, and w_i is the band weight vector of the i-th image block x_i of the hyperspectral image.
6. The knowledge distillation-based hyperspectral image unsupervised waveband selection method according to claim 5, wherein the function expression for calculating the waveband importance weight W of each waveband in the step 4) is as follows:
$$W = \frac{1}{MN}\sum_{i=1}^{M\times N} w_i$$
In the above formula, M×N is the spatial size of the hyperspectral image and w_i is the band weight vector of the i-th image block x_i of the hyperspectral image.
7. A knowledge distillation based hyperspectral image unsupervised waveband selection system comprising a microprocessor and a memory connected with each other, characterized in that the microprocessor is programmed or configured to perform the steps of the knowledge distillation based hyperspectral image unsupervised waveband selection method of any of claims 1 to 6.
8. A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, the computer program being programmed or configured to perform the unsupervised waveband selection method for hyperspectral image based knowledge distillation according to any of claims 1 to 6.
CN202111023434.XA 2021-09-02 2021-09-02 Hyperspectral image unsupervised waveband selection method and system based on knowledge distillation Active CN113470036B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111023434.XA CN113470036B (en) 2021-09-02 2021-09-02 Hyperspectral image unsupervised waveband selection method and system based on knowledge distillation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111023434.XA CN113470036B (en) 2021-09-02 2021-09-02 Hyperspectral image unsupervised waveband selection method and system based on knowledge distillation

Publications (2)

Publication Number Publication Date
CN113470036A CN113470036A (en) 2021-10-01
CN113470036B true CN113470036B (en) 2021-11-23

Family

ID=77867382

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111023434.XA Active CN113470036B (en) 2021-09-02 2021-09-02 Hyperspectral image unsupervised waveband selection method and system based on knowledge distillation

Country Status (1)

Country Link
CN (1) CN113470036B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113963022B (en) * 2021-10-20 2023-08-18 哈尔滨工业大学 Multi-outlet full convolution network target tracking method based on knowledge distillation
WO2024121999A1 (en) * 2022-12-07 2024-06-13 日本電信電話株式会社 Learning device, learning method, and learning program

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106228186A (en) * 2016-07-20 2016-12-14 湖南大学 Classification hyperspectral imagery apparatus and method
CN108764462A (en) * 2018-05-29 2018-11-06 成都视观天下科技有限公司 A kind of convolutional neural networks optimization method of knowledge based distillation
CN109784192A (en) * 2018-12-20 2019-05-21 西安电子科技大学 Hyperspectral Image Classification method based on super-pixel feature extraction neural network algorithm
CN111191514A (en) * 2019-12-04 2020-05-22 中国地质大学(武汉) Hyperspectral image band selection method based on deep learning
CN111402182A (en) * 2020-03-18 2020-07-10 中国资源卫星应用中心 Land-coverage-information-based midsplit image synthesis method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6932947B2 (en) * 2017-03-02 2021-09-08 コニカミノルタ株式会社 Defective image occurrence prediction system and defective image occurrence prediction program

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106228186A (en) * 2016-07-20 2016-12-14 湖南大学 Classification hyperspectral imagery apparatus and method
CN108764462A (en) * 2018-05-29 2018-11-06 成都视观天下科技有限公司 A kind of convolutional neural networks optimization method of knowledge based distillation
CN109784192A (en) * 2018-12-20 2019-05-21 西安电子科技大学 Hyperspectral Image Classification method based on super-pixel feature extraction neural network algorithm
CN111191514A (en) * 2019-12-04 2020-05-22 中国地质大学(武汉) Hyperspectral image band selection method based on deep learning
CN111402182A (en) * 2020-03-18 2020-07-10 中国资源卫星应用中心 Land-coverage-information-based midsplit image synthesis method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Deep feature selection using a teacher-student network";Ali Mirzaei,et.al;《Neurocomputing》;20200331;第396-408页 *

Also Published As

Publication number Publication date
CN113470036A (en) 2021-10-01

Similar Documents

Publication Publication Date Title
Lemhadri et al. Lassonet: Neural networks with feature sparsity
Peng et al. Self-paced nonnegative matrix factorization for hyperspectral unmixing
CN111369487B (en) Hyperspectral and multispectral image fusion method, system and medium
CN111738124A (en) Remote sensing image cloud detection method based on Gabor transformation and attention
CN111583285B (en) Liver image semantic segmentation method based on edge attention strategy
CN113470036B (en) Hyperspectral image unsupervised waveband selection method and system based on knowledge distillation
CN111191514A (en) Hyperspectral image band selection method based on deep learning
JP2023512140A (en) Anomaly detectors, methods of anomaly detection, and methods of training anomaly detectors
CN111415323B (en) Image detection method and device and neural network training method and device
CN113723255A (en) Hyperspectral image classification method and storage medium
CN113421216B (en) Hyperspectral fusion calculation imaging method and system
CN112818920B (en) Double-temporal hyperspectral image space spectrum joint change detection method
CN116665065B (en) Cross attention-based high-resolution remote sensing image change detection method
CN114937173A (en) Hyperspectral image rapid classification method based on dynamic graph convolution network
CN113496221B (en) Point supervision remote sensing image semantic segmentation method and system based on depth bilateral filtering
Das et al. Sparsity regularized deep subspace clustering for multicriterion-based hyperspectral band selection
CN113208641A (en) Pulmonary nodule auxiliary diagnosis method based on three-dimensional multi-resolution attention capsule network
Pal Margin-based feature selection for hyperspectral data
CN116310851A (en) Remote sensing image change detection method
Trevino-Sanchez et al. Hybrid pooling with wavelets for convolutional neural networks
CN116363469A (en) Method, device and system for detecting infrared target with few samples
Islam et al. Subgrouping-based nmf with imbalanced class handling for hyperspectral image classification
CN115546638A (en) Change detection method based on Siamese cascade differential neural network
CN115375966A (en) Image countermeasure sample generation method and system based on joint loss function
CN113902973A (en) Hyperspectral anomaly detection method based on self-encoder and low-dimensional manifold modeling

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant