CN115331110A - Fusion classification method and device for remote sensing hyperspectral image and laser radar image - Google Patents

Fusion classification method and device for remote sensing hyperspectral image and laser radar image

Info

Publication number
CN115331110A
Authority
CN
China
Prior art keywords
image
hyperspectral
size
matrix
introducing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211037953.6A
Other languages
Chinese (zh)
Inventor
于文博
黄鹤
沈纲祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University
Priority to CN202211037953.6A
Publication of CN115331110A
Priority to PCT/CN2022/142160 (WO2024040828A1)
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/40 Extraction of image or video features
    • G06V 10/58 Extraction of image or video features relating to hyperspectral data
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention relates to a fusion classification method for a remote sensing hyperspectral image and a lidar image, which comprises: acquiring a hyperspectral image and a lidar image; performing intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image; selecting a neighborhood block for each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel; training a plurality of deep network branches with the neighborhood blocks; concatenating the outputs of the deep network branches pairwise with a concatenation layer; and performing multi-modal fusion on the concatenated outputs to obtain the final output category. The fusion classification method for remote sensing hyperspectral images and lidar images can fully fuse the important discrimination information in multi-source remote sensing images, achieve high-precision classification of target pixels, largely avoid the omission and loss of important information during fusion, and reduce problems such as low classification accuracy caused by information loss.

Description

Fusion classification method and device for remote sensing hyperspectral image and laser radar image
Technical Field
The invention relates to the technical field of remote sensing image processing, in particular to a method and a device for fusion classification of remote sensing hyperspectral images and laser radar images.
Background
In the field of remote sensing, hyperspectral images and lidar images are widely used in many related studies. A hyperspectral image contains abundant spatial and spectral information: the spatial information is the spatial position information of the pixels at each wavelength, and the spectral information is the spectral curve formed by the spectral reflectances of a single pixel across the wavelengths. A lidar image records the elevation information of the target ground objects. Fully fusing the hyperspectral image and the lidar image achieves information complementarity, so that the complete information of the ground objects can be learned and modeled. At the same time, fusing and classifying the two remote sensing images allows the embedded features of the pixels to be fully mined, which improves the recognition accuracy of subsequent classification research. Early fusion classification methods usually adopted two independent branches to extract features from the two images and fused the multi-source information by simple concatenation or similar means; such methods do not consider the correlation between the different branches and find it difficult to balance the multi-source information. With the growth of computing power and the deepening of deep learning research, methods that achieve full fusion of hyperspectral and lidar images by training neural networks have been proposed successively; these methods improve the information extraction process for the different images, strengthen the correlation between the images, and improve the performance of the algorithms.
At present, hyperspectral and lidar image fusion classification methods in the remote sensing field can generally be divided into fusion classification methods based on classical machine learning and fusion classification methods based on deep learning. The former are mainly based on classical machine learning theory and use the spatial and spectral information of the hyperspectral image together with the elevation information of the lidar image to construct feature extraction and fusion modules, thereby realizing a joint representation of the different remote sensing images. Commonly used machine learning techniques include Principal Component Analysis (PCA), Minimum Noise Fraction (MNF), Linear Discriminant Analysis (LDA), and the like. Other machine learning methods, such as manifold learning, structured sparsity and dictionary decomposition algorithms, also play an important role. These methods generally extract discrimination information from the hyperspectral and lidar images and ensure the classifiability of the samples by fusing the different kinds of information. With the continued development of deep learning theory, deep network models have also been applied to hyperspectral and lidar image fusion classification, such as the Auto-Encoder (AE), the Variational Auto-Encoder (VAE) and the Long Short-Term Memory network (LSTM). Such methods extract deep discrimination information through complex network structures and describe the discriminative features contained in the samples from multiple aspects, so more and more fusion classification methods based on deep learning have been proposed. For example, "Deep Encoder-Decoder Networks for Classification of Hyperspectral and LiDAR Data", published by Danfeng Hong et al. in IEEE Geoscience and Remote Sensing Letters in 2020, proposes a fully connected network based on an encoder-decoder structure that extracts and fuses features from the hyperspectral and lidar images separately, realizing the reconstruction of feature information and its mapping into a deeper embedding space. In addition, "More Diverse Means Better: Multimodal Deep Learning Meets Remote-Sensing Imagery Classification", published in the same period in IEEE Transactions on Geoscience and Remote Sensing, proposes a deep learning framework for multimodal data that performs secondary learning of the complementary information between multimodal images through parameter cross-selection during network training. Deep learning has therefore been widely applied to hyperspectral and lidar image fusion classification research in the remote sensing field and has achieved good results.
However, the existing hyperspectral and lidar image fusion classification methods in the remote sensing field still have certain shortcomings: (1) the existing methods do not consider the correlation between the illumination information of the hyperspectral image and the elevation information of the lidar image, so it is difficult to fuse the two deeply, which weakens the performance of the classification model; (2) the existing methods do not apply the illumination information of the hyperspectral image to the construction of the fusion classification model, and do not consider decomposing the hyperspectral image into an intrinsic image and an illumination image and exploiting the advantages of both; some methods attempt to introduce intrinsic decomposition theory into the classification model but directly discard the decomposed illumination image and use only the intrinsic image and the lidar image for fusion classification, so the advantages of the multi-modal remote sensing images cannot be fully exploited; (3) when extracting discrimination information from the hyperspectral and lidar images, the existing methods use completely separated branches for information mining and feature extraction, which is not conducive to grasping the complete information of the pixels and makes it difficult to exploit the advantages of multi-modal remote sensing images for pixel classification and recognition; (4) the existing methods usually adopt convolutional neural networks to extract image spatial information, but conventional convolutional neural networks do not consider the limitations of multi-modal learning and have no dedicated structural design for fusing information between images of different modalities, which is not conducive to improving the fusion classification accuracy of hyperspectral and lidar images.
Disclosure of Invention
Therefore, the technical problem to be solved by the invention is to overcome the problems in the prior art and to provide a method and a device for fusion classification of remote sensing hyperspectral images and lidar images that can fully fuse the important discrimination information in multi-source remote sensing images, achieve high-precision classification of target pixels, largely avoid the omission and loss of important information during fusion, and reduce problems such as decreased classification accuracy caused by information loss.
In order to solve the technical problem, the invention provides a fusion classification method of a remote sensing hyperspectral image and a laser radar image, which comprises the following steps:
S1: acquiring a hyperspectral image and a lidar image, wherein the category of each ground object in the two images is denoted label;
S2: performing intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, and selecting the surrounding neighborhood of size s × s as the neighborhood block of each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, wherein the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s × s × B and the neighborhood block of each lidar pixel in the lidar image L has size s × s;
S3: training the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 using the neighborhood blocks, wherein the inputs of L_1 and L_2 are hyperspectral intrinsic pixel blocks of size s × s × B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixel blocks of size s × s × B from the lidar image L, and the inputs of L_5 and L_6 are hyperspectral illumination pixel blocks of size s × s × B from the hyperspectral illumination image; the outputs are O_1, O_2, O_3, O_4, O_5 and O_6, each of size s × s × d;
S4: concatenating the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise using a concatenation layer to obtain O_12, O_34 and O_56;
S5: inputting O_34 and O_56 into the 1st multi-modal grouped convolution layer to obtain the output O_3456^1; inputting O_34 into the 2nd multi-modal grouped convolution layer to obtain the output O_34^1; inputting O_56 into the 3rd multi-modal grouped convolution layer to obtain the output O_56^1; inputting O_34^1, O_3456^1 and O_56^1 into the 4th multi-modal grouped convolution layer to obtain the output O_3456^2; inputting O_34^1 into the 5th multi-modal grouped convolution layer to obtain the output O_34^2; inputting O_56^1 into the 6th multi-modal grouped convolution layer to obtain the output O_56^2; inputting O_34^2, O_3456^2 and O_56^2 into the 7th multi-modal grouped convolution layer to obtain the output O_3456^3; inputting O_34^2 into the 8th multi-modal grouped convolution layer to obtain the output O_34^3; inputting O_56^2 into the 9th multi-modal grouped convolution layer to obtain the output O_56^3; inputting O_34^3, O_3456^3 and O_56^3 into the 10th multi-modal grouped convolution layer to obtain the output O_3456^4; inputting O_12 and O_3456^1 into the 11th multi-modal grouped convolution layer to obtain the output O_12^1; inputting O_12^1 and O_3456^2 into the 12th multi-modal grouped convolution layer to obtain the output O_12^2; inputting O_12^2 and O_3456^3 into the 13th multi-modal grouped convolution layer to obtain the output O_12^3; and inputting O_3456^4 and O_12^3, each of size s × s × d, into a two-dimensional average pooling layer of size s × s to obtain O_3456^5 and O_12^4, each of size 1 × d;
S6: inputting O_12^4 and O_3456^5 into a concatenation layer to obtain the output O_123456 of size 1 × 2d, and inputting O_123456 into a fully connected layer to obtain the final output category.
In an embodiment of the present invention, in step S1, after selecting the hyperspectral image and the lidar image, the hyperspectral image and the lidar image are subjected to normalization preprocessing.
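The patent does not spell out the normalization; the following minimal Python sketch assumes per-band min-max scaling to [0, 1] (the scaling choice and the function name are assumptions, not taken from the patent):

```python
import numpy as np

def normalize_hsi_lidar(H, L):
    """Min-max normalize a hyperspectral cube H of shape (X, Y, B) per band
    and a lidar image L of shape (X, Y); scaling to [0, 1] is an assumption."""
    H = H.astype(np.float64)
    L = L.astype(np.float64)
    H_min = H.min(axis=(0, 1), keepdims=True)
    H_max = H.max(axis=(0, 1), keepdims=True)
    H_norm = (H - H_min) / (H_max - H_min + 1e-12)
    L_norm = (L - L.min()) / (L.max() - L.min() + 1e-12)
    return H_norm, L_norm
```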
In an embodiment of the present invention, the method for performing intrinsic image decomposition on the hyperspectral image in step S2 to obtain an intrinsic image and an illumination image includes:
S2.1: calculating, for each hyperspectral pixel H_i, where 1 ≤ i ≤ X × Y, the corresponding matrix D_i:
D_i = [H_1, ..., H_{i-1}, H_{i+1}, ..., H_{X×Y}, I_B] ∈ R^{B×(B+X×Y-1)}
wherein I_B is an identity matrix of size B × B;
S2.2: calculating, based on the matrix D_i, the vector α_i corresponding to each hyperspectral pixel H_i:
min ||α_i||_1  s.t.  H_i = D_i α_i
wherein α_i has shape (B + X × Y - 1) × 1;
S2.3: constructing a weight matrix W ∈ R^{(X×Y)×(X×Y)} and assigning the element W_ij in the i-th row and j-th column of the weight matrix according to the sparse coefficients α_i (the assignment formula is reproduced as an image in the source publication);
calculating the matrix G = (I_{X×Y} - W^T)(I_{X×Y} - W) + δ I_{X×Y} based on the weight matrix W, wherein I_{X×Y} is an identity matrix of size (X × Y) × (X × Y), δ is a constant, and T denotes the matrix transpose;
S2.4: flattening the hyperspectral image H into a two-dimensional matrix and taking the logarithm to obtain log(flatten(H)), and calculating the matrix K = (I_B - 1_B 1_B^T / B) log(flatten(H)) (I_{X×Y} - 1_{X×Y} 1_{X×Y}^T / (X × Y)), wherein I_B and I_{X×Y} are identity matrices of size B × B and (X × Y) × (X × Y), and 1_B and 1_{X×Y} are all-ones vectors of size B × 1 and (X × Y) × 1, respectively;
S2.5: calculating the matrix ρ = δKG^{-1} based on the matrices G and K, and obtaining, based on ρ, the intrinsic image RE = e^ρ and the illumination image SH = e^{log(H)-ρ} decomposed from the hyperspectral image H, wherein e is the natural constant and both images have size X × Y × B.
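For clarity, a rough NumPy/scikit-learn sketch of steps S2.1 to S2.5 follows; it is a conceptual illustration only, not a scalable implementation. The Lasso call stands in for the equality-constrained L1 problem of S2.2, and the rule W[i, j] = |α_i[j]| is an assumed reading of the weight-assignment formula that appears only as an image in the source.

```python
import numpy as np
from sklearn.linear_model import Lasso

def intrinsic_decomposition(H, delta=1e-3, lasso_alpha=1e-3):
    """Sketch of the intrinsic/illumination decomposition (S2.1-S2.5).
    H: hyperspectral cube of shape (X, Y, B); delta, lasso_alpha and the
    weight-assignment rule are assumptions."""
    X, Y, B = H.shape
    N = X * Y
    P = H.reshape(N, B).T + 1e-12                 # B x N, one pixel per column

    # S2.1-S2.3: sparse self-representation of every pixel and weight matrix W
    W = np.zeros((N, N))
    for i in range(N):
        others = np.delete(np.arange(N), i)
        D_i = np.hstack([P[:, others], np.eye(B)])            # B x (B + N - 1)
        alpha_i = Lasso(alpha=lasso_alpha, fit_intercept=False,
                        max_iter=5000).fit(D_i, P[:, i]).coef_
        W[i, others] = np.abs(alpha_i[: N - 1])               # assumed assignment rule

    I_N = np.eye(N)
    G = (I_N - W.T) @ (I_N - W) + delta * I_N

    # S2.4: doubly centred log-image matrix K
    logH = np.log(P)                                          # B x N
    C_B = np.eye(B) - np.ones((B, B)) / B
    C_N = I_N - np.ones((N, N)) / N
    K = C_B @ logH @ C_N

    # S2.5: rho = delta * K * G^{-1}; reflectance (intrinsic) and shading images
    rho = delta * K @ np.linalg.inv(G)                        # B x N
    RE = np.exp(rho).T.reshape(X, Y, B)
    SH = np.exp(logH - rho).T.reshape(X, Y, B)
    return RE, SH
```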
In an embodiment of the present invention, the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 in step S3 each comprise multiple two-dimensional convolution layers; each two-dimensional convolution layer has d convolution kernels, the convolution kernel size is [3, 3], and the convolution kernel sliding stride is [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
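As a concrete reference for the branch structure just described, here is a minimal PyTorch sketch of one branch (three 3 × 3 convolution layers with d kernels, stride [1, 1], 'Same' padding and Tanh activation, per the embodiment's settings); the class name, the concrete values of B, d and s, and the weight-sharing illustration are assumptions made for the example.

```python
import torch
import torch.nn as nn

class BranchCNN(nn.Module):
    """One deep network branch: three 2-D convolution layers, each with d
    3x3 kernels, stride 1 and 'same' padding, followed by Tanh, so an
    s x s x C input block maps to an s x s x d output."""

    def __init__(self, in_channels: int, d: int):
        super().__init__()
        layers, c = [], in_channels
        for _ in range(3):
            layers += [nn.Conv2d(c, d, kernel_size=3, stride=1, padding=1), nn.Tanh()]
            c = d
        self.body = nn.Sequential(*layers)

    def forward(self, x):          # x: (batch, C, s, s)
        return self.body(x)        # (batch, d, s, s)

# Branch pairs that share all weights (L2/L3 and L4/L5) can simply reuse
# one module instance on their respective inputs:
B, d, s = 63, 120, 11                       # embodiment values, used here for illustration
branch_L2_L3 = BranchCNN(in_channels=B, d=d)
blocks = torch.randn(4, B, s, s)            # a mini-batch of neighborhood blocks
out = branch_L2_L3(blocks)                  # shape: (4, d, s, s)
```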
In an embodiment of the present invention, the loss function used when training the deep network branches in step S3 compares label, the input image category, with the output image category predicted by the network (the loss formula itself is reproduced only as an image in the source publication).
In an embodiment of the present invention, in step S4 the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 are concatenated pairwise by the concatenation layer to obtain O_12, O_34 and O_56, expressed as O_12 = Concatenation(O_1, O_2), O_34 = Concatenation(O_3, O_4) and O_56 = Concatenation(O_5, O_6).
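In a channel-first tensor layout this pairwise splicing is simply a channel-wise concatenation; a short PyTorch illustration (tensor shapes taken from the embodiment, variable names assumed):

```python
import torch

O1 = torch.randn(4, 120, 11, 11)    # output of branch L1: (batch, d, s, s)
O2 = torch.randn(4, 120, 11, 11)    # output of branch L2
O12 = torch.cat([O1, O2], dim=1)    # Concatenation(O1, O2): (batch, 2d, s, s)
```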
In addition, the invention also provides a fusion classification device of the remote sensing hyperspectral image and the laser radar image, which comprises:
the data acquisition module, which is used for acquiring a hyperspectral image and a lidar image, wherein the category of each ground object in the two images is denoted label;
the image decomposition module, which is used for performing intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, and for selecting the surrounding neighborhood of size s × s as the neighborhood block of each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, wherein the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s × s × B and the neighborhood block of each lidar pixel in the lidar image L has size s × s;
the deep network training module, which is used for training the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 using the neighborhood blocks, wherein the inputs of L_1 and L_2 are hyperspectral intrinsic pixel blocks of size s × s × B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixel blocks of size s × s × B from the lidar image, and the inputs of L_5 and L_6 are hyperspectral illumination pixel blocks of size s × s × B from the hyperspectral illumination image; the outputs are O_1, O_2, O_3, O_4, O_5 and O_6, each of size s × s × d;
the image stitching module, which is used for concatenating the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise using a concatenation layer to obtain O_12, O_34 and O_56;
the multi-modal fusion module, which is used for inputting O_34 and O_56 into the 1st multi-modal grouped convolution layer to obtain the output O_3456^1; inputting O_34 into the 2nd multi-modal grouped convolution layer to obtain the output O_34^1; inputting O_56 into the 3rd multi-modal grouped convolution layer to obtain the output O_56^1; inputting O_34^1, O_3456^1 and O_56^1 into the 4th multi-modal grouped convolution layer to obtain the output O_3456^2; inputting O_34^1 into the 5th multi-modal grouped convolution layer to obtain the output O_34^2; inputting O_56^1 into the 6th multi-modal grouped convolution layer to obtain the output O_56^2; inputting O_34^2, O_3456^2 and O_56^2 into the 7th multi-modal grouped convolution layer to obtain the output O_3456^3; inputting O_34^2 into the 8th multi-modal grouped convolution layer to obtain the output O_34^3; inputting O_56^2 into the 9th multi-modal grouped convolution layer to obtain the output O_56^3; inputting O_34^3, O_3456^3 and O_56^3 into the 10th multi-modal grouped convolution layer to obtain the output O_3456^4; inputting O_12 and O_3456^1 into the 11th multi-modal grouped convolution layer to obtain the output O_12^1; inputting O_12^1 and O_3456^2 into the 12th multi-modal grouped convolution layer to obtain the output O_12^2; inputting O_12^2 and O_3456^3 into the 13th multi-modal grouped convolution layer to obtain the output O_12^3; and inputting O_3456^4 and O_12^3, each of size s × s × d, into a two-dimensional average pooling layer of size s × s to obtain O_3456^5 and O_12^4, each of size 1 × d;
the image classification module, which is used for inputting O_12^4 and O_3456^5 into a concatenation layer to obtain the output O_123456 of size 1 × 2d, and for inputting O_123456 into a fully connected layer to obtain the final output category.
In an embodiment of the present invention, the data acquisition module includes a data preprocessing submodule, and the data preprocessing submodule is configured to perform normalization preprocessing on the hyperspectral image and the lidar image after the hyperspectral image and the lidar image are selected.
In an embodiment of the present invention, the image decomposition module performing intrinsic image decomposition on the hyperspectral image to obtain the intrinsic image and the illumination image includes:
calculating, for each hyperspectral pixel H_i, where 1 ≤ i ≤ X × Y, the corresponding matrix D_i:
D_i = [H_1, ..., H_{i-1}, H_{i+1}, ..., H_{X×Y}, I_B] ∈ R^{B×(B+X×Y-1)}
wherein I_B is an identity matrix of size B × B;
calculating, based on the matrix D_i, the vector α_i corresponding to each hyperspectral pixel H_i:
min ||α_i||_1  s.t.  H_i = D_i α_i
wherein α_i has shape (B + X × Y - 1) × 1;
constructing a weight matrix W ∈ R^{(X×Y)×(X×Y)} and assigning the element W_ij in the i-th row and j-th column of the weight matrix according to the sparse coefficients α_i (the assignment formula is reproduced as an image in the source publication);
calculating the matrix G = (I_{X×Y} - W^T)(I_{X×Y} - W) + δ I_{X×Y} based on the weight matrix W, wherein I_{X×Y} is an identity matrix of size (X × Y) × (X × Y), δ is a constant, and T denotes the matrix transpose;
flattening the hyperspectral image H into a two-dimensional matrix and taking the logarithm to obtain log(flatten(H)), and calculating the matrix K = (I_B - 1_B 1_B^T / B) log(flatten(H)) (I_{X×Y} - 1_{X×Y} 1_{X×Y}^T / (X × Y)), wherein I_B and I_{X×Y} are identity matrices of size B × B and (X × Y) × (X × Y), and 1_B and 1_{X×Y} are all-ones vectors of size B × 1 and (X × Y) × 1, respectively;
calculating the matrix ρ = δKG^{-1} based on the matrices G and K, and obtaining, based on ρ, the intrinsic image RE = e^ρ and the illumination image SH = e^{log(H)-ρ} decomposed from the hyperspectral image H, wherein e is the natural constant and both images have size X × Y × B.
In an embodiment of the present invention, the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 each comprise multiple two-dimensional convolution layers; each two-dimensional convolution layer has d convolution kernels, the convolution kernel size is [3, 3], and the convolution kernel sliding stride is [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
Compared with the prior art, the technical scheme of the invention has the following advantages:
1. The remote sensing hyperspectral image and lidar image fusion classification method provided by the invention can fully fuse the important discrimination information in multi-source remote sensing images, achieve high-precision classification of target pixels, largely avoid the omission and loss of important information during fusion, and reduce problems such as decreased classification accuracy caused by information loss;
2. The invention applies the intrinsic image decomposition theory of hyperspectral images to the fusion classification research of hyperspectral and lidar images, effectively combining the two, avoiding the common practice of discarding the illumination image obtained by the decomposition, and reducing the loss of information;
3. The invention fully fuses the illumination image obtained by decomposing the hyperspectral image with the lidar image, so that the correlation between the illumination information of the hyperspectral image and the elevation information of the lidar image is fully mined and utilized, the advantages of the illumination image in model construction are fully exploited, and the final classification performance is improved.
Drawings
In order that the present disclosure may be more readily and clearly understood, reference will now be made in detail to the present disclosure, examples of which are illustrated in the accompanying drawings.
FIG. 1 is a flow chart of a fusion classification method of a remote sensing hyperspectral image and a laser radar image provided by the invention.
FIG. 2 is a schematic frame diagram of a remote sensing hyperspectral image and lidar image fusion classification device provided by the invention.
Wherein the reference numerals are as follows: 10. a data acquisition module; 20. an image decomposition module; 30. a deep network training module; 40. an image stitching module; 50. a multimodal fusion module; 60. and an image classification module.
Detailed Description
The present invention is further described below in conjunction with the following figures and specific examples so that those skilled in the art may better understand the present invention and practice it, but the examples are not intended to limit the present invention.
Referring to fig. 1, an embodiment of the present invention provides a method for fusion classification of a remote sensing hyperspectral image and a lidar image, including:
S1: acquiring a hyperspectral image and a lidar image, wherein the category of each ground object in the two images is denoted label;
S2: performing intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, and selecting the surrounding neighborhood of size s × s as the neighborhood block of each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, wherein the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s × s × B and the neighborhood block of each lidar pixel in the lidar image L has size s × s;
S3: training the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 using the neighborhood blocks, wherein the inputs of L_1 and L_2 are hyperspectral intrinsic pixel blocks of size s × s × B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixel blocks of size s × s × B from the lidar image L, and the inputs of L_5 and L_6 are hyperspectral illumination pixel blocks of size s × s × B from the hyperspectral illumination image; the outputs are O_1, O_2, O_3, O_4, O_5 and O_6, each of size s × s × d;
S4: concatenating the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise using a concatenation layer to obtain O_12, O_34 and O_56;
S5: inputting O_34 and O_56 into the 1st multi-modal grouped convolution layer to obtain the output O_3456^1; inputting O_34 into the 2nd multi-modal grouped convolution layer to obtain the output O_34^1; inputting O_56 into the 3rd multi-modal grouped convolution layer to obtain the output O_56^1; inputting O_34^1, O_3456^1 and O_56^1 into the 4th multi-modal grouped convolution layer to obtain the output O_3456^2; inputting O_34^1 into the 5th multi-modal grouped convolution layer to obtain the output O_34^2; inputting O_56^1 into the 6th multi-modal grouped convolution layer to obtain the output O_56^2; inputting O_34^2, O_3456^2 and O_56^2 into the 7th multi-modal grouped convolution layer to obtain the output O_3456^3; inputting O_34^2 into the 8th multi-modal grouped convolution layer to obtain the output O_34^3; inputting O_56^2 into the 9th multi-modal grouped convolution layer to obtain the output O_56^3; inputting O_34^3, O_3456^3 and O_56^3 into the 10th multi-modal grouped convolution layer to obtain the output O_3456^4; inputting O_12 and O_3456^1 into the 11th multi-modal grouped convolution layer to obtain the output O_12^1; inputting O_12^1 and O_3456^2 into the 12th multi-modal grouped convolution layer to obtain the output O_12^2; inputting O_12^2 and O_3456^3 into the 13th multi-modal grouped convolution layer to obtain the output O_12^3; and inputting O_3456^4 and O_12^3, each of size s × s × d, into a two-dimensional average pooling layer of size s × s to obtain O_3456^5 and O_12^4, each of size 1 × d;
S6: inputting O_12^4 and O_3456^5 into a concatenation layer to obtain the output O_123456 of size 1 × 2d, and inputting O_123456 into a fully connected layer to obtain the final output category.
Specifically, in step S1 a hyperspectral image H and a lidar image L are selected according to the practical problem, wherein the hyperspectral image has size X × Y × B, X and Y being the spatial sizes of the hyperspectral image in each band and B being the number of bands, and the lidar image has size X × Y, X and Y being its spatial sizes; the spatial sizes of the two images are the same. The hyperspectral image and the lidar image are normalized; the neighborhood size s is set (s is an odd number greater than 0); the number of convolution kernels of each two-dimensional convolution layer is set to d, the convolution kernel size to [3, 3], and the convolution kernel sliding stride to [1, 1]; the padding parameter of each two-dimensional convolution layer is set to 'Same'; the Tanh function is selected as the activation function; the category of each ground object in the two images is label, with size 1 × (X × Y), and the number of categories is c.
The method for performing intrinsic image decomposition on the hyperspectral image to obtain the intrinsic image and the illumination image in the step S2 comprises the following steps:
S2.1: calculating, for each hyperspectral pixel H_i, where 1 ≤ i ≤ X × Y, the corresponding matrix D_i:
D_i = [H_1, ..., H_{i-1}, H_{i+1}, ..., H_{X×Y}, I_B] ∈ R^{B×(B+X×Y-1)}
wherein I_B is an identity matrix of size B × B;
S2.2: calculating, based on the matrix D_i, the vector α_i corresponding to each hyperspectral pixel H_i:
min ||α_i||_1  s.t.  H_i = D_i α_i
wherein α_i has shape (B + X × Y - 1) × 1;
S2.3: constructing a weight matrix W ∈ R^{(X×Y)×(X×Y)} and assigning the element W_ij in the i-th row and j-th column of the weight matrix according to the sparse coefficients α_i (the assignment formula is reproduced as an image in the source publication);
calculating the matrix G = (I_{X×Y} - W^T)(I_{X×Y} - W) + δ I_{X×Y} based on the weight matrix W, wherein I_{X×Y} is an identity matrix of size (X × Y) × (X × Y), δ is a constant, and T denotes the matrix transpose;
S2.4: flattening the hyperspectral image H into a two-dimensional matrix and taking the logarithm to obtain log(flatten(H)), and calculating the matrix K = (I_B - 1_B 1_B^T / B) log(flatten(H)) (I_{X×Y} - 1_{X×Y} 1_{X×Y}^T / (X × Y)), wherein I_B and I_{X×Y} are identity matrices of size B × B and (X × Y) × (X × Y), and 1_B and 1_{X×Y} are all-ones vectors of size B × 1 and (X × Y) × 1, respectively;
S2.5: calculating the matrix ρ = δKG^{-1} based on the matrices G and K, and obtaining, based on ρ, the intrinsic image RE = e^ρ and the illumination image SH = e^{log(H)-ρ} decomposed from the hyperspectral image H, wherein e is the natural constant and both images have size X × Y × B.
In step S2, for each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel (each of the three images contains X × Y pixels), the surrounding neighborhood of size s × s is selected as the neighborhood block of the pixel, wherein the neighborhood block of each hyperspectral pixel in the hyperspectral image H has size s × s × B and the neighborhood block of each lidar pixel in the lidar image L has size s × s.
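A minimal sketch of the neighborhood-block selection is shown below; reflection padding at the image borders and the function name are assumptions, since the patent does not state how border pixels are handled.

```python
import numpy as np

def extract_blocks(img, s):
    """Return an array of shape (X*Y, s, s, C) holding the s x s neighborhood
    block of every pixel. img has shape (X, Y, C) (C = B for the intrinsic and
    illumination images) or (X, Y) for the lidar image."""
    if img.ndim == 2:
        img = img[..., None]
    X, Y, C = img.shape
    r = s // 2
    padded = np.pad(img, ((r, r), (r, r), (0, 0)), mode="reflect")  # assumed border handling
    blocks = np.empty((X * Y, s, s, C), dtype=img.dtype)
    k = 0
    for i in range(X):
        for j in range(Y):
            blocks[k] = padded[i:i + s, j:j + s, :]
            k += 1
    return blocks
```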
In step S3, six deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 are first constructed, wherein the inputs of L_1 and L_2 are hyperspectral intrinsic pixel blocks of size s × s × B from the hyperspectral intrinsic image RE, the inputs of L_3 and L_4 are lidar pixel blocks of size s × s × B from the lidar image L, and the inputs of L_5 and L_6 are hyperspectral illumination pixel blocks of size s × s × B from the hyperspectral illumination image SH. The six deep network branches each consist of three two-dimensional convolution layers; the number of convolution kernels in each two-dimensional convolution layer is d and the convolution kernel sliding stride is [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights. The outputs of the six deep network branches are O_1, O_2, O_3, O_4, O_5 and O_6, each of size s × s × d.
The loss function used when training the deep network branches in step S3 compares label, the input image category, with the output image category predicted by the network (the loss formula itself is reproduced only as an image in the source publication).
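Since the loss formula is available only as an image in the source, the sketch below shows one plausible reading, a standard cross-entropy between the ground-truth category label and the network's predicted category; this choice is an assumption, not a reconstruction of the patent's formula.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()                    # assumed form of the training loss
logits = torch.randn(512, 6, requires_grad=True)     # network outputs for one mini-batch (c = 6 classes assumed)
labels = torch.randint(0, 6, (512,))                 # ground-truth categories ("label" in the text)
loss = criterion(logits, labels)
loss.backward()
```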
In step S4, the outputs of the six deep network branches are concatenated pairwise using a concatenation layer, according to the following formulas:
O_12 = Concatenation(O_1, O_2)
O_34 = Concatenation(O_3, O_4)
O_56 = Concatenation(O_5, O_6).
In step S5, Concatenation(O_34, O_56) is input into the 1st multi-modal grouped convolution layer to obtain the output O_3456^1; O_34 is input into the 2nd multi-modal grouped convolution layer to obtain the output O_34^1; O_56 is input into the 3rd multi-modal grouped convolution layer to obtain the output O_56^1; Concatenation(O_34^1, O_3456^1, O_56^1) is input into the 4th multi-modal grouped convolution layer to obtain the output O_3456^2; O_34^1 is input into the 5th multi-modal grouped convolution layer to obtain the output O_34^2; O_56^1 is input into the 6th multi-modal grouped convolution layer to obtain the output O_56^2; Concatenation(O_34^2, O_3456^2, O_56^2) is input into the 7th multi-modal grouped convolution layer to obtain the output O_3456^3; O_34^2 is input into the 8th multi-modal grouped convolution layer to obtain the output O_34^3; O_56^2 is input into the 9th multi-modal grouped convolution layer to obtain the output O_56^3; Concatenation(O_34^3, O_3456^3, O_56^3) is input into the 10th multi-modal grouped convolution layer to obtain the output O_3456^4; O_3456^4, of size s × s × d, is input into a two-dimensional average pooling layer of size s × s to obtain O_3456^5 of size 1 × d; Concatenation(O_12, O_3456^1) is input into the 11th multi-modal grouped convolution layer to obtain the output O_12^1; Concatenation(O_12^1, O_3456^2) is input into the 12th multi-modal grouped convolution layer to obtain the output O_12^2; Concatenation(O_12^2, O_3456^3) is input into the 13th multi-modal grouped convolution layer to obtain the output O_12^3, of size s × s × d; and O_12^3 is input into a two-dimensional average pooling layer of size s × s to obtain O_12^4 of size 1 × d.
In step S6, O_12^4 and O_3456^5 are input into a concatenation layer to obtain O_123456 = Concatenation(O_12^4, O_3456^5), of size 1 × 2d. O_123456 is then input into a fully connected layer whose number of nodes is c and whose activation function is the Softmax function, giving the final output category.
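To make the data flow of steps S5 and S6 concrete, the following PyTorch sketch wires the thirteen layers, the average pooling and the fully connected layer together. The patent does not define the internal structure of a "multi-modal grouped convolution layer"; here it is approximated by channel-wise concatenation of the inputs followed by a grouped 3 × 3 convolution (one group per fused input) and Tanh, and all channel widths are assumptions chosen so that O_3456^4 and O_12^3 end up with d channels as stated.

```python
import torch
import torch.nn as nn

d, s, c = 120, 11, 6          # embodiment values for kernels and neighborhood; c = 6 classes assumed

class MMGroupConv(nn.Module):
    """Assumed stand-in for one 'multi-modal grouped convolution layer':
    concatenate the inputs channel-wise, then apply a grouped 3x3 convolution
    (one group per fused input) and a Tanh activation."""

    def __init__(self, in_ch, out_ch, groups):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, stride=1, padding=1, groups=groups)
        self.act = nn.Tanh()

    def forward(self, *xs):
        return self.act(self.conv(torch.cat(xs, dim=1)))

# Channel budgets are assumptions: O12/O34/O56 carry 2d channels and every
# grouped layer emits d channels.
fuse = nn.ModuleDict({
    "l1": MMGroupConv(4 * d, d, 2),  "l2": MMGroupConv(2 * d, d, 2),  "l3": MMGroupConv(2 * d, d, 2),
    "l4": MMGroupConv(3 * d, d, 3),  "l5": MMGroupConv(d, d, 1),      "l6": MMGroupConv(d, d, 1),
    "l7": MMGroupConv(3 * d, d, 3),  "l8": MMGroupConv(d, d, 1),      "l9": MMGroupConv(d, d, 1),
    "l10": MMGroupConv(3 * d, d, 3), "l11": MMGroupConv(3 * d, d, 3),
    "l12": MMGroupConv(2 * d, d, 2), "l13": MMGroupConv(2 * d, d, 2),
})
pool = nn.AvgPool2d(kernel_size=s)
fc = nn.Linear(2 * d, c)

def fuse_and_classify(O12, O34, O56):                  # each: (batch, 2d, s, s)
    a1 = fuse["l1"](O34, O56); b1 = fuse["l2"](O34); c1 = fuse["l3"](O56)
    a2 = fuse["l4"](b1, a1, c1); b2 = fuse["l5"](b1); c2 = fuse["l6"](c1)
    a3 = fuse["l7"](b2, a2, c2); b3 = fuse["l8"](b2); c3 = fuse["l9"](c2)
    a4 = fuse["l10"](b3, a3, c3)                       # O_3456^4: (batch, d, s, s)
    p1 = fuse["l11"](O12, a1); p2 = fuse["l12"](p1, a2); p3 = fuse["l13"](p2, a3)
    v1 = pool(a4).flatten(1)                           # O_3456^5: (batch, d)
    v2 = pool(p3).flatten(1)                           # O_12^4:   (batch, d)
    logits = fc(torch.cat([v1, v2], dim=1))            # fully connected layer with c nodes
    return torch.softmax(logits, dim=1)                # predicted category probabilities

O12 = torch.randn(4, 2 * d, s, s); O34 = torch.randn(4, 2 * d, s, s); O56 = torch.randn(4, 2 * d, s, s)
probs = fuse_and_classify(O12, O34, O56)               # shape: (4, c)
```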
The remote sensing hyperspectral image and lidar image fusion classification method efficiently combines intrinsic image decomposition theory with multi-modal remote sensing image fusion classification research, gives full play to the advantages of the illumination image in model construction, and reduces the omission and loss of important discrimination information.
The invention introduces the intrinsic image decomposition of the hyperspectral image into multi-modal remote sensing image fusion classification research for the first time, balances the illumination information against the elevation information of the lidar image, fuses the illumination and elevation information to guide the mining of the intrinsic information, and thereby improves the classifiability of the samples.
The invention fully fuses the illumination image obtained by decomposing the hyperspectral image with the lidar image, so that the correlation between the illumination information of the hyperspectral image and the elevation information of the lidar image is fully mined and utilized, and the advantages of the illumination image in model construction are fully exploited.
The invention provides a discriminative feature extraction method for hyperspectral and lidar images that can greatly improve the correlation between the two images during information mining and reduce the occurrence of information imbalance.
The invention applies multi-modal grouped convolution layers to hyperspectral and lidar image fusion classification research, fully exploiting their value in this field, greatly improving the joint capability between different modalities, reducing unnecessary redundant information, and strengthening the expression of important information.
The hyperspectral image and lidar image used by the remote sensing hyperspectral image and lidar image fusion classification method and device were captured over Trento, Italy; the hyperspectral image has size 166 × 600 × 63 and the lidar image has size 166 × 600.
(I) Inputs of this embodiment
The input hyperspectral image is a 166 × 600 × 63 image, and the input lidar image is a 166 × 600 image.
(II) parameter setting
The neighborhood size is 11 and the number of convolution kernels per two-dimensional convolution layer is 120.
Decomposing the hyperspectral image to obtain an intrinsic image with the size of 166 multiplied by 600 multiplied by 63 and an illumination image with the size of 166 multiplied by 600 multiplied by 63.
And selecting neighborhood information, obtaining a neighborhood block with the size of 11 multiplied by 63 aiming at each pixel, and inputting the neighborhood block into a depth network for training.
(III) training the deep network model
From the total of 99,600 sample neighborhood blocks, 10% are randomly selected for training the deep network model; the selected sample neighborhood blocks are randomly shuffled and packed into mini-batches of 512 sample neighborhood blocks, and only one mini-batch is used for each training step. After training, all 99,600 sample neighborhood blocks are input into the deep network model for testing to obtain the classification results of all samples, and the overall classification accuracy and the average classification accuracy are used to evaluate the classification results. The overall classification accuracy is the number of correctly classified samples among all samples divided by the total number of samples. The average classification accuracy is obtained by dividing, for each class, the number of correctly classified samples by the number of samples in that class, and then averaging these per-class ratios.
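The two evaluation measures can be computed directly from the true and predicted labels of all samples; a small sketch (the function name is an assumption):

```python
import numpy as np

def overall_and_average_accuracy(y_true, y_pred):
    """Overall accuracy: correctly classified samples / total samples.
    Average accuracy: mean over classes of the per-class accuracy."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    oa = float(np.mean(y_true == y_pred))
    per_class = [float(np.mean(y_pred[y_true == c] == c)) for c in np.unique(y_true)]
    aa = float(np.mean(per_class))
    return oa, aa
```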
(IV) results of this example
The classification results obtained by the remote sensing hyperspectral image and lidar image fusion classification method and device provided by the invention and by a common multi-stream encoder are shown in Table 1 below.

Table 1
Method                          Overall classification accuracy    Average classification accuracy
The method of the invention     92.84%                             90.74%
Common multi-stream encoder     86.23%                             83.79%
The method of the invention fuses and classifies the hyperspectral image and the lidar image better, with fewer misclassified samples. In addition, when the hyperspectral image decomposition part of the method is removed and the experiment is repeated, the overall classification accuracy obtained is 85.49%, which shows that the method has strong information mining capability. In summary, the method can effectively improve the classifiability and the classification accuracy of multi-source remote sensing images.
In the following, a remote sensing hyperspectral image and lidar image fusion classification device disclosed by the second embodiment of the invention is introduced, and a remote sensing hyperspectral image and lidar image fusion classification device described below and a remote sensing hyperspectral image and lidar image fusion classification method described above can be referred to correspondingly.
Referring to fig. 2, an embodiment of the present invention provides a device for fusion and classification of a remote sensing hyperspectral image and a lidar image, including:
the data acquisition module 10, which is used for acquiring a hyperspectral image and a lidar image, wherein the category of each ground object in the two images is denoted label;
the image decomposition module 20, which is used for performing intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, and for selecting the surrounding neighborhood of size s × s as the neighborhood block of each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, wherein the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s × s × B and the neighborhood block of each lidar pixel in the lidar image L has size s × s;
the deep network training module 30, which is used for training the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 using the neighborhood blocks, wherein the inputs of L_1 and L_2 are hyperspectral intrinsic pixel blocks of size s × s × B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixel blocks of size s × s × B from the lidar image, and the inputs of L_5 and L_6 are hyperspectral illumination pixel blocks of size s × s × B from the hyperspectral illumination image; the outputs are O_1, O_2, O_3, O_4, O_5 and O_6, each of size s × s × d;
the image stitching module 40, which is used for concatenating the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise using a concatenation layer to obtain O_12, O_34 and O_56;
the multi-modal fusion module 50, which is used for inputting O_34 and O_56 into the 1st multi-modal grouped convolution layer to obtain the output O_3456^1; inputting O_34 into the 2nd multi-modal grouped convolution layer to obtain the output O_34^1; inputting O_56 into the 3rd multi-modal grouped convolution layer to obtain the output O_56^1; inputting O_34^1, O_3456^1 and O_56^1 into the 4th multi-modal grouped convolution layer to obtain the output O_3456^2; inputting O_34^1 into the 5th multi-modal grouped convolution layer to obtain the output O_34^2; inputting O_56^1 into the 6th multi-modal grouped convolution layer to obtain the output O_56^2; inputting O_34^2, O_3456^2 and O_56^2 into the 7th multi-modal grouped convolution layer to obtain the output O_3456^3; inputting O_34^2 into the 8th multi-modal grouped convolution layer to obtain the output O_34^3; inputting O_56^2 into the 9th multi-modal grouped convolution layer to obtain the output O_56^3; inputting O_34^3, O_3456^3 and O_56^3 into the 10th multi-modal grouped convolution layer to obtain the output O_3456^4; inputting O_12 and O_3456^1 into the 11th multi-modal grouped convolution layer to obtain the output O_12^1; inputting O_12^1 and O_3456^2 into the 12th multi-modal grouped convolution layer to obtain the output O_12^2; inputting O_12^2 and O_3456^3 into the 13th multi-modal grouped convolution layer to obtain the output O_12^3; and inputting O_3456^4 and O_12^3, each of size s × s × d, into a two-dimensional average pooling layer of size s × s to obtain O_3456^5 and O_12^4, each of size 1 × d;
the image classification module 60, which is used for inputting O_12^4 and O_3456^5 into a concatenation layer to obtain the output O_123456 of size 1 × 2d, and for inputting O_123456 into a fully connected layer to obtain the final output category.
In an embodiment of the present invention, the data obtaining module 10 includes a data preprocessing sub-module, and the data preprocessing sub-module is configured to perform normalization preprocessing on the hyperspectral image and the lidar image after selecting the hyperspectral image and the lidar image.
In an embodiment of the present invention, the image decomposition module 20 performs intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, including:
calculating, for each hyperspectral pixel H_i, where 1 ≤ i ≤ X × Y, the corresponding matrix D_i:
D_i = [H_1, ..., H_{i-1}, H_{i+1}, ..., H_{X×Y}, I_B] ∈ R^{B×(B+X×Y-1)}
wherein I_B is an identity matrix of size B × B;
calculating, based on the matrix D_i, the vector α_i corresponding to each hyperspectral pixel H_i:
min ||α_i||_1  s.t.  H_i = D_i α_i
wherein α_i has shape (B + X × Y - 1) × 1;
constructing a weight matrix W ∈ R^{(X×Y)×(X×Y)} and assigning the element W_ij in the i-th row and j-th column of the weight matrix according to the sparse coefficients α_i (the assignment formula is reproduced as an image in the source publication);
calculating the matrix G = (I_{X×Y} - W^T)(I_{X×Y} - W) + δ I_{X×Y} based on the weight matrix W, wherein I_{X×Y} is an identity matrix of size (X × Y) × (X × Y), δ is a constant, and T denotes the matrix transpose;
flattening the hyperspectral image H into a two-dimensional matrix and taking the logarithm to obtain log(flatten(H)), and calculating the matrix K = (I_B - 1_B 1_B^T / B) log(flatten(H)) (I_{X×Y} - 1_{X×Y} 1_{X×Y}^T / (X × Y)), wherein I_B and I_{X×Y} are identity matrices of size B × B and (X × Y) × (X × Y), and 1_B and 1_{X×Y} are all-ones vectors of size B × 1 and (X × Y) × 1, respectively;
calculating the matrix ρ = δKG^{-1} based on the matrices G and K, and obtaining, based on ρ, the intrinsic image RE = e^ρ and the illumination image SH = e^{log(H)-ρ} decomposed from the hyperspectral image H, wherein e is the natural constant and both images have size X × Y × B.
In an embodiment of the present invention, the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 each comprise multiple two-dimensional convolution layers; each two-dimensional convolution layer has d convolution kernels, the convolution kernel size is [3, 3], and the convolution kernel sliding stride is [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
The remote sensing hyperspectral image and lidar image fusion classification device is used for realizing the remote sensing hyperspectral image and lidar image fusion classification method, so that the specific implementation mode of the device can be seen from the embodiment part of the remote sensing hyperspectral image and lidar image fusion classification method in the foregoing, and therefore, the specific implementation mode can refer to the description of the corresponding partial embodiments, and the description is not expanded here.
In addition, since the remote sensing hyperspectral image and lidar image fusion classification device is used for realizing the remote sensing hyperspectral image and lidar image fusion classification method, the functions of the remote sensing hyperspectral image and lidar image fusion classification device correspond to those of the system, and the details are not repeated here.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be understood that the above examples are only for clarity of illustration and are not intended to limit the embodiments. Various other modifications and alterations will occur to those skilled in the art upon reading the foregoing description. It is neither necessary nor possible to enumerate all embodiments here, and obvious variations or modifications may be made without departing from the spirit or scope of the invention.

Claims (10)

1. A remote sensing hyperspectral image and laser radar image fusion classification method is characterized by comprising the following steps:
s1: acquiring a hyperspectral image and a laser radar image, wherein the category of each object in the two images is label;
s2: performing intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, and selecting a neighborhood with the surrounding size of s multiplied by s as a neighborhood block of each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, wherein the neighborhood block size of each hyperspectral pixel in the hyperspectral image is s multiplied by B, and the neighborhood block size of the lidar pixel in the lidar image L is s multiplied by s;
s3: training deep network branch L using neighborhood blocks 1 ,L 2 ,L 3 ,L 4 ,L 5 And L 6 Wherein L is 1 And L 2 The input of (a) is a hyperspectral intrinsic pixel with size of s × s × B in a hyperspectral intrinsic image, L 3 And L 4 The input of (a) is a lidar pixel of size sxsxsxsxb in a lidar image, L 5 And L 6 The input of the high spectrum illumination image is a high spectrum illumination pixel with the size of s multiplied by B, and the output is O 1 ,O 2 ,O 3 ,O 4 ,O 5 And O 6 The sizes are s multiplied by d;
s4: deep network tributaries L using a splice layer 1 ,L 2 ,L 3 ,L 4 ,L 5 And L 6 The output of (A) is spliced two by two to obtain O 12 、O 34 And O 56
S5: mixing O with 34 And O 56 Input to the 1 st multi-mode packet convolutional layer to obtain output O 34561 Introducing O into 34 Input into the 2 nd multi-mode packet convolution layer to obtain output O 34 1 Introducing O 56 Inputting to the 3 rd multi-mode packet convolution layer to obtain output O 56 1 Introducing O 34 1 、O 3456 1 And O 56 1 Inputting to the 4 th multi-mode packet convolution layer to obtain output O 3456 2 Introducing O 34 1 Input to the 5 th multi-mode packet convolution layer to obtain output O 34 2 Introducing O 56 1 Input into the 6 th multi-mode packet convolution layer to obtain output O 56 2 Introducing O 34 2 、O 3456 2 And O 56 2 Inputting to 7 th multi-mode packet convolution layer to obtain output O 3456 3 Introducing O into 34 2 Inputting into 8 th multi-mode packet convolution layer to obtain output O 34 3 Introducing O 56 2 Inputting the data into the 9 th multi-mode packet convolution layer to obtain the output O 56 3 Introducing O 34 3 、O 3456 3 And O 56 3 Input into the 10 th multi-mode packet convolution layer to obtain the output O 3456 4 Introducing O 12 And O 3456 1 Input into the 11 th multi-mode packet convolution layer to obtain the output O 12 1 Introducing O 12 1 And O 3456 2 Inputting into 12 th multi-mode packet convolution layer to obtain output O 12 2 Introducing O into 12 2 And O 3456 3 Inputting into 13 th multi-mode packet convolution layer to obtain output O 12 3 O of size s.times.s.times.d 3456 4 And O 12 3 Inputting a two-dimensional average pooling layer with the size of s multiplied by s to obtain O with the size of 1 multiplied by d 3456 5 And O 12 4
S6: inputting O_12^4 and O_3456^5 into a concatenation layer to obtain output O_123456 of size 1 × 2d, and inputting O_123456 into a fully connected layer to obtain the final predicted class (formula given as image FDA0003818659220000011).
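For illustration only, the following is a minimal PyTorch-style sketch of the fusion topology recited in steps S4–S6 of claim 1. The class and variable names (MMGroupConv, FusionHead), the 3 × 3 convolutions with padding, the ReLU activations, the channel counts and the use of the groups argument to stand in for a "multi-modal grouped convolution layer" are all assumptions made for this sketch, not details taken from the patent; only the wiring of the 13 layers, the s × s average pooling and the final fully connected layer follows the claim.

```python
# Sketch only: the grouped-convolution realisation and channel counts are assumptions.
import torch
import torch.nn as nn


class MMGroupConv(nn.Module):
    """One assumed 'multi-modal grouped convolution layer': concatenate the incoming
    feature maps along the channel axis, then apply a grouped 3x3 convolution."""

    def __init__(self, in_ch, out_ch, groups):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, groups=groups)

    def forward(self, *features):
        return torch.relu(self.conv(torch.cat(features, dim=1)))


class FusionHead(nn.Module):
    def __init__(self, d, s, num_classes):
        super().__init__()
        # O_12, O_34, O_56 each carry 2*d channels after the pairwise concatenation (S4).
        self.g1 = MMGroupConv(4 * d, d, groups=2)   # (O_34, O_56)              -> O_3456^1
        self.g2 = MMGroupConv(2 * d, d, groups=2)   # O_34                      -> O_34^1
        self.g3 = MMGroupConv(2 * d, d, groups=2)   # O_56                      -> O_56^1
        self.g4 = MMGroupConv(3 * d, d, groups=3)   # (O_34^1, O_3456^1, O_56^1) -> O_3456^2
        self.g5 = MMGroupConv(d, d, groups=1)       # O_34^1                    -> O_34^2
        self.g6 = MMGroupConv(d, d, groups=1)       # O_56^1                    -> O_56^2
        self.g7 = MMGroupConv(3 * d, d, groups=3)   #                           -> O_3456^3
        self.g8 = MMGroupConv(d, d, groups=1)       #                           -> O_34^3
        self.g9 = MMGroupConv(d, d, groups=1)       #                           -> O_56^3
        self.g10 = MMGroupConv(3 * d, d, groups=3)  #                           -> O_3456^4
        self.g11 = MMGroupConv(3 * d, d, groups=3)  # (O_12, O_3456^1)          -> O_12^1
        self.g12 = MMGroupConv(2 * d, d, groups=2)  # (O_12^1, O_3456^2)        -> O_12^2
        self.g13 = MMGroupConv(2 * d, d, groups=2)  # (O_12^2, O_3456^3)        -> O_12^3
        self.pool = nn.AvgPool2d(kernel_size=s)     # s x s average pooling (S5)
        self.fc = nn.Linear(2 * d, num_classes)     # final fully connected layer (S6)

    def forward(self, o12, o34, o56):
        a1 = self.g1(o34, o56)
        b1, c1 = self.g2(o34), self.g3(o56)
        a2 = self.g4(b1, a1, c1)
        b2, c2 = self.g5(b1), self.g6(c1)
        a3 = self.g7(b2, a2, c2)
        b3, c3 = self.g8(b2), self.g9(c2)
        a4 = self.g10(b3, a3, c3)
        e1 = self.g11(o12, a1)
        e2 = self.g12(e1, a2)
        e3 = self.g13(e2, a3)
        z = torch.cat([self.pool(e3).flatten(1), self.pool(a4).flatten(1)], dim=1)
        return self.fc(z)


# Toy shapes only: d = 12 keeps every channel count divisible by its group count.
d, s = 12, 7
net = FusionHead(d=d, s=s, num_classes=15)
o12, o34, o56 = (torch.randn(4, 2 * d, s, s) for _ in range(3))
print(net(o12, o34, o56).shape)  # torch.Size([4, 15])
```

Setting groups equal to the number of fused branches means each modality's channels are first filtered within their own group, which is one plausible reading of "multi-modal grouped convolution"; the patent may realise the grouping differently.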
2. The remote sensing hyperspectral image and lidar image fusion classification method according to claim 1, characterized in that: in step S1, after the hyperspectral image and the laser radar image are selected, both images are subjected to normalization preprocessing.
3. The remote sensing hyperspectral image and lidar image fusion classification method according to claim 1 or 2, characterized in that: the method for performing intrinsic image decomposition on the hyperspectral image to obtain the intrinsic image and the illumination image in the step S2 comprises the following steps:
S2.1: calculating, for each hyperspectral pixel H_i with 1 ≤ i ≤ X × Y, the corresponding matrix D_i:
D_i = [H_1, ..., H_(i-1), H_(i+1), ..., H_(X×Y), I_B] ∈ R^(B×(B+X×Y-1)),
where I_B is the identity matrix of size B × B;
S2.2: calculating, based on the matrix D_i, the vector α_i corresponding to each hyperspectral pixel H_i by solving
min ||α_i||_1  s.t.  H_i = D_i α_i,
where α_i has size (B+X×Y-1) × 1;
S2.3: constructing a weight matrix W ∈ R^((X×Y)×(X×Y)), assigning to the element W_ij in the i-th row and j-th column of the weight matrix the value given by formula image FDA0003818659220000021, and calculating, based on the weight matrix W, the matrix G = (I_(X×Y) - W^T)(I_(X×Y) - W) + δI_(X×Y), where I_(X×Y) is the identity matrix of size (X×Y)×(X×Y), δ is a constant, and the superscript T denotes the matrix transpose;
S2.4: transforming the hyperspectral image H into a two-dimensional matrix and taking the logarithm to obtain log(flatten(H)), then calculating the matrix K = (I_B - 1_B 1_B^T / B) log(flatten(H)) (I_(X×Y) - 1_(X×Y) 1_(X×Y)^T / (X×Y)), where I_B and I_(X×Y) are identity matrices of sizes B × B and (X×Y)×(X×Y), and 1_B and 1_(X×Y) are all-ones vectors of sizes B × 1 and (X×Y) × 1, respectively;
S2.5: calculating, based on the matrices G and K, the matrix ρ = δKG^(-1), and obtaining from ρ the intrinsic image RE = e^ρ and the illumination image SH = e^(log(H)-ρ) into which the hyperspectral image H is decomposed, where e is the natural constant and both images have size X × Y × B.
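As a worked illustration of the closed-form part of claim 3 (steps S2.3–S2.5), the following NumPy sketch builds W, G, K and ρ and recovers the intrinsic and illumination images. The sparse codes α_i from step S2.2 are assumed to be supplied (an L1 solver would normally produce them), the constant δ is chosen arbitrarily, and the W_ij assignment rule, which the claim gives only as formula image FDA0003818659220000021, is approximated here by copying the coefficients of α_i that correspond to the other pixels; all of these are assumptions for illustration, not the patented rule.

```python
# Sketch under stated assumptions; H is assumed strictly positive so the log is defined.
import numpy as np

def decompose(H, alphas, delta=1e-3):
    """H: hyperspectral cube of shape (X, Y, B); alphas[i]: sparse code of pixel i."""
    X, Y, B = H.shape
    n = X * Y
    Hf = H.reshape(n, B).T                      # flatten(H): B x (X*Y), one pixel per column

    # S2.3: weight matrix W and graph matrix G
    W = np.zeros((n, n))
    for i in range(n):
        coeff = alphas[i][: n - 1]              # coefficients for the other n-1 pixels
        W[i, :] = np.insert(coeff, i, 0.0)      # assumed rule: pixel i gets weight 0 for itself
    I_n = np.eye(n)
    G = (I_n - W.T) @ (I_n - W) + delta * I_n

    # S2.4: doubly centred log-image K
    I_B, one_B, one_n = np.eye(B), np.ones((B, 1)), np.ones((n, 1))
    logH = np.log(Hf)
    K = (I_B - one_B @ one_B.T / B) @ logH @ (I_n - one_n @ one_n.T / n)

    # S2.5: rho = delta * K * G^{-1}, then RE = e^rho and SH = e^{log(H) - rho}
    rho = delta * K @ np.linalg.inv(G)
    RE = np.exp(rho).T.reshape(X, Y, B)
    SH = np.exp(logH - rho).T.reshape(X, Y, B)
    return RE, SH

# Toy data only, to show the expected shapes (X=Y=8, B=5).
H = np.abs(np.random.rand(8, 8, 5)) + 0.1
alphas = [0.01 * np.random.rand(8 * 8 - 1 + 5) for _ in range(64)]
RE, SH = decompose(H, alphas)
print(RE.shape, SH.shape)  # (8, 8, 5) (8, 8, 5)
```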
4. The remote sensing hyperspectral image and lidar image fusion classification method according to claim 1, characterized in that: the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 in step S3 each comprise multiple two-dimensional convolution layers, each layer having d convolution kernels of size [3,3] with sliding stride [1,1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
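A hedged sketch of the branch structure in claim 4 follows. The kernel count d, kernel size [3,3], stride [1,1] and the L_2/L_3 and L_4/L_5 weight sharing follow the claim; the number of layers, the padding, the ReLU activations and the choice of B input channels for every branch are assumptions.

```python
# Sketch only: layer count, padding, activation and input channel counts are assumed.
import torch.nn as nn

def make_branch(in_channels: int, d: int, num_layers: int = 3) -> nn.Sequential:
    layers, c = [], in_channels
    for _ in range(num_layers):
        layers += [nn.Conv2d(c, d, kernel_size=3, stride=1, padding=1), nn.ReLU()]
        c = d
    return nn.Sequential(*layers)

B, d = 144, 12                  # hypothetical band count and kernel count
L1 = make_branch(B, d)          # hyperspectral intrinsic input
L2 = L3 = make_branch(B, d)     # one instance reused: L2 and L3 share all weights
L4 = L5 = make_branch(B, d)     # one instance reused: L4 and L5 share all weights
L6 = make_branch(B, d)          # hyperspectral illumination input
# Weight sharing is realised by reusing a single module instance, so the paired
# branches literally hold the same parameter tensors; B input channels are assumed
# for every branch, consistent with the claimed s x s x B input sizes.
```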
5. The remote sensing hyperspectral image and lidar image fusion classification method according to claim 1, characterized in that: the loss function constructed for training the deep network branches in step S3 is given by formula image FDA0003818659220000022.
6. The remote sensing hyperspectral image and lidar image fusion classification method according to claim 1 or 4, characterized in that: in step S4, splicing the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise using the concatenation layer to obtain O_12, O_34 and O_56 is expressed as O_12 = Concatenation(O_1, O_2), O_34 = Concatenation(O_3, O_4), O_56 = Concatenation(O_5, O_6).
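For concreteness, the pairwise splicing of claim 6 can be read as channel-axis concatenation of the branch outputs; the batch size, the shapes and the choice of the channel axis below are assumptions for illustration.

```python
# Assumed reading of Concatenation(.,.) as channel-axis concatenation.
import torch

d, s = 12, 7
O1, O2, O3, O4, O5, O6 = (torch.randn(4, d, s, s) for _ in range(6))
O12 = torch.cat([O1, O2], dim=1)   # -> [4, 2*d, s, s]
O34 = torch.cat([O3, O4], dim=1)
O56 = torch.cat([O5, O6], dim=1)
```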
7. A remote sensing hyperspectral image and laser radar image fusion classification device, characterized by comprising:
a data acquisition module for acquiring a hyperspectral image and a laser radar image, wherein the category of each object in the two images is labeled;
an image decomposition module for performing intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, and for selecting the surrounding s × s neighborhood of each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel as its neighborhood block, wherein the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s × s × B and the neighborhood block of each lidar pixel in the lidar image L has size s × s;
a deep network training module for training deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 using the neighborhood blocks, wherein the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s × s × B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixels of size s × s × B from the lidar image, the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s × s × B from the hyperspectral illumination image, and the outputs are O_1, O_2, O_3, O_4, O_5 and O_6, each of size s × s × d;
an image splicing module for splicing the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise using a concatenation layer to obtain O_12, O_34 and O_56;
a multi-modal fusion module for inputting O_34 and O_56 into the 1st multi-modal grouped convolution layer to obtain output O_3456^1; inputting O_34 into the 2nd multi-modal grouped convolution layer to obtain output O_34^1; inputting O_56 into the 3rd multi-modal grouped convolution layer to obtain output O_56^1; inputting O_34^1, O_3456^1 and O_56^1 into the 4th multi-modal grouped convolution layer to obtain output O_3456^2; inputting O_34^1 into the 5th multi-modal grouped convolution layer to obtain output O_34^2; inputting O_56^1 into the 6th multi-modal grouped convolution layer to obtain output O_56^2; inputting O_34^2, O_3456^2 and O_56^2 into the 7th multi-modal grouped convolution layer to obtain output O_3456^3; inputting O_34^2 into the 8th multi-modal grouped convolution layer to obtain output O_34^3; inputting O_56^2 into the 9th multi-modal grouped convolution layer to obtain output O_56^3; inputting O_34^3, O_3456^3 and O_56^3 into the 10th multi-modal grouped convolution layer to obtain output O_3456^4; inputting O_12 and O_3456^1 into the 11th multi-modal grouped convolution layer to obtain output O_12^1; inputting O_12^1 and O_3456^2 into the 12th multi-modal grouped convolution layer to obtain output O_12^2; inputting O_12^2 and O_3456^3 into the 13th multi-modal grouped convolution layer to obtain output O_12^3; and inputting O_3456^4 and O_12^3, each of size s × s × d, into a two-dimensional average pooling layer of size s × s to obtain O_3456^5 and O_12^4, each of size 1 × d;
an image classification module for inputting O_12^4 and O_3456^5 into a concatenation layer to obtain output O_123456 of size 1 × 2d, and inputting O_123456 into a fully connected layer to obtain the final predicted class (formula given as image FDA0003818659220000031).
8. The remote sensing hyperspectral image and lidar image fusion classification device according to claim 7, characterized in that: the data acquisition module comprises a data preprocessing submodule for performing normalization preprocessing on the hyperspectral image and the laser radar image after the two images are selected.
9. The remote sensing hyperspectral image and lidar image fusion classification device according to claim 7 or 8, characterized in that: the image decomposition module performs intrinsic image decomposition on the hyperspectral image to obtain the intrinsic image and the illumination image by:
calculating, for each hyperspectral pixel H_i with 1 ≤ i ≤ X × Y, the corresponding matrix D_i:
D_i = [H_1, ..., H_(i-1), H_(i+1), ..., H_(X×Y), I_B] ∈ R^(B×(B+X×Y-1)),
where I_B is the identity matrix of size B × B;
calculating, based on the matrix D_i, the vector α_i corresponding to each hyperspectral pixel H_i by solving
min ||α_i||_1  s.t.  H_i = D_i α_i,
where α_i has size (B+X×Y-1) × 1;
constructing a weight matrix W ∈ R^((X×Y)×(X×Y)), assigning to the element W_ij in the i-th row and j-th column of the weight matrix the value given by formula image FDA0003818659220000041, and calculating, based on the weight matrix W, the matrix G = (I_(X×Y) - W^T)(I_(X×Y) - W) + δI_(X×Y), where I_(X×Y) is the identity matrix of size (X×Y)×(X×Y), δ is a constant, and the superscript T denotes the matrix transpose;
transforming the hyperspectral image H into a two-dimensional matrix and taking the logarithm to obtain log(flatten(H)), then calculating the matrix K = (I_B - 1_B 1_B^T / B) log(flatten(H)) (I_(X×Y) - 1_(X×Y) 1_(X×Y)^T / (X×Y)), where I_B and I_(X×Y) are identity matrices of sizes B × B and (X×Y)×(X×Y), and 1_B and 1_(X×Y) are all-ones vectors of sizes B × 1 and (X×Y) × 1, respectively;
calculating, based on the matrices G and K, the matrix ρ = δKG^(-1), and obtaining from ρ the intrinsic image RE = e^ρ and the illumination image SH = e^(log(H)-ρ) into which the hyperspectral image H is decomposed, where e is the natural constant and both images have size X × Y × B.
10. The remote sensing hyperspectral image and lidar image fusion classification device according to claim 7, characterized in that: the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 each comprise multiple two-dimensional convolution layers, each layer having d convolution kernels of size [3,3] with sliding stride [1,1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
CN202211037953.6A 2022-08-26 2022-08-26 Fusion classification method and device for remote sensing hyperspectral image and laser radar image Pending CN115331110A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211037953.6A CN115331110A (en) 2022-08-26 2022-08-26 Fusion classification method and device for remote sensing hyperspectral image and laser radar image
PCT/CN2022/142160 WO2024040828A1 (en) 2022-08-26 2022-12-27 Method and device for fusion and classification of remote sensing hyperspectral image and laser radar image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211037953.6A CN115331110A (en) 2022-08-26 2022-08-26 Fusion classification method and device for remote sensing hyperspectral image and laser radar image

Publications (1)

Publication Number Publication Date
CN115331110A true CN115331110A (en) 2022-11-11

Family

ID=83928217

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211037953.6A Pending CN115331110A (en) 2022-08-26 2022-08-26 Fusion classification method and device for remote sensing hyperspectral image and laser radar image

Country Status (2)

Country Link
CN (1) CN115331110A (en)
WO (1) WO2024040828A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116167955A (en) * 2023-02-24 2023-05-26 苏州大学 Hyperspectral and laser radar image fusion method and system for remote sensing field
CN116740457A (en) * 2023-06-27 2023-09-12 苏州大学 Hyperspectral image and laser radar image fusion classification method and system
WO2024040828A1 (en) * 2022-08-26 2024-02-29 苏州大学 Method and device for fusion and classification of remote sensing hyperspectral image and laser radar image

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117830752B (en) * 2024-03-06 2024-05-07 昆明理工大学 Self-adaptive space-spectrum mask graph convolution method for multi-spectrum point cloud classification
CN117934978B (en) * 2024-03-22 2024-06-11 安徽大学 Hyperspectral and laser radar multilayer fusion classification method based on countermeasure learning


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090318815A1 (en) * 2008-05-23 2009-12-24 Michael Barnes Systems and methods for hyperspectral medical imaging
CN112967350B (en) * 2021-03-08 2022-03-18 哈尔滨工业大学 Hyperspectral remote sensing image eigen decomposition method and system based on sparse image coding
CN114742985A (en) * 2022-03-17 2022-07-12 苏州大学 Hyperspectral feature extraction method and device and storage medium
CN115331110A (en) * 2022-08-26 2022-11-11 苏州大学 Fusion classification method and device for remote sensing hyperspectral image and laser radar image

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200083902A1 (en) * 2016-12-13 2020-03-12 Idletechs As Method for handling multidimensional data
CN109993220A (en) * 2019-03-23 2019-07-09 西安电子科技大学 Multi-source Remote Sensing Images Classification method based on two-way attention fused neural network
CN112130169A (en) * 2020-09-23 2020-12-25 广东工业大学 Point cloud level fusion method for laser radar data and hyperspectral image
CN112819959A (en) * 2021-01-22 2021-05-18 哈尔滨工业大学 Hyperspectral image and laser radar data intrinsic hyperspectral point cloud generation method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIN XD, GU YF: "Intrinsic Scene Properties from Hyperspectral Images and LiDAR", 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 9 April 2020 (2020-04-09), pages 1423 - 1431 *
TONG Qingxi; ZHANG Bing; ZHANG Lifu: "Current progress of hyperspectral remote sensing in China", Journal of Remote Sensing (遥感学报), no. 05, 25 September 2016 (2016-09-25) *


Also Published As

Publication number Publication date
WO2024040828A1 (en) 2024-02-29

Similar Documents

Publication Publication Date Title
CN115331110A (en) Fusion classification method and device for remote sensing hyperspectral image and laser radar image
CN108154194B (en) Method for extracting high-dimensional features by using tensor-based convolutional network
CN108491849B (en) Hyperspectral image classification method based on three-dimensional dense connection convolution neural network
CN111626300B (en) Image segmentation method and modeling method of image semantic segmentation model based on context perception
CN111325155B (en) Video motion recognition method based on residual difference type 3D CNN and multi-mode feature fusion strategy
CN108615010B (en) Facial expression recognition method based on parallel convolution neural network feature map fusion
WO2021082480A1 (en) Image classification method and related device
CN109063724B (en) Enhanced generation type countermeasure network and target sample identification method
CN111126256A (en) Hyperspectral image classification method based on self-adaptive space-spectrum multi-scale network
CN111310598B (en) Hyperspectral remote sensing image classification method based on 3-dimensional and 2-dimensional mixed convolution
CN113486851A (en) Hyperspectral image classification method based on double-branch spectrum multi-scale attention network
CN109934295B (en) Image classification and reconstruction method based on transfinite hidden feature learning model
CN107679539B (en) Single convolution neural network local information and global information integration method based on local perception field
CN110706214A (en) Three-dimensional U-Net brain tumor segmentation method fusing condition randomness and residual error
CN111915545A (en) Self-supervision learning fusion method of multiband images
CN111860683A (en) Target detection method based on feature fusion
CN113780147A (en) Lightweight hyperspectral ground object classification method and system with dynamic fusion convolution network
CN114463341A (en) Medical image segmentation method based on long and short distance features
CN113420838A (en) SAR and optical image classification method based on multi-scale attention feature fusion
CN114742985A (en) Hyperspectral feature extraction method and device and storage medium
CN116416441A (en) Hyperspectral image feature extraction method based on multi-level variational automatic encoder
CN116563606A (en) Hyperspectral image classification method based on dual-branch spatial spectrum global feature extraction network
CN108564116A (en) A kind of ingredient intelligent analysis method of camera scene image
CN113450297A (en) Fusion model construction method and system for infrared image and visible light image
Picard et al. An efficient system for combining complementary kernels in complex visual categorization tasks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination