CN114758170A - Three-branch three-attention mechanism hyperspectral image classification method combined with D3D - Google Patents
- Publication number: CN114758170A
- Application number: CN202210344115.7A (filed 2022)
- Authority
- CN
- China
- Prior art keywords
- branch
- model
- attention
- spatial
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/10—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
Abstract
The invention belongs to the technical field of spectral image classification and discloses a three-branch three-attention mechanism hyperspectral image classification method combined with D3D. A three-branch three-attention mechanism network combined with deformable 3D convolution, D3DTBTA-Net, is constructed to extract the spectral information and spatial information of a hyperspectral image. D3DTBTA-Net is divided into a spectral branch, a spatial X branch and a spatial Y branch; after a spectral feature map, a spatial X feature map and a spatial Y feature map are respectively extracted, the feature maps extracted by the three branches are fused and classified. The method classifies automatically according to the trained deep learning model, without requiring any input parameters and without spending a large amount of time and labor to label data; the deformable 3D convolution and the three-branch three-attention mechanism extract more discriminative features, which improves classification accuracy and maintains good classification performance even when the number of training samples is limited.
Description
Technical Field
The invention belongs to the technical field of spectral image classification, and particularly relates to a three-branch three-attention mechanism hyperspectral image classification method combined with D3D.
Background
At present, hyperspectral images have nanoscale spectral resolution, can reflect subtle differences between ground objects in the spectral dimension, and greatly improve the ability to resolve and identify ground objects. Hyperspectral image classification uses the rich information contained in a hyperspectral image to assign a unique class label to each pixel, and is an important aspect of hyperspectral image applications. However, hyperspectral data is high-dimensional, and the phenomena of the same object showing different spectra and different objects showing the same spectrum make the image data structure highly nonlinear, with strong correlation between adjacent bands and adjacent pixels. Meanwhile, labels in hyperspectral images are scarce and the number of training samples is often limited, so the curse of dimensionality easily arises. Therefore, how to extract strongly discriminative features and achieve accurate classification with small samples is the key to hyperspectral image classification.
Traditional machine learning methods generally use only spectral information and neglect the rich spatial information of hyperspectral images, resulting in low classification accuracy; in addition, a great deal of time and labor is required to label data. When a convolutional neural network, or a deeper improved network based on it, extracts features, the sampling positions of the convolution kernel are usually fixed and the size of the receptive field cannot be dynamically adjusted to the actual content of the image, so features cannot be extracted well and classification performance is limited. Moreover, existing hyperspectral image classification methods based on deep learning have low classification accuracy on small-sample data. Therefore, it is necessary to design a new hyperspectral image classification method to overcome the defects of the prior art.
Through the above analysis, the problems and defects of the prior art are as follows:
(1) the traditional machine learning method only utilizes spectral information and ignores rich spatial information of hyperspectral images, so that the classification precision is low, and a large amount of time cost and labor cost are needed to label data.
(2) The hyperspectral image classification method based on deep learning has low classification precision on small sample data.
(3) When a convolutional neural network, or a deeper improved network based on it, extracts features, the sampling positions of the convolution kernel are usually fixed and the size of the receptive field cannot be dynamically adjusted to the actual content of the image, so features cannot be extracted well and classification performance is limited.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a three-branch three-attention mechanism hyperspectral image classification method combined with D3D, and particularly relates to D3DTBTA-Net: a hyperspectral image classification method, system, medium, device and terminal of a three-branch three-attention mechanism combined with D3D, aiming at solving the problem that small-sample hyperspectral image classification algorithms in the prior art have low classification accuracy.
The invention is realized in such a way that a three-branch three-attention mechanism hyperspectral image classification method combined with D3D comprises the following steps: constructing a three-branch three-attention mechanism network D3DTBTA-Net combined with deformable 3D convolution for extracting spectral information and spatial information of the hyperspectral image; the three-branch three-attention mechanism network D3DTBTA-Net respectively extracts a spectral feature map, a spatial X feature map and a spatial Y feature map by utilizing three branches, and performs feature map fusion and classification; the three branches are a spectral branch, a spatial X branch and a spatial Y branch.
Further, the three-branch three-attention hyperspectral image classification method combined with the D3D comprises the following steps of:
step one, generating a data set: generating a set of three-dimensional cubes, and randomly dividing the set of three-dimensional cubes into a training set, a verification set and a test set;
step two, training a model and verifying the model: the training set is used for updating parameters of multiple iterations, and the verification set is used for monitoring the performance of the model and selecting the model which is best trained;
step three, prediction: and selecting a test set to verify the effectiveness of the training model to obtain a classification result.
Further, the data set generation in the first step includes:
selecting a central pixel x_i and its p×p neighboring pixels from the raw data, generating a set of three-dimensional cubes of size p×p×b; if the target pixel is located at the edge of the image, the missing neighboring pixel values are set to zero. In the D3DTBTA-Net algorithm, p is the patch size, which is set to 9, and b is the number of spectral bands. The set of three-dimensional cubes is randomly divided into a training set X_train, a validation set X_val and a test set X_test, and the corresponding label vectors are divided into Y_train, Y_val and Y_test; only the spatial information around the target pixel is used.
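The data set generation step above can be sketched as follows. This is a NumPy toy illustration, not the patented implementation; the function names `extract_patches` and `split_indices` and the split fractions are hypothetical:

```python
import numpy as np

def extract_patches(cube, patch_size=9):
    # cube: (H, W, b) hyperspectral image; returns (H*W, p, p, b) patches,
    # zero-padding the border so edge pixels also get a full p x p neighborhood.
    h, w, b = cube.shape
    r = patch_size // 2
    padded = np.pad(cube, ((r, r), (r, r), (0, 0)), mode="constant")
    patches = np.empty((h * w, patch_size, patch_size, b), dtype=cube.dtype)
    for i in range(h):
        for j in range(w):
            patches[i * w + j] = padded[i:i + patch_size, j:j + patch_size, :]
    return patches

def split_indices(n, train_frac=0.1, val_frac=0.1, seed=0):
    # Random disjoint train / validation / test split of sample indices.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_tr, n_val = int(n * train_frac), int(n * val_frac)
    return idx[:n_tr], idx[n_tr:n_tr + n_val], idx[n_tr + n_val:]
```

The padded extraction gives every pixel, including those on the image border, a complete p×p×b cube, matching the zero-filling rule described above.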
Further, the training model and the verification model in the second step include:
training a model and verifying the model by using a D3DTBTA-Net algorithm, wherein the D3DTBTA-Net algorithm is divided into three branches: the spectrum branch, the space X branch and the space Y branch are respectively used for capturing a spectrum characteristic diagram, a space X characteristic diagram and a space Y characteristic diagram and fusing the three acquired characteristic diagrams for classification; wherein the spectrum branch comprises a Dense spectrum block and a spectrum attention block; the space X branch comprises a Dense space X block and a space X attention block; the space Y branch contains a Dense space Y block and a space Y attention block.
Further, the following basic modules are used in the second step:
(1) 3D-CNN with BN: the 3D-CNN with BN is a common element in deep learning models based on 3D cubes. For n_m input feature maps of size p_m×p_m×b_m, a 3D-CNN layer containing k_{m+1} kernels of size α_{m+1}×α_{m+1}×d_{m+1} produces n_{m+1} output feature maps of size p_{m+1}×p_{m+1}×b_{m+1}. The ith output of the (m+1)th 3D-CNN layer with BN is calculated as:

x̂_j^m = (x_j^m − E(x_j^m)) / √(Var(x_j^m) + ε);
x_i^{m+1} = R(Σ_j x̂_j^m ∗ w_i^{m+1} + b_i^{m+1});

where x_j^m is the jth input feature map of layer (m+1) and x̂_j^m is its output after the BN of layer m; E(·) and Var(·) represent the expectation and variance functions of the input; w_i^{m+1} and b_i^{m+1} represent the weights and bias of the (m+1)th 3D-CNN layer, ∗ is the 3D convolution operation, and R(·) is the activation function that introduces nonlinearity into the network.
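The batch-normalization step that precedes the convolution can be sketched as follows, as a minimal NumPy illustration; treating the whole array as a single feature map and using scalar gamma/beta is a simplifying assumption, not the patented layer:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # x_hat = (x - E(x)) / sqrt(Var(x) + eps), then scale by gamma, shift by beta.
    x_hat = (x - x.mean()) / np.sqrt(x.var() + eps)
    return gamma * x_hat + beta
```

After normalization the feature map has approximately zero mean and unit variance, which is what makes deeper stacks of 3D convolutions trainable.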
(2) DenseNet dense connection: the dense block is the basic unit of DenseNet, and the output of the lth dense block is calculated as:

x_l = H_l([x_0, x_1, ..., x_{l−1}]);

where H_l is a composite block containing a convolutional layer, an activation layer and a BN layer, and [x_0, x_1, ..., x_{l−1}] denotes the concatenation of the feature maps generated by all preceding blocks; the more connections there are, the more information flows through the dense network. A dense network with L layers has L(L+1)/2 connections, whereas a conventional convolutional network with the same number of layers has only L direct connections.
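The dense-connection pattern can be sketched as follows. This is a NumPy toy model in which each layer H_l is an arbitrary callable; the name `dense_block` is hypothetical, and in the real network the layers are convolution + activation + BN blocks:

```python
import numpy as np

def dense_block(x0, layers):
    # Each layer H_l receives the channel-wise concatenation [x_0, ..., x_{l-1}]
    # of all preceding feature maps; its own output is appended to the list.
    features = [x0]
    for H in layers:
        features.append(H(np.concatenate(features, axis=-1)))
    return np.concatenate(features, axis=-1)
```

Because every layer sees every earlier output, L layers yield L(L+1)/2 direct connections, as stated above.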
(3) Attention mechanism:

Spectral attention block: the spectral attention map X ∈ R^{c×c} is computed directly from the initial input A ∈ R^{p×p×c}, where p×p is the size of the input block and c is the number of input channels. A (reshaped to R^{c×n} with n = p×p) is matrix-multiplied with A^T, and a softmax layer is connected to obtain the channel attention map X ∈ R^{c×c}:

x_ji = exp(A_i · A_j) / Σ_{i=1}^{c} exp(A_i · A_j);

where x_ji represents the influence of the ith channel on the jth channel. The result of the matrix multiplication of X^T with A is reshaped to R^{p×p×c}, weighted by the scale parameter α, and added to the input A to obtain the final spectral attention map E ∈ R^{p×p×c}:

E_j = α Σ_{i=1}^{c} (x_ji A_i) + A_j;

where α is initialized to zero and learned step by step. The final map E contains a weighted sum of all channel features, describing long-range dependencies and enhancing the discriminability of the features.
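A minimal sketch of the spectral (channel) attention computation above, assuming the input has already been reshaped to a (c, n) matrix with channels as rows (NumPy toy code; the function names are illustrative, not from the patent):

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spectral_attention(A, alpha=0.0):
    # A: (c, n) input, n = p*p flattened spatial positions.
    # X[j, i] weights the influence of channel i on channel j;
    # output E = alpha * (X @ A) + A, a residual re-weighting of the channels.
    X = softmax(A @ A.T, axis=-1)
    return alpha * (X @ A) + A
```

With alpha initialized to zero the block starts as an identity mapping and gradually learns how much channel context to mix in, matching the description above.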
Spatial attention block: given an input feature map A ∈ R^{c×p×p}, two convolutional layers generate new feature maps B and C respectively, where B, C ∈ R^{c′×p×p}. B and C are reshaped to R^{c′×n}, where n = p×p is the number of pixels. Matrix multiplication is performed between B^T and C, a softmax layer is added, and the spatial attention map S ∈ R^{n×n} is calculated as:

s_ji = exp(B_i · C_j) / Σ_{i=1}^{n} exp(B_i · C_j);

where s_ji represents the influence of the ith pixel on the jth pixel; the closer the feature representations of two pixels, the stronger the correlation between them.

Meanwhile, the initial input feature A is fed into a convolutional layer to obtain a new feature map D ∈ R^{c×p×p}, which is reshaped to R^{c×n}. Matrix multiplication is performed between D and S^T, and the result is reshaped back to R^{c×p×p}:

E_j = β Σ_{i=1}^{n} (s_ji D_i) + A_j;

where the initial value of β is zero and more weight is gradually learned and assigned. The final feature E ∈ R^{c×p×p} is obtained by adding the weighted features at all positions to the original features, so the context information in the spatial dimension is modeled in E.
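A matching sketch of the spatial (position) attention computation, again on matrices already flattened to n = p×p pixels; B, C, D stand in for the convolutional projections, which are assumed given here (NumPy toy code, not the patented implementation):

```python
import numpy as np

def softmax(z, axis=0):
    # Numerically stable softmax along the given axis.
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(A, B, C, D, beta=0.0):
    # A, D: (c, n); B, C: (c1, n); n = p*p pixels.
    # S[i, j] weights the influence of pixel i on pixel j (softmax over i),
    # and the output is a residual re-weighting: E = beta * (D @ S) + A.
    S = softmax(B.T @ C, axis=0)
    return beta * (D @ S) + A
```

As with the spectral block, beta = 0 makes this an identity at the start of training; the attention contribution grows as beta is learned.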
(4) Deformable 3D convolution: deformable convolution dynamically adjusts the size of the receptive field according to the actual content of the image. An input feature of size C×H×W is passed through a 3D-CNN of kernel size p×q×r to generate an offset feature of size 3N×C×H×W, where N = p×q×r is the size of the sampling grid; along the channel dimension there are 3N values, representing the deformation values of the D3D sampling grid. The learned offset features are applied to deform the 3D-CNN sampling grid into the D3D sampling grid, which is then used to generate the output features.

D3D is represented by the following formula:

y(p_0) = Σ_{p_n ∈ G} w(p_n) · x(p_0 + p_n + Δp_n);

where Δp_n represents the offset corresponding to the nth value of the p×q×r convolution sampling grid G; since the sampling position p_0 + p_n + Δp_n is generally fractional, bilinear interpolation is used to generate the exact value. The bilinear interpolation formula is:

x(p) = Σ_q G(q, p) · x(q), with G(q, p) = g(q_x, p_x) · g(q_y, p_y) · g(q_z, p_z) and g(a, b) = max(0, 1 − |a − b|).
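The interpolation step can be sketched as follows. The patent calls the kernel bilinear; over a 3D grid the same per-axis product of g(a, b) = max(0, 1 − |a − b|) yields trilinear weighting, which is what this NumPy toy implements (function names are illustrative):

```python
import numpy as np

def g(q, p):
    # 1-D interpolation kernel g(a, b) = max(0, 1 - |a - b|).
    return max(0.0, 1.0 - abs(q - p))

def sample_at(x, p):
    # Sample volume x at a fractional position p = (pz, py, px): each integer
    # neighbour q contributes x[q] weighted by the product of per-axis kernels.
    z0, y0, x0 = (int(np.floor(c)) for c in p)
    acc = 0.0
    for qz in (z0, z0 + 1):
        for qy in (y0, y0 + 1):
            for qx in (x0, x0 + 1):
                if (0 <= qz < x.shape[0] and 0 <= qy < x.shape[1]
                        and 0 <= qx < x.shape[2]):
                    acc += g(qz, p[0]) * g(qy, p[1]) * g(qx, p[2]) * x[qz, qy, qx]
    return acc
```

This is exactly the mechanism that lets the learned offsets Δp_n move the sampling grid to non-integer positions while keeping the operation differentiable.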
(5) Mish activation function: the activation function adopted by D3DTBTA is Mish, a self-regularized non-monotonic activation function, instead of the traditional relu(x) = max(0, x). The formula of Mish is:

mish(x) = x × tanh(softplus(x)) = x × tanh(ln(1 + e^x));

where x represents the input. Mish is unbounded above, and its lower bound is approximately −0.31, i.e. its range is [≈ −0.31, ∞). The derivative of Mish is:

mish′(x) = e^x · ω / δ^2, where ω = 4(x + 1) + 4e^{2x} + e^{3x} + e^x(4x + 6) and δ = 2e^x + e^{2x} + 2.
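The Mish formula above can be written directly, as a minimal stdlib sketch:

```python
import math

def mish(x):
    # mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + e^x)).
    return x * math.tanh(math.log1p(math.exp(x)))
```

The function passes through the origin, behaves like the identity for large positive inputs, and is bounded below near −0.31, consistent with the range stated above.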
(6) Selection of the optimal weights: during training, the model with the highest accuracy on the validation set is selected as the output; if the accuracies on the validation set are equal, the model with the smallest loss on the validation set is selected. The best model is saved at each iteration; if the model of the next iteration is better, it replaces the previously saved model, otherwise the saved model is kept.
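The checkpoint-selection rule above (highest validation accuracy, ties broken by lowest validation loss) can be sketched as follows; the dictionary keys `val_acc` and `val_loss` are illustrative names, not from the patent:

```python
def is_better(candidate, best):
    # Prefer higher validation accuracy; break ties with lower validation loss.
    if best is None:
        return True
    if candidate["val_acc"] != best["val_acc"]:
        return candidate["val_acc"] > best["val_acc"]
    return candidate["val_loss"] < best["val_loss"]
```

Calling `is_better(metrics, saved)` after each iteration and overwriting the saved checkpoint when it returns True reproduces the replace-or-keep behaviour described above.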
The learning rate is dynamically adjusted by the cosine annealing method shown in the following formula:

η_i = η_min + (1/2)(η_max − η_min)(1 + cos(π · T_cur / T_i));

where η_i is the learning rate of the ith iteration, and η_min and η_max are the minimum and maximum learning rates; T_cur counts the number of iterations already performed, and T_i controls the number of iterations performed in one adjustment period.
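The cosine annealing schedule is a one-line computation; a minimal sketch:

```python
import math

def cosine_annealing(eta_max, eta_min, t_cur, t_i):
    # eta = eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * T_cur / T_i)).
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))
```

The rate starts at eta_max, decays smoothly along a half cosine, and reaches eta_min at the end of each adjustment period of T_i iterations.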
Further, the predicting in step three includes:
The HSI data set A is composed of N labeled pixels {x_1, x_2, ..., x_N} ∈ R^{1×1×p}, where p is the number of bands, and the corresponding class label set is {y_1, y_2, ..., y_N} ∈ R^{1×1×q}, where q is the number of land cover categories.
In HSI classification, the quantitative measure of the difference between the predicted result and the true value is the cross-entropy loss function, defined as:

L(ŷ, y) = − Σ_{i=1}^{L} y_i log(ŷ_i);

where ŷ = [ŷ_1, ŷ_2, ..., ŷ_L] is the label vector predicted by the model and y = [y_1, y_2, ..., y_L] represents the true label vector.
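The cross-entropy loss above, sketched over plain label vectors (the epsilon guard against log(0) is an implementation detail added here, not part of the patent formula):

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    # L(y, yhat) = -sum_i y_i * log(yhat_i); eps guards against log(0).
    return -sum(y * math.log(p + eps) for y, p in zip(y_true, y_pred))
```

For a one-hot true label the loss reduces to the negative log-probability assigned to the correct class.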
Another object of the present invention is to provide a three-branch three-attention hyperspectral image classification system combined with D3D, which includes:
the data set generating module is used for generating a set of three-dimensional cubes and randomly dividing the set of three-dimensional cubes into a training set, a verification set and a test set;
the model training and verifying module is used for updating parameters of multiple iterations through a training set, monitoring the performance of the model by using a verifying set and selecting the model with the best training;
and the prediction module is used for selecting the test set to verify the effectiveness of the training model and obtain a classification result.
It is a further object of the invention to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of:
constructing a three-branch three-attention mechanism network D3DTBTA-Net combined with deformable 3D convolution for extracting the spectral information and spatial information of the hyperspectral image; the D3DTBTA-Net is divided into three branches: a spectral branch, a spatial X branch and a spatial Y branch; after the spectral feature map, the spatial X feature map and the spatial Y feature map are respectively extracted, the feature maps extracted by the three branches are fused for classification.
It is another object of the present invention to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
constructing a three-branch three-attention mechanism network D3DTBTA-Net combined with deformable 3D convolution for extracting the spectral information and spatial information of a hyperspectral image; the D3DTBTA-Net is divided into three branches: after the spectral branch, the spatial X branch and the spatial Y branch respectively extract the spectral feature map, the spatial X feature map and the spatial Y feature map, the feature maps extracted by the three branches are fused and classified.
Another object of the present invention is to provide an information data processing terminal for implementing the three-branch three-attention hyperspectral image classification system in combination with D3D.
In combination with the technical solutions and the technical problems to be solved, please analyze the advantages and positive effects of the technical solutions to be protected in the present invention from the following aspects:
First, regarding the technical problems in the prior art and the difficulty of solving them: the technical problems addressed by the technical solution of the present invention are closely combined with the results and data obtained during research and development, and some creative technical effects are brought about after the problems are solved. The specific description is as follows:
the invention provides a three-branch three-attention mechanism network D3DTBTA-Net combined with deformable 3D convolution, which can enhance feature extraction and fully extract spectral information and spatial information of a hyperspectral image, thereby improving the classification precision of the hyperspectral image on the premise of a small sample. The D3DTBTA-Net of the invention is divided into three branches: and respectively extracting a spectral feature map, a spatial X feature map and a spatial Y feature map from the spectral branch, the spatial X branch and the spatial Y branch, and fusing the feature maps extracted from the three branches for classification. A comparison experiment with other classification methods shows that the D3DTBTA-Net is suitable for classifying the hyperspectral images of small samples and can obtain better classification performance.
Secondly, considering the technical solution as a whole or from the perspective of products, the technical effects and advantages of the technical solution to be protected by the present invention are specifically described as follows:
the method can automatically classify according to the trained deep learning model without inputting any parameter and consuming a large amount of time cost and labor cost to label data; the characteristics with more discriminative power can be extracted through the deformable 3D convolution and the three-branch three-attention mechanism, so that the classification precision is improved, the problem of dimension disasters is solved, and the good classification performance can be still kept under the condition that the number of training samples is limited.
Third, as an inventive supplementary proof of the claims of the present invention, there are also presented several important aspects:
the expected income and commercial value after the technical scheme of the invention is converted are as follows: at present, the remote sensing technology is widely applied to the fields of agriculture, forestry, geology, oceans, meteorology, hydrology, military affairs, environmental protection and the like. The method improves the classification precision of the remote sensing image, can be applied to various fields, for example, the method is applied to agricultural production, can dynamically monitor the growth vigor of crops, monitor the diseases and the insect pests of the crops, estimate the yield of the crops and the like, and in the agricultural production, the remote sensing technology can periodically observe and cover a large area to obtain ground information, thereby greatly saving the labor cost and reducing the errors caused by the labor factors, and further promoting the agricultural modernization process of China.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments of the present invention will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flowchart of a three-branch three-attention hyperspectral image classification method in combination with D3D according to an embodiment of the invention;
FIG. 2 is a block diagram of a three-branch three-attention hyperspectral image classification system combined with D3D according to an embodiment of the invention;
FIG. 3 is a flow chart of the D3DTBTA-Net algorithm provided by the embodiment of the present invention;
FIG. 4 is a schematic diagram of a calculation process of a spectral attention map provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of a deformable 3D convolution provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram of a network structure of a D3DTBTA according to an embodiment of the present invention;
FIG. 7 is an experimental plot on an Indian Pines (IP) data set provided by an embodiment of the present invention; wherein FIG. 7(a) is a pseudo-color image; FIG. 7(b) corresponds to a label; fig. 7(c) SVM (68.75%); FIG. 7(d) CDCNN (64.21%); FIG. 7(e) SSRN (91.59%); FIG. 7(f) FDSSC (93.85%); FIG. 7(g) DBDA (91.32%); FIG. 7(h) D3DTBTA (95.74%);
in the figure: 1. a data set generation module; 2. a model training and verification module; 3. and a prediction module.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In order to solve the problems in the prior art, the invention provides a three-branch three-attention hyperspectral image classification method combined with D3D, and the invention is described in detail with reference to the accompanying drawings.
First, the embodiments are explained. This section provides illustrative embodiments expanding on the claims so that those skilled in the art can fully understand how the invention is realized.
Example 1
The hyperspectral image classification method aims at solving the problem that a hyperspectral image classification algorithm based on a small sample is low in classification precision in the prior art. The invention provides a three-branch three-attention mechanism network D3DTBTA-Net combined with deformable 3D convolution, which can enhance feature extraction and fully extract spectral information and spatial information of a hyperspectral image, thereby improving the classification precision of the hyperspectral image on the premise of a small sample. The D3DTBTA-Net is divided into three branches: and respectively extracting a spectral feature map, a spatial X feature map and a spatial Y feature map from the spectral branch, the spatial X branch and the spatial Y branch, and fusing the feature maps extracted from the three branches for classification. A comparison experiment with other classification methods shows that the D3DTBTA-Net is suitable for classifying the hyperspectral images of small samples and can obtain better classification performance.
As shown in fig. 1, the three-branch three-attention hyperspectral image classification method combined with D3D according to the embodiment of the invention includes the following steps:
s101, generating a data set: randomly dividing a three-dimensional cube set into a training set, a verification set and a test set;
s102, training a model and verifying the model: the training set is used for updating parameters of multiple iterations, and the verification set is used for monitoring the performance of the model and selecting the model which is best trained;
s103, predicting: and selecting a test set to verify the effectiveness of the training model to obtain a classification result.
As shown in fig. 2, the three-branch three-attention hyperspectral image classification system combined with D3D provided by the embodiment of the invention includes:
the data set generating module 1 is used for generating a set of three-dimensional cubes and randomly dividing the set of three-dimensional cubes into a training set, a verification set and a test set;
the model training and verifying module 2 is used for updating parameters of multiple iterations through a training set, monitoring the performance of the model by using a verifying set and selecting the model with the best training;
and the prediction module 3 is used for selecting the test set to verify the effectiveness of the training model and obtain a classification result.
The process of the D3DTBTA-Net algorithm provided by the embodiment of the invention comprises three steps: data set generation, training and validation, and prediction. Fig. 3 illustrates the overall algorithm flow of the method of the present invention.
Suppose the HSI data set A is composed of N labeled pixels {x_1, x_2, ..., x_N} ∈ R^{1×1×p}, where p is the number of bands, and the corresponding class label set is {y_1, y_2, ..., y_N} ∈ R^{1×1×q}, where q is the number of land cover categories.
Step 2, training the model and verifying the model. The training set is used to update the parameters over multiple iterations, while the validation set is used to monitor the performance of the model and select the best trained model. In step 2, training and validation use the D3DTBTA-Net algorithm of the present invention, which is divided into three branches: the spectral branch, the spatial X branch and the spatial Y branch, used respectively to capture the spectral feature map, the spatial X feature map and the spatial Y feature map; the three obtained feature maps are then fused and classified. The spectral branch contains a Dense spectral block and a spectral attention block; the spatial X branch contains a Dense spatial X block and a spatial X attention block; the spatial Y branch contains a Dense spatial Y block and a spatial Y attention block.
Step 3, prediction. The test set is selected to verify the effectiveness of the trained model, yielding the classification result. In HSI classification, a common quantitative measure of the difference between the predicted result and the true value is the cross-entropy loss function, defined as:

L(ŷ, y) = − Σ_{i=1}^{L} y_i log(ŷ_i);

where ŷ = [ŷ_1, ŷ_2, ..., ŷ_L] is the label vector predicted by the model and y = [y_1, y_2, ..., y_L] represents the true label vector.
Further, the following basic modules are used in step 2.
(1) 3D-CNN with BN. The 3D-CNN with BN is a common element in deep learning models based on 3D cubes. For n_m input feature maps of size p_m×p_m×b_m, a 3D-CNN layer containing k_{m+1} kernels of size α_{m+1}×α_{m+1}×d_{m+1} produces n_{m+1} output feature maps of size p_{m+1}×p_{m+1}×b_{m+1}. The ith output of the (m+1)th 3D-CNN layer with BN is calculated as:

x̂_j^m = (x_j^m − E(x_j^m)) / √(Var(x_j^m) + ε);
x_i^{m+1} = R(Σ_j x̂_j^m ∗ w_i^{m+1} + b_i^{m+1});

where x_j^m is the jth input feature map of layer (m+1) and x̂_j^m is its output after the BN of layer m. E(·) and Var(·) represent the expectation and variance functions of the input. w_i^{m+1} and b_i^{m+1} represent the weights and bias of the (m+1)th 3D-CNN layer, ∗ is the 3D convolution operation, and R(·) represents the activation function that introduces nonlinearity into the network.
(2) DenseNet dense connections. Generally, the more convolutional layers, the better the performance of the network. However, when the network reaches a certain depth, continuing to add layers no longer improves performance but degrades the network: as the number of layers increases, the accuracy on the training set gradually saturates or even decreases. DenseNet is an effective way to solve this problem.
The dense block is the basic unit of DenseNet, and the output of the lth dense block is calculated as:

x_l = H_l([x_0, x_1, ..., x_{l−1}]) (4)

where H_l is a composite block containing a convolutional layer, an activation layer and a BN layer, and [x_0, x_1, ..., x_{l−1}] denotes the concatenation of the feature maps generated by all preceding blocks; the more connections, the more information flows through the dense network. Specifically, a dense network with L layers has L(L+1)/2 connections, whereas a conventional convolutional network with the same number of layers has only L direct connections.
(3) Attention mechanism. One drawback of 3D-CNN is that all spatial pixels and spectral bands possess equal weights in the spatial and spectral domains. Obviously, different spectral bands and spatial pixels contribute differently to the extracted features.
The spectral attention block. As shown in FIG. 4, the spectral attention map X ∈ R^(c×c) is calculated directly from the initial input A ∈ R^(c×p×p), where p × p is the size of the input block and c represents the number of input channels. First, A is reshaped to R^(c×n) (with n = p × p) and multiplied with A^T to obtain the channel attention map X ∈ R^(c×c), followed by a softmax layer:

x_ji = exp(A_i · A_j) / Σ_(i=1)^c exp(A_i · A_j)    (5)

where x_ji indicates the effect of the i-th channel on the j-th channel. Second, the result of the matrix multiplication of X^T with A is reshaped back to R^(c×p×p). Finally, the reshaped result is weighted by a scale parameter α and added to the input A to obtain the final spectral attention map E ∈ R^(c×p×p):

E_j = α Σ_(i=1)^c (x_ji A_i) + A_j    (6)

where α is initialized to zero and learned gradually. The final map E contains a weighted sum of all channel features plus the original features; it describes inter-channel dependencies and enhances the discriminability of the features.
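The spectral (channel) attention computation can be sketched in numpy as follows. This is an illustrative re-implementation following the description above, not the patent's code; note that with α initialized to zero the block initially passes the input through unchanged.

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spectral_attention(A, alpha=0.0):
    # A: (c, p, p) input feature map; flatten spatial dims -> (c, n)
    c = A.shape[0]
    A2 = A.reshape(c, -1)
    X = softmax(A2 @ A2.T, axis=-1)      # (c, c) channel attention map
    out = (X.T @ A2).reshape(A.shape)    # weighted sum over all channels
    return alpha * out + A               # residual; alpha starts at zero

A = np.random.default_rng(0).random((4, 3, 3))
E0 = spectral_attention(A, alpha=0.0)    # with alpha = 0, E equals the input A
E1 = spectral_attention(A, alpha=0.5)    # after alpha is learned, channels are re-weighted
```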
The spatial attention block. Given an input feature map A ∈ R^(c×p×p), two convolution layers generate new feature maps B and C respectively, where B, C ∈ R^(c×p×p). First, B and C are reshaped to R^(c×n), where n = p × p is the number of pixels. Second, matrix multiplication is performed between B^T and C, followed by a softmax layer, to calculate the spatial attention map S ∈ R^(n×n):

s_ji = exp(B_i · C_j) / Σ_(i=1)^n exp(B_i · C_j)    (7)

where s_ji represents the impact of the i-th pixel on the j-th pixel. The closer the feature representations of two pixels, the stronger the correlation between them.
Meanwhile, the initial input feature A is fed into a convolution layer to obtain a new feature map D ∈ R^(c×p×p), which is reshaped to R^(c×n). Finally, matrix multiplication is performed between D and S^T, and the result is reshaped back to R^(c×p×p):

E_j = β Σ_(i=1)^n (s_ji D_i) + A_j    (8)

where the initial value of β is zero, and more weight is learned and assigned gradually. According to equation (8), the final feature E is obtained by adding a weighted sum over all positions to the original features; thus the context information in the spatial dimension is modeled by E.
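A matching numpy sketch of the spatial attention block. Here the convolution layers producing B, C and D are replaced by the identity, so the example only illustrates the reshaping, matrix products and residual scaling, not learned projections:

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(A, beta=0.0):
    # A: (c, p, p); B, C, D would come from convolution layers (identity here)
    c, p, _ = A.shape
    B = C = D = A.reshape(c, p * p)
    S = softmax(B.T @ C, axis=-1)        # (n, n) pixel-to-pixel attention map
    out = (D @ S.T).reshape(A.shape)     # aggregate context over all positions
    return beta * out + A                # residual; beta starts at zero

A = np.random.default_rng(1).random((4, 3, 3))
E0 = spatial_attention(A, beta=0.0)      # with beta = 0, E equals the input A
E1 = spatial_attention(A, beta=0.5)
```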
(4) Deformable 3D convolution. When CNN-based or deeper improved networks extract features, the sampling positions of the convolution kernels are usually a fixed grid; for complex objects with varying scales or shapes, conventional convolutional neural networks therefore cannot effectively extract features from complex structures, which limits classification performance. A deformable convolution can dynamically adjust the size of the receptive field according to the actual content of the image and extract features better. The deformable convolution is usually two-dimensional; Deformable 3D Convolution (D3D Convolution, D3D) fuses the deformable convolution and the 3D-CNN, significantly improving the deformation modeling capability of CNNs. D3D expands the spatial field of view through learnable offset variables. As shown in fig. 5, an input feature of size C × H × W is first passed through a 3D-CNN of kernel size p × q × r to generate an offset feature of size 3N × C × H × W (where N = p × q × r is the size of the sampling grid). Along the channel dimension there are 3N values, representing the deformation values of the D3D sampling grid. The learned offset features are then used to deform the 3D-CNN sampling grid to generate the D3D sampling grid. Finally, the D3D sampling grid is used to produce the output features. D3D can be expressed by the following formula:

y(p_0) = Σ_(p_n ∈ G) w(p_n) · x(p_0 + p_n + Δp_n)    (9)

where Δp_n represents the offset corresponding to the n-th value in the p × q × r convolution sampling grid G. Since offset variables are usually fractional, interpolation (bilinear in 2D, trilinear in 3D) is used to generate accurate values. The interpolation formula is:

x(p) = Σ_q G(q, p) · x(q),  G(q, p) = g(q_x, p_x) · g(q_y, p_y) · g(q_z, p_z),  g(a, b) = max(0, 1 − |a − b|)    (10)
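The fractional-offset sampling described above can be illustrated with a small numpy routine that evaluates a 3D volume at a non-integer position using the kernel g(a, b) = max(0, 1 − |a − b|). This is a sketch of the interpolation step only, not of the full D3D layer:

```python
import numpy as np

def trilinear_sample(x, p):
    # x: 3D volume; p: fractional position (z, y, x) = p_0 + p_n + delta_p_n
    base = np.floor(p).astype(int)
    val = 0.0
    for dz in (0, 1):
        for dy in (0, 1):
            for dx in (0, 1):
                q = base + np.array([dz, dy, dx])
                if all(0 <= q[i] < x.shape[i] for i in range(3)):
                    # G(q, p) = g(q_z,p_z) * g(q_y,p_y) * g(q_x,p_x)
                    g = np.prod(np.maximum(0.0, 1.0 - np.abs(q - p)))
                    val += g * x[tuple(q)]
    return val

vol = np.arange(27, dtype=float).reshape(3, 3, 3)
exact = trilinear_sample(vol, np.array([1.0, 1.0, 1.0]))  # integer offset: the voxel itself
mid = trilinear_sample(vol, np.array([1.5, 1.0, 1.0]))    # halfway between two voxels
```

An integer sampling position recovers the stored voxel exactly, while a half-step offset returns the average of the two neighbouring voxels, which is what lets the gradient flow through fractional offsets.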
(5) The Mish activation function. The activation function adopted by D3DTBTA is Mish, a self-regularized non-monotonic activation function, rather than the traditional ReLU(x) = max(0, x). The formula of Mish is:

mish(x) = x × tanh(softplus(x)) = x × tanh(ln(1 + e^x))    (11)

where x represents the input. Mish is unbounded above, and its lower bound is approximately −0.31, i.e. its range is [≈ −0.31, ∞). The derivative of Mish is defined as:

mish'(x) = (e^x · ω) / δ^2,  where ω = 4(x + 1) + 4e^(2x) + e^(3x) + e^x(4x + 6) and δ = 2e^x + e^(2x) + 2    (12)
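A small numpy check of the Mish definition and its stated lower bound of roughly −0.31 (illustrative only; the sampling grid is an arbitrary choice):

```python
import numpy as np

def mish(x):
    # mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + e^x))
    return x * np.tanh(np.log1p(np.exp(x)))

xs = np.linspace(-20.0, 20.0, 40001)
lower = mish(xs).min()   # global minimum, reached near x ~ -1.19
```

For large positive x, tanh(softplus(x)) approaches 1, so mish(x) approaches x (unbounded above); for large negative x the output decays back toward zero after dipping to about −0.31.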
(6) Selection of the optimal weights. During training, the model with the highest accuracy on the validation set is selected as the output; if the accuracies on the validation set are equal, the model with the lowest validation loss is selected. The best model is saved at each iteration: if the model of the next iteration is better, it replaces the previously saved model; otherwise, the saved model is kept.
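The weight-selection rule above (highest validation accuracy, ties broken by lowest validation loss, replacing the saved model only when the new one is better) can be sketched in plain Python; the dictionary fields are illustrative names, not the patent's variables:

```python
def is_better(candidate, best):
    # Highest validation accuracy wins; ties broken by lowest validation loss
    if best is None:
        return True
    if candidate["val_acc"] != best["val_acc"]:
        return candidate["val_acc"] > best["val_acc"]
    return candidate["val_loss"] < best["val_loss"]

best = None
history = [
    {"epoch": 1, "val_acc": 0.90, "val_loss": 0.40},
    {"epoch": 2, "val_acc": 0.93, "val_loss": 0.35},
    {"epoch": 3, "val_acc": 0.93, "val_loss": 0.30},  # same accuracy, lower loss
    {"epoch": 4, "val_acc": 0.91, "val_loss": 0.25},  # lower accuracy: not saved
]
for ckpt in history:
    if is_better(ckpt, best):
        best = ckpt       # replace the previously saved model
```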
The learning rate is an important hyper-parameter for training the network, and a dynamic learning rate can help the network avoid local minima. The learning rate is dynamically adjusted by the cosine annealing method as follows:

η_t = η_min + (1/2)(η_max − η_min)(1 + cos(π T_cur / T_i))    (13)

where η_t is the learning rate at the t-th iteration, η_min and η_max bound the learning rate, T_cur counts the number of iterations that have been performed, and T_i controls the number of iterations performed in one adjustment period.
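The cosine annealing schedule can be written out directly. In this sketch η_max is taken as the paper's initial learning rate of 0.0005 and η_min as zero; both bounds are assumptions for illustration:

```python
import math

def cosine_annealing(eta_min, eta_max, t_cur, t_i):
    # eta_t = eta_min + 1/2 (eta_max - eta_min) (1 + cos(pi * t_cur / t_i))
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))

# one annealing period of T_i = 30 steps, decaying from eta_max to eta_min
lrs = [cosine_annealing(0.0, 5e-4, t, 30) for t in range(31)]
```

PyTorch's `torch.optim.lr_scheduler.CosineAnnealingLR` implements the same schedule and could be used instead of this manual version.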
Example 2
The network structure of D3DTBTA is shown in fig. 6. For convenience, the upper branch is called the spectral branch, and the two lower branches are called the spatial X branch and the spatial Y branch, respectively. The input is fed to the spectral branch, the spatial X branch and the spatial Y branch to obtain a spectral feature map and two spatial feature maps. The classification result is then obtained by fusing the spectral, spatial X and spatial Y feature maps.
The following section describes the spectral branch, the spatial X branch, the spatial Y branch, and the spectral-spatial fusion operation, taking the Indian Pines (IP) dataset as an example. The sample cube size is 9 × 9 × 200. In the matrices mentioned below, e.g. (9 × 9 × 97, 24), 9 × 9 × 97 denotes the height, width and depth of the 3D cube, and 24 denotes the number of 3D cubes generated by the 3D-CNN. The IP dataset contains 145 × 145 pixels with 200 spectral bands, i.e. the size of IP is 145 × 145 × 200. Only 10249 pixels have a corresponding label; the other pixels are background.
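Generating the 9 × 9 × 200 sample cubes from the 145 × 145 × 200 scene can be sketched as follows; this is a simple re-implementation of patch extraction with zero padding at the image edges, with illustrative function and variable names:

```python
import numpy as np

def extract_patch(img, i, j, p):
    # img: (H, W, B) hyperspectral cube; returns the p x p x B block centred
    # at pixel (i, j), zero-padding neighbours that fall outside the image
    r = p // 2
    H, W, B = img.shape
    patch = np.zeros((p, p, B), dtype=img.dtype)
    for di in range(-r, r + 1):
        for dj in range(-r, r + 1):
            y, x = i + di, j + dj
            if 0 <= y < H and 0 <= x < W:
                patch[di + r, dj + r] = img[y, x]
    return patch

scene = np.ones((145, 145, 200))
corner = extract_patch(scene, 0, 0, 9)    # edge pixel: missing neighbours are zero
inner = extract_patch(scene, 72, 72, 9)   # interior pixel: fully inside the image
```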
Because the HSI has a very large number of spectral channels, which are redundant for classification, HSI classification algorithms generally perform a dimensionality reduction operation first to reduce redundancy and improve classification accuracy. D3DTBTA first uses a 3D-CNN layer with a convolution kernel size of 1 × 1 × 7 and a stride of (1, 1, 2) to reduce the number of channels, obtaining a feature map of (9 × 9 × 97, 8); a deformable 3D convolution with a kernel size of 3 × 3 then enhances the features; a further 3D-CNN layer with a kernel size of 1 × 1 × 7 then captures a (9 × 9 × 97, 24) feature map as the input feature map of the three branches.
The captured feature map of size (9 × 9 × 97, 24) is input to the spectral branch and first passes through Dense spectral blocks (3D-CNN with BN); each 3D-CNN in a Dense spectral block has 12 channels with a convolution kernel size of 1 × 1 × 7. After the Dense spectral block, the number of channels of the feature map calculated by equation (4) increases to 60, giving a feature map of size (9 × 9 × 97, 60). Next, after a final 3D-CNN with a convolution kernel size of 1 × 1 × 97, a (9 × 9 × 1, 60) feature map is generated. However, these 60 channels contribute differently to the classification. To refine the spectral features, a spectral attention block is employed, which emphasizes the weight of useful information and de-emphasizes the weight of redundant information. After the weighted spectral feature map is obtained, the features are enhanced by a deformable 3D convolution of size 3 × 1, and BN and dropout layers then improve stability and robustness. Finally, a 1 × 60 feature vector is obtained via a global average pooling layer. The implementation details of the spectral branch are shown in table 1.
TABLE 1 implementation details of spectral branching
Meanwhile, the (9 × 9 × 97, 24) feature map is input to the spatial X branch and passes through Dense spatial X blocks (3D-CNN with BN); each 3D-CNN in a Dense spatial X block has 12 channels with a convolution kernel size of 3 × 1 × 1. Next, the (9 × 9 × 1, 60) feature map is input to the spatial X attention block, which weights the coefficient of each pixel to obtain a more discriminative spatial X feature map. After the weighted spatial X feature map is obtained, the features are enhanced by a deformable 3D convolution of size 3 × 3 × 1, and a 1 × 60 spatial X feature vector is then obtained through the BN layer, the dropout layer and the global average pooling layer. The implementation details of the spatial X branch are shown in table 2.
TABLE 2 implementation details of spatial X-branching
Likewise, the (9 × 9 × 97, 24) feature map is input to the spatial Y branch and passes through Dense spatial Y blocks (3D-CNN with BN); each 3D-CNN in a Dense spatial Y block has 12 channels with a convolution kernel size of 1 × 3 × 1. The (9 × 9 × 1, 60) feature map is input to the spatial Y attention block, which weights the coefficient of each pixel to obtain a more discriminative spatial Y feature map. After the weighted spatial Y feature map is obtained, the features are enhanced by a deformable 3D convolution of size 3 × 3 × 1, and a 1 × 60 spatial Y feature vector is then obtained through the BN layer, the dropout layer and the global average pooling layer. The implementation details of the spatial Y branch are shown in table 3.
TABLE 3 implementation details of space Y-Branch
The spectral feature map, spatial X feature map and spatial Y feature map are obtained through the spectral branch, the spatial X branch and the spatial Y branch, and the three feature maps are then concatenated for classification. Concatenation rather than addition is used because the spectral, spatial X and spatial Y features lie in unrelated domains: concatenation keeps them independent, whereas addition would mix them together. Finally, the classification result is obtained through a fully connected layer and a softmax layer.
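The fusion step can be illustrated in a few lines of numpy; the 1 × 60 shapes follow the branch descriptions above, and the random vectors simply stand in for branch outputs:

```python
import numpy as np

spec = np.random.default_rng(2).random((1, 60))  # spectral branch output
spx = np.random.default_rng(3).random((1, 60))   # spatial X branch output
spy = np.random.default_rng(4).random((1, 60))   # spatial Y branch output

# Concatenation keeps the three domains separate (a 1 x 180 vector);
# element-wise addition would mix them into a single 1 x 60 vector
fused = np.concatenate([spec, spx, spy], axis=1)
```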
The method of the invention was tested on 4 public hyperspectral datasets, namely the Indian Pines (IP) dataset, the Pavia University (UP) dataset, the Salinas Valley (SV) dataset and the Kennedy Space Center (KSC) dataset. It was compared with 5 other methods: SVM, CDCNN, SSRN, FDSSC and DBDA. These are effective small-sample hyperspectral image classification methods recognized by researchers.
All experiments were performed on the same platform, configured with 16 GB of memory and an NVIDIA GeForce GTX 1080 Ti GPU. All deep-learning-based classifiers are implemented using PyTorch, and the support vector machine is implemented using scikit-learn.
Since the SVM directly uses spectral information for classification, the input sample size is 1 × 1 × p. For better comparative experiments, other deep learning based methods use the same input sample size of 9 × 9 × p, where p is the number of spectral bands.
The batch sizes of CDCNN, SSRN, FDSSC, DBDA and the proposed D3DTBTA are all set to 16, the optimizer is Adam, and the learning rate is 0.0005. Each method is run 10 times independently, and the experimental result is the average of the 10 runs. The total number of epochs is set to 150, and the annealing period T_i is set to 30. Experiments were performed using the optimal weight selection method.
The training and validation samples each comprise 3% of the total samples. The numbers of training, validation and test samples in the Indian Pines (IP) dataset are shown in table 4.
TABLE 4 number of training, validation and test samples in IP dataset
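The random 3%/3%/94% split of the 10249 labelled IP pixels can be sketched as follows (index-level only; a per-class stratified split, as table 4 suggests, would partition each class separately — this simpler version is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10249                              # labelled pixels in the IP scene
idx = rng.permutation(n)
n_train = n_val = round(0.03 * n)      # 3% for training, 3% for validation
train = idx[:n_train]
val = idx[n_train:n_train + n_val]
test = idx[n_train + n_val:]           # remaining 94% for testing
```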
II. Application embodiment. In order to prove the creativity and technical value of the technical solution of the present invention, this part presents application examples of the claimed technical solution in specific products or related technologies.
Embodiments of the present invention may be realized in hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the apparatus and methods described above may be implemented using computer executable instructions and/or embodied in processor control code, such code being provided on a carrier medium such as a disk, CD-or DVD-ROM, programmable memory such as read only memory (firmware), or a data carrier such as an optical or electronic signal carrier, for example. The apparatus and its modules of the present invention may be implemented by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., or by software executed by various types of processors, or by a combination of hardware circuits and software, e.g., firmware.
III. Evidence of the effects of the embodiment. The embodiment of the present invention achieved positive effects during research, development and use, and offers great advantages over the prior art; the following description combines data and diagrams from the testing process.
In the examples, the experimental results on the Indian Pines (IP) dataset, with a training set size of 3%, are shown in fig. 7 and in table 5 below.
TABLE 5
Wherein FIG. 7(a) is a pseudo-color image; FIG. 7(b) corresponds to a label; fig. 7(c) SVM (68.75%); FIG. 7(d) CDCNN (64.21%); FIG. 7(e) SSRN (91.59%); FIG. 7(f) FDSSC (93.85%); FIG. 7(g) DBDA (91.32%); fig. 7(h) D3DTBTA (95.74%).
The Overall Accuracy (OA) of the four data sets at different training sample ratios is shown in table 6.
TABLE 6 Overall accuracy (OA) at different training sample ratios
Under each training sample ratio, the best classification result is shown in bold. As shown in the table, the classification performance of the proposed method is superior to that of the other methods. Except on the IP dataset with a training sample ratio of 1%, where D3DTBTA is not the best but differs only slightly from the best classification accuracy, the proposed method obtains the best classification accuracy on all other datasets and training sample ratios. As the proportion of training samples increases, the classification accuracy also increases. Even with few training samples, the proposed method maintains good classification performance.
The above description is only intended to illustrate specific embodiments of the present invention and is not intended to limit the scope of protection; any modifications, equivalent replacements and improvements made within the spirit and principles of the present invention are intended to be covered by the appended claims.
Claims (10)
1. A three-branch three-attention hyperspectral image classification method combined with D3D is characterized by comprising the following steps of: constructing a three-branch three-attention mechanism network D3DTBTA-Net combined with deformable 3D convolution for extracting spectral information and spatial information of the hyperspectral image; the three-branch three-attention mechanism network D3DTBTA-Net respectively extracts a spectral feature map, a spatial X feature map and a spatial Y feature map by utilizing three branches, and performs feature map fusion and classification.
2. The three-branch three-attention mechanism hyperspectral image classification method combined with D3D according to claim 1, wherein the three-branch three-attention mechanism hyperspectral image classification method combined with D3D comprises the following steps:
step one, generating a data set: generating a set of three-dimensional cubes, and randomly dividing the set of three-dimensional cubes into a training set, a verification set and a test set;
step two, training a model and verifying the model: the training set is used for updating parameters of multiple iterations, and the verification set is used for monitoring the performance of the model and selecting the model which is best trained;
step three, prediction: and selecting a test set to verify the effectiveness of the training model to obtain a classification result.
3. The three-branch three-attention mechanism hyperspectral image classification method in combination with D3D according to claim 2, wherein the data set generation in the first step comprises:
selecting a central pixel x_i and its neighboring pixels from the raw data to generate a set of three-dimensional cube blocks of size p × p × b; if the target pixel is located at the edge of the image, setting the missing neighboring pixel values to zero; in the D3DTBTA-Net algorithm, p is the patch size, set to 9, and b is the number of spectral bands; randomly dividing the three-dimensional cube set into a training set X_train, a validation set X_val and a test set X_test, with the corresponding label vectors divided into Y_train, Y_val and Y_test; only spatial information around the target pixel is used.
4. The method for classifying hyperspectral images with a three-branch three-attention mechanism combined with D3D according to claim 2, wherein the training model and the verification model in the second step comprise:
training a model and a verification model by using a D3DTBTA-Net algorithm, wherein the D3DTBTA-Net algorithm is divided into three branches: the spectrum branch, the space X branch and the space Y branch are respectively used for capturing a spectrum characteristic diagram, a space X characteristic diagram and a space Y characteristic diagram and fusing the three acquired characteristic diagrams for classification; wherein the spectrum branch comprises a Dense spectrum block and a spectrum attention block; the space X branch comprises a Dense space X block and a space X attention block; the space Y branch contains a Dense space Y block and a space Y attention block.
5. The method for classifying hyperspectral images by using a three-branch three-attention mechanism in combination with D3D according to claim 2, wherein the following basic modules are used in the second step:
(1) 3D-CNN with BN: the 3D-CNN with BN is a common element in deep learning models based on 3D cube blocks; for n_m input feature maps of size p_m × p_m × b_m, a 3D-CNN layer containing k_(m+1) convolution kernels of size α_(m+1) × α_(m+1) × d_(m+1) outputs n_(m+1) feature maps of size p_(m+1) × p_(m+1) × b_(m+1); the i-th output of the (m+1)-th 3D-CNN layer with BN is calculated as:

v_i^(m+1) = R( Σ_j v̂_j^m * H_i^(m+1) + b_i^(m+1) ),  with  v̂_j^m = (v_j^m − E(v_j^m)) / √(Var(v_j^m))

wherein v_j^m is the j-th input feature map of the (m+1)-th layer and v̂_j^m is its output after BN; E(·) and Var(·) represent the expectation and variance functions of the input, respectively; H_i^(m+1) and b_i^(m+1) respectively represent the weight and bias of the (m+1)-th layer 3D-CNN, * is a 3D convolution operation, and R(·) is an activation function introducing nonlinearity into the network;
(2) DenseNet dense connection: the dense block is the basic unit in DenseNet, and the output of the l-th dense block is calculated as:

x_l = H_l([x_0, x_1, ..., x_(l-1)]);

wherein H_l is a composite function containing a convolutional layer, an activation layer and a BN layer, and [x_0, x_1, ..., x_(l-1)] denotes the concatenation of the feature maps produced by the preceding layers; the more connections, the more information flows in the dense network; a dense network with L layers has L(L+1)/2 connections, while a conventional convolutional network with the same number of layers has only L direct connections;
(3) an attention mechanism, as follows:

the spectral attention block: the spectral attention map X ∈ R^(c×c) is calculated directly from the initial input A ∈ R^(c×p×p), where p × p is the size of the input block and c represents the number of input channels; A is reshaped to R^(c×n) (with n = p × p) and multiplied with A^T to obtain the channel attention map X ∈ R^(c×c), followed by a softmax layer:

x_ji = exp(A_i · A_j) / Σ_(i=1)^c exp(A_i · A_j)

wherein x_ji represents the influence of the i-th channel on the j-th channel; the result of the matrix multiplication of X^T with A is reshaped back to R^(c×p×p); the reshaped result is weighted by the scale parameter α and added to the input A to obtain the final spectral attention map:

E_j = α Σ_(i=1)^c (x_ji A_i) + A_j

wherein α is initialized to zero and gradually learned; the final map E contains a weighted sum of all channel features, describing inter-channel dependencies and enhancing the discriminability of the features;
the spatial attention block: given an input feature map A ∈ R^(c×p×p), two convolution layers generate new feature maps B and C respectively, where B, C ∈ R^(c×p×p); B and C are reshaped to R^(c×n), where n = p × p is the number of pixels; matrix multiplication is performed between B^T and C, a softmax layer is added, and the spatial attention map S ∈ R^(n×n) is calculated:

s_ji = exp(B_i · C_j) / Σ_(i=1)^n exp(B_i · C_j)

wherein s_ji represents the influence of the i-th pixel on the j-th pixel; the closer the feature representations of two pixels, the stronger the correlation between them;

meanwhile, the initial input feature A is fed into a convolution layer to obtain a new feature map D ∈ R^(c×p×p), which is reshaped to R^(c×n); matrix multiplication is performed between D and S^T, and the result is reshaped back to R^(c×p×p):

E_j = β Σ_(i=1)^n (s_ji D_i) + A_j

wherein the initial value of β is zero, and more weight is gradually learned and assigned; the final feature is obtained by adding the weighted sum over all positions to the original features, so the context information in the spatial dimension is modeled by E;
(4) deformable 3D convolution: the deformable convolution dynamically adjusts the size of the receptive field according to the actual content of the image; an input feature of size C × H × W is passed through a 3D-CNN of kernel size p × q × r to generate an offset feature of size 3N × C × H × W, where N = p × q × r is the size of the sampling grid; along the channel dimension there are 3N values representing the deformation values of the D3D sampling grid; the learned offset features are applied to deform the 3D-CNN sampling grid to generate the D3D sampling grid; the D3D sampling grid is used to generate the output features;

D3D is represented by the following formula:

y(p_0) = Σ_(p_n ∈ G) w(p_n) · x(p_0 + p_n + Δp_n)

wherein Δp_n represents the offset corresponding to the n-th value in the p × q × r convolution sampling grid G; since the offsets are usually fractional, interpolation is used to generate accurate values; the interpolation formula is:

x(p) = Σ_q G(q, p) · x(q),  G(q, p) = g(q_x, p_x) · g(q_y, p_y) · g(q_z, p_z),  g(a, b) = max(0, 1 − |a − b|)
(5) Mish activation function: the activation function adopted by D3DTBTA is Mish, and the formula of Mish is:

mish(x) = x × tanh(softplus(x)) = x × tanh(ln(1 + e^x))

wherein x represents the input; Mish is unbounded above, and its lower bound is approximately −0.31, i.e. its range is [≈ −0.31, ∞); the derivative of Mish is defined as:

mish'(x) = (e^x · ω) / δ^2,  where ω = 4(x + 1) + 4e^(2x) + e^(3x) + e^x(4x + 6) and δ = 2e^x + e^(2x) + 2;
(6) regarding the selection of the optimal weights: during training, the model with the highest accuracy on the validation set is selected as the output; if the accuracies on the validation set are equal, the model with the lowest validation loss is selected; the best model is saved at each iteration: if the model of the next iteration is better, it replaces the previously saved model; otherwise, the saved model is kept;

the learning rate is dynamically adjusted by the cosine annealing method, as shown in the following formula:

η_t = η_min + (1/2)(η_max − η_min)(1 + cos(π T_cur / T_i))

wherein η_t is the learning rate at the t-th iteration, T_cur counts the number of iterations that have been performed, and T_i controls the number of iterations performed in one adjustment period.
6. The three-branch three-attention mechanism hyperspectral image classification method in combination with D3D according to claim 2, wherein the prediction in step three comprises:
the HSI data set is composed of N labeled pixels {a_i ∈ R^(1×1×p)}, i = 1, ..., N, where p is the number of bands, and the corresponding class label set is {y_i}, i = 1, ..., N, with y_i ∈ {1, 2, ..., q}, wherein q is the number of land cover categories;

in HSI classification, the quantitative measure of the difference between the predicted result and the true value is the cross-entropy loss function, defined as:

L = − Σ_(i=1)^N Σ_(j=1)^q y_ij log(ŷ_ij)

where y_ij is the one-hot true label and ŷ_ij is the predicted probability that pixel i belongs to class j.
7. A three-branch three-attention hyperspectral image classification system combined with D3D applying the three-branch three-attention hyperspectral image classification method combined with D3D according to any of claims 1 to 6, wherein the three-branch three-attention hyperspectral image classification system combined with D3D comprises:
the data set generating module is used for generating a set of three-dimensional cubes and randomly dividing the set of three-dimensional cubes into a training set, a verification set and a test set;
the model training and verifying module is used for updating parameters of multiple iterations through a training set, monitoring the performance of the model by using a verifying set and selecting the model with the best training;
and the prediction module is used for selecting the test set to verify the effectiveness of the training model and obtain a classification result.
8. A computer arrangement comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of:
constructing a three-branch three-attention mechanism network D3DTBTA-Net combined with deformable 3D convolution for extracting spectral information and spatial information of a hyperspectral image; the D3DTBTA-Net is divided into three branches: and after the spectral characteristic diagram, the spatial X characteristic diagram and the spatial Y characteristic diagram are respectively extracted, fusing the characteristic diagrams extracted from the three branches for classification.
9. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
constructing a three-branch three-attention mechanism network D3DTBTA-Net combined with deformable 3D convolution for extracting spectral information and spatial information of the hyperspectral image; the D3DTBTA-Net is divided into three branches: and after the spectral characteristic diagram, the spatial X characteristic diagram and the spatial Y characteristic diagram are respectively extracted, fusing the characteristic diagrams extracted from the three branches for classification.
10. An information data processing terminal characterized by being configured to implement the three-branch three-attention hyperspectral image classification system in combination with D3D of claim 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210344115.7A CN114758170B (en) | 2022-04-02 | 2022-04-02 | Three-branch three-attention mechanism hyperspectral image classification method combined with D3D |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210344115.7A CN114758170B (en) | 2022-04-02 | 2022-04-02 | Three-branch three-attention mechanism hyperspectral image classification method combined with D3D |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114758170A true CN114758170A (en) | 2022-07-15 |
CN114758170B CN114758170B (en) | 2023-04-18 |
Family
ID=82329787
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210344115.7A Active CN114758170B (en) | 2022-04-02 | 2022-04-02 | Three-branch three-attention mechanism hyperspectral image classification method combined with D3D |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114758170B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116310810A (en) * | 2022-12-06 | 2023-06-23 | 青岛柯锐思德电子科技有限公司 | Cross-domain hyperspectral image classification method based on spatial attention-guided variable convolution |
CN116883726A (en) * | 2023-06-25 | 2023-10-13 | 内蒙古农业大学 | Hyperspectral image classification method and system based on multi-branch and improved Dense2Net |
CN117036821A (en) * | 2023-08-22 | 2023-11-10 | 翔鹏佑康(北京)科技有限公司 | Single-cell rapid detection and identification method based on laser Raman spectrum |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170090068A1 (en) * | 2014-09-12 | 2017-03-30 | The Climate Corporation | Estimating soil properties within a field using hyperspectral remote sensing |
CN109978041A (en) * | 2019-03-19 | 2019-07-05 | 上海理工大学 | A kind of hyperspectral image classification method based on alternately update convolutional neural networks |
CN111191736A (en) * | 2020-01-05 | 2020-05-22 | 西安电子科技大学 | Hyperspectral image classification method based on depth feature cross fusion |
CN111539447A (en) * | 2020-03-17 | 2020-08-14 | 广东省智能制造研究所 | Hyperspectrum and terahertz data depth fusion-based classification method |
CN112052755A (en) * | 2020-08-24 | 2020-12-08 | 西安电子科技大学 | Semantic convolution hyperspectral image classification method based on multi-path attention mechanism |
CN112116563A (en) * | 2020-08-28 | 2020-12-22 | 南京理工大学 | Hyperspectral image target detection method and system based on spectral dimension and space cooperation neighborhood attention |
CN112232280A (en) * | 2020-11-04 | 2021-01-15 | 安徽大学 | Hyperspectral image classification method based on self-encoder and 3D depth residual error network |
CN113139515A (en) * | 2021-05-14 | 2021-07-20 | 辽宁工程技术大学 | Hyperspectral image classification method based on conditional random field and depth feature learning |
WO2021262129A1 (en) * | 2020-06-26 | 2021-12-30 | Ceylan Murat | An artificial intelligence analysis based on hyperspectral imaging for a quick determination of the health conditions of newborn premature babies without any contact |
Non-Patent Citations (1)
Title |
---|
Kang Yongchao (康拥朝) et al.: "Research Progress on Hyperspectral Image Classification Methods" (高光谱图像分类方法研究进展), 《新产经》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116310810A (en) * | 2022-12-06 | 2023-06-23 | 青岛柯锐思德电子科技有限公司 | Cross-domain hyperspectral image classification method based on spatial attention-guided variable convolution |
CN116310810B (en) * | 2022-12-06 | 2023-09-15 | 青岛柯锐思德电子科技有限公司 | Cross-domain hyperspectral image classification method based on spatial attention-guided variable convolution |
CN116883726A (en) * | 2023-06-25 | 2023-10-13 | 内蒙古农业大学 | Hyperspectral image classification method and system based on multi-branch and improved Dense2Net |
CN117036821A (en) * | 2023-08-22 | 2023-11-10 | 翔鹏佑康(北京)科技有限公司 | Single-cell rapid detection and identification method based on laser Raman spectrum |
Also Published As
Publication number | Publication date |
---|---|
CN114758170B (en) | 2023-04-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Shook et al. | Crop yield prediction integrating genotype and weather variables using deep learning | |
Folberth et al. | Spatio-temporal downscaling of gridded crop model yield estimates based on machine learning | |
Srivastava et al. | Winter wheat yield prediction using convolutional neural networks from environmental and phenological data | |
CN114758170B (en) | Three-branch three-attention mechanism hyperspectral image classification method combined with D3D | |
e Lucas et al. | Reference evapotranspiration time series forecasting with ensemble of convolutional neural networks | |
Rußwurm et al. | Multi-temporal land cover classification with long short-term memory neural networks | |
Wang et al. | Evaluation of a deep-learning model for multispectral remote sensing of land use and crop classification | |
Plaza et al. | A new approach to mixed pixel classification of hyperspectral imagery based on extended morphological profiles | |
CN102938072B (en) | A kind of high-spectrum image dimensionality reduction and sorting technique based on the tensor analysis of piecemeal low-rank | |
Wylie et al. | Geospatial data mining for digital raster mapping | |
Sun et al. | Mapping plant functional types from MODIS data using multisource evidential reasoning | |
Hu et al. | Integrating coarse-resolution images and agricultural statistics to generate sub-pixel crop type maps and reconciled area estimates | |
Adeluyi et al. | Estimating the phenological dynamics of irrigated rice leaf area index using the combination of PROSAIL and Gaussian Process Regression | |
Cheng et al. | Wheat yield estimation using remote sensing data based on machine learning approaches | |
Lin et al. | Large-scale rice mapping using multi-task spatiotemporal deep learning and sentinel-1 sar time series | |
Ayaz et al. | Estimation of reference evapotranspiration using machine learning models with limited data | |
Liu et al. | An algorithm for early rice area mapping from satellite remote sensing data in southwestern Guangdong in China based on feature optimization and random Forest | |
von Bloh et al. | Machine learning for soybean yield forecasting in Brazil | |
Zhang et al. | Enhancing model performance in detecting lodging areas in wheat fields using UAV RGB Imagery: Considering spatial and temporal variations | |
Lang et al. | Integrating environmental and satellite data to estimate county-level cotton yield in Xinjiang Province | |
Saravi et al. | Reducing deep learning network structure through variable reduction methods in crop modeling | |
Zhong et al. | Detect and attribute the extreme maize yield losses based on spatio-temporal deep learning | |
Haining | Specification and estimation problems in models of spatial dependence. | |
Toomula et al. | An Extensive Survey of Deep learning-based Crop Yield Prediction Models for Precision Agriculture | |
Hosseini et al. | Areal precipitation coverage ratio for enhanced AI modelling of monthly runoff: a new satellite data-driven scheme for semi-arid mountainous climate |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |