CN112347888A - Remote sensing image scene classification method based on bidirectional feature iterative fusion - Google Patents
- Publication number
- CN112347888A (application CN202011180187.XA)
- Authority
- CN
- China
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06V20/13 Satellite images (Scenes; Scene-specific elements; Terrestrial scenes)
- G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415 Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06N3/045 Combinations of networks
- G06N3/047 Probabilistic or stochastic networks
- G06N3/08 Learning methods
Abstract
The invention discloses a remote sensing image scene classification method based on bidirectional feature iterative fusion, belonging to the field of image processing. First, a novel deep convolutional neural network is designed on the basis of the ResNet34 network model. Second, the remote sensing image is fed into the network for training, and the output of the final convolutional layer of each ResNet34 stage except the first is taken as a subsequent input feature, giving four groups of input features. Then, a Top-Down submodule, a PostProcessor submodule and a Down-Top submodule are designed within the novel bidirectional feature iterative fusion network structure, and the four groups of input features are fed into this structure to generate output features at the corresponding scales. Finally, the highest-level output feature passes through a global average pooling layer into the fully connected layer, whose output serves as the input of the SoftMax layer to classify the remote sensing images.
Description
Technical Field
The invention belongs to the field of image processing, and particularly relates to a remote sensing image scene classification method based on bidirectional feature iterative fusion.
Background
Remote sensing broadly refers to non-contact detection techniques that observe objects from a distance. Because different objects respond very differently to electromagnetic waves of the same band, remote sensing equipment analyzes an object's spectrogram according to this principle and thereby identifies distant objects. Remote sensing technology is generally divided into multispectral, hyperspectral and synthetic aperture radar, and the resulting remote sensing images differ in spatial, spectral and temporal resolution. Spatial resolution refers to the size of the smallest unit whose detail can be distinguished in a remote sensing image. With the continuous development of remote sensing technology, the spatial resolution of remote sensing images has improved in stages: the French SPOT-6 satellite, launched in 2012, provides full-color high-definition ground images at 1.5 m resolution; the US WorldView-3 satellite, launched in 2014, provides full-color high-definition ground images at 0.3 m resolution. In recent years China's remote sensing technology has developed greatly, with ground pixel resolution reaching the sub-meter level: the GF-11 satellite, launched by China in 2018, achieves a ground image resolution of 10 cm or better.
High-spatial-resolution remote sensing images contain rich surface texture information and are widely applied to land surveys, land-cover classification, change detection and related fields, providing an information basis for the implementation of major plans. Because the volume of high-resolution remote sensing data is enormous, how to accurately divide remote sensing images into different functional categories has become a topic of particular interest in academia. In practice, the effectiveness and distinctiveness of sample feature extraction have a decisive influence on the classification accuracy of high-resolution remote sensing images.
Publication CN110443143A discloses a remote sensing image scene classification method fusing multi-branch convolutional neural networks: preprocessed data are passed through an object detection network and an attention network to obtain an object mask map and an attention map respectively; the original image, object mask map and attention map training sets are each fed into a CNN for fine-tuning to obtain optimal classification models; finally, the outputs of the three SoftMax layers are fused at the decision level to obtain the final prediction. However, the three network models yield a large number of parameters and a complex model, which hinders improvements in classification accuracy.
Publication CN110555446A discloses a remote sensing image scene classification method based on multi-scale deep feature fusion and transfer learning. First, a Gaussian pyramid algorithm produces multi-scale remote sensing images, which are input into a fully convolutional network to extract multi-scale deep local features. The image is then cropped to the fixed size required by the CNN and fed into the network to obtain global features from the fully connected layer; compact bilinear pooling encodes the multi-scale deep local features together with the CNN's global features, and fusing the two deep features jointly represents the remote sensing image and strengthens the relationship between the features. Finally, remote sensing image scenes are classified using transfer learning combined with the two methods. Although this method integrates the global and local characteristics of the remote sensing image and enriches the feature information, the semantic and spatial information of the multi-scale deep local features is unevenly distributed, leaving room to improve the classification result.
In summary, existing high-resolution remote sensing image classification methods have many shortcomings, mainly the following:
(1) existing remote sensing image classification methods focus on the high-level features of the last convolutional layer. High-level features emphasize semantic information, and rich semantic information lets a network detect targets accurately. But remote sensing scene classification differs from ordinary object classification: the surroundings of the characteristic object (embodied as spatial information) can also help the network classify, so ignoring them keeps image classification accuracy low;
(2) in traditional multi-scale feature extraction, feature maps of different scales contribute equally to the overall result, whereas experiments show that weighted fusion of feature maps at different scales can improve classification accuracy. Moreover, networks using the conventional weighting form converge slowly.
Disclosure of Invention
Purpose of the invention: aiming at the above problems, the invention provides a remote sensing image scene classification method based on bidirectional feature iterative fusion. The method avoids excessive hand-crafted feature extraction, learns reasonable normalized weight coefficients, and performs feature fusion by cyclically and iteratively using feature maps of different scales and levels so that semantic and spatial information supplement each other, thereby enhancing feature robustness and improving image classification accuracy.
Technical scheme: to achieve the purpose of the invention, the following technical scheme is adopted: a remote sensing image scene classification method based on bidirectional feature iterative fusion, comprising the following steps:
(1) constructing a multi-classification remote sensing image data set, making corresponding sample labels, and dividing each type of remote sensing image into a training set Train and a Test set Test in proportion;
(2) constructing a convolutional neural network ResNet, taking remote sensing image data as the input of the network, dividing convolutional layers with the same output size into the same stage, and dividing the constructed ResNet network model into 5 stages in total;
(3) constructing a bidirectional feature iterative fusion structure comprising three submodules: Top-Down, PostProcessor and Down-Top; the Top-Down submodule comprises 4 feature dimension-reduction branches and 4 adjacent semantic feature fusion structures; a PostProcessor submodule follows each feature fusion structure and contains 2 Residual sub-blocks, each with 4 residual layers; the Down-Top submodule comprises 4 spatial feature fusion structures;
(4) taking the output features of each stage of the ResNet network except the first as the input features of the bidirectional feature iterative fusion structure, performing feature dimension reduction on them through the 4 feature dimension-reduction branches of the Top-Down submodule, and denoting the reduced feature maps as C2, C3, C4, C5;
(5) inputting the reduced feature maps into the adjacent semantic feature fusion structures of the Top-Down submodule with normalized weights, where adjacent feature maps are fused with one another and supplement each other's semantic information, generating semantically enhanced feature maps of the corresponding sizes A2, A3, A4, A5;
(6) inputting the preliminarily enhanced feature maps A2, A3, A4, A5 into the corresponding branches of the PostProcessor structure to generate feature maps B2, B3, B4, B5;
(7) inputting feature maps B2, B3, B4, B5 into the Down-Top spatial feature fusion structures with normalized weights, where the spatial information of adjacent feature maps supplements and refines the features of the corresponding level, generating feature maps of the corresponding sizes P2, P3, P4, P5;
(8) selecting the feature map P5 with the strongest semantic information from step (7) as the input feature of the Classifier Head, passing it through an adaptive global average pooling layer and a fully connected layer, and performing scene classification with SoftMax to obtain the classification result;
(9) according to the steps (4) to (8), training the convolutional neural network based on bidirectional feature iterative fusion by using a remote sensing image data training set to obtain a trained convolutional neural network;
(10) inputting the images in the test set into the trained convolutional neural network to obtain output features Y, and classifying and identifying the output features Y with SoftMax to realize class prediction for the test set.
Further, in step (1), the method for dividing the training set and the test set is as follows:
(1.1) dividing the multi-class remote sensing image dataset Image = [Image1, …, Imagei, …, ImageN] and preparing the corresponding sample labels Label = [Label1, …, Labeli, …, LabelN], where N denotes that there are N classes of remote sensing images in total, Imagei denotes the set of class-i remote sensing images, and Labeli the label of class-i remote sensing images;
(1.2) letting the total number of samples of each class of remote sensing image in the dataset be n, randomly extracting m images of the class to construct the training set Train = [Train1, …, Traini, …, Trainm], and constructing the test set Test = [Test1, …, Testi, …, Testn-m] from the remaining n-m remote sensing images, where Traini denotes the training set of class-i remote sensing images containing m images, and Testi the test set of class-i remote sensing images containing n-m images.
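The per-class split in (1.1) and (1.2) can be sketched as follows; the function and variable names are illustrative, not from the patent:

```python
import random

def split_dataset(images_by_class, m, seed=0):
    """Split each class's n images into a training set of m images and a
    test set of the remaining n - m images, as in step (1.2).
    `images_by_class` maps a class label to its list of image paths."""
    rng = random.Random(seed)
    train, test = {}, {}
    for label, images in images_by_class.items():
        shuffled = images[:]
        rng.shuffle(shuffled)          # random extraction of m images
        train[label] = shuffled[:m]    # Train_i: m images of class i
        test[label] = shuffled[m:]     # Test_i: n - m images of class i
    return train, test
```

Fixing the seed makes the split reproducible; the patent only requires that the m training images be drawn randomly.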
Further, the construction method of the convolutional neural network based on bidirectional feature iterative fusion is as follows:
Building the network on the basis of the ResNet34 model: ResNet34 has 5 stages, denoted S1, S2, S3, S4 and S5, and the last four stages contain 3, 4, 6 and 3 basic modules (BasicBlock) respectively. Conv2_3, Conv3_4, Conv4_6 and Conv5_3 denote the convolution outputs of the last BasicBlock of each stage. The output features of Conv2_3, Conv3_4, Conv4_6 and Conv5_3 serve as the input features of the bidirectional feature iterative fusion structure. A Classification Head is constructed after the bidirectional feature iterative fusion structure; it contains an adaptive global average pooling layer and a fully connected layer, denoted Average Pool and Fc respectively, and the feature map with the strongest semantic information output by the bidirectional feature iterative fusion structure serves as the input feature of the Classification Head.
Further, the training set of remote sensing images is input into the constructed convolutional neural network, the output value of each neuron of the network is computed in a feedforward pass, and the computation function of each layer's feature map and the loss function to be minimized are set:
If layer l is a convolutional layer, the j-th feature map x_j^l of layer l is computed as:

x_j^l = g( Σ_{i=1}^{M_{l-1}} x_i^{l-1} * k_{ij}^l + b_j^l )

where g(·) denotes the activation function, * denotes convolution, x_i^{l-1} denotes the i-th feature map of layer l-1, k_{ij}^l the convolution kernel from x_i^{l-1} to x_j^l, b_j^l the bias of the j-th feature map of layer l, and M_{l-1} the number of feature maps in layer l-1;
If layer l is a pooling layer, the j-th feature map x_j^l of layer l is computed as:

x_j^l = g( β_j^l · down(x_j^{l-1}) + b_j^l )

where g(·) denotes the activation function, β_j^l the pooling parameter of the j-th feature map of layer l, down(·) the pooling function, x_j^{l-1} the corresponding feature map of layer l-1, and b_j^l the bias of the j-th feature map of layer l;
If layer l is a fully connected layer, the j-th output x_j^l of layer l is computed as:

x_j^l = g( f^{l-1} · w_j^l + b_j^l )

where f^{l-1} denotes the weighted combination of all feature maps of layer l-1, w_j^l the corresponding weight vector, b_j^l the bias of the j-th feature map of layer l, and g(·) the activation function;
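The convolutional and pooling layer formulas above can be checked with a minimal NumPy sketch (loop-based and unoptimized, for illustration only; ReLU stands in for the generic activation g(·)):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv_layer(x_prev, kernels, biases):
    # x_j^l = g( sum_i x_i^{l-1} * k_ij^l + b_j^l ): valid 2-D convolution
    # summed over the M_{l-1} input maps.
    # Shapes: x_prev (M, H, W), kernels (M, J, kh, kw), biases (J,).
    M, H, W = x_prev.shape
    _, J, kh, kw = kernels.shape
    out = np.zeros((J, H - kh + 1, W - kw + 1))
    for j in range(J):
        for i in range(M):
            for r in range(out.shape[1]):
                for c in range(out.shape[2]):
                    out[j, r, c] += np.sum(
                        x_prev[i, r:r + kh, c:c + kw] * kernels[i, j])
        out[j] += biases[j]
    return relu(out)

def pool_layer(x_prev, beta, biases, size=2):
    # x_j^l = g( beta_j^l * down(x_j^{l-1}) + b_j^l ) with max pooling.
    M, H, W = x_prev.shape
    down = x_prev.reshape(M, H // size, size, W // size, size).max(axis=(2, 4))
    return relu(beta[:, None, None] * down + biases[:, None, None])
```

The choice of max pooling for down(·) matches the embodiment's MaxPooling layers; other pooling functions would slot in the same way.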
In the fusion structures of the Top-Down and Down-Top submodules, the feature maps are up-sampled by bilinear interpolation to realize the scale change.
The loss function of the deep convolutional neural network is computed by back propagation:
Let the training set of remote sensing images contain N×m images, with any image I_k, k ∈ {1, 2, …, N×m}, where N denotes the total number of remote sensing image classes and m the number of training images per class. For image I_k, if the probability that the deep convolutional neural network correctly predicts class i is p_i, the cross-entropy loss function of the multi-class task is:

Loss_k = - Σ_{i=0}^{N-1} y_i · log(p_i)

where p = [p_0, …, p_i, …, p_{N-1}] is a probability distribution, each element p_i giving the probability that the image belongs to class i; y = [y_0, …, y_i, …, y_{N-1}] is the one-hot representation of the image label, with y_i = 1 when the sample belongs to class i and y_i = 0 otherwise.

The overall cross-entropy loss function is:

Loss = (1/(N×m)) · Σ_{k=1}^{N×m} Loss_k
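A minimal NumPy sketch of the per-image and overall cross-entropy loss; averaging over the N×m images is an assumption, since the patent text does not reproduce the overall formula:

```python
import numpy as np

def cross_entropy(p, y):
    # Loss_k = -sum_i y_i * log(p_i): p is the predicted distribution,
    # y the one-hot label of one image.
    return -np.sum(y * np.log(p))

def total_loss(P, Y):
    # Overall loss over the training images (mean is an assumption).
    return float(np.mean([cross_entropy(p, y) for p, y in zip(P, Y)]))
```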
minimizing a loss function by adopting a gradient descent algorithm, and updating each parameter in the convolutional neural network;
Training the deep convolutional neural network seeks the optimal parameters that minimize the loss function Loss. Let W denote all the parameters of the convolutional neural network. After training the network on the remote sensing image training set, a set of parameters W* is found such that:

W* = argmin_W Loss(W)

where argmin denotes that the value of W at which the loss function is minimal is W*.

The parameters of the convolutional neural network are updated by the gradient descent algorithm while minimizing the loss function Loss:

W^(i) = W^(i-1) - α · ∂Loss/∂W^(i-1)

where α denotes the learning rate, which determines the convergence speed of each step, W^(i) denotes the i-th set of parameters to be updated, W^(i-1) the updated (i-1)-th set of parameters, and ∂Loss/∂W^(i-1) the partial derivative of the loss function Loss with respect to the parameters;
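The update rule can be illustrated with a toy one-parameter example (the quadratic loss here is purely illustrative, not the network's loss):

```python
def gradient_descent_step(w, grad, alpha):
    # W^(i) = W^(i-1) - alpha * dLoss/dW, applied to each parameter.
    return [wi - alpha * gi for wi, gi in zip(w, grad)]

# Toy example: Loss(w) = (w - 3)^2, gradient dLoss/dw = 2 * (w - 3).
w = [0.0]
for _ in range(100):
    w = gradient_descent_step(w, [2.0 * (w[0] - 3.0)], alpha=0.1)
# w converges toward the minimizer w = 3
```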
Normalized weights are adopted in the adjacent semantic feature fusion structure to balance the influence of the multi-level inputs on the final result:

β̄_i = β_i / Σ_{j=1}^{t} β_j

where β_i denotes the raw weight of the current level's input, t denotes the number of inputs of the adjacent semantic feature fusion structure, and β̄_i denotes the normalized weight ratio.
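A sketch of the weight normalization; the small ε guard against a zero denominator and the clamp to non-negative weights are assumptions in the spirit of fast normalized fusion, not stated in the patent:

```python
import numpy as np

def normalize_weights(beta, eps=1e-4):
    # beta_bar_i = beta_i / (sum_j beta_j + eps); eps and the clamp to
    # non-negative values are assumptions, added for numerical safety.
    beta = np.maximum(np.asarray(beta, dtype=float), 0.0)
    return beta / (beta.sum() + eps)
```

Unlike SoftMax normalization, this ratio form needs no exponentials, which is consistent with the patent's claim of faster convergence than SoftMax-normalized weighting.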
Further, the method for fusing adjacent semantic features of the Top-Down module specifically comprises the following steps:
The adjacent semantic feature fusion structure in the bidirectional feature iterative fusion has three inputs, Level_{k+1}, Level_k, Level_{k-1}, corresponding to feature maps of different resolutions at the higher, current and lower levels;

the high-level feature map is up-sampled, the low-level feature map is down-sampled, and the current level uses the identity transformation so that the three can be added and fused; after weights are assigned to the three by the weight normalization method, a weighted element-by-element addition yields the feature map with correspondingly enhanced semantic information:

A_k = β̄_{k+1} · Up(Level_{k+1}) + β̄_k · Level_k + β̄_{k-1} · Down(Level_{k-1})

where A_k denotes the output feature of the same size corresponding to the current level Level_k, and β̄ denotes the normalized weights.
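The three-input fusion can be sketched as below; nearest-neighbour upsampling and max-pooling downsampling stand in for the patent's bilinear interpolation and pooling, and the exact weighted-sum formula is a reconstruction:

```python
import numpy as np

def upsample2(x):
    # Nearest-neighbour stand-in for bilinear upsampling: (C,H,W)->(C,2H,2W).
    return x.repeat(2, axis=1).repeat(2, axis=2)

def downsample2(x):
    # Max-pooling stand-in for the downsampling branch: (C,H,W)->(C,H/2,W/2).
    C, H, W = x.shape
    return x.reshape(C, H // 2, 2, W // 2, 2).max(axis=(2, 4))

def fuse_adjacent(level_hi, level_cur, level_lo, beta, eps=1e-4):
    # Weighted element-wise sum of the resampled high/low neighbours and
    # the identity current level, using normalized weights.
    w = np.maximum(np.asarray(beta, dtype=float), 0.0)
    w = w / (w.sum() + eps)
    return (w[0] * upsample2(level_hi)
            + w[1] * level_cur
            + w[2] * downsample2(level_lo))
```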
Further, step (6) inputs the preliminarily enhanced feature maps A2, A3, A4, A5 into the corresponding branches of the PostProcessor structure to generate feature maps B2, B3, B4, B5, specifically:

feature maps A2, A3, A4, A5 are input as the first Residual sub-block of each PostProcessor branch; on the bypass, a convolutional layer with 1x1 kernels performs feature dimension reduction; on the main path, convolutional layers with kernel sizes 1x1, 3x3 and 1x1 are applied in sequence to refine the features; the reduced features and the refined features are added and fused element by element to obtain new feature maps A2_1, A3_1, A4_1, A5_1;

the computed A2_1, A3_1, A4_1, A5_1 serve as the inputs of the second Residual sub-block of each branch; again, a 1x1 convolutional layer on the bypass performs feature dimension reduction, convolutional layers with kernel sizes 1x1, 3x3 and 1x1 on the main path refine the features, and the reduced and refined features are added and fused element by element to obtain the new feature maps B2, B3, B4, B5.
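A loop-based NumPy sketch of one Residual sub-block: bypass 1x1 reduction plus a 1x1-3x3-1x1 main-path refinement, added element-wise. Activation placement and the absence of batch normalization are simplifying assumptions:

```python
import numpy as np

def conv1x1(x, w):
    # A 1x1 convolution is a per-pixel channel mix: x (C,H,W), w (C_out,C).
    return np.einsum('oc,chw->ohw', w, x)

def conv3x3_same(x, w):
    # 3x3 same-padded convolution: x (C,H,W), w (C_out,C,3,3).
    C, H, W = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], H, W))
    for o in range(w.shape[0]):
        for c in range(C):
            for dr in range(3):
                for dc in range(3):
                    out[o] += w[o, c, dr, dc] * xp[c, dr:dr + H, dc:dc + W]
    return out

def residual_subblock(x, params):
    # Bypass: 1x1 conv for dimension reduction; main path: 1x1 -> 3x3 -> 1x1
    # refinement; the two are added element-wise as in step (6).
    bypass = conv1x1(x, params['w_bypass'])
    main = conv1x1(x, params['w1'])
    main = np.maximum(conv3x3_same(main, params['w3']), 0.0)
    main = conv1x1(main, params['w2'])
    return np.maximum(bypass + main, 0.0)
```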
Further, step (8) classifies the feature map P5 using the Classifier Head structure as follows:

feature map P5 is input into the Classifier Head and passes through the global average pooling layer to obtain the output feature X; the pooling-layer output X serves as the input of the fully connected layer, which produces the output feature Y:
Y=[y1,y2,…,yn]
wherein n represents n classes of images in the dataset;
For the fully connected layer output Y, the SoftMax method computes the SoftMax value of each remote sensing image sample belonging to class i as:

S_i = e^{y_i} / Σ_{j=1}^{n} e^{y_j}

where y_i and y_j denote the i-th and j-th elements of the input feature, e denotes the natural constant, and S_i the probability value that the picture belongs to class i. The final probability value of the remote sensing image is:

S = max(S_1, S_2, …, S_n)

where max(·) denotes taking the maximum of the n values S_i; the label type corresponding to the maximal probability S_i is taken as the predicted class value Predict_label_i of the i-th remote sensing image sample.
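The SoftMax prediction of step (8) can be sketched as follows (the max-shift for numerical stability is a standard addition, not stated in the patent):

```python
import numpy as np

def softmax(y):
    # S_i = e^{y_i} / sum_j e^{y_j}, shifted by max(y) for stability.
    e = np.exp(y - np.max(y))
    return e / e.sum()

def predict_label(y, labels):
    # Take the class whose SoftMax probability is largest, as in S = max(S_i).
    s = softmax(np.asarray(y, dtype=float))
    return labels[int(np.argmax(s))], float(s.max())
```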
According to the prediction results, the parameters of the convolutional layers are continually optimized with the gradient descent algorithm so that the predicted class values of all training samples equal their label values Label, until the loss function value is minimal.
Beneficial effects: compared with the prior art, the technical scheme of the invention has the following beneficial technical effects:
(1) the method automatically learns and extracts the deep features of the remote sensing image through the deep convolutional neural network, avoiding hand-crafted feature extraction, reducing complexity and reducing human intervention;
(2) the method uses the bidirectional feature iterative fusion structure to comprehensively refine and enhance features of different scales and levels, avoiding the limitation on classification accuracy previously caused by the last convolutional layer's lack of spatial feature information;
(3) the method assigns weights to the different inputs of the adjacent semantic feature fusion structure and normalizes the weight tensors, better balancing the influence of feature maps at different levels on the current level's features; compared with SoftMax normalization, this normalization operation accelerates network convergence.
Drawings
FIG. 1 is a block diagram of an embodiment of the present invention;
fig. 2 is a structural diagram of the constructed neural network.
Detailed Description
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
The invention relates to a remote sensing image scene classification method based on bidirectional feature iterative fusion, which comprises the following steps:
(1) and constructing a multi-classification remote sensing image data set, manufacturing a corresponding sample label, and dividing each type of remote sensing image into a training set Train and a Test set Test in proportion.
(1.1) dividing the multi-class remote sensing image dataset Image = [Image1, …, Imagei, …, ImageN] and preparing the corresponding sample labels Label = [Label1, …, Labeli, …, LabelN], where N denotes that there are N classes of remote sensing images in total, Imagei denotes the set of class-i remote sensing images, and Labeli the label of class-i remote sensing images;
(1.2) letting the total number of samples of each class of remote sensing image in the dataset be n, randomly extracting m images of the class to construct the training set Train = [Train1, …, Traini, …, Trainm], and constructing the test set Test = [Test1, …, Testi, …, Testn-m] from the remaining n-m remote sensing images, where Traini denotes the training set of class-i remote sensing images containing m images, and Testi the test set of class-i remote sensing images containing n-m images.
(2) Building the network on the basis of the ResNet34 model: the remote sensing image data serve as the network input, and convolutional layers with the same output size are grouped into the same stage; the ResNet34 model has 5 stages, denoted S1, S2, S3, S4 and S5, and the last four stages contain 3, 4, 6 and 3 BasicBlock basic modules respectively; Conv2_3, Conv3_4, Conv4_6 and Conv5_3 denote the convolution outputs of the last BasicBlock of each stage.
(3) Constructing the bidirectional feature iterative fusion structure comprising the three submodules Top-Down, PostProcessor and Down-Top. The Top-Down submodule comprises 4 feature dimension-reduction branches, denoted DownChannel1, DownChannel2, DownChannel3 and DownChannel4, and 4 adjacent semantic feature fusion structures, denoted TopDown1, TopDown2, TopDown3 and TopDown4. A PostProcessor submodule follows each feature fusion structure and contains 2 Residual sub-blocks, each with 4 residual layers, denoted Residual1 to Residual8. The Down-Top submodule comprises 4 spatial feature fusion structures, denoted DownTop1, DownTop2, DownTop3 and DownTop4. A Classification Head structure is designed after the bidirectional feature iterative fusion structure, containing an adaptive global average pooling layer and a fully connected layer, denoted Average Pool and Fc respectively. The convolutional layers extract and process the feature maps, the pooling layer compresses the feature maps obtained by the convolutional layers, and the fully connected layer converts the feature maps into a one-dimensional vector.
In this embodiment, the constructed convolutional neural network based on bidirectional feature iterative fusion has the following specific parameters:
(a) in the first stage S1, each remote sensing image is resized to 224×224 and normalized; a convolution layer with kernel size 7×7, stride 2 and padding 3 is defined;
(b) in stage S2, 1 pooling layer is defined with MaxPooling as the pooling mode; 3 BasicBlocks are defined, each containing 2 layers, with 64 convolution kernels of size 3×3 per layer and stride 1;
(c) in stage S3, 4 BasicBlocks are defined, each containing 2 layers, with 128 convolution kernels of size 3×3 per layer and stride 1;
(d) in stage S4, 6 BasicBlocks are defined, each containing 2 layers, with 256 convolution kernels of size 3×3 per layer and stride 1;
(e) in stage S5, 3 BasicBlocks are defined, each containing 2 layers, with 512 convolution kernels of size 3×3 per layer and stride 1;
(f) in each of the feature dimension-reduction branches DownChannel1, DownChannel2, DownChannel3 and DownChannel4, 256 convolution kernels of size 1×1 are defined, with stride 1;
(g) in the two-way input of the fusion structure TopDown1, the lower branch defines 1 pooling layer with MaxPooling as the pooling mode; in the three-way inputs of TopDown2 and TopDown3, the upper branch defines 1 upsampling layer using bilinear interpolation and the lower branch defines 1 pooling layer using MaxPooling; in the two-way input of TopDown4, the upper branch defines 1 upsampling layer using bilinear interpolation;
(h) in each PostProcessor branch, 2 Residual sub-blocks are defined; in each sub-block the main path defines 3 convolution layers with kernel sizes 1×1, 3×3 and 1×1, output channel counts 64, 64 and 256, and stride 1; the bypass in each Residual sub-block defines 1 convolution layer with 256 kernels of size 1×1 and stride 1;
(j) in the two-way input of the fusion structure DownTop1, the upper branch defines 1 upsampling layer using bilinear interpolation; in the three-way inputs of DownTop2 and DownTop3, the upper branch defines 1 upsampling layer using bilinear interpolation and the lower branch defines 1 pooling layer using MaxPooling; in DownTop4, the lower branch defines 1 pooling layer with MaxPooling;
(k) in the Classifier Head, 1 pooling layer is defined with AdaptiveAveragePool as the pooling mode and output size 1×1; a fully connected layer Fc is also defined.
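The spatial sizes implied by these settings can be checked with the standard convolution output-size formula. The sketch below walks a 224×224 input through the stages; the stride-2 downsampling at the first block of S3 to S5 is an assumption based on the usual ResNet34 layout, which the stride values listed above do not spell out:

```python
def conv_out(size, kernel, stride, pad):
    """Output spatial size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

size = conv_out(224, 7, 2, 3)      # S1: 7x7 conv, stride 2, padding 3 -> 112
size = conv_out(size, 3, 2, 1)     # S2: 3x3 max-pool, stride 2        -> 56
stage_sizes = {"S2": size}
for name in ("S3", "S4", "S5"):    # assumed stride-2 downsample entering each stage
    size = conv_out(size, 3, 2, 1)
    stage_sizes[name] = size
```

Under these assumptions the four stage outputs feeding the fusion structure are 56×56, 28×28, 14×14 and 7×7.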
(4) Taking the output features of Conv2_3, Conv3_4, Conv4_6 and Conv5_3 of the ResNet34 model as the input features of the bidirectional feature iterative fusion structure, performing feature dimension reduction on the inputs through the 4 feature dimension-reduction branches of the Top-Down submodule, and denoting the feature maps generated after reduction as C2, C3, C4, C5.
(5) Inputting the dimension-reduced feature maps into the adjacent-semantic-feature fusion structures of the Top-Down submodule with normalized weights; within these structures, adjacent feature maps are fused with one another to supplement semantic information, generating semantically enhanced feature maps A2, A3, A4, A5 of corresponding sizes.
The fusion method of the adjacent semantic features of the Top-Down module specifically comprises the following steps:
the adjacent-semantic-feature fusion structure in the bidirectional feature iterative fusion has three inputs, Level_{k+1}, Level_k and Level_{k−1}, corresponding to high-level, current-level and low-level feature maps of different resolutions;
the high-level feature map is upsampled, the low-level feature map is downsampled, and the current level uses an identity transformation, so that the three can be added and fused; after weights are assigned to the three by the weight normalization method, a weighted element-by-element addition yields the feature map with enhanced semantic information:

A_k = ω_{k+1} · Up(Level_{k+1}) + ω_k · Level_k + ω_{k−1} · Down(Level_{k−1})

wherein A_k denotes the output feature of the same size as the current-level feature Level_k, Up(·) and Down(·) denote the upsampling and downsampling operations, and ω denotes the normalized weights.
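A toy numerical sketch of this weighted three-input fusion, with nearest-neighbour upsampling and stride-2 subsampling standing in for the bilinear interpolation and max-pooling of the patent (the function name and the substitute resizing operations are assumptions for brevity):

```python
import numpy as np

def fuse_adjacent(level_hi, level_cur, level_lo, betas):
    """Weighted element-wise sum after resizing both neighbours to the current level."""
    w = np.asarray(betas, dtype=float)
    w = w / w.sum()                              # weight normalization
    up = np.kron(level_hi, np.ones((2, 2)))      # 2x nearest-neighbour upsample
    down = level_lo[::2, ::2]                    # stride-2 subsample
    return w[0] * up + w[1] * level_cur + w[2] * down

a = fuse_adjacent(np.ones((2, 2)), np.ones((4, 4)), np.ones((8, 8)), [1.0, 1.0, 1.0])
```

Because the normalized weights sum to 1, fusing three all-ones inputs returns an all-ones map of the current level's size.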
(6) The preliminarily enhanced feature maps A2, A3, A4, A5 are input into the corresponding branches of the PostProcessor structure to generate feature maps B2, B3, B4, B5, specifically as follows:
feature maps A2, A3, A4, A5 are each input to the first Residual sub-block of their PostProcessor branch; on the bypass, a convolution layer with kernel size 1×1 performs feature dimension reduction; on the main path, convolution layers with kernel sizes 1×1, 3×3 and 1×1 are applied in sequence to refine the features; the dimension-reduced features and the refined features are added and fused element by element to obtain new feature maps A2_1, A3_1, A4_1, A5_1;
the computed A2_1, A3_1, A4_1, A5_1 are then used as the inputs to the second Residual sub-block of each branch; on the bypass, a convolution layer with kernel size 1×1 performs feature dimension reduction; on the main path, convolution layers with kernel sizes 1×1, 3×3 and 1×1 refine the features, and the dimension-reduced and refined features are added and fused element by element to obtain new feature maps B2, B3, B4, B5.
(7) Feature maps B2, B3, B4, B5 are input into the Down-Top spatial feature fusion structures with normalized weights; within these structures, the spatial information of adjacent feature maps supplements and refines the features of the corresponding level, generating feature maps P2, P3, P4, P5 of corresponding sizes.
(8) The feature map P5 with the strongest semantic information from step (7) is selected as the input feature of the Classifier Head; after the adaptive global average pooling layer and the fully connected layer Fc within the Classifier Head, scene classification is performed with SoftMax to obtain the classification result, specifically as follows:
feature map P5 is taken as the input of the Classifier Head, and the output feature X is obtained through the global average pooling layer; the pooling layer's output X is then taken as the input of the fully connected layer, yielding the fully connected layer's output feature Y:
Y = [y_1, y_2, …, y_n]
wherein n denotes the number of image classes in the dataset;
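The adaptive global average pooling that produces X can be sketched in plain Python: each channel's H×W map collapses to a single value, giving the 1×1 output described above (the function and variable names are illustrative):

```python
def global_avg_pool(feature_map):
    """feature_map: list of channels, each a 2-D list of floats -> one value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]

x = global_avg_pool([[[1.0, 2.0], [3.0, 4.0]],    # channel 0 averages to 2.5
                     [[0.0, 0.0], [0.0, 8.0]]])   # channel 1 averages to 2.0
```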
aiming at the output feature Y of the fully connected layer, the SoftMax value of each remote sensing image sample belonging to the ith class is calculated by the SoftMax method as:

S_i = e^{y_i} / Σ_{j=1}^{n} e^{y_j}

wherein y_i and y_j denote the ith and jth elements of the input feature, e denotes the natural constant, and S_i denotes the probability that the image belongs to the ith class; the final probability value of the remote sensing image sample is:
S = max(S_1, S_2, …, S_n)
wherein max(·) denotes taking the maximum of the n values S_i; the label class corresponding to the maximum S_i is taken as the predicted class value Predict_label_i of the ith remote sensing image sample;
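The SoftMax and argmax steps above can be sketched in plain Python; the subtraction of the maximum logit is a standard numerical-stability practice added here, not stated in the patent, and the class names are illustrative:

```python
import math

def softmax(y):
    m = max(y)
    exps = [math.exp(v - m) for v in y]   # subtract max for numerical stability
    total = sum(exps)
    return [v / total for v in exps]

def predict_label(y, labels):
    s = softmax(y)
    return labels[s.index(max(s))]        # label of the largest probability S_i

probs = softmax([1.0, 2.0, 3.0])
pred = predict_label([1.0, 2.0, 3.0], ["farmland", "forest", "harbor"])
```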
According to the prediction results, the parameters of the convolutional layers are continuously optimized with a gradient descent algorithm so that the predicted class values of the training samples match their label values Label, until the loss function value is minimized.
(9) Following steps (4) to (8), the convolutional neural network based on bidirectional feature iterative fusion is trained with the remote sensing image data training set to obtain the trained convolutional neural network.
The training set of remote sensing images is input into the constructed convolutional neural network, the output value of each neuron of the network is calculated in a feedforward manner, and the calculation function of each layer's feature map and the loss function to be minimized are set as follows:
if layer l is a convolutional layer, the jth feature map x_j^l of layer l is calculated as:

x_j^l = g( Σ_{i=1}^{M_{l−1}} x_i^{l−1} * k_{ij}^l + b_j^l )

wherein g(·) denotes the activation function, * denotes the convolution operation, x_i^{l−1} denotes the ith feature map of layer l−1, k_{ij}^l denotes the convolution kernel from x_i^{l−1} to x_j^l, b_j^l denotes the bias of the jth feature map of layer l, and M_{l−1} denotes the number of feature maps of layer l−1;
if layer l is a pooling layer, the jth feature map x_j^l of layer l is calculated as:

x_j^l = g( β_j^l · down(x_j^{l−1}) + b_j^l )

wherein g(·) denotes the activation function, β_j^l denotes the pooling parameter of the jth feature map of layer l, down(·) denotes the pooling function, x_j^{l−1} denotes the corresponding feature map of layer l−1, and b_j^l denotes the bias of the jth feature map of layer l;
if layer l is a fully connected layer, the jth feature map x_j^l of layer l is calculated as:

x_j^l = g( f^{l−1} + b_j^l )

wherein f^{l−1} denotes a weighted sum of all feature maps of layer l−1, b_j^l denotes the bias of the jth feature map of layer l, and g(·) denotes the activation function;
in the fusion structures of the Top-Down and Down-Top submodules, the feature maps are upsampled with bilinear interpolation to realize the scale change;
the loss function of the deep convolutional neural network is calculated by back propagation:
the training set of remote sensing images is set to contain N×m images, any one of which is I_k, k ∈ {1, 2, …, N×m}, wherein N denotes the total number of remote sensing image classes and m denotes the number of images of each class in the training set; for image I_k, if the probability that the deep convolutional neural network correctly predicts class i is p_i, the cross-entropy loss in the multi-class task is:

Loss_k = − Σ_{i=0}^{N−1} y_i · log(p_i)

wherein p = [p_0, …, p_i, …, p_{N−1}] is a probability distribution, each element p_i representing the probability that the image belongs to class i; y = [y_0, …, y_i, …, y_{N−1}] is the one-hot representation of the image label, with y_i = 1 when the sample belongs to class i and y_i = 0 otherwise;
the overall cross-entropy loss function is:

Loss = (1 / (N×m)) · Σ_{k=1}^{N×m} Loss_k
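A minimal sketch of the per-image and averaged cross-entropy loss (the function names are illustrative):

```python
import math

def cross_entropy(p, y):
    """-sum_i y_i * log(p_i) for a probability vector p and one-hot label y."""
    return -sum(yi * math.log(pi) for yi, pi in zip(y, p) if yi)

def mean_loss(batch_p, batch_y):
    """Average the per-image losses over all N*m training images."""
    return sum(cross_entropy(p, y) for p, y in zip(batch_p, batch_y)) / len(batch_p)

loss = cross_entropy([0.25, 0.75], [0, 1])   # equals -log(0.75)
```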
minimizing a loss function by adopting a gradient descent algorithm, and updating each parameter in the convolutional neural network;
the deep convolutional neural network is trained for the optimal parameters that minimize the loss function Loss; denoting all parameters of the convolutional neural network by W, training the network with the remote sensing image training set finds a set of parameters W* such that:

W* = argmin_W Loss(W)

wherein argmin denotes that the value of W at which the loss function is minimal is W*;
the parameters of the convolutional neural network are updated by the gradient descent algorithm while the loss function Loss is minimized:

W^{(i)} = W^{(i−1)} − α · ∂Loss/∂W^{(i−1)}

wherein α denotes the learning rate, which determines the convergence speed of each step, W^{(i)} denotes the ith set of parameters to be updated, W^{(i−1)} denotes the updated (i−1)th set of parameters, and ∂Loss/∂W^{(i−1)} denotes the partial derivative of the loss function Loss with respect to the parameters W^{(i−1)};
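The update rule can be demonstrated on a one-parameter toy problem, minimizing (w − 3)² as a stand-in for the network loss (the helper name and hyperparameter values are illustrative):

```python
def gradient_descent(grad, w0, alpha, steps):
    """Iterate w_i = w_{i-1} - alpha * dLoss/dw for a fixed number of updates."""
    w = w0
    for _ in range(steps):
        w = w - alpha * grad(w)
    return w

# Loss(w) = (w - 3)^2, so dLoss/dw = 2 * (w - 3); the minimizer is w* = 3
w_star = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0, alpha=0.1, steps=100)
```

With alpha = 0.1 each step shrinks the error by a factor of 0.8, so 100 steps land essentially at the minimizer.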
normalized weights are adopted in the adjacent-semantic-feature fusion structure to balance the influence of the multi-level inputs on the final result:

ω_i = β_i / Σ_{j=1}^{t} β_j

wherein β_i denotes the original weight of the current-level input, t denotes the number of inputs of the adjacent-semantic-feature fusion structure, and ω_i denotes the normalized weight ratio.
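A sketch of this weight normalization; the small ε stabilizer and the clamping of the learnable weights to non-negative values are assumptions borrowed from common fast-normalized-fusion practice, not stated in the patent:

```python
def normalized_weights(betas, eps=1e-4):
    clipped = [max(b, 0.0) for b in betas]   # keep learnable weights non-negative (assumed)
    total = sum(clipped) + eps               # eps avoids division by zero (assumed)
    return [b / total for b in clipped]

w = normalized_weights([1.0, 1.0, 2.0])
```

The resulting ratios sum to (almost exactly) 1, so no single input level can dominate the fusion unchecked.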
(10) The images of the test set are input into the trained convolutional neural network to obtain the output feature Y, which is classified and identified with SoftMax, thereby realizing class prediction for the test set.
To evaluate the invention, a different remote sensing image scene classification algorithm is selected for comparison with the proposed method: "A remote sensing image scene classification method [P]. Chinese patent CN104680173A, 2015-06-03" provides a high-resolution remote sensing image classification method using an SVM classifier on sparse-coding spatial pyramid matching features, referred to as Method 1. Table 1 compares the performance of the two methods on the public high-resolution remote sensing scene image dataset UCMerced_LandUse. The results show that the proposed method achieves better remote sensing image scene classification.
TABLE 1
The foregoing is a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.
Claims (7)
1. The remote sensing image scene classification method based on bidirectional feature iterative fusion is characterized by comprising the following steps of:
(1) constructing a multi-classification remote sensing image data set, making corresponding sample labels, and dividing each type of remote sensing image into a training set Train and a Test set Test in proportion;
(2) constructing a convolutional neural network ResNet, taking remote sensing image data as the input of the network, dividing convolutional layers with the same output size into the same stage, and dividing the constructed ResNet network model into 5 stages in total;
(3) constructing a bidirectional feature iterative fusion structure which comprises three submodules of Top-Down, Postprocessor and Down-Top; the Top-Down sub-module comprises 4 paths of feature dimension reduction branches and 4 adjacent semantic feature fusion structures; connecting a PostProcessor submodule behind each feature fusion structure, wherein the PostProcessor submodule internally comprises 2 Residual subblocks, and each subblock respectively comprises 4 Residual error layers; the Down-Top sub-module comprises 4 spatial feature fusion structures;
(4) taking the output features of each stage of the ResNet network except the first stage as the input features of the bidirectional feature iterative fusion structure, performing feature dimension reduction on the inputs through the 4 feature dimension-reduction branches of the Top-Down submodule, and denoting the feature maps generated after reduction as C2, C3, C4, C5;
(5) inputting the dimension-reduced feature maps into the adjacent-semantic-feature fusion structures of the Top-Down submodule with normalized weights, in which adjacent feature maps are fused with one another to supplement semantic information, generating semantically enhanced feature maps A2, A3, A4, A5 of corresponding sizes;
(6) inputting the preliminarily enhanced feature maps A2, A3, A4, A5 into the corresponding branches of the PostProcessor structure to generate feature maps B2, B3, B4, B5;
(7) inputting feature maps B2, B3, B4, B5 into the Down-Top spatial feature fusion structures with normalized weights, supplementing and refining the features of the corresponding levels with the spatial information of adjacent feature maps, and generating feature maps P2, P3, P4, P5 of corresponding sizes;
(8) selecting the feature map P5 with the strongest semantic information from step (7) as the input feature of a Classifier Head, and performing scene classification with SoftMax after the adaptive global average pooling layer and the fully connected layer to obtain the classification result;
(9) according to the steps (4) to (8), training the convolutional neural network based on bidirectional feature iterative fusion by using a remote sensing image data training set to obtain a trained convolutional neural network;
(10) and inputting the images in the test set into a trained convolutional neural network to obtain output characteristics Y, and classifying and identifying the output characteristics Y by utilizing SoftMax to further realize the class prediction of the test set.
2. The remote sensing image scene classification method based on bidirectional feature iterative fusion of claim 1, characterized in that in step (1), the method for dividing the training set and the test set is as follows:
(1.1) dividing a multi-class remote sensing image dataset Image = [Image_1, …, Image_i, …, Image_N] and preparing corresponding sample labels Label = [Label_1, …, Label_i, …, Label_N], wherein N denotes the total number of remote sensing image classes, Image_i denotes the set of ith-class remote sensing images, and Label_i denotes the label of the ith-class remote sensing images;
(1.2) setting the total number of samples of each class of remote sensing image in the remote sensing image dataset as n, randomly extracting m images from the class to construct a training set Train = [Train_1, …, Train_i, …, Train_m], and constructing a test set Test = [Test_1, …, Test_i, …, Test_{n−m}] from the remaining n−m remote sensing images, wherein Train_i denotes the training subset of the ith-class remote sensing images, containing m images, and Test_i denotes the test subset of the ith-class remote sensing images, containing n−m images.
3. The remote sensing image scene classification method based on bidirectional feature iterative fusion of claim 1 is characterized in that the construction method of the convolutional neural network based on bidirectional feature iterative fusion is as follows:
building a network based on the ResNet34 model: the ResNet34 model has 5 stages, denoted S1, S2, S3, S4 and S5, and the last four stages contain 3, 4, 6 and 3 basic modules (BasicBlock), respectively; Conv2_3, Conv3_4, Conv4_6 and Conv5_3 denote the convolution outputs of the last BasicBlock of each of these stages; the output features of Conv2_3, Conv3_4, Conv4_6 and Conv5_3 are taken as the input features of the bidirectional feature iterative fusion structure; a Classification Head is constructed after the bidirectional feature iterative fusion structure, comprising an adaptive global average pooling layer and a fully connected layer, denoted AveragePool and Fc respectively, and the feature map with the strongest semantic information output by the bidirectional feature iterative fusion structure is taken as the input feature of the Classification Head.
4. The remote sensing image scene classification method based on bidirectional feature iterative fusion of claim 3, characterized in that a training set of remote sensing images is input into a constructed convolutional neural network, the output value of each neuron of the convolutional neural network is calculated in a feedforward manner, and a calculation function of each layer of feature map and a minimum loss function are set:
if layer l is a convolutional layer, the jth feature map x_j^l of layer l is calculated as:

x_j^l = g( Σ_{i=1}^{M_{l−1}} x_i^{l−1} * k_{ij}^l + b_j^l )

wherein g(·) denotes the activation function, * denotes the convolution operation, x_i^{l−1} denotes the ith feature map of layer l−1, k_{ij}^l denotes the convolution kernel from x_i^{l−1} to x_j^l, b_j^l denotes the bias of the jth feature map of layer l, and M_{l−1} denotes the number of feature maps of layer l−1;
if layer l is a pooling layer, the jth feature map x_j^l of layer l is calculated as:

x_j^l = g( β_j^l · down(x_j^{l−1}) + b_j^l )

wherein g(·) denotes the activation function, β_j^l denotes the pooling parameter of the jth feature map of layer l, down(·) denotes the pooling function, x_j^{l−1} denotes the corresponding feature map of layer l−1, and b_j^l denotes the bias of the jth feature map of layer l;
if layer l is a fully connected layer, the jth feature map x_j^l of layer l is calculated as:

x_j^l = g( f^{l−1} + b_j^l )

wherein f^{l−1} denotes a weighted sum of all feature maps of layer l−1, b_j^l denotes the bias of the jth feature map of layer l, and g(·) denotes the activation function;
in the fusion structures of the Top-Down and Down-Top submodules, the feature maps are upsampled with bilinear interpolation to realize the scale change;
the loss function of the deep convolutional neural network is calculated by back propagation:
the training set of remote sensing images is set to contain N×m images, any one of which is I_k, k ∈ {1, 2, …, N×m}, wherein N denotes the total number of remote sensing image classes and m denotes the number of images of each class in the training set; for image I_k, if the probability that the deep convolutional neural network correctly predicts class i is p_i, the cross-entropy loss in the multi-class task is:

Loss_k = − Σ_{i=0}^{N−1} y_i · log(p_i)

wherein p = [p_0, …, p_i, …, p_{N−1}] is a probability distribution, each element p_i representing the probability that the image belongs to class i; y = [y_0, …, y_i, …, y_{N−1}] is the one-hot representation of the image label, with y_i = 1 when the sample belongs to class i and y_i = 0 otherwise;
the overall cross-entropy loss function is:

Loss = (1 / (N×m)) · Σ_{k=1}^{N×m} Loss_k
minimizing a loss function by adopting a gradient descent algorithm, and updating each parameter in the convolutional neural network;
the deep convolutional neural network is trained for the optimal parameters that minimize the loss function Loss; denoting all parameters of the convolutional neural network by W, training the network with the remote sensing image training set finds a set of parameters W* such that:

W* = argmin_W Loss(W)

wherein argmin denotes that the value of W at which the loss function is minimal is W*;
the parameters of the convolutional neural network are updated by the gradient descent algorithm while the loss function Loss is minimized:

W^{(i)} = W^{(i−1)} − α · ∂Loss/∂W^{(i−1)}

wherein α denotes the learning rate, which determines the convergence speed of each step, W^{(i)} denotes the ith set of parameters to be updated, W^{(i−1)} denotes the updated (i−1)th set of parameters, and ∂Loss/∂W^{(i−1)} denotes the partial derivative of the loss function Loss with respect to the parameters W^{(i−1)};
normalized weights are adopted in the adjacent-semantic-feature fusion structure to balance the influence of the multi-level inputs on the final result:

ω_i = β_i / Σ_{j=1}^{t} β_j

wherein β_i denotes the original weight of the current-level input, t denotes the number of inputs of the adjacent-semantic-feature fusion structure, and ω_i denotes the normalized weight ratio.
5. The remote sensing image scene classification method based on bidirectional feature iterative fusion of claim 1, characterized in that the adjacent semantic feature fusion method of the Top-Down module is as follows:
the adjacent-semantic-feature fusion structure in the bidirectional feature iterative fusion has three inputs, Level_{k+1}, Level_k and Level_{k−1}, corresponding to high-level, current-level and low-level feature maps of different resolutions;
the high-level feature map is upsampled, the low-level feature map is downsampled, and the current level uses an identity transformation, so that the three can be added and fused; after weights are assigned to the three by the weight normalization method, a weighted element-by-element addition yields the feature map with enhanced semantic information:

A_k = ω_{k+1} · Up(Level_{k+1}) + ω_k · Level_k + ω_{k−1} · Down(Level_{k−1})

wherein A_k denotes the output feature of the same size as the current-level feature Level_k, Up(·) and Down(·) denote the upsampling and downsampling operations, and ω denotes the normalized weights.
6. The remote sensing image scene classification method based on bidirectional feature iterative fusion of claim 1, characterized in that in step (6), the preliminarily enhanced feature maps A2, A3, A4, A5 are input into the corresponding branches of the PostProcessor structure to generate feature maps B2, B3, B4, B5, specifically:
feature maps A2, A3, A4, A5 are each input to the first Residual sub-block of their PostProcessor branch; on the bypass, a convolution layer with kernel size 1×1 performs feature dimension reduction; on the main path, convolution layers with kernel sizes 1×1, 3×3 and 1×1 are applied in sequence to refine the features; the dimension-reduced features and the refined features are added and fused element by element to obtain new feature maps A2_1, A3_1, A4_1, A5_1;
the computed A2_1, A3_1, A4_1, A5_1 are then used as the inputs to the second Residual sub-block of each branch; on the bypass, a convolution layer with kernel size 1×1 performs feature dimension reduction; on the main path, convolution layers with kernel sizes 1×1, 3×3 and 1×1 refine the features, and the dimension-reduced and refined features are added and fused element by element to obtain new feature maps B2, B3, B4, B5.
7. The remote sensing image scene classification method based on bidirectional feature iterative fusion of claim 4, characterized in that in step (8), the feature map P5 is classified by the Classifier Head structure as follows:
feature map P5 is taken as the input of the Classifier Head, and the output feature X is obtained through the global average pooling layer; the pooling layer's output X is then taken as the input of the fully connected layer, yielding the fully connected layer's output feature Y:
Y = [y_1, y_2, …, y_n]
wherein n denotes the number of image classes in the dataset;
aiming at the output feature Y of the fully connected layer, the SoftMax value of each remote sensing image sample belonging to the ith class is calculated by the SoftMax method as:

S_i = e^{y_i} / Σ_{j=1}^{n} e^{y_j}

wherein y_i and y_j denote the ith and jth elements of the input feature, e denotes the natural constant, and S_i denotes the probability that the image belongs to the ith class; the final probability value of the remote sensing image sample is:
S = max(S_1, S_2, …, S_n)
wherein max(·) denotes taking the maximum of the n values S_i; the label class corresponding to the maximum S_i is taken as the predicted class value Predict_label_i of the ith remote sensing image sample;
according to the prediction results, the parameters of the convolutional layers are continuously optimized with a gradient descent algorithm so that the predicted class values of the training samples match their label values Label, until the loss function value is minimized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011180187.XA CN112347888B (en) | 2020-10-29 | 2020-10-29 | Remote sensing image scene classification method based on bi-directional feature iterative fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112347888A true CN112347888A (en) | 2021-02-09 |
CN112347888B CN112347888B (en) | 2023-08-08 |
Family
ID=74356533
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011180187.XA Active CN112347888B (en) | 2020-10-29 | 2020-10-29 | Remote sensing image scene classification method based on bi-directional feature iterative fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112347888B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112699855A (en) * | 2021-03-23 | 2021-04-23 | 腾讯科技(深圳)有限公司 | Image scene recognition method and device based on artificial intelligence and electronic equipment |
CN113128564A (en) * | 2021-03-23 | 2021-07-16 | 武汉泰沃滋信息技术有限公司 | Typical target detection method and system based on deep learning under complex background |
CN113239736A (en) * | 2021-04-16 | 2021-08-10 | 广州大学 | Land cover classification annotation graph obtaining method, storage medium and system based on multi-source remote sensing data |
CN113420838A (en) * | 2021-08-20 | 2021-09-21 | 中国科学院空天信息创新研究院 | SAR and optical image classification method based on multi-scale attention feature fusion |
CN113807219A (en) * | 2021-09-06 | 2021-12-17 | 苏州中科蓝迪软件技术有限公司 | Method for identifying types of grain and oil crops in planting land by steps |
CN114022752A (en) * | 2021-11-04 | 2022-02-08 | 中国人民解放军国防科技大学 | SAR target detection method based on attention feature refinement and alignment |
CN114792398A (en) * | 2022-06-23 | 2022-07-26 | 阿里巴巴(中国)有限公司 | Image classification method and target data classification model construction method |
US20230067442A1 (en) * | 2021-08-31 | 2023-03-02 | Black Sesame International Holding Limited | Method of human pose estimation |
CN116740069A (en) * | 2023-08-15 | 2023-09-12 | 山东锋士信息技术有限公司 | Surface defect detection method based on multi-scale significant information and bidirectional feature fusion |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110728192A (en) * | 2019-09-16 | 2020-01-24 | 河海大学 | High-resolution remote sensing image classification method based on novel characteristic pyramid depth network |
2020-10-29: CN application CN202011180187.XA filed; granted as patent CN112347888B (status: Active)
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110728192A (en) * | 2019-09-16 | 2020-01-24 | 河海大学 | High-resolution remote sensing image classification method based on novel characteristic pyramid depth network |
Non-Patent Citations (1)
Title |
---|
YU TIANTIAN (余田田): "Remote Sensing Image Change Detection Methods Based on Graphs and Sparse Representation", China Master's Theses Full-text Database, Information Science and Technology Series *
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112699855A (en) * | 2021-03-23 | 2021-04-23 | 腾讯科技(深圳)有限公司 | Image scene recognition method and device based on artificial intelligence and electronic equipment |
CN113128564A (en) * | 2021-03-23 | 2021-07-16 | 武汉泰沃滋信息技术有限公司 | Typical target detection method and system based on deep learning under complex background |
CN113239736A (en) * | 2021-04-16 | 2021-08-10 | 广州大学 | Land cover classification annotation graph obtaining method, storage medium and system based on multi-source remote sensing data |
CN113420838A (en) * | 2021-08-20 | 2021-09-21 | 中国科学院空天信息创新研究院 | SAR and optical image classification method based on multi-scale attention feature fusion |
US20230067442A1 (en) * | 2021-08-31 | 2023-03-02 | Black Sesame International Holding Limited | Method of human pose estimation |
CN113807219A (en) * | 2021-09-06 | 2021-12-17 | 苏州中科蓝迪软件技术有限公司 | Method for identifying types of grain and oil crops in planting land by steps |
CN114022752A (en) * | 2021-11-04 | 2022-02-08 | 中国人民解放军国防科技大学 | SAR target detection method based on attention feature refinement and alignment |
CN114022752B (en) * | 2021-11-04 | 2024-03-15 | 中国人民解放军国防科技大学 | SAR target detection method based on attention feature refinement and alignment |
CN114792398A (en) * | 2022-06-23 | 2022-07-26 | 阿里巴巴(中国)有限公司 | Image classification method and target data classification model construction method |
CN116740069A (en) * | 2023-08-15 | 2023-09-12 | 山东锋士信息技术有限公司 | Surface defect detection method based on multi-scale significant information and bidirectional feature fusion |
CN116740069B (en) * | 2023-08-15 | 2023-11-07 | 山东锋士信息技术有限公司 | Surface defect detection method based on multi-scale significant information and bidirectional feature fusion |
Also Published As
Publication number | Publication date |
---|---|
CN112347888B (en) | 2023-08-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112347888B (en) | Remote sensing image scene classification method based on bidirectional feature iterative fusion | |
CN110443143B (en) | Multi-branch convolutional neural network fused remote sensing image scene classification method | |
CN108537742B (en) | Remote sensing image panchromatic sharpening method based on generation countermeasure network | |
CN110414377B (en) | Remote sensing image scene classification method based on scale attention network | |
CN110110642B (en) | Pedestrian re-identification method based on multi-channel attention features | |
CN110728192B (en) | High-resolution remote sensing image classification method based on a novel feature pyramid deep network | |
CN110163110B (en) | Pedestrian re-recognition method based on transfer learning and depth feature fusion | |
CN110555458B (en) | Multi-band image feature level fusion method for generating countermeasure network based on attention mechanism | |
CN106909924B (en) | Remote sensing image rapid retrieval method based on depth significance | |
CN111950453B (en) | Random shape text recognition method based on selective attention mechanism | |
CN105740894B (en) | Semantic annotation method for hyperspectral remote sensing image | |
CN111274869B (en) | Method for classifying hyperspectral images based on parallel attention mechanism residual error network | |
CN112329760B (en) | Method for recognizing and translating Mongolian in printed form from end to end based on space transformation network | |
CN108549893A (en) | End-to-end recognition method for scene text of arbitrary shape | |
CN108830296A (en) | Improved high-resolution remote sensing image classification method based on deep learning | |
CN115690479A (en) | Remote sensing image classification method and system based on convolution Transformer | |
CN112348036A (en) | Self-adaptive target detection method based on lightweight residual learning and deconvolution cascade | |
CN111626267B (en) | Hyperspectral remote sensing image classification method using void convolution | |
Tereikovskyi et al. | The method of semantic image segmentation using neural networks | |
Ma et al. | Unsupervised domain adaptation augmented by mutually boosted attention for semantic segmentation of VHR remote sensing images | |
CN115496928A (en) | Multi-modal image feature matching method based on multi-feature matching | |
CN113205103A (en) | Lightweight tattoo detection method | |
CN114299398B (en) | Small sample remote sensing image classification method based on self-supervision contrast learning | |
CN115131313A (en) | Hyperspectral image change detection method and device based on Transformer | |
CN114048810A (en) | Hyperspectral image classification method based on multilevel feature extraction network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |