CN112347888A - Remote sensing image scene classification method based on bidirectional feature iterative fusion - Google Patents

Remote sensing image scene classification method based on bidirectional feature iterative fusion

Info

Publication number
CN112347888A
Authority
CN
China
Prior art keywords
feature
remote sensing
layer
sensing image
representing
Prior art date
Legal status
Granted
Application number
CN202011180187.XA
Other languages
Chinese (zh)
Other versions
CN112347888B (en)
Inventor
王鑫
王施意
张之露
Current Assignee
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date
Filing date
Publication date
Application filed by Hohai University HHU filed Critical Hohai University HHU
Priority to CN202011180187.XA priority Critical patent/CN112347888B/en
Publication of CN112347888A publication Critical patent/CN112347888A/en
Application granted granted Critical
Publication of CN112347888B publication Critical patent/CN112347888B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06V 20/13 Satellite images
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/047 Probabilistic or stochastic networks
    • G06N 3/08 Learning methods


Abstract

The invention discloses a remote sensing image scene classification method based on bidirectional feature iterative fusion, belonging to the field of image processing. First, a new deep convolutional neural network is designed on the basis of the ResNet34 network model; second, remote sensing images are input into the network for training, and the output of the final convolutional layer of each stage of ResNet34 except the first stage is taken as the subsequent input features, giving four groups of input features; then, a Top-Down sub-module, a PostProcessor sub-module and a Down-Top sub-module are designed within the new bidirectional feature iterative fusion network structure, and the four groups of input features are fed into this structure to generate output features of the corresponding scales; finally, the highest-level output feature is passed through a global average pooling layer into the fully connected layer, and the output of the fully connected layer is used as the input of the SoftMax layer to classify the remote sensing images.

Description

Remote sensing image scene classification method based on bidirectional feature iterative fusion
Technical Field
The invention belongs to the field of image processing, and particularly relates to a remote sensing image scene classification method based on bidirectional feature iterative fusion.
Background
Remote sensing broadly refers to remote, non-contact detection techniques. Because different objects respond differently to electromagnetic waves of the same band, remote sensing equipment analyzes the spectral signature of an object on this principle and thereby identifies distant objects. Remote sensing technology can generally be divided into multispectral, hyperspectral and synthetic aperture radar, and the resulting remote sensing images differ in spatial, spectral and temporal resolution. Spatial resolution refers to the size of the smallest unit whose detail can be distinguished in a remote sensing image. With the continuous development of remote sensing technology, the spatial resolution of remote sensing images has improved in stages: the French SPOT-6 satellite launched in 2012 can provide panchromatic high-definition ground images at 1.5 m resolution; the US WorldView-3 satellite launched in 2014 can provide panchromatic high-definition ground images at 0.3 m resolution. In recent years, remote sensing technology in China has also developed greatly, and the ground pixel resolution can reach the sub-meter level: the GF-11 satellite launched by China in 2018 can achieve a ground image resolution of 10 cm or better.
High-spatial-resolution remote sensing images contain rich ground-object texture information and are widely applied to national land surveys, land-cover classification, change detection and other fields, providing an information basis for the implementation of major plans. At present, because the volume of high-resolution remote sensing image data is huge, how to accurately divide remote sensing images into different categories according to their function has become a topic of particular attention in academia. In fact, the effectiveness and distinctiveness of the extracted sample features have an extremely important influence on the classification accuracy of high-resolution remote sensing images.
Publication CN110443143A discloses a remote sensing image scene classification method with multi-branch convolutional neural network fusion, in which the preprocessed data are passed through an object detection network and an attention network to obtain an object mask map and an attention map; the original image, object mask map and attention map training sets are each input into a CNN for fine-tuning to obtain the respective optimal classification models; finally, the outputs of the three SoftMax layers are fused at the decision level to obtain the final prediction result. However, the three parallel network models lead to a large number of parameters and high complexity under some conditions, which is not conducive to improving classification accuracy.
Publication CN110555446A discloses a remote sensing image scene classification method based on multi-scale deep feature fusion and transfer learning. First, a Gaussian pyramid algorithm is used to obtain multi-scale remote sensing images, which are input into a fully convolutional neural network to extract multi-scale deep local features; then the image is cropped to the fixed size required by the CNN and input into the network to obtain the global features of the fully connected layer, the multi-scale deep local features and the CNN global features are encoded with a compact bilinear pooling operation, and the two kinds of deep features are fused to jointly represent the remote sensing image and strengthen the relationship between features; finally, remote sensing image scenes are classified by combining the two with a transfer learning technique. Although this method fuses the global and local features of the remote sensing image and enriches the feature information, the semantic and spatial information of the multi-scale deep local features is unevenly distributed, leaving room for improvement in the classification results.
In summary, the existing high-resolution remote sensing image classification method has many defects, which are mainly expressed as follows:
(1) existing remote sensing image classification methods focus on the high-level features of the last convolutional layer; high-level features emphasize semantic information, and rich semantic information enables a network to detect targets accurately. However, remote sensing image scene classification differs from ordinary object classification: the surroundings of the characteristic objects (embodied as spatial information) can also help the network classify, so the resulting image classification accuracy is not high;
(2) traditional multi-scale feature extraction methods let feature maps of different scales contribute equally to the final result, whereas experiments show that weighted fusion of feature maps of different scales can improve classification accuracy. Moreover, networks using the conventional weighting form converge slowly.
Disclosure of Invention
The purpose of the invention is as follows: aiming at the problems, the invention provides a remote sensing image scene classification method based on bidirectional feature iterative fusion. The method avoids extraction of excessive artificial features, learns reasonable normalization weight coefficients, and performs feature fusion by circularly and iteratively utilizing feature maps with different scales and different levels to supplement semantic information and spatial information with each other, thereby enhancing feature robustness and improving the accuracy of image classification.
Technical scheme: to achieve the above purpose, the invention adopts the following technical scheme. A remote sensing image scene classification method based on bidirectional feature iterative fusion comprises the following steps:
(1) constructing a multi-classification remote sensing image data set, making corresponding sample labels, and dividing each type of remote sensing image into a training set Train and a Test set Test in proportion;
(2) constructing a convolutional neural network ResNet, taking remote sensing image data as the input of the network, dividing convolutional layers with the same output size into the same stage, and dividing the constructed ResNet network model into 5 stages in total;
(3) constructing a bidirectional feature iterative fusion structure which comprises three submodules of Top-Down, Postprocessor and Down-Top; the Top-Down sub-module comprises 4 paths of feature dimension reduction branches and 4 adjacent semantic feature fusion structures; connecting a PostProcessor submodule behind each feature fusion structure, wherein the PostProcessor submodule internally comprises 2 Residual subblocks, and each subblock respectively comprises 4 Residual error layers; the Down-Top sub-module comprises 4 spatial feature fusion structures;
(4) taking the output features of each stage of the ResNet network except the first stage as the input features of the bidirectional feature iterative fusion structure, performing feature dimension reduction on the input features through the 4 feature dimension-reduction branches of the Top-Down sub-module, and denoting the feature maps generated after dimension reduction as C2, C3, C4, C5;
(5) inputting the dimension-reduced feature maps into the adjacent semantic feature fusion structure of the Top-Down sub-module with normalized weights, where adjacent feature maps are mutually fused and supplement semantic information, and generating semantically enhanced feature maps A2, A3, A4, A5 of the corresponding sizes;
(6) inputting the preliminarily enhanced feature maps A2, A3, A4, A5 into the corresponding branches of the PostProcessor structure to generate feature maps B2, B3, B4, B5;
(7) inputting the feature maps B2, B3, B4, B5 into the Down-Top spatial feature fusion structures with normalized weights, where the spatial information of adjacent feature maps supplements and refines the features of the corresponding level, and generating feature maps P2, P3, P4, P5 of the corresponding sizes;
(8) selecting the feature map P5 with the strongest semantic information from step (7) as the input feature of the Classifier Head, and after an adaptive global average pooling layer and a fully connected layer, performing scene classification with SoftMax to obtain the classification result;
(9) according to the steps (4) to (8), training the convolutional neural network based on bidirectional feature iterative fusion by using a remote sensing image data training set to obtain a trained convolutional neural network;
(10) and inputting the images in the test set into a trained convolutional neural network to obtain output characteristics Y, and classifying and identifying the output characteristics Y by utilizing SoftMax to further realize the class prediction of the test set.
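As a concrete illustration of steps (1) to (10), the following is a minimal PyTorch-style sketch of the overall forward pass, assuming a torchvision ResNet34 backbone; the names BiFIFClassifier, top_down, post_processor and down_top and the 256-channel fused features are illustrative assumptions, not the exact implementation of the invention.

```python
# Illustrative sketch only: overall forward pass of steps (2)-(8); top_down,
# post_processor and down_top stand for the fusion structures detailed below.
import torch.nn as nn
from torchvision.models import resnet34

class BiFIFClassifier(nn.Module):
    def __init__(self, num_classes, top_down, post_processor, down_top, fused_channels=256):
        super().__init__()
        backbone = resnet34()
        # Stage S1 (stem) and stages S2-S5 of ResNet34
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool)
        self.s2, self.s3, self.s4, self.s5 = (backbone.layer1, backbone.layer2,
                                              backbone.layer3, backbone.layer4)
        self.top_down = top_down              # adjacent semantic feature fusion, steps (4)-(5)
        self.post_processor = post_processor  # residual refinement, step (6)
        self.down_top = down_top              # spatial feature fusion, step (7)
        self.pool = nn.AdaptiveAvgPool2d(1)   # Average Pool of the Classifier Head
        self.fc = nn.Linear(fused_channels, num_classes)  # Fc

    def forward(self, x):
        x = self.stem(x)
        f2 = self.s2(x)          # Conv2_3
        f3 = self.s3(f2)         # Conv3_4
        f4 = self.s4(f3)         # Conv4_6
        f5 = self.s5(f4)         # Conv5_3
        a2, a3, a4, a5 = self.top_down(f2, f3, f4, f5)        # step (5): A2-A5
        b2, b3, b4, b5 = self.post_processor(a2, a3, a4, a5)  # step (6): B2-B5
        p2, p3, p4, p5 = self.down_top(b2, b3, b4, b5)        # step (7): P2-P5
        return self.fc(self.pool(p5).flatten(1))              # step (8); SoftMax applied in the loss
```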
Further, in step (1), the method for dividing the training set and the test set is as follows:
(1.1) dividing the multi-class remote sensing image dataset Image = [Image_1, …, Image_i, …, Image_N] and preparing the corresponding sample labels Label = [Label_1, …, Label_i, …, Label_N], where N denotes that there are N classes of remote sensing images, Image_i denotes the set of class-i remote sensing images, and Label_i denotes the label of the class-i remote sensing images;
(1.2) letting the total number of samples of each class of remote sensing image in the dataset be n, randomly extracting m images of that class to construct the training set Train = [Train_1, …, Train_i, …, Train_m], and constructing the test set Test = [Test_1, …, Test_i, …, Test_n-m] from the remaining n-m remote sensing images, where Train_i denotes the training set of class-i remote sensing images and contains m images, and Test_i denotes the test set of class-i remote sensing images and contains n-m images.
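A small sketch of the per-class random split described in (1.1) and (1.2); the directory layout, file extension and the helper name split_dataset are assumptions used only for illustration.

```python
# Illustrative per-class train/test split for steps (1.1)-(1.2); folder layout,
# file extension and the value of m are assumptions.
import random
from pathlib import Path

def split_dataset(root, m, seed=0):
    random.seed(seed)
    train, test = [], []
    for class_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        images = sorted(class_dir.glob("*.tif"))   # n samples of this class
        random.shuffle(images)
        train += [(img, class_dir.name) for img in images[:m]]   # m random images -> Train
        test  += [(img, class_dir.name) for img in images[m:]]   # remaining n-m  -> Test
    return train, test
```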
Further, the construction method of the convolutional neural network based on bidirectional feature iterative fusion is as follows:
building a network based on the ResNet34 model: the ResNet34 model has 5 stages, each stage is marked as S1, S2, S3, S4 and S5, and the last four stages respectively comprise 3, 4, 6 and 3 basic modules, namely BasicBlock; conv2_3, Conv3_4, Conv4_6 and Conv5_3 represent the convolution outputs of the last BasicBlock of the stage respectively; taking the output characteristics of the models Conv2_3, Conv3_4, Conv4_6 and Conv5_3 as the input characteristics of the bidirectional characteristic iterative fusion structure; and constructing a Classification Head after the bidirectional feature iterative fusion structure, wherein the Classification Head internally comprises an adaptive global Average pooling layer and a full connection layer which are respectively marked as Average Pool and Fc, and taking a feature map with strongest semantic information output by the bidirectional feature iterative fusion structure as an input feature of the Classification Head.
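The stage outputs Conv2_3, Conv3_4, Conv4_6 and Conv5_3 can be obtained, for example, with torchvision's feature-extraction utility; this sketch is illustrative only and assumes torchvision's resnet34, whose layer1 to layer4 play the roles of stages S2 to S5.

```python
# Illustrative extraction of the stage outputs from torchvision's ResNet34.
import torch
from torchvision.models import resnet34
from torchvision.models.feature_extraction import create_feature_extractor

backbone = resnet34()
extractor = create_feature_extractor(
    backbone,
    return_nodes={"layer1": "Conv2_3", "layer2": "Conv3_4",
                  "layer3": "Conv4_6", "layer4": "Conv5_3"})
features = extractor(torch.randn(1, 3, 224, 224))
# Channel widths are 64/128/256/512 before the 1x1 dimension-reduction branches
print({name: tuple(f.shape) for name, f in features.items()})
```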
Further, the training set of remote sensing images is input into the constructed convolutional neural network, the output value of each neuron of the network is calculated in a feedforward manner, and the computation of each layer's feature maps and the loss function to be minimized are defined as follows:

If the l-th layer is a convolutional layer, the j-th feature map of the l-th layer, $x_j^l$, is calculated as:

$$x_j^l = g\Big(\sum_{i=1}^{M_{l-1}} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\Big)$$

where $g(\cdot)$ denotes the activation function, $*$ denotes the convolution operation, $x_i^{l-1}$ denotes the i-th feature map of layer l-1, $k_{ij}^{l}$ denotes the convolution kernel from $x_i^{l-1}$ to $x_j^l$, $b_j^{l}$ is the bias of the j-th feature map of the l-th layer, and $M_{l-1}$ is the number of feature maps in layer l-1;
If the l-th layer is a pooling layer, the j-th feature map of the l-th layer, $x_j^l$, is calculated as:

$$x_j^l = g\big(\beta_j^{l}\,\mathrm{down}(x_i^{l-1}) + b_j^{l}\big)$$

where $g(\cdot)$ denotes the activation function, $\beta_j^{l}$ denotes the pooling parameter of the j-th feature map of the l-th layer, $\mathrm{down}(\cdot)$ denotes the pooling function, $x_i^{l-1}$ denotes the i-th feature map of layer l-1, and $b_j^{l}$ is the bias of the j-th feature map of the l-th layer;
If the l-th layer is a fully connected layer, the j-th feature map of the l-th layer, $x_j^l$, is calculated as:

$$x_j^l = g\big(f^{\,l-1} + b_j^{l}\big)$$

where $f^{\,l-1}$ denotes the weighted sum of all the feature maps of layer l-1, $b_j^{l}$ is the bias of the j-th feature map of the l-th layer, and $g(\cdot)$ denotes the activation function;
carrying out up-sampling on the characteristic diagram by using bilinear interpolation in a fusion structure of Top-Down and Down-Top sub-modules to realize scale change;
The loss function of the deep convolutional neural network is computed by back propagation:

Let the training set of remote sensing images contain N × m images, where for any image $I_k$, $k \in \{1, 2, \ldots, N \times m\}$, N denotes that there are N classes of remote sensing images and m is the number of images of each class in the training set; for image $I_k$, if the probability that the deep convolutional neural network predicts the correct class i is $p_i$, the cross-entropy loss function of the multi-class task is:

$$L_k = -\sum_{i=0}^{N-1} y_i \log p_i$$

where $p = [p_0, \ldots, p_i, \ldots, p_{N-1}]$ is a probability distribution whose elements $p_i$ denote the probability that the image belongs to class i; $y = [y_0, \ldots, y_i, \ldots, y_{N-1}]$ is the one-hot representation of the image label, with $y_i = 1$ when the sample belongs to class i and $y_i = 0$ otherwise;

The overall cross-entropy loss function is:

$$\mathrm{Loss} = \frac{1}{N \times m}\sum_{k=1}^{N \times m} L_k$$
minimizing a loss function by adopting a gradient descent algorithm, and updating each parameter in the convolutional neural network;
The deep convolutional neural network is trained to find the optimal parameters that minimize the loss function Loss. The parameters of the convolutional neural network are the convolution kernels, biases and pooling parameters of each layer; if W denotes all of these parameters, i.e.:

$$W = \{k_{ij}^{l},\ b_j^{l},\ \beta_j^{l}\}$$

then, after training the convolutional neural network on the remote sensing image training set, a group of parameters $W^*$ is found such that:

$$W^* = \arg\min_{W} \mathrm{Loss}(W)$$

where argmin denotes that $W^*$ is the value of W at which the loss function is minimal;

The parameters of the convolutional neural network are updated with a gradient descent algorithm while the loss function Loss is minimized:

$$W^{(i)} = W^{(i-1)} - \alpha \frac{\partial\,\mathrm{Loss}}{\partial W^{(i)}}$$

where α denotes the learning rate, which determines the convergence speed of each step, $W^{(i)}$ denotes the i-th group of parameters to be updated, $W^{(i-1)}$ denotes the updated (i-1)-th group of parameters, and $\partial\,\mathrm{Loss}/\partial W^{(i)}$ denotes the partial derivative of the loss function Loss with respect to the parameters $W^{(i)}$;
Normalized weights are adopted in the adjacent semantic feature fusion structure to balance the contribution of the multi-level inputs to the final result:

$$\hat{\beta}_i = \frac{\beta_i}{\sum_{j=1}^{t} \beta_j}$$

where $\beta_i$ denotes the original weight of the input at the current level, t denotes the number of inputs of the adjacent semantic feature fusion structure, and $\hat{\beta}_i$ denotes the normalized weight ratio.
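A minimal sketch of the normalized fusion weights; the non-negativity clamp and the small epsilon in the denominator are assumptions added for numerical stability and are not specified above.

```python
# Illustrative normalized fusion weights (clamp and epsilon are assumptions).
import torch
import torch.nn as nn

class NormalizedWeights(nn.Module):
    def __init__(self, t):
        super().__init__()
        self.beta = nn.Parameter(torch.ones(t))   # raw weights beta_1..beta_t, learnable

    def forward(self):
        beta = self.beta.clamp(min=0.0)
        return beta / (beta.sum() + 1e-6)         # normalized ratios beta_i / sum_j beta_j
```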
Further, the adjacent semantic feature fusion method of the Top-Down module is as follows:

The adjacent semantic feature fusion structure in the bidirectional feature iterative fusion has three inputs, Level_{k+1}, Level_k and Level_{k-1}, corresponding to feature maps of different resolutions at the higher level, the current level and the lower level;

The higher-level feature map is up-sampled, the lower-level feature map is down-sampled, and the current level uses an identity transformation so that the three can be added and fused; weights are assigned to the three inputs by the weight normalization method and a weighted element-wise addition is performed, giving the feature map with the corresponding semantic information enhanced:

$$A_k = \hat{\beta}_{k+1}\,\mathrm{Up}(\mathrm{Level}_{k+1}) + \hat{\beta}_{k}\,\mathrm{Level}_{k} + \hat{\beta}_{k-1}\,\mathrm{Down}(\mathrm{Level}_{k-1})$$

where $A_k$ denotes the output feature of the same size as the current-level feature Level_k, and $\hat{\beta}_{k+1}$, $\hat{\beta}_{k}$, $\hat{\beta}_{k-1}$ denote the normalized weights.
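A sketch of one adjacent semantic feature fusion node along these lines, assuming equal channel counts for the three inputs, bilinear up-sampling for the higher level and max pooling for the lower level; the class name TopDownFusion and the epsilon are illustrative assumptions.

```python
# Illustrative Top-Down adjacent semantic feature fusion node.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownFusion(nn.Module):
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.ones(3))   # raw weights for higher/current/lower

    def forward(self, higher, current, lower):
        up = F.interpolate(higher, size=current.shape[-2:],
                           mode="bilinear", align_corners=False)   # up-sample higher level
        down = F.adaptive_max_pool2d(lower, current.shape[-2:])    # down-sample lower level
        w = self.beta.clamp(min=0.0)
        w = w / (w.sum() + 1e-6)                                   # normalized weights
        return w[0] * up + w[1] * current + w[2] * down            # weighted element-wise sum A_k
```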
Further, in step (6), the preliminarily enhanced feature maps A2, A3, A4, A5 are respectively input into the corresponding branches of the PostProcessor structure to generate feature maps B2, B3, B4, B5, specifically as follows:

The feature maps A2, A3, A4, A5 are respectively input to the first Residual block of each PostProcessor branch; on the bypass, a convolutional layer with a 1×1 kernel performs feature dimension reduction; on the main path, convolutional layers with kernel sizes of 1×1, 3×3 and 1×1 are applied in sequence to refine the features, and the dimension-reduced features and the refined features are added element by element to obtain new feature maps A2_1, A3_1, A4_1, A5_1;

The computed A2_1, A3_1, A4_1, A5_1 are respectively used as the input of the second Residual block of each branch; on the bypass, a convolutional layer with a 1×1 kernel performs feature dimension reduction; on the main path, convolutional layers with kernel sizes of 1×1, 3×3 and 1×1 are applied in sequence to refine the features, and the dimension-reduced features and the refined features are added element by element to obtain new feature maps B2, B3, B4, B5.
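A sketch of one Residual sub-block of a PostProcessor branch as just described; the ReLU activations and the 256/64 channel widths (taken from the embodiment below) are assumptions.

```python
# Illustrative Residual sub-block: 1x1 bypass plus 1x1 -> 3x3 -> 1x1 main path.
import torch.nn as nn

class ResidualSubBlock(nn.Module):
    def __init__(self, in_ch=256, mid_ch=64, out_ch=256):
        super().__init__()
        self.bypass = nn.Conv2d(in_ch, out_ch, kernel_size=1)   # 1x1 dimension reduction
        self.main = nn.Sequential(                               # feature refinement
            nn.Conv2d(in_ch, mid_ch, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, kernel_size=1))

    def forward(self, x):
        return self.bypass(x) + self.main(x)   # element-wise addition and fusion

def make_postprocessor_branch():
    # Two stacked sub-blocks: A_k -> A_k_1 -> B_k
    return nn.Sequential(ResidualSubBlock(), ResidualSubBlock())
```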
Further, in step (8), the feature map P5 is classified using the Classifier Head structure as follows:

The feature map P5 is used as the input of the Classifier Head and passed through the global average pooling layer to obtain the output feature X; the output feature X of the pooling layer is used as the input of the fully connected layer, which produces the output feature Y:

$$Y = [y_1, y_2, \ldots, y_n]$$

where n denotes the number of image classes in the dataset;

For the output feature Y of the fully connected layer, the SoftMax value of each remote sensing image sample belonging to class i is computed with the SoftMax method as:

$$S_i = \frac{e^{y_i}}{\sum_{j=1}^{n} e^{y_j}}$$

where $y_i$ and $y_j$ denote the i-th and j-th elements of the input feature, e denotes the natural constant, and $S_i$ denotes the probability that the image belongs to class i; the final probability value of the remote sensing image is:

$$S = \max(S_1, S_2, \ldots, S_n)$$

where max(·) denotes taking the maximum of the n values $S_i$; the label class corresponding to the maximum $S_i$ is taken as the predicted class value Predict_label_i of the i-th remote sensing image sample;

According to the prediction results, the convolutional layer parameters are continuously optimized with the gradient descent algorithm so that the predicted class values of all training samples equal their label values Label, until the loss function value is minimal.
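A sketch of the Classifier Head described above: adaptive global average pooling, a fully connected layer, and SoftMax over the class scores; the 256 input channels and 21 classes (as in UCMerced_LandUse) are assumed values.

```python
# Illustrative Classifier Head of step (8).
import torch.nn as nn
import torch.nn.functional as F

class ClassifierHead(nn.Module):
    def __init__(self, in_ch=256, num_classes=21):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)        # output feature X
        self.fc = nn.Linear(in_ch, num_classes)    # output feature Y = [y_1, ..., y_n]

    def forward(self, p5):
        x = self.pool(p5).flatten(1)
        y = self.fc(x)
        s = F.softmax(y, dim=1)                    # S_i for each class
        return s.argmax(dim=1), s                  # predicted class and probabilities
```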
Has the advantages that: compared with the prior art, the technical scheme of the invention has the following beneficial technical effects:
(1) the method automatically learns and extracts deep features of remote sensing images through a deep convolutional neural network, avoiding handcrafted feature extraction, reducing complexity and reducing human intervention;
(2) the method performs comprehensive feature refinement and enhancement on features of different scales and levels with the bidirectional feature iterative fusion structure, avoiding the limitation on classification accuracy caused by the lack of spatial information in the features of the last convolutional layer;
(3) the method assigns weights to the different inputs of the adjacent semantic feature fusion structure and normalizes the weight tensors, better balancing the influence of feature maps at different levels on the current-level features; compared with SoftMax normalization, this normalization accelerates network convergence.
Drawings
FIG. 1 is a block diagram of an embodiment of the present invention;
fig. 2 is a structural diagram of the constructed neural network.
Detailed Description
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
The invention relates to a remote sensing image scene classification method based on bidirectional feature iterative fusion, which comprises the following steps:
(1) Constructing a multi-class remote sensing image dataset, making corresponding sample labels, and dividing each class of remote sensing images into a training set Train and a test set Test in proportion.
(1.1) dividing the multi-class remote sensing image dataset Image = [Image_1, …, Image_i, …, Image_N] and preparing the corresponding sample labels Label = [Label_1, …, Label_i, …, Label_N], where N denotes that there are N classes of remote sensing images, Image_i denotes the set of class-i remote sensing images, and Label_i denotes the label of the class-i remote sensing images;
(1.2) letting the total number of samples of each class of remote sensing image in the dataset be n, randomly extracting m images of that class to construct the training set Train = [Train_1, …, Train_i, …, Train_m], and constructing the test set Test = [Test_1, …, Test_i, …, Test_n-m] from the remaining n-m remote sensing images, where Train_i denotes the training set of class-i remote sensing images and contains m images, and Test_i denotes the test set of class-i remote sensing images and contains n-m images.
(2) Building a network based on the ResNet34 model: remote sensing image data is used as input of a network, convolutional layers with the same output size are divided into the same stage, the ResNet34 model has 5 stages, each stage is marked as S1, S2, S3, S4 and S5, and the last four stages respectively comprise 3, 4, 6 and 3 basic modules BasicBlock; conv2_3, Conv3_4, Conv4_6 and Conv5_3 represent the convolution outputs of the last BasicBlock of the stage, respectively.
(3) Constructing a bidirectional feature iterative fusion structure which comprises three submodules of Top-Down, Postprocessor and Down-Top; the Top-Down submodule comprises 4 characteristic dimension reduction branches and 4 adjacent semantic characteristic fusion structures, the characteristic dimension reduction branches are marked as Down channel1, Down channel2, Down channel3 and Down channel4, and the characteristic fusion structures are marked as TopDOwn1, TopDOwn2, TopDOwn3 and TopDOwn 4; connecting a PostProcessor submodule behind each feature fusion structure, wherein the PostProcessor submodule internally comprises 2 Residual subblocks, each subblock respectively comprises 4 Residual error layers which are respectively marked as Residual 1-Residual 8; the Down-Top sub-module comprises 4 spatial feature fusion structures which are respectively marked as Down Top1, Down Top2, Down Top3 and Down Top 4. A Classification Head structure is designed after a bidirectional feature iterative fusion structure, and the Classification Head structure comprises an adaptive global Average pooling layer and a full-connection layer which are respectively marked as Average Pool and Fc. The convolutional layer is used for extracting and processing the feature map, the pooling layer is used for compressing the feature map obtained by the convolutional layer, and the full-connection layer can convert the feature map into a one-dimensional vector.
In this embodiment, the constructed convolutional neural network based on bidirectional feature iterative fusion has the following specific parameters:
(a) in the first stage S1, each remote sensing image is resized to 224×224 and normalized, and a convolutional layer with a 7×7 kernel, stride 2 and padding 3 is defined;
(b) in stage S2, 1 pooling layer is defined with MaxPooling; 3 BasicBlocks are defined, each with 2 layers and 64 convolution kernels of size 3×3 per layer, with stride 1;
(c) in stage S3, 4 BasicBlocks are defined, each with 2 layers and 128 convolution kernels of size 3×3 per layer, with stride 1;
(d) in stage S4, 6 BasicBlocks are defined, each with 2 layers and 256 convolution kernels of size 3×3 per layer, with stride 1;
(e) in stage S5, 3 BasicBlocks are defined, each with 2 layers and 512 convolution kernels of size 3×3 per layer, with stride 1;
(f) in each of the feature dimension-reduction branches Downchannel1, Downchannel2, Downchannel3 and Downchannel4, 256 convolution kernels of size 1×1 are defined, with stride 1;
(g) in the two inputs of the feature fusion structure TopDown1, the lower branch defines 1 pooling layer with MaxPooling; in the three inputs of TopDown2 and TopDown3, the upper branch defines 1 up-sampling layer using bilinear interpolation, and the lower branch defines 1 pooling layer with MaxPooling; in the two inputs of TopDown4, the upper branch defines 1 up-sampling layer using bilinear interpolation;
(h) in each PostProcessor branch, 2 Residual sub-blocks are defined; in each sub-block, 3 convolutional layers are defined on the main path with kernel sizes of 1×1, 3×3 and 1×1, output channels of 64, 64 and 256, and stride 1; meanwhile, the bypass of the Residual sub-block defines 1 convolutional layer with 256 kernels of size 1×1 and stride 1;
(j) in the two inputs of the feature fusion structure DownTop1, the upper branch defines 1 up-sampling layer using bilinear interpolation; in the three inputs of DownTop2 and DownTop3, the upper branch defines 1 up-sampling layer using bilinear interpolation, and the lower branch defines 1 pooling layer with MaxPooling; in DownTop4, the lower branch defines 1 pooling layer with MaxPooling;
(k) in the Classifier Head, 1 pooling layer is defined with Adaptive Average Pooling and output size 1×1; a fully connected layer Fc is also defined.
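A sketch of the dimension-reduction branches and the scale-change operations used by the fusion nodes in this embodiment; the BatchNorm layers and the helper names are assumptions.

```python
# Illustrative Downchannel1-4 branches and scale-change operations.
import torch.nn as nn
import torch.nn.functional as F

def make_downchannel_branches(in_channels=(64, 128, 256, 512), out_ch=256):
    # 1x1 convolutions, 256 output channels, stride 1
    return nn.ModuleList(
        nn.Sequential(nn.Conv2d(c, out_ch, kernel_size=1), nn.BatchNorm2d(out_ch))
        for c in in_channels)

def upsample_to(x, ref):
    # bilinear interpolation used in the TopDown/DownTop fusion nodes
    return F.interpolate(x, size=ref.shape[-2:], mode="bilinear", align_corners=False)

def downsample_to(x, ref):
    # MaxPooling used in the TopDown/DownTop fusion nodes
    return F.adaptive_max_pool2d(x, ref.shape[-2:])
```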
(4) The output features of the ResNet34 stages Conv2_3, Conv3_4, Conv4_6 and Conv5_3 are taken as the input features of the bidirectional feature iterative fusion structure, feature dimension reduction is performed on them through the 4 feature dimension-reduction branches of the Top-Down sub-module, and the feature maps generated after dimension reduction are denoted C2, C3, C4, C5.
(5) The dimension-reduced feature maps are input into the adjacent semantic feature fusion structure of the Top-Down sub-module with normalized weights, where adjacent feature maps are mutually fused and supplement semantic information, generating semantically enhanced feature maps A2, A3, A4, A5 of the corresponding sizes.
The adjacent semantic feature fusion method of the Top-Down module is specifically as follows:

The adjacent semantic feature fusion structure in the bidirectional feature iterative fusion has three inputs, Level_{k+1}, Level_k and Level_{k-1}, corresponding to feature maps of different resolutions at the higher level, the current level and the lower level;

The higher-level feature map is up-sampled, the lower-level feature map is down-sampled, and the current level uses an identity transformation so that the three can be added and fused; weights are assigned to the three inputs by the weight normalization method and a weighted element-wise addition is performed, giving the feature map with the corresponding semantic information enhanced:

$$A_k = \hat{\beta}_{k+1}\,\mathrm{Up}(\mathrm{Level}_{k+1}) + \hat{\beta}_{k}\,\mathrm{Level}_{k} + \hat{\beta}_{k-1}\,\mathrm{Down}(\mathrm{Level}_{k-1})$$

where $A_k$ denotes the output feature of the same size as the current-level feature Level_k, and $\hat{\beta}_{k+1}$, $\hat{\beta}_{k}$, $\hat{\beta}_{k-1}$ denote the normalized weights.
(6) The preliminarily enhanced feature maps A2, A3, A4, A5 are respectively input into the corresponding branches of the PostProcessor structure to generate feature maps B2, B3, B4, B5, specifically as follows:

The feature maps A2, A3, A4, A5 are respectively input to the first Residual block of each PostProcessor branch; on the bypass, a convolutional layer with a 1×1 kernel performs feature dimension reduction; on the main path, convolutional layers with kernel sizes of 1×1, 3×3 and 1×1 are applied in sequence to refine the features, and the dimension-reduced features and the refined features are added element by element to obtain new feature maps A2_1, A3_1, A4_1, A5_1;

The computed A2_1, A3_1, A4_1, A5_1 are respectively used as the input of the second Residual block of each branch; on the bypass, a convolutional layer with a 1×1 kernel performs feature dimension reduction; on the main path, convolutional layers with kernel sizes of 1×1, 3×3 and 1×1 are applied in sequence to refine the features, and the dimension-reduced features and the refined features are added element by element to obtain new feature maps B2, B3, B4, B5.
(7) The feature maps B2, B3, B4, B5 are respectively input into the Down-Top spatial feature fusion structures with normalized weights, where the spatial information of adjacent feature maps supplements and refines the features of the corresponding level, generating feature maps P2, P3, P4, P5 of the corresponding sizes.
(8) The feature map P5 with the strongest semantic information from step (7) is selected as the input feature of the Classifier Head; after the adaptive global average pooling layer and the fully connected layer Fc within the Classifier Head, scene classification is performed with SoftMax to obtain the classification result. Specifically:
The feature map P5 is used as the input of the Classifier Head and passed through the global average pooling layer to obtain the output feature X; the output feature X of the pooling layer is used as the input of the fully connected layer, which produces the output feature Y:

$$Y = [y_1, y_2, \ldots, y_n]$$

where n denotes the number of image classes in the dataset;

For the output feature Y of the fully connected layer, the SoftMax value of each remote sensing image sample belonging to class i is computed with the SoftMax method as:

$$S_i = \frac{e^{y_i}}{\sum_{j=1}^{n} e^{y_j}}$$

where $y_i$ and $y_j$ denote the i-th and j-th elements of the input feature, e denotes the natural constant, and $S_i$ denotes the probability that the image belongs to class i; the final probability value of the remote sensing image is:

$$S = \max(S_1, S_2, \ldots, S_n)$$

where max(·) denotes taking the maximum of the n values $S_i$; the label class corresponding to the maximum $S_i$ is taken as the predicted class value Predict_label_i of the i-th remote sensing image sample;

According to the prediction results, the convolutional layer parameters are continuously optimized with the gradient descent algorithm so that the predicted class values of all training samples equal their label values Label, until the loss function value is minimal.
(9) According to steps (4) to (8), the convolutional neural network based on bidirectional feature iterative fusion is trained with the remote sensing image training set to obtain the trained convolutional neural network.
The training set of remote sensing images is input into the constructed convolutional neural network, the output value of each neuron of the network is calculated in a feedforward manner, and the computation of each layer's feature maps and the loss function to be minimized are defined as follows:

If the l-th layer is a convolutional layer, the j-th feature map of the l-th layer, $x_j^l$, is calculated as:

$$x_j^l = g\Big(\sum_{i=1}^{M_{l-1}} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\Big)$$

where $g(\cdot)$ denotes the activation function, $*$ denotes the convolution operation, $x_i^{l-1}$ denotes the i-th feature map of layer l-1, $k_{ij}^{l}$ denotes the convolution kernel from $x_i^{l-1}$ to $x_j^l$, $b_j^{l}$ is the bias of the j-th feature map of the l-th layer, and $M_{l-1}$ is the number of feature maps in layer l-1;
If the l-th layer is a pooling layer, the j-th feature map of the l-th layer, $x_j^l$, is calculated as:

$$x_j^l = g\big(\beta_j^{l}\,\mathrm{down}(x_i^{l-1}) + b_j^{l}\big)$$

where $g(\cdot)$ denotes the activation function, $\beta_j^{l}$ denotes the pooling parameter of the j-th feature map of the l-th layer, $\mathrm{down}(\cdot)$ denotes the pooling function, $x_i^{l-1}$ denotes the i-th feature map of layer l-1, and $b_j^{l}$ is the bias of the j-th feature map of the l-th layer;
If the l-th layer is a fully connected layer, the j-th feature map of the l-th layer, $x_j^l$, is calculated as:

$$x_j^l = g\big(f^{\,l-1} + b_j^{l}\big)$$

where $f^{\,l-1}$ denotes the weighted sum of all the feature maps of layer l-1, $b_j^{l}$ is the bias of the j-th feature map of the l-th layer, and $g(\cdot)$ denotes the activation function;
carrying out up-sampling on the characteristic diagram by using bilinear interpolation in a fusion structure of Top-Down and Down-Top sub-modules to realize scale change;
The loss function of the deep convolutional neural network is computed by back propagation:

Let the training set of remote sensing images contain N × m images, where for any image $I_k$, $k \in \{1, 2, \ldots, N \times m\}$, N denotes that there are N classes of remote sensing images and m is the number of images of each class in the training set; for image $I_k$, if the probability that the deep convolutional neural network predicts the correct class i is $p_i$, the cross-entropy loss function of the multi-class task is:

$$L_k = -\sum_{i=0}^{N-1} y_i \log p_i$$

where $p = [p_0, \ldots, p_i, \ldots, p_{N-1}]$ is a probability distribution whose elements $p_i$ denote the probability that the image belongs to class i; $y = [y_0, \ldots, y_i, \ldots, y_{N-1}]$ is the one-hot representation of the image label, with $y_i = 1$ when the sample belongs to class i and $y_i = 0$ otherwise;

The overall cross-entropy loss function is:

$$\mathrm{Loss} = \frac{1}{N \times m}\sum_{k=1}^{N \times m} L_k$$
minimizing a loss function by adopting a gradient descent algorithm, and updating each parameter in the convolutional neural network;
The deep convolutional neural network is trained to find the optimal parameters that minimize the loss function Loss. The parameters of the convolutional neural network are the convolution kernels, biases and pooling parameters of each layer; if W denotes all of these parameters, i.e.:

$$W = \{k_{ij}^{l},\ b_j^{l},\ \beta_j^{l}\}$$

then, after training the convolutional neural network on the remote sensing image training set, a group of parameters $W^*$ is found such that:

$$W^* = \arg\min_{W} \mathrm{Loss}(W)$$

where argmin denotes that $W^*$ is the value of W at which the loss function is minimal;

The parameters of the convolutional neural network are updated with a gradient descent algorithm while the loss function Loss is minimized:

$$W^{(i)} = W^{(i-1)} - \alpha \frac{\partial\,\mathrm{Loss}}{\partial W^{(i)}}$$

where α denotes the learning rate, which determines the convergence speed of each step, $W^{(i)}$ denotes the i-th group of parameters to be updated, $W^{(i-1)}$ denotes the updated (i-1)-th group of parameters, and $\partial\,\mathrm{Loss}/\partial W^{(i)}$ denotes the partial derivative of the loss function Loss with respect to the parameters $W^{(i)}$;
Normalized weights are adopted in the adjacent semantic feature fusion structure to balance the contribution of the multi-level inputs to the final result:

$$\hat{\beta}_i = \frac{\beta_i}{\sum_{j=1}^{t} \beta_j}$$

where $\beta_i$ denotes the original weight of the input at the current level, t denotes the number of inputs of the adjacent semantic feature fusion structure, and $\hat{\beta}_i$ denotes the normalized weight ratio.
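A minimal training-loop sketch corresponding to the feedforward computation, cross-entropy loss and gradient-descent update described above; the optimizer choice (SGD), learning rate and epoch count are assumptions.

```python
# Illustrative training loop for step (9).
import torch
import torch.nn as nn

def train(model, train_loader, num_epochs=30, lr=0.01, device="cuda"):
    model.to(device).train()
    criterion = nn.CrossEntropyLoss()                        # multi-class cross entropy
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)   # gradient descent on W
    for _ in range(num_epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            loss = criterion(model(images), labels)          # Loss for this mini-batch
            optimizer.zero_grad()
            loss.backward()                                  # back propagation
            optimizer.step()                                 # W(i) = W(i-1) - alpha * dLoss/dW
    return model
```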
(10) The images in the test set are input into the trained convolutional neural network to obtain the output features Y, which are classified and recognized with SoftMax, realizing class prediction for the test set.
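A short sketch of step (10), running the trained network on the test set and taking the SoftMax argmax as the predicted class; the helper name predict and the data-loader interface are assumptions.

```python
# Illustrative test-set prediction for step (10).
import torch

@torch.no_grad()
def predict(model, test_loader, device="cuda"):
    model.to(device).eval()
    predictions = []
    for images, _ in test_loader:
        logits = model(images.to(device))                             # output features Y
        predictions += logits.softmax(dim=1).argmax(dim=1).tolist()   # class with max S_i
    return predictions
```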
To evaluate the invention, a different remote sensing image scene classification algorithm is selected for comparison: the method of Chinese patent CN104680173A (2015-06-03), which provides a high-resolution remote sensing image classification method implemented with an SVM classifier based on sparse-coding spatial pyramid matching features, referred to as Method 1. Table 1 compares the performance of the two methods on the public high-resolution remote sensing scene image dataset UCMerced_LandUse. The results show that the proposed method achieves better remote sensing image scene classification.
Table 1: classification performance comparison of Method 1 and the proposed method on the UCMerced_LandUse dataset.
The foregoing is a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims (7)

1. The remote sensing image scene classification method based on bidirectional feature iterative fusion is characterized by comprising the following steps of:
(1) constructing a multi-classification remote sensing image data set, making corresponding sample labels, and dividing each type of remote sensing image into a training set Train and a Test set Test in proportion;
(2) constructing a convolutional neural network ResNet, taking remote sensing image data as the input of the network, dividing convolutional layers with the same output size into the same stage, and dividing the constructed ResNet network model into 5 stages in total;
(3) constructing a bidirectional feature iterative fusion structure which comprises three submodules of Top-Down, Postprocessor and Down-Top; the Top-Down sub-module comprises 4 paths of feature dimension reduction branches and 4 adjacent semantic feature fusion structures; connecting a PostProcessor submodule behind each feature fusion structure, wherein the PostProcessor submodule internally comprises 2 Residual subblocks, and each subblock respectively comprises 4 Residual error layers; the Down-Top sub-module comprises 4 spatial feature fusion structures;
(4) taking the output features of each stage of the ResNet network except the first stage as the input features of the bidirectional feature iterative fusion structure, performing feature dimension reduction on the input features through the 4 feature dimension-reduction branches of the Top-Down sub-module, and denoting the feature maps generated after dimension reduction as C2, C3, C4, C5;
(5) inputting the dimension-reduced feature maps into the adjacent semantic feature fusion structure of the Top-Down sub-module with normalized weights, where adjacent feature maps are mutually fused and supplement semantic information, and generating semantically enhanced feature maps A2, A3, A4, A5 of the corresponding sizes;
(6) inputting the preliminarily enhanced feature maps A2, A3, A4, A5 into the corresponding branches of the PostProcessor structure to generate feature maps B2, B3, B4, B5;
(7) inputting the feature maps B2, B3, B4, B5 into the Down-Top spatial feature fusion structures with normalized weights, where the spatial information of adjacent feature maps supplements and refines the features of the corresponding level, and generating feature maps P2, P3, P4, P5 of the corresponding sizes;
(8) selecting the feature map P5 with the strongest semantic information from step (7) as the input feature of the Classifier Head, and after an adaptive global average pooling layer and a fully connected layer, performing scene classification with SoftMax to obtain the classification result;
(9) according to the steps (4) to (8), training the convolutional neural network based on bidirectional feature iterative fusion by using a remote sensing image data training set to obtain a trained convolutional neural network;
(10) and inputting the images in the test set into a trained convolutional neural network to obtain output characteristics Y, and classifying and identifying the output characteristics Y by utilizing SoftMax to further realize the class prediction of the test set.
2. The remote sensing image scene classification method based on bidirectional feature iterative fusion of claim 1, characterized in that in step (1), the method for dividing the training set and the test set is as follows:
(1.1) dividing the multi-class remote sensing image dataset Image = [Image_1, …, Image_i, …, Image_N] and preparing the corresponding sample labels Label = [Label_1, …, Label_i, …, Label_N], where N denotes that there are N classes of remote sensing images, Image_i denotes the set of class-i remote sensing images, and Label_i denotes the label of the class-i remote sensing images;
(1.2) letting the total number of samples of each class of remote sensing image in the dataset be n, randomly extracting m images of that class to construct the training set Train = [Train_1, …, Train_i, …, Train_m], and constructing the test set Test = [Test_1, …, Test_i, …, Test_n-m] from the remaining n-m remote sensing images, where Train_i denotes the training set of class-i remote sensing images and contains m images, and Test_i denotes the test set of class-i remote sensing images and contains n-m images.
3. The remote sensing image scene classification method based on bidirectional feature iterative fusion of claim 1 is characterized in that the construction method of the convolutional neural network based on bidirectional feature iterative fusion is as follows:
building a network based on the ResNet34 model: the ResNet34 model has 5 stages, each stage is marked as S1, S2, S3, S4 and S5, and the last four stages respectively comprise 3, 4, 6 and 3 basic modules, namely BasicBlock; conv2_3, Conv3_4, Conv4_6 and Conv5_3 represent the convolution outputs of the last BasicBlock of the stage respectively; taking the output characteristics of the models Conv2_3, Conv3_4, Conv4_6 and Conv5_3 as the input characteristics of the bidirectional characteristic iterative fusion structure; and constructing a Classification Head after the bidirectional feature iterative fusion structure, wherein the Classification Head internally comprises an adaptive global Average pooling layer and a full connection layer which are respectively marked as Average Pool and Fc, and taking a feature map with strongest semantic information output by the bidirectional feature iterative fusion structure as an input feature of the Classification Head.
4. The remote sensing image scene classification method based on bidirectional feature iterative fusion of claim 3, characterized in that a training set of remote sensing images is input into a constructed convolutional neural network, the output value of each neuron of the convolutional neural network is calculated in a feedforward manner, and a calculation function of each layer of feature map and a minimum loss function are set:
if the l-th layer is a convolutional layer, the j-th feature map of the l-th layer, $x_j^l$, is calculated as:

$$x_j^l = g\Big(\sum_{i=1}^{M_{l-1}} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\Big)$$

where $g(\cdot)$ denotes the activation function, $*$ denotes the convolution operation, $x_i^{l-1}$ denotes the i-th feature map of layer l-1, $k_{ij}^{l}$ denotes the convolution kernel from $x_i^{l-1}$ to $x_j^l$, $b_j^{l}$ is the bias of the j-th feature map of the l-th layer, and $M_{l-1}$ is the number of feature maps in layer l-1;
if the l-th layer is a pooling layer, the j-th feature map of the l-th layer, $x_j^l$, is calculated as:

$$x_j^l = g\big(\beta_j^{l}\,\mathrm{down}(x_i^{l-1}) + b_j^{l}\big)$$

where $g(\cdot)$ denotes the activation function, $\beta_j^{l}$ denotes the pooling parameter of the j-th feature map of the l-th layer, $\mathrm{down}(\cdot)$ denotes the pooling function, $x_i^{l-1}$ denotes the i-th feature map of layer l-1, and $b_j^{l}$ is the bias of the j-th feature map of the l-th layer;
if the l-th layer is a fully connected layer, the j-th feature map of the l-th layer, $x_j^l$, is calculated as:

$$x_j^l = g\big(f^{\,l-1} + b_j^{l}\big)$$

where $f^{\,l-1}$ denotes the weighted sum of all the feature maps of layer l-1, $b_j^{l}$ is the bias of the j-th feature map of the l-th layer, and $g(\cdot)$ denotes the activation function;
carrying out up-sampling on the characteristic diagram by using bilinear interpolation in a fusion structure of Top-Down and Down-Top sub-modules to realize scale change;
the loss function of the deep convolutional neural network is computed by back propagation:

let the training set of remote sensing images contain N × m images, where for any image $I_k$, $k \in \{1, 2, \ldots, N \times m\}$, N denotes that there are N classes of remote sensing images and m is the number of images of each class in the training set; for image $I_k$, if the probability that the deep convolutional neural network predicts the correct class i is $p_i$, the cross-entropy loss function of the multi-class task is:

$$L_k = -\sum_{i=0}^{N-1} y_i \log p_i$$

where $p = [p_0, \ldots, p_i, \ldots, p_{N-1}]$ is a probability distribution whose elements $p_i$ denote the probability that the image belongs to class i; $y = [y_0, \ldots, y_i, \ldots, y_{N-1}]$ is the one-hot representation of the image label, with $y_i = 1$ when the sample belongs to class i and $y_i = 0$ otherwise;

the overall cross-entropy loss function is:

$$\mathrm{Loss} = \frac{1}{N \times m}\sum_{k=1}^{N \times m} L_k$$
minimizing a loss function by adopting a gradient descent algorithm, and updating each parameter in the convolutional neural network;
the deep convolutional neural network is trained to find the optimal parameters that minimize the loss function Loss; the parameters of the convolutional neural network are the convolution kernels, biases and pooling parameters of each layer, and if W denotes all of these parameters, i.e.:

$$W = \{k_{ij}^{l},\ b_j^{l},\ \beta_j^{l}\}$$

then, after training the convolutional neural network on the remote sensing image training set, a group of parameters $W^*$ is found such that:

$$W^* = \arg\min_{W} \mathrm{Loss}(W)$$

where argmin denotes that $W^*$ is the value of W at which the loss function is minimal;

the parameters of the convolutional neural network are updated with a gradient descent algorithm while the loss function Loss is minimized:

$$W^{(i)} = W^{(i-1)} - \alpha \frac{\partial\,\mathrm{Loss}}{\partial W^{(i)}}$$

where α denotes the learning rate, which determines the convergence speed of each step, $W^{(i)}$ denotes the i-th group of parameters to be updated, $W^{(i-1)}$ denotes the updated (i-1)-th group of parameters, and $\partial\,\mathrm{Loss}/\partial W^{(i)}$ denotes the partial derivative of the loss function Loss with respect to the parameters $W^{(i)}$;
normalized weights are adopted in the adjacent semantic feature fusion structure to balance the contribution of the multi-level inputs to the final result:

$$\hat{\beta}_i = \frac{\beta_i}{\sum_{j=1}^{t} \beta_j}$$

where $\beta_i$ denotes the original weight of the input at the current level, t denotes the number of inputs of the adjacent semantic feature fusion structure, and $\hat{\beta}_i$ denotes the normalized weight ratio.
5. The remote sensing image scene classification method based on bidirectional feature iterative fusion of claim 1, characterized in that the adjacent semantic feature fusion method of the Top-Down module is as follows:
the adjacent semantic feature fusion structure in the bidirectional feature iterative fusion comprises three inputs, namely Levelk+1、Levelk、Levelk-1Corresponding to feature maps with different resolutions of high level, current level and low level;
the high-level feature map uses up-sampling, the low-level feature map uses down-sampling, and the current level uses identity transformation to enable the three to be added and fused; and (3) carrying out element-by-element addition operation with weights after weights are distributed to the three by using a weight normalization method to obtain a feature map after corresponding semantic information is enhanced:
Output_k = β̂_{k+1} · Up(Level_{k+1}) + β̂_k · Level_k + β̂_{k-1} · Down(Level_{k-1})

where Output_k represents the output feature of the same size corresponding to the current-level feature Level_k, Up(·) and Down(·) denote the up-sampling and down-sampling operations, and β̂_{k+1}, β̂_k, β̂_{k-1} represent the normalized weights.
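A minimal PyTorch sketch of such a three-input fusion, under stated assumptions: the module name, the choice of adaptive max pooling for down-sampling, and the ε term are illustrative; the claim only specifies up-sampling, down-sampling, identity, weight normalization and weighted element-wise addition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdjacentSemanticFusion(nn.Module):
    """Fuses Level_{k+1} (coarser), Level_k (current), Level_{k-1} (finer) feature maps."""
    def __init__(self, eps: float = 1e-4):
        super().__init__()
        self.beta = nn.Parameter(torch.ones(3))   # raw weights for the three inputs
        self.eps = eps

    def forward(self, level_hi, level_cur, level_lo):
        # bring the three inputs to the resolution of the current level
        size = level_cur.shape[-2:]
        hi = F.interpolate(level_hi, size=size, mode="bilinear", align_corners=False)  # up-sampling
        lo = F.adaptive_max_pool2d(level_lo, output_size=size)                          # down-sampling (assumed op)
        # normalized weights beta_hat_i = beta_i / sum_j beta_j
        b = torch.relu(self.beta)
        b = b / (b.sum() + self.eps)
        # weighted element-wise addition
        return b[0] * hi + b[1] * level_cur + b[2] * lo

fuse = AdjacentSemanticFusion()
out = fuse(torch.randn(1, 256, 7, 7),    # Level_{k+1}
           torch.randn(1, 256, 14, 14),  # Level_k
           torch.randn(1, 256, 28, 28))  # Level_{k-1}
```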
6. The remote sensing image scene classification method based on bidirectional feature iterative fusion of claim 1, characterized in that in the step (6), the preliminarily enhanced feature maps A_2, A_3, A_4, A_5 are respectively input into the corresponding branches of the Postprocessor structure to generate the feature maps B_2, B_3, B_4, B_5, specifically comprising:
taking the feature maps A_2, A_3, A_4, A_5 respectively as the inputs of the first Residual block of each Postprocessor branch; on the bypass, a convolution layer with a 1×1 convolution kernel is used to reduce the feature dimension; on the main path, convolution layers with kernel sizes of 1×1, 3×3 and 1×1 are applied in sequence to refine the features; the dimension-reduced features and the refined features are added and fused element by element to obtain the new feature maps A_{2_1}, A_{3_1}, A_{4_1}, A_{5_1};
taking the computed A_{2_1}, A_{3_1}, A_{4_1}, A_{5_1} respectively as the inputs of the second Residual block of each branch; on the bypass, a convolution layer with a 1×1 convolution kernel is used to reduce the feature dimension; on the main path, convolution layers with kernel sizes of 1×1, 3×3 and 1×1 are applied in sequence to refine the features; the dimension-reduced features and the refined features are added and fused element by element to obtain the new feature maps B_2, B_3, B_4, B_5.
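A sketch of one such Residual block under the stated 1×1 bypass / 1×1-3×3-1×1 main-path layout; the channel counts and the ReLU activations between the main-path convolutions are assumptions not specified in the claim:

```python
import torch
import torch.nn as nn

class PostprocessorBlock(nn.Module):
    """1x1 bypass for dimension reduction, 1x1-3x3-1x1 main path for refinement,
    element-wise addition of the two results."""
    def __init__(self, in_ch: int, out_ch: int, mid_ch: int):
        super().__init__()
        self.bypass = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.main = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, kernel_size=1),
        )

    def forward(self, x):
        return self.bypass(x) + self.main(x)   # element-wise fusion

# e.g. one branch A_k -> A_{k_1} -> B_k built from two such blocks (channel sizes assumed)
branch = nn.Sequential(PostprocessorBlock(256, 256, 64),
                       PostprocessorBlock(256, 256, 64))
out = branch(torch.randn(1, 256, 14, 14))
```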
7. The remote sensing image scene classification method based on bidirectional feature iterative fusion of claim 4, characterized in that in the step (8), the feature map P_5 is classified by the Classifier Head structure, the method comprising:
taking the feature map P_5 as the input of the Classifier Head and obtaining the output feature X through a global average pooling layer; taking the output feature X of the pooling layer as the input of the fully connected layer and obtaining the output feature Y of the fully connected layer:
Y = [y_1, y_2, …, y_n]
where n represents the number of image classes in the dataset;
for the output feature Y of the fully connected layer, the SoftMax value that each remote sensing image sample belongs to the i-th class is calculated by the SoftMax method as:
S_i = e^{y_i} / Σ_{j=1}^{n} e^{y_j}
where y_i and y_j represent the i-th and j-th elements of the input feature, e represents the natural constant, and S_i represents the probability value that the image belongs to the i-th class; the final probability value of the i-th remote sensing image is:
S = max(S_1, S_2, …, S_n)
where max(·) represents taking the maximum of the n values S_i; the label class corresponding to the maximum probability S_i is taken as the prediction class value Predict_label_i of the i-th remote sensing image sample;
according to the prediction results, the parameters of the convolutional layers are continuously optimized with the gradient descent algorithm so that the predicted class values of all training samples match the label value Label, until the loss function value reaches its minimum.
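A compact sketch of such a Classifier Head (global average pooling, fully connected layer, SoftMax, then taking the class of maximum probability); the channel count and the number of classes are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClassifierHead(nn.Module):
    def __init__(self, in_ch: int, num_classes: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)       # global average pooling
        self.fc = nn.Linear(in_ch, num_classes)   # fully connected layer

    def forward(self, p5):
        x = self.pool(p5).flatten(1)              # output feature X
        y = self.fc(x)                            # output feature Y = [y_1, ..., y_n]
        return F.softmax(y, dim=1)                # S_i = e^{y_i} / sum_j e^{y_j}

head = ClassifierHead(in_ch=2048, num_classes=30)  # channel / class counts assumed
probs = head(torch.randn(2, 2048, 7, 7))           # feature map P_5 stand-in
pred_label = probs.argmax(dim=1)                   # class with the maximum S_i
```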
CN202011180187.XA 2020-10-29 2020-10-29 Remote sensing image scene classification method based on bi-directional feature iterative fusion Active CN112347888B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011180187.XA CN112347888B (en) 2020-10-29 2020-10-29 Remote sensing image scene classification method based on bi-directional feature iterative fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011180187.XA CN112347888B (en) 2020-10-29 2020-10-29 Remote sensing image scene classification method based on bi-directional feature iterative fusion

Publications (2)

Publication Number Publication Date
CN112347888A true CN112347888A (en) 2021-02-09
CN112347888B CN112347888B (en) 2023-08-08

Family

ID=74356533

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011180187.XA Active CN112347888B (en) 2020-10-29 2020-10-29 Remote sensing image scene classification method based on bi-directional feature iterative fusion

Country Status (1)

Country Link
CN (1) CN112347888B (en)



Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110728192A (en) * 2019-09-16 2020-01-24 河海大学 High-resolution remote sensing image classification method based on novel characteristic pyramid depth network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yu Tiantian: "Remote sensing image change detection method based on graph and sparse representation", China Excellent Master's Theses Full-text Database (Master), Information Science and Technology Series *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112699855A (en) * 2021-03-23 2021-04-23 腾讯科技(深圳)有限公司 Image scene recognition method and device based on artificial intelligence and electronic equipment
CN113128564A (en) * 2021-03-23 2021-07-16 武汉泰沃滋信息技术有限公司 Typical target detection method and system based on deep learning under complex background
CN113239736A (en) * 2021-04-16 2021-08-10 广州大学 Land cover classification annotation graph obtaining method, storage medium and system based on multi-source remote sensing data
CN113420838A (en) * 2021-08-20 2021-09-21 中国科学院空天信息创新研究院 SAR and optical image classification method based on multi-scale attention feature fusion
US20230067442A1 (en) * 2021-08-31 2023-03-02 Black Sesame International Holding Limited Method of human pose estimation
CN113807219A (en) * 2021-09-06 2021-12-17 苏州中科蓝迪软件技术有限公司 Method for identifying types of grain and oil crops in planting land by steps
CN114022752A (en) * 2021-11-04 2022-02-08 中国人民解放军国防科技大学 SAR target detection method based on attention feature refinement and alignment
CN114022752B (en) * 2021-11-04 2024-03-15 中国人民解放军国防科技大学 SAR target detection method based on attention feature refinement and alignment
CN114792398A (en) * 2022-06-23 2022-07-26 阿里巴巴(中国)有限公司 Image classification method and target data classification model construction method
CN116740069A (en) * 2023-08-15 2023-09-12 山东锋士信息技术有限公司 Surface defect detection method based on multi-scale significant information and bidirectional feature fusion
CN116740069B (en) * 2023-08-15 2023-11-07 山东锋士信息技术有限公司 Surface defect detection method based on multi-scale significant information and bidirectional feature fusion

Also Published As

Publication number Publication date
CN112347888B (en) 2023-08-08

Similar Documents

Publication Publication Date Title
CN112347888B (en) Remote sensing image scene classification method based on bi-directional feature iterative fusion
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN108537742B (en) Remote sensing image panchromatic sharpening method based on generation countermeasure network
CN110414377B (en) Remote sensing image scene classification method based on scale attention network
CN110110642B (en) Pedestrian re-identification method based on multi-channel attention features
CN110728192B (en) High-resolution remote sensing image classification method based on novel characteristic pyramid depth network
CN110163110B (en) Pedestrian re-recognition method based on transfer learning and depth feature fusion
CN110555458B (en) Multi-band image feature level fusion method for generating countermeasure network based on attention mechanism
CN106909924B (en) Remote sensing image rapid retrieval method based on depth significance
CN111950453B (en) Random shape text recognition method based on selective attention mechanism
CN105740894B (en) Semantic annotation method for hyperspectral remote sensing image
CN111274869B (en) Method for classifying hyperspectral images based on parallel attention mechanism residual error network
CN112329760B (en) Method for recognizing and translating Mongolian in printed form from end to end based on space transformation network
CN108549893A (en) A kind of end-to-end recognition methods of the scene text of arbitrary shape
CN108830296A (en) A kind of improved high score Remote Image Classification based on deep learning
CN115690479A (en) Remote sensing image classification method and system based on convolution Transformer
CN112348036A (en) Self-adaptive target detection method based on lightweight residual learning and deconvolution cascade
CN111626267B (en) Hyperspectral remote sensing image classification method using void convolution
Tereikovskyi et al. The method of semantic image segmentation using neural networks
Ma et al. Unsupervised domain adaptation augmented by mutually boosted attention for semantic segmentation of VHR remote sensing images
CN115496928A (en) Multi-modal image feature matching method based on multi-feature matching
CN113205103A (en) Lightweight tattoo detection method
CN114299398B (en) Small sample remote sensing image classification method based on self-supervision contrast learning
CN115131313A (en) Hyperspectral image change detection method and device based on Transformer
CN114048810A (en) Hyperspectral image classification method based on multilevel feature extraction network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant