WO2020244261A1 - Scene recognition system for high-resolution remote sensing images and model generation method - Google Patents

Scene recognition system for high-resolution remote sensing images and model generation method

Info

Publication number
WO2020244261A1
WO2020244261A1 · PCT/CN2020/077889
Authority
WO
WIPO (PCT)
Prior art keywords
remote sensing
imfnet
network model
image
layer
Prior art date
Application number
PCT/CN2020/077889
Other languages
English (en)
French (fr)
Other versions
WO2020244261A8 (zh)
Inventor
王永成
张欣
张n
徐东don
Original Assignee
中国科学院长春光学精密机械与物理研究所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院长春光学精密机械与物理研究所 (Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences)
Publication of WO2020244261A1
Publication of WO2020244261A8

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images

Definitions

  • The embodiments of the present invention relate to the technical field of remote sensing image classification, and in particular to a scene recognition system for high-resolution remote sensing images and a model generation method for recognizing high-resolution remote sensing image scenes.
  • Remote sensing image data are becoming massive and diverse. Intelligent, automated analysis of remote sensing image data is a requirement of the big-data era, and the recognition and classification of remote sensing image scenes is an unavoidable step in the analysis of remote sensing image data.
  • In terms of the level of abstraction, remote sensing image scene classification has developed from pixels to objects and then to semantic scenes.
  • In the early 1970s, the spatial resolution of satellite images was low and a pixel was usually larger than the target of interest, so most remote sensing image analysis was based on pixels or sub-pixels.
  • As spatial resolution improved, scene classification based purely on the pixel level hit a bottleneck, so researchers turned to describing and analyzing remote sensing images at the "object" level.
  • Although "object"-level methods perform better than pixel-level classification, they do not involve semantic information, so researchers began to understand and analyze scenes at the semantic level.
  • Remote sensing images of a given category may vary greatly when described at different scales and orientations. As remote sensing image classification becomes finer, the problems of high intra-class variability and low inter-class distance become increasingly serious; how to achieve scene classification based on semantic categories is an urgent problem to be solved.
  • The embodiments of the present disclosure provide a scene recognition system for high-resolution remote sensing images and a model generation method for recognizing high-resolution remote sensing image scenes, which effectively improve the accuracy of high-resolution remote sensing scene image recognition.
  • The embodiments of the present invention provide the following technical solutions:
  • One aspect of the embodiments of the present invention provides a scene recognition system for high-resolution remote sensing images, including an IMFNet network model comprising a convolutional layer component, a pooling layer component, an Inception component and a fully connected layer component.
  • Each convolutional layer in the convolutional layer component alternates with a pooling layer of the pooling layer component, for extracting shallow information from the input remote sensing image.
  • The Inception component includes a plurality of Inception modules, each connected to a pooling layer of the pooling layer component and used to extract high-level information from the remote sensing image.
  • The fully connected layers of the fully connected layer component are cascaded, so that the output features of each fully connected layer are concatenated and then fed to the output layer component.
  • The IMFNet network model includes 4 convolutional layers, 6 pooling layers, 2 Inception modules and 3 fully connected layers, and the size of the remote sensing image is 256×256×3.
  • The Inception component includes a first Inception module and a second Inception module.
  • The first Inception module includes a first, a second, a third and a fourth branch: the first branch is a convolutional layer with a 1×1 kernel; the second branch consists of two convolutional layers with 1×1 and 5×5 kernels; the third branch consists of two convolutional layers with 1×1 and 3×3 kernels; the fourth branch consists of a pooling layer with stride 2 followed by a convolutional layer with a 1×1 kernel.
  • The second Inception module likewise includes four branches: the first branch is a convolutional layer with a 1×1 kernel; the second branch includes a first sub-branch consisting of a 1×1 convolutional layer and a second sub-branch of two parallel convolutional layers with 1×5 and 5×1 kernels; the third branch consists of a 1×1 convolutional layer, a 5×1 convolutional layer and two parallel convolutional layers with 1×3 and 3×1 kernels; the fourth branch consists of a pooling layer with stride 2 followed by a convolutional layer with a 1×1 kernel.
  • The IMFNet network model further includes a data set amplification module configured to perform a sample image amplification operation on a training sample set containing multiple shared high-resolution remote sensing images. The data set amplification module includes:
  • an annotation box labeling sub-module, used to generate a preset number of annotation boxes on the sample image;
  • an image cropping sub-module, used to randomly crop the image portion within each annotation box on the sample image, generating multiple sub-images whose contents are not identical;
  • an image adjustment sub-module, used to resize each sub-image to the input image size of the IMFNet network model using a resizing algorithm.
  • The image adjustment sub-module further includes a flipping unit and a normalization unit.
  • The flipping unit is used to flip each sub-image according to a preset angle.
  • The normalization unit is used to adjust the mean brightness of each sub-image to 0 and the variance to 1.
  • Each fully connected layer of the fully connected layer component includes a model optimization module configured to randomly drop multiple hidden units of the IMFNet network model using the Dropout algorithm.
  • The output layer component of the IMFNet network model includes a Softmax classifier and a loss function module.
  • The loss function module is used to add a parameter norm penalty term to the cross-entropy loss function of the IMFNet network model using a parameter norm regularization method.
  • The output layer component of the IMFNet network model further includes a parameter update frequency control module configured to use a pre-built moving average model to control the magnitude of variable updates in the IMFNet network model by continuously updating the decay rate.
  • The output layer component of the IMFNet network model further includes a parameter optimization module used to optimize the parameter weights of the IMFNet network model with the Adam algorithm.
  • Another aspect of the embodiments of the present invention provides a model generation method for recognizing high-resolution remote sensing image scenes, including:
  • constructing, in a pre-built training environment, the framework of an IMFNet network model for remote sensing image scene recognition, in which the convolutional layers and pooling layers alternate to extract shallow information from the input remote sensing image, each Inception module is connected to a pooling layer to extract high-level information from the remote sensing image, and the fully connected layers are cascaded so that their output features are concatenated and fed to the output layer component;
  • training the IMFNet network model with the high-resolution remote sensing images of a training sample set until a preset end condition is met, obtaining a trained IMFNet network model.
  • The IMFNet network model includes an Inception component for extracting high-level information features. This component does not require manually choosing filters or pooling, enabling the IMFNet network model to learn on its own; and because the Inception component combines different receptive fields, it can learn both micro-features and macro-features.
  • The embodiments of the present invention also provide, for the scene recognition system, a corresponding model generation method for recognizing high-resolution remote sensing image scenes, which further makes the system more feasible; the model generation method has corresponding advantages.
  • Fig. 1 is a structural block diagram of a scene recognition system for high-resolution remote sensing images according to an exemplary embodiment of the present disclosure;
  • Fig. 2 is a structural diagram of a scene recognition system for high-resolution remote sensing images according to another exemplary embodiment of the present disclosure;
  • Fig. 3 is a schematic diagram of a model optimization process according to another exemplary embodiment of the present disclosure;
  • Fig. 4 shows the test-accuracy curves obtained by training and testing on the internationally used SIRI-WHU and UC Merced data sets, as provided by this disclosure;
  • Fig. 5 is a schematic flowchart of a model generation method for recognizing high-resolution remote sensing image scenes provided by the present disclosure.
  • Fig. 1 is a structural block diagram of one embodiment of the high-resolution remote sensing image scene recognition system provided by an embodiment of the present invention.
  • The system may include an IMFNet network model 1 for remote sensing image scene recognition, which may include an input layer component 11, a convolutional layer component 12, a pooling layer component 13, an Inception component 14, a fully connected layer component 15 and an output layer component 16.
  • Each convolutional layer in the convolutional layer component 12 alternates with a pooling layer of the pooling layer component 13 to extract shallow information from the input remote sensing image. That is, each convolutional layer in the convolutional layer component 12 is connected to a pooling layer in the pooling layer component 13, which can perform a dimensionality-reduction operation on the output features of the corresponding convolutional layer.
  • The Inception component 14 may include multiple Inception modules, each connected to a pooling layer of the pooling layer component 13. Since an Inception module combines different receptive fields, it can learn both micro-features and macro-features without manually choosing filters or pooling, realizing self-learning of the network; the Inception modules can therefore be used to extract high-level information from the input image.
  • The fully connected layers of the fully connected layer component 15 are cascaded, so that their output features are concatenated and then fed to the output layer component. In other words, the fully connected layer component 15 adopts multi-layer feature fusion: the features of the fully connected layers, which contain high-level semantic information, are concatenated and then classified by the output layer, so that the semantic information in the features of different layers complements each other, improving classification accuracy.
  • For the structure and function of the input layer component 11 and output layer component 16 of the IMFNet network model, reference may be made to the descriptions of the input and output layers of convolutional neural network models in the related art, which are not repeated here.
  • The input layer component 11 of the IMFNet network model may further include a training-sample-set active reading module, used to read sample data from the training sample set using queues or multithreading.
  • The IMFNet network model includes an Inception component for extracting high-level information features.
  • This component does not require manually choosing filters or pooling, enabling the IMFNet network model to learn on its own.
  • Because the component combines different receptive fields, it can learn both micro-features and macro-features; without increasing the number of network layers, it realizes self-learning of the network and improves the target recognition accuracy of the IMFNet network model.
  • Furthermore, the semantic information contained in the features of different layers can complement each other, guaranteeing the integrity of the feature information and further improving the target recognition accuracy of the IMFNet network model.
  • At present, scene recognition methods for high-resolution remote sensing images based on convolutional neural networks rely mainly on transfer learning: pre-trained models trained on large-scale data (such as ImageNet) are applied directly or fine-tuned in the remote sensing image field.
  • However, for deep learning methods to develop further in remote sensing scene classification, it is necessary to build new convolutional neural network models using shared high-resolution remote sensing image data sets. Although the volume of remote sensing imagery grows daily, internationally shared remote sensing data sets with label information remain very small, while training a new convolutional neural network model usually requires a great deal of data; moreover, the objects in high-resolution remote sensing images are randomly distributed within the image.
  • The IMFNet network model may therefore include a data set amplification module, used to perform sample image amplification operations on a training sample set containing multiple shared high-resolution remote sensing images. The data set amplification module includes:
  • an annotation box labeling sub-module, used to generate a preset number of annotation boxes on the sample image. For example, two annotation boxes of different sizes can be used to label the image, extracting image information of different viewing angles and sizes;
  • an image cropping sub-module, used to randomly crop the image portion within each annotation box, generating multiple sub-images with different contents. For example, for the image in each annotation box, an arbitrary part may be randomly cropped, e.g. 40% of the information content;
  • an image adjustment sub-module, used to resize each sub-image to the input image size of the IMFNet network model using a resizing algorithm.
  • The resizing algorithm may be, for example, an interpolation algorithm such as bilinear interpolation, nearest-neighbor interpolation, bicubic interpolation or area interpolation.
  • The input image size can be 256×256×3, and different interpolation algorithms, or several interpolation algorithms together, can be used to stretch each sub-image to 256×256×3.
  • The image adjustment sub-module may further include a flipping unit and a normalization unit.
  • The flipping unit can flip each sub-image according to a preset angle. Flipping does not affect the recognition result; however, describing the same image from different angles while preserving as much detail as possible helps improve training accuracy. The obtained sub-images can therefore be flipped with a probability of about 50%, making the sample images in the training set more diverse.
  • The normalization unit can also be used to normalize the brightness of each sub-image, standardizing the image so that the mean brightness becomes 0 and the variance becomes 1, as shown in formulas (1) and (2):
  • μ = (1/N)·Σᵢ Xᵢ (1)
  • σ = √((1/N)·Σᵢ (Xᵢ − μ)²) (2)
  • where X is the image matrix, μ is the image mean, σ is the standard deviation, and N is the number of pixels in image X.
  • The multi-view, multi-scale stretched training sample set provided by this application differs from methods that crop the four corners and the center of an image and rotate them to amplify the data set.
  • This application randomly crops different parts of the annotation boxes and randomly applies a resizing algorithm, such as interpolation, to stretch the cropped images to the input size of the network, followed by left-right flipping.
  • Stretching each cropped sub-image by interpolation is equivalent to adding noise, which improves the robustness of the model and hence the generalization ability of the constructed convolutional neural network model.
  • The embodiments of the present invention thus successively perform annotation-box labeling, random cropping, random stretching, random flipping and image normalization on each sample image in the training sample set.
  • On the one hand this amplifies the data set so that the network can learn remote sensing images from different viewing angles; on the other hand, the noise introduced by the stretching operation improves the robustness of the model.
  • IMFNet network model 1 can include 1 input layer, 4 convolutional layers, 6 max-pooling layers, 2 Inception modules, 3 fully connected layers and 1 output layer.
  • The input of the network's input layer can be a 256×256×3 image.
  • The first few layers of IMFNet network model 1 can extract shallow features in the traditional way of alternating convolutional and max-pooling layers. The first three convolutional layers all use 5×5 kernels to extract larger features, and the fourth convolutional layer uses a 3×3 kernel to extract finer features.
  • The first Inception module is a module from the Inception V1 model, consisting of four branches; the second Inception module is an improved version of the Inception V3 module, whose third branch has one more convolutional layer than the Inception V1 module and can extract finer features.
  • "Filter Concat" means that the feature maps obtained by the convolution operations are concatenated along the depth dimension.
  • This application also applies multi-layer feature fusion to the fully connected layers and the output layer.
  • The features of the three fully connected layers can be concatenated as the input of the output layer, so that the semantic information contained in the features of different layers complements each other; this guarantees the integrity of the information to a certain extent and improves classification accuracy.
  • The output layer of IMFNet network model 1 can classify with a Softmax classifier, and the output size depends on the number of remote sensing scene classes to be distinguished.
  • The Inception component 14 may include a first Inception module and a second Inception module.
  • The first Inception module includes a first, second, third and fourth branch: the first branch is a convolutional layer with a 1×1 kernel; the second branch consists of two convolutional layers with 1×1 and 5×5 kernels; the third branch consists of two convolutional layers with 1×1 and 3×3 kernels; the fourth branch consists of a pooling layer with stride 2 followed by a convolutional layer with a 1×1 kernel.
  • The second Inception module likewise includes four branches: the first branch is a convolutional layer with a 1×1 kernel; the second branch includes a first sub-branch consisting of a 1×1 convolutional layer and a second sub-branch of two parallel convolutional layers with 1×5 and 5×1 kernels; the third branch consists of a 1×1 convolutional layer, a 5×1 convolutional layer and two parallel convolutional layers with 1×3 and 3×1 kernels; the fourth branch consists of a pooling layer with stride 2 followed by a convolutional layer with a 1×1 kernel.
  • During the training of the IMFNet network model, over-fitting easily arises when the amount of data is too small and the model has too many parameters. For this reason, referring to Fig. 3:
  • a Dropout strategy and parameter norm penalty regularization can be used to prevent over-fitting, and a moving average model can be used to make the model more robust.
  • Each fully connected layer of the fully connected layer component 15 includes a model optimization module.
  • The model optimization module is used to randomly drop multiple hidden units of IMFNet network model 1 using the Dropout algorithm.
  • The Dropout algorithm drops a portion of the hidden-layer neurons each time, which is equivalent to training a different network each time, effectively reducing the interdependence between neurons.
  • Adding the Dropout strategy to the fully connected part effectively weakens the co-adaptation and over-fitting of network neurons, improving the generalization ability of the model.
  • The output layer component 16 of IMFNet network model 1 may also include a parameter update frequency control module, which can use a pre-built moving average model to control the magnitude of variable updates in the IMFNet network model by continuously updating the decay rate.
  • The moving average model controls the magnitude of variable updates by continuously updating the decay rate, so that the model updates quickly early in training and slowly late in training, when it is close to the optimum; this helps improve the robustness of IMFNet network model 1.
  • The decay rate and parameter update of the moving average model can be as in formulas (3) and (4):
  • decay = min(init_decay, (1 + num_update)/(10 + num_update)); (3)
  • shadow_var = decay × shadow_var + (1 − decay) × var. (4)
  • where init_decay is the configured initial decay rate, num_update is the number of updates, var is the variable to be updated, and shadow_var is the updated value of the variable, also called the shadow variable.
  • The moving average model can be applied both in the model training process and in the model verification and evaluation process.
  • In the training stage, a shadow variable can be maintained for each trainable weight and updated as iterations proceed; in the verification and evaluation stage, the shadow variables can replace the real variable values for classification prediction.
  • The output layer component 16 of IMFNet network model 1 may also include a Softmax classifier and a loss function module.
  • The loss function module can add a parameter norm penalty term to the loss function using a parameter norm regularization method. Any loss function may be used, for example the cross-entropy loss function, which this application does not limit.
  • Parameter norm regularization restricts the size of the weights by adding a parameter norm penalty term to the objective function, so that the model cannot arbitrarily fit random noise in the training data, further optimizing the model and improving the target recognition accuracy of the IMFNet network model.
  • Specifically, parameter norm regularization adds an index R(W) describing model complexity to the loss function J(W,b) and limits the size of the weights W by optimizing J(W,b) + λR(W), so that the model cannot arbitrarily fit random noise in the training data.
  • Different parameter measures R(W) produce different regularization effects.
  • Commonly used parameter norm regularization methods include L1 norm regularization and L2 norm regularization.
  • The present invention adopts L2 norm regularization, as shown in the following formula: R(W) = ½·Σᵢ wᵢ².
  • The output layer component 16 of IMFNet network model 1 may also include a parameter optimization module, used to optimize the parameter weights of IMFNet network model 1 with the Adam algorithm.
  • The iterative process can be implemented with the backpropagation algorithm; for example, the number of iterations can be set to 100,000, which helps further improve the target recognition accuracy of IMFNet network model 1.
  • The existing high-resolution remote sensing sample images can be divided into training samples and test samples at a ratio of 4:1, generating a training sample set and a test sample set.
  • The binary files of the training sample set can be used to train the IMFNet network model, and the test sample set can be used to test the trained IMFNet network model.
  • During testing, test results can be loaded on the trained model from the test set every 10 s, completing the scene recognition and classification task for high-resolution remote sensing images.
  • Model training and verification evaluation can be performed on two internationally used high-resolution remote sensing image data sets, the UC Merced data set and the SIRI-WHU data set.
  • Each image in the UC Merced data set is 256×256 pixels with a spatial resolution of 30 cm; there are 21 image classes with 100 images each.
  • Each image in the SIRI-WHU data set is 200×200 pixels with a spatial resolution of 2 m; there are 12 image classes with 200 images each.
  • The accuracy, precision, recall and F1 evaluation metrics are used to evaluate the recognition results of the IMFNet network model.
  • The examples can be divided into true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN) according to the combination of the true class and the learner's predicted class.
  • A true positive is a positive example judged positive, and a false negative is a positive example judged negative, so the number of positive examples P is: P = TP + FN.
  • A true negative is a negative example judged negative, and a false positive is a negative example judged positive, so the number of negative examples N is: N = TN + FP.
  • Accuracy is the proportion of correctly judged examples; the calculation formula is: accuracy = (TP + TN)/(P + N).
  • Precision is the proportion of true positives among all examples judged positive; the calculation formula is: precision = TP/(TP + FP).
  • Recall is the proportion of positives judged positive among all positive examples; the calculation formula is: recall = TP/(TP + FN).
  • The F1 value is a combined evaluation metric of precision and recall; the calculation formula is: F1 = 2 × precision × recall/(precision + recall).
  • The embodiments of the present invention also provide a corresponding model generation method for the scene recognition system for high-resolution remote sensing images, further making the system more feasible.
  • The model generation method for recognizing high-resolution remote sensing image scenes provided by the embodiments of the present invention is introduced below.
  • The model generation method described below and the scene recognition system for high-resolution remote sensing images described above may be referred to in correspondence with each other.
  • Fig. 5 is a schematic flowchart of a model generation method for recognizing high-resolution remote sensing image scenes according to an embodiment of the present invention.
  • The embodiment of the present invention may include the following content:
  • S501: construct, in a pre-built training environment, the framework of an IMFNet network model for remote sensing image scene recognition.
  • The convolutional layers and pooling layers of the IMFNet network model alternate to extract shallow information from the input remote sensing image.
  • Each Inception module is connected to a pooling layer to extract high-level information from the remote sensing image, and the fully connected layers are cascaded so that their output features are concatenated and fed to the output layer component.
  • S502: train the IMFNet network model with the high-resolution remote sensing images of the training sample set until a preset end condition is met, obtaining the trained IMFNet network model.
  • The hardware platform in this embodiment may be based on dual Intel E5 2665 processors, four GTX 1080Ti GPUs and 32 GB of memory.
  • The software platform can be based on Ubuntu 16.04, using CUDA 8.0.61, cuDNN v6 and TensorFlow 1.4.0.
  • TensorFlow is an open-source software library for high-performance numerical computation that can feed complex data structures into artificial neural networks for analysis and processing.
  • Computing tasks can easily be deployed to multiple platforms (such as CPU, GPU, TPU) and devices (desktops, servers, etc.), and TensorFlow is widely used in machine and deep learning fields such as speech recognition and image recognition.
  • The image format in the training or test sample set must match the computing software used in the software environment. For example, when TensorFlow is used, the images in the training or test sample set need to be converted to the TFRecord format, after which the binary files can be read through queues and multithreading.
  • Single training sample images in the training sample set can be fed to the IMFNet network model in batches, and the batch size can be set to 64.
  • The embodiments of the present invention realize self-learning of the network model, guarantee the integrity of the feature information, and effectively improve the target recognition accuracy of the IMFNet network model.
  • The embodiments of the present invention also provide a model generation device for recognizing high-resolution remote sensing image scenes, which may specifically include:
  • a memory, used to store a computer program;
  • a processor, configured to execute the computer program to implement the steps of the model generation method for recognizing a high-resolution remote sensing image scene as described in any of the above embodiments.
  • The functions of the functional modules of the model generation device described in the embodiments of the present invention can be implemented according to the methods in the above method embodiments; for the specific implementation process, reference may be made to the related descriptions of the above method embodiments, which are not repeated here.
  • The embodiments of the present invention also provide a computer-readable storage medium that stores a model generation program for recognizing high-resolution remote sensing image scenes.
  • When the model generation program for recognizing high-resolution remote sensing image scenes is executed by a processor, it implements the steps of the model generation method for recognizing high-resolution remote sensing image scenes as described in any of the above embodiments.
  • The functions of the functional modules of the computer-readable storage medium in the embodiments of the present invention can be implemented according to the methods in the above method embodiments; for the specific implementation process, reference may be made to the related descriptions of the above method embodiments, which are not repeated here.
  • The steps of the method or algorithm described in the embodiments disclosed in this document can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two.
  • The software module can be placed in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A scene recognition system for high-resolution remote sensing images and a model generation method for recognizing high-resolution remote sensing image scenes. The system includes an IMFNet network model for remote sensing image scene recognition; the IMFNet network model includes a convolutional layer component, a pooling layer component, an Inception component and a fully connected layer component. Each convolutional layer in the convolutional layer component alternates with a pooling layer of the pooling layer component, to extract shallow information from the input remote sensing image; the Inception component includes multiple Inception modules, each connected to a pooling layer of the pooling layer component, to extract high-level information from the remote sensing image; the fully connected layers of the fully connected layer component are cascaded, so that their output features are concatenated and then fed to the output layer component. This application realizes self-learning of the network model, guarantees the integrity of the feature information, and effectively improves the target recognition accuracy of the IMFNet network model.

Description

Scene recognition system for high-resolution remote sensing images and model generation method
This application claims priority to Chinese patent application No. 201910486629.4, entitled "Scene recognition system for high-resolution remote sensing images and model generation method", filed with the Chinese Patent Office on June 5, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present invention relate to the technical field of remote sensing image classification, and in particular to a scene recognition system for high-resolution remote sensing images and a model generation method for recognizing high-resolution remote sensing image scenes.
Background
With the continuous improvement in the ability to acquire remote sensing image data and the diversification of remote sensing imaging methods, remote sensing image data are becoming massive and diverse. Intelligent, automated analysis of remote sensing image data is a requirement of the big-data era, and the recognition and classification of remote sensing image scenes is an unavoidable step in the analysis of remote sensing image data.
In terms of the level of abstraction, remote sensing image scene classification has developed from pixels to objects and then to semantic scenes. In the early 1970s, the spatial resolution of satellite images was low and a pixel was usually larger than the target of interest, so most remote sensing image analysis was based on pixels or sub-pixels. As remote sensing technology developed and spatial resolution kept improving, scene classification based purely on the pixel level hit a bottleneck, and researchers turned to describing and analyzing remote sensing images at the "object" level. Although object-level methods perform better than pixel-level classification, they do not involve semantic information, so researchers began to understand and analyze scenes at the semantic level.
Remote sensing images of a given category may vary greatly when described at different scales and orientations. As remote sensing image classification becomes finer, the problems of high intra-class variability and low inter-class distance become increasingly serious; how to achieve scene classification based on semantic categories is an urgent problem to be solved.
When classifying remote sensing images, the related art usually trains a convolutional neural network on shared high-resolution visible-light remote sensing image data sets to classify remote sensing image scenes; however, because the number of images in the currently shared high-resolution remote sensing image data sets is small, the classification accuracy is low.
Summary
Embodiments of the present disclosure provide a scene recognition system for high-resolution remote sensing images and a model generation method for recognizing high-resolution remote sensing image scenes, which effectively improve the accuracy of high-resolution remote sensing scene image recognition.
To solve the above technical problem, embodiments of the present invention provide the following technical solutions:
One aspect of the embodiments of the present invention provides a scene recognition system for high-resolution remote sensing images, including:
an IMFNet network model for remote sensing image scene recognition, the IMFNet network model including a convolutional layer component, a pooling layer component, an Inception component and a fully connected layer component;
each convolutional layer in the convolutional layer component alternating with a pooling layer of the pooling layer component, for extracting shallow information from the input remote sensing image;
the Inception component including a plurality of Inception modules, each connected to a pooling layer of the pooling layer component, for extracting high-level information from the remote sensing image;
the fully connected layers of the fully connected layer component being cascaded, so that the output features of the fully connected layers are concatenated and then fed to an output layer component.
Optionally, the IMFNet network model includes 4 convolutional layers, 6 pooling layers, 2 Inception modules and 3 fully connected layers, and the size of the remote sensing image is 256×256×3.
Optionally, the Inception component includes a first Inception module and a second Inception module;
the first Inception module includes a first branch, a second branch, a third branch and a fourth branch; the first branch includes a convolutional layer with a 1×1 kernel; the second branch consists of two convolutional layers with 1×1 and 5×5 kernels; the third branch consists of two convolutional layers with 1×1 and 3×3 kernels; the fourth branch consists of a pooling layer with stride 2 and a convolutional layer with a 1×1 kernel;
the second Inception module includes a first branch, a second branch, a third branch and a fourth branch; the first branch includes a convolutional layer with a 1×1 kernel; the second branch includes a first sub-branch consisting of a 1×1 convolutional layer and a second sub-branch consisting of two parallel convolutional layers with 1×5 and 5×1 kernels; the third branch consists of a 1×1 convolutional layer, a 5×1 convolutional layer and two parallel convolutional layers with 1×3 and 3×1 kernels; the fourth branch consists of a pooling layer with stride 2 and a convolutional layer with a 1×1 kernel.
Optionally, the IMFNet network model further includes a data set amplification module, used to perform sample image amplification operations on a training sample set containing multiple shared high-resolution remote sensing images; the data set amplification module includes:
an annotation box labeling sub-module, used to generate a preset number of annotation boxes on a sample image;
an image cropping sub-module, used to randomly crop the image portion within each annotation box on the sample image, to generate multiple sub-images whose contents are not identical;
an image adjustment sub-module, used to resize each sub-image to the input image size of the IMFNet network model using a resizing algorithm.
Optionally, the image adjustment sub-module further includes a flipping unit and a normalization unit;
the flipping unit is used to flip each sub-image according to a preset angle;
the normalization unit is used to adjust the mean brightness of each sub-image to 0 and the variance to 1.
Optionally, each fully connected layer of the fully connected layer component includes a model optimization module, used to randomly drop multiple hidden units of the IMFNet network model using the Dropout algorithm.
Optionally, the output layer component of the IMFNet network model includes a Softmax classifier and a loss function module;
the loss function module is used to add a parameter norm penalty term to the cross-entropy loss function of the IMFNet network model using a parameter norm regularization method.
Optionally, the output layer component of the IMFNet network model further includes a parameter update frequency control module, used to control the magnitude of variable updates of the IMFNet network model by continuously updating the decay rate with a pre-built moving average model.
Optionally, the output layer component of the IMFNet network model further includes a parameter optimization module, used to optimize the parameter weights of the IMFNet network model with the Adam algorithm.
Another aspect of the embodiments of the present invention provides a model generation method for recognizing high-resolution remote sensing image scenes, including:
constructing, in a pre-built training environment, the framework of an IMFNet network model for remote sensing image scene recognition, where the convolutional layers and pooling layers of the IMFNet network model alternate to extract shallow information from the input remote sensing image, each Inception module is connected to a pooling layer to extract high-level information from the remote sensing image, and the fully connected layers are cascaded so that their output features are concatenated and fed to the output layer component;
training the IMFNet network model with the high-resolution remote sensing images of a training sample set until a preset end condition is met, obtaining a trained IMFNet network model.
The advantage of the technical solution provided by this application is that the IMFNet network model includes an Inception component for extracting high-level information features. This component does not require manually choosing filters or pooling, enabling the IMFNet network model to learn on its own. Because the Inception component combines different receptive fields, it can learn both micro-features and macro-features; without increasing the number of network layers, it realizes self-learning of the network and improves the target recognition accuracy of the IMFNet network model. In addition, concatenating the features of the fully connected layers as the input of the output layer allows the semantic information contained in the features of different layers to complement each other, guaranteeing the integrity of the feature information and further improving the target recognition accuracy of the IMFNet network model.
In addition, embodiments of the present invention also provide, for the scene recognition system for high-resolution remote sensing images, a corresponding model generation method for recognizing high-resolution remote sensing image scenes, which further makes the system more feasible; the model generation method has corresponding advantages.
It should be understood that the above general description and the following detailed description are merely exemplary and do not limit the present disclosure.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present invention or the related art more clearly, the drawings needed in the description of the embodiments or the related art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a structural block diagram of a scene recognition system for high-resolution remote sensing images according to an exemplary embodiment of the present disclosure;
Fig. 2 is a structural diagram of a scene recognition system for high-resolution remote sensing images according to another exemplary embodiment of the present disclosure;
Fig. 3 is a schematic diagram of a model optimization process according to another exemplary embodiment of the present disclosure;
Fig. 4 shows the test-accuracy curves obtained by training and testing on the internationally used SIRI-WHU and UC Merced data sets, as provided by the present disclosure;
Fig. 5 is a schematic flowchart of a model generation method for recognizing high-resolution remote sensing image scenes provided by the present disclosure.
Detailed Description
To help those skilled in the art better understand the solution of the present invention, the present invention is further described in detail below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The terms "first", "second", "third", "fourth", etc. in the specification, claims and drawings of this application are used to distinguish different objects rather than to describe a specific order. Furthermore, the terms "include" and "have", and any variants thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device comprising a series of steps or units is not limited to the listed steps or units, but may include steps or units that are not listed.
Having introduced the technical solutions of the embodiments of the present invention, the various non-limiting implementations of this application are described in detail below.
Referring first to Fig. 1, which is a structural block diagram of one embodiment of a scene recognition system for high-resolution remote sensing images provided by an embodiment of the present invention, the system may include an IMFNet network model 1 for remote sensing image scene recognition; IMFNet network model 1 may include an input layer component 11, a convolutional layer component 12, a pooling layer component 13, an Inception component 14, a fully connected layer component 15 and an output layer component 16.
In this application, each convolutional layer in the convolutional layer component 12 alternates with a pooling layer of the pooling layer component 13 to extract shallow information from the input remote sensing image. That is, each convolutional layer in the convolutional layer component 12 is connected to a pooling layer in the pooling layer component 13, which can perform a dimensionality-reduction operation on the output features of the corresponding convolutional layer.
In an implementation of the present invention, the Inception component 14 may include multiple Inception modules, each connected to a pooling layer of the pooling layer component 13. Since an Inception module combines different receptive fields, it can learn both micro-features and macro-features without manually choosing filters or pooling, realizing self-learning of the network; the Inception modules can therefore be used to extract high-level information from the input image.
It can be understood that the fully connected layers of the fully connected layer component 15 are cascaded, so that their output features are concatenated and then fed to the output layer component. In other words, the fully connected layer component 15 adopts a multi-layer feature fusion method: the features of the fully connected layers, which contain high-level semantic information, are concatenated and then classified by the output layer, so that the semantic information contained in the features of different layers can complement each other, improving classification accuracy.
In this embodiment, for the structure and function of the input layer component 11 and output layer component 16 of the IMFNet network model, reference may be made to the descriptions of the input and output layers of convolutional neural network models in the related art, which are not repeated here. Optionally, the input layer component 11 of the IMFNet network model may further include a training-sample-set active reading module, used to read sample data from the training sample set using queues or multithreading.
In the technical solution provided by the embodiments of the present invention, the IMFNet network model includes an Inception component for extracting high-level information features. This component does not require manually choosing filters or pooling, enabling the IMFNet network model to learn on its own. Because the Inception component combines different receptive fields, it can learn both micro-features and macro-features; without increasing the number of network layers, it realizes self-learning of the network and improves the target recognition accuracy of the IMFNet network model. In addition, concatenating the features of the fully connected layers as the input of the output layer allows the semantic information of different layers to complement each other, guaranteeing the integrity of the feature information and further improving the target recognition accuracy of the IMFNet network model.
At present, scene recognition methods for high-resolution remote sensing images based on convolutional neural networks rely mainly on transfer learning: pre-trained models trained on large-scale data (such as ImageNet) are applied directly or fine-tuned in the remote sensing image field. However, for deep learning methods to develop further in remote sensing scene classification, it is necessary to build new convolutional neural network models using shared high-resolution remote sensing image data sets. Although the volume of remote sensing imagery grows daily, the internationally shared remote sensing image data sets with label information are still very small, while training a new convolutional neural network model usually requires a large amount of data. Since objects in high-resolution remote sensing images are randomly distributed within the image, this application provides a strategy of multi-view, multi-scale stretching on the basis of limited samples, thereby amplifying the training sample data set and overcoming the limited amount of data in the currently shared high-resolution remote sensing image data sets. The IMFNet network model may include a data set amplification module, used to perform sample image amplification operations on a training sample set containing multiple shared high-resolution remote sensing images; the data set amplification module includes:
an annotation box labeling sub-module, used to generate a preset number of annotation boxes on a sample image. For example, two annotation boxes of different sizes can be used to label the image, thereby extracting image information of different viewing angles and sizes;
an image cropping sub-module, used to randomly crop the image portion within each annotation box, to generate multiple sub-images whose contents are not identical. For example, for the image in each annotation box, an arbitrary part may be randomly cropped, e.g. 40% of the information content;
an image adjustment sub-module, used to resize each sub-image to the input size of the IMFNet network model using a resizing algorithm. The resizing algorithm may be, for example, an interpolation algorithm such as bilinear interpolation, nearest-neighbor interpolation, bicubic interpolation or area interpolation. For example, the input image size may be 256×256×3, and different interpolation algorithms, or several interpolation algorithms together, may be used to stretch each sub-image to 256×256×3.
In another implementation, the image adjustment sub-module may further include a flipping unit and a normalization unit.
The flipping unit can be used to flip each sub-image according to a preset angle. Flipping does not affect the recognition result; however, describing the same image from different angles while preserving as much detail as possible helps improve model training accuracy. Each obtained sub-image can therefore be flipped with a probability of about 50%, making the sample images in the training set more diverse. To facilitate subsequent image data processing, the normalization unit can be used to normalize the brightness of each sub-image, for example by standardizing the image so that the mean brightness of each sub-image becomes 0 and the variance becomes 1, as shown in formulas (1) and (2):
μ = (1/N)·Σᵢ₌₁ᴺ Xᵢ (1)
σ = √((1/N)·Σᵢ₌₁ᴺ (Xᵢ − μ)²) (2)
where X is the image matrix, μ is the image mean, σ is the standard deviation, and N is the number of pixels in image X.
It should be noted that the multi-view, multi-scale stretched training sample set provided by this application differs from methods that crop the four corners and the center of an image and rotate them to amplify the data set. This application randomly crops different parts of the annotation boxes and randomly applies a resizing algorithm, such as interpolation, to stretch the cropped images to the input size of the network, followed by left-right flipping. This not only enlarges the data set; stretching each cropped sub-image with an interpolation algorithm is equivalent to adding noise, which improves the robustness of the model and hence the generalization ability of the constructed convolutional neural network model.
As can be seen from the above, the embodiments of the present invention successively perform annotation-box labeling, random cropping, random stretching, random flipping and image normalization on each sample image in the training sample set. On the one hand, this amplifies the data set so that the network can learn remote sensing images from different viewing angles; on the other hand, the noise introduced by the stretching operation improves the robustness of the model.
To help those skilled in the art understand the technical solution provided by this application more clearly, this application also provides a schematic IMFNet network model 1 structure; referring to Fig. 2, it may include:
IMFNet network model 1 may include 1 input layer, 4 convolutional layers, 6 max-pooling layers, 2 Inception modules, 3 fully connected layers and 1 output layer. The input of the network's input layer may be a 256×256×3 image, and the first few layers of IMFNet network model 1 may extract shallow features in the traditional way of alternating convolutional and max-pooling layers. The first three convolutional layers all use 5×5 kernels to extract larger features, while the fourth convolutional layer uses a 3×3 kernel to extract finer features. The first Inception module is a module from the Inception V1 model and consists of four branches; the second Inception module is an improved version of the Inception V3 module, whose third branch has one more convolutional layer than the Inception V1 module and can extract finer features. Here, "filter Concat" means that the feature maps obtained by the convolution operations are concatenated along the depth dimension. This application also applies a multi-layer feature fusion method to the fully connected layers and the output layer: the features of the three fully connected layers can be concatenated as the input of the output layer, so that the semantic information contained in the features of different layers complements each other; this guarantees the integrity of the information to a certain extent and improves classification accuracy. The output layer of IMFNet network model 1 can classify with a Softmax classifier, and the output size depends on the number of remote sensing scene classes to be distinguished.
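For illustration only, this layer layout can be sketched in tf.keras (the patent itself used TensorFlow 1.4). Filter counts and fully connected widths are not specified in the patent, so the numbers below are assumptions; `inception_module` stands for either of the two modules detailed next.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_imfnet(num_classes, inception_module):
    """Sketch of the IMFNet layout: 4 conv, 6 max-pool, 2 Inception, 3 FC."""
    inputs = layers.Input(shape=(256, 256, 3))
    x = inputs
    # Shallow features: alternating convolution and max-pooling (5x5, 5x5, 5x5, 3x3).
    for kernel, filters in [(5, 32), (5, 64), (5, 96), (3, 128)]:
        x = layers.Conv2D(filters, kernel, padding="same", activation="relu")(x)
        x = layers.MaxPool2D(2)(x)
    # High-level features: two Inception modules, each followed by pooling.
    x = layers.MaxPool2D(2)(inception_module(x))
    x = layers.MaxPool2D(2)(inception_module(x))
    x = layers.Flatten()(x)
    # Multi-layer feature fusion: concatenate the outputs of the three FC layers.
    fc1 = layers.Dense(1024, activation="relu")(x)
    fc2 = layers.Dense(512, activation="relu")(fc1)
    fc3 = layers.Dense(256, activation="relu")(fc2)
    fused = layers.Concatenate()([fc1, fc2, fc3])
    outputs = layers.Dense(num_classes, activation="softmax")(fused)
    return tf.keras.Model(inputs, outputs)
```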
Specifically, in the IMFNet network model 1 shown in Fig. 2, the Inception component 14 may include a first Inception module and a second Inception module;
the first Inception module includes a first branch, a second branch, a third branch and a fourth branch: the first branch includes a convolutional layer with a 1×1 kernel; the second branch consists of two convolutional layers with 1×1 and 5×5 kernels; the third branch consists of two convolutional layers with 1×1 and 3×3 kernels; the fourth branch consists of a pooling layer with stride 2 and a convolutional layer with a 1×1 kernel;
the second Inception module includes a first branch, a second branch, a third branch and a fourth branch: the first branch includes a convolutional layer with a 1×1 kernel; the second branch includes a first sub-branch consisting of a 1×1 convolutional layer and a second sub-branch consisting of two parallel convolutional layers with 1×5 and 5×1 kernels; the third branch consists of a 1×1 convolutional layer, a 5×1 convolutional layer and two parallel convolutional layers with 1×3 and 3×1 kernels; the fourth branch consists of a pooling layer with stride 2 and a convolutional layer with a 1×1 kernel.
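Continuing the sketch (and reusing the imports above), the first, Inception V1 style module could look as follows; the per-branch filter count f is an assumption. One caveat: the description gives the pooling branch a stride of 2, but all four branch outputs must share a spatial size for the depth concatenation, so this sketch uses stride 1 with "same" padding instead.

```python
def inception_module_a(x, f=32):
    """Four-branch module ending in a depth concatenation ('filter Concat')."""
    b1 = layers.Conv2D(f, 1, padding="same", activation="relu")(x)   # 1x1
    b2 = layers.Conv2D(f, 1, padding="same", activation="relu")(x)   # 1x1 -> 5x5
    b2 = layers.Conv2D(f, 5, padding="same", activation="relu")(b2)
    b3 = layers.Conv2D(f, 1, padding="same", activation="relu")(x)   # 1x1 -> 3x3
    b3 = layers.Conv2D(f, 3, padding="same", activation="relu")(b3)
    b4 = layers.MaxPool2D(3, strides=1, padding="same")(x)           # pool -> 1x1
    b4 = layers.Conv2D(f, 1, padding="same", activation="relu")(b4)
    return layers.Concatenate()([b1, b2, b3, b4])

# Usage with the earlier sketch: model = build_imfnet(21, inception_module_a)
```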
It can be understood that, during the training of IMFNet network model 1, over-fitting easily arises when the amount of data is too small and the model has too many parameters. For this reason, referring to Fig. 3, a Dropout strategy and parameter norm penalty regularization can be used on IMFNet network model 1 to prevent over-fitting, and a moving average model can be used to make the model more robust.
Each fully connected layer of the fully connected layer component 15 includes a model optimization module; as shown in Fig. 2, the model optimization module is used to randomly drop multiple hidden units of IMFNet network model 1 with the Dropout algorithm. The Dropout algorithm drops a portion of the hidden-layer neurons each time, which is equivalent to training a different network each time, effectively reducing the interdependence between neurons. Adding the Dropout strategy to the fully connected part effectively weakens the co-adaptation and over-fitting of network neurons, improving the generalization ability of the model.
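In the sketch above, adding Dropout to a fully connected layer is a one-line wrapper; the rate of 0.5 is an assumption, as the patent does not state the drop probability.

```python
def dense_dropout(x, units, rate=0.5):
    """Fully connected layer followed by Dropout (active only during training)."""
    return layers.Dropout(rate)(layers.Dense(units, activation="relu")(x))
```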
Optionally, the output layer component 16 of IMFNet network model 1 may also include a parameter update frequency control module, which can use a pre-built moving average model to control the magnitude of variable updates in the IMFNet network model by continuously updating the decay rate. The moving average model controls the magnitude of variable updates through the decay rate, so that the model updates quickly early in training and slowly late in training, when it is close to the optimum; this helps improve the robustness of IMFNet network model 1. The decay rate and parameter update of the moving average model can be as in formulas (3) and (4):
decay = min(init_decay, (1 + num_update)/(10 + num_update)); (3)
shadow_var = decay × shadow_var + (1 − decay) × var. (4)
where init_decay is the configured initial decay rate, num_update is the number of updates, var is the variable to be updated, and shadow_var is the updated value of the variable, also called the shadow variable.
It should be noted that the moving average model can be applied both in the model training process and in the model verification and evaluation process. In the training stage, a shadow variable can be maintained for each trainable weight and updated as the iterations proceed; in the verification and evaluation stage, the shadow variables can replace the real variable values for classification prediction.
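Formulas (3) and (4) translate directly into plain Python; tf.train.ExponentialMovingAverage applies the same schedule when constructed with a num_updates argument. This is a sketch of the arithmetic only, not the patent's code.

```python
def ema_decay(init_decay, num_update):
    """Formula (3): the effective decay rate rises toward init_decay."""
    return min(init_decay, (1.0 + num_update) / (10.0 + num_update))

def ema_update(shadow_var, var, decay):
    """Formula (4): shadow_var = decay * shadow_var + (1 - decay) * var."""
    return decay * shadow_var + (1.0 - decay) * var

# Early in training decay is small, so the shadow variables track the weights
# closely; later it approaches init_decay and they change only slowly.
```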
In some other implementations, the output layer component 16 of IMFNet network model 1 may also include a Softmax classifier and a loss function module. The loss function module can be used to add a parameter norm penalty term to the loss function with a parameter norm regularization method; any loss function may be used, for example the cross-entropy loss function, which this application does not limit. Parameter norm regularization restricts the size of the weights by adding a parameter norm penalty term to the objective function, so that the model cannot arbitrarily fit random noise in the training data, further optimizing the model and improving the target recognition accuracy of the IMFNet network model. Specifically, parameter norm regularization adds an index R(W) describing model complexity to the loss function J(W,b) and limits the size of the weights W by optimizing J(W,b) + λR(W), so that the model cannot arbitrarily fit random noise in the training data. Different parameter measures R(W) produce different regularization effects; commonly used parameter norm regularization methods include L1 norm regularization and L2 norm regularization. The present invention adopts L2 norm regularization, as shown in the following formula:
R(W) = ½·Σᵢ wᵢ²
In addition, the output layer component 16 of IMFNet network model 1 may also include a parameter optimization module, used to optimize the parameter weights of IMFNet network model 1 with the Adam algorithm. Gradient descent is performed with the mini-batch-based Adam optimization algorithm (adaptive moment estimation), and the iterative process can be implemented with the backpropagation algorithm; for example, the number of iterations can be set to 100,000, which helps further improve the target recognition accuracy of IMFNet network model 1.
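A minimal sketch of the regularized objective J(W,b) + λR(W) with the Adam optimizer in tf.keras; the value of λ is illustrative. tf.nn.l2_loss computes exactly ½Σw², matching the R(W) above.

```python
import tensorflow as tf

lam = 1e-4  # λ in J(W,b) + λ·R(W); the value is illustrative
xent = tf.keras.losses.CategoricalCrossentropy()

def regularized_loss(model, y_true, y_pred):
    """Cross-entropy plus the L2 parameter-norm penalty over the trainable weights."""
    r_w = tf.add_n([tf.nn.l2_loss(w) for w in model.trainable_weights])
    return xent(y_true, y_pred) + lam * r_w

optimizer = tf.keras.optimizers.Adam()  # mini-batch Adam, as in the patent
```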
Optionally, the existing high-resolution remote sensing sample images can be divided into training samples and test samples at a ratio of 4:1, generating a training sample set and a test sample set. The binary files of the training sample set can be used to train the IMFNet network model, and the test sample set can be used to test the trained IMFNet network model; during testing, test results can be loaded on the trained model every 10 s, completing the scene recognition and classification task for high-resolution remote sensing images.
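The 4:1 division is a plain shuffle-and-cut; a sketch (the sample structure is hypothetical):

```python
import random

def split_train_test(samples, seed=0):
    """Shuffle labelled samples and split them 4:1 into train and test sets."""
    rng = random.Random(seed)
    shuffled = samples[:]          # copy before shuffling
    rng.shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]
```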
In one implementation, model training and verification evaluation can be performed on two internationally used high-resolution remote sensing image data sets, the UC Merced data set and the SIRI-WHU data set. Each image in the UC Merced data set is 256×256 pixels with a spatial resolution of 30 cm; there are 21 image classes with 100 images each. Each image in the SIRI-WHU data set is 200×200 pixels with a spatial resolution of 2 m; there are 12 image classes with 200 images each. Finally, the accuracy, precision, recall and F1 evaluation metrics are used to evaluate the recognition results of the IMFNet network model. For a classification problem with M samples, of which P are positive examples and N are negative examples, the examples can be divided into true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN) according to the combination of the true class and the learner's predicted class, as shown in Table 1.
Table 1. The four cases of a classification problem

                     Predicted positive     Predicted negative
Actually positive    true positive (TP)     false negative (FN)
Actually negative    false positive (FP)    true negative (TN)
A true positive is a positive example judged positive, and a false negative is a positive example judged negative, so the number of positive examples P is:
P = TP + FN;
Similarly, a true negative is a negative example judged negative, and a false positive is a negative example judged positive, so the number of negative examples N is:
N = TN + FP;
Accuracy is the proportion of correctly judged examples, and is calculated as:
accuracy = (TP + TN)/(P + N);
Precision is the proportion of true positives among all examples judged positive, and is calculated as:
precision = TP/(TP + FP);
Recall is the proportion of positives judged positive among all positive examples, and is calculated as:
recall = TP/(TP + FN);
The F1 value is a combined evaluation metric of precision and recall, and is calculated as:
F1 = 2 × precision × recall/(precision + recall).
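The four metrics follow directly from the confusion counts of Table 1; a sketch:

```python
def evaluate(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from the confusion-matrix counts."""
    p, n = tp + fn, tn + fp          # numbers of positive / negative examples
    accuracy = (tp + tn) / (p + n)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```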
As can be seen from Fig. 4, the classification accuracy on the UC Merced data set reaches 92.14%, and that on the SIRI-WHU data set reaches 90.43%, which demonstrates that the technical solution provided by this application is feasible and achieves fairly high accuracy.
Embodiments of the present invention also provide a corresponding model generation method for the scene recognition system for high-resolution remote sensing images, further making the system more feasible. The model generation method for recognizing high-resolution remote sensing image scenes provided by the embodiments of the present invention is introduced below; the method described below and the system described above may be referred to in correspondence with each other.
Referring to Fig. 5, which is a schematic flowchart of a model generation method for recognizing high-resolution remote sensing image scenes provided by an embodiment of the present invention, the embodiment may include the following content:
S501: construct, in a pre-built training environment, the framework of an IMFNet network model for remote sensing image scene recognition.
The convolutional layers and pooling layers of the IMFNet network model alternate to extract shallow information from the input remote sensing image; each Inception module is connected to a pooling layer to extract high-level information from the remote sensing image; and the fully connected layers are cascaded so that their output features are concatenated and fed to the output layer component.
S502: train the IMFNet network model with the high-resolution remote sensing images of the training sample set until a preset end condition is met, obtaining the trained IMFNet network model.
It can be understood that the software and hardware environment must be set up before training the IMFNet network model. In one implementation, the hardware platform in this embodiment may be based on dual Intel E5 2665 processors, four GTX 1080Ti GPUs and 32 GB of memory. The software platform may be based on Ubuntu 16.04, using CUDA 8.0.61, cuDNN v6 and TensorFlow 1.4.0. TensorFlow is an open-source software library for high-performance numerical computation that can feed complex data structures into artificial neural networks for analysis and processing. With its flexible architecture, computation can easily be deployed to multiple platforms (such as CPU, GPU, TPU) and devices (desktops, servers, etc.), and it is widely used in machine and deep learning fields such as speech recognition and image recognition.
It should also be noted that the image format in the training or test sample set must match the computing software used in the software environment. For example, when TensorFlow is used, the images in the training or test sample set need to be converted to the TFRecord format, after which the binary files can be read through queues and multithreading. Optionally, single training sample images in the training sample set can be fed to the IMFNet network model in batches, with the batch size set to 64.
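A sketch of reading the TFRecord files in batches of 64. The patent used the queue-based readers of TensorFlow 1.4; this example uses the tf.data API that replaced them, and the feature keys, file name and raw-uint8 encoding are assumptions.

```python
import tensorflow as tf

FEATURES = {
    "image": tf.io.FixedLenFeature([], tf.string),  # hypothetical keys
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def parse(record):
    ex = tf.io.parse_single_example(record, FEATURES)
    image = tf.io.decode_raw(ex["image"], tf.uint8)
    image = tf.reshape(image, [256, 256, 3])
    return tf.cast(image, tf.float32), ex["label"]

dataset = (tf.data.TFRecordDataset(["train.tfrecord"])          # hypothetical file
           .map(parse, num_parallel_calls=tf.data.AUTOTUNE)     # parallel reads
           .shuffle(1024)
           .batch(64)                                           # batch size 64
           .prefetch(tf.data.AUTOTUNE))
```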
For the specific implementation of each step of the model generation method for recognizing high-resolution remote sensing image scenes described in the embodiments of the present invention, reference may be made to the descriptions of the corresponding functional modules in the system embodiment above, which are not repeated here.
As can be seen from the above, the embodiments of the present invention realize self-learning of the network model, guarantee the integrity of the feature information, and effectively improve the target recognition accuracy of the IMFNet network model.
Embodiments of the present invention also provide a model generation device for recognizing high-resolution remote sensing image scenes, which may specifically include:
a memory, used to store a computer program;
a processor, used to execute the computer program to implement the steps of the model generation method for recognizing high-resolution remote sensing image scenes described in any of the above embodiments.
The functions of the functional modules of the model generation device for recognizing high-resolution remote sensing image scenes described in the embodiments of the present invention can be implemented according to the methods in the above method embodiments; for the specific implementation process, reference may be made to the related descriptions of the above method embodiments, which are not repeated here.
As can be seen from the above, the embodiments of the present invention realize self-learning of the network model, guarantee the integrity of the feature information, and effectively improve the target recognition accuracy of the IMFNet network model.
Embodiments of the present invention also provide a computer-readable storage medium storing a model generation program for recognizing high-resolution remote sensing image scenes; when executed by a processor, the program implements the steps of the model generation method for recognizing high-resolution remote sensing image scenes described in any of the above embodiments.
The functions of the functional modules of the computer-readable storage medium in the embodiments of the present invention can be implemented according to the methods in the above method embodiments; for the specific implementation process, reference may be made to the related descriptions of the above method embodiments, which are not repeated here.
As can be seen from the above, the embodiments of the present invention realize self-learning of the network model, guarantee the integrity of the feature information, and effectively improve the target recognition accuracy of the IMFNet network model.
The embodiments in this specification are described in a progressive manner, each embodiment focusing on its differences from the others; for identical or similar parts, the embodiments may be referred to each other. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method part.
Professionals may further appreciate that the units and algorithm steps of the examples described in combination with the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of the examples have been described generally by function in the above description. Whether these functions are executed in hardware or in software depends on the specific application and design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in combination with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module can be placed in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The scene recognition system for high-resolution remote sensing images and the model generation method for recognizing high-resolution remote sensing image scenes provided by the present invention have been introduced in detail above. Specific examples are used herein to explain the principles and implementations of the present invention; the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. It should be pointed out that, for a person of ordinary skill in the art, several improvements and modifications can be made to the present invention without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of the present invention.

Claims (10)

  1. A scene recognition system for high-resolution remote sensing images, characterized by comprising an IMFNet network model for remote sensing image scene recognition, the IMFNet network model comprising a convolutional layer component, a pooling layer component, an Inception component and a fully connected layer component;
    each convolutional layer in the convolutional layer component alternating with a pooling layer of the pooling layer component, for extracting shallow information from an input remote sensing image;
    the Inception component comprising a plurality of Inception modules, each Inception module being connected to a pooling layer of the pooling layer component, for extracting high-level information from the remote sensing image;
    the fully connected layers of the fully connected layer component being cascaded, so that the output features of the fully connected layers are concatenated and then fed to an output layer component.
  2. The scene recognition system for high-resolution remote sensing images according to claim 1, characterized in that the IMFNet network model comprises 4 convolutional layers, 6 pooling layers, 2 Inception modules and 3 fully connected layers, and the size of the remote sensing image is 256×256×3.
  3. The scene recognition system for high-resolution remote sensing images according to claim 2, characterized in that the Inception component comprises a first Inception module and a second Inception module;
    the first Inception module comprises a first branch, a second branch, a third branch and a fourth branch; the first branch comprises a convolutional layer with a 1×1 kernel; the second branch consists of two convolutional layers with 1×1 and 5×5 kernels; the third branch consists of two convolutional layers with 1×1 and 3×3 kernels; the fourth branch consists of a pooling layer with stride 2 and a convolutional layer with a 1×1 kernel;
    the second Inception module comprises a first branch, a second branch, a third branch and a fourth branch; the first branch comprises a convolutional layer with a 1×1 kernel; the second branch comprises a first sub-branch consisting of a 1×1 convolutional layer and a second sub-branch consisting of two parallel convolutional layers with 1×5 and 5×1 kernels; the third branch consists of a 1×1 convolutional layer, a 5×1 convolutional layer and two parallel convolutional layers with 1×3 and 3×1 kernels; the fourth branch consists of a pooling layer with stride 2 and a convolutional layer with a 1×1 kernel.
  4. The scene recognition system for high-resolution remote sensing images according to any one of claims 1 to 3, characterized in that the IMFNet network model further comprises a data set amplification module, used to perform sample image amplification operations on a training sample set containing multiple shared high-resolution remote sensing images, the data set amplification module comprising:
    an annotation box labeling sub-module, used to generate a preset number of annotation boxes on a sample image;
    an image cropping sub-module, used to randomly crop the image portion within each annotation box on the sample image, to generate multiple sub-images whose contents are not identical;
    an image adjustment sub-module, used to resize each sub-image to the input image size of the IMFNet network model using a resizing algorithm.
  5. The scene recognition system for high-resolution remote sensing images according to claim 4, characterized in that the image adjustment sub-module further comprises a flipping unit and a normalization unit;
    the flipping unit is used to flip each sub-image according to a preset angle;
    the normalization unit is used to adjust the mean brightness of each sub-image to 0 and the variance to 1.
  6. The scene recognition system for high-resolution remote sensing images according to claim 4, characterized in that each fully connected layer of the fully connected layer component comprises a model optimization module, used to randomly drop multiple hidden units of the IMFNet network model using the Dropout algorithm.
  7. The scene recognition system for high-resolution remote sensing images according to claim 6, characterized in that the output layer component of the IMFNet network model comprises a Softmax classifier and a loss function module;
    the loss function module is used to add a parameter norm penalty term to the cross-entropy loss function of the IMFNet network model using a parameter norm regularization method.
  8. The scene recognition system for high-resolution remote sensing images according to claim 7, characterized in that the output layer component of the IMFNet network model further comprises a parameter update frequency control module, used to control the magnitude of variable updates of the IMFNet network model by continuously updating the decay rate with a pre-built moving average model.
  9. The scene recognition system for high-resolution remote sensing images according to claim 8, characterized in that the output layer component of the IMFNet network model further comprises a parameter optimization module, used to optimize the parameter weights of the IMFNet network model with the Adam algorithm.
  10. A model generation method for recognizing high-resolution remote sensing image scenes, characterized by comprising:
    constructing, in a pre-built training environment, the framework of an IMFNet network model for remote sensing image scene recognition, where the convolutional layers and pooling layers of the IMFNet network model alternate to extract shallow information from the input remote sensing image, each Inception module is connected to a pooling layer to extract high-level information from the remote sensing image, and the fully connected layers are cascaded so that their output features are concatenated and fed to the output layer component;
    training the IMFNet network model with the high-resolution remote sensing images of a training sample set until a preset end condition is met, to obtain a trained IMFNet network model.
PCT/CN2020/077889 2019-06-05 2020-03-05 Scene recognition system for high-resolution remote sensing images and model generation method WO2020244261A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910486629.4A 2019-06-05 Scene recognition system for high-resolution remote sensing images and model generation method
CN201910486629.4 2019-06-05

Publications (2)

Publication Number Publication Date
WO2020244261A1 true WO2020244261A1 (zh) 2020-12-10
WO2020244261A8 WO2020244261A8 (zh) 2021-04-08

Family

ID=67720514

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/077889 WO2020244261A1 (zh) Scene recognition system for high-resolution remote sensing images and model generation method

Country Status (2)

Country Link
CN (1) CN110188725A (zh)
WO (1) WO2020244261A1 (zh)

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111985487A (zh) * 2020-08-31 2020-11-24 香港中文大学(深圳) 一种遥感影像目标提取方法、电子设备及存储介质
CN112633112A (zh) * 2020-12-17 2021-04-09 中国人民解放军火箭军工程大学 一种基于融合卷积神经网络的sar图像目标检测方法
CN112686269A (zh) * 2021-01-18 2021-04-20 北京灵汐科技有限公司 池化方法、装置、设备和存储介质
CN112686139A (zh) * 2020-12-29 2021-04-20 西安电子科技大学 基于跨阶段局部多尺度密集连接的遥感图像目标检测方法
CN112712049A (zh) * 2021-01-11 2021-04-27 中国电子科技集团公司第十五研究所 一种小样本条件下的卫星影像舰船型号识别方法
CN112836788A (zh) * 2020-12-21 2021-05-25 中国电子科技集团公司第二十七研究所 一种用于干扰类型识别的低功耗深度学习网络方法
CN112950780A (zh) * 2021-03-12 2021-06-11 北京理工大学 一种基于遥感影像的网络地图智能生成方法及系统
CN112966579A (zh) * 2021-02-24 2021-06-15 湖南三湘绿谷生态科技有限公司 一种基于无人机遥感的大面积油茶林快速估产方法
CN112990085A (zh) * 2021-04-08 2021-06-18 海南长光卫星信息技术有限公司 养殖塘变化检测方法、装置及计算机可读存储介质
CN113065511A (zh) * 2021-04-21 2021-07-02 河南大学 基于深度学习的遥感图像飞机检测模型及方法
CN113109782A (zh) * 2021-04-15 2021-07-13 中国人民解放军空军航空大学 一种直接应用于雷达辐射源幅度序列的新型分类方法
CN113158807A (zh) * 2021-03-24 2021-07-23 中科北纬(北京)科技有限公司 一种遥感影像的模型自训练和优化系统
CN113239736A (zh) * 2021-04-16 2021-08-10 广州大学 一种基于多源遥感数据的土地覆盖分类标注图获取方法、存储介质及系统
CN113298095A (zh) * 2021-06-23 2021-08-24 成都天巡微小卫星科技有限责任公司 一种基于卫星遥感的高精度路网密度提取方法及系统
CN113326779A (zh) * 2021-05-31 2021-08-31 中煤科工集团沈阳研究院有限公司 一种井下巷道积水检测识别方法
CN113359135A (zh) * 2021-07-07 2021-09-07 中国人民解放军空军工程大学 一种成像及识别模型的训练方法、应用方法、装置及介质
CN113469072A (zh) * 2021-07-06 2021-10-01 西安电子科技大学 基于GSoP和孪生融合网络的遥感图像变化检测方法及系统
CN113516600A (zh) * 2021-06-02 2021-10-19 航天东方红卫星有限公司 一种基于特征自适应校正的遥感图像薄云去除方法
CN113642456A (zh) * 2021-08-11 2021-11-12 福州大学 一种基于拼图引导的深度特征融合的高分辨率遥感图像场景分类方法
CN113689399A (zh) * 2021-08-23 2021-11-23 长安大学 一种用于电网识别遥感图像处理方法及系统
CN113723411A (zh) * 2021-06-18 2021-11-30 湖北工业大学 一种用于遥感图像语义分割的特征提取方法和分割系统
CN114005046A (zh) * 2021-11-04 2022-02-01 长安大学 基于Gabor滤波器和协方差池化的遥感场景分类方法
CN114022356A (zh) * 2021-10-29 2022-02-08 长视科技股份有限公司 基于小波域的河道流量水位遥感图像超分辨率方法与系统
CN114049519A (zh) * 2021-11-17 2022-02-15 江西航天鄱湖云科技有限公司 一种光学遥感图像场景分类方法
CN114066831A (zh) * 2021-11-04 2022-02-18 北京航空航天大学 一种基于两阶段训练的遥感图像镶嵌质量无参考评价方法
CN114373120A (zh) * 2021-03-25 2022-04-19 河北地质大学 一种多尺度空间融合高光谱土壤重金属污染识别评价方法
CN114511573A (zh) * 2021-12-29 2022-05-17 电子科技大学 一种基于多层级边缘预测的人体解析模型及方法
CN114581861A (zh) * 2022-03-02 2022-06-03 北京交通大学 一种基于深度学习卷积神经网络的轨道区域识别方法
CN116485652A (zh) * 2023-04-26 2023-07-25 北京卫星信息工程研究所 遥感影像车辆目标检测的超分辨率重建方法
CN116561536A (zh) * 2023-07-11 2023-08-08 中南大学 一种滑坡隐患的识别方法、终端设备及介质

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110188725A (zh) 2019-06-05 2019-08-30 中国科学院长春光学精密机械与物理研究所 Scene recognition system for high-resolution remote sensing images and model generation method
CN111259734B (zh) * 2020-01-08 2023-06-16 深圳市彬讯科技有限公司 空间户型识别方法、装置、计算机设备及存储介质
CN111815627B (zh) * 2020-08-24 2020-12-01 成都睿沿科技有限公司 遥感图像变化检测方法、模型训练方法及对应装置
CN113674252A (zh) * 2021-08-25 2021-11-19 上海鹏冠生物医药科技有限公司 一种基于图神经网络的组织细胞病理图像诊断系统

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109299733A (zh) * 2018-09-12 2019-02-01 江南大学 利用紧凑型深度卷积神经网络进行图像识别的方法
CN109344883A (zh) * 2018-09-13 2019-02-15 西京学院 一种基于空洞卷积的复杂背景下果树病虫害识别方法
CN109657584A (zh) * 2018-12-10 2019-04-19 长安大学 辅助驾驶的改进LeNet-5融合网络交通标志识别方法
US20190122111A1 (en) * 2017-10-24 2019-04-25 Nec Laboratories America, Inc. Adaptive Convolutional Neural Knowledge Graph Learning System Leveraging Entity Descriptions
CN110188725A (zh) * 2019-06-05 2019-08-30 中国科学院长春光学精密机械与物理研究所 Scene recognition system for high-resolution remote sensing images and model generation method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9959647B1 (en) * 2015-09-08 2018-05-01 National Technology & Engineering Solutions Of Sandia, Llc Representation of activity in images using geospatial temporal graphs
CN106250931A (zh) * 2016-08-03 2016-12-21 武汉大学 一种基于随机卷积神经网络的高分辨率图像场景分类方法
CN109165682B (zh) * 2018-08-10 2020-06-16 中国地质大学(武汉) 一种融合深度特征和显著性特征的遥感图像场景分类方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190122111A1 (en) * 2017-10-24 2019-04-25 Nec Laboratories America, Inc. Adaptive Convolutional Neural Knowledge Graph Learning System Leveraging Entity Descriptions
CN109299733A (zh) * 2018-09-12 2019-02-01 江南大学 利用紧凑型深度卷积神经网络进行图像识别的方法
CN109344883A (zh) * 2018-09-13 2019-02-15 西京学院 一种基于空洞卷积的复杂背景下果树病虫害识别方法
CN109657584A (zh) * 2018-12-10 2019-04-19 长安大学 辅助驾驶的改进LeNet-5融合网络交通标志识别方法
CN110188725A (zh) * 2019-06-05 2019-08-30 中国科学院长春光学精密机械与物理研究所 Scene recognition system for high-resolution remote sensing images and model generation method

Cited By (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111985487B (zh) * 2020-08-31 2024-03-19 香港中文大学(深圳) 一种遥感影像目标提取方法、电子设备及存储介质
CN111985487A (zh) * 2020-08-31 2020-11-24 香港中文大学(深圳) 一种遥感影像目标提取方法、电子设备及存储介质
CN112633112A (zh) * 2020-12-17 2021-04-09 中国人民解放军火箭军工程大学 一种基于融合卷积神经网络的sar图像目标检测方法
CN112836788A (zh) * 2020-12-21 2021-05-25 中国电子科技集团公司第二十七研究所 一种用于干扰类型识别的低功耗深度学习网络方法
CN112836788B (zh) * 2020-12-21 2022-12-27 中国电子科技集团公司第二十七研究所 一种用于干扰类型识别的低功耗深度学习网络方法
CN112686139B (zh) * 2020-12-29 2024-02-09 西安电子科技大学 基于跨阶段局部多尺度密集连接的遥感图像目标检测方法
CN112686139A (zh) * 2020-12-29 2021-04-20 西安电子科技大学 基于跨阶段局部多尺度密集连接的遥感图像目标检测方法
CN112712049A (zh) * 2021-01-11 2021-04-27 中国电子科技集团公司第十五研究所 一种小样本条件下的卫星影像舰船型号识别方法
CN112712049B (zh) * 2021-01-11 2023-01-17 中国电子科技集团公司第十五研究所 一种小样本条件下的卫星影像舰船型号识别方法
CN112686269A (zh) * 2021-01-18 2021-04-20 北京灵汐科技有限公司 池化方法、装置、设备和存储介质
CN112966579A (zh) * 2021-02-24 2021-06-15 湖南三湘绿谷生态科技有限公司 一种基于无人机遥感的大面积油茶林快速估产方法
CN112950780A (zh) * 2021-03-12 2021-06-11 北京理工大学 一种基于遥感影像的网络地图智能生成方法及系统
CN112950780B (zh) 2022-09-06 北京理工大学 一种基于遥感影像的网络地图智能生成方法及系统
CN113158807B (zh) 2024-02-09 中科北纬(北京)科技有限公司 一种遥感影像的模型自训练和优化系统
CN113158807A (zh) * 2021-03-24 2021-07-23 中科北纬(北京)科技有限公司 一种遥感影像的模型自训练和优化系统
CN114373120B (zh) * 2021-03-25 2023-05-23 河北地质大学 一种多尺度空间融合高光谱土壤重金属污染识别评价方法
CN114373120A (zh) * 2021-03-25 2022-04-19 河北地质大学 一种多尺度空间融合高光谱土壤重金属污染识别评价方法
CN112990085A (zh) * 2021-04-08 2021-06-18 海南长光卫星信息技术有限公司 养殖塘变化检测方法、装置及计算机可读存储介质
CN113109782A (zh) * 2021-04-15 2021-07-13 中国人民解放军空军航空大学 一种直接应用于雷达辐射源幅度序列的新型分类方法
CN113109782B (zh) * 2021-04-15 2023-08-15 中国人民解放军空军航空大学 一种直接应用于雷达辐射源幅度序列的分类方法
CN113239736A (zh) * 2021-04-16 2021-08-10 广州大学 一种基于多源遥感数据的土地覆盖分类标注图获取方法、存储介质及系统
CN113239736B (zh) * 2021-04-16 2023-06-06 广州大学 一种基于多源遥感数据的土地覆盖分类标注图获取方法
CN113065511A (zh) * 2021-04-21 2021-07-02 河南大学 基于深度学习的遥感图像飞机检测模型及方法
CN113065511B (zh) * 2021-04-21 2024-02-02 河南大学 基于深度学习的遥感图像飞机检测模型及方法
CN113326779B (zh) * 2021-05-31 2024-03-22 中煤科工集团沈阳研究院有限公司 一种井下巷道积水检测识别方法
CN113326779A (zh) * 2021-05-31 2021-08-31 中煤科工集团沈阳研究院有限公司 一种井下巷道积水检测识别方法
CN113516600A (zh) * 2021-06-02 2021-10-19 航天东方红卫星有限公司 一种基于特征自适应校正的遥感图像薄云去除方法
CN113516600B (zh) * 2021-06-02 2024-03-19 航天东方红卫星有限公司 一种基于特征自适应校正的遥感图像薄云去除方法
CN113723411B (zh) 2023-06-27 湖北工业大学 一种用于遥感图像语义分割的特征提取方法和分割系统
CN113723411A (zh) * 2021-06-18 2021-11-30 湖北工业大学 一种用于遥感图像语义分割的特征提取方法和分割系统
CN113298095A (zh) * 2021-06-23 2021-08-24 成都天巡微小卫星科技有限责任公司 一种基于卫星遥感的高精度路网密度提取方法及系统
CN113469072A (zh) * 2021-07-06 2021-10-01 西安电子科技大学 基于GSoP和孪生融合网络的遥感图像变化检测方法及系统
CN113469072B (zh) 2024-04-12 西安电子科技大学 基于GSoP和孪生融合网络的遥感图像变化检测方法及系统
CN113359135A (zh) * 2021-07-07 2021-09-07 中国人民解放军空军工程大学 一种成像及识别模型的训练方法、应用方法、装置及介质
CN113359135B (zh) * 2021-07-07 2023-08-22 中国人民解放军空军工程大学 一种成像及识别模型的训练方法、应用方法、装置及介质
CN113642456A (zh) * 2021-08-11 2021-11-12 福州大学 一种基于拼图引导的深度特征融合的高分辨率遥感图像场景分类方法
CN113642456B (zh) * 2021-08-11 2023-08-11 福州大学 基于拼图引导的深度特征融合的遥感图像场景分类方法
CN113689399A (zh) * 2021-08-23 2021-11-23 长安大学 一种用于电网识别遥感图像处理方法及系统
CN113689399B (zh) 2024-05-31 国网宁夏电力有限公司石嘴山供电公司 一种用于电网识别遥感图像处理方法及系统
CN114022356A (zh) * 2021-10-29 2022-02-08 长视科技股份有限公司 基于小波域的河道流量水位遥感图像超分辨率方法与系统
CN114066831A (zh) * 2021-11-04 2022-02-18 北京航空航天大学 一种基于两阶段训练的遥感图像镶嵌质量无参考评价方法
CN114005046A (zh) * 2021-11-04 2022-02-01 长安大学 基于Gabor滤波器和协方差池化的遥感场景分类方法
CN114049519A (zh) * 2021-11-17 2022-02-15 江西航天鄱湖云科技有限公司 一种光学遥感图像场景分类方法
CN114511573A (zh) * 2021-12-29 2022-05-17 电子科技大学 一种基于多层级边缘预测的人体解析模型及方法
CN114581861B (zh) * 2022-03-02 2023-05-23 北京交通大学 一种基于深度学习卷积神经网络的轨道区域识别方法
CN114581861A (zh) * 2022-03-02 2022-06-03 北京交通大学 一种基于深度学习卷积神经网络的轨道区域识别方法
CN116485652A (zh) * 2023-04-26 2023-07-25 北京卫星信息工程研究所 遥感影像车辆目标检测的超分辨率重建方法
CN116485652B (zh) * 2023-04-26 2024-03-01 北京卫星信息工程研究所 遥感影像车辆目标检测的超分辨率重建方法
CN116561536B (zh) * 2023-07-11 2023-11-21 中南大学 一种滑坡隐患的识别方法、终端设备及介质
CN116561536A (zh) * 2023-07-11 2023-08-08 中南大学 一种滑坡隐患的识别方法、终端设备及介质

Also Published As

Publication number Publication date
WO2020244261A8 (zh) 2021-04-08
CN110188725A (zh) 2019-08-30

Similar Documents

Publication Publication Date Title
WO2020244261A1 (zh) Scene recognition system for high-resolution remote sensing images and model generation method
US20210042580A1 (en) Model training method and apparatus for image recognition, network device, and storage medium
WO2018052587A1 (en) Method and system for cell image segmentation using multi-stage convolutional neural networks
CN110929610B (zh) 基于CNN模型和迁移学习的植物病害识别方法及系统
CN109299716A (zh) 神经网络的训练方法、图像分割方法、装置、设备及介质
CN110222718B (zh) 图像处理的方法及装置
Nawaz et al. AI-based object detection latest trends in remote sensing, multimedia and agriculture applications
WO2021218470A1 (zh) 一种神经网络优化方法以及装置
WO2021042857A1 (zh) 图像分割模型的处理方法和处理装置
WO2023040147A1 (zh) 神经网络的训练方法及装置、存储介质和计算机程序
CN114898359B (zh) 一种基于改进EfficientDet的荔枝病虫害检测方法
Luan et al. Sunflower seed sorting based on convolutional neural network
Liu et al. Image classification method on class imbalance datasets using multi-scale CNN and two-stage transfer learning
Andrei-Alexandru et al. Low cost defect detection using a deep convolutional neural network
CN114882278A (zh) 一种基于注意力机制和迁移学习的轮胎花纹分类方法和装置
Haque et al. Image-based identification of maydis leaf blight disease of maize (Zea mays) using deep learning
CN110751091A (zh) 静态图像行为识别的卷积神经网络模型
CN117372881B (zh) 一种烟叶病虫害智能识别方法、介质及系统
Sardeshmukh et al. Crop image classification using convolutional neural network
CN114758190A (zh) 训练图像识别模型的方法、图像识别方法、装置和农机
Cosovic et al. Cultural heritage image classification
Swaminathan et al. D2CNN: Double-staged deep CNN for stress identification and classification in cropping system
Yu Research progress of crop disease image recognition based on wireless network communication and deep learning
Ashiquzzaman et al. Applying data augmentation to handwritten arabic numeral recognition using deep learning neural networks
Li et al. MLP-Mixer Approach for corn leaf diseases classification

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20818279; Country of ref document: EP; Kind code of ref document: A1)
NENP: Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20818279; Country of ref document: EP; Kind code of ref document: A1)