CN112364699A - Remote sensing image segmentation method, device and medium based on weighted loss fusion network - Google Patents

Remote sensing image segmentation method, device and medium based on weighted loss fusion network

Info

Publication number
CN112364699A
Authority
CN
China
Prior art keywords
network
loss
remote sensing
sensing image
training
Prior art date
Legal status (assumption only; not a legal conclusion)
Pending
Application number
CN202011097624.1A
Other languages
Chinese (zh)
Inventor
颜军
张永军
刘文杰
邓剑文
吴明朗
郑忠良
郝梦
Current Assignee (the listed assignees may be inaccurate)
Guangdong Obit Artificial Intelligence Research Institute Co ltd
Zhuhai Orbita Aerospace Technology Co ltd
Guizhou University
Original Assignee
Guangdong Obit Artificial Intelligence Research Institute Co ltd
Zhuhai Orbita Aerospace Technology Co ltd
Guizhou University
Priority date (assumption only; not a legal conclusion)
Filing date
Publication date
Application filed by Guangdong Obit Artificial Intelligence Research Institute Co ltd, Zhuhai Orbita Aerospace Technology Co ltd and Guizhou University
Priority to CN202011097624.1A
Publication of CN112364699A
Legal status: Pending (current)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/10 - Terrestrial scenes
    • G06V20/176 - Urban or other man-made structures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464 - Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20016 - Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a remote sensing image segmentation method, device and medium based on a weighted loss fusion network, comprising the following steps: preprocessing the remote sensing image to obtain training data; constructing a convolutional neural network comprising an encoder with multi-channel training branches, and performing context extraction on the training data through the encoder; constructing a double pyramid module, extracting feature maps of the training data through the double pyramid module, and outputting the corresponding feature maps; up-sampling the obtained feature maps to obtain feature maps of different sizes and fusing them; and constructing a perceptual loss network, calculating the perceptual loss, weight-fusing it with the training loss, and back-propagating the fused loss to the training network to update the parameters. The invention has the beneficial effects that the network extracts high-quality deep features and scale features, avoids the loss of spatial features to the greatest extent, effectively improves the segmentation of remote sensing images, and trains and fits faster.

Description

Remote sensing image segmentation method, device and medium based on weighted loss fusion network
Technical Field
The invention relates to the fields of remote sensing imagery and deep learning, and in particular to a remote sensing image segmentation method, device and medium based on a weighted loss fusion network.
Background
With the development of remote sensing technology, the data volume of remote sensing images keeps growing and their resolution keeps increasing. Because remote sensing images contain a large amount of information, they have many applications, including target detection, scene classification and semantic segmentation, and these applications are increasingly diversified: city planning, building extraction, road extraction, vehicle detection and illegal-building extraction. All of these fields demand high segmentation quality, and although many segmentation methods for remote sensing images exist, their segmentation effect still needs to be improved.
Semantic segmentation of remote sensing images is a research hotspot, and with the development of deep learning, semantic segmentation based on fully convolutional neural networks has greatly improved segmentation precision. Remote sensing images carry a large amount of information, but the amount of data per class of sample is extremely uneven. Common networks can therefore segment remote sensing images to a certain extent, yet there remains much room to improve segmentation precision: such networks are deepened to raise classification accuracy, at the cost of a heavy loss of target spatial features and scale features.
Disclosure of Invention
The invention aims to solve at least one of the technical problems in the prior art, and provides a remote sensing image segmentation method, device and medium based on a weighted loss fusion network that, for high-precision remote sensing images, avoid the loss of spatial features and effectively improve the segmentation effect.
The technical scheme of the invention comprises a remote sensing image segmentation method based on a weighted loss fusion network, comprising the following steps: S100, preprocessing the remote sensing image to obtain training data for a convolutional neural network; S200, constructing a convolutional neural network comprising an encoder with multi-channel training branches, and performing context extraction on the training data through the encoder; S300, constructing a double pyramid module, extracting feature maps of the training data through its two groups of spatial pyramids with different convolution expansion rates, and outputting corresponding first feature maps; S400, up-sampling the feature maps obtained in step S300 to obtain second feature maps of different sizes, and fusing the first feature maps with the second feature maps; S500, constructing a perceptual loss network, calculating the loss of the fused feature map through this loss network, weight-fusing the calculated loss with the loss of the training network, and back-propagating it to update the parameters.
According to the remote sensing image segmentation method based on the weighted loss fusion network, the preprocessing in S100 comprises: performing image cutting, normalization and data expansion on the remote sensing image, wherein the data expansion operations are flips in the horizontal and vertical directions, respectively.
According to the remote sensing image segmentation method based on the weighted loss fusion network, S200 comprises: the convolutional neural network uses Inception V-4 pre-trained on the ImageNet dataset as the network backbone, removes the average pooling layer at its end, and adds network branches to the backbone to obtain an encoder with multi-channel training branches, wherein the network branches are used to retain shallow image features.
According to the remote sensing image segmentation method based on the weighted loss fusion network, S200 comprises the following steps: removing the average pooling layer and all subsequent layers, establishing a separate network branch for the output after the Reduction-A layer, and fusing the branch with the end of the trunk network after a 2 × 2 max pooling layer.
According to the remote sensing image segmentation method based on the weighted loss fusion network, S300 comprises the following steps: S310, establishing the first spatial pyramid behind the Stem module as a parallel training network, extracting the first-stage multi-scale features of the training data with the first spatial pyramid, fusing the feature maps of the five branches of the ASPP1 module into a corresponding fused feature map, and applying a 1 × 1 convolution to the fused feature map; S320, performing a 4 × 4 max pooling operation to reduce the feature map size; S330, constructing the second spatial pyramid at the Reduction-A module of the Inception V-4 network, receiving the corresponding feature map through the ASPP2 module, fusing the four branches of the ASPP2 module, outputting the feature map through a 1 × 1 convolution layer, and adding a 2 × 2 max pooling layer after the convolution layer to obtain the feature map.
According to the remote sensing image segmentation method based on the weighted loss fusion network, the expansion rates of the first spatial pyramid and the second spatial pyramid are set to [1, 6, 12, 18] and [1, 6, 12], respectively.
According to the remote sensing image segmentation method based on the weighted loss fusion network, S400 comprises: recovering the image size through four groups of convolution-block up-sampling modules and obtaining the classification prediction result through Softmax.
According to the remote sensing image segmentation method based on the weighted loss fusion network, S500 comprises: constructing a perceptual loss network using a pre-trained VGG16 network, feeding the prediction map produced by the segmentation network into the loss network to calculate a loss, then weight-fusing this loss with the loss calculated by the segmentation network and back-propagating it to update the parameters.
The technical scheme of the invention also comprises a remote sensing image segmentation device based on the weighted loss fusion network, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements any of the above method steps when executing the computer program.
The technical solution of the present invention further includes a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the computer program implements any of the above method steps.
The invention has the beneficial effects that the network extracts high-quality deep features and scale features, avoids the loss of spatial features to the greatest extent, effectively improves the segmentation of remote sensing images, and achieves good results on the ISPRS 2D dataset.
Drawings
The invention is further described below with reference to the accompanying drawings and examples.
FIG. 1 shows a general flow diagram according to an embodiment of the invention.
Fig. 2 is an overall architecture diagram according to an embodiment of the present invention.
Fig. 3 is a diagram of a training network architecture according to an embodiment of the present invention.
Fig. 4 is a diagram of the Inception V-4 backbone network architecture according to an embodiment of the present invention.
Fig. 5 is a structural view of a spatial pyramid according to an embodiment of the present invention.
FIG. 6 is a comparison of predicted results on a Vaihingen validation set, according to an embodiment of the present invention.
FIG. 7 is a diagram of a media device according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to the present preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout.
In the description of the present invention, "several" means one or more and "a plurality of" means two or more; "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, while "above", "below", "within" and the like are understood as including the stated number.
In the description of the present invention, the consecutive numbering of the method steps is only for convenience of description and understanding; in combination with the technical solution as a whole and the logical relationships between the steps, the order of implementation of the steps may be adjusted without affecting the technical effect achieved.
FIG. 1 shows a general flow diagram according to an embodiment of the invention. The process comprises the following steps: S100, preprocessing the remote sensing image to obtain training data for a convolutional neural network; S200, constructing a convolutional neural network comprising an encoder with multi-channel training branches, and performing context extraction on the training data through the encoder; S300, constructing a double pyramid module, extracting feature maps of the training data through its two groups of spatial pyramids with different convolution expansion rates, and outputting corresponding first feature maps; S400, up-sampling the feature maps obtained in step S300 to obtain second feature maps of different sizes, and fusing the first feature maps with the second feature maps; S500, constructing a perceptual loss network, calculating the loss of the fused feature map through this loss network, weight-fusing the calculated loss with the loss of the training network, and back-propagating it to update the parameters.
For the preprocessing, the present invention provides the following embodiments:
the validity of the method is verified using the real data. The experiment was trained using the ISPRS 2D Semantic Label control Vaihingen dataset.
The Vaihingen dataset is a high-resolution aerial image dataset with complete semantic labels, including a true orthophoto (TOP) and a digital surface model (DSM). The image files are composed of different channels in the IRRG (IR-R-G, 3-channel) image format. Here, only the TOP IRRG images are used for training. The dataset comprises 16 labeled image tiles of different sizes, and the labels are divided into six classes: Impervious Surfaces, Building, Low Vegetation, Tree, Car and background.
The 16 images with semantic labels are cut into patches of size 299 × 299. Considering the depth of the network, this amount of data is too small to provide enough feature information, so data expansion is performed: the images are flipped in the horizontal and vertical directions and then rotated, yielding 14824 images of size 299 × 299. 75% of the total samples are randomly selected as the training set, 20% as the test set, and the remainder as the validation set.
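This tiling and expansion can be sketched as follows (a minimal illustration with PIL; `top_irrg_paths`, the 90-degree rotation and the resulting expansion factor are assumptions, since the text does not state them):

```python
import random
from PIL import Image

PATCH = 299  # network input size

top_irrg_paths = []  # hypothetical: paths to the 16 labeled TOP IRRG tiles

def cut_patches(image):
    """Cut one labeled tile into non-overlapping 299 x 299 patches."""
    w, h = image.size
    return [image.crop((x, y, x + PATCH, y + PATCH))
            for y in range(0, h - PATCH + 1, PATCH)
            for x in range(0, w - PATCH + 1, PATCH)]

def expand(patch):
    """Data expansion: horizontal flip, vertical flip, then rotation."""
    flipped = [patch,
               patch.transpose(Image.FLIP_LEFT_RIGHT),   # horizontal flip
               patch.transpose(Image.FLIP_TOP_BOTTOM)]   # vertical flip
    return flipped + [p.rotate(90) for p in flipped]     # rotation angle assumed

tiles = [Image.open(p) for p in top_irrg_paths]
samples = [a for t in tiles for p in cut_patches(t) for a in expand(p)]
random.shuffle(samples)

# Random 75% / 20% / remainder split into training, test and validation sets.
n = len(samples)
train = samples[:int(0.75 * n)]
test = samples[int(0.75 * n):int(0.95 * n)]
val = samples[int(0.95 * n):]
```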
Because the training data is huge and computing power is limited, the Adam algorithm is selected for training optimization so that the model converges more quickly.
Fig. 2 is an overall architecture diagram according to an embodiment of the present invention: a perceptual loss network is added alongside the training network, and while the training network trains, the perceptual loss is calculated, weight-fused with the training loss, and back-propagated to the training network.
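A rough PyTorch sketch of this scheme (not the patent's exact implementation): `seg_net` stands for the segmentation network built in the steps below, a frozen pre-trained VGG16 acts as the perceptual loss network per step 5, and the fusion weight `alpha`, the VGG layer cut-off and the 3-channel projection are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen, pre-trained VGG16 features act as the perceptual loss network
# (newer torchvision versions use the weights= argument instead).
perc_net = vgg16(pretrained=True).features[:16].to(device).eval()
for p in perc_net.parameters():
    p.requires_grad = False

seg_loss_fn = nn.CrossEntropyLoss()  # training (segmentation) loss
alpha = 0.1                          # fusion weight for the perceptual loss (assumption)

def to_vgg_input(class_maps):
    """Map an N x C x H x W class-probability map to 3 channels for VGG (assumption)."""
    c = class_maps.shape[1]
    return class_maps[:, :3] if c >= 3 else class_maps.repeat(1, 3, 1, 1)

def train_step(seg_net, optimizer, images, labels):
    optimizer.zero_grad()
    logits = seg_net(images)               # prediction map from the training network
    train_loss = seg_loss_fn(logits, labels)
    # Perceptual loss: distance between VGG features of the prediction map
    # and of the one-hot ground-truth map.
    one_hot = F.one_hot(labels, logits.shape[1]).permute(0, 3, 1, 2).float()
    perc_loss = F.mse_loss(perc_net(to_vgg_input(logits.softmax(1))),
                           perc_net(to_vgg_input(one_hot)))
    # Weighted fusion of the two losses, back-propagated to the training network.
    loss = train_loss + alpha * perc_loss
    loss.backward()
    optimizer.step()
    return loss.item()

# optimizer = torch.optim.Adam(seg_net.parameters(), lr=1e-4)  # Adam, as chosen above
```

Because the VGG16 features are frozen, gradients from the fused loss flow only into `seg_net`, which is how the weighted loss is back-propagated to the training network.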
Fig. 3 is a diagram of a neural network architecture according to an embodiment of the present invention. The flow of the network structure is as follows:
step 1: and preprocessing the remote sensing image, including image cutting, normalization and data expansion, wherein the operations of the data expansion are respectively turning from the horizontal direction and the vertical direction.
This step provides the data for network training. After data expansion, overfitting can be avoided as training deepens, and a large amount of information is provided so that the network can better extract features.
Step 2: remove the average pooling layer and all subsequent layers to adapt the network to the semantic segmentation task, then establish a separate network branch for the output after the Reduction-A layer, and fuse the branch with the end of the trunk network after a 2 × 2 max pooling layer.
This step modifies the backbone network, avoiding the loss of shallow features caused by excessive network depth and enabling the network to learn image features better.
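A schematic of this backbone surgery (a sketch, not the patent's exact code), assuming an Inception V-4 implementation whose layers can be grouped into a part ending at Reduction-A and a part after it; the names `front` and `back` are placeholders for those groupings:

```python
import torch
import torch.nn as nn

class MultiBranchEncoder(nn.Module):
    """Inception V-4 trunk with the terminal average pooling (and all later
    layers) removed, plus a shallow-feature branch taken after Reduction-A."""
    def __init__(self, front, back):
        super().__init__()
        self.front = front                  # Stem ... Reduction-A (placeholder grouping)
        self.back = back                    # Inception-B ... Inception-C, avg pool removed
        self.branch_pool = nn.MaxPool2d(2)  # 2 x 2 max pooling on the branch

    def forward(self, x):
        mid = self.front(x)                 # output after the Reduction-A layer
        deep = self.back(mid)               # end of the trunk network
        shallow = self.branch_pool(mid)     # branch retaining shallower features
        # Fuse the branch with the trunk end along channels; for 299 x 299
        # inputs the 17 x 17 mid map pooled 2 x 2 gives 8 x 8, matching deep.
        return torch.cat([deep, shallow], dim=1)
```

Usage would be `encoder = MultiBranchEncoder(front, back)`, where `front` and `back` come from splitting a concrete Inception V-4 implementation around its Reduction-A stage.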
Step 3: construct the double pyramid module from two groups of spatial pyramids with different expansion rates. The convolution expansion rates of the first group of pyramid modules are 1, 6, 12 and 18, and its output feature map is fused with the output of the second group of up-sampling convolution blocks; the convolution expansion rates of the second group are 1, 6 and 12, and its output feature map is fused with the output of the first group of up-sampling convolution blocks.
In this step, introducing two groups of spatial pyramids allows the scale features of the target to be extracted better, so classification precision improves greatly. This differs from the usual application of a spatial pyramid: a common network simply appends the module to the end of the decoder network, which deepens the network in a certain sense and causes some feature loss. The method instead connects the double pyramid module to different stages of the backbone network and then fuses it with the corresponding layers of the up-sampling module, taking into account both the retention of shallow features and the extraction of more scale features.
Step 4: design the up-sampling module, which consists of four groups of convolution blocks and gradually restores the image size; finally, the classification prediction result is obtained through Softmax.
Step 5: while the features of the training network are being extracted, calculate the perceptual loss through the perceptual loss network, weight-fuse it with the loss of the training network, and finally back-propagate the result to the training network.
These steps restore the feature size by up-sampling, complete the final segmentation prediction, and update the network parameters through the fused loss.
As a semantic segmentation task, IoU and F1 are used as the evaluation indexes; the formulas are as follows:
IoU = TP / (TP + FP + FN)

F1 = 2TP / (2TP + FP + FN)
where P is the number of positive samples, N is the number of negative samples, TP is the number of correctly predicted positive samples, FP is the number of samples incorrectly predicted as positive, TN is the number of correctly predicted negative samples, FN is the number of samples incorrectly predicted as negative, and the number of samples is the number of pixels per picture.
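Both indexes follow directly from per-pixel counts; a small sketch for one class, with boolean prediction and ground-truth masks, using the formulas above:

```python
import numpy as np

def iou_f1(pred, gt):
    """Per-class IoU and F1 from boolean pixel masks of equal shape."""
    tp = np.logical_and(pred, gt).sum()    # correctly predicted positive pixels
    fp = np.logical_and(pred, ~gt).sum()   # pixels wrongly predicted positive
    fn = np.logical_and(~pred, gt).sum()   # positive pixels that were missed
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1
```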
Fig. 4 is a diagram of the Inception V-4 backbone network architecture according to an embodiment of the present invention. The network mainly comprises a down-sampling module and an up-sampling module: the down-sampling module consists of a backbone based on the Inception V-4 network and the double pyramid module, and the up-sampling module consists of four groups of convolution blocks.
Fig. 5 is a structural diagram of the spatial pyramid according to an embodiment of the present invention. Based on this structure and the embodiment of fig. 3, the segmentation process is as follows:
step 1: the method comprises the steps of taking an increment V-4 network as a backbone, abandoning the last Average Power, Drapout and Softmax, then constructing a branch of the network, combining a feature map of an increment-A module with an output feature map of an increment-C module to form an encoder with a multi-channel training branch, and fully extracting context information of the network.
Step 2: add the double pyramid module and then fuse the features to form the encoder module. The first pyramid module is established after the Stem module to form a parallel training network: a 35 × 35 × 384 feature map is fed into it, atrous convolutions with sampling rates of 1, 6, 12 and 18 fully extract the first-stage multi-scale features, the feature maps of the five branches of the ASPP1 module are then fused into a 35 × 35 × 256 feature map, and a 1 × 1 convolution is applied to the fused map. To match the size of the feature map to be fused, a 4 × 4 max pooling operation is added after the 1 × 1 convolution, reducing the feature map size to 32 × 32. The second pyramid module is established after the Reduction-A module of the Inception V-4 network; the feature map received by the ASPP2 module is 17 × 17 × 1024, and to match the feature map size the convolution sampling rates in the ASPP2 module are set to 1, 6, 8 and 12. The four branches of the ASPP2 module are then fused together and output as a 17 × 17 × 512 feature map through a 1 × 1 convolution layer, after which a 2 × 2 max pooling layer is added to match the size of the fused feature map, giving a 16 × 16 × 512 feature map. After the double pyramid module finishes training, the feature maps are fused with the correspondingly sized feature maps in the decoder.
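A sketch of one pyramid group in PyTorch. Reading ASPP1's "five branches" as a rate-1 (1 × 1) convolution, three atrous convolutions and an image-pooling branch follows the usual ASPP layout and is an assumption; note also that the claims give the ASPP2 rates as [1, 6, 12] while this paragraph says 1, 6, 8 and 12, and the sketch follows the claims. The stride-1 max poolings reproduce the stated size matches (35 to 32 and 17 to 16):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """One group of the double pyramid: a rate-1 (1 x 1) branch, parallel
    atrous convolutions, and an image-pooling branch, fused by a 1 x 1 conv."""
    def __init__(self, in_ch, out_ch, rates):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 1)] +                      # rate-1 branch
            [nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r)  # atrous branches
             for r in rates])
        self.image_pool = nn.Sequential(                         # global-context branch
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(in_ch, out_ch, 1))
        self.project = nn.Conv2d((len(self.branches) + 1) * out_ch, out_ch, 1)

    def forward(self, x):
        feats = [b(x) for b in self.branches]
        pooled = F.interpolate(self.image_pool(x), size=x.shape[-2:],
                               mode="bilinear", align_corners=False)
        return self.project(torch.cat(feats + [pooled], dim=1))  # 1 x 1 fusion

# ASPP1 after the Stem: 35 x 35 x 384 in, 256 channels out; a 4 x 4 max
# pooling with stride 1 then reduces 35 x 35 to 32 x 32 (35 - 4 + 1 = 32).
aspp1 = nn.Sequential(ASPP(384, 256, rates=(6, 12, 18)), nn.MaxPool2d(4, stride=1))

# ASPP2 after Reduction-A: 17 x 17 x 1024 in, 512 channels out; a 2 x 2 max
# pooling with stride 1 then reduces 17 x 17 to 16 x 16 (17 - 2 + 1 = 16).
aspp2 = nn.Sequential(ASPP(1024, 512, rates=(6, 12)), nn.MaxPool2d(2, stride=1))
```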
Step 3: design the decoder as four convolution modules, each convolution block containing an up-sampling operation, and finally restore the image size through bilinear up-sampling. The decoder parameters are shown in Table 1 below:
Table 1: decoder parameters (presented as an image in the original publication; the values are not recoverable here).
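Since Table 1 survives only as an image, the channel widths below are illustrative assumptions; the structure itself (four convolution blocks, each with an up-sampling operation, finished by bilinear up-sampling and Softmax over the six Vaihingen classes) follows the text:

```python
import torch.nn as nn
import torch.nn.functional as F

def up_block(in_ch, out_ch):
    """One decoder block: 3 x 3 convolution followed by 2x up-sampling."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False))

class Decoder(nn.Module):
    def __init__(self, in_ch=1536, num_classes=6, widths=(512, 256, 128, 64)):
        super().__init__()  # widths are assumptions; Table 1 is not recoverable
        chs = (in_ch,) + widths
        self.blocks = nn.Sequential(*[up_block(chs[i], chs[i + 1]) for i in range(4)])
        self.classify = nn.Conv2d(widths[-1], num_classes, 1)

    def forward(self, x, out_size=(299, 299)):
        x = self.classify(self.blocks(x))
        # Final bilinear up-sampling back to the input size, then Softmax.
        x = F.interpolate(x, size=out_size, mode="bilinear", align_corners=False)
        return x.softmax(dim=1)
```

In the full network, the fused double-pyramid feature maps would also be concatenated into the correspondingly sized decoder blocks; that wiring is omitted here for brevity.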
FIG. 6 is a comparison of prediction results on the Vaihingen validation set according to an embodiment of the present invention. With reference to fig. 5, table 2 and table 3, experiments were conducted to compare mainstream segmentation networks, including FCN32, SegNet, PspNet and U-Net.
Table 2: IoU score comparison on the Vaihingen dataset (presented as an image in the original publication; the values are not recoverable here).
Method   Imp.S.  Build.  Low.V.  Tree   Car    Overall
FCN      85.15   94.42   79.26   76.51  80.16  84.25
SegNet   82.65   89.76   76.04   73.74  71.63  82.69
PspNet   71.39   75.58   65.71   45.32  40.61  63.36
U-Net    83.49   90.14   78.36   75.41  73.10  83.20
Ours     93.97   97.77   90.06   88.23  90.94  96.43

Table 3: F1 score comparison on the Vaihingen dataset
To verify whether the modifications to the network play a positive role, the unmodified backbone network alone is also used as a segmentation network for testing and comparison; see fig. 6 for details.
As can be seen from the data comparison, the high-precision remote sensing image segmentation method based on the weighted loss fusion network provided by the invention achieves good results in remote sensing image segmentation.
FIG. 7 is a diagram of a media device according to an embodiment of the present invention, including a memory 100 and a processor 200. The memory 100 stores the data used while the processor runs, and the processor 200 is configured to execute the following steps: preprocessing the remote sensing image to obtain training data for a convolutional neural network; constructing a convolutional neural network comprising an encoder with multi-channel training branches, and performing context extraction on the training data through the encoder; constructing a double pyramid module, extracting feature maps of the training data through its two groups of spatial pyramids with different convolution expansion rates, and outputting corresponding first feature maps; up-sampling the obtained feature maps to obtain second feature maps of different sizes, and fusing the first feature maps with the second feature maps; and constructing a perceptual loss network, calculating the loss of the fused feature map through this loss network, weight-fusing the calculated loss with the loss of the training network, and back-propagating it to update the parameters.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the gist of the present invention.

Claims (11)

1. A remote sensing image segmentation method based on a weighted loss fusion network is characterized by comprising the following steps:
S100, preprocessing the remote sensing image to obtain training data for a convolutional neural network;
S200, constructing a convolutional neural network comprising an encoder with multi-channel training branches, and performing context extraction on the training data through the encoder;
S300, constructing a double pyramid module, extracting feature maps of the training data through its two groups of spatial pyramids with different convolution expansion rates, and outputting corresponding first feature maps;
S400, up-sampling the feature maps obtained in step S300 to obtain second feature maps of different sizes, and fusing the first feature maps with the second feature maps;
S500, constructing a perceptual loss network, calculating the loss of the fused feature map through this loss network, weight-fusing the calculated loss with the loss of the training network, and back-propagating it to update the parameters.
2. The remote sensing image segmentation method based on the weighted loss fusion network as claimed in claim 1, wherein the preprocessing in S100 comprises: performing image cutting, normalization and data expansion on the remote sensing image, wherein the data expansion operations are flips in the horizontal and vertical directions, respectively.
3. The remote sensing image segmentation method based on the weighted loss fusion network as claimed in claim 1, wherein S200 comprises: the convolutional neural network uses Inception V-4 pre-trained on the ImageNet dataset as the network backbone, removes the average pooling layer at its end, and adds network branches to the backbone to obtain an encoder with multi-channel training branches, wherein the network branches are used to retain shallow image features.
4. The remote sensing image segmentation method based on the weighted loss fusion network as claimed in claim 3, wherein the S200 comprises:
removing the average pooling layer and all subsequent layers, establishing a separate network branch for the output after the Reduction-A layer, and fusing the branch with the end of the trunk network after a 2 × 2 max pooling layer.
5. The remote sensing image segmentation method based on the weighted loss fusion network as claimed in claim 3, wherein the S300 comprises:
S310, establishing the first spatial pyramid behind the Stem module as a parallel training network, extracting the first-stage multi-scale features of the training data with the first spatial pyramid, fusing the feature maps of the five branches of the ASPP1 module into a corresponding fused feature map, and applying a 1 × 1 convolution to the fused feature map;
S320, performing a 4 × 4 max pooling operation to reduce the feature map size;
S330, constructing the second spatial pyramid at the Reduction-A module of the Inception V-4 network, receiving the corresponding feature map through the ASPP2 module, fusing the four branches of the ASPP2 module, outputting the feature map through a 1 × 1 convolution layer, and adding a 2 × 2 max pooling layer after the convolution layer to obtain the feature map.
6. The remote sensing image segmentation method based on the weighted loss fusion network as claimed in claim 5, wherein the expansion rates of the first spatial pyramid and the second spatial pyramid are set to [1, 6, 12, 18] and [1, 6, 12], respectively.
7. The remote sensing image segmentation method based on the weighted loss fusion network as claimed in claim 1, wherein the S400 comprises:
the image size is recovered by four sets of convolution block sampling modules.
8. The remote sensing image segmentation method based on the weighted loss fusion network as claimed in claim 1, wherein S500 comprises: constructing a perceptual loss network using a pre-trained VGG16 network, feeding the prediction map produced by the segmentation network into the loss network to calculate a loss, then weight-fusing this loss with the loss calculated by the segmentation network and back-propagating it to update the parameters.
9. The remote sensing image segmentation method based on the weighted loss fusion network as claimed in claim 1, wherein the method further comprises:
the segmentation method of S100 to S400 is evaluated using IoU and F1 as evaluation indexes, wherein
IoU = TP / (TP + FP + FN)

F1 = 2TP / (2TP + FP + FN)
where P is the number of positive samples, N is the number of negative samples, TP is the number of correctly predicted positive samples, FP is the number of samples incorrectly predicted as positive, TN is the number of correctly predicted negative samples, FN is the number of samples incorrectly predicted as negative, and the number of samples is the number of pixels per picture.
10. A remote sensing image segmentation apparatus based on a weighted loss fusion network, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the method steps of any one of claims 1 to 8 when executing the computer program.
11. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method steps of any one of claims 1 to 8.
CN202011097624.1A, filed 2020-10-14: Remote sensing image segmentation method, device and medium based on weighted loss fusion network (pending, published as CN112364699A)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202011097624.1A | 2020-10-14 | 2020-10-14 | Remote sensing image segmentation method, device and medium based on weighted loss fusion network


Publications (1)

Publication Number Publication Date
CN112364699A | 2021-02-12

Family

ID=74506688

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011097624.1A | Remote sensing image segmentation method, device and medium based on weighted loss fusion network | 2020-10-14 | 2020-10-14 (pending as CN112364699A)

Country Status (1)

Country Link
CN (1) CN112364699A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112861829A (en) * 2021-04-13 2021-05-28 山东大学 Water body extraction method and system based on deep convolutional neural network
CN113298817A (en) * 2021-07-02 2021-08-24 贵阳欧比特宇航科技有限公司 High-accuracy semantic segmentation method for remote sensing image
CN113298092A (en) * 2021-05-28 2021-08-24 有米科技股份有限公司 Neural network training method and device for extracting multi-level image contour information
CN113378897A (en) * 2021-05-27 2021-09-10 浙江省气候中心 Neural network-based remote sensing image classification method, computing device and storage medium
CN113822428A (en) * 2021-08-06 2021-12-21 中国工商银行股份有限公司 Neural network training method and device and image segmentation method
CN113989287A (en) * 2021-09-10 2022-01-28 国网吉林省电力有限公司 Urban road remote sensing image segmentation method and device, electronic equipment and storage medium
CN114092815A (en) * 2021-11-29 2022-02-25 自然资源部国土卫星遥感应用中心 Remote sensing intelligent extraction method for large-range photovoltaic power generation facility
CN114387512A (en) * 2021-12-28 2022-04-22 南京邮电大学 Remote sensing image building extraction method based on multi-scale feature fusion and enhancement

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325534A (en) * 2018-09-22 2019-02-12 天津大学 A kind of semantic segmentation method based on two-way multi-Scale Pyramid
WO2020143323A1 (en) * 2019-01-08 2020-07-16 平安科技(深圳)有限公司 Remote sensing image segmentation method and device, and storage medium and server
US20200250462A1 (en) * 2018-11-16 2020-08-06 Beijing Sensetime Technology Development Co., Ltd. Key point detection method and apparatus, and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325534A (en) * 2018-09-22 2019-02-12 天津大学 A kind of semantic segmentation method based on two-way multi-Scale Pyramid
US20200250462A1 (en) * 2018-11-16 2020-08-06 Beijing Sensetime Technology Development Co., Ltd. Key point detection method and apparatus, and storage medium
WO2020143323A1 (en) * 2019-01-08 2020-07-16 平安科技(深圳)有限公司 Remote sensing image segmentation method and device, and storage medium and server

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112861829A (en) * 2021-04-13 2021-05-28 山东大学 Water body extraction method and system based on deep convolutional neural network
CN112861829B (en) * 2021-04-13 2023-06-30 山东大学 Water body extraction method and system based on deep convolutional neural network
CN113378897A (en) * 2021-05-27 2021-09-10 浙江省气候中心 Neural network-based remote sensing image classification method, computing device and storage medium
CN113298092A (en) * 2021-05-28 2021-08-24 有米科技股份有限公司 Neural network training method and device for extracting multi-level image contour information
CN113298817A (en) * 2021-07-02 2021-08-24 贵阳欧比特宇航科技有限公司 High-accuracy semantic segmentation method for remote sensing image
CN113822428A (en) * 2021-08-06 2021-12-21 中国工商银行股份有限公司 Neural network training method and device and image segmentation method
CN113989287A (en) * 2021-09-10 2022-01-28 国网吉林省电力有限公司 Urban road remote sensing image segmentation method and device, electronic equipment and storage medium
CN114092815A (en) * 2021-11-29 2022-02-25 自然资源部国土卫星遥感应用中心 Remote sensing intelligent extraction method for large-range photovoltaic power generation facility
CN114092815B (en) * 2021-11-29 2022-04-15 自然资源部国土卫星遥感应用中心 Remote sensing intelligent extraction method for large-range photovoltaic power generation facility
CN114387512A (en) * 2021-12-28 2022-04-22 南京邮电大学 Remote sensing image building extraction method based on multi-scale feature fusion and enhancement
CN114387512B (en) * 2021-12-28 2024-04-19 南京邮电大学 Remote sensing image building extraction method based on multi-scale feature fusion and enhancement


Legal Events

Code | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination