CN112084901A - GCAM-based high-resolution SAR image airport runway area automatic detection method and system - Google Patents
- Publication number
- CN112084901A (application CN202010871235.3A)
- Authority
- CN
- China
- Prior art keywords
- sar image
- runway area
- gcam
- convolution
- pooling
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S13/00—Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
- G01S13/88—Radar or analogous systems specially adapted for specific applications
- G01S13/89—Radar or analogous systems specially adapted for specific applications for mapping or imaging
- G01S13/90—Radar or analogous systems specially adapted for specific applications for mapping or imaging using synthetic aperture techniques, e.g. synthetic aperture radar [SAR] techniques
- G01S13/9021—SAR image post-processing techniques
- G01S13/9027—Pattern recognition for feature extraction
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S13/00—Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
- G01S13/88—Radar or analogous systems specially adapted for specific applications
- G01S13/89—Radar or analogous systems specially adapted for specific applications for mapping or imaging
- G01S13/90—Radar or analogous systems specially adapted for specific applications for mapping or imaging using synthetic aperture techniques, e.g. synthetic aperture radar [SAR] techniques
- G01S13/9094—Theoretical aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a GCAM-based high-resolution SAR image airport runway area automatic detection method and system. The method comprises: downsampling a high-resolution SAR image to generate a medium-resolution image; inputting the medium-resolution image into the geospatial context attention mechanism network GCAM to extract the runway area; and performing coordinate mapping on the extracted runway area to obtain the final runway area detection result on the high-resolution SAR image. Experiments show that, compared with the DeepLabv3+, RefineNet, and MDDA networks, the method achieves higher precision and shorter runtime, fully learns the geospatial information of SAR image airports, and realizes high-precision, rapid, automatic extraction of high-resolution SAR image airport runway areas.
Description
Technical Field
The invention relates to automatic airport runway area detection technology, and in particular to a GCAM-based high-resolution SAR image airport runway area automatic detection method and system.
Background
Airports are important transportation hubs and military facilities, and detecting airport targets from Synthetic Aperture Radar (SAR) images has become an important application. SAR offers all-day, all-weather imaging and penetrates cloud and fog, but SAR images are harder to read than optical images and their interpretation is more complex, so most airport detection has been based on optical remote sensing images. As SAR image resolution and data volume have grown, research on extracting airports from SAR images has gradually increased in recent years and continues to deepen. Traditional airport extraction methods are time-consuming and labor-intensive; most work well only on optical images and perform poorly on SAR images. Realizing automatic, rapid extraction of airport runway areas from high-resolution SAR images therefore has far-reaching and urgent practical significance. In addition, masking aircraft detection with the airport runway area can greatly reduce the false alarms generated in aircraft detection and improve aircraft detection precision.
Airport detection has wide applications in aerial navigation, accident search and rescue, aircraft positioning, and the like. The runway area is one of the main components of an airport, and within airport detection research, most work targets optical remote sensing images. The prior art performs airport detection with a traditional method that extracts airport edge line segments, but line-segment extraction requires the airport to have obvious linear characteristics, which is unsuitable for large civil airports with many terminals and weak runway linearity. Other schemes complete airport detection with a sparse reconstruction saliency model (SRS) and a target-aware active contour model (TAACM), which enhances the extraction of airport details. Still other schemes combine a visual saliency analysis model, a bidirectional complementary saliency analysis module, and a saliency active contour model (SOACM) to extract the airport contour, an approach suitable for most optical remote sensing images. Because SAR has strong penetration capability, operates without interference, and acquires abundant ground-object information, SAR images have gradually become an experimental object for airport detection. Some schemes combine the traditional line-segment grouping method with a saliency analysis model to detect airports in small SAR images, but this does not suit airport detection in large SAR images; others propose a PolSAR airport runway detection algorithm combining optimized polarization features and random forests, but that method can only effectively extract the parallel runway features of an airport.
In recent years, deep learning has achieved excellent results in semantic segmentation. Semantic segmentation is a deep learning approach that learns features at the image pixel level so as to partition an image into different classes. Airport detection needs to extract all airport features, a principle consistent with the semantic segmentation idea, so methods combining deep learning with airport detection have begun to appear. For example: one prior art provides an airport detection method combining the deep learning YOLO model with a saliency analysis model; another detects airports by combining a deep learning Goole-LF network with a Support Vector Machine (SVM); another performs airport extraction by combining a deep learning Faster R-CNN network with spatial analysis; yet another constructs an end-to-end transferable deep convolutional network to detect airports. However, the above methods apply deep learning to optical remote sensing images, and because airport sample data is scarce, the deep learning models often overfit during training. For high-resolution SAR image airport extraction, one prior art provides MDDA (Multi-level and Densely Dual Attention), a deep learning network for the high-resolution SAR image runway area that achieves high-precision airport extraction but requires a large data set and a long training time. Therefore, finding a deep learning method that suits small sample data sets and extracts airports efficiently is of great practical value.
Deep learning networks have developed rapidly, and the DeepLab series performs excellently in the semantic segmentation field. DeepLabv1, proposed in 2014, introduced atrous convolution (Atrous Conv) for the first time, addressing the signal downsampling and spatial invariance problems that traditional CNN algorithms suffer from in pixel labeling, and used a Conditional Random Field (CRF) to improve the model's ability to capture fine details; DeepLabv1 took second place in the PASCAL semantic segmentation challenge. DeepLabv2, proposed in 2016, further added the ASPP (Atrous Spatial Pyramid Pooling) module on the basis of DeepLabv1 to capture contextual semantic information at multiple scales, and changed the backbone network from VGG-16 to ResNet, mitigating the feature-resolution reduction caused by pooling in traditional CNNs. DeepLabv3 appeared in 2017 and improved ASPP on the basis of DeepLabv2, yielding better network performance. In 2018, DeepLabv3+ improved further on DeepLabv3 by introducing an encoder-decoder structure: DeepLabv3 serves as the encoder, a simple and effective decoder block is designed, and depthwise separable convolution is added to the backbone network, effectively reducing computation and parameter counts while maintaining model performance.
Therefore, in view of the problems in SAR image airport extraction, how to realize high-precision, rapid, automatic extraction of the high-resolution SAR image airport runway area based on deep learning is a key technical problem to be solved urgently.
Disclosure of Invention
The technical problems to be solved by the invention are as follows: aiming at the problems in the prior art, the invention provides the high-resolution SAR image airport runway area automatic detection method and system based on GCAM, which can fully learn the geographic spatial information of the SAR image airport and can realize the high-precision, quick and automatic extraction of the high-resolution SAR image airport runway area.
In order to solve the technical problems, the invention adopts the technical scheme that:
a GCAM-based high-resolution SAR image airport runway area automatic detection method comprises the following steps:
1) down-sampling the high-resolution SAR image to generate a medium-resolution image;
2) inputting the medium-resolution image into a geographic space context attention mechanism network GCAM to extract a runway area;
3) performing coordinate mapping on the extracted runway area to obtain the final runway area detection result on the high-resolution SAR image.
Optionally, downsampling the high-resolution SAR image in step 1) specifically refers to performing 5-fold downsampling of the SAR image by a pixel value extraction method.
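As a sketch of how the 5-fold pixel-value-extraction downsampling of step 1) and the coordinate mapping of step 3) fit together, the following minimal NumPy example keeps every fifth pixel and maps detected coordinates back to the high-resolution grid. The function names and the corner-aligned sampling grid are illustrative assumptions, not the patent's exact implementation.

```python
import numpy as np

def downsample_by_extraction(img: np.ndarray, factor: int = 5) -> np.ndarray:
    """Pixel-value-extraction downsampling: keep every `factor`-th pixel.

    Sketch of the patent's step 1); the exact sampling grid (corner vs.
    centre of each block) is an assumption.
    """
    return img[::factor, ::factor]

def map_coords_back(rows, cols, factor: int = 5):
    """Sketch of step 3): map pixel coordinates detected on the
    medium-resolution image back onto the original high-resolution grid."""
    return [r * factor for r in rows], [c * factor for c in cols]
```

For example, a 25 × 30 image downsampled 5-fold becomes 5 × 6, and a runway pixel detected at medium-resolution coordinate (2, 3) maps back to (10, 15) on the high-resolution image.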
Optionally, the GCAM includes a coding block and a decoding block, where the coding block includes a residual error network ResNet, a multi-scale extrusion pyramid MSP and an edge refinement module EDM, where the residual error network ResNet is used to perform feature extraction on an input data set to obtain a preliminary feature, the multi-scale extrusion pyramid MSP is used to obtain global context information from different resolutions by using different pooling convolutional layers for the preliminary feature, the edge refinement module EDM is used to enhance network edge extraction capability for the preliminary feature, and outputs of the multi-scale extrusion pyramid MSP and the edge refinement module EDM are further fused to obtain a multi-level feature; the decoding block is used for carrying out semantic segmentation on the runway area of the airport by combining the preliminary features and the multi-level features to extract the runway area.
Optionally, the residual error network ResNet is an improved residual error network obtained by replacing a common two-dimensional convolution with hole convolutions with hole rates of 2, 4, 8, and 16 on the basis of the residual error network ResNet _ 101.
Optionally, the multi-scale extrusion pyramid MSP comprises a multi-receptive-field parallel pooling layer and an effective attention module eSE. The parallel pooling layer is built in parallel from a 1 × 1 convolution with hole rate 1, three 3 × 3 convolutions with hole rates 6, 12, and 18, a global average pooling module GAP, and a stripe pooling module SP. For an input two-dimensional feature tensor of size H × W, the stripe pooling module SP pools in the horizontal direction with a 1 × W stripe pooling window and in the vertical direction with an H × 1 stripe pooling window, averaging the element values inside each pooling kernel to obtain the horizontal and vertical stripe pooling outputs; it then expands these outputs in the left-right and up-down directions respectively with two one-dimensional convolutions so that the two expanded feature maps have the same size, fuses the two expanded feature maps, and finally multiplies the original data by the Sigmoid-processed data to obtain the H × W output tensor. The effective attention module eSE first applies global average pooling to the input feature map X_i to learn the feature F_avg, processes F_avg with a fully connected layer to obtain a weight matrix W_C, rescales W_C through a Sigmoid function into the channel attention feature A_eSE, applies A_eSE to the input feature map X_i to obtain the refined feature map X_refine, and finally re-screens the features of X_refine to obtain the global context information.
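The eSE channel-attention chain described above (global average pooling, one fully connected layer, Sigmoid, channel reweighting) can be sketched in NumPy as follows; the single (C, C) weight matrix `w_fc` is a hypothetical stand-in for the learned fully connected layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ese_attention(x: np.ndarray, w_fc: np.ndarray) -> np.ndarray:
    """Effective squeeze-excitation (eSE) sketch for a (C, H, W) feature map.

    Chain per the description: global average pooling -> one fully
    connected layer -> Sigmoid -> channel reweighting. `w_fc` is a
    hypothetical (C, C) matrix standing in for the learned FC layer.
    """
    f_avg = x.mean(axis=(1, 2))        # F_avg: global average pooling, (C,)
    w_c = w_fc @ f_avg                 # W_C: fully connected layer output, (C,)
    a_ese = sigmoid(w_c)               # A_eSE: channel attention weights
    return x * a_ese[:, None, None]    # X_refine: reweighted feature map
```

With zero FC weights every channel gets the neutral weight sigmoid(0) = 0.5, which makes the chain easy to sanity-check before training.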
Optionally, the edge refinement module EDM comprises a global convolution module GCB for enhancing the affinity of the feature map to the pixel classification layer and the ability to process feature maps of different resolutions to obtain global information, and an edge refinement submodule BR for enhancing the edge extraction capability of the coding block from the global information. The global convolution module GCB comprises a large k × k convolution kernel and a feature combination module: the large kernel is factorized into two paths, one path consisting of a k × 1 convolution followed by a 1 × k convolution, the other of a 1 × k convolution followed by a k × 1 convolution, where c is the number of channels; the output results of the two paths are input together into the feature combination module to obtain the feature Sum_{W×H×C}. The edge refinement submodule BR processes the feature Sum_{W×H×C} sequentially with a small convolution kernel, an activation function, and another small convolution kernel, superimposes the processing result onto the original feature Sum_{W×H×C}, and finally obtains the feature map with refined runway area edges.
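One motivation for the GCB's two factorized paths is parameter economy: a k × 1 plus 1 × k pair costs far fewer parameters than a dense k × k kernel. A small arithmetic sketch, under the assumption that every convolution keeps c channels and biases are ignored:

```python
def dense_kxk_params(k: int, c: int) -> int:
    """Parameter count of a plain k x k convolution, c channels in and out."""
    return k * k * c * c

def gcb_params(k: int, c: int) -> int:
    """Parameter count of the two-path GCB factorisation: path 1 is a
    (k x 1) then a (1 x k) convolution, path 2 is a (1 x k) then a
    (k x 1) convolution; each of the four separable convolutions costs
    k * c * c parameters (biases ignored)."""
    return 4 * k * c * c
```

For an illustrative k = 7, c = 256, the dense kernel needs 3,211,264 parameters while the factorized GCB needs 1,835,008, yet both cover the same 7 × 7 receptive field.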
Optionally, the decoding block applies a 1 × 1 convolution to reduce the dimensionality of the coding-block output features, decodes edge information from the feature map whose runway area edges were refined by the edge refinement module EDM, and performs bilinear 4-fold upsampling; the upsampled result is concatenated with the preliminary features output by the residual network ResNet after their own 1 × 1 convolution dimensionality reduction; a 3 × 3 convolution is then applied to the concatenated features to refine them, followed by a final simple bilinear 4-fold upsampling that yields the segmentation result.
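The decoder's size bookkeeping can be checked with a short trace. The 512-pixel input size and encoder output stride 16 are illustrative assumptions (the text only fixes the two bilinear 4-fold upsampling steps); under them, the two 4× upsamplings exactly recover the input resolution.

```python
def decoder_shape_trace(input_hw: int = 512, output_stride: int = 16):
    """Trace spatial sizes through the decoding block described above.

    Returns (encoder output size, size after the first bilinear 4x
    upsampling, size after the final bilinear 4x upsampling).
    """
    encoder_hw = input_hw // output_stride
    after_first_up = encoder_hw * 4
    after_final_up = after_first_up * 4
    return encoder_hw, after_first_up, after_final_up
```

For a 512 × 512 input at stride 16 the trace is 32 → 128 → 512, confirming that a stride-16 encoder and two 4× upsamplings are mutually consistent.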
In addition, the invention also provides a GCAM-based high-resolution SAR image airport runway area automatic detection system, which comprises:
the down-sampling program unit is used for down-sampling the high-resolution SAR image to generate a medium-resolution image;
a runway area extraction program unit for inputting the medium resolution image into a geographic space context attention mechanism network GCAM to extract a runway area;
and the coordinate mapping program unit is used for carrying out coordinate mapping on the extracted runway area to obtain a final detection result.
In addition, the invention also provides a GCAM-based high-resolution SAR image airport runway area automatic detection system, comprising a computer device with an interconnected microprocessor and memory, wherein the microprocessor is programmed or configured to execute the steps of the GCAM-based high-resolution SAR image airport runway area automatic detection method, or the memory stores a computer program programmed or configured to execute the GCAM-based high-resolution SAR image airport runway area automatic detection method.
Furthermore, the present invention also provides a computer readable storage medium having stored therein a computer program programmed or configured to execute the GCAM-based high resolution SAR image airport runway area automatic detection method.
Compared with the prior art, the invention has the following advantages: the method downsamples a high-resolution SAR image to generate a medium-resolution image, inputs the medium-resolution image into the geospatial context attention mechanism network GCAM to extract the runway area, and performs coordinate mapping on the extracted runway area to obtain the final detection result on the high-resolution SAR image. By combining deep learning on SAR images with airport runway area extraction, the method fully learns the geospatial information of SAR image airports and realizes high-precision, rapid, automatic extraction of the high-resolution SAR image airport runway area.
Drawings
FIG. 1 is a schematic diagram of the basic principle of the method according to the embodiment of the present invention.
Fig. 2 is a schematic structural diagram of an improved residual error network in the embodiment of the present invention.
Fig. 3 is a schematic structural diagram of the stripe pooling module SP in the embodiment of the present invention.
Fig. 4 is a schematic structural diagram of an effective attention module eSE according to an embodiment of the present invention.
Fig. 5 is a schematic structural diagram of the global convolution module GCB and the edge refinement module BR in the embodiment of the present invention.
Fig. 6 shows the SAR image, the label, and the optical remote sensing image of a sample airport in an embodiment of the present invention.
Fig. 7 shows the runway extraction result for airport I in the embodiment of the present invention.
Fig. 8 shows runway extraction results for airport II in an embodiment of the present invention.
Fig. 9 shows runway extraction results for airport III in an embodiment of the invention.
Detailed Description
As shown in fig. 1, the GCAM-based high-resolution SAR image airport runway area automatic detection method of the embodiment includes:
1) down-sampling the high-resolution SAR image to generate a medium-resolution image;
2) inputting the medium-resolution image into a geographic space context attention mechanism network GCAM to extract a runway area;
3) performing coordinate mapping on the extracted runway area to obtain the final runway area detection result on the high-resolution SAR image.
In this embodiment, the downsampling of the high-resolution SAR image in step 1) specifically refers to 5-fold downsampling of the SAR image by a pixel value extraction method. The downsampling comprises two parts: downsampling the data set sample pictures, and downsampling the three high-resolution test SAR images, which after sampling become medium-resolution SAR images.
In order to extract the SAR image airport runway area quickly, this embodiment proposes a Geospatial Context Attention Mechanism network, GCAM. As shown in fig. 1, the GCAM includes a coding block and a decoding block. The coding block includes a residual network ResNet, a Multi-scale Squeeze Pyramid (MSP), and an Edge Refinement Module (EDM): the residual network ResNet performs feature extraction on the input data set to obtain preliminary features; the multi-scale squeeze pyramid MSP applies different pooling convolutional layers to the preliminary features to obtain global context information from different resolutions; the edge refinement module EDM enhances the network's edge extraction capability from the preliminary features; and the outputs of the MSP and the EDM are further fused to obtain multi-level features. The decoding block combines the preliminary features and the multi-level features to perform semantic segmentation of the airport runway area and extract the runway area.
Firstly, a coding block performs primary feature extraction on an input data set by using a residual error network ResNet; the multi-scale extrusion pyramid MSP and the edge refinement module EDM respectively extract and fuse the initial features, the multi-scale extrusion pyramid MSP obtains global context information from different resolutions by different pooling convolutional layer operations, and the edge refinement module EDM enhances the network edge extraction capability and further fuses multi-level features; the decoding block adopts edge refinement decoding, one part of the decoding block receives multi-level high-level features from the coding block, and the other part of the decoding block receives preliminary features from a residual error network ResNet, so that semantic segmentation of the runway area of the airport is realized.
The residual network ResNet is the backbone of the geospatial context attention mechanism network GCAM. Its skip connections and residual learning accelerate training and improve model accuracy, making it well suited to building a semantic segmentation network. To address the tendency of network pooling operations to lose detailed features, as shown in fig. 2, the residual network ResNet adopted in this embodiment is an improved residual network obtained by replacing ordinary two-dimensional convolutions with hole convolutions of hole rates 2, 4, 8, and 16 on the basis of the residual network ResNet_101. Hole convolution solves the loss of detailed features in pooling without adding extra parameters to the residual network, and the subsequent convolutional layers keep a larger feature-map size, which benefits the detection of target pixels and improves overall model performance. With hole convolution, for an arbitrary position j of the picture, applying a filter ω[k] to the input feature x[j + r·k] gives the output y[j]:

y[j] = Σ_k x[j + r·k] · ω[k]
wherein the rate r introduces r − 1 zeros between sampling points, effectively extending the receptive field from k × k to k + (k − 1)(r − 1) without increasing the number of parameters or the amount of computation. Fig. 2 shows the improved structural part of the residual network. The last block of the residual network ResNet_101 is copied 4 times and the copies are built in parallel, but purely parallel blocks do not help the network acquire deep semantic information, because the features become concentrated in the last few small feature maps, and consecutive strided convolutions are detrimental to semantic segmentation. Therefore, in this embodiment, hole convolutions with hole rates 2, 4, 8, and 16 replace the ordinary two-dimensional convolutions, improving the final output stride. Adding hole convolution changes the resolution of part of the feature maps, so the final output of ResNet_101 contains not only high-dimensional low-resolution feature maps but also some low-dimensional high-resolution features, realizing full extraction of multi-size features.
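The formula above and the receptive-field expansion k + (k − 1)(r − 1) can be illustrated with a minimal one-dimensional sketch (valid positions only, no padding); the function names are illustrative:

```python
def atrous_conv1d(x, w, rate):
    """One-dimensional atrous convolution: y[j] = sum_k x[j + rate*k] * w[k].

    Valid positions only (no padding), mirroring the formula above.
    """
    k = len(w)
    span = (k - 1) * rate  # effective receptive field minus one
    return [sum(x[j + rate * i] * w[i] for i in range(k))
            for j in range(len(x) - span)]

def effective_receptive_field(k, rate):
    """A k-tap kernel with rate r covers k + (k - 1) * (r - 1) positions."""
    return k + (k - 1) * (rate - 1)
```

A 3-tap kernel at rate 2 reads positions j, j + 2, j + 4, covering 5 input samples while still using only 3 weights, which is exactly the no-extra-parameters property the text relies on.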
Referring to fig. 1, the multi-scale extrusion pyramid MSP includes a multi-receptive-field parallel pooling working layer and an effective attention module eSE.
Referring to fig. 1, the multi-receptive-field parallel pooling working layer is built in parallel from a 1 × 1 convolution with a hole rate of 1, three 3 × 3 convolutions with hole rates of 6, 12 and 18 respectively, a global average pooling module GAP and a stripe pooling module SP. For an input two-dimensional feature tensor of size H × W, the stripe pooling module SP performs a pooling operation in the horizontal direction using a stripe pooling window H × 1 and a pooling operation in the vertical direction using a stripe pooling window 1 × W, averaging the element values in each pooling kernel to obtain the output of stripe pooling in the horizontal direction and the output of stripe pooling in the vertical direction. Two one-dimensional convolutions then expand these two outputs in the left-right and up-down directions respectively; the two expanded feature maps have the same size and are fused, and finally the original data are multiplied by the fused data after Sigmoid processing to obtain the output H × W two-dimensional feature tensor. In this embodiment, the feature map produced by the improved residual network contains 256 channels and rich semantic information, and is first input to this multi-receptive-field parallel pooling working layer.
The hole convolutions with four different hole rates can effectively capture multi-scale information from different receptive fields; the addition of global average pooling down-samples the features to prevent over-fitting of the network; stripe pooling captures local information of the features. Together, the multi-receptive-field parallel pooling working layer realizes multi-scale feature fusion.
The stripe pooling module SP (Strip Pooling) can overcome the disadvantage that general pooling is prone to false alarms. As shown in fig. 3, for an input two-dimensional feature tensor x ∈ R^{H×W}, the stripe pooling module SP performs pooling operations in the horizontal and vertical directions using the stripe pooling windows H × 1 and 1 × W, respectively, and averages the element values in the pooling kernel as the pooling output value. The output y^h ∈ R^H of stripe pooling in the horizontal direction is:

y^h_i = (1/W) · Σ_{0≤j<W} x_{i,j}

In the above formula, y^h_i is an arbitrary element of the horizontal stripe pooling output, and x_{i,j} ranges over all matrix elements within the pooling kernel.

The output y^v ∈ R^W of stripe pooling in the vertical direction is:

y^v_j = (1/H) · Σ_{0≤i<H} x_{i,j}

In the above formula, y^v_j is an arbitrary element of the vertical stripe pooling output, and x_{i,j} ranges over all matrix elements within the pooling kernel.
After the H × 1 and 1 × W pooling kernels are applied, the outputs are expanded in the left-right and up-down directions using two one-dimensional convolutions. After expansion the two feature maps have the same size; they are then fused, and finally the original data are multiplied by the data processed by the Sigmoid function to output the result. In the horizontal and vertical stripe pooling layers, dependencies are easily established between discretely distributed pixel regions and band-shaped pixel regions. Since the pooling kernel is long and narrow along one dimension and narrow along the opposite dimension, it readily captures local information of the features. These properties make stripe pooling preferable to average pooling based on square kernels.
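The stripe pooling procedure above can be sketched in numpy as follows. This is a simplified sketch: the two one-dimensional expansion convolutions are omitted (the pooled vectors are broadcast directly), so it shows only the pool–fuse–Sigmoid–multiply flow, not the embodiment's full module:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def strip_pool(x):
    """Strip pooling sketch for a 2-D feature map x of size H x W.

    Horizontal stripes average each row (output in R^H), vertical
    stripes average each column (output in R^W); both are broadcast
    back to H x W, fused by addition, squashed by a Sigmoid and used
    as a multiplicative attention map on the original input.
    """
    H, W = x.shape
    y_h = x.mean(axis=1)                  # horizontal stripe pooling, shape (H,)
    y_v = x.mean(axis=0)                  # vertical stripe pooling, shape (W,)
    fused = y_h[:, None] + y_v[None, :]   # expanded and fused, shape (H, W)
    return x * sigmoid(fused)             # multiply original by Sigmoid output

x = np.ones((4, 6))
out = strip_pool(x)
```

On the all-ones toy input, every fused value is 2, so the whole map is scaled by sigmoid(2).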
The effective attention module eSE (Effective Squeeze-and-Excitation Module) is used for receiving the multi-scale features and then re-screening the features for quality based on the channel information. Referring to fig. 4, for the input feature map X_i, the effective attention module eSE first learns a feature F_avg by global average pooling, passes the feature F_avg through a fully connected (FC) layer to obtain a weight matrix W_C, readjusts the weight matrix W_C through a Sigmoid function to extract the channel attention feature A_eSE, and then multiplies the channel attention feature A_eSE with the input feature map X_i to obtain the refined feature map X_refine. In this way each input X_i is assigned weights pixel by pixel, realizing feature re-screening. The fully connected (FC) layer and the Sigmoid function readjust the input feature map to extract the useful channel information.
When the size of the input feature map is X_i ∈ R^{C×W×H}, the effective channel attention map A_eSE(X_i) ∈ R^{C×1×1} is calculated as:

A_eSE(X_i) = σ(W_C(F_gap(X_i)))

In the above formula, A_eSE(X_i) represents the channel attention feature A_eSE extracted from the input feature map X_i, σ is the Sigmoid function, W_C is the weight matrix, and F_gap(X_i) is the feature F_avg obtained by global average pooling of the input feature map X_i; the functional expression of F_gap(X_i) is:

F_gap(X_i) = (1/(W×H)) · Σ_{0≤i<W} Σ_{0≤j<H} X_{i,j}

In the above equation, X_{i,j} represents all elements in the matrix of the feature map X_i.
Applying the channel attention feature A_eSE to the input feature map X_i to obtain the refined feature map X_refine is expressed as follows:

X_refine = A_eSE(X_i) ⊗ X_i

In the above formula, ⊗ represents channel-wise multiplication. The input feature map X_i is the multi-scale feature map output by the multi-scale extrusion pyramid MSP. Applying A_eSE(X_i) as channel attention to the multi-scale feature map makes the multi-scale features more informative. Finally, the input feature map is weighted element by element into the refined feature map X_refine, realizing feature re-screening.
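The eSE channel attention described above can be sketched in numpy as follows; the weight matrix w_c is an assumed toy stand-in for the learned FC layer, and the input values are illustrative only:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ese_attention(x, w_c):
    """eSE sketch: x is a feature map of shape (C, H, W); w_c is a
    (C, C) matrix standing in for the fully connected layer.

    F_avg = GAP(x); A_eSE = sigmoid(W_C @ F_avg);
    X_refine = A_eSE (x) x, applied channel-wise.
    """
    f_avg = x.mean(axis=(1, 2))          # global average pooling, shape (C,)
    a_ese = sigmoid(w_c @ f_avg)         # channel attention, shape (C,)
    return a_ese[:, None, None] * x      # channel-wise reweighting

C, H, W = 3, 4, 4
x = np.ones((C, H, W))
w_c = np.eye(C)                          # toy FC weights (identity)
x_refine = ese_attention(x, w_c)
```

With the all-ones input, every channel mean is 1 and each channel is scaled by sigmoid(1), so the reweighting is uniform here; with trained weights, strong channels would be kept and weak ones suppressed.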
Referring to fig. 1, the multi-scale extrusion pyramid MSP and the edge refinement module EDM work in parallel and simultaneously receive the output feature map from the improved residual network. As shown in fig. 1, the edge refinement module EDM comprises a global convolution module GCB (Global Convolutional Block), used for enhancing the affinity of the feature map to the pixel classification layer and the ability to process feature maps of different resolutions to obtain global information, and an edge refinement module BR (Boundary Refinement), used for enhancing the edge extraction capability of the coding block from the global information. The edge refinement module EDM can effectively solve the problems of pixel classification and localization in semantic segmentation: the global convolution module GCB increases the size of the convolution kernel to the spatial size of the feature map, so that the feature map and the pixel classification layer are closely related, thereby enhancing the ability to process different features and obtain global information; the edge refinement module BR is then introduced to further improve the network's edge extraction capability.
As shown in fig. 5, the global convolution module GCB in this embodiment includes a large k × k convolution kernel and a feature combination module. The large k × k convolution kernel comprises two paths: one path is composed of a k × 1 convolution followed by a 1 × k convolution, and the other path is composed of a 1 × k convolution followed by a k × 1 convolution, where c is the number of channels and each convolution maps c input channels to c output channels. The output results of the two paths are input to the feature combination module together to obtain the feature Sum_{W×H×C}. The edge refinement module BR processes the feature Sum_{W×H×C} sequentially through a small convolution kernel, an activation function and another small convolution kernel, then superimposes the processing result onto the original feature Sum_{W×H×C}, finally obtaining a feature map with refined runway area edges.
Referring to fig. 5, the global convolution module GCB adopts a convolutional construction to fully utilize the multi-channel information of the features. For the pixel classification problem, the global convolution module GCB adopts a large convolution kernel, so that the semantic information corresponding to each pixel is not changed by image transformations (translation, flipping, etc.) and the relationships between pixels become closer. For the pixel localization problem, the global convolution module GCB uses a fully convolutional structure and, following the matrix decomposition principle, replaces the large k × k kernel convolution with convolutions of 1 × k then k × 1, and of k × 1 then 1 × k, thereby reducing the number of parameters and the amount of computation while matching each pixel to its correct class, realizing accurate pixel segmentation. Because the global convolution module GCB has no BN layer (Batch Normalization) and no activation function, an edge refinement module BR with small convolution kernels is introduced, preventing the misclassification of object boundary pixels and achieving both classification accuracy and localization accuracy.
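The parameter saving of the GCB factorization and the residual form of the BR module can be illustrated with the following sketch. The kernel size k = 15 and channel count c = 21 are illustrative assumptions, and the "convolutions" in the BR stand-in are simplified to per-channel scalar weights, so this shows the structure, not the embodiment's actual layers:

```python
import numpy as np

def gcb_param_count(k, c):
    """Parameters of a dense k x k, c -> c convolution versus the GCB
    factorization into (k x 1 then 1 x k) and (1 x k then k x 1) paths."""
    dense = k * k * c * c            # single large square kernel
    factorized = 4 * k * c * c       # four separable kernels of length k
    return dense, factorized

def boundary_refine(feat, conv1, conv2):
    """BR sketch: feat + conv2(relu(conv1(feat))), with the two small
    convolutions modelled as scalar weights (toy stand-ins)."""
    residual = conv2 * np.maximum(conv1 * feat, 0.0)
    return feat + residual           # superimpose refinement on the input

dense, fact = gcb_param_count(k=15, c=21)
feat = np.array([[1.0, -2.0], [3.0, 4.0]])
refined = boundary_refine(feat, conv1=1.0, conv2=0.5)
```

For k = 15 the factorized form needs roughly 4/k of the dense kernel's parameters, which is the matrix-decomposition saving the paragraph describes.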
As shown in fig. 1, the decoding block performs 1 × 1 convolution dimensionality reduction on the output features of the coding block, decodes the edge information of the feature map whose runway area edges have been refined by the edge refinement module EDM, performs bilinear 4-fold upsampling, then concatenates the result with the preliminary features output by the residual network ResNet after their own 1 × 1 convolution dimensionality reduction, applies a 3 × 3 convolution to the concatenated features to refine them, and finally performs a simple bilinear 4-fold upsampling to obtain the final segmentation result. The input to the decoding block comprises two parts: the output features of the coding block and the preliminary features output by the residual network ResNet. The output features of the coding block are first reduced through 1 × 1 convolution, then decoded for edge information using the EDM, and then upsampled bilinearly by a factor of 4; this operation fully decodes the edge information while reducing the number of feature channels. The result is then concatenated with the corresponding features from the backbone network at the same spatial resolution. Since the features from the backbone network contain a portion of low-level features, which usually have a large number of channels, a 1 × 1 convolution is also applied to them to reduce the number of channels and avoid unnecessary channel computation in the network.
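The shape bookkeeping of the decoding path can be sketched as follows. This is a structural sketch only: nearest-neighbour repetition stands in for bilinear interpolation, and the 1 × 1 reductions, EDM decoding and 3 × 3 refinement convolutions are assumed to have already been applied to the toy inputs:

```python
import numpy as np

def upsample4(x):
    """4x spatial upsampling (nearest-neighbour stands in here for the
    bilinear interpolation used in the embodiment)."""
    return x.repeat(4, axis=-2).repeat(4, axis=-1)

def decode(encoder_feat, backbone_feat):
    """Decoder sketch: both inputs are (C, H, W) maps assumed to be at
    the same resolution after their channel reductions; they are
    upsampled 4x, concatenated along channels, and upsampled 4x again
    (the refining 3x3 convolution is omitted)."""
    up = upsample4(encoder_feat)                  # coding-block branch
    skip = upsample4(backbone_feat)               # backbone skip branch
    fused = np.concatenate([up, skip], axis=0)    # channel concatenation
    return upsample4(fused)                       # final 4x upsampling

enc = np.ones((2, 3, 3))   # toy encoder features
bb = np.zeros((1, 3, 3))   # toy backbone low-level features
out = decode(enc, bb)      # shape (3, 48, 48): 16x total upsampling
```

The two 4-fold upsamplings give a 16× total magnification, which matches an output stride of 16 in the encoder.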
In this embodiment, step 3) performs coordinate mapping on the extracted runway area to obtain the final detection result. The coordinate mapping is the same as in existing methods, so its description is omitted: after the geospatial context attention mechanism network GCAM segments the airport runway area of the medium-resolution SAR image, the result map is processed by the coordinate mapping method to obtain the result map for the high-resolution SAR original image. Finally, the result image and the original image are visualized, realizing runway area extraction from the high-resolution SAR image.
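Since the patent defers the coordinate mapping to existing methods, the following is only a plausible minimal sketch under the assumption of simple proportional scaling by the down-sampling factor (5 in this embodiment):

```python
def map_to_original(coords, factor=5):
    """Map pixel coordinates detected on the down-sampled medium-resolution
    result map back to the high-resolution original image, assuming plain
    proportional scaling by the down-sampling factor."""
    return [(r * factor, c * factor) for r, c in coords]

# hypothetical runway-area pixels on the medium-resolution result map
runway_pixels = [(10, 20), (11, 21)]
original_pixels = map_to_original(runway_pixels)
```

A real implementation would also account for any cropping offsets introduced when tiling the image, which this sketch ignores.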
The GCAM-based high-resolution SAR image airport runway area automatic detection method of this embodiment is experimentally verified below. The experimental environment was as follows: CPU: Intel Xeon Gold 5120; GPU (single): NVIDIA RTX 2080 Ti. The data set uses SAR images from the Gaofen-3 (GF-3) system. First, a pixel extraction method is used to perform 5-fold down-sampling on 10 airport sample images; then LaberImage software is used for pixel labeling, with pixels divided into runway area and background. We arbitrarily cut the 10 down-sampled medium-resolution SAR images into images larger than 480 × 480 to make a small sample dataset, generating 466 images. The ratio of training set to validation set was 4:1. As shown in fig. 6, sub-images (a)-(c) are respectively the SAR image, the label and the optical remote sensing image of a certain airport sample; the connected region marked a is the runway area, which comprises the airstrip, taxiways, apron and airplanes; the remaining individual connected regions are background.
The parameters are set as follows: the learning rate in the network training process is set to 0.00001, and the weight decay coefficient is 0.995. The batch size of input pictures is 1, and network training runs for 100 epochs, with the model saved every 5 epochs. The input pictures are randomly cropped during training, with a random cropping window size of 480 × 480.
In this embodiment, PA (Pixel Accuracy) and IOU (Intersection over Union) are used as metrics for verifying runway extraction accuracy. PA represents the proportion of correctly labeled pixels among all pixels; IOU represents the ratio of the intersection to the union of the segmentation result and the label; MPA (Mean Pixel Accuracy) represents the average over classes of the proportion of correctly classified pixels in each class; MIOU (Mean Intersection over Union) represents the average of the IOU over all classes. Specifically:
Assuming a total of k + 1 classes (including one background class), the metrics are:

PA = Σ_{i=0}^{k} P_ii / Σ_{i=0}^{k} Σ_{j=0}^{k} P_ij

MPA = (1/(k+1)) · Σ_{i=0}^{k} ( P_ii / Σ_{j=0}^{k} P_ij )

MIOU = (1/(k+1)) · Σ_{i=0}^{k} P_ii / ( Σ_{j=0}^{k} P_ij + Σ_{j=0}^{k} P_ji − P_ii )

In the above formulas, P_ij represents the number of pixels that originally belong to class i but are predicted as class j (a false positive sample), P_ji represents the number of pixels that originally belong to class j but are predicted as class i (a false negative sample), and P_ii represents the number of correctly predicted pixels of class i.
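The PA, MPA and MIOU metrics can be computed from a confusion matrix as below; the two-class confusion matrix shown is a toy example, not experimental data from the embodiment:

```python
import numpy as np

def seg_metrics(conf):
    """Compute PA, MPA and MIOU from a (k+1) x (k+1) confusion matrix,
    where conf[i, j] is the number of pixels of true class i that were
    predicted as class j."""
    tp = np.diag(conf).astype(float)        # P_ii, correct pixels per class
    row = conf.sum(axis=1).astype(float)    # pixels belonging to each class
    col = conf.sum(axis=0).astype(float)    # pixels predicted as each class
    pa = tp.sum() / conf.sum()              # overall pixel accuracy
    mpa = np.mean(tp / row)                 # mean per-class pixel accuracy
    miou = np.mean(tp / (row + col - tp))   # mean intersection over union
    return pa, mpa, miou

# toy example with two classes: background (0) and runway (1)
conf = np.array([[90, 10],
                 [ 5, 95]])
pa, mpa, miou = seg_metrics(conf)
```

Note how false alarms (off-diagonal entries in a class's column) lower that class's IOU without affecting PA as strongly, which is the effect discussed for DeepLabV3+ in the results below.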
In order to verify the efficiency of SAR image airport runway area extraction, three groups of comparison experiments were performed, comparing the method of this embodiment with DeepLabV3+, RefineNet and MDDA. Three airports were used: airport I of 12000 × 15000, airport II of 9600 × 9600, and airport III of 15000 × 17500, none of which were used in data collection. MDDA is a previously proposed deep learning network suited to airport runway area extraction from SAR images, while DeepLabV3+ and RefineNet are mainstream semantic segmentation networks. The dataset used for the experiments was the manually annotated 466-image small sample dataset. Because the network outputs down-sampled medium-resolution images, the sizes of the down-sampled airports I, II and III are 2400 × 3000, 2000 × 2000 and 3000 × 3500, respectively. Finally, coordinate mapping is applied to the result map to directly obtain the result map before down-sampling. Network training time, picture testing time and runway area extraction accuracy before and after sampling are analyzed.
Figs. 7 to 9 show the airport runway area extraction results for airports I, II and III, respectively. In each figure, (a) is the high-resolution SAR image, (b) is the medium-resolution SAR image obtained by 5-fold down-sampling, and (c) is the category label of the runway area, where red is the runway area and black is the non-runway area, i.e. the background; (d) is the extraction result of RefineNet on the medium-resolution SAR image, (e) the extraction result of MDDA, (f) the extraction result of DeepLabV3+, and (g) the extraction result of the method (GCAM) of this embodiment; (h) is the fusion of the RefineNet result (d) with the medium-resolution SAR image (b), (i) the fusion of the MDDA result (e) with (b), (j) the fusion of the DeepLabV3+ result (f) with (b), and (k) the fusion of the result (g) of the method (GCAM) of this embodiment with (b); (l)-(o) are the fusions of the results (d)-(g), after coordinate mapping processing, with the high-resolution original image (a). The area marked 1 is the runway area; an area box marked 2 is a false detection box, i.e. background falsely detected as part of the runway area; an area box marked 3 is a missed detection box, i.e. a portion of the runway area that was not detected.
I. Airport I: experimental results and analysis.
As shown in sub-image (a) of fig. 7, airport I mainly comprises a large-area long runway area and an apron; the many airplanes in the airport show obvious bright airplane target spots. The background area contains a dense residential area and intricate traffic lines.
We tested the medium-resolution SAR image of airport I, with a test image size of 2400 × 3000. As shown in sub-images (d)-(g) of fig. 7, the extraction result of the method of this embodiment is closest to the label; MDDA does not completely extract parts of the runway area edge; DeepLabV3+ shows a small amount of missed detection in the runway area; the RefineNet extraction is least effective. In the visualized views (h)-(k) we mark the main missed detection boxes. The method of this embodiment has no large missed detection area; MDDA has 2 main missed detection areas; DeepLabV3+ has 4 obvious missed detection areas; RefineNet has the most false detection boxes and considerable edge missed detection, with the missed areas all lying at the edge of the runway area. Comparing the DeepLabV3+ result (j) with the result (k) of the method of this embodiment, it can be seen that the addition of the edge refinement module EDM in the method of this embodiment enhances the network's learning of edge features.
II. Airport II: experimental results and analysis.
Airport II has simpler features than airport I. The runway area of airport II is mainly composed of long straight runways; there are small building clusters near the airport edge area, no large residential areas, and several water areas around the runway area. Under synthetic aperture radar imaging, water presents the same dark black characteristics as a runway, which interferes with the network's ability to distinguish features.
We tested the medium-resolution SAR image of airport II, of size 2000 × 2000. Fig. 8 shows the runway area extraction for airport II. Comparing sub-images (d), (e), (f), (g) with (c) in fig. 8, it can be seen that the method of this embodiment has no false alarms and the best extraction effect. As can be seen from sub-images (h)-(k) in fig. 8, the method of this embodiment has only one small missed detection box; MDDA has 1 false detection box and 4 obvious missed detection boxes; DeepLabV3+ has more false alarms and the most missed detection boxes; RefineNet has several missed detection areas at the left edge of airport II; their extraction performance needs improvement. The edge extraction capability and false alarm removal capability of our method are the best, which also demonstrates the superiority of the multi-scale extrusion pyramid MSP in the method.
III. Airport III: experimental results and analysis.
The runway area structure and surrounding terrain of airport III are the most complex, with many runways, taxiways, rest stops and aprons. Airport III is a civil airport, and its runway area consists mostly of short runways without large-area long straight runways. In the SAR image the surrounding ground features appear mostly gray with bright spots, contrasting obviously with the characteristics of the airport runway area, which reduces the probability of network misjudgment. However, the edge features of airport III are complex and it has the most edge information, so the network is required to have good global semantic information learning capability and to decode the edge information effectively.
We tested the medium-resolution SAR image of airport III, of size 3000 × 3500. Comparing sub-images (d)-(g) in fig. 9, the method of this embodiment again has the best extraction effect, with only some small areas missed; MDDA has two significant false alarms; DeepLabV3+ has a large number of missed detections, indicating that its ability to learn edge information is not strong; RefineNet has a large number of false alarms and the worst extraction effect. This again demonstrates the effectiveness of the edge decoding of the method of this embodiment.
To demonstrate more intuitively the efficiency of our method for airport runway area extraction, table 1 shows the extraction accuracy of the medium-resolution SAR images of the three airports under the different algorithms. The average extraction accuracy of our method over the three airport runway areas reaches 0.9823, and the average IOU reaches 0.9665, higher than MDDA, DeepLabV3+ and RefineNet. According to table 1, the difference between the PA and IOU values for the same airport runway area is small for our method, indicating that it can extract the runway area almost completely and without false alarms. DeepLabV3+ is prone to false alarms, so its PA and IOU values for the same airport runway area differ noticeably, since false alarms reduce the IOU value of the runway area. The overall extraction effect of MDDA is acceptable, but it has defects in detail learning on a small sample data set. Both the PA and IOU values of RefineNet are the lowest.
Table 1: Analysis of the extraction accuracy of the different networks.
Table 2 gives the training times of the different algorithms on the small sample dataset and the test times for the medium-resolution SAR images of the three airports. According to table 2, in terms of training time on the small sample dataset, our network needs only about 2 hours. The training time of MDDA is the longest, nearly 8 hours; MDDA is clearly less effective at training on small samples than on large samples. The training times of DeepLabV3+ and RefineNet are almost the same as that of the method of this embodiment, but the accuracy differs considerably. In terms of the test times for the medium-resolution SAR images of the three airports, the smaller the picture size, the shorter the test time; the average test time of the method of this embodiment is only 16.95 s, the average test time of RefineNet is 16.69 s, the average test time of DeepLabV3+ is 15.89 s, and the test time of MDDA is approximately 2.5 times that of the method of this embodiment. The addition of the MSP and the EDM adds a certain number of parameters to the network, which is why the training and test times of the method of this embodiment are slightly longer than those of DeepLabV3+; the shorter the network training time and picture test time, the higher the efficiency in actual engineering. In summary, the method achieves high-accuracy and fast extraction on the small sample SAR image dataset, demonstrating its efficiency.
Table 2: data sets of different networks train time and test time of medium resolution airport images.
Therefore, the method can realize fast automatic extraction of the airport runway area from high-resolution SAR images. The network design is lightweight, greatly shortening network iteration time and reducing network training time and picture testing time. MSP enables the network to learn global features and encode effective features at multiple scales and in all directions; the parallel working mode of EDM and MSP enhances the learning between context and semantic information; and EDM enables the edge information to be completely decoded and extracted. Meanwhile, the network is well suited to training on small sample datasets: there is currently no large public SAR airport dataset for semantic segmentation, as such data can only be annotated manually, and the small-sample approach saves manual time and cost. In general, in terms of extraction accuracy, dataset training time and picture test time, the network is superior to the mainstream algorithm DeepLabV3+, and the performance of GCAM is superior to the previously proposed algorithm MDDA, realizing efficient automation.
In summary, in order to realize fast automatic extraction of airports from high-resolution SAR images, this embodiment provides an automatic airport runway area detection method based on the GCAM (geospatial context attention mechanism network), comprising three parts: down-sampling the original high-resolution SAR image, extracting the airport runway area with the GCAM, and coordinate-mapping the result image generated by the GCAM. The down-sampling allows a single training sample to contain more airport information, which is beneficial for making a small sample dataset; in the MSP, stripe pooling and four parallel convolutions work together so that features can be learned at multiple scales, and the eSE module screens out useful features; the EDM helps the network learn edge semantic information, and the coordinate mapping processing yields the extraction result on the original high-resolution SAR image. In the tests on three airport runway areas, our network performed best compared with DeepLabV3+, RefineNet and MDDA: MPA reaches 0.98 and MIOU reaches 0.96. In addition, the network's training time on the dataset is only 2.25 h, and the average image test time is only 16.94 s. From the extraction results, the GCAM has no false alarms and little missed detection, and can efficiently realize airport runway area extraction. Furthermore, the GCAM can improve detection efficiency in actual engineering; after the airport runway area is extracted, the detection range for subsequent airplane extraction can be narrowed, saving time.
In addition, the embodiment further provides a GCAM-based high-resolution SAR image airport runway area automatic detection system, which includes:
the down-sampling program unit is used for down-sampling the high-resolution SAR image to generate a medium-resolution image;
a runway area extraction program unit for inputting the medium resolution image into a geographic space context attention mechanism network GCAM to extract a runway area;
and the coordinate mapping program unit is used for carrying out coordinate mapping on the extracted runway area to obtain a detection result of the final high-resolution SAR image.
In addition, the embodiment also provides a GCAM-based high-resolution SAR image airport runway area automatic detection system, which comprises a computer device, wherein the computer device comprises a microprocessor and a memory which are connected with each other, the microprocessor is programmed or configured to execute the steps of the GCAM-based high-resolution SAR image airport runway area automatic detection method, or a computer program which is programmed or configured to execute the GCAM-based high-resolution SAR image airport runway area automatic detection method is stored in the memory.
Furthermore, the present embodiment also provides a computer-readable storage medium having stored therein a computer program programmed or configured to execute the aforementioned GCAM-based high resolution SAR image airport runway area automatic detection method.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application; instructions executed via a processor of a computer or other programmable data processing apparatus create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.
Claims (10)
1. A GCAM-based high-resolution SAR image airport runway area automatic detection method is characterized by comprising the following steps:
1) down-sampling the high-resolution SAR image to generate a medium-resolution image;
2) inputting the medium-resolution image into a geographic space context attention mechanism network GCAM to extract a runway area;
3) and carrying out coordinate mapping on the extracted runway area to obtain a final detection result of the high-resolution SAR image.
2. The GCAM-based high-resolution SAR image airport runway area automatic detection method according to claim 1, characterized in that, the down-sampling of the high-resolution SAR image in step 1) is specifically a 5-fold down-sampling processing of the SAR image by adopting a pixel value extraction method.
3. The GCAM-based high-resolution SAR image airport runway area automatic detection method according to claim 1, wherein the geospatial context attention mechanism network GCAM comprises a coding block and a decoding block; the coding block comprises a residual network ResNet, a multi-scale squeeze pyramid MSP and an edge refinement module EDM, wherein the residual network ResNet is used for performing feature extraction on the input data set to obtain preliminary features, the multi-scale squeeze pyramid MSP is used for obtaining global context information from the preliminary features at different resolutions through different pooling and convolution operations, and the edge refinement module EDM is used for enhancing the network's edge extraction capability on the preliminary features; the outputs of the multi-scale squeeze pyramid MSP and the edge refinement module EDM are then fused to obtain multi-level features; the decoding block is used for performing semantic segmentation of the airport runway area by combining the preliminary features and the multi-level features to extract the runway area.
4. The GCAM-based high-resolution SAR image airport runway area automatic detection method according to claim 3, wherein the residual network ResNet is a modified residual network obtained, on the basis of ResNet-101, by replacing ordinary two-dimensional convolutions with dilated (atrous) convolutions having dilation rates of 2, 4, 8 and 16.
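Dilated convolution enlarges the receptive field without adding parameters by spacing kernel taps `dilation` samples apart. The 1-D toy below is a stand-in to illustrate the mechanism behind claim 4's dilation rates of 2, 4, 8 and 16; it is not the patented ResNet layers.

```python
# Toy 1-D dilated ("atrous") convolution, valid padding: kernel taps are
# spaced `dilation` samples apart, widening the receptive field for free.
def dilated_conv1d(signal, kernel, dilation):
    span = (len(kernel) - 1) * dilation  # receptive field size minus 1
    return [sum(kernel[j] * signal[i + j * dilation]
                for j in range(len(kernel)))
            for i in range(len(signal) - span)]

x = list(range(10))
# With kernel [1, 1, 1] and dilation 2, each output sums x[i], x[i+2], x[i+4].
print(dilated_conv1d(x, [1, 1, 1], dilation=2))  # -> [6, 9, 12, 15, 18, 21]
```

A 3-tap kernel with dilation 16 covers 33 input samples, which is why stacking rates 2, 4, 8, 16 gives the encoder a large effective receptive field with no extra weights.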
5. The GCAM-based high-resolution SAR image airport runway area automatic detection method according to claim 3, wherein the multi-scale squeeze pyramid MSP comprises a multi-receptive-field parallel pooling layer and an effective attention module eSE, wherein the multi-receptive-field parallel pooling layer is built in parallel from a 1 × 1 convolution with a dilation rate of 1, three 3 × 3 convolutions with dilation rates of 6, 12 and 18 respectively, a global average pooling module GAP and a strip pooling module SP; for a two-dimensional feature tensor of input size H × W, the strip pooling module SP performs pooling in the horizontal direction with a 1 × W strip pooling window and pooling in the vertical direction with an H × 1 strip pooling window, averaging the element values inside each pooling kernel to obtain the horizontal and vertical strip pooling outputs; it then expands these two outputs in the left-right and up-down directions respectively with two one-dimensional convolutions so that the two expanded feature maps have the same size, fuses the two expanded feature maps, and finally multiplies the original data by the Sigmoid-processed data to obtain the H × W two-dimensional output tensor; the effective attention module eSE first learns a feature F_avg by global average pooling of the input feature map X_i, processes the feature F_avg through a fully connected layer to obtain a weight matrix W_C, rescales the weight matrix W_C through a Sigmoid function into the channel attention feature A_eSE, applies the channel attention feature A_eSE to the input feature map X_i to obtain a refined feature map X_refine, and finally performs feature re-screening on the refined feature map X_refine to obtain the global context information.
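The strip pooling of claim 5 can be sketched with plain lists: a 1 × W window averages each row, an H × 1 window averages each column, and the two strips are broadcast back to H × W and fused. This toy uses additive fusion and omits the one-dimensional convolutions and the Sigmoid gating; it only illustrates the pooling geometry.

```python
# Toy strip pooling on an H x W map: row means (1 x W window) and column
# means (H x 1 window), expanded back to H x W and fused by addition.
# The learned 1-D convolutions and Sigmoid gate of the SP module are omitted.
def strip_pool(feat):
    H, W = len(feat), len(feat[0])
    row_means = [sum(r) / W for r in feat]                         # H values
    col_means = [sum(feat[i][j] for i in range(H)) / H
                 for j in range(W)]                                # W values
    return [[row_means[i] + col_means[j] for j in range(W)]
            for i in range(H)]

print(strip_pool([[1, 2], [3, 4]]))  # -> [[3.5, 4.5], [5.5, 6.5]]
```

Long, thin strips match elongated structures such as runways better than square pooling windows, which is the motivation for using SP here.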
6. The GCAM-based high-resolution SAR image airport runway area automatic detection method according to claim 3, wherein the edge refinement module EDM comprises a global convolution module GCB and a boundary refinement module BR; the global convolution module GCB is used for strengthening the close coupling between feature maps and the pixel classification layer and for processing feature maps of different resolutions to obtain global information, and the boundary refinement module BR is used for enhancing the edge extraction capability of the coding block from the global information; the global convolution module GCB comprises a k × k large convolution kernel and a feature combination module, the k × k large convolution kernel comprising two paths, one path consisting of a k × 1 × c × c convolution followed by a 1 × k × c × c convolution, and the other path consisting of a 1 × k × c × c convolution followed by a k × 1 × c × c convolution, where c is the number of channels; the output results of the two paths are input together into the feature combination module to obtain the feature Sum_{W×H×C}; the boundary refinement module BR processes the feature Sum_{W×H×C} sequentially with a small convolution kernel, an activation function and another small convolution kernel, then superimposes the processing result onto the original feature Sum_{W×H×C}, finally obtaining a feature map with refined runway area edges.
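The point of GCB's two-path factorization is parameter economy: a dense k × k kernel over c input and c output channels costs k² · c² weights, while two separable paths of two one-dimensional convolutions each cost 4 · k · c². The arithmetic below is illustrative only.

```python
# Back-of-the-envelope weight counts for GCB's large-kernel factorization:
# dense k x k x c x c vs. two paths of (k x 1) + (1 x k) 1-D convolutions.
def gcb_params(k, c):
    full = k * k * c * c                       # single dense k x k kernel
    separable = 2 * (k * c * c + k * c * c)    # two paths, two 1-D convs each
    return full, separable

full, sep = gcb_params(k=7, c=64)
print(full, sep)  # -> 200704 114688
```

The saving grows with k, which is what makes large effective kernels (and hence dense global connections to the classification layer) affordable.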
7. The GCAM-based high-resolution SAR image airport runway area automatic detection method according to claim 3, wherein the decoding block performs 1 × 1 convolution dimensionality reduction on the output features of the coding block, performs edge information decoding and bilinear 4-fold upsampling on the feature map with refined runway area edges obtained by the edge refinement module EDM, concatenates the result with the preliminary features output by the residual network ResNet after 1 × 1 convolution dimensionality reduction and bilinear 4-fold upsampling, then applies a 3 × 3 convolution to the concatenated features to refine them, and finally performs a simple bilinear 4-fold upsampling to obtain the final segmentation result.
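The bilinear 4-fold upsampling in the decoder reduces, along one axis, to linear interpolation between neighbouring samples. The 1-D sketch below shows that operation; a full bilinear upsample applies it along rows and then columns. It is an illustrative analogue, not the decoder itself.

```python
# 1-D analogue of the decoder's bilinear 4x upsampling: insert `factor - 1`
# linearly interpolated samples between each pair of neighbours.
def upsample_linear(row, factor=4):
    out = []
    for a, b in zip(row, row[1:]):
        for t in range(factor):
            out.append(a + (b - a) * t / factor)
    out.append(row[-1])  # keep the final original sample
    return out

print(upsample_linear([0, 4], 4))  # -> [0.0, 1.0, 2.0, 3.0, 4]
```

Applying this twice per upsampling stage (rows, then columns) recovers segmentation maps at the input resolution without any learned parameters.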
8. A GCAM-based high-resolution SAR image airport runway area automatic detection system is characterized by comprising:
the down-sampling program unit is used for down-sampling the high-resolution SAR image to generate a medium-resolution image;
a runway area extraction program unit for inputting the medium resolution image into a geographic space context attention mechanism network GCAM to extract a runway area;
and the coordinate mapping program unit is used for carrying out coordinate mapping on the extracted runway area to obtain the final detection result of the high-resolution SAR image.
9. A GCAM-based high resolution SAR image airport runway area automatic detection system comprising a computer device comprising a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to perform the steps of the GCAM-based high resolution SAR image airport runway area automatic detection method according to any one of claims 1 to 7, or the memory has stored therein a computer program programmed or configured to perform the GCAM-based high resolution SAR image airport runway area automatic detection method according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon a computer program programmed or configured to perform the GCAM-based high resolution SAR image airport runway area automatic detection method of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010871235.3A CN112084901B (en) | 2020-08-26 | 2020-08-26 | GCAM-based high-resolution SAR image airport runway area automatic detection method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112084901A true CN112084901A (en) | 2020-12-15 |
CN112084901B CN112084901B (en) | 2024-03-01 |
Family
ID=73728710
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010871235.3A Active CN112084901B (en) | 2020-08-26 | 2020-08-26 | GCAM-based high-resolution SAR image airport runway area automatic detection method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112084901B (en) |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112528896A (en) * | 2020-12-17 | 2021-03-19 | 长沙理工大学 | SAR image-oriented automatic airplane target detection method and system |
CN112598003A (en) * | 2020-12-18 | 2021-04-02 | 燕山大学 | Real-time semantic segmentation method based on data expansion and full-supervision preprocessing |
CN112950477A (en) * | 2021-03-15 | 2021-06-11 | 河南大学 | High-resolution saliency target detection method based on dual-path processing |
CN113240040A (en) * | 2021-05-27 | 2021-08-10 | 西安理工大学 | Polarized SAR image classification method based on channel attention depth network |
CN113496221A (en) * | 2021-09-08 | 2021-10-12 | 湖南大学 | Point supervision remote sensing image semantic segmentation method and system based on depth bilateral filtering |
CN113567984A (en) * | 2021-07-30 | 2021-10-29 | 长沙理工大学 | Method and system for detecting artificial small target in SAR image |
CN113673417A (en) * | 2021-08-19 | 2021-11-19 | 中国商用飞机有限责任公司 | Method and system for assisting airplane ground taxiing based on image comparison |
CN113780241A (en) * | 2021-09-29 | 2021-12-10 | 北京航空航天大学 | Acceleration method and device for detecting salient object |
CN113887373A (en) * | 2021-09-27 | 2022-01-04 | 中关村科学城城市大脑股份有限公司 | Attitude identification method and system based on urban intelligent sports parallel fusion network |
CN114022751A (en) * | 2021-11-04 | 2022-02-08 | 中国人民解放军国防科技大学 | SAR target detection method, device and equipment based on feature refinement deformable network |
CN114066822A (en) * | 2021-10-27 | 2022-02-18 | 的卢技术有限公司 | Zebra crossing detection method based on deep learning |
CN114202733A (en) * | 2022-02-18 | 2022-03-18 | 青岛海信网络科技股份有限公司 | Video-based traffic fault detection method and device |
CN114387439A (en) * | 2022-01-13 | 2022-04-22 | 中国电子科技集团公司第五十四研究所 | Semantic segmentation network based on fusion of optical and PolSAR (polar synthetic Aperture Radar) features |
CN114820652A (en) * | 2022-04-07 | 2022-07-29 | 北京医准智能科技有限公司 | Method, device and medium for segmenting local quality abnormal region of mammary X-ray image |
CN114842206A (en) * | 2022-07-04 | 2022-08-02 | 江西师范大学 | Remote sensing image semantic segmentation model and method based on double-layer global convolution |
CN115131682A (en) * | 2022-07-19 | 2022-09-30 | 云南电网有限责任公司电力科学研究院 | Power grid distribution condition drawing method and system based on remote sensing image |
CN116343113A (en) * | 2023-03-09 | 2023-06-27 | 中国石油大学(华东) | Method and system for detecting oil spill based on polarized SAR characteristics and coding and decoding network |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105095914A (en) * | 2015-08-13 | 2015-11-25 | 中国民航大学 | Airport runway detection method based on combination of h/q decomposition and Bayesian iterative classification |
US20160131739A1 (en) * | 2007-09-06 | 2016-05-12 | Rockwell Collins, Inc. | Display system and method using weather radar sensing |
WO2019085905A1 (en) * | 2017-10-31 | 2019-05-09 | 北京市商汤科技开发有限公司 | Image question answering method, device and system, and storage medium |
CN110084249A (en) * | 2019-04-24 | 2019-08-02 | 哈尔滨工业大学 | The image significance detection method paid attention to based on pyramid feature |
CN110506278A (en) * | 2017-04-19 | 2019-11-26 | 西门子医疗有限公司 | Target detection in latent space |
CN110533045A (en) * | 2019-07-31 | 2019-12-03 | 中国民航大学 | A kind of luggage X-ray contraband image, semantic dividing method of combination attention mechanism |
CN111222474A (en) * | 2020-01-09 | 2020-06-02 | 电子科技大学 | Method for detecting small target of high-resolution image with any scale |
Worldwide Applications (1)
- 2020-08-26: CN CN202010871235.3A, granted as CN112084901B, Active
Non-Patent Citations (3)
Title |
---|
LIFU CHEN et al.: "A New Framework for Automatic Airports Extraction from SAR Images Using Multi-Level Dual Attention Mechanism", Remote Sensing, vol. 12, no. 3, p. 560 * |
SIYU TAN et al.: "Geospatial Contextual Attention Mechanism for Automatic and Fast Airport Detection in SAR Imagery", IEEE Access, vol. 8, pp. 173627-173640, XP011811750, DOI: 10.1109/ACCESS.2020.3024546 * |
TAN SIYU: "Research on Automatic Extraction Algorithms for Airport Runway Areas Based on High-Resolution SAR Images", China Master's Theses Full-text Database, Engineering Science and Technology II, no. 01, pp. 031-968 * |
Cited By (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112528896B (en) * | 2020-12-17 | 2024-05-31 | 长沙理工大学 | SAR image-oriented automatic aircraft target detection method and system |
CN112528896A (en) * | 2020-12-17 | 2021-03-19 | 长沙理工大学 | SAR image-oriented automatic airplane target detection method and system |
CN112598003A (en) * | 2020-12-18 | 2021-04-02 | 燕山大学 | Real-time semantic segmentation method based on data expansion and full-supervision preprocessing |
CN112598003B (en) * | 2020-12-18 | 2022-11-25 | 燕山大学 | Real-time semantic segmentation method based on data expansion and full-supervision preprocessing |
CN112950477B (en) * | 2021-03-15 | 2023-08-22 | 河南大学 | Dual-path processing-based high-resolution salient target detection method |
CN112950477A (en) * | 2021-03-15 | 2021-06-11 | 河南大学 | High-resolution saliency target detection method based on dual-path processing |
CN113240040A (en) * | 2021-05-27 | 2021-08-10 | 西安理工大学 | Polarized SAR image classification method based on channel attention depth network |
CN113567984A (en) * | 2021-07-30 | 2021-10-29 | 长沙理工大学 | Method and system for detecting artificial small target in SAR image |
CN113567984B (en) * | 2021-07-30 | 2023-08-22 | 长沙理工大学 | Method and system for detecting artificial small target in SAR image |
CN113673417A (en) * | 2021-08-19 | 2021-11-19 | 中国商用飞机有限责任公司 | Method and system for assisting airplane ground taxiing based on image comparison |
CN113496221A (en) * | 2021-09-08 | 2021-10-12 | 湖南大学 | Point supervision remote sensing image semantic segmentation method and system based on depth bilateral filtering |
CN113496221B (en) * | 2021-09-08 | 2022-02-01 | 湖南大学 | Point supervision remote sensing image semantic segmentation method and system based on depth bilateral filtering |
CN113887373A (en) * | 2021-09-27 | 2022-01-04 | 中关村科学城城市大脑股份有限公司 | Attitude identification method and system based on urban intelligent sports parallel fusion network |
CN113780241A (en) * | 2021-09-29 | 2021-12-10 | 北京航空航天大学 | Acceleration method and device for detecting salient object |
CN113780241B (en) * | 2021-09-29 | 2024-02-06 | 北京航空航天大学 | Acceleration method and device for detecting remarkable object |
CN114066822A (en) * | 2021-10-27 | 2022-02-18 | 的卢技术有限公司 | Zebra crossing detection method based on deep learning |
CN114022751A (en) * | 2021-11-04 | 2022-02-08 | 中国人民解放军国防科技大学 | SAR target detection method, device and equipment based on feature refinement deformable network |
CN114022751B (en) * | 2021-11-04 | 2024-03-05 | 中国人民解放军国防科技大学 | SAR target detection method, device and equipment based on feature refinement deformable network |
CN114387439B (en) * | 2022-01-13 | 2023-09-12 | 中国电子科技集团公司第五十四研究所 | Semantic segmentation network based on optical and PolSAR feature fusion |
CN114387439A (en) * | 2022-01-13 | 2022-04-22 | 中国电子科技集团公司第五十四研究所 | Semantic segmentation network based on fusion of optical and PolSAR (polar synthetic Aperture Radar) features |
CN114202733A (en) * | 2022-02-18 | 2022-03-18 | 青岛海信网络科技股份有限公司 | Video-based traffic fault detection method and device |
CN114820652B (en) * | 2022-04-07 | 2023-05-23 | 北京医准智能科技有限公司 | Method, device and medium for segmenting partial quality abnormal region of mammary gland X-ray image |
CN114820652A (en) * | 2022-04-07 | 2022-07-29 | 北京医准智能科技有限公司 | Method, device and medium for segmenting local quality abnormal region of mammary X-ray image |
CN114842206A (en) * | 2022-07-04 | 2022-08-02 | 江西师范大学 | Remote sensing image semantic segmentation model and method based on double-layer global convolution |
CN115131682A (en) * | 2022-07-19 | 2022-09-30 | 云南电网有限责任公司电力科学研究院 | Power grid distribution condition drawing method and system based on remote sensing image |
CN116343113A (en) * | 2023-03-09 | 2023-06-27 | 中国石油大学(华东) | Method and system for detecting oil spill based on polarized SAR characteristics and coding and decoding network |
Also Published As
Publication number | Publication date |
---|---|
CN112084901B (en) | 2024-03-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112084901A (en) | GCAM-based high-resolution SAR image airport runway area automatic detection method and system | |
CN108921875B (en) | Real-time traffic flow detection and tracking method based on aerial photography data | |
CN111598030B (en) | Method and system for detecting and segmenting vehicle in aerial image | |
Chen et al. | Vehicle detection in high-resolution aerial images via sparse representation and superpixels | |
Zhang et al. | An empirical study of multi-scale object detection in high resolution UAV images | |
Chen et al. | Vehicle detection in high-resolution aerial images based on fast sparse representation classification and multiorder feature | |
Alidoost et al. | A CNN-based approach for automatic building detection and recognition of roof types using a single aerial image | |
Zhang et al. | A longitudinal scanline based vehicle trajectory reconstruction method for high-angle traffic video | |
Wan et al. | A novel neural network model for traffic sign detection and recognition under extreme conditions | |
CN111160205A (en) | Embedded multi-class target end-to-end unified detection method for traffic scene | |
CN113723377A (en) | Traffic sign detection method based on LD-SSD network | |
Li et al. | An aerial image segmentation approach based on enhanced multi-scale convolutional neural network | |
Hu | Intelligent road sign inventory (IRSI) with image recognition and attribute computation from video log | |
Tan et al. | Geospatial contextual attention mechanism for automatic and fast airport detection in SAR imagery | |
CN115527133A (en) | High-resolution image background optimization method based on target density information | |
Zhang et al. | Vehicle detection in UAV aerial images based on improved YOLOv3 | |
CN112785610B (en) | Lane line semantic segmentation method integrating low-level features | |
Sarlin et al. | Snap: Self-supervised neural maps for visual positioning and semantic understanding | |
CN114463205A (en) | Vehicle target segmentation method based on double-branch Unet noise suppression | |
Cao et al. | UAV small target detection algorithm based on an improved YOLOv5s model | |
CN111881914B (en) | License plate character segmentation method and system based on self-learning threshold | |
Ng et al. | Scalable Feature Extraction with Aerial and Satellite Imagery. | |
CN113361528A (en) | Multi-scale target detection method and system | |
CN113177956A (en) | Semantic segmentation method for unmanned aerial vehicle remote sensing image | |
Kamenetsky et al. | Aerial car detection and urban understanding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||