CN109934153A - Building extraction method based on gated deep residual optimization network - Google Patents

Building extraction method based on gated deep residual optimization network

Info

Publication number
CN109934153A
Authority
CN
China
Prior art keywords
image
building
feature
depth residual
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910175523.2A
Other languages
Chinese (zh)
Other versions
CN109934153B (en)
Inventor
黄健锋
张新长
辛秦川
孙颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201910175523.2A
Publication of CN109934153A
Application granted
Publication of CN109934153B
Legal status: Active
Anticipated expiration


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The embodiment of the invention discloses a building extraction method based on a gated deep residual optimization network, comprising: obtaining an image feature combination of an aerial image and airborne LiDAR point cloud data; enhancing the diversity of the image samples by random cropping, rotation, flipping and brightness adjustment; automatically learning multi-level image features with an improved deep residual convolutional neural network to obtain a coarse building extraction result; and performing effective feature screening and fusion with gated feature labeling units, obtaining a high-quality building extraction result through progressive upsampling. By implementing the embodiments of the invention, a feature-information gating mechanism is combined with a deep residual convolutional neural network for building extraction from aerial images and airborne LiDAR point cloud data.

Description

Building extraction method based on gated deep residual optimization network
Technical field
The present invention relates to the technical field of computer vision, and in particular to a building extraction method based on a gated deep residual optimization network.
Background technique
Automatically obtaining building information from remote sensing data plays an important role in topographic map revision, digital city construction, urban sprawl analysis, population estimation, environmental investigation and the like. However, extracting building information accurately and automatically from remote sensing images has always been one of the more difficult problems in remote sensing and computer vision, mainly because: 1) buildings in most scenes, especially in developed urban areas, have diverse shapes and roof materials, so the differences in spectral reflectance are large, and buildings are easily occluded by the shadows of nearby high-rise buildings and tall trees; 2) in high-resolution remote sensing images the intra-class variance is large and the inter-class variance is small, which makes the spectral and geometric characteristics of buildings complicated.
To address this problem, many researchers fuse the spectral information of aerial images with the three-dimensional elevation information of airborne LiDAR point cloud data to obtain higher-accuracy building extraction results. However, such methods have the following limitations. First, most of them distinguish building pixels from non-building pixels using low-level image features, which usually requires certain threshold settings or rules, so the methods lack generality. Second, many algorithms first pre-segment the image before extracting buildings, and the result depends heavily on the segmentation parameters and is easily affected by imaging conditions such as solar radiation, shadows and even random noise.
Recent research has shown that deep convolutional neural networks (CNNs) achieve very good results in processing remote sensing images, for example in scene classification and object detection. CNNs can automatically learn not only low- and mid-level image features but also high-level semantic features from the original image. Their variant, the fully convolutional network (FCN), has become the mainstream framework for image semantic segmentation since it was proposed. An FCN is an end-to-end neural network, usually with an encoder-decoder structure, that can label every pixel of the segmented image simultaneously, avoiding manual feature design and pre-segmentation. Nevertheless, building extraction methods based on FCNs still have the following problems that need to be solved:
(1) An FCN uses a CNN as its encoder for image feature extraction. Although the output contains high-level semantic features, it is too coarse and easily loses the edge detail information of the image; in building extraction, for example, rich low-level features such as building edges and right angles are easily ignored.
(2) Although passing low-level features to the FCN decoder through "skip" connections or the max-pooling indices can optimize the segmentation result, this easily produces redundant features and reduces the learning efficiency of the network. In addition, the transferred features usually contain class uncertainty or boundary-irrelevant information, which affects the optimization of the classification result.
Summary of the invention
It is an object of the present invention to overcome the deficiencies of the prior art. The present invention combines a feature-information gating mechanism with a deep residual convolutional neural network for building extraction from aerial images and airborne LiDAR point cloud data.
To solve the above problems, the invention proposes a building extraction method based on a gated deep residual optimization network, comprising the following steps:
obtaining an image feature combination of an aerial image and airborne LiDAR point cloud data;
enhancing the diversity of image samples by random cropping, rotation, flipping and brightness adjustment;
automatically learning multi-level image features with an improved deep residual convolutional neural network to obtain a coarse building extraction result;
performing effective feature screening and fusion with gated feature labeling units, and obtaining a high-quality building extraction result through progressive upsampling.
The LiDAR point cloud data in the aerial image and airborne LiDAR point cloud data is a normalized digital surface model, and the aerial image includes three bands: red, green and near-infrared.
Obtaining the image feature combination of the aerial image and the airborne LiDAR point cloud data includes:
eliminating outliers in the airborne LiDAR point cloud;
separating ground points and non-ground points of the point cloud;
extracting a digital elevation model and a digital surface model by natural neighbor interpolation, and taking the difference between the two as the normalized digital surface model;
stacking the normalized digital surface model, at the same spatial resolution, with the red, green and near-infrared bands of the aerial image.
Enhancing the diversity of image samples by random cropping, rotation, flipping and brightness adjustment includes:
labeling the buildings within the coverage of the data set as vectors using automatic or semi-automatic map vectorization, and rasterizing the labeled vector polygons into a binary label image, in which 0 denotes a non-building pixel and 1 denotes a building pixel;
cutting the label image and the original feature-combination image into image pairs of 480 × 480 pixels, dividing them into a training set, a validation set and a test set at a ratio of 60%, 20% and 20%, and supplying them to the convolutional neural network for training and validation;
during training, applying random cropping, rotation, flipping and brightness adjustment to the image pairs input to the network, so that the images fed into the neural network differ in combination each time, which increases the diversity of the training sample set and prevents the network from overfitting during training.
The convolutional neural network has an encoder-decoder structure, in which:
the encoder is composed of the improved deep residual convolutional neural network and automatically learns the low-, mid- and high-level features of the input image; through multiple convolution operations or max-pooling operations it produces a coarse building classification image whose size is 1/32 of the original image;
the decoder is mainly composed of multiple gated feature labeling units and upsampling layers with bilinear interpolation; the upsampling layers progressively apply 2× upsampling to the coarse building classification image produced by the gated feature labeling units, finally yielding a building classification result of the same size as the original image.
The deep residual convolutional neural network is based on the ResNet-50 convolutional neural network structure, in which:
the ResNet-50 convolutional neural network consists of an input layer, a convolution module (convolutional layer + batch normalization layer + ReLU activation layer + max-pooling layer), four structurally similar residual modules and a classification module (average-pooling layer + fully connected layer + Softmax classification layer);
each residual module is composed of one projection convolution block and several consecutive identity convolution blocks; the projection convolution block doubles the number of feature maps and shrinks the feature maps by a factor of 1/2, while the identity convolution blocks change neither the size nor the number of the input and output feature maps;
after the first convolution module the input is reduced to 1/2 of its original size, and each time it passes through a residual module the image size is further halved, finally producing feature maps whose size is 1/32 of the original image.
Automatically learning multi-level image features with the improved deep residual convolutional neural network to obtain the coarse building extraction result involves the following improvements:
1) a new convolution module (convolutional layer + batch normalization layer + ReLU activation layer) is inserted before the first convolution module of the deep residual convolutional neural network; this module accepts multi-band image input and outputs 64 feature maps of the same size as the original image;
2) the original first convolution module is modified to accept 64 input features;
3) the classification module is removed and a convolutional layer is added whose output has two bands, representing the coarse building extraction result.
The improved deep residual convolutional neural network accepts images with multiple bands (not limited to three), retains the image-feature self-learning ability of the original ResNet-50, obtains low-, mid- and high-level image features through nonlinear operations and repeated downsampling, and outputs a coarse building classification result whose size is 1/32 of the original image.
The gated feature labeling unit has two kinds of feature input: one is a higher-level, smaller-sized feature input, and the other is a lower-level, larger-sized feature input.
For the high-level feature input, the gated feature labeling unit first applies a convolution and then performs 2× upsampling;
for the low-level feature input, it applies a convolution and batch normalization without upsampling.
The results of the two processed feature inputs are combined by element-wise (dot-product) multiplication followed by batch normalization, and then passed into the decoder.
The coarse building classification result obtained from the residual network is upsampled by a factor of 2, merged with the features passed to the decoder, and processed by a convolutional layer and a ReLU activation layer to obtain a building classification result twice the original size.
Performing effective feature screening and fusion with the gated feature labeling units and obtaining the high-quality building extraction result through progressive upsampling comprises:
the gated feature labeling unit is used five times in the encoder; in the feature screening stage it fuses features with higher classification confidence with features richer in edge information, and the number of features screened out depends on the feature level at which the unit is located, namely 4, 8, 12, 16 and 20 features in turn;
in each upsampling step of the decoder, the gated feature labeling unit merges the screened result with the upsampled coarse building classification result to obtain a coarse classification result twice the original size;
at the same time, the gated feature labeling unit passes the screened result to the next gated feature labeling unit as the high-level feature input of that unit;
by reusing the five gated feature labeling units in this way, a high-quality building extraction result of the same size as the original image is obtained.
In the embodiments of the present invention, a feature-information gating mechanism is combined with a deep residual convolutional neural network for building extraction from aerial images and airborne LiDAR point cloud data. An image feature combination relevant to building recognition is first obtained from the airborne LiDAR point cloud data and the high-resolution aerial image; sample diversity is then increased by processing such as image rotation, flipping and brightness adjustment; the result is fed into the improved deep residual network encoder, which automatically learns multi-level features from the multi-source input image; effective feature screening and fusion are performed with the gated feature labeling units; and a high-quality building extraction result is obtained through progressive upsampling. Compared with other classification methods, the proposed method effectively improves the overall accuracy of building extraction, showing that this combined approach is an effective solution for building extraction by fusing aerial imagery with LiDAR point clouds.
Detailed description of the invention
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a flowchart of the building extraction method based on the gated deep residual optimization network in an embodiment of the present invention;
Fig. 2 is an example diagram of the building extraction method based on the gated deep residual optimization network in an embodiment of the present invention;
Fig. 3 is an architecture diagram of the gated deep residual optimization network in an embodiment of the present invention;
Fig. 4 is an example diagram of the gated feature labeling unit in an embodiment of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
The present invention is composed of a feature-information gating mechanism and a deep residual convolutional neural network. It first obtains the feature combination of multi-source remote sensing data (including aerial images and airborne LiDAR point cloud data), increases training sample diversity with image augmentation, then obtains low-, mid- and high-level image features with the improved deep residual encoder, and finally performs effective feature screening and fusion with the gated feature labeling units, obtaining a high-quality building extraction result through progressive upsampling. An airborne laser radar (LiDAR, Light Detection and Ranging) system directly and quickly acquires dense, high-precision three-dimensional point coordinates of the ground surface by emitting and receiving laser pulses; these data are referred to as airborne LiDAR point cloud data.
Fig. 1 shows the flowchart of the building extraction method based on the gated deep residual optimization network in an embodiment of the present invention, which comprises the following steps:
S101: obtain the image feature combination of the aerial image and the airborne LiDAR point cloud data.
The aerial image and the airborne LiDAR point cloud are the two data sources used in the embodiment of the present invention; the image features of the two are fused as the key input for building extraction. The LiDAR point cloud data here is the normalized digital surface model (nDSM) obtained after rasterization, and the point cloud is processed as follows:
(1) eliminate outliers in the airborne LiDAR point cloud;
(2) separate ground points and non-ground points of the point cloud;
(3) extract the digital elevation model (DEM) and the digital surface model (DSM) by natural neighbor interpolation;
(4) take the difference between the DSM and the DEM as the normalized digital surface model.
The spatial resolution of the normalized digital surface model is required to be consistent with that of the aerial image. Finally, the red (R), green (G) and near-infrared (NIR) bands of the aerial image are extracted and stacked with the normalized digital surface model to form the input of the convolutional neural network (NIR-R-G-nDSM).
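For illustration only, the following sketch builds the NIR-R-G-nDSM input stack, assuming the DEM, DSM and the three aerial bands have already been rasterized as co-registered NumPy arrays at the same spatial resolution; the function name and the per-band rescaling are illustrative choices, not part of the patent.

import numpy as np

def build_input_stack(nir, red, green, dsm, dem):
    """Stack NIR-R-G-nDSM into a single (4, H, W) array for the network.

    All inputs are assumed to be co-registered 2-D float arrays with the
    same spatial resolution (hypothetical preprocessing, for illustration).
    """
    ndsm = dsm - dem                      # normalized digital surface model
    ndsm = np.clip(ndsm, 0.0, None)       # treat negative heights as ground level

    def rescale(band):
        # simple min-max rescaling to [0, 1]; the patent does not fix a scheme
        lo, hi = float(band.min()), float(band.max())
        return (band - lo) / (hi - lo + 1e-8)

    stack = np.stack([rescale(nir), rescale(red), rescale(green), rescale(ndsm)])
    return stack.astype(np.float32)       # shape (4, H, W): NIR-R-G-nDSM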
S102: enhance the diversity of the image samples by random cropping, rotation, flipping and brightness adjustment.
Training a deep convolutional neural network requires a large amount of image data with diverse characteristics, so the training data set needs to be labeled and augmented.
The buildings within the coverage of the data set are labeled as vectors using automatic or semi-automatic map vectorization, and the labeled vector polygons are rasterized into a binary label image, in which 0 denotes a non-building pixel and 1 denotes a building pixel.
The label image and the original feature-combination image are cut into image pairs of 480 × 480 pixels (as shown in Fig. 2), divided into a training set, a validation set and a test set at a ratio of 60%, 20% and 20%, and supplied to the convolutional neural network for training and validation.
During training, random cropping, rotation, flipping and brightness adjustment are applied to the image pairs input to the network, so that the images fed into the neural network differ in combination each time; this increases the diversity of the training sample set and prevents the network from overfitting during training.
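A minimal sketch of this joint augmentation of an image/label pair is given below, assuming NumPy arrays as in the preprocessing sketch above; the rotation set, flip probability and brightness range are illustrative assumptions, not values fixed by the patent.

import numpy as np

def augment_pair(image, label, crop=480, rng=np.random):
    """Randomly crop, rotate, flip and brightness-adjust an image/label pair.

    image: (C, H, W) float array (NIR-R-G-nDSM); label: (H, W) binary array.
    """
    _, h, w = image.shape
    top = rng.randint(0, h - crop + 1)          # random crop window
    left = rng.randint(0, w - crop + 1)
    image = image[:, top:top + crop, left:left + crop]
    label = label[top:top + crop, left:left + crop]

    k = rng.randint(0, 4)                       # rotate by 0/90/180/270 degrees
    image = np.rot90(image, k, axes=(1, 2))
    label = np.rot90(label, k)

    if rng.rand() < 0.5:                        # random horizontal flip
        image = image[:, :, ::-1]
        label = label[:, ::-1]

    image = image.copy()
    gain = rng.uniform(0.8, 1.2)                # brightness adjustment (spectral bands only)
    image[:3] = np.clip(image[:3] * gain, 0.0, 1.0)

    return image, np.ascontiguousarray(label)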
S103: automatically learn multi-level image features with the improved deep residual convolutional neural network to obtain a coarse building extraction result.
First, the basic elements of a convolutional neural network are briefly introduced.
A convolutional neural network (CNN) is usually composed of several convolution modules, fully connected layers and a loss layer. A convolution module consists of a convolutional layer, a nonlinear activation layer, a pooling layer and a batch normalization layer. The convolutional layer is composed of several convolution units (neurons) whose parameters are optimized by the back-propagation algorithm of the network; its function is to extract different features of the input, such as edges, right angles and texture. Given a feature map X_{l-1} as the input of convolutional layer l, the k-th filter W_l^k processes the input feature map according to formula (1) to obtain the output feature map:
X_l^k = W_l^k * X_{l-1} + b_l^k    (1)
where X_l^k is the feature map obtained after the convolution operation, * denotes convolution, and b_l^k is the k-th bias vector of layer l. Formula (1) obtains the feature response of each neuron while greatly reducing the number of network parameters.
The batch normalization (Batch Norm, BN) layer prevents the neural network from suffering vanishing or exploding gradients. In the BN layer, each input batch is normalized and then transformed as follows:
y_l = γ_l · x̂_l + β_l    (2)
where y_l is the result after scaling and shifting, x̂_l is the normalized input, γ_l is the scale parameter and β_l is the shift parameter. The normalization of formula (2) concentrates all inputs around 0, so that the input of each layer does not vary too much.
The activation layer controls the activation level of the neurons in the forward pass. Taking the output of the BN layer as input, a rectified linear unit (ReLU) activation function performs a nonlinear mapping of the input features.
The pooling layer is mainly used to abstract the input features; max pooling or average pooling is usually used to obtain a downsampled feature map.
Each node of a fully connected layer is connected to all nodes of the preceding convolution module and compresses the extracted feature maps into a vector of a specified dimension. Because of this full connectivity, the fully connected layer generally has the most parameters.
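To make the composition of such a convolution module concrete, the following minimal PyTorch sketch stacks a convolutional layer, a batch normalization layer, a ReLU activation and a max-pooling layer; the channel counts and kernel size are placeholders rather than values fixed by the patent.

import torch
import torch.nn as nn

class ConvModule(nn.Module):
    """Convolutional layer + batch normalization + ReLU + max pooling,
    i.e. the basic block described by formulas (1) and (2)."""

    def __init__(self, in_ch, out_ch, kernel=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel, padding=kernel // 2, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)   # learns the scale gamma and shift beta of formula (2)
        self.relu = nn.ReLU(inplace=True)
        self.pool = nn.MaxPool2d(2)        # 2x downsampling of the feature map

    def forward(self, x):
        return self.pool(self.relu(self.bn(self.conv(x))))

# e.g. a 4-band NIR-R-G-nDSM patch of 480 x 480 pixels
feat = ConvModule(4, 64)(torch.randn(1, 4, 480, 480))   # -> (1, 64, 240, 240)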
Next, the basic working principle of the improved deep residual network module is explained (see Fig. 2 and Fig. 3).
The augmented training samples are input into the deep convolutional neural network, which is a fully convolutional network (FCN) with an encoder-decoder structure. The encoder is mainly responsible for learning the low-, mid- and high-level features of the input image and is composed of the improved deep residual network ResNet-50. The original ResNet-50 consists of an input layer, a convolution module (convolutional layer + batch normalization layer + ReLU activation layer + max-pooling layer), four structurally similar residual modules and a classification module (average-pooling layer + fully connected layer + Softmax classification layer). Specifically, each residual module is composed of one projection convolution block and several consecutive identity convolution blocks; the projection convolution block doubles the number of feature maps and shrinks the feature maps by a factor of 1/2, while the identity convolution blocks change neither the size nor the number of the input and output feature maps. After the first convolution module the input is reduced to 1/2 of its original size, and each time it passes through a residual module the image size is further halved, finally producing feature maps whose size is 1/32 of the original image.
On the basis of the original ResNet-50 model, the following improvements are made to suit the building extraction task and to accommodate multi-band image input:
(1) a new convolution module (convolutional layer + batch normalization layer + ReLU activation layer) is inserted before the first convolution module of the deep residual convolutional neural network; this module accepts multi-band image input and outputs 64 feature maps of the same size as the original image;
(2) the original first convolution module is modified to accept 64 input features;
(3) the classification module is removed and a convolutional layer is added whose output has two bands, representing the coarse building extraction result.
The improved deep residual convolutional neural network accepts images with multiple bands (not limited to three), retains the image-feature self-learning ability of the original ResNet-50, obtains low-, mid- and high-level image features through nonlinear operations and repeated downsampling, and outputs a coarse building classification result whose size is 1/32 of the original image.
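As a concrete illustration of these three modifications, the following PyTorch sketch adapts torchvision's ResNet-50; the choice of torchvision as the backbone, the 3 × 3 stem kernel and the layer names are assumptions for illustration, not the patented implementation. (In torchvision's ResNet-50 the first convolution module already reduces the input to 1/4 and the first residual module keeps that size, so the intermediate scales differ slightly from the schematic description above, but the final 1/32-resolution, two-band output is the same.)

import torch
import torch.nn as nn
from torchvision.models import resnet50

class GatedResidualEncoder(nn.Module):
    """ResNet-50 encoder adapted for multi-band input and a coarse 2-band output."""

    def __init__(self, in_bands=4):
        super().__init__()
        backbone = resnet50()  # randomly initialized backbone
        # (1) new convolution module in front: multi-band input -> 64 maps, same size
        self.stem = nn.Sequential(
            nn.Conv2d(in_bands, 64, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
        )
        # (2) original first convolution module now takes 64 input features
        backbone.conv1 = nn.Conv2d(64, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.conv1 = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool)
        self.layer1, self.layer2 = backbone.layer1, backbone.layer2
        self.layer3, self.layer4 = backbone.layer3, backbone.layer4
        # (3) classification module removed; a convolution outputs two bands
        self.classifier = nn.Conv2d(2048, 2, kernel_size=1)

    def forward(self, x):
        x = self.stem(x)           # full resolution, 64 maps
        x = self.conv1(x)          # 1/4,  64 maps
        x = self.layer1(x)         # 1/4,  256 maps
        x = self.layer2(x)         # 1/8,  512 maps
        x = self.layer3(x)         # 1/16, 1024 maps
        x = self.layer4(x)         # 1/32, 2048 maps
        return self.classifier(x)  # coarse 2-band building map at 1/32 resolution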
S104: perform effective feature screening and fusion with the gated feature labeling units, and obtain a high-quality building extraction result through progressive upsampling.
A fully convolutional network performs end-to-end, pixel-wise image segmentation and labeling, with the encoder and decoder trained and predicting jointly. Starting from the coarse building classification result whose size is 1/32 of the original image, a high-resolution building classification result is obtained by designing an effective decoder structure.
First, how the gated feature labeling unit (see Fig. 3) works between the encoder and the decoder is introduced.
The gated feature labeling unit is responsible for screening the encoder features and passing them to the decoder. In the improved ResNet-50, five gated feature labeling units are embedded: between the first convolution module and the second convolution module, between the second convolution module and the first residual module, and between each pair of adjacent residual modules.
The gated feature labeling unit (see Fig. 4) has two kinds of feature input: the low-level encoder feature X_low and the high-level encoder feature X_high. X_low has a larger size and a smaller receptive field, while X_high has a smaller size and a larger receptive field. The two features are screened and fused in the following way:
X_g = BN( UP(C_c(X_high)) ⊙ BN(C_c(X_low)) )    (3)
where X_g is the result obtained after screening the two types of features, c denotes the number of output bands, and BN(·), UP(·), C_c(·) and ⊙ respectively denote batch normalization, 2× upsampling, a 3 × 3 convolution with c output bands, and element-wise (dot-product) multiplication. Thereafter, X_g is delivered to the decoder and fused with the coarse building classification result as follows:
X_f = ReLU( CONCAT( X_g, UP(R_l) ) )    (4)
where X_f is the fused feature with c + 2 bands, ReLU(·) and CONCAT(a, b) respectively denote the nonlinear activation and the band-wise merging operation, and R_l is the coarse building classification result (two bands) produced by the previous decoder step.
To obtain a larger building classification result, the fused feature is convolved into the classification result of this upsampling step:
R_{l+1} = C_2( X_f )    (5)
where R_{l+1} is the higher-resolution result obtained after this upsampling step.
The number of features output by the gated feature labeling units differs by position, namely 4, 8, 12, 16 and 20 features in turn (see Fig. 3). At the same time, each gated feature labeling unit also passes its screened result to the next gated feature labeling unit as the high-level feature input of that unit. By reusing the five gated feature labeling units in this way, a high-quality building extraction result of the same size as the original image is obtained.
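The sketch below illustrates one gated feature labeling unit and one decoder upsampling step in PyTorch, following formulas (3) to (5) as reconstructed above; the module names, channel arguments and the bilinear upsampling call are illustrative assumptions rather than the patented implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedFeatureLabelingUnit(nn.Module):
    """Screens a low-level and a high-level encoder feature, formula (3)."""

    def __init__(self, low_ch, high_ch, out_ch):
        super().__init__()
        self.conv_low = nn.Conv2d(low_ch, out_ch, 3, padding=1)
        self.bn_low = nn.BatchNorm2d(out_ch)
        self.conv_high = nn.Conv2d(high_ch, out_ch, 3, padding=1)
        self.bn_out = nn.BatchNorm2d(out_ch)

    def forward(self, x_low, x_high):
        high = F.interpolate(self.conv_high(x_high), scale_factor=2,
                             mode='bilinear', align_corners=False)  # convolution, then 2x upsampling
        low = self.bn_low(self.conv_low(x_low))                      # convolution + batch normalization
        return self.bn_out(high * low)                               # dot product + batch normalization

class DecoderStep(nn.Module):
    """Fuses the screened feature with the coarse result, formulas (4) and (5)."""

    def __init__(self, gate_ch):
        super().__init__()
        self.conv = nn.Conv2d(gate_ch + 2, 2, 3, padding=1)

    def forward(self, x_gate, coarse):
        up = F.interpolate(coarse, scale_factor=2, mode='bilinear',
                           align_corners=False)           # 2x upsampling of the 2-band coarse map
        fused = F.relu(torch.cat([x_gate, up], dim=1))     # concatenation -> c + 2 bands
        return self.conv(fused)                            # refined 2-band result at twice the size

Chaining five such units, setting out_ch to 4, 8, 12, 16 and 20 in turn and feeding each unit's screened output to the next unit as its high-level input, would reproduce the progressive 2× refinements from the 1/32-resolution coarse map back to the original image size.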
In summary, in the embodiments of the present invention a feature-information gating mechanism is combined with a deep residual convolutional neural network for building extraction from aerial images and airborne LiDAR point cloud data. An image feature combination relevant to building recognition is first obtained from the airborne LiDAR point cloud data and the high-resolution aerial image; sample diversity is then increased by processing such as image rotation, flipping and brightness adjustment; the result is fed into the improved deep residual network encoder, which automatically learns multi-level features from the multi-source input image; effective feature screening and fusion are performed with the gated feature labeling units; and a high-quality building extraction result is obtained through progressive upsampling. Compared with other classification methods, the proposed method effectively improves the overall accuracy of building extraction, showing that this combined approach is an effective solution for building extraction by fusing aerial imagery with LiDAR point clouds.
Those of ordinary skill in the art will understand that all or part of the steps in the various methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and the storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc and the like.
The building extraction method based on the gated deep residual optimization network provided by the embodiments of the present invention has been described in detail above. Specific examples are used herein to illustrate the principle and implementation of the present invention, and the description of the embodiments is only intended to help understand the method of the present invention and its core idea. Meanwhile, those of ordinary skill in the art may make changes to the specific implementation and application scope according to the idea of the present invention. In summary, the content of this description should not be construed as limiting the present invention.

Claims (9)

1. A building extraction method based on a gated deep residual optimization network, characterized by comprising the following steps:
obtaining an image feature combination of an aerial image and airborne LiDAR point cloud data;
enhancing the diversity of image samples by random cropping, rotation, flipping and brightness adjustment;
automatically learning multi-level image features with an improved deep residual convolutional neural network to obtain a coarse building extraction result;
performing effective feature screening and fusion with gated feature labeling units, and obtaining a high-quality building extraction result through progressive upsampling.
2. The building extraction method based on a gated deep residual optimization network according to claim 1, characterized in that the LiDAR point cloud data in the aerial image and airborne LiDAR point cloud data is a normalized digital surface model, and the aerial image includes three bands: red, green and near-infrared.
3. The building extraction method based on a gated deep residual optimization network according to claim 2, characterized in that obtaining the image feature combination of the aerial image and the airborne LiDAR point cloud data comprises:
eliminating outliers in the airborne LiDAR point cloud; separating ground points and non-ground points of the point cloud; extracting a digital elevation model and a digital surface model by natural neighbor interpolation, and obtaining the difference between the two, the difference being the normalized digital surface model;
stacking the normalized digital surface model, at the same spatial resolution, with the red, green and near-infrared bands of the aerial image.
4. The building extraction method based on a gated deep residual optimization network according to claim 3, characterized in that enhancing the diversity of image samples by random cropping, rotation, flipping and brightness adjustment comprises:
labeling the buildings within the coverage of the data set as vectors using automatic or semi-automatic map vectorization, and rasterizing the labeled vector polygons into a binary label image, in which 0 denotes a non-building pixel and 1 denotes a building pixel;
cutting the label image and the original feature-combination image into image pairs of 480 × 480 pixels, dividing them into a training set, a validation set and a test set at a ratio of 60%, 20% and 20%, and supplying them to the convolutional neural network for training and validation;
during training, applying random cropping, rotation, flipping and brightness adjustment to the image pairs input to the network, so that the images fed into the neural network differ in combination each time.
5. The building extraction method based on a gated deep residual optimization network according to claim 4, characterized in that the convolutional neural network has an encoder-decoder structure, in which:
the encoder is composed of the improved deep residual convolutional neural network and automatically learns the low-, mid- and high-level features of the input image; through multiple convolution operations or max-pooling operations it produces a coarse building classification image whose size is 1/32 of the original image;
the decoder is composed of multiple gated feature labeling units and upsampling layers with bilinear interpolation; the upsampling layers progressively apply 2× upsampling to the coarse building classification image produced by the gated feature labeling units, finally yielding a building classification result of the same size as the original image.
6. The building extraction method based on a gated deep residual optimization network according to claim 5, characterized in that the deep residual convolutional neural network is based on the ResNet-50 convolutional neural network structure, in which: the ResNet-50 convolutional neural network consists of an input layer, a convolution module, four structurally similar residual modules and a classification module;
each residual module is composed of one projection convolution block and several consecutive identity convolution blocks; the projection convolution block doubles the number of feature maps and shrinks the feature maps by a factor of 1/2, while the identity convolution blocks change neither the size nor the number of the input and output feature maps;
after the first convolution module the input is reduced to 1/2 of its original size, and each time it passes through a residual module the image size is further halved, finally producing feature maps whose size is 1/32 of the original image.
7. The building extraction method based on a gated deep residual optimization network according to claim 6, characterized in that automatically learning multi-level image features with the improved deep residual convolutional neural network to obtain the coarse building extraction result comprises:
1) inserting a new convolution module before the first convolution module of the deep residual convolutional neural network, the new convolution module accepting multi-band image input and outputting 64 feature maps of the same size as the original image; 2) modifying the original first convolution module to accept 64 input features; 3) removing the classification module and adding a convolutional layer whose output has two bands, representing the coarse building extraction result.
8. The building extraction method based on a gated deep residual optimization network according to claim 7, characterized in that the gated feature labeling unit has two kinds of feature input, one being a higher-level, smaller-sized feature input and the other being a lower-level, larger-sized feature input; for the high-level feature input, the gated feature labeling unit first applies a convolution and then performs 2× upsampling; for the low-level feature input, it applies a convolution and batch normalization without upsampling;
the results of the two processed feature inputs are combined by element-wise multiplication followed by batch normalization, and then passed into the decoder; the coarse building classification result obtained from the residual network is upsampled by a factor of 2, merged with the features passed to the decoder, and processed by a convolutional layer and a ReLU activation layer to obtain a building classification result twice the original size.
9. The building extraction method based on a gated deep residual optimization network according to any one of claims 1 to 8, characterized in that performing effective feature screening and fusion with the gated feature labeling units and obtaining the high-quality building extraction result through progressive upsampling comprises:
the gated feature labeling unit is used five times in the encoder; in the feature screening stage it fuses features with higher classification confidence with features richer in edge information, and the number of features screened out depends on the feature level at which the unit is located, namely 4, 8, 12, 16 and 20 features in turn;
in each upsampling step of the decoder, the gated feature labeling unit merges the screened result with the upsampled coarse building classification result to obtain a coarse classification result twice the original size; at the same time, the gated feature labeling unit passes the screened result to the next gated feature labeling unit as the high-level feature input of that unit; by reusing the five gated feature labeling units in this way, a high-quality building extraction result of the same size as the original image is obtained.
CN201910175523.2A 2019-03-07 2019-03-07 Building extraction method based on gating depth residual error optimization network Active CN109934153B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910175523.2A CN109934153B (en) 2019-03-07 2019-03-07 Building extraction method based on gating depth residual error optimization network


Publications (2)

Publication Number Publication Date
CN109934153A (en) 2019-06-25
CN109934153B CN109934153B (en) 2023-06-20

Family

ID=66986836

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910175523.2A Active CN109934153B (en) 2019-03-07 2019-03-07 Building extraction method based on gating depth residual error optimization network

Country Status (1)

Country Link
CN (1) CN109934153B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180144477A1 (en) * 2016-06-15 2018-05-24 Beijing Sensetime Technology Development Co.,Ltd Methods and apparatuses, and computing devices for segmenting object
CN107066916A (en) * 2016-10-26 2017-08-18 中国科学院自动化研究所 Scene Semantics dividing method based on deconvolution neutral net
CN108932455A (en) * 2017-05-23 2018-12-04 上海荆虹电子科技有限公司 Remote sensing images scene recognition method and device
CN108230329A (en) * 2017-12-18 2018-06-29 孙颖 Semantic segmentation method based on multiple dimensioned convolutional neural networks
CN108682044A (en) * 2018-05-21 2018-10-19 深圳市唯特视科技有限公司 A kind of three-dimensional style metastasis model based on dual path stylization network
CN108765422A (en) * 2018-06-13 2018-11-06 云南大学 A kind of retinal images blood vessel automatic division method
CN109389051A (en) * 2018-09-20 2019-02-26 华南农业大学 A kind of building remote sensing images recognition methods based on convolutional neural networks

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JIANFENG HUANG et al.: "Automatic building extraction from high-resolution aerial images and LiDAR data using gated residual refinement network", ISPRS Journal of Photogrammetry and Remote Sensing *
ZHENLI ZHANG et al.: "ExFuse: Enhancing Feature Fusion for Semantic Segmentation", Proceedings of the European Conference on Computer Vision (ECCV) *
邝少辉: "Research on Document-level Neural Machine Translation", China Master's Theses Full-text Database, Philosophy and Humanities *

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110569851A (en) * 2019-08-28 2019-12-13 广西师范大学 real-time semantic segmentation method for gated multi-layer fusion
CN110569851B (en) * 2019-08-28 2022-03-15 广西师范大学 Real-time semantic segmentation method for gated multi-layer fusion
CN110796042A (en) * 2019-10-16 2020-02-14 长江大学 High-resolution remote sensing image building extraction method based on form transformation of offset shadow sample
CN110796042B (en) * 2019-10-16 2023-04-21 长江大学 High-resolution remote sensing image building extraction method based on offset shadow sample form transformation
CN110825754A (en) * 2019-10-23 2020-02-21 北京蛙鸣华清环保科技有限公司 Air quality spatial interpolation method, system, medium and device based on attributes
CN110825754B (en) * 2019-10-23 2022-06-17 北京蛙鸣华清环保科技有限公司 Air quality spatial interpolation method, system, medium and device based on attributes
CN111104850A (en) * 2019-10-30 2020-05-05 中国资源卫星应用中心 Remote sensing image building automatic extraction method and system based on residual error network
CN111104850B (en) * 2019-10-30 2023-09-26 中国四维测绘技术有限公司 Remote sensing image building automatic extraction method and system based on residual error network
CN111275054B (en) * 2020-01-16 2023-10-31 北京迈格威科技有限公司 Image processing method, device, electronic equipment and storage medium
WO2021143207A1 (en) * 2020-01-16 2021-07-22 北京迈格威科技有限公司 Image processing method and apparatus, computation processing device, and medium
CN111275054A (en) * 2020-01-16 2020-06-12 北京迈格威科技有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN111507950A (en) * 2020-04-08 2020-08-07 北京推想科技有限公司 Image segmentation method and device, electronic equipment and computer-readable storage medium
CN111640116A (en) * 2020-05-29 2020-09-08 广西大学 Aerial photography graph building segmentation method and device based on deep convolutional residual error network
CN111640116B (en) * 2020-05-29 2023-04-18 广西大学 Aerial photography graph building segmentation method and device based on deep convolutional residual error network
CN111797920A (en) * 2020-06-30 2020-10-20 武汉大学 Remote sensing extraction method and system for depth network impervious surface with gate control feature fusion
CN111860233A (en) * 2020-07-06 2020-10-30 中国科学院空天信息创新研究院 SAR image complex building extraction method and system based on attention network selection
CN111862190B (en) * 2020-07-10 2024-04-05 北京农业生物技术研究中心 Method and device for automatically measuring area of soft rot disease spots of isolated plants
CN111862190A (en) * 2020-07-10 2020-10-30 北京农业生物技术研究中心 Method and device for automatically measuring area of isolated plant soft rot disease spot
CN112116537B (en) * 2020-08-31 2023-02-10 中国科学院长春光学精密机械与物理研究所 Image reflected light elimination method and image reflected light elimination network construction method
CN112116537A (en) * 2020-08-31 2020-12-22 中国科学院长春光学精密机械与物理研究所 Image reflected light elimination method and image reflected light elimination network construction method
CN112396161A (en) * 2020-11-11 2021-02-23 中国科学技术大学 Lithologic profile construction method, system and equipment based on convolutional neural network
CN112396161B (en) * 2020-11-11 2022-09-06 中国科学技术大学 Lithologic profile construction method, system and equipment based on convolutional neural network
CN112115926A (en) * 2020-11-18 2020-12-22 浙江大华技术股份有限公司 Building object block model construction method based on remote sensing image and related equipment
CN112115926B (en) * 2020-11-18 2021-04-27 浙江大华技术股份有限公司 Building object block model construction method based on remote sensing image and related equipment
CN112749524A (en) * 2021-01-18 2021-05-04 重庆邮电大学 Hardware Trojan horse circuit detection method based on residual error encoder neural network
CN112749524B (en) * 2021-01-18 2022-07-12 重庆邮电大学 Hardware Trojan horse circuit detection method based on residual error encoder neural network
CN112767361A (en) * 2021-01-22 2021-05-07 重庆邮电大学 Reflected light ferrogram image segmentation method based on light-weight residual U-net
CN112767361B (en) * 2021-01-22 2024-04-09 重庆邮电大学 Reflected light ferrograph image segmentation method based on lightweight residual U-net
WO2022193604A1 (en) * 2021-03-16 2022-09-22 Huawei Technologies Co., Ltd. Devices, systems, methods, and media for point cloud data augmentation using model injection
CN113239736A (en) * 2021-04-16 2021-08-10 广州大学 Land cover classification annotation graph obtaining method, storage medium and system based on multi-source remote sensing data
CN113205018A (en) * 2021-04-22 2021-08-03 武汉大学 High-resolution image building extraction method based on multi-scale residual error network model
CN113269787A (en) * 2021-05-20 2021-08-17 浙江科技学院 Remote sensing image semantic segmentation method based on gating fusion
CN115239688A (en) * 2022-08-09 2022-10-25 四川大学华西医院 Brain metastasis tumor identification method and system based on magnetic resonance contrast enhanced 3D-T1WI image
CN115239688B (en) * 2022-08-09 2024-03-12 四川大学华西医院 Brain metastasis recognition method and system based on magnetic resonance contrast enhancement 3D-T1WI image

Also Published As

Publication number Publication date
CN109934153B (en) 2023-06-20

Similar Documents

Publication Publication Date Title
CN109934153A (en) Building extracting method based on gate depth residual minimization network
CN108573276B (en) Change detection method based on high-resolution remote sensing image
Jia et al. Detection and segmentation of overlapped fruits based on optimized mask R-CNN application in apple harvesting robot
CN111598174B (en) Model training method based on semi-supervised antagonistic learning and image change analysis method
CN107392130B (en) Multispectral image classification method based on threshold value self-adaption and convolutional neural network
CN109871798A (en) A kind of remote sensing image building extracting method based on convolutional neural networks
CN102013021B (en) Tea tender shoot segmentation and identification method based on color and region growth
CN109389051A (en) A kind of building remote sensing images recognition methods based on convolutional neural networks
CN108647585A (en) A kind of traffic mark symbol detection method based on multiple dimensioned cycle attention network
CN111986099A (en) Tillage monitoring method and system based on convolutional neural network with residual error correction fused
CN107918776B (en) Land planning method and system based on machine vision and electronic equipment
CN108734719A (en) Background automatic division method before a kind of lepidopterous insects image based on full convolutional neural networks
CN108090447A (en) Hyperspectral image classification method and device under double-branch deep structure
CN117078943B (en) Remote sensing image road segmentation method integrating multi-scale features and double-attention mechanism
CN111046880A (en) Infrared target image segmentation method and system, electronic device and storage medium
CN112712535A (en) Mask-RCNN landslide segmentation method based on simulation difficult sample
CN114494821B (en) Remote sensing image cloud detection method based on feature multi-scale perception and self-adaptive aggregation
CN113160062A (en) Infrared image target detection method, device, equipment and storage medium
CN110060273A (en) Remote sensing image landslide plotting method based on deep neural network
CN113591617B (en) Deep learning-based water surface small target detection and classification method
CN114943902A (en) Urban vegetation unmanned aerial vehicle remote sensing classification method based on multi-scale feature perception network
CN116434012A (en) Lightweight cotton boll detection method and system based on edge perception
CN112560624A (en) High-resolution remote sensing image semantic segmentation method based on model depth integration
CN116486282A (en) Digital elevation model manufacturing method and system based on deep learning, electronic equipment and storage medium
Chen et al. Mapping urban form and land use with deep learning techniques: a case study of Dongguan City, China

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant