CN110543872A - unmanned aerial vehicle image building roof extraction method based on full convolution neural network - Google Patents


Publication number: CN110543872A (application CN201910862731.XA)
Authority: CN (China)
Prior art keywords: building, roof, unmanned aerial vehicle, building roof
Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number: CN201910862731.XA
Other languages: Chinese (zh)
Other versions: CN110543872B
Inventors: 于洋, 刘斌, 苏正猛, 白少云, 吴波涛, 王建春, 梅伟, 张永利, 王静, 顾世祥, 黄俊伟, 冯琦, 白世晗
Current assignee (the listed assignee may be inaccurate): YUNNAN PROVINCE WATER RESOURCES AND HYDROPOWER SURVEY AND DESIGN INSTITUTE (also the original assignee and applicant)
Priority application: CN201910862731.XA; published as CN110543872A; granted and published as CN110543872B
Legal status: Active

Classifications

    • G06N3/045 Combinations of networks (computing arrangements based on biological models; neural networks; architecture)
    • G06N3/08 Learning methods (neural networks)
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V20/176 Urban or other man-made structures (terrestrial scenes)
    • Y02T10/40 Engine management systems (climate change mitigation technologies related to transportation)


Abstract

The invention discloses an unmanned aerial vehicle (UAV) image building-roof extraction method based on a fully convolutional neural network. The method comprises three parts: first, an aerial-image building-roof sample library is established; second, a fully convolutional neural network is designed to learn building-roof features, and the trained network is used for building-roof detection; third, the extraction results are post-processed to obtain more accurate building-roof results. Unlike traditional extraction methods, the method makes full use of abundant UAV image resources for data acquisition. In its algorithm design, a dedicated fully convolutional neural network based on skip-layer connections extracts building-roof features thoroughly while preventing gradient vanishing and gradient explosion. In post-processing, a conditional random field and Dempster-Shafer (D-S) evidence theory are applied to the roof extraction results, which improves the extraction accuracy of building roofs in UAV imagery.

Description

Unmanned aerial vehicle image building roof extraction method based on full convolution neural network
Technical Field
The invention relates to the technical field of unmanned aerial vehicle (UAV) image processing, and in particular to a UAV-image building-roof extraction method based on a fully convolutional neural network.
Background
With the advance of urbanization and the rapid development of economic construction in China, automatic building extraction is increasingly important to the public and to many industries, and the rapid extraction and updating of building elements has become a key part of national basic geographic information construction. At present, comprehensive interpretation and revision based on high-resolution remote sensing imagery is the main means of updating basic geographic information elements in China. Compared with office-based interpretation of remote sensing imagery, UAV imagery is easier to acquire, its data production is more flexible, and it is less restricted by external conditions; the timeliness, convenience, and economy of UAV technology therefore give it great advantages for building extraction and updating. Compared with field surveys of building data, office interpretation and mapping based on UAV imagery greatly improves the efficiency of building extraction and updating.
To improve the efficiency of producing and updating building data, an automatic or semi-automatic method for rapid building-roof extraction from UAV imagery is urgently needed, raising the degree of automation in updating building-element data. With the rapid development of UAV technology, the spatial resolution of image data has improved greatly, providing more realistic ground-surface detail and creating both new opportunities and new challenges for automatic building-roof extraction. In UAV imagery, building roofs and their surroundings are clearly imaged, making accurate roof extraction and positioning possible. On the other hand, a building is represented by an aggregate of many characteristics, such as materials, textures, and the surrounding environment, so the internal features of building-roof elements are highly heterogeneous; meanwhile, buildings and the adjacent ground are strongly correlated in their features, which makes it difficult for automatic roof-extraction methods to identify building objects accurately. In addition, shadows and other terrain make automatic roof extraction still harder. Considering all these factors, a fully automatic, stable, and reliable building-roof extraction method remains an internationally recognized open problem.
With the rapid development of photogrammetry and drone technology, photogrammetry has gradually become one of the main ways to produce topographic maps. The number of UAV images is growing rapidly, so interpreting them has become a problem that urgently needs solving. UAV image interpretation is either manual or automatic; because the number of images is huge, manual interpretation is extremely labor-intensive, and automatic interpretation is the inevitable trend. At present there are few automatic interpretation methods specific to UAV imagery; because UAV images have a certain similarity to high-resolution remote sensing images, the general idea is to apply remote sensing interpretation methods to UAV imagery. Common methods fall into two classes: top-down data-driven methods and bottom-up model-driven methods. Data-driven methods are the more studied and perform better under ordinary conditions; they include geometric-boundary-based methods such as Markov random fields; region-segmentation-based methods such as decision-tree classification and primitive texture-feature mapping; and auxiliary-feature-based methods such as those assisted by DSM or LiDAR data.
Current model-driven methods are less satisfactory. They include semantic-model classification methods, such as those based on linear discriminants and conditional random fields; prior-model knowledge methods, such as those based on snakes or active contours, deformation models and level sets, or prior shape models; and visual-recognition-model methods, such as probabilistic model voting. Although many building-detection methods exist, many problems remain. Data-driven methods, despite relatively good results, still do not make full use of building features and are not very robust. Model-driven methods face a great variety of building types and depend heavily on prior knowledge when establishing a target model; at present they solve only partial extraction problems in limited environments, and a universal descriptive model is hard to find.
In recent years, with the rapid development of deep learning technology and computer hardware, deep convolutional neural networks have shown strong interpretation capability in the field of image processing, bringing a new solution for building-roof extraction from UAV imagery.
Summary of the invention:
Aiming at the problems in the background art, the invention provides an unmanned aerial vehicle (UAV) image building-roof extraction method based on a fully convolutional neural network.
The technical scheme of the invention is as follows:
A UAV-image building-roof extraction method based on a fully convolutional neural network comprises the following steps:
Step one: establish a UAV-image building-roof sample library. A UAV orthoimage is obtained through field aerial photography and office data processing, and building-roof samples are drawn manually on the orthoimage. Some open-source high-resolution remote sensing building-roof samples are added to improve the network's ability to recognize different imagery. The library samples are rotated by 45°, 90°, 180°, and 270°, and are also blurred, gamma-transformed, and stretched or scaled, increasing the number of samples so that the detected features are robust to multiple orientations and environments.
Step two: design a fully convolutional neural network based on skip-layer connections and learn the building-roof features of the UAV imagery. Atrous (porous) convolution is used to enlarge the receptive field and extract more roof features. Considering the feature-extraction problems of traditional deep convolutional neural networks and the characteristics of aerial-image roof samples, the input and output layers of each convolution module in the feature-extraction stage are joined by skip-layer connections. For the feature maps obtained by convolutional feature extraction, deconvolution is used for feature reconstruction, and during the deconvolution process the reconstructed feature maps are fused with the corresponding feature maps from the convolution stage.
Step three: carry out building-roof detection with the trained network model.
Step four: refine the edges of the preliminary building-roof extraction result with a conditional random field, obtaining a building-roof detection result with finer edges.
Step five: verify the building roofs against feature evidence based on Dempster-Shafer (D-S) evidence theory.
Preferably, the second step specifically includes the following steps:
(1) Convolutional neural network based on skip-layer connections. The network is designed according to the characteristics of buildings in UAV imagery, weighing the strengths and weaknesses of traditional convolutional neural networks. It comprises 6 residual convolution modules and 2 ordinary convolution modules; the residual modules contain 15 convolutional layers and 6 pooling layers, and the ordinary modules contain 8 convolutional layers. The specific network structure is shown in figure 2. Batch normalization is adopted for parameter optimization during convolution, and the ReLU activation function introduces nonlinearity into the network. To address the feature loss of traditional deep convolutional networks during feature extraction, residual modules based on skip-layer connections are used for feature learning: the input and output layers of each residual convolution module are joined by a skip-layer connection, which preserves the feature-extraction result while preventing gradient vanishing and gradient explosion. A schematic of the skip-layer connection is shown in figure 3.
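As an illustrative sketch of such a residual convolution module, the skip-layer connection can be written in PyTorch as below; the channel count, the dilation rate of the atrous convolution, and the exact layer ordering are assumptions, since the patent fixes them only in figures 2 and 3:

```python
import torch
import torch.nn as nn

class SkipConvBlock(nn.Module):
    """Residual (skip-layer) convolution module sketch: two 3x3 atrous
    convolutions with batch normalization and ReLU, with the module input
    added to its output (the skip-layer connection)."""
    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # skip-layer connection: output = F(x) + x
        return self.relu(self.body(x) + x)
```

With `padding=dilation` the atrous convolutions preserve spatial size, so the identity addition needs no projection; a real implementation would add one wherever the channel count changes.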
(2) Building-feature reconstruction based on deconvolution. After convolutional feature extraction, deconvolution layers are adopted to reconstruct the UAV-image building-roof features. The deconvolution stage comprises 6 modules, with 6 deconvolution upsampling layers and 16 convolutional layers in total; the specific network structure is shown in figure 2. During deconvolution-based feature reconstruction, the roof feature map obtained by deconvolution is fused with the corresponding feature map from the convolution stage, so that the convolution-stage feature maps assist the reconstruction; after feature reconstruction, classification is performed with a sigmoid classifier.
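A minimal sketch of one deconvolution reconstruction step with encoder-feature fusion follows; fusion by concatenation and the channel sizes are assumptions (figure 2 fixes the actual structure), and a sigmoid classification head is shown alongside:

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """One deconvolution feature-reconstruction step: upsample with a
    transposed convolution, then fuse with the matching convolution-stage
    feature map and refine with a 3x3 convolution."""
    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.fuse = nn.Sequential(
            nn.Conv2d(out_ch + skip_ch, out_ch, 3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor, skip: torch.Tensor) -> torch.Tensor:
        x = self.up(x)                       # deconvolution upsampling
        x = torch.cat([x, skip], dim=1)      # feature fusion with encoder map
        return self.fuse(x)

# sigmoid classifier producing a roof / non-roof probability map
head = nn.Sequential(nn.Conv2d(32, 1, kernel_size=1), nn.Sigmoid())
```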
Preferably, the step four specifically includes the following steps:
For the preliminary UAV-image building-roof extraction result obtained in step three, a conditional random field is selected to refine the roof edges. A conditional random field is a conditional probability model of one set of output random variables given another set of input random variables; it handles the label-bias problem well and can normalize all features globally to obtain a globally optimal solution.
Construction of the conditional-random-field image-segmentation energy function: the hidden variable Xi is defined as the classification label of pixel i, taking values in the set of semantic labels L = {l1, l2, l3, ...}; Yi is the observation of each random variable Xi, i.e., the color value of each pixel. The goal of CRF-based semantic image segmentation is to infer the class label of the hidden variable Xi from the observed variable Yi.
The Gibbs distribution of the conditional random field P(X|I):

P(X|I) = (1/Z(I)) exp(-E(X|I))   (1)

where E(X|I) is the energy function, abbreviated E(X); X takes values in the label set L; and Z(I) is the normalization factor.
By minimizing the energy function in equation (1), the optimal pixel classification result can be obtained. The energy function over the whole image is defined as:

E(X) = Σi ψu(xi) + Σi<j ψp(xi, xj)   (2)

where ψu(xi) is the unary potential, ψu(xi) = -log P(xi), and P(xi), the probability that pixel i belongs to a given class label, is provided by the network (as in DeepLab). The second term ψp(xi, xj) is the pairwise potential:

ψp(xi, xj) = μ(xi, xj) [ ω1 exp(-|pi - pj|²/(2θα²) - |Ii - Ij|²/(2θβ²)) + ω2 exp(-|pi - pj|²/(2θγ²)) ]   (3)

where μ(xi, xj) is the label compatibility function, with μ(xi, xj) = 1 when xi ≠ xj and 0 otherwise, used to judge the compatibility of different labels; p denotes position information and I color information; θα controls the scale of the position information; θβ controls the scale of the color similarity; ω1, ω2 are linear combination weights. The second part of equation (3) is related to position information only, with θγ controlling its scale.
Through the mean-field approximation Q(X) = Πi Qi(xi), Q(X) is updated iteratively, and the optimal solution of the model is finally obtained by minimizing the K-L divergence between P(X) and Q(X).
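The mean-field iteration above can be sketched for a handful of pixels with a naive O(N²) implementation of the two Gaussian kernels and the Potts compatibility μ; the kernel parameters here are illustrative, and efficient implementations replace the dense kernel matrix with fast high-dimensional filtering:

```python
import numpy as np

def mean_field_crf(unary, positions, colors,
                   theta_alpha=3.0, theta_beta=10.0, theta_gamma=3.0,
                   w1=1.0, w2=1.0, n_iters=5):
    """Naive dense-CRF mean-field inference.
    unary: (N, L) negative log-probabilities from the network.
    positions: (N, 2) pixel coordinates p; colors: (N, 3) color vectors I."""
    dp = ((positions[:, None, :] - positions[None, :, :]) ** 2).sum(-1)
    dc = ((colors[:, None, :] - colors[None, :, :]) ** 2).sum(-1)
    # appearance kernel (theta_alpha, theta_beta) plus smoothness kernel (theta_gamma)
    K = w1 * np.exp(-dp / (2 * theta_alpha**2) - dc / (2 * theta_beta**2)) \
      + w2 * np.exp(-dp / (2 * theta_gamma**2))
    np.fill_diagonal(K, 0.0)                      # exclude self-messages
    Q = np.exp(-unary)
    Q /= Q.sum(axis=1, keepdims=True)
    for _ in range(n_iters):
        msg = K @ Q                               # message passing
        # Potts compatibility: a label is penalized by the mass on other labels
        penalty = msg.sum(axis=1, keepdims=True) - msg
        Q = np.exp(-unary - penalty)
        Q /= Q.sum(axis=1, keepdims=True)
    return Q
```

In such a setting, a pixel whose unary term weakly prefers the wrong label is flipped by agreement with its neighbors, which is the edge-smoothing effect exploited in step four.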
Preferably, the step five specifically includes the following steps:
(1) First, the space of all possible outcomes for the building object to be verified is defined as the frame of discernment, denoted X; the set of all subsets of X is denoted 2^X. For any hypothesis set A in 2^X, m(A) ∈ [0, 1], m(∅) = 0, and

Σ(A⊆X) m(A) = 1

where m is a basic probability assignment function (BPAF) on 2^X and m(A) is called the basic probability of A.
D-S evidence theory defines a belief function Bel and a plausibility function Pl to represent the uncertainty of a problem, namely:

Bel(A) = Σ(B⊆A) m(B),  Pl(A) = Σ(B∩A≠∅) m(B)

The belief function Bel(A) represents the degree to which A is believed true, and is also called the lower-bound function; the plausibility function Pl(A) represents the degree to which A is believed not false. [Bel(A), Pl(A)] is then the confidence interval of A, describing the lower and upper limits of the confidence held in A when several pieces of evidence exist. The Dempster combination rule can be used to synthesize several BPAFs, namely

m(A) = (1/(1-K)) Σ(B∩C=A) m1(B) m2(C), with conflict K = Σ(B∩C=∅) m1(B) m2(C)
(2) D-S evidence model for building-roof verification. Since roof verification only needs to confirm the building identity of a scene observed in the UAV image, the frame of discernment is taken as X = {T, F} according to D-S evidence theory, where T denotes a non-building object and F a building object. The defined probability assignment satisfies m({T, F}) + m(T) + m(F) = 1, where m(F) is the confidence that the current features support a building object, m(T) is the confidence supporting a non-building object, and m({T, F}) = 1 - m(T) - m(F) is the confidence that the building identity of the object cannot be determined from the evidence, i.e., support for the unknown.
(3) Multi-feature evidence model of buildings. Edge, spectral, texture, context, and DSM evidence models closely related to buildings are selected; each feature is given a modeling treatment suited to building verification, and its probability assignment function is defined.
(4) Building-verification decision criterion. Through analysis of the relevant verification features and definition of the corresponding probability assignment functions, the BPAF of each feature is obtained from the building-feature detection results, and the BPAFs of the features are synthesized with the evidence-combination rule of D-S theory to obtain the probability assignment of the combined multi-feature evidence.
According to the definition of the belief function Bel in D-S evidence theory, the beliefs Beli(T) and Beli(F) in the existence of a building can be calculated. Following the maximum-belief rule, the building-verification decision criterion is defined as follows: for building roof i, if Beli(T) > Beli(F), it is not considered a building roof; otherwise, the current object is considered a building roof.
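The combination of feature BPAFs and the decision criterion can be sketched over the frame X = {T, F}; the dictionary encoding of the focal sets as 'T', 'F', and 'TF' is an implementation choice:

```python
def dempster_combine(m1, m2):
    """Dempster combination rule for BPAFs over the frame {T, F}.
    Each BPAF maps the focal sets 'T', 'F', 'TF' to masses summing to 1."""
    focal = ('T', 'F', 'TF')
    combined = {f: 0.0 for f in focal}
    conflict = 0.0
    for a in focal:
        for b in focal:
            inter = ''.join(ch for ch in 'TF' if ch in a and ch in b)
            mass = m1[a] * m2[b]
            if inter:
                combined[inter] += mass
            else:
                conflict += mass          # empty intersection contributes to K
    return {f: v / (1.0 - conflict) for f, v in combined.items()}

def is_building(m):
    """Decision criterion: Bel(T) > Bel(F) means non-building.
    T and F are singletons, so Bel(T) = m('T') and Bel(F) = m('F')."""
    return not (m['T'] > m['F'])
```

Several feature BPAFs (edge, spectral, texture, context, DSM) would be folded together by calling `dempster_combine` repeatedly before applying the decision.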
Beneficial effects: the invention adopts UAV imagery as the input data source, designs a deep fully convolutional neural network based on skip-layer connections to extract building roofs, refines the roof edges with a conditional random field, and incorporates prior knowledge of real building roofs, realizing automatic extraction of building roofs from aerial imagery with strong practicability and high accuracy. The innovation points are:
(1) A fully convolutional neural network based on skip-layer connections is designed: residual modules with skip-layer connections perform feature learning during feature extraction, and the convolution-module feature maps are fused in during deconvolution; this distinctive network design extracts UAV-image building features better.
(2) For the rough edges that occur in UAV-image building-roof extraction, a conditional random field is adopted to refine the edges of the preliminary roof-extraction result.
(3) For the false detections that occur in building-roof extraction, D-S evidence theory is introduced to verify building roofs against feature evidence, innovatively bringing in multiple building characteristics as verification cues.
Drawings
FIG. 1 is a flow chart of the UAV-image building-roof extraction method based on a fully convolutional neural network;
FIG. 2 is a diagram of the fully convolutional neural network architecture based on skip-layer connections;
FIG. 3 is a schematic diagram of the skip-layer connection.
Detailed Description
The invention provides a UAV-image building-roof extraction method based on a fully convolutional neural network. First, a convolutional neural network based on skip-layer connections extracts the building-roof features, and the roof feature maps obtained by the convolutional network are reconstructed with deconvolution. The trained network model then detects the building roofs, a conditional random field refines the edges of the detection result, and finally D-S evidence theory is introduced to verify the roof-extraction result by inference and remove falsely detected objects.
The technical scheme of the invention is described in detail below with reference to the accompanying drawings. The technical process is shown in fig. 1, and the example comprises the following steps:
Step one: establish a UAV-image building-roof sample library. A UAV orthoimage is obtained through field aerial photography and office processing, and building-roof samples are drawn manually on the UAV image. Some open-source high-resolution remote sensing building-roof annotations are added to ensure the applicability of the trained network model. After the sample library is obtained, suitable operations expand the number of samples, mainly rotations by 45°, 90°, 180°, and 270°, which increase the sample count while making the network robust to houses in different orientations, and gamma transformation of the images, whose basic formula is shown in (1):
s = c·r^γ   (1)
where c and γ are positive constants; γ values of 0.5 and 2 are used to enhance the model's applicability to building detection under different brightness conditions. The building samples are also scaled and stretched by 10%. After this series of sample-expansion operations, the number of samples becomes 30 times that of the original images.
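The gamma transform of formula (1) and part of the sample-expansion step can be sketched as follows; the blurring, 45° rotation, and 10% stretch are omitted here since they need an interpolating image library (e.g. scipy or OpenCV):

```python
import numpy as np

def gamma_transform(img, c=1.0, gamma=0.5):
    """Formula (1): s = c * r**gamma, applied on intensities scaled to [0, 1]."""
    r = img.astype(np.float64) / 255.0
    s = c * np.power(r, gamma)
    return np.clip(s * 255.0, 0, 255).astype(np.uint8)

def expand_samples(img):
    """Rotations by 90/180/270 degrees plus the two gamma variants (0.5, 2)."""
    out = [img]
    out += [np.rot90(img, k) for k in (1, 2, 3)]
    out += [gamma_transform(img, gamma=g) for g in (0.5, 2.0)]
    return out
```

Gamma 0.5 brightens mid-tones and gamma 2 darkens them, simulating different illumination conditions of the same roof.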
Step two: design a fully convolutional neural network based on skip-layer connections, extract the building-roof features, and use atrous convolution to enlarge the receptive field so as to extract more roof features. The specific contents are as follows:
(1) Convolutional neural network based on skip-layer connections. The network is designed according to the characteristics of buildings in UAV imagery, weighing the strengths and weaknesses of traditional convolutional neural networks. It comprises 6 residual convolution modules and 2 ordinary convolution modules; the residual modules contain 15 convolutional layers and 6 pooling layers, and the ordinary modules contain 8 convolutional layers. The specific network structure is shown in figure 2. Batch normalization is adopted for parameter optimization during convolution. Batch normalization first computes the batch mean, with basic formula (2):

μB = (1/m) Σ(i=1..m) xi   (2)

It then computes the batch variance, with basic formula (3):

σB² = (1/m) Σ(i=1..m) (xi - μB)²   (3)

It then normalizes, with basic formula (4):

x̂i = (xi - μB) / √(σB² + ε)   (4)

Finally, scale and shift are applied to obtain the output yi, with basic formula (5):

yi = γ·x̂i + β   (5)

where xi are the inputs of the batch, m is the batch size, ε is a small constant for numerical stability, and γ and β are learned scale and shift parameters.
The activation function adopts ReLU to introduce a nonlinear characteristic into the network; the basic formula of ReLU is shown in (6):
f(x)=max(0,x) (6)
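Formulas (2) to (6) can be sketched in NumPy; γ, β, and ε here are the per-feature scale, shift, and stability constant:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch standardization over axis 0, following formulas (2) to (5)."""
    mu = x.mean(axis=0)                        # (2) batch mean
    var = x.var(axis=0)                        # (3) batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)      # (4) normalization
    return gamma * x_hat + beta                # (5) scale and shift

def relu(x):
    """Formula (6): f(x) = max(0, x)."""
    return np.maximum(0.0, x)
```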
To solve the feature-loss problem of traditional deep convolutional neural networks during feature extraction, residual modules based on skip-layer connections are adopted for feature learning: the input and output layers of each residual convolution module are joined by a skip-layer connection, which preserves the feature-extraction result while preventing gradient vanishing and gradient explosion. A schematic of the skip-layer connection is shown in figure 3.
(2) Building-feature reconstruction based on deconvolution. After convolutional feature extraction, deconvolution layers reconstruct the UAV-image building-roof features. The deconvolution stage comprises 6 deconvolution modules, with 6 deconvolution upsampling layers and 16 convolutional layers in total; the specific network structure is shown in figure 2. During deconvolution-based feature reconstruction, the roof feature map obtained by deconvolution is fused with the corresponding feature map from the convolution stage, so that the convolution-stage feature maps assist the reconstruction; after feature reconstruction, image classification is performed with a sigmoid classifier.
Step three: carry out building-roof detection with the trained network model. The UAV image to be detected is input into the trained model, yielding a preliminary UAV-image building-roof extraction result.
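Step three leaves the inference interface open; one plausible sketch tiles a large orthoimage, applies the trained model to each tile, and mosaics the thresholded roof mask (`model` is assumed to be any callable mapping an H×W×3 tile to an H×W probability map, and the tile size and threshold are assumptions):

```python
import numpy as np

def detect_roofs(model, image, tile=512, thresh=0.5):
    """Tile the UAV orthoimage, run the trained network on each tile, and
    mosaic the binary building-roof mask."""
    h, w = image.shape[:2]
    mask = np.zeros((h, w), dtype=np.uint8)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patch = image[y:y + tile, x:x + tile]
            prob = model(patch)                      # per-pixel roof probability
            mask[y:y + tile, x:x + tile] = (prob > thresh).astype(np.uint8)
    return mask
```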
Step four: refine the building edges with a conditional random field. Although the preliminary roof-extraction result from step three achieves pixel-level segmentation, the roof edges are usually not fine enough. To improve detection accuracy, a conditional random field is selected to refine the roof edges. A conditional random field is a conditional probability model of one set of output random variables given another set of input random variables; it handles the label-bias problem well and can normalize all features globally to obtain a globally optimal solution.
Construction of the conditional-random-field image-segmentation energy function: the hidden variable Xi is defined as the classification label of pixel i, taking values in the set of semantic labels L = {l1, l2, l3, ...}; Yi is the observation of each random variable Xi, i.e., the color value of each pixel. The goal of CRF-based semantic image segmentation is to infer the class label of the hidden variable Xi from the observed variable Yi.
The Gibbs distribution of the conditional random field P(X|I):

P(X|I) = (1/Z(I)) exp(-E(X|I))   (7)

where E(X|I) is the energy function, abbreviated E(X); X takes values in the label set L; and Z(I) is the normalization factor.
By minimizing the energy function in equation (7), the optimal pixel classification result can be obtained. The energy function over the whole image is defined as:

E(X) = Σi ψu(xi) + Σi<j ψp(xi, xj)   (8)

where ψu(xi) is the unary potential, ψu(xi) = -log P(xi), and P(xi), the probability that pixel i belongs to a given class label, is provided by the network (as in DeepLab). The second term ψp(xi, xj) in equation (8) is the pairwise potential:

ψp(xi, xj) = μ(xi, xj) [ ω1 exp(-|pi - pj|²/(2θα²) - |Ii - Ij|²/(2θβ²)) + ω2 exp(-|pi - pj|²/(2θγ²)) ]   (9)

where μ(xi, xj) is the label compatibility function, with μ(xi, xj) = 1 when xi ≠ xj and 0 otherwise, used to judge the compatibility of different labels; p denotes position information and I color information; θα controls the scale of the position information; θβ controls the scale of the color similarity; ω1, ω2 are linear combination weights. The second part of equation (9) is related to position information only, with θγ controlling its scale.
Using the mean-field approximation Q(X) = Πi Qi(xi), Q(X) is updated iteratively, and the optimal solution of the model is finally obtained by minimizing the K-L divergence between P(X) and Q(X).
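As an illustration only (it is not part of the patent text), the mean-field update for the dense CRF of equations (7) and (8) can be sketched for a toy image in Python. The kernel scales θα, θβ, θγ, the weights ω1, ω2 and the naive O(N²) message passing are assumptions of the sketch; practical dense-CRF implementations accelerate this step with high-dimensional Gaussian filtering.

```python
import numpy as np

def mean_field_crf(prob, img, pos, n_iters=5,
                   theta_alpha=3.0, theta_beta=10.0, theta_gamma=3.0,
                   w1=1.0, w2=1.0):
    """Naive O(N^2) mean-field inference for the dense CRF of Eqs. (7)-(8).

    prob: (N, L) per-pixel class probabilities from the network (unary term),
    img:  (N, C) per-pixel colour vectors I_i,
    pos:  (N, 2) per-pixel positions p_i.
    Returns the refined per-pixel label distribution Q.
    """
    unary = -np.log(np.clip(prob, 1e-8, 1.0))          # psi_u(x_i) = -log P(x_i)
    # Pairwise kernels: appearance (position + colour) and smoothness (position).
    dp = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    dc = ((img[:, None, :] - img[None, :, :]) ** 2).sum(-1)
    k = (w1 * np.exp(-dp / (2 * theta_alpha ** 2) - dc / (2 * theta_beta ** 2))
         + w2 * np.exp(-dp / (2 * theta_gamma ** 2)))
    np.fill_diagonal(k, 0.0)                           # no self-message
    Q = prob.copy()
    for _ in range(n_iters):
        msg = k @ Q                                    # sum_j k(f_i, f_j) Q_j(l)
        # Potts compatibility mu(l, l') = [l != l']: mass on all other labels.
        compat = msg.sum(axis=1, keepdims=True) - msg
        Q = np.exp(-unary - compat)                    # update and renormalise
        Q /= Q.sum(axis=1, keepdims=True)
    return Q
```

On a small grid this suppresses isolated misclassified pixels whose neighbours agree in both position and colour, which is the edge-refinement effect sought in step four.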
Step five: building roof verification based on D-S evidence theory. Compared with traditional building roof verification, D-S evidence theory is used as the basis for inferential verification of buildings, with the edge, geometry, spectrum, context and DSM of the building roof added as feature evidence. The specific contents are as follows:
(1) Theoretical basis of D-S evidence theory
As the base concept of D-S evidence theory, the set of all possible outcomes for the building object to be verified is first defined; this is the frame of discernment, denoted X, and the set of all subsets of X is denoted 2^X. For any hypothesis set A in 2^X, m(A) ∈ [0, 1], with m(∅) = 0 and ΣA⊆X m(A) = 1, where m is a basic probability assignment function (BPAF) on 2^X and m(A) is called the basic probability mass of A.
D-S evidence theory defines a belief function Bel and a plausibility function Pl to represent the uncertainty of a proposition, namely:

Bel(A) = ΣB⊆A m(B),  Pl(A) = ΣB∩A≠∅ m(B)

The belief function Bel(A) represents the degree to which A is believed true, and is also called the lower-limit function; the plausibility function Pl(A) represents the degree to which A is believed not false. [Bel(A), Pl(A)] is then the confidence interval of A, describing the lower and upper bounds of the confidence held in A when several pieces of evidence exist. The Dempster combination rule can be used to fuse multiple BPAFs, namely:

m(A) = (1/(1 − K)) ΣB∩C=A m1(B) m2(C),  where K = ΣB∩C=∅ m1(B) m2(C) is the conflict factor.
(2) D-S evidence model for building roof verification. Since roof verification only needs to confirm the building identity of a scene observed in the unmanned aerial vehicle image, the frame of discernment is taken, per D-S evidence theory, as X = {T, F}, where T denotes a non-building object and F a building object. The defined belief assignment then satisfies m({T, F}) + m(T) + m(F) = 1, where m(F) represents the belief that the current feature supports a building object, m(T) the belief supporting a non-building object, and m({T, F}) = 1 − m(T) − m(F) the belief that the object's building identity cannot be determined from this evidence, i.e. the belief assigned to "unknown".
(3) Multi-feature evidence model of the building. The invention selects edge, spectral, texture, context and DSM evidence models closely related to buildings, models each feature in a form suited to building verification, and defines its basic probability assignment function.
(4) Building verification decision criterion. Through analysis of the relevant verification features and definition of the corresponding probability assignment functions, the BPAF of each feature is obtained from its detection result; the per-feature BPAFs are then fused using the Dempster combination rule of D-S evidence theory to obtain the probability assignment of the combined multi-feature evidence.
From the definition of the belief function Bel in D-S evidence theory, the belief values Beli(T) and Beli(F) in the presence of a building can be calculated. Following the maximum belief assignment rule, the building verification decision criterion is defined as: for candidate building roof i, if Beli(T) > Beli(F), it is not considered a building roof; otherwise, the current object is considered a building roof.
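For illustration only (not part of the patent text), the verification pipeline of items (2) to (4) can be sketched in Python; the frame {T, F} follows the patent, while the numeric feature BPAFs in the test below are hypothetical values, not ones prescribed by the invention.

```python
from functools import reduce

T, F = frozenset({"T"}), frozenset({"F"})   # T: non-building, F: building
TF = T | F                                  # {T, F}: identity unknown

def combine(m1, m2):
    """Dempster's rule of combination for two BPAFs over the frame {T, F}."""
    fused, conflict = {}, 0.0
    for b, mb in m1.items():
        for c, mc in m2.items():
            inter = b & c
            if inter:
                fused[inter] = fused.get(inter, 0.0) + mb * mc
            else:
                conflict += mb * mc          # K: mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule undefined")
    return {a: v / (1.0 - conflict) for a, v in fused.items()}

def bel(m, a):
    """Belief function Bel(a): total mass committed to subsets of a."""
    return sum(v for b, v in m.items() if b <= a)

def pl(m, a):
    """Plausibility Pl(a): total mass of hypotheses intersecting a."""
    return sum(v for b, v in m.items() if b & a)

def is_building_roof(feature_bpafs):
    """Fuse the per-feature BPAFs (edge, spectrum, texture, context, DSM)
    and apply the decision criterion: building roof iff Bel(F) >= Bel(T)."""
    m = reduce(combine, feature_bpafs)
    return bel(m, F) >= bel(m, T)
```

For example, five feature BPAFs such as {F: 0.6, T: 0.1, {T, F}: 0.3} are fused pairwise, and the object is accepted as a roof when the combined Bel(F) dominates Bel(T).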
The invention adopts unmanned aerial vehicle imagery as the input data source, designs a deep fully convolutional neural network based on skip-layer connections to extract building roofs, refines the roof edges with a conditional random field and, combining prior knowledge of real building roofs, realizes automatic extraction of building roofs from aerial imagery with strong practicality and high accuracy. The innovations are:
(1) A fully convolutional neural network based on skip-layer connections is designed: residual modules with skip-layer connections are introduced for feature learning during feature extraction, and the feature maps of the convolution modules are fused with those of the deconvolution stage; this network design better extracts building features from unmanned aerial vehicle imagery.
(2) For the problem of rough edges in unmanned aerial vehicle image building roof extraction, a conditional random field is adopted to refine the edges of the preliminary roof extraction result.
(3) For the problem of false detections in building roof extraction, D-S evidence theory is introduced to perform building roof verification based on feature evidence, innovatively introducing multiple building features as verification cues.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various changes and modifications can be made without departing from the inventive concept of the present invention, and these changes and modifications are all within the scope of the present invention.

Claims (4)

1. An unmanned aerial vehicle image building roof extraction method based on a fully convolutional neural network, characterized by comprising the following steps:
Step one: establishing an unmanned aerial vehicle image building roof sample library: an orthophoto is obtained through field aerial survey and indoor data processing; building roof samples are manually delineated on the orthophoto, and a portion of open-source high-resolution remote sensing building roof samples is added to improve the network's recognition of different imagery; the library samples are rotated by 45°, 90°, 180° and 270°, while blurring, gamma transformation and stretch scaling are also applied, increasing the number of samples so that the detected features are robust to multiple orientations and environments.
Step two: designing a convolutional neural network based on skip-layer connections to extract unmanned aerial vehicle image building roof features: atrous (dilated) convolution is used to enlarge the receptive field and extract more roof features; considering the characteristics of aerial building roof samples and the problems of conventional deep convolutional feature extraction, the input and output layers of each convolution module in the feature extraction stage are connected by skip-layer connections; deconvolution feature reconstruction is then performed on the feature maps obtained from convolutional feature extraction, and during deconvolution the reconstructed feature maps are fused with the corresponding feature maps from the convolution stage.
Step three: detecting building roofs with the trained network model.
Step four: refining building roof edges with a conditional random field: for the building roof extraction result obtained from the preliminary detection, the conditional random field refines the building edges, yielding building roofs with finer edges.
Step five: verifying building roofs based on feature evidence using D-S evidence theory.
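A minimal sketch (illustrative only, not part of the claims) of the step-one sample augmentation, using scipy.ndimage as a stand-in for whatever tooling the inventors used; the gamma exponent 0.7, blur sigma 1 and zoom factor 1.2 are hypothetical parameters chosen for the example.

```python
import numpy as np
from scipy import ndimage

def augment(image, mask):
    """Expand one (orthophoto patch, roof mask) pair into rotated, blurred,
    gamma-transformed and rescaled variants, as described in step one.
    image: float array in [0, 1] of shape (H, W, C); mask: (H, W) in {0, 1}."""
    out = []
    for angle in (45, 90, 180, 270):        # multi-direction robustness
        out.append((ndimage.rotate(image, angle, reshape=False, order=1),
                    ndimage.rotate(mask, angle, reshape=False, order=0)))
    out.append((ndimage.gaussian_filter(image, sigma=(1, 1, 0)), mask))  # blur
    out.append((np.clip(image, 1e-6, 1.0) ** 0.7, mask))     # gamma transform
    out.append((ndimage.zoom(image, (1.2, 1.2, 1), order=1),  # stretch scaling
                ndimage.zoom(mask, (1.2, 1.2), order=0)))
    return out
```

Rotating the mask with order=0 (nearest neighbour) keeps the labels binary, while the image is interpolated bilinearly.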
2. The unmanned aerial vehicle image building roof extraction method based on a fully convolutional neural network according to claim 1, characterized in that step two specifically comprises the following steps:
(1) Design of the convolutional neural network based on skip-layer connections: according to the characteristics of unmanned aerial vehicle image buildings, and weighing the advantages and disadvantages of conventional convolutional neural networks, the network comprises 6 residual convolution modules and 2 ordinary convolution modules; the residual convolution modules contain 15 convolutional layers and 6 pooling layers, and the ordinary convolution modules contain 8 convolutional layers; the specific network structure is shown in figure 2. Batch normalization is adopted for parameter optimization during convolution, and the ReLU activation function introduces non-linearity into the network. To address the feature loss of conventional deep convolutional feature extraction, residual modules based on skip-layer connections perform the feature learning, connecting the input and output layers of each residual convolution module; this preserves the feature extraction result and prevents gradient vanishing and gradient explosion. The skip-layer connection is illustrated in figure 3.
(2) Building feature reconstruction based on deconvolution: after convolutional feature extraction, deconvolution layers reconstruct the unmanned aerial vehicle image building roof features; the deconvolution part comprises 6 modules, containing 16 convolutional layers and 6 deconvolution up-sampling layers in total; the specific network structure is shown in figure 2. During deconvolution feature reconstruction, the building roof feature maps obtained by deconvolution are fused with the feature maps from the convolution stage, using the convolution-stage feature maps to assist reconstruction; after feature reconstruction, classification is performed with a sigmoid classifier.
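Purely as an illustrative sketch (the claimed network of figure 2, with its 6 residual modules, 15 convolutional layers and so on, is not reproduced), the two mechanisms of claim 2 — the residual skip connection output = F(x) + x and the convolution/deconvolution feature fusion — can be demonstrated on a single-channel toy network in NumPy:

```python
import numpy as np

def conv3x3(x, w):
    """'Same' 3x3 convolution of a single-channel map via zero padding."""
    xp = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + 3, j:j + 3] * w).sum()
    return out

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Skip-layer connection: output = F(x) + x, so the identity path
    preserves the extraction result and counters gradient vanishing."""
    return relu(conv3x3(relu(conv3x3(x, w1)), w2) + x)

def pool2(x):
    """2x2 max pooling."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbour 2x up-sampling (a stand-in for deconvolution)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def tiny_fcn(x, w1, w2, w3):
    """Residual encoding, pooling, up-sampling, then fusion of the decoder
    map with the encoder map (the convolution/deconvolution feature fusion
    of claim 2), followed by a sigmoid roof-score map."""
    enc = residual_block(x, w1, w2)
    dec = upsample2(pool2(enc))
    fused = dec + enc                        # skip fusion across the network
    return 1.0 / (1.0 + np.exp(-conv3x3(fused, w3)))
```

With zero convolution weights the residual block reduces to the identity on non-negative inputs, which is exactly the shortcut behaviour that lets gradients bypass the convolutions.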
3. The unmanned aerial vehicle image building roof extraction method based on a fully convolutional neural network according to claim 1, characterized in that step four specifically comprises the following steps:
For the preliminary unmanned aerial vehicle image building roof extraction result obtained in step three, a conditional random field is selected to refine the building roof edges. The conditional random field models the conditional probability distribution of a set of output random variables given a set of input random variables; it therefore avoids the label bias problem and performs global normalization over all features, yielding a globally optimal solution.
Construction of the conditional random field image segmentation energy function: define the hidden variable Xi as the classification label of pixel i, taking values in the set of semantic labels to be assigned, L = {l1, l2, l3, …}; Yi is the observation of each random variable Xi, i.e. the colour value of each pixel. The objective of CRF-based semantic image segmentation is then to infer the class label of each hidden variable Xi from the observed variable Yi.
The conditional random field obeys the Gibbs distribution P(X | I):

P(X | I) = (1/Z(I)) · exp(−E(X | I))  (1)
Wherein: e (X | I) is an energy function, and is simply expressed as E (X); x belongs to the label set L, and Z (I) is a normalization factor.
By minimizing the energy function in equation (1), the optimal pixel classification result is obtained. The energy function over the whole image is defined as:
E(X) = Σi ψu(xi) + Σi&lt;j ψp(xi, xj)  (2)
Wherein: ψu(xi) is the unary potential function, ψu(xi) = −log P(xi), where P(xi) is the probability that pixel i belongs to a given class label, supplied by the front-end network (as in DeepLab). The second term ψp(xi, xj) in equation (2) is the pairwise potential function:

ψp(xi, xj) = μ(xi, xj) [ ω1 exp(−|pi − pj|²/(2θα²) − |Ii − Ij|²/(2θβ²)) + ω2 exp(−|pi − pj|²/(2θγ²)) ]  (3)

Wherein: μ(xi, xj) is the label compatibility function, with μ(xi, xj) = 1 when xi ≠ xj and 0 otherwise, penalizing incompatible labels on similar pixels; p denotes position information and I colour information; θα controls the spatial scale of the appearance kernel and θβ the scale of colour similarity; ω1 and ω2 are the linear combination weights. The second kernel of equation (3) involves only position information, with θγ controlling its spatial scale.
Using the mean-field approximation Q(X) = Πi Qi(xi), Q(X) is updated iteratively, and the optimal solution of the model is finally obtained by minimizing the K-L divergence between P(X) and Q(X).
4. The unmanned aerial vehicle image building roof extraction method based on a fully convolutional neural network according to claim 1, characterized in that step five specifically comprises the following steps:
(1) First, the set of all possible outcomes for the building object to be verified is defined as the frame of discernment, denoted X, and the set of all subsets of X is denoted 2^X. For any hypothesis set A in 2^X, m(A) ∈ [0, 1], with m(∅) = 0 and ΣA⊆X m(A) = 1, where m is a basic probability assignment function (BPAF) on 2^X and m(A) is called the basic probability mass of A.
D-S evidence theory defines a belief function Bel and a plausibility function Pl to represent the uncertainty of a proposition, namely:

Bel(A) = ΣB⊆A m(B),  Pl(A) = ΣB∩A≠∅ m(B)

The belief function Bel(A) represents the degree to which A is believed true, and is also called the lower-limit function; the plausibility function Pl(A) represents the degree to which A is believed not false. [Bel(A), Pl(A)] is then the confidence interval of A, describing the lower and upper bounds of the confidence held in A when several pieces of evidence exist. The Dempster combination rule can be used to fuse multiple BPAFs, namely:

m(A) = (1/(1 − K)) ΣB∩C=A m1(B) m2(C),  where K = ΣB∩C=∅ m1(B) m2(C) is the conflict factor.
(2) D-S evidence model for building roof verification: since roof verification only needs to confirm the building identity of a scene observed in the unmanned aerial vehicle image, the frame of discernment is taken, per D-S evidence theory, as X = {T, F}, where T denotes a non-building object and F a building object; the defined belief assignment then satisfies m({T, F}) + m(T) + m(F) = 1, where m(F) represents the belief that the current feature supports a building object, m(T) the belief supporting a non-building object, and m({T, F}) = 1 − m(T) − m(F) the belief that the object's building identity cannot be determined from this evidence, i.e. the belief assigned to "unknown".
(3) Multi-feature evidence model of the building: edge, spectral, texture, context and DSM evidence models closely related to buildings are selected; each feature is modelled in a form suited to building verification and its basic probability assignment function is defined.
(4) Building verification decision criterion: through analysis of the relevant verification features and definition of the corresponding probability assignment functions, the BPAF of each feature is obtained from its detection result; the per-feature BPAFs are then fused using the Dempster combination rule of D-S evidence theory to obtain the probability assignment of the combined multi-feature evidence.
From the definition of the belief function Bel in D-S evidence theory, the belief values Beli(T) and Beli(F) in the presence of a building can be calculated. Following the maximum belief assignment rule, the building verification decision criterion is defined as: for candidate building roof i, if Beli(T) > Beli(F), it is not considered a building roof; otherwise, the current object is considered a building roof.
CN201910862731.XA 2019-09-12 2019-09-12 Unmanned aerial vehicle image building roof extraction method based on full convolution neural network Active CN110543872B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910862731.XA CN110543872B (en) 2019-09-12 2019-09-12 Unmanned aerial vehicle image building roof extraction method based on full convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910862731.XA CN110543872B (en) 2019-09-12 2019-09-12 Unmanned aerial vehicle image building roof extraction method based on full convolution neural network

Publications (2)

Publication Number Publication Date
CN110543872A true CN110543872A (en) 2019-12-06
CN110543872B CN110543872B (en) 2023-04-18

Family

ID=68713455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910862731.XA Active CN110543872B (en) 2019-09-12 2019-09-12 Unmanned aerial vehicle image building roof extraction method based on full convolution neural network

Country Status (1)

Country Link
CN (1) CN110543872B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110989016A (en) * 2019-12-26 2020-04-10 山东师范大学 Non-visual field area pipeline surveying system and method based on mobile terminal
CN111028217A (en) * 2019-12-10 2020-04-17 南京航空航天大学 Image crack segmentation method based on full convolution neural network
CN111428224A (en) * 2020-04-02 2020-07-17 苏州杰锐思智能科技股份有限公司 Computer account login method based on face recognition
CN112052829A (en) * 2020-09-25 2020-12-08 中国直升机设计研究所 Pilot behavior monitoring method based on deep learning
CN112508986A (en) * 2020-12-04 2021-03-16 武汉大学 Water level measurement method based on deep convolutional network and random field
CN113780292A (en) * 2021-08-31 2021-12-10 北京交通大学 Semantic segmentation network model uncertainty quantification method based on evidence reasoning
CN116958455A (en) * 2023-09-21 2023-10-27 北京飞渡科技股份有限公司 Roof reconstruction method and device based on neural network and electronic equipment

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104484668A (en) * 2015-01-19 2015-04-01 武汉大学 Unmanned aerial vehicle multi-overlapped-remote-sensing-image method for extracting building contour line
US20170220887A1 (en) * 2016-01-29 2017-08-03 Pointivo, Inc. Systems and methods for extracting information about objects from scene information
CN107516100A (en) * 2017-08-31 2017-12-26 北京航天绘景科技有限公司 A kind of image building extracting method based on elevation morphology building index
CN109389051A (en) * 2018-09-20 2019-02-26 华南农业大学 A kind of building remote sensing images recognition methods based on convolutional neural networks
CN109448039A (en) * 2018-10-22 2019-03-08 浙江科技学院 A kind of monocular depth estimation method based on depth convolutional neural networks
US20190102897A1 (en) * 2016-09-27 2019-04-04 Xactware Solutions, Inc. Computer Vision Systems and Methods for Detecting and Modeling Features of Structures in Images
US20190114930A1 (en) * 2017-10-16 2019-04-18 Iain Matthew Russell Methods, computer programs, computing devices and controllers
CN109670515A (en) * 2018-12-13 2019-04-23 南京工业大学 A kind of detection method and system changed for building in unmanned plane image
CN110136170A (en) * 2019-05-13 2019-08-16 武汉大学 A kind of remote sensing image building change detecting method based on convolutional neural networks


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LIU Wentao et al.: "Automatic building roof extraction based on fully convolutional neural networks", Journal of Geo-Information Science *
AN Wen et al.: "Research on building edge feature extraction using fuzzy support vector machines", Computer Engineering and Design *
DUAN Lianfei et al.: "Classification of airborne high-resolution SAR images based on BP neural networks", Bulletin of Surveying and Mapping *
CHEN Wenkang: "Remote sensing image detection of rural buildings based on deep learning", Surveying and Mapping *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111028217A (en) * 2019-12-10 2020-04-17 南京航空航天大学 Image crack segmentation method based on full convolution neural network
CN110989016B (en) * 2019-12-26 2022-06-24 山东师范大学 Non-visual field area pipeline surveying system and method based on mobile terminal
CN110989016A (en) * 2019-12-26 2020-04-10 山东师范大学 Non-visual field area pipeline surveying system and method based on mobile terminal
CN111428224A (en) * 2020-04-02 2020-07-17 苏州杰锐思智能科技股份有限公司 Computer account login method based on face recognition
CN111428224B (en) * 2020-04-02 2023-10-13 苏州杰锐思智能科技股份有限公司 Face recognition-based computer account login method
CN112052829B (en) * 2020-09-25 2023-06-30 中国直升机设计研究所 Pilot behavior monitoring method based on deep learning
CN112052829A (en) * 2020-09-25 2020-12-08 中国直升机设计研究所 Pilot behavior monitoring method based on deep learning
AU2021277762B2 (en) * 2020-12-04 2023-05-25 Wuhan University Water level measurement method based on deep convolutional network and random field
CN112508986A (en) * 2020-12-04 2021-03-16 武汉大学 Water level measurement method based on deep convolutional network and random field
CN113780292B (en) * 2021-08-31 2022-05-06 北京交通大学 Semantic segmentation network model uncertainty quantification method based on evidence reasoning
CN113780292A (en) * 2021-08-31 2021-12-10 北京交通大学 Semantic segmentation network model uncertainty quantification method based on evidence reasoning
CN116958455A (en) * 2023-09-21 2023-10-27 北京飞渡科技股份有限公司 Roof reconstruction method and device based on neural network and electronic equipment
CN116958455B (en) * 2023-09-21 2023-12-26 北京飞渡科技股份有限公司 Roof reconstruction method and device based on neural network and electronic equipment

Also Published As

Publication number Publication date
CN110543872B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN110543872B (en) Unmanned aerial vehicle image building roof extraction method based on full convolution neural network
CN106778605B (en) Automatic remote sensing image road network extraction method under assistance of navigation data
CN111259906B (en) Method for generating remote sensing image target segmentation countermeasures under condition containing multilevel channel attention
Shi et al. Road detection from remote sensing images by generative adversarial networks
CN111079847B (en) Remote sensing image automatic labeling method based on deep learning
CN109871875B (en) Building change detection method based on deep learning
CN111160127B (en) Remote sensing image processing and detecting method based on deep convolutional neural network model
CN105069811A (en) Multi-temporal remote sensing image change detection method
WO2022256460A1 (en) Systems for rapid accurate complete detailing and cost estimation for building construction from 2d plans
CN111291675A (en) Hyperspectral ancient painting detection and identification method based on deep learning
Alidoost et al. Knowledge based 3D building model recognition using convolutional neural networks from LiDAR and aerial imageries
CN112633140A (en) Multi-spectral remote sensing image urban village multi-category building semantic segmentation method and system
CN114283162A (en) Real scene image segmentation method based on contrast self-supervision learning
Jiang et al. Local and global structure for urban ALS point cloud semantic segmentation with ground-aware attention
CN116630637A (en) optical-SAR image joint interpretation method based on multi-modal contrast learning
CN116612382A (en) Urban remote sensing image target detection method and device
CN115830322A (en) Building semantic segmentation label expansion method based on weak supervision network
CN116630610A (en) ROI region extraction method based on semantic segmentation model and conditional random field
Rao et al. Roads detection of aerial image with FCN-CRF model
Saxena et al. An Optimized Technique for Image Classification Using Deep Learning
CN113627480B (en) Polarization SAR image classification method based on reinforcement learning
Guo et al. River extraction method of remote sensing image based on edge feature fusion
Wang et al. Extraction of main urban roads from high resolution satellite images by machine learning
Lguensat et al. Convolutional neural networks for the segmentation of oceanic eddies from altimetric maps
CN113705731A (en) End-to-end image template matching method based on twin network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant