CN110543872B - Unmanned aerial vehicle image building roof extraction method based on full convolution neural network - Google Patents

Unmanned aerial vehicle image building roof extraction method based on full convolution neural network

Info

Publication number
CN110543872B
Authority
CN
China
Prior art keywords
building
roof
feature
convolution
evidence
Prior art date
Legal status
Active
Application number
CN201910862731.XA
Other languages
Chinese (zh)
Other versions
CN110543872A (en)
Inventor
于洋
刘斌
苏正猛
白少云
吴波涛
王建春
梅伟
张永利
王静
顾世祥
黄俊伟
冯琦
白世晗
Current Assignee
Yunnan Institute Of Water Conservancy And Hydropower Investigation And Design
Original Assignee
Yunnan Institute Of Water Conservancy And Hydropower Investigation And Design
Priority date
Filing date
Publication date
Application filed by Yunnan Institute Of Water Conservancy And Hydropower Investigation And Design filed Critical Yunnan Institute Of Water Conservancy And Hydropower Investigation And Design
Priority to CN201910862731.XA priority Critical patent/CN110543872B/en
Publication of CN110543872A publication Critical patent/CN110543872A/en
Application granted granted Critical
Publication of CN110543872B publication Critical patent/CN110543872B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/176Urban or other man-made structures
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for extracting building roofs from unmanned aerial vehicle (UAV) images based on a fully convolutional neural network. The method comprises three parts: first, an aerial-image building roof sample library is established; second, a fully convolutional neural network is designed to learn features from the building roof samples, and the trained network is used for building roof detection; third, the extraction results are post-processed to obtain more accurate building roofs. Unlike traditional extraction methods, the method makes full use of abundant UAV image resources for data acquisition. In the algorithm design, a dedicated fully convolutional neural network based on skip-layer connections is devised, which fully extracts building roof features while preventing vanishing and exploding gradients. In post-processing, a conditional random field and the D-S evidence theory are applied to the building roof extraction results, improving the extraction accuracy of building roofs from UAV images.

Description

Unmanned aerial vehicle image building roof extraction method based on full convolution neural network
Technical Field
The invention relates to the technical field of unmanned aerial vehicle image processing, in particular to an unmanned aerial vehicle image building roof extraction method based on a full convolution neural network.
Background
With the advance of urbanization and the rapid development of economic construction in China, automatic building extraction has become increasingly important to the public and to applications across many industries, and rapid extraction and updating of building elements has become a key part of national basic geographic information construction. At present, comprehensive interpretation and adjustment based on high-resolution remote sensing imagery is the main domestic means of updating basic geographic information elements. Compared with in-office interpretation of remote sensing images, UAV imagery is easier to acquire, its data production is more flexible, and it is less restricted by external conditions; the timeliness, convenience, and economy of UAV technology therefore give it great advantages for building extraction and updating. Compared with field surveys of building data, in-office interpretation and mapping based on UAV imagery improves the efficiency of building extraction and updating.
To improve the efficiency of producing and updating building data, an automatic or semi-automatic method for rapidly extracting building roofs from UAV images is urgently needed to raise the degree of automation in updating building element data. With the rapid development of UAV technology, the spatial resolution of image data has greatly improved, providing more realistic surface detail and bringing new opportunities and challenges for automatic building roof extraction. In UAV images, building roofs and their surrounding environment are clearly distinguishable, making accurate roof extraction and localization possible. On the other hand, a building is represented by an aggregate of many characteristics, such as materials, textures, and surrounding environment, so the internal features of building roof elements are highly heterogeneous, while buildings and the adjacent ground show strong feature correlation; this makes it difficult for automatic methods to identify building objects accurately. In addition, shadows and other terrain make automatic building roof extraction even harder. Considering all these factors, a fully automatic, stable, and reliable building roof extraction method remains an internationally recognized open problem.
With the rapid development of photogrammetry and UAV technology, photogrammetry has gradually become one of the main ways to produce topographic maps. The number of UAV images is growing rapidly, so interpreting them has become an urgent problem. UAV image interpretation is mainly either manual or automatic; because the number of images is huge, manual interpretation involves an enormous workload, and automatic interpretation is the inevitable trend. At present, few automatic interpretation methods exist for UAV images; since UAV images and high-resolution remote sensing images are somewhat similar, the general idea is to apply remote sensing interpretation methods to UAV images. Common methods fall into two types: top-down data-driven methods and bottom-up model-driven methods. Data-driven methods are currently better studied and generally perform well under common conditions; they include methods based on geometric boundaries, such as Markov random fields; methods based on region segmentation, such as decision tree classification and primitive texture feature mapping; and methods based on auxiliary features, such as DSM- or LIDAR-assisted approaches.
Current model-driven methods do not yet perform ideally. They include methods based on semantic model classification, such as linear discriminants and conditional random fields; methods based on prior model knowledge, such as Snake or active contour methods, deformation model and level set methods, and prior shape model methods; and methods based on visual recognition models, such as probabilistic model voting. Although many building detection methods exist, problems remain. Data-driven methods, despite relatively good results, still do not fully exploit building features and have limited robustness. Model-driven methods face great variety in building types and heavy reliance on prior knowledge when establishing a target model; at present they solve only partial extraction problems in limited environments, and a universal descriptive model is hard to find.
In recent years, with the rapid development of deep learning technology and computer hardware, a deep convolutional neural network has strong interpretation capability in the field of image processing, and a new solution is brought to unmanned aerial vehicle image building roof extraction.
Summary of the invention:
aiming at the problems in the background art, the invention provides an unmanned aerial vehicle image building roof extraction method based on a full convolution neural network.
The technical scheme of the invention is as follows: an unmanned aerial vehicle image building roof extraction method based on a full convolution neural network is characterized by comprising the following steps:
step one, establishing a UAV image building roof sample library: an orthophoto is obtained through field aerial flight and in-office data processing, and building roof samples are obtained by manually delineating the orthophoto; some open-source high-resolution remote sensing building roof samples are added to increase the network's ability to recognize different images; the samples in the library are rotated by 45, 90, 180, and 270 degrees, and blur transformation, gamma transformation, and stretch scaling are applied, increasing the number of samples so that the detected features are robust to multiple orientations and environments;
step two, designing a convolutional neural network based on skip-layer connections to extract building roof features from UAV images: porous (dilated) convolution is used to enlarge the receptive field and extract more building roof features; according to the characteristics of aerial-image building roof samples and the problems of feature extraction in traditional deep convolutional neural networks, the input and output layers of each convolution module are connected by skip-layer connections in the feature extraction stage; deconvolution feature reconstruction is then performed on the feature maps obtained from convolutional feature extraction, and during deconvolution the feature maps obtained in the deconvolution process are fused with those obtained in the convolution process;
step three, detecting building roofs with the trained network model;
step four, refining building roof edges with a conditional random field: for the building roof extraction result obtained by the preliminary detection, the conditional random field refines the edges so that building roof boundaries become finer;
step five, based on the D-S evidence theory, carrying out building roof verification based on feature evidence.
Preferably, the second step specifically includes the following steps:
(1) The convolutional neural network based on skip-layer connections weighs the advantages and disadvantages of traditional convolutional neural networks against the characteristics of UAV building images. The network has 6 residual convolution modules and 2 ordinary convolution modules; the residual convolution modules comprise 15 convolution layers and 6 pooling layers, and the ordinary convolution modules comprise 8 convolution layers. Batch normalization is used for parameter optimization during convolution. Batch normalization first calculates the batch mean; the basic formula is shown in (1),
μ_B = (1/m)·∑_{i=1}^{m} x_i (1)
wherein: x_i is the input of the batch and m is the batch size;
then the batch variance is calculated, the basic formula is shown in (2),
σ_B² = (1/m)·∑_{i=1}^{m} (x_i − μ_B)² (2)
then x_i is standardized, the basic formula is shown in (3),
x̂_i = (x_i − μ_B)/√(σ_B² + ε) (3)
wherein: ε is a small constant that prevents division by zero;
finally, scale transformation and offset are applied to obtain the output y_i, the basic formula is shown in (4),
y_i = γ·x̂_i + β (4)
wherein: γ and β are the learnable scale and shift parameters;
the activation function adopts ReLU to introduce a nonlinear characteristic into the network; the basic formula of ReLU is shown in (5),
f(x)=max(0,x) (5)
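Formulas (1) through (5) together describe one batch-normalization forward pass followed by a ReLU activation. As an illustrative sketch (not the patented network itself), they can be written in a few lines of NumPy:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch normalization forward pass, following formulas (1)-(4):
    batch mean, batch variance, standardization, then scale and shift."""
    mu = x.mean(axis=0)                    # (1) batch mean
    var = x.var(axis=0)                    # (2) batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # (3) standardize
    return gamma * x_hat + beta            # (4) scale and shift

def relu(x):
    """ReLU activation, formula (5): f(x) = max(0, x)."""
    return np.maximum(0.0, x)

batch = np.array([[1.0, -2.0], [3.0, 4.0], [5.0, -6.0]])
out = relu(batch_norm(batch))
```

In a trained network γ and β are learned per feature channel; fixed scalars are used here only to keep the sketch short.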
In order to solve the feature loss problem in feature extraction by traditional deep convolutional neural networks, a residual module based on skip-layer connections is adopted for feature learning; the input and output layers of the residual convolution module are connected by a skip-layer connection, which preserves the feature extraction result while preventing vanishing and exploding gradients;
(2) Building feature reconstruction based on deconvolution. After feature extraction by the convolution layers, deconvolution layers reconstruct the UAV image building roof features. The deconvolution part comprises 6 deconvolution modules, with 6 deconvolution up-sampling layers and 16 convolution layers in total. During deconvolution feature reconstruction, the building roof feature map obtained by deconvolution is fused with the feature map obtained in the convolution process, using the convolution-stage feature maps to assist reconstruction; after feature reconstruction, image classification is performed with a sigmoid.
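The skip-layer connection and the deconvolution-stage feature fusion described above can be illustrated with toy stand-ins: here an elementwise map replaces the convolution stack, nearest-neighbour upsampling replaces the learned deconvolution, and channel stacking performs the fusion. This is a minimal sketch, not the patented architecture:

```python
import numpy as np

def residual_block(x, weight):
    """Skip-layer (residual) connection: output = F(x) + x, where the
    toy map F (elementwise scale + ReLU) stands in for the module's
    convolution layers."""
    fx = np.maximum(0.0, x * weight)
    return fx + x                      # identity shortcut

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling, a stand-in for a learned
    deconvolution (transposed convolution) layer."""
    return feat.repeat(2, axis=0).repeat(2, axis=1)

def fuse(decoder_feat, encoder_feat):
    """Feature fusion: stack the upsampled decoder map with the
    same-resolution encoder map along a channel axis."""
    return np.stack([decoder_feat, encoder_feat], axis=-1)

enc = np.ones((4, 4))                  # encoder-stage feature map
dec = upsample2x(np.ones((2, 2)))      # decoder map brought to 4x4
fused = fuse(dec, enc)                 # shape (4, 4, 2)
```

The identity shortcut is what lets gradients bypass F during backpropagation, which is the mechanism behind the vanishing/exploding-gradient claim in the text.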
Preferably, the step four specifically includes the following steps:
for the preliminary UAV image building roof extraction result obtained in step three, a conditional random field is selected to refine the building roof edges; a conditional random field is a conditional probability model of one set of output random variables given another set of input random variables; it handles the label bias problem well, and all features can be globally normalized to obtain a globally optimal solution;
construction of the conditional random field image segmentation energy function: define the hidden variable X_i as the classification label of pixel i, whose value ranges over the set of semantic labels to be classified, L = {l_1, l_2, l_3, ...}; Y_i is the observed value of each random variable X_i, i.e., the color value of each pixel; the goal of semantic image segmentation with a conditional random field is to infer the category label of the hidden variable X_i from the observed variable Y_i;
the conditional random field (X, I) obeys the Gibbs distribution:
P(X|I) = (1/Z(I))·exp(−E(X|I)) (6)
wherein: E(X|I) is the energy function, written E(X) for short; X takes values in the label set L; Z(I) is a normalization factor;
by minimizing the energy function in formula (6), an optimal pixel classification result can be obtained; the energy function conditioned on the whole image is defined as:
E(X) = ∑_i ψ_u(x_i) + ∑_{i<j} ψ_p(x_i, x_j) (7)
wherein: ψ_u(x_i) is the unary potential function; ψ_p(x_i, x_j) = −log P(x_i, x_j) is the pairwise potential function, calculated by formula (8); P(x_i, x_j) represents the probability that the pixel pair (i, j) belongs to a certain category label;
ψ_p(x_i, x_j) = μ_p(x_i, x_j)[ω^(1)·exp(−|p_i − p_j|²/(2θ_α²) − |I_i − I_j|²/(2θ_β²)) + ω^(2)·exp(−|p_i − p_j|²/(2θ_γ²))] (8)
wherein: μ_p(x_i, x_j) is the label compatibility function, with μ_p(x_i, x_j) = 1 when x_i ≠ x_j and 0 otherwise, judging compatibility between different labels; p denotes position information and I denotes color information; θ_α controls the scale of the position information; θ_β controls the scale of color similarity; ω^(1) and ω^(2) are linear combination weights; the second term of formula (8) depends on position information only, with θ_γ controlling its scale;
the mean-field approximation Q(X) = ∏_i Q_i(X_i) is adopted, Q(X) is updated iteratively, and the optimal solution of the model is finally obtained by minimizing the K-L divergence between P(X) and Q(X).
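The mean-field iteration sketched above can be illustrated on a toy fully connected CRF. This sketch assumes precomputed unary energies and a fixed pairwise kernel matrix in place of the Gaussian position/color kernels of formula (8), with a Potts compatibility function:

```python
import numpy as np

def mean_field(unary, kernel, n_iters=10):
    """Toy mean-field inference for a fully connected CRF.
    unary:  (N, L) unary energies psi_u per pixel and label
    kernel: (N, N) fixed pairwise weights k(i, j), zero diagonal
    Compatibility is a Potts model: mu(l, l') = 1 when l != l'."""
    Q = np.exp(-unary)
    Q /= Q.sum(axis=1, keepdims=True)       # initialise Q_i from unaries
    for _ in range(n_iters):
        msg = kernel @ Q                    # messages from all other pixels
        # Potts compatibility: each label is penalised by the mass the
        # neighbours place on all *other* labels
        pairwise = msg.sum(axis=1, keepdims=True) - msg
        Q = np.exp(-unary - pairwise)
        Q /= Q.sum(axis=1, keepdims=True)   # normalise (the factor Z)
    return Q

# 3 pixels, 2 labels; pixel 2 is ambiguous but strongly tied to 0 and 1
unary = np.array([[0.1, 2.0], [0.1, 2.0], [1.0, 1.1]])
kernel = np.array([[0.0, 1.0, 1.0],
                   [1.0, 0.0, 1.0],
                   [1.0, 1.0, 0.0]])
labels = mean_field(unary, kernel).argmax(axis=1)
```

In this example the pairwise term pulls the ambiguous pixel toward the label of its confident neighbours, which is exactly the edge-smoothing effect exploited for roof refinement.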
Preferably, the step five specifically includes the following steps:
(1) Theoretical basis of the D-S evidence theory
As the basic concept of the D-S evidence theory, the set of all possible outcomes for the building object to be verified forms a space called the frame of discernment, defined as X; the set of all subsets of X is its power set 2^X. For any A ∈ 2^X there is a basic probability assignment m(A) ∈ [0,1], satisfying
∑_{A⊆X} m(A) = 1 (9)
m(∅) = 0 (10)
wherein m is a basic probability assignment function (BPAF) on 2^X, and m(A) is called the basic probability of A;
the D-S evidence theory defines a belief function Bel and a plausibility function Pl to represent the uncertainty of the problem, namely:
Bel(A) = ∑_{B⊆A} m(B) (11)
Pl(A) = ∑_{B∩A≠∅} m(B) (12)
the belief function Bel(A) represents the degree of confidence that A is true, and is also called the lower limit function; the plausibility function Pl(A) represents the degree of confidence that A is not false; then [Bel(A), Pl(A)] is the confidence interval of A, describing the upper and lower limits of the confidence held in A. When multiple pieces of evidence exist, the Dempster combination rule can be used to combine multiple BPAFs, namely
K = ∑_{B∩C=∅} m_1(B)·m_2(C) (13)
m(A) = (1/(1−K))·∑_{B∩C=A} m_1(B)·m_2(C), A ≠ ∅ (14)
(2) Building roof verification with the D-S evidence model. The identity of a building is verified only from the building scene observed in the UAV image. According to the D-S evidence theory, the frame of discernment is taken as X = {T, F}, where T denotes a non-building object and F denotes a building object, so that the nonempty focal elements are {T}, {F}, and {T, F};
a confidence assignment function G: 2^X → [0,1] is defined with G({T, F}) + G(T) + G(F) = 1, wherein G(F) denotes the confidence that the current feature supports a building object, G(T) denotes the confidence that it supports a non-building object, and G({T, F}) = 1 − G(T) − G(F) denotes the confidence that the identity of the object is uncertain under this evidence, i.e., the confidence assigned to the unknown;
(3) Selecting edge, spectrum, texture, context, and DSM evidence models closely related to buildings, modeling these features in a form suitable for building verification, and defining the corresponding probability assignment functions;
(4) Building verification decision criterion: through analysis of the relevant building verification features and definition of the corresponding probability assignment functions, each building object is processed; the probability assignment function of each feature is obtained from the building feature detection results, and the BPAFs of the features are then combined with the evidence combination rule of the D-S evidence theory to obtain the probability assignment of the combined multi-feature evidence;
according to the definition of the D-S evidence theory on the belief function Bel, the belief probability Bel of the existence of the building can be calculated i (T),Bel i (F) (ii) a According to the maximum probability distribution rule, the building verification judgment criterion is defined as follows: for building roofs i, if Bel i (T)>Bel i (F) Then the roof is not considered to be the roof of the building; conversely, the current object is considered to be the building roof.
Beneficial effects: the invention takes UAV images as the input data source, designs a deep fully convolutional neural network based on skip-layer connections to extract building roofs, refines building roof edges with a conditional random field, and combines prior knowledge of real building roofs to realize automatic extraction of building roofs from aerial images, with strong practicability and high accuracy. The innovations are:
(1) A fully convolutional neural network based on skip-layer connections is designed; residual modules based on skip-layer connections are introduced for feature learning in the feature extraction process, and feature maps from the convolution modules are fused during deconvolution; this distinctive network design better extracts building features from UAV images.
(2) Aiming at the problem of rough edges in the unmanned aerial vehicle image building roof extraction, a conditional random field is adopted to carry out edge refinement on the primary result of building roof extraction.
(3) To address false detections in building roof extraction, the D-S evidence theory is introduced to verify building roofs based on feature evidence, and multiple building features are innovatively introduced as verification cues.
Drawings
FIG. 1 is a flow chart of a method for extracting a roof of an unmanned aerial vehicle image building based on a full convolution neural network;
FIG. 2 is a diagram of the fully convolutional neural network architecture based on skip-layer connections;
FIG. 3 is a structural diagram of the skip-layer connection.
Detailed Description
The invention provides a method for extracting building roofs from UAV images based on a fully convolutional neural network. First, a convolutional neural network based on skip-layer connections extracts building roof features, and deconvolution reconstructs features from the building roof feature maps obtained by the convolutional network. Then the trained network model detects building roofs, a conditional random field refines the edges of the detection result, and finally the D-S evidence theory is introduced to verify the extraction result by inference and remove falsely detected objects.
The technical scheme of the invention is described in detail below with reference to the accompanying drawings, the technical process is shown in fig. 1, and the example technical scheme process comprises the following steps:
The method comprises the following steps. Step one, establishing a UAV image building roof sample library: a UAV orthophoto is obtained through field aerial flight and in-office processing, and building roof samples are obtained by manually delineating the UAV image; some open-source high-resolution remote sensing building roof annotations are added to ensure the applicability of the trained network model. After the building roof sample library is obtained, suitable operations are selected to expand the number of samples, mainly rotations by 45, 90, 180, and 270 degrees, which increase the sample count while making the network robust to houses in different orientations, and a gamma transformation of the image, whose basic formula is shown in (1):
s=cr γ (1)
wherein c and γ are positive constants; γ takes the values 0.5 and 2 to enhance the model's applicability to building detection under different brightness conditions; building samples are also scaled and stretched by 10 percent; after this series of sample expansion operations, the number of samples becomes 30 times that of the original imagery.
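The rotation and gamma-transform expansion of step one can be sketched as follows. Image intensities are assumed normalized to [0, 1]; the 45-degree rotation (which needs interpolation) and the blur and stretch operations are omitted from this minimal sketch:

```python
import numpy as np

def gamma_transform(img, c=1.0, gamma=0.5):
    """Gamma transform s = c * r**gamma of formula (1); img holds
    intensities r normalized to [0, 1]."""
    return c * img ** gamma

def rotations(img):
    """The axis-aligned rotations used for sample expansion: the
    original plus 90-, 180-, and 270-degree rotations."""
    return [np.rot90(img, k) for k in range(4)]

sample = np.array([[0.0, 0.25], [0.5, 1.0]])
# gamma 0.5 brightens dark scenes, gamma 2.0 darkens bright ones
augmented = [gamma_transform(r, gamma=g)
             for r in rotations(sample) for g in (0.5, 2.0)]
```

Applied to both images and their roof masks (rotations only for the masks), the same loop yields the multiplied sample library described in the text.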
Step two, designing a convolutional neural network based on skip-layer connections to extract building roof features from UAV images: porous (dilated) convolution is used to enlarge the receptive field and extract more building roof features; according to the characteristics of aerial-image building roof samples and the problems of feature extraction in traditional deep convolutional neural networks, the input and output layers of each convolution module are connected by skip-layer connections in the feature extraction stage; deconvolution feature reconstruction is then performed on the feature maps obtained from convolutional feature extraction, and during deconvolution the feature maps obtained in the deconvolution process are fused with those obtained in the convolution process;
the specific content is as follows:
(1) The convolutional neural network based on skip-layer connections weighs the advantages and disadvantages of traditional convolutional neural networks against the characteristics of UAV building images. The network has 6 residual convolution modules and 2 ordinary convolution modules; the residual convolution modules comprise 15 convolution layers and 6 pooling layers, and the ordinary convolution modules comprise 8 convolution layers. Batch normalization is used for parameter optimization during convolution. Batch normalization first calculates the batch mean; the basic formula is shown in (2),
μ_B = (1/m)·∑_{i=1}^{m} x_i (2)
wherein: x_i is the input of the batch and m is the batch size;
then the batch variance is calculated, the basic formula is shown in (3),
σ_B² = (1/m)·∑_{i=1}^{m} (x_i − μ_B)² (3)
then x_i is standardized, the basic formula is shown in (4),
x̂_i = (x_i − μ_B)/√(σ_B² + ε) (4)
wherein: ε is a small constant that prevents division by zero;
finally, scale transformation and offset are applied to obtain the output y_i, the basic formula is shown in (5),
y_i = γ·x̂_i + β (5)
wherein: γ and β are the learnable scale and shift parameters;
the activation function adopts ReLU to introduce a nonlinear characteristic into the network; the basic formula of ReLU is shown in (6),
f(x)=max(0,x) (6)
In order to solve the feature loss problem in feature extraction by traditional deep convolutional neural networks, a residual module based on skip-layer connections is adopted for feature learning; the input and output layers of the residual convolution module are connected by a skip-layer connection, which preserves the feature extraction result while preventing vanishing and exploding gradients;
(2) Building feature reconstruction based on deconvolution. After feature extraction by the convolution layers, deconvolution layers reconstruct the UAV image building roof features. The deconvolution part comprises 6 deconvolution modules, with 6 deconvolution up-sampling layers and 16 convolution layers in total. During deconvolution feature reconstruction, the building roof feature map obtained by deconvolution is fused with the feature map obtained in the convolution process, using the convolution-stage feature maps to assist reconstruction; after feature reconstruction, image classification is performed with a sigmoid;
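Step two also relies on porous (dilated, or atrous) convolution, which enlarges the receptive field by spacing the kernel taps apart. A minimal one-dimensional sketch (the network itself uses 2-D convolutions) makes the idea concrete:

```python
import numpy as np

def dilated_conv1d(signal, kernel, dilation=2):
    """1-D dilated ('porous'/atrous) convolution: kernel taps are
    spaced `dilation` samples apart, enlarging the receptive field
    without adding parameters."""
    k = len(kernel)
    span = (k - 1) * dilation + 1           # effective receptive field
    out = []
    for i in range(len(signal) - span + 1):
        taps = signal[i:i + span:dilation]  # sample with "holes"
        out.append(float(np.dot(taps, kernel)))
    return np.array(out)

x = np.arange(8, dtype=float)
y = dilated_conv1d(x, np.array([1.0, 1.0, 1.0]), dilation=2)
# with dilation 2, a 3-tap kernel covers 5 input samples
```

With dilation 1 this reduces to an ordinary valid convolution, which is why dilated layers can replace standard ones without changing parameter count.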
step three, detecting building roofs with the trained network model;
step four, refining building roof edges with a conditional random field: for the building roof extraction result obtained by the preliminary detection, the conditional random field refines the edges so that building roof boundaries become finer;
the method comprises the following steps of selecting a conditional random field to carry out edge refinement on a building roof, wherein the conditional random field is a conditional probability model of another set of output random variables under the condition of giving a set of input random variables, the problem of mark offset can be solved well, and all features can be subjected to global normalization to obtain a global optimal solution;
construction of the conditional random field image segmentation energy function: the hidden variable Xi is defined as the classification label of pixel i, taking values in the set of semantic labels to be classified, L = {l1, l2, l3, …}; Yi is the observed value of each random variable Xi, namely the color value of each pixel; the goal of conditional random field image semantic segmentation is to infer the category label of the hidden variable Xi from the observed variable Yi;
the Gibbs distribution of the conditional random field P(X | I) is:

P(X | I) = (1/Z(I)) · exp(−E(X | I))    (7)

wherein: E(X | I) is the energy function, abbreviated E(X); X belongs to the label set L; Z(I) is the normalization factor;
by minimizing the energy function in formula (8), an optimal pixel classification result can be obtained; the energy function over the whole image is defined as:

E(X) = ∑_i ψ_u(x_i) + ∑_{i<j} ψ_p(x_i, x_j)    (8)

wherein: ψ_u(x_i) is the unary potential function; ψ_p(x_i, x_j) = −log P(x_i, x_j) is the pairwise potential function, given by formula (9); P(x_i, x_j) represents the probability that the pixel pair (i, j) belongs to a certain category label;
ψ_p(x_i, x_j) = μ_p(x_i, x_j) [ ω⁽¹⁾ exp(−|p_i − p_j|²/(2θ_α²) − |I_i − I_j|²/(2θ_β²)) + ω⁽²⁾ exp(−|p_i − p_j|²/(2θ_γ²)) ]    (9)

wherein: μ_p(x_i, x_j) is the label compatibility function, with μ_p(x_i, x_j) = 1 when x_i ≠ x_j and 0 otherwise, used to judge the compatibility between different labels; p denotes position information and I denotes color information; θ_α controls the scale of the position information; θ_β controls the scale of the color similarity; ω is a linear combination weight; the second term of formula (9) depends only on position information, with θ_γ controlling its scale;
using the mean-field approximation Q(X) = Π_i Q_i(X_i), Q(X) is updated iteratively, and the optimal solution of the model is finally obtained by minimizing the K-L divergence between P(X) and Q(X);
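The mean-field update described above can be sketched as follows. This is a simplified illustration under stated assumptions: the pairwise Gaussian position/colour kernels of formula (9) are replaced by a precomputed weight matrix, and a Potts compatibility (μ_p = 1 for differing labels) is assumed.

```python
import numpy as np

def mean_field_crf(unary, kernel, n_iters=10):
    """Mean-field approximation Q(X) = prod_i Q_i(X_i) for the CRF energy
    E(X) = sum_i psi_u(x_i) + sum_{i<j} psi_p(x_i, x_j).
    unary:  (N, L) unary potentials psi_u (negative log-probabilities).
    kernel: (N, N) symmetric pairwise weights with zero diagonal -- a
            stand-in for the Gaussian position/colour kernels.
    Returns (N, L) per-pixel label distributions Q."""
    def softmax(z):
        e = np.exp(z - z.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)

    q = softmax(-unary)                      # initialise from the unary term
    for _ in range(n_iters):
        msg = kernel @ q                     # message passing: sum_j k(i,j) Q_j(l)
        # Potts compatibility: the penalty for label l at pixel i is the
        # kernel-weighted mass that neighbours place on *other* labels.
        penalty = msg.sum(axis=1, keepdims=True) - msg
        q = softmax(-(unary + penalty))      # update and renormalise Q_i
    return q

unary = np.array([[0.1, 2.0], [0.2, 1.5], [2.0, 0.1]])
kernel = 0.5 * (np.ones((3, 3)) - np.eye(3))
q = mean_field_crf(unary, kernel, n_iters=5)
```

Each iteration re-normalises Q_i with a softmax, which is what makes the update a proper distribution and drives the K-L divergence between P(X) and Q(X) down.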
step five, performing building roof verification based on feature evidence by using the D-S evidence theory, the specific contents being as follows:
(1) Theoretical basis of D-S evidence theory
As the underlying concept of D-S evidence theory, the space formed by the set of all possible results for the building object to be verified is first defined; this frame of discernment is denoted X, and the set of all subsets of X is denoted 2^X; for any A ∈ 2^X there is m(A) ∈ [0, 1], and

m(∅) = 0

∑_{A⊆X} m(A) = 1

wherein m is a basic probability assignment function (BPAF) on 2^X, and m(A) is called the basic probability of A;
the D-S evidence theory defines a belief function Bel and a plausibility function Pl to represent the uncertainty of the problem, namely:

Bel(A) = ∑_{B⊆A} m(B)

Pl(A) = ∑_{B∩A≠∅} m(B)
the belief function Bel(A) represents the degree to which A is believed to be true, and is also referred to as the lower-bound function; the plausibility function Pl(A) represents the degree to which A is believed not to be false; [Bel(A), Pl(A)] is then the confidence interval of A, describing the lower and upper bounds of the belief held in A; when multiple pieces of evidence exist, the Dempster combination rule can be used to combine multiple BPAFs, namely
m(A) = (1/K) ∑_{A₁∩A₂∩…∩Aₙ=A} m₁(A₁) · m₂(A₂) · … · mₙ(Aₙ)

K = ∑_{A₁∩A₂∩…∩Aₙ≠∅} m₁(A₁) · m₂(A₂) · … · mₙ(Aₙ)
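The Dempster combination rule can be sketched for the two-element frame {T, F} as follows. The encoding of hypotheses as sorted strings and the example mass values are illustrative assumptions, not values from the patent.

```python
def dempster_combine(m1, m2):
    """Dempster's rule for two BPAFs over the frame X = {T, F}; hypotheses
    are encoded as sorted strings: 'T', 'F', and 'FT' for the full frame
    (the unknown hypothesis). Mass on conflicting (empty-intersection)
    pairs is discarded and the rest renormalised by K = 1 - conflict."""
    combined, conflict = {}, 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = frozenset(a) & frozenset(b)
            if not inter:
                conflict += wa * wb                  # mass on the empty set
            else:
                key = ''.join(sorted(inter))
                combined[key] = combined.get(key, 0.0) + wa * wb
    k = 1.0 - conflict                               # normalisation factor K
    return {h: v / k for h, v in combined.items()}

# two hypothetical feature evidences: G(F), G(T), and the unknown mass G({T, F})
m_edge = {'F': 0.6, 'T': 0.2, 'FT': 0.2}
m_dsm = {'F': 0.5, 'T': 0.3, 'FT': 0.2}
m = dempster_combine(m_edge, m_dsm)
```

With these example masses the combined belief in the building hypothesis, m['F'] ≈ 0.72, exceeds m['T'] ≈ 0.22: agreeing evidence reinforces, and the conflicting mass (0.28 here) is renormalised away.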
(2) Building roof verification with the D-S evidence model: building roof verification only needs to verify the building identity according to the building scene observed in the unmanned aerial vehicle image; according to D-S evidence theory, the frame of discernment is taken as X = {T, F}, where T denotes a non-building object and F denotes a building object, so that

T ∩ F = ∅, T ∪ F = X
A belief assignment function G: 2^X → [0, 1] is defined, with

G(∅) = 0
G({T, F}) + G(T) + G(F) = 1, where G(F) denotes the confidence that the current feature supports a building object, G(T) denotes the confidence that it supports a non-building object, and G({T, F}) = 1 − G(T) − G(F) denotes the confidence that the identity of the object remains uncertain under this evidence, i.e., the confidence assigned to the unknown hypothesis;
(3) Feature evidence models: an edge evidence model, a spectral evidence model, a texture evidence model, a context evidence model and a DSM evidence model closely related to buildings are selected; these features are modeled in a form suitable for building verification, and the corresponding probability assignment functions are defined;
(4) Building verification decision criterion: building objects are processed through the analysis of the verification-related building features and the definition of the corresponding probability assignment functions; the probability assignment function of each feature is obtained from the building feature detection results, and the BPAFs of the individual features are then combined with the evidence combination rule of D-S evidence theory to obtain the probability assignment function of the combined multi-feature evidence;
according to the definition of the belief function Bel in D-S evidence theory, the belief probabilities Bel_i(T) and Bel_i(F) for the presence of a building can be calculated; according to the maximum probability assignment rule, the building verification decision criterion is defined as follows: for a candidate building roof i, if Bel_i(T) > Bel_i(F), the object is not considered to be a building roof; otherwise, the current object is considered to be a building roof.
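The decision criterion reduces to a single comparison; a minimal sketch, with the belief values assumed to come from the combined multi-feature evidence:

```python
def verify_roof(bel_t, bel_f):
    """Maximum-belief decision criterion for a candidate roof i: the object
    is rejected as a building roof only when Bel_i(T) > Bel_i(F); ties are
    resolved in favour of the building hypothesis here (an assumption)."""
    return not (bel_t > bel_f)

# hypothetical combined beliefs: strong support for 'building object' F
accepted = verify_roof(bel_t=0.22, bel_f=0.72)
```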
The invention adopts unmanned aerial vehicle images as the input data source, designs a deep fully convolutional neural network based on skip-layer connection to extract building roofs, refines the building roof edges with a conditional random field, and combines prior knowledge of real building roofs to realize automatic extraction of building roofs from aerial images; the method has strong practicability and high accuracy, and its innovation points are:
(1) A fully convolutional neural network based on skip-layer connection is designed: residual modules based on skip-layer connection are introduced for feature learning in the feature extraction process, and the feature maps of the convolution modules are fused during the deconvolution process; this dedicated network design can better extract building features from unmanned aerial vehicle images.
(2) For the problem of rough edges in unmanned aerial vehicle image building roof extraction, a conditional random field is adopted to refine the edges of the preliminary building roof extraction result.
(3) For the problem of false detections in building roof extraction, D-S evidence theory is introduced to perform building roof verification based on feature evidence, and multiple building features are innovatively introduced as building roof verification cues.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various changes and modifications can be made without departing from the inventive concept of the present invention, and these changes and modifications are all within the scope of the present invention.

Claims (1)

1. An unmanned aerial vehicle image building roof extraction method based on a full convolution neural network is characterized by comprising the following steps:
establishing an unmanned aerial vehicle image building roof sample library: an orthophoto is obtained through field aerial photography and indoor data processing, building roof samples are manually delineated on the orthophoto, and a portion of open-source high-resolution remote-sensing building roof samples is added to improve the network's recognition ability on different images; the library samples are rotated by 45°, 90°, 180° and 270°, and blur transformation, gamma transformation and stretch scaling are applied simultaneously to increase the number of samples, so that the detected features are robust to multiple orientations and environments;
designing a convolutional neural network based on skip-layer connection to extract the building roof features of the unmanned aerial vehicle image: atrous (dilated) convolution is used to enlarge the receptive field so as to extract more building roof features; according to the characteristics of aerial-image building roof samples, and considering the problems of traditional deep convolutional neural network feature extraction, the input layer and the output layer of each convolution module are joined by a skip connection in the feature extraction stage; for deconvolution feature reconstruction, the feature maps obtained by convolutional feature extraction are reconstructed by deconvolution, and during deconvolution the feature maps obtained in the deconvolution process are fused with the feature maps obtained in the convolution process;
the specific contents are as follows:
(1) The convolutional neural network based on skip-layer connection is designed according to the characteristics of unmanned aerial vehicle building images, weighing the advantages and disadvantages of the traditional convolutional neural network; the network has 6 residual convolution modules and 2 common convolution modules; the residual convolution modules comprise 15 convolution layers and 6 pooling layers, and the common convolution modules comprise 8 convolution layers; batch normalization is used for parameter optimization in the convolution process; batch normalization first calculates the batch mean, the basic formula being shown as (1),

μ_B = (1/m) ∑_{i=1}^{m} x_i    (1)

wherein x_i (i = 1, …, m) are the inputs of the batch; the batch variance is then calculated, the basic formula being shown as (2),

σ_B² = (1/m) ∑_{i=1}^{m} (x_i − μ_B)²    (2)

each x_i is then normalized, the basic formula being shown as (3),

x̂_i = (x_i − μ_B) / √(σ_B² + ε)    (3)

wherein ε is a small constant for numerical stability; finally, scale transformation and offset are applied to obtain the output y_i, the basic formula being shown as (4),

y_i = γ · x̂_i + β    (4)

wherein γ and β are the learnable scale and shift parameters; the activation function adopts ReLU to introduce nonlinearity into the network, the basic formula of ReLU being shown as (5),

f(x) = max(0, x)    (5)
in order to solve the problem of feature loss in the feature extraction process of the traditional deep convolutional neural network, a residual module based on skip-layer connection is adopted for feature learning; the input layer and the output layer of the residual convolution module are joined by a skip connection, which preserves the feature extraction result while preventing gradient vanishing and gradient explosion;
(2) Building feature reconstruction based on deconvolution: after feature extraction by the convolution layers, deconvolution layers are adopted to reconstruct the building roof features of the unmanned aerial vehicle image; the deconvolution stage comprises 6 deconvolution modules, containing 6 deconvolution up-sampling layers and 16 convolution layers in total; during deconvolution feature reconstruction, the building roof feature map obtained by deconvolution is fused with the feature map obtained in the convolution process, so that the convolution-stage feature maps assist the reconstruction; after feature reconstruction, image classification is performed with a sigmoid function;
step three, detecting the building roof by using the trained network model;
step four, refining the building roof edges by using a conditional random field: for the building roof extraction result obtained by the preliminary detection, the building roof edges are refined with a conditional random field, so that the building roof edges become finer;
a conditional random field is selected for building roof edge refinement; a conditional random field is a conditional probability model of a set of output random variables given another set of input random variables; it can well overcome the label bias problem, and all features can be globally normalized to obtain a globally optimal solution;
construction of the conditional random field image segmentation energy function: the hidden variable Xi is defined as the classification label of pixel i, taking values in the set of semantic labels to be classified, L = {l1, l2, l3, …}; Yi is the observed value of each random variable Xi, namely the color value of each pixel; the goal of conditional random field image semantic segmentation is to infer the category label of the hidden variable Xi from the observed variable Yi;
the Gibbs distribution of the conditional random field P(X | I) is:

P(X | I) = (1/Z(I)) · exp(−E(X | I))    (6)

wherein: E(X | I) is the energy function, abbreviated E(X); X belongs to the label set L; Z(I) is the normalization factor;
by minimizing the energy function in formula (6), an optimal pixel classification result can be obtained; the energy function over the whole image is defined as:

E(X) = ∑_i ψ_u(x_i) + ∑_{i<j} ψ_p(x_i, x_j)    (7)

wherein: ψ_u(x_i) is the unary potential function; ψ_p(x_i, x_j) = −log P(x_i, x_j) is the pairwise potential function, calculated by formula (8); P(x_i, x_j) represents the probability that the pixel pair (i, j) belongs to a certain category label;
ψ_p(x_i, x_j) = μ_p(x_i, x_j) [ ω⁽¹⁾ exp(−|p_i − p_j|²/(2θ_α²) − |I_i − I_j|²/(2θ_β²)) + ω⁽²⁾ exp(−|p_i − p_j|²/(2θ_γ²)) ]    (8)

wherein: μ_p(x_i, x_j) is the label compatibility function, with μ_p(x_i, x_j) = 1 when x_i ≠ x_j and 0 otherwise, used to judge the compatibility between different labels; p denotes position information and I denotes color information; θ_α controls the scale of the position information; θ_β controls the scale of the color similarity; ω is a linear combination weight; the second term of formula (8) depends only on position information, with θ_γ controlling its scale;
using the mean-field approximation Q(X) = Π_i Q_i(X_i), Q(X) is updated iteratively, and the optimal solution of the model is finally obtained by minimizing the K-L divergence between P(X) and Q(X);
step five, performing building roof verification based on feature evidence by using the D-S evidence theory, the specific contents being as follows:
(1) Theoretical basis of D-S evidence theory
As the underlying concept of D-S evidence theory, the space formed by the set of all possible results for the building object to be verified is first defined; this frame of discernment is denoted X, and the set of all subsets of X is denoted 2^X; for any A ∈ 2^X there is m(A) ∈ [0, 1], and

m(∅) = 0

∑_{A⊆X} m(A) = 1

wherein m is a basic probability assignment function (BPAF) on 2^X, and m(A) is called the basic probability of A;
the D-S evidence theory defines a belief function Bel and a plausibility function Pl to represent the uncertainty of the problem, namely:

Bel(A) = ∑_{B⊆A} m(B)

Pl(A) = ∑_{B∩A≠∅} m(B)
the belief function Bel(A) represents the degree to which A is believed to be true, and is also referred to as the lower-bound function; the plausibility function Pl(A) represents the degree to which A is believed not to be false; [Bel(A), Pl(A)] is then the confidence interval of A, describing the lower and upper bounds of the belief held in A; when multiple pieces of evidence exist, the Dempster combination rule can be used to combine multiple BPAFs, namely
m(A) = (1/K) ∑_{A₁∩A₂∩…∩Aₙ=A} m₁(A₁) · m₂(A₂) · … · mₙ(Aₙ)

K = ∑_{A₁∩A₂∩…∩Aₙ≠∅} m₁(A₁) · m₂(A₂) · … · mₙ(Aₙ)
(2) Building roof verification with the D-S evidence model: the building identity is verified only according to the building scene observed in the unmanned aerial vehicle image; according to D-S evidence theory, the frame of discernment is taken as X = {T, F}, where T is a non-building object and F is a building object, so that

T ∩ F = ∅, T ∪ F = X
A belief assignment function G: 2^X → [0, 1] is defined, with

G(∅) = 0
G({T, F}) + G(T) + G(F) = 1, where G(F) denotes the confidence that the current feature supports a building object, G(T) denotes the confidence that it supports a non-building object, and G({T, F}) = 1 − G(T) − G(F) denotes the confidence that the identity of the object remains uncertain under this evidence, i.e., the confidence assigned to the unknown hypothesis;
(3) Feature evidence models: an edge evidence model, a spectral evidence model, a texture evidence model, a context evidence model and a DSM evidence model closely related to buildings are selected; these features are modeled in a form suitable for building verification, and the corresponding probability assignment functions are defined;
(4) Building verification decision criterion: building objects are processed through the analysis of the verification-related building features and the definition of the corresponding probability assignment functions; the probability assignment function of each feature is obtained from the building feature detection results, and the BPAFs of the individual features are then combined with the evidence combination rule of D-S evidence theory to obtain the probability assignment function of the combined multi-feature evidence;
according to the definition of the belief function Bel in D-S evidence theory, the belief probabilities Bel_i(T) and Bel_i(F) for the presence of a building can be calculated; according to the maximum probability assignment rule, the building verification decision criterion is defined as follows: for a candidate building roof i, if Bel_i(T) > Bel_i(F), the object is not considered to be a building roof; otherwise, the current object is considered to be a building roof.
CN201910862731.XA 2019-09-12 2019-09-12 Unmanned aerial vehicle image building roof extraction method based on full convolution neural network Active CN110543872B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910862731.XA CN110543872B (en) 2019-09-12 2019-09-12 Unmanned aerial vehicle image building roof extraction method based on full convolution neural network


Publications (2)

Publication Number Publication Date
CN110543872A (en) 2019-12-06
CN110543872B (en) 2023-04-18

Family

ID=68713455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910862731.XA Active CN110543872B (en) 2019-09-12 2019-09-12 Unmanned aerial vehicle image building roof extraction method based on full convolution neural network

Country Status (1)

Country Link
CN (1) CN110543872B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111028217A (en) * 2019-12-10 2020-04-17 南京航空航天大学 Image crack segmentation method based on full convolution neural network
CN110989016B (en) * 2019-12-26 2022-06-24 山东师范大学 Non-visual field area pipeline surveying system and method based on mobile terminal
CN111428224B (en) * 2020-04-02 2023-10-13 苏州杰锐思智能科技股份有限公司 Face recognition-based computer account login method
CN112052829B (en) * 2020-09-25 2023-06-30 中国直升机设计研究所 Pilot behavior monitoring method based on deep learning
CN112508986B (en) * 2020-12-04 2022-07-05 武汉大学 Water level measurement method based on deep convolutional network and random field
CN113780292B (en) * 2021-08-31 2022-05-06 北京交通大学 Semantic segmentation network model uncertainty quantification method based on evidence reasoning
CN116958455B (en) * 2023-09-21 2023-12-26 北京飞渡科技股份有限公司 Roof reconstruction method and device based on neural network and electronic equipment

Citations (6)

Publication number Priority date Publication date Assignee Title
CN104484668A (en) * 2015-01-19 2015-04-01 武汉大学 Unmanned aerial vehicle multi-overlapped-remote-sensing-image method for extracting building contour line
CN107516100A (en) * 2017-08-31 2017-12-26 北京航天绘景科技有限公司 A kind of image building extracting method based on elevation morphology building index
CN109389051A (en) * 2018-09-20 2019-02-26 华南农业大学 A kind of building remote sensing images recognition methods based on convolutional neural networks
CN109448039A (en) * 2018-10-22 2019-03-08 浙江科技学院 A kind of monocular depth estimation method based on depth convolutional neural networks
CN109670515A (en) * 2018-12-13 2019-04-23 南京工业大学 A kind of detection method and system changed for building in unmanned plane image
CN110136170A (en) * 2019-05-13 2019-08-16 武汉大学 A kind of remote sensing image building change detecting method based on convolutional neural networks

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
US9904867B2 (en) * 2016-01-29 2018-02-27 Pointivo, Inc. Systems and methods for extracting information about objects from scene information
US10127670B2 (en) * 2016-09-27 2018-11-13 Xactware Solutions, Inc. Computer vision systems and methods for detecting and modeling features of structures in images
GB2562813B (en) * 2017-10-16 2019-04-24 Matthew Russell Iain Detecting and identifying unmanned aerial vehicles


Non-Patent Citations (4)

Title
Research on classification of airborne high-resolution SAR images based on BP neural network; Duan Lianfei et al.; Bulletin of Surveying and Mapping; 2009-02-25 (No. 02); 14-17+27 *
Automatic extraction of building roofs based on fully convolutional neural networks; Liu Wentao et al.; Journal of Geo-Information Science; 2018-11-28 (No. 11); 1562-1570 *
Remote sensing image detection of rural buildings based on deep learning; Chen Wenkang; Surveying and Mapping; 2016-10-15 (No. 05); 227-230 *
Research on building edge feature extraction with fuzzy support vector machines; An Wen et al.; Computer Engineering and Design; 2011-02-16 (No. 02); 649-652 *

Also Published As

Publication number Publication date
CN110543872A (en) 2019-12-06

Similar Documents

Publication Publication Date Title
CN110543872B (en) Unmanned aerial vehicle image building roof extraction method based on full convolution neural network
Xue et al. From LiDAR point cloud towards digital twin city: Clustering city objects based on Gestalt principles
CN106778605B (en) Automatic remote sensing image road network extraction method under assistance of navigation data
CN111259906B (en) Method for generating remote sensing image target segmentation countermeasures under condition containing multilevel channel attention
CN111539942B (en) Method for detecting face depth tampered image based on multi-scale depth feature fusion
CN112132006B (en) Intelligent forest land and building extraction method for cultivated land protection
CN111160127B (en) Remote sensing image processing and detecting method based on deep convolutional neural network model
CN109871875B (en) Building change detection method based on deep learning
CN111079847B (en) Remote sensing image automatic labeling method based on deep learning
CN112347970B (en) Remote sensing image ground object identification method based on graph convolution neural network
CN111862119A (en) Semantic information extraction method based on Mask-RCNN
CN111291675B (en) Deep learning-based hyperspectral ancient painting detection and identification method
CN114283162A (en) Real scene image segmentation method based on contrast self-supervision learning
CN112381730B (en) Remote sensing image data amplification method
CN116910571B (en) Open-domain adaptation method and system based on prototype comparison learning
CN116630637A (en) optical-SAR image joint interpretation method based on multi-modal contrast learning
CN116612382A (en) Urban remote sensing image target detection method and device
CN115830322A (en) Building semantic segmentation label expansion method based on weak supervision network
CN116246161A (en) Method and device for identifying target fine type of remote sensing image under guidance of domain knowledge
CN114359493A (en) Method and system for generating three-dimensional semantic map for unmanned ship
Lguensat et al. Convolutional neural networks for the segmentation of oceanic eddies from altimetric maps
Aziz et al. A systematic review of image-based technologies for detecting as-is BIM objects
Kiani et al. Design and implementation of an expert interpreter system for intelligent acquisition of spatial data from aerial or remotely sensed images
Widyaningrum et al. Tailored features for semantic segmentation with a DGCNN using free training samples of a colored airborne point cloud
Wang et al. Comprehensive mining of information in Weakly Supervised Semantic Segmentation: Saliency semantics and edge semantics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant