CN110415280B

CN110415280B - Remote sensing image and building vector registration method and system under multitask CNN model

Info

Publication number: CN110415280B
Application number: CN201910371472.0A
Authority: CN
Inventors: 陈奇; 王磊
Original assignee: China University of Geosciences
Current assignee: China University of Geosciences
Priority date: 2019-05-06
Filing date: 2019-05-06
Publication date: 2021-07-13
Anticipated expiration: 2039-05-06
Also published as: CN110415280A

Abstract

The invention discloses a method and a system for registering a remote sensing image and a building vector based on a multitask CNN model. First, a fully convolutional network model is trained using registered building vectors as reference samples, and the model is then used to generate, from a high-resolution remote sensing image, a feature map layer suited to building identification. Second, a multitask CNN model is designed; the feature map layer and the rasterized building vector are stacked and input into the model, which outputs the false alarm probability and the geometric correction parameters of the current building vector after several convolution, pooling and fully connected operations. Finally, reference values for the outputs of the multitask CNN model are calculated from the building vectors before and after correction, and model training is completed; adaptive registration of the building vector and the remote sensing image is then achieved with the trained multitask CNN model. While building vectors are automatically screened and corrected, the method improves the accuracy of the original data while retaining the effective information of the building vectors.

Description

Remote sensing image and building vector registration method and system under multitask CNN model

Technical Field

The invention relates to the field of surveying and mapping science and technology, in particular to a method and a system for carrying out adaptive registration on a remote sensing image and a building vector by utilizing a multitask CNN model.

Background

Accurate building vector outlines obtained from high-resolution remote sensing images provide an important basis for application fields such as city planning, land survey, illegal building detection and military reconnaissance. Because building vector data in historical archives has generally been checked manually and has reliable vector structure and edge detail, registering existing building vector data with a remote sensing image is a more economical and reliable way to acquire information than extracting buildings directly from the image. However, because buildings are demolished or damaged, and because mapping data from different periods differ in resolution, imaging angle and positioning accuracy, accurately registering a remote sensing image with heterogeneous building vector data requires deleting false alarm vectors and resolving the non-uniform offset and deformation of the vectors relative to the image. High-precision automatic registration of remote sensing images and building vectors is therefore of great significance for improving the accuracy, quality and application value of historical vector data.

Existing techniques for registering remote sensing images and building vectors fall mainly into rule-based refinement methods and vector optimization methods based on active contour models. The former generally extract line features from the image first, design rules to screen and group the line features, and then automatically edit and replace the original vector; however, weak and false features that are hard to identify in the image, together with overly strict assumptions, limit the practical range of the technique. The latter drive the original vector to converge toward the building edge while keeping it continuous and smooth by optimizing the parameters of an energy function containing various constraints; however, the method does not consider the structure and deformation characteristics of existing building vector data, and applying it directly can over-correct the original building vector. In addition, both techniques are essentially built on hand-crafted low-level image features and have difficulty adapting to the complex and diverse building structures and textures of different regions and to the variations in illumination, resolution and imaging quality across data, so their application scenarios and generalization capability are both limited to a large extent.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a method and a system for registering a remote sensing image and a building vector under a multitask CNN model, addressing the defect that the application scenarios and generalization capability of the prior art are limited to a large extent.

The technical scheme adopted by the invention for solving the technical problems is as follows: a method for registering a remote sensing image and a building vector under a multitask CNN model is constructed, and specifically comprises the following steps:

S1, preparing data and constructing a training data set; the training data set comprises a plurality of high-resolution remote sensing images; for each high-resolution remote sensing image, the training data set further comprises an original building vector that is not registered with the high-resolution remote sensing image, and a registered building vector obtained by registering and correcting the original building vector against the high-resolution remote sensing image;

S2, training a fully convolutional network model for remote sensing image building detection by using the registered building vectors acquired from the training data set;

S3, traversing each original building vector in the training data set, and during the traversal, training the multitask CNN model with back propagation and a stochastic gradient descent algorithm, using the feature map layer generated by the fully convolutional network as auxiliary information;

S4, inputting the remote sensing image data into the multitask CNN model trained in step S3, and automatically registering the remote sensing image and the building vector after several convolution, pooling and fully connected operations in the model.

Further, the input items of the multitask CNN model include: a binary image formed by rasterizing an original building vector, and a feature map layer obtained by inputting the high-resolution remote sensing image having the same image range as the binary image into the fully convolutional network model trained in step S2; the feature map layer has the same geographical range as that high-resolution remote sensing image, and the binary image and the feature map layer each serve as an input item of the multitask CNN model;

the output items of the multitask CNN model comprise two branches: the false alarm probability of the building vector and the geometric correction parameters;

calculating a loss value Loss of the multitask CNN model by using a loss function, wherein the specific mathematical formula is as follows:

wherein i represents the sample number; N represents the total number of samples in the training batch; p_i represents the predicted value of the false alarm probability of the current sample, and p_i* represents the reference value of the false alarm probability of the current sample; m_i represents the predicted value of the geometric correction parameters for the original building vector, and m_i* represents the geometric correction reference value from the original building vector to the registered building vector in the non-false-alarm case, which is obtained in step S3, when the original building vectors in the training data set are traversed, by comparing each original building vector with its corresponding registered building vector; L_cls represents the cross-entropy loss of the false alarm probability of the current sample, and L_reg represents the loss value of the geometric correction parameters, calculated with a mean square error loss function.
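The formula itself is not reproduced in this text. One form that is consistent with the symbol definitions above is sketched below as an assumption, not as the claimed formula. The starred symbols (reference values), the names L_cls and L_reg, the factor (1 - p_i*) that switches off the geometric term for false alarm samples, the number K of geometric correction parameters, and the equal weighting of the two terms are all illustrative choices.

```latex
\begin{align}
\mathrm{Loss} &= \frac{1}{N}\sum_{i=1}^{N}\Big[\, L_{cls}(p_i, p_i^{*}) + (1 - p_i^{*})\, L_{reg}(m_i, m_i^{*}) \Big] \\
L_{cls}(p_i, p_i^{*}) &= -\big[\, p_i^{*}\log p_i + (1 - p_i^{*})\log(1 - p_i) \big] \\
L_{reg}(m_i, m_i^{*}) &= \frac{1}{K}\sum_{k=1}^{K}\big(m_{i,k} - m_{i,k}^{*}\big)^{2}
\end{align}
```

With p_i* = 1 for a false alarm vector, the geometric term vanishes for that sample, which matches the statement that its geometric correction reference is null.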

Further, a gradient descent algorithm is applied to the Loss function, when the Loss value Loss approaches to X, the multi-task CNN model is trained completely and is applied to the subsequent execution steps; wherein X is more than or equal to 0.

Further, automatically registering the remote sensing image and the original building vectors through the multitask CNN model is a process of correcting each building vector to be registered that is input into the model: when the false alarm probability output by the multitask CNN model is greater than a preset first threshold, the currently input building vector is deleted; otherwise, the input building vector is corrected according to the geometric correction parameters output by the multitask CNN model, so that it is registered with the corresponding high-resolution remote sensing image.

Further, during training of the multitask CNN model, for a building vector input into the model that was deleted in the correction process, the reference values of the corresponding model output items are set as: false alarm probability 1, geometric correction parameters null; for a building vector input into the model that was retained in the correction process, the reference values are set as: false alarm probability 0, with the geometric correction parameters obtained by least squares estimation from the coordinates of corresponding (homologous) points of the building vector before and after correction.

Further, in order to avoid damaging the original information of a building vector through over-correction, the intersection-over-union (IoU) between the building vector before and after correction is calculated during the correction process, and if the IoU is smaller than a preset second threshold, the correction result is not adopted.
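As a non-limiting illustration of the least squares estimation mentioned above, the following sketch fits six affine parameters to pairs of corresponding points. It assumes an affine correction model (the detailed description uses affine transformation as its example), assumes that the corresponding points have already been paired, and uses hypothetical names (`fit_affine`, `solve3`) that do not appear in the source.

```python
def solve3(a, b):
    """Solve a 3x3 linear system a @ x = b by Gauss-Jordan elimination."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point set: affine parameters not determined")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_affine(src, dst):
    """Least-squares affine parameters (a, b, c, d, e, f) such that
    x' = a*x + b*y + c and y' = d*x + e*y + f map src points onto dst points."""
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least 3 paired points")
    # Both output coordinates share the same normal matrix A^T A.
    ata = [[0.0] * 3 for _ in range(3)]
    atx = [0.0] * 3
    aty = [0.0] * 3
    for (x, y), (u, v) in zip(src, dst):
        row = (x, y, 1.0)
        for i in range(3):
            atx[i] += row[i] * u
            aty[i] += row[i] * v
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return tuple(solve3(ata, atx) + solve3(ata, aty))


# Original vector nodes and the same nodes after manual correction
# (here a pure shift of +2 in x and -3 in y).
original = [(0, 0), (10, 0), (10, 5), (0, 5)]
registered = [(2, -3), (12, -3), (12, 2), (2, 2)]
params = fit_affine(original, registered)
```

When manual correction adds or deletes nodes, the pairing of corresponding points is no longer one-to-one by index; how the pairs are chosen in that case is not specified here and is left open in this sketch.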

The invention provides a remote sensing image and building vector registration system under a multitask CNN model, which specifically comprises the following modules:

the data construction module is used for preparing data and constructing a training data set; the training data set comprises a plurality of high-resolution remote sensing images; for each high-resolution remote sensing image, the training data set further comprises an original building vector that is not registered with the high-resolution remote sensing image, and a registered building vector obtained by registering and correcting the original building vector against the high-resolution remote sensing image;

the fully convolutional network training module is used for training the fully convolutional network model for high-resolution remote sensing image building detection by using the registered building vectors acquired from the training data set;

the multitask CNN model training module is used for traversing each original building vector in the training data set and, during the traversal, training the multitask CNN model with back propagation and a stochastic gradient descent algorithm; a binary image formed by rasterizing an original building vector, and a feature map layer obtained by inputting the high-resolution remote sensing image having the same image range as the binary image into the trained fully convolutional network model, the feature map layer having the same geographical range as that image, each serve as an input item of the multitask CNN model;

the remote sensing image registration module is used for inputting remote sensing image data into the trained multitask CNN model and automatically registering the remote sensing image and the building vector after several convolution, pooling and fully connected operations in the model.

Furthermore, the remote sensing image and building vector registration system provided by the invention automatically registers the remote sensing image and the building vector by using any one of the remote sensing image and building vector registration methods.

In the method and the system for registering the remote sensing image and the building vector under the multitask CNN model, the feature map layer generated by the fully convolutional network is used as auxiliary information, a multitask CNN learning framework is constructed, and the building vectors before and after manual correction are used as learning samples to train the multitask CNN model. The trained model takes the feature map layer and an original building vector as input and outputs the false alarm probability and the geometric correction parameters of the vector, thereby completing the screening and correction of the original vector and achieving adaptive registration of the remote sensing image and the building vector.

The implementation of the method and the system for registering the remote sensing image and the building vector under the multitask CNN model has the following beneficial effects:

1. the CNN model is utilized to generate high-dimensional features from the remote sensing image in a self-adaptive manner for estimating geometric correction parameters of the building vector, and the method has stronger data adaptability and generalization capability compared with the traditional method based on manual design rules and features;

2. the invention can carry out registration processing on the original building vectors with different distortion degrees by pertinently setting the geometric correction model, and is beneficial to improving the precision level and the application value of the original vectors on the basis of keeping the effective structure information of the original vectors.

Drawings

The invention will be further described with reference to the accompanying drawings and examples, in which:

FIG. 1 is a flow chart of a method for building detection by high-resolution remote sensing images in embodiment 1 of the present invention;

FIG. 2 is a system configuration diagram of high-resolution remote sensing image building detection in embodiment 2 of the present invention;

FIG. 3 is a schematic diagram of the fully convolutional network structure for building detection from high-resolution remote sensing images;

FIG. 4 is a structural diagram of the multitask CNN model for registering a remote sensing image and a building vector.

Detailed Description

For a more clear understanding of the technical features, objects and effects of the present invention, embodiments of the present invention will now be described in detail with reference to the accompanying drawings.

Example 1:

referring to fig. 1, which is a flowchart of a method for detecting a high-resolution remote sensing image building in embodiment 1 of the present invention, the steps of the method for detecting a high-resolution remote sensing image building disclosed in the present invention include:

S1, preparing data and constructing a training data set; the training data set comprises a plurality of high-resolution remote sensing images; for each high-resolution remote sensing image, the training data set further comprises an original building vector that is not registered with the high-resolution remote sensing image, and a registered building vector obtained by registering and correcting the original building vector against the high-resolution remote sensing image;

specifically, in the embodiment, manual registration correction of an original building vector includes moving the overall position of the vector outline and adding, deleting and moving vector nodes, so that the resulting registered building vector coincides accurately with the building outline in the remote sensing image; the proportion of the training data set in all the data is determined according to the total amount of data to be processed and is generally 10%-30% of the total; the registered building vectors can be stored in a sample library together with the original building vectors and the high-resolution remote sensing images, and the data in the sample library can be used as training data for new registration tasks.

S2, training a fully convolutional network model for high-resolution remote sensing image building detection by using the registered building vectors acquired from the training data set; specifically, each registered building vector is rasterized into a binary image, the binary image is used as the reference true value, and the fully convolutional network model is trained with back propagation and a stochastic gradient descent algorithm, the loss of the model being estimated with a cross-entropy function.

In the embodiment, the fully convolutional network model adopted for high-resolution remote sensing image building detection has a symmetrical double-pyramid structure. In the model, the input image first passes through a classical CNN structure comprising several convolution layers and pooling layers to obtain a feature map with lower resolution; then, through a series of upsampling operations, feature maps corresponding to each level of the CNN structure are generated step by step, and finally the terminal feature map layer is obtained through one more upsampling operation so as to output the segmentation result. While the low-resolution feature map is upsampled, its dimensionality is reduced with a 1 × 1 convolution kernel, and before the next upsampling the dimension-reduced features are fused with the feature map of the corresponding level in the CNN structure by element-wise addition. The model is trained mainly by optimizing the following cross-entropy loss function with back propagation and a mini-batch stochastic gradient descent algorithm:

wherein y_i is the predicted probability value output by the network for the i-th sample (pixel), and y_i* is the corresponding true value (1 for positive samples and 0 for negative samples); m is the total number of training samples in the batch.
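The cross-entropy formula is not reproduced in this text. As a non-limiting sketch, the standard mean binary cross-entropy that matches the definitions above (a predicted probability and a 0/1 true value per pixel, averaged over the m samples of a batch) can be computed as follows. The function name and the clamping constant `eps` are assumptions added for illustration and numerical safety.

```python
import math


def binary_cross_entropy(pred, truth, eps=1e-12):
    """Mean binary cross-entropy over a batch of pixel samples.

    pred  : predicted building probabilities, one per pixel
    truth : reference values, 1 for building pixels and 0 otherwise
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and of equal length")
    total = 0.0
    for y, t in zip(pred, truth):
        # Clamp to avoid log(0) when the network output saturates.
        y = min(max(y, eps), 1.0 - eps)
        total += -(t * math.log(y) + (1 - t) * math.log(1.0 - y))
    return total / len(pred)


# An uninformative prediction of 0.5 on every pixel gives a loss of ln 2.
loss_uninformative = binary_cross_entropy([0.5, 0.5], [1, 0])
```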

Referring to FIG. 3, which is a schematic structural diagram of the fully convolutional model according to an embodiment of the present invention: as shown in FIG. 3, the model includes 3 stages of convolution and pooling operations and 3 upsampling operations symmetrical to them; the feature maps on the left and right sides are fused through lateral connections; finally, the fused features undergo one upsampling operation to obtain the final feature map layer, which is used to generate the segmentation result and, together with the reference true value, to establish the cross-entropy loss function.

S3, acquiring an original building vector from the training data set and rasterizing it into a binary image, wherein the image range of the binary image is determined by the circumscribed rectangle of the original building vector and its corresponding registered building vector. In a specific implementation, for the original building vector of a given building, the corresponding registered building vector is first found in the training data set, the circumscribed rectangle of the two vectors is then taken, and a small buffer (10-30 pixels) is added to determine the size of the rasterized image. If the original building vector was manually judged to be a false alarm and deleted, the size of the rasterized image is determined from its own circumscribed rectangle alone, with the same buffer applied.

Each original building vector taken from the training data set is traversed, and during the traversal the multitask CNN model is trained with back propagation and a stochastic gradient descent algorithm;

specifically, in the current training process, the input of the multitask CNN model includes the binary image generated by rasterizing the original building vector and the feature map layer obtained by inputting the remote sensing image of the same range as the binary image into the fully convolutional network model; the output of the model comprises two branches, namely the false alarm probability of the vector and the geometric correction parameters; the loss of the model is obtained by comparing the two output values with the false alarm probability reference value and the geometric correction parameter reference value generated by comparing the original building vector with the registered building vector.

Specifically, please refer to FIG. 4, which is a structural diagram of the multitask CNN model for registering a remote sensing image and a building vector according to an embodiment of the present invention. As shown in the figure, in this embodiment, taking affine transformation as an example, the input data (the feature map layer and the rasterized original building vector) passes through several convolution and pooling operations, after which one fully connected layer generates a multidimensional vector; two parallel fully connected operations on this vector then produce the false alarm probability (calculated with a Sigmoid activation function) and the affine parameters, that is, the two output branches of the multitask CNN model. Combining the reference values of the false alarm probability and of the geometric correction parameters, a loss function can be constructed according to the following formula:
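As a non-limiting illustration of the rasterization described above, the following sketch computes the circumscribed rectangle of one or two vectors, enlarges it by a buffer, and fills a binary image by testing pixel centers against the polygon. It assumes that node coordinates are already expressed in pixel units and that a vector is a simple polygon given as a list of (x, y) nodes; the function names are hypothetical.

```python
def bounding_rect(*polygons, buffer=10):
    """Circumscribed rectangle (x0, y0, x1, y1) of all polygons, grown by `buffer` pixels."""
    xs = [x for poly in polygons for x, _ in poly]
    ys = [y for poly in polygons for _, y in poly]
    return (min(xs) - buffer, min(ys) - buffer, max(xs) + buffer, max(ys) + buffer)


def point_in_polygon(px, py, poly):
    """Even-odd ray casting test for a point against a simple polygon."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            # x coordinate where this edge crosses the horizontal line through py
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def rasterize(poly, rect):
    """Binary image (list of rows) with 1 where a pixel center lies inside poly."""
    x0, y0, x1, y1 = rect
    width, height = int(x1 - x0), int(y1 - y0)
    return [[1 if point_in_polygon(x0 + c + 0.5, y0 + r + 0.5, poly) else 0
             for c in range(width)]
            for r in range(height)]


original = [(0, 0), (4, 0), (4, 3), (0, 3)]
registered = [(2, 1), (6, 1), (6, 4), (2, 4)]

# Retained vector: the extent covers both the original and the registered vector.
rect = bounding_rect(original, registered, buffer=10)
mask = rasterize(original, rect)

# Vector deleted as a false alarm: only its own rectangle is used.
rect_false_alarm = bounding_rect(original, buffer=10)
```

Taking the rectangle over both vectors keeps the true building position inside the image patch even when the original vector is offset from it.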

wherein i represents the sample number; N represents the total number of samples in the training batch; p_i represents the predicted value of the false alarm probability of the current sample, and p_i* represents the reference value of the false alarm probability of the current sample (1 for a false alarm vector, otherwise 0); m_i represents the predicted value of the geometric correction parameters for the original building vector, and m_i* represents the geometric correction reference value from the original building vector to the registered building vector in the non-false-alarm case; L_cls represents the cross-entropy loss of the false alarm probability of the current sample, and L_reg represents the loss value of the geometric correction parameters, calculated with a mean square error loss function. The loss function can be optimized with back propagation and a mini-batch stochastic gradient descent algorithm.
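As a non-limiting sketch of how the two-branch loss described above could be evaluated on a batch, the following code sums a cross-entropy term on the false alarm probability and a mean square error term on the correction parameters. The equal weighting of the two terms, the skipping of the geometric term for false alarm samples (whose reference parameters are null), and all names are assumptions made for illustration; the source formula is not reproduced in this text.

```python
import math


def multitask_loss(p_pred, p_ref, m_pred, m_ref, eps=1e-12):
    """Batch-averaged loss: cross-entropy on false alarm probability plus
    mean square error on correction parameters for non-false-alarm samples.

    p_pred : predicted false alarm probabilities
    p_ref  : reference values, 1 for false alarm vectors, otherwise 0
    m_pred : predicted parameter lists (e.g. 6 affine parameters each)
    m_ref  : reference parameter lists, or None for false alarm samples
    """
    n = len(p_pred)
    total = 0.0
    for p, p_star, m, m_star in zip(p_pred, p_ref, m_pred, m_ref):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        cls = -(p_star * math.log(p) + (1 - p_star) * math.log(1.0 - p))
        reg = 0.0
        if p_star == 0:
            # Geometric term only for retained (non-false-alarm) vectors.
            reg = sum((a - b) ** 2 for a, b in zip(m, m_star)) / len(m)
        total += cls + reg
    return total / n


identity = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
shifted = [1.0, 0.0, 6.0, 0.0, 1.0, 0.0]

# Sample 1 is retained with a perfect parameter prediction; sample 2 is a false alarm.
loss_a = multitask_loss([0.5, 0.5], [0, 1], [identity, [9.0] * 6], [identity, None])
# Same batch, but sample 1 mispredicts one parameter by 6 (MSE = 36 / 6 = 6).
loss_b = multitask_loss([0.5, 0.5], [0, 1], [shifted, [9.0] * 6], [identity, None])
```

In the second call only the retained sample contributes to the geometric term, so the arbitrary parameters predicted for the false alarm sample do not affect the loss.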

S4, in order to further determine the registration accuracy of the trained multitask CNN model, in this embodiment a test data set is constructed during data preparation; the test data set comprises a plurality of high-resolution remote sensing images; for each high-resolution remote sensing image, the test data set further comprises a test building vector that is not registered with it;

before the remote sensing image is input into the multitask CNN model for automatic registration with the building vectors, each test building vector in the test data set is traversed; when a test building vector is traversed, a buffer is set and the test building vector is rasterized into a binary image, a remote sensing image feature map layer at the corresponding position is generated with the fully convolutional network model, the binary image and the feature map layer are stacked and input into the multitask CNN model, and the predicted values of the false alarm probability and the geometric correction parameters are output; the traversal is repeated until all test building vectors in the test data set have been processed, and the network parameters of the multitask CNN model are then adjusted according to the output false alarm probabilities and predicted geometric correction parameters, so that the remote sensing image can be accurately registered with the building vectors.

Specifically, in the implementation process, the rasterization buffer can be set to the maximum offset between original and registered building vectors in the training data set plus 10-30 pixels, and the remote sensing image is cropped to the same range as the rasterized binary image to generate the feature map layer. Both the binary image and the feature map layer need to be further resampled to a given image size to meet the input requirements of the multitask CNN model.

Automatically registering the remote sensing image and the building vectors through the multitask CNN model is a process of correcting each building vector input into the model: when the false alarm probability output by the multitask CNN model is greater than a preset threshold, the currently input building vector is deleted; otherwise, the building vector is corrected according to the geometric correction parameters output by the multitask CNN model, so that it is registered with the corresponding high-resolution remote sensing image;

specifically, the threshold can be set between 0.5 and 0.8 according to the requirements of the application; for non-false-alarm vectors, once the geometric correction parameters of a vector are obtained, the coordinates of each node of the vector are transformed according to the correction parameters to obtain the corrected vector.

As a preferred embodiment, in order to avoid damaging the original information of a building vector through over-correction, the intersection-over-union (IoU) between the building vector before and after correction is calculated during the correction process, and if the IoU is smaller than a preset index threshold, the correction result is not adopted. The index threshold may be set statistically from the training data set: the IoU between every original building vector and its registered building vector in the training data set is calculated, and the mean and the mean error of the IoU over all samples are then computed. Specifically, in this embodiment, when building vectors in the test set are processed, if the IoU between the original building vector and the registered (corrected) building vector is smaller than the mean IoU of the training data set minus 3 times the mean error, the correction result is not adopted.

The method uses the CNN model to adaptively generate high-dimensional features from the remote sensing image for estimating the geometric correction parameters of the building vector, and has stronger data adaptability and generalization capability than traditional methods based on hand-crafted rules and features; furthermore, by setting the geometric correction model in a targeted manner, original building vectors with different degrees of distortion can be registered, which improves the accuracy and application value of the original building vectors while retaining their effective structural information.
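As a non-limiting illustration of the screening and correction rule above, the following sketch deletes a vector whose false alarm probability exceeds the threshold and otherwise transforms each node with six affine parameters. The affine parameter order (a, b, c, d, e, f), with x' = a*x + b*y + c and y' = d*x + e*y + f, and the function names are assumptions made for illustration.

```python
def apply_affine(params, nodes):
    """Transform each (x, y) node with x' = a*x + b*y + c, y' = d*x + e*y + f."""
    a, b, c, d, e, f = params
    return [(a * x + b * y + c, d * x + e * y + f) for x, y in nodes]


def screen_and_correct(nodes, false_alarm_prob, params, threshold=0.5):
    """Return None when the vector is deleted as a false alarm,
    otherwise the node list after geometric correction."""
    if false_alarm_prob > threshold:
        return None
    return apply_affine(params, nodes)


nodes = [(0, 0), (4, 0), (4, 3), (0, 3)]
shift = (1, 0, 2, 0, 1, -3)  # pure translation by (+2, -3)

deleted = screen_and_correct(nodes, 0.9, shift)    # above threshold: vector removed
corrected = screen_and_correct(nodes, 0.1, shift)  # below threshold: nodes shifted
```

Because the same parameters are applied to every node, the topology of the vector (node count and order) is preserved by the correction.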
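As a non-limiting sketch of the over-correction check, the following code derives an index threshold from training-set statistics and compares the overlap of two rasterized vectors against it. It assumes that the overlap is measured on binary masks of equal size, that the threshold is the mean minus k times the spread, and that the population standard deviation stands in for the "mean error"; these choices and all names are illustrative.

```python
import statistics


def mask_iou(mask_a, mask_b):
    """Intersection over union of two equally sized binary masks."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return inter / union if union else 0.0


def iou_threshold(training_ious, k=3.0):
    """Lower bound on acceptable overlap: mean minus k times the spread
    of the overlaps observed between original and registered training vectors."""
    mean = statistics.mean(training_ious)
    spread = statistics.pstdev(training_ious)  # stand-in for the mean error
    return mean - k * spread


def accept_correction(iou, threshold):
    """A correction is kept only if it does not fall below the threshold."""
    return iou >= threshold


overlap = mask_iou([[1, 1, 0]], [[0, 1, 1]])  # 1 shared pixel out of 3 covered
threshold = iou_threshold([0.6, 0.8, 1.0])
keep = accept_correction(overlap, threshold)
```

A correction that moves the vector further than anything seen in the manually corrected training samples falls below the threshold and is discarded, leaving the original vector unchanged.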

Example 2:

Please refer to FIG. 2, which is a system structure diagram of high-resolution remote sensing image building detection. The remote sensing image and building vector registration system under the multitask CNN model disclosed by the present invention includes a data construction module L1, a fully convolutional network training module L2, a multitask CNN model training module L3 and a remote sensing image registration module L4;

the data construction module L1, the fully convolutional network training module L2, the multitask CNN model training module L3 and the remote sensing image registration module L4 are connected in sequence to form a complete high-resolution remote sensing image building detection system, wherein the function of each module and the cooperation between the modules are as follows:

the data construction module L1 is used for preparing data and constructing a training data set; the training data set comprises a plurality of high-resolution remote sensing images; for each high-resolution remote sensing image, the training data set further comprises an original building vector that is not registered with the high-resolution remote sensing image, and a registered building vector obtained by registering and correcting the original building vector against the high-resolution remote sensing image;

the fully convolutional network training module L2 is used for acquiring the registered building vectors from the training data set constructed by the data construction module L1 and training the fully convolutional network model for high-resolution remote sensing image building detection;

the multitask CNN model training module L3 is used for traversing each original building vector in the training data set constructed by the data construction module L1 and, during the traversal, training the multitask CNN model with back propagation and a stochastic gradient descent algorithm; a binary image formed by rasterizing an original building vector, and a feature map layer obtained by inputting the high-resolution remote sensing image having the same image range as the binary image into the trained fully convolutional network model, the feature map layer having the same geographical range as that image, each serve as an input item of the multitask CNN model;

the remote sensing image registration module L4 is used for inputting remote sensing image data into the multitask CNN model trained by the multitask CNN model training module L3, and automatically registering the remote sensing image and the building vector after several convolution, pooling and fully connected operations in the model.

While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A remote sensing image and building vector registration method under a multitask CNN model specifically comprises the following steps:

s1, preparing data and constructing a training data set; the training data set comprises a plurality of high-resolution remote sensing images; aiming at each high-resolution remote sensing image, the training data set further comprises an original building vector which is not registered with the high-resolution remote sensing image, and a registered building vector which is obtained by registering and correcting the original building vector and the high-resolution remote sensing image;

s4, inputting the remote sensing image data into the multitask CNN model trained in the step S3, and automatically registering the remote sensing image and the building vector after a plurality of convolution, pooling and full connection operations in the model;

the input items of the multitask CNN model comprise: a binary image formed by rasterizing an original building vector; and a feature map layer with the same geographical range as the high-resolution remote sensing image, obtained by inputting the high-resolution remote sensing image with the same image range as the binary image into the full convolution network model trained in step S2;

the output items of the multitask CNN model comprise two branches: a false alarm probability of the building vector, and geometric correction parameters;

representing the geometric correction reference from the original building vector to the registered building vector in the non-false-alarm case; specifically, in step S3, when the original building vectors in the training data set are traversed, each original building vector is compared with its corresponding registered building vector;

2. The method of claim 1, wherein a gradient descent algorithm is applied to the loss function, and when the loss value Loss approaches X, training of the multitask CNN model is complete and the model is applied to the subsequent steps; wherein X ≥ 0.

3. The method for registering remote sensing images and building vectors as claimed in claim 1, wherein automatically registering the remote sensing image and the original building vectors with the multitask CNN model means correcting each building vector to be registered: when the false alarm probability output by the multitask CNN model is greater than a preset first threshold, the currently input building vector is deleted; otherwise, the input building vector is corrected according to the geometric correction parameters output by the multitask CNN model, so that the input building vector is registered with the corresponding high-resolution remote sensing image.
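The screening and correction rule of claim 3 can be sketched as follows. The sketch assumes, purely for illustration, that the geometric correction parameters are the six coefficients of a 2-D affine transformation and that the first threshold is 0.5; neither the parameterization, the threshold value, nor the function name `register_vector` is stated in the claim.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def register_vector(
    polygon: Sequence[Point],
    false_alarm_prob: float,
    params: Sequence[float],
    first_threshold: float = 0.5,  # hypothetical value for the preset first threshold
) -> Optional[List[Point]]:
    """Delete or correct one building vector from the two model outputs.

    params is assumed to be (a, b, c, d, e, f) of an affine transformation:
        x' = a*x + b*y + c
        y' = d*x + e*y + f
    Returns None when the vector is deleted as a false alarm, otherwise the
    corrected vertex list.
    """
    if false_alarm_prob > first_threshold:
        return None  # false alarm: the building vector is deleted

    a, b, c, d, e, f = params
    return [(a * x + b * y + c, d * x + e * y + f) for x, y in polygon]
```

For example, calling `register_vector(square, 0.1, (1, 0, 2, 0, 1, -1))` shifts every vertex of `square` by 2 units in x and -1 unit in y, while a false alarm probability of 0.9 returns `None`.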

4. The method as claimed in claim 3, wherein, during training of the multitask CNN model, for a building vector that is input into the multitask CNN model and deleted in the correction process, the reference values of the corresponding model output items are set as follows: the false alarm probability is 1, and the geometric correction parameters are null; for a building vector that is input into the multitask CNN model and retained in the correction process, the reference values of the corresponding model output items are set as follows: the false alarm probability is 0, and the geometric correction parameters are obtained by least-squares estimation from the coordinates of corresponding (homologous) points of the building vector before and after correction.
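The reference values of claim 4 can be sketched as follows. As an assumption made only for this example, the geometric correction is modeled as a 2-D affine transformation with six parameters estimated by `numpy.linalg.lstsq`; the claim specifies least-squares estimation from corresponding points but not the transformation model. The function names are hypothetical.

```python
import numpy as np


def estimate_affine(src, dst) -> np.ndarray:
    """Least-squares affine parameters (a, b, c, d, e, f) mapping src to dst.

    src, dst: (n, 2) arrays of corresponding vertex coordinates of the
    building vector before and after correction, with n >= 3.
    Model (assumed): x' = a*x + b*y + c,  y' = d*x + e*y + f.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2:
        raise ValueError("src and dst must both have shape (n, 2)")
    if len(src) < 3:
        raise ValueError("at least three corresponding points are required")

    # Design matrix [x, y, 1]; solve A @ coeffs = dst in the least-squares sense.
    A = np.hstack([src, np.ones((len(src), 1))])
    coeffs, *_ = np.linalg.lstsq(A, dst, rcond=None)  # shape (3, 2)
    return coeffs.T.ravel()  # ordered as a, b, c, d, e, f


def reference_values(original, registered):
    """Return (false alarm probability, correction parameters) for training.

    registered is None when the vector was deleted during correction.
    """
    if registered is None:
        return 1.0, None  # deleted vector: probability 1, parameters null
    return 0.0, estimate_affine(original, registered)
```

With more than three corresponding points the system is overdetermined, so the estimate averages out small vertex digitizing errors rather than fitting any single point exactly.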

5. The method for registering remote sensing images and building vectors as claimed in claim 4, wherein, in order to avoid over-correction that would damage the original information of the building vector, an intersection-over-union (IoU) index between the building vector before and after correction is calculated during the correction process, and if the IoU index is smaller than a preset second threshold, the correction result is not adopted.
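The over-correction check of claim 5 can be sketched as follows, computing the overlap index (intersection divided by union) between the building footprint before and after correction. The sketch assumes both footprints have already been rasterized onto the same pixel grid; the function names and the default threshold of 0.5 are hypothetical, since the claim does not give a value for the second threshold.

```python
import numpy as np


def mask_iou(before, after) -> float:
    """Intersection over union of two binary footprints on the same grid."""
    before = np.asarray(before, dtype=bool)
    after = np.asarray(after, dtype=bool)
    if before.shape != after.shape:
        raise ValueError("both footprints must share the same pixel grid")

    union = np.logical_or(before, after).sum()
    if union == 0:
        return 0.0  # two empty footprints: treated as no overlap
    return float(np.logical_and(before, after).sum() / union)


def accept_correction(before, after, second_threshold: float = 0.5) -> bool:
    """Adopt the correction only if the overlap is not below the threshold."""
    return mask_iou(before, after) >= second_threshold
```

A low overlap means the corrected vector has moved or deformed far from the original, so rejecting it keeps the original geometry instead of a possibly over-corrected one.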

6. A remote sensing image and building vector registration system under a multitask CNN model specifically comprises the following modules:

the data construction module is used for preparing data and constructing a training data set; the training data set comprises a plurality of high-resolution remote sensing images; for each high-resolution remote sensing image, the training data set further comprises an original building vector which is not registered with the high-resolution remote sensing image, and a registered building vector obtained by registering and correcting the original building vector against the high-resolution remote sensing image;

the multitask CNN model training module is used for traversing each original building vector in the training data set and, during the traversal, training the multitask CNN model by back propagation and stochastic gradient descent; a binary image is formed by rasterizing the original building vector, and a high-resolution remote sensing image with the same image range as the binary image is input into the trained full convolution network model to obtain a feature map layer with the same geographical range as the high-resolution remote sensing image, wherein the binary image and the feature map layer are respectively used as input items of the multitask CNN model;

7. The system for registering the remote sensing image and the building vector as claimed in claim 6, wherein the remote sensing image and the building vector are automatically registered by using the method for registering the remote sensing image and the building vector as claimed in any one of claims 1 to 5.