CN116052006B - Building edge optimization method based on multitask learning and dual lottery hypothesis

Info

Publication number: CN116052006B
Application number: CN202310314668.2A
Authority: CN
Inventors: 邢华桥; 项俊武; 温奇; 孙雨生; 王海航; 侯东阳
Original assignee: Shandong Jianzhu University
Current assignee: Shandong Jianzhu University
Priority date: 2023-03-29
Filing date: 2023-03-29
Publication date: 2023-06-16
Anticipated expiration: 2043-03-29
Also published as: CN116052006A

Abstract

The invention provides a building edge optimization method based on multitask learning and the dual lottery hypothesis, belonging to the technical field of remote sensing science. The method is realized by the following technical scheme: constructing a high-resolution remote sensing image building semantic segmentation and edge detection dataset; constructing a multi-task model for the building semantic segmentation task and the edge detection task with a CNN segmentation model as the framework; setting the sparsity of the multi-task model and the shared-parameter ratio of the two tasks, randomly generating parameter mask matrices for the building semantic segmentation and edge detection tasks, and generating a sub-network for each task; improving the loss functions of the two tasks by combining norm regularization with the parameter mask matrices; training the model to obtain a trained remote sensing image building multi-task model; and reconstructing and extracting the semantic segmentation task model from the multi-task model.

Description

Building edge optimization method based on multitask learning and dual lottery hypothesis

Technical Field

The invention relates to a building edge optimization method based on multitask learning and dual lottery hypothesis, belonging to the technical field of remote sensing science.

Background

Buildings are the most prominent man-made structures and geographical features of urban areas. Accurate extraction of building information from high-resolution remote sensing images plays a vital role in urban planning, change monitoring, environmental monitoring, real estate management, population estimation and disaster risk assessment. Manual visual interpretation of buildings in high-resolution remote sensing images consumes considerable manpower, material resources and time, so automatic building extraction with deep learning is a more economical and efficient approach. However, the mainstream remote sensing image semantic segmentation methods based on convolutional neural networks (CNNs) lose a large amount of building edge information during downsampling. As a result, the segmentation results perform poorly at edges, and automatic vectorization of buildings in high-resolution remote sensing images is difficult to achieve on that basis. The building edge detection task, by contrast, carries rich edge information. Effectively using the edge information of the edge detection task to optimize the edges of building semantic segmentation therefore improves the precision, quality and application value of automatic vectorization.

Existing CNN-based edge optimization methods for remote sensing image building semantic segmentation fall mainly into two groups: methods based on traditional modeling and methods based on CNN edge feature enhancement. Neither group considers the rich edge information contained in the building edge detection task. Conventional multi-task CNN edge optimization methods, in turn, do not consider the negative influence caused by different tasks holding each other back, so effective information is difficult to transfer between tasks. In particular, the building semantic segmentation and edge detection tasks lack an effective multi-task information sharing method. There is currently no good solution that, while weakening the negative influence of mutual interference among tasks and remaining compatible with existing models, makes full use of information transfer during multi-task training and builds a general multi-task framework so that the semantic segmentation result benefits from the building edge positions accurately perceived by the edge detection task.

Disclosure of Invention

The invention aims to provide a building edge optimization method based on multi-task learning and the dual lottery hypothesis. With a remote sensing image building multi-task CNN, the method reduces the negative influence of the mutual constraint among different tasks, which is otherwise difficult to avoid, constructs a general multi-task information transfer framework, and effectively transfers the edge information of the edge detection task to the semantic segmentation task during multi-task training.

To achieve this aim, the invention adopts the following technical scheme:

S1, constructing a high-resolution remote sensing image building semantic segmentation and edge detection dataset.

S2, constructing a multi-task model of a semantic segmentation task and an edge detection task of the building by taking the CNN segmentation model as a framework, wherein the semantic segmentation task in the model is a main task, and the edge detection task is an auxiliary task.

S3, setting the sparsity of the multi-task model and the shared-parameter ratio of the two tasks, randomly generating the parameter mask matrices of the remote sensing image building semantic segmentation and edge detection tasks, and generating a sub-network for each of the two tasks.

S4, improving the loss functions of the building semantic segmentation task and the edge detection task, respectively, by combining norm regularization with the parameter mask matrices.

S5, adopting a cosine unequal proportion alternating training strategy that periodically strengthens and weakens the transfer of edge information from the building edge detection task to the semantic segmentation task; training the constructed multi-task model on the dataset constructed in step S1 with back propagation and the stochastic gradient descent algorithm; during training, combining the constraints of step S3 and the method of step S4, iteratively transforming the mask matrix randomly generated for the building semantic segmentation task multiple times; and obtaining the trained remote sensing image building multi-task model once the preset precision is reached.

S6, reconstructing and extracting the semantic segmentation task model from the multi-task model according to the building semantic segmentation parameter mask matrix obtained from the last transformation and the parameter weight values, thereby obtaining a semantic segmentation model trained with the assistance of the edge detection task; and inputting a remote sensing image of actual buildings into the reconstructed semantic segmentation model to obtain a building semantic segmentation result with better-optimized edges.

Preferably, the high-resolution remote sensing image building semantic segmentation and edge detection dataset is constructed as follows:

S11, converting the original building vector data file into a binary building semantic segmentation label according to the boundary range vector file, taking the pixel size of the corresponding remote sensing image as the reference.

S12, cropping the remote sensing image to the corresponding range according to the boundary range vector file.

S13, performing morphological erosion on the binary building semantic segmentation label generated in step S11 to generate the corresponding building edge detection label.

S14, calculating, from the set tile size and overlap rate, whether each remote sensing image needs to be expanded; if so, calculating the number of pixels to add and expanding the original remote sensing image, the corresponding building semantic segmentation label and the building edge detection label to the specified size.

S15, performing sliding-window cropping on the remote sensing image, the building semantic segmentation label image and the building edge detection label image simultaneously, according to the set size and overlap rate.

S16, after cropping is completed, obtaining the high-resolution remote sensing image building semantic segmentation and edge detection dataset; taking the original remote sensing images as the reference, dividing the building dataset into a training set and a test set at a ratio of 8:2, with 80% of the data used for training and 20% used for testing performance after training.
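Steps S13 and S16 can be illustrated with a minimal pure-Python sketch. It assumes that the edge label is the set of building pixels removed by a 3 x 3 erosion (label minus eroded label); the structuring element, the function names and the random seed are illustrative assumptions and are not given in this text.

```python
import random


def erode(mask, radius=1):
    """Binary erosion with a square (2*radius+1) structuring element.

    Pixels outside the image are treated as background (assumption).
    """
    h, w = len(mask), len(mask[0])
    offsets = range(-radius, radius + 1)
    return [
        [
            int(all(
                0 <= r + dr < h and 0 <= c + dc < w and mask[r + dr][c + dc] == 1
                for dr in offsets for dc in offsets
            ))
            for c in range(w)
        ]
        for r in range(h)
    ]


def edge_label(mask, radius=1):
    """Edge pixels = building pixels that the erosion removes (assumption)."""
    eroded = erode(mask, radius)
    return [[m - e for m, e in zip(mrow, erow)] for mrow, erow in zip(mask, eroded)]


def split_train_test(image_ids, train_fraction=0.8, seed=0):
    """Split at the level of original images, 8:2 by default."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]
```

Splitting by original image rather than by tile keeps overlapping tiles of one image from landing in both the training set and the test set.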

Preferably, the input of the multi-task model for the building semantic segmentation task and the edge detection task is a high-resolution remote sensing image, and its outputs are the building segmentation map of the semantic segmentation task and the building edge extraction probability map of the edge detection task; in the initialization stage, the two tasks share all convolution modules of the encoder section.

Preferably, the mask matrices of the two tasks satisfy a constraint expressed in terms of the set sparsity ratio and the set shared-parameter ratio of the two tasks (the formula is not reproduced in this text). The quantities involved are: the mask matrix randomly generated for the remote sensing image building semantic segmentation task; the mask matrix of the edge detection task; the set sparsity ratio; and the set shared-parameter ratio of the two tasks.

The multi-task model and the sub-network of each task are likewise given by formulas that are not reproduced in this text. The quantities involved are: the building segmentation map; the building edge extraction probability map; the multi-task model; the high-resolution remote sensing image building semantic segmentation and edge detection dataset; and the multi-task model parameters.
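Because the constraint formula is not legible here, the following is only a sketch of one way to draw two task masks under a sparsity and a shared-parameter ratio. It assumes that the sparsity is the fraction of parameters set to 0 in each mask and that the shared ratio is the fraction of kept positions common to both masks; both interpretations, and the name `generate_task_masks`, are assumptions.

```python
import random


def generate_task_masks(n_params, sparsity, shared_ratio, seed=None):
    """Draw two random 0/1 masks over a flat parameter vector.

    Assumed semantics (not confirmed by the source text):
      * each mask keeps round((1 - sparsity) * n_params) parameters;
      * round(shared_ratio * kept) of the kept positions are common to both.
    """
    rng = random.Random(seed)
    kept = round((1.0 - sparsity) * n_params)
    shared = round(shared_ratio * kept)
    private = kept - shared
    needed = shared + 2 * private
    if needed > n_params:
        raise ValueError("not enough parameters for the requested masks")

    # Sample distinct positions once, then partition them.
    idx = rng.sample(range(n_params), needed)
    common = idx[:shared]
    seg_only = idx[shared:shared + private]
    edge_only = idx[shared + private:]

    seg_mask = [0] * n_params
    edge_mask = [0] * n_params
    for i in common:
        seg_mask[i] = edge_mask[i] = 1
    for i in seg_only:
        seg_mask[i] = 1
    for i in edge_only:
        edge_mask[i] = 1
    return seg_mask, edge_mask
```

With 2000 parameters and both ratios set to 0.95, each mask keeps 100 parameters, 95 of which are shared between the two tasks.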

Preferably, the loss functions in S4 combine a conventionally used loss function with a norm regularization term weighted by a trade-off parameter (the formula is not reproduced in this text). The quantities involved are: the loss function of the building semantic segmentation task; the loss function of the building edge detection task; a conventionally used loss function; the parameters whose value in the parameter mask matrix is 1; the parameters whose value in the parameter mask matrix is 0; and the trade-off parameter. Starting from a set minimum value, the trade-off parameter increases at each iteration as an arithmetic progression until it reaches a set maximum value.
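The arithmetic growth of the trade-off parameter and its use in a masked regularizer can be sketched as follows. The exact norm is not legible in this text, so the sketch assumes a squared L2 penalty on the parameters whose mask value is 0; the penalty form, the increment value used in the example and the function names are assumptions.

```python
def tradeoff_schedule(step, minimum, maximum, increment):
    """Arithmetic progression from `minimum`, clipped at `maximum`."""
    return min(maximum, minimum + step * increment)


def regularized_loss(base_loss, weights, mask, tradeoff):
    """Base task loss plus a penalty on masked-out parameters.

    Assumption: the penalty is 0.5 * tradeoff * sum(w^2) over weights whose
    mask value is 0, which pushes those weights toward zero as the trade-off
    parameter grows.
    """
    penalty = sum(w * w for w, m in zip(weights, mask) if m == 0)
    return base_loss + 0.5 * tradeoff * penalty


# Example: trade-off parameter grows from 0 to 1 in steps of 0.25 (hypothetical step).
schedule = [tradeoff_schedule(k, 0.0, 1.0, 0.25) for k in range(6)]
```

As the penalty weight rises, information held in the masked-out parameters is gradually squeezed into the retained ones before the sub-network is extracted.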

Preferably, the cosine unequal proportion alternating training strategy, which further weakens the negative influence of the mutual constraint between the building semantic segmentation task and the edge detection task, proceeds as follows.

In each cycle, the ratio of the number of alternating iterations of the building semantic segmentation task to that of the edge detection task satisfies a cosine-based condition (the formula is not reproduced in this text). The quantities involved are: the segmentation period; the index of the current iteration cycle; and the maximum number of iteration cycles.

The segmentation period is determined from the accuracy evaluation score during training: it is the minimum cycle at which the score of the building semantic segmentation task and the score of the building edge detection task are simultaneously greater than 0.6.
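The rule for the segmentation period (the first cycle at which both task scores exceed 0.6) can be sketched directly. The ratio formula itself is not legible in this text, so `alternation_ratio` below is a purely hypothetical cosine schedule that rises and falls once between the segmentation period and the last cycle; its shape and its `low` and `high` bounds are assumptions.

```python
import math


def find_split_period(seg_scores, edge_scores, threshold=0.6):
    """Return the first 1-based cycle at which both scores exceed the threshold."""
    for cycle, (seg, edge) in enumerate(zip(seg_scores, edge_scores), start=1):
        if seg > threshold and edge > threshold:
            return cycle
    return None  # the threshold was never reached by both tasks


def alternation_ratio(cycle, split_period, max_cycles, low=1.0, high=4.0):
    """Hypothetical segmentation-to-edge iteration ratio for one cycle.

    Before the split period the ratio stays at `low`; afterwards it follows
    one cosine period, peaking at `high` halfway to `max_cycles`.
    """
    if cycle <= split_period or max_cycles <= split_period:
        return low
    phase = (cycle - split_period) / (max_cycles - split_period)
    return low + 0.5 * (high - low) * (1.0 - math.cos(2.0 * math.pi * phase))
```

For example, `find_split_period([0.5, 0.7, 0.8], [0.4, 0.55, 0.65])` returns 3, because the third cycle is the first at which both scores are above 0.6.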

The multi-task model is trained on the high-resolution remote sensing image building semantic segmentation and edge detection dataset with back propagation and the stochastic gradient descent algorithm. During training, the mask matrix randomly generated for the building semantic segmentation task is iteratively transformed multiple times so that the parameter mask matrix satisfies a transformation formula (not reproduced in this text). The quantities involved are: the parameter mask matrix generated under constraint when the building semantic segmentation network is initialized; the index of the iterative transformation; and the parameter mask matrix generated under constraint for the building semantic segmentation network at a given iteration, where the iteration number is a multiple of M. After each iterative transformation, the newly generated parameter mask matrix is used to reconstruct the building semantic segmentation sub-network.

Preferably, the said

。

Preferably, reconstructing and extracting the semantic segmentation task model from the multi-task model, according to the building semantic segmentation parameter mask matrix obtained from the last transformation and the parameter weight values, comprises the following specific step: according to the parameter mask matrix from the last iterative transformation of the semantic segmentation network, all parameters whose mask value is 0 are removed and the parameters whose mask value is 1 are retained, and the semantic segmentation network is reconstructed.

The invention has the following advantages. By randomly generating, under constraint, the parameter mask matrices of the two sub-tasks of remote sensing image building semantic segmentation and edge detection, the invention establishes a connection between the two tasks. The loss function improved by norm regularization and the mask matrices transforms the randomly selected sub-network into the optimal sub-network while making full use of the overall parameter information of the model. Compared with traditional multi-task sharing schemes, the method is more flexible, more effectively weakens the negative influence of different tasks holding each other back, and is more feasible. The cosine unequal proportion alternating training strategy periodically strengthens and weakens the transfer of edge information from the building edge detection task to the semantic segmentation task. The whole process can be applied to various existing deep learning models. For remote sensing image building semantic segmentation, the method segments complete buildings better, its extracted results are more regular and closer to the ground truth, and over-detection and missed detection are reduced, which further improves the precision and application value of automatic building extraction.

Drawings

The accompanying drawings provide a further understanding of the invention and constitute a part of this specification; together with the embodiments, they serve to explain the invention.

Fig. 1 is a diagram of the multi-task training framework of the building edge optimization method based on multi-task learning and the dual lottery hypothesis in an embodiment of the present invention.

Fig. 2 is a flow chart of the construction of the building semantic segmentation and edge detection dataset in an embodiment of the present invention.

Fig. 3 compares the experimental results of the building edge optimization method with those of the corresponding single-task model in an embodiment of the present invention.

Detailed Description

The embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of the invention.

This embodiment provides a building edge optimization method based on multitask learning and the dual lottery hypothesis.

Referring to fig. 1, which shows the multi-task training framework of the method in this embodiment, the method specifically includes the following steps:

S1, referring to fig. 2, a high-resolution remote sensing image building semantic segmentation and edge detection dataset is constructed, which specifically includes the following steps:

S15, performing sliding-window cropping on the remote sensing image, the building semantic segmentation label image and the building edge detection label image simultaneously, with a tile size of 256 x 256 and an overlap rate of 0.5.

S16, after cropping is completed, obtaining the high-resolution remote sensing image building semantic segmentation and edge detection dataset.
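The expansion and sliding-window cropping with a 256 x 256 tile and an overlap rate of 0.5 can be sketched along one image axis as follows. The sketch assumes that the stride equals the tile size times one minus the overlap rate and that the image is padded just enough for the last tile to fit; the padding rule and the function names are assumptions.

```python
import math


def padded_length(length, tile, stride):
    """Smallest length >= `length` that is covered exactly by whole tiles."""
    if length <= tile:
        return tile
    return tile + math.ceil((length - tile) / stride) * stride


def tile_origins(length, tile=256, overlap=0.5):
    """Return the tile start offsets and the padded length for one axis."""
    stride = int(tile * (1 - overlap))  # 128 pixels for the values above
    total = padded_length(length, tile, stride)
    return list(range(0, total - tile + 1, stride)), total


# A 600-pixel axis is padded to 640 pixels and yields four overlapping tiles.
origins, total = tile_origins(600)
```

Applying the same offsets to the image, the segmentation label and the edge label keeps the three tiles aligned pixel for pixel.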

S2, using the high-resolution remote sensing image building semantic segmentation and edge detection dataset described in S1, in this example a multi-task model for the building semantic segmentation task and the edge detection task is constructed with the U-Net segmentation model as the framework, where semantic segmentation is the main task and edge detection is the auxiliary task. Through a series of operations such as convolution, pooling and up-sampling, the multi-task model extracts the high-level semantic features for building semantic segmentation and for edge detection. The input of the multi-task model is a high-resolution remote sensing image, and its outputs are the building segmentation map of the semantic segmentation task and the building edge extraction probability map of the edge detection task. In addition, the two tasks need to share some parameters in the initialization stage, that is, some modules such as convolutions are shared in the model.

The multi-task model is expressed by a formula (not reproduced in this text) relating the building segmentation map, the building edge extraction probability map, the multi-task model and the multi-task model parameters.

S3, according to the set sparsity and the set shared-parameter ratio of the two tasks, the parameter mask matrices of the remote sensing image building semantic segmentation and edge detection tasks are randomly generated under these constraints, and a sub-network is generated for each of the two tasks.

In this example, the sparsity is set to 0.95 and the shared-parameter ratio of the two tasks is set to 0.95, and the parameter mask matrices of the building semantic segmentation and edge detection tasks are randomly generated. The mask matrices of the two tasks must satisfy a constraint (the formula is not reproduced in this text) expressed in terms of the mask matrix randomly generated for the building semantic segmentation task, the mask matrix of the edge detection task, the set sparsity ratio and the set shared-parameter ratio of the two tasks. The multi-task model and the sub-network of each task are expressed by further formulas that are likewise not reproduced.

S4, according to the dual lottery hypothesis, a sub-network randomly selected from a randomly initialized dense network can be transformed into a trainable condition. In the multi-task setting, combined norm regularization can transform the randomly selected sub-network into the best-performing sub-network, and at the same time the sub-network of the building edge detection task can constrain, and share information with, the semantic segmentation task during training.

In this example, according to the model outputs, the loss function of the original semantic segmentation task is defined in terms of the semantic segmentation result and the probability that the result is a building, and the loss function of the original edge detection task is defined in terms of the edge detection result and the probability that the result is a building edge (the two formulas are not reproduced in this text).
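The two loss formulas are not legible in this text. For a binary building / non-building output, binary cross-entropy is a common choice, and the sketch below assumes it purely for illustration; the same function would apply to edge / non-edge labels. The function name and the clipping constant `eps` are assumptions.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities.

    Assumption: this stands in for the unspecified "conventionally used" loss.
    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)
```

A prediction of 0.5 for every pixel gives a loss of ln 2, and the loss approaches 0 as the probabilities approach the labels.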

To transform the sub-network randomly selected under constraint into the best-performing sub-network, a norm regularization that combines the advantages of two norm regularization methods is used to improve the loss functions of both tasks (the formula is not reproduced in this text). The quantities involved are: the loss function of the building semantic segmentation task; the loss function of the building edge detection task; a conventionally used loss function; the parameters whose value in the parameter mask matrix is 1; the parameters whose value in the parameter mask matrix is 0; and the trade-off parameter. In this example, the minimum value of the trade-off parameter is set to 0 and its maximum value to 1; the growth step is not legible in this text.

S5, a cosine unequal proportion alternating training strategy is adopted to periodically strengthen and weaken the transfer of edge information from the building edge detection task to the semantic segmentation task. The constructed multi-task model is trained on the dataset constructed in step S1 with back propagation and the stochastic gradient descent algorithm; at the same time, combining the constraints of step S3 and the method of step S4, the mask matrix randomly generated for the building semantic segmentation task is iteratively transformed.

In this example, the number of epochs is set to 32, and the cosine unequal proportion alternating training strategy is adopted to further weaken the negative influence of the mutual constraint between the building semantic segmentation task and the edge detection task. In each cycle, the ratio of the number of alternating iterations of the building semantic segmentation task to that of the edge detection task satisfies a cosine-based formula (not reproduced in this text) expressed in terms of the segmentation period, the index of the current iteration cycle and the maximum number of iteration cycles.

The segmentation period is determined from the accuracy evaluation score during training: it is the minimum cycle at which the score of the building semantic segmentation task and the score of the building edge detection task are simultaneously greater than 0.6.

Training uses the dataset with back propagation and the stochastic gradient descent algorithm, and the mask matrix is iteratively transformed multiple times according to a transformation formula (not reproduced in this text), in which each transformation takes place at an iteration number that is a multiple of M. After each iterative transformation, the newly generated parameter mask matrix is used to reconstruct the building semantic segmentation sub-network.
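The schedule described here, in which the segmentation mask is regenerated at every iteration that is a multiple of M and the sub-network is rebuilt afterwards, can be sketched as a training-loop skeleton. The regeneration rule itself is not legible in this text, so `regenerate_mask` and `train_step` are hypothetical callbacks supplied by the caller.

```python
def train_with_mask_updates(total_iterations, period, initial_mask,
                            regenerate_mask, train_step):
    """Run `train_step` every iteration and refresh the mask every `period`.

    `regenerate_mask(iteration, mask)` must return the new mask; how it is
    generated under the sparsity and sharing constraints is left to the
    caller (assumption: the rule is not specified in this sketch).
    """
    mask = initial_mask
    history = [(0, mask)]  # (iteration, mask) after each transformation
    for iteration in range(1, total_iterations + 1):
        train_step(iteration, mask)
        if iteration % period == 0:
            mask = regenerate_mask(iteration, mask)
            history.append((iteration, mask))
    return mask, history


# Toy usage: rotate a 3-element mask every 4 iterations (illustrative only).
calls = []
final_mask, history = train_with_mask_updates(
    total_iterations=10,
    period=4,
    initial_mask=[1, 0, 0],
    regenerate_mask=lambda i, m: m[1:] + m[:1],
    train_step=lambda i, m: calls.append(i),
)
```

In the toy run the mask is transformed at iterations 4 and 8, so two transformations are recorded after the initial mask.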

In addition, during training, hyperparameters such as the ratio of positive to negative samples, the learning rate, the batch size and the weight decay coefficient are tuned according to the test results. The trained remote sensing image building multi-task model is obtained once the preset precision is reached.

S6, the semantic segmentation task model is reconstructed and extracted from the multi-task model according to the parameter mask matrix obtained from the last transformation and the parameter weight values. Specifically, according to the parameter mask matrix from the last iterative transformation of the semantic segmentation network, all parameters whose mask value is 0 are removed and the parameters whose mask value is 1 are retained, and the semantic segmentation network is reconstructed. This yields a semantic segmentation model trained with the assistance of the edge detection task. A remote sensing image of actual buildings is then input into the reconstructed semantic segmentation model to obtain a building semantic segmentation result with better-optimized edges.
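Extracting the segmentation sub-network from the trained multi-task model amounts to keeping the weights selected by the final mask. The sketch below assumes that "removing" a parameter means setting it to zero and that weights and masks are stored per layer as flat lists; the data layout and the function name are assumptions.

```python
def extract_subnetwork(weights, mask):
    """Keep weights whose mask value is 1 and zero out the rest.

    `weights` maps layer name -> list of floats; `mask` maps the same layer
    names -> lists of 0/1 values of equal length (assumed layout).
    """
    extracted = {}
    for name, values in weights.items():
        layer_mask = mask[name]
        if len(layer_mask) != len(values):
            raise ValueError(f"mask/weight size mismatch in layer {name!r}")
        extracted[name] = [w if m == 1 else 0.0 for w, m in zip(values, layer_mask)]
    return extracted


# Toy usage with one hypothetical layer.
sub = extract_subnetwork({"conv1": [0.5, -1.0, 2.0]}, {"conv1": [1, 0, 1]})
```

Only the retained weights then take part in inference on new remote sensing images.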

As shown in fig. 3, experiments show that, for remote sensing image building semantic segmentation, the invention segments complete buildings better, its extracted results are more regular and closer to the ground truth, and over-detection and missed detection are reduced.

The training and test datasets were constructed from a public dataset, the WHU (Wuhan University) aerial imagery dataset, by the procedure shown in fig. 2. The single-task U-Net semantic segmentation model was compared experimentally with the semantic segmentation and edge detection multi-task model built on the U-Net framework by the method of the invention. The results for U-Net were: Accuracy 0.9377, Precision 0.7443, F1 score 0.8017, IoU 0.6982, Kappa 0.747. The results of the invention were: Accuracy 0.9475, Precision 0.8230, F1 score 0.8098, IoU 0.7153, Kappa 0.7613. The method therefore improves every evaluation index over the corresponding single-task model. In the actual extraction results shown in fig. 3, the invention reduces missed and false detections to a certain extent, and the extraction results at building edges are more regular and closer to the ground truth.
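The five evaluation indices can be computed from a binary confusion matrix using their standard definitions, as in the sketch below. How the reported values were aggregated (per image or over all pixels) is not stated in this text, so the function is only an illustration; its name and the example counts are assumptions.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, Precision, F1 score, IoU and Cohen's kappa for one class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)
    # Expected agreement by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (total * total)
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
        "iou": iou,
        "kappa": kappa,
    }


# Toy counts (hypothetical): 40 true positives, 10 false positives,
# 10 false negatives, 40 true negatives.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```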

Finally, it should be noted that the foregoing describes only preferred embodiments of the present invention and does not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical schemes described therein or replace some of their technical features with equivalents. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within its protection scope.

Claims

1. The building edge optimization method based on the multitask learning and dual lottery hypothesis is characterized by comprising the following steps of:

；

S2, constructing a multi-task model of a semantic segmentation task and an edge detection task of the building by taking the CNN segmentation model as a framework, wherein the semantic segmentation task in the model is a main task, and the edge detection task is an auxiliary task; the multi-task models of the building semantic segmentation task and the edge detection task are input into high-resolution remote sensing images, and are output into building semantic segmentation task building segmentation graphs and building edge detection task building edge extraction probability graphs; all convolution modules of the encoder part are shared in the initialization stage of the semantic segmentation task and the edge detection task of the building;

Generating a sub-network of two tasks;

the mask matrices satisfy a constraint (the formula is not reproduced in this text) expressed in terms of the mask matrices of the two tasks, the set sparsity ratio and the set shared-parameter ratio of the two tasks;

the multi-task model and the sub-network of each task are expressed by formulas (not reproduced in this text) relating the building segmentation map, the building edge extraction probability map, the multi-task model and the multi-task model parameters;

S4, improving the loss functions of the building semantic segmentation task and the edge detection task, respectively, by combining norm regularization with the parameter mask matrices;

the loss functions (the formula is not reproduced in this text) involve: the loss function of the building semantic segmentation task; the loss function of the building edge detection task; a conventionally used loss function; the parameters whose value in the parameter mask matrix is 1; the parameters whose value in the parameter mask matrix is 0; and a trade-off parameter; starting from a set minimum value, the trade-off parameter increases at each iteration as an arithmetic progression until it reaches a set maximum value;

S5, adopting a cosine unequal proportion alternating training strategy to periodically strengthen and weaken the influence of the building edge detection task on the semantic segmentation task; adopting back propagation and the stochastic gradient descent algorithm; using the high-resolution remote sensing image building semantic segmentation and edge detection dataset constructed in step S1 to perform multi-task training with multiple iterative transformations; and obtaining the trained remote sensing image building multi-task model after the preset precision is reached;

the cosine unequal proportion alternating training strategy, which further weakens the negative influence of the mutual constraint between the building semantic segmentation task and the edge detection task, comprises the following specific steps:

in each cycle, the ratio of the number of alternating iterations of the two tasks satisfies a cosine-based condition (the formula is not reproduced in this text) expressed in terms of the segmentation period, the index of the current iteration cycle and the maximum number of iteration cycles;

the segmentation period is determined from the accuracy evaluation score during training, being the minimum cycle at which the score of the building semantic segmentation task and the score of the building edge detection task are simultaneously greater than 0.6;

during training, the parameter mask matrix of the building semantic segmentation network is iteratively transformed according to a formula (not reproduced in this text), and after each transformation the newly generated parameter mask matrix is used to reconstruct the building semantic segmentation sub-network;

2. The building edge optimization method based on multi-task learning and dual lottery hypothesis according to claim 1, wherein the high-resolution remote sensing image building semantic segmentation and edge detection dataset is constructed as follows:

S11, converting the original building vector data file into a binary building semantic segmentation label according to the boundary range vector file, taking the pixel size of the corresponding remote sensing image as the reference;

S12, cropping the remote sensing image to the corresponding range according to the boundary range vector file;

S13, performing morphological erosion on the binary building semantic segmentation label generated in step S11 to generate the corresponding building edge detection label;

S14, calculating, from the set size and overlap rate, whether each remote sensing image needs to be expanded; if so, calculating the number of pixels to add and expanding the original remote sensing image, the corresponding building semantic segmentation label and the building edge detection label to the specified size;

S15, performing sliding-window cropping on the remote sensing image, the building semantic segmentation label image and the building edge detection label image simultaneously, according to the set size and overlap rate;

3. The method for optimizing building edges based on the multi-task learning and dual lottery hypothesis according to claim 1, wherein the following steps are performed

。

4. The building edge optimization method based on multi-task learning and dual lottery hypothesis according to claim 1, wherein reconstructing and extracting the semantic segmentation task model from the multi-task model, according to the building semantic segmentation parameter mask matrix obtained from the last transformation and the parameter weight values, comprises: according to the parameter mask matrix from the last iterative transformation of the semantic segmentation network, removing all parameters whose mask value is 0, retaining the parameters whose mask value is 1, and reconstructing the semantic segmentation network.

5. A building edge optimization system based on a multi-task learning and dual lottery hypothesis, wherein the system is capable of implementing a building edge optimization method based on a multi-task learning and dual lottery hypothesis as in any one of claims 1-4.