CN115810152A

CN115810152A - Remote sensing image change detection method and device based on graph convolution and computer equipment

Info

Publication number: CN115810152A
Application number: CN202211625236.5A
Authority: CN
Inventors: 王威; 刘聪; 王新; 李骥; 张文杰
Original assignee: Changsha University of Science and Technology
Current assignee: Changsha University of Science and Technology
Priority date: 2022-12-16
Filing date: 2022-12-16
Publication date: 2023-03-17

Abstract

The application relates to a remote sensing image change detection method and device based on graph convolution and computer equipment in the technical field of image processing. The method includes the steps that obtained double-temporal remote sensing images are marked to obtain training samples, a remote sensing image change detection network is constructed, and the training samples are adopted for training; and detecting the double-time-phase remote sensing image to be detected by adopting the trained remote sensing image change detection network to obtain a remote sensing image change detection result. The network comprises two feature extraction branches consisting of backbone networks with the same structure and parameters and an EF module, a graph convolution coding module, two graph convolution decoding modules with the same structure and an output network; the feature extraction branch is used for extracting image features and boundary information of a training sample, the image convolution coding module is used for coding features after double-temporal image fusion, and the image convolution decoding module is used for fusing multi-level feature differences and predicting a change mask. By adopting the method, the change area of the picture can be accurately predicted.

Description

Remote sensing image change detection method and device based on graph convolution and computer equipment

Technical Field

The application relates to the technical field of image processing, in particular to a remote sensing image change detection method and device based on graph convolution and computer equipment.

Background

Change Detection (Change Detection) is an important task in remote sensing image analysis, where each pixel in a region is assigned a binary label (i.e., changed or unchanged) by comparing two-temporal images of the same region taken at different times. The remote sensing image change detection has made remarkable success based on the strong distinguishing capability of the deep neural network, but context information cannot be well aggregated for the high-resolution remote sensing change detection task.

Disclosure of Invention

In view of the above, it is necessary to provide a remote sensing image change detection method, apparatus and computer device based on graph convolution.

A remote sensing image change detection method based on graph convolution comprises the following steps:

and acquiring a double-time-phase remote sensing image, and labeling the double-time-phase remote sensing image to obtain a training sample.

Constructing a remote sensing image change detection network based on graph convolution; the remote sensing image change detection network comprises two feature extraction branches consisting of backbone networks with the same structure and parameters and a boundary perception fusion module, a graph convolution coding module, two graph convolution decoding modules with the same structure and an output network; the characteristic extraction branch is used for extracting the characteristics of a training sample through a backbone network, splicing the obtained image characteristics to obtain fusion characteristics, extracting boundary information of multiple scales through a shallow backbone network by adopting a boundary perception fusion module, and splicing the image characteristics and the boundary information to obtain single-time-phase image characteristics; the graph convolution coding module is used for coding the fusion features by adopting graph convolution to obtain double-time-phase image coding features; the graph convolution decoding module is used for decoding the single-time phase image characteristics and the double-time phase image coding characteristics by adopting graph convolution to obtain image decoding characteristics; and the output network is used for performing up-sampling operation according to the difference characteristic diagram of the image decoding characteristics output by the two image convolution decoding modules to obtain a change detection prediction result.

And training the remote sensing image change detection network by using the label of the training sample and the change detection prediction result obtained by inputting the training sample into the remote sensing image change detection network to obtain the trained remote sensing image change detection network.

And detecting the double-time-phase remote sensing image to be detected by adopting the trained remote sensing image change detection network to obtain a remote sensing image change detection result.

In one embodiment, the training samples comprise a first time-phase remote sensing image training sample and a second time-phase remote sensing image training sample.

Training the remote sensing image change detection network by using the label of the training sample and the change detection prediction result obtained by inputting the training sample into the remote sensing image change detection network to obtain the trained remote sensing image change detection network, comprising:

and respectively inputting the first time-phase remote sensing image training sample and the second time-phase remote sensing image training sample into two feature extraction branches to obtain a fusion feature, a first time-phase remote sensing image feature and a second time-phase remote sensing image feature.

And inputting the fusion features into the graph convolution coding module to obtain double-time-phase image coding features.

And inputting the first time-phase remote sensing image characteristic and the double-time-phase image coding characteristic into a first graph convolution decoding module to obtain a first time-phase image decoding characteristic.

And inputting the second time-phase image characteristic and the double time-phase image coding characteristic into a second graph convolution decoding module to obtain a second time-phase image decoding characteristic.

And inputting the first time phase image decoding characteristic and the second time phase image decoding characteristic into an output network to obtain a change detection prediction result.

And carrying out reverse training on the remote sensing image change detection network according to the change detection prediction result and the label of the training sample to obtain the trained remote sensing image change detection network.

In one embodiment, the backbone network in the feature extraction branch is obtained by removing the classification header from the Resnet50 network.

Inputting the first time-phase remote sensing image training sample and the second time-phase remote sensing image training sample into two feature extraction branches respectively to obtain fusion features, first time-phase remote sensing image features and second time-phase remote sensing image features, wherein the fusion features comprise:

and inputting the first time-phase remote sensing image training sample into a backbone network of a first feature extraction branch to obtain first time-phase remote sensing image features.

Inputting the first to third layers of features of the backbone network into a boundary perception fusion module of a first feature extraction branch to obtain first time-phase remote sensing image boundary information.

And splicing the first time-phase remote sensing image characteristic and the first time-phase remote sensing image boundary information to obtain a first time-phase image characteristic.

And inputting the second time-phase remote sensing image training sample into a second characteristic extraction branch to obtain a second time-phase remote sensing image characteristic and a second time-phase image characteristic.

And splicing the first time phase remote sensing image characteristic and the second time phase remote sensing image characteristic to obtain a fusion characteristic.

In one embodiment, inputting the features of the first to third layers of the backbone network into a boundary perception fusion module of a first feature extraction branch to obtain boundary information of a first time-phase remote sensing image, including:

inputting the features of the first layer to the third layer of the backbone network into a boundary perception fusion module of a first feature extraction branch, performing channel fusion on the features of the second layer of the backbone network and the features of the first layer of the backbone network after performing up-sampling on the features of the second layer of the backbone network, and obtaining a first intermediate feature.

And performing channel fusion on the characteristics of the third layer of the backbone network and the characteristics of the second layer of the backbone network after up-sampling to obtain second intermediate characteristics.

And performing channel fusion on the second intermediate features after the second intermediate features are subjected to up-sampling and the first intermediate features, and performing point convolution processing on a fusion result to obtain first time-phase remote sensing image boundary information.

In one embodiment, the graph convolution encoding module includes: the spatial attention is tried to carry out convolution coding branch and the channel attention is tried to carry out convolution branch; wherein the spatial attention map convolutional encoding branch comprises: the system comprises a step convolution layer, a space attention map convolution module, a nearest neighbor interpolation module and a point convolution layer; the channel attention map convolution branch comprises two linear transformation functions, a channel attention map convolution module and a point convolution layer.

The graph convolution encoding module is characterized in that:

processing the fused features by step size convolution layer to obtain new features X _s 。

New feature X _s Inputting the space attention map convolution module to obtain a space map convolution feature M _s Comprises the following steps:

wherein μ (·) is,

Phi (-) is a linear transformation,

represents a contiguous matrix, W _s A parameter matrix that is convolved for spatial attention.

Convolving the space map with a feature M _s And multiplying the processed result by the fusion characteristic by adopting a nearest neighbor interpolation method, and then processing by adopting a point convolution layer to obtain the output of the spatial attention map convolution coding branch.

Processing the fused features by respectively adopting a first linear transformation and a second linear transformation, multiplying the obtained first linear transformation result and the obtained second linear transformation result to obtain a new feature X _f 。

New feature X _f Inputting the channel convolution attention map convolution module to obtain a channel map convolution feature M _f Comprises the following steps:

M _f ＝(I+A _f )X _f W _f wherein I is an identity matrix, I + A _f Is a contiguous matrix, W _f A parameter matrix that is convolved for channel attention.

The second linear transformation result is integrated with the channel map by the characteristics M _f After multiplication, the signals are input into a point convolution layer to obtain the output of a channel attention map convolution branch.

And splicing the output of the spatial attention diagram convolutional coding branch and the output of the channel attention diagram convolutional coding branch to obtain the double-time-phase image coding characteristic.

In one embodiment, the graph convolution decoding module includes: the spatial attention-seeking convolutional decoding branch and the channel attention-seeking convolutional decoding branch; the spatial attention diagram convolutional decoding branch comprises two step size convolutional layers, a spatial attention diagram convolutional decoding module and a point convolutional layer; the channel attention-seeking convolutional decoding branch comprises two linear transforms, a spatial attention-seeking convolutional decoding module and a point convolutional layer.

In a first graph convolution decoding module:

inputting the first time phase image characteristic and the double time phase image coding characteristic into a step size convolution layer respectively to obtain a new image characteristic Y _s ' and New coding feature X _s ′。

New image characteristic Y _s ' and New coding feature X _s Inputting the space attention map convolution decoding module to obtain a space map convolution feature M _s ' is:

wherein mu (·),

Phi (-) is a linear transformation,

represents a contiguous matrix, W _s ' a parameter matrix for a spatial attention map convolutional decoding module.

Convolving the space map with a feature M _s ' and the new image feature Y _s ' after multiplication, after point convolution layer, the output of the spatial attention map convolution decoding branch is obtained.

Respectively carrying out linear transformation on the first time phase image characteristic and the double time phase image coding characteristic, and multiplying obtained transformation results to obtain a characteristic C _ Y ^T 。

The feature X _ Y is measured ^T Inputting the obtained data into the spatial attention diagram convolution decoding module to obtain a channel diagram convolution characteristic M _f ' is:

M _f ′＝(I+A _f ′)(X_Y ^T )W _f ′

wherein I is an identity matrix, I + A _f Is a contiguous matrix, W _f ' the channel attention is looked at to convolve the parameter matrix of the decoding module.

Convolving the channel map with a feature M _f ' and the new image feature Y _s After multiplication, a point convolution layer is obtained to obtain the output of a channel attention map convolution decoding branch.

And splicing the spatial attention diagram convolutional decoding branch output and the channel attention diagram convolutional decoding branch output to obtain a first time phase image decoding characteristic.

In one embodiment, the linear transformation is implemented by 1 × 1 convolutional layers.

In one embodiment, inputting the first time-phase image decoding feature and the second time-phase image decoding feature into an output network to obtain a change detection prediction result, includes:

inputting the first time phase image decoding characteristic and the second time phase image decoding characteristic into an output network, and obtaining a change detection prediction result as follows:

Pre＝σ(upsample(|P ¹ -P ² |))

where Pre is the prediction result of change detection, P ¹ ，P ² Respectively representing a first time phase image decoding characteristic and a second time phase image decoding characteristic, wherein sigma is an activation function, and upsample is an upsampling operation.

A remote sensing image change detection apparatus based on graph convolution, the apparatus comprising:

and the training sample acquisition module is used for acquiring the double-time-phase remote sensing image and labeling the double-time-phase remote sensing image to obtain a training sample.

The remote sensing image change detection network construction module based on graph convolution is used for constructing a remote sensing image change detection network based on graph convolution; the remote sensing image change detection network comprises two feature extraction branches consisting of backbone networks with the same structure and parameters and a boundary perception fusion module, a graph convolution coding module, two graph convolution decoding modules with the same structure and an output network; the characteristic extraction branch is used for extracting the characteristics of a training sample through a backbone network, splicing the obtained image characteristics to obtain fusion characteristics, extracting boundary information of multiple scales through a shallow backbone network by adopting a boundary perception fusion module, and splicing the image characteristics and the boundary information to obtain single-time-phase image characteristics; the graph convolution coding module is used for coding the fusion features by adopting graph convolution to obtain double-time-phase image coding features; the graph convolution decoding module is used for decoding the single-time phase image characteristics and the double-time phase image coding characteristics by adopting graph convolution to obtain image decoding characteristics; and the output network is used for performing up-sampling operation according to the difference characteristic diagram of the image decoding characteristics output by the two image convolution decoding modules to obtain a change detection prediction result.

And the remote sensing image change detection network training module based on graph convolution is used for training the remote sensing image change detection network by utilizing the labels of the training samples and the change detection prediction result obtained by inputting the training samples into the remote sensing image change detection network to obtain the trained remote sensing image change detection network.

And the remote sensing image change detection module is used for detecting the double-time-phase remote sensing image to be detected by adopting the trained remote sensing image change detection network to obtain a remote sensing image change detection result.

A computer device comprising a memory storing a computer program and a processor implementing the steps of any of the methods described above when the processor executes the computer program.

The remote sensing image change detection method, the remote sensing image change detection device and the computer equipment based on graph convolution comprise the following steps: acquiring a double-time-phase remote sensing image, labeling the double-time-phase remote sensing image to obtain a training sample, and constructing a remote sensing image change detection network, wherein the network comprises two feature extraction branches consisting of a residual error network and a boundary sensing fusion module which have the same structure and parameters, a graph convolution coding module, two graph convolution decoding modules which have the same structure, and an output network; the feature extraction branch is used for extracting image features of training samples by adopting a backbone network, boundary information extraction of multiple scales is carried out on backbone network shallow features through a boundary perception fusion module, a graph convolution coding module further refines feature information after double-temporal picture fusion, and a graph convolution decoding module is used for fusing multi-level feature differences and predicting a change mask. Training the remote sensing image change detection network by adopting a training sample, and detecting the double-time-phase remote sensing image to be detected by adopting the trained remote sensing image change detection network to obtain a remote sensing image change detection result. By adopting the method, the change area of the picture can be accurately predicted.

Drawings

FIG. 1 is a schematic flow chart of a remote sensing image change detection method based on graph convolution according to an embodiment;

FIG. 2 is a schematic structural diagram of a remote sensing image change detection network based on graph convolution in another embodiment;

FIG. 3 is a schematic diagram of a boundary-aware fusion module in another embodiment;

FIG. 4 is a diagram illustrating the structure of a convolutional decoding module in another embodiment;

FIG. 5 is a block diagram showing the structure of a remote sensing image change detection apparatus based on graph convolution according to an embodiment;

FIG. 6 is a diagram of the internal structure of a computer device in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

A remote sensing image change detection network based on graph convolution is called as follows for short: BI-GCN.

And the Edge Fusion Block is an EF module for short.

In one embodiment, as shown in fig. 1, there is provided a remote sensing image change detection method based on graph convolution, the method comprising the following steps:

step 100: and acquiring a double-time-phase remote sensing image, and labeling the double-time-phase remote sensing image to obtain a training sample.

The double time phase remote sensing images are two remote sensing images shot at different times and in the same place.

Step 102: and constructing a remote sensing image change detection network based on graph convolution.

The remote sensing image change detection network comprises two feature extraction branches consisting of a backbone network and a boundary perception fusion module, wherein the backbone network and the boundary perception fusion module are identical in structure and parameters, a graph convolution coding module, two graph convolution decoding modules identical in structure and an output network.

The feature extraction branch is used for extracting features of training samples through a backbone network, splicing the obtained image features to obtain fusion features, extracting boundary information of multiple scales through a shallow backbone network by adopting a boundary perception fusion module, and splicing the image features and the boundary information to obtain single-time-phase image features.

And the graph convolution coding module is used for coding the fusion characteristics by adopting graph convolution to obtain the double-time-phase image coding characteristics.

And the graph convolution decoding module is used for decoding the single-time phase image characteristic and the double-time phase image coding characteristic by adopting graph convolution to obtain an image decoding characteristic.

The output network is used for carrying out up-sampling operation according to the difference characteristic diagram of the image decoding characteristics output by the two image convolution decoding modules to obtain a change detection prediction result.

Specifically, the structure of the remote sensing image change detection network based on graph convolution is shown in fig. 2. The backbone network is a feature extraction part of the Resnet network, and a boundary perception fusion module (EF module) extracts boundary information of multiple scales through a shallow backbone network. The image convolution coding module further refines the feature information after the double-temporal image fusion, and the image convolution decoding module is used for fusing the multi-level feature differences and predicting the change mask.

In standard convolution, information is only exchanged between multiple locations in a small neighborhood defined by the filter size (e.g., a convolution of 3*3 size). To create a large receptive field and capture long distance dependencies, multiple layers need to be stacked one after the other as in the common CNN architecture. Graph convolution is an efficient and easily integrated module that encapsulates the adjacent definitions used in standard convolution and allows long distance information exchange in a single network layer by defining edges between nodes in the graph. Formally, a graph convolution is defined as

X＝σ(AXW) (1)

Where σ (·) is a nonlinear activation function, a is an adjacency matrix representing the neighborhood between nodes of the graph, W is a parameter matrix, and the definition and structure of the graph play a key role in determining information propagation. The graph convolution coding module constructs orthogonal graph space through different graph projection strategies to better learn the difference characteristics in the change detection task.

The whole structure of the graph volume decoding module follows the design idea of the graph volume coding module.

Step 104: and training the remote sensing image change detection network by using the label of the training sample and the change detection prediction result obtained by inputting the training sample into the remote sensing image change detection network to obtain the trained remote sensing image change detection network.

Step 106: and detecting the double-time-phase remote sensing image to be detected by adopting the trained remote sensing image change detection network to obtain a remote sensing image change detection result.

In the above method for detecting a change in a remote sensing image based on graph convolution, the method includes: acquiring a double-time-phase remote sensing image, labeling the double-time-phase remote sensing image to obtain a training sample, and constructing a remote sensing image change detection network, wherein the network comprises two feature extraction branches consisting of a residual error network and a boundary sensing fusion module which have the same structure and parameters, a graph convolution coding module, two graph convolution decoding modules which have the same structure, and an output network; the feature extraction branch is used for extracting image features of training samples by adopting a backbone network, boundary information extraction of multiple scales is carried out on shallow features of the backbone network through a boundary perception fusion module, a graph convolution coding module further refines feature information after double-temporal pictures are fused, and a graph convolution decoding module is used for fusing multi-level feature differences and predicting a change mask. Training the remote sensing image change detection network by adopting a training sample, and detecting the double-time-phase remote sensing image to be detected by adopting the trained remote sensing image change detection network to obtain a remote sensing image change detection result. By adopting the method, the change area of the picture can be accurately predicted.

In one embodiment, the training samples comprise a first time-phase remote sensing image training sample and a second time-phase remote sensing image training sample; the step 104 comprises the following specific steps:

step 200: and inputting the first time phase image training sample and the second time phase remote sensing image training sample into the two characteristic extraction branches respectively to obtain the fusion characteristic, the first time phase remote sensing image characteristic and the second time phase remote sensing image characteristic.

Step 202: and inputting the fusion features into a graph convolution coding module to obtain the double-time-phase image coding features.

Step 204: and inputting the first time phase image characteristic and the double time phase image coding characteristic into a first graph convolution decoding module to obtain the first time phase image decoding characteristic.

Step 206: and inputting the second time-phase image characteristic and the double time-phase image coding characteristic into a second image convolution decoding module to obtain a second time-phase image decoding characteristic.

Step 208: and inputting the first time phase image decoding characteristic and the second time phase image decoding characteristic into an output network to obtain a change detection prediction result.

Step 210: and carrying out reverse training on the remote sensing image change detection network according to the change detection prediction result and the label of the training sample to obtain the trained remote sensing image change detection network.

In one embodiment, the backbone network in the feature extraction branch is obtained by removing the classification header from the Resnet50 network; step 200 comprises: inputting a first time-phase remote sensing image training sample into a backbone network of a first feature extraction branch to obtain a first time-phase remote sensing image feature; inputting the first to third layers of characteristics of the backbone network into a boundary perception fusion module of a first characteristic extraction branch to obtain first time-phase remote sensing image boundary information; splicing the first time-phase remote sensing image characteristic and the first time-phase remote sensing image boundary information to obtain a first time-phase image characteristic; inputting a second time-phase remote sensing image training sample into a second characteristic extraction branch to obtain a second time-phase remote sensing image characteristic and a second time-phase image characteristic; and splicing the first time phase remote sensing image characteristic and the second time phase remote sensing image characteristic to obtain a fusion characteristic.

In one embodiment, inputting the features of the first to third layers of the backbone network into the boundary perception fusion module of the first feature extraction branch to obtain the boundary information of the first time-phase remote sensing image, including: inputting the features of the first layer to the third layer of the backbone network into a boundary perception fusion module of a first feature extraction branch, performing channel fusion on the feature of the second layer of the backbone network after performing up-sampling on the feature of the second layer of the backbone network and the feature of the first layer of the backbone network to obtain a first intermediate feature; performing channel fusion on the characteristics of the third layer of the backbone network and the characteristics of the second layer of the backbone network after up-sampling to obtain second intermediate characteristics; and performing channel fusion on the second intermediate features after the second intermediate features are up-sampled and the first intermediate features, and performing point convolution processing on a fusion result to obtain first time-phase remote sensing image boundary information. The structure of the boundary-aware fusion module is shown in fig. 3.

Specifically, most network models need to downsample an input image to meet the requirements of memory and speed, but some boundary information is lost in the downsampling process, and the problem that the extracted features are insufficient exists, so that the area is incomplete and the boundary is irregular. This situation occurs particularly in shallow networks, resulting in poor performance. The boundary information of the shallow layer is not ignored in the change detection field, and the method provides a boundary perception fusion module which extracts the boundary information based on the shallow layer network and fuses the characteristics through an auxiliary image convolution decoding module so as to enhance the capability of the model for learning the characteristics from the shallow layer network. The formula is as follows:

Y _e ＝τ(Y ₁₂₈ ，up(τ(Y ₆₄ ，up(Y ₃₂ )))) (2)

wherein, Y ₁₂₈ ，Y ₆₄ ，Y ₃₂ Respectively, the features of the first, second and third layers of resnet50, τ (-) representing the channel fusion operation, and up (-) representing the up-sampling doubling.

Y _e And inputting the image into a convolutional decoding module after channel fusion is carried out through the channel clipping and the characteristics extracted by the backbone network.

In one embodiment, as shown in fig. 2, the graph convolution encoding module includes: the spatial attention is tried to carry out convolution coding branch and the channel attention is tried to carry out convolution branch; wherein, the spatial attention diagram convolution coding branch comprises: the system comprises a step convolution layer, a space attention map convolution module, a nearest neighbor interpolation module and a point convolution layer; the channel attention diagram convolution branch comprises two linear transformation functions, a channel attention diagram convolution module and a point convolution layer; in the graph convolution coding module: processing the fused features by step convolution layer to obtain new features X _s (ii) a New feature X _s Inputting the space attention diagram convolution module to obtain a space diagram convolution characteristic M _s Comprises the following steps:

wherein mu (·),

Phi (-) is a linear transformation,

Convolving the space map with the feature M _s Multiplying the processed result by the fusion characteristic by adopting a nearest neighbor interpolation method, and then processing by adopting a point convolution layer to obtain the output of a spatial attention diagram convolution coding branch; processing the fused features by respectively adopting a first linear transformation and a second linear transformation, multiplying the obtained first linear transformation result and the obtained second linear transformation result to obtain a new feature X _f (ii) a New feature X _f Inputting the data into a channel convolution attention map convolution module to obtain a channel map convolution characteristic M _f Comprises the following steps:

M _f ＝(I+A _f )X _f W _f (4)

wherein I is an identity matrix, I + A _f Is a contiguous matrix, W _f A parameter matrix that is convolved for channel attention. Adjacency matrix A _f And W _f And (4) randomly initializing, and optimally learning through a gradient descent algorithm in the end-to-end network training process.

The second linear transformation result is integrated with the channel map convolution feature M _f After multiplication, inputting the multiplied values into a point convolution layer to obtain the output of a channel attention diagram convolution branch; and splicing the output of the spatial attention diagram convolutional coding branch and the output of the channel attention diagram convolutional coding branch to obtain the double-time-phase image coding characteristic.

Specifically, the upper branch of the graph convolution coding module is a spatial attention-seeking convolutional coding branch, and in the spatial attention-seeking convolutional coding branch: projecting input features into a new coordinate space Ω _S In the method, the down-sampling operation is adopted to convert the input characteristic X into a new characteristic X _s The down-sampling rate is 8. In the coordinate space omega _S In which three learnable linear transformations μ (-) are used,

Phi (-) to generate the constituent elements of the graph convolution. The specific formula is shown as formula (3).

Then, M is subjected to nearest neighbor interpolation algorithm interp (·) _s Projected back into the original coordinate space referenced to the input feature X.

The upper branch of the graph convolution coding module is a channel attention chart convolution coding branch, and in the channel attention chart convolution coding branch: feature space graph convolution models the correlation of channels along the channel dimension of the network, capturing the correlation between more abstract features in the image. And respectively reducing the number of characteristic channels of the input characteristic X by 8 times and 4 times through linear transformation alpha (-) and beta (-) to the input characteristic X. Passing the input features X through a projection function X _f ＝α(X)β(X) ^T Projecting into the feature space f to obtain a new feature X _f . According to the formula 1, a new feature X is obtained _f Is represented by the formula (4). Then new feature M is added _f Mapping back to the original feature space

The calculation result of the final refined features is as follows:

wherein, X' is the encoding characteristic of the double-time phase image, tau represents the channel fusion operation, conv represents the point convolution.

In one embodiment, as shown in fig. 4, the graph convolution decoding module includes: the spatial attention-seeking convolutional decoding branch and the channel attention-seeking convolutional decoding branch; the spatial attention diagram convolutional decoding branch comprises two step size convolutional layers, a spatial attention diagram convolutional decoding module and a point convolutional layer; the channel attention chart convolutional decoding branch comprises two linear transformations, a space attention chart convolutional decoding module and a point convolutional layer; in a first graph convolution decoding module: inputting the first time phase image characteristic and the double time phase image coding characteristic into the step size convolution layer respectively to obtain a new image characteristic Y _s ' and New coding feature X _s '; new image characteristic Y _s ' and New coding feature X _s ' input into the spatial attention map convolutional decoding module, obtain the spatial map convolution feature M _s ' is:

wherein mu (·),

Phi (-) is a linear transformation,

Convolving the space map with the feature M _s ' and New image feature Y _s After multiplication, obtaining the output of a spatial attention diagram convolution decoding branch circuit after point convolution layer; respectively carrying out linear transformation on the first time phase image characteristic and the double time phase image coding characteristic, and multiplying the obtained transformation results to obtain a characteristic X _ Y ^T (ii) a Will feature X _ Y ^T Inputting the signal into a spatial attention diagram convolution decoding module to obtain a channel diagram convolution characteristic M _f ' is:

M _f ′＝(I+A _f ′)(X_Y ^T )W _f ′ (8)

Convolving the channel map with a feature M _f ' and New image feature Y _s After multiplication, obtaining channel attention diagram convolution decoding branch circuit output after point convolution layer; and splicing the spatial attention diagram convolution decoding branch output and the channel attention diagram convolution decoding branch output to obtain a first time phase image decoding characteristic.

Specifically, the overall architecture of the graph convolution decoding module follows the design concept of the graph convolution encoding module, a first graph convolution decoding module is taken as an example to analyze a calculation formula of feature extraction, and features input into the first graph convolution decoding module include: first time phase image feature Y ₁ And a bi-temporal image coding feature.

The formula in the attention part of the coordinate space is as follows to

Using the fusion information of (2) as an adjacency matrix, and using M as an adjacency matrix _s ' projection back with input feature Y ₁ Is the original coordinate space of the reference.

Wherein, Y ₁ Is a first time phase image feature.

Two linear transformations α (-) and β (-) in the channel attention-seeking convolutional decoding branch reduce the input feature X' by 8 times and the feature Y, respectively _i 2 times of the number of the feature channels, and then the new feature M _f ' map back to input feature Y ₁ Is a reference original feature space.

X_Y ^t ＝α(X′)β(Y ₁ ) ^T (11)

M _f ′＝(I+A _f )(X_Y ^T )W _f (12)

Final decoded calculation:

the structure and principle of the second graph convolution decoding module are the same as those of the first graph convolution decoding module. The second time phase image characteristic Y ₂ And inputting the coding characteristics of the double time phase images into a second image convolution decoding module to obtain the decoding characteristics P of the second time phase images ² ；P ¹ 、P ² Representing the prediction results of a pair of pictures in a two-phase time.

In one embodiment, step 208 includes: inputting the first time phase image decoding characteristic and the second time phase image decoding characteristic into an output network, and obtaining a change detection prediction result as follows:

Pre＝σ(upsample(|P ¹ -P ² |)) (15)

where Pre is the prediction result of change detection, P ¹ ，P ² Respectively representing the first phase image decoding characteristic and the second phase image decoding characteristic, wherein σ is an activation function, preferably, the activation function σ is a Relu function, and upsample is an upsampling operation.

It should be understood that, although the steps in the flowchart of fig. 1 are shown in order as indicated by the arrows, the steps are not necessarily performed in order as indicated by the arrows. The steps are not limited to being performed in the exact order illustrated and, unless explicitly stated herein, may be performed in other orders. Moreover, at least a portion of the steps in fig. 1 may include multiple sub-steps or multiple stages that are not necessarily performed at the same time, but may be performed at different times, and the order of performance of the sub-steps or stages is not necessarily sequential, but may be performed in turn or alternately with other steps or at least a portion of the sub-steps or stages of other steps.

In one validation embodiment, the experimental data was model trained and tested using the LEVIR-CD open dataset. The LEVIR-CD dataset contains two time periods 637 with varying building markers for high resolution (0.5 m) Google Earth imagery 1024 × 1024 in size. In order to reduce the amount of calculation, the original data set image is uniformly cut into 256 × 256 sizes in the experiment. The experimental environment configuration is shown in table 1.

Table 1 experimental environment configuration

Experimental Environment	Description of the configuration
		Server GPU	NVDIA A10
Memory device	20GB
		Learning framework	torch 1.10.0、cuda11.2
Developing languages	Python 3.9.12
		Editing device	Pycharm

In order to embody the recognition effect of the BI-GCN model on the data set, a comparative experiment was performed using a SimUnet _ diff network model, a SimUnet _ conc network model, and a BIT network model. Model precision (precision), recall (recall), F1 Score (F1-Score), cross-over ratio (IoU), and Overall Accuracy (OA) were evaluated from five indices, respectively

Table 2 comparison of results for each index

Network	Precision ratio	Recall rate	Fraction of F1	Cross ratio of	Overall rate of accuracy
						SiamUnet_diff	90.69％	83.25％	86.82％	76.71％	98.71％
SiamUnet_conc	90.30％	84.59％	87.35％	77.54％	98.75％
						BIT	91.19％	87.36％	89.23％	80.56％	98.93％
BI-GCN	92.91％	87.77％	90.27％	82.27％	99.04％

In one embodiment, as shown in fig. 5, there is provided a remote sensing image change detection apparatus based on graph convolution, including: the remote sensing image change detection system comprises a training sample acquisition module, a remote sensing image change detection network construction module based on graph convolution, a remote sensing image change detection network training module based on graph convolution and a remote sensing image change detection module, wherein:

The remote sensing image change detection network construction module based on graph convolution is used for constructing a remote sensing image change detection network based on graph convolution; the remote sensing image change detection network comprises two feature extraction branches consisting of backbone networks with the same structure and parameters and a boundary perception fusion module, a graph convolution coding module, two graph convolution decoding modules with the same structure and an output network; the characteristic extraction branch is used for extracting the characteristics of the training sample through a backbone network, splicing the obtained image characteristics to obtain fusion characteristics, extracting boundary information of multiple scales through a shallow backbone network by adopting a boundary perception fusion module, and splicing the image characteristics and the boundary information to obtain single-time-phase image characteristics; the graph convolution coding module is used for coding the fusion characteristics by adopting graph convolution to obtain double-time-phase image coding characteristics; the graph convolution decoding module is used for decoding the single-time phase image characteristics and the double-time phase image coding characteristics by adopting graph convolution to obtain image decoding characteristics; the output network is used for carrying out up-sampling operation according to the difference characteristic diagram of the image decoding characteristics output by the two image convolution decoding modules to obtain a change detection prediction result.

In one embodiment, the training samples comprise a first time-phase remote sensing image training sample and a second time-phase remote sensing image training sample; the remote sensing image change detection network training module based on graph convolution is further used for inputting the first time-phase remote sensing image training sample and the second time-phase remote sensing image training sample into the two characteristic extraction branches respectively to obtain a fusion characteristic, a first time-phase remote sensing image characteristic and a second time-phase remote sensing image characteristic; inputting the fusion features into a graph convolution coding module to obtain double-temporal image coding features; inputting the first time phase image characteristic and the double time phase image coding characteristic into a first graph convolution decoding module to obtain a first time phase image decoding characteristic; inputting the second time-phase image characteristic and the double time-phase image coding characteristic into a second image convolution decoding module to obtain a second time-phase image decoding characteristic; inputting the first time phase image decoding characteristic and the second time phase image decoding characteristic into an output network to obtain a change detection prediction result; and carrying out reverse training on the remote sensing image change detection network according to the change detection prediction result and the label of the training sample to obtain the trained remote sensing image change detection network.

In one embodiment, the backbone network in the feature extraction branch is obtained by removing the classification header from the Resnet50 network; the remote sensing image change detection network training module based on graph convolution is also used for inputting a first time-phase remote sensing image training sample into a backbone network of the first characteristic extraction branch to obtain a first time-phase remote sensing image characteristic; inputting the first to third layers of characteristics of the backbone network into a boundary perception fusion module of a first characteristic extraction branch to obtain first time-phase remote sensing image boundary information; splicing the first time-phase remote sensing image characteristic and the first time-phase remote sensing image boundary information to obtain a first time-phase image characteristic; inputting a second time-phase remote sensing image training sample into a second characteristic extraction branch to obtain a second time-phase remote sensing image characteristic and a second time-phase image characteristic; and splicing the first time phase remote sensing image characteristic and the second time phase remote sensing image characteristic to obtain a fusion characteristic.

In one embodiment, the remote sensing image change detection network training module based on graph convolution is further used for inputting the features of the first layer to the third layer of the backbone network into the boundary perception fusion module of the first feature extraction branch, performing channel fusion on the features of the second layer of the backbone network after performing up-sampling on the features of the second layer of the backbone network and the features of the first layer of the backbone network, and obtaining a first intermediate feature; performing channel fusion on the characteristics of the third layer of the backbone network and the characteristics of the second layer of the backbone network after performing up-sampling on the characteristics of the third layer of the backbone network to obtain second intermediate characteristics; and performing channel fusion on the second intermediate features after the second intermediate features are up-sampled and the first intermediate features, and performing point convolution processing on a fusion result to obtain first time-phase remote sensing image boundary information.

In one embodiment, the graph convolution encoding module includes: the spatial attention is tried to carry out convolution coding branch and the channel attention is tried to carry out convolution branch; wherein, the spatial attention diagram convolution coding branch comprises: the system comprises a step convolution layer, a space attention map convolution module, a nearest neighbor interpolation module and a point convolution layer; the channel attention diagram convolution branch comprises two linear transformation functions, a channel attention diagram convolution module and a point convolution layer; the remote sensing image change detection network training module based on graph convolution is also used for: processing the fused features by step convolution layer to obtain new features X _s (ii) a New feature X _s Inputting the space attention diagram convolution module to obtain a space diagram convolution characteristic M _s Convolution feature M of space diagram _s Is represented by the formula (3).

Convolving the space map with the feature M _s Multiplying the processed result by the fusion characteristic by adopting a nearest neighbor interpolation method, and then processing by adopting a point convolution layer to obtain the output of a spatial attention map convolution coding branch; processing the fused features by respectively adopting a first linear transformation and a second linear transformation, multiplying the obtained first linear transformation result and the obtained second linear transformation result to obtain a new feature X _f (ii) a New feature X _f Inputting the data into a channel convolution attention map convolution module to obtain a channel map convolution characteristic M _f Convolution feature M of channel map _f Is represented by the formula (4).

The second linear transformation result is integrated with the channel map convolution feature M _f After multiplication, inputting the multiplied values into a point convolution layer to obtain the output of a channel attention diagram convolution branch; the outputs of the spatial attention-seeking convolutional encoding branches and the outputs of the channel attention-seeking convolutional encoding branchesAnd after splicing, obtaining the coding characteristics of the double time phase images.

In one embodiment, the graph convolution decoding module comprises: the spatial attention-seeking convolutional decoding branch and the channel attention-seeking convolutional decoding branch; the spatial attention diagram convolutional decoding branch comprises two step size convolutional layers, a spatial attention diagram convolutional decoding module and a point convolutional layer; the channel attention diagram convolution decoding branch comprises two linear transformations, a space attention diagram convolution decoding module and a point convolution layer; the remote sensing image change detection network training module based on graph convolution is also used for the first graph convolution decoding module: inputting the first time phase image characteristic and the double time phase image coding characteristic into the step size convolution layer respectively to obtain a new image characteristic Y _s ' and New coding feature X _s '; new image characteristic Y _s ' and New coding feature X _s ' input into the spatial attention map convolutional decoding module, obtain the spatial map convolution feature M _s ', space diagram convolution feature M _s The expression of' is shown in formula (7).

Convolving the space map with the feature M _s ' and New image feature Y _s After multiplication, obtaining the output of a spatial attention diagram convolution decoding branch circuit after point convolution layer; respectively carrying out linear transformation on the first time phase image characteristic and the double time phase image coding characteristic, and multiplying the obtained transformation results to obtain a characteristic X _ Y ^T (ii) a Will be characteristic X _ Y ^T Inputting the signal into a spatial attention diagram convolution decoding module to obtain a channel diagram convolution characteristic M _f ', channel map convolution feature M _f The expression of' is shown in formula (8).

Convolving the channel map with a feature M _f ' and New image feature Y _s After multiplication, obtaining channel attention diagram convolution decoding branch circuit output after point convolution layer; and splicing the spatial attention chart convolutional decoding branch output and the channel attention chart convolutional decoding branch output to obtain a first time phase image decoding characteristic.

In one embodiment, the remote sensing image change detection network training module based on graph convolution is further configured to input the first time-phase image decoding feature and the second time-phase image decoding feature into an output network to obtain a change detection prediction result, where an expression of the change detection prediction result is shown in formula (15).

For specific limitations of the remote sensing image change detection device based on graph convolution, reference may be made to the above limitations of the remote sensing image change detection method based on graph convolution, and details thereof are not repeated here. All or part of each module in the remote sensing image change detection device based on graph convolution can be realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operating system and the computer program to run on the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a remote sensing image change detection method based on graph convolution. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.

Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.

In an embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the above method embodiments when executing the computer program.

The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A remote sensing image change detection method based on graph convolution is characterized by comprising the following steps:

acquiring a double-time-phase remote sensing image, and labeling the double-time-phase remote sensing image to obtain a training sample;

constructing a remote sensing image change detection network based on graph convolution; the remote sensing image change detection network comprises two feature extraction branches consisting of backbone networks with the same structure and parameters and a boundary perception fusion module, a graph convolution coding module, two graph convolution decoding modules with the same structure and an output network; the characteristic extraction branch is used for extracting the characteristics of a training sample through a backbone network, splicing the obtained image characteristics to obtain fusion characteristics, extracting boundary information of multiple scales through a shallow backbone network by adopting a boundary perception fusion module, and splicing the image characteristics and the boundary information to obtain single-time-phase image characteristics; the graph convolution coding module is used for coding the fusion features by adopting graph convolution to obtain double-time-phase image coding features; the graph convolution decoding module is used for decoding the single-time-phase image characteristics and the double-time-phase image coding characteristics by adopting graph convolution to obtain image decoding characteristics; the output network is used for performing up-sampling operation according to the difference characteristic diagram of the image decoding characteristics output by the two image convolution decoding modules to obtain a change detection prediction result;

training the remote sensing image change detection network by using the label of the training sample and the change detection prediction result obtained by inputting the training sample into the remote sensing image change detection network to obtain the trained remote sensing image change detection network;

2. The method according to claim 1, wherein the training samples comprise a first time-phase remote sensing image training sample and a second time-phase remote sensing image training sample;

training the remote sensing image change detection network by using the label of the training sample and the change detection prediction result obtained by inputting the training sample into the remote sensing image change detection network to obtain the trained remote sensing image change detection network, comprising the following steps:

inputting the first time-phase remote sensing image training sample and the second time-phase remote sensing image training sample into two feature extraction branches respectively to obtain a fusion feature, a first time-phase remote sensing image feature and a second time-phase remote sensing image feature;

inputting the fusion features into the image convolution coding module to obtain double-time-phase image coding features;

inputting the first time-phase remote sensing image characteristic and the double time-phase image coding characteristic into a first graph convolution decoding module to obtain a first time-phase image decoding characteristic;

inputting the second time-phase image characteristic and the double time-phase image coding characteristic into a second graph convolution decoding module to obtain a second time-phase image decoding characteristic;

inputting the first time phase image decoding characteristic and the second time phase image decoding characteristic into an output network to obtain a change detection prediction result;

3. The method of claim 2, wherein the backbone network in the feature extraction leg is obtained by removing a classification header from a Resnet50 network;

inputting the first time-phase remote sensing image training sample into a backbone network of a first feature extraction branch to obtain a first time-phase remote sensing image feature;

inputting the first to third layers of features of the backbone network into a boundary perception fusion module of a first feature extraction branch to obtain first time-phase remote sensing image boundary information;

splicing the first time-phase remote sensing image characteristic and the first time-phase remote sensing image boundary information to obtain a first time-phase image characteristic;

inputting the second time-phase remote sensing image training sample into a second characteristic extraction branch to obtain a second time-phase remote sensing image characteristic and a second time-phase image characteristic;

4. The method according to claim 3, wherein inputting the features of the first to third layers of the backbone network into a boundary perception fusion module of a first feature extraction branch to obtain the boundary information of the first time-phase remote sensing image comprises:

inputting the features of the first layer to the third layer of the backbone network into a boundary perception fusion module of a first feature extraction branch, performing channel fusion on the features of the second layer of the backbone network after performing up-sampling on the features of the second layer of the backbone network and the features of the first layer of the backbone network to obtain a first intermediate feature;

performing channel fusion on the characteristics of the third layer of the backbone network and the characteristics of the second layer of the backbone network after up-sampling to obtain second intermediate characteristics;

5. The method of claim 2, wherein the graph convolution encoding module comprises: the spatial attention is tried to convolution coding branch and the channel attention is tried to convolution branch; wherein the spatial attention map convolutional encoding branch comprises: the system comprises a step convolution layer, a space attention map convolution module, a nearest neighbor interpolation module and a point convolution layer; the channel attention diagram convolution branch comprises two linear transformation functions, a channel attention diagram convolution module and a point convolution layer;

the graph convolution encoding module is characterized in that:

processing the fused features by step size convolution layer to obtain new features X _s ；

New feature X _s Inputting the space attention map convolution module to obtain a space map convolution feature M _s ：

Wherein mu (·),

Phi (-) is a linear transformation,

represents a contiguous matrix, W _s A parameter matrix for convolution for spatial attention;

convolving the space map with a feature M _s Multiplying the processed result by the fusion characteristic by adopting a nearest neighbor interpolation method, and then processing by adopting a point convolution layer to obtain the output of a spatial attention map convolution coding branch;

processing the fused features by respectively adopting a first linear transformation and a second linear transformation, multiplying the obtained first linear transformation result and the second linear transformation result to obtain a new feature X _f ；

M _f ＝(I+A _f )X _f W _f

wherein I is an identity matrix, I + A _f Is a contiguous matrix, W _f A parameter matrix for convolution for channel attention;

the second linear transformation result is integrated with the channel map by the feature M _f After multiplication, inputting the multiplied values into a point convolution layer to obtain the output of a channel attention diagram convolution branch;

6. The method of claim 2, wherein the graph convolution decoding module comprises: the spatial attention-seeking convolutional decoding branch and the channel attention-seeking convolutional decoding branch; the spatial attention diagram convolutional decoding branch comprises two step size convolutional layers, a spatial attention diagram convolutional decoding module and a point convolutional layer; the channel attention diagram convolutional decoding branch comprises two linear transformations, a space attention diagram convolutional decoding module and a point convolutional layer;

in a first graph convolution decoding module:

inputting the first time phase image characteristic and the double time phase image coding characteristic into a step size convolution layer respectively to obtain a new image characteristic Y _s ' and New coding feature X _s ′；

New image characteristic Y _s ' and New coding feature X _s ' inputting into a spatial attention diagram convolution decoding module to obtain a spatial graph convolution characteristic M _s ' is:

wherein mu (·),

Phi (-) is a linear transformation,

represents an adjacency matrix, W _s ' convolving a parameter matrix of a decoding module for spatial attention;

convolving the space map with a feature M _s ' and the new image feature Y _s After multiplication, obtaining the output of a spatial attention diagram convolution decoding branch circuit after point convolution layer;

respectively carrying out linear transformation on the first time phase image characteristic and the double time phase image coding characteristic, and multiplying obtained transformation results to obtain a characteristic X _ Y ^T ；

The characteristic X _ Y ^T Inputting the obtained data into the spatial attention diagram convolution decoding module to obtain a channel diagram convolution characteristic M _f ' is:

M _f ′＝(I+A _f ′)(X_Y ^T )W _f ′

wherein I is an identity matrix, I + A _f Is a contiguous matrix, W _f ' convolving a parameter matrix of a decoding module for channel attention;

convolving the channel map with a feature M _f ' and the new image feature Y _s After multiplication, obtaining channel attention diagram convolution decoding branch circuit output after point convolution layer;

7. The method of claim 5 or 6, wherein the linear transformation is implemented by a 1 x 1 convolutional layer.

8. The method of claim 2, wherein inputting the first and second phase image decoding features into an output network to obtain a change detection prediction result comprises:

Pre＝σ(upsample(|P ¹ -P ² |))

9. A remote sensing image change detection device based on graph convolution is characterized by comprising:

the training sample acquisition module is used for acquiring the double-time-phase remote sensing image and marking the double-time-phase remote sensing image to obtain a training sample;

the remote sensing image change detection network construction module based on graph convolution is used for constructing a remote sensing image change detection network based on graph convolution; the remote sensing image change detection network comprises two feature extraction branches consisting of a backbone network and a boundary perception fusion module which have the same structure and parameters, a picture convolution coding module, two picture convolution decoding modules with the same structure and an output network; the characteristic extraction branch is used for extracting the characteristics of a training sample through a backbone network, splicing the obtained image characteristics to obtain fusion characteristics, extracting boundary information of multiple scales through a shallow backbone network by adopting a boundary perception fusion module, and splicing the image characteristics and the boundary information to obtain single-time-phase image characteristics; the graph convolution coding module is used for coding the fusion features by adopting graph convolution to obtain double-time-phase image coding features; the graph convolution decoding module is used for decoding the single-time phase image characteristics and the double-time phase image coding characteristics by adopting graph convolution to obtain image decoding characteristics; the output network is used for carrying out up-sampling operation according to the difference characteristic diagram of the image decoding characteristics output by the two image convolution decoding modules to obtain a change detection prediction result;

the remote sensing image change detection network training module is used for training the remote sensing image change detection network by utilizing the labels of the training samples and the change detection prediction result obtained by inputting the training samples into the remote sensing image change detection network to obtain the trained remote sensing image change detection network;

10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the method of any one of claims 1 to 8 when executing the computer program.