CN113609896B - Object-level remote sensing change detection method and system based on dual-related attention


Info

Publication number
CN113609896B
CN113609896B (application CN202110692812.7A)
Authority
CN
China
Prior art keywords
features
dual
feature
attention
change detection
Prior art date
Legal status
Active
Application number
CN202110692812.7A
Other languages
Chinese (zh)
Other versions
CN113609896A (en
Inventor
胡翔云
张琳
张觅
Current Assignee
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202110692812.7A priority Critical patent/CN113609896B/en
Publication of CN113609896A publication Critical patent/CN113609896A/en
Application granted granted Critical
Publication of CN113609896B publication Critical patent/CN113609896B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods


Abstract

The invention provides an object-level remote sensing change detection method and system based on dual-related attention. Data enhancement dedicated to change detection is performed to generate a double input stream; a backbone network with shared weights is set up to receive the double input stream and extract features of different scales from the double-phase images; a dual-related-attention-guided feature fusion neck focuses on the spatial-level and channel-level correlation of same-scale dual-temporal features to obtain refined difference features, and a refinement path aggregation pyramid module fuses features among layers of different scales; finally, the difference features of different scales are sent into a change detection head, which predicts the position, size and change confidence of the changed ground features in the form of bounding boxes. The data enhancement method dedicated to change detection accelerates model training and improves model performance, and the guidance of the dual-related attention mechanism effectively resists pseudo-change interference in image pairs, giving higher accuracy and robustness.

Description

Object-level remote sensing change detection method and system based on dual-related attention
Technical Field
The invention belongs to the field of automatic change detection of remote sensing images, and particularly relates to an object-level remote sensing change detection method and system based on dual-related attention.
Background
Change detection is the process of identifying differences by observing and comparing the states of objects or phenomena at different times. More precisely, the purpose of change detection is to find the change information of a semantic category of particular interest while filtering out irrelevant change information. It has long been one of the most important problems in the remote sensing field. Change detection is now widely used in applications such as urban planning, land resource management, environmental monitoring, agricultural investigation and disaster assessment, and has great research value.
Current change detection methods can be classified, according to the granularity of the basic analysis unit, into pixel-level, object-level and scene-level methods. Pixel-level methods typically extract changing ground features and classify each pixel as changed or unchanged at the input resolution, outputting a pixel-level binary mask prediction. Object-level change detection generally acquires objects by segmentation or detection, determines whether each object has changed by extracting and comparing its multi-temporal features, and outputs object-level predictions such as binary masks of object instances or bounding boxes of objects. Scene-level methods analyze, at the semantic level, whether the category of a scene has changed at different times and what the change is; they classify images or image slices and provide image-level label predictions. Since change detection usually requires determining changes of a particular class of regions or objects, pixel-level and object-level methods are used more often.
For pixel-level change detection, representative conventional methods include Principal Component Analysis (PCA) and Change Vector Analysis (CVA). Given the poor adaptability of manually designed features to high-resolution images and the superiority of deep learning in feature extraction, more and more change detection studies have introduced deep learning methods. These models have large receptive fields and perform far better than traditional methods, but they still operate at the pixel level. Pixel-level methods always carry an implicit constraint: high-precision co-registration between data of different phases, i.e., strict alignment of the earlier and later images. However, shape changes due to viewing-angle differences and the projected shadows cast by high-rise buildings tend to create false change regions. Furthermore, recent studies have shown that modern deep neural networks are not strictly translation-invariant. Therefore, under supervision in the "pixel label" mode, pixel-level methods that rely on pixel correspondence have difficulty avoiding such pseudo-changes.
Conventional object-based change detection methods use an "object" as the analysis unit and can reduce misjudgment of pseudo-change regions to some extent. Notably, an "object" in an object-based approach is typically a local cluster of pixels obtained by segmenting the image using spectral and texture features, geometric features (e.g., shape and area) and other information. However, owing to the limitations of conventional hand-crafted feature extraction, these segmented regions often depend on thresholds, making the "objects" prone to over-segmentation and boundary fragmentation; they lack semantic integrity and do not truly correspond to actual geographic entities.
Disclosure of Invention
In order to overcome the technical problems, the invention provides an object-level remote sensing change detection scheme based on dual-related attention by utilizing a deep learning technology, which is used for detecting changed geographic entities (such as newly added buildings and changed artificial ground objects). Compared with the pixel-level method and the traditional object-based method, the method disclosed by the invention focuses more on the overall information and context connection of the changed geographic entity, and can effectively resist pseudo-change interference in the image.
The invention provides an object-level remote sensing change detection method based on dual-related attention, which comprises the following steps,
step 1, data enhancement for change detection is carried out to generate a double input stream, the implementation mode comprises that random time sequence exchange and combination are carried out on four pairs of images to obtain a pair of synthesized images, enhancement operation is carried out on one of the synthesized images, the enhancement operation comprises random geometric transformation, brightness transformation, gaussian noise addition and color transformation, and only corresponding geometric transformation is carried out on the other synthesized image;
step 2, setting a backbone network with shared weight for receiving the double input streams and extracting different scale characteristics of the double-phase images;
step 3, setting a feature fusion neck for dual-related attention guidance on the basis of a backbone network, wherein a dual-related attention module is arranged in the feature fusion neck for guiding the network to focus on the correlation of the spatial hierarchy and the channel hierarchy of dual-temporal features with the same scale so as to acquire refined difference features, and a refinement path aggregation pyramid module is arranged for fusing features among layers with different scales;
and 4, finally, sending the difference features with different scales into a change detection head, and predicting the position, the size and the change confidence of the change feature in a boundary frame mode.
In addition, in the backbone network sharing the weight, feature reuse is realized through a layered feature fusion strategy, and excessively repeated gradient information is truncated and used for extracting parallel features of different scales of the double-phase images.
Moreover, a twin network is constructed from the CSPDarkNet-53 network as the weight-sharing backbone network, and features of five scales are extracted, denoted as layers C1-C5 sequentially from the top layer to the bottom layer.
And the feature fusion neck of the dual collaborative attention guide fuses the features of the C3-C5 layers of the twin CSPDarknet-53 network by adopting three dual relevant attention modules to generate P3-P5 layer features, and then fuses the difference features among different scale layers by adopting a refinement path aggregation pyramid module.
Moreover, the dual-correlation attention module comprises a spatial correlation attention module, a channel correlation attention module and a change difference module; the parallel features F_a and F_b from the same scale layer are concatenated to obtain the feature F_ab, and the difference feature F'_P guided by spatial correlation attention, the difference feature F'_C guided by channel correlation attention and the feature F_ab are further fused to obtain the final change difference feature F.
The change detection heads are divided into three layers, namely detection head-S, detection head-M and detection head-L, respectively correspond to N3-N5 layer characteristics from the refinement path aggregation pyramid module, learn the change confidence and coordinates of the change object in parallel according to the supervision information, and learn the regression of the bounding box of the change object by using a priori frame; training the whole network model according to the training sample labels by using a multi-task loss function; and inputting the double-phase time-phase images to be detected into a trained network to obtain an accurate change detection result.
On the other hand, the invention also provides an object-level remote sensing change detection system based on the dual-related attention, which is used for realizing the object-level remote sensing change detection method based on the dual-related attention.
Furthermore, the device comprises the following modules,
the first module is used for carrying out data enhancement for change detection to generate a double input stream, the implementation mode comprises that random time sequence exchange and combination are carried out on four pairs of images to obtain a pair of synthesized images, enhancement operation is carried out on one of the synthesized images, the enhancement operation comprises random geometric transformation, brightness transformation, gaussian noise addition and color transformation, and only corresponding geometric transformation is carried out on the other synthesized image;
the second module is used for setting a backbone network with shared weight and receiving double input streams and extracting different scale characteristics of the double-phase images;
and the third module is used for setting a feature fusion neck for dual-related attention guidance on the basis of a backbone network, wherein the feature fusion neck is internally provided with a dual-related attention module for guiding the network to focus on the correlation of the dual-time phase features with the same scale in a space level and a channel level so as to acquire refined difference features, and a refinement path aggregation pyramid module is arranged for fusing features among layers with different scales.
And a fourth module, which is used for sending the difference features with different scales to the change detection head, and predicting the position, the size and the change confidence of the change feature in the form of a boundary frame.
Alternatively, the system comprises a processor and a memory, wherein the memory is used for storing program instructions, and the processor is used for calling the stored instructions in the memory to execute the object-level remote sensing change detection method based on dual-related attention.
Or comprises a readable storage medium having stored thereon a computer program which, when executed, implements an object-level remote sensing change detection method based on dual-related attention as described above.
Compared with the prior art, the invention has the following three advantages:
1) The data enhancement mode special for change detection can accelerate model training and improve model performance.
The data enhancement method provided by the invention randomly exchanges the earlier and later images online, which helps the network pay more attention to the nature of the change rather than overfitting to a particular input order. Combining four pairs of dual-phase images into one composite image pair implicitly increases the batch size, enriches the change scenes, and helps the model converge more stably and faster. Scaling the composite image pair implicitly increases the proportion of small changed targets, guiding the network to attend to change detection of small targets and effectively improving model performance.
2) Feature correlation is used to capture the difference features and speed up model convergence.
The dual correlation attention module designed by the invention establishes correlation attention of parallel features from channels and spatial layers, guides the network to further refine the features related to the change from the channels and spatial layers of the features, suppresses the unrelated features, and is beneficial to rapid convergence of the network.
3) The method has the advantages of being excellent in robustness and good in change detection effect, and can effectively resist pseudo-change interference.
According to the invention, through network structure design and loss function design, the constructed end-to-end change detection network is more focused on the integral characteristics and context association of the change object, and pseudo-change interference caused by visual angle change and projection difference is avoided. The whole training process does not need to design artificial feature guidance, and the network can self-adaptively learn the required features, so that the method has better generalization. Under complex changing scenes, the method also has better stable performance.
Drawings
FIG. 1 is a flow chart of a data enhancement method dedicated to change detection according to an embodiment of the present invention.
Fig. 2 is a general framework diagram of a dual-related attention-directed change detection network in accordance with an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a dual-related attention module according to an embodiment of the present invention, where fig. 3 (a) is a general architecture of the dual-related attention module, fig. 3 (b) is a specific architecture of the spatial-related attention module, fig. 3 (c) is a specific architecture of the channel-related attention module, and fig. 3 (d) is a specific architecture of the variation difference module.
Detailed Description
The technical scheme of the invention is specifically described below with reference to the accompanying drawings and examples.
In the data layer, the framework designs a data enhancement method special for change detection, which can effectively accelerate the training speed of the model and improve the performance of the model. At the model level, the framework constructs a dual-related attention-guided change detection network, and can effectively extract the integral characteristics and the context association of a change object. The framework ultimately represents the detected changing geographic entity (e.g., newly added building, artificial structure, etc.) in the form of a bounding box.
The embodiment of the invention provides an object-level remote sensing change detection method based on dual-related attention, which comprises a data enhancement process (shown in figure 1) special for change detection and a dual-related attention-guided change detection network (shown in figure 2). The dual-related-attention-guided change detection network sequentially comprises a backbone network sharing weights, a dual-collaborative-attention-guided feature fusion neck and a change detection head, wherein the dual-collaborative-attention-guided feature fusion neck comprises a plurality of dual-related attention modules and a refinement path aggregation pyramid module corresponding to the backbone network. In addition, fig. 3 illustrates architectural details of dual attention modules in a network. The method specifically comprises the following steps:
step 1: the data enhancement mode special for change detection performs random combination and transformation on the four pairs of images to generate a double-input stream.
The embodiment adopts a data enhancement mode special for change detection to perform random time sequence exchange and combination on four pairs of images, and performs random geometric transformation, color transformation and other operations to generate a double input stream.
Comprises the following substeps:
step 1.1, randomly selecting four pairs of double-phase images of different areas from a training set, and randomly exchanging front-phase images and rear-phase images.
And 1.2, splicing and combining the four pairs of processed images into a pair of synthetic image pairs in random order.
For example, the T0 and T1 phase images of regions 1-4 are randomly exchanged, after which the region 1 T1 image, region 2 T0 image, region 3 T1 image and region 4 T0 image are combined into composite image 1, and the region 1 T0 image, region 2 T1 image, region 3 T0 image and region 4 T1 image are combined into composite image 2.
Step 1.3, for one of the synthesized images, the embodiment performs the following enhancement functions:
(1) Geometric transformation: random transformation, including clipping, scaling, translation, flipping, and rotation;
(2) Brightness conversion;
(3) Increasing Gaussian noise;
(4) And (5) color transformation.
To ensure the versatility and robustness of the network, the embodiment applies only the same geometric transformations to the other composite image. Through the above steps, a dual input image stream is generated from the composite images.
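For illustration, a minimal sketch of this data enhancement in Python/NumPy follows. The function name, the 2×2 tiling, the choice of a horizontal flip as the shared geometric transformation, and the brightness/noise parameters are assumptions of the sketch rather than values taken from the patent; handling of the change-object labels (which must follow the same geometric operations) is omitted.

```python
import random
import numpy as np

def change_detection_augment(pairs):
    """Sketch: build one composite dual-input pair from four bi-temporal pairs.

    pairs: four (img_t0, img_t1) tuples of equally sized HxWx3 uint8 arrays.
    Returns (comp_a, comp_b); only comp_a receives the photometric operations.
    """
    assert len(pairs) == 4
    # 1) random temporal exchange within each pair
    swapped = [(t1, t0) if random.random() < 0.5 else (t0, t1) for t0, t1 in pairs]
    # 2) mosaic the four regions into a 2x2 composite, in a random order
    order = random.sample(range(4), 4)
    tiles_a = [swapped[i][0] for i in order]
    tiles_b = [swapped[i][1] for i in order]
    comp_a = np.concatenate([np.concatenate(tiles_a[:2], axis=1),
                             np.concatenate(tiles_a[2:], axis=1)], axis=0)
    comp_b = np.concatenate([np.concatenate(tiles_b[:2], axis=1),
                             np.concatenate(tiles_b[2:], axis=1)], axis=0)
    # 3) the same geometric transformation is applied to BOTH composites
    if random.random() < 0.5:
        comp_a, comp_b = comp_a[:, ::-1], comp_b[:, ::-1]
    # 4) photometric enhancement (brightness, Gaussian noise) on ONE composite only
    comp_a = comp_a.astype(np.float32) * random.uniform(0.8, 1.2)
    comp_a += np.random.normal(0.0, 5.0, comp_a.shape)
    comp_a = np.clip(comp_a, 0, 255).astype(np.uint8)
    return comp_a, comp_b
```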
Step 2: and setting a backbone network with shared weight to receive the double input stream and extract different scale characteristics of the double-phase image.
The embodiment uses a CSPDarkNet-53 network to construct a twin network as a backbone network with shared weight to receive double input streams, realizes feature reuse through a hierarchical feature fusion strategy, and cuts off excessively repeated gradient information for extracting parallel features of different scales of a double-phase image (the C1-C5 layers from top layer to bottom layer in FIG. 2 correspond to five corresponding scale features).
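As a minimal PyTorch sketch, weight sharing between the two input streams simply means running both images through the same backbone instance. The CSPDarkNet-53 backbone itself is assumed to exist elsewhere (it is not part of torchvision), so a dummy multi-scale backbone is used here only to make the example runnable; all class names are assumptions of the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseBackbone(nn.Module):
    """Apply one backbone instance to both temporal images so the weights are shared."""

    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone  # e.g. a CSPDarknet-53 returning the C1-C5 features

    def forward(self, img_a, img_b):
        return self.backbone(img_a), self.backbone(img_b)  # same weights, two streams

class DummyMultiScaleBackbone(nn.Module):
    """Stand-in that returns five progressively downsampled feature maps."""

    def forward(self, x):
        feats = []
        for _ in range(5):
            x = F.avg_pool2d(x, 2)
            feats.append(x)
        return feats

# usage sketch
net = SiameseBackbone(DummyMultiScaleBackbone())
c_a, c_b = net(torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256))
```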
Step 3: the feature fusion neck of the dual collaborative attention guide fuses the features of the C3-C5 layers of the twin CSPDarknet-53 network by adopting three dual relevant attention modules to generate P3-P5 layer features, and then fuses the difference features among different scale layers by adopting a refinement path aggregation pyramid module. Comprises the following substeps:
step 3.1, the dual-correlation attention module acquires refined difference features by using the feature correlation of the spatial hierarchy and the channel hierarchy on the parallel features of the same scale layer, wherein the refined difference features comprise the spatial correlation attention module, the channel correlation attention module and the variation difference module, and specific details are shown in fig. 3 and comprise the following substeps:
step 3.1.1 parallel features F from the Co-scale layer a B are fed into a spatially dependent attention module. First calculate their spatial correlation map C P As shown in fig. 3 (b), the calculation is as follows:
wherein ,is through F a Results obtained from the transpose and deformation operations, F b Is made up of F b The results obtained by the deformation operation->Representing a matrix multiplication. According to the space correlation diagram C P For F a And F is equal to b The spatial local features related to the change in (C) are enhanced, the spatial local features unrelated to the change are suppressed, and the spatial correlation diagram C is used for solving the problem that the spatial local features are not related to the change P Normalized to softmax of (c) and normalized to deformed F a Matrix multiplication may yield enhanced features F Pa From transposed spatial correlation diagram C P Normalized to softmax of (c) and normalized to deformed F b Matrix multiplication may yield enhanced features F Pb . Finally, fuse feature F Pa and FPb To generate spatially dependent attention weights W P The fusion process includes using the feature F Pa and FPb Make a connection (symbol C in FIG. 3 (b)), sigmoid transform to obtain feature F Pab And compressing the channel number from 2C to obtain W' p through 1*1 convolution and 3*3 convolution, and finally performing sigmoid conversion.
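A minimal PyTorch sketch of such a spatial correlation attention block is given below; the class name, the exact convolution settings (1×1 then 3×3, 2C→C) and the softmax placement follow the reading above and are assumptions of the sketch, not code from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialCorrelationAttention(nn.Module):
    """Compute the spatial correlation attention weight W_P from F_a and F_b."""

    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, f_a, f_b):
        n, c, h, w = f_a.shape
        a = f_a.view(n, c, h * w)                     # reshape F_a -> N x C x HW
        b = f_b.view(n, c, h * w)                     # reshape F_b -> N x C x HW
        c_p = torch.bmm(a.transpose(1, 2), b)         # spatial correlation map, N x HW x HW
        f_pa = torch.bmm(a, F.softmax(c_p, dim=-1))                   # enhanced F_Pa
        f_pb = torch.bmm(b, F.softmax(c_p.transpose(1, 2), dim=-1))   # enhanced F_Pb
        f_pa = f_pa.view(n, c, h, w)
        f_pb = f_pb.view(n, c, h, w)
        w_p = torch.sigmoid(self.fuse(torch.cat([f_pa, f_pb], dim=1)))  # W_P in (0, 1)
        return w_p
```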
Step 3.1.2, the parallel features F_a and F_b from the same scale layer are fed into the channel correlation attention module. Their channel correlation map C_C is first computed, as shown in Fig. 3(c):
C_C = F'_a ⊗ F̃_b
where F̃_b is the result obtained from F_b through transpose and reshape operations, F'_a is the result obtained from F_a through a reshape operation, and ⊗ denotes matrix multiplication. According to the channel correlation map C_C, the channels in F_a and F_b that are related to the change are enhanced and the channels unrelated to the change are suppressed, giving the enhanced features F_Ca and F_Cb. Finally, the features F_Ca and F_Cb are fused to generate the channel correlation attention weight W_C: F_Ca and F_Cb are concatenated (symbol C in Fig. 3(c)) to obtain the feature F_Cab, the channel number is compressed from 2C to C through a 1×1 convolution and a 3×3 convolution to obtain W'_C, and a sigmoid transformation is finally applied.
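The channel correlation attention admits an analogous sketch; exactly which operand the softmax-normalized C_C multiplies is not fully specified in the text, so the choice below is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelCorrelationAttention(nn.Module):
    """Compute the channel correlation attention weight W_C from F_a and F_b."""

    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, f_a, f_b):
        n, c, h, w = f_a.shape
        a = f_a.view(n, c, h * w)                     # reshape F_a -> N x C x HW
        b = f_b.view(n, c, h * w)                     # reshape F_b -> N x C x HW
        c_c = torch.bmm(a, b.transpose(1, 2))         # channel correlation map, N x C x C
        f_ca = torch.bmm(F.softmax(c_c, dim=-1), a).view(n, c, h, w)                  # F_Ca
        f_cb = torch.bmm(F.softmax(c_c.transpose(1, 2), dim=-1), b).view(n, c, h, w)  # F_Cb
        w_c = torch.sigmoid(self.fuse(torch.cat([f_ca, f_cb], dim=1)))
        return w_c
```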
Step 3.1.3, the parallel features F_a and F_b from the same scale layer are fed into the change difference module. The change difference module extracts difference features through two branches, an absolute-difference branch and a concatenation branch, and finally fuses the difference features extracted by the two branches to obtain the final difference feature F_D, as shown in Fig. 3(d). F_d1 is the absolute difference of F_a and F_b; F_ab is obtained by concatenating F_a and F_b, and the feature F_d2 is then obtained through a 1×1 convolution; finally, F_d1 and F_d2 are concatenated and passed through a 1×1 convolution, batch normalization and a LeakyReLU transformation to obtain the final feature F_D, where
LeakyReLU(x) = max(αx, x)
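A corresponding sketch of the change difference module is shown below; the LeakyReLU slope α = 0.1 is an assumed value (the patent only gives the general form LeakyReLU(x) = max(αx, x)), and the class and layer names are assumptions.

```python
import torch
import torch.nn as nn

class ChangeDifferenceModule(nn.Module):
    """Extract the difference feature F_D from F_a and F_b via two branches."""

    def __init__(self, channels: int, negative_slope: float = 0.1):
        super().__init__()
        self.concat_branch = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.out = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
            nn.LeakyReLU(negative_slope),
        )

    def forward(self, f_a, f_b):
        f_d1 = torch.abs(f_a - f_b)                      # absolute-difference branch
        f_ab = torch.cat([f_a, f_b], dim=1)              # concatenation branch (F_ab)
        f_d2 = self.concat_branch(f_ab)                  # 1x1 convolution on F_ab
        return self.out(torch.cat([f_d1, f_d2], dim=1))  # fuse the two branches -> F_D
```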
Step 3.1.4, the parallel features F_a and F_b from the same scale layer are concatenated to obtain the feature F_ab.
Step 3.1.5, F_D is enhanced under the guidance of W_P and W_C respectively to obtain the difference feature F'_P guided by spatial correlation attention and the difference feature F'_C guided by channel correlation attention, as shown in Fig. 3(a), calculated as follows:
F'_P = W_P ⊙ F_D
F'_C = W_C ⊙ F_D
where ⊙ denotes an element-wise product operation and ⊕ denotes an element-wise addition operation.
The difference feature F'_P guided by spatial correlation attention, the difference feature F'_C guided by channel correlation attention and the feature F_ab are further fused to obtain the final change difference feature F.
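Putting the module sketches above together, a dual-correlation attention block could be assembled as follows. Because the final fusion of F'_P, F'_C and F_ab is only described as "further fused", the element-wise addition followed by concatenation and a 1×1 convolution used here is an assumption of the sketch.

```python
import torch
import torch.nn as nn

class DualCorrelationAttention(nn.Module):
    """Combine W_P, W_C, F_D and F_ab into the change difference feature F."""

    def __init__(self, channels: int):
        super().__init__()
        self.spatial_attn = SpatialCorrelationAttention(channels)   # sketched above
        self.channel_attn = ChannelCorrelationAttention(channels)   # sketched above
        self.diff = ChangeDifferenceModule(channels)                 # sketched above
        self.out = nn.Conv2d(3 * channels, channels, kernel_size=1)

    def forward(self, f_a, f_b):
        w_p = self.spatial_attn(f_a, f_b)
        w_c = self.channel_attn(f_a, f_b)
        f_d = self.diff(f_a, f_b)
        f_p = w_p * f_d                       # F'_P = W_P ⊙ F_D
        f_c = w_c * f_d                       # F'_C = W_C ⊙ F_D
        f_ab = torch.cat([f_a, f_b], dim=1)   # F_ab (2C channels)
        return self.out(torch.cat([f_p + f_c, f_ab], dim=1))  # assumed fusion -> F
```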
And 3.2, fusing the difference features among the layers with different scales by using a refinement path aggregation pyramid module.
The path aggregation pyramid network adds a bottom-up path in addition to the conventional top-down path, which effectively enhances the expression of low-level features during feature fusion. When fusing features of different scales, the refinement path aggregation pyramid module provided by the invention uses concatenation instead of element-wise addition, so that lower-level and higher-level features can be fused adaptively and semantic changes at different scales can be captured more effectively.
The features of the C3-C5 layers first undergo cross-phase feature fusion through the dual-related attention modules, and feature fusion across scales is then completed through a top-down path. Take the generation of the P4 layer features as an example: the C5 layer features pass directly through a dual-related attention module to obtain the P5 layer features; the P5 layer features are upsampled to the same resolution as the C4 layer features and fused with the features obtained by applying a dual-related attention module to the C4 layer features, yielding the P4 layer features. Specifically, the features are concatenated and passed through a 1×1 convolution, a 3×3 convolution, batch normalization and a LeakyReLU transformation. The P3 layer features are generated in the same way as the P4 layer features.
Then, the P3-P5 layer features undergo feature fusion through a bottom-up path to generate the features N3-N5. Taking the generation of the N4 layer features as an example, the N3 layer features are obtained by applying a 1×1 convolution, a 3×3 convolution, batch normalization and a LeakyReLU transformation to the P3 layer features; the N3 layer features are then downsampled to the resolution of the P4 layer features and fused with the P4 layer features to obtain the N4 layer features, again by feature concatenation followed by a 1×1 convolution, a 3×3 convolution, batch normalization and a LeakyReLU transformation. The N5 layer features are generated in the same way as the N4 layer features.
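The following PyTorch sketch illustrates the concatenation-based top-down/bottom-up fusion described above, assuming all scale layers carry the same channel count; max-pooling is used as the downsampling operation (the patent does not specify one), and the block and class names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fusion_block(c_in, c_out, slope=0.1):
    """1x1 conv, 3x3 conv, batch norm and LeakyReLU, as used at each fusion step."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=1),
        nn.Conv2d(c_out, c_out, kernel_size=3, padding=1),
        nn.BatchNorm2d(c_out),
        nn.LeakyReLU(slope),
    )

class RefinedPathAggregation(nn.Module):
    """Top-down (P3-P5) then bottom-up (N3-N5) fusion using concatenation."""

    def __init__(self, channels: int):
        super().__init__()
        self.p4_fuse = fusion_block(2 * channels, channels)
        self.p3_fuse = fusion_block(2 * channels, channels)
        self.n3_conv = fusion_block(channels, channels)
        self.n4_fuse = fusion_block(2 * channels, channels)
        self.n5_fuse = fusion_block(2 * channels, channels)

    def forward(self, d3, d4, d5):
        # d3-d5: outputs of the dual-related attention modules at the C3-C5 scales
        p5 = d5
        p4 = self.p4_fuse(torch.cat([F.interpolate(p5, scale_factor=2), d4], dim=1))
        p3 = self.p3_fuse(torch.cat([F.interpolate(p4, scale_factor=2), d3], dim=1))
        n3 = self.n3_conv(p3)
        n4 = self.n4_fuse(torch.cat([F.max_pool2d(n3, 2), p4], dim=1))
        n5 = self.n5_fuse(torch.cat([F.max_pool2d(n4, 2), p5], dim=1))
        return n3, n4, n5
```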
And 4, sending the features (N3-N5) with different scales from the refinement path aggregation pyramid module into a change detection head to output a change object prediction in the form of a boundary box.
In the dual-related attention-guided change detection network, finally, the difference features with different scales are sent to a change detection head of the network, and the detected change geographic entity (such as a newly added building, an artificial structure and the like) is represented in the form of a boundary box.
In an embodiment, step 4 comprises the sub-steps of:
step 4.1, the change detection heads are divided into three layers, namely a detection head-S, a detection head-M and a detection head-L, which respectively correspond to N3-N5 layer features from the refined path aggregation pyramid module, and the change confidence Conf of the change object is learned in parallel according to the supervision information changed And coordinates (x, y, w, h). And the prior frames are used for quickly and efficiently learning the regression of the boundary frames of the change object, the prior frames on the three detection layers are 3 by default, and the length and width of all prior frames are obtained by carrying out kmeans clustering according to the training set data. A is the prior frame number, C is the feature channel number, W is the feature map width, and H is the feature map height.
Step 4.2, the model is trained with a multi-task loss function according to the training sample labels. The loss function Loss comprises a bounding-box regression loss L_reg and a change-confidence prediction loss L_Conf:
Loss = λ_reg × L_reg + λ_Conf × L_Conf
where λ_reg is the weight of the regression loss in the total loss and λ_Conf is the weight of the change-confidence prediction loss in the total loss; λ_reg = 1 and λ_Conf = 2.5 are used.
Comprises the following substeps:
and 4.2.1, generating positive and negative samples.
The ratio of the ground-truth box size to the anchor size is calculated. If the ratio lies within the set threshold range (1/4 to 4), the anchor is treated as a positive sample (confidence 1); otherwise it is treated as a negative sample (confidence 0).
Step 4.2.2, calculating the regression loss of the boundary box. The CIoU loss function is used as a bounding box regression loss function.
The CIoU loss takes the form
L_CIoU = 1 - IoU + ρ²(b, b_gt)/c² + αv, with v = (4/π²)·(arctan(w_gt/h_gt) - arctan(w/h))² and α = v/((1 - IoU) + v),
where b represents the center point coordinates of the predicted bounding box, b_gt represents the center point coordinates of the ground-truth box, IoU is the intersection-over-union, i.e. the intersection area of the predicted box and the ground-truth box divided by their union area, ρ(·,·) is the Euclidean distance between the center point of the predicted bounding box and the center point of the ground-truth box, c is the diagonal length of the smallest enclosing rectangle containing both the predicted box and the ground-truth box, v measures the similarity of the aspect ratios of the predicted bounding box and the ground-truth box, w_gt and h_gt represent the width and height of the ground-truth box, w and h represent the width and height of the predicted box, and α is the weight coefficient.
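A self-contained sketch of the CIoU loss as reconstructed above is given below, for boxes in (cx, cy, w, h) form; the epsilon terms and the detaching of α are numerical-stability choices of the sketch, not details taken from the patent.

```python
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    """CIoU loss for boxes given as (cx, cy, w, h); pred and target are N x 4 tensors."""
    px, py, pw, ph = pred.unbind(-1)
    tx, ty, tw, th = target.unbind(-1)
    # corner coordinates
    p_x1, p_y1, p_x2, p_y2 = px - pw / 2, py - ph / 2, px + pw / 2, py + ph / 2
    t_x1, t_y1, t_x2, t_y2 = tx - tw / 2, ty - th / 2, tx + tw / 2, ty + th / 2
    # intersection-over-union
    inter_w = (torch.min(p_x2, t_x2) - torch.max(p_x1, t_x1)).clamp(min=0)
    inter_h = (torch.min(p_y2, t_y2) - torch.max(p_y1, t_y1)).clamp(min=0)
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter + eps
    iou = inter / union
    # squared center distance over squared enclosing-box diagonal
    cw = torch.max(p_x2, t_x2) - torch.min(p_x1, t_x1)
    ch = torch.max(p_y2, t_y2) - torch.min(p_y1, t_y1)
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = (px - tx) ** 2 + (py - ty) ** 2
    # aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (torch.atan(tw / (th + eps)) - torch.atan(pw / (ph + eps))) ** 2
    with torch.no_grad():
        alpha = v / (1 - iou + v + eps)
    return (1 - iou + rho2 / c2 + alpha * v).mean()
```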
and 4.2.3, calculating the change confidence prediction loss. The Focal loss function is used as the varying confidence loss function.
Where y' represents the predicted confidence of the change, y represents the true confidence, α is used to control the contribution of positive and negative samples to the loss, and γ is used to control the contribution of difficult and simple samples to the loss. In the experiments of the present invention, α=0.25 and γ=1.5 were used by default.
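A corresponding sketch of the change-confidence focal loss is shown below; the way α weights the positive versus negative terms follows the common binary focal-loss form and is an assumption, while α = 0.25 and γ = 1.5 are the defaults stated above.

```python
import torch

def focal_change_confidence_loss(pred_conf, true_conf, alpha=0.25, gamma=1.5, eps=1e-7):
    """Binary focal loss on the change confidence (pred_conf in (0, 1) after sigmoid)."""
    p = pred_conf.clamp(eps, 1 - eps)
    pos = -alpha * (1 - p) ** gamma * true_conf * torch.log(p)            # changed samples
    neg = -(1 - alpha) * p ** gamma * (1 - true_conf) * torch.log(1 - p)  # unchanged samples
    return (pos + neg).mean()
```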
And 4.2.4, training the object level change detection network guided by the dual relevant attention by utilizing the multi-task loss until the whole network is converged to the optimal precision.
Step 4.3, the dual-phase image pair to be detected is input into the network trained in step 4.2 to obtain accurate change detection results. The dual-phase image pairs to be detected are generated in the same way as the training set in the preceding steps.
Prediction results of change detection on part of the experimental data show that the invention can accurately and robustly perform change detection on remote sensing images in different scenes. The experimental results demonstrate that the data enhancement method dedicated to change detection accelerates model training and improves model performance, and that, through end-to-end learning on the training set and the guidance of the dual-related attention mechanism, the detection network based on dual-related attention can effectively resist pseudo-change interference in image pairs and has higher accuracy and robustness.
In particular, the method according to the technical solution of the present invention may be implemented by those skilled in the art using computer software technology to implement an automatic operation flow, and a system apparatus for implementing the method, such as a computer readable storage medium storing a corresponding computer program according to the technical solution of the present invention, and a computer device including the operation of the corresponding computer program, should also fall within the protection scope of the present invention.
In some possible embodiments, an object-level remote sensing change detection system based on dual-related attention is provided, comprising the following modules,
the first module is used for carrying out data enhancement for change detection to generate a double input stream, the implementation mode comprises that random time sequence exchange and combination are carried out on four pairs of images to obtain a pair of synthesized images, enhancement operation is carried out on one of the synthesized images, the enhancement operation comprises random geometric transformation, brightness transformation, gaussian noise addition and color transformation, and only corresponding geometric transformation is carried out on the other synthesized image;
the second module is used for setting a backbone network with shared weight and receiving double input streams and extracting different scale characteristics of the double-phase images;
and the third module is used for setting a feature fusion neck for dual-related attention guidance on the basis of a backbone network, wherein the feature fusion neck is internally provided with a dual-related attention module for guiding the network to focus on the correlation of the dual-time phase features with the same scale in a space level and a channel level so as to acquire refined difference features, and a refinement path aggregation pyramid module is arranged for fusing features among layers with different scales.
And a fourth module, which is used for sending the difference features with different scales to the change detection head, and predicting the position, the size and the change confidence of the change feature in the form of a boundary frame.
In some possible embodiments, an object level remote sensing change detection system based on dual related attention is provided, including a processor and a memory, the memory for storing program instructions, the processor for invoking the stored instructions in the memory to perform an object level remote sensing change detection method based on dual related attention as described above.
In some possible embodiments, an object level remote sensing change detection system based on dual-related attention is provided, which comprises a readable storage medium having stored thereon a computer program which, when executed, implements an object level remote sensing change detection method based on dual-related attention as described above.
The specific embodiments described herein are offered by way of example only to illustrate the spirit of the invention. Those skilled in the art may make various modifications or additions to the described embodiments or substitutions thereof without departing from the spirit of the invention or exceeding the scope of the invention as defined in the accompanying claims.

Claims (10)

1. An object-level remote sensing change detection method based on dual-related attention is characterized by comprising the following steps of: comprises the steps of,
step 1, data enhancement for change detection is carried out to generate a double input stream, the implementation mode comprises that random time sequence exchange and combination are carried out on four pairs of images to obtain a pair of synthesized images, enhancement operation is carried out on one of the synthesized images, the enhancement operation comprises random geometric transformation, brightness transformation, gaussian noise addition and color transformation, and only corresponding geometric transformation is carried out on the other synthesized image;
step 2, setting a backbone network with shared weight for receiving the double input streams and extracting different scale characteristics of the double-phase images;
step 3, setting a feature fusion neck for dual-related attention guidance on the basis of a backbone network, wherein a dual-related attention module is arranged in the feature fusion neck for guiding the network to focus on the correlation of the spatial hierarchy and the channel hierarchy of dual-temporal features with the same scale so as to acquire refined difference features, and a refinement path aggregation pyramid module is arranged for fusing features among layers with different scales; the implementation is as follows,
step 3.1, the dual-related attention module acquires refined difference features by using the feature correlation of the spatial hierarchy and the channel hierarchy on the parallel features of the same scale layer, wherein the dual-related attention module comprises a spatial related attention module, a channel related attention module and a change difference module, and the implementation process comprises the following substeps:
step 3.1.1, the parallel features F_a and F_b from the same scale layer are fed into a spatial correlation attention module; their spatial correlation map C_P is first calculated as follows:
C_P = F̃_a ⊗ F'_b
wherein F̃_a is the result obtained from F_a through transpose and reshape operations, F'_b is the result obtained from F_b through a reshape operation, and ⊗ denotes matrix multiplication; according to the spatial correlation map C_P, the spatial local features in F_a and F_b related to the change are enhanced and the spatial local features unrelated to the change are suppressed: the softmax-normalized C_P is matrix-multiplied with the reshaped F_a to obtain the enhanced feature F_Pa, and the softmax-normalized transpose of C_P is matrix-multiplied with the reshaped F_b to obtain the enhanced feature F_Pb; finally, the features F_Pa and F_Pb are fused to generate the spatial correlation attention weight W_P, the fusion process comprising concatenating the features F_Pa and F_Pb to obtain the feature F_Pab, compressing the channel number by convolution to obtain W'_P, and finally performing a sigmoid transformation;
step 3.1.2, the parallel features F_a and F_b from the same scale layer are fed into a channel correlation attention module; their channel correlation map C_C is first calculated as follows:
C_C = F'_a ⊗ F̃_b
wherein F̃_b is the result obtained from F_b through transpose and reshape operations, F'_a is the result obtained from F_a through a reshape operation, and ⊗ denotes matrix multiplication; according to the channel correlation map C_C, the channels in F_a and F_b related to the change are enhanced and the channels unrelated to the change are suppressed, obtaining the enhanced features F_Ca and F_Cb; finally, the features F_Ca and F_Cb are fused to generate the channel correlation attention weight W_C, the fusion process comprising concatenating the features F_Ca and F_Cb to obtain the feature F_Cab, compressing the channel number by convolution to obtain W'_C, and finally performing a sigmoid transformation;
step 3.1.3 parallel features F from the Co-scale layer a ,F b Is sent to a variance module; the variation difference module extracts difference features through two branches of absolute difference operation and serial operation respectively, and finally fuses the difference features extracted by the two branches to obtain a final difference feature F D ;F d1 From F a ,F b Obtaining absolute difference F ab From F a ,F b Performing characteristic connection to obtain a characteristic F, and performing convolution to obtain a characteristic F d2 Final F d1 and Fd2 Performing feature connection, convolution, batch normalization and LeakyReLU transformation to obtain final feature F D
step 3.1.4, the parallel features F_a and F_b from the same scale layer are concatenated to obtain the feature F_ab;
Step 3.1.5, F D Respectively at W P ,W C Is guided by (a) to perform feature enhancement to obtain difference feature F guided by spatially dependent attention P Channel dependent attention directed difference feature F C The calculation method is as follows:
F′ p =W P ⊙F D
F′ c =W C ⊙F D
wherein +.H indicates an element-wise dot product operation,representing an element-wise addition operation;
difference feature F that directs spatially dependent attention P Channel dependent attention directed difference feature F C And feature F ab Further fusing to obtain a final variation difference characteristic F;
step 3.2, fusing the difference features among different scale layers by a refinement path aggregation pyramid module;
the path aggregation pyramid network adds a bottom-up path outside a conventional top-down path, and enhances the expression of low-layer features in feature fusion; the refinement path aggregation pyramid module uses series operation instead of element-by-element addition operation when fusion of features of different scales is carried out so as to adaptively fuse the features of a lower layer and a higher layer and capture semantic changes of ground objects of different scales;
the features of the C3-C5 layers are subjected to feature fusion of different time phases through a dual-related attention module, and then subjected to feature fusion of different scales through a top-down path; then, the features of the P3-P5 layers are further subjected to feature fusion through a bottom-up path to generate features N3-N5;
and 4, finally, sending the difference features with different scales into a change detection head, and predicting the position, the size and the change confidence of the change feature in a boundary frame mode.
2. The object-level remote sensing change detection method based on dual-related attention according to claim 1, wherein: in the backbone network sharing the weight, feature reuse is realized through a layered feature fusion strategy, and excessively repeated gradient information is truncated and used for extracting parallel features of different scales of the double-phase images.
3. The object-level remote sensing change detection method based on dual-related attention according to claim 2, wherein: and constructing a twin network by using the CSPDarkNet-53 network as a backbone network sharing the weight, extracting the characteristics of five scales, and marking the characteristics as C1-C5 layers from the top layer to the bottom layer.
4. The object-level remote sensing change detection method based on dual-related attention according to claim 3, wherein: the feature fusion neck of the dual collaborative attention guide fuses the features of the C3-C5 layers of the twin CSPDarknet-53 network by adopting three dual relevant attention modules to generate P3-P5 layer features, and then fuses the difference features among different scale layers by adopting a refinement path aggregation pyramid module.
5. The object-level remote sensing change detection method based on dual-related attention as claimed in claim 1 or 2 or 3 or 4, wherein: the dual-related attention module comprises a spatial correlation attention module, a channel correlation attention module and a change difference module; the parallel features F_a and F_b from the same scale layer are concatenated to obtain the feature F_ab, and the difference feature F'_P guided by spatial correlation attention, the difference feature F'_C guided by channel correlation attention and the feature F_ab are further fused to obtain the final change difference feature F.
6. The object-level remote sensing change detection method based on dual-related attention according to claim 4, wherein: the change detection heads are divided into three layers, namely a detection head-S, a detection head-M and a detection head-L, which respectively correspond to N3-N5 layer characteristics from the refined path aggregation pyramid module, change confidence and coordinates of the change object are learned in parallel according to the supervision information, and regression of a boundary frame of the change object is learned by using a priori frame; training the whole network model according to the training sample labels by using a multi-task loss function; and inputting the double-phase time-phase images to be detected into a trained network to obtain an accurate change detection result.
7. An object-level remote sensing change detection system based on dual-related attention, which is characterized in that: an object-level remote sensing change detection method for implementing a dual-related attention-based method as claimed in any one of claims 1 to 6.
8. The dual-related attention-based object-level remote sensing change detection system of claim 7, wherein: comprising the following modules, wherein the modules are arranged in a row,
the first module is used for carrying out data enhancement for change detection to generate a double input stream, the implementation mode comprises that random time sequence exchange and combination are carried out on four pairs of images to obtain a pair of synthesized images, enhancement operation is carried out on one of the synthesized images, the enhancement operation comprises random geometric transformation, brightness transformation, gaussian noise addition and color transformation, and only corresponding geometric transformation is carried out on the other synthesized image;
the second module is used for setting a backbone network with shared weight and receiving double input streams and extracting different scale characteristics of the double-phase images;
the third module is used for setting a feature fusion neck for dual-related attention guidance on the basis of a backbone network, wherein the feature fusion neck is internally provided with a dual-related attention module for guiding the network to focus on the correlation of the dual-time phase features with the same scale in a space level and a channel level so as to acquire refined difference features, and a refinement path aggregation pyramid module is arranged for fusing features among layers with different scales;
and a fourth module, which is used for sending the difference features with different scales to the change detection head, and predicting the position, the size and the change confidence of the change feature in the form of a boundary frame.
9. The dual-related attention-based object-level remote sensing change detection system of claim 7, wherein: comprising a processor and a memory for storing program instructions, the processor being adapted to invoke the stored instructions in the memory to perform a method of object-level telemetry change detection based on dual-related attention as claimed in any one of claims 1-6.
10. The dual-related attention-based object-level remote sensing change detection system of claim 7, wherein: comprising a readable storage medium having stored thereon a computer program which, when executed, implements a dual related attention based object level remote sensing change detection method as claimed in any of claims 1-6.
CN202110692812.7A 2021-06-22 2021-06-22 Object-level remote sensing change detection method and system based on dual-related attention Active CN113609896B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110692812.7A CN113609896B (en) 2021-06-22 2021-06-22 Object-level remote sensing change detection method and system based on dual-related attention

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110692812.7A CN113609896B (en) 2021-06-22 2021-06-22 Object-level remote sensing change detection method and system based on dual-related attention

Publications (2)

Publication Number Publication Date
CN113609896A (en) 2021-11-05
CN113609896B (en) 2023-09-01

Family

ID=78336707

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110692812.7A Active CN113609896B (en) 2021-06-22 2021-06-22 Object-level remote sensing change detection method and system based on dual-related attention

Country Status (1)

Country Link
CN (1) CN113609896B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114155200B (en) * 2021-11-09 2022-08-26 二十一世纪空间技术应用股份有限公司 Remote sensing image change detection method based on convolutional neural network
CN114037912A (en) * 2022-01-07 2022-02-11 成都国星宇航科技有限公司 Method and device for detecting change of remote sensing image and computer readable storage medium
CN115114395B (en) * 2022-04-15 2024-03-19 腾讯科技(深圳)有限公司 Content retrieval and model training method and device, electronic equipment and storage medium
CN114821354B (en) * 2022-04-19 2024-06-07 福州大学 Urban building change remote sensing detection method based on twin multitasking network
CN114937204B (en) * 2022-04-29 2023-07-25 南京信息工程大学 Neural network remote sensing change detection method for lightweight multi-feature aggregation
CN115601318B (en) * 2022-10-10 2023-05-02 广东昱升个人护理用品股份有限公司 Intelligent production method and system for quick-absorption low-reverse-osmosis paper diaper
CN116385881B (en) * 2023-04-10 2023-11-14 北京卫星信息工程研究所 Remote sensing image ground feature change detection method and device
CN117574259B (en) * 2023-10-12 2024-05-07 南京工业大学 Attention twin intelligent migration interpretability diagnosis method suitable for high-end equipment
CN117671437B (en) * 2023-10-19 2024-06-18 中国矿业大学(北京) Open stope identification and change detection method based on multitasking convolutional neural network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110705457A (en) * 2019-09-29 2020-01-17 核工业北京地质研究院 Remote sensing image building change detection method
CN111091712A (en) * 2019-12-25 2020-05-01 浙江大学 Traffic flow prediction method based on cyclic attention dual graph convolution network
CN112287123A (en) * 2020-11-19 2021-01-29 国网湖南省电力有限公司 Entity alignment method and device based on edge type attention mechanism
CN112949549A (en) * 2021-03-19 2021-06-11 中山大学 Super-resolution-based change detection method for multi-resolution remote sensing image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Attention-guided 3D convolutional network for change detection in remote sensing scenes; Zhang Han et al.; Journal of Applied Sciences; Vol. 39, No. 2; full text *

Also Published As

Publication number Publication date
CN113609896A (en) 2021-11-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant