CN117611996A - Grape planting area remote sensing image change detection method based on depth feature fusion

Grape planting area remote sensing image change detection method based on depth feature fusion

Info

Publication number
CN117611996A
CN117611996A CN202311526699.0A CN202311526699A CN117611996A CN 117611996 A CN117611996 A CN 117611996A CN 202311526699 A CN202311526699 A CN 202311526699A CN 117611996 A CN117611996 A CN 117611996A
Authority
CN
China
Prior art keywords
image
feature map
feature
remote sensing
representing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311526699.0A
Other languages
Chinese (zh)
Inventor
张宏鸣
沈寅威
李峥嵘
韩文霆
吴军虎
强建华
唐恒翱
阳光
高郑杰
张二磊
詹涛
牛当当
宋荣杰
朱珊娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Xingshan Shitu Technology Co ltd
Northwest A&F University
Aerial Photogrammetry and Remote Sensing Co Ltd
Original Assignee
Xi'an Xingshan Shitu Technology Co ltd
Northwest A&F University
Aerial Photogrammetry and Remote Sensing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Xingshan Shitu Technology Co ltd, Northwest A&F University, and Aerial Photogrammetry and Remote Sensing Co Ltd
Priority to CN202311526699.0A
Publication of CN117611996A
Legal status: Pending


Classifications

    • G06V 20/188 Terrestrial scenes; Vegetation
    • G06N 3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N 3/096 Transfer learning
    • G06V 10/806 Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V 10/82 Image or video recognition or understanding using neural networks
    • G06V 20/13 Satellite images


Abstract

The invention provides a grape planting area remote sensing image change detection method based on depth feature fusion, comprising the following steps: step one, processing Gaofen-2 (GF-2) remote sensing image data; step two, constructing a change detection data set; step three, constructing a change detection model (step 301, selecting ResNet101 as the backbone network; step 302, constructing a ResCBAM model; step 303, designing a context aggregation module; step 304, designing a depth feature fusion module); step four, training and saving the model; step five, detecting changes in remote sensing images of the grape planting area. The final intersection-over-union (IoU), recall and F1-score of the invention reach 77.22%, 85.10% and 87.15% respectively, improvements of 3.24, 4.59 and 2.1 percentage points over current mainstream change detection methods.

Description

Grape planting area remote sensing image change detection method based on depth feature fusion
Technical Field
The invention belongs to the technical field of agricultural planting, relates to remote sensing images of planting areas, and particularly relates to a grape planting area remote sensing image change detection method based on depth feature fusion.
Background
Accurately extracting the changed regions of grape planting areas from remote sensing images with deep learning technology yields spatial distribution change information for those areas, which can serve as a data basis for management and administration and provide technical support for regional land resource management, yield estimation, fine-grained management and the like. Complex remote sensing interpretation tasks have long been a research difficulty in the remote sensing field and in computer vision; deep learning technology can largely overcome the time- and labor-consuming nature of traditional field interpretation and has therefore been widely applied to agricultural planting planning.
Commonly used change detection methods can be classified into algebra-based, machine learning-based, classification-based and deep learning-based techniques. Algebra-based change detection depends critically on the threshold chosen for the change region: it can capture the scale of change in a study area, but threshold selection is difficult. Machine learning-based change detection requires manual feature screening and is therefore unstable, while classification-based change detection suffers from error accumulation and insufficient recognition accuracy. As the feature extraction capability of deep learning has improved, it has achieved major breakthroughs in computer vision and has gradually been applied to remote sensing interpretation. However, existing remote sensing image change detection techniques for grape planting areas still suffer from low extraction accuracy and incomplete identification of the change subject, so a high-precision end-to-end grape planting area change detection model is urgently needed. Current research and technology retain the following shortcomings:
First, the imaging times and conditions of the two periods of remote sensing images differ, so deviations in illumination and color exist between them, and this heterogeneity affects the extraction of changed regions; multi-band information therefore needs to be integrated and attended to in order to better highlight the differences between plots.
Second, the features extracted by existing change detection methods neglect contextual relationships: a single pixel carries little semantic information and can hardly provide rich discriminative information for change inference, so the network structure needs to be modified to integrate more context information.
Third, deep semantic features lack the detail features of the original image; the interiors of the changed regions extracted by existing methods are often incomplete, the edge information is poor and the accuracy is low, so a depth feature fusion module needs to be designed to better process the deep features and improve detection precision.
Disclosure of Invention
Aiming at the defects of the prior art, the invention aims to provide a grape planting area remote sensing image change detection method based on depth feature fusion, so as to solve the technical problems that the detection methods in the prior art have relatively low recognition accuracy and that the completeness of the recognized subject needs further improvement.
In order to solve the technical problems, the invention adopts the following technical scheme:
a method for detecting remote sensing image change of a grape planting area based on depth feature fusion comprises the following steps:
and step one, processing high-resolution second-order remote sensing image data.
And step two, constructing a change detection data set.
Step three, constructing a change detection model:
step 301, selecting ResNet101 as a backbone network.
In step 302, a ResCBAM model is constructed.
In step 303, a context aggregation module is designed.
Step 304, a depth feature fusion module is designed.
Step four, model training and storage:
and (3) training and storing the ResCBAM model obtained in the step (III) by adopting the change detection data set obtained in the step (II).
Step five, detecting the change of remote sensing images of grape planting areas:
preprocessing a double-phase remote sensing image of a region to be detected of a grape planting region according to the remote sensing image data processing method in the first step, aligning a geographic coordinate system, cutting the image into the same pixel size, inputting the same pixel size into a ResCBAM model trained in the fourth step, calling the weight in the pre-stored ResCBAM model trained in the fourth step, predicting the region according to a sliding window mode, and splicing each small block, so that a change region extraction image with the same pixel size as the original image is obtained.
Compared with the prior art, the invention has the following technical effects:
(I) The invention reconstructs the input layer and residual structure of the ResNet model to form a new network model, ResCBAM, so that a neural network designed for conventional natural images becomes suitable for remote sensing images containing more spectral information. A CBAM attention mechanism is added at the network encoder stage to emphasize change-related features in both the spatial and channel dimensions, which reduces the influence of the heterogeneity of the dual-phase remote sensing images on change detection and makes it easier to distinguish the spectral information of different plots.
(II) The invention designs a context aggregation module at the decoder stage, enlarging the model's receptive field and increasing the global correlation of pixel information, thereby largely avoiding false and missed detections and further improving model accuracy.
(III) The invention designs a depth feature fusion module, avoiding problems such as semantic gaps and noise redundancy when fusing features of different depths, and improving the completeness of the extracted change regions and the smoothness of their edges.
(IV) The final intersection-over-union, recall and F1-score reach 77.22%, 85.10% and 87.15% respectively, improvements of 3.24, 4.59 and 2.1 percentage points over current mainstream change detection methods.
Drawings
Fig. 1 is a schematic diagram of a two-phase remote sensing image of a grape planting area.
Fig. 2 is a diagram of a change detection network configuration.
Fig. 3 is a block diagram of a convolution attention module.
Fig. 4 is a block diagram of a context aggregation module.
Fig. 5 is a block diagram of a depth feature fusion module.
FIG. 6 is a graph of the predicted results of different comparison methods.
The following examples illustrate the invention in further detail.
Detailed Description
It should be noted that, unless specifically indicated otherwise, all algorithms, software, tools and modules in the present invention are known in the art.
The invention relates to grape planting area change detection technology at the intersection of remote sensing, computer image processing and related fields; it uses two or more images of the same geographic location to discover and identify differences in ground objects and thereby analyze the change situation at a specific place. By modifying a deep learning network model, a segmentation model suited to natural images is applied to remote sensing image interpretation: the model reads multispectral channels, an attention mechanism is added to raise the model's attention to change features, and a context aggregation module and a depth feature fusion module are designed to overcome the small receptive field and insufficient attention to deep features of deep learning models.
The invention provides a method for detecting remote sensing image changes in grape planting areas based on a ResNet convolutional neural network; it is a high-precision end-to-end change detection method that addresses the low recognition accuracy and incomplete recognition subjects of existing change detection methods. The deep learning model is improved: multi-band information is fused while an attention mechanism is added at the encoder stage to emphasize change features, and a context aggregation module and a depth feature fusion module are designed, reducing false and missed detections in the results.
Based on a ResNet convolutional neural network and an encoder-decoder structure, the invention reconstructs the input layer and residual structure of the ResNet model to form a new network model (ResNet with Convolutional Block Attention Module, ResCBAM). The model fuses multi-band data and adds a convolutional attention mechanism in each Bottleneck as the feature extraction module of the network, weighting the extracted features in the spatial and channel dimensions to raise the model's attention to changed regions. Meanwhile, a Context Aggregation Module (CAM) and a Depth Feature Fusion Module (DFFM) are designed at the decoder stage, increasing the global information of deep feature maps while fusing multiple layers of deep features to enrich the feature dimensions, thereby improving model accuracy, making the change-region subject more complete and its edges smoother, and finally forming a grape planting area change detection method suited to high-resolution satellite remote sensing images. On a change data set of the Ningxia grape planting area, the method is compared with the mainstream change detection methods DSIFN, SNUNet, A2Net and ResNet-CD. The results show that the designed ResCBAM network fuses more band data, and the designed context aggregation module and depth feature fusion module effectively reduce false and missed detections; the final intersection-over-union, recall and F1-score reach 77.22%, 85.10% and 87.15% respectively, improvements of 3.24, 4.59 and 2.1 percentage points over current mainstream change detection methods. The extracted change regions are more complete and retain more edge detail, providing a new solution for change detection over large-scale grape planting areas with complex backgrounds.
The following specific embodiments of the present invention are given according to the above technical solutions, and it should be noted that the present invention is not limited to the following specific embodiments, and all equivalent changes made on the basis of the technical solutions of the present application fall within the protection scope of the present invention.
Examples:
This embodiment provides a grape planting area remote sensing image change detection method based on depth feature fusion, comprising the following steps:
Step one, Gaofen-2 (GF-2) remote sensing image data processing:
the multispectral image and the panchromatic image obtained by high-resolution second satellite shooting are subjected to operations of radiometric calibration, atmospheric correction, orthographic correction and image registration by using ENVI software, and then image fusion is carried out by using a NNDiffusion PanSharpening tool, so that the image is more similar to the real situation of the earth surface, and fused image data of 1m spatial resolution and red-green-blue-near infrared 4 wave bands are obtained.
This step integrates rich spatial and spectral information and lays a solid data foundation for the subsequent change detection; the dual-phase remote sensing images are shown in Fig. 1.
Step two, constructing a change detection data set:
Step 201, according to the remote sensing images of the satellite-imaged site and the distribution of local grape planting areas, and based on actual ground survey and visual interpretation, the two periods of remote sensing images are manually vectorized and labeled in ArcGIS software to generate label images with the same size and spatial resolution as the originals. A difference operation on the two periods of label images then yields the change detection label, in which changed pixels have value 255 and unchanged pixels have value 0. After pairing the labels with the two phase images, they are cut by sliding window into 512 × 512-pixel tiles at an overlap rate of 0.6 to obtain the final change detection image-label pairs.
Step 202, the data set is expanded with data enhancement techniques. After removing some unchanged image-label pairs, 1254 image-label pairs remain; data enhancement then generates rich, differentiated data by applying rotation and flipping transforms, giving 10072 image-label pairs. These are divided into training, validation and test sets at a ratio of 9:2:2: the training set contains 7016 image-label pairs, and the validation and test sets each contain 1528. The training set is used to train the model; the validation and test sets are used to verify its generalization ability.
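As an illustration of the tiling in step 201 and the filtering in step 202, the following minimal Python sketch is provided; the function and variable names are hypothetical, and only the tile size (512), the overlap rate (0.6) and the label values (255/0) come from the text above.

```python
def sliding_window_tiles(image, label, tile=512, overlap=0.6):
    """Yield aligned (image, label) tiles cut with a sliding window.

    image: (H, W, C) NumPy array, e.g. a 4-band fused GF-2 image.
    label: (H, W) NumPy array, 255 for changed pixels and 0 for unchanged.
    """
    stride = int(tile * (1.0 - overlap))  # overlap 0.6 -> stride of 204 px
    h, w = label.shape
    for y in range(0, h - tile + 1, stride):
        for x in range(0, w - tile + 1, stride):
            yield image[y:y + tile, x:x + tile], label[y:y + tile, x:x + tile]

# Step 202 removes part of the unchanged pairs; one simple variant keeps
# only tiles that contain at least one changed pixel:
# pairs = [(im, lb) for im, lb in sliding_window_tiles(img, lbl)
#          if (lb == 255).any()]
```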
Step three, constructing a change detection model:
step 301, selecting ResNet101 as a backbone network.
The ResNet network was first proposed by He et al. in 2016. As the number of layers of a convolutional network model keeps increasing, a degradation phenomenon occurs in which performance drops sharply; ResNet solves the problem of gradients suddenly vanishing or exploding in such deep networks. The ResNet structure uses shortcut connections to perform identity mapping; the specific flow is shown in Formula 1.
H(x) = F(x, {W_i}) + x
F(x, {W_i}) = W_2 σ(W_1 x)   (Formula 1)
Wherein:
H(x) represents the output feature;
F(·) represents the residual mapping to be learned;
x represents the input feature;
{W_i} represents the weight parameters;
W_1 represents the layer-1 network parameters;
W_2 represents the layer-2 network parameters;
σ represents the ReLU function.
In computer vision, the convolutional neural network is a commonly used structure for automatically extracting feature information from raw image data; as the number of network layers increases, the extracted features become more representative and semantically richer. Commonly used ResNet models include ResNet18, ResNet50 and ResNet101. Since remote sensing images carry rich spectral and spatial information, a deeper network can extract features with rich semantics, which benefits complex change detection tasks; ResNet101 is therefore selected.
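For illustration, a minimal PyTorch sketch of the identity mapping of Formula 1 follows; it is a simplified two-convolution residual block, not the exact Bottleneck configuration of ResNet101.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """H(x) = F(x, {W_i}) + x with F = W2 * sigma(W1 * x), per Formula 1."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)  # W1
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)  # W2
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))  # sigma(W1 x)
        out = self.bn2(self.conv2(out))           # F(x, {W_i})
        return self.relu(out + x)                 # H(x) = F(x) + x
```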
The overall network architecture of the present invention is shown in fig. 2.
Step 302, a ResCBAM model is built:
In this step, the feature maps generated by the encoder for the two phases differ: because the imaging phases of the images are inconsistent, heterogeneity exists between the two images, and besides the change information of the grape planting area the feature maps contain a large amount of other information such as color and brightness variations. Therefore, a convolutional attention module is incorporated at the network encoder stage. It comprises two dimensions, a channel attention module (Channel Attention Module) and a spatial attention module (Spatial Attention Module); the module structure is shown in Fig. 3. The change detection model uses the spatial-channel attention module to emphasize feature map pixels relevant to the target while suppressing pixels irrelevant to it.
The detailed procedure for constructing the ResCBAM model is as follows:
Step 30201, obtaining channel attention: spatial average pooling and max pooling are applied to the H × W × C feature map F to obtain two 1 × 1 × C global feature maps F_avg and F_max.
Wherein:
H represents the height of the input picture;
W represents the width of the input picture;
C represents the number of channels of the input picture.
Step 30202, global feature map F avg And F max And sending the obtained two feature images into a ResNet network containing two layers, generating a final weight coefficient through a Sigmoid function after obtaining the two feature images, and multiplying the final weight coefficient by a feature image F to obtain a new feature image F ', wherein the calculation formula of the new feature image F' is shown in a formula 2.
F′=Sigmoid(W 1 (W 0 (F avg )+W 1 (W 0 (F max ) Formula 2);
Wherein:
F′ represents the new feature map;
Sigmoid(·) represents the Sigmoid function operation;
W_0 represents the layer-0 network parameters;
W_1 represents the layer-1 network parameters;
⊗ represents element-wise multiplication;
F_avg represents the average-pooled feature map of feature map F;
F_max represents the max-pooled feature map of feature map F.
Step 30203, introducing the spatial attention module to the data of the new feature map F' to focus on the information of different positions on the same channel; for the new feature map F ', an average pooling F ' is performed in the channel dimension ' avg And maximum pooling and F' max The description about the spatial attention H multiplied by W multiplied by 1 is formed, the description is spliced together and then is subjected to a convolution layer and a Sigmoid activation function to generate a weight coefficient, the weight coefficient is multiplied by a new feature map F ', a new feature map F ' after scaling is obtained, and a calculation formula of the new feature map F ' after scaling is shown as a formula 3.
F″=Sigmoid(conv 7×7 ([F′ avg :F′ max ]) Formula 3;
Wherein:
F″ represents the scaled new feature map;
Sigmoid(·) represents the Sigmoid function operation;
conv_7×7 represents a convolution operation with kernel size 7;
[:] represents the concatenation operation;
⊗ represents element-wise multiplication;
F′_avg represents the average-pooled feature map of the new feature map F′;
F′_max represents the max-pooled feature map of the new feature map F′.
Step 30204, the channel attention and the spatial attention are combined sequentially and merged into the ResNet network, relieving the problems caused by color and illumination deviations between the different phases.
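A minimal PyTorch sketch of the sequential channel-then-spatial attention of Formulas 2 and 3 follows; it is a generic CBAM block in which the reduction ratio and layer choices are illustrative assumptions rather than the patent's exact configuration.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel attention (Formula 2) followed by spatial attention (Formula 3)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Shared two-layer network W1(W0(.)) applied to both pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),  # W0
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),  # W1
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)

    def forward(self, f):
        # F' = Sigmoid(W1(W0(F_avg)) + W1(W0(F_max))) * F
        f_avg = f.mean(dim=(2, 3), keepdim=True)          # 1x1xC average pooling
        f_max = f.amax(dim=(2, 3), keepdim=True)          # 1x1xC max pooling
        f1 = torch.sigmoid(self.mlp(f_avg) + self.mlp(f_max)) * f
        # F'' = Sigmoid(conv7x7([F'_avg : F'_max])) * F'
        desc = torch.cat([f1.mean(dim=1, keepdim=True),
                          f1.amax(dim=1, keepdim=True)], dim=1)
        return torch.sigmoid(self.spatial(desc)) * f1
```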
Step 303, designing a context aggregation module:
In this step: the dual-phase remote sensing images contain many woodlands and barren lands whose textures resemble those of the grape planting areas; the planting areas are scattered, with houses, roads and various other ground objects interspersed among them. After feature extraction, the receptive field of each pixel point is small, so the information available for change detection inference is very limited, numerous false or missed detections often occur, and the change information of the grape planting areas is difficult to detect effectively. Conventional downsampling enlarges the receptive field but reduces spatial resolution. The DeepLab series, proposed in 2015, innovatively introduced the Atrous Spatial Pyramid Pooling (ASPP) structure, in which dilated (atrous) convolution enlarges the receptive field while preserving resolution. This suits segmentation tasks well, since an enlarged receptive field can locate the target accurately.
The detailed steps of designing the Context Aggregation Module (CAM) are as follows:
Step 30301, using Formulas 4 to 7, the feature maps on the two feature-extraction branches of the twin ResCBAM model are passed through dilated convolutions with different dilation rates and an atrous spatial pyramid pool, so as to obtain more context information:
F_0 = conv_1×1(F)   (Formula 4)
F_i = conv_3×3^(12i)(F), i = 1, 2, 3   (Formula 5)
F_4 = AvgPool(F)   (Formula 6)
F_ti = [F_0 : F_1 : F_2 : F_3 : F_4], i = 1, 2   (Formula 7)
Wherein:
F represents the feature map;
F_0 represents the layer-0 feature map;
F_1 represents the layer-1 feature map;
F_2 represents the layer-2 feature map;
F_3 represents the layer-3 feature map;
F_4 represents the layer-4 feature map;
F_i represents the feature maps of the different levels;
F_ti represents the image feature map of phase t_i after the atrous spatial pyramid;
i represents the i-th level;
conv_1×1 represents a convolution operation with kernel size 1;
conv_3×3^(12i) represents a dilated convolution with kernel size 3 and dilation rate 12i;
AvgPool represents a global average pooling operation.
This step adopts a simple and effective structure to improve the accuracy of model prediction: the feature pixels gain a larger receptive field and long-range spatial dependence and contain the feature information of more change targets, which benefits the later inference of changed regions.
Step 30302, the feature images of the two periods of the dual-time-phase remote sensing image are subjected to convolution operation to adjust the channel number, and feature fusion is performed through absolute difference values to obtain a fusion feature image F * The fusion characteristic diagram F * The calculation formula of (2) is shown in formula 8.
Wherein:
F * representing a fusion feature map;
a convolution operation with a convolution kernel size of 3 and a channel number of 64 is represented;
representing element subtraction;
the absolute value operation is denoted by the absolute value.
This step designs a context aggregation module suitable for the change detection task, and the module structure is shown in fig. 4.
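A minimal PyTorch sketch of the context aggregation of Formulas 4 to 8 follows; the dilation rates (12, 24, 36) and the 64-channel absolute-difference fusion come from the definitions above, while the per-branch channel count is an illustrative assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextAggregation(nn.Module):
    """ASPP branches (Formulas 4-7) plus absolute-difference fusion (Formula 8)."""
    def __init__(self, in_ch, branch_ch=64):
        super().__init__()
        self.f0 = nn.Conv2d(in_ch, branch_ch, 1)                         # Formula 4
        self.fi = nn.ModuleList(
            nn.Conv2d(in_ch, branch_ch, 3, padding=12 * i, dilation=12 * i)
            for i in (1, 2, 3))                                          # Formula 5
        self.pool_proj = nn.Conv2d(in_ch, branch_ch, 1)                  # Formula 6
        self.fuse = nn.Conv2d(5 * branch_ch, 64, 3, padding=1)           # Formula 8 conv

    def aspp(self, x):
        h, w = x.shape[2:]
        g = self.pool_proj(F.adaptive_avg_pool2d(x, 1))                  # AvgPool branch
        g = F.interpolate(g, size=(h, w), mode='bilinear', align_corners=False)
        branches = [self.f0(x)] + [m(x) for m in self.fi] + [g]
        return torch.cat(branches, dim=1)                                # Formula 7

    def forward(self, f_t1, f_t2):
        a, b = self.aspp(f_t1), self.aspp(f_t2)
        return torch.abs(self.fuse(a) - self.fuse(b))                    # Formula 8
```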
Step 304, designing a depth feature fusion module:
In this step: during deep learning processing, the feature maps generated at the early stage of feature extraction have little overlap between the receptive fields of neighboring pixels and contain more spatial detail; as extraction goes deeper, the receptive field keeps growing and each pixel then represents a region rich in semantic information but lacking granularity. Extracting planting-area features at a single scale therefore makes accuracy hard to guarantee. Grape planting area remote sensing images contain many small regions; as the change detection model deepens, the generated high-level abstract features cannot express the edge details of the planting areas well, and the feature maps may not even contain the information of small planting regions. Directly upsampling such features lacks the detail information of the image and lowers change detection accuracy, so a cross-layer structure must be designed for depth feature fusion.
The detailed steps for designing the Depth Feature Fusion Module (DFFM) are as follows:
Step 30401, each layer's feature map in the ResCBAM feature extraction network is input into the depth feature fusion module; since the feature maps differ in size, their sizes are recovered by an upsampling operation while fusing each level with the adjacent deeper one. The calculation formula is shown in Formula 9:
M′_di = M_di + Up(M′_d(i+1))   (Formula 9)
Wherein:
M′_di represents the feature map fused with the features of the previous layer;
M_di represents the feature maps of different depths;
Up represents an upsampling operation;
i represents the i-th level.
Step 30402, the M′_di of different depths are concatenated to further fuse the depth features, finally obtaining the depth fusion feature map M_cd. The calculation formula of M_cd is shown in Formula 10:
M_cd = [M′_d1 : M′_d2 : ⋯ : M′_dn]   (Formula 10)
This approach not only retains the bottom-layer edge detail but also contains rich semantic features, greatly improving the completeness and detail of grape planting area change detection.
The schematic structure of the depth feature fusion module designed in this step is shown in fig. 5.
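A minimal PyTorch sketch of this cross-layer fusion follows, using the reconstruction above: top-down upsample-and-add per Formula 9, then concatenation per Formula 10. The element-wise addition and the 1×1 lateral convolutions are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthFeatureFusion(nn.Module):
    """Fuse backbone feature maps of different depths (Formulas 9 and 10)."""
    def __init__(self, channels, mid_ch=64):
        super().__init__()
        # 1x1 convs bring every depth to a common channel count so maps can be added.
        self.lateral = nn.ModuleList(nn.Conv2d(c, mid_ch, 1) for c in channels)

    def forward(self, feats):
        # feats: list ordered shallow -> deep; deeper maps are spatially smaller.
        maps = [lat(f) for lat, f in zip(self.lateral, feats)]
        fused = [maps[-1]]                         # deepest level M'_dn
        for m in reversed(maps[:-1]):              # M'_di = M_di + Up(M'_d(i+1))
            up = F.interpolate(fused[0], size=m.shape[2:], mode='bilinear',
                               align_corners=False)
            fused.insert(0, m + up)
        target = fused[0].shape[2:]                # resample all maps to the
        fused = [F.interpolate(m, size=target,     # shallowest resolution
                               mode='bilinear', align_corners=False)
                 for m in fused]
        return torch.cat(fused, dim=1)             # M_cd = [M'_d1 : ... : M'_dn]
```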
Step four, model training and storage:
and (3) training and storing the ResCBAM model obtained in the step (III) by adopting the change detection data set obtained in the step (II).
Step 401, model training:
In this step, the encoder part uses transfer learning, initializing the backbone network weights with a network pre-trained on the ImageNet data set. The hardware platform for the experiments is an Intel(R) Xeon(R) Gold 6226R CPU, the operating system is Ubuntu 20.04.4 LTS, two NVIDIA GeForce RTX 3090 graphics cards are mounted, and the PyTorch deep learning framework is used to build and debug the model.
The experimental hyperparameters are set as follows: the training batch size is 8 with dual-card parallel training, the maximum number of training epochs is 150, the learning rate is dynamically adjusted with a Poly strategy, the initial learning rate is set to 10⁻³, and Adam is used as the optimization algorithm.
The loss function is the BCE (binary cross-entropy) loss; the specific calculation is shown in Formula 11:
L_BCE = −[y log ŷ + (1 − y) log(1 − ŷ)]   (Formula 11)
Wherein:
L_BCE represents the BCE loss function;
y represents the true value;
ŷ represents the predicted value.
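A minimal PyTorch sketch of this training setup follows: batch size 8, 150 epochs, Adam with initial learning rate 10⁻³, a poly schedule and BCE loss. `model` and `train_loader` are assumed to exist, and the poly power of 0.9 is a common default assumed here rather than a value stated in the text.

```python
import torch

def poly_lr(base_lr, step, total, power=0.9):
    """Poly schedule: decay from base_lr toward 0 over `total` steps."""
    return base_lr * (1.0 - step / total) ** power

def train(model, train_loader, device, epochs=150, base_lr=1e-3):
    criterion = torch.nn.BCEWithLogitsLoss()   # BCE of Formula 11, on raw logits
    optimizer = torch.optim.Adam(model.parameters(), lr=base_lr)
    total, step = epochs * len(train_loader), 0
    for epoch in range(epochs):
        for img_t1, img_t2, label in train_loader:  # dual-phase tiles + change mask
            for group in optimizer.param_groups:
                group["lr"] = poly_lr(base_lr, step, total)
            logits = model(img_t1.to(device), img_t2.to(device))
            # label assumed already scaled to {0, 1} (255 -> 1)
            loss = criterion(logits, label.to(device).float())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            step += 1
```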
Several models with higher scores on the validation set are selected and their generalization ability is verified on the test set; the model with the highest detection accuracy, the most complete extracted change-region subject and the best edge details is obtained and saved. It can be applied directly to the actual production planning of grape planting areas, providing instructive data support for the sustainable development of agriculture.
Step 402, model preservation:
The models with higher scores are selected and tested on the test set to screen out the model with the best robustness and generalization, ensuring that the extracted change regions are more complete and the edge details clearer. The model is saved as a .pth weight file and deployed to a server; by calling the model, actual production can be guided directly and grape planting areas planned in fine detail.
Step five, detecting the change of remote sensing images of grape planting areas:
Preprocessing the dual-phase remote sensing images of the region to be detected in the grape planting area according to the remote sensing image data processing method of step one, aligning their geographic coordinate systems, and cutting the images into tiles of the same pixel size; inputting the tiles into the ResCBAM model trained in step four, calling the weights of the pre-saved trained ResCBAM model, predicting over the region in a sliding-window manner, and stitching the small patches together, thereby obtaining a change-region extraction image with the same pixel size as the original image.
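A minimal sketch of the sliding-window prediction and stitching of step five follows; non-overlapping windows, a 0.5 threshold and image sides that are multiples of the tile size are simplifying assumptions, and `model` stands for the trained ResCBAM with its saved .pth weights loaded.

```python
import numpy as np
import torch

@torch.no_grad()
def predict_region(model, img_t1, img_t2, tile=512, device="cuda"):
    """Slide a window over the co-registered image pair and stitch the
    per-tile change maps back to the full image size (255 = changed)."""
    h, w = img_t1.shape[:2]                    # (H, W, C) arrays; pad first if
    out = np.zeros((h, w), dtype=np.uint8)     # H, W are not multiples of tile
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            a = img_t1[y:y + tile, x:x + tile]
            b = img_t2[y:y + tile, x:x + tile]
            ta = torch.from_numpy(a).permute(2, 0, 1)[None].float().to(device)
            tb = torch.from_numpy(b).permute(2, 0, 1)[None].float().to(device)
            prob = torch.sigmoid(model(ta, tb))[0, 0].cpu().numpy()
            out[y:y + tile, x:x + tile] = (prob > 0.5).astype(np.uint8) * 255
    return out
```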
Application example:
This application example applies the grape planting area remote sensing image change detection method based on depth feature fusion of the embodiment.
First, experimental data:
table 1 high score satellite number 2 payload parameters
The experimental site of this application example is located in the middle of the Ningxia Plain on the upper reaches of the Yellow River, at 105°46'-106°8'E and 38°28'-38°44'N, in Yinchuan City of the Ningxia Hui Autonomous Region. The area has an arid climate and lies on an alluvial plain at 1010-1150 m elevation, with 203 mm mean annual precipitation, a 10-15 °C day-night temperature difference and 2800-3000 h of sunshine per year. The climate suits grape growth, the planting areas are large, and the superior natural conditions have made it one of the world's prime zones for growing high-quality grapes. The two phases of the dual-phase remote sensing imagery were acquired on August 4, 2016 and July 25, 2019 respectively; each image is 14351 × 29124 pixels, with parameters as listed in Table 1. The grape planting areas in the study region span a large area, and the land cover is diverse, including wheat, corn, goji berry and other types of orchards, so the problem of "same object, different spectra; different objects, same spectrum" arises among land cover types with similar strip textures. Because of the long time span between the images, the same region can differ in brightness and color at different times, and the spectral characteristics of the grape planting areas are not fixed. These problems can seriously affect the subsequent change detection results.
The grape planting areas of the 2016 and 2019 remote sensing images are labeled separately, and a difference operation on the generated label images yields the change detection label, in which changed pixels have value 255 and unchanged pixels have value 0. After pairing with the two phase images, the imagery is cut by sliding window into 512 × 512-pixel tiles to obtain the final change detection image-label pairs.
Second, experimental protocol:
In the invention, the encoder part adopts transfer learning, initializing the backbone network weights with a network pre-trained on the ImageNet data set. The hardware platform for the experiments is an Intel(R) Xeon(R) Gold 6226R CPU, the operating system is Ubuntu 20.04.4 LTS, two NVIDIA GeForce RTX 3090 graphics cards are mounted, and the PyTorch deep learning framework is used to build and debug the model. The experimental hyperparameters are set as follows: the training batch size is 8 with dual-card parallel training, the maximum number of training epochs is 150, the learning rate is dynamically adjusted with a Poly strategy, the initial learning rate is set to 10⁻³, and Adam is used as the optimization algorithm.
Comparative example 1:
This comparative example shows a grape planting area change detection method that differs from the embodiment and the application example only in that the convolutional attention module, the context aggregation module and the depth feature fusion module in step three of the embodiment are removed; the other steps are the same as the embodiment.
Comparative example 2:
this comparative example shows a method for detecting a change in a grape planting area, which differs from the examples and application examples only in that the present comparative example eliminates the convolution attention module in step three of the examples, and other steps are the same as the examples.
Comparative example 3:
this comparative example shows a method for detecting a change in a grape planting area, which differs from the examples and application examples only in that this comparative example eliminates the context aggregation module in step three of the examples, and other steps are the same as the examples.
Comparative example 4:
the comparative example shows a method for detecting the change of the grape planting area, which is different from the examples and application examples only in that the comparative example omits the depth feature fusion module in the third step in the examples, and other steps are the same as the examples.
Experimental verification:
The high-precision end-to-end change detection method of the embodiment and application example of the invention is compared with mainstream change detection algorithms on the extraction of changed regions in the grape planting area. The results are shown in Table 2.
Table 2: Grape planting area change detection results of the different comparative methods
Remarks: f1 represents F1-score, which balances Precision and Recall such that both are highest.
Analysis of results:
The experimental results of comparative examples 1 to 4 and the inventive embodiment are shown in Table 2. From the results of comparative example 1, comparative example 2 and the inventive embodiment, it can be seen that the convolutional attention module increases the representational power of feature extraction by suppressing pixels unrelated to the change. This shows that adding the attention module at the encoder stage effectively handles the heterogeneity of the dual-phase images, adaptively learns the features of different plots and focuses on change-related features, which helps raise change detection accuracy. From the results of comparative example 1, comparative example 3 and the inventive embodiment, the proposed context aggregation module constructs dense contextual information, giving feature map pixels a larger receptive field and long-range spatial dependence; they thus contain the feature information of more change targets and provide rich inference information for the decoder stage. From the results of comparative example 1, comparative example 4 and the inventive embodiment, the proposed depth feature fusion module fuses high- and low-level depth feature information, avoids the influence of semantic gaps, and uses cross-layer connections to fuse low-level semantic features with high-level abstract features, completely expressing the change information of the grape planting areas and increasing change detection accuracy for planting areas of widely differing sizes.
Comparative example 5:
This comparative example compares the change detection model based on depth feature fusion with the most advanced current change detection methods, including the SNUNet, A2Net, DSIFN and ResNet-CD detection methods. The SNUNet method proposes a densely connected twin (Siamese) network for change detection and constructs an ensemble channel attention module (ECAM) to fuse high- and low-level features, alleviating the loss of shallow information. The A2Net method constructs a lightweight change detection network that effectively aggregates multi-level features from high level to low level, improving detection accuracy. The DSIFN method constructs a deeply supervised fusion network: the extracted features are fed into a difference discrimination network to extract difference features, which are fused with multi-scale depth features to improve accuracy. The ResNet-CD method realizes change detection with a basic ResNet backbone and extracts relatively complete change regions. All of the above change detection models were trained on the created Ningxia grape data set. The results are shown in Table 3.
Table 3: Grape planting area change detection results of the different comparison methods in comparative example 5
Analysis of results:
tables 3 and 6 show the results of detection of the change in the grape planting area by the different methods in comparative example 5, in which the true example TP is shown in white, the true example TN is shown in black, the false positive example FP is shown in red, and the false negative example FN is shown in blue. It can be seen that due to heterogeneity of the dual-phase images, a large number of color and brightness differences can influence the result of variation detection, SNUNet and A2Net can not well detect the variation region of the grape area main body, and can erroneously detect the field with color variation in the research region as the varied grape area. In addition, a plurality of fields and barren lands similar to the textures of the grape planting areas exist in the images, a large number of false detection and missed detection exist in detection results, resNet-CD, SNUNet, A Net and DSIFN both erroneously detect some fields and ridges similar in textures into change areas and can miss detection part of the change areas. Meanwhile, a large amount of missed detection exists in the detection of the change of the grape planting area by ResNet-CD, SNUNet, A Net and DSIFN, the deep fusion module designed by the method well avoids the problem because the deep fusion module does not well fuse depth characteristics and lacks detailed texture information of the bottom layer, so that the detected change area theme does not contain holes, and the edge details of the detected change area theme are more perfect, therefore, the grape planting area change detection method provided by the invention has certain superiority, can be applied to actual production, and provides a new solution idea for agricultural development planning.

Claims (5)

1. A method for detecting remote sensing image change of a grape planting area based on depth feature fusion is characterized by comprising the following steps:
step one, processing Gaofen-2 (GF-2) remote sensing image data;
step two, constructing a change detection data set;
step three, constructing a change detection model:
step 301, selecting ResNet101 as a backbone network;
step 302, a ResCBAM model is built:
step 30201, obtaining channel attention: spatial average pooling and max pooling are applied to the H × W × C feature map F to obtain two 1 × 1 × C global feature maps F_avg and F_max;
wherein:
H represents the height of the input picture;
W represents the width of the input picture;
C represents the number of channels of the input picture;
step 30202, sending the global feature maps F_avg and F_max into a ResNet network containing two layers; after the two resulting feature maps are obtained, a Sigmoid function generates the final weight coefficient, which is multiplied by the feature map F to obtain the new feature map F′, the calculation formula of which is shown in Formula 2:
F′ = Sigmoid(W_1(W_0(F_avg)) + W_1(W_0(F_max))) ⊗ F   (Formula 2);
wherein:
F′ represents the new feature map;
Sigmoid(·) represents the Sigmoid function operation;
W_0 represents the layer-0 weight parameters;
W_1 represents the layer-1 weight parameters;
⊗ represents element-wise multiplication;
F_avg represents the average-pooled feature map of feature map F;
F_max represents the max-pooled feature map of feature map F;
step 30203, introducing the spatial attention module on the data of the new feature map F′ to focus on information at different positions within the same channel; for the new feature map F′, average pooling and max pooling are performed along the channel dimension to obtain F′_avg and F′_max, forming two H × W × 1 spatial attention descriptors, which are concatenated and passed through a convolution layer and a Sigmoid activation function to generate a weight coefficient; the weight coefficient is multiplied by the new feature map F′ to obtain the scaled new feature map F″, the calculation formula of which is shown in Formula 3:
F″ = Sigmoid(conv_7×7([F′_avg : F′_max])) ⊗ F′   (Formula 3);
wherein:
F″ represents the scaled new feature map;
Sigmoid(·) represents the Sigmoid function operation;
conv_7×7 represents a convolution operation with kernel size 7;
[:] represents the concatenation operation;
⊗ represents element-wise multiplication;
F′_avg represents the average-pooled feature map of the new feature map F′;
F′_max represents the max-pooled feature map of the new feature map F′;
step 30204, finally combining the channel attention and the spatial attention in a sequential manner, and merging them into the ResNet network;
step 303, designing a context aggregation module;
step 304, designing a depth feature fusion module;
step four, model training and storage:
training and storing the ResCBAM model obtained in the third step by adopting the change detection data set obtained in the second step;
step five, detecting the change of remote sensing images of grape planting areas:
preprocessing the dual-phase remote sensing images of the region to be detected in the grape planting area according to the remote sensing image data processing method of step one, aligning their geographic coordinate systems, and cutting the images into tiles of the same pixel size; inputting the tiles into the ResCBAM model trained in step four, calling the weights of the pre-saved trained ResCBAM model, predicting over the region in a sliding-window manner, and stitching the small patches together, thereby obtaining a change-region extraction image with the same pixel size as the original image.
2. The method for detecting the change of the remote sensing image of the grape planting area based on the depth feature fusion as set forth in claim 1, wherein the specific process of the step one is as follows:
respectively performing radiometric calibration, atmospheric correction, orthorectification and image registration on the multispectral and panchromatic images captured by the Gaofen-2 (GF-2) satellite, and then performing image fusion to obtain fused image data with 1 m spatial resolution and four bands (red, green, blue, near-infrared).
3. The method for detecting the change of the remote sensing image of the grape planting area based on the depth feature fusion as set forth in claim 1, wherein the specific process of the second step is as follows:
step 201, performing vectorized labeling on the two periods of remote sensing images of the dual-phase imagery to generate label images with the same size and spatial resolution as the originals; performing a difference operation on the two periods of label images to obtain the change detection label, in which changed pixels have value 255 and unchanged pixels have value 0; and, after pairing the labels with the two phase images, cutting them with a sliding window into 512 × 512-pixel tiles at an overlap rate of 0.6 to obtain the final change detection image-label pairs;
step 202, expanding the data set with data enhancement techniques: removing some unchanged image-label pairs to obtain 1254 image-label pairs, generating differentiated data by data enhancement, and applying rotation and flipping transforms to obtain 10072 image-label pairs; dividing these into a training set, a validation set and a test set at a ratio of 9:2:2, wherein the training set contains 7016 image-label pairs and the validation and test sets each contain 1528; the training set is used to train the model, and the validation and test sets are used to verify its generalization ability.
4. The method for detecting the change of the remote sensing image of the grape planting area based on the depth feature fusion as set forth in claim 1, wherein the specific process of step 303 is as follows:
step 30301, adopting Formulas 4 to 7, passing the feature maps on the two feature-extraction branches of the twin ResCBAM model through dilated convolutions with different dilation rates and an atrous spatial pyramid pool to obtain more context information:
F_0 = conv_1×1(F)   (Formula 4);
F_i = conv_3×3^(12i)(F), i = 1, 2, 3   (Formula 5);
F_4 = AvgPool(F)   (Formula 6);
F_ti = [F_0 : F_1 : F_2 : F_3 : F_4], i = 1, 2   (Formula 7);
wherein:
F represents the feature map;
F_0 represents the layer-0 feature map;
F_1 represents the layer-1 feature map;
F_2 represents the layer-2 feature map;
F_3 represents the layer-3 feature map;
F_4 represents the layer-4 feature map;
F_i represents the feature maps of the different levels;
F_ti represents the image feature map of phase t_i after the atrous spatial pyramid;
i represents the i-th level;
conv_1×1 represents a convolution operation with kernel size 1;
conv_3×3^(12i) represents a dilated convolution with kernel size 3 and dilation rate 12i;
AvgPool represents a global average pooling operation;
step 30302, adjusting the channel numbers of the feature maps of the two periods of the dual-phase remote sensing image by a convolution operation, and performing feature fusion through the absolute difference to obtain the fused feature map F*, the calculation formula of which is shown in Formula 8:
F* = | conv_3×3^64(F_t1) ⊖ conv_3×3^64(F_t2) |   (Formula 8);
wherein:
F* represents the fused feature map;
conv_3×3^64 represents a convolution operation with kernel size 3 and 64 output channels;
⊖ represents element-wise subtraction;
| · | represents the absolute value operation.
5. The method for detecting the change of the remote sensing image of the grape planting area based on the depth feature fusion as set forth in claim 1, wherein the specific process of the step 304 is as follows:
step 30401, inputting each layer's feature map of the ResCBAM feature extraction network into the depth feature fusion module; since the feature maps differ in size, their sizes are recovered by an upsampling operation while fusing each level with the adjacent deeper one, the calculation formula being shown in Formula 9:
M′_di = M_di + Up(M′_d(i+1))   (Formula 9);
wherein:
M′_di represents the feature map fused with the features of the previous layer;
M_di represents the feature maps of different depths;
Up represents an upsampling operation;
i represents the i-th level;
step 30402, concatenating the M′_di of different depths to further fuse the depth features and finally obtain the depth fusion feature map M_cd, the calculation formula of which is shown in Formula 10:
M_cd = [M′_d1 : M′_d2 : ⋯ : M′_dn]   (Formula 10).

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311526699.0A CN117611996A (en) 2023-11-15 2023-11-15 Grape planting area remote sensing image change detection method based on depth feature fusion


Publications (1)

Publication Number Publication Date
CN117611996A (en) 2024-02-27

Family

ID=89947365


Country Status (1)

Country Link
CN (1) CN117611996A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117933309A (en) * 2024-03-13 2024-04-26 西安理工大学 Three-path neural network and method for detecting change of double-phase remote sensing image
CN117911879A (en) * 2024-03-20 2024-04-19 湘潭大学 SAM-fused fine-granularity high-resolution remote sensing image change detection method
CN117911879B (en) * 2024-03-20 2024-06-07 湘潭大学 SAM-fused fine-granularity high-resolution remote sensing image change detection method


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination