CN114359120B - Remote sensing image processing method, device, equipment and storage medium - Google Patents

Remote sensing image processing method, device, equipment and storage medium Download PDF

Info

Publication number
CN114359120B
CN114359120B
Authority
CN
China
Prior art keywords
remote sensing
image
sensing image
semantic features
output image
Prior art date
Legal status
Active
Application number
CN202210275055.8A
Other languages
Chinese (zh)
Other versions
CN114359120A (en)
Inventor
黄军文
汤红
赵士红
Current Assignee
Shenzhen Huafu Technology Co ltd
Original Assignee
Shenzhen Huafu Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Huafu Information Technology Co ltd filed Critical Shenzhen Huafu Information Technology Co ltd
Priority to CN202210275055.8A
Publication of CN114359120A
Application granted
Publication of CN114359120B

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a remote sensing image processing method, device, equipment and storage medium. The method comprises: performing image segmentation on a remote sensing image based on a pre-trained UNet 3+ network to generate a first output image; capturing an area image related to a predetermined target in the first output image based on a spatial attention network as a second output image of the predetermined target; and performing morphological image processing on the second output image to obtain a segmentation map of the predetermined target. Compared with existing remote sensing image land-parcel segmentation techniques, the method yields a clearer segmentation map of the predetermined target and resolves the discontinuities that appear in elongated road images in remote sensing imagery.

Description

Remote sensing image processing method, device, equipment and storage medium
Technical Field
The present invention relates to the field of remote sensing image processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for processing a remote sensing image.
Background
Remote sensing image land-parcel segmentation analyzes a remote sensing image at the pixel level and then extracts and classifies the features of interest in the image; the technique has high practical value in fields such as urban and rural planning and flood prevention and disaster relief. However, processing a remote sensing image with existing land-parcel segmentation techniques often leaves partial images of roads or rivers in the remote sensing image discontinuous.
Disclosure of Invention
The invention mainly aims to provide a remote sensing image processing method, apparatus, device and computer-readable storage medium, so as to solve the technical problem that images of roads or rivers in existing remote sensing images are partially discontinuous.
In order to achieve the above object, the present invention provides a method for processing a remote sensing image, comprising:
carrying out image segmentation on the remote sensing image based on a pre-trained UNet 3+ network to generate a first output image;
capturing a region image related to a predetermined target in the first output image as a second output image of the predetermined target based on a spatial attention network;
and performing morphological image processing on the second output image to obtain a non-discontinuous target segmentation image.
In order to achieve the above object, the present invention also provides a remote sensing image processing apparatus including:
the image segmentation module is used for carrying out image segmentation on the remote sensing image based on a pre-trained UNet 3+ network to generate a first output image;
a spatial attention module for capturing the region image related to a predetermined target in the first output image as a second output image of the predetermined target based on a spatial attention network;
and the post-processing module is used for carrying out morphological image processing on the second output image to obtain a non-discontinuous target segmentation image.
In addition, to achieve the above object, the present invention further provides a remote sensing image processing device, which includes a processor, a memory, and a remote sensing image processing program stored on the memory and executable by the processor, wherein when the remote sensing image processing program is executed by the processor, the steps of the remote sensing image processing method as described above are implemented.
In addition, to achieve the above object, the present invention further provides a computer readable storage medium, on which a remote sensing image processing program is stored, wherein when the remote sensing image processing program is executed by a processor, the steps of the remote sensing image processing method as described above are implemented.
The invention provides a remote sensing image processing method that performs image segmentation on a remote sensing image based on a pre-trained UNet 3+ network to generate a first output image, captures an area image related to a predetermined target in the first output image based on a spatial attention network as a second output image of the predetermined target, and performs morphological image processing on the second output image to obtain an uninterrupted target segmentation map. In this way, the remote sensing image is input into the pre-trained UNet 3+ network, which generates the first output image by image segmentation; the first output image is input into the spatial attention network, which captures the area image related to the predetermined target and generates the second output image of the predetermined target; morphological image processing is then performed on the second output image to obtain a segmentation map of the predetermined target with complete image content. The segmentation map of the predetermined target is thus generated based on the UNet 3+ network and the spatial attention network.
Drawings
Fig. 1 is a schematic diagram of a hardware structure of a remote sensing image processing device according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a remote sensing image processing method according to a first embodiment of the present invention;
FIG. 3 is a schematic flow chart of a remote sensing image processing method according to a second embodiment of the present invention;
FIG. 4 is a schematic flow chart of a remote sensing image processing method according to a third embodiment of the present invention;
FIG. 5 is a schematic flow chart of a remote sensing image processing method according to a fourth embodiment of the present invention;
fig. 6 is a functional module schematic diagram of an embodiment of the remote sensing image processing apparatus according to the invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The remote sensing image processing method related to the embodiment of the invention is mainly applied to remote sensing image processing equipment, and the remote sensing image processing equipment can be equipment with display and processing functions, such as a PC (personal computer), a portable computer, a mobile terminal and the like.
Referring to fig. 1, fig. 1 is a schematic diagram of a hardware structure of a remote sensing image processing device according to an embodiment of the present invention. In the embodiment of the present invention, the remote sensing image processing device may include a processor 1001 (e.g., a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used for implementing connection communication among the components; the user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); the network interface 1004 may optionally include a standard wired interface or a wireless interface (e.g., a WI-FI interface); the memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory), and optionally the memory 1005 may be a storage device independent of the processor 1001.
Those skilled in the art will appreciate that the hardware configuration shown in fig. 1 does not constitute a limitation of the remote sensing image processing apparatus, and may include more or fewer components than those shown, or some components in combination, or a different arrangement of components.
With continued reference to fig. 1, the memory 1005 of fig. 1, which is a computer-readable storage medium, may include an operating system, a network communication module, and a remote sensing image processing program.
In fig. 1, the network communication module is mainly used for connecting a server and performing data communication with the server; the processor 1001 may call the remote sensing image processing program stored in the memory 1005, and execute the remote sensing image processing method according to the embodiment of the present invention.
The embodiment of the invention provides a remote sensing image processing method.
Referring to fig. 2, fig. 2 is a flowchart illustrating a remote sensing image processing method according to a first embodiment of the present invention.
In this embodiment, the remote sensing image processing method includes the following steps:
step S10, image segmentation is carried out on the remote sensing image based on a pre-trained UNet 3+ network, and a first output image is generated;
in this embodiment, the remote sensing image and the corresponding first output image are used as training data in advance, a full convolution network UNet 3+ (a further improved UNet network) network is trained, the UNet 3+ network includes an encoder and a decoder, the encoder is used as an input layer of the UNet 3+ network, the decoder is used as an output layer of the UNet 3+ network, the UNet 3+ network has multiple levels, the encoder and the decoder are multiple, a plurality of encoders and a plurality of decoders are connected through hopping, and the decoders are connected with each other.
In order to learn hierarchical representations from the remote sensing image, the UNet 3+ network further adopts deep supervision: each decoder stage in the UNet 3+ network has a side output, which is compared with the GT (ground truth) to calculate a loss, thereby realizing full-scale supervision.
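As a hedged illustration of this deep-supervision scheme, the following Python/PyTorch sketch aggregates a cross-entropy loss over every decoder side output; the function and argument names are hypothetical, and the choice of cross-entropy per stage is an assumption rather than a value taken from the patent.

```python
import torch
import torch.nn.functional as F

def deep_supervision_loss(side_outputs, ground_truth):
    """Full-scale deep supervision: every decoder side output is compared
    with the ground-truth mask and the per-stage losses are averaged.

    side_outputs  -- list of tensors [B, C, h_i, w_i], one per decoder stage
    ground_truth  -- tensor [B, H, W] of integer class labels
    """
    total = 0.0
    for logits in side_outputs:
        # Up-sample each side output to the ground-truth resolution
        logits = F.interpolate(logits, size=ground_truth.shape[-2:],
                               mode="bilinear", align_corners=False)
        total = total + F.cross_entropy(logits, ground_truth)
    return total / len(side_outputs)
```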
Specifically, the remote sensing image is input into the pre-trained UNet 3+ network, which predicts the pixel category of every pixel of any target in the remote sensing image, thereby realizing target extraction; a segmented target is composed of a plurality of mutually connected pixels. After the UNet 3+ network has judged the pixel categories of the targets in the remote sensing image, the first output image is obtained.
Step S20, capturing an area image related to a predetermined target in the first output image as a second output image of the predetermined target based on a spatial attention network;
specifically, based on a spatial attention network, the first output image is input to the spatial attention network, maximum pooling and average pooling are performed on the first output image respectively to obtain two feature descriptions with different resolutions, the two feature descriptions are spliced together according to the spatial attention network to generate a feature description diagram, the feature description diagram is input to an A × A convolutional layer, meanwhile, an activation function is Sigmoid to obtain a weight coefficient Ms, Ms × input first output image pixel points × feature description diagram, and a second output image of the predetermined target is obtained.
Step S30, performing morphological image processing on the second output image to obtain an uninterrupted target segmentation map. Morphological image processing is an image processing technique used to extract from the second output image the image components that are meaningful for expressing and delineating the shape of the target region, such as the shape features of the target object, road boundaries and connected regions, while applying techniques such as thinning, pixelation and burr pruning to the second output image.
Specifically, a dilation operation is performed on the second output image, followed by an erosion operation on the dilated image. Boundary tracking analysis is then performed on the eroded image to obtain a second output image containing the contour features and area features of roads or rivers. Small isolated pixel particles are removed from the boundary-tracked image according to a preset threshold, the result is dilated again, and the dilated image is thinned to obtain the segmentation map of the predetermined target, where thinning means reducing the lines of the image from multi-pixel width to unit-pixel width.
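A minimal sketch of this post-processing pipeline, assuming OpenCV and scikit-image for the morphological operators; the kernel size and the small-particle area threshold are illustrative assumptions, not values from the patent.

```python
import cv2
import numpy as np
from skimage import morphology

def postprocess_mask(mask, min_area=64, kernel_size=5):
    """Post-processing sketch for the second output image.
    mask -- uint8 binary mask (0/255); kernel_size and min_area are assumptions."""
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_size, kernel_size))
    # Closing = dilation followed by erosion: fills holes and joins fractures
    closed = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Remove isolated pixel particles smaller than the preset threshold
    cleaned = morphology.remove_small_objects(closed.astype(bool), min_size=min_area)
    # Dilate again so that nearby fragments of elongated targets reconnect
    dilated = cv2.dilate(cleaned.astype(np.uint8) * 255, kernel)
    # Thinning reduces lines from multi-pixel width down to unit-pixel width
    thinned = morphology.skeletonize(dilated > 0)
    return thinned.astype(np.uint8) * 255
```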
Further, based on the embodiment shown in fig. 2, in this embodiment, before the step S10, the method further includes:
and carrying out image data enhancement on the original remote sensing image to obtain at least two remote sensing images.
Specifically, the image data enhancement of the original remote sensing image to obtain at least two remote sensing images includes:
and enhancing the image data of the original remote sensing image by at least one of the following modes: resampling the remote sensing image, horizontally turning, vertically turning, and scaling by 0.5-2 times.
In this embodiment, image data enhancement is performed on the original remote sensing images to obtain at least two remote sensing images, which expands the diversity of the remote sensing image data, avoids over-fitting of the UNet 3+ network, and reduces errors outside the set range when the UNet 3+ network processes remote sensing images.
In this embodiment, the remote sensing image processing method performs image segmentation on the remote sensing image based on a pre-trained UNet 3+ network to generate a first output image, captures an area image related to a predetermined target in the first output image based on a spatial attention network as a second output image of the predetermined target, and performs morphological image processing on the second output image to obtain an uninterrupted target segmentation map. In this way, the remote sensing image is input into the pre-trained UNet 3+ network, which generates the first output image by image segmentation; the first output image is input into the spatial attention network, which captures the area image related to the predetermined target and generates the second output image of the predetermined target; morphological image processing is then performed on the second output image to obtain a segmentation map of the predetermined target with complete image content, so that the segmentation map of the predetermined target is generated based on the UNet 3+ network and the spatial attention network.
Referring to fig. 3, fig. 3 is a schematic flow chart of a remote sensing image processing method according to a second embodiment of the invention.
Based on the foregoing embodiment shown in fig. 2, in this embodiment, the step S10 specifically includes:
step S11, generating at least two remote sensing characteristic graphs of the remote sensing image based on the encoder of the UNet 3+ network;
specifically, an encoder based on the UNet 3+ network extracts features of all targets of the remote sensing image, and generates a remote sensing feature map according to the extracted features, wherein all targets refer to objects such as roads, rivers, houses and mountains in the remote sensing image.
Step S12, acquiring two remote sensing feature maps which are respectively used as a first remote sensing feature map and a second remote sensing feature map, and respectively capturing a coarse-grained semantic feature and/or a fine-grained semantic feature in the first remote sensing feature map and a coarse-grained semantic feature and/or a fine-grained semantic feature in the second remote sensing feature map by a UNet 3+ network-based decoder;
the coarse-grained semantics refer to different categories of targets in the feature map, the fine-grained semantics refer to subcategories of large categories in the feature map, the large categories refer to different categories in the feature map, such as differences between rivers and forests and houses, and the subcategories refer to similar categories, such as creeks and large rivers.
Step S13, based on the decoder, fusing the coarse-grained semantic features or the fine-grained semantic features in the first remote sensing feature map with the coarse-grained semantic features or the fine-grained semantic features in the second remote sensing feature map to generate the first output image;
the fusion is to generate the first output image by gradually sampling coarse-grained semantic features from top to bottom and splicing the coarse-grained semantic features with fine-grained semantic features corresponding to the resolution, wherein the spliced first output image simultaneously has the coarse-grained semantic features and the fine-grained semantic features, and the coarse-grained semantic features and the fine-grained semantic features in the first output image are respectively from different remote sensing feature maps.
It can be understood that the fusion is the fusion of semantic features between different remote sensing feature maps, and may be the fusion of coarse-grained semantic features in the first remote sensing feature map and coarse-grained semantic features in the second remote sensing feature map, or the fusion of coarse-grained semantic features in the first remote sensing feature map and fine-grained semantic features in the second remote sensing feature map.
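A hedged sketch of this kind of fusion, assuming PyTorch; the channel counts and the use of a 3 × 3 convolution after concatenation are illustrative assumptions rather than the patented layer configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FuseBlock(nn.Module):
    """Minimal sketch of decoder-side fusion: coarse-grained semantic features
    from a low-resolution feature map are up-sampled and concatenated with
    fine-grained features at the matching resolution, then mixed by a conv."""
    def __init__(self, coarse_ch=256, fine_ch=64, out_ch=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(coarse_ch + fine_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, coarse, fine):
        # Up-sample coarse features to the spatial size of the fine features
        coarse = F.interpolate(coarse, size=fine.shape[-2:], mode="bilinear",
                               align_corners=False)
        # Concatenate along the channel axis, then mix with a 3x3 convolution
        return self.conv(torch.cat([coarse, fine], dim=1))
```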
Further, based on the embodiment shown in fig. 3, in this embodiment, before the step S20, the method further includes:
and acquiring a difference value of the first output image and the second output image based on a branch loss function in a spatial attention network, and adjusting weight coefficients in the UNet 3+ network and the spatial attention network according to the difference value.
Wherein the branch loss function of the spatial attention network is Loss1 = 0.6 × Softmax Loss + 0.4 × Lovasz-Softmax Loss.
Wherein the output layer of the spatial attention network is connected with the branch loss function of the spatial attention network.
In this embodiment, the features of the first output image and the weight coefficients are substituted into the branch loss function Loss1 of the spatial attention network for calculation, where the first output image has been converted by the encoder into values that the computer can recognize in both the UNet 3+ network and the spatial attention network. The result of the Loss1 calculation is compared with the features of the first output image to obtain a difference value, the difference value is propagated back through the nodes of the spatial attention network, and the weight coefficients in the spatial attention network are adjusted.
Through the Loss1 loss function, the weight coefficients of the spatial attention network can be adjusted automatically, reducing the error of the spatial attention network in processing images so that it tends towards a preset range.
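A minimal sketch of this branch loss, assuming PyTorch and a compact re-implementation of the Lovasz-Softmax term in the spirit of the public reference formulation (Berman et al.); it illustrates the 0.6/0.4 weighting and is not the patented implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def _lovasz_grad(gt_sorted):
    """Gradient of the Lovasz extension of the Jaccard loss."""
    gts = gt_sorted.sum()
    intersection = gts - gt_sorted.cumsum(0)
    union = gts + (1.0 - gt_sorted).cumsum(0)
    jaccard = 1.0 - intersection / union
    if gt_sorted.numel() > 1:
        jaccard[1:] = jaccard[1:] - jaccard[:-1]
    return jaccard

def lovasz_softmax(probas, labels):
    """probas: [N, C] softmax probabilities; labels: [N] integer classes."""
    num_classes = probas.size(1)
    losses = []
    for c in range(num_classes):
        fg = (labels == c).float()
        if fg.sum() == 0:                      # skip classes absent from the batch
            continue
        errors = (fg - probas[:, c]).abs()
        errors_sorted, perm = torch.sort(errors, dim=0, descending=True)
        losses.append(torch.dot(errors_sorted, _lovasz_grad(fg[perm])))
    return torch.stack(losses).mean()

class BranchLoss(nn.Module):
    """Loss1 = 0.6 * softmax (cross-entropy) loss + 0.4 * Lovasz-Softmax loss."""
    def __init__(self, w_ce=0.6, w_lovasz=0.4):
        super().__init__()
        self.w_ce, self.w_lovasz = w_ce, w_lovasz

    def forward(self, logits, target):
        # logits: [B, C, H, W]; target: [B, H, W] with integer class labels
        ce = F.cross_entropy(logits, target)
        probas = F.softmax(logits, dim=1)
        b, c, h, w = probas.shape
        probas_flat = probas.permute(0, 2, 3, 1).reshape(-1, c)
        labels_flat = target.reshape(-1)
        return self.w_ce * ce + self.w_lovasz * lovasz_softmax(probas_flat, labels_flat)
```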
Referring to fig. 4, fig. 4 is a flowchart illustrating a remote sensing image processing method according to a third embodiment of the present invention.
Based on the foregoing embodiment shown in fig. 2, in this embodiment, the step S20 includes:
s21, performing attention adjustment on the first output image based on the space attention network to generate a space attention diagram;
and S22, multiplying the pixel points of the spatial attention map by the pixel points of the first output image based on the spatial attention network, and generating the second output image.
The spatial attention network is attached to the output layer of the last decoder stage. Multiplying the pixel points of the spatial attention map by the pixel points of the first output image performs adaptive feature refinement, pays more attention to holes and fracture areas, and connects the elongated targets in the remote sensing image. The spatial attention network is a lightweight, general-purpose module, so it can be seamlessly integrated into the UNet 3+ network architecture with negligible overhead and trained end-to-end together with the UNet 3+ network.
The spatial attention network processes the image as follows: given a first output image of size H × W × C (H × W is the pixel size and C is the number of channels, i.e. the number of convolution kernels and features), channel-wise maximum pooling and average pooling are performed on it separately to obtain two H × W × 1 descriptions; the two descriptions are concatenated along the channel axis and passed through a 7 × 7 convolutional layer with a Sigmoid activation function to obtain the weight coefficient Ms; multiplying Ms by the pixel points of the input first output image yields the scaled second output image. The spatial attention network strengthens useful features and weakens useless ones, thereby achieving feature screening and feature enhancement.
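A minimal sketch of such a spatial attention module, assuming PyTorch; it follows the description above (channel-wise max and average pooling, concatenation, a 7 × 7 convolution, Sigmoid, element-wise multiplication) and is an illustration, not the exact network in the patent.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Channel-wise max and average pooling give two H x W x 1 descriptions;
    they are concatenated, passed through a 7x7 convolution with a Sigmoid
    activation to obtain the weight map Ms, and Ms scales the input."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                                 # x: [B, C, H, W]
        max_desc, _ = torch.max(x, dim=1, keepdim=True)   # [B, 1, H, W]
        avg_desc = torch.mean(x, dim=1, keepdim=True)     # [B, 1, H, W]
        ms = self.sigmoid(self.conv(torch.cat([max_desc, avg_desc], dim=1)))
        return ms * x                                     # scaled second output image
```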
Referring to fig. 5, fig. 5 is a schematic flow chart of a remote sensing image processing method according to a fourth embodiment of the present invention.
Based on the foregoing embodiment shown in fig. 2, in this embodiment, the step S30 includes the following specific steps:
step S31, performing morphological closed operation on the second output image to obtain a first remote sensing image;
specifically, expansion operation is performed on the initial image, and then corrosion operation is performed on the initial image after the expansion operation, wherein the expansion operation is performed on a highlight part of a road image in the initial image, so that the edge area of the road image is expanded, the finally obtained expansion effect image has a highlight area larger than that of the road image in the initial image, and the corrosion operation is performed by replacing an adjacent area of the road image in the initial image with a maximum value, so that the edge highlight area of the road image in the initial image is reduced.
Performing corrosion operation on the initial image after the expansion operation, corroding the highlight part in the initial image, reducing the field of the initial image, and obtaining an effect image with a highlight area smaller than that of the initial image; wherein, the operation shows that the adjacent area is replaced by the minimum value and the highlight area is reduced. The internal hole area and the external area of the second output image are filled firstly through expansion operation and corrosion operation on the second output image, then the external area of the initial image is corroded, and the filling part of the internal hole of the initial image is reserved.
Step S32, carrying out boundary tracking analysis on the first remote sensing image according to the contour feature and the area feature of the preset target to obtain a second remote sensing image;
specifically, determining an initial search point of a boundary of the first remote sensing image, searching a target contour according to a preset boundary judgment criterion and a preset search criterion, and stopping searching when a preset termination condition is reached; the criterion is mainly used to determine whether a point is a boundary point, and the search criterion knows how to search for the next edge point.
Specifically, scanning proceeds point by point from the lower left corner of the first remote sensing image; when an edge point is met, it is tracked until the tracking returns to the starting point (for a closed line) or until no new subsequent point can be found (for a non-closed line);
if the line is a non-closed line, after one side is tracked, the other end point needs to be tracked from the starting point in the opposite direction;
if there is more than one subsequent point, selecting the closest point as the subsequent point according to the connection criterion, and using the next closest subsequent point as a new edge tracking starting point for additional tracking;
after one line is tracked, the next untracked point is scanned until all edges are tracked.
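For illustration only, the hand-written scan-and-trace procedure above can be approximated with OpenCV's built-in contour following; the area threshold used to keep or drop a traced contour is an assumption.

```python
import cv2
import numpy as np

def trace_boundaries(binary_image, min_area=32):
    """Approximation of the boundary-tracking analysis using OpenCV's contour
    follower instead of the hand-written point-by-point tracker; keeps the
    contour (outline) and area features of each traced target."""
    contours, _ = cv2.findContours(binary_image, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    result = np.zeros_like(binary_image)
    for contour in contours:
        if cv2.contourArea(contour) >= min_area:
            # Draw the traced, closed outline and fill its enclosed area
            cv2.drawContours(result, [contour], -1, color=255, thickness=cv2.FILLED)
    return result
```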
Step S33, removing isolated pixel particles of the second remote sensing image according to a preset threshold value to obtain a third remote sensing image;
step S34, performing expansion operation on the third remote sensing image to obtain a fourth remote sensing image;
and step S35, performing image thinning processing on the fourth remote sensing image to obtain a target segmentation image without interruption.
Specifically, the morphological closing operation comprises: performing a dilation operation and then an erosion operation on the second output image, filling the holes in the content of the second output image and connecting its fractured regions.
The refining process of the fourth remote sensing image to obtain an uninterrupted target segmentation map comprises the following steps:
extracting skeleton information of the fourth remote sensing image, and generating a fifth remote sensing image by topologically representing a connecting line of the fourth remote sensing image;
and covering the third remote sensing image with the fifth remote sensing image to obtain an uninterrupted target segmentation map.
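A hedged sketch of this thinning-and-overlay step, assuming scikit-image for skeleton extraction; combining the fifth image with the third image by an element-wise maximum is an assumption about how the overlay is carried out.

```python
import numpy as np
from skimage import morphology

def thin_and_overlay(fourth_image, third_image):
    """Extract the skeleton (a topological representation of the connected
    lines) of the fourth remote sensing image, then overlay it on the third
    remote sensing image so the reconnected centrelines close remaining gaps."""
    skeleton = morphology.skeletonize(fourth_image > 0)            # fifth image
    merged = np.maximum(third_image, skeleton.astype(third_image.dtype) * 255)
    return merged                                                  # uninterrupted map
```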
In addition, the embodiment of the invention also provides a remote sensing image processing device.
Referring to fig. 6, fig. 6 is a functional module schematic diagram of an embodiment of the remote sensing image processing apparatus according to the invention.
In this embodiment, the remote sensing image processing apparatus includes:
the image segmentation module 10 is used for performing image segmentation on the remote sensing image based on a pre-trained UNet 3+ network to generate a first output image;
a spatial attention module 20 for capturing, in the first output image, an area image related to a predetermined target as a second output image of the predetermined target based on a spatial attention network;
and the post-processing module 30 is configured to perform morphological image processing on the second output image to obtain a target segmentation map without discontinuity.
Further, the image segmentation module 10 includes:
the characteristic generating unit is used for generating at least two remote sensing characteristic graphs of the remote sensing image based on the encoder of the UNet 3+ network;
the characteristic capturing unit is used for acquiring two remote sensing characteristic graphs which are respectively used as a first remote sensing characteristic graph and a second remote sensing characteristic graph, and a decoder based on the UNet 3+ network respectively captures coarse-grained semantic features and/or fine-grained semantic features in the first remote sensing characteristic graph and coarse-grained semantic features and/or fine-grained semantic features in the second remote sensing characteristic graph;
and the feature fusion unit is used for fusing the coarse-grained semantic features or the fine-grained semantic features in the first remote sensing feature map with the coarse-grained semantic features or the fine-grained semantic features in the second remote sensing feature map based on the decoder to generate the first output image.
Further, the spatial attention module 20 includes:
a spatial channel unit, configured to perform attention adjustment on the first output image based on the spatial attention network, and generate a spatial attention map;
and the space fusion unit is used for multiplying the pixel points of the space attention diagram by the pixel points of the first output image based on the space attention network to generate the second output image.
Further, the post-processing module 30 includes:
the morphology closed operation unit is used for performing morphology closed operation on the second output image to obtain a first remote sensing image;
the boundary extraction unit is used for carrying out boundary tracking analysis on the first remote sensing image according to the contour feature and the area feature of the preset target to obtain a second remote sensing image;
the pixel point removing unit is used for removing isolated pixel particles of the second remote sensing image according to a preset threshold value to obtain a third remote sensing image;
the image expansion unit is used for performing expansion operation on the third remote sensing image to obtain a fourth remote sensing image;
and the image thinning unit is used for carrying out image thinning processing on the fourth remote sensing image to obtain a non-discontinuous target segmentation image.
Further, the image thinning unit includes:
a skeleton extraction unit, configured to extract skeleton information of the fourth remote sensing image and perform topological representation on a connected line of the fourth remote sensing image to generate a fifth remote sensing image;
and the image covering unit is used for covering the fifth remote sensing image with the third remote sensing image to obtain an uninterrupted target segmentation image.
Furthermore, the remote sensing image processing device comprises a data enhancement module for enhancing image data of the original remote sensing image to obtain at least two remote sensing images.
Further, the remote sensing image processing device further comprises a space supervision module, which is used for acquiring a difference value between the first output image and the second output image based on a branch loss function in a space attention network, and adjusting the weight coefficients in the UNet 3+ network and the space attention network according to the difference value.
Each module in the remote sensing image processing device corresponds to each step in the embodiment of the remote sensing image processing method, and the functions and the implementation process of the remote sensing image processing device are not described in detail herein.
In addition, the embodiment of the invention also provides a computer readable storage medium.
The computer readable storage medium of the present invention stores a remote sensing image processing program, wherein the remote sensing image processing program, when executed by a processor, implements the steps of the remote sensing image processing method as described above.
For the implementation of the remote sensing image processing program when it is executed, reference may be made to the embodiments of the remote sensing image processing method of the present invention, which will not be described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention, and all equivalent structures or equivalent processes performed by the present invention or directly or indirectly applied to other related technical fields are also included in the scope of the present invention.

Claims (5)

1. A remote sensing image processing method is characterized by comprising the following steps:
generating at least two remote sensing characteristic graphs of the remote sensing image based on an encoder of a UNet 3+ network;
acquiring two remote sensing feature maps which are respectively used as a first remote sensing feature map and a second remote sensing feature map, and respectively capturing coarse-grained semantic features and/or fine-grained semantic features in the first remote sensing feature map and coarse-grained semantic features and/or fine-grained semantic features in the second remote sensing feature map by a UNet 3+ network-based decoder, wherein the coarse-grained semantic features are different classes of targets in the feature maps, and the fine-grained semantic features are subclasses of a large class in the feature maps;
based on the decoder, fusing the coarse-grained semantic features or the fine-grained semantic features in the first remote sensing feature map with the coarse-grained semantic features or the fine-grained semantic features in the second remote sensing feature map to generate a first output image;
capturing a region image related to a predetermined target in the first output image as a second output image of the predetermined target based on a spatial attention network;
performing morphological closed operation on the second output image to obtain a first remote sensing image;
carrying out boundary tracking analysis on the first remote sensing image according to the contour characteristic and the area characteristic of the preset target to obtain a second remote sensing image;
removing isolated pixel particles of the second remote sensing image according to a preset threshold value to obtain a third remote sensing image;
performing expansion operation on the third remote sensing image to obtain a fourth remote sensing image;
extracting skeleton information of the fourth remote sensing image, and generating a fifth remote sensing image by topologically representing a connecting line of the fourth remote sensing image;
covering the third remote sensing image with the fifth remote sensing image to obtain an uninterrupted target segmentation map;
wherein the capturing, as the second output image of the predetermined target, an area image related to the predetermined target in the first output image based on the spatial attention network comprises:
performing attention adjustment on the first output image based on the spatial attention network to generate a spatial attention map;
and multiplying the pixel points of the spatial attention map by the pixel points of the first output image based on the spatial attention network to generate the second output image.
2. The method of processing a remote-sensing image according to claim 1, wherein before the image segmentation of the remote-sensing image based on a pre-trained UNet 3+ network to generate the first output image, the method further comprises:
and carrying out image data enhancement on the original remote sensing images to obtain at least two remote sensing images.
3. A remote sensing image processing apparatus, comprising:
the image segmentation module is used for carrying out image segmentation on the remote sensing image based on a pre-trained UNet 3+ network to generate a first output image;
a spatial attention module for capturing, in the first output image, an area image related to a predetermined target as a second output image of the predetermined target based on a spatial attention network;
the post-processing module is used for carrying out morphological image processing on the second output image to obtain a non-interrupted target segmentation image;
wherein the image segmentation module comprises:
the characteristic generating unit is used for generating at least two remote sensing characteristic maps of the remote sensing image based on the encoder of the UNet 3+ network;
the characteristic capturing unit is used for acquiring two remote sensing characteristic graphs which are respectively used as a first remote sensing characteristic graph and a second remote sensing characteristic graph, and a decoder based on the UNet 3+ network respectively captures coarse-grained semantic features and/or fine-grained semantic features in the first remote sensing characteristic graph and coarse-grained semantic features and/or fine-grained semantic features in the second remote sensing characteristic graph, wherein the coarse-grained semantic features are different classes of targets in the characteristic graphs, and the fine-grained semantic features are subclasses of large classes in the characteristic graphs;
the feature fusion unit is used for fusing the coarse-grained semantic features or the fine-grained semantic features in the first remote sensing feature map with the coarse-grained semantic features or the fine-grained semantic features in the second remote sensing feature map based on the decoder to generate the first output image;
wherein the spatial attention module comprises:
a spatial channel unit, configured to perform attention adjustment on the first output image based on the spatial attention network, and generate a spatial attention map;
the spatial fusion unit is used for multiplying the pixel points of the spatial attention diagram by the pixel points of the first output image based on the spatial attention network to generate a second output image;
wherein the post-processing module comprises:
the morphology closed operation unit is used for performing morphology closed operation on the second output image to obtain a first remote sensing image;
the boundary extraction unit is used for carrying out boundary tracking analysis on the first remote sensing image according to the contour feature and the area feature of the preset target to obtain a second remote sensing image;
the pixel point removing unit is used for removing isolated pixel particles of the second remote sensing image according to a preset threshold value to obtain a third remote sensing image;
the image expansion unit is used for performing expansion operation on the third remote sensing image to obtain a fourth remote sensing image;
the image thinning unit is used for thinning the fourth remote sensing image to obtain a non-interrupted target segmentation image;
wherein the image thinning unit includes:
the skeleton extraction unit is used for extracting skeleton information of the fourth remote sensing image, topologically representing a connecting line of the fourth remote sensing image and generating a fifth remote sensing image;
and the image covering unit is used for covering the fifth remote sensing image with the third remote sensing image to obtain an uninterrupted target segmentation image.
4. A remote sensing image processing apparatus comprising a processor, a memory, and a remote sensing image processing program stored on the memory and executable by the processor, wherein the steps of the remote sensing image processing method according to any one of claims 1 to 2 are implemented when the remote sensing image processing program is executed by the processor.
5. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a remote sensing image processing program, wherein the remote sensing image processing program, when executed by a processor, implements the steps of the remote sensing image processing method according to any one of claims 1 to 2.
CN202210275055.8A 2022-03-21 2022-03-21 Remote sensing image processing method, device, equipment and storage medium Active CN114359120B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210275055.8A CN114359120B (en) 2022-03-21 2022-03-21 Remote sensing image processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210275055.8A CN114359120B (en) 2022-03-21 2022-03-21 Remote sensing image processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114359120A CN114359120A (en) 2022-04-15
CN114359120B true CN114359120B (en) 2022-06-21

Family

ID=81095118

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210275055.8A Active CN114359120B (en) 2022-03-21 2022-03-21 Remote sensing image processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114359120B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115641512B (en) * 2022-12-26 2023-04-07 成都国星宇航科技股份有限公司 Satellite remote sensing image road identification method, device, equipment and medium


Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2020218573A1 (en) * 2019-02-04 2021-08-05 Farmers Edge Inc. Shadow and cloud masking for remote sensing images in agriculture applications using multilayer perceptron
CN110688951B (en) * 2019-09-26 2022-05-31 上海商汤智能科技有限公司 Image processing method and device, electronic equipment and storage medium
CN111915592B (en) * 2020-08-04 2023-08-22 西安电子科技大学 Remote sensing image cloud detection method based on deep learning
CN114120102A (en) * 2021-11-03 2022-03-01 中国华能集团清洁能源技术研究院有限公司 Boundary-optimized remote sensing image semantic segmentation method, device, equipment and medium
CN114022785A (en) * 2021-11-15 2022-02-08 中国华能集团清洁能源技术研究院有限公司 Remote sensing image semantic segmentation method, system, equipment and storage medium

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112418027A (en) * 2020-11-11 2021-02-26 青岛科技大学 Remote sensing image road extraction method for improving U-Net network
CN112733756A (en) * 2021-01-15 2021-04-30 成都大学 Remote sensing image semantic segmentation method based on W divergence countermeasure network
CN113205537A (en) * 2021-05-17 2021-08-03 广州大学 Blood vessel image segmentation method, device, equipment and medium based on deep learning
CN113657324A (en) * 2021-08-24 2021-11-16 速度时空信息科技股份有限公司 Urban functional area identification method based on remote sensing image ground object classification
CN114037891A (en) * 2021-08-24 2022-02-11 山东建筑大学 High-resolution remote sensing image building extraction method and device based on U-shaped attention control network
CN113850825A (en) * 2021-09-27 2021-12-28 太原理工大学 Remote sensing image road segmentation method based on context information and multi-scale feature fusion
CN113888550A (en) * 2021-09-27 2022-01-04 太原理工大学 Remote sensing image road segmentation method combining super-resolution and attention mechanism
CN113887459A (en) * 2021-10-12 2022-01-04 中国矿业大学(北京) Open-pit mining area stope change area detection method based on improved Unet +

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Optical Remote Sensing Image Change Detection Based on Attention Mechanism and Image Difference; Xueli Peng et al.; IEEE Transactions on Geoscience and Remote Sensing; 2020-11-10; vol. 59, no. 9; pp. 7296-7307 *
Road Extraction by Deep Residual U-Net; Zhengxin Zhang et al.; IEEE Geoscience and Remote Sensing Letters; 2018-03-08; vol. 15, no. 5; pp. 749-753 *
Research on an image semantic segmentation algorithm with encoder feature fusion based on U-Net; Cai Jilun; Scientific and Technological Innovation; 2021; no. 10; pp. 107-109 *
Research on a road extraction method for remote sensing images with an improved U-Net network; Song Yanqiang et al.; Computer Engineering and Applications; 2021-03-17; vol. 57, no. 14; pp. 209-216 *
Change detection algorithm for high-resolution remote sensing images fusing a UNet++ network and an attention mechanism; Yuan Zhou et al.; Journal of Geomatics Science and Technology; 2021; vol. 38, no. 2; pp. 155-159 *

Also Published As

Publication number Publication date
CN114359120A (en) 2022-04-15


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee after: Shenzhen Huafu Technology Co.,Ltd.

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee before: SHENZHEN HUAFU INFORMATION TECHNOLOGY Co.,Ltd.