CN114639020A - Segmentation network, segmentation system and segmentation device for target object of image

Info

Publication number
CN114639020A
Authority
CN
China
Prior art keywords
segmentation
network
branch
decoder
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210303635.3A
Other languages
Chinese (zh)
Inventor
刘琦
李阳
肖博
路慧
杨志云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology
Priority to CN202210303635.3A
Publication of CN114639020A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a segmentation network, a segmentation system and a segmentation device for a target object of an image, and relates to the field of semantic segmentation of remote sensing images. The segmentation network, segmentation system and segmentation device comprise a backbone network, a multi-scale middle-layer component and a decoding module, wherein the multi-scale middle-layer component comprises an attention module and a feature fusion module. The attention module extracts a feature map, which is then processed by the feature fusion module to represent multi-scale information. The invention can capture deep semantic information and multi-scale information of high-resolution remote sensing images and improves prediction accuracy.

Description

Segmentation network, segmentation system and segmentation device for target object of image
Technical Field
The invention relates to the field of semantic segmentation of remote sensing images, in particular to a segmentation network, a segmentation system and a segmentation device for a target object of an image.
Background
With the improvement in the technology and number of China's remote sensing satellite series, and the gradual formation of network service platforms, the quantity and quality of available remote sensing images continue to improve, so researchers can obtain high-resolution remote sensing images more conveniently. Remote sensing images play an important role in land resource exploration, environmental monitoring and protection, urban planning, crop yield estimation, disaster prevention and mitigation, and other fields. However, because the data volume is huge, how to extract relevant information from an image automatically, efficiently and quickly has become a very important research direction. Automatically identifying surface buildings in high-resolution satellite images is of great value for updating urban spatial databases, for dynamic urban monitoring and for building "smart cities".
Pan et al. propose a generative adversarial network with spatial and channel attention mechanisms (SCA) for accurate building segmentation. Protopapadakis et al. [4] propose a deep neural network (DNN) based on stacked autoencoders (SAE) and semi-supervised learning (SSL) for extracting buildings from low-cost satellite imagery. Wang et al. [5] propose a novel non-local residual U-shaped network that uses an encoder-decoder structure to extract and recover feature maps and uses a self-attention mechanism to obtain global context information. Hu et al. [6] build new modules from component blocks to improve performance; in addition, an attention mechanism is introduced into the network to improve segmentation accuracy. Liu et al. [7] propose a lightweight network that recovers details by means of a spatial pyramid. Chen et al. [8] propose an adaptive iterative segmentation method. Cheng et al. [9] propose the Deep Active Ray Network (DARNet), trained end to end, which achieves accurate building segmentation through energy minimization and back-propagation through the backbone CNN. Shi et al. [10] combine a graph convolutional network (GCN) with deep structured feature embedding (DSFE) to propose a gated graph convolutional network that generates sharp boundaries and fine-grained pixel-level classification.
High-resolution remote sensing images have a higher resolution than ordinary images and contain more texture features and fine detail. Buildings of the same type may also differ in color tone and appearance. As a result, semantic extraction may be incomplete or may extract the wrong regions. To extract more features, some researchers scale up the network to improve accuracy. However, deepening the network increases the amount of computation and pushes the parameter count into the millions, so prediction can be neither lightweight nor real-time. The field therefore needs methods that offer both real-time performance and multi-scale capability.
Disclosure of Invention
(I) Technical problem to be solved
Aiming at the defects of the prior art, the invention provides a segmentation network, a segmentation system and a segmentation device of a target object of an image, which can capture deep semantic information and multi-scale information of a high-resolution remote sensing image and improve the accuracy of prediction.
(II) technical scheme
In order to achieve the purpose, the invention is realized by the following technical scheme:
In one aspect, a segmentation network for a target object of an image is provided, the segmentation network comprising:
cropping a data set with a sliding window;
extracting semantic feature vectors from the data set through an encoder, wherein the ordinary convolutions of the encoder are replaced with Res2Net-series convolutions;
performing channel information fusion on the semantic feature vectors through a middle layer;
and restoring the semantic feature vectors after channel information fusion to the size of the input picture through a decoder.
Further, the convolution of Res2Net is changed to a dilated (atrous) convolution.
Further, the channel information fusion adopts a high-level semantic fusion method.
Further, the decoder is constructed with a dual-branch decoder structure comprising a deconvolution branch and a feature enhancement branch.
Further, each time the deconvolution branch of the decoder doubles the feature size, the feature enhancement branch and the decoder perform detail supplementation.
In another aspect, a segmentation system for a target object of an image is provided, the segmentation system comprising:
a backbone network, a multi-scale middle-layer component and a decoding module, wherein the multi-scale middle-layer component comprises an attention module and a feature fusion module. The attention module is used for extracting a feature map, and the feature map is processed by the feature fusion module to represent multi-scale information.
Further, the decoding module is constructed with a dual-branch decoder architecture comprising a deconvolution branch and a feature enhancement branch.
Further, the deconvolution branch is used for capturing basic information and adding low-level semantic details, and the feature enhancement branch is used for enhancing high-level semantic information and deepening multi-scale information.
Further, the low-level semantic information is information passed from the encoder to the decoder at the same level during the encoding process.
In still another aspect, a segmentation device for a target object of an image is provided, the segmentation device comprising:
the segmentation network for a target object of an image according to any one of claims 1 to 5 and the segmentation system for a target object of an image according to any one of claims 6 to 9.
(III) advantageous effects
The invention provides a segmentation network, a segmentation system and a segmentation device for a target object of an image, which can capture deep semantic information and multi-scale information of a high-resolution remote sensing image and improve the accuracy of prediction.
Drawings
FIG. 1 is a network framework diagram of the present invention;
FIG. 2 shows the attention mechanism used in the present invention;
FIG. 3 shows the comparison of results between the present model and other models.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the present invention, and it is to be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
Referring to FIGS. 1-3, the overall framework of a segmentation network for a target object of an image is shown, the segmentation network comprising:
cropping a data set with a sliding window; because each original image in the data set is very large, a sliding window is needed to cut out individual pictures and pass them to the network;
extracting semantic feature vectors from the data set through an encoder, wherein the ordinary convolutions of the encoder are replaced with Res2Net-series convolutions;
performing channel information fusion on the semantic feature vectors through a middle layer;
and restoring the semantic feature vectors after channel information fusion to the size of the input picture through a decoder.
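As an illustration of the sliding-window step, the following minimal Python sketch computes where the window would be placed on a large image. The function name, the 256-pixel window, the stride and the choice to shift the last window back so that it ends at the image border are all assumptions made for illustration; the description does not specify them.

```python
def sliding_window_origins(height, width, window=256, stride=256):
    """Return (top, left) origins of window x window tiles covering an image.

    All parameter values are illustrative assumptions, not taken from the patent.
    """
    if height < window or width < window:
        raise ValueError("image is smaller than the window")

    def axis_starts(size):
        # Regular grid of start positions along one axis.
        starts = list(range(0, size - window + 1, stride))
        # Assumption: add one final window shifted back to the border,
        # so that trailing pixels are not dropped.
        if starts[-1] + window < size:
            starts.append(size - window)
        return starts

    return [(top, left)
            for top in axis_starts(height)
            for left in axis_starts(width)]


print(sliding_window_origins(512, 512))
# [(0, 0), (0, 256), (256, 0), (256, 256)]
print(sliding_window_origins(600, 256))
# [(0, 0), (256, 0), (344, 0)]
```

In the second call the last tile overlaps its neighbour, which is one simple way to keep every tile at the full window size.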
The convolution of Res2Net is changed to a dilated (atrous) convolution.
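A dilated (hole) convolution spaces its kernel taps several samples apart, which enlarges the receptive field without adding weights. The one-dimensional Python sketch below is only meant to show this sampling pattern. The function names are hypothetical, and the kernel size and dilation rate used inside the Res2Net blocks are not stated in the description.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel whose taps are `dilation` samples apart."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D dilated convolution (cross-correlation, as in deep-learning
    frameworks): no padding, stride 1."""
    span = effective_kernel_size(len(kernel), dilation)
    output = []
    for start in range(len(signal) - span + 1):
        # Each kernel weight reads the input `dilation` positions further on.
        output.append(sum(weight * signal[start + i * dilation]
                          for i, weight in enumerate(kernel)))
    return output


print(effective_kernel_size(3, 2))                          # 5
print(dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 1, 1], 1))     # [6, 9, 12, 15]
print(dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 1, 1], 2))     # [9, 12]
```

With dilation 2, a three-tap kernel covers five input samples while still using only three weights.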
The channel information fusion adopts a high-level semantic fusion method.
The decoder is built with a dual-branch decoder structure consisting mainly of a deconvolution branch and a feature enhancement branch. Restoration to the input picture size in the decoder relies mainly on up-sampling, during which the semantic feature components of each layer in the decoder network are combined to enhance detail. Because the encoder halves the feature size at each stage, the decoder doubles it at each stage, so that the picture size is the same before and after the semantic segmentation encoder-decoder network.
Each time the deconvolution branch doubles the feature size, the decoder performs detail supplementation. A spatial and channel attention mechanism is also added.
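One decoder stage, as described above, can be sketched in plain Python. This is a structural illustration under stated assumptions, not the patented implementation: nearest-neighbour up-sampling stands in for the learned deconvolution, features are nested lists indexed [channel][row][column], and the names upsample2x and decoder_stage are hypothetical. The attention mechanism is omitted.

```python
def upsample2x(feature):
    """Nearest-neighbour 2x up-sampling of a [channel][row][column] nested list.

    Assumption: this replaces the learned deconvolution purely for illustration.
    """
    upsampled = []
    for channel in feature:
        rows = []
        for row in channel:
            wide = [value for value in row for _ in range(2)]  # repeat columns
            rows.append(wide)
            rows.append(list(wide))                            # repeat rows
        upsampled.append(rows)
    return upsampled


def decoder_stage(deconv_feature, enhancement_feature):
    """Double the spatial size, then concatenate the enhancement-branch
    feature along the channel axis to supplement detail."""
    up = upsample2x(deconv_feature)
    same_height = len(up[0]) == len(enhancement_feature[0])
    same_width = len(up[0][0]) == len(enhancement_feature[0][0])
    if not (same_height and same_width):
        raise ValueError("spatial sizes must match before concatenation")
    return up + enhancement_feature  # channel-wise concatenation


coarse = [[[1, 2], [3, 4]]]                    # 1 channel, 2 x 2
detail = [[[0] * 4 for _ in range(4)]]         # 1 channel, 4 x 4
fused = decoder_stage(coarse, detail)
print(len(fused), len(fused[0]), len(fused[0][0]))  # 2 4 4
```

The fused result has the channels of both branches at the doubled resolution, which is the shape a following layer would consume.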
A segmentation system for an object of an image, the segmentation system comprising:
The segmentation system comprises a backbone network, a multi-scale middle-layer component and a dual-branch decoding module. The multi-scale middle-layer component is divided into two parts: an attention module and a feature fusion module. After the feature map extracted by the attention module is processed by the feature fusion module, rich multi-scale information about surface buildings can be effectively represented. In the decoding module, a dual-branch decoder architecture is constructed, which includes a deconvolution branch and a feature enhancement branch. The deconvolution branch is responsible for capturing basic information and adding low-level semantic details, and the feature enhancement branch strengthens high-level semantic information and deepens multi-scale information.
An apparatus for segmenting an object of an image, the apparatus comprising:
a segmentation network of objects of an image according to any one of claims 1 to 5 and a segmentation system of objects of an image according to any one of claims 6 to 9.
The overall framework of the invention is based on an encoder-decoder network. First, feature extraction is performed on the input data. Since building segmentation is essentially a pixel-wise classification problem, global information is important in addition to local information. Global high-level semantic information is extracted by the encoder. Then the middle layer fuses the features so that the semantic information contains more multi-scale information. Finally, the dual-branch decoder defined here restores the resolution step by step while gradually supplementing detail information, where the detail information can be understood as the feature vectors extracted by the feature enhancement branch. Assuming the input picture is 256 × 256, the feature size after the middle layer is 16 × 16, and each deconvolution doubles the size: from 16 to 32, then to 64, 128 and finally 256, which is a stepwise restoration. During restoration, a concatenate function is used directly to join in the semantic features of the feature enhancement branch, so that the predicted value is obtained by gradual supplementation. The predictions of this network are better than those of previously designed networks, so segmentation accuracy is effectively improved.
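The size progression in this example can be checked with a few lines of Python. The function name and its defaults are illustrative; only the 256 and 16 figures and the doubling rule come from the description.

```python
def decoder_size_schedule(input_size=256, bottleneck_size=16):
    """Feature side length after the middle layer and after each deconvolution."""
    if bottleneck_size <= 0 or input_size < bottleneck_size:
        raise ValueError("sizes must satisfy 0 < bottleneck_size <= input_size")
    sizes = [bottleneck_size]
    while sizes[-1] < input_size:
        sizes.append(sizes[-1] * 2)  # each deconvolution doubles the size
    if sizes[-1] != input_size:
        raise ValueError("input_size must be bottleneck_size times a power of two")
    return sizes


schedule = decoder_size_schedule()
print(schedule)           # [16, 32, 64, 128, 256]
print(len(schedule) - 1)  # 4 deconvolution steps
```

Four doublings are therefore needed to return from the 16 × 16 middle-layer feature to the 256 × 256 input size.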
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A segmentation network of an object of an image, characterized in that the segmentation network comprises:
cutting out a data set by using a sliding window;
extracting semantic feature vectors from the data set through an encoder, and replacing the ordinary convolutions of the encoder with Res2Net-series convolutions;
the semantic feature vectors are subjected to channel information fusion through the middle layer;
and restoring the semantic feature vector subjected to channel information fusion to the size of the input picture through a decoder.
2. The segmentation network of objects of an image according to claim 1, characterized by: the convolution of Res2Net is changed to a dilated (atrous) convolution.
3. The segmentation network, the segmentation system and the segmentation device for the target object of the image according to claim 1, wherein: the channel information fusion adopts a high-level semantic fusion method.
4. The segmentation network, the segmentation system and the segmentation device for the target object of the image according to claim 1, wherein: the decoder is built with a dual-branch decoder structure comprising a deconvolution branch and a feature enhancement branch.
5. The segmentation network, the segmentation system and the segmentation device for the target object of the image according to claim 1, wherein: and the characteristic enhancement branch and the decoder perform detail supplement each time the dimension of the deconvolution branch is enlarged to be twice of the original dimension.
6. An object segmentation system for an image, the segmentation system comprising:
a backbone network, a multi-scale middle-layer component and a decoding module, wherein the multi-scale middle-layer component comprises an attention module and a feature fusion module. The attention module is used for extracting a feature map, and the feature map is processed by the feature fusion module to represent multi-scale information.
7. The system for segmenting an object of an image according to claim 6, wherein: the decoding module is constructed with a dual-branch decoder architecture comprising a deconvolution branch and a feature enhancement branch.
8. The system for segmenting an object in an image according to claim 7, wherein: the deconvolution branch is used for capturing basic information and adding low-level semantic details, and the feature enhancement branch is used for enhancing high-level semantic information and deepening multi-scale information.
9. The system for segmenting an object in an image according to claim 8, wherein: the low-level semantic information is information passed from the encoder to the decoder at the same level during the encoding process.
10. An apparatus for segmenting an object of an image, the apparatus comprising:
a network of segmentation of an object of an image according to any one of claims 1 to 5 and a system of segmentation of an object of an image according to any one of claims 6 to 9.
CN202210303635.3A 2022-03-24 2022-03-24 Segmentation network, segmentation system and segmentation device for target object of image Pending CN114639020A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210303635.3A CN114639020A (en) 2022-03-24 2022-03-24 Segmentation network, segmentation system and segmentation device for target object of image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210303635.3A CN114639020A (en) 2022-03-24 2022-03-24 Segmentation network, segmentation system and segmentation device for target object of image

Publications (1)

Publication Number Publication Date
CN114639020A true CN114639020A (en) 2022-06-17

Family

ID=81949383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210303635.3A Pending CN114639020A (en) 2022-03-24 2022-03-24 Segmentation network, segmentation system and segmentation device for target object of image

Country Status (1)

Country Link
CN (1) CN114639020A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115527031A (en) * 2022-09-16 2022-12-27 山东科技大学 Bone marrow cell image segmentation method, computer device and readable storage medium
CN115527031B (en) * 2022-09-16 2024-04-12 山东科技大学 Bone marrow cell image segmentation method, computer device and readable storage medium
CN115205300A (en) * 2022-09-19 2022-10-18 华东交通大学 Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion
CN115205300B (en) * 2022-09-19 2022-12-09 华东交通大学 Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion

Similar Documents

Publication Publication Date Title
Agarwal et al. Attention attention everywhere: Monocular depth prediction with skip attention
CN113298818B (en) Remote sensing image building segmentation method based on attention mechanism and multi-scale features
CN111598174B (en) Model training method based on semi-supervised antagonistic learning and image change analysis method
US11521377B1 (en) Landslide recognition method based on laplacian pyramid remote sensing image fusion
CN113780296B (en) Remote sensing image semantic segmentation method and system based on multi-scale information fusion
CN109919025A (en) Video scene Method for text detection, system, equipment and medium based on deep learning
CN114639020A (en) Segmentation network, segmentation system and segmentation device for target object of image
CN113780149A (en) Method for efficiently extracting building target of remote sensing image based on attention mechanism
CN113298815A (en) Semi-supervised remote sensing image semantic segmentation method and device and computer equipment
Liu et al. Remote sensing data fusion with generative adversarial networks: State-of-the-art methods and future research directions
CN112287983B (en) Remote sensing image target extraction system and method based on deep learning
CN111368846A (en) Road ponding identification method based on boundary semantic segmentation
CN117078943A (en) Remote sensing image road segmentation method integrating multi-scale features and double-attention mechanism
CN115035295A (en) Remote sensing image semantic segmentation method based on shared convolution kernel and boundary loss function
CN112766409A (en) Feature fusion method for remote sensing image target detection
CN112241939A (en) Light-weight rain removing method based on multi-scale and non-local
Oehmcke et al. Creating cloud-free satellite imagery from image time series with deep learning
CN114943888B (en) Sea surface small target detection method based on multi-scale information fusion
CN115797881A (en) Multi-task joint perception network model for traffic road pavement information and detection method
Jiang et al. Arbitrary-shaped building boundary-aware detection with pixel aggregation network
CN111914596B (en) Lane line detection method, device, system and storage medium
CN115984714B (en) Cloud detection method based on dual-branch network model
Ruiz-Lendínez et al. Deep learning methods applied to digital elevation models: state of the art
CN115035402B (en) Multistage feature aggregation system and method for land cover classification problem
CN112528803B (en) Road feature extraction method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination