CN116205927A - Image segmentation method based on boundary enhancement - Google Patents

Image segmentation method based on boundary enhancement

Info

Publication number: CN116205927A
Authority: CN (China)
Prior art keywords: boundary, enhancement, feature map, feature, module
Legal status: Pending
Application number: CN202310165505.2A
Other languages: Chinese (zh)
Inventors: 古晶, 孙新凯, 翟得胜, 杨淑媛, 冯婕, 侯彪, 刘芳, 焦李成
Current Assignee: Xidian University
Original Assignee: Xidian University
Filing date: 2023-02-24
Publication date: 2023-06-02

Classifications

    • G06T 7/10: Image data processing; Image analysis; Segmentation; Edge detection
    • G06N 3/02, G06N 3/08: Computing arrangements based on biological models; Neural networks; Learning methods
    • G06V 10/70, G06V 10/77, G06V 10/80: Image or video recognition or understanding using pattern recognition or machine learning; Processing features in feature spaces; Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level
    • Y02T 10/10, Y02T 10/40: Climate change mitigation technologies related to transportation; Internal combustion engine based vehicles; Engine management systems


Abstract

The invention discloses an image segmentation method based on boundary enhancement, which comprises the following steps: establishing a boundary-enhanced image segmentation network model, and segmenting an input image with the trained model. The encoder of the network model comprises a first feature extraction module, a boundary extraction module and a second feature extraction module, and mainly extracts boundary features of the input image under the supervision of boundary labels so as to obtain feature maps of different scales. The decoder comprises a bidirectional mutual enhancement module and a multi-scale attention aggregation module, and mainly performs attention aggregation on the enhanced feature maps along the scale, spatial and channel dimensions to obtain a multi-dimensional fusion feature map. The boundary information extracted by the method is more accurate, and the resulting multi-dimensional fusion feature map highlights the spatial and semantic information that most effectively improves the segmentation result, thereby improving the accuracy of image segmentation.

Description

Image segmentation method based on boundary enhancement
Technical Field
The invention belongs to the technical field of image segmentation, and particularly relates to an image segmentation method based on boundary enhancement.
Background
The goal of image segmentation is to partition the input image according to semantic information and to predict, from a given label set, the semantic class of each pixel. As modern life becomes increasingly intelligent, more and more applications, such as augmented reality, autonomous driving and video surveillance, need to infer semantic information from images for subsequent processing, so accurate image segmentation has become important.
Conventional image segmentation typically relies on classical machine learning algorithms, such as clustering and random forests, to obtain image features. In recent years, the rapid development of specialized computing chips has driven computing costs down sharply, making large-scale use of deep learning algorithms feasible and markedly improving segmentation accuracy at little extra cost. Image segmentation methods based on deep learning have therefore received a great deal of attention from researchers.
For example, Jianlong Hou et al., in "BSNet: Dynamic Hybrid Gradient Convolution Based Boundary-Sensitive Network for Remote Sensing Image Segmentation" (IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-22, 2022), propose a boundary-sensitive network based on dynamic hybrid gradient convolution. Chengli Peng et al., in "Cross Fusion Net: A Fast Semantic Segmentation Network for Small-Scale Semantic Information Capturing in Aerial Scenes" (IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-13, 2022), propose a cross fusion network that extracts multi-scale semantic information. Aijin Li et al., in "Multitask Semantic Boundary Awareness Network for Remote Sensing Image Segmentation" (IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-14, 2022), propose a semantic boundary awareness network. Guohui Deng et al., in "CCANet: Class-Constraint Coarse-to-Fine Attentional Deep Network for Subdecimeter Aerial Image Semantic Segmentation" (IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-20, 2022), propose a class-constrained coarse-to-fine attention deep network. Rui Li et al., in "Multiattention Network for Semantic Segmentation of Fine-Resolution Remote Sensing Images" (IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-13, 2022), propose a multi-attention network.
However, existing image segmentation methods based on deep learning have the following shortcomings:
First, when boundary information is extracted from feature maps in existing schemes, the boundary information in feature maps produced by convolution, downsampling and similar operations already contains errors, so the spatial information used to recover boundary details is also erroneous.
Second, when aggregating features of different scales, the feature maps are simply upsampled and then fused by cascading or addition. Neither the degree to which features of each scale influence the segmentation result, nor the difference between low-level spatial information and high-level semantic information across scales, is taken into account. As a result, the spatial and semantic information in the multi-scale feature maps cannot be fully fused, and the segmentation result is unsatisfactory.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides an image segmentation method based on boundary enhancement. The technical problems to be solved by the invention are addressed through the following technical solution:
an image segmentation method based on boundary enhancement, comprising:
establishing a boundary enhanced image segmentation network model based on the encoder-decoder framework;
training the image segmentation network model, and segmenting an input image by using the trained network model to obtain a segmentation result map;
the encoder of the image segmentation network model comprises a first feature extraction module, a boundary extraction module and a second feature extraction module;
the first feature extraction module is used for carrying out feature extraction on the input image to obtain a first feature map;
the boundary extraction module is used for extracting boundary features of the first feature map, extracting boundary labels of image labels corresponding to the input images, and supervising the output boundary features by utilizing the boundary labels to obtain a boundary feature map;
the second feature extraction module is used for carrying out multi-scale feature extraction on the first feature map and the boundary feature map to obtain feature maps with different scales;
the decoder of the image segmentation network model comprises a bidirectional mutual enhancement module and a multi-scale attention aggregation module; the bidirectional mutual enhancement module is used for processing the feature maps of different scales to obtain enhanced feature maps of different scales;
the multi-scale attention aggregation module is used for performing attention aggregation processing on the enhanced feature map based on the scale dimension, the space dimension and the channel dimension to obtain a multi-dimensional fusion feature map.
The invention has the beneficial effects that:
according to the image segmentation method based on boundary enhancement, on the one hand, when boundary extraction is performed on the feature map, boundary extraction is also performed on the label image to obtain the boundary label, and the boundary extracted from the feature map is supervised by the boundary label, so that the extracted boundary information is more accurate; on the other hand, during feature fusion, attention processing along the scale, spatial and channel dimensions makes the obtained multi-dimensional fusion feature map highlight the spatial and semantic information that most effectively improves the segmentation result, thereby improving the accuracy of image segmentation.
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Drawings
Fig. 1 is a schematic flow chart of an image segmentation method based on boundary enhancement according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an image segmentation network model according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a bidirectional mutual enhancement sub-network according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to specific examples, but embodiments of the present invention are not limited thereto.
Example 1
Referring to fig. 1, fig. 1 is a flowchart of an image segmentation method based on boundary enhancement according to an embodiment of the present invention, which includes:
establishing a boundary enhanced image segmentation network model based on the encoder-decoder framework;
training an image segmentation network model, and segmenting an input image by using the trained network model to obtain a segmentation result map;
the encoder of the image segmentation network model comprises a first feature extraction module, a boundary extraction module and a second feature extraction module;
the first feature extraction module is used for carrying out feature extraction on the input image to obtain a first feature map;
the boundary extraction module is used for extracting boundary features of the first feature map, extracting boundary labels of image labels corresponding to the input images, and supervising the output boundary features by utilizing the boundary labels to obtain a boundary feature map;
the second feature extraction module is used for carrying out multi-scale feature extraction on the first feature map and the boundary feature map to obtain feature maps with different scales;
the decoder of the image segmentation network model comprises a bidirectional mutual enhancement module and a multi-scale attention aggregation module; the bidirectional mutual enhancement module is used for processing the feature maps of different scales to obtain enhanced feature maps of different scales;
the multi-scale attention aggregation module is used for carrying out attention aggregation processing on the enhanced feature map based on the scale dimension, the space dimension and the channel dimension to obtain a multi-dimensional fusion feature map.
In this embodiment, the first feature extraction module is a convolution layer comprising 1 3×3 convolution operation (Conv), 1 batch normalization (BN) and 1 linear rectification function (ReLU). The calculation formula of the linear rectification function ReLU is as follows:
ReLU(x) = max(0, x)
where x is an element in the input feature map.
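As an illustration, the first feature extraction module can be sketched in a few lines of PyTorch (the framework and the channel counts are assumptions of this sketch; the patent only specifies the 3×3 convolution + BN + ReLU composition):

import torch
import torch.nn as nn

class FirstFeatureExtraction(nn.Module):
    # 1 Conv 3x3 + 1 BatchNorm + 1 ReLU, as described above.
    def __init__(self, in_channels=3, out_channels=64):  # 64 output channels is an assumption
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),  # ReLU(x) = max(0, x)
        )

    def forward(self, x):
        return self.block(x)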
Furthermore, the boundary extraction module in this embodiment may use the Laplace, Sobel, Canny or LoG operator to extract the boundary label.
Preferably, the boundary extraction module in this embodiment is constructed with the Laplace operator. Referring to fig. 2, fig. 2 is a schematic structural diagram of the image segmentation network model according to an embodiment of the present invention; the boundary extraction module comprises 2 Laplace operators and 1 convolution layer; wherein:
the first Laplace operator is used for extracting boundary features of the first feature map, and the second Laplace operator is used for extracting boundary labels of the image labels corresponding to the input image;
the convolution layer is used for processing the boundary features; in the model training stage, its output is supervised with the boundary labels to obtain the boundary feature map.
The boundary extraction performed by the Laplace operator on the image label, and the supervision of the boundary features output by the convolution layer, take place only in the network training stage. When the trained network is used for inference, the boundary extraction module only applies the Laplace operator to the first feature map output by the first feature extraction module and then processes the boundary features through the convolution layer. Finally, the boundary feature map output by the convolution layer is added to the first feature map and fed into the second feature extraction module for processing.
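A hedged sketch of the boundary extraction module follows, with the Laplacian implemented as a fixed depthwise convolution (the 4-neighbour kernel, the channel count and the form of the training loss are assumptions of this sketch; the patent only fixes the structure of 2 Laplace operators and 1 convolution layer):

import torch
import torch.nn as nn
import torch.nn.functional as F

class BoundaryExtraction(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        lap = torch.tensor([[0., 1., 0.],
                            [1., -4., 1.],
                            [0., 1., 0.]])
        # One fixed (non-learnable) Laplacian per channel, applied as a depthwise conv.
        self.register_buffer("kernel", lap.view(1, 1, 3, 3).repeat(channels, 1, 1, 1))
        self.channels = channels
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def laplacian(self, x):
        return F.conv2d(x, self.kernel, padding=1, groups=self.channels)

    def forward(self, first_feat):
        # Boundary features extracted from the first feature map.
        return self.conv(self.laplacian(first_feat))

During training, the second Laplace operator would be applied in the same way to the label map to produce the boundary label that supervises this output (the supervision loss, e.g. binary cross-entropy, is not specified in the text); at inference only the forward path above is used, and its output is added to the first feature map.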
In this embodiment, the second feature extraction module employs a convolutional neural network architecture, which may be any one of ResNet-50, ResNet-152, ResNeXt-50, ResNeXt-101 or ResNeXt-152.
Preferably, this embodiment takes a convolutional neural network architecture based on ResNet-101 as the second feature extraction module; its structure is shown in fig. 2 and comprises a downsampling layer, a ResNet-101 first stage, a ResNet-101 second stage, a ResNet-101 third stage and a ResNet-101 fourth stage; wherein:
the ResNet-101 first stage contains 3 residual blocks, the second stage contains 4 residual blocks, the third stage contains 23 residual blocks, and the fourth stage contains 3 residual blocks.
More specifically, each residual block comprises: 1 1×1 convolution operation, 1 3×3 convolution operation, and 1 1×1 convolution operation.
In this embodiment, the downsampling layer is max pooling with a stride of 2. Downsampling in the ResNet-101 first stage is achieved by max pooling, while downsampling in the ResNet-101 second, third and fourth stages is achieved by setting the stride of the first convolution operation in each stage to 2.
The boundary feature map output by the boundary extraction module is added to the first feature map output by the first feature extraction module and then fed into the second feature extraction module for processing, so that high-level and low-level features of different sizes can be obtained.
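Under the assumption that the backbone is instantiated from torchvision (an implementation detail the patent does not specify), the second feature extraction module might look as follows; the four stages map onto torchvision's layer1 to layer4:

import torch
import torchvision

class SecondFeatureExtraction(torch.nn.Module):
    def __init__(self):
        super().__init__()
        backbone = torchvision.models.resnet101(weights=None)
        self.downsample = backbone.maxpool  # max pooling, stride 2
        self.stage1 = backbone.layer1       # 3 residual blocks
        self.stage2 = backbone.layer2       # 4 residual blocks (stride-2 downsampling)
        self.stage3 = backbone.layer3       # 23 residual blocks (stride-2 downsampling)
        self.stage4 = backbone.layer4       # 3 residual blocks (stride-2 downsampling)

    def forward(self, first_feat, boundary_feat):
        # The boundary feature map is added to the first feature map
        # before entering the backbone, as described above.
        x = self.downsample(first_feat + boundary_feat)
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        f4 = self.stage4(f3)
        return f1, f2, f3, f4  # feature maps at four scales

Note that torchvision's layer1 expects a 64-channel input, so the first feature extraction module is assumed here to output 64 channels.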
In this embodiment, when boundary extraction is performed on the feature map, boundary extraction is also performed on the label image to obtain the boundary label, and the boundary extracted from the feature map is supervised by the boundary label, making the extracted boundary information more accurate.
Further, the bidirectional mutual enhancement module comprises a plurality of bidirectional mutual enhancement sub-networks with the same structure; each sub-network comprises a first enhancement module and a second enhancement module, which enhance feature maps of two different sizes respectively.
Specifically, as shown in fig. 2, based on the ResNet-101 convolutional neural network architecture adopted by the second feature extraction module in this embodiment, two bidirectional mutual enhancement sub-networks are set in the bidirectional mutual enhancement module. The inputs of one sub-network are the output feature maps of the ResNet-101 first and third stages; the inputs of the other are the output feature maps of the ResNet-101 second and fourth stages.
Further, referring to fig. 3, fig. 3 is a schematic structural diagram of a bidirectional mutual enhancement sub-network according to an embodiment of the present invention, which comprises a first enhancement module and a second enhancement module, wherein:
the first enhancement module first processes the input feature map of a first size with two convolution layers to obtain a second feature map of the first size; it then processes the input feature map of the first size through a convolution layer, a pooling layer and a Sigmoid activation layer in sequence to obtain a third feature map of a second size;
correspondingly, the second enhancement module first processes the input feature map of the second size with two convolution layers to obtain a fourth feature map of the second size; it then processes the input feature map of the second size through a convolution layer, an up-sampling layer and a Sigmoid activation layer in sequence to obtain a fifth feature map of the first size;
the first enhancement module is further configured to multiply the second feature map by the fifth feature map and pass the result through a convolution layer to obtain a first enhanced feature map of the first size;
the second enhancement module is further configured to multiply the third feature map by the fourth feature map and pass the result through a convolution layer to obtain a second enhanced feature map of the second size.
For example, for the first enhancement module, the input feature map size (i.e., the first size) is H×W×C. The input feature map is processed through two convolution layers, yielding a processed feature map of size H×W×C, equal to the input size, namely the second feature map. In addition, the input feature map is processed through a convolution layer with a stride of 2, a pooling layer with a stride of 2 and a Sigmoid activation layer to obtain an activated feature map of size H/4×W/4×C, namely the third feature map.
For the second enhancement module, the input feature map size (i.e., the second size) is H/4×W/4×C. The input feature map is processed through two convolution layers, yielding a processed feature map of size H/4×W/4×C, equal to the input size, namely the fourth feature map. In addition, the input feature map is processed through a convolution layer, an up-sampling operation with a factor of 4 and a Sigmoid activation layer to obtain an activated feature map of size H×W×C, namely the fifth feature map.
The calculation formula of the Sigmoid activation function is as follows:
Sigmoid(x) = 1 / (1 + e^(-x))
where x is an element in the input feature map.
The processed feature map of size H×W×C (i.e., the second feature map) is multiplied by the activated feature map of size H×W×C (i.e., the fifth feature map), and the result passes through one convolution layer to give an output feature map of size H×W×C (i.e., the first enhanced feature map). Likewise, the activated feature map of size H/4×W/4×C (i.e., the third feature map) is multiplied by the processed feature map of size H/4×W/4×C (i.e., the fourth feature map), and the result passes through one convolution layer to give an output feature map of size H/4×W/4×C (i.e., the second enhanced feature map).
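The following is a minimal sketch of one bidirectional mutual enhancement sub-network under the H×W×C and H/4×W/4×C sizes of the example above (the 3×3 kernels, max pooling, bilinear upsampling, and the shared channel count C are assumptions of this sketch; in practice, ResNet stages with different channel counts would need projection to a common C):

import torch
import torch.nn as nn
import torch.nn.functional as F

def conv3x3(c):
    return nn.Conv2d(c, c, kernel_size=3, padding=1)

class BidirectionalMutualEnhancement(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # First enhancement module (high-resolution input, H x W x C).
        self.high_feat = nn.Sequential(conv3x3(channels), conv3x3(channels))
        self.high_gate = nn.Sequential(
            nn.Conv2d(channels, channels, 3, stride=2, padding=1),  # conv, stride 2
            nn.MaxPool2d(kernel_size=2, stride=2),                  # pooling, stride 2
        )
        self.high_out = conv3x3(channels)
        # Second enhancement module (low-resolution input, H/4 x W/4 x C).
        self.low_feat = nn.Sequential(conv3x3(channels), conv3x3(channels))
        self.low_gate = conv3x3(channels)
        self.low_out = conv3x3(channels)

    def forward(self, high, low):
        second = self.high_feat(high)                  # second feature map, H x W
        third = torch.sigmoid(self.high_gate(high))    # third feature map, H/4 x W/4
        fourth = self.low_feat(low)                    # fourth feature map, H/4 x W/4
        fifth = torch.sigmoid(F.interpolate(           # fifth feature map, H x W
            self.low_gate(low), scale_factor=4, mode="bilinear", align_corners=False))
        enhanced_high = self.high_out(second * fifth)  # first enhanced feature map
        enhanced_low = self.low_out(third * fourth)    # second enhanced feature map
        return enhanced_high, enhanced_low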
In this embodiment, the feature maps of the ResNet-101 first and third stages are input into one bidirectional mutual enhancement sub-network, and those of the ResNet-101 second and fourth stages into the other, so that enhanced feature maps at four scales, matching the sizes of the ResNet-101 first, second, third and fourth stages, are obtained.
Further, referring again to fig. 2, the multi-scale attention aggregation module comprises a multi-scale fusion sub-network, a scale-dimension attention sub-network, a spatial-dimension attention sub-network and a channel-dimension attention sub-network; wherein:
the multi-scale fusion sub-network is used for performing size transformation and feature cascading on the enhanced feature maps of different scales;
the scale-dimension attention sub-network is used for performing global average pooling on the cascaded feature maps in the scale dimension to obtain the feature map after scale-dimension attention processing;
the spatial-dimension attention sub-network is used for performing global average pooling and max pooling on the feature map after scale-dimension attention processing in the spatial dimension to obtain the feature map after spatial-dimension attention processing;
the channel-dimension attention sub-network is used for performing global average pooling on the feature map after spatial-dimension attention processing in the channel dimension to obtain the multi-dimensional fusion feature map.
Specifically, as shown in fig. 2, the multi-scale fusion sub-network resizes the four-scale enhanced feature maps output by the bidirectional mutual enhancement module so that all four become the same size as the feature map output by the ResNet-101 first stage. The resized feature maps are then cascaded to obtain a feature map with dimensions (scale S, channel C, space H×W).
For the cascaded feature map, the scale-dimension attention sub-network performs global average pooling in the scale dimension to obtain a feature vector of size (S×1×1×1). This feature vector is then processed through a convolution layer (with a 1×1 convolution kernel) to obtain the scale-dimension attention vector, which is multiplied by the cascaded feature map from the multi-scale fusion sub-network to obtain the feature map after scale-dimension attention processing.
The spatial-dimension attention sub-network first processes the feature map after scale-dimension attention processing through a convolution layer, then performs global average pooling and max pooling on the processed feature map in the spatial dimension, obtaining two spatial attention maps of size (1×1×H×W). These two spatial attention maps are processed through a convolution layer to output a spatial-dimension attention map of size (1×1×H×W), which is multiplied by the feature map after scale-dimension attention processing to obtain the feature map after spatial-dimension attention processing.
The channel-dimension attention sub-network first processes the feature map after spatial-dimension attention processing through a convolution layer, then performs global average pooling on the processed feature map in the channel dimension to obtain a feature vector of size (1×C×1×1). This feature vector is processed through a fully connected layer and a convolution layer (with a 1×1 convolution kernel) to obtain the channel-dimension attention vector, which is multiplied by the feature map after spatial-dimension attention processing to obtain the output feature map of the multi-scale attention aggregation module.
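A hedged sketch of the multi-scale attention aggregation module is given below. The cascaded features are kept as a 5-D tensor of shape (batch, S, C, H, W); collapsing the scale dimension by summation after the scale attention, the sigmoid gating, and the exact layer shapes are assumptions of this sketch, since the text fixes only the pooling dimensions and the order scale, then space, then channel:

import torch
import torch.nn as nn

class MultiScaleAttentionAggregation(nn.Module):
    def __init__(self, num_scales, channels):
        super().__init__()
        self.scale_fc = nn.Linear(num_scales, num_scales)  # equivalent to a 1x1 conv on the pooled scale vector
        self.spatial_pre = nn.Conv2d(channels, channels, 3, padding=1)
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)
        self.channel_pre = nn.Conv2d(channels, channels, 3, padding=1)
        self.channel_fc = nn.Sequential(                   # FC layer + 1x1-conv-like layer
            nn.Linear(channels, channels),
            nn.Linear(channels, channels))

    def forward(self, feats):
        # feats: list of S enhanced feature maps, each already resized to (B, C, H, W).
        x = torch.stack(feats, dim=1)                      # cascade -> (B, S, C, H, W)
        b, s, c, h, w = x.shape
        # Scale-dimension attention: global average pooling over everything but S.
        sv = x.mean(dim=(2, 3, 4))                         # (B, S)
        x = x * torch.sigmoid(self.scale_fc(sv)).view(b, s, 1, 1, 1)
        x = x.sum(dim=1)                                   # fuse scales -> (B, C, H, W)
        # Spatial-dimension attention: average- and max-pool across channels.
        y = self.spatial_pre(x)
        sp = torch.cat([y.mean(dim=1, keepdim=True),
                        y.max(dim=1, keepdim=True).values], dim=1)  # (B, 2, H, W)
        x = x * torch.sigmoid(self.spatial_conv(sp))       # (B, C, H, W)
        # Channel-dimension attention: global average pooling over the spatial dims.
        z = self.channel_pre(x).mean(dim=(2, 3))           # (B, C)
        x = x * torch.sigmoid(self.channel_fc(z)).view(b, c, 1, 1)
        return x                                           # multi-dimensional fusion feature map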
In this embodiment, through attention processing along the three dimensions of scale, space and channel, the obtained multi-dimensional fusion feature map highlights the spatial and semantic information that effectively improves the accuracy of the segmentation result.
It can be understood that, after the multi-scale attention aggregation module, the decoder of the image segmentation network model provided in this embodiment further comprises an up-sampling layer and a convolution layer: the up-sampling layer up-samples the output feature map of the multi-scale attention aggregation module by a factor of 4, and the segmentation result map is then obtained through the convolution layer.
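A corresponding sketch of this decoder tail (bilinear interpolation and the number of classes are assumptions; the text fixes only the up-sampling factor of 4 and the final convolution layer):

import torch.nn as nn
import torch.nn.functional as F

class SegmentationHead(nn.Module):
    def __init__(self, channels, num_classes):
        super().__init__()
        self.classifier = nn.Conv2d(channels, num_classes, kernel_size=1)

    def forward(self, fused):
        # Up-sample the fused feature map by a factor of 4, then classify per pixel.
        up = F.interpolate(fused, scale_factor=4, mode="bilinear", align_corners=False)
        return self.classifier(up)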
In summary, according to the image segmentation method based on boundary enhancement, on the one hand, when boundary extraction is performed on the feature map, boundary extraction is also performed on the label image to obtain the boundary label, and the boundary extracted from the feature map is supervised by the boundary label, so that the extracted boundary information is more accurate; on the other hand, during feature fusion, attention processing along the scale, spatial and channel dimensions makes the obtained multi-dimensional fusion feature map highlight the spatial and semantic information that most effectively improves the segmentation result, thereby improving the accuracy of image segmentation.
The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims (9)

1. An image segmentation method based on boundary enhancement, comprising:
establishing a boundary enhanced image segmentation network model based on the encoder-decoder framework;
training the image segmentation network model, and segmenting an input image by using the trained network model to obtain a segmentation result map;
the encoder of the image segmentation network model comprises a first feature extraction module, a boundary extraction module and a second feature extraction module;
the first feature extraction module is used for carrying out feature extraction on the input image to obtain a first feature map;
the boundary extraction module is used for extracting boundary features of the first feature map, extracting boundary labels of image labels corresponding to the input images, and supervising the output boundary features by utilizing the boundary labels to obtain a boundary feature map;
the second feature extraction module is used for carrying out multi-scale feature extraction on the first feature map and the boundary feature map to obtain feature maps with different scales;
the decoder of the image segmentation network model comprises a bidirectional mutual enhancement module and a multi-scale attention aggregation module; the bidirectional mutual enhancement module is used for processing the feature maps of different scales to obtain enhanced feature maps of different scales;
the multi-scale attention aggregation module is used for performing attention aggregation processing on the enhanced feature map based on the scale dimension, the space dimension and the channel dimension to obtain a multi-dimensional fusion feature map.
2. The boundary enhancement based image segmentation method according to claim 1, wherein the first feature extraction module is a convolution layer comprising 1 3×3 convolution operation (Conv), 1 batch normalization (BN) and 1 linear rectification function (ReLU).
3. The image segmentation method based on boundary enhancement according to claim 1, wherein the boundary extraction module adopts the Laplace operator, the Sobel operator, the Canny operator or the LoG operator to extract the boundary label.
4. The boundary enhancement based image segmentation method according to claim 1, wherein the boundary extraction module comprises 2 Laplace operators and 1 convolution layer; wherein:
the first Laplace operator is used for extracting boundary features of the first feature map, and the second Laplace operator is used for extracting boundary labels of image labels corresponding to the input image;
the convolution layer is used for processing the boundary features and supervising the output of the first feature extraction module based on the boundary labels in a model training stage to obtain a boundary feature map.
5. The boundary-enhancement-based image segmentation method according to claim 1, wherein the second feature extraction module employs a convolutional neural network architecture, which may be any one of ResNet-50, ResNet-152, ResNeXt-50, ResNeXt-101 or ResNeXt-152.
6. The boundary enhancement based image segmentation method according to claim 1, wherein the second feature extraction module adopts a convolutional neural network architecture based on ResNet-101, comprising a downsampling layer, a ResNet-101 first stage, a ResNet-101 second stage, a ResNet-101 third stage and a ResNet-101 fourth stage; wherein:
the ResNet-101 first stage contains 3 residual blocks, the ResNet-101 second stage contains 4 residual blocks, the ResNet-101 third stage contains 23 residual blocks, and the ResNet-101 fourth stage contains 3 residual blocks.
7. The image segmentation method based on boundary enhancement according to claim 1, wherein the bidirectional mutual enhancement module comprises a plurality of bidirectional mutual enhancement sub-networks with the same structure, each comprising a first enhancement module and a second enhancement module, which enhance feature maps of two different sizes respectively; wherein:
the first enhancement module first processes the input feature map of a first size with two convolution layers to obtain a second feature map of the first size; it then processes the input feature map of the first size through a convolution layer, a pooling layer and a Sigmoid activation layer in sequence to obtain a third feature map of a second size;
correspondingly, the second enhancement module first processes the input feature map of the second size with two convolution layers to obtain a fourth feature map of the second size; it then processes the input feature map of the second size through a convolution layer, an up-sampling layer and a Sigmoid activation layer in sequence to obtain a fifth feature map of the first size;
the first enhancement module is further configured to multiply the second feature map by the fifth feature map and pass the result through a convolution layer to obtain a first enhanced feature map of the first size;
the second enhancement module is further configured to multiply the third feature map by the fourth feature map and pass the result through a convolution layer to obtain a second enhanced feature map of the second size.
8. The boundary-enhancement-based image segmentation method according to claim 6, wherein the bidirectional mutual enhancement module comprises two bidirectional mutual enhancement sub-networks, wherein the inputs of one sub-network are the output feature maps of the ResNet-101 first and third stages, and the inputs of the other are the output feature maps of the ResNet-101 second and fourth stages.
9. The boundary-enhancement-based image segmentation method of claim 1, wherein the multi-scale attention aggregation module comprises a multi-scale fusion sub-network, a scale-dimension attention sub-network, a spatial-dimension attention sub-network and a channel-dimension attention sub-network; wherein:
the multi-scale fusion sub-network is used for performing size transformation and feature cascading on the enhanced feature maps of different scales;
the scale-dimension attention sub-network is used for performing global average pooling on the cascaded feature maps in the scale dimension to obtain the feature map after scale-dimension attention processing;
the spatial-dimension attention sub-network is used for performing global average pooling and max pooling on the feature map after scale-dimension attention processing in the spatial dimension to obtain the feature map after spatial-dimension attention processing;
the channel-dimension attention sub-network is used for performing global average pooling on the feature map after spatial-dimension attention processing in the channel dimension to obtain the multi-dimensional fusion feature map.

Cited By (1)

* Cited by examiner, † Cited by third party

CN116721351A * (priority 2023-07-06, published 2023-09-08, assignee 内蒙古电力(集团)有限责任公司内蒙古超高压供电分公司): Remote sensing intelligent extraction method for road environment characteristics in overhead line channel


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination