CN114565770A - Image segmentation method and system based on edge auxiliary calculation and mask attention - Google Patents
Image segmentation method and system based on edge auxiliary calculation and mask attention
- Publication number: CN114565770A
- Application number: CN202210288277.3A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/253 — Pattern recognition; analysing; fusion techniques of extracted features
- G06N3/045 — Neural networks; architecture; combinations of networks
- G06N3/08 — Neural networks; learning methods
Abstract
The invention discloses an image segmentation method and system based on edge auxiliary computation and mask attention. A feature encoder constructed from multi-stage cascaded residual modules is established; an edge feature map is obtained by fusing three shallow feature maps, and an edge prediction image is obtained after feature dimensionality reduction, which enhances the representation capability of the first three encoder stages. The output of the last residual module passes sequentially through a series of feature decoders and mask attention modules; the mask attention modules sharpen each decoder stage's focus on key local regions, and each stage outputs a predicted segmentation image at the corresponding scale. The output feature map of the feature decoder is fused with the edge feature map of the first three residual stages, and the final segmentation result image is predicted through feature dimensionality reduction. Compared with existing image segmentation methods, the method provides more accurate segmentation edge prediction, is applicable to image segmentation in various complex scenes, and has stronger generalization performance and a better segmentation effect.
Description
Technical Field
The invention belongs to the field of computer vision, and relates to an image segmentation method and system based on edge auxiliary calculation and mask attention.
Background
Image segmentation based on deep learning is an important research direction in computer vision and has been widely applied. Current methods classify each pixel in an image with a deep learning model to obtain the semantic category of every pixel. However, existing methods still suffer from the following problems: the model segments target edges in the image inaccurately; multi-scale context information is not fully exploited; excessive information is lost during prediction; and the target loss function used for model optimization is too simple to model the task effectively. These problems ultimately degrade the segmentation effect of the model.
Explanation of terms:
BatchNorm layer: keeps the input of each layer of a deep neural network identically distributed during training. It first computes the overall mean and variance of the input data, then performs a normalization operation, and finally scales and shifts the result according to learned scaling and translation factors.
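A minimal NumPy sketch of the computation just described, assuming a learned scale gamma and shift beta (this is an illustration of the normalize–scale–shift steps, not the patent's code):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize over the batch axis, then scale by gamma and shift by beta,
    mirroring the three steps described above."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, unit variance per feature
    return gamma * x_hat + beta              # learned scaling and translation

x = np.array([[1.0, 2.0], [3.0, 4.0]])
y = batch_norm(x, gamma=1.0, beta=0.0)  # each column now has mean ~0
```

With gamma = 1 and beta = 0 the output is simply the normalized input, so each feature's batch mean is (numerically) zero.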
ReLU layer: implements the linear rectification function, used as an activation function in neural networks; for an input x, the output after the ReLU layer is the larger of 0 and x.
MaxPool layer: divides the whole feature map into non-overlapping blocks of equal size, keeps only the maximum value in each block, and discards the rest. The MaxPool layers mentioned in this patent have size 2 × 2 and stride 2, so the width and height of the output feature map are half those of the input feature map.
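The 2 × 2, stride-2 pooling described above can be sketched as follows (illustrative NumPy, assuming even height and width):

```python
import numpy as np

def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2: keep only the maximum value in each
    non-overlapping 2x2 block, halving the width and height."""
    h, w = fmap.shape
    return fmap.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1, 2, 5, 6],
                 [3, 4, 7, 8],
                 [9, 1, 2, 3],
                 [4, 5, 6, 7]], dtype=float)
pooled = max_pool_2x2(fmap)  # shape (2, 2): one max per 2x2 block
```

Here the four 2 × 2 blocks reduce to their maxima 4, 8, 9 and 7.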
Conv Block: for simplicity, the following description omits the BatchNorm and ReLU layers. The input passes through two branches: (1) a 1 × 1 convolution layer, a 3 × 3 convolution layer and a 1 × 1 convolution layer in sequence; (2) a 1 × 1 convolution layer that changes the number of feature map channels. The feature maps output by the two branches are then fused by element-wise addition to produce the output.
Identity Block: for simplicity, the following description omits the BatchNorm and ReLU layers. The input passes through two branches: (1) a 1 × 1 convolution layer, a 3 × 3 convolution layer and a 1 × 1 convolution layer in sequence; (2) a skip connection whose output equals the input. The feature maps output by the two branches are then fused by element-wise addition to produce the output.
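A hedged PyTorch sketch of the two blocks as described (BatchNorm/ReLU are omitted exactly as in the text above; the channel counts are illustrative assumptions, not taken from the patent):

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Residual block whose shortcut is a 1x1 convolution, so input and
    output channel counts may differ (used to change network dimensions)."""
    def __init__(self, in_ch, mid_ch, out_ch):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1),
            nn.Conv2d(mid_ch, mid_ch, 3, padding=1),
            nn.Conv2d(mid_ch, out_ch, 1),
        )
        self.shortcut = nn.Conv2d(in_ch, out_ch, 1)  # changes channel count

    def forward(self, x):
        return self.branch(x) + self.shortcut(x)  # fuse branches by addition

class IdentityBlock(nn.Module):
    """Residual block with an identity shortcut: output channels equal
    input channels, so blocks can be stacked to deepen the network."""
    def __init__(self, ch, mid_ch):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(ch, mid_ch, 1),
            nn.Conv2d(mid_ch, mid_ch, 3, padding=1),
            nn.Conv2d(mid_ch, ch, 1),
        )

    def forward(self, x):
        return self.branch(x) + x  # skip connection: output = branch + input

x = torch.randn(1, 64, 32, 32)
y = ConvBlock(64, 32, 128)(x)   # channels 64 -> 128
z = IdentityBlock(128, 32)(y)   # channels preserved
```

The Conv Block changes the channel dimension (64 → 128 here); the Identity Block preserves it, which is why only Identity Blocks can be chained without further shape changes.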
Disclosure of Invention
In order to solve the technical problems, the invention discloses an image segmentation method and system based on edge auxiliary calculation and mask attention, and the invention improves the segmentation effect and the segmentation accuracy of the image.
The technical scheme adopted by the invention for solving the technical problems is as follows:
an image segmentation system based on boundary-aware attention comprises a feature encoder constructed from n stages of cascaded residual modules conv, with a feature decoder block arranged for each stage of residual module and a mask attention module mask_attention block arranged for each stage of feature decoder block; the feature map output by the a-th stage residual module is the input of the (a+1)-th stage residual module; the feature map output by the a-th stage residual module is also input to the a-th stage feature decoder block, and the output feature map of the a-th stage feature decoder block is input to the a-th stage mask attention block; the enhanced feature map output by the a-th stage mask attention block is input to the (a-1)-th stage feature decoder;
the outputs of the first 3 stages of residual modules undergo dimensionality reduction and upsampling respectively to obtain three shallow feature maps, which are fused to obtain the final edge feature map efeature; channel dimensionality reduction of efeature yields the edge prediction image edge_prediction; the final edge feature map is spliced and fused with the last enhanced feature map last_feature output by the first-stage mask attention module to obtain the final predicted segmentation image prediction_512; n ≥ 3.
In a further improvement, n = 5; the first-stage residual module conv1 comprises a convolution layer with a 7 × 7 kernel, a BatchNorm layer, a ReLU layer and a MaxPool layer; the remaining residual modules comprise Conv Blocks and Identity Blocks; the input and output dimensions of a Conv Block differ, and it is used to change the dimensions of the network; the input and output dimensions of an Identity Block are the same, and it is used to deepen the network. An input image passes through the five residual modules to obtain feature maps feature_1, feature_2, feature_3, feature_4 and feature_5. Feature maps feature_1, feature_2 and feature_3 undergo channel dimensionality reduction by convolution layers with 1 × 1 kernels, feature extraction by convolution layers with 3 × 3 kernels, and linear-interpolation upsampling with factors 2, 4 and 8 respectively, yielding three edge feature maps efeature_1, efeature_2 and efeature_3 of the same scale; the three edge feature maps efeature_1, efeature_2 and efeature_3 are fused by splicing (concatenation) to obtain the final edge feature map, and channel dimensionality reduction with a 1 × 1 convolution layer yields the edge prediction image edge_prediction.
In a further improvement, the feature map output by the a-th stage residual module and the enhanced feature map output by the (a+1)-th stage mask attention block are input to the a-th stage feature decoder block; in the a-th stage feature decoder block, the enhanced feature map undergoes linear-interpolation upsampling with a factor of 2, is fused with the encoder feature map by a splicing operation, and passes through two convolution layers with 3 × 3 kernels to produce the output feature map;
the output feature map of the a-th stage feature decoder block is input to the a-th stage mask attention block; the output feature map first passes through a convolution layer with a 3 × 3 kernel and one with a 1 × 1 kernel to obtain a predicted segmentation image at the corresponding scale; this predicted segmentation image, used as the mask attention map mask_attention, is multiplied with the output feature map to obtain the attention feature map att_feature, which is added directly to the output feature map to obtain the enhanced feature map; the mask attention modules of the fifth to second stages output predicted segmentation images prediction_x with scales 32 × 32, 64 × 64, 128 × 128 and 256 × 256 respectively, x = 2, 3, 4, 5.
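A PyTorch sketch of the mask attention block described above. The channel count and the Sigmoid on the intermediate prediction are illustrative assumptions; the text specifies only the 3 × 3 and 1 × 1 convolutions, the multiplication, and the addition:

```python
import torch
import torch.nn as nn

class MaskAttentionBlock(nn.Module):
    """The decoder's output feature map is reduced to a one-channel predicted
    segmentation map, which is then used as an attention mask over the same
    features; the attended features are added back to the input."""
    def __init__(self, ch):
        super().__init__()
        self.predict = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1),  # 3x3 convolution
            nn.Conv2d(ch, 1, 1),              # 1x1 convolution -> 1 channel
            nn.Sigmoid(),                     # assumed: map to [0, 1]
        )

    def forward(self, feat):
        mask_attention = self.predict(feat)  # prediction_x at this scale
        att_feature = feat * mask_attention  # attention-weighted features
        enhanced = feat + att_feature        # direct addition (skip connection)
        return enhanced, mask_attention

feat = torch.randn(1, 64, 32, 32)
enhanced, pred = MaskAttentionBlock(64)(feat)
```

The enhanced map keeps the input's shape, while the one-channel prediction doubles as both a supervised output (prediction_x) and the attention mask.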
In a further improvement, the last enhanced feature map last_feature output by the first-stage mask attention module is spliced with the final edge feature map efeature; the result of the splicing operation then passes through a convolution layer with a 1 × 1 kernel and a Sigmoid activation function to obtain the final predicted segmentation image prediction_512;
computing an aggregation loss function Loss for the edge prediction image edge_prediction, the multi-scale predicted segmentation images prediction_x, and the final predicted segmentation image prediction_512 respectively:
Loss=BCELoss+DiceLoss+JaccardLoss
BCELoss computes the binary cross-entropy loss in a single-label binary classification setting: one input sample picture corresponds to one output segmentation picture. For a batch dataset D(p, y) containing N sample pictures, p is the prediction result with values in [0, 1], and y is the label information with value 0 or 1. BCELoss is computed as:
BCELoss = -(1/N) Σ_{i=1}^{N} [ y_i · log(p_i) + (1 - y_i) · log(1 - p_i) ]
wherein p_i denotes the prediction result of the i-th sample picture, and y_i denotes the label information of the i-th sample picture;
wherein DiceLoss = 1 - Dice(P, Y);
The Dice (P, Y) represents a Dice coefficient, P is a prediction result and ranges from 0 to 1, and Y is label information and takes a value of 0 or 1;
JaccardLoss=1-Jaccard(P,Y);
wherein Jaccard (P, Y) represents Jaccard coefficient;
obtaining the total aggregation loss function Loss_sum over the edge prediction image edge_prediction, the multi-scale predicted segmentation images prediction_x and the final predicted segmentation image prediction_512:
Loss_sum = Loss(32) + Loss(64) + Loss(128) + Loss(256) + Loss(512) + Loss(edge)
wherein Loss(32), Loss(64), Loss(128) and Loss(256) are the aggregation loss functions of the predicted segmentation images output by the mask attention modules of the fifth to second stages, respectively; Loss(512) is the aggregation loss function of the final predicted segmentation image prediction_512; and Loss(edge) is the aggregation loss function of the edge prediction image edge_prediction;
the total aggregation loss function Loss_sum is optimized to its minimum, yielding the optimized image segmentation system.
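The per-output aggregation loss Loss = BCELoss + DiceLoss + JaccardLoss can be sketched in NumPy as follows; the epsilon smoothing terms are an implementation assumption to avoid division by zero, not part of the patent's formulas:

```python
import numpy as np

def bce_loss(p, y, eps=1e-7):
    """Binary cross entropy averaged over all pixels/samples."""
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def dice_loss(p, y, eps=1e-7):
    """DiceLoss = 1 - Dice(P, Y), with Dice = 2|P∩Y| / (|P| + |Y|)."""
    inter = (p * y).sum()
    return float(1 - (2 * inter + eps) / (p.sum() + y.sum() + eps))

def jaccard_loss(p, y, eps=1e-7):
    """JaccardLoss = 1 - Jaccard(P, Y), with Jaccard = |P∩Y| / |P∪Y|."""
    inter = (p * y).sum()
    union = p.sum() + y.sum() - inter
    return float(1 - (inter + eps) / (union + eps))

def aggregation_loss(p, y):
    """Loss = BCELoss + DiceLoss + JaccardLoss."""
    return bce_loss(p, y) + dice_loss(p, y) + jaccard_loss(p, y)

y = np.array([1.0, 1.0, 0.0, 0.0])
perfect = aggregation_loss(np.array([1.0, 1.0, 0.0, 0.0]), y)  # near 0
poor = aggregation_loss(np.full(4, 0.5), y)                    # much larger
```

A perfect prediction drives all three terms toward zero, while an uninformative 0.5 prediction is penalized by all three, which is the point of aggregating region-overlap losses (Dice, Jaccard) with the pixel-wise BCE.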
In a further improvement, the total aggregation loss function Loss_sum is optimized using the Adam gradient descent algorithm.
An initial image is input into the boundary-aware-attention-based image segmentation system to obtain the final edge feature map efeature and the final predicted segmentation image prediction_512.
The invention has the advantages that:
1. aiming at the problem that the edge segmentation of a model to a target in an image is inaccurate in the prior art, the invention provides edge auxiliary calculation, the structure takes cascade of depth residual modules as a feature coding path, semantic information is transmitted layer by layer, the representation capability and the feature extraction capability of a feature encoder in the first three layers are enhanced by fusing shallow low-dimensional high-fine granularity detail features, the optimization process of the model parameters is assisted by an edge loss function, and guidance is provided for image segmentation in a feature decoding path, so that the edge of a segmented target is more accurate and clear.
2. Aiming at the problem that multi-scale context information cannot be fully utilized and a target loss function of model optimization is too single in the prior art, the invention provides a mask attention structure and a multi-scale aggregation loss function, wherein the structure takes a double convolution layer and mask attention module as a feature decoding path, focuses on the position containing important information in a feature space, and supplements detailed local features layer by layer. The method carries out strong supervision learning aiming at multi-scale segmentation predicted images, and the scales are fused layer by layer, so that the global and detail local characteristics required by segmentation are continuously enriched and perfected, the spatial resolution of a characteristic diagram is improved, and the accuracy and the effect of target segmentation in the images are further improved.
Drawings
FIG. 1 is a block diagram of the network model structure of the image segmentation algorithm according to an embodiment of the present invention;
FIG. 2 is a block diagram of an encoder path structure according to an embodiment of the present invention;
FIG. 3 is a block diagram of a mask attention module according to an embodiment of the present invention.
Detailed Description
The present invention is further illustrated by the following examples.
Example 1
An image segmentation method based on edge-aided calculation and mask attention, the frame diagram of the method is shown in figure 1, and the method comprises the following steps:
s1, establishing a multi-order cascaded residual error module-constructed feature encoder, respectively performing dimensionality reduction and upsampling on the output of the first three-order residual error module, fusing three shallow feature maps to obtain an edge feature map, obtaining an edge predicted image after feature dimensionality reduction, and enhancing the characterization capability of the first three-layer feature encoder, wherein the specific implementation method comprises the following steps:
the feature encoder is composed of five levels including a Conv1 layer, a Conv2_ x layer, a Conv3_ x layer, a Conv4_ x layer and a Conv5_ x layer, wherein the Conv1 layer comprises a convolution layer with a convolution kernel of 7 × 7, a BatchNorm layer, a ReLu layer and a MaxPoint layer, all levels except the Conv1 layer are cascade residual blocks, the residual blocks mainly comprise a Conv Block and an Identity Block, the input dimension and the output dimension of the Conv Block are different, the dimension of the Identity Block is used for changing the dimension of the network, and the input dimension and the output dimension of the Identity Block are the same and can be connected in series for deepening the network. An input image respectively obtains a feature map feature _1, a feature map feature _3, a feature map feature _4 and a feature map feature _5 through five levels of a feature encoder, the feature maps feature _1, feature _2 and feature _3 corresponding to the first three levels are taken out, channel dimensionality reduction is respectively carried out on convolutional layers with convolution kernel of 1 x 1, feature extraction is carried out on convolutional layers with convolution kernel of 3 x 3, linear interpolation up-sampling operation with factors of 2, 4 and 8 is respectively carried out, and three edge feature maps effect _1, effect _2 and effect _3 with the same scale are obtained. And performing feature fusion on the three edge feature maps in a splicing mode to obtain a final edge feature map efeature, and performing channel dimension reduction on the edge feature map efeature by adopting a convolution layer with a convolution kernel of 1 x 1 to obtain an edge prediction image edge _ prediction. 
The method can fuse the detail features of shallow layer, low dimension and high fine granularity, enhance the representation capability and the feature extraction capability of the feature encoder of the first three layers, assist the optimization process of model parameters through an edge loss function, and provide guidance for image segmentation in a feature decoding path, so that the edge of a segmented target is more accurate and clear.
S2, the output of the last residual stage passes sequentially through a series of feature decoders, responsible for upsampling and skip connections, and mask attention modules; the mask attention modules sharpen each decoder stage's focus on key local regions, and each stage outputs a segmentation result image predicted at the corresponding scale. The specific implementation is as follows:
the input of the mask attention module firstly passes through convolution layers with convolution kernels of 3 × 3 and convolution kernels of 1 × 1 to obtain a prediction segmentation image with a corresponding scale, the prediction segmentation image is used as a mask attention map mask _ attribute to be multiplied by the input to obtain an attention feature map att _ feature, and the attention feature map att _ feature is directly added with the input through jump connection to obtain an enhanced feature map. The feature decoder is composed of two convolution layers with convolution kernel of 3 x 3, the input is the feature graph of the corresponding hierarchical feature encoder and the feature graph of the previous mask attention module, wherein the feature graph of the previous mask attention module needs to be subjected to linear interpolation upsampling with factor of 2, and feature fusion of the two feature graphs is completed by utilizing a splicing operation under the condition that the feature graphs have the same scale. In this process, four different scales of the predictive segmented image predictive _ x are generated, with the scales being 32 × 32, 64 × 64, 128 × 128, 256 × 256, respectively. The step can be used for obtaining the mask attention map according to different training of importance of spatial position information on the image, namely extracting a mask attention moment array on an information path of a multi-scale decoder module to guide the segmentation of semantic information of the target, determining the spatial position needing important attention and finally improving the overall segmentation effect of the target.
S3, fusing the output feature map of the feature decoder with the edge feature map of the first three residual stages to introduce the high-fine-granularity information of the edges and improve edge prediction accuracy in the segmentation result; predicting the final segmentation result image through feature dimensionality reduction; and computing the aggregation loss function used for model parameter optimization over the six kinds of prediction result maps output by the model. The specific implementation is as follows:
The last feature map of the last feature decoder and the edge feature map efeature computed from the first three residual stages are fused by a splicing operation; the result then passes through a convolution layer with a 1 × 1 kernel and a Sigmoid activation function to obtain the final predicted segmentation image prediction_512. An aggregation loss function, the sum of BCELoss, DiceLoss and JaccardLoss, is computed for each prediction result: the edge prediction image edge_prediction, the multi-scale predicted segmentation images prediction_x, and the final predicted segmentation image prediction_512; the total loss during model training is the sum of these aggregation losses. The Adam gradient descent algorithm designs an independent adaptive learning rate for each parameter by computing first- and second-moment estimates of the gradient; it performs well on non-stationary and online problems, and is used here to optimize the model parameters. This step applies strongly supervised learning to the multi-scale outputs of the decoder path; under the constraint of multiple optimization targets the model parameters are optimized faster and better, showing excellent performance on the segmentation problem, and in the process multi-scale information is aggregated more effectively, helping the final image segmentation.
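A sketch of the total loss and one Adam optimization step. The one-convolution "model" and all names here are illustrative stand-ins, not the patent's network; the real model outputs six maps (prediction_32 through prediction_512 and edge_prediction), whose aggregation losses are summed:

```python
import torch
import torch.nn as nn

def aggregation_loss(p, y, eps=1e-7):
    """BCE + Dice + Jaccard on probability map p against binary label map y."""
    bce = nn.functional.binary_cross_entropy(p, y)
    inter = (p * y).sum()
    dice = 1 - (2 * inter + eps) / (p.sum() + y.sum() + eps)
    jaccard = 1 - (inter + eps) / (p.sum() + y.sum() - inter + eps)
    return bce + dice + jaccard

# Toy one-conv "model" standing in for the full encoder-decoder network.
model = nn.Sequential(nn.Conv2d(1, 1, 3, padding=1), nn.Sigmoid())
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam, as in the text

x = torch.rand(1, 1, 16, 16)
y = (torch.rand(1, 1, 16, 16) > 0.5).float()
preds = [model(x)]  # stand-in for the six prediction maps
loss_sum = sum(aggregation_loss(p, y) for p in preds)  # Loss_sum over outputs
optimizer.zero_grad()
loss_sum.backward()
optimizer.step()
```

In the full system the `preds` list would hold the six outputs, so every scale receives direct gradient supervision from the same aggregation loss.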
The embodiment of the invention also provides an image segmentation system based on edge auxiliary calculation and mask attention, which comprises computer equipment; the computer device is configured or programmed for performing the steps of the above-described embodiment method.
In the invention, the computer equipment can be a microprocessor, an upper computer and other equipment.
The method was tested on the publicly available Montgomery CXR dataset and compared with experimental results from papers published in the last five years that use this dataset (some evaluation index results are absent because the papers adopt different evaluation indexes). The comparison shows that the proposed method has clear advantages on the three indexes acc, dice and jaccard.
Finally, it should be noted that the above embodiments are only used for illustrating the technical solutions of the present invention and not for limiting the protection scope of the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
Claims (7)
1. An image segmentation system based on boundary-aware attention, characterized by comprising a feature encoder constructed from n stages of cascaded residual modules conv, with a feature decoder block arranged for each stage of residual module and a mask attention module arranged for each stage of feature decoder block; the feature map output by the a-th stage residual module is the input of the (a+1)-th stage residual module; the feature map output by the a-th stage residual module is input to the a-th stage feature decoder block, and the output feature map of the a-th stage feature decoder block is input to the a-th stage mask attention block mask_attention block; the enhanced feature map output by the a-th stage mask attention block is input to the (a-1)-th stage feature decoder;
the outputs of the first 3 stages of residual modules undergo dimensionality reduction and upsampling respectively to obtain three shallow feature maps, which are fused to obtain the final edge feature map efeature; dimensionality reduction of efeature yields the edge prediction image edge_prediction; the final edge feature map is spliced and fused with the last enhanced feature map last_feature output by the first-stage mask attention module to obtain the final predicted segmentation image prediction_512; n ≥ 3.
2. The boundary-aware-attention-based image segmentation system as claimed in claim 1, wherein n = 5; the first-stage residual module conv1 comprises a convolution layer with a 7 × 7 kernel, a BatchNorm layer, a ReLU layer and a MaxPool layer; the remaining residual modules comprise Conv Blocks and Identity Blocks; the input and output dimensions of a Conv Block differ, and it is used to change the dimensions of the network; the input and output dimensions of an Identity Block are the same, and it is used to deepen the network; an input image passes through the five residual modules to obtain feature maps feature_1, feature_2, feature_3, feature_4 and feature_5; feature maps feature_1, feature_2 and feature_3 undergo channel dimensionality reduction by convolution layers with 1 × 1 kernels, feature extraction by convolution layers with 3 × 3 kernels, and linear-interpolation upsampling with factors 2, 4 and 8 respectively, yielding three edge feature maps efeature_1, efeature_2 and efeature_3 of the same scale; the three edge feature maps efeature_1, efeature_2 and efeature_3 are fused by splicing to obtain the final edge feature map, and channel dimensionality reduction with a 1 × 1 convolution layer yields the edge prediction image edge_prediction.
3. The boundary-aware-attention-based image segmentation system as claimed in claim 2, wherein the feature map output by the a-th stage residual module and the enhanced feature map output by the (a+1)-th stage mask attention block are input to the a-th stage feature decoder block; in the a-th stage feature decoder block, the enhanced feature map undergoes linear-interpolation upsampling with a factor of 2, is fused with the encoder feature map by a splicing operation, and passes through two convolution layers with 3 × 3 kernels to produce the output feature map;
the output feature map of the a-th stage feature decoder block is input to the a-th stage mask attention block; the output feature map first passes through a convolution layer with a 3 × 3 kernel and one with a 1 × 1 kernel to obtain a predicted segmentation image at the corresponding scale; this predicted segmentation image, used as the mask attention map mask_attention, is multiplied with the output feature map to obtain the attention feature map att_feature, which is added directly to the output feature map to obtain the enhanced feature map; the mask attention modules of the fifth to second stages output predicted segmentation images prediction_x with scales 32 × 32, 64 × 64, 128 × 128 and 256 × 256 respectively, x = 2, 3, 4, 5.
4. The boundary-aware attention-based image segmentation system as claimed in claim 3, wherein the final enhanced feature map last_feature output by the first-stage mask attention module is spliced with the final edge feature map edge_feature, and the result of the splicing operation is passed through a convolution layer with a 1 × 1 convolution kernel and a Sigmoid activation function to obtain the final predicted segmentation image prediction_512;
the aggregation loss function Loss is calculated for each of the edge prediction image edge_prediction, the multi-scale predicted segmentation images prediction_x, and the final predicted segmentation image prediction_512:
Loss = BCELoss + DiceLoss + JaccardLoss
BCELoss computes the binary cross-entropy loss in a single-label binary classification setting, where one input sample picture corresponds to one output segmentation picture; for a batch data set D(p, y) containing N sample pictures, p is the prediction result with values in the range 0 to 1, and y is the label information taking the value 0 or 1; BCELoss is calculated as

BCELoss = -(1/N) Σ_{i=1}^{N} [ y_i · log(p_i) + (1 − y_i) · log(1 − p_i) ]

where p_i denotes the prediction result of the i-th sample picture and y_i denotes the label information of the i-th sample picture;
DiceLoss = 1 − Dice(P, Y), where Dice(P, Y) denotes the Dice coefficient, P is the prediction result with values in the range 0 to 1, and Y is the label information taking the value 0 or 1;
JaccardLoss = 1 − Jaccard(P, Y), where Jaccard(P, Y) denotes the Jaccard coefficient;
the total aggregation loss function Loss_sum over the edge prediction image edge_prediction, the multi-scale predicted segmentation images prediction_x and the final predicted segmentation image prediction_512 is obtained as

Loss_sum = Loss(32) + Loss(64) + Loss(128) + Loss(256) + Loss(512) + Loss(edge)

where Loss(32), Loss(64), Loss(128) and Loss(256) are the aggregation loss functions of the predicted segmentation images output by the fifth- to second-stage mask attention modules; Loss(512) is the aggregation loss function of the final predicted segmentation image prediction_512; and Loss(edge) is the aggregation loss function of the edge prediction image edge_prediction;
the total aggregation loss function Loss_sum is optimized to its minimum to obtain the optimized image segmentation system.
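As an illustrative plain-Python sketch of the aggregation loss Loss = BCELoss + DiceLoss + JaccardLoss on flattened predictions (the smoothing constant EPS is an assumption for numerical stability and is not specified in the claims):

```python
import math

EPS = 1e-7  # hypothetical smoothing/clamping constant, not part of the claim

def bce_loss(p, y):
    """Binary cross-entropy over N sample values, p in [0, 1], y in {0, 1}."""
    total = 0.0
    for pi, yi in zip(p, y):
        pi = min(max(pi, EPS), 1.0 - EPS)  # clamp to avoid log(0)
        total += yi * math.log(pi) + (1.0 - yi) * math.log(1.0 - pi)
    return -total / len(p)

def dice_loss(p, y):
    """1 - Dice(P, Y), with Dice = 2|P ∩ Y| / (|P| + |Y|)."""
    inter = sum(pi * yi for pi, yi in zip(p, y))
    return 1.0 - (2.0 * inter + EPS) / (sum(p) + sum(y) + EPS)

def jaccard_loss(p, y):
    """1 - Jaccard(P, Y), with Jaccard = |P ∩ Y| / |P ∪ Y|."""
    inter = sum(pi * yi for pi, yi in zip(p, y))
    union = sum(p) + sum(y) - inter
    return 1.0 - (inter + EPS) / (union + EPS)

def aggregation_loss(p, y):
    """Loss = BCELoss + DiceLoss + JaccardLoss, as in claim 4."""
    return bce_loss(p, y) + dice_loss(p, y) + jaccard_loss(p, y)
```

The total Loss_sum of claim 4 would then simply sum aggregation_loss over the five predicted segmentation images and the edge prediction image.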
5. The boundary-aware attention-based image segmentation system as claimed in claim 4, wherein the total aggregation loss function Loss_sum is optimized using the Adam gradient descent algorithm.
6. A boundary-aware image segmentation method, characterized in that an initial image is input into the boundary-aware attention-based image segmentation system as claimed in any one of claims 1 to 5 to obtain the final edge feature map edge_feature and the final predicted segmentation image prediction_512.
7. The method as claimed in claim 6, wherein the initial image is input into the optimized image segmentation system as claimed in claim 4 or 5 to obtain the final edge feature map edge_feature and the final predicted segmentation image prediction_512.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210288277.3A CN114565770B (en) | 2022-03-23 | 2022-03-23 | Image segmentation method and system based on edge auxiliary calculation and mask attention |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114565770A true CN114565770A (en) | 2022-05-31 |
CN114565770B CN114565770B (en) | 2022-09-13 |
Family
ID=81719920
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210288277.3A Active CN114565770B (en) | 2022-03-23 | 2022-03-23 | Image segmentation method and system based on edge auxiliary calculation and mask attention |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114565770B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109584246A (en) * | 2018-11-16 | 2019-04-05 | 成都信息工程大学 | DCM myocardium diagnosis and treatment image segmentation method based on a multi-scale feature pyramid |
US10430946B1 (en) * | 2019-03-14 | 2019-10-01 | Inception Institute of Artificial Intelligence, Ltd. | Medical image segmentation and severity grading using neural network architectures with semi-supervised learning techniques |
CN111462126A (en) * | 2020-04-08 | 2020-07-28 | 武汉大学 | Semantic image segmentation method and system based on edge enhancement |
CN111539435A (en) * | 2020-04-15 | 2020-08-14 | 创新奇智(合肥)科技有限公司 | Semantic segmentation model construction method, image segmentation equipment and storage medium |
CN111583285A (en) * | 2020-05-12 | 2020-08-25 | 武汉科技大学 | Liver image semantic segmentation method based on edge attention strategy |
CN111986181A (en) * | 2020-08-24 | 2020-11-24 | 中国科学院自动化研究所 | Intravascular stent image segmentation method and system based on double-attention machine system |
CN112967300A (en) * | 2021-02-23 | 2021-06-15 | 艾瑞迈迪医疗科技(北京)有限公司 | Three-dimensional ultrasonic thyroid segmentation method and device based on multi-scale fusion network |
CN113379771A (en) * | 2021-07-02 | 2021-09-10 | 西安电子科技大学 | Hierarchical human body analytic semantic segmentation method with edge constraint |
CN114004811A (en) * | 2021-11-01 | 2022-02-01 | 西安交通大学医学院第二附属医院 | Image segmentation method and system based on a multi-scale residual encoder-decoder network |
CN114048822A (en) * | 2021-11-19 | 2022-02-15 | 辽宁工程技术大学 | Attention mechanism feature fusion segmentation method for image |
Non-Patent Citations (5)
Title |
---|
SHASHA LIU ET AL.: "Shape-aware Multi-task Learning for Semi-supervised 3D Medical Image Segmentation", 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) * |
ZHANG Z ET AL.: "ET-Net: A generic edge-attention guidance network for medical image segmentation", International Conference on Medical Image Computing and Computer Assisted Intervention - MICCAI 2019 * |
YU SHUAI ET AL.: "Remote sensing image segmentation method based on multi-level channel attention", Laser & Optoelectronics Progress * |
ZHAO NAN ET AL.: "Construction and validation of an intelligent measurement model for diabetic foot ulcers", Journal of Central South University (Medical Science) * |
GUO YUANCHEN ET AL.: "Sketch image retrieval based on edge map fusion under spatial attention", Journal of Computer-Aided Design & Computer Graphics * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115760810A (en) * | 2022-11-24 | 2023-03-07 | 江南大学 | Medical image segmentation apparatus, method and computer-readable storage medium |
CN115760810B (en) * | 2022-11-24 | 2024-04-12 | 江南大学 | Medical image segmentation apparatus, method and computer-readable storage medium |
CN115984293A (en) * | 2023-02-09 | 2023-04-18 | 中国科学院空天信息创新研究院 | Spatial target segmentation network and method based on edge perception attention mechanism |
CN115984293B (en) * | 2023-02-09 | 2023-11-07 | 中国科学院空天信息创新研究院 | Spatial target segmentation network and method based on edge perception attention mechanism |
CN116188501A (en) * | 2023-03-02 | 2023-05-30 | 江南大学 | Medical image segmentation method based on multi-scale cross attention |
CN116188501B (en) * | 2023-03-02 | 2024-02-13 | 江南大学 | Medical image segmentation method based on multi-scale cross attention |
CN116721351A (en) * | 2023-07-06 | 2023-09-08 | 内蒙古电力(集团)有限责任公司内蒙古超高压供电分公司 | Remote sensing intelligent extraction method for road environment characteristics in overhead line channel |
CN116703950A (en) * | 2023-08-07 | 2023-09-05 | 中南大学 | Camouflage target image segmentation method and system based on multi-level feature fusion |
CN116703950B (en) * | 2023-08-07 | 2023-10-20 | 中南大学 | Camouflage target image segmentation method and system based on multi-level feature fusion |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114565770B (en) | Image segmentation method and system based on edge auxiliary calculation and mask attention | |
US20210383231A1 (en) | Target cross-domain detection and understanding method, system and equipment and storage medium | |
CN110910391B (en) | Video object segmentation method for dual-module neural network structure | |
CN111062395B (en) | Real-time video semantic segmentation method | |
US20240029272A1 (en) | Matting network training method and matting method | |
CN111612008A (en) | Image segmentation method based on convolution network | |
CN114187311A (en) | Image semantic segmentation method, device, equipment and storage medium | |
CN111882620A (en) | Road drivable area segmentation method based on multi-scale information | |
Kang et al. | SdBAN: Salient object detection using bilateral attention network with dice coefficient loss | |
CN114820579A (en) | Semantic segmentation based image composite defect detection method and system | |
CN115620010A (en) | Semantic segmentation method for RGB-T bimodal feature fusion | |
CN117037119A (en) | Road target detection method and system based on improved YOLOv8 | |
US11948078B2 (en) | Joint representation learning from images and text | |
CN111739037B (en) | Semantic segmentation method for indoor scene RGB-D image | |
CN110852199A (en) | Foreground extraction method based on double-frame coding and decoding model | |
CN114419323A (en) | Cross-modal learning and domain self-adaptive RGBD image semantic segmentation method | |
CN111179272B (en) | Rapid semantic segmentation method for road scene | |
CN113139502A (en) | Unsupervised video segmentation method | |
CN116630932A (en) | Road shielding target detection method based on improved YOLOV5 | |
CN116596966A (en) | Segmentation and tracking method based on attention and feature fusion | |
CN114639101A (en) | Emulsion droplet identification system, method, computer equipment and storage medium | |
CN111753714B (en) | Multidirectional natural scene text detection method based on character segmentation | |
CN110942463B (en) | Video target segmentation method based on generation countermeasure network | |
Wang et al. | Feature enhancement: predict more detailed and crisper edges | |
CN114419078B (en) | Surface defect region segmentation method and device based on convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||