CN113421268A - Semantic segmentation method based on DeepLabv3+ network with multi-level channel attention mechanism - Google Patents

Semantic segmentation method based on DeepLabv3+ network with multi-level channel attention mechanism

Info

Publication number
CN113421268A
CN113421268A
Authority
CN
China
Prior art keywords
level
network
feature map
attention mechanism
channel attention
Prior art date
Legal status
Granted
Application number
CN202110637809.5A
Other languages
Chinese (zh)
Other versions
CN113421268B (en)
Inventor
欧晓焱
葛琦
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202110637809.5A
Publication of CN113421268A
Application granted
Publication of CN113421268B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N3/00 Computing arrangements based on biological models
            • G06N3/02 Neural networks
              • G06N3/04 Architecture, e.g. interconnection topology
                • G06N3/045 Combinations of networks
                • G06N3/048 Activation functions
              • G06N3/08 Learning methods
        • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T5/00 Image enhancement or restoration
            • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
          • G06T7/00 Image analysis
            • G06T7/10 Segmentation; Edge detection
          • G06T2207/00 Indexing scheme for image analysis or image enhancement
            • G06T2207/20 Special algorithmic details
              • G06T2207/20081 Training; Learning
              • G06T2207/20084 Artificial neural networks [ANN]
              • G06T2207/20212 Image combination
                • G06T2207/20221 Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a semantic segmentation method based on a DeepLabv3+ network with a multi-level channel attention mechanism, belonging to the technical field of image processing and segmentation, comprising the following steps. Step 1: input an image. Step 2: acquire high-level and low-level semantic features of the input image through a deep convolutional neural network. Step 3: send the high-level semantic features to the atrous spatial pyramid pooling module to obtain a first feature map. Step 4: send the first feature map obtained in step 3 into the multi-level channel attention mechanism module to obtain a second feature map. Step 5: perform bilinear interpolation upsampling on the second feature map obtained in step 4 and merge it with the low-level semantic features obtained in step 2. Step 6: perform bilinear interpolation upsampling on the merged feature map from step 5 again. Step 7: output the final prediction result. The method improves the accuracy of semantic segmentation while reducing the size of the network model and increasing the recognition speed, so as to meet the real-time requirements of mobile applications.

Description

Semantic segmentation method based on DeepLabv3+ network with multi-level channel attention mechanism
Technical Field
The invention relates to a semantic segmentation method based on a DeepLabv3+ network with a multi-level channel attention mechanism, and belongs to the technical field of image processing and segmentation.
Background
Semantic segmentation classifies an image at the pixel level, predicting the class to which each pixel belongs; it is one of the key problems in computer vision today. As convolutional neural networks (CNNs) and deep learning have shown excellent performance in the field of computer vision, more and more research builds semantic segmentation models on them.
However, in complex environments, factors such as occlusion, posture tilt, and unbalanced illumination greatly reduce the accuracy of object edge segmentation, which must be improved by adding an additional loss function and making reasonable use of context modeling. Attention mechanisms in computer vision let a model learn what to attend to, filtering out a large amount of irrelevant information through a top-down information selection mechanism; used appropriately in a semantic segmentation model, they can capture the connections between local and global features and between high-level and low-level semantic features more effectively. Meanwhile, CNN models with many parameters are large and slow to classify and segment, and as mobile applications become widespread, higher demands are placed on the size and running speed of network models.
The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a semantic segmentation method based on a DeepLabv3+ network with a multi-level channel attention mechanism, which reduces the size of the network model and improves the recognition speed while improving the accuracy of semantic segmentation, so as to meet the real-time requirements of mobile applications.
The technical scheme is as follows: the invention provides a semantic segmentation method based on a DeepLabv3+ network with a multi-level channel attention mechanism, comprising the following steps:
step 1: input an image;
step 2: acquire high-level and low-level semantic features of the input image through a deep convolutional neural network;
step 3: send the high-level semantic features to the atrous spatial pyramid pooling module to obtain a first feature map;
step 4: send the first feature map obtained in step 3 into the multi-level channel attention mechanism module to obtain a second feature map;
step 5: perform bilinear interpolation upsampling on the second feature map obtained in step 4 and merge it with the low-level semantic features obtained in step 2 to obtain a merged feature map;
step 6: perform bilinear interpolation upsampling on the merged feature map from step 5 again;
step 7: output the final prediction result.
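The seven steps above can be sketched as a decoder skeleton in PyTorch. This is an illustrative sketch only: the backbone, pyramid pooling, and attention modules are stood in by plain convolutions, and all names and channel counts here (DecoderSketch, low_ch, n_classes, and so on) are assumptions, not the patent's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderSketch(nn.Module):
    """Illustrative skeleton of steps 1-7; real backbone/ASPP/attention omitted."""
    def __init__(self, low_ch=256, high_ch=2048, mid_ch=256, n_classes=19):
        super().__init__()
        self.high_proj = nn.Conv2d(high_ch, mid_ch, 1)   # stand-in for steps 3-4
        self.low_proj = nn.Conv2d(low_ch, 48, 1)         # low-level path (step 2)
        self.fuse = nn.Conv2d(mid_ch + 48, mid_ch, 3, padding=1)  # after merge (step 5)
        self.classifier = nn.Conv2d(mid_ch, n_classes, 1)         # step 7

    def forward(self, low_feat, high_feat, out_size):
        x = self.high_proj(high_feat)                              # steps 3-4
        x = F.interpolate(x, size=low_feat.shape[-2:],
                          mode='bilinear', align_corners=False)    # step 5: upsample
        x = torch.cat([x, self.low_proj(low_feat)], dim=1)         # step 5: merge
        x = self.fuse(x)
        x = F.interpolate(x, size=out_size,
                          mode='bilinear', align_corners=False)    # step 6
        return self.classifier(x)                                  # step 7
```

Feeding in a 64×64 low-level map and a 16×16 high-level map with a 256×256 output size yields one score map per class at full resolution.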
Further, in step 2, the image is fed into a deep convolutional network with atrous (dilated) convolution to extract high-level and low-level semantic features; the convolution is:

y[i] = Σ_k x[i + r·k] · w[k]

where y[i] denotes the atrous convolution output at each position i, x[i] the input at position i, w[k] a convolution filter of length K indexed by k, and r the sampling rate of the input signal, here set to r = 2.
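The formula above can be checked with a minimal one-dimensional sketch (the helper name dilated_conv1d is hypothetical); only output positions where the whole dilated filter fits inside the input are computed:

```python
def dilated_conv1d(x, w, r):
    """y[i] = sum_k x[i + r*k] * w[k], computed at valid positions only."""
    n, K = len(x), len(w)
    out_len = n - r * (K - 1)  # positions where the dilated filter fits
    return [sum(x[i + r * k] * w[k] for k in range(K)) for i in range(out_len)]
```

With filter w = [1, 0, -1] and rate r = 2 each output is x[i] - x[i+4], so the effective receptive field widens without adding filter taps, which is the point of the dilation.
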
Further, in step 3, the extracted high-level semantic features are sent to the atrous spatial pyramid pooling module, where they are convolved and pooled by atrous convolution layers with different dilation rates and a pooling layer to obtain five feature maps, which are then concatenated into a five-layer input feature map F; the atrous convolution rates are 1, 6, 12 and 18.
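A minimal PyTorch sketch of the pyramid pooling module described above, following the common DeepLabv3+ ASPP layout; the 256 output channels per branch and the global image-pooling branch are assumptions, since the text does not fix them:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """Parallel atrous branches at rates 1, 6, 12, 18 plus an image-pooling branch."""
    def __init__(self, in_ch, out_ch=256, rates=(1, 6, 12, 18)):
        super().__init__()
        # rate 1 becomes a 1x1 conv; the others are 3x3 convs with dilation = padding = rate
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 1 if r == 1 else 3,
                       padding=0 if r == 1 else r, dilation=r) for r in rates])
        self.pool = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Conv2d(in_ch, out_ch, 1))

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [b(x) for b in self.branches]
        feats.append(F.interpolate(self.pool(x), size=(h, w),
                                   mode='bilinear', align_corners=False))
        return torch.cat(feats, dim=1)  # five maps concatenated along channels
```

Because padding equals the dilation rate for the 3×3 branches, every branch preserves the spatial size, so the five maps concatenate cleanly.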
Further, step 4 specifically comprises the following steps:
Step 4.1: convolve the first feature map with 1×1, 3×3 and 5×5 kernels respectively to obtain three branch feature maps F1, F2, F3;
Step 4.2: apply global max pooling and global average pooling over the width and height dimensions to each branch feature map F (H × W × C), obtaining two 1 × 1 × C feature maps per branch;
Step 4.3: feed the two 1 × 1 × C feature maps obtained in step 4.2 into a two-layer multilayer perceptron (MLP): the first layer has C/r neurons, where r is the reduction ratio, with ReLU activation; the second layer has C neurons; the MLP weights are shared between the two pooled features. The MLP outputs are added element-wise and passed through a sigmoid activation to produce the channel attention feature Mc:

Mc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W1(W0(F_avg)) + W1(W0(F_max)))

where σ denotes the sigmoid activation, and W0 ∈ R^(C/r × C) and W1 ∈ R^(C × C/r) are the MLP weights.
Step 4.4: fuse each branch's channel attention feature with the branch feature map to obtain the recalibrated feature maps:

Fi' = Fi × Mci,  i = 1, …, k

where i indexes the branches, so k = 3.
Step 4.5: add the recalibrated feature maps to obtain the final recalibrated feature map:

F' = Σ_{i=1…m} Fi'

where i indexes the branches, so m = 3.
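Steps 4.1 to 4.5 can be sketched as a single PyTorch module. The branch kernels, the shared two-layer MLP with reduction ratio r, and the sigmoid follow the text; whether the MLP is also shared across the three branches is not stated, so sharing it here is an assumption, and the channel count and reduction ratio are illustrative:

```python
import torch
import torch.nn as nn

class MultiLevelChannelAttention(nn.Module):
    """Three conv branches (1x1, 3x3, 5x5), CBAM-style channel attention per
    branch, recalibrated maps summed (steps 4.1-4.5)."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(ch, ch, k, padding=k // 2) for k in (1, 3, 5)])
        # shared 2-layer MLP: C -> C/r (ReLU) -> C  (step 4.3)
        self.mlp = nn.Sequential(
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(),
            nn.Conv2d(ch // reduction, ch, 1))

    def channel_attention(self, f):
        avg = self.mlp(f.mean(dim=(2, 3), keepdim=True))  # global average pool
        mx = self.mlp(f.amax(dim=(2, 3), keepdim=True))   # global max pool
        return torch.sigmoid(avg + mx)                    # Mc, shape (N, C, 1, 1)

    def forward(self, x):
        out = 0
        for branch in self.branches:
            f = branch(x)                              # step 4.1: branch feature Fi
            out = out + f * self.channel_attention(f)  # steps 4.2-4.4: Fi' = Fi x Mci
        return out                                     # step 4.5: sum of Fi'
```

Each branch keeps the spatial size (padding = k // 2), so the three recalibrated maps can be summed directly.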
Further, in step 5, the bilinear interpolation upsampling is given by:

f(x, y) = f(Q11)·ω11 + f(Q21)·ω21 + f(Q12)·ω12 + f(Q22)·ω22

where f(·) denotes the linear interpolation between the selected points, Qij are the selected points, and ωij is the weight of f(Qij).
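The interpolation formula can be verified with a small pure-Python sketch over the unit square, taking Q11 = (0,0), Q21 = (1,0), Q12 = (0,1), Q22 = (1,1) and the standard bilinear weights (an assumption consistent with the formula above):

```python
def bilinear(x, y, q11, q21, q12, q22):
    """Interpolate at (x, y) in the unit square with corner values
    f(Q11)=q11 at (0,0), f(Q21)=q21 at (1,0),
    f(Q12)=q12 at (0,1), f(Q22)=q22 at (1,1)."""
    w11 = (1 - x) * (1 - y)
    w21 = x * (1 - y)
    w12 = (1 - x) * y
    w22 = x * y
    return q11 * w11 + q21 * w21 + q12 * w12 + q22 * w22
```

At the corners the weights collapse to the corner value, and at the center (0.5, 0.5) the result is the mean of the four corners, which matches interpolating first along X and then along Y.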
Further, in step 6, the merged feature map in step 5 is convolved by 3 × 3 and then upsampled by bilinear interpolation again.
Further, in step 7, the final prediction result is obtained through the loss function:

fl(p_t) = -α_t · (1 - p_t)^γ · log(p_t)

where α_t is the class weight and (1 - p_t)^γ is the modulating factor.
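This is the focal loss; a scalar sketch follows (the defaults α_t = 0.25 and γ = 2 are common choices, not values fixed by the text):

```python
import math

def focal_loss(p_t, alpha_t=0.25, gamma=2.0):
    """fl(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t).

    p_t is the predicted probability of the true class; the modulating
    factor (1 - p_t)**gamma down-weights easy, well-classified examples.
    """
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)
```

A confident correct prediction (p_t near 1) contributes almost nothing, while a poorly classified pixel (p_t = 0.5) keeps a sizeable loss, which is why the focal loss helps with hard or rare classes.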
Compared with the prior art, the invention has the following beneficial effects:
the invention discloses a semantic segmentation method based on a multilevel channel attention mechanism deplab 3+ network, which comprises the steps of obtaining high-level and low-level semantic features of an input image through a deep convolutional neural network, sending the high-level semantic features to a hollow pyramid pooling module to obtain a first feature map, sending the first feature map to a multilevel channel attention mechanism module, performing bilinear interpolation up-sampling, merging with the low-level semantic features to obtain a merged feature map, performing bilinear interpolation up-sampling again to obtain a more accurate feature map, reducing the size of a network model while improving the accuracy of table semantic segmentation, improving the identification speed and meeting the real-time requirement of mobile application.
Since target regions are complex and variable, and in order to select information at different spatial scales across channels, the feature map is convolved with three kernel sizes to form three branches. Each branch undergoes global max pooling and global average pooling and then passes through a shared network layer to obtain the channel weights, which recalibrate the channel features of the original branch feature maps; finally, the feature maps processed by the three branches are added to obtain a more accurate feature map. This enhances the features of complex segmentation target regions, reduces the loss of important feature information during training, and makes the segmentation results more accurate.
Drawings
FIG. 1 is a flow chart of a method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of the deep convolution and atrous spatial pyramid pooling module according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a multi-level channel attention mechanism module in accordance with an embodiment of the present invention;
FIG. 4 is a diagram illustrating bilinear interpolation in accordance with an embodiment of the present invention;
FIG. 5 is a diagram illustrating loss training in an embodiment of the present invention;
FIG. 6 is a diagram illustrating pixel accuracy training in an embodiment of the present invention;
FIG. 7 is a diagram illustrating a final predicted result under a general data set according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
As shown in fig. 1, the semantic segmentation method based on the DeepLabv3+ network with the multi-level channel attention mechanism according to this embodiment of the present invention comprises the following steps:
step 1: input an image;
step 2: acquire high-level and low-level semantic features of the input image through a deep convolutional neural network;
a feature map of the input image is acquired; to test the method in a complex environment, the Cityscapes dataset is adopted, whose road scenes contain a large amount of occlusion, posture tilt, uneven illumination and similar conditions, and the final prediction of the method remains excellent on this dataset;
the image is fed into a deep convolutional network with atrous convolution to extract high-level and low-level semantic features; the convolution process is:

y[i] = Σ_k x[i + r·k] · w[k]

where y[i] denotes the atrous convolution output at each position i, x[i] the input at position i, w[k] a convolution filter of length K indexed by k, and r the sampling rate of the input signal, here set to r = 2.
As shown in fig. 2, step 3: the extracted high-level semantic features are sent into the atrous spatial pyramid pooling module, where they are convolved and pooled by atrous convolution layers with different dilation rates and a pooling layer to obtain five feature maps, which are then concatenated into a five-layer input feature map F; the atrous convolution rates are 1, 6, 12 and 18. FIG. 2 is a schematic diagram of the deep convolution and atrous spatial pyramid pooling module in this embodiment of the invention;
as shown in fig. 3, step 4: sending the first characteristic diagram obtained in the step 3 into a multi-level channel attention mechanism module to obtain a second characteristic diagram, and specifically comprising the following steps:
step 4.1: respectively carrying out convolution with convolution kernels of 1x1,3x3 and 5x5 on the first characteristic diagram to obtain a 3-branch characteristic diagram F1、F2、F3
Step 4.2: respectively performing global maximum pooling and global average pooling on the basis of width and height on the feature map F (H multiplied by W multiplied by C) of each branch to respectively obtain two feature maps of 1 multiplied by C;
step 4.3: sending the two 1 × 1 × C feature graphs obtained in the step 4.2 into a multilayer perceptron with 2 layers, wherein the number of first-layer neurons is C/r, r is a reduction rate, an activation function Relu, the number of second-layer neurons is C, the neural networks of the first-layer neurons and the second-layer neurons are mutually shared, then carrying out addition operation based on element-wise on features output by the multilayer perceptron, and finally generating a channel attention feature M through sigmoid activationcThe process is represented as follows:
Figure BDA0003105880060000061
in the formula: σ denotes sigmoid activation, W0∈RC/r×C,W1∈RC×C/r,W0And W1Is MLPThe weight of (c).
Step 4.4: fuse each branch's channel attention feature with the branch feature map to obtain the recalibrated feature maps:

Fi' = Fi × Mci,  i = 1, …, k

where i indexes the branches, so k = 3.
Step 4.5: add the recalibrated feature maps to obtain the final recalibrated feature map:

F' = Σ_{i=1…m} Fi'

where i indexes the branches, so m = 3.
Fig. 3 is a schematic diagram of the multi-level channel attention mechanism module according to this embodiment of the invention, which recalibrates the high-level semantic features to strengthen the connections between pixels and between local and global features, thereby improving segmentation accuracy.
As shown in fig. 4, step 5: perform bilinear interpolation upsampling on the second feature map obtained in step 4 and merge it with the low-level semantic features obtained in step 2 to obtain a merged feature map;
in step 5, the bilinear interpolation upsampling is given by:

f(x, y) = f(Q11)·ω11 + f(Q21)·ω21 + f(Q12)·ω12 + f(Q22)·ω22

where f(·) denotes the linear interpolation between the selected points, Qij are the selected points, and ωij is the weight of f(Qij). FIG. 4 illustrates the bilinear interpolation method in this embodiment: interpolation is performed in the X direction first, and the results are then interpolated in the Y direction;
step 6: perform bilinear interpolation upsampling on the merged feature map from step 5 again; the merged feature map is convolved by 3×3 and then upsampled by bilinear interpolation.
Step 7: the final prediction result is obtained through the loss function:

fl(p_t) = -α_t · (1 - p_t)^γ · log(p_t)

where α_t is the class weight and (1 - p_t)^γ is the modulating factor.
Fig. 5 and 6 show the loss training and pixel accuracy training curves of the experiment, respectively; measured in practice, the final overall accuracy of the invention reaches up to 93.73%. Fig. 7 shows the final prediction results of the experiment on a general dataset; measured in practice, the final mean IoU reaches 71.89%.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims (7)

1. A semantic segmentation method based on a DeepLabv3+ network with a multi-level channel attention mechanism, characterized by comprising the following steps:
step 1: input an image;
step 2: acquire high-level and low-level semantic features of the input image through a deep convolutional neural network;
step 3: send the high-level semantic features to the atrous spatial pyramid pooling module to obtain a first feature map;
step 4: send the first feature map obtained in step 3 into the multi-level channel attention mechanism module to obtain a second feature map;
step 5: perform bilinear interpolation upsampling on the second feature map obtained in step 4 and merge it with the low-level semantic features obtained in step 2 to obtain a merged feature map;
step 6: perform bilinear interpolation upsampling on the merged feature map from step 5 again;
step 7: output the final prediction result.
2. The semantic segmentation method based on the DeepLabv3+ network with the multi-level channel attention mechanism as claimed in claim 1, wherein in step 2 the image is fed into a deep convolutional network with atrous convolution to extract high-level and low-level semantic features, the convolution being:

y[i] = Σ_k x[i + r·k] · w[k]

where y[i] denotes the atrous convolution output at each position i, x[i] the input at position i, w[k] a convolution filter of length K indexed by k, and r the sampling rate of the input signal, here set to r = 2.
3. The semantic segmentation method based on the DeepLabv3+ network with the multi-level channel attention mechanism as claimed in claim 1, wherein in step 3 the extracted high-level semantic features are sent to the atrous spatial pyramid pooling module, are convolved and pooled by atrous convolution layers with different dilation rates and a pooling layer to obtain five feature maps, and are then concatenated into a five-layer input feature map F, the atrous convolution rates being 1, 6, 12 and 18.
4. The semantic segmentation method based on the DeepLabv3+ network with the multi-level channel attention mechanism as claimed in claim 1, wherein step 4 specifically comprises the following steps:
Step 4.1: convolve the first feature map with 1×1, 3×3 and 5×5 kernels respectively to obtain three branch feature maps F1, F2, F3;
Step 4.2: apply global max pooling and global average pooling over the width and height dimensions to each branch feature map F (H × W × C), obtaining two 1 × 1 × C feature maps per branch;
Step 4.3: feed the two 1 × 1 × C feature maps obtained in step 4.2 into a two-layer multilayer perceptron (MLP): the first layer has C/r neurons, where r is the reduction ratio, with ReLU activation; the second layer has C neurons; the MLP weights are shared between the two pooled features. The MLP outputs are added element-wise and passed through a sigmoid activation to produce the channel attention feature Mc:

Mc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W1(W0(F_avg)) + W1(W0(F_max)))

where σ denotes the sigmoid activation, and W0 ∈ R^(C/r × C) and W1 ∈ R^(C × C/r) are the MLP weights.
Step 4.4: fuse each branch's channel attention feature with the branch feature map to obtain the recalibrated feature maps:

Fi' = Fi × Mci,  i = 1, …, k

where i indexes the branches, so k = 3.
Step 4.5: add the recalibrated feature maps to obtain the final recalibrated feature map:

F' = Σ_{i=1…m} Fi'

where i indexes the branches, so m = 3.
5. The semantic segmentation method based on the DeepLabv3+ network with the multi-level channel attention mechanism as claimed in claim 1, wherein in step 5 the bilinear interpolation upsampling is given by:

f(x, y) = f(Q11)·ω11 + f(Q21)·ω21 + f(Q12)·ω12 + f(Q22)·ω22

where f(·) denotes the linear interpolation between the selected points, Qij are the selected points, and ωij is the weight of f(Qij).
6. The semantic segmentation method based on the DeepLabv3+ network with the multi-level channel attention mechanism as claimed in claim 1, wherein in step 6 the merged feature map from step 5 is convolved by 3×3 and then upsampled by bilinear interpolation again.
7. The semantic segmentation method based on the DeepLabv3+ network with the multi-level channel attention mechanism as claimed in claim 1, wherein in step 7 the final prediction result is obtained through the loss function:

fl(p_t) = -α_t · (1 - p_t)^γ · log(p_t)

where α_t is the class weight and (1 - p_t)^γ is the modulating factor.
CN202110637809.5A 2021-06-08 2021-06-08 Semantic segmentation method based on DeepLabv3+ network with multi-level channel attention mechanism Active CN113421268B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110637809.5A CN113421268B (en) 2021-06-08 2021-06-08 Semantic segmentation method based on DeepLabv3+ network with multi-level channel attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110637809.5A CN113421268B (en) 2021-06-08 2021-06-08 Semantic segmentation method based on DeepLabv3+ network with multi-level channel attention mechanism

Publications (2)

Publication Number Publication Date
CN113421268A true CN113421268A (en) 2021-09-21
CN113421268B CN113421268B (en) 2022-09-16

Family

ID=77787974

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110637809.5A Active CN113421268B (en) 2021-06-08 2021-06-08 Semantic segmentation method based on DeepLabv3+ network with multi-level channel attention mechanism

Country Status (1)

Country Link
CN (1) CN113421268B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114140469A (en) * 2021-12-02 2022-03-04 北京交通大学 Depth hierarchical image semantic segmentation method based on multilayer attention
CN114913436A (en) * 2022-06-15 2022-08-16 中科弘云科技(北京)有限公司 Ground object classification method and device based on multi-scale attention mechanism, electronic equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112287940A (en) * 2020-10-30 2021-01-29 西安工程大学 Semantic segmentation method of attention mechanism based on deep learning
CN112541503A (en) * 2020-12-11 2021-03-23 南京邮电大学 Real-time semantic segmentation method based on context attention mechanism and information fusion


Also Published As

Publication number Publication date
CN113421268B (en) 2022-09-16

Similar Documents

Publication Publication Date Title
CN108509978B (en) Multi-class target detection method and model based on CNN (CNN) multi-level feature fusion
WO2020253416A1 (en) Object detection method and device, and computer storage medium
CN111291809B (en) Processing device, method and storage medium
CN113033570B (en) Image semantic segmentation method for improving void convolution and multilevel characteristic information fusion
CN113421268B (en) Semantic segmentation method based on DeepLabv3+ network with multi-level channel attention mechanism
CN112288011B (en) Image matching method based on self-attention deep neural network
CN112069868A (en) Unmanned aerial vehicle real-time vehicle detection method based on convolutional neural network
CN111898439B (en) Deep learning-based traffic scene joint target detection and semantic segmentation method
CN109886066A (en) Fast target detection method based on the fusion of multiple dimensioned and multilayer feature
CN110223304B (en) Image segmentation method and device based on multipath aggregation and computer-readable storage medium
CN111723829B (en) Full-convolution target detection method based on attention mask fusion
CN110222717A (en) Image processing method and device
WO2022111617A1 (en) Model training method and apparatus
CN110956119B (en) Method for detecting target in image
CN115082293A (en) Image registration method based on Swin transducer and CNN double-branch coupling
CN111768415A (en) Image instance segmentation method without quantization pooling
CN114048822A (en) Attention mechanism feature fusion segmentation method for image
CN110633640A (en) Method for identifying complex scene by optimizing PointNet
CN112329801A (en) Convolutional neural network non-local information construction method
CN114943893A (en) Feature enhancement network for land coverage classification
CN115482518A (en) Extensible multitask visual perception method for traffic scene
CN116563682A (en) Attention scheme and strip convolution semantic line detection method based on depth Hough network
US20220215617A1 (en) Viewpoint image processing method and related device
CN114170231A (en) Image semantic segmentation method and device based on convolutional neural network and electronic equipment
CN113205102B (en) Vehicle mark identification method based on memristor neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant