CN110136141B - Image semantic segmentation method and device oriented to complex environment

Image semantic segmentation method and device oriented to complex environment

Info

Publication number
CN110136141B
CN110136141B (application CN201910333809.9A)
Authority
CN
China
Prior art keywords
convolution
network
features
image
semantic segmentation
Prior art date
Legal status
Active
Application number
CN201910333809.9A
Other languages
Chinese (zh)
Other versions
CN110136141A (en)
Inventor
吴俊君
王嫣然
陈世浪
Current Assignee
Foshan University
Original Assignee
Foshan University
Priority date
Filing date
Publication date
Application filed by Foshan University filed Critical Foshan University
Priority to CN201910333809.9A priority Critical patent/CN110136141B/en
Publication of CN110136141A publication Critical patent/CN110136141A/en
Application granted granted Critical
Publication of CN110136141B publication Critical patent/CN110136141B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of computer vision and particularly relates to an image semantic segmentation method and device oriented to complex environments. A fine-tuned VGG16 convolutional neural network is first used to generate a base network, and preliminary features of a training image are extracted through it. A hidden layer convolution feature module is connected to the convolution layers of the VGG16 network to generate high-level semantic features, while the preliminary features are fed through pyramid-structured dilated (atrous) convolutions to obtain fine-grained low-level features. The high-level semantic features and the fine-grained low-level features are then fused into a high-resolution feature map. Network training parameters are set and the network is trained by back propagation with the cross-entropy loss function as the objective, establishing a semantic segmentation network. Finally, a test image is input into the semantic segmentation network to generate its semantic segmentation result. The invention overcomes the blurred segmentation boundaries that existing methods produce in complex environments, generates a high-resolution prediction image, and improves the performance of image semantic segmentation in complex environments.

Description

Image semantic segmentation method and device oriented to complex environment
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to an image semantic segmentation method and device for a complex environment.
Background
Image semantic segmentation classifies an image according to the semantic content expressed by each pixel. As a fundamental technique of scene understanding, it locates and identifies objects at the pixel level and is crucial for unmanned systems such as intelligent driving and robot cognition with autonomous navigation, unmanned aerial vehicle landing systems, and intelligent security monitoring; it directly determines how accurately these systems understand a scene.
Because traditional semantic segmentation methods understand scenes poorly and work inefficiently when an unmanned system faces an unstructured complex environment, semantic segmentation for complex environments has become a research hot spot in recent years and has produced a series of notable results. In particular, the advent of convolutional neural networks brought substantial progress to image semantic segmentation, with accuracy improved from different angles such as model structure, loss function, and efficiency. However, the accuracy of existing methods is still challenged by the unstructured nature, diverse targets, irregular shapes, and object occlusion found in complex real environments.
Disclosure of Invention
The invention aims to provide an image semantic segmentation method and device oriented to complex environments, which overcome the blurred segmentation boundaries that existing methods produce in complex environments and improve the performance of image semantic segmentation in such environments.
In order to achieve the above object, the present invention provides the following solutions:
a complex environment-oriented image semantic segmentation method comprises the following steps:
step S100, modifying a VGG16 convolutional neural network to generate a base network and extracting preliminary features of a training image through the base network, wherein the convolution layers of the VGG16 convolutional neural network are divided into 5 stages;
step S200, processing the preliminary features obtained by the convolution layers of the first 4 stages of the base network with a hidden layer convolution feature module to generate high-level semantic features;
step S300, processing the preliminary features obtained by the last convolution layer of the base network through pyramid-structured dilated convolution to obtain fine-grained low-level features;
step S400, fusing the high-level semantic features and the fine-grained low-level features to generate a high-resolution feature map;
step S500, setting network training parameters and training the network by back propagation with the cross-entropy loss function as the objective, thereby establishing a semantic segmentation network;
and step S600, inputting a test image into the semantic segmentation network to generate its semantic segmentation result.
Further, in step S100, modifying the VGG16 convolutional neural network to generate the base network specifically comprises:
discarding all fully connected layers and the last pooling layer of the original VGG16 convolutional neural network to construct an end-to-end fully convolutional network;
and performing convolution, pooling, batch normalization, and ReLU operations through the fully convolutional network to obtain the feature map of each convolution layer of the base network, thereby extracting the preliminary features of the image.
Further, step S200 is specifically implemented as follows:
step S210, inputting the feature map into a 1×1 convolution and a 3×3 convolution respectively to obtain convolution features at each scale;
step S220, fusing the convolution features of all scales and applying a ReLU operation to obtain a first result;
step S230, inputting the first result into a 1×1 convolution and adjusting the number of output feature channels to the number of categories, thereby generating the high-level semantic features.
Further, step S300 is specifically implemented as follows:
step S310, inputting the feature map into two groups of dilated convolutions respectively, applying batch normalization and ReLU operations to each, then feeding each into a 1×1 convolution that adjusts the number of output feature channels to the number of categories, generating a first feature map and a second feature map;
step S320, applying convolution, batch normalization, and ReLU operations to the first and second feature maps so as to form a pyramid structure;
and step S330, fusing the first and second feature maps of the pyramid structure to generate the fine-grained low-level features.
Further, in step S400, the high-level semantic features and the fine-grained low-level features are specifically fused by element-wise addition through an eltwise layer to generate the high-resolution feature map.
Further, in step S500, the network training parameters are specifically set as follows:
with the poly learning strategy, the initial learning rate is set to 0.001 and the power to 0.9; the convolution kernel weights are initialized from a Gaussian distribution with mean 0 and standard deviation 0.01; the biases are initialized to 0; the weight decay is set to 0.0005 and the momentum to 0.9.
An image semantic segmentation device oriented to complex environments, the device comprising:
an extraction unit for modifying a VGG16 convolutional neural network to generate a base network and extracting the preliminary features of a training image through the base network, the convolution layers of the VGG16 convolutional neural network being divided into 5 stages;
a high-level semantic feature unit for processing the preliminary features obtained by the convolution layers of the first 4 stages of the base network with a hidden layer convolution feature module to generate high-level semantic features;
a fine-grained low-level feature unit for processing the preliminary features obtained by the last convolution layer of the base network through pyramid-structured dilated convolution to obtain fine-grained low-level features;
a high-resolution feature map unit for fusing the high-level semantic features and the fine-grained low-level features to generate a high-resolution feature map;
a semantic segmentation network unit for setting network training parameters and training the network by back propagation with the cross-entropy loss function as the objective, thereby establishing a semantic segmentation network;
and a semantic segmentation result unit for inputting a test image into the semantic segmentation network and generating its semantic segmentation result.
The beneficial effects of the invention are as follows. The invention discloses an image semantic segmentation method and device oriented to complex environments. A VGG16 convolutional neural network is first modified to generate a base network, and preliminary features of a training image are extracted through it. A hidden layer convolution feature module then processes the preliminary features from the first 4 stages of VGG16 convolution layers to generate high-level semantic features, while the preliminary features from the last VGG16 convolution layer are processed by pyramid-structured dilated convolution to obtain fine-grained low-level features. The high-level semantic features and the fine-grained low-level features are fused into a high-resolution feature map. Network training parameters are set and the network is trained by back propagation with the cross-entropy loss function as the objective, establishing a semantic segmentation network. Finally, a test image is input into the semantic segmentation network to generate its semantic segmentation result. The invention overcomes the blurred segmentation boundaries that existing methods produce in complex environments, generates a high-resolution prediction image, and improves the performance of image semantic segmentation in complex environments.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings needed in the embodiments are briefly described below. The drawings in the following description are obviously only some embodiments of the present invention; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of an image semantic segmentation method oriented to complex environments according to an embodiment of the invention;
FIG. 2 is a schematic structural diagram of an image semantic segmentation device oriented to complex environments according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of the invention.
As shown in FIG. 1, the image semantic segmentation method oriented to complex environments provided by an embodiment of the invention comprises the following steps:
step S100, modifying a VGG16 convolutional neural network to generate a base network and extracting preliminary features of a training image through the base network, wherein the convolution layers of the VGG16 convolutional neural network are divided into 5 stages;
step S200, processing the preliminary features obtained by the convolution layers of the first 4 stages of the base network with a hidden layer convolution feature module to generate high-level semantic features;
step S300, processing the preliminary features obtained by the last convolution layer of the base network through pyramid-structured dilated convolution to obtain fine-grained low-level features;
step S400, fusing the high-level semantic features and the fine-grained low-level features to generate a high-resolution feature map;
step S500, setting network training parameters and training the network by back propagation with the cross-entropy loss function as the objective, thereby establishing a semantic segmentation network;
and step S600, inputting a test image into the semantic segmentation network to generate its semantic segmentation result.
As a preference of this embodiment, in step S100 the VGG16 convolutional neural network is specifically a neural network with learning capability formed by connecting 13 convolution layers and 3 fully connected layers in series, the 13 convolution layers being divided into 5 stages: the first stage comprises two 3×3 convolution layers with an output dimension of 64; the second stage comprises two 3×3 convolution layers with an output dimension of 128; the third stage comprises two 3×3 convolution layers and one 1×1 convolution layer with an output dimension of 256; the fourth stage comprises two 3×3 convolution layers and one 1×1 convolution layer with an output dimension of 512; the fifth stage comprises two 3×3 convolution layers and one 1×1 convolution layer with an output dimension of 512; each stage is followed by a max pooling layer.
In step S100, modifying the VGG16 convolutional neural network to generate the base network specifically comprises:
discarding all fully connected layers and the last pooling layer of the original VGG16 convolutional neural network to construct an end-to-end fully convolutional network;
and performing convolution, pooling, batch normalization, and ReLU operations through the fully convolutional network to obtain the feature map of each convolution layer of the base network, thereby extracting the preliminary features of the image.
In one or more embodiments, the ReLU operation is given by the formula
f(x) = max(0, x), where x is the input and f(x) is the output.
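For illustration only, since the patent discloses no source code, a minimal PyTorch sketch of such a base network follows; the torchvision vgg16_bn backbone, the untrained weights, and the per-stage tap points are assumptions of this sketch rather than features disclosed by the patent.

```python
import torch.nn as nn
from torchvision.models import vgg16_bn

class BaseNetwork(nn.Module):
    """Minimal sketch of the modified base network: VGG16's fully
    connected layers are never used and its last pooling layer is
    dropped, leaving an end-to-end fully convolutional extractor
    (convolution, pooling, batch normalization, ReLU)."""

    def __init__(self):
        super().__init__()
        # vgg16_bn.features holds the convolution layers (with BN and
        # ReLU) and 5 max-pooling layers; [:-1] discards the 5th max-pool.
        self.layers = nn.ModuleList(
            list(vgg16_bn(weights=None).features.children())[:-1]
        )

    def forward(self, x):
        stage_outputs = []               # preliminary features per stage
        for layer in self.layers:
            x = layer(x)
            if isinstance(layer, nn.MaxPool2d):
                stage_outputs.append(x)  # ends of stages 1-4
        stage_outputs.append(x)          # stage-5 output (no final pool)
        return stage_outputs
```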
In one embodiment, step S200 is specifically implemented as follows:
step S210, inputting the feature map into a 1×1 convolution and a 3×3 convolution respectively to obtain convolution features at each scale;
step S220, fusing the convolution features of all scales and applying a ReLU operation to obtain a first result;
step S230, inputting the first result into a 1×1 convolution and adjusting the number of output feature channels to the number of categories, thereby generating the high-level semantic features (see the sketch below).
In one embodiment, step S300 is specifically implemented as follows:
step S310, inputting the feature map into two groups of dilated convolutions respectively, applying batch normalization and ReLU operations to each, then feeding each into a 1×1 convolution that adjusts the number of output feature channels to the number of categories, generating a first feature map and a second feature map;
in a preferred embodiment, the feature map is input into a 3×3 dilated convolution with dilation rate 6, followed by batch normalization and a ReLU operation, and then into a 1×1 convolution that adjusts the number of output feature channels to the number of categories, generating the first feature map;
the feature map is likewise input into a 3×3 dilated convolution with dilation rate 12, followed by batch normalization and a ReLU operation, and then into a 1×1 convolution that adjusts the number of output feature channels to the number of categories, generating the second feature map;
step S320, applying convolution, batch normalization, and ReLU operations to the first and second feature maps so as to form a pyramid structure;
and step S330, fusing the first and second feature maps of the pyramid structure to generate the fine-grained low-level features, as sketched below.
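The two-branch pyramid can be sketched as follows; the additive fusion in S330 and the intermediate channel width are assumptions of this illustration.

```python
import torch.nn as nn

def dilated_branch(in_channels: int, num_classes: int, dilation: int):
    """One pyramid branch: 3x3 dilated convolution -> batch
    normalization -> ReLU -> 1x1 convolution to class channels."""
    return nn.Sequential(
        nn.Conv2d(in_channels, in_channels, kernel_size=3,
                  padding=dilation, dilation=dilation),
        nn.BatchNorm2d(in_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_channels, num_classes, kernel_size=1),
    )

class PyramidDilatedModule(nn.Module):
    """Sketch of steps S310-S330: two dilated-convolution branches
    with dilation rates 6 and 12 whose outputs are fused."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.branch_d6 = dilated_branch(in_channels, num_classes, dilation=6)
        self.branch_d12 = dilated_branch(in_channels, num_classes, dilation=12)

    def forward(self, x):
        return self.branch_d6(x) + self.branch_d12(x)  # S330 fusion
```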
In one embodiment, the high-level semantic features and the fine-grained low-level features of step S400 are fused by element-wise addition through an eltwise layer to generate the high-resolution feature map.
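Because both inputs already carry num_classes channels, the eltwise SUM fusion reduces to a tensor addition; in the sketch below the spatial alignment by bilinear upsampling is an assumption, as the patent does not describe a resizing step.

```python
import torch.nn.functional as F

def eltwise_fuse(high_level, fine_grained):
    """Element-wise (eltwise SUM) fusion of the two class-channel
    feature maps; the fine-grained map is resized to the high-level
    resolution (the resizing is an assumption of this sketch)."""
    fine_grained = F.interpolate(fine_grained, size=high_level.shape[2:],
                                 mode='bilinear', align_corners=False)
    return high_level + fine_grained
```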
In one embodiment, in step S500 the network training parameters are specifically set as follows:
with the poly learning strategy, the initial learning rate is set to 0.001 and the power to 0.9; the convolution kernel weights are initialized from a Gaussian distribution with mean 0 and standard deviation 0.01; the biases are initialized to 0; the weight decay is set to 0.0005 and the momentum to 0.9.
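The poly policy scales the learning rate as base_lr * (1 - iter/max_iter)^power. A sketch of the stated settings follows, reading the original "weight attenuation" as SGD weight decay and "attenuation momentum" as SGD momentum; the training length max_iter and the optimizer choice are assumptions.

```python
import torch

def poly_lr(base_lr: float, cur_iter: int, max_iter: int,
            power: float = 0.9) -> float:
    """Poly learning-rate policy: base_lr * (1 - iter/max_iter)**power."""
    return base_lr * (1.0 - cur_iter / max_iter) ** power

def init_weights(module: torch.nn.Module) -> None:
    """Gaussian(mean 0, std 0.01) convolution weights, zero biases."""
    if isinstance(module, torch.nn.Conv2d):
        torch.nn.init.normal_(module.weight, mean=0.0, std=0.01)
        if module.bias is not None:
            torch.nn.init.zeros_(module.bias)

# Hypothetical usage on an assembled model:
# model.apply(init_weights)
# optimizer = torch.optim.SGD(model.parameters(), lr=0.001,
#                             momentum=0.9, weight_decay=0.0005)
```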
To measure the prediction performance of the network in this embodiment and verify the accuracy of the semantic segmentation results, the method was run in the following experimental environment: a Dell Precision Tower T7920 workstation configured with CPU: Intel Xeon Silver 4114 (10 cores, 20 threads, base frequency 2.2 GHz); memory: 64 GB; operating system: Ubuntu 16.04 LTS (64-bit); GPU: NVIDIA GeForce GTX 1080 Ti with 11 GB of video memory.
Verification was carried out with the following steps:
step S610, dividing the pictures of the SUN RGB-D dataset into training, validation, and test pictures;
step S620, preprocessing the divided training pictures, specifically by mirroring and randomly cropping them (a sketch follows this list);
step S630, training the network with the training and validation pictures, testing it with the test pictures, and measuring its prediction performance with the pixel accuracy, mean pixel accuracy, and mean intersection-over-union (mIoU) metrics.
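A sketch of step S620 follows; the crop size and the joint image/label handling via torchvision are assumptions of this illustration.

```python
import random
import torchvision.transforms as T
import torchvision.transforms.functional as TF

def preprocess(image, label, crop_size=(480, 640)):
    """Step S620 sketch: random horizontal mirroring and random
    cropping applied jointly to a training picture and its label map.
    The crop size is an assumption; the patent does not state one."""
    if random.random() < 0.5:
        image, label = TF.hflip(image), TF.hflip(label)
    i, j, h, w = T.RandomCrop.get_params(image, output_size=crop_size)
    return TF.crop(image, i, j, h, w), TF.crop(label, i, j, h, w)
```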
The pixel accuracy is the proportion of correctly classified pixels among all pixels.
The mean pixel accuracy refines the pixel accuracy: the proportion of correctly classified pixels is first computed for each class, and these proportions are then averaged over all classes.
The mean intersection-over-union is the ratio of the intersection to the union of two sets; in semantic segmentation it is computed between the ground-truth segmentation and the predicted segmentation, i.e., the number of true positives divided by the total number of true positives, false negatives, and false positives, averaged over classes.
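These three metrics can be computed from a confusion matrix; the sketch below is a standard formulation, not code from the patent.

```python
import numpy as np

def segmentation_metrics(conf: np.ndarray):
    """Metrics from a (C x C) confusion matrix whose entry [i, j]
    counts pixels of ground-truth class i predicted as class j.
    Classes absent from the test data would need a zero-division guard."""
    tp = np.diag(conf).astype(float)
    pixel_acc = tp.sum() / conf.sum()                 # correct / total pixels
    mean_pixel_acc = (tp / conf.sum(axis=1)).mean()   # per-class acc, averaged
    union = conf.sum(axis=1) + conf.sum(axis=0) - tp  # TP + FN + FP
    mean_iou = (tp / union).mean()                    # mIoU
    return pixel_acc, mean_pixel_acc, mean_iou
```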
Experimental tests show that the method of this embodiment can generate a high-resolution prediction image and achieves a certain improvement over some of the better-performing segmentation methods on the SUN RGB-D dataset.
Referring to FIG. 2, an embodiment of the present invention further provides an image semantic segmentation device oriented to complex environments, the device comprising:
an extraction unit 100 for modifying a VGG16 convolutional neural network to generate a base network and extracting preliminary features of a training image through the base network, the convolution layers of the VGG16 convolutional neural network being divided into 5 stages;
a high-level semantic feature unit 200 for processing the preliminary features obtained by the convolution layers of the first 4 stages of the base network with a hidden layer convolution feature module to generate high-level semantic features;
a fine-grained low-level feature unit 300 for processing the preliminary features obtained by the last convolution layer of the base network through pyramid-structured dilated convolution to obtain fine-grained low-level features;
a high-resolution feature map unit 400 for fusing the high-level semantic features and the fine-grained low-level features to generate a high-resolution feature map;
a semantic segmentation network unit 500 for setting network training parameters and training the network by back propagation with the cross-entropy loss function as the objective, thereby establishing a semantic segmentation network;
and a semantic segmentation result unit 600 for inputting a test image into the semantic segmentation network and generating its semantic segmentation result.
The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to assist in understanding the method of the present invention and its core ideas. Modifications made by those of ordinary skill in the art in light of these teachings remain within the scope of the present invention. In view of the foregoing, this description should not be construed as limiting the invention.

Claims (5)

1. An image semantic segmentation method oriented to complex environments, characterized by comprising the following steps:
step S100, modifying a VGG16 convolutional neural network to generate a base network and extracting preliminary features of a training image through the base network, wherein the convolution layers of the VGG16 convolutional neural network are divided into 5 stages;
step S200, processing the preliminary features obtained by the convolution layers of the first 4 stages of the base network with a hidden layer convolution feature module to generate high-level semantic features;
step S300, processing the preliminary features obtained by the last convolution layer of the base network through pyramid-structured dilated convolution to obtain fine-grained low-level features;
step S400, fusing the high-level semantic features and the fine-grained low-level features to generate a high-resolution feature map;
step S500, setting network training parameters and training the network by back propagation with the cross-entropy loss function as the objective, thereby establishing a semantic segmentation network;
step S600, inputting a test image into the semantic segmentation network to generate its semantic segmentation result;
in step S100, modifying the VGG16 convolutional neural network to generate the base network specifically comprises:
discarding all fully connected layers and the last pooling layer of the original VGG16 convolutional neural network to construct an end-to-end fully convolutional network;
performing convolution, pooling, batch normalization, and ReLU operations through the fully convolutional network to obtain the feature map of each convolution layer of the base network, thereby extracting the preliminary features of the image;
step S200 is specifically implemented as follows:
step S210, inputting the feature map into a 1×1 convolution and a 3×3 convolution respectively to obtain convolution features at each scale;
step S220, fusing the convolution features of all scales and applying a ReLU operation to obtain a first result;
step S230, inputting the first result into a 1×1 convolution and adjusting the number of output feature channels to the number of categories, thereby generating the high-level semantic features.
2. The image semantic segmentation method oriented to complex environments according to claim 1, wherein step S300 is specifically implemented as follows:
step S310, inputting the feature map into two groups of dilated convolutions respectively, applying batch normalization and ReLU operations to each, then feeding each into a 1×1 convolution that adjusts the number of output feature channels to the number of categories, generating a first feature map and a second feature map;
step S320, applying convolution, batch normalization, and ReLU operations to the first and second feature maps so as to form a pyramid structure;
and step S330, fusing the first and second feature maps of the pyramid structure to generate the fine-grained low-level features.
3. The image semantic segmentation method oriented to complex environments according to claim 1, wherein in step S400 the high-level semantic features and the fine-grained low-level features are fused by element-wise addition through an eltwise layer to generate the high-resolution feature map.
4. The image semantic segmentation method according to claim 1, wherein in step S500 the network training parameters are specifically set as follows:
with the poly learning strategy, the initial learning rate is set to 0.001 and the power to 0.9; the convolution kernel weights are initialized from a Gaussian distribution with mean 0 and standard deviation 0.01; the biases are initialized to 0; the weight decay is set to 0.0005 and the momentum to 0.9.
5. An image semantic segmentation device oriented to complex environments, the device comprising:
an extraction unit for modifying a VGG16 convolutional neural network to generate a base network and extracting the preliminary features of a training image through the base network, the convolution layers of the VGG16 convolutional neural network being divided into 5 stages;
a high-level semantic feature unit for processing the preliminary features obtained by the convolution layers of the first 4 stages of the base network with a hidden layer convolution feature module to generate high-level semantic features;
a fine-grained low-level feature unit for processing the preliminary features obtained by the last convolution layer of the base network through pyramid-structured dilated convolution to obtain fine-grained low-level features;
a high-resolution feature map unit for fusing the high-level semantic features and the fine-grained low-level features to generate a high-resolution feature map;
a semantic segmentation network unit for setting network training parameters and training the network by back propagation with the cross-entropy loss function as the objective, thereby establishing a semantic segmentation network;
and a semantic segmentation result unit for inputting a test image into the semantic segmentation network and generating its semantic segmentation result;
wherein modifying the VGG16 convolutional neural network to generate the base network specifically comprises:
discarding all fully connected layers and the last pooling layer of the original VGG16 convolutional neural network to construct an end-to-end fully convolutional network;
performing convolution, pooling, batch normalization, and ReLU operations through the fully convolutional network to obtain the feature map of each convolution layer of the base network, thereby extracting the preliminary features of the image;
and wherein the high-level semantic feature unit is specifically configured to:
input the feature map into a 1×1 convolution and a 3×3 convolution respectively to obtain convolution features at each scale;
fuse the convolution features of all scales and apply a ReLU operation to obtain a first result;
and input the first result into a 1×1 convolution and adjust the number of output feature channels to the number of categories, thereby generating the high-level semantic features.
CN201910333809.9A 2019-04-24 2019-04-24 Image semantic segmentation method and device oriented to complex environment Active CN110136141B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910333809.9A CN110136141B (en) 2019-04-24 2019-04-24 Image semantic segmentation method and device oriented to complex environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910333809.9A CN110136141B (en) 2019-04-24 2019-04-24 Image semantic segmentation method and device oriented to complex environment

Publications (2)

Publication Number Publication Date
CN110136141A CN110136141A (en) 2019-08-16
CN110136141B true CN110136141B (en) 2023-07-11

Family

ID=67571100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910333809.9A Active CN110136141B (en) 2019-04-24 2019-04-24 Image semantic segmentation method and device oriented to complex environment

Country Status (1)

Country Link
CN (1) CN110136141B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111079767B (en) * 2019-12-22 2022-03-22 浪潮电子信息产业股份有限公司 Neural network model for segmenting image and image segmentation method thereof
CN111259901A (en) * 2020-01-13 2020-06-09 镇江优瞳智能科技有限公司 Efficient method for improving semantic segmentation precision by using spatial information
CN113496158A (en) * 2020-03-20 2021-10-12 中移(上海)信息通信科技有限公司 Object detection model optimization method, device, equipment and storage medium
CN111340139B (en) * 2020-03-27 2024-03-05 中国科学院微电子研究所 Method and device for judging complexity of image content
CN111444923A (en) * 2020-04-13 2020-07-24 中国人民解放军国防科技大学 Image semantic segmentation method and device under natural scene
CN111428739B (en) * 2020-04-14 2023-08-25 图觉(广州)智能科技有限公司 High-precision image semantic segmentation method with continuous learning capability
CN112116594B (en) * 2020-09-10 2023-12-19 福建省海峡智汇科技有限公司 Semantic segmentation-based wind-drift foreign matter identification method and device
CN112801104B (en) * 2021-01-20 2022-01-07 吉林大学 Image pixel level pseudo label determination method and system based on semantic segmentation
CN113657388B (en) * 2021-07-09 2023-10-31 北京科技大学 Image semantic segmentation method for super-resolution reconstruction of fused image
CN113780297B (en) * 2021-09-15 2024-03-12 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107644426A (en) * 2017-10-12 2018-01-30 中国科学技术大学 Image, semantic dividing method based on pyramid pond encoding and decoding structure
CN109145769A (en) * 2018-08-01 2019-01-04 辽宁工业大学 The target detection network design method of blending image segmentation feature
CN109598269A (en) * 2018-11-14 2019-04-09 天津大学 A kind of semantic segmentation method based on multiresolution input with pyramid expansion convolution

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10068171B2 (en) * 2015-11-12 2018-09-04 Conduent Business Services, Llc Multi-layer fusion in a convolutional neural network for image classification

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107644426A (en) * 2017-10-12 2018-01-30 中国科学技术大学 Image, semantic dividing method based on pyramid pond encoding and decoding structure
CN109145769A (en) * 2018-08-01 2019-01-04 辽宁工业大学 The target detection network design method of blending image segmentation feature
CN109598269A (en) * 2018-11-14 2019-04-09 天津大学 A kind of semantic segmentation method based on multiresolution input with pyramid expansion convolution

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs; George Papandreou; arXiv; pp. 1-14 *
Pyramid Feature Attention Network for Saliency Detection; Ting Zhao; arXiv; pp. 1-10 *
RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation; Guosheng Lin; arXiv; pp. 1-10 *

Also Published As

Publication number Publication date
CN110136141A (en) 2019-08-16

Similar Documents

Publication Publication Date Title
CN110136141B (en) Image semantic segmentation method and device oriented to complex environment
CN110335270B (en) Power transmission line defect detection method based on hierarchical regional feature fusion learning
US20230206603A1 (en) High-precision point cloud completion method based on deep learning and device thereof
Chen et al. PCB defect detection method based on transformer-YOLO
CN111242026B (en) Remote sensing image target detection method based on spatial hierarchy perception module and metric learning
WO2021169049A1 (en) Method for glass detection in real scene
CN116503318A (en) Aerial insulator multi-defect detection method, system and equipment integrating CAT-BiFPN and attention mechanism
CN116309313A (en) Battery surface welding defect detection method
CN114897857A (en) Solar cell defect detection method based on light neural network
CN117649448A (en) Intelligent recognition and segmentation method for leakage water of tunnel working face
CN113902792A (en) Building height detection method and system based on improved RetinaNet network and electronic equipment
CN117252817A (en) Transparent conductive film glass surface defect detection method and system
CN116012709B (en) High-resolution remote sensing image building extraction method and system
CN116129158A (en) Power transmission line iron tower small part image recognition method and device
CN115082650A (en) Implementation method of automatic pipeline defect labeling tool based on convolutional neural network
Wang et al. Automated pavement crack detection based on multiscale fully convolutional network
CN117553807B (en) Automatic driving navigation method and system based on laser radar
CN115272814B (en) Long-distance space self-adaptive multi-scale small target detection method
CN115114860B (en) Data modeling amplification method for concrete pipeline damage identification
CN112446267B (en) Setting method of face recognition network suitable for front end
CN117593517B (en) Camouflage target detection method based on complementary perception cross-view fusion network
CN114972373A (en) Target detection method based on parallax segmentation
CN115526840A (en) Infrared image segmentation method and system for typical ground wire clamp of power transmission line
CN118154607A (en) Lightweight defect detection method based on mixed multiscale knowledge distillation
CN118314190A (en) Method and system for constructing anisotropic non-lambertian reflection residual retroreflective model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant