CN113095136A - Unmanned aerial vehicle aerial video semantic segmentation method based on UVid-Net - Google Patents

Unmanned aerial vehicle aerial video semantic segmentation method based on UVid-Net

Info

Publication number
CN113095136A
CN113095136A
Authority
CN
China
Prior art keywords
model
semantic segmentation
data
net
unmanned aerial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110257662.7A
Other languages
Chinese (zh)
Other versions
CN113095136B (en)
Inventor
潘晓光
陈亮
董虎弟
宋晓晨
张雅娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanxi Sanyouhe Smart Information Technology Co Ltd
Original Assignee
Shanxi Sanyouhe Smart Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanxi Sanyouhe Smart Information Technology Co Ltd filed Critical Shanxi Sanyouhe Smart Information Technology Co Ltd
Priority to CN202110257662.7A
Priority claimed from CN202110257662.7A
Publication of CN113095136A
Application granted
Publication of CN113095136B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention belongs to the technical field of semantic segmentation and specifically relates to a UVid-Net-based semantic segmentation method for unmanned aerial vehicle (UAV) aerial video, comprising the following steps. Data acquisition: collect a data set for UAV video semantic segmentation and label its categories at the pixel level, completing construction of the data set required for model training. Data preprocessing: preprocessing includes normalization, data division, image scaling, and the like, and the data set is augmented to ensure the training effect of the model. Model identification: build a UVid-Net-based semantic segmentation model and input the training data, completing construction of the parameter model. Model saving: when the loss function of the model no longer decreases, save the model. Model evaluation: evaluate the performance of the model's segmentation results with multiple evaluation indexes. The encoding path of the invention captures the temporal dynamics of the video by extracting features from multiple frames, and the new decoding path retains the features of the encoder layers, improving semantic segmentation performance.

Description

Unmanned aerial vehicle aerial video semantic segmentation method based on UVid-Net
Technical Field
The invention belongs to the technical field of semantic segmentation, and particularly relates to a semantic segmentation method of unmanned aerial vehicle aerial video based on UVid-Net.
Background
Aerial image analysis has been used to assess damage immediately after natural disasters. Typically, aerial images are captured by different imaging modalities, such as Synthetic Aperture Radar (SAR) and hyperspectral imaging, mounted on satellites. In recent years, Unmanned Aerial Vehicles (UAVs) have also been widely used in applications such as disaster management, urban planning, wildlife tracking, and agricultural planning. Owing to rapid deployment and customizable flight paths, drone images and videos can provide finer details and complement satellite-based image analysis methods in critical applications such as disaster response. In addition, drone images may be used in conjunction with satellite images to better support city planning or geographic information updates. However, UAV image and video analysis has largely been limited to target detection and recognition tasks, such as building detection and road segmentation, and work on semantic segmentation of UAV images or videos is currently limited.
Problems or disadvantages of the prior art: semantic segmentation is the process of assigning a predetermined class label to every pixel in an image. However, integrating temporal information into the semantic segmentation process when extending it to video applications remains difficult.
Disclosure of Invention
Aiming at the above problems, a UVid-Net-based semantic segmentation method for UAV aerial video is provided. An extended version of the ManipalUAVid data set for UAV video semantic segmentation is collected, and pixel-level annotation is performed for four background classes: greenery, buildings, roads, and water bodies. After data collection is finished, the data are preprocessed, including segmentation and noise addition. The preprocessed data are input into the constructed UVid-Net network to train the network model; the model is saved once its loss function no longer decreases, which completes model construction, and the model's performance is then evaluated by several evaluation methods. The technical scheme adopted by the invention is as follows:
a semantic segmentation method of unmanned aerial vehicle aerial video based on UVid-Net comprises the following steps:
s100, data acquisition: collecting a data set for unmanned aerial vehicle video semantic segmentation, and carrying out pixel-level labeling on the category of the data set to complete construction of the data set required by model training;
s200, data preprocessing: preprocessing comprises normalization, data division, image scaling and the like, and a data set is amplified to ensure the training effect of the model;
s300, identifying a model: building a semantic segmentation model based on UVid-Net, inputting training data, and completing construction of a parameter model;
s400, model storage: when the loss function of the model is not reduced any more, the model is saved;
s500, model evaluation: and performing performance evaluation on the segmentation result of the model through various evaluation indexes.
Further, the step S100 of collecting data specifically includes: an extended version of the ManipalUAVid data set for UAV video semantic segmentation is collected; the data set comprises a plurality of videos, labels are provided for the key frames of the videos, and pixel-level annotation is performed for four background classes: greenery, buildings, roads, and water bodies.
Further, in step S200, the specific steps include: normalization: a normalization operation is applied to all data to unify their dynamic range and facilitate model training:

x' = (x - min(x)) / (max(x) - min(x))

where x' is the normalized data and x is the unprocessed data.
Data division: the data set is divided into a training set, a validation set, and a test set in a 7:2:1 ratio; the training set is used to train the model, the validation set to check whether the model loss keeps decreasing, and the test set to evaluate the model's performance;
Image scaling: because the sizes of the captured images in the original data set are not fixed, all data obtained after dataset division are scaled before being input to the model; each image is resized proportionally to 1280 x 720 pixels;
Data expansion: the two collected data sets for UAV video semantic segmentation are fused to form the new extended ManipalUAVid data set; combining the two data sets yields more video data, and each video contains more frames, so the temporal consistency of the video semantic segmentation model can be evaluated.
Further, S300 further includes model building: a UVid-Net-based network model is constructed to perform semantic segmentation of UAV aerial video. Two frames I_{t-1} and I_t are taken as input to the network model, which then performs semantic segmentation on I_t. The model is divided into an encoding module and a decoding module; in the encoding stage, features are extracted by two different structures, U-Net and ResNet-50. The U-Net encoder consists of convolutional and max-pooling layers and is used for feature extraction. The upper branch of the encoder comprises four modules; each upper-branch module consists of two consecutive 3x3 convolutional layers with batch normalization and ReLU activation, after which a 1x1 convolutional layer reduces the dimensionality of the feature maps and, finally, a max-pooling layer extracts the most salient features for the subsequent layers. After each max-pooling operation, the number of feature maps is doubled. The lower branch of the encoder also consists of four modules; each lower-branch module comprises a set of 3x3 convolutional layers with batch normalization and ReLU activation and a max-pooling layer. That max-pooling layer extracts the most salient features and, as in the upper branch, the number of feature maps doubles after each max-pooling operation.
Further, the features extracted by the upper and lower branches of the encoder are input to two separate bottleneck layers and, finally, the activations of the two branches are concatenated and provided to the decoder. The ResNet-50 feature extractor, by contrast, consists of residual blocks, which help to mitigate vanishing gradients; its architecture consists of an initial 7x7 convolution followed by a batch normalization layer and a ReLU activation function.
Further, in S400, once the loss function of the model no longer decreases, the model is saved. The loss is calculated with the cross-entropy loss function, formulated as follows:

L = - Σ_{i=1}^{N} y_i log(p_i)

where y_i represents the label of sample i, p_i represents the probability that sample i is predicted correctly, and N represents the number of categories.
Further, in S500, the performance of the model's segmentation results is evaluated by calculating the following evaluation indexes: mean Intersection over Union (mIoU), Precision, Recall, and F1-Score, formulated as follows:

mIoU = (1/N) · Σ_c TP_c / (TP_c + FP_c + FN_c)

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

F1-Score = 2 · Precision · Recall / (Precision + Recall)

where TP, FP, TN, and FN represent true-positive, false-positive, true-negative, and false-negative predictions, respectively.
Advantageous effects: the invention provides an enhanced encoder-decoder CNN architecture (UVid-Net) for UAV video semantic segmentation. The encoder of the proposed architecture embeds temporal information for temporally consistent labels, and the decoder is enhanced by introducing a feature-retainer module. The architecture has two parallel CNN branches for feature extraction. This new encoding path captures the temporal dynamics of the video by extracting features from multiple frames. These features are further processed by the decoder for class-label estimation, and the algorithm adopts a new decoding path that retains the features of the encoder layers and improves semantic segmentation performance.
Drawings
FIG. 1 is a system flow diagram of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The application discloses a semantic segmentation method of unmanned aerial vehicle aerial video based on UVid-Net, which comprises the following steps:
s100, data acquisition: collecting a data set for unmanned aerial vehicle video semantic segmentation, and carrying out pixel-level labeling on the category of the data set to complete construction of the data set required by model training;
s200, data preprocessing: preprocessing comprises normalization, data division, image scaling and the like, and a data set is amplified to ensure the training effect of the model;
s300, identifying a model: building a semantic segmentation model based on UVid-Net, inputting training data, and completing construction of a parameter model;
s400, model storage: when the loss function of the model is not reduced any more, the model is saved;
s500, model evaluation: and performing performance evaluation on the segmentation result of the model through various evaluation indexes.
Further, the step S100 of collecting data specifically includes: an extended version of the ManipalUAVid data set for UAV video semantic segmentation is collected; the data set comprises a plurality of videos, labels are provided for the key frames of the videos, and pixel-level annotation is performed for four background classes: greenery, buildings, roads, and water bodies.
Further, in step S200, the specific steps include: normalization: a normalization operation is applied to all data to unify their dynamic range and facilitate model training:

x' = (x - min(x)) / (max(x) - min(x))

where x' is the normalized data and x is the unprocessed data.
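The exact normalization variant is not recoverable from the original figure; the following is a minimal NumPy sketch assuming per-image min-max scaling to [0, 1], with a hypothetical function name:

```python
import numpy as np

def min_max_normalize(frame: np.ndarray) -> np.ndarray:
    """Scale raw pixel values into [0, 1]; assumes the min-max variant x' = (x - min) / (max - min)."""
    x_min, x_max = float(frame.min()), float(frame.max())
    return (frame.astype(np.float32) - x_min) / (x_max - x_min + 1e-8)  # epsilon guards against flat frames
```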
Data division: the data set is divided into a training set, a validation set, and a test set in a 7:2:1 ratio; the training set is used to train the model, the validation set to check whether the model loss keeps decreasing, and the test set to evaluate the model's performance;
Image scaling: because the sizes of the captured images in the original data set are not fixed, all data obtained after dataset division are scaled before being input to the model; each image is resized proportionally to 1280 x 720 pixels;
Data expansion: the two collected data sets for UAV video semantic segmentation are fused to form the new extended ManipalUAVid data set; combining the two data sets yields more video data, and each video contains more frames, so the temporal consistency of the video semantic segmentation model can be evaluated. A sketch of the division and scaling steps follows.
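As a hedged illustration of the 7:2:1 division and the fixed 1280 x 720 input size, the following Python sketch uses OpenCV for resizing; all helper names are hypothetical:

```python
import random
import cv2  # OpenCV, used here only for resizing

def split_dataset(pairs, ratios=(0.7, 0.2, 0.1), seed=42):
    """Shuffle (frame, mask) pairs and split them into train/val/test at 7:2:1."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_train = int(ratios[0] * len(pairs))
    n_val = int(ratios[1] * len(pairs))
    return pairs[:n_train], pairs[n_train:n_train + n_val], pairs[n_train + n_val:]

def resize_pair(frame, mask, size=(1280, 720)):
    """Resize an image and its label map to the assumed model input; cv2 takes (width, height)."""
    frame = cv2.resize(frame, size, interpolation=cv2.INTER_LINEAR)
    mask = cv2.resize(mask, size, interpolation=cv2.INTER_NEAREST)  # nearest keeps class indices discrete
    return frame, mask
```

Nearest-neighbor interpolation is used for the label maps so that resizing never produces fractional class labels.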
Further, in S300, a UVid-Net-based network model is constructed to perform semantic segmentation of UAV aerial video. Two frames I_{t-1} and I_t are taken as input to the network model, which then performs semantic segmentation on I_t. The model is divided into an encoding module and a decoding module; in the encoding stage, features are extracted by two different structures, U-Net and ResNet-50.
The U-Net encoder consists of convolutional and max-pooling layers and is used for feature extraction. The upper branch of the encoder comprises four modules. Each module consists of two consecutive 3x3 convolutional layers with batch normalization and ReLU activation. A 1x1 convolutional layer then reduces the dimensionality of the feature maps. Finally, a max-pooling layer extracts the most salient features for the subsequent layers, and the number of feature maps doubles after each max-pooling operation. The lower branch of the encoder is likewise composed of four modules. Each lower-branch module has two sets of 3x3 convolutional layers with batch normalization and ReLU activation, followed by a max-pooling layer that extracts the most salient features; as in the upper branch, the number of feature maps doubles after each max-pooling operation. The features extracted by the upper and lower branches of the encoder are input to two separate bottleneck layers; finally, the activations of the two branches are concatenated and provided to the decoder.
The ResNet-50 feature extractor consists of residual blocks, which help to mitigate vanishing gradients. Its architecture begins with an initial 7x7 convolution followed by a batch normalization layer and a ReLU activation function, after which a max-pooling operation with a 3x3 kernel is applied. After this max-pooling operation, the architecture consists of four stages. The first stage comprises three residual blocks, each containing three layers with 64, 64, and 128 filters. The second stage consists of four residual blocks of three layers each, using 128, 128, and 256 filters. The third stage comprises six residual blocks of three layers each, using 256, 256, and 512 filters. The fourth stage consists of three residual blocks of three layers each, using 512, 512, and 1024 filters. In stages 2, 3, and 4, the first residual block halves the width and height of the input with a strided operation. The first and last layers of each residual block use a 1x1 kernel and the second layer a 3x3 kernel; the resulting features are provided to the decoder.
Features are thus extracted by these two different feature extractors in the encoding stage, the two feature vectors are fused, and the fused features are provided to the decoder for semantic segmentation. The decoding module is the same as the U-Net decoder and performs the same operations, except that the UVid-Net model finally passes through a feature-retention module that combines the corresponding feature maps of the encoder and decoder; a SoftMax layer then yields the probability of each pixel belonging to each class.
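To make the dual-branch encoder concrete, the following PyTorch sketch pairs a U-Net-style branch (two 3x3 conv+BN+ReLU layers, a 1x1 conv, then max pooling, with feature maps doubling per module) with a stock torchvision ResNet-50 trunk standing in for the patent's modified residual stages. This is a sketch under assumptions, not the patented implementation: all module names, channel widths, and the fusion by bilinear upsampling plus concatenation are illustrative choices.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class UNetEncoderBlock(nn.Module):
    """One upper-branch module: two 3x3 conv+BN+ReLU layers, a 1x1 conv
    reducing feature-map dimensionality, then 2x2 max pooling."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 1),   # 1x1 dimensionality reduction
            nn.MaxPool2d(2),                # keeps the most salient activations
        )

    def forward(self, x):
        return self.block(x)

class TwoBranchEncoder(nn.Module):
    """Hypothetical dual-frame encoder: the previous frame goes through a
    U-Net-style branch whose feature maps double after each pooling, the
    current frame through a ResNet-50 trunk; the two activations are fused
    by concatenation for the decoder."""
    def __init__(self, base=64):
        super().__init__()
        chans = [3, base, base * 2, base * 4, base * 8]
        self.unet_branch = nn.Sequential(
            *[UNetEncoderBlock(chans[i], chans[i + 1]) for i in range(4)]
        )
        trunk = resnet50(weights=None)
        self.resnet_branch = nn.Sequential(*list(trunk.children())[:-2])  # drop avgpool/fc
        self.proj = nn.Conv2d(2048, base * 8, 1)  # align channel counts before fusion

    def forward(self, frame_prev, frame_curr):
        f_prev = self.unet_branch(frame_prev)               # (B, 512, H/16, W/16)
        f_curr = self.proj(self.resnet_branch(frame_curr))  # (B, 512, H/32, W/32)
        f_curr = nn.functional.interpolate(
            f_curr, size=f_prev.shape[-2:], mode="bilinear", align_corners=False)
        return torch.cat([f_prev, f_curr], dim=1)           # fused features for the decoder

# Example: fused = TwoBranchEncoder()(torch.randn(1, 3, 720, 1280),
#                                     torch.randn(1, 3, 720, 1280))
```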
Further, in S400, once the loss function of the model no longer decreases, the model is saved. The loss is calculated with the cross-entropy loss function, formulated as follows:

L = - Σ_{i=1}^{N} y_i log(p_i)

where y_i represents the label of sample i, p_i represents the probability that sample i is predicted correctly, and N represents the number of categories.
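Assuming the standard per-pixel reading of this formula, a minimal PyTorch expression of the loss is:

```python
import torch.nn.functional as F

def segmentation_loss(logits, target):
    """Per-pixel multi-class cross entropy, L = -sum_i y_i * log(p_i),
    averaged over all pixels. logits: (B, num_classes, H, W) raw scores;
    target: (B, H, W) integer class indices."""
    return F.cross_entropy(logits, target)
```

F.cross_entropy applies log-softmax internally, so in this sketch the network's final SoftMax layer is not applied before the loss during training.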
Further, in S500, the performance of the model's segmentation results is evaluated by calculating the following evaluation indexes: mean Intersection over Union (mIoU), Precision, Recall, and F1-Score, formulated as follows:

mIoU = (1/N) · Σ_c TP_c / (TP_c + FP_c + FN_c)

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

F1-Score = 2 · Precision · Recall / (Precision + Recall)

where TP, FP, TN, and FN represent true-positive, false-positive, true-negative, and false-negative predictions, respectively.
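A NumPy sketch of these four indexes, computed per class from the TP/FP/FN counts and averaged over the classes (helper name hypothetical):

```python
import numpy as np

def evaluate_segmentation(pred, gt, num_classes=4, eps=1e-8):
    """Compute mIoU, Precision, Recall, and F1 from flattened prediction
    and ground-truth label maps."""
    pred, gt = pred.ravel(), gt.ravel()
    tp = np.array([np.sum((pred == c) & (gt == c)) for c in range(num_classes)], dtype=float)
    fp = np.array([np.sum((pred == c) & (gt != c)) for c in range(num_classes)], dtype=float)
    fn = np.array([np.sum((pred != c) & (gt == c)) for c in range(num_classes)], dtype=float)
    iou = tp / (tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return {"mIoU": iou.mean(), "Precision": precision.mean(),
            "Recall": recall.mean(), "F1-Score": f1.mean()}
```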
Although only the preferred embodiments of the present invention have been described in detail, the present invention is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art, and all changes are encompassed in the scope of the present invention.

Claims (7)

1. A semantic segmentation method of unmanned aerial vehicle aerial video based on UVid-Net is characterized in that: comprises the following steps:
s100, data acquisition: collecting a data set for unmanned aerial vehicle video semantic segmentation, and carrying out pixel-level labeling on the category of the data set to complete construction of the data set required by model training;
s200, data preprocessing: preprocessing comprises normalization, data division, image scaling and the like, and a data set is amplified to ensure the training effect of the model;
s300, identifying a model: building a semantic segmentation model based on UVid-Net, inputting training data, and completing construction of a parameter model;
s400, model storage: when the loss function of the model is not reduced any more, the model is saved;
s500, model evaluation: and performing performance evaluation on the segmentation result of the model through various evaluation indexes.
2. The UVid-Net-based semantic segmentation method for unmanned aerial vehicle aerial video according to claim 1, wherein the specific steps of data collection in step S100 are as follows: an extended version of the ManipalUAVid data set for UAV video semantic segmentation is collected; the data set comprises a plurality of videos, labels are provided for the key frames of the videos, and pixel-level annotation is performed for four background classes: greenery, buildings, roads, and water bodies.
3. The UVid-Net-based semantic segmentation method for unmanned aerial vehicle aerial video according to claim 1, wherein step S200 specifically comprises: normalization: a normalization operation is applied to all data to unify their dynamic range and facilitate model training:

x' = (x - min(x)) / (max(x) - min(x))

where x' is the normalized data and x is the unprocessed data;
Data division: the data set is divided into a training set, a validation set, and a test set in a 7:2:1 ratio; the training set is used to train the model, the validation set to check whether the model loss keeps decreasing, and the test set to evaluate the model's performance;
Image scaling: because the sizes of the captured images in the original data set are not fixed, all data obtained after dataset division are scaled before being input to the model; each image is resized proportionally to 1280 x 720 pixels;
Data expansion: the two collected data sets for UAV video semantic segmentation are fused to form the new extended ManipalUAVid data set; combining the two data sets yields more video data, and each video contains more frames, so the temporal consistency of the video semantic segmentation model can be evaluated.
4. The UVid-Net-based semantic segmentation method for unmanned aerial vehicle aerial video according to claim 3, wherein S300 further comprises model construction: a UVid-Net-based network model is constructed to perform semantic segmentation of UAV aerial video; two frames I_{t-1} and I_t are taken as input to the network model, which then performs semantic segmentation on I_t; the model is divided into an encoding module and a decoding module, and in the encoding stage features are extracted by two different structures, U-Net and ResNet-50; the U-Net encoder consists of convolutional and max-pooling layers and is used for feature extraction; the upper branch of the encoder comprises four modules, each consisting of two consecutive 3x3 convolutional layers with batch normalization and ReLU activation, followed by a 1x1 convolutional layer that reduces the dimensionality of the feature maps and a max-pooling layer that extracts the most salient features for subsequent layers, the number of feature maps doubling after each max-pooling operation; the lower branch of the encoder likewise consists of four modules, each having a set of 3x3 convolutional layers with batch normalization and ReLU activation and a max-pooling layer that extracts the most salient features, the number of feature maps doubling after each max-pooling operation as in the upper branch.
5. The UVid-Net-based semantic segmentation method for unmanned aerial vehicle aerial video according to claim 4, wherein the features extracted by the upper and lower branches of the encoder are input into two separate bottleneck layers, and finally the activations of the two branches are concatenated and provided to the decoder; the ResNet-50 feature extractor consists of residual blocks, which help to mitigate vanishing gradients, and its architecture consists of an initial 7x7 convolution followed by a batch normalization layer and a ReLU activation function.
6. The UVid-Net-based semantic segmentation method for unmanned aerial vehicle aerial video according to claim 1, wherein in S400, once the loss function of the model no longer decreases, the model is saved, the loss being calculated with the cross-entropy loss function:

L = - Σ_{i=1}^{N} y_i log(p_i)

where y_i represents the label of sample i, p_i represents the probability that sample i is predicted correctly, and N represents the number of categories.
7. The UVid-Net-based semantic segmentation method for unmanned aerial vehicle aerial video according to claim 1, wherein in S500, the performance of the model's segmentation results is evaluated by calculating the following evaluation indexes: mean Intersection over Union (mIoU), Precision, Recall, and F1-Score, formulated as follows:

mIoU = (1/N) · Σ_c TP_c / (TP_c + FP_c + FN_c)

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

F1-Score = 2 · Precision · Recall / (Precision + Recall)

where TP, FP, TN, and FN represent true-positive, false-positive, true-negative, and false-negative predictions, respectively.
CN202110257662.7A 2021-03-09 UVid-Net-based semantic segmentation method for unmanned aerial vehicle aerial video Active CN113095136B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110257662.7A CN113095136B (en) 2021-03-09 UVid-Net-based semantic segmentation method for unmanned aerial vehicle aerial video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110257662.7A CN113095136B (en) 2021-03-09 UVid-Net-based semantic segmentation method for unmanned aerial vehicle aerial video

Publications (2)

Publication Number Publication Date
CN113095136A true CN113095136A (en) 2021-07-09
CN113095136B CN113095136B (en) 2024-07-26


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190164007A1 (en) * 2017-11-30 2019-05-30 TuSimple Human driving behavior modeling system using machine learning
US20190043203A1 (en) * 2018-01-12 2019-02-07 Intel Corporation Method and system of recurrent semantic segmentation for image processing
CN110322435A (en) * 2019-01-20 2019-10-11 北京工业大学 A kind of gastric cancer pathological image cancerous region dividing method based on deep learning
CN111062252A (en) * 2019-11-15 2020-04-24 浙江大华技术股份有限公司 Real-time dangerous article semantic segmentation method and device and storage device
CN111488884A (en) * 2020-04-28 2020-08-04 东南大学 Real-time semantic segmentation method with low calculation amount and high feature fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈璐; 管霜霜: "Research on change detection methods for urban high-resolution remote sensing images based on deep learning" (基于深度学习的城市高分遥感图像变化检测方法的研究), Application Research of Computers (计算机应用研究), no. 1, 30 June 2020 (2020-06-30) *


Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant