CN112580645A - Unet semantic segmentation method based on convolutional sparse coding - Google Patents
- Publication number
- CN112580645A (application CN202011445030.5A)
- Authority
- CN
- China
- Prior art keywords
- unet
- model
- csc
- network
- segmentation
- Prior art date
- Legal status: Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/002—Image coding using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/467—Encoded features or binary features, e.g. local binary patterns [LBP]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/513—Sparse representations
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a Unet semantic segmentation method based on convolutional sparse coding. Convolutional sparse coding is combined with the encoding network of the Unet model to form the encoder of a CSC-Unet model, which captures the global information of an image; convolutional sparse coding is likewise combined with the decoding network of Unet to form the decoder of the CSC-Unet model, which captures the position information of the image; and skip connections combine the global information with the position information to produce accurate, fine-grained segmentation. The method first preprocesses the training pictures and labels in the data set (e.g., cropping and data augmentation) and then reads them into the CSC-Unet segmentation model for training. After training is finished, the test samples and labels in the data set are read into the CSC-Unet segmentation model and the saved best weights are loaded into the model, so that the model performs accurate semantic segmentation.
Description
Technical Field
The invention relates to the field of image semantic segmentation, and in particular to a Unet semantic segmentation method based on convolutional sparse coding.
Background
Over the past decades, the field of sparse and redundant representations has advanced dramatically, maturing into a discipline with broad influence. The sparse model rests on the idea that natural signals can be described as a linear combination of only a few members, or atoms, of a dictionary. Sparse models have become central to signal and image processing and machine learning, producing state-of-the-art results across a variety of tasks and many different domains. In recent years, two descendants of the sparse model, Convolutional Sparse Coding (CSC) and Multi-Layer Convolutional Sparse Coding (ML-CSC), have achieved significant results in fields ranging from signal and image processing to machine learning. Semantic segmentation, meanwhile, is one of the key tasks in computer vision: more and more application scenarios need to infer knowledge or semantics from imagery (a concrete-to-abstract process), making segmentation increasingly important for scene understanding. On this basis, the invention provides a Unet semantic segmentation method based on convolutional sparse coding that better captures the semantic and representational information of an image, allowing more accurate semantic segmentation.
Disclosure of Invention
The invention aims to obtain a model with a more accurate segmentation effect by combining convolutional sparse coding, which has proven effective in many different fields, with the Unet segmentation model.
In order to achieve the above object, the present invention provides a Unet semantic segmentation method based on convolutional sparse coding, comprising the following steps:
S1: reading the training samples and labels in a data set into the CSC-Unet semantic segmentation network, and preprocessing the training pictures and labels according to actual needs, e.g., cropping and normalization;
S2: combining convolutional sparse coding with the encoding network in Unet to form the encoder of the CSC-Unet model, so as to obtain the global information of the image;
S3: combining convolutional sparse coding with the decoding network in Unet to form the decoder of the CSC-Unet model, so as to obtain the position information of the image, and using skip connections to combine the global information with the position information to generate accurate and fine segmentation;
S4: post-processing the result produced by the CSC-Unet model to obtain a visualized semantic segmentation map.
As a preferred technical solution of the present invention, S1 and S4 are the preprocessing and post-processing of the image, respectively, and S2 and S3 form the proposed CSC-Unet semantic segmentation method by combining convolutional sparse coding with the Unet segmentation model.
As a preferred embodiment of the present invention, step S1, the data preprocessing part, can consist of the following steps:
S1.1, preprocessing the data, e.g., normalization, standardization, cropping and data augmentation; this benefits the training of a deep network, accelerates convergence, avoids overfitting and enhances the generalization capability of the model;
S1.2, reading the preprocessed data into the network: the training samples and labels in the data set are read into the convolutional neural network in batches of size batch-size.
As a preferred embodiment of the present invention, step S2, the design process of the encoder of the CSC-Unet model, can consist of the following steps:
S2.1, designing a two-layer convolutional sparse coding network to form an ML-CSC module: let the original signal X satisfy a two-layer convolutional sparse model, which can be expressed as X = D1Γ1, Γ1 = D2Γ2, where D1 and D2 are convolutional dictionaries and Γ1, Γ2 are the corresponding sparse representations;
S2.2, solving the ML-CSC problem: finding Γ1 and Γ2 can be seen as a Deep Coding Problem (DCP): solve for the sparsest Γ1 and Γ2 under the conditions ||Y − D1Γ1||2 ≤ ε and Γ1 = D2Γ2, where Y is the original signal X mixed with noise E, i.e. Y = X + E with ||E||2 ≤ ε. Solving the deep coding problem layer by layer with the Layered Basis Pursuit (LBP) algorithm yields Γ̂1 = argmin_Γ1 ½||Y − D1Γ1||2² + λ1||Γ1||1 and Γ̂2 = argmin_Γ2 ½||Γ̂1 − D2Γ2||2² + λ2||Γ2||1, where λ1 and λ2 control the sparsity of each layer;
S2.3, solving the LBP problem: an approximate solution of the LBP problem can be found with the Multi-Layer Iterative Soft-Thresholding Algorithm (ML-ISTA), which iterates Γk ← Tλk(Γk + Dk^T(Γ(k−1) − DkΓk)) with Γ0 = Y, where t is the number of iterations and Tλ is the soft-thresholding operator; if the representation coefficients are further assumed to be non-negative, the approximate solution can be written as Γk = ReLU(Wk ∗ Γ(k−1) + bk), where Wk = Dk^T is a convolution operation; when t = 0, i.e. no iteration is performed, the ML-CSC module is equivalent to two convolution operations with convolution coefficients W1 and W2; when the iteration number t = 1, one refinement step is added, and the number of learnable parameters is not increased in the iterative process;
S2.4, combining the ML-CSC module with the encoding end of a traditional Unet segmentation network.
As a preferred embodiment of the present invention, step S3, the design process of the decoder of the CSC-Unet model and the skip connections, can consist of the following steps:
S3.1, combining the ML-CSC module with the decoding end of a traditional Unet segmentation network;
S3.2, using skip connections to combine the global information with the position information, so as to generate accurate and fine segmentation;
S3.3, selecting the negative log-likelihood loss (NLLLoss) as the loss function, applying a log-softmax activation to the network output, and selecting Adam, an adaptive-learning-rate optimization method, as the optimizer.
As a preferred technical solution of the present invention, in S4 the result produced by the CSC-Unet model is post-processed to obtain a visualized, accurate semantic segmentation map; the process can consist of the following steps:
S4.1, comparing the model output with the ground-truth labels to compute a confusion matrix, from which index measurements such as the mean intersection over union (mIoU), pixel accuracy and mean pixel accuracy are derived to evaluate the network performance;
S4.2, saving the prediction results of the model as pictures, so that the accuracy of the segmentation can be assessed visually.
The beneficial effects of the invention are: compared with prior-art segmentation methods, the encoding end better captures the global information of the original image and the decoding end better captures its position information; combining the captured global and position information finally yields a more accurate semantic segmentation of the image.
Drawings
FIG. 1 is a simplified flowchart of the Unet semantic segmentation method based on convolutional sparse coding according to the present invention;
figs. 2-3 show the model structure of the Unet semantic segmentation method based on convolutional sparse coding of the present invention.
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the advantages and features of the invention can be more readily understood by those skilled in the art and the scope of the invention more clearly defined.
Embodiment: referring to figs. 1-3, the present invention provides the following technical solution.
As shown in fig. 1, the Unet semantic segmentation method based on convolutional sparse coding of the present invention includes the following steps:
S1: the data preprocessing part can consist of the following steps:
S1.1, preprocessing the data, e.g., normalization, standardization, cropping and data augmentation. This benefits the training of a deep network and accelerates convergence, while also avoiding overfitting and enhancing the generalization capability of the model.
S1.2, reading the preprocessed data into the network: the training samples and labels in the data set are read into the convolutional neural network in batches of size batch-size.
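The preprocessing of S1.1-S1.2 can be sketched as follows. This is a minimal illustration only; the crop size, mean and standard deviation below are placeholder values, not parameters taken from the patent.

```python
import numpy as np

def preprocess(image, crop_size=256, mean=0.5, std=0.25):
    """Illustrative preprocessing: center-crop, normalize, standardize.

    crop_size, mean and std are assumed placeholder values.
    """
    h, w = image.shape[:2]
    top = max((h - crop_size) // 2, 0)
    left = max((w - crop_size) // 2, 0)
    crop = image[top:top + crop_size, left:left + crop_size]
    crop = crop.astype(np.float32) / 255.0   # normalize to [0, 1]
    return (crop - mean) / std               # standardize

img = np.random.randint(0, 256, size=(300, 300, 3), dtype=np.uint8)
out = preprocess(img)
print(out.shape)  # (256, 256, 3)
```

In practice such transforms would be applied per batch before the samples are read into the network.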
S2: the design process of the encoder of the CSC-Unet model can consist of the following steps:
S2.1, designing a two-layer convolutional sparse coding network to form an ML-CSC module: let the original signal X satisfy a two-layer convolutional sparse model, which can be expressed as X = D1Γ1, Γ1 = D2Γ2, where D1 and D2 are convolutional dictionaries and Γ1, Γ2 are the corresponding sparse representations.
S2.2, solving the ML-CSC problem: finding Γ1 and Γ2 can be seen as a Deep Coding Problem (DCP): solve for the sparsest Γ1 and Γ2 under the conditions ||Y − D1Γ1||2 ≤ ε and Γ1 = D2Γ2, where Y is the original signal X mixed with noise E, i.e. Y = X + E with ||E||2 ≤ ε. Solving the deep coding problem layer by layer with the Layered Basis Pursuit (LBP) algorithm yields Γ̂1 = argmin_Γ1 ½||Y − D1Γ1||2² + λ1||Γ1||1 and Γ̂2 = argmin_Γ2 ½||Γ̂1 − D2Γ2||2² + λ2||Γ2||1, where λ1 and λ2 control the sparsity of each layer.
S2.3, solving the LBP problem: an approximate solution of the LBP problem can be found with the Multi-Layer Iterative Soft-Thresholding Algorithm (ML-ISTA), which iterates Γk ← Tλk(Γk + Dk^T(Γ(k−1) − DkΓk)) with Γ0 = Y, where t is the number of iterations and Tλ is the soft-thresholding operator. If the representation coefficients are further assumed to be non-negative, the approximate solution can be written as Γk = ReLU(Wk ∗ Γ(k−1) + bk), where Wk = Dk^T is a convolution operation. When t = 0, i.e. no iteration is performed, the ML-CSC module is equivalent to two convolution operations with convolution coefficients W1 and W2; when the iteration number t = 1, one refinement step is added, as shown in fig. 2, and the number of learnable parameters is not increased in the iterative process.
S2.4, combining the ML-CSC module with the encoding end of the traditional Unet segmentation network, as shown in figure 3.
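The ML-CSC module of S2.1-S2.3 can be sketched numerically as follows. This is an illustrative toy under stated assumptions: plain matrix dictionaries D1 and D2 stand in for the patent's convolutional dictionaries, the thresholds lam1/lam2 play the role of the ReLU biases, and all dimensions are arbitrary.

```python
import numpy as np

def relu_threshold(v, lam):
    # non-negative soft threshold: a ReLU whose bias acts as the threshold
    return np.maximum(v - lam, 0.0)

def ml_csc_forward(Y, D1, D2, lam1=0.1, lam2=0.1, t=1):
    """Two-layer ML-CSC module approximately solved by ML-ISTA.

    Matrix products stand in for convolutions; with t = 0 this reduces
    to two plain 'convolution + ReLU' steps with weights W_k = D_k^T.
    """
    # t = 0 pass: the module equals two convolution operations
    G1 = relu_threshold(D1.T @ Y, lam1)
    G2 = relu_threshold(D2.T @ G1, lam2)
    # t >= 1: ML-ISTA refinement; reuses D1, D2, so no new parameters
    for _ in range(t):
        G1 = relu_threshold(G1 + D1.T @ (Y - D1 @ G1), lam1)
        G2 = relu_threshold(G2 + D2.T @ (G1 - D2 @ G2), lam2)
    return G1, G2

rng = np.random.default_rng(0)
D1 = rng.standard_normal((64, 32)) / 8.0   # signal dim 64 -> code dim 32
D2 = rng.standard_normal((32, 16)) / 6.0   # code dim 32 -> code dim 16
Y = rng.standard_normal((64, 1))
G1, G2 = ml_csc_forward(Y, D1, D2, t=1)
print(G1.shape, G2.shape)  # (32, 1) (16, 1)
```

The key design point mirrored here is that the t = 1 refinement loop reuses the same dictionaries, so iterating adds computation but no learnable parameters.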
S3: the design process of the decoder of the CSC-Unet model and the skip connections can consist of the following steps:
S3.1, combining the ML-CSC module with the decoding end of the traditional Unet segmentation network, as shown in figure 3;
S3.2, using skip connections to combine the global information with the position information, so as to generate accurate and fine segmentation;
S3.3, selecting the negative log-likelihood loss (NLLLoss) as the loss function, applying a log-softmax activation to the network output, and selecting Adam, an adaptive-learning-rate optimization method, as the optimizer.
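The training objective of S3.3 (log-softmax followed by a negative log-likelihood over pixels) can be sketched in numpy as follows; shapes and random inputs are illustrative assumptions.

```python
import numpy as np

def log_softmax(logits, axis=1):
    # numerically stable log-softmax over the class axis
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def nll_loss(logits, targets):
    """Mean negative log-likelihood over all pixels.

    logits: (N, C, H, W) class scores; targets: (N, H, W) integer labels.
    """
    logp = log_softmax(logits, axis=1)
    # pick the log-probability of the target class at each pixel
    picked = np.take_along_axis(logp, targets[:, None, :, :], axis=1)
    return -picked.mean()

rng = np.random.default_rng(1)
logits = rng.standard_normal((2, 3, 4, 4))     # 2 images, 3 classes, 4x4
targets = rng.integers(0, 3, size=(2, 4, 4))
loss = nll_loss(logits, targets)
print(loss > 0)  # True
```

This mirrors the usual log-softmax + NLLLoss pairing; in a real training loop the gradient of this loss would be handled by the framework's autograd.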
S4: post-processing the result produced by the CSC-Unet model to obtain a visualized, accurate semantic segmentation map. The process can consist of the following steps:
S4.1, comparing the model output with the ground-truth labels to compute a confusion matrix, from which index measurements such as the mean intersection over union (mIoU), pixel accuracy and mean pixel accuracy are derived to evaluate the network performance;
S4.2, saving the prediction results of the model as pictures, so that the accuracy of the segmentation can be assessed visually.
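The metrics of S4.1 can be sketched as follows; the tiny 3x3 label maps are made-up examples, not data from the patent.

```python
import numpy as np

def confusion_matrix(pred, gt, num_classes):
    """Confusion matrix with rows = ground truth, cols = prediction."""
    idx = gt.ravel() * num_classes + pred.ravel()
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def segmentation_metrics(pred, gt, num_classes):
    cm = confusion_matrix(pred, gt, num_classes)
    tp = np.diag(cm).astype(np.float64)
    pixel_acc = tp.sum() / cm.sum()
    # union of class c = pixels predicted as c + pixels labeled c - overlap
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    iou = tp / np.maximum(union, 1)   # guard against empty classes
    return pixel_acc, iou.mean()

gt = np.array([[0, 0, 1], [1, 1, 2], [2, 2, 2]])
pred = np.array([[0, 1, 1], [1, 1, 2], [2, 2, 0]])
acc, miou = segmentation_metrics(pred, gt, num_classes=3)
print(round(acc, 3))  # 0.778
```

Per-class IoU here is [1/3, 3/4, 3/4], giving an mIoU of about 0.611; mean pixel accuracy would average the per-class recalls the same way.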
In this method, the training pictures and labels in the data set are preprocessed (e.g., cropping and data augmentation) and read into the CSC-Unet segmentation model for training; after all training is finished, the test samples and labels in the data set are read into the network and the saved best weights are loaded into the model, achieving accurate semantic segmentation. The invention is highly beneficial to research on geographic information systems, autonomous vehicle driving, medical image analysis, robotics, image search engines and the like.
The above-mentioned embodiments express only several implementations of the present invention, and their description, while specific and detailed, should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention.
Claims (6)
1. A Unet semantic segmentation method based on convolutional sparse coding, characterized by comprising the following steps:
S1: reading the training samples and labels in a data set into the CSC-Unet semantic segmentation network, and preprocessing the training pictures and labels according to actual needs, e.g., cropping and normalization;
S2: combining convolutional sparse coding with the encoding network in Unet to form the encoder of the CSC-Unet model, so as to obtain the global information of the image;
S3: combining convolutional sparse coding with the decoding network in Unet to form the decoder of the CSC-Unet model, so as to obtain the position information of the image, and using skip connections to combine the global information with the position information to generate accurate and fine segmentation;
S4: post-processing the result produced by the CSC-Unet model to obtain a visualized semantic segmentation map.
2. The Unet semantic segmentation method based on convolutional sparse coding as claimed in claim 1, wherein: S1 and S4 are the preprocessing and post-processing of the image, respectively, and S2 and S3 form the proposed CSC-Unet semantic segmentation method by combining convolutional sparse coding with the Unet segmentation model.
3. The Unet semantic segmentation method based on convolutional sparse coding as claimed in claim 1, wherein step S1, the data preprocessing part, can consist of the following steps:
S1.1, preprocessing the data, e.g., normalization, standardization, cropping and data augmentation; this benefits the training of a deep network, accelerates convergence, avoids overfitting and enhances the generalization capability of the model;
S1.2, reading the preprocessed data into the network: the training samples and labels in the data set are read into the convolutional neural network in batches of size batch-size.
4. The Unet semantic segmentation method based on convolutional sparse coding as claimed in claim 1, wherein step S2, the design process of the encoder of the CSC-Unet model, can consist of the following steps:
S2.1, designing a two-layer convolutional sparse coding network to form an ML-CSC module: let the original signal X satisfy a two-layer convolutional sparse model, which can be expressed as X = D1Γ1, Γ1 = D2Γ2, where D1 and D2 are convolutional dictionaries and Γ1, Γ2 are the corresponding sparse representations;
S2.2, solving the ML-CSC problem: finding Γ1 and Γ2 can be seen as a Deep Coding Problem (DCP): solve for the sparsest Γ1 and Γ2 under the conditions ||Y − D1Γ1||2 ≤ ε and Γ1 = D2Γ2, where Y is the original signal X mixed with noise E, i.e. Y = X + E with ||E||2 ≤ ε. Solving the deep coding problem layer by layer with the Layered Basis Pursuit (LBP) algorithm yields Γ̂1 = argmin_Γ1 ½||Y − D1Γ1||2² + λ1||Γ1||1 and Γ̂2 = argmin_Γ2 ½||Γ̂1 − D2Γ2||2² + λ2||Γ2||1, where λ1 and λ2 control the sparsity of each layer;
S2.3, solving the LBP problem: an approximate solution of the LBP problem can be found with the Multi-Layer Iterative Soft-Thresholding Algorithm (ML-ISTA), which iterates Γk ← Tλk(Γk + Dk^T(Γ(k−1) − DkΓk)) with Γ0 = Y, where t is the number of iterations and Tλ is the soft-thresholding operator; if the representation coefficients are further assumed to be non-negative, the approximate solution can be written as Γk = ReLU(Wk ∗ Γ(k−1) + bk), where Wk = Dk^T is a convolution operation; when t = 0, i.e. no iteration is performed, the ML-CSC module is equivalent to two convolution operations with convolution coefficients W1 and W2; when the iteration number t = 1, one refinement step is added, and the number of learnable parameters is not increased in the iterative process;
S2.4, combining the ML-CSC module with the encoding end of a traditional Unet segmentation network.
5. The Unet semantic segmentation method based on convolutional sparse coding as claimed in claim 1, wherein step S3, the design process of the decoder of the CSC-Unet model and the skip connections, can consist of the following steps:
S3.1, combining the ML-CSC module with the decoding end of a traditional Unet segmentation network;
S3.2, using skip connections to combine the global information with the position information, so as to generate accurate and fine segmentation;
S3.3, selecting the negative log-likelihood loss (NLLLoss) as the loss function, applying a log-softmax activation to the network output, and selecting Adam, an adaptive-learning-rate optimization method, as the optimizer.
6. The Unet semantic segmentation method based on convolutional sparse coding as claimed in claim 1, wherein: in S4, the result produced by the CSC-Unet model is post-processed to obtain a visualized, accurate semantic segmentation map; the process can consist of the following steps:
S4.1, comparing the model output with the ground-truth labels to compute a confusion matrix, from which index measurements such as the mean intersection over union (mIoU), pixel accuracy and mean pixel accuracy are derived to evaluate the network performance;
S4.2, saving the prediction results of the model as pictures, so that the accuracy of the segmentation can be assessed visually.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011445030.5A CN112580645B (en) | 2020-12-08 | 2020-12-08 | Unet semantic segmentation method based on convolution sparse coding |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011445030.5A CN112580645B (en) | 2020-12-08 | 2020-12-08 | Unet semantic segmentation method based on convolution sparse coding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112580645A true CN112580645A (en) | 2021-03-30 |
CN112580645B CN112580645B (en) | 2024-05-03 |
Family
ID=75130881
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011445030.5A Active CN112580645B (en) | 2020-12-08 | 2020-12-08 | Unet semantic segmentation method based on convolution sparse coding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112580645B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113344811A (en) * | 2021-05-31 | 2021-09-03 | 西南大学 | Multilayer convolution sparse coding weighted recursive denoising deep neural network and method |
CN113409322A (en) * | 2021-06-18 | 2021-09-17 | 中国石油大学(华东) | Deep learning training sample enhancement method for semantic segmentation of remote sensing image |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109670392A (en) * | 2018-09-04 | 2019-04-23 | 中国人民解放军陆军工程大学 | Road image semantic segmentation method based on hybrid automatic encoder |
CN111210435A (en) * | 2019-12-24 | 2020-05-29 | 重庆邮电大学 | Image semantic segmentation method based on local and global feature enhancement module |
WO2020215236A1 (en) * | 2019-04-24 | 2020-10-29 | 哈尔滨工业大学(深圳) | Image semantic segmentation method and system |
CN111951249A (en) * | 2020-08-13 | 2020-11-17 | 浙江理工大学 | Mobile phone light guide plate defect visual detection method based on multitask learning network |
Non-Patent Citations (1)
Title |
---|
Guo Aixin; Yin Baoqun; Li Yun: "Small-scale pedestrian detection based on deep convolutional neural networks", Information Technology and Network Security, no. 07 *
Also Published As
Publication number | Publication date |
---|---|
CN112580645B (en) | 2024-05-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||