CN113920317A - Semantic segmentation method based on visible light image and low-resolution depth image - Google Patents
Semantic segmentation method based on visible light image and low-resolution depth image
- Publication number
- CN113920317A (application CN202111369121.XA)
- Authority
- CN
- China
- Prior art keywords
- resolution
- image
- semantic segmentation
- visible light
- depth image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention provides a semantic segmentation method based on a visible light image and a low-resolution depth image. Based on multi-task learning, a super-resolution module network uses the high-resolution visible light image to process the low-resolution depth image into a high-resolution depth image, and a semantic segmentation module network then produces the semantic segmentation result. The method thereby exploits the high-resolution visible light information obtainable in practice, guarantees the resolution and quality of the semantic segmentation, solves the resolution-alignment problem, and broadens the application of semantic segmentation in real life.
Description
Technical Field
The invention belongs to the technical field of computer vision and image processing, and particularly relates to a semantic segmentation method based on a visible light image and a low-resolution depth image.
Background
Visible light-Depth (RGB-Depth, abbreviated RGB-D) semantic segmentation segments scene objects and regions using the visual appearance information of the scene together with the scene distance information acquired by a depth sensor. As an underlying core technology, RGB-D semantic segmentation has wide application, for example in indoor navigation, automatic driving, and machine vision.
With the development of deep learning, RGB-D semantic segmentation technology has improved markedly and is widely applied to real visual tasks. S. Gupta et al., in "S. Gupta, R. Girshick, P. Arbeláez, and J. Malik. Learning Rich Features from RGB-D Images for Object Detection and Segmentation. In European Conference on Computer Vision, 2014, pp. 345-360", propose a two-stream encoder-decoder network containing two encoder branches that extract visible light features and depth features, respectively, to predict the segmentation result. Most subsequent RGB-D semantic segmentation methods adopt such a two-stream network as the basis for extracting, fusing, and up-sampling multi-modal features. Chen et al., in "X. Chen, K. Y. Lin, J. Wang, W. Wu, C. Qian, H. Li, and G. Zeng. Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation. In European Conference on Computer Vision, 2020", propose gate-based bi-directional cross-modal feature propagation, improving the way the features of the two modalities are fused.
Although the above methods work well, they rely on large numbers of visible light and depth image pairs of the same resolution as training data. In reality, however, due to depth-sensor principles and hardware limitations, the resolution of depth images is often low, whereas visible light cameras have developed rapidly and acquire high-resolution images. Consequently, current research mostly matches the depth-image resolution by acquiring visible light images at a lower resolution. On the one hand, this fails to fully exploit the high-resolution visible light information; on the other hand, it limits the generalization ability of RGB-D semantic segmentation models and their ability to solve practical problems in real life.
Disclosure of Invention
In order to overcome the defect that existing RGB-D semantic segmentation methods cannot process RGB-D image pairs with non-aligned resolutions, the invention provides a semantic segmentation method based on a visible light image and a low-resolution depth image. Based on multi-task learning, a super-resolution module network processes the high-resolution visible light image and the low-resolution depth image to obtain a high-resolution depth image, and a semantic segmentation module network then produces the semantic segmentation result. The method thereby exploits the high-resolution visible light information obtainable in practice, guarantees the resolution and quality of the semantic segmentation, solves the resolution-alignment problem, and broadens the application of semantic segmentation in real life.
A semantic segmentation method based on a visible light image and a low-resolution depth image is characterized by comprising the following steps:
step 1: the method comprises the steps of conducting down-sampling on a depth image in an input RGB-D data set to obtain a depth image with low resolution; training a super-resolution module network by taking a visible light image and a low-resolution depth image in an RGB-D data set as input data and taking the depth image as supervision information to obtain a trained network;
the RGB-D data set is a public visible light and depth image data set;
the super-resolution module network comprises two parallel encoders, a fusion module and a decoder, wherein the encoders perform feature extraction on input images to obtain corresponding feature images; the fusion module performs addition fusion on the feature images output by the two encoders to obtain a fused feature image; the decoder performs up-sampling processing on the fused image to obtain a depth image with the same resolution as the input visible light image;
step 2: the visible light image in the RGB-D data set and the depth image obtained by the super-resolution module are used as input data, the middle two-layer parameters of the encoder of the trained super-resolution module network are used as supervision information of the middle two-layer parameters of the encoder of the semantic segmentation module network, and the semantic segmentation module network is trained to obtain a trained network;
the semantic segmentation module network comprises two parallel encoders, a fusion module and a decoder, wherein the encoders perform feature extraction on input images to obtain corresponding feature images; the fusion module performs addition fusion on the feature images output by the two encoders to obtain a fused feature image; the decoder performs upsampling processing on the fused image to obtain an image which is a semantic segmentation result image;
and step 3: and inputting the RGB-D data obtained by real acquisition into a trained super-resolution module network, and outputting the RGB-D data which is a semantic segmentation result image through the trained semantic segmentation module network.
Specifically, the loss function $L_{SR}$ of the super-resolution module network is:

$$L_{SR}=\frac{1}{N}\sum_{i=1}^{N}\left\|D_{i}^{h}-f_{sr}\left(I_{rgb}^{i},D_{i}^{l};W_{sr}\right)\right\|_{2}^{2}$$

where $N$ denotes the number of high- and low-resolution depth image pairs, $i$ the index of a pair, $D_{i}^{h}$ the original depth image in the RGB-D data set, $D_{i}^{l}$ the low-resolution depth image obtained by down-sampling, $I_{rgb}^{i}$ the visible light image in the RGB-D data set, $W_{sr}$ the super-resolution module network parameters, and $f_{sr}(\cdot)$ the super-resolution module network processing.
Specifically, the encoders in the super-resolution module network and the semantic segmentation module network are both 4-layer convolutional networks, and the loss function $L_{seg}$ of the semantic segmentation module network is:

$$L_{seg}=L(x,class)+\sum_{i=2}^{3}\left\|W_{sr}^{i}-W_{seg}^{i}\right\|_{2}^{2}$$

where $L(x,class)$ denotes the weighted cross entropy between the predicted segmentation result and the real label, $x$ the predicted semantic segmentation result, $class$ the category, $W_{sr}^{i}$ the parameters of the $i$-th encoder layer in the super-resolution module network, and $W_{seg}^{i}$ the parameters of the $i$-th encoder layer in the semantic segmentation module network; the sum runs over the middle two layers of the 4-layer encoders.

The calculation formula of $L(x,class)$ is:

$$L(x,class)=-\sum_{j}weight[class]\cdot\log x[j]$$

where $weight[class]$ denotes the weight of the class, whose value equals the proportion of the number of class pixels in the data set to the total number of pixels; $x[class]$ denotes the class channel of the output feature map; $j$ denotes the pixel position; and $x[j]$ denotes the probability that pixel $j$ is predicted as the class.
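As a concrete illustration of this weighted cross entropy, the pure-Python sketch below derives the class weights from pixel proportions, as the text prescribes, and evaluates the per-pixel loss; the function names and the softmax-normalized probability input are illustrative assumptions, not part of the patent.

```python
import math

def class_weights(label_maps, num_classes):
    """weight[c] = fraction of data-set pixels belonging to class c,
    as stated in the text."""
    counts = [0] * num_classes
    total = 0
    for lab in label_maps:          # each label map is a 2-D grid of class ids
        for row in lab:
            for c in row:
                counts[c] += 1
                total += 1
    return [cnt / total for cnt in counts]

def weighted_cross_entropy(probs, labels, weight):
    """probs: per-pixel class-probability vectors (already normalized);
    labels: ground-truth class index per pixel."""
    loss = 0.0
    for p, c in zip(probs, labels):
        loss += -weight[c] * math.log(p[c])   # -weight[class] * log x[j]
    return loss / len(labels)
```

Applied per image, this matches the formula above up to the (unstated) normalization over pixels.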
The invention has the following beneficial effects: the introduction of the super-resolution subtask makes up the gap between the depth resolution and the visible light resolution obtainable in real life and fully utilizes the high-resolution visible light information available in practice, giving the method greater practical and industrial value; and exploiting the correlation between the super-resolution and semantic segmentation subtasks to assist the optimization of the semantic segmentation architecture improves the accuracy of the semantic segmentation network. The invention is suitable for vehicle-mounted auxiliary systems, automatic driving systems, indoor autonomous navigation systems and the like, and has good practical value.
Drawings
FIG. 1 is a flow chart of the semantic segmentation method based on visible light images and low resolution depth images of the present invention;
FIG. 2 is a comparison graph of depth image results for a super-resolution module network of the present invention;
in the figure: (a) original depth image; (b) depth image output by the super-resolution module network; (c) actual high-resolution depth image; (d) high-resolution visible light image;
FIG. 3 is a comparison image of the segmentation results of the present invention;
in the figure: (a) input depth image; (b) input visible light image; (c) segmentation result image of the invention; (d) real segmentation image.
Detailed Description
The present invention will be further described with reference to the drawings and embodiments; the invention includes, but is not limited to, the following embodiments.
As shown in fig. 1, the present invention provides a semantic segmentation method based on a visible light image and a low resolution depth image, which is implemented as follows:
the resolution of the visible light sensor in real life is higher than that of the depth sensor, and in order to make the method applicable to practice, the processing object of the invention is a high-resolution visible light image IrgbAnd corresponding low resolution depth imageHowever, the resolution of the two modal images in the existing large RGB-D data set is consistent, so in order to train the super-resolution subtask, the depth image in the RGB-D data set is firstly down-sampled to obtain the depth image with relatively lower resolution
1. Super resolution subtask
The visible light image $I_{rgb}$ and the low-resolution depth image $D^{l}$ are taken as input for training the super-resolution module network. The network comprises two parallel encoders, a fusion module, and a decoder: the two encoder branches extract features from the high-resolution visible light image and the low-resolution depth image, and the extracted features are fused and passed to the decoder for up-sampling. The high-resolution depth information $D^{h}$ originally in the data set serves as the supervision signal for generating the high-resolution depth image, training the super-resolution module network to minimize:
$$L_{SR}=\frac{1}{N}\sum_{i=1}^{N}\left\|D_{i}^{h}-f_{sr}\left(I_{rgb}^{i},D_{i}^{l};W_{sr}\right)\right\|_{2}^{2}$$

where $N$ denotes the number of high- and low-resolution depth image pairs, $i$ the index of a pair, $D_{i}^{h}$ the original depth image in the $i$-th pair of the RGB-D data set, $D_{i}^{l}$ the low-resolution depth image of the $i$-th pair obtained by down-sampling, $I_{rgb}^{i}$ the visible light image in the $i$-th pair, $W_{sr}$ the super-resolution module network parameters, and $f_{sr}(\cdot)$ the super-resolution module network processing.
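As an illustration, the super-resolution module and one training step under $L_{SR}$ can be sketched in PyTorch as follows; the channel widths, kernel sizes, the 4x resolution gap, and the squared L2 norm are illustrative assumptions, since the patent specifies only 4-layer convolutional encoders, additive fusion, and an upsampling decoder.

```python
import torch
import torch.nn as nn

class SRModule(nn.Module):
    """Two parallel 4-layer encoders, additive fusion, upsampling decoder."""
    def __init__(self):
        super().__init__()
        # Visible-light branch: two stride-2 convolutions bring the RGB
        # features down to the low-resolution depth grid (assumes a 4x gap).
        self.rgb_enc = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        # Depth branch: operates directly on the low-resolution depth map.
        self.depth_enc = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        # Decoder: up-sample fused features back to the RGB resolution and
        # predict a single-channel high-resolution depth map.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=4, mode='bilinear', align_corners=False),
            nn.Conv2d(64, 1, 3, padding=1),
        )

    def forward(self, rgb, depth_lr):
        fused = self.rgb_enc(rgb) + self.depth_enc(depth_lr)  # additive fusion
        return self.decoder(fused)

def sr_training_step(model, optimizer, rgb, depth_lr, depth_hr):
    """One optimization step: mean squared error between the predicted and
    the original high-resolution depth image, i.e. L_SR over the batch."""
    optimizer.zero_grad()
    pred = model(rgb, depth_lr)             # f_sr(I_rgb, D^l; W_sr)
    loss = ((pred - depth_hr) ** 2).mean()  # squared-L2 choice is an assumption
    loss.backward()
    optimizer.step()
    return loss.item()
```

The stride-2 convolutions in the visible light branch bring its features to the low-resolution depth grid so the two feature maps can be added element-wise before decoding.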
2. Semantic segmentation subtasks
The super-resolution subtask thus obtains a resolution-aligned RGB-D image pair from a non-aligned one. The predicted aligned RGB-D image pair is input to the semantic segmentation module network: two encoders, each comprising K convolutional layers (K being the number of network layers, adjustable to the actual situation), extract features from the visible light image and from the depth image predicted by the super-resolution module network. The extracted features are fused, passed to the decoder, and up-sampled to finally obtain the segmentation result.
For training the semantic segmentation module network and optimizing its loss function, the invention combines two parts: one is the cross entropy loss between the predicted segmentation result and the real label in the data set; the other is a coupling constraint between the super-resolution module network and the semantic segmentation module network. That is, after the super-resolution module network is trained, the parameters of its middle two encoder layers are taken out and used as supervision information for the middle two encoder layers of the semantic segmentation module network, assisting their training. The semantic segmentation module network parameters are therefore optimized with the following objective function:

$$L_{seg}=L(x,class)+\sum_{i=2}^{3}\left\|W_{sr}^{i}-W_{seg}^{i}\right\|_{2}^{2},\qquad L(x,class)=-\sum_{j}weight[class]\cdot\log x[j]$$

where $weight[class]$ denotes the weight of the class, whose value equals the proportion of the number of class pixels in the data set to the total number of pixels; $x[class]$ denotes the class channel of the output feature map; $j$ denotes the pixel position; $x[j]$ denotes the probability, ranging from 0 to 1, that pixel $j$ is predicted as the class; and $W_{sr}^{i}$ and $W_{seg}^{i}$ denote the parameters of the $i$-th encoder layer of the super-resolution and the semantic segmentation module network, respectively.
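A sketch of this two-part objective in PyTorch follows; treating layers 1 and 2 (zero-indexed) as "the middle two" of the 4-layer encoders and the balance coefficient `lam` are assumptions, since the patent does not state how the two terms are weighted.

```python
import torch
import torch.nn.functional as F

def seg_objective(logits, target, class_weight, seg_layers, sr_layers, lam=1e-3):
    """Weighted cross entropy plus an L2 penalty tying the middle two encoder
    layers of the segmentation network to the corresponding (frozen) layers
    of the trained super-resolution network."""
    ce = F.cross_entropy(logits, target, weight=class_weight)
    coupling = torch.zeros(())
    for i in (1, 2):  # assumed middle two of four (zero-indexed) encoder layers
        for p_seg, p_sr in zip(seg_layers[i].parameters(),
                               sr_layers[i].parameters()):
            # detach: the super-resolution parameters act as fixed supervision
            coupling = coupling + ((p_seg - p_sr.detach()) ** 2).sum()
    return ce + lam * coupling
```

Here `seg_layers` and `sr_layers` are the per-layer module lists of the two encoders; only the segmentation parameters receive gradients from the coupling term.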
3. Model application
After the model parameters are optimized and learned, the highest-resolution visible light image collectable by the visible light sensor in a real scene and the highest-resolution depth image collectable by the depth sensor are input to the super-resolution module network, yielding a depth image aligned upward to the resolution of the visible light image. The predicted depth image and the visible light image collected by the visible light sensor are then input to the semantic segmentation module network to predict the segmentation result.
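The application step amounts to composing the two trained networks; `sr_net` and `seg_net` below are placeholders for the trained super-resolution and segmentation modules described above.

```python
import torch

@torch.no_grad()
def segment(rgb, depth_lr, sr_net, seg_net):
    """Run the full pipeline on one batch of real acquisitions."""
    depth_hr = sr_net(rgb, depth_lr)   # align depth up to the RGB resolution
    logits = seg_net(rgb, depth_hr)    # (N, num_classes, H, W) class scores
    return logits.argmax(dim=1)        # per-pixel class labels, (N, H, W)
```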
In order to verify the effectiveness of the method, a simulation experiment was carried out using PyTorch on a Linux system with an Intel(R) Core(TM) i7-6800K CPU @ 3.40 GHz and 60 GB of memory. The data used in the experiment was the public SUN RGB-D data set, currently the largest RGB-D semantic segmentation data set, containing 10335 annotated RGB-D images over 40 classes, of which 5285 pairs were used for training and 5050 pairs for testing. It was captured by four different sensors: Kinect V1, Kinect V2, Xtion, and RealSense.
To demonstrate the effectiveness of the algorithm, RedNet, ACNet, PAP, and FSFNet were chosen as comparison methods. RedNet is proposed in "J. Jiang, L. Zheng, F. Luo, and Z. Zhang. RedNet: Residual Encoder-Decoder Network for Indoor RGB-D Semantic Segmentation. Eprint Arxiv, 2018"; PAP in "Z. Zhang, Z. Cui, C. Xu, Y. Yan, N. Sebe, and J. Yang. Pattern-Affinitive Propagation across Depth, Surface Normal and Semantic Segmentation. In IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 4106-4115"; FSFNet in "Y. Su, Y. Yuan, and Z. Jiang. Deep Feature Selection-and-Fusion for RGB-D Semantic Segmentation. In International Conference on Multimedia & Expo, 2021"; and ACNet in "X. Hu, K. Yang, L. Fei, and K. Wang. ACNet: Attention Based Network to Exploit Complementary Features for RGBD Semantic Segmentation. In Proc. IEEE International Conference on Image Processing, 2019, pp. 1440-1444". Two indices, mean intersection-over-union (mIoU) and pixel accuracy (Pixel Acc), were calculated to evaluate RGB-D semantic segmentation quality; larger values indicate better segmentation. The results are shown in Table 1; both indices of the present invention outperform the other methods.
TABLE 1
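The two reported indices can be computed for flattened label arrays as in the following pure-Python sketch; averaging the IoU only over classes present in either prediction or ground truth is one common convention and an assumption here.

```python
def pixel_acc(pred, gt):
    """Fraction of pixels whose predicted class matches the ground truth."""
    correct = sum(p == g for p, g in zip(pred, gt))
    return correct / len(gt)

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union over the classes that occur."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        union = sum(1 for p, g in zip(pred, gt) if p == c or g == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)
```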
Fig. 2 shows an input low-resolution depth image, the super-resolution depth image predicted by the super-resolution module network of the present invention, and the actual high-resolution depth image and visible light image. It can be seen that the super-resolution subtask of the present invention can super-resolve non-aligned-resolution RGB-D image data into aligned-resolution RGB-D data, laying the foundation for the subsequent semantic segmentation subtask.
Fig. 3 shows the semantic segmentation result image obtained by the method of the present invention, together with the real segmentation image, when a low-resolution depth image and a visible light image are input. It can be seen that the invention achieves good semantic segmentation performance even when the resolution of the input depth image is lower than that of the visible light image.
Claims (3)
1. A semantic segmentation method based on a visible light image and a low-resolution depth image is characterized by comprising the following steps:
step 1: the method comprises the steps of conducting down-sampling on a depth image in an input RGB-D data set to obtain a depth image with low resolution; training a super-resolution module network by taking a visible light image and a low-resolution depth image in an RGB-D data set as input data and taking the depth image as supervision information to obtain a trained network;
the RGB-D data set is a public visible light and depth image data set;
the super-resolution module network comprises two parallel encoders, a fusion module and a decoder, wherein the encoders perform feature extraction on input images to obtain corresponding feature images; the fusion module performs addition fusion on the feature images output by the two encoders to obtain a fused feature image; the decoder performs up-sampling processing on the fused image to obtain a depth image with the same resolution as the input visible light image;
step 2: the visible light image in the RGB-D data set and the depth image obtained by the super-resolution module are used as input data, the middle two-layer parameters of the encoder of the trained super-resolution module network are used as supervision information of the middle two-layer parameters of the encoder of the semantic segmentation module network, and the semantic segmentation module network is trained to obtain a trained network;
the semantic segmentation module network comprises two parallel encoders, a fusion module and a decoder, wherein the encoders perform feature extraction on input images to obtain corresponding feature images; the fusion module performs addition fusion on the feature images output by the two encoders to obtain a fused feature image; the decoder performs upsampling processing on the fused image to obtain an image which is a semantic segmentation result image;
and step 3: and inputting the RGB-D data obtained by real acquisition into a trained super-resolution module network, and outputting the RGB-D data which is a semantic segmentation result image through the trained semantic segmentation module network.
2. The semantic segmentation method based on a visible light image and a low-resolution depth image according to claim 1, characterized in that the loss function $L_{SR}$ of the super-resolution module network is:

$$L_{SR}=\frac{1}{N}\sum_{i=1}^{N}\left\|D_{i}^{h}-f_{sr}\left(I_{rgb}^{i},D_{i}^{l};W_{sr}\right)\right\|_{2}^{2}$$

where $N$ denotes the number of high- and low-resolution depth image pairs, $i$ the index of a pair, $D_{i}^{h}$ the original depth image in the RGB-D data set, $D_{i}^{l}$ the low-resolution depth image obtained by down-sampling, $I_{rgb}^{i}$ the visible light image in the RGB-D data set, $W_{sr}$ the super-resolution module network parameters, and $f_{sr}(\cdot)$ the super-resolution module network processing.
3. The semantic segmentation method based on a visible light image and a low-resolution depth image according to claim 1 or 2, characterized in that the encoders in the super-resolution module network and the semantic segmentation module network are both 4-layer convolutional networks, and the loss function $L_{seg}$ of the semantic segmentation module network is:

$$L_{seg}=L(x,class)+\sum_{i=2}^{3}\left\|W_{sr}^{i}-W_{seg}^{i}\right\|_{2}^{2}$$

where $L(x,class)$ denotes the weighted cross entropy between the predicted segmentation result and the real label, $x$ the predicted semantic segmentation result, $class$ the category, $W_{sr}^{i}$ the parameters of the $i$-th encoder layer in the super-resolution module network, and $W_{seg}^{i}$ the parameters of the $i$-th encoder layer in the semantic segmentation module network;

the calculation formula of $L(x,class)$ is:

$$L(x,class)=-\sum_{j}weight[class]\cdot\log x[j]$$

where $weight[class]$ denotes the weight of the class, whose value equals the proportion of the number of class pixels in the data set to the total number of pixels; $x[class]$ denotes the class channel of the output feature map; $j$ denotes the pixel position; and $x[j]$ denotes the probability that pixel $j$ is predicted as the class.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111369121.XA CN113920317B (en) | 2021-11-15 | 2021-11-15 | Semantic segmentation method based on visible light image and low-resolution depth image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113920317A true CN113920317A (en) | 2022-01-11 |
CN113920317B CN113920317B (en) | 2024-02-27 |
Family
ID=79247396
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111369121.XA Active CN113920317B (en) | 2021-11-15 | 2021-11-15 | Semantic segmentation method based on visible light image and low-resolution depth image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113920317B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115908531A (en) * | 2023-03-09 | 2023-04-04 | 深圳市灵明光子科技有限公司 | Vehicle-mounted distance measuring method and device, vehicle-mounted terminal and readable storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112634296A (en) * | 2020-10-12 | 2021-04-09 | 深圳大学 | RGB-D image semantic segmentation method and terminal for guiding edge information distillation through door mechanism |
CN112861911A (en) * | 2021-01-10 | 2021-05-28 | 西北工业大学 | RGB-D semantic segmentation method based on depth feature selection fusion |
US20210174513A1 (en) * | 2019-12-09 | 2021-06-10 | Naver Corporation | Method and apparatus for semantic segmentation and depth completion using a convolutional neural network |
Non-Patent Citations (1)
Title |
---|
Wang Ziyu; Zhang Yingmin; Chen Yongbin; Wang Guitang: "Optimization of indoor scene semantic segmentation networks based on RGB-D images", Automation & Information Engineering, no. 02, 15 April 2020 (2020-04-15) *
Also Published As
Publication number | Publication date |
---|---|
CN113920317B (en) | 2024-02-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112581409B (en) | Image defogging method based on end-to-end multiple information distillation network | |
CN115713679A (en) | Target detection method based on multi-source information fusion, thermal infrared and three-dimensional depth map | |
CN111354030B (en) | Method for generating unsupervised monocular image depth map embedded into SENet unit | |
CN110781850A (en) | Semantic segmentation system and method for road recognition, and computer storage medium | |
Choi et al. | Attention-based multimodal image feature fusion module for transmission line detection | |
CN112651423A (en) | Intelligent vision system | |
CN111882620A (en) | Road drivable area segmentation method based on multi-scale information | |
CN115131281A (en) | Method, device and equipment for training change detection model and detecting image change | |
CN114120272A (en) | Multi-supervision intelligent lane line semantic segmentation method fusing edge detection | |
CN115293992B (en) | Polarization image defogging method and device based on unsupervised weight depth model | |
CN113554032A (en) | Remote sensing image segmentation method based on multi-path parallel network of high perception | |
Wang et al. | TF-SOD: a novel transformer framework for salient object detection | |
CN116385761A (en) | 3D target detection method integrating RGB and infrared information | |
Hong et al. | USOD10K: a new benchmark dataset for underwater salient object detection | |
CN115527096A (en) | Small target detection method based on improved YOLOv5 | |
CN114463340B (en) | Agile remote sensing image semantic segmentation method guided by edge information | |
CN116503709A (en) | Vehicle detection method based on improved YOLOv5 in haze weather | |
Liu et al. | Road segmentation with image-LiDAR data fusion in deep neural network | |
Wang et al. | Global perception-based robust parking space detection using a low-cost camera | |
CN115035172A (en) | Depth estimation method and system based on confidence degree grading and inter-stage fusion enhancement | |
CN113920317B (en) | Semantic segmentation method based on visible light image and low-resolution depth image | |
CN116563553B (en) | Unmanned aerial vehicle image segmentation method and system based on deep learning | |
CN116258756B (en) | Self-supervision monocular depth estimation method and system | |
CN112861911A (en) | RGB-D semantic segmentation method based on depth feature selection fusion | |
CN117079237A (en) | Self-supervision monocular vehicle distance detection method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||