CN113920317A - Semantic segmentation method based on visible light image and low-resolution depth image - Google Patents

Semantic segmentation method based on visible light image and low-resolution depth image

Info

Publication number
CN113920317A
CN113920317A (application CN202111369121.XA)
Authority
CN
China
Prior art keywords
resolution
image
semantic segmentation
visible light
depth image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111369121.XA
Other languages
Chinese (zh)
Other versions
CN113920317B (en)
Inventor
袁媛
苏月皎
姜志宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202111369121.XA priority Critical patent/CN113920317B/en
Publication of CN113920317A publication Critical patent/CN113920317A/en
Application granted granted Critical
Publication of CN113920317B publication Critical patent/CN113920317B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a semantic segmentation method based on a visible light image and a low-resolution depth image. Within a multi-task learning framework, a super-resolution module network uses the high-resolution visible light image to upsample the low-resolution depth image into a high-resolution depth image, and a semantic segmentation module network then produces the segmentation result. The method thus exploits the high-resolution visible light information that is actually available in practice, preserves the resolution and quality of the segmentation, resolves the resolution-alignment problem, and broadens the applicability of semantic segmentation in real life.

Description

Semantic segmentation method based on visible light image and low-resolution depth image
Technical Field
The invention belongs to the technical field of computer vision and image processing, and particularly relates to a semantic segmentation method based on a visible light image and a low-resolution depth image.
Background
Visible light-depth (RGB-Depth, abbreviated RGB-D) semantic segmentation partitions scene objects and regions using the visual appearance information of the scene together with the scene distance information acquired by a depth sensor. As an underlying core technology, RGB-D semantic segmentation has wide applications such as indoor navigation, autonomous driving, and machine vision.
With the development of deep learning methods, RGB-D semantic segmentation has improved markedly and is widely applied to practical vision tasks. S. Gupta et al., in "S. Gupta, R. Girshick, P. Arbeláez, and J. Malik. Learning Rich Features from RGB-D Images for Object Detection and Segmentation. In European Conference on Computer Vision, 2014, pp. 345-360", propose a two-stream encoder-decoder network containing two encoder branches that extract visible light features and depth features respectively to predict the segmentation result. Most subsequent RGB-D semantic segmentation methods adopt such a two-stream network as the basis for extracting, fusing, and upsampling multi-modal features. Chen et al., in "X. Chen, K. Y. Lin, J. Wang, W. Wu, C. Qian, H. Li, and G. Zeng. Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation. In European Conference on Computer Vision, 2020", propose gate-based bi-directional cross-modal feature propagation, improving the way features of the two modalities are fused.
Although the above methods work well, they rely on a large number of visible light and depth image pairs of the same resolution as training data. In reality, however, the resolution of depth images is often low owing to depth sensor principles and hardware limitations, while visible light cameras have advanced rapidly and acquire high-resolution images. Consequently, most current research matches the depth image resolution by acquiring lower-resolution visible light images. This fails to exploit the available high-resolution visible light information, and it limits both the generalization ability of RGB-D semantic segmentation models and their ability to solve practical problems in real life.
Disclosure of Invention
To overcome the inability of existing RGB-D semantic segmentation methods to process RGB-D image pairs with non-aligned resolutions, the invention provides a semantic segmentation method based on a visible light image and a low-resolution depth image. Within a multi-task learning framework, a super-resolution module network processes the high-resolution visible light image and the low-resolution depth image to obtain a high-resolution depth image, and a semantic segmentation module network then produces the segmentation result. The method thus exploits the high-resolution visible light information available in practice, preserves the resolution and quality of the segmentation, resolves the resolution-alignment problem, and broadens the applicability of semantic segmentation in real life.
A semantic segmentation method based on a visible light image and a low-resolution depth image, characterized by comprising the following steps:
Step 1: down-sampling the depth images in the input RGB-D data set to obtain low-resolution depth images; training the super-resolution module network with the visible light images and the low-resolution depth images of the RGB-D data set as input data and the original depth images as supervision information, to obtain a trained network;
the RGB-D data set is a public visible light and depth image data set;
the super-resolution module network comprises two parallel encoders, a fusion module and a decoder; each encoder performs feature extraction on its input image to obtain a corresponding feature image; the fusion module adds the feature images output by the two encoders to obtain a fused feature image; the decoder up-samples the fused feature image to obtain a depth image with the same resolution as the input visible light image;
Step 2: training the semantic segmentation module network with the visible light images of the RGB-D data set and the depth images produced by the super-resolution module as input data, using the parameters of the middle two layers of the encoder of the trained super-resolution module network as supervision information for the parameters of the middle two layers of the encoder of the semantic segmentation module network, to obtain a trained network;
the semantic segmentation module network comprises two parallel encoders, a fusion module and a decoder; each encoder performs feature extraction on its input image to obtain a corresponding feature image; the fusion module adds the feature images output by the two encoders to obtain a fused feature image; the decoder up-samples the fused feature image, and the resulting image is the semantic segmentation result image;
Step 3: inputting the actually acquired RGB-D data into the trained super-resolution module network, and passing the output through the trained semantic segmentation module network; the final output is the semantic segmentation result image.
Specifically, the loss function L_SR of the super-resolution module network is:

$$L_{SR}=\frac{1}{N}\sum_{i=1}^{N}\left\| f_{sr}\left(I_{rgb}^{i},\, I_{d}^{lr,i};\, W_{sr}\right)-I_{d}^{i}\right\|^{2}$$

where N denotes the number of high- and low-resolution depth image pairs, i denotes the index of an image pair, I_d^i denotes the original depth image in the RGB-D data set, I_d^{lr,i} denotes the low-resolution depth image obtained by down-sampling, I_rgb^i denotes the visible light image in the RGB-D data set, W_sr denotes the super-resolution module network parameters, and f_sr(·) denotes the super-resolution module network processing.
Specifically, the encoders in both the super-resolution module network and the semantic segmentation module network are 4-layer convolutional networks, and the loss function L_seg of the semantic segmentation module network is:

$$L_{seg}=L(x,\mathrm{class})+\sum_{i=2}^{3}\left\| W_{seg}^{i}-W_{sr}^{i}\right\|^{2}$$

where L(x, class) denotes the weighted cross entropy between the predicted segmentation result and the ground-truth label, x denotes the predicted semantic segmentation result, class denotes the category, W_sr^i denotes the parameters of the i-th layer of the encoder in the super-resolution module network, and W_seg^i denotes the parameters of the i-th layer of the encoder in the semantic segmentation module network;

the calculation formula of L(x, class) is:

$$L(x,\mathrm{class})=-\sum_{j}\mathrm{weight}[\mathrm{class}]\cdot\log\big(x[\mathrm{class}][j]\big)$$

where weight[class] denotes the weight of the category class, whose value equals the proportion of pixels of that category among all pixels in the data set, x[class] denotes the class channel of the output feature map, j denotes the pixel position, and x[class][j] denotes the predicted probability that pixel j belongs to the category class.
The invention has the following beneficial effects: the introduced super-resolution subtask compensates for the gap between the depth resolution and the visible light resolution obtainable in real life and makes full use of the high-resolution visible light information available in practice, giving the method greater practical and industrial value; exploiting the correlation between the super-resolution subtask and the semantic segmentation subtask to assist and constrain the semantic segmentation architecture improves the accuracy of the semantic segmentation network. The invention is suitable for vehicle-mounted assistance systems, autonomous driving systems, indoor autonomous navigation systems and the like, and therefore has good practical value.
Drawings
FIG. 1 is a flow chart of the semantic segmentation method based on visible light images and low resolution depth images of the present invention;
FIG. 2 is a comparison graph of depth image results for a super-resolution module network of the present invention;
in the figure: (a) an original depth image; (b) the depth image output by the super-resolution module network; (c) the actual high-resolution depth image; (d) the high-resolution visible light image;
FIG. 3 is a comparison image of the segmentation results of the present invention;
in the figure: (a) the input depth image; (b) the input visible light image; (c) the segmentation result image of the invention; (d) the real segmentation image.
Detailed Description
The present invention is further described below with reference to the drawings and embodiments; the invention includes, but is not limited to, the following embodiments.
As shown in fig. 1, the present invention provides a semantic segmentation method based on a visible light image and a low resolution depth image, which is implemented as follows:
the resolution of the visible light sensor in real life is higher than that of the depth sensor, and in order to make the method applicable to practice, the processing object of the invention is a high-resolution visible light image IrgbAnd corresponding low resolution depth image
Figure BDA0003353811950000041
However, the resolution of the two modal images in the existing large RGB-D data set is consistent, so in order to train the super-resolution subtask, the depth image in the RGB-D data set is firstly down-sampled to obtain the depth image with relatively lower resolution
Figure BDA0003353811950000042
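As an illustration of this preprocessing step, the following is a minimal PyTorch sketch; the ×4 scale factor, the bilinear interpolation mode, and the tensor names are assumptions for illustration and are not fixed by the invention:

```python
import torch
import torch.nn.functional as F

def make_low_res_depth(depth_hr: torch.Tensor, scale: int = 4) -> torch.Tensor:
    """Down-sample a high-resolution depth map (B, 1, H, W) to simulate
    the low-resolution depth sensor assumed at training time."""
    h, w = depth_hr.shape[-2:]
    return F.interpolate(depth_hr, size=(h // scale, w // scale),
                         mode="bilinear", align_corners=False)

# Example: a 480x640 depth map becomes 120x160 at scale 4.
depth_lr = make_low_res_depth(torch.rand(1, 1, 480, 640), scale=4)
```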
1. Super resolution subtask
The visible light image I_rgb and the low-resolution depth image I_d^lr are taken as input to train the super-resolution module network. The super-resolution module network comprises two parallel encoders, a fusion module and a decoder; the two encoder branches extract features from the high-resolution visible light image and from the low-resolution depth image respectively, and the extracted features are fused and passed to the decoder for up-sampling. The high-resolution depth images I_d originally in the data set serve as the supervision signal for the generated high-resolution depth images, and the super-resolution module network is trained with:

$$L_{SR}=\frac{1}{N}\sum_{i=1}^{N}\left\| f_{sr}\left(I_{rgb}^{i},\, I_{d}^{lr,i};\, W_{sr}\right)-I_{d}^{i}\right\|^{2}$$

where N denotes the number of high- and low-resolution depth image pairs, i denotes the index of an image pair, I_d^i denotes the original depth image of the i-th pair in the RGB-D data set, I_d^{lr,i} denotes the low-resolution depth image of the i-th pair obtained by down-sampling, I_rgb^i denotes the visible light image of the i-th pair in the RGB-D data set, W_sr denotes the super-resolution module network parameters, and f_sr(·) denotes the super-resolution module network processing.
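The sketch below illustrates one possible realization of this super-resolution module network: two parallel 4-layer convolutional encoders (the layer count follows claim 3), additive feature fusion, and an up-sampling decoder, trained with a pixel-wise reconstruction loss. Channel widths, kernel sizes, the bilinear up-sampling inside the decoder, and the L2 form of the loss are assumptions for illustration, not details fixed by the patent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in, c_out):
    # One encoder stage: 3x3 convolution + BatchNorm + ReLU.
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class Encoder(nn.Module):
    """4-layer convolutional encoder; spatial size is preserved for simplicity."""
    def __init__(self, c_in, widths=(32, 64, 128, 256)):
        super().__init__()
        chans = [c_in, *widths]
        self.layers = nn.ModuleList(conv_block(chans[i], chans[i + 1])
                                    for i in range(4))
    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

class SuperResolutionNet(nn.Module):
    """Two parallel encoders, additive fusion, and a decoder that brings the
    fused features to the resolution of the visible light image."""
    def __init__(self):
        super().__init__()
        self.rgb_encoder = Encoder(c_in=3)
        self.depth_encoder = Encoder(c_in=1)
        self.decoder = nn.Sequential(conv_block(256, 64),
                                     nn.Conv2d(64, 1, 3, padding=1))

    def forward(self, rgb, depth_lr):
        f_rgb = self.rgb_encoder(rgb)
        f_d = self.depth_encoder(depth_lr)
        # Match the low-resolution depth features to the RGB feature size
        # before the addition fusion prescribed by the method.
        f_d = F.interpolate(f_d, size=f_rgb.shape[-2:], mode="bilinear",
                            align_corners=False)
        fused = f_rgb + f_d                        # addition fusion
        out = self.decoder(fused)                  # predict HR depth
        return F.interpolate(out, size=rgb.shape[-2:], mode="bilinear",
                             align_corners=False)

# Training step with the reconstruction loss L_SR (squared error assumed here).
sr_net = SuperResolutionNet()
rgb = torch.rand(2, 3, 480, 640)
depth_lr, depth_hr = torch.rand(2, 1, 120, 160), torch.rand(2, 1, 480, 640)
loss_sr = F.mse_loss(sr_net(rgb, depth_lr), depth_hr)
```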
2. Semantic segmentation subtask
The super-resolution subtask converts an RGB-D image pair with non-aligned resolutions into a resolution-aligned RGB-D image pair. The predicted aligned RGB-D image pair is fed into the semantic segmentation module network, where two encoders comprising K convolutional layers (K is the number of network layers and can be adjusted according to the actual situation) extract features from the visible light image and from the depth image predicted by the super-resolution module network. The extracted features are fused, passed to the decoder, and up-sampled to obtain the final segmentation result.
For training the semantic segmentation module network and optimizing its loss function, the invention combines two parts: one part is the cross-entropy loss between the predicted segmentation result and the ground-truth labels in the data set; the other part is a coupling constraint between the super-resolution module network and the semantic segmentation module network, i.e., after the super-resolution module network has been trained, the parameters of the middle two layers of its encoder are taken out and used as supervision information for the parameters of the middle two layers of the semantic segmentation module network's encoder, assisting the training of those two layers. The semantic segmentation module network parameters are therefore optimized with the following objective function:

$$L_{seg}=L(x,\mathrm{class})+\sum_{i=2}^{3}\left\| W_{seg}^{i}-W_{sr}^{i}\right\|^{2}$$

$$L(x,\mathrm{class})=-\sum_{j}\mathrm{weight}[\mathrm{class}]\cdot\log\big(x[\mathrm{class}][j]\big)$$

where W_sr^i and W_seg^i denote the parameters of the i-th encoder layer of the super-resolution module network and of the semantic segmentation module network respectively; weight[class] denotes the weight of the category class, whose value equals the proportion of pixels of that category among all pixels in the data set; x[class] denotes the class channel of the output feature map; j denotes the pixel position; and x[class][j] denotes the predicted probability (ranging from 0 to 1) that pixel j belongs to the category class.
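A minimal sketch of this two-part objective follows. The choice of layers 2 and 3 as the "middle two layers" of the 4-layer encoders, the squared-error form of the coupling term, and the function and attribute names are assumptions for illustration; the class weights follow the pixel-proportion rule stated above:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def class_weights(label_maps: torch.Tensor, num_classes: int) -> torch.Tensor:
    # weight[class] = proportion of pixels of that class in the data set.
    counts = torch.bincount(label_maps.flatten(), minlength=num_classes).float()
    return counts / counts.sum()

def coupling_loss(seg_encoder: nn.Module, sr_encoder: nn.Module) -> torch.Tensor:
    # Squared difference between the middle two encoder layers (layers 2 and 3
    # of the 4-layer encoders); the trained SR parameters act as supervision.
    loss = 0.0
    for i in (1, 2):  # 0-based indices of layers 2 and 3
        for p_seg, p_sr in zip(seg_encoder.layers[i].parameters(),
                               sr_encoder.layers[i].parameters()):
            loss = loss + F.mse_loss(p_seg, p_sr.detach(), reduction="sum")
    return loss

def segmentation_objective(logits, labels, weights, seg_encoder, sr_encoder):
    # L_seg = weighted cross entropy + encoder parameter coupling constraint.
    ce = F.cross_entropy(logits, labels, weight=weights)
    return ce + coupling_loss(seg_encoder, sr_encoder)
```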
3. Model application
After the model parameters have been learned, the highest-resolution visible light image that the visible light sensor can collect in a real scene and the highest-resolution depth image that the depth sensor can collect are input into the super-resolution module network, yielding a depth image up-aligned to the resolution of the visible light image. The predicted depth image and the visible light image collected by the visible light sensor are then input into the semantic segmentation module network to predict the segmentation result.
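At inference time the two trained networks are simply chained, as in the sketch below; the network objects follow the illustrative sketches above and their names are not part of the patent:

```python
import torch

@torch.no_grad()
def segment(rgb_hr: torch.Tensor, depth_lr: torch.Tensor,
            sr_net: torch.nn.Module, seg_net: torch.nn.Module) -> torch.Tensor:
    """Predict a segmentation map from a high-resolution RGB image and a
    low-resolution depth image by chaining the two module networks."""
    sr_net.eval(); seg_net.eval()
    depth_hr = sr_net(rgb_hr, depth_lr)        # step 1: super-resolve the depth
    logits = seg_net(rgb_hr, depth_hr)         # step 2: semantic segmentation
    return logits.argmax(dim=1)                # per-pixel class labels
```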
To verify the effectiveness of the method, simulation experiments were carried out using PyTorch on a Linux system with an Intel(R) Core(TM) i7-6800K CPU @ 3.40 GHz and 60 GB of memory. The data used in the experiments is the public SUN RGB-D data set, currently the largest RGB-D semantic segmentation data set, containing 10335 annotated RGB-D images across 40 classes, of which 5285 pairs are used for training and 5050 pairs for testing. It was captured by four different sensors: Kinect V1, Kinect V2, Xtion, and RealSense.
To demonstrate the effectiveness of the algorithm, RedNet, ACNet, PAP, and FSFNet were chosen as comparison methods. RedNet is proposed in "J. Jiang, L. Zheng, F. Luo, and Z. Zhang. RedNet: Residual Encoder-Decoder Network for Indoor RGB-D Semantic Segmentation. arXiv preprint, 2018"; PAP is proposed in "Z. Zhang, Z. Cui, C. Xu, Y. Yan, N. Sebe, and J. Yang. Pattern-Affinitive Propagation across Depth, Surface Normal and Semantic Segmentation. In IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 4106-4115"; FSFNet is proposed in "Y. Su, Y. Yuan, and Z. Jiang. Deep Feature Selection-and-Fusion for RGB-D Semantic Segmentation. In International Conference on Multimedia & Expo, 2021"; ACNet is proposed in "X. Hu, K. Yang, L. Fei, and K. Wang. ACNet: Attention Based Network to Exploit Complementary Features for RGBD Semantic Segmentation. In IEEE International Conference on Image Processing, 2019, pp. 1440-1444". Two indexes, mean intersection-over-union (mIoU) and pixel accuracy (Pixel Acc), are computed to evaluate RGB-D semantic segmentation quality; larger values indicate better segmentation. The results are shown in Table 1, where both indexes of the present invention outperform the other methods.
TABLE 1
(Table 1 is provided as an image in the original document; it reports the mIoU and Pixel Acc of RedNet, ACNet, PAP, FSFNet and the proposed method on the SUN RGB-D test set.)
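For reference, the two evaluation indexes can be computed from a per-pixel confusion matrix as in the following sketch; this is not part of the patent and assumes the standard definitions of mIoU and pixel accuracy:

```python
import torch

def confusion_matrix(pred: torch.Tensor, gt: torch.Tensor, num_classes: int) -> torch.Tensor:
    # Accumulate a (num_classes x num_classes) confusion matrix over all pixels.
    idx = gt.flatten() * num_classes + pred.flatten()
    return torch.bincount(idx, minlength=num_classes ** 2).reshape(
        num_classes, num_classes).float()

def miou_and_pixel_acc(conf: torch.Tensor):
    inter = conf.diag()
    union = conf.sum(0) + conf.sum(1) - inter
    valid = union > 0
    miou = (inter[valid] / union[valid]).mean()   # mean IoU over classes present
    pixel_acc = inter.sum() / conf.sum()          # overall pixel accuracy
    return miou.item(), pixel_acc.item()
```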
Fig. 2 shows an input low-resolution depth image, the super-resolution depth image predicted by the super-resolution module network of the present invention, the actual high-resolution depth image, and the visible light image. It can be seen that the super-resolution subtask of the present invention can super-resolve RGB-D image data with non-aligned resolutions into resolution-aligned RGB-D data, laying the foundation for the subsequent semantic segmentation subtask.
Fig. 3 shows the semantic segmentation result image obtained by the method of the present invention, together with the real segmentation image, when a low-resolution depth image and a visible light image are input. It can be seen that the invention achieves good semantic segmentation performance even when the resolution of the input depth image is lower than that of the visible light image.

Claims (3)

1. A semantic segmentation method based on a visible light image and a low-resolution depth image, characterized by comprising the following steps:
Step 1: down-sampling the depth images in the input RGB-D data set to obtain low-resolution depth images; training the super-resolution module network with the visible light images and the low-resolution depth images of the RGB-D data set as input data and the original depth images as supervision information, to obtain a trained network;
the RGB-D data set is a public visible light and depth image data set;
the super-resolution module network comprises two parallel encoders, a fusion module and a decoder; each encoder performs feature extraction on its input image to obtain a corresponding feature image; the fusion module adds the feature images output by the two encoders to obtain a fused feature image; the decoder up-samples the fused feature image to obtain a depth image with the same resolution as the input visible light image;
Step 2: training the semantic segmentation module network with the visible light images of the RGB-D data set and the depth images produced by the super-resolution module as input data, using the parameters of the middle two layers of the encoder of the trained super-resolution module network as supervision information for the parameters of the middle two layers of the encoder of the semantic segmentation module network, to obtain a trained network;
the semantic segmentation module network comprises two parallel encoders, a fusion module and a decoder; each encoder performs feature extraction on its input image to obtain a corresponding feature image; the fusion module adds the feature images output by the two encoders to obtain a fused feature image; the decoder up-samples the fused feature image, and the resulting image is the semantic segmentation result image;
Step 3: inputting the actually acquired RGB-D data into the trained super-resolution module network, and passing the output through the trained semantic segmentation module network; the final output is the semantic segmentation result image.
2. The semantic segmentation method based on a visible light image and a low-resolution depth image according to claim 1, characterized in that the loss function L_SR of the super-resolution module network is:

$$L_{SR}=\frac{1}{N}\sum_{i=1}^{N}\left\| f_{sr}\left(I_{rgb}^{i},\, I_{d}^{lr,i};\, W_{sr}\right)-I_{d}^{i}\right\|^{2}$$

where N denotes the number of high- and low-resolution depth image pairs, i denotes the index of an image pair, I_d^i denotes the original depth image in the RGB-D data set, I_d^{lr,i} denotes the low-resolution depth image obtained by down-sampling, I_rgb^i denotes the visible light image in the RGB-D data set, W_sr denotes the super-resolution module network parameters, and f_sr(·) denotes the super-resolution module network processing.
3. The semantic segmentation method based on a visible light image and a low-resolution depth image according to claim 1 or 2, characterized in that: the encoders in the super-resolution module network and the semantic segmentation module network are both 4-layer convolutional networks, and the loss function L_seg of the semantic segmentation module network is:

$$L_{seg}=L(x,\mathrm{class})+\sum_{i=2}^{3}\left\| W_{seg}^{i}-W_{sr}^{i}\right\|^{2}$$

where L(x, class) denotes the weighted cross entropy between the predicted segmentation result and the ground-truth label, x denotes the predicted semantic segmentation result, class denotes the category, W_sr^i denotes the parameters of the i-th layer of the encoder in the super-resolution module network, and W_seg^i denotes the parameters of the i-th layer of the encoder in the semantic segmentation module network;

the calculation formula of L(x, class) is:

$$L(x,\mathrm{class})=-\sum_{j}\mathrm{weight}[\mathrm{class}]\cdot\log\big(x[\mathrm{class}][j]\big)$$

where weight[class] denotes the weight of the category class, whose value equals the proportion of pixels of that category among all pixels in the data set, x[class] denotes the class channel of the output feature map, j denotes the pixel position, and x[class][j] denotes the predicted probability that pixel j belongs to the category class.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111369121.XA CN113920317B (en) 2021-11-15 2021-11-15 Semantic segmentation method based on visible light image and low-resolution depth image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111369121.XA CN113920317B (en) 2021-11-15 2021-11-15 Semantic segmentation method based on visible light image and low-resolution depth image

Publications (2)

Publication Number Publication Date
CN113920317A true CN113920317A (en) 2022-01-11
CN113920317B CN113920317B (en) 2024-02-27

Family

ID=79247396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111369121.XA Active CN113920317B (en) 2021-11-15 2021-11-15 Semantic segmentation method based on visible light image and low-resolution depth image

Country Status (1)

Country Link
CN (1) CN113920317B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115908531A (en) * 2023-03-09 2023-04-04 深圳市灵明光子科技有限公司 Vehicle-mounted distance measuring method and device, vehicle-mounted terminal and readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634296A (en) * 2020-10-12 2021-04-09 深圳大学 RGB-D image semantic segmentation method and terminal for guiding edge information distillation through door mechanism
CN112861911A (en) * 2021-01-10 2021-05-28 西北工业大学 RGB-D semantic segmentation method based on depth feature selection fusion
US20210174513A1 (en) * 2019-12-09 2021-06-10 Naver Corporation Method and apparatus for semantic segmentation and depth completion using a convolutional neural network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210174513A1 (en) * 2019-12-09 2021-06-10 Naver Corporation Method and apparatus for semantic segmentation and depth completion using a convolutional neural network
CN112634296A (en) * 2020-10-12 2021-04-09 深圳大学 RGB-D image semantic segmentation method and terminal for guiding edge information distillation through door mechanism
CN112861911A (en) * 2021-01-10 2021-05-28 西北工业大学 RGB-D semantic segmentation method based on depth feature selection fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王子羽; 张颖敏; 陈永彬; 王桂棠: "Optimization of indoor scene semantic segmentation network based on RGB-D images" (基于RGB-D图像的室内场景语义分割网络优化), Automation & Information Engineering (自动化与信息工程), no. 02, 15 April 2020 (2020-04-15) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115908531A (en) * 2023-03-09 2023-04-04 深圳市灵明光子科技有限公司 Vehicle-mounted distance measuring method and device, vehicle-mounted terminal and readable storage medium

Also Published As

Publication number Publication date
CN113920317B (en) 2024-02-27

Similar Documents

Publication Publication Date Title
CN112581409B (en) Image defogging method based on end-to-end multiple information distillation network
CN115713679A (en) Target detection method based on multi-source information fusion, thermal infrared and three-dimensional depth map
CN111354030B (en) Method for generating unsupervised monocular image depth map embedded into SENet unit
CN110781850A (en) Semantic segmentation system and method for road recognition, and computer storage medium
Choi et al. Attention-based multimodal image feature fusion module for transmission line detection
CN112651423A (en) Intelligent vision system
CN111882620A (en) Road drivable area segmentation method based on multi-scale information
CN115131281A (en) Method, device and equipment for training change detection model and detecting image change
CN114120272A (en) Multi-supervision intelligent lane line semantic segmentation method fusing edge detection
CN115293992B (en) Polarization image defogging method and device based on unsupervised weight depth model
CN113554032A (en) Remote sensing image segmentation method based on multi-path parallel network of high perception
Wang et al. TF-SOD: a novel transformer framework for salient object detection
CN116385761A (en) 3D target detection method integrating RGB and infrared information
Hong et al. USOD10K: a new benchmark dataset for underwater salient object detection
CN115527096A (en) Small target detection method based on improved YOLOv5
CN114463340B (en) Agile remote sensing image semantic segmentation method guided by edge information
CN116503709A (en) Vehicle detection method based on improved YOLOv5 in haze weather
Liu et al. Road segmentation with image-LiDAR data fusion in deep neural network
Wang et al. Global perception-based robust parking space detection using a low-cost camera
CN115035172A (en) Depth estimation method and system based on confidence degree grading and inter-stage fusion enhancement
CN113920317B (en) Semantic segmentation method based on visible light image and low-resolution depth image
CN116563553B (en) Unmanned aerial vehicle image segmentation method and system based on deep learning
CN116258756B (en) Self-supervision monocular depth estimation method and system
CN112861911A (en) RGB-D semantic segmentation method based on depth feature selection fusion
CN117079237A (en) Self-supervision monocular vehicle distance detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant