CN117953029B - General depth map completion method and device based on depth information propagation - Google Patents

General depth map completion method and device based on depth information propagation

Info

Publication number: CN117953029B (granted); earlier publication: CN117953029A
Application number: CN202410356521.4A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: depth map, map, affinity, dense, depth
Inventors: 樊彬, 朱正宇, 刘红敏, 刘子熠
Applicant/Assignee: University of Science and Technology Beijing (USTB)
Legal status: Active (granted)

Classifications

    • G06T7/50 — Image analysis: depth or shape recovery
    • G06N3/0464 — Neural network architectures: convolutional networks [CNN, ConvNet]
    • G06N3/08 — Neural networks: learning methods
    • G06T2207/20081 — Image analysis/enhancement indexing scheme: training; learning
    • G06T2207/20084 — Image analysis/enhancement indexing scheme: artificial neural networks [ANN]


Abstract

The invention relates to the technical field of image enhancement, and in particular to a general depth map completion method and device based on depth information propagation. The general depth map completion method based on depth information propagation comprises the following steps: acquiring data of a scene with a depth sensor to obtain a sparse depth map; acquiring data of the scene with a color camera to obtain an RGB image; performing depth filling on the sparse depth map with a pre-filling method to obtain a dense depth map; inputting the sparse depth map, the RGB image and the dense depth map into a ResUNet network for feature extraction to obtain affinity maps; and carrying out iterative propagation according to the dense depth map and the affinity maps to obtain a completed depth map. The depth map completion method achieves high completion accuracy and fast inference, and overcomes the insufficient resolution of depth sensors.

Description

General depth map completion method and device based on depth information propagation
Technical Field
The invention relates to the technical field of image enhancement, and in particular to a general depth map completion method and device based on depth information propagation.
Background
With the rapid progress of autonomous driving, robotics, augmented reality and related fields in recent years, the importance of depth maps has become increasingly prominent. Depth maps play a vital auxiliary role in the above tasks, enabling them to achieve very good results. They are typically obtained with commercial depth sensors, such as structured light sensors and ToF (time-of-flight) lidars. However, commercial sensors all have their own limitations. For example, the Velodyne HDL-64E lidar commonly used in the autonomous driving field produces only a sparse, low-resolution depth map, in which the number of valid depth pixels is only about 5% of the number of pixels in the corresponding RGB image. Such sparse depth maps, while sufficient for some basic 3D vision tasks such as obstacle avoidance and moving-object detection, fall short for more complex tasks such as autonomous driving. The sparsity of the scanning results of common commercial radars therefore greatly limits their reliability.
To overcome the limitations of the depth sensor itself, much research has focused on using a given sparse depth map and the corresponding RGB image to obtain a dense depth map, a task called "depth completion". With depth completion, complete depth information can be recovered from sparse depth measurements, improving the reliability and accuracy of the depth map. Depth completion methods initially filled the holes in sparse depth maps with image processing operations (such as erosion and dilation), and later transitioned to feeding the sparse depth map into a convolutional neural network and directly predicting the dense depth map.
However, the prediction of a convolutional neural network may exhibit blurred edges. To address this, the Convolutional Spatial Propagation Network (CSPN) was proposed to refine the predicted depth map, producing sharp depth edges and more accurate depth values. Such methods generally predict affinity coefficients between each pixel and its surrounding points, and then use these coefficients to propagate depth information over multiple iterations to obtain a dense depth map. In such a spatial-propagation-based depth completion framework, the inputs of the network are a sparse depth map and the corresponding RGB image, and the outputs are the initial depth map for the propagation process and the affinity coefficient map used for propagation.
However, this framework is a compromise for coping with the sparsity of depth scans: the network simultaneously bears the tasks of predicting the dense depth map and the affinity coefficients, yet neither task has direct supervision. As a result, the network learns both tasks insufficiently, the generalization ability of the model degrades, and the reliability of the completed depth map is ultimately low. In addition, existing spatial propagation network structures use complex affinity-generation branches and a large number of propagation iterations, so spatial-propagation-based depth completion methods are computationally expensive and ultimately too slow at inference time.
The prior art therefore lacks a depth map completion method that overcomes the insufficient resolution of depth sensors while providing high completion accuracy and fast inference.
Disclosure of Invention
In order to solve the technical problems that the depth information acquired in the prior art is too sparse and inaccurately measured due to the insufficient resolution of depth sensors, and that existing spatial-propagation-based depth completion methods have low accuracy and a large amount of calculation, the embodiments of the invention provide a general depth map completion method and device based on depth information propagation. The technical scheme is as follows:
in one aspect, a general depth map completion method based on depth information propagation is provided, and the method is implemented by general depth map completion equipment, and includes:
Acquiring data of a scene by using a depth sensor to obtain a sparse depth map; acquiring data of a scene by using a color camera to obtain an RGB image;
performing depth filling on the sparse depth map by adopting a pre-filling method to obtain a dense depth map;
inputting the sparse depth map, the RGB image and the dense depth map into a ResUNet network for feature extraction to obtain affinity maps;
and carrying out iterative propagation according to the dense depth map and the affinity maps to obtain a completed depth map.
Wherein the pre-filling method is a convolutional spatial propagation network, a non-local spatial propagation network, a dense spatial propagation network or a full convolutional spatial propagation network.
Wherein the affinity maps comprise a first affinity map, a second affinity map and a third affinity map;
the first affinity map is used for structural information completion; the first affinity map has a size that is one sixteenth of the size of the dense depth map; the dense depth map has a size of m×n, where m is the length of the dense depth map and n is the width of the dense depth map;
the second affinity map is used for detail information completion; the second affinity map has a size that is one quarter of the size of the dense depth map;
the third affinity map is used for detail information completion; the size of the third affinity map is the size of the dense depth map.
Wherein the ResUNet network includes a feature extraction branch and affinity map generation branches;
the feature extraction branch is an encoder-decoder structure; the encoder structure includes 5 convolutional layers; the decoder structure includes 4 deconvolution layers;
the affinity map generation branches comprise a first affinity map generation branch, a second affinity map generation branch and a third affinity map generation branch; each affinity map generation branch includes 2 deconvolution layers and 1 convolution layer.
Optionally, inputting the sparse depth map, the RGB image and the dense depth map into the ResUNet network for feature extraction to obtain the affinity maps includes:
performing feature extraction through the ResUNet network according to the sparse depth map, the RGB image and the dense depth map to obtain a first image feature, a second image feature and a third image feature;
The first image features include a first sparse depth map feature, a first RGB map feature, and a first dense depth map feature; the first image feature has a size that is one sixteenth of the size of the dense depth map;
The second image features include a second sparse depth map feature, a second RGB map feature, and a second dense depth map feature; the second image feature has a size that is one quarter of the size of the dense depth map;
The third image features include a third sparse depth map feature, a third RGB map feature, and a third dense depth map feature; the size of the third image feature is the size of the dense depth map;
Performing a convolution operation according to the first image feature to obtain a first affinity map;
performing a convolution operation according to the second image feature to obtain a second affinity map;
and performing a convolution operation according to the third image feature to obtain a third affinity map.
Optionally, the carrying out iterative propagation according to the dense depth map and the affinity maps to obtain a completed depth map includes:
downsampling the dense depth map to obtain a first dense depth map; iteratively propagating the first dense depth map based on the first affinity map to obtain a first completed depth map;
performing up-sampling processing on the first completed depth map to obtain a second dense depth map; iteratively propagating the second dense depth map based on the second affinity map to obtain a second completed depth map;
performing up-sampling processing on the second completed depth map to obtain a third dense depth map; and iteratively propagating the third dense depth map based on the third affinity map to obtain the completed depth map.
In another aspect, there is provided a general depth map completion apparatus based on depth information propagation, the apparatus being applied to a general depth map completion method based on depth information propagation, the apparatus comprising:
the data acquisition module is used for acquiring data of the scene by using the depth sensor to obtain a sparse depth map; acquiring data of a scene by using a color camera to obtain an RGB image;
the pre-filling module is used for carrying out depth filling on the sparse depth map by adopting a pre-filling method to obtain a dense depth map;
the affinity map generation module is used for inputting the sparse depth map, the RGB image and the dense depth map into a ResUNet network for feature extraction to obtain affinity maps;
and the depth map completion module is used for carrying out iterative propagation according to the dense depth map and the affinity maps to obtain a completed depth map.
Wherein the pre-filling method is a convolutional spatial propagation network, a non-local spatial propagation network, a dense spatial propagation network or a full convolutional spatial propagation network.
Wherein the affinity maps comprise a first affinity map, a second affinity map and a third affinity map;
the first affinity map is used for structural information completion; the first affinity map has a size that is one sixteenth of the size of the dense depth map; the dense depth map has a size of m×n, where m is the length of the dense depth map and n is the width of the dense depth map;
the second affinity map is used for detail information completion; the second affinity map has a size that is one quarter of the size of the dense depth map;
the third affinity map is used for detail information completion; the size of the third affinity map is the size of the dense depth map.
Wherein the ResUNet network includes a feature extraction branch and affinity map generation branches;
the feature extraction branch is an encoder-decoder structure; the encoder structure includes 5 convolutional layers; the decoder structure includes 4 deconvolution layers;
the affinity map generation branches comprise a first affinity map generation branch, a second affinity map generation branch and a third affinity map generation branch; each affinity map generation branch includes 2 deconvolution layers and 1 convolution layer.
Optionally, the affinity map generation module is further configured to:
perform feature extraction through the ResUNet network according to the sparse depth map, the RGB image and the dense depth map to obtain a first image feature, a second image feature and a third image feature;
The first image features include a first sparse depth map feature, a first RGB map feature, and a first dense depth map feature; the first image feature has a size that is one sixteenth of the size of the dense depth map;
The second image features include a second sparse depth map feature, a second RGB map feature, and a second dense depth map feature; the second image feature has a size that is one quarter of the size of the dense depth map;
The third image features include a third sparse depth map feature, a third RGB map feature, and a third dense depth map feature; the size of the third image feature is the size of the dense depth map;
perform a convolution operation according to the first image feature to obtain a first affinity map;
perform a convolution operation according to the second image feature to obtain a second affinity map;
and perform a convolution operation according to the third image feature to obtain a third affinity map.
Optionally, the depth map completion module is further configured to:
Downsampling the dense depth map to obtain a first dense depth map; iteratively propagating the first dense depth map based on the first affinity map to obtain a first completed depth map;
performing up-sampling processing on the first completed depth map to obtain a second dense depth map; iteratively propagating the second dense depth map based on the second affinity map to obtain a second completed depth map;
performing up-sampling processing on the second completed depth map to obtain a third dense depth map; and iteratively propagating the third dense depth map based on the third affinity map to obtain the completed depth map.
In another aspect, a generic depth map completion apparatus is provided, the generic depth map completion apparatus including: a processor; and a memory having stored thereon computer readable instructions which, when executed by the processor, implement any of the general depth map completion methods based on depth information propagation as described above.
In another aspect, a computer-readable storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement any of the above-described general depth map completion methods based on depth information propagation is provided.
The technical scheme provided by the embodiment of the invention has the beneficial effects that at least:
The invention provides a general depth map completion method based on depth information propagation. Because the pre-filling method is replaceable, the method can improve the accuracy of any pre-filling method, and the higher the accuracy of the pre-filling method, the higher the final accuracy. Therefore, even after a new depth completion method appears, it can serve as the pre-filling method, and the method provided by the invention can obtain a more accurate completion result on top of it. The invention also provides a novel affinity map generation method; the affinity maps generated in this way solve the problem of slow inference caused by large propagation ranges. Since the affinities of the three stages are generated from feature maps of the corresponding scales, the feature maps of the three scales correspond respectively to abstract structural information through to concrete detail information. The whole spatial propagation network framework greatly reduces the risk that the trained model generalizes poorly, and the participation of supervision information reduces the difficulty of the learning and training process. The depth map completion method therefore achieves high completion accuracy and fast inference, and overcomes the insufficient resolution of depth sensors.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a general depth map completion method based on depth information propagation provided by an embodiment of the invention;
FIG. 2 is a block diagram of a general depth map completion device based on depth information propagation according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a generic depth map completing apparatus according to an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is described below with reference to the accompanying drawings.
In embodiments of the invention, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, the use of such words is intended to present concepts in a concrete fashion. Furthermore, in embodiments of the present invention, "and/or" may mean both items, or either one of the two.
In the embodiments of the present invention, "image" and "picture" may sometimes be used interchangeably; their meanings are consistent when the distinction is not emphasized. Likewise, "of", "corresponding (relevant)" and "corresponding" are sometimes used interchangeably, and their meanings are consistent when the distinction is not emphasized.
In embodiments of the present invention, a subscript such as W 1 may sometimes be written in non-subscript form such as W1; the meanings are consistent when the distinction is not emphasized.
In order to make the technical problems, technical solutions and advantages to be solved more apparent, the following detailed description will be given with reference to the accompanying drawings and specific embodiments.
The embodiment of the invention provides a general depth map completion method based on depth information propagation, which can be implemented by general depth map completion equipment; the general depth map completion equipment may be a terminal or a server. As shown in the flowchart of fig. 1, the process flow of the method may include the following steps:
s1, acquiring data of a scene by using a depth sensor to obtain a sparse depth map; and acquiring data of the scene by using a color camera to obtain an RGB image.
In a possible implementation, a sparse depth map $D_S$ of size m×n and an RGB image $I$ of the scene are acquired using a depth sensor and a color camera, respectively, where m is the length of the sparse depth map, n is its width, and the subscript S denotes sparsity.
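For illustration only, the acquired inputs can be thought of as arrays of the following kind; the resolution, depth range and 5% valid-pixel ratio below are illustrative values (the ratio follows the background discussion), not part of the invention:

```python
import numpy as np

# Hypothetical example of the acquired data (shapes and values are illustrative).
m, n = 480, 640                                # length and width of the images
rgb = np.zeros((m, n, 3), dtype=np.uint8)      # RGB image from the color camera

# Sparse depth map D_S: only ~5% of pixels carry a valid depth measurement
# (here in meters); the remaining pixels are 0, meaning "no measurement".
depth_s = np.zeros((m, n), dtype=np.float32)
valid = np.random.rand(m, n) < 0.05
depth_s[valid] = np.random.uniform(0.5, 10.0, size=valid.sum())
```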
S2, performing depth filling on the sparse depth map by adopting a pre-filling method to obtain a dense depth map.
In one possible implementation, the Decoupled Hierarchical Convolutional Spatial Propagation Network (DH-CSPN) framework provided by the invention is used to complete the sparse depth scan result. This framework differs greatly from traditional spatial-propagation-based depth completion frameworks such as the Convolutional Spatial Propagation Network (CSPN). The depth completion method used by the framework decouples the subsequent spatial propagation process: the initial value of the propagation process (namely the initial depth map for depth completion) is obtained directly with a pre-filling method, whereas in the original CSPN framework the initial depth map is generated by the network itself.
Under the CSPN framework, neither the initial depth map nor the affinity map has direct supervision, which makes both difficult to learn and is the direct reason why the model learns a result with poor generalization ability. The framework provided by the invention fixes the initial dense depth map of the propagation iteration through a pre-filling method, which reduces the burden on the convolutional neural network and lets it focus on predicting the affinity; the network therefore learns an affinity prediction with stronger generalization ability, and a more accurate depth completion result is finally obtained.
In this step, a dense depth map $D_D$ of the same size (m×n) as the sparse depth map is obtained, where the subscript D denotes dense.
Wherein the pre-filling method is a convolutional spatial propagation network, a non-local spatial propagation network, a dense spatial propagation network or a full convolutional spatial propagation network.
In a possible implementation, the pre-filling method can be any depth completion method: given the completion result of another depth completion method, the completed depth map is fed into the depth completion network framework of the invention, the completion result is refined, and a more accurate depth completion result is finally obtained. The invention can thus serve as a general depth completion method that can be attached to the back end of any depth completion method for refinement post-processing.
The pre-filling method can be as simple as an image-processing dilation operation, or even no operation at all; at the other extreme, the pre-filling method of the framework can be replaced with any of various recently published depth completion methods.
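As a sketch of the simplest kind of pre-filling mentioned above (a grey-scale-dilation-style hole fill), the following is purely illustrative; the neighbourhood size, iteration count and NumPy implementation are assumptions, and any depth completion method could be substituted:

```python
import numpy as np

def dilate_prefill(depth_s, iterations=20, k=3):
    """Naive hole filling: repeatedly copy nearby measured depths into empty
    pixels by taking the max over a k x k neighbourhood (a grey-scale
    dilation). Illustrative only; not the only possible pre-filling choice."""
    d = depth_s.copy()
    pad = k // 2
    for _ in range(iterations):
        padded = np.pad(d, pad, mode='edge')
        # max over the k x k neighbourhood of every pixel
        stacked = np.stack([padded[i:i + d.shape[0], j:j + d.shape[1]]
                            for i in range(k) for j in range(k)], axis=0)
        dilated = stacked.max(axis=0)
        d = np.where(d > 0, d, dilated)   # only fill pixels that are still empty
        if (d > 0).all():
            break
    return d
```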
S3, inputting the sparse depth map, the RGB image and the dense depth map into a ResUNet network for feature extraction to obtain affinity maps.
Wherein the affinity maps comprise a first affinity map, a second affinity map and a third affinity map;
the first affinity map is used for structural information completion; the first affinity map has a size that is one sixteenth of the size of the dense depth map; the dense depth map has a size of m×n, where m is the length of the dense depth map and n is the width of the dense depth map;
the second affinity map is used for detail information completion; the second affinity map has a size that is one quarter of the size of the dense depth map;
the third affinity map is used for detail information completion; the size of the third affinity map is the size of the dense depth map.
In a possible implementation, the sparse depth map $D_S$, the RGB image $I$ and the dense depth map $D_D$ obtained after pre-filling are fed into a backbone network with a residual U-Net (ResUNet) structure; the features of the three images are extracted, and three affinity maps of different sizes are finally generated.
The three affinity maps have spatial sizes of one sixteenth, one quarter, and the full size of the dense depth map, respectively. Here k is the affinity propagation range and is set to 3, so the propagation range of an affinity map is a k×k square. An affinity map gives the correlation coefficients between each pixel of the depth map and its surroundings; in the iterative propagation of the actual depth map, its role is to control the coefficient with which a depth value is propagated in a given direction. The k×k affinity employed by the invention controls, for each point in the depth map, the coefficients with which depth information within the surrounding k×k range is propagated.
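To make the role of the affinity map concrete, a small sketch of one possible data layout is given below. The k²-channel layout and the per-pixel normalization are assumptions made for illustration (they are typical of CSPN-style methods), not a statement of the patented design:

```python
import torch

# Hypothetical layout of an affinity map for a depth map of spatial size H x W:
# one weight per pixel for each of the k*k offsets in its neighbourhood.
B, H, W, k = 1, 60, 80, 3
affinity = torch.randn(B, k * k, H, W)

# Many spatial propagation methods normalize the k*k weights of each pixel so
# that propagation is a stable combination of neighbouring depth values
# (illustrative normalization, not necessarily the one used by the invention).
affinity = affinity.softmax(dim=1)

# affinity[0, :, i, j] now holds the k*k coefficients that control how much
# depth is propagated to pixel (i, j) from each of its k x k neighbours.
```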
Wherein the ResUNet network includes a feature extraction branch and affinity map generation branches;
the feature extraction branch is an encoder-decoder structure; the encoder structure includes 5 convolutional layers; the decoder structure includes 4 deconvolution layers;
the affinity map generation branches comprise a first affinity map generation branch, a second affinity map generation branch and a third affinity map generation branch; each affinity map generation branch includes 2 deconvolution layers and 1 convolution layer.
In one possible implementation, the ResUNet structure is divided into an encoder part and a decoder part. The encoder has 5 convolutional layers; the first 4 convolutional layers (conv2–conv5) are identical to the corresponding front layers of ResUNet, and the last convolutional layer conv6 has a stride of 2 and 512 channels and uses ReLU as the activation function.
The decoder has 4 deconvolution layers. Deconvolution layer dec5 has a stride of 2 and 256 channels; deconvolution layer dec4 has a stride of 2 and 128 channels, and connects the output of deconvolution layer dec5 with the output of convolutional layer conv5; deconvolution layer dec3 has a stride of 2 and 64 channels, and connects the output of convolutional layer conv4 with the output of deconvolution layer dec4; deconvolution layer dec2 has a stride of 2 and 64 channels, and connects the outputs of convolutional layer conv3 and deconvolution layer dec3.
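A minimal sketch of such an encoder–decoder backbone follows. The decoder channel counts are taken from the text; the 3×3 kernel sizes, the encoder channel widths for conv2–conv5, the concatenation-style skip connections, and the use of plain Conv2d blocks instead of residual blocks are all assumptions made for illustration:

```python
import torch
import torch.nn as nn

def conv(cin, cout, stride):
    # 3x3 kernel assumed for illustration
    return nn.Sequential(nn.Conv2d(cin, cout, 3, stride, 1), nn.ReLU(inplace=True))

def deconv(cin, cout, stride):
    return nn.Sequential(
        nn.ConvTranspose2d(cin, cout, 3, stride, padding=1, output_padding=stride - 1),
        nn.ReLU(inplace=True))

class BackboneSketch(nn.Module):
    """Illustrative encoder (conv2..conv6) / decoder (dec5..dec2) skeleton."""
    def __init__(self):
        super().__init__()
        # input: sparse depth (1) + RGB (3) + pre-filled dense depth (1) = 5 channels
        self.conv2 = conv(5,   64, 2)
        self.conv3 = conv(64, 128, 2)
        self.conv4 = conv(128, 256, 2)
        self.conv5 = conv(256, 512, 2)
        self.conv6 = conv(512, 512, 2)
        self.dec5 = deconv(512, 256, 2)
        self.dec4 = deconv(256 + 512, 128, 2)   # dec5 output concatenated with conv5 output
        self.dec3 = deconv(128 + 256, 64, 2)    # conv4 output concatenated with dec4 output
        self.dec2 = deconv(64 + 128, 64, 2)     # conv3 output concatenated with dec3 output

    def forward(self, ds, rgb, dd):
        x = torch.cat([ds, rgb, dd], dim=1)
        c2 = self.conv2(x); c3 = self.conv3(c2); c4 = self.conv4(c3)
        c5 = self.conv5(c4); c6 = self.conv6(c5)
        d5 = self.dec5(c6)
        d4 = self.dec4(torch.cat([d5, c5], dim=1))
        d3 = self.dec3(torch.cat([c4, d4], dim=1))
        d2 = self.dec2(torch.cat([c3, d3], dim=1))
        # encoder and decoder features at several scales, to be consumed by
        # the affinity generation branches
        return c2, c3, c4, d2, d3, d4, d5
```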
Optionally, performing feature extraction through the ResUNet network according to the sparse depth map, the RGB image and the dense depth map to obtain the affinity maps includes:
extracting features through the ResUNet network according to the sparse depth map, the RGB image and the dense depth map to obtain a first image feature, a second image feature and a third image feature;
The first image features include a first sparse depth map feature, a first RGB map feature, and a first dense depth map feature; the first image feature has a size that is one sixteenth of the size of the dense depth map;
the second image features include a second sparse depth map feature, a second RGB map feature, and a second dense depth map feature; the second image feature has a size that is one quarter of the size of the dense depth map;
The third image features include a third sparse depth map feature, a third RGB map feature, and a third dense depth map feature; the size of the third image feature is the size of the dense depth map;
performing a convolution operation according to the first image feature to obtain a first affinity map;
performing a convolution operation according to the second image feature to obtain a second affinity map;
and performing a convolution operation according to the third image feature to obtain a third affinity map.
In a possible embodiment, the affinity map is essentially a mapping that converts the initial depth map into a more refined depth map, and the expressiveness of this mapping naturally affects the accuracy of the final refinement result.
According to the concept of a dynamic affinity range, the propagation ranges of the first and second affinity maps of the invention are both k×k; however, because they act on the 1/16-size and 1/4-size maps, they can aggregate depth information over ranges 16 and 4 times larger, respectively, relative to the original resolution.
The first, second and third affinity maps of the three stages are generated from feature maps of three corresponding scales, which range from abstract structural information to concrete detail information. The first affinity map of DH-CSPN can therefore focus more on structural information in the lower-resolution first stage, while the third affinity map generated in the final stage can focus more on detail information; this design follows the concept of dynamic affinity and weighting.
The network structure that generates the first affinity map includes deconvolution layer gd2-dec1, deconvolution layer gd2-dec0 and convolution layer gd2-conf0. Deconvolution layer gd2-dec1 has a stride of 2 and 128 channels and connects the outputs of convolutional layer conv4 and deconvolution layer dec4; deconvolution layer gd2-dec0 has a stride of 1; convolution layer gd2-conf0 has a stride of 1 and uses Sigmoid as the activation function.
The network structure that generates the second affinity map includes deconvolution layer gd1-dec1, deconvolution layer gd1-dec0 and convolution layer gd1-conf0. Deconvolution layer gd1-dec1 has a stride of 1 and connects the outputs of convolutional layer conv3 and deconvolution layer dec3; deconvolution layer gd1-dec0 has a stride of 1; convolution layer gd1-conf0 has a stride of 1 and uses Sigmoid as the activation function.
The network structure that generates the third affinity map includes deconvolution layer gd0-dec1, connection layer concat2, deconvolution layer gd0-dec0 and convolution layer gd0-conf0. Deconvolution layer gd0-dec1 has a stride of 2 and 64 channels and connects the outputs of convolutional layer conv2 and deconvolution layer dec2; the connection layer concat2 links the outputs of gd0-dec1, dec1 and concat1; deconvolution layer gd0-dec0 has a stride of 1; convolution layer gd0-conf0 has a stride of 1 and uses Sigmoid as the activation function.
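A sketch of one affinity generation branch (two deconvolution layers followed by a convolution layer with Sigmoid, as described above) is given below. The 3×3 kernels and the k² = 9 output channels are assumptions, since the kernel sizes and channel counts of these layers are not reproduced in the text; the concat2 connection of the third branch is also omitted for brevity:

```python
import torch.nn as nn

class AffinityBranchSketch(nn.Module):
    """Illustrative affinity-generation branch: 2 deconvolution layers + 1
    convolution layer with Sigmoid. Kernel sizes (3x3) and the k*k = 9
    output channels are assumptions."""
    def __init__(self, in_channels, mid_channels, up_stride, k=3):
        super().__init__()
        self.dec1 = nn.Sequential(
            nn.ConvTranspose2d(in_channels, mid_channels, 3, stride=up_stride,
                               padding=1, output_padding=up_stride - 1),
            nn.ReLU(inplace=True))
        self.dec0 = nn.Sequential(
            nn.ConvTranspose2d(mid_channels, k * k, 3, stride=1, padding=1),
            nn.ReLU(inplace=True))
        self.conf0 = nn.Sequential(
            nn.Conv2d(k * k, k * k, 3, stride=1, padding=1),
            nn.Sigmoid())                      # affinity weights in (0, 1)

    def forward(self, feat):
        return self.conf0(self.dec0(self.dec1(feat)))
```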
Since the depth map is downsampled at the initial stage of the subsequent iterative propagation process, detail information is lost and only structural information is retained; because the 1/16-size depth map contains only structural information, the first affinity map of the invention focuses on this aspect. Similarly, for the full-size result of the third stage, the third affinity map generated by the method focuses more on detail information.
The three-scale hierarchical design allows the affinity generation and iterative depth-value propagation of the first two stages to be performed at smaller scales, with corresponding amounts of calculation of 1/16 and 1/4 of the original, so the method is more advantageous in terms of computation than the Non-local Spatial Propagation Network (NLSPN) and the Dynamic Spatial Propagation Network (DySPN).
S4, carrying out iterative propagation according to the dense depth map and the affinity maps to obtain a completed depth map.
Optionally, carrying out iterative propagation according to the dense depth map and the affinity maps to obtain a completed depth map includes:
downsampling the dense depth map to obtain a first dense depth map; iteratively propagating the first dense depth map based on the first affinity map to obtain a first completed depth map;
performing up-sampling processing on the first completed depth map to obtain a second dense depth map; iteratively propagating the second dense depth map based on the second affinity map to obtain a second completed depth map;
performing up-sampling processing on the second completed depth map to obtain a third dense depth map; and iteratively propagating the third dense depth map based on the third affinity map to obtain the completed depth map.
In one possible embodiment, after the first affinity map $A^{(1)}$ and the pre-filled dense depth map of original size m×n are obtained, the length and width of the dense depth map are downsampled to 1/4 of the original, giving the initial stage-1 depth map $D^{(1)}_{0}$ of size (m/4)×(n/4). The first affinity map $A^{(1)}$, which has the same spatial size, is used to guide the iterative depth propagation of this depth map. The mathematical expression of this first-stage iterative propagation of depth values is as follows (1):

$$D^{(1)}_{N+1}(i,j)=\sum_{(a,b)} A^{(1)}(i,j,a,b)\,D^{(1)}_{N}(i+a,\,j+b) \qquad (1)$$

where i is the row position of a pixel in the depth map, j is its column position, the subscript N is the iteration index, the multiplication between the affinity weights and the shifted depth values is element-wise (bitwise) multiplication, and (a, b) is the relative position of each point within the k×k propagation range.
Equation (1) shows that the depth value of each pixel of the depth map $D^{(1)}_{N+1}$ after one propagation iteration is obtained as a weighted sum, with the weights of the affinity map $A^{(1)}$, of the points within the k×k range surrounding the corresponding location in the depth map $D^{(1)}_{N}$ before the iteration. In the first stage, this iterative propagation of depth values on the 1/16-size depth map is repeated 6 times, yielding $D^{(1)}_{6}$.
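A sketch of one such propagation iteration (cf. equation (1)) is shown below. It assumes the affinity map is stored with k² channels, one weight per neighbourhood offset, which is an assumption about the data layout rather than a statement of the patented implementation:

```python
import torch
import torch.nn.functional as F

def propagate_once(depth, affinity, k=3):
    """One iteration of depth-value propagation: each output pixel is the
    affinity-weighted sum of the depth values in its k x k neighbourhood.
    depth:    (B, 1, H, W)
    affinity: (B, k*k, H, W), one weight per neighbourhood offset (a, b)."""
    b, _, h, w = depth.shape
    # Gather the k*k shifted copies D(i+a, j+b) for every pixel.
    neighbours = F.unfold(depth, kernel_size=k, padding=k // 2)   # (B, k*k, H*W)
    neighbours = neighbours.view(b, k * k, h, w)
    # Element-wise multiply by the affinity weights and sum over the window.
    return (affinity * neighbours).sum(dim=1, keepdim=True)       # (B, 1, H, W)

def propagate(depth, affinity, iterations=6, k=3):
    """Six iterations are used at each stage in the text."""
    for _ in range(iterations):
        depth = propagate_once(depth, affinity, k)
    return depth
```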
The first-stage propagation result $D^{(1)}_{6}$ is then upsampled so that its length and width become 2 times the original, giving $D^{(2)}_{0}$ of size (m/2)×(n/2). Iterative propagation of depth values is performed according to $D^{(2)}$ and the second affinity map $A^{(2)}$ of the corresponding size; the mathematical expression of this second-stage propagation is as follows (2):

$$D^{(2)}_{N+1}(i,j)=\sum_{(a,b)} A^{(2)}(i,j,a,b)\,D^{(2)}_{N}(i+a,\,j+b) \qquad (2)$$

After 6 iterations, $D^{(2)}_{6}$ is obtained.
The second-stage propagation result $D^{(2)}_{6}$ is then upsampled so that its length and width become 2 times the original, giving $D^{(3)}_{0}$ of size m×n. Iterative propagation of depth values is performed according to $D^{(3)}$ and the third affinity map $A^{(3)}$ of the corresponding size; the mathematical expression of this third-stage propagation is as follows (3):

$$D^{(3)}_{N+1}(i,j)=\sum_{(a,b)} A^{(3)}(i,j,a,b)\,D^{(3)}_{N}(i+a,\,j+b) \qquad (3)$$

After 6 iterations, $D^{(3)}_{6}$ is finally obtained, which is the dense depth map after completion by the method of the invention.
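Tying the three stages together, a sketch of the hierarchical propagation described by equations (1)–(3) might look as follows; the interpolation modes and the propagate helper (sketched after equation (1) above) are illustrative assumptions, and the three affinity maps are assumed to match the spatial sizes of the three stages:

```python
import torch.nn.functional as F

def hierarchical_propagation(dense_init, a1, a2, a3, propagate, iterations=6):
    """Three-stage propagation sketch following equations (1)-(3):
    stage 1 at 1/4 of the original length/width, stage 2 at 1/2, stage 3 at
    full resolution; each stage runs `iterations` propagation steps."""
    # Stage 1: downsample the pre-filled depth map to 1/4 length and width.
    d1 = F.interpolate(dense_init, scale_factor=0.25, mode='nearest')
    d1 = propagate(d1, a1, iterations)

    # Stage 2: upsample x2 and propagate with the second affinity map.
    d2 = F.interpolate(d1, scale_factor=2, mode='bilinear', align_corners=False)
    d2 = propagate(d2, a2, iterations)

    # Stage 3: upsample x2 to full resolution and propagate with the third map.
    d3 = F.interpolate(d2, scale_factor=2, mode='bilinear', align_corners=False)
    return propagate(d3, a3, iterations)
```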
In one possible embodiment, the method of the invention was experimentally verified on the NYU-Depth V2 indoor scene dataset (New York University) and the KITTI depth completion dataset. The evaluation mainly uses the following six metrics, whose mathematical expressions are given in (4):

$$\begin{aligned} \mathrm{RMSE} &= \sqrt{\tfrac{1}{|V|}\sum_{v \in V}\big(\hat{d}_v - d_v\big)^2}, & \mathrm{MAE} &= \tfrac{1}{|V|}\sum_{v \in V}\big|\hat{d}_v - d_v\big|, \\ \mathrm{iRMSE} &= \sqrt{\tfrac{1}{|V|}\sum_{v \in V}\Big(\tfrac{1}{\hat{d}_v} - \tfrac{1}{d_v}\Big)^{2}}, & \mathrm{iMAE} &= \tfrac{1}{|V|}\sum_{v \in V}\Big|\tfrac{1}{\hat{d}_v} - \tfrac{1}{d_v}\Big|, \\ \mathrm{REL} &= \tfrac{1}{|V|}\sum_{v \in V}\frac{\big|\hat{d}_v - d_v\big|}{d_v}, & \delta_t &= \frac{\big|\{v : \max(\hat{d}_v/d_v,\, d_v/\hat{d}_v) < 1.25^{t}\}\big|}{|V|}, \end{aligned} \qquad (4)$$

where $\hat{d}$ denotes the predicted depth map, $d$ the actual depth of the scene, $V$ the set of pixels with valid ground truth, and t in $\delta_t$ takes the values 1, 2 and 3. Among these metrics, the Root Mean Square Error (RMSE), the inverse-depth Root Mean Square Error (iRMSE), the Mean Absolute Error (MAE), the inverse-depth Mean Absolute Error (iMAE) and the Relative Error (REL) all measure error, and smaller is better; $\delta_t$ represents the proportion of pixels within a specified error range, and larger is better.
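A sketch of how these six metrics can be computed over the pixels with valid ground truth is given below; the definitions follow the standard practice in depth estimation and assume strictly positive depths on the valid pixels:

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard depth completion metrics over pixels with valid ground truth.
    pred, gt: arrays of predicted / ground-truth depth (same shape, in meters);
    assumes pred > 0 wherever gt > 0."""
    valid = gt > 0
    p, g = pred[valid], gt[valid]
    rmse = np.sqrt(np.mean((p - g) ** 2))
    mae = np.mean(np.abs(p - g))
    irmse = np.sqrt(np.mean((1.0 / p - 1.0 / g) ** 2))    # inverse-depth RMSE
    imae = np.mean(np.abs(1.0 / p - 1.0 / g))             # inverse-depth MAE
    rel = np.mean(np.abs(p - g) / g)
    ratio = np.maximum(p / g, g / p)
    deltas = {f"delta_{t}": np.mean(ratio < 1.25 ** t) for t in (1, 2, 3)}
    return {"RMSE": rmse, "MAE": mae, "iRMSE": irmse, "iMAE": imae,
            "REL": rel, **deltas}
```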
For the prior-art spatial-propagation-based methods, each method was transformed into the decoupled framework and the accuracy before and after decoupling was compared. The error comparison results on the NYUv2 dataset are shown in Table 1 (error comparison before and after decoupling).
TABLE 1
According to Table 1, the prior-art spatial-propagation-based methods are all improved after decoupling, which verifies the effectiveness of decoupling in improving depth map accuracy.
The suitability of the invention for different pre-filling methods was tested by varying the pre-filling method. Table 2 (error comparison of different pre-filling methods) shows the test results on the NYU dataset with different pre-filling methods.
TABLE 2
The results of Table 2 show that the invention improves the result of every pre-filling method employed, demonstrating its feasibility as a general depth completion framework.
The accuracy and computational cost of several existing spatial-propagation-based methods were also compared under the same ResUNet framework. Table 3 (accuracy and computation comparison of spatial propagation methods) compares the effects of various spatial propagation methods on the NYUv2 dataset.
TABLE 3
The results of Table 3 show that the method of the invention achieves a larger accuracy improvement with a smaller amount of calculation, illustrating the design superiority of the multi-level affinity generation network structure of the invention.
The invention provides a general depth map completion method based on depth information propagation. Because the pre-filling method is replaceable, the method can improve the accuracy of any pre-filling method, and the higher the accuracy of the pre-filling method, the higher the final accuracy. Therefore, even after a new depth completion method appears, it can serve as the pre-filling method, and the method provided by the invention can obtain a more accurate completion result on top of it. The invention also provides a novel affinity map generation method; the affinity maps generated in this way solve the problem of slow inference caused by large propagation ranges. Since the affinities of the three stages are generated from feature maps of the corresponding scales, the feature maps of the three scales correspond respectively to abstract structural information through to concrete detail information. The whole spatial propagation network framework greatly reduces the risk that the trained model generalizes poorly, and the participation of supervision information reduces the difficulty of the learning and training process. The depth map completion method therefore achieves high completion accuracy and fast inference, and overcomes the insufficient resolution of depth sensors.
Fig. 2 is a block diagram of a general depth map completion device based on depth information propagation, for the general depth map completion method based on depth information propagation, according to an exemplary embodiment. Referring to fig. 2, the device includes a data acquisition module 210, a pre-filling module 220, an affinity map generation module 230, and a depth map completion module 240. Wherein:
The data acquisition module 210 is configured to acquire data of a scene by using a depth sensor, and obtain a sparse depth map; acquiring data of a scene by using a color camera to obtain an RGB image;
A pre-filling module 220, configured to perform depth filling on the sparse depth map by using a pre-filling method, so as to obtain a dense depth map;
the affinity map generation module 230 is configured to input the sparse depth map, the RGB image and the dense depth map into the ResUNet network for feature extraction to obtain affinity maps;
the depth map completion module 240 is configured to perform iterative propagation according to the dense depth map and the affinity maps to obtain a completed depth map.
Wherein the pre-filling method is a convolutional spatial propagation network, a non-local spatial propagation network, a dense spatial propagation network or a full convolutional spatial propagation network.
Wherein the affinity maps comprise a first affinity map, a second affinity map and a third affinity map;
the first affinity map is used for structural information completion; the first affinity map has a size that is one sixteenth of the size of the dense depth map; the dense depth map has a size of m×n, where m is the length of the dense depth map and n is the width of the dense depth map;
the second affinity map is used for detail information completion; the second affinity map has a size that is one quarter of the size of the dense depth map;
the third affinity map is used for detail information completion; the size of the third affinity map is the size of the dense depth map.
Wherein the ResUNet network includes a feature extraction branch and affinity map generation branches;
the feature extraction branch is an encoder-decoder structure; the encoder structure includes 5 convolutional layers; the decoder structure includes 4 deconvolution layers;
the affinity map generation branches comprise a first affinity map generation branch, a second affinity map generation branch and a third affinity map generation branch; each affinity map generation branch includes 2 deconvolution layers and 1 convolution layer.
Optionally, the affinity map generation module 230 is further configured to:
extract features through the ResUNet network according to the sparse depth map, the RGB image and the dense depth map to obtain a first image feature, a second image feature and a third image feature;
The first image features include a first sparse depth map feature, a first RGB map feature, and a first dense depth map feature; the first image feature has a size that is one sixteenth of the size of the dense depth map;
the second image features include a second sparse depth map feature, a second RGB map feature, and a second dense depth map feature; the second image feature has a size that is one quarter of the size of the dense depth map;
The third image features include a third sparse depth map feature, a third RGB map feature, and a third dense depth map feature; the size of the third image feature is the size of the dense depth map;
perform a convolution operation according to the first image feature to obtain a first affinity map;
perform a convolution operation according to the second image feature to obtain a second affinity map;
and perform a convolution operation according to the third image feature to obtain a third affinity map.
Optionally, the depth map complement module 240 is further configured to:
downsampling the dense depth map to obtain a first dense depth map; iteratively propagating the first dense depth map based on the first affinity map to obtain a first completed depth map;
performing up-sampling processing on the first completed depth map to obtain a second dense depth map; iteratively propagating the second dense depth map based on the second affinity map to obtain a second completed depth map;
performing up-sampling processing on the second completed depth map to obtain a third dense depth map; and iteratively propagating the third dense depth map based on the third affinity map to obtain the completed depth map.
The invention provides a general depth map completion method based on depth information propagation. Because the pre-filling method is replaceable, the method can improve the accuracy of any pre-filling method, and the higher the accuracy of the pre-filling method, the higher the final accuracy. Therefore, even after a new depth completion method appears, it can serve as the pre-filling method, and the method provided by the invention can obtain a more accurate completion result on top of it. The invention also provides a novel affinity map generation method; the affinity maps generated in this way solve the problem of slow inference caused by large propagation ranges. Since the affinities of the three stages are generated from feature maps of the corresponding scales, the feature maps of the three scales correspond respectively to abstract structural information through to concrete detail information. The whole spatial propagation network framework greatly reduces the risk that the trained model generalizes poorly, and the participation of supervision information reduces the difficulty of the learning and training process. The depth map completion method therefore achieves high completion accuracy and fast inference, and overcomes the insufficient resolution of depth sensors.
Fig. 3 is a schematic structural diagram of a generic depth map completing apparatus according to an embodiment of the present invention, where, as shown in fig. 3, the generic depth map completing apparatus may include a generic depth map completing device based on depth information propagation shown in fig. 2. Optionally, the generic depth map completion device 310 may include a processor 2001.
Optionally, the generic depth map completion device 310 may also include a memory 2002 and a transceiver 2003.
The processor 2001 may be connected to the memory 2002 and the transceiver 2003 via a communication bus, for example.
The following describes the various components of the generic depth map completion apparatus 310 in detail with reference to fig. 3:
The processor 2001 is the control center of the generic depth map completion device 310, and may be one processor or a plurality of processing elements. For example, the processor 2001 is one or more central processing units (CPU), but may also be an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention, such as one or more digital signal processors (DSP) or one or more field programmable gate arrays (FPGA).
Alternatively, the processor 2001 may perform various functions of the generic depth map completion device 310 by running or executing software programs stored in the memory 2002, and invoking data stored in the memory 2002.
In a particular implementation, the processor 2001 may include one or more CPUs, such as CPU0 and CPU1 shown in FIG. 3, as an example.
In a particular implementation, as one embodiment, the generic depth map completion device 310 may also include multiple processors, such as the processor 2001 and processor 2004 shown in FIG. 3. Each of these processors may be a single-core processor (single-CPU) or a multi-core processor (multi-CPU). A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
The memory 2002 is used for storing a software program for executing the solution of the present invention, and is controlled by the processor 2001 to execute the solution, and the specific implementation may refer to the above method embodiment, which is not described herein again.
Alternatively, the memory 2002 may be a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, without limitation. The memory 2002 may be integrated with the processor 2001, or may exist separately and be coupled to the processor 2001 via interface circuitry (not shown in fig. 3) of the generic depth map completion device 310, which is not limited by the embodiments of the invention.
A transceiver 2003 for communicating with a network device or with a terminal device.
Alternatively, transceiver 2003 may include a receiver and a transmitter (not separately shown in fig. 3). The receiver is used for realizing the receiving function, and the transmitter is used for realizing the transmitting function.
Alternatively, transceiver 2003 may be integrated with processor 2001 or may exist separately and be coupled to processor 2001 through interface circuitry (not shown in FIG. 3) of generic depth map completion device 310, as embodiments of the present invention are not limited in this regard.
It should be noted that the structure of the generic depth map completion device 310 shown in fig. 3 does not constitute a limitation; an actual device may include more or fewer components than shown, or combine some components, or have a different arrangement of components.
In addition, the technical effects of the generic depth map completing device 310 may refer to the technical effects of the generic depth map completing method based on depth information propagation described in the above method embodiments, which are not described herein.
It is to be appreciated that the processor 2001 in embodiments of the invention may be a central processing unit (CPU), or may be another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
It should also be appreciated that the memory in embodiments of the present invention may be volatile memory or nonvolatile memory, or may include both. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), or a flash memory. The volatile memory may be random access memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct rambus random access memory (DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware (e.g., circuitry), firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer program are loaded or executed on a computer, the processes or functions described in accordance with embodiments of the present invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless (e.g., infrared, radio, microwave) manner. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that contains one or more sets of available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
It should be understood that the term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, both A and B, or B alone, where A and B may be singular or plural. In addition, the character "/" herein generally indicates that the associated objects are in an "or" relationship, but may also indicate an "and/or" relationship, as can be understood from the context.
In the present invention, "at least one" means one or more, and "a plurality" means two or more. "At least one of" the listed items and similar expressions refer to any combination of those items, including any combination of single items or plural items. For example, "at least one of a, b, or c" may represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c may each be singular or plural.
It should be understood that, in various embodiments of the present invention, the sequence numbers of the foregoing processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, for the specific working processes of the apparatus, device and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another device, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The foregoing is merely a specific embodiment of the present invention, and the present invention is not limited thereto; any variation or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (7)

1. A general depth map completion method based on depth information propagation, the method comprising:
acquiring data of a scene by using a depth sensor to obtain a sparse depth map; acquiring data of the scene by using a color camera to obtain an RGB map;
performing depth filling on the sparse depth map by adopting a pre-filling method to obtain a dense depth map;
inputting the sparse depth map, the RGB map and the dense depth map into a ResUNeT network for feature extraction to obtain an affinity map;
wherein the affinity map comprises a first affinity map, a second affinity map, and a third affinity map;
the first affinity map is used for structural information completion; the first affinity map has a size that is one sixteenth of the size of the dense depth map; the dense depth map has a size of m×n, where m is the length of the dense depth map and n is the width of the dense depth map;
the second affinity map is used for detail information completion; the second affinity map has a size that is one quarter of the size of the dense depth map;
the third affinity map is used for detail information completion; the third affinity map has the same size as the dense depth map;
Wherein the ResUNeT network includes a feature extraction branch and an affinity graph generation branch;
the feature extraction branch is an encoder-decoder structure; the encoder structure includes 5 convolutional layers; the decoder structure includes 4 deconvolution layers;
the affinity map generation branch comprises a first affinity map generation branch, a second affinity map generation branch and a third affinity map generation branch; each affinity map generation branch comprises 2 deconvolution layers and 1 convolution layer;
wherein the inputting the sparse depth map, the RGB map and the dense depth map into the ResUNeT network for feature extraction to obtain an affinity map comprises:
performing feature extraction through the ResUNeT network according to the sparse depth map, the RGB map and the dense depth map to obtain a first image feature, a second image feature and a third image feature;
The first image features include a first sparse depth map feature, a first RGB map feature, and a first dense depth map feature; the first image feature has a size that is one sixteenth of the size of the dense depth map;
The second image features include a second sparse depth map feature, a second RGB map feature, and a second dense depth map feature; the second image feature has a size that is one quarter of the size of the dense depth map;
The third image features include a third sparse depth map feature, a third RGB map feature, and a third dense depth map feature; the size of the third image feature is the size of the dense depth map;
performing a convolution operation on the first image feature to obtain the first affinity map;
performing a convolution operation on the second image feature to obtain the second affinity map;
performing a convolution operation on the third image feature to obtain the third affinity map;
and carrying out iterative propagation according to the dense depth map and the affinity map to obtain a completed depth map.
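The iterative propagation recited in the last step of claim 1 is, in spirit, the affinity-guided spatial propagation commonly used in depth completion: every pixel of the dense depth map is repeatedly replaced by an affinity-weighted combination of itself and its neighbours, i.e. D_{t+1}(p) = Σ_q w(p, q) · D_t(q) over a small neighbourhood of p. The patent does not publish code, so the following NumPy sketch is only an illustration under assumed conventions — a 3×3 neighbourhood, nine affinity channels per pixel, weights normalised to sum to one, and optional re-anchoring on valid sparse measurements. All function and parameter names are hypothetical.

```python
import numpy as np

def propagate_once(depth, affinity, sparse_depth=None, sparse_mask=None):
    """One propagation step: each pixel becomes an affinity-weighted
    average of itself and its 8 neighbours (3x3 neighbourhood, assumed).

    depth:    (H, W) current dense depth estimate
    affinity: (9, H, W) unnormalised affinity weights, one channel per
              neighbour offset (centre pixel included)
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),  (0, 0),  (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    # Normalise affinities so the weights at each pixel sum to 1.
    weights = np.abs(affinity)
    weights = weights / (weights.sum(axis=0, keepdims=True) + 1e-8)

    H, W = depth.shape
    padded = np.pad(depth, 1, mode="edge")
    out = np.zeros_like(depth)
    for k, (dy, dx) in enumerate(offsets):
        shifted = padded[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
        out += weights[k] * shifted

    # Optional: re-anchor pixels that carry a valid sparse measurement.
    if sparse_depth is not None and sparse_mask is not None:
        out = np.where(sparse_mask, sparse_depth, out)
    return out

def iterative_propagation(dense_depth, affinity, steps=6, **kw):
    """Run the propagation step a fixed number of times (step count assumed)."""
    d = dense_depth.copy()
    for _ in range(steps):
        d = propagate_once(d, affinity, **kw)
    return d
```

Holding the measured pixels fixed after every step is one common design choice in propagation-based completion; the claims themselves do not state whether the sparse measurements are re-anchored during propagation.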
2. The depth information propagation-based general depth map completion method according to claim 1, wherein the pre-filling method is a convolution spatial propagation network, a non-local spatial propagation network, a dense spatial propagation network, or a full convolution spatial propagation network.
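Claim 2 names learned spatial propagation networks as the pre-filling method; those require trained weights and are beyond a short example. Purely to illustrate the interface of the pre-filling step in claim 1 — a sparse depth map in, a dense depth map out — the sketch below fills empty pixels by repeatedly averaging valid 3×3 neighbours. It is a hypothetical stand-in, not any of the networks listed in the claim, and it assumes that missing depth is encoded as 0.

```python
import numpy as np

def naive_prefill(sparse_depth, invalid_value=0.0, max_iters=64):
    """Stand-in pre-fill: repeatedly copy the average of valid 3x3
    neighbours into empty pixels until the map is dense (or max_iters
    is reached). A trained spatial propagation network, as recited in
    claim 2, would replace this function; the interface is the same.
    """
    depth = sparse_depth.astype(np.float64).copy()
    valid = depth != invalid_value
    H, W = depth.shape
    for _ in range(max_iters):
        if valid.all():
            break
        padded_d = np.pad(depth, 1, mode="constant")
        padded_v = np.pad(valid, 1, mode="constant")
        acc = np.zeros_like(depth)
        cnt = np.zeros_like(depth)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                win_d = padded_d[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
                win_v = padded_v[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
                acc += win_d * win_v
                cnt += win_v
        newly = (cnt > 0) & ~valid          # empty pixels with >=1 valid neighbour
        depth[newly] = acc[newly] / cnt[newly]
        valid |= newly
    return depth
```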
3. The general depth map completion method based on depth information propagation according to claim 1, wherein the carrying out iterative propagation according to the dense depth map and the affinity map to obtain a completed depth map comprises:
Downsampling the dense depth map to obtain a first dense depth map; iteratively propagating the first dense depth map based on the first affinity map to obtain a first completed depth map;
performing up-sampling processing on the first completed depth map to obtain a second dense depth map; iteratively propagating the second dense depth map based on the second affinity map to obtain a second completed depth map;
performing up-sampling processing on the second completed depth map to obtain a third dense depth map; and iteratively propagating the third dense depth map based on the third affinity map to obtain the completed depth map.
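Claim 3 describes a coarse-to-fine schedule: propagate at the coarsest scale with the first affinity map, upsample, propagate with the second affinity map, upsample again, and propagate with the third affinity map at full resolution. The sketch below wires these stages together. It assumes that "one sixteenth / one quarter of the size" refers to the pixel count (i.e. per-side factors of 4 and 2), uses simple average-pool downsampling and nearest-neighbour upsampling (neither is specified in the claims), and takes the propagation routine as a parameter — for example the `iterative_propagation` sketch given after claim 1. All names are illustrative.

```python
import numpy as np

def downsample(depth, factor):
    """Average-pool by an integer factor (assumes H and W are divisible by it)."""
    H, W = depth.shape
    return depth.reshape(H // factor, factor, W // factor, factor).mean(axis=(1, 3))

def upsample(depth, factor):
    """Nearest-neighbour upsampling by an integer factor."""
    return np.repeat(np.repeat(depth, factor, axis=0), factor, axis=1)

def coarse_to_fine_completion(dense_depth, affinities, propagate, steps=6):
    """dense_depth: (H, W) pre-filled dense depth map.
    affinities:  (aff_1_16, aff_1_4, aff_full), affinity maps at 1/16, 1/4
                 and full pixel count (per-side factors 4, 2, 1 -- an assumption).
    propagate:   callable (depth, affinity, steps) -> depth, e.g. the
                 iterative propagation sketched after claim 1.
    """
    aff_1_16, aff_1_4, aff_full = affinities

    d = downsample(dense_depth, 4)        # first dense depth map (1/16 size)
    d = propagate(d, aff_1_16, steps)     # first completed depth map: structure

    d = upsample(d, 2)                    # second dense depth map (1/4 size)
    d = propagate(d, aff_1_4, steps)      # second completed depth map: detail

    d = upsample(d, 2)                    # third dense depth map (full size)
    d = propagate(d, aff_full, steps)     # final completed depth map
    return d
```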
4. A depth information propagation-based general depth map completion apparatus for implementing the depth information propagation-based general depth map completion method according to any one of claims 1 to 3, the apparatus comprising:
the data acquisition module is used for acquiring data of a scene by using a depth sensor to obtain a sparse depth map, and acquiring data of the scene by using a color camera to obtain an RGB map;
the pre-filling module is used for carrying out depth filling on the sparse depth map by adopting a pre-filling method to obtain a dense depth map;
the affinity map generation module is used for inputting the sparse depth map, the RGB map and the dense depth map into a ResUNeT network for feature extraction to obtain an affinity map;
wherein the affinity map comprises a first affinity map, a second affinity map, and a third affinity map;
the first affinity map is used for structural information completion; the first affinity map has a size that is one sixteenth of the size of the dense depth map; the dense depth map has a size of m×n, where m is the length of the dense depth map and n is the width of the dense depth map;
the second affinity map is used for detail information completion; the second affinity map has a size that is one quarter of the size of the dense depth map;
the third affinity map is used for detail information completion; the third affinity map has the same size as the dense depth map;
Wherein the ResUNeT network includes a feature extraction branch and an affinity graph generation branch;
the feature extraction branch is an encoder-decoder structure; the encoder structure includes 5 convolutional layers; the decoder structure includes 4 deconvolution layers;
the affinity map generation branch comprises a first affinity map generation branch, a second affinity map generation branch and a third affinity map generation branch; each affinity map generation branch comprises 2 deconvolution layers and 1 convolution layer;
wherein the affinity map generation module is further configured to:
performing feature extraction through the ResUNeT network according to the sparse depth map, the RGB map and the dense depth map to obtain a first image feature, a second image feature and a third image feature;
The first image features include a first sparse depth map feature, a first RGB map feature, and a first dense depth map feature; the first image feature has a size that is one sixteenth of the size of the dense depth map;
The second image features include a second sparse depth map feature, a second RGB map feature, and a second dense depth map feature; the second image feature has a size that is one quarter of the size of the dense depth map;
The third image features include a third sparse depth map feature, a third RGB map feature, and a third dense depth map feature; the size of the third image feature is the size of the dense depth map;
performing a convolution operation on the first image feature to obtain the first affinity map;
performing a convolution operation on the second image feature to obtain the second affinity map;
performing a convolution operation on the third image feature to obtain the third affinity map;
and the depth map completion module is used for carrying out iterative propagation according to the dense depth map and the affinity map to obtain a completed depth map.
5. The depth information propagation-based generic depth map completion apparatus of claim 4, wherein the depth map completion module is further configured to:
Downsampling the dense depth map to obtain a first dense depth map; iteratively propagating the first dense depth map based on the first affinity map to obtain a first completed depth map;
performing up-sampling processing on the first completed depth map to obtain a second dense depth map; iteratively propagating the second dense depth map based on the second affinity map to obtain a second completed depth map;
performing up-sampling processing on the second completed depth map to obtain a third dense depth map; and iteratively propagating the third dense depth map based on the third affinity map to obtain the completed depth map.
6. A generic depth map completion apparatus, the generic depth map completion apparatus comprising:
A processor;
a memory having stored thereon computer readable instructions which, when executed by the processor, implement the method of any of claims 1 to 3.
7. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a program code, which is callable by a processor for executing the method according to any one of claims 1 to 3.
CN202410356521.4A 2024-03-27 2024-03-27 General depth map completion method and device based on depth information propagation Active CN117953029B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410356521.4A CN117953029B (en) 2024-03-27 2024-03-27 General depth map completion method and device based on depth information propagation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410356521.4A CN117953029B (en) 2024-03-27 2024-03-27 General depth map completion method and device based on depth information propagation

Publications (2)

Publication Number Publication Date
CN117953029A CN117953029A (en) 2024-04-30
CN117953029B true CN117953029B (en) 2024-06-07

Family

ID=90803466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410356521.4A Active CN117953029B (en) 2024-03-27 2024-03-27 General depth map completion method and device based on depth information propagation

Country Status (1)

Country Link
CN (1) CN117953029B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685842A (en) * 2018-12-14 2019-04-26 电子科技大学 A kind of thick densification method of sparse depth based on multiple dimensioned network
CN111401411A (en) * 2020-02-28 2020-07-10 北京小米松果电子有限公司 Method and device for acquiring sample image set
CN112560875A (en) * 2020-12-25 2021-03-26 北京百度网讯科技有限公司 Deep information completion model training method, device, equipment and storage medium
CN113486887A (en) * 2021-06-30 2021-10-08 杭州飞步科技有限公司 Target detection method and device in three-dimensional scene
CN113936047A (en) * 2021-10-14 2022-01-14 重庆大学 Dense depth map generation method and system
KR20220029335A (en) * 2020-08-31 2022-03-08 Samsung Electronics Co., Ltd. Method and apparatus to complement the depth image
CN115482265A (en) * 2022-08-31 2022-12-16 电子科技大学 Outdoor scene depth completion method based on continuous video stream
CN115496788A (en) * 2022-09-30 2022-12-20 杭州电子科技大学 Deep completion method using airspace propagation post-processing module
CN116245930A (en) * 2023-02-28 2023-06-09 北京科技大学顺德创新学院 Depth complement method and device based on attention panoramic sensing guidance
CN116468768A (en) * 2023-04-20 2023-07-21 南京航空航天大学 Scene depth completion method based on conditional variation self-encoder and geometric guidance
CN117635444A (en) * 2023-12-26 2024-03-01 中国人民解放军国防科技大学 Depth completion method, device and equipment based on radiation difference and space distance

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10839543B2 (en) * 2019-02-26 2020-11-17 Baidu Usa Llc Systems and methods for depth estimation using convolutional spatial propagation networks
WO2021013334A1 (en) * 2019-07-22 2021-01-28 Toyota Motor Europe Depth maps prediction system and training method for such a system


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A dual-branch guided network for depth completion; Qin Xiaofei et al.; Optical Instruments; 2023-05-08; Vol. 45 (No. 5); full text *
Research on pixel-level scene understanding based on fully convolutional networks; Jiang Zhongze; China Master's Theses Full-text Database, Information Science and Technology; 2022-05-15 (No. 05); full text *

Also Published As

Publication number Publication date
CN117953029A (en) 2024-04-30

Similar Documents

Publication Publication Date Title
US20210090327A1 (en) Neural network processing for multi-object 3d modeling
CN111402130B (en) Data processing method and data processing device
EP3716198A1 (en) Image reconstruction method and device
WO2020000390A1 (en) Systems and methods for depth estimation via affinity learned with convolutional spatial propagation networks
US11429838B2 (en) Neural network device for neural network operation, method of operating neural network device, and application processor including the neural network device
US11755889B2 (en) Method, system and apparatus for pattern recognition
US11934949B2 (en) Composite binary decomposition network
CN112001914A (en) Depth image completion method and device
US11244028B2 (en) Neural network processor and convolution operation method thereof
CN112183718A (en) Deep learning training method and device for computing equipment
WO2021096324A1 (en) Method for estimating depth of scene in image and computing device for implementation of the same
Liu et al. Graphcspn: Geometry-aware depth completion via dynamic gcns
US20240135174A1 (en) Data processing method, and neural network model training method and apparatus
CN113327318B (en) Image display method, image display device, electronic equipment and computer readable medium
US20220277581A1 (en) Hand pose estimation method, device and storage medium
US20230230269A1 (en) Depth completion method and apparatus using a spatial-temporal
CN117635444A (en) Depth completion method, device and equipment based on radiation difference and space distance
CN117953029B (en) General depth map completion method and device based on depth information propagation
US20230401670A1 (en) Multi-scale autoencoder generation method, electronic device and readable storage medium
CN116664829A (en) RGB-T semantic segmentation method, system, device and storage medium
CN113470026B (en) Polyp recognition method, device, medium, and apparatus
CN115239815B (en) Camera calibration method and device
CN112016571B (en) Feature extraction method and device based on attention mechanism and electronic equipment
CN113269812A (en) Image prediction model training and application method, device, equipment and storage medium
CN112446328A (en) Monocular depth estimation system, method, device and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant